Performance inspiration found in List.Contains() method plus some additional functionality we can add.
While I was rewriting a more complete stack implementation I was using the actual generic stack as a guideline to make sure I implemented all of their methods and I came across their Contains implementation. There were some out of place operations in the code that would make it perform less than optimal, so I rewrote the code to get rid of the performance hit. That got me to thinking about the List implementation and whether or not the same performance problem was present there. Thankfully, the code there was written correctly. You can see what this would look like:
public class List<T> {
private int _size;
private T[] _items;
public bool Contains( T item ) {
if ( item == null ) {
for ( int i = 0; i < this._size; i++ ) {
if ( this._items[i] == null ) {
return true;
}
}
return false;
}
IComparer<T> comparer = Comparer<T>.Default;
for ( int i = 0; i < this._size; i++ ) {
if ( comparer.Equals( item, this._items[i] ) ) {
return true;
}
}
return false;
}
}
Notice that the null check loop is separate from the comparison loop. This drastically improves our performance when searching or a null item, and not only that, it takes an extra null check out of the comparer loop. For a comparison you'll have to take a look at the Stack<T>.Contains implementation:
public class Stack<T> {
private int _size;
private T[] _array;
public bool Contains( T item ) {
int offset = this._size;
IComparer<T> comparer = Comparer<T>.Default;
while ( offset-- > 0 ) {
if ( item == null ) {
if ( this._array[offset] == null ) {
return true;
}
continue;
}
if ( comparer.Equals( item, this._array[offset] ) ) {
return true;
}
}
return false;
}
}
Every time through the loop we'll eat a null check now. This is much less performant than the List implementation. Even the List implementation has it's issues though. The Comparer<T> has a fairly deep code-path. For instance, it creates a GenericComparer based on the underlying types IComparable<T> interface. This in turn has very specific code for the Equals method. Namely, the GenericComparer does the damn null check again, followed by an Equals call... If you are using an Int32, this will get routed to Int32.Equals(Int32)... and finally gets turned into a basic equality (==) check.
That is one hell of a code-path, but it is 100% completely necessary. The unfortunality is that you can't apply the equality (==) operator when using generics meaning a Comparer<T> has to be used, else you need to rely on the boxing object.Equals method. In turn not every element is required to implement IComparable<T> either, so you might wind up with a GenericComparer (built based on the IComparable<T> interface) or an ObjectComparer. As you might guess if you wind up with an ObjectComparer, then you are just doing the boxing .Equals anyway.
The code-paths are built for a balance of functionality and performance, but you don't really get the best performance or best functionality. They weren't willing to sacrifice one for the other. There really is a second option. The possibility exists to add a new version of the Contains method that takes a custom comparison routine. They allow this for other methods, just not the Contains method. Even better, it would allow an Equality delegate... If you are looking for the maximum performance, then you want to reduce the code-path to the minimal number of instructions. If you are doing intetger equality you want to reduce the code-path to an equality (==) check, but that simply isn't possible, so you have to eat at least one level of indirection. An equality delegate would do just what we are looking for and anonymous methods would be the best way to use it.
List<int> foo = new List<int>(); foo.Add(10); foo.Add(20);
Console.WriteLine(foo.Contains(10, (Equality<int>) delegate(int x, int y) { return x == y; }));
Console.WriteLine(foo.Contains((TrickEquality<int>) delegate(int y) { return y == 10; }));
In the future we may see some architectural changes take place. Generic types are closed meaning they are built for each set of type parameters passed to them. It seems to me that various closure time conditions could be made to select between a set of different options based on the type specified in the type parameters. For instance, it can be determined at closure whether or not certain methods are available, and if they are, then select them. Coding a collection would be a matter of selecting the methods you'd want in order of some precedence, falling back to a method available across the board. Other options would be the ability to constrain based on operators (something that is heavily requested, but not available in Whidbey, maybe later) even though that would limit a collection from performing at the most generic levels.
Closure conditions would be frigin awesome. Today we implement these with run-time reflection based switches. That is how you get the option between GenericComparer and ObjectComparer. Making these even more efficient is difficult to do. A really good JIT could inline and compress the entire set of operations if they were simpler, but the verbosity of the null checks in the methods would mean a generic optimization would miss this and only a specific optimization would actually find it.