Creating custom collections for better performance and filling in BCL holes.
Collections are fairly retarded in current versions of .NET, but they do get much better in Whidbey. Not retarded from the stand-point of not working, but retarded in that you really have to write your own, from complete scratch, if you want to get the best performance or fill in missing features. A feature that I often find missing is that of a sorted collection. You might be quick to point out SortedList, but that is of course a dictionary and not really a sorted list. It requires keys and stores two collections internally, not one. For these reasons it simply can't be nearly as fast as a custom solution.
To start writing my own custom collection, I thought seriously hard about how items are going to get added. In all normal scenarios, I would preallocate an array, keep a count, and simply move elements as necessary to insert new elements as they become available. In any sorted list, this is going to be a balance between making front loaded vs. end loaded copies. The largest problems arise when the number of elements borders on being quite big, in the 1000's or more. I end up copying nearly the entire array from the very beginning up only a single element several times.
public class SortedArray {
private const int defaultCapacity = 10;
private int[] sortedArray;
private int count = 0;
public SortedArray() : this(defaultCapacity) {
}
public SortedArray(int initialCapacity) {
sortedArray = new int[((initialCapacity < defaultCapacity) ? defaultCapacity : initialCapacity)];
}
public void Add(int value) {
if ( count == sortedArray.Length ) { EnsureCapacity(); }
int result = Array.BinarySearch(sortedArray, 0, count, value);
if ( result < 0 ) { result = ~result; }
Array.Copy(sortedArray, result, sortedArray, result + 1, count - result);
sortedArray[result] = value;
count++;
}
public int Count {
get {
return count;
}
}
public int this[int index] {
get {
return sortedArray[index];
}
}
private void EnsureCapacity() {
Console.WriteLine("Grow!");
int[] newArray = new int[sortedArray.Length * 2];
Array.Copy(sortedArray, 0, newArray, 0, sortedArray.Length);
sortedArray = newArray;
}
}
Okay, so knowing that I'm copying larger portions of the array than might be necessary, can I improve the collection any? If I center my elements in my array and allow copying of only half the array at any time by moving elements not only forwards, but also backwards, then I can cut down on some of the movement within the array. The trade-off is that the mechanics or complexity of adding items and accessing the array becomes more complex. I'll blow up an article with 5 or 6 custom collections I've written at some other time. For now, have some numbers just to see the advantages of making a complex code change, that allows more logical manipulation of the underlying data. Notice how the balanced sorted list is faster at inserting into a larger collection, but they are fairly equal at smaller numbers. I've also included an Array sorting sample that does a sort at the very end of placing all of the numbers. If your algorithm doesn't need to incrementally append items then the sorting is always going to be your best option.
Current Count: 10
Sorted Array: 00:00:02.8140464
Balanced Sorted Array: 00:00:02.7539600
Array Sort: 00:00:01.3319152
Current Count: 50
Sorted Array: 00:00:01.6423616
Balanced Sorted Array: 00:00:01.6824192
Array Sort: 00:00:00.8612384
Current Count: 100
Sorted Array: 00:00:03.5050400
Balanced Sorted Array: 00:00:03.6151984
Array Sort: 00:00:01.8827072
Current Count: 250
Sorted Array: 00:00:05.0272288
Balanced Sorted Array: 00:00:05.1273728
Array Sort: 00:00:02.6237728
Current Count: 500
Sorted Array: 00:00:02.3734128
Balanced Sorted Array: 00:00:02.2231968
Array Sort: 00:00:01.1216128
Current Count: 1000
Sorted Array: 00:00:05.9785968
Balanced Sorted Array: 00:00:05.2074880
Array Sort: 00:00:02.3934416
Current Count: 5000
Sorted Array: 00:00:08.0315488
Balanced Sorted Array: 00:00:05.2675744
Array Sort: 00:00:01.3619584
Current Count: 40000
Sorted Array: 00:00:40.8988096
Balanced Sorted Array: 00:00:22.1418384
Array Sort: 00:00:01.2417856