Creating custom collections for better performance and filling in BCL holes.

Collections are fairly retarded in current versions of .NET, but they do get much better in Whidbey. Not retarded from the stand-point of not working, but retarded in that you really have to write your own, from complete scratch, if you want to get the best performance or fill in missing features. A feature that I often find missing is that of a sorted collection. You might be quick to point out SortedList, but that is of course a dictionary and not really a sorted list. It requires keys and stores two collections internally, not one. For these reasons it simply can't be nearly as fast as a custom solution.

To start writing my own custom collection, I thought seriously hard about how items are going to get added. In all normal scenarios, I would preallocate an array, keep a count, and simply move elements as necessary to insert new elements as they become available. In any sorted list, this is going to be a balance between making front loaded vs. end loaded copies. The largest problems arise when the number of elements borders on being quite big, in the 1000's or more. I end up copying nearly the entire array from the very beginning up only a single element several times.

public class SortedArray {
    private const int defaultCapacity = 10;
    private int[] sortedArray;
    private int count = 0;
   
    public SortedArray() : this(defaultCapacity) {
    }

    public SortedArray(int initialCapacity) {
        sortedArray = new int[((initialCapacity < defaultCapacity) ? defaultCapacity : initialCapacity)];
    }
   
    public void Add(int value) {
        if ( count == sortedArray.Length ) { EnsureCapacity(); }
       
        int result = Array.BinarySearch(sortedArray, 0, count, value);
        if ( result < 0 ) { result = ~result; }
        Array.Copy(sortedArray, result, sortedArray, result + 1, count - result);
        sortedArray[result] = value;
        count++;
    }
   
    public int Count {
        get {
            return count;
        }
    }
   
    public int this[int index] {
        get {
            return sortedArray[index];
        }
    }
   
    private void EnsureCapacity() {
        Console.WriteLine("Grow!");
        int[] newArray = new int[sortedArray.Length * 2];
        Array.Copy(sortedArray, 0, newArray, 0, sortedArray.Length);
        sortedArray = newArray;
    }
}

Okay, so knowing that I'm copying larger portions of the array than might be necessary, can I improve the collection any? If I center my elements in my array and allow copying of only half the array at any time by moving elements not only forwards, but also backwards, then I can cut down on some of the movement within the array. The trade-off is that the mechanics or complexity of adding items and accessing the array becomes more complex. I'll blow up an article with 5 or 6 custom collections I've written at some other time. For now, have some numbers just to see the advantages of making a complex code change, that allows more logical manipulation of the underlying data. Notice how the balanced sorted list is faster at inserting into a larger collection, but they are fairly equal at smaller numbers. I've also included an Array sorting sample that does a sort at the very end of placing all of the numbers. If your algorithm doesn't need to incrementally append items then the sorting is always going to be your best option.

Current Count: 10
Sorted Array: 00:00:02.8140464
Balanced Sorted Array: 00:00:02.7539600
Array Sort: 00:00:01.3319152

Current Count: 50
Sorted Array: 00:00:01.6423616
Balanced Sorted Array: 00:00:01.6824192
Array Sort: 00:00:00.8612384

Current Count: 100
Sorted Array: 00:00:03.5050400
Balanced Sorted Array: 00:00:03.6151984
Array Sort: 00:00:01.8827072

Current Count: 250
Sorted Array: 00:00:05.0272288
Balanced Sorted Array: 00:00:05.1273728
Array Sort: 00:00:02.6237728

Current Count: 500
Sorted Array: 00:00:02.3734128
Balanced Sorted Array: 00:00:02.2231968
Array Sort: 00:00:01.1216128

Current Count: 1000
Sorted Array: 00:00:05.9785968
Balanced Sorted Array: 00:00:05.2074880
Array Sort: 00:00:02.3934416

Current Count: 5000
Sorted Array: 00:00:08.0315488
Balanced Sorted Array: 00:00:05.2675744
Array Sort: 00:00:01.3619584

Current Count: 40000
Sorted Array: 00:00:40.8988096
Balanced Sorted Array: 00:00:22.1418384
Array Sort: 00:00:01.2417856

Published Thursday, September 23, 2004 8:32 PM by Justin Rogers
Filed under: ,

Comments

Friday, September 24, 2004 11:58 AM by Barry Kelly

# re: Creating custom collections for better performance and filling in BCL holes.

You reckon you should use Buffer.BlockCopy instead of Array.Copy for copying arrays of primitive types.

On the other hand, I wouldn't use your insertion mechanism at all - it's got O(n^2) complexity. Skip lists are easier to program than binary trees and give probabilisticly similar results.

-- Barry Kelly
Friday, September 24, 2004 5:48 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

You might think that Buffer.BlockCopy would buy you something. Turns out, it can either save or lose you fractions of seconds on any of the given tests. You tend to get 5% faster results on the smaller lists and then the error changes +/- 5% as the lists begin to get larger. That makes Array.Copy pretty efficient.

I'm not sure anyone was talking about a binary tree. I surely wasn't. I was talking about a balanced, sorted array, which is completely different. The only removal of complexity in the balanced sorted array is shifting half of the array rather than the full array.

If you have a faster algorithm, lets see the code. This definitely fills a whole somewhere since the functionality isn't in the BCL by default, it maintains only a single array for data storage making it behave like other collections classes, and it is much faster than something like the SortedList which replicates data when you want a value sorted collection. It is also easy to make thread-safe with regards to atomic insertion, while still allowing reads in between writes.
Saturday, September 25, 2004 1:18 PM by Barry Kelly

# re: Creating custom collections for better performance and filling in BCL holes.

Your "balanced sorted array" buys you no complexity improvement. It's only different by a constant factor. It's still got O(n^2) complexity for n iterated insertions. It isn't worth its implementation effort.

To get O(n log n) complexity, you need to use some kind of balanced binary tree (e.g. red/black tree or AVL tree) or something similar. I mentioned balanced binary trees simply because they exhibit the fastest possible complexity bound - namely, O(n log n) - for iterated insertions of *arbitrary* data.

(For limited universe data like integers, there are obviously O(1) insertion algorithms which translate into O(n) when iterated like you are doing in your test).

I mentioned skip lists because they are easier to program than balanced binary trees but have probabilistically similar average complexity. Check out the obvious - http://google.com/search?=skip+list - for details.

Certainly, .NET needs something like C++'s set<> and friends or Java's TreeSet / TreeMap. C#'s collection library is pretty dire. Things will improve a bit with C# 2.0 and generics, but more collections with a wider range of implementation mechanics is needed.

-- Barry Kelly
Saturday, September 25, 2004 1:20 PM by Barry Kelly

# re: Creating custom collections for better performance and filling in BCL holes.

Monday, September 27, 2004 5:10 PM by Scott Mitchell

# re: Creating custom collections for better performance and filling in BCL holes.

I don't see why you are using an array here. If you want a sorted list, why not use a linked list? This would remove any need of copying data over to maintain sort order - you just plunker down the sucker in the list where he belongs.

Of course, a sorted array (be it implemented like you have it, or as a linked list) takes linear time for inserts since you must potentially scan the entire list to find where to insert the item. As Barry points out, self-balanced BSTs or skip lists would provide log n insertion time, since at worst log n comparisons would have to be made to find the spot to insert the item (rather than n).

I have an article on self-balancing BSTs and skip lists available here:
http://msdn.microsoft.com/vcsharp/default.aspx?pull=/library/en-us/dv_vstechart/html/datastructures_guide4.asp

(I also have a version for 2.0 in the works...)
Monday, September 27, 2004 5:48 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

A number of issues... We are storing integers, and no matter what you do, adding a linked list feature to storing integers is going to double the size of the data storage.

We are doing a binary-search, so finding our insertion point is log n + 1, and we copy at most n/2 elements. I'm not exactly sure I see this as bad performance by any means.

We also get all of the benefits of arrays and linear access to our data. What is the 50th element, array[50]... What is the 5000th element array[5000]... I think if you do the testing for various situations that you'll find not only is the balanced array insert very fast, but in many cases it is as fast or faster than using your other structures. It will also be half the size of those other structures.
Monday, September 27, 2004 6:34 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

I'm first to admit this is a naive test. But now matter how much you talk about collections and performance, if you ship something with poor performance, it doesn't make a great impact. While your SkipList implementation was not optimized to be fast I'm sure (though talking about perf you probably should have made it fast) and it wasn't optimized for storing integers, but rather made generic to work with IComparable (not that any implementation of my sorted array would be this slow)... Here you go:

Current Count: 10
Sorted Array: 00:00:02.5536720
Balanced Sorted Array: 00:00:02.5937296
Array Sort: 00:00:01.3920016
Skip List: 00:00:21.1904704

21 seconds? Nearly 1 magnitude slower at 10...

Over larger lists, much larger, over 10000 elements, the skip lists start to perform. I use that term loosely, because they still only perform as well as the array copy.

Thinking about the global performance of an algorithm given billions of elements may work perfectly well for some. However, most of the time you are only sorting 10-100 elements, and often times you have 100's of those collections sitting around that need to be sorted.
Tuesday, September 28, 2004 9:40 AM by PatternGuru

# re: Creating custom collections for better performance and filling in BCL holes.

If you want to use an array as basic stored and need something thats asymptotically better than O(n^2) then why not try a heap? Basically that's a binary tree squashed into an array using shl/shr operations in an indexing scheme. Transforming an arbitrary array to a heap takes O(nlgn) and inserting takes O(lgn). Getting the sorted list out of it can be more difficult, but can be done in O(nlgn) with peeking the highest/lowest value in O(1) and removing it in O(lgn), making it great for priority queues. It all depends on how you are going to use that sorted list and what you want to be fast in what cases.

Sounds like a job for http://www.amazon.com/exec/obidos/tg/detail/-/0262032937 (or something similar) to me...

I do agree that the default .NET collection support is somewhat lacking right now.
Tuesday, September 28, 2004 10:36 AM by Chris Chambers

# re: Creating custom collections for better performance and filling in BCL holes.

Well, I guess I'll throw my hat in the ring here.. I don't even remember what I was looking for when I found this page several hours ago.. but here's what I've come up with. It's sort of an indexing-sorting-array-tree-with-one-level-of-bucket-type-nodes.

Inserting is mediochre up until around 1000 entries, after which, it leaves the SortedArray implementation that started this whole thing in the dust. Lists over 5000 are particularly nice.

It's also fairly good about managing it's "slack." I'm sure the SortedArray could be better, and that performance doesn't depend on this, but when it grows, it doubles the size of the list unconditionally. That's a pretty bold assumption to make.

Accessing by index is about half as fast as SortedArray. It's hard to compete with flat array indexes.

Searching (IndexOf, for example) is, as a side product of the design, absolutely terrific at all set sizes. It can process 40,000 IndexOf requests on a collection of 40,000 values (random index requests, so hit or miss) in just about 0.28 sec on an athlon xp 3000+. For comparison, it takes the SortedArrayList a little over 9 seconds to do the same operation.

Memory footprint is typically the {size of data} + 3% or so. Sometimes better than Sorted List, sometimes worse. Depends on if you hit a number of elements that SortedList really likes. (10, 20, 40, 80, etc) Of course, I'm pretty sure SortedList could allocate smaller chunks without a huge performance hit, so that would solve that problem.

I guess that about does it. I've really enjoyed reading this discussion. Made me think outside the box, and that's never a bad thing.

If anyone is interested in the source for Rube(Goldberg)SortArray, and also the program I used to benchmark it, I can zip the code and post it somewhere. I'd be delighted to see what others think of the (slightly wacky) idea and also to get second opinions for benchmarks.
Tuesday, September 28, 2004 3:28 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

You should definitely post it so we can take a look. Test results arrived at under different conditions often provide bad heuristics for judging performance. A great example would be running on an Athlon versus an Intel.

I'll toss your code into the mix, and I'll upgrade it to generics and Whidbey if appropriate, since I also want to see if a platform upgrade is going to give us any more performance in the non array based scenarios.
Tuesday, September 28, 2004 3:38 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

PS > Were you doing performance against SortedList or the SortedArray I posted above? You mix terminology a few times. I never even bothered throwing SortedList into the mix because it was far too slow.
Tuesday, September 28, 2004 5:20 PM by Chris Chambers

# re: Creating custom collections for better performance and filling in BCL holes.

Oh, I see I *was* mixing terminology. Up too late, followed by up too early, I guess. I meant the SortedArray posted above in all cases. SortedList is a different beast altogether.

I could have gone generics and IComparable<T> with this, but I wasn't sure from the discussion if 1.x features only were more appropriate. Shouldn't be too hard to convert, though, if you want to give it a go.

Here's a link to the project. It's Whidbey beta 1, but I don't think there is anything in RubeSortArray that would keep it from compiling under the 1.x framework. Not sure about the little test program. It's just a little console app. Be warned, though, Whidbey steals the Console output (and input) and just throws an exception for Console.ReadLine. But this "feature" can be disabled.. And the link is:

http://www.bluehungarians.com/RubesArray.zip

URL recognition? Hope so.

And now, a few qualifications:

1. Pardon the badly written and possibly unclear code.

2. 1000 items is the breakover point. Below that threshold, only searching is faster. Above it, things start to get good.

and 3. Who knows if this will be any good at sorting objects other than int. It should be fine, but I just don't know.

There ya go. I'm really interested to see if it's any good in your benchmark. I have a tendency to test for what I'm testing for, if you know what I mean.. and I may have picked testing code that favors my object in some way.

Cheers
Tuesday, September 28, 2004 5:30 PM by Chris Chambers

# re: Creating custom collections for better performance and filling in BCL holes.

ps. One more qualification. Performance is, of course, better (more representative?) in release mode. Or with the optimized flag, anyway. I figure it's only fair since I'm up against the framework's Array.BinarySearch and Array.IndexOf.. Doesn't make a huge difference, but it does make some.
Tuesday, September 28, 2004 7:03 PM by Chris Chambers

# re: Creating custom collections for better performance and filling in BCL holes.

Ok, one more post. I decided to go ahead and try it with generics. Here's that version of the project.

http://www.bluehungarians.com/GenericRubesArray.zip

As far as the testing goes, I just replaced all the integer-only versions with the generic<int> class. Performance seems about the same in release mode. Debug mode took a little bit of a hit on indexed access.

Should work with any IComparable<T>. I love generics. I only wish they'd hurry up and release .net 2.0 so I could use them in more than just proof-of-concept projects. -sigh-
Tuesday, September 28, 2004 10:50 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

I'm probably not going to get around to performance analyzing things this evening, but I'll try and get something up and running tomorrow. Would you mind if I cleaned the code and included it in a larger code dump that demonstrates all of the algorithms? I'll drop it as an article, and link it in a short posting examining the heuristics of all of the methods discussed so far. That'll give us a good base for any future discussions.
Tuesday, September 28, 2004 11:02 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

PS > Your code adds that extra level of abstraction from 1 sorted array to many sorted arrays in order to reduce the number of operations during copies and searches, with the additional overhead of managing lists. It'll be interesting to see the threshold that this implementation passes the BalancedSortedArray, because I've basically added the same heuristic by maintaining *two* conceptually separate arrays, with one being on each side of some mid-point in a single dimension array.

An array BST with shifting might be similar in approach. As one user has suggested this, and I happen to have an array BST implementation, I'll try and clean that up for inclusion as well.
Wednesday, September 29, 2004 12:38 AM by Chris Chambers

# re: Creating custom collections for better performance and filling in BCL holes.

The BalancedSortedArray idea is interesting. I had considered that each "node" should maintain an array that could grow either at the head or the tail.. I was going to do it as one array with the data centered and space at the top and bottom (sounds like what you were describing).

I also considered an "optimization" for small sets (less than 1000, in this case). For no-splitting of the initial node until it becomes "profitable" to do so, speed wise. However, I haven't been able to imagine a good heuristic for what is a profitable number at which to begin splitting the list. I guess I could just tell it not to split until it has collected 1000 items, but magic numbers are... well, bad. I'd much prefer a more robust way to decide it's time to split the first node. On the bright side, this would give insertion times that approximate the SortedArray class's times up until 1000 items (since it's basically the same procedure), and then after that, it could switch to splitting nodes and indexing.

Feel free to include my code in your article. I may tinker with it for a while, if you wouldn't mind an update.

Anyone have recommendations for a good (and free) code profiler for the 2.0 framework? I had a look at nprof.. and it's ok, but I just wondered if anyone had a particular favorite.
Wednesday, September 29, 2004 12:40 AM by Chris Chambers

# re: Creating custom collections for better performance and filling in BCL holes.

ps I meant to say, "Feel free to include my code and do any cleaning up you'd like to do."
Wednesday, September 29, 2004 3:18 AM by Chris Chambers

# re: Creating custom collections for better performance and filling in BCL holes.

Doh. Earlier, I just tacked on IndexOf to your SortedArray for my benchmark, but I used Array.IndexOf instead of Array.BinarySearch in those tests. Needless to say, after making the change, SortedArray is now the faster at searching any size list. Silly me.

That leaves only two real benefits to the RubeSortArray:

1. Significantly better insert times for large lists
2. Possibly more efficient array allocation

And the downside:

1. Worse insert times for small lists
2. Worse searching times for all lists
3. Indexed access is about half the speed of SortedArray

So, after all that, maybe the RubeSortArray may be a lot of implementation for just a little benefit. But hey, if you need to store an often-changing list of 50,000 sorted objects, but don't want all the overhead of linked lists, it might just be for you.

-- Oh yeah, I tried adding the balanced array idea into the nodes, but the speed increase was marginal while the amount of slack doubled.
Wednesday, September 29, 2004 2:19 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

The amount of slack would definitely double, and the balanced array really doesn't start helping until you get over 500 to 1000 elements in the array. That means each of your RubesNodes would have to be at least that large for there to be a difference.

Thankfully, your algorithm also benefits from the BinarySearch replacement I wrote, which halves the time taken by SortedArray for smaller arrays (getting increasingly less crucial for larger array lengths).

I like the array allocation for larger arrays and the insert times do get a bit better. The algorithm is a bit more complicated and definitely specialized for very large arrays. I was thinking of adapting each of the algorithms for doing topN..bottomN as well, but realized these just set the SortedArray up to be performant since it works best at small array sizes.
Thursday, September 30, 2004 12:29 PM by Scott Mitchell

# re: Creating custom collections for better performance and filling in BCL holes.

Justin, you gave the following numbers:

Current Count: 10
Sorted Array: 00:00:02.5536720
Balanced Sorted Array: 00:00:02.5937296
Array Sort: 00:00:01.3920016
Skip List: 00:00:21.1904704

Care to explain what your test was doing? Was it just creating the structure and populating it with 10 elements? Or is it creating the structure and then searching the structure for various elements repeatedly? How about, adding a couple elements, searching, adding some more elements, searching, etc., and doing that a few thousand times?

They say there are three types of lies: lies, damned lies, and statistics. I could come up with a test where the SkipList beats the pants off of the array. My point is, the SkipList - or any data structure for that matter - has a time and a place. I'm not deriding your data structure, as it is obviously the best one when you only have a handful of elements. But to assume that it is universally sufficient is a mistake.
Thursday, September 30, 2004 5:24 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

Sorted Array, especially with the rewritten binary search is much faster than SkipList in the all of the scenarios you've pointed out.

Remember that searching for an element (IndexOf) is a BinarySearch. Remember that getting an element at an offset is a direct look-up (because we are using an array). The only time the SkipList could be faster is during inserts, because it isn't reallocating or moving array elements.

However, the overhead and management of the skip lists, and the fact that you used off the shelf components, such as CollectionBase for your NodeList all impact how fast it is. I'm sure you could take a few hours and come up with a better implementation. However, you'd be nearly 3 times the size for many of the numeric data types.

Does it have it's place? Of course. In the .NET world it appears to begin operating well at very high insert levels around 15,000+. At low insert levels the overhead of setting it up just doesn't make it worthwhile, as you can see in the tests. Adding/Searching doesn't make any difference because the array has ALWAYS been faster at those two things.

With that in mind, and assuming a performance increase through fixing the SkipList, we might be able to move the threshold from 15k to 5k (I won't say might, I'll actually do this and post it at some point). At that point I'd probably take some of Chris Chambers code (or probably before) and use that instead. The Slack on his is less than a SkipList and keeping the arrays small for repositioning would keep them competitive.
Thursday, September 30, 2004 5:54 PM by Scott Mitchell

# re: Creating custom collections for better performance and filling in BCL holes.

****************
Sorted Array, especially with the rewritten binary search is much faster than SkipList in the all of the scenarios you've pointed out.
****************

I apologize for not being clearer, I wasn't pointing out situations in which I thought the SkipList would outperform, merely taking some guesses at what your test case involved (which I am still in the dark on).

In any case, it seems like we've reached some common ground: different data structures each have their time and place. Basically, a data structure is designed to do well at particular tasks. For example, an array offers great access time, but sucks for searching. So you improved that with the SortedArray, so it could use BinarySearch, but the overhead now becomes inserting, since you have to shuffle around blocks of memory to do so. The SkipList (and other balanced BSTs), as you noted, improves on the searching from a native array, and insertions on a SortedArray, but has the space overhead and non-constant time access that you mentioned.

Bottom line is that there is no Holy Grail, no "ultimate data structure." You use what's best suited for your conditions. My initial point was that copying and recopying this memory seemed an inefficient approach. What I should have prefaced that with was, *assuming you are going to be working with large amount of data*. With a few hundred or few thousand records, it's no biggie.
Thursday, September 30, 2004 7:42 PM by Justin Rogers

# re: Creating custom collections for better performance and filling in BCL holes.

To note, my only major consideration was finding the cut-off, where it would be obviously profitable to make the switch from the array heuristics to the skip-list or binary tree. There is definitely a cut-off, and I'll probably talk about that at some point, but in order to do that I need the fastest, most highly optimized versions of those structures available.

# MILOL Blogging News &raquo; Blog Archive &raquo; Creating custom collections for better performance and filling in BCL holes.

Pingback from  MILOL Blogging News  &raquo; Blog Archive   &raquo; Creating custom collections for better performance and filling in BCL holes.

Leave a Comment

(required) 
(required) 
(optional)
(required)