O/R Mappers: Maximum Performance

I now have my WilsonORMapper (v1.0.0.3) performing very comparably to DataSets. In some cases I am actually beating DataSets, with or without private field reflection.

My tests compared DataReaders, DataSets, and my ObjectSets, the latter both using my new optional IObjectHelper interface for direct access and using private field reflection. Each run consisted of a number of repetitions of loading the appropriate structure from a database table and then looping through the resulting structure to access the values. The database table consisted of 10,000 records filled with random data that I generated, with the table fields consisting of 2 ints, 3 strings, 1 float, 1 datetime, and 1 bit. The numbers posted all represent loading 10,000 total records, but the cases varied from 1 record repeated 10,000 times, to 100 records repeated 100 times, and finally 10,000 records loaded just once. The tests were run many different times, and the numbers were always consistently similar. I also tested a 100,000 record table, and the numbers were similar, just 10 times bigger.
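The shape of those runs can be sketched roughly as follows (a hypothetical Java illustration of the benchmark loop, not the actual test harness; the row layout mimics the table schema described above):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the benchmark shape described above: each
// configuration loads `records` rows `repetitions` times, so every run
// touches the same 10,000 total rows, then walks the loaded structure
// to access every value.
class LoadBenchmark {

    // Stand-in for loading `count` rows of the test table
    // (2 ints, 3 strings, 1 float, 1 datetime, 1 bit).
    static List<Object[]> load(int count) {
        List<Object[]> rows = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            rows.add(new Object[] {
                i, i + 1, "s1", "s2", "s3", 1.5f, "2004-03-01", true });
        }
        return rows;
    }

    // Run one configuration and return the number of values accessed.
    static long run(int records, int repetitions) {
        long accessed = 0;
        long start = System.nanoTime();
        for (int rep = 0; rep < repetitions; rep++) {
            for (Object[] row : load(records)) {
                for (Object value : row) {
                    if (value != null) accessed++;   // touch every field
                }
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(records + " x " + repetitions + ": " + elapsedMs + " ms");
        return accessed;
    }

    public static void main(String[] args) {
        // The three configurations from the post.
        run(1, 10_000);
        run(100, 100);
        run(10_000, 1);
    }
}
```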

Notice first that hitting the database many times for one record is noticeably slower. Next, note that DataSets are pretty much always twice as slow as DataReaders. If you want to load a single record, my WilsonORMapper beats a DataSet hands down; this remains true even when I continued to use private field reflection. On the other hand, my O/R mapping was 50% slower than the DataSet when loading 100 records, and 75% slower when 10,000 records were loaded, using direct access. The numbers were another 2 times slower when I allowed private field reflection. So performance varies depending on the number of records, although keep in mind that my WilsonORMapper supports paging to make larger numbers of records easily manageable. I also added a new GetDataSet method that returns DataSets and performs just as well.

Why does my O/R mapper still perform a little slower than DataSets with many records? No matter what I did, almost every millisecond could be attributed to the fact that my mapping framework stores a second copy of each value in the manager for its state. This state allows you to check whether the values of any entity object have been updated, and it also gives you the ability to cancel all the changes made to an object. I may also someday use these extra state values for more specific dynamic sql updates. On the other hand, large DataSets load faster initially since they don't load twice, but they may also have larger overhead in the long run as they track all later changes. DataSets also serialize considerably larger when they are remoted, so you should also consider that additional overhead in distributed environments.
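The second-copy state described above can be sketched roughly like this (a minimal illustration with hypothetical names, written in Java rather than the mapper's C#; this is not the actual WilsonORMapper internals):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: the manager snapshots each entity's values at load time
// (the "second copy" that costs time and memory up front), which is what
// makes dirty checks and change cancellation possible later.
class StateManager {
    // Original values keyed by entity reference.
    private final Map<Object, Map<String, Object>> originals = new HashMap<>();

    void startTracking(Object entity, Map<String, Object> values) {
        originals.put(entity, new HashMap<>(values));   // defensive copy
    }

    // Dirty means any current value differs from the stored original.
    boolean isDirty(Object entity, Map<String, Object> current) {
        Map<String, Object> snapshot = originals.get(entity);
        return snapshot != null && !snapshot.equals(current);
    }

    // Cancel restores the snapshot taken at load time.
    Map<String, Object> cancelChanges(Object entity) {
        return new HashMap<>(originals.get(entity));
    }
}
```

The cost the post describes is the defensive copy in `startTracking`: it doubles the value writes at load time, but it is exactly what enables the update check and the cancel.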

What did I do, other than implementing the IObjectHelper interface, for performance? The biggest performance change I made, far bigger than reflection, was changing all of my foreach loops over hashtables into regular for loops over typed arrays. The next biggest gain was changing a hashtable key from a struct to a string, which could not just be a regular object since each object instance was always different. Next, and still making a slightly bigger impact than private reflection, was accessing each datareader value only one time, even though I had to store it twice. I also now cache the reflected FieldInfo, for the cases where reflection is still used, which made a small but measurable difference, contrary to Steve Eichert's report. And of course, you can now implement the IObjectHelper interface to avoid reflection entirely.
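The FieldInfo caching can be illustrated with a short sketch (written in Java, with java.lang.reflect.Field standing in for .NET's FieldInfo; all names here are hypothetical, and the string cache key echoes the struct-to-string key change mentioned above):

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Minimal sketch: look up each reflected field once and reuse it on every
// subsequent record, instead of repeating the (relatively slow) lookup.
class FieldCache {
    private final Map<String, Field> cache = new HashMap<>();

    Field get(Class<?> type, String name) {
        String key = type.getName() + "." + name;   // string key, not a struct
        return cache.computeIfAbsent(key, k -> {
            try {
                Field f = type.getDeclaredField(name);
                f.setAccessible(true);              // allow private field access
                return f;
            } catch (NoSuchFieldException e) {
                throw new RuntimeException(e);
            }
        });
    }

    // Convenience: read a (possibly private) field from a target object.
    Object read(Object target, String name) {
        try {
            return get(target.getClass(), name).get(target);
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }
}
```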

I also made a few other observations in the course of all this performance testing. Most surprising to me was that I could not find any significant difference between accessing datareader fields by index or by name -- it was barely measurable at all. I also confirmed that there was no measurable difference between sql and stored procs. Next, while Steve Maine noted the huge differences that private reflection can make, it is still a relatively small part of a bigger picture that includes data access. This agrees with several comments I received that there is a whole lot more going on in an O/R mapping framework than just worrying about how the values are set. Also note that public and protected field reflection was hardly noticeable in my tests. But overall, the little things like foreach and boxing were the worst offenders.

So if you were concerned that my WilsonORMapper didn't have adequate performance, then think again and download the demo for yourself (the demo only runs in the debugger).

# Records            1      100   10,000
# Repetitions   10,000      100        1

DataReader        1.91     0.14     0.11
DataSet           3.69     0.21     0.21
OR Direct         2.29     0.31     0.37
OR Reflect        2.78     0.75     0.81
17 Comments

  • Well, we all now know to ignore my posts about how costly reflection is :) Nice work Paul! I enjoyed reading how you went about speeding things up. As many people noted, it was probably "fast enough" before, but hey, a little extra speed certainly can't hurt!

  • Could WilsonORMapper recognize a dirty flag per property?

    e.g., if I want to update just the name of a product, I would write:

    Product product = new Product();

    product.id = "001";

    product.name = "example";

    Global.Manager.PersistChanges(product);

    I hope only the name gets updated, without affecting the other properties of the product.

    The dynamic sql would be:

    update product set name = 'example' where id = '001'

    Does WilsonORMapper work like this?

    VS.NET on my computer does not work right now, so I can't run your demo.





  • Hi Progame:

    No, right now at least it updates everything every time, although that would be relatively easy to change since all the state is already being tracked. So I may implement it at some point, although it's not very high on my list right now -- but you could create your own custom version to do just that. I'm curious: I've seen a few people note this feature of some O/R mappers, but I don't really get why it matters much, at least in most cases -- can you share? The traditional stored procedure approach for updates certainly did not, in most cases, make such targeted updates, and it's that approach that is my personal background, so I'm still learning other approaches.

    Thanks, Paul Wilson
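A custom version along the lines suggested above might generate the targeted statement something like this (a hypothetical sketch, not part of WilsonORMapper; real code would use parameterized values rather than inlined literals):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Minimal sketch: build an UPDATE containing only the dirty columns.
// NOTE: values are inlined as literals purely for readability -- real
// code must use parameterized queries to avoid SQL injection.
class DirtyUpdateBuilder {
    static String build(String table, String idColumn, String idValue,
                        Map<String, String> dirtyColumns) {
        StringJoiner set = new StringJoiner(", ");
        for (Map.Entry<String, String> col : dirtyColumns.entrySet()) {
            set.add(col.getKey() + " = '" + col.getValue() + "'");
        }
        return "update " + table + " set " + set
             + " where " + idColumn + " = '" + idValue + "'";
    }
}
```

With only "name" marked dirty, this produces exactly the statement the commenter asked for: update product set name = 'example' where id = '001'.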

  • This feature is very important to me.

    e.g., I have a form to edit a product, but the price can't be edited there; it is derived from many factors. If I edit the product information in client A, its price may be modified by client B at the same time, so I must post only the dirty values to the database, not including "price".

  • Considering concurrency, an O/R mapper must also support increment operations,

    like this: update product set quantity = quantity + 10 where id = '001'

    Otherwise consistency will be destroyed.

  • Hi Radim:



    DataSets and custom object collections are certainly not interchangeable in all situations, but there are many cases where a user of an O/R mapper simply wants to retrieve a collection and display it for presentation, which is very much like using a DataSet. On the other hand, once you go beyond that simple display of data, start implementing business logic in your custom classes, and then proceed to the actual persistence, that's where an O/R mapper certainly has the upper hand. Can it be used for real-world apps? Most certainly it can. Of course, real-world apps vary, so it may not be the best thing in every case, but there are many, many applications that can easily benefit from O/R mappers, and the fact that they are so common in the Java world proves that.

    As for stored procedures with business logic, I already support using stored procs for inserts, updates, and deletes, which is where there often is a real need for integrity checks, denormalization, and simple business logic. I will probably be adding support for select stored procedures too, but while that may be nice in some situations, it will also mean a loss of the basic search, sort, and paging functionality that comes for free with the regular type of O/R mapping. As for generating stored procedures, there are already countless code generators out there, with CodeSmith being my recommendation, so I don't really see the need, nor is that what O/R mappers are really about.



    Thanks, Paul Wilson

  • Hi Progame:



    The entities are tracked automatically when they are materialized from the database, when a new entity is created using the GetObject method, or when an entity is passed to the StartTracking method. The tracking is released by a timer that wakes up periodically (the default is 5 minutes, but it's configurable) and removes all instances that correspond to entities that have been garbage collected and have not been accessed for some time (also configurable; it defaults to 0, but a non-zero value is necessary for server-based systems that won't retain weak references). The manager class is not static -- you can create multiple instances of it to handle different databases simultaneously, and you should be able to use it in remoting situations. The entity classes do not have to have any particular structure, i.e. there is no required base class, although implementing the optional IObjectHelper interface will give you better performance. As for lazy loading, you can specify not to use lazy loading for child collections, which would be required for serialization cases like remoting, but that's not really related to the manager tracking object state for other reasons.



    By the way, I agree that not having "quantity = quantity + 10 where id = '001'" is less than desirable, but this is where you can use a stored procedure with my O/R mapper, or you could use triggers, which may actually make more sense for pure data integrity across all applications. I may try to add that functionality as well, but I haven't given it much thought yet since stored procs and triggers are both viable solutions that already exist.



    Thanks, Paul Wilson
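The tracking-and-release scheme described above can be sketched like this (a minimal, hypothetical Java illustration of weak references plus a periodic sweep; the real manager's internals and defaults differ):

```java
import java.lang.ref.WeakReference;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch: tracked entities are held through weak references so
// the tracker never keeps them alive by itself, and a periodic sweep
// drops entries whose entities were garbage collected and have been
// idle past the configured limit.
class Tracker {
    static class Entry {
        final WeakReference<Object> ref;
        volatile long lastAccess;
        Entry(Object entity) {
            this.ref = new WeakReference<>(entity);
            this.lastAccess = System.currentTimeMillis();
        }
    }

    // Keyed by identity hash for the sketch; a real implementation
    // would need a collision-safe identity key.
    private final Map<Integer, Entry> entries = new ConcurrentHashMap<>();

    void track(Object entity) {
        entries.put(System.identityHashCode(entity), new Entry(entity));
    }

    int size() { return entries.size(); }

    // Called by a timer (e.g., every 5 minutes per the post's default).
    void sweep(long idleLimitMillis) {
        long now = System.currentTimeMillis();
        for (Iterator<Entry> it = entries.values().iterator(); it.hasNext(); ) {
            Entry e = it.next();
            if (e.ref.get() == null && now - e.lastAccess >= idleLimitMillis) {
                it.remove();   // entity collected and idle: release tracking
            }
        }
    }
}
```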

  • Can this tracking recognize any change to a property, or does it only compare property values?



    e.g.:

    product.code = "001";

    product.name = "name";

    ... StartTracking(product);

    product.name = "name";

    ... PersistChanges(product);

    In this case, will the mapper recognize this change?

  • It doesn't have an internal event that is triggered when a property changes, since the entity can be any class with no special base class required. It can compare the values that you send when PersistChanges is called (although it does not right now) with what was sent originally with StartTracking. Currently this means that you can call GetObjectState to check for Updated (i.e. IsDirty) and you can also call CancelChanges.



    Later, Paul Wilson

  • I want to get a dirty flag per property, instead of per class. Although the "name" property value did not change, it is dirty now. I hope the mapper can recognize this change in spite of the assignment of the same value.

  • I want it to recognize this change without needing to invoke any event.

  • btw: can you change the font color of your blog? Blue is not comfortable :(

  • I see now. Mine would only know that the class was dirty, and only because a value differed, not simply because it was set again to the same thing. You could certainly code that type of thing in a mapper, but it would require either a significant base class and/or a new dynamically compiled class that inherits from the original. Some of the more advanced mappers may have this functionality already, but I don't think it sounds like the direction I want to go.



    Sorry you don't like blue -- what if it was <font color=navy>Navy</font> instead of <font color=blue>Blue</font>?

  • Yeah, a dynamic proxy in Java can do that work. Navy is better than blue, but I think black is the best. Sorry for my requests.
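For the curious, the Java dynamic-proxy approach mentioned above can flag a property dirty on every setter call, even when the value is unchanged, which is exactly the per-property behavior being asked for. A minimal sketch (all names hypothetical):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashSet;
import java.util.Set;

// Minimal sketch: a JDK dynamic proxy records every setter invocation,
// so a property is flagged dirty even when reassigned the same value.
interface Product {
    void setName(String name);
    String getName();
}

class DirtyTrackingHandler implements InvocationHandler {
    final Set<String> dirty = new HashSet<>();
    private String name;

    public Object invoke(Object proxy, Method method, Object[] args) {
        if (method.getName().equals("setName")) {
            dirty.add("name");          // flagged on assignment, not on diff
            name = (String) args[0];
            return null;
        }
        if (method.getName().equals("getName")) {
            return name;
        }
        return null;
    }

    static Product wrap(DirtyTrackingHandler handler) {
        return (Product) Proxy.newProxyInstance(
            Product.class.getClassLoader(),
            new Class<?>[] { Product.class },
            handler);
    }
}
```

The trade-off Paul describes holds here too: this only works because the caller goes through an interface the framework can intercept, which is the kind of structural requirement WilsonORMapper deliberately avoids.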

  • Hey Frans:



    My performance tuning so far has just been lots of hand-placed timing statements, where I gradually drill down until I find the lines that are taking longer than expected. It's time consuming, but I've found it works wonders in the hands of someone who has experience, like you or me. I tried to play with the demo of DevPartner Studio, but it returns way too many false positives to be of much use to me. I may try NProf though -- I've seen the blogs about it, and your results do sound intriguing.



    Thanks, Paul

  • What happens if there are approximately 100,000+ Product objects (e.g., Product[] products = new Product[100000]) -- will they slow the app down?

    What is the best approach to managing a huge list of objects?

    Thank you very much for any ideas.

    Lenny

  • It doesn't really matter whether you're using an O/R Mapper or regular ADO.NET -- don't work with more objects/records than necessary. If it's something to display to the user, then use paging, and provide filtering and sorting options -- no person is going to look at 100,000 records. If it's something that needs processing, then do it the most efficient way possible, which would be in the database using sql or a stored proc.
