O/R Mappers: Avoiding Reflection

Steve Eichert posted his findings yesterday about the performance cost of reflection.  I knew reflection was slower than direct access, but I had no clue it could be that bad.  I haven't run many tests of my own yet to confirm his numbers, but it doesn't really matter, since I agree that reflection is definitely slow.  So why does this matter?  Right now my WilsonORMapper uses a lot of reflection to get and set field values.  I was planning on doing something to fix that sooner or later, but Steve's post got me thinking about making it my next priority.

So how should I go about getting rid of reflection?  The first solution I came up with is the easiest for me to implement, although I'll admit it seems kind of kludgy and ugly to the user.  Basically, I would provide an interface with one property (an indexer, in C# terms) whose single parameter would be the member name specified in the mapping file.  The user (or my WilsonORHelper) would simply implement this property with a switch block that gets or sets the matching member variable directly, avoiding reflection when performance is a consideration.  I would need to use reflection only once, on the initial load of the mappings, to see whether the class implements this interface.
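Here is a minimal sketch of that idea, written in Java as a stand-in for the .NET version.  Java has no indexers, so a get/set method pair plays the role of the single parameterized property.  The names MemberAccess, Customer, Order, and FieldMapper are hypothetical and exist only for this illustration; this is not the actual WilsonORMapper code.

```java
import java.lang.reflect.Field;

// Hypothetical opt-in interface: one accessor pair keyed by the mapped member name.
interface MemberAccess {
    Object getMember(String member);
    void setMember(String member, Object value);
}

// An entity that opts in: the switch blocks touch the private fields directly.
class Customer implements MemberAccess {
    private int id;
    private String name;

    @Override
    public Object getMember(String member) {
        switch (member) {
            case "id": return id;
            case "name": return name;
            default: throw new IllegalArgumentException("Unknown member: " + member);
        }
    }

    @Override
    public void setMember(String member, Object value) {
        switch (member) {
            case "id": id = (Integer) value; break;
            case "name": name = (String) value; break;
            default: throw new IllegalArgumentException("Unknown member: " + member);
        }
    }
}

// An entity that does not opt in; the mapper falls back to reflection for it.
class Order {
    private double total;
}

// Hypothetical mapper piece: checks for the interface once, when mappings load.
class FieldMapper {
    private final Class<?> type;
    private final boolean direct;

    FieldMapper(Class<?> type) {
        this.type = type;
        this.direct = MemberAccess.class.isAssignableFrom(type); // the one-time check
    }

    boolean usesDirectAccess() { return direct; }

    Object get(Object entity, String member) {
        if (direct) return ((MemberAccess) entity).getMember(member);
        try {
            Field field = type.getDeclaredField(member);
            field.setAccessible(true); // needed because the field is private
            return field.get(entity);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read " + member, e);
        }
    }

    void set(Object entity, String member, Object value) {
        if (direct) { ((MemberAccess) entity).setMember(member, value); return; }
        try {
            Field field = type.getDeclaredField(member);
            field.setAccessible(true);
            field.set(entity, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot write " + member, e);
        }
    }
}
```

A FieldMapper built for Customer never touches reflection after its constructor runs, while one built for Order keeps working through the slower reflective path, so opting in stays optional.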

OK, that does sound pretty kludgy, and it does mean that I would be requiring the user to do something specific for a change.  But this would not be "required" unless they want or need the performance gain, and it's still not requiring them to inherit from a specific class either.  Implementing an interface, while definitely a requirement, is not as big a "burden", since a class can implement multiple interfaces in .NET.  And again, it's not really required unless the user really wants or needs the performance, which may be necessary for collections of many objects, but maybe not for other cases.

So what other options exist to avoid reflection?  One option that Steve mentioned is to use CodeDOM to dynamically create an assembly, with a signature that the O/R mapping framework understands, that knows how to call the public members or properties of the original class.  Those public members or properties might be specified entirely in the mappings, or reflection might need to be used once at startup at most.  The problem I have with this technique is that it requires public members or properties, and it doesn't handle any member that is publicly read-only.  Assuming that there aren't many read-only cases, what's the problem with using public properties, since they almost always exist anyhow?  The problem is that properties are often (and should be) wrappers around the private fields that contain additional validation or business logic.  There's nothing wrong with your public property rejecting or modifying the user's attempt to set an invalid value, but it should not prevent me from loading data that currently exists in the database.
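To make that objection concrete, here is a small sketch in Java, used as a stand-in for .NET.  Account and AccountPropertyAccessor are hypothetical names, and the accessor is hand-written here to play the role of a generated class that can only call public members.

```java
// Entity whose public setter enforces a business rule.
class Account {
    private double balance;

    public double getBalance() { return balance; }

    // Business rule: callers may not set a negative balance.
    public void setBalance(double value) {
        if (value < 0) {
            throw new IllegalArgumentException("Balance cannot be negative");
        }
        balance = value;
    }
}

// Stand-in for a generated accessor that is limited to public members.
class AccountPropertyAccessor {
    // Returns false when the entity's own validation rejects the stored value.
    boolean tryLoadBalance(Account target, double storedValue) {
        try {
            target.setBalance(storedValue);
            return true;
        } catch (IllegalArgumentException rejected) {
            return false; // the row exists in the database but cannot be loaded
        }
    }
}

class PropertyAccessDemo {
    public static void main(String[] args) {
        Account account = new Account();
        AccountPropertyAccessor accessor = new AccountPropertyAccessor();
        System.out.println(accessor.tryLoadBalance(account, 100.0)); // accepted by the setter
        System.out.println(accessor.tryLoadBalance(account, -25.0)); // rejected by the setter
    }
}
```

A negative balance that is already stored in the database never reaches the object, because the only way in is the validating setter.  Writing to the private field directly would sidestep that, which is why field-level access matters to the mapper.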

So what do other O/R mappers do instead?  Some O/R mappers use CodeDOM to dynamically create an assembly with new classes that inherit from the original classes, and these become the real ones used by the mapping framework.  This can be done either by having the original classes be abstract, with all the real logic created in the new dynamic inherited class, or by requiring the original class to expose its member variables as protected.  The problems with this approach are that you can't use new to explicitly create your classes anymore, and the classes that the framework returns are technically a different type than was originally expected.  Neither of those is a significant problem, but the requirements to do this aren't trivial.  You either have to forgo providing your own implementation that includes your validation and business logic with an abstract class, or you have to expose all of your member variables as protected, and neither of those requirements is very friendly.  There may be solutions to this that some O/R mapping frameworks have discovered, so I'm not trying to imply there aren't, but I doubt any such solutions are trivial, and they are probably therefore out of my reach to easily implement.
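The abstract-class variant of this approach might look roughly like the following sketch, again in Java as a stand-in for .NET.  Product, GeneratedProduct, and ProductFactory are hypothetical names, and the subclass is written by hand here even though a real framework would emit it at runtime.

```java
// The user's class: abstract, so it cannot be created with new.
abstract class Product {
    public abstract String getName();
    public abstract void setName(String name);

    public String label() { return "Product: " + getName(); }
}

// Stand-in for the subclass a framework would generate; it owns the storage.
class GeneratedProduct extends Product {
    private String name;

    @Override public String getName() { return name; }
    @Override public void setName(String name) { this.name = name; }

    // Framework-only hook: loads stored data without reflection.
    void loadName(String storedName) { this.name = storedName; }
}

// Callers must go through a factory instead of calling new Product().
class ProductFactory {
    static Product load(String storedName) {
        GeneratedProduct product = new GeneratedProduct();
        product.loadName(storedName);
        return product;
    }
}
```

Both drawbacks show up here: new Product() does not compile because the class is abstract, and the object that ProductFactory.load returns has the runtime type GeneratedProduct rather than Product.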

What other solutions are used by O/R mappers?  Some O/R mappers require you to generate lots of code in order for them to work, so that they don't have to use reflection and so that they can also gain other "insider" knowledge.  I don't want to imply that this is "bad", or not a valid technique for O/R mappers, both because that's not necessarily the case, and because there have been other discussions on this already.  That said, it's not what I want to do with my O/R mapper, so this was never an alternative I seriously considered.  One thing it does do for me, however, is validate that letting the user implement an optional interface with a single property, if they want or need the extra performance, is not totally out of line with other tools.  And since I can make my WilsonORHelper generate this code if the user wants to use my helper and turn on this feature, I do feel like it's not at all too much of a "burden".

So at this point I've just about concluded that my first solution is probably good enough, at least for my simple O/R mapper, and I have also decided that I don't really like the other alternatives, at least those that I can think of or find.  Then it occurred to me that I should try to figure out what ObjectSpaces is doing, but I quickly gave up, since their code is just too much for me to figure out without lots of time and work.  Then, on a whim, I decided to Google ObjectSpaces and reflection.  The first result was a blog entry by Andres Aguiar that confirms ObjectSpaces does use lots of reflection, but that this is somehow going to be less of a performance hit in .NET v2.0.  The fifth result was some documentation about ObjectSpaces and an IObjectHelper interface that I had never noticed before -- and remarkably it sounds exactly like what I was proposing to do!  There's also an IObjectNotification interface that can be implemented to enable your objects to receive events when they are updated or deleted or when an exception occurs, which was something else I was wanting to do somehow.

That's enough research for me.  I liked my solution to begin with, and since I'm mostly using the syntax of ObjectSpaces anyhow, this will now definitely be the thing I implement in the next few weeks or so.  Of course, that still doesn't mean it's the best or "right" solution, so I'm still interested in what others think of my solution and the other options.

9 Comments

  • Before you start to worry about redesigning the mapping layer, I would suggest you sit down and actually stress test your code.

    If it is orders of magnitude slower, perhaps a redesign/rethink is in order. However, a mapping layer, by nature, is going to add performance overheads to your system. That is the tradeoff with the mapper. Ease of use versus speed of execution.

    A reflection based mapping layer is never going to come close to hand rolled sql + hand written object instantiation code.

    Yes reflection is slow. No, it really isn't as slow as everyone thinks it is. (Take a look at the asp.net runtime code... there's a lot of reflection going on there)

  • Paul, while on the surface I can agree that, to be pedantic, it might seem like a good idea to increase the performance of your OR mapper by replacing the Reflection, don't you kind of think that this is exactly the type of problem that Reflection was built to handle? It's probably not as if you are going to be handling MILLIONS of calls per minute :-) and having the Reflection cause bottlenecks.

    Either way, I'll bet that the bottleneck caused by the Reflective calls is significantly smaller than the bottlenecks caused by the system you are replacing :-) - i.e., hand coding.

  • Frans,

    My old O/R mapping layer (non-.NET) was implemented using that pattern, but I felt like it didn't do enough for the user. I believe the user just wants to work with their own strongly typed fields. I believe that asking the user to write a ton of casting logic in their entity class' properties/methods is a pain in the butt for users. I believe the complexity should be in the O/R mapper, not in the entity objects themselves. As for the example of versioning the fields, that's easy enough to do within the mapping layer itself. Obviously the trade off is performance. There's a billion ways to skin the O/R Mapping cat. I guess it will be up to the users to choose a framework that makes the most sense to them.

  • What's your goal with this O/R Mapper - to figure out how things work, or would you like to use it in a production environment?

  • All right, I've got my performance up and comparable to DataSets now, beating them in some cases.

    Later, Paul Wilson

  • Why not save the PropertyInfo object for each BusinessEntity in a shared hashtable and retrieve it from there whenever we need to set the value of a property through reflection?

  • As Steve noted, and I verified, most of the performance hit occurs when actually "using" reflection, regardless of whether the reflection "object" was reused. I actually do reuse the reflection "objects" to get the small amount of efficiency this provides, but it's simply not at all the same as direct access, so I must recommend the interface approach when performance is really needed.

  • Yes Paul, you are right, I tested that and there is a small performance gain. But during the test I found one interesting thing: setting a private member through reflection takes a lot more time than setting a public member. Any idea why that is so?

  • Yes that was also noted in earlier results by Steve -- I believe security was suggested as the reason since you have additional security checks to see if you can reflect on private members.
