Examples of O/R Mapping vs Stored Procedures

Mike Schinkel of Xtras.net responded to my previous post about "dynamic sql" by asking for a real-world comparison "example" that would clearly demonstrate the pros and cons of O/R Mapping.  I seriously thought about providing such an "example" for a few minutes, but (1) there are already enough "examples" for those who are truly interested and (2) it would be a lot of work that I try to avoid.  :)  Seriously, see my ASPAlliance article for an introduction, see my ORMapper.net site for more snippets, and finally, and most importantly, download my (or someone else's) ORMapper demo and see it all work for yourself.  I also really do mean that giving you what you want would be a lot of work -- but only on the "traditional" side, since the examples on the O/R mapping "side" are quite trivial, as the article, snippets, and demo should make very clear if you think about it.

Why?  Let's say for a moment that our "example" is going to be just a single simple table with a few fields, along with the basic CRUD, and nothing else.  The first thing I need to do for both "sides" is to create my database table -- simple enough.  Now for my ORMapper, all I need to do is create a very simple entity class and an XML mapping file, both of which can be done either with my automated tool (the ORHelper) or manually without too much work.  An entity class, if you don't know, is typically a class with private member variables for each field and public properties to wrap those members with any additional business or validation logic necessary.  Note that you are probably also going to be creating this very same entity class in the "traditional" approach, unless you are going to work with untyped and bloated datasets (yuck), so the entity class is something you need on both "sides" again.  That really just means that the XML mapping file is the only extra thing required to use the ORMapper -- and again it can be generated, or created by hand without too much trouble (see my demo or snippets to see for yourself if you don't believe it).  Seriously, that's all -- now you just create an instance of the ObjectSpace, passing it your connection string and the mapping file, and you can instantly create new objects, update them, and delete them -- not to mention retrieve entire collections, filtered and sorted collections, and paged collections -- for any database provider at that.
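To make that concrete, here is roughly what the O/R side of this little "example" amounts to.  The entity class below is plain C# and compiles as-is; the mapping shown in the comment is only a hypothetical sketch of the idea, since the exact WilsonORMapper mapping schema (and its ObjectSpace API) is best taken from the demo rather than paraphrased from memory here.

    // The entity class: private member variables for each field, wrapped by
    // public properties -- the same class the "traditional" approach needs too.
    public class Contact
    {
        private int _id;
        private string _name;
        private string _email;

        public int Id { get { return _id; } set { _id = value; } }
        public string Name { get { return _name; } set { _name = value; } }
        public string Email { get { return _email; } set { _email = value; } }
    }

    // The one extra piece is the XML mapping file tying members to columns.
    // A hypothetical sketch of such a mapping (not the exact schema):
    //
    //   <entity type="MyApp.Contact, MyApp" table="Contacts" keyMember="_id">
    //     <attribute member="_id"    field="ContactId" />
    //     <attribute member="_name"  field="ContactName" />
    //     <attribute member="_email" field="Email" />
    //   </entity>

From there it really is just the ObjectSpace, constructed with your connection string and this mapping file, generating the CRUD for you.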

What about the "traditional" approach?  We've already noted that we need the database table and entity class for both "sides", so what else is needed?  We need a minimum of 4 stored procedures, and I really mean a minimum, since you will likely want at least 5 or 6, or even more.  Why a minimum of 4?  One for inserts, one for updates, one for deletes, and one for retrieval -- and it's that last one that may require several variants even in the simplest of cases.  Why?  How about one stored proc to retrieve all objects, one to retrieve a single object by its primary key, and one or more to cover other variations.  Note that even this, a simple situation where just 6 stored procedures are involved, is often not enough, since that "other variations" concept can get rather involved.  Again, why?  You probably want to retrieve one object by some other more "natural" key, and you are likely to want to retrieve collections with various filters and sorts -- and each of these either requires a new proc, or you're going to end up creating a huge ugly proc that accepts parameters for filters and sorts and then uses dynamic SQL to get the resultset!  And that brings us back full circle to what you wanted to avoid, and we haven't even hooked anything up yet -- all we've done is create a bunch of boring stored procs.  Before we move on to see what else is needed, let's note that your options really are to either limit your query flexibility, write lots of stored procs, or write one massive proc that uses dynamic SQL anyhow -- one that will likely also require your application to know all of the field names just to work.
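To show where that dynamic SQL pressure comes from, here is a rough C# sketch of the kind of optional-filter query that otherwise turns into either a pile of retrieval procs or one giant proc.  The table and column names are invented for the illustration, and note that the filter values themselves stay parameterized.

    using System.Data;
    using System.Data.SqlClient;

    public class ContactQueries
    {
        // One method covers "all", "filtered by name", and "filtered by email"
        // instead of one stored procedure per combination.
        public static DataTable GetContacts(string connectionString,
            string nameStartsWith, string emailDomain)
        {
            string sql = "SELECT ContactId, ContactName, Email FROM Contacts WHERE 1 = 1";

            using (SqlConnection connection = new SqlConnection(connectionString))
            using (SqlCommand command = new SqlCommand())
            {
                if (nameStartsWith != null && nameStartsWith.Length > 0)
                {
                    sql += " AND ContactName LIKE @name";
                    command.Parameters.Add("@name", SqlDbType.NVarChar, 100).Value =
                        nameStartsWith + "%";
                }
                if (emailDomain != null && emailDomain.Length > 0)
                {
                    sql += " AND Email LIKE @domain";
                    command.Parameters.Add("@domain", SqlDbType.NVarChar, 100).Value =
                        "%@" + emailDomain;
                }

                command.Connection = connection;
                command.CommandText = sql;

                DataTable results = new DataTable();
                new SqlDataAdapter(command).Fill(results);
                return results;
            }
        }
    }

Sorting is the same story, except that an ORDER BY clause cannot be parameterized, so it needs even more care -- which is what the SQL injection question down in the comments is about.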

So what else is needed beyond procs?  You need a DAL, a Data Access Layer, that enables you to call all of these stored procs -- and hopefully one that is easy to use (the MS DAAB is not, in my opinion) and which actually closes all of your connections (the CSLA examples didn't at one time, although hopefully they do now after it was pointed out by yours truly).  I personally don't think a DAL is all that difficult, but then again that's why I could write an O/R mapper -- I've worked with lots of developers over the years and most fall seriously short in understanding how to use any DAL given to them, let alone build one themselves.  OK, so let's assume you have a DAL or are going to use the MS DAAB -- now you still have to add all the code in your entity class to "hook up" the class with all of these various stored procs -- and I assume you've done this enough to recognize that even using cut-n-paste this isn't trivial -- it's a lot of boring, repetitive work that is prone to mistakes.  And you've either got to have your code know all of the field names in order to associate the correct values with your class members, or you've got to very carefully assume a certain field order (hopefully you'll use an enum to at least make it a little more maintainable -- but most people don't).  As noted earlier, you'll either have limited functionality, or a lot of procs to hard-code into your class, or a very big monster proc that can be used lots of ways but will be difficult to work with in reality.
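For anyone who hasn't written this "hook up" code a million times, here is a minimal sketch of a single retrieval method against an invented GetContactById proc, reusing the Contact class sketched earlier; multiply it by every proc and every entity to get a feel for the cut-n-paste involved.

    using System.Data;
    using System.Data.SqlClient;

    public class ContactData
    {
        private string _connectionString;

        public ContactData(string connectionString)
        {
            _connectionString = connectionString;
        }

        // One hand-written method like this per stored proc, per entity.
        public Contact GetById(int contactId)
        {
            using (SqlConnection connection = new SqlConnection(_connectionString))
            using (SqlCommand command = new SqlCommand("GetContactById", connection))
            {
                command.CommandType = CommandType.StoredProcedure;
                command.Parameters.Add("@ContactId", SqlDbType.Int).Value = contactId;

                connection.Open();
                using (SqlDataReader reader = command.ExecuteReader())
                {
                    if (!reader.Read())
                    {
                        return null;
                    }

                    // The field names (or ordinals) are hard-coded right here,
                    // which is exactly the coupling described above.
                    Contact contact = new Contact();
                    contact.Id = (int) reader["ContactId"];
                    contact.Name = (string) reader["ContactName"];
                    contact.Email = (string) reader["Email"];
                    return contact;
                }
            }
        }
    }

The ordinal-based alternative (reader.GetInt32(0) and friends) just trades the hard-coded names for a hard-coded column order, which is where the enum trick mentioned above comes in.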

That seems like a heck of a lot of work to me, just for one simple little table -- and no, I'm not going to create an "example" of it for you to compare with my O/R Mapper, since we've all done it a million times.  Come on -- you should know the amount of work your "side" requires, so why don't you download the demo of my mapper, or one of the other ones, and see for yourself how little work they require!  Yes, there are alternatives -- maybe you like to use untyped and bloated datasets instead of entity classes and commonly accepted object-oriented principles -- but you've still got to write all those procs, get or build a DAL, and then "hook up" all those stored procs for retrieval and persistence with these datasets.  Maybe you prefer typed datasets instead -- it's still going to take that same work, although now you'll at least get some compile-time checking and a little bit of IntelliSense (not much).  Finally, you could of course generate all of this "traditional" code (using CodeSmith or some other tool) -- and this is one alternative that I won't make fun of -- it really is a valid, workable alternative.  That said, you still have to either give up a lot of flexibility that a mapper will give you for free, or create and maintain some very sophisticated templates for your code generation.  For those that need the utmost control (yes, everyone thinks they are in this category, but realistically very few ever are) this is a valid option, but it's a lot of work and I know that most development teams aren't up to it!

What about maintenance?  What if you need to change a field name (silly example, but lots of people bring it up)?  For my mapper you just change the name of the field in the XML mapping file -- end of story -- no recompile.  Some mappers use attributes and would require a recompile, but what about these other "traditional" techniques?  First, you need to change the name of this field in a lot of stored procedures -- and if you're very lucky that's all -- still more work than using my mapper.  But if you hardcoded your field names in your application (and many really do, as pointed out earlier) then you're also going to have to change source code, maybe in multiple places, and do a recompile -- and you'd better have the test team do some testing.  If you used code generation then maybe you only need to regenerate the application's code and recompile -- still more work than my mapper.  Oops, most teams really end up customizing the generated code since they didn't spend enough time on the templates, so they'll either have to redo this customization, or just give up on code generation and instead alter their procs and code manually!  What if you need to add a new field (a little more realistic)?  Add a new field to the mappings and a new member to the entity class if you're using a mapper -- "aha," the critics roar, "you have to recompile!"  Seriously, can you really just add this field to your stored procs and never recompile?  No -- because both of us also need to add this new field to our UIs or business logic in some manner -- otherwise why are we bothering to add this field in the first place?

Hopefully I've done a fair job of convincing you that a good O/R Mapper really is a huge time-savings, both initially and for maintenance -- but what about other concerns?  Everyone knows that performance sucks -- NOT -- people can debate this one forever, so try it for yourself -- that's exactly what I did before I made the jump from "traditional" to mapper!  Most CRUD is pretty straightforward actually, and most mappers avoid the bloat of datasets, so there's actually little difference, and sometimes they perform even better.  But seriously, there are many other factors that weigh much heavier -- network traffic or really cool UIs come to mind immediately -- so try it out for yourself and get over it.  The other gotcha is security -- again mostly NOT -- although I'll readily admit this one isn't as clear-cut.  Basically, you are "forced" to give a database "user" direct read and write access to tables (unless you want to map stored procs and lose a lot of the benefits), but no one ever said you had to give out the credentials of this database "user" to your real end users!  On the flip side, can your "traditional" approach support a different database if you have a client that demands it (common if you sell your application), can your updates handle optimistic concurrency or update only changed fields, and are your DAL and your application easy to port to a distributed scenario when the system grows?  Do your users demand more flexibility in their queries, with the ability to specify any criteria or sort, possibly along with paging?  Does your team do a good job of syncing up changes across stored procs, entity/DAL logic, and the UI, as opposed to a much smaller set of changes?  (I remember how painful this was in team development using the "traditional" model.)
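As a reference point for the concurrency question, this is roughly the UPDATE a mapper can issue when asked for optimistic concurrency while touching only the changed field -- written out here as plain ADO.NET against the invented Contacts table, with the original value carried in the WHERE clause so that a conflicting edit simply affects zero rows.

    using System.Data;
    using System.Data.SqlClient;

    public class ConcurrencyExample
    {
        // Returns false if someone else changed the row since it was loaded.
        public static bool UpdateEmail(string connectionString, int contactId,
            string newEmail, string originalEmail)
        {
            const string sql =
                "UPDATE Contacts SET Email = @newEmail " +
                "WHERE ContactId = @id AND Email = @originalEmail";

            using (SqlConnection connection = new SqlConnection(connectionString))
            using (SqlCommand command = new SqlCommand(sql, connection))
            {
                command.Parameters.Add("@newEmail", SqlDbType.NVarChar, 100).Value = newEmail;
                command.Parameters.Add("@id", SqlDbType.Int).Value = contactId;
                command.Parameters.Add("@originalEmail", SqlDbType.NVarChar, 100).Value = originalEmail;

                connection.Open();
                return command.ExecuteNonQuery() == 1;  // 0 rows means a concurrency conflict
            }
        }
    }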

Finally, do you really think that your "traditional" model is even all that traditional in the first place?  If so, then talk to some enterprise Java developers -- they are probably using either an O/R Mapper or some serious code generation!  Or what about SAP and other similarly configurable apps -- you'll find something very much like an O/R Mapper under the covers, although it probably won't be very generic.  For that matter, if you look closely at very configurable Microsoft applications you're going to see some very "mapper"-like use of meta-data, as opposed to everything being stored procs!  This is NOT something that Frans, Thomas, or I made up -- O/R mappers are a commonly used, tried and tested, enterprise-scale design pattern!  That's also why half of the Patterns of Enterprise Application Architecture book that I have by Martin Fowler is about nothing other than O/R Mapping -- no kidding at all.  No, I didn't invent this, nor did Frans -- in fact both of us were in the "traditional" Microsoft stored proc camp until we listened, tried it for ourselves, and learned something new.  I only created my mapper because I think most of the others out there are too complex for my taste -- simplification is the one thing that I'm actually very good at.

So I stand by my claim that you should seriously think about either using a good O/R mapper or doing some serious code generation -- and if code generation, then please spend the time and do it the right way, or you'll be back where you started when you have to extend or maintain.  The choices are simple -- spend all your time writing routine and boring CRUD instead of a killer application, use a tool like an O/R Mapper that does it for you and lets you focus on the real issues, or, if you must insist on the utmost control, spend some real time up-front and write some killer code generation templates so that later you'll be free to focus on those same real issues.  I think most projects don't justify the time or the need for the utmost control, so I typically prefer the O/R Mapper approach lately, but I can also agree that a well-designed set of templates can be reusable to a large degree if you have multiple projects that fit the same scheme.  Lastly, I also like O/R Mappers because I like simplicity -- I don't want to see the code of the third-party grids that I use, and I honestly don't typically need to see the code for all the boring CRUD either.  But that's just me -- I just hope that you aren't doing it the old cut-n-paste way that many samples and books (and even tools like Visual Studio) "teach" you to do.  Actually, I take that back -- I hope you do continue to use those inefficient techniques so that I'll be able to underbid you on your next project!  :)

15 Comments

  • Great going Paul (Boy, you know how to sell!)



    ;)

    Srdjan - Can I have 5 of these please? NOW?

  • I agree on the benefits you describe for O/R mappers. Yes, I totally recognise all the work needed to write it the traditional way using MSDAAB and stored procedures. We're doing this all the time ...



    If I understand correctly, your Wilson ORMapper only works with ADO.NET 2.0, which is not released yet. Is there a good alternative for using an ORMapper with the current 1.1 version of the .NET framework?

  • Paul,



    Very nicely done. The point I would like to bring out most of all is that some of us are small, product-based companies, and our products need to work against multiple database engines out-of-the-box. Installing and maintaining stored procedures for each of the databases at each of our customers is not feasible. This is another reason I believe in O/R Mappers. Thanks again Paul!

  • Hey Jochen:



    The WilsonORMapper works with .NET v1.1! It's not hard to get it to work with v1.0 either, but there are a few small mods required (references to Oracle). As far as I know, every other O/R mapper out right now also targets v1.*, not v2.0. That's not to say our mappers won't work with v2.0, but they are available today for v1.*.



    Thanks, Paul



  • Well said!



    I agree wholesale with your post.



    /Mats

  • I can't agree more. The dynamic SQL capabilities of O/R Mappers are a dream come true. O/R Mappers allow me to focus more on the business entities and rules, as opposed to the plethora of ways my business framework may want to access and massage the data in the database. Not having stored procedures also means one less place where errors and problems can appear in one's applications.



    Whether or not an O/R Mapper or code generation solution requires a re-compile when introducing a change to entity relations or entity members, I would have a difficult time not re-compiling and testing the application (or at least some piece of it) anyway, just to make sure the application does not break or act in an unexpected manner. This is good practice, and therefore it's a moot point in terms of maintenance for any solution. However, the less work I have to perform to introduce the entity or relationship change, the better!



    CodeSmith is an awesome tool and one I use regularly for all kinds of template-based code/data generation. I realize some people enjoy CodeSmith for creating a DAL because they want the ultimate in control, but given the maturity of a lot of the .NET O/R Mappers now, I think many of the reasons for choosing CodeSmith over O/R Mappers may be outdated and invalid. Certainly nobody wants to walk away from an investment in CodeSmith, their templates, and the time and effort spent on learning the CodeSmith syntax, but one needs to constantly scan the market and stay up-to-date on the evolution of tools and solutions. I often talk to people who keep basing their decisions, or arguing their point, on experience with really early versions of software that has matured by leaps and bounds since they last used it. If this is the case, I recommend looking at some O/R Mappers again, as well as any other tools you may have dismissed in the past.



    Good post, Paul.

  • Good article! :)



    Now for the $100,000 question: when are O/R mappers not very efficient?



    time starts... now!

    *tick*

    *tick*

    *tick*

    ...

  • Time's up!



    Some O/R mappers work solely with objects and have no logic that works directly against the database. So when you want to delete 1,000 entities, or update 1,000 entities (all employees' salaries +10%!), you have to fetch them first and then perform the action. This can be inefficient (a sketch of the difference follows this comment).



    I throw this example in to show that O/R mapping is PART of the solution to the problem that's called 'Data Access'. O/R mapping is not THE solution entirely; it requires extra functionality to make it THE solution, and I think in future versions other O/R mappers than mine will have features like this. :)



    It's complex, solving the Data Access problem... and people get confused very quickly when terms like 'O/R mapping' enter the discussion... sadly enough.
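A rough sketch of the contrast described in the comment above, against an invented Employees table: the set-based statement below updates every row in one round trip, whereas a purely object-based mapper would have to load each Employee, change it in memory, and persist it back one at a time.

    using System.Data;
    using System.Data.SqlClient;

    public class SalaryExample
    {
        // Set-based: one statement, executed entirely on the database server.
        public static int GiveEveryoneARaise(string connectionString, decimal percent)
        {
            const string sql =
                "UPDATE Employees SET Salary = Salary * (1 + @percent / 100)";

            using (SqlConnection connection = new SqlConnection(connectionString))
            using (SqlCommand command = new SqlCommand(sql, connection))
            {
                command.Parameters.Add("@percent", SqlDbType.Decimal).Value = percent;
                connection.Open();
                return command.ExecuteNonQuery();  // rows affected
            }
        }
    }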

  • By the way, the WilsonORMapper has long had DAL-friendly methods for situations like this, and more besides. It also gives you an interface for "events" that you can "hook", which occur at retrieval and before/after each persistence.



    Now, one place where objects are not the answer is when you need to retrieve a huge number of records. O/R mappers can actually be faster than the bloated dataset for smaller numbers of records, but the reverse is true for larger numbers of records due to the large number of object instantiations. The WilsonORMapper can handle this situation too, though, returning raw datasets, even typed ones, which can be necessary for reporting situations.

  • Great explanation! 100% agree with you.

  • How do you handle SQL injection in your dynamic SQL? I know parameterized queries don't allow it, but what about a dynamic ORDER BY (which cannot be parameterized, I believe)?
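One common mitigation, sketched here purely for illustration (and not necessarily the approach referred to in the replies that follow): never concatenate the user's sort value at all, but map a small set of allowed sort keys onto known column names and fall back to a default for anything else.

    public class SortHelper
    {
        // Only these known clauses ever reach the SQL text, so the user's
        // input itself is never concatenated into the query.
        public static string ToOrderByClause(string sortKey)
        {
            switch (sortKey)
            {
                case "name":    return " ORDER BY ContactName";
                case "email":   return " ORDER BY Email";
                case "created": return " ORDER BY CreatedOn DESC";
                default:        return " ORDER BY ContactId";
            }
        }
    }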

  • Thanks for the information; we are using method #2 ourselves, but relying on cleaning data always makes me feel... dirty.



    I guess you can probably account for 99.999% of attacks that way, and guarding against that remaining 0.001% chance isn't worth the cost it takes to secure it.



    Security and Flexibility always seem to be in direct contradiction of each other.



    Thanks for the information

  • Another good reason NOT to use SPs is that most people put _so_ much functionality in SPs that they would, most of the time, never put it in dynamic inline SQL statements (parameterized, of course). I just came onto a project where SPs are calling SPs, with lots of ifs and way more.



    Some freezing is done with a datetime as a parameter. SQL functionality is used to get the last day of the previous month, and that date is used as a parameter for other SPs. But what if something goes wrong and you have to get the last day of the month from 3 months ago? Then you have to rewrite the SPs or do everything manually. Just a simple example of things that can go very wrong with SPs, simply because they make it possible. I can only fear what to expect once we can use .NET code in SQL 2005!
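A sketch of the alternative the commenter is getting at: compute the date in application code and pass it in as an ordinary parameter, so the "three months ago" case becomes a different argument instead of a stored procedure rewrite. The names here are illustrative.

    using System;

    public class PeriodExample
    {
        // Last day of the month that ended monthsBack months ago;
        // monthsBack = 1 gives the last day of the previous month.
        public static DateTime LastDayOfMonthsBack(DateTime asOf, int monthsBack)
        {
            DateTime firstOfThisMonth = new DateTime(asOf.Year, asOf.Month, 1);
            return firstOfThisMonth.AddMonths(1 - monthsBack).AddDays(-1);
        }
    }

The result then goes to the query or proc as a plain datetime parameter, instead of being computed and hard-coded inside the proc itself.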

  • Hi there,



    just yesterday I started my master's thesis about O/R mappers. Can any of you give me a hint about good theory on that stuff? I really need to learn it from the beginning, so any good books would be great.

    My main thesis will be about the pros and cons of O/R Mapping, performance, and testing/describing some tools...

    For any information, links, and book tips I will be most grateful.



    Best regards Christian



    Excuse my bad English, I am German and not that used to writing it, sorry.

  • Wow that was pretty fast, thx a lot Paul!
