Stored Procedures vs. Dynamic Queries.

In Ted Graham's blog this morning he wrote a little text about the debate that is starting to show up here and there: is there a move away from 'Stored Procedures' towards 'Dynamic Created Queries' ? I think there is.

In the big, ugly Data Layer Generator/Toolkit debate over at the asp.net forums, there was some discussion about dynamic queries vs. stored procedures. As you can see, if you read the thread, I was one of the people who thought stored procedures was the way to go, for performance reasons for example. In the last couple of weeks I did some tests to see if what I was thinking back then when the thread was going on, was actually as true as I thought it was.

In a recent blog, I talked about stored procedures with optional parameters. The reason I was looking into these was that if you have a table with a couple of interesting fields, and you want to filter on a combination of any of these fields, you'll end up with a lot of stored procedures. LLBLGen f.e. generates for each foreign key found in a table T a different stored procedure to filter on that foreign key field, thus not even on a combination of these foreign key fields, if more of these fields are found in a table. This does result in excessive code bloat if you are not careful.

During the past year, users of LLBLGen have asked me to generate even more stored procedures, f.e. selects with filters based on a combination of fields, with joins etcetera, etcetera. For version 2.x, which I'm currently developing, I looked into ways to do this without having to generate a zillion stored procedures or to require the user to design every stored procedure in a fancy editor (which would take ages). I came up with two solutions: 1) Stored procedures with optional parameters and 2) dynamic generated queries (so no stored procedures). Option 2) I didn't like very much, because my mind told me stored procedures were the way to go, for performance, for security, for separation of tiers and perhaps there are a couple of other reasons you can come up with. However, I'm not 15 anymore, the Stubbornness of Youth has left me a long time ago so I decided to put up a (simple) test.

To make this test fair, an explanation is necessary. As mentioned above, if you want to avoid excessive code bloat in the stored procedure API, you have to use stored procedures with optional parameters, there is no way around it. For a database with 10 tables or so, code bloat isn't an issue, however it gets nasty when you have 50 or 60 tables or even more. So I did a test with a simple select using 3 foreign keys as possible filters. The stored procedure used is mentioned in this blog. I wrote 2 routines. One would build up a dynamic query and based on the input parameters (f.e. if the caller wanted to filter on just CustomerID, the other parameters would be null) it would generate a dynamic query using parameters. The other would simply call the stored procedure.

I'll save you the code, but the results were interesting. I compiled two programs using the routines, one would call the dynamic query routine while feeding it random input parameters, the other one would call the stored procedure routine and also feed it with random input parameters. I ran both at the same time, as a semi simulation of multi-user database hammering. The programs were ran on my P3-933 workstation (however, in these days of P4's, a better name would be 'electric typewriter') using my dual P3-933 Serverworks powered server with SqlServer 2000 and the Northwind database. To be fair, each time a query was started the connection was opened and closed again. All queries retrieved data in a DataTable object.

The dynamic query routine was twice as fast as the stored procedure routine. I ran the routines for about 10,000 loops, and a number of times, but each time the dynamic query one was faster. This was of course because the dynamic query was tailored to the task, without expensive statements as COALESCE(), while the stored procedure was always using the same, slower execution plan. I didn't expect this, because I thought the execution plan of the dynamic query is thrown away each time a query is executed, but this is not the case. The stored procedure version was using more CPU power on the server, while the dynamic query was using more CPU power on the client. I didn't cache any generated query so every time I had to create the query again in a stringbuilder object however that worked fine (and garbage collection kicked in rather smoothly).

This convinced me to choose the route of the Dynamic Queries. It will create the code in your n-tier stack so much simpler and uniform. LLBLGen's code generator will also be much simpler because it can now generate C# / VB.NET statements creating a dynamic query using the dynamic query engine, instead of that I have to write a complete stored procedure generator.

Aren't there any disadvantages? Of course. There is less separation between the tiers. This means that security set on stored procedures should be set elsewhere. With n-tier web-applications this is not a real issue, since they most of the time connect via the same user to the database. Security is then managed through functionality in the application, not in the lowest layer of the application: if you can't call a function in the database because you can't call the caller of that function in the Business Tier, you can't call the function in the database. If you trim down the amount of users who can access your application, you can trim down the amount of users who can execute code in your database. Admitted, it is without stored procedures harder to set up 'last resort'-hurdles to avoid code execution. However because a lot of web-applications using ASP.NET are connecting to the database using the same user (ASPNET user) or an impersonated user, there is a tunnel to the stored procedure code via that user anyway (and which implies security through functionality has to be implemented as well).

14 Comments

  • Nice research. I guess now I'm still wondering my I shouldn't make the next leap and go with an O/R mapper all the way? A few months ago I would have said no way, but then again I was convinced that sprocs were the way too. So please more enlightening posts like this last one instead of the endless debate on the forums. Thanks.

  • Paul, a MONTH ago I would have said: "Stored procedures are the way to go". However, there are times in life when you face the results of choices you've made, you don't like what you see, you suddenly know you made the wrong decisions. I've to admit I've spend 2 months developing a visual stored procedure designer for my generator just to meet the requirement to first generate stored procedures, and then generate call classes for these stored procedures. These 2 months were a total waste of time, because it is a) undoable to write a decent visual stored procedure designer without forcing the user to write SQL statements, and b) it will result in time-consuming work because the use has to design EVERY SINGLE action he wants to perform on the database in a stored procedure. For CRUD operations, that's easy. But what do you do when you have 3 foreign keys in your table and you want to have several delete stored procedures based on any combination of these foreign keys? That's 6 stored procedures if I do the math correctly :) Not a pleasant sight with a lot of tables.





    I think the debates on the forums did help me find another way out of this by going the other way around, which ended up in tiny pieces of code which produce on the fly what is needed. If you set it up in a way that the small statements use factories which produce agnostic objects with db specific code INSIDE them, you have very portable code (just change the factory).





    So I won't say the 'endless' debates on the forums are not worth it, I think they are. Every developer thinks on a given point in time they know what's best in a given situation. During debates on forums you learn other peoples views, and they will probably not enlighten you on the spot, but will work through in your work in some way, I'm sure of it (either by strengthen you in your believe you're right, or in a realisation you're wrong so you can change your way of doing things).





    If you wa

  • (the comment got clipped)


    If you want to write all the logic by hand, there is NO WAY an O/R mapper can help you. O/R mappers, and related generators, are ment to generate common code based on a database scheme / input parameters and using a set of choices which might be the ove rall best. But NO O/R mapper will ever beat a handwritten, handoptimized DAL with stored procedures for every single possible query with a combined calling mechanism which is smart enough not to get data when it shouldn't. Like John Carmack will never b e able to deliver a generic 3D engine which will perform top notch in every single 3D game possible.


    <br>





    <br>


    So, if you are aware of the compromises taken, you can then start looking at the advantages of using a ready-to-roll framework over writing it all by hand and check if the advantages are outweighting the disadvantages related to the compromises taken. W ith a solid O/R mapper or related generator, I'm pretty sure the ready-to-roll framework is well worth it.


  • So are you basically saying your next version will be an O/R mapper instead of a code generator? If not, then my question is still why your code generation approach (but without the sprocs) vs. something like EntityBroker?





    Thanks for your time.

  • Hi Frans,





    Just out of curiosity, did you run the test with specific stored procedures too, instead of the generic one?





    I understand that that wasn't a realistic option in your opinion, but I'm still curious regarding the result.


    :-)





    Best Regards,


    Jimmy


    ###

  • :) No I haven't. I did however profile the difference in speed between a stored procedure with optional parameters and a stored procedure tailored to one single filterset and the optional parameter one was much slower, since the optimizer can't optimize away a filter on a column even when the parameter is optional.





    I have to test it extensively to say anything about the speed between dedicated stored procedures and dynamic queries (thus in the case of 3 foreign keys which are a possible filter, you have to write 6 stored procedures and then select one of them to run the query of that moment), but I don't think they'll be much faster than the dynamic queries (the dynamic queries have to be build up each time which takes some time, network speed can be a factor when a high amount of queries is executed), based on f.e. the books online documentation of SqlServer 2000.





    Hand-optimized stuff is always the prefered way of doing things when performance is absolutely a matter of life and death. So is assembler. Neither one of them is recommeded in day to day applications due to the hard way of maintain such code.

  • I got an error so I try again...





    It would also be interesting to see how dynamic SQL in a stored procedure would perform compared to the other options.





    Best Regards,


    Jimmy


    ###

  • Frans,





    If I understand correctly, your dynamic query generator will run against JET -- which is used a lot in shared hosting environments. (Not everyone wants to pay for SQL Server.)





    That would be very nice indeed...

  • Paul Schaeflein,





    I do understand that shared hosting environments sometimes offer access database support as the only support for databases, however I firmly believe that access is not the way to go for n-tier applications. Access is a single-user toolkit for applications which need a (small)database. Other applications can better target databases which are ment to perform in a multi-user environment. If they do not want to pay for a database, MSDE is there to the rescue.





    The generator works with database drivers and database specific generators. In theory it's possible to write a driver for JET and thus a query generator for JET, however that's not planned. (It will not be that much work however).

  • Mike:


    The execution plan is the same with every input, this is because the stored procedure uses the same code. Check the stored procedure in the linked blog. Specifying 'With Recompile' would slow down the execution without any benefit.





    Building strings INSIDE a stored procedure is even slower and not an option. T-SQL is not ment to be doing string contatenations, and it also kills the caching of execution plans, because you can't re-use execution plans based on parameters (allthough sqlserver tries to replace hardcoded values with parameters).





    The Dynamic queries ARE parameterized. Therefor the queries will re-use execution plans (there are 6 execution plans possible with the test done: 3 foreign keys, 6 combinations). The dynamic queries were not created using the concatenation of the values in the query, but using parameters.





    Delegating the call to other stored procedures inside a stored procedure wouldn't have helped either, I can do that selection in C# too, avoiding slower T-SQL execution code. The point was: is it worth it to do this:





    1) create 6 stored procedures (which are fast)


    2) create 1 stored procedure with optional parameters (which is slow(er)


    3) create dynamic queries with parameters (thus in the end 6 of them).





    1) is probably one of the fastest, but requires 6 stored procs in our situation for that particular table. Not that nice. 2) is slower than 3) so 3) is IMHO the best choice.





    About access control: I read that more and more, but it's hot air. I'll try to explain why later on.

  • Good work! I just posted my thoughts to my blog (linked to my name above) rather than spilling them here...

  • sql server blows

    mysql rocks

  • 1) create 6 stored procedures (which are fast)

    2) create 1 stored procedure with optional parameters (which is slow(er)

    3) create dynamic queries with parameters (thus in the end 6 of them).



    What about Creating One View indexed correctly and one SP.

    If you are not using indexed Views you are missing out one of the greatest features of SQL 2000.

    One that is improved in Yukon.

  • As with anything else SP with dynamic SQL statements is a subject for trade-off. I believe that a relatively simple stored procedure is faster than SQL statement. Of course as it was noticed if one wrote SP that did a lot what T-SQL was not meant to do such as parsing and building string and so on it might not be a solution. The better design for a complex SP would be to break them down to a set of smaller ones and call later from the one with optional parameters. This SP would be like FACADE for others. The separation of layer is a very important thing as with SP it provides a single point for change when the database changes. SPs could be written by a designated person thus enforce their consistency. In a contrary a built-in SQL statement peppered in different places by different programmers do not provide the single point for a change. To me the major problem with SP is not a performance but the fact that they are hard to maintain. If there's not a responsible and designated person put in charge to write them but everybody in a team can do then the pool of SP soon become littered with redundant SPs with different naming conventions and so on. It is because SQL server does not allow to structure SPs in the way VS allow to structuring components in the project. I hope that this is going to change in future versions. As well as T-SQL itself e.g. allowing to write SP in c#





Comments have been disabled for this content.