Let me start with a blunt statement: stored procedures are bad, they are a bad way to formulate data-access logic. I can't state that enough. Today I stumbled into a blog by Rob Howard, which tries to convince the reader that whatever you do, use stored procedures! With tears in my eyes I've read the arguments he brings to the table. Some are silly and one is even based on completely wrong information and assumptions and so far away from the truth it hurts. I've blogged about stored procedures before (here, here and here) and I used them a lot for 8 years, but I'm now almost stored procedure 'free' for 8 months now, and I love the feeling. The reason is obvious: the stress of maintaining a lot of stored procedures, to write another stored procedure for each thing you want to do, is gone. Dynamic SQL is the future. (Dynamic SQL is generated on the fly by a generic piece of code which gets various data as input and generates a parametrized query from it. Which can be cached on the client and will be cached on the server. Dynamic SQL generated based on objects written in C# or VB.NET).
Let's walk through Rob's list of arguments and see what we can make out of it. Let me start by saying that Rob's list of arguments is the default list of arguments any DBA will offer you when you ask him/her why stored procedures are better. However, keep in mind that the DBA most likely will offer you that list to save his/her job.
- Ad-hoc SQL Script is brittle
Sure, it is generated on the fly, it is gone when it is executed (except for the execution plan, Rob, but more on that later). Rob comes with the argument that putting ad-hoc SQL in your BL code is not that good. Agreed, but who said that there is just one other option to solve that problem and that option is stored procedures? ... small changes to the database can have severe impacts on the application. Yes, that's right, however changes to a relational model will have always an impact on the application that targets that model: add a non-nullable column to a table. You can use stored procedures or ad-hoc queries, you have to change the calling code to make sure that column gets a value when a new row is inserted. For Ad-hoc queries, you change the query, and you're set. For stored procedures, you have to change the signature of the stored procedure, since the INSERT/UPDATE procs have to receive a value for the new column. This can break other code targeting the stored procedure as well, which is a severe maintenance issue. A component which generates the SQL on the fly at runtime, doesn't suffer from this: it will for example receive an entity which has to be saved to the database, that entity contains the new field, the SQL is generated and the entity is saved. No maintenance problems. With a stored procedure this wouldn't be possible.
Ah yes, the good old dead horse, security! The most heard argument in favor of stored procedures is that with using stored procedures, security can be controlled using a fine grained mechanism: place the execution rights on the procedure and you're set. Well... ever heard of role-based security? Rob Howard hasn't obviously. Security is a subject that has to be taken seriously, very seriously. Therefore, a good DBA defines user-roles in SqlServer. Users are added to those roles and rights are defined per role, not per user. This way, you can control which users can insert / update and which users can for example select or delete or have access to views in an easy way: in most applications you have 2 roles: the average user, and the administrator user which configures the application. Define 2 roles in the SqlServer database, place the users in the right role, define the rights per role on the tables and views and off you go: fine grained security which works, without a single stored procedure in sight. A new user is added? You just add it to a user role and it has the rights it should have.
Another nice object in SqlServer, which I already mentioned earlier, is the view. Views are there to control which data is accessed on a column basis or row basis. This means that if you want user U to select only 2 or so columns from a table, you can give that user access to a view, not the underlying table. The same goes for rows in one or more tables. Create a view which shows those rows, filtering out others. Give access rights to the view, not the table, obviously using user-roles. This way you can limit access to sensitive data without having to compromise your programming model because you have to move to stored procedures. Views are totally ignored by Rob Howard, while the view is especially created for this purpose. See Books Online: Using Views as Security Mechanisms.
Rob mentions SQL injection attacks. SQL queries which are constructed by simply concatenating data into a query statement are indeed open for SQL injection attacks. However because ADO.NET has a great parameter support, why would anyone neglect this fine instrument? A good Dynamic SQL engine creates parametrized queries, which are not only faster (because the execution plan is cached, yes Rob, it is), they are also not open for SQL injection attacks due to the parameters.
Everyone who thinks stored procedures are pre-compiled, say "Aye!". Whoa, what a noise! For all of you who said "Aye!" a few seconds ago: open SqlServer's Books Online (v7 or v2000, doesn't matter), search for "cache execution plan". You'll find fine articles like "Execution Plan Caching and Reuse" and "SQL Stored Procedures". Let me just quote some lines from the "SQL Stored Procedures" article:
SQL Server 2000 and SQL Server version 7.0 incorporate a number of changes to statement processing that extend many of the performance benefits of stored procedures to all SQL statements. SQL Server 2000 and SQL Server 7.0 do not save a partially compiled plan for stored procedures when they are created. A stored procedure is compiled at execution time, like any other Transact-SQL statement. SQL Server 2000 and SQL Server 7.0 retain execution plans for all SQL statements in the procedure cache, not just stored procedure execution plans. I didn't make that up, people. It's there, for a long time (since SqlServer 7.0).
However, what does Rob Howard say? I quote:
There are also internal performance benefits to SQL Server for using stored procedures vs. ad-hoc SQL script. When stored procedures are used SQL Server can cache or pre-compile the ‘execution plan’ that it uses to execute the SQL vs. having to recalculate the execution plan on each request. Isn't that a little wrong, Rob? There is no precompilation of stored procedure code in SqlServer. It caches execution plans for each query, also ad-hoc queries. Even better: it will parametrize queries which don't even have parameters to keep the execution plan in the cache!
Rob also claims that you can better loop through rows in T-SQL (in 'many cases') than in the data-access layer because doing it in T-SQL is faster (according to Rob). Now, looping through rows in T-SQL requires a cursor. A cursor will always create a temp table in tempdb, which can hurt performance pretty bad if tempdb is full and has to be resized. Doing a lot of looping will require that sooner or later. Besides that, T-SQL is not a procedural language, it's a set-based language (© Joe Celko). This means that the only power T-SQL has over procedural languages is that it can work on sets of data very fast. Looping through a resultset means that you didn't understand what set-based means or what you want can't be done in a set-based batch of statements. It's then better to choose the best language for the procedural statements instead of using a set-based language for this and feed it the least amount of data to perform its task.
Now, let me add something about performance here. Say, you have a database with 100 tables, with an average of 7 fields per table. This requires 100 update stored procedures, at least (the crUd procs). These 100 procedures will have on average 7 parameters to update all fields at once, because you can't create optional update statements in stored procedures. You can't create stored procedures for all possible combinations either, that would require 100*7! procedures, which is quote a lot. A component which will create the SQL on the fly will tailor the UPDATE query to the job it has to do, for example updating that single varchar field in a row with 2 blob fields. The blob fields aren't passed to the database again, because that field isn't updated. Another nice thing about creating SQL on the fly is that you can create the WHERE predicate on the fly. Foreseeing every possibly WHERE predicate for an UPDATE stored procedure is undoable. You can fall back to optional parameters, but that will do you no good either, because COALESCE isn't that great when it comes to performance. (see my previous blogs about dynamic SQL and stored procedures).
Stored procedures also will open up a maintenance problem. The reason for this is that they form an API by themselves. Changing an API is not that good, it will break a lot of code in some situations. Adding new functionality or new procedures is the "best" way to extend an existing API. A set of stored procedures is no different. This means that when a table changes, or behaviour of a stored procedure changes and it requires a new parameter, a new stored procedure has to be added. This might sound like a minor problem but it isn't, especially when your system is already large and runs for some time (when it becomes 'legacy' but the amount of time and money invested in the system is that huge that replacing it will cost more than altering the current system). Every system developed runs the risk to become a legacy system that has to be maintained for several years. This takes a lot of time, because the communication between the developer(s) who maintain/write the stored procedures and the developer(s) who write the DAL/BL code has to be intense: a new stored procedure will be saved fine, however it will not be called correctly until the DAL code is altered. When you have Dynamic SQL in your BL at your hands, it's not a problem. You change the code there, create a different filter, whatever you like and whatever fits the functionality to implement. A good O/R mapper helps you with this. Microsoft also believes stored procedures are over: Microsoft's next generation business framework MBF is based on Objectspaces, which generates SQL on the fly.
I too find writing code like string s = "SELECT * FROM Foo WHERE Bar = " + barValue; in your code not the right thing to do. However the alternative is not stored procedures, it's a component that generates this SQL on the fly so you don't have the disadvantages of stored procedures and have the advantages of generating the SQL you need. Writing such a component is a one-time effort, you can reuse that component each time you access a database. Stored procedures are not the answer, Dynamic SQL is.