API's and production code shouldn't be designed by scientists

Monday, May 21, 2007

One of the biggest mistakes Microsoft made in Database land was the absense of a proper paging mechanism in SqlServer 7 and SqlServer 2000. (No, don't come to me with tricks with @@ROWCOUNT because these don't always work in all situations). People had to use temp tables to get a mechanism which always works in all situations. What a surprise it was to see that Microsoft said they solved it in SqlServer 2005: they added a construct which offered paging inside the database without the necessity of temp tables. Though, who designed this feature? Did that person ever write production SQL code or did that person ever write normal business applications? Did anyone ever tell that person that all other major databases on the planet, except Access, had a simple paging mechanism build in so anything more complex than that would be bad? Apparently not.

There's a saying in IT land: when you let smart people solve a simple problem you get a overly complicated solution. The solution Microsoft chose for paging in SqlServer 2005 is one of the most bulkiest, complicated solution to paging ever created. I dare to say it's even more complicated than temp tables. Let's look at an example and how other databases handle it, shall we? In the code snippets below, pageNumber starts with 1 (which is the first page) and pageSize is the normal size of the page to retrieve.

Example
The typical example we want to use is this query:

SELECT * FROM Orders
WHERE CustomerID LIKE 'A%'
ORDER BY OrderDate DESC, ShippingDate DESC

(I use '*' instead of a list of fields for simplicity here, but you get the idea). The query is pretty simple. Now, if you ask a random developer what s/he would add to the SELECT syntaxis to implement paging, what do you think that person will answer in general? Very likely the person will say that the SELECT statement has to have a page specification and a page size specification and that's about it. You know, how Firebird solves it:

SELECT FIRST pageSize SKIP ((pageNumber-1) * pageSize) * FROM Orders
WHERE CustomerID LIKE 'A%'
ORDER BY OrderDate DESC, ShippingDate DESC

Which basicly means: skip the first n rows and then simply take the first m rows. Pretty simple, works very straight forward and works on every resultset. Syntaxis is clean, simple to understand and to the point. Clearly the work of an engineer who knows what s/he's doing.

Another simple approach would be how MySql and PostgreSql solve it: with a specification at the end of the query:

-- MySql
SELECT * FROM ORDERS
WHERE CustomerID LIKE 'A%'
ORDER BY OrderDate DESC, ShippingDate DESC
LIMIT ((pageNumber -1) * pageSize), pageSize

-- PostgreSql
SELECT * FROM ORDERS
WHERE CustomerID LIKE 'A%'
ORDER BY OrderDate DESC, ShippingDate DESC
LIMIT pageSize OFFSET ((pageNumber -1) * pageSize)

Which basicly comes down to the FIRST, SKIP approach: skip a set of rows and take the subset with the length specified.

Though are the big guns in the field, i.e. DB2 and Oracle, also doing paging this way? No, they have a slightly different approach. They do it a little more complicated, but not that much complicated. What they basicly do is that you can refer to the number of the row in the total resultset and simply filter on that. So if the example query results in 1,000,000 rows, and you want the 10th page (which is the page with pageNumber==10) of 25 results, you can simply refer to the ((10-1) * 25)th row in the resultset as the first row (row counting starts with 0) in the resultset to grab and then grab the 25 rows you want. You can do this with a normal WHERE clause. It's quite straight forward. Let's look at DB2's query first:

-- IBM DB2
SELECT * FROM 
(
	SELECT a.* , rownumber() over () AS rn 
	FROM 
	(
		SELECT * FROM ORDERS WHERE CustomerID LIKE 'A%'
		ORDER BY OrderDate DESC, ShippingDate DESC
	) AS a
) AS rs 
WHERE rs.rn between 
	(((pageNumber-1) * pageSize)+1) 
	AND
	(pageNumber * pageSize)

This looks complicated at first but it's not: whatever query you have, it's always placed at the same spot, so the paging query itself is always the same. That's a key point, as writing queries can be tedious and if you can use a paging query which is nothing more than a wrapper around the main query, you're not likely to make a mistake, in fact you can factor out the paging query to a piece of sequel you simply apply to your main query by using a routine.

The Oracle one is a bit simpler:

SELECT * FROM
(
	SELECT a.*, rownum r__
	FROM
	(
		SELECT * FROM ORDERS WHERE CustomerID LIKE 'A%'
		ORDER BY OrderDate DESC, ShippingDate DESC
	) a
	WHERE rownum < ((pageNumber * pageSize) + 1 )
)
WHERE r__ >= (((pageNumber-1) * pageSize) + 1)

The same basic idea: refer to the physical row number in the final resultset of the row you want to retrieve and then return that row and the n following rows which all form the page you want to fetch. In Oracle you too can use a wrapper routine for every query you have.

So, two main alternatives to choose from: add simple keywords to the SELECT syntaxis and hide every little dirty element necessary for paging for the developer or offer a way to use the row number inside a paging query and basicly let the developer wrap any query with a standard paging query.

So what does Microsoft bring to the table? Well, it does sound like the DB2/Oracle method: the ROW_NUMBER() function. At first, it looks like we can go the DB2/Oracle route and simply write a wrapper query which pages everything. But no, that's not the case. Of every option thinkable, the scientist who designed this syntax thought it had to be as much complicated as it possibly can be made: your original query has to be split up to work properly with the ROW_NUMBER() function. That means that any wrapper approach isn't going to work, which means for every query that has to be paged, a complicated new query has to be formulated. No, there's no simple SELECT statement syntaxis. Any engineer working on this API would have opted for that approach. A person however who never uses his/her own work will perhaps opt for a different approach but even then, the DB2/Oracle approach is more simpler to use than the SqlServer 2005's approach will ever be. Let's take a look at how the simple example query will look like in SqlServer 2005. I won't use a CTE here, to focus on the main SQL.

select TOP 10 b.*
FROM
(
	SELECT ROW_NUMBER() OVER (
		ORDER BY OrderDate DESC, ShippingDate DESC) as row_number, *
	FROM
	(
		SELECT * FROM Orders
	) AS a
	WHERE a.CustomerID LIKE 'A%'
) AS b
where b.row_number > (pageSize * (pageNumber-1))

So, let me get this straight: to page over a resultset in SqlServer 2005: you have to split your query into fragments which then move to different parts of the final query. Not only that, but if you refer in your WHERE clause to a table, like: WHERE Orders.CustomerID LIKE 'A%', it goes wrong: in the new WHERE clause, you have to refer to the set with the ROW_NUMBER function. This can become complex very very quickly, mainly because the actual query is completely ripped apart. There are alternatives to this structure though, neither one of them is easier than the one I pointed out above.

Scientists should stick with science. What they invent and discover should be moved to the real world by engineers, not by scientists as well. Because, if you leave it to scientists, you'll end up with horrible syntaxis like the ROW_NUMBER() function in SqlServer 2005. It might be a powerfull feature as it has functionality which might be usable in some situations (mostly in reporting scenario's, not in paging), however they never ever should have used the function as the paging functionality cornerstone in SqlServer 2005: they simply should have added a couple of simple keywords to SELECT and be done with it. Though if you aren't an engineer, if you don't work with the code you write yourself, if you don't write user-code up front to see if your API really is easy to use and consistent, you fall back on what you understand of what you created, which is basicly everything used as the foundation of what you created and of course what you created yourself, which makes everything look easy and straight forward. That's the pitfall in which scientists fall and which engineers many times manage to avoid

For completeness:

Again, my disclaimer: you can think that @@ROWCOUNT works always, or a simple TOP trick works always: they don't. There's no wrapper/SELECT trick in SqlServer whatever version which works with all queries you throw at it so you can write consistent code.

21 Comments

Rasmus: the NHibernate code looks similar to what I use in LLBLGen Pro as well (however they don't order by on the first field but on a dummy field) (the query I posted is how Linq to Sql does it) though it has a problem in rare situations: when 1:n relations are in the join of the normal query, it could be there are duplicates in the end result even though you use ROW_NUMBER. The temp table approach avoids this. (I have to look up a particular specific query, but it is possible). This is due to the nature of ROW_NUMBER() which isn't a simple index. If it was a simple index over the complete set, it would work, however there are situations where it doesn't work and you either have to use the approach I posted above (which is more complicated to construct) or use a temptable approach.

FransBouma - Monday, May 21, 2007 1:47:48 PM

Even the good example isn't that great.
All these examples use "pageNumber-1", is that zero a based or a 1 based number.

AndrewSeven - Monday, May 21, 2007 2:23:14 PM

What do you mean exactly with 'Even the good example isn't that great' ?

pageNumber-1 is used because as I said in the article: "In the code snippets below, pageNumber starts with 1 (which is the first page) and pageSize is the normal size of the page to retrieve."
:) so calculations where 0 can be used for the current page requires a -1.

FransBouma - Monday, May 21, 2007 4:03:01 PM

Apanizza: unfortunately, in SqlServer the syntax isn't the same. I wished it was the same as Oracle/db2, but it's not, ROW_NUMBER() does work differently in sqlserver than it does in db2/oracle. Oracle/db2 simply work with indexes on the final resultset, ROW_NUMBER() does different things, it's not just an index in the final resultset, that's the sad part. :).

FransBouma - Monday, May 21, 2007 5:03:22 PM

Frans,
I understand the Oracle Syntax is better, I just wanted to point out that the syntax is in the SQL standard ( it doesn't not mean is good).

If you rewrite your SQL example as:

select *
FROM
(
SELECT ROW_NUMBER() OVER (
ORDER BY OrderDate DESC, ShippingDate DESC) as row_number, *
FROM
(
SELECT * FROM Orders WHERE CustomerID LIKE 'A%'
) AS a

) AS b
where b.row_number between (pageSize * (pageNumber-1)) and (pageSize * (pageNumber))

It is a little simpler, at least you get rid of the prefixing , I don't think this changes the way it works though.

apanizza - Monday, May 21, 2007 5:47:11 PM

Sorry, missed the " a.* , " before ROWNUMBER() in the example.

Not having my best day. :-(

apanizza - Monday, May 21, 2007 6:23:50 PM

I'm not sure why you made the statement for SQL Server so complex. You can do it will only two levels of derived tables.

I'm also not sure why you didn't use Common Table Expressions. They are a little odd but vastly simplify the syntax for something like this. You only need a complicated inner query and then the outer query becomes trivial:

WITH Members AS
(
SELECT MEMBER_ID,M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY,
ROW_NUMBER() OVER (ORDER BY MEMBER_ID ASC) AS RowNumber
FROM dbo.FORUM_MEMBERS
)
SELECT *
FROM Members
WHERE RowNumber BETWEEN 1 AND 20
ORDER BY RowNumber ASC;

Bill Graziano - Monday, May 21, 2007 7:17:07 PM

Agreed with Bill Grazia, "WITH ...." is usially used for that, which is simple anough and functional.

Koistya `Navin - Monday, May 21, 2007 8:37:48 PM

Navin: a CTE or not, that's not the issue. The WITH statement makes the query just re-usable inside the query, it doesn't make it simpler.

FransBouma - Tuesday, May 22, 2007 7:46:59 AM

Am I missing something here? Not sure why you're making the SQL2005 approach so complicated. I've very recently implemented paging into my custom O/R layer and it's incredibly simple in my opinion. Here's how I'd write your example:

SELECT * FROM ( -- #1

SELECT *
, ROW_NUMBER() OVER ( ORDER BY OrderDate DESC, ShippingDate DESC ) AS rownumber -- #2
FROM Orders
WHERE CustomerId LIKE 'A%'

) AS a WHERE rownumber between 21 AND 30 -- #3

So basically, by adding 3 lines to the query, I now have robust paging.

...and in with your examples, the DB2 method seems slightly more complicated than the SQL2005 version. I must be missing something.

Jason - Tuesday, May 22, 2007 6:29:06 PM

Jason, please read the comments I made below the blogpost, then you'll know that 1) I use code similar to yours and 2) that there is an issue where this query you and I and everyone elses uses isn't going to work.

Also, the query you posted is complicated as you have to chop up the query. With all other solutions, you don't have to. That's the key point, because you now have to write different queries for every query that has to be pageable.

The point was though that of all the offerings out there, SqlServer has a horrible approach compared to the ones already available. The DB2 method is actually very simple as I wrote: it can be a wrapper, as the full query is at one spot. The sqlserver 2005 one can't be a wrapper, due to the complicated syntax.

FransBouma - Tuesday, May 22, 2007 7:13:19 PM

No, you're not missing anything Jason. There's definitely no need to use TOP here, and no reason to nest "select * from Orders" in a table expression. And in fact, without an ORDER BY clause in Frans' example, there's nothing to prevent it returning the wrong rows.

All three of the DB2, Oracle and SQL2005 examples should use ORDER BY to confirm their validity, since it could potentially return row 30 before row 29 if the optimiser wanted (unlikely, but still correct).

Rob Farley - Wednesday, May 23, 2007 5:52:50 AM

Last time I checked the paging of filtered results in SQL Server 2005 it produced a suboptimal execution plan (scans) whatever I tried. I am not so bothered about the syntax, as for the execution plan ...

Deyan Petrov - Wednesday, May 23, 2007 1:37:04 PM

A colleague said he's tried this approach for the problem:

select top pageSize
order_id, --other columns
from orders
where order_id not in
(
select top (pageNumber-1) * pageSize))
order_id
from orders
)

So the number or "top" rows in the inner select grows as the page number goes up.

Any reason that doesn't work for the general case?

Donnie - Wednesday, May 23, 2007 6:20:50 PM

donny: that only works if you sort on the order_id column. If you sort on another column, it doesn't work.

FransBouma - Wednesday, May 23, 2007 6:39:45 PM

Frans,

If the "order_id" column is always added as the *last* column in the ORDER BY of both the outer and inner select, would you not get both a properly ordered result set and correct, deterministic paging?

Donnie

Donnie - Wednesday, May 23, 2007 7:50:28 PM

No, it has to be the first column in the order by, otherwise the resultset won't be ordered on order_id asc/desc.

FransBouma - Thursday, May 24, 2007 11:38:08 AM

My point was that I don't think it does have to be the first column. As long as the order is deterministic across requests for the different pages, which including the order_id at the end enforces, then I *think* you can get results in an order you really want while still getting correct paging using the technique I outlined.

Donnie

Donnie - Thursday, May 24, 2007 8:52:36 PM

Is there an ansi standard on paging? I don't remember hearing about one, but if so then why doesn't MS follow it. If not then is there a push for one? Is there any news on whether MS will do a LIMIT (i.e. page) keyword in the next version of SQL Server? You would think that Linq to SQL people that designed the query generation that you mentioned would have gone to the SQL Server people and have said that this is ridiculous.

David Parslow - Tuesday, May 29, 2007 6:46:01 PM

Psging is a side-benefit of the much more powerful function row_number(), and this is very much according to ANSI standards. The ANSI standard doesn't yet include any implementation of paging explicitly, although of course developers could quickly see how to work such functionality using the tools provided.

Rob Farley - Wednesday, May 30, 2007 10:09:35 PM

Frederick:
The typical trick is to use SELECT TOP n ... where n is equal to pagesize * pagenumber, if pagenumber starts with 1. Then apply the paging technique with that set.

The SET ROWCOUNT trick has a drawback with sorts, it's not always giving back the proper results.

FransBouma - Sunday, June 3, 2007 9:46:40 AM

Comments have been disabled for this content.