Frans Bouma's blog

Generator.CreateCoolTool();

Syndication

News




    Add to Technorati Favorites

About me

Fun stuff I created

My work

SqlServer 2005 paging: there IS a generic wrapper query possible

(the Name field in the queries below is without [ and ] brackets, because CS currently goes bezerk because of these. Don't know why, but apparently a glitch somewhere.)

Recently, I wrote a blogpost about SqlServer 2005 paging, called API's and production code shouldn't be designed by scientists, about how horrible the paging syntaxis is in SqlServer, while it's easy in competing RDBMS-s. One thing that stood out was that it was apparently impossible to produce a wrapper query in SqlServer 2005 which would be able to page any other query you would write, while it was possible to write such a query in Oracle or DB2.

Today I was updating our own SqlServer 2005 paging code as the Sql generator had to revert to a temp-table approach if it ran into one or more 1:n relations. The reason was that the SELECT statement which does the actual query can't use DISTINCT, as the ROW_NUMBER() value is also in the select list, so DISTINCT has no real value: all rows are unique due to their value for the row number. To solve this, you could use a dual SELECT, first select the set you want to page on, then select that again but apply the ROW_NUMBER() on that set and then filter on the row number. This all sounds vague, so let's go to an example.

Query with single SELECT and combined ROW_NUMBER usage. This query uses a TOP clause to limit the resultset to fetch. It has the ORDER BY placed inside the OVER() clause. This query fails, as it simply returns a set of duplicate rows. It's usable on SqlServer 2005's AdventureWorks catalog.

WITH __actualSet AS
(
    SELECT DISTINCT TOP 9 [Product].[ProductID], 
    Name, [ProductNumber], [MakeFlag], 
    [FinishedGoodsFlag], [Color], [SafetyStockLevel], 
    [ReorderPoint], [StandardCost], [ListPrice], 
    [Size], [SizeUnitMeasureCode], 
    [WeightUnitMeasureCode], [Weight],
    [DaysToManufacture], [ProductLine], [Class], 
    [Style], [ProductSubcategoryID], [ProductModelID], 
    [SellStartDate], [SellEndDate], [DiscontinuedDate], 
    [rowguid], [Product].[ModifiedDate], 
    ROW_NUMBER() OVER(ORDER BY [Product].[ProductID] ASC) AS __rowcnt
    FROM
    [Production].[Product] INNER JOIN [Purchasing].[PurchaseOrderDetail]
    ON [Product].[ProductID]=[Purchasing].[PurchaseOrderDetail].[ProductID]
    WHERE
    [Purchasing].[PurchaseOrderDetail].[OrderQty] = 60
)
SELECT * FROM __actualSet WHERE [__rowcnt] > 4 AND [__rowcnt] <= 8
ORDER BY [__rowcnt] ASC

This query is a paging version of this query:
SELECT DISTINCT [Product].[ProductID], Name, 
    [ProductNumber], [MakeFlag], [FinishedGoodsFlag], 
    [Color], [SafetyStockLevel], [ReorderPoint], 
    [StandardCost], [ListPrice], [Size], 
    [SizeUnitMeasureCode], [WeightUnitMeasureCode], [Weight],
    [DaysToManufacture], [ProductLine], [Class], 
    [Style], [ProductSubcategoryID], [ProductModelID], 
    [SellStartDate], [SellEndDate], [DiscontinuedDate], 
    [rowguid], [Product].[ModifiedDate]
FROM
    [Production].[Product] INNER JOIN [Purchasing].[PurchaseOrderDetail]
    ON [Product].[ProductID]=[Purchasing].[PurchaseOrderDetail].[ProductID]
WHERE
    [Purchasing].[PurchaseOrderDetail].[OrderQty] = 60
ORDER BY [Product].[ProductID] ASC

The paging query is what most o/r mappers, including ours, would generate (or thereabout), except of course the hard-coded values which would be parameters, but you get the idea. Some minor details differ here and there between O/R mappers, but the idea is the same overall.

However, a paging query which doesn't work is of course not what we want. Furthermore we want a generic wrapper, so we simply can feed it a query, any query, and page over it. Well, it seems that generic wrapper is possible. The key is that the ORDER BY clause of the original query shouldn't be used in the ROW_NUMBER's OVER clause, but should be left in the query itself and a wrapper SELECT statement should be used. The query above with the wrapper then looks like: (we use a trick to avoid OVER() to throw an exception, by simply passing a timestamp. This is a common way to work around the issue of having to specify a sort clause). We'll retrieve the same page, the second page with a size of 4:

WITH __actualSet AS 
( 
    SELECT *, 
        ROW_NUMBER() OVER (ORDER BY CURRENT_TIMESTAMP) AS __rowcnt 
    FROM 
    (
        SELECT DISTINCT TOP 9 [Product].[ProductID], Name, 
            [ProductNumber], [MakeFlag], [FinishedGoodsFlag], 
            [Color], [SafetyStockLevel], [ReorderPoint], 
            [StandardCost], [ListPrice], [Size], 
            [SizeUnitMeasureCode], [WeightUnitMeasureCode], 
            [Weight], [DaysToManufacture], [ProductLine], 
            [Class], [Style], [ProductSubcategoryID], [ProductModelID], 
            [SellStartDate], [SellEndDate], [DiscontinuedDate], 
            [rowguid], [Product].[ModifiedDate]
        FROM
            [Production].[Product] INNER JOIN [Purchasing].[PurchaseOrderDetail]
            ON [Product].[ProductID]=[Purchasing].[PurchaseOrderDetail].[ProductID]
        WHERE
            [Purchasing].[PurchaseOrderDetail].[OrderQty] = 60
        ORDER BY [Product].[ProductID] ASC
    ) AS _tmpSet
) 
SELECT * FROM __actualSet 
WHERE [__rowcnt] > 4 AND [__rowcnt] <= 8 
ORDER BY [__rowcnt] ASC

With this query, we can create a (more or less) generic wrapper query to page any other select statement, with one minor restriction: the SELECT statement to page has to have TOP specified, if it has an ORDER BY (which is logical, as paging over unsorted data is meaningless). This however brings the wrapper very close to a generic wrapper, the only thing that should be inserted is a TOP clause. The final wrapper looks like:

WITH __actualSet AS 
( 
    SELECT *, 
        ROW_NUMBER() OVER (ORDER BY CURRENT_TIMESTAMP) AS __rowcnt 
    FROM 
    (
        -- the query to page here
    ) AS _tmpSet
) 
SELECT * FROM __actualSet 
WHERE [__rowcnt] > @rownumberLastRowOfPreviousPage 
    AND [__rowcnt] <= @rownumberLastRowOfPage
ORDER BY [__rowcnt] ASC

Of course the WHERE clause with the __rowcnt, that's up to you. What's key is that the numbering of the rows starts with 1. I used this particular predicate expression but there are more possible, as long as you filter on the right rows.

To limit the # of rows in the WITH CTE, you can use a TOP clause with a given number. If you are in charge of generating the SQL query to page, place a TOP clause in the SELECT which limits the total number of rows to (pageSize * pageNumber), where pageNumber starts with 1. The advantage of this wrapper is that you don't have to mess with the ORDER BY clause being transfered to the OVER() clause and whatever expression rewriting that might need.

Happy paging!

Published Tuesday, June 05, 2007 3:09 PM by FransBouma

Comments

# re: SqlServer 2005 paging: there IS a generic wrapper query possible@ Tuesday, June 05, 2007 9:22 AM

Frans: Great article. I just did something quite close to this myself at work for a coworker. I wonder how well it scales to large numbers of rows. The only thing that you should be careful of is that TOP 100 PERCENT doesn't really work in SQL 2005. Take a look at my (non-inflammatory) blog entry: http://executioniseverything.blogspot.com/2007_01_01_archive.html Scary new behavior, huh!? Actually, instead of TOP, you can use ROW_NUMBER() with an ORDER BY once again, inside your CTE to produce a properly ordered set -- that'll do the trick.

David Markle

# re: SqlServer 2005 paging: there IS a generic wrapper query possible@ Tuesday, June 05, 2007 9:38 AM

Yes the TOP 100 PERCENT is indeed not working! :) I found that out myself right after posting this and tried a DESC sort instead of an ASC sort.. it didn't give the results I was after. Very odd. (So I removed that remark ;))

The ROW_NUMBER trick  inside the query is indeed an option, but that is more cumbersome to create, and is a bit less of a 'wrapper' for a query to page.

FransBouma

# re: SqlServer 2005 paging: there IS a generic wrapper query possible@ Tuesday, June 05, 2007 11:47 AM

Frans, This works:
select id id, anotherid id from customers
This gives syntax error:
WITH __actualSet AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY CURRENT_TIMESTAMP) AS __rowcnt
FROM
(
select id id, anotherid id from customers
) AS _tmpSet
)
SELECT * FROM __actualSet
WHERE [__rowcnt] > 0
AND [__rowcnt] <= 16
ORDER BY [__rowcnt] ASC

The original query is WTF, I agree, but I run into this issue yesterday

Ayende Rahien

# re: SqlServer 2005 paging: there IS a generic wrapper query possible@ Tuesday, June 05, 2007 12:15 PM

Oren, indeed! that's indeed a bummer I didn't think of. Thanks for the heads up.

Hmm... Indeed something to think of when writing paging queries.. :/

(the layout of your reply is messed up, sorry for that, the approval api of CS is still a bit shaky in some areas apparently, so it then removes linebreaks)

FransBouma

# re: SqlServer 2005 paging: there IS a generic wrapper query possible@ Tuesday, June 05, 2007 5:12 PM

Careful, working around TOP 100 PERCENT might be going around a technical limitation, better hit that speed dial button to MS EULA Central

Sean

# links for 2007-06-07 &raquo; mhinze.com@ Thursday, June 07, 2007 11:27 AM

Pingback from  links for 2007-06-07 &raquo; mhinze.com

links for 2007-06-07 » mhinze.com