Frans Bouma's blog

Generator.CreateCoolTool();

Syndication

News



    Visit LLBLGen Pro's website

    Follow me on Twitter

    Add to Technorati Favorites

About me

Fun stuff I created

My work

May 2007 - Posts

My interview with DotNetRocks! is now online!

My interview with .Net Rocks! is now online! . Please use this link to to get to the show's page.

Carl and Richard were great, the interview was a blast although I was pretty nervous at the beginning. I hope to talk to them in person at the SDC this fall in The Netherlands.

Enjoy!

Posted Thursday, May 31, 2007 12:15 PM by FransBouma | 12 comment(s)

Gavin King slams Object Databases

In a great article, Gavin King (of Hibernate fame) whipes the floor with what we used to call 'Object Databases'. Excellent reading material and full of information why these RDBMS's still stick around after all those many claims of object database vendors that their product is so incredibly more efficient.

Posted Wednesday, May 23, 2007 9:37 PM by FransBouma | 4 comment(s)

The endless fall to uselessness of MS Connect

I ran into a new low for Microsoft Connect product feedback. Check this item (you have to login with your MS Passport/Live ID. I don't know why). The title is "When will .NET 2.0 service pack 1 be out (IF ever) ?", I posted this on April 17th. The title explains it all.

Today I got a reply (whoa, after one month already). The status is now changed to "Resolved (Fixed)". This is the comment Microsoft posted:

Thanks for your post. Our plans are always to deliver the necessary roll-up of a product in Service pack form when it makes the most (positive) impact. We continue to deliver necessary fixes (QFEs and GDRs) in the meantime and we've delivered a SP1 for Visual Studio.



Did this person look at the calendar lately (more than a year without any service pack)? And when is releasing a bunch of fixes not positive?

I have no other words to say than that this was the last time I reported anything back to Microsoft through that stupid Connect interface: the group of people in charge of dealing with the posted items is simply not capable to understand that the people who post the items are their customers and take time to make Microsoft's work better so the life of the customer will become better because the product they work with for 8-10 hours a day has less bugs/glitches/nasties etc. VS.NET 2005 isn't the only part we have to deal with: .NET 2.0 is also a part we have to deal with, and more importantly: our customers have to deal with solely that part: .NET 2.0.

A lovely example of the 'support' quality Microsoft is willing to provide. Who was it again who said "Developers! Developers! Developers!" ? Oh never mind...

Posted Tuesday, May 22, 2007 7:05 PM by FransBouma | 13 comment(s)

API's and production code shouldn't be designed by scientists

One of the biggest mistakes Microsoft made in Database land was the absense of a proper paging mechanism in SqlServer 7 and SqlServer 2000. (No, don't come to me with tricks with @@ROWCOUNT because these don't always work in all situations). People had to use temp tables to get a mechanism which always works in all situations. What a surprise it was to see that Microsoft said they solved it in SqlServer 2005: they added a construct which offered paging inside the database without the necessity of temp tables. Though, who designed this feature? Did that person ever write production SQL code or did that person ever write normal business applications? Did anyone ever tell that person that all other major databases on the planet, except Access, had a simple paging mechanism build in so anything more complex than that would be bad? Apparently not.

There's a saying in IT land: when you let smart people solve a simple problem you get a overly complicated solution. The solution Microsoft chose for paging in SqlServer 2005 is one of the most bulkiest, complicated solution to paging ever created. I dare to say it's even more complicated than temp tables. Let's look at an example and how other databases handle it, shall we? In the code snippets below, pageNumber starts with 1 (which is the first page) and pageSize is the normal size of the page to retrieve.

Example
The typical example we want to use is this query:

SELECT * FROM Orders
WHERE CustomerID LIKE 'A%'
ORDER BY OrderDate DESC, ShippingDate DESC

(I use '*' instead of a list of fields for simplicity here, but you get the idea). The query is pretty simple. Now, if you ask a random developer what s/he would add to the SELECT syntaxis to implement paging, what do you think that person will answer in general? Very likely the person will say that the SELECT statement has to have a page specification and a page size specification and that's about it. You know, how Firebird solves it:

SELECT FIRST pageSize SKIP ((pageNumber-1) * pageSize) * FROM Orders
WHERE CustomerID LIKE 'A%'
ORDER BY OrderDate DESC, ShippingDate DESC

Which basicly means: skip the first n rows and then simply take the first m rows. Pretty simple, works very straight forward and works on every resultset. Syntaxis is clean, simple to understand and to the point. Clearly the work of an engineer who knows what s/he's doing.

Another simple approach would be how MySql and PostgreSql solve it: with a specification at the end of the query:

-- MySql
SELECT * FROM ORDERS
WHERE CustomerID LIKE 'A%'
ORDER BY OrderDate DESC, ShippingDate DESC
LIMIT ((pageNumber -1) * pageSize), pageSize
-- PostgreSql
SELECT * FROM ORDERS
WHERE CustomerID LIKE 'A%'
ORDER BY OrderDate DESC, ShippingDate DESC
LIMIT pageSize OFFSET ((pageNumber -1) * pageSize)

Which basicly comes down to the FIRST, SKIP approach: skip a set of rows and take the subset with the length specified.

Though are the big guns in the field, i.e. DB2 and Oracle, also doing paging this way? No, they have a slightly different approach. They do it a little more complicated, but not that much complicated. What they basicly do is that you can refer to the number of the row in the total resultset and simply filter on that. So if the example query results in 1,000,000 rows, and you want the 10th page (which is the page with pageNumber==10) of 25 results, you can simply refer to the ((10-1) * 25)th row in the resultset as the first row (row counting starts with 0) in the resultset to grab and then grab the 25 rows you want. You can do this with a normal WHERE clause. It's quite straight forward. Let's look at DB2's query first:

-- IBM DB2
SELECT * FROM 
(
	SELECT a.* , rownumber() over () AS rn 
	FROM 
	(
		SELECT * FROM ORDERS WHERE CustomerID LIKE 'A%'
		ORDER BY OrderDate DESC, ShippingDate DESC
	) AS a
) AS rs 
WHERE rs.rn between 
	(((pageNumber-1) * pageSize)+1) 
	AND
	(pageNumber * pageSize)

This looks complicated at first but it's not: whatever query you have, it's always placed at the same spot, so the paging query itself is always the same. That's a key point, as writing queries can be tedious and if you can use a paging query which is nothing more than a wrapper around the main query, you're not likely to make a mistake, in fact you can factor out the paging query to a piece of sequel you simply apply to your main query by using a routine.

The Oracle one is a bit simpler:

SELECT * FROM
(
	SELECT a.*, rownum r__
	FROM
	(
		SELECT * FROM ORDERS WHERE CustomerID LIKE 'A%'
		ORDER BY OrderDate DESC, ShippingDate DESC
	) a
	WHERE rownum < ((pageNumber * pageSize) + 1 )
)
WHERE r__ >= (((pageNumber-1) * pageSize) + 1)

The same basic idea: refer to the physical row number in the final resultset of the row you want to retrieve and then return that row and the n following rows which all form the page you want to fetch. In Oracle you too can use a wrapper routine for every query you have.

So, two main alternatives to choose from: add simple keywords to the SELECT syntaxis and hide every little dirty element necessary for paging for the developer or offer a way to use the row number inside a paging query and basicly let the developer wrap any query with a standard paging query.

So what does Microsoft bring to the table? Well, it does sound like the DB2/Oracle method: the ROW_NUMBER() function. At first, it looks like we can go the DB2/Oracle route and simply write a wrapper query which pages everything. But no, that's not the case. Of every option thinkable, the scientist who designed this syntax thought it had to be as much complicated as it possibly can be made: your original query has to be split up to work properly with the ROW_NUMBER() function. That means that any wrapper approach isn't going to work, which means for every query that has to be paged, a complicated new query has to be formulated. No, there's no simple SELECT statement syntaxis. Any engineer working on this API would have opted for that approach. A person however who never uses his/her own work will perhaps opt for a different approach but even then, the DB2/Oracle approach is more simpler to use than the SqlServer 2005's approach will ever be. Let's take a look at how the simple example query will look like in SqlServer 2005. I won't use a CTE here, to focus on the main SQL.

select TOP 10 b.*
FROM
(
	SELECT ROW_NUMBER() OVER (
		ORDER BY OrderDate DESC, ShippingDate DESC) as row_number, *
	FROM
	(
		SELECT * FROM Orders
	) AS a
	WHERE a.CustomerID LIKE 'A%'
) AS b
where b.row_number > (pageSize * (pageNumber-1))

So, let me get this straight: to page over a resultset in SqlServer 2005: you have to split your query into fragments which then move to different parts of the final query. Not only that, but if you refer in your WHERE clause to a table, like: WHERE Orders.CustomerID LIKE 'A%', it goes wrong: in the new WHERE clause, you have to refer to the set with the ROW_NUMBER function. This can become complex very very quickly, mainly because the actual query is completely ripped apart. There are alternatives to this structure though, neither one of them is easier than the one I pointed out above.

Scientists should stick with science. What they invent and discover should be moved to the real world by engineers, not by scientists as well. Because, if you leave it to scientists, you'll end up with horrible syntaxis like the ROW_NUMBER() function in SqlServer 2005. It might be a powerfull feature as it has functionality which might be usable in some situations (mostly in reporting scenario's, not in paging), however they never ever should have used the function as the paging functionality cornerstone in SqlServer 2005: they simply should have added a couple of simple keywords to SELECT and be done with it. Though if you aren't an engineer, if you don't work with the code you write yourself, if you don't write user-code up front to see if your API really is easy to use and consistent, you fall back on what you understand of what you created, which is basicly everything used as the foundation of what you created and of course what you created yourself, which makes everything look easy and straight forward. That's the pitfall in which scientists fall and which engineers many times manage to avoid

For completeness:

Again, my disclaimer: you can think that @@ROWCOUNT works always, or a simple TOP trick works always: they don't. There's no wrapper/SELECT trick in SqlServer whatever version which works with all queries you throw at it so you can write consistent code.

Posted Monday, May 21, 2007 12:31 PM by FransBouma | 33 comment(s)

New blog server as it seems, which doesn't work ok with my blog-skin

It seems Telligent uploaded a new version of Community server here on weblogs.asp.net, though it broke my custom blog-skin so the blog doesn't look that great now. Sorry for that, I'll try to migrate the skin a.s.a.p.

Update: . Apparently the inTelligent guy / girl who did the migration didn't make it possible to upload images for your blog skin to the server (access denied). How wonderful. Also my normal body tag CSS override doesn't seem to work, and I now have no clue what would help in this case. Why didn't they email us to warn us, tell us what might change etc... ?

Update 2: Ok, I got the colors right again, no background image yet at the left, but at least I got rid of the red link color

Update 3: Hurray! Everything works again! .

Posted Wednesday, May 16, 2007 9:15 PM by FransBouma | 10 comment(s)

Filed under:

More Posts