People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Sorry in advance Plip, I'm not targeting your code, I'm writing this more as a general performance rant.  Plip writes in his article on how to load strongly typed data Part 3: Using Strongly Typed Objects and Collections to replace DataSet’s in your .NET applications.   I only have one problem with his approach, he makes the mistake of using a string ordinal look-up.  Let me tell you what is involved with a string ordinal look-up.  First time through, you have to create a FieldNameLookup object.  This takes a bit of time as it loads all of the meta-data off of the reader to build your look-up table.  Once that is built, the string is used to look into that table.  This involves a bunch of null checks, a hashtable look-up, etc.., etc..., you get the point.  This crap is slow.

What are the alternatives?  Well, for development, I say leave it as is.  However, after development is done and the SP is locked down in terms of column ordinals in the returned data, you are going to replace that string name with an enumeration.  Enumerations give you all the power of having a string based name right there that you can see but at the same time index into the data reader much more quickly than a string.  I originally made this change to the ASP .NET Forums system when upgrading it for V2 back in February of 2003.  The resulting performance increase and throughput measured on average 3-4% and as much as 7%.  That is pretty huge.  At that time the system was extremely data-bound and we were doing thousands of look-ups into that FieldLookupTable to render just a single view.

Want another alternative?  Well, you can cache the ordinals for all of your fields the first time through.  That is, if you want that level of flexibility.  You might wind up with a string[] of each field, and an int[] that stores the ordinal after look-up, then use that to index.  I just can't see where this is more powerful than the enumeration method since it has slightly degraded performance and there is an implementation detail with how you store the cached look-up data.

Published Sunday, April 11, 2004 3:14 AM by Justin Rogers
Filed under: ,

Comments

Sunday, April 11, 2004 6:28 AM by Frans Bouma

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Hardcoded ordinals are error-prone. You can however cache the ordinals: in the first record, grab the ordinals for each name, and store these in a table (hashtable or other). Every next row, you use the ordinals cached. You can implement even more faster constructs. (However a hashtable is pretty fast).

What's often forgotten though is that dr.IsDBNull() is very very very slow. Better do:
bool isNull = (dr[ordinal]==System.DBNull.Value);

Saves 20% of speed.
Sunday, April 11, 2004 6:37 AM by Justin Rogers

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

My point for enumeration based ordinals was in production level applications. Once you've finalized a database interface, the error level of using enumerations to identify columns starts to get continuously smaller. Again, the point in case for the ASP .NET Forums system, of a 7% increase was HUGE.

Code spit automation programs take even more of the error metric away because you can use the results of a data reader to spit the relavent code in terms of enumeration for ordinal mapping, mapping of data to strongly typed object properties, and expanding out DBNull checks as you display above.

Honestly, saying it is error prone is like pointing out that the usage of pointers is error prone. While I agree highly that it CAN be, there is definitely a time to user pointers, and certainly a time to use enumeration backed ordinals for the data reader.
Sunday, April 11, 2004 7:10 AM by RichB

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

7% is an OK increase. It's not great though.

Prior to spending the time replacing the code with ordinals, I'd first look at the SQL I was using. It's quite likely I'd get 50% perf gains from tweaking the SQL. I would also look at internal algorithms - I once re-wrote a complex algorithm which reduced accuracy in the algorithm, but changed it from a O(n^2) to an O(n) algorithm, reducing calculation times from minutes down to seconds. Luckily the reduced accuracy wasn't important.

Only after considering all this would I then look at replacing ordinals.
Sunday, April 11, 2004 7:18 AM by Justin Rogers

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Depends on the context of 7%. In this case the context of 7% was on the fully optimized SQL set used for the ASP .NET Forums system. This was after massive architectural changes to the database, index tuning, upgrading RAID hardware, etc...

The point here is that thousands of calls per second are being wasted on FieldNameLookup, period. The majority of people may not need real performance, others do. For those that do, or simply want to ship a leaner component, then enumeration backed ordinal lookups are simply that much faster and that much leaner. Can't beat a relatively free performance gain.

So no more assumptions here people. No more spending hours looking at your SQL, just spend 10 minutes and make a frigen enum ;-)
Sunday, April 11, 2004 7:40 AM by Plip

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Two words: Magic Numbers.

*Shudder* ;)

I do not want to maintain code that's littered with Ordinals, especially on Data Schema's I'm not currently familiar with.

http://www.programmingresearch.com/solutions/QAJDOCS/Patterns/no-magic-numbers.html

If you *MUST* user Magic Number please please please use Enums like Justin suggests, or use declared variables: -

int FieldNameThatMakesSense = 9;
string moo = DR[FieldNameThatMakesSense].ToString();

I wasn't aware of the performance loss though, thanks for that, I'll certainly be passing this information on to the rest of the Developers on my team.

And absolutely no need to apologise, it was bad code to make my point look prettier ;)
Sunday, April 11, 2004 7:49 AM by Justin Rogers

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Yep, magic numbers makes it sound so dirty. Definitely write yourself a tool that runs your query or SPROC and uses the results to create the enumeration. If you create all of your enumerations using an identical process you can even validate over time that people aren't screwing things up by changing column ordering. This is definitely a concept of performance versus resilience. Depending on your comfortability with your own setup, you'll pick one over the other.
Sunday, April 11, 2004 9:55 AM by Patrick Santry

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

I like the use of enums for managability. Another reason for the use of ordinals was on one project I worked on where I had a search query, and then I wanted to bind an entirely different search to the same UI. Since all of the field names were referenced in the new query I had to write it so it would match up with the original query, for example:
Select thisfield As theOtherFieldName, thatField As thatOtherField, etc...

I had to do a lot of writing in my sproc to make it match up with what the code was expecting. Whereas by using ordinals, it wouldn't be a big deal and save me some time writing.
Sunday, April 11, 2004 10:35 AM by Jim Ross

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Hurray for Plip! Magic numbers, indeed. Interesting that this young guy from across the pond uses the same term I've been using (with equal scorn) for years.

Re: addressing via an enumeration. I always start out that way. In a typical project, once you've coded it with strings, it's going to be a rare thing if you have the time to go back and replace them. Not to mention the additional debugging cycle you introduce. Ugh!. Start out defining your enumeration, and code using it from the start. For me, it's a whole bunch easier to change an enumeration than embedded magic numbers. And when you're done, you're done.
Sunday, April 11, 2004 11:51 AM by Phil Winstanley

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

I still have a concern here that there is a definate contract between the DAL and the Stored Procedures, if a DBA goes in and moves a column they will break your code, most likly causing Conversion Exceptions in your DAL/BAL.

This isn't the case with Named columns, (mind you there is nothing to stop renaming a Column however it's more likly a new Column will be added to a select, if that goes at the beggining everything else is out by one ... I see gremlins on the horizon)
Sunday, April 11, 2004 3:10 PM by Dave Burke

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Justin, Thanks for pointing me in the direction of using Enums for column ordinals. Never thought of it, but will definitely pursue.
Sunday, April 11, 2004 5:15 PM by Nat Luengnaruemitchai

# Nahhhh dr[dataColumn] is faster

When you populate the data into dataset or whatever, you can assign the columns that you want to use in DataColumn variable and use it from there.... It is even faster than dr[intOrdinal]

DataColumn dcUserName = dt.Columns["userName"];

for(... ;... ;...)
{
blablabla....(dt.Rows[i][dcUserName]);
}

this way, you will achieve both performance and compatibility if you change the SQL statement in the future
Sunday, April 11, 2004 5:30 PM by Justin Rogers

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

That would appear exceptionally intelligent if we weren't talking about the DataReader class. The DataReader class doesn't have a Rows collection, nor does it accept DataColumn as a default indexed property overload.

We are talking about some raw performance here, not the case where you are using the higher level APIs.
Sunday, April 11, 2004 11:30 PM by Jacob Morgan

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

I know we have beat O/R mappers to death around here, but this discussion really exemplifies their usefulness.

I don't use O/R mappers to avoid writing SQL. I don't use them to avoid writing domain objects.
I don't use them to avoid building database tables.

I use them to avoid writing hundreds of redundant assignment statements that lead to synchronization hell. Any O/R mapper worth a flip will give you ordinal based lookups based on the explicit list of queried fields. Optimizations such as this can counter some of the lost optimization of dynamic SQL.
Monday, April 12, 2004 5:08 AM by TrackBack

# re: IsDBNull() alternative

Monday, April 12, 2004 2:09 PM by Henry Erich III

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Excelent post and argument!

My Opinion:
Code Gen: Numeric Ordinals
Hand Gen: String Ordinal Lookup

My Opinion Also:
Enums or Constants would be a waste of time for readability gain.
Tuesday, April 13, 2004 5:09 AM by Justin Rogers

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

The actual framework issue behind IsDBNull being as slow as it is.

http://weblogs.asp.net/justin_rogers/archive/2004/04/13/112100.aspx
Tuesday, April 13, 2004 5:27 AM by TrackBack

# IsDBNull

Justin points to Frans Bouma noting the performance hit using IsDBNull() in .NET and a workaround....
Monday, April 19, 2004 11:59 PM by Lavos

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

Interesting stuff. It's late to add this comment to the party, but I will anyway.

Using enumerations is great, but as was illustrated within the first month of my current job, someone at our provider inserted a column into a single resultset that broke a lot of our production applications that had code dependant on ordinals.

We can blame that on poor testing on our part but it still follows that DBAs and upgrades are going to break old applications that had been chugging along since forever if you depend on ordinals, and thus increases maintenance costs.

With that said, ordinals really do seem to help you out with speed :)


My personal preference if the gains were that important, would be to have a singleton class that more or less has fields/properties/whatever that I can reference by name and do runtime checks (via string names or whatever) to populate with the proper ordinals.

(Or, have all your enumerations defined in a seperate assembly that you can "upgrade" independantly to keep pace with the latest changes.)
Tuesday, April 20, 2004 12:57 AM by Justin Rogers

# re: People are still using dr[stringName], when dr[intOrdinal] is soooooo much faster.

It is equally easy to remove a single column from a result set as add one. Removing a single column would break any form of column indexing method, just adding one happens not to.

I'd argue that any competent DBA would create a new procedure before he'd add columns to an existing result set. If he does add columns why would he add it to the beginning of the column set or in the middle rather than append it to the end.

A small investment in user education for a 7-10% performance gain is well worth it. And since the same DBA could just have easily removed a column, I can't see a valuable reason behind not using the oridinals.

Proper versioning of enumerations is important in any case. Having singleton classes and levels of abstraction does nothing but hurt performance. In many cases the extra indirection of using properties is going to hurt perf a bit, as is doing runtime checks for proper ordinals. There are some solutions, but they all involve dynamically generating optimized code using a CodeDOM or Reflection.Emit and I don't think the majority of users are ready for that step (though I use it all of the time for my own stuff).

Leave a Comment

(required) 
(required) 
(optional)
(required)