Gospel of the GUID - Answers to your Burning Questions, Comments and Insults.

Gospel of the GUID - Answers to your Burning Questions, Comments and Insults.

First and foremost, I was blown away with the number of nice responses and questions to my Gospel of the GUID post. For the most part, I got a lot of “wow, I never thought of that” and “cool, we do that too, glad to see someone else doing it” type of responses.  I enjoyed the slew of “I am a convert now” type of responses also. It was nice to see people play along with the motif of the post.  I had expected a few comments, but not the flood I received. Instead of just “approving” the monster load of comments, I wanted to condense it all down into a single readable response post. So many of the questions were flavors of the same question that I would hate for people to read 100+ comments where most of it was just duplication.

Before I go any further, I should state, and should have stated in my Gospel of the GUID post, that every rule in my shop can have exceptions. My GUID rule is no exception to this. (Sorry… couldn’t resist the pun) However, to break my rules, you must have valid reason for doing so. Many people have pointed out the more common exceptions; data warehousing, apps that need high speed multi-million inserts, etc.  These would all be allowable exceptions to me, however, these are not the majority of my applications and they are certainly not the norm of most business applications. After all, someone has to actually build the apps that generate the data for the warehouseJ!   

Another interesting thing that I did not expect, were the insults. I certainly didn’t expect the anti-Microsoft slurs but I guess if you are unemployed, sitting at home in your whitey-tighties, taking a break from leveling your level 70 Night Elf, what else are you going to do? Luckily after many years of playing MMORPGs, I am quite used to bravado that seems to be induced on the weak minded and immature by the anonymity of the net.  I just didn’t expect so much of it from the readers of ASP.NET … but what do I know? J Truth be known, it was the most entertaining part of it all for me.  For those of you that took the time to go out of your way to insult me (and Bill Gates), you will sleep better knowing that I, my team, and my friends had a good time laughing at you… not with you… but at you.

For the most part, I received a lot of emails and many comments asking “what about x?” I will answer the most common questions and comments below. I am so sorry it has taken so long to get these answers back to you.

1)      Can you explain number one a little better? I still don’t quite get it. Why don’t you just return the unique identifier from a stored proc?
The beauty of the GUID approach is that I can generate the identifier that I need in my object model WITHOUT going to the database.  This allows me to create an “Order” object, and all of its “Line Item” objects without making any calls to the database until I am ready to persist them, thereby saving trips to the database. I don’t care how small you can make your index, no trips to the database is faster than 1.
J
Note: If I need to generate an identifier at the database level, I can do that also. If I need to “return” it from a stored proc, I would need to use an output parameter.


2)      Don’t you know that using GUID’s as clustered indexes is bad?
At the risk of sounding flippant, yes, I know in certain cases it is bad.

Yes, I know that if you have a GUID as a clustered index AND you are doing high volume inserts, this really slows down your inserts. For 99% of the applications I write, I truly don’t care about how long the inserts take as long as they perform adequately.  

Yes, I know that over the long haul, the more records that get inserted the more fragmentation I could possibly have. For the most part I deal with this by using the appropriate Fill Factor or through database maintenance.  If it is really an issue after that, I don’t use a clustered index on the GUID here either.  MOST of my tables aren’t million+ rows, and most don’t need to be able to handle high volume inserts, therefore MOST of my tables do have the GUID as the clustered index.

3)      Why not use the new NewSequentialId()?
There are
several reasons really. The only one that really matters is that I can’t do it in my object model, and therefore it is of no use to me.
 

4)      What happens if we run out of GUIDs?  What if I don’t trust that they can’t be duplicated?
Simple, your “unique index” will guarantee you don’t get that duplicated bugger into your database. Unless, of course, you don’t trust SQL to make sure it doesn’t allow duplicates
J
 

Now you have a huge problem though, you need to rush out and buy a lottery ticket but be careful, because you are going to get hit by lightning on the way because you have defied the odds on a scale that is nearly unimaginable. After all, the odds of duplicating a GUID is 1 in 2122 (or 5.3×1036) Good luck with that. By the way, anyone know how long that would be if I could line up 5.3×1036 dollar bills one after the other?

5)      What about referential integrity regarding #3?
I knowing sacrifice referential integrity for the flexibility I get with my object model. I rarely physically delete data anyway. Most of the time, we do logical deletes so referential integrity doesn’t even come into play.

6)      What if you need to know which parent a row belongs to in #3?
If you absolutely must know at the row level of the child record, then you will either need to add a “type” identifier column, or you will need to do a left join against your appropriate parent tables to see who it belongs to.  It is extremely rare for me to need to go “up the chain” in my object graph from an object that is stored this way. I almost always only go “down the chain.” For example, I would start with my customer and get the notes for that customer. Not start with some note and try to figure out, “Gee what the heck is this note for?”

7)      What about Customers who don’t want to see a GUID for an Order Number?
GUIDs are NEVER for human consumption. (I don’t consider devs and dba’s human…they are superhumans
J
)  People like to see things like “Order ID 1000” which is totally fine.  Obviously you should make sure those are unique also, but that has nothing to do with using GUIDs.

Besides, this breaks another one of my rules, which is never make something that normal people feel is unique, your identifier. Some day humans will change this little piece of data, and when they do, your app won’t work.  I can’t tell you how many times I have seen an order “number” go from numeric to alpha numeric, or an identifier that will “always” be 5 characters, be changed to 6 characters years later. Here me now, and believe me later, don’t let your PKs be anything that means something to your customer.  Your PK should only be something that means something to your database… and that something should be what row am I getting… even if it is an INT
J
 

Additional Thoughts:

1)      In my world, which admittedly is different from many of yours, what I care about the most is:  

a.       How my app behaves from the user’s perspective.  I truly don’t care if a DBA thinks that speedy inserts of millions of records is important. DBA’s don’t dictate design in my apps. DBA’s don’t “own” the database in my world. They are an important part of the design team of our app, but they don’t get to make unilateral design decisions based on whether we can do millions of inserts quickly, if that’s not what my users do. If my users are ACTUALLY doing millions of inserts, I care very much and certainly wouldn’t do anything that caused that task to perform poorly, including making a clustered index on a GUID.

b.       Code maintainability. This is a topic for a future post of mine, but the short version is that I will take code that performs acceptably and is easier to maintain over hyper optimized code that is harder to maintain. The Gospel of the GUID is based on this philosophy, and makes my application framework very easy to maintain. When you know that EVERY row is identifiable by a GUID, your framework can easily deal with any row of data, without having to “figure it out.” For example, in our UI framework, we have an abstract base class for all of our pages that deal with singletons to inherit. In the base page, we have helper functions that do things like get the singleton etc. While not a huge amount of code in and of itself, across several hundred pages, it’s quite a nice savings.

2)      COMBS - http://www.informit.com/articles/article.aspx?p=25862&seqNum=1  Now here is something that I must take a closer look at. Trading some of the uniqueness for better performing inserts merits some investigation on my part.

Off the top of my head, it would be pretty easy to modify my framework to use a COMB instead of GUID and of course it would be trivial to create a UDF in SQL for the same thing if I needed to generate some IDs at the DB level.  I will have to do some testing and see if this is worth doing. Thanks for the heads up on this dani.

Published Sunday, November 11, 2007 4:03 PM by wwright

Comments

Sunday, November 11, 2007 8:20 PM by Ken Cox [MVP]

# re: Gospel of the GUID - Answers to your Burning Questions, Comments and Insults.

>What happens if we run out of GUIDs?

Right on! Don't believe the people who claim that GUIDs are unique until you see their test plan! We want *proof* with reproducible results, not statistical theory! Show us every GUID that's been made or will be made and *then* we'll know for sure.

I'll bet they're the same people who claim the Universe is infinite. A never-ending universe might be the only explanation that their tiny minds can come up with - but not necessarily the real answer!

<vbg>

Monday, November 12, 2007 12:45 PM by Bernal Schooley

# re: Gospel of the GUID - Answers to your Burning Questions, Comments and Insults.

Thank you for the great articles. I've been forcing the use of GUID's on my projects for several years now and still continue to run into all the "but why" questions.

The only two concerns that I give any weight to are performance and size.

The size concern is often easily dismissed in 99% of use cases and in the other 1% alternatives should be used and that's ok.

The performance concern is not as easily dismissed. I've always considered it a trade-off that was well worth making given the benefits of GUIDs (specifically the ability to create them in your application layer). Having just read the article by Jimmy Nilsson and his COMB's style GUID solution, I'm thinking we can now have our cake and eat it too.

Tuesday, November 13, 2007 6:14 AM by wwright

# re: Gospel of the GUID - Answers to your Burning Questions, Comments and Insults.

DOH!

Thanks Richard! Guess I will have to revisit that idea a little more now. I truly need to consider the chance of duplication, but I can't think of the last time one of my apps ran on a box without a NIC. Hmmmm... I will investigate further.