"When Should I Use SQL-Server CLR User Definied Types (UDT)?"
No one has asked me that question just yet, but with the release of SQL Server 2005 just around the corner, I'm sure a handful of people will. Unlike regular User Defined Types, CLR UDTs are a new feature of SQL Server 2005 that allows one to create a .NET class and use it as a column datatype. As long as a few requirements are followed, one can create any class with any number of properties and methods and use that class as a CLR UDT.
Generally, when a new feature is introduced with a product, it can be a bit of a challenge to know when and how to use that feature. Fortunately, with SQL Server's CLR UDTs, knowing when to use them is pretty clear:
Let me repeat that. Never. You should never use SQL Server CLR User Defined Types. I'm pretty sure that this answer will just lead to more questions, so allow me to answer a few follow-up questions I'd anticipate.
CLR UDTs violate a fundamental principle of relational databases: a relationship's underlying domains must contain only atomic values. In other words, the columns on a table can contain only scalar values. No arrays. No sub-tables. And, most certainly, no classes or structures. Remember all the different levels of normalization? This is the first normal form, you know, the "duh" one.
This is a big thing. One can't just go and fudge a tried-and-true, mathematically-validated, theoretically-sound concept and "add and change stuff to it 'cause it'll be cool." Think of how much your car would love driving on a road made of stained glass blocks three years after it was built by an engineer who thought it'd look better.
Deviating so grossly from the relational model will bring as much joy as a dilapidated glass road. Take Oracle's foray into relational abuse: nested tables. I don't believe that there has ever been a single, successful implementation of that abomination. Sure, it may work out of the box, but after a year or two of use and maintenance, it decays into a tangled mess of redundancy and "synch" procedures -- both completely unnecessary with a normalized relational model.
And if that doesn't convince you, just think of having to change that CLR UDT. How easy do you think it would be to add a property to the class representing a few million rows of binary-serialized objects? And, trust me, it won't be nearly as easy as you think.
But wouldn't I want to share my .NET code so I don't have to duplicate logic?
This is always a novel goal, but an impossible one. A good system (remember, good means maintainable by other people) has no choice but to duplicate, triplicate, or even-more-licate business logic. Validation is the best example of this. If "Account Number" is a seven-digit required field, it should be declared as CHAR(7) NOT NULL in the database and have some client-side code to validate it was entered as seven digits. If the system allows data entry in other places, by other means, that means more duplication of the "Account Number" logic.
By trying to share business logic between all of the tiers of the application, you end up with a tangled mess of a system. I have illustrated this in the diagram below.
As you can see, the diagram on the right is a nicely structure three-tier architecture. The system on the right is the result of someone trying to share business logic between tiers, making a horribly tangled mess. One can expect to end up with the latter system by using CLR UDTs.
Never?!? How can there never, ever be an application of CLR UDTs?
Though I may not live by the cliché "never say never," I do follow the "never say 'never, ever'" rule. The only possible time where one might possibly want to use this feature is for developing non-data applications. But therein lies the crux: why would one develop a non-data application using SQL Server? There are certainly better tools out there for what the non-data application needs to accomplish. If you can come up with an appropriate use of a CLR UDT in an information system, I'll buy you a t-shirt or a mug.
But what about the samples provided? That's a use, right there!
Allow me to address these ...
Supplementary-Aware String Manipulation / UTF8 String User-Defined Data Type
Both of these samples have to do with UTF-8 character encoding. Without getting into the details, UTF-8 encodes characters as one, two, three, or four 8-bit bytes, meaning you can not do anything with characters in the string (length, substring, etc) unless you read it byte-by-byte. This works great for preserving "funny characters" while transmitting data but is a poor choice for storage. UCS-2 uses a fixed-size character format of 16-bits per character and is what should be used for storing character data.
Calendar-Aware Date/Time UDTs
Let's think about this. A point in time is a point in time; how it's described varies by culture ("Monday", "Lunes"), time zone (+6:00 GMT, -3:00GMT), calendar (Gregorian, Aztek), and format (2005-08, Aug '05). Describing a point in time properly is essential when interfacing with people or other systems. The keyword in that last sentence was "interface;" such description is best done in the "interface" tier of a system, not in the data tier. Doing this makes as much sense as putting currency conversion and language translation in the database.
Multi-dimensional Points and Latitude/Longitude
A geospatial location is described with Latitude and Longitude. Not Lati-longi-tude. These are two separate attributes and putting them in the same column violates First Normal Form. The same goes for points and other "array-like" structures.
Seriously? Correct me if I'm wrong, but the only actual use for imaginary numbers is in solving of differential equations. If you're not sure why this invalidates the example, say these two phrases aloud: "solving differential equations" and "relational database." Didn't that feel just like saying "drilling a hole" and "hacksaw?"
But what about if I want to put down "SQL CLR UDTs" on my resume?
What's stopping you now? By reading this article, you know everything you will ever need to about CLR UDTs. With this on your resume, you will be able to use your expert knowledge on the topic to never use CLR UDT.
I hope that clears things up about CLR UDT. Hopefully now you look forward to not using them and strongly opposing anyone who suggests it. Oh, and I really am serious about sending The Daily WTF swag to whoever can come up with a use for these things. So think about a use; you may just get a free t-shirt.