"When Should I Use SQL-Server CLR User Defined Types (UDT)?"
No one has asked me that question just yet, but with the release of SQL Server 2005 just around the corner, I'm sure a handful of people will. Unlike regular User Defined Types, CLR UDTs are a new feature of SQL Server 2005 that allows one to create a .NET class and use it as a column datatype. As long as a few requirements are followed, one can create any class with any number of properties and methods and use that class as a CLR UDT.
Generally, when a new feature is introduced with a product, it can be a bit of a challenge to know when and how to use that feature. Fortunately, with SQL Server's CLR UDTs, knowing when to use them is pretty clear:
Never.
Let me repeat that. Never. You should never use SQL Server CLR User Defined Types. I'm pretty sure that this answer will just lead to more questions, so allow me to answer a few follow-up questions I'd anticipate.
Why Not?
CLR UDTs violate a fundamental
principle of relational databases: a relation's
underlying domains must contain only atomic values. In other
words, the columns on a table can contain only scalar
values. No arrays. No sub-tables. And, most certainly, no
classes or structures. Remember all the different levels of
normalization? This is the first normal form, you know, the
"duh" one.
This is a big thing. One can't just go and fudge a tried-and-true, mathematically-validated, theoretically-sound concept and "add and change stuff to it 'cause it'll be cool." Think of how much your car would love driving on a road made of stained glass blocks three years after it was built by an engineer who thought it'd look better.
Deviating so grossly from the relational model will bring as much joy as a dilapidated glass road. Take Oracle's foray into relational abuse: nested tables. I don't believe that there has ever been a single, successful implementation of that abomination. Sure, it may work out of the box, but after a year or two of use and maintenance, it decays into a tangled mess of redundancy and "synch" procedures -- both completely unnecessary with a normalized relational model.
And if that doesn't convince you, just think of having to change that CLR UDT. How easy do you think it would be to add a property to the class representing a few million rows of binary-serialized objects? And, trust me, it won't be nearly as easy as you think.
But wouldn't I want to share my .NET code so I don't have
to duplicate logic?
This is always a noble goal, but an impossible one. A
good system (remember, good means maintainable by other
people) has no choice but to duplicate, triplicate, or
even-more-licate business logic. Validation is the best
example of this. If "Account Number" is a seven-digit
required field, it should be declared as CHAR(7) NOT NULL in
the database and have some client-side code to
validate it was entered as seven digits. If the system
allows data entry in other places, by other means, that
means more duplication of the "Account Number" logic.
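To make that duplication concrete, here is a minimal sketch (in Python, purely for illustration; the function name and regex are my own, not from any particular system) of the client-side half of the "Account Number" rule, mirroring the `CHAR(7) NOT NULL` declaration in the database:

```python
import re

# Database-side declaration (duplicated here only as documentation):
ACCOUNT_NUMBER_DDL = "AccountNumber CHAR(7) NOT NULL"

def is_valid_account_number(value):
    """Client-side duplicate of the rule: required, exactly seven digits."""
    return value is not None and re.fullmatch(r"\d{7}", value) is not None

print(is_valid_account_number("1234567"))  # True
print(is_valid_account_number("12345"))    # False
```

The point is that the rule lives in (at least) two places on purpose: the database enforces it for all writers, and each entry point re-checks it for its users.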
By trying to share business logic between all of the tiers of the application, you end up with a tangled mess of a system. I have illustrated this in the diagram below.
As you can see, the diagram on the left is a nicely structured three-tier architecture. The system on the right is the result of someone trying to share business logic between tiers, making a horribly tangled mess. One can expect to end up with the latter system by using CLR UDTs.
Never?!? How can there never, ever be an application of
CLR UDTs?
Though I may not live by the cliché "never say
never," I do follow the "never say 'never, ever'" rule. The
only conceivable time one might want to use this
feature is for developing non-data applications. But therein
lies the crux: why would one develop a non-data application
using SQL Server? There are certainly better tools out there
for what the non-data application needs to accomplish. If
you can come up with an appropriate use of a CLR UDT in an
information system, I'll buy you a t-shirt or a mug.
But what about the
samples
provided? That's a use, right there!
Allow me to address these ...
Supplementary-Aware String Manipulation / UTF8 String
User-Defined Data Type
Both of these samples have to do with UTF-8 character
encoding. Without getting into the
details, UTF-8 encodes characters as one, two, three, or four
8-bit bytes, meaning you cannot do anything with characters
in the string (length, substring, etc.) unless you read it
byte by byte. This works great for preserving "funny
characters" while transmitting data but is a poor choice for
storage. UCS-2 uses a fixed-size character format of 16 bits
per character and is what should be used for storing
character data.
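A quick Python illustration of the variable-width point (my own example, not tied to SQL Server): byte length and character length diverge as soon as non-ASCII characters appear, which is exactly why byte-by-byte substring operations go wrong.

```python
# UTF-8 is variable-width: different characters take different byte counts.
for ch in ("A", "é", "€"):
    print(ch, len(ch.encode("utf-8")), "byte(s)")

word = "café"
print(len(word))                  # 4 characters
print(len(word.encode("utf-8")))  # 5 bytes -- slicing by byte would split "é"
```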
Calendar-Aware Date/Time UDTs
Let's think about
this. A point in time is a point in time; how it's described
varies by culture ("Monday", "Lunes"), time zone (+6:00 GMT,
-3:00 GMT), calendar (Gregorian, Aztec), and format (2005-08,
Aug '05). Describing a point in time properly is essential
when interfacing with people or other systems. The keyword
in that last sentence was "interface;" such description is
best done in the "interface" tier of a system, not in the
data tier. Doing this makes as much sense as putting
currency conversion and language translation in the
database.
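The "one instant, many descriptions" idea can be sketched in a few lines of Python (my own illustration; the dates and offsets are arbitrary): the stored value is a single point in time, and the interface tier produces whatever rendering the user needs.

```python
from datetime import datetime, timezone, timedelta

# One point in time, stored once, in UTC.
instant = datetime(2005, 8, 1, 12, 0, tzinfo=timezone.utc)

# Interface-tier descriptions of that same instant:
plus6 = instant.astimezone(timezone(timedelta(hours=6)))
minus3 = instant.astimezone(timezone(timedelta(hours=-3)))

print(plus6.strftime("%Y-%m-%d %H:%M"))       # 2005-08-01 18:00
print(minus3.strftime("%b %d, %Y %I:%M %p"))  # Aug 01, 2005 09:00 AM
print(plus6 == minus3)                        # True -- same instant underneath
```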
Multi-dimensional Points and Latitude/Longitude
A geospatial location is described with Latitude
and Longitude. Not Lati-longi-tude. These are two
separate attributes and putting them in the same column
violates First Normal Form. The same goes for points and
other "array-like" structures.
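For contrast, here is a small sketch (SQLite via Python, my own example) of the First-Normal-Form version: latitude and longitude as two atomic columns, which the database engine can filter and index directly, with no UDT parsing required.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE place (name TEXT, latitude REAL, longitude REAL)")
conn.executemany(
    "INSERT INTO place VALUES (?, ?, ?)",
    [("New York", 40.7128, -74.0060), ("Paris", 48.8566, 2.3522)],
)

# A plain WHERE clause works because each attribute is its own column;
# a packed "lat,long" value (or a serialized UDT) would need decoding first.
northern = conn.execute("SELECT name FROM place WHERE latitude > 45").fetchall()
print(northern)
```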
Imaginary Numbers
Seriously? Correct me if I'm
wrong, but the only actual use for imaginary numbers is in
solving differential equations. If you're not sure why
this invalidates the example, say these two phrases aloud:
"solving differential equations" and "relational database."
Didn't that feel just like saying "drilling a hole" and
"hacksaw?"
But what about if I want to put down "SQL CLR UDTs" on my
resume?
What's stopping you now? By reading this article, you
know everything you will ever need to know about CLR UDTs. With
this on your resume, you will be able to use your expert
knowledge on the topic to never use CLR UDTs.
I hope that clears things up about CLR UDTs. Hopefully now you look forward to not using them and strongly opposing anyone who suggests it. Oh, and I really am serious about sending The Daily WTF swag to whoever can come up with a use for these things. So think about a use; you may just get a free t-shirt.