MySQL 5.0: Still A "Toy" RDBMS

Wednesday, October 26, 2005

"Ha," an email from a colleague started, "I think you can finally admit that MySQL is ready to compete with the big boys!" I rolled my eyes and let out a skeptical "uh huh." His email continued, "Check out Version 5. They now have views, stored procedures, and triggers."

My colleague has been a MySQL fan since day one. He loves the fact that it's free and open source and could never quite understand why anyone would spend tens of thousands of dollars on something else. But then again, he has never really had an interest in understanding; data management just isn't his "thing." Thankfully, he readily admits this and stays far, far away from anything to do with databases, leaving all of that "stuff" to the experts. No less, he'll still cheers whenever there's a MySQL "victory." it is, after all, free and open source.

Data professionals have traditionally relegated MySQL as a "toy" relational database management system (RDBMS). Don't get me wrong, it's perfectly suitable for blogs, message boards, and similar applications. But despite what its proponents claim, it has always been a non-choice for data management in an information-system. This is not a criticism of the "free open source" aspect of the product, but of its creators.

The MySQL developers claim to have built a reliable RDBMS yet seem to lack a thorough understanding of RDBMS fundamentals, namely data integrity. Furthermore, they will often surrogate their ignorance with arrogance. Consider, for example, their documentation on invalid data [emphasis added]:

MySQL allows you to store certain incorrect date values into DATE and DATETIME columns (such as '2000-02-31' or '2000-02-00'). The idea is that it's not the job of the SQL server [sic] to validate dates.

Wait a minute. It's not the job of the RDBMS to ensure data are valid?!? One of the greatest revelations in information systems is that applications are not good at managing their data: they change too frequently are too-bug prone. It just doesn't work. That's the whole point of a DBMS; it ensures that data are typed and valid according to business rules (i.e. an employee can't have -65 dependents).

But I digress. This is the 5.0 release. They've added views. They've added stored procedures. They've added triggers. Maybe things have changed.

I thought I'd check out MySQL 5.0 first hand, so I visited their website and downloaded the product. I have to say, the installation process was painless. It even defaulted to and recommended "strict mode," which apparently disallows the invalid dates as seen above. This is certainly progress!

After it installed, I fired up the MySQL prompt and started hackin' around.

 mysql> CREATE DATABASE ALEXP; Query OK, 1 row affected (0.00 sec) 
 mysql> USE ALEXP; Database changed 
 mysql> CREATE TABLE HELLO ( 
 -> WORLD VARCHAR(15) NOT NULL PRIMARY KEY, 
 -> CONSTRAINT CK_HELLO CHECK (WORLD = 'Hello World') 
 -> ); 
 Query OK, 0 rows affected (0.14 sec)

Wow! I'm impressed! MySQL 5.0 has check constraints! Maybe I was wrong about these guys ...

 mysql> INSERT INTO HELLO(WORLD) VALUES('Hi World'); 
 Query OK, 1 row affected (0.05 sec)

Err … umm … wait a minute. You did just see me put that check constraint on the HELLO table, right? It's not a very complicated check, maybe, I did it wrong?

 mysql> SELECT * FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS
 -> WHERE TABLE_NAME='HELLO'; 
 +--------------------+-------------------+-----------------+--------------+------------+-----------------+ 
 | CONSTRAINT_CATALOG | CONSTRAINT_SCHEMA | CONSTRAINT_NAME | TABLE_SCHEMA | TABLE_NAME | CONSTRAINT_TYPE | 
 +--------------------+-------------------+-----------------+--------------+------------+-----------------+ 
 | NULL | alexp | PRIMARY | alexp | hello | PRIMARY KEY | 
 +--------------------+-------------------+-----------------+--------------+------------+-----------------+ 
 1 row in set (0.01 sec)

Well how about that. It didn't actually add the constraint. Of course, had I done my research before hand, I would have known that MySQL 5.0 does not support check constraints. Apparently, in the MySQL world, one can buy a negative number of items and can be hired long before being born.

Alas, maybe I'm being too harsh; I suppose we could implement data validation logic in triggers. After all, this is often required in other RDBMS when CHECK constraints do not allow cross-table tests (e.g., making sure items cannot be added to a shipped order). So, why not just do it here?

The trigger cannot use statements that explicitly or implicitly begin or end a transaction such as START TRANSACTION, COMMIT, or ROLLBACK.

Oh that's just lovely. Leave it to MySQL to drop the most important use of triggers (complex data validation) and encourage their most obnoxious use (difficult to maintain business logic).

As far as other features added in MySQL, I think they are definitely a step in the right direction. Stored Procedures are a key component in creating a clean interface with strong-cohesion to the data layer (see, Strong vs Weak cohesion). Views (virtual tables) are absolutely essential for creating an effective and maintainable data model.

However, the new features do very little to budge MySQL's position as a "toy" RDBMS. As I mentioned before, this does not preclude MySQL from being an excellent tool for niche applications. But without mechanisms to ensure data are valid, MySQL will remain a non-choice for information systems.

22 Comments

We just got stored procs and views and all that other stuff....kinda like saying, "hey we are serious about buildings cars, our new one even has fuel injection and shocks and power steering; all those fancy things you guys wanted!

HELLO! You are still 20 years behind the rest of us!

Critic - Wednesday, October 26, 2005 2:38:00 PM

Perhaps this is a cost issue and not a feature issue? Perhaps you're too brainwashed?

Police - Wednesday, October 26, 2005 3:01:00 PM

I have to fully agree with this. I can't see myself ever trusting MySQL to anything critical. It will be a long time before I forget that the MySQL developers used to tell everyone that they didn't REALLY need things like referential integrity or transactions. Thank goodness there's PostgreSQL...

DerelictMan - Wednesday, October 26, 2005 3:10:00 PM

DerelictMan, you can have referrential integrity, but you have to change the table driver. And by "change the table driver," I mean you have to create a new table, apply the constraints, INSERT INTO the new table from the old table, DROP the old table, and rename the new one (I forget the command for that).

That's right, by default you can not have any referrential integrity! But, it's just a simple 5 step process for each table you've created. Microsoft SQL Server does way too much on its own, and I like knowing that I have to set table drivers. Builds character.

On a serious note - table drivers? Why the hell should I have to worry about this? MySQL is an incredibly lousy platform and it's what I keep getting stuck building sites on for my job :[

Jakeypoo - Wednesday, October 26, 2005 3:45:00 PM

Firebird too is free and open source. And have a lot of interesting extra features, like the "Fyracle" where the Firebird works in Oracle mode!

Fabio - Wednesday, October 26, 2005 4:43:00 PM

It seems that your criticisms stem from the fact that you believe a database should be responsible for enforcing business rules. That's fine, but it is not the only view.

Plenty of critical, high performance, enterprise systems have been built without relying on the database for anything more than storing data - it is the application software which guarantees integrity.

I don't want to get into the "which layer does this belong in" debate, because it's never pretty, but bear in mind that the absence of so-called "key features" from a DBMS doesn't always matter. And it doesn't make that DBMS a toy.

Jim Arnold - Wednesday, October 26, 2005 4:50:00 PM

@Jim Arnold,

Those who believe that a database should not enforce business rules are a subset of those who have never maintained such a system.

The fact that you think there's a "which layer does this belong in" debate is rediculous. That's like arguing about how to wash clothes: sink & washboard or whirlpool washing machine.

Maintain some large information systems and you'll see the complete absurdity of not enforcing data integrity in the database. You can't know what's "good" until you've gotten your hands dirty.

Alex Papadimoulis - Wednesday, October 26, 2005 5:21:00 PM

Alex: if he doesn't want to go into that discussion, don't start it yourself, that's only silly.

The whole debate IS silly, simply because people participating in it are talking about different things. Example?

He wasn't talking about data integrity alone. He was talking about business rules in general. Business rules are more than integrity enforcing rules.

If I create a NIAM model, I can define the constraints I want, they're part of my abstract model. Logically, they can be perfectly implemented in the live datamodel in a schema. The thing is: business rules are far more than a set of constraints. Update a set of entity X if X has a relation with at least one entity Y in another set.

That should be done in the DB? I don't see it has to be done ONLY in the db: there are more views on that problem. Your remark:

"Those who believe that a database should not enforce business rules are a subset of those who have never maintained such a system. "

shows you mean A but say B, as you're talking about data-integrity, not business rules.

And make no mistake, determining the set for X and for Y in my update example can be very complex and the rule set can be huge.

Another point against having all rules in the db (I'm not saying no rules in the db, I have my constraints defined in the db as well, but constraints aren't all bl rules) is that if a rule is violated in the DB it's really late in the call chain. recovering from that is often harder to do than discovering earlier on what's going on.

A physical datamodel represents a set of entities (Chen) but the data inside it can be semantically interpreted by creating new sets (Codd) which can be used for new rules to produce even more sets, for example for reporting or datamining. Storing everything in the DB means that you effectively flatten your complete application into a single set of procs.

Besides DBA's fearing for their job, no-one would do that.

Frans Bouma - Wednesday, October 26, 2005 6:33:00 PM

@Frans:

Indeed. The problem you describe is one that is difficult to enforce in the database. Each situation is different, though many are solvable without getting ugly; here's how I handle the classic "itemless orders" problem:

Create a order lifecycle starting with a status of "incomplete." When the status is upgraded from "incompletel" to "submitted", validate that ORder Items exist.

I think that a lot of people have a difficult time equating "business rules" with "data integrity." The two go hand in hand. "Purchase_Qty > 0" and "LEN(SSN) = 9" are both business rules and constraints that need to be enforced in multiple places (UI, DB, possibly more).

I don't advocate storing "everything" in the database or even doing "business processes" in the database. But CHECK constraints are absolutely essential.

You seem to agree with this. I think you may be surprised how many people believe that "constraints are totally unecessary." These are the people I was refering to when I said that "there is no debate."

Alex Papadimoulis - Wednesday, October 26, 2005 7:01:00 PM

No one in their right mind will claim that MySQL is better than SQL server or Oracle if the two had to go head to head and cost wasn't an option. You definately would want to think twice before making it the primary RDBMS for some critical data of a huge enterprise, but that isn't the market it is intended for quite yet. Where MySQL shines is as the backend for an application. When you have the RDBMS behind a single application, these data integrity issues are minimized, because the application can easily validate any constraints itself (obviously, this would be a problem if many different applications that didn't share the same codebase were accessing the DB).

This makes MySQL an excellent choice for just about any modern web app, and just about every modern web app needs a DBMS. Remember back when SQL Server was ridiculed (and to some extent it still is today) because it couldn't handle a lot of the enterprise work that systems like Oracle could handle? Now Microsoft is the one who is laughing all the way to the bank. However, if they don't play their cards right, MySQL is now in that position that Microsoft SQL Server was once in and it could soon cut that laughter short. Its already turning heads and its definately headed in the right direction. The truth is that most situations where people are installing SQL Server aren't situations where MySQL couldn't concievably do the same job. There are countless SQL server installations that are serving up web sites and decent web apps, yet don't use triggers, don't use complex constraints, don't use OLAP, etc. Honestly, although it may sound like herisy, they could probably even get away without any transaction support at all if they really had to.

Jesse Ezell - Wednesday, October 26, 2005 7:17:00 PM

To throw my two cents into the ring:

When Designing a web app, I try to validate at every step in the chain. For user input you want to;

-First validate on the client machine so they have a chance to correct there mistake (ie. Missing Fields)

-Second, Validate on the server side things that can't be validated on the client (ie. duplicate user names, etc) and duplicate the client side validation (Can never trust the client side)

-third data layer checks if the data is of the right type, length, not null, etc before inserting into the database.

-forth database checks for checks if the data is of the right type, length, not null, etc

Theoretically the database checks could be removed and all the other validation would guarantee valid data, but its best to be safe then sorry.

Founder - Thursday, October 27, 2005 8:41:00 AM

Well I use mysql a lot in my work, but I will say database level data integrity checks are a very good idea.

Essentially, you are testing assertions, and while no one should ever *depend* on it for error handling, its a thousand times better to raise an error pre-insert vs. post-insert with "wierd results" down the road.

paddy - Thursday, October 27, 2005 6:09:00 PM

Great post Alex!

One thing I notice (and it strikes fear into my heart) is that a lot of Web companies seem to be actively hiring MySQL people.

This even though MySQL is well-known to be a *dangerous* thing to trap yourself into. Fine for your blog, but really scary if you have to do something complex, or want to grow your business, or if the data itself is vital to your business. I won't go into the details, I think from the comments everyone here knows what I'm talking about.

With PostgreSQL available and in fact *more free* than MySQL (license-wise), why would anyone do this? Postgres is clearly considered superior by everyone who knows anything about RDBMS's, is safe and secure and feature-rich, and offers a clear upgrade path to industrial DBs should the need arise. And PG is not the only serious free RDBMS, and some of the commercial ones are not actually all that expensive (especially if you're in a position to hire people).

Your post has a perfect example of the danger: a semi-failed transaction (the constraint) is reported as OK. WTF?? Actually, if you do any real work with MySQL, you quickly realize there's a lot of that. The very fact that a transition can "semi-fail" should have any DB geek twitching. The fact that you have a whole lot of people, techies and managers alike, who don't consider that a catastrophic situation is a sad and dangerous thing.

BTW, one request: could you possibly put line-breaks in your code here? It would be a lot easier to read without the horizontal scrolling.

frosty - Wednesday, November 2, 2005 4:37:00 PM

Superb post, Alex!

I wasn't aware that MySQL didn't enforce CHECK constraints. Doesn't it support UNIQUE constraints either? Regardless, this absent feature makes the MySQL DBMS an absolutely laughable joke for any non-trivial application. I'll avoid it like the plague and use PostgreSQL or Firebird for open source solutions.

IMHO the only reason MySQL maintains its popularity is due to its ubiquity on the Linux platform. Virtually all the big distros feature it. Makes no sense to me!

As a side point, I wonder if the fabled speed of MySQL suffers when using the InnoDB table driver and has to enforce referential integrity etc. I'll bet it does...

Baxter Basix - Friday, June 2, 2006 12:30:31 PM

"Are database-side integrity checks a bad idea? No, I can see the utility there. Are they necessary? "
Absolutely. Doing all data integrity in the client leaves you open to attack since you can't force the bad guys to use your application.

Craig Beere - Tuesday, December 19, 2006 9:31:15 PM

I think you make some very good points.

I think MySQL is still very handy when you're on a budget, but checking data integrity at the database level means cleaner data.

I'm not saying that all business rules should be in the database, but you should be pragmatic about it and do as much data validation at the database level as can be done with a reasonable amount of effort.

The reasons for this in my opinion are:
- the database often (if not always) will be around a lot longer than the application code, so you should take great care in keeping it clean.
- there will often be more than one application accessing the database, so doing a reasonable amount of data validation in the database will make that validation consistent across applications.

Mike Stephen - Wednesday, March 28, 2007 4:53:46 PM

Let me get this straight. You are saying that since MySql doesn't support CHECK CONSTRAINT, it could potentially leaves the database with bad data. But at the same time, you are not advocating putting all the business rules into the database. If all the business rules are not in the database then the potential for the database to contain bad data is still there. How do you resolve that?

Conrad - Wednesday, March 28, 2007 8:14:10 PM

If you care about data integrity and important things like that, why not use PostgreSQL if you want an open source RDBMS?

Luke - Wednesday, March 28, 2007 8:19:48 PM

create temporary table the_dates (d date not null default '0000-00-00');
insert into the_dates values ('0000-00-00');

mysql> select count(*) from the_dates where d is null;
+----------+
| count(*) |
+----------+
| 1 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from the_dates where d is not null;
+----------+
| count(*) |
+----------+
| 1 |
+----------+
1 row in set (0.00 sec)

STUPID!!! I could go on and on.

Robert - Friday, March 30, 2007 12:52:13 PM

Hello all

How I can change avatar in this forum?

MaryJames - Wednesday, May 23, 2007 11:42:33 PM

Maybe its a toy..Bu if you know the problems + workarounds it's a good toy, perfect for web apps. It just works, and there are lots of heavy loaded sites around using mysql without any big problem. yahoo, wordpress, flickr etc.. So I think it depends on what you are using it for.

baris - Wednesday, July 18, 2007 11:05:57 AM

I develpe various Web application for LAN/WAN Network Administration using palin text file.
Due to certain well-known limitation,now I am switching over to & learnign DBMS technic with MySQL.But,reading such discusson
(like 'MySQL 5.0: Still A "Toy" RDBMS !!!!')
,I got realy confused about choice of DBMS producct.
I thing I have to be first master of all available DBMS product (like MySQL,
MS SQL,Oracle etc)and then I can select proper DBMS and then I will fouce on my original work i.e. network software development.
May God help me !!!

mysqllover - Friday, April 11, 2008 11:53:22 AM

Comments have been disabled for this content.