Database independent ADO.NET

Wednesday, October 8, 2003

One of the main complains about ADO.NET it's that you cannot write code that is database independent.

There are a number of libraries to accomplish it, and the latest DAAB version also supports it.

I think the designers of ADO.NET acknowledged something that we don't want to: each database engine is different, we will write different SQL code, and we will need different APIs.

The most visible difference is parameter markers. SQL Server native ones are prefixed with '@'. In OleDB we used '?' which required the SQL driver to parse the sentence and change them to '@'. In .NET all the Data Providers use their native parameter markers, so you have ':' for Oracle, '@' for SQLServer/Access, etc.

This implies that is very difficult to write a parameterized SQL sentence that can be database independent. 'SELECT * FROM CUSTOMER WHERE CustomerId = @CustomerId' won't work in Microsoft's Oracle Data Provider even if we are using the generic IDb APIs.

Even if ADO.NET provided a generic API and even if it supported the same parameter markers in each Data Provider, we all know the SQL sentences are different in each database engine. Doing a SELECT * FROM CUSTOMER always works, but when you need to deal with autonumbering, database-specific types (think uniqueidentifier in SQL Server), outer joins, then it's not possible to use the same SQL sentences.

Suppose you _can_ write generic SQL, because you have simple SQL statements that are supported in all the databases you need, or the places where you need different statements are not many.

Then you code:

IDataReader reader = myAbstractCommand.ExecuteReader(“SELECT CustomerID FROM CUSTOMER);

while (reader.Read()) short customerId = reader.GetInt16(0);

And then you run Anakrino, open the OracleDataReader class, and find:

public virtual short GetInt16(int i) {
throw ADP.NotSupported();
}

...

Writing database independent code is a dream. Let's wake up.

6 Comments

This is a hard learned lesson for a lot of people. As much as we would like to ignore that a database server is a hugely complex system, we constantly hit areas where performance, isolation, locking, tuning and best practices must be made in a database server specific way.

I also see a lot of people trying to make database independent code because it gives them freedom, even though it's not necessarily in the project requirements. Sometimes a feel we create problems that are not really there, and the costs are huge. I would be surprised if the cost of installed a specific database server is going to be the turning point of a large and expensive business system installation.

I have spent considerable amount of time working with databases, and I agree; it’s definitely time to wake up.

Morten Abrahansen - Wednesday, October 8, 2003 4:57:00 AM

Well, I wouldn't agree with you at all. It is possible to write the data layer in such a way that you can localize it after that for a specific database. The parameter names problem is not a real problem, as there are solutions to it. The not implemented methods in a certain data provider is a nastier problem, but of course you can always workaround by using the implemented ones (i.e. GetInt32 instead of GetInt16 or whatever) ;).

I agree with the previous comment that sometimes needlessly things are made more complicated just to "support" different dbs ... Of course when it comes some day to rewriting 100 classes just to support another db, then it will be obvious that a ***measured*** additional complexity is the better option ...

Best regards,

Deyan Petrov

Deyan Petrov - Wednesday, October 8, 2003 9:26:00 AM

... and also - how easy is to support 2 data layers for 2 different data providers? Or more? And why should you do that when all the code is the same at the interface level?

The NotImplementedException is a hack imho - first you say you are gonna implement the interface and after that you throw a NotImplementedException ...

To sum it up, my opinion is that the data layer should not rewritten entirely in order to support a new data provider. The data layer should deal with interfaces. The data access component should be able to handle the data provider differences by using different sqls (in xml configs) for the different data providers. If it's not able to abstract adequately, then the abstraction should be made at a higher level (look at ORMs, best examlpe Hibernate).

Regards,

Deyan

Deyan Petrov - Wednesday, October 8, 2003 11:15:00 AM

Deyan,

If you use different SQL sentences for each data provider you solve the problem, but that's not writing 'database independent SQL'. The goal of having a generic ADO.NET interface is to be able to share the SQL code.

Using a tool for the persistence layer solves the problem, because the tool generates or executes different SQL sentences for each database, but then your API is not ADO.NET but the tool's API.

Regards,

Andres.

Andres Aguiar - Wednesday, October 8, 2003 11:27:00 AM

Perhaps ADO.NET should provide a facility similar to CodeDom that would allow generating the SQL sentences appropriate for each provider. This would provide a certain degree of independence from the SQL syntax of the database.

Fernando Tubio - Wednesday, October 8, 2003 1:03:00 PM

I agree with the comments below in that providing an unified API without trying to abstract the SQL syntax/features is a good thing -and much more realistic than trying to provide an absolutely database-agnostic data access API.

The problem is that even if you ignore the different in syntax and focus in APIs, there are still big differences.

For example, SQL Server currently sends first result-sets and then output parameters. Other databases make the output parameters available first. This means that for SqlClient you cannot access the output parameters until you consumed all the SqlDataReaders. This restriction may not be there for a different back-end. This may cause code to break depending on the provider, and it's unrelated to SQL syntax.

These differences (there are many more like this) make it very hard even to unify the API without doing some magic...and we all now that magic is nice when it's optional, but not when it's buried into a core api such as ADO.NET, it'll get in the middle sooner or later.

All that said, if you're comming to the PDC, take a look at the abstract of the 'Programming ADO.NET in "Whidbey"' in the PDC website and you may find some related news ;)

-pc

Pablo Castro - Wednesday, October 8, 2003 9:30:00 PM

Comments have been disabled for this content.