Why OR Mapping does work.

Monday, June 16, 2003

I've tried to make a similar framework [as Paul Gielens has] but I'm leaning more and more towards discarding OR-mapping as an alternative. Clemens Vasters gave me the final push in a breaktime discussion a month ago during the Scalable Apps tour's visit in Norway. Successful OR mapping might be flawed by definition because you'll never fully be able to ensure that you are operating on the same object.

He further has some other points against O/R Mapping.

When you look at O/R mapping the way Mads describes it, you will conclude that he's right: you can't guarantee that two or more threads are working on the same entity unless they operate in the same AppDomain, and even then it isn't certain. But that's not the point of O/R mapping. O/R mapping is a way to work with data on a more abstract level. It doesn't suffer more from the 'concurrency problem' than any other database code outside the RDBMS itself: every line of code outside the RDBMS that is not confined to a single thread is a potential danger to the integrity of the data in the RDBMS, since two or more threads can execute DML statements through that line of code at the same time, which can result in concurrency problems.

It therefore doesn't matter how you interact with a persistent storage such as an RDBMS: be it through low-level stored procedure calls, through objects from an O/R mapper layer, or by calling code in a DAL, each time you use that kind of code you run the risk of working on data that is also being targeted by another thread. Then there is the problem of (a) reading data from the database, (b) modifying it and (c) saving it again. By the time the application starts action (b), the data read from the database may already have been altered by another thread. This can be solved with locks in the persistent storage, but that doesn't scale well. If you look at it closely, it is again a concurrency problem: a user who executes the sequence (a), (b) and (c) can void the work of another user who is already in action (b). Is this the result of O/R mapping? No! It is related to code that runs outside the persistent storage, so any low-level DAL code will suffer from it too. O/R mapping is therefore not the only mechanism that suffers from this disadvantage: all C# code accessing a database does.
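To make the (a), (b), (c) sequence concrete, here is a minimal sketch in Python with an in-memory SQLite database. The `account` table, its columns and the function names are made up for the illustration, and the two "users" are simulated by interleaving their steps on one connection rather than by real threads. No O/R mapper is involved: this is plain low-level data-access code, and it loses an update all the same.

```python
import sqlite3


def lost_update_demo():
    """Interleave two read-modify-write sequences and return the final balance."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, only here for the illustration.
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100)")
    conn.commit()

    def read_balance():
        row = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()
        return row[0]

    def save_balance(value):
        conn.execute("UPDATE account SET balance = ? WHERE id = 1", (value,))
        conn.commit()

    # (a) both users read the same row into their own in-memory copy
    copy_a = read_balance()
    copy_b = read_balance()

    # (b) each user modifies only the in-memory copy
    copy_a += 50
    copy_b -= 30

    # (c) both users save; the second write silently overwrites the first
    save_balance(copy_a)
    save_balance(copy_b)

    final = read_balance()
    conn.close()
    return final
```

Had both changes been applied one after the other, the balance would be 120. With the interleaving above, user A's +50 is voided by user B's save, which was based on a stale copy.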

If you as a developer are aware of the fact that, no matter what you do, the code you write in C# or VB.NET is potentially not working with the correct data because it modifies data outside the persistent storage, you can do something about it by implementing concurrency control, preferably by using functionality locking, described in a previous blog. From that moment on you can use an O/R mapper layer without any problems, simply because the actions performed on the persistent storage no longer hurt other users.
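As one possible illustration of concurrency control, the sketch below uses an optimistic check on a version column. Note that this is not the functionality locking mentioned above; it is a different, commonly used technique, swapped in here only because it fits in a few lines. The `account` table, the `version` column and all function names are assumptions made for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one hypothetical, versioned row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
    )
    conn.execute("INSERT INTO account (id, balance, version) VALUES (1, 100, 1)")
    conn.commit()
    return conn


def load(conn, account_id):
    """Return (balance, version) as read from the persistent storage."""
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()


def save(conn, account_id, new_balance, expected_version):
    """Write only if the row still has the version we read; report success."""
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when another writer changed the row in the meantime.
    return cur.rowcount == 1


def optimistic_demo():
    """Two interleaved writers: the first save succeeds, the stale one is refused."""
    conn = make_db()
    balance_a, version_a = load(conn, 1)
    balance_b, version_b = load(conn, 1)
    ok_a = save(conn, 1, balance_a + 50, version_a)
    ok_b = save(conn, 1, balance_b - 30, version_b)  # based on a stale copy
    final = load(conn, 1)
    conn.close()
    return ok_a, ok_b, final
```

The second writer gets a refusal instead of silently voiding the first writer's work, and can re-read the row and retry. Whether the data-access code sits behind an O/R mapper or is hand-written does not change the check.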

So basically it comes down to this: realize that the data stored in memory (be it a DataTable, a custom class or a string of text) is a copy of the data stored in the persistent storage, and that other threads in the application (possibly on other machines as well, e.g. in a web farm) will not see your version of the data. If that's not what you want, make sure that other threads do see your version of the data, by (a) updating the persistent storage and (b) keeping your in-memory copy in sync with the persistent storage.

O/R mapping is a very useful, abstract way of working with entities that are stored in a database as rows in tables and views. It does not introduce new problems compared to other .NET code, but it does make working with data more transparent for the developer. Can the developer forget about the database? No, just as the developer can't forget about files in a filesystem, protocols when working with networks, etc. An entity is a concept that is defined both in code and in the database model, and thus in code you'll have relations as you have them in your database model. The database is there as the persistent storage, and the concept 'entity' is part of the database too. You can't ignore that simply because you are in the C# editor and not typing SQL.

The only small problem I have with writing an O/R mapping layer is that the relational databases used to persist the entity data are not capable of storing inheritance trees, simply because the relational model doesn't support inheritance of subtypes: you can't express a supertype-subtype relation in DDL, so you have to use attribute values for that, which is not workable (change the value of an attribute and the entity holding that value is suddenly of another type; that's not what I call 'robust'). However, this is not a problem per se: you can use the tables found in a schema in a catalog / tablespace as the entities in your code. Your OO abstraction will then be a little less abstract than with a database that's pure OO, but the abstract way of working with data that O/R mapping brings you is still possible, very useful and at a highly abstract level. With OO databases, O/R mapping can only get better, but O/R mapping is already very useful today.
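The attribute-value objection can be shown with a small sketch, again in Python with SQLite. The `employee` table, the `kind` column and the two classes are hypothetical; the point is only that the subtype lives in an ordinary attribute, so an ordinary UPDATE changes the type of the entity.

```python
import sqlite3


class Employee:
    def __init__(self, name):
        self.name = name


class Manager(Employee):
    pass


# Hypothetical mapping from the value in the 'kind' column to a class.
TYPE_BY_KIND = {"E": Employee, "M": Manager}


def load_employee(conn, employee_id):
    """Materialize a row as an object whose class depends on an attribute value."""
    name, kind = conn.execute(
        "SELECT name, kind FROM employee WHERE id = ?", (employee_id,)
    ).fetchone()
    return TYPE_BY_KIND[kind](name)


def discriminator_demo():
    """Load the same row before and after changing its 'kind' attribute."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, kind TEXT)")
    conn.execute("INSERT INTO employee (id, name, kind) VALUES (1, 'Ann', 'E')")
    conn.commit()

    before = type(load_employee(conn, 1)).__name__

    # Nothing in the schema ties 'kind' to a type; it is just a value.
    conn.execute("UPDATE employee SET kind = 'M' WHERE id = 1")
    conn.commit()

    after = type(load_employee(conn, 1)).__name__
    conn.close()
    return before, after
```

The same row comes back first as an `Employee` and then as a `Manager`, without the database having any notion that a type changed.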
