O/R mappers and concurrency control

Paul Wilson and Alex Thissen both blog about concurrency control related to O/R mappers. Let me start by pointing you to an article about concurrency methods I wrote some time ago: Concurrency Control Methods: is there a silver bullet?. I don't believe in low-level concurrency methods, as they give you the false sense that 'it has been taken care of' while they just don't do that: they still cause loss of work.

However, there are occasions where low-level concurrency control can be helpful: for example when you don't care whether someone loses his/her work and you only care about the consistent state of the database, or when you want to prevent deletion of data after some period of time (a user opens an edit screen, goes away for lunch, comes back and deletes the data; because the state of the record has changed in the meantime, the delete should not happen). This example also illustrates that concurrency control is not solely related to saving data; it is also related to deleting data.

In my commercial O/R mapper LLBLGen Pro I had to solve the same question Paul Wilson asked himself: how do you implement concurrency control? History shows that when you ask a random group of developers "how would you implement optimistic concurrency control?", you get a wide variety of answers. This means that, besides the fact that there is no silver bullet to solve all concurrency problems, it is also not possible to define a single common way to supply concurrency control: different developers will implement concurrency control differently, or at least want to use it differently.

In LLBLGen Pro I've solved it by introducing a predicate factory which is used via the Strategy pattern [GoF]. A predicate is a clause which can be used in the WHERE clause of a select or delete. The factory implements a common interface, IConcurrencyPredicateFactory, defined as follows:

public interface IConcurrencyPredicateFactory
{
   IPredicateExpression CreatePredicate(
      ConcurrencyPredicateType predicateTypeToCreate, 
      object containingEntity);
}
where ConcurrencyPredicateType is a normal enum.
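To make this concrete, here is a minimal sketch of an implementation that checks a timestamp column. Treat the entity type (OrderEntity), its Timestamp field and the exact predicate construction as assumptions for the example, not the precise LLBLGen Pro API:

```csharp
// Sketch of a timestamp-checking factory. OrderEntity, OrderFields and
// the predicate construction below are assumed names for this example.
public class OrderConcurrencyFactory : IConcurrencyPredicateFactory
{
    public IPredicateExpression CreatePredicate(
        ConcurrencyPredicateType predicateTypeToCreate,
        object containingEntity)
    {
        IPredicateExpression toReturn = new PredicateExpression();
        OrderEntity order = (OrderEntity)containingEntity;

        switch(predicateTypeToCreate)
        {
            case ConcurrencyPredicateType.Save:
            case ConcurrencyPredicateType.Delete:
                // Only match the row when the timestamp still has the
                // value read at fetch time; the engine ANDs this into
                // the WHERE clause of the UPDATE/DELETE statement.
                toReturn.Add(OrderFields.Timestamp == order.Fields["Timestamp"].DbValue);
                break;
        }
        return toReturn;
    }
}
```

If the timestamp was changed by another user in the meantime, the generated statement matches zero rows and the Save or Delete fails, which the application can then handle.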

A developer who wants to add concurrency control to a given entity can implement this interface and store an instance of that implementation in the entity instance. Once that is done, persistence logic such as the Save or Delete logic will consult the IConcurrencyPredicateFactory instance for a predicate to add to the filter used in the Save or Delete action; this also happens during recursive saves (saving a complete hierarchy of objects at once in a single transaction). Because the predicate is constructed at runtime, in a class the developer writes him/herself, the predicate can be constructed precisely how the developer wants it to be. The factory can even fire events, call delegates or consult other objects to retrieve the information needed to produce the correct predicate expression; because the class is written by the developer, it just has to implement the one method defined in the interface.

Discussions about how concurrency control has to be implemented are then no longer necessary: the developer decides. Customer requires timestamp-based concurrency control and Order requires all-value-checking concurrency control? Just produce the predicate expressions which match those requirements: pass a different implementation of IConcurrencyPredicateFactory to the Customer entities than you pass to the Order entities. This makes concurrency control as flexible as you can possibly get, and usable not only on a per-type basis but also on a per-instance basis.
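Wired up, that per-type / per-instance choice could look like the fragment below. ConcurrencyPredicateFactoryToUse is the entity property LLBLGen Pro uses for this, but treat the factory class names and entity constructors as invented for the example:

```csharp
// Different concurrency strategies for different entity types (or even
// individual instances). The factory class names are invented here.
CustomerEntity customer = new CustomerEntity("CHOPS");
customer.ConcurrencyPredicateFactoryToUse = new TimestampConcurrencyFactory();

OrderEntity order = new OrderEntity(10254);
order.ConcurrencyPredicateFactoryToUse = new AllValueCheckConcurrencyFactory();

// Save/Delete actions on these instances, including recursive saves,
// will now consult the assigned factory for the predicate to append.
```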

11 Comments

  • Good stuff Frans. I totally agree that optimistic concurrency is often, and incorrectly, given a very lofty place. That's why I didn't even bother to include it in my first version, and even now I make you set it up. I also don't like the fact that it's the default in MS ObjectSpaces, although I assume (hope) there's some way to turn it off. I have seen very few systems over the years that have multiple people updating the same records, although it's very common to have triggers or other checks against common summary data, but that's not really the same thing at all!

  • How do you solve the case where you want to do an optimistic lock on a row that you will just be reading and not updating?



    For example, I read the product price but I don't update the product table. When I save the order, I want to make sure the product has the same price as it had when I read it.

  • Peter: you're not able to do that unless you place an exclusive lock on the product row in the products table. However, even then, this can be overruled by hints like 'NOLOCK' on SQL Server, for example.



    It will also really hurt performance, as it will block any readers of the product table during your action (on SQL Server).



    If you really want to prevent that, you have to schedule logic in your application so it can't update a product row if the product row is used in some logic in the application. This is called functionality locking; see my blog about that (linked in this blog) for more details.

  • Frans,



    No, I can perform the same kind of optimistic locking. I read the product price when I read the order, and when I update the order I check whether the product price is the same.



    The thing is that this is an optimistic lock on a table that you are not updating.

  • You can't 'lock' anything. The only way you can achieve it is through an extra FROM clause and WHERE predicates which filter on the product price row (or a subquery on Oracle). This can result in pretty intensive SQL.

  • I don't get what you mean.



    I could do a select with (UPDLOCK), or a select under a serializable isolation level, on the Product when updating, check the old value against the new one, and then continue with the usual optimistic locking process. That's what I would do if I needed that to happen automatically without support from the O/R mapper.





  • An update lock or another lock can always be circumvented through the usage of 'NOLOCK' if you want. You'd also still lock the product row, which will block readers that don't use NOLOCK.



    As soon as you bring data outside the RDBMS (in this case the product price) and use that data to write other data back into the database, you can have differences between the original data read and the values of the same fields at the time of writing of the order. (product price may have changed).



    Does this matter? No. The reason for this is that you have to read the data first, do some things with it (probably put it in a gui) and some time later the price data is used for an insert somewhere else (order). During that time, the price row is locked in your situation. This hurts performance tremendously.



    It also causes weird logic: what happens when the order is written and the price has changed? Does the order fail? The customer has agreed to buy it at the price specified. You can't say: "Oh, erm.. the price is higher now". In the worst case, the customer has to decide whether he wants to order at the new price, perhaps finds it too high and leaves: the customer has spent time filling in the order, which is now lost.



    Locking rows during a long transaction (with user intervention) is not an option as this will hurt performance.

  • > However, there are occasions where low level concurrency control can be helpful: for example if you don't care if someone looses his/her work and you just care about the consistent state of the database, or in situations where you for example want to prevent deletion of data after some period of time (user opens edit screen, goes away for lunch, comes back, deletes data, however the state of the record has been changed, the delete should not happen).



    C’mon Frans, I know you can do better, after reading this hard-to-read sentence :) This is a typical scenario I find myself in while writing factory automation software. 99.9% of the time I only care about detecting the concurrency violation and retrying, by processing and offering the whole load of data again and again, and again.



    Isn’t it so that a typical business application would profit from a “context-based concurrency strategy”? Your IConcurrencyPredicateFactory isn’t flexible in the case where a single entity would “vote” and participate in multiple object hierarchies, or in more than one use case so to speak, each offered as a unit of work.

  • "C’mon Frans, I know you can do better, after reading this hard-to-read sentence :) This is a typical scenario I find myself in while writing factory automation software. 99.9% of the time I only care about detecting the concurrency violation and retrying, by processing and offering the whole load of data again and again, and again."

    But in the end you have to decide to quit processing it. Low-level concurrency control is not a solution; it's a last resort for when your higher-level concurrency scheme has failed.



    "Isn’t it so that a typical business application would profit from a “context-based concurrency strategy”? Your IConcurrencyPredicateFactory isn’t flexible in the case where a single entity would “vote” and participate in multiple object hierarchies, or in more than one use case so to speak, each offered as a unit of work."

    Why is the most fine-grained solution not flexible enough? :) The implementation of the interface is up to the developer, so if the developer wants to add events or references to other objects, that's possible. The decision the concurrency predicate factory has to make can therefore be based on whatever scenario you can think of. It might also simply fire an event which is caught by an external service object which has a clear overview of what to do.



    If you want higher-level concurrency functionality, it is not wise to implement that at a low level; in all cases it's wiser to simply add logic which prevents low-level concurrency violations, like multi-edits of the same row, from popping up.
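    The retry-then-give-up approach discussed above could be sketched like this; the exception type and the refetch helper are assumed names for the example, not a specific API:

    ```csharp
    // Sketch: retry a save a bounded number of times on a concurrency
    // violation, then give up. ORMConcurrencyException and
    // RefetchAndReapplyChanges are assumed names for this example.
    const int maxAttempts = 3;
    bool saved = false;
    for(int attempt = 0; attempt < maxAttempts && !saved; attempt++)
    {
        try
        {
            saved = order.Save();
        }
        catch(ORMConcurrencyException)
        {
            // Row changed underneath us: refetch the current state,
            // reapply our changes and let the loop try again.
            RefetchAndReapplyChanges(order);
        }
    }
    if(!saved)
    {
        // In the end you have to decide to quit processing it.
    }
    ```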

  • Ok, I didn't realize this level of concurrency handling is intended to be implemented behind your IConcurrencyPredicateFactory interface. Perhaps you should add a comment on your LLBLGen Pro product page... I didn't find anything about it in the public section.



    Thnx for commenting it further.

  • Paul: the interface is meant to supply a filter object which should limit the query executed. So if your tables have a timestamp column, you can produce a filter which filters on a particular timestamp value. This predicate (the filter object is a predicate object) is then ANDed to the WHERE clause of the query, which thus fails if the timestamp value isn't matched.



    However if you want to test on other things, you can. It's up to you :)



    Your point about the info is a good one. We'll make the documentation available online for browsing next week (as this is explained in there).

Comments have been disabled for this content.