LINQ to Concurrency Problems

Tuesday, October 28, 2008

LINQ to SQL is fascinating - the more I work with it, the more I hate it, and then love it again all at the same time. Recently we had an issue while using it in a slightly older version of our N-Tier WCF application, which uses LINQ to SQL as its primary ORM solution (yes, I know).

I'll try to briefly familiarize you with the part of the system that became interesting here. In our system we have a domain object called "Asset". Asset has a property called "AssetType" whose type is also named "AssetType". Here's a super simplified version:
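A minimal sketch of the shape described above. The `Id` and `Name` members are hypothetical and only there for illustration; the real classes are generated by the LINQ to SQL designer and carry mapping attributes and change tracking that are left out here.

```csharp
// Hypothetical, hand-written stand-ins for the designer-generated entities.
public class AssetType
{
    public int Id { get; set; }       // assumed key column
    public string Name { get; set; }  // assumed display name
}

public class Asset
{
    public int Id { get; set; }               // assumed key column
    public string Name { get; set; }          // assumed display name
    public AssetType AssetType { get; set; }  // the property this post is about
}
```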

The interesting thing about our system is that it is strongly tied to a legacy system which keeps all of its data in XML files organized using a folder structure. The legacy system can't go away and we can't yet migrate the data to our new system's database, so we actually pull our asset objects from either the database or the XML files. Once we have the asset built based on the XML data, we need to assign the AssetType, which is only stored in the database. The first solution was to use LINQ to SQL to select an appropriate AssetType and attach it to the asset created from XML. The problem here is that getting an asset from XML (which happens more often than from the db) took about twice as long as we wanted it to.

Solution? Cache, Baby, Cache!

We started by creating a very basic cache. We decided to just create a static class called "DataCacheHelper" with a single static property that was a static collection of AssetType. By selecting all of the AssetTypes from the database the first time we needed them, and storing them in the static collection, we could quickly grab the one we wanted from any OperationContext when we needed it, without having to go to the database. It was fast and it worked great when we unit tested it. We deployed it to our test environment, went through the usual workflows and test cases, and everything seemed to check out, so we were cleared to go to production.
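Roughly, the cache looked like the sketch below. Only the `DataCacheHelper` name comes from the real system; the `LoadAssetTypes` delegate, the `GetAssetType` helper, and the `Id`/`Name` members are assumptions that stand in for the LINQ to SQL query and the generated entity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the generated entity.
public class AssetType
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class DataCacheHelper
{
    private static readonly object Sync = new object();
    private static List<AssetType> assetTypes;

    // Assumption: stands in for the query that selects every AssetType from the database.
    public static Func<IEnumerable<AssetType>> LoadAssetTypes =
        () => { throw new InvalidOperationException("No loader configured."); };

    public static IList<AssetType> AssetTypes
    {
        get
        {
            lock (Sync)
            {
                if (assetTypes == null)
                {
                    // First access: one round trip, then everything is served from memory.
                    assetTypes = LoadAssetTypes().ToList();
                }
                return assetTypes;
            }
        }
    }

    // Hypothetical lookup helper: returns null when no type has the given id.
    public static AssetType GetAssetType(int id)
    {
        return AssetTypes.FirstOrDefault(t => t.Id == id);
    }
}
```

Note that the lock only guards the first load. The AssetType instances it hands out are the same objects for every caller in the process.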

That's when it all got... interesting.

Everything was going fantastic. More than fantastic. The system was 3 times faster for some calls, and just more performant all around. We were all sitting here, hitting the system from various vectors, gleefully proving to ourselves how great we all are, never imagining that we had overlooked something critically important in our design. Then:

Person 1: "Uh... I have a failure."

Me: "Which method?"

Person 2: "Testcase 3."

Person 3: "It passes for me."

Me: "Yeah me too... wait... it just failed for me now."

And then they all failed. The whole thing kept on failing and we hurriedly rolled back the production deployment. But what happened?

System.ArgumentException: Could not modify EntitySet.
   at System.Data.Linq.EntitySet`1.CheckModify()
   at System.Data.Linq.EntitySet`1.Add(TEntity entity)
   at Domain.Entities.Asset.set_AssetType(AssetType value) in DomainDataEntities.designer.cs:line 2933

So this was unexpected, but at least I had a place to start. The generated code for setting the AssetType property for Asset was throwing an exception for some reason. I looked at the code and that's when the first problem dawned on me. LINQ to SQL will automatically create the reverse reference for you by default. So my AssetType had a property called "Assets" that was a collection of Asset objects. Specifically, it was an EntitySet<Asset>. I realized instantly that EntitySet is probably not thread safe, and even if it were, it certainly wasn't OperationContext safe (as you can have more than one operation per thread). So the design had failed because LINQ to SQL was doing its job and making assumptions about how we wanted to work with our data, and we weren't doing our job and customizing what LINQ to SQL had created to fit the specific way we wanted to work with our domain objects. Fortunately this was REALLY easy to fix.
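To make the failure mode concrete, here is a simplified model of the two-way association. It uses a plain `List<Asset>` in place of `EntitySet<Asset>` and leaves out change notification, so it is only a sketch of the behavior the stack trace points at (the setter adding the asset to the type's collection), not the generated code itself. The `Demo` class and the file names are made up.

```csharp
using System.Collections.Generic;

public class AssetType
{
    public string Name { get; set; }

    // Stand-in for the generated EntitySet<Asset> reverse reference.
    public List<Asset> Assets { get; } = new List<Asset>();
}

public class Asset
{
    private AssetType assetType;

    public string Name { get; set; }

    public AssetType AssetType
    {
        get { return assetType; }
        set
        {
            if (ReferenceEquals(assetType, value)) return;

            // Keep both sides of the association in sync.
            if (assetType != null) assetType.Assets.Remove(this);
            assetType = value;
            if (value != null) value.Assets.Add(this);
        }
    }
}

public static class Demo
{
    public static int Run()
    {
        // One cached instance, shared by every request in the process.
        var cachedType = new AssetType { Name = "Image" };

        var fromRequestA = new Asset { Name = "a.xml" };
        var fromRequestB = new Asset { Name = "b.xml" };

        // Each assignment also writes to cachedType.Assets.
        fromRequestA.AssetType = cachedType;
        fromRequestB.AssetType = cachedType;

        return cachedType.Assets.Count;
    }
}
```

Assigning a cached AssetType looks like a read of shared data, but every assignment is also a write to the shared collection. With concurrent WCF operations, those writes overlap.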

If you look at the properties for the relationship in the designer you'll see a property named "Child Property". All you have to do is set that to false and voila! The AssetType object no longer has an "Assets" property. We rebuilt, retested, and are now happy with a system that is both more performant AND has the added benefit of actually working.

As a side note, while researching this issue, I found absolutely 0 results for a Google search on "EntitySet.CheckModify()". Unfortunately you can't get source code for System.Data.Linq yet to see what that method is checking before it throws, but I used Reflector (search for it) and was able to see exactly how it worked and why it was throwing. I'd love to post more about that but I think MS frowns on posting decompiled source code, so I'll have to come up with a better way.

@TechnnetGuy

I've had that set up since it was available. Unfortunately, System.Data.Linq source code is not yet available so the feature doesn't help.

fdumlao - Wednesday, October 29, 2008 5:06:16 PM

Awesome! Thanks for posting. Otherwise, I might have spent all day on this problem.

Robert Claypool - Wednesday, December 2, 2009 6:16:32 PM
