Entity Framework v4.1 update 1, the Kill-The-Tool-Eco-system version

Updated with a fix for Microsoft's code, so Microsoft can get this fixed quickly. See below.

As you might know, we've been busy with our own data-access profiler for a while now. The profiler, which can profile any data-access layer / ORM that uses DbProviderFactory, works by overwriting the DbProviderFactories table for the app-domain the profiler is used in. This is a common technique: it replaces the type name of the DbProviderFactory of an ADO.NET provider with the type name of a wrapper factory, which receives the real factory as a generic type argument. Example: ProfilerDbProviderFactory<System.Data.SqlClient.SqlClientFactory>.

This is the same technique used by the Hibernating Rhinos profilers and others, and it has the benefit that it's very easy to use and has no intrusive side effects: you only have to add one line of code to your own application and everything in the complete application can be profiled.

This morning I was looking into what the stack traces of code executed by MVC 3 looked like, so I used an example application to get up and running quickly. It required Entity Framework v4.1 (for code first), so I grabbed the latest bits of Entity Framework v4.1, which is the update 1 version. Our tests on Entity Framework v4.0 worked fine, so I was pretty confident.

However, it failed. Inside the Entity Framework v4.1 update 1 assembly, the code obtains the factory type name from the DbProviderFactories table and does some string voodoo on the name to obtain the assembly name. As it doesn't expect a generic type, the parsing fails and it simply crashes. For the curious:

Method which fails (in EntityFramework.dll, v4.1.10715.0, downloaded this morning):

public static string GetProviderInvariantName(this DbConnection connection)
{
    Type type = connection.GetType();
    if (type == typeof(SqlConnection))
    {
        return "System.Data.SqlClient";
    }
    AssemblyName name = new AssemblyName(type.Assembly.FullName);
    foreach (DataRow row in DbProviderFactories.GetFactoryClasses().Rows)
    {
        string str = (string) row[3];
        AssemblyName name2 = new AssemblyName(str.Substring(str.IndexOf(',') + 1).Trim()); // CRASH HERE
        if ((string.Equals(name.Name, name2.Name, StringComparison.OrdinalIgnoreCase) && (name.Version.Major == name2.Version.Major)) && (name.Version.Minor == name2.Version.Minor))
        {
            return (string) row[2];
        }
    }
    throw Error.ModelBuilder_ProviderNameNotFound(connection);
}

Stacktrace:

[FileLoadException: The given assembly name or codebase was invalid. (Exception from HRESULT: 0x80131047)]
   System.Reflection.AssemblyName.nInit(RuntimeAssembly& assembly, Boolean forIntrospection, Boolean raiseResolveEvent) +0
   System.Reflection.AssemblyName..ctor(String assemblyName) +80
   System.Data.Entity.ModelConfiguration.Utilities.DbConnectionExtensions.GetProviderInvariantName(DbConnection connection) +349
   System.Data.Entity.ModelConfiguration.Utilities.DbConnectionExtensions.GetProviderInfo(DbConnection connection, DbProviderManifest& providerManifest) +57
   System.Data.Entity.DbModelBuilder.Build(DbConnection providerConnection) +159
   System.Data.Entity.Internal.LazyInternalContext.CreateModel(LazyInternalContext internalContext) +61
   System.Data.Entity.Internal.RetryLazy`2.GetValue(TInput input) +117
   System.Data.Entity.Internal.LazyInternalContext.InitializeContext() +423
   System.Data.Entity.Internal.InternalContext.GetEntitySetAndBaseTypeForType(Type entityType) +18
   System.Data.Entity.Internal.Linq.InternalSet`1.Initialize() +63
   System.Data.Entity.Internal.Linq.InternalSet`1.GetEnumerator() +15
   System.Data.Entity.Infrastructure.DbQuery`1.System.Collections.Generic.IEnumerable.GetEnumerator() +40
   System.Collections.Generic.List`1..ctor(IEnumerable`1 collection) +315
   System.Linq.Enumerable.ToList(IEnumerable`1 source) +58
...
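
To make the failure concrete, here is a minimal sketch of what that Substring call produces for a plain factory type name versus a generic wrapper type name. The wrapper's namespace and assembly name ("Profiler") are made up for illustration, and the System.Data version string is only an example; the string handling itself is copied from the failing line above.

```csharp
using System;
using System.Reflection;

public static class GenericTypeNameDemo
{
    // Mirrors the string handling on the failing line of GetProviderInvariantName:
    // everything after the first comma is assumed to be the assembly name.
    public static string AfterFirstComma(string assemblyQualifiedName)
    {
        return assemblyQualifiedName.Substring(assemblyQualifiedName.IndexOf(',') + 1).Trim();
    }

    public static void Main()
    {
        // Assembly-qualified name of a regular, non-generic factory (example version).
        string real = "System.Data.SqlClient.SqlClientFactory, System.Data, " +
                      "Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089";

        // Assembly-qualified name of a generic wrapper factory.
        // ASSUMPTION: the "Profiler" namespace and assembly are hypothetical.
        string wrapped = "Profiler.ProfilerDbProviderFactory`1[[" + real + "]], " +
                         "Profiler, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null";

        foreach (string typeName in new[] { real, wrapped })
        {
            string candidate = AfterFirstComma(typeName);
            Console.WriteLine("Candidate: " + candidate);
            try
            {
                // Same constructor as in the stack trace above.
                Console.WriteLine("  Parsed as: " + new AssemblyName(candidate).Name);
            }
            catch (Exception ex)
            {
                Console.WriteLine("  Rejected: " + ex.GetType().Name);
            }
        }
    }
}
```

For the generic name, the first comma sits inside the type-argument brackets, so the candidate starts with the inner factory's assembly and still carries the trailing "]], Profiler, ..." part. That string is not an assembly name, which is what the AssemblyName constructor trips over.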

Mind you, this isn't a CTP. It's the real deal. Hibernating Rhinos blogged yesterday about this problem in v4.2 CTP1, and they added a temporary workaround, but in the end this situation actually sucks big time.

We're close to beta for our profiler, which supports (among all other data-access code that uses DbProviderFactory) LLBLGen Pro, Linq to Sql, Massive, Dapper and Entity Framework v1 and v4, but from the looks of it, not v4.1. In the many years we've been building tools for .NET, this is the biggest let-down Microsoft has given me: almost done with the release and now this...

Frankly I don't know what Microsoft is up to, but it sure as hell isn't helping the tool eco-system along, on the contrary. At the moment, I'm simply sad and angry... sad for hitting just another wall after all the work we've done and angry because it's so unnecessary.

Hopefully they fix this soon...

Update

I rewrote their code in a test to see if I could obtain what they want to obtain and still use the overwriting. It's easy, especially since they have access to the DbConnection.ProviderFactory property, which is internal, but not for Microsoft. My test below uses reflection, which they don't have to use. It's hacked together, so not production-ready code, but it serves the purpose of illustrating what could be done about it with little effort. The 'continue' in the catch is there because you can't recover from any exceptions at that point anyway (and most of them originate from factories you can't load).

[Test]
public void GetProviderInvariantName()
{
    var factory = DbProviderFactories.GetFactory("System.Data.SqlClient");
    var connection = factory.CreateConnection();
    Type type = connection.GetType();
    AssemblyName name = new AssemblyName(type.Assembly.FullName);
    var factories = DbProviderFactories.GetFactoryClasses();
    string invariantName = string.Empty;
    var dbProviderFactoryProperty = connection.GetType().GetProperty("DbProviderFactory", BindingFlags.NonPublic | BindingFlags.Instance);
    foreach(DataRow row in factories.Rows)
    {
        try
        {
            var tableFactory = DbProviderFactories.GetFactory(row);
            if(tableFactory.GetType()==dbProviderFactoryProperty.GetValue(connection, null).GetType())
            {
                // found it. 
                invariantName = (string)row[2];
                break;
            }
        }
        catch
        {
            continue;
        }
    }
    Assert.AreEqual("System.Data.SqlClient", invariantName);
}
Published Thursday, July 28, 2011 12:49 PM by FransBouma

Comments

# re: Entity Framework v4.1 update 1, the Kill-The-Tool-Eco-system version

Thursday, July 28, 2011 5:43 PM by Synerlan SSII

Wow ... hidden breaking changes like those are pretty brutal in a library that's supposed to integrate with any set of custom dev tools. I don't think Microsoft has done too much of this kind of thing in the past. Maybe the EF team has less stringent requirements for maintaining API stability than some larger teams do?

# re: Entity Framework v4.1 update 1, the Kill-The-Tool-Eco-system version

Monday, August 8, 2011 5:35 AM by scifire

The type dbProviderFactoryProperty.GetValue(connection, null).GetType() is loop invariant, so you can move it to the outer scope. Yes, I know it's a test, it's not going to have many factories and it will probably be optimized by the compiler, but it's still good programming practice to write loop-invariant code outside of the loop, and I know you are a good programmer.

# re: Entity Framework v4.1 update 1, the Kill-The-Tool-Eco-system version

Monday, August 8, 2011 5:47 AM by FransBouma

@scifire:

I agree that any loop invariant logic should be moved outside the loop, no question about that.

I didn't pay attention to anything when hacking that test together. I just wanted to see whether it was possible at all to determine the information in another way, and it was, that was the whole purpose :)

You also have to imagine me sitting behind the keyboard totally down and mentally broken at that moment after I realized our hard work for the past 3 months was a waste of time until this was fixed.

We found a workaround the next day. It's not pretty, but it works: generate a wrapper class per factory found, in-memory, and compile it into a separate assembly in-memory. This gives a worst-case 2-3 second startup penalty, but at least it makes it work again.

Microsoft has told us they're working on a fix (really wondering how long that will take them, the fix is already known ;)) and will post about it soon. We're not holding our breath though, so we worked on a workaround to at least make our profiler, which is going to hit beta soon, work with EF v4.1 update 1. :)