Thursday, April 02, 2009 3:30 PM Mohamed Meligy

Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

If you don’t know what eager loading is, jump to “What’s Eager Loading?” below.

Eager Loading Syntax

If you are eager loading Products, for example, in a typical (Categories 1<->* Products) relation, the standard syntax looks like this:

DbDataContext.Categories.Include("Products")

What is the problem with that?

The “Products” part. The word “Products” is a string. If I rename the Products table to ShopProducts, move it out of this data diagram, or the relation gets removed from the DB/diagram by mistake, my code will still compile but will fail when it runs. This is BAD BAD BAD.

How to solve this?

Since I always believe that if something exists somewhere you shouldn’t do it yourself unless it’s totally broken (and I mean REALLY REALLY BROKEN), I started searching inside the Entity Framework itself for something to get the entity name from.

At first it seemed super easy. Every entity class has a static property “EntityKeyPropertyName”, so I thought I could write something like:

DbDataContext.Categories.Include(Product.EntityKeyPropertyName); // But this didn’t work

Where Product is the entity class generated for the table “Products”. Note that singularizing the name (Products becomes Product) does not happen automatically as it does in LINQ to SQL; you’ll have to change it manually, though that is not required for the code here, of course.

As you can see in the comment, this didn’t work. The value of the property was always “-EntityKey-”, the default value from the abstract class “StructuralObject”, which all entity classes inherit.

I kept searching all over until I found that the only place I could get the name from was an attribute generated on the class, somewhat like this:

[global::System.Data.Objects.DataClasses.EdmEntityTypeAttribute(NamespaceName="DatabaseNameFlowModel", Name="Products")]

My requirement was simple: if something is wrong in the diagram with the relation between the ParentTable and ChildTable tables (rather than with the entity classes themselves), my code should not compile fine and then fail at run time. I need code that depends on the relation, so that if something is wrong with the relation the code fails early and I learn about the problem the next time I build the VS project.
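A minimal sketch of that requirement, not the final code: a lambda that mentions the property is checked by the compiler, and the property name can be read back from the expression tree. The `Category` and `Product` classes and the `MemberName.Of` helper are hypothetical stand-ins written for this illustration; they are not Entity Framework types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-ins for generated entity classes (assumption of this sketch).
public class Product
{
    public Category Category { get; set; }
}

public class Category
{
    public List<Product> Products { get; set; }
}

public static class MemberName
{
    // Returns the name of the property or field the lambda points at.
    // If Products is renamed or removed, the lambda at the call site
    // stops compiling instead of failing at run time.
    public static string Of<T>(Expression<Func<T, object>> selector)
    {
        Expression body = selector.Body;

        // Value-typed members are wrapped in a Convert node when boxed to object.
        if (body.NodeType == ExpressionType.Convert)
            body = ((UnaryExpression) body).Operand;

        var member = body as MemberExpression;
        if (member == null)
            throw new NotSupportedException("Expected a member access such as c => c.Products.");

        return member.Member.Name;
    }
}
```

With these stand-ins, `MemberName.Of<Category>(c => c.Products)` yields the string "Products", and the call no longer compiles if the property disappears.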

The Final Solution

I tried hard to get the entity name from the API. After some frustration, I ended up writing this code:

using System;
using System.Collections.Generic;
using System.Data.Objects;
using System.Data.Objects.DataClasses;
using System.Linq.Expressions;
using System.Reflection;
 
namespace Meligy.Samples.EntityFramework
{
    public static class LinqExtensions
    {
        //Used for child entities.
        //Example: Order.Customer
        public static ObjectQuery<T> Include<T>(this ObjectQuery<T> parent,
                                                Expression<Func<T, StructuralObject>> expression)
            where T : StructuralObject
        {
            return Include(parent, (LambdaExpression) expression);
        }
 
        //Used for child collections of entities.
        //Example: Order.OrderLines
        public static ObjectQuery<T> Include<T>(this ObjectQuery<T> parent,
                                                Expression<Func<T, RelatedEnd>> expression)
            where T : StructuralObject
        {
            return Include(parent, (LambdaExpression) expression);
        }
 
        private static ObjectQuery<T> Include<T>(ObjectQuery<T> parent,
                                                 LambdaExpression expression)
            where T : StructuralObject
        {
            //There must be only one root entity to load related entities to it.
            if (expression.Parameters.Count != 1)
            {
                throw new NotSupportedException();
            }
 
            //We'll store entity names here in order then join them at the end.
            var entityNames = new List<string>();
 
            //We split the calls ... Entity.MemberOfTypeChild.ChildMemberOfChildMember etc..
            //Example: (Order ord) => ord.Customer.Address
            string[] childTypesMembers = expression.Body.ToString().Split('.');
 
            //Get the root entity type to start searching for the types of the members inside it.
            //In prev. example: Find: Order
            Type parentType = expression.Parameters[0].Type;
            //entityNames.Add(GetEntityNameFromType(parentType));
 
            //The first word in the expression is just a variable name of the root entity. 
            //  Skip it and start next.
            //In example: First part is: ord
            for (int i = 1; i < childTypesMembers.Length; i++)
            {
                string memberName = childTypesMembers[i];
 
                //Get the member from the root entity to get its entity type.
                MemberInfo member = parentType.GetMember(memberName)[0];
 
                //We cannot get the type of the entity except by knowing
                //  whether it's property or field (most likely will be property).
                //Bad catch in the reflection API? Maybe!
                Type memberType = member.MemberType == MemberTypes.Property
                                      ? ((PropertyInfo) member).PropertyType
                                      : ((FieldInfo) member).FieldType;
 
                //Add the entity name obtained from the entity type to the list.
                entityNames.Add(GetEntityNameFromType(memberType));
 
                //The next member belongs to the child entity, so
                //  the type to search for members becomes the child entity type.
                parentType = memberType;
            }
 
            //Join the entity names by "." again.
            string includes = string.Join(".", entityNames.ToArray());
 
            //Simulate the original Include(string) call.
            return parent.Include(includes);
        }
 
        private static string GetEntityNameFromType(Type type)
        {
            // We didn't just use the Entity type names because maybe
            //  the table is called Orders and the class is Order or OrderEntity.
 
            if (type.HasElementType) //For arrays, like: OrderLines[]
            {
                //The type of the element of the array is what we want.
                type = type.GetElementType();
            }
            else if (type.IsGenericType) // for collections, like: EntityCollection<OrderLines>
            {
                var genericClassTypeParameters = type.GetGenericArguments();
 
                //The generic class must have one entity type only to load it.
                if (genericClassTypeParameters.Length != 1)
                    throw new NotSupportedException();
 
                //The type of the element of the collection is what we want.
                type = genericClassTypeParameters[0];
            }
 
            //Get the attributes that have the entity name in them.
            var entityTypeAttributes =
                type.GetCustomAttributes(typeof (EdmEntityTypeAttribute), true) as EdmEntityTypeAttribute[];
 
            //Make sure there IS one and ONLY one attribute to get the only entity name.
            if (entityTypeAttributes == null || entityTypeAttributes.Length != 1)
                throw new NotSupportedException();
 
            //Return the entity name.
            return entityTypeAttributes[0].Name;
        }
    }
}

 

This enables you to write:

DbDataContext.Categories.Include( (cat)=> cat.Products);

Or:

DbDataContext.Products.Include( (prod)=> prod.Category);

Depending on what you need.

For things like: Order.Customer.Address (multiple levels), you’ll have to write code like:

DbDataContext.Orders.Include( order => order.Customer ).Include( customer => Customer.Address );
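A multi-level path can also be read from a single lambda such as (Order ord) => ord.Customer.Address, which is the shape used as the example in the comments of the private Include method above. The following is a minimal sketch of that idea, assuming plain stand-in classes. `Order`, `Customer`, `Address` and the `IncludePath.From` helper are hypothetical names for this illustration. Note that this sketch collects the member names by walking the expression tree; it does not read entity names from EdmEntityTypeAttribute as the code above does, and it does not split the expression's string form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-ins for generated entity classes (assumption of this sketch).
public class Address { }
public class Customer { public Address Address { get; set; } }
public class Order { public Customer Customer { get; set; } }

public static class IncludePath
{
    // Builds "Customer.Address" from order => order.Customer.Address
    // by walking the member-access chain from the outermost member inwards.
    public static string From<T, TResult>(Expression<Func<T, TResult>> selector)
    {
        var names = new List<string>();
        Expression current = selector.Body;
        var member = current as MemberExpression;

        while (member != null)
        {
            // Outer members are visited first, so prepend to keep root-to-leaf order.
            names.Insert(0, member.Member.Name);
            current = member.Expression;
            member = current as MemberExpression;
        }

        // The chain must consist of member accesses only and end at the lambda parameter.
        if (names.Count == 0 || current != selector.Parameters[0])
            throw new NotSupportedException("Expected a chain such as o => o.Customer.Address.");

        return string.Join(".", names.ToArray());
    }
}
```

With these stand-ins, `IncludePath.From((Order o) => o.Customer.Address)` produces a dotted path that could then be handed to a string-based Include call.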

What’s Eager Loading? (in case you don’t know)

Let’s say you have tables Products, and Categories with relation 1<->* between them (Any category has many products; one product has one category). Let’s say you want to display a page of all products grouped by categories. Something like the following list but with much more information of course:

    • Category 1
      • Product A
      • Product B
      • Product C
    • Category 2
      • Product X
      • Product Y
      • Product Z

If you are using some ORM / code generator that creates classes like “Product” and “Category” for you and gives you properties like “myCategory.Products” and “myProduct.Category”, how would you create such a page?

Normally you’ll put a repeater or such for products inside a repeater for categories.

The products repeater will have its data source set to the current category item of the Categories repeater, something like “( (CategoryEntity)Container.DataItem ).Products”. Fine with that? Familiar?

OK. Now, suppose the code generator produced something like this for the “Products” property:

private List<Product> _Products;
public List<Product> Products
{
    get
    {
        if (_Products == null)
        {
            _Products = (from products in DB.Products
                         where products.CategoryID == this.ID
                         select products)
                .ToList();
        }

        return _Products;
    }
}

* Never mind the LINQ syntax. It’s just like writing “SELECT * FROM [Products] WHERE …” with all the dataset/datareader stuff.

Lazy Loading (AKA. Deferred Loading)

If the generated code (or your code) looks like this, it means that for every category in the database, you’ll have a separate DB call to get the products of that category.

It also means that the products of each category will not be loaded until someone writes code that calls the getter of the Products property. That’s why this style of coding (not loading the related objects until they’re asked to be loaded) is called Lazy Loading.

This is good for a single category where you may need just the basic information of the category and will not touch its products, since they are never loaded unless you ask for them.

However, this is very bad for our example page, because it means a new DB call for each category. Imagine that you have 20 or 50 or 100 categories there. How many DB calls will this give you? (Hint: N calls for the N categories + 1 call for the category list itself.)
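To make the N + 1 count concrete, here is a small simulation. It is only a sketch: `FakeDb`, `LazyCategory` and `LazyLoadingDemo` are hypothetical names, and the "database" is an in-memory list that merely counts how many times it is queried.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int CategoryID;
    public string Name;
}

// Hypothetical in-memory "database" that counts the queries it serves.
public static class FakeDb
{
    public static int QueryCount;

    private static readonly List<Product> ProductRows = new List<Product>
    {
        new Product { CategoryID = 1, Name = "Product A" },
        new Product { CategoryID = 1, Name = "Product B" },
        new Product { CategoryID = 2, Name = "Product X" },
        new Product { CategoryID = 3, Name = "Product Q" },
    };

    public static List<LazyCategory> LoadCategories()
    {
        QueryCount++; // one call for the category list itself
        return new List<LazyCategory>
        {
            new LazyCategory(1), new LazyCategory(2), new LazyCategory(3)
        };
    }

    public static List<Product> LoadProductsOf(int categoryId)
    {
        QueryCount++; // one call per category whose products are requested
        return ProductRows.Where(p => p.CategoryID == categoryId).ToList();
    }
}

public class LazyCategory
{
    private List<Product> _products;

    public int ID { get; private set; }

    public LazyCategory(int id) { ID = id; }

    // Products are fetched on first access only (lazy loading).
    public List<Product> Products
    {
        get
        {
            if (_products == null)
                _products = FakeDb.LoadProductsOf(ID);
            return _products;
        }
    }
}

public static class LazyLoadingDemo
{
    // Walks the listing page and returns how many queries that took.
    public static int CountQueriesForListingPage()
    {
        FakeDb.QueryCount = 0;
        int renderedRows = 0;
        foreach (LazyCategory category in FakeDb.LoadCategories())
        {
            renderedRows += category.Products.Count; // triggers the lazy getter
        }
        return FakeDb.QueryCount;
    }
}
```

With the three categories in this fake data, the listing page costs four queries: one for the category list and one per category.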

Eager Loading

What if the code in the getter above was in the constructor? Imagine something like:

public Category(int categoryID)
{
    // Code that loads category info from DB. Does not matter here.
    // Could be in a factory method or such. Not our topic.
    _CategoryID = categoryID;
    // .... .... .... Continue Loading Logic

    //The important part
    _Products = (from products in DB.Products
                 where products.CategoryID == this.ID
                 select products)
                .ToList();
}

This is good when you know that in every situation where you use the category, the Products will be needed. This is probably not useful in a Product/Category scenario, but think of a Person table and an Address table, where most likely whenever you load a Person you’re going to load that person’s Addresses.

This is also especially useful when using an ORM/code generator as in the first example. Let’s get back to the repeater example. If you use Entity Framework or a similar ORM and you set the Categories query to eager load the Products (meaning each Category is created with its Products already loaded), Entity Framework can use a single connection and only TWO database hits, one for the Categories and one for the Products. This is very useful in many listing scenarios. It also helps especially when you have many parent objects (say Categories), or if the parent object needs to load entities of many different classes (say a User needs to load Roles and Permissions and Personal Information and History and so on), if such a case applies to you, of course.
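The two-hit idea can be simulated in a few lines. This is only a sketch of the general strategy, not how Entity Framework implements eager loading internally: `FakeDb` and `EagerLoadingDemo` are hypothetical names, and the "database" is an in-memory list that counts its queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int CategoryID;
    public string Name;
}

public class Category
{
    public int ID;
    public List<Product> Products = new List<Product>();
}

// Hypothetical in-memory "database" that counts the queries it serves.
public static class FakeDb
{
    public static int QueryCount;

    public static List<Category> LoadCategories()
    {
        QueryCount++;
        return new List<Category>
        {
            new Category { ID = 1 }, new Category { ID = 2 }, new Category { ID = 3 }
        };
    }

    public static List<Product> LoadAllProducts()
    {
        QueryCount++;
        return new List<Product>
        {
            new Product { CategoryID = 1, Name = "Product A" },
            new Product { CategoryID = 1, Name = "Product B" },
            new Product { CategoryID = 2, Name = "Product X" },
        };
    }
}

public static class EagerLoadingDemo
{
    // Loads every category with its products already attached,
    // using one query per table instead of one query per category.
    public static List<Category> LoadCategoriesWithProducts()
    {
        List<Category> categories = FakeDb.LoadCategories();          // hit 1
        ILookup<int, Product> productsByCategory =
            FakeDb.LoadAllProducts().ToLookup(p => p.CategoryID);     // hit 2

        foreach (Category category in categories)
        {
            // A lookup returns an empty sequence for a key it does not contain.
            category.Products = productsByCategory[category.ID].ToList();
        }
        return categories;
    }
}
```

However many categories the fake data holds, the query counter ends at two, because the products are matched to their categories in memory.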

Now that you know what eager loading is, you can go up and check how the Entity Framework does that.

 

Comments

# Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading) - Guru Stop

Wednesday, April 01, 2009 7:06 PM by DotNetShoutout

Thank you for submitting this cool story - Trackback from DotNetShoutout

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Wednesday, April 01, 2009 8:30 PM by Drew Marsh

This is nice, but I can't imagine it's worth the performance trade-off at runtime (especially for multi-step properties) in a high perf service environment. Have you thought about introducing caching so you don't have to process the expression tree every single time?

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Thursday, April 02, 2009 1:45 AM by Mohamed Meligy

Well, the days when reflection was considered too expensive to use are a bit far now! Consider the very wide use of IoC containers out there.

Regarding the levels, in most cases you'll be having 1 to 4 levels, so, that shouldn't be a problem.

But I agree, in a complete project code, you'll most likely be caching this either varying by expression level or by entire resultant ObjectQuery. I did MEAN not to include the code for this as to keep the article focused on its sole purpose, especially that there are many articles on caching expressions out there.

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Thursday, April 02, 2009 6:36 AM by Rokey G

Thanks for the article.

One thing I'm not clear on: how would the eager loading example generate only 2 database hits? For every category, the constructor queries DB.Product for a set of products. Doesn't this generate (N categories) calls as well?

Or are you talking about using EF to do DB.Product.Select(), then perform Linq on the local IEnumerable<Products> instance?

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Thursday, April 02, 2009 7:25 AM by Mohamed Meligy

@Rokey

No, I do not do it manually. Entity Framework does it for me. Most good ORMs can do that.

The way it's implemented is usually either by generating {SELECT...WHERE ... IN ()} sub queries or through LEFT JOIN queries. If you use SQL Profiler while executing the query, you'll find that it does the loading by using LEFT OUTER JOIN. Entity Framework later extracts the objects of different entity types from the returned result set.

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Thursday, April 02, 2009 7:31 AM by Mohamed Meligy

For more information on how eager loading is implemented in ORMs, check these great posts of Frans Bouma:

weblogs.asp.net/.../linq-to-llblgen-pro-feature-highlights-part-2.aspx

weblogs.asp.net/.../developing-linq-to-llblgen-pro-part-14.aspx

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Thursday, April 02, 2009 10:19 AM by Mohamed Meligy

For more information on how the Entity Framework does Eager Loading and performance difference in numbers, check : www.thedatafarm.com/.../TheCostOfEagerLoadingInEntityFramework.aspx

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Friday, April 03, 2009 11:01 AM by mosessaur

The way EF provides eager loading is not so good. I prefer the way LINQ to SQL does it.

I wish to find a way to define my loading options on the fly. For example, now I want a category with its products, but in some other call I want just the category without products. For instance, if I am retrieving a collection of categories, I want them loaded without the related products of each one. But when retrieving a single category, I want it to be loaded with its related Products.

And as we don't just use EF directly from top layers like presentation or services etc., defining such a thing might not be easy to do, especially when you want to support different ORMs or different data access layer implementations.

Anyway, very nice post Muhammad, and cool idea.

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Friday, April 03, 2009 3:40 PM by Rokey

@Mohamed Meligy

Doesn't

   _Products = (from products in DB.Products
                where products.CategoryID == this.ID
                select products)
               .ToList();

return List<Product> for each category in the second example?

Or you mean the EF ORM will actually intelligently determine it will load all the products and cache it somewhere?

Cheers!

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Friday, April 03, 2009 4:00 PM by Rokey

@Mohamed Meligy

Ahhh I see the 'imagining' bit... Ok I get what you mean 'ORM does this when eager loading' Now.

@mosessaur

Completely agree. It's a nightmare to conditionally eager load different relations depending on business logic needs. I ended up creating multiple/overloaded functions in my db repository just to give back the same entity with different parts loaded!

Does anyone actually have better ways to manage eager loading?

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Friday, April 03, 2009 4:30 PM by Mohamed Meligy

@ mosessaur:

The EF does exactly what you want :). See my following reply @Rokey for details.

Regarding the separation, the major problem with EF here is supporting POCO classes (See en.wikipedia.org/.../Poco ). Other than that, no problem. You should be sending the types you want to eager load from the Services layer down to the Repository layer as lambda expressions, as I've shown above. It IS the Repository's responsibility to call the ORM-specific "Include" method, as it's the Repository's responsibility to wrap the data access logic. So, everything is now in place :).

Regarding POCO class support issue (said to be included in next version of EF), check the "Persistence Ignorance (POCO) Adapter for Entity Framework " at code.msdn.microsoft.com/EFPocoAdapter

@Rokey

To have EF eager load, you'll need to change the query to be:

_Products = (from products in DB.Products.Include("Categories")
               where products.CategoryID == this.ID
               select products)
              .ToList();

Or if you use my code above:

_Products = (from products in DB.Products.Include( product=> product.Category)
               where products.CategoryID == this.ID
               select products)
              .ToList();

So, while LinqToSql required defining this per DataContext, Entity Framework defines it per query ;).

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Friday, April 03, 2009 5:06 PM by Mohamed Meligy

@Rokey

BTW, sometimes eager loading (or aggregate roots in general) is not the right answer. Sometimes you'd better use database Views and ViewModel classes. For example, I'd use Aggregate Roots for a Category listing that also shows products, and I'd use them for editing pages as well. However, I'd not use them for a Product page that shows the Category name in it; there I'd use Views instead, since that is purely a reporting purpose.

But anyway, as mentioned, usually your Aggregate Root (Eager Load) should be specified at the Service level and passed to the Repository, which encapsulates how the DB does it.

Sometimes in practical real situations you need to have a static property or so that saves a certain path you use in many places or even always use with specific entity (like Address with Person), and maybe a constructor or a static factory that always injects the code that sets the loading.

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Saturday, April 04, 2009 3:27 PM by mosessaur

As I understand it, the method you provide here in this article is just an enhanced way of doing the existing EF eager loading: instead of using literal strings, use lambda expressions.

My take on EF deferred loading is that we have to call Load explicitly and maybe check (IsLoaded) first. Also, defining that I want eager loading per query is a good idea, but defining it at the Context level is also a good idea, because when you start working with a Context you probably know exactly what you want to do with it.

And in compiled queries, defining eager loading per query might not be a good choice, because it might cause some limitations, like having to create 2 queries, one that includes eager loading and one without. Defining your eager loading criteria at the Context level might work in that case, but I didn't try it.

Not sure how far you went into EF and the ability to build testable code with it, but EF in its current version has many limitations in different aspects.

But it is interesting tool and I like it

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Sunday, April 05, 2009 10:58 PM by Art

Hi,

I'm not sure that multiple levels loading works in your example, you say: "For things like: Order.Customer.Address (multiple levels), you’ll have to write code like:

DbDataContext.Orders.Include( order => order.Customer ).Include( customer => Customer.Address );"

Obviously it's not going to work because Include( order => order.Customer ) return ObjectQuery<Order>, so following .Include( customer => Customer.Address ) can not be applied.

I have seen a few solutions to solve problem you are trying to solve, you may want to have a look into this for example: msmvps.com/.../entity-framework-include-with-func-next.aspx

Personally I use a solution where within single lambda expression any loading path can be expressed, like: context.Orders.Include( order => order.Customer.Address.Country... ) or context.Orders.Include( order => order.Customer.Orders.Single().OrderItems... ) - as you see payment for getting rid of being verbose is improper Single() method usage (to go from collection to single item).

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Monday, April 06, 2009 3:21 AM by Mohamed Meligy

Although I cannot see exactly the obvious non working part (as long as the extension parameter type and return type are ObjectQuery<T>), the "Single()" hack and similar were things I thought of for this better syntax but actually they didn't look the best for my convenience. But I'm glad this one exists anyway, as it's pretty elegant.

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Monday, April 06, 2009 6:34 PM by Art

Hi,

Try your multi level example - "DbDataContext.Orders.Include( order => order.Customer ).Include( customer => Customer.Address );" and you'll see what I mean - it shouldn't even compile.

Issue is: when you say Orders.Include(...) it returns ObjectQuery<Order>. Then you are trying to do .Include( customer =>...) - it is not possible, because .Include is applied to ObjectQuery<Order>, so type for your second .Include( customer =>...) won't be Customer - it will be Order.

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Friday, April 10, 2009 8:44 AM by J Khan

Thnx M, I picked up your post in Roger Jennings' blog

oakleafblog.blogspot.com/.../linq-and-entity-framework-posts-for.html

Recently Omar al Zabir has made upgrades to his Dropthings project

Do you have any suggestions for the migration to EF? (LINQ to Entities)

msmvps.com/.../web-2-0-ajax-portal-using-jquery-asp-net-3-5-silverlight-linq-to-sql-wf-and-unity.aspx

Re: Omar...

Secondly, Linq to SQL queries are replaced with Compiled Queries. Dropthings did not survive a load test when regular lambda expressions were used to query database. I could only reach up to 12 Req/Sec using 20 concurrent users without burning up web server CPU on a Quad Core DELL server.

# What about DataLoadOptions for Entity Framework ObjectContext?

Monday, June 15, 2009 1:28 PM by VS2010学习

What about DataLoadOptions for Entity Framework ObjectContext? I think the proper way to start this post

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Monday, August 31, 2009 7:02 AM by Mark

Hi,

Does your interesting solution support multilevel loading?

Can you provide an example?

Regards

# re: Typed Eager Loading Using Entity Framework (& What is Eager Loading vs Deferred Loading)

Saturday, September 05, 2009 11:43 AM by Mark

This code doesn't support custom names for relations ("include paths"), i.e. if  I set a name for a navigation property via EDM designer, the extension fails.
