Associations in EF Code First: Part 4 – Table Splitting

Sunday, April 24, 2011

This is the fourth post in a series that explains entity association mappings with EF Code First. This series includes:

Part 1 – Introduction and Basic Concepts
Part 2 – Complex Types
Part 3 – Shared Primary Key Associations
Part 4 – Table Splitting
Part 5 – One-to-One Foreign Key Associations
Part 6 – Many-valued Associations

In the second part of this series we saw how to map a special kind of one-to-one association—a composition with complex types. We argued that this is usually the simplest way to represent one-to-one relationships which comes with some limitations. We addressed the first limitation (shared references) by introducing shared primary key associations in the previous blog post. In today’s blog post we are going to address the third limitation of the complex types by learning about Table Splitting as yet another way to map a one-to-one association.

The Motivation Behind this Mapping: A Complex Type That Can be Lazy Loaded

A shared primary key association does not expose us to the third limitation of the complex types regarding Lazy Loading, we can of course lazy/defer load the Address information of a given user but at the same time, it does not give us the same SQL schema as the complex type mapping. After all, it adds a new table for the Address entity to the schema while mapping the Address with a complex type stores the address information in the Users table. So the question still remains there: How can we keep everything (e.g. User and Address) in one single table yet be able to lazy load the complex type part (Address) after reading the principal entity (User)? In other words, how can we have lazy loading with a complex type?

Splitting a Single Table into Multiple Entities

Table splitting (a.k.a. horizontal splitting) enables us to map a single table to multiple entities. This is particularly useful for scenarios that we have a table with many columns where some of those columns might not be needed as frequently as others or some of the columns are expensive to load (e.g. a column with a binary data type).

An Example From the Northwind Database

Unlike the other parts of this series, where we start with an object model and then derive a SQL schema afterwards, in this post we are going to do the reverse, for a reason that you'll see, we will start with an existing schema and will try to create an object model that matches the schema. For that we are going to use the Employees table from the Northwind database. You can download and install Northwind database from here If you don't have it already installed on your SQL Server. The following shows the Employees table from the Northwind database that we are going to use:

As you can see, this table has a Photo column of image type which makes it a good candidate to be lazy loaded each time we read an Employee from this table.

The Object Model

As the following object model shows, I created two entities: Employee and EmployeePhoto. I also created a unidirectional association between these two by defining a navigation property on the Employee class called EmployeePhoto:

public class Employee
{
    public int EmployeeID { get; set; }
    public string LastName { get; set; }
    public string FirstName { get; set; }
    public string Title { get; set; }
    public string TitleOfCourtesy { get; set; }
    public DateTime? BirthDate { get; set; }
    public DateTime? HireDate { get; set; }
    public string Address { get; set; }
    public string City { get; set; }
    public string Region { get; set; }
    public string PostalCode { get; set; }
    public string Country { get; set; }
    public string HomePhone { get; set; }
    public string Extension { get; set; }        
    public string Notes { get; set; }
    public int? ReportsTo { get; set; }        
 
    public virtual EmployeePhoto EmployeePhoto { get; set; }
}
 
public class EmployeePhoto
{
    [Key]
    public int EmployeeID { get; set; }
    public byte[] Photo { get; set; }
    public string PhotoPath { get; set; }
}
 
public class NorthwindContext : DbContext
{        
    public DbSet<Employee> Employees { get; set; }
    public DbSet<EmployeePhoto> EmployeePhoto { get; set; }     
}

How to Create a Table Splitting with Fluent API?

As also mentioned in the previous post, by convention, Code First always takes a unidirectional association as one-to-many unless we specify otherwise with fluent API. However, the fluent API codes that we have seen so far in this series won't let us create a table splitting. If we mark EmployeePhoto class as a complex type, we wouldn't be able to lazy load it anymore or if we create a shared primary key association then it will look for a separate table for the EmployeePhoto entity which we don't have in the Northwind database. The trick is to create a shared primary key association between Employee and EmployeePhoto entities but then instruct Code First to map them both to the same table. The following code shows how:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Entity<Employee>()
                .HasRequired(e => e.EmployeePhoto)
                .WithRequiredPrincipal();
        
    modelBuilder.Entity<Employee>().ToTable("Employees");
    modelBuilder.Entity<EmployeePhoto>().ToTable("Employees");
}

Note how we made both ends of the association required by using HasRequired and WithRequiredPrincipal methods, even though both the Photo and PhotoPath columns has been defined to allow NULLs.

See the Lazy Loading of the Dependent Entity in Action

Now it's time to write a test to make sure that EF does not select the Photo column each time we query for an employee:

using (var context = new NorthwindContext())
{
    Employee employee = context.Employees.First();
    byte[] photo = employee.EmployeePhoto.Photo;
}

The following screen shot from the SQL Profiler shows the query that has been submitted to SQL Server as the result of reading the first employee object:

Accessing the EmployeePhoto navigation property of the employee object on the next line causes EF to submit a second query to the SQL Server to lazy (implicit) load the EmployeePhoto (By default, EF fetches associated objects and collections lazily whenever you access them):

Where to Use this Mapping?

I recommend using Table Splitting only for mapping of the legacy databases, actually that's the reason we start this post from an existing database like Northwind. For green-field development scenarios consider using shared primary key association instead. There are several reasons why you may want to split the Employee table to two tables when designing a new physical data model for your application. In fact, it is very common for most applications to require a core collection of data attributes of any given entity, and then a specific subset of the noncore data attributes. For example, the core columns of the Employee table would include the columns required to store their name, address, and phone numbers; whereas noncore columns would include the Photo column. Because Employee.Photo is large, and required only by a few applications, you would want to consider splitting it off into its own table. This would help to improve retrieval access times for applications that select all columns from the Employee table yet do not require the photo. This also works pretty well for EF since it doesn't support lazy loading at the scalar property or complex type level.

Summary

In this post we learned about mapping a one-to-one association with table splitting. It enabled us to have lazy loading for the EmployeePhoto entity, something that we would have missed, had we mapped it with a complex type. We saw that on the database side it looks like a complex type mapping but on the object model it is not a complex type since we mapped EmployeePhoto as an Entity with an object identifier (EmployeeID). In fact, it's a special kind of a shared primary key association where both the principal and dependent entities are mapped to one single table. This somehow exotic one-to-one association mapping should be reserved only for the mapping of existing legacy databases.

Great post,
thanks

Vahid hassani - Tuesday, April 26, 2011 5:37:01 AM

I have used this method before but found it a bit annoying as there is no designer support.
I've ended up creating a view to return the columns I need. I then created an entity for the view.
It is not the best solutions as references are not created for the view entity, but at least I don't have to worry when I update my entities from database.

Assaf

Assaf S. - Saturday, May 7, 2011 2:52:04 AM

I fail to insert a new Employee with the above model and mapping.
My code is simple:
using (var context = new NorthwindContext()){    Employee e1 = new Employee();    e1.LastName = "Some last name";    e1.FirstName = "Some name";    e1.EmployeePhoto = new EmployeePhoto();    context.Employees.Add(e1);    context.SaveChanges();}
The Inner Exception message is "Cannot insert explicit value for identity column in table 'Employees' when IDENTITY_INSERT is set to OFF."
How to fix this?

LambdaCruiser - Monday, May 9, 2011 12:29:35 PM

I've found out that the code works if I let EF create a new database.

But I still get the exception when running the code against original Northwind DB. So it's some weird issue with the database I guess.

LambdaCruiser - Tuesday, May 10, 2011 10:52:22 AM

Thank you for a very detailed answer.
Indeed, EF infers that the EmployeeId is _not_ an Identity in this mapping. I think the answer to this behaviour is in Part 3 of this post series.
"a very important consequence of the principal/dependent decision for one-to-one associations: the dependent primary key will become non-Identity by default."
I changed part of the mapping from:
modelBuilder.Entity<Employee>()            .HasRequired(e => e.EmployeePhoto)            .WithRequiredDependent();
to
modelBuilder.Entity<Employee>()            .HasRequired(e => e.EmployeePhoto)            .WithRequiredPrincipal();Now the inserts seem to work as expected, the Id is treated like an Identity.

LambdaCruiser - Thursday, May 12, 2011 10:57:23 AM

Thanks for this great post. Very detailed and guide. I just subscribe to your RSS> Looking forward for more. Cheers!!!

richardbhong - Monday, August 1, 2011 5:41:57 AM

how can we use same properties in different models?

deniz güney - Monday, August 8, 2011 9:19:39 PM

Thanks for the post.
Is there a way to set the key in EmployeePhoto using the Fluent API instead of the DataAnnotations [Key] attribute?

Thanks

Mark Phillips - Monday, September 19, 2011 11:57:12 PM

@Mark Phillips: Yes, fluent API offers HasKey() method to configure the primary key property(s) for an entity like the following code:

modelBuilder.Entity<EmployeePhoto>().HasKey(e => e.EmployeeID);

mortezam - Wednesday, September 21, 2011 2:44:53 AM

How about the EmployeePhoto is optional for Employee? I have try HasOptional().WithRequired(), but it seems not works. Could you help to instruct me how to implement this case.

ddredstar@hotmail.com - Friday, October 28, 2011 5:47:50 AM

@ddredstar@hotmail.com: Because you are mapping both entities to the same table, EF expects you to configure a required one-to-one relationship between them and the fluent API code that I showed is the only way to achieve Table Splitting in code first be it required or optional. That said, you can still workaround this limitation by allowing NULL in the related entity columns (like I did for EmployeePhoto.Photo and EmployeePhoto.PhotoPath). Then when working with the object model, always use NULLs to indicate an optional association. For example, the following code shows how you can save an Employee without a Photo:
Employee employee = new Employee()
{
    EmployeePhoto = new EmployeePhoto()
};               
context.Employees.Add(employee);
context.SaveChanges();

mortezam - Friday, November 4, 2011 2:11:51 AM

Is this method the same as TPT? I am so confused. This EF thing is so big and so complicated :(

maximusmd - Tuesday, November 15, 2011 10:01:57 PM

I tried implementing this pattern to separate a large table into multiple discrete types but it did not work. I received an error "The Entity types 'CampaginFeedback' and 'CampaignSurvey' cannot share table 'Campaign' because they are not int he same type hierarchy or do not have a valid one to one foreign key relationship with matching primary keys between them."

My classes look like this (simplified here):

public class Campaign {
[Key]
public int CampaignId {get;set;}
public string Name {get;set;}
}

public class CampaignSurvey {
[Key]
public int CampaignId {get;set;}
public string Question {get;set;}
public string Answer {get;set;}
}

public class CampaignFeedback {
[Key]
public int CampaignId {get;set;}
public string Feedback {get;set;}
}

What am I doing wrong?

parleer - Tuesday, December 13, 2011 3:12:03 PM

I'd like to know if is it possible to map more then two entities to the same table. In your exemple are mapped Employee and EmployeePhoto to the table Employees. It is possible this exemple have plus one entity, like Address, mapped to fields address, city, region, postalcode and country?
I have done some tests, and when I map Address I get an error. When I delete the Address entity, it works with Employee and EmployeePhoto entities, when I delete the EmployeePhoto entity, it works with Employee and Address entities. So there are no erros with the mapping, Seems like this approach do not support more then two entities mapped to the same table.
If you know a solution for this problem let me know. Thanks.

Renan - Saturday, March 24, 2012 3:08:42 PM

Why is the EmployeePhoto property virtual? Is this necessary? I don't see where it has been employed.

missingLINQ - Thursday, August 16, 2012 4:07:18 PM