Inheritance with EF Code First: Part 2 – Table per Type (TPT)
In the previous
blog post
you saw that there are three different approaches to
representing an inheritance hierarchy and I explained
Table per Hierarchy (TPH) as the default mapping
strategy in EF Code First. We argued that the
disadvantages of TPH may be too serious for our design
since it results in denormalized schemas that can
become a major burden in the long run. In today’s blog
post we are going to learn about Table per Type (TPT)
as another inheritance mapping strategy and we'll see
that TPT doesn’t expose us to this problem.
Table per Type (TPT)Table per Type is about representing inheritance relationships as relational foreign key associations. Every class/subclass that declares persistent properties—including abstract classes—has its own table. The table for subclasses contains columns only for each noninherited property (each property declared by the subclass itself) along with a primary key that is also a foreign key of the base class table. This approach is shown in the following figure: |
|
|
For example, if an instance of the CreditCard subclass
is made persistent, the values of properties declared
by the BillingDetail base class are persisted to a new
row of the BillingDetails table. Only the values of
properties declared by the subclass (i.e. CreditCard)
are persisted to a new row of the CreditCards table.
The two rows are linked together by their
shared primary key value. Later, the subclass instance may be retrieved from
the database by joining the subclass table with the
base class table.
TPT AdvantagesThe primary advantage of this strategy is that the SQL schema is normalized. In addition, schema evolution is straightforward (modifying the base class or adding a new subclass is just a matter of modify/add one table). Integrity constraint definition are also straightforward (note how CardType in CreditCards table is now a non-nullable column).Implement TPT in EF Code FirstWe can create a TPT mapping simply by placing Table attribute on the subclasses to specify the mapped table name (Table attribute is a new data annotation and has been added to System.ComponentModel.DataAnnotations namespace in CTP5): |
public abstract class BillingDetail { public int BillingDetailId { get; set; } public string Owner { get; set; } public string Number { get; set; } } [Table("BankAccounts")] public class BankAccount : BillingDetail { public string BankName { get; set; } public string Swift { get; set; } } [Table("CreditCards")] public class CreditCard : BillingDetail { public int CardType { get; set; } public string ExpiryMonth { get; set; } public string ExpiryYear { get; set; } } public class InheritanceMappingContext : DbContext { public DbSet<BillingDetail> BillingDetails { get; set; } } |
| If you prefer fluent API, then you can create a TPT mapping by using ToTable() method: |
protected override void OnModelCreating(DbModelBuilder modelBuilder) { modelBuilder.Entity<BankAccount>().ToTable("BankAccounts"); modelBuilder.Entity<CreditCard>().ToTable("CreditCards"); } |
Polymorphic AssociationsA polymorphic association is an association to a base class, hence to all classes in the hierarchy with dynamic resolution of the concrete class at runtime. For example, consider the BillingInfo property of User in the following domain model. It references one particular BillingDetail object, which at runtime can be any concrete instance of that class. |
|
In fact, because BillingDetail is abstract, the
association must refer to an instance of one of its
subclasses only—CreditCard or BankAccount—at runtime.
Implement Polymorphic Associations with EF Code FirstWe don’t have to do anything special to enable polymorphic associations in EF Code First; The user needs a unidirectional association to some BillingDetails, which can be CreditCard or BankAccount so we just create this association and it would be naturally polymorphic: |
public class User { public int UserId { get; set; } public string FirstName { get; set; } public string LastName { get; set; } public int BillingDetailId { get; set; } public virtual BillingDetail BillingInfo { get; set; } } |
|
In other words, as you can see above, a polymorphic
association is an association that may refer instances
of a subclass of the class that was explicitly
specified as the type of the navigation property (e.g.
User.BillingInfo). The following code demonstrates the creation of an association to an instance of the CreditCard subclass: |
using (var context = new InheritanceMappingContext()) { CreditCard creditCard = new CreditCard() { Number = "987654321", CardType = 1 }; User user = new User() { UserId = 1, BillingInfo = creditCard }; context.Users.Add(user); context.SaveChanges(); } |
| Now, if we navigate the association in a second context, EF Code First automatically retrieves the CreditCard instance: |
using (var context = new InheritanceMappingContext()) { User user = context.Users.Find(1); Debug.Assert(user.BillingInfo is CreditCard); } |
Polymorphic Associations with TPTAnother important advantage of TPT is the ability to handle polymorphic associations. In the database a polymorphic association to a particular base class will be represented as a foreign key referencing the table of that particular base class. (e.g. Users table has a foreign key that references BillingDetails table.) |
Generated SQL For QueriesLet’s take an example of a simple non-polymorphic query that returns a list of all the BankAccounts: |
var query = from b in context.BillingDetails.OfType<BankAccount>() select b; |
| Executing this query (by invoking ToList() method) results in the following SQL statements being sent to the database (on the bottom, you can also see the result of executing the generated query in SQL Server Management Studio): |
|
| Now, let’s take an example of a very simple polymorphic query that requests all the BillingDetails which includes both BankAccount and CreditCard types: |
var query = from b in context.BillingDetails select b; |
| This LINQ query seems even more simple than the previous one but the resulting SQL query is not as simple as you might expect: |
|
As you can see, EF Code First relies on an
INNER JOIN
to detect the existence (or absence) of rows in the
subclass tables CreditCards and BankAccounts so it can
determine the concrete subclass for a particular row
of the BillingDetails table. Also the SQL
CASE
statements that you see in the beginning of the query
is just to ensure columns that are irrelevant for a
particular row have NULL values in the returning
flattened table. (e.g. BankName for a row that
represents a CreditCard type)
TPT ConsiderationsEven though this mapping strategy is deceptively simple, the experience shows that performance can be unacceptable for complex class hierarchies because queries always require a join across many tables. In addition, this mapping strategy is more difficult to implement by hand— even ad-hoc reporting is more complex. This is an important consideration if you plan to use handwritten SQL in your application (For ad hoc reporting, database views provide a way to offset the complexity of the TPT strategy. A view may be used to transform the table-per-type model into the much simpler table-per-hierarchy model.) |
SummaryIn this post we learned about Table per Type as the second inheritance mapping in our series. So far, the strategies we’ve discussed require extra consideration with regard to the SQL schema (e.g. in TPT, foreign keys are needed). This situation changes with the Table per Concrete Type (TPC) that we will discuss in the next post.References |