Archives

Archives / 2010 / December
  • Inheritance with EF Code First: Part 2 – Table per Type (TPT)

    In the previous blog post you saw that there are three different approaches to representing an inheritance hierarchy and I explained Table per Hierarchy (TPH) as the default mapping strategy in EF Code First. We argued that the disadvantages of TPH may be too serious for our design since it results in denormalized schemas that can become a major burden in the long run. In today’s blog post we are going to learn about Table per Type (TPT) as another inheritance mapping strategy and we'll see that TPT doesn’t expose us to this problem.

    Table per Type (TPT)

    Table per Type represents inheritance relationships as relational foreign key associations. Every class or subclass that declares persistent properties, including abstract classes, has its own table. The table for a subclass contains columns only for its noninherited properties (the properties declared by the subclass itself), along with a primary key that is also a foreign key to the base class table. This approach is shown in the following figure:
    For example, if an instance of the CreditCard subclass is made persistent, the values of properties declared by the BillingDetail base class are persisted to a new row of the BillingDetails table. Only the values of properties declared by the subclass (i.e. CreditCard) are persisted to a new row of the CreditCards table. The two rows are linked together by their shared primary key value. Later, the subclass instance may be retrieved from the database by joining the subclass table with the base class table.

    TPT Advantages

    The primary advantage of this strategy is that the SQL schema is normalized. In addition, schema evolution is straightforward (modifying the base class or adding a new subclass is just a matter of modifying or adding one table). Integrity constraint definitions are also straightforward (note how CardType in the CreditCards table is now a non-nullable column).

    Implement TPT in EF Code First

    We can create a TPT mapping simply by placing the Table attribute on the subclasses to specify the mapped table name (the Table attribute is a new data annotation that was added to the System.ComponentModel.DataAnnotations namespace in CTP5):
    public abstract class BillingDetail
    {
        public int BillingDetailId { get; set; }
        public string Owner { get; set; }
        public string Number { get; set; }
    }

    [Table("BankAccounts")]
    public class BankAccount : BillingDetail
    {
        public string BankName { get; set; }
        public string Swift { get; set; }
    }

    [Table("CreditCards")]
    public class CreditCard : BillingDetail
    {
        public int CardType { get; set; }
        public string ExpiryMonth { get; set; }
        public string ExpiryYear { get; set; }
    }

    public class InheritanceMappingContext : DbContext
    {
        public DbSet<BillingDetail> BillingDetails { get; set; }
    }
    If you prefer the fluent API, you can create a TPT mapping by using the ToTable() method:
    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<BankAccount>().ToTable("BankAccounts");
        modelBuilder.Entity<CreditCard>().ToTable("CreditCards");
    }
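
    As a minimal sketch of what this mapping means at runtime, the snippet below persists one CreditCard through the base-class set. It assumes the classes and context shown above; the class name TptInsertSketch and all property values are made up for illustration.

    ```csharp
    // Assumes BillingDetail, CreditCard and InheritanceMappingContext as
    // defined above. TptInsertSketch and the values are hypothetical.
    public static class TptInsertSketch
    {
        public static void Run()
        {
            using (var context = new InheritanceMappingContext())
            {
                var creditCard = new CreditCard
                {
                    Owner       = "Jane Doe",
                    Number      = "987654321",
                    CardType    = 1,
                    ExpiryMonth = "12",
                    ExpiryYear  = "2015"
                };

                // The instance is added through the base-class set;
                // its concrete type is still CreditCard.
                context.BillingDetails.Add(creditCard);

                // With TPT, the inherited values (Owner, Number) go to a new
                // BillingDetails row, and the CreditCard-specific values go to
                // a CreditCards row that shares the same primary key value.
                context.SaveChanges();
            }
        }
    }
    ```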

    Polymorphic Associations

    A polymorphic association is an association to a base class, hence to all classes in the hierarchy with dynamic resolution of the concrete class at runtime. For example, consider the BillingInfo property of User in the following domain model. It references one particular BillingDetail object, which at runtime can be any concrete instance of that class.
    In fact, because BillingDetail is abstract, the association must refer to an instance of one of its subclasses only—CreditCard or BankAccount—at runtime.

    Implement Polymorphic Associations with EF Code First

    We don’t have to do anything special to enable polymorphic associations in EF Code First. The user needs a unidirectional association to some BillingDetail, which can be a CreditCard or a BankAccount, so we just create this association and it is naturally polymorphic:
    public class User
    {
        public int UserId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int BillingDetailId { get; set; }

        public virtual BillingDetail BillingInfo { get; set; }
    }
    In other words, as you can see above, a polymorphic association is an association that may refer to instances of a subclass of the class that was explicitly specified as the type of the navigation property (e.g. User.BillingInfo).
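
    The snippets in this section call context.Users, which the InheritanceMappingContext shown earlier does not declare. Presumably the context was extended along the following lines; this is an assumption, not something the post shows:

    ```csharp
    using System.Data.Entity;

    // Assumption: the context from the TPT section, extended with a Users set
    // so that context.Users is available. User and BillingDetail are the
    // classes defined above.
    public class InheritanceMappingContext : DbContext
    {
        public DbSet<User> Users { get; set; }
        public DbSet<BillingDetail> BillingDetails { get; set; }
    }
    ```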

    The following code demonstrates the creation of an association to an instance of the CreditCard subclass:
    using (var context = new InheritanceMappingContext())
    {
        CreditCard creditCard = new CreditCard()
        {                    
            Number   = "987654321",
            CardType = 1
        };                
        User user = new User()
        {
            UserId      = 1,    
            BillingInfo = creditCard
        }; 
        context.Users.Add(user);
        context.SaveChanges();
    }
    Now, if we navigate the association in a second context, EF Code First automatically retrieves the CreditCard instance:
    using (var context = new InheritanceMappingContext())
    {
        User user = context.Users.Find(1);
        Debug.Assert(user.BillingInfo is CreditCard);
    }

    Polymorphic Associations with TPT

    Another important advantage of TPT is the ability to handle polymorphic associations. In the database a polymorphic association to a particular base class will be represented as a foreign key referencing the table of that particular base class. (e.g. Users table has a foreign key that references BillingDetails table.)

    Generated SQL For Queries 

    Let’s take an example of a simple non-polymorphic query that returns a list of all the BankAccounts:
    var query = from b in context.BillingDetails.OfType<BankAccount>() select b;
    Executing this query (by invoking ToList() method) results in the following SQL statements being sent to the database (on the bottom, you can also see the result of executing the generated query in SQL Server Management Studio):
    Now, let’s take an example of a very simple polymorphic query that requests all the BillingDetails which includes both BankAccount and CreditCard types:
    var query = from b in context.BillingDetails select b;
    This LINQ query seems even more simple than the previous one but the resulting SQL query is not as simple as you might expect:
    As you can see, EF Code First relies on an INNER JOIN to detect the existence (or absence) of rows in the subclass tables CreditCards and BankAccounts so it can determine the concrete subclass for a particular row of the BillingDetails table. The SQL CASE expressions at the beginning of the query just ensure that columns that are irrelevant for a particular row have NULL values in the returned flattened table (e.g. BankName for a row that represents a CreditCard).

    TPT Considerations

    Even though this mapping strategy is deceptively simple, experience shows that performance can be unacceptable for complex class hierarchies because queries always require a join across many tables. In addition, this mapping strategy is more difficult to implement by hand; even ad hoc reporting is more complex. This is an important consideration if you plan to use handwritten SQL in your application. (For ad hoc reporting, database views provide a way to offset the complexity of the TPT strategy. A view may be used to transform the table-per-type model into the much simpler table-per-hierarchy model.)

    Summary

    In this post we learned about Table per Type as the second inheritance mapping in our series. So far, the strategies we’ve discussed require extra consideration with regard to the SQL schema (e.g. in TPT, foreign keys are needed). This situation changes with the Table per Concrete Type (TPC) that we will discuss in the next post.


  • Inheritance with EF Code First: Part 1 – Table per Hierarchy (TPH)

    A simple strategy for mapping classes to database tables might be “one table for every entity persistent class.” This approach sounds simple enough and, indeed, works well until we encounter inheritance. Inheritance is such a visible structural mismatch between the object-oriented and relational worlds because object-oriented systems model both “is a” and “has a” relationships. SQL-based models provide only "has a" relationships between entities; SQL database management systems don’t support type inheritance—and even when it’s available, it’s usually proprietary or incomplete.

    There are three different approaches to representing an inheritance hierarchy:
    • Table per Hierarchy (TPH): Enable polymorphism by denormalizing the SQL schema, and utilize a type discriminator column that holds type information.
    • Table per Type (TPT): Represent "is a" (inheritance) relationships as "has a" (foreign key) relationships.
    • Table per Concrete class (TPC): Discard polymorphism and inheritance relationships completely from the SQL schema.
    I will explain each of these strategies in a series of posts, and this one is dedicated to TPH. In this series we'll dig deeply into each strategy and learn why to choose it as well as how to implement it. Hopefully this will give you a better idea about which strategy to choose in a particular scenario.

    Inheritance Mapping with Entity Framework Code First

    All of the inheritance mapping strategies that we discuss in this series will be implemented with EF Code First CTP5. The CTP5 build of the new EF Code First library was released by the ADO.NET team earlier this month. EF Code First enables a pretty powerful code-centric development workflow for working with data. I’m a big fan of the EF Code First approach, and I’m pretty excited about the productivity and power that it brings. When it comes to inheritance mapping, Code First not only fully supports all the strategies but also gives you ultimate flexibility to work with domain models that involve inheritance. The fluent API for inheritance mapping has been improved a lot in CTP5 and is now more intuitive and concise compared to CTP4.

    A Note For Those Who Follow Other Entity Framework Approaches

    If you are following EF's "Database First" or "Model First" approaches, I still recommend reading this series: although the implementation is Code First specific, the explanations of each strategy apply equally to all approaches, be it Code First or others.

    A Note For Those Who are New to Entity Framework and Code-First

    If you choose to learn EF you've chosen well. If you choose to learn EF with Code First you've done even better. To get started, you can find a great walkthrough by Scott Guthrie here and another one by the ADO.NET team here. In this post, I assume you have already set up your machine to do Code First development and that you are familiar with Code First fundamentals and basic concepts. You might also want to check out my other posts on EF Code First, like Complex Types and Shared Primary Key Associations.

    A Top Down Development Scenario

    These posts take a top-down approach; they assume that you’re starting with a domain model and trying to derive a new SQL schema. Therefore, we start with an existing domain model, implement it in C# and then let Code First create the database schema for us. However, the mapping strategies described are just as relevant if you’re working bottom up, starting with existing database tables. I’ll show some tricks along the way that help you deal with nonperfect table layouts.

    The Domain Model

    In our domain model, we have a BillingDetail base class which is abstract (note the italic font on the UML class diagram below). We do allow various billing types and represent them as subclasses of BillingDetail class. As for now, we support CreditCard and BankAccount:

    Implement the Object Model with Code First

    As always, we start with the POCO classes. Note that in our DbContext, I only define one DbSet for the base class which is BillingDetail. Code First will find the other classes in the hierarchy based on Reachability Convention.
    public abstract class BillingDetail
    {
        public int BillingDetailId { get; set; }
        public string Owner { get; set; }
        public string Number { get; set; }
    }

    public class BankAccount : BillingDetail
    {
        public string BankName { get; set; }
        public string Swift { get; set; }
    }

    public class CreditCard : BillingDetail
    {
        public int CardType { get; set; }
        public string ExpiryMonth { get; set; }
        public string ExpiryYear { get; set; }
    }

    public class InheritanceMappingContext : DbContext
    {
        public DbSet<BillingDetail> BillingDetails { get; set; }
    }
    This object model is all that is needed to enable inheritance with Code First. If you put this in your application you would be able to immediately start working with the database and do CRUD operations. Before going into details about how EF Code First maps this object model to the database, we need to learn about one of the core concepts of inheritance mapping: polymorphic and non-polymorphic queries.

    Polymorphic Queries

    LINQ to Entities and EntitySQL, as object-oriented query languages, both support polymorphic queries—that is, queries for instances of a class and all instances of its subclasses, respectively. For example, consider the following query:
    IQueryable<BillingDetail> linqQuery = from b in context.BillingDetails select b;
    List<BillingDetail> billingDetails = linqQuery.ToList();
    Or the same query in EntitySQL:
    string eSqlQuery = @"SELECT VALUE b FROM BillingDetails AS b";
    ObjectContext objectContext = ((IObjectContextAdapter)context).ObjectContext;
    ObjectQuery<BillingDetail> objectQuery = objectContext.CreateQuery<BillingDetail>(eSqlQuery);
    List<BillingDetail> billingDetails = objectQuery.ToList();
    linqQuery and eSqlQuery are both polymorphic and return a list of objects of the type BillingDetail, which is an abstract class but the actual concrete objects in the list are of the subtypes of BillingDetail: CreditCard and BankAccount.

    Non-polymorphic Queries

    By default, LINQ to Entities and EntitySQL queries are polymorphic: they return not only instances of the specific entity class they refer to, but instances of all subclasses of that class as well. Non-polymorphic queries, on the other hand, are queries whose polymorphism is restricted so that they return only instances of a particular subclass. In LINQ to Entities, this can be specified by using the OfType<T>() method. For example, the following query returns only instances of BankAccount:
    IQueryable<BankAccount> query = from b in context.BillingDetails.OfType<BankAccount>() 
                                    select b;
    EntitySQL has an OFTYPE operator that does the same thing:
    string eSqlQuery = @"SELECT VALUE b FROM OFTYPE(BillingDetails, Model.BankAccount) AS b";
    In fact, the above query with the OFTYPE operator is a short form of the following query expression that uses the TREAT and IS OF operators:
    string eSqlQuery = @"SELECT VALUE TREAT(b AS Model.BankAccount) 
                         FROM BillingDetails AS b 
                         WHERE b IS OF(Model.BankAccount)";
    (Note that in the above query, Model.BankAccount is the fully qualified name of the BankAccount class. You need to replace "Model" with your own namespace name.)
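
    For comparison, LINQ to Entities also accepts the C# is operator as a type filter, which is the closest LINQ counterpart to IS OF. The sketch below assumes the classes and context defined above; TypeFilterSketch and BankAccountsOnly are hypothetical names used only for illustration.

    ```csharp
    using System.Linq;

    // Assumes BillingDetail, BankAccount and InheritanceMappingContext as
    // defined above. The class and method names are hypothetical.
    public static class TypeFilterSketch
    {
        public static IQueryable<BillingDetail> BankAccountsOnly(
            InheritanceMappingContext context)
        {
            // Restricts the rows to BankAccount instances, but the static
            // element type stays BillingDetail.
            return context.BillingDetails.Where(b => b is BankAccount);
        }
    }
    ```

    Unlike OfType<BankAccount>(), this keeps the result typed as BillingDetail, so OfType remains the more convenient choice when you need the subclass properties.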

    Table per Hierarchy (TPH)

    An entire class hierarchy can be mapped to a single table. This table includes columns for all properties of all classes in the hierarchy. The concrete subclass represented by a particular row is identified by the value of a type discriminator column. You don’t have to do anything special in Code First to enable TPH. It's the default inheritance mapping strategy:
    This mapping strategy is a winner in terms of both performance and simplicity. It’s the best-performing way to represent polymorphism—both polymorphic and nonpolymorphic queries perform well—and it’s even easy to implement by hand. Ad-hoc reporting is possible without complex joins or unions. Schema evolution is straightforward.

    Discriminator Column

    As you can see in the DB schema above, Code First has to add a special column to distinguish between persistent classes: the discriminator. This isn’t a property of the persistent class in our object model; it’s used internally by EF Code First. By default, the column name is "Discriminator", and its type is string. The values default to the persistent class names, in this case “BankAccount” or “CreditCard”. EF Code First automatically sets and retrieves the discriminator values.

    TPH Requires Properties in SubClasses to be Nullable in the Database

    TPH has one major problem: columns for properties declared by subclasses will be nullable in the database. For example, Code First created an (INT, NULL) column to map the CardType property of the CreditCard class. In a typical mapping scenario, Code First always creates an (INT, NOT NULL) column in the database for an int property of a persistent class. But in this case, since a BankAccount instance won’t have a CardType property, the CardType field must be NULL for that row, so Code First creates an (INT, NULL) column instead. If your subclasses each define several non-nullable properties, the loss of NOT NULL constraints may be a serious problem from the point of view of data integrity.

    TPH Violates the Third Normal Form

    Another important issue is normalization. We’ve created functional dependencies between nonkey columns, violating the third normal form. Basically, the value of Discriminator column determines the corresponding values of the columns that belong to the subclasses (e.g. BankName) but Discriminator is not part of the primary key for the table. As always, denormalization for performance can be misleading, because it sacrifices long-term stability, maintainability, and the integrity of data for immediate gains that may be also achieved by proper optimization of the SQL execution plans (in other words, ask your DBA).

    Generated SQL Query

    Let's take a look at the SQL statements that EF Code First sends to the database when we write queries in LINQ to Entities or EntitySQL. For example, the polymorphic query for BillingDetails that you saw generates the following SQL statement:
    SELECT 
    [Extent1].[Discriminator] AS [Discriminator], 
    [Extent1].[BillingDetailId] AS [BillingDetailId], 
    [Extent1].[Owner] AS [Owner], 
    [Extent1].[Number] AS [Number], 
    [Extent1].[BankName] AS [BankName], 
    [Extent1].[Swift] AS [Swift], 
    [Extent1].[CardType] AS [CardType], 
    [Extent1].[ExpiryMonth] AS [ExpiryMonth], 
    [Extent1].[ExpiryYear] AS [ExpiryYear]
    FROM [dbo].[BillingDetails] AS [Extent1]
    WHERE [Extent1].[Discriminator] IN ('BankAccount','CreditCard')
    Or the non-polymorphic query for the BankAccount subclass generates this SQL statement:
    SELECT 
    [Extent1].[BillingDetailId] AS [BillingDetailId], 
    [Extent1].[Owner] AS [Owner], 
    [Extent1].[Number] AS [Number], 
    [Extent1].[BankName] AS [BankName], 
    [Extent1].[Swift] AS [Swift]
    FROM [dbo].[BillingDetails] AS [Extent1]
    WHERE [Extent1].[Discriminator] = 'BankAccount'
    Note how Code First adds a restriction on the discriminator column and also how it only selects those columns that belong to BankAccount entity.

    Change Discriminator Column Data Type and Values With Fluent API 

    Sometimes, especially in legacy schemas, you need to override the conventions for the discriminator column so that Code First can work with the schema. The following fluent API code will change the discriminator column name to "BillingDetailType" and the values to "BA" and "CC" for BankAccount and CreditCard respectively:
    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<BillingDetail>()
                    .Map<BankAccount>(m => m.Requires("BillingDetailType").HasValue("BA"))
                    .Map<CreditCard>(m => m.Requires("BillingDetailType").HasValue("CC"));
    }
    Changing the data type of the discriminator column is also interesting. In the above code, we passed strings to the HasValue method, but this method is defined to accept a value of type object:
    public void HasValue(object value);
    Therefore, if we pass a value of type int to it, for example, Code First not only uses our desired values (i.e. 1 and 2) in the discriminator column but also changes the column type to (INT, NOT NULL):
    modelBuilder.Entity<BillingDetail>()
                .Map<BankAccount>(m => m.Requires("BillingDetailType").HasValue(1))
                .Map<CreditCard>(m => m.Requires("BillingDetailType").HasValue(2));

    Summary

    In this post we learned about Table per Hierarchy as the default mapping strategy in Code First. The disadvantages of the TPH strategy may be too serious for your design—after all, denormalized schemas can become a major burden in the long run. Your DBA may not like it at all. In the next post, we will learn about Table per Type (TPT) strategy that doesn’t expose you to this problem.


  • Associations in EF Code First CTP5: Part 2 – Shared Primary Key Associations

    In the previous blog post, the first in a series about entity association mapping with EF Code First, I demonstrated how to map a special kind of one-to-one association: a composition with complex types. We argued that the relationship between User and Address is best represented with a complex type mapping, and we saw that this is usually the simplest way to represent one-to-one relationships but comes with some limitations.

    In today’s blog post I’m going to discuss how we can address those limitations by changing our mapping strategy. This is particularly useful for scenarios where we want a dedicated table for Address, so that we can map both User and Address as entities. One benefit of this model is the possibility of shared references: another entity class (let’s say Shipment) can also have a reference to a particular Address instance. If a User has a reference to this instance as her BillingAddress, the Address instance has to support shared references and needs its own identity. In this case, the User and Address classes have a true one-to-one association.

    Introducing the Revised Model

    In this revised version, each User could have one BillingAddress (Billing Association). Also Shipment has to be delivered to an address so it always has one Delivery Address (Delivery Association). Here is the class diagram for this domain model (note the multiplicities on association lines):
    In this model we assumed that the billing address of the user is the same as her delivery address. Now let’s create the association mappings for this domain model. There are several choices, the first being a One-to-One Primary Key Association.

    Shared Primary Key Associations

    Also known as One-to-One Primary Key Associations, this means that two related tables share the same primary key values. The primary key of one table is also a foreign key of the other. Let’s see how we map primary key associations with Code First.

    How to Implement a One-to-One Primary Key Association with Code First

    First, we start with the POCO classes. As you can see, we've defined BillingAddress as a navigation property on the User class and another one named DeliveryAddress on the Shipment class. Both associations are unidirectional since we didn't define corresponding navigation properties for User and Shipment on the Address class.
    public class User
    {
        public int UserId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public virtual Address BillingAddress { get; set; }
    }

    public class Address
    {
        public int AddressId { get; set; }
        public string Street { get; set; }
        public string City { get; set; }
        public string PostalCode { get; set; }
    }

    public class Shipment
    {
        public int ShipmentId { get; set; }
        public DateTime CreatedOn { get; set; }
        public string State { get; set; }
        public virtual Address DeliveryAddress { get; set; }
    }

    public class EntityMappingContext : DbContext
    {
        public DbSet<User> Users { get; set; }
        public DbSet<Address> Addresses { get; set; }
        public DbSet<Shipment> Shipments { get; set; }
    }

    How Code First Reads This Object Model: One-to-Many

    Code First reads the model and tries to figure out the multiplicity of the associations. Since the associations are unidirectional, Code First takes this as if one Address has many Users and many Shipments and will create a one-to-many association for each of them. So what we were hoping for, a one-to-one association, is not in line with the conventions.

    How to Change the Multiplicity to One-to-One by Using the Conventions

    One way to turn our associations to be one-to-one is by making them bidirectional. That is, adding a new navigation property to Address class of type User and another one of type Shipment. By doing that we basically signal Code First that we are looking to have one-to-one associations since for example User has an Address and also Address has a User. Based on the conventions, Code First will change the multiplicity to one-to-one and this will solve the problem.

    Should We Make This Association Bidirectional?

    As always, the decision is up to us and depends on whether we need to navigate through our objects in that direction in the application code. In this case, we’d probably conclude that the bidirectional association doesn’t make much sense. If we call anAddress.User, we are saying “give me the user who has this address”, not a very reasonable request. So this is not a good option. Instead we'll keep our object model as it is and will resort to fluent API.

    How to Change the Multiplicity to One-to-One with Fluent API

    The following code is all that is needed to make the associations one-to-one. Note how the multiplicities in the UML class diagram (e.g. 1 on User and 0..1 on Address) have been translated to the fluent API code by using the HasRequired and HasOptional methods:
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<User>().HasOptional(u => u.BillingAddress)
                                   .WithRequired();
        
        modelBuilder.Entity<Shipment>().HasRequired(u => u.DeliveryAddress)
                                       .WithOptional();
    }
    It is also worth mentioning that in CTP5, when we map a one-to-one association with the fluent API, we don't need to specify the foreign key as we would with the HasForeignKey method when mapping a one-to-many association. Since EF only supports one-to-one primary key associations, it automatically creates the relationship in the database based on the primary keys, and we don't need to state the obvious as we did in CTP4.

    Database Schema

    The mapping result for our object model is as follows (note the Identity column):

    Referential Integrity

    In relational database design, the referential integrity rule states that each non-null value of a foreign key must match the value of some primary key. But wait, how does it even apply here? All we have is three primary keys referencing each other. Which is the primary key and which is the foreign key? The best way to answer this question is to take a look at the properties of the relationships in the database that Code First has created:
    As you can see, Code First adds a foreign key constraint which links the primary key of the Addresses table to the primary key of the Users table and adds another foreign key constraint that links the primary key of the Shipments table to the primary key of the Addresses table. The foreign key constraint means that a user has to exist for a particular address but not the other way around. In other words, the database guarantees that an Addresses row’s primary key references a valid Users primary key and a Shipments row’s primary key references a valid Addresses primary key.

    How Does Code First Determine Principal and Dependent?

    Code First has rules to determine the principal and dependent ends of an association. For one-to-many relationships the many end is always the dependent, but it gets a little tricky in one-to-one associations. In one-to-one associations Code First decides based on our object model, and possible data annotations or fluent API that we may have. For example in our case, we wrote this fluent API code to configure User-Address association:
    modelBuilder.Entity<User>().HasOptional(u => u.BillingAddress).WithRequired();
    This reads as "User entity has an optional association with one Address object but this association is required for Address entity."
    For Code First this is good enough to make the decision: It marked User as the principal end and Address as the dependent end in the association. Since we have the same fluent API code for the second association between Address and Shipment, it marks Address as the principal end and Shipment as the dependent end in this association as well.

    The referential integrity that we saw is the first result of Code First's principal/dependent decision.

    Second Result of Code First's Principal/Dependent Decision: Database Identity

    If you take a closer look at the above DB schema, you'll notice that only UserId has a regular identifier generator (aka Identity or Sequence), while AddressId and ShipmentId do not. This is a very important consequence of the principal/dependent decision for one-to-one associations: the dependent primary key becomes non-Identity by default. This makes sense because the tables share their primary key values: only one of them can be auto-generated, and we need to take care of providing valid keys for the rest.

    What about Cascade Deletes?

    As we saw, each Address always belongs to one User and each Shipment is always delivered to one single Address. We want to make sure that when we delete a User, the possible dependent rows in Addresses and Shipments also get deleted in the database. In fact, this is one of the Referential Integrity Refactorings, called Introduce Cascading Delete. The primary reason we would apply "Introduce Cascading Delete" is to preserve the referential integrity of our data by ensuring that related rows are appropriately deleted when a parent row is deleted. By default, Code First does not enable cascade delete when it creates a relationship in the database. As always, we can override this convention with the fluent API:
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<User>().HasOptional(u => u.BillingAddress)
                                   .WithRequired()
                                   .WillCascadeOnDelete();
        
        modelBuilder.Entity<Shipment>().HasRequired(u => u.DeliveryAddress)
                                       .WithOptional()
                                       .WillCascadeOnDelete();
    }
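
    As a usage sketch, assuming the mapping above and the EntityMappingContext defined earlier, deleting a user then only takes a single Remove call. CascadeDeleteSketch and DeleteUser are hypothetical names used only for illustration.

    ```csharp
    // Assumes User and EntityMappingContext as defined above, with cascade
    // delete configured as shown. The class and method names are hypothetical.
    public static class CascadeDeleteSketch
    {
        public static void DeleteUser(int userId)
        {
            using (var context = new EntityMappingContext())
            {
                User user = context.Users.Find(userId);
                if (user == null)
                {
                    return; // nothing to delete
                }

                // Only the User is removed explicitly. With cascade delete
                // switched on in the database, the Addresses row sharing this
                // key is expected to be deleted too, and in turn the Shipments
                // row that depends on that address.
                context.Users.Remove(user);
                context.SaveChanges();
            }
        }
    }
    ```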

    What If Both Ends are Required?

    We saw that the only reason Code First could figure out the principal and dependent in our 1:1 associations was that our fluent API code clearly specified one end as Required and the other as Optional. But what if both ends are equally required in the association? For example, what if in our domain model a User always has one Address and an Address always has one User (required on both ends)? The answer is that, ultimately, this scenario needs to be configured with the fluent API, and the interesting point is that the fluent API is designed in a way that forces you to explicitly specify which end is the dependent and which is the principal in cases where Code First cannot infer this.

    To illustrate the idea, let's see how we can configure mapping for this User-Address association (Required/Required) with fluent API:
    modelBuilder.Entity<User>().HasRequired(u => u.BillingAddress).WithRequiredDependent();
    
    So we invoke WithRequiredDependent() after HasRequired() method. To see the reason, we need to take a look at the RequiredNavigationPropertyConfiguration type which is returned by HasRequired():
    public class RequiredNavigationPropertyConfiguration<TEntityType, TTargetEntityType>
    {
        public DependentNavigationPropertyConfiguration<TEntityType, TTargetEntityType> WithMany();
        public CascadableNavigationPropertyConfiguration WithOptional();
        public CascadableNavigationPropertyConfiguration WithRequiredDependent();
        public CascadableNavigationPropertyConfiguration WithRequiredPrincipal();
    }
    As you can see, if you want to follow HasRequired() with another Required end, you have to call either WithRequiredDependent() or WithRequiredPrincipal(), since the RequiredNavigationPropertyConfiguration class returned by HasRequired() has no WithRequired() method.
    Like the WithRequired() and WithOptional() calls used in the cascade delete configuration earlier, these methods return a CascadableNavigationPropertyConfiguration type, which has a WillCascadeOnDelete() method. If we run that earlier code and check the database, we'll see that cascade delete is switched on for both relationships.

    Working with the Model

    Here is an example for adding a new user along with its billing address. EF is smart enough to use the newly generated UserId for the AddressId as well:
    using (var context = new EntityMappingContext())
    {
        Address billingAddress = new Address()
        {   
            Street = "Yonge St.",
            City   = "Toronto"                    
        };
        User morteza = new User()
        {                    
            FirstName      = "Morteza",
            LastName       = "Manavi",
            BillingAddress = billingAddress
        }; 
                
        context.Users.Add(morteza);                        
        context.SaveChanges();                
    }
    The following code is an example of adding a new Address and Shipment for an existing User (assuming that we have a User with UserId=2 in the database):
    using (var context = new EntityMappingContext())
    {
        Address deliveryAddress = new Address()
        {
            AddressId = 2,
            Street    = "Main St.",
            City      = "Seattle"
        };
        Shipment shipment = new Shipment()
        {
            ShipmentId      = 2,
            State           = "Shipped",
            CreatedOn       = DateTime.Now,
            DeliveryAddress = deliveryAddress
        };                
                   
        context.Shipments.Add(shipment);
        context.SaveChanges();      
    }

    Limitations of This Mapping

    There are two important limitations to associations mapped as shared primary key:
    • Difficulty in saving related objects: The main difficulty with this approach is ensuring that associated instances are assigned the same primary key value when the objects are saved. For example, when adding a new Address object, it's our responsibility to provide a unique AddressId that is also valid (a User can be found with such a value as UserId.)

    • Multiple addresses for User are not possible: With this mapping we cannot have more than one Address for a User. At the beginning of this post, when we introduced our model, we assumed that the user has the same address for billing and delivery. But what if that's not the case? What if we also want to add a Home address to User for the deliveries? In the current setup, each row in the User table has a corresponding row in the Address table. Two addresses would require an additional address table, and this mapping style therefore wouldn’t be adequate.

    Summary

    In this post we learned about one-to-one associations, of which the shared primary key is just one possible implementation. Shared primary key associations do occur in practice but are relatively rare. In many schemas, a one-to-one association is represented with a foreign key field and a unique constraint. In the next posts we will revisit the same domain model and learn about other ways to map one-to-one associations that do not have the limitations of the shared primary key association mapping.


  • Associations in EF Code First CTP5: Part 1 – Complex Types

    Last week the CTP5 build of the new Entity Framework Code First was released by the data team at Microsoft. Entity Framework Code First provides a pretty powerful code-centric way to work with databases. When it comes to associations, it brings a great deal of flexibility. I’m a big fan of the EF Code First approach, and I am planning to explain association mapping with Code First in a series of blog posts. This one is dedicated to Complex Types.

    A Note For Those Who are New to Entity Framework and Code-First

    If you choose to learn EF you've chosen well. If you choose to learn EF with Code First you've done even better. To get started, you can find a great walkthrough by Scott Guthrie here and another one by the ADO.NET team here. In this post, I assume you have already set up your machine for Code First development and that you are familiar with Code First fundamentals and basic concepts.

    In order to build a solid foundation for our discussion, we will start by learning about some of the core concepts of relationship mapping.

    What is Mapping?

    Mapping is the act of determining how objects and their relationships are persisted in permanent data storage, in our case, relational databases.

    What is Relationship Mapping?

    A mapping that describes how to persist a relationship (association, aggregation, or composition) between two or more objects.

    Types of Relationships

    There are two categories of object relationships that we need to be concerned with when mapping associations. The first category is based on multiplicity and it includes three types:
    • One-to-one relationships: This is a relationship where the maximum of each of its multiplicities is one.
    • One-to-many relationships: Also known as a many-to-one relationship, this occurs when the maximum of one multiplicity is one and the other is greater than one.
    • Many-to-many relationships: This is a relationship where the maximum of both multiplicities is greater than one.
    The second category is based on directionality and it contains two types:
    • Uni-directional relationships: When an object knows about the object(s) it is related to but the other object(s) do not know of the original object. To put this in EF terminology, a navigation property exists on only one of the association ends and not on both.
    • Bi-directional relationships: When the objects on both ends of the relationship know of each other (i.e. a navigation property is defined on both ends).

    How Are Object Relationships Implemented in POCO Object Models?

    When the multiplicity is one (e.g. 0..1 or 1), the relationship is implemented by defining a navigation property that references the other object (e.g. an Address property on the User class). When the multiplicity is many (e.g. 0..*, 1..*), the relationship is implemented via an ICollection of the other object's type.
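    As a minimal sketch of these two cases, assume a User that references a single Address and also owns a collection of orders. The Order class and the Orders property are hypothetical and are not part of the model used in this post:

    ```csharp
    using System.Collections.Generic;

    public class Address
    {
        public string Street { get; set; }
        public string City { get; set; }
    }

    // Hypothetical type, introduced only to show a "many" end.
    public class Order
    {
        public int OrderId { get; set; }
    }

    public class User
    {
        public User()
        {
            // Initialize the collection so callers can add to it right away.
            Orders = new List<Order>();
        }

        public int UserId { get; set; }

        // Multiplicity 0..1 or 1: a single reference to the other object.
        public Address Address { get; set; }

        // Multiplicity 0..* or 1..*: an ICollection of the other object's type.
        public ICollection<Order> Orders { get; set; }
    }
    ```

    Because only User holds these properties, both relationships in this sketch are uni-directional.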

    How Are Relational Database Relationships Implemented?

    Relationships in relational databases are maintained through the use of Foreign Keys. A foreign key is a data attribute (or set of attributes) that appears in one table and must be the primary key or another candidate key in another table. With a one-to-one relationship, the foreign key needs to be implemented by one of the tables. To implement a one-to-many relationship, we place a foreign key in the “many” table that references the “one” table. We could also choose to implement a one-to-many relationship via an associative table (aka Join table), effectively making it a many-to-many relationship.

    Introducing the Model

    Now, let's review the model that we are going to use to implement a Complex Type with Code First. It's a simple object model that consists of two classes: User and Address. Each user can have one billing address. The Address information of a User is modeled as a separate class, as you can see in the UML model below:
    In object-modeling terms, this association is a kind of aggregation—a part-of relationship. Aggregation is a strong form of association; it has some additional semantics with regard to the lifecycle of objects. In this case, we have an even stronger form, composition, where the lifecycle of the part is fully dependent upon the lifecycle of the whole.

    Fine-grained Domain Models

    The motivation behind this design was to achieve Fine-grained domain models. In crude terms, fine-grained means “more classes than tables”. For example, a user may have both a billing address and a home address. In the database, you may have a single User table with the columns BillingStreet, BillingCity, and BillingPostalCode along with HomeStreet, HomeCity, and HomePostalCode. There are good reasons to use this somewhat denormalized relational model (performance, for one). In our object model, we can use the same approach, representing the two addresses as six string-valued properties of the User class. But it’s much better to model this using an Address class, where User has the BillingAddress and HomeAddress properties. This object model achieves improved cohesion and greater code reuse and is more understandable.
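    To make the contrast concrete, here is a minimal sketch of the fine-grained version of that example. The HomeAddress property and the Format helper are illustrative assumptions and are not part of the model mapped later in this post:

    ```csharp
    public class Address
    {
        public string Street { get; set; }
        public string City { get; set; }
        public string PostalCode { get; set; }

        // Hypothetical helper: behavior written once and reused by every address.
        public string Format()
        {
            return Street + ", " + City + " " + PostalCode;
        }
    }

    public class User
    {
        public int UserId { get; set; }

        // Two properties of one reusable type instead of six string properties
        // (BillingStreet, BillingCity, ..., HomePostalCode) on User itself.
        public Address BillingAddress { get; set; }
        public Address HomeAddress { get; set; }
    }
    ```

    Any logic that belongs to an address, such as formatting, now lives in one place and applies to both the billing and the home address.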

    Complex Types: Splitting a Table Across Multiple Types

    Back to our model: there is no difference between this composition and other, weaker styles of association when it comes to the actual C# implementation, because C# has no concept of composition and a class or property can’t be marked as one. In the context of ORM, however, there is a big difference: a composed class is often a candidate Complex Type. The only visible difference is the object identifier: a complex type has no individual identity (i.e. no AddressId defined on the Address class), which makes sense because, when it comes to the database, everything is going to be saved into one single table.

    How to implement a Complex Type with Code First

    Code First has a concept of Complex Type Discovery that works based on a set of Conventions. The convention is that if Code First discovers a class where a primary key cannot be inferred, and no primary key is registered through Data Annotations or the fluent API, then the type will be automatically registered as a complex type. Complex type detection also requires that the type does not have properties that reference entity types (i.e. all the properties must be scalar types) and is not referenced from a collection property on another type. Here is the implementation:
    public class User
    {
        public int UserId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string Username { get; set; }
        public Address Address { get; set; }
    }

    public class Address
    {
        public string Street { get; set; }
        public string City { get; set; }       
        public string PostalCode { get; set; }       
    }

    public class EntityMappingContext : DbContext
    {
        public DbSet<User> Users { get; set; }       
    }

    With Code First, this is all of the code we need to write to create a complex type. We do not need to configure any additional database schema mapping information through Data Annotations or the fluent API.
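    The discovery rule hinges on whether a primary key can be inferred for the class. The following is a rough sketch of that single check, not Code First's actual implementation. It assumes the key convention looks for a property named Id or <ClassName>Id, ignoring case, and it leaves out the other conditions mentioned above (Data Annotations, fluent API registration, references to entity types and collection references). KeyConventionSketch and HasConventionalKey are hypothetical names:

    ```csharp
    using System;
    using System.Linq;

    public class User
    {
        public int UserId { get; set; }
        public string FirstName { get; set; }
    }

    public class Address
    {
        public string Street { get; set; }
        public string City { get; set; }
    }

    public static class KeyConventionSketch
    {
        // Returns true when the type exposes a public property named
        // "Id" or "<TypeName>Id" (assumed convention, case-insensitive).
        public static bool HasConventionalKey(Type type)
        {
            return type.GetProperties().Any(p =>
                string.Equals(p.Name, "Id", StringComparison.OrdinalIgnoreCase) ||
                string.Equals(p.Name, type.Name + "Id", StringComparison.OrdinalIgnoreCase));
        }
    }
    ```

    Under these assumptions, HasConventionalKey(typeof(User)) is true because of UserId, while HasConventionalKey(typeof(Address)) is false, which is what makes Address a complex type candidate.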

    Database Schema

    The mapping result for this object model is as follows:

    Complex Types are Required

    As a limitation of EF in general, complex types are always considered required. To see this limitation in action, let's try to add a record to our database:
    using (var context = new EntityMappingContext())
    {
        User user = new User()
        {
            FirstName = "Morteza",
            LastName  = "Manavi",
            Username  = "mmanavi"        
        };
     
        context.Users.Add(user);
        context.SaveChanges();                
    }
    Surprisingly, this code throws a System.Data.UpdateException at runtime with this message:
    Null value for non-nullable member. Member: 'Address'.
    If we initialize the Address object (for example, by assigning a new Address() to the user's Address property before saving), the exception goes away and the user is successfully saved into the database.
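    A minimal sketch of that fix, using the User and Address classes from this post. Only the object construction is shown, and the InitializedAddressSketch class and its CreateUser method are hypothetical names:

    ```csharp
    public class Address
    {
        public string Street { get; set; }
        public string City { get; set; }
        public string PostalCode { get; set; }
    }

    public class User
    {
        public int UserId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string Username { get; set; }
        public Address Address { get; set; }
    }

    public static class InitializedAddressSketch
    {
        // Builds the same user as in the failing snippet, plus an Address instance.
        public static User CreateUser()
        {
            return new User
            {
                FirstName = "Morteza",
                LastName  = "Manavi",
                Username  = "mmanavi",
                // The instance itself must not be null; its properties may stay null.
                Address   = new Address()
            };
        }
    }
    ```

    The returned user can then be passed to context.Users.Add(...) and saved with context.SaveChanges(), exactly as in the snippet above.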
    When we read back the inserted record from the database, EF will return an Address object whose properties (Street, City and PostalCode) all have null values. This means that if you store a complex type object with all null property values, EF returns an initialized complex type when the owning entity is retrieved from the database.

    Explicitly Register a Type as Complex

    You saw that in our model we did not use any data annotation or fluent API code to designate Address as a complex type, yet Code First detected it as one based on the Complex Type Discovery concept. But what if our domain model requires a new property called Id on the Address class? This new Id property is just a scalar, non-primary-key property that represents, let's say, another piece of information about the address. In this case, Code First can actually infer a key and therefore marks Address as an entity that has its own mapping table, unless we specify otherwise. This is where explicit complex type registration comes into play. CTP5 defines a new attribute in the System.ComponentModel.DataAnnotations namespace called ComplexTypeAttribute. All we need to do is to use this attribute on our Address class:
    [ComplexType]
    public class Address
    {
        public int Id { get; set; }
        public string Street { get; set; }
        public string City { get; set; }
        public string PostalCode { get; set; }
    }
    This will cause Address to remain as a complex type in our model. As always, we can do the same with fluent API. In CTP5 a new generic method has been added to ModelBuilder class which is called ComplexType and has the following signature (when working with fluent API, we don't really care about the method's return values):
    public virtual ComplexTypeConfiguration<TComplexType> ComplexType<TComplexType>() 
                                            where TComplexType : class;
    Here is how we can register our Address type as complex in fluent API:
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.ComplexType<Address>();
    }

    Best Practices When Working with Complex Types

    • Always initialize the complex type: Because of the problem we just saw, I recommend always initializing the complex type inside its owning entity's constructor.
    • Add a read-only property to the complex type for null value checking: Defining a non-persistent read-only property like HasValue will help to test for null values.
    • Consider always using the ComplexType attribute: Even if your class is automatically detected as a complex type by Code First, I still recommend marking it with the [ComplexType] attribute. Not only does that make your object model more readable, it also ensures that your complex types stay complex types as your model evolves. Furthermore, there is a bug in CTP5: if you put the Required attribute (a data annotation that Code First supports for validation) on any of the complex type's properties (e.g. PostalCode), Code First stops treating the class as a complex type and throws a ModelValidationException. The workaround for this bug is to explicitly mark Address with the ComplexType attribute, so the attribute is beneficial in such cases as well.
    Therefore, our final object model will be:
    public class User
    {
        public User()
        {
            Address = new Address();
        }
     
        public int UserId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public string Username { get; set; }
        public Address Address { get; set; }
    }
     
    [ComplexType]
    public class Address
    {
        public string Street { get; set; }
        public string City { get; set; }
        public string PostalCode { get; set; }
            
        public bool HasValue 
        {
            get
            {
                return (Street != null || PostalCode != null || City != null);
            }
        }
    }
    The interesting point is that we do not have to explicitly exclude the HasValue property from the mapping. Since this property does not have a setter, EF Code First ignores it by convention, which makes sense because a read-only property most probably represents a computed value and does not need to be persisted in the database.
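    As a short usage sketch, HasValue lets calling code tell an empty address from a populated one without comparing each property. The classes repeat the final model above, minus the [ComplexType] attribute so that the block compiles without EF references, and the AddressReport class with its Describe method is a hypothetical helper:

    ```csharp
    public class Address
    {
        public string Street { get; set; }
        public string City { get; set; }
        public string PostalCode { get; set; }

        // True as soon as at least one mapped property carries a value.
        public bool HasValue
        {
            get { return Street != null || PostalCode != null || City != null; }
        }
    }

    public class User
    {
        public User()
        {
            // The complex type is always initialized, as recommended above.
            Address = new Address();
        }

        public int UserId { get; set; }
        public Address Address { get; set; }
    }

    public static class AddressReport
    {
        // Hypothetical helper that branches on HasValue instead of a null check.
        public static string Describe(User user)
        {
            return user.Address.HasValue
                ? "Address on file: " + user.Address.City
                : "No address on file";
        }
    }
    ```

    A null check on user.Address would not work here, since the constructor (and EF, when reading from the database) always supplies an Address instance.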

    Complex Types and the New Change Tracking API

    EF Code First CTP5 exposes a new set of change tracking information that enables us to access Original, Current & Stored values, and State (e.g. Added, Unchanged, Modified, Deleted) of our entities. The Original Values are the values the entity had when it was queried from the database. The Current Values are the values the entity has now. This feature also fully supports complex types:
    using (var context = new EntityMappingContext())
    {
        var user = context.Users.Find(1);
     
        Address originalValues = context.Entry(user)
                                        .ComplexProperty(u => u.Address)
                                        .OriginalValue;    
        
        Address currentValues = context.Entry(user)
                                       .ComplexProperty(u => u.Address)
                                       .CurrentValue;
    }
    The entry point for accessing the new change tracking API is DbContext's Entry method which returns an object of type DbEntityEntry. DbEntityEntry contains a ComplexProperty method that returns a DbComplexPropertyEntry object where we can access the original and current values:
    namespace System.Data.Entity.Infrastructure
    {    
        public class DbEntityEntry<TEntity> where TEntity : class
        {       
            public DbComplexPropertyEntry<TEntity, TComplexProperty> 
                ComplexProperty<TComplexProperty>
                    (Expression<Func<TEntity, TComplexProperty>> property);
        }
    }

    Limitations of This Mapping

    There are two important limitations to classes mapped as Complex Types:
    • Shared references are not possible: The Address Complex Type doesn’t have its own database identity (primary key) and so can’t be referred to by any object other than the containing instance of User (e.g. a Shipping class that also needs to reference the same User Address).
    • No elegant way to represent a null reference: As we saw, there is no elegant way to represent a null reference to an Address. When reading from the database, EF Code First always initializes the Address object, even if the values in all mapped columns of the complex type are null.

    Summary

    In this post we learned about fine-grained domain models, of which the complex type is just one example. Fine-grained modeling is fully supported by EF Code First and is known as the most important requirement for a rich domain model. A complex type is usually the simplest way to represent a one-to-one relationship, and because the lifecycle is almost always dependent in such a case, it’s either an aggregation or a composition in UML. In the next posts we will revisit the same domain model and learn about other ways to map a one-to-one association that do not have the limitations of complex types.


  • My First Blog Post

    Finally, here I am, starting my weblog.

    My name is Morteza Manavi. I am going to blog here mostly about technologies that I am passionate about: core C#, Entity Framework, WCF and ASP.NET MVC, as well as software design and architecture.

    A special thanks to Joe Stagner, who was kind enough to help me get this blog.
