August 2008 - Posts
If you have read my previous blog posting about the Load operator in entity framework and how it works in terms of lazy loading object regardless if it object has been tracked earlier, you must be keen to know how does this behavior work with Linq to SQL. I must say Linq to SQL has always surprised me in terms of how efficient it is in terms of dealing with retrieving data from the database. When you lazy load a particular entity reference, Linq to SQL will first track its tracking repository to see if it can find the object there. If an object is found it by passes database and simply returns the object from its cache. This is an optimized behavior as compared to how entity framework tackles this problem. For example if I have much of orders and I want to lazy load its customer, Linq to SQL will not make a database call for every Order to get its customer. If those orders belong to a total 5 distinct customers, than only 5 database calls will be made. Below is an example that shows this behavior.


In the above code, I am retrieving Orders shipped in the city of London and than lazy loading its customer to print the customer for each order. Although orders returned numbered to 33 but the total database calls made was 5. This is because those 33 orders belong to total of 5 customers. Linq to SQL was smart enough to use its tracking service to figure out that if it can find an object in the cache, there is no need to make a call to the database.
If you have queries in your classes that you are reusing across your entire business layer, it becomes really hard sometimes to do eager loading of certain child entities on those queries. For example, if you have method that runs a complex linq to entity query to return you a collection of Orders. What if in certain scenarios of your application, you would like to retrieve Customers for those orders as well. Either you can create another method just like your previous method which has that complex query duplicated but along with that query you support the concept of eager loading of Customer entities as well. Other option you have is if the orders returned from the method are not too many and in manageable size, than you may end up wanting to reuse the method and incur the cost of lazy loading of the Customers. Below is an example that shows how I would lazy load Customers for an Order.


In the above code, I have a method called GetOrdersBycity which retrieves the Orders for the city of London. Just to keep you in loop, the total orders in the database for the city of London is 33. Since we also want to print the CustomerId for those orders, I am explicitly calling CustomerReference.Load to load the customer for that order. We need to call load on the customer reference otherwise we will end up with a null reference because entity framework does not load anything automatically from the database. Since there are 33 orders that meet the criteria we end up making 33 database calls to retrieve customers for those orders. Now the question is, do we really think that entity framework should have made 33 database calls. Personally I think, entity framework, should make database calls equal to the number of unique customers for those orders. Here is the reason why. Since we already have orders in our hand and each order knows what CustomerId it belongs to, I think entity framework should look into its object tracking service to see if its tracking this object. If the object is found in the tracking repository, it should not make an extra database call to retrieve the same customer that was already being tracked in the tracking service. But the reality of the matter is, if you turn on your profiler, you would see 33 database calls being made.
The question is how can we prevent the entity framework from making extra round trips to the database? Entity framework exposes an additional property called IsLoaded which you can check to see if the reference to the object is already loaded. If the object is already from a previous database query, than do not call load on the object again. Below is an example that shows to achieve that.


In the above code, I am checking the IsLoaded property to check if customer reference is already loaded. Only when the customer is not loaded I call Load. By applying a simple tweak in my query, I end up only making 5 database calls which means that I had 5 unique customers for all 33 orders. With simple checking we managed to improve our loading experience of related entitles by orders of magnitude.
If you want to compare dates in Linq to SQL query, you are free to use normal operators available in C# and Linq to SQL provider will translate the date comparison into appropriate SQL which makes uses of date functions available on SQL server. To demonstrate the usage of Date functions let's look at an example.

In the above example, I am getting an instance of Date Time object by parsing date from a string. I have an event table which has an event date that defines when the event will happen. To find an event that matches the date we have selected, I am comparing my date in the lambda expression to the date available on the Event Timing table. Linq to SQL is smart enough to convert the expression into appropriate SQL that SQL server can execute. Above query generates the follow SQL query to be executed by the database.

Above SQL query is the translation of our comparison of dates in Linq query. From the query, we can assert that this comparison was by no means complicated. It simply compares two fields ensuring that the parameter passed in to the SQL query is of Datetime data type.
But sometimes we don't really care about the time portion of the date and all we care is about finding an item in the database that matches a specific date with no time. Well, if you are using SQL server 2008, than I would highly recommend that you make use of the new data type called date which only captures the date portion of a date time. If SQL server 2008 is not an option, you can do your regular comparison of just the date portion in your query and Linq to SQL provider would gladly convert your query into appropriate SQL translation that only compares the date portion of a date. Below is an example of a query where I am only comparing date portions of the query.

In the above example, I make use of Date property available on a Date instance to just get the date portion of my selection date. To ensure that both sides of query make use of just the date portion, in my Linq query, I am getting just the date portion from my event date and comparing that to my selected date. When I run the above query, I get the following SQL query.

In the above SQL query, we are just grabbing the date portion from EventDate column by making use of convert operator passing in date as one of parameter. This confirms that when we make use of Date property on the Datetime object as part of our Linq query, Linq provider will translate the query which will make use of Convert function to convert Datetime instance to just date. It might be surprising that value portion of our date still consists of date and time. This is because Date property on Datetime object return the date and a default time and since the convert function in SQL server also returns a date with default time, the criteria matches based on dates with default time.
So far in my previous blog postings, I have discovered goodness with linq to SQL as I travel the path of migrating from linq to SQL to entity framework. This is not to deny that entity framework also has plus points which cannot be ignored. Among them is support for many to many relationship. Many to Many relationship is a concept that is very common in most OR mappers so one would wonder why it did not make it into the service pack release for sp1 for linq to SQL. I am sure Danny Simmons will have a better answer to this question than me.
Many to Many tables are used in the database when both sides of the tables has many relationship to the other side of the table. It is easier to explain the concept through an example. For instance if you have Employees and Projects. An employee can be part of many projects and in a single project there can be many employees participating. A scenario like this requires a third table which can include relationship from both tables. That table can be called ProjectAssignment. Below shows how the database diagram looks like.

In the above example, I have a table called Employees, which has all the employees in the database. Another table Projects which lists all the projects in the database. Than I have the connecting table called ProjectAssignment which defines how each employee is tied to a project. Notice in the ProjectAssignment, I have marked combination of EmployeeId and ProjectId to be the primary key. because you cant have the same employee be repeated for a given project. When I run the entity framework designer on the database, I end up with the following entity framework diagram.

When you look at the above generated entity diagram, you must be wondering where did my third table ProjectAssignment go? Well you don't need it. ProjectAssignment was a table in the database only used for normalization because database tables do not directly support the concept of many to many tables and therefore the need to bring in the third table arose. However in the OO world, many to many relationships can be mapped easily between objects using navigation relations where you have reference to Employee object, you can use Projects property to get access to all the projects an employee is part of. Similarly Project entity exposes an entity set called Employees which you can use to traverse to get a collection of all the employees participating in the project. Following code shows how to achieve that in the code.

In the above example, I am accessing the projects collection of the employee to find out the total projects an employee belongs to. Same is the case with getting all the employees that are part of the project by accessing the employees entity set available on project entity.
So what are the use cases that actually do not turn into many to many relationship in the entity diagram? In cases where many to many table is not only used for relationship, but also carries the responsibility of capturing data specific to that relationship. For example, if we we decide that for each employee in a specific project, we need track how much percentage of the project has been completed and also how much money an employee has spent on that project. The best place to keep track of these additional attributes would be the ProjectAssignment table because it is a table that manages the relation of an employee to a project. The screen shot below shows how the database diagram looks like after adding additional attributes.

Above screen shot shows 2 more new columns, Amount Spent and PerComplete that we are tracking inside of ProjectAssignment table for each employee on a specific project. Since these attributes are an important attributes to a relationship, it is no longer converted to many to many tables relation. Converting this relation to simply many to many relationship would cause it to loose attributes that we are tracking for that relation. Hence entity framework keeps this table intact in the entity framework designer and if you have to access the projects for an employee, you have to travel your way through the ProjectAssignment table to reach either projects or employee table. Code below shows the how to get all projects for an employee and all employees for a project.

In the example above, to access all the projects for an employee, I have to navigate to all the ProjectAssignment and for each ProjectAssignment access its Project. I do the similar navigation when I have to get access to all the employees for a particular project.
As I continue to migrate my linq to SQL code over to entity framework, I am discovering more constraints that I never faced working with linq to SQL. For instance in one of my linq query, I was able to do object comparison to apply filter to queries. For example, If you wanted to retrieve customers who are in the city of London and also customer that you currently hold in your hand and find out the count of customers that meet this criteria or probably find out how much total sales those customers have given us, you probably want to perform this query on the database. In the past I could apply object comparison in my query rather than applying comparisons based on column values. You could actually use column values to dictate your filters, but sometimes being explicit on how you are applying a filter puts noise in your code as compared to just saying make sure this customer is also part of the filter.
Below is an example of code, that demonstrates running a query to get the count of customers who are in the city of London or customer filtered using object comparison.

In the above example, I am first retrieving a customer based on primary key column. In the next query, I am use the object that I obtained from my previous query to filter the customers to only customers that are in City of London or matches the customer that has the same object reference as the reference passed in the query. Although it may seem that you are doing object comparison but in reality linq to SQL transforms the object comparison to filter being applied on database based on primary key. You can confirm this behavior by looking at the SQL query that was generated. Interesting point to consider is, the entire query got execute on the database including the count operation and filter based on primary key column.
The concept of object comparison does not work in entity framework. In fact applying this concept will give you a runtime error because entity framework cannot transform object comparison to filter being applied on the database. Below code shows the same example being written in entity framework which causes runtime exception.

When you run the above piece of code, you will get a runtime error saying Unable to create a constant value but if you get rid of object comparison in the above entity query, you will not get this error.
To set the stage of what I am going to blog about, I would recommend that you read by earlier blog posting here where I have talked about how the cast operator works. In short when you use cast operator with Linq to SQL to fetch a particular concrete type, you end up making a call to the database that fetches all records and than cast is performed in memory to cast the objects to the type passed in as a generic type. If the object cannot be cast to the correct type, the object is assigned a null value. The implementation is very similar to as operator in C# where if the object cannot be cast to right type, the reference of the object is set to null. Important point to understand is, even though your Linq query may not give the results you desire, but the query wont crash. However if you were to write the same query in Linq to entities or entity framework, you get a runtime exception stating that a cast is not valid. Below is an example that shows the behavior?
Running the above code with entity framework throws an exception.
Unable to cast the type Employee to type HourlyEmployee. LINQ to Entities only supports casting Entity
When I run the same piece of the code in Linq to SQL, I don't get exception but I do get 2 items in my collection, 1 being Hourly Employee and second is null because that employee is a Salaried Worker.

I am not sure which implementation is correct but I do think that which ever implementation is correct should be correct in both providers so that at the very least when we are migrating code from Linq to SQL to entity framework, implementation does not change and cause weird errors at runtime.
As I am starting to learn more and more, I think linq to sql first release is much more mature that v1 release of entity framework. For instance, in linq to sql, if I would create custom property in my partial class, I could actually use that custom property in the projection of my query. One of the obvious reasons for creating custom property is to reuse code functionality such as creating a property called Name which combines, FirstName and LastName. You really don't to write that code all over you class diagram. Exposing the function as a property and being able to use that property in a projection is a valuable functionality added to the framework. Below is the code that shows using custom property in a projection of a query.

In the above example, I am using the custom property Name available on Employee partial class to project the output in an anonymous type.
Something like, is currently not supported by entity framework. When I run code similar to to above in entity framework, I get the following exception.


In the above code, I am making use of custom property Name with entity framework and I am greeted with an exception stating that Name being a custom property is not supported.
In this blog posting, I will do a walk through of how to apply single table inheritance using entity framework. Entity framework supports 3 different models of inheritance.
- Table Per Hierarchy (Single Table Inheritance)
- Table Per Type
- Table Per Concrete Class
Of all these supported inheritance models, the most simplest and easiest to implement is Table Per Hierarchy (Single Table Inheritance). To implement this inheritance, you store all concrete types in one table. In Entity framework to identity a row as a specific concrete type, you define a discriminator column which identities which concrete type a specific row gets mapped to. From a usability point, I have found Single table model to be very easy to get started. However from the database perspective, the model doesn't seem to favor a clean approach. The reason is, you are storing all different concrete types in a single table. Some concrete types would need certain columns where as others won't. To accomplish flexibility at the table level, you have to mark all columns that are specific to their concrete implementation as allow nulls. Some database developers may find this approach not a good solution because it does not efficient use of disk space. Let's walk through an example to see how we can use the entity framework designer to apply single table inheritance.
To demonstrate the example, I will create a simple table called Employees. In the employees table we will capture two kinds of Employee, Hourly Worker and Salaried Employee. For both employees, we would like to know about their Name, Phone,Email. For Hourly Employee, we also want to capture their hourly rate and the number of Hours worked. For salaried employee, we would like to know their Salary.

The above employee schema shows that we have Name, Phone and Email as required because that is required for both types for Employees. However Salary, Rate, Hours are not required as they are specific fields for each concrete type. The last column type is the discriminator column that identifies what type of row it is; is it Hourly Employee or Salaried Employee?
We will start with adding a new Ado.net Entity data model.
2. Select the database and save the connection string to app.config file.
3. Select the Employee table.
After clicking finish, you end up with a single table Employee on the Entity framework designer. Now we will add two more entities that inherit from Employee entity; Hourly Employee and Salaried Employee.
4. Add Hourly Employee to the designer, inheriting from Employee class.
4. Add Salaried Employee to the designer, inheriting from Employee class.

5. After creating both concrete classes your entity diagram should look like this.
From the above diagram, you will notice that Salary, Rate and Hours are not attributes specific to Employee class. So let's move those properties to their respective class by cutting and pasting the properties to each of their classes.
Now that we have our properties in their respective classes, we can map Salary property on Salaried Employee class to Salary column in Employee table. We will also specify how Salaried Employee will map to Salary table by using condition column.
In the above example, I am setting a condition that when Type column on the Employee table has a value of SE (Salaried Employee), it should be mapped to Salaried Employee entity. Additionally I am also mapping Salary property to Salary column in the Employee table.
5. Next we will map Hourly Employee class Employee table using the discriminator column and mapping additional columns specific to Hourly Employee.

In the above diagram, I am setting a condition of Hourly Employee where type is equal to HE. Furthermore, I am also mapping additional fields specific to Hourly Employee class.
At this point we are not done because if you try to compile the project you will get build errors and those build errors would be something in the line of Type property not being mapped. The error is not very clear but in order to clear this error you have to remove the Type column from the base class because the column that you are using as discriminator column cannot be mapped.
After removing the Type column from Employee base class, you are still not done. If you try to compile the project, you will get the error below.
Once again the error is very vague and does not explain how to overcome this error. What it means is, we have mapped our concrete classes, Hourly Employee and Salaried Employee using the Type column attribute but we have not identified how to map the base Employee class. Either we can map the Employee class to a specific value in Type column but in my case, Employee class is really a base class for both Hourly and Salaried Employee and it does not map to any specific implementation. Therefore I can mark my Employee class as an abstract class to get rid of compile time error as shown below.

Once we mark the Employee class as abstract, the error goes away because we are explicitly stating that Employee class cannot be instantiated therefore it cannot be mapped to any concrete implementation.
Now all this required is to write a query that fetches Hourly and Salaried Employee as shown below.
In the above code, to access the Hourly Employee, I am making use of OfType operator specifying Hourly Employee as a generic type to retrieve only Hourly Employee. To fetch the salaried employee, I pass in Salaried Employee as my generic type. Output below shows the results of running the query.

If you are like me, migrating to entity framework from linq to SQL, I must say you have been spoiled. What seemed to be very simple has gotten much more complex. I find that Entity framework solved my persistence ignorance problem but brought me much more complicated problems to deal with. Here is one that took ages to find out
In linq to SQL, I can not only control the namespace for datacontext but also the code generated classes for entities generated by linq to SQL. Bellow is the snap shot that shows where I can make these changes.

In the above screen shot, you can see that I am given two options to namespace generation. First, I get to specify which namespace my datacontext gets generated in. Second to avoid name collision, I am also given the opportunity to specify where my code generated entities are created. The option of Entity Namespace can be used to specify the namespace where entities are generated. On changing that you can also confirm that linq to SQL designer performed what you told it do by going to the designer.cs file. Below example shows the preview.

When using entity framework, you get only one option for changing the namespace and that is changing the namespace for entity datamodel. Not sure what namespace changes it makes because when I change the value for the namespace, I still saw my NorthWindEntityDataContext in the same namespace and also my code generated entities in the default namespace. One change I noticed after changing the namespace is, some of attributes defined in the code generated classes were now pointing to my new namespace declared but I don't know how does that help me much because entities are still in the same namespace.
Below is an example that shows this behavior.
After struggling for half an hour, I finally decided to right click the edmx file and sure enough i was greeted with another namespace change that actually allowed me to define which namespace my CLR objects be generated in.

Hopefully some of you will save sometime if you cant see an option for changing the namespaces for code generated entities.
MultipleActiveResultSets was a feature introduced in the connectionstring with ado.net 2.0. In prior versions of ado.net, you can only execute a single query at a time on a particular open connection. When you set MultipleActiveResultSets to true, you can execute multiple commands on a single open connection, resulting in better performance. It's interesting that when you generate your linq to SQL classes, by default it does not make use of MultipleActiveResultSets in the connectionstring. However when you generate entities using entity framework, the connectionstring added to web.config or app.config has explicit setting for MultipleActiveResultSets = true to enable the feature. If you set MultipleActiveResultSets to false in the connectionstring most of the queries in the entity framework fail. It appears that you have to set MultipleActiveResultSets to true in the connectionstring if you want to use EntityFramework.
More Posts
Next page »