Returning IQueryable<T>

Monday, December 1, 2008

Design

Since LINQ joined the .NET community as a wonderful new player, more and more of the solutions I have seen return IQueryable<T> from the Data Access Layer. One reason is that it makes it easy to create and execute different kinds of queries. Some developers use the IQueryable<T> interface to create a lightweight interface for the Data Access Layer. It's an interesting solution, but at the moment I don't like it.

If we look at how developers use IQueryable<T>, we often see code like this:

public class CustomerRepository // or CustomerDAL
{
    public IQueryable<Customer> GetCustomers() { /* ... */ }
}

What is wrong with this code?

1) The GetCustomers method is lying (a method is there to communicate). It will not get the customers; it will return a query object which can be used to create a query for getting customers. A better name for the method would be GetQueryableCustomers or maybe GetQueryObjectForCustomers. If we want to use the name GetCustomers, I think the class that has this member should have the word Query in its name, for example: QueryProvider.GetCustomers. This makes sure the users of the code know that they are getting a query and nothing else. So when we return IQueryable<T>, we are basically using a Query Provider, which should be used by the Data Access Layer and no other layers.

2) By passing IQueryable<T> between the layers, we get deferred execution. We will "never" know where and when the execution will take place (if we own the code we will, but not if other developers want to use our code). For me this is like passing SQL statements from the Data Access Layer up to the Business or Presentation layer for later execution.
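To make the deferred execution in point 2 concrete, here is a minimal sketch. It uses LINQ to Objects through AsQueryable as a stand-in for a real LINQ provider, and the Customer class and DeferredExecutionDemo names are assumptions made up for the illustration. The query is defined in one place, but it only runs when ToList is called, so it sees data added after it was defined.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, only for this illustration.
public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; } = "";
}

public static class DeferredExecutionDemo
{
    public static List<Customer> Run()
    {
        var store = new List<Customer> { new Customer { ID = 1, Name = "Ann" } };

        // Defining the query executes nothing; it only builds an expression tree.
        IQueryable<Customer> query = store.AsQueryable().Where(c => c.ID > 0);

        // The data changes after the query was defined...
        store.Add(new Customer { ID = 2, Name = "Bob" });

        // ...and the query runs here, wherever the caller happens to enumerate it.
        return query.ToList();
    }
}
```

The returned list contains both customers, even though only one existed when the query was written. With a database-backed provider, the same moment is when the SQL would be sent.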

By not passing IQueryable<T> from the Data Access Layer to the Business Logic Layer, we will know where and when the execution takes place. If we use the Repository pattern, all execution of queries should take place inside the Repository. It is sad that IQueryable<T> is indirectly bound to a specific LINQ feature, such as Entity Framework or LINQ to SQL. What I want is a way to attach an IQueryable<T> to a LINQ feature later. That way I could use it as a replacement for the Specification pattern/Query Object pattern, which would solve a lot of problems and also result in a lightweight interface for the Data Access components.

public IEnumerable<Customer> GetCustomersBySpecification(IQueryable<Customer> query)

If this were possible, and if we could use Entity Framework or LINQ to SQL as the infrastructure of our Repositories, we could create a Factory that returns an IQueryable<T> which is not indirectly attached to a specific LINQ feature. In this case we could create our queries in the Business Logic Layer, pass the query back down to the Data Access Layer, and let it attach the query to a LINQ feature and execute it:

var query = from customer in QueryFactory.GetQuery<Customer>()
            where customer.ID == 1234
            select customer;

var customers = CustomerRepository.GetCustomersBySpecification(query);

With this solution we could avoid deferred execution in the upper layers; when and where the execution takes place would be under our control.
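One way this idea could be approximated is sketched below. It is not a feature of Entity Framework or LINQ to SQL; it is an assumption-laden illustration in which QueryFactory, RootReplacer and this CustomerRepository are hypothetical names, and an in-memory list stands in for the real data source. The factory hands out a query with an empty placeholder root, and the repository uses an ExpressionVisitor to swap that root for its own source before executing. Whether a given database provider accepts the rewritten tree would have to be verified.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; } = "";
}

// Hypothetical factory: returns a query that is bound to no real data source.
public static class QueryFactory
{
    public static IQueryable<T> GetQuery<T>()
    {
        return Enumerable.Empty<T>().AsQueryable();
    }
}

// Replaces the placeholder root of the expression tree with the real source.
public class RootReplacer<T> : ExpressionVisitor
{
    private readonly Expression _realRoot;

    public RootReplacer(Expression realRoot) { _realRoot = realRoot; }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        // Assumption: the only IQueryable<T> constant in the tree is the placeholder.
        return node.Value is IQueryable<T> ? _realRoot : base.VisitConstant(node);
    }
}

public class CustomerRepository
{
    private readonly IQueryable<Customer> _source;

    // In a real repository the source would come from the LINQ feature in use.
    public CustomerRepository(IQueryable<Customer> source) { _source = source; }

    public IEnumerable<Customer> GetCustomersBySpecification(IQueryable<Customer> query)
    {
        var rebound = new RootReplacer<Customer>(_source.Expression).Visit(query.Expression);

        // ToList forces execution here, inside the repository.
        return _source.Provider.CreateQuery<Customer>(rebound).ToList();
    }
}

public static class RebindDemo
{
    public static List<Customer> Run()
    {
        var data = new List<Customer>
        {
            new Customer { ID = 1, Name = "Ann" },
            new Customer { ID = 1234, Name = "Bob" }
        };
        var repository = new CustomerRepository(data.AsQueryable());

        var query = from customer in QueryFactory.GetQuery<Customer>()
                    where customer.ID == 1234
                    select customer;

        return repository.GetCustomersBySpecification(query).ToList();
    }
}
```

In this sketch the query is written against the placeholder, yet the result comes from the repository's own data, and the caller receives an already materialized list rather than something that executes later.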

Before I end this post I want to let you know that the problem is not about returning the IQueryable<T> interface; the problem is how the LINQ features are indirectly bound to the IQueryable<T> interface. If they weren't, and we could bind them later, I think we could use it to solve problems in a really wonderful and nice way. What do you think?

Architecture, Clean Code, Design, LINQ, Repository Pattern

10 Comments

Agree completely, it's nice that someone shares my opinion in this topic. Have searched long and hard for a more generic way of handling the different queries.

By the way, thanks for switching to Entity Framework and then writing about it from a DDD perspective. Since EF is not a framework that supports DDD out of the box, it's very important to tell the masses how it could be used. Unfortunately there are still some limitations in EF, like the last thing you pointed out, which make it quite tricky to use in the right way.

Mikael Henriksson - Monday, December 1, 2008 8:47:49 AM

I agree that having the query execution handled by the repository is the way you want it. How do you feel about lazy loading? I am a big fan of how NHibernate supports lazy loading. Sure, it will cause some repository abstraction leakage, but it is extremely useful in some scenarios and can simplify and reduce the code you have to write.

Torkel - Monday, December 1, 2008 10:14:59 AM

I run into issue 2 all the time when working with StringTemplate for view processing. I like to just pass my objects into the processor and leave, but because .ToString() on an IQueryable just returns the type, I have to call ToList() before sending the object into the processor.

Hal - Monday, December 1, 2008 11:39:42 AM

@Roger:
">>We are basically using a Query Provider, which should be used by the Data Access Layer, no other layers.
That depends on what you consider to be your data access layer.
Semantics, semantics :-)
Eg, are the repositories your data access layer or are they part of your domain layer?"
I refer to a Query Provider used by the Repository. For example, a Repository asks the Query Provider for an IQueryable<T>, and the Repository executes the query by using the ToList or Single methods.
"But "We will "never" know where and when" is a bit over dramatic IMO, are you afraid to use delegates and events too? ;-)"
You know what I mean, read the sentence in its current context ;)

Fredrik N - Monday, December 1, 2008 2:24:47 PM

Why is it so important to know exactly when the query will be executed, at the repository level? As long as the top-level application understands that it is getting a "query", and not an actual list of items, it can take the appropriate steps to make sure that any delay is accounted for. Isn't this better than always executing immediately, and not giving the application any control at all?

IQueryable is not in any way bound to LINQ to SQL; the whole point of its existence is to be provider agnostic. There are any number of alternate implementations.

The idea about using IQueryable as a query specification is absolutely possible, if you want to go that way. It's not even too terribly difficult, although there might be some double maintenance issues.

David Nelson - Friday, February 27, 2009 9:03:23 PM

@David:
Do you also pass SQL queries from the data access layer to the presentation layer and let them execute in that layer?
About deferred execution, do you also prefer lazy loading?
Ayende wrote a good post about Lazy loading here:
ayende.com/.../Lazy-loading-The-Good-The-Bad-And-The-Evil-Witch.aspx
I don't "hate" the IQueryable<T> interface. I like it; the only thing I don't like is passing it up through the layers, because most developers are using EF or LINQ to SQL, and that is the big problem. It would be much better if we could pass the query down to the data access layer and let the right layer do the execution. Why, you may ask? Because of separation of concerns, the Single Responsibility Principle, etc.

Fredrik N - Friday, February 27, 2009 9:26:04 PM

Roger Alsing has it right. It depends on your separation of responsibility. Our layers are like this: Presentation -> BLL -> DAL -> ORM. By having the DAL pass an IQueryable, the BLL can apply business rules such as which rows user security allows and how the result should be sorted. The BLL returns a List back out to presentation.

Pete - Friday, February 27, 2009 11:19:15 PM

"Do you also pass SQL queries from the data access layer to the presentation layer and let them execute in that layer?"

Of course not. I don't let implementation details of my data access layer leak into higher layers. And there is nothing about passing back an IQueryable that leaks those details, any more than passing back an IEnumerable.

I'm not sure what your point is about lazy loading. In general I think it's more trouble than it's worth except in very specific cases. But deferred execution and lazy loading are not the same thing. With lazy loading, you don't know what is going to happen when you access that property: is it going to execute a database query? Will that hurt the response time? You have no control. Deferred execution is the opposite; it gives the most control to the layer that has the most information, which is a good thing.

"It would be much much better if we could pass down the query to the data access layer and let the right layer do the execution."

That's exactly what you're doing by returning the IQueryable. The application layer requests an IQueryable instance, which the data layer provides, and then the application layer fills out the query and executes it. The only difference between that model, and passing the query to the data layer in a method call to execute, is a semantic one.

David Nelson - Friday, February 27, 2009 11:49:31 PM

@Pete:
There is a known pattern called the Query Provider pattern; when we return a Query (Query Object), we use that pattern. So if the DAL returns an IQueryable, the object is a Query Provider. By passing an IQueryable<T> up through the layers and creating some queries in the DAL (like common queries) and some in the BLL, the queries will be distributed across several places, and if the underlying model changes, there are several places that need to be changed. If we use EF or LINQ to SQL, we could just as well use the DataContext as the DAL if we would like to create queries within the BLL.

Fredrik N - Saturday, February 28, 2009 7:19:23 AM

@David Nelson:
About Lazy Loading vs Deferred Execution
You are correct! BUT! The user of your API may not know that an execution will take place when he/she uses foreach or calls the ToList method on the IQueryable<T>, because it all depends on the infrastructure used. A user of a public API may have got this API from someone else. Should they need to read the documentation or get the source code to understand how to use the API and what infrastructure it will use? Or should the API be self-describing? The problem is not the interface; it's the infrastructure that uses the interface and how it's used by others. IQueryable<T> is an abstraction; it hides the details, which is OK. The problem is that people return IQueryable<T> and define queries in different places, which makes it harder to maintain and reuse the queries. They will also distribute queries among several layers, and if the underlying domain model changes, there will be several places to update. If we take all those "distributed" queries and put them into one object, we get a Repository which handles the execution of the queries and brings all queries together in one single place.

Fredrik N - Wednesday, March 4, 2009 1:15:30 PM
