What purpose does the Repository Pattern have?
I have watched Rob Conery's great screencasts about MVC Storefront. If you haven't seen them, you should take a look. They are really interesting; he builds an app using Agile, "TDD", etc. I have some comments about his implementation that I want to share. If you don't agree with me, that's fine; I'm not an expert, and this post is based on my own experience and knowledge ;)
Feel absolutely free to criticize me, but please give suggestions about what can be done better, and also a reason why. There is no use in adding comments like "I don't agree with you!" if you don't say why.
Something that I don't like about his implementation so far is his use of the Repository Pattern. I don't know which pattern Rob refers to; I assume it's Fowler's Repository Pattern. Just so you know, this post is based on my experience and interpretation of the Repository pattern.
Rob creates a Repository with a simple interface. For example, it has a method GetCategories which returns an IQueryable<Category> object.
public interface ICategoryRepository { IQueryable<Category> GetCategories(); }
He also uses the Service layer to implement commonly used queries such as GetCategories.
public class CategoryService
{
    //...
    public IList<Category> GetCategories()
    {
        return _repository.GetCategories().ToList();
    }
}
Another method that Rob puts into the Service layer is GetProductsByCategory(int categoryId).
This is the part I don't like. I will try to explain why, based on my knowledge and experience of the Repository pattern.
The Repository has a responsibility to return entities. What Rob does is return an IQueryable object: a query, nothing else. So his Repository basically doesn't return any entities. It's only when he calls ToList() that the query is executed and he gets the entities, but it's the object returned from the Repository that gives him the entities, not the Repository itself. For me the Repository he uses is sort of meaningless; it only works as a query provider, not as a Repository according to the definition of the Repository Pattern.
“The Repository will delegate to the appropriate infrastructure services to get the job done. Encapsulating the mechanisms of storage, retrieval and query is the most basic feature of a Repository implementation”
“With a Repository, client code constructs the criteria and then passes them to the Repository, asking it to select those of its objects that match. From the client code’s perspective, there’s no notion of query “execution”; rather there’s the selection of appropriate object through the “satisfaction” of the query’s specification.”
“Most common queries should also be hard coded to the Repositories as methods.”
Source: PoEAA [Fowler] and DDD [Evans]
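To make the quotes above a bit more concrete, here is a minimal sketch of how I read them: the client builds the criteria, the Repository selects the matching entities, and a common query is hard coded as a method. The names Product, FindAll and InMemoryProductRepository are my own assumptions for illustration (they are not from Rob's code or from the books), and an in-memory list stands in for the infrastructure service.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, for illustration only.
public class Product
{
    public int Id { get; set; }
    public int CategoryId { get; set; }
    public string Name { get; set; }
}

public interface IProductRepository
{
    // A common query, hard coded as a method.
    IList<Product> GetProductsByCategory(int categoryId);

    // The client builds the criteria; the Repository selects matching entities.
    IList<Product> FindAll(Func<Product, bool> criteria);
}

// An in-memory list stands in for the real infrastructure service (assumption).
public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products;

    public InMemoryProductRepository(IEnumerable<Product> products)
    {
        _products = products.ToList();
    }

    public IList<Product> GetProductsByCategory(int categoryId)
    {
        return FindAll(p => p.CategoryId == categoryId);
    }

    public IList<Product> FindAll(Func<Product, bool> criteria)
    {
        // The query is executed here; callers only ever see entities.
        return _products.Where(criteria).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var repository = new InMemoryProductRepository(new[]
        {
            new Product { Id = 1, CategoryId = 1, Name = "Skateboard" },
            new Product { Id = 2, CategoryId = 2, Name = "Helmet" },
            new Product { Id = 3, CategoryId = 1, Name = "Wheels" }
        });

        Console.WriteLine(repository.GetProductsByCategory(1).Count);              // 2
        Console.WriteLine(repository.FindAll(p => p.Name.StartsWith("H")).Count);  // 1
    }
}
```

With LINQ to SQL I would probably take an Expression<Func<Product, bool>> instead of a Func, so the criteria can be translated to SQL rather than evaluated in memory, but the idea is the same: the caller never sees a query "execution", only entities.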
The Repository interface I would have used, if I were to slavishly follow the definition of the Repository pattern, is:
public interface IProductRepository
{
    //...
    IList<Product> GetProductsByCategory(int categoryID);
    IList<Product> GetProducts();
}
Which infrastructure service the Repository should use is something I will decide later in my project (this is also something Rob mentions in his screencasts). First I will make sure my Domain Model is correct; then I decide, based on my model, which infrastructure service I should use to persist it. The way I persist my model is something I am probably never going to change. To follow YAGNI, which is an important part of working Agile, I shouldn't write code that makes it easy to replace the infrastructure service a Repository uses just because I think it may change or I may need that later. But if I decide to use LINQ to SQL, the implementation of my Repository will probably look like this:
public class ProductRepository : IProductRepository
{
    public IList<Product> GetProductsByCategory(int categoryID)
    {
        // The context only lives for the duration of this call.
        using (MyDataContext dataContext = new MyDataContext())
        {
            // ToList() executes the query before the context is disposed.
            return (from p in dataContext.Products
                    where p.CategoryId == categoryID
                    select p).ToList();
        }
    }
}
Something to observe here is that I dispose the DataContext after I execute my query. By doing that I lose the change tracking of my entities, my Unit of Work and also my Identity Map, which are all handled by the context. But since I write this code in an Agile way, I don't need them at the moment.
My Mock object for the ProductRepository would probably look like this:
public class MockProductRepository : IProductRepository
{
    public IList<Product> GetProducts()
    {
        var products = new List<Product>();
        for (int i = 0; i < 10; i++)
            products.Add(new Product(.....));
        return products;
    }

    public IList<Product> GetProductsByCategory(int categoryID)
    {
        var products = this.GetProducts();
        // ToList() rather than Single(): the method returns a list of matches.
        return (from p in products
                where p.CategoryId == categoryID
                select p).ToList();
    }
    //...
}
If we return an IQueryable<> instead of an IList<> from our Repository and decide to use LINQ to SQL, there is something we need to keep in mind. Correct me if I'm wrong, but the GetTable method used by LINQ to SQL returns a Table<> object, which implements the IQueryable<> interface. The Table<> object holds a reference to the DataContext, so a call to ToList() on the IQueryable<> requires the context to still be alive. That means the following code will not work, because the context is already disposed when the caller executes the query:
public class ProductRepository : IProductRepository
{
    public IQueryable<Product> GetProductsByCategory(int categoryID)
    {
        using (MyDataContext dataContext = new MyDataContext())
        {
            return dataContext.Products;
        }
    }
}
To make the IQueryable<> version work, the implementation of the Repository can for example look like this:
public class ProductRepository : IProductRepository
{
    private MyDataContext dataContext = new MyDataContext();

    public IQueryable<Product> GetProductsByCategory(int categoryID)
    {
        return dataContext.Products;
    }
}
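The reason the code above fails is deferred execution: the query runs when the caller enumerates it, not when the Repository returns it. Here is a small sketch that shows the effect without a database. FakeDataContext is a hypothetical stand-in that I made up for illustration; it is not the real LINQ to SQL DataContext, although as far as I know the real one also refuses to run a query after it has been disposed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a LINQ to SQL DataContext; not the real API.
public class FakeDataContext : IDisposable
{
    private bool _disposed;
    private readonly int[] _rows = { 1, 2, 3 };

    // Deferred: nothing is read until the caller enumerates the result.
    public IQueryable<int> ProductIds
    {
        get { return Read().AsQueryable(); }
    }

    private IEnumerable<int> Read()
    {
        // An iterator body only starts running on the first MoveNext() call.
        if (_disposed) throw new ObjectDisposedException("FakeDataContext");
        foreach (var row in _rows)
            yield return row;
    }

    public void Dispose() { _disposed = true; }
}

public static class Repository
{
    // Mirrors the broken version: the context is disposed before the query runs.
    public static IQueryable<int> GetProductIdsBroken()
    {
        using (var context = new FakeDataContext())
        {
            return context.ProductIds;
        }
    }

    // Executes inside the using block, so only results leave the method.
    public static IList<int> GetProductIdsMaterialized()
    {
        using (var context = new FakeDataContext())
        {
            return context.ProductIds.ToList();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(Repository.GetProductIdsMaterialized().Count); // 3

        try
        {
            Repository.GetProductIdsBroken().ToList();
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("Query ran after the context was disposed.");
        }
    }
}
```

So the choice is between executing the query inside the Repository while the context is alive, or keeping the context alive for as long as the returned IQueryable<> can be used.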
What does this code really do? Well, at the moment it only exposes Table<> objects and serves more as a query provider than as a Repository.
If we keep this implementation we need to make sure the DataContext gets disposed, so we don't add too many unnecessary entities to the Identity Map etc., right!? This is something the Service layer needs to do if we use the solution Rob uses in his project. If we don't want to reuse the same context for all methods in our Repository, we can use the following implementation:
public class ProductRepository : IProductRepository
{
    public IQueryable<Product> GetProductsByCategory(int categoryID)
    {
        MyDataContext dataContext = new MyDataContext();
        return dataContext.Products;
    }
}
The problem here is that each call to one of the Repository's methods creates a new instance of the DataContext in memory, and each instance has its own Identity Map etc. Those features can probably be turned off so there won't be unnecessary copies of entities in memory. But I still assume we have more things to worry about when returning an IQueryable<>, maybe not at an early stage but later. Most of them can be avoided by not returning the IQueryable<>.
One last thing that I'm not a fan of is the following code:
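If the Repository does return an IQueryable<>, one way to keep the lifetime of the context under control is to let the Repository own a single context and let the Service layer dispose the Repository when it is done. The sketch below is only my assumption of how that could look, not Rob's implementation; FakeDataContext, the IDisposable ProductRepository and the factory delegate are all hypothetical names, and strings stand in for Product entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a DataContext; it only records whether it was disposed.
public class FakeDataContext : IDisposable
{
    public bool IsDisposed { get; private set; }

    public IQueryable<string> Products
    {
        get { return new[] { "Skateboard", "Helmet" }.AsQueryable(); }
    }

    public void Dispose() { IsDisposed = true; }
}

// The Repository owns one context and disposes it together with itself.
public class ProductRepository : IDisposable
{
    private readonly FakeDataContext _context;

    public ProductRepository(FakeDataContext context)
    {
        _context = context;
    }

    public IQueryable<string> GetProducts()
    {
        return _context.Products;
    }

    public void Dispose()
    {
        _context.Dispose();
    }
}

// The Service layer decides how long the Repository (and its context) lives.
public class ProductService
{
    private readonly Func<ProductRepository> _createRepository;

    public ProductService(Func<ProductRepository> createRepository)
    {
        _createRepository = createRepository;
    }

    public IList<string> GetProducts()
    {
        using (var repository = _createRepository())
        {
            // Execute the query while the context is still alive.
            return repository.GetProducts().ToList();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var context = new FakeDataContext();
        var service = new ProductService(() => new ProductRepository(context));

        Console.WriteLine(service.GetProducts().Count); // 2
        Console.WriteLine(context.IsDisposed);          // True
    }
}
```

It works, but notice how much of the context's lifetime now leaks into the Service layer. That is exactly the kind of concern I would rather keep inside the Repository.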
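return _repository.GetProducts().WithId(10).ToList();
It breaks the Law of Demeter, because the caller talks to the object returned by the Repository instead of to the Repository itself. If we instead make a call like this: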
return _repository.GetProductsWithId(10);
we don't break the law, and this is the kind of method a Repository should have if querying a product by its id is a common query we need.
No one says that the Repository pattern Fowler and Evans talk about is a silver bullet, and Rob had a good point when he told me:
“One thing I’ll suggest is that with a new feature set (.NET 3.5) comes some new ways of doing things.”
Even within the computer world there is evolution, and we shouldn't be afraid of change or of testing new ways to solve things.
It will be interesting to see what the final version of Rob's application will look like. Maybe I will change my mind, or his implementation will change ;)