May 2011 - Posts

MongoDB is by far one of the most well-known and powerful document databases created in the open source community. Its simplicity is another factor that helps adoption a lot, as you don’t need to know much to start using it.

ELMAH, which stands for Error Logging Modules and Handlers, is another famous open source project in the .NET world. Almost every ASP.NET developer on the planet is already familiar with the project and what you can do with it. If you haven’t heard of it yet, it is an extensible framework that provides an application-wide error logging facility for ASP.NET applications.

MongoDB represents a group of documents as a collection. As an analogy with relational databases, a document could be seen as a row, and a collection as a table. The main difference is that collections are schema-free and can store any kind of document, and a document can have any structure.

Collections are usually created dynamically and automatically grow in size to fit additional data. However, there is a specific collection type called a “capped collection”, which is created in advance and is fixed in size. A capped collection behaves like a circular queue: if the collection runs out of space, the oldest documents are deleted and the new ones take their place. Documents in this type of collection cannot be moved or deleted, which makes insertions extremely fast. There is no need to allocate additional space or search through a free list to find the right place to put a document. The inserted document can always be placed directly at the tail of the collection and overwrite old documents if needed.

This makes capped collections a great fit for use cases like logging. Having said that, capped collections in MongoDB are also a great candidate for being used in a logging provider for ELMAH. 

ELMAH organizes log entries per application, so I thought it would be a good idea to use the same approach in MongoDB and have separate collections per application. For this implementation, I used mongodb-csharp.

When a new entry needs to be persisted in the database, I first check whether the collection exists and create it on the fly if it does not.

using (var mongo = new Mongo(_connectionString))
{
    mongo.Connect();
 
    var master = mongo.GetDatabase("master");
 
    IMongoCollection collection = null;
 
    if (!master.GetCollectionNames().Any(collectionName => collectionName.EndsWith(ApplicationName)))
    {
        // Create the capped collection in advance, fixed in entry count and size
        var options = new Document();
        options.Add("capped", true);
        options.Add("max", _maxEntriesCount);
        options.Add("size", _defaultCollectionSize);

        master.MetaData.CreateCollection(ApplicationName, options);
    
        // Index the "id" field, which ELMAH uses to look up specific entries
        var indexes = new Document();
        indexes.Add("id", 1);
 
        collection = master.GetCollection(ApplicationName);
        collection.MetaData.Indexes.Add("id", indexes);
    }

As you can see, that’s something really easy to accomplish with a few lines of code. I also created an index for the “id” property, which is the one that ELMAH uses for searching specific entries.

Once the collection is created, the ELMAH log entries need to be transformed into documents and stored in the collection, so I wrote a small helper for doing that.

var document = ErrorDocument.EncodeDocument(error);
document.Add("id", id);
 
collection.Save(document);
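
The helper itself is not shown above, but here is a rough sketch of what it might look like, assuming the mongodb-csharp Document type and the standard ELMAH Error properties (the actual ErrorDocument class in the download may store additional fields, such as server variables or form data):

public static class ErrorDocument
{
    // Maps the ELMAH error to a MongoDB document; the field names are illustrative.
    public static Document EncodeDocument(Error error)
    {
        var document = new Document();
        document.Add("host", error.HostName);
        document.Add("type", error.Type);
        document.Add("message", error.Message);
        document.Add("source", error.Source);
        document.Add("detail", error.Detail);
        document.Add("user", error.User);
        document.Add("time", error.Time);
        document.Add("statusCode", error.StatusCode);

        return document;
    }

    // Rebuilds the ELMAH error from a previously stored document.
    public static Error DecodeError(Document document)
    {
        var error = new Error();
        error.HostName = (string)document["host"];
        error.Type = (string)document["type"];
        error.Message = (string)document["message"];
        error.Source = (string)document["source"];
        error.Detail = (string)document["detail"];
        error.User = (string)document["user"];
        error.Time = (DateTime)document["time"];
        error.StatusCode = (int)document["statusCode"];

        return error;
    }
}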

Getting the documents from the collection is also very straightforward, as the MongoDB API already supports paging, which is something the ELMAH providers must offer.

mongo.Connect();
 
var master = mongo.GetDatabase("master");
 
var collection = master.GetCollection(ApplicationName);
var documents = collection.FindAll()
    .Skip(pageIndex * pageSize)
    .Limit(pageSize)
    .Documents;
 
foreach (var document in documents)
{
    var errorLog = ErrorDocument.DecodeError(document);
    errorEntryList.Add(new ErrorLogEntry(this, (string)document["id"], errorLog));
}
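
ELMAH also requires looking up a single entry by id through the GetError method. That can be sketched along these lines, assuming FindOne with a spec document is available in the driver:

var spec = new Document();
spec.Add("id", id);

// Look up the single document matching the ELMAH entry id.
var document = collection.FindOne(spec);
if (document == null)
    return null;

return new ErrorLogEntry(this, id, ErrorDocument.DecodeError(document));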

You can use this provider in any existing ASP.NET web application by adding this configuration to define the provider and the connection string to an existing MongoDB instance.

<elmah>
    <errorLog type="Elmah.MongoDb.MongoDbErrorLog, Elmah.MongoDb" connectionStringName="ELMAH.MongoDB" />
</elmah>
<connectionStrings>
    <add name="ELMAH.MongoDB" connectionString="Server=localhost:27017"/>
</connectionStrings>

The code is available to download from here. (Don’t forget to download the MongoDB drivers to compile the code)

Posted by cibrax

One of the main features that SO-Aware provides is the central repository for storing the service artifacts (WSDL, schemas, bindings) and configuration that any organization generates. This central repository is completely exposed as an OData service that third-party applications and tools can easily consume over Http.

However, the initial SO-Aware release lacked support for extending the standard artifacts with custom metadata. For example, if you wanted to associate some external documentation with your WCF bindings or set the developer or owner for a specific service, that was not supported out of the box.

Based on all the feedback we received from customers, we decided to include this feature in the latest version. From the UI standpoint, you will see a new tab “Custom Metadata” in every resource.

[Screenshot: the new “Custom Metadata” tab]

The custom metadata that you can associate with existing resources is represented by key/value pairs (Metadata Term and Value). For example, the metadata term can be “Owner” and the value “Pablo Cibraro”.

SO-Aware will also build a dictionary of the available “Metadata” terms that every user can reuse in the repository. These metadata terms can also be queried and modified through the standard OData API that the repository exposes.
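
To give a rough idea of what that looks like on the wire, the terms can be retrieved with a plain Http GET against the repository’s OData endpoint. The service URL and the “MetadataTerms” entity set name below are hypothetical, so check the actual feed exposed by your SO-Aware installation:

var client = new WebClient();

// "MetadataTerms" is a hypothetical entity set name; the service returns a standard Atom/XML feed.
var feed = client.DownloadString("http://localhost/SOAware/ServiceRepository.svc/MetadataTerms");

Console.WriteLine(feed);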

Enjoy!

Posted by cibrax

SOAP services are transport agnostic by nature, so they cannot rely on specific transport features. Http is a great example of this, as SOAP services make poor use of Http as an application protocol. Many of the Http constraints are simply ignored, Http headers are not used at all, messages are not self-descriptive (you cannot easily infer what a message does by looking at its content), the uniform interface is not used either as everything goes to the server as a POST, and the list keeps growing. This makes it impossible to leverage the existing web architecture and use intermediaries for caching results.

As a consequence, the only viable alternative is to provide caching as part of the service implementation. Caching at this level might be a good option for improving service performance when bottlenecks are detected and caused by IO operations with long delays (calls to databases, legacy services, etc.) or CPU-intensive code. In both scenarios, you might want to keep the results in memory as much as you can to reuse them in the service implementation. Here is where a distributed cache solution like AppFabric Caching, NCache or MemCached makes a lot of sense. Local in-memory caches such as the Enterprise Library caching block might be another option, but only if you are not running services in a farm scenario.
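
Just to illustrate the idea, here is a minimal sketch of caching an expensive result inside the service implementation. It uses the in-process MemoryCache from System.Runtime.Caching as a stand-in for whatever cache you pick (AppFabric, NCache, MemCached), and Product and GetProductsFromDatabase are hypothetical placeholders for your own types and data access code:

public class CatalogService
{
    static readonly ObjectCache Cache = MemoryCache.Default;

    public List<Product> GetProducts(string category)
    {
        var key = "products_" + category;

        // Reuse the cached copy if we already paid the cost of building it.
        var products = Cache.Get(key) as List<Product>;
        if (products != null)
            return products;

        // Expensive IO call (database, legacy service, etc.) we want to avoid repeating.
        products = GetProductsFromDatabase(category);

        // Keep the results in memory for a while; the expiration policy is up to you.
        Cache.Add(key, products, DateTimeOffset.UtcNow.AddMinutes(5));

        return products;
    }

    List<Product> GetProductsFromDatabase(string category)
    {
        // Placeholder for the real data access code.
        return new List<Product>();
    }
}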

Client applications might opt to cache responses from the services too, but there is no explicit mechanism for negotiating the lifetime of the cached copies between clients and services, making it necessary to use out-of-band information.

[Diagram: caching in the service implementation for SOAP services]

On the other hand, REST services can reuse all the caching infrastructure available on the web by leveraging Http as an application protocol. Http already specifies headers for cache control and a process to revalidate cached copies that any intermediary can use (no need for out-of-band information). Intermediaries in this context are usually represented by local caches like the one you might find in a browser, proxies or reverse proxies. I described the implementation details here.

[Diagram: caching with Http intermediaries for REST services]
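
As a small sketch of what this means on the service side when hosted in ASP.NET, emitting the caching headers is just a matter of setting the response cache policy; the contentVersion value used as the ETag is a hypothetical placeholder:

var response = HttpContext.Current.Response;

// Allow any intermediary (browser cache, proxy, reverse proxy) to cache this representation.
response.Cache.SetCacheability(HttpCacheability.Public);
response.Cache.SetMaxAge(TimeSpan.FromMinutes(5));

// Give caches a validator so they can revalidate their copy with a conditional GET.
response.Cache.SetETag("\"" + contentVersion + "\"");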

REST services can still use a caching layer in the service implementation, as discussed for SOAP services, or some output caching mechanism in the web server (the ASP.NET cache, for example), but the good thing is that the web can be your friend in this matter too.

Posted by cibrax

There are multiple factors or requirements that might lead you to refactor some functionality into services. Here are some examples,

  • You have explicit requirements to distribute work across machines. This is a typical example with smart client applications (i.e. Silverlight apps), which run a thin UI layer and have all the backend logic in services.
  • You need to run code remotely on a specific machine, so a good choice is to expose that code as a service.
  • You need to expose certain functionality of your system to other parties in a loosely coupled manner. Other parties in this context could mean anything, such as other applications in the same organization, third-party client applications, etc.
  • You have some CPU-intensive code that you might want to execute on another machine to do some load balancing.

If you don’t have any of these requirements, you might want to think twice before coming up with the crazy idea of building services just because you think it is cool. The idea of building services must be born from explicit requirements in the system you are creating, and it must be planned carefully.

A common problem I usually see is that many developers or architects opt to build services to follow old principles of work distribution like the ones you find in N-Tier applications, so they end up with tons of services that only make sense in the context of the application they are building, but not as a unit of reuse. In many cases, these services are UI driven, which means they are tied to the UI workflow and do not have a well-defined interface.

Here is where I recommend sticking to Martin Fowler’s first law of distributed object design: “Don’t distribute your objects”. Using a separate layer of services for your business logic to solve scalability issues doesn’t necessarily make sense these days. That’s a problem you can usually solve by scaling horizontally with additional web servers to support more user requests at the same time. If you still need to reuse some logic from different places within your application, nothing prevents you from moving all that logic into common libraries that run within the same process.

Services, when used incorrectly, only add more complexity to the solution you are creating. You have an additional layer to maintain and configure, and configuring services is not trivial. Unless you use a central repository, you will probably run into configuration hell, with configuration files everywhere. This generally leads to maintainability issues in the long run.

In conclusion, avoid distributing work with services unless you really need it.

Posted by cibrax | 3 comment(s)

The HttpMessageHandlerFactory shipped out of the box in the WCF Web APIs Preview 4 can only construct channel instances with a single constructor argument representing the inner channel in the pipeline.

public abstract class DelegatingChannel : HttpMessageChannel
{
    protected DelegatingChannel(HttpMessageChannel innerChannel);

This is the constructor that the factory always tries to invoke by default. However, this approach does not work well when you need to inject dependencies into the channels to do some real work and make the channel more testable.

Going back to the example I showed a couple of weeks ago for doing API key validation, the key validator was an external dependency I had to inject into my channel.

public class ApiKeyVerificationChannel : DelegatingChannel
{
    public const string KeyHeaderName = "X-AuthKey";
 
    IKeyVerifier keyVerifier;
 
    public ApiKeyVerificationChannel(HttpMessageChannel innerChannel)
        : this(innerChannel, new KeyVerifier())
    {
    }
 
    public ApiKeyVerificationChannel(HttpMessageChannel innerChannel, IKeyVerifier keyVerifier)
        : base(innerChannel)
    {
        this.keyVerifier = keyVerifier;
    }

At that time, I just used a poor man’s DI solution for passing that dependency to the channel. The constructor invoked by the factory was creating an instance of the concrete implementation and passing it to the constructor receiving the dependency as an interface.

The good thing is that the HttpMessageHandlerFactory can be extended to support other strategies, so that’s what I am going to do in this post: show how the factory can be extended to inject any number of external dependencies into the channels.

public class CustomHttpMessageHandlerFactory : HttpMessageHandlerFactory
{
    Func<HttpMessageChannel, DelegatingChannel>[] factories;
 
    public CustomHttpMessageHandlerFactory()
        : base()
    {
    }
 
    public CustomHttpMessageHandlerFactory(params Func<HttpMessageChannel, DelegatingChannel>[] factories)
    {
        this.factories = factories;
    }
 
    protected override HttpMessageChannel OnCreate(HttpMessageChannel innerChannel)
    {
        if (innerChannel == null)
        {
            throw new ArgumentNullException("innerChannel");
        }
 
        HttpMessageChannel pipeline = innerChannel;
 
        foreach (var factory in this.factories)
        {
            // Chain each new channel on top of the previous one in the pipeline.
            pipeline = factory(pipeline);
        }
 
        return pipeline;
    }
}

The trick here is to pass an array of Func delegates representing factories for the channels as part of the constructor, and override the OnCreate method to instantiate those channels with the delegates, chaining each new channel on top of the previous one.

This makes it possible to do things like the following to instantiate the channels as part of the configuration,

var keyVerifier = new KeyVerifier();
 
var config = HttpHostConfiguration.Create()
    .SetMessageHandlerFactory(new CustomHttpMessageHandlerFactory(innerChannel =>
        new ApiKeyVerificationChannel(innerChannel, keyVerifier)));
 
// setting up contacts services
RouteTable.Routes.MapServiceRoute<ContactsResource>("contacts", config);

As you can see, I am passing a lambda expression for instantiating the ApiKeyVerificationChannel with an existing IKeyVerifier instance. No need to use the poor man’s DI approach anymore. You can also extend this example to resolve things like the IKeyVerifier from a service container, but I will leave that part as homework for the reader.

Posted by cibrax | 1 comment(s)

The repository pattern became popular a couple of years ago at the hands of Eric Evans with the DDD (Domain-Driven Design) movement and Martin Fowler with his catalog of enterprise application patterns.

The idea of having persistence-ignorant entities inferred from the business domain, and a repository as a simple intermediary for abstracting the underlying persistence storage details, had great acceptance in the development community. No matter whether you decided to stick to DDD or not, the repository pattern brought incredible value to the table by decoupling your business domain code from the details of the database access code (or any other underlying storage). This was particularly important for unit testing your business domain classes. I personally don’t think much about the case of completely replacing the underlying storage, from a database to an Http service or a file for example, because those scenarios are not very common and usually involve more changes than just switching the repository layer.

The idea of using an abstraction layer at that level was not new at all, but it has been mutating into different names and shapes over time. I am pretty sure many of you still remember the data access layer in the old times of COM, when the N-Tier architecture was a popular idea pushed by Microsoft. Same idea, but different name, design and implementation details.

However, some time after the repository became popular, a great game changer was introduced in the .NET ecosystem with something we all know as Linq.

Linq already provides an abstraction layer with common query capabilities on top of any data source, so why bother with yet another abstraction layer like a repository? It turns out that the entry point to the query system in Linq, IQueryable, is not enough most of the time, as it only provides a read-only view with query support over the data source. We also need operations to persist changes in the repository, and that’s not something we get out of the box with IQueryable.

Microsoft, on its side, has provided some popular Linq implementations as part of the .NET Framework to address particular scenarios or common problems: object-relational mapping for databases with Entity Framework, XML management with Linq to XML, or data management over Http with WCF Data Services, to name a few.

An initial problem with some of these implementations is that they didn’t make certain abstractions explicit, making it really hard to replicate their behavior with mocks or stubs as part of a unit test. A typical example was the “eager loading” capability of EF in the initial versions. It was not possible to use lazy loading for associations, and the Include method for loading them was not something you could easily abstract as part of the repository. While this issue was partially addressed with POCOs, only the latest EF Code First bits make a unit-testable repository really possible.

I agree with Daniel that a repository should only provide IQueryable for the aggregate roots and methods for persisting changes in the underlying store. This is how the ideal repository abstraction should look to me,

public interface IRepository
{
    void SaveChanges();
 
    void Add<T>(T entity) where T : IEntity;
 
    void Update<T>(T entity) where T : IEntity;
 
    void Delete<T>(long id) where T : IEntity;
 
    IQueryable<OneEntity> Entities { get; }
 
    // Other aggregate roots.
}
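
A quick sketch of how this abstraction might be consumed follows; OneEntity, its CreatedOn and Name properties, and the CreateRepository factory are just placeholders for illustration:

IRepository repository = CreateRepository(); // e.g. an EF-backed implementation

// Queries are composed directly over the exposed IQueryable.
var recentEntities = repository.Entities
    .Where(e => e.CreatedOn > DateTime.UtcNow.AddDays(-7))
    .OrderByDescending(e => e.CreatedOn)
    .ToList();

// Changes are tracked through the repository and submitted as a single unit of work.
repository.Add(new OneEntity { Name = "Sample" });
repository.SaveChanges();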

Before you say anything: yes, this is not the typical repository pattern, because it also uses the unit of work pattern to submit the changes. To be honest, I don’t mind having the unit of work as part of this abstraction, because it makes the repository easier to test and use too. However, there are some disadvantages to exposing IQueryable directly in your repository.

  • You cannot easily reuse queries. At that point, you might want to have some pre-defined extension methods for creating IQueryable instances with all the filters already set (see the sketch after this list). If you are still not a big fan of this approach, you might want to switch to one based on query specs, such as LinqSpecs.
  • Other developers can alter the original purpose of the queries, or do the wrong thing with them. I think this is not a problem with the approach itself; it is mostly related to “developer protection”.
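
Here is the kind of pre-defined extension method I mean for the first point; OneEntity and its CreatedOn and Owner properties are the same kind of placeholders used above:

public static class OneEntityQueries
{
    // Each method captures a filter once so callers don't have to repeat it.
    public static IQueryable<OneEntity> CreatedSince(this IQueryable<OneEntity> entities, DateTime date)
    {
        return entities.Where(e => e.CreatedOn >= date);
    }

    public static IQueryable<OneEntity> OwnedBy(this IQueryable<OneEntity> entities, string owner)
    {
        return entities.Where(e => e.Owner == owner);
    }
}

// Usage: the result is still IQueryable, so callers can keep composing before executing.
var recentlyOwned = repository.Entities
    .CreatedSince(DateTime.UtcNow.AddDays(-30))
    .OwnedBy("cibrax")
    .ToList();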

What about methods for filtering data that receive an expression like this one,

IQueryable<T> Find(Expression<Func<T,bool>> predicate); 

Unless Microsoft provides good support for comparing expression trees in the .NET Framework, I would stay away from that approach. I made the mistake of using it myself in the past for a repository over WCF Data Services. What I learned from that experience is that most mocking frameworks don’t support that approach very well, so it only works some of the time.

In conclusion, try to provide a single entry point to your unit of work and expose all the aggregate roots with IQueryable as part of it whenever it is possible.

Posted by cibrax | 2 comment(s)

We are happy to announce a new program, “Technology Updates”, as part of our plans to help developers and IT people adopt new technologies.

As part of this program, we will explore and debate many of the emerging technologies through different webinars. The first two webinars announced are:

  • NOSQL Databases for the .Net Developer: What’s the fuss all about? (May 24 2011, 2:00pm - 3:00pm EST)
  • I Like IPHONE and ANDROID but I am a .Net Developer: Developing .Net Applications for IPHONE and ANDROID (Jun 21 2011, 2:00pm - 3:00pm EST)

You can read more about these webinars on our brand new website! The webinars are completely free and open to anyone.

Posted by cibrax