April 2011 - Posts

As I discussed in my previous post, web caching relies on specific headers that you need to use correctly in your services. That’s an Http application protocol thing, and something you can easily use in any application framework that treats Http as a first-class citizen. This means that you don’t need to implement anything fancy or rely exclusively on a specific caching technology or component for doing output caching (e.g. the ASP.NET Cache).

You only need to worry about using the Expires and Cache-Control headers correctly and optionally setting up a validation process with conditional gets. If the application framework automatically provides this for you under the hood, that’s definitely a good thing too.

I promised to write a more detailed post with some extensions for WCF, but Jose beat me to it. He wrote an excellent post on how to implement some generic extensions for WCF Web Apis and, more importantly, how to configure intermediaries like local caches in the browser or reverse proxies (Squid) to cache the data for us.

Posted by cibrax

One of the beauties of using Http as an application protocol when building Web Apis is that you can reuse all the infrastructure available on the web to make your services scalable to the extreme. Caching is one of the fundamental aspects that makes scalability possible on the web today, because it provides the following benefits when it is implemented correctly:

  1. Decreases load on the servers where the content was originally generated.
  2. Decreases latency between clients and servers. Clients can get responses much faster.
  3. Saves some bandwidth. Fewer network hops are required, as the content might be found in some caching intermediary before the request reaches the origin server.

Implementing caching in a Web Api involves two steps:

  1. Using the right Http headers to instruct intermediaries (e.g. proxies, reverse proxies, local caches) to cache the responses.
  2. Implementing conditional GETs so the intermediaries can revalidate their cached copies of the data when they become stale (that is, when they have expired and are no longer fresh).

The first part requires either the “Expires” header or the “Cache-Control” header. Expires is useful for expressing an absolute expiration date, for example "this piece of data expires tomorrow at 2 pm". Cache-Control, on the other hand, gives more granular control: it can express a relative expiration (a maximum age counted from the moment the response was generated) and also who is allowed to cache the data.

The following two examples illustrate the use of the “Expires” header and the “Cache-Control” header.

[WebGet(UriTemplate = "")]
public HttpResponseMessage Get()
{
    var response = new HttpResponseMessage();
 
    response.Content = new ObjectContent<List<Contact>>(contacts);
    response.Content.Headers.Expires = DateTime.Now.AddHours(1);
 
    return response;
}

 

[WebGet(UriTemplate = "")]
public HttpResponseMessage Get()
{
    var response = new HttpResponseMessage();
 
    response.Content = new ObjectContent<List<Contact>>(contacts);
    response.Headers.CacheControl = new CacheControlHeaderValue();
    response.Headers.CacheControl.Public = true;
    response.Headers.CacheControl.MaxAge = TimeSpan.FromHours(1);
 
    return response;
}

Both examples return a list of contacts as the response, but the first one sets an absolute expiration one hour from now using the “Expires” header, and the second one sets a relative expiration of one hour using the max-age directive of the “Cache-Control” header. I also set the Public flag to true to specify that the data can be cached by any intermediary.
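To make the difference between the two headers a bit more concrete, here is a minimal sketch of how a cache might work out for how long a response stays fresh. It is not taken from any particular proxy: Freshness and GetFreshnessLifetime are hypothetical names, and the only library types used are the standard ones in System.Net.Http. It relies on the rule that max-age takes precedence over Expires when both are present.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper, for illustration only.
public static class Freshness
{
    // Returns how long a cache may serve the response without revalidating it,
    // or null when the response carries no explicit expiration information.
    public static TimeSpan? GetFreshnessLifetime(HttpResponseMessage response)
    {
        // Cache-Control: max-age is a relative lifetime and wins over Expires.
        CacheControlHeaderValue cacheControl = response.Headers.CacheControl;
        if (cacheControl != null && cacheControl.MaxAge.HasValue)
        {
            return cacheControl.MaxAge.Value;
        }

        // Expires is an absolute date, so the lifetime is Expires minus Date.
        DateTimeOffset? expires = response.Content?.Headers.Expires;
        DateTimeOffset? date = response.Headers.Date;
        if (expires.HasValue && date.HasValue)
        {
            return expires.Value - date.Value;
        }

        return null;
    }
}
```

For the two service operations above, a cache following this logic would end up with roughly the same one hour lifetime in both cases; the difference is only in how the service expresses it.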

According to the Http specification, any Http GET request should be idempotent and safe. In this context, these principles have the following meaning:

  • Idempotence: Making the same request several times has the same effect as making it once, so one or more calls with identical requests should return the same resource representation (as long as the resource has not changed on the server between calls).
  • Safety: The only action produced by a GET request is to retrieve a resource representation; no side effects should be evident on the service side after executing the request. An example that violates this principle would be a GET request that also updates some info in the resource.

Based on those facts, caching intermediaries can also use a technique known as Conditional GET to revalidate a copy of the data that has become stale. An Http conditional GET involves the use of two special headers generated by the service, ETag and Last-Modified. An ETag is an opaque value that only the server knows how to recreate. It could represent anything, but it is usually a hash representing the resource version (it can be generated by hashing the whole representation content or just some parts of it, like the timestamp). The Last-Modified header, on the other hand, carries a datetime that the service can use to determine whether the resource has changed since the last time it was served.
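As a rough illustration of the hashing approach, the following sketch derives an ETag from the serialized representation. ETagGenerator and FromContent are hypothetical names, and SHA-256 is just one possible hash, picked here as an assumption. One detail worth noting: the EntityTagHeaderValue class in the System.Net.Http that ships with current .NET expects the tag to be wrapped in double quotes.

```csharp
using System;
using System.Net.Http.Headers;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only.
public static class ETagGenerator
{
    // Builds an ETag by hashing the whole representation content.
    public static EntityTagHeaderValue FromContent(string representation)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(representation));

            // Hex-encode the hash so it is safe to put in a header.
            string hex = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();

            // The entity tag must be a quoted string.
            return new EntityTagHeaderValue("\"" + hex + "\"");
        }
    }
}
```

The same content always produces the same tag, and any change in the content produces a different one, which is all a cache needs in order to revalidate its copy.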

The following example illustrates the request/response messages exchanged between the client and the service for a service that returns a list of contacts.

First request: the client does not have anything in the cache.

Request –>

GET http://localhost:8080/contacts

Host localhost
Accept */*

Response –>

Connection close
Date Thu, 02 Oct 2008 14:46:57 GMT
Server Microsoft-IIS/6.0
Content-Length 1212
Expires Sat, 01 Nov 2008 14:46:57 GMT
Last-Modified Mon, 29 Sep 2008 15:40:27 GMT
Etag a9331828c517ac5d97f93b3cfdbcc9bc
Content-Type text/xml

The process of validating the cached copy with the origin server involves sending a conditional GET with the following headers:

Request –>

GET http://localhost:8080/contacts

Host localhost
Accept */*
If-Modified-Since Mon, 29 Sep 2008 15:40:27 GMT
If-None-Match a9331828c517ac5d97f93b3cfdbcc9bc

With these two new headers, the server can now determine whether it has to serve the resource representation again or whether the intermediary can keep using the cached version. If the resource has not changed according to the values in those headers (If-Modified-Since is checked against Last-Modified, and If-None-Match against the ETag), the service can return the Http status code "304 Not Modified" with an empty body, which instructs the intermediary to keep the cached version. The example above shows both headers, but an intermediary only sends the ones it has a validator for: If-Modified-Since if the original response included Last-Modified, and If-None-Match if it included an ETag. When a request carries both, the current Http specification gives If-None-Match precedence.

[WebGet(UriTemplate = "")]
public HttpResponseMessage Get(HttpRequestMessage request)
{
    var response = new HttpResponseMessage();
 
    // Validator sent by the cache, if any (only the first one is checked here).
    var etag = request.Headers.IfNoneMatch.FirstOrDefault();
    
    if (etag != null && etag.Tag == "This is my ETag")
    {
        // The cached copy is still valid: answer 304 with an empty body.
        response.Content = new EmptyContent();
        response.StatusCode = HttpStatusCode.NotModified;
    }
    else
    {
        // No validator, or a stale one: send the full representation again.
        response.Content = new ObjectContent<List<Contact>>(contacts);
        response.Headers.CacheControl = new CacheControlHeaderValue();
        response.Headers.CacheControl.Public = true;
        response.Headers.CacheControl.MaxAge = TimeSpan.FromHours(1);
 
        response.Headers.ETag = new EntityTagHeaderValue("This is my ETag");
    }
 
    return response;
}

The example above shows how to implement conditional GETs with an “etag”. The etag in that example is hardcoded, something that is not going to happen in the real world, but I just want to show the control flow needed to implement this kind of logic. You can also implement conditional GETs with Last-Modified and If-Modified-Since, as shown below:

[WebGet(UriTemplate = "")]
public HttpResponseMessage Get(HttpRequestMessage request)
{
    var response = new HttpResponseMessage();
 
    if (request.Headers.IfModifiedSince.HasValue && request.Headers.IfModifiedSince.Value == contacts_timestamp)
    {
        response.Content = new EmptyContent();
        response.StatusCode = HttpStatusCode.NotModified;
    }
    else
    {
        response.Content = new ObjectContent<List<Contact>>(contacts);
        response.Content.Headers.Expires = DateTime.Now.AddHours(1);
 
        response.Content.Headers.LastModified = contacts_timestamp;
    }
 
    return response;
}

contacts_timestamp in that example is also a hardcoded date holding the last modification date of the returned list of contacts.
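The equality check above only matches when the cache echoes back exactly the timestamp the service holds. A slightly more forgiving variant, sketched here under a couple of assumptions, treats the cached copy as still valid whenever the resource was last modified at or before the If-Modified-Since date, and drops sub-second precision first because Http dates only carry whole seconds. ConditionalGet and IsNotModified are hypothetical helper names.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ConditionalGet
{
    // Returns true when the service can answer "304 Not Modified".
    public static bool IsNotModified(DateTimeOffset? ifModifiedSince, DateTimeOffset lastModified)
    {
        // Without the header this is a plain GET, so serve the full response.
        if (!ifModifiedSince.HasValue)
        {
            return false;
        }

        // Http dates have one-second resolution, so truncate the stored
        // timestamp before comparing it with the value sent by the cache.
        DateTimeOffset truncated =
            lastModified.AddTicks(-(lastModified.Ticks % TimeSpan.TicksPerSecond));

        return truncated <= ifModifiedSince.Value;
    }
}
```

In the operation above, the condition could then become a call such as ConditionalGet.IsNotModified(request.Headers.IfModifiedSince, contacts_timestamp), assuming contacts_timestamp is a DateTimeOffset.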

So far we have discussed how caching can be implemented in your WCF Web Apis so that the data can be cached by any existing intermediary. Next time I will show how this logic can be refactored into some generic extensions so we don’t need to repeat it in every single operation where we want to cache data.

Posted by cibrax

Another major improvement in this new WCF Web Api release is the introduction of a fluent API for configuring your WCF Web Apis. All the extensibility points available in the current bits are now exposed through this API, which makes them easy to discover.

Let’s discuss in detail what this new configuration API looks like and what extensibility points you have today.

var config = HttpHostConfiguration.Create().
        SetOperationHandlerFactory<ContactManagerOperationHandlerFactory>().
        SetInstanceFactory(new MefInstanceFactory(container)).
        SetMessageHandlerFactory(typeof(LoggingChannel), typeof(UriFormatExtensionMessageChannel)).
        SetErrorHandler<ContactManagerErrorHandler>();

HttpHostConfiguration is the entry point for the configuration builder and the class you need to use to create a new instance of IHttpHostConfigurationBuilder. As part of this configuration builder, you can find several factories for hooking up the different extensibility points.

An Operation Handler Factory is responsible for creating new instances of Media Type Formatters and Operation Handlers. A media type formatter knows how to handle specific media types by serializing and deserializing objects to and from a stream. Typical examples of a media type formatter are JsonMediaTypeFormatter for “application/json” or XmlMediaTypeFormatter for “application/xml”. An operation handler, on the other hand, is what we used to call an Http Processor in the previous bits, which I already discussed here. In a few words, it is something tied to specific operations that knows how to transform a set of input arguments into some outputs.

An Instance Factory is responsible for creating new service instances. This is the entry point where you can configure your favorite DI framework for creating instances or injecting dependencies into them. An IInstanceFactory (the interface that represents an Instance Factory) looks as follows:

public interface IInstanceFactory
{
    object GetInstance(Type serviceType, InstanceContext instanceContext, HttpRequestMessage request);
    void ReleaseInstance(InstanceContext instanceContext, object service);
}

The method names on this interface are self-descriptive. The GetInstance method is invoked by WCF to get an instance of a specific service type. The ReleaseInstance method is invoked to release or dispose of that instance once the service has been used.

A Message Handler Factory is responsible for creating new instances of Http Message Handlers. An Http Message Handler is a new interception mechanism that runs at channel level. I discussed Http Message Channels in detail here.

Finally, an Error Handler is what you can use to map a .NET exception to an Http Response Message. This is where you can set a friendly response message or a specific Http status code according to the received exception. The following code shows the sample error handler implementation that comes with the bits:

public class ContactManagerErrorHandler : HttpErrorHandler
{
   protected override bool OnHandleError(Exception error)
   {
       return false;
   }
 
   protected override System.Net.Http.HttpResponseMessage OnProvideResponse(Exception error)
   {
       var exception = (HttpResponseException)error;
       var response = exception.Response;
       response.ReasonPhrase = "[Handled]" + response.ReasonPhrase;
       return response;
   }
}

The resulting IHttpHostConfigurationBuilder instance can later be passed as an argument to the constructor of the WCF service host (HttpConfigurableServiceHost), or to the Service Route extension method if you are hosting your services in ASP.NET, as shown below:

RouteTable.Routes.MapServiceRoute<ContactResource>("Contact", config);
Posted by cibrax

The new WCF Web Apis Preview 4, released yesterday on wcf.codeplex.com, introduced a new extensibility point for intercepting messages at the channel level. The name of this new feature is “Http Message Channels”, and the good thing is that you no longer need to rely on the REST Starter Kit request interceptors for doing the same thing. Actually, an Http Message Channel is more useful, as you can intercept both request and response messages, and you also get an Http message with all the context information you might need rather than a generic WCF message, which usually requires some additional processing.

An Http Message Channel uses the new Task-Based Asynchronous pattern for implementing asynchronous work, which simplifies the implementation a lot. This is what a generic Http Message Channel looks like:

public class MyHttpChannel : DelegatingChannel
{
    public MyHttpChannel(HttpMessageChannel innerChannel)
        : base(innerChannel)
    {
    }
 
    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Do your work here, then pass the request on to the inner channel.
        return base.SendAsync(request, cancellationToken);
    }
}

You need to derive from the base class “DelegatingChannel” and override the method “SendAsync” to provide your implementation. You only receive the request message as an input argument, but you can call the inner channel in the stack to process the request and get an instance of the response message if you want to do some processing on the response as well.
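To show what processing the response on the way back could look like, here is a small sketch that calls the inner handler and then adds a header to the response. So that it can run without the preview assemblies, it uses DelegatingHandler from the System.Net.Http that ships with .NET, which has the same shape as the DelegatingChannel shown above; treating the two as equivalent is an assumption on my side. TimingHandler and the X-Elapsed-Ms header are made-up names for the example.

```csharp
using System.Diagnostics;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler: stamps every response with the time it took to produce.
public class TimingHandler : DelegatingHandler
{
    public TimingHandler(HttpMessageHandler innerHandler)
        : base(innerHandler)
    {
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Stopwatch watch = Stopwatch.StartNew();

        // Pass the request down the stack, then continue once the response is ready.
        return base.SendAsync(request, cancellationToken).ContinueWith(task =>
        {
            // task.Result rethrows any failure from the inner handler.
            HttpResponseMessage response = task.Result;
            response.Headers.Add("X-Elapsed-Ms", watch.ElapsedMilliseconds.ToString());
            return response;
        });
    }
}
```

The request is untouched here, but the same override is also the place to inspect or reject it before calling the inner handler.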

The following example shows a typical channel for validating an API key. The API key is passed in a custom header “X-AuthKey” as part of the request message:

public class ApiKeyVerificationChannel : DelegatingChannel
{
    public const string KeyHeaderName = "X-AuthKey";
 
    IKeyVerifier keyVerifier;
 
    public ApiKeyVerificationChannel(HttpMessageChannel innerChannel)
        : this(innerChannel, new KeyVerifier())
    {
    }
 
    public ApiKeyVerificationChannel(HttpMessageChannel innerChannel, IKeyVerifier keyVerifier)
        : base(innerChannel)
    {
        this.keyVerifier = keyVerifier;
    }
 
    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        IEnumerable<string> values = null;
        if (request.Headers.TryGetValues(KeyHeaderName, out values))
        {
            var key = values.First();
 
            if (this.keyVerifier.Verify(key))
            {
                return base.SendAsync(request, cancellationToken);
            }
        }
 
        return Task.Factory.StartNew(() =>
            {
                var response = new HttpResponseMessage(HttpStatusCode.Unauthorized, "You need to provide a valid key to consume these Apis");
                return response;
            });
    }
}

This implementation basically validates the received key in the http header and calls the inner channel for further processing of the request (base.SendAsync). However, if the key is not provided or it is invalid, a new response is returned and the execution is interrupted.

As you can see, this interception mechanism is also great for implementing common concerns like security that need to run well before the service is invoked.

Finally, a custom Http Message Channel can be injected into the WCF pipeline with the new fluent configuration interface, as shown below:

public class Global : System.Web.HttpApplication
{
    private void Application_Start(object sender, EventArgs e)
    {
        var config = HttpHostConfiguration.Create().
            SetMessageHandlerFactory(typeof(ApiKeyVerificationChannel));
 
        // setting up contacts services
        RouteTable.Routes.MapServiceRoute<ContactsResource>("contacts", config);
    }
}

The SetMessageHandlerFactory method receives either an array of “Type”s representing the channels or an instance of an HttpMessageHandlerFactory that knows how to instantiate new channels. You might want to extend the default HttpMessageHandlerFactory to use your favorite DI framework to resolve the channels.

Posted by cibrax