Implementing caching in your WCF Web APIs

One of the beauties of using HTTP as the application protocol when building Web APIs is that you can reuse all the infrastructure already available on the web to make your services extremely scalable. Caching is one of the fundamental mechanisms that makes that scalability possible, because it provides the following benefits when it is implemented correctly:

  1. Decreases load on the servers where the content was originally generated.
  2. Decreases latency between clients and servers. Clients can get responses much faster.
  3. Saves bandwidth. Fewer network hops are required, as the content might be found in a caching intermediary before the request reaches the origin server.

Implementing caching in a Web API involves two steps:

  1. Using the right HTTP headers to instruct intermediaries (e.g. proxies, reverse proxies, local caches) to cache the responses.
  2. Implementing conditional GETs so the intermediaries can revalidate their cached copies of the data once those copies become stale (that is, once they have expired and are no longer fresh).

The first part requires the use of either the “Expires” header or the “Cache-Control” header. Expires is useful for expressing an absolute expiration date, for example “this piece of data expires tomorrow at 2 pm”. Cache-Control, on the other hand, gives more granular control: it can express an expiration relative to the time of the response (max-age) and also who is allowed to cache the data.

The following two examples illustrate the use of the “Expires” header and the “Cache-Control” header.

[WebGet(UriTemplate = "")]
public HttpResponseMessage Get()
{
    var response = new HttpResponseMessage();
 
    response.Content = new ObjectContent<List<Contact>>(contacts);
    response.Content.Headers.Expires = DateTime.Now.AddHours(1);
 
    return response;
}

 

[WebGet(UriTemplate = "")]
public HttpResponseMessage Get()
{
    var response = new HttpResponseMessage();
 
    response.Content = new ObjectContent<List<Contact>>(contacts);
    response.Headers.CacheControl = new CacheControlHeaderValue();
    response.Headers.CacheControl.Public = true;
    response.Headers.CacheControl.MaxAge = TimeSpan.FromHours(1);
 
    return response;
}

Both examples return a list of contacts as the response. The first one sets an absolute expiration one hour from now using the “Expires” header, and the second one sets a relative expiration of one hour using the max-age directive of the “Cache-Control” header. I also set the Public flag to true to specify that the data can be cached by any intermediary.
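
To make the “who is allowed to cache the data” part more concrete, here is a minimal sketch that contrasts a public policy with a private one. It assumes the CacheControlHeaderValue type from System.Net.Http.Headers as it ships with .NET; the CachePolicies class and its method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Net.Http.Headers;

// Hypothetical helper class, not part of WCF Web API.
public static class CachePolicies
{
    // Any cache on the path (local cache, proxy, reverse proxy) may store
    // the response and treat it as fresh for the given duration.
    public static CacheControlHeaderValue SharedFor(TimeSpan maxAge)
    {
        return new CacheControlHeaderValue { Public = true, MaxAge = maxAge };
    }

    // Only the end user's own cache may store the response; shared
    // intermediaries must not. This suits per-user data.
    public static CacheControlHeaderValue UserOnlyFor(TimeSpan maxAge)
    {
        return new CacheControlHeaderValue { Private = true, MaxAge = maxAge };
    }
}
```

An operation could then write something like response.Headers.CacheControl = CachePolicies.SharedFor(TimeSpan.FromHours(1)); instead of setting each flag by hand.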

According to the HTTP specification, any HTTP GET request should be idempotent and safe. In this context, these principles have the following meaning:

  • Idempotence: One or more calls with identical requests should return the same resource representation (as long as the resource has not changed on the server between calls).
  • Safety: The only action produced by a GET request is to retrieve a resource representation; no side effects should be evident on the service side after executing the request. An example that violates this principle would be a GET request that also updates some info in the resource.

Based on those facts, caching intermediaries can also use a technique known as conditional GET to revalidate a copy of the data that has become stale. An HTTP conditional GET relies on two special headers generated by the service, ETag and Last-Modified. An ETag is an opaque value that only the server knows how to recreate. It could represent anything, but it is usually a hash that identifies the resource version (it can be generated by hashing the whole representation content or just some parts of it, such as the timestamp). The Last-Modified header, on the other hand, carries a date and time that the service can use to determine whether the resource has changed since the last time it was served.
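
As a rough illustration of the “hash of the whole representation” option, the sketch below derives an entity tag from the serialized body. The ETagSketch class and its FromContent method are hypothetical names, and SHA-256 is just one possible hash chosen for the example; it assumes the EntityTagHeaderValue type from System.Net.Http.Headers, which expects the tag to be a quoted string.

```csharp
using System;
using System.Net.Http.Headers;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, not part of WCF Web API.
public static class ETagSketch
{
    // Derives a strong entity tag from the serialized representation, so the
    // tag changes whenever the content changes.
    public static EntityTagHeaderValue FromContent(string serializedBody)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(serializedBody));
            string hex = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();

            // Entity tags are quoted strings, so the quotes are part of the value.
            return new EntityTagHeaderValue("\"" + hex + "\"");
        }
    }
}
```

Hashing the full body costs a serialization pass on every request, which is why hashing only a version number or timestamp, as mentioned above, is often the cheaper choice.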

The following example illustrates the request and response messages exchanged between the client and a service that returns a list of contacts.

On the first request, the client does not have anything in its cache.

Request –>

GET http://localhost:8080/contacts

Host: localhost
Accept: */*

Response –>

Connection: close
Date: Thu, 02 Oct 2008 14:46:57 GMT
Server: Microsoft-IIS/6.0
Content-Length: 1212
Expires: Sat, 01 Nov 2008 14:46:57 GMT
Last-Modified: Mon, 29 Sep 2008 15:40:27 GMT
ETag: a9331828c517ac5d97f93b3cfdbcc9bc
Content-Type: text/xml

The process of validating the cached copy with the origin server involves sending a conditional GET with the following headers:

Request –>

GET http://localhost:8080/contacts

Host: localhost
Accept: */*
If-Modified-Since: Mon, 29 Sep 2008 15:40:27 GMT
If-None-Match: a9331828c517ac5d97f93b3cfdbcc9bc
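
For illustration, a client or caching layer written in .NET could build such a request from the validators it stored with the cached response. This is only a sketch under assumptions: the Revalidation class and its BuildRequest method are hypothetical, and it uses the HttpRequestMessage and HttpResponseMessage types from System.Net.Http.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper class, not part of WCF Web API.
public static class Revalidation
{
    // Copies the validators of a cached response into a conditional GET.
    public static HttpRequestMessage BuildRequest(Uri uri, HttpResponseMessage cached)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, uri);

        // ETag from the cached response becomes If-None-Match.
        if (cached.Headers.ETag != null)
        {
            request.Headers.IfNoneMatch.Add(cached.Headers.ETag);
        }

        // Last-Modified from the cached response becomes If-Modified-Since.
        if (cached.Content != null && cached.Content.Headers.LastModified.HasValue)
        {
            request.Headers.IfModifiedSince = cached.Content.Headers.LastModified;
        }

        return request;
    }
}
```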

With these two new headers, the server can determine whether it has to serve the resource representation again or whether the intermediary can keep using the cached version. If the resource has not changed according to the values in those headers (If-Modified-Since is checked against Last-Modified and If-None-Match against ETag), the service can return the HTTP status code "304 Not Modified", which instructs the intermediary to keep the cached version. The example above shows both headers; an intermediary sends If-Modified-Since if the original response included Last-Modified and If-None-Match if it included ETag. When a request carries both, the HTTP specification gives If-None-Match precedence and the server ignores If-Modified-Since.
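
The decision the server has to make can be sketched as a small function. The ConditionalGet class and its IsNotModified method are hypothetical names; the sketch assumes the header types from System.Net.Http, a lastModified value that already has whole-second precision, and the precedence rule of the HTTP specification under which If-None-Match wins when both headers are present.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper class, not part of WCF Web API.
public static class ConditionalGet
{
    // Returns true when the request's validators show that the cached copy
    // is still current, so the service may answer with 304 Not Modified.
    public static bool IsNotModified(
        HttpRequestMessage request,
        EntityTagHeaderValue currentETag,
        DateTimeOffset lastModified)
    {
        var ifNoneMatch = request.Headers.IfNoneMatch;
        if (ifNoneMatch.Count > 0)
        {
            // If-None-Match takes precedence; If-Modified-Since is ignored.
            // "*" matches any current representation; otherwise compare tags.
            return ifNoneMatch.Any(tag =>
                tag.Equals(EntityTagHeaderValue.Any) || tag.Tag == currentETag.Tag);
        }

        // Fall back to the date: unchanged if not modified after that moment.
        var since = request.Headers.IfModifiedSince;
        return since.HasValue && lastModified <= since.Value;
    }
}
```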

[WebGet(UriTemplate = "")]
public HttpResponseMessage Get(HttpRequestMessage request)
{
    var response = new HttpResponseMessage();

    // Take the first entity tag the client sent in If-None-Match, if any.
    var etag = request.Headers.IfNoneMatch.FirstOrDefault();

    if (etag != null && etag.Tag == "This is my ETag")
    {
        // The cached copy is still current: answer 304 with no body.
        response.Content = new EmptyContent();
        response.StatusCode = HttpStatusCode.NotModified;
    }
    else
    {
        // No validator, or a stale one: send the full representation
        // together with fresh caching headers and the current ETag.
        response.Content = new ObjectContent<List<Contact>>(contacts);
        response.Headers.CacheControl = new CacheControlHeaderValue();
        response.Headers.CacheControl.Public = true;
        response.Headers.CacheControl.MaxAge = TimeSpan.FromHours(1);

        response.Headers.ETag = new EntityTagHeaderValue("This is my ETag");
    }

    return response;
}

The example above shows how to implement conditional GETs with an ETag. The ETag in that example is hardcoded, which is not going to happen in the real world, but I just want to show the control flow needed to implement this kind of logic. You can also implement conditional GETs with Last-Modified and If-Modified-Since, as shown below:

[WebGet(UriTemplate = "")]
public HttpResponseMessage Get(HttpRequestMessage request)
{
    var response = new HttpResponseMessage();

    // The contacts are unchanged if their last modification is not later
    // than the date the client sent in If-Modified-Since.
    if (request.Headers.IfModifiedSince.HasValue && contacts_timestamp <= request.Headers.IfModifiedSince.Value)
    {
        // The cached copy is still current: answer 304 with no body.
        response.Content = new EmptyContent();
        response.StatusCode = HttpStatusCode.NotModified;
    }
    else
    {
        // Send the full representation with its expiration and timestamp.
        response.Content = new ObjectContent<List<Contact>>(contacts);
        response.Content.Headers.Expires = DateTime.Now.AddHours(1);

        response.Content.Headers.LastModified = contacts_timestamp;
    }

    return response;
}

contacts_timestamp in that example is also a hardcoded value: the last modification date of the returned list of contacts.
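
With a real timestamp instead of a hardcoded one, keep in mind that HTTP dates only carry whole seconds, while a value taken from a database or the system clock usually has a sub-second part. The following sketch shows one way to account for that before comparing; the HttpDates class and its method names are hypothetical.

```csharp
using System;

// Hypothetical helper class, not part of WCF Web API.
public static class HttpDates
{
    // Drops the sub-second part, which cannot be expressed in an HTTP date.
    public static DateTimeOffset TruncateToSeconds(DateTimeOffset value)
    {
        long wholeSeconds = value.Ticks - (value.Ticks % TimeSpan.TicksPerSecond);
        return new DateTimeOffset(wholeSeconds, value.Offset);
    }

    // True when the resource has not been modified after the moment the
    // client sent in If-Modified-Since.
    public static bool IsUnchangedSince(DateTimeOffset lastModified, DateTimeOffset ifModifiedSince)
    {
        return TruncateToSeconds(lastModified) <= ifModifiedSince;
    }
}
```

Without the truncation, a timestamp of 15:40:27.500 would compare as later than the 15:40:27 the client echoes back, and the service would never answer with 304.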

So far we have discussed how caching can be implemented in your WCF Web APIs so that data can be cached by any existing intermediary. Next time I will show how this logic can be refactored into some generic extensions so we don’t need to repeat it in every single operation where we want to cache data.
