December 2012 - Posts

If you are building a SaaS application in Windows Azure that relies on SQL Azure, it’s probably that you will need to support multiple tenants at database level.

This is short overview of the different approaches you can use for support that scenario,

A different database per tenant

A new database is created and assigned when a tenant is provisioned.

Pros

  • Complete isolation between tenants. All the data for a tenant lives in a database only he can access.

Cons

  • It’s not cost effective. SQL Azure databases are not cheap, and the minimum size for a database is 1GB.  You might be paying for storage that you don’t really use.
  • A different connection pool is required per database.
  • Updates must be replicated across all the databases
  • You need multiple backup strategies across all the databases

Multiple schemas in a database shared by all the tenants

A single database is shared among all the tenants, but every tenant is assigned to a different schema and database user.

Pros

  • You only pay for a single database.
  • Data is isolated at database level. If the credentials for one tenant is compromised, the rest of the data for the other tenants is not.

Cons

  • You need to replicate all the database objects in every schema, so the number of objects can increase indefinitely.
  • Updates must be replicated across all the schemas.
  • The connection pool for the database must maintain a different connection per tenant (or set of credentials)
  • A different user is required per tenant, which is stored at server level. You have to backup that user independently.

Centralizing the database access with store procedures in a database shared by all the tenants

A single database is shared among all the tenants, but nobody can read the data directly from the tables. All the data operations are performed through store procedures that centralize the access to the tenant data. The store procedures contain some logic to map the database user to an specific tenant.

Pros

  • You only pay for a single database.
  • You only have a set of objects to maintain and backup.

Cons

  • There is no real isolation. All the data for the different tenants is shared in the same tables.
  • You can not use traditional ORM like EF code first for consuming the data.
  • A different user is required per tenant, which is stored at server level. You have to backup that user independently.

SQL Federations

A single database is shared among all the tenants, but a different federation is used per tenant. A federation in few words, it’s a mechanism for horizontal scaling in SQL Azure, which basically uses the idea of logical partitions to distribute data based on certain criteria.

Pros

  • You only have a single database with multiple federations.
  • You can use filtering in the connections to pick the right federation, so any ORM could be used to consume the data.

Cons

  • There is no real isolation at that database level. The isolation is enforced programmatically with federations.
Posted by cibrax
Filed under: ,

For simple scenarios of Web API consumption where identity delegation is not required, traditional http authentication schemas such as basic, certificates or digest are the most used nowadays. All these schemas rely on sending the caller credentials or some representation of it in every request message as part of the Authorization header, so they are prone to suffer phishing attacks if they are not correctly secured at transport level with https.

In addition, most client applications typically authenticate two different things, the caller application and the user consuming the API on behalf of that application. For most cases, the schema is simplified by using a single set of username and password for authenticating both, making necessary to store those credentials temporally somewhere in memory.

The true is that you can use two different identities, one for the user running the application, which you might authenticate just once during the first call when the application is initialized, and another identity for the application itself that you use on every call.

Some cloud vendors like Windows Azure or Amazon Web Services have adopted an schema to authenticate the caller application based on a Message Authentication Code (MAC) generated with a symmetric algorithm using a key known by the two parties, the caller and the Web API.

The caller must include a MAC as part of the Authorization header created from different pieces of information in the request message such as the address, the host, and some other headers. The Web API can authenticate the caller by using the key associated to it and validating the attached MAC in the request message. In that way, no credentials are sent as part of the request message, so there is no way an attacker to intercept the message and get access to those credentials.

Anyways, this schema also suffers from some deficiencies that can generate attacks. For example, brute force can be still used to infer the key used for generating the MAC, and impersonate the original caller. This can be mitigated by renewing keys in a relative short period of time. This schema as any other can be complemented with transport security.

Eran Rammer, one of the brains behind OAuth, has recently published an specification of a protocol based on MAC for Http authentication called Hawk. The initial version of the spec is available here. A curious fact is that the specification per se does not exist, and the specification itself is the code that Eran initially wrote using node.js.

In that implementation, you can associate a key to an user, so once the MAC has been verified on the Web API, the user can be inferred from that key. Also a timestamp is used to avoid replay attacks.

As a pet project, I decided to port that code to .NET using ASP.NET Web API, which is available also in github under https://github.com/pcibraro/hawknet

Enjoy!.

Posted by cibrax
Filed under: , ,
More Posts