MIX11 - Session Review - Windows Azure Storage - Getting Started and Best Practices

Jai Haridas (Software Development Lead)

  • Windows Azure Storage
    • What is it?
      • Scalable, Durable, Highly Available Cloud Storage System
      • Pay for what you use
      • Abstractions
        • Blobs – Provides a simple interface for storing named files along with metadata for the file
        • Drives – Provides durable NTFS volumes for Windows Azure Applications to use – based on Page Blobs
        • Tables – Provides structured storage. A Table is a set of entities which contains a set of properties
        • Queues – Provides reliable storage and delivery of messages for an Application
    • Data Storage Concepts
      • Based on the Account
        • Container for Blobs (Blob Storage)
https://<account>.blob.core.windows.net/<container>
        • Table for Entities (Table Storage)
https://<account>.table.core.windows.net/<table>
        • Queue for Messages (Queue Storage)
https://<account>.queue.core.windows.net/<queue>
    • Blobs
      • Provides a highly scalable, durable and available file system in the cloud
      • An account can create many containers
        • No limit on number of blobs in a container
        • Limit of 100TB per account
      • Associate metadata with Blobs
      • Upload / Download Blobs
        • Allows range reads
        • Conditional operations – If-Match, If-Unmodified-Since, …
        • Sharing – Public containers, Shared Access Signatures (SAS)
          • SAS – a pre-authenticated URL
      • The storage client uses a default timeout of 90 seconds; use the BlobRequestOptions class to set a timeout appropriate to the size and type of blob you are uploading (see the sketch below)
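
        A minimal sketch of a size-based timeout, assuming the MIX11-era 1.x Microsoft.WindowsAzure.StorageClient library; the connection string, file path and scaling rule are placeholders for illustration:

          using System;
          using Microsoft.WindowsAzure;
          using Microsoft.WindowsAzure.StorageClient;

          CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
          CloudBlobClient client = account.CreateCloudBlobClient();
          CloudBlockBlob blob = client.GetBlockBlobReference("photos/vacation.jpg");

          byte[] fileBytes = System.IO.File.ReadAllBytes(localPath);

          // Illustrative rule of thumb: 90 s base plus 1 s per 100 KB uploaded,
          // instead of relying on the flat 90 s default for a large upload.
          var options = new BlobRequestOptions
          {
              Timeout = TimeSpan.FromSeconds(90 + fileBytes.Length / (100 * 1024))
          };
          blob.UploadByteArray(fileBytes, options);
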
      • Types of Blobs
        • Block Blobs
          • Targeted at streaming workloads
          • Each Blob consists of a sequence of blocks
            • 2-phase commit: blocks are uploaded and then separately committed
            • Efficient continuation and retry
            • Send multiple out of order blocks in parallel and decide the block order during commit
            • Random range reads possible
          • Size limit is 200GB per blob
          • What to do? (see the sketch after this list)
            1. Split the file into blocks (block sizes may vary)
            2. Upload the blocks in parallel using PutBlock
            3. Retry any failed blocks
            4. Commit the blob using PutBlockList
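
        A sketch of that flow with the assumed 1.x Microsoft.WindowsAzure.StorageClient library; client and localPath come from the earlier sketch, and a fixed 4 MB block size stands in for variable-sized blocks:

          using System;
          using System.IO;
          using System.Linq;
          using System.Threading.Tasks;
          using Microsoft.WindowsAzure.StorageClient;

          CloudBlockBlob blob = client.GetBlockBlobReference("uploads/bigfile.bin");
          byte[] data = File.ReadAllBytes(localPath);
          const int BlockSize = 4 * 1024 * 1024;
          int blockCount = (data.Length + BlockSize - 1) / BlockSize;

          // Block IDs must be Base64-encoded and of equal length within a blob.
          var blockIds = Enumerable.Range(0, blockCount)
              .Select(i => Convert.ToBase64String(BitConverter.GetBytes(i)))
              .ToList();

          // Upload blocks in parallel and out of order; a failed block can
          // simply be re-uploaded, since nothing is visible until the commit.
          Parallel.For(0, blockCount, i =>
          {
              int offset = i * BlockSize;
              int size = Math.Min(BlockSize, data.Length - offset);
              using (var stream = new MemoryStream(data, offset, size))
                  blob.PutBlock(blockIds[i], stream, null);
          });

          // The commit fixes the final block order in a single operation.
          blob.PutBlockList(blockIds);
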
        • Page Blobs
          • Targeted at random write workloads
          • Each blob consists of an array of pages
          • Size limit 1TB per blob
          • Page
            • Each page range write is committed on PUT
            • A page is 512 bytes in size
            • Writes must be aligned on 512-byte boundaries
            • Range reads possible
            • Pages that do not have data are zeroed out
          • How? (see the sketch after this list)
            1. Write 5K bytes – PutPage
            2. Clear starting at a particular offset – ClearPage
            3. Overwrite bytes – PutPage
            4. Truncate Blob – SetMaxBlobSize
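
        A sketch of those steps with the assumed 1.x client library, where PutPage and ClearPage surface as WritePages and ClearPages; client is the CloudBlobClient from the earlier sketches and the sizes and offsets are illustrative:

          using System.IO;
          using Microsoft.WindowsAzure.StorageClient;

          CloudPageBlob pageBlob = client.GetPageBlobReference("data/random.bin");
          pageBlob.Create(1024 * 1024);              // max size, must be a multiple of 512

          // 1. Write 5K bytes (exactly ten 512-byte pages) at offset 0 – PutPage
          byte[] fiveK = new byte[5 * 1024];
          using (var stream = new MemoryStream(fiveK))
              pageBlob.WritePages(stream, 0);        // committed as soon as the PUT returns

          // 2. Clear one page starting at offset 512 – ClearPage; it reads back as zeros
          pageBlob.ClearPages(512, 512);

          // 3. Overwrite bytes – another PutPage over the same range
          using (var stream = new MemoryStream(new byte[512]))
              pageBlob.WritePages(stream, 512);

          // 4. Truncate – SetMaxBlobSize corresponds to the Set Blob Properties REST
          //    operation (x-ms-blob-content-length); the 1.x client library does not
          //    expose a direct helper for it.
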
      • Sharing
        • Every blob request must be signed with the account owner’s key
        • Options for sharing your files
          • Public containers – making a container public grants anonymous read-only access to it
          • Shared Access Signatures (SAS) – share pre-authenticated URLs with users
            • You decide whom you’d like to share it with
            • You can grant granular permissions
              • Delete Blob
              • Write Blob
              • Read or List Blobs
            • Two ways to create a SAS
              • Embed all parameters in the URL and sign it with your owner key
              • Create an access policy on the container that holds the parameters normally embedded in the URL
                • Advantage: the policy can be changed or revoked after the URL has been given to someone
        • SAS
          • Use container-level access policies, as they allow access to be easily revoked (see the sketch below)
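
        A sketch of the container-level policy approach with the assumed 1.x client library; the policy name, lifetime and permissions are placeholders:

          using System;
          using Microsoft.WindowsAzure.StorageClient;

          CloudBlobContainer container = client.GetContainerReference("shared");

          // Store the parameters in a named policy on the container, not in the URL.
          BlobContainerPermissions permissions = container.GetPermissions();
          permissions.SharedAccessPolicies.Clear();
          permissions.SharedAccessPolicies.Add("read-only", new SharedAccessPolicy
          {
              Permissions = SharedAccessPermissions.Read | SharedAccessPermissions.List,
              SharedAccessExpiryTime = DateTime.UtcNow.AddDays(7)
          });
          container.SetPermissions(permissions);

          // The SAS only references the policy, so editing or deleting the
          // "read-only" policy later revokes every URL issued against it.
          string sas = container.GetSharedAccessSignature(new SharedAccessPolicy(), "read-only");
          CloudBlob blob = container.GetBlobReference("report.pdf");
          string shareUrl = blob.Uri.AbsoluteUri + sas;   // hand out over HTTPS
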
      • Snapshots
        • Point-in-time read-only copy of a blob
        • Every snapshot creates a new read-only point-in-time copy
        • Charged only for unique blocks or pages, so reused blocks or pages are not billed again
          • For reuse, use WritePages or PutBlock & PutBlockList
        • Restore snapshots using Copy Blob
        • Remember to clean up your snapshots (see the sketch below)
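
        A sketch of snapshot, restore and cleanup with the assumed 1.x client library; container comes from the previous sketch:

          using Microsoft.WindowsAzure.StorageClient;

          CloudBlob blob = container.GetBlobReference("config.xml");

          // Point-in-time, read-only copy; only unique blocks/pages are billed.
          CloudBlob snapshot = blob.CreateSnapshot();

          // ... the base blob gets modified or corrupted ...

          // Restore by copying the snapshot back over the base blob.
          blob.CopyFromBlob(snapshot);

          // Cleanup: delete the blob together with all of its snapshots.
          blob.Delete(new BlobRequestOptions
          {
              DeleteSnapshotsOption = DeleteSnapshotsOption.IncludeSnapshots
          });
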
      • Best Practices
        • Use parallel block uploads to reduce latency when uploading large blobs
        • The client library uses a default timeout of 90 seconds – use a size-based timeout instead
        • Snapshots – for block or page reuse, issue PutBlock/WritePages uploads in place of the UploadXXX convenience methods in the Storage Client
        • Shared Access Signatures
          • Use a container-level policy, as it allows permissions to be revoked
          • Share SAS URLs over HTTPS
        • Create a new container for blobs, such as log files, that have a retention period (see the sketch after this list)
          • To delete logs after 1 month, create a new container every month and delete the expired container as a whole
        • Container recreation
          • Garbage collection of a deleted container can take time, during which a container with the same name cannot be created – so you may not be able to immediately recreate a container with exactly the same name after deleting it
          • Use unique names for containers to avoid this
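
        A sketch of the retention pattern with the assumed 1.x client library; the "logs-" naming convention is invented for illustration:

          using System;
          using Microsoft.WindowsAzure.StorageClient;

          // One container per month: "logs-2011-04", "logs-2011-05", ...
          string current = "logs-" + DateTime.UtcNow.ToString("yyyy-MM");
          client.GetContainerReference(current).CreateIfNotExist();

          // Retention: drop last month's logs with a single container delete
          // rather than enumerating and deleting blobs one by one.
          string expired = "logs-" + DateTime.UtcNow.AddMonths(-1).ToString("yyyy-MM");
          client.GetContainerReference(expired).Delete();
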
    • Drive
      • Provides a durable NTFS volume for Windows Azure Applications
        • Use existing NTFS APIs
        • Easy migration path to the cloud
        • Durability and survival of data on application failover or hardware failure
          • All flushed and un-buffered writes to drive are made durable
      • A Windows Azure Drive is a Page Blob
        • Mounts Page Blob as an NTFS drive
        • Mounted by one VM at a time for read/write
        • A VM can dynamically mount up to 16 drives
        • Drives can be up to 1TB
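
        A sketch of the drive lifecycle, assuming the MIX11-era Microsoft.WindowsAzure.CloudDrive assembly running inside a Windows Azure role; the resource name, connection setting and sizes are placeholders:

          using System.IO;
          using Microsoft.WindowsAzure;
          using Microsoft.WindowsAzure.ServiceRuntime;
          using Microsoft.WindowsAzure.StorageClient;

          // Reserve local disk space for the drive cache (once per role instance).
          LocalResource cache = RoleEnvironment.GetLocalResource("DriveCache");
          CloudDrive.InitializeCache(cache.RootPath, cache.MaximumSizeInMegabytes);

          // The drive is just a page blob; create it and mount it as an NTFS volume.
          CloudStorageAccount account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
          string pageBlobUri = account.CreateCloudBlobClient()
              .GetContainerReference("drives").GetPageBlobReference("data.vhd").Uri.ToString();
          CloudDrive drive = account.CreateCloudDrive(pageBlobUri);

          drive.Create(64);                                           // size in MB
          string drivePath = drive.Mount(cache.MaximumSizeInMegabytes, DriveMountOptions.None);

          // From here on it is plain NTFS; only one VM mounts it read/write at a time.
          File.WriteAllText(Path.Combine(drivePath, "hello.txt"), "stored on a page blob");
          drive.Unmount();
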
    • Tables
      • Provides Structured Storage
        • Massively Scalable and Durable Tables
          • Billions of entities (rows) and TBs of data
          • A storage account can contain many tables
          • No limit on number of entities (aka rows) in each table
          • Provides flexible schema
        • Familiar and Easy to use API
          • WCF Data Services – .NET classes and LINQ
          • REST (OData Protocol) – with any platform and language
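
        A minimal sketch of the WCF Data Services flavor with the assumed 1.x client library; the Customers table and entity shape are invented for illustration, and account comes from the earlier sketches:

          using System;
          using System.Linq;
          using Microsoft.WindowsAzure;
          using Microsoft.WindowsAzure.StorageClient;

          public class Customer : TableServiceEntity
          {
              public Customer() { }                    // required for deserialization
              public Customer(string company, string email)
              {
                  PartitionKey = company;              // groups related entities together
                  RowKey = email;                      // unique within the partition
              }
              public string Name { get; set; }         // flexible schema: add properties freely
          }

          // Elsewhere, e.g. in application startup code:
          CloudTableClient tableClient = account.CreateCloudTableClient();
          tableClient.CreateTableIfNotExist("Customers");

          TableServiceContext context = tableClient.GetDataServiceContext();
          context.AddObject("Customers", new Customer("contoso", "alice@contoso.com") { Name = "Alice" });
          context.SaveChangesWithRetries();

          // Point query via LINQ: PartitionKey + RowKey addresses exactly one entity.
          Customer alice = (from c in context.CreateQuery<Customer>("Customers")
                            where c.PartitionKey == "contoso" && c.RowKey == "alice@contoso.com"
                            select c).FirstOrDefault();
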
      • Best Practices
        • Use context.SaveChangesWithRetries(SaveChangesOptions.Batch) (see the sketch after this list)
          • SaveChangesOptions.Batch gives the operation transactional semantics
          • Transactions on entities are only allowed when all entities share the same partition key
        • CloudTableQuery<> handles continuation tokens for you
        • Use the clustered index (PartitionKey) in queries for performance
        • Limit large scans and expect continuation tokens for queries that scan
          • Split “OR” predicates on keys into individual queries
        • Entity Group Transactions – batch to reduce cost and gain transactional semantics
        • Do not reuse a DataServiceContext across multiple logical operations
        • Discard the DataServiceContext on failures
        • AddObject/AttachTo can throw exceptions if the entity is already being tracked
        • A point query throws an exception if the resource does not exist – use the IgnoreResourceNotFoundException property
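
        A sketch of batching and continuation handling under the same assumptions; tableClient and the illustrative Customer class come from the previous sketch:

          using System;
          using System.Data.Services.Client;
          using System.Linq;
          using Microsoft.WindowsAzure.StorageClient;

          TableServiceContext context = tableClient.GetDataServiceContext();
          context.IgnoreResourceNotFoundException = true;  // point queries return null instead of throwing

          // Entity Group Transaction: up to 100 entities, all with the same PartitionKey.
          for (int i = 0; i < 100; i++)
              context.AddObject("Customers",
                  new Customer("contoso", "user" + i + "@contoso.com") { Name = "User " + i });

          // One round trip, one billed transaction, all-or-nothing semantics.
          context.SaveChangesWithRetries(SaveChangesOptions.Batch);

          // AsTableServiceQuery wraps the query in a CloudTableQuery<T>, which
          // follows continuation tokens transparently during the scan.
          var everyone = (from c in context.CreateQuery<Customer>("Customers")
                          where c.PartitionKey == "contoso"
                          select c).AsTableServiceQuery().ToList();
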
    • Queue
      • Queues are highly scalable, available and provide reliable message delivery
        • Simple, asynchronous work dispatch
        • A storage account can create any number of queues
        • 8K message size limit and default expiry of 7 days
        • Programming semantics ensure that a message is processed at least once
          1. Get the message, which makes it invisible to other consumers
          2. Delete the message to remove it from the queue
      • Access is provided via REST
      • Best Practices
        • Make message processing idempotent
        • Do not rely on message order – invisibility timeouts can result in out-of-order delivery
        • For messages larger than 8K, store the payload in a blob or table and put the blob name or entity key in the message
        • Use the message count to dynamically increase/decrease the number of workers. Example:
          • Retain one instance that polls once every X time period
          • One instance polling every second results in about 2,678,400 calls per month, which costs around $2.67
          • Spawn more instances when you detect a backlog
        • Use the dequeue count to detect (see the sketch after this list)
          • When the visibility timeout needs to be increased
          • Poison messages
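
        A sketch of the consume loop with the assumed 1.x client library; the queue name, visibility timeout, poison threshold and the two helper methods are placeholders:

          using System;
          using Microsoft.WindowsAzure.StorageClient;

          CloudQueueClient queueClient = account.CreateCloudQueueClient();
          CloudQueue queue = queueClient.GetQueueReference("work");

          // Get makes the message invisible; it reappears if we crash before deleting it.
          CloudQueueMessage msg = queue.GetMessage(TimeSpan.FromMinutes(2));
          if (msg != null)
          {
              if (msg.DequeueCount > 3)
              {
                  // Retrieved too many times without being deleted: treat as poison.
                  LogPoisonMessage(msg);                 // hypothetical helper
                  queue.DeleteMessage(msg);
              }
              else
              {
                  ProcessIdempotently(msg.AsString);     // hypothetical helper; safe to run twice
                  queue.DeleteMessage(msg);              // delete only after successful processing
              }
          }
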
  • Partitioning & Scalability
    • Know the scalability Targets
      • Single Blob Partition
        • Throughput up to 60 MB/s
      • Single Queue/Table Partition
        • up to 500 transactions (entities or messages) per second
      • Storage account
        • SLA – 99.9% availability
        • Capacity – Up to 100 TBs
        • Transactions – Up to 5000 entities per second
        • Bandwidth – Up to 3 gigabits per second
      • Scale above the limits
        • Partition between multiple storage accounts and partitions
        • When a limit is hit, the app may see “503 Server Busy” responses; apps should implement exponential back-off (see the sketch below)
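
        A sketch of the back-off using the assumed 1.x client library’s built-in retry policies; the retry count and delta are placeholders:

          using System;
          using Microsoft.WindowsAzure.StorageClient;

          CloudBlobClient blobClient = account.CreateCloudBlobClient();

          // On 503 Server Busy (and other retryable errors), wait roughly
          // 2 s, 4 s, 8 s, 16 s between attempts instead of hammering the partition.
          blobClient.RetryPolicy = RetryPolicies.RetryExponential(
              4,                         // number of retries
              TimeSpan.FromSeconds(2));  // delta back-off between attempts
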
    • Storage Partition – How to Scale?
      • Every data object has a partition key
        • Different for each data type (blobs, tables, queues)
      • Partition Key is unit of scale
        • A partition is served by a single server
        • System load balances partitions based on traffic
        • Controls entity locality
      • System load balancing
        • Load balancing can take a few minutes to kick in
        • It can take a couple of seconds for a partition to become available on a different server
      • Server Busy
        • Use exponential back-off on “Server Busy” responses
        • Server Busy can occur while the system load balances to meet your traffic needs, or when single-partition limits have been reached
    • Automatic Load Balancing
      • Assignment
        • Process:
          1. When a request arrives, the load balancer delivers it to one of the front-ends, which forwards it to the back-end partition server currently serving that partition
            • Each server holds multiple partitions, and their loads can differ
          2. If the master system detects that a single server is handling too many requests, it offloads some of that server’s partitions and reassigns them to servers with less load
    • Partition Keys in each abstraction
      • Blobs
        • “Container Name + Blob Name” is the partition key
        • Every blob and its snapshot are in a single partition
      • Tables
        • “Table Name + PartitionKey value” identifies the partition
        • Entities with same partition key value are served from the same partition
      • Queues
        • Queue Name is the Partition Key
        • All messages for a single queue belong to the same partition


  • Interesting tools for understanding what’s happening with Storage
    • Fiddler
    • Wireshark
    • NetMon

