Highly Available Cloud Storage: Azure and S3 • Udaiappa Ramachandran • NHDN-Cloud Computing UG Lead • Email: udaiappa@gmail.com • Blog: http://cloudycode.wordpress.com
Cloud Storage - Agenda • Overview of Cloud Storage • Azure Blob Storage • Amazon S3 • Demo • Resource • Q&A
Why use Cloud Storage? • Highly Available with Strong Consistency • Provide access to data in the face of failures/partitioning • Durability • Replicate data several times within and across data centers • Scalability • Need to scale to exabytes and beyond • Provide a global namespace to access data around the world • Automatically load balance data to meet peak traffic demands
Storage Abstractions • Blobs • Simple named files along with metadata for the file • Blob Types • Block Blob • Targeted at streaming workloads • Each blob consists of a sequence of blocks; each block is identified by a Block ID • Size limit 200 GB per blob • Optimistic concurrency via ETags • Page Blob • Targeted at random read/write workloads • Each blob consists of an array of pages; each page is identified by its offset from the start of the blob • Size limit 1 TB per blob • Optimistic or pessimistic (locking) concurrency via leases • Drives • Durable NTFS volumes for Windows Azure applications to use, based on Blobs • SMB can be used to access a drive across multiple instances
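To make the block and page addressing above concrete, here is a minimal sketch in plain Python. It does not call any Azure API. The helper names `split_into_blocks` and `page_number` are hypothetical, and the Base64-encoded, equal-length block IDs and the 512-byte page size are assumptions that the slides do not state.

```python
import base64


def split_into_blocks(data: bytes, block_size: int):
    """Split data into (block_id, chunk) pairs, in upload order."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    for index, start in enumerate(range(0, len(data), block_size)):
        # A zero-padded counter gives every ID the same length before encoding.
        raw_id = f"{index:06d}".encode("ascii")
        block_id = base64.b64encode(raw_id).decode("ascii")
        blocks.append((block_id, data[start:start + block_size]))
    return blocks


def page_number(offset: int, page_size: int = 512) -> int:
    """Return the index of the page containing the given byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return offset // page_size
```

A block blob would then be committed as the ordered list of block IDs, while a page blob write addresses pages directly by offset.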
Storage Concept http://<account>.blob.core.windows.net/<container>/<blobname> Account Container Blobs Pages/Blocks
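The account/container/blob hierarchy maps directly onto the URL pattern shown above. A small sketch of that mapping follows; `blob_url` is a hypothetical helper, not part of any SDK, and the percent-encoding of the blob name is an assumption.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str, scheme: str = "http") -> str:
    """Build a blob URL following the account/container/blob pattern."""
    # quote() keeps '/' by default, so virtual folders in the name survive.
    return f"{scheme}://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"
```

For example, `blob_url("myaccount", "photos", "myblobs/blob.jpg")` yields `http://myaccount.blob.core.windows.net/photos/myblobs/blob.jpg`.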
Blob Details • Mock Storage Emulator • Can CDN-enable an account • Blobs delivered via 24 global CDN nodes • Can co-locate storage account with compute account • Explicitly or using affinity groups • Accounts have two independent 512-bit shared secret keys • 100 TB per account • Geo-Replication • Storage Analytics • Logs: Provide a trace of executed requests for your storage accounts • Metrics: Provide a summary of key capacity and request statistics for Blobs, Tables, and Queues • HTTP headers for Blobs • RESTful and Client API support
Blob Details • Associate metadata with a blob • Standard HTTP metadata/headers (Cache-Control, Content-Encoding, Content-Type, etc.) • Metadata is <name, value> pairs, up to 8 KB per blob • Set either as part of PutBlob or independently • Blob is always accessed by name • Name can include '/' or another delimiter, e.g. /<container>/myblobs/blob.jpg
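As an illustration of the metadata rules above, the following sketch turns <name, value> pairs into request headers and enforces the 8 KB limit locally. The `x-ms-meta-` header prefix is the one used by the Blob REST interface; the helper name `metadata_headers` is hypothetical, and counting the limit as the sum of name and value bytes is an assumption about how the service measures it.

```python
MAX_METADATA_BYTES = 8 * 1024  # 8 KB per blob, as quoted on the slide


def metadata_headers(metadata: dict) -> dict:
    """Convert name/value pairs into x-ms-meta-* request headers."""
    # Assumption: the limit is counted over the UTF-8 bytes of names and values.
    total = sum(
        len(name.encode("utf-8")) + len(value.encode("utf-8"))
        for name, value in metadata.items()
    )
    if total > MAX_METADATA_BYTES:
        raise ValueError(f"metadata is {total} bytes, limit is {MAX_METADATA_BYTES}")
    return {f"x-ms-meta-{name}": value for name, value in metadata.items()}
```

Standard headers such as Content-Type and Cache-Control are sent under their usual names and are separate from these user-defined pairs.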
Blob Operations • PutBlob • GetBlob • DeleteBlob • CopyBlob • SnapshotBlob • LeaseBlob • ListBlobs
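The sketch below models a few of these operations in memory, together with the ETag-based optimistic concurrency mentioned earlier for block blobs. It is a toy model, not the storage service or its client library: `InMemoryBlobStore`, `PreconditionFailed`, and the ETag format are all invented for illustration.

```python
class PreconditionFailed(Exception):
    """Raised when the caller's ETag no longer matches the stored blob."""


class InMemoryBlobStore:
    def __init__(self):
        self._blobs = {}   # blob name -> (etag, data)
        self._counter = 0  # store-wide counter so ETags are never reused

    def put_blob(self, name, data, if_match=None):
        """Store data; with if_match, succeed only if the ETag is current."""
        current = self._blobs.get(name)
        if if_match is not None and (current is None or current[0] != if_match):
            raise PreconditionFailed(name)
        self._counter += 1
        etag = f'"0x{self._counter:X}"'
        self._blobs[name] = (etag, data)
        return etag

    def get_blob(self, name):
        """Return (etag, data); raises KeyError if the blob does not exist."""
        return self._blobs[name]

    def delete_blob(self, name):
        del self._blobs[name]

    def list_blobs(self, prefix=""):
        return sorted(n for n in self._blobs if n.startswith(prefix))
```

A writer reads the blob, keeps its ETag, and passes it back as `if_match` on update; if another writer got there first, the update fails instead of silently overwriting.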
Azure Drives • Durable NTFS volume for Windows Azure instances • Use existing NTFS APIs to access a network-attached durable drive • Use System.IO from .NET • Benefits • Move existing apps using NTFS more easily to the cloud • Durability and survival of data on instance recycle • A Windows Azure Drive is an NTFS VHD Page Blob • Mounts the Page Blob over the network as an NTFS drive • Local cache on the instance for read operations • All flushed and unbuffered writes to the drive are made durable to the Page Blob • A Windows Azure Drive is a Page Blob formatted as an NTFS single-volume Virtual Hard Drive (VHD) • Drives can be up to 1 TB
Azure Drives • A Page Blob can be mounted: • On one instance at a time for read/write • Using read-only snapshots to multiple instances at once • An instance can dynamically mount up to 16 drives • Remote access via the standard Blob interface • Can't remotely mount a drive • Can upload the VHD to a Page Blob using the blob interface, and then mount it as a Drive • Can download the VHD to a local file and mount it locally • Operations are performed via the Drive API, not REST calls • Operations on Drives • Create Drive • Mount / Unmount Drive • Get Mounted Drives • Snapshot Drive
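The mounting rules above (one read/write mount per Page Blob, any number of read-only snapshot mounts, at most 16 drives per instance) can be sketched as a small bookkeeping model. This is not the Drive API; `DriveMountTable` and `MountError` are hypothetical names used only to show how the rules interact.

```python
MAX_DRIVES_PER_INSTANCE = 16  # limit quoted on the slide


class MountError(Exception):
    pass


class DriveMountTable:
    def __init__(self):
        self._writer = {}  # page blob name -> instance holding the read/write mount
        self._mounts = {}  # instance name -> set of (blob, is_snapshot) mounts

    def mount(self, instance, blob, snapshot=False):
        mounted = self._mounts.setdefault(instance, set())
        if len(mounted) >= MAX_DRIVES_PER_INSTANCE:
            raise MountError(f"{instance} already has {MAX_DRIVES_PER_INSTANCE} drives")
        if not snapshot:
            # Only one instance at a time may hold the read/write mount.
            owner = self._writer.get(blob)
            if owner not in (None, instance):
                raise MountError(f"{blob} is mounted read/write on {owner}")
            self._writer[blob] = instance
        mounted.add((blob, snapshot))

    def unmount(self, instance, blob, snapshot=False):
        self._mounts.get(instance, set()).discard((blob, snapshot))
        if not snapshot and self._writer.get(blob) == instance:
            del self._writer[blob]
```

In this model a second instance that needs the data mounts a snapshot read-only, or waits until the first instance unmounts.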
Storage Concept http://<bucketname>.s3-website-[us-east-1].amazonaws.com/ http[s]://s3.amazonaws.com/<bucketname>/<keyname>
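The two S3 URL forms above can be produced with a couple of string helpers. This is only a sketch of the patterns shown on the slide; `s3_object_url` and `s3_website_url` are hypothetical names, and the percent-encoding of the key is an assumption.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str, secure: bool = False) -> str:
    """Path-style object URL: http[s]://s3.amazonaws.com/<bucketname>/<keyname>."""
    scheme = "https" if secure else "http"
    # quote() keeps '/' by default, so folder-like keys stay readable.
    return f"{scheme}://s3.amazonaws.com/{bucket}/{quote(key)}"


def s3_website_url(bucket: str, region: str = "us-east-1") -> str:
    """Static website endpoint for a bucket in the given region."""
    return f"http://{bucket}.s3-website-{region}.amazonaws.com/"
```

For example, `s3_object_url("mybucket", "myblobs/blob.jpg", secure=True)` gives `https://s3.amazonaws.com/mybucket/myblobs/blob.jpg`.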
S3 Details • Associate metadata with a blob (object) • Standard HTTP metadata/headers (Cache-Control, Content-Encoding, Content-Type, etc.) • User-defined metadata is <name, value> pairs, up to 2 KB per object (within an 8 KB request header limit) • Set as part of Put; changing it later requires a Copy • Blob is always accessed by key • Key can include '/' or another delimiter, but a folder must end with '/', e.g. /<bucketname>/myblobs/blob.jpg • Up to 5 TB per blob • RESTful/API support • Storage Analytics • Ability to host a static website from a bucket • BitTorrent protocol support
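Because keys are flat strings, "folders" only exist as a listing convention: keys are grouped by a prefix and a delimiter. The sketch below imitates that grouping locally in plain Python. `list_keys` is a hypothetical helper, not an S3 client call, and it only approximates how a delimiter-based listing behaves.

```python
def list_keys(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) directly under the given prefix."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No further delimiter: the key sits directly at this level.
            objects.append(key)
        else:
            # Roll everything below the next delimiter up into one "folder".
            common_prefixes.add(prefix + rest[:cut + len(delimiter)])
    return objects, sorted(common_prefixes)
```

With keys `myblobs/blob.jpg`, `myblobs/2012/a.jpg` and `readme.txt`, listing the root shows `readme.txt` plus the folder `myblobs/`, and listing `myblobs/` shows `myblobs/blob.jpg` plus the folder `myblobs/2012/`.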
Blob Operations • Put • Get • Delete • Copy • List
Best Practices • Choose the location closest to your customers • Use only lowercase for container/key names • Always remove public access from the container/bucket; instead, grant public access to individual keys if required • Avoid unnecessary requests (for example, do not call to check whether a blob exists; use the response of the actual request instead) • Use compression for large files if possible • Enable CDN
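Two of these practices, lowercase names and compression, can be applied before the upload request is even built. The sketch below shows one way to do that with the standard library; `prepare_upload` is a hypothetical helper, and falling back to the raw bytes when gzip does not help is a design choice of this sketch, not something the slides prescribe.

```python
import gzip


def prepare_upload(name: str, data: bytes, content_type: str):
    """Return (body, headers) for an upload, compressing when it saves space."""
    if name != name.lower():
        raise ValueError(f"use only lowercase in names: {name!r}")
    headers = {"Content-Type": content_type}
    compressed = gzip.compress(data)
    if len(compressed) < len(data):
        # Content-Encoding lets HTTP clients decompress transparently.
        headers["Content-Encoding"] = "gzip"
        return compressed, headers
    return data, headers
```

Already-compressed formats such as JPEG usually do not shrink further, which is why the helper keeps the original bytes in that case.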
Resource • http://microsoft.com/azure • http://blogs.msdn.com/b/windowsazurestorage/archive/2010/12/30/windows-azure-storage-architecture-overview.aspx • http://aws.amazon.com/articles