Windows Azure Storage HUNG H. Ho CSS534 - Parallel Programming Grid and Cloud - Survey Work
Overview
• Blobs: store file data.
• Table storage: stores structured datasets.
• Queue storage: reliable asynchronous messaging for workflow processing and for communication between components of cloud services.
• Development using:
  • REST API
  • .NET, native code, Java, Node.js, PHP, Ruby, Python, PowerShell
Replication for Durability and High Availability
• Locally redundant storage (LRS): data is replicated three times within a single data center. Write operations are performed synchronously.
• Geo-redundant storage (GRS): data is replicated three times within a single region, and is also replicated asynchronously to a second region hundreds of miles away from the primary region.
• Read-access geo-redundant storage (RA-GRS): GRS plus read access to the data at the secondary region in the event that the primary region becomes unavailable.
Blob Storage
• Maximum size: 200 GB per block blob or 1 TB per page blob
• Metadata: up to 8 KB
• Snapshots are allowed
• Container lease: copy blobs from one container to another
• Blob lease: exclusive updates; reads are still allowed
  • Acquire
  • Renew
  • Release
  • Break
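The blob lease operations listed above (Acquire, Renew, Release, Break) can be illustrated with a small in-memory model. This is a sketch only: `LeasedBlob` and `LeaseError` are hypothetical names, not part of any Azure SDK, and lease durations and the break period of the real service are left out. It shows only the rule from the slide: while a lease is active, updates need the lease ID, and reads stay open to everyone.

```python
import uuid


class LeaseError(Exception):
    """Raised when a lease operation or a write is not permitted."""


class LeasedBlob:
    """Hypothetical in-memory blob guarded by an exclusive-write lease."""

    def __init__(self, data=b""):
        self.data = data
        self.lease_id = None  # None means there is no active lease

    def acquire(self):
        # Only one lease can be active at a time.
        if self.lease_id is not None:
            raise LeaseError("blob is already leased")
        self.lease_id = str(uuid.uuid4())
        return self.lease_id

    def renew(self, lease_id):
        # Renewing requires the ID of the active lease.
        self._check(lease_id)
        return lease_id

    def release(self, lease_id):
        # The holder gives the lease up voluntarily.
        self._check(lease_id)
        self.lease_id = None

    def break_lease(self):
        # Break ends the lease without knowing its ID (simplified: immediate).
        self.lease_id = None

    def write(self, data, lease_id=None):
        # Updates are exclusive to the lease holder while a lease is active.
        if self.lease_id is not None and lease_id != self.lease_id:
            raise LeaseError("write requires the active lease ID")
        self.data = data

    def read(self):
        # Reads are always allowed, leased or not.
        return self.data

    def _check(self, lease_id):
        if self.lease_id is None or lease_id != self.lease_id:
            raise LeaseError("lease ID does not match the active lease")
```

A writer would call `acquire()`, pass the returned ID to `write()`, and call `release()` when done; another client that needs to take over a stuck lease would call `break_lease()`.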
Table Storage
• Entity <= 1 MB; number of properties <= 255; entities may have different schemas
• PartitionKey: string value that identifies the partition an entity belongs to.
• RowKey: string value that uniquely identifies an entity within its partition.
• Timestamp/ETag: the Timestamp property provides traceability for an entity.
• Insert
• Merge new properties into existing entities
• Replace a set of properties
• Retrieve all or a subset of properties
• Delete
• Optimistic concurrency pattern: ETag and Precondition Failed (412)
• Last writer wins pattern: ETag = *
• Partitions determine throughput and the scope of entity group transactions
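The two concurrency patterns above can be sketched with an in-memory stand-in for a single table partition. `TablePartition` and `PreconditionFailed` are hypothetical names chosen for illustration, and the ETag format is invented. The sketch assumes the usual If-Match behaviour: an update carrying a stale ETag is rejected with 412, while the wildcard `*` skips the check so the last writer wins.

```python
import itertools


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""
    status = 412


class TablePartition:
    """Hypothetical in-memory partition: RowKey -> (ETag, properties)."""

    def __init__(self):
        self._rows = {}
        self._versions = itertools.count(1)

    def _new_etag(self):
        # Invented ETag format; a real service issues opaque values.
        return f'"{next(self._versions)}"'

    def insert(self, row_key, props):
        if row_key in self._rows:
            raise KeyError(f"entity {row_key!r} already exists")
        etag = self._new_etag()
        self._rows[row_key] = (etag, dict(props))
        return etag

    def retrieve(self, row_key):
        etag, props = self._rows[row_key]
        return etag, dict(props)

    def merge(self, row_key, props, if_match):
        """Merge new properties into an entity if the ETag precondition holds."""
        current_etag, current = self._rows[row_key]
        # "*" disables the check: last writer wins.
        if if_match != "*" and if_match != current_etag:
            raise PreconditionFailed("entity changed since it was read")
        etag = self._new_etag()
        self._rows[row_key] = (etag, {**current, **props})
        return etag
```

With optimistic concurrency, a client that receives `PreconditionFailed` would retrieve the entity again, reapply its change, and retry with the fresh ETag.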
Queue
• No fixed limit on the number of messages; each message is up to 64 KB and is sent as XML
• FIFO is not guaranteed; delivery is at-least-once; dequeuing does not remove the message from the queue
• Dequeue count: tracks how many times a message has been dequeued; helps identify troublesome messages
• Unprocessed messages are deleted after 7 days
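The dequeue semantics above can be modelled in a few lines. This is a rough sketch: `SimpleQueue` is a hypothetical class, time is passed in explicitly as `now` to keep the example deterministic, and the 30-second visibility timeout is an assumed value. A dequeued message is only hidden for a while; unless the consumer deletes it, it reappears with a higher dequeue count, which is what gives at-least-once delivery and lets a worker spot a troublesome message.

```python
import itertools


class SimpleQueue:
    """Hypothetical in-memory queue with a visibility timeout."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout  # assumed value, in seconds
        self._messages = {}  # message id -> message record
        self._ids = itertools.count(1)

    def put(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = {
            "id": msg_id,
            "body": body,
            "dequeue_count": 0,
            "visible_at": 0.0,
        }
        return msg_id

    def get(self, now):
        """Return a visible message and hide it; it stays in the queue."""
        for msg in self._messages.values():
            if msg["visible_at"] <= now:
                msg["dequeue_count"] += 1
                msg["visible_at"] = now + self.visibility_timeout
                return dict(msg)  # copy, so callers cannot mutate queue state
        return None

    def delete(self, msg_id):
        """Remove a message for good after it has been processed."""
        self._messages.pop(msg_id, None)


def is_poison(message, max_dequeues=3):
    # A message dequeued too often probably keeps crashing its consumer.
    return message["dequeue_count"] > max_dequeues
```

A worker would call `get()`, process the message, and then call `delete()`; if it crashes in between, the message becomes visible again after the timeout and another worker picks it up.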
Twister4Azure: Parallel Data Analytics on Azure
• Benchmarks Twister4Azure (an extended, decentralized iterative MapReduce runtime for Azure) against traditional MapReduce and similar frameworks, using different bioinformatics tasks
References
• Overview
  • http://azure.microsoft.com/en-us/documentation/articles/storage-introduction/
  • http://msdn.microsoft.com/en-us/library/azure/hh508997.aspx
• A Comparison of Amazon Elastic MapReduce & Azure MapReduce
  • http://www.elixirpublishers.com/articles/1355562518_53%20%282012%29%2012059-12064.pdf
• Twister4Azure: Parallel Data Analytics on Azure
  • http://grids.ucs.indiana.edu/ptliupages/publications/CloudFuture.pdf
Thank You Q & A