Windows Azure Data Storage. Anton Boyko, .NET Developer.
Windows Azure Storage: storage in the cloud. Scalable, durable, and available. Accessible from anywhere at any time. Pay only for what the service uses. Exposed via RESTful web services. Use from Windows Azure Compute or from anywhere on the internet.
Windows Azure Storage Account: a user-specified, globally unique account name. You can choose the geo-location that hosts the storage account. US: South Central US, North Central US, West US, East US. Europe: Western Europe, Northern Europe. Asia: South East Asia, East Asia.
Storage in the Development Fabric: provides local "mock" storage that emulates storage in the cloud and allows offline development. Requires SQL Express 2005/2008 or above. There are some differences between cloud and dev storage: http://msdn.microsoft.com/en-us/gg433135 A good approach for developers testing before deployment: first push storage to the cloud, then use the Dev Fabric for compute connected to cloud-hosted storage, and finally move compute to the cloud.
Storage Security. Windows Azure Storage provides simple security for calls to the storage service: an HTTPS endpoint; two 512-bit symmetric keys per storage account, which can be regenerated independently; more granular security via Shared Access Signatures.
Windows Azure Storage Abstractions. Tables: structured storage; a table is a set of entities, and an entity is a set of properties. Blobs: simple named files along with metadata for the file. Queues: reliable storage and delivery of messages for an application. Drives: durable NTFS volumes for Windows Azure applications to use, based on blobs.
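Blob Storage Concepts. Blob address: http://<account>.blob.core.windows.net/<container>/<blobname> Hierarchy: Account > Container > Blob > Blocks/Pages. Diagram example: account contoso holds container images (blobs PIC01.JPG and PIC02.JPG) and container videos (blob VID1.AVI); each blob is made up of blocks or pages.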
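The address pattern on this slide can be sketched as a small helper. This is an illustration only: the function name `blob_url` is hypothetical, and the account, container, and blob names are the sample values from the slide's diagram.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build a blob address from the three parts shown on the slide."""
    # Leave "/" unescaped so folder-like blob names stay readable;
    # other unsafe characters (spaces, for example) are percent-encoded.
    path = quote(blob_name, safe="/")
    return f"http://{account}.blob.core.windows.net/{container}/{path}"


if __name__ == "__main__":
    print(blob_url("contoso", "images", "PIC01.JPG"))
    print(blob_url("contoso", "videos", "VID1.AVI"))
```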
Blob Details. Main web service operations: PutBlob, GetBlob, DeleteBlob, CopyBlob, SnapshotBlob, LeaseBlob.
Blob Details. Associate metadata with a blob: standard HTTP metadata/headers (Cache-Control, Content-Encoding, Content-Type, etc.); metadata is <name, value> pairs, up to 8 KB per blob; set either as part of PutBlob or independently.
Blob Details. A blob is always accessed by name. The name can include '/' or another delimiter, e.g. /<container>/myblobs/blob.jpg
Enumerating Blobs. The list operation (GET with comp=list) takes parameters: prefix, delimiter, include= (snapshots, metadata, etc.). Blobs under http://adventureworks.blob.core.windows.net/: Products/Bikes/SuperDuperCycle.jpg, Products/Bikes/FastBike.jpg, Products/Canoes/Whitewater.jpg, Products/Canoes/Flatwater.jpg, Products/Canoes/Hybrid.jpg, Products/Tents/PalaceTent.jpg, Products/Tents/ShedTent.jpg. Request: GET http://.../products?comp=list&prefix=Tents/&delimiter=/ Response: <Blob>Tents/PalaceTent.jpg</Blob> <Blob>Tents/ShedTent.jpg</Blob>
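How prefix and delimiter interact is easier to see in a local simulation. The sketch below does not call the storage service; `list_blobs` is a hypothetical stand-in that filters an in-memory list of names the way the listing parameters are described, rolling names up into a common prefix when the delimiter appears after the requested prefix.

```python
def list_blobs(names, prefix="", delimiter=None):
    """Return (blobs, rolled_up_prefixes) for names that start with prefix."""
    blobs, prefixes = [], []
    for name in sorted(names):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything up to the first delimiter after the prefix is
            # reported once as a "virtual folder" instead of as a blob.
            rolled = prefix + rest.split(delimiter, 1)[0] + delimiter
            if rolled not in prefixes:
                prefixes.append(rolled)
        else:
            blobs.append(name)
    return blobs, prefixes


# Blob names inside the container, taken from the slide.
NAMES = [
    "Bikes/SuperDuperCycle.jpg", "Bikes/FastBike.jpg",
    "Canoes/Whitewater.jpg", "Canoes/Flatwater.jpg", "Canoes/Hybrid.jpg",
    "Tents/PalaceTent.jpg", "Tents/ShedTent.jpg",
]

if __name__ == "__main__":
    print(list_blobs(NAMES, prefix="Tents/", delimiter="/"))
    print(list_blobs(NAMES, prefix="", delimiter="/"))
```

With the prefix `Tents/` the two tent blobs come back directly; with an empty prefix the same delimiter returns only the three folder-like prefixes.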
Pagination. Large lists of blobs can be paginated. A listing is paged when you set maxresults, or when the number of results exceeds the default value for maxresults (5000). First request: http://.../products?comp=list&prefix=Canoes&maxresults=2 returns <Blob>Canoes/Whitewater.jpg</Blob> <Blob>Canoes/Flatwater.jpg</Blob> <NextMarker>MarkerValue</NextMarker> Follow-up request: http://.../products?comp=list&prefix=Canoes&maxresults=2&marker=MarkerValue returns <Blob>Canoes/Hybrid.jpg</Blob>
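The marker loop can be sketched as follows. Both functions are hypothetical: `list_page` stands in for a single listing call against an in-memory list, and it uses the next blob name as the marker purely for illustration (a real marker should be treated as an opaque value). The stand-in also returns names in sorted order, which differs from the order drawn on the slide.

```python
def list_page(names, prefix, maxresults, marker=None):
    """Stand-in for one listing call: returns (page, next_marker)."""
    matching = sorted(n for n in names if n.startswith(prefix))
    start = matching.index(marker) if marker else 0
    end = start + maxresults
    page = matching[start:end]
    # A next marker is only returned when more results remain.
    next_marker = matching[end] if end < len(matching) else None
    return page, next_marker


def list_all(names, prefix, maxresults):
    """Follow the marker until the listing is exhausted."""
    results, marker = [], None
    while True:
        page, marker = list_page(names, prefix, maxresults, marker)
        results.extend(page)
        if marker is None:
            return results


CANOES = ["Canoes/Whitewater.jpg", "Canoes/Flatwater.jpg", "Canoes/Hybrid.jpg"]

if __name__ == "__main__":
    print(list_all(CANOES, prefix="Canoes", maxresults=2))
```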
Two Types of Blobs Under the Hood. Page blob: targeted at random read/write workloads; each blob consists of an array of pages; each page is identified by its offset from the start of the blob; size limit 1 TB per blob; optimistic or pessimistic (locking) concurrency via leases. Block blob: targeted at streaming workloads; each blob consists of a sequence of blocks; each block is identified by a block ID; size limit 200 GB per blob; optimistic concurrency via ETags.
Uploading a Block Blob. Uploading a large blob (diagram: TheBlob.wmv, a 10 GB movie, is split into blocks with IDs 1 to N and sent to Windows Azure Storage): blobName = "TheBlob.wmv"; PutBlock(blobName, blockId1, block1Bits); PutBlock(blobName, blockId2, block2Bits); ... PutBlock(blobName, blockIdN, blockNBits); PutBlockList(blobName, blockId1, ..., blockIdN); Benefits: efficient continuation and retry; parallel and out-of-order upload of blocks.
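The PutBlock / PutBlockList sequence can be tried out against an in-memory stand-in. Everything here is a sketch under assumptions: `FakeBlockBlobStore`, `make_block_id`, and `upload` are hypothetical names, nothing is sent over the network, and blocks are deliberately uploaded in shuffled order to show that only the final block list decides how the blob is assembled. Block IDs are generated as Base64 strings of equal length.

```python
import base64
import random


class FakeBlockBlobStore:
    """In-memory stand-in for the two operations named on the slide."""

    def __init__(self):
        self.uncommitted = {}  # (blob name, block id) -> bytes
        self.committed = {}    # blob name -> assembled bytes

    def put_block(self, blob_name, block_id, data):
        self.uncommitted[(blob_name, block_id)] = data

    def put_block_list(self, blob_name, block_ids):
        # The blob becomes the concatenation of blocks in list order.
        parts = [self.uncommitted[(blob_name, b)] for b in block_ids]
        self.committed[blob_name] = b"".join(parts)


def make_block_id(index):
    """Fixed-width, Base64-encoded block ID."""
    return base64.b64encode(f"{index:08d}".encode("ascii")).decode("ascii")


def upload(store, blob_name, data, block_size):
    offsets = range(0, len(data), block_size)
    blocks = [(make_block_id(i), data[off:off + block_size])
              for i, off in enumerate(offsets)]
    shuffled = blocks[:]
    random.shuffle(shuffled)  # simulate parallel, out-of-order uploads
    for block_id, chunk in shuffled:
        store.put_block(blob_name, block_id, chunk)
    # Commit with the IDs in their original order.
    store.put_block_list(blob_name, [block_id for block_id, _ in blocks])


if __name__ == "__main__":
    store = FakeBlockBlobStore()
    payload = bytes(range(256)) * 10
    upload(store, "TheBlob.wmv", payload, block_size=100)
    print(store.committed["TheBlob.wmv"] == payload)
```

A failed block can be retried on its own with the same ID before the block list is committed, which is the continuation benefit the slide refers to.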
Page Blob: Random Read/Write. Create MyBlob and specify blob size = 10 GB. Sparse storage: you are only charged for pages with data stored in them. Fixed page size = 512 bytes. Random access operations over the 10 GB address space: PutPage[512, 2048), PutPage[0, 1024), ClearPage[512, 1536), PutPage[2048, 2560). GetPageRange[0, 4096) returns the valid data ranges [0, 512) and [1536, 2560). GetBlob[1000, 2048) returns all zeros for the first 536 bytes; the next 512 bytes are the data stored in [1536, 2048).
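The sequence of page operations on this slide can be replayed in a small simulation to check the stated results. `FakePageBlob` is a hypothetical in-memory model, not a client for the storage service; it stores only the pages that hold data, which mirrors the sparse-storage point above.

```python
PAGE = 512  # fixed page size in bytes


class FakePageBlob:
    """Sparse in-memory model of a page blob."""

    def __init__(self, size):
        if size % PAGE:
            raise ValueError("blob size must be a multiple of the page size")
        self.size = size
        self.pages = {}  # page index -> 512 bytes of data

    def _check(self, start, end):
        if start % PAGE or end % PAGE or not 0 <= start < end <= self.size:
            raise ValueError("range must be page-aligned and inside the blob")

    def put_page(self, start, data):
        end = start + len(data)
        self._check(start, end)
        for i in range(start // PAGE, end // PAGE):
            off = i * PAGE - start
            self.pages[i] = data[off:off + PAGE]

    def clear_page(self, start, end):
        self._check(start, end)
        for i in range(start // PAGE, end // PAGE):
            self.pages.pop(i, None)

    def get_page_ranges(self, start, end):
        """Merge adjacent stored pages into [lo, hi) ranges."""
        ranges = []
        for i in sorted(self.pages):
            lo, hi = i * PAGE, (i + 1) * PAGE
            if hi <= start or lo >= end:
                continue
            if ranges and ranges[-1][1] == lo:
                ranges[-1] = (ranges[-1][0], hi)
            else:
                ranges.append((lo, hi))
        return ranges

    def get_blob(self, start, end):
        """Read bytes [start, end); pages without data read as zeros."""
        first, last = start // PAGE, (end + PAGE - 1) // PAGE
        zero = bytes(PAGE)
        whole = b"".join(self.pages.get(i, zero) for i in range(first, last))
        base = first * PAGE
        return whole[start - base:end - base]


if __name__ == "__main__":
    blob = FakePageBlob(10 * 1024 ** 3)
    blob.put_page(512, b"\x01" * 1536)   # PutPage[512, 2048)
    blob.put_page(0, b"\x02" * 1024)     # PutPage[0, 1024)
    blob.clear_page(512, 1536)           # ClearPage[512, 1536)
    blob.put_page(2048, b"\x03" * 512)   # PutPage[2048, 2560)
    print(blob.get_page_ranges(0, 4096))
    data = blob.get_blob(1000, 2048)
    print(data[:536] == bytes(536), data[536:] == b"\x01" * 512)
```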
Shared Access Signatures. Fine-grained access rights to blobs and containers. Sign the URL with the storage key to permit elevated rights. Revocation: use short time periods and re-issue, or use a container-level policy that can be deleted.
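The idea of signing a URL with a key and limiting it to a short time period can be illustrated with a generic HMAC sketch. This is not the actual Shared Access Signature format: the string that gets signed, the field order, and the function names `sign` and `is_valid` are assumptions made for illustration. Only the general technique (HMAC-SHA256 over the granted rights, expiry, and resource, using a secret key) is shown.

```python
import base64
import hashlib
import hmac


def sign(key_b64, resource, permissions, expiry_epoch):
    """Return a Base64 signature over the rights being granted."""
    # Illustrative layout only; the real service defines its own format.
    string_to_sign = f"{permissions}\n{expiry_epoch}\n{resource}"
    digest = hmac.new(base64.b64decode(key_b64),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid(key_b64, resource, permissions, expiry_epoch, signature, now_epoch):
    """Check the expiry first, then the signature in constant time."""
    if now_epoch >= expiry_epoch:
        return False
    expected = sign(key_b64, resource, permissions, expiry_epoch)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = base64.b64encode(bytes(64)).decode("ascii")  # dummy 512-bit key
    sig = sign(key, "/images/pic1.jpg", "r", expiry_epoch=1000)
    print(is_valid(key, "/images/pic1.jpg", "r", 1000, sig, now_epoch=999))
    print(is_valid(key, "/images/pic1.jpg", "rw", 1000, sig, now_epoch=999))
```

Because the expiry is part of the signed data, a holder cannot extend it, which is why short periods with re-issue work as a revocation strategy.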
Content Delivery Network (CDN). High-bandwidth global blob content delivery. 24 locations globally (US, Europe, Asia, Australia, and South America), and growing. Same experience for users no matter how far they are from the geo-location where the storage account is hosted. Blob service URL vs. CDN URL: Windows Azure Blob URL: http://images.blob.core.windows.net/ Windows Azure CDN URL: http://<id>.vo.msecnd.net/ Custom domain name for CDN: http://cdn.contoso.com/
Windows Azure CDN. To enable the CDN: register for CDN via the Dev Portal and set the container images to public. Diagram: a client sends GET http://guid01.vo.msecnd.net/images/pic1.jpg to an edge location; when the edge location does not hold pic1.jpg (404), the content is fetched from the Windows Azure Blob Service at http://sally.blob.core.windows.net/images/pic1.jpg and kept at the edge locations for the TTL. CDN endpoint: http://guid01.vo.msecnd.net/ Storage endpoint: http://sally.blob.core.windows.net/
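The edge-location behaviour in this diagram (serve from cache, fall back to the blob service on a miss, expire after a TTL) can be modelled roughly as below. `EdgeCache` is a hypothetical class, the origin is just a dictionary lookup, and time is passed in explicitly so the example stays deterministic; it is a sketch of the caching idea, not of how the CDN is implemented.

```python
class EdgeCache:
    """Toy model of one edge location in front of an origin store."""

    def __init__(self, origin, ttl_seconds):
        self.origin = origin      # callable: path -> bytes or None
        self.ttl = ttl_seconds
        self.entries = {}         # path -> (content, expires_at)
        self.origin_hits = 0

    def get(self, path, now):
        cached = self.entries.get(path)
        if cached is not None and now < cached[1]:
            return cached[0]      # still fresh: no trip to the origin
        # Miss or expired entry: fetch from the origin and cache the result.
        self.origin_hits += 1
        content = self.origin(path)
        if content is not None:
            self.entries[path] = (content, now + self.ttl)
        return content


if __name__ == "__main__":
    blobs = {"/images/pic1.jpg": b"jpeg-bytes"}
    edge = EdgeCache(blobs.get, ttl_seconds=60)
    edge.get("/images/pic1.jpg", now=0)    # miss, fetched from origin
    edge.get("/images/pic1.jpg", now=30)   # served from the edge
    edge.get("/images/pic1.jpg", now=60)   # TTL elapsed, fetched again
    print(edge.origin_hits)
```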
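Table Storage Concepts. Hierarchy: Account > Table > Entity. Diagram example: account contoso holds table customers (one entity with properties Name and Email, another with Name and EMailAdd) and table photos (entities with properties Photo ID and Date).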
Table Details. Not an RDBMS! Tables: Create, Query, Delete; tables can have metadata. Entities: Insert; Update (Merge: partial update; Replace: update entire entity); Delete; Query; Entity Group Transactions (multiple CUD operations in a single atomic transaction).
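The difference between the two update flavours is easy to show with plain dictionaries. This is a conceptual sketch only: `merge_entity` and `replace_entity` are hypothetical helpers operating on local data, and the property names are sample values.

```python
def merge_entity(stored, update):
    """Partial update: properties not mentioned in the update are kept."""
    return {**stored, **update}


def replace_entity(stored, update):
    """Full update: only the keys plus the new properties survive."""
    keys = {"PartitionKey": stored["PartitionKey"], "RowKey": stored["RowKey"]}
    return {**keys, **update}


if __name__ == "__main__":
    customer = {"PartitionKey": "contoso", "RowKey": "001",
                "Name": "Anna", "Email": "anna@example.com"}
    change = {"Email": "anna@contoso.example"}
    print(merge_entity(customer, change))    # Name is preserved
    print(replace_entity(customer, change))  # Name is dropped
```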
Entity Properties. An entity can have up to 255 properties, up to 1 MB per entity. Mandatory properties for every entity: PartitionKey and RowKey (the only indexed properties), which uniquely identify an entity and define the sort order; Timestamp, used for optimistic concurrency and exposed as an HTTP ETag. No fixed schema for other properties: each property is stored as a <name, typed value> pair, and no schema is stored for a table. Properties can be the standard .NET types: String, binary, bool, DateTime, GUID, int, int64, and double.
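Optimistic concurrency with an ETag can be sketched with an in-memory table. `FakeTable` and `PreconditionFailed` are hypothetical names, and the ETag here is a simple counter rather than anything derived from a timestamp; the point is only the compare-then-write pattern, where an update succeeds if the caller still holds the current ETag.

```python
import itertools


class PreconditionFailed(Exception):
    """Raised when the caller's ETag no longer matches the stored one."""


class FakeTable:
    def __init__(self):
        self._rows = {}                      # (pk, rk) -> (etag, properties)
        self._versions = itertools.count(1)  # source of fresh ETag values

    def insert(self, pk, rk, props):
        etag = str(next(self._versions))
        self._rows[(pk, rk)] = (etag, dict(props))
        return etag

    def get(self, pk, rk):
        etag, props = self._rows[(pk, rk)]
        return etag, dict(props)

    def update(self, pk, rk, props, if_match):
        current_etag, _ = self._rows[(pk, rk)]
        # "*" means: update regardless of the stored version.
        if if_match != "*" and if_match != current_etag:
            raise PreconditionFailed(f"expected {if_match}, found {current_etag}")
        return self.insert(pk, rk, props)


if __name__ == "__main__":
    table = FakeTable()
    table.insert("contoso", "001", {"Name": "Anna"})
    etag_a, _ = table.get("contoso", "001")   # reader A
    etag_b, _ = table.get("contoso", "001")   # reader B
    table.update("contoso", "001", {"Name": "Anna B."}, if_match=etag_a)
    try:
        table.update("contoso", "001", {"Name": "Ann"}, if_match=etag_b)
    except PreconditionFailed as err:
        print("second writer must re-read and retry:", err)
```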
No Fixed Schema. (Example table not recoverable; the surviving cell is a Fav Sport property with the value Canoeing.)
Querying. ?$filter=Last eq 'Wegner'
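A filter like the one on this slide has to be quoted and URL-encoded before it goes into a request. The helpers below are a hypothetical sketch: `eq_filter` and `query_string` are illustrative names, the property and value come from the slide, and embedded single quotes are doubled, which is the usual OData convention for string literals.

```python
from urllib.parse import quote


def eq_filter(prop, value):
    """Build an equality filter on a string property."""
    escaped = value.replace("'", "''")  # double embedded single quotes
    return f"{prop} eq '{escaped}'"


def query_string(filter_expr):
    """Percent-encode the filter so it is safe inside a URL."""
    return "?$filter=" + quote(filter_expr, safe="")


if __name__ == "__main__":
    expr = eq_filter("Last", "Wegner")
    print(expr)
    print(query_string(expr))
```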
Query Operators (Table Service Support): From; Where; Take (the value specified for the Take operator must be less than or equal to 1000); First; FirstOrDefault; Select (projection is supported). More details: http://msdn.microsoft.com/en-us/library/windowsazure/dd135725.aspx
Purpose of the PartitionKey. Entity locality: entities in the same partition are stored together. Entity Group Transactions: atomic multiple Insert/Update/Delete in the same partition in a single transaction. Table scalability: target throughput is 500 tps per partition and several thousand tps per account; Windows Azure monitors the usage patterns of partitions and automatically load-balances them.
Partitions and Partition Ranges. Diagram: the table Products starts out on Server A; after a split, Server A serves the partition range [MinKey - Canoes) and Server B serves [Canoes - MaxKey).
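Routing a PartitionKey to the server that owns its range can be sketched with a sorted list of split points. `PartitionMap` is a hypothetical class, the server names and the split point come from the slide's diagram, and Python's default string ordering is assumed, which may not match the ordering the service uses.

```python
import bisect


class PartitionMap:
    """Maps a PartitionKey to the server owning its half-open key range."""

    def __init__(self, split_points, servers):
        if len(servers) != len(split_points) + 1:
            raise ValueError("need exactly one more server than split points")
        if list(split_points) != sorted(split_points):
            raise ValueError("split points must be sorted")
        self.split_points = list(split_points)
        self.servers = list(servers)

    def server_for(self, partition_key):
        # bisect_right makes each range inclusive at its lower bound:
        # [MinKey, Canoes) -> servers[0], [Canoes, MaxKey) -> servers[1].
        index = bisect.bisect_right(self.split_points, partition_key)
        return self.servers[index]


if __name__ == "__main__":
    products = PartitionMap(["Canoes"], ["Server A", "Server B"])
    for key in ("Bikes", "Canoes", "Tents"):
        print(key, "->", products.server_for(key))
```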
A Server Is Not a Machine. SQL Server: a machine. SQL Azure Database Server: a TDS endpoint.
How It Works: Architecture. Client layer: used by the application to communicate directly with SQL Database. Services layer: gateway between the client layer and the platform layer. Platform layer: includes the physical servers and services that support the services layer. Infrastructure layer: IT administration of the physical hardware and OS. Diagram: in the client layer, SQL apps and tools, WCF, PHP, ADO.NET, and ODBC speak Tabular Data Stream (TDS) to a TDS + SSL endpoint in the services layer.
Create Database…
Use Familiar Technologies. Transact-SQL. Languages: .NET Framework (C#, Visual Basic, F#) via ADO.NET; C / C++ via ODBC; Java via Microsoft JDBC provider; PHP via Microsoft PHP provider. Frameworks: OData, Entity Framework, WCF Data Services, NHibernate. Tools: SQL Server Management Studio (2008 R2 and later); SQL Server command-line utilities (SQLCMD, BCP); CA Erwin® Data Modeler; Embarcadero Technologies DBArtisan®.
SQL Server Comparison. Focus on logical vs. physical administration. Database and log files are placed automatically. Three high-availability replicas are maintained for every database. Tables require a clustered index. Maximum database size is 150 GB.
Unsupported SQL Server Features: USE command, distributed transactions, distributed views; Service Broker; Common Language Runtime (CLR); SQL Agent.
SQL Database Firewall: securing your data. IP address-based access control for SQL Database, placed between the internet and the services layer. Rules can be defined at the server and database level. No IP is authorized by default. Configurable using the SQL Database Portal and REST API. Option to disable/enable access from applications hosted in Windows Azure.
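The deny-by-default, range-based rule model can be sketched with the standard `ipaddress` module. `Firewall` is a hypothetical class that checks addresses locally; it does not configure anything on the service. The rule name and the addresses (taken from documentation-reserved ranges) are made up for the example, and IPv4 is assumed throughout.

```python
import ipaddress


class Firewall:
    """Allow-list of named IP ranges; nothing is allowed until a rule exists."""

    def __init__(self):
        self.rules = {}  # rule name -> (first address, last address)

    def add_rule(self, name, start, end):
        lo, hi = ipaddress.ip_address(start), ipaddress.ip_address(end)
        if lo > hi:
            raise ValueError("start address must not be above end address")
        self.rules[name] = (lo, hi)

    def remove_rule(self, name):
        self.rules.pop(name, None)

    def is_allowed(self, client_ip):
        ip = ipaddress.ip_address(client_ip)
        return any(lo <= ip <= hi for lo, hi in self.rules.values())


if __name__ == "__main__":
    firewall = Firewall()
    print(firewall.is_allowed("203.0.113.42"))   # no rules yet
    firewall.add_rule("office", "203.0.113.0", "203.0.113.255")
    print(firewall.is_allowed("203.0.113.42"))
    print(firewall.is_allowed("198.51.100.7"))
```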
SQL Federation: database scalability. Scale to hundreds of nodes via database sharding. Multi-tenancy via flexible repartitioning. Online split operations to minimize downtime. Automatic data discovery regardless of changes in how data is partitioned.