210 likes | 411 Views
SQL and NoSQL Are Two Sides Of The Same Coin. Michael Rys, Microsoft Corp. @ SQLServerMike. Strata 2012 Conference, March 2012. Agenda. Scaling out your business is important! NoSQL Paradigms and NoSQL Platforms SQL learns from NoSQL (with a demo of SQL Azure Federations)
E N D
SQL and NoSQL Are Two Sides Of The Same Coin Michael Rys, Microsoft Corp.@SQLServerMike Strata 2012 Conference, March 2012
Agenda • Scaling out your business is important! • NoSQL Paradigms and NoSQL Platforms • SQL learns from NoSQL(with a demo of SQL Azure Federations) • NoSQL learns from SQL • Scalable Data Processing Platform of the Future
The Web 2.0 Business Architecture • Attract Individual Consumers: • Provide interesting service • Provide mobility • Provide social • Monetize Individual: • Upsell service • VIP • Speed • Extra Capabilities • Monetize the Social: • Improve individual experience • Re-sell Aggregate Data (e.g., Advertisers)
Social NetworkING: the Business Problem • 100s of million of users • 10s of million of users concurrently • Terabytes to petabytes of data • Structured and unstructured • Required (eventual) data consistency across users • E.g. show your updated state in your friends’ profile pages
Solution • Shard/Partition user data across hundreds to thousands of SQL Databases • Propagate data changes from one DB to other DBs using reliable, async Message Service • Managing routes from each DB to every other DB would be too complex • Global Transactions would hinder scale and availability • Provide a caching layer for performance • And also used for • Clean-up state (e.g. on account close) • Deploy business logic (stored procedures)
Example Architecture 3001-4000 1-1000 My DB gets updated AsyncMessage Service Dispatcher TX1 TX3 TX2 AsyncMessage I change my status userId=1024 AsyncMessage 2001-3000 1001-2000 TX4 TX5 Web Tier 4001-5000 5001-6000 Data Tier
Many LARGE SCALE customers using similar patterns • Patterns • Sharding and reliable messaging • Sharding and fan/out query layer • Caching layer • Customer Examples • Social Networking: Facebook, MySpace, etc • Online electronic stores (cannot give names ) • Travel reservation systems (e.g. Choice International) • MSN Casual Gaming • etc.
Lessons Learned from THESE scenarios • Requirehigh availability • Be able to scale out: • Functional and Data Partitioning Architecture • Provide scale-out processing: • Function shipping • Fanout and Map/Reduce processing • Be able to deal with failures: • Quorum • Retries • Eventual Consistency (similar to Read-consistent Snapshot Isolation) • Be able to quickly grow and change: • Elastic scale • Flexible, open schema • Multi-version schema support Move better support for these patterns into the Data Platform!
What is NoSQL about? • NoSQL = operational and developer agility at low CapEx and OpEx! • Low Cost • Free Open Source Stores • Scale CapEx cost below customer growth rate • Web friendly developer model and tool chain • Processing Paradigms • High Availability (scalable Replication, Fast Failover, DR/GeoDR, tunable latency) • Scale-out (Sharding, Map-Reduce, Elasticity) • Performance (tuned for specific workloads, Caching, co-located compute with partitioned state) • Tunable/Eventual Consistency • Data Model Paradigms • Data first: Flexible Schema • Low-impedance mismatch between programming and data model: • Key-Documents and Objects (BLOBS, JSON, XML, POJO) • Key-Wide Sparse Column Sets • Graphs (e.g., RDF) • Range from devices, over OLTP Web 2.0 applications to BigData Analytics
What Can SQL learn From NoSQL? • Low CapEx, Low OpEx • Built-in tunable High-Availability • Data scale-out (Sharding) • Processing scale-out (Map-Reduce, Fan-Out, tunable consistency) • Flexible Data Models • JSON (& XML) support • Sparse columns/Column sets • Integrate with BigData Analytics (e.g., Hadoop) Many Relational Database Systems are incorporating these learning!
Example: SQL Azure Federations • Provides Data Partitioning/Sharding at the Data Platform • Enables applications to build elastic scale-out applications • Provides non-blocking SPLIT/DROP for shards (MERGE to come later) • Auto-connect to right shard based on shardingkeyvalue • Provides SPLIT resilient query mode
SQL Azure Federation Concepts • Federation • Represents the data being sharded • Federation Root • Database that logically houses federations, contains federation meta data • Federation Key • Value that determines the routing of a piece of data (defines a Federation Distribution) • Atomic Unit • All rows with the same federation key value: always together! • Federation Member (aka Shard) • A physical container for a set of federated tables for a specific key range and reference tables • Federated Table • Table that contains only atomic units for the member’s key range • Reference Table • Non-sharded table Azure DB with Federation Root Federation Directories, Federation Users, Federation Distributions, … Federation “Orders_Fed” (Federation Key: CustomerID) Member: PK [min, 100) AUPK=5 AUPK=25 AUPK=35 Member: PK [100, 488) AUPK=105 AUPK=235 AUPK=365 Member: PK [488, max) Connection Gateway AUPK=555 AUPK=2545 AUPK=3565 Sharded Application
DemoMap-Reduce scale-out over SQL Azure Federations ShardedGamesInfo table using SQL Azure Federations Use a C# library that does implement a Map/Reduce processor on top SQL Azure Federations Mapper and Reducer are specified using SQL
What Can NOSQL learn From SQL? • Flexible data is good, but: • Provide optional schema in data platform to help with constraints and optimizations • Procedural Scale-Out processing is good, but: • Develop a declarative language suited for and across the data models (e.g., coSQL) • Standardize suitable abstractions and languages • Eventual Consistency is good, but: • Provide users the choice • Simple Queries are good, but: • Provide me with secondary indexes • it will be more efficient to join between two collections of JSON documents in the query engine than in the Application layer Many NoSQL Database Systems are starting to incorporate these learnings!
The Web 2.0 Business Architecture • Attract Individual Consumers: • Provide interesting service • Provide mobility • Provide social • Monetize Individual: • Upsell service • VIP • Speed • Extra Capabilities • Monetize the Social: • Improve individual experience • Re-sell Aggregate Data (e.g., Advertisers)
Scale-Out Data PLATFORM Architecture Copy OLTP Workloads Highly Available High Scale High Flexibility mostly touching 1 to low number of shards Traditional OLAP Workloads known schema Data warehouse, “Star joins” Dynamic OLAP Workloads 3Vs (Volume, Velocity, Variety) Exploratory Scale-out queries, often using eventual consistent scale-out frameworks like Hadoop Query SQL or NoSQL Store
Call To Action • Familiarize yourself with the NoSQL genes in the Microsoft Online Platform • Free 3-Month Trial for Windows and SQL Azure: http://www.windowsazure.com • Engage with us throughout Strata • Download slides with additional information and related resources: http://.../....
Related Resources • Scale-Out with SQL Databases • http://gigaom.com/cloud/facebook-shares-some-secrets-on-making-mysql-scale/ • Windows Gaming Experience Case Study: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=4000008310 • Scalable SQL: http://cacm.acm.org/magazines/2011/6/108663-scalable-sql • http://www.slideshare.net/MichaelRys/scaling-with-sql-server-and-sql-azure-federations • NoSQL and the Windows Azure Platform • Whitepaper: http://download.microsoft.com/download/9/E/9/9E9F240D-0EB6-472E-B4DE-6D9FCBB505DD/Windows%20Azure%20No%20SQL%20White%20Paper.pdf • SQL Federation blog: http://blogs.msdn.com/b/cbiyikoglu/archive/2011/03/03/nosql-genes-in-sql-azure-federations.aspx • Contact me • @SQLServerMike • http://sqlblog.com/blogs/michael_rys/default.aspx