Consistency-Based Service Level Agreements for Cloud Storage. Douglas B. Terry, Vijayan Prabhakaran, Ramakrishna Kotla, Mahesh Balakrishnan, Marcos K. Aguilera, Hussam Abu-Libdeh, Microsoft Research.
“A foolish consistency is the hobgoblin of little minds” -- Ralph Waldo Emerson (1841)
“… and of large clouds” -- Douglas Brian Terry (2013)
Today’s Cloud Storage Providers
• Replicate data widely
• Offer a choice of strong or eventual consistency, e.g. Amazon DynamoDB, Yahoo PNUTS, Google App Engine, Oracle NoSQL, Cassandra, … Microsoft Windows Azure
• Trade off consistency, availability, and performance
Problem
• Developers must choose consistency
• No single choice is best for all clients and situations
Shopping cart: want reads in under 300 ms.
(Table: roundtrip times in milliseconds)
Pileus (a cap cloud): key features
• Replicated, partitioned key-value store
• Choice of consistency
• Consistency-based service level agreements (SLAs)
Pileus System Model
API:
• BeginSession (SLA)
• BeginTx (SLA)
• Put (key, value)
• Get (key, SLA) returns value, consistency
• EndTx ()
• EndSession ()
(Diagram: Put(key, value) goes to the primary core, which uses sync replication; secondary nodes are updated by lazy replication; Get(key, SLA) can be sent to the primary core or to secondary nodes)
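The call sequence on this slide might look roughly as follows from a client's point of view. This is only a sketch: `ToyPileusClient` and its method names are hypothetical, it keeps data in a single in-process dict with no replication (so every Get trivially delivers strong consistency), and transactions are left out.

```python
from typing import Any, Dict, List, Optional, Tuple

# Assumed SLA shape: ranked list of (consistency, latency_ms, utility).
SLA = List[Tuple[str, float, float]]


class ToyPileusClient:
    """Hypothetical stand-in that mirrors the shape of the slide's API."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self._session_sla: Optional[SLA] = None

    def begin_session(self, sla: SLA) -> None:
        # The session SLA is the default for Gets that pass no SLA.
        self._session_sla = sla

    def put(self, key: str, value: Any) -> None:
        self._store[key] = value

    def get(self, key: str, sla: Optional[SLA] = None) -> Tuple[Any, str]:
        effective = sla if sla is not None else self._session_sla
        if effective is None:
            raise ValueError("no SLA given and no session SLA is active")
        # A single local copy is always up to date, so report "strong".
        return self._store.get(key), "strong"

    def end_session(self) -> None:
        self._session_sla = None


client = ToyPileusClient()
client.begin_session([("read my writes", 300, 1.0), ("eventual", 300, 0.5)])
client.put("cart", ["book"])
value, delivered = client.get("cart")
client.end_session()
```

The point of the sketch is the return value of `get`: the caller receives both the data and the consistency that was actually delivered, as the slide's "returns value, consistency" indicates.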
Read Consistency Guarantees [COPS 2011] [TACT 2002] [Bayou 1994]
Read Latencies
• Consistency affects latency
• Client location affects latency
(Table: roundtrip times in milliseconds)
Consistency-based SLA
• Applications declare desired consistency/latency
Shopping Cart:
rank  consistency     latency  utility
1.    strong          300 ms   1.0
2.    read my writes  300 ms   0.5
3.    eventual        300 ms   0.1
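One way to picture such a declaration is as a ranked list of (consistency, latency, utility) entries. The sketch below is an assumption about representation, not the Pileus data model: the names `SubSLA`, `SHOPPING_CART_SLA`, and `validate` are hypothetical, and the rule that utilities never increase down the ranking is an assumed invariant that happens to hold for the SLAs shown in these slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubSLA:
    consistency: str   # e.g. "strong", "read my writes", "eventual"
    latency_ms: float  # target latency for the Get
    utility: float     # value to the application if this subSLA is met


# Shopping-cart SLA from the slide, most preferred subSLA first.
SHOPPING_CART_SLA: List[SubSLA] = [
    SubSLA("strong", 300, 1.0),
    SubSLA("read my writes", 300, 0.5),
    SubSLA("eventual", 300, 0.1),
]


def validate(sla: List[SubSLA]) -> None:
    """Raise ValueError if the SLA breaks the assumed invariants."""
    if not sla:
        raise ValueError("an SLA needs at least one subSLA")
    previous = float("inf")
    for sub in sla:
        if not 0.0 <= sub.utility <= 1.0:
            raise ValueError("utility must lie in [0, 1]")
        if sub.latency_ms <= 0:
            raise ValueError("latency target must be positive")
        if sub.utility > previous:
            raise ValueError("utilities must not increase down the ranking")
        previous = sub.utility


validate(SHOPPING_CART_SLA)  # passes silently
```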
SLA Enforcement: Client Monitoring
For each tablet, the client records information that is:
• measured on Gets, Puts, and pings
• obtained from the configuration service
• returned from Gets, Puts, and pings
SLA Enforcement: Node Selection
On Get (key, SLA):
• For each subSLA and node:
  • compute P_latency
  • compute P_consistency
  • compute P_latency × P_consistency × utility
• Select node with maximum expected utility
• Send Get operation to node
• Measure RTT and update records
• Return data and delivered consistency to caller
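The selection loop above can be sketched as follows. Several parts are assumptions made for illustration: `p_latency` estimates the probability as the fraction of recorded round trips within the bound, `p_consistency` is simply passed in as a table of per-node probabilities rather than derived from monitoring data, and all names and numbers are hypothetical rather than measurements from the experiments.

```python
from typing import Dict, List, Optional, Tuple

# Assumed subSLA shape: (consistency, latency_ms, utility), best rank first.
SubSLA = Tuple[str, float, float]


def p_latency(rtts_ms: List[float], bound_ms: float) -> float:
    """Assumed estimator: share of recorded RTTs that met the bound."""
    if not rtts_ms:
        return 0.0
    return sum(1 for r in rtts_ms if r <= bound_ms) / len(rtts_ms)


def select_node(
    sla: List[SubSLA],
    rtts: Dict[str, List[float]],
    p_consistency: Dict[str, Dict[str, float]],
) -> Optional[Tuple[float, str, SubSLA]]:
    """Return (expected utility, node, subSLA) with the highest expected utility."""
    best: Optional[Tuple[float, str, SubSLA]] = None
    for sub in sla:
        consistency, bound_ms, utility = sub
        for node, samples in rtts.items():
            expected = (
                p_latency(samples, bound_ms)
                * p_consistency[node][consistency]
                * utility
            )
            # Strict ">" keeps the earlier (higher-ranked) choice on ties.
            if best is None or expected > best[0]:
                best = (expected, node, sub)
    return best


# Illustrative inputs only.
sla = [("read my writes", 300, 1.0), ("eventual", 300, 0.5)]
rtts = {"primary": [150.0, 148.0, 151.0], "nearby": [14.0, 15.0, 16.0]}
p_cons = {
    "primary": {"read my writes": 1.0, "eventual": 1.0},
    "nearby": {"read my writes": 0.9, "eventual": 1.0},
}
choice = select_node(sla, rtts, p_cons)
```

With these inputs both nodes meet the 300 ms bound, so the primary wins on its certain read-my-writes guarantee. If the primary's round trips exceeded the bound, its latency probability would drop to zero and the nearby node would be chosen instead.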
Experimental Setup
(Figure: map of the England, U.S., China, and India sites with link labels 161, 149, 308, 436, 287, 181)
System configuration:
• Primary: England
• Secondaries: U.S., India
• Clients: U.S., England, India, China
Benchmark:
• YCSB with 50/50 Gets/Puts
• 500-op sessions
Node selection schemes:
• Primary = get from primary
• Random = get from random node
• Closest = get from closest node
• Pileus = get from node with highest expected utility
Measurement:
• Average utility for Get operations
Experiment #1: SLA
Simplified shopping cart SLA:
rank  consistency     latency  utility
1.    read my writes  300 ms   1.0
2.    eventual        300 ms   0.5
Experiment #1: Delivered Utility
(Figure: average utility per Get, by client datacenter; the datacenters on the x-axis are labeled secondary, primary, secondary, client only)
• Primary selection works well when close to the primary, but poorly when distant
• Random selection rarely works well
• 100% of Gets from England; 100% meet top subSLA
• 91% from U.S.; 9% from England; 100% meet top subSLA; 14.5 ms avg. latency vs. 148 ms for primary
• 99.6% from U.S.; 0.4% from India; 96% meet top subSLA
• Pileus always delivers the most utility!
• 9% fail to meet read-my-writes
Experiment #2: SLA
Password checking SLA:
rank  consistency  latency  utility
1.    strong       150 ms   1.0
2.    eventual     150 ms   0.5
3.    strong       1000 ms  0.25
Experiment #2: Delivered Utility
(Figure: average utility per Get, by client datacenter; the datacenters on the x-axis are labeled secondary, primary, secondary, client only)
Conclusions: Main Contributions
Our Pileus system
• provides a broad choice of consistency guarantees and range of delivered read latency
• allows declarative specification of desired consistency and latency through consistency-based SLAs
• selects nodes to maximize expected utility while adapting to varying conditions