
Towards Predictable Multi-Tenant Shared Cloud Storage


Presentation Transcript


  1. Towards Predictable Multi-Tenant Shared Cloud Storage David Shue*, Michael Freedman*, and Anees Shaikh✦ (*Princeton, ✦IBM Research)

  2. Shared Services in the Cloud [Figure: tenants Y, F, Z, and T all consuming shared cloud services: DynamoDB, S3, EBS, SQS]

  3. Co-located Tenants Contend For Resources [Figure: objects from tenants Y, F, Z, and T spread across the nodes (DD) of a shared storage service; two tenants present 2x demand]

  4. Co-located Tenants Contend For Resources [Figure: the 2x-demand tenants send additional requests to the nodes they share with the other tenants]

  5. Co-located Tenants Contend For Resources [Figure: as before] Resource contention = unpredictable performance

  6. Towards Predictable Shared Cloud Storage Per-tenant Resource Allocation and Performance Isolation [Figure: tenants Z, Y, T, and F served by shared storage service nodes (SS) with per-tenant allocations]

  7. Towards Predictable Shared Cloud Storage [Figure: hard per-tenant rate limits (40, 80, 120, and 160 kreq/s) imposed on Zynga, Yelp, TP, and Foursquare at the shared storage service nodes (SS)] Hard limits are too restrictive and achieve lower utilization

  8. Towards Predictable Shared Cloud Storage [Figure: proportional tenant weights wz = 20%, wy = 30%, wt = 10%, wf = 40% for Zynga, Yelp, TP, and Foursquare; with demandz = 40% and demandf = 30%, Zynga receives ratez = 30%] Goal: per-tenant max-min fair share of system resources
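
The max-min goal above can be made concrete with a small worked example. The following water-filling sketch (illustrative only; the weights and the Zynga/Foursquare demands come from the slide, the helper function and the remaining demands are assumptions of ours) computes weighted max-min fair shares: each tenant is tentatively given capacity in proportion to its weight, tenants demanding less than that keep only their demand, and the surplus is re-divided among the rest. With the slide's numbers, Zynga (weight 20%, demand 40%) ends up with ratez = 30% once Yelp, TP, and Foursquare are satisfied.

    # Weighted max-min fair sharing: a minimal water-filling sketch (Python).
    def weighted_max_min(capacity, weights, demands):
        shares, active, remaining = {}, set(weights), capacity
        while active:
            total_w = sum(weights[t] for t in active)
            # Tenants whose demand fits within their tentative proportional share.
            satisfied = [t for t in active
                         if demands[t] <= remaining * weights[t] / total_w]
            if not satisfied:
                for t in active:
                    shares[t] = remaining * weights[t] / total_w
                break
            for t in satisfied:
                shares[t] = demands[t]
                remaining -= demands[t]
                active.remove(t)
        return shares

    # Slide 8's scenario (fractions of total capacity); Yelp and TP demands assumed.
    weights = {"zynga": 0.20, "yelp": 0.30, "tp": 0.10, "foursquare": 0.40}
    demands = {"zynga": 0.40, "yelp": 0.30, "tp": 0.10, "foursquare": 0.30}
    print(weighted_max_min(1.0, weights, demands))  # zynga -> 0.30, i.e. ratez = 30%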

  9. PISCES: Predictable Shared Cloud Storage • PISCES Goals • Allocate per-tenant fair share of total system resources • Isolate tenant request performance • Minimize overhead and preserve system throughput • PISCES Mechanisms: Partition Placement (Migration), Local Weight Allocation, Replica Selection, and Fair Queuing, operating at timescales from minutes down to microseconds

  10. Place Partitions By Fairness Constraints [Figure: tenants A, B, C, and D with weights weightA through weightD; each tenant's keyspace is divided into partitions of varying popularity, and the partitions are placed across PISCES Nodes 1-4, each hosting per-tenant VMs]

  11. Place Partitions By Fairness Constraints [Figure: the controller computes a feasible partition placement (PP) subject to per-tenant rate constraints RateA < wA, RateB < wB, RateC < wC; one node is over-loaded]

  12. Place Partitions By Fairness Constraints [Figure: the controller recomputes a feasible partition placement (PP)]
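
As a rough picture of what "compute a feasible partition placement" involves, the sketch below greedily bins partitions onto nodes so that no node's total expected request rate exceeds its capacity. This is our own simplified first-fit-decreasing illustration (the partition rates and capacities are made up), not the PISCES controller's actual placement algorithm, which also has to respect the per-tenant fairness constraints and bound migration cost.

    # Greedy first-fit-decreasing placement sketch (Python): assign each
    # partition, heaviest first, to the currently least-loaded node, and
    # fail if that node cannot absorb it.
    def place_partitions(partition_rates, node_capacity, num_nodes):
        load = [0.0] * num_nodes
        placement = {}
        for pid, rate in sorted(partition_rates.items(),
                                key=lambda kv: kv[1], reverse=True):
            node = min(range(num_nodes), key=lambda n: load[n])
            if load[node] + rate > node_capacity:
                raise RuntimeError("no feasible placement for partition " + pid)
            load[node] += rate
            placement[pid] = node
        return placement, load

    # Hypothetical example: 8 partitions across 4 nodes of 100 kreq/s each.
    rates = {"p%d" % i: r for i, r in enumerate([60, 50, 40, 40, 30, 30, 20, 10])}
    placement, load = place_partitions(rates, node_capacity=100, num_nodes=4)
    print(placement, load)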

  13. PISCES Mechanisms (each mechanism, its role, and its timescale, from minutes down to microseconds): • Partition Placement (Migration) at the controller: feasible global allocation, place partitions by fairness constraints

  14. Allocate Local Weights By Tenant Demand [Figure: each tenant's per-node local weights (e.g., wa1, wa2, wa3, wa4 for tenant A) sum to its global weight wA; tenants A and D exceed their shares (RA > wA, RD > wD) while B and C fall short (RB < wB, RC < wC)]

  15. Allocate Local Weights By Tenant Demand [Figure: the controller's weight-allocation (WA) step swaps local weights between tenants to minimize tenant latency, i.e., to track demand; example swaps: A→C, D→B, C→D, B→C, C→A]
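
The weight-swap step can be pictured as a reciprocal trade: on one node a tenant whose demand exceeds its local weight takes a small slice of weight from a tenant with slack, and the reverse trade happens on another node so each tenant's total weight stays fixed. The sketch below is our own simplified, illustrative version of that idea (single fixed step size, exhaustive pairwise search), not the PISCES weight-allocation (WA) algorithm itself.

    # Reciprocal weight-swap sketch (Python). weights[node][tenant] is the
    # local weight, demand[node][tenant] the observed demand share; every
    # swap on one node is mirrored on another, preserving global weights.
    STEP = 0.05  # fraction of weight exchanged per swap

    def swap_weights(weights, demand):
        nodes = list(weights)
        for n1 in nodes:
            for n2 in nodes:
                if n1 == n2:
                    continue
                for needy in weights[n1]:
                    for donor in weights[n1]:
                        if (demand[n1][needy] > weights[n1][needy] and
                                demand[n1][donor] < weights[n1][donor] - STEP and
                                demand[n2][donor] > weights[n2][donor] and
                                demand[n2][needy] < weights[n2][needy] - STEP):
                            weights[n1][needy] += STEP
                            weights[n1][donor] -= STEP
                            weights[n2][donor] += STEP
                            weights[n2][needy] -= STEP
        return weights

    # Two nodes, two tenants: A is hot on n1, B is hot on n2, so weight flows
    # toward each tenant's hotter node without changing either tenant's total.
    w = {"n1": {"A": 0.5, "B": 0.5}, "n2": {"A": 0.5, "B": 0.5}}
    d = {"n1": {"A": 0.7, "B": 0.2}, "n2": {"A": 0.2, "B": 0.7}}
    print(swap_weights(w, d))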

  16. Achieving System-wide Fairness (mechanism, role, and timescale): • Partition Placement (Migration) at the controller: feasible global allocation, place partitions by fairness constraints • Local Weight Allocation at the controller: locally fair allocation, allocate weights by tenant demand

  17. Select Replicas By Local Weight [Figure: a GET request reaches a request router (RR), whose replica-selection (RS) policy picks among replicas based on node latency; an even 1/2 : 1/2 split leaves one replica of tenant C's data over-loaded and the other under-utilized]

  18. Select Replicas By Local Weight [Figure: the replica-selection policy shifts the split to 1/3 : 2/3, steering more requests to the under-utilized replica]
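
One simple way to realize "select replicas based on node latency" is weighted random choice, where a replica's probability of being chosen shrinks as its observed latency grows. The sketch below uses inverse-latency weighting purely as an illustration (the node names and latencies are invented); the actual PISCES request routers adjust their per-replica weights with their own control algorithm.

    # Latency-weighted replica selection sketch (Python): route each GET to
    # a replica with probability inversely proportional to its recent latency.
    import random

    def select_replica(replica_latencies):
        weights = {node: 1.0 / lat for node, lat in replica_latencies.items()}
        total = sum(weights.values())
        r = random.random() * total
        for node, w in weights.items():
            r -= w
            if r <= 0:
                return node
        return node  # floating-point fallback: return the last replica

    # An over-loaded replica at 2 ms and an under-utilized one at 1 ms end up
    # with roughly the 1/3 : 2/3 split shown on slide 18.
    latencies = {"overloaded": 2.0, "underutilized": 1.0}
    counts = {"overloaded": 0, "underutilized": 0}
    for _ in range(10000):
        counts[select_replica(latencies)] += 1
    print(counts)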

  19. Achieving System-wide Fairness (mechanism, role, and timescale): • Partition Placement (Migration) at the controller: feasible global allocation, place partitions by fairness constraints • Local Weight Allocation at the controller: locally fair allocation, allocate weights by tenant demand • Replica Selection at the request routers (RR): load balancing (relaxation), select replicas by local weight

  20. Queue Tenants By Dominant Resource [Figure: fair queuing between a bandwidth-limited tenant (weight wA) and a request-limited tenant (weight wB), with traffic accounted per resource (in-bytes, out-bytes, requests), under out-bytes fair sharing]

  21. Queue Tenants By Dominant Resource [Figure: the same bandwidth-limited and request-limited tenants sharing an out-bytes bottleneck, now under dominant resource fair sharing]
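
Dominant-resource fair queuing can be sketched as follows: track each tenant's cumulative, capacity-normalized use of every resource (in-bytes, out-bytes, requests), and always serve the backlogged tenant whose weighted dominant share is currently smallest. The code below is an illustrative scheduler loop under assumed per-node capacities and request costs; it shows the scheduling principle, not the fair-queuing implementation inside PISCES.

    # Dominant-resource fair scheduling sketch (Python): the request-limited
    # tenant is dequeued far more often, but both tenants converge to roughly
    # equal weighted dominant shares.
    from collections import defaultdict

    CAPACITY = {"in_bytes": 125e6, "out_bytes": 125e6, "requests": 200e3}  # assumed

    class DRFScheduler:
        def __init__(self, weights):
            self.weights = weights                                  # tenant -> weight
            self.usage = defaultdict(lambda: defaultdict(float))    # tenant -> resource -> used
            self.queues = defaultdict(list)                         # tenant -> pending request costs

        def enqueue(self, tenant, cost):
            self.queues[tenant].append(cost)

        def dominant_share(self, tenant):
            return max(self.usage[tenant][r] / CAPACITY[r] for r in CAPACITY) / self.weights[tenant]

        def dequeue(self):
            backlogged = [t for t in self.queues if self.queues[t]]
            if not backlogged:
                return None
            tenant = min(backlogged, key=self.dominant_share)
            cost = self.queues[tenant].pop(0)
            for resource, amount in cost.items():
                self.usage[tenant][resource] += amount
            return tenant

    # Tenant A sends large GETs (out-bytes dominant), tenant B small GETs
    # (request-count dominant); equal weights, both backlogged.
    sched = DRFScheduler({"A": 1.0, "B": 1.0})
    for _ in range(1000):
        sched.enqueue("A", {"in_bytes": 100, "out_bytes": 10000, "requests": 1})
        sched.enqueue("B", {"in_bytes": 100, "out_bytes": 100, "requests": 1})
    served = defaultdict(int)
    for _ in range(600):
        served[sched.dequeue()] += 1
    print(dict(served))  # B is served ~16x more requests, equalizing dominant shares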

  22. Achieving System-wide Fairness (mechanism, role, and timescale, from minutes down to microseconds): • Partition Placement (Migration) at the controller: feasible global allocation, place partitions by fairness constraints • Local Weight Allocation at the controller: locally fair allocation, allocate weights by tenant demand • Replica Selection at the request routers (RR): load balancing (relaxation), select replicas by local weight • Fair Queuing at the storage nodes (N): fairness and isolation, queue tenants by dominant resource

  23. Evaluation • Does PISCES achieve system-wide fairness? • Does PISCES provide performance isolation? • Can PISCES adapt to shifting tenant distributions? [Testbed: 8 YCSB client machines (YCSB 1-8) and 8 PISCES storage nodes (PISCES 1-8) connected by a Gigabit Ethernet ToR switch]

  24. PISCES Achieves System-wide Fairness Ideal fair share: 110 kreq/s (1 kB requests) [Plot: GET requests (kreq/s) vs. time (s)]

  25. PISCES Provides Strong Performance Isolation 2x-demand vs. 1x-demand tenants (equal weights) [Plot: GET requests (kreq/s) vs. time (s)]

  26. PISCES Achieves Dominant Resource Fairness 76% of effective bandwidth, 76% of effective throughput [Plots: bottleneck resource usage as GET requests (kreq/s) and bandwidth (Mb/s), plus latency (ms), vs. time (s)]

  27. PISCES Achieves System-wide Weighted Fairness [Plots: GET requests (kreq/s) vs. time (s); fraction of requests vs. latency (ms)]

  28. PISCES Achieves Local Weighted Fairness [Plots: GET requests (kreq/s) vs. time (s); fraction of requests vs. latency (ms)]

  29. PISCES Achieves Local Weighted Fairness (tenant weights wt = 4, 3, 2, and 1) [Plot: GET requests (kreq/s) vs. time (s)]

  30. PISCES Adapts To Shifting Tenant Distributions [Figure: Tenants 1-4, alternating between weight 1x and weight 2x, with their object distributions shifted across nodes]

  31. PISCES Adapts To Shifting Tenant Distributions [Plot: per-tenant results for Tenants 1-4 across tenant weight settings]

  32. Conclusion • PISCES achieves: • System-wide per-tenant fair sharing • Strong performance isolation • Low operational overhead (< 3%) • PISCES combines: • Partition Placement: finds a feasible fair allocation (TBD) • Weight Allocation: adapts to (shifting) per-tenant demand • Replica Selection: distributes load according to local weights • Fair Queuing: enforces per-tenant fairness and isolation

  33. Future Work • Implement partition placement (for T >> N) • Generalize the fairness mechanisms to different services and resources (CPU, Memory, Disk) • Scale evaluation to a larger test-bed (simulation) Thanks!
