
The Only Constant is Change: Incorporating Time-Varying Bandwidth Reservations in Data Centers


Presentation Transcript


  1. The Only Constant is Change: Incorporating Time-Varying Bandwidth Reservations in Data Centers Di Xie, Ning Ding, Y. Charlie Hu, Ramana Kompella

  2. Cloud Computing is Hot (Figure: private cluster)

  3. Key Factors for Cloud Viability • Cost • Performance

  4. Performance Variability in Cloud • BW variation in the cloud due to contention [Schad’10 VLDB] • Causes unpredictable performance

  5. Network performance variability • Data analytics on an isolated cluster: an enterprise tenant’s MapReduce job completes in 4 hours

  6. Network performance variability • Data analytics on an isolated cluster: a tenant’s MapReduce job completes in 4 hours • Data analytics in a multi-tenant datacenter: completion time 10-16 hours • Variable network performance can inflate the job completion time

  7. Network performance variability • Data analytics on an isolated cluster: completion time 4 hours • Data analytics in a multi-tenant datacenter: completion time 10-16 hours • Variable tenant costs: expected cost (based on 4-hour completion time) = $100, actual cost = $250-400

  8. Network performance variability • Data analytics on an isolated cluster: completion time 4 hours • Data analytics in a multi-tenant datacenter: completion time 10-16 hours • Variable tenant costs: expected cost (based on 4-hour completion time) = $100, actual cost = $250-400 • Unpredictability of application performance and tenant costs is a key hindrance to cloud adoption • Key contributor: network performance variation

  9. Reserving BW in Data Centers • SecondNet [Guo’10] • Per VM-pair, per VM access bandwidth reservation • Oktopus [Ballani’11] • Virtual Cluster (VC) • Virtual Oversubscribed Cluster (VOC)

  10. How BW Reservation Works • Virtual Cluster model: request <N, B>, i.e., N VMs attached to a virtual switch, each reserved bandwidth B for the job duration [0, T] • Step 1: Determine the model • Step 2: Allocate and enforce the model • Only fixed-BW reservation is supported
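
The following is a minimal Python sketch of how a fixed-bandwidth Virtual Cluster request can be represented and how much bandwidth it needs on a link, using the standard VC rule that a subtree hosting m of the N VMs needs min(m, N - m) · B on its uplink (as in Oktopus). The class and function names are illustrative, not from Proteus.

```python
from dataclasses import dataclass

@dataclass
class VCRequest:
    """Fixed-bandwidth Virtual Cluster request <N, B>."""
    n_vms: int       # N: number of VMs behind the virtual switch
    bw_mbps: float   # B: per-VM bandwidth, constant over the whole job

def uplink_demand(req: VCRequest, vms_in_subtree: int) -> float:
    """Bandwidth the VC model needs on a subtree's uplink when the subtree
    hosts `vms_in_subtree` of the N VMs: traffic crossing the link is
    bounded by the smaller side, i.e. min(m, N - m) * B."""
    m = vms_in_subtree
    return min(m, req.n_vms - m) * req.bw_mbps

# The motivating example on the later slides: <N=4, B=500 Mbps>,
# with 2 of the 4 VMs placed under one switch.
req = VCRequest(n_vms=4, bw_mbps=500)
print(uplink_demand(req, 2))  # 1000.0 Mbps, i.e. a full 1 Gbps link
```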

  11. Network Usage for MapReduce Jobs Hadoop Sort, 4GB per VM

  12. Network Usage for MapReduce Jobs Hadoop Sort, 4GB per VM Hadoop Word Count, 2GB per VM

  13. Network Usage for MapReduce Jobs Hadoop Sort, 4GB per VM Hadoop Word Count, 2GB per VM Hive Join, 6GB per VM

  14. Network Usage for MapReduce Jobs Hadoop Sort, 4GB per VM Hadoop Word Count, 2GB per VM Hive Aggregation, 2GB per VM Hive Join, 6GB per VM

  15. Network Usage for MapReduce Jobs • Hadoop Sort, 4GB per VM • Hadoop Word Count, 2GB per VM • Hive Aggregation, 2GB per VM • Hive Join, 6GB per VM • Key observation: time-varying network usage

  16. Motivating Example • 4 machines, 2 VMs/machine, non-oversubscribed network (1 Gbps links) • Hadoop Sort • N: 4 VMs • B: 500Mbps/VM • Figure: with fixed 500 Mbps reservations, there is not enough BW on the 1 Gbps links for further allocation

  17. Motivating Example • 4 machines, 2 VMs/machine, non-oversubscribed network (1 Gbps links) • Hadoop Sort • N: 4 VMs • B: 500Mbps/VM • Figure: allocation of 500 Mbps per-VM reservations on the 1 Gbps links

  18. Under Fixed-BW Reservation Model • Virtual Cluster model, fixed 500 Mbps reservations on 1 Gbps links • Figure: bandwidth vs. time (0-30); only Job1, Job2, and Job3 are scheduled in the window

  19. Under Time-Varying Reservation Model • TIVC model for Hadoop Sort, 500 Mbps peak reservations on 1 Gbps links • Figure: bandwidth vs. time (0-30); Job1 through Job5 are scheduled in the same window • Doubles VM utilization, network utilization, and job throughput

  20. Temporally-Interleaved Virtual Cluster (TIVC) • Key idea: Time-Varying BW Reservations • Compared to fixed-BW reservation • Improves utilization of data center • Better network utilization • Better VM utilization • Increases cloud provider’s revenue • Reduces cloud user’s cost • Without sacrificing job performance

  21. Challenges in Realizing TIVC • Q1: What are the right model functions? • Q2: How to automatically derive the models? • Figure: Virtual Cluster request <N, B> (N VMs behind a virtual switch, fixed bandwidth B over [0, T]) vs. TIVC request <N, B(t)> (time-varying bandwidth)
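
As a rough illustration of the difference between the two request types, here is a sketch (an assumed representation, not the paper's data structures) of a TIVC request with a piecewise-constant B(t); the fixed-bandwidth VC request is just the special case with a single segment.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TIVCRequest:
    """Time-varying request <N, B(t)> with piecewise-constant B(t)."""
    n_vms: int
    segments: List[Tuple[float, float, float]]  # (t_start, t_end, bw_mbps)

    def bw_at(self, t: float) -> float:
        """Per-VM bandwidth reserved at time t (0 outside all segments)."""
        for start, end, bw in self.segments:
            if start <= t < end:
                return bw
        return 0.0

# A VC request <N, B> is the special case with one segment covering [0, T):
fixed = TIVCRequest(n_vms=4, segments=[(0, 30, 500)])
# A TIVC request reserves high bandwidth only in network-intensive phases
# (the phase boundaries and rates below are made up for illustration):
varying = TIVCRequest(n_vms=4, segments=[(0, 5, 100), (5, 12, 500), (12, 30, 100)])
print(fixed.bw_at(20), varying.bw_at(20))  # 500 100
```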

  22. Challenges in Realizing TIVC • Q3: How to efficiently allocate TIVC? • Q4: How to enforce TIVC?

  23. Challenges in Realizing TIVC • What are the right model functions? • How to automatically derive the models? • How to efficiently allocate TIVC? • How to enforce TIVC?

  24. How to Model Time-Varying BW? (Figure: measured network usage of Hadoop Hive Join)

  25. TIVC Models (Figure: Virtual Cluster model vs. TIVC models T11 and T32)

  26. Hadoop Sort

  27. Hadoop Word Count

  28. Hadoop Hive Join

  29. Hadoop Hive Aggregation

  30. Challenges in Realizing TIVC • What are the right model functions? • How to automatically derive the models? • How to efficiently allocate TIVC? • How to enforce TIVC?

  31. Possible Approach • “White-box” approach • Given the source code and data of a cloud application, analyze its quantitative networking requirements • Very difficult in practice • Observation: many jobs are repeated many times • E.g., 40% of jobs are recurring in Bing’s production data center [Agarwal’12] • The data itself may change across runs, but its size remains about the same

  32. Our Approach • Solution: “Black-box” profiling-based approach • Collect a traffic trace from a profiling run • Derive the TIVC model from the traffic trace • Profiling: same configuration as production runs • Same number of VMs • Same input data size per VM • Same job/VM configuration • Question: how much BW should we reserve for the application?

  33. Impact of BW Capping • Figure: job completion time under different BW caps • The no-elongation BW threshold is the smallest cap that does not elongate the job

  34. Choosing BW Cap • Tradeoff between performance and cost • Cap > threshold: same performance, costs more • Cap < threshold: lower performance, may cost less • Our approach: expose the tradeoff to the user • Profile under different BW caps (only caps below the threshold) • Expose run times and costs to the user • The user picks the appropriate BW cap
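
A toy sketch of how the profiled (cap, run time) pairs could be turned into the cost numbers shown to the user. The pricing function and all numbers below are assumptions for illustration only; the slides do not prescribe a specific pricing model.

```python
def job_cost(n_vms, runtime_hours, avg_reserved_gbps,
             vm_price=0.10, bw_price=0.05):
    """Hypothetical cost model: VM-hours plus reserved bandwidth-hours."""
    return n_vms * runtime_hours * (vm_price + avg_reserved_gbps * bw_price)

# Profiling results under different BW caps (made-up numbers):
profiles = [(200, 6.0), (300, 4.8), (400, 4.1), (500, 4.0)]  # (cap Mbps, hours)

for cap_mbps, hours in profiles:
    cost = job_cost(n_vms=4, runtime_hours=hours,
                    avg_reserved_gbps=cap_mbps / 1000.0)
    print(f"cap={cap_mbps} Mbps  runtime={hours} h  cost=${cost:.2f}")
# The user sees this table and picks the cap matching their budget/deadline.
```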

  35. From Profiling to Model Generation • Collect a traffic trace from each VM • Instantaneous throughput measured in 10 ms bins • Generate models for individual VMs • Combine them to obtain the overall job’s TIVC model • Simplifies allocation by working with one model • Does not lose efficiency, since per-VM models are roughly similar for MapReduce-like applications

  36. Generate Model for Individual VM • Choose a base bandwidth Bb • For periods where the measured bandwidth B exceeds Bb, reserve Bcap; elsewhere reserve Bb (Figure: per-VM BW over time with the Bb and Bcap levels)
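
A minimal sketch of the per-VM model generation described on the last two slides, assuming the trace has already been captured: bin the trace into 10 ms throughput samples, then reserve Bcap wherever the measured throughput exceeds Bb and Bb elsewhere. How Bb is chosen and how pulses are smoothed are simplified here and may differ from the actual Proteus procedure.

```python
def throughput_bins(packets, bin_ms=10):
    """Turn a per-VM trace [(timestamp_ms, bytes), ...] into average
    throughput (Mbps) per 10 ms bin."""
    if not packets:
        return []
    horizon = max(t for t, _ in packets)
    bins = [0.0] * (int(horizon // bin_ms) + 1)
    for t, nbytes in packets:
        bins[int(t // bin_ms)] += nbytes
    # bytes per bin -> Mbps: bytes * 8 bits / (bin_ms / 1000 s) / 1e6
    return [b * 8 / (bin_ms / 1000.0) / 1e6 for b in bins]

def pulse_model(throughput_mbps, b_base, b_cap, bin_ms=10):
    """Per-VM TIVC model: reserve b_cap during bins whose measured
    throughput exceeds b_base, and b_base elsewhere; consecutive bins at
    the same level are merged into (t_start_ms, t_end_ms, bw_mbps) segments."""
    segments = []
    for i, bw in enumerate(throughput_mbps):
        level = b_cap if bw > b_base else b_base
        start, end = i * bin_ms, (i + 1) * bin_ms
        if segments and segments[-1][2] == level:
            segments[-1] = (segments[-1][0], end, level)
        else:
            segments.append((start, end, level))
    return segments
```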

  37. Challenges in Realizing TIVC • What are the right model functions? • How to automatically derive the models? • How to efficiently allocate TIVC? • How to enforce TIVC?

  38. TIVC Allocation Algorithm • Spatio-temporal allocation algorithm • Extends the VC allocation algorithm to the time dimension • Employs dynamic programming • Chooses the lowest-level feasible subtree • Properties • Locality-aware • Efficient and scalable • 99th-percentile allocation time of 28 ms in a 64,000-VM data center when scheduling 5,000 jobs
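
The full algorithm is a dynamic program over the topology tree and the time dimension; the sketch below only illustrates the core temporal admission check it must perform on each candidate link: with m of the job's N VMs below the link, the demand at time t is min(m, N - m) · B(t), which must fit within the link's residual bandwidth over the whole job window. The function names and the coarse sampling are assumptions, not the paper's exact procedure.

```python
def link_feasible(residual_mbps, segments, n_vms, m, t_start):
    """Check whether a link can carry a job that places m of its n_vms VMs
    below it, if the job starts at t_start.  `residual_mbps(t)` returns the
    link's free bandwidth at absolute time t; `segments` is the job's TIVC
    model as [(rel_start, rel_end, bw_mbps), ...].  Simplified: samples the
    residual only at segment boundaries, not at every reservation change."""
    for rel_start, rel_end, bw in segments:
        demand = min(m, n_vms - m) * bw
        for t in (t_start + rel_start, t_start + rel_end):
            if residual_mbps(t) < demand:
                return False
    return True

# Example: a 1 Gbps link that is half-used throughout.
print(link_feasible(lambda t: 500.0,
                    segments=[(0, 5, 100), (5, 12, 500), (12, 30, 100)],
                    n_vms=4, m=2, t_start=0))  # False: needs 1000 Mbps at t=5
```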

  39. Challenges in Realizing TIVC • What are the right model functions? • How to automatically derive the models? • How to efficiently allocate TIVC? • How to enforce TIVC?

  40. Enforcing TIVC Reservation • Possible to enforce completely in the hypervisor, but the hypervisor: • does not have control over upper-level links • requires online rate monitoring and feedback • adds overhead and complexity • Instead, enforce BW reservations in switches • Most small jobs fit within a rack • Only a few large jobs cross the core • Avoids complexity in hypervisors
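
To make the enforcement step concrete, here is a small illustrative sketch that turns the admitted jobs' TIVC models into a time-ordered list of rate-limit updates; whether those updates are applied at switch queues or hypervisor rate limiters is the deployment choice discussed above, and none of the names below come from Proteus.

```python
def enforcement_schedule(jobs):
    """Build a time-ordered schedule of rate-limit updates.
    `jobs` maps job_id -> (t_start, segments), where segments is the job's
    TIVC model [(rel_start, rel_end, bw_mbps), ...].  Returns a sorted list
    of (absolute_time, job_id, bw_mbps) events; the enforcement point
    (switch or hypervisor) applies each new rate when its time arrives."""
    events = []
    for job_id, (t_start, segments) in jobs.items():
        for rel_start, _rel_end, bw in segments:
            events.append((t_start + rel_start, job_id, bw))
    return sorted(events)

# Two jobs whose network-intensive phases are interleaved in time:
print(enforcement_schedule({
    "job1": (0,  [(0, 5, 100), (5, 12, 500), (12, 30, 100)]),
    "job2": (10, [(0, 5, 100), (5, 12, 500), (12, 30, 100)]),
}))
```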

  41. Challenges in Realizing TIVC • What are the right model functions? • How to automatically derive the models? • How to efficiently allocate TIVC? • How to enforce TIVC?

  42. Proteus: Implementing TIVC Models 1. Determine the model 2. Allocate and enforce the model

  43. Evaluation • Large-scale simulation • Performance • Cost • Allocation algorithm • Prototype implementation • Small-scale testbed

  44. Simulation Setup • 3-level tree topology: 20 aggregation switches (50 Gbps uplinks), 20 ToR switches per aggregation switch (10 Gbps uplinks), 40 hosts per ToR (1 Gbps links) • 16,000 hosts x 4 VMs • 4:1 oversubscription • Workload • N: exponential distribution with mean 49 • B(t): derived from real Hadoop applications
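
For reference, a tiny sketch that rebuilds the simulated topology's counts from the slide; the per-level fan-outs and link speeds are read off the figure (20 aggregation switches with 50 Gbps uplinks, 20 ToR switches per aggregation switch with 10 Gbps uplinks, 40 hosts per ToR at 1 Gbps), so treat them as reconstructed rather than quoted.

```python
# Reconstructed from the slide's figure; fan-outs and speeds are inferred.
AGGR_SWITCHES = 20    # aggregation switches, 50 Gbps uplink each
TOR_PER_AGGR  = 20    # ToR switches per aggregation switch, 10 Gbps uplink
HOSTS_PER_TOR = 40    # hosts per ToR switch, 1 Gbps link each
VMS_PER_HOST  = 4

hosts = AGGR_SWITCHES * TOR_PER_AGGR * HOSTS_PER_TOR   # 16,000 hosts
vms   = hosts * VMS_PER_HOST                           # 64,000 VMs
tor_oversub  = HOSTS_PER_TOR * 1 / 10                  # 4.0 -> 4:1 at the ToR
aggr_oversub = TOR_PER_AGGR * 10 / 50                  # 4.0 -> 4:1 at the aggregation
print(hosts, vms, tor_oversub, aggr_oversub)           # 16000 64000 4.0 4.0
```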

  45. Batched Jobs • Scenario: 5,000 time-insensitive jobs, 1/3 of each type • All remaining results are for the mixed workload • Figure: completion time reductions of 42%, 21%, 23%, and 35% across the workloads

  46. Varying Oversubscription and Job Size • 25.8% completion time reduction even for a non-oversubscribed network

  47. Dynamically Arriving Jobs • Scenario: accommodate users’ requests in a shared data center • 5,000 jobs arrive dynamically under varying loads • Rejected jobs: VC 9.5%, TIVC 3.4%

  48. Analysis: Higher Concurrency • Under 80% load: 7% higher job concurrency, 28% higher VM utilization, and (charging per VM) 28% higher revenue • Rejected jobs tend to be large

  49. Testbed Experiment • Setup • 18 machines • 30 real MapReduce jobs • 10 Sort • 10 Hive Join • 10 Hive Aggregation

  50. Testbed Result • Baseline suffers from high variability in completion time, while TIVC achieves performance similar to VC • TIVC finishes jobs faster than VC; Baseline finishes the fastest
