Volley: Automated Data Placement for Geo-distributed Cloud Services

Volley: Automated Data Placement for Geo-distributed Cloud Services. Presented by Komal Pal, Vaibhav Rastogi. Agenda: Introduction, Motivation, Design & Implementation, Evaluation, Conclusions and Future Work.

Presentation Transcript


  1. Volley: Automated Data Placement for Geo-distributed Cloud Services Presented by Komal Pal, Vaibhav Rastogi

  2. Agenda • Introduction • Motivation • Design & Implementation • Evaluation • Conclusions and Future Work

  3. Introduction • Volley is a system for cloud services that automatically places data across geo-distributed datacenters (DCs), taking into account: • User-perceived latency • Business constraints: datacenter resources, bandwidth costs

  4. Motivation

  5. Motivation • Problem: automated data placement that serves each user from the best datacenter for that user. • Simplistic solution: migrate data to the DC geographically closest to the user. • Challenges: costs to the DC operator: • WAN bandwidth between DCs • Over-provisioning caused by skewed DC utilization

  6. Motivation • Need for a new heuristic that can handle recent trends in modern cloud services: • Shared data • Data inter-dependencies • Application changes • Reaching DC capacity limits • User mobility

  7. Cloud Service Trends • Live Mesh, Live Messenger: month-long workload traces a) Data Inter-dependencies

  8. Cloud Service Trends b) Client Geographic Diversity

  9. Cloud Service Trends c) Geographically Distant Data Sharing

  10. Cloud Service Trends d) Client Mobility

  11. Volley! • First research work to address placement of data across geo-distributed DCs. • Uses an iterative optimization algorithm based on weighted spherical means that handles the complexities of shared data and data inter-dependencies.

  12. Design and Implementation

  13. Design Typical dataflow of an application using Volley

  14. Design • Workflow: • Request logging: timestamp, src, dst, req_size, id • Additional inputs: • RAM, disk, and CPU requirements for each type of data • Capacity and cost model for all DCs • Model of inter-DC and client-DC latencies • Any additional constraints, e.g. legal • Application-specific migration
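The request-log fields listed on this slide can be pictured as a small record type. The following is a minimal sketch, not Volley's actual log format: the field types, the units, the name `req_id`, and the helper `pair_weights` are assumptions made for illustration. It counts how often each pair of endpoints communicates, which is the kind of weight a placement algorithm could consume.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestLog:
    """One logged request; field names follow the slide, types are assumed."""
    timestamp: float  # assumed: seconds since the epoch
    src: str          # source entity (client or data item)
    dst: str          # destination entity
    req_size: int     # assumed: size in bytes
    req_id: str       # the slide's "id" field (renamed to avoid shadowing id())


def pair_weights(logs):
    """Count requests per unordered (src, dst) pair (hypothetical helper)."""
    weights = Counter()
    for record in logs:
        key = tuple(sorted((record.src, record.dst)))
        weights[key] += 1
    return weights
```

Treating pairs as unordered is a design choice for this sketch: a request and its reply both add weight to the same edge between the two endpoints.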

  15. Algorithm • Phase 1: Compute initial placement (weighted spherical means) • Phase 2: Iteratively move data to reduce latency (weighted spring model, spherical coordinates) • Phase 3: Iteratively collapse data to DCs
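To make Phase 1 more concrete, here is a minimal sketch of one way to compute a weighted spherical mean: fold the points in one at a time, each time moving along the great circle toward the new point by that point's share of the total weight so far. The slide does not give the exact procedure, so the function names (`interp`, `weighted_spherical_mean`), the `(weight, lat, lon)` input shape, and the handling of coincident or antipodal points are assumptions for illustration only.

```python
import math


def to_vec(lat, lon):
    """Convert latitude/longitude in degrees to a 3-D unit vector."""
    la, lo = math.radians(lat), math.radians(lon)
    return (math.cos(la) * math.cos(lo), math.cos(la) * math.sin(lo), math.sin(la))


def to_latlon(v):
    """Convert a 3-D unit vector back to (lat, lon) in degrees."""
    x, y, z = v
    return (math.degrees(math.asin(max(-1.0, min(1.0, z)))),
            math.degrees(math.atan2(y, x)))


def interp(t, a, b):
    """Point a fraction t of the way from a to b along their great circle."""
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    omega = math.acos(dot)
    s = math.sin(omega)
    if s < 1e-12:
        # Coincident or antipodal points: the arc is degenerate or ambiguous,
        # so this sketch simply keeps the current point.
        return a
    ca = math.sin((1.0 - t) * omega) / s
    cb = math.sin(t * omega) / s
    return tuple(ca * p + cb * q for p, q in zip(a, b))


def weighted_spherical_mean(points):
    """points: iterable of (weight, lat, lon); returns (lat, lon) in degrees."""
    mean, total = None, 0.0
    for weight, lat, lon in points:
        v = to_vec(lat, lon)
        total += weight
        mean = v if mean is None else interp(weight / total, mean, v)
    if mean is None:
        raise ValueError("need at least one point")
    return to_latlon(mean)
```

For example, a data item accessed three times from (0°, 0°) and once from (0°, 90°E) lands a quarter of the way along the equator from the first location. With more than two points the result of this incremental scheme can depend on the order in which points are folded in, so it should be read as an approximation.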

  16. Evaluation

  17. Evaluation • Comparison of Volley with: • commonIP: data at the DC closest to the user • oneDC: all data in one DC • hash: hash data to DCs for load balancing • Analytical evaluation using 12 commercial DCs as potential locations.
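The three baselines above are simple enough to sketch in a few lines. This is an illustrative sketch only: the function names, the DC names and coordinates in the usage note, and the use of haversine distance as the notion of "closest" are assumptions, not details taken from the evaluation itself.

```python
import hashlib
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def place_common_ip(user_loc, dcs):
    """commonIP-style baseline: the DC geographically closest to the user."""
    return min(dcs, key=lambda name: great_circle_km(user_loc, dcs[name]))


def place_one_dc(chosen_dc):
    """oneDC baseline: every item goes to the same, pre-chosen DC."""
    return chosen_dc


def place_hash(item_id, dcs):
    """hash baseline: spread items across DCs with a stable hash."""
    names = sorted(dcs)  # fixed order so placement is deterministic
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    return names[int(digest, 16) % len(names)]
```

A usage example with hypothetical DC locations: given `dcs = {"dublin": (53.35, -6.26), "virginia": (38.0, -78.5), "singapore": (1.35, 103.8)}`, a user near London at `(51.5, -0.13)` is placed in `"dublin"` by `place_common_ip`, while `place_hash` ignores location entirely and only balances load.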

  18. Evaluation : DC capacity skew

  19. Evaluation : Inter-datacenter traffic

  20. Evaluation : User-perceived latency

  21. Evaluation: Volley vs. commonIP on a live system

  22. Evaluation : Convergence

  23. Evaluation: Convergence

  24. Evaluation: Resource Demands & Frequent Re-computation • Volley's operational cost is small compared to the operational savings in bandwidth consumption

  25. Conclusion and Future Work

  26. Conclusions and Future Work • Need for automated techniques to place data across geo-distributed DCs • Volley is the first system in this domain • Volley is based on analysis of traces of 2 large scale commercial cloud services – Live Mesh & Live Messenger

  27. Conclusions and Future Work • Reduces DC capacity skew by over 2x, inter-DC traffic by over 1.8x, and 75th percentile latency by over 30% • What's next: using Volley to identify potential DC sites that will improve latency at modest cost

  28. Thank You!

  29. Limitations • Analysis may not be representative: only 2 applications, both Microsoft-specific. (Data with inter-dependencies etc. is very representative.) • Latency improvements are not very significant, and there is no real cost-benefit analysis. (Confidentiality issues.) • Too simplistic to assume that only one such policy is in use at every datacenter without any optimization. (Most common case; no other published work shows alternatives.) • Uses only geographic location, with no RTT analysis. (First foray into this area; can be combined with other approaches for further optimization.) • Depends on geo-location databases, which may not always be accurate. (Still an improvement over existing mechanisms; may not even require higher granularity than what the databases offer.)