Dynamic Provisioning for Multi-tier Internet Applications

This paper explores the challenges of provisioning resources for multi-tier internet applications and proposes a dynamic capacity allocation strategy based on workload prediction and reactive provisioning. The approach aims to provide desired response times under varying workloads.


Presentation Transcript


  1. Dynamic Provisioning for Multi-tier Internet Applications • Bhuvan Urgaonkar, Prashant Shenoy, Abhishek Chandra, Pawan Goyal • University of Massachusetts; University of Minnesota; Veritas Software India Pvt. Ltd.

  2. Internet Applications • Proliferation of Internet applications (pictured examples: auction site, online game, online store) • Growing significance in personal, business affairs • Focus: Internet server applications

  3. Multi-tiered Internet Applications • Internet applications: multiple tiers • Example: 3 tiers: HTTP, J2EE app server, database • Replicable components • Individual tiers: partially or fully replicable • Example: clustered HTTP, J2EE server, shared-nothing db • Each tier uses a dispatcher: load balancing [Figure labels: requests, load balancer, HTTP, J2EE, database]

  4. Internet Workloads Are Dynamic • Multi-time-scale variations • Time-of-day, hour-of-day • Flash crowds • Key issue: How to provide desired response time under varying workloads? [Figures: request rate (req/min) vs. time (hours); arrivals per min vs. time (days)]

  5. Internet Data Center • Internet applications run on data centers • Server farms • Provide computational and storage resources • Applications share data center resources • Problem: How should the platform allocate resources to absorb workload variations?

  6. Our Provisioning Approach • Flexible queuing theoretic model • Captures all tiers in the application • Predictive provisioning • Long-term workload variations • Reactive provisioning • Short-term variations, flash crowds

  7. Talk Outline • Introduction • Internet data center model • Existing provisioning approaches • Dynamic capacity provisioning • Implementation and evaluation • Summary

  8. Data Center Model • Dedicated hosting: each application runs on a subset of servers in the data center • Subsets are mutually exclusive: no server sharing • Data center hosts multiple applications • Free server pool: unused servers [Figure labels: retail Web site, streaming]

  9. Single-tier Provisioning • Single tier provisioning well studied [Muse] • Non-trivial to extend to multiple-tiers • Strawman #1: use single-tier provisioning independently at each tier • Problem: independent tier provisioning may not increase goodput [Figure: 14 req/s offered to three tiers with capacities C=15, C=10, C=10.1; 4 req/s dropped, 10 req/s served]

  10. Single-tier Provisioning • Single tier provisioning well studied [Muse] • Non-trivial to extend to multiple-tiers • Strawman #1: use single-tier provisioning independently at each tier • Problem: independent tier provisioning may not increase goodput [Figure: same application after one tier is raised from C=10 to C=20; capacities C=15, C=20, C=10.1; 14 req/s offered, 3.9 req/s dropped, 10.1 req/s served]
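
The numbers on slides 9 and 10 can be reproduced with a few lines of Python. This is only a sketch: it assumes each tier simply drops whatever exceeds its capacity, and it assumes the tier order C=15, C=10 (later 20), C=10.1, which is inferred from the figure labels. The function name `goodput` is made up for illustration.

```python
def goodput(offered_rate, tier_capacities):
    """Rate (req/s) that survives every tier.

    Assumption: a tier forwards at most its capacity and drops the rest,
    so the rate leaving a tier is min(incoming rate, capacity).
    """
    rate = offered_rate
    for capacity in tier_capacities:
        rate = min(rate, capacity)
    return rate


# Slide 9: the C=10 tier is the bottleneck, 4 of 14 req/s are dropped.
before = goodput(14, [15, 10, 10.1])

# Slide 10: that tier is doubled to C=20, but the C=10.1 tier now
# becomes the bottleneck and 3.9 of 14 req/s are still dropped.
after = goodput(14, [15, 20, 10.1])

print(before, after)
```

Doubling the capacity of one tier raises goodput by only 0.1 req/s here, which is the point of the strawman: provisioning a tier in isolation just moves the bottleneck downstream.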

  11. Model-based Provisioning • Black box approach • Treat application as a black box • Measure response time from outside • Increase allocation if response time > SLA • Use a model to determine how much to allocate • Strawman #2: use black box for multi-tier apps • Problems: • Unclear which tier needs more capacity • May not increase goodput if bottleneck tier is not replicable [Figure: three-tier example with capacities C=15, C=20, C=10.1; 14 req/s offered, 10.1 req/s served]

  12. Provisioning Multi-tier Apps • Approach: holistic view of multi-tier application • Determine tier-specific capacity independently • Allocate capacity by looking at all tiers (and other apps) • Predictive provisioning • Long-term provisioning: time scale of hours • Maintain long-term workload statistics • Predict and provision for the next few hours • Reactive provisioning • Short term provisioning: time scale of several minutes • React to “current” workload trends • Correct errors of long-term provisioning • Handle flash crowds (inherently unpredictable)

  13. Predictive Provisioning • Workload predictor • Predicts workload based on past observations • Application model • Infers capacity needed to handle given workload [Figure labels: past workload, Predictor, predicted workload, response time target, Model, required capacity]

  14. Workload Prediction • Long term workload monitoring and prediction • Monitor workload for multiple days • Maintain a histogram for each hour of the day • Capture time of day effects • Forecast based on • Observed workload for that hour in the past • Observed workload for the past few hours of the current day • Predict a high percentile of expected workload [Figure labels: Mon, Tue, Wed, Today]
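
A minimal sketch of the per-hour histogram idea on slide 14, in Python. The slide does not say how the two inputs are combined, so the details here are assumptions: the class name `HourlyPredictor`, the nearest-rank percentile, and the choice to add the mean prediction error of the past few hours as a correction are all hypothetical.

```python
import math
from collections import defaultdict


class HourlyPredictor:
    """Keeps past request rates per hour of day and predicts a high percentile."""

    def __init__(self, percentile=95):
        self.percentile = percentile        # integer percent, e.g. 95
        self.history = defaultdict(list)    # hour of day -> observed rates

    def observe(self, hour, rate):
        """Record the request rate seen during the given hour of some day."""
        self.history[hour].append(rate)

    def predict(self, hour, recent_errors=()):
        """Predict the rate for `hour` from past days, corrected by recent errors.

        recent_errors: (actual - predicted) for the past few hours of today.
        Adding their mean is an assumed correction rule, not the slide's.
        """
        samples = sorted(self.history[hour])
        if not samples:
            raise ValueError(f"no history for hour {hour}")
        # Nearest-rank percentile over the stored samples.
        rank = max(1, math.ceil(self.percentile * len(samples) / 100))
        base = samples[rank - 1]
        correction = sum(recent_errors) / len(recent_errors) if recent_errors else 0.0
        return max(0.0, base + correction)


predictor = HourlyPredictor(percentile=90)
for rate in [100, 120, 110, 300, 130, 115, 125, 105, 135, 140]:
    predictor.observe(9, rate)

print(predictor.predict(9))                          # from past days only
print(predictor.predict(9, recent_errors=[10, 20]))  # today is running hot
```

Predicting a high percentile rather than the mean biases the forecast toward over-provisioning, which is the cheaper mistake when the goal is a response time target.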

  15. Model-based Capacity Inference • Queuing theoretic application model • Each individual server is a G/G/1 queue • Derive per-tier E(r) from end-to-end SLA • Monitor other parameters and determine λ (per-server capacity) • Use predicted workload λ_pred to determine # servers per tier • Assumes perfect load balancing in each tier [Figure: λ_pred arriving at a tier of G/G/1 servers]
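
The slide does not give the formula behind the per-server capacity, so the sketch below assumes one standard choice: Kingman's upper bound on the mean waiting time of a G/G/1 queue, W <= λ(σ_a² + σ_b²) / (2(1 - λs)). Requiring s + W <= d and solving for λ gives λ <= 1 / (s + (σ_a² + σ_b²) / (2(d - s))). The function names and the simple ceiling rule for the server count are hypothetical.

```python
import math


def per_server_capacity(mean_service, target_response, var_interarrival, var_service):
    """Largest arrival rate (req/s) one G/G/1 server can take while the
    Kingman bound keeps mean response time at or below the target.

    mean_service     s     mean service time (s)
    target_response  d     per-tier mean response time target E(r) (s)
    var_interarrival σ_a²  variance of request inter-arrival time (s^2)
    var_service      σ_b²  variance of service time (s^2)
    """
    if target_response <= mean_service:
        raise ValueError("target must exceed the mean service time")
    queueing_term = (var_interarrival + var_service) / (
        2.0 * (target_response - mean_service)
    )
    return 1.0 / (mean_service + queueing_term)


def servers_needed(predicted_rate, capacity):
    """Servers for a tier, assuming perfect load balancing across replicas."""
    return max(1, math.ceil(predicted_rate / capacity))


# Illustrative numbers only: 10 ms service time, 100 ms target.
capacity = per_server_capacity(0.01, 0.1, 0.0004, 0.0001)
print(round(capacity, 2), servers_needed(300, capacity))
```

With perfect load balancing each replica sees an equal share of λ_pred, which is why a plain division is enough here; an imbalanced dispatcher would need extra headroom.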

  16. Reactive Provisioning • Idea: react to current conditions • Useful for capturing significant short-term fluctuations • Can correct errors in predictions • Track error between long-term predictions and actual • Allocate additional servers if error exceeds a threshold • Account for prediction errors • Can be invoked if request drop rate exceeds a threshold • Handles sudden flash crowds • Operates over time scale of a few minutes • Pure reactive provisioning: lags workload • Reactive + predictive more effective! [Figure: time series of λ_pred and λ_actual; when the prediction error λ_error > τ, the reactor is invoked and allocates servers]
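
The two triggers on slide 16 (prediction error and drop rate) can be sketched as a single check run every few minutes. Everything concrete here is an assumption for illustration: the function name `reactive_step`, the use of relative error, the threshold values, and sizing the tier from the observed rate.

```python
import math


def reactive_step(predicted_rate, actual_rate, drop_rate, per_server_capacity,
                  current_servers, error_threshold=0.2, drop_threshold=0.01):
    """Return how many extra servers to allocate to a tier right now.

    predicted_rate       rate from the long-term predictor (req/s)
    actual_rate          rate observed over the last few minutes (req/s)
    drop_rate            fraction of requests dropped in that window
    per_server_capacity  rate one server can handle within the target (req/s)
    The thresholds are hypothetical defaults, not values from the slides.
    """
    if predicted_rate > 0:
        error = (actual_rate - predicted_rate) / predicted_rate
    else:
        error = math.inf if actual_rate > 0 else 0.0

    # Invoke the reactor only when one of the two triggers fires.
    if error <= error_threshold and drop_rate <= drop_threshold:
        return 0

    needed = math.ceil(actual_rate / per_server_capacity)
    return max(0, needed - current_servers)


print(reactive_step(200, 210, 0.0, 80, 3))    # small error: no action
print(reactive_step(200, 400, 0.0, 80, 3))    # flash crowd: add servers
print(reactive_step(200, 230, 0.05, 80, 2))   # drops observed: add servers
```

This step only adds servers; releasing them is left to the next predictive cycle in this sketch, which avoids the oscillation that a purely reactive scheme tends to show.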

  17. Talk Outline • Introduction • Internet data center model • Existing provisioning approaches • Dynamic capacity provisioning • Implementation and evaluation • Summary

  18. Prototype Data Center • 40+ Linux servers • Gigabit switches • Multi-tier applications • Auction (RUBiS) • Bulletin-board (RUBBoS) • Apache, Tomcat (replicable) • MySQL database [Figure labels: Server Node (Apps, Nucleus, OS); Control Plane; Applications; Resource monitoring; Parameter estimation; Dynamic provisioning]

  19. Only Predictive Provisioning • Auction application RUBiS • Factor of 4 increase in 30 min • Predictor fails during [15, 30] resulting in under-provisioning • Response time violations occur [Figure panels: Workload; Response time]

  20. Only Reactive Provisioning • Auction application RUBiS • Factor of 4 increase in 30 min • Response time shows oscillatory behavior • Several response time violations occur [Figure panels: Workload; Response time, resp time (msec) vs. time (min)]

  21. Predictive + Reactive Provisioning • Auction application RUBiS • Factor of 4 increase in 30 min • Server allocations increased to match increased workload • Response time kept below 2 seconds [Figure panels: Server allocations; Workload, arrivals per min vs. time (min); Response time, resp time (msec) vs. time (min)]

  22. Summary • Dynamic provisioning for multi-tier applications • Flexible queuing theoretic model • Captures all tiers in the application • Predictive provisioning • Reactive provisioning • Implementation and evaluation on a Linux cluster

  23. Thank you! More information at: http://www.cs.umass.edu/~bhuvan
