This paper explores the challenges of provisioning resources for multi-tier internet applications and proposes a dynamic capacity allocation strategy based on workload prediction and reactive provisioning. The approach aims to provide desired response times under varying workloads.
Dynamic Provisioning for Multi-tier Internet Applications • Bhuvan Urgaonkar, Prashant Shenoy, Abhishek Chandra, Pawan Goyal • University of Massachusetts; University of Minnesota; Veritas Software India Pvt. Ltd.
Internet Applications • Proliferation of Internet applications (auction site, online game, online store) • Growing significance in personal, business affairs • Focus: Internet server applications
Multi-tiered Internet Applications • Internet applications: multiple tiers • Example: 3 tiers: HTTP, J2EE app server, database • Replicable components • Individual tiers: partially or fully replicable • Example: clustered HTTP, J2EE server, shared-nothing db • Each tier uses a dispatcher: load balancing [Figure labels: requests, load balancer, HTTP, J2EE, database]
Internet Workloads Are Dynamic • Multi-time-scale variations • Time-of-day, hour-of-day • Flash crowds • Key issue: How to provide desired response time under varying workloads? [Figures: request rate (req/min) vs. time (hrs); arrivals per min vs. time (days); arrivals per min vs. time (hours)]
Internet Data Center • Internet applications run on data centers • Server farms • Provide computational and storage resources • Applications share data center resources • Problem: How should the platform allocate resources to absorb workload variations?
Our Provisioning Approach • Flexible queuing theoretic model • Captures all tiers in the application • Predictive provisioning • Long-term workload variations • Reactive provisioning • Short-term variations, flash crowds
Talk Outline • Introduction • Internet data center model • Existing provisioning approaches • Dynamic capacity provisioning • Implementation and evaluation • Summary
Data Center Model • Dedicated hosting: each application runs on a subset of servers in the data center • Subsets are mutually exclusive: no server sharing • Data center hosts multiple applications • Free server pool: unused servers [Figure labels: retail Web site, streaming]
Single-tier Provisioning • Single tier provisioning well studied [Muse] • Non-trivial to extend to multiple tiers • Strawman #1: use single-tier provisioning independently at each tier • Problem: independent tier provisioning may not increase goodput [Figure: 14 req/s offered; tier capacities C=15, C=10, C=10.1; 4 req/s dropped; 10 req/s served]
Single-tier Provisioning • Single tier provisioning well studied [Muse] • Non-trivial to extend to multiple tiers • Strawman #1: use single-tier provisioning independently at each tier • Problem: independent tier provisioning may not increase goodput [Figure: 14 req/s offered; tier capacities C=15, C=20, C=10.1; 3.9 req/s dropped; 10.1 req/s served]
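The numbers in the two Strawman #1 figures can be reproduced with a small sketch. It assumes (this is a reading of the figure, not something the slides state) that each tier simply drops whatever exceeds its capacity and forwards the rest, so goodput is limited by the slowest tier. The function name `goodput` is hypothetical.

```python
def goodput(arrival_rate, tier_capacities):
    """Requests/s that survive every tier when each tier drops its excess load.

    Assumption: a tier forwards min(incoming rate, its capacity) to the next tier.
    """
    rate = arrival_rate
    for capacity in tier_capacities:
        rate = min(rate, capacity)
    return rate


# Before: the middle tier (C=10) is the bottleneck.
before = goodput(14, [15, 10, 10.1])
# After doubling only the middle tier, the last tier (C=10.1) becomes the bottleneck.
after = goodput(14, [15, 20, 10.1])
print(before, after)
```

Doubling the capacity of the bottleneck tier raises goodput only from 10 to 10.1 req/s under this model, which is the point of the slide: provisioning one tier in isolation just moves the bottleneck.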
Model-based Provisioning • Black box approach • Treat application as a black box • Measure response time from outside • Increase allocation if response time > SLA • Use a model to determine how much to allocate • Strawman #2: use black box for multi-tier apps • Problems: • Unclear which tier needs more capacity • May not increase goodput if bottleneck tier is not replicable [Figure: 14 req/s offered; tier capacities C=15, C=20, C=10.1; 10.1 req/s served]
Provisioning Multi-tier Apps • Approach: holistic view of multi-tier application • Determine tier-specific capacity independently • Allocate capacity by looking at all tiers (and other apps) • Predictive provisioning • Long-term provisioning: time scale of hours • Maintain long-term workload statistics • Predict and provision for the next few hours • Reactive provisioning • Short term provisioning: time scale of several minutes • React to “current” workload trends • Correct errors of long-term provisioning • Handle flash crowds (inherently unpredictable)
Predictive Provisioning • Workload predictor • Predicts workload based on past observations • Application model • Infers capacity needed to handle given workload [Figure: past workload -> Predictor -> predicted workload -> Model (with response time target) -> required capacity]
Workload Prediction • Long term workload monitoring and prediction • Monitor workload for multiple days • Maintain a histogram for each hour of the day • Capture time of day effects • Forecast based on • Observed workload for that hour in the past • Observed workload for the past few hours of the current day • Predict a high percentile of expected workload [Figure labels: Mon, Tue, Wed, Today]
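A minimal sketch of the per-hour histogram predictor described on this slide. The slide does not say how the two inputs are combined, so the correction step here (add the mean of recent prediction errors when it is positive) is an assumption, as are the class name `HourlyPredictor`, the nearest-rank percentile rule, and the default percentile.

```python
from collections import defaultdict


class HourlyPredictor:
    """Keeps past arrival rates per hour of day and predicts a high percentile."""

    def __init__(self, percentile=95):
        self.percentile = percentile          # integer percent, e.g. 95
        self.history = defaultdict(list)      # hour of day -> observed rates

    def observe(self, hour, rate):
        """Record the rate seen during `hour` (0-23) on some past day."""
        self.history[hour].append(rate)

    def predict(self, hour, recent_errors=()):
        """Predict the rate for `hour`.

        recent_errors: (observed - predicted) for the past few hours of today.
        Assumption: only an upward correction is applied.
        """
        samples = sorted(self.history[hour])
        if not samples:
            raise ValueError("no history for hour %d" % hour)
        # Nearest-rank percentile using integer arithmetic.
        rank = max(1, (self.percentile * len(samples) + 99) // 100)
        base = samples[rank - 1]
        correction = sum(recent_errors) / len(recent_errors) if recent_errors else 0.0
        return base + max(0.0, correction)


predictor = HourlyPredictor(percentile=80)
for rate in [100, 120, 110, 130, 90]:         # same hour on five past days
    predictor.observe(14, rate)
print(predictor.predict(14))                  # history only
print(predictor.predict(14, recent_errors=[10, 30]))  # today is running hot
```

Predicting a high percentile rather than the mean biases the system toward over-provisioning, which is the safer error when the goal is to meet a response time target.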
Model-based Capacity Inference • Queuing theoretic application model • Each individual server is a G/G/1 queue • Derive per-tier E(r) from end-to-end SLA • Monitor other parameters and determine λ (per-server capacity) • Use predicted workload λ_pred to determine # servers per tier • Assumes perfect load balancing in each tier [Figure: each tier's servers modeled as G/G/1 queues, driven by λ_pred]
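The slide does not give the formula used to turn a per-tier response time target into a per-server capacity λ. As a sketch, one standard option is Kingman's upper bound on mean waiting time in a G/G/1 queue, W ≤ λ(σ_a² + σ_b²) / (2(1 − λs)); setting s + W equal to the target d and solving for λ gives the expression below. The function names and the example numbers are hypothetical.

```python
import math


def per_server_capacity(d, s, var_interarrival, var_service):
    """Arrival rate (req/s) at which Kingman's G/G/1 bound on mean
    response time equals the per-tier target d.

    d: target mean response time at this tier (s)
    s: mean service time (s)
    var_interarrival, var_service: variances of inter-arrival and service times (s^2)
    """
    if d <= s:
        raise ValueError("target must exceed the mean service time")
    return 1.0 / (s + (var_interarrival + var_service) / (2.0 * (d - s)))


def servers_needed(predicted_rate, capacity):
    """Servers for a tier, assuming perfect load balancing across them."""
    return math.ceil(predicted_rate / capacity)


lam = per_server_capacity(d=0.5, s=0.1, var_interarrival=0.02, var_service=0.02)
print(round(lam, 2), servers_needed(90, lam))
```

Because the bound is an upper bound on waiting time, the resulting λ is conservative: a server driven at this rate should see a mean response time no worse than d.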
Reactive Provisioning • Idea: react to current conditions • Useful for capturing significant short-term fluctuations • Can correct errors in predictions • Track error between long-term predictions and actual • Allocate additional servers if error exceeds a threshold • Account for prediction errors • Can be invoked if request drop rate exceeds a threshold • Handles sudden flash crowds • Operates over time scale of a few minutes • Pure reactive provisioning: lags workload • Reactive + predictive more effective! [Figure: time series of λ_pred and λ_actual give the prediction error λ_error; if λ_error > τ, invoke the reactor to allocate servers]
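One reactive check could look like the sketch below. The slides specify only the two triggers (prediction error above a threshold, drop rate above a threshold); the use of relative error, the threshold values, the re-sizing rule, and the name `reactive_step` are assumptions made for illustration.

```python
import math


def reactive_step(predicted, actual, allocated, capacity,
                  error_threshold=0.1, drop_rate=0.0, drop_threshold=0.05):
    """Return the server count for a tier after one reactive check.

    predicted, actual: predicted and observed arrival rates (req/s), predicted > 0
    allocated: servers currently assigned to the tier
    capacity: per-server capacity (req/s)
    """
    error = (actual - predicted) / predicted
    if error <= error_threshold and drop_rate <= drop_threshold:
        return allocated                      # prediction is holding, do nothing
    needed = math.ceil(actual / capacity)     # re-size for the observed rate
    return max(allocated, needed)             # this sketch only adds servers


print(reactive_step(predicted=100, actual=105, allocated=10, capacity=10))
print(reactive_step(predicted=100, actual=150, allocated=10, capacity=10))
```

Running such a check every few minutes on top of the hourly predictive allocation matches the slide's division of labor: the predictor handles time-of-day trends and the reactor corrects its errors and absorbs flash crowds.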
Prototype Data Center • 40+ Linux servers • Gigabit switches • Multi-tier applications • Auction (RUBiS) • Bulletin-board (RUBBoS) • Apache, Tomcat (replicable) • MySQL database [Figure labels: Server Node (Apps, Nucleus, OS); Control Plane (applications, resource monitoring, parameter estimation, dynamic provisioning)]
Only Predictive Provisioning • Auction application RUBiS • Factor of 4 increase in 30 min • Predictor fails during [15, 30] resulting in under-provisioning • Response time violations occur [Figures: workload; response time]
Only Reactive Provisioning • Auction application RUBiS • Factor of 4 increase in 30 min • Response time shows oscillatory behavior • Several response time violations occur [Figures: workload; response time (msec) vs. time (min)]
Predictive + Reactive Provisioning • Auction application RUBiS • Factor of 4 increase in 30 min • Server allocations increased to match increased workload • Response time kept below 2 seconds [Figures: server allocations; workload, arrivals per min vs. time (min); response time (msec) vs. time (min)]
Summary • Dynamic provisioning for multi-tier applications • Flexible queuing theoretic model • Captures all tiers in the application • Predictive provisioning • Reactive provisioning • Implementation and evaluation on a Linux cluster
Thank you! More information at: http://www.cs.umass.edu/~bhuvan