Efficient Autoscaling in the Cloud using Predictive Models for Workload Forecasting. Roy, N., A. Dubey, and A. Gokhale. 4th IEEE International Conference on Cloud Computing (Cloud 2011). Agenda: Introduction, Related Work, Challenge, Solution, Evaluation, Conclusion, Comment.
Efficient Autoscaling in the Cloud using Predictive Models for Workload Forecasting • Roy, N., A. Dubey, and A. Gokhale • 4th IEEE International Conference on Cloud Computing (Cloud 2011)
Agenda • Introduction • Related Work • Challenge • Solution • Evaluation • Conclusion • Comment
Introduction (1/4) • Customers typically maintain SLAs with service providers for QoS properties. • Failure to meet these QoS metrics leads to a major loss of revenue in the form of a decreased user base. • Meeting the SLA while still keeping costs low is challenging for such enterprise systems.
Introduction (3/4) • A problem with such a resource allocation scheme is the chance of thrashing: because the workload varies frequently, machines can be added and released on every sample. • A desirable solution would predict the incoming workload on the system and allocate resources a priori.
Introduction (4/4) • Autoscaling the resources in a cloud environment is not an easy or straightforward task. It involves: • (i) the overheads of state transitions when the number of resources changes • (ii) the ability to accurately predict future workload • (iii) computing the right number of resources for the expected increase or decrease in workload.
Related Work (1/2) • (1) Heuristics-based virtual machine allocation and migration • Urgaonkar et al. [2], 2008 • VM, dynamic provisioning, queueing model • Only a single VM can run on a host • Wood et al. [3], 2007 • VM, dynamic migration • Define a unique metric based on the consumption data of three resources (CPU, network and memory) to make the migration decision • Cunha et al. [4], 2007 • Queueing model • Pricing model that gives rewards for throughput to be within SLA limits and penalty for throughput going above
Related Work (2/2) • (2) Autonomic management of virtual computing environments using control-theoretic approaches: • Wang et al. [6] • A load balancing controller • VMs are all load-balanced and the response time of the applications in all the VMs is the same • Moreno et al. [7] • An architecture for elastic management of cluster-based services • Waheed et al. [8] • Reactive algorithm to allocate resources to a cluster farm • Yang et al. [9] • Profile-based approach
Challenge (1/4) • Challenge 1: Workload Forecasting • Correctness of prediction • Releasing resources is easy, but acquiring resources takes time: • A call is made on the cloud API, which starts the acquisition process • The machine needs to boot up with the specified image • The application needs to be started
Challenge (2/4) • Challenge 2: Identify Resource Requirement for Incoming Load • The required number of resources is a function of • the number of customers • the nature of the application • the type of calls that each customer makes on the application
Challenge (3/4) • Challenge 3: Resource Allocation while Optimizing Multiple Cost Factors • To optimize resource usage and/or minimize idle resources • define a time interval and change resources as often as possible as the workload changes. • In the limit, this interval could be made infinitesimally small, so that resources change continuously in accordance with the change in load
Challenge (4/4) • Obviously, such a scheme is not possible • allocating a resource carries overhead • scaling resources up or down also involves cost and needs to be optimized
Solution (1/9) • Control theory offers a promising methodology to address the challenges
Solution (2/9) • 1. For every future time step, it computes the cost of selecting each possible resource allocation • 2. To compute the cost of a particular allocation, it uses Algorithm 1 to compute the estimated response time for that particular machine configuration • 3. Once the response time is calculated, it is used to calculate the cost of the allocation which is a combination of • how far the estimated response time is from the SLA bounds (SLA violation) • cost of leasing additional machines • and also a cost of re-configuration
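The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `estimate_response_time` is a hypothetical stand-in for the paper's Algorithm 1 (which the slides do not show) and uses a simple single-queue approximation per machine; the weights, the SLA bound, the service time and the function names are all invented for the example.

```python
def estimate_response_time(load, machines, service_time=0.05):
    """Hypothetical stand-in for Algorithm 1 (not shown in the slides).

    Assumes the load (requests/second) is split evenly across machines
    and each machine behaves like a simple queue.
    """
    utilization = load * service_time / machines
    if utilization >= 1.0:
        return float("inf")  # overloaded: response time grows without bound
    return service_time / (1.0 - utilization)


def allocation_cost(load, machines, prev_machines, sla_bound=0.2,
                    w_sla=100.0, w_machine=1.0, w_reconfig=0.5):
    """Combine the three cost terms named on the slide (weights are assumed)."""
    response_time = estimate_response_time(load, machines)
    sla_violation = max(0.0, response_time - sla_bound)  # distance past the SLA bound
    leasing = machines                                    # cost of leased machines
    reconfiguration = abs(machines - prev_machines)       # cost of changing allocation
    return w_sla * sla_violation + w_machine * leasing + w_reconfig * reconfiguration


def choose_allocation(load, prev_machines, candidates):
    """Step 1: evaluate every candidate allocation and keep the cheapest."""
    return min(candidates, key=lambda m: allocation_cost(load, m, prev_machines))


best = choose_allocation(load=100.0, prev_machines=6, candidates=range(1, 11))
print(best)
```

With these assumed numbers, six machines would violate the SLA bound, so the sketch pays the reconfiguration cost of adding one machine rather than the larger SLA penalty.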
Solution (3/9) • A. Workload Prediction • The authors use a second-order autoregressive moving average (ARMA) filter to predict the workload • The variables β and γ are set to 0.8 and 0.15
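The slide does not reproduce the filter equation, so the following is a minimal sketch under an assumption: the prediction is taken to be a weighted sum of the three most recent workload observations, with weights β, γ and 1 − (β + γ) so that they sum to one. The function name `predict_next` is hypothetical.

```python
def predict_next(history, beta=0.8, gamma=0.15):
    """Predict the next workload value from the three latest observations.

    Assumed form (not shown on the slide):
        next = beta * w[t] + gamma * w[t-1] + (1 - (beta + gamma)) * w[t-2]
    """
    if len(history) < 3:
        raise ValueError("need at least three observations")
    latest, previous, oldest = history[-1], history[-2], history[-3]
    return beta * latest + gamma * previous + (1.0 - (beta + gamma)) * oldest


# A steadily rising workload: the prediction leans heavily on the latest sample.
print(predict_next([100, 200, 300]))
```

Because β = 0.8 dominates, the forecast tracks the most recent sample closely and only lightly smooths over the two older ones.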
ARMA • Autoregressive Model (AR) • A model in which the current value depends on the levels of lagged observations • For example, if we observe a high realisation of GDP, we would expect GDP in the next few periods to be high as well
ARMA • Moving Average Model (MA) • A model in which the observation of a random variable at time t is affected not only by the shock at time t, but also by the shocks of prior periods • For example, if we observe a negative shock to the economy, say 9/11, we would expect the negative effect to persist in the near future.
ARMA • Autoregressive Moving Average Model • Combining both models gives an ARMA(p,q) model • ARMA models are widely used for prediction of economic and industrial time series
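For reference, the general ARMA(p,q) model can be written as below. The notation is the standard textbook one and is not taken from the slides: X_t is the observation at time t, c a constant, ε_t the shock at time t, φ_i the autoregressive coefficients and θ_j the moving-average coefficients.

```latex
X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}
```

Setting q = 0 leaves the pure AR(p) model, and setting p = 0 leaves the pure MA(q) model.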
Solution (5/9) • B. Performance Model • The next challenge we resolve is identifying resource requirements for the predicted workload • The workload used in this work is the number of users currently in the system. • It also depends upon what each user does. • In prior work [20] we have used Customer Behavior Modeling Graphs (CBMG) (?) to model the overall behavior of customers
Solution (6/9) • A CBMG is built from a log of previous customer behavior and gives the probability that a typical user visits each page • Using this information, we can calculate the number of visits to a single page from the total number of customers in the system. • The number of visits to each page helps in calculating the average load on each page.
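The calculation described above can be illustrated with a deliberately simplified sketch. It is not a full CBMG (which models page-to-page transitions); it only estimates, from a hypothetical session log, the fraction of sessions that visit each page and scales that by the number of users. The page names, the per-visit service demands and the function names are all assumptions made for the example.

```python
from collections import Counter


def visit_probabilities(sessions):
    """Fraction of logged sessions that visit each page at least once."""
    counts = Counter()
    for session in sessions:
        counts.update(set(session))  # count each page once per session
    return {page: counts[page] / len(sessions) for page in counts}


def expected_visits(num_users, probabilities):
    """Expected number of visits to each page for the given number of users."""
    return {page: num_users * p for page, p in probabilities.items()}


def page_load(visits, service_demand):
    """Average load per page: visits times an assumed per-visit demand (seconds)."""
    return {page: visits[page] * service_demand[page] for page in visits}


# Hypothetical log of four sessions.
log = [["home", "search"], ["home"], ["home", "search", "checkout"], ["home", "search"]]
probs = visit_probabilities(log)
visits = expected_visits(200, probs)
load = page_load(visits, {"home": 0.01, "search": 0.04, "checkout": 0.1})
print(visits)
print(load)
```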
Solution (8/9) • C. Optimizing Resource Provisioning • The intuition is to identify the right number of time intervals • Our solution works on look-ahead optimization • iteratively solves an optimization problem, Costopt, starting from t0
Solution (9/9) • The next challenge is the choice of the look-ahead period. • A small look-ahead period will neglect trends • A very large period will increase computational complexity • The actual algorithm is not described here because the implementation requires recursive data structures and is difficult to describe in the limited space available.
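Since the authors' algorithm is not described, the following is only a brute-force illustration of the look-ahead idea, not their method. It enumerates every sequence of allocations over the look-ahead period and keeps the cheapest one. The cost function, its weights, the per-machine capacity and all names are assumptions made for the example.

```python
from itertools import product


def step_cost(load, machines, prev_machines, capacity=50.0,
              w_sla=10.0, w_machine=1.0, w_reconfig=0.5):
    """Assumed per-step cost: capacity shortfall, leasing, and reconfiguration."""
    shortfall = max(0.0, load / capacity - machines)  # proxy for SLA violation
    return (w_sla * shortfall
            + w_machine * machines
            + w_reconfig * abs(machines - prev_machines))


def lookahead_plan(predicted_loads, current_machines, candidates, cost=step_cost):
    """Exhaustively search all allocation sequences over the look-ahead period."""
    best_plan, best_cost = None, float("inf")
    for plan in product(candidates, repeat=len(predicted_loads)):
        total, prev = 0.0, current_machines
        for load, machines in zip(predicted_loads, plan):
            total += cost(load, machines, prev)
            prev = machines
        if total < best_cost:
            best_plan, best_cost = plan, total
    return best_plan, best_cost


# Predicted load (users) for the next three intervals, starting from 2 machines.
plan, total = lookahead_plan([100, 250, 100], 2, range(1, 6))
print(plan, total)
```

The search space grows as the number of candidates raised to the power of the look-ahead length, which illustrates why a very large period raises computational cost. In a receding-horizon setup only the first allocation of the plan would be applied before re-planning with fresh predictions.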
Evaluation (1/7) • Cost Function
Evaluation (2/7) -- Just-in-Time Resource Allocation • The weights on each component of the cost function are the same
Evaluation (3/7) -- Resource Usage under Different Cost Priorities • 1) SLA violation against Resource Cost • The ratio of SLA penalty to machine cost is varied from 4 : 1 to 1 : 13
Evaluation (4/7) -- Resource Usage under Different Cost Priorities
Evaluation (5/7) -- Resource Usage under Different Cost Priorities
Evaluation (6/7) -- Resource Usage under Different Cost Priorities • 2) Including the Cost of Reconfiguration
Evaluation (7/7) -- Resource Usage under Different Cost Priorities
Conclusion • This paper describes a look-ahead resource allocation algorithm based on model predictive control • It predicts future workload • It adjusts the resources allocated to users ahead of time
Comments • The detail of the model in the paper is too simple • I cannot understand why the authors did these evaluations • The paper uses control theory and it seems to have a good prediction of workload • Some points on the 3 challenges • the overhead of allocating resources • the prediction interval • the costs of SLA violation, reconfiguring machines, ...