Dynamic Resource Allocation for Shared Data Centers Using Online Measurements

By Abhishek Chandra, Weibo Gong, and Prashant Shenoy

Overview Outline • Motivation • System Model • Dynamic Allocation Techniques • Experimental Results • Conclusions

Motivation • Data Centers • Server farms • Rent computing and storage resources to applications • Revenue for meeting QoS guarantees • Goals • Satisfy application QoS guarantees • Maximize resource utilization of platform • Robustness against “Slashdot” effects • Cluster of servers – Dedicated or Shared • Static Allocation is problematic

Dynamic Resource Allocation • Periodically re-allocate resources among applications • Estimate resource requirements for near future • Challenges • Reallocation at short time-scales • No prior workload profiling/knowledge • Low overhead • Approach: Online Measurement-based Allocation

Research Contribution • Generalized processor sharing (GPS) • Time domain queuing model & Non-linear optimization technique • Prediction algorithm • Synthetic Workloads & Real Web Traces

Problem Formulation • Resource Model • Queues are assumed to be served in FIFO order, and the resource capacity $C$ is shared among the queues using GPS • Each queue is assigned a weight $w_i$ • Each queue is allocated a resource share $\phi_i = w_i / \sum_j w_j$ in proportion to its weight • GPS Scheduler
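The proportional-share rule above can be sketched in a few lines. This is only an illustration: the function name `gps_shares`, the application names, and the capacity value are hypothetical, and the sketch computes only the guaranteed share (a real GPS scheduler also redistributes the share of idle queues among backlogged ones).

```python
def gps_shares(weights, capacity=1.0):
    """Return the capacity each queue is guaranteed, proportional to its weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: capacity * w / total for name, w in weights.items()}


# Hypothetical example: two applications with weights 1 and 3.
shares = gps_shares({"app1": 1, "app2": 3}, capacity=100.0)
print(shares)  # {'app1': 25.0, 'app2': 75.0}
```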

Problem Definition • If $d_i$ denotes the target response time of application $i$ and $\bar{T}_i$ is its observed mean response time, then the application should be allocated a share $\phi_i$ such that $\bar{T}_i \le d_i$. • The discontent of an application grows as its response time deviates from the target $d_i$. This discontent function can be represented as $D_i(\bar{T}_i) = (\bar{T}_i - d_i)^+$, where $x^+ = \max(0, x)$. • The system goal is then to assign a share $\phi_i$ to each application such that the total system-wide discontent, i.e., the quantity $D = \sum_{i=1}^{n} D_i(\bar{T}_i)$, is minimized.
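As an illustration of the discontent measure described above, the sketch below assumes the simplest form consistent with it: zero while the mean response time meets the target, and growing linearly once the target is exceeded. The function names and sample numbers are hypothetical.

```python
def discontent(mean_response_time, target):
    """Assumed form: max(0, T - d). Zero while the target is met."""
    return max(0.0, mean_response_time - target)


def total_discontent(observed, targets):
    """System-wide discontent: sum of the per-application values."""
    return sum(discontent(observed[app], targets[app]) for app in targets)


# Hypothetical targets and observed mean response times, in seconds.
targets = {"web": 2.0, "db": 1.0}
observed = {"web": 3.5, "db": 0.5}
print(total_discontent(observed, targets))  # 1.5
```

Only "web" misses its target here, so it alone contributes to the total.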

Dynamic Resource Allocation

Monitoring • Measure system and application metrics • Queue lengths • Request response times • Monitoring windows • Figure: timeline (Time axis) showing the Measurement Interval, History, and Adaptation Window
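One way to picture the monitoring history is a fixed-length buffer of per-interval measurements. The sketch below is an assumption about how such a monitor could be organized, not the authors' implementation; the class name `WorkloadMonitor` and its fields are hypothetical.

```python
from collections import deque


class WorkloadMonitor:
    """Keeps the most recent per-interval measurements for one application."""

    def __init__(self, history_intervals):
        # deque(maxlen=...) silently drops the oldest sample when full.
        self.arrivals = deque(maxlen=history_intervals)
        self.response_times = deque(maxlen=history_intervals)
        self.queue_length = 0

    def record_interval(self, arrivals, mean_response_time, queue_length):
        """Store the metrics measured over one measurement interval."""
        self.arrivals.append(arrivals)
        self.response_times.append(mean_response_time)
        self.queue_length = queue_length


monitor = WorkloadMonitor(history_intervals=3)
for sample in [(10, 0.8, 2), (12, 0.9, 3), (15, 1.4, 6), (9, 0.7, 1)]:
    monitor.record_interval(*sample)
print(list(monitor.arrivals))  # [12, 15, 9]
```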

Allocating • Invoked periodically to dynamically partition the resource capacity among the various applications running on the shared server. • Resource Model Types • Time-domain Queuing Model • Online optimization-based Model

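Time Domain Queuing Model • Transient queuing behavior over adaptation window • The request service rate is $\mu_i = \phi_i C / \bar{s}_i$ • Relation between mean response time $\bar{T}_i$ and application share $\phi_i$. Average response time in near future: $\bar{T}_i = (\bar{q}_i + 1) / \mu_i$, where $\bar{q}_i$ is the average queue length over the adaptation window • Relation is parameterized by the measured workload • Arrival rate $\lambda_i$ and mean service time $\bar{s}_i$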
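A minimal sketch of such a transient model follows. It assumes a fluid approximation in which the queue length changes linearly at rate (arrival rate minus service rate) and never drops below zero, that the service rate is share times capacity divided by mean service time, and that the mean response time is (average queue length + 1) divided by the service rate. All names and numbers are hypothetical.

```python
def mean_response_time(share, capacity, q0, arrival_rate, mean_service, window):
    """Estimate the mean response time over the next adaptation window."""
    mu = share * capacity / mean_service      # requests served per second
    drift = arrival_rate - mu                 # net queue growth per second
    if q0 + drift * window >= 0:
        # Queue never empties inside the window: average of a straight line.
        mean_queue = q0 + drift * window / 2.0
    else:
        # Queue drains to zero at t_empty and stays empty afterwards.
        t_empty = q0 / -drift
        mean_queue = q0 * t_empty / (2.0 * window)
    return (mean_queue + 1.0) / mu


# Hypothetical workload: 4 queued requests, 8 req/s, 50 ms per request, 10 s window.
for share in (0.3, 0.5):
    t = mean_response_time(share, capacity=1.0, q0=4, arrival_rate=8.0,
                           mean_service=0.05, window=10.0)
    print(f"share={share:.1f} -> {t:.3f} s")
```

With a share of 0.3 the application is overloaded and the queue builds up; with 0.5 the backlog drains early in the window, so the estimate drops sharply.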

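Optimization-based Resource Allocation • Discontent function $D_i(\bar{T}_i)$ for each application $i$ • Non-linear optimization problem: minimize the total discontent $\sum_i D_i(\bar{T}_i)$ over the shares $\phi_i$, subject to $\sum_i \phi_i = 1$ • Solved using Lagrange multiplier method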
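To make the optimization concrete, the sketch below splits one resource between two applications by brute-force grid search. This is deliberately not the Lagrange multiplier method named on the slide; it is a simpler stand-in that minimizes the same kind of objective. The queuing model, the max(0, T - d) discontent form, the minimum share, and all workload numbers are assumptions made for illustration.

```python
def mean_response_time(share, q0, lam, s, window, capacity=1.0):
    """Fluid-model estimate: (average queue length + 1) / service rate."""
    mu = share * capacity / s
    drift = lam - mu
    if q0 + drift * window >= 0:
        mean_queue = q0 + drift * window / 2.0
    else:
        mean_queue = q0 * (q0 / -drift) / (2.0 * window)
    return (mean_queue + 1.0) / mu


def best_split(apps, window, steps=100, min_share=0.05):
    """Grid-search the share of apps[0]; apps[1] receives the remainder."""
    best = None
    for k in range(steps + 1):
        phi0 = k / steps
        phi1 = 1.0 - phi0
        if phi0 < min_share or phi1 < min_share:
            continue
        total = 0.0
        for phi, app in zip((phi0, phi1), apps):
            t = mean_response_time(phi, app["q0"], app["lam"], app["s"], window)
            total += max(0.0, t - app["target"])   # assumed discontent form
        if best is None or total < best[0]:
            best = (total, phi0, phi1)
    return best


# Hypothetical overloaded pair: together they need more than the full capacity.
apps = [
    {"q0": 5, "lam": 12.0, "s": 0.05, "target": 0.5},
    {"q0": 0, "lam": 5.0, "s": 0.10, "target": 1.0},
]
total, phi0, phi1 = best_split(apps, window=10.0)
print(f"shares=({phi0:.2f}, {phi1:.2f}) total discontent={total:.3f}")
```

A grid search scales poorly beyond a handful of applications, which is one reason an analytical method is attractive when allocation must run at short time-scales.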

Prediction • Short-term prediction of workload characteristics • Request arrival process • Service demand distribution • Use history of measured system metrics

Prediction Techniques • Estimating the Arrival Rate • An accurate estimate of the arrival rate $\lambda_i$ allows the time domain queuing model to estimate the average queue length for the next adaptation window. • We represent the arrival process $A_i$ at any time by the sequence $\{A_i^m\}$ of values from the measurement history. • To predict $\hat{\lambda}_i$, we model the process as AR(1); a sample value of $A_i$ is estimated as $\hat{A}_i^{j+1} = \bar{A}_i + \rho_i(1)\,(A_i^j - \bar{A}_i) + e_i^j$, where $\bar{A}_i$ and $\rho_i$ are the mean and autocorrelation function of $\{A_i^m\}$ and $e_i^j$ is a white-noise term • Estimating the Service Demand • Compute the probability distribution of the per-request service demands • The mean of the distribution is used to represent the service demand of application requests • Measuring the Queue Length • The monitoring module records the number of outstanding requests at the beginning of each adaptation window.
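The arrival-rate estimate can be sketched with a standard one-step AR(1) forecast: the history mean plus the lag-1 autocorrelation times the last sample's deviation from that mean, with the noise term taken at its expected value of zero. Converting a single forecast into a rate by dividing by the measurement interval is a simplification assumed here; the function names and sample history are hypothetical.

```python
def ar1_predict(history):
    """One-step AR(1) forecast: mean + rho(1) * (last - mean)."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(history) / n
    var = sum((x - mean) ** 2 for x in history)
    if var == 0:
        return mean  # constant history: the forecast is that constant
    cov = sum((history[k] - mean) * (history[k + 1] - mean) for k in range(n - 1))
    rho1 = cov / var  # sample lag-1 autocorrelation
    return mean + rho1 * (history[-1] - mean)


def predict_arrival_rate(history, interval):
    """Forecast arrivals in the next interval and convert to requests/second."""
    return max(0.0, ar1_predict(history)) / interval


# Hypothetical arrival counts per 2-second measurement interval.
counts = [10, 12, 14, 16, 18]
print(f"{ar1_predict(counts):.2f}")                # 15.60
print(f"{predict_arrival_rate(counts, 2.0):.2f}")  # 7.80
```

Note that the forecast is pulled back toward the history mean (14) even though the counts are rising, which is characteristic of an AR(1) model.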

Experiments • Soccer World Cup’98 Traces • Results based on a 24-hour portion of the trace • 755,000 requests • Mean req rate: 8.7 req/sec • Mean req size: 8.47 KB

Experimental Evaluation • Synthetic Web Workload • Figure: Comparison of static and dynamic resource allocations for a synthetic web workload

Trace-driven Web Workloads • Figure: Comparison of static and dynamic resource allocations in the presence of heavy-tailed request sizes and varying arrival rates

Adaptation to Transient Overloads • Figure: The workload and the resulting allocations in the presence of varying arrival rates and varying request sizes

Conclusions • Dynamic Resource Allocation needed for data centers • Measurement-based allocation: • Monitoring and Prediction gather online state • Use this state for application modeling and allocation • Results showed that these techniques can judiciously allocate system resources, especially under transient overload conditions

Thank You
