Dynamic Resource Allocation for Shared Data Centers Using Online Measurements By Abhishek Chandra, Weibo Gong and Prashant Shenoy
Overview Outline • Motivation • System Model • Dynamic Allocation Techniques • Experimental Results • Conclusions
Motivation • Data Centers • Server farms • Rent computing and storage resources to applications • Revenue for meeting QoS guarantees • Goals • Satisfy application QoS guarantees • Maximize resource utilization of platform • Robustness against “Slashdot” effects • Cluster of servers – Dedicated or Shared • Static Allocation is problematic
Dynamic Resource Allocation • Periodically re-allocate resources among applications • Estimate resource requirements for near future • Challenges • Reallocation at short time-scales • No prior workload profiling/knowledge • Low overhead • Approach: Online Measurement-based Allocation
Research Contribution • Generalized processor sharing (GPS) • Time domain queuing model & Non-linear optimization technique • Prediction algorithm • Synthetic Workloads & Real Web Traces
Problem Formulation • Resource Model • Requests in each queue are served in FIFO order, and the resource capacity C is shared among the queues using GPS • Each queue is assigned a weight • Each queue is allocated a resource share in proportion to its weight • GPS Scheduler
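The proportional-share rule above can be sketched in a few lines. This is only an illustration of the GPS definition, not the authors' scheduler; the function name `gps_shares` and the `backlogged` flag (which redistributes the share of idle queues among busy ones) are assumptions made for this example.

```python
def gps_shares(weights, capacity=1.0, backlogged=None):
    """Split `capacity` among queues in proportion to their weights.

    Queues marked as not backlogged receive nothing, and their share is
    redistributed among the backlogged queues (work-conserving behaviour).
    """
    if backlogged is None:
        backlogged = [True] * len(weights)
    # Only queues with pending requests compete for the resource.
    active_weight = sum(w for w, busy in zip(weights, backlogged) if busy)
    if active_weight <= 0:
        return [0.0] * len(weights)
    return [
        capacity * w / active_weight if busy else 0.0
        for w, busy in zip(weights, backlogged)
    ]


# Three queues with weights 1:1:2 sharing a capacity of 100 units.
shares = gps_shares([1, 1, 2], capacity=100.0)
```

With all three queues busy, the weights 1:1:2 translate into a quarter, a quarter, and half of the capacity.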
Problem Definition • If $d_i$ denotes the target response time of application $i$ and $\bar{T}_i$ is its observed mean response time, then the application should be allocated a share such that $\bar{T}_i \le d_i$. • The discontent of an application grows as its response time deviates from the target $d_i$; it is represented by a discontent function of the response time. • The system goal is then to assign a share to each application such that the total system-wide discontent, i.e., the sum of the per-application discontents, is minimized.
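The formulas on this slide did not survive extraction. One way to write the objective it describes, offered as an assumption rather than as the authors' exact definition, is to penalize only the part of the response time that exceeds the target. Here $\phi_i$ (the share of application $i$), $D_i$ (its discontent), and $n$ (the number of applications) are symbols introduced for this sketch.

```latex
D_i(\bar{T}_i) = \max\bigl(0,\ \bar{T}_i - d_i\bigr),
\qquad
\min_{\phi_1,\dots,\phi_n} \ \sum_{i=1}^{n} D_i(\bar{T}_i)
\quad \text{subject to} \quad
\sum_{i=1}^{n} \phi_i \le 1, \quad \phi_i \ge 0 .
```

Under this form an application that meets its target contributes zero discontent, so spare capacity is steered toward the applications that are missing theirs.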
Monitoring • Measure system and application metrics • Queue lengths • Request response times • Monitoring windows [Figure: timeline showing the measurement interval, the history, and the adaptation window along the time axis]
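A minimal sketch of how such a monitor might keep a bounded history of per-interval measurements. The class name `Monitor`, its methods, and the record layout are all hypothetical; the slides do not describe the monitoring module's interface.

```python
from collections import deque


class Monitor:
    """Keeps the last `history_len` measurement intervals for one application."""

    def __init__(self, history_len):
        # A bounded deque drops the oldest interval automatically.
        self.history = deque(maxlen=history_len)
        self._arrivals = 0
        self._response_times = []

    def on_arrival(self):
        self._arrivals += 1

    def on_completion(self, response_time):
        self._response_times.append(response_time)

    def end_interval(self, queue_length):
        """Close the current measurement interval and start a new one."""
        n = len(self._response_times)
        mean_rt = sum(self._response_times) / n if n else None
        self.history.append({
            "arrivals": self._arrivals,
            "mean_response_time": mean_rt,
            "queue_length": queue_length,
        })
        self._arrivals = 0
        self._response_times = []


# Record three intervals but keep only the two most recent.
monitor = Monitor(history_len=2)
for k in range(3):
    for _ in range(k + 1):
        monitor.on_arrival()
        monitor.on_completion(0.5)
    monitor.end_interval(queue_length=k)
```

The allocator would read `monitor.history` once per adaptation window; the history length trades responsiveness against noise.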
Allocating • Invoked periodically to dynamically partition the resource capacity among the various applications running on the shared server. • Resource Model Types • Time-domain Queuing Model • Online optimization-based Model
Time Domain Queuing Model • Models transient queuing behavior over the adaptation window • The request service rate is determined by the application's share of the resource capacity • Relates the mean response time $\bar{T}$ in the near future to the application share • The relation is parameterized by the measured workload: arrival rate $\lambda$ and mean service time $\bar{s}$
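The slide's equations are missing, so the following is a fluid-approximation sketch of a transient queue rather than the authors' model. It assumes a service rate of share × capacity / mean service time, a queue that grows or drains linearly over the window, and a response time of (mean queue length + 1) / service rate. The function name and parameter names are hypothetical.

```python
def mean_response_time(share, capacity, q0, arrival_rate, mean_service, window):
    """Estimate the mean response time over the next adaptation window.

    share        -- fraction of the resource given to the application (0..1)
    capacity     -- resource capacity (service-seconds delivered per second)
    q0           -- queue length at the start of the window
    arrival_rate -- predicted arrivals per second (lambda)
    mean_service -- mean service time per request in seconds (s-bar)
    window       -- length of the adaptation window in seconds
    """
    service_rate = share * capacity / mean_service  # requests per second
    if service_rate <= 0:
        return float("inf")
    drift = arrival_rate - service_rate  # net queue growth per second
    if drift >= 0 or q0 + drift * window >= 0:
        # Queue never empties: average of a straight line over the window.
        mean_queue = q0 + drift * window / 2
    else:
        # Queue drains to zero at t_empty and stays empty afterwards.
        t_empty = q0 / -drift
        mean_queue = 0.5 * q0 * t_empty / window
    # A FIFO arrival waits for the queue ahead of it plus its own service.
    return (mean_queue + 1) / service_rate


overloaded = mean_response_time(0.5, 1.0, q0=10, arrival_rate=60,
                                mean_service=0.01, window=10)
relieved = mean_response_time(1.0, 1.0, q0=10, arrival_rate=60,
                              mean_service=0.01, window=10)
```

Doubling the share in this example moves the application from an overloaded regime, where the backlog grows throughout the window, to one where the initial backlog drains quickly.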
Optimization-based Resource Allocation • Discontent function • Non-linear optimization problem • Solved using the Lagrange multiplier method
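To illustrate the Lagrange-multiplier idea, the sketch below minimizes total discontent subject to the shares summing to 1. Everything specific here is an assumption, not the authors' formulation: response time is modeled as `a / share` for a per-application constant `a`, and discontent is a smooth stand-in for max(0, T − d), namely ½(x + √(x² + k)) with x = T − d. At the optimum every application has the same marginal benefit from extra share (the multiplier `nu`), which is found by bisection.

```python
import math


def marginal_gain(share, a, d, k=1e-4):
    """-dD/dshare for D = smooth_hinge(a/share - d); decreases as share grows."""
    x = a / share - d
    hinge_slope = 0.5 * (1.0 + x / math.sqrt(x * x + k))
    return hinge_slope * a / (share * share)


def share_for_multiplier(nu, a, d, lo=1e-6, hi=1.0):
    """Share at which the marginal gain equals nu, clamped to [lo, hi]."""
    if marginal_gain(hi, a, d) >= nu:
        return hi
    if marginal_gain(lo, a, d) <= nu:
        return lo
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if marginal_gain(mid, a, d) > nu:
            lo = mid  # still worth more than nu: give more share
        else:
            hi = mid
    return 0.5 * (lo + hi)


def allocate(apps):
    """apps: list of (a, d) pairs. Returns shares that sum to about 1."""
    nu_lo, nu_hi = 1e-18, 1e18
    for _ in range(200):
        nu = math.sqrt(nu_lo * nu_hi)  # bisection on a log scale
        total = sum(share_for_multiplier(nu, a, d) for a, d in apps)
        if total > 1.0:
            nu_lo = nu  # over-committed: raise the "price" of share
        else:
            nu_hi = nu
    nu = math.sqrt(nu_lo * nu_hi)
    return [share_for_multiplier(nu, a, d) for a, d in apps]


# Two applications with the same target; the first carries four times the load.
shares = allocate([(0.04, 0.1), (0.01, 0.1)])
```

The nested bisection relies on the discontent being convex in the share, which holds for this particular model; a different response-time model would need its own convexity check or a general-purpose solver.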
Prediction • Short-term prediction of workload characteristics • Request arrival process • Service demand distribution • Use history of measured system metrics
Prediction Techniques • Estimating the Arrival Rate • An accurate estimate of the arrival rate allows the time-domain queuing model to estimate the average queue length for the next adaptation window. • We represent $A_i$ at any time by the sequence of values from the measurement history. • To predict the arrival rate, the arrival process is modeled as AR(1), from which a sample value of $A_i$ is estimated. • Estimating the Service Demand • Compute the probability distribution of the per-request service demands • The mean of the distribution is used to represent the service demand of application requests • Measuring the Queue Length • The monitoring module records the number of outstanding requests at the beginning of each adaptation window.
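The AR(1) formula itself is missing from the slide. A minimal sketch of a standard one-step AR(1) predictor is given below: the next value is the sample mean plus the lag-1 autocorrelation times the latest deviation from the mean. Treating the history as per-interval arrival counts, and the names `ar1_predict` and `predict_arrival_rate`, are assumptions for this example.

```python
def ar1_predict(history):
    """One-step-ahead AR(1) prediction from a sequence of past values."""
    n = len(history)
    if n == 0:
        raise ValueError("history must not be empty")
    mean = sum(history) / n
    var = sum((x - mean) ** 2 for x in history)
    if n < 2 or var == 0:
        return mean  # no variation to extrapolate from
    # Lag-1 sample autocorrelation.
    cov = sum((history[j] - mean) * (history[j + 1] - mean)
              for j in range(n - 1))
    rho = cov / var
    return mean + rho * (history[-1] - mean)


def predict_arrival_rate(arrival_counts, interval_seconds):
    """Predicted arrivals per second for the next measurement interval."""
    predicted_count = max(0.0, ar1_predict(arrival_counts))
    return predicted_count / interval_seconds


# Arrival counts observed in the last five measurement intervals.
rate = predict_arrival_rate([1, 2, 3, 4, 5], interval_seconds=1.0)
```

For the rising sequence 1..5 the mean is 3 and the lag-1 autocorrelation is 0.4, so the prediction is 3.8: the estimate is pulled toward the mean rather than extrapolating the trend, which is characteristic of AR(1).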
Experiments • Soccer World Cup’98 Traces • Results based on a 24-hour portion of the trace • 755,000 requests • Mean req rate: 8.7 req/sec • Mean req size: 8.47 KB
Experimental Evaluation • Synthetic Web Workload • Figure: Comparison of static and dynamic resource allocations for a synthetic web workload
Trace-driven Web Workloads • Figure: Comparison of static and dynamic resource allocations in the presence of heavy-tailed request sizes and varying arrival rates
Adaptation to Transient Overloads • Figure: The workload and the resulting allocations in the presence of varying arrival rates and varying request sizes
Conclusions • Dynamic Resource Allocation needed for data centers • Measurement-based allocation: • Monitoring and Prediction gather online state • Use this state for application modeling and allocation • Results showed that these techniques can judiciously allocate system resources, especially under transient overload conditions