Hidra: History Based Dynamic Resource Allocation For Server Clusters

Jayanth Gummaraju (Stanford University, CA, USA) and Yoshio Turner (Hewlett-Packard Labs., Palo Alto, CA, USA). ITA05, Wrexham, UK, September 2005.

Presentation Transcript


  1. Hidra: History Based Dynamic Resource Allocation For Server Clusters • Jayanth Gummaraju (1) and Yoshio Turner (2) • (1) Stanford University, CA, USA • (2) Hewlett-Packard Labs., Palo Alto, CA, USA • ITA05, Wrexham, UK, September 2005

  2. Why Dynamic Resource Allocation • High demand variation for an Internet service • Daily: peak load ~10 times average load during day • Variation over longer time scales (days, weeks) • Benefits of Dynamic Resource Allocation • Reduce operating costs for a service • Energy • Software license fees • Support more services on a shared infrastructure • Shift resources between services on-demand • Practical: fast server re-purposing • Blade server management • Networked storage • Virtual machine cloning/migration

  3. Problem • Determine resource requirements for a service on-the-fly • Challenges: • Frequent service updates • Frequent changes in client interest set ⇒ Static a priori capacity planning won’t work

  4. Approach: Hidra • Hidra: History-based Dynamic Resource Allocation • “Black-box approach”: continuously build and update a model of system behavior from externally visible performance attributes, without knowledge of internal operation (e.g., what is the bottleneck resource) • Model updates: introduce freshness and confidence • Extrapolation: determine resource requirements with only a partial model

  5. Scope • Large services requiring multiple servers • Multi-tier: each tier = a cluster of servers. Assumptions: • Identical servers within a tier • Servers in different tiers can be different • Allocation granularity = Server (ex: blade in a blade server) • Predictable client request rate • Reasonable if smoothly varying, or occasional discontinuities • Service and server behavior can change over time • Goal: Find minimum cost resource allocation that meets server response time requirement • Cost = sum of cost of servers allocated to each tier • Mean response time (may be generalized)

  6. Outline • Single-tier history-based resource allocation • Constructing and updating history-based model (freshness and confidence) • Using the model to determine resource allocation (extrapolation) • Multi-tier history-based resource allocation • Summary

  7. Single-Tier History-Based Model • Model represents the average behavior of a server in a tier • Consists of a collection of measured operating points (history) for the tier • Each history point: at least (request rate per server, mean response time) • Model provides an estimate of function F(): response time = F(per-server request rate), an increasing function in the range of interest

  8. Using the History-Based Model • Goal: find the fewest servers needed to meet a requirement for maximum mean response time • Extrapolate model to find λ, the largest feasible average request rate per server • Given R = tier’s applied load (requests per second), resource allocation = N = R/λ servers • [Figure: response time vs. per-server request rate, marking the response time threshold and λ]
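
The allocation rule on this slide can be sketched in a few lines of Python. The slide gives the ratio of applied load to the largest feasible per-server rate; because the deck states that allocation granularity is a whole server, this sketch rounds the ratio up. The function name, the rounding, and the floor of one server are assumptions made for illustration, not details taken from the slides.

```python
import math


def servers_needed(applied_load: float, max_rate_per_server: float) -> int:
    """Fewest whole servers that keep the per-server load at or below the limit.

    applied_load: the tier's total request rate R (requests per second).
    max_rate_per_server: largest feasible per-server request rate found
        from the history-based model (requests per second).
    """
    if max_rate_per_server <= 0:
        raise ValueError("max_rate_per_server must be positive")
    # Assumption: round up to a whole server and never allocate fewer than one.
    return max(1, math.ceil(applied_load / max_rate_per_server))
```

For example, a tier receiving 1000 requests per second with a feasible per-server rate of 45 requests per second would need 23 servers under these assumptions.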

  9. Updating the Model • Response time function can change over time: • Service content or implementation • Client interest set • Number of allocated servers (request distribution, and non-linear performance scaling) • Nevertheless, history-based model is useful • Gradual changes ⇒ recent history is a good approximation • Occasional large changes ⇒ recent history is relevant except immediately after a large change • Periodically update model based on current performance measurements • Balance responsiveness and accuracy: Incorporate new measurements quickly to model current behavior, but not so aggressively that transient glitches pollute the model

  10. History Update: Freshness and Confidence • History point update as weighted average of stored value and new measurement: new stored value = α × old stored value + (1 – α) × new measurement • Older history is less likely to represent current behavior • Recent history can be obsolete after a sudden shift in behavior • Weighting factor α combines: • Freshness: value which decreases with time since last update • Confidence: value which increases with repeated confirmation of consistent behavior for the history point • Combination: EWMA (captures freshness) with decay rate that slows with increasing confidence
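
The following Python sketch illustrates one way the weighted-average update could be realized. The slide specifies only the weighted-average form and the qualitative behavior of the weighting factor; the exponential decay, the `base_tau` time constant, the `tolerance` test for "consistent behavior", and the reset of confidence to zero on an inconsistent measurement are all hypothetical choices made here, not the authors' formulas.

```python
import math
from dataclasses import dataclass


@dataclass
class HistoryPoint:
    request_rate: float    # per-server request rate (req/s)
    response_time: float   # stored mean response time (s)
    last_update: float     # time of the last update (s)
    confidence: int = 0    # consecutive consistent measurements seen so far


def weight(point: HistoryPoint, now: float, base_tau: float = 60.0) -> float:
    """Weight given to the stored value.

    Freshness: the weight decays with time since the last update.
    Confidence: a higher confidence stretches the time constant,
    so the decay is slower (assumed form).
    """
    elapsed = max(0.0, now - point.last_update)
    tau = base_tau * (1 + point.confidence)
    return math.exp(-elapsed / tau)


def update(point: HistoryPoint, measured: float, now: float,
           tolerance: float = 0.2, base_tau: float = 60.0) -> float:
    """Blend a new measurement into the stored value; return the weight used."""
    alpha = weight(point, now, base_tau)
    # Assumed consistency test: within a relative tolerance of the stored value.
    consistent = abs(measured - point.response_time) <= tolerance * point.response_time
    point.response_time = alpha * point.response_time + (1 - alpha) * measured
    point.confidence = point.confidence + 1 if consistent else 0
    point.last_update = now
    return alpha
```

With these choices, a point that has been confirmed repeatedly keeps more of its stored value when a single outlier arrives, while a point that has not been updated for a long time is largely overwritten by the new measurement.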

  11. Extrapolation: Determining Resource Allocation • Model has incomplete view of response time function • To find optimal λ, Hidra extrapolates/interpolates using a unique pair of history points • Only use points that match general shape of typical response time curve (positive slope) • Favor points with high α value (ignore if α is very small) • If only one point exists (current operating point), adjust allocation differently • Limits on consecutive changes in resource allocation (fixed limit for decreases, growing limits for increases) • [Figure: response time vs. applied load, showing history points 1 to 9, the response time threshold, and labels X, Y, Z]
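
As a minimal sketch of the extrapolation step, the Python function below fits a straight line through a pair of history points and solves for the per-server request rate at which the line reaches the response time threshold. The slide does not say which functional form Hidra fits, so the linear fit is an assumption, as are the function name and the convention of returning `None` for a pair that does not have positive slope.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]  # (per-server request rate, mean response time)


def feasible_rate(p1: Point, p2: Point, threshold: float) -> Optional[float]:
    """Request rate at which a line through p1 and p2 reaches the threshold.

    Interpolates when the threshold lies between the two response times
    and extrapolates otherwise. Returns None when the pair does not match
    the expected shape of a response time curve (positive slope).
    """
    (x1, y1), (x2, y2) = sorted([p1, p2])
    if x2 == x1:
        return None
    slope = (y2 - y1) / (x2 - x1)
    if slope <= 0:
        return None
    return x1 + (threshold - y1) / slope
```

For instance, with history points (20 req/s, 0.05 s) and (40 req/s, 0.10 s), a threshold of 0.08 s is interpolated to about 32 req/s, and a threshold of 0.15 s is extrapolated to about 60 req/s.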

  12. Single-Tier Evaluation: Overview • Approach: Apply Hidra to allocate resources for a simulated cluster • Simulation allows easy control of cluster behavior and determination of optimal allocation • Each server modeled as simple M/M/1 queue with time-varying arrival rate λ and service rate μ • Provides response time function that varies over time • More complex models not needed for our purposes • Effectiveness of freshness and confidence • Effectiveness for clusters with non-linear cluster performance scaling
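
The reference allocation in such a simulation can be computed in closed form, because the mean response time of a stable M/M/1 queue is 1 / (service rate − arrival rate). The Python sketch below uses that standard result. It assumes the tier's load is split evenly across independent servers; the function names are illustrative and do not come from the slides.

```python
import math


def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue; infinite if the queue is unstable."""
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def optimal_servers(applied_load: float, service_rate: float, threshold: float) -> int:
    """Fewest servers whose mean response time stays at or below the threshold.

    Assumes the applied load is divided evenly among identical servers.
    """
    # Largest per-server arrival rate that still meets the threshold.
    max_rate = service_rate - 1.0 / threshold
    if max_rate <= 0:
        raise ValueError("threshold cannot be met at this service rate")
    return math.ceil(applied_load / max_rate)
```

With a service rate of 40 req/s and a threshold of 0.25 s, each server can accept up to 36 req/s, so a load of 360 req/s needs 10 servers and a load of 361 req/s needs 11.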

  13. Effectiveness of Freshness • Increase μ steadily over time from 40 to 70 req/s • No freshness (red) uses obsolete information • Freshness (green) close to optimal (blue) allocation

  14. Effectiveness of Confidence • Set μ constant over time except for periodic transients • [Figure panels: freshness only, no confidence; freshness and confidence] • Using confidence, Hidra is less susceptible to short-term transients because it preserves more commonly observed values

  15. Non-Linear Cluster Scaling • Response time function may be sensitive to the resource allocation. Examples: • Caching effect: Memory in each additional server adds to total effective content cache capacity if shared effectively ⇒ throughput scales faster than N • Communication effect: Overhead of coordination between servers ⇒ throughput scales slower than N • Evaluate using request rates from hp.com logs for a 24-hour period • Caching: assume hit ratio increases linearly with N, causing increase of service rate μ • Communication: increase service time (1/μ) linearly with N
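
The two scaling effects can be sketched as service-rate functions of the number of servers, combined with a search for the smallest feasible allocation. The structure follows the slide (hit ratio linear in N for caching, service time linear in N for communication), but every numeric parameter below, the M/M/1 response time, and the linear scan are hypothetical choices for illustration rather than values from the evaluation.

```python
from typing import Callable, Optional


def caching_service_rate(n: int, hit0: float = 0.5, hit_gain: float = 0.02,
                         t_hit: float = 0.01, t_miss: float = 0.04) -> float:
    """Service rate when the cache hit ratio grows linearly with n (capped at 1)."""
    hit = min(1.0, hit0 + hit_gain * n)
    return 1.0 / (hit * t_hit + (1.0 - hit) * t_miss)


def communication_service_rate(n: int, base_time: float = 0.02,
                               overhead: float = 0.0005) -> float:
    """Service rate when the service time grows linearly with n."""
    return 1.0 / (base_time + overhead * n)


def min_servers(applied_load: float, threshold: float,
                service_rate: Callable[[int], float],
                max_servers: int = 1000) -> Optional[int]:
    """Smallest n whose M/M/1 mean response time meets the threshold, if any."""
    for n in range(1, max_servers + 1):
        mu = service_rate(n)
        lam = applied_load / n  # assumes load is split evenly across servers
        if lam < mu and 1.0 / (mu - lam) <= threshold:
            return n
    return None
```

Because the service rate itself depends on the allocation, the feasible per-server rate cannot be computed once and divided into the load; the sketch instead checks each candidate allocation in turn.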

  16. Caching Effect Results • [Figure panels: response time; resource allocation; service rate μ] • Wide variation in the average behavior of a server • Each server is more effective as allocation is increased • Hidra adapts, achieving close to optimal allocation

  17. Communication Effect Results • [Figure panels: response time; resource allocation; service rate μ] • Opposite service rate behavior compared to caching • Each server is less effective as allocation is increased • Hidra handles this case also

  18. Multi-Tier Resource Allocation • Multi-Tier characteristics • A request to first tier could trigger multiple secondary requests to other tiers • Average response time is sum of average response times of each tier • Cost of resource could be different for different tiers • Multi-Tier resource allocation as an extension of the single-tier case • Response time for each tier computed using single-tier algorithm • Dynamically vary target response times for each tier to minimize the total cost of the resource allocation • Same client request rate used for all tiers
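
To make the multi-tier objective concrete, the Python sketch below searches exhaustively for the cheapest allocation whose summed per-tier response times meet an end-to-end threshold. This brute-force search is a stand-in for illustration only: the slides say Hidra varies per-tier target response times dynamically and do not describe the search it uses. The M/M/1 response time, the tier parameters, and all names are assumptions.

```python
import math
from itertools import product
from typing import List, Optional, Tuple


def mm1_time(load: float, servers: int, service_rate: float) -> float:
    """Mean M/M/1 response time per server when load is split evenly."""
    lam = load / servers
    return math.inf if lam >= service_rate else 1.0 / (service_rate - lam)


def cheapest_allocation(load: float, tiers: List[Tuple[float, float]],
                        threshold: float, max_servers: int = 50
                        ) -> Optional[Tuple[float, Tuple[int, ...]]]:
    """Return (total cost, servers per tier) for the cheapest feasible allocation.

    tiers: one (service_rate, cost_per_server) pair per tier.
    The same client request rate is applied to every tier, and the
    end-to-end response time is the sum of the per-tier response times.
    """
    best = None
    for alloc in product(range(1, max_servers + 1), repeat=len(tiers)):
        total_time = sum(mm1_time(load, n, mu) for n, (mu, _) in zip(alloc, tiers))
        if total_time > threshold:
            continue
        cost = sum(n * c for n, (_, c) in zip(alloc, tiers))
        if best is None or cost < best[0]:
            best = (cost, alloc)
    return best
```

When two tiers have the same service rate but the second is more expensive per server, the cheapest allocation gives the second tier no more servers than the first, which amounts to granting the expensive tier a looser response time target.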

  19. Two-Tier Results • [Figure: total cost of allocated servers for three cases: caching (both tiers); communication (both tiers); caching (tier 1) with communication (tier 2)] • Same effect in both tiers ⇒ results similar to single-tier case are optimal • Different effects in each tier ⇒ optimal allocation has cost intermediate between the two extremes • Hidra adapts successfully to all these cases

  20. Summary • Presented Hidra for history-based resource allocation of server clusters • Proposed use of freshness and confidence to update history-based model effectively • Developed extrapolation approach for finding operating point with incomplete model • Extended the model to multi-tier systems • Simulation-based results show scheme is promising for both single-tier and multi-tier systems