Managing Energy and Server Resources in Hosting Centers Jeff Chase, Darrell Anderson, Ron Doyle, Prachi Thakar, Amin Vahdat Duke University
Back to the Future • Return to server-centered computing: applications run as services accessed through the Internet. • Web-based services, ASPs, “netsourcing” • Internet services are hosted on server clusters. • Incrementally scalable, etc. • Server clusters may be managed by a third party. • Shared data center or hosting center • Hosting utility offers economies of scale: • Network access • Power and cooling • Administration and security • Surge capacity
Managing Energy and Server Resources • Key idea: a hosting center OS maintains the balance of requests and responses, energy inputs, and thermal outputs. • US in 2003: 22 TWh ($1B - $2B+) • Adaptively provision server resources to match request load. • Provision server resources for energy efficiency. • Degrade service on power/cooling failures. • [Diagram: requests flow in and responses flow out; energy flows in and waste heat flows out; power/cooling "browndown"; dynamic thermal management [Brooks]]
Contributions • Architecture/prototype for adaptive provisioning of server resources in Internet server clusters (Muse) • Software feedback • Reconfigurable request redirection • Addresses a key challenge for hosting automation • Foundation for energy management in hosting centers • 25% - 75% energy savings • Degrade rationally (“gracefully”) under constraint (e.g., browndown) • Simple “economic” resource allocation • Continuous utility functions: customers “pay” for performance. • Balance service quality and resource usage.
Static Provisioning • Dedicate fixed resources per customer • Typical of “co-lo” or dedicated hosting • Reprovision manually as needed • Overprovision for surges • High variable cost of capacity How to automate resource provisioning for managed hosting?
Load Is Dynamic • [Plot: ibm.com external site, February 2001: daily fluctuations (3x), workday cycle, weekends off; x-axis spans M T W Th F S S] • [Plot: World Cup soccer site (ita.ee.lbl.gov), May–June 1998, weeks 6–8: seasonal fluctuations, event surges (11x)]
Adaptive Provisioning • Efficient resource usage • Load multiplexing • Surge protection • Online capacity planning • Dynamic resource recruitment • Balance service quality with cost • Service Level Agreements (SLAs)
Utilization Targets • μi = allocated server resource for service i • ρi = utilization of μi at service i's current load λi • ρtarget = configurable target level for ρi • Leave headroom for load spikes. • ρi > ρtarget: service i is underprovisioned • ρi < ρtarget: service i is overprovisioned
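A minimal sketch of the over/underprovisioning test above, assuming utilization is measured as consumed resource divided by the allotment μi; the names and the target value are illustrative, not from the Muse prototype:

```python
RHO_TARGET = 0.5  # configurable target; leaves headroom for load spikes

def provisioning_status(mu_i, consumed_i):
    """mu_i: allocated server resource; consumed_i: resource in use."""
    rho_i = consumed_i / mu_i  # utilization of the allotment
    if rho_i > RHO_TARGET:
        return "underprovisioned"   # grow the allotment
    elif rho_i < RHO_TARGET:
        return "overprovisioned"    # surplus may be reclaimed
    return "on target"
```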
Muse Architecture • [Diagram: offered request load enters through reconfigurable switches to a server pool of stateless, interchangeable servers backed by a storage tier; the Executive receives performance measures and issues configuration commands] • Executive controls mapping of service traffic to server resources by means of: • reconfigurable switches • scheduler controls (shares)
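The Executive's sense/plan/actuate cycle might look roughly like the sketch below. All helpers are stand-ins (a random load probe, a proportional-share planner in place of the economic optimizer, a print in place of switch reconfiguration); none of these names come from the Muse code.

```python
import random
import time

ADJUST_INTERVAL = 0.1  # seconds between re-evaluations (illustrative)

def measure():
    # Stub: in Muse, load and utilization arrive as performance
    # measures from the server pool and redirectors.
    return {"service-a": random.uniform(0.0, 1.0),
            "service-b": random.uniform(0.0, 1.0)}

def plan(loads, capacity=4.0):
    # Stub: proportional shares in place of the economic optimizer.
    total = sum(loads.values()) or 1.0
    return {svc: capacity * load / total for svc, load in loads.items()}

def reconfigure(allotments):
    # Stub: in Muse this would update redirector maps and
    # scheduler shares (configuration commands).
    print("new allotments:", allotments)

def executive_loop(rounds=3):
    for _ in range(rounds):
        reconfigure(plan(measure()))
        time.sleep(ADJUST_INTERVAL)  # let the system stabilize

executive_loop()
```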
Server Power Draw • 866 MHz P-III, SuperMicro 370-DER (FreeBSD), measured with a Brand Electronics 21-1850 digital power meter • boot: 136 W • CPU max: 120 W • CPU idle: 93 W • disk spin: 6–10 W • off/hibernate: 2–3 W • Idling consumes 60% to 70% of peak power demand.
Energy vs. Service Quality • [Diagram: load for four servers A–D, spread across the full pool vs. concentrated on two servers] • Active set = {A,B,C,D}: ρi < ρtarget, low latency • Active set = {A,B}: ρi = ρtarget, meets quality goals, saves energy
Energy-Conscious Provisioning • Light load: concentrate traffic on a minimal set of servers. • Step down surplus servers to a low-power state. • APM and ACPI • Activate surplus servers on demand. • Wake-On-LAN • Browndown: can provision for a specified energy target.
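As a rough illustration of concentrating traffic, the sketch below sizes the minimal active set for a homogeneous pool, assuming each server supplies one unit of capacity and aggregate demand is expressed in the same units; the function name and defaults are invented.

```python
import math

def size_active_set(total_demand, rho_target=0.5, pool_size=5):
    """Smallest number of servers that keeps per-server utilization
    at or below rho_target, clamped to the pool size."""
    needed = math.ceil(total_demand / rho_target)
    return min(max(needed, 1), pool_size)

# Example: demand of 1.2 server-units at rho_target = 0.5 needs 3
# active servers; the other 2 can be stepped down to low power.
print(size_active_set(1.2))  # -> 3
```

In the real system the surplus servers are then stepped down via APM/ACPI and reactivated with Wake-On-LAN as the load estimate rises.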
Resource Economy • Input: the "value" of performance for each customer i. • Common unit of value: "money". • Derives from the economic value of the service. • Enables SLAs to represent flexible quality vs. cost tradeoffs. • Per-customer utility function Ui = bid – penalty. • Bid for traffic volume (throughput λi). • Bid for better service quality, or subtract penalty for poor quality. • Allocate resources to maximize expected global utility ("revenue" or reward). • Predict performance effects. • "Sell" to the highest bidder. • Never sell resources below cost. • Maximize Σi bidi(λi(t, μi)) subject to Σi μi ≤ μmax
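One way to picture a Ui = bid – penalty utility function is the illustrative sketch below; the concave square-root bid curve and the linear latency penalty are invented examples of the shapes the slide describes, not the SLAs used in the experiments.

```python
import math

def bid(throughput, dollars_per_unit=1.0):
    # Concave: diminishing marginal value as throughput grows.
    return dollars_per_unit * math.sqrt(throughput)

def penalty(latency_ms, target_ms=50.0, rate=0.01):
    # Charge for service quality below the (hypothetical) SLA target.
    return rate * max(0.0, latency_ms - target_ms)

def utility(throughput, latency_ms):
    # Ui = bid - penalty
    return bid(throughput) - penalty(latency_ms)

print(utility(throughput=100.0, latency_ms=80.0))  # 10.0 - 0.3 = 9.7
```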
Maximizing Revenue • Consider any customer i with allotment μi at fixed time t. • The marginal utility (pricei) for a resource unit allotted to i or reclaimed from i is the gradient of Ui at μi. • Adjust allotments until price equilibrium is reached. • The algorithm assumes that Ui is "concave": the price gradients are non-negative and monotonically non-increasing. • [Plot: expected utility Ui(t, μi) vs. resource allotment μi; the slope of the concave curve at μi is pricei]
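Under the concavity assumption, a greedy allocator that always hands the next resource unit to the customer with the highest price gradient reaches the equilibrium. The sketch below is a minimal illustration with invented utility curves, not the incremental algorithm used in the prototype.

```python
import math

def allocate(utilities, total_units, step=1):
    """utilities: dict mapping name -> concave utility function of
    the allotment in units. Greedy unit-by-unit allocation is optimal
    when every utility function is concave."""
    allot = {name: 0 for name in utilities}
    for _ in range(0, total_units, step):
        # Marginal utility (price gradient) of one more unit each.
        gradients = {name: u(allot[name] + step) - u(allot[name])
                     for name, u in utilities.items()}
        winner = max(gradients, key=gradients.get)
        if gradients[winner] <= 0:
            break  # never sell resources below cost
        allot[winner] += step
    return allot

demo = {"a": lambda m: 4 * math.sqrt(m), "b": lambda m: 2 * math.sqrt(m)}
print(allocate(demo, total_units=10))  # customer "a" outbids "b"
```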
Feedback and Stability • Allocation planning is incremental. • Adjust the solution from the previous interval to react to new observations. • Allow system to stabilize before next re-evaluation. • Set adjustment interval and magnitude to avoid oscillation. • Control theory applies. [Abdelzaher, Shin et al, 2001] • Filter the load observations to distinguish transient and persistent load changes. • Internet service workloads are extremely bursty. • Filter must “balance stability and agility” [Kim and Noble 2001].
"Flop-Flip" Filter • EWMA-based filter alone is not sufficient. • Average At for each interval t: At = αAt-1 + (1 − α)Ot • The gain α may be variable or flip-flop. • Load estimate: Et = Et-1 if |Et-1 − At| < tolerance, else Et = At • Stable • Responsive
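A minimal sketch of such a flop-flip filter, with illustrative gain and tolerance values (not the settings from the Muse prototype):

```python
class FlopFlipFilter:
    """EWMA smoothing plus a step function: hold the estimate steady
    while the smoothed average stays within a tolerance band, and
    snap to the new average once it moves outside the band."""

    def __init__(self, gain=0.7, tolerance=10.0):
        self.gain = gain            # EWMA gain (alpha), illustrative
        self.tolerance = tolerance  # band width, illustrative
        self.average = None         # A_t: smoothed observation
        self.estimate = None        # E_t: stable load estimate

    def update(self, observation):
        if self.average is None:
            self.average = self.estimate = observation
            return self.estimate
        # A_t = alpha * A_{t-1} + (1 - alpha) * O_t
        self.average = self.gain * self.average \
            + (1 - self.gain) * observation
        # E_t = E_{t-1} if |E_{t-1} - A_t| < tolerance, else A_t
        if abs(self.estimate - self.average) >= self.tolerance:
            self.estimate = self.average
        return self.estimate
```

Fed per-interval load observations, the estimate holds steady through transient bursts and snaps to the smoothed average only when the load shift persists.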
IBM Trace Run (Before) • [Plot: power draw (watts), latency (ms × 50), and throughput (requests/s) over the IBM trace under static provisioning]
Evaluating Energy Savings • Trace replay shows adaptive provisioning in action. • Server energy savings in this experiment was 29%. • 5-node cluster, 3x load swings, ρtarget = 0.5 • Expect roughly comparable savings in cooling costs. • Ventilation costs are fixed; chiller costs are proportional to thermal loading. • For a given load-curve "shape", achievable energy savings increase with cluster size. • E.g., higher request volumes, • or lower ρtarget for better service quality. • Larger clusters give finer granularity to closely match load.
Conclusions • Dynamic request redirection enables fine-grained, continuous control over mapping of workload to physical server resources in hosting centers. • Continuous monitoring and control allows a hosting center OS to provision resources adaptively. • Adaptive resource provisioning is central to energy and thermal management in data centers. • Adapt to energy “browndown” by degrading service quality. • Adapt to load swings for 25% - 75% energy savings. • Economic policy framework guides provisioning choices based on SLAs and cost/benefit tradeoffs.
Future Work • multiple resources (e.g., memory and storage) • multi-tier services and multiple server pools • reservations and latency QoS penalties • rational server allocation and request distribution • integration with thermal system in data center • flexibility and power of utility functions • server networks and overlays • performability and availability SLAs • application feedback
Muse Prototype and Testbed • [Diagram: client cluster running SURGE or trace load generators, LinkSys 100 Mb/s switch, redirectors (PowerEdge 1550), Extreme GigE switch, server pool, Executive, and power meter] • faithful trace replay + synthetic Web loads • servers CPU-bound • FreeBSD-based redirectors • resource containers • APM and Wake-on-LAN
Throughput and Latency • saturated: ρi > ρtarget; λi increases linearly with μi • overprovisioned: ρi < ρtarget; may reclaim μi(ρtarget − ρi) • Average per-request service demand: μiρi/λi
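For example, the reclaimable surplus above might be computed as in this small sketch (names illustrative):

```python
def reclaimable(mu_i, rho_i, rho_target=0.5):
    # Surplus allotment when utilization rho_i is below the target.
    return max(0.0, mu_i * (rho_target - rho_i))

print(reclaimable(mu_i=2.0, rho_i=0.3))  # 2.0 * (0.5 - 0.3) = 0.4 units
```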
An OS for a Hosting Center • Hosting centers are made up of heterogeneous components linked by a network fabric. • Components are specialized. • Each component has its own OS. • The role of a hosting center OS is to: • Manage shared resources (e.g., servers, energy) • Configure and monitor component interactions • Direct flow of request/response traffic
Outline • Adaptive server provisioning • Energy-conscious provisioning • Economic resource allocation • Stable load estimation • Experimental results