230 likes | 372 Views
Power Management (Application of Autonomic Computing Concepts) Omer Rana. Requirements. Power an important design constraint: Electricity costs Heat dissipation Two key options in clusters – enable scaling of: Operating frequency (square relation) Supply voltage (cubic relation)
E N D
Power Management(Application of Autonomic Computing Concepts)Omer Rana
Requirements • Power an important design constraint: • Electricity costs • Heat dissipation • Two key options in clusters – enable scaling of: • Operating frequency (square relation) • Supply voltage (cubic relation) • Balance QoS requirements – e.g.fraction of workload to process locally – with power management
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
The case for power management in HPC • Power/energy consumption a critical issue • Energy = Heat; Heat dissipation is costly • Limited power supply • Non-trivial amount of money • Consequence • Performance limited by available power • Fewer nodes can operate concurrently • Opportunity: bottlenecks • Bottleneck component limits performance of other components • Reduce power of some components, not overall performance • Today, CPU is: • Major power consumer (~100W), • Rarely bottleneck and • Scalable in power/performance (frequency & voltage) Power/performance “gears”
Is CPU scaling a win? • Two reasons: • Frequency and voltage scaling Performance reduction less than Power reduction • Application throughput Throughput reduction less than Performance reduction • Assumptions • CPU large power consumer • CPU driver • Diminishing throughput gains power (1) CPU power P = ½ CVf2 performance (freq) application throughput (2) performance (freq)
AMD Athlon-64 • x86 ISA • 64-bit technology • Hypertransport technology – fast memory bus • Performance • Slower clock frequency • Shorter pipeline (12 vs. 20) • SPEC2K results • 2GHz AMD-64 is comparable to 2.8GHz P4 • P4 better on average by 10% & 30% (INT & FP) • Frequency and voltage scaling • 2000 – 800 MHz • 1.5 – 1.1 Volts From: Vincent W. Freeh (NCSU)
LMBench results • LMBench • Benchmarking suite • Low-level, micro data • Test each “gear” From: Vincent W. Freeh (NCSU)
Operating system functions From: Vincent W. Freeh (NCSU)
Communication From: Vincent W. Freeh (NCSU)
The problem • Peak power limit, P • Rack power • Room/utility • Heat dissipation • Static solution, number of servers is • N = P/Pmax • Where Pmax maximum power of individual node • Problem • Peak power > average power (Pmax > Paverage) • Does not use all power – N * (Pmax - Paverage) unused • Under performs – performance proportional to N • Power consumption is not predictable From: Vincent W. Freeh (NCSU)
From: Vincent W. Freeh (NCSU) Safe over provisioning in a cluster • Allocate and manage power among M > N nodes • Pick M > N • Eg, M = P/Paverage • MPmax > P • Plimit = P/M • Goal • Use more power, safely under limit • Reduce power (& peak CPU performance) of individual nodes • Increase overall application performance Pmax Pmax power power Paverage Paverage Plimit P(t) P(t) time time
From: Vincent W. Freeh (NCSU) Safe over provisioning in a cluster • Benefits • Less “unused” power/energy • More efficient power use • More performance under same power limitation • Let P be performance • Then more performance means: MP * > NP • Or P */ P > N/M or P */ P > Plimit/Pmax unused energy Pmax Pmax power power Paverage Paverage Plimit P(t) P(t) time time
When is this a win? P */ P < Paverage/Pmax • When P */ P > N/M or P */ P > Plimit/Pmax In words: power reduction more than performance reduction • Two reasons: • Frequency and voltage scaling • Application throughput power (1) P */ P > Paverage/Pmax performance (freq) application throughput (2) performance (freq) From: Vincent W. Freeh (NCSU)
Feedback-directed, adaptive power control • Uses feedback to control power/energy consumption • Given power goal • Monitor energy consumption • Adjust power/performance of CPU • Several policies • Average power • Maximum power • Energy efficiency: select slowest gear (g) such that From: Vincent W. Freeh (NCSU)
A more holistic approach: Managing a Data Center From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)
From: Justin Moore, Ratnesh Sharma, Rocky Shih, Jeff Chase, Chandrakant Patel, Partha Ranganathan (HP Labs)