
Power Management: Research Review




Presentation Transcript


  1. Power Management: Research Review. Bithika Khargharia, Aug 5th, 2005

  2. Single data-center rack: Some figures • Cost of power and cooling equipment ~ $52,800 over a 10-year lifespan • Electricity costs for a typical 300 W server: energy consumption/year = 2,628 kWh; cooling/year = 748 kWh; at $0.10/kWh, total = $338/year • Excludes energy costs due to air circulation and power delivery sub-systems • Electricity cost over 10 years for a typical data-center rack = $22,800
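A quick check of the per-server arithmetic behind these figures; the cooling figure and the $0.10/kWh price are taken directly from the slide, and continuous 24x7 operation of the 300 W server is assumed:

```python
# Back-of-the-envelope check of the per-server electricity figures above.
# Assumption: the 300 W server runs continuously all year; the cooling
# energy is taken from the slide rather than derived.
SERVER_POWER_W = 300
HOURS_PER_YEAR = 24 * 365            # 8,760 h
COOLING_KWH_PER_YEAR = 748           # figure quoted on the slide
PRICE_PER_KWH = 0.10                 # USD

server_kwh = SERVER_POWER_W * HOURS_PER_YEAR / 1000    # 2,628 kWh
total_kwh = server_kwh + COOLING_KWH_PER_YEAR          # 3,376 kWh
annual_cost = total_kwh * PRICE_PER_KWH                # ~$338 per year

print(f"{server_kwh:.0f} kWh compute + {COOLING_KWH_PER_YEAR} kWh cooling "
      f"-> ${annual_cost:.0f}/year per server")
```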

  3. Motivation: Reduce TCO • Power equipment = 36% • Cooling equipment = 8% • Electricity = 19% • Total = 63% of the TCO of a data center's physical infrastructure

  4. Some Objectives • Explore possible power-savings areas • Reduce TCO by operating within a reduced power budget • Develop QoS-aware power management techniques • Develop power-aware resource scheduling and resource partitioning techniques

  5. Power management : Problem Domains • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  6. Power management : Problem Domains • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  7. Battery-operated devices: Power management • Transition hardware components between high- and low-power states (Hsu & Kremer, '03, Rutgers; Weiser, '94, Xerox PARC) • Deactivation decisions involve power usage prediction - periods of inactivity, e.g. time between disk accesses (Douglis, Krishnan, Marsh, '94; Li, '94, UCB) - other high-level information (Heath, '02, Rutgers; Weissel et al., '02, University of Erlangen) • Mechanism supported by ACPI technology • Usually incurs both energy and performance penalties
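A minimal sketch of the inactivity-timeout deactivation idea described above, assuming a simple two-state component model; the state names and the timeout value are illustrative, not taken from the cited papers:

```python
import time

class IdleTimeoutPolicy:
    """Drop a component to a low-power state after a fixed idle period."""

    def __init__(self, timeout_s=2.0):
        self.timeout_s = timeout_s
        self.last_access = time.monotonic()
        self.state = "ACTIVE"

    def on_access(self):
        # Wake on demand; a real device would pay a latency/energy penalty here.
        if self.state == "SLEEP":
            self.state = "ACTIVE"
        self.last_access = time.monotonic()

    def tick(self):
        # Called periodically: deactivate once idle longer than the threshold.
        idle = time.monotonic() - self.last_access
        if self.state == "ACTIVE" and idle > self.timeout_s:
            self.state = "SLEEP"
```

Prediction-based schemes replace the fixed timeout with an estimate of the next idle period's length (e.g., from recent inter-access times).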

  8. Power management : Problem Domains • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  9. Power management Schemes: Server Systems • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  10. Power management Schemes: Server Systems • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  11. Server Power management: Local Schemes Attacks processor power usage (Elnozahy, Kistler, Rajamony, '03, IBM Austin) • Dynamic voltage scaling (DVS) - extends DVS to server environments with concurrent tasks (Flautner, Reinhardt, Mudge, '01, UMich) - conserves the most energy for intermediate load intensities • Request batching - the processor awakens when the accumulated requests' pending time > batch timeout - conserves the most energy for low load intensities • Combination of both - conserves energy for a wide range of load intensities
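A sketch of how the combined policy might choose between the two mechanisms, following the load-intensity ranges quoted above; the utilization thresholds and the 5 ms batch timeout are assumptions for illustration, not the paper's parameters:

```python
def choose_policy(utilization, oldest_pending_ms, batch_timeout_ms=5.0):
    """Pick a power-saving mechanism from the current load intensity."""
    if utilization < 0.2:
        # Low load: batch requests and keep the CPU in a low-power state
        # until the oldest pending request has waited for the batch timeout.
        return "BATCH" if oldest_pending_ms < batch_timeout_ms else "DRAIN_BATCH"
    elif utilization < 0.7:
        # Intermediate load: dynamic voltage scaling, i.e. run at the lowest
        # frequency that still meets the response-time target.
        return "DVS"
    else:
        # High load: full voltage and frequency.
        return "FULL_SPEED"
```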

  12. Server Power management: QoS driven Local Schemes [Figure: feedback-driven control framework - the specified QoS is compared with the computed actual QoS, and QoS-aware management strategies are applied accordingly]
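A minimal sketch of one step of such a feedback loop, assuming the measured QoS is a response-time percentile and the actuator is a discrete CPU frequency index (both the knob and the step size are illustrative):

```python
def control_step(target_ms, measured_ms, freq_index, max_index):
    """Adjust the CPU frequency index from the QoS error."""
    if measured_ms > target_ms:
        # QoS violated: give the processor more performance (less power saving).
        return min(freq_index + 1, max_index)
    if measured_ms < 0.8 * target_ms:
        # Comfortable slack: step the frequency down to save more power.
        return max(freq_index - 1, 0)
    return freq_index
```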

  13. Server Power management: QoS driven Local Schemes Some results (Elnozahy, Kistler, Rajamony, '03, IBM Austin) • QoS metric: 90th-percentile response time, with a 50 ms target • Validated Web-server simulator • Web workloads from real Web server systems - Nagano Olympics '98 server - financial-services company site - disk-intensive workload

  14. Server Power management: QoS driven Local Schemes Some results [Figures: energy savings vs. load intensity for the Finance and Disk-intensive workloads] Savings increase with workload, stabilize, and then decrease

  15. Server Power management: QoS driven Local Schemes Results summary • DVS saves 8.7 to 38% of CPU energy • Request batching saves 3.1 to 27% of CPU energy • The combined technique saves 17 to 42% across all three workload types and different load intensities

  16. Power management Schemes: Server Systems • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  17. Server Power management: Local Schemes Storage servers: Attacks disk power usage • Multi-speed disks for servers (Carrera, Pinheiro, Bianchini, '02, Rutgers; Gurumurthi et al., '03, PennState / IBM T.J. Watson) - dynamically adjust speed according to the load imposed on the disk - performance and power models exist for multi-speed disks - transition speeds dynamically based on disk response time - results with simulation and synthetic workloads: energy savings up to 60%
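A sketch of response-time-driven speed switching for a two-speed disk, in the spirit of the work cited above; the slack thresholds and the use of an averaged response time are assumptions, not the papers' exact policy:

```python
def next_speed(current_rpm, avg_response_ms, limit_ms,
               low_rpm=10_000, high_rpm=15_000):
    """Pick the disk speed for the next interval from observed response times."""
    if current_rpm == high_rpm and avg_response_ms < 0.5 * limit_ms:
        return low_rpm   # ample slack: save energy at the lower speed
    if current_rpm == low_rpm and avg_response_ms > 0.9 * limit_ms:
        return high_rpm  # approaching the limit: spin back up
    return current_rpm
```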

  18. Server Power management: Local Schemes Storage servers: Attacks disk power usage (Carrera, Pinheiro, Bianchini, '02, Rutgers) • Four disk energy management techniques - combines laptop and SCSI disks - results with a kernel-level implementation and real workloads: up to 41% energy savings for over-provisioned servers - two-speed disks (15,000 rpm and 10,000 rpm) - results with emulation and the same real workloads: energy savings up to 20% for properly provisioned servers

  19. Server Power management: Local Schemes [Figure: disk load and speed transitions over time] • Alternation of server load peaks and valleys; lighter weekend loads • 22% energy savings, switching to 15,000 rpm only 3 times

  20. Server Power management: Local Schemes Storage servers: Attacks database servers' power usage • Effect of RAID parameters for disk-array based servers (Gurumurthi, '03, PennState) - parameters: RAID level, stripe size, number of disks - effect of varying these parameters on performance and energy consumption for database servers running transaction workloads

  21. Server Power management: Local Schemes Storage servers: Attacks disk power usage • Storage cache replacement techniques (Zhu, '04, UIUC) - increase disk idle time by selectively keeping certain disk blocks in the main memory cache • Dynamically adjusted memory partitions for caching disk data (Zhu, Shankar, Zhou, '04, UIUC)

  22. Server Power management: Local Schemes Storage servers: Attacks disk power usage, involves data movement • Using MAID (massive array of idle disks) (Colarelli, Grunwald, '02, U of Colorado, Boulder) - replaces old tape back-up archives - copies accessed data to cache disks, spins down all other disks - LRU to implement cache-disk replacement - write back when dirty - sacrifices access time in favor of energy conservation
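A sketch of the cache-disk bookkeeping behind MAID as summarized above: an LRU-managed cache of blocks held on always-spinning disks, with dirty blocks written back to the spun-down archive disks on eviction. The block abstraction and the two placeholder I/O methods are illustrative:

```python
from collections import OrderedDict

class MaidCache:
    """LRU cache of blocks kept on the always-on cache disks."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()          # block_id -> dirty flag

    def access(self, block_id, write=False):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)            # LRU update on a hit
            self.blocks[block_id] = self.blocks[block_id] or write
            return
        # Miss: spin up the owning archive disk and copy the block in.
        self._read_from_archive(block_id)
        self.blocks[block_id] = write
        if len(self.blocks) > self.capacity:
            victim, dirty = self.blocks.popitem(last=False)
            if dirty:
                self._write_back(victim)                 # flush only dirty victims

    def _read_from_archive(self, block_id):
        pass  # placeholder: spin up the archive disk and read the block

    def _write_back(self, block_id):
        pass  # placeholder: spin up the archive disk and write the block
```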

  23. Server Power management: Local Schemes Storage servers: Attacks disk power usage, involves data movement • Popular data concentration (PDC) technique (Pinheiro, Bianchini, '04, Rutgers) - file access frequencies are heavily skewed for server workloads - concentrate the most popular disk data on a subset of disks - the other disks stay idle longer - sacrifices access time in favor of energy conservation
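A sketch of the placement step behind PDC as described above: rank data by access frequency and pack the hottest items onto the first disks so the rest can stay idle. The per-disk capacity model is simplified for illustration:

```python
def place_by_popularity(access_counts, num_disks, items_per_disk):
    """Map items to disks so that disk 0 holds the most popular data."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    layout = {disk: [] for disk in range(num_disks)}
    for i, item in enumerate(ranked):
        disk = min(i // items_per_disk, num_disks - 1)   # overflow to the last disk
        layout[disk].append(item)
    return layout

# With heavily skewed access counts, most requests hit disk 0 and the
# higher-numbered disks see long idle periods.
```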

  24. Server Power management: Local Schemes Some results: Comparing MAID and PDC (Pinheiro, Bianchini, '04, Rutgers) • MAID and PDC can only conserve energy when server load is very low • Using 2-speed disks, MAID and PDC can conserve 30-40% of disk energy with a small fraction of delayed requests • Overall, PDC is more consistent and robust than MAID

  25. Power management Schemes: Server Systems • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  26. Server Power management: Local Schemes • Power management schemes for application servers have not been explored much.

  27. Power management Schemes: Server Systems • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  28. Server Power management: Partition-wide Schemes • No known work done so far

  29. Power management Schemes: Server Systems • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  30. Server Power management: Component-wide Schemes • The power management schemes in this space are mostly the ones used by battery-operated devices • The schemes transition a single device (CPU, memory, NIC, etc.) into different power modes • These schemes normally work independently of each other, even when applied as part of server power management at the local level

  31. Power management Schemes: Server Systems • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  32. Server Power management: HeterogeneousCluster-wide Schemes • Not much work done in this space

  33. Power management Schemes: Server Systems • Battery-operated devices • Server systems – App Servers, Storage Servers, Front-end Servers - Local schemes per server - Partition-wide schemes - Component-wide schemes • Whole data centers – Server systems, Interconnect switches, power supplies, disk-arrays - Heterogeneous cluster-wide schemes - Homogeneous cluster-wide schemes

  34. Server Power management: Homogeneous Cluster-wide Schemes Front-end Web servers (Pinheiro, '03, Rutgers; Chase, '01, Duke) • Load Concentration (LC) technique - dynamically distributes the load offered to a server cluster - under light load, idles some nodes and puts them in a low-power mode - under heavy load, the system brings resources back to a high-power mode
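A sketch of the sizing decision inside Load Concentration: estimate how many nodes the offered load needs (plus some headroom) and idle the rest; the headroom factor and the capacity figures in the example are assumptions for illustration:

```python
import math

def nodes_needed(offered_load_rps, per_node_capacity_rps,
                 headroom=1.2, total_nodes=16):
    """Number of cluster nodes to keep active for the current offered load."""
    n = math.ceil(offered_load_rps * headroom / per_node_capacity_rps)
    return max(1, min(n, total_nodes))

# Example: 1,800 req/s against 500 req/s per node keeps 5 nodes active
# and lets the remaining nodes drop into a low-power mode.
```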

  35. Server Power management: Cluster-wide Schemes Some results [Figure: number of active nodes tracks offered load] • As load increases, the number of active nodes increases • 38% energy savings

  36. Server Power management: Homogeneous Cluster-wide Schemes Front-end Web server clusters: Attacks CPU power usage (Elnozahy, Kistler, Rajamony, '03, IBM Austin) • Independent voltage scaling (IVS) - each server independently decides its CPU operating point (voltage, frequency) at runtime • Coordinated voltage scaling (CVS) - servers coordinate to determine CPU operating points (voltage, frequency) for overall energy conservation
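A toy contrast of the two schemes, assuming a small table of normalized CPU operating points; under IVS each server chooses from its own utilization, under CVS one common point is chosen from the cluster-wide average:

```python
OPERATING_POINTS = [0.6, 0.8, 1.0]   # illustrative normalized frequencies

def ivs(local_utilization):
    """Independent voltage scaling: each server decides for itself."""
    return next((f for f in OPERATING_POINTS if local_utilization <= f),
                OPERATING_POINTS[-1])

def cvs(utilizations):
    """Coordinated voltage scaling: one operating point for the whole cluster."""
    average = sum(utilizations) / len(utilizations)
    return next((f for f in OPERATING_POINTS if average <= f),
                OPERATING_POINTS[-1])
```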

  37. Server Power management: Homogeneous Cluster-wide Schemes Hot server clusters: Thermal Management (Weissel, Bellosa, Virginia) • Throttling processes to keep CPU temperatures down in server clusters - CPU performance counters to infer the energy that each process consumes - CPU halt cycles introduced if energy consumption exceeds the permitted allotment • Results - implementation in the Linux kernel for a server cluster with one Web, one factorization, and one database server - can schedule client requests according to pre-established energy allotments when throttling the CPU
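A sketch of the throttling mechanism summarized above: estimate per-process energy from performance-counter readings and insert halt cycles once a process exceeds its allotment. The counter weights are illustrative placeholders, not the calibrated values from the cited work:

```python
# Joules attributed to each counted event (illustrative numbers only).
ENERGY_WEIGHTS = {"instructions": 2e-9, "cache_misses": 1e-8, "cycles": 1e-9}

def estimate_energy_j(counters):
    """Estimate the energy a process consumed from its performance counters."""
    return sum(ENERGY_WEIGHTS[name] * counters.get(name, 0)
               for name in ENERGY_WEIGHTS)

def halt_fraction(counters, allotment_j):
    """Fraction of the next interval to spend in halt cycles (capped at 90%)."""
    used = estimate_energy_j(counters)
    if used <= allotment_j:
        return 0.0
    return min(0.9, (used - allotment_j) / used)
```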

  38. Server Power management: Homogeneous Cluster-wide Schemes Hot server clusters: Thermal Management for data centers (Moore et al., HP Labs) • Hot spots can develop in certain parts of a data center irrespective of cooling - temperature modeling work by HP Labs • Temperature-aware load-distribution policies - adjust load distribution to racks according to the temperature distribution between racks on the same row - moved load away from regions directly affected by failed air-conditioners
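A sketch of one way to express temperature-aware load distribution: weight each rack's share of incoming load by its thermal headroom, so hot racks (or racks near a failed air conditioner) receive less work. The redline temperature and the headroom model are assumptions for illustration:

```python
def rack_weights(inlet_temps_c, redline_c=35.0):
    """Share of incoming load for each rack, proportional to thermal headroom."""
    headroom = [max(0.0, redline_c - t) for t in inlet_temps_c]
    total = sum(headroom) or 1.0         # avoid division by zero if all racks are hot
    return [h / total for h in headroom]

# Example: racks at 25, 30 and 34 C receive roughly 62%, 31% and 6% of the load.
```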

  39. Challenges • No existing tool to model power and energy consumption • Develop schemes that intelligently exploit SLAs, such as request priorities, to increase savings • Develop accurate workload-based power usage prediction • Partition-wide power management schemes are not yet explored • Power management schemes for application servers have not been explored much - they use CPU and memory intensively - they store state that is typically not replicated - the challenge is to correctly trade off energy savings against performance overheads

  40. Challenges • No previous work on energy conservation in memory servers - the challenge is to properly lay out data across main-memory banks and chips to exploit low-power states more extensively • Power management for interconnects and interfaces - a 32-port gigabit Ethernet switch consumes 700 W when idle • Thermal management - requires a very good understanding of components, system layouts, and air flow in server enclosures and data centers - accurate temperature-monitoring mechanisms

  41. Challenges • Peak power management - dynamic power management can limit over-provisioning of cooling - the challenge is to provide the best performance under a fixed, smaller power budget - IBM Austin is doing some work related to memory: the power-shifting project dynamically redistributes the power budget between active and inactive components - lightweight mechanisms to control power and performance of different system components - automatic workload-characterization techniques - algorithms for allocating power among components
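A sketch of the budget-redistribution idea behind power shifting: split a fixed peak-power budget among components in proportion to their recent activity, keeping a small floor for inactive ones. The activity metric, the floor, and the numbers in the example are illustrative assumptions:

```python
def shift_budget(total_w, activity, floor_w=5.0):
    """Allocate a fixed power budget across components by recent activity."""
    names = list(activity)
    spare = max(0.0, total_w - floor_w * len(names))
    total_activity = sum(activity.values()) or 1.0
    return {name: floor_w + spare * activity[name] / total_activity
            for name in names}

# Example: shift_budget(200, {"cpu": 0.7, "memory": 0.2, "disk": 0.1})
# gives the CPU the largest share while keeping the total at 200 W.
```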

  42. Discussion

  43. Power related decision making / QoS aware adaptive Power-management Schemes
  Power related decision making:
  1. Translate a given power envelope into compute & IO power.
  2. Add a new parameter to workload requirements characterization – power.
  3. Power usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads – compute-intensive, IO-intensive, etc.
  4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now).
  • Exploit SLAs such as request priorities to increase savings.
  5. Exploit server characteristics to increase power savings – workloads, replication, frequency of access for disk-array servers.
  QoS aware adaptive Power-management Schemes:
  1. For devices that exist in the 'battery-operated world' – CPU, NIC, memory, etc. (additional power savings?)
  2. For new devices introduced by data centers – disk arrays, interconnect switches, etc.
  3. Relate power consumption to the ability to self-optimize a platform to achieve the promised QoS: power & QoS aware scheduling; power & QoS aware resource aggregation to provision platforms on demand; power & QoS aware resource partitioning.

  44. Power related decision making
  1. Translate a given power envelope into compute & IO power.
  2. Power usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads – compute-intensive, IO-intensive, etc.
  3. Add a new parameter to workload requirements characterization – power.
  4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now).

  45. Power related decision making
  1. Translate a given power envelope into compute & IO power.
  2. Power usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads – compute-intensive, IO-intensive, etc.
  3. Add a new parameter to workload requirements characterization – power.
  4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now).

  46. Power related decision making
  1. Translate a given power envelope into compute & IO power.
  2. Power usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads – compute-intensive, IO-intensive, etc.
  3. Add a new parameter to workload requirements characterization – power.
  4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now).

  47. Power related decision making / QoS aware adaptive Power-management Schemes
  Power related decision making:
  1. Translate a given power envelope into compute & IO power.
  2. Add a new parameter to workload requirements characterization – power.
  3. Power usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads – compute-intensive, IO-intensive, etc.
  4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now).
  QoS aware adaptive Power-management Schemes:
  1. For devices that exist in the 'battery-operated world' – CPU, NIC, memory, etc. (additional power savings?)
  2. For new devices introduced by data centers – disk arrays, interconnect switches, etc.

  48. Power related decision making / QoS aware adaptive Power-management Schemes
  Power related decision making:
  1. Translate a given power envelope into compute & IO power.
  2. Add a new parameter to workload requirements characterization – power.
  3. Power usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads – compute-intensive, IO-intensive, etc.
  4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now).
  QoS aware adaptive Power-management Schemes:
  1. For devices that exist in the 'battery-operated world' – CPU, NIC, memory, etc. (additional power savings?)
  2. For new devices introduced by data centers – disk arrays, interconnect switches, etc.
  3. Relate power consumption to the ability to self-optimize a platform to achieve the promised QoS: power & QoS aware scheduling; power & QoS aware resource aggregation to provision platforms on demand; power & QoS aware resource partitioning.

  49. Power related decision making / QoS aware adaptive Power-management Schemes
  Power related decision making:
  1. Translate a given power envelope into compute & IO power.
  2. Add a new parameter to workload requirements characterization – power.
  3. Power usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads – compute-intensive, IO-intensive, etc.
  4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now).
  • Exploit SLAs such as request priorities to increase savings.
  5. Exploit server characteristics to increase power savings – workloads, replication, frequency of access for disk-array servers.
  QoS aware adaptive Power-management Schemes:
  1. For devices that exist in the 'battery-operated world' – CPU, NIC, memory, etc. (additional power savings?)
  2. For new devices introduced by data centers – disk arrays, interconnect switches, etc.

  50. Power related decision making / QoS aware adaptive Power-management Schemes
  Power related decision making:
  1. Translate a given power envelope into compute & IO power.
  2. Add a new parameter to workload requirements characterization – power.
  3. Power usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads – compute-intensive, IO-intensive, etc.
  4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now).
  • Exploit SLAs such as request priorities to increase savings.
  5. Exploit server characteristics to increase power savings – workloads, replication, frequency of access for disk-array servers.
  QoS aware adaptive Power-management Schemes:
  1. For devices that exist in the 'battery-operated world' – CPU, NIC, memory, etc. (additional power savings?)
  2. For new devices introduced by data centers – disk arrays, interconnect switches, etc.
  3. Relate power consumption to the ability to self-optimize a platform to achieve the promised QoS: power & QoS aware scheduling; power & QoS aware resource aggregation to provision platforms on demand; power & QoS aware resource partitioning.
