Energy and heat-aware metrics for data centers
Jaume Salom, Laura Sisó, IREC - Catalonia Institute for Energy Research
Ariel Oleksiak, Mateusz Jarus, PSNC – Poznan Supercomputing and Networking Center
Thomas Zilio, IRIT – Institut de Recherche en Informatique de Toulouse
30 September 2013

Introduction
• CoolEmAll will provide tools that let data center (DC) planners and operators carry out flexible and fast simulations to improve energy efficiency and reduce the associated carbon footprint
• Metrics suitable to quantify the energy-efficiency improvements of CoolEmAll
EuroEcoDC Workshop - Karlsruhe

Contents
• Present status of DC metrics
• Properties of CoolEmAll metrics
• New metrics proposed
• Description of experiment
• Check Imbalance of Temperature
• Further steps: other metrics to test
• Conclusions

Present status of DC metrics
• Metrics related to power for the complete DC
  • PUE
  • Global KPI
• Metrics that consider energy reuse, carbon emissions or water use
  • ERE, CUE, WUE
• Metrics that consider the power required in idle conditions
  • FVER
• Metrics for IT components
  • Power usage, resource/Watt

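The DC-level metrics listed above are usually published by The Green Grid as simple ratios against IT equipment energy: PUE (Power Usage Effectiveness), ERE (Energy Reuse Effectiveness), CUE (Carbon Usage Effectiveness) and WUE (Water Usage Effectiveness). A minimal Python sketch under those commonly cited definitions follows; the function and parameter names are illustrative and do not come from the slides.

```python
# Sketch of commonly cited Green Grid ratios; names are illustrative only.

def pue(total_facility_energy_kwh, it_energy_kwh):
    """Power Usage Effectiveness: total facility energy per unit of IT energy."""
    return total_facility_energy_kwh / it_energy_kwh

def ere(total_facility_energy_kwh, reused_energy_kwh, it_energy_kwh):
    """Energy Reuse Effectiveness: like PUE, but crediting energy reused elsewhere."""
    return (total_facility_energy_kwh - reused_energy_kwh) / it_energy_kwh

def cue(co2_emissions_kg, it_energy_kwh):
    """Carbon Usage Effectiveness: kg of CO2 emitted per kWh of IT energy."""
    return co2_emissions_kg / it_energy_kwh

def wue(water_use_litres, it_energy_kwh):
    """Water Usage Effectiveness: litres of water used per kWh of IT energy."""
    return water_use_litres / it_energy_kwh
```

With hypothetical figures of 1500 kWh total, 300 kWh reused and 1000 kWh of IT energy, `pue` gives 1.5 and `ere` gives 1.2.
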
Properties of CoolEmAll metrics
• Focus on energy, not only on peak power
• Focus on temperature, not only on power
  • Heat-aware metrics
• Focus on the useful work of applications, not only on IT consumption
• Selection of useful and consistent metrics to assess different granularity levels of a DC (CPU, rack, room)
• Holistic approach

Properties of CoolEmAll metrics
• Granularity:
  • Node unit
  • Node group
  • Rack level
  • Room of a DC
• Focus on:
  • Resource usage
  • Energy
  • Heat-aware

New metrics proposed
Metrics at node-group level
• Node-group cooling index
  • Referred to the air inlet temperatures
  • Recommended and allowed values by ASHRAE

New metrics proposed
Metrics at node-group level
• Node-group cooling index: meaning
  • CI_NG,HI = 100%: all intake temperatures ≤ max. recommended temperature
  • CI_NG,HI < 100%: at least one intake temperature > max. recommended temperature
  • CI_NG,LO = 100%: all intake temperatures ≥ min. recommended temperature
  • CI_NG,LO < 100%: at least one intake temperature < min. recommended temperature

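The formula for the node-group cooling index is not reproduced in the slide text. The sketch below assumes it follows the same form as the well-known Rack Cooling Index (RCI): over- or under-temperatures are summed and normalised by the gap between the recommended and allowable limits. The default limits (18 to 27 °C recommended, 15 to 32 °C allowable) are assumed ASHRAE class A1 values, and the function names are hypothetical.

```python
# Sketch only: assumes an RCI-style definition and ASHRAE A1 limits in deg C.

def cooling_index_hi(inlet_temps, t_max_rec=27.0, t_max_all=32.0):
    """100% when no inlet exceeds the max. recommended temperature."""
    over = sum(t - t_max_rec for t in inlet_temps if t > t_max_rec)
    return (1.0 - over / ((t_max_all - t_max_rec) * len(inlet_temps))) * 100.0

def cooling_index_lo(inlet_temps, t_min_rec=18.0, t_min_all=15.0):
    """100% when no inlet falls below the min. recommended temperature."""
    under = sum(t_min_rec - t for t in inlet_temps if t < t_min_rec)
    return (1.0 - under / ((t_min_rec - t_min_all) * len(inlet_temps))) * 100.0
```

Under these assumptions, inlet temperatures of 20, 22 and 25 °C give 100% for both indices, while replacing 25 °C with 29 °C drops the HI index below 100%, matching the meaning given above.
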
New metrics proposed
Metrics at node-group, rack and DC level
• Imbalance of temperature of CPU
  • Im_NG,temp = 0 means that all nodes work at the same temperature

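The formula for the imbalance metric is not reproduced in the slide text. A minimal sketch, assuming the imbalance is the spread between the hottest and the coldest CPU relative to the average CPU temperature, expressed as a percentage. This assumed form is consistent with the statement that a value of 0 means all nodes work at the same temperature; the function name is hypothetical.

```python
# Sketch only: assumes imbalance = (T_max - T_min) / T_avg * 100.

def temperature_imbalance(cpu_temps):
    """Percentage spread of CPU temperatures within a node group."""
    temps = list(cpu_temps)
    t_avg = sum(temps) / len(temps)
    return (max(temps) - min(temps)) / t_avg * 100.0
```

The same function could be applied at rack or DC level by passing the CPU temperatures of all nodes in that scope.
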
Description of experiment
• RECS prototype server from the Christmann company
• RECS: high-density multi-node computer with 18 single-server nodes within one rack unit
• CPU: Intel Core i7-3615QE CPU @ 2.30 GHz, CPU cache: 6144 KB, RAM: 16 GB
• Load: OpenSSL benchmark

Description of experiment
• Six configurations

Check Imbalance of Temperature
Unexpected imbalance!

Check Imbalance of Temperature
• Analysis:
  • Failure of one fan on the right side!
  • Imbalance was higher when the load was placed on the right side instead of the left side
  • Metric recalculated by replacing the CPU temperature of the node with the failed fan with the average of the other nodes with a similar load

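The recalculation described above can be sketched as follows. This is an illustration under stated assumptions: "similar load" is approximated by an identical load label, the imbalance is assumed to be (T_max - T_min) / T_avg in percent, and all names and temperature values are hypothetical rather than measured data.

```python
# Sketch only: substitutes the faulty node's CPU temperature before recomputing.

def temperature_imbalance(cpu_temps):
    """Assumed form: (T_max - T_min) / T_avg * 100."""
    temps = list(cpu_temps)
    t_avg = sum(temps) / len(temps)
    return (max(temps) - min(temps)) / t_avg * 100.0

def substitute_faulty_node(cpu_temps, loads, faulty_node):
    """Return a copy of cpu_temps where the faulty node gets the mean
    temperature of the other nodes carrying the same load label."""
    peers = [t for node, t in cpu_temps.items()
             if node != faulty_node and loads[node] == loads[faulty_node]]
    if not peers:
        raise ValueError("no other node with a similar load")
    corrected = dict(cpu_temps)
    corrected[faulty_node] = sum(peers) / len(peers)
    return corrected

# Hypothetical readings in deg C; node "n3" sits behind the failed fan.
cpu_temps = {"n1": 60.0, "n2": 61.0, "n3": 80.0, "n4": 40.0}
loads = {"n1": "full", "n2": "full", "n3": "full", "n4": "idle"}
corrected = substitute_faulty_node(cpu_temps, loads, "n3")
raw_imbalance = temperature_imbalance(cpu_temps.values())
recalculated_imbalance = temperature_imbalance(corrected.values())
```

In this made-up example the recalculated imbalance is lower than the raw one, because the overheated node no longer sets the maximum temperature.
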
Check Imbalance of Temperature
“Inlet” configuration: the temperature of loaded nodes affects the temperature of idle nodes
Balanced!

Further steps: other metrics to test
• Idea: heat-aware + useful work + energy
• Other metrics that will be analysed in depth:
  • Relation between imbalance of temperature and temperature or heat dissipated
  • Productivity (useful work / energy)
  • PUE Scalability
  • FVER

Further steps: other metrics to test
• PUE Scalability
Source: The Green Grid, WP#49

Further steps: other metrics to test
• FVER: fixed to variable energy ratio
  • Source: BSC
  • Shows how much energy produces useful work and how much could be removed
  • E_fixed: energy consumed when useful work = 0
  • During flat operation a DC can consume up to 80% of peak power!
1st Review, 30.10.2012, Brussels

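The slide does not give the FVER formula. A minimal sketch, assuming the commonly cited definition FVER = 1 + E_fixed / E_variable, where the variable energy is taken as the total energy minus the fixed energy; the function and parameter names are hypothetical.

```python
# Sketch only: assumes FVER = 1 + E_fixed / E_variable.

def fver(fixed_energy_kwh, total_energy_kwh):
    """Fixed to variable energy ratio; 1.0 would mean no fixed overhead."""
    variable_energy_kwh = total_energy_kwh - fixed_energy_kwh
    if variable_energy_kwh <= 0:
        raise ValueError("total energy must exceed fixed energy")
    return 1.0 + fixed_energy_kwh / variable_energy_kwh
```

For illustration, if 80 kWh of a 100 kWh total were consumed regardless of useful work, this would give an FVER of 5, whereas a facility with no fixed consumption would score 1.
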
Conclusions
• Imbalance of temperature makes it possible to detect failures of IT equipment
• Imbalance of temperature and the node-group cooling index are complementary
• Analysing several metrics together will improve awareness of cooling requirements and of the possibility of reducing them:
  • Imbalance of temperature
  • Power usage (power / max. rated power)
  • Productivity (useful work / energy)
  • FVER
  • PUE Scalability

Conclusions
• The relations between
  • power,
  • cooling requirements,
  • resource usage, and
  • workload management
  will be identified in order to reveal appropriate strategies for improving energy efficiency
• First results from tests on the first prototype have been collected. More experiments will be carried out to validate the proposed metrics

Questions? Comments?