Thermal Management of Datacenter
Qinghui Tang
Preliminaries
• What is a data center?
• What is thermal management?
• Why does Intel care?
• Why computer science?
Typical layout of a datacenter
• Rack outlet temperature T_out
• Rack inlet temperature T_in
• Air conditioner supply temperature T_s
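The inlet and outlet temperatures above are linked by a steady-state energy balance: the heat a rack dissipates equals the heat carried away by the air passing through it. A minimal sketch, assuming dry-air properties near room temperature and hypothetical power and airflow figures (none of these numbers come from the slides):

```python
# Assumed air properties (dry air near 20 °C); not taken from the slides.
AIR_DENSITY = 1.2          # kg/m^3
AIR_SPECIFIC_HEAT = 1005   # J/(kg*K)


def rack_outlet_temp(t_in_c, power_w, airflow_m3_s):
    """Estimate T_out from T_in using P = rho * f * c_p * (T_out - T_in)."""
    if airflow_m3_s <= 0:
        raise ValueError("airflow must be positive")
    heat_capacity_rate = AIR_DENSITY * airflow_m3_s * AIR_SPECIFIC_HEAT  # W/K
    return t_in_c + power_w / heat_capacity_rate


# Hypothetical rack: 6.03 kW of heat, 0.5 m^3/s of airflow, 20 °C inlet air.
t_out = rack_outlet_temp(t_in_c=20.0, power_w=6030.0, airflow_m3_s=0.5)
print(f"Estimated outlet temperature: {t_out:.1f} °C")
```

With these assumed figures the air warms by about 10 °C across the rack. Halving the airflow would double that rise, which is why inlet temperature and airflow both matter for cooling.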
State-of-the-Art Thermal Management of Data Centers
• Power densities are increasing exponentially along with Moore’s Law
• Current cooling solutions at various levels
  • Chip/component level
  • Server/board level
  • Rack level
  • Data center level
• Software-based thermal management solutions (HP + Duke)
Thermal Management of Datacenter
• Motivation and significance
  • Compute-intensive applications (online gaming, computer movie animation, data mining) require increased data center utilization
  • Maximizing computing capacity is a demanding requirement
  • New blade servers can be packed more densely
  • Energy costs are rising dramatically
• Goal
  • Improving thermal performance
  • Lowering hardware failure rate
  • Reducing energy cost
New Challenges
• Planning perspective: How to design an efficient data center?
  • Does upgrading 10% of the blade servers to smart ones help reduce cost?
• Operation perspective: How to operate the data center efficiently and lower the cost?
  • What is the trade-off between utility cost and hardware failure cost?
  • Overcooling: wastes energy and increases utility cost
  • Undercooling: increases the frequency of hardware failures
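The overcooling/undercooling trade-off can be illustrated by summing the two costs and looking for the supply temperature that minimizes the total. A minimal sketch, assuming two made-up cost curves (a linear cooling cost and a quadratic failure cost); the curve shapes and coefficients are hypothetical and not from the slides:

```python
def cooling_cost(supply_temp_c):
    """Hypothetical utility cost: colder supply air costs more."""
    return 100.0 * (30.0 - supply_temp_c)


def failure_cost(supply_temp_c):
    """Hypothetical failure cost: grows quadratically above 15 °C."""
    return 10.0 * max(0.0, supply_temp_c - 15.0) ** 2


def total_cost(supply_temp_c):
    return cooling_cost(supply_temp_c) + failure_cost(supply_temp_c)


# Scan integer setpoints from 10 °C to 30 °C and keep the cheapest one.
candidates = range(10, 31)
best_setpoint = min(candidates, key=total_cost)

for t in (10, best_setpoint, 30):
    print(f"{t} °C: cooling={cooling_cost(t):.0f} "
          f"failure={failure_cost(t):.0f} total={total_cost(t):.0f}")
```

Under these assumptions, the coldest setpoint is dominated by utility cost and the warmest by failure cost, and the minimum lies in between. Real cost curves would have to come from measurement.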
Research Issues of Thermal Management of Datacenter
[Diagram labels: Scheduler; Other Impact Factors; Control; Thermal Performance Evaluation; Cost Optimization; Abstract Heat Flow Model; Power & Load Characterization; Modeling; Thermal Performance; Multiscale & Multimodal Info Analysis; Understanding]
Multiscale and multimodal nature of datacenter management
• Information perspective
  • Multiple system variables
  • Different change patterns
  • Different sampling rates
• Control perspective
  • Responsiveness
  • Control granularity (spatial and temporal level)
  • Sensitivity analysis
Approaches
• CFD (computational fluid dynamics) simulation to characterize the thermal performance of the data center
• Online measurement and feedback control system
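The online measurement and feedback control idea can be sketched as a loop that reads the hottest rack inlet temperature and nudges the air conditioner supply temperature toward a target. This is a minimal proportional-control sketch, not the controller used in the talk; the gain, limits, target, and the toy "plant" (inlet is always 8 °C above supply) are all assumptions for illustration:

```python
def control_step(supply_temp_c, max_inlet_c, target_inlet_c=25.0, gain=0.5,
                 min_supply_c=10.0, max_supply_c=25.0):
    """One proportional-control update of the supply temperature setpoint."""
    # Positive error: inlets are cooler than needed, so supply air can be warmer.
    error = target_inlet_c - max_inlet_c
    new_supply = supply_temp_c + gain * error
    # Clamp to the range the (hypothetical) air conditioner supports.
    return min(max_supply_c, max(min_supply_c, new_supply))


# Toy plant standing in for real sensors: hottest inlet = supply + 8 °C.
supply = 15.0
for _ in range(20):
    measured_max_inlet = supply + 8.0
    supply = control_step(supply, measured_max_inlet)

print(f"Supply setpoint after 20 steps: {supply:.3f} °C")
```

In this toy setup the setpoint settles where the hottest inlet reaches the target. A real data center responds with delay and varies from rack to rack, which is where the simulation and prediction work comes in.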
CFD Simulation
• CFD model of a real facility, based on the ASU HPC center
Thermal-aware task scheduling
[Figure: diagram with elements numbered 1 to 6]
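One simple way to make scheduling thermal-aware is a greedy heuristic: place each task on the server that is currently coolest, then raise that server's estimated temperature by the heat the task adds. The sketch below illustrates that heuristic only, and the slides do not confirm it as their scheme; the server names, temperatures, loads, and the linear `heat_per_load` model are hypothetical:

```python
import heapq


def schedule(tasks, inlet_temps, heat_per_load=0.5):
    """Greedily assign tasks (largest load first) to the coolest server.

    tasks:        {task_name: load}
    inlet_temps:  {server_name: current inlet temperature in °C}
    heat_per_load: assumed temperature rise per unit of load (hypothetical).
    """
    heap = [(temp, server) for server, temp in inlet_temps.items()]
    heapq.heapify(heap)  # min-heap keyed on temperature
    assignment = {}
    for task, load in sorted(tasks.items(), key=lambda kv: -kv[1]):
        temp, server = heapq.heappop(heap)
        assignment[task] = server
        heapq.heappush(heap, (temp + heat_per_load * load, server))
    return assignment


assignment = schedule(
    tasks={"a": 4, "b": 2, "c": 2},
    inlet_temps={"s1": 20.0, "s2": 26.0, "s3": 21.0},
)
print(assignment)
```

In this example the hottest server, `s2`, receives no work. Ties between equally cool servers are broken by server name, a side effect of comparing tuples.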
Two-Pronged Approach
• Real-time measurement
• Online lightweight simulation & prediction
Different optimization goals
• Maximizing computation capacity under an energy cost constraint
• Minimizing an individual cost (computing cost or cooling cost)
• Achieving thermal balancing
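These goals can pull in different directions, which is easiest to see by scoring the same candidate operating plans under each one. A minimal sketch, assuming three hypothetical plans with made-up throughput, cost, and inlet-temperature figures, and using the standard deviation of inlet temperatures as one possible measure of thermal balance:

```python
from statistics import pstdev

# Hypothetical operating plans; all numbers are illustrative only.
plans = {
    "pack":   {"throughput": 100, "computing_cost": 40, "cooling_cost": 50,
               "inlet_temps": [18, 22, 30]},
    "spread": {"throughput": 90, "computing_cost": 45, "cooling_cost": 30,
               "inlet_temps": [23, 24, 23]},
    "idle":   {"throughput": 50, "computing_cost": 20, "cooling_cost": 15,
               "inlet_temps": [19, 20, 22]},
}


def max_capacity(plans, budget):
    """Highest-throughput plan whose total cost fits the energy budget."""
    feasible = [n for n, p in plans.items()
                if p["computing_cost"] + p["cooling_cost"] <= budget]
    if not feasible:
        raise ValueError("no plan fits the budget")
    return max(feasible, key=lambda n: plans[n]["throughput"])


def min_cost(plans, kind):
    """Plan that minimizes one cost component, e.g. 'cooling_cost'."""
    return min(plans, key=lambda n: plans[n][kind])


def most_balanced(plans):
    """Plan with the smallest spread of inlet temperatures."""
    return min(plans, key=lambda n: pstdev(plans[n]["inlet_temps"]))


print(max_capacity(plans, budget=80))   # capacity under a cost constraint
print(min_cost(plans, "cooling_cost"))  # minimize a single cost
print(most_balanced(plans))             # thermal balancing
```

With these made-up numbers, the capacity and balancing goals select the same plan, while minimizing cooling cost alone selects a different one.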