Thermal Management: Technologies & Design Techniques. Yan Lin, Philip Lee, Jinjun Xiong. ---- Part of the slides courtesy of the original authors.
References • [1] W. Liao, F. Li, and L. He, "Microarchitecture Level Power and Thermal Simulation Considering Temperature Dependent Leakage Model," ISLPED, 2003 • [2] D. Brooks and M. Martonosi, "Dynamic Thermal Management for High-Performance Microprocessors," HPCA, 2001 • [3] F. Bellosa, S. Kellner, M. Waitz, and A. Weissel, "Event-Driven Energy Accounting for Dynamic Thermal Management," COLP, 2003 • [4] H. Zeng, C. Ellis, A. Lebeck, and A. Vahdat, "ECOSystem: Managing Energy as a First-Class Operating System Resource," ASPLOS, 2002
Outline [1] • Introduction • Leakage power modeling with temperature scaling • Coupled power and thermal simulation • Sub-Conclusions
Introduction • Leakage power is about 40% of total power for Intel Pentium IV processors at 3GHz [A. Grove, IEDM 2002] • Leakage power increases exponentially with temperature • Coupled power and thermal simulation is therefore needed for accurate power and thermal modeling
Circuit and Power States • Active circuit and power state • Pa: full power dissipation without any throttling • Pa = Pd + Ps • Pd: dynamic power • Clock gating and standby state • Ps: leakage power with clock gating • Power gating and inactive state • Pi: reduced leakage power with power gating
Leakage Power Reduction by Power Gating • MTCMOS for logic circuits • Near 100% leakage power reduction in sleep mode • No data retention • VRC (Virtual power/ground Rails Clamp) for memory units • Less power reduction but with data retention [Figure: low-Vt logic blocks connected to a virtual GND rail through a sleep transistor]
Leakage Power Model for Logic • Leakage power: • Iavg is the averaged leakage current over circuit blocks considering logic states, transistor stacking, and transistor size • Its maximum and minimum values are stable when the number of circuit blocks is large enough (> 20)
Leakage Power Model for Memory Units • Memory units modeled by SRAM array: • Plogic: leakage power for logic such as wordline drivers, write circuits, and precharge transistors • Pcircuit: leakage power for SRAM cells
Temperature Scaling • Iavg in logic circuits: • Memory units: • α, β, γ, and δ are coefficients • T is the absolute temperature • Coefficients are obtained by curve fitting, with less than 6% error relative to SPICE
Temperature Calculation • Stable on-chip temperature • T: on-chip temperature • Ta: ambient temperature • Rt: thermal resistance (per unit area) • Transient temperature • Suppose the average power within (t1, t2) is Pavg • If Ta + Rt * Pavg > Tt1, the chip heats up with time constant τheat; otherwise it cools down with time constant τcool • τheat/cool: heating and cooling time constants
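A minimal sketch of the transient update, assuming a first-order exponential approach toward the steady-state value Ta + Rt * Pavg. The exponential form, the function names, and the parameter names are assumptions made for illustration; they are not taken from [1].

```python
import math

def steady_temperature(t_ambient, r_thermal, power):
    """Steady-state temperature: ambient plus thermal resistance times power."""
    return t_ambient + r_thermal * power

def transient_temperature(t_start, t_ambient, r_thermal, p_avg, dt,
                          tau_heat, tau_cool):
    """Temperature after dt seconds at average power p_avg.

    Assumed model: exponential relaxation toward the steady-state target,
    using tau_heat when the target is above t_start and tau_cool otherwise.
    """
    target = steady_temperature(t_ambient, r_thermal, p_avg)
    tau = tau_heat if target > t_start else tau_cool
    return target + (t_start - target) * math.exp(-dt / tau)
```

With dt = 0 the function returns the starting temperature, and for dt much larger than the time constant it returns the steady-state value.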
Temperature Calculation Modes • Universal mode • Assume the whole chip has a uniform temperature • Provide lower bound of the maximum on-chip temperature • Individual mode • Divide the whole system into components • Calculate a temperature for each individual component • Assume no horizontal heat transfer among components • Provide upper bound of maximum temperature and maximum temperature gap
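The two modes can be compared with a small sketch. It assumes each component is described by a (power, area) pair and that Rt is a per-unit-area thermal resistance applied to power density; the component data layout and function names are hypothetical.

```python
def universal_temperature(components, t_ambient, r_thermal):
    """Universal mode: one temperature from the chip-wide power density."""
    total_power = sum(power for power, _ in components.values())
    total_area = sum(area for _, area in components.values())
    return t_ambient + r_thermal * total_power / total_area

def individual_temperatures(components, t_ambient, r_thermal):
    """Individual mode: one temperature per component, no lateral heat flow."""
    return {
        name: t_ambient + r_thermal * power / area
        for name, (power, area) in components.items()
    }
```

Because the highest component power density is never below the chip-wide average, the hottest individual-mode temperature is never below the universal-mode temperature, which matches the lower-bound and upper-bound roles described above.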
Temperature Dependent Energy • Total leakage energy changes by a factor of 2.5X when temperature changes from 90°C to 130°C • Any study of leakage energy is inaccurate without considering thermal issues • At 2GHz with individual mode, clock gating reduces dynamic energy by up to 69.29% and reduces leakage energy by up to 48.06% • due to reduced temperature • Case I: 1GHz without throttling • Case II: 2GHz without throttling • Case III: 2GHz with clock gating [Figure: normalized total energy, split into dynamic and leakage energy, for Cases I-III in individual (ind) and universal (uni) modes at 90°C, 110°C, and 130°C]
Sub-conclusions • Coupled power and thermal simulation is necessary • growing significance of leakage • Leakage is an exponential function of temperature • The first cycle-accurate coupled power and thermal simulator is developed • Power and thermal management will be investigated • With inter-dependence between power and temperature
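The inter-dependence between power and temperature can be illustrated with a fixed-point iteration. This is only a sketch, not the simulator of [1]: the exponential leakage form P_leak = p_leak_ref * exp(k * (T - t_ref)), the steady-state temperature relation, and all parameter names are assumptions.

```python
import math

def leakage_power(temp, p_leak_ref, t_ref, k):
    """Assumed exponential leakage model, referenced to temperature t_ref."""
    return p_leak_ref * math.exp(k * (temp - t_ref))

def coupled_steady_state(p_dynamic, t_ambient, r_thermal,
                         p_leak_ref, t_ref, k, tol=1e-6, max_iter=1000):
    """Iterate power -> temperature -> leakage until the temperature settles.

    Returns (temperature, total_power). Raises if the loop does not converge,
    which can indicate thermal runaway for the chosen parameters.
    """
    temp = t_ambient
    for _ in range(max_iter):
        total = p_dynamic + leakage_power(temp, p_leak_ref, t_ref, k)
        new_temp = t_ambient + r_thermal * total
        if abs(new_temp - temp) < tol:
            return new_temp, total
        temp = new_temp
    raise RuntimeError("no convergence: possible thermal runaway")
```

Evaluating leakage at a fixed temperature instead of iterating gives a lower or higher total depending on the guess, which is why the coupling matters.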
Outline [2] • Introduction • Mechanisms for Dynamic Thermal Management • Simulation Results • Conclusions
Introduction • Power dissipation becomes critical with increasing clock rate and transistor count • Thermal and power-delivery issues become especially critical for high-performance microprocessors • Dynamic Thermal Management is needed for high-performance processors
Mechanisms for DTM • Initiation Delay, e.g. operating system interrupt and handler • Response Delay, e.g. voltage and frequency scaling • Policy Delay: number of cycles before checking temperature after turning on DTM • Shutoff Delay [Figure: DTM timeline with the events Trigger Reached, Turn Response On, Response On, Check Temp, and Turn Response Off, and the intervals Initiation Delay, Response Delay, Policy Delay, and Shutoff Delay]
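The delays above can be sketched as a small state machine stepped over a temperature trace. This is a simplification made for illustration: the state names are hypothetical, the delays are counted in trace samples, and the trace is not changed by the response, unlike in a real system.

```python
def dtm_timeline(temps, trigger, initiation_delay, response_delay,
                 policy_delay, shutoff_delay):
    """Return the DTM state at each sample of the temperature trace."""
    state, wait = "idle", 0
    timeline = []
    for temp in temps:
        if state == "idle":
            if temp >= trigger:
                # Trigger reached: the response takes effect only after the
                # initiation and response delays have elapsed.
                state, wait = "starting", initiation_delay + response_delay
        elif state == "starting":
            wait -= 1
            if wait <= 0:
                state, wait = "active", policy_delay
        elif state == "active":
            wait -= 1
            if wait <= 0:
                if temp >= trigger:
                    wait = policy_delay  # still hot: keep the response on
                else:
                    state, wait = "stopping", shutoff_delay
        elif state == "stopping":
            wait -= 1
            if wait <= 0:
                state = "idle"
        timeline.append(state)
    return timeline
```

A longer policy delay keeps the response on for longer after the temperature has already dropped, which is one way the delays translate into performance overhead.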
Initiation Mechanisms: hardware support for initiating responses • Trigger Mechanisms: temperature sensors for thermal feedback, on-chip activity counters, dynamic profiling analysis, compile-time trigger requirements • Response Mechanisms: microarchitecture techniques, frequency/voltage scaling techniques
Power • Emergency setting: 25W • Emergencies are removed with DTM for all benchmarks except Fppp • (No average power with DTM is given in the paper)
Performance Degradation • Performance loss at various trigger levels [Figure: one panel for frequency/voltage scaling techniques and one for microarchitecture techniques]
Sub-Conclusions • DTM allows arbitrary tradeoffs between performance and power savings • Designer can focus on average power • Trigger delay is a key factor in performance overhead
Introduction [3] • Two major design alternatives to deal with power dissipation: • Cooling technology designed to handle maximum power consumption • Heat removal designed for typical sustained power across realistic workloads. • Most dynamic thermal management (DTM) techniques do not account for application-specific techniques
Introduction • An event-driven energy estimation model • Uses event-monitoring counters to estimate actual power consumption • Identifies which processes are using the power of the system • Allows the OS to treat energy as a resource • A CPU scheduler limits the execution time slices of “hot” processes
Event-Driven DTM • Uses performance counters available in modern processors to determine whether a process is “hot” • Estimating power consumption from counters is faster than actually measuring it • Experiments performed on Pentium 4 • Limitation: accounts for thermal management of only the CPU
From Events to Energy • Energy estimation done by correlating a processor-internal event to an amount of energy
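One simple way to correlate events with energy is a weighted sum of counter values. The sketch below assumes a fixed energy cost per event; the event names and the per-event weights are hypothetical and are not the calibrated values from [3].

```python
def estimate_energy(event_counts, energy_per_event):
    """Estimate energy in joules as the sum of count * energy-per-event."""
    return sum(
        count * energy_per_event[name]
        for name, count in event_counts.items()
    )

# Hypothetical counter readings for one time slice and hypothetical weights.
counts = {"uops_retired": 1000, "l2_miss": 10}
weights = {"uops_retired": 2.0e-9, "l2_miss": 50.0e-9}
slice_energy = estimate_energy(counts, weights)
```

Dividing the estimated energy by the length of the time slice gives an average power figure for the process that ran in that slice.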
Energy Containers • Energy abstracted as a first class resource • Allows OS to actively schedule/manage based on energy • An energy container is a specific type of resource container • Processes are throttled based on the limits of the energy containers
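A minimal sketch of throttling against container limits, assuming one container per process with an energy budget per accounting period. The class, its fields, and the `runnable` helper are hypothetical and do not reproduce the resource-container implementation in [3].

```python
from dataclasses import dataclass

@dataclass
class EnergyContainer:
    limit_joules: float        # energy budget for the current period
    used_joules: float = 0.0   # energy charged so far in this period

    def charge(self, joules):
        """Account for energy consumed by the owning process."""
        self.used_joules += joules

    def exhausted(self):
        return self.used_joules >= self.limit_joules

    def refill(self):
        """Start a new accounting period."""
        self.used_joules = 0.0

def runnable(processes):
    """Names of processes whose containers still have budget left."""
    return [name for name, box in processes.items() if not box.exhausted()]
```

A scheduler built on this idea would skip processes that are not in the runnable list until their containers are refilled.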
From Energy to Temperature • From energy equations and Newton’s Law of Cooling, the following formula is derived to estimate processor temperature: • The constants c1, c2, and T0 were determined experimentally using test programs • In all cases, estimated temperature > measured
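A temperature estimate consistent with Newton's Law of Cooling can be sketched with a simple Euler step. The assumed model is dT/dt = c1 * P - c2 * (T - T0), where P is the estimated power; this form and the function name are assumptions for illustration and may differ from the exact formula in [3].

```python
def estimate_temperature(temp, power, dt, c1, c2, t0):
    """Advance the temperature estimate by dt seconds.

    Heating is proportional to power (c1); cooling is proportional to the
    difference between the current temperature and t0 (c2).
    """
    return temp + (c1 * power - c2 * (temp - t0)) * dt
```

At constant power the estimate settles at t0 + c1 * power / c2, where heating and cooling balance.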
Evaluation • Compared estimated temperature with measured temperature on various benchmarks
Overhead • Event-monitoring counters are read on each timer interrupt (1000 times/sec), and context switches with energy container support cost 49% more • However, overall performance loss is < 1% • The cost of estimating temperature is negligible, since it takes ~4.85 µs and is executed only 1-10 times/sec
Introduction [4] • Energy as a first class resource • Explicit allocation of energy to competing applications • Control of battery resource • Goal of extending battery life by limiting average discharge rate • Uses currentcy model, an energy accounting framework • ECOSystem (a modified Linux)
The Currentcy Model • Model uses a common unit of currentcy for energy accounting and allocation • 1 unit of currentcy represents the right to consume a certain amount of energy within a fixed amount of time
The Currentcy Model • Allocation: currentcy is divided among competing tasks based on specified weights • Payback: each managed device has a cost that requires payment in currentcy • Allows OS to determine which tasks get access to the energy resource
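The allocation and payment steps can be sketched as follows. The proportional split by weight follows the description above; the function names, the task names, and the rule that a task is refused when its balance is too low are assumptions and do not reproduce the ECOSystem implementation in [4].

```python
def allocate_currentcy(total_units, weights):
    """Split the available currentcy among tasks in proportion to weight."""
    weight_sum = sum(weights.values())
    return {
        task: total_units * weight / weight_sum
        for task, weight in weights.items()
    }

def try_pay(balances, task, cost):
    """Charge a task for using a device; refuse if it cannot afford it."""
    if balances[task] < cost:
        return False
    balances[task] -= cost
    return True

# Hypothetical tasks with a 3:1 weighting over 1000 units of currentcy.
balances = allocate_currentcy(1000, {"browser": 3, "player": 1})
```

A refused payment is how the OS would withhold access to a device from a task that has used up its share for the current period.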
ECOSystem (Energy-Centric Operating System) • Currentcy model implemented in Linux OS • Models the power characteristics of 3 primary devices: • CPU • Disk • Wireless Network Interface
Energy Accounting • Accuracy of currentcy model vs. program counter sampling
Q & A • Thank you!