GREEN COMPUTING Power Consumption Basics in ICT Products Maziar Goudarzi
Outline • Metrics • Energy consumption in ICT products • Some common energy optimization techniques Acknowledgements: Some slides/parts from http://www.ida.liu.se/~TDDD50/
Performance-related energy metrics • Energy-per-instruction (EPI) • Energy spent to execute an instruction • Used to compare micro-architectural traits • Sometimes used to model software consumption • Not all instructions consume the same energy • Application energy consumption • Power vs. time
Comparing CPU energies • Example: Same program, • AMD CPU, 2GHz, 150W, 10s • Intel CPU, 2.5GHz, 200W, 8s Which one is better? • Another (perhaps better) example • Same program • Atom processor, 1.5GHz, 10W, 20s • Core i7 processor, 2GHz, 55W, 5s Which one is better?
Performance-related energy metrics • Energy-delay product (EDP) • Encourages low consumption and fast runtime • If energy or delay increases, EDP increases • Energy = Watts × runtime • Delay = runtime • $\mathrm{EDP} = \mathrm{Energy} \times \mathrm{Delay} = \mathrm{Watts} \times \mathrm{runtime}^2$
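The two CPU comparisons posed earlier can be worked through with these definitions. The sketch below assumes that the quoted wattage is the average power over the whole run; the `Run` class and its field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    name: str
    power_w: float    # average power draw in watts (assumed constant over the run)
    runtime_s: float  # execution time in seconds

    @property
    def energy_j(self) -> float:
        # Energy = Watts * runtime
        return self.power_w * self.runtime_s

    @property
    def edp_js(self) -> float:
        # EDP = Energy * Delay = Watts * runtime^2
        return self.energy_j * self.runtime_s


runs = [
    Run("AMD 2 GHz", 150, 10),
    Run("Intel 2.5 GHz", 200, 8),
    Run("Atom 1.5 GHz", 10, 20),
    Run("Core i7 2 GHz", 55, 5),
]

for r in runs:
    print(f"{r.name:14s} energy = {r.energy_j:6.0f} J   EDP = {r.edp_js:7.0f} J*s")
```

Under these assumptions the AMD run uses less energy than the Intel run (1500 J vs. 1600 J) but has the worse EDP (15000 J·s vs. 12800 J·s). Likewise the Atom uses less energy than the Core i7 (200 J vs. 275 J) but has the worse EDP (4000 J·s vs. 1375 J·s). Which one is "better" therefore depends on the metric chosen.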
Outline • Metrics • Energy consumption in ICT products • Some common energy optimization techniques
Power Consumption Fundamentals • Most widely used technology today • CMOS (Complementary Metal-Oxide-Semiconductor) technology • Technology name • Minimum feature size: 65nm, 45nm, … • Latest technology?
Power Consumption Fundamentals • Elements of power consumption • Dynamic power • Dissipated when charging/discharging capacitors • Inevitable! • Static power • Leakage • Total waste! • Was negligible until recently • Increased with technology scaling (<180nm) • 20 to 40% in today's processors • AMD Opteron X2: 300mm wafer, 117 chips, 90nm technology • Opteron X4: 45nm technology
CMOS Leakage • A transistor is not a perfect digital switch! • Subthreshold leakage • Gate leakage (addressed with high-k dielectrics) • Junction leakage
Subthreshold Leakage • Subthreshold leakage depends on
Outline • Metrics • Energy consumption in ICT products • Some common energy optimization techniques • Static power reduction • Dynamic power reduction
Leakage reduction techniques • Subthreshold leakage depends on • Architectural techniques to reduce leakage • Stacking effect and gated Vdd • Drowsy effect • Threshold voltage manipulation
Stacking effect and gated Vdd • Transistors connected in series, source to drain • Reduces the Vds of each transistor • Popular stacking technique: gated Vdd • Sleep transistor gates the ground (disconnects power supply)
Gated Vdd for SRAM • Dynamically Resized Instruction Cache • Cache decay • Disable individual lines • Managed with counters to estimate dead lines • Disabled lines lose their state • Expensive management • Reference: Stefanos Kaxiras, Zhigang Hu, Margaret Martonosi, "Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power", ISCA, 2001.
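The counter-based management of cache decay can be illustrated with a toy model. This is a minimal sketch, not the mechanism from the cited paper: it assumes a direct-mapped cache, one idle counter per line, and a single decay interval. The class and method names are hypothetical.

```python
class DecayCache:
    """Toy direct-mapped cache that gates off lines left idle for too long."""

    def __init__(self, num_lines: int, decay_ticks: int):
        self.num_lines = num_lines
        self.decay_ticks = decay_ticks       # idle ticks before a line is presumed dead
        self.tags = [None] * num_lines       # None: line is powered off / holds no data
        self.idle = [0] * num_lines          # per-line idle counters

    def tick(self) -> None:
        """Advance the idle counters and switch off lines that reached the interval."""
        for i in range(self.num_lines):
            if self.tags[i] is None:
                continue
            self.idle[i] += 1
            if self.idle[i] >= self.decay_ticks:
                self.tags[i] = None          # gated Vdd: the stored state is lost

    def access(self, address: int) -> bool:
        """Return True on a hit. Any access powers the line and resets its counter."""
        index = address % self.num_lines
        tag = address // self.num_lines
        hit = self.tags[index] == tag
        self.tags[index] = tag               # fill on a miss, no-op on a hit
        self.idle[index] = 0
        return hit

    def powered_lines(self) -> int:
        """Number of lines currently drawing leakage power."""
        return sum(t is not None for t in self.tags)
```

A line that is switched off too early causes an extra miss when it is accessed again, which is the cost traded against the leakage saved while the line was off.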
Drowsy effect • Voltage scaling of idle memory cells • Two levels of supply voltage (Vdd and VddLow) • Transistors leak much less than with full Vdd • No loss of memory state • High-level policies for drowsy caches • No need for complex management mechanisms • Reading delay (cell voltage must be scaled back to Vdd) • Worst case is a few cycles of delay • Examples • Simple: whole cache periodically put in drowsy mode • Petit et al.: Simple with heuristics, such as not putting the Most Recently Used (MRU) line into drowsy mode
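The two example policies can be compared in a small simulation. This is only a sketch under simplifying assumptions: one access per cycle, a fixed wake-up penalty, and an MRU heuristic loosely modelled on the description above rather than on the actual design of Petit et al. All names are hypothetical.

```python
class DrowsyCache:
    """Toy model: every `window` cycles all lines are put into drowsy mode."""

    def __init__(self, num_lines, window, wake_penalty=1, keep_mru_awake=False):
        self.num_lines = num_lines
        self.window = window                  # cycles between drowsy sweeps
        self.wake_penalty = wake_penalty      # extra cycles to restore full Vdd
        self.keep_mru_awake = keep_mru_awake  # heuristic: spare the MRU line
        self.drowsy = [False] * num_lines
        self.mru = None
        self.cycle = 0
        self.extra_cycles = 0                 # total wake-up delay paid so far

    def access(self, line: int) -> None:
        self.cycle += 1
        if self.drowsy[line]:
            # State is preserved, but the line must be woken before it is read.
            self.extra_cycles += self.wake_penalty
            self.drowsy[line] = False
        self.mru = line
        if self.cycle % self.window == 0:
            self._sweep()

    def _sweep(self) -> None:
        for i in range(self.num_lines):
            if self.keep_mru_awake and i == self.mru:
                continue
            self.drowsy[i] = True


simple = DrowsyCache(num_lines=4, window=2)
with_mru = DrowsyCache(num_lines=4, window=2, keep_mru_awake=True)
for line in [0, 0, 0, 0]:
    simple.access(line)
    with_mru.access(line)
print(simple.extra_cycles, with_mru.extra_cycles)
```

For a trace that keeps hitting the same line, the simple policy pays a wake-up penalty after each sweep, while the MRU variant avoids it at the cost of leaving one line at full voltage.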
Threshold voltage manipulation • The lower the VT, the higher the leakage • Technology scaling forces • Reducing Vdd to reduce power consumption and temperature • Reducing VT to reduce delay • Architectural-level techniques • Combination of high-VT and low-VT devices • High-VT: low leakage, long latency • Low-VT: high leakage, short latency • Gated Vdd using a high-VT device
Variable Threshold CMOS • Body Biasing • Body effect to change device Vth • Standby leakage reduction with maximum reverse bias • Triple well structure http://mtlweb.mit.edu/researchgroups/icsystems/pubs/tutorials/jkao_2002_iccad_I.pdf
Outline • Metrics • Energy consumption in ICT products • Some common energy optimization techniques • Static power reduction • Dynamic power reduction
Capacitance and switching activity • Capacitance and switching factor are intertwined: $P = C \cdot V^2 \cdot A \cdot f$ • Capacitance (C) • Fixed at design time • Dependent on • Number of transistors • Interconnections • Switching activity or factor (A) • Fraction between 0 and 1 • Fraction of the capacitance charged/discharged each CPU cycle
Capacitance • Description of capacitance (Burd and Brodersen): $C_L = C_W + C_{fixed}$ • $C_W$: Product of technology constant and device width • Optimized at circuit level • $C_{fixed}$: Capacitance of the interconnections • Optimized at architectural level • Reduction of wire length • Effective placement and routing (locality) • Break up large memory banks into smaller chunks
Excess switching activity • Avoidable charge/discharge activity • Types • Idle-unit • Idle-width • Idle-capacity • Parallel-speculative • Cacheable • Speculative
Idle-unit switching activity • Triggered by clock activity in unused units
Idle-width switching activity • Processor structures wider than needed • Example • Units support 64-bit operands • Most common operations use 16-bit operands • Solutions • Adapt the width of the machine to the operands • Pack multiple narrow-width operations
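Packing narrow-width operations can be illustrated in software: four 16-bit additions are carried out in a single pass over a 64-bit word. This is only an analogy for what a width-adaptive datapath would do in hardware, and the helper names are hypothetical.

```python
LANES = 4
LANE_BITS = 16
LANE_MASK = 0xFFFF
HIGH_BITS = 0x8000_8000_8000_8000  # top bit of every 16-bit lane
LOW_BITS = 0x7FFF_7FFF_7FFF_7FFF   # low 15 bits of every 16-bit lane


def pack(values):
    """Pack four 16-bit values into one 64-bit word (lane 0 in the low bits)."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(a, b):
    """Add four 16-bit lanes at once; each lane wraps around modulo 2**16."""
    # Adding only the low 15 bits of each lane cannot carry into the next lane.
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)
    # The top bit of each lane is then fixed up with a carry-less XOR.
    return low_sum ^ ((a ^ b) & HIGH_BITS)


result = unpack(packed_add(pack([1, 2, 3, 4]), pack([10, 20, 30, 40])))
print(result)
```

The masking keeps a carry out of one lane from corrupting its neighbour, which is the same isolation a hardware unit needs when it splits a wide adder into narrow ones.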
Idle-capacity switching activity • Over-provisioned processor resources • Resource partitioning or re-sizing • Grounds • Wire delay increases as feature size shrinks • Long wires imply • Unaffordable delay • High capacitance and consumption • Buffered wires reduce circuit delay
Complexity-adaptive structures • Complexity-adaptive structures (Albonesi) • Trade latency & consumption with capacity • Structures become faster as they become smaller • Solution • Partitions with tri-state buffers • When structures are reduced • Faster processing • Less energy consumed • Suitable for SRAM
Parallel-speculative switching activity • Parallel activity is spent for performance • Associative caches • All ways are accessed in parallel for speed • All but one of the ways fail to produce a hit • Solution: smart way-access approaches
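One possible "smart" approach is MRU way prediction: the most recently used way of a set is probed first, and the remaining ways are read only if that probe misses. The sketch below counts way reads for both schemes. It assumes LRU replacement and ignores the extra cycle a misprediction costs; it is not necessarily the scheme these slides have in mind, and the function name is hypothetical.

```python
def count_way_reads(trace, num_sets, num_ways, predict_mru):
    """Return how many cache ways are read while serving a trace of block addresses."""
    # Each set is a list of tags ordered from most to least recently used.
    sets = [[] for _ in range(num_sets)]
    reads = 0
    for block in trace:
        index, tag = block % num_sets, block // num_sets
        ways = sets[index]
        if predict_mru and ways and ways[0] == tag:
            reads += 1          # prediction correct: a single way is read
        else:
            reads += num_ways   # parallel access, or fallback after a misprediction
        # LRU bookkeeping: move (or insert) the tag at the MRU position.
        if tag in ways:
            ways.remove(tag)
        ways.insert(0, tag)
        if len(ways) > num_ways:
            ways.pop()          # evict the least recently used tag
    return reads


trace = [0, 0, 0, 4, 4]
parallel_reads = count_way_reads(trace, num_sets=4, num_ways=2, predict_mru=False)
predicted_reads = count_way_reads(trace, num_sets=4, num_ways=2, predict_mru=True)
print(parallel_reads, predicted_reads)
```

The fewer ways are read per access, the less bit-line and sense-amplifier capacitance is switched, so the saving grows with associativity and with the locality of the access stream.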
Cache Way Memorization Upon failure
Voltage-Frequency Scaling • Basic dynamic power equation: $P = C \cdot V^2 \cdot A \cdot f$ • Power falls with the square of the voltage • Maximum frequency is limited by voltage • Potential cubic reduction in power dissipation • When f and V are scaled down together • Performance decreases only linearly
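The cubic relationship can be checked numerically. The sketch below assumes that voltage can be lowered in proportion to frequency; the capacitance, voltage, activity, and frequency values are made-up illustration values, not measurements.

```python
def dynamic_power(c_farads, v_volts, activity, f_hertz):
    """Dynamic power P = C * V^2 * A * f, in watts."""
    return c_farads * v_volts ** 2 * activity * f_hertz


# Hypothetical baseline operating point.
C, V, A, F = 1e-9, 1.2, 0.5, 2e9
SCALE = 0.8  # run at 80% of the baseline frequency and voltage

base_power = dynamic_power(C, V, A, F)
scaled_power = dynamic_power(C, V * SCALE, A, F * SCALE)
power_ratio = scaled_power / base_power

# For a fixed number of cycles, runtime grows by 1/SCALE,
# so the energy ratio is the power ratio divided by SCALE.
energy_ratio = power_ratio / SCALE

print(f"power ratio:  {power_ratio:.3f}")
print(f"energy ratio: {energy_ratio:.3f}")
```

Scaling both V and f to 80% leaves about 51% of the dynamic power (0.8 cubed), while a fixed amount of work takes 25% longer and consumes about 64% of the dynamic energy (0.8 squared). Static power is not modelled here.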
Dynamic voltage/frequency scaling (DVFS) • Dynamic adjustment of voltage/frequency • Trade-off between power dissipation and performance • DVFS decision level • Hardware level • Exploits different timings of hardware components • Program level • Program behavior drives the decision • E.g. scale down when the program knows it has to wait • System level (OS) • Idleness of the system drives the decision • Voltage/frequency scaled to eliminate idle periods
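The system-level idea of scaling to eliminate idle periods can be sketched as picking the slowest frequency that still finishes the pending work on time. The frequency levels, cycle counts, and function name below are hypothetical; a real OS governor works from utilization estimates rather than known cycle counts.

```python
def pick_frequency(cycles, deadline_s, levels_hz):
    """Return the lowest available frequency that meets the deadline.

    Falls back to the highest frequency when no level is fast enough.
    """
    for f in sorted(levels_hz):
        if cycles / f <= deadline_s:
            return f
    return max(levels_hz)


# Hypothetical discrete operating points of a processor.
LEVELS_HZ = [0.6e9, 1.0e9, 1.5e9, 2.0e9]

# 1e9 cycles of work due in 1 s: 1.0 GHz just fills the period, leaving no idle time.
relaxed = pick_frequency(1e9, 1.0, LEVELS_HZ)
# The same work due in 0.6 s needs the 2.0 GHz level (1.5 GHz would take about 0.67 s).
tight = pick_frequency(1e9, 0.6, LEVELS_HZ)
print(relaxed, tight)
```

Running at the lowest sufficient frequency also permits the lowest voltage, which is where the power saving of DVFS comes from.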
Dynamic voltage/frequency scaling (DVFS) • Examples of commercial systems • Intel SpeedStep • AMD PowerNow! (for laptops) • AMD Cool'n'Quiet (for desktops and servers) • Decision taken at system level • Changes made through a specific CPU register • Reference: Enhanced Intel® SpeedStep® Technology for the Intel® Pentium® M Processor (White Paper) http://download.intel.com/design/network/papers/30117401.pdf
Extra exercise • Apply DVFS to the processor of your personal computer and measure its power consumption under different applications. • Report the power consumption of the processor separately from the power consumed by the other components. • What effect do you observe?
Coming Next • Power Aware Computing • Higher-level power reduction techniques