On the Limits of Leakage Power Reduction in Caches. Yan Meng, Tim Sherwood and Ryan Kastner, UC Santa Barbara. HPCA-2005.
Overview • Caches are good targets for tackling the leakage problem • Much work has been done in this field • Gated-Vdd • [Powell 01], [Agarwal 02], [Roy 02], [Hu 02], [Kaxiras 01], [Zhou 03], [Velusamy 02] • Multiple supply voltages • [Flautner 02], [Kim 02,04], [Mudge 04] • Others • [Hu 03], [Li 04], [Heo 02], [Hanson 01], [Li 03], [Bai 05], [Skadron 04], [Zhang 02], [Azizi et al. 03]
Research Question and Finding • What is the best leakage power saving we could hope to achieve with existing techniques? • Far more potential left for further reducing leakage power in caches
Outline • Motivation • Definitions • Optimal approach • The generalized model • Experimental results • Conclusions
Motivation • Why study the leakage problem? • Leakage power: the dominant source of power consumption as technology scales down below 100nm Fig: Projected leakage power consumption as a fraction of total power consumption, according to the International Technology Roadmap for Semiconductors
Motivation • Why tackle the leakage problem through caches? • Caches: huge chip area (50% in 2005 [ITRS]) • Major source of leakage power consumption Fig: Alpha 21364 microprocessor die photo [http://www.oracle.com/technology/products/rdb/pdf/2002_tech_forums/rdbtf_2002_opt_on_alpha_mdr.pdf]
Motivation • How to tackle the problem with existing techniques? • Keep frequently accessed cache lines active to ensure high performance • Turn off cache lines that are not used for a long time • Use low supply voltage to save power for the rest • What’s the best that the existing circuit and architecture techniques could achieve? How much room is left for further research?
Definitions – Cache Interval • Time between two successive accesses to the same cache line Fig: timeline showing the interval |Ii| between access(i) and access(i+1)
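As a minimal sketch of the interval definition, the following assumes a trace given as (cycle, line_id) pairs sorted by cycle. The trace format and the function name are hypothetical and are not taken from the authors' tool.

```python
from collections import defaultdict


def cache_intervals(trace):
    """Map each cache line to the list of its interval lengths.

    `trace` is an iterable of (cycle, line_id) pairs sorted by cycle
    (a hypothetical format chosen for this illustration).
    """
    last_access = {}               # line_id -> cycle of its most recent access
    intervals = defaultdict(list)  # line_id -> lengths of completed intervals
    for cycle, line in trace:
        if line in last_access:
            # An interval closes when the same line is accessed again.
            intervals[line].append(cycle - last_access[line])
        last_access[line] = cycle
    return dict(intervals)


if __name__ == "__main__":
    demo = [(0, "A"), (5, "B"), (12, "A"), (20, "B"), (21, "A")]
    print(cache_intervals(demo))
```

A line that is accessed only once closes no interval, so it does not appear in the result.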
Definitions – Operating Modes • Active mode • Power on the whole cache line • No power saving • Sleep mode [Roy01, Hu01] • Use sleep ("turn off") transistors • Data is lost • Data must be refetched, with high overhead • Drowsy mode [Flautner02, Mudge04] • Use a low supply voltage to save power while the line is not needed • Preserve data for fast reaccess • On access, wake up to the high voltage and return data
Choosing Operating Modes • Active mode • Sleep mode • Drowsy mode Fig: a cache interval |Ii| to which one of the three modes is applied
Optimal Approach • Differences • Studying optimality • Combining all three modes to achieve the maximal leakage power saving • Optimal policy • Oracle knowledge of future address trace • Applying the appropriate operating mode on each cache interval • Obtaining optimal leakage power saving • Formal proof of the optimality
Inflection Points • Which mode to apply on each interval? • Active-drowsy inflection point a • The least amount of time drowsy mode needs to save energy • Sleep-drowsy inflection point b • The time where sleep and drowsy modes consume the same amount of energy
Selecting Operating Modes with Inflection Points • 0 < |I| ≤ a: active interval, use active mode • a < |I| ≤ b: drowsy interval, use drowsy mode • |I| > b: sleep interval, use sleep mode Fig: decision flowchart on the interval length |I|, leading to optimality
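The selection rule on this slide can be sketched as a small function. The function name and the string labels are illustrative assumptions; only the thresholds a and b and the three ranges come from the slide.

```python
def select_mode(interval_length, a, b):
    """Pick the operating mode for one cache interval.

    a: active-drowsy inflection point, b: sleep-drowsy inflection point,
    both in the same time unit as interval_length. Assumes 0 <= a <= b.
    """
    if interval_length <= 0:
        raise ValueError("interval length must be positive")
    if a > b:
        raise ValueError("expected a <= b")
    if interval_length <= a:
        return "active"   # too short for drowsy mode to pay off
    if interval_length <= b:
        return "drowsy"   # long enough for drowsy, too short for sleep
    return "sleep"        # long enough that losing the data is worth it


if __name__ == "__main__":
    for length in (2, 50, 5000):
        print(length, select_mode(length, a=4, b=200))
```

The values a=4 and b=200 in the usage lines are placeholders, not measured inflection points.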
Calculating Inflection Points • Active-drowsy inflection point a • Sleep-drowsy inflection point b
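The formulas from this slide did not survive in the transcript. As a rough illustration only, the sketch below assumes a simplified linear energy model in which each mode costs a one-time transition energy plus a constant leakage power over the interval. This model, the function names, and the numbers are assumptions for illustration and are not the authors' equations.

```python
def active_drowsy_point(p_active, p_drowsy, e_drowsy):
    """Interval length at which drowsy mode starts to save energy.

    Solves p_active * t = e_drowsy + p_drowsy * t for t
    (assumed linear model, not the paper's exact formula).
    """
    if p_active <= p_drowsy:
        raise ValueError("drowsy mode must leak less than active mode")
    return e_drowsy / (p_active - p_drowsy)


def sleep_drowsy_point(p_drowsy, p_sleep, e_drowsy, e_sleep):
    """Interval length at which sleep and drowsy modes cost the same energy.

    Solves e_drowsy + p_drowsy * t = e_sleep + p_sleep * t for t.
    e_sleep is assumed to include the energy of refetching the lost data.
    """
    if p_drowsy <= p_sleep:
        raise ValueError("sleep mode must leak less than drowsy mode")
    return (e_sleep - e_drowsy) / (p_drowsy - p_sleep)


if __name__ == "__main__":
    # Made-up, unitless parameters; real values would come from a tool
    # such as HotLeakage.
    a = active_drowsy_point(p_active=1.0, p_drowsy=0.25, e_drowsy=3.0)
    b = sleep_drowsy_point(p_drowsy=0.25, p_sleep=0.0, e_drowsy=3.0, e_sleep=53.0)
    print(a, b)
```

Under this model, b shrinks when leakage power grows or when the refetch energy folded into e_sleep falls.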
Saving Leakage Power without Performance Degradation • Deriving the interval lengths with perfect knowledge of the future address trace • Fetching any needed data just before it is needed • Avoiding any performance impact • Taking into account the power cost of just-in-time refetch
Saving Leakage Power without Performance Degradation Fig: timeline in which a line is woken up or refetched just before it is needed
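The just-in-time idea can be sketched as follows, assuming an oracle that knows the cycle of the next access and fixed latencies for entering and leaving a low-power mode. The function and parameter names are hypothetical.

```python
def low_power_window(prev_access, next_access, enter_latency, wake_latency):
    """Return (start, wake_at) cycles for the low-power part of an interval.

    The line enters the low-power mode right after the previous access and
    starts waking (or refetching) early enough to be ready exactly at the
    next access, so the access itself sees no extra delay.
    Returns None when the interval is too short to fit both transitions.
    """
    start = prev_access + enter_latency
    wake_at = next_access - wake_latency
    if wake_at <= start:
        return None
    return start, wake_at


if __name__ == "__main__":
    print(low_power_window(100, 200, enter_latency=3, wake_latency=7))
    print(low_power_window(100, 105, enter_latency=3, wake_latency=7))
```

The energy spent on the early wake-up or refetch still has to be charged to the interval, which is the power cost the slide refers to.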
The Generalized Model • Parameterized model • Inputs • Wake-up latencies • Interval distribution • Leakage power of each state • Transition energy between states • Outputs • Optimal savings of OPT-Drowsy, OPT-Sleep, and OPT-Hybrid • Can be extended to accommodate future technologies and power saving modes • Publicly available • http://express.ece.ucsb.edu/software/leakage.html Fig: model diagram (surviving labels: EAS, P(Active), P(Sleep))
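A minimal sketch of how such a parameterized model could turn its inputs into the three outputs is shown below. It is not the released tool: it assumes a linear per-interval energy (one-time transition energy plus leakage power times length), omits wake-up latencies, and uses made-up parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    power: float     # leakage power per cycle while in this mode
    overhead: float  # transition energy charged once per interval


def interval_energy(mode, length):
    """Energy of spending one interval of `length` cycles in `mode`."""
    return mode.overhead + mode.power * length


def savings(intervals, modes, allowed):
    """Fraction of leakage energy saved versus keeping every line active.

    For each interval an oracle picks the cheapest mode among `allowed`.
    """
    baseline = sum(interval_energy(modes["active"], n) for n in intervals)
    best = sum(
        min(interval_energy(modes[name], n) for name in allowed)
        for n in intervals
    )
    return 1.0 - best / baseline


if __name__ == "__main__":
    # Hypothetical, unitless parameters for illustration only.
    modes = {
        "active": Mode(power=1.0, overhead=0.0),
        "drowsy": Mode(power=0.25, overhead=3.0),
        "sleep": Mode(power=0.0, overhead=53.0),
    }
    intervals = [2, 40, 1000]
    print("OPT-Drowsy", savings(intervals, modes, ("active", "drowsy")))
    print("OPT-Sleep ", savings(intervals, modes, ("active", "sleep")))
    print("OPT-Hybrid", savings(intervals, modes, ("active", "drowsy", "sleep")))
```

Because the hybrid policy chooses from a superset of the modes available to the other two, its savings can never be lower than theirs in this sketch.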
Methodology • Core: Compaq Alpha 21264 [Kessler 99] • Memory • 2-way L1 instruction and data caches, 64KB • Unified direct-mapped L2 cache, 2MB • LRU replacement policy • Tools • SimAlpha simulator • HotLeakage • Leakage power and dynamic cost • Parameters: taken from HotLeakage • Results averaged over all benchmark applications
Calculating Inflection Points • The sleep-drowsy point decreases as technology scales from 180nm to 70nm • This is because leakage power consumption increases while the dynamic power consumption caused by an induced miss decreases • Our approach can be parameterized and applied to many other memory technologies • 70nm, the most advanced technology considered, is used in the rest of our study
Exploring the Upper-bound • OPT-Drowsy: no performance penalty for waking up data • Sleep(10K): turning off cache lines after 10K cycles [Hu01] • OPT-Sleep(10K): turning off cache lines over intervals longer than 10K cycles • OPT-Hybrid: optimally combining the three modes without performance penalty Fig: results for the L1 data cache (legend: OPT-Drowsy, Sleep(10K), OPT-Sleep(10K), OPT-Hybrid)
Research Finding • Larger leakage saving can be achieved for data cache • Drowsy and sleep modes each achieve fairly high savings • Savings are complementary: potential in combining drowsy and sleep technologies
Conclusions • Why leakage? • Leakage: dominant source of power consumption as technology scales down below 100nm • Caches: primary targets for tackling the problem • Optimal approach and software • Calculating the maximal leakage savings • Quantifying how much room is left for improvement • Used to guide future power management policy research • Great potential in combining techniques • Optimally combining Active, Drowsy, and Sleep • The optimal approach reduces power dissipation • Instruction cache: by a factor of 5.3 • Data cache: by a factor of 2