On-demand solution to minimize I-cache leakage energy
Group members: Chenyu Lu and Tzyy-Juin Kao
Motivation
• High power dissipation causes thermal problems, such as higher packaging, power delivery, and cooling costs
• In 70 nm technology, leakage may constitute as much as 50% of total energy dissipation
• Use the super-drowsy leakage-saving technique
  • Lower the supply voltage to a level (0.25 V) near the threshold voltage (0.2 V)
  • Data is still retained but cannot be accessed
  • Waking up from the saving mode to the active mode takes a one-cycle penalty
• Use the on-demand wakeup policy on the I-cache
  • Only the cache lines currently in use need to be awake
  • Accurately predict the next cache line using the branch predictor
  • On most branch mispredictions, the extra wakeup stage overlaps with the misprediction recovery
Overview
• Super-drowsy cache line
  • A Schmitt trigger inverter controls the cache line's voltage in the leakage-saving mode
  • Replaces the multiple supply voltage sources of earlier drowsy designs
• Wakeup prediction policy
  • Enables on-demand wakeup
  • The branch predictor already identifies which line needs to be woken up
  • No additional wakeup-prediction structure is needed
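The wakeup prediction policy fits in a few lines of C: each cycle, the line holding the branch predictor's next fetch address is woken one cycle ahead of its use, so the one-cycle wakeup latency hides behind the current fetch. This is a minimal sketch assuming a direct-mapped I-cache; the names (predict_and_wake, line_index, NUM_LINES, LINE_SIZE) and sizes are illustrative, not taken from the paper or from sim-outorder.

    #include <stdint.h>

    #define LINE_SIZE 32    /* bytes per cache line (assumed)  */
    #define NUM_LINES 512   /* lines in the I-cache (assumed)  */

    static int line_awake[NUM_LINES];   /* 1 = active, 0 = drowsy */

    /* Map an address to its cache-line index. */
    static unsigned line_index(uint32_t addr)
    {
        return (addr / LINE_SIZE) % NUM_LINES;
    }

    /* Called once per cycle with the predicted next fetch address:
     * wake that line one cycle early so its wakeup overlaps with the
     * fetch currently in flight. */
    void predict_and_wake(uint32_t next_fetch_addr)
    {
        line_awake[line_index(next_fetch_addr)] = 1;
    }

On a misprediction the wrong line is woken, which costs some extra leakage but no correctness problem; the correct target then pays the one-cycle wakeup, which mostly overlaps with misprediction recovery as noted above.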
Methodology
• Leakage energy = drowsy_energy + active_energy + turn_on_energy
• Monitor active_lines and turn_on every cycle in sim-outorder
• Add a wake_bit to every block:
  • 0: in drowsy mode this cycle
  • 1: in active mode this cycle
  • 2: in active mode this cycle and the next cycle
  • 3: in drowsy mode this cycle, in active mode next cycle
• Update the wake_bit and count the active_lines every cycle using Update_wakeup()
• Change the wake_bit on every instruction fetch using fetch_line() (see the sketch after this list)
• Improved strategy: keep a line awake only while Interval × Active_Power < Interval × Drowsy_Power + Turn_On_Energy
• Speculate with a list of recently-accessed cache lines
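A minimal sketch of this bookkeeping in C, following the four wake_bit encodings listed above; NUM_LINES and the counter names are illustrative rather than taken from sim-outorder. The per-cycle counters would later be weighted by Drowsy_Power, Active_Power, and Turn_On_Energy to produce the three energy terms.

    #define NUM_LINES 512

    enum { DROWSY = 0, ACTIVE = 1, ACTIVE_NEXT = 2, WAKING = 3 };

    static int  wake_bit[NUM_LINES];         /* per-line state        */
    static long active_lines, drowsy_lines;  /* cycle-weighted counts */
    static long turn_on;                     /* wakeup events         */

    /* Called once per simulated cycle: charge each line to the proper
     * counter for this cycle, then advance its state by one cycle. */
    void Update_wakeup(void)
    {
        for (int i = 0; i < NUM_LINES; i++) {
            switch (wake_bit[i]) {
            case DROWSY:                /* drowsy now, stays drowsy   */
                drowsy_lines++;
                break;
            case WAKING:                /* drowsy now, active next    */
                drowsy_lines++;
                turn_on++;              /* one turn_on_energy charge  */
                wake_bit[i] = ACTIVE;
                break;
            case ACTIVE_NEXT:           /* active now and next cycle  */
                active_lines++;
                wake_bit[i] = ACTIVE;
                break;
            case ACTIVE:                /* active now, drowsy next    */
                active_lines++;
                wake_bit[i] = DROWSY;
                break;
            }
        }
    }

    /* Called on every instruction fetch: an already-active line stays
     * awake for the next cycle; a drowsy line begins its one-cycle
     * wakeup and becomes active next cycle. */
    void fetch_line(int idx)
    {
        if (wake_bit[idx] == ACTIVE || wake_bit[idx] == ACTIVE_NEXT)
            wake_bit[idx] = ACTIVE_NEXT;
        else
            wake_bit[idx] = WAKING;
    }

Rearranging the improved-strategy inequality gives the break-even point Interval < Turn_On_Energy / (Active_Power − Drowsy_Power): keeping a line awake pays off only if it will be fetched again within that many cycles, which is what the list of recently-accessed cache lines speculates on.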
Future Work
• One cycle of extra latency on a target-address misprediction (a 0.08% performance drop according to the paper)
• Apply the on-demand policy to the data cache
  • No prediction
  • Extra latency can be hidden by locality and out-of-order execution