Microarchitectural Techniques for Power Gating of Execution Units

Authors: Zhigang Hu, Alper Buyuktosunoglu, Viji Srinivasan, Victor Zyuban, Hans Jacobson, Pradip Bose (IBM T.J. Watson Research Center). In International Symposium on Low Power Electronics and Design (ISLPED), 2004, pp. 32-37. Presenter: Sai Raghunath T.

Presentation Transcript


  1. Microarchitectural Techniques for Power Gating of Execution Units. Authors: Zhigang Hu, Alper Buyuktosunoglu, Viji Srinivasan, Victor Zyuban, Hans Jacobson, Pradip Bose (IBM T.J. Watson Research Center). In International Symposium on Low Power Electronics and Design (ISLPED), 2004, pp. 32-37. Presenter: Sai Raghunath T.

  2. Sources of power dissipation: • Sub-threshold leakage • Gate leakage current. Circuit-level approaches for leakage power reduction: • Body bias control • Dual-threshold domino circuits • Input vector control • Power gating

  3. Architectural-level leakage power reduction in caches and buffers • Tri-stating the bitline drivers of SRAM arrays • Determining sleep-mode activation policies for integer functional units built from dual-Vt domino logic circuits • Using the compiler to detect long idle periods of individual functional units and enable power gating.

  4. Work done in the paper: • Exploiting workload phases and characteristics to dynamically power gate selected units within a pipeline off and on, using a time-based technique and a branch-prediction-guided technique • Specifications of the out-of-order-issue superscalar processor model, Turandot

  5. Fundamentals of power gating: • Power gating is achieved by adding a suitably sized header or footer device to a circuit. • The ‘Sleep’ signal is asserted when the control logic detects a sufficiently long idle period, and the macro is then turned off.

  6. Timing intervals:
     • T1 - T0 = T(idle detect)
     • T2 - T1 = T(idle delay)
     • T3 - T2 = T(breakeven)
     • T4 - T2 = T(full discharge)
     • T5 = detection of the next busy interval
     • T6 - T5 = T(busy delay)
     • T7 - T6 = T(wakeup)
     Sequence:
     (1) T0 -> T1 = leakage energy
     (2) T1 -> T2 = overhead energy + leakage energy (overhead energy is the energy required to generate the ‘Sleep’ signal)
     • Savings in leakage energy increase as the supply voltage decreases
     (3) T5 -> T6 = overhead energy
     (4) T6 -> T7 = leakage energy

  7. T(breakeven) is the point at which the aggregate leakage energy savings, E(avg saved), equal the energy overhead of switching the header/footer device on and off. Typically, the value of N(breakeven) is 10.
     • DIBL = drain-induced barrier lowering factor (typically 0.1)
     • WH = (total area of header device) / (total area of clock-gated macro)
     • α = switching factor
     • m = 0.1

  8. Power gating of execution units • Quantifying the power gating potential of an out-of-order superscalar processor model using different applications from the SPEC2K suite. Assumptions: • T(idle delay) = T(busy delay) = 0 → perfect predictor • T(idle) > T(overhead), where T(overhead) = T(wakeup) + T(breakeven)

  9. The following equations estimate the fraction of cycles during which the units can be power gated.
     Example: sequence of activity bits of some unit: 1111 00000 111111 0000 1111 000000 1111
     • T(overhead) = 3
     • Opp cycles = (5-3) + (4-3) + (6-3) = 6
     • Power gating potential = 6/33 = 18.18% ≈ 18%
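The arithmetic in the example above can be reproduced with a short script. This is only a sketch: the function name `power_gating_potential` and the rule that an idle run contributes `run - T(overhead)` cycles only when it is longer than T(overhead) are assumptions inferred from the worked example, not the equations from the original slide.

```python
def power_gating_potential(activity: str, t_overhead: int) -> tuple[int, int, float]:
    """Return (opportunity_cycles, total_cycles, potential) for an activity-bit trace.

    '1' marks a busy cycle and '0' an idle cycle; spaces are ignored.
    Assumption: each idle run longer than t_overhead contributes
    (run length - t_overhead) opportunity cycles, as in the slide's example.
    """
    bits = activity.replace(" ", "")
    opp = 0
    run = 0
    for b in bits + "1":  # trailing sentinel closes an idle run at the end of the trace
        if b == "0":
            run += 1
        else:
            if run > t_overhead:
                opp += run - t_overhead
            run = 0
    return opp, len(bits), opp / len(bits)


trace = "1111 00000 111111 0000 1111 000000 1111"
opp, total, potential = power_gating_potential(trace, t_overhead=3)
print(opp, total, f"{potential:.2%}")  # 6 33 18.18%
```

With T(overhead) = 3 the three idle runs of length 5, 4 and 6 contribute 2, 1 and 3 cycles, which matches the 6/33 figure on the slide.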

  10. Power gating potential averaged across SPEC2K FP applications for various values of T(overhead)

  11. Power gating potential averaged across SPEC2K integer applications for various values of T(overhead)

  12. Time-based power gating: • Assumptions: • T(idle delay) is folded into T(breakeven), i.e. T(breakeven) := T(breakeven) + T(idle delay) • T(busy delay) is folded into T(wakeup), i.e. T(wakeup) := T(wakeup) + T(busy delay) • One issue queue per execution unit • Logic used: • Observe the state of an execution unit and turn it off when a long streak of idle cycles is seen

  13. FSM: State Machine of an execution unit when power gating is engaged
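Since the state-machine figure itself did not survive in the transcript, the following is a minimal sketch of how a time-based controller of this kind might behave. The state names `ACTIVE`, `COMPENSATED` and `WAKEUP`, the function `simulate`, and the exact transition rules are assumptions for illustration; only the ‘Uncompensated’ state and the parameters T(idle detect), T(breakeven) and T(wakeup) are named in the slides.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"                # powered on, counting consecutive idle cycles
    UNCOMPENSATED = "uncompensated"  # gated, but not yet past T(breakeven)
    COMPENSATED = "compensated"      # gated long enough to have paid off the overhead
    WAKEUP = "wakeup"                # powering back on; the pending request stalls


def simulate(requests: str, t_idle_detect: int, t_breakeven: int, t_wakeup: int):
    """Return (sleep_cycles, stall_cycles) for a request trace ('1' = unit needed).

    Assumes t_wakeup >= 1. A request that arrives while the unit is gated
    is held until the wakeup period has elapsed.
    """
    state, idle, timer = State.ACTIVE, 0, 0
    sleep_cycles = stall_cycles = 0
    i = 0
    while i < len(requests):
        busy = requests[i] == "1"
        if state is State.ACTIVE:
            idle = 0 if busy else idle + 1
            if idle >= t_idle_detect:        # long idle streak seen: gate the unit
                state, timer = State.UNCOMPENSATED, 0
            i += 1
        elif state in (State.UNCOMPENSATED, State.COMPENSATED):
            if busy:                         # request must wait for wakeup
                state, timer = State.WAKEUP, 0
            else:
                sleep_cycles += 1
                timer += 1
                if state is State.UNCOMPENSATED and timer >= t_breakeven:
                    state = State.COMPENSATED
                i += 1
        else:                                # State.WAKEUP
            stall_cycles += 1
            timer += 1
            if timer >= t_wakeup:
                state, idle = State.ACTIVE, 0
    return sleep_cycles, stall_cycles


# 4 busy cycles, 20 idle cycles, 4 busy cycles
result = simulate("1111" + "0" * 20 + "1111", t_idle_detect=6, t_breakeven=9, t_wakeup=3)
print(result)  # (14, 3)
```

In this run the first 6 idle cycles are spent detecting the idle streak, the remaining 14 are spent gated, and the returning request stalls for the 3 wakeup cycles.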

  14. % of cycles in sleep mode for FPU with different T(idle detect) and T(breakeven). T(wakeup)= 3 cycles

  15. Avg IPC of the SPECFP2K suite for different T(idle detect) and T(wakeup) values. T(breakeven) = 9 cycles. IPC is normalized to the base case in which power gating is disabled. • Long idle periods coupled with smaller values of T(breakeven) and T(wakeup) help achieve large leakage reductions and mitigate the overall performance loss • T(idle detect) = 6-12 cycles gives the best balance between performance and power

  16. % of cycles in sleep mode for FXU with different T(idle detect) and T(breakeven). T(wakeup)= 3 cycles

  17. Avg IPC of the SPECINT2K suite for different T(idle detect) and T(wakeup) values. T(breakeven) = 9 cycles. IPC is normalized to the base case in which power gating is disabled.

  18. Branch prediction guided power gating: • The previous graphs show that the FXU typically has short idle periods. • So it is difficult to implement power gating efficiently in integer execution units. • Branch mispredictions are highly disruptive events in speculative out-of-order processors, which makes them a good opportunity for power gating. • On a branch misprediction, the pipeline is flushed and the correct instructions are fetched. • During this process, the execution units are idle.

  19. New branch prediction guided power gating technique: • As soon as a branch misprediction is detected, all idle FXUs are transferred to the ‘Uncompensated’ state → reduction in T(idle detect) → higher % of cycles in ‘sleep’ mode → smaller performance loss and better leakage reduction
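The effect of skipping T(idle detect) can be illustrated with a small calculation. This is a hedged sketch, not the paper's model: the function `gating_outcome`, the 12-cycle idle period, and the assumption that the idle period starts exactly when the misprediction is detected are all hypothetical and chosen only for illustration.

```python
def gating_outcome(idle_len: int, t_idle_detect: int, t_breakeven: int,
                   on_mispredict: bool) -> tuple[int, bool]:
    """Return (cycles_asleep, compensated) for a single idle period.

    Time-based gating spends t_idle_detect cycles confirming the idle streak.
    The branch-guided variant (assumed here) gates an idle unit immediately
    when the misprediction is detected, so the detection delay is zero.
    'compensated' means the unit slept longer than t_breakeven.
    """
    detect = 0 if on_mispredict else t_idle_detect
    asleep = max(0, idle_len - detect)
    return asleep, asleep > t_breakeven


# Hypothetical 12-cycle idle period following a mispredicted branch
time_based = gating_outcome(12, t_idle_detect=6, t_breakeven=9, on_mispredict=False)
branch_guided = gating_outcome(12, t_idle_detect=6, t_breakeven=9, on_mispredict=True)
print(time_based)     # (6, False)
print(branch_guided)  # (12, True)
```

Under these illustrative numbers, the time-based policy sleeps for only 6 cycles and never passes breakeven, while the branch-guided policy sleeps for all 12 cycles and does.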

  20. % of performance loss in sleep mode versus performance degradation techniques T(breakeven)=9 cycles; T(wakeup)= 3 cycles

  21. Conclusions and critique: • The time-based technique is efficient for FP execution units, which have relatively long idle times. • The branch prediction guided technique is efficient for integer execution units. • The paper does not discuss the advantages or disadvantages of power gating relative to other circuit-level approaches for leakage power reduction. • How efficient is power gating if the assumptions mentioned above are relaxed? • What is the power consumption of the macro that generates the ‘Sleep’ signal, and what is the ratio of its power consumption to the power savings?

  22. How is this paper relevant to the class? • State-of-the-art microprocessors face the problem of high leakage power due to technology scaling. • Leakage power is high in the execution units, which are among the most important blocks in the microprocessor. This paper gives good insight into techniques for reducing leakage power. • Also, various power gating techniques to reduce power dissipation in CMP and SMT architectures can be explored.

  23. Project: • Consider a small integer ALU, compare various circuit-level approaches with power gating, and suggest the better technique(s). The suggested idea may be an optimal mix of two or more circuit-level approaches.

  24. THANK YOU • Q&A?