Exploiting Program Hotspots and Code Sequentiality for Instruction Cache Leakage Management

J. S. Hu, A. Nadgir, N. Vijaykrishnan, M. J. Irwin, M. Kandemir. Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA.

Presentation Transcript


  1. Exploiting Program Hotspots and Code Sequentiality for Instruction Cache Leakage Management J. S. Hu, A. Nadgir, N. Vijaykrishnan, M. J. Irwin, M. Kandemir Computer Science and Engineering The Pennsylvania State University University Park, PA 16802, USA

  2. Outline • Motivation • Hotspot-based leakage management • Just-in-time activation • Experimental results • Conclusions

  3. Motivation • Leakage is projected to account for 70% of the cache power budget in 70 nm technology • Instruction cache performance is much more sensitive to leakage control mechanisms • Objective: exploit dynamic application characteristics for effective instruction cache leakage management • Two major factors shape instruction access behavior: hotspot execution and code sequentiality • Exploit program hotspots for further leakage savings • Exploit code sequentiality to reduce the performance penalty

  4. Hotspot-based Leakage Management • Manage cache leakage in an application-sensitive manner • Track both spatial and temporal locality of instruction cache accesses • Observations: • Program execution occurs in phases • Instructions in a phase do not need to be clustered • Program phases produce hotspots in the cache • Hotspot-based leakage management (HSLM) • Prevent hotspots from inadvertent turn-off • Detect phase changes for early turn-off

  5. Dynamic Hotspot Protection [Figure: I-cache lines under the original scheme versus the HSLM scheme; labels: PC, mask, Global Set]

  6. HSLM Detecting Phase Changes [Figure: original scheme versus HSLM scheme; labels: I-Cache, mask, Global Set, drowsy window; events: new hotspots detected, drowsy window expires]

  7. Hotspot-based Leakage Management • Track program hotspots through the branch target buffer (BTB) • Augment BTB entries to maintain the access history of each basic block • Add a voltage control mask (VCM) bit for each instruction cache line • A set mask bit indicates a hotspot cache line • HSLM performs two main functions: • Keep hotspot cache lines from being turned off by the Global Set counter • Track phase changes and allow early turn-off of cache lines not in a newly detected hotspot
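The two HSLM functions listed above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the circuit from the slides: the class `DrowsyICache`, its method names, and the way a phase change clears old mask bits are all hypothetical, and hotspot detection itself (the BTB counters) is abstracted into the `in_hotspot` argument.

```python
class DrowsyICache:
    """Toy model (hypothetical): one drowsy bit and one mask bit per cache line."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.drowsy = [False] * num_lines  # True means the line is in drowsy mode
        self.mask = [False] * num_lines    # True means the line belongs to a hotspot

    def access(self, line, in_hotspot):
        """Wake the accessed line; set its mask bit if it was flagged as hotspot code."""
        self.drowsy[line] = False
        if in_hotspot:
            self.mask[line] = True

    def global_set(self):
        """Periodic Global Set: every line without a mask bit goes drowsy."""
        for i in range(self.num_lines):
            if not self.mask[i]:
                self.drowsy[i] = True

    def new_hotspot(self, hot_lines):
        """Assumed phase-change handling: re-mask for the new hotspot and
        turn off all lines outside it without waiting for the drowsy window."""
        hot = set(hot_lines)
        for i in range(self.num_lines):
            self.mask[i] = i in hot
            if i not in hot:
                self.drowsy[i] = True


if __name__ == "__main__":
    cache = DrowsyICache(8)
    cache.access(2, in_hotspot=True)   # hotspot line, protected from Global Set
    cache.access(5, in_hotspot=False)  # ordinary line
    cache.global_set()
    print(cache.drowsy)
    cache.new_hotspot([6, 7])          # phase change: lines 0-5 are turned off early
    print(cache.drowsy, cache.mask)
```

In this model the protected line stays active across `global_set()`, while a phase change immediately retires the old hotspot, which mirrors the two functions named on the slide.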

  8. Augmented Cache Microarchitecture [Figure: per-line voltage control driven by Global Set and VCM; Set: drowsy, Reset: active; power line switches between 0.3 V (drowsy) and 1 V (active); row decoder, word line drivers, and SRAMs; a word line gate prevents access to drowsy lines]

  9. BTB Microarchitecture for HSLM [Figure: Branch Target Buffer indexed by PC, with entry fields vbit, tgt_cnt, fth_cnt, and target_addr; signals: branch taken, BTB hit, Global Mask Bit, Global Reset; leakage control circuitry drives the VCM bits and word lines of the I-cache]

  10. Just-In-Time Activation (JITA) • Sequentiality is the norm in the execution of many applications (e.g., more than 80% of static code is sequential and about 50% of branches are not taken, giving roughly 90% sequential flow) • Take advantage of the sequential access pattern to do just-in-time activation • The status of the next cache line is checked, and the line is pre-activated if possible • In sequential access, the performance penalty due to activation of drowsy lines is eliminated • JITA may fail when: • the target of a taken branch is beyond the next cache line • or the next instruction is beyond the current bank
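The effect of pre-activation on wake-up stalls can be sketched with a small trace-driven count. This is an illustrative model under assumptions: the function name `count_wakeup_stalls` and the trace are hypothetical, the whole array is treated as a single bank, all lines start drowsy, and lines never return to drowsy mode during the trace.

```python
def count_wakeup_stalls(trace, num_lines, preactivate):
    """Count accesses that hit a drowsy line and therefore pay a wake-up penalty.

    trace: sequence of cache line indices in access order.
    preactivate: if True, each access also wakes the next sequential line (JITA).
    """
    drowsy = [True] * num_lines  # assumption: every line starts in drowsy mode
    stalls = 0
    for line in trace:
        if drowsy[line]:
            stalls += 1          # the access has to wait for the line to wake up
            drowsy[line] = False
        # JITA: wake the next line ahead of time, unless it lies beyond the bank.
        if preactivate and line + 1 < num_lines:
            drowsy[line + 1] = False
    return stalls


if __name__ == "__main__":
    # Sequential run over lines 0-3, then a taken branch to line 6.
    trace = [0, 1, 2, 3, 6, 7]
    print(count_wakeup_stalls(trace, 8, preactivate=False))
    print(count_wakeup_stalls(trace, 8, preactivate=True))
```

With pre-activation, only the first access and the target of the taken branch stall in this trace; the branch target is the first JITA failure case named on the slide.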

  11. Just-In-Time Activation (JITA) [Figure: timelines for the original scheme (activate, then access, for each PC) and for JITA (the next line is pre-activated, PA, while the current line is accessed)]

  12. Augmented Cache Microarchitecture [Figure: the voltage control circuit extended with a Preactivate signal alongside Global Set and VCM; Set: drowsy, Reset: active; power line switches between 0.3 V (drowsy) and 1 V (active); row decoder, word line drivers, and SRAMs; a word line gate prevents access to drowsy lines]

  13. Leakage Energy Breakdown (overhead)

  14. Leakage Energy Reduction • DHS-Bank-PA achieved a leakage energy reduction of 63% over Base, 49% over Drowsy-Bank, and 29% over Loop (Compiler)

  15. Energy Delay Product (EDP) • DHS-Bank-PA achieved the smallest EDP, a reduction of 63% over Base, 48% over Drowsy-Bank, and 38% over Loop
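For readers unfamiliar with the metric, the energy-delay product is energy multiplied by execution time, so a scheme only improves EDP if its energy savings outweigh any slowdown. A minimal sketch follows; the helper names and the input numbers are made up for illustration and are not taken from the presentation's results.

```python
def edp(energy, delay):
    """Energy-delay product: energy consumed times execution time."""
    return energy * delay


def reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return 1.0 - improved / baseline


if __name__ == "__main__":
    # Hypothetical normalized values: a scheme that cuts leakage energy
    # to 40% of the baseline while running 5% slower.
    base = edp(energy=1.0, delay=1.0)
    scheme = edp(energy=0.4, delay=1.05)
    print(f"EDP reduction: {reduction(base, scheme):.0%}")
```

The example shows why a wake-up penalty matters: the 60% energy saving shrinks to a smaller EDP gain once the slowdown is factored in.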

  16. Effectiveness of JITA • By applying JITA, DHS-PA removes 87.8% of the performance degradation of DHS

  17. Conclusions • HSLM and JITA can be implemented with minimal hardware overhead • The energy model accounts for the energy overhead of the introduced hardware and of other processor components • Applying both HSLM and JITA helps to: • Reduce leakage energy by 63% over Base, 49% over Drowsy-Bank, and 29% over Loop • Reduce the (leakage) energy-delay product by 63% over Base, 48% over Drowsy-Bank, and 38% over Loop • The evaluation shows that application characteristics can be exploited for effective leakage control

  18. Thank You!