Energy Efficient Register Renaming

Energy Efficient Register Renaming. Gurhan Kucuk, Oguz Ergin, Dmitry Ponomarev, Kanad Ghose Department of Computer Science State University of New York Binghamton, NY 13902-6000 http://www.cs.binghamton.edu/~lowpower.

Presentation Transcript


  1. PATMOS 2003: Energy Efficient Register Renaming
     Gurhan Kucuk, Oguz Ergin, Dmitry Ponomarev, Kanad Ghose
     Department of Computer Science, State University of New York, Binghamton, NY 13902-6000
     http://www.cs.binghamton.edu/~lowpower
     13th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS’03), September 11th 2003
     *Supported in part by DARPA through the PAC-C program and NSF

  2. Outline
     • Motivations
     • The Register Alias Table (RAT) Complexity
     • Complexity-Effective RAT Designs
     • Exploiting the intra-group dependencies
     • Buffering recent address translations
     • Results and discussions
     • Conclusions

  3. Motivation
     • The RAT maintains the register address translations needed for handling true data dependencies
     • High Power Dissipation
       • 14% of the overall power is attributed to the RAT in the global power analysis performed in [FG 01]
     • High Power Density

  4. The RAT Complexity (W-way CPU)
     [Figure: the RAT and its ports]
     • W write ports to update W RAT entries for the destinations of the co-dispatched instructions
     • 2W read ports to translate the source register addresses
     • W read ports for checkpointing the old mapping of the destination register
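
The port counts on this slide add up to 4W ports for a W-way machine. The short sketch below only tabulates that arithmetic; the function name `rat_port_counts` and the dictionary keys are illustrative and do not come from the presentation, and the 2W figure presumes two source operands per instruction, as the slide implies.

```python
def rat_port_counts(width: int) -> dict:
    """Baseline RAT port requirements for a `width`-way machine, per the slide.

    Assumes each instruction has one destination and two source registers.
    """
    if width < 1:
        raise ValueError("width must be a positive integer")
    write = width                # one write port per destination
    source_read = 2 * width      # two source translations per instruction
    checkpoint_read = width      # read of the old destination mapping
    return {
        "write": write,
        "source_read": source_read,
        "checkpoint_read": checkpoint_read,
        "total": write + source_read + checkpoint_read,
    }


ports_4way = rat_port_counts(4)
```

For a 4-way machine this gives 4 write ports and 12 read ports, 16 ports in total.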

  5. Register Renaming Steps
     • Step 1. The following substeps are performed in parallel:
       • The RAT is read for the sources of each of the co-dispatched instructions, assuming that no dependencies exist among the instructions.
       • New physical registers are allocated for the destination registers of all of the co-dispatched instructions.
       • Data dependencies among the instructions are noted, using a set of comparators. The address of each destination register in a group of instructions is compared against the sources of all following instructions in the group; if a match occurs, a dependency is detected.
     • Step 2. If a data dependency is detected between a pair of instructions, the source physical register read out from the RAT for the dependent instruction is replaced with the destination register address allocated to the producing instruction, which preserves the true dependency.
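
The two steps above can be sketched in software as follows. This is a minimal sequential model, not the authors' hardware: the substeps of Step 1 run one after another here instead of in parallel, and the names `rename_group`, `Instruction`, the free list, and the `P…` physical register names are all hypothetical. It also assumes that the first operand of each instruction is the destination.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: (destination, [sources]) using architectural names.
Instruction = Tuple[str, List[str]]


def rename_group(rat: Dict[str, str], free_list: List[str],
                 group: List[Instruction]) -> List[dict]:
    """Rename one group of co-dispatched instructions; updates `rat` in place."""
    # Step 1a: read the RAT for every source as if the group had no dependencies.
    sources = [[rat[s] for s in srcs] for _, srcs in group]
    # Step 1b: allocate a new physical register for every destination.
    dests = [free_list.pop(0) for _ in group]
    # Step 1c and Step 2: compare each destination with the sources of all
    # following instructions; on a match, override the value read from the RAT.
    for i, (_, srcs) in enumerate(group):
        for k, src in enumerate(srcs):
            for j in range(i):  # ascending, so the nearest earlier producer wins
                if group[j][0] == src:
                    sources[i][k] = dests[j]
    # Checkpoint the old destination mapping, then install the new one.
    renamed = []
    for (dest, _), new_phys, new_srcs in zip(group, dests, sources):
        renamed.append({"dest": new_phys, "sources": new_srcs,
                        "old_dest": rat[dest]})
        rat[dest] = new_phys
    return renamed


rat = {f"R{i}": f"P{i}" for i in range(1, 6)}
free_list = ["P10", "P11", "P12"]
group = [("R1", ["R2", "R3"]),   # ADD  R1, R2, R3
         ("R4", ["R1", "R3"]),   # LOAD R4, R1, R3
         ("R5", ["R4", "R2"])]   # SUB  R5, R4, R2
renamed = rename_group(rat, free_list, group)
```

In this run the LOAD's first source is taken from the ADD's newly allocated register and the SUB's first source from the LOAD's, so the values read from the RAT for those two operands are discarded.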

  6. Conditional Sensing (CSense) – Exploiting the Intra-Group Dependencies
     • CSense disables parts of the RAT read accesses if an intra-group data dependency is noted
     [Figure: the group ADD R1, R2, R3; LOAD R4, R1, R3; SUB R5, R4, R2, with comparators (=)]

  7. Conditional Sensing (CSense) – Exploiting the Intra-Group Dependencies
     • CSense disables parts of the RAT read accesses if an intra-group data dependency is noted
     [Figure: the group ADD R1, R2, R3; LOAD R4, R1, R3; SUB R5, R4, R2, with comparators (=) and signal values 0, 1 and 0: disable sense amp]

  8. Conditional Sensing (CSense) – Exploiting the Intra-Group Dependencies
     • CSense disables parts of the RAT read accesses if an intra-group data dependency is noted
     [Figure: the group ADD R1, R2, R3; LOAD R4, R1, R3; SUB R5, R4, R2, with comparators (=) and signal values 0, 0 and 1: enable sense amp]
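
The CSense decision for the example group can be illustrated with a small sketch. It assumes that the sense-amp enable for a source read is the NOR of the comparator outputs against all earlier destinations in the group, which is how the figure labels appear to read, and that the first operand of each instruction is the destination. The name `sense_amp_enables` is hypothetical.

```python
from typing import List, Tuple

# Hypothetical encoding: (destination, [sources]) using architectural names.
Instruction = Tuple[str, List[str]]


def sense_amp_enables(group: List[Instruction]) -> List[List[bool]]:
    """For each source read, True means the RAT sense amps stay enabled."""
    enables = []
    for i, (_, srcs) in enumerate(group):
        earlier_dests = [dest for dest, _ in group[:i]]
        row = []
        for src in srcs:
            # One comparator per earlier destination; any match means the
            # translation comes from within the group, so the read is disabled.
            matches = [dest == src for dest in earlier_dests]
            row.append(not any(matches))
        enables.append(row)
    return enables


group = [("R1", ["R2", "R3"]),   # ADD  R1, R2, R3
         ("R4", ["R1", "R3"]),   # LOAD R4, R1, R3
         ("R5", ["R4", "R2"])]   # SUB  R5, R4, R2
enables = sense_amp_enables(group)
disabled_reads = sum(not enabled for row in enables for enabled in row)
```

Here two of the six source reads (R1 for the LOAD and R4 for the SUB) would have their sensing disabled.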

  9. Percentage of source operands that are produced by co-dispatched instructions in the same cycle
     [Chart; vertical axis: %]

  10. Buffering Recent Address Translations
     • SPEC 2000 simulations show that dependent instructions are usually in close proximity to each other
     • If the register needed as a source is defined by an earlier co-dispatched instruction, the CSense scheme becomes useful
     • If the register needed as a source is defined by an instruction that was dispatched in a previous cycle, we use External Latches (ELs) to serve recent register mappings faster

  11. Buffering Recent Address Translations
     • The RAT access for a source register now proceeds as follows:
       • Start accessing the RAT and at the same time address the ELs to see if the desired entry is located in one of the ELs
       • If a matching entry is found, discontinue the access from the RAT
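
A rough software model of this access sequence is sketched below. The slides do not say how the ELs are filled or replaced, so the policy here (every new mapping is written into a latch, the oldest latch is evicted first) is an assumption, as are the class name `RatWithLatches` and its counters. The default of four latches follows the "Four External Latches" slide. In hardware the RAT and the ELs are probed concurrently; the model only counts which RAT reads would run to completion.

```python
from collections import OrderedDict
from typing import Dict


class RatWithLatches:
    """RAT fronted by a few external latches holding recent translations."""

    def __init__(self, rat: Dict[str, str], num_latches: int = 4):
        self.rat = dict(rat)
        self.latches: "OrderedDict[str, str]" = OrderedDict()
        self.num_latches = num_latches
        self.latch_hits = 0        # RAT access discontinued early
        self.full_rat_reads = 0    # RAT access ran to completion

    def update(self, arch: str, phys: str) -> None:
        """Install a new mapping in the RAT and in a latch (assumed policy)."""
        self.rat[arch] = phys
        self.latches.pop(arch, None)           # drop any stale copy
        self.latches[arch] = phys
        if len(self.latches) > self.num_latches:
            self.latches.popitem(last=False)   # evict the oldest entry

    def translate(self, arch: str) -> str:
        """Look up a source register, preferring a matching latch."""
        if arch in self.latches:
            self.latch_hits += 1
            return self.latches[arch]
        self.full_rat_reads += 1
        return self.rat[arch]


table = RatWithLatches({"R1": "P1", "R2": "P2"})
table.update("R1", "P10")            # R1 was renamed in a previous cycle
recent = table.translate("R1")       # served by a latch
older = table.translate("R2")        # falls through to the RAT
```

Writing every update into the latches keeps them consistent with the RAT, so a latch hit never returns a stale translation in this model.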

  12. Renaming Logic with Four External Latches
     [Figure]

  13. The Hit Ratio to External Latches
     [Chart; vertical axis: %]

  14. Experimental Setup (AccuPower, DATE’02)
     [Figure: tool-flow diagram with the labels: Compiled SPEC benchmarks; Performance stats; Microarchitectural Simulator; Datapath specs; Transition counts, Context information; Power/energy stats; Energy/Power Estimator; VLSI layout data; SPICE; SPICE deck; SPICE measures of Energy per transition]

  15. Energy of the Baseline and Proposed RAT Designs
     [Chart; vertical axis: pJ; energy savings annotated: 15%, 19%, 27%, 30%]

  16. Conclusions
     • Two techniques to reduce RAT power have been proposed:
       • CSense
       • Buffering Recent Register Mappings
     • 30% energy savings
     • No performance penalty
     • Little additional complexity
     • No increase in the processor’s cycle time
