Learn about delaying physical register allocation through virtual-physical registers. Explore register file design considerations, virtual-physical register mapping, instruction issues, sequential execution, register stealing, and implementation details.
CS 7960-4 Lecture 14
Delaying Physical Register Allocation Through Virtual-Physical Registers
T. Monreal, A. Gonzalez, M. Valero, J. Gonzalez, V. Vinals
Proceedings of MICRO-32, November 1999
Register File Design Considerations
• Number of ports = 3 x issue width
• Number of entries = window size + logical-regs
• Multiple threads need more registers (more power)
• Wire delays and clock speeds lead to multiple-cycle access
• Pipelining a RAM structure is hard
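The two sizing rules on this slide are plain arithmetic. A minimal sketch follows; the issue width of 4, the 128-entry window, and the 32 logical registers are illustrative values assumed here, not numbers from the lecture, and `regfile_size` is a hypothetical helper name.

```python
def regfile_size(issue_width, window_size, logical_regs):
    """Apply the slide's two rules of thumb for a conventional register file."""
    # Rule from the slide: three ports per issued instruction
    # (typically two source reads and one result write).
    ports = 3 * issue_width
    # Rule from the slide: one entry per in-flight instruction
    # plus one per logical (architectural) register.
    entries = window_size + logical_regs
    return ports, entries


# Illustrative values only (assumed, not from the lecture).
ports, entries = regfile_size(issue_width=4, window_size=128, logical_regs=32)
print(ports, entries)
```

Doubling the issue width doubles the port count, which is why wide-issue designs put so much pressure on the register file.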
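Register Allocation
[Figure: timeline of pr7 across the pipeline stages Fetch, Rename, Issue, Complete, Wake-up, Commit, with marks at cycles 4, 15, 30, 50, and 80]
• assign pr7: cycle 4
• write pr7: cycle 30
• read pr7: cycle 50
• release pr7: cycle 80
• no result: 26 cyc (cycles 4 to 30)
• useful time: 20 cyc (cycles 30 to 50)
• no activity: 30 cyc (cycles 50 to 80)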
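The three intervals on this slide follow directly from the four cycle numbers given for pr7. A small sketch of that arithmetic is below; the dictionary keys and the function name are made up for illustration, and treating the read at cycle 50 as the last read of the value is an assumption.

```python
# Cycle numbers for pr7 as given on the slide.
events = {"assign": 4, "write": 30, "read": 50, "release": 80}


def lifetime_breakdown(ev):
    """Split the time a physical register is held into the slide's three phases."""
    no_result = ev["write"] - ev["assign"]    # allocated, value not yet produced
    useful = ev["read"] - ev["write"]         # value present and still needed
    no_activity = ev["release"] - ev["read"]  # value no longer read, register still held
    return no_result, useful, no_activity


no_result, useful, no_activity = lifetime_breakdown(events)
held = events["release"] - events["assign"]
print(no_result, useful, no_activity, held)
print(f"useful fraction: {useful / held:.2f}")
```

In this example the register is held for 76 cycles but is doing useful work for only 20 of them, which is the motivation for delaying allocation until the result is produced.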
Two-Level Register File
[Figure: base regfile compared with a two-level regfile]
Virtual-Physical Registers
[Figure, built up over four slides: register map table, virtual map table, and in-flight instructions tagged with vr7]
• Rename: the register map table maps lr3 to vr7; the virtual map table has no physical register for vr7 yet
• Instruction issues: its destination is still identified only by vr7
• Instruction completes: it is assigned pr9; the register map table entry for lr3 becomes vr7, pr9 and the virtual map table records vr7, pr9
• Instructions tagged vr7 now carry vr7 (pr9)
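The two-table mapping shown in these slides can be sketched as a toy model. This is only an illustration under simplifying assumptions: VP tags are plain counters, physical registers are handed out from a free list in order, and the class and method names (`VPRename`, `rename_dest`, `complete`, `lookup`) are hypothetical rather than taken from the paper.

```python
class VPRename:
    """Toy model: VP tags are assigned at rename, physical registers at completion."""

    def __init__(self, num_physical):
        self.free_phys = list(range(num_physical))  # free physical registers
        self.next_vp = 0                            # next unused VP tag
        self.reg_map = {}                           # logical reg -> VP tag
        self.virt_map = {}                          # VP tag -> physical reg, once assigned

    def rename_dest(self, logical):
        """Rename stage: tag the destination with a VP register only."""
        vp = self.next_vp
        self.next_vp += 1
        self.reg_map[logical] = vp
        return vp

    def complete(self, vp):
        """Completion: allocate a physical register, or return None if none is free."""
        if not self.free_phys:
            return None  # the instruction would have to re-execute later
        pr = self.free_phys.pop(0)
        self.virt_map[vp] = pr
        return pr

    def lookup(self, logical):
        """Return (VP tag, physical register or None) for a source operand."""
        vp = self.reg_map[logical]
        return vp, self.virt_map.get(vp)


r = VPRename(num_physical=2)
vp = r.rename_dest("lr3")
print(r.lookup("lr3"))  # VP tag known, no physical register yet
pr = r.complete(vp)
print(r.lookup("lr3"))  # VP tag now backed by a physical register
```

The point of the split is visible in `rename_dest`: it never touches `free_phys`, so a renamed but unfinished instruction costs no physical register.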
Lack of Registers
[Figure: in-flight window at cycle t and cycle t+1; legend: has physical register / has no physical register]
• An instruction finishes, has no register, keeps re-executing
• Cycle t: an instruction commits
• Cycle t+1: the re-executing instruction gets a reg
Deadlock
• Who will generate a register for this instr?
• Solution: Reserve a register for the oldest instruction
[Figure: in-flight window; an instruction finishes, has no register, keeps re-executing; legend: has physical register / has no physical register]
Sequential Execution
• Oldest instr has reserved register
• Instr commits, releases another reg, that is then reserved for the new oldest instr
• Behaves like an in-order processor
[Figure: in-flight window; legend: has physical register / has no physical register]
Reserving All Registers
• Allows quick progress, but almost behaves like a conventional processor
[Figure legend: has physical register / has no physical register]
Register Stealing
• Instr finishes; steals register from the youngest finished instr
• No reservation of regs
• The younger instrs may have to execute twice
• Note the pre-execution effect
[Figure: in-flight window; legend: has physical register / has no physical register]
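The stealing policy on this slide can be sketched as follows. This is a simplified illustration, not the paper's implementation: the `Instr` record and `finish` function are hypothetical, and restricting victims to instructions younger than the thief is an assumption of the sketch (it is consistent with the bullet that younger instrs may execute twice).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    age: int                        # smaller = older in program order
    finished: bool = False
    phys_reg: Optional[int] = None


def finish(instr, window, free_regs):
    """Give a finishing instruction a register, stealing one if none is free.

    Returns the victim that must re-execute, or None if nothing was stolen.
    instr.finished stays False when no register could be obtained at all.
    """
    if free_regs:
        instr.phys_reg = free_regs.pop()
        instr.finished = True
        return None
    # Assumption: only finished, register-holding instructions younger than
    # the thief are eligible victims.
    victims = [i for i in window
               if i.finished and i.phys_reg is not None and i.age > instr.age]
    if not victims:
        return None  # nothing to steal; this instruction must re-execute
    victim = max(victims, key=lambda i: i.age)  # youngest finished instr
    instr.phys_reg, victim.phys_reg = victim.phys_reg, None
    victim.finished = False                     # victim executes again later
    instr.finished = True
    return victim


window = [Instr(age=0),
          Instr(age=1, finished=True, phys_reg=5),
          Instr(age=2, finished=True, phys_reg=6)]
victim = finish(window[0], window, free_regs=[])
print(window[0].phys_reg, victim.age)
```

Here the oldest instruction takes the register of the youngest finished one, so forward progress never depends on a reserved register.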
Implementation
• Finished instructions have to remain in issueq in case they have to re-execute
• Issued dependents of the victim instruction need not re-execute
• The VP tag of the victim has to be broadcast so that unissued dependents can reset the ready bit
• Can benefit from an instruction reuse buffer?
• Pre-execution without explicitly attempting it
Results
• Improves the base case by 5% (Int programs) and 24% (FP programs)
• FP programs have more ILP, better branch prediction, and are more limited by cache misses
• Re-executions: 10% (int), 58% (fp)
• Steals: 5% (int), 12% (fp)
• For the same IPC, VP registers employ 25% fewer registers
Next Week’s Paper
• “Pipeline Gating: Speculation Control for Energy Reduction”, S. Manne, A. Klauser, D. Grunwald, Proceedings of ISCA-25, June 1998
Harmonic and Arithmetic Means
• HM of IPC = N / (1/IPCa + 1/IPCb + 1/IPCc) = N / (CPIa + CPIb + CPIc) = 1 / AM of CPI
• Weight each benchmark as if they all execute one instruction
• If you want to assume each benchmark executes for the same time, HM of CPI or AM of IPC is appropriate
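The identities on this slide can be checked numerically. The sketch below uses three made-up IPC values (not results from the paper) and Python's standard `statistics` module; it also shows which weighting each mean corresponds to.

```python
from statistics import harmonic_mean, mean

ipc = [0.5, 1.0, 2.0]            # made-up IPC values for benchmarks a, b, c
cpi = [1 / x for x in ipc]       # CPI is the reciprocal of IPC

# HM of IPC equals 1 / AM of CPI.
hm_ipc = harmonic_mean(ipc)
print(hm_ipc, 1 / mean(cpi))

# AM of IPC equals 1 / HM of CPI.
am_ipc = mean(ipc)
print(am_ipc, 1 / harmonic_mean(cpi))

# Equal instruction counts: aggregate IPC matches the HM of IPC.
n = 1_000                        # instructions per benchmark (arbitrary)
equal_instr_ipc = (n * len(ipc)) / sum(n * c for c in cpi)

# Equal run time: aggregate IPC matches the AM of IPC.
t = 1_000                        # cycles per benchmark (arbitrary)
equal_time_ipc = sum(t * x for x in ipc) / (t * len(ipc))

print(equal_instr_ipc, equal_time_ipc)
```

The harmonic mean is pulled toward the slowest benchmark because, with equal instruction counts, that benchmark contributes the most cycles.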