A Configurable Simulator for OOO Speculative Execution: Design & Implementation
By Mustafa Imran Ali, ID#230203
Architecture Modeled
• Fetch logic
  • Trace-driven execution; branch outcomes are explicitly specified
• Issue logic
  • Issue width configurable
• Functional units' Reservation Stations (RS)
  • RS count configurable
• Execution units modeled after the MIPS R4000 pipeline (Hennessy & Patterson, Computer Architecture, 3rd Ed.)
  • No. of pipeline stages configurable
• Common Data Buses (CDBs)
  • No. of CDBs configurable
• ROB and commit logic
  • ROB size and commit capacity configurable
Simulation Methodology
• The program trace is written to a file in comma-separated values (CSV) format
• A configuration file specifies the values of the configurable parameters
• The trace file and the configuration file are the inputs to the simulator
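The trace-reading step can be sketched as follows. The slides do not give the column layout of the trace file, so this is only an illustration under an assumed layout: the first column is the opcode and the remaining columns are kept as raw strings. `TraceEntry`, `splitCsv`, and `readTrace` are hypothetical names, not the simulator's own.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical trace record: the slides do not give the real column layout.
struct TraceEntry {
    std::string opcode;               // first column, e.g. "ADD"
    std::vector<std::string> fields;  // remaining comma-separated columns
};

// Split one CSV line on commas (quoting and escaping are not handled).
std::vector<std::string> splitCsv(const std::string& line) {
    std::vector<std::string> out;
    std::stringstream ss(line);
    std::string cell;
    while (std::getline(ss, cell, ',')) out.push_back(cell);
    return out;
}

// Parse a whole trace from any input stream; blank lines are skipped.
std::vector<TraceEntry> readTrace(std::istream& in) {
    std::vector<TraceEntry> trace;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        std::vector<std::string> cells = splitCsv(line);
        assert(!cells.empty());  // a non-empty line yields at least one cell
        TraceEntry e;
        e.opcode = cells[0];
        e.fields.assign(cells.begin() + 1, cells.end());
        trace.push_back(e);
    }
    return trace;
}
```

Reading from a `std::istream` rather than a file name keeps the sketch easy to exercise with an in-memory string; a real run would pass a `std::ifstream` opened on the trace file.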
Architectural Assumptions
• Only load misses are supported; stores are committed in a single cycle
• Stores use a direct bus to transfer the calculated Effective Address (EA) into the ROB
• Branch outcomes are written to the ROB using the CDB
• A branch mispredict is handled when the branch instruction reaches the head of the ROB
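The commit-time handling of a mispredicted branch can be illustrated with a small sketch. The entry fields and the function name `commitCycle` are assumptions made for this example; the slides only state that the mispredict is handled once the branch reaches the head of the ROB and that the commit capacity is configurable.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical ROB entry; only the fields needed for the commit decision.
struct RobEntry {
    bool done;          // result (or branch outcome) has been written back
    bool isBranch;
    bool mispredicted;  // meaningful only when isBranch and done are true
};

// Commit up to commitSize completed entries from the head, in order.
// If a mispredicted branch reaches the head, it commits and every younger
// entry is squashed. Returns the number of entries committed this cycle.
std::size_t commitCycle(std::deque<RobEntry>& rob, std::size_t commitSize) {
    assert(commitSize > 0);
    std::size_t committed = 0;
    while (committed < commitSize && !rob.empty() && rob.front().done) {
        RobEntry head = rob.front();
        rob.pop_front();
        ++committed;
        if (head.isBranch && head.mispredicted) {
            rob.clear();  // flush all younger instructions
            break;
        }
    }
    return committed;
}
```

Commit stops at the first entry that is not yet done, which is what keeps retirement in program order.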
Architectural Assumptions (cont.)
• Dynamic memory disambiguation is implemented using a Store EA cache
  • A load is only allowed to proceed if there are no pending stores with the same effective address
• Reservation Stations issue the first ready instruction detected
  • Not necessarily the oldest instruction
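The load check against the Store EA cache could look roughly like this. It is a minimal sketch, assuming a small fixed-capacity table of 32-bit addresses; the class name `StoreEaCache` and its methods are hypothetical, and the handling of stores whose address is not yet known is left out because the slides do not describe it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-capacity table of effective addresses of pending stores.
class StoreEaCache {
public:
    explicit StoreEaCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Record a store whose EA has been computed; fails if the table is full.
    bool addPendingStore(std::uint32_t ea) {
        if (pending_.size() >= capacity_) return false;
        pending_.push_back(ea);
        return true;
    }

    // Remove one matching entry when the store commits.
    void retireStore(std::uint32_t ea) {
        for (std::size_t i = 0; i < pending_.size(); ++i) {
            if (pending_[i] == ea) {
                pending_.erase(pending_.begin() + i);
                return;
            }
        }
    }

    // A load may proceed only if no pending store has the same EA.
    bool loadMayProceed(std::uint32_t ea) const {
        for (std::size_t i = 0; i < pending_.size(); ++i) {
            if (pending_[i] == ea) return false;
        }
        return true;
    }

private:
    std::size_t capacity_;
    std::vector<std::uint32_t> pending_;
};
```

The capacity corresponds to the STORE CACHE SIZE parameter; a linear scan is adequate for the small table sizes such a parameter suggests.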
Architectural Assumptions (cont.)
• Access to the available CDBs is arbitrated
• When requests for a CDB arrive, they are granted in the following priority order:
  • Branch FU
  • Div FU
  • LD/ST
  • MULT FU
  • FPADD FU
  • INT ALU FU
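A fixed-priority arbiter of this kind can be sketched in a few lines. The sketch assumes the list above runs from highest to lowest priority, which the slides imply but do not state outright; the enum `FuType` and the function `arbitrate` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// FU types in the slide's order; a lower value means a higher priority
// (assumed: the slide list is read as highest priority first).
enum FuType { BRANCH = 0, DIV, LDST, MULT, FPADD, INTALU };

// Grant at most numCdb requests, highest priority first.
// Requests from the same FU type keep their arrival order (stable sort).
std::vector<FuType> arbitrate(std::vector<FuType> requests, std::size_t numCdb) {
    assert(numCdb > 0);
    std::stable_sort(requests.begin(), requests.end());
    if (requests.size() > numCdb) requests.resize(numCdb);
    return requests;
}
```

Requests that are not granted would simply be presented again in the next cycle; that retry logic is outside this sketch.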
List of Configurable Parameters
• ISSUE SIZE
  • The maximum number of instructions examined for parallel issue
• COMMIT SIZE
  • The maximum number of instructions examined in the ROB for commit
• ROB SIZE
  • The number of entries in the Reorder Buffer
• NUM CDB
  • Number of Common Data Buses
• LSQ SIZE
  • Number of entries in the load/store buffer
• STORE CACHE SIZE
  • Number of entries in the store EA lookup table
List of Configurable Parameters (cont.)
• NUMRSBU
• NUMRSINTALU
• NUMRSMULT
• MULTSTAGES
• NUMRSDIV
List of Configurable Parameters (cont.)
• DIVCYCLES
• NUMRSFPADD
• FPADDSTAGES
• MISSPROB
• MPPROB
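Loading such parameters might look like the sketch below. The actual configuration file syntax is not shown in the slides, so this assumes one whitespace-separated `NAME value` pair per line, with names written without spaces. Values are stored as `double` so that MISSPROB and MPPROB, which by their names are presumably probabilities, fit alongside the integer counts. `readConfig` and `getOr` are hypothetical helpers.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Read "NAME value" pairs, one per line (an assumed format, not the author's).
// Reading stops at the first token pair that does not parse.
std::map<std::string, double> readConfig(std::istream& in) {
    std::map<std::string, double> cfg;
    std::string name;
    double value;
    while (in >> name >> value) cfg[name] = value;
    return cfg;
}

// Look up a parameter, falling back to a default when it is absent.
double getOr(const std::map<std::string, double>& cfg,
             const std::string& name, double fallback) {
    std::map<std::string, double>::const_iterator it = cfg.find(name);
    return it == cfg.end() ? fallback : it->second;
}
```

A map keeps the sketch short; a simulator with a fixed parameter set could equally copy the values into a plain struct after reading.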
Simulator Structure
    main() {
        readtracefile();
        readconfigfile();
        while (NOT EXIT) {
            commit();
            ROB_update();
            RS_update();
            CDB_Arbiter();
            writeback();
            execute();
            issue();
            fetch();
        }
        printStatistics();
    }
Block Diagram
Components shown: Trace, Issue Unit, INT ALU RS, BR UNIT RS, LSQ, DIV UNIT RS, MULT UNIT RS, ROB, Arbiter, CDB, RF
Metrics Measured
• Cycles to complete
• Issue stall cycles
  • Cycles when no instructions can be issued to the RS
• FU utilization (for each FU)
  • No. of instructions of that FU type / Total cycles
• CDB utilization (for each CDB)
  • No. of broadcasts / Total cycles
• Cycles Per Instruction (CPI)
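The ratio metrics above reduce to simple divisions over counters collected during the run. A minimal sketch follows; the struct `RunCounters` and its field names are illustrative rather than taken from the simulator, and it holds the counters for a single FU type and a single CDB only.

```cpp
#include <cassert>

// Counters a simulator run would accumulate (field names are illustrative).
struct RunCounters {
    unsigned long cycles;          // total cycles to complete
    unsigned long committed;       // instructions committed
    unsigned long fuInstructions;  // instructions executed on one FU type
    unsigned long cdbBroadcasts;   // broadcasts made on one CDB
};

// Cycles Per Instruction = total cycles / committed instructions.
double cyclesPerInstruction(const RunCounters& c) {
    assert(c.committed > 0);
    return static_cast<double>(c.cycles) / c.committed;
}

// FU utilization = instructions of that FU type / total cycles.
double fuUtilization(const RunCounters& c) {
    assert(c.cycles > 0);
    return static_cast<double>(c.fuInstructions) / c.cycles;
}

// CDB utilization = broadcasts on that CDB / total cycles.
double cdbUtilization(const RunCounters& c) {
    assert(c.cycles > 0);
    return static_cast<double>(c.cdbBroadcasts) / c.cycles;
}
```

For example, a run of 20 cycles that commits 8 instructions has a CPI of 2.5.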
Metrics Measured (cont.)
• Frequency of each issue count over all execution cycles
• Frequency of each commit count over all execution cycles
• RS occupancy frequency over all cycles
• ROB occupancy frequency over all cycles
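Each of these frequency metrics is a histogram sampled once per simulated cycle. The sketch below shows one way to collect it; the class `CycleHistogram` is a hypothetical helper, with `maxValue` standing for the issue size, the commit size, or the RS or ROB capacity, depending on the metric.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Histogram over the values 0..maxValue, sampled once per simulated cycle.
class CycleHistogram {
public:
    explicit CycleHistogram(std::size_t maxValue)
        : bins_(maxValue + 1, 0), samples_(0) {}

    // Record the value observed in the current cycle.
    void record(std::size_t value) {
        assert(value < bins_.size());
        ++bins_[value];
        ++samples_;
    }

    // Number of cycles in which the value was observed.
    unsigned long count(std::size_t value) const { return bins_.at(value); }

    // Fraction of sampled cycles in which the value was observed.
    double frequency(std::size_t value) const {
        if (samples_ == 0) return 0.0;
        return static_cast<double>(bins_.at(value)) / samples_;
    }

private:
    std::vector<unsigned long> bins_;
    unsigned long samples_;
};
```

One instance per metric would be updated at the end of every iteration of the main simulation loop.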
Simulator Design
• Coded in C++
• Compiled using MS VC++ 6.0
Execution Demonstration
Register state initializations:
    REGS[1].valid=1
    REGS[2].valid=1
    REGS[3].valid=1
    REGS[8].valid=1
    REGS[9].valid=1
    REGS[11].valid=1
    REGS[12].valid=1
    REGS[15].valid=1
    REGS[16].valid=1
    REGS[17].valid=1
Sample program:
    ADD R0,R1,R2;
    ADD R4,R0,R3;
    ADD R7,R4,R0;
    ADD R10,R11,R12;
    ADD R13,R10,R15;
    ADD R13,R16,R17;
    ADD R15,R11,R12;
    ADD R17,R15,R12;
    EXIT
Hazards annotated on the program: RAW, WAR, WAW
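The RAW, WAR, and WAW hazards in the sample program can be checked mechanically. The sketch below assumes the MIPS-style operand order `ADD dest,src1,src2`, which the slides do not state explicitly, and compares instructions pairwise without accounting for intervening writes to the same register. `Instr` and the helper functions are illustrative names.

```cpp
#include <cassert>
#include <string>

// One three-operand instruction, by register number
// (assumed operand order: dest, src1, src2).
struct Instr { int dest, src1, src2; };

// RAW: the later instruction reads what the earlier one writes.
bool hasRaw(const Instr& earlier, const Instr& later) {
    return later.src1 == earlier.dest || later.src2 == earlier.dest;
}

// WAR: the later instruction writes a register the earlier one reads.
bool hasWar(const Instr& earlier, const Instr& later) {
    return later.dest == earlier.src1 || later.dest == earlier.src2;
}

// WAW: both instructions write the same register.
bool hasWaw(const Instr& earlier, const Instr& later) {
    return later.dest == earlier.dest;
}

// Space-separated list of the hazards between two instructions.
std::string hazards(const Instr& earlier, const Instr& later) {
    std::string out;
    if (hasRaw(earlier, later)) out += "RAW";
    if (hasWar(earlier, later)) out += out.empty() ? "WAR" : " WAR";
    if (hasWaw(earlier, later)) out += out.empty() ? "WAW" : " WAW";
    return out;
}
```

Under that operand-order assumption, the second ADD reads R0 written by the first (RAW), the fifth and sixth ADDs both write R13 (WAW), and the seventh ADD writes R15, which the fifth reads (WAR).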
Results: Cycles
Present Implementation
• Completely configurable simulator
• INT ALU in working state
Immediate Extension
• Branch unit completion
• Pipelined multiplier completion
• LD/STORE unit completion