
A Configurable Simulator for OOO Speculative Execution

A Configurable Simulator for OOO Speculative Execution. Design & Implementation. By Mustafa Imran Ali ID#230203. Architecture Modeled. Fetch logic Trace driven execution. Branches outcome explicitly specified. Issue Logic Issue width configurable



  1. A Configurable Simulator for OOO Speculative Execution: Design & Implementation. By Mustafa Imran Ali, ID#230203

  2. Architecture Modeled
     • Fetch logic
       • Trace-driven execution; branch outcomes are explicitly specified
     • Issue logic
       • Issue width configurable
     • Functional units' reservation stations (RS)
       • RS count configurable
     • Execution units modeled after the MIPS R4000 pipeline (Hennessy & Patterson, Computer Architecture, 3rd ed.)
       • Number of pipeline stages configurable
     • Common data buses (CDBs)
       • Number of CDBs configurable
     • ROB and commit logic
       • ROB size and commit capacity configurable

  3. Simulation Methodology
     • A program trace file written in comma-separated values (CSV) format
     • A configuration file that specifies the values of the configurable parameters
     • The trace file and the configuration file are the inputs to the simulator
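The slides do not give the trace file layout. As a minimal sketch, assuming one instruction per line in the form `OPCODE,operand,operand,...` (a hypothetical layout, as are the names `TraceEntry` and `parseTraceLine`), a CSV trace line could be split like this:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical trace record. The field layout is an assumption,
// not the simulator's documented format.
struct TraceEntry {
    std::string opcode;
    std::vector<std::string> operands;
};

// Split one CSV line such as "ADD,R0,R1,R2" into an opcode and its operands.
TraceEntry parseTraceLine(const std::string& line) {
    TraceEntry entry;
    std::istringstream fields(line);
    std::string field;
    bool first = true;
    while (std::getline(fields, field, ',')) {
        if (first) {
            entry.opcode = field;   // first field is taken as the opcode
            first = false;
        } else {
            entry.operands.push_back(field);
        }
    }
    return entry;
}
```

Calling `parseTraceLine("ADD,R0,R1,R2")` yields the opcode `ADD` and three operands. Branch outcomes, which the slides say are specified explicitly in the trace, would need an extra field under this layout.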

  4. Architectural Assumptions
     • Only load misses are modeled; stores commit in a single cycle
     • Stores use a direct bus to transfer the calculated effective address (EA) into the ROB
     • Branch outcomes are written to the ROB over the CDB
     • A branch misprediction is handled when the branch instruction reaches the head of the ROB

  5. Architectural Assumptions (cont.)
     • Dynamic memory disambiguation is implemented using a store EA cache
       • A load is allowed to proceed only if no pending store has the same effective address
     • Reservation stations issue the first ready instruction detected
       • Not necessarily the oldest instruction
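The two rules on this slide can be sketched as follows. This is an illustration only: `RSEntry`, `selectFirstReady`, and `loadMayProceed` are hypothetical names, and the sketch assumes that every pending store already has its effective address in the store EA cache.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reservation station entry.
struct RSEntry {
    bool busy;           // entry holds an instruction
    bool operandsReady;  // all source operands are available
};

// Return the index of the first busy entry whose operands are ready,
// or -1 if none is ready. Scans in slot order, so the result is the
// first ready instruction found, not necessarily the oldest one.
int selectFirstReady(const std::vector<RSEntry>& rs) {
    for (std::size_t i = 0; i < rs.size(); ++i) {
        if (rs[i].busy && rs[i].operandsReady) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// A load may proceed only if no pending store has the same effective address.
bool loadMayProceed(std::uint64_t loadEA,
                    const std::vector<std::uint64_t>& pendingStoreEAs) {
    for (std::uint64_t storeEA : pendingStoreEAs) {
        if (storeEA == loadEA) {
            return false;  // address conflict: the load must wait
        }
    }
    return true;
}
```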

  6. Architectural Assumptions (cont.)
     • Access to the available CDBs is arbitrated
     • When requests for a CDB arrive, they are granted in the following priority order:
       • Branch FU
       • DIV FU
       • LD/ST
       • MULT FU
       • FPADD FU
       • INT ALU FU
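The priority order above can be expressed as a fixed-priority arbiter. The sketch below is one possible reading, not the simulator's actual `CDB_Arbiter()`; the `FU` enum and `arbitrate` function are hypothetical names, and requests that are not granted are assumed to retry in a later cycle.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Lower value means higher priority, following the order on the slide.
enum class FU { Branch = 0, Div, LoadStore, Mult, FpAdd, IntAlu };

// Grant at most numCdb of the pending requests, highest priority first.
// Returns the granted requests; the rest are simply dropped from the result.
std::vector<FU> arbitrate(std::vector<FU> requests, std::size_t numCdb) {
    std::stable_sort(requests.begin(), requests.end(),
                     [](FU a, FU b) {
                         return static_cast<int>(a) < static_cast<int>(b);
                     });
    if (requests.size() > numCdb) {
        requests.resize(numCdb);  // keep only as many as there are buses
    }
    return requests;
}
```

With two CDBs and requests from the INT ALU, MULT, Branch, and LD/ST units, this grants the Branch and LD/ST requests.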

  7. List of Configurable Parameters
     • ISSUE SIZE
       • The maximum number of instructions examined for parallel issue
     • COMMIT SIZE
       • The maximum number of instructions examined in the ROB for commit
     • ROB SIZE
       • The number of entries in the reorder buffer
     • NUM CDB
       • Number of common data buses
     • LSQ SIZE
       • Number of entries in the load/store buffer
     • STORE CACHE SIZE
       • Number of entries in the store EA lookup table

  8. List of Configurable Parameters (cont.)
     • NUMRSBU
     • NUMRSINTALU
     • NUMRSMULT
     • MULTSTAGES
     • NUMRSDIV

  9. List of Configurable Parameters (cont.)
     • DIVCYCLES
     • NUMRSFPADD
     • FPADDSTAGES
     • MISSPROB
     • MPPROB
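The slides name the parameters but not the configuration file syntax. As a sketch, assuming one `NAME=VALUE` pair per line (an assumption, as are the underscore spellings used in the usage note and the function name `parseConfig`), the file could be read into a lookup table:

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Parse "NAME=VALUE" lines into a name -> value map.
// The file syntax is an assumption; lines without '=' are skipped.
// Values are stored as double so that probabilities such as MISSPROB fit.
// std::stod throws std::invalid_argument if a value is not numeric.
std::map<std::string, double> parseConfig(const std::string& text) {
    std::map<std::string, double> params;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::size_t eq = line.find('=');
        if (eq == std::string::npos) {
            continue;
        }
        params[line.substr(0, eq)] = std::stod(line.substr(eq + 1));
    }
    return params;
}
```

For example, `parseConfig("ROB_SIZE=16\nNUM_CDB=2\n")` returns a map with two entries.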

  10. Simulator Structure

      main() {
          readtracefile();
          readconfigfile();
          while (NOT EXIT) {
              commit();
              ROB_update();
              RS_update();
              CDB_Arbiter();
              writeback();
              execute();
              issue();
              fetch();
          }
          printStatistics();
      }
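The loop calls the stages from commit back to fetch. The slides do not say why; one common reason in cycle-based simulators is that each stage then sees what the preceding stage produced in the previous cycle, without needing duplicate latches. The toy two-stage model below (all names hypothetical) illustrates that effect:

```cpp
// Minimal two-stage model; -1 marks an empty latch.
// This is an illustration, not the simulator's code.
struct Pipeline {
    int fetched = -1;  // latch written by fetch, read by issue
    int issued = -1;   // latch written by issue
    int nextId = 0;    // id of the next instruction in the trace

    void fetch() { fetched = nextId++; }
    void issue() {
        issued = fetched;  // consume what fetch produced last cycle
        fetched = -1;
    }

    // Back-to-front order: issue runs before fetch overwrites the latch,
    // so an instruction spends one full cycle in each stage.
    void cycle() {
        issue();
        fetch();
    }
};
```

After the first call to `cycle()`, instruction 0 sits in the fetch latch and nothing has issued; after the second call, instruction 0 has issued and instruction 1 has been fetched.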

  11. Block Diagram (figure not reproduced). Components shown: Trace, Issue Unit, INT ALU RS, BR UNIT RS, LSQ, DIV UNIT RS, MULT UNIT RS, ROB, Arbiter, CDB, RF

  12. Metrics Measured
      • Cycles to complete
      • Issue stall cycles
        • Cycles in which no instruction can be issued to an RS
      • FU utilization (for each FU)
        • Number of instructions of that FU type / total cycles
      • CDB utilization (for each CDB)
        • Number of broadcasts / total cycles
      • Cycles per instruction (CPI)
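The ratios on this slide translate directly into code. In the sketch below the `Counters` struct and function names are hypothetical, and CPI is taken as total cycles divided by instructions executed, which is the usual definition:

```cpp
// Hypothetical end-of-run counters for one FU and one CDB.
struct Counters {
    long cycles;          // total cycles to complete
    long instructions;    // instructions executed
    long fuInstructions;  // instructions executed on the FU of interest
    long cdbBroadcasts;   // broadcasts on the CDB of interest
};

// Cycles per instruction.
double cpi(const Counters& c) {
    return static_cast<double>(c.cycles) / c.instructions;
}

// FU utilization: instructions of that FU type / total cycles.
double fuUtilization(const Counters& c) {
    return static_cast<double>(c.fuInstructions) / c.cycles;
}

// CDB utilization: broadcasts / total cycles.
double cdbUtilization(const Counters& c) {
    return static_cast<double>(c.cdbBroadcasts) / c.cycles;
}
```

For a run of 20 cycles with 8 instructions, 5 of them on the FU of interest, and 10 broadcasts on one CDB, these give a CPI of 2.5, an FU utilization of 0.25, and a CDB utilization of 0.5.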

  13. Metrics Measured (cont.)
      • Frequency of each issue count over all execution cycles
      • Frequency of each commit count over all execution cycles
      • Frequency of each RS occupancy level over all cycles
      • Frequency of each ROB occupancy level over all cycles
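Each of these frequency metrics is a histogram over cycles. A minimal sketch for the issue count, assuming the simulator records how many instructions it issued in every cycle (the function name and its inputs are hypothetical):

```cpp
#include <vector>

// histogram[k] = number of cycles in which exactly k instructions issued.
// issueSize is the configured maximum issue width, so k ranges over
// 0..issueSize; out-of-range samples are ignored.
std::vector<long> issueHistogram(const std::vector<int>& issuedPerCycle,
                                 int issueSize) {
    std::vector<long> histogram(issueSize + 1, 0);
    for (int n : issuedPerCycle) {
        if (n >= 0 && n <= issueSize) {
            ++histogram[n];
        }
    }
    return histogram;
}
```

The same shape works for commit counts (sized by COMMIT SIZE) and for RS or ROB occupancy (sized by the number of entries).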

  14. Simulator Design
      • Coded in C++
      • Compiled using MS VC++ 6.0

  15. Execution Demonstration
      Register state initializations:
        REGS[1].valid=1
        REGS[2].valid=1
        REGS[3].valid=1
        REGS[8].valid=1
        REGS[9].valid=1
        REGS[11].valid=1
        REGS[12].valid=1
        REGS[15].valid=1
        REGS[16].valid=1
        REGS[17].valid=1
      Sample program:
        ADD R0,R1,R2;
        ADD R4,R0,R3;
        ADD R7,R4,R0;
        ADD R10,R11,R12;
        ADD R13,R10,R15;
        ADD R13,R16,R17;
        ADD R15,R11,R12;
        ADD R17,R15,R12;
        EXIT
      Hazards annotated on the slide: RAW, WAR, and WAW (the braces linking each label to its instruction pair were lost in extraction)
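The dependences in the sample program can be recovered mechanically. The sketch below assumes the first operand of `ADD` is the destination register and the other two are sources, which the slide does not state; `Instr` and `hazards` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Register numbers of one three-operand instruction.
// Assumes the form ADD Rdest, Rsrc1, Rsrc2.
struct Instr {
    int dest;
    int src1;
    int src2;
};

// Classify the dependences between two instructions, where `earlier`
// precedes `later` in program order.
std::vector<std::string> hazards(const Instr& earlier, const Instr& later) {
    std::vector<std::string> found;
    if (later.src1 == earlier.dest || later.src2 == earlier.dest) {
        found.push_back("RAW");  // later reads what earlier writes
    }
    if (later.dest == earlier.src1 || later.dest == earlier.src2) {
        found.push_back("WAR");  // later overwrites what earlier reads
    }
    if (later.dest == earlier.dest) {
        found.push_back("WAW");  // both write the same register
    }
    return found;
}
```

Under that operand-order assumption, `ADD R0,R1,R2` followed by `ADD R4,R0,R3` is a RAW dependence on R0, the two writes to R13 form a WAW pair, and `ADD R15,R11,R12` has a WAR dependence on the earlier `ADD R13,R10,R15`.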

  16. Results: Cycles

  17. Present Implementation
      • Completely configurable simulator
      • INT ALU in working state

  18. Immediate Extension
      • Branch unit completion
      • Pipelined multiplier completion
      • LD/STORE unit completion
