
Using Trace Cache In SMT

Using Trace Cache In SMT. Huaxia Xia, June 6, 2001. Why use a trace cache in SMT? For a simultaneous multithreading machine, the bottleneck is the instruction fetch bandwidth. A trace cache is an efficient scheme to improve fetch bandwidth. What do we want to know?


Presentation Transcript


  1. Using Trace Cache In SMT. Huaxia Xia, June 6, 2001

  2. Why use a trace cache in SMT?
     • For a simultaneous multithreading machine, the bottleneck is the instruction fetch bandwidth.
     • A trace cache is an efficient scheme to improve fetch bandwidth.

  3. What do we want to know?
     • The impact of a trace cache on single-threaded and simultaneous multithreaded execution
     • The impact of different design options, such as cache associativity, cache size, and methods for dealing with unconditional branches

  4. Current progress
     New data structures:
     • trace cache in context
     • multiple branch predictor in context
     • save instruction in scoreboard
     • save branch type in scoreboard
     New procedures:
     • Fetch phase: get instructions from the trace cache, analyze and save the branch type
     • Commit phase: update the predictor and fill the trace cache

  5. Dealing with different branches
     Predictable branches:
     • Unconditional: BR, BSR, CALL_PAL
     • Conditional: FBEQ, FBLT, FBLE, FBNE, FBGE, FBGT, BLBC, BEQ, BLT, BLE, BLBS, BNE, BGE, BGT
     Unpredictable branches:
     • Indirect jump: JMP
     • Indirect call: JSR, JSR_COROUTINE
     • RET
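The grouping on this slide can be sketched as a small classifier. This is only an illustration: the enum `branch_class_t`, the function `classify_branch`, and the use of mnemonic strings are assumptions made for the example, whereas a simulator would normally switch on its own opcode enumeration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classes mirroring the three groups on the slide. */
typedef enum {
    BR_NOT_BRANCH,     /* mnemonic is not in any list below */
    BR_PRED_UNCOND,    /* predictable, unconditional */
    BR_PRED_COND,      /* predictable, conditional */
    BR_UNPREDICTABLE   /* indirect jump, indirect call, or return */
} branch_class_t;

static const char *const pred_uncond[] = { "BR", "BSR", "CALL_PAL" };
static const char *const pred_cond[] = {
    "FBEQ", "FBLT", "FBLE", "FBNE", "FBGE", "FBGT", "BLBC",
    "BEQ", "BLT", "BLE", "BLBS", "BNE", "BGE", "BGT"
};
static const char *const unpredictable[] = {
    "JMP", "JSR", "JSR_COROUTINE", "RET"
};

/* Return 1 if mnemonic m appears in list[0..n-1], else 0. */
static int in_list(const char *m, const char *const *list, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(m, list[i]) == 0)
            return 1;
    }
    return 0;
}

/* Map an upper-case Alpha mnemonic to the slide's branch group. */
branch_class_t classify_branch(const char *m)
{
    if (in_list(m, pred_uncond, sizeof pred_uncond / sizeof pred_uncond[0]))
        return BR_PRED_UNCOND;
    if (in_list(m, pred_cond, sizeof pred_cond / sizeof pred_cond[0]))
        return BR_PRED_COND;
    if (in_list(m, unpredictable, sizeof unpredictable / sizeof unpredictable[0]))
        return BR_UNPREDICTABLE;
    return BR_NOT_BRANCH;
}
```

Under this sketch a trace can continue past the first two classes, since their targets are known or predicted at fetch time, while an unpredictable branch would end the trace.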

  6. Discussion
     Large storage size for the trace cache?
     • A trace cache potentially needs more entries than a branch predictor.
     • It needs to deal with unconditional predictable branches as well as conditional branches.
     • Different branches cannot share one entry even if they have the same branch behavior.
     • Multiple threads cannot share one trace cache?

  7. Discussion
     Does a single thread or multithreading benefit more from the trace cache? Assume the trace cache size is fixed.
     • For a single thread, the issue bandwidth is small, so regular prefetching seems enough; but a trace cache can lower the miss rate.
     • For multithreading, the issue bandwidth is high; but a relatively small trace cache causes more conflicts.

  8. • if ((MD_OP_FLAGS(op) & (F_CTRL|F_UNCOND)) == (F_CTRL|F_UNCOND)): instruction is an unconditional branch (direct or indirect; these include BR, BSR, JSR, JMP, JSR_COROUTINE, RET for the Alpha AXP)
     • if ((MD_OP_FLAGS(op) & (F_CTRL|F_COND)) == (F_CTRL|F_COND)): instruction is a conditional branch (direct or indirect; these include the integer and FP conditional branches for the Alpha AXP)
     • if ((MD_OP_FLAGS(op) & (F_CTRL|F_DIRJMP)) == (F_CTRL|F_DIRJMP)): instruction is a direct branch (conditional or unconditional; these include the integer and FP conditional branches, BR and BSR for the Alpha AXP)
     • if ((MD_OP_FLAGS(op) & (F_CTRL|F_INDIRJMP)) == (F_CTRL|F_INDIRJMP)): instruction is an indirect branch (conditional or unconditional; these include JSR, JMP, JSR_COROUTINE, RET for the Alpha AXP)
     • if ((MD_OP_FLAGS(op) & (F_CTRL|F_CALL)) == (F_CTRL|F_CALL)): instruction is a procedure call (JSR, BSR for the Alpha AXP)
     • if ((MD_OP_FLAGS(op) & (F_CTRL|F_FPCOND)) == (F_CTRL|F_FPCOND)): instruction is an FP conditional branch
     Each test compares the masked flags against the full mask, because a bare & with an OR-ed mask is nonzero when either flag is set and would therefore accept every control instruction.
     If any of the conditions is false, please let me know the correct one.
     Predictable: Uncond_pred: BR and BSR; FBEQ, FBLT, FBLE, FBNE, FBGE, FBGT, BLBC, BEQ, BLT, BLE, BLBS, BNE, BGE, BGT; CALL_PAL
     Unpredictable (unpred): JSR, JMP, JSR_COROUTINE, RET
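One point worth checking in flag tests of this kind is the difference between "any of these bits" and "all of these bits". The sketch below uses made-up flag values (the real ones come from the simulator's headers) and two hypothetical helpers, `has_any` and `has_all`, to show why requiring both F_CTRL and a second flag needs a comparison against the whole mask.

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration only; the simulator
 * defines its own constants for these flag names. */
enum {
    F_CTRL     = 0x01,  /* control-transfer instruction */
    F_UNCOND   = 0x02,  /* unconditional */
    F_COND     = 0x04,  /* conditional */
    F_DIRJMP   = 0x08,  /* direct target */
    F_INDIRJMP = 0x10   /* indirect target */
};

/* True when at least one bit of mask is set in flags. */
bool has_any(unsigned flags, unsigned mask)
{
    return (flags & mask) != 0;
}

/* True only when every bit of mask is set in flags. */
bool has_all(unsigned flags, unsigned mask)
{
    return (flags & mask) == mask;
}
```

With these values, a conditional direct branch carrying F_CTRL|F_COND|F_DIRJMP passes `has_any(flags, F_CTRL|F_UNCOND)` through its F_CTRL bit alone, but fails `has_all(flags, F_CTRL|F_UNCOND)`, which is the behavior an "is unconditional branch" test needs.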

  9. beq            Branch if Equal to Zero
     bne            Branch if Not Equal to Zero
     blt            Branch if Less Than Zero
     ble            Branch if Less Than or Equal to Zero
     bgt            Branch if Greater Than Zero
     bge            Branch if Greater Than or Equal to Zero
     blbc           Branch if Low Bit is Clear
     blbs           Branch if Low Bit is Set
     br             Branch Always
     bsr            Branch to Subroutine
     jmp            Jump
     jsr            Jump to Subroutine
     ret            Return from Subroutine
     jsr_coroutine  Jump to Subroutine Return

  10. Data structure of trace cache
      typedef struct TraceCache {
          address_t tag;                /* the address of the first branch */
          unsigned char branchcount;    /* # of branches in this trace line */
          unsigned char branchpred;     /* prediction for the branches */
          unsigned char instrcount;     /* # of instructions */
          unsigned char blockindex[3];  /* index of the basic blocks, bi[0]==0 */
          address_t addr[3];            /* the starting addresses of the basic blocks */
          instruction_t instr[16];      /* the instructions of the trace */
      } TraceCache;
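A minimal sketch of how such a line might be looked up at fetch and filled at commit follows. Everything beyond the struct fields is an assumption made for the example: the widths chosen for `address_t` and `instruction_t`, the direct-mapped organization with `TC_LINES` entries, treating `instrcount == 0` as "invalid", and comparing only the low `branchcount` bits of the prediction. The slides do not specify any of these.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t address_t;      /* assumption: 64-bit addresses */
typedef uint32_t instruction_t;  /* assumption: 32-bit encoded instruction */

typedef struct TraceCache {
    address_t tag;                /* address that identifies the trace */
    unsigned char branchcount;    /* # of branches in this trace line */
    unsigned char branchpred;     /* one prediction bit per branch */
    unsigned char instrcount;     /* # of instructions; 0 = empty here */
    unsigned char blockindex[3];  /* index of the basic blocks */
    address_t addr[3];            /* starting addresses of the blocks */
    instruction_t instr[16];      /* instructions of the trace */
} TraceCache;

#define TC_LINES 64  /* hypothetical size, direct-mapped */

static TraceCache tc[TC_LINES];  /* zero-initialized: all lines empty */

/* Drop the two low bits (4-byte instructions), then wrap. */
static size_t tc_index(address_t a)
{
    return (size_t)((a >> 2) % TC_LINES);
}

/* Fetch side: return the line on a hit, NULL on a miss. A hit needs a
 * valid line, a matching tag, and matching predictions for the
 * branches the line actually contains. */
const TraceCache *tc_lookup(address_t tag_addr, unsigned char pred)
{
    const TraceCache *line = &tc[tc_index(tag_addr)];
    unsigned char mask;

    if (line->instrcount == 0 || line->tag != tag_addr)
        return NULL;
    mask = (unsigned char)((1u << line->branchcount) - 1u);
    if ((line->branchpred & mask) != (pred & mask))
        return NULL;
    return line;
}

/* Commit side: install a completed trace, replacing whatever
 * occupied its slot. */
void tc_fill(const TraceCache *trace)
{
    tc[tc_index(trace->tag)] = *trace;
}
```

A direct-mapped array keeps the sketch short; associativity, one of the design options the deck lists as an open question, would replace the single slot with a small set searched by tag.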
