1 / 34

Modular Refinement - 2 Arvind Computer Science & Artificial Intelligence Lab

Modular Refinement - 2 Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology. rf. pc. CPU. rf. pc. write- back. fetch. memory. execute. decode. dMem. iMem. CPU. fetch & decode. execute. bu. Successive refinement & Modular Structure.

elden
Download Presentation

Modular Refinement - 2 Arvind Computer Science & Artificial Intelligence Lab

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Modular Refinement - 2 Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology http://csg.csail.mit.edu/6.375

  2. rf pc CPU rf pc write- back fetch memory execute decode dMem iMem CPU fetch & decode execute bu Successive refinement & Modular Structure Can we derive the 5-stage pipeline by successive refinement of a 2-stage pipeline? http://csg.csail.mit.edu/6.375

  3. CPU FIFO bu dMem iMem RFile rf WB execute setPc pc stall enqIt A Modular organization • Suppose we include rf and pc in Fetch and bu in Execute • Fetch delivers decoded instructions to Execute and needs to consult Execute for the stall condition • Execute writes back data in rf and supplies the pc value in case of a branch misprediction modules call each other (recursive) fetch & decode http://csg.csail.mit.edu/6.375

  4. module mkCPU2#(Mem iMem, Mem dMem)(); Execute execute <- mkExecute(dMem, fetch); Fetch fetch <- mkFetch(iMem, execute); endmodule interface Fetch; method Action setPC (Iaddress cpc); method Action writeback (RName dst, Value v); endinterface interface Execute; method Action enqIt(InstTemplate it); method Bool stall(Instr instr) endinterface recursive calls Recursive modular organization Unfortunately, recursive module syntax is not as simple http://csg.csail.mit.edu/6.375

  5. Fetch Module module mkFetch#(IMem iMem, Execute execute) (Fetch); Instr instr = iMem.read(pc); Iaddress predIa = pc + 1; Reg#(Iaddress) pc <- mkReg(0); RegFile#(RName, Bit#(32)) rf <- mkBypassRegFile(); rule fetch_and_decode (!execute.stall(instr)); execute.enqIt(newIt(instr,rf)); pc <= predIa; endrule methodAction writeback(RName rd, Value v); rf.upd(rd,v); endmethod methodAction setPC(Iaddress newPC); pc <= newPC; endmethod endmodule http://csg.csail.mit.edu/6.375

  6. Execute Module module mkExecute#(DMem dMem, Fetch fetch) (Execute); SFIFO#(InstTemplate) bu <- mkSLoopyFifo(findf); InstTemplate it = bu.first; rule execute … methodActionenqIt(InstTemplate it); bu.enq(it); endmethod method Bool stall(Instr instr); return (stallFunc(instr, bu)); endmethod endmodule http://csg.csail.mit.edu/6.375

  7. Execute Module Rule rule execute (True); case (it) matchestagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin fetch.writeback(rd, va+vb); bu.deq(); end tagged EBz {cond:.cv,addr:.av}: if (cv == 0) then begin fetch.setPC(av);bu.clear(); end else bu.deq();tagged ELoad{dst:.rd,addr:.av}: begin fetch.writeback(rd, dMem.read(av)); bu.deq(); end tagged EStore{value:.vv,addr:.av}: begin dMem.write(av, vv); bu.deq(); end endcase endrule http://csg.csail.mit.edu/6.375

  8. Fetch Module RefinementSeparating Fetch and Decode module mkFetch#(IMem iMem, Execute execute) (Fetch); FIFO#(Instr) buD <- mkLoopyFIFO(); Instr instr = iMem.read(pc); Iaddress predIa = pc + 1; rulefetch(True); pc <= predIa; buD.enq(instr); endrule ruledecode(!execute.stall(buD.first())); execute.enqIt(newIt(buD.first(),rf)); buD.deq(); endrule methodAction writeback(RName rd, Value v); rf.upd(rd,v); endmethod methodAction setPC(Iaddress newPC); pc <= newPC; buD.clear(); endmethod endmodule http://csg.csail.mit.edu/6.375

  9. Execute Module Refinement module mkExecute#(DMem dMem, Fetch fetch) (Execute); SFIFO#(InstTemplate) bu <- mkSLoopyFifo(findf); InstTemplate it = bu.first; rule execute … methodAction enqIt(InstTemplate it); bu.enq(it); endmethod method Bool stall(Instr instr); return (stallFunc(instr, bu)); endmethod endmodule Introduce two more queues and break this rule into execute, memory operation and commit rules http://csg.csail.mit.edu/6.375

  10. Execute Module Refinement module mkExecute#(DMem dMem, Fetch fetch) (Execute); SFIFO#(InstTemplate) buE <- mkSLoopyFifo(findfE); SFIFO#(InstTemplateM) buM <- mkSLoopyFIFO(findfM); SFIFO#(InstTemplateC) buC <- mkSLoopyFIFO(findfC); ruleexecute … rulememory … rulecommit … methodAction enqIt(InstTemplate it); buE.enq(it); endmethod method Bool stall(Instr instr); return(stallFuncE(instr, buE) && stallFuncM(instr, buM) && stallFuncC(instr, buC)); endmethod endmodule http://csg.csail.mit.edu/6.375

  11. New Types for Instruction Templates typedef union tagged { struct {RName dst; Value val;} MWB; void MNop; struct {RName dst; IAddress addr;} MLoad; struct {IAddress addr; Value val} MStore; } InstTemplateMderiving(Bits, Eq); typedef union tagged { struct {RName dst; Value val;} CWB; void CNop; } InstTemplateCderiving(Bits, Eq); http://csg.csail.mit.edu/6.375

  12. The New Execute RuleDoes not do mem-ops and writebacks ruleexecute(True); case (buE.first()) matches tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin buM.enq(tagged MWB{dst:rd, val:va+vb}); buE.deq();end tagged EBz {cond:0,addr:.av}: begin fetch.setPC(av); buE.clear(); buM.enq(tagged MNop{});end tagged EBz {cond:.cv,addr:.av}: begin buM.enq(tagged MNop{}); buE.deq(); end tagged ELoad{dst:.rd,addr:.av}: begin buM.enq(tagged MLoad{dst:rd,addr:av}); buE.deq();end tagged EStore{value:.vv,addr:.av}: begin buM.enq(tagged MStore{addr:av,val:vv}); buE.deq();end endcase endrule http://csg.csail.mit.edu/6.375

  13. The Memory Rule rulememory(True); case (buM.first()) matches tagged MWB{dst:.rd,val:.v}: begin buC.enq(tagged CWB{dst:rd, val:v}); buM.deq(); end tagged MNop {}: begin buC.enq(tagged CNop); buM.deq(); end tagged MLoad{dst:.rd,addr:.av}: begin buC.enq(tagged CWB{dst:rd, val:dMem.read(av)}); buM.deq();end tagged MStore{addr:.av, val:vv}}: begin buC.enq(tagged CNop{}); dMem.write(av,vv); buM.deq();end endcase endrule http://csg.csail.mit.edu/6.375

  14. The Commit Rule rulecommit(True); case (buC.first()) matches tagged CWB{dst:.rd,val:.v}: fetch.writeback(rd,v); tagged CNop {}: noAction; endcase buC.deq(); endrule http://csg.csail.mit.edu/6.375

  15. Stall Functions • Same as before, save variation in types in Queue. http://csg.csail.mit.edu/6.375

  16. rf pc write- back memory execute decode dMem iMem CPU Synchronous Refinements fetch Realistic processor Pipelines often have synchronous pieces connected asynchronously, i.e., using FIFOs http://csg.csail.mit.edu/6.375

  17. Sync. Execute Module module mkExecute#(DMem dMem, Fetch fetch) (Execute); SFIFO#(InstTemplate) buE <- mkSLoopyFifo(findf); Reg#(Maybe#(InstTemplateM) buRegM <- mkReg(Invalid); Reg#(Maybe#(InstTemplateC) buRegC <- mkReg(Invalid); ruleexecutePipe … methodAction enqIt(InstTemplate it); buE.enq(it); endmethod method Bool stall(Instr instr); return(stallFuncE(instr, buE) && stallFuncM(instr, buRegM) && stallFuncC(instr, buRegC)); endmethod endmodule http://csg.csail.mit.edu/6.375

  18. rule sync-pipeline (True); if (inQ.notEmpty()) begin sReg1 <= Valid f1(inQ.first()); inQ.deq(); end elsesReg1 <= Invalid; case (sReg1) matches tagged Valid .sx1: sReg2 <= Valid f2(sx1); tagged Invalid: sReg2 <= Invalid; endcase case (sReg2) matches tagged Valid .sx2: outQ.enq(f3(sx2)); endcase endrule Use the following Mapping: inQ -> buE sReg1 -> buRegM sReg2 -> buRegC outQ.enq -> fetch.writeback Sync. Execute PipelineFrom Lecture 5 http://csg.csail.mit.edu/6.375

  19. ruleexecutePipe (True); if (buE.notEmpty()) begin buRegM <= Valid f1(buE.first()); buE.deq(); end elsebuRegM <= Invalid; case (buRegM) matches tagged Valid .sx1: buRegC <= Valid f2(sx1); tagged Invalid: buRegC <= Invalid; endcase case (buRegC) matches tagged Valid .sx2: fetch.writeback(f3(sx2)); endcase endrule In Processor pipelines, the stage functions require additional side effects (actions) e.g. setPC/dMem.write/etc. For clarity, we will show these actions one by one Sync. Execute Pipeline Exec Action Mem Action Commit Action http://csg.csail.mit.edu/6.375

  20. if (buE.notEmpty()) begin buRegM <= Valid f1(buE.first()); buE.deq(); end elsebuRegM <= Invalid; Execute Action case (buE.first()) matches tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin buRegM <= Valid MWB{dst:rd, val:va+vb}; buE.deq();end tagged EBz {cond:0,addr:.av}: begin fetch.setPC(av); buE.clear(); buRegM <= Valid MNop{};end tagged EBz {cond:.cv,addr:.av}: begin buRegM <= Valid MNop{};buE.deq(); end tagged ELoad{dst:.rd,addr:.av}: begin buRegM <= Valid MLoad{dst:rd,addr:av}; buE.deq();end tagged EStore{value:.vv,addr:.av}: begin buRegM <= Valid MStore{addr:av,val:vv}; buE.deq();end endcase “Extra effects” http://csg.csail.mit.edu/6.375

  21. case (buRegM) matches tagged Valid .itM: buRegC <= Valid f2(itM); tagged Invalid: buRegC <= Invalid; Memory Action case (itM) matchestagged MWB{dst:.rd,val:.v}: buRegC <= tagged CWB{dst:rd, val:v}; tagged MNop {}: buRegC <= tagged CNop {}; tagged MLoad{dst:.rd,addr:.av}: buRegC <= tagged CWB{dst:rd, val:dMem.read(av)}; tagged MStore{addr:.av, val:vv}}: begin buRegC <= tagged CNop{}; dMem.write(av,vv);end endcase “Extra effects” http://csg.csail.mit.edu/6.375

  22. case (buRegC) matches tagged Valid .itC: fetch.writeback(f3(itC)); endcase Commit Action case (itC) matchestagged CWB{dst:.rd,val:.v}: fetch.writeback(rd,v); tagged CNop {}: noAction; endcase http://csg.csail.mit.edu/6.375

  23. ruleexecutePipe (True); if (buE.notEmpty()) begin buRegM <= Valid f1(buE.first()); buE.deq(); end elsebuRegM <= Invalid; case (buRegM) matches tagged Valid .sx1: buRegC <= Valid f2(sx1); tagged Invalid: buRegC <= Invalid; endcase case (buRegC) matches tagged Valid .sx2: fetch.writeback(f3(sx2)); endcase endrule yes Sync. Execute Pipeline Execute Action; Memory Action; Commit Action; http://csg.csail.mit.edu/6.375

  24. Refining the Memory Subsystem Non-combinational memory reads http://csg.csail.mit.edu/6.375

  25. Realistic memory systems • No combinational reads: The request to read the memory is a different event from the event when the response comes back • The response to the request should be treated as an asynchronous event because its time can vary (cache misses) • Request and response events or actions have to be dealt in different rules • iMem.read(pc) or dMem.read(addr) are no longer read methods – rules using them have to be split up http://csg.csail.mit.edu/6.375

  26. ctr++ Ready (ctr > 0) Ack ctr deq deq enq Enable Data Ready Synch Mem Latency N req ctr-- peek Data Addr Request-Response Interface for Synchronous Memory interface Mem#(type addrT, type dataT); method Action req(addrT x); method Action deq(); method dataT peek(); endinterface Making a synchronous component latency- insensitive http://csg.csail.mit.edu/6.375

  27. Request-Response Memory Interface interface Mem#(type addrT, type dataT); method Action req(MemReq#(addrT,dataT) x); method Action deq(); method dataT peek(); endinterface typedef union tagged { struct {addrT addr; dataT val;} Write; struct {addrT addr;}Read; } MemReq#(type addrT, type dataT) deriving(Bits, Eq); We will assume that the memory processes read and write requests in the order received http://csg.csail.mit.edu/6.375

  28. dMem.read(a) write-back dMem execute dMem.req(a) dMem.peek/deq write-back execute dMem write-back dMem.peek/deq execute dMem.req(a) dMem Non-combinational reads http://csg.csail.mit.edu/6.375

  29. The Modified Execute Rule rule execute(True); case (buE.first()) matches tagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin buC.enq(taggedCWB{dst:rd, val:va+vb}); buE.deq();end tagged EBz {cond:0,addr:.av}: begin fetch.setPC(av); buE.clear(); buC.enq(taggedCNop{});end tagged EBz {cond:.cv,addr:.av}: begin buC.enq(taggedCNop{}); buE.deq(); end tagged ELoad{dst:.rd,addr:.av}: beginbuC.enq(taggedCLoad{dst:rd}); dMem.req(Read{addr:av}); buE.deq();end tagged EStore{value:.vv,addr:.av}: begin buC.enq(taggedCNop{}); dMem.req(Write{addr:av, val:vv}); buE.deq();end endcase endrule http://csg.csail.mit.edu/6.375

  30. The modified Commit Rule rule commit(True); case (buC.first()) matches tagged CWB{dst:.rd,val:.v}: fetch.writeback(rd,v); tagged CNop {}: noAction; tagged CLoad{dst:.rd}: begin fetch.writeback(rd,dMem.peek()); dMem.deq(); end endcase buC.deq(); endrule http://csg.csail.mit.edu/6.375

  31. Fetch Module module mkFetch#(IMem iMem, Execute execute) (Fetch); FIFO#(void)buD <- mkLoopyFIFO(); Instr instr = iMem.peek(); Iaddress predIa = pc + 1; … rulefetch(True); pc <= predIa; iMem.req(Read{addr:pc}); buD.enq(void); endrule ruledecode (!execute.stall(instr)); execute.enqIt(newIt(instr,rf)); iMem.deq(); buD.deq(); endrule method Action setPC … method Action writeback … endmodule Are any changes needed in the methods? http://csg.csail.mit.edu/6.375

  32. Does this work? Fetch Module Refinement module mkFetch#(IMem iMem, Execute execute) (Fetch); FIFO#(void) buD <- mkFIFO(); … Iaddress predIa = pc + 1; methodAction writeback(RName rd, Value v); rf.upd(rd,v); endmethod methodAction setPC(Iaddress newPC); pc <= newPC; buD.clear(); endmethod endmodule Must kill or drain the instruction fetches already in progress http://csg.csail.mit.edu/6.375

  33. Fetch Module module mkFetch#(IMem iMem, Execute execute) (Fetch) Reg#(Bool) drainingIMem <- mkReg(False); … rule fetch(!drainingIMem); pc <= predIa; iMem.req(Read{addr:pc}); buD.enq(void); endrule ruledecode(!drainingIMem && !execute.stall(instr)); iMem.deq(); buD.deq(); execute.enqIt(newIt(instr,rf)); endrule rule drain(drainingIMem); if(buD.notEmpty)begin iMem.deq(); buD.deq(); end else drainingIMem <= False; endrule methodAction setPC(Iaddress newPC); pc <= newPC; drainingIMem <= buD.notEmpty(); endmethod method Action writeback … endmodule http://csg.csail.mit.edu/6.375

  34. Power of Modular Refinement • Any Fetch-Decode module refinements we’ve shown, will work with each of the Execute module refinements we’ve shown • Some refinements require interface changes (e.g. magic memory vs. realistic memory) • The Correctness of refinements often depends on extra-linguistic properties (architectural reasoning) • It should be easy to add caches to new memory abstraction http://csg.csail.mit.edu/6.375

More Related