Implementing for Correct Concurrency Nirav Dave Computer Science & Artificial Intelligence Lab


Presentation Transcript


  1. Implementing for Correct Concurrency. Nirav Dave, Computer Science & Artificial Intelligence Lab, Massachusetts Institute of Technology. http://csg.csail.mit.edu/6.375

  2. Dealing with Conflicts
     • When do conflicts arise?
     • How do we analyze them?
     • How do we fix them?
     • How do we make sure we’re okay?

  3. SFIFO

         interface SFIFO#(type t, type tr, type v);
           method Action enq(t x);        // enqueue an item
           method Action deq();           // remove oldest entry
           method t first();              // inspect oldest item
           method Action clear();         // make FIFO empty
           method Maybe#(v) find(tr r);   // search FIFO
         endinterface

     n = # of bits needed to represent the values of type “t”
     m = # of bits needed to represent the values of type “tr”
     v = # of bits needed to represent the values of type “v”

     [Diagram: SFIFO module ports: enq (enab; rdy = not full), deq (enab; rdy = not empty), first (rdy = not empty), clear (enab), find; data widths n, m, and v as defined above]

  4. Processor Example
     [Diagram: CPU pipeline with stages fetch, decode, execute, memory, write-back, plus pc, rf, iMem, and dMem]
     5-stage processor with 1-element FIFOs in between stages. Let’s add bypassing.

  5. Decode Rule

         rule decode (!newStallFunc(instr, d2eQ, e2mQ, m2wQ));
           let fetInst = f2dQ.first();
           f2dQ.deq();
           match {.ra, .rb} = getRARB(fetInst);
           // fromMaybe takes the default first, then the Maybe value
           let va0 = rf[ra];
           let va1 = fromMaybe(va0, m2wQ.find(ra));
           let va2 = fromMaybe(va1, e2mQ.find(ra));
           let vb0 = rf[rb];
           let vb1 = fromMaybe(vb0, m2wQ.find(rb));
           let vb2 = fromMaybe(vb1, e2mQ.find(rb));
           let newInst = case (fetInst) match Add: return (DAdd .va2 .vb2); … endcase;
           d2eQ.enq(newInst);
         endrule

     Search through each place in the design that may hold a newer value.
     Decode is also correct anytime it’s allowed to execute. When do we want it to execute?

  6. Some insight into concurrent rule firing
     [Diagram: rule steps Ri, Rj, Rk in the rule semantics versus clocks in the HW]
     • There are more intermediate states in the rule semantics (a state after each rule step)
     • In the HW, states change only at clock edges

  7. Parallel execution reorders reads and writes
     [Diagram: reads and writes per rule step in the rule semantics versus reads and writes per clock in the HW]
     • In the rule semantics, each rule sees (reads) the effects (writes) of previous rules
     • In the HW, rules only see the effects from previous clocks, and only affect subsequent clocks

  8. Correctness
     [Diagram: rule steps Ri, Rj, Rk in the rule semantics versus clocks in the HW]
     • Rules are allowed to fire in parallel only if the net state change is equivalent to sequential rule execution
     • Consequence: the HW can never reach a state unexpected in the rule semantics
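The correctness condition on slide 8 can be checked mechanically on a toy example. The following is a minimal Python sketch, not Bluespec: rules are modeled as functions from a state dict to a dict of writes, and the rule names (`r_inc`, `r_copy`, `r_swap_a`, `r_swap_b`) and helpers are hypothetical. It tests a single starting state only, so it illustrates the idea rather than proving equivalence.

```python
from itertools import permutations

def fire_parallel(state, rules):
    """Hardware view: every rule reads the pre-clock state; writes land together."""
    writes = {}
    for rule in rules:
        writes.update(rule(state))  # all rules see the same old state
    return {**state, **writes}

def fire_sequential(state, rules):
    """Rule semantics: each rule sees the writes of the rules before it."""
    for rule in rules:
        state = {**state, **rule(state)}
    return state

def equivalent_order(state, rules):
    """Return a sequential order matching parallel firing, or None if none exists."""
    target = fire_parallel(state, rules)
    for order in permutations(rules):
        if fire_sequential(state, order) == target:
            return [r.__name__ for r in order]
    return None

# Hypothetical rules over two registers x and y.
def r_inc(s):     # x <= x + 1
    return {"x": s["x"] + 1}

def r_copy(s):    # y <= x
    return {"y": s["x"]}

def r_swap_a(s):  # x <= y
    return {"x": s["y"]}

def r_swap_b(s):  # y <= x
    return {"y": s["x"]}

start = {"x": 1, "y": 5}
print(equivalent_order(start, [r_inc, r_copy]))
print(equivalent_order(start, [r_swap_a, r_swap_b]))
```

In this model `r_inc` and `r_copy` may fire together because the parallel result equals running `r_copy` before `r_inc`. The two swap rules have no matching sequential order, so they would have to be scheduled in different cycles.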

  9. Upshot
     • Given the concurrency of methods/rules in a system, we can determine viable schedules
     • Some variation due to applicability
     • BUT we know what schedule we want (mostly)
     • We should be able to back-propagate results to submodules

  10. Determining Concurrency Properties

  11. Processor: Concurrencies
     [Diagram: CPU pipeline with stages fetch, decode, execute, memory, write-back, plus pc, rf, iMem, and dMem]
     • In-order: F < D < E < M < W
     • Pipelined: W < M < E < D < F

  12. Concurrency requirements for Full Pipelining – Reg File
     [Diagram: CPU pipeline with stages fetch, decode, execute, memory, write-back, plus pc, rf, iMem, and dMem]
     • In-Order RF: (D calls sub) < (W calls upd)
     • Pipelined RF: (W calls upd) < (D calls sub)

  13. Concurrency requirements for Full Pipelining – FIFOs
     [Diagram: CPU pipeline with stages fetch, decode, execute, memory, write-back, plus pc, rf, iMem, and dMem]
     In-Order FIFOs:
     1. m2wQ, e2mQ: find < enq < first < deq
     2. d2eQ: find < enq < first < deq, clear
     Pipeline FIFOs:
     3. m2wQ, e2mQ: first < deq < enq < find
     4. d2eQ: first < deq < find < enq
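The FIFO orderings on slide 13 follow from the rule orderings on slide 11 once we list which methods each stage calls. The sketch below, in Python rather than Bluespec, derives them that way. The `CALLS` table is an assumption read off the pipeline picture and the decode rule: in particular, that fetch enqueues into `f2dQ` and that the stall function searches `d2eQ` with `find`. The `clear` call is left out.

```python
def method_order(rule_order, calls, fifo):
    """List the methods invoked on `fifo`, in the order the calling rules fire."""
    order = []
    for rule in rule_order:
        for queue, method in calls.get(rule, []):
            if queue == fifo and method not in order:
                order.append(method)
    return order

# Assumed method calls per stage: F(etch), D(ecode), E(xecute), M(emory), W(rite-back).
CALLS = {
    "F": [("f2dQ", "enq")],
    "D": [("f2dQ", "first"), ("f2dQ", "deq"),
          ("d2eQ", "find"), ("e2mQ", "find"), ("m2wQ", "find"),
          ("d2eQ", "enq")],
    "E": [("d2eQ", "first"), ("d2eQ", "deq"), ("e2mQ", "enq")],
    "M": [("e2mQ", "first"), ("e2mQ", "deq"), ("m2wQ", "enq")],
    "W": [("m2wQ", "first"), ("m2wQ", "deq")],
}

IN_ORDER = ["F", "D", "E", "M", "W"]
PIPELINED = ["W", "M", "E", "D", "F"]

for fifo in ("d2eQ", "e2mQ", "m2wQ"):
    print(fifo, "in-order: ", " < ".join(method_order(IN_ORDER, CALLS, fifo)))
    print(fifo, "pipelined:", " < ".join(method_order(PIPELINED, CALLS, fifo)))
```

Under these assumptions the derived orders agree with items 1 to 4 on the slide, apart from the omitted `clear`.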

  14. Constructing appropriately concurrent submodules

  15. From Analysis to Design
     • We need to create modules which behave as needed
     • Construct modules using “unsafe” primitives to have “safe” behaviors
     • Three major concepts:
       • Use primitives which remove “false” concurrency orderings (e.g. ConfigRegs vs. Regs)
       • Add RWires for forwarding values intra-cycle
       • Reason carefully to assure that execution appears “atomic”

  16. ConfigReg and RWire
     • mkReg requires that read < write
     • mkConfigReg is a Reg without this restriction
       • Allows us to read stale values (dangerous)
     • RWire is a “wire”
       • wset :: a -> Action writes a value
       • wget :: Maybe#(a) returns the written value if a write happened this cycle
       • wset happens before wget each cycle
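The RWire behavior described on slide 16 can be pictured with a small behavioral stand-in. This is a Python sketch under simplifying assumptions, not the Bluespec primitive: the `clock` method is a hypothetical name for the clock edge, and `wget` returns a `(valid, value)` pair in place of `Maybe#(a)`.

```python
class RWire:
    """Behavioral stand-in for an RWire: it holds a value only within one cycle."""

    def __init__(self):
        self._valid = False
        self._value = None

    def wset(self, value):
        """Drive the wire for the rest of the current cycle."""
        self._valid, self._value = True, value

    def wget(self):
        """Return (valid, value); valid is False if nothing was written this cycle."""
        return (self._valid, self._value)

    def clock(self):
        """Clock edge: the wire carries no state into the next cycle."""
        self._valid, self._value = False, None


wire = RWire()
print(wire.wget())   # nothing written yet in this cycle
wire.wset(7)
print(wire.wget())   # the same-cycle write is visible
wire.clock()
print(wire.wget())   # the value does not survive the clock edge
```

Because the wire is stateless across cycles, a reader only sees a value when the writer ran earlier in the same cycle, which is the `wset` before `wget` ordering the slide states.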

  17. Let’s implement some modules

  18. Processor Redux
     [Diagram: CPU pipeline with stages fetch, decode, execute, memory, write-back, plus pc, rf, iMem, and dMem]
     • In-order: F < D < E < M < W
     • Pipelined: W < M < E < D < F

  19. Concurrency: RegFile
     • The standard library RegFile is implemented with concurrency (sub < upd)
     • This handles the in-order case
     • We need to build a RegFile for the pipelined case

  20. BypassRegFile

         module mkBypassRegFile#(a l, a h)(RegFile#(a,d))
             provisos(Bits#(a,asz), Bits#(d,dsz), Eq#(a));
           RegFile#(a,d) rfInt <- mkRegFileWCF(l,h);
           RWire#(Tuple2#(a,d)) curWrite <- mkRWire();
           method Action upd(a x, d v);
             rfInt.upd(x,v);
             curWrite.wset(tuple2(x,v));
           endmethod
           method d sub(a x);
             // Bypass a write made earlier in the same cycle
             case (curWrite.wget()) matches
               tagged Valid {.wa, .wd} &&& wa == x: return wd;
               default: return rfInt.sub(x);
             endcase
           endmethod
         endmodule
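The bypass behavior on slide 20 (upd < sub within a cycle) can be exercised with a small model. This is a Python sketch, not the Bluespec module: `clock` is a hypothetical method standing in for the clock edge, registers start at 0 by assumption, and at most one `upd` per cycle is assumed.

```python
class BypassRegFile:
    """Register file in which a read sees a write made earlier in the same cycle."""

    def __init__(self, size):
        self._regs = [0] * size   # committed state (the internal register file)
        self._cur_write = None    # (addr, value) written this cycle, like curWrite

    def upd(self, addr, value):
        """Record this cycle's write; it is committed at the clock edge."""
        self._cur_write = (addr, value)

    def sub(self, addr):
        """Read addr, bypassing this cycle's write if it targets the same address."""
        if self._cur_write is not None and self._cur_write[0] == addr:
            return self._cur_write[1]
        return self._regs[addr]

    def clock(self):
        """Clock edge: commit the pending write and clear the bypass wire."""
        if self._cur_write is not None:
            addr, value = self._cur_write
            self._regs[addr] = value
        self._cur_write = None


rf = BypassRegFile(4)
rf.upd(2, 99)      # write-back writes r2 this cycle
print(rf.sub(2))   # decode, scheduled after write-back, sees the new value
print(rf.sub(1))   # other registers come from committed state
rf.clock()
print(rf.sub(2))   # after the edge the value comes from the register file itself
```

Without the bypass, a decode that runs in the same cycle as the write-back would read the stale committed value, which is why the pipelined ordering needs this module.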

  21. Processor Redux
     [Diagram: CPU pipeline with stages fetch, decode, execute, memory, write-back, plus pc, rf, iMem, and dMem]
     • In-order: F < D < E < M < W
     • Pipelined: W < M < E < D < F

  22. One Element SFIFO (Naïve)

         module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
           Reg#(t) data <- mkRegU();
           Reg#(Bool) full <- mkReg(False);
           method Action enq(t x) if (!full);
             full <= True; data <= x;
           endmethod
           method Action deq() if (full);
             full <= False;
           endmethod
           method t first() if (full);
             return data;
           endmethod
           method Maybe#(v) find(tr r);
             return full ? findf(r, data) : tagged Invalid;
           endmethod
         endmodule

     Concurrency: find < first < (enq C deq), where C marks a conflict

  23. One Element SFIFO (In-Order d2eQ #1): find < first < enq < deq

         module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
           Reg#(t) data <- mkConfigRegU();
           Reg#(Bool) full <- mkConfigReg(False);
           RWire#(t) enqv <- mkRWire();
           method Action enq(t x) if (!full);
             full <= True; data <= x; enqv.wset(x);
           endmethod
           method Action deq() if (full || isValid(enqv.wget()));
             full <= False;
           endmethod
           method t first() if (full);
             return data;
           endmethod
           method Maybe#(v) find(tr r);
             return full ? findf(r, data) : tagged Invalid;
           endmethod
         endmodule

  24. One Element SFIFO (In-Order e2mQ, m2wQ #2): find < enq < first < deq

         module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
           Reg#(t) data <- mkRegU();
           Reg#(Bool) full <- mkConfigReg(False);
           RWire#(t) enqv <- mkRWire();
           method Action enq(t x) if (!full);
             full <= True; data <= x; enqv.wset(x);
           endmethod
           method Action deq() if (full || isValid(enqv.wget()));
             full <= False;
           endmethod
           method t first() if (full || isValid(enqv.wget()));
             // Prefer a value enqueued this cycle; otherwise use the stored one
             return fromMaybe(data, enqv.wget());
           endmethod
           method Maybe#(v) find(tr r);
             return full ? findf(r, data) : tagged Invalid;
           endmethod
         endmodule

  25. One Element Searchable SFIFO (Pipelined #3): first < deq < enq < find

         module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
           Reg#(t) data <- mkConfigRegU();
           Reg#(Bool) full <- mkConfigReg(False);
           RWire#(void) deqw <- mkRWire();
           RWire#(t) enqw <- mkRWire();
           method Action enq(t x) if (!full || isValid(deqw.wget()));
             full <= True; data <= x; enqw.wset(x);
           endmethod
           method Action deq() if (full);
             full <= False; deqw.wset(?);
           endmethod
           method t first() if (full);
             return data;
           endmethod
           method Maybe#(v) find(tr r);
             // Search the stored entry unless it was dequeued this cycle,
             // otherwise search a value enqueued this cycle
             return (full && !isValid(deqw.wget())) ? findf(r, data)
                  : (isValid(enqw.wget()) ? findf(r, fromMaybe(?, enqw.wget()))
                                          : tagged Invalid);
           endmethod
         endmodule

  26. One Element Searchable SFIFO (Pipelined #4): first < deq < find < enq

         module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
           Reg#(t) data <- mkConfigRegU();
           Reg#(Bool) full <- mkConfigReg(False);
           RWire#(void) deqw <- mkRWire();
           method Action enq(t x) if (!full || isValid(deqw.wget()));
             full <= True; data <= x;
           endmethod
           method Action deq() if (full);
             full <= False; deqw.wset(?);
           endmethod
           method t first() if (full);
             return data;
           endmethod
           method Maybe#(v) find(tr r);
             return (full && !isValid(deqw.wget())) ? findf(r, data) : tagged Invalid;
           endmethod
         endmodule

  27. One Element Searchable SFIFO (Pipelined #4): first < deq < find < enq

         module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
           Reg#(t) data <- mkRegU();
           Reg#(Bool) full <- mkConfigReg(False);
           RWire#(void) deqEN <- mkRWire();
           // True if deq was called earlier in this cycle
           Bool deqp = isValid(deqEN.wget());
           method Action enq(t x) if (!full || deqp);
             full <= True; data <= x;
           endmethod
           method Action deq() if (full);
             full <= False; deqEN.wset(?);
           endmethod
           method t first() if (full);
             return data;
           endmethod
           method Maybe#(v) find(tr r);
             return (full && !deqp) ? findf(r, data) : tagged Invalid;
           endmethod
         endmodule
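The pipelined #4 ordering (first < deq < find < enq) is easier to follow when stepped through cycle by cycle. Below is a Python sketch of that behavior, not the Bluespec module: method guards are modeled as assertions, `clock` is a hypothetical clock-edge method, and `match_dest` is a made-up search function over `(dest_register, value)` entries.

```python
class PipelineSFIFO1:
    """One-element searchable FIFO with per-cycle order first < deq < find < enq."""

    def __init__(self, findf):
        self._findf = findf        # findf(key, item) -> value or None
        self._data = None
        self._full = False
        self._deq_called = False   # plays the role of the deq RWire
        self._enq_called = False
        self._enq_value = None

    def first(self):
        assert self._full, "first is only ready when the FIFO is full"
        return self._data

    def deq(self):
        assert self._full, "deq is only ready when the FIFO is full"
        self._deq_called = True

    def find(self, key):
        # An entry dequeued earlier in this cycle is no longer visible.
        if self._full and not self._deq_called:
            return self._findf(key, self._data)
        return None

    def enq(self, item):
        # A same-cycle deq frees the slot, so enq may proceed.
        assert (not self._full) or self._deq_called, "enq is not ready"
        self._enq_called = True
        self._enq_value = item

    def clock(self):
        """Clock edge: commit the register updates and clear the wires."""
        if self._enq_called:
            self._data, self._full = self._enq_value, True
        elif self._deq_called:
            self._full = False
        self._deq_called = self._enq_called = False
        self._enq_value = None


def match_dest(key, item):
    """Hypothetical search function: item is (dest_register, value)."""
    dest, value = item
    return value if dest == key else None


q = PipelineSFIFO1(match_dest)
q.enq(("r1", 10)); q.clock()        # cycle 1: fill the FIFO
head = q.first(); q.deq()           # cycle 2: consumer drains ...
print(head, q.find("r1"))           # ... so a later find no longer sees r1
q.enq(("r2", 20)); q.clock()        # ... and the producer refills in the same cycle
print(q.first())
```

The second cycle is the case that matters for full pipelining: the consumer and producer both fire, the FIFO stays full across the clock edge, and the search in between sees neither the departing nor the arriving entry.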

  28. Up-Down Counter

  29. Counter Module Interface

         interface Counter;
           method Action up();
           method Action down();
           method int _read();
         endinterface

     Concurrency: up and down should be independent

  30. Naïve Counter Example

         module mkCounter(Counter);
           Reg#(int) r <- mkReg(0);
           method int _read();
             return r;
           endmethod
           // up and down both write r, so they conflict
           method Action up();
             r <= r + 1;
           endmethod
           method Action down();
             r <= r - 1;
           endmethod
         endmodule

  31. Counter Example

         module mkCounter(Counter);
           Reg#(int) r <- mkConfigReg(0);
           RWire#(void) upW <- mkRWire();
           RWire#(void) downW <- mkRWire();
           method int _read();
             return r;
           endmethod
           method Action up();
             upW.wset(?);
           endmethod
           method Action down();
             downW.wset(?);
           endmethod
           // The only writer of r: combines both requests once per cycle
           rule updateR(True);
             r <= r + (isValid(upW.wget()) ? 1 : 0)
                    - (isValid(downW.wget()) ? 1 : 0);
           endrule
         endmodule

     What if we want to call up and then _read?
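The RWire-based counter on slide 31 can be stepped through in a few lines. This Python sketch is a behavioral model only: the class name `UpDownCounter` and the `clock` method are hypothetical, with `clock` standing in for the `updateR` rule firing at the clock edge.

```python
class UpDownCounter:
    """Counter whose up and down requests are merged once per cycle."""

    def __init__(self):
        self._r = 0
        self._up = False     # plays the role of upW
        self._down = False   # plays the role of downW

    def read(self):
        """Return the committed value; same-cycle requests are not reflected."""
        return self._r

    def up(self):
        self._up = True

    def down(self):
        self._down = True

    def clock(self):
        """Apply the net change requested this cycle, then clear the wires."""
        self._r += int(self._up) - int(self._down)
        self._up = self._down = False


c = UpDownCounter()
c.up(); c.down(); c.clock()   # both fire in one cycle without conflicting
print(c.read())
c.up()
print(c.read())               # still the old value within the same cycle
c.clock()
print(c.read())
```

The middle read is the situation the slide's closing question points at: a caller that does `up` and then `_read` in the same cycle gets the stale count, so a design that needs the new value would have to forward the wires into the read as well.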

  32. Completion Buffer

  33. Completion buffer: Interface
     [Diagram: cbuf with getToken, put (result & token), and getResult]

         interface CBuffer#(type t);
           method ActionValue#(Token) getToken();
           method Action put(Token tok, t d);
           method ActionValue#(t) getResult();
         endinterface

         typedef Bit#(TLog#(n)) TokenN#(numeric type n);
         typedef TokenN#(16) Token;

  34. IP-Lookup module with the completion buffer
     [Diagram: enter and getToken feed the RAM and fifo; a done? check either recirculates (no) or puts the result into cbuf (yes); getResult reads from cbuf]

         module mkIPLookup(IPLookup);
           rule recirculate … ;
           rule exit … ;
           method Action enter(IP ip);
             Token tok <- cbuf.getToken();
             ram.req(ip[31:16]);
             fifo.enq(tuple2(tok, ip[15:0]));
           endmethod
           method ActionValue#(Msg) getResult();
             let result <- cbuf.getResult();
             return result;
           endmethod
         endmodule

     For enter and getResult to execute simultaneously, cbuf.getToken and cbuf.getResult must execute simultaneously.

  35. IP Lookup rules with completion buffer

         rule recirculate (!isLeaf(ram.peek()));
           match {.tok, .rip} = fifo.first();
           fifo.enq(tuple2(tok, (rip << 8)));
           ram.req(ram.peek() + rip[15:8]);
           fifo.deq();
           ram.deq();
         endrule

         rule exit (isLeaf(ram.peek()));
           // put takes the token that tags this request
           match {.tok, .rip} = fifo.first();
           cbuf.put(tok, ram.peek());
           fifo.deq();
           ram.deq();
         endrule

     For rule exit and method enter to execute simultaneously, cbuf.put and cbuf.getToken must execute simultaneously.
     ⇒ For no dead cycles, cbuf.getToken, cbuf.put, and cbuf.getResult must be able to execute simultaneously.

  36. Naïve Completion Buffer

         module mkCBuffer(CBuffer#(t));
           Vector#(16, Reg#(Bool)) valids <- replicateM(mkReg(False));
           RegFile#(Token, t) data <- mkRegFileFull();
           Reg#(Token) rdP <- mkReg(0);
           Reg#(Token) wrP <- mkReg(0);
           Reg#(Token) cnt <- mkReg(0);
           method ActionValue#(Token) getToken() if (cnt < Max);
             cnt <= cnt + 1;
             rdP <= nextPointer(rdP);
             valids[rdP] <= False;
             return rdP;
           endmethod
           method Action put(Token tok, t d);
             valids[tok] <= True;
             data.upd(tok, d);
           endmethod
           method ActionValue#(t) getResult() if (valids[wrP]);
             cnt <= cnt - 1;
             wrP <= nextPointer(wrP);
             return data.sub(wrP);
           endmethod
         endmodule

  37. Completion buffer: Interface Requirements
     [Diagram: cbuf with getToken, put (result & token), and getResult]
     Rules and methods concurrency requirement to avoid dead cycles: exit < getResult < enter
     ⇒ cbuf methods’ concurrency: cbuf.getResult < cbuf.put < cbuf.getToken

  38. Completion Buffer: getResult < put < getToken

         module mkCBuffer(CBuffer#(t));
           Vector#(16, Reg#(Bool)) valids <- replicateM(mkReg(False));
           RegFile#(Token, t) data <- mkRegFileFull();
           Reg#(Token) rdP <- mkConfigReg(0);
           Reg#(Token) wrP <- mkConfigReg(0);
           Counter cnt <- mkCounter();
           method ActionValue#(Token) getToken() if (cnt < Max);
             cnt.up();
             rdP <= rdP + 1;
             valids[rdP] <= False;
             return rdP;
           endmethod
           method Action put(Token tok, t d);
             valids[tok] <= True;
             data.upd(tok, d);
           endmethod
           method ActionValue#(t) getResult() if (valids[wrP]);
             cnt.down();
             wrP <= wrP + 1;
             return data.sub(wrP);
           endmethod
         endmodule

     Is valids okay? Is the ordering correct?
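The purpose of the completion buffer, returning results in token order even when they arrive out of order, can be seen in a purely sequential model. This Python sketch ignores the intra-cycle scheduling the slides are concerned with. All names are hypothetical, guards are modeled as exceptions, and unlike the slide code it clears the valid bit in `get_result` so that a retired slot cannot be read twice.

```python
class CompletionBuffer:
    """Hands out tokens in order and returns results in that same order."""

    def __init__(self, size):
        self._size = size
        self._valid = [False] * size
        self._data = [None] * size
        self._issue = 0    # next token to hand out (rdP in the slides)
        self._retire = 0   # next slot to return (wrP in the slides)
        self._count = 0    # tokens currently outstanding (cnt in the slides)

    def get_token(self):
        if self._count == self._size:
            raise RuntimeError("no free token")
        tok = self._issue
        self._valid[tok] = False
        self._issue = (self._issue + 1) % self._size
        self._count += 1
        return tok

    def put(self, tok, result):
        """Store a result under its token; arrival order is unconstrained."""
        self._valid[tok] = True
        self._data[tok] = result

    def result_ready(self):
        return self._valid[self._retire]

    def get_result(self):
        if not self.result_ready():
            raise RuntimeError("oldest result not available yet")
        result = self._data[self._retire]
        self._valid[self._retire] = False   # assumption: not in the slide code
        self._retire = (self._retire + 1) % self._size
        self._count -= 1
        return result


cb = CompletionBuffer(4)
t0, t1 = cb.get_token(), cb.get_token()
cb.put(t1, "second")           # the younger request finishes first
print(cb.result_ready())       # the oldest result is still missing
cb.put(t0, "first")
print(cb.get_result(), cb.get_result())
```

The valid-bit handling is where this model and the slide code differ, which is one way to approach the slide's question "Is valids okay?": in the slide version a slot's valid bit stays set after `getResult` until `getToken` reissues that slot.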
