
Retiming with Interconnect and Gate Delay

Retiming with Interconnect and Gate Delay. CUHK CSE CAD Group, Dennis Tong, 29th Sept., 2003. Presentation outline: Retiming Revisit; Retiming with Interconnect Delay; Future Work; Conclusion.


Presentation Transcript


  1. Retiming with Interconnect and Gate Delay CUHK CSE CAD Group Dennis Tong 29th Sept., 2003

  2. Presentation Outline • Retiming Revisit • Retiming with Interconnect Delay • Future Work • Conclusion

  3. Retiming • Problem Formulation • given a sequential circuit G(V, E, d(v), w(e_uv)), retiming can be viewed as a vertex-to-integer mapping r: V → Z, where Z is the set of integers, such that a new circuit G’(V, E, d(v), w_r(e_uv)) is obtained, with w_r(e_uv) = w(e_uv) + r(v) − r(u) ≥ 0
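As a small illustration of the formulation above, the following Python sketch applies a retiming r to the register counts on the edges and checks that every retimed weight stays non-negative. The three-vertex cycle, its weights, and the function names `retime` and `is_legal` are invented for this example and do not come from the presentation.

```python
def retime(weights, r):
    """Return the retimed weights w_r(u, v) = w(u, v) + r(v) - r(u).

    weights: dict mapping an edge (u, v) to its register count w(u, v).
    r:       dict mapping each vertex to its integer retiming value.
    """
    return {(u, v): w + r[v] - r[u] for (u, v), w in weights.items()}


def is_legal(weights, r):
    """A retiming is legal when no edge ends up with a negative register count."""
    return all(w >= 0 for w in retime(weights, r).values())


# Hypothetical circuit: a cycle a -> b -> c -> a carrying two registers in total.
weights = {("a", "b"): 1, ("b", "c"): 0, ("c", "a"): 1}

# r(b) = -1 moves one register from the input edge of b to its output edge.
r = {"a": 0, "b": -1, "c": 0}
retimed = retime(weights, r)
```

In this example the register on (a, b) moves to (b, c), and the total number of registers around the cycle stays the same, which holds for any retiming of a cycle.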

  4. Retiming with Interconnect Delay • Two Algorithms Proposed • an optimal approach • gives optimal solution when both gate and interconnect delay are considered • a near-optimal fast approach • gives optimal solution when gate delay is neglected, but still gives near-optimal results when both delays are considered • runs much faster

  5. An Optimal Approach • Extension from the Original Paper • “Retiming Synchronous Circuitry”, Charles E. Leiserson and James B. Saxe, Algorithmica, 6:5-35, 1991. • Main Idea • transform retiming to a special case of MILP which is polynomial time solvable

  6. Near-optimal Fast Approach • gives the optimal solution when there is no gate delay • Pre-processing • replace each gate v by a wire (v1, v2) • represent the gate delay d(v) by the wire delay d(v1,v2) = d(v), with d(v1) = 0 and d(v2) = 0 • [Figure: pre-processing of gate v with delay d(v) into vertices v1 and v2]
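The pre-processing step can be sketched in Python as follows. This is only an illustration under assumptions: the data layout (a dict of gate delays and a dict of edges carrying a wire delay and a register count), the tuple names `(v, 1)` and `(v, 2)` for the two halves of gate v, and the choice of zero registers on the new internal wire are not specified in the slides.

```python
def split_gates(gate_delay, edges):
    """Replace every gate v by a wire (v, 1) -> (v, 2) whose delay is d(v).

    gate_delay: dict mapping gate v to its delay d(v).
    edges:      dict mapping (u, v) to (wire_delay, registers).
    Returns a dict in the same edge format over the split vertices,
    all of which are treated as having zero gate delay.
    """
    new_edges = {}
    for v, d in gate_delay.items():
        # The gate delay becomes the delay of an internal wire
        # (assumed here to start with no registers on it).
        new_edges[((v, 1), (v, 2))] = (d, 0)
    for (u, v), (wire_delay, registers) in edges.items():
        # Original wires now run from the output half of u to the input half of v.
        new_edges[((u, 2), (v, 1))] = (wire_delay, registers)
    return new_edges


# Hypothetical two-gate circuit: g drives h through a wire with one register.
split = split_gates({"g": 3.0, "h": 2.0}, {("g", "h"): (1.5, 1)})
```

After this transformation only wire delays remain, which is the setting in which the fast approach is stated to be optimal.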

  7. Near-optimal Fast Approach • Post-processing • remove registers that “got retimed” into the gates • use linear programming to minimize the clock period • [Figure: post-processing; a register that “got retimed” into gate v, between v1 and v2, is “removed” from the gate]

  8. Near-optimal Fast Approach • Algorithm Overview • transform G(V,E) into a DAG G’(V’,E’) • construct timing constraints • solve the set of constraints • find the optimum T_opt by binary search • post-process flip-flops that “got retimed” into gates

  9. Near-optimal Fast Approach Step 1: Transform G(V,E) into a DAG G’(V’,E’) • traverse G in a depth-first manner • break all back edges found • let V_b denote the set of vertices that have back edges (e.g., A, B ∈ V_b) • [Figure: DFS traversal of G(V,E) with vertices A, B, C, giving G’(V’,E’) with vertices A, B, C, A’, B’]
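Step 1 can be sketched in Python as a standard depth-first search that detects back edges by vertex colour. The graph below is a guess at the A, B, C example from the slide, and redirecting each back edge to a copy named by appending `'` is an assumption about how the edges are "broken", made to match the A’ and B’ vertices in the figure.

```python
def break_back_edges(graph):
    """Turn a directed graph into a DAG by redirecting DFS back edges.

    graph: dict mapping each vertex to a list of its successors.
    Returns (dag_edges, vb), where every back edge (u, v) is replaced by an
    edge (u, v') to a copy of v, and vb is the set of back-edge targets.
    """
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    color = {v: WHITE for v in graph}
    dag_edges, vb = [], set()

    def visit(u):
        color[u] = GRAY
        for v in graph[u]:
            if color[v] == GRAY:          # v is an ancestor of u: back edge
                vb.add(v)
                dag_edges.append((u, v + "'"))
            else:
                dag_edges.append((u, v))
                if color[v] == WHITE:
                    visit(v)
        color[u] = BLACK

    for start in graph:                   # also covers vertices unreachable from the first one
        if color[start] == WHITE:
            visit(start)
    return dag_edges, vb


# Assumed example circuit: A -> B -> C, A -> C, with feedback from C to A and B.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A", "B"]}
dag_edges, vb = break_back_edges(graph)
```

The recursive form keeps the sketch short; a large netlist would need an explicit stack to avoid Python's recursion limit.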

  10. Near-optimal Fast Approach Step 2: Define Timing Variable t_v • t_v, for all v ∈ V’, denotes the maximum interconnect delay from a register connecting to an input of v • in general, t_v is given by:
      t_v ≥ max_{u ∈ in(v)} { t_u + d(u,v) − (w(e_uv) + r(v) − r(u)) T }
      where in(v) is the set of vertices in V’ with an edge pointing to v in G’ • [Figure: G’ with vertices A, B, C, A’, B’; the two inputs of C have delays t_c1 and t_c2, and t_c = max { t_c1, t_c2 } = t_c1]

  11. Near-optimal Fast Approach Step 2: Construct Timing Constraints • given t_v for all v ∈ V’:
      t_v ≥ max_{u ∈ in(v)} { t_u + d(u,v) − (w(e_uv) + r(v) − r(u)) T }   (1)
      • we have the constraints:
      t_v ≤ T, ∀ v ∈ V’   (2)
      t_v’ ≤ t_v, ∀ v ∈ V_b   (3)
      r(v’) = r(v), ∀ v ∈ V_b   (4)

  12. Near-optimal Fast Approach Step 3: Solve the Set of Timing Constraints • Main Idea • express t_v for v ∈ V’ in terms of t_u and r(u), where u ∈ V_b • reduce the constraints so that they involve t_u and r(u) only • use the Bellman-Ford algorithm to solve for t_u and r(u) • derive t_v and r(v) by propagating t_u and r(u) in G’

  13. Near-optimal Fast Approach Step 3: Express t_v in terms of t_u and r(u), where u ∈ V_b • Δ_uv, for all u, v ∈ V’, denotes the maximum delay among all the directed paths from u to v in G’ when no retiming is done, reducing the delay by T for each register encountered • for example:
      Δ_AA’ = max { d_ABCA’ − 5T, d_ACA’ − 3T }
      Δ_AB’ = max { d_ABCB’ − 4T, d_ACB’ − 2T }
      • combining t_v and Δ_uv, (1) becomes:
      t_v ≥ max_{q ∈ anc(v)} { t_q + Δ_qv − (r(v) − r(q)) T }   (1’)
      where anc(v) is the set of vertices in V_b with a directed path to v in G’
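The path quantity defined on this slide (maximum delay over all directed paths from u to v in G’, reduced by T for each register on the path) is a longest-path computation on a DAG, so for a fixed T it can be evaluated in one pass over a topological order. The Python sketch below assumes Python 3.9+ for `graphlib`. The register counts are chosen so that the paths carry 5, 3, 4 and 2 registers as in the slide's example, but the wire delays and T = 2 are invented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def max_path_delay(edges, source, T):
    """Longest 'delay minus T per register' path from source to every reachable vertex.

    edges: dict mapping (u, v) to (delay, registers) in the DAG G'.
    Returns a dict v -> maximum over paths source ~> v of sum(delay - registers * T).
    """
    preds = {}
    for (u, v) in edges:
        preds.setdefault(v, set()).add(u)
        preds.setdefault(u, set())
    best = {source: 0.0}
    # static_order() yields each vertex after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        for u in preds[v]:
            if u in best:                      # u is reachable from source
                delay, registers = edges[(u, v)]
                candidate = best[u] + delay - registers * T
                if v not in best or candidate > best[v]:
                    best[v] = candidate
    return best


# Hypothetical delays; register counts give 5, 3, 4, 2 registers on the
# paths A-B-C-A', A-C-A', A-B-C-B', A-C-B' respectively.
edges = {
    ("A", "B"): (4.0, 1),
    ("B", "C"): (5.0, 2),
    ("A", "C"): (6.0, 1),
    ("C", "A'"): (3.0, 2),
    ("C", "B'"): (2.0, 1),
}
from_A = max_path_delay(edges, "A", T=2.0)
```

With these numbers the value for A’ is max { 12 − 10, 9 − 6 } = 3 and the value for B’ is max { 11 − 8, 8 − 4 } = 4.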

  14. Near-optimal Fast Approach Step 3: Reduce the constraints to involve t_u and r(u) only • given t_v for all v ∈ V’:
      t_v ≥ max_{q ∈ anc(v)} { t_q + Δ_qv − (r(v) − r(q)) T }
      • let τ_v = t_v + r(v) T for v ∈ V_b; the constraints become:
      τ_q + Δ_qv’ ≤ τ_v   (5)
      τ_q + Δ_qv ≤ τ_v   (6)
      • a system of difference inequalities
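A system of difference inequalities of the form x_q + c ≤ x_v can be solved with Bellman-Ford-style relaxation, which either finds an assignment or detects a positive cycle that makes the system infeasible. The Python sketch below is a generic solver, not the presentation's implementation: the function name, the constraint tuples, and the sample numbers are assumptions, and it does not perform the later step of splitting each solved value into a timing variable and an integer retiming value.

```python
def solve_difference_constraints(variables, constraints):
    """Solve constraints of the form x[q] + c <= x[v] by repeated relaxation.

    variables:   iterable of variable names.
    constraints: list of (q, v, c) triples, each meaning x[q] + c <= x[v].
    Returns a dict of values satisfying every constraint, or None when the
    constraints contain a positive cycle and are therefore infeasible.
    """
    variables = list(variables)
    x = {name: 0.0 for name in variables}      # acts as a virtual source at distance 0
    for _ in range(len(variables)):
        changed = False
        for q, v, c in constraints:
            if x[q] + c > x[v]:                # constraint violated: raise x[v]
                x[v] = x[q] + c
                changed = True
        if not changed:
            return x                           # every constraint already holds
    return None                                # still changing after n passes


# Feasible: the cycle A -> B -> A has total weight 2 - 3 = -1.
ok = solve_difference_constraints(["A", "B"], [("A", "B", 2), ("B", "A", -3)])

# Infeasible: the cycle A -> B -> A has total weight 2 - 1 = +1.
bad = solve_difference_constraints(["A", "B"], [("A", "B", 2), ("B", "A", -1)])
```

Because the constants depend on the clock period T, a solver like this would be called once per candidate T inside the search for the optimum.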

  15. Near-optimal Fast Approach Step 3: Derive t_v and r(v) from t_u and r(u) in G’ • solve τ_u for u ∈ V_b by the Bellman-Ford algorithm • compute t_u and r(u) given τ_u = t_u + r(u) T • compute t_v and r(v) for v ∈ V’ − V_b using (1’) Step 4: Find the optimum T_opt by binary search
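The binary search of Step 4 can be sketched in Python as below. It assumes a feasibility test that is monotone in T (if a clock period is achievable, so is every larger one) and an upper bound that is known to be feasible. The `feasible` callback, the bounds, the tolerance `eps`, and the threshold 7.25 in the usage lines are placeholders for illustration only.

```python
def find_min_period(feasible, lo, hi, eps=1e-3):
    """Binary search for the smallest feasible clock period within tolerance eps.

    feasible: callable T -> bool, assumed monotone (False below the optimum,
              True at and above it).
    lo:       a period known, or assumed, to be infeasible.
    hi:       a period known to be feasible.
    Returns a feasible period that is within eps of the optimum.
    """
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid          # mid works: the optimum is at or below mid
        else:
            lo = mid          # mid fails: the optimum is above mid
    return hi


# Stand-in for the constraint solver: pretend every period >= 7.25 is feasible.
t_opt = find_min_period(lambda T: T >= 7.25, lo=0.0, hi=16.0)
```

In the flow described on the slides, the callback would build the constraint system for the given T and report whether it has a solution.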

  16. Experimental Results - I • Testing Environment • Intel Xeon 1.8GHz, 512KB cache, 512MB RAM • ISCAS’89 benchmark suite

  17. Experimental Results - II

  18. Future Work • Multi-pin Net Handling • find a maximum sharing of flip-flops in a net while the clock period is preserved • avoid an unrealistic increase in the number of flip-flops • [Figure: a net from u to v1, v2, v3, contrasting an unrealistic increase in the number of flip-flops with a flip-flop shared in the stem]

  19. Future Work • Circuit Delay Modeling • flip-flop positions affect delay estimation • the loads in each branch affect one another • [Figure: Y retimed to X]
