
Optimization Strategies for Physical Synthesis and Timing Closure

Charles J. Alpert (IBM Corp.), Sachin Sapatnekar (University of Minnesota, ECE Dept.), Salil Raje (Hierarchical Design, Inc.)


Presentation Transcript


  1. Optimization Strategies for Physical Synthesis and Timing Closure Charles J. Alpert IBM Corp. Sachin Sapatnekar University of Minnesota ECE Dept. Salil Raje Hierarchical Design, Inc.

  2. Optimization Strategies for Physical Synthesis and Timing Closure Part One Charles J. Alpert Austin Research Laboratory IBM Research Division, Austin, TX 78758 alpert@austin.ibm.com

  3. Talk Outline • Introduction • Buffer insertion • Van Ginneken dynamic programming • Extensions • Steiner tree construction • Blockage avoidance • Wire sizing • Interconnect planning

  4. Simple Buffer Insertion Problem • Given: source and sink locations, sink capacitances and required arrival times (RATs), a buffer type, source delay rules, unit wire resistance and capacitance (Figure: source s0, a buffer, and four sinks labeled RAT1 to RAT4)

  5. Simple Buffer Insertion Problem • Find: buffer locations and a routing tree such that slack at the source is maximized (Figure: source s0 and four sinks labeled RAT1 to RAT4)

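  6. Slack Example • Slack at the source is the minimum over sinks of RAT - delay • Case 1: sinks with (RAT = 500, delay = 400) and (RAT = 400, delay = 600); slack = -200 • Case 2: sinks with (RAT = 500, delay = 350) and (RAT = 400, delay = 300); slack = +100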
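
The two slack values on this slide can be reproduced with a few lines of Python. This is only a sketch: the function name `source_slack` and the (RAT, delay) pair layout are illustrative choices, not taken from the talk.

```python
def source_slack(sinks):
    """Return the slack seen at the source.

    sinks: list of (rat, delay) pairs, one per sink, where delay is the
    source-to-sink delay. Each sink has slack rat - delay, and the
    source slack is the worst (smallest) of these.
    """
    return min(rat - delay for rat, delay in sinks)


# The two cases from the slide.
print(source_slack([(500, 400), (400, 600)]))  # -200
print(source_slack([(500, 350), (400, 300)]))  # 100
```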

  7. Elmore Delay (Figure: RC network with nodes A, B, C, resistances R1, R2, and capacitances C1, C2)
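
A minimal sketch of the Elmore delay for a simple RC chain. It assumes the figure shows A connected to B through R1 and B connected to C through R2, with C1 at node B and C2 at node C; that topology is an assumption, since the figure itself is not in the transcript. Under that assumption the delay from A to C is R1(C1 + C2) + R2·C2. The function name and the numeric values are illustrative.

```python
def elmore_chain_delay(resistances, capacitances):
    """Elmore delay from the driver to the far end of an RC chain.

    resistances[k] is the series resistance of segment k, and
    capacitances[k] is the capacitance at the node that follows it.
    Each resistance is multiplied by all capacitance downstream of it.
    """
    delay = 0.0
    downstream_cap = 0.0
    # Walk from the far end back to the driver, accumulating capacitance.
    for r, c in zip(reversed(resistances), reversed(capacitances)):
        downstream_cap += c
        delay += r * downstream_cap
    return delay


# Hypothetical values: R1=1, R2=2, C1=3, C2=4.
# R1*(C1 + C2) + R2*C2 = 1*7 + 2*4 = 15
print(elmore_chain_delay([1, 2], [3, 4]))  # 15.0
```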

  8. Common Approaches • Iteratively insert buffers • Closed-form solutions (2 pin nets) • Dynamic programming • Simultaneous constructions

  9. Talk Outline • Introduction • Buffer insertion • Van Ginneken dynamic programming • Extensions • Steiner tree construction • Blockage avoidance • Wire sizing • Interconnect planning

  10. Van Ginneken’s Classic Algorithm • Optimal for multi-sink nets • Quadratic runtime • Bottom-up from sinks to source • Generate list of candidates at each node • At source, pick the best candidate in list

  11. Key Assumptions • Given routing tree • Given potential insertion points

  12. Generating Candidates (Figure: candidates (1), (2), (3))

  13. Pruning Candidates • Both (a) and (b) “look” the same to the source • Throw out the one with the worst slack (Figure: candidates (3) and (4))

  14. Candidate Example Continued (Figure: candidates (4), (5))

  15. Candidate Example Continued • After pruning (Figure: candidate (5)) • At driver, compute which candidate maximizes slack • Result is optimal

  16. Merging Branches (Figure: left candidates, right candidates)

  17. Pruning Merged Branches (Figure labels: critical, with pruning)
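
A sketch of how two branch candidate lists might be merged, assuming each list holds non-dominated (capacitance, slack) pairs sorted by increasing capacitance and slack. A merged candidate adds the capacitances and takes the smaller slack; advancing only the branch that limits the slack keeps the merged list linear in size rather than a full cross product. The function name and the numbers are hypothetical.

```python
def merge_branches(left, right):
    """Merge two sorted, non-dominated (cap, slack) candidate lists."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        (cap_l, q_l), (cap_r, q_r) = left[i], right[j]
        # Both branches load the merge point; the worse slack is critical.
        merged.append((cap_l + cap_r, min(q_l, q_r)))
        # Only a better candidate on the critical branch can raise slack.
        if q_l <= q_r:
            i += 1
        if q_r <= q_l:
            j += 1
    return merged


left = [(5, 70), (20, 100)]
right = [(10, 80), (25, 150)]
print(merge_branches(left, right))  # [(15, 70), (30, 80), (45, 100)]
```

The merged list has at most len(left) + len(right) - 1 entries.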

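  18. Van Ginneken Example • Candidates are (capacitance, slack) pairs; the sink is (20, 400) • First wire (C=10, d=150): (20, 400) becomes (30, 250) • Buffer (C=5, d=30) on (30, 250) gives (5, 220) • Second wire (C=15): (30, 250) with d=200 becomes (45, 50); (5, 220) with d=120 becomes (20, 100) • Buffer (C=5): (45, 50) with d=50 gives (5, 0); (20, 100) with d=30 gives (5, 70)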

  19. Van Ginneken Example Cont’d • Candidate list: (45, 50), (5, 0), (20, 100), (5, 70) • (5, 0) is inferior to (5, 70); (45, 50) is inferior to (20, 100) • Wire (C=10): (20, 100) becomes (30, 10); (5, 70) becomes (15, -10) • Pick the solution with the largest slack and follow the arrows back to recover the solution
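
The pruning step in this example can be sketched in Python as follows. A candidate is dropped when another candidate has no more capacitance and no less slack. The function name `prune` is illustrative; the candidate values are the ones on the slide.

```python
def prune(candidates):
    """Keep only non-dominated (cap, slack) candidates.

    After sorting by increasing capacitance (best slack first on ties),
    a candidate survives only if its slack beats every candidate that
    has smaller or equal capacitance.
    """
    kept = []
    for cap, slack in sorted(candidates, key=lambda c: (c[0], -c[1])):
        if not kept or slack > kept[-1][1]:
            kept.append((cap, slack))
    return kept


# Candidate list from the example before pruning.
cands = [(45, 50), (5, 0), (20, 100), (5, 70)]
print(prune(cands))  # [(5, 70), (20, 100)]

# After the final wire, pick the candidate with the largest slack.
final = [(30, 10), (15, -10)]
print(max(final, key=lambda c: c[1]))  # (30, 10)
```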

  20. Van Ginneken Recap • Generate candidates from sinks to source • Quadratic runtime • Adding a buffer adds only one new candidate • Merging branches additive, not multiplicative • Optimal for Elmore delay model

  21. Talk Outline • Introduction • Buffer insertion • Van Ginneken dynamic programming • Extensions • Steiner tree construction • Blockage avoidance • Wire sizing • Interconnect planning

  22. Optimal Extensions • Multiple buffer types • Inverters • Polarity constraints • Controlling buffer resources • Capacitance constraints • Blockage recognition • Wire sizing

  23. Multiple Buffer Types • Time complexity increases from O(n^2) to O(n^2 B^2), where B is the number of different buffer types (Figure: candidates (1), (2))

  24. Inverters • Maintain a “+” and a “-” list of candidates • Only merge branches with same polarity • Throw out negative candidates at source (Figure: candidates (1), (2))

  25. Polarity Constraints • Some sinks are positive, some negative • Put negative sinks into “-” list (Figure: sinks assigned to “+” and “-” lists)

  26. Controlling Buffering Resources • Before: maintain a list of (capacitance, slack) pairs: (C1, q1), (C2, q2), (C3, q3), (C4, q4), (C5, q5), (C6, q6), (C7, q7), (C8, q8), (C9, q9) • Now: store an array of lists, indexed by # of buffers • 3 buffers: (C1, q1, 3), (C2, q2, 3), (C3, q3, 3) • 2 buffers: (C4, q4, 2), (C5, q5, 2) • 1 buffer: (C6, q6, 1), (C7, q7, 1), (C8, q8, 1) • 0 buffers: (C9, q9, 0) • Prune candidates with inferior cap, slack, and #buffers
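
One way to read the three-way pruning rule is sketched below, assuming a candidate is inferior when some other candidate has no more capacitance, no less slack, and no more buffers. The quadratic scan, the function names, and the sample values are illustrative only; the talk does not specify an implementation.

```python
def prune_with_buffer_count(candidates):
    """Drop (cap, slack, n_buffers) candidates dominated in all three."""
    unique = list(dict.fromkeys(candidates))  # drop exact duplicates
    kept = []
    for a in unique:
        dominated = any(
            b != a and b[0] <= a[0] and b[1] >= a[1] and b[2] <= a[2]
            for b in unique
        )
        if not dominated:
            kept.append(a)
    return kept


def bucket_by_buffers(candidates):
    """Group candidates into lists indexed by number of buffers."""
    buckets = {}
    for cap, slack, n_buffers in candidates:
        buckets.setdefault(n_buffers, []).append((cap, slack))
    return buckets


cands = [(10, 100, 1), (10, 90, 2), (5, 120, 2), (20, 50, 0)]
kept = prune_with_buffer_count(cands)
print(kept)  # [(10, 100, 1), (5, 120, 2), (20, 50, 0)]
print(bucket_by_buffers(kept))
```

Here (10, 90, 2) is removed because (10, 100, 1) has the same capacitance, better slack, and fewer buffers.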

  27. Buffering Resource Trade-off

  28. Capacitance Constraints • Each gate g drives at most C(g) capacitance • When inserting buffer g, check downstream capacitance • If bigger than C(g), throw out candidate (Figure: total cap = 500 fF)

  29. Blockage Recognition • Delete insertion points that run over blockages

  30. Other Extensions • Simultaneous driver sizing • Modeling effective capacitance • Higher-order interconnect delay • Slew constraints • Noise constraints

  31. Driver Sizing

  32. Driver Sizing • Driver behaves like buffer • Pick driver with the best slack • Implications upstream in timing graph • Delay penalty for large input capacitance

  33. π-Models • Van Ginneken candidate: (Cap, slack) • Replace Cap with π-model (Cn, R, Cf) • Total capacitance preserved: Cn + Cf = C • R represents degree of resistive shielding (Figure: lumped capacitance C; π-model with Cn, R, Cf)

  34. Computing Gate Delay • When inserting buffer, compute effective capacitance (Ceff) from π-model • Use effective instead of lumped capacitance in gate delay equation • Optimality no longer guaranteed

  35. Higher-order Interconnect Delay • Moment matching with first 3 moments • Previously: candidate (π-model, slack) • Now: candidate (π-model, m1, m2, m3) • Given moments, compute slack on the fly • Bottom-up, efficient moment computation • Problem: guess slew rate

  36. Slew Constraints • When inserting buffer, compute slews to gates driven by buffer • If slew exceeds target, prune candidate • Difficulty: unknown gate input slew (Figure labels: slew 300 ps, slew 350 ps, unknown slew “?”)

  37. Noise Constraints • Each gate has acceptable noise threshold • Compute cumulative noise for each wire via Devgan noise metric • Throw out candidates that violate noise • Not in production code

  38. Extensions Recap • Multiple buffer types, including inverters • Polarity constraints • Controlling buffer resources • Slew, capacitance, and noise constraints • Blockage recognition • Driver sizing • Higher-order delay modeling • Wire sizing

  39. Talk Outline • Introduction • Buffer insertion • Van Ginneken dynamic programming • Extensions • Steiner tree construction • Blockage avoidance • Wire sizing • Interconnect planning

  40. Tour of Italy Problem • Van Ginneken uses fixed Steiner route • Need timing-driven Steiner trees

  41. Timing-Driven Steiner Approaches • BRBC • Prim-Dijkstra • P-Tree • A-Tree (RSA) • SERT • MVERT

  42. Rectilinear Steiner Arborescence • Assume all sinks in first quadrant • Iteratively: • Find sink pair p and q maximizing min(xp, xq) + min(yp, yq) • Remove p and q from consideration • Replace with r = (min(xp, xq), min(yp, yq)) • Connect p and q to r
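
The iteration on this slide can be sketched as a brute-force pair search. The sketch assumes the source sits at the origin and connects the last remaining point to it; the function names, the handling of a merge point that coincides with a sink, and the sample sinks are illustrative assumptions rather than details from the talk.

```python
def rsa_heuristic(sinks):
    """Return (child, parent) edges of a rectilinear Steiner arborescence.

    sinks: (x, y) points in the first quadrant; source assumed at (0, 0).
    """
    nodes = list(dict.fromkeys(sinks))  # unique points, stable order
    edges = []
    while len(nodes) > 1:
        best = None
        # Find the pair whose merge point is farthest from the origin.
        for a in range(len(nodes)):
            for b in range(a + 1, len(nodes)):
                p, q = nodes[a], nodes[b]
                score = min(p[0], q[0]) + min(p[1], q[1])
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        p, q = nodes[a], nodes[b]
        r = (min(p[0], q[0]), min(p[1], q[1]))
        nodes = [n for k, n in enumerate(nodes) if k not in (a, b)]
        for child in (p, q):
            if child != r:  # r may coincide with p or q
                edges.append((child, r))
        if r not in nodes:
            nodes.append(r)
    if nodes and nodes[0] != (0, 0):
        edges.append((nodes[0], (0, 0)))
    return edges


def wirelength(edges):
    """Total Manhattan length of the edges."""
    return sum(abs(c[0] - p[0]) + abs(c[1] - p[1]) for c, p in edges)


edges = rsa_heuristic([(1, 3), (3, 1)])
print(edges)
print(wirelength(edges))  # 6
```

Every edge runs from a point to one that is no larger in either coordinate, so each source-to-sink path is a shortest rectilinear path.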

  43. RSA Example (Figure: six sinks numbered 1 to 6)

  44. RSA Diagonal Line Sweep (Figure: six sinks numbered 1 to 6)

  45. Prim-Dijkstra Algorithm (Figure: Prim’s MST, Dijkstra’s SPT, and the trade-off between them)

  46. Prim’s and Dijkstra’s Algorithms • d(i,j): length of the edge (i, j), with i already in the tree • p(i): length of the path from source to i • Prim: minimize d(i,j) • Dijkstra: minimize p(i) + d(i,j)

  47. The Prim-Dijkstra Trade-off • Prim: add edge minimizing d(i,j) • Dijkstra: add edge minimizing p(i) + d(i,j) • Trade-off: add edge minimizing c · p(i) + d(i,j) for 0 <= c <= 1 • When c=0, trade-off = Prim • When c=1, trade-off = Dijkstra
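
A small sketch of the trade-off, assuming Manhattan distances between points and a simple cubic-time scan over (tree node, non-tree node) pairs. The function name, the point format, and the three sample points are illustrative.

```python
def prim_dijkstra(points, c):
    """Build a spanning tree rooted at points[0].

    At each step, add the edge (i, j), with i in the tree and j outside,
    that minimizes c * p(i) + d(i, j), where p(i) is the tree path
    length from the source to i. Returns a parent index per point.
    """
    n = len(points)

    def dist(a, b):
        return abs(points[a][0] - points[b][0]) + abs(points[a][1] - points[b][1])

    parent = [None] * n
    path = [0.0] * n            # p(i) for nodes already in the tree
    in_tree = [False] * n
    in_tree[0] = True
    for _ in range(n - 1):
        best = None
        for i in range(n):
            if not in_tree[i]:
                continue
            for j in range(n):
                if in_tree[j]:
                    continue
                cost = c * path[i] + dist(i, j)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        parent[j] = i
        path[j] = path[i] + dist(i, j)
        in_tree[j] = True
    return parent


pts = [(0, 0), (3, 0), (2, 2)]   # source, then two sinks
print(prim_dijkstra(pts, 0))     # [None, 0, 1]  (Prim: shorter total wire)
print(prim_dijkstra(pts, 1))     # [None, 0, 0]  (Dijkstra: shortest paths)
```

With c=0 the second sink hangs off the first (total length 6, path length 6); with c=1 it connects directly to the source (total length 7, path length 4).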

  48. Skinny on RSA/Prim-Dijkstra • Fast, easy to implement • Converting spanning to Steiner tree easy • Ignores sink criticality • No natural decoupling opportunities • Polarity constraints problem

  49. Polarity Problem (Figure: four “+” sinks and seven “-” sinks)

  50. A Better Solution? (Figure: four “+” sinks and seven “-” sinks)
