Buffer and FF Insertion • Slides from Charles J. Alpert, IBM Corp.
Talk Outline • Introduction • Buffer insertion • Van Ginneken dynamic programming • Extensions • Interconnect planning
Simple Buffer Insertion Problem • Given: source and sink locations, sink capacitances and required arrival times (RATs), a buffer type, source delay rules, unit wire resistance and capacitance • [Figure: source s0, a buffer, and four sinks labeled RAT1 to RAT4]
Simple Buffer Insertion Problem • Find: buffer locations and a routing tree such that slack at the source is maximized • [Figure: source s0 and four sinks labeled RAT1 to RAT4]
Slack Example • Case 1: sinks with RAT = 500, delay = 400 and RAT = 400, delay = 600; slack = -200 • Case 2: sinks with RAT = 500, delay = 350 and RAT = 400, delay = 300; slack = +100
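The two slack values on this slide are consistent with taking the worst (minimum) value of RAT minus delay over the sinks of a net. A minimal sketch of that arithmetic follows; the function name `net_slack` is hypothetical, and reading each case as a two-sink net is an assumption about the missing figure.

```python
def net_slack(sinks):
    """Slack of a net: the minimum of (RAT - delay) over its sinks."""
    return min(rat - delay for rat, delay in sinks)


# (RAT, delay) pairs from the slide, assuming each case is a two-sink net.
case_1 = [(500, 400), (400, 600)]
case_2 = [(500, 350), (400, 300)]

print(net_slack(case_1))  # -200
print(net_slack(case_2))  # 100
```

In case 1 the second sink is the critical one (400 - 600 = -200); in case 2 it is again the second sink (400 - 300 = +100).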
Elmore Delay • [Figure: RC network with nodes A, B, C, resistors R1, R2, and capacitors C1, C2]
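The figure for this slide did not survive, so the following is only a sketch under an assumed topology: an RC ladder A, R1, B, R2, C, with C1 grounded at B and C2 grounded at C. Under the Elmore model, each resistor on the path contributes its resistance times the total capacitance downstream of it, which gives delay(A to C) = R1(C1 + C2) + R2·C2. The function name `elmore_ladder` and the numeric values are hypothetical.

```python
def elmore_ladder(resistances, capacitances):
    """Elmore delay from the driver to each node of an RC ladder.

    resistances[i] is the resistor entering node i and capacitances[i] is
    the grounded capacitor at node i. Returns one delay per node.
    """
    n = len(resistances)
    # downstream[i] = capacitance at node i plus everything beyond it
    downstream = [0.0] * n
    total = 0.0
    for i in range(n - 1, -1, -1):
        total += capacitances[i]
        downstream[i] = total

    delays = []
    acc = 0.0
    for r, c_down in zip(resistances, downstream):
        acc += r * c_down          # each resistor sees all downstream capacitance
        delays.append(acc)
    return delays


# Hypothetical values in arbitrary units: R1 = 2, R2 = 3, C1 = 4, C2 = 5.
d_b, d_c = elmore_ladder([2.0, 3.0], [4.0, 5.0])
print(d_b, d_c)  # 18.0 33.0
```

Here the delay to B is R1(C1 + C2) = 2 × 9 = 18, and the delay to C adds R2·C2 = 3 × 5 = 15.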
Common Approaches • Iteratively insert buffers • Closed-form solutions (2 pin nets) • Dynamic programming • Simultaneous constructions
Van Ginneken’s Classic Algorithm • Optimal for multi-sink nets • Quadratic runtime • Bottom-up from sinks to source • Generate list of candidates at each node • At source, pick the best candidate in list
Key Assumptions • Given routing tree • Given potential insertion points
Generating Candidates • [Figure: candidate solutions at steps (1), (2), (3)]
Pruning Candidates • Both (a) and (b) “look” the same to the source • Throw out the one with the worse slack • [Figure: candidates (a) and (b) at step (3), pruned result at step (4)]
Candidate Example Continued • [Figure: steps (4) and (5)]
Candidate Example Continued • After pruning: step (5) • At driver, compute which candidate maximizes slack • Result is optimal
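The candidate steps above can be sketched for the simplest case, a single path from one sink back to the driver. This is an illustration under stated assumptions, not the code behind the slides: wire segments use the Elmore model with half the segment capacitance lumped at each end, the buffer and driver use a linear delay model (intrinsic delay plus output resistance times load), and every name and number below is hypothetical.

```python
def prune(cands):
    """Drop candidates that are dominated in (capacitance, slack)."""
    kept = []
    # Visit by increasing capacitance (ties: best slack first) and keep a
    # candidate only if its slack beats every lower-capacitance candidate.
    for cap, slack in sorted(cands, key=lambda t: (t[0], -t[1])):
        if not kept or slack > kept[-1][1]:
            kept.append((cap, slack))
    return kept


def add_wire(cands, r, c):
    """Propagate candidates across a wire segment (resistance r, capacitance c)."""
    return [(cap + c, slack - r * (c / 2 + cap)) for cap, slack in cands]


def add_buffer(cands, c_in, d_int, r_out):
    """Keep the unbuffered candidates and add the single best buffered one."""
    best = max(slack - d_int - r_out * cap for cap, slack in cands)
    return cands + [(c_in, best)]


def best_source_slack(sink_cap, sink_rat, segments, buf, r_driver):
    """Walk from the sink to the source; return the best slack at the driver."""
    cands = [(sink_cap, sink_rat)]
    for i, (r, c) in enumerate(segments):
        cands = add_wire(cands, r, c)
        if i < len(segments) - 1:          # insertion point between segments
            cands = prune(add_buffer(cands, *buf))
    return max(slack - r_driver * cap for cap, slack in cands)


# Hypothetical numbers in arbitrary units: four identical wire segments.
segments = [(1.0, 2.0)] * 4               # (resistance, capacitance) each
buf = (1.0, 2.0, 0.5)                     # (c_in, d_int, r_out)
print(best_source_slack(3.0, 100.0, segments, buf, r_driver=2.0))
```

The sketch returns only the best slack; recovering the buffer locations would require each candidate to also remember which candidate it was derived from.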
Merging Branches • [Figure: left candidates and right candidates]
Pruning Merged Branches • [Figure labels: critical; with pruning]
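One common way to merge two branches is sketched below, assuming each branch list is already pruned and sorted by increasing capacitance (and therefore increasing slack). A merged candidate loads the branch point with the sum of the two capacitances, and its slack is limited by the more critical branch. Advancing only the critical side produces at most len(left) + len(right) - 1 candidates instead of every pairing. The function name and the numbers are hypothetical.

```python
def merge_branches(left, right):
    """Merge two pruned candidate lists sorted by increasing (cap, slack)."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        (cap_l, q_l), (cap_r, q_r) = left[i], right[j]
        merged.append((cap_l + cap_r, min(q_l, q_r)))
        # Only a better candidate on the critical (smaller-slack) branch can
        # raise the merged slack, so advance that side (both on a tie).
        if q_l <= q_r:
            i += 1
        if q_r <= q_l:
            j += 1
    return merged


# Hypothetical (capacitance, slack) candidates for two branches.
left = [(5, 70), (20, 100)]
right = [(10, 80), (25, 120)]
print(merge_branches(left, right))  # [(15, 70), (30, 80), (45, 100)]
```

The pairing (5, 70) with (25, 120) is never generated: it would give (30, 70), which is dominated by (30, 80).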
Van Ginneken Example • Candidates are (capacitance, slack) pairs; the sink candidate is (20,400) • Wire C=10, d=150: (20,400) becomes (30,250) • Buffer C=5, d=30: (30,250) gives (5,220) • Wire C=15: (30,250) becomes (45,50) with d=200; (5,220) becomes (20,100) with d=120 • Buffer C=5: (45,50) gives (5,0) with d=50; (20,100) gives (5,70) with d=30 • Candidates so far: (45,50), (5,0), (20,100), (5,70)
Van Ginneken Example Cont’d • Candidates: (45,50), (5,0), (20,100), (5,70) • (5,0) is inferior to (5,70); (45,50) is inferior to (20,100) • Wire C=10: (20,100) becomes (30,10); (5,70) becomes (15,-10) • Pick solution with largest slack, follow arrows to get solution
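The pruning step in this example can be checked with a direct dominance test: a candidate is inferior when some other candidate has no more capacitance and no less slack. The sketch below uses the candidate values from the slide; the helper name `dominated` is hypothetical, and the post-wire candidates are copied from the slide rather than recomputed, since the slide does not give the wire resistance.

```python
def dominated(cand, others):
    """True if another candidate has <= capacitance and >= slack."""
    return any(
        o != cand and o[0] <= cand[0] and o[1] >= cand[1] for o in others
    )


# (capacitance, slack) candidates before pruning, from the slide.
cands = [(45, 50), (5, 0), (20, 100), (5, 70)]
survivors = [c for c in cands if not dominated(c, cands)]
print(survivors)  # [(20, 100), (5, 70)]

# After the final wire (values from the slide), pick the largest slack.
after_wire = [(30, 10), (15, -10)]
best = max(after_wire, key=lambda c: c[1])
print(best)  # (30, 10)
```

This pairwise check is quadratic in the list length; keeping the list sorted by capacitance allows a single linear pass instead.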
Van Ginneken Recap • Generate candidates from sinks to source • Quadratic runtime • Adding a buffer adds only one new candidate • Merging branches additive, not multiplicative • Optimal for Elmore delay model
Optimal Extensions • Multiple buffer types • Inverters • Polarity constraints • Controlling buffer resources • Capacitance constraints • Blockage recognition • Wire sizing
Multiple Buffer Types • Time complexity increases from O(n²) to O(n²B²), where B is the number of different buffer types • [Figure: steps (1) and (2)]
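With a buffer library, each insertion point can add one new candidate per buffer type, which is where the extra factor of B comes from. A minimal sketch, assuming a linear buffer delay model (intrinsic delay plus output resistance times load); the `Buffer` type, its field names, and the library values are all hypothetical.

```python
from typing import NamedTuple


class Buffer(NamedTuple):
    name: str
    c_in: float    # input capacitance seen upstream
    d_int: float   # intrinsic delay
    r_out: float   # output resistance


def insert_buffers(cands, library):
    """Return the candidates plus the best buffered candidate per buffer type."""
    new = []
    for b in library:
        best = max(slack - b.d_int - b.r_out * cap for cap, slack in cands)
        new.append((b.c_in, best))
    return cands + new


# Hypothetical two-buffer library applied to one (capacitance, slack) candidate.
library = [Buffer("small", 5, 30.0, 1.0), Buffer("large", 10, 20.0, 0.5)]
print(insert_buffers([(30, 250)], library))
# [(30, 250), (5, 190.0), (10, 215.0)]
```

The larger buffer gives better slack here but presents more capacitance upstream, so neither new candidate dominates the other and both survive pruning.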
Inverters • Maintain a “+” and a “-” list of candidates • Only merge branches with same polarity • Throw out negative candidates at source • [Figure: steps (1) and (2)]
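The two-list bookkeeping can be sketched as follows, assuming a “+” candidate is a subtree that expects the source's own polarity and a “-” candidate is one that expects the inverted signal. Placing an inverter in front of a candidate moves it to the opposite list. The function name, the linear inverter delay model, and the numbers are hypothetical.

```python
def insert_inverter(plus, minus, c_in, d_int, r_out):
    """Return updated (plus, minus) lists after offering an inverter here."""

    def best_slack(cands):
        # Best slack achievable by driving any of these candidates.
        return max((q - d_int - r_out * c for c, q in cands), default=None)

    from_minus = best_slack(minus)   # inverter driving a "-" subtree is "+"
    from_plus = best_slack(plus)     # inverter driving a "+" subtree is "-"
    new_plus = plus + ([(c_in, from_minus)] if from_minus is not None else [])
    new_minus = minus + ([(c_in, from_plus)] if from_plus is not None else [])
    return new_plus, new_minus


# A positive sink starts in the "+" list; the "-" list starts empty.
plus, minus = [(20, 400)], []
plus, minus = insert_inverter(plus, minus, c_in=5, d_int=30.0, r_out=1.0)
print(plus, minus)    # [(20, 400)] [(5, 350.0)]

# A second inverter upstream restores the polarity.
plus2, minus2 = insert_inverter(plus, minus, c_in=5, d_int=30.0, r_out=1.0)
print(plus2)          # [(20, 400), (5, 315.0)]
```

At the source only the “+” list is usable, which corresponds to throwing out the negative candidates.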
Polarity Constraints • Some sinks are positive, some negative • Put negative sinks into “-” list • [Figure: sinks labeled “-” list, “+” list, “-” list]
Controlling Buffering Resources • Before: maintain a list of (capacitance, slack) pairs: (C1, q1), (C2, q2), (C3, q3), (C4, q4), (C5, q5), (C6, q6), (C7, q7), (C8, q8), (C9, q9) • Now: store an array of lists, indexed by # of buffers • 3: (C1, q1, 3), (C2, q2, 3), (C3, q3, 3) • 2: (C4, q4, 2), (C5, q5, 2) • 1: (C6, q6, 1), (C7, q7, 1), (C8, q8, 1) • 0: (C9, q9, 0) • Prune candidates with inferior cap, slack, and #buffers
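A minimal sketch of the array-of-lists idea, assuming `lists[k]` holds the (capacitance, slack) candidates that use exactly k buffers and that the array length fixes the buffer budget. Inserting a buffer moves the best candidate of slot k into slot k + 1, and pruning compares all three quantities. The function names, the linear buffer delay model, and the numbers are hypothetical.

```python
def insert_buffer_counted(lists, c_in, d_int, r_out):
    """Offer one buffer here; lists[k] holds candidates using k buffers."""
    out = [list(slot) for slot in lists]
    for k in range(len(lists) - 1):        # the last slot has no budget left
        if lists[k]:
            best = max(q - d_int - r_out * c for c, q in lists[k])
            out[k + 1].append((c_in, best))
    return out


def prune_counted(lists):
    """Drop candidates that are inferior in cap, slack, and buffer count."""
    flat = [(c, q, k) for k, slot in enumerate(lists) for c, q in slot]

    def dominated(a):
        return any(
            b != a and b[0] <= a[0] and b[1] >= a[1] and b[2] <= a[2]
            for b in flat
        )

    out = [[] for _ in lists]
    for c, q, k in flat:
        if not dominated((c, q, k)):
            out[k].append((c, q))
    return out


# Budget of two buffers: slots for 0, 1, and 2 buffers.
lists = [[(30, 250)], [], []]
lists = insert_buffer_counted(lists, c_in=5, d_int=30.0, r_out=0.0)
print(lists)                                   # [[(30, 250)], [(5, 220.0)], []]
print(prune_counted([[(5, 100)], [(5, 90)], []]))  # [[(5, 100)], [], []]
```

At the source, reading the best slack from each slot gives the trade-off between slack and number of buffers used.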
Capacitance Constraints • Each gate g drives at most C(g) capacitance • When inserting buffer g, check downstream capacitance • If bigger than C(g), throw out candidate • [Figure: total cap = 500 fF]
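The check described above can be folded into the buffer insertion step: only candidates whose downstream capacitance fits within the buffer's limit may be driven by it. A minimal sketch under the same linear buffer delay assumption; the function name, the parameter `c_max` (standing in for C(g)), and the numbers are hypothetical.

```python
def insert_buffer_with_cap_limit(cands, c_in, d_int, r_out, c_max):
    """Add the best buffered candidate whose load does not exceed c_max."""
    feasible = [(c, q) for c, q in cands if c <= c_max]
    if not feasible:
        return cands                     # this buffer cannot legally go here
    best = max(q - d_int - r_out * c for c, q in feasible)
    return cands + [(c_in, best)]


# Hypothetical candidates in fF and ps; the buffer may drive at most 500 fF.
cands = [(600, 300), (400, 250)]
print(insert_buffer_with_cap_limit(cands, 5, 30.0, 0.25, c_max=500))
# [(600, 300), (400, 250), (5, 120.0)]
```

The 600 fF candidate has the better slack but is too heavy for this buffer, so the buffered candidate is built from the 400 fF one: 250 - 30 - 0.25 × 400 = 120.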
Blockage Recognition Delete insertion points that run over blockages
Other Extensions • Simultaneous driver sizing • Modeling effective capacitance • Higher-order interconnect delay • Slew constraints • Noise constraints
Driver Sizing • Driver behaves like buffer • Pick driver with the best slack • Implications upstream in timing graph • Delay penalty for large input capacitance
π-Models • Van Ginneken candidate: (Cap, slack) • Replace Cap with π-model (Cn, R, Cf) • Total capacitance preserved: Cn + Cf = C • R represents degree of resistive shielding • [Figure: capacitance C replaced by a π-network with Cn, R, Cf]
Computing Gate Delay • When inserting buffer, compute effective capacitance (Ceff) from π-model • Use effective instead of lumped capacitance in gate delay equation • Optimality no longer guaranteed
Higher-order Interconnect Delay • Moment matching with first 3 moments • Previously: candidate (π-model, slack) • Now: candidate (π-model, m1, m2, m3) • Given moments, compute slack on the fly • Bottom-up, efficient moment computation • Problem: guess slew rate
Slew Constraints • When inserting buffer, compute slews to gates driven by buffer • If slew exceeds target, prune candidate • Difficulty: unknown gate input slew • [Figure: slews of 300 ps and 350 ps, with one slew marked “?”]
Noise Constraints • Each gate has acceptable noise threshold • Compute cumulative noise for each wire via Devgan noise metric • Throw out candidates that violate noise • Not in production code
Extensions Recap • Multiple buffer types, including inverters • Polarity constraints • Controlling buffer resources • Slew, capacitance, and noise constraints • Blockage recognition • Driver sizing • Higher-order delay modeling • Wire sizing
Talk Outline • Introduction • Buffer insertion • Van Ginneken dynamic programming • Extensions • Interconnect planning
What is the Problem? • DSM timing closure • Squeeze buffers into tight spaces • Alleviate hot spots, local wire congestion • Getting worse • Handle wire congestion, buffering resources early • Acknowledge these constraints when floorplanning
Which Floorplan Is Better? • Timing analysis worthless • Interconnect synthesis, electrical correction, routing, extraction • Days to find answer
Buffer Explosion • Number of buffers triples each generation • 800K buffers in 0.05 micron technology • [Figure: past vs. present]
Buffer Block Planning • Create blocks between macros just for holding buffers • Adjust floorplan accordingly • Computing size/#/location of blocks • Analyze 2-pin nets • Find feasible regions • Assign buffers with smallest region • Combine buffers into blocks
Feasible Regions • [Figure: feasible region]
Buffer Block Planning Trade-offs • Goods • Buffer locations flexible • Global view: the most difficult buffers are assigned first • Bads • Wire congestion around blocks • Don’t have timing information • Some nets still cannot be buffered/routed