
Analytical Minimization of Signal Delay in VLSI Placement

Analytical Minimization of Signal Delay in VLSI Placement. Andrew B. Kahng and Igor L. Markov UCSD, Univ. of Michigan http://www.eecs.umich.edu/~imarkov IBM technical contact: Paul Villarrubia. Outline. Background: Global Placement for VLSI wirelength minimization delay minimization


Presentation Transcript


  1. Analytical Minimization of Signal Delay in VLSI Placement. Andrew B. Kahng and Igor L. Markov, UCSD, Univ. of Michigan. http://www.eecs.umich.edu/~imarkov IBM technical contact: Paul Villarrubia

  2. Outline • Background: Global Placement for VLSI • wirelength minimization • delay minimization • Contribution • minimization objective • “generic” minimization algorithm: outer loop and inner loop • empirical results • Futures

  3. VLSI Global Placement • Find locations for standard cells • Standard cells placed in rows, without overlap • Minimize wirelength, “routing congestion” • Minimize clock cycle • Key abstractions: • standard cells → rectangular outlines • netlist → weighted hypergraph (signal nets → hyperedges) • signal delay → function of cell locations (interconnect dominates)

  4. A VLSI Global Placement Example [Figure: a bad placement next to a good placement]

  5. Netlist Hypergraph and Timing Graph • Two signal nets: 3 pins (light blue) and 4 pins (light green) • Ovals: hyperedges • Red edges: timing graph edges

  6. Top-Down Global Placement • Placement blocks represent cells and layout area • single block at the start, driven by recursive (min-cut) bipartitioning • each pass: number of blocks doubles, size of blocks halves • end case: several cells in a tiny region, etc. • Intuition: many cells can operate in parallel • Partitioning finds “independent” groups of cells

  7. Analytical Global Placement • Find a continuous placement (locations == reals) • Efficient optimizations when nonconvex constraints are relaxed (e.g., cells are allowed to overlap) • Represent multi-pin hyperedges by sets of edges • minimize total weighted “wirelength” of all edges • Popular objectives: • Linear (Manhattan) WL = $w_{12} (|x_1 - x_2| + |y_1 - y_2|)$ • Quadratic (“squared”) WL = $w_{12} ((x_1 - x_2)^2 + (y_1 - y_2)^2)$ • Constraints: fixed vertices and/or “region constraints”
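The two objectives on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the `(i, j, weight)` edge tuples, and the clique expansion of a hyperedge are assumptions made here (the slide only says that multi-pin hyperedges are represented "by sets of edges"; other models such as stars are also common).

```python
def clique_edges(pins, weight=1.0):
    """Replace one multi-pin hyperedge by two-pin edges (clique model, an assumption)."""
    return [(a, b, weight) for k, a in enumerate(pins) for b in pins[k + 1:]]


def linear_wl(edges, pos):
    """Total weighted Manhattan wirelength: sum of w * (|xi - xj| + |yi - yj|)."""
    total = 0.0
    for i, j, w in edges:
        (xi, yi), (xj, yj) = pos[i], pos[j]
        total += w * (abs(xi - xj) + abs(yi - yj))
    return total


def quadratic_wl(edges, pos):
    """Total weighted squared wirelength: sum of w * ((xi - xj)^2 + (yi - yj)^2)."""
    total = 0.0
    for i, j, w in edges:
        (xi, yi), (xj, yj) = pos[i], pos[j]
        total += w * ((xi - xj) ** 2 + (yi - yj) ** 2)
    return total


# Hypothetical 3-pin net with made-up cell locations.
pos = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (3.0, 0.0)}
edges = clique_edges(["a", "b", "c"])
lin = linear_wl(edges, pos)
quad = quadratic_wl(edges, pos)
```

Both functions are convex in the cell coordinates, which is what makes the relaxed (overlap-allowed) problem efficient to optimize.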

  8. Analytical Placement Alone is Not Enough • Many cells overlap • Must “spread” the placement • IBM CPlace and XQ • Remove overlap (comp. geometry) • CPlace combines min-cut with analytical techniques

  9. Timing-Driven Placement • Cycle time = maximum path delay, not total path delay (!) • max(x,y,...) is not differentiable • framework: pin-based timing graph • Analytical approaches allow cell overlaps • Cell overlaps are resolved later • Main difficulty: cannot enumerate signal paths • Signal paths implicitly defined by device types • signal path sources, sinks == I/O pins and storage elements • Timing constraints also implicitly defined • “actual arrival times” (AATs) at sources • “required arrival times” (RATs) at sinks • source-sink path constraint: path delay ≤ RAT@sink - AAT@source

  10. Implicit Analysis of Path Constraints • Static Timing Analysis (STA) methodology • forward topological traversal in timing graph → AAT@every_pin • similar backward traversal → RAT@every_pin • slack@pin is given by RAT@pin - AAT@pin • negative slacks → violated timing constraints • STA-based and STA-inspired placement methods • slacks → net weights for HPWL minimization • top-down placement to maximize negative slack (Marek-Sadowska/Lin 86) • note: STA requires edge delays (e.g., from placement) • delay budgets • zero-slack (Hauge, Nair and Yoffa 86) • iterative min-max (Shragowitz et al. 90/92) • limit-bumping (Frankle 92)
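The two traversals can be sketched as follows. This is a minimal illustration, not a production timer: the function name, the dictionary-based graph representation, and the example numbers are assumptions, and the code assumes an acyclic timing graph in which every vertex without predecessors has a given AAT and every vertex without successors has a given RAT.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def static_timing_analysis(edge_delays, aat_at_sources, rat_at_sinks):
    """edge_delays: {(u, v): delay}. Returns (aat, rat, slack) keyed by vertex."""
    preds, succs = {}, {}
    for (u, v), d in edge_delays.items():
        preds.setdefault(v, []).append((u, d))
        succs.setdefault(u, []).append((v, d))
        preds.setdefault(u, [])
        succs.setdefault(v, [])

    # static_order() lists every vertex after all of its predecessors.
    dependency_map = {v: [u for u, _ in ps] for v, ps in preds.items()}
    order = list(TopologicalSorter(dependency_map).static_order())

    # Forward pass: latest arrival over all incoming edges.
    aat = {}
    for v in order:
        arrivals = [aat[u] + d for u, d in preds[v]]
        if v in aat_at_sources:
            arrivals.append(aat_at_sources[v])
        aat[v] = max(arrivals)

    # Backward pass: earliest requirement over all outgoing edges.
    rat = {}
    for v in reversed(order):
        required = [rat[s] - d for s, d in succs[v]]
        if v in rat_at_sinks:
            required.append(rat_at_sinks[v])
        rat[v] = min(required)

    slack = {v: rat[v] - aat[v] for v in order}
    return aat, rat, slack


# Hypothetical timing graph: two paths from "in" to "out".
edge_delays = {("in", "g1"): 2, ("g1", "out"): 3, ("in", "g2"): 1, ("g2", "out"): 1}
aat, rat, slack = static_timing_analysis(edge_delays, {"in": 0}, {"out": 4})
```

In this example the path through `g1` has delay 5 against a budget of 4, so every pin on it gets slack -1, while `g2` keeps positive slack. No path is ever enumerated explicitly.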

  11. Motivations For Novelty • Many promising techniques available • net reweighting • delay budgeting • others • Existing frameworks have weaknesses • speed/scalability • loss or ignorance of input information • delay budgeting algorithms tend to ignore fixed locations, obstacles • optimization of “wrong” global objectives (e.g., average wirelength)

  12. The Dimensionless Path-Timing Objective • For path $\pi$ consider edge $e \in \pi$ • Dimensionless Path-Timing Objective (DPO) $= \max_{\pi} \{ t_\pi / c_\pi \} = \max_{\pi} \{ (\sum_{e \in \pi} d_e) / c_\pi \}$ • Where • $c_\pi$ is path constraint • $t_\pi$ is path delay • $d_e = d_{ij}(x_i, y_i, x_j, y_j)$ is edge delay

  13. DPO: Properties • DPO $= \max_{\pi} \{ t_\pi / c_\pi \} = \max_{\pi} \{ (\sum_{e \in \pi} d_e) / c_\pi \}$ • DPO $\le 1$ ⇔ all timing constraints are satisfied • Convex when edge delay models are convex • Min DPO ⇔ max slack when all $c_\pi$ are equal • Max slack can be reduced to min DPO • add two new vertices: the source and the sink • connect the source to former sources • connect the sink to former sinks • use constant edge delay models

  14. Criticalities: “Multiplicative Slacks” • By analogy with slack, define criticalities • $\kappa_i = \max_{\pi \ni v} \{ t_\pi / c_\pi \}$ for vertex $v = v_i$ • $\kappa_{ij} = \max_{\pi \ni e} \{ t_\pi / c_\pi \}$ for edge $e = e_{ij}$ • Criticalities are multiplicative versions of slack • DPO and criticalities quickly computable • STA + postprocessing • Vertex criticalities → cells on critical paths • can be used by the proposed top-down timing-driven placement flow
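To make the definitions concrete, the sketch below computes the DPO and the vertex and edge criticalities by brute-force path enumeration on a tiny timing graph. This is deliberately not the fast "STA + postprocessing" computation mentioned on the slide; enumeration is exponential in general and is used here only because it mirrors the definitions directly. The function names and data layout are assumptions, and each path constraint is taken as RAT@sink - AAT@source (assumed positive).

```python
def enumerate_paths(succs, sources, sinks):
    """Yield every source-to-sink path as a list of vertices (tiny graphs only)."""
    def walk(path):
        v = path[-1]
        if v in sinks:
            yield list(path)
        for nxt in succs.get(v, ()):
            yield from walk(path + [nxt])

    for s in sources:
        yield from walk([s])


def dpo_and_criticalities(edge_delays, aat_at_sources, rat_at_sinks):
    """Return (dpo, vertex_crit, edge_crit) straight from the max-over-paths definitions."""
    succs = {}
    for (u, v) in edge_delays:
        succs.setdefault(u, []).append(v)

    dpo, vertex_crit, edge_crit = 0.0, {}, {}
    for path in enumerate_paths(succs, aat_at_sources, rat_at_sinks):
        hops = list(zip(path, path[1:]))
        t = sum(edge_delays[e] for e in hops)                      # path delay
        c = rat_at_sinks[path[-1]] - aat_at_sources[path[0]]       # path constraint
        ratio = t / c
        dpo = max(dpo, ratio)
        for v in path:
            vertex_crit[v] = max(vertex_crit.get(v, 0.0), ratio)
        for e in hops:
            edge_crit[e] = max(edge_crit.get(e, 0.0), ratio)
    return dpo, vertex_crit, edge_crit


# Hypothetical timing graph: two paths from "in" to "out", both with constraint 4.
edge_delays = {("in", "g1"): 2.0, ("g1", "out"): 3.0, ("in", "g2"): 1.0, ("g2", "out"): 1.0}
dpo, vertex_crit, edge_crit = dpo_and_criticalities(edge_delays, {"in": 0.0}, {"out": 4.0})
```

Here the path through `g1` has ratio 5/4, so the DPO exceeds 1 and the timing constraint is violated; the edges through `g2` have criticality 0.5 and are far from critical.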

  15. Generic Minimization of DPO • Reduce DPO to a simpler objective: $\max_{ij} w_{ij} d_{ij}$ • maximal weighted edge delay • use “reweighting iterations” • One reweighting iteration • assume a placement • compute edge criticalities $\kappa_{ij}$ • compute new edge weights $w_{ij}$ • minimize $\max_{ij} w_{ij} d_{ij}$ • (New weights: $w_{ij}' = \kappa_{ij} / d_{ij}$, where $\mu = \max_{ij} w_{ij} d_{ij}$)
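The weight update in one reweighting iteration can be sketched as below, reading the new weight of an edge as its criticality divided by its current delay. The inner minimization of the maximal weighted edge delay is not shown; the function names, the hard-coded criticalities, and the target bound of 1.0 are assumptions chosen for illustration.

```python
def reweight(edge_crit, edge_delays):
    """New weight per edge: criticality / current delay (delays must be positive)."""
    return {e: edge_crit[e] / edge_delays[e] for e in edge_delays}


def max_weighted_delay(weights, edge_delays):
    """The simpler objective: maximal weighted edge delay."""
    return max(weights[e] * edge_delays[e] for e in edge_delays)


def implied_budgets(weights, bound):
    """Upper bound on each edge delay implied by weight * delay <= bound."""
    return {e: bound / w for e, w in weights.items()}


# Hypothetical delays and edge criticalities for two paths that share a constraint of 4.
edge_delays = {("in", "g1"): 2.0, ("g1", "out"): 3.0, ("in", "g2"): 1.0, ("g2", "out"): 1.0}
edge_crit = {("in", "g1"): 1.25, ("g1", "out"): 1.25, ("in", "g2"): 0.5, ("g2", "out"): 0.5}

weights = reweight(edge_crit, edge_delays)
mu_now = max_weighted_delay(weights, edge_delays)  # largest criticality at the current placement
budgets = implied_budgets(weights, 1.0)            # budgets if the objective were driven down to 1
```

With these numbers, driving the objective down to 1 would cap the two edges of the critical path at 1.6 and 2.4, which sum to the path constraint of 4. This is the sense in which reweighting can be read as delay rebudgeting.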

  16. Properties of Reweighting • Theorem 1. If $\mu = \max_{ij} w_{ij} d_{ij}$ does not increase at a particular iteration, all timing constraints must be satisfied. • Theorem 2. A reweighting iteration either decreases DPO, or leaves it unchanged. • Reweighting upper-bounds $d_{ij}$ because $w_{ij} d_{ij} \le \mu$ • can interpret reweighting as delay rebudgeting • Youssef and Shragowitz used $w_{ij} = \kappa_{ij}$ (edge criticalities) in 1990/92 • [interpretation of their iterative MiniMax] • no iterations with placement: ignore fixed pad locations
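One way to see why Theorem 2 is plausible is the short argument below. It is a sketch rather than the authors' proof, and it rests on assumptions stated here: edge delays are positive, the new weights are criticality divided by current delay, primes denote quantities after the inner minimization, and the inner minimization returns a placement at least as good as the current one.

```latex
% Notation (assumed): \kappa_e = \max_{\pi \ni e} t_\pi / c_\pi, \quad w'_e = \kappa_e / d_e .
\begin{align*}
\max_e w'_e d_e &= \max_e \kappa_e = \mathrm{DPO}
  && \text{(value at the current placement)} \\
\mu' = \max_e w'_e d'_e &\le \mathrm{DPO}
  && \text{(after minimization)} \\
d'_e \le \frac{\mu' d_e}{\kappa_e} &\le \frac{\mu' d_e c_\pi}{t_\pi}
  && \text{for every path } \pi \ni e \text{, since } \kappa_e \ge t_\pi / c_\pi \\
t'_\pi = \sum_{e \in \pi} d'_e &\le \frac{\mu' c_\pi}{t_\pi} \sum_{e \in \pi} d_e = \mu' c_\pi \\
\mathrm{DPO}' = \max_\pi \frac{t'_\pi}{c_\pi} &\le \mu' \le \mathrm{DPO}
\end{align*}
```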

  17. Optimization of Maximal Edge Delay • Must consider particular edge delay models • popular choices: linear and quadratic • Theorem 3. 2-dim max edge delay can be reduced to 1-dim case with double #vertices • [“Inlined” implementation: no new graph] • linear: $\max_{km} a_{km} |t_k - t_m|$ • quadratic: $\max_{km} b_{km} (t_k - t_m)^2$ • Theorem 4. Let $b_{km} = a_{km}^2$ ⇒ minimizers coincide • Linear and quadratic WL are numerically equivalent!
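Theorem 4 can be checked numerically on a toy 1-dim instance: with quadratic coefficients equal to the squared linear coefficients, the quadratic max objective is the square of the linear one, and squaring preserves the ordering of nonnegative values. The instance below (one free vertex between two fixed vertices, searched on a grid) is a hypothetical example, not taken from the slides.

```python
def max_linear(t, fixed_edges):
    """max over edges of a * |t - p| for one free vertex at coordinate t."""
    return max(a * abs(t - p) for p, a in fixed_edges)


def max_quadratic(t, fixed_edges):
    """Same edges with b = a^2 and squared length as delay."""
    return max((a * a) * (t - p) ** 2 for p, a in fixed_edges)


# Hypothetical instance: (fixed position, linear coefficient a) for each edge.
fixed_edges = [(0.0, 2.0), (10.0, 1.0)]

# Coarse grid search over candidate locations of the free vertex.
grid = [k / 100 for k in range(1001)]
t_lin = min(grid, key=lambda t: max_linear(t, fixed_edges))
t_quad = min(grid, key=lambda t: max_quadratic(t, fixed_edges))
```

Both searches pick the same grid point, close to the exact balance point 10/3 where the two weighted edge delays are equal.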

  18. Top-Down Placement Framework • Top-down placement done in passes • In one pass • split every previously existing block • Cell-to-block assignments • viewed as region constraints • gradually refine, converge to cell locs • Assume we analytically minimized signal delay • ⇒ have cell locations ⇒ can compute edge delays • ⇒ can perform Static Timing Analysis • ⇒ know which cells lie on critical paths • Use delay-minimizing cell locs when splitting blocks

  19. Empirical Validation • We combined min-max placement with recursive min-cut bisection (Capo → CapoT) • Implemented minimization of edge delay objectives: • Length as delay • Squared length as delay • Quadratic RC delay • MST-based Elmore delay (using • Evaluated • Internal evaluators (after placement): sanity check • Industry timing analyzer • Compared to an industry placer on 4 test-cases • Won on three test-cases (by slack computed with industry STA)

  20. Results of Quadratic, Linear and Min-Max Placement

  21. Results of Quadratic, Linear and Min-Max Placement

  22. Conclusions and Ongoing Work • New timing-driven placement framework • can potentially be combined with budgeting or reweighting • expected to be successful enough on its own • leverages mincut placement • relies on a novel analytical delay minimization • Dimensionless Path-timing Objective (DPO) • novel global timing objective; generalizes slack optimization • New minimization algorithms • reweighting iteration: reduction to simpler MAX-based objective • MAX-based objective can be minimized very quickly • Ongoing work in the context of timing-driven flows

  23. Future Work • Observation (how the proposed method works) • a classic placement approach is split into stages • a new timing optimization is performed between those stages • most critical wires/gates are found first (traditionally: placement is found first) • Try other types of optimizations during placement • routing of timing-critical nets • better delay estimation • early cross-talk detection? • sizing of timing-critical drivers • buffer insertion for timing-critical nets • early detection of dangerous cross-talk • Faster and cheaper ICs
