1 / 33

Hasan Arslan and Shantanu Dutt Electrical & Computer Eng. University of Illinois at Chicago

Efficient Timing-Driven Incremental Routing for VLSI Circuits Using DFS and Localized Slack-Satisfaction Computations. Hasan Arslan and Shantanu Dutt Electrical & Computer Eng. University of Illinois at Chicago DATE 2006. Outline. Introduction Importance of Incremental Routing

blythe
Download Presentation

Hasan Arslan and Shantanu Dutt Electrical & Computer Eng. University of Illinois at Chicago

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Efficient Timing-Driven Incremental Routing for VLSI Circuits UsingDFS and Localized Slack-Satisfaction Computations Hasan Arslan and Shantanu Dutt Electrical & Computer Eng. University of Illinois at Chicago DATE 2006

  2. Outline • Introduction • Importance of Incremental Routing • Previous Work • Our Goals • A DFS-Based TD Incr Routing Algorithm (TIDE) • Previous work on global TD routing • Global Routing with slack tolerance concepts • Detailed Routing with DFS-based B&R • Experimental Results • Conclusion

  3. Incremental Routing • After chip layout is completed • Time/noise/thermal violation • One or more optimization metrics (speed/power/area) unsatisfactory • Need correcting changes to the circuit/system • Engineering Change Order (ECO) process • Enormous resources and time already spent • Time to meet market requirements • Most ECOs lead to a requirement of routing changes after various design changes at earlier levels • The ECO could also be at the routing level • Incremental routing & interconnects critical • Need a time-efficient & effective TD-incremental routing algorithm

  4. Incremental Routing (Cont.) • Incremental Routing Problem • Set of existing routed nets R = E – D, E = original nets before ECO, D = deleted nets • Set of new nets S(resulting from correcting re-synthesis at different levels of the VLSI design flow) • Quality metrics • Time-efficient near-optimal incr solns for S subject to given constraints (slack satisfaction, crosstalk bounding, etc.) • Minimal changes to previous routing results • Complete incr routing in the available metal layers (if such a soln exists)

  5. Prior Work on Incremental Routing • 1)Emmert and Bhatia, “Incremental Routing in FPGA”,IEEE Int. ASIC Conference, 1998. • 2)Cong and Sarrafzadeh, “Incremental Physical Design”,ISPD 2000. • 3)Dutt, Shanmugavel and Trimberger, “Efficient Incremental Rerouting for Fault Reconfiguration in FPGAs”,ICCAD 1999. • 4)Dutt, Verma and Arslan“A Search-Based Bump and Refit Approach to Incremental Routing for ECO Applications in FPGAs”,TODAES 2002 • 5)Xiang, Chao, Wong “An ECO Algorithm for Eliminating Crossalk Violations”, ISPD 2004 • 6) S. Raman, et al., “A Timing-Constrained Incremental Routing Algorithm for Symmetrical FPGAs “, DATE 1996 • 7) H. Arslan and S. Dutt, “A Depth-First Search Controlled Gridless incremental routing Algorithm for VLSI Circuits”, ICCD 2004. • No work on TD incremental routing for ASICs

  6. Prior Work (Cont.) Emmert-Bhatia (ASIC’98) • Nets connected to faulty PLB, deleted and rerouted • Standard single-net routing mode (global then detailed) • Do not perturb or move existing nets Cong-Sarrafzadeh (ISPD’00) • Standard Net Routing : Route new nets without perturbing existing nets • Rip & Reroute : If some nets cannot be routed, rip-up “blocking” existing nets. Reroute the ripped up nets.

  7. Our Goals • TD-incremental routing for VLSI (ASIC) circuits • Address quality metrics of incr. routing and satisfy constraints • Satisfy slack constraints on new and existing nets that may be affected by new net routing • Fast near-WL,via-optimal incr solutions • Min. changes to existing routing-bounded WL, via increase • Complete incr routing in the available metal layers—aggressive exploration of routing space within above constraints

  8. Adj -via Global Routing of new net based on: 1) A new iterative slack-satisfaction algorithm IntAl for connecting next pin on the net based on local slack tolerances 2) Congestion + WL + efficiency-based min-cost on a grid-graph for each 2-pin path n j n 3 n 2 Adj -via Bumped seg. • Detailed routing of new net based on: • WL + via + bumping (degree of bumped • net) min-cost path for each global route • 2) Constraint-satisfying DFS-based partial- • B&R process for “overlapped” or “bumped” • existing nets so that: a) slacks not violated, • b) WL-increment bounded n 2 n 1 High-Level Approach V2 V0 Vi V2 Vj Vm V1

  9. Our Approach – TD Global Routing • In an iterative connections of pins on routing tree, most imp. • Q: Where to connect the next pin for slack-satisfaction of all • pins and min-WL? v0 • Simplified rule of thumb: • Closer is the connection to CC, more interconnect sharing there is w/ partial tree T & less additional delay seen by other sinks. But more “baggage” (accumulated delay) for new pin vu. • Higher up and away from CC the connection is, lesser is the accumulated delay of T seen by vu, but less sharing and more delay seen by other sinks due to more wire-cap load v1 vi (closest connection) CC vu v2 New pin v3 • Classic Prim-Dijkstra tradeoffs discussed in [Alpert et al., TCAD’95] • Optimal solution (even just slack-satisfying soln. somewhere in between • No one has solved it exactly (satisfying all slacks) or optimally (w/ min-WL) • We provide a near-min-WL all-slack-satisfying solution here

  10. delay New pin vi CC lx Various Approaches to TD Global Routing • In [Boese, et al., TCAD-95] (SERT/ERT algorithm): • The delay on any sink is a concave function of lxthe distance from CC of the connection point • Same for weighted sum of all sink delays (obj. func)—min. @ either vi or CC • Choose vi or CC based on min-WL v0 v1 vi vx lx CC vu v2 v3 • Does not solve the core TD problem of slack satisfaction

  11. Max envelope 0 delay - slack vk vi lx CC New pin Optimal slack- satisfying conn. point • Time complex— , where k = # sink pins (our analysis) • Misses some slack satisfaction solutions from initial SERT/ERT handoff Various Approaches to TD Global Routing (contd) • In [Hou & Sapatnekar, ISPD’98] (MVERT): • Constraint satisfaction of all sinks vk is explicitly considered: d(vk) - slack(vk) <= 0 (LHS is also concave); uses non-Hanan points • The technique involves navigating max[slack(vk)– d(vk)] via intersection points of the various concave curves • Use binary search to find min-WL point for constr. satisfaction v0 v1 vi vx lx CC vu v2 v3

  12. New pin Various Approaches to TD Global Routing (contd) • As our competitor, we consider a mix of SERT [Boese, TCAD’95] and SOAR [Wang and Kuh, MCMC’97] (SERT/SOAR): • Check if connection to CC satisfies all slacks • Else make a connection to driver v0 • Rationale: • If connection to CC violates slack to vu, then this is most likely due to the “baggage” delay of shared interconnects. • This can be avoided maximally by routing directly to vo • Classical Prim-Dijkstra tradeoff • Fast v0 v1 vi vx lx CC vu v2 v3

  13. Derived tolerances New pin Our Approach to TD Global Routing • Exact slack satisfaction of all sinks in “constant” time by checking satisfaction of derived slacks (called tolerances) as a function of lx of • only 3 classes of sinks: • vu • sinks in T(CC), where T(u) is routing subtree rooted at u (e.g., v3) • sinks below T(vi)/T(CC) (e.g., v1, v2) • For this we need tolerance concepts discussed next v0 v1 vi vx lx CC vu v2 v3

  14. v0 vi lij vj Cdnj Elmore Delay Model • D(vj) = D(vi) + (r.c.l2ij)/2 + r.lij.Cdnj • Cdnj = gate + wiring capacitance of • subtree rooted at vj • If vjis a sink pin, Cdnj = Cg(vj) • [gate cap of vj] • r, c = unit wire cap, res • Has good fidelity

  15. Tolerance Concepts—Delay Tolerance

  16. Tolerance Concepts—Capacitance Tolerances = upstream res. from vo To vj Min

  17. delay Properties x Theorem 1: The tree truncation method in the IntAl algorithm will always find a slack-satisfying connection point for new pin nearest valid edge, if one exists, of the partial routing tree T Updates of tolerances done on an as-needed basis. E.g., change in delay at a sink pin due to re-routing is only propagated to ancestors’ tolerances. Later when a node’s tolerance is needed, it may not be updated but this is accomplished by scanning all its ancestors. Q(h) time complexity per re-routing of T, h is T’s height Concave function f(x)= -r.c.x2+bx+d <=0 g(x)= -r.c.x2+ex+m <= 0 h(x) = c.x + l <= 0 IntAl Algorithm Global Routing: Connecting a New Pin—Interval Intersection (IntAl) Algorithm Set3 V0 V6 Vi V1 V’ V5 Intersection x Vu Tolcap(v’) = Tolcap(vj).Rup(vj)/Rup(v’) CC Optimal point Δcap Set1 Set2 Vj • Determine valid intervals for each inequality • If all intervals are non-empty, take intersection • If intersection non-empty • then take bottom point of intersection as min-WL slack-satisfaction point • else prune tree and repeat process with next nearest pruned-tree branch V2 V3 V4

  18. Adj -via n j n 3 n 2 Adj -via Bumped seg. • Detailed routing of new net based on: • WL + via + bumping min-cost path for • each global route • 2) Constraint-satisfying DFS-based partial- • B&R process for “overlapped” or “bumped” • existing nets so that: a) slacks not violated, • b) WL-increment bounded n 2 n 1 Timing-Driven Incremental Detailed Routing • If a portion of net ni is overlapped • Length of overlapped portion might be increased. • Increase the capacitance. • Slack of sink pins might be violated. • Possible overlapping: • With leaf interconnect • Interior edge • Steiner point Our goal: Satisfy source to sink delay requirement for all sink pins (slack) Slack:The amount of delay can be added on a net connecting the sink pin without increasing the maximum delay requirement of that sink pin.

  19. Self Test: Downstream Test: Upstream Test n2 n2 Δcap TD Incr. Detailed Routing—Overlapping a Leaf Interconnect n1 V0 T V6 Vi V1 V5 Vj V4 V2

  20. i Δcap n2 n2 DD(v’j) DD(vm) DD(v2) Do these checks only TD Incr. Detailed Routing—Overlapping an Interior Interconnect n1 Upstream Test: Downstream Test: V0 T Vi V6 • DD(v’j) <= Toldel(v’j) = Toldel(vj) • For each child vk (sink pin or Steiner node): • DD(vk) <= Toldel(vk) • [e.g., DD(v2) <= Toldel(v2), • DD(vm) <= Toldel(vm) ] V’j Vj V’’j Vm V2 V4 V3 V5

  21. Vi Vi Vi TD Incr. Detailed Routing—Overlapping Steiner Point n1 n1 n1 Vn Vn Vn T T T V5 V5 V5 (a) (b) (c) (b) moving vj upwards will increase capacitance and resistance (c) moving vj downwards decrease cap. change Steiner node n2 n2 n2 V’j Δcap Δcap Vj Vk Vj V’k V’j n2 n2 n2 V1 V2 V3 V4 V1 V2 V3 V4 V1 V2 V3 V4

  22. nj n1.b-seg P1 P2 n2..h2 n2.pin or obs n2.h2 n1..b-seg Constraint Satisfying DFS-Controlled Routing with Partial B&R • Adapted from [Arslan & Dutt, ICCD’04] n3 nj n3..v1 n2 nj n1..b-seg n3..h1 n2..h1 n2..h2 n1 n1..b-seg Pi= i-via path is explored • Exploring a richer solution space via • partial bump-&-reroute (B&R) of existing nets • Constraint of minimal effect on B&R’ed • nets need to be satisfied: slack staisfaction, min-WL • DFS retractions: • pin or logic as obstacles • ancestor nets bumped • slack violation of current net • WL of currently re-routed net excessive • other constraints

  23. nj n1.b-seg P1 P2 n2.pin or obs n2.h2 P1 n3.v1 DFS-Controlled Routing with Partial B&R n3 nj n3..v1 n2 nj n2..h2 n3..h1 n2..h1 n2..h2 n1 n1..b-seg Pi= i-via path is explored

  24. n3..v1 nj n1.b-seg P1 P2 n3..h1 n2.pin or obs n2.h2 P1 P1 n3.v1 obs P1 P2-P4 anc obs or anc.n1 or anc.nj DFS-Controlled Routing with Partial B&R n3 nj n3..v1 n2 nj n2..h2 n2..h1 n2..h2 n1 n1..b-seg Pi= i-via path is explored

  25. n3..v1 nj n1.b-seg P1 P2 n3..h1 n2.pin or obs n2.h2 P2-P3 P1 P1 n3.v1 n3.h1 P1 obs P1 P2-P4 P2-P4 obs anc obs or anc.n1 or anc.nj obs or anc.n1 or anc.nj DFS-Controlled Routing with Partial B&R n3 nj n3..v1 n2 nj n2..h1 n2..h2 n2..h2 n1 n1..b-seg Pi= i-via path is explored

  26. n3..v1 nj P2 n1.b-seg n2.h1 P1 P2 n3..h1 P1 n2.pin or obs n2.h2 obs P2-P3 P1 P1 n3.v1 n3.h1 P1 obs P1 P2-P4 P2-P4 obs anc obs or anc.n1 or anc.nj obs or anc.n1 or anc.nj DFS-Controlled Routing with Partial B&R n3 nj n3..v1 n2 nj n1..b-seg n2..h1 n2..h2 n1 n1..b-seg Pi= i-via path is explored

  27. n3..v1 nj P2 n1.b-seg n2.h1 P1 P2 n3..h1 P2 P1 n2.pin or obs n2.h2 VGL obs P2-P3 P1 P1 n3.v1 n3.h1 P1 obs P1 P2-P4 P2-P4 obs anc obs or anc.n1 or an.nj obs or anc.n1 or anc.nj DFS-Controlled Routing with Partial B&R n3 nj n3..v1 n2 nj n2..h1 n2..h2 n1 Pi= i-via path is explored • net lenghts and # of vias of all • modified/bumped nets unchanged

  28. Benchmark Circuits • Benchmarks: • 10 circuits ranging from 1643 to 10435 nets & 7200 to 47520 pins • Base 2x2 tile of Mcc1 bench. is repl. with diff. cell sizes and diff. # of pins • Nets randomly generated & routed using SERT/SOAR • Net distribution: 2-pin: 30%, 3-4 pins: each 20%, 5 pins: 10%, 6-7 pins: each 5%, 8-10 pins: each 2%, 11-14 pins: each 1% • Pin slacks normally distributed in range [0,5% max delay on net] • Ran on 2.6 Ghz Pentium Linux machines, 1GB RAM • Simulation: Randomly deleted 10% nets & rand. gen. 10% new nets • Evaluation: Crash Test—routing as many nets as possible under the constraint of only 2 metal layers & slack satisfaction (TD-S, TD-R, TIDE) • TD-S (TD-R) is SERT/SOAR overlaid on Std (R&R)

  29. Times TIDE is better 7x 6.7x 9.8x 9.5x 5.6x 4.7x 5.3x 4.2x Results 1) % Unrouted Nets 2) Slack Violations

  30. 3) Average Routed Net Length Results Times TIDE is better 6.7x 3x 4) Vias per New Net 4.4x 2.6x

  31. 9.5x 2x 2.4x 0.5x Results 5) Modified Nets per New Net Times TIDE is better 6) Runtime

  32. Conclusions • New TD Incremental Routing Algorithm TIDE • Uses new concepts of derived tolerances @ Steiner nodes to: a) In global routing—quickly determine slack-satisfying near-min-WL connection of the next pin for a new net routing b) In detailed routing—quickly determine slack satisfaction of B&R’ed nets • In global routing, the IntAl algorithm is the first in pruning trees (proven correct) for determining the next nearest connection point after the recent attempt failed. • In detailed routing, high-level DFS control  good routing soln. for new nets with min. impact on existing nets • Produces significant improvement over TD-Std and TD-R&R in all important metrics of interest and is reasonably fast • Future Work • TD incremental placement and integration with TIDE

  33. THANK YOU

More Related