Development and Application of Tree Synthesis Algorithms John Lillis University of Illinois Chicago
Overview • Part I: Buffer tree synthesis • Formulations • S/P/SP-tree • Part II: Fanin tree embedding/replication • Optimization across gate boundaries • Interaction with placement
Premises of Work • Conservation of Resources Crucial • Estimate: 700-800K Buffers/Chip in Near Future • Cost-Performance Tradeoffs • General Cost Model • Topology / Embedding / Buffering Spaces Should be Explored Simultaneously • 2-Phase Approach Not Robust / Predictable • Particularly Troublesome in Presence of Blockages • MAIN PREMISE: Powerful Buffer Tree Synthesis is a Core for Modern Design
Max Slack Weakness • Overoptimized subtrees [Figure: slack vs. cost]
Problem Formulation • Given: • Location of Driver and Sinks • Technology Parameters • Timing Requirements • Buffer Library • Target Routing Graph (Blockages) • Find: • Topology in corresponding space • its Embedding • and Buffer Assignment • Minimizing Cost • s.t. Timing Constraints
Philosophy of Constraint Imposition • Goals: • Predictable Behavior • Absence of ad-hoc Heuristics • Main Idea: • Optimally Solve Constrained Variant of the Problem • Well-Designed Constraints Produce • Large Flexible Solution Space • Tractability • Constraints: Topology Space [Figure: full space vs. constrained space]
Topology Embedding Flexibility [Figure: embeddings of a topology with source s and sinks a, b, c]
Target Routing Graph Construction [Figure: routing graph with source s and sinks a, b, c; legend: routing blockage, buffer blockage]
Algorithmic Description • Timing-Driven Maze Routing • Topology Embedding • S-Tree • P-Tree • SP-Tree
Core Subroutine: Timing-Driven Maze Routing • Generalization of [Hur et al.; TCAD Feb 2000] • Single Target, Multiple Sources • Finds non-dominated paths • Simultaneous Buffer Insertion • Handling of Blockages in Topology Synthesis [Figure labels: Target, Sources]
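A minimal sketch of what "non-dominated" can mean for partial solutions kept at a routing-graph vertex. The three-field signature (cost, downstream capacitance, required time) is an assumption chosen for illustration; the slides do not spell out the signature used for buffer trees.

```python
from typing import List, NamedTuple

class Sol(NamedTuple):
    cost: float  # accumulated wire/buffer cost (assumed field)
    cap: float   # downstream load capacitance (assumed field)
    req: float   # required arrival time, larger is better (assumed field)

def dominates(a: Sol, b: Sol) -> bool:
    # a dominates b if it is no worse in every field and is not identical to b
    return a != b and a.cost <= b.cost and a.cap <= b.cap and a.req >= b.req

def prune_dominated(sols: List[Sol]) -> List[Sol]:
    # drop exact duplicates first, preserving input order
    unique = list(dict.fromkeys(sols))
    # keep only solutions that no other solution dominates
    return [s for s in unique if not any(dominates(o, s) for o in unique)]
```

Keeping a whole frontier per vertex, rather than a single best path, is what lets a cost/timing tradeoff be resolved later at the driver.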
Topology Embedding • Goal: Obtain timing feasible embedding / buffering of given topology, minimizing cost • Solution: Dynamic Programming (bottom-up)
Solution sets A(u,v) • A(u,v) represents a set of solutions that correspond to • Vertex u in Topology • Vertex v in Target Graph • A1b = Join(A1.left, A1.right) • A1 = GenDijkstra(A1b)
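The two steps named on this slide can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: labels are (cost, cap, req) triples, wires use an Elmore-style delay with hypothetical unit resistance `r` and capacitance `c`, wire cost equals wire length, and buffer insertion is omitted to keep the example short.

```python
import heapq

def dominated(lab, others):
    # lab is dominated if some other label is no worse in cost, cap and req
    return any(o[0] <= lab[0] and o[1] <= lab[1] and o[2] >= lab[2] for o in others)

def join(left, right):
    # merge child solutions that meet at the same graph vertex
    out = []
    for c1, k1, q1 in left:
        for c2, k2, q2 in right:
            cand = (c1 + c2, k1 + k2, min(q1, q2))
            if not dominated(cand, out):
                out = [o for o in out if not dominated(o, [cand])] + [cand]
    return out

def gen_dijkstra(graph, seeds, r=1.0, c=1.0):
    # graph: vertex -> [(neighbor, length)]; seeds: vertex -> list of labels
    kept = {v: [] for v in graph}
    heap = [(lab[0], v, lab) for v, labs in seeds.items() for lab in labs]
    heapq.heapify(heap)
    while heap:
        _, v, lab = heapq.heappop(heap)  # cheapest unprocessed label first
        if dominated(lab, kept[v]):
            continue
        kept[v] = [o for o in kept[v] if not dominated(o, [lab])] + [lab]
        cost, cap, req = lab
        for w, length in graph[v]:
            wire_c = c * length
            delay = r * length * (wire_c / 2 + cap)  # Elmore delay of the wire
            new = (cost + length, cap + wire_c, req - delay)
            heapq.heappush(heap, (new[0], w, new))
    return kept
```

For example, `gen_dijkstra(graph, {v: join(left_at_v, right_at_v)})` would propagate the joined solutions from vertex `v` to the rest of the graph, giving one non-dominated set per target vertex.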
S-Tree • Notion of localities: • Spatial • Temporal • Polarity • Partition sinks into 2 sets based on: • estimated timing criticality • signal polarity requirements • some other criteria... • Subtrees can break away from the topology and “stitch” in at a different place
S-Tree Topology Space • Sink partition: {a,c,d} {b} [Figure: topologies over source s and sinks a, b, c, d]
S-Tree Recurrence • A1b = Join(A1.left, A1.right) • A1 = GenDijkstra(A1b) • A2b = Join(A2.left, A2.right) • A2 = GenDijkstra(A2b) • A12b = Join(A12.left, A12.right) + Join(A1, A2) • A12 = GenDijkstra(A12b)
S-Tree Topology Space [Figure: initial topology and further topologies over source s and sinks a, b, c, d, e, f]
Incorporating polarity • 4 sets: • critical & positive signal polarity • critical & negative • non-critical & positive • non-critical & negative • Other partitioning schemes...
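A small sketch of the four-way partition listed above. The criticality test (required arrival time at or below a threshold) and the tuple layout of a sink are assumptions for illustration; the slides only say that criticality is estimated.

```python
def partition_sinks(sinks, rat_threshold):
    # sinks: iterable of (name, required_arrival_time, polarity) with polarity "+" or "-"
    groups = {(crit, pol): [] for crit in (True, False) for pol in ("+", "-")}
    for name, rat, pol in sinks:
        critical = rat <= rat_threshold  # assumed criticality estimate
        groups[(critical, pol)].append(name)
    return groups
```

Each key is a (critical, polarity) pair, so the four lists correspond to the four sets on the slide.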
P-Tree Topology Space • All Permutation-Constrained Topologies [Figure: topologies over source s and sinks in the order a, b, c, d, e]
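To give a feel for the size of this space, the sketch below enumerates binary topologies whose leaves, read left to right, follow a fixed sink order. Treating "permutation-constrained" as exactly this property is an assumption made for illustration, and `p_tree_topologies` is a hypothetical helper name.

```python
def p_tree_topologies(order):
    # order: tuple of sink names in the fixed permutation
    if len(order) == 1:
        return [order[0]]
    out = []
    for k in range(1, len(order)):  # split point between left and right subtree
        for left in p_tree_topologies(order[:k]):
            for right in p_tree_topologies(order[k:]):
                out.append((left, right))
    return out
```

The count grows as the Catalan numbers (5 sinks give 14 topologies), which is why the space is searched by dynamic programming over contiguous sub-sequences rather than by enumeration.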
Limitations of P-Tree Space • Isolation of Critical / Non-Critical Subtrees: “Temporal-Locality” • Min WL May Not Produce Min Cost [Figures: driver with critical and non-critical sinks]
SP-Tree • Combine everything said so far... • From P-Tree • Spatial locality • Robustness • From S-Tree • Temporal locality • Polarity locality • Ability to fix “topology problems” by “stitching”
Solution Space [Figure: regions labeled Entire space, SP-Tree, S-Tree, P-Tree, Fixed topo.]
Experiments • Randomly generated nets • Non-uniform required arrival time • Non-uniform sink input capacitance • Buffer-biased cost • Interested in: • Min cost feasible solution • Max slack solution for verification • Runtime • More details in the paper...
Algorithms for Experiments • S-Tree • P-Tree • SP-Tree • RMP [Cong, Yuan; DAC 2000] • RMP-Quick [Cong, Yuan; DAC 2000]
Results: Net2-06 [Figure: min cost feasible vs. max slack solutions; # buffers]
Results: Net2-08 [Figure: min cost feasible vs. max slack solutions; # buffers]
Results: Net2-12 [Figure: min cost feasible vs. max slack solutions; # buffers]
Conclusions • Key Concepts: • General Cost Models • Routing Congestion • Buffer Congestion • Orthogonal Separation of Spatial and Temporal Locality • Polarity Requirements • Routing and Buffer Blockages • Targets: • Small-to-Medium Sized Signal Nets • Results Summary • Highly Cost-Efficient, High Performance Solutions • Substantially Outperforms Prior Approaches in Solution Quality and Runtime
Replication Overview • Hrkic, Lillis, Beraudo (DAC04, IWLS04) • Concept: Netlist structure limits potential of timing-driven placement • Difficult for top-down synthesis to fix • Main issue: inherently non-monotone paths • Approach (Hrkic, Lillis; DAC04) touches on placement, synthesis (netlist perturbation) and routing.
Logic Replication • Duplicate logic cell • Preserve functionality • Improve timing • Place / Move cells • Adjust connections [Figure: cells A, B, C, D, E before and after replicating C as CR]
Early Work • Use replication to straighten I/O paths • Local monotonicity [Beraudo, Lillis; DAC 2003] • Sequence of 3 cells on the path • Incremental framework [Figure: cells A, B, C, D, E before and after replicating C as CR]
Limitations of Local Monotonicity • Local Monotonicity satisfied • Still many non-monotone paths [Figure: cells A, B, C, D, E, F]
Replication Tree Approach [Hrkic et al.; DAC04] • Identify critical sink • Extract critical fan-in tree (Replication Tree) • Optimize fan-in tree (Fan-in Tree Embedding) • Legalize placement
Slowest Paths Tree • Focus on slowest paths • Find slowest paths tree from critical sink • Include paths within epsilon of current critical delay • Focus on most critical portions of fan-in cone
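One way to read "paths within epsilon of current critical delay" is sketched below: walk backward from the critical sink and keep a fan-in only if the longest path through it is within `eps` of the critical arrival time. The data layout (`fanin`, `delay`, `arrival`) and the rule of attaching each cell to the parent on its slowest path are assumptions for illustration.

```python
import heapq

def slowest_paths_tree(fanin, delay, arrival, sink, eps):
    # fanin: node -> list of drivers; delay[(u, v)]: delay from u's output to v's output
    # arrival[u]: latest arrival time at u's output
    critical = arrival[sink]
    parent = {}
    # heap entries: (-delay of slowest path through node, node, parent, delay node->sink)
    heap = [(-critical, sink, None, 0.0)]
    while heap:
        _, u, p, down = heapq.heappop(heap)
        if u in parent:
            continue  # already attached via a slower (more critical) path
        parent[u] = p
        for w in fanin.get(u, []):
            d = down + delay[(w, u)]
            through = arrival[w] + d
            if w not in parent and through >= critical - eps:
                heapq.heappush(heap, (-through, w, u, d))
    return parent  # node -> parent in the tree (sink maps to None)
```

With `eps = 0` only the critical path itself is extracted; larger values pull in near-critical branches of the fan-in cone.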
Replication Tree • Most circuits do not contain large fan-in trees due to reconvergence • Given a critical tree, temporarily replicate the entire tree • Assign connections: • if (u,v) is a tree edge, connect uR to vR • else connect u to vR [Figure: cells A-F with replicas AR, BR, CR, DR, FR]
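The connection rule above can be sketched as follows. The edge-list netlist, the explicit `replicated` set, and the "R" suffix for replica names are assumptions for illustration; the sketch only builds the input connections of the replicas and leaves the original netlist untouched.

```python
def replica_connections(edges, tree_edges, replicated):
    # edges: list of (driver, load) pairs in the original netlist
    # tree_edges: the subset of edges that belong to the critical fan-in tree
    # replicated: set of cells that receive a temporary replica
    tree_edges = set(tree_edges)
    out = []
    for u, v in edges:
        if v not in replicated:
            continue  # only replicas need new input connections
        if (u, v) in tree_edges:
            out.append((u + "R", v + "R"))  # tree edge: replica drives replica
        else:
            out.append((u, v + "R"))        # non-tree edge: original drives replica
    return out
```

Because non-tree inputs are fed from the original cells, the replicated structure is a true tree even when the original fan-in cone reconverges.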
Placement cost • Replication is temporary • Placement cost is crucial • Cost discount for placing a cell over its logical equivalent • low cost for placing DR over D • in that case actual replication never occurs • multiple low-cost locations possible [Figure: cells A-F with replicas AR, BR, CR, DR, FR]
Fan-in Tree Embedding • Given: • Fan-in tree • Placement of sink and inputs • Arrival times at inputs • Placement and routing graph • Find: • Placement of internal tree nodes (Gates) • Minimizing Cost • s.t. Timing Constraints • cost / delay tradeoff
Fan-in Tree Embedding Example [Figure: two embeddings of a fan-in tree with cells A, B, C and a sink; one with higher delay and lower cost, one with lower delay and higher cost]
Fan-out and Fan-in Tree [Figure: fan-out tree from a source through A, B, C, labeled Bottom-up; fan-in tree from A, B, C to a sink, labeled Top-down]
Fan-in Tree Embedding • Adaptation of S-Tree algorithm [Hrkic, Lillis; DAC 2002] • Keep: • Graph Model for Embedding Target • Modified Timing-Driven Maze Routing [Hur, Lillis; IEEE TCAD 2000] • multiple sources, multiple targets • at each vertex keep a list of non-dominated solutions • Modify: • Top-down vs. Bottom-up • Solution signature (c,t): • c - cost • t - signal arrival time • Gate placement cost p(x,y)
Fan-in Tree Embedding • Non-binary tree: multiple gate inputs • Top-Down Dynamic Programming • Maze Routing to populate solutions • deferred backtracking • Join Solutions • c = p(x,y) + c1 + ... + cn • t = MAX(t1, ..., tn) • Bottom-Up solution extraction • backtrack to extract maze route • extract gate placement [Figure labels: Modified maze routing, Join]
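The join formulas on this slide can be sketched for a gate with n inputs placed at one candidate location. Solutions are (c, t) pairs as in the solution signature; the function names and the brute-force product over input solutions are assumptions for illustration, and gate delay is left out because the slide's formula does not include it.

```python
from itertools import product

def prune(sols):
    # keep (cost, arrival) pairs for which no cheaper-or-equal pair arrives as early
    best = []
    for c, t in sorted(set(sols)):
        if not best or t < best[-1][1]:
            best.append((c, t))
    return best

def join(input_sols, place_cost):
    # input_sols: one list of (c_i, t_i) pairs per gate input, all at location (x, y)
    # place_cost: p(x, y), the cost of placing the gate at that location
    out = []
    for combo in product(*input_sols):
        c = place_cost + sum(ci for ci, _ in combo)  # c = p(x,y) + c1 + ... + cn
        t = max(ti for _, ti in combo)               # t = MAX(t1, ..., tn)
        out.append((c, t))
    return prune(out)
```

Each surviving pair is one point on the cost/delay tradeoff curve for placing the gate at that location.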
Aside: Legalization • Use Modified Gain-Graph approach [Hur, Lillis; ICCAD00] • Modified to incorporate timing information
Optimization Flow • Identify critical sink (static timing analysis) • Extract Fan-in Tree • Replication Tree • epsilon-Slowest Paths Tree • Embed Fan-in Tree • Decide which cells to Replicate / Unify • Legalize placement • Repeat while there is improvement
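The loop structure of this flow can be sketched as a driver that takes each step as a callable. All parameter names here are hypothetical placeholders for the steps on the slide; the replicate/unify decision is folded into `commit`, and "improvement" is assumed to mean a strictly smaller critical delay.

```python
def optimize(netlist, analyze, extract_tree, embed, commit, legalize, max_iters=100):
    # analyze(netlist) -> (critical_delay, critical_sink), i.e. static timing analysis
    for _ in range(max_iters):
        delay, sink = analyze(netlist)
        tree = extract_tree(netlist, sink)        # replication / eps-slowest-paths tree
        embedding = embed(tree)                   # fan-in tree embedding
        candidate = legalize(commit(netlist, embedding))  # replicate/unify, then legalize
        new_delay, _ = analyze(candidate)
        if new_delay >= delay:
            break                                 # no improvement: stop and keep netlist
        netlist = candidate
    return netlist
```

Passing the steps in as functions keeps the sketch independent of any particular timing engine or placer.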