Practical Approximation Algorithms for Separable Packing LPs
F.F. Dragan (Kent State), A.B. Kahng (UCSD), I. Mandoiu (UCLA/UCSD), S. Muddu (Sanera Systems), A. Zelikovsky (Georgia State)
Outline
• VLSI design motivation
• Global routing via buffer-blocks
• Separable packing ILP formulations
• PTAS for separable packing LPs
• Analysis
• Experimental results
VLSI Global Routing (figure: buffered routing through buffer blocks)
Problem Formulation: Global Routing via Buffer-Blocks (GRBB) Problem
Given:
• Buffer-block (BB) locations and capacities
• List of multi-pin nets
• Upper bound on the number of buffers on each source-sink path
• Lower/upper (L/U) bounds on the wirelength between consecutive buffers/pins
Find:
• A buffered routing of a maximum number of nets subject to the given constraints
Enforcing Parity Constraints
• Inverting buffers change the polarity of the signal
• Each sink has a given polarity requirement
• Parity constraints on the number of buffers on each routed source-sink path
• A path may use two buffers in the same buffer block
Integer program changes:
• Split each BB vertex r of G into two copies, r' and r''
• Impose the capacity constraint on the sets of vertices {r', r''}
Combining with compaction
• Set capacity constraints: cap(BB1) + cap(BB2) ≤ const.
GRBB with Buffer Library
• Discrete buffer library: different buffer sizes/driving strengths
• Need to allocate BB capacity between different buffer types
Integer program changes:
• Replace each BB vertex r of G by a set X(r) of vertices (one for each buffer type)
• Modify the edge set of G to take into account non-uniform driving strengths
• Impose the capacity constraint on the sets of vertices X(r)
“Relax+Round” Approach to GRBB
• Solve the fractional relaxation
  • Exact linear programming algorithms are impractical for large instances
  • KEY IDEA: use an approximation algorithm
  • Allows fine-tuning the tradeoff between runtime and solution quality
• Round to an integer solution
  • Provably good rounding [RT87]
  • Practical runtime (random-walk based)
Previous Work
• MCF and packing/covering LP approximation: [FGK73, SM90, PST91, G92, GK94, KPST94, LMPSTT95, R95, Y95, GK98, F00, …]
• Exponential length function to model flow congestion [SM90]
• Shortest-path augmentation + final scaling [Y95]
• Modified routing increment [GK98]
• Fewer shortest-path augmentations [F00]
• We extend the speed-up idea of [F00] to separable packing LPs
Separable Packing LP Algorithm
w(X) ← δ for every X, f ← 0, α = δ
For i = 1 to N do
    For k = 1, …, #nets do
        Find min weight feasible Steiner tree T for net k
        While weight(T) < min{ 1, (1+ε)α } do
            f(T) = f(T) + 1
            For every X do
                w(X) ← ( 1 + ε·use(T,X)/cap(X) ) * w(X)
            End For
            Find min weight feasible Steiner tree T for net k
        End While
    End For
    α = (1+ε)α
End For
Output f/N
Here use(T,X) denotes the amount of the capacity of X consumed by tree T.
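The loop structure above can be tried out on a toy instance. The following is a minimal Python sketch, not the authors' implementation. It assumes the garbled parameters are the accuracy ε, the initial weight δ and the lower bound α. It replaces the min-weight Steiner tree search with a scan over an explicit list of candidate trees per net. It also takes the weight of a tree to be the sum of w(X) weighted by how many units of X the tree uses. The names `approx_separable_packing`, `trees_by_net` and `cap` are hypothetical.

```python
import math

def approx_separable_packing(trees_by_net, cap, eps, delta):
    """Fractional routing by multiplicative weight updates (illustrative sketch).

    trees_by_net: one list per net; each candidate tree is a dict
                  {X: units of X's capacity the tree uses} (assumed integer,
                  at least 1 and at most cap[X]).
    cap:          dict {X: capacity of X}.
    Returns (flow, N) where flow maps (net index, tree index) -> f(T)/N.
    """
    w = {X: delta for X in cap}      # w(X) <- delta
    f = {}                           # f <- 0
    alpha = delta                    # lower bound on the min tree weight
    # Assumed choice of N: smallest N with delta * (1 + eps)^N >= 1
    N = math.ceil(math.log(1 / delta) / math.log(1 + eps))

    def weight(tree):
        return sum(use * w[X] for X, use in tree.items())

    def lightest(k):
        # Stand-in for the min-weight feasible Steiner tree oracle
        return min(range(len(trees_by_net[k])),
                   key=lambda t: weight(trees_by_net[k][t]))

    for _ in range(N):
        for k in range(len(trees_by_net)):
            t = lightest(k)
            while weight(trees_by_net[k][t]) < min(1.0, (1 + eps) * alpha):
                f[(k, t)] = f.get((k, t), 0) + 1
                for X, use in trees_by_net[k][t].items():
                    w[X] *= 1 + eps * use / cap[X]
                t = lightest(k)
        alpha *= 1 + eps
    return {key: val / N for key, val in f.items()}, N
```

On a toy instance with two buffer blocks of capacity 2 and 1 (LP optimum 3), `approx_separable_packing([[{"A": 1}, {"B": 1}], [{"A": 1}]], {"A": 2, "B": 1}, eps=0.1, delta=1e-4)` returns a total flow a little below 3. In this sketch the scaled flow can overshoot a capacity by a factor of at most 1 + 1/N.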
Runtime
• Choose the number of iterations N such that all feasible trees have weight ≥ 1 after N iterations (i.e., α ≥ 1)
• The tree weight lower bound α is δ initially, and is multiplied by (1+ε) in each iteration
Dual LP:
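For the iteration count N on this slide, the two bullets already determine its value. A short derivation follows, assuming the garbled symbols are the initial weight δ, the accuracy parameter ε and the lower bound α. The subscript α_i (the lower bound after i iterations) is notation introduced here.

```latex
\alpha_0 = \delta, \qquad \alpha_i = (1+\varepsilon)\,\alpha_{i-1}
\;\Longrightarrow\; \alpha_N = \delta\,(1+\varepsilon)^{N},
\qquad
\alpha_N \ge 1 \iff N \ge \log_{1+\varepsilon}\frac{1}{\delta}
              = \frac{\ln(1/\delta)}{\ln(1+\varepsilon)} .
```

For example, δ = 10⁻⁴ and ε = 0.1 give ln(10⁴)/ln(1.1) ≈ 96.6, so N = 97 iterations suffice.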
Approximation Guarantee
Theorem: For every ε < 0.15, the algorithm finds a factor 1/(1+4ε) approximation by choosing δ = …, where L is the maximum number of vertices in a feasible Steiner tree. For this value of δ, the running time is …
Provably Good Rounding
• Store fractional flows f(T) for every feasible Steiner tree T
• Scale down each f(T) by 1−ε for small ε
• Each net k is routed with probability f(k) = Σ{ f(T) | T feasible for k }
• Number of routed nets ≥ (1−ε)·OPT
• To route net k, choose tree T with probability f(T) / f(k)
• With high probability, no BB capacity is exceeded
Problem: Impractical to store all non-zero flow trees
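The rounding steps listed on this slide can be sketched in a few lines of Python. This is an illustrative sketch only. It assumes the flows are stored explicitly per net (which the slide itself calls impractical at scale) and that each net's total flow is at most 1. The names `round_flows` and `flows_by_net` are hypothetical.

```python
import random

def round_flows(flows_by_net, eps, rng=random):
    """Randomized rounding of fractional tree flows (illustrative sketch).

    flows_by_net: dict {net k: dict {tree id: f(T)}}, assumed to sum to <= 1 per net.
    Returns dict {net k: chosen tree id} for the nets that get routed.
    """
    routed = {}
    for k, flows in flows_by_net.items():
        # Scale down each f(T) by 1 - eps
        scaled = {T: (1 - eps) * f for T, f in flows.items()}
        f_k = sum(scaled.values())
        # Route net k with probability f(k)
        if f_k > 0 and rng.random() < f_k:
            trees = list(scaled)
            # Choose tree T with probability f(T) / f(k)
            routed[k] = rng.choices(trees, weights=[scaled[T] for T in trees])[0]
    return routed
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.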
Random-Walk 2-TMCF Rounding
• Store fractional flows f(T) for every valid routing tree T
• Scale down each f(T) by 1−ε for small ε
• Each net k is routed with probability f(k) = Σ{ f(T) | T routing for k }
• Number of routed nets ≥ (1−ε)·OPT
• To route net k, choose tree T with probability f(T) / f(k): use a random walk from source to sink
• With high probability, no BB capacity is exceeded
Practical: the random walk requires storing only flows on edges
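A random walk driven by edge flows can be sketched as follows. This is an assumption-laden illustration, not the authors' code. It treats a single two-pin net whose fractional flow is given per directed edge, is conserved at internal vertices and contains no cycles. The names `random_walk_path` and `edge_flow` are hypothetical.

```python
import random

def random_walk_path(edge_flow, source, sink, rng=random):
    """Sample one source-sink path from per-edge flows (illustrative sketch).

    edge_flow: dict {(u, v): flow of this net on directed edge u -> v}.
    Assumes flow conservation at internal vertices and an acyclic flow,
    so every vertex reached by the walk has an outgoing edge with flow.
    """
    out = {}
    for (u, v), fl in edge_flow.items():
        if fl > 0:
            out.setdefault(u, []).append((v, fl))
    path = [source]
    while path[-1] != sink:
        successors, flows = zip(*out[path[-1]])
        # Step along an outgoing edge with probability proportional to its flow
        path.append(rng.choices(successors, weights=flows)[0])
    return path
```

Under these assumptions a path is drawn with probability equal to its share of the net's flow in a path decomposition, so no per-path flows need to be stored.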
Random-Walk MTMCF Rounding (figure: source S and sinks T1, T2, T3)
The MTMCF Rounding Heuristic
• Round each net k with probability f(k), using backward random walks
  • No scaling-down, approximate MTMCF < OPT
• Resolve capacity violations by greedily deleting routed paths
  • Few violations
• Greedily route remaining nets using unused BB capacity
  • Further routing still possible
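The violation-repair step can be illustrated with a small Python sketch. The slide does not say which path is deleted first. The rule used here (drop the path with the most buffers in overloaded blocks) is an assumption made for illustration, and the names `resolve_violations`, `routes` and `cap` are hypothetical.

```python
def resolve_violations(routes, cap):
    """Greedily delete routed paths until no BB capacity is exceeded (sketch).

    routes: dict {path id: dict {buffer block: buffers used by the path}}.
    cap:    dict {buffer block: capacity}, assumed non-negative.
    Returns the dict of surviving routes.
    """
    routes = dict(routes)
    load = {bb: 0 for bb in cap}
    for use in routes.values():
        for bb, n in use.items():
            load[bb] += n
    while True:
        over = {bb for bb in cap if load[bb] > cap[bb]}
        if not over:
            return routes
        # Assumed greedy rule: drop the path using the most buffers
        # in currently overloaded blocks
        worst = max(routes, key=lambda p: sum(
            n for bb, n in routes[p].items() if bb in over))
        for bb, n in routes.pop(worst).items():
            load[bb] -= n
```

The capacity freed by a deleted path is then available to the final greedy pass that routes the remaining nets.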
Implemented Heuristics
• Greedy buffered routing:
  • For each net, route sinks sequentially along shortest paths to the source or to a node already connected to the source
  • After routing a net, remove fully used BBs
• Generalized MCF approximation + randomized rounding:
  • G2TMCF
  • G3TMCF (3-pin decomposition)
  • G4TMCF (4-pin decomposition)
  • GMTMCF (no decomposition, approximate DRST)
Experimental Setup
• Test instances extracted from a next-generation SGI microprocessor
• Up to 5,000 nets, ~6,000 sinks
• U = 4,000 µm, L = 500-2,000 µm
• 50 buffer blocks
• 200-400 buffers / BB
Conclusions and Ongoing Work
• Provably good algorithms and practical heuristics based on separable packing LP approximation
• Higher completion rates than previous algorithms
• Extensions:
  • Combine global buffering with BB planning
  • Buffer “site” methodology tile graph
  • Routing congestion (channel capacity constraints)
  • Simultaneous pin assignment
Resource Usage (table: #nets = 4,764, #sinks = 6,038, 400 buffers/BB)
Resource Usage for 100% Completion (table: #nets = 4,764, #sinks = 6,038)
• MTMCF wastes routing resources!