Multi-Threaded Collision-Aware Global Routing with Bounded-Length Maze Routing. Wen-Hao Liu, Wei-Chun Kao, Yih-Lang Li, and Kai-Yuan Chao. DAC 2010.
Outline • Introduction • Preliminaries • Bounded-Length Maze Routing • Task-based Concurrency Strategy • Collision-aware rip-up & reroute • Experimental results and Conclusions
Introduction • The ISPD’07 and ’08 global routing contests encouraged the development of global routers. • Maze routing is still the most important part of those routers, and it heavily affects runtime and routing quality. • However, those routers seldom attempt to develop new approaches to improve it.
Introduction (cont.) • Multi-core architectures have become mainstream. • With partitioning-based concurrency, load balancing is difficult to control, so the achievable speedup tends to be limited.
Outline • Introduction • Preliminaries • Bounded-Length Maze Routing • Task-based Concurrency Strategy • Collision-aware rip-up & reroute • Experimental results and Conclusions
Problem Formulation • Supply s(e): the number of available routing tracks passing through an edge e • Demand d(e): the number of wires that utilize an edge e • [Figure: cells, global bins, and global edges] • Wirelength is as small as possible
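The supply and demand definitions above can be turned into the overflow figures that the router later minimizes. The sketch below assumes the usual definition, overflow(e) = max(0, d(e) - s(e)), which the slide does not state explicitly; the function names and the sample edge data are hypothetical.

```python
def overflow(demand, supply):
    """Overflow of one global edge: wires beyond the available tracks.

    Assumption: overflow(e) = max(0, d(e) - s(e)); the slide only defines
    supply and demand.
    """
    return max(0, demand - supply)


def congestion_summary(edges):
    """edges maps an edge id to a (demand, supply) pair."""
    per_edge = [overflow(d, s) for d, s in edges.values()]
    return {
        "total_overflow": sum(per_edge),
        "max_overflow": max(per_edge, default=0),
    }


# Hypothetical global edges, each with a supply of 4 tracks.
edges = {("a", "b"): (5, 4), ("b", "c"): (2, 4), ("c", "d"): (7, 4)}
summary = congestion_summary(edges)
```

Here only the first and third edges exceed their supply, by 1 and 3 wires respectively.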
Preliminaries • Time complexity of maze routing: • O(VlogV), where V is the size of the search region • Many routers use a bounding box to limit the search region. • If no overflow-free path exists, the bounding box is expanded. • The limitation may waste routing resources.
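The box-then-expand scheme can be sketched as follows. This is a simplified illustration, not the routers' actual code: it uses breadth-first search over unit-cost grid cells instead of a cost-driven maze search, models congestion as a set of blocked cells, and grows the box by one cell per retry. All names and the expansion step are assumptions.

```python
from collections import deque


def route_in_box(blocked, s, t, margin, width, height):
    """Shortest path length from s to t inside the s-t bounding box
    enlarged by `margin` cells, or None if the box contains no path."""
    x0 = max(0, min(s[0], t[0]) - margin)
    x1 = min(width - 1, max(s[0], t[0]) + margin)
    y0 = max(0, min(s[1], t[1]) - margin)
    y1 = min(height - 1, max(s[1], t[1]) + margin)
    dist = {s: 0}
    frontier = deque([s])
    while frontier:
        v = frontier.popleft()
        if v == t:
            return dist[v]
        x, y = v
        for u in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = x0 <= u[0] <= x1 and y0 <= u[1] <= y1
            if inside and u not in blocked and u not in dist:
                dist[u] = dist[v] + 1
                frontier.append(u)
    return None


def route_with_expansion(blocked, s, t, width, height):
    """Retry with a larger box until a path is found.

    Returns (margin used, path length), or None if no path exists.
    """
    for margin in range(max(width, height) + 1):
        length = route_in_box(blocked, s, t, margin, width, height)
        if length is not None:
            return margin, length
    return None


# 5x3 grid, one congested cell between the two pins on the middle row.
result = route_with_expansion({(2, 1)}, (0, 1), (4, 1), 5, 3)
```

With a zero margin the box is the single middle row, which the blocked cell cuts in two; after one expansion the search finds a six-edge detour.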
Motivating example • (a) routing result with a bounding box • (b) routing result without a bounding box
Motivating example (cont.) • We want an approach that not only restricts the search region to speed up maze routing, but also uses routing resources effectively by avoiding redundant wirelength. • This motivates the development of BLMR.
Bounded-Length Maze Routing • Identifies a minimal-cost path from a net’s source to its target under a specified length constraint. • Considers not only the length but also the cost of each net.
The proposed global router • Adopts Bounded-Length Maze Routing (BLMR) • 3D routability-driven • Objective: minimize the following • Maximum overflow • Total overflow • Total wirelength • Runtime
Design flow of the proposed router (cont.) • Compacts the 3D routing problem into 2D • Performs layer assignment [14] at the end of the flow • Uses an MST to decompose each net into two-pin nets • Generates an initial congestion graph via monotonic routing • Negotiation-based rip-up & reroute adopting BLMR
Monotonic Pattern Routing • Proposed in FastRoute 2.0 [ASPDAC07] • Finds the routing path in the direction toward T within the bounding box formed by S and T • Can be done by dynamic programming with the same complexity as Z-shaped pattern routing • [Figure: monotonic path from S to T]
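A minimal sketch of the dynamic program behind monotonic routing, assuming S sits at the lower-left corner (0, 0) and T at the upper-right corner of the bounding box, so every step goes right or up. The cost-array layout and all names are illustrative assumptions, not FastRoute's data structures.

```python
def monotonic_route(h_cost, v_cost, width, height):
    """Cheapest right/up path from (0, 0) to (width-1, height-1).

    h_cost[y][x] is the cost of the edge (x, y) -> (x+1, y);
    v_cost[y][x] is the cost of the edge (x, y) -> (x, y+1).
    """
    inf = float("inf")
    best = [[inf] * width for _ in range(height)]
    prev = [[None] * width for _ in range(height)]
    best[0][0] = 0
    for y in range(height):
        for x in range(width):
            # Arrive from the left neighbour.
            if x > 0 and best[y][x - 1] + h_cost[y][x - 1] < best[y][x]:
                best[y][x] = best[y][x - 1] + h_cost[y][x - 1]
                prev[y][x] = (x - 1, y)
            # Arrive from the neighbour below.
            if y > 0 and best[y - 1][x] + v_cost[y - 1][x] < best[y][x]:
                best[y][x] = best[y - 1][x] + v_cost[y - 1][x]
                prev[y][x] = (x, y - 1)
    # Walk the predecessor links back from T to S.
    path = [(width - 1, height - 1)]
    while prev[path[-1][1]][path[-1][0]] is not None:
        x, y = path[-1]
        path.append(prev[y][x])
    path.reverse()
    return best[height - 1][width - 1], path


# Hypothetical 3x3 box with a few expensive (congested) edges.
h_cost = [[1, 1], [5, 1], [1, 1]]
v_cost = [[1, 5, 1], [1, 1, 9]]
cost, path = monotonic_route(h_cost, v_cost, 3, 3)
```

Each cell is visited once, and the resulting path always has exactly Manhattan length, which is why monotonic routing adds no detour wirelength.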
Layer Assignment • [Figure: pins p1 to p4 and projected pins p’1 to p’3 shown across three steps: Compression, One-layer routing, and Layer Assignment]
Outline • Introduction • Preliminaries • Bounded-Length Maze Routing • Task-based Concurrency Strategy • Collision-aware rip-up & reroute • Experimental results and Conclusions
Optimal-BLMR • Obtains a minimum-cost routing solution under the bounded-length constraint. • Search region V [Figure: example node v with wl(Pi) = Manh(s,v) = 5 and Manh(v,t) = 4]
Two policies of Optimal-BLMR • Prune any path that violates the bounded length, which also restricts the search region. • Keep all paths that are not dominated by others. • If P1(s, v) has longer wirelength and higher cost than P2(s, v), P1(s, v) is said to be inferior to P2(s, v) and can be discarded.
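The two policies can be sketched as a label-based search on a grid. This is an illustration under stated assumptions rather than the authors' implementation: wirelength is counted in grid edges, a partial path is pruned when its wirelength plus the Manhattan distance to the target exceeds the bound, and dominance is tested with "no worse in both cost and wirelength". The function names and the example grid are hypothetical.

```python
import heapq


def manh(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def optimal_blmr(width, height, edge_cost, s, t, bound):
    """Cheapest s-t path using at most `bound` grid edges, or None."""
    labels = {s: [(0, 0)]}        # node -> non-dominated (cost, wl) pairs
    heap = [(0, 0, s, (s,))]      # (cost, wl, node, path so far)
    while heap:
        cost, wl, v, path = heapq.heappop(heap)
        if v == t:
            return cost, list(path)
        x, y = v
        for u in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= u[0] < width and 0 <= u[1] < height):
                continue
            c, w = cost + edge_cost(v, u), wl + 1
            # Policy 1: even the shortest continuation would break the bound.
            if w + manh(u, t) > bound:
                continue
            # Policy 2: drop the new path if a kept path dominates it.
            kept = labels.setdefault(u, [])
            if any(kc <= c and kw <= w for kc, kw in kept):
                continue
            # Drop kept paths that the new path dominates, then keep it.
            kept[:] = [(kc, kw) for kc, kw in kept if not (c <= kc and w <= kw)]
            kept.append((c, w))
            heapq.heappush(heap, (c, w, u, path + (u,)))
    return None


def edge_cost(a, b):
    """Hypothetical congestion: edges along row y = 0 are expensive."""
    return 10 if a[1] == 0 and b[1] == 0 else 1


loose = optimal_blmr(3, 2, edge_cost, (0, 0), (2, 0), bound=4)
tight = optimal_blmr(3, 2, edge_cost, (0, 0), (2, 0), bound=2)
```

With a bound of 4 the search takes the cheap detour through row 1; with a bound of 2 only the direct, congested path is legal. During the loose run, node (1, 0) holds two labels at once, (cost 10, length 1) and (cost 3, length 3), because neither dominates the other.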
Examples • Two path candidates P1(s, v) and P2(s, v) from s to v: the gray regions are congested regions, the bounded length is 16, and pc(P1), pc(P2), wl(P1) and wl(P2) are 80, 90, 11 and 5, respectively. • P1 has lower cost but not enough length slack. • Thus, BLMR cannot keep only the min-cost path.
But why keep both of them? • Because the wirelength from v to t is unknown until routing finishes, optimal-BLMR must keep both paths.
Disadvantage of Optimal-BLMR • May not be suited to modern large-scale designs. • Hence, a heuristic-BLMR approach is proposed.
Heuristic-BLMR • Keeps only one path from the source to the current node; the other paths are discarded. • Cannot guarantee an optimal solution, but is faster. • How to decide which path should be kept?
Heuristic-BLMR (cont.) • Reserve the minimal-cost path with enough length slack. • If no path candidates have enough length slack, reserve the shortest path candidate for its greater chance to bypass the congested regions. • Need to estimate the wirelength from v to t.
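The selection rule above can be sketched as a small function. The slack test used here, wl(P) plus an estimate of the remaining wirelength compared against the bounded length, is an assumption for illustration, and the estimate is passed in as a plain number; `choose_path` and its parameters are hypothetical names.

```python
def choose_path(candidates, est_to_target, bound):
    """Pick the single path kept at a node.

    candidates: (cost, wirelength) pairs for paths from s to the same node v.
    est_to_target: estimated wirelength still needed from v to t (assumed given).
    """
    with_slack = [p for p in candidates if p[1] + est_to_target <= bound]
    if with_slack:
        # Minimal cost among paths with enough slack (ties: shorter path).
        return min(with_slack)
    # No candidate has enough slack: fall back to the shortest one.
    return min(candidates, key=lambda p: p[1])


# Candidates with the cost and wirelength values of the earlier example:
# P1 = (80, 11), P2 = (90, 5), bounded length 16.
p1, p2 = (80, 11), (90, 5)
near = choose_path([p1, p2], est_to_target=4, bound=16)
far = choose_path([p1, p2], est_to_target=8, bound=16)
none_fit = choose_path([p1, p2], est_to_target=12, bound=16)
```

When the target is estimated to be close, the cheaper P1 still fits within the bound and is kept; once the estimate grows, only the shorter P2 leaves enough slack, and it is also the fallback when neither path fits.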
History-based estimated wirelength • ew_k(v,t) is the history-based estimated wirelength from v to t in iteration k • k is the iteration number of the global routing • L_{k-1}(s,t) is the history length, i.e., the actual routed wirelength from s to t in iteration k-1
History-based estimated wirelength (cont.) • Heuristic-BLMR predicts that Pi(s,v) has sufficient length slack to bypass the congested regions from v to t if the following equation holds
Bounded-length relaxation • In the rip-up & reroute (R&R) stage, relaxing the bounded length encourages heuristic-BLMR to obtain routing results with less overflow at the cost of longer wirelength. • Scheme: sets the bounded length of net n in the k-th routing iteration (formula not preserved in this transcript) • α and β are user-defined constants
[Figure: bounded-length relaxation over Phase 1, Phase 2, and Phase 3]
Outline • Introduction • Preliminaries • Bounded-Length Maze Routing • Task-based Concurrency Strategy • Collision-aware rip-up & reroute • Experimental results and Conclusions
Task-Based Concurrency Strategy on Multi-Core Platform • Applied in the rip-up & reroute stage to speed it up. • R&R may take 99.6% of the runtime of the entire routing flow • Aims to keep all threads working at almost full load during routing. • Can exploit more concurrency than a partitioning-based approach
What is Task-Based Concurrency Strategy (TCS)? • Main ideas • A two-pin net routing is defined as a task • A task queue is maintained to contain all tasks • Update queue iteratively • Each thread repeatedly acquires a task from the task queue when the thread completes its task
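A minimal sketch of the task-queue idea, assuming each task is one two-pin net and that `route` is whatever single-net router is plugged in (here a stand-in that just returns the Manhattan length). The names `run_tasks` and `manhattan_length` are hypothetical. In CPython, threads illustrate the scheduling pattern only; CPU-bound routing would not speed up because of the global interpreter lock.

```python
import queue
import threading


def run_tasks(tasks, route, num_threads=4):
    """Route every task; each thread pulls the next task when it is free."""
    task_queue = queue.Queue()
    for task in tasks:
        task_queue.put(task)
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                task = task_queue.get_nowait()
            except queue.Empty:
                return  # no tasks left, the thread finishes
            outcome = route(task)
            with results_lock:
                results[task] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def manhattan_length(net):
    """Stand-in for a real router: length of a shortest two-pin route."""
    (x1, y1), (x2, y2) = net
    return abs(x1 - x2) + abs(y1 - y2)


nets = [((0, 0), (3, 4)), ((1, 1), (1, 5)), ((2, 0), (0, 0))]
lengths = run_tasks(nets, manhattan_length, num_threads=3)
```

Because threads take tasks one at a time, a thread that finishes a short net immediately picks up another, so the load balances itself without partitioning the chip in advance.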
Example of two concurrency strategies • TCS may have race conditions between threads
Outline • Introduction • Preliminaries • Bounded-Length Maze Routing • Task-based Concurrency Strategy • Collision-aware rip-up & reroute • Experimental results and Conclusions
Collision • Occurs when two or more threads demand the same routing resource simultaneously. • Leads to inconsistency • Often occurs when several nets that are close to each other in a congested region are routed simultaneously.
Collision-Aware R&R • An overflow net often contains only a few overflow grid edges. • The overflow net needs only a few detours to avoid congested regions • The bounded length of a net is incrementally relaxed • A thread may reuse most grid edges of the original path in the new path
Collision-Aware R&R (cont.) • Hence, it is useful to avoid collisions by preventing new routing paths from passing through the original routing paths of other currently routed nets.
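One way to picture this is a shared registry of the original paths of nets that are currently being rerouted. The sketch below is an assumption-laden illustration: the class and method names are hypothetical, edges are arbitrary hashable ids, and whether the router treats the returned edges as forbidden or merely penalized is left open, since the slide does not say.

```python
import threading


class InFlightPaths:
    """Original paths of the nets that threads are rerouting right now."""

    def __init__(self):
        self._lock = threading.Lock()
        self._edges_by_net = {}

    def begin(self, net, old_path_edges):
        """Called by a thread when it starts rerouting `net`."""
        with self._lock:
            self._edges_by_net[net] = frozenset(old_path_edges)

    def end(self, net):
        """Called when the reroute of `net` is finished."""
        with self._lock:
            self._edges_by_net.pop(net, None)

    def edges_to_avoid(self, net):
        """Edges on the original paths of the other nets being routed."""
        with self._lock:
            avoid = set()
            for other, edges in self._edges_by_net.items():
                if other != net:
                    avoid |= edges
            return avoid


registry = InFlightPaths()
registry.begin("n1", {"e1", "e2"})
registry.begin("n2", {"e3"})
avoid_for_n1 = registry.edges_to_avoid("n1")
registry.end("n2")
avoid_after_n2_done = registry.edges_to_avoid("n1")
```

Since a rerouted net tends to reuse most of its original path, steering other threads away from that path makes it less likely that two threads claim the same grid edge at the same time.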
Outline • Introduction • Preliminaries • Bounded-Length Maze Routing • Task-based Concurrency Strategy • Collision-aware rip-up & reroute • Experimental results and Conclusions
Experimental results • Environment • C/C++ • Quad-core 3.0 GHz Intel Xeon-based PC • 32GB memory
Routing Results of Sequential Global Routers – overflow-free cases • The proposed SGR adopts BLMR in the R&R stage • Only this router and FastRoute use a single set of control parameters to solve all benchmarks
Routing Results of Sequential Global Routers – hard-to-route cases • Faster while maintaining quality
Comparison between proposed SGR and collision-aware PGR – overflow-free cases • Race conditions may make the routing result nondeterministic • Collision-aware PGR is run 10 times
Comparison between proposed SGR and collision-aware PGR – hard-to-route cases
Conclusions • This work is the first to address the BLMR problem and develops algorithms to solve it. • Compared to other modern global routers, the proposed global router using heuristic-BLMR achieves shorter wirelength in less runtime. • The developed PGR adopts TCS and collision-aware R&R to increase CPU utilization and avoid collisions