Graph partition in PCB and VLSI physical synthesis Lin Zhong ELEC424, Fall 2010
Outline • Graph representation • Partition
Graphs • A graph G = (V, E) • V = set of nodes • E = set of edges = subset of V × V • Variations: • A connected graph has a path from every node to every other • In an undirected graph: • Edge (u,v) = edge (v,u) • No self-loops • In a directed graph: • Edge (u,v) goes from node u to node v, notated u → v • A weighted graph associates weights with either the edges or the nodes • E.g., a road map: edges might be weighted with distance David Luebke
Representing Graphs • Assume V = {1, 2, …, n} • An adjacency matrix represents the graph as an n × n matrix A: • A[i, j] = 1 if edge (i, j) ∈ E (or the weight of the edge) • A[i, j] = 0 if edge (i, j) ∉ E David Luebke
Graphs: adjacency matrix • Example: [Figure: a graph on nodes 1–4 with edges labeled a–d] David Luebke
Graphs: adjacency matrix • The adjacency matrix is a dense representation • Usually too much storage for large graphs • But can be very efficient for small graphs • Most large interesting graphs are sparse • E.g., planar graphs, in which no edges cross, have |E| = O(|V|) by Euler’s formula • Sparse graphs (e.g., the WWW) are better stored as an adjacency list
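To make the dense-versus-sparse trade-off concrete, here is a minimal Python sketch of both representations for an undirected graph on nodes 1..n. The function names and the 4-node cycle used as input are illustrative assumptions, not taken from the slides.

```python
def to_adjacency_matrix(n, edges):
    """Dense n x n matrix: entry [i-1][j-1] is 1 if edge (i, j) exists, else 0."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u - 1][v - 1] = 1
        matrix[v - 1][u - 1] = 1  # undirected: edge (u, v) = edge (v, u)
    return matrix


def to_adjacency_list(n, edges):
    """Sparse form: each node maps to the list of its neighbors."""
    adj = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


# Hypothetical example graph: a 4-node cycle 1-2-3-4-1.
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
matrix = to_adjacency_matrix(4, edges)
adj = to_adjacency_list(4, edges)
```

The matrix always stores n² entries, while the adjacency list stores 2|E| neighbor entries in total, which is why the list form tends to win when |E| = O(|V|).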
Graphs in embedded system design • Algorithm & Software • Node: Computation • Edge: Input/output (data dependencies) • Embedded system synthesis • Parallel computing • Hardware • Node: component • Edge: wire/bus • PCB & IC
PCB & IC physical synthesis A. Kahng
Embedded system synthesis • Node: Computation task • Edge: Data communication/dependency • [Figure: FPGA, ASIC, ARM core, DSP]
Load balancing in parallel computing • Node: Process • Edge: Inter-process communication • [Figure: Processes 1–6 distributed across Cores 1–4]
Operational research • Node: Personnel • Edge: Collaboration • [Figure: Buildings 1–4]
Partition • Constraints • Number of partitions • The capability of each partition • Objective • Edge cut • Wire/interconnection cost • Delay in computation (software) • Too many computations assigned to a core • The problem is NP-complete • Bipartitioning: 2-way partitioning • Bisectioning: bipartitioning such that the two partitions have the same size
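The edge-cut objective and the bisection constraint can be sketched in a few lines of Python. The graph, the edge costs, and the names `cut_cost` and `is_bisection` are assumptions chosen for illustration only.

```python
def cut_cost(edges, side):
    """Sum the cost of every edge whose endpoints lie in different partitions."""
    return sum(cost for (u, v), cost in edges.items() if side[u] != side[v])


def is_bisection(side):
    """True when the two partitions X and Y hold the same number of nodes."""
    labels = list(side.values())
    return labels.count("X") == labels.count("Y")


# Hypothetical weighted graph: keys are undirected edges, values are costs.
edges = {("A", "B"): 1, ("A", "C"): 1, ("B", "C"): 1, ("C", "D"): 1}
side = {"A": "X", "B": "X", "C": "Y", "D": "Y"}

print(cut_cost(edges, side))  # edges A-C and B-C are cut
print(is_bisection(side))
```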
Kernighan-Lin (KL) Algorithm • Bisectioning • Input: A graph with • Set of nodes V (|V| = 2n) • Set of edges E (|E| = m) • Cost $c_{AB}$ for each edge {A, B} in E • Output: Two partitions X & Y s.t. • Total cost of edges cut is minimized • Each partition has n nodes • The problem is NP-complete • Kernighan, B. W.; Lin, Shen (1970). "An efficient heuristic procedure for partitioning graphs". Bell System Technical Journal 49: 291–307. D. Pan
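Complexity of graph bisectioning • Brute-force method: choose n out of 2n nodes • $\binom{2n}{n} = \frac{(2n)!}{(n!)^2} = \frac{2n(2n-1)\cdots(n+1)}{n!} = \prod_{r=0}^{n-1} \frac{2n-r}{n-r}$ • For each factor, $n+1 \ge \frac{2n-r}{n-r} \ge 2$ • The number of candidate bisections is therefore between $2^n$ and $(n+1)^n = O(n^n)$, i.e., exponential in n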
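As a quick sanity check of the bounds above, the following Python sketch counts the ways to choose n of 2n nodes and confirms the count lies between 2^n and (n+1)^n for small n. The function name is an assumption; `math.comb` is the standard-library binomial coefficient.

```python
from math import comb


def count_bisection_choices(n):
    """Number of ways to pick the n nodes of one partition out of 2n nodes."""
    return comb(2 * n, n)


for n in range(1, 11):
    choices = count_bisection_choices(n)
    # Each of the n factors (2n-r)/(n-r) lies between 2 and n+1.
    assert 2 ** n <= choices <= (n + 1) ** n

print(count_bisection_choices(10))  # 184756
```

Even at n = 10 (20 nodes) there are already 184,756 candidates, which is why heuristics such as KL are used instead of enumeration.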
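Idea of KL Algorithm • $D_A$ = decrease in cut value if A is moved to the other partition • $D_A = E_A - I_A$, where $E_A$ is the external cost (connections to the other partition) and $I_A$ is the internal cost (connections within A's own partition) • Moving A from partition X to Y decreases the value of the cut set by $E_A$ and increases it by $I_A$ • Example [Figure: nodes A, B, C, D split across X and Y]: $D_A = 2 - 1 = 1$, $D_B = 1 - 1 = 0$ D. Pan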
Idea of KL Algorithm • Note that we want to keep the two partitions balanced • If A and B are switched, gain(A,B) = $D_A + D_B - 2c_{AB}$ • $c_{AB}$: edge cost for AB • Example [Figure: A and B exchanged between X and Y]: gain(A,B) = 1 + 0 − 2 = −1 D. Pan
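The D values and the swap gain can be computed directly from an adjacency structure. The sketch below is an illustration under assumptions: the slide's figure is not recoverable, so the 4-node graph here is a hypothetical one chosen so that it reproduces the slide's numbers (D_A = 1, D_B = 0, gain(A,B) = −1), and `d_value` and `swap_gain` are made-up names.

```python
def d_value(node, adj, side):
    """D = external cost - internal cost = cut decrease if `node` switches sides."""
    external = sum(c for nbr, c in adj[node].items() if side[nbr] != side[node])
    internal = sum(c for nbr, c in adj[node].items() if side[nbr] == side[node])
    return external - internal


def swap_gain(a, b, adj, side):
    """gain(A,B) = D_A + D_B - 2*c_AB for nodes a and b on opposite sides."""
    return d_value(a, adj, side) + d_value(b, adj, side) - 2 * adj[a].get(b, 0)


# Hypothetical graph with unit edge costs: A-B, A-C, A-D, B-D.
adj = {
    "A": {"B": 1, "C": 1, "D": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1},
    "D": {"A": 1, "B": 1},
}
side = {"A": "X", "C": "X", "B": "Y", "D": "Y"}

print(d_value("A", adj, side))         # 2 external - 1 internal
print(d_value("B", adj, side))         # 1 external - 1 internal
print(swap_gain("A", "B", adj, side))  # 1 + 0 - 2*1
```

The −2·c_AB term appears because the edge A-B is counted as external in both D_A and D_B, yet it is still cut after the two nodes trade places. In this graph the cut goes from 2 to 3 after the swap, matching the gain of −1.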
Idea of KL Algorithm
• Start with any initial legal partitions X and Y.
• A PASS (exchanging each node exactly once) is described below:
1. For i := 1 to n do:
   From the unlocked (unexchanged) nodes, choose a pair (A,B) s.t. gain(A,B) is largest.
   Tentatively exchange A and B. Lock A and B for this pass.
   Let $g_i$ = gain(A,B).
2. Find the k s.t. $G = g_1 + \dots + g_k$ is maximized.
3. Switch the first k pairs.
• Repeat the PASS until a pass yields no improvement (G ≤ 0). D. Pan
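The pass described above can be sketched in Python as follows. This is a straightforward, unoptimized reading of the three steps, not the authors' implementation: the function names, the dict-of-dicts graph format, the integer node labels, and the two-triangle test graph are all assumptions. The D-value update after each tentative exchange uses the standard KL rule (for x still in X: D_x += 2c_xa − 2c_xb, and symmetrically for Y).

```python
def cut_cost(adj, side):
    """Total cost of edges crossing the partition (each edge counted once)."""
    return sum(c for u in adj for v, c in adj[u].items()
               if u < v and side[u] != side[v])


def kl_pass(adj, side):
    """Run one KL pass; mutates `side` and returns the gain G that was applied."""
    # D value of every node: external cost minus internal cost.
    d = {v: sum(c if side[u] != side[v] else -c for u, c in adj[v].items())
         for v in adj}
    free_x = {v for v in adj if side[v] == "X"}
    free_y = {v for v in adj if side[v] == "Y"}
    swaps, gains = [], []
    while free_x and free_y:
        # Step 1: choose the unlocked pair with the largest gain.
        g, a, b = max(((d[a] + d[b] - 2 * adj[a].get(b, 0), a, b)
                       for a in sorted(free_x) for b in sorted(free_y)),
                      key=lambda t: t[0])
        free_x.remove(a)  # lock a and b for the rest of this pass
        free_y.remove(b)
        swaps.append((a, b))
        gains.append(g)
        # Update D values as if a and b had already been exchanged.
        for x in free_x:
            d[x] += 2 * adj[x].get(a, 0) - 2 * adj[x].get(b, 0)
        for y in free_y:
            d[y] += 2 * adj[y].get(b, 0) - 2 * adj[y].get(a, 0)
    # Step 2: find the prefix length k that maximizes g_1 + ... + g_k.
    best_k, best_total, running = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_total:
            best_k, best_total = k, running
    # Step 3: actually switch only the first k pairs.
    for a, b in swaps[:best_k]:
        side[a], side[b] = "Y", "X"
    return best_total


# Hypothetical graph: triangles {1,2,3} and {4,5,6} joined by edge 3-4.
adj = {1: {2: 1, 3: 1}, 2: {1: 1, 3: 1}, 3: {1: 1, 2: 1, 4: 1},
       4: {3: 1, 5: 1, 6: 1}, 5: {4: 1, 6: 1}, 6: {4: 1, 5: 1}}
side = {1: "X", 2: "X", 4: "X", 3: "Y", 5: "Y", 6: "Y"}

initial_cut = cut_cost(adj, side)
first_gain = kl_pass(adj, side)
final_cut = cut_cost(adj, side)
```

On this input the first pass exchanges nodes 4 and 3, and a second call to `kl_pass` returns 0, which is the stopping condition. The pair search here is the O(n²)-per-exchange scan, so a pass costs O(n³).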
Time complexity of KL • For each pass, • $O(n^2)$ time to find the best pair to exchange. • n pairs exchanged. • Total time is $O(n^3)$ per pass. • Better implementation can get $O(n^2 \log n)$ time per pass. • Number of passes is usually small. D. Pan
Fiduccia-Mattheyses (FM) Algorithm • Modification of KL Algorithm: • Allows non-uniform vertex weights (areas) • Allows unbalanced partitions • Extended to handle hypergraphs • Clever way to select nodes to move, so it runs much faster • Input: A hypergraph with • Set of nodes V (|V| = n) • Set of hyperedges E • Area $a_u$ for each node u in V • Cost $c_e$ for each hyperedge e in E • An area ratio r (unbalanced partition) • Output: 2 partitions X & Y such that • Total cost of hyperedges cut is minimized • area(X) / (area(X) + area(Y)) is about r • The problem is NP-complete • C. M. Fiduccia and R. M. Mattheyses. "A linear-time heuristic for improving network partitions". Proc. DAC, pp. 174–181, 1982. D. Pan
Hypergraph vs. graph • Nodes: A, B, C, D • Hyperedges: {A,B,C}, {B,D}, {C,D} • Vertex label: Gate size/area • Hyperedge weight: importance/cost of net • [Figure: the hypergraph on A, B, C, D next to a graph on the same nodes] D. Pan
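A hyperedge (net) counts as cut when its nodes do not all lie in the same partition. The Python sketch below applies that rule to the hyperedges listed above; the unit costs, the partition assignment, and the function names are assumptions for illustration.

```python
def hyperedge_is_cut(nodes, side):
    """A hyperedge is cut when its nodes span more than one partition."""
    return len({side[v] for v in nodes}) > 1


def hypergraph_cut_cost(hyperedges, side):
    """Sum the cost of all cut hyperedges; each is a (node set, cost) pair."""
    return sum(cost for nodes, cost in hyperedges if hyperedge_is_cut(nodes, side))


# Hyperedges from the slide; the unit costs are an assumption.
hyperedges = [({"A", "B", "C"}, 1), ({"B", "D"}, 1), ({"C", "D"}, 1)]
side = {"A": "X", "B": "X", "C": "X", "D": "Y"}

print(hypergraph_cut_cost(hyperedges, side))  # {B,D} and {C,D} are cut
```

Note that a 3-pin net such as {A,B,C} contributes its cost once when cut, no matter how its pins are split, which is the main difference from counting cut edges in an ordinary graph.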
Ideas of FM Algorithm • Similar to KL • Works in passes. • Locks nodes after they are moved. • Actually moves only those nodes up to the maximum partial sum of gain • Difference from KL • Does not exchange pairs of nodes; moves only one node at a time • Uses a bucket data structure for gains D. Pan
Time Complexity of FM • For each pass, • Constant time to find the best node to move. • After each move, time to update gains is proportional to degree of node moved. • Total time is O(p), where p is total number of pins • Number of passes is usually small. D. Pan
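The constant-time selection relies on gains being small integers, so free nodes can be kept in an array of buckets indexed by gain. The sketch below is a simplified illustration of that idea, with assumptions stated plainly: the class and method names are made up, Python sets stand in for the linked lists usually used inside each bucket, and the area-balance check that FM performs before accepting a move is omitted.

```python
class GainBuckets:
    """Free nodes grouped by integer gain in the range [-max_gain, max_gain]."""

    def __init__(self, max_gain):
        self.max_gain = max_gain
        self.buckets = [set() for _ in range(2 * max_gain + 1)]
        self.gain_of = {}
        self.top = -1  # index of the highest non-empty bucket, -1 if none

    def insert(self, node, gain):
        assert -self.max_gain <= gain <= self.max_gain
        idx = gain + self.max_gain
        self.buckets[idx].add(node)
        self.gain_of[node] = gain
        self.top = max(self.top, idx)

    def remove(self, node):
        idx = self.gain_of.pop(node) + self.max_gain
        self.buckets[idx].discard(node)
        # Slide the max pointer down past any empty buckets.
        while self.top >= 0 and not self.buckets[self.top]:
            self.top -= 1

    def update(self, node, delta):
        """Move a node to a new bucket after a neighbor's move changed its gain."""
        gain = self.gain_of[node] + delta
        self.remove(node)
        self.insert(node, gain)

    def pop_best(self):
        """Return (node, gain) for some node of maximum gain, or None if empty."""
        if self.top < 0:
            return None
        node = next(iter(self.buckets[self.top]))
        gain = self.gain_of[node]
        self.remove(node)
        return node, gain


buckets = GainBuckets(max_gain=2)
buckets.insert("a", 1)
buckets.insert("b", -1)
buckets.insert("c", 2)
best = buckets.pop_best()  # the node in the highest non-empty bucket
buckets.update("b", 2)     # b's gain rises from -1 to 1
```

Inserting, removing, and re-bucketing a node each touch only one bucket, so the work after a move stays proportional to the number of pins whose gains change.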
Find the optimal bisection with the KL Algorithm • [Figure: X = {1, 2, 3}, Y = {4, 5, 6}] • Original cut value = 9 D. Pan
[Figure: X = {4, 2, 3}, Y = {1, 5, 6}] • Optimal cut value = 5 D. Pan
Homework • C. M. Fiduccia and R. M. Mattheyses, "A linear-time heuristic for improving network partitions," Proceedings of the Design Automation Conference, pp. 174–181, 1982 • B. Krishnamurthy, "An Improved Min-Cut Algorithm for Partitioning VLSI Networks," IEEE Transactions on Computers, 1984