Computation on Graphs: Partitioning, Maximal Independent Set, etc.
Parallel sparse matrix-vector product
[figure: block rows of the matrix and entries of x and y distributed across processors P0–P3]
• Lay out the matrix and both vectors by rows
• y(i) = sum over j of A(i,j)*x(j)
• Only compute terms with A(i,j) ≠ 0
• Algorithm, on each processor i (see the sketch below):
    Broadcast x(i)
    Compute y(i) = A(i,:)*x
• Communication volume v = p*n (way too much!)
• Reducing communication volume:
    • Only send each processor the parts of x it actually needs
    • Reorder the matrix for better locality by graph partitioning
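A minimal Python sketch of the row-layout algorithm, simulating the per-processor steps serially; the function name and the use of numpy/scipy are illustrative assumptions, not part of the slides:

```python
import numpy as np
import scipy.sparse as sp

def spmv_rowwise(A, x, p):
    """Simulate y = A*x with the matrix and vectors laid out by
    block rows over p (simulated) processors."""
    n = A.shape[0]
    rows = np.array_split(np.arange(n), p)    # block-row ownership
    # "Broadcast x(i)": every processor ends up holding all of x,
    # so communication volume is about p*n entries -- the slide's point.
    x_full = np.concatenate([x[r] for r in rows])
    y = np.empty(n)
    for r in rows:                            # each processor i, in turn
        y[r] = A[r, :] @ x_full               # y(i) = A(i,:) * x
    return y

A = sp.random(8, 8, density=0.3, format="csr", random_state=0)
x = np.ones(8)
assert np.allclose(spmv_rowwise(A, x, p=4), A @ x)
```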
2D block decomposition for a 5-point stencil
• n stencil cells, p processors
• Each processor owns a patch of n/p cells
• Block-row (or block-column) layout: v = 2 * p * sqrt(n)
• 2-dimensional block layout: v = 4 * sqrt(p) * sqrt(n)
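A quick numeric check of the two communication-volume formulas (the sample values of n and p are my own, chosen for illustration):

```python
from math import sqrt

def v_block_row(n, p):      # block-row (or block-column) layout
    return 2 * p * sqrt(n)

def v_2d_block(n, p):       # 2-dimensional block layout
    return 4 * sqrt(p) * sqrt(n)

n, p = 1_000_000, 64
print(v_block_row(n, p))    # 128000.0
print(v_2d_block(n, p))     # 32000.0 -- 4x less communication here
```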
Graphs and Sparse Matrices
• A sparse matrix is a representation of a (sparse) graph
[figure: a 6-vertex graph and the corresponding 6×6 sparse matrix, with a nonzero at (i,j) for each edge (i,j)]
• Matrix entries are edge weights
• Number of nonzeros per row is the vertex degree
• Edges represent data dependencies in matrix-vector multiplication
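A small sketch of the graph-to-sparse-matrix correspondence in Python; the edge list is illustrative, not the slide's exact graph:

```python
import numpy as np
import scipy.sparse as sp

edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5)]   # undirected
i, j = zip(*edges)
rows = np.array(i + j)           # symmetrize: store both (i,j) and (j,i)
cols = np.array(j + i)
vals = np.ones(len(rows))        # unit edge weights
A = sp.csr_matrix((vals, (rows, cols)), shape=(6, 6))

# Number of nonzeros in row v = degree of vertex v
print(np.diff(A.indptr))         # [2 2 2 3 2 1]
```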
Graph partitioning
[figure: two partitions of the same graph, one with edge crossings = 6 and one with edge crossings = 10]
• Assigns subgraphs to processors
• Determines parallelism and locality
• Tries to make all subgraphs the same size (load balance)
• Tries to minimize edge crossings (communication)
• Exact minimization is NP-complete
Graph partitioning demo
> load meshes
> gplotg(Airfoil,Axy)
> specdice(Airfoil,Axy,5)
> meshdemo
[figures: spectral bisection; 32-way partition]
Applications of graph partitioning
• Telephone network design
    • The original application; 1970 algorithm due to Kernighan
• Load balancing while minimizing communication
• Sparse matrix times vector multiplication
    • Solving PDEs
    • N = {1, …, n}; (j,k) in E if A(j,k) is nonzero
    • W_N(j) = # nonzeros in row j, W_E(j,k) = 1
• VLSI layout
    • N = {units on chip}, E = {wires}, W_E(j,k) = wire length
• Sparse Gaussian elimination
    • Used to reorder rows and columns to increase parallelism and to decrease "fill-in"
• Data mining and clustering
• Physical mapping of DNA
• . . .
Partitioning by Repeated Bisection
• To partition into 2^k parts, bisect the graph recursively k times (see the sketch below)
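A minimal sketch of repeated bisection. The bisect() placeholder simply splits the vertex list in half; a real implementation would plug in spectral bisection, Kernighan-Lin, or a partitioning library:

```python
def bisect(vertices):
    """Placeholder bisection: split the vertex list in half.
    Swap in a real graph bisection here."""
    mid = len(vertices) // 2
    return vertices[:mid], vertices[mid:]

def partition(vertices, k):
    """Partition into 2**k parts by k levels of recursive bisection."""
    if k == 0:
        return [vertices]
    left, right = bisect(vertices)
    return partition(left, k - 1) + partition(right, k - 1)

print(partition(list(range(16)), k=2))   # 2**2 = 4 parts of 4 vertices each
```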
See separate slides on details of some graph partitioning methods
A graph problem: Maximal Independent Set
• Graph with vertices V = {1, 2, …, n}
• A set S of vertices is independent if no two vertices in S are neighbors
• An independent set S is maximal if it is impossible to add another vertex and stay independent
• An independent set S is maximum if no other independent set has more vertices
• Finding a maximum independent set is intractably difficult (NP-hard)
• Finding a maximal independent set is easy, at least on one processor (see the checker sketch below)
[figure: an 8-vertex graph in which the set of red vertices S = {4, 5} is independent and is maximal, but not maximum]
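These definitions are easy to state as code. A small illustrative checker (the function names and the tiny example graph are my own, not from the slides); neighbors maps each vertex to the set of its adjacent vertices:

```python
def is_independent(S, neighbors):
    """No two vertices of S are adjacent."""
    return all(u not in neighbors[v] for v in S for u in S)

def is_maximal(S, neighbors):
    """Independent, and no outside vertex can be added."""
    if not is_independent(S, neighbors):
        return False
    outside = set(neighbors) - set(S)
    return all(any(u in neighbors[v] for u in S) for v in outside)

nbrs = {1: {2}, 2: {1, 3}, 3: {2}}      # path graph 1-2-3
print(is_independent({1, 3}, nbrs))     # True
print(is_maximal({1, 3}, nbrs))         # True
print(is_maximal({1}, nbrs))            # False: vertex 3 could still be added
```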
Sequential Maximal Independent Set Algorithm
    S = empty set;
    for vertex v = 1 to n {
        if (v has no neighbor in S) {
            add v to S
        }
    }
[figures: successive frames on the 8-vertex example graph show S growing: { } → { 1 } → { 1, 5 } → { 1, 5, 6 }]
work ~ O(n), but span ~ O(n)
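A direct Python rendering of the sequential algorithm; because each iteration depends on the S built so far, the loop is inherently serial, which is why the span is O(n):

```python
def sequential_mis(vertices, neighbors):
    """neighbors maps each vertex to the set of its adjacent vertices."""
    S = set()
    for v in vertices:                 # for vertex v = 1 to n
        if not (neighbors[v] & S):     # v has no neighbor in S
            S.add(v)
    return S
```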
Parallel, Randomized MIS Algorithm [Luby]
    S = empty set; C = V;
    while C is not empty {
        label each v in C with a random r(v);
        for all v in C in parallel {
            if r(v) < min( r(neighbors of v) ) {
                move v from C to S;
                remove neighbors of v from C;
            }
        }
    }
[figures: on the 8-vertex example, round 1 draws random labels and moves the local minima into S, leaving S = { 1, 5 } and C = { 6, 8 }; round 2 relabels the survivors and finishes with S = { 1, 5, 8 } and C = { }]
Theorem: This algorithm "very probably" finishes within O(log n) rounds.
work ~ O(n log n), but span ~ O(log n)
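A serial Python sketch of Luby's algorithm; the "for all v in C in parallel" loop is simulated, but each round's decisions depend only on the labels drawn at the start of that round, just as in the parallel version:

```python
import random

def luby_mis(vertices, neighbors):
    S, C = set(), set(vertices)
    while C:
        r = {v: random.random() for v in C}          # random labels r(v)
        winners = {v for v in C                      # local minima join S
                   if all(r[v] < r[u] for u in neighbors[v] if u in C)}
        S |= winners
        C -= winners
        for v in winners:                            # drop their neighbors
            C -= neighbors[v]
    return S
```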
Link analysis of the web
[figure: a small 7-page web graph and its 7×7 link matrix]
• Web page = vertex
• Link = directed edge
• Link matrix: A(i,j) = 1 if page i links to page j
Web graph: PageRank (Google) [Brin, Page]
• An important page is one that many important pages point to.
• Markov process: follow a random link most of the time; otherwise, go to any page at random.
• Importance = stationary distribution of the Markov process.
• Transition matrix is p*A + (1-p)*ones(size(A)), scaled so each column sums to 1.
• Importance of page i is the i-th entry in the principal eigenvector of the transition matrix.
• But the matrix is 1,000,000,000,000 by 1,000,000,000,000.
A PageRank matrix
• Importance ranking of web pages
• Stationary distribution of a Markov chain
• Power method: just matvec and vector arithmetic (see the sketch below)
• Matlab*P page-ranking demo (from SC'03) on a web crawl of mit.edu (170,000 pages)
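A small sketch of PageRank by the power method, following the slides' construction of the transition matrix (damping p, then scaling so each column sums to 1). Since the slides define A(i,j) = 1 when page i links to page j, A is transposed here so that columns correspond to source pages; the 4-page link matrix is illustrative:

```python
import numpy as np

def pagerank(A, p=0.85, iters=100):
    """A[i, j] = 1 if page i links to page j."""
    n = A.shape[0]
    M = p * A.T + (1 - p) * np.ones((n, n))  # column j = out-links of page j
    M /= M.sum(axis=0)                       # scale each column to sum to 1
    x = np.full(n, 1.0 / n)
    for _ in range(iters):                   # power method: just matvecs
        x = M @ x                            # converges to the principal eigenvector
    return x

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)
print(pagerank(A))    # page 2, with the most in-links, ranks highest
```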