
Connected Components & All Pairs Shortest Paths

Connected Components & All Pairs Shortest Paths. Presented by Wooyoung Kim, 3/4/09, CSc 8530 Parallel Algorithms, Spring 2009, Dr. Sushil K. Prasad. Outline: adjacency matrix and connectivity matrix; parallel algorithm for computing the connectivity matrix.


Presentation Transcript


  1. Connected Components & All Pairs Shortest Paths. Presented by Wooyoung Kim, 3/4/09, CSc 8530 Parallel Algorithms, Spring 2009, Dr. Sushil K. Prasad

  2. Outline • Adjacency matrix and connectivity matrix • Parallel algorithm for computing the connectivity matrix • Parallel algorithm for computing connected components • Sequential algorithms for all-pairs shortest paths • Parallel algorithm for all-pairs shortest paths • Analysis • Related recent research • References

  3. 9.3. Connected Components

  4. Connected Components • Let $G = (V, E)$ be a graph with $V = \{v_0, v_1, \dots, v_{n-1}\}$ • It can be represented by an n x n adjacency matrix A defined as $a_{jk} = 1$ if $v_j$ is connected to $v_k$ by an edge and $a_{jk} = 0$ otherwise, for $0 \le j, k \le n-1$ • A connected component of an undirected graph G is a connected subgraph of G of maximum size • Given such a graph G, we develop an algorithm for computing its connected components on a hypercube interconnection network parallel computer

  5. Adjacency matrix - Examples [Figure: two example graphs with their adjacency matrices. Example 1: undirected graph. Example 2: directed graph.]

  6. Applications for Connected Components • Identifying clusters: represent each item by a vertex and add an edge between each pair of items that are "similar". The connected components of this graph correspond to different classes of items. • Component labeling is commonly used in image processing to join neighboring pixels into connected regions, which are the shapes in the image. • Testing whether a graph is connected is a common preprocessing step for many graph algorithms.

  7. Computing the Connectivity Matrix • A key step in the algorithm for finding the connected components is to find the so-called connectivity matrix • Definition: a connectivity matrix of a (directed or undirected) graph G with n vertices is an n x n matrix C defined as $c_{jk} = 1$ if there is a path of length 0 or more from $v_j$ to $v_k$ and $c_{jk} = 0$ otherwise, for $0 \le j, k \le n-1$ • C is also known as the reflexive and transitive closure of G • Given the adjacency matrix A of G, it is required to compute C

  8. Computing the Connectivity Matrix – cont. • Approach: Boolean matrix multiplication • The matrices to be multiplied and the product matrix are all binary • The logical "and" operation replaces regular multiplication • The logical "or" operation replaces regular addition • If X, Y and Z are n x n Boolean matrices, where Z is the Boolean product of X and Y, then $z_{ij} = (x_{i1} \text{ and } y_{1j}) \text{ or } (x_{i2} \text{ and } y_{2j}) \text{ or } \dots \text{ or } (x_{in} \text{ and } y_{nj})$ (in the regular product: $z_{ij} = \sum_{k=1}^{n} x_{ik} y_{kj}$)
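
The Boolean product described on this slide can be sketched sequentially as follows. This is a minimal illustration, not the hypercube algorithm: the function name `bool_product` and the list-of-lists representation are assumptions made here, and indices run from 0 rather than 1.

```python
def bool_product(x, y):
    """Boolean product of two n x n 0/1 matrices stored as lists of lists.

    'and' replaces multiplication and 'or' replaces addition.
    (Hypothetical helper, named here for illustration only.)
    """
    n = len(x)
    # z[i][j] = OR over k of (x[i][k] AND y[k][j])
    return [
        [int(any(x[i][k] and y[k][j] for k in range(n))) for j in range(n)]
        for i in range(n)
    ]


x = [[1, 0], [1, 1]]
y = [[0, 1], [0, 0]]
print(bool_product(x, y))  # [[0, 1], [0, 1]]
```

Using `any` stops scanning a row as soon as one term is 1, which is the sequential counterpart of the "or" reduction.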

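  9. Computing the Connectivity Matrix – cont. • 1st step: obtain an n x n matrix B from A as follows: $b_{jk} = a_{jk}$ if $j \ne k$ and $b_{jj} = 1$, for $0 \le j, k \le n-1$, i.e. B is equal to A augmented with 1's along the diagonal • B therefore represents all the paths in G of length less than 2, or $b_{jk} = 1$ if there is a path of length 0 or 1 from $v_j$ to $v_k$ and $b_{jk} = 0$ otherwise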

  10. Computing the Connectivity Matrix – cont. • Then $B^2 = B \times B$ (a Boolean product of B with itself) represents paths of length 2 or less • $b^2_{ik} = 1$ indicates a path of length 2 or less from $v_i$ to $v_k$ through some $v_j$ • Generally, $B^n$ represents paths of length n or less • Observe: if there is a path from $v_i$ to $v_j$, its length need not exceed n-1 since G has only n vertices • Hence, the connectivity matrix is $C = B^{n-1}$ • $B^{n-1}$ is computed through successive squaring [Figure: row i of B combined with column k of B gives entry (i, k) of $B^2$, corresponding to a path from vi to vk through vj]

  11. Computing the Connectivity Matrix – cont. • C is obtained after $\lceil \log (n-1) \rceil$ Boolean matrix multiplications • When n-1 is not a power of 2, C is obtained from $B^m$, where $m = 2^{\lceil \log (n-1) \rceil}$ (the smallest power of 2 larger than n-1); this is correct since $B^m = B^{n-1}$ for m > n-1 • Implementation: we use the algorithm HYPERCUBE MATRIX MULTIPLICATION, adapted to perform Boolean matrix multiplication • Input: the adjacency matrix A of G • Output: the connectivity matrix C
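
The squaring count on this slide is easy to check numerically. The sketch below assumes logarithms are base 2 and uses two hypothetical helper names, `squarings_needed` and `final_power`, that do not appear in the slides.

```python
def squarings_needed(n):
    """Return ceil(log2(n - 1)), the number of squarings for n >= 2 vertices."""
    if n < 2:
        raise ValueError("need at least two vertices")
    # For an integer x >= 1, ceil(log2(x)) equals (x - 1).bit_length().
    return (n - 2).bit_length()


def final_power(n):
    """Return m = 2 ** ceil(log2(n - 1)), the power of B actually computed."""
    return 2 ** squarings_needed(n)


for n in (5, 8, 9, 10):
    print(n, squarings_needed(n), final_power(n))
```

Integer bit arithmetic is used instead of `math.log2` so that exact powers of two are never affected by floating-point rounding.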

  12. Computing the Connectivity Matrix – cont. • The hypercube used has $N = n^3$ processors: $P_0, P_1, \dots, P_{N-1}$ • They are arranged in an n x n x n array; $P_r$ occupies position (i, j, k) where $r = i n^2 + j n + k$, $0 \le i, j, k \le n-1$ • Processor $P_r$ has 3 registers: A(i,j,k), B(i,j,k), C(i,j,k) • Initially, the processors in position (0,j,k), $0 \le j, k \le n-1$, contain the adjacency matrix: $A(0,j,k) = a_{jk}$ • At the end, the same processors contain the connectivity matrix: $C(0,j,k) = c_{jk}$, $0 \le j, k \le n-1$
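
The mapping between a processor index r and its array position (i, j, k) can be illustrated with a short sketch. The function names `position_to_rank` and `rank_to_position` are assumptions introduced here for illustration.

```python
def position_to_rank(i, j, k, n):
    """Index r of the processor at position (i, j, k): r = i*n^2 + j*n + k."""
    return i * n * n + j * n + k


def rank_to_position(r, n):
    """Inverse mapping: recover (i, j, k) from r for an n x n x n array."""
    i, rest = divmod(r, n * n)
    j, k = divmod(rest, n)
    return i, j, k


n = 4
print(position_to_rank(1, 2, 3, n))  # 27
print(rank_to_position(27, n))       # (1, 2, 3)
```

With this layout the processors that hold the input and output matrices, those at positions (0, j, k), are exactly the first n * n indices.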

  13. Algorithm HYPERCUBE CONNECTIVITY (A, C)
      Step 1: for j = 0 to n-1 do in parallel
                  A(0, j, j) ← 1
              end for
      Step 2: for j = 0 to n-1 do in parallel
                  for k = 0 to n-1 do in parallel
                      B(0, j, k) ← A(0, j, k)
                  end for
              end for
      Step 3: for i = 1 to ⌈log (n-1)⌉ do
                  (3.1) HYPERCUBE MATRIX MULTIPLICATION (A, B, C)
                  (3.2) for j = 0 to n-1 do in parallel
                            for k = 0 to n-1 do in parallel
                                (i) A(0, j, k) ← C(0, j, k)
                                (ii) B(0, j, k) ← C(0, j, k)
                            end for
                        end for
              end for
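
A sequential sketch of the same three steps may help when tracing the algorithm by hand. It is not a hypercube implementation: the parallel loops become ordinary loops, the Boolean product is computed directly, and the name `connectivity_matrix` is an assumption made for this example.

```python
def connectivity_matrix(adj):
    """Reflexive and transitive closure of a 0/1 adjacency matrix.

    Sequential stand-in for HYPERCUBE CONNECTIVITY (hypothetical helper).
    """
    n = len(adj)
    # Steps 1 and 2: put 1's on the diagonal and copy the matrix into B.
    b = [[1 if j == k else adj[j][k] for k in range(n)] for j in range(n)]
    # Step 3: square B ceil(log2(n - 1)) times using the Boolean product.
    for _ in range(max(n - 2, 0).bit_length()):
        b = [
            [int(any(b[j][l] and b[l][k] for l in range(n))) for k in range(n)]
            for j in range(n)
        ]
    return b


# Directed path v0 -> v1 -> v2 -> v3: each vertex reaches itself and
# every later vertex on the path.
path = [[0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 0]]
for row in connectivity_matrix(path):
    print(row)
```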

  14. Analysis of the "HYPERCUBE CONNECTIVITY" algorithm • Steps 1, 2 and (3.2) take constant time • Step (3.1), HYPERCUBE MATRIX MULTIPLICATION, takes $O(\log n)$ time and is iterated $\lceil \log (n-1) \rceil$ times • Total running time: $t(n) = O(\log^2 n)$ • Since $p(n) = n^3$, the cost is $c(n) = O(n^3 \log^2 n)$

  15. Algorithm for Connected Components • Construct an n x n matrix D using the connectivity matrix C: $d_{jk} = v_k$ if $c_{jk} = 1$ and $d_{jk} = 0$ otherwise, for $0 \le j, k \le n-1$, i.e. row j of D contains the names of the vertices to which $v_j$ is connected by a path • Connected components of G are found by assigning each vertex to a component in the following way: $v_j$ is assigned to component l if l is the smallest index for which $d_{jl} \ne 0$ [Figure: row j of D with leading zeros; the first nonzero entry, in column l, gives the component of vj]

  16. Implementation of the Connected Components algorithm • Implemented on a hypercube using the HYPERCUBE CONNECTIVITY algorithm • It runs on a hypercube with $N = n^3$ processors, each with three registers A, B, C • Processors are arranged in an n x n x n array as required for the HYPERCUBE CONNECTIVITY algorithm • Initially: $A(0,j,k) = a_{jk}$ for $0 \le j, k \le n-1$ • At the end: C(0,j,0) contains the component number for vertex $v_j$

  17. Algorithm HYPERCUBE CONNECTED COMPONENTS (A, C)
      Step 1: HYPERCUBE CONNECTIVITY (A, C)
      Step 2 (creating matrix D):
              for j = 0 to n-1 do in parallel
                  for k = 0 to n-1 do in parallel
                      if C(0, j, k) = 1 then C(0, j, k) ← vk end if
                  end for
              end for
      Step 3: for j = 0 to n-1 do in parallel
                  (3.1) The n processors in row j find the smallest l for which C(0, j, l) ≠ 0
                  (3.2) C(0, j, 0) ← l
              end for
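
The labeling rule can be tried out with a sequential sketch. Everything named here (`connected_components`, `undirected_adjacency`, the edge list) is an assumption for illustration. The sketch reads the connectivity matrix directly instead of building D, and the edges are hypothetical, chosen only so that the resulting components match the sets listed in the example slides.

```python
def undirected_adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an edge list."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj


def connected_components(adj):
    """Label each vertex with the smallest vertex index in its component."""
    n = len(adj)
    # Connectivity matrix: diagonal set to 1, then repeated Boolean squaring.
    c = [[1 if j == k else adj[j][k] for k in range(n)] for j in range(n)]
    for _ in range(max(n - 2, 0).bit_length()):
        c = [
            [int(any(c[j][l] and c[l][k] for l in range(n))) for k in range(n)]
            for j in range(n)
        ]
    # Step 3: for each row j, the smallest l with a nonzero entry.
    return [min(l for l in range(n) if c[j][l]) for j in range(n)]


# Hypothetical edges; the slides only list the resulting components.
edges = [(0, 5), (5, 7), (1, 2), (2, 4), (3, 6)]
print(connected_components(undirected_adjacency(8, edges)))
# [0, 1, 1, 3, 1, 0, 3, 0]
```

Vertices that share a label form one component, so the output groups {v0, v5, v7}, {v1, v2, v4} and {v3, v6}.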

  18. Analysis of the "HYPERCUBE CONNECTED COMPONENTS" algorithm • Step 1 requires $O(\log^2 n)$ time • Steps 2 and (3.2) take constant time • Step (3.1): the n processors in row j form a log n dimensional hypercube; this step is a reduction operation (Step 3 of HYPERCUBE MATRIX MULTIPLICATION with "+" replaced by "min") and takes $O(\log n)$ time • Overall running time: $t(n) = O(\log^2 n)$ • Since $p(n) = n^3$, the cost is $c(n) = O(n^3 \log^2 n)$

  19. Example: computing connected components on a hypercube [Figure: graph G on vertices v0 to v7 and the 8 x 8 adjacency matrix of G]

  20. Example [Figure: graph G and the Boolean product A x A = A2]

  21. Example 2 – computing the Connectivity Matrix [Figure: graph G and the Boolean product A2 x A2 = A4; since A4 = A2, the squaring stops]

  22. Example 2 – cont. [Figure: graph G, its connectivity matrix, and the matrix of connected components]

  23. Example 2 – cont. [Figure: graph G and the matrix of connected components] • Component 1: {v0, v5, v7} • Component 2: {v1, v2, v4} • Component 3: {v3, v6}

  24. 9.5. All-Pairs Shortest Paths

  25. Graph Terminology • G = (V, E) • W = weight matrix • wij = weight/length of edge (vi, vj) • wij = ∞ if vi and vj are not connected by an edge • wii = 0 • Assume W has positive, 0, and negative values • For this problem, we cannot have a negative-sum cycle in G

  26. Weighted Graph and Weight Matrix [Figure: weighted graph on vertices v0 to v4 and its 5 x 5 weight matrix]

  27. Directed Weighted Graph and Weight Matrix [Figure: directed weighted graph on vertices v0 to v5 and its 6 x 6 weight matrix]

  28. All-Pairs Shortest Paths Problem • For every pair of vertices vi and vj in V, it is required to find the length of the shortest path from vi to vj along edges in E. • Specifically, a matrix D is to be constructed such that dij is the length of the shortest path from vi to vj in G, for all i and j. • Length of a path (or cycle) is the sum of the lengths (weights) of the edges forming it.

  29. Sample Shortest Path [Figure: the directed weighted graph on vertices v0 to v5] • The shortest path from v0 to v4 is along edges (v0, v1), (v1, v2), (v2, v4) and has length 6

  30. Disallowing Negative-length Cycles • APSP does not allow for input to contain negative-length cycles • This is necessary because: • If such a cycle were to exist within a path from vi to vj, then one could traverse this cycle indefinitely, producing paths of ever shorter lengths from vi to vj. • If a negative-length cycle exists, then all paths which contain this cycle would have a length of -∞.

  31. Sequential Algorithms for APSP • Floyd-Warshall algorithm: $\Theta(|V|^3)$; appropriate for dense graphs, $|E| = O(|V|^2)$ • Johnson's algorithm: appropriate for sparse graphs, $|E| = O(|V|)$; $O(|V|^2 \log |V| + |V||E|)$ if using a Fibonacci heap; $O(|V||E| \log |V|)$ if using a binary min-heap
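
For reference, the Floyd-Warshall algorithm named on this slide can be sketched in a few lines. The function name `floyd_warshall`, the use of `math.inf` for missing edges, and the small weight matrix are assumptions for illustration. The final diagonal check is an addition that rejects inputs with a negative-length cycle, which the problem statement excludes.

```python
import math

INF = math.inf


def floyd_warshall(w):
    """All-pairs shortest path lengths from a weight matrix w.

    w[i][j] is the edge weight, INF if there is no edge, and w[i][i] = 0.
    Raises ValueError if a negative-length cycle is detected.
    """
    n = len(w)
    d = [row[:] for row in w]  # work on a copy of the weights
    for k in range(n):  # allow v_k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # A negative diagonal entry means some vertex lies on a negative cycle.
    if any(d[i][i] < 0 for i in range(n)):
        raise ValueError("graph contains a negative-length cycle")
    return d


# Hypothetical 3-vertex directed graph with one negative edge.
w = [[0, 4, INF],
     [INF, 0, -2],
     [1, INF, 0]]
print(floyd_warshall(w))  # [[0, 4, 2], [-1, 0, -2], [1, 5, 0]]
```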

  32. Properties of Interest • Let $d^k_{ij}$ denote the length of the shortest path from $v_i$ to $v_j$ that goes through at most k - 1 intermediate vertices (at most k hops) • $d^1_{ij} = w_{ij}$ (edge length from $v_i$ to $v_j$) • If i ≠ j and there is no edge from $v_i$ to $v_j$, then $d^1_{ij} = w_{ij} = \infty$ • Also, $d^1_{ii} = w_{ii} = 0$ • Given that there are no negative-weight cycles in G, there is no advantage in visiting any vertex more than once in the shortest path from $v_i$ to $v_j$ • Since there are only n vertices in G, $d_{ij} = d^{n-1}_{ij}$

  33. Guaranteeing Shortest Paths • If the shortest path from vi to vj contains vr and vs (where vr precedes vs), then the sub-path from vr to vs must itself be minimal (otherwise the path from vi to vj could be shortened) • Thus, to obtain the shortest path from vi to vj, we can compute all combinations of optimal sub-paths (whose concatenation is a path from vi to vj), and then select the shortest one [Figure: path vi, vr, vs, vj split into three minimal sub-paths whose lengths are summed]

  34. Iteratively Building Shortest Paths [Figure: paths from vi to vj passing through each vertex v1, …, vn and ending with the edges w1j, w2j, …, wnj]

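  35. Recurrence Definition • For k > 1, $d^k_{ij} = \min_l \{ d^{k/2}_{il} + d^{k/2}_{lj} \}$ • Guarantees $O(\log k)$ steps to calculate $d^k_{ij}$ [Figure: a path from vi to vj with at most k vertices, split at vl into two minimal sub-paths with at most k/2 vertices each]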

  36. Similarity

  37. Computing D • Let $D^k$ be the matrix with entries $d^k_{ij}$ for 0 ≤ i, j ≤ n - 1 • Given $D^1$, compute $D^2, D^4, \dots, D^m$, where $m = 2^{\lceil \log (n-1) \rceil}$ • $D = D^m$ • To calculate $D^k$ from $D^{k/2}$, use a special form of matrix multiplication • '×' → '+' • 'Σ' → 'min'

  38. “Modified” Matrix Multiplication (r_m is bit m of r; r^(m) is r with bit m complemented)
      Step 2: for r = 0 to N – 1 do par
                  C_r = A_r + B_r
              end
      Step 3: for m = 2q to 3q – 1 do
                  for all r ∈ N (r_m = 0) do par
                      C_r = min(C_r, C_r^(m))

  39. “Modified” Example (1) [Figure: eight-processor hypercube P000 to P111 with the initial register contents of the example from 9.2]

  40. “Modified” Example (2) [Figure: register contents of the example from 9.2 after step (1.1)]

  41. “Modified” Example (3) [Figure: register contents of the example from 9.2 after step (1.2)]

  42. “Modified” Example (4) [Figure: register contents of the example from 9.2 after step (1.3)]

  43. “Modified” Example (5) [Figure: register contents after modified step 2; the eight sums are -1, -2, 0, -1, 2, 1, 1, 0]

  44. “Modified” Example (6) [Figure: register contents after modified step 3; the four minima are -1, -2, 1, 0]
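
The final values of this example can be reproduced with a sequential min-plus product. The input matrices below are an assumption: they are inferred from the numbers that survive in the example slides (entries 1 to 4 and -1 to -4), not stated explicitly, and `min_plus_product` is a hypothetical helper name.

```python
def min_plus_product(a, b):
    """'Modified' matrix product: '+' replaces 'x' and 'min' replaces the sum."""
    n = len(a)
    return [
        [min(a[i][l] + b[l][j] for l in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Assumed inputs, inferred from the register values shown in the example.
a = [[1, 2], [3, 4]]
b = [[-1, -2], [-3, -4]]
print(min_plus_product(a, b))  # [[-1, -2], [1, 0]]
```

The eight intermediate sums a[i][l] + b[l][j] correspond to modified step 2, and taking the minimum over l corresponds to modified step 3.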

  45. Hypercube Setup • Begin with a hypercube of $n^3$ processors • Each has registers A, B, and C • Arrange them in an n x n x n array (cube) • Set $A(0, j, k) = w_{jk}$ for 0 ≤ j, k ≤ n – 1, i.e. processors in positions (0, j, k) contain $D^1 = W$ • When done, C(0, j, k) contains the APSP matrix $D^m$

  46. Setup Example [Figure: the directed weighted graph on vertices v0 to v5 and its 6 x 6 weight matrix, with $D^1 = W$ loaded so that $A(0, j, k) = w_{jk}$]

  47. APSP Parallel Algorithm
      Algorithm HYPERCUBE SHORTEST PATH (A, C)
      Step 1: for j = 0 to n - 1 do par
                  for k = 0 to n - 1 do par
                      B(0, j, k) = A(0, j, k)
                  end for
              end for
      Step 2: for i = 1 to ⌈log (n-1)⌉ do
                  (2.1) HYPERCUBE MATRIX MULTIPLICATION (A, B, C)
                  (2.2) for j = 0 to n - 1 do par
                            for k = 0 to n - 1 do par
                                (i) A(0, j, k) = C(0, j, k)
                                (ii) B(0, j, k) = C(0, j, k)
                            end for
                        end for
              end for
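
A sequential sketch of the same squaring scheme is given below. It replaces the hypercube multiplication with a direct min-plus product, so it shows the structure of the algorithm but not its parallel running time. The name `all_pairs_shortest_paths` and the test graph are assumptions made for illustration.

```python
import math

INF = math.inf


def all_pairs_shortest_paths(w):
    """Compute D^m from D^1 = W by repeated min-plus squaring.

    w[i][j] is the edge weight, INF if there is no edge, and w[i][i] = 0.
    Assumes the graph has no negative-length cycle.
    """
    n = len(w)
    d = [row[:] for row in w]  # Step 1: start from D^1 = W
    # Step 2: ceil(log2(n - 1)) squarings give D^2, D^4, ..., D^m.
    for _ in range(max(n - 2, 0).bit_length()):
        d = [
            [min(d[i][l] + d[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return d


# Hypothetical 3-vertex directed graph with one negative edge.
w = [[0, 4, INF],
     [INF, 0, -2],
     [1, INF, 0]]
print(all_pairs_shortest_paths(w))  # [[0, 4, 2], [-1, 0, -2], [1, 5, 0]]
```

Because w[i][i] = 0, each squaring keeps every path found so far, so computing a power m larger than n - 1 does not change the result.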

  48. An Example [Figure: the 6 x 6 matrices $D^1$, $D^2$, $D^4$ and $D^8$ for the example graph]

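  49. Analysis • Steps 1 and (2.2) require constant time • There are $\lceil \log (n-1) \rceil$ iterations of Step (2.1) • Each requires $O(\log n)$ time • The overall running time is $t(n) = O(\log^2 n)$ • $p(n) = n^3$ • Cost is $c(n) = p(n)\, t(n) = O(n^3 \log^2 n)$ • Efficiency is the sequential running time divided by this cost; against the $\Theta(n^3)$ Floyd-Warshall algorithm that is on the order of $1/\log^2 n$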

  50. Related Paper • Edwin Romeijn and Robert Smith: "Parallel Algorithms for Solving Aggregated Shortest Path Problems", Computers and Operations Research, Special Issue on Aggregation, Volume 26, Issue 10-11, pp. 941-953, 1999
