
Parallel Graph Algorithms

This article provides insights into parallelizing Prim's Algorithm for Minimum Spanning Tree and Dijkstra’s Algorithm for Single-Source Shortest Path. It discusses techniques for optimizing the parallel execution to achieve faster computation and better efficiency. Additionally, it explores the complexities involved in parallelizing these algorithms, including the communication and computation overhead. Practical examples and patterns are presented to demonstrate the parallel implementation of these classical graph algorithms.

hedges

Presentation Transcript


  1. Parallel Graph Algorithms

  2. Graph Algorithms • Minimum Spanning Tree (Prim’s Algorithm) • Single-Source Shortest Path (Dijkstra’s Algorithm) • All-Pairs Shortest Paths (Dijkstra’s and Floyd’s Algorithms)

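  3. Adjacency Matrix • An adjacency matrix represents the edges of a graph: entry (i, j) holds the weight of the edge from vertex i to vertex j

  4. Adjacency Matrix • Example • [Figure: a small example graph and its adjacency matrix]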
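The adjacency-matrix idea can be sketched in a few lines of Python. The edge list below is a hypothetical 4-vertex graph, not the one pictured on the slide, and the helper name `adjacency_matrix` is an illustrative choice. Missing edges are stored as infinity so that later "take the minimum" steps ignore them.

```python
import math

INF = math.inf  # marks "no edge between these two vertices"

def adjacency_matrix(n, edges):
    """Build a symmetric n x n weight matrix from (u, v, weight) triples."""
    A = [[INF] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 0              # a vertex is at distance 0 from itself
    for u, v, w in edges:
        A[u][v] = w              # undirected graph: store both directions
        A[v][u] = w
    return A

# Hypothetical example graph (an assumption, not taken from the slide).
edges = [(0, 1, 2), (1, 2, 4), (2, 3, 3), (0, 3, 1)]
A = adjacency_matrix(4, edges)
for row in A:
    print(row)
```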

  5. Prim’s Algorithm for Minimum Spanning Tree
     V – set of vertices
     VT – set of vertices in the MST
     E – set of edges
     A – adjacency matrix
     r – root node
     d – minimum distance from the MST to each vertex

     Prim_MST(V, E, A, r) {
       VT = {r};
       d[r] = 0;
       for all v in (V – VT)
         d[v] = A[r,v];
       while (VT != V) {
         Find a vertex u such that d[u] = min(d[v] for all v in (V – VT));
         VT = VT + {u};
         for all v in (V – VT)
           d[v] = min(d[v], A[u,v]);
       }
     }

     Complexity = O(n^2)
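A minimal Python sketch of the pseudocode above, assuming an undirected, connected graph. The edge list is an assumption: it is read off the walk-through slides that follow (vertices a–f numbered 0–5) and matches the choices made there. The names `build_matrix`, `prim_mst`, and `parent` are illustrative and do not appear in the slides.

```python
import math

INF = math.inf

# Assumed edge list for the a..f example graph (a=0, b=1, c=2, d=3, e=4, f=5).
EDGES = [(0, 1, 1), (0, 2, 3), (0, 5, 3), (1, 2, 5), (1, 3, 1),
         (2, 3, 2), (2, 4, 1), (3, 4, 4), (4, 5, 5)]

def build_matrix(n, edges):
    """Symmetric adjacency matrix with INF for missing edges."""
    A = [[INF] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 0
    for u, v, w in edges:
        A[u][v] = A[v][u] = w
    return A

def prim_mst(A, r):
    """O(n^2) Prim's algorithm; returns (total weight, parent list)."""
    n = len(A)
    in_tree = [False] * n
    in_tree[r] = True
    d = list(A[r])               # d[v] = cheapest edge from the tree to v
    parent = [r] * n             # parent[v] = tree vertex that d[v] comes from
    parent[r] = None
    total = 0
    for _ in range(n - 1):
        # Pick the vertex outside the tree that is closest to the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: d[v])
        in_tree[u] = True
        total += d[u]
        # Update the remaining vertices against the newly added vertex.
        for v in range(n):
            if not in_tree[v] and A[u][v] < d[v]:
                d[v] = A[u][v]
                parent[v] = u
    return total, parent

A = build_matrix(6, EDGES)
total, parent = prim_mst(A, 1)   # root is node b
print(total, parent)
```

Ties are broken by lowest vertex index here, so vertices may be added in a different order than on the slides, but the resulting tree is the same.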

  6. Root is node b (Prim’s) • Initialize • [Figure: weighted graph on vertices a–f] • Since d[3] = 1, add the edge b to d and consider node d next

  7. Next consider node d (Prim’s) • Take minimums except for b and d • [Figure: same graph] • Since d[0] = 1, add the edge b to a and consider node a next

  8. Next consider node a (Prim’s) • [Figure: same graph] • Since d[2] = 2, add the edge d to c and consider node c next

  9. Next consider node c (Prim’s) • [Figure: same graph] • Since d[4] = 1, add the edge c to e and consider node e next

  10. Next consider node e (Prim’s) • [Figure: same graph] • Since d[5] = 3, add the edge a to f and consider node f next

  11. Next consider node f (Prim’s) • [Figure: same graph] • VT = V so stop

  12. Parallelizing Prim’s Algorithm • We can’t just simply execute the while loop in parallel because the d[] array changes with each selection of a vertex • We have to update values in d[] from all processors after each iteration • Suppose we have n vertices in the graph and p processors

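  13. Parallelizing Prim’s Algorithm • Partition the adjacency matrix and the distance array (d) across processors, so that each of the p processors owns n/p entries of d and the matching n/p columns of A • [Figure: d[ ] and A, each of width n, split into blocks for processors 0, 1, 2, …, p-1]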

  14. Parallelizing Prim’s Algorithm • Each processor computes the next vertex from among its vertices • A reduction is done on the distance array (d) to find the minimum • The result is broadcast out to all the processors

  15. Which pattern does this fit?

  16. Prim’s Algorithm (Parallel)
      Prim_MST(V, E, A, r) {
        ... // Initialize d as before
        #pragma paraguin begin_parallel
        while (VT != V) {
          Find a vertex u such that d[u] = min(d[v] for all v in (V – VT));
          VT = VT + {u};
          #pragma paraguin forall
          for v in V
            if (v ∉ VT)
              d[v] = min(d[v], A[u,v]);
          #pragma paraguin reduce min d
          #pragma paraguin bcast d
        }
        #pragma paraguin end_parallel
      }
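The partition, local-minimum, reduction, and broadcast steps can be illustrated with a sequential Python simulation. This is a sketch under stated assumptions, not the Paraguin implementation: the p "processors" are simulated by a loop over vertex blocks, the graph is assumed connected, and the edge list is the assumed a–f example (a=0 … f=5). The name `parallel_prim_simulated` is illustrative.

```python
import math

INF = math.inf

# Assumed edge list for the a..f example graph (a=0, b=1, c=2, d=3, e=4, f=5).
EDGES = [(0, 1, 1), (0, 2, 3), (0, 5, 3), (1, 2, 5), (1, 3, 1),
         (2, 3, 2), (2, 4, 1), (3, 4, 4), (4, 5, 5)]

def build_matrix(n, edges):
    A = [[INF] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 0
    for u, v, w in edges:
        A[u][v] = A[v][u] = w
    return A

def parallel_prim_simulated(A, r, p):
    """Simulate p processors, each owning a contiguous block of vertices.

    Assumes a connected graph. Returns the total MST weight.
    """
    n = len(A)
    # Block partition: simulated processor k owns the vertices in blocks[k].
    blocks = [range(k * n // p, (k + 1) * n // p) for k in range(p)]
    in_tree = [False] * n
    in_tree[r] = True
    d = list(A[r])
    total = 0
    for _ in range(n - 1):
        # Local step: each processor finds the closest vertex in its block.
        local_best = []
        for block in blocks:
            candidates = [(d[v], v) for v in block if not in_tree[v]]
            local_best.append(min(candidates) if candidates else (INF, -1))
        # Reduction: global minimum over the p local results.
        dist_u, u = min(local_best)
        # Broadcast: every processor learns u, then updates its own block of d.
        in_tree[u] = True
        total += dist_u
        for block in blocks:
            for v in block:
                if not in_tree[v]:
                    d[v] = min(d[v], A[u][v])
    return total

A = build_matrix(6, EDGES)
weights = [parallel_prim_simulated(A, 1, p) for p in (1, 2, 3, 4, 6)]
print(weights)
```

The result does not depend on p; only the amount of work per simulated processor changes.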

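  17. Prim’s Algorithm (Parallel) • Complexity of the parallel algorithm: $T_p = O(n^2/p) + O(n \log p)$, where the first term is computation and the second is communication • Each of the n iterations does O(n/p) local work per processor • Each reduction and broadcast takes log p time, but we have to do up to n of them.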

  18. Dijkstra’s Algorithm for Single-Source Shortest Path • Given a source node, what is the shortest distance to each other node? • A tree grown the same way as in Prim’s Algorithm, but keyed on distance from the source rather than distance from the tree, gives us this information

  19. Dijkstra’s Algorithm
      V – set of vertices
      VT – set of vertices whose shortest distance is already known
      E – set of edges
      A – adjacency matrix
      r – root node
      d – minimum distance from the root to each vertex

      Dijkstra_SP(V, E, A, r) {
        VT = {r};
        d[r] = 0;
        for v in (V – VT)
          d[v] = A[r,v];
        while (VT != V) {
          Find a vertex u such that d[u] = min(d[v] for all v in (V – VT));
          VT = VT + {u};
          for v in (V – VT)
            d[v] = min(d[v], d[u] + A[u,v]);   // This is the only thing different from Prim’s
        }
      }

      Complexity = O(n^2)
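A minimal Python sketch of the pseudocode above, assuming non-negative edge weights. The edge list is the assumed a–f example graph used in the walk-through slides (a=0 … f=5), and the names `build_matrix` and `dijkstra_sp` are illustrative. Note that the only change from Prim’s is the `d[u] + A[u][v]` term in the update.

```python
import math

INF = math.inf

# Assumed edge list for the a..f example graph (a=0, b=1, c=2, d=3, e=4, f=5).
EDGES = [(0, 1, 1), (0, 2, 3), (0, 5, 3), (1, 2, 5), (1, 3, 1),
         (2, 3, 2), (2, 4, 1), (3, 4, 4), (4, 5, 5)]

def build_matrix(n, edges):
    A = [[INF] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 0
    for u, v, w in edges:
        A[u][v] = A[v][u] = w
    return A

def dijkstra_sp(A, r):
    """O(n^2) Dijkstra on an adjacency matrix; returns distances from r."""
    n = len(A)
    done = [False] * n
    done[r] = True
    d = list(A[r])               # d[v] = best known distance from r to v
    for _ in range(n - 1):
        # Pick the unfinished vertex with the smallest tentative distance.
        u = min((v for v in range(n) if not done[v]), key=lambda v: d[v])
        done[u] = True
        # Relax: going through u may shorten the path to v.
        for v in range(n):
            if not done[v]:
                d[v] = min(d[v], d[u] + A[u][v])
    return d

A = build_matrix(6, EDGES)
dist_from_b = dijkstra_sp(A, 1)  # source is node b
print(dist_from_b)               # [1, 0, 3, 1, 4, 4]
```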

  20. Source node is node b (Dijkstra’s) • Initialize • [Figure: weighted graph on vertices a–f] • Since d[3] = 1, consider node d next

  21. Next consider node d (Dijkstra’s) • [Figure: same graph] • Since d[0] = 1, consider node a next

  22. Next consider node a (Dijkstra’s) • [Figure: same graph] • Since d[2] = 3, consider node c next

  23. Next consider node c (Dijkstra’s) • [Figure: same graph] • Since d[4] = 4, consider node e next

  24. Next consider node e (Dijkstra’s) • [Figure: same graph] • Since d[5] = 4, add the edge a to f and consider node f next

  25. Next consider node f (Dijkstra’s) • [Figure: same graph] • VT = V so stop

  26. Parallelizing Dijkstra’s Algorithm • Since Dijkstra’s Algorithm and Prim’s Algorithm are essentially the same, we can parallelize them the same way • Complexity of the parallel algorithm: $T_p = O(n^2/p) + O(n \log p)$ (computation plus communication) • If we have n processors, this becomes: $T_n = O(n) + O(n \log n) = O(n \log n)$

  27. All Pairs Shortest Path • Dijkstra’s Algorithm gives us the shortest path from a particular node to all the others • For All Pairs Shortest Path, we want to find the shortest path between all pairs of vertices • We can apply Dijkstra’s Algorithm once with every vertex as the source • Complexity = O(n^3)

  28. All Pairs using Dijkstra’s Algorithm
      V – set of vertices
      VT – set of vertices whose shortest distance from r is already known
      E – set of edges
      A – adjacency matrix
      r – root node
      d – minimum distance from the root to each vertex

      Dijkstra_APSP(V, E, A) {
        for r in V {
          VT = {r};
          d[r] = 0;
          for all v in (V – VT)
            d[v] = A[r,v];
          while (VT != V) {
            Find a vertex u such that d[u] = min(d[v] for all v in (V – VT));
            VT = VT + {u};
            for v in (V – VT)
              d[v] = min(d[v], d[u] + A[u,v]);
          }
          // d now holds the shortest distances from r
        }
      }

      Complexity = O(n^3)

  29. All Pairs Shortest Path • We can parallelize the outermost loop • Each processor takes a different source node vi and computes the shortest path from it to all nodes • No communication is needed • Complexity is O(n^3/p) • If we have n processors, complexity is O(n^2) • If we have n^2 processors, we can use n processors for each source vertex. Complexity becomes O(n log n)
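The source-partitioned scheme above can be sketched in Python by handing each source vertex to a worker. This is an illustration of the decomposition only: it uses `ThreadPoolExecutor` from the standard library, and in CPython threads will not give a real speedup for this CPU-bound loop. The edge list is the assumed a–f example graph, and `all_pairs_by_source` is an illustrative name.

```python
import math
from concurrent.futures import ThreadPoolExecutor

INF = math.inf

# Assumed edge list for the a..f example graph (a=0, b=1, c=2, d=3, e=4, f=5).
EDGES = [(0, 1, 1), (0, 2, 3), (0, 5, 3), (1, 2, 5), (1, 3, 1),
         (2, 3, 2), (2, 4, 1), (3, 4, 4), (4, 5, 5)]

def build_matrix(n, edges):
    A = [[INF] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 0
    for u, v, w in edges:
        A[u][v] = A[v][u] = w
    return A

def dijkstra_sp(A, r):
    """O(n^2) Dijkstra; A is only read, so workers can share it safely."""
    n = len(A)
    done = [False] * n
    done[r] = True
    d = list(A[r])
    for _ in range(n - 1):
        u = min((v for v in range(n) if not done[v]), key=lambda v: d[v])
        done[u] = True
        for v in range(n):
            if not done[v]:
                d[v] = min(d[v], d[u] + A[u][v])
    return d

def all_pairs_by_source(A, workers=4):
    """Run one independent Dijkstra per source; no data is exchanged."""
    n = len(A)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so row r is source r.
        return list(pool.map(lambda r: dijkstra_sp(A, r), range(n)))

A = build_matrix(6, EDGES)
D = all_pairs_by_source(A)
for row in D:
    print(row)
```

Because every task reads the same matrix and writes only its own result row, the workers never need to communicate, which is the property the slide relies on.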

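  30. Floyd’s Algorithm for All Pairs Shortest Path • Floyd’s Algorithm works off of this observation: • Consider a subset of V: $\{v_1, v_2, \ldots, v_k\}$ for some $k \le n$ • Let $d^{(k)}_{i,j}$ be the weight of the shortest path from $v_i$ to $v_j$ whose intermediate vertices all belong to $\{v_1, \ldots, v_k\}$ • If $v_k$ is not in the shortest path from $v_i$ to $v_j$, then $d^{(k)}_{i,j} = d^{(k-1)}_{i,j}$ • Otherwise, the shortest path goes through $v_k$ and its weight is $d^{(k)}_{i,j} = d^{(k-1)}_{i,k} + d^{(k-1)}_{k,j}$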

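  31. Floyd’s Algorithm for All Pairs Shortest Path • This leads to the following recurrence: $d^{(0)}_{i,j} = A_{i,j}$, and for $k \ge 1$, $d^{(k)}_{i,j} = \min\left(d^{(k-1)}_{i,j},\; d^{(k-1)}_{i,k} + d^{(k-1)}_{k,j}\right)$ • We can implement this using iteration and not recursion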

  32. All Pairs using Floyd’s Algorithm
      V – set of vertices
      E – set of edges
      A – adjacency matrix

      Floyd_APSP(V, E, A) {
        d(0)[i,j] = A[i,j] for all i,j
        for k = 1 to n
          for i = 1 to n
            for j = 1 to n
              d(k)[i,j] = min(d(k-1)[i,j], d(k-1)[i,k] + d(k-1)[k,j])
      }

      • We don’t need n copies of the d matrix, because step k only reads the values from step k-1.
      • In fact, we can do it with only one matrix, updated in place

      Complexity = O(n^3)
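A minimal Python sketch of the single-matrix version, assuming no negative cycles. It uses 0-based indices rather than the 1-based loops in the pseudocode. The edge list is the assumed a–f example graph, and `floyd_apsp` is an illustrative name.

```python
import math

INF = math.inf

# Assumed edge list for the a..f example graph (a=0, b=1, c=2, d=3, e=4, f=5).
EDGES = [(0, 1, 1), (0, 2, 3), (0, 5, 3), (1, 2, 5), (1, 3, 1),
         (2, 3, 2), (2, 4, 1), (3, 4, 4), (4, 5, 5)]

def build_matrix(n, edges):
    A = [[INF] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 0
    for u, v, w in edges:
        A[u][v] = A[v][u] = w
    return A

def floyd_apsp(A):
    """O(n^3) Floyd's algorithm using one matrix updated in place."""
    n = len(A)
    d = [row[:] for row in A]    # copy so the input matrix is left untouched
    for k in range(n):           # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                through_k = d[i][k] + d[k][j]
                if through_k < d[i][j]:
                    d[i][j] = through_k
    return d

D = floyd_apsp(build_matrix(6, EDGES))
print(D[1])                      # distances from b: [1, 0, 3, 1, 4, 4]
```

Updating in place is safe because row k and column k do not change during step k: `d[i][k]` and `d[k][j]` would only be improved by a path through k itself, which cannot be shorter.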

  33. Partitioning of the d matrix • We divide the d matrix into p blocks of size (n/√p) × (n/√p) • Each processor is responsible for n^2/p elements of the d matrix • [Figure: d matrix divided into a √p × √p grid of blocks]

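  34. Partitioning of the d matrix • However, we have to send data between processors: updating d[i,j] in step k needs d[i,k] from column k and d[k,j] from row k, which may belong to other processors • [Figure: d matrix with row i, row k, column j, and column k highlighted]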
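The block partitioning and the row k / column k exchange can be illustrated with a sequential Python simulation. This is a sketch under stated assumptions: the √p × √p processor grid is simulated by two loops over block indices, the grid side `q` must divide n, and the "broadcast" is modeled by taking a snapshot of row k and column k before any block is updated. The edge list is the assumed a–f example graph, and `blocked_floyd_simulated` is an illustrative name.

```python
import math

INF = math.inf

# Assumed edge list for the a..f example graph (a=0, b=1, c=2, d=3, e=4, f=5).
EDGES = [(0, 1, 1), (0, 2, 3), (0, 5, 3), (1, 2, 5), (1, 3, 1),
         (2, 3, 2), (2, 4, 1), (3, 4, 4), (4, 5, 5)]

def build_matrix(n, edges):
    A = [[INF] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 0
    for u, v, w in edges:
        A[u][v] = A[v][u] = w
    return A

def blocked_floyd_simulated(A, q):
    """Simulate a q x q processor grid (p = q * q); q must divide n."""
    n = len(A)
    if n % q != 0:
        raise ValueError("grid side q must divide n")
    b = n // q                   # block side length, n / sqrt(p)
    d = [row[:] for row in A]
    for k in range(n):
        # "Broadcast": the owners of row k and column k send their segments.
        row_k = d[k][:]
        col_k = [d[i][k] for i in range(n)]
        # Each simulated processor (bi, bj) updates only its own b x b block,
        # using its block plus the received segments of row k and column k.
        for bi in range(q):
            for bj in range(q):
                for i in range(bi * b, (bi + 1) * b):
                    for j in range(bj * b, (bj + 1) * b):
                        d[i][j] = min(d[i][j], col_k[i] + row_k[j])
    return d

A = build_matrix(6, EDGES)
results = {q: blocked_floyd_simulated(A, q) for q in (1, 2, 3, 6)}
print(results[2][1])             # distances from b on a 2 x 2 grid
```

Each processor needs only n/√p values from row k and n/√p values from column k per step, which is what makes the 2D layout attractive compared with sending whole blocks.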

  35. Which pattern does this fit?

  36. Communication Pattern • [Figure: processor grid showing the communication pattern]

  37. Analysis of Floyd’s Algorithm • In step k, each processor holding part of row k or column k has to send it to all processors in the same column or row. • If we use a broadcast, then the communication time is $O\left(\frac{n}{\sqrt{p}} \log p\right)$ • The synchronization step requires $O(\log p)$ • The time to compute the values for each processor is $O(n^2/p)$

  38. Analysis of Floyd’s Algorithm • So the complexity for each step is: $O(n^2/p) + O\left(\frac{n}{\sqrt{p}} \log p\right)$ (computation plus communication) • And finally, the complexity for n steps (of the k loop) is: $O(n^3/p) + O\left(\frac{n^2}{\sqrt{p}} \log p\right)$

  39. A faster version of Floyd’s Algorithm • We can pipeline the values moving through the matrix instead of synchronizing after every step. • This works because once processor $p_{i,j}$ computes the value of $d^{(k)}_{i,j}$, it can immediately send it to its neighbors $p_{i,j-1}$, $p_{i,j+1}$, $p_{i+1,j}$, and $p_{i-1,j}$

  40. Consider the movement of the value computed by processor 4 • [Figure: time steps t, t+1, t+2, t+3, t+4 plotted against processors 1–8]

  41. Analysis of Floyd’s Algorithm with pipelining • The net complexity of the algorithm using pipelining is: $O(n^3/p) + O(n)$ (computation plus communication)

  42. Questions
