
External-Memory MST

External-Memory MST. (Arge, Brodal, Toma). Minimum-Spanning Tree. Given a weighted, undirected graph G=(V,E), the minimum-spanning tree (MST) problem is the problem of finding a spanning tree for G of minimum weight. Assumptions: G is connected; No two edges in G have the same weight.

remy

Presentation Transcript


  1. External-Memory MST (Arge, Brodal, Toma)

  2. Minimum-Spanning Tree • Given a weighted, undirected graph G=(V,E), the minimum-spanning tree (MST) problem is the problem of finding a spanning tree for G of minimum weight. • Assumptions: • G is connected; • No two edges in G have the same weight.

  3. External-Memory Graph Algorithms • Standard two-level I/O model with a single disk: • N = V + E • M= number of vertices/edges that can fit into internal memory. • B= number of vertices/edges per disk block. • The graph is given as a list of edges sorted by vertex.
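The bounds on the following slides are stated in terms of scan and sort costs in this model. As a rough sketch, assuming the standard external-memory bounds scan(N) = ⌈N/B⌉ and sort(N) = Θ((N/B)·log_{M/B}(N/B)) (the slides use sort(·) without defining it), the two quantities can be estimated like this. The function names are illustrative, and constant factors are dropped.

```python
import math


def scan_ios(n: int, b: int) -> int:
    """I/Os needed to read n contiguous items with b items per block."""
    return -(-n // b)  # integer ceiling of n / b


def sort_ios(n: int, m: int, b: int) -> float:
    """Standard external sorting bound (n/b) * log_{m/b}(n/b), constants dropped.

    Assumes m > b, i.e. internal memory holds more than one block.
    """
    blocks = n / b
    if blocks <= 1:
        return 1.0
    # At least one pass over the data is always required.
    return blocks * max(1.0, math.log(blocks, m / b))
```

For realistic values of M and B the logarithmic factor is a small constant, which is why an O(sort(E)) bound is far better than one I/O per edge.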

  4. External-Memory Graph Algorithms (2) • For MST and CC (connected components), randomized algorithms using O(sort(E)) I/Os are known.

  5. Prim’s Algorithm [Figure: example graph on vertices a to f with edge weights {b,a} 1, {c,d} 2, {a,c} 3, {d,e} 4, {b,c} 5, {b,d} 6, {a,f} 7, {c,e} 8, {e,f} 9. The priority queue holds the vertices a to f.]

  6. Prim’s Algorithm (2) • Prim’s algorithm cannot be implemented efficiently in external memory: • It is not guaranteed that even the priority queue alone fits in memory. • Thus, we cannot in general get the current vertex priority without using an I/O. • A direct implementation leads to an Ω(E) I/O algorithm.

  7. Prim’s Algorithm (3) • Modification: store edges in the priority queue instead of vertices. • Any two edges have distinct weights. [Figure: the example graph with successive priority-queue contents. An edge whose endpoints are both already in the tree appears in the queue twice, once per direction, e.g. {b,c} (5) and {c,b} (5).]

  8. Modified Prim Algorithm • The correctness follows directly from the correctness of the original algorithm (“blue rule” still applies). • Efficiency: • At least one I/O per vertex is needed to read its adjacency list, so reading all adjacency lists takes O(V + E/B) I/Os. • O(E) operations on an external priority queue can be performed in O(sort(E)) I/Os. • Thus in total we have O(V + sort(E)) I/Os.
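The edge-based modification can be sketched in memory as follows. This is only an illustration of the logic: Python's `heapq` stands in for the external priority queue, the function name and the `EDGES` encoding of the example graph are assumptions, and no I/O behaviour is modelled. The point it shows is that, with distinct weights, an edge whose endpoints are both in the tree sits in the queue twice with the same key, so it can be recognised and discarded without looking up per-vertex state.

```python
import heapq
from collections import defaultdict

# Example graph from the slides as (weight, u, v); weights are distinct.
EDGES = [(1, "b", "a"), (2, "c", "d"), (3, "a", "c"), (4, "d", "e"), (5, "b", "c"),
         (6, "b", "d"), (7, "a", "f"), (8, "c", "e"), (9, "e", "f")]


def modified_prim(edges, source):
    """Prim's algorithm with edges (not vertices) in the priority queue."""
    adj = defaultdict(list)
    for w, u, v in edges:
        adj[u].append((w, v))
        adj[v].append((w, u))

    # Each entry (w, u, v) is an edge leaving tree vertex u towards v.
    pq = [(w, source, v) for w, v in adj[source]]
    heapq.heapify(pq)
    tree = []
    while pq:
        w, u, v = heapq.heappop(pq)
        if pq and pq[0][0] == w:
            # Same weight means same edge inserted from both endpoints:
            # both endpoints are already in the tree, so drop both copies.
            heapq.heappop(pq)
            continue
        tree.append((w, u, v))  # v is a new tree vertex
        for w2, x in adj[v]:
            if w2 != w:  # do not re-insert the tree edge just used
                heapq.heappush(pq, (w2, v, x))
    return tree
```

On the example graph, `modified_prim(EDGES, "a")` selects the edges of weight 1, 3, 2, 4 and 7, in that order.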

  9. Boruvka’s Algorithm (1) Select for each vertex the minimum-weight edge incident to it. (2) Contract the graph and return to (1). [Figure: the example graph on vertices a to f; the selected edges are {b,a}, {c,d}, {d,e} and {a,f}.]

  10. Boruvka’s Algorithm (1) Select for each vertex the minimum-weight edge incident to it. (2) Contract the graph and return to (1). [Figure: the contracted graph with supervertices abf and cde, joined by parallel edges of weights 3, 5, 6 and 9; {a,c} is selected next.]
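The select-then-contract loop can be sketched in memory as below. As a simplifying assumption, a union-find structure replaces the explicit contraction described on the later slides, so this shows the logic of the two steps rather than the external-memory procedure. The function name and the `EDGES` encoding of the example graph are illustrative.

```python
# Example graph from the slides as (weight, u, v); weights are distinct.
EDGES = [(1, "b", "a"), (2, "c", "d"), (3, "a", "c"), (4, "d", "e"), (5, "b", "c"),
         (6, "b", "d"), (7, "a", "f"), (8, "c", "e"), (9, "e", "f")]


def boruvka_mst(vertices, edges):
    """In-memory Boruvka; assumes a connected graph with distinct weights."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    mst = set()
    components = len(parent)
    while components > 1:
        # Step (1): lightest edge leaving each current component.
        lightest = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge inside a component (a self-loop after contraction)
            for r in (ru, rv):
                if r not in lightest or w < lightest[r][0]:
                    lightest[r] = (w, u, v)
        if not lightest:
            break  # graph was not connected
        # Step (2): "contract" by merging the endpoints of every selected edge.
        for w, u, v in set(lightest.values()):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
                mst.add((w, u, v))
    return mst
```

On the example graph the first round selects the edges of weight 1, 2, 4 and 7, leaving the two components abf and cde; the second round adds the edge of weight 3.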

  11. External-Memory Boruvka’s Step • For each vertex v, let C(v) be the neighbor of v at the other end of the lightest edge incident to v. • Let G’ be the graph obtained by taking only edges of the form (v, C(v)) for each v. • Let G’d be the graph obtained by directing each edge (v, C(v)) in G’ from C(v) to v. • The goal is to contract each connected component in G’ into a single vertex.

  12. Unique Representatives • In each connected component of G’d: • Each vertex has indegree 1. • The weights of the edges along any root-leaf path are increasing. • There is exactly one cycle: a cycle of length 2 formed by the minimum-weight edge of the component, which is selected by both of its endpoints.
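A small in-memory sketch of these properties: compute C(v) for every vertex, sort the selected edges by weight so that an edge chosen by both endpoints shows up as adjacent duplicates, pick one endpoint of each such edge as the root, and follow C(·) pointers up to it. The tie-break (the smaller label becomes the root) is an assumption made here, since any endpoint of the cycle edge would do; the example figure in the slides uses b and c as representatives instead. The pointer chasing at the end is for illustration only and is not I/O-efficient.

```python
# Example graph from the slides as (weight, u, v); weights are distinct.
EDGES = [(1, "b", "a"), (2, "c", "d"), (3, "a", "c"), (4, "d", "e"), (5, "b", "c"),
         (6, "b", "d"), (7, "a", "f"), (8, "c", "e"), (9, "e", "f")]


def boruvka_representatives(vertices, edges):
    """Map every vertex to the root of its component in G'."""
    # best[x] = (weight, C(x)): the lightest edge incident to x.
    best = {}
    for w, u, v in edges:
        for x, y in ((u, v), (v, u)):
            if x not in best or w < best[x][0]:
                best[x] = (w, y)

    # Sorted by weight, an edge selected by both endpoints appears twice in a row.
    selected = sorted((w, x) for x, (w, _) in best.items())
    roots = set()
    for (w1, x1), (w2, x2) in zip(selected, selected[1:]):
        if w1 == w2:
            roots.add(min(x1, x2))  # assumed tie-break between the two endpoints

    # Edge weights strictly decrease towards the root, so this terminates.
    rep = {}
    for v in vertices:
        x = v
        while x not in roots:
            x = best[x][1]
        rep[v] = x
    return rep
```

With the example graph this yields the components {a, b, f} and {c, d, e}, represented by a and c under the assumed tie-break.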

  13. External-Memory Boruvka’s Step (2) • The roots can be easily identified, and we can choose them to be the unique representatives of the components in G’. • We would like to replace each edge (u, v) with an edge $(u_r, v_r)$, where $u_r$ and $v_r$ are the unique representatives of the components containing u and v respectively. • Then we can remove parallel edges and self-loops, and obtain the contracted graph.

  14. External-Memory Boruvka’s Step (3) [Figure: panels G, G’ and G’d for the example graph.] • L: (b,a) (1); (a,f) (7); (c,d) (2); (d,e) (4). • Priority queue: initialized with each vertex that is an immediate successor of a root vertex; entries shown are a (1) [b], d (2) [c], e (4) [c], f (7) [b]. • Output: b → b, a → b, f → b, c → c, d → c, e → c.

  15. External-Memory Boruvka’s Step (4) To finish the contraction: • Sort the output of the previous phase and E by the first component. Then scan the two lists simultaneously, replacing each edge (v, u) in E with $(v_r, u)$. • Sort the output and E by the second component, and then scan the two lists, replacing each edge $(v_r, u)$ in E with $(v_r, u_r)$. • Sort E by both components and by weight, and with a single scan remove duplicate edges and self-loops.
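The three sort-and-scan passes can be sketched as follows, with in-memory lists standing in for the sorted files on disk. The function name and data layout are illustrative. Keeping only the lightest of several parallel edges is an assumption about what "remove duplicate edges" means here (only the lightest parallel edge can belong to the MST), and normalising each edge so that the smaller endpoint comes first is a convenience of this sketch.

```python
def contract(edges, rep):
    """edges: list of (u, v, weight); rep: vertex -> representative."""
    reps = sorted(rep.items())  # (vertex, representative), sorted by vertex

    def relabel(edge_list, pos):
        # Sort edges by endpoint `pos`, then scan both sorted lists in lockstep.
        out, i = [], 0
        for e in sorted(edge_list, key=lambda e: e[pos]):
            while reps[i][0] < e[pos]:
                i += 1  # the pointer into reps only ever moves forward
            e = list(e)
            e[pos] = reps[i][1]
            out.append(tuple(e))
        return out

    # Pass 1 rewrites first endpoints, pass 2 rewrites second endpoints.
    relabeled = relabel(relabel(edges, 0), 1)

    # Pass 3: sort by both endpoints and weight, drop self-loops and duplicates.
    normalized = sorted((min(u, v), max(u, v), w) for u, v, w in relabeled if u != v)
    contracted = []
    for u, v, w in normalized:
        if not contracted or contracted[-1][:2] != (u, v):
            contracted.append((u, v, w))  # first in sorted order is the lightest
    return contracted
```

For the example graph with representatives b (for a, b, f) and c (for c, d, e), the four crossing edges of weights 3, 5, 6 and 9 collapse to the single edge (b, c) of weight 3.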

  16. Boruvka’s Step - I/O efficiency 1. Lightest incident edges can be collected in O(E/B) I/Os in a simple scan of the edge-list representation of G (we assume it is sorted). 2. Detection of cycles in G’d can be done in O(sort(V)) I/Os: • sort the collected edges by weight and find duplicates in a single scan. • remove edges to break cycles and identify unique representatives.

  17. Boruvka’s Step - I/O efficiency (2) 3. The list L contains each edge in G’d at most twice, and can be constructed in O(sort(V)) I/Os: • sort one instance of the list of edges by the second component. • sort another instance by the first component. • create the structure of L in a single scan and sort it by weight. 4. The PQ can be initialized in a similar way in O(sort(V)) I/Os.

  18. Boruvka’s Step - I/O efficiency (3) 5. We perform a total of V insertions into the PQ and V extract-min operations. These can be performed in O(sort(V)) I/Os. 6. Replacing the endpoints of the edges of G with their unique representatives is done using a few sorting and scanning operations as described before. Here the entire edge list is sorted, and thus O(sort(E)) I/Os are needed. Total: O(E/B + sort(V) + sort(E)) = O(sort(E)) I/Os.

  19. Results So Far • Modified Prim: O(V + sort(E)) I/Os. • Modified Boruvka: O(sort(E) · lg V) I/Os. • Combined: contract G until V ≤ E/B using Boruvka’s steps, then run Prim on the result: O(sort(E) · lg(V·B/E)) I/Os. • It is possible to perform lg(V·B/E) Boruvka’s steps using lg lg(V·B/E) superphases requiring O(sort(E)) I/Os each.

  20. Yet a better MST algorithm: Superphase Algorithm • At superphase i: • Let $N_i = 2^{(3/2)^i}$, so that $N_{i+1} = N_i \cdot \sqrt{N_i}$. • Let $G_i = (V_i, E_i)$ be the graph prior to superphase i. • Let $E_i' \subseteq E_i$ be the set that contains, for each vertex, the $\sqrt{N_i}$ lightest edges incident to it. • Let the blocking value of a vertex be the weight of the $(\sqrt{N_i}+1)$-th lightest edge incident to it (or infinity if no such edge exists). • $E_i'$ and the blocking values can be found with $O(\mathrm{sort}(E_i))$ I/Os as described earlier.
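To make the definitions of the selected edge set and the blocking value concrete, here is a minimal in-memory sketch. The parameter `k` stands for the number of lightest edges kept per vertex (the square root of N_i in the slides; how that value is rounded to an integer is not stated in the slides and is left to the caller). The function name and the adjacency layout are illustrative.

```python
import math


def lightest_edges_and_blocking(adj, k):
    """adj: vertex -> list of (weight, neighbor). Returns (selected, blocking).

    selected[v] holds the k lightest edges incident to v; blocking[v] is the
    weight of the (k+1)-th lightest edge, or infinity if v has at most k edges.
    """
    selected, blocking = {}, {}
    for v, incident in adj.items():
        ordered = sorted(incident)  # by weight, since weights are distinct
        selected[v] = ordered[:k]
        blocking[v] = ordered[k][0] if len(ordered) > k else math.inf
    return selected, blocking
```

For instance, a vertex with incident edge weights 1, 5 and 6 and `k = 2` keeps the edges of weight 1 and 5 and gets blocking value 6, while a vertex with only two incident edges gets blocking value infinity.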

  21. Superphase Algorithm • At superphase i, perform $\log\sqrt{N_i}$ contraction phases on $G_i'$ (the graph with edge set $E_i'$) as described before, but now select the lightest edge incident to a vertex only if its weight is smaller than the blocking value of that vertex. • After a single contraction, the blocking value of a supervertex is set to the minimum of the blocking values of the contracted vertices. • After that, the remaining edges of $E_i'$ contain all edges of $E_i$ incident to a supervertex v with weight smaller than the blocking value of v. • Thus only edges that actually belong to the MST are contracted.

  22. Superphase Algorithm (2) But how many vertices remain after each superphase? • The blocking value might prevent us from selecting an edge for a supervertex v. If so, then: • The blocking value of v is the blocking value of some vertex u in $V_i$, and v must contain the $\sqrt{N_i}$ edges incident to u in $E_i'$. • Thus v must be the contraction of at least $\sqrt{N_i}$ vertices from $V_i$. • If no blocking value prevents us from selecting an edge for v, then after $\log\sqrt{N_i}$ phases, v must be the contraction of at least $2^{\log\sqrt{N_i}} = \sqrt{N_i}$ vertices.

  23. Superphase Algorithm (3) • It can be proved by induction on i that $V_i \le 2V/N_i$: • For i = 0, $N_0 = 2$ and $V_0 = V$. • $V_{i+1} \le V_i/\sqrt{N_i} \le (2V/N_i)/\sqrt{N_i} = 2V/N_{i+1}$. • Conclusion: $E_i' \le V_i\sqrt{N_i} \le 2V/\sqrt{N_i}$. • Thus, in order to reduce the number of vertices by a factor of $\sqrt{N_i}$ we have used so far: $O(\mathrm{sort}(E_i) + \mathrm{sort}(E_i') \cdot \log\sqrt{N_i}) = O(\mathrm{sort}(E) + \mathrm{sort}(V/\sqrt{N_i}) \cdot \log\sqrt{N_i}) = O(\mathrm{sort}(E))$ I/Os.
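The last equality on this slide is stated without justification. One way to see it, assuming the standard bound $\mathrm{sort}(x) = \Theta\big(\tfrac{x}{B}\log_{M/B}\tfrac{x}{B}\big)$ (which the slides do not spell out), is the following sketch:

```latex
\begin{align*}
\mathrm{sort}(x/k) &\le \tfrac{1}{k}\,\mathrm{sort}(x)
  && \text{for } k \ge 1 \text{ (the logarithmic factor only shrinks)} \\
\mathrm{sort}\!\left(\tfrac{V}{\sqrt{N_i}}\right)\cdot\log\sqrt{N_i}
  &\le \mathrm{sort}(V)\cdot\frac{\log\sqrt{N_i}}{\sqrt{N_i}}
  && \text{taking } k = \sqrt{N_i} \\
  &\le \mathrm{sort}(V)
  && \text{since } \log y \le y \text{ for } y \ge 1 \\
  &= O(\mathrm{sort}(E))
  && \text{since } G \text{ connected implies } E \ge V - 1 .
\end{align*}
```

The same argument, with $k = N_i$ in place of $\sqrt{N_i}$, covers bounds of the form $\mathrm{sort}(V/N_i)\cdot\log\sqrt{N_i} = O(\mathrm{sort}(V))$.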

  24. Superphase Algorithm (4) • In order to finish a superphase, we need to reincorporate the edges of $E_i$ that were not selected into $E_i'$: • During the contraction phases, maintain a list C of pairs $(v, v_s)$ for $v \in V_i$, where $v_s$ is the supervertex currently containing v. • Use the output of the Boruvka’s step, as described earlier, in order to update C: • Sort C by the second component and the output by the first component, and scan them simultaneously. • This is done using $O(\mathrm{sort}(V_i))$ I/Os. • In total, in order to maintain C, we use: $O(\mathrm{sort}(V_i) \cdot \log\sqrt{N_i}) = O(\mathrm{sort}(V/N_i) \cdot \log\sqrt{N_i}) = O(\mathrm{sort}(V))$ I/Os.

  25. Superphase Algorithm – I/O Efficiency • $E_i'$ and the blocking values are computed in $O(\mathrm{sort}(E_i))$ I/Os. • The contraction phases of each superphase take $O(\mathrm{sort}(E))$ I/Os. • Maintaining the list C during the superphase is done with $O(\mathrm{sort}(V))$ I/Os. • Given C, the edges in $E_i \setminus E_i'$ can be reincorporated in $O(\mathrm{sort}(E))$ I/Os, as we did in the single contraction algorithm. • Finally, in order to reduce V to E/B, $\log_{3/2}\lg(V \cdot B/E)$ superphases are needed. • Total: $O(\mathrm{sort}(E) \cdot \lg\lg(V \cdot B/E))$ I/Os.
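The superphase count can be checked numerically. The sketch below finds the smallest i with N_i ≥ V·B/E, where N_i = 2^((3/2)^i), by comparing exponents. It ignores the constant factor 2 in the bound V_i ≤ 2V/N_i, so treat the result as an order-of-magnitude illustration of the log_{3/2} lg(V·B/E) term rather than an exact phase count. The function name is illustrative.

```python
import math


def superphases_needed(v: int, e: int, b: int) -> int:
    """Smallest i such that N_i = 2 ** (1.5 ** i) is at least v * b / e."""
    ratio = v * b / e
    if ratio <= 1:
        return 0  # already V <= E/B, no contraction needed
    target = math.log2(ratio)
    i = 0
    # N_i >= ratio  is equivalent to  (3/2)^i >= lg(ratio).
    while 1.5 ** i < target:
        i += 1
    return i
```

For example, with V·B/E = 2^16 the loop stops at i = 7, since (3/2)^6 ≈ 11.4 is below lg(2^16) = 16 and (3/2)^7 ≈ 17.1 is not; plain Boruvka would need about 16 contraction steps for the same reduction.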
