
A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Presented by 黃則翰 (R96922141), 蘇承祖 (R96922077), 張紘睿 (R96922136), 許智程 (D95922022), 戴于晉 (R96922171). Paper by David R. Karger, Philip N. Klein, and Robert E. Tarjan.


Presentation Transcript


  1. A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees • Presenters: 黃則翰 R96922141, 蘇承祖 R96922077, 張紘睿 R96922136, 許智程 D95922022, 戴于晉 R96922171 • Paper authors: David R. Karger, Philip N. Klein, Robert E. Tarjan

  2. Outline • Introduction • Basic Property & Definition • Algorithm • Analysis

  3. Outline • Introduction • Basic Property & Definition • Algorithm • Analysis

  4. Introduction • Borůvka [1926]: O(m log n) • Gabow et al. [1984]: O(m log β(m,n)), where β(m,n) = min { i | log^(i) n ≤ m/n } and log^(i) is the logarithm iterated i times • Verification algorithm: King [1993], O(m) • A randomized algorithm that runs in O(m) time with high probability

  5. Outline • Introduction • Basic Property & Definition • Algorithm • Analysis

  6. Cycle property • For any cycle C in a graph, the heaviest edge in C does not appear in the minimum spanning forest. [Figure: example graph with edge weights 2, 3, 5, 6]

  7. Cut Property • For any proper nonempty subset X of the vertices, the lightest edge with exactly one endpoint in X belongs to the minimum spanning tree. [Figure: vertex subset X with crossing edges of weights 2, 3, 5, 6]

  8. Definition • Let G be a graph with weighted edges. • w(x,y): the weight of edge {x,y}. • Let F be a forest in G. • F(x,y): the path (if any) connecting x and y in F. • wF(x,y): the maximum weight of an edge on F(x,y). • wF(x,y) = ∞ if x and y are not connected in F.

  9. F-heavy & F-light • An edge {x,y} is F-heavy if w(x,y) > wF(x,y), and F-light otherwise. • The edges of F are all F-light. • Examples (figure: graph on vertices A to H with edge weights 2, 3, 5, 6): • w(C,D) = 5, wF(C,D) = 5: F-light • w(B,D) = 6, wF(B,D) = max{2,3,5} = 5: F-heavy • w(F,H) = 6, wF(F,H) = ∞: F-light
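
The definition above can be checked mechanically. The following is a minimal sketch, not the verification algorithm used in the paper: it finds wF(x,y) by a plain depth-first search, which takes O(n) time per query. The function names and the example forest are assumptions made for illustration; the forest is one hypothetical layout chosen to agree with the numbers quoted on the slide.

```python
import math
from collections import defaultdict


def max_edge_on_path(forest_edges, x, y):
    """Return wF(x, y): the heaviest edge weight on the path from x to y
    in the forest, or math.inf if x and y are not connected."""
    adj = defaultdict(list)
    for u, v, w in forest_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # Depth-first search from x, carrying the heaviest weight seen so far.
    # The forest is acyclic, so skipping the parent is enough to avoid loops.
    stack = [(x, None, -math.inf)]
    while stack:
        node, parent, heaviest = stack.pop()
        if node == y:
            return heaviest
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, max(heaviest, w)))
    return math.inf


def is_f_heavy(forest_edges, x, y, w):
    """An edge {x, y} of weight w is F-heavy if w > wF(x, y)."""
    return w > max_edge_on_path(forest_edges, x, y)


# Hypothetical forest (vertex H is isolated), consistent with the slide's numbers.
FOREST = [("A", "B", 2), ("A", "C", 3), ("C", "D", 5), ("E", "F", 2), ("E", "G", 3)]

print(is_f_heavy(FOREST, "C", "D", 5))  # False: 5 is not greater than wF = 5
print(is_f_heavy(FOREST, "B", "D", 6))  # True: 6 > max{2, 3, 5}
print(is_f_heavy(FOREST, "F", "H", 6))  # False: wF = infinity
```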

  10. Observation • No F-heavy edge can be in the minimum spanning forest of G (cycle property). • So F-heavy edges can be discarded: they cannot be in the minimum spanning tree. • Only F-light edges remain as candidate edges for the minimum spanning tree of G.

  11. Outline • Introduction • Basic Property & Definition • Algorithm • Analysis

  12. Boruvka Algorithm • For each vertex, select the minimum-weight edge incident to the vertex. • Replace by a single vertex each connected component defined by the selected edges. • Delete all resulting isolated vertices, loops, and all but the lowest-weight edge among each set of multiple edges.
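
The three bullets of one Borůvka step can be sketched as follows. This is an illustrative sketch under two assumptions that are not stated on the slide: edge weights are distinct (so the selected edges cannot form a cycle), and a graph is given as a vertex collection plus a list of `(u, v, w)` tuples. The name `boruvka_step` and its return shape are chosen here for illustration.

```python
def boruvka_step(vertices, edges):
    """One Borůvka step. Returns (selected, new_vertices, new_edges)."""
    # 1. For each vertex, select the minimum-weight incident edge.
    cheapest = {}
    for e in edges:
        u, v, w = e
        if u == v:
            continue  # ignore loops
        for x in (u, v):
            if x not in cheapest or w < cheapest[x][2]:
                cheapest[x] = e
    selected = set(cheapest.values())

    # 2. Contract each connected component of the selected edges (union-find).
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _ in selected:
        parent[find(u)] = find(v)

    # 3. Drop loops and keep only the lightest edge among parallel edges.
    lightest = {}
    for u, v, w in edges:
        cu, cv = find(u), find(v)
        if cu == cv:
            continue
        key = frozenset((cu, cv))
        if key not in lightest or w < lightest[key][2]:
            lightest[key] = (cu, cv, w)
    new_edges = list(lightest.values())
    # Isolated vertices disappear because only edge endpoints are kept.
    new_vertices = {x for e in new_edges for x in e[:2]}
    return selected, new_vertices, new_edges


selected, vs, es = boruvka_step("abcd", [("a", "b", 1), ("c", "d", 2), ("b", "c", 5), ("a", "d", 6)])
print(sorted(selected))  # [('a', 'b', 1), ('c', 'd', 2)]
print(len(vs), [w for _, _, w in es])  # 2 [5]
```

Every vertex with an incident edge is merged with at least one neighbour, which is why each step at least halves the number of vertices.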

  13. Algorithm Step1 • Apply two successive Boruvka steps to the graph, thereby reducing the number of vertices by at least a factor of four.

  14. Algorithm Step2 • Choose a subgraph H by selecting each edge independently with probability ½. • Apply the algorithm recursively to H, producing a minimum spanning forest F of H. • Find all the F-heavy edges and delete them from the contracted graph.

  15. Algorithm Step3 • Apply the algorithm recursively to the remaining graph to compute a spanning forest F’. Return those edges contracted in Step1 together with the edges of F’.
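
Steps 1 to 3 can be put together in a short recursive sketch. It follows the structure of the slides but is not a linear-time implementation: F-heavy edges are found by a naive path search instead of a linear-time verification algorithm. The assumptions are distinct edge weights and edges given as `(u, v, w, edge_id)` tuples; all function names are chosen here for illustration.

```python
import math
import random
from collections import defaultdict


def boruvka_step(edges):
    """Return (ids of selected edges, contracted edge list)."""
    edges = [e for e in edges if e[0] != e[1]]  # drop loops
    cheapest = {}
    for e in edges:
        for x in e[:2]:
            if x not in cheapest or e[2] < cheapest[x][2]:
                cheapest[x] = e
    parent = {x: x for x in cheapest}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = set()
    for u, v, _, eid in cheapest.values():
        chosen.add(eid)
        parent[find(u)] = find(v)
    lightest = {}  # lightest edge between each pair of contracted vertices
    for u, v, w, eid in edges:
        cu, cv = find(u), find(v)
        if cu != cv:
            key = frozenset((cu, cv))
            if key not in lightest or w < lightest[key][2]:
                lightest[key] = (cu, cv, w, eid)
    return chosen, list(lightest.values())


def f_light_edges(edges, forest):
    """Keep the edges that are F-light with respect to `forest` (naive search)."""
    adj = defaultdict(list)
    for u, v, w, _ in forest:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def path_max(x, y):
        stack = [(x, None, -math.inf)]
        while stack:
            node, par, best = stack.pop()
            if node == y:
                return best
            stack.extend((nxt, node, max(best, w)) for nxt, w in adj[node] if nxt != par)
        return math.inf  # x and y are not connected in the forest

    return [e for e in edges if e[2] <= path_max(e[0], e[1])]


def msf(edges, rng):
    """Return the set of edge ids of the minimum spanning forest."""
    if not edges:
        return set()
    # Step 1: two Borůvka steps; the selected edges belong to the MSF.
    c1, g = boruvka_step(edges)
    c2, g_star = boruvka_step(g)
    contracted = c1 | c2
    if not g_star:
        return contracted
    # Step 2: sample H with p = 1/2, recurse, and delete the F-heavy edges.
    h = [e for e in g_star if rng.random() < 0.5]
    by_id = {e[3]: e for e in g_star}
    forest = [by_id[i] for i in msf(h, rng)]
    g_prime = f_light_edges(g_star, forest)
    # Step 3: recurse on the remaining graph and add the contracted edges.
    return contracted | msf(g_prime, rng)
```

For example, `msf(edges, random.Random(0))` returns the same edge set for every seed, since the sampling affects only the running time and not the result.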

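  16. [Figure: flow of one call of the algorithm] • Original problem G • Borůvka × 2 gives G* • Sampling with p = 0.5 gives H (left sub-problem), which returns the minimum spanning forest F of H • Deleting the F-heavy edges from G* gives G′ (right sub-problem), which returns F′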

  17. Correctness • By the cut property, every edge contracted during Step1 is in the MSF. • By the cycle property, the edges deleted in Step2 do NOT belong to the MSF. • By the induction hypothesis, the MSF of the remaining graph is correctly determined in the recursive call of Step3.

  18. Candidate Edge of MST • The expected number of F-light edges in G is at most n/p (negative binomial). • Hence, for every sample graph H, the expected number of candidate edges for the MST of G (the F-light edges) is at most n/p.

  19. Random-sampling • Purpose: help discard edges that cannot be in the minimum spanning tree. • Construct the sample graph H and its forest F together: • Process the edges in increasing order of weight. • To process an edge e: • 1. Test whether both endpoints of e are in the same component of F (if so, e is F-heavy). • 2. Include e in H with probability p. • 3. If e is in H and is F-light, add e to the forest F.
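
The sampling procedure on this slide can be simulated directly. The sketch below assumes vertices numbered 0 to n−1, distinct weights, and edges given as `(u, v, w)` tuples; the function name is hypothetical. It relies on the fact that, when edges are processed in increasing order, an edge is F-light exactly when its endpoints lie in different components of the forest built so far.

```python
import random


def sample_and_count_f_light(n, edges, p, rng):
    """Build H and its forest F edge by edge.
    Returns (number of F-light edges of G, number of edges in F)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    f_light = forest_size = 0
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # F-heavy: already connected by lighter edges of F
        f_light += 1  # F-light, whether or not the coin selects it
        if rng.random() < p:  # coin flip: include the edge in H
            parent[ru] = rv  # the edge is in H and F-light, so it joins F
            forest_size += 1
    return f_light, forest_size


# Complete graph on 30 vertices with random distinct weights (illustrative input).
rng = random.Random(0)
n = 30
pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
weights = rng.sample(range(10 * len(pairs)), len(pairs))
edges = [(u, v, w) for (u, v), w in zip(pairs, weights)]
trials = [sample_and_count_f_light(n, edges, 0.5, rng)[0] for _ in range(200)]
print(sum(trials) / len(trials), "vs. bound n/p =", n / 0.5)
```

Each F-light edge corresponds to one coin flip and each head adds an edge to F, so the count of F-light edges is bounded by the number of flips needed to see n−1 heads.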

  20. Random-sampling • Examples (figure: graph G and sample graph H on vertices A to G): • w(E,G) = 14, wF(E,G) = max{5,6,9,13} = 13: F-heavy • w(E,F) = 11, wF(E,F) = max{5,6,9} = 9: F-heavy • w(D,F) = 9, wF(D,F) = 9: F-light • w(A,B) = 7, wF(A,B) = ∞: F-light

  21. Random-sampling • Process the edges in increasing order. • If the edge is F-light, toss the coin: on heads, select the edge; otherwise, do not select it. • If the edge is F-heavy, toss the coin but do not select the edge for F. [Figure: edges of G selected at random into H, and the forest F of H]

  22. Observation • No F-heavy edge can be in the minimum spanning forest of G (cycle property). • Only F-light edges are candidate edges for the minimum spanning tree of G. • The forest F produced is the forest that Kruskal's algorithm would produce on H, and the F-light edges include every edge of the MSF of G.

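  23. Observation • The size of F is at most n−1. • The expected number of F-light edges in G is at most n/p (negative binomial): the number of trials needed to obtain k successes, each with probability p, has mean k/p; with k ≤ n−1 edges added to F, the expected number of F-light edges is at most (n−1)/p ≤ n/p.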

  24. Outline • Introduction • Basic Property & Definition • Algorithm • Analysis

  25. Analysis of the Algorithm • The worst case. • The expected running time. • The probability of linear running time.

  26. Running time Analysis • Total running time = sum of the running times of all steps. • Step 1: two Borůvka steps. • Step 2: the Dixon-Rauch-Tarjan verification algorithm. • Each takes time linear in the number of edges. • So it suffices to estimate the total number of edges over all subproblems.

  27. Observe the recursion tree • G = (V,E), |V| = n, |E| = m. • m ≥ n/2 since there are no isolated vertices. • Each problem generates at most 2 subproblems. • At depth d, there are at most $2^d$ nodes. • Each node at depth d has at most $n/4^d$ vertices. • The depth d is at most $\log_4 n$. • There are at most $\sum_{d \ge 0} 2^d \cdot n/4^d = \sum_{d \ge 0} n/2^d = 2n$ vertices in all subproblems.

  28. The worst case • Theorem 4.1: The worst-case running time of the minimum-spanning-forest algorithm is $O(\min\{n^2, m \log n\})$, the same as the bound for Borůvka's algorithm. • Proof: There are two different ways to estimate the number of edges. • 1. A subproblem at depth d contains at most $(n/4^d)^2/2$ edges. • The total number of edges in all subproblems is at most $\sum_{d \ge 0} 2^d \cdot (n/4^d)^2/2 = \frac{n^2}{2} \sum_{d \ge 0} 8^{-d} = O(n^2)$.

  29. The worst case • 2. Consider a subproblem G = (V,E). After Step 1 we have G′ = (V′,E′) with |E′| ≤ |E| − |V|/2 and |V′| ≤ |V|/4. • Edges in the left child = |H|. • Edges in the right child ≤ |E′| − |H| + |F|. • So the number of edges in the two subproblems is at most |H| + (|E′| − |H| + |F|) = |E′| + |F| ≤ |E| − |V|/2 + |V|/4 ≤ |E|, using |F| ≤ |V′| ≤ |V|/4. • The two subproblems together contain at most |E| edges.

  30. The worst case [Figure: recursion tree in which each level contains at most m edges in total]

  31. The worst case • The depth is at most $\log_4 n$ and each level has at most m edges, so there are at most O(m log n) edges in total. • The worst-case running time of the minimum-spanning-forest algorithm is $O(\min\{n^2, m \log n\})$.

  32. Analysis of the Algorithm • The worst case. • The expected running time. • The probability of linear running time.

  33. Analysis – Average Case (1/8) • Theorem: the expected running time of the minimum spanning forest algorithm is O(m). • Approach: calculate the expected total number of edges over all left paths of the recursion tree. [Figure: recursion tree with the original problem, its left and right sub-problems, and their left and right sub-sub-problems]

  34. Analysis – Average Case (2/8) • Calculate the expected total number of edges on one left path starting at a problem with m′ edges: the expected total is ≤ 2m′. • Evaluate the total number of edges in all right sub-problems.

  35. Analysis – Average Case (3/8) • Calculate the expected total number of edges on one left path starting at a problem with m′ edges. • 1. E[number of edges of H] = 0.5 × number of edges of G*. • 2. Because of the two Borůvka steps, number of edges of G* ≤ number of edges of G. • Therefore E[number of edges of H] ≤ 0.5 × number of edges of G. [Figure: original problem G, Borůvka × 2 gives G*, sampling with p = 0.5 gives H (left sub-problem), G′ is the right sub-problem]

  36. Analysis – Average Case (4/8) • Calculate the expected total number of edges on one left path starting at a problem with m′ edges. • The starting problem has m′ edges; since E[number of edges of H] ≤ 0.5 × number of edges of G, its left child has at most 0.5 × m′ edges in expectation, and so on down the path. • Expected total number of edges ≤ $\sum_{i \ge 0} m'/2^i = 2m'$.

  37. Analysis – Average Case (5/8) • Calculate the expected total number of edges on one left path L starting at a problem with m′ edges: expected total number of edges on L ≤ 2m′ (done). • Evaluate the total number of edges of all right sub-problems: E[total edges of all right sub-problems] ≤ n.

  38. Analysis – Average Case (6/8) • Evaluate the total number of edges of all right sub-problems. • To prove: E[total edges of all right sub-problems] ≤ n. • 1. Because of the two Borůvka steps, number of vertices of G* ≤ 0.25 × number of vertices of G. • 2. By Lemma 2.1 (with p = 0.5), E[number of edges of G′] ≤ 2 × number of vertices of G*. • Therefore E[number of edges of G′] ≤ 0.5 × number of vertices of G. [Figure: original problem G, Borůvka × 2 gives G*, sampling with p = 0.5 gives H (left sub-problem, returns the minimum forest F of H), deleting F-heavy edges from G* gives G′ (right sub-problem)]

  39. Analysis – Average Case (7/8) • Evaluate the total number of edges of all right sub-problems. • To prove: E[total edges of all right sub-problems] ≤ n, using E[number of edges of G′] ≤ 0.5 × number of vertices of G. • Depth 0: the original problem has n vertices; its right sub-problem has ≤ n/2 edges. • Depth 1: the sub-problems have ≤ 2×n/4 vertices in total; their right sub-problems have ≤ 2×n/8 edges. • Depth 2: the sub-problems have ≤ $4 \times n/4^2$ vertices; their right sub-problems have ≤ $4 \times n/(4^2 \times 2)$ edges. • Depth 3: the sub-problems have ≤ $8 \times n/4^3$ vertices; their right sub-problems have ≤ $8 \times n/(4^3 \times 2)$ edges. • Depth 4: the sub-problems have ≤ $16 \times n/4^4$ vertices. • Summing over all depths: n/2 + n/4 + n/8 + … ≤ n.

  40. Analysis – Average Case (8/8) • Expected total number of edges on one left path starting at a problem with m′ edges ≤ 2m′. • E[total edges of all right sub-problems] ≤ n. • Every left path starts either at the original problem (m edges) or at a right sub-problem, so E[number of edges processed in the original problem and all sub-problems] ≤ 2 × (m + n).
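
The two estimates combine as in the short derivation below. The notation is introduced here for illustration and does not appear on the slides: P ranges over the left paths of the recursion tree, $m_P$ is the number of edges of the problem at the head of P, and R is the total number of edges in all right sub-problems.

```latex
\begin{align*}
\mathbb{E}[\text{total edges}]
  &= \mathbb{E}\Big[\sum_{P} \mathbb{E}[\text{edges on } P \mid m_P]\Big]
   \le \mathbb{E}\Big[\sum_{P} 2\, m_P\Big] \\
  &= 2\,\big(m + \mathbb{E}[R]\big)
   \qquad \text{(each path head is the root or a right sub-problem)} \\
  &\le 2\,(m + n).
\end{align*}
```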

  41. Analysis of the Algorithm • The worst case. • The expected running time. • The probability of linear running time.

  42. The Probability of Linearity • Theorem 4.3 • The minimum spanning forest algorithm runs in Ο(m) time with probability 1 – exp(-Ω(m))

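  43. The Probability of Linearity • Chernoff bound: let $x_i$, 0 < i ≤ n, be i.i.d. random variables and let X be the sum of all $x_i$; for t > 0 the bound limits the probability that X deviates from its mean [formula lost in extraction]. • In particular, it bounds the probability of fewer than s successes (each with probability p) in k trials [formula lost in extraction].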
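
For reference, one standard form of the Chernoff bound for the lower tail is given below. It is stated here as an assumption and may differ from the form shown on the original slide. X is the number of successes in k independent trials, each succeeding with probability p, and 0 < δ < 1.

```latex
\begin{align*}
\Pr\big[X \le (1-\delta)\,kp\big] &\le \exp\!\Big(-\frac{\delta^2\, kp}{2}\Big), \\
\text{so with } s = (1-\delta)\,kp: \qquad
\Pr[X < s] &\le \exp\!\Big(-\frac{(kp - s)^2}{2\,kp}\Big).
\end{align*}
```

When s is a constant fraction below the mean kp, the right-hand side is exp(−Ω(k)).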

  44. The Probability of Linearity • Right Subproblems • The total number of vertices in all right subproblems is at most n/2 (shown in the proof of Theorem 4.2). • So n/2 is an upper bound on the total number of heads in the nickel flips.

  45. Right Subproblems • The bad event: fewer than n/2 heads occur in a sequence of 3m nickel tosses. • m + n ≤ 3m since n/2 ≤ m. • The probability of this event is exp(−Ω(m)) by a Chernoff bound.

  46. The Probability of Linearity • Left Subproblems • Sequence: every sequence of coin tosses ends with a tail, that is, HH…HHT. • The number of tails is at most the number of sequences. • Assume that there are at most m′ edges in the root problem and in all right subproblems.

  47. Left Subproblems • The bad event: it takes a sequence of more than 3m′ coin tosses to obtain m′ tails. • The probability of this event is exp(−Ω(m)) by a Chernoff bound.
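
The exponential decay can be checked numerically. Taking more than 3m′ fair tosses to see m′ tails is the same event as seeing fewer than m′ tails in the first 3m′ tosses, so the sketch below computes that binomial tail exactly. The function name is an assumption made for illustration.

```python
from math import comb


def prob_fewer_than(s, k, p=0.5):
    """Exact probability of fewer than s successes in k independent trials."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(s))


# Probability that 3m' fair tosses contain fewer than m' tails.
for m_prime in (5, 10, 20, 40, 80):
    print(m_prime, prob_fewer_than(m_prime, 3 * m_prime))
```

The printed probabilities shrink rapidly as m′ grows, in line with the exp(−Ω(m′)) behaviour that the Chernoff bound predicts.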

  48. The Probability of Linearity • Combining Right & Left Subproblems • The total number of edges is Ο(m) with probability 1 – exp(-Ω(m)).
