Minimum Spanning Trees Featuring Disjoint Sets. HKOI Training 2006 Liu Chi Man (cx) 25 Mar 2006. Prerequisites. Asymptotic complexity Set theory Elementary graph theory Priority queues (or heaps). Graphs. A graph is a set of vertices and a set of edges G = (V, E)
Minimum Spanning Trees Featuring Disjoint Sets • HKOI Training 2006 • Liu Chi Man (cx) • 25 Mar 2006
Prerequisites • Asymptotic complexity • Set theory • Elementary graph theory • Priority queues (or heaps)
Graphs • A graph is a set of vertices and a set of edges • G = (V, E) • Number of vertices = |V| • Number of edges = |E| • We assume a simple graph, so |E| = O(|V|²)
Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics
Trees in graph theory • In graph theory, a tree is an acyclic, connected graph • Acyclic means “without cycles”
Properties of trees • |E| = |V| - 1 • |E| = Θ(|V|) • Between any pair of vertices, there is a unique path • Adding an edge between a pair of non-adjacent vertices creates exactly one cycle • Removing an edge from the tree breaks the tree into two smaller trees
Definition? • The following four conditions are equivalent: • G is connected and acyclic • G is connected and |E| = |V| - 1 • G is acyclic and |E| = |V| - 1 • Between any pair of vertices in G, there exists a unique path • G is a tree if at least one of the above conditions is satisfied
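The second condition above (connected and |E| = |V| - 1) is probably the easiest one to test in code. The following is a minimal sketch, not part of the original slides; the function name `is_tree`, the 0-based vertex labels, and the choice to treat the empty graph as "not a tree" are all assumptions made for illustration.

```python
from collections import deque


def is_tree(n, edges):
    """Return True if the simple undirected graph on vertices 0..n-1 is a tree.

    Uses the condition "connected and |E| = |V| - 1".
    Assumption: the empty graph (n == 0) is reported as not a tree.
    """
    if n == 0 or len(edges) != n - 1:
        return False

    # Build adjacency lists.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Breadth-first search from vertex 0 to count reachable vertices.
    seen = [False] * n
    seen[0] = True
    queue = deque([0])
    reached = 1
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                reached += 1
                queue.append(v)

    # Connected if and only if every vertex was reached.
    return reached == n
```

Checking the edge count first means the search only runs on graphs with exactly |V| - 1 edges, so the whole test takes O(|V|) time.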
Other properties of trees • Bipartite • Planar • A tree with at least two vertices has at least two leaves (vertices of degree 1)
Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics
The Union-Find problem • N balls initially, each ball in its own bag • Label the balls 1, 2, 3, ..., N • Two kinds of operations: • Pick two bags, put all balls in these bags into a new bag (Union) • Given a ball, find the bag containing it (Find)
The Union-Find problem • An example with 4 balls • Initial: {1}, {2}, {3}, {4} • Union {1}, {3} → {1, 3}, {2}, {4} • Find 3. Answer: {1, 3} • Union {4}, {1, 3} → {1, 3, 4}, {2} • Find 2. Answer: {2} • Find 1. Answer: {1, 3, 4}
Disjoint sets • Disjoint-set data structures can be used to solve the union-find problem • Each bag has its own representative ball • {1, 3, 4} is represented by ball 3 (for example) • {2} is represented by ball 2
Implementation 1: Naive arrays • Bag[x] := representative of the bag containing x • <O(N), O(1)> • Union takes O(N) and Find takes O(1) • Slight modifications give <O(U), O(1)> • U is the size of the union • Worst case: O(MN) for M operations
Implementation 1: Naive arrays • How to union the bags containing x and y? • Z := Bag[x]; for each ball v with Bag[v] = Z do Bag[v] := Bag[y] • Can I update the balls in y's bag instead? • Rule: Update the balls in the smaller bag • O(M lg N) for M union operations
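The "update the smaller bag" rule can be sketched as follows. This is an illustrative sketch rather than the slides' own code: the class name `ArrayBags` and the extra `members` lists (used to enumerate a bag without scanning all N balls) are assumptions.

```python
class ArrayBags:
    """Naive-array disjoint sets with the "relabel the smaller bag" rule.

    Balls are labelled 1..n; index 0 is unused.
    """

    def __init__(self, n):
        # bag[x] is the representative of the bag containing ball x.
        self.bag = list(range(n + 1))
        # members[r] lists the balls whose representative is r
        # (hypothetical helper structure, not in the slides).
        self.members = [[x] for x in range(n + 1)]

    def find(self, x):
        # O(1): the representative is stored directly.
        return self.bag[x]

    def union(self, x, y):
        rx, ry = self.bag[x], self.bag[y]
        if rx == ry:
            return  # already in the same bag
        # Make rx the representative of the smaller bag.
        if len(self.members[rx]) > len(self.members[ry]):
            rx, ry = ry, rx
        # Relabel every ball of the smaller bag.
        for v in self.members[rx]:
            self.bag[v] = ry
        self.members[ry].extend(self.members[rx])
        self.members[rx] = []
```

A ball is relabelled only when its bag at least doubles in size, so each ball is relabelled at most lg N times; that is where the logarithmic factor comes from.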
Implementation 2: Forest • A forest is a collection of trees • Each bag is represented by a rooted tree, with the root being the representative ball • Example: two bags, {1, 3, 5} and {2, 4, 6, 7} • [Figure: the two rooted trees]
Implementation 2: Forest • Find(x) • Traverse from x up to the root • Union(x, y) • Merge the two trees containing x and y
Implementation 2: Forest • [Figure: the forest on balls 1 to 4 after each step] • Initial • Union 1 3 • Union 2 4 • Find 4
Implementation 2: Forest • [Figure: the forest on balls 1 to 4 after each step] • Union 1 4 • Find 4
Implementation 2: Forest • How to represent the trees? • Leftmost-Child-Right-Sibling (LCRS)? • Too complicated • Parent array • Parent[x] := parent of x • If x is a tree root, set Parent[x] := x
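A parent-array forest without any of the later improvements might look like the sketch below. The function names `make_forest`, `find`, and `union` are chosen for illustration, and which root ends up on top after a union is an arbitrary choice here.

```python
def make_forest(n):
    """Create n single-ball trees for balls 1..n (index 0 is unused).

    parent[x] == x marks x as a tree root.
    """
    return list(range(n + 1))


def find(parent, x):
    """Walk from x up to the root of its tree and return the root."""
    while parent[x] != x:
        x = parent[x]
    return x


def union(parent, x, y):
    """Merge the trees containing x and y by linking one root under the other."""
    rx, ry = find(parent, x), find(parent, y)
    if rx != ry:
        parent[rx] = ry  # arbitrary choice: x's root goes under y's root
```

For example, after `parent = make_forest(4)`, `union(parent, 1, 3)` and `union(parent, 4, 1)`, the calls `find(parent, 1)`, `find(parent, 3)` and `find(parent, 4)` all return the same root, while `find(parent, 2)` returns 2.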
Implementation 2: Forest • The worst case is still O(MN) for M operations • What is the worst case? • Improvements • Union-by-rank • Path compression
Union-by-rank • We should avoid tall trees • On a union, the root of the taller tree becomes the new root • So, keep track of tree heights (ranks) • [Figure: a bad union and a good union]
Path compression • See also the solution for Symbolic Links (HKOI2005 Senior Final) • Find(x): traverse from x up to root • Compress the x-to-root path at the same time
Path compression • Find(4) • [Figure: the tree before, during and after Find(4); the root is 3]
U-by-rank + Path compression • We ignore the effect of path compression on tree heights to simplify U-by-rank • U-by-rank alone gives O(M lg N) • U-by-rank + path compression gives O(M α(N)) • α: inverse Ackermann function • α(N) ≤ 5 for practically large N
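Combining the two improvements could look like the following sketch. The class name `DisjointSets` and the boolean return value of `union` are assumptions made for illustration; as on the slide, ranks are not adjusted when a path is compressed.

```python
class DisjointSets:
    """Disjoint-set forest with union-by-rank and path compression.

    Balls are labelled 1..n; index 0 is unused.
    """

    def __init__(self, n):
        self.parent = list(range(n + 1))  # parent[x] == x marks a root
        self.rank = [0] * (n + 1)         # upper bound on each tree's height

    def find(self, x):
        # First pass: walk up to locate the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every vertex on the x-to-root path at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, x, y):
        """Merge the bags of x and y; return False if they were already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # The root with the larger rank becomes the new root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```

The iterative two-pass `find` avoids deep recursion, which matters in Python when a tree is tall before its first compression.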
Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics
Minimum spanning trees • Given a connected graph G = (V, E), a spanning tree of G is a graph T such that • T is a subgraph of G • T is a tree • T contains every vertex of G • A connected graph must have at least one spanning tree
Minimum spanning trees • Given a weighted connected graph G, a minimum spanning tree T* of G is a spanning tree of G with minimum total edge weight • Application: Minimizing the total length of wires needed to connect up a collection of computers
Minimum spanning trees • Two algorithms • Kruskal’s algorithm • Prim’s algorithm
Kruskal’s algorithm • Choose edges in ascending weight greedily, while preventing cycles
Kruskal’s algorithm • Algorithm • T is an empty set • Sort the edges in G by their weights • For (in ascending weight) each edge e do • If T ∪ {e} is acyclic then • Add e to T • Return T
Kruskal’s algorithm • How to detect a cycle? • Depth-first search (DFS) • O(V) per check • O(VE) overall • Disjoint set • Vertices are balls, connected components are bags
Kruskal’s algorithm • Algorithm (using disjoint-set) • T is an empty set • Create bags {1}, {2}, …, {V} • Sort the edges in G by their weights • For (in ascending weight) each edge e do • Suppose e connects vertices x and y • If Find(x) ≠ Find(y) then • Add e to T, then Union(Find(x), Find(y)) • Return T
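The disjoint-set version of Kruskal's algorithm can be sketched in a few lines. This is an illustrative sketch under stated assumptions: edges are given as `(weight, x, y)` tuples, vertices are labelled 1..n, and the embedded `find` uses path halving (a simple variant of path compression) without union-by-rank to keep the example short.

```python
def kruskal(n, edges):
    """Return (total_weight, tree_edges) of a minimum spanning forest.

    n      -- number of vertices, labelled 1..n
    edges  -- list of (weight, x, y) tuples (assumed input format)
    """
    parent = list(range(n + 1))  # one bag per vertex; index 0 unused

    def find(x):
        # Path halving: every visited vertex is re-pointed at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    tree = []
    # Tuples sort by weight first, so this visits edges in ascending weight.
    for w, x, y in sorted(edges):
        rx, ry = find(x), find(y)
        if rx != ry:          # x and y are in different components: no cycle
            parent[rx] = ry   # Union
            total += w
            tree.append((x, y, w))
    return total, tree
```

If the input graph is connected, `tree` holds exactly n - 1 edges; otherwise the function returns a minimum spanning forest, which a caller can detect by checking `len(tree)`.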
Kruskal’s algorithm • The improved time complexity is O(ElgV) • The bottleneck is sorting
Prim’s algorithm • In Kruskal’s algorithm, the MST-in-progress scatters around • Prim’s algorithm grows the MST from a “seed” • Prim’s algorithm iteratively chooses the lightest grow-able edge • A grow-able edge connects a grown vertex and a non-grown vertex
Prim’s algorithm • Algorithm • Let seed be any vertex, and Grown := {seed} • Initially T is an empty set • Repeat |V|-1 times • Let e=(x,y) be the lightest grow-able edge • Add e to T • Add x and y to Grown • Return T
Prim’s algorithm • How to find the lightest grow-able edge? • Check all (grown, non-grown) vertex pairs • Too slow • Each non-grown vertex x keeps a value nearest[x], which is the weight of the lightest edge connecting x to some grown vertex • nearest[x] = ∞ if no such edge
Prim’s algorithm • How to use nearest? • Grow the vertex (x) with the minimum nearest-value • Which edge? Keep track of it! • Since x has just been grown, we need to update the nearest-values of all non-grown vertices • Only need to consider edges incident to x
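The nearest-value idea can be sketched with plain arrays as below. The adjacency-matrix input, the 0-based labels, the fixed seed (vertex 0), and the helper array `via` (which records the edge behind each nearest-value) are assumptions made for illustration.

```python
import math


def prim_dense(n, weight):
    """Prim's algorithm with a nearest[] array.

    n       -- number of vertices, labelled 0..n-1
    weight  -- n x n symmetric matrix; math.inf where there is no edge
               (assumed input format)
    Returns (total_weight, tree_edges).
    """
    grown = [False] * n
    nearest = [math.inf] * n  # lightest edge weight to some grown vertex
    via = [-1] * n            # grown endpoint of that lightest edge
    nearest[0] = 0            # vertex 0 is the seed
    total = 0
    tree = []

    for _ in range(n):
        # Pick the non-grown vertex with the minimum nearest-value.
        x = min((v for v in range(n) if not grown[v]), key=lambda v: nearest[v])
        if nearest[x] == math.inf:
            raise ValueError("graph is not connected")
        grown[x] = True
        if via[x] != -1:      # the seed has no incoming tree edge
            total += nearest[x]
            tree.append((via[x], x))
        # Only edges incident to x can improve a nearest-value.
        for y in range(n):
            if not grown[y] and weight[x][y] < nearest[y]:
                nearest[y] = weight[x][y]
                via[y] = x
    return total, tree
```

Both the minimum search and the update scan all n vertices each round, so this version suits dense graphs stored as a matrix.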
Prim’s algorithm • Try to program Prim’s algorithm • You may find that it’s very similar to Dijkstra’s algorithm for finding shortest paths! • Almost only a one-line difference
Prim’s algorithm • Per round... • Finding minimum nearest-value: O(V) • Updating nearest-values: O(V) (Overall O(E)) • Overall: O(V² + E) = O(V²) time • Using a binary heap, • O(lg V) per Finding minimum • O(lg V) per Updating • Overall: O(E lg V) time
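A heap-based variant might look like the sketch below. It is not the slides' implementation: Python's `heapq` module has no decrease-key operation, so this sketch pushes a fresh entry for every candidate edge and skips stale entries when they are popped ("lazy deletion"). The heap can then hold up to O(E) entries, which still gives O(E lg E) = O(E lg V) for a simple graph. The adjacency-list format and the seed (vertex 0) are assumptions.

```python
import heapq


def prim_heap(n, adj):
    """Prim's algorithm with a binary heap and lazy deletion.

    n    -- number of vertices, labelled 0..n-1
    adj  -- adj[x] is a list of (weight, y) pairs (assumed input format)
    Returns the total weight of a minimum spanning tree.
    """
    grown = [False] * n
    heap = [(0, 0)]  # (candidate nearest-value, vertex); vertex 0 is the seed
    total = 0
    count = 0

    while heap and count < n:
        w, x = heapq.heappop(heap)
        if grown[x]:
            continue  # stale entry: x was already grown via a lighter edge
        grown[x] = True
        total += w
        count += 1
        # Offer every edge from x to a non-grown vertex as a candidate.
        for wy, y in adj[x]:
            if not grown[y]:
                heapq.heappush(heap, (wy, y))

    if count < n:
        raise ValueError("graph is not connected")
    return total
```

Pushing `(w + wy, y)` instead of `(wy, y)` would turn this into the usual heap-based Dijkstra's algorithm, which is the near one-line difference between the two.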
MST Extensions • Second-best MST • We don’t want the best! • Online MST • See IOI2003 Path Maintenance • Minimum bottleneck spanning tree • The bottleneck of a spanning tree is the weight of its maximum weight edge • An algorithm that runs in O(V+E) exists
MST Extensions (NP-Hard) • Minimum Steiner Tree • No need to connect all vertices, but at least a given subset B ⊆ V • Degree-bounded MST • Every vertex of the spanning tree must have degree not greater than a given value K • For a discussion of NP-hardness, please attend [Talk] Introduction to Complexity Theory on 3 June
Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics
Various tree topics (List) • Center, eccentricity, radius, diameter • Tree isomorphism • Canonical representation • Prüfer code • Lowest common ancestor (LCA) • Counting spanning trees
Supplementary readings • Advanced: • Disjoint set forest (Lecture slides) • Prim’s algorithm • Kruskal’s algorithm • Center and diameter • Post-advanced (so-called Beginners): • Lowest common ancestor • Maximum branching