Explore the concepts of Minimum Spanning Trees (MSTs) and delve into the intricacies of Kruskal's Algorithm for finding the optimal tree. Learn about unique MSTs, algorithm correctness, implementation details, and the running time analysis.
CPSC 311 Analysis of Algorithms
More Graph Algorithms
Prof. Jennifer Welch
Fall 2009
CPSC 311, Fall 2009
Minimum Spanning Tree
[Figure: example weighted graph; edge weights 2 through 18]
• Find a subset of edges that spans all the nodes, creates no cycle, and minimizes the sum of weights
Facts About MSTs
• There can be many spanning trees of a graph
• In fact, there can be many minimum spanning trees of a graph
• But if every edge has a unique weight, then there is a unique MST
Uniqueness of MST
• Suppose, for contradiction, that there are 2 MSTs, M1 and M2
• Let e be the edge with minimum weight that is in one but not the other (say it is in M1)
• If e is added to M2, a cycle is formed
• Let e' be an edge in the cycle that is not in M1 (one exists, since M1 contains no cycle)
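Uniqueness of MST
[Figure: M2 with e added, forming a cycle that contains e and e']
• e: in M1 but not M2
• e': in M2 but not M1, so wt(e') > wt(e) by the choice of e and because weights are unique
• Replacing e' with e in M2 creates a spanning tree M3 whose weight is less than that of M2: contradiction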
Generic MST Algorithm
• input: weighted undirected graph G = (V,E,w)
• T := empty set
• while T is not yet a spanning tree of G
  • find an edge e in E s.t. T U {e} is a subgraph of some MST of G
  • add e to T
• return T (as MST of G)
Kruskal's MST Algorithm
[Figure: example weighted graph; edge weights 2 through 18]
• Consider the edges in increasing order of weight; add an edge iff it does not cause a cycle
Kruskal's Algorithm as a Special Case of Generic Alg.
• Consider edges in increasing order of weight
• Add the next edge iff it doesn't cause a cycle
• At any point, T is a forest (set of trees); eventually T is a single tree
Kruskal's Algorithm
• input: G = (V,E,w) // weighted graph
• T := Ø // subset of E
• sort E by increasing weights
• for each (u,v) in E in sorted order do
  • if T U {(u,v)} has no cycle then T := T U {(u,v)}
• return (V,T)
Why is Kruskal's Greedy?
• Algorithm manages a set of edges s.t.
  • these edges are a subset of some MST
• At each iteration:
  • choose an edge so that the MST-subset property remains true
  • subproblem left is to do the same with the remaining edges
• Always try to add cheapest available edge that will not violate the tree property
  • locally optimal choice
Correctness of Kruskal's Alg.
• Let e_1, e_2, …, e_{n-1} be the sequence of edges chosen
• Clearly they form a spanning tree
• Suppose it is not minimum weight
• Let e_i be the edge where the algorithm goes wrong
  • {e_1,…,e_{i-1}} is part of some MST M
  • but {e_1,…,e_i} is not part of any MST
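Correctness of Kruskal's Alg.
[Figure: white edges are part of MST M, which contains e_1 to e_{i-1}, but not e_i]
• Adding e_i to M forms a cycle
• e*: min wt. edge in the cycle that is not among e_1 to e_{i-1}
• wt(e*) > wt(e_i): e* creates no cycle with e_1 to e_{i-1} (all are in M), so if it were lighter than e_i the algorithm would have chosen it earlier
• Replacing e* with e_i forms a spanning tree with smaller weight than M: contradiction!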
Note on Correctness Proof
• Argument on previous slide works for the case when every edge has a unique weight
• Algorithm also works when edge weights are not necessarily unique
• Modify proof on previous slide: now wt(e*) ≥ wt(e_i), so replacing e* with e_i gives a spanning tree no heavier than M, i.e., an MST containing e_1 to e_i. This contradicts the assumption that {e_1,…,e_i} is not part of any MST
Implementing Kruskal's Alg.
• Sort edges by weight
  • efficient algorithms known
• How to test quickly if adding in the next edge would cause a cycle?
  • use Disjoint Sets data structure…
Disjoint Sets in Kruskal's Alg.
• Use Disjoint Sets data structure to test whether adding an edge to a set of edges would cause a cycle
• Each set represents nodes that are connected using edges chosen so far
• Initially put each node in a set by itself using Make-Set
• When testing edge (u,v), call Find-Set to check if u and v are in the same set
  • if so, then a cycle would result if (u,v) is chosen
• When edge (u,v) is chosen, call Union(u,v)
Kruskal's Algorithm #2
• input: G = (V,E,w) // weighted graph
• T := Ø // subset of E
• sort E by increasing weights
• for each v in V do MakeSet(v)
• for each (u,v) in E in sorted order do
  • a := FindSet(u)
  • b := FindSet(v)
  • if a ≠ b then
    • T := T U {(u,v)}
    • Union(a,b)
• return (V,T)
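The pseudocode above can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation: the function name `kruskal`, the 0-based integer node labels, and the `(weight, u, v)` edge tuples are assumptions made for illustration. The disjoint-sets structure is inlined as two arrays and uses union by rank with path compression.

```python
def kruskal(num_nodes, edges):
    """Return (total_weight, tree_edges) for a connected undirected graph.

    edges is a list of (weight, u, v) tuples; nodes are 0..num_nodes-1
    (a representation assumed for this sketch).
    """
    parent = list(range(num_nodes))  # MakeSet(v) for every node v
    rank = [0] * num_nodes

    def find_set(x):
        # Walk up to the root, then point every node on the path at it.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            nxt = parent[x]
            parent[x] = root
            x = nxt
        return root

    def union(a, b):
        # a and b are distinct roots; hang the lower-rank root under the other.
        if rank[a] < rank[b]:
            a, b = b, a
        parent[b] = a
        if rank[a] == rank[b]:
            rank[a] += 1

    tree = []
    total = 0
    for w, u, v in sorted(edges):      # increasing order of weight
        a, b = find_set(u), find_set(v)
        if a != b:                     # different sets: no cycle is created
            tree.append((u, v, w))
            total += w
            union(a, b)
    return total, tree
```

For example, `kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)])` keeps the edges of weight 1, 2 and 4 and rejects the edges of weight 3 and 5 because each would close a cycle.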
Running Time of Kruskal's MST Algorithm
• Sorting the edges takes O(E log E) time
• The rest of the time is proportional to the total time taken by all the calls to MakeSet, FindSet, and Union
• Number of calls to MakeSet is |V|
• If we unravel the for loop, we can see:
  • total number of calls to FindSet is O(E)
  • total number of calls to Union is O(V)
• So number of operations is O(V+E), with |V| MakeSet ops
• If the Disjoint Sets data structure is implemented using union by rank and path compression, the time for all the Disjoint Sets operations is O((V+E) log* V)
• Total time is O(E log E), since the graph is connected, so V ≤ E + 1; this equals O(E log V), since E is O(V^2) and log V^2 = 2 log V
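The last step of the analysis can be written out term by term. The derivation below only restates the bounds from the slide; the name T(V,E) for the total running time is a notational assumption.

```latex
\begin{align*}
T(V,E) &= O(E \log E) + O\big((V+E)\log^* V\big) \\
       &= O(E \log E) + O(E \log^* V) && \text{since } V \le E + 1 \\
       &= O(E \log E) && \text{since } \log^* V = O(\log E) \\
       &= O(E \log V) && \text{since } E \le V^2 \text{ gives } \log E \le 2 \log V
\end{align*}
```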
Another Greedy MST Alg.
• Kruskal's algorithm maintains a forest that grows until it forms a spanning tree
• Alternative idea is to keep just one tree and grow it until it spans all the nodes
  • Prim's algorithm
• At each iteration, choose the minimum weight outgoing edge to add
  • greedy!
Idea of Prim's Algorithm
• Instead of growing the MST as possibly multiple trees that eventually all merge, grow the MST from a single node, so that there is only one tree at any point
• Also a special case of the generic algorithm: at each step, add the minimum weight edge that goes out from the tree constructed so far
Prim's Algorithm
• input: weighted undirected graph G = (V,E,w)
• T := empty set
• S := {any node in V}
• while |T| < |V| - 1 do
  • let (u,v) be a min wt. outgoing edge (u in S, v not in S)
  • add (u,v) to T
  • add v to S
• return (S,T) (as MST of G)
Prim's Algorithm Example
[Figure: example weighted graph on nodes a through i]
Correctness of Prim's Algorithm
• Let T_i be the tree represented by (S,T) at the end of iteration i
• Show by induction on i that T_i is a subtree of some MST of G
• Basis: i = 0 (before first iteration). T_0 contains just a single node, and thus is a subtree of every MST of G
Correctness of Prim's Algorithm
[Figure: T_i and T_{i+1}, with edge (u,v) leaving T_i]
• Induction: Assume T_i is a subtree of some MST M. We must show T_{i+1} is a subtree of some MST
• Let (u,v) be the edge added in iteration i+1
• Case 1: (u,v) is in M. Then T_{i+1} is also a subtree of M
Correctness of Prim's Algorithm
[Figure: T_i containing u and x; path P in M from u to v leaves T_i along edge (x,y)]
• Case 2: (u,v) is not in M
  • There is a path P in M from u to v, since M spans G
  • Let (x,y) be the first edge in P with one endpoint in T_i and the other not in T_i
Correctness of Prim's Algorithm
[Figure: T_i and T_{i+1}, with edges (x,y) and (u,v) both leaving T_i]
• Let M' = (M - {(x,y)}) U {(u,v)}
• M' is also a spanning tree of G
• w(M') = w(M) - w(x,y) + w(u,v) ≤ w(M), since (u,v) is a min wt. outgoing edge and (x,y) is also an outgoing edge
• So M' is also an MST and T_{i+1} is a subtree of M'
Implementing Prim's Algorithm
• How do we find minimum weight outgoing edge?
• First cut: scan all adjacency lists at each iteration
  • Results in O(VE) time
• Try to do better
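The "first cut" can be sketched as below. This is an illustrative sketch only: the name `prim_naive` and the adjacency-list format (a dict mapping each node to a list of `(neighbor, weight)` pairs, with every undirected edge listed in both directions) are assumptions. Each of the |V| - 1 iterations rescans the adjacency lists of the tree nodes, which is where the O(VE) bound comes from.

```python
def prim_naive(adj, start):
    """Return the MST edges as (u, v, weight), chosen in Prim's order.

    adj maps each node to a list of (neighbor, weight) pairs
    (a representation assumed for this sketch).
    """
    in_tree = {start}                  # the set S
    tree = []                          # the set T
    while len(in_tree) < len(adj):
        best = None
        for u in in_tree:              # rescan adjacency lists: O(E) per iteration
            for v, w in adj[u]:
                if v not in in_tree and (best is None or w < best[0]):
                    best = (w, u, v)   # cheapest outgoing edge seen so far
        if best is None:
            raise ValueError("graph is not connected")
        w, u, v = best
        tree.append((u, v, w))
        in_tree.add(v)
    return tree
```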
Implementing Prim's Algorithm
• Idea: have each node not yet in the tree keep track of its best (cheapest) edge to the tree constructed so far
• To find min wt. outgoing edge, find minimum among these values
  • use a priority queue to store the best edge info (insert and extract-min operations)
Implementing Prim's Algorithm
[Figure: T_i and T_{i+1} with nodes u, v, w, x; labels: v's best edge to T_i, w's best edge to T_i, x's best edge to T_i, and "check if this edge is cheaper for w" on the edge from v to w]
• When a node v is added to T, some other nodes might have their best edges affected, but only neighbors of v
  • add decrease-key operation to the priority queue
Details on Prim's Algorithm
Associate with each node v two fields:
• best-wt[v]: if v is not yet in the tree, then it holds the min. wt. of all edges from v to a node in the tree. Initially infinity.
• best-node[v]: if v is not yet in the tree, then it holds the name of the node u in the tree s.t. w(v,u) is v's best-wt. Initially nil.
Details on Prim's Algorithm
• input: G = (V,E,w)
• // initialization
• initialize priority queue Q to contain all nodes, using best-wt values as keys
• let v0 be any node in V
• decrease-key(Q,v0,0)
• // last line means change best-wt[v0] to 0 and adjust Q accordingly
Details on Prim's Algorithm
• while Q is not empty do
  • u := extract-min(Q) // node w/ smallest best-wt
  • if u is not v0 then add (u,best-node[u]) to T
  • for each neighbor v of u do
    • if v is in Q and w(u,v) < best-wt[v] then
      • best-node[v] := u
      • decrease-key(Q,v,w(u,v))
• return (V,T) // as MST of G
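A Python sketch of the loop above follows. Note one deliberate substitution: Python's `heapq` module has no decrease-key operation, so this sketch pushes a fresh entry whenever a node's best weight improves and skips stale entries when they are popped (often called lazy deletion). The heap can then hold up to O(E) entries, so the bound is O(E log E) = O(E log V) rather than an exact match for the slide's operation counts. The name `prim`, the adjacency-list format, and the requirement that node labels be mutually comparable (for tie-breaking in the heap) are assumptions.

```python
import heapq


def prim(adj, start):
    """Return MST edges as (best_node, node, weight), in the order added.

    adj maps each node to a list of (neighbor, weight) pairs
    (a representation assumed for this sketch).
    """
    best_wt = {v: float("inf") for v in adj}
    best_node = {v: None for v in adj}
    best_wt[start] = 0
    in_tree = set()
    tree = []
    heap = [(0, start)]
    while heap:
        wt, u = heapq.heappop(heap)        # plays the role of extract-min
        if u in in_tree:
            continue                       # stale entry: u was already extracted
        in_tree.add(u)
        if best_node[u] is not None:       # every node except the start node
            tree.append((best_node[u], u, wt))
        for v, w in adj[u]:
            if v not in in_tree and w < best_wt[v]:
                best_wt[v] = w
                best_node[v] = u
                heapq.heappush(heap, (w, v))   # stands in for decrease-key
    return tree
```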
Running Time of Prim's Algorithm
Depends on priority queue implementation. Let
• T_ins be time for insert
• T_dec be time for decrease-key
• T_ex be time for extract-min
Then we have
• |V| inserts and one decrease-key in the initialization: O(V T_ins + T_dec)
• |V| iterations of while
  • one extract-min per iteration: O(V T_ex) total
Running Time of Prim's Algorithm
• Each iteration of while includes a for loop
• Number of iterations of for loop varies, depending on how many neighbors the current node has
• Total number of iterations of for loop is O(E)
• Each iteration of for loop:
  • at most one decrease-key, so O(E T_dec) total
Running Time of Prim's Algorithm
• O(V(T_ins + T_ex) + E T_dec)
• If priority queue is implemented with a binary heap, then
  • T_ins = T_ex = T_dec = O(log V)
  • total time is O(E log V)
• (Think about how to implement decrease-key in O(log V) time.)
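One common answer to the closing question is to store, alongside the heap array, a map from each item to its current index, so decrease-key can find the item in O(1) and then sift it up in O(log V). The class below is a sketch of that idea under stated assumptions: the class name `IndexedMinHeap`, its method names, and the requirement that items be hashable are all choices made for this illustration, not something the slides specify.

```python
class IndexedMinHeap:
    """Binary min-heap with a position map to support decrease-key."""

    def __init__(self):
        self.heap = []   # list of [key, item] pairs in heap order
        self.pos = {}    # item -> current index in self.heap

    def __len__(self):
        return len(self.heap)

    def __contains__(self, item):
        return item in self.pos

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[parent][0] <= self.heap[i][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] < self.heap[smallest][0]:
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def insert(self, item, key):
        self.heap.append([key, item])
        self.pos[item] = len(self.heap) - 1
        self._sift_up(len(self.heap) - 1)

    def decrease_key(self, item, new_key):
        i = self.pos[item]               # O(1) lookup via the position map
        if new_key > self.heap[i][0]:
            raise ValueError("new key is larger than the current key")
        self.heap[i][0] = new_key
        self._sift_up(i)                 # O(log n): at most one root-ward path

    def extract_min(self):
        self._swap(0, len(self.heap) - 1)  # move the minimum to the end
        key, item = self.heap.pop()
        del self.pos[item]
        if self.heap:
            self._sift_down(0)
        return item, key
```

The `in` test provided by `__contains__` corresponds to the "if v is in Q" check in the pseudocode for Prim's algorithm.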
Shortest Paths in a Graph
• We already saw the Floyd-Warshall algorithm to compute all pairs of shortest paths
• Let's review two important single-source shortest path algorithms:
  • Dijkstra's algorithm
  • Bellman-Ford algorithm
Single Source Shortest Path Problem
• Given: directed or undirected graph G = (V,E,w) and source node s in V
• Find: For each t in V, a path in G from s to t with minimum weight
• Warning! Negative weights are a problem
[Figure: example graph with nodes s and t; edge labels 4 and 5]
Shortest Path Tree
• Result of a SSSP algorithm can be viewed as a tree rooted at the source
• Why not use breadth-first search?
• Works fine if all weights are the same:
  • weight of each path is (a multiple of) the number of edges in the path
• Doesn't work when weights are different
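For the equal-weights case, a breadth-first search sketch is shown below. The name `bfs_distances` and the adjacency format (a dict from each node to a list of neighbors) are assumptions for illustration. The function counts edges; multiplying each count by the common edge weight gives the path weight. The `parent` map it returns encodes the shortest path tree rooted at the source.

```python
from collections import deque


def bfs_distances(adj, s):
    """Return (dist, parent) for all nodes reachable from s.

    dist[v] is the number of edges on a shortest path from s to v;
    parent[v] is v's predecessor in the BFS tree (None for s).
    """
    dist = {s: 0}
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first visit is via a shortest path
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
    return dist, parent
```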
Dijkstra's SSSP Algorithm
• Assumes all edge weights are nonnegative
• Similar to Prim's MST algorithm
• Start with source node s and iteratively construct a tree rooted at s
• Each node keeps track of tree node that provides cheapest path from s (not just cheapest path from any tree node)
• At each iteration, include the node whose cheapest path from s is the overall cheapest
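Prim's vs. Dijkstra's
[Figure: the same graph with source s and edge weights 1, 4, 5, 6, shown twice: once with the tree built by Prim's MST algorithm and once with the tree built by Dijkstra's SSSP algorithm]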
Implementing Dijkstra's Alg.
• How can each node u keep track of its best path from s?
• Keep an estimate, d[u], of shortest path distance from s to u
• Use d as a key in a priority queue
• When u is added to the tree, check each of u's neighbors v to see if u provides v with a cheaper path from s:
  • compare d[v] to d[u] + w(u,v)
Dijkstra's Algorithm
• input: G = (V,E,w) and source node s
• // initialization
• d[s] := 0
• d[v] := infinity for all other nodes v
• initialize priority queue Q to contain all nodes using d values as keys
Dijkstra's Algorithm
• while Q is not empty do
  • u := extract-min(Q)
  • for each neighbor v of u do
    • if d[u] + w(u,v) < d[v] then
      • d[v] := d[u] + w(u,v)
      • decrease-key(Q,v,d[v])
      • parent(v) := u
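The loop above can be sketched in Python as follows. As with any `heapq`-based version, decrease-key is replaced by pushing a new entry and skipping stale ones on extraction, which is a substitution and not what the pseudocode literally does. The name `dijkstra`, the adjacency-list format, and the requirement that node labels be comparable (for heap tie-breaking) are assumptions. Edge weights must be nonnegative.

```python
import heapq


def dijkstra(adj, s):
    """Return (d, parent) for source s.

    adj maps every node to a list of (neighbor, weight) pairs with
    nonnegative weights (a representation assumed for this sketch).
    """
    d = {v: float("inf") for v in adj}
    parent = {v: None for v in adj}
    d[s] = 0
    done = set()                           # nodes already extracted from Q
    heap = [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)        # plays the role of extract-min
        if u in done:
            continue                       # stale entry left by an earlier push
        done.add(u)
        for v, w in adj[u]:
            if du + w < d[v]:              # u gives v a cheaper path from s
                d[v] = du + w
                parent[v] = u
                heapq.heappush(heap, (d[v], v))  # stands in for decrease-key
    return d, parent
```

Following `parent` pointers back from any node to `s` recovers the shortest path tree.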
Dijkstra's Algorithm Example
[Figure: example weighted graph on nodes a through e, shown at iteration 2; source is node a]
Correctness of Dijkstra's Alg.
• Let T_i be the tree constructed after i-th iteration of while loop:
  • nodes not in Q
  • edges indicated by parent variables
• Show by induction on i that the path in T_i from s to u is a shortest path and has distance d[u], for all u in T_i
• Basis: i = 1. s is the only node in T_1 and d[s] = 0
Correctness of Dijkstra's Alg.
[Figure: T_{i-1} inside T_i; s and x are in T_{i-1}, and u is joined to x]
• Induction: Assume T_{i-1} is a correct shortest path tree and show for T_i
• Let u be the node added in iteration i
• Let x = parent(u). Need to show path in T_i from s to u is a shortest path, and has distance d[u]
Correctness of Dijkstra's Alg.
[Figure: T_{i-1} inside T_i, with nodes s, x, u, a, b and the two paths P and P']
• P: path in T_i from s to u
• P': another path from s to u
• (a,b) is the first edge in P' that leaves T_{i-1} (i.e., a is in T_{i-1} but b is not)
Correctness of Dijkstra's Alg.
[Figure: T_{i-1} inside T_i; P runs from s through x to u; P' leaves T_{i-1} along edge (a,b)]
• Let P1 be the part of P' before (a,b)
• Let P2 be the part of P' after (a,b)
• w(P') = w(P1) + w(a,b) + w(P2)
  ≥ w(P1) + w(a,b)   (nonneg wts)
  ≥ (wt of path in T_{i-1} from s to a) + w(a,b)   (inductive hypothesis)
  ≥ (wt of path in T_{i-1} from s to x) + w(x,u)   (alg chose u in iteration i and d-values are accurate, by inductive hyp.)
  = w(P)
• So P is a shortest path, and d[u] is accurate after iteration i
Running Time of Dijkstra's Alg.
• initialization: insert each node once
  • O(V T_ins)
• O(V) iterations of while loop
  • one extract-min per iteration => O(V T_ex)
  • for loop inside while loop has variable number of iterations…
• For loop has O(E) iterations total
  • at most one decrease-key per iteration => O(E T_dec)
• Total is O(V (T_ins + T_ex) + E T_dec)
Using Fancier Heap Implementations
• O(V(T_ins + T_ex) + E T_dec)
• If priority queue is implemented with a binary heap, then
  • T_ins = T_ex = T_dec = O(log V)
  • total time is O(E log V)
• There are fancier implementations of the priority queue, such as Fibonacci heap:
  • T_ins = O(1), T_ex = O(log V), T_dec = O(1) (amortized)
  • total time is O(V log V + E)
Using Simpler Heap Implementations
• O(V(T_ins + T_ex) + E T_dec)
• If graph is dense, so that |E| = Θ(V^2), then it doesn't help to make T_ins and T_ex smaller than O(V): the E T_dec term is already at least V^2
• Instead, focus on making T_dec small, say constant
• Implement priority queue with an unsorted array:
  • T_ins = O(1), T_ex = O(V), T_dec = O(1)
  • total is O(V^2)
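The unsorted-array variant can be sketched as below. The name `dijkstra_dense`, the 0-based integer node labels, and the adjacency-matrix format (`None` for a missing edge) are assumptions chosen to suit dense graphs. Extract-min is a linear scan and decrease-key is a plain array write, giving O(V^2) overall.

```python
def dijkstra_dense(w, s):
    """Return the list d of shortest-path distances from node s.

    w is a V x V matrix: w[u][v] is the nonnegative weight of edge (u,v),
    or None if there is no such edge (a representation assumed here).
    """
    n = len(w)
    inf = float("inf")
    d = [inf] * n
    d[s] = 0
    in_queue = [True] * n                  # the "priority queue" is just this array
    for _ in range(n):
        # extract-min: linear scan over nodes still in the queue, O(V)
        u = -1
        for v in range(n):
            if in_queue[v] and (u == -1 or d[v] < d[u]):
                u = v
        in_queue[u] = False
        if d[u] == inf:
            break                          # all remaining nodes are unreachable
        for v in range(n):
            if w[u][v] is not None and d[u] + w[u][v] < d[v]:
                d[v] = d[u] + w[u][v]      # decrease-key: one array write, O(1)
    return d
```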