CPSC 311 Analysis of Algorithms: More Graph Algorithms. Prof. Jennifer Welch, Fall 2009
Minimum Spanning Tree [Figure: example weighted graph with edge weights 2 through 18] • Find a subset of edges that spans all the nodes, creates no cycle, and minimizes the sum of weights CPSC 311, Fall 2009
Facts About MSTs • There can be many spanning trees of a graph • In fact, there can be many minimum spanning trees of a graph • But if every edge has a unique weight, then there is a unique MST
Uniqueness of MST • Suppose, for contradiction, that there are 2 distinct MSTs, M1 and M2. • Let e be the edge with minimum weight that is in one but not the other (say it is in M1). • If e is added to M2, a cycle is formed. • Let e' be an edge in the cycle that is not in M1 (one exists, since M1 contains no cycle).
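Uniqueness of MST [Figure: M2 with e added, forming a cycle that contains e'] • e: in M1 but not M2 • e': in M2 but not M1; its wt is greater than wt of e, since weights are unique and e is the min wt. edge that is in exactly one of the two trees • Replacing e' with e in M2 creates a new spanning tree M3 whose weight is less than that of M2: contradiction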
Generic MST Algorithm • input: weighted undirected graph G = (V,E,w) • T := empty set • while T is not yet a spanning tree of G • find an edge e in E s.t. T U {e} is a subgraph of some MST of G • add e to T • return T (as MST of G)
Kruskal's MST algorithm [Figure: the example weighted graph with edge weights 2 through 18] • Consider the edges in increasing order of weight; add an edge iff it does not cause a cycle
Kruskal's Algorithm as a Special Case of Generic Alg. • Consider edges in increasing order of weight • Add the next edge iff it doesn't cause a cycle • At any point, T is a forest (set of trees); eventually T is a single tree
Kruskal's Algorithm • input: G = (V,E,w) // weighted graph • T := Ø // subset of E • sort E by increasing weights • for each (u,v) in E in sorted order do • if T U {(u,v)} has no cycle then T := T U {(u,v)} • return (V,T)
Why is Kruskal's Greedy? • Algorithm manages a set of edges s.t. • these edges are a subset of some MST • At each iteration: • choose an edge so that the MST-subset property remains true • subproblem left is to do the same with the remaining edges • Always try to add cheapest available edge that will not violate the tree property • locally optimal choice
Correctness of Kruskal's Alg. • Let e_1, e_2, …, e_{n-1} be the sequence of edges chosen • Clearly they form a spanning tree • Suppose it is not minimum weight • Let e_i be the edge where the algorithm goes wrong • {e_1,…,e_{i-1}} is part of some MST M • but {e_1,…,e_i} is not part of any MST
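Correctness of Kruskal's Alg. [Figure: MST M with e_i added] • White edges are part of MST M, which contains e_1 to e_{i-1}, but not e_i • Adding e_i to M forms a cycle • e*: min wt. edge in the cycle that is not among e_1 to e_{i-1} • wt(e*) > wt(e_i): otherwise the algorithm would have considered e* before e_i and added it, since e* forms no cycle with e_1 to e_{i-1} (all of them lie in M) • Replacing e* w/ e_i forms a spanning tree with smaller weight than M, contradiction!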
Note on Correctness Proof • Argument on previous slide works for case when every edge has a unique weight • Algorithm also works when edge weights are not necessarily unique • Modify proof on previous slide: now wt(e*) ≥ wt(e_i), so the swap yields a spanning tree of weight at most that of M, i.e., an MST containing e_1 to e_i; the contradiction is with the assumption that {e_1,…,e_i} is not part of any MST
Implementing Kruskal's Alg. • Sort edges by weight • efficient algorithms known • How to test quickly if adding in the next edge would cause a cycle? • use Disjoint Sets data structure…
Disjoint Sets in Kruskal's Alg. • Use Disjoint Sets data structure to test whether adding an edge to a set of edges would cause a cycle • Each set represents nodes that are connected using edges chosen so far • Initially put each node in a set by itself using Make-Set • When testing edge (u,v), call Find-Set to check if u and v are in the same set • if so, then a cycle would result if (u,v) is chosen • When edge (u,v) is chosen, call Union(u,v)
Kruskal's Algorithm #2 • input: G = (V,E,w) // weighted graph • T := Ø // subset of E • sort E by increasing weights • for each v in V do MakeSet(v) • for each (u,v) in E in sorted order do • a := FindSet(u) • b := FindSet(v) • if a ≠ b then • T := T U {(u,v)} • Union(a,b) • return (V,T)
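The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function name `kruskal`, the `(weight, u, v)` edge format, and the numbering of nodes as 0..n-1 are assumptions made for the example. The disjoint sets are kept in two arrays, using union by rank and path compression.

```python
def kruskal(num_nodes, edges):
    """Return (total_weight, tree_edges) for a connected undirected graph.

    Assumed input format (not from the slides): edges is a list of
    (weight, u, v) tuples, with nodes numbered 0..num_nodes-1.
    """
    parent = list(range(num_nodes))  # MakeSet(v) for every node v
    rank = [0] * num_nodes

    def find_set(x):
        # Locate the root of x's set, then compress the path to it.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        # a and b must be distinct roots; attach the lower-rank root.
        if rank[a] < rank[b]:
            a, b = b, a
        parent[b] = a
        if rank[a] == rank[b]:
            rank[a] += 1

    tree = []
    total = 0
    for w, u, v in sorted(edges):      # increasing order of weight
        a, b = find_set(u), find_set(v)
        if a != b:                     # (u,v) joins two different trees
            tree.append((u, v, w))
            total += w
            union(a, b)
    return total, tree
```

For example, on a 4-node graph with edges of weight 1, 2, 3, 4, 5, the edges of weight 3 and 5 are skipped because each would close a cycle.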
Running Time of Kruskal's MST Algorithm • Sorting the edges takes O(E log E) time • The rest of the time is proportional to the total time taken by all the calls to MakeSet, FindSet, and Union. • Number of calls to MakeSet is |V| • If we unravel the for loop, we can see: • total number of calls to FindSet is O(E) • total number of calls to Union is O(V) • So number of operations is O(V+E), with |V| MakeSet ops. • If the Disjoint Sets data structure is implemented using union by rank and path compression, the time for all the Disjoint Sets operations is O((V+E) log* V) • Total time is O(E log E), since graph is connected, so V ≤ E + 1, and this equals O(E log V), since E is O(V^2) and log V^2 = 2 log V
Another Greedy MST Alg. • Kruskal's algorithm maintains a forest that grows until it forms a spanning tree • Alternative idea is to keep just one tree and grow it until it spans all the nodes • Prim's algorithm • At each iteration, choose the minimum weight outgoing edge to add • greedy!
Idea of Prim's Algorithm • Instead of growing the MST as possibly multiple trees that eventually all merge, grow the MST from a single node, so that there is only one tree at any point. • Also a special case of the generic algorithm: at each step, add the minimum weight edge that goes out from the tree constructed so far.
Prim's Algorithm • input: weighted undirected graph G = (V,E,w) • T := empty set • S := {any node in V} • while |T| < |V| - 1 do • let (u,v) be a min wt. outgoing edge (u in S, v not in S) • add (u,v) to T • add v to S • return (S,T) (as MST of G)
Prim's Algorithm Example [Figure: weighted undirected graph on nodes a through i, with edge weights between 1 and 14]
Correctness of Prim's Algorithm • Let T_i be the tree represented by (S,T) at the end of iteration i. • Show by induction on i that T_i is a subtree of some MST of G. • Basis: i = 0 (before first iteration). T_0 contains just a single node, and thus is a subtree of every MST of G.
Correctness of Prim's Algorithm • Induction: Assume T_i is a subtree of some MST M. We must show T_{i+1} is a subtree of some MST. • Let (u,v) be the edge added in iteration i+1. • Case 1: (u,v) is in M. Then T_{i+1} is also a subtree of M. [Figure: T_i extended to T_{i+1} by edge (u,v)]
Correctness of Prim's Algorithm • Case 2: (u,v) is not in M. • There is a path P in M from u to v, since M spans G. • Let (x,y) be the first edge in P with one endpoint in T_i and the other not in T_i. [Figure: T_i with edge (u,v) and path P leaving T_i along edge (x,y)]
Correctness of Prim's Algorithm • Let M' = (M - {(x,y)}) U {(u,v)} • M' is also a spanning tree of G. • w(M') = w(M) - w(x,y) + w(u,v) ≤ w(M), since (u,v) is a min wt. outgoing edge of T_i and (x,y) is also an outgoing edge of T_i • So M' is also an MST and T_{i+1} is a subtree of M' [Figure: T_i and T_{i+1} with edges (u,v) and (x,y)]
Implementing Prim's Algorithm • How do we find minimum weight outgoing edge? • First cut: scan all adjacency lists at each iteration. • Results in O(VE) time. • Try to do better.
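The "first cut" can be sketched as below, as a baseline before any priority queue is introduced. The function name `prim_scan` and the adjacency-list format (a dict mapping each node to a list of `(neighbor, weight)` pairs) are assumptions made for this example, and the graph is assumed to be connected.

```python
def prim_scan(adj, start):
    """First-cut Prim: rescan adjacency lists on every iteration, O(VE).

    Assumed input format (not from the slides): adj maps each node to a
    list of (neighbor, weight) pairs; the graph must be connected.
    """
    in_tree = {start}            # the set S
    tree = []                    # the edge set T
    while len(in_tree) < len(adj):
        best = None              # cheapest outgoing edge found so far
        for u in in_tree:
            for v, w in adj[u]:
                if v not in in_tree and (best is None or w < best[0]):
                    best = (w, u, v)
        w, u, v = best
        tree.append((u, v, w))
        in_tree.add(v)
    return tree
```

Each of the |V| - 1 iterations may look at every adjacency list, which is where the O(VE) bound comes from.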
Implementing Prim's Algorithm • Idea: have each node not yet in the tree keep track of its best (cheapest) edge to the tree constructed so far. • To find min wt. outgoing edge, find minimum among these values • use a priority queue to store the best edge info (insert and extract-min operations)
Implementing Prim's Algorithm • When a node v is added to T, some other nodes might have their best edges affected, but only neighbors of v • add decrease-key operation to the priority queue [Figure: T_i and T_{i+1} with nodes u, v, w, x. Labels: v's best edge to T_i; w's best edge to T_i; x's best edge to T_i; one edge is marked "check if this edge is cheaper for w"]
Details on Prim's Algorithm Associate with each node v two fields: • best-wt[v] : if v is not yet in the tree, then it holds the min. wt. of all edges from v to a node in the tree. Initially infinity. • best-node[v] : if v is not yet in the tree, then it holds the name of the node u in the tree s.t. w(v,u) is v's best-wt. Initially nil.
Details on Prim's Algorithm • input: G = (V,E,w) // initialization • initialize priority queue Q to contain all nodes, using best-wt values as keys • let v0 be any node in V • decrease-key(Q,v0,0) // last line means change best-wt[v0] to 0 and adjust Q accordingly
Details on Prim's Algorithm • while Q is not empty do • u := extract-min(Q) // node w/ smallest best-wt • if u is not v0 then add (u,best-node[u]) to T • for each neighbor v of u do • if v is in Q and w(u,v) < best-wt[v] then • best-node[v] := u • decrease-key(Q,v,w(u,v)) • return (V,T) // as MST of G
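A possible Python rendering of the two slides above is sketched here. Python's standard `heapq` module has no decrease-key operation, so this sketch swaps in a different technique: it pushes a fresh entry whenever best-wt improves and skips stale entries when they are popped ("lazy deletion"). That is a substitution, not the decrease-key scheme in the pseudocode. The function name `prim` and the adjacency-list format are assumptions for the example.

```python
import heapq


def prim(adj, start):
    """Prim's algorithm with a binary heap and lazy deletion.

    Assumed input format (not from the slides): adj maps each node to a
    list of (neighbor, weight) pairs. Nodes must be mutually comparable,
    because heap entries are (weight, node) tuples.
    """
    best_wt = {v: float("inf") for v in adj}
    best_node = {v: None for v in adj}
    best_wt[start] = 0
    in_tree = set()
    tree = []
    heap = [(0, start)]
    while heap:
        wt, u = heapq.heappop(heap)
        if u in in_tree:
            continue                 # stale entry left by an earlier push
        in_tree.add(u)
        if best_node[u] is not None:  # every node except the start node
            tree.append((best_node[u], u, wt))
        for v, w in adj[u]:
            if v not in in_tree and w < best_wt[v]:
                best_wt[v] = w
                best_node[v] = u
                # Stands in for decrease-key(Q, v, w(u,v)).
                heapq.heappush(heap, (w, v))
    return tree
```

With lazy deletion the heap can hold up to O(E) entries, so each operation costs O(log E), which is still O(log V) because E is O(V^2).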
Running Time of Prim's Algorithm Depends on priority queue implementation. Let • T_ins be time for insert • T_dec be time for decrease-key • T_ex be time for extract-min Then we have • |V| inserts and one decrease-key in the initialization: O(V T_ins + T_dec) • |V| iterations of while • one extract-min per iteration: O(V T_ex) total
Running Time of Prim's Algorithm • Each iteration of while includes a for loop. • Number of iterations of for loop varies, depending on how many neighbors the current node has • Total number of iterations of for loop is O(E). • Each iteration of for loop: • at most one decrease-key, so O(E T_dec) total
Running Time of Prim's Algorithm • O(V(T_ins + T_ex) + E T_dec) • If priority queue is implemented with a binary heap, then • T_ins = T_ex = T_dec = O(log V) • total time is O(E log V) • (Think about how to implement decrease-key in O(log V) time.)
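One possible answer to the decrease-key question is sketched below: keep a map from each item to its current index in the heap array, so that decrease-key can find the item in O(1) and then sift it up in O(log V). The class name `IndexedMinHeap` and its method names are hypothetical, chosen for this example; items are assumed to be hashable and distinct.

```python
class IndexedMinHeap:
    """Binary min-heap with a position map to support decrease-key."""

    def __init__(self):
        self.heap = []   # list of [key, item] pairs in heap order
        self.pos = {}    # item -> index of its pair in self.heap

    def __len__(self):
        return len(self.heap)

    def __contains__(self, item):
        return item in self.pos

    def _swap(self, i, j):
        h = self.heap
        h[i], h[j] = h[j], h[i]
        self.pos[h[i][1]] = i
        self.pos[h[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[i][0] >= self.heap[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] < self.heap[smallest][0]:
                    smallest = child
            if smallest == i:
                break
            self._swap(i, smallest)
            i = smallest

    def insert(self, item, key):
        self.heap.append([key, item])
        self.pos[item] = len(self.heap) - 1
        self._sift_up(len(self.heap) - 1)

    def extract_min(self):
        # Assumes the heap is non-empty.
        self._swap(0, len(self.heap) - 1)
        key, item = self.heap.pop()
        del self.pos[item]
        if self.heap:
            self._sift_down(0)
        return item, key

    def decrease_key(self, item, new_key):
        i = self.pos[item]               # O(1) lookup via the position map
        assert new_key <= self.heap[i][0]
        self.heap[i][0] = new_key
        self._sift_up(i)                 # O(log V): key only got smaller
```

The `__contains__` method corresponds to the "v is in Q" test in the pseudocode for Prim's algorithm.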
Shortest Paths in a Graph • We already saw the Floyd-Warshall algorithm to compute all pairs of shortest paths • Let's review two important single-source shortest path algorithms: • Dijkstra's algorithm • Bellman-Ford algorithm
Single Source Shortest Path Problem • Given: directed or undirected graph G = (V,E,w) and source node s in V • Find: For each t in V, a path in G from s to t with minimum weight • Warning! Negative weights are a problem [Figure: example graph with nodes s and t; edge labels 4 and 5]
Shortest Path Tree • Result of a SSSP algorithm can be viewed as a tree rooted at the source • Why not use breadth-first search? • Works fine if all weights are the same: • weight of each path is (a multiple of) the number of edges in the path • Doesn't work when weights are different
Dijkstra's SSSP Algorithm • Assumes all edge weights are nonnegative • Similar to Prim's MST algorithm • Start with source node s and iteratively construct a tree rooted at s • Each node keeps track of tree node that provides cheapest path from s (not just cheapest path from any tree node) • At each iteration, include the node whose cheapest path from s is the overall cheapest
Prim's vs. Dijkstra's [Figure: the same graph with source s and edge weights 1, 4, 5, 6, shown twice: once with Prim's MST and once with Dijkstra's SSSP tree]
Implementing Dijkstra's Alg. • How can each node u keep track of its best path from s? • Keep an estimate, d[u], of shortest path distance from s to u • Use d as a key in a priority queue • When u is added to the tree, check each of u's neighbors v to see if u provides v with a cheaper path from s: • compare d[v] to d[u] + w(u,v)
Dijkstra's Algorithm • input: G = (V,E,w) and source node s // initialization • d[s] := 0 • d[v] := infinity for all other nodes v • initialize priority queue Q to contain all nodes using d values as keys
Dijkstra's Algorithm • while Q is not empty do • u := extract-min(Q) • for each neighbor v of u do • if d[u] + w(u,v) < d[v] then • d[v] := d[u] + w(u,v) • decrease-key(Q,v,d[v]) • parent(v) := u
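The two slides above can be sketched in Python as follows. As with any `heapq`-based version, there is no decrease-key in the standard library, so this sketch substitutes lazy deletion: it pushes a new `(d[v], v)` entry on every improvement and ignores entries for nodes that are already finished. The function name `dijkstra` and the adjacency-list format are assumptions for the example; all weights are assumed nonnegative.

```python
import heapq


def dijkstra(adj, source):
    """Single-source shortest paths for nonnegative edge weights.

    Assumed input format (not from the slides): adj maps each node to a
    list of (neighbor, weight) pairs. Nodes must be mutually comparable,
    because heap entries are (distance, node) tuples.
    """
    d = {v: float("inf") for v in adj}
    parent = {v: None for v in adj}
    d[source] = 0
    done = set()                     # nodes already extracted
    heap = [(0, source)]
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue                 # stale entry; u was finalized earlier
        done.add(u)
        for v, w in adj[u]:
            if du + w < d[v]:
                d[v] = du + w
                parent[v] = u
                # Stands in for decrease-key(Q, v, d[v]).
                heapq.heappush(heap, (d[v], v))
    return d, parent
```

Following the `parent` entries back from any reachable node to the source traces out that node's path in the shortest path tree.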
Dijkstra's Algorithm Example [Figure: weighted graph on nodes a through e, shown at iteration 2; source is node a]
Correctness of Dijkstra's Alg. • Let T_i be the tree constructed after i-th iteration of while loop: • nodes not in Q • edges indicated by parent variables • Show by induction on i that the path in T_i from s to u is a shortest path and has distance d[u], for all u in T_i. • Basis: i = 1. s is the only node in T_1 and d[s] = 0.
Correctness of Dijkstra's Alg. • Induction: Assume T_{i-1} is a correct shortest path tree and show for T_i. • Let u be the node added in iteration i. • Let x = parent(u). • Need to show path in T_i from s to u is a shortest path, and has distance d[u] [Figure: T_{i-1} and T_i, with nodes s, x, u]
Correctness of Dijkstra's Alg. [Figure: T_{i-1} and T_i, with nodes s, x, u, a, b] • P: path in T_i from s to u • P': another path from s to u • (a,b) is the first edge in P' that leaves T_{i-1} (i.e., a is in T_{i-1} but b is not)
Correctness of Dijkstra's Alg. • Let P1 be the part of P' before (a,b). • Let P2 be the part of P' after (a,b). • w(P') = w(P1) + w(a,b) + w(P2) • ≥ w(P1) + w(a,b) (nonneg wts) • ≥ (wt of path in T_{i-1} from s to a) + w(a,b) (inductive hypothesis) • ≥ (wt of path in T_{i-1} from s to x) + w(x,u) (alg chose u in iteration i and d-values are accurate, by inductive hyp.) • = w(P). • So P is a shortest path, and d[u] is accurate after iteration i. [Figure: T_{i-1} and T_i, with paths P and P' from s to u and nodes x, a, b]
Running Time of Dijkstra's Alg. • initialization: insert each node once • O(V T_ins) • O(V) iterations of while loop • one extract-min per iteration => O(V T_ex) • for loop inside while loop has variable number of iterations… • For loop has O(E) iterations total • at most one decrease-key per iteration => O(E T_dec) • Total is O(V (T_ins + T_ex) + E T_dec)
Using Fancier Heap Implementations • O(V(T_ins + T_ex) + E T_dec) • If priority queue is implemented with a binary heap, then • T_ins = T_ex = T_dec = O(log V) • total time is O(E log V) • There are fancier implementations of the priority queue, such as Fibonacci heap: • T_ins = O(1), T_ex = O(log V), T_dec = O(1) (amortized) • total time is O(V log V + E)
Using Simpler Heap Implementations • O(V(T_ins + T_ex) + E T_dec) • If graph is dense, so that |E| = Θ(V^2), then it doesn't help to make T_ins and T_ex smaller than O(V), since the E T_dec term is already Ω(V^2). • Instead, focus on making T_dec small, say constant. • Implement priority queue with an unsorted array: • T_ins = O(1), T_ex = O(V), T_dec = O(1) • total is O(V^2)
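The unsorted-array variant can be sketched as below for a dense graph stored as a weight matrix. The function name `dijkstra_dense` and the matrix format (with `None` marking a missing edge) are assumptions for this example; weights are assumed nonnegative. Here extract-min is a linear scan and decrease-key is a plain array write.

```python
def dijkstra_dense(w, source):
    """Dijkstra's algorithm with an unsorted array as the queue, O(V^2).

    Assumed input format (not from the slides): w is an n x n matrix in
    which w[u][v] is the weight of edge (u,v), or None if there is none.
    """
    n = len(w)
    inf = float("inf")
    d = [inf] * n
    parent = [None] * n
    d[source] = 0
    in_queue = [True] * n
    for _ in range(n):
        # extract-min by linear scan over the unsorted array: O(V)
        u = None
        for v in range(n):
            if in_queue[v] and (u is None or d[v] < d[u]):
                u = v
        in_queue[u] = False
        if d[u] == inf:
            break                    # remaining nodes are unreachable
        for v in range(n):
            if w[u][v] is not None and d[u] + w[u][v] < d[v]:
                d[v] = d[u] + w[u][v]   # decrease-key is an O(1) write
                parent[v] = u
    return d, parent
```

There are |V| scans of O(V) each plus O(V^2) matrix entries examined in total, which matches the O(V^2) bound on the slide.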