Algorithm Design and Analysis (ADA)

Algorithm Design and Analysis (ADA) 242-535, Semester 1 2013-2014
10. Minimum Spanning Trees (MSTs)
• Objective
  • look at two algorithms for finding minimum spanning trees (MSTs) over graphs
  • Prim's algorithm, Kruskal's algorithm

Overview
• Minimum Spanning Tree
• Prim's Algorithm
• Kruskal's Algorithm
• Difference between Prim and Kruskal

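1. Minimum Spanning Tree
• A minimum spanning tree T is a subgraph of a connected, weighted graph G that is a tree (connected and acyclic), contains all the vertices of G, and whose edges have the minimum possible summed weight.
• Example weighted graph G (vertices A to F; the edges and weights are those enumerated in the Prim's algorithm walk-through in section 2):
  edge   weight
  (A,B)  4
  (A,C)  2
  (A,E)  3
  (C,D)  1
  (C,E)  6
  (C,F)  3
  (D,B)  5
  (D,F)  6
  (E,F)  2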

• A minimum spanning tree of G (weight = 12):
  [Figure: G with the edges of a minimum spanning tree highlighted]
• A non-minimum spanning tree of G (weight = 20):
  [Figure: G with the edges of a heavier spanning tree highlighted]
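The definition can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes the vertices A to F are numbered 0 to 5, and the class and method names (`SpanningTreeCheck`, `spanningTreeWeight`) are hypothetical. It relies on the fact that |V|-1 edges with no cycle among them must form a spanning tree.

```java
class SpanningTreeCheck {
    // Follow parent links up to the representative of x's component.
    static int find(int[] parent, int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Each edge is {u, v, weight}. Returns the summed weight if the edges
    // form a spanning tree on n vertices, or -1 if they do not.
    static int spanningTreeWeight(int n, int[][] edges) {
        if (edges.length != n - 1) return -1;   // a spanning tree has exactly n-1 edges
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        int total = 0;
        for (int[] e : edges) {
            int a = find(parent, e[0]);
            int b = find(parent, e[1]);
            if (a == b) return -1;              // both ends already joined: e closes a cycle
            parent[a] = b;                      // merge the two components
            total += e[2];
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed numbering: A=0, B=1, C=2, D=3, E=4, F=5.
        // Edges (A,C) 2, (C,D) 1, (A,E) 3, (E,F) 2, (A,B) 4.
        int[][] tree = { {0,2,2}, {2,3,1}, {0,4,3}, {4,5,2}, {0,1,4} };
        System.out.println(spanningTreeWeight(6, tree));   // prints 12
    }
}
```

The check only confirms that an edge set is a spanning tree and reports its weight; it does not show that the weight is minimal, which is the job of the algorithms in sections 2 and 3.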

Typical MST Applications

MST Optimality
• MSTs have the optimal substructure property: an optimal tree is composed of optimal subtrees.
• This can be seen by considering how to break an MST into parts:
  • Let T be an MST of G with an edge (u,v)
  • Removing (u,v) splits T into two trees T1 and T2
  • Claim: T1 is an MST of G1 = (V1,E1), and T2 is an MST of G2 = (V2,E2), where V1 and V2 are the vertex sets of T1 and T2, and E1 and E2 are the edges of G with both ends in V1 and in V2 respectively
  • V1 and V2 do not share any vertices
  • Proof: w(T) = w(u,v) + w(T1) + w(T2)
  • There can't be a lighter spanning tree T1' of G1: replacing T1 by T1' would give a spanning tree of G lighter than T, so T would be suboptimal. The same argument applies to T2.

Optimality and Coding
• Both dynamic programming (DP) and greedy algorithms can be used on problems that exhibit optimal substructure.
• What's the difference between the two approaches?
• DP: use when the problem has optimal substructure and there are repeating (overlapping) sub-problems
• Greedy: use when the problem has optimal substructure, and each sub-problem can be solved without combining/examining smaller sub-sub-problems
  • this is called the greedy-choice property

• DP solves sub-problems bottom-up since the current sub-problem may depend on sub-sub-problems.
• Greedy algorithms usually execute top-down, since a sub-problem can be solved without using sub-sub-problem solutions.
• In a greedy algorithm, we make whatever choice seems best at the moment and then solve any other sub-problems arising after the choice is made.

DP, Greed and MST
• Since the MST problem has optimal substructure, it is possible to use dynamic programming (DP) techniques
  • e.g. memoization, bottom-up execution
• In fact, both the MST algorithms in this part are greedy
  • at each iteration the choice of how to grow the MST only depends on the current state of the MST, not earlier sub-states or simpler versions

2. Prim's Algorithm
• Prim's algorithm finds a minimum spanning tree T by iteratively adding edges to T.
• At each iteration, a minimum-weight edge is added that does not create a cycle in the current T.
• The new edge must be connected to a vertex which is already in T.
• The algorithm can stop after |V|-1 edges have been added to T.

Simple Pseudocode
Tree prim(Graph G, int numVerts)
{
  Tree T = anyVert(G);
  for i = 1 to numVerts-1 {
    Edge e = select an edge of minimum weight that is connected to a vertex already in T and does not form a cycle in T;
    T = T + e;
  }
  return T;
}

Informal Algorithm
• For the graph G.
• 1) Add any vertex to T
  • e.g. A, T = {A}
• 2) Examine all the edges leaving {A} and add the edge with the smallest weight.
  edge   weight
  (A,B)  4
  (A,C)  2
  (A,E)  3
  • add edge (A,C), T becomes {A,C}

• 3) Examine all the edges leaving {A,C} and add the edge with the smallest weight.
  edge   weight
  (A,B)  4
  (A,E)  3
  (C,D)  1
  (C,E)  6
  (C,F)  3
  • add edge (C,D), T becomes {A,C,D}

• 4) Examine all the edges leaving {A,C,D} and add the edge with the smallest weight.
  edge   weight
  (A,B)  4
  (A,E)  3
  (C,E)  6
  (C,F)  3
  (D,B)  5
  (D,F)  6
  • (A,E) and (C,F) are tied at weight 3; it does not matter which is chosen
  • add edge (A,E), T becomes {A,C,D,E}

• 5) Examine all the edges leaving {A,C,D,E} and add the edge with the smallest weight.
  edge   weight
  (A,B)  4
  (C,F)  3
  (D,B)  5
  (D,F)  6
  (E,F)  2
  • add edge (E,F), T becomes {A,C,D,E,F}

• 6) Examine all the edges leaving {A,C,D,E,F} and add the edge with the smallest weight.
  edge   weight
  (A,B)  4
  (D,B)  5
  • add edge (A,B), T becomes {A,B,C,D,E,F}
• All the vertices of G are now in T, so we stop.

Resulting minimum spanning tree (weight = 12):
  edge   weight
  (A,C)  2
  (C,D)  1
  (A,E)  3
  (E,F)  2
  (A,B)  4
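The walk-through can be reproduced with a direct translation of the simple pseudocode. This is only a sketch under stated assumptions: vertices A to F are numbered 0 to 5, edges are stored as `{u, v, weight}` triples, and the names `SimplePrim`, `prim`, `weight` and `exampleGraph` are hypothetical. It scans every edge at each iteration, so it is easy to follow but slower than the priority-queue version.

```java
import java.util.ArrayList;
import java.util.List;

class SimplePrim {
    // Each edge is {u, v, weight}. Returns the tree edges in the order they were added.
    static List<int[]> prim(int numVerts, int[][] edges, int start) {
        boolean[] inTree = new boolean[numVerts];
        inTree[start] = true;                       // T starts as a single vertex
        List<int[]> tree = new ArrayList<>();
        for (int i = 1; i <= numVerts - 1; i++) {
            int[] best = null;
            for (int[] e : edges) {
                // Exactly one end in T: the edge touches T and cannot form a cycle.
                boolean crossing = inTree[e[0]] != inTree[e[1]];
                if (crossing && (best == null || e[2] < best[2])) best = e;
            }
            if (best == null) break;                // no crossing edge: G is not connected
            inTree[best[0]] = true;
            inTree[best[1]] = true;
            tree.add(best);
        }
        return tree;
    }

    static int weight(List<int[]> tree) {
        int total = 0;
        for (int[] e : tree) total += e[2];
        return total;
    }

    // The example graph G, with the assumed numbering A=0, B=1, C=2, D=3, E=4, F=5.
    static int[][] exampleGraph() {
        return new int[][] {
            {0,1,4}, {0,2,2}, {0,4,3}, {2,3,1}, {2,4,6},
            {2,5,3}, {3,1,5}, {3,5,6}, {4,5,2}
        };
    }

    public static void main(String[] args) {
        List<int[]> tree = prim(6, exampleGraph(), 0);   // start from A
        System.out.println(weight(tree));                // prints 12
    }
}
```

Ties are broken by the order of the edge array, so with this ordering step 4 picks (A,E) rather than (C,F), as in the walk-through.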

Prim's Algorithm Graphically
• Build up the MST (black tree) by repeatedly adding the minimum crossing edge (thick red line) to it
  • a crossing edge connects a node in the MST to the rest of the graph.
• Ignore edges that form cycles (grey lines).

Example 2
• The following weighted graph:
  [Figure: weighted graph with edge weights 29, 32, 36, 19, 28, 17, 34, 52, 35, 37, 16, 26, 40, 58, 38, 93]

Building the MST
[Figures: numbered snapshots of the MST as it is built, ending with the finished tree]

Prim's in More Detail
boolean marked[];        // marked[v] is true once vertex v is in the MST
Queue<Edge> mst;         // for storing the edges in the MST
MinPriQueue<Edge> pq;    // for storing the crossing (and ineligible) edges

void prims(Graph graph, int start)
{
  visit(graph, start);
  while (!pq.isEmpty()) {
    Edge e = pq.remove();     // get lowest weight edge
    int v = e.either();       // (v,w) are the ends of the edge e
    int w = e.other(v);
    if (!marked[v] || !marked[w]) {   // skip ineligible edges (both ends already in the MST)
      mst.add(e);             // add edge to MST queue
      if (!marked[v])         // visit vertex v or w
        visit(graph, v);
      if (!marked[w])
        visit(graph, w);
    }
  }
}  // end of prims()

void visit(Graph graph, int v)
/* Mark v and add to priority queue all the edges
   from v to unmarked vertices */
{
  marked[v] = true;
  for (Edge e : graph.adj(v))
    if (!marked[e.other(v)])
      pq.add(e);
}
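The listing above depends on Graph, Edge and MinPriQueue classes that are not shown. A self-contained sketch of the same approach is given here, assuming java.util.PriorityQueue stands in for MinPriQueue and adjacency lists stand in for Graph; the class `LazyPrim`, its small `Edge` class and the helper methods are hypothetical, and vertices A to F of the section 1 example are numbered 0 to 5.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class LazyPrim {
    // Minimal stand-in for the Edge class used in the listing above.
    static final class Edge {
        final int v, w, weight;
        Edge(int v, int w, int weight) { this.v = v; this.w = w; this.weight = weight; }
        int either() { return v; }
        int other(int x) { return x == v ? w : v; }
    }

    static List<Edge> mst(int numVerts, List<Edge> edges, int start) {
        // Build adjacency lists: every edge appears in the list of both its ends.
        List<List<Edge>> adj = new ArrayList<>();
        for (int i = 0; i < numVerts; i++) adj.add(new ArrayList<>());
        for (Edge e : edges) { adj.get(e.v).add(e); adj.get(e.w).add(e); }

        boolean[] marked = new boolean[numVerts];
        List<Edge> mst = new ArrayList<>();
        PriorityQueue<Edge> pq =
            new PriorityQueue<>(Comparator.comparingInt((Edge e) -> e.weight));

        visit(adj, marked, pq, start);
        while (!pq.isEmpty()) {
            Edge e = pq.poll();                     // lowest weight edge
            int v = e.either();
            int w = e.other(v);
            if (marked[v] && marked[w]) continue;   // ineligible: both ends already in the MST
            mst.add(e);
            if (!marked[v]) visit(adj, marked, pq, v);
            if (!marked[w]) visit(adj, marked, pq, w);
        }
        return mst;
    }

    // Mark v and queue every edge from v to a vertex not yet in the MST.
    static void visit(List<List<Edge>> adj, boolean[] marked, PriorityQueue<Edge> pq, int v) {
        marked[v] = true;
        for (Edge e : adj.get(v))
            if (!marked[e.other(v)]) pq.add(e);
    }

    static int totalWeight(List<Edge> es) {
        int total = 0;
        for (Edge e : es) total += e.weight;
        return total;
    }

    // The section 1 example graph, with the assumed numbering A=0, ..., F=5.
    static List<Edge> exampleGraph() {
        int[][] raw = { {0,1,4}, {0,2,2}, {0,4,3}, {2,3,1}, {2,4,6},
                        {2,5,3}, {3,1,5}, {3,5,6}, {4,5,2} };
        List<Edge> es = new ArrayList<>();
        for (int[] r : raw) es.add(new Edge(r[0], r[1], r[2]));
        return es;
    }

    public static void main(String[] args) {
        System.out.println(totalWeight(mst(6, exampleGraph(), 0)));   // prints 12
    }
}
```

The queue may hold edges whose two ends have both been marked since they were added; they are discarded when removed, which is why this style is often called the lazy version of Prim's algorithm.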

Running Time
• The running time depends on the implementation of the minimum priority queue, which uses a minimum heap.
• An add() takes time O(log n), and so does remove(), since the heap must be repaired after its minimum is taken out (only peeking at the minimum is O(1))
• At most E edges are added to the priority queue and at most E are removed.
• The algorithm has a worst case running time of O(E log E).
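The O(E log E) bound is often quoted as O(E log V). A short derivation of why the two agree, assuming G is a simple connected graph (no parallel edges or self-loops), so that V - 1 ≤ E ≤ V(V-1)/2:

```latex
\begin{align*}
E \le \frac{V(V-1)}{2} < V^2
  &\;\Longrightarrow\; \log E < 2\log V
  \;\Longrightarrow\; E\log E = O(E\log V) \\
V \le E + 1
  &\;\Longrightarrow\; \log V \le \log(E+1)
  \;\Longrightarrow\; E\log V = O(E\log E)
\end{align*}
```

Under these assumptions the two bounds differ by at most a constant factor.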

3. Kruskal's Algorithm
• Kruskal's algorithm finds a minimum spanning tree T by iteratively adding edges to T.
• At each iteration, a minimum-weight edge is added that does not create a cycle in the current T.
• The new edge can come from anywhere in G.
• The algorithm can stop after |V|-1 edges have been added to T.

Simple Pseudocode
Tree kruskal(Graph G, int numVerts)
{
  Tree T = allVerts(G);
  for i = 1 to numVerts-1 {
    Edge e = select an edge of minimum weight from G which does not form a cycle in T;
    T = T + e;
  }
  return T;
}

Example 1
• Graph G:
  [Figure: weighted graph on the vertices a to l]
• Edge added at each iteration:
  iter  edge
  1     (c,d)
  2     (k,l)
  3     (b,f)
  4     (c,g)
  5     (a,b)
  6     (f,j)
  7     (b,c)
  8     (j,k)
  9     (g,h)
  10    (i,j)
  11    (a,e)

• Minimum spanning tree (weight = 24):
  [Figure: G with the MST edges highlighted]

Example 2
[Figure legend: edges sorted by weight; black edges are used in MST; next MST edge]

Kruskal's in More Detail
Queue<Edge> mst;         // for storing the edges in the MST
MinPriQueue<Edge> pq;    // for storing all the edges from G
UnionFind uf;            /* used to maintain disjoint sets of vertices:
                            one set for each tree (connected component)
                            built so far */

void kruskal(Graph graph)
{
  pq.addAll(graph.edges());         // add all edges to pri queue
  uf.makeSets(graph.vertices());    // create disjoint sets of V's
  while ((!pq.isEmpty()) &&
         (mst.size() < graph.vertices().size()-1)) {
    Edge e = pq.remove();     // get lowest weight edge
    int v = e.either();       // (v,w) are the ends of the edge e
    int w = e.other(v);
    if (!uf.isConnected(v, w)) {   // are v and w not in same set?
      // v and w are in different trees, so e cannot form a cycle
      uf.union(v, w);         // combine v and w's sets
      mst.add(e);             // add edge to mst
    }
  }
}  // end of kruskal()
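A self-contained sketch of the same loop follows, run on the six-vertex example graph from section 1. It assumes java.util.PriorityQueue in place of MinPriQueue, edges stored as `{v, w, weight}` triples, and a bare parent array in place of the UnionFind class; the names `KruskalSketch`, `find`, `totalWeight` and `exampleGraph` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class KruskalSketch {
    // Follow parent links up to the representative (root) of x's set.
    static int find(int[] parent, int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Each edge is {v, w, weight}. Returns the MST edges in the order they were accepted.
    static List<int[]> kruskal(int numVerts, int[][] edges) {
        PriorityQueue<int[]> pq =
            new PriorityQueue<>((a, b) -> Integer.compare(a[2], b[2]));
        pq.addAll(Arrays.asList(edges));            // add all edges to the priority queue

        int[] parent = new int[numVerts];           // every vertex starts in its own set
        for (int i = 0; i < numVerts; i++) parent[i] = i;

        List<int[]> mst = new ArrayList<>();
        while (!pq.isEmpty() && mst.size() < numVerts - 1) {
            int[] e = pq.poll();                    // lowest weight edge
            int rv = find(parent, e[0]);
            int rw = find(parent, e[1]);
            if (rv != rw) {                         // ends in different trees: no cycle
                parent[rv] = rw;                    // combine the two sets
                mst.add(e);
            }
        }
        return mst;
    }

    static int totalWeight(List<int[]> tree) {
        int total = 0;
        for (int[] e : tree) total += e[2];
        return total;
    }

    // The section 1 example graph, with the assumed numbering A=0, ..., F=5.
    static int[][] exampleGraph() {
        return new int[][] {
            {0,1,4}, {0,2,2}, {0,4,3}, {2,3,1}, {2,4,6},
            {2,5,3}, {3,1,5}, {3,5,6}, {4,5,2}
        };
    }

    public static void main(String[] args) {
        System.out.println(totalWeight(kruskal(6, exampleGraph())));   // prints 12
    }
}
```

The order in which equal-weight edges leave the queue is not specified, so the accepted edges may differ between runs of different implementations, but the total weight is the same as the tree found by Prim's algorithm.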

Running Time
• As with Prim's algorithm, the running time depends on the implementation of the minimum priority queue, which uses a minimum heap.
• The algorithm also uses a union-find data structure
  • at most E isConnected() calls and V-1 union() calls, each built on find(), which is O(log n) when the union-find trees are kept balanced
• The algorithm has a worst case running time of O(E log E), the same as Prim's
  • in practice, Prim's algorithm is usually faster than Kruskal's

Union-find
• A disjoint-set data structure keeps track of a set of elements split into disjoint (non-overlapping) subsets.
• Union-find consists of two main operations:
  • find(): report which subset a particular element is in
  • union(): join two subsets into a single subset
  • others: makeSets(), isConnected(), etc.
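One common way to implement these operations is a forest of parent pointers. The sketch below is an illustration rather than the UnionFind class used in the Kruskal listing: the name `UnionFindSketch` is hypothetical, the constructor plays the role of makeSets(), and union by size with path halving are assumed design choices that keep find() within O(log n).

```java
class UnionFindSketch {
    private final int[] parent;   // parent[x] == x means x is the root of its set
    private final int[] size;     // size[r] = number of elements in the set rooted at r
    private int count;            // number of disjoint sets

    // Plays the role of makeSets(): every element starts in its own set.
    UnionFindSketch(int n) {
        parent = new int[n];
        size = new int[n];
        count = n;
        for (int i = 0; i < n; i++) { parent[i] = i; size[i] = 1; }
    }

    // Report which subset x is in, identified by the root of its tree.
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];   // path halving: point x at its grandparent
            x = parent[x];
        }
        return x;
    }

    boolean isConnected(int a, int b) {
        return find(a) == find(b);
    }

    // Join the subsets containing a and b, hanging the smaller tree under the larger.
    void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) return;                // already in the same subset
        if (size[ra] < size[rb]) { int t = ra; ra = rb; rb = t; }
        parent[rb] = ra;
        size[ra] += size[rb];
        count--;
    }

    int count() {
        return count;
    }

    public static void main(String[] args) {
        UnionFindSketch uf = new UnionFindSketch(6);
        uf.union(0, 2);
        uf.union(2, 3);
        System.out.println(uf.isConnected(0, 3));   // prints true
        System.out.println(uf.isConnected(0, 1));   // prints false
        System.out.println(uf.count());             // prints 4
    }
}
```

Hanging the smaller tree under the larger one is what bounds the tree height by log n; without it a chain of unions can produce a tree of height n-1.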

4. Difference between Prim and Kruskal
• Prim's algorithm chooses an edge that must be connected to a vertex in the minimum spanning tree T.
• Kruskal's algorithm chooses an edge from G that may or may not be connected to a vertex in T.
