Chapter 9: Graph Algorithms • Graph ADT • Topological Sort • Shortest Path • Maximum Flow • Minimum Spanning Tree • Depth-First Search • Undecidability • NP-Completeness
The Graph Abstract Data Type
A graph G = (V, E) consists of a set of vertices, V, and a set of edges, E, each of which is a pair of vertices. If the edges are ordered pairs of vertices, then the graph is directed; otherwise it is undirected.
[Figure: five example graphs: a directed, unweighted, cyclic, strongly connected, loopless graph (length of longest simple cycle: 7); an undirected, unweighted, unconnected, loopless graph (length of longest simple path: 2); a directed, unweighted, acyclic, weakly connected, loopless graph (length of longest simple path: 6); an undirected, unweighted, connected graph with loops (length of longest simple path: 4); and a directed, weighted, cyclic, weakly connected, loopless graph (weight of longest simple cycle: 27)]
Graph Representation Option 1: Adjacency Matrix
[Figure: an eight-vertex example graph and its adjacency-matrix representation]
The Problem: Most graphs are sparse (i.e., most vertex pairs are not edges), so the memory requirement is excessive: Θ(V²).
Graph Representation Option 2: Adjacency Lists
[Figure: the same eight-vertex graph represented with adjacency lists, one list of adjacent vertices (with edge weights, in the weighted case) per vertex]
Each vertex keeps a list of the vertices adjacent to it, so the memory requirement is only Θ(E+V).
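A minimal C++ sketch of an adjacency-list representation (the vertex count, edges, and weights below are illustrative assumptions, not the figure's graph):

#include <iostream>
#include <vector>
using namespace std;

struct Edge { int to; int weight; };             // one outgoing edge

int main() {
    const int V = 4;                             // small illustrative graph
    vector<vector<Edge>> adj(V);                 // adj[u] = list of edges leaving u
    adj[0].push_back({1, 3});                    // edge 0 -> 1 with weight 3
    adj[0].push_back({2, 6});                    // edge 0 -> 2 with weight 6
    adj[2].push_back({3, 2});                    // edge 2 -> 3 with weight 2

    for (int u = 0; u < V; u++)                  // print each vertex's list
        for (Edge e : adj[u])
            cout << u << " -> " << e.to << " (weight " << e.weight << ")\n";
}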
Topological Sort
A topological sort of an acyclic directed graph orders the vertices so that if there is a path from vertex u to vertex v, then vertex v appears after vertex u in the ordering.
[Figure: course prerequisite graph for the CS, MATH, ECE, and STAT courses listed below]
One topological sort of the course prerequisite graph: CS 111, MATH 120, CS 140, MATH 125, CS 150, MATH 224, CS 234, CS 240, ECE 282, CS 312, CS 325, MATH 150, MATH 152, STAT 380, CS 321, MATH 250, MATH 321, CS 314, MATH 423, CS 340, CS 425, ECE 381, CS 434, ECE 482, CS 330, CS 382, CS 423, CS 438, CS 454, CS 447, CS 499, CS 482, CS 456, ECE 483
A Topological Sort Algorithm
• Place all indegree-zero vertices in a list.
• While the list is non-empty:
  • Output an element v from the indegree-zero list.
  • For each vertex w with (v,w) in the edge set E:
    • Decrement the indegree of w by one.
    • Place w in the indegree-zero list if its new indegree is 0.
[Figure: a trace of the algorithm on a nine-vertex graph, showing each vertex's indegree being decremented as the vertices are output]
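A minimal C++ sketch of this indegree-based topological sort (the sample graph is an illustrative assumption):

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

int main() {
    const int V = 6;
    vector<vector<int>> adj(V);
    vector<int> indegree(V, 0);
    int edges[][2] = {{0,1},{0,2},{1,3},{2,3},{3,4},{4,5}};   // illustrative edge set
    for (auto& e : edges) { adj[e[0]].push_back(e[1]); indegree[e[1]]++; }

    queue<int> zero;                          // the "indegree-zero list"
    for (int v = 0; v < V; v++)
        if (indegree[v] == 0) zero.push(v);

    while (!zero.empty()) {
        int v = zero.front(); zero.pop();
        cout << v << " ";                     // output v
        for (int w : adj[v])                  // decrement each successor's indegree
            if (--indegree[w] == 0) zero.push(w);
    }
    cout << "\n";
}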
Shortest Paths
The shortest path between two locations depends on the function that is used to measure a path’s “cost”.
Does the path “cost” less if the amount of boring scenery is minimized?
Does the path “cost” less if the total distance traveled is minimized?
Does the path “cost” less if the amount of traffic encountered is minimized?
In general, then, how do you find the shortest path from a specified vertex to every other vertex in a directed graph?
Shortest Path Algorithm: Unweighted Graphs
Use a breadth-first search: starting at the specified vertex, mark each adjacent vertex with its distance and its predecessor, until all vertices are marked.
[Figure: a trace of the breadth-first search on a nine-vertex graph, with each vertex labeled (distance, predecessor) as it is reached]
Algorithm’s time complexity: O(E+V).
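A minimal C++ sketch of the breadth-first-search labeling (the sample graph and source vertex are illustrative assumptions):

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

int main() {
    const int V = 6, source = 0;
    vector<vector<int>> adj(V);
    int edges[][2] = {{0,1},{0,2},{1,3},{2,3},{3,4},{4,5}};   // illustrative edges
    for (auto& e : edges) adj[e[0]].push_back(e[1]);

    vector<int> dist(V, -1), pred(V, -1);     // -1 marks "not yet reached"
    queue<int> q;
    dist[source] = 0;
    q.push(source);
    while (!q.empty()) {
        int v = q.front(); q.pop();
        for (int w : adj[v])
            if (dist[w] == -1) {              // first time w is reached
                dist[w] = dist[v] + 1;
                pred[w] = v;
                q.push(w);
            }
    }
    for (int v = 0; v < V; v++)
        cout << v << ": dist " << dist[v] << ", pred " << pred[v] << "\n";
}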
Shortest Path Algorithm: Weighted Graphs With No Negative Weights
Use Dijkstra’s Algorithm: starting at the specified vertex, finalize the unfinalized vertex whose current cost is minimal, and update each vertex adjacent to the finalized vertex with its (possibly revised) cost and predecessor, until all vertices are finalized.
[Figure: the first steps of a trace of Dijkstra’s Algorithm on a nine-vertex weighted graph, with each vertex labeled (cost, predecessor) and unreached vertices labeled (∞,-)]
Algorithm’s time complexity: O(E+V²) = O(V²).
[Figure: the remaining steps of the Dijkstra trace, ending with every vertex finalized]
Note that this algorithm would not work for graphs with negative weights, since a vertex cannot safely be finalized while some negative-weight edge elsewhere in the graph might still reduce the cost of a path to it.
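A minimal C++ sketch of Dijkstra’s Algorithm as described above, using a simple scan to find the minimal unfinalized vertex (the weight matrix is an illustrative assumption):

#include <iostream>
#include <vector>
#include <climits>
using namespace std;

int main() {
    const int V = 5, source = 0;
    // adjacency matrix; 0 means "no edge" (illustrative weights)
    int w[V][V] = {{0,7,0,6,8},{0,0,19,0,0},{0,0,0,0,0},{0,5,0,0,0},{0,0,2,0,0}};

    vector<int> cost(V, INT_MAX), pred(V, -1);
    vector<bool> finalized(V, false);
    cost[source] = 0;

    for (int pass = 0; pass < V; pass++) {
        int v = -1;                                // unfinalized vertex of minimal cost
        for (int i = 0; i < V; i++)
            if (!finalized[i] && cost[i] != INT_MAX && (v == -1 || cost[i] < cost[v]))
                v = i;
        if (v == -1) break;                        // remaining vertices are unreachable
        finalized[v] = true;
        for (int u = 0; u < V; u++)                // relax the edges leaving v
            if (w[v][u] > 0 && !finalized[u] && cost[v] + w[v][u] < cost[u]) {
                cost[u] = cost[v] + w[v][u];
                pred[u] = v;
            }
    }
    for (int v = 0; v < V; v++)
        cout << v << ": cost " << cost[v] << ", pred " << pred[v] << "\n";
}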
Shortest Path Algorithm: Weighted Graphs With Negative Weights
Use a variation of Dijkstra’s Algorithm without the concept of vertex finalization: starting at the specified vertex, update each vertex adjacent to the current vertex with its (possibly revised) cost and predecessor, placing each revised vertex in a queue. Continue until the queue is empty.
[Figure: the first steps of a trace on a nine-vertex graph containing negative edge weights, with each vertex labeled (cost, predecessor) and unreached vertices labeled (∞,-)]
Algorithm’s time complexity: O(EV).
[Figure: the remaining steps of the trace, ending with every vertex labeled with its least cost and its predecessor]
Note that this algorithm would not work for graphs with negative-cost cycles. For example, if edge IH in the example had cost -5 instead of cost -3, then the algorithm would loop indefinitely.
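A minimal C++ sketch of the queue-based relaxation just described, essentially the Bellman-Ford approach (the sample graph, which has a negative edge but no negative-cost cycle, is an illustrative assumption):

#include <iostream>
#include <queue>
#include <vector>
#include <climits>
using namespace std;

struct Edge { int to, weight; };

int main() {
    const int V = 5, source = 0;
    vector<vector<Edge>> adj(V);
    adj[0] = {{1, 13}, {2, 25}};
    adj[1] = {{3, 7}};
    adj[2] = {{3, -9}};                              // a negative-weight edge
    adj[3] = {{4, 6}};

    vector<int> cost(V, INT_MAX), pred(V, -1);
    vector<bool> inQueue(V, false);
    queue<int> q;
    cost[source] = 0;
    q.push(source); inQueue[source] = true;

    while (!q.empty()) {
        int v = q.front(); q.pop(); inQueue[v] = false;
        for (Edge e : adj[v])
            if (cost[v] + e.weight < cost[e.to]) {   // revise cost and predecessor
                cost[e.to] = cost[v] + e.weight;
                pred[e.to] = v;
                if (!inQueue[e.to]) { q.push(e.to); inQueue[e.to] = true; }
            }
    }
    for (int v = 0; v < V; v++)
        cout << v << ": cost " << cost[v] << ", pred " << pred[v] << "\n";
}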
Maximum Flow
Assume that the directed graph G = (V, E) has edge capacities assigned to each edge, and that two vertices s and t have been designated the source and sink nodes, respectively. We wish to maximize the “flow” from s to t by determining how much of each edge’s capacity can be used so that at each vertex, the total incoming flow equals the total outgoing flow. This problem relates to such practical applications as Internet routing and automobile traffic control.
[Figure: a nine-vertex directed graph with a capacity on each edge]
In this graph, for instance, the total of the incoming capacities for node C is 40 while the total of the outgoing capacities for node C is 35. Obviously, the maximum flow for this graph will have to “waste” some of the incoming capacity at node C. Conversely, node B’s outgoing capacity exceeds its incoming capacity by 5, so some of its outgoing capacity will have to be wasted.
Maximum Flow Algorithm
To find a maximum flow, maintain a flow graph and a residual graph, which together keep track of which paths have been added to the flow. Keep choosing augmenting paths that yield maximal increases to the flow; add each such path to the flow graph, subtract it from the residual graph, and add its reverse to the residual graph.
[Figure: the original graph, the flow graph, and the residual graph, before and after the first augmenting paths are chosen]
[Figure: continuation of the trace, showing the original graph, flow graph, and residual graph after further augmenting paths]
[Figure: the final flow graph and residual graph, once no augmenting path remains]
Thus, the maximum flow for the graph is 76.
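A minimal C++ sketch of augmenting-path maximum flow on a residual capacity matrix. The slides choose the path with the maximal flow increase at each step; this sketch uses the simpler shortest-augmenting-path (Edmonds-Karp) choice instead, and its graph and capacities are illustrative assumptions:

#include <iostream>
#include <queue>
#include <vector>
#include <climits>
using namespace std;

int main() {
    const int V = 4, s = 0, t = 3;
    // residual capacity matrix (illustrative capacities)
    vector<vector<int>> cap(V, vector<int>(V, 0));
    cap[0][1] = 20; cap[0][2] = 10; cap[1][2] = 5; cap[1][3] = 10; cap[2][3] = 20;

    int maxFlow = 0;
    while (true) {
        vector<int> pred(V, -1);                    // BFS for an augmenting path
        queue<int> q; q.push(s); pred[s] = s;
        while (!q.empty() && pred[t] == -1) {
            int u = q.front(); q.pop();
            for (int v = 0; v < V; v++)
                if (pred[v] == -1 && cap[u][v] > 0) { pred[v] = u; q.push(v); }
        }
        if (pred[t] == -1) break;                   // no augmenting path left

        int bottleneck = INT_MAX;                   // smallest residual capacity on path
        for (int v = t; v != s; v = pred[v])
            bottleneck = min(bottleneck, cap[pred[v]][v]);
        for (int v = t; v != s; v = pred[v]) {      // update the residual graph
            cap[pred[v]][v] -= bottleneck;          // subtract the path
            cap[v][pred[v]] += bottleneck;          // add its reverse
        }
        maxFlow += bottleneck;
    }
    cout << "Maximum flow: " << maxFlow << "\n";    // prints 25 for this graph
}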
Minimum Spanning Tree: Kruskal’s Algorithm
When a large set of vertices must be interconnected (i.e., “spanning”) and there are several possible means of doing so, redundancy can be eliminated (i.e., “tree”) and costs can be reduced (i.e., “minimum”) by employing a minimum spanning tree. Kruskal’s Algorithm accomplishes this by repeatedly selecting the cheapest remaining edge, as long as it doesn’t form a cycle.
[Figure: an eight-vertex weighted graph (original graph) and the minimum spanning tree that Kruskal’s Algorithm produces from it]
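A minimal C++ sketch of Kruskal’s Algorithm, using a simple union-find structure to reject edges that would form cycles (the edge list is an illustrative assumption):

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>
using namespace std;

struct Edge { int u, v, cost; };

int find(vector<int>& parent, int x) {           // union-find: find set representative
    while (parent[x] != x) x = parent[x] = parent[parent[x]];
    return x;
}

int main() {
    const int V = 5;
    vector<Edge> edges = {{0,1,6},{0,3,3},{1,2,5},{1,3,4},{2,4,7},{3,4,8}};
    sort(edges.begin(), edges.end(),
         [](const Edge& a, const Edge& b) { return a.cost < b.cost; });

    vector<int> parent(V);
    iota(parent.begin(), parent.end(), 0);       // each vertex starts in its own set

    int total = 0;
    for (const Edge& e : edges) {
        int ru = find(parent, e.u), rv = find(parent, e.v);
        if (ru != rv) {                          // accepting e does not form a cycle
            parent[ru] = rv;
            total += e.cost;
            cout << "accept edge (" << e.u << "," << e.v << ") cost " << e.cost << "\n";
        }
    }
    cout << "total cost: " << total << "\n";
}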
Minimum Spanning Tree: Prim’s Algorithm
An alternative to Kruskal’s Algorithm is Prim’s Algorithm, a variation of Dijkstra’s Algorithm that grows a single tree from a starting vertex, at each step adding the minimum-cost edge that connects a tree vertex to a vertex not yet in the tree (so no cycle is ever created). Like Kruskal’s Algorithm, Prim’s Algorithm is O(E log V).
[Figure: the same eight-vertex weighted graph and the step-by-step growth of its minimum spanning tree under Prim’s Algorithm]
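A minimal C++ sketch of Prim’s Algorithm on a weight matrix, tracking for each non-tree vertex the cheapest edge connecting it to the growing tree (the matrix is an illustrative assumption):

#include <iostream>
#include <vector>
#include <climits>
using namespace std;

int main() {
    const int V = 5;
    // 0 means "no edge" (illustrative symmetric weight matrix)
    int w[V][V] = {{0,6,0,3,0},{6,0,5,4,0},{0,5,0,0,7},{3,4,0,0,8},{0,0,7,8,0}};

    vector<int> best(V, INT_MAX), pred(V, -1);   // cheapest known edge into the tree
    vector<bool> inTree(V, false);
    best[0] = 0;                                  // start the tree at vertex 0

    int total = 0;
    for (int pass = 0; pass < V; pass++) {
        int v = -1;                               // non-tree vertex with cheapest edge
        for (int i = 0; i < V; i++)
            if (!inTree[i] && best[i] != INT_MAX && (v == -1 || best[i] < best[v]))
                v = i;
        if (v == -1) break;                       // graph is disconnected
        inTree[v] = true;
        total += best[v];
        if (pred[v] != -1)
            cout << "add edge (" << pred[v] << "," << v << ") cost " << best[v] << "\n";
        for (int u = 0; u < V; u++)               // update cheapest edges into the tree
            if (w[v][u] > 0 && !inTree[u] && w[v][u] < best[u]) {
                best[u] = w[v][u];
                pred[u] = v;
            }
    }
    cout << "total cost: " << total << "\n";
}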
Depth-First Search
A convenient means to traverse a graph is to use a depth-first search, which recursively visits and marks the vertices until all of them have been traversed. Such a traversal provides a means by which several significant features of a graph can be determined.
[Figure: an eight-vertex graph, its depth-first search (solid lines are part of the depth-first spanning tree; dashed lines are visits to previously marked vertices), and the resulting depth-first spanning tree]
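A minimal recursive C++ sketch of depth-first search (the sample graph is an illustrative assumption):

#include <iostream>
#include <vector>
using namespace std;

vector<vector<int>> adj;          // adjacency lists
vector<bool> visited;             // marks for already-visited vertices

void dfs(int v) {
    visited[v] = true;
    cout << "visit " << v << "\n";
    for (int w : adj[v])
        if (!visited[w])          // tree edge: recurse into an unmarked neighbor
            dfs(w);
}

int main() {
    const int V = 6;
    adj.assign(V, {});
    visited.assign(V, false);
    int edges[][2] = {{0,1},{0,2},{1,3},{2,4},{3,5}};
    for (auto& e : edges) {       // undirected: add both directions
        adj[e[0]].push_back(e[1]);
        adj[e[1]].push_back(e[0]);
    }
    for (int v = 0; v < V; v++)   // cover every component
        if (!visited[v]) dfs(v);
}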
Depth-First Search Application: Articulation Points
A vertex v in a graph G is called an articulation point if its removal from G would cause the graph to be disconnected. Such vertices would be considered “critical” in applications like networks, where articulation points are the only means of communication between different portions of the network.
[Figure: a ten-vertex graph and its depth-first search, with each node numbered in the order it is visited and also marked with the lowest-numbered vertex reachable via zero or more tree edges followed by at most one back edge]
A depth-first search can be used to find all of a graph’s articulation points. The only articulation points are the root (if it has more than one child) and any other node v with a child whose “low” number is at least as large as v’s “visit” number. (In this example: nodes B, C, E, and G.)
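A minimal C++ sketch of the visit/low computation described above, reporting articulation points as they are detected (the sample graph is an illustrative assumption; a vertex may be reported once per qualifying child):

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

vector<vector<int>> adj;
vector<int> visitNum, low;
int counter = 0;

void findArt(int v, int parent) {
    visitNum[v] = low[v] = ++counter;
    int children = 0;
    for (int w : adj[v]) {
        if (visitNum[w] == 0) {                       // tree edge
            children++;
            findArt(w, v);
            low[v] = min(low[v], low[w]);
            // a non-root v is an articulation point if some child
            // cannot reach above v without going through v
            if (parent != -1 && low[w] >= visitNum[v])
                cout << "articulation point: " << v << "\n";
        } else if (w != parent) {                     // back edge
            low[v] = min(low[v], visitNum[w]);
        }
    }
    if (parent == -1 && children > 1)                 // root with more than one child
        cout << "articulation point: " << v << "\n";
}

int main() {
    const int V = 7;
    adj.assign(V, {});
    visitNum.assign(V, 0);
    low.assign(V, 0);
    int edges[][2] = {{0,1},{1,2},{2,0},{1,3},{3,4},{4,5},{5,3},{4,6}};
    for (auto& e : edges) {
        adj[e[0]].push_back(e[1]);
        adj[e[1]].push_back(e[0]);
    }
    findArt(0, -1);                                   // reports vertices 4, 3, and 1
}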
Depth-First Search Application: Euler Circuits
An Euler circuit of a graph G is a cycle that visits every edge exactly once. Such a cycle could be useful in applications like networks, where it could provide an efficient means of testing whether each network link is up. A depth-first search can be used to find an Euler circuit of a graph, if one exists. (Note: An Euler circuit exists only if the graph is connected and every vertex has even degree.)
[Figure: a ten-vertex graph and the successive removal of four depth-first cycles: BCDFB, ABGHA, FHIF, and FEJF]
Splicing the first two cycles yields cycle A(BCDFB)GHA. Splicing this cycle with the third cycle yields ABCD(FHIF)BGHA. Splicing this cycle with the fourth cycle yields ABCD(FEJF)HIFBGHA.
Note that this traversal takes O(E+V) time.
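A minimal C++ sketch of building an Euler circuit by following unused edges and splicing the resulting cycles, in the spirit of the depth-first approach above (often called Hierholzer’s method; the sample graph is an illustrative assumption):

#include <algorithm>
#include <iostream>
#include <stack>
#include <vector>
using namespace std;

int main() {
    const int V = 5;
    // illustrative connected graph in which every vertex has even degree
    vector<pair<int,int>> edges = {{0,1},{1,2},{2,0},{0,3},{3,4},{4,0}};

    vector<vector<pair<int,int>>> adj(V);      // (neighbor, edge index)
    for (int i = 0; i < (int)edges.size(); i++) {
        adj[edges[i].first].push_back({edges[i].second, i});
        adj[edges[i].second].push_back({edges[i].first, i});
    }

    vector<bool> used(edges.size(), false);    // each edge may be traversed once
    vector<int> nextEdge(V, 0);                // next adjacency entry to try
    vector<int> circuit;
    stack<int> path;
    path.push(0);                              // start the circuit at vertex 0

    while (!path.empty()) {
        int v = path.top();
        bool advanced = false;
        while (nextEdge[v] < (int)adj[v].size()) {   // follow an unused edge if possible
            auto [w, id] = adj[v][nextEdge[v]++];
            if (!used[id]) { used[id] = true; path.push(w); advanced = true; break; }
        }
        if (!advanced) {                       // stuck: splice this vertex into the circuit
            circuit.push_back(v);
            path.pop();
        }
    }
    reverse(circuit.begin(), circuit.end());
    for (int v : circuit) cout << v << " ";    // an Euler circuit, e.g. 0 1 2 0 3 4 0
    cout << "\n";
}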
Depth-First Search Application: Strong Components
A subgraph of a directed graph is a strong component of the graph if there is a path in the subgraph from every vertex in the subgraph to every other vertex in the subgraph. In network applications, such a subgraph could be used to ensure intercommunication.
A depth-first search can be used to find the strong components of a graph: number the vertices according to a depth-first postorder traversal, reverse all of the edges, and then perform a depth-first search of the reversed graph, always starting each new tree at the unvisited vertex with the highest number. The trees in the final depth-first spanning forest form the set of strong components.
[Figure: an eleven-vertex directed graph with its postorder numbering, the reversed graph, and the legs of the second depth-first search]
In this example, the strong components are: { C }, { B, F, I, E }, { A }, { D, G, K, J }, and { H }.
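A minimal C++ sketch of this two-pass postorder-and-reverse approach (Kosaraju’s method; the sample directed graph is an illustrative assumption):

#include <iostream>
#include <vector>
using namespace std;

vector<vector<int>> adj, radj;       // graph and reversed graph
vector<bool> visited;
vector<int> postorder;

void dfs1(int v) {                   // first pass: record postorder numbering
    visited[v] = true;
    for (int w : adj[v]) if (!visited[w]) dfs1(w);
    postorder.push_back(v);
}

void dfs2(int v) {                   // second pass: collect one strong component
    visited[v] = true;
    cout << v << " ";
    for (int w : radj[v]) if (!visited[w]) dfs2(w);
}

int main() {
    const int V = 5;
    adj.assign(V, {}); radj.assign(V, {});
    // illustrative directed graph: a 3-cycle {0,1,2} plus a 2-cycle {3,4}
    int edges[][2] = {{0,1},{1,2},{2,0},{2,3},{3,4},{4,3}};
    for (auto& e : edges) {
        adj[e[0]].push_back(e[1]);
        radj[e[1]].push_back(e[0]);  // reversed edge
    }

    visited.assign(V, false);
    for (int v = 0; v < V; v++) if (!visited[v]) dfs1(v);

    visited.assign(V, false);
    for (int i = V - 1; i >= 0; i--) {                // highest postorder number first
        int v = postorder[i];
        if (!visited[v]) { dfs2(v); cout << "\n"; }   // each line is one strong component
    }
}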
Undecidable Problems
Some problems are difficult to solve on a computer, but some are impossible!
Example: The Halting Problem
We’d like to build a program H that will test any other program P and any input I and answer “yes” if P will terminate normally on input I and “no” if P will get stuck in an infinite loop on input I.
[Figure: program H takes program P and input I and answers YES (if P halts on I) or NO (if P loops forever on I); program X runs its input program P through H as both the program and the input, looping forever on YES and halting on NO]
If this program H exists, then let’s make it the subroutine for program X, which takes any program P and runs it through subroutine H as both the program and the input. If the result from the subroutine is “yes”, then program X enters an infinite loop; otherwise it halts.
Note that if we ran program X with itself as its input, then it would halt only if it loops forever, and it would loop forever only if it halts!?!? This contradiction means that our assumption that we could build program H was false, i.e., that the halting problem is undecidable.
P and NP Problems
A problem is said to be a P problem if it can be solved with a deterministic, polynomial-time algorithm. (Deterministic algorithms have each step clearly specified.)
A problem is said to be an NP problem if it can be solved with a nondeterministic, polynomial-time algorithm. In essence, at a critical point in the algorithm, a decision must be made, and it is assumed that a magical “choice” function always chooses correctly.
Example: Satisfiability
Given a set of n boolean variables b1, b2, …, bn and a boolean function f(b1, b2, …, bn).
Problem: Are there values that can be assigned to the variables so that the function would evaluate to TRUE? For example, is the function (b1 OR b2) AND (b3 OR NOT b4) ever TRUE?
To try every combination takes exponential time, but a nondeterministic solution is polynomial-time:
  for (i=1; i<=n; i++)
    b[i] = choice(TRUE,FALSE);
  if (f(b[1],b[2],…,b[n])==TRUE)
    cout << “SATISFIABLE”;
  else
    cout << “UNSATISFIABLE”;
So Satisfiability is an NP problem.
The Knapsack Problem
Given a set of n valuable jewels J1, J2, …, Jn with respective weights w1, w2, …, wn and respective prices p1, p2, …, pn, as well as a knapsack capable of supporting a total weight of M.
Problem: Is there a way to pack at least T dollars worth of jewels without exceeding the weight capacity of the knapsack? (It’s not as easy as it sounds; three lightweight $1000 jewels might be preferable to one heavy $2500 jewel, for instance.)
A nondeterministic polynomial-time solution:
  totalWorth = 0;
  totalWeight = 0;
  for (i=1; i<=n; i++) {
    b[i] = choice(TRUE,FALSE);
    if (b[i]==TRUE) {
      totalWorth += p[i];
      totalWeight += w[i];
    }
  }
  if ((totalWorth >= T) && (totalWeight <= M))
    cout << “YAHOO! I’M RICH!”;
  else
    cout << “@#$&%!”;
NP-Complete Problems
Note that all P problems are automatically NP problems, but it’s unknown if the reverse is true. The hardest NP problems are the NP-complete problems. An NP-complete problem is one to which every NP problem can be “polynomially reduced”. In other words, an instance of any NP problem can be transformed in polynomial time into an instance of the NP-complete problem in such a way that a solution to the first instance provides a solution to the second instance, and vice versa.
For example, consider the following two problems:
The Hamiltonian Circuit Problem: Given an undirected graph, does it contain a cycle that passes through every vertex in the graph exactly once?
The Spanning Tree Problem: Given an undirected graph and a positive integer k, does the graph have a spanning tree with exactly k leaf nodes?
Assume that the Hamiltonian Circuit Problem is known to be NP-complete. To prove that the Spanning Tree Problem is NP-complete, we can just polynomially reduce the Hamiltonian Circuit Problem to it.
Let G = (V, E) be an instance of the Hamiltonian Circuit Problem. Let x be some arbitrary vertex in V. Define Ax = { y in V : (x,y) is in E }.
Define an instance G′ = (V′, E′) of the Spanning Tree Problem as follows:
Let V′ = V ∪ {u, v, w}, where u, v, and w are new vertices that are not in V.
Let E′ = E ∪ {(u, x)} ∪ {(y, v) : y is in Ax} ∪ {(v, w)}.
Finally, let k = 2.
If G has a Hamiltonian circuit, then G′ has a spanning tree with exactly k leaf nodes.
Proof: G’s Hamiltonian circuit can be written as (x, y1, z1, z2, …, zm, y2, x), where V = {x, y1, y2, z1, z2, …, zm} and Ax contains y1 and y2. Thus, (u, x, y1, z1, z2, …, zm, y2, v, w) forms a Hamiltonian path (i.e., a spanning tree with exactly two leaf nodes) of G′.
Conversely, if G′ has a spanning tree with exactly two leaf nodes, then G has a Hamiltonian circuit.
Proof: The two-leaf spanning tree of G′ must have u and w as its leaf nodes (since they’re the only vertices of degree one in G′). Notice that the only vertex adjacent to w in G′ is v, and the only vertex adjacent to u in G′ is x. Also, the only vertices adjacent to v in G′ are those in Ax ∪ {w}, and the only vertices adjacent to x in G′ are those in Ax ∪ {u}. So the Hamiltonian path in G′ must take the form (u, x, y1, z1, z2, …, zm, y2, v, w), where V = {x, y1, y2, z1, z2, …, zm} and Ax contains y1 and y2. Thus, since y2 is adjacent to x, (x, y1, z1, z2, …, zm, y2, x) must be a Hamiltonian circuit in G.