1.23k likes | 1.26k Views
This article covers the concepts of graphs and graph traversals, including depth-first search and breadth-first search. It discusses different graph representations and algorithms like Dijkstra's algorithm and topological sort.
E N D
CSE 326: Data StructuresPart 8Graphs Henry Kautz Autumn Quarter 2002
Outline • Graphs (TO DO: READ WEISS CH 9) • Graph Data Structures • Graph Properties • Topological Sort • Graph Traversals • Depth First Search • Breadth First Search • Iterative Deepening Depth First • Shortest Path Problem • Dijkstra’s Algorithm
Han Luke Leia Graph ADT Graphs are a formalism for representing relationships between objects • a graph G is represented as G = (V, E) • Vis a set of vertices • E is a set of edges • operations include: • iterating over vertices • iterating over edges • iterating over vertices adjacent to a specific vertex • asking whether an edge exists connected two vertices V = {Han, Leia, Luke} E = {(Luke, Leia), (Han, Leia), (Leia, Han)}
Han Luke Leia Graph Representation 1: Adjacency Matrix A |V| x |V| array in which an element (u, v) is true if and only if there is an edge from uto v Han Luke Leia Han Luke Runtime: iterate over vertices iterate ever edges iterate edges adj. to vertex edge exists? Leia Space requirements:
Han Luke Leia Graph Representation 2: Adjacency List A |V|-ary list (array) in which each entry stores a list (linked list) of all adjacent vertices Han Luke Runtime: iterate over vertices iterate ever edges iterate edges adj. to vertex edge exists? Leia space requirements:
Directed vs. Undirected Graphs • In directed graphs, edges have a specific direction: • In undirected graphs, they don’t (edges are two-way): • Vertices u and v are adjacent if (u, v) E Han Luke Leia Han Luke Leia
Graph Density A sparse graph has O(|V|) edges A dense graph has (|V|2) edges Anything in between is either sparsish or densy depending on the context.
Weighted Graphs Each edge has an associated weight or cost. 20 Clinton Mukilteo 30 Kingston Edmonds 35 Bainbridge Seattle 60 Bremerton There may be more information in the graph as well.
Paths and Cycles A path is a list of vertices {v1, v2, …, vn} such that (vi, vi+1) E for all 0 i < n. A cycle is a path that begins and ends at the same node. Chicago Seattle Salt Lake City San Francisco Dallas p = {Seattle, Salt Lake City, Chicago, Dallas, San Francisco, Seattle}
Path Length and Cost Path length: the number of edges in the path Path cost: the sum of the costs of each edge Chicago 3.5 Seattle 2 2 Salt Lake City 2 2.5 2.5 2.5 3 San Francisco Dallas length(p) = 5 cost(p) = 11.5
Connectivity Undirected graphs are connected if there is a path between any two vertices Directed graphs are strongly connected if there is a path from any one vertex to any other Directed graphs are weakly connected if there is a path between any two vertices, ignoring direction A complete graph has an edge between every pair of vertices
Trees as Graphs • Every tree is a graph with some restrictions: • the tree is directed • there are no cycles (directed or undirected) • there is a directed path from the root to every node A B C D E F G H BAD! I J
Directed Acyclic Graphs (DAGs) DAGs are directed graphs with no cycles. main() mult() if program call graph is a DAG, then all procedure calls can be in-lined add() read() access() Trees DAGs Graphs
Application of DAGs: Representing Partial Orders reserve flight check in airport call taxi pack bags take flight locate gate taxi to airport
Topological Sort Given a graph, G = (V, E), output all the vertices in V such that no vertex is output before any other vertex with an edge to it. reserve flight check in airport call taxi take flight taxi to airport locate gate pack bags
Topo-Sort Take One Label each vertex’s in-degree (# of inbound edges) While there are vertices remaining Pick a vertex with in-degree of zero and output it Reduce the in-degree of all vertices adjacent to it Remove it from the list of vertices runtime:
Topo-Sort Take Two Label each vertex’s in-degree Initialize a queue (or stack)to contain all in-degree zero vertices While there are vertices remaining in the queue Remove a vertex v with in-degree of zero and output it Reduce the in-degree of all vertices adjacent to v Put any of these with new in-degree zero on the queue runtime:
Recall: Tree Traversals a e b c d h i j f g k l a b f g k c d h i l j e
Depth-First Search • Pre/Post/In – order traversals are examples of depth-first search • Nodes are visited deeply on the left-most branches before any nodes are visited on the right-most branches • Visiting the right branches deeply before the left would still be depth-first! Crucial idea is “go deep first!” • Difference in pre/post/in-order is how some computation (e.g. printing) is done at current node relative to the recursive calls • In DFS the nodes “being worked on” are kept on a stack
Iterative Version DFSPre-order Traversal Push root on a Stack Repeat until Stack is empty: Pop a node Process it Push it’s children on the Stack
a e b c d h i j f g k l Level-Order Tree Traversal • Consider task of traversing tree level by level fromtop to bottom (alphabetic order) • Is this also DFS?
Breadth-First Search • No! Level-order traversal is an example of Breadth-First Search • BFS characteristics • Nodes being worked on maintained in a FIFO Queue, not a stack • Iterative style procedures often easier to design than recursive procedures Put root in a Queue Repeat until Queue is empty: Dequeue a node Process it Add it’s children to queue
a e b c d h i j f g k l QUEUE a b c d e c d e f g d e f g e f g h i j f g h i j g h i j h i j k i j k j k l k l l
Graph Traversals • Depth first search and breadth first search also work for arbitrary (directed or undirected) graphs • Must mark visited vertices so you do not go into an infinite loop! • Either can be used to determine connectivity: • Is there a path between two given vertices? • Is the graph (weakly) connected? • Important difference: Breadth-first search always finds a shortest path from the start vertex to any other (for unweighted graphs) • Depth first search may not!
Is BFS the Hands Down Winner? • Depth-first search • Simple to implement (implicit or explict stack) • Does not always find shortest paths • Must be careful to “mark” visited vertices, or you could go into an infinite loop if there is a cycle • Breadth-first search • Simple to implement (queue) • Always finds shortest paths • Marking visited nodes can improve efficiency, but even without doing so search is guaranteed to terminate
Space Requirements Consider space required by the stack or queue… • Suppose • G is known to be at distance d from S • Each vertex n has k out-edges • There are no (undirected or directed) cycles • BFS queue will grow to size kd • Will simultaneously contain all nodes that are at distance d (once last vertex at distance d-1 is expanded) • For k=10, d=15, size is 1,000,000,000,000,000
DFS Space Requirements • Consider DFS, where we limit the depth of the search to d • Force a backtrack at d+1 • When visiting a node n at depth d, stack will contain • (at most) k-1 siblings of n • parent of n • siblings of parent of n • grandparent of n • siblings of grandparent of n … • DFS queue grows at most to size dk • For k=10, d=15, size is 150 • Compare with BFS 1,000,000,000,000,000
Conclusion • For very large graphs – DFS is hugely more memory efficient, if we know the distance to the goal vertex! • But suppose we don’t know d. What is the (obvious) strategy?
Iterative Deepening DFS IterativeDeepeningDFS(vertex s, g){ for (i=1;true;i++) if DFS(i, s, g) return; } // Also need to keep track of path found bool DFS(int limit, vertex s, g){ if (s==g) return true; if (limit-- <= 0) return false; for (n in children(s)) if (DFS(limit, n, g)) return true; return false; }
Analysis of Iterative Deepening • Even without “marking” nodes as visited, iterative-deepening DFS never goes into an infinite loop • For very large graphs, memory cost of keeping track of visited vertices may make marking prohibitive • Work performed with limit < actual distance to G is wasted – but the wasted work is usually small compared to amount of work done during the last iteration
Asymptotic Analysis • There are “pathological” graphs for which iterative deepening is bad: n=d G S
A Better Case Suppose each vertex n has k out-edges, no cycles • Bounded DFS to level i reaches ki vertices • Iterative Deepening DFS(d) = ignore low order terms!
(More) Conclusions • To find a shortest path between two nodes in a unweighted graph, use either BFS or Iterated DFS • If the graph is large, Iterated DFS typically uses much less memory • Later we’ll learn about heuristic search algorithms, which use additional knowledge about the problem domain to reduce the number of vertices visited
Single Source, Shortest Path for Weighted Graphs Given a graph G = (V, E) with edge costs c(e), and a vertex s V, find the shortest (lowest cost) path from s to every vertex in V • Graph may be directed or undirected • Graph may or may not contain cycles • Weights may be all positive or not • What is the problem if graph contains cycles whose total cost is negative?
The Trouble with Negative Weighted Cycles 2 A B 10 -5 1 E 2 C D
Edsger Wybe Dijkstra (1930-2002) • Invented concepts of structured programming, synchronization, weakest precondition, and "semaphores" for controlling computer processes. The Oxford English Dictionary cites his use of the words "vector" and "stack" in a computing context. • Believed programming should be taught without computers • 1972 Turing Award • “In their capacity as a tool, computers will be but a ripple on the surface of our culture. In their capacity as intellectual challenge, they are without precedent in the cultural history of mankind.”
Dijkstra’s Algorithm for Single Source Shortest Path • Classic algorithm for solving shortest path in weighted graphs (with onlypositive edge weights) • Similar to breadth-first search, but uses a priority queue instead of a FIFO queue: • Always select (expand) the vertex that has a lowest-cost path to the start vertex • a kind of “greedy” algorithm • Correctly handles the case where the lowest-cost (shortest) path to a vertex is not the one with fewest edges
Pseudocode for Dijkstra Initialize the cost of each vertex to cost[s] = 0; heap.insert(s); While (! heap.empty()) n = heap.deleteMin() For (each vertex a which is adjacent to n along edge e) if (cost[n] + edge_cost[e] < cost[a]) then cost [a] = cost[n] + edge_cost[e] previous_on_path_to[a] = n; if (a is in the heap) then heap.decreaseKey(a) else heap.insert(a)
Important Features • Once a vertex is removed from the head, the cost of the shortest path to that node is known • While a vertex is still in the heap, another shorter path to it might still be found • The shortest path itself from s to any node a can be found by following the pointers stored in previous_on_path_to[a]
2 2 3 B A F H 1 1 2 1 4 10 9 4 G C 8 2 D 1 E 7 Dijkstra’s Algorithm in Action
Data Structures for Dijkstra’s Algorithm |V| times: Select the unknown node with the lowest cost findMin/deleteMin O(log |V|) |E| times: a’s cost = min(a’s old cost, …) decreaseKey O(log |V|) runtime: O(|E| log |V|)
CSE 326: Data StructuresLecture 8.BHeuristic Graph Search Henry Kautz Winter Quarter 2002
Homework Hint - Problem 4 You can turn in a final version of your answer to problem 4 without penalty on Wednesday.
Outline • Best First Search • A* Search • Example: Plan Synthesis • This material is NOT in Weiss, but is important for both the programming project and the final exam!
Huge Graphs • Consider some really huge graphs… • All cities and towns in the World Atlas • All stars in the Galaxy • All ways 10 blocks can be stacked Huh???