CS305/503, Spring 2009 Graphs. Michael Barnathan. Here’s what we’ll be learning:. Data Structures: Graphs. Theory: Graph nomenclature (there is a lot of it). Depth-first search. Breadth-first search. Best-first search. Review: Trees.
CS305/503, Spring 2009 • Graphs • Michael Barnathan
Here’s what we’ll be learning: • Data Structures: • Graphs. • Theory: • Graph nomenclature (there is a lot of it). • Depth-first search. • Breadth-first search. • Best-first search.
Review: Trees • A tree is a data structure in which every node points to a set of “children”. • A binary tree is a special case in which a node may contain up to 2 children. • Each node has exactly one parent, except the root, which has no parent. • There is thus only one unique path to every node. • This is nice; it simplifies many of the algorithms. • You very seldom need to backtrack.
Unique Paths • [Figures: two drawings of nodes 1 through 5, one labeled “This is not a tree” and one labeled “This is a tree”.] • In the drawing that is not a tree, node 4 has two parents and there are two ways to access it.
There goes another assumption! • What if we get rid of the assumption that each node has one parent and one path? • We’re not assuming much anymore… now we’re just looking at connected nodes. • [Figure: nodes 1 through 5 connected to one another, captioned “Weird.”]
Graphs • This data structure is called a graph. • It is one of the most general data structures: many others are special cases of it. • Trees are special cases of graphs. • Linked lists are special cases of graphs. • Formally, a graph is simply a set of nodes V connected by a set of lines E: G = <V,E>. • The nodes are called vertices. • The lines connecting them are edges. • The number of edges touching (incident to) a vertex is called the degree of that vertex.
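To make the definition G = <V,E> concrete, here is a minimal sketch of an undirected graph stored as adjacency lists, with a method that reports a vertex's degree. The class name `SimpleGraph` and its methods are hypothetical names chosen for illustration; they are not part of the course code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not from the course materials.
final class SimpleGraph {
    // Maps each vertex label to the list of vertices it shares an edge with.
    private final Map<Integer, List<Integer>> adjacency = new HashMap<>();

    // Adds a vertex with no edges if it is not already present.
    void addVertex(int v) {
        adjacency.putIfAbsent(v, new ArrayList<>());
    }

    // Adds an undirected edge: each endpoint records the other.
    void addEdge(int a, int b) {
        addVertex(a);
        addVertex(b);
        adjacency.get(a).add(b);
        adjacency.get(b).add(a);
    }

    // Degree = number of edge endpoints at v (a self-loop contributes two).
    int degree(int v) {
        List<Integer> neighbors = adjacency.get(v);
        return neighbors == null ? 0 : neighbors.size();
    }

    // Size of the vertex set V.
    int vertexCount() {
        return adjacency.size();
    }
}
```

For example, after `addEdge(1, 2)` and `addEdge(2, 3)`, vertex 2 has degree 2 and vertices 1 and 3 have degree 1. An adjacency list is only one possible representation.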
Example • [Figure: a graph G drawn on five vertices, with its vertices and edges labeled.] • V = {1, 2, 3, 4, 5} • E = (the edges drawn in the figure)
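Why are they useful? • Networks: • Computer networks (routers!) • Social networks. • Spread of disease. • [Figure: a network of people labeled You, Alice, Bob, Mallory, and Trudy.] • Roads, paths, travel: • [Figure: a street graph with vertices labeled 71, Larchwood, Woodland, Jonathon, and Palmer.]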
Undirected Graphs • [Figure: the street graph of 71, Larchwood, Woodland, Jonathon, and Palmer.] • These are all two-way streets. Traffic can flow both ways. We can turn from 71 onto Larchwood, or Larchwood onto 71. • The graph is therefore called undirected. The edges can be traversed in either direction.
Directed Graphs • [Figure: the street graph of 71, Larchwood, Woodland, Jonathon, and Palmer, with an arrow on the one-way edge.] • What if Larchwood were one way only? • You could not turn onto 71 from Larchwood, but could turn onto Larchwood from 71. • This is represented by adding arrows to edges to signify that the edge only flows one way. Edges cannot be traversed against the direction of the arrow. • These are called directed edges and a graph containing at least one of them is called a directed graph or digraph.
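The difference between a one-way and a two-way street can be sketched in code: a directed edge is recorded at only one endpoint, while an undirected edge is recorded at both. The class `StreetMap` and its method names are hypothetical. Only the 71 and Larchwood turn comes from the slide; any other streets you add are your own assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not from the course materials.
final class StreetMap {
    // turns.get(s) holds every street you may turn onto from street s.
    private final Map<String, Set<String>> turns = new HashMap<>();

    // Directed edge: traffic may flow from 'from' to 'to' only.
    void addOneWay(String from, String to) {
        turns.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        turns.computeIfAbsent(to, k -> new HashSet<>());
    }

    // Undirected edge, stored as a pair of directed edges.
    void addTwoWay(String a, String b) {
        addOneWay(a, b);
        addOneWay(b, a);
    }

    // True if the edge from 'from' to 'to' may be traversed.
    boolean canTurn(String from, String to) {
        Set<String> out = turns.get(from);
        return out != null && out.contains(to);
    }
}
```

With `addOneWay("71", "Larchwood")`, `canTurn("71", "Larchwood")` is true while `canTurn("Larchwood", "71")` is false. With `addTwoWay`, both directions are allowed.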
Cycles • It is possible for a graph to loop back on itself, directly or indirectly. • The loop is called a cycle or closed walk. • The number of vertices in the loop is called the length of the cycle. • A graph with cycles is known as a cyclic graph, while one that contains none is called acyclic. • [Figure: two examples, a cycle of length 1 (vertex 1 looping back to itself) and a cycle of length 3 (through vertices 1, 2, and 3).]
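One way to test whether a directed graph is cyclic is to walk it depth-first and check whether the walk ever returns to a vertex that is still on the current path. The sketch below assumes the graph is given as a map from each vertex to the vertices its edges point to. `CycleCheck` and `hasCycle` are hypothetical names.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not from the course materials.
final class CycleCheck {
    // adjacency.get(v) lists the vertices that v has a directed edge to.
    static boolean hasCycle(Map<Integer, List<Integer>> adjacency) {
        Set<Integer> finished = new HashSet<>(); // fully explored vertices
        Set<Integer> onPath = new HashSet<>();   // vertices on the current walk
        for (Integer v : adjacency.keySet()) {
            if (visit(v, adjacency, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(Integer v, Map<Integer, List<Integer>> adjacency,
                                 Set<Integer> finished, Set<Integer> onPath) {
        if (onPath.contains(v)) return true;    // walked back onto the current path
        if (finished.contains(v)) return false; // explored earlier without finding a cycle
        onPath.add(v);
        for (Integer next : adjacency.getOrDefault(v, Collections.emptyList())) {
            if (visit(next, adjacency, finished, onPath)) return true;
        }
        onPath.remove(v);
        finished.add(v);
        return false;
    }
}
```

A vertex with an edge to itself (length 1) and a triangle 1 → 2 → 3 → 1 (length 3) both make `hasCycle` return true. A tree whose edges point from parent to child returns false.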
Trees • Since edges point only from parent to child (you don’t have a pointer back to the parent), trees are directed acyclic graphs. • [Figure: a tree on nodes 1 through 5.]
Connected Components • It is possible for some vertices to be isolated from others within the same graph: • [Figure: vertices 1 through 5 drawn in separate groups, captioned “This is one graph.”] • Each group is called a connected component. Formally, two vertices are in the same connected component if one may be reached from the other. A connected graph has only one connected component. • A strongly connected component is a group in which every vertex in the group can be reached from every other vertex in the group. • Question: are the connected components of the graph shown in the figure strongly connected? Why or why not?
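Counting connected components can be sketched as repeated traversals: start from any vertex not yet seen, mark everything reachable from it, and count how many fresh starts are needed. This sketch assumes an undirected graph in which every vertex appears as a key and each edge is listed at both endpoints. `Components.count` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not from the course materials.
final class Components {
    // Assumes an undirected graph: if b is in adjacency.get(a), a is in adjacency.get(b).
    static int count(Map<Integer, List<Integer>> adjacency) {
        Set<Integer> seen = new HashSet<>();
        int components = 0;
        for (Integer start : adjacency.keySet()) {
            if (!seen.add(start)) {
                continue; // already reached from an earlier start vertex
            }
            components++; // 'start' belongs to a component not seen before
            Deque<Integer> pending = new ArrayDeque<>();
            pending.push(start);
            while (!pending.isEmpty()) {
                Integer v = pending.pop();
                for (Integer n : adjacency.getOrDefault(v, Collections.emptyList())) {
                    if (seen.add(n)) {
                        pending.push(n);
                    }
                }
            }
        }
        return components;
    }
}
```

For a graph with edges 1-2 and 2-3 in one group and 4-5 in another (edges made up for this example), `count` returns 2.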
Path Length • A traversal starting at one vertex and ending at another is called a path. • The number of edges traversed to get from the start to the end vertex is the path length. • The minimal path length between two vertices is the length of the shortest path that connects them.
Path Length Example • What is the shortest path from 71 to Palmer? • [Figure: the street graph of 71, Larchwood, Woodland, Jonathon, and Palmer, annotated with the numbers 1, 2, 2, 2, 3, and 3.]
The Problem With Path Length • Of course, not all roads are created equal. • Which is closer, Colorado or West Virginia? Path Length = 27. Path Length = 30. Colorado, here we come!
Weighted Path Length • In order to represent things like distance (I-95 != Route 36) or “cost” of walking down a certain path, we can assign weights to edges. • Instead of counting each edge as “1”, we count it by its weight: • [Figure: the street graph of 71, Larchwood, Woodland, Jonathon, and Palmer, with edge weights 0.4, 0.4, 0.2, 0.3, 0.2, and 0.2.] • Shortest path length: 0.4 + 0.3 = 0.7 mi
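The weighted length of a given path is just the sum of the weights of its edges. The sketch below shows that computation. The class `WeightedPath`, the nested-map layout, and the vertex names "A", "B", and "C" in the usage note are assumptions made for illustration, since the slide's figure does not survive in this transcript.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not from the course materials.
final class WeightedPath {
    // weights.get(u).get(v) is the weight of the edge from u to v.
    static double length(Map<String, Map<String, Double>> weights, List<String> path) {
        double total = 0.0;
        for (int i = 0; i + 1 < path.size(); i++) {
            String from = path.get(i);
            String to = path.get(i + 1);
            Map<String, Double> out = weights.get(from);
            Double w = (out == null) ? null : out.get(to);
            if (w == null) {
                throw new IllegalArgumentException("no edge from " + from + " to " + to);
            }
            total += w; // count the edge by its weight instead of as 1
        }
        return total;
    }
}
```

With an edge A to B of weight 0.4 and an edge B to C of weight 0.3, the path A, B, C has weighted length of about 0.7 (up to floating-point rounding), while its unweighted path length is 2.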
Weighted Path Length • Edge weights (and therefore path lengths) can also be negative in some cases (maybe a certain road bypasses traffic and saves you driving time?) • Finding the shortest path length is obviously an important problem. • If you’re UPS, you want your truck drivers to deliver packages on time in as short a distance as possible (to conserve fuel). • If you are routing a packet, you want to select the fastest route that can get it to its destination. • Intuitively, how would you find the shortest weighted path length between two vertices? • We’ll give some formal strategies for this next time.
Traversing a Graph. • Very often, we will want to scan the vertices of a graph (for example, to find the path length). • There are three common ways of traversing a graph: • Depth-first. • Breadth-first. • Best-first. • There are also popular variations on best-first search, such as A* search, which are used frequently in AI. • A “root” (vertex to start at) must be selected in order to give the traversal a place to begin.
Depth-First Search • On a tree, DFS is equivalent to a preorder traversal. Because graphs may be cyclic, it requires keeping track of which vertices were visited. • The idea: when encountering an unvisited vertex, traverse down it immediately. • Only once that traversal finishes do you traverse down the remaining edges of the current vertex. • This is usually done recursively.
DFS Example • [Figure: a graph whose vertices are numbered 1 through 5 in the order DFS visits them, beginning at the vertex marked “Start”.] • When we traverse 3, 3 becomes the new current vertex. We then traverse its edges (to 4) before returning and finishing up with 2’s other neighbor (5).
DFS Algorithm
void dfs(Vertex v) {
    if (v == null)
        return;
    visit(v);          // We can do anything with v here.
    v.visited = true;  // Mark v so that cycles do not lead back into it.
    for (Edge e : v.edges()) {
        Vertex next = e.getOtherVertex(v);
        if (!next.visited)  // visited is a boolean field on Vertex.
            dfs(next);
    }
}
Breadth First Search • Where depth-first search scanned down the entire path before checking additional edges, breadth-first search does the opposite. • Idea: scan every vertex adjacent to the current one before traversing into any of them. • Whereas DFS used a stack to traverse (you did realize it was using the system stack to keep track of the history, right?), BFS uses a queue. • Also, while DFS is usually written recursively, BFS is usually written iteratively.
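To make the stack that DFS relies on visible, here is a sketch of depth-first search written iteratively with an explicit stack instead of recursion. It works on a plain adjacency map rather than the course's `Vertex` and `Edge` classes, and `IterativeDfs.order` is a hypothetical name. Swapping the stack for a queue is the essential change that turns this into breadth-first search.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not from the course materials.
final class IterativeDfs {
    // Returns the vertices reachable from 'start' in depth-first visiting order.
    static List<Integer> order(Map<Integer, List<Integer>> adjacency, int start) {
        List<Integer> visitOrder = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>(); // plays the role of the system stack
        stack.push(start);
        while (!stack.isEmpty()) {
            int v = stack.pop();
            if (!visited.add(v)) {
                continue; // v was already visited through another edge
            }
            visitOrder.add(v);
            List<Integer> neighbors = adjacency.getOrDefault(v, Collections.emptyList());
            // Push in reverse so the first-listed neighbor is popped (explored) first.
            for (int i = neighbors.size() - 1; i >= 0; i--) {
                if (!visited.contains(neighbors.get(i))) {
                    stack.push(neighbors.get(i));
                }
            }
        }
        return visitOrder;
    }
}
```

On an undirected graph with edges 1-2, 2-3, 3-4, and 2-5 (an assumed reading of the DFS example slide), starting at 1 gives the order 1, 2, 3, 4, 5: the search goes all the way down through 3 to 4 before coming back for 5.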
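BFS Example • [Figure: a graph whose vertices are numbered 1 through 5 in the order BFS visits them, beginning at the vertex marked “Start”.] • All of 2’s adjacent vertices (3 and 4) are labeled before we traverse into 3 and check its adjacent vertices (5).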
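BFS Algorithm
// Requires: import java.util.ArrayDeque; import java.util.Queue;
void bfs(Vertex v) {
    if (v == null)
        return;
    // Queue is an interface in Java; ArrayDeque is one implementation.
    Queue<Vertex> vqueue = new ArrayDeque<Vertex>();
    vqueue.add(v);     // Start with the start vertex.
    v.visited = true;
    while (!vqueue.isEmpty()) {
        v = vqueue.remove();  // Dequeue the next element and store it in v.
        visit(v);             // We can do anything with v here.
        for (Edge e : v.edges()) {
            Vertex next = e.getOtherVertex(v);
            if (!next.visited) {      // visited is a boolean field on Vertex.
                vqueue.add(next);
                next.visited = true;  // Mark on enqueue so no vertex is queued twice.
            }
        }
    }
}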
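Because breadth-first search reaches vertices in order of how many edges away they are, a small extension of it yields the minimal path length from the start vertex to every reachable vertex. The sketch below records a distance instead of a visited flag. It uses a plain adjacency map rather than the course's `Vertex` and `Edge` classes, and `BfsDistance.from` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration class; not from the course materials.
final class BfsDistance {
    // Returns the minimal number of edges from 'start' to each reachable vertex.
    static Map<Integer, Integer> from(Map<Integer, List<Integer>> adjacency, int start) {
        Map<Integer, Integer> distance = new HashMap<>(); // doubles as the visited set
        Queue<Integer> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int n : adjacency.getOrDefault(v, Collections.emptyList())) {
                if (!distance.containsKey(n)) {
                    distance.put(n, distance.get(v) + 1); // one edge farther than v
                    queue.add(n);
                }
            }
        }
        return distance;
    }
}
```

On an undirected graph with edges 1-2, 2-3, 2-4, and 3-5 (an assumed reading of the BFS example slide), starting at 1 gives distance 1 for vertex 2, distance 2 for vertices 3 and 4, and distance 3 for vertex 5. This handles unweighted path length only; weighted shortest paths need a different algorithm.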
Best First Search • Best-first search uses a user-chosen heuristic function which ranks nodes based on how “promising” they are in achieving a goal. • The heuristic function may be based on the value or position of the vertex or weight of the edges. • For example, in a game of checkers, a move that results in jumping an opponent’s piece may be ranked highly by the heuristic function, since it makes progress towards attaining a goal (winning the game). • Best-first search always chooses the “best” next move at each step. • What do we call those sorts of algorithms again? • Whereas a stack is used in depth-first search and a queue is used in breadth-first search, a priority queue can be used in best-first search. • The priority would be how “good” a vertex is ranked. • Other than that change, the algorithm is the same as breadth-first search.
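The change from breadth-first to best-first search can be sketched by replacing the queue with a `java.util.PriorityQueue` ordered by the heuristic. In this sketch the heuristic is passed in as a scoring function where a higher score means more promising. That convention, the class name `BestFirst`, and the plain adjacency map are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.ToIntFunction;

// Hypothetical illustration class; not from the course materials.
final class BestFirst {
    // Visits every vertex reachable from 'start', always expanding the
    // highest-scoring vertex currently waiting in the priority queue.
    static List<Integer> order(Map<Integer, List<Integer>> adjacency, int start,
                               ToIntFunction<Integer> score) {
        // Higher score first, so compare b against a.
        Comparator<Integer> bestFirst =
            (a, b) -> Integer.compare(score.applyAsInt(b), score.applyAsInt(a));
        PriorityQueue<Integer> frontier = new PriorityQueue<>(bestFirst);
        Set<Integer> visited = new HashSet<>();
        List<Integer> visitOrder = new ArrayList<>();
        visited.add(start);
        frontier.add(start);
        while (!frontier.isEmpty()) {
            int v = frontier.remove(); // the most promising vertex seen so far
            visitOrder.add(v);
            for (int n : adjacency.getOrDefault(v, Collections.emptyList())) {
                if (visited.add(n)) { // mark on enqueue, as in BFS
                    frontier.add(n);
                }
            }
        }
        return visitOrder;
    }
}
```

With edges 1 → 2, 1 → 3, 2 → 4, and 3 → 5 and a made-up heuristic that scores each vertex by its own label, starting at 1 visits 1, 3, 5, 2, 4: vertex 3 outranks 2, and 5 then outranks the still-waiting 2.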
A Classical Problem • This is called the “7 Bridges of Konigsberg”. You may have seen it on IQ tests. • Euler first solved it in 1736. We’ll walk through his solution. • The problem: find a route that allows you to cross each of the 7 bridges exactly once, or demonstrate that none exists.
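Euler’s Solution • The configuration of the city is irrelevant; only how one can move from one part of it to another is important. • So we can represent the problem as a graph. • [Figure: the city map beside its graph, with land masses A, B, C, and D as vertices and the bridges as edges labeled AB, BD, AD, CD, and AC.]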
As “States” • Think of each vertex as a “state”. • One bridge is required to enter that state. • One bridge is required to leave that state. • Clearly, there would be no solution (except to swim!) if the graph were not connected. • Because the graph is connected, a solution will involve both entering and leaving every state at least once. • With potentially two exceptions: the state you start in (you don’t need to enter it) and the one you end in (you don’t need to leave it). • Each pass through a state uses up two bridges (one in, one out), and every bridge must be used exactly once, so such a state must touch an even number of bridges. • This means that either every vertex or every vertex but the starting and ending vertices must have an even degree for this to work! • There are four vertices in this graph: A, B, C, and D. • We can start at any one of them and finish at any one of them. • At most two of the four vertices may have an odd degree, so if we can find any two vertices with an even degree in this graph, we can cross each bridge exactly once. Otherwise, we can’t.
Degrees of Each Vertex • Recall: The degree of a vertex is the number of edges touching (incident to) that vertex. • What is the degree of each of the four vertices? • [Figure: the bridge graph on vertices A, B, C, and D.]
Eulerian Graphs • A has degree 5, all others have degree 3. • In order to cross each bridge exactly once, either all vertices in the graph or all but two vertices in the graph must have even degrees. • A path that crosses each edge in a graph exactly once is called an “Eulerian path”. • In these slides, connected graphs that satisfy the above condition (i.e. they have an Eulerian path) are called Eulerian graphs. (Many texts reserve “Eulerian” for graphs in which every degree is even and call the two-odd-vertex case “semi-Eulerian”.) • Every vertex in this graph has an odd degree. • So this graph is not Eulerian. • Therefore, it is impossible to walk each bridge exactly once.
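The degree test can be sketched in a few lines: count how many edges touch each vertex, then count the vertices whose degree is odd. The class `EulerCheck` and its methods are hypothetical names. The bridge list in the usage note is the classical layout (two A-B bridges, two A-C bridges, and one each of A-D, B-D, and C-D), which matches the degrees stated on the slide.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not from the course materials.
final class EulerCheck {
    // Each edge is a two-element array {u, v}; parallel edges are listed once per copy.
    static Map<String, Integer> degrees(String[][] edges) {
        Map<String, Integer> degree = new TreeMap<>();
        for (String[] e : edges) {
            degree.merge(e[0], 1, Integer::sum);
            degree.merge(e[1], 1, Integer::sum);
        }
        return degree;
    }

    // Degree condition from the slides: zero or exactly two vertices of odd degree.
    // Assumes the graph is connected; connectivity is NOT checked here.
    static boolean meetsDegreeCondition(String[][] edges) {
        int odd = 0;
        for (int d : degrees(edges).values()) {
            if (d % 2 != 0) {
                odd++;
            }
        }
        return odd == 0 || odd == 2;
    }
}
```

For the seven bridges, `degrees` reports A = 5 and B = C = D = 3, so four vertices have odd degree and `meetsDegreeCondition` returns false. A simple chain A-B, B-C has two odd vertices (A and C) and passes.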
A Bridge Too Far • We discussed some basic graph theory today. • Next time, we’ll cover algorithms for finding the shortest path between two vertices and an alternate representation of a graph. • The lesson: • Particularly in mathematics, it is possible to simplify a problem by removing irrelevant information. The clutter may make a problem seem more difficult than it really is.
Assignment 4 • This assignment will have you writing a heap from scratch. • You may not use the Java Set or Map classes for this assignment. • The assignment handout is located on the course website. • The deadline is next Tuesday, April 14.