Learn about graphs, paths, connectivity, and graph representations in computational science and artificial intelligence. Dive into adjacency sets, degrees, graph insertion, complexity, search algorithms, and more.
Intro to Computation & AI Dr. Jill Fain Lehman School of Computer Science Lecture 4: November 13, 1997
Graph Basics • In general a graph consists of a set of nodes/vertices V and a set of edges E. • Note: a tree is a special type of graph. • Either v ∈ V or e ∈ E may be a complex structure with additional information associated with it. (Figure: example graph with vertices v1 to v6 and edges e1 to e8.)
Examples (Figures: a graph of cities pgh, nyc, bos, la, sf and no joined by weighted edges; a graph whose nodes are labeled with parts of speech: article, adj, noun, 's, verb.)
Graph Formalism • G = (V, E) where G is a graph, V a set of vertices and E a set of edges, such that e ∈ E iff e = (v1, v2), v1, v2 ∈ V. • If G is undirected, then e = (v1, v2) implies e = (v2, v1), i.e. vertices are unordered. • If G is directed (digraph) then (v1, v2) are ordered: v1 is the origin, v2 is the terminus or destination. (Figures: an undirected edge and a directed edge between v1 and v2.)
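The formalism above can be sketched in a few lines of Python. This is only an illustration: the names V, E_directed, E_undirected and is_valid_graph are made up for the example and do not come from the lecture.

```python
# Hypothetical example data; none of these names appear in the slides.
V = {"v1", "v2", "v3"}

# Directed graph: each edge is an ordered pair (origin, terminus).
E_directed = {("v1", "v2"), ("v2", "v3")}

# Undirected graph: each edge is an unordered pair, modeled here as a frozenset.
E_undirected = {frozenset(e) for e in E_directed}

def is_valid_graph(vertices, edges):
    """Check that both endpoints of every edge belong to the vertex set."""
    return all(endpoint in vertices for edge in edges for endpoint in edge)

print(is_valid_graph(V, E_directed))            # True
print(("v2", "v1") in E_directed)               # False: order matters
print(frozenset(("v2", "v1")) in E_undirected)  # True: order does not matter
```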
Paths, Adjacency, Cycles • Two vertices vi and vj are adjacent if there exists an edge e ∈ E such that e = (vi, vj). • A path p is a sequence of vertices of V of the form p = v1 v2 ... vn (n >= 2) in which each vertex vi is adjacent to vi+1 (for 1 <= i <= n-1). • A cycle is a path p = v1 v2 ... vn such that v1 = vn. (Figure: example graph on vertices A, B, C, D.)
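As a rough check of the path and cycle definitions, the sketch below tests a vertex sequence against an edge set. The function names and the triangle example are assumptions made for illustration; an undirected graph is modeled by storing each edge in both directions.

```python
def is_path(seq, edges):
    """True if seq = v1 ... vn with n >= 2 and each vi adjacent to vi+1."""
    if len(seq) < 2:
        return False
    return all((a, b) in edges for a, b in zip(seq, seq[1:]))

def is_cycle(seq, edges):
    """A cycle is a path whose first and last vertices coincide."""
    return is_path(seq, edges) and seq[0] == seq[-1]

# Hypothetical undirected triangle A-B-C, stored in both directions.
edges = {("A", "B"), ("B", "C"), ("C", "A")}
edges |= {(b, a) for a, b in edges}

print(is_path(["A", "B", "C"], edges))        # True
print(is_cycle(["A", "B", "C"], edges))       # False
print(is_cycle(["A", "B", "C", "A"], edges))  # True
```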
Connectivity • If x ∈ V and y ∈ V, x ≠ y, then x and y are connected if there exists a path p = v1…vn such that x = v1 and y = vn. • For G undirected, a subset S of V is a connected component if for any two distinct vertices x ∈ S, y ∈ S, x is connected to y. • For G directed, a subset S of V is strongly connected if for each pair of distinct vertices vi, vj ∈ S, vi is connected to vj and vj is connected to vi. S is weakly connected if either vi is connected to vj or vj is connected to vi.
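The strong and weak connectivity conditions, as worded on this slide, can be tested directly with a reachability search. The sketch below is one possible reading: paths are allowed to pass through any vertex of G, and all names (reachable, strongly_connected, weakly_connected, cycle, chain) are illustrative assumptions.

```python
def reachable(adj, start):
    """Vertices reachable from start by a path of one or more edges."""
    seen, stack = set(), list(adj.get(start, ()))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj.get(v, ()))
    return seen

def strongly_connected(adj, S):
    """Every ordered pair of distinct vertices in S is joined by a path."""
    return all(y in reachable(adj, x) for x in S for y in S if x != y)

def weakly_connected(adj, S):
    """For every pair in S there is a path in at least one direction."""
    return all(y in reachable(adj, x) or x in reachable(adj, y)
               for x in S for y in S if x != y)

# Hypothetical digraphs: a directed 3-cycle and a directed chain.
cycle = {"a": ["b"], "b": ["c"], "c": ["a"]}
chain = {"a": ["b"], "b": ["c"], "c": []}

print(strongly_connected(cycle, {"a", "b", "c"}))  # True
print(strongly_connected(chain, {"a", "b", "c"}))  # False
print(weakly_connected(chain, {"a", "b", "c"}))    # True
```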
Connectivity Examples (Figures: a strongly connected digraph and a weakly connected digraph.)
Adjacency Sets and Degrees • Let an adjacency set Vx = {y | (x, y) ∈ E}. Then G = (V, A) where A = {Vx | x ∈ V}. • For G undirected, the degree of a vertex x is the number of edges e in which x is one of the endpoints of e. (Figure: undirected graph with 2 components; vertex degree labels d=4, d=3, d=1, d=4, d=0, d=2.)
Degrees for Directed Graphs • If x is a vertex in a digraph G, we can define two sets Pred(x) and Succ(x), the predecessors and successors of x respectively. • Pred(x) = {y | y ∈ V and (y, x) ∈ E}; the size of Pred(x) is called the in-degree of x. • Succ(x) = {y | y ∈ V and (x, y) ∈ E}; the size of Succ(x) is called the out-degree of x. (Figure: digraph with vertices labeled in=0, out=2; in=2, out=0; in=1, out=1.)
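Pred(x) and Succ(x) translate almost literally into set comprehensions. In the sketch below the three-vertex edge set is an assumption, chosen only so that the resulting degrees match the three labels that survive from the slide's figure; the figure's actual edges are not recoverable.

```python
# Hypothetical digraph; vertex names and edges are illustrative only.
V = {"a", "b", "c"}
E = {("a", "b"), ("a", "c"), ("b", "c")}

def pred(x, V, E):
    """Predecessors of x: all y with an edge (y, x)."""
    return {y for y in V if (y, x) in E}

def succ(x, V, E):
    """Successors of x: all y with an edge (x, y)."""
    return {y for y in V if (x, y) in E}

for x in sorted(V):
    print(x, "in =", len(pred(x, V, E)), "out =", len(succ(x, V, E)))
# a in = 0 out = 2
# b in = 1 out = 1
# c in = 2 out = 0
```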
Graph Representations: The Adjacency Matrix • Given G = (V, E), V = v1…vn. Let T[i,j] be a table with n rows and n columns such that row i corresponds to vi and column j to vj (1 <= i,j <= n). Then T[i,j] = 1 iff there exists e ∈ E such that e = (vi, vj), and T[i,j] = 0 iff there exists no e ∈ E such that e = (vi, vj).
Adjacency matrix for G:
      1  2  3  4
  1   0  1  0  0
  2   0  0  1  1
  3   1  0  0  1
  4   1  0  0  0
(Figure: the digraph G on vertices 1, 2, 3, 4.)
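A minimal sketch of filling in T from an edge set follows. The four-vertex edge set is an assumption used for illustration, and the helper name adjacency_matrix is hypothetical. Python lists are 0-indexed, so vertex vi maps to row i-1.

```python
def adjacency_matrix(n, edges):
    """Build an n x n table with T[i][j] = 1 iff (v_{i+1}, v_{j+1}) is an edge."""
    T = [[0] * n for _ in range(n)]
    for i, j in edges:
        T[i - 1][j - 1] = 1  # shift 1-based vertex numbers to 0-based indices
    return T

# Hypothetical directed edge set on vertices 1..4.
E = {(1, 2), (2, 3), (2, 4), (3, 1), (3, 4), (4, 1)}

for row in adjacency_matrix(4, E):
    print(row)
# [0, 1, 0, 0]
# [0, 0, 1, 1]
# [1, 0, 0, 1]
# [1, 0, 0, 0]
```

The table always takes n * n cells regardless of how many edges exist, which is the main trade-off against the edge-list representations.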
Graph Representations: Edge lists
• Vector of linked adjacency lists for G:
  1: 2
  2: 3 4
  3: 1 4
  4: 1
• List of linked adjacency lists for G (basic graphnode := name, nextv, edgelist): the same edge lists, with the vertices themselves chained in a linked list.
(Figure: the digraph G on vertices 1, 2, 3, 4.)
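To relate the two representations, the sketch below converts an adjacency matrix into a vector of adjacency lists. The matrix T is an assumed example and the function name is hypothetical; a Python dict of lists stands in for the vector of linked lists.

```python
def matrix_to_adjacency_lists(T):
    """Map each vertex number (1-based) to the list of its successors."""
    return {
        i + 1: [j + 1 for j, bit in enumerate(row) if bit]
        for i, row in enumerate(T)
    }

# Hypothetical 4-vertex adjacency matrix.
T = [
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]

print(matrix_to_adjacency_lists(T))
# {1: [2], 2: [3, 4], 3: [1, 4], 4: [1]}
```

The lists use space proportional to |V| + |E| rather than |V| squared, at the cost of a linear scan to test whether a particular edge exists.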
Graph Insertion
Given
  G: a list of graphnodes
  v: a graphnode
  edge: a pair of graphnodes, x and y
And assume listinsert inserts only if not there.
InsertEdge(edge, graph)
  listinsert(edge.x, graph)
  listinsert(edge.y, graph)
  listinsert(edge.y.name, edge.x.edgelist)
  return graph
Complexity???
Complexity of Simple Graph Insertion
InsertEdge(edge, graph)
  listinsert(edge.x, graph)                  O(|V|)
  listinsert(edge.y, graph)                  O(|V|)
  listinsert(edge.y.name, edge.x.edgelist)   O(|V|)
  return graph
Complexity: O(|V|) on each call. How many calls? At most |V|^2 (one per possible edge). So, O(|V|^3) to build a graph.
Example
Main() {
  G := null
  For e in ((ny pgh)(ny bos)(bos pgh)) do
    G := InsertEdge(e, G)}
Resulting G (vertex: edgelist):
  ny: pgh bos
  pgh:
  bos: pgh
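The InsertEdge pseudocode and the example run can be approximated in Python as follows. This is a sketch under assumptions: a Python list replaces the linked list of graphnodes, the nextv field is dropped because the list order plays that role, edges are given as pairs of names, and GraphNode, find_or_insert and insert_edge are illustrative names.

```python
from dataclasses import dataclass, field

@dataclass
class GraphNode:
    name: str
    edgelist: list = field(default_factory=list)

def find_or_insert(name, graph):
    """Insert a node only if it is not there; linear scan, so O(|V|)."""
    for node in graph:
        if node.name == name:
            return node
    node = GraphNode(name)
    graph.append(node)
    return node

def insert_edge(edge, graph):
    x_name, y_name = edge
    x = find_or_insert(x_name, graph)
    find_or_insert(y_name, graph)
    if y_name not in x.edgelist:  # another linear scan, at most O(|V|)
        x.edgelist.append(y_name)
    return graph

G = []
for e in [("ny", "pgh"), ("ny", "bos"), ("bos", "pgh")]:
    G = insert_edge(e, G)

print([(n.name, n.edgelist) for n in G])
# [('ny', ['pgh', 'bos']), ('pgh', []), ('bos', ['pgh'])]
```

Replacing the lists with a dict that maps each name to a set of successors would make each insertion take expected constant time, at the cost of losing insertion order inside each edge list.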
Graph Search • Basic idea: to search a graph G, we want to visit all G’s vertices in a systematic order (we’ll use the adjacency list). • Will need to designate some v ∈ V as the start vertex. • Will need to mark each vertex we’ve visited as seen in order to detect cycles; so we add the field visited (boolean) to the basic graphnode definition.
Recursive DFS
ExhaustiveDFS(v) {
  v.visited := true
  for w in v.edgelist do
    if w.visited = false then ExhaustiveDFS(w)}
main() {
  ExhaustiveDFS(v0)}
What if G has multiple components, or G has one component but is weakly connected?
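One common answer to the question above is to restart the search from every vertex that is still unvisited. The Python sketch below shows that idea; it is not taken from the lecture. It keeps a visited set instead of a visited field, and the graph with the extra isolated vertex E is a made-up example.

```python
def exhaustive_dfs(adj, v, visited, order):
    """Recursive depth-first search from v, recording the visit order."""
    visited.add(v)
    order.append(v)
    for w in adj[v]:
        if w not in visited:
            exhaustive_dfs(adj, w, visited, order)

def dfs_all(adj):
    """Restart the search from each unvisited vertex so no component is missed."""
    visited, order = set(), []
    for v in adj:
        if v not in visited:
            exhaustive_dfs(adj, v, visited, order)
    return order

# Hypothetical graph: A, B, C, D mutually adjacent, plus an isolated vertex E.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "A"],
    "E": [],
}

print(dfs_all(adj))  # ['A', 'B', 'C', 'D', 'E']
```

A single call to exhaustive_dfs starting at A would never reach E; the outer loop in dfs_all is what covers the second component.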
Example (Figure on each of the following slides: graph on vertices A, B, C, D with edgelists A: (B C D), B: (A C D), C: (A B D), D: (B C A).)
EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)

Example
EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)
      B.visited := true
      for unvisited w in (A C D) do
        EDFS(C)

Example
EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)
      B.visited := true
      for unvisited w in (A C D) do
        EDFS(C)
          C.visited := true
          for unvisited w in (A B D) do
            EDFS(D)

Example
EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)
      B.visited := true
      for unvisited w in (A C D) do
        EDFS(C)
          C.visited := true
          for unvisited w in (A B D) do
            EDFS(D)
              D.visited := true
              No unvisited w in (B C A) so function returns

Example
EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)
      B.visited := true
      for unvisited w in (A C D) do
        EDFS(C)
          C.visited := true
          No unvisited w so return

Example
EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)
      B.visited := true
      No unvisited w in (D) so return

Example
EDFS(A)
  A.visited := true
  No unvisited w in (C D) so return

How would you change EDFS to visit nodes “breadth first”?
ExhaustiveDFS(v) {
  v.visited := true
  for w in v.edgelist do
    if w.visited = false then ExhaustiveDFS(w)}
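One standard way to answer the breadth-first question is to replace the recursion (an implicit stack) with an explicit first-in, first-out queue. The sketch below assumes that approach; the function name bfs and the small example graph are illustrative, not from the lecture.

```python
from collections import deque

def bfs(adj, start):
    """Visit vertices in order of increasing distance (in edges) from start."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()       # oldest discovered vertex first
        order.append(v)
        for w in adj[v]:
            if w not in visited:  # mark when enqueued to avoid duplicates
                visited.add(w)
                queue.append(w)
    return order

# Hypothetical graph on which the depth-first order would be A, B, D, C.
adj = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}

print(bfs(adj, "A"))  # ['A', 'B', 'C', 'D']
```

Both neighbors of A are visited before D, which lies two edges away, whereas the recursive version would descend to D before returning to C.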
Shortest Path • For many problems the best representation is a directed graph with weighted edges representing, e.g., distance, time, cost. • Dijkstra’s shortest path algorithm finds the lowest cost path in O(n^2) time for a graph with n vertices, provided no edge weight is negative. (Figure: weighted digraph of tasks: Read assignment, Simulate by hand, Write pseudocode, Write/debug Java, Go to TA hours; edge weights 1, 1, 2, 3, 6, 16, 24.)
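A minimal sketch of Dijkstra's algorithm in its simple O(n^2) form is shown below: each round does a linear scan for the closest unfinished vertex instead of using a priority queue. The graph, the vertex names and the function name are assumptions for illustration, and all weights must be non-negative.

```python
import math

def dijkstra(weights, source):
    """Return the lowest path cost from source to every vertex.

    weights maps each vertex to a dict {successor: non-negative edge cost};
    every vertex must appear as a key, even if it has no outgoing edges.
    """
    dist = {v: math.inf for v in weights}
    dist[source] = 0
    done = set()
    while len(done) < len(weights):
        # Linear scan for the closest unfinished vertex: O(n) per round.
        u = min((v for v in weights if v not in done), key=dist.get)
        if dist[u] == math.inf:
            break  # the remaining vertices are unreachable
        done.add(u)
        for w, cost in weights[u].items():
            if dist[u] + cost < dist[w]:
                dist[w] = dist[u] + cost
    return dist

# Hypothetical weighted digraph.
weights = {
    "s": {"a": 1, "b": 4},
    "a": {"b": 2, "t": 6},
    "b": {"t": 3},
    "t": {},
}

print(dijkstra(weights, "s"))  # {'s': 0, 'a': 1, 'b': 3, 't': 6}
```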
PERT/CPM • Program Evaluation and Review Technique (PERT) charts use a graph to encode: • tasks as vertices • dependencies among tasks as edges • duration of task as weight on edge • A critical path on a PERT chart is a path p from a start vertex to an end vertex such that if the completion time of any task along p slips by ΔT then the project also slips by ΔT. • PERT/CPM uses a DAG and topological ordering.
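The role of the DAG and the topological ordering can be sketched as follows: process vertices in topological order and keep, for each one, the length of the longest weighted path from the start. The longest such path to the end vertex is the critical path length. The task names, durations and function names below are invented for the example, and Kahn's algorithm is used as one possible way to get the ordering.

```python
from collections import deque

def topological_order(succ):
    """Kahn's algorithm: repeatedly remove a vertex with no remaining predecessors."""
    indeg = {v: 0 for v in succ}
    for v in succ:
        for w in succ[v]:
            indeg[w] += 1
    queue = deque(v for v in succ if indeg[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    if len(order) != len(succ):
        raise ValueError("graph has a cycle, so it is not a DAG")
    return order

def longest_paths(succ, start):
    """Longest weighted path length from start to every vertex of a DAG."""
    longest = {v: float("-inf") for v in succ}
    longest[start] = 0
    for v in topological_order(succ):
        for w, duration in succ[v].items():
            longest[w] = max(longest[w], longest[v] + duration)
    return longest

# Hypothetical project: succ[v] maps each dependent task to the edge duration.
tasks = {
    "start": {"a": 3, "b": 2},
    "a": {"end": 4},
    "b": {"end": 1},
    "end": {},
}

print(topological_order(tasks))             # ['start', 'a', 'b', 'end']
print(longest_paths(tasks, "start")["end"]) # 7
```

Here the critical path is start, a, end: any slip on those two edges delays the project, while the path through b has 4 units of slack.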
The Travelling Salesman Problem (TSP) • Given G, a directed graph with weighted edges, where vertices represent cities, and weights on edges connecting cities give the distance/cost of traveling between those cities. • Problem: Find the minimum cost cycle that visits all the cities in the graph exactly once before returning to the starting point. • The number of possible paths is exponential; can we do better than exhaustively trying all paths?
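To make the cost of exhaustive search concrete, here is a brute-force sketch that tries every ordering of the cities. It assumes a complete graph given as a nested dict of distances; the four cities, their distances and the function name are invented for the example. With n cities it examines (n-1)! tours, so it is only usable for very small inputs.

```python
from itertools import permutations

def tsp_brute_force(dist, start):
    """Try every tour that starts and ends at start; return (cost, tour)."""
    cities = [c for c in dist if c != start]
    best_cost, best_tour = float("inf"), None
    for perm in permutations(cities):
        tour = (start, *perm, start)
        cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

# Hypothetical symmetric distances between four cities.
dist = {
    "a": {"b": 1, "c": 4, "d": 3},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"a": 3, "b": 5, "c": 1},
}

cost, tour = tsp_brute_force(dist, "a")
print(cost)  # 7
```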
The Class P • P is the class of all problems that can be solved in polynomial time on a deterministic computer. • Polynomial means O(n^k) for some integer k given a problem of size n. • A deterministic computer makes exactly one choice at any choice point. • All single processor machines and machines with fixed parallelism are deterministic.
The Class NP • NP is the class of all problems that can be solved in polynomial time on a nondeterministic computer. • A nondeterministic computer always makes the correct choice at a choice point (one choice but never backs up). • Alternatively: a nondeterministic computer makes k copies of itself to run in parallel at a k-wise choice point, for all values of k. • Alternatively: a nondeterministic computer can explore a tree of depth d in O(d) time.
Instant Ph.D. Just answer the question: Does P = NP? (Nobody knows)
NP-Completeness • An NP-complete problem is one that can be solved in O(n^k) on a nondeterministic machine, and for which it can be shown that every problem in NP can be reduced to the NP-complete problem using a polynomial time transformation. • Such proofs rely on the definition of a Turing machine. • The concept of NP-completeness is important because: • Showing a polynomial deterministic solution for any NP-complete problem means P = NP. • Proving something is NP-complete (or NP-hard) means you’re not likely to find a polynomial algorithm.
Proving a Problem is NP-Complete • Another way to show your problem is NP-complete is to show that it is in NP and that a known NP-complete problem can be reduced to it in polynomial time. • E.g. the Hamiltonian Circuit problem is known to be NP-complete (find a cycle in a directed graph of n vertices that travels through each vertex exactly once and returns to the start). • Let’s prove (very informally) that TSP is NP-complete.
Step 1: Show TSP is in NP • Show TSP is in NP by giving a nondeterministic solution: • Nondeterministically guess all possible orderings of the |V| vertices and choose the one with minimum cost.
Step 2: Reduce Hamiltonian Circuit to TSP • Given a graph G = (V, E) we turn it into G_TSP by adding a weight of 1 to each edge. • Run our nondeterministic TSP algorithm seeking a path of cost |V|. • G_TSP has a solution iff G has a Hamiltonian Circuit. (Figure: a graph on vertices a, b, c, d and the same graph with weight 1 on every edge.)
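The reduction itself is cheap, which the sketch below tries to make visible: building the TSP instance is a single pass over the edges. The part that searches for a tour of cost |V| is done here by exhaustive search, standing in for the nondeterministic TSP algorithm, so it takes exponential time. All function names and the example graphs are assumptions for illustration.

```python
from itertools import permutations

def to_tsp_instance(edges):
    """The reduction: give every edge of G weight 1 (polynomial time)."""
    return {e: 1 for e in edges}

def has_tour_of_cost(vertices, weights, target):
    """Exhaustive stand-in for the nondeterministic TSP algorithm."""
    vertices = list(vertices)
    start, rest = vertices[0], vertices[1:]
    for perm in permutations(rest):
        tour = [start, *perm, start]
        steps = list(zip(tour, tour[1:]))
        # A tour is usable only if every step is an actual weighted edge.
        if all(s in weights for s in steps) and \
                sum(weights[s] for s in steps) == target:
            return True
    return False

def has_hamiltonian_circuit(vertices, edges):
    """G has a Hamiltonian circuit iff G_TSP has a tour of cost |V|."""
    return has_tour_of_cost(vertices, to_tsp_instance(edges), len(vertices))

# Hypothetical digraphs on four vertices.
V = ["a", "b", "c", "d"]
square = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")}
broken = {("a", "b"), ("b", "c"), ("c", "d")}

print(has_hamiltonian_circuit(V, square))  # True
print(has_hamiltonian_circuit(V, broken))  # False
```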
Step 3: Proof by Contradiction • Now assume TSP is not NP-complete; informally, suppose TSP had a polynomial time solution. • Then we can solve any instance of HC in polynomial time (just convert to TSP, run and read off answer). So HC is in P. • But we know that HC is NP-complete, so it has no polynomial time solution unless P = NP (contradiction). • Thus our assumption must be wrong and TSP is NP-complete.
Who Cares? • Just because you can only think of an exponential solution to a problem doesn’t mean that there isn’t a polynomial time solution (remember the mutilated checkerboard?). • If a problem is in P it is also in NP by definition (similarly, if it’s O(n^2) it’s also O(n^3), etc.) • Reduction of a known NP-complete problem guarantees that there is no polytime solution unless P = NP.
What do you do with an NP-complete problem? • Don’t bother looking for a polynomial time solution; go directly to heuristic search….