530 likes | 540 Views
This outline covers various topics related to graphs, including their representation, search algorithms, and properties such as cycles and connectivity. It also explores different ways to represent graphs, including adjacency matrices and adjacency lists.
E N D
Data Structures and Algorithms Graphs I: Representation and Search Gal A. Kaminka Computer Science Department
Outline • Reminder: Graphs • Directed and undirected • Matrix representation of graphs • Directed and undirected • Sparse matrices and sparse graphs • Adjacency list representation
Graphs • Tuple <V,E> • V is set of vertices • E is a binary relation on V • Each edge is a tuple < v1,v2 >, where v1,v2 in V • |E| =< |V|2
Directed and Undirected Graphs • Directed graph: • < v1, v2 > in E is ordered, i.e., a relation (v1,v2) • Undirected graph: • < v1, v2 > in E is un-ordered, i.e., a set { v1, v2 } • Degree of a node X: • Out-degree: number of edges < X, v2 > • In-degree: number of edges < v1, X > • Degree: In-degree + Out-degree • In undirected graph: number of edges { X, v2 }
Paths • Path from vertex v0 to vertex vk: • A sequence of vertices < v0, v1, …. vk > • For all 0 =< i < k, edge < vi, vi+1 > exists. • Path is of length k • Two vertices x, y are adjacent if < x, y > is an edge • Path is simple if vertices in sequence are distinct. • Cycle: if v0 = vk • < v, v > is cycle of length 1
Connected graphs • Undirected Connected graph: • For any vertices x, y there exists a path xy (= yx) • Directed connected graph: • If underlying undirected graph is connected • Strongly connected directed graph: • If for any two vertices x, y there exist path xy and path yx • Clique: a strongly connected component • |V|-1 =< |E| =< |V|2
Cycles and trees • Graph with no cycles: acyclic • Directed Acyclic Graph: DAG • Undirected forest: • Acyclic undirected graph • Tree: undirected acyclic connected graph • one connected component
Representing graphs • Adjacency matrix: • When graph is dense • |E| close to |V|2 • Adjacency lists: • When graph is sparse • |E| << |V|2
Adjacency Matrix • Matrix of size |V| x |V| • Each row (column) j correspond to a distinct vertex j • “1” in cell < i, j > if there is exists an edge <i,j> • Otherwise, “0” • In an undirected graph, “1” in <i,j> => “1” in <j,i> • “1” in <j,j> means there’s a self-loop in vertex j
Examples 1 1 2 1 2 3 1 0 0 1 2 0 1 0 3 1 1 0 3 2 3 4 1 2 3 4 1 0 1 1 0 2 1 0 0 0 3 1 0 0 0 4 0 0 0 0
Adjacency matrix features • Storage complexity: O(|V|2) • But can use bit-vector representation • Undirected graph: symmetric along main diagonal • AT transpose of A • Undirected: A=AT • In-degree of X: Sum along column X O(|V|) • Out-degree of X: Sum along row X O(|V|) • Very simple, good for small graphs • Edge existence query: O(1)
But, …. • Many graphs in practical problems are sparse • Not many edges --- not all pairs x,y have edge xy • Matrix representation demands too much memory • We want to reduce memory footprint • Use sparse matrix techniques
Adjacency List • An array Adj[ ] of size |V| • Each cell holds a list for associated vertex • Adj[u] is list of all vertices adjacent to u • List does not have to be sorted Undirected graphs: • Each edge is represented twice
Examples 1 1 2 1 3 2 2 3 1 2 3 2 3 4 1 2 3 2 1 3 1 4
Adjacency list features • Storage Complexity: • O(|V| + |E|) • In undirected graph: O(|V|+2*|E|) = O(|V|+|E|) • Edge query check: • O(|V|) in worst case • Degree of node X: • Out degree: Length of Adj[X] O(|V|) calculation • In degree: Check all Adj[] lists O(|V|+|E|) • Can be done in O(1) with some auxiliary information!
Graph Traversals (Search) • We have covered some of these with binary trees • Breadth-first search (BFS) • Depth-first search (DFS) • A traversal (search): • An algorithm for systematically exploring a graph • Visiting (all) vertices • Until finding a goal vertex or until no more vertices Only for connected graphs
Breadth-first search • One of the simplest algorithms • Also one of the most important • It forms the basis for MANY graph algorithms
BFS: Level-by-level traversal • Given a starting vertex s • Visit all vertices at increasing distance from s • Visit all vertices at distance k from s • Then visit all vertices at distance k+1 from s • Then ….
5 2 1 3 8 6 10 7 9 BFS in a binary tree (reminder) BFS: visit all siblings before their descendents 5 2 8 1 3 6 10 7 9
BFS(tree t) • q new queue • enqueue(q, t) • while (not empty(q)) • curr dequeue(q) • visit curr // e.g., print curr.datum • enqueue(q, curr.left) • enqueue(q, curr.right) This version for binary trees only!
BFS for general graphs • This version assumes vertices have two children • left, right • This is trivial to fix • But still no good for general graphs • It does not handle cycles Example.
A B G C E D F Queue: A Start with A. Put in the queue (marked red)
A B G C E D F Queue: A B E B and E are next
A B G C E D F Queue: A B E C G D F When we go to B, we put G and C in the queue When we go to E, we put D and F in the queue
A B G C E D F Queue: A B E C G D F When we go to B, we put G and C in the queue When we go to E, we put D and F in the queue
A B G C E D F Queue: A B EC G D F F Suppose we now want to expand C. We put F in the queue again!
Generalizing BFS • Cycles: • We need to save auxiliary information • Each node needs to be marked • Visited: No need to be put on queue • Not visited: Put on queue when found What about assuming only two children vertices? • Need to put all adjacent vertices in queue
BFS(graph g, vertex s) • unmark all vertices in G • q new queue • mark s • enqueue(q, s) • while (not empty(q)) • curr dequeue(q) • visit curr // e.g., print its data • for each edge <curr, V> • if V is unmarked • mark V • enqueue(q, V)
The general BFS algorithm • Each vertex can be in one of three states: • Unmarked and not on queue • Marked and on queue • Marked and off queue • The algorithm moves vertices between these states
Handling vertices • Unmarked and not on queue: • Not reached yet • Marked and on queue: • Known, but adjacent vertices not visited yet (possibly) • Marked and off queue: • Known, all adjacent vertices on queue or done with
A B G C E D F Queue: A Start with A. Mark it.
A B G C E D F Queue: A B E Expand A’s adjacent vertices. Mark them and put them in queue.
A B G C E D F Queue: AB E C G Now take B off queue, and queue its neighbors.
A B G C E D F Queue: ABE C G D F Do same with E.
A B G C E D F Queue: ABEC G D F Visit C. Its neighbor F is already marked, so not queued.
A B G C E D F Queue: ABECG D F Visit G.
A B G C E D F Queue: ABECGD F Visit D. F, E marked so not queued.
A B G C E D F Queue: ABECGDF Visit F. E, D, C marked, so not queued again.
A B G C E D F Queue: ABECGDF Done. We have explored the graph in order: A B E C G D F.
Interesting features of BFS • Complexity: O(|V| + |E|) • All vertices put on queue exactly once • For each vertex on queue, we expand its edges • In other words, we traverse all edges once • BFS finds shortest path from s to each vertex • Shortest in terms of number of edges • Why does this work?
Depth-first search • Again, a simple and powerful algorithm • Given a starting vertex s • Pick an adjacent vertex, visit it. • Then visit one of its adjacent vertices • ….. • Until impossible, then backtrack, visit another
DFS(graph g, vertex s)Assume all vertices initially unmarked • mark s • visit s // e.g., print its data • for each edge <s, V> • if V is not marked • DFS(G, V)
A B G C E D F Current vertex: A Start with A. Mark it.
A B G C E D F Current: B Expand A’s adjacent vertices. Pick one (B). Mark it and re-visit.
A B G C E D F Current: C Now expand B, and visit its neighbor, C.
A B G C E D F Current: F Visit F. Pick one of its neighbors, E.
A B G C E D F Current: E E’s adjacent vertices are A, D and F. A and F are marked, so pick D.
A B G C E D F Current: D Visit D. No new vertices available. Backtrack to E. Backtrack to F. Backtrack to C. Backtrack to B