Turgay Korkmaz Office: NPB 3.330 Phone: (210) 458-7346 Fax: (210) 458-4437

CS 2123 Data Structures
Ch 16 – Graphs (Networks)
Graph traversals: DFS - BFS

Turgay Korkmaz
Office: NPB 3.330  Phone: (210) 458-7346  Fax: (210) 458-4437
e-mail: korkmaz@cs.utsa.edu  web: www.cs.utsa.edu/~korkmaz

Thanks to Eric S. Roberts, the author of our textbook, for providing some slides/figures/programs. I also used some materials from other textbooks, specifically The Algorithm Design Manual by Steven Skiena and his slides at http://www.cs.sunysb.edu/skiena, so I also thank Steven Skiena for making his presentations available on the Internet.

Disclaimer
• You are strongly encouraged to read Chapter 16 of our textbook, which provides a general interface-based design for the graph abstraction using several other data structures (e.g., list, set) covered in previous chapters.
• Instead of this general abstract design, we will directly implement concrete graph structures and operations. So even though the key concepts are the same, the implementation details will differ from the textbook.

Objectives
• To appreciate the conceptual structure of a graph and its applications
• To learn basic graph theory terminology/notation
• To learn the underlying representations for graphs (adjacency matrix, adjacency list)
• To understand and be able to apply basic graph algorithms:
  • Depth-first search (DFS)
  • Breadth-first search (BFS)
  • Dijkstra's shortest path (DSP)
• To give students the knowledge and skills to comfortably use graph structures and algorithms (or develop new ones) in their research or workplace.

Graph Traversal

[Figure: example graph and its adjacency lists]

    for (i = 1; i <= N; i++) {
        curr = edges[i];
        while (curr) {
            …
            curr = curr->next;
        }
    }

• So far we have processed the nodes/vertices in the order of their node IDs.
• But many graph/network algorithms require us to process the nodes/vertices in an order that takes the connections into account.
• Such algorithms start at some node/vertex and advance from one node to another by moving along the links/arcs/edges, performing some operation on each node.
• The precise nature of the operation depends on the algorithm, but the process of performing that operation is called visiting/discovering the node.
• The process of visiting each node in a graph by moving along its links is called traversing the graph. The goal of a traversal is to visit/discover every node once, and only once, in an order determined by the connections (similar to tree traversal).
• But graphs often have many different paths that lead to the same node, so we need additional bookkeeping to keep track of which nodes have already been visited.
• How can we keep track of the nodes that have already been visited?

Graph Traversal: keep track of visited nodes

    typedef struct edgenode {
        int y;
        int w;                  // weight
        struct edgenode *next;
    } edgenodeT;

    typedef struct {
        edgenodeT *edges[MAXV+1];
        int degree[MAXV+1];
        int visited[MAXV+1];
        int nvertices;
        int nedges;
        int directed;
    } graphT;

[Figure: example graph with its edges, degree, and visited arrays]

Like trees, graphs have several traversal algorithms. We will study two of them:
- Depth-first search (DFS) and
- Breadth-first search (BFS)

DFS: Depth-first search

DFS: Depth-first search (similar to a preorder walk on trees)

    void DisplayTree(nodeT *t)   // PreOrderWalk
    {
        if (t != NULL) {
            printf("%c ", t->key);
            DisplayTree(t->left);
            DisplayTree(t->right);
        }
    }

[Figure: example tree with root t and nodes A to G]

The depth-first strategy for traversing a graph is similar to the preorder traversal of trees and has the same recursive structure. The only additional complication is that graphs, unlike trees, can contain cycles. So we need to make sure that no node is visited more than once during the traversal. Otherwise, the recursive process could go on forever.

DFS: Depth-first search

[Figure: example graph with its edges, degree, and visited arrays]

DFS: Depth-first search

    void DFS_print(graphT *g, int v)
    {
        edgenodeT *pe;

        if (g == NULL) return;
        g->visited[v] = 1;
        printf("%d is visited\n", v);
        pe = g->edges[v];
        while (pe != NULL) {
            if (g->visited[pe->y] == 0) {
                DFS_print(g, pe->y);
            }
            pe = pe->next;
        }
    }

[Figure: example graph with its edges, degree, and visited arrays; after the traversal every visited entry is 1]

Exercise: DFS

Apply DFS_print (written here with TRUE/FALSE in place of 1/0 for the visited flags) to the example graphs.

[Figure: the small example graph and a larger 12-vertex example graph]

Time Complexity: ???

Computational Complexity of DFS

[Figure: the small example graph and the larger 12-vertex example graph]

Time Complexity: O(m + n), where n is the number of vertices and m is the number of edges. DFS_print is called once per reachable vertex (the visited flag prevents a second call), and the while loop walks each vertex's adjacency list exactly once, so the total work over all lists is proportional to the number of edges.
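To make the O(m + n) argument concrete, here is a minimal self-contained sketch of the same recursive DFS that stores the visiting order in an array instead of printing it. The helpers init_graph and insert_edge, the function name dfs_order, the reduced graphT (only the fields needed here), and the value of MAXV are assumptions made for this illustration; they are not taken from the slides.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAXV 100   /* assumed capacity for this sketch */

typedef struct edgenode {
    int y;                   /* neighbor vertex */
    struct edgenode *next;   /* next entry in the adjacency list */
} edgenodeT;

typedef struct {
    edgenodeT *edges[MAXV + 1];  /* adjacency lists, vertices 1..nvertices */
    int visited[MAXV + 1];
    int nvertices;
} graphT;

/* Hypothetical helper: make g an empty graph with n vertices. */
void init_graph(graphT *g, int n)
{
    assert(n >= 0 && n <= MAXV);
    g->nvertices = n;
    for (int i = 0; i <= MAXV; i++) {
        g->edges[i] = NULL;
        g->visited[i] = 0;
    }
}

/* Hypothetical helper: add the undirected edge x-y at the head of both lists. */
void insert_edge(graphT *g, int x, int y)
{
    int ends[2][2] = { {x, y}, {y, x} };
    for (int k = 0; k < 2; k++) {
        edgenodeT *e = malloc(sizeof *e);
        if (e == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
        e->y = ends[k][1];
        e->next = g->edges[ends[k][0]];
        g->edges[ends[k][0]] = e;
    }
}

/* Recursive DFS: each vertex is entered once, each list is scanned once. */
void dfs_order(graphT *g, int v, int order[], int *count)
{
    g->visited[v] = 1;
    order[(*count)++] = v;
    for (edgenodeT *pe = g->edges[v]; pe != NULL; pe = pe->next)
        if (!g->visited[pe->y])
            dfs_order(g, pe->y, order, count);
}
```

Because insert_edge adds at the head of each list, the visiting order depends on the order in which edges were inserted. The sketch does not free the list nodes; a complete program would.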

Keeping track of parents

    typedef struct edgenode {
        int y;
        int w;                  // weight
        struct edgenode *next;
    } edgenodeT;

    typedef struct {
        edgenodeT *edges[MAXV+1];
        int degree[MAXV+1];
        int visited[MAXV+1];
        int parent[MAXV+1];
        int nvertices;
        int nedges;
        int directed;
    } graphT;

    void DFS_print(graphT *g, int v)
    {
        edgenodeT *pe;

        if (g == NULL) return;
        g->visited[v] = TRUE;
        printf("%d is visited\n", v);
        pe = g->edges[v];
        while (pe != NULL) {
            if (g->visited[pe->y] == FALSE) {
                g->parent[pe->y] = v;   // !!!!!
                DFS_print(g, pe->y);
            }
            pe = pe->next;
        }
    }

[Figure: the example graphs with each vertex annotated with its DFS parent (p: values); the root has p:-1]

Recursion and Path Finding

We can reconstruct a path by following the chain of ancestors from the end node back to the root node. We have to work backward!

Find the path from start=1 (root) to end node 6.

    void find_path(int start, int end, int parents[])
    {
        if ((start == end) || (end == -1))
            printf("%d", start);
        else {
            find_path(start, parents[end], parents);
            printf(" %d", end);
        }
    }

[Figure: example graph with each vertex annotated with its DFS parent (p: values)]

• Why can't we find the path by starting from the root and moving toward the end node?
• If we move the printf before the recursive call, what will be printed?
• How about an iterative solution?
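One possible iterative approach to the last question, offered as a sketch rather than the intended solution: walk backward from the end node along the parent links, store the vertices in an array, and then reverse the array. The names build_path and print_path, the maxlen parameter, and the convention of returning 0 when start is not an ancestor of end are assumptions for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: fill path[] with start ... end and return its length,
 * or return 0 if start is not reached by following parent links from end.
 * parent[v] == -1 marks a vertex with no parent (the root). */
int build_path(int start, int end, const int parent[], int path[], int maxlen)
{
    int len = 0;

    /* Walk backward from end, following parent links. */
    for (int v = end; v != -1 && len < maxlen; v = parent[v]) {
        path[len++] = v;
        if (v == start)
            break;
    }
    if (len == 0 || path[len - 1] != start)
        return 0;   /* start is not an ancestor of end */

    /* Reverse in place so the path reads start ... end. */
    for (int i = 0, j = len - 1; i < j; i++, j--) {
        int tmp = path[i];
        path[i] = path[j];
        path[j] = tmp;
    }
    return len;
}

/* Print the path on one line; the buffer size 128 is an arbitrary assumption. */
void print_path(int start, int end, const int parent[])
{
    int path[128];
    int len = build_path(start, end, parent, path, 128);

    for (int i = 0; i < len; i++)
        printf(i == 0 ? "%d" : " %d", path[i]);
    printf("\n");
}
```

The reversal step plays the role that the call stack plays in the recursive version: parent links only point toward the root, so the path is necessarily discovered in reverse.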

Exercise: print out TREE edges

Using the parent field of graphT, write print_tree_edges(graphT *g); so that, for the 12-vertex example graph, it prints:
• 1, 2
• 2, 3
• 3, 5
• 3, 10
• 4, 6
• 5, 4
• 5, 7
• 7, 8
• 8, 9
• 10, 11
• 10, 12

[Figure: the example graphs with each vertex annotated with its DFS parent (p: values)]

Edge classification

A depth-first search of a graph organizes the edges of the graph in a precise way. During the execution of DFS, we assign a direction to each edge, from the vertex that discovers it:

    int edge_classification(graphT *g, int x, int y)
    {
        if (g->parent[y] == x) return(TREE);
        if (g->visited[y] && !g->processed[y]) return(BACK);
        if (g->processed[y] && (g->entry_time[y] > g->entry_time[x])) return(FORWARD);
        if (g->processed[y] && (g->entry_time[y] < g->entry_time[x])) return(CROSS);
        printf("Warning: unclassified edge (%d,%d)", x, y);
        return(-1);   /* no class matched */
    }

In an undirected graph, we have only TREE and BACK edges. Why?
In a directed graph, we may have all four cases…

Exercise: print out TREE edges

Complete the body of print_tree_edges(graphT *g), using the parent field of graphT.

[Figure: the example graphs with each vertex annotated with its DFS parent (p: values)]
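One way to approach this exercise, given as a sketch and not as the reference solution: every vertex v whose parent is not -1 contributes exactly one tree edge (parent[v], v). The version below lists the edges grouped by parent, which is the order used in the expected output on the earlier slide. The reduced graphT (only the fields used here), the int return value, and the value of MAXV are assumptions for this illustration.

```c
#include <stdio.h>

#define MAXV 100   /* assumed capacity for this sketch */

/* Reduced graphT: only the fields this sketch needs. */
typedef struct {
    int parent[MAXV + 1];   /* parent[v] == -1 means v has no tree parent */
    int nvertices;
} graphT;

/* Print every tree edge as "parent, child" and return how many were printed.
 * The nested loops group the edges by parent, at O(n^2) cost. */
int print_tree_edges(const graphT *g)
{
    int count = 0;

    for (int u = 1; u <= g->nvertices; u++) {
        for (int v = 1; v <= g->nvertices; v++) {
            if (g->parent[v] == u) {
                printf("%d, %d\n", u, v);
                count++;
            }
        }
    }
    return count;
}
```

If the output order does not matter, a single loop over v that prints (parent[v], v) whenever parent[v] is not -1 does the same job in O(n).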

Exercise

[Figure: example graph with more than one connected component]

Can you use DFS to find connected components? Implement the following functions:

    int isConnected(g);
    int NumOfConnComp(g);
    int NumOfNodesInLargestComp(g);

Exercise

[Figure: example graph with more than one connected component]

    int isConnected(graphT *g)
    {
        // set all the values in visited to 0
        DFS(g);
        // check if all the values in visited are 1
    }
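The outline above can be fleshed out along the following lines. This is a sketch under assumptions, not the official solution: the helpers init_graph, insert_edge, clear_visited, and dfs_count are hypothetical, graphT is reduced to the fields needed, the graph is taken to be undirected, and an empty graph is treated as connected.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAXV 100   /* assumed capacity for this sketch */

typedef struct edgenode {
    int y;
    struct edgenode *next;
} edgenodeT;

typedef struct {
    edgenodeT *edges[MAXV + 1];
    int visited[MAXV + 1];
    int nvertices;
} graphT;

/* Hypothetical helper: make g an empty graph with n vertices. */
void init_graph(graphT *g, int n)
{
    assert(n >= 0 && n <= MAXV);
    g->nvertices = n;
    for (int i = 0; i <= MAXV; i++) {
        g->edges[i] = NULL;
        g->visited[i] = 0;
    }
}

/* Hypothetical helper: add the undirected edge x-y. */
void insert_edge(graphT *g, int x, int y)
{
    int ends[2][2] = { {x, y}, {y, x} };
    for (int k = 0; k < 2; k++) {
        edgenodeT *e = malloc(sizeof *e);
        if (e == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
        e->y = ends[k][1];
        e->next = g->edges[ends[k][0]];
        g->edges[ends[k][0]] = e;
    }
}

static void clear_visited(graphT *g)
{
    for (int i = 1; i <= g->nvertices; i++)
        g->visited[i] = 0;
}

/* DFS from v; returns the number of vertices newly visited. */
static int dfs_count(graphT *g, int v)
{
    int count = 1;

    g->visited[v] = 1;
    for (edgenodeT *pe = g->edges[v]; pe != NULL; pe = pe->next)
        if (!g->visited[pe->y])
            count += dfs_count(g, pe->y);
    return count;
}

/* Connected if one DFS from vertex 1 reaches every vertex. */
int isConnected(graphT *g)
{
    if (g->nvertices == 0)
        return 1;   /* assumption: the empty graph counts as connected */
    clear_visited(g);
    return dfs_count(g, 1) == g->nvertices;
}

/* Each DFS started from a still-unvisited vertex covers one component. */
int NumOfConnComp(graphT *g)
{
    int comps = 0;

    clear_visited(g);
    for (int v = 1; v <= g->nvertices; v++)
        if (!g->visited[v]) {
            comps++;
            dfs_count(g, v);
        }
    return comps;
}

int NumOfNodesInLargestComp(graphT *g)
{
    int largest = 0;

    clear_visited(g);
    for (int v = 1; v <= g->nvertices; v++)
        if (!g->visited[v]) {
            int size = dfs_count(g, v);
            if (size > largest)
                largest = size;
        }
    return largest;
}
```

All three functions stay within O(m + n), because the visited flags are cleared only once per call and each vertex is then entered by exactly one DFS.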

OPT: A general DFS implementation

    /* new fields/labels in graphT */
    int entry_time[MAXV+1];
    int exit_time[MAXV+1];
    int parent[MAXV+1];      /* every entry initially -1 */
    int processed[MAXV+1];   /* every entry initially 0 */
    int finished;            /* initially 0 */
    int time;                /* initially 0 */

    process_vertex_early(int v) { … }
    process_edge(int v, int y) { … }
    process_vertex_late(int v) { … }

    void dfs(graphT *g, int v)
    {
        edgenodeT *pe;

        if (g == NULL || g->finished) return;
        g->visited[v] = TRUE;
        g->time = g->time + 1;
        g->entry_time[v] = g->time;
        process_vertex_early(v);
        pe = g->edges[v];
        while (pe != NULL) {
            if (g->visited[pe->y] == FALSE) {
                g->parent[pe->y] = v;   // !!!!!
                process_edge(v, pe->y);
                dfs(g, pe->y);
            } else if ((!g->processed[pe->y]) || (g->directed))
                process_edge(v, pe->y);
            if (g->finished) return;
            pe = pe->next;
        }
        process_vertex_late(v);
        g->time = g->time + 1;
        g->exit_time[v] = g->time;
        g->processed[v] = TRUE;
    }

BFS: Breadth-first search

BFS: Breadth-first search (similar to a level-order walk on trees)

    void LevelOrderWalk(nodeT *t)
    {
        if (t != NULL) {
            ……
        }
    }

[Figure: example tree with root t and nodes A to G]

The breadth-first strategy for traversing a graph is similar to the level-order traversal of trees and has the same iterative structure, built on a queue. The only additional complication is that graphs, unlike trees, can contain cycles. So we need to make sure that no node is visited more than once during the traversal. Otherwise, the iterative process could go on forever.

BFS: Breadth-first search

[Figure: example graph with its edges, degree, and visited arrays]

queue.h interface from Ch 10

    #ifndef _queue_h
    #define _queue_h
    //#include "genlib.h"
    #define New(ptr_t) ((ptr_t) malloc(sizeof *((ptr_t) NULL)))
    typedef int bool;
    typedef void *queueElementT;   /* "void *" can be replaced by other types: int in this case */
    typedef struct queueCDT *queueADT;
    queueADT NewQueue(void);
    void FreeQueue(queueADT queue);
    void Enqueue(queueADT queue, queueElementT element);
    queueElementT Dequeue(queueADT queue);
    bool QueueIsEmpty(queueADT queue);
    bool QueueIsFull(queueADT queue);
    int QueueLength(queueADT queue);
    queueElementT GetQueueElement(queueADT queue, int index);
    #endif

BFS: Breadth-first search

    void BFS_print(graphT *g, int start)
    {
        queueADT q;
        edgenodeT *pe;
        int v;

        if (g == NULL) return;
        q = NewQueue();
        Enqueue(q, start);
        g->visited[start] = TRUE;
        printf("%d is visited\n", start);
        while (!QueueIsEmpty(q)) {
            v = Dequeue(q);
            pe = g->edges[v];
            while (pe != NULL) {
                if (!(g->visited[pe->y])) {
                    Enqueue(q, pe->y);
                    g->visited[pe->y] = TRUE;
                    printf("%d is visited\n", pe->y);
                }
                pe = pe->next;
            }
        }
        FreeQueue(q);   // release the queue once the traversal is done
    }

[Figure: example graph with its edges, degree, and visited arrays; after the traversal every visited entry is 1, and the vertices passed through the queue in the order q: 1 2 3 4 5 6]

Exercise: BFS

Apply BFS_print to the example graphs.

[Figure: the small example graph and the larger 12-vertex example graph]

Time Complexity: O(m + n)

Keeping track of parents

    void BFS_print(graphT *g, int start)
    {
        queueADT q;
        edgenodeT *pe;
        int v;

        if (g == NULL) return;
        q = NewQueue();
        Enqueue(q, start);
        g->visited[start] = TRUE;
        printf("%d is visited\n", start);
        while (!QueueIsEmpty(q)) {
            v = Dequeue(q);
            pe = g->edges[v];
            while (pe != NULL) {
                if (!(g->visited[pe->y])) {
                    Enqueue(q, pe->y);
                    g->visited[pe->y] = TRUE;
                    g->parent[pe->y] = v;   // !!!!!
                    printf("%d is visited\n", pe->y);
                }
                pe = pe->next;
            }
        }
    }

[Figure: the example graphs with each vertex annotated with its BFS parent (p: values); the root has p:-1]

Time Complexity: O(m + n)

Shortest (Min hop) Paths

[Figure: example graph and its adjacency lists]

In BFS, vertices are visited in order of increasing distance (in terms of hop count) from the root, so the BFS tree has a very important property: the unique tree path from the root to any node x ∈ V uses the smallest number of edges (or equivalently, intermediate nodes/hops) possible on any root-to-x path in the graph.

So we can find paths with the minimum number of nodes/hops using BFS.

• How can we print the path from the root to any given destination node x?
• How about finding paths with minimum distance in terms of edge weight?
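The min-hop property can be observed directly by recording a hop count next to each parent during BFS. The sketch below is an illustration under assumptions: it replaces the Ch 10 queue ADT with a plain array queue (each vertex is enqueued at most once, so MAXV + 1 slots suffice), and the names bfs_hops, dist, init_graph, and insert_edge, the reduced graphT, and the value of MAXV are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAXV 100   /* assumed capacity for this sketch */

typedef struct edgenode {
    int y;
    struct edgenode *next;
} edgenodeT;

typedef struct {
    edgenodeT *edges[MAXV + 1];
    int visited[MAXV + 1];
    int parent[MAXV + 1];
    int nvertices;
} graphT;

/* Hypothetical helper: make g an empty graph with n vertices. */
void init_graph(graphT *g, int n)
{
    assert(n >= 0 && n <= MAXV);
    g->nvertices = n;
    for (int i = 0; i <= MAXV; i++) {
        g->edges[i] = NULL;
        g->visited[i] = 0;
        g->parent[i] = -1;
    }
}

/* Hypothetical helper: add the undirected edge x-y. */
void insert_edge(graphT *g, int x, int y)
{
    int ends[2][2] = { {x, y}, {y, x} };
    for (int k = 0; k < 2; k++) {
        edgenodeT *e = malloc(sizeof *e);
        if (e == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
        e->y = ends[k][1];
        e->next = g->edges[ends[k][0]];
        g->edges[ends[k][0]] = e;
    }
}

/* BFS from start. On return, dist[v] is the minimum number of edges from
 * start to v (-1 if v is unreachable) and g->parent[v] is v's BFS parent. */
void bfs_hops(graphT *g, int start, int dist[])
{
    int queue[MAXV + 1];   /* array queue: head = next out, tail = next free */
    int head = 0, tail = 0;

    for (int i = 1; i <= g->nvertices; i++) {
        g->visited[i] = 0;
        g->parent[i] = -1;
        dist[i] = -1;
    }
    g->visited[start] = 1;
    dist[start] = 0;
    queue[tail++] = start;
    while (head < tail) {
        int v = queue[head++];
        for (edgenodeT *pe = g->edges[v]; pe != NULL; pe = pe->next) {
            if (!g->visited[pe->y]) {
                g->visited[pe->y] = 1;
                g->parent[pe->y] = v;
                dist[pe->y] = dist[v] + 1;   /* one hop farther than v */
                queue[tail++] = pe->y;
            }
        }
    }
}
```

Hop counts ignore the edge weights w, so this does not answer the second question; weighted shortest paths are the job of Dijkstra's algorithm listed in the objectives.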

Recursion and Path Finding

As before, we can reconstruct a path by following the chain of ancestors from the end node back to the root.

Find the path from start=1 (root) to node 6.

    void find_path(int start, int end, int parents[])
    {
        if ((start == end) || (end == -1))
            printf("%d", start);
        else {
            find_path(start, parents[end], parents);
            printf(" %d", end);
        }
    }

[Figure: example graph with each vertex annotated with its BFS parent (p: values)]

Print out TREE edges

For the BFS tree of the 12-vertex example graph, print_tree_edges(graphT *g); prints:
• 1, 2
• 1, 3
• 2, 4
• 3, 5
• 3, 10
• 4, 6
• 5, 7
• 7, 8
• 7, 9
• 10, 11
• 10, 12

[Figure: the example graphs with each vertex annotated with its BFS parent (p: values)]

Exercise

[Figure: example graph with more than one connected component]

Can you use BFS to find connected components? Will this be better than using DFS (iterative vs. recursive)? Implement the following functions:

    int isConnected(g);
    int NumOfConnComp(g);

Exercise: BFS with an Adjacency Matrix

Modify BFS_print so that it works on the adjacency matrix representation instead of adjacency lists:

    typedef struct {
        int valid;
        int w;
    } linkT;

    linkT M[MAXV+1][MAXV+1];

[Figure: example graph with its adjacency matrix and visited array]

Time Complexity: O(m + n) with adjacency lists vs. O(n^2) with the adjacency matrix
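A possible shape for the matrix-based version, offered as a sketch rather than the expected answer: the inner while loop over an adjacency list becomes a scan of row v of M. The function name bfs_matrix, the helper add_link, the array queue standing in for the Ch 10 queue ADT, the order[] output array, and the value of MAXV are assumptions for this illustration.

```c
#include <assert.h>

#define MAXV 100   /* assumed capacity for this sketch */

typedef struct {
    int valid;   /* 1 if the link exists */
    int w;       /* weight */
} linkT;

linkT M[MAXV + 1][MAXV + 1];   /* global, so zero-initialized: no links */

/* Hypothetical helper: add the undirected link x-y with weight w. */
void add_link(int x, int y, int w)
{
    assert(x >= 1 && x <= MAXV && y >= 1 && y <= MAXV);
    M[x][y].valid = 1;
    M[x][y].w = w;
    M[y][x].valid = 1;
    M[y][x].w = w;
}

/* BFS over the matrix for vertices 1..n; stores the visiting order in
 * order[] and returns how many vertices were reached from start. */
int bfs_matrix(int n, int start, int order[])
{
    int visited[MAXV + 1] = {0};
    int queue[MAXV + 1];   /* each vertex is enqueued at most once */
    int head = 0, tail = 0, count = 0;

    visited[start] = 1;
    queue[tail++] = start;
    while (head < tail) {
        int v = queue[head++];
        order[count++] = v;
        /* Scanning the whole row costs O(n) per vertex, O(n^2) in total. */
        for (int y = 1; y <= n; y++) {
            if (M[v][y].valid && !visited[y]) {
                visited[y] = 1;
                queue[tail++] = y;
            }
        }
    }
    return count;
}
```

The row scan is why the bound changes: every dequeued vertex pays for n matrix entries whether or not the links exist, which is wasteful on sparse graphs and harmless on dense ones.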

OPT Exercise: reverse BFS

Can you use or modify BFS_print for a reverse BFS, that is, a traversal that follows the links backward?

• If the graph is undirected, yes! If not, no! Why?
• If the graph is directed, what do you need to do?
• Will the adjacency matrix be better?
• The adjacency list representation needs to be a backward star instead of a forward star.
• Given a forward star representation, how do you create the backward star representation?
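For the last question, one plausible construction is sketched here: scan every arc x -> y in the forward star and insert the arc y -> x into a second graph. This assumes that "backward star" means each vertex's list holds the tails of its incoming arcs. The names build_backward_star, insert_arc, has_arc, and init_graph, the reduced graphT, and the value of MAXV are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAXV 100   /* assumed capacity for this sketch */

typedef struct edgenode {
    int y;
    int w;   /* weight */
    struct edgenode *next;
} edgenodeT;

typedef struct {
    edgenodeT *edges[MAXV + 1];
    int nvertices;
} graphT;

/* Hypothetical helper: make g an empty graph with n vertices. */
void init_graph(graphT *g, int n)
{
    assert(n >= 0 && n <= MAXV);
    g->nvertices = n;
    for (int i = 0; i <= MAXV; i++)
        g->edges[i] = NULL;
}

/* Hypothetical helper: add the directed arc x -> y with weight w. */
void insert_arc(graphT *g, int x, int y, int w)
{
    edgenodeT *e = malloc(sizeof *e);
    if (e == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
    e->y = y;
    e->w = w;
    e->next = g->edges[x];
    g->edges[x] = e;
}

/* For every arc x -> y in fwd, store y -> x in bwd. Runs in O(m + n). */
void build_backward_star(const graphT *fwd, graphT *bwd)
{
    init_graph(bwd, fwd->nvertices);
    for (int x = 1; x <= fwd->nvertices; x++)
        for (const edgenodeT *pe = fwd->edges[x]; pe != NULL; pe = pe->next)
            insert_arc(bwd, pe->y, x, pe->w);
}

/* Returns 1 if g contains the arc x -> y, 0 otherwise. */
int has_arc(const graphT *g, int x, int y)
{
    for (const edgenodeT *pe = g->edges[x]; pe != NULL; pe = pe->next)
        if (pe->y == y)
            return 1;
    return 0;
}
```

Running an unmodified BFS on the backward-star graph from a vertex t then visits exactly the vertices that can reach t in the original directed graph.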

OPT: A general BFS implementation

    /* new fields/labels in graphT */
    int entry_time[MAXV+1];
    int exit_time[MAXV+1];
    int parent[MAXV+1];      /* every entry initially -1 */
    int processed[MAXV+1];   /* every entry initially 0 */
    int finished;            /* initially 0 */
    int time;                /* initially 0 */

    process_vertex_early(int v) { … }
    process_edge(int v, int y) { … }
    process_vertex_late(int v) { … }

    void bfs(graphT *g, int start)
    {
        queueADT q;
        int v;
        edgenodeT *pe;

        q = NewQueue();
        Enqueue(q, start);
        g->visited[start] = TRUE;
        while (!QueueIsEmpty(q)) {
            v = Dequeue(q);
            process_vertex_early(v);
            g->processed[v] = TRUE;
            pe = g->edges[v];
            while (pe != NULL) {
                if ((g->processed[pe->y] == FALSE) || g->directed)
                    process_edge(v, pe->y);
                if (!(g->visited[pe->y])) {
                    Enqueue(q, pe->y);
                    g->visited[pe->y] = TRUE;
                    g->parent[pe->y] = v;   // !!!!!!
                }
                pe = pe->next;
            }
            process_vertex_late(v);
        }
        FreeQueue(q);
    }
