Data Structures & Algorithms Digraphs and DAGs Richard Newman based on book by R. Sedgewick and slides by S. Sahni
Digraphs • Edges are directed • Number of possible undirected graphs (self-loops allowed) is huge: 2^(V(V+1)/2) • Number of possible directed graphs is … huger(?): 2^(V^2)
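Example • For V = 3 (self-loops counted in both cases): 2^(3·4/2) = 2^6 = 64 possible undirected graphs, but 2^(3^2) = 2^9 = 512 possible digraphs.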
Digraphs • Graph enumeration
Digraphs • Defn. 19.1: A digraph is a set of nodes V and a set of distinct directed edges E, each from one node to another node in V (self-loops allowed). • Defn. 19.2: A directed path in a digraph is a list of nodes for which there is an edge from each node to its successor. A node t is reachable from node s iff there is a directed path from s to t.
Digraphs • Defn. 19.3: A directed acyclic graph (DAG) is a digraph with no directed cycles. A node with only out-edges is a source; a node with only in-edges is a sink. • Defn. 19.4: A digraph is strongly connected iff every node is reachable from every node.
DAGs • DAGs can be used to model many real-life problems • Scheduling • Precedence • Pre-requisite structures • Causality • Etc.
Digraphs • Prop. 19.1: A digraph that is not strongly connected comprises a set of strongly connected components, which are maximal strongly connected subgraphs, and a set of directed edges that go from one component to another.
DAGs • Connected components of the example digraph: • 0-7-4-5-3 • 2-6 • 1 • [figure: the example digraph on nodes 0-7 with its strongly connected components circled]
Digraphs • Prop. 19.2: Given a digraph D, define another digraph K(D) with one node corresponding to each strongly connected component of D, and an edge from u to v iff there are one or more edges from the component corresponding to u to the component corresponding to v. K(D) is a DAG called the kernel DAG of D.
DAGs • DAG Components • 0: 0-7-4-5-3 • 1: 2-6 • 2: 1 • Kernel DAG on Component 0, Component 1, Component 2 • [figure: the example digraph with its components collapsed into the kernel DAG]
Digraphs • In an undirected graph, we just say two nodes are connected if there is a path between them. • In a digraph, node t is reachable from node s if there is a directed path from s to t. • In a digraph, s and t are strongly connected if they are mutually reachable.
Digraphs • Classify edges in DFS • Tree – recursive calls • Back – to ancestor (including parent!) • Down – to visited descendant • Cross – neither ancestor nor descendant (cousins)
DFS in Digraphs • [figure: DFS trace of the example digraph, showing the stack contents and the tree/back/down/cross classification of each edge] • Resulting DFS numbering:
node: 0 1 2 3 4 5 6 7
pre:  0 4 1 7 5 6 2 3
post: 7 2 1 3 5 4 0 6
DFS Algorithms • Cycle Detection • If we find a back edge, it represents a cycle – including a link to the parent! • Cross edges don't make cycles! • Reachability • Start from one node, DFS until the other is found (or the DFS completes) • Weak connectivity • DFS the underlying undirected graph (ignore edge directions); if it finds all the nodes, then yes!
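A minimal C sketch (not the book's code) of digraph DFS with pre/post numbers, classifying edges and flagging a cycle when a back edge is found. The edge list is the example digraph as reconstructed from these slides, so treat it as illustrative.

#include <stdio.h>

#define V 8

/* Example digraph as reconstructed from the slides (edges may be approximate). */
static int adj[V][V];
static int pre[V], post[V];      /* -1 = not yet assigned */
static int cnt_pre, cnt_post;
static int found_cycle;

static void dfs(int v) {
    pre[v] = cnt_pre++;
    for (int w = 0; w < V; w++) {
        if (!adj[v][w]) continue;
        if (pre[w] == -1)       dfs(w);              /* tree edge */
        else if (post[w] == -1) found_cycle = 1;     /* back edge: w is a still-active ancestor */
        /* else: w is finished -> down edge if pre[w] > pre[v], cross edge if pre[w] < pre[v] */
    }
    post[v] = cnt_post++;
}

int main(void) {
    int edges[][2] = {{0,2},{0,5},{0,7},{2,6},{3,4},{4,5},
                      {4,6},{5,0},{5,3},{6,2},{7,1},{7,4}};
    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++)
        adj[edges[e][0]][edges[e][1]] = 1;
    for (int v = 0; v < V; v++) pre[v] = post[v] = -1;
    for (int v = 0; v < V; v++)
        if (pre[v] == -1) dfs(v);
    printf("cycle found: %s\n", found_cycle ? "yes" : "no");
    return 0;
}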
DFS Algorithms • Convert digraph to DAG • Remove back edges! • Use to generate large DAGs from large digraphs • Note that DFS in a digraph only gives reachability from the start node, not from all nodes
Transitive Closure • Defn. 19.5: The transitive closure of a digraph D is a digraph T with the same vertices but with an edge from s to t in T iff t is reachable from s in D. • [figure: the example digraph and its transitive closure]
Transitive Closure • Can also view (and compute) the transitive closure by Boolean matrix multiplication • Use logical AND as × • Use logical OR as + • A^i has a 1 in position (s,t) iff there is a path of length i from s to t • Adjacency matrix A and its Boolean square A^2 for the example digraph:
A:    0 1 2 3 4 5 6 7
  0:  0 0 1 0 0 1 0 1
  1:  0 0 0 0 0 0 0 0
  2:  0 0 0 0 0 0 1 0
  3:  0 0 0 0 1 0 0 0
  4:  0 0 0 0 0 1 1 0
  5:  1 0 0 1 0 0 0 0
  6:  0 0 1 0 0 0 0 0
  7:  0 1 0 0 1 0 0 0
A^2:  0 1 2 3 4 5 6 7
  0:  1 1 0 1 1 0 1 0
  1:  0 0 0 0 0 0 0 0
  2:  0 0 1 0 0 0 0 0
  3:  0 0 0 0 0 1 1 0
  4:  1 0 1 1 0 0 0 0
  5:  0 0 1 0 1 1 0 1
  6:  0 0 0 0 0 0 1 0
  7:  0 0 0 0 0 1 1 0
Transitive Closure • Repeating the multiply (AND) and add (OR) step gives the higher powers A^3, A^4, A^5 and the running sums A^<3, A^<4, A^<5, where A^<i has a 1 in position (s,t) iff t is reachable from s by a nonempty path of fewer than i edges. • [matrix tables for A^<3/A^3, A^<4/A^4, A^<5/A^5 not reproduced]
Transitive Closure • Keep on multiplying and adding until we reach a fixed point: the matrix does not change. • The fixed point, A^<6 for the example digraph, is its transitive closure:
A^<6: 0 1 2 3 4 5 6 7
  0:  1 1 1 1 1 1 1 1
  1:  0 0 0 0 0 0 0 0
  2:  0 0 1 0 0 0 1 0
  3:  1 1 1 1 1 1 1 1
  4:  1 1 1 1 1 1 1 1
  5:  1 1 1 1 1 1 1 1
  6:  0 0 1 0 0 0 1 0
  7:  1 1 1 1 1 1 1 1
Transitive Closure • Prop. 19.5: We can compute the transitive closure of a digraph by adding self-loops, then computing A^V, taking time V^4. • Self-loops allow path to be of any length up to exponent • Must reach fixed point by V – why? • Efficient approach: • A, A^2, A^4, A^8, … (successive squaring) • Takes lg V matrix multiplies, each V^3 • Total time is V^3 lg V
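A minimal sketch of the successive-squaring idea, assuming the same reconstructed example digraph; bool_square is a straightforward Boolean product of a matrix with itself (AND as ×, OR as +). Because self-loops are added first, the result has 1s on the diagonal, unlike the fixed-point table above.

#include <stdio.h>
#include <string.h>

#define V 8

/* One Boolean squaring: r[s][t] = OR over i of (a[s][i] AND a[i][t]). */
static void bool_square(int a[V][V]) {
    int r[V][V] = {{0}};
    for (int s = 0; s < V; s++)
        for (int t = 0; t < V; t++)
            for (int i = 0; i < V; i++)
                if (a[s][i] && a[i][t]) { r[s][t] = 1; break; }
    memcpy(a, r, sizeof r);
}

/* Transitive closure: add self-loops, then square about lg V times. */
static void closure_by_squaring(int a[V][V]) {
    for (int v = 0; v < V; v++) a[v][v] = 1;  /* self-loops stand in for shorter paths */
    for (int len = 1; len < V; len *= 2)
        bool_square(a);                       /* after k squarings: paths of length up to 2^k */
}

int main(void) {
    int a[V][V] = {0};
    int edges[][2] = {{0,2},{0,5},{0,7},{2,6},{3,4},{4,5},
                      {4,6},{5,0},{5,3},{6,2},{7,1},{7,4}};
    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++)
        a[edges[e][0]][edges[e][1]] = 1;
    closure_by_squaring(a);
    for (int s = 0; s < V; s++, putchar('\n'))
        for (int t = 0; t < V; t++)
            printf("%d ", a[s][t]);
    return 0;
}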
Transitive Closure • Even faster way! • Warshall's algorithm:
for (i = 0; i < V; ++i)
  for (s = 0; s < V; ++s)
    for (t = 0; t < V; ++t)
      if (A[s][i] && A[i][t]) A[s][t] = 1;
Transitive Closure • Correctness by induction on i: • Base: after the first iteration, A[s][t] is set iff there is an edge s-t or a path s-0-t. After the second iteration it also covers s-1-t, s-0-1-t, s-1-0-t. • IH: after the i-th iteration, A covers every path whose intermediate nodes are all ≤ i. • Inductive step: a path from s to t that avoids i+1 is already recorded; a path through i+1 splits into s-to-(i+1) and (i+1)-to-t pieces with smaller intermediate nodes, so it is caught by the if statement. • In words:
for (every intermediate node i)
  for (every source s)
    for (every destination t)
      if (s reaches i && i reaches t) s reaches t;
Transitive Closure • Prop. 19.7: Warshall's algorithm computes the transitive closure of a digraph in time V^3. • Obvious from structure of Warshall's algorithm – three nested loops of V each:
for (i = 0; i < V; ++i)
  for (s = 0; s < V; ++s)
    for (t = 0; t < V; ++t)
      if (A[s][i] && A[i][t]) A[s][t] = 1;
Transitive Closure • Prop. 19.8: We can support constant-time reachability testing for a digraph with V nodes using space O(V^2) and preprocessing time O(V^3). • Can improve Warshall's algorithm:
for (i = 0; i < V; ++i)
  for (s = 0; s < V; ++s)
    for (t = 0; t < V; ++t)
      if (A[s][i] && A[i][t]) A[s][t] = 1;
Transitive Closure • We can improve Warshall's algorithm by moving the test of A[s][i] out of the inner loop, avoiding the innermost loop when s cannot reach i.
for (i = 0; i < V; ++i)
  for (s = 0; s < V; ++s)
    if (A[s][i])
      for (t = 0; t < V; ++t)
        if (A[i][t]) A[s][t] = 1;
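A complete, runnable version of the improved loop, as a minimal sketch: the edge list below is the example digraph reconstructed from the adjacency matrix A shown earlier, so treat it as illustrative. If that reconstruction is right, the printed matrix should match the fixed-point table above.

#include <stdio.h>

#define V 8

int main(void) {
    int A[V][V] = {0};
    /* Example digraph reconstructed from the earlier adjacency matrix (illustrative). */
    int edges[][2] = {{0,2},{0,5},{0,7},{2,6},{3,4},{4,5},
                      {4,6},{5,0},{5,3},{6,2},{7,1},{7,4}};
    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++)
        A[edges[e][0]][edges[e][1]] = 1;

    /* Improved Warshall: skip the inner loop whenever s cannot reach i. */
    for (int i = 0; i < V; i++)
        for (int s = 0; s < V; s++)
            if (A[s][i])
                for (int t = 0; t < V; t++)
                    if (A[i][t]) A[s][t] = 1;

    for (int s = 0; s < V; s++, putchar('\n'))
        for (int t = 0; t < V; t++)
            printf("%d ", A[s][t]);
    return 0;
}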
Shortest Path • We can modify Warshall's algorithm to compute shortest paths: initialize A[s][t] with 1 for an edge and a sentinel value V for no edge; on termination A[s][t] holds the minimum number of edges on a directed path from s to t.
for (i = 0; i < V; ++i)
  for (s = 0; s < V; ++s)
    for (t = 0; t < V; ++t)
      if (A[s][i] + A[i][t] < A[s][t]) A[s][t] = A[s][i] + A[i][t];
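A minimal sketch with the initialization spelled out: dist[s][t] ends up as the minimum number of edges on a directed path from s to t, or at least V if t is unreachable. The example edge list is again the reconstructed digraph, so the expected answer in the printf is only illustrative.

#include <stdio.h>

#define V 8
#define NOEDGE V            /* sentinel: longer than any simple path */

/* Unit-length all-pairs shortest paths via the Warshall-style triple loop. */
static void shortest_paths(int adj[V][V], int dist[V][V]) {
    for (int s = 0; s < V; s++)
        for (int t = 0; t < V; t++)
            dist[s][t] = adj[s][t] ? 1 : NOEDGE;
    for (int i = 0; i < V; i++)
        for (int s = 0; s < V; s++)
            for (int t = 0; t < V; t++)
                if (dist[s][i] + dist[i][t] < dist[s][t])
                    dist[s][t] = dist[s][i] + dist[i][t];
}

int main(void) {
    int adj[V][V] = {0}, dist[V][V];
    int edges[][2] = {{0,2},{0,5},{0,7},{2,6},{3,4},{4,5},
                      {4,6},{5,0},{5,3},{6,2},{7,1},{7,4}};
    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++)
        adj[edges[e][0]][edges[e][1]] = 1;
    shortest_paths(adj, dist);
    printf("shortest 3 -> 1 uses %d edges\n", dist[3][1]);  /* 3-4-5-0-7-1: 5 edges */
    return 0;
}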
Reduction • Prop. 19.9: We can use any transitive-closure algorithm to compute the product of two Boolean matrices with at most a constant-factor difference in running time. • Proof sketch: construct a 3V x 3V matrix from A, B, and the VxV identity matrix I; its transitive closure puts the product AB in the upper-right block:
  [ I  A  0 ]        [ I  A  AB ]
  [ 0  I  B ]  TC →  [ 0  I  B  ]
  [ 0  0  I ]        [ 0  0  I  ]
Reduction • What this means is that if we can perform transitive closure faster, then we can compute Boolean matrix products faster. • Likewise, a faster Boolean matrix multiply algorithm will speed up our TC algorithm. • Note that we can compute TC faster for sparse graphs – time O(V(E+V))
Topological Sort • Relabel: Given a DAG, relabel its nodes such that every directed edge points from a lower-numbered node to a higher-numbered node. • Rearrange: Given a DAG, rearrange its nodes on a horizontal line such that all the directed edges point from left to right. • Key = turn partial order into total order that is consistent with the partial order
Topological Sort • Relabel: [figure: the example DAG with nodes renumbered so every edge points from a lower-numbered node to a higher-numbered node] • Rearrange: nodes placed on a line in the order 3 0 7 1 4 5 6 2, so every edge points from left to right
Reverse Topological Sort • Relabel: Given a DAG, relabel its nodes such that every directed edge points from a higher-numbered node to a lower-numbered node. • Rearrange: Given a DAG, rearrange its nodes on a horizontal line such that all the directed edges point from right to left. • Just reverse the regular topological sort
Topological Sort • Prop. 19.11: Postorder numbering in DFS yields a reverse topological sort for any DAG. • It is easy to turn reverse topological sort into a regular topological sort
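A minimal sketch of Prop. 19.11 in C: record the DFS postorder, then read it backwards to get a topological order. The 6-node DAG here is a made-up example, not the one in the slides.

#include <stdio.h>

#define V 6

/* Small hypothetical DAG for illustration. */
static int adj[V][V];
static int visited[V];
static int post[V], cnt;       /* postorder list of node ids */

static void dfs(int v) {
    visited[v] = 1;
    for (int w = 0; w < V; w++)
        if (adj[v][w] && !visited[w]) dfs(w);
    post[cnt++] = v;           /* record v once all nodes reachable from it are done */
}

int main(void) {
    int edges[][2] = {{0,1},{0,2},{1,3},{2,3},{3,4},{2,5}};
    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++)
        adj[edges[e][0]][edges[e][1]] = 1;
    for (int v = 0; v < V; v++)
        if (!visited[v]) dfs(v);
    /* post[] is a reverse topological order; print it backwards for a topological order. */
    for (int i = cnt - 1; i >= 0; i--)
        printf("%d ", post[i]);
    putchar('\n');
    return 0;
}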
Topological Sort • Prop. 19.12: Every DAG has at least one source and at least one sink. • Turn this into a topological sort algorithm: • Make indegree[V] vector initialized to 0 • Scan through DAG (visiting every edge) and increment indegree[i] each time an edge to node i is found. • Scan through indegree[] and enqueue all nodes with indegree zero (the sources).
Topological Sort • Prop. 19.12: Every DAG has at least one source and at least one sink. • Set currentID = 0. • While the queue is non-empty, • remove a node x and label it currentID • currentID++ • Decrement indegree[j] for all edges from node x to node j • If indegree[j] == 0, enqueue node j
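A minimal sketch of this source-queue method on the same made-up 6-node DAG: labels are assigned in dequeue order, and a node is enqueued as soon as its indegree drops to zero.

#include <stdio.h>

#define V 6

int main(void) {
    int adj[V][V] = {0}, indegree[V] = {0}, label[V];
    int queue[V], head = 0, tail = 0;
    int edges[][2] = {{0,1},{0,2},{1,3},{2,3},{3,4},{2,5}};

    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++) {
        adj[edges[e][0]][edges[e][1]] = 1;
        indegree[edges[e][1]]++;                   /* count in-edges */
    }
    for (int v = 0; v < V; v++)
        if (indegree[v] == 0) queue[tail++] = v;   /* enqueue the sources */

    int currentID = 0;
    while (head < tail) {
        int x = queue[head++];                     /* remove a source */
        label[x] = currentID++;                    /* next topological number */
        for (int j = 0; j < V; j++)
            if (adj[x][j] && --indegree[j] == 0)
                queue[tail++] = j;                 /* j just became a source */
    }
    /* If currentID < V here, the input graph had a cycle (not a DAG). */
    for (int v = 0; v < V; v++)
        printf("node %d gets label %d\n", v, label[v]);
    return 0;
}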
Topological Sort • [figure: the example DAG annotated with each node's indegree and the source queue during execution] • Nodes leave the source queue in the order: 0 3 7 1 4 5 6 2
Topological Sort Application • Finding the longest path from each node, and the longest path in the DAG • "Critical path" for scheduling: what is the most urgent thing to do? • Reverse topological sort the DAG • In RTS order, for each node v, compute the longest path from v: LP[v] = 1 + max{LP[x] | (v,x) in E} (with a fixed base value at sinks, where the max is over an empty set) • Guaranteed that LP[x] is known by the time it is needed. An example of… Dynamic Programming!
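A minimal sketch of the longest-path DP, processing nodes in postorder (a reverse topological order) so every LP[x] is ready when needed. Here LP counts edges (sinks get 0); the slide's recurrence with a base of 1 would count nodes instead. Same made-up 6-node DAG as above.

#include <stdio.h>

#define V 6

static int adj[V][V];
static int visited[V];
static int post[V], cnt;        /* DFS postorder = reverse topological order */
static int LP[V];               /* longest path length (in edges) starting at v */

static void dfs(int v) {
    visited[v] = 1;
    for (int w = 0; w < V; w++)
        if (adj[v][w] && !visited[w]) dfs(w);
    post[cnt++] = v;
}

int main(void) {
    int edges[][2] = {{0,1},{0,2},{1,3},{2,3},{3,4},{2,5}};
    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++)
        adj[edges[e][0]][edges[e][1]] = 1;
    for (int v = 0; v < V; v++)
        if (!visited[v]) dfs(v);

    /* Postorder finishes every successor of v before v, so LP[w] is ready when needed. */
    for (int i = 0; i < cnt; i++) {
        int v = post[i];
        LP[v] = 0;                              /* a sink starts a path of 0 edges */
        for (int w = 0; w < V; w++)
            if (adj[v][w] && LP[w] + 1 > LP[v])
                LP[v] = LP[w] + 1;
    }
    for (int v = 0; v < V; v++)
        printf("LP[%d] = %d\n", v, LP[v]);
    return 0;
}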
Topological Sort • [figure: the example DAG annotated with the longest-path value at each node] • Longest path values: 5 4 4 1 3 1 2 1
Transitive Closure Redux Reachability from each node in DAG • Reverse topological sort DAG • Row vector Reach[v] – initially self and successors • In RTS order, for each node v • Reach[v] = OR {Reach[x] | (v,x) in E} • Guaranteed that Reach[x] is known by time it is needed. Another example of… Dynamic Programming!
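A minimal sketch of the row-vector version, using one machine word per node as its reachability row (assumes V <= 32); rows are OR'ed together in reverse topological order. Same made-up 6-node DAG as above.

#include <stdio.h>

#define V 6

static int adj[V][V];
static int visited[V];
static int post[V], cnt;             /* DFS postorder = reverse topological order */
static unsigned reach[V];            /* bit t of reach[v] is set iff v reaches t */

static void dfs(int v) {
    visited[v] = 1;
    for (int w = 0; w < V; w++)
        if (adj[v][w] && !visited[w]) dfs(w);
    post[cnt++] = v;
}

int main(void) {
    int edges[][2] = {{0,1},{0,2},{1,3},{2,3},{3,4},{2,5}};
    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++)
        adj[edges[e][0]][edges[e][1]] = 1;
    for (int v = 0; v < V; v++)
        if (!visited[v]) dfs(v);

    /* In reverse topological order: Reach[v] = {v} OR union of Reach[w] over edges v->w,
       and every Reach[w] is already final when v is processed. */
    for (int i = 0; i < cnt; i++) {
        int v = post[i];
        reach[v] = 1u << v;                  /* self */
        for (int w = 0; w < V; w++)
            if (adj[v][w]) reach[v] |= reach[w];
    }
    for (int v = 0; v < V; v++)
        printf("reach[%d] = 0x%02x\n", v, reach[v]);
    return 0;
}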
Transitive Closure Redux • Matrix approach takes time O(VE) • Direct recursive DFS approach: • No back edges (no cycles) • Tree edges – recursive call • Cross edges – do OR, no call • Down edges – skip, no call • Prop. 19.13: Using DFS and DP, can compute TC of a DAG in time O(V^2 + VX) where X is the number of cross edges
Strongly Connected Components • Kosaraju's method • Do DFS on the reversed digraph and compute its postorder numbering • Do DFS of the original digraph, starting new searches in reverse of that postorder • Each tree of this second DFS forest is one strongly connected component • Time and space are both linear in V+E!
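A minimal sketch of Kosaraju's method on the reconstructed example digraph: one DFS pass over the reversed digraph to get a postorder, then a DFS pass over the original digraph started in reverse of that postorder, labeling each tree as one component. If the edge reconstruction is right, it should report the components 0-7-4-5-3, 2-6, and 1.

#include <stdio.h>

#define V 8

static int adj[V][V];            /* original digraph */
static int radj[V][V];           /* reversed digraph */
static int visited[V];
static int order[V], cnt;        /* postorder of the reversed digraph */
static int comp[V];              /* component id per node */

static void dfs_rev(int v) {     /* pass 1: DFS on the reversed digraph, record postorder */
    visited[v] = 1;
    for (int w = 0; w < V; w++)
        if (radj[v][w] && !visited[w]) dfs_rev(w);
    order[cnt++] = v;
}

static void dfs_orig(int v, int id) {   /* pass 2: DFS on the original digraph, label the SCC */
    visited[v] = 1;
    comp[v] = id;
    for (int w = 0; w < V; w++)
        if (adj[v][w] && !visited[w]) dfs_orig(w, id);
}

int main(void) {
    /* Example digraph reconstructed from the slides (illustrative). */
    int edges[][2] = {{0,2},{0,5},{0,7},{2,6},{3,4},{4,5},
                      {4,6},{5,0},{5,3},{6,2},{7,1},{7,4}};
    for (unsigned e = 0; e < sizeof edges / sizeof edges[0]; e++) {
        adj[edges[e][0]][edges[e][1]] = 1;
        radj[edges[e][1]][edges[e][0]] = 1;
    }

    for (int v = 0; v < V; v++)           /* pass 1 */
        if (!visited[v]) dfs_rev(v);

    for (int v = 0; v < V; v++) visited[v] = 0;
    int id = 0;
    for (int i = cnt - 1; i >= 0; i--)    /* pass 2: reverse postorder of the reversed digraph */
        if (!visited[order[i]]) dfs_orig(order[i], id++);

    for (int v = 0; v < V; v++)
        printf("node %d -> component %d\n", v, comp[v]);
    return 0;
}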
Recap • Digraphs • Strong connectivity • Connected components • Reachability • Digraph kernel • Transitive closure • Shortest paths (special case) • Reduction (from Boolean matrix multiply) • Topological Sort