Data Structures & Algorithms Digraphs and DAGs

Data Structures & Algorithms Digraphs and DAGs Richard Newman based on book by R. Sedgewick and slides by S. Sahni

Digraphs • Edges are directed • Number of possible undirected graphs is huge • $2^{V(V+1)/2}$ • Number of possible directed graphs is … huger(?) • $2^{V^2}$
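
Both counts come from counting the positions where an edge may be present or absent. The worked form below assumes self-loops are allowed and parallel edges are not, as in Defn. 19.1:

```latex
\[
\underbrace{2^{\binom{V}{2} + V}}_{\text{undirected}} = 2^{V(V+1)/2},
\qquad
\underbrace{2^{V \cdot V}}_{\text{directed}} = 2^{V^2}
\]
```

For V = 3 this gives 2^6 = 64 undirected graphs and 2^9 = 512 digraphs.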

Digraphs • Graph enumeration

Digraphs • Defn. 19.1: A digraph is a set of nodes V and a set of distinct directed edges E, each from one node to another node in V (self-loops allowed). • Defn. 19.2: A directed path in a digraph is a list of nodes for which there is an edge from each node to its successor. A node t is reachable from node s iff there is a directed path from s to t.

Digraphs • Defn. 19.3: A directed acyclic graph (DAG) is a digraph with no directed cycles (tours). A node with only out-edges is a source; a node with only in-edges is a sink. • Defn. 19.4: A digraph is strongly connected iff every node is reachable from every node.
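
The source and sink definitions map onto column and row scans of an adjacency matrix. The sketch below assumes a fixed-capacity matrix where `adj[s][t] != 0` means an edge from s to t; `MAXV`, `is_source`, and `is_sink` are names made up for illustration. Under this test an isolated node counts as both a source and a sink.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

/* Source: no edge enters v, i.e. column v of the matrix is all zero. */
bool is_source(int V, int adj[MAXV][MAXV], int v)
{
    assert(0 <= v && v < V && V <= MAXV);
    for (int s = 0; s < V; ++s)
        if (adj[s][v]) return false;
    return true;
}

/* Sink: no edge leaves v, i.e. row v of the matrix is all zero. */
bool is_sink(int V, int adj[MAXV][MAXV], int v)
{
    assert(0 <= v && v < V && V <= MAXV);
    for (int t = 0; t < V; ++t)
        if (adj[v][t]) return false;
    return true;
}
```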

DAGs • DAGs can be used to model many real-life problems • Scheduling • Precedence • Pre-requisite structures • Causality • Etc.

Digraphs • Prop. 19.1: A digraph that is not strongly connected comprises a set of strongly connected components, which are maximal strongly connected subgraphs, and a set of directed edges that go from one component to another.

DAGs • Strongly connected components • 0: 0-7-4-5-3 • 2: 2-6 • 1: 1
[Figure: example digraph on nodes 0-7 with its components marked]

Digraphs • Prop. 19.2: Given a digraph D, define another digraph K(D) with one node corresponding to each strongly connected component of D, and an edge from u to v iff there is at least one edge from the component corresponding to u to the component corresponding to v. K(D) is a DAG called the kernel DAG of D.

DAGs • DAG Components • 0: 0-7-4-5-3 • 1: 2-6 • 2: 1 • Kernel DAG • Component 0 • Component 1 • Component 2
[Figure: the example digraph on nodes 0-7 and its kernel DAG on components 0, 1, 2]
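
Once every node has a component number, building the kernel DAG is a single pass over the edges. The sketch below assumes the component numbers are already available in an array `comp[]` (computing them is not shown); `MAXV` and `kernel_dag` are illustrative names.

```c
#include <assert.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

/* comp[v] is the component number of node v (assumed computed elsewhere).
   K[a][b] is set to 1 iff some edge of D goes from component a to a
   different component b; edges inside a component are skipped. */
void kernel_dag(int V, int adj[MAXV][MAXV], const int comp[],
                int K[MAXV][MAXV])
{
    assert(0 <= V && V <= MAXV);
    for (int a = 0; a < MAXV; ++a)
        for (int b = 0; b < MAXV; ++b)
            K[a][b] = 0;
    for (int s = 0; s < V; ++s)
        for (int t = 0; t < V; ++t)
            if (adj[s][t] && comp[s] != comp[t])
                K[comp[s]][comp[t]] = 1;
}
```

Skipping edges whose endpoints share a component is what keeps K(D) free of self-loops.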

Digraphs • In an undirected graph, we just say two nodes are connected if there is a path between them • In a digraph, node t is reachable from node s if there is a directed path from s to t. • In a digraph, s and t are strongly connected if they are mutually reachable.

Digraphs • Classify edges in DFS • Tree: recursive calls • Back: to ancestor (including parent!) • Down: to visited descendant • Cross: neither ancestor nor descendant (cousins)
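
The four edge types can be told apart during the DFS from preorder and postorder numbers alone. The sketch below is one way to do it, assuming an adjacency matrix and neighbors visited in increasing node order; `MAXV`, `classify_edges`, and `edge_count` are illustrative names.

```c
#include <assert.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

enum { TREE, BACK, DOWN, CROSS };

static int pre[MAXV], post[MAXV], pre_cnt, post_cnt;
int edge_count[4]; /* number of edges of each type found */

static void dfs(int V, int adj[MAXV][MAXV], int s)
{
    pre[s] = pre_cnt++;
    for (int t = 0; t < V; ++t) {
        if (!adj[s][t]) continue;
        if (pre[t] == -1) {           /* unvisited: recursive call */
            edge_count[TREE]++;
            dfs(V, adj, t);
        } else if (post[t] == -1) {   /* still active: t is an ancestor */
            edge_count[BACK]++;
        } else if (pre[t] > pre[s]) { /* finished, entered after s */
            edge_count[DOWN]++;
        } else {                      /* finished, entered before s */
            edge_count[CROSS]++;
        }
    }
    post[s] = post_cnt++;
}

void classify_edges(int V, int adj[MAXV][MAXV])
{
    assert(0 <= V && V <= MAXV);
    pre_cnt = post_cnt = 0;
    for (int k = 0; k < 4; ++k) edge_count[k] = 0;
    for (int v = 0; v < V; ++v) pre[v] = post[v] = -1;
    for (int v = 0; v < V; ++v)
        if (pre[v] == -1) dfs(V, adj, v);
}
```

A self-loop is reported as a back edge here, since the node is still active when the edge is examined.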

DFS in Digraphs
Stack: 0 572 576 57(2) 541 54 55(6 cross) 53(0 back) 5(4 back) (5 down)
[Figure: DFS tree of the example digraph with back, cross, and down edges labeled]
node  0 1 2 3 4 5 6 7
pre   0 4 1 7 5 6 2 3
post  7 2 1 3 5 4 0 6

DFS Algorithms • Cycle Detection • If we find a back edge, it represents a cycle – including link to parent! • Cross edges don’t make cycles! • Reachability • Start from one, DFS until find other (or complete DFS) • Weak connectivity • If DFS finds all the nodes, then yes!
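
Cycle detection needs only one fact per node: whether it is still on the recursion stack. A minimal sketch, assuming an adjacency matrix; the three-state marking and the names `MAXV`, `visit`, and `has_cycle` are illustrative choices rather than anything fixed by the slides.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

enum { UNSEEN, ACTIVE, DONE };

static bool visit(int V, int adj[MAXV][MAXV], int state[], int s)
{
    state[s] = ACTIVE;
    for (int t = 0; t < V; ++t) {
        if (!adj[s][t]) continue;
        if (state[t] == ACTIVE) return true; /* back edge closes a cycle */
        if (state[t] == UNSEEN && visit(V, adj, state, t)) return true;
        /* state[t] == DONE: down or cross edge, cannot close a cycle */
    }
    state[s] = DONE;
    return false;
}

bool has_cycle(int V, int adj[MAXV][MAXV])
{
    assert(0 <= V && V <= MAXV);
    int state[MAXV] = {0}; /* every node starts UNSEEN (value 0) */
    for (int v = 0; v < V; ++v)
        if (state[v] == UNSEEN && visit(V, adj, state, v)) return true;
    return false;
}
```

Restarting the search from every unseen node matters because a single DFS only covers what is reachable from its start node.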

DFS Algorithms • Convert digraph to DAG • Remove back edges! • Use to generate large DAGs from large digraphs • Note that DFS in a digraph only gives reachability from the start node, not from all nodes
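
Removing back edges can be done in the same pass that discovers them. The sketch below assumes an adjacency matrix that may be modified in place; `MAXV`, `strip_back_edges`, and `make_dag` are illustrative names. Which edges get removed depends on the order in which the DFS visits nodes.

```c
#include <assert.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

enum { UNSEEN, ACTIVE, DONE };

/* Deletes every back edge met during the DFS from s; returns how many. */
static int strip_back_edges(int V, int adj[MAXV][MAXV], int state[], int s)
{
    int removed = 0;
    state[s] = ACTIVE;
    for (int t = 0; t < V; ++t) {
        if (!adj[s][t]) continue;
        if (state[t] == ACTIVE) {        /* back edge: delete it */
            adj[s][t] = 0;
            ++removed;
        } else if (state[t] == UNSEEN) { /* tree edge: recurse */
            removed += strip_back_edges(V, adj, state, t);
        }
    }
    state[s] = DONE;
    return removed;
}

/* Turns adj into a DAG in place; returns the number of edges removed. */
int make_dag(int V, int adj[MAXV][MAXV])
{
    assert(0 <= V && V <= MAXV);
    int state[MAXV] = {0}; /* every node starts UNSEEN (value 0) */
    int removed = 0;
    for (int v = 0; v < V; ++v)
        if (state[v] == UNSEEN)
            removed += strip_back_edges(V, adj, state, v);
    return removed;
}
```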

Transitive Closure • Defn. 19.5: The transitive closure of a digraph D is a digraph T with the same vertices but with an edge from s to t in T iff t is reachable from s in D.
[Figure: example digraph and its transitive closure]

Transitive Closure • Can also view (and compute) transitive closure by Boolean matrix multiplication • Use logical AND as × • Use logical OR as + • $A^i$ represents (any) path of length i

A (adjacency matrix; row = from, column = to):
      0 1 2 3 4 5 6 7
  0   0 0 1 0 0 1 0 1
  1   0 0 0 0 0 0 0 0
  2   0 0 0 0 0 0 1 0
  3   0 0 0 0 1 0 0 0
  4   0 0 0 0 0 1 1 0
  5   1 0 0 1 0 0 0 0
  6   0 0 1 0 0 0 0 0
  7   0 1 0 0 1 0 0 0

$A^2$ (paths of length 2):
      0 1 2 3 4 5 6 7
  0   1 1 0 1 1 0 1 0
  1   0 0 0 0 0 0 0 0
  2   0 0 1 0 0 0 0 0
  3   0 0 0 0 0 1 1 0
  4   1 0 1 1 0 0 0 0
  5   0 0 1 0 1 1 0 1
  6   0 0 0 0 0 0 1 0
  7   0 0 0 0 0 1 1 0

Transitive Closure

$A^{<3} = A + A^2$ (paths of length 1 or 2):
      0 1 2 3 4 5 6 7
  0   1 1 1 1 1 1 1 1
  1   0 0 0 0 0 0 0 0
  2   0 0 1 0 0 0 1 0
  3   0 0 0 0 1 1 1 0
  4   1 0 1 1 0 1 1 0
  5   1 0 1 1 1 1 0 1
  6   0 0 1 0 0 0 1 0
  7   0 1 0 0 1 1 1 0

$A^3$ (paths of length 3):
      0 1 2 3 4 5 6 7
  0   0 0 1 0 1 1 1 1
  1   0 0 0 0 0 0 0 0
  2   0 0 0 0 0 0 1 0
  3   1 0 1 1 0 0 0 0
  4   0 0 1 0 1 1 1 1
  5   1 1 0 1 1 1 1 0
  6   0 0 1 0 0 0 0 0
  7   1 0 1 1 0 0 0 0

Transitive Closure

$A^{<6}$ (paths of length 1 to 5):
      0 1 2 3 4 5 6 7
  0   1 1 1 1 1 1 1 1
  1   0 0 0 0 0 0 0 0
  2   0 0 1 0 0 0 1 0
  3   1 1 1 1 1 1 1 1
  4   1 1 1 1 1 1 1 1
  5   1 1 1 1 1 1 1 1
  6   0 0 1 0 0 0 1 0
  7   1 1 1 1 1 1 1 1

• Keep multiplying and adding until the matrix reaches a fixed point (it no longer changes)
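
The multiply-and-add loop can be sketched directly: keep a running union R of the powers of A, and stop when a round adds nothing. This is a minimal illustration, assuming fixed-capacity int matrices; `MAXV` and `closure_fixed_point` are made-up names, and R must be a different array from A.

```c
#include <assert.h>
#include <string.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

/* R starts as A and repeatedly absorbs R x A (AND as multiply, OR as add)
   until a full round adds no new entry. Returns the number of rounds run,
   counting the final round that changed nothing. */
int closure_fixed_point(int V, int A[MAXV][MAXV], int R[MAXV][MAXV])
{
    assert(0 <= V && V <= MAXV);
    assert(A != R);
    int next[MAXV][MAXV];
    int rounds = 0, changed = 1;
    memcpy(R, A, sizeof next);
    while (changed) {
        changed = 0;
        memcpy(next, R, sizeof next);
        for (int s = 0; s < V; ++s)
            for (int t = 0; t < V; ++t) {
                if (next[s][t]) continue;
                for (int i = 0; i < V; ++i)
                    if (R[s][i] && A[i][t]) { /* path s..i, then edge i-t */
                        next[s][t] = 1;
                        changed = 1;
                        break;
                    }
            }
        memcpy(R, next, sizeof next);
        ++rounds;
    }
    return rounds;
}
```

No self-loops are added in this version, so R[s][s] ends up set only when s lies on a cycle.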

Transitive Closure • Prop. 19.5: We can compute the transitive closure of a digraph by adding self-loops, then computing $A^V$, taking time proportional to $V^4$. • Self-loops allow path to be of any length up to exponent • Must reach fixed point by V: why? (A shortest path visits no node twice, so it has at most V-1 edges.) • Efficient approach: • $A, A^2, A^4, A^8, \ldots$ successive squaring • Takes lg V matrix multiplies, each $V^3$ • Total time is $V^3 \lg V$
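
Successive squaring can be sketched as follows, assuming fixed-capacity int matrices that are overwritten in place; `MAXV`, `bool_mult`, and `closure_by_squaring` are illustrative names. Because self-loops are added first, the result marks every node as reaching itself.

```c
#include <assert.h>
#include <string.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

/* C = A x B with AND as multiply and OR as add. A temporary is used so
   that C may be the same array as A or B. */
static void bool_mult(int V, int A[MAXV][MAXV], int B[MAXV][MAXV],
                      int C[MAXV][MAXV])
{
    int tmp[MAXV][MAXV] = {0};
    for (int s = 0; s < V; ++s)
        for (int t = 0; t < V; ++t)
            for (int i = 0; i < V; ++i)
                if (A[s][i] && B[i][t]) {
                    tmp[s][t] = 1;
                    break;
                }
    memcpy(C, tmp, sizeof tmp);
}

/* Replaces A by its closure (with self-loops) using about lg V squarings. */
void closure_by_squaring(int V, int A[MAXV][MAXV])
{
    assert(0 <= V && V <= MAXV);
    for (int v = 0; v < V; ++v) A[v][v] = 1; /* add self-loops */
    /* Before each squaring A covers paths of up to len edges;
       afterwards it covers paths of up to 2 * len edges. */
    for (int len = 1; len < V; len *= 2)
        bool_mult(V, A, A, A);
}
```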

Transitive Closure • Even faster way! • Warshall’s algorithm:
    for (i = 0; i < V; ++i)
        for (s = 0; s < V; ++s)
            for (t = 0; t < V; ++t)
                if (A[s][i] && A[i][t]) A[s][t] = 1;

Transitive Closure
Correctness by induction on i:
• Base: after the first iteration (i = 0), A[s][t] is set for paths s-t or s-0-t. After the second iteration (i = 1): s-t, s-0-t, s-1-t, s-0-1-t, s-1-0-t.
• IH: after the iteration for i, A[s][t] is set iff there is a path from s to t with no intermediate node greater than i.
• Inductive step: a path from s to t with no intermediate node greater than i+1 either avoids i+1 (already there) or goes via i+1 (tested by the if statement).
    for (every intermediate node i)
        for (every source s)
            for (every destination t)
                if (s reaches i and i reaches t) s reaches t;

Transitive Closure • Prop. 19.7: Warshall’s algorithm computes the transitive closure of a digraph in time proportional to $V^3$. • Obvious from structure of Warshall’s algorithm: three nested loops of V each:
    for (i = 0; i < V; ++i)
        for (s = 0; s < V; ++s)
            for (t = 0; t < V; ++t)
                if (A[s][i] && A[i][t]) A[s][t] = 1;

Transitive Closure • Prop. 19.8: We can support constant-time reachability testing for a digraph with V nodes using space $O(V^2)$ and preprocessing time $O(V^3)$. • Can improve Warshall’s algorithm:
    for (i = 0; i < V; ++i)
        for (s = 0; s < V; ++s)
            for (t = 0; t < V; ++t)
                if (A[s][i] && A[i][t]) A[s][t] = 1;

Transitive Closure • We can improve Warshall’s algorithm by moving the test of A[s][i] out of the inner loop, avoiding the innermost loop when s cannot reach i.
    for (i = 0; i < V; ++i)
        for (s = 0; s < V; ++s)
            if (A[s][i])
                for (t = 0; t < V; ++t)
                    if (A[i][t]) A[s][t] = 1;
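
Constant-time reachability testing amounts to running the improved loop once and then answering queries by table lookup. A minimal sketch, assuming a module-level table; `MAXV`, `tc_init`, and `tc_reach` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

static int tc[MAXV][MAXV]; /* closure table: V^2 space */
static int tc_V;

/* Preprocessing: copy the adjacency matrix, then run Warshall's
   algorithm with the test of tc[s][i] hoisted out of the inner loop. */
void tc_init(int V, int adj[MAXV][MAXV])
{
    assert(0 <= V && V <= MAXV);
    tc_V = V;
    for (int s = 0; s < V; ++s)
        for (int t = 0; t < V; ++t)
            tc[s][t] = adj[s][t] != 0;
    for (int i = 0; i < V; ++i)
        for (int s = 0; s < V; ++s)
            if (tc[s][i])
                for (int t = 0; t < V; ++t)
                    if (tc[i][t]) tc[s][t] = 1;
}

/* Query: one array access. */
bool tc_reach(int s, int t)
{
    assert(0 <= s && s < tc_V && 0 <= t && t < tc_V);
    return tc[s][t] != 0;
}
```

Since no self-loops are added, `tc_reach(s, s)` is true only when s lies on a cycle.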

Shortest Path • We can modify Warshall’s algorithm to compute shortest paths, if A[][] contains the length of the minimum path from s to t (initialized with 1 for an edge and sentinel value V for no edge).
    for (i = 0; i < V; ++i)
        for (s = 0; s < V; ++s)
            for (t = 0; t < V; ++t)
                if (A[s][i] + A[i][t] < A[s][t])
                    A[s][t] = A[s][i] + A[i][t];
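
A self-contained version of the shortest-path variant might look like the sketch below. It follows the initialization described above (1 for an edge, V for no edge) and additionally sets the diagonal to 0, which is an assumption of this sketch rather than something the slide states; `MAXV` and `shortest_paths` are illustrative names.

```c
#include <assert.h>

#define MAXV 16 /* hypothetical fixed capacity for this sketch */

/* On return dist[s][t] is the fewest edges on any path from s to t,
   or V (the sentinel) if t is not reachable from s. */
void shortest_paths(int V, int adj[MAXV][MAXV], int dist[MAXV][MAXV])
{
    assert(0 <= V && V <= MAXV);
    for (int s = 0; s < V; ++s)
        for (int t = 0; t < V; ++t)
            if (s == t)         dist[s][t] = 0; /* assumption: empty path */
            else if (adj[s][t]) dist[s][t] = 1;
            else                dist[s][t] = V; /* sentinel: no edge */
    for (int i = 0; i < V; ++i)
        for (int s = 0; s < V; ++s)
            for (int t = 0; t < V; ++t)
                if (dist[s][i] + dist[i][t] < dist[s][t])
                    dist[s][t] = dist[s][i] + dist[i][t];
}
```

The sentinel works because a shortest path between distinct nodes has at most V-1 edges, so any sum involving a sentinel is at least V and never replaces a real distance.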

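Reduction • Prop. 19.9: We can use any transitive-closure algorithm to compute the product of two Boolean matrices with at most a constant factor difference in running time. • Prf: Construct a 3V x 3V matrix using A, B, and the V x V identity matrix I. Its transitive closure is its square, which contains AB in the upper-right block:
\[
\begin{pmatrix} I & A & 0 \\ 0 & I & B \\ 0 & 0 & I \end{pmatrix}^{2}
=
\begin{pmatrix} I & A & AB \\ 0 & I & B \\ 0 & 0 & I \end{pmatrix}
\]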
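
The construction in the proof can be exercised directly: build the block matrix, hand it to a transitive-closure routine, and read the product out of the upper-right block. The sketch below uses Warshall's algorithm as the closure routine purely for brevity; `N`, `closure`, and `bool_product_via_tc` are illustrative names.

```c
#include <assert.h>
#include <string.h>

#define N 4 /* hypothetical dimension of the matrices being multiplied */

/* Stand-in transitive-closure routine (Warshall's algorithm);
   any other closure algorithm could be substituted. */
static void closure(int n, int M[3 * N][3 * N])
{
    assert(0 <= n && n <= 3 * N);
    for (int i = 0; i < n; ++i)
        for (int s = 0; s < n; ++s)
            if (M[s][i])
                for (int t = 0; t < n; ++t)
                    if (M[i][t]) M[s][t] = 1;
}

/* C = A x B (Boolean), computed through one transitive closure. */
void bool_product_via_tc(int A[N][N], int B[N][N], int C[N][N])
{
    int M[3 * N][3 * N];
    memset(M, 0, sizeof M);
    for (int k = 0; k < 3 * N; ++k) M[k][k] = 1;    /* the three I blocks */
    for (int s = 0; s < N; ++s)
        for (int t = 0; t < N; ++t) {
            M[s][N + t] = A[s][t] != 0;             /* A: top middle */
            M[N + s][2 * N + t] = B[s][t] != 0;     /* B: middle right */
        }
    closure(3 * N, M);
    for (int s = 0; s < N; ++s)
        for (int t = 0; t < N; ++t)
            C[s][t] = M[s][2 * N + t];              /* AB: top right */
}
```

The upper-right block starts out all zero, so any 1 that appears there must come from a two-edge path through the middle block, which is exactly an index i with A[s][i] and B[i][t] both set.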

Reduction • What this means is that if we can perform transitive closure faster, then we can compute Boolean matrix products faster. • Likewise, a faster Boolean matrix multiply algorithm will speed up our TC algorithm. • Note that we can compute TC faster for sparse graphs – time O(V(E+V))
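
One way to reach the O(V(E+V)) bound for sparse graphs is to run a separate DFS from every node over adjacency lists. The sketch below assumes array-based lists with fixed capacities; `MAXV`, `MAXE`, `graph_init`, `add_edge`, and `tc_by_dfs` are hypothetical names.

```c
#include <assert.h>

#define MAXV 16 /* hypothetical node capacity */
#define MAXE 64 /* hypothetical edge capacity */

static int head[MAXV];               /* first edge index per node, or -1 */
static int next_edge[MAXE], to[MAXE];
static int edge_cnt;
int tc[MAXV][MAXV];                  /* tc[s][t] = 1 iff t reachable from s */

void graph_init(int V)
{
    assert(0 <= V && V <= MAXV);
    edge_cnt = 0;
    for (int v = 0; v < V; ++v) head[v] = -1;
}

void add_edge(int s, int t)
{
    assert(edge_cnt < MAXE);
    to[edge_cnt] = t;
    next_edge[edge_cnt] = head[s];
    head[s] = edge_cnt++;
}

/* Marks everything reachable from v as reachable from s. Row s of tc
   doubles as the visited set, so each node is expanded at most once. */
static void mark_from(int s, int v)
{
    for (int e = head[v]; e != -1; e = next_edge[e])
        if (!tc[s][to[e]]) {
            tc[s][to[e]] = 1;
            mark_from(s, to[e]);
        }
}

void tc_by_dfs(int V)
{
    for (int s = 0; s < V; ++s)
        for (int t = 0; t < V; ++t)
            tc[s][t] = 0;
    for (int s = 0; s < V; ++s) /* V searches, each O(E + V) */
        mark_from(s, s);
}
```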

Recap • Digraphs • Strong connectivity • Connected components • Reachability • Digraph kernel • Transitive closure • Shortest paths (special case) • Reduction (from Boolean matrix multiply)
