A dynamic algorithm for topologically sorting directed acyclic graphs David J. Pearce and Paul H.J. Kelly Imperial College, London, UK d.pearce@doc.ic.ac.uk www.doc.ic.ac.uk/~djp1/

Introduction
• Topologically sorting a directed acyclic graph G=(V,E)
[Figure: example DAG with the topological ordering U Y Z W V T X S]
• Sort nodes so that X comes before Y if X→Y ∈ E
• Well-known algorithms take Θ(v + e) time
• E.g. using depth-first search
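
The depth-first approach mentioned above can be sketched as follows. This is a minimal illustration rather than code from the talk: the function name `topological_sort` and the edge list are assumptions, and the input is assumed to be acyclic.

```python
from collections import defaultdict

def topological_sort(nodes, edges):
    """Return nodes ordered so that x precedes y for every edge (x, y).

    Assumes the graph is acyclic; no cycle detection is attempted.
    """
    succ = defaultdict(list)
    for x, y in edges:
        succ[x].append(y)

    visited, post_order = set(), []

    def visit(n):
        visited.add(n)
        for m in succ[n]:
            if m not in visited:
                visit(m)
        post_order.append(n)  # n finishes after everything it reaches

    for n in nodes:
        if n not in visited:
            visit(n)
    return post_order[::-1]  # reverse post-order is a topological order

# Hypothetical edge set on the node names used in the slides.
nodes = ["U", "Y", "Z", "W", "V", "T", "X", "S"]
edges = [("U", "W"), ("W", "X"), ("Y", "V"), ("Z", "T"), ("V", "S"), ("X", "S")]
order = topological_sort(nodes, edges)
```

Each node and each edge is handled once, which is where the Θ(v + e) cost comes from.
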

Problem Definition
• How to update the topological sort after an edge insertion?
[Figure: example DAG with the topological ordering U Y Z W V T X S]
• Is the insertion invalidating or non-invalidating?
• Adding Y→V does not invalidate the sort, but adding X→Y does
• How to deal with invalidating edge insertions?
• Re-sorting the entire graph takes Θ(v + e) time again
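
Whether an insertion is invalidating can be read off the current ordering alone. The sketch below assumes the ordering U Y Z W V T X S from the example; the helper name `invalidates` is hypothetical.

```python
order = ["U", "Y", "Z", "W", "V", "T", "X", "S"]
position = {node: i for i, node in enumerate(order)}

def invalidates(position, x, y):
    """True if inserting the edge x -> y breaks the ordering.

    The ordering is broken exactly when y currently sits before x.
    Assumes x != y.
    """
    return position[y] < position[x]

y_to_v = invalidates(position, "Y", "V")  # Y is already before V
x_to_y = invalidates(position, "X", "Y")  # Y is before X, so this breaks
```
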

Performing less work
• How to avoid re-sorting the entire graph after an edge insertion?
[Figure: ordering U Y Z W V T X S, with the affected region from Y to X highlighted]
• The affected region is all nodes between Y and X in the ordering
• We denote this set as ARXY
• Only ARXY needs reordering to obtain a valid sort
• Can eliminate many nodes from consideration
• Proof by Marchetti-Spaccamela et al. [MNR96]
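
As a small illustration, the affected region is just a slice of the ordering. The function name `affected_region` is an assumption made for this sketch.

```python
def affected_region(order, x, y):
    """Nodes from y to x (inclusive) in the ordering, for a new edge x -> y.

    Returns an empty list when y is already after x, i.e. when the
    insertion does not invalidate the ordering.
    """
    position = {node: i for i, node in enumerate(order)}
    lo, hi = position[y], position[x]
    return order[lo:hi + 1] if lo < hi else []

order = ["U", "Y", "Z", "W", "V", "T", "X", "S"]
region = affected_region(order, "X", "Y")
```

Here U and S fall outside the region and never need to be looked at.
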

Performing less work (continued)
• How to re-sort the affected region?
[Figure: ordering before: U Y Z W V T X S; after moving Y: U Z W V T X Y S]
• Could just move Y to the right of X!
• But V is now incorrectly prioritised with respect to Y
• Problem: cannot move Y past nodes it reaches in the affected region
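
The failure of the naive move can be checked mechanically. In this sketch the edge set is hypothetical (the slides only show that Y reaches V), and `violated_edges` is an illustrative helper name.

```python
def violated_edges(order, edges):
    """Edges (x, y) whose source does not come before its target."""
    position = {node: i for i, node in enumerate(order)}
    return [(x, y) for x, y in edges if position[x] >= position[y]]

# Hypothetical edges, plus the newly inserted X -> Y.
edges = [("U", "W"), ("W", "X"), ("Y", "V"), ("Z", "T"),
         ("V", "S"), ("X", "S"), ("X", "Y")]

before = ["U", "Y", "Z", "W", "V", "T", "X", "S"]
naive = ["U", "Z", "W", "V", "T", "X", "Y", "S"]  # Y moved right of X

broken_before = violated_edges(before, edges)
broken_naive = violated_edges(naive, edges)
```

Moving Y alone repairs X→Y but breaks Y→V, which is why the nodes Y reaches have to move with it.
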

Algorithm MNR
• Algorithm MNR is due to Marchetti-Spaccamela et al. [MNR96]
[Figure: ordering before: U Y Z W V T X S; after: U Z W T X Y V S]
• Depth-first search from Y identifies the reachable set
• Only visits nodes in the affected region
• Shift the other nodes to the left
• Every node in the affected region is moved, so at least Ω(ARXY) time
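
The idea behind MNR can be sketched as below. This is a simplified illustration, not the published procedure: it rebuilds the affected region with list comprehensions instead of shifting in place, and the function names and edge set are assumptions chosen to be consistent with the Before/After orderings on the slide.

```python
def reach_forward(succ, start, position, upper):
    """Nodes reachable from start without leaving positions <= upper."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in succ.get(node, ()):
            if nxt not in seen and position[nxt] <= upper:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def mnr_insert(order, succ, x, y):
    """Insert edge x -> y and repair `order` in place (MNR-style sketch)."""
    position = {node: i for i, node in enumerate(order)}
    lo, hi = position[y], position[x]
    if lo < hi:  # invalidating insertion
        reached = reach_forward(succ, y, position, hi)
        if x in reached:
            raise ValueError("edge would create a cycle")
        region = order[lo:hi + 1]
        # Unreached nodes shift left; nodes reachable from y go to the right.
        order[lo:hi + 1] = ([n for n in region if n not in reached] +
                            [n for n in region if n in reached])
    succ.setdefault(x, []).append(y)

# Hypothetical successor lists for the example graph.
succ = {"U": ["W"], "W": ["X"], "Y": ["V"], "Z": ["T"], "V": ["S"], "X": ["S"]}
order = ["U", "Y", "Z", "W", "V", "T", "X", "S"]
mnr_insert(order, succ, "X", "Y")
```

Every node of the affected region is rewritten, even Z and T, which neither reach X nor are reachable from Y.
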

Algorithm PK
• Algorithm PK is the main contribution of this work
[Figure: ordering before: U Y Z W V T X S; after: U W Z X Y T V S]
• Key insight: can avoid re-sorting the entire affected region
• {Z,T} remain untouched, saving work compared with MNR
• Only the set δXY ⊆ ARXY needs re-sorting (i.e. nodes in grey)

Algorithm PK: what is δXY?
• Define δXY = RF ∪ RB
[Figure: RB = {W, X} and RF = {Y, V} in the example graph]
• RF = all nodes reachable from Y (including Y)
• RB = all nodes reaching X (including X)
• Observation: nodes in RB must come before those in RF

Algorithm PK: how does it work?
• Begins with two depth-first searches to identify RF and RB
[Figure: forward DFS from Y and backward DFS from X over the ordering U Y Z W V T X S]
• Forward search determines RF (same as MNR)
• Backward search determines RB

Algorithm PK: how does it work? (continued)
• Next, re-sort the members of δXY
[Figure: slots vacated by δXY: U ? Z ? ? T ? S]
• Place RB and RF into the slots previously held by RB ∪ RF
• RB goes into the leftmost slots, RF into the rightmost slots
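
The two searches and the slot reassignment can be put together in a short sketch. It is an illustration under stated assumptions, not the authors' implementation: the names `bounded_reach` and `pk_insert` are hypothetical, both searches are restricted to the affected region, and the successor and predecessor lists are an edge set chosen to be consistent with the slides.

```python
def bounded_reach(adj, start, allowed):
    """Nodes reachable from start via adj, visiting only allowed nodes."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen and allowed(nxt):
                seen.add(nxt)
                stack.append(nxt)
    return seen

def pk_insert(order, succ, pred, x, y):
    """Insert edge x -> y and repair `order` in place (PK-style sketch)."""
    position = {node: i for i, node in enumerate(order)}
    lo, hi = position[y], position[x]
    if lo < hi:  # invalidating insertion
        r_f = bounded_reach(succ, y, lambda n: position[n] <= hi)  # forward
        if x in r_f:
            raise ValueError("edge would create a cycle")
        r_b = bounded_reach(pred, x, lambda n: position[n] >= lo)  # backward
        # Keep the relative order inside each set, R_B first, then R_F.
        moved = sorted(r_b, key=position.get) + sorted(r_f, key=position.get)
        slots = sorted(position[n] for n in moved)
        for slot, node in zip(slots, moved):
            order[slot] = node
    succ.setdefault(x, []).append(y)
    pred.setdefault(y, []).append(x)

# Hypothetical adjacency lists for the example graph.
succ = {"U": ["W"], "W": ["X"], "Y": ["V"], "Z": ["T"], "V": ["S"], "X": ["S"]}
pred = {"W": ["U"], "X": ["W"], "V": ["Y"], "T": ["Z"], "S": ["V", "X"]}
order = ["U", "Y", "Z", "W", "V", "T", "X", "S"]
pk_insert(order, succ, pred, "X", "Y")
```

Only the four slots held by W, X, Y and V are rewritten; Z and T keep their positions.
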

Algorithm PK: complexity?
• PK needs at most ~O(δXY + E(δXY)) time per edge insertion
• In contrast, MNR needs O(ARXY + E(δXY)) time
• Thus, PK should win when δXY is much smaller than ARXY
• Definition: E(K) = { X→Y ∈ E | X∈K ∨ Y∈K }
• Algorithm AHRSZ is due to Alpern et al. [AHRSZ90]
• Worst-case time complexity ~O(Kmin + E(Kmin))
• The parameter Kmin is the minimal cover, which is no larger than δXY
• So a tighter bound than algorithm PK, but is it practical?
• Employs the Dietz and Sleator ordered list structure [DS87]
• Permits new priority values to be created in O(1) time
• But this is complex to implement and has relatively high overheads
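
To make the quantities in these bounds concrete, the sketch below counts them on a small example. It assumes that E(K) means the edges with at least one endpoint in K, and it uses a hypothetical edge set; the helper names are illustrative only.

```python
def incident_edges(edges, k):
    """E(K): edges with at least one endpoint in K (assumed reading)."""
    return {(a, b) for a, b in edges if a in k or b in k}

def extended_size(edges, k):
    """Size of K plus the number of edges incident to it."""
    return len(k) + len(incident_edges(edges, k))

# Hypothetical edges for the example ordering U Y Z W V T X S.
edges = [("U", "W"), ("W", "X"), ("Y", "V"), ("Z", "T"), ("V", "S"), ("X", "S")]

delta_xy = {"W", "X", "Y", "V"}            # R_B united with R_F
ar_xy = {"Y", "Z", "W", "V", "T", "X"}     # the whole affected region

pk_work = extended_size(edges, delta_xy)
```

On this example δXY has 4 nodes against 6 in ARXY; the gap is what PK exploits.
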

Experimental Study – Part I
• Experiment: measure the cost per insertion over 5000 insertions into a random DAG
• Both invalidating and non-invalidating insertions are included, to reflect actual performance

Experimental Study – Part II
• Experiment: measure the cost per insertion over 5000 insertions into a random DAG
• Both invalidating and non-invalidating insertions are included, to reflect actual performance

Conclusion
• Algorithm PK
  • Theoretical complexity marginally inferior to the AHRSZ algorithm
  • But the simplicity of PK yields a more efficient algorithm in practice
• Algorithm MNR
  • Worst theoretical complexity overall
  • However, outperforms the others on dense graphs
• Other Work …
  • Extended the algorithms to the incremental strongly connected components problem; see [PKH03a, Pea04]
  • Developed a batch variant of MNR; see [Pea04]
  • Obtains an O(b + v + e) bound on the time to insert a batch of b edges
  • In contrast, MNR/PK/AHRSZ need O(b(v+e)) worst-case time

References
• [MNR96] A. Marchetti-Spaccamela, U. Nanni and H. Rohnert, “Maintaining a topological order under edge insertions”, Information Processing Letters, 1996.
• [AHRSZ90] B. Alpern, R. Hoover, B.K. Rosen, P.F. Sweeney and F.K. Zadeck, “Incremental evaluation of computational circuits”, In Proc. ACM-SIAM Symposium on Discrete Algorithms, 1990.
• [PKH03a] D. J. Pearce, P. H.J. Kelly and C. Hankin, “Online Cycle Detection and Difference Propagation for Pointer Analysis”, In Proc. IEEE Workshop on Source Code Analysis and Manipulation, 2003.
• [Pea04] D. J. Pearce, “Some directed graph algorithms and their application to pointer analysis”, PhD Thesis, Imperial College.
www.doc.ic.ac.uk/~djp1, d.pearce@doc.ic.ac.uk

Q) Is δXY minimal?
• The answer is no. For example:
[Figure: example on nodes U, W, X, Y, shown before and after the insertion]
• Key point: U is not repositioned
• But under PK it would be, since U ∈ δXY (because it reaches X)
• Scope for an even better algorithm?