1 / 59

Worklist Algorithm for Flow Functions in Data Analysis

This algorithm initializes data, applies flow functions to nodes, and updates their values iteratively for a complete analysis. It ensures proper termination and structure within the domain.

krozier
Download Presentation

Worklist Algorithm for Flow Functions in Data Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Worklist algorithm • Initialize all di to the empty set • Store all nodes onto a worklist • while worklist is not empty: • remove node n from worklist • apply flow function for node n • update the appropriate di, and add nodes whose inputs have changed back onto worklist

  2. Worklist algorithm let m: map from edge to computed value at edge let worklist: work list of nodes for each edge e in CFG do m(e) := ; for each node n do worklist.add(n) while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length-1 do if (m(n.outgoing_edges[i])  info_out[i]) m(n.outgoing_edges[i]) := info_out[i]; worklist.add(n.outgoing_edges[i].dst);

  3. Termination • Why is termination important? • Can we stop the algorithm in the middle and just say we’re done... • No: we need to run it to completion, otherwise the results are not safe...

  4. Termination • Assuming we’re doing reaching defs, let’s try to guarantee that the worklist loop terminates, regardless of what the flow function F does while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length-1 do if (m(n.outgoing_edges[i])  info_out[i]) m(n.outgoing_edges[i]) := info_out[i]; worklist.add(n.outgoing_edges[i].dst);

  5. Termination • Assuming we’re doing reaching defs, let’s try to guarantee that the worklist loop terminates, regardless of what the flow function F does while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length-1 do let new_info := m(n.outgoing_edges[i]) [ info_out[i]; if (m(n.outgoing_edges[i])  new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);

  6. Structure of the domain • We’re using the structure of the domain outside of the flow functions • In general, it’s useful to have a framework that formalizes this structure • We will use lattices

  7. Background material

  8. Relations • A relation over a set S is a set R µ S £ S • We write a R b for (a,b) 2 R • A relation R is: • reflexive iff 8 a 2 S . a R a • transitive iff 8 a 2 S, b 2 S, c 2 S . a R b Æ b R c ) a R c • symmetric iff 8 a, b 2 S . a R b ) b R a • anti-symmetric iff 8 a, b, 2 S . a R b ):(b R a)

  9. Relations • A relation over a set S is a set R µ S £ S • We write a R b for (a,b) 2 R • A relation R is: • reflexive iff 8 a 2 S . a R a • transitive iff 8 a 2 S, b 2 S, c 2 S . a R b Æ b R c ) a R c • symmetric iff 8 a, b 2 S . a R b ) b R a • anti-symmetric iff 8 a, b, 2 S . a R b ):(b R a) 8 a, b, 2 S . a R b Æ b R a ) a = b

  10. Partial orders • An equivalence class is a relation that is: • A partial order is a relation that is:

  11. Partial orders • An equivalence class is a relation that is: • reflexive, transitive, symmetric • A partial order is a relation that is: • reflexive, transitive, anti-symmetric • A partially ordered set (a poset) is a pair (S,·) of a set S and a partial order · over the set • Examples of posets: (2S, µ), (Z, ·), (Z, divides)

  12. Lub and glb • Given a poset (S, ·), and two elements a 2 S and b 2 S, then the: • least upper bound (lub) is an element c such thata · c, b · c, and 8 d 2 S . (a · d Æ b · d) ) c · d • greatest lower bound (glb) is an element c such thatc · a, c · b, and 8 d 2 S . (d · a Æ d · b) ) d · c

  13. Lub and glb • Given a poset (S, ·), and two elements a 2 S and b 2 S, then the: • least upper bound (lub) is an element c such thata · c, b · c, and 8 d 2 S . (a · d Æ b · d) ) c · d • greatest lower bound (glb) is an element c such thatc · a, c · b, and 8 d 2 S . (d · a Æ d · b) ) d · c • lub and glb don’t always exists:

  14. Lub and glb • Given a poset (S, ·), and two elements a 2 S and b 2 S, then the: • least upper bound (lub) is an element c such thata · c, b · c, and 8 d 2 S . (a · d Æ b · d) ) c · d • greatest lower bound (glb) is an element c such thatc · a, c · b, and 8 d 2 S . (d · a Æ d · b) ) d · c • lub and glb don’t always exists:

  15. Lattices • A lattice is a tuple (S, v, ?, >, t, u) such that: • (S, v) is a poset • 8 a 2 S . ?v a • 8 a 2 S . a v> • Every two elements from S have a lub and a glb • t is the least upper bound operator, called a join • u is the greatest lower bound operator, called a meet

  16. Examples of lattices • Powerset lattice

  17. Examples of lattices • Powerset lattice

  18. Examples of lattices • Booleans expressions

  19. Examples of lattices • Booleans expressions

  20. Examples of lattices • Booleans expressions

  21. Examples of lattices • Booleans expressions

  22. End of background material

  23. Back to our example let m: map from edge to computed value at edge let worklist: work list of nodes for each edge e in CFG do m(e) := ; for each node n do worklist.add(n) while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length do let new_info := m(n.outgoing_edges[i]) [ info_out[i]; if (m(n.outgoing_edges[i])  new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);

  24. Back to our example • We formalize our domain with a powerset lattice • What should be top and what should be bottom?

  25. Back to our example • We formalize our domain with a powerset lattice • What should be top and what should be bottom? • Does it matter? • It matters because, as we’ve seen, there is a notion of approximation, and this notion shows up in the lattice

  26. Direction of lattice • Unfortunately: • dataflow analysis community has picked one direction • abstract interpretation community has picked the other • We will work with the abstract interpretation direction • Bottom is the most precise (optimistic) answer, Top the most imprecise (conservative)

  27. Direction of lattice • Always safe to go up in the lattice • Can always set the result to > • Hard to go down in the lattice • So ... Bottom will be the empty set in reaching defs

  28. Worklist algorithm using lattices let m: map from edge to computed value at edge let worklist: work list of nodes for each edge e in CFG do m(e) := ? for each node n do worklist.add(n) while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length do let new_info := m(n.outgoing_edges[i]) t info_out[i]; if (m(n.outgoing_edges[i])  new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);

  29. Termination of this algorithm? • For reaching definitions, it terminates... • Why? • lattice is finite • Can we loosen this requirement? • Yes, we only require the lattice to have a finite height • Height of a lattice: length of the longest ascending or descending chain • Height of lattice (2S, µ) =

  30. Termination of this algorithm? • For reaching definitions, it terminates... • Why? • lattice is finite • Can we loosen this requirement? • Yes, we only require the lattice to have a finite height • Height of a lattice: length of the longest ascending or descending chain • Height of lattice (2S, µ) = | S |

  31. Termination • Still, it’s annoying to have to perform a join in the worklist algorithm • It would be nice to get rid of it, if there is a property of the flow functions that would allow us to do so while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length do let new_info := m(n.outgoing_edges[i]) t info_out[i]; if (m(n.outgoing_edges[i])  new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);

  32. Even more formal • To reason more formally about termination and precision, we re-express our worklist algorithm mathematically • We will use fixed points to formalize our algorithm

  33. Fixed points • Recall, we are computing m, a map from edges to dataflow information • Define a global flow function F as follows: F takes a map m as a parameter and returns a new map m’, in which individual local flow functions have been applied

  34. Fixed points • We want to find a fixed point of F, that is to say a map m such that m = F(m) • Approach to doing this? • Define ?, which is ? lifted to be a map: ? =  e. ? • Compute F(?), then F(F(?)), then F(F(F(?))), ... until the result doesn’t change anymore

  35. Fixed points • Formally: • We would like the sequence Fi(?) for i = 0, 1, 2 ... to be increasing, so we can get rid of the outer join • Require that F be monotonic: • 8 a, b . a v b ) F(a) v F(b)

  36. Fixed points

  37. Fixed points

  38. Back to termination • So if F is monotonic, we have what we want: finite height ) termination, without the outer join • Also, if the local flow functions are monotonic, then global flow function F is monotonic

  39. Another benefit of monotonicity • Suppose Marsians came to earth, and miraculously give you a fixed point of F, call it fp. • Then:

  40. Another benefit of monotonicity • Suppose Marsians came to earth, and miraculously give you a fixed point of F, call it fp. • Then:

  41. Another benefit of monotonicity • We are computing the least fixed point...

  42. Recap • Let’s do a recap of what we’ve seen so far • Started with worklist algorithm for reaching definitions

  43. Worklist algorithm for reaching defns let m: map from edge to computed value at edge let worklist: work list of nodes for each edge e in CFG do m(e) := ; for each node n do worklist.add(n) while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length do let new_info := m(n.outgoing_edges[i]) [ info_out[i]; if (m(n.outgoing_edges[i])  new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);

  44. Generalized algorithm using lattices let m: map from edge to computed value at edge let worklist: work list of nodes for each edge e in CFG do m(e) := ? for each node n do worklist.add(n) while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length do let new_info := m(n.outgoing_edges[i]) t info_out[i]; if (m(n.outgoing_edges[i])  new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);

  45. Next step: removed outer join • Wanted to remove the outer join, while still providing termination guarantee • To do this, we re-expressed our algorithm more formally • We first defined a “global” flow function F, and then expressed our algorithm as a fixed point computation

  46. Guarantees • If F is monotonic, don’t need outer join • If F is monotonic and height of lattice is finite: iterative algorithm terminates • If F is monotonic, the fixed point we find is the least fixed point. • Any questions so far?

  47. What about if we start at top? • What if we start with >: F(>), F(F(>)), F(F(F(>)))

  48. What about if we start at top? • What if we start with >: F(>), F(F(>)), F(F(F(>))) • We get the greatest fixed point • Why do we prefer the least fixed point? • More precise

  49. Graphically y 10 10 x

  50. Graphically y 10 10 x

More Related