430 likes | 446 Views
Learn about relations, types of relations, anti-symmetry, equivalence classes, posets, bounds, lattices, and analysis on sets in Advanced Compilers course. Explore how to build program analyzers using sets, posets, and lattices.
E N D
CSE 231 : Advanced Compilers Building Program Analyzers
Foundations : Relations Relation R over set S is just a set of pairs from S: R ⊆S xS a R b means (a, b)∈R Example: < = { (a, b) | b – a is positive } a < b means (a, b)∈<
Foundations : Types of Relations • reflexive • ∀a ∈S . a R a • transitive • ∀a, b, c ∈S. a R b /\b R c a R c • symmetric • ∀a, b ∈S . a R b b R a • anti-symmetric • ∀a, b ∈S . a R b /\ b R a a = b
Foundations : Anti-Symmetry • anti-symmetric • ∀a, b ∈S . a R b /\ b R a a = b • Anti-symmetry is slightly weird. • Essentially: • “If you start at X and only follow R to new elements, you will never get back to X.”
Foundations : Types of Relations equivalence reflexive, transitive, symmetric defines equivalence classes partitions domain partial order reflexive, transitive, anti-symmetric defines a partial order on domain
Foundations : Posets partially ordered set (poset) defined by set S and partial order ·
Foundations : Posets partially ordered set (poset) defined by set S and partial order · Example: (2{x, y, z},⊆)
Foundations : Posets partially ordered set (poset) defined by set S and partial order · Examples: (2S, ⊆) (Z, <) (Z, divides)
Foundations : Least Upper Bounds Assume poset (S, ⊑) least upper bound (lub) of a and b is c where a ⊑c b ⊑c ∀d . (a ⊑d /\b ⊑d) c ⊑d Upper Least
Foundations : Greatest Lower Bounds Assume poset (S, ⊑) greatest lower bound (glb) of a and b is c where c ⊑a c ⊑b ∀d . (d ⊑a /\d ⊑b) d ⊑c Lower Greatest
Foundations : Bounds Assume poset (S, ⊑) Essentially: lub(a, b) = smallest thing bigger than a and b glb(a, b) = biggest thing smaller than a and b Question: Do lub and glb always exist?
Foundations : Bounds Assume poset (S, ⊑) Essentially: lub(a, b) = smallest thing bigger than a and b glb(a, b) = biggest thing smaller than a and b Question: Do lub and glb always exist? Answer: No!
Foundations : Bounds Question: Do lub and glb always exist? Answer: No! glb( , ) = ???
Foundations : Lattices A lattice is (S, ⊑, ⊥, ⊤, ⊔, ⊓) where: (S, ⊑) is a poset ⊥ is the smallest thing in S ⊤ is the biggest thing in S lub(a, b) and glb(a, b) always exist a⊔ b = lub(a, b) a⊓ b = glb(a, b)
Foundations : Lattices (Formally) A lattice is (S, ⊑, ⊥, ⊤, ⊔, ⊓) where: (S, ⊑) is a poset ∀ a ∈ S . ⊥ ⊑ a ∀ a ∈ S . a ⊑ ⊤ ∀ a, b ∈ S . ∃ c . c = lub(a, b) /\ a⊔ b = c ∀ a, b ∈ S . ∃ c . c = glb(a, b) /\ a⊓ b = c
Foundations :Fancy Lattice Names ⊥ is “botom” ⊤ is “top” ⊔ is “join” ⊓ is “meet”
Examples of lattices • Powerset lattice
Examples of lattices • Powerset lattice
Examples of lattices • Booleans expressions
Examples of lattices • Booleans expressions
Examples of lattices • Booleans expressions
Examples of lattices • Booleans expressions
Analysis on Sets let m: map from edge to computed value at edge let worklist: work list of nodes for each edge e in CFG do m(e) := ∅ for each node n do worklist.add(n) while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length do let new_info := m(n.outgoing_edges[i]) ∪ info_out[i]; if (m(n.outgoing_edges[i]) new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);
Port Analysis To Run On Lattices Formalize domain with a powerset lattice. What should top and bottom be?
Port Analysis To Run On Lattices Formalize domain with a powerset lattice. What should top and bottom be? Does it matter? Should we even care?
Port Analysis To Run On Lattices Formalize domain with a powerset lattice. What should top and bottom be? Does it matter? Should we even care? Yes. A notion of approximation shows up in the lattice.
Lattice Direction Unfortunate name clashes: dataflow analysis picked one direction abstract interpretation picked the other We work in the abstract interpretation direction: ⊥(bottom) is most precise (optimistic) ⊤(top) is most imprecise (conservative)
Lattice Direction Always safe to go up in the lattice. can always set the result to ⊤ (top) Hard to go down in the lattice. So: ⊥(bottom) will be empty set in reaching defns
Building an Analysis: worklist + lattice let m: map from edge to computed value at edge let worklist: work list of nodes for each edge e in CFG do m(e) := ⊥ for each node n do worklist.add(n) while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length do let new_info := m(n.outgoing_edges[i]) ⊔ info_out[i]; if (m(n.outgoing_edges[i]) new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);
Termination? For reaching definitions, it terminates. Why?
Termination? For reaching definitions, it terminates. Why? Because lattice is finite. Can we loosen this requirement? Yes, only require the lattice to have a finite height. Height of a lattice: length of longest ascending or descending chain
Termination? Height of lattice (2S, ⊆) = ???
Termination? Height of lattice (2S, ⊆) = | S |
Termination. But can we do better? Still annoying to perform join in the worklist algo: It would be nice to get rid of it. Is there a property of the flow functions that can help? while (worklist.empty.not) do let n := worklist.remove_any; let info_in := m(n.incoming_edges); let info_out := F(n, info_in); for i := 0 .. info_out.length do let new_info := m(n.outgoing_edges[i]) ⊔ info_out[i]; if (m(n.outgoing_edges[i]) new_info]) m(n.outgoing_edges[i]) := new_info; worklist.add(n.outgoing_edges[i].dst);
Crank Up the Formality… To reason even more precisely about termination, we port our worklist algorithm to math. Fixed points underlie our new algorithm.
Fixpoints What is a fixpoint?
Fixpoints What is a fixpoint? An input to a function that equals its output. So a fixpoint for f is any input x such that: f(x) = x Easy!
Fixpoints Goal: compute map m from CFG edges to dataflow information Strategy: define a global flow function F as follows: F takes a map m as a parameter and returns a new map m’, in which individual local flow functions have been applied
Fixpoints • Recall, we are computing m, a map from edges to dataflow information • Define a global flow function F as follows: F takes a map m as a parameter and returns a new map m’, in which individual local flow functions have been applied
Fixpoints • We want to find a fixed point of F, that is, a map m such that m = F(m) • Approach to doing this? • Define ⊥, which is ⊥ lifted to be a map: ⊥ = e. ⊥ • Compute F(⊥), then F(F(⊥)), then F(F(F(⊥))), ... until the result doesn’t change