Pointer and Shape Analysis Seminar http://www.cs.tau.ac.il/~msagiv/courses/shape.html Mooly Sagiv Schriber 317 msagiv@post Office Hours Thursday 15-16
General Information • Prerequisites • Compilers | Program Analysis • Select 3 topics by Sunday • Participate in 9 seminar talks • Present a paper
Outline • Schedule • Points-to analysis
Points-To Analysis • Determine whether a variable points to another variable on some (all) execution paths
[1] p = &a;
[2] q = &b;
[3] if (getc())
[4]     q = &c;
[5]
Resulting points-to graph: p → a, q → b, q → c
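The "some (all) execution paths" distinction can be made concrete on the example above. The following is a minimal sketch, assuming the two paths through the `if` are enumerated by hand; the names `path_taken`, `path_skipped`, `may_point_to` and `must_point_to` are illustrative and not from the slides.

```python
# Points-to pairs that hold at program point [5] on each path of the example.
path_taken = {("p", "a"), ("q", "c")}      # getc() is nonzero: [4] overwrites q
path_skipped = {("p", "a"), ("q", "b")}    # getc() is zero: [4] is skipped

may_point_to = path_taken | path_skipped    # pairs that hold on some path
must_point_to = path_taken & path_skipped   # pairs that hold on all paths

print(sorted(may_point_to))
print(sorted(must_point_to))
```

The union reproduces the three edges of the points-to graph, while the intersection keeps only the pair for p.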
Iterative Program Analysis • Start by optimistically assuming that nothing is wrong • No points-to pairs • At every iteration, apply the abstract meaning of the programming language statements and add more points-to pairs • Stop when no changes occur
Iterative Points-to Analysis
t = &a    {t→a}
y = &b    {t→a, y→b}
z = &c    {t→a, y→b, z→c}
*p = t    {t→a, y→b, z→c} (on both branches)
p = &y    {t→a, y→b, z→c, p→y}
p = &z    {t→a, y→b, z→c, p→z}
merge     {t→a, y→b, z→c, p→y, p→z}
Iterative Points-to Analysis
t = &a    {t→a}
y = &b    {t→a, y→b}
z = &c    {t→a, y→b, z→c, p→y, p→z}
*p = t    {t→a, y→b, z→c, p→y, p→z, y→a, z→a} (on both branches)
p = &y    {t→a, y→b, z→c, p→y}
p = &z    {t→a, y→b, z→c, p→z}
merge     {t→a, y→b, z→c, p→y, p→z}
Iterative Points-to Analysis
t = &a    {t→a}
y = &b    {t→a, y→b}
z = &c    {t→a, y→b, z→c, p→y, p→z}
*p = t    {t→a, y→b, z→c, p→y, p→z, y→a, z→a} (on both branches)
p = &y    {t→a, y→b, z→c, p→y, y→a, z→a}
p = &z    {t→a, y→b, z→c, p→z, y→a, z→a}
merge     {t→a, y→b, z→c, p→y, p→z}
Iterative Points-to Analysis
t = &a    {t→a}
y = &b    {t→a, y→b}
z = &c    {t→a, y→b, z→c, p→y, p→z, y→a, z→a}
*p = t    {t→a, y→b, z→c, p→y, p→z, y→a, z→a} (on both branches)
p = &y    {t→a, y→b, z→c, p→y, y→a, z→a}
p = &z    {t→a, y→b, z→c, p→z, y→a, z→a}
merge     {t→a, y→b, z→c, p→y, p→z, y→a, z→a}
A Simple Programming Language • Arbitrary (uninterpreted) control-flow statements • Atomic statements • x = y • x = &y • x = *y • *x = y
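Abstract Semantics • For every atomic statement S, $[\![S]\!]^\# : P(Var^* \times Var^*) \to P(Var^* \times Var^*)$
$[\![x := \&y]\!]^\#(pt) = (pt \setminus \{(x, *)\}) \cup \{(x, y)\}$
$[\![x := y]\!]^\#(pt) = (pt \setminus \{(x, *)\}) \cup \{(x, z) \mid (y, z) \in pt\}$
$[\![x := *y]\!]^\#(pt) = (pt \setminus \{(x, *)\}) \cup \{(x, z) \mid (y, w), (w, z) \in pt\}$
$[\![*x := y]\!]^\#(pt) = pt \cup \{(w, t) \mid (x, w), (y, t) \in pt\}$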
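The four abstract transformers can be written almost literally as set operations. This is a minimal sketch, assuming a points-to set is a Python set of (pointer, target) string pairs; the function names `kill`, `addr_of`, `copy`, `load` and `store` are hypothetical labels chosen for illustration.

```python
def kill(pt, x):
    """Remove every pair (x, *) from pt: the destructive part of assigning to x."""
    return {(a, b) for (a, b) in pt if a != x}

def addr_of(pt, x, y):
    """x := &y"""
    return kill(pt, x) | {(x, y)}

def copy(pt, x, y):
    """x := y"""
    return kill(pt, x) | {(x, z) for (w, z) in pt if w == y}

def load(pt, x, y):
    """x := *y"""
    return kill(pt, x) | {(x, z)
                          for (y2, w) in pt if y2 == y
                          for (w2, z) in pt if w2 == w}

def store(pt, x, y):
    """*x := y (nothing is killed, so old targets of the cells survive)"""
    return pt | {(w, t)
                 for (x2, w) in pt if x2 == x
                 for (y2, t) in pt if y2 == y}

# Small usage example: t = &a; y = &b; p = &y; *p = t; q = *p
pt = set()
pt = addr_of(pt, "t", "a")
pt = addr_of(pt, "y", "b")
pt = addr_of(pt, "p", "y")
pt = store(pt, "p", "t")
after_load = load(pt, "q", "p")
```

Note that `store` only adds pairs: after `*p = t` the pair (y, b) is still present next to (y, a), which is the imprecision that limited destructive updates refer to.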
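[Figure: control-flow graph of the running example (t = &a; y = &b; z = &c; *p = t; then a branch to p = &y or p = &z) with program points numbered 1 to 7]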
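The iterative flow-sensitive analysis of the running example can be sketched as a round-robin fixpoint computation. The sketch below rests on assumptions that are not stated in the slides: the statements are numbered 1 to 6 (this is a statement numbering, not the program-point numbering of the figure), and both branches are assumed to loop back to `*p = t`, which is how the traces earlier in the deck appear to propagate facts. The names `transfer`, `analyze`, `stmts` and `preds` are hypothetical.

```python
def kill(pt, x):
    """Remove every pair (x, *) from pt."""
    return frozenset((a, b) for (a, b) in pt if a != x)

def transfer(stmt, pt):
    """Abstract meaning of one statement applied to the incoming points-to set."""
    kind, x, y = stmt
    if kind == "addr":                       # x = &y
        return kill(pt, x) | {(x, y)}
    if kind == "store":                      # *x = y, no kill
        return pt | {(w, t)
                     for (a, w) in pt if a == x
                     for (b, t) in pt if b == y}
    raise ValueError(kind)

# Assumed CFG: 1 -> 2 -> 3 -> 4, 4 -> 5, 4 -> 6, and 5, 6 loop back to 4.
stmts = {
    1: ("addr", "t", "a"),
    2: ("addr", "y", "b"),
    3: ("addr", "z", "c"),
    4: ("store", "p", "t"),
    5: ("addr", "p", "y"),
    6: ("addr", "p", "z"),
}
preds = {1: [], 2: [1], 3: [2], 4: [3, 5, 6], 5: [4], 6: [4]}

def analyze(stmts, preds):
    """Start with empty sets everywhere and iterate until nothing changes."""
    out = {n: frozenset() for n in stmts}
    changed = True
    while changed:
        changed = False
        for n in sorted(stmts):
            incoming = frozenset().union(*(out[p] for p in preds[n]))
            new = transfer(stmts[n], incoming)
            if new != out[n]:
                out[n] = new
                changed = True
    return out

result = analyze(stmts, preds)
```

Each statement keeps its own set, so the pair (p, z) is absent after `p = &y` even though it is present after `*p = t`.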
Supporting Memory Allocation • Uniform treatment of the memory allocated at an allocation statement • For every atomic statement S, $[\![S]\!]^\# : P(Var^* \times Var^*) \to P(Var^* \times Var^*)$
• $[\![x := \&y]\!]^\#(pt) = (pt \setminus \{(x, *)\}) \cup \{(x, y)\}$
• $[\![x := y]\!]^\#(pt) = (pt \setminus \{(x, *)\}) \cup \{(x, z) \mid (y, z) \in pt\}$
• $[\![x := *y]\!]^\#(pt) = (pt \setminus \{(x, *)\}) \cup \{(x, z) \mid (y, w), (w, z) \in pt\}$
• $[\![*x := y]\!]^\#(pt) = pt \cup \{(w, t) \mid (x, w), (y, t) \in pt\}$
• $[\![l: x := malloc()]\!]^\#(pt) = (pt \setminus \{(x, *)\}) \cup \{(x, l)\}$
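To see what "uniform treatment" means in practice, the sketch below names every cell allocated at a statement by that statement's label. It assumes points-to sets are Python sets of pairs; the helper names and the label `"l1"` are hypothetical.

```python
def kill(pt, x):
    """Remove every pair (x, *) from pt."""
    return {(a, b) for (a, b) in pt if a != x}

def malloc(pt, x, site):
    """site: x := malloc(); the label of the statement stands for all its cells."""
    return kill(pt, x) | {(x, site)}

def copy(pt, x, y):
    """x := y"""
    return kill(pt, x) | {(x, z) for (w, z) in pt if w == y}

pt = set()
pt = malloc(pt, "x", "l1")   # l1: x = malloc()   (first loop iteration)
pt = copy(pt, "y", "x")      # y = x
pt = malloc(pt, "x", "l1")   # l1: x = malloc()   (second iteration, a new cell at run time)
```

At run time x and y now refer to different cells, but the analysis reports that both point to `l1`, so it cannot tell them apart.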
Summary: Flow-Sensitive Solution • Limited destructive updates • Can be improved with must information • O(N * Var²) space
Context-Sensitivity • How to handle procedures? • Separate points-to sets for every call (context-sensitive) • A uniform set for all calls (context-insensitive)
Context Sensitivity Example
x = &t1; a = &t2; foo(x, a);
z = &t3; b = &t4; foo(z, b);
void foo(source, target) { *source = target; }
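The difference between the two treatments of `foo` can be sketched by analyzing its body once per call versus once for the merged arguments. This is a rough illustration under simplifying assumptions: only the effect of `*source = target` is modeled, and the names `analyze_foo`, `calls`, `sensitive` and `insensitive` are hypothetical.

```python
def analyze_foo(source_targets, target_targets):
    """Effect of `*source = target`: each cell source may point to
    may now point to each cell target may point to."""
    return {(w, t) for w in source_targets for t in target_targets}

# (pt[source], pt[target]) at the two call sites foo(x, a) and foo(z, b).
calls = [({"t1"}, {"t2"}), ({"t3"}, {"t4"})]

# Separate points-to sets for every call.
sensitive = set()
for src, tgt in calls:
    sensitive |= analyze_foo(src, tgt)

# One uniform set for all calls: arguments of both calls are merged first.
all_src = set().union(*(src for src, _ in calls))
all_tgt = set().union(*(tgt for _, tgt in calls))
insensitive = analyze_foo(all_src, all_tgt)
```

The merged analysis adds the spurious pairs (t1, t4) and (t3, t2), which no execution can produce.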
Flow-Insensitive Analysis • Ignore control flow statements • Arbitrary statement order • Only accumulate points-to pairs • Usually represented as a directed graph • O(n²) space
Flow-Insensitive Solution • t = &a; y = &b; z = &c; *p = t; p = &y; p = &z
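A flow-insensitive solution for these six statements can be sketched by treating the program as an unordered bag of statements and accumulating pairs until nothing changes. The statement encoding and the name `flow_insensitive` are assumptions made for this illustration.

```python
def flow_insensitive(stmts):
    """Accumulate points-to pairs over all statements, ignoring their order."""
    pt = set()
    while True:
        new = set(pt)
        for kind, x, y in stmts:
            if kind == "addr":       # x = &y
                new.add((x, y))
            elif kind == "copy":     # x = y
                new |= {(x, z) for (w, z) in pt if w == y}
            elif kind == "load":     # x = *y
                new |= {(x, z) for (a, w) in pt if a == y
                               for (b, z) in pt if b == w}
            elif kind == "store":    # *x = y
                new |= {(w, t) for (a, w) in pt if a == x
                               for (b, t) in pt if b == y}
        if new == pt:                # no change: least solution reached
            return pt
        pt = new

program = [
    ("addr", "t", "a"), ("addr", "y", "b"), ("addr", "z", "c"),
    ("store", "p", "t"), ("addr", "p", "y"), ("addr", "p", "z"),
]
solution = flow_insensitive(program)
```

A single set is kept for the whole program, so nothing is ever removed and the result is the same for any ordering of `program`.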
Set Constraints • A set of rules of the form: • $lhs \subseteq rhs$ • $t \in rhs' \Rightarrow lhs \subseteq rhs$ (conditional constraint) • lhs, rhs, rhs' are variables over sets of terms • t is a term • The least solution can be found iteratively • start with empty sets • add terms when needed • Cubic graph-based solution
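Variables: p, t, y, z, a, b, c
t := &a;  $\{a\} \subseteq pt[t]$
y := &b;  $\{b\} \subseteq pt[y]$
z := &c;  $\{c\} \subseteq pt[z]$
if (nondet()) p := &y;  $\{y\} \subseteq pt[p]$
else p := &z;  $\{z\} \subseteq pt[p]$
*p := t;
$a \in pt[p] \Rightarrow pt[t] \subseteq pt[a]$
$b \in pt[p] \Rightarrow pt[t] \subseteq pt[b]$
$c \in pt[p] \Rightarrow pt[t] \subseteq pt[c]$
$y \in pt[p] \Rightarrow pt[t] \subseteq pt[y]$
$z \in pt[p] \Rightarrow pt[t] \subseteq pt[z]$
$t \in pt[p] \Rightarrow pt[t] \subseteq pt[t]$
$p \in pt[p] \Rightarrow pt[t] \subseteq pt[p]$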
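The least solution of these constraints can be found by the naive iteration described on the previous slide: start with empty sets and add terms when a constraint demands it. The sketch below is that naive loop, not the cubic graph-based algorithm; the tuple encoding of constraints and the name `solve` are assumptions.

```python
from collections import defaultdict

def solve(base, conditional):
    """base: (term, var) pairs meaning {term} is a subset of pt[var].
    conditional: (term, cond, sub, sup) tuples meaning
    term in pt[cond] implies pt[sub] is a subset of pt[sup]."""
    pt = defaultdict(set)
    for term, var in base:
        pt[var].add(term)
    changed = True
    while changed:
        changed = False
        for term, cond, sub, sup in conditional:
            if term in pt[cond] and not pt[sub] <= pt[sup]:
                pt[sup] |= pt[sub]      # add the missing terms
                changed = True
    return dict(pt)

variables = ["a", "b", "c", "y", "z", "t", "p"]
base = [("a", "t"), ("b", "y"), ("c", "z"), ("y", "p"), ("z", "p")]
# *p := t contributes one conditional constraint per variable v.
conditional = [(v, "p", "t", v) for v in variables]

least = solve(base, conditional)
```

Only the constraints for y and z ever fire, because those are the only terms that reach pt[p].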
Unification-Based Solution (Steensgaard 1996) • Treat assignments as equalities • Employ union-find algorithm • Almost linear time complexity
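The idea of treating assignments as equalities can be sketched with a small union-find structure in which every equivalence class has at most one pointee class. This is a simplified illustration, not Steensgaard's full algorithm: it unifies unconditionally and omits union by rank, which the almost-linear bound relies on. The class name `Unifier` and its method names are hypothetical.

```python
import itertools

class Unifier:
    def __init__(self):
        self.parent = {}                 # union-find forest
        self.pointee = {}                # class representative -> node it points to
        self._fresh = itertools.count()  # supply of placeholder nodes

    def find(self, n):
        self.parent.setdefault(n, n)
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]   # path halving
            n = self.parent[n]
        return n

    def target(self, n):
        """Class that n's class points to; a placeholder is created if none exists."""
        r = self.find(n)
        if r not in self.pointee:
            self.pointee[r] = ("tmp", next(self._fresh))
        return self.find(self.pointee[r])

    def join(self, a, b):
        """Merge two classes and, recursively, the classes they point to."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        pa, pb = self.pointee.get(a), self.pointee.pop(b, None)
        if pa is None and pb is not None:
            self.pointee[a] = pb
        elif pa is not None and pb is not None:
            self.join(pa, pb)

    def addr_of(self, x, y):   # x = &y
        self.join(self.target(x), y)

    def copy(self, x, y):      # x = y
        self.join(self.target(x), self.target(y))

    def load(self, x, y):      # x = *y
        self.join(self.target(x), self.target(self.target(y)))

    def store(self, x, y):     # *x = y
        self.join(self.target(self.target(x)), self.target(y))

u = Unifier()
u.addr_of("t", "a")
u.addr_of("y", "b")
u.addr_of("z", "c")
u.store("p", "t")
u.addr_of("p", "y")
u.addr_of("p", "z")
```

On this program the sketch places a, b and c in one class and y and z in another, so t is reported as possibly pointing to b and c as well. That is coarser than the inclusion-based result, where t points only to a.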
Conclusions • Points-to analysis is a simple pointer analysis problem • Effective solutions (programs of 8 MLOC) • But rather imprecise • Set constraints are useful beyond pointer analysis • Class level analysis