Annotations for (more) Precise Points-to Analysis
Mike Barnett – Microsoft Research, USA
Manuel Fähndrich – Microsoft Research, USA
Diego Garbervetsky – DC, FCEyN, UBA, Argentina
Francesco Logozzo – Microsoft Research, USA
IWACO’07
Original Motivation
• Objective
  • Have a points-to and effect analysis to reason (among other things) about (weak) purity in .NET programs
• Approach
  • Try to use the Salcianu-Rinard points-to analysis
• Problems
  • Its heap model supports neither struct types nor parameter passing by reference
  • It relies on having a complete call graph for the application
  • It is very conservative for non-analyzable methods
Motivating Example
    List<int> Copy(IEnumerable<int> src) {
        List<int> l = new List<int>();
        foreach (int x in src)
            l.Add(x);
        return l;
    }
Is Copy (weakly) pure?
The compiler expands the foreach loop into explicit iterator calls:
    List<int> Copy(IEnumerable<int> src) {
        List<int> l = new List<int>();
        IEnumerator<int> iter = src.GetEnumerator();
        while (iter.MoveNext()) {
            int x = iter.get_Current();
            l.Add(x);
        }
        return l;
    }
• It is difficult to predict the runtime types of src and iter
• We cannot predict the effect on src of the methods applied to iter
  • May GetEnumerator modify src?
  • May MoveNext or get_Current indirectly modify src?
Our work
• Interprocedural points-to and read/write effects analysis
  • Based on Salcianu’s points-to and purity analysis
• Supports some .NET features
  • Managed pointers, struct types
• Extended support for non-analyzable calls
  • A small annotation language
    • Represents points-to and effects information
    • Leverages some Spec# annotations (ownership)
• Implementation in the Spec# compiler
  • Used to infer/verify method purity
  • Reentrancy analysis
  • Checking the admissibility of specifications in the Boogie methodology
Salcianu’s analysis reminder
• Main abstraction: points-to graphs, PTG = <I, O, L, E>
• Models the part of the heap accessed by the method
    void m1(A p1, A p2) { p2.g = p1.f; }
• Inside nodes: objects allocated by m
• Load nodes: placeholders for unknown objects read from outside the scope of m
• Parameter nodes: represent the object(s) passed as parameters
• Inside edges: references created by m
• Outside edges: references read from outside the scope of m
• W: the set of write effects (n, field). For m1, W = {P2.g}
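The node and edge kinds listed above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class name, the node names (P1, P2, L1) and the `load`/`store` helpers are assumptions made for the example, and the E component (escaped nodes) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class PointsToGraph:
    inside_edges: set = field(default_factory=set)   # (src, field, dst) created by the method
    outside_edges: set = field(default_factory=set)  # (src, field, dst) read from outside its scope
    locals_map: dict = field(default_factory=dict)   # variable -> set of nodes
    write_effects: set = field(default_factory=set)  # (node, field)

    def load(self, dst, src, fld, label):
        """Transfer function for dst = src.fld."""
        targets = set()
        for n in self.locals_map[src]:
            targets |= {t for (s, f, t) in self.inside_edges if s == n and f == fld}
            # Parameter (P...) and load (L...) nodes may carry fields set outside
            # the method, so reading them yields a load node (hypothetical naming).
            if n.startswith(("P", "L")):
                load_node = f"L{label}"
                self.outside_edges.add((n, fld, load_node))
                targets.add(load_node)
        self.locals_map[dst] = targets

    def store(self, dst, fld, src):
        """Transfer function for dst.fld = src."""
        for n in self.locals_map[dst]:
            self.write_effects.add((n, fld))
            for t in self.locals_map[src]:
                self.inside_edges.add((n, fld, t))


def analyze_m1():
    """void m1(A p1, A p2) { p2.g = p1.f; }"""
    g = PointsToGraph()
    g.locals_map = {"p1": {"P1"}, "p2": {"P2"}}
    g.load("tmp", "p1", "f", 1)  # tmp = p1.f
    g.store("p2", "g", "tmp")    # p2.g = tmp
    return g
```

In this sketch the read of p1.f produces an outside edge to a load node, and the assignment produces an inside edge and a write effect on the node for p2.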
Salcianu’s analysis reminder
    void m0(A a2) { A a1 = new A(); B b = new B(); m1(a1, a2); }
    void m1(A p1, A p2) { p2.g = p1.f; }
• W: {a2.g}
• μ : Node → P(Node) relates every node n in the callee’s summary to a set of existing or fresh nodes in the caller (nodes(Pm) ∪ nodes(Pcallee))
• Fixpoint calculation
  • Match arguments with parameters
  • Match reads from the callee with writes of the caller (outside edge disambiguation)
• What happens if m1 is not analyzable?
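A simplified sketch of the mapping μ as a fixpoint, assuming edges are (source, field, target) triples. It covers only the two matching rules named above (arguments with parameters, callee reads with caller writes). The function names and the node names are hypothetical, and the full analysis has further rules that are omitted here.

```python
def compute_mu(callee_nodes, callee_outside, caller_inside, params_to_args):
    """Least fixpoint of mu: callee node -> set of caller nodes."""
    mu = {n: set() for n in callee_nodes}
    for param, arg_nodes in params_to_args.items():
        mu[param] |= arg_nodes            # match arguments with parameters
    for n in callee_nodes:
        if n not in params_to_args:
            mu[n].add(n)                  # other nodes also stand for themselves
    changed = True
    while changed:
        changed = False
        # A callee read n1.f -> n2 matches a caller write n3.f -> n4 when n3 is in mu(n1).
        for (n1, f, n2) in callee_outside:
            for (n3, g, n4) in caller_inside:
                if f == g and n3 in mu[n1] and n4 not in mu[n2]:
                    mu[n2].add(n4)
                    changed = True
    return mu


def apply_summary(mu, callee_inside, callee_writes):
    """Translate the callee's inside edges and write effects into caller nodes."""
    edges = {(a, f, b) for (n1, f, n2) in callee_inside for a in mu[n1] for b in mu[n2]}
    writes = {(a, f) for (n, f) in callee_writes for a in mu[n]}
    return edges, writes


def bind_m1_in_m0(caller_inside=frozenset()):
    """m0 calls m1(a1, a2): a1 is a fresh object (I_a1), a2 is m0's parameter (P_a2)."""
    mu = compute_mu(
        callee_nodes={"P1", "P2", "L1"},
        callee_outside={("P1", "f", "L1")},
        caller_inside=set(caller_inside),
        params_to_args={"P1": {"I_a1"}, "P2": {"P_a2"}},
    )
    return apply_summary(mu, {("P2", "g", "L1")}, {("P2", "g")})
```

With no prior writes in the caller, the load node stays unresolved and the write effect lands on the node for a2. If the caller had already stored something into a1.f, the same fixpoint would also map the load node to that stored object.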
First extension
• A new set of nodes: address nodes
  • A new level of indirection
  • Every variable or field is represented by its address
• For objects and primitive values, an outgoing edge (labeled *) means “the contents of”
• For struct types, the outgoing edges are fields
    void m1(A p1, A p2) { p2.g = p1.f; }
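The extra level of indirection can be sketched as a graph whose edges are labeled either with a field name or with "*". This is an illustrative model only: the class, its methods and the `&x` naming of address nodes are assumptions, and updates are weak (they accumulate targets).

```python
class AddressGraph:
    """Edges are labeled with a field name or "*" (the contents of an address)."""

    def __init__(self):
        self.edges = {}  # (source node, label) -> set of target nodes

    def add(self, src, label, dst):
        self.edges.setdefault((src, label), set()).add(dst)

    def targets(self, srcs, label):
        out = set()
        for s in srcs:
            out |= self.edges.get((s, label), set())
        return out

    def field_addresses(self, var, fld, is_struct=False):
        """Address nodes denoted by the expression var.fld."""
        base = {f"&{var}"}
        if not is_struct:
            # Reference type: follow "*" from the variable's address to the object first.
            base = self.targets(base, "*")
        # Struct type: the field edges leave the variable's address node directly.
        return self.targets(base, fld)

    def assign_field(self, dst_var, dst_fld, src_var, src_fld):
        """dst_var.dst_fld = src_var.src_fld, as a weak (accumulating) update."""
        values = self.targets(self.field_addresses(src_var, src_fld), "*")
        for addr in self.field_addresses(dst_var, dst_fld):
            for v in values:
                self.add(addr, "*", v)


def build_m1():
    """void m1(A p1, A p2) { p2.g = p1.f; } with explicit address nodes."""
    g = AddressGraph()
    for p, obj in (("p1", "P1"), ("p2", "P2")):
        g.add(f"&{p}", "*", obj)          # the variable's address holds a reference
    g.add("P1", "f", "&P1.f")             # field f of P1 has its own address node
    g.add("&P1.f", "*", "L1")             # unknown contents: a load node
    g.add("P2", "g", "&P2.g")
    g.assign_field("p2", "g", "p1", "f")  # p2.g = p1.f
    return g
```

Reading a field of a reference-typed variable takes two hops (dereference, then field), while a struct-typed variable reaches its fields in one hop.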
Second extension
• Annotations to improve the precision of the analysis of non-analyzable calls
• Approach
  • Add a few annotations to non-analyzable methods (points-to and effect information)
  • Compute a PTG for every annotated method
    • We need to enrich the PTG model
  • Treat every method as analyzable
  • When the method code is available, check it against the annotations
Why annotations?
• Useful for interfaces and abstract methods
  • There is no source code
  • Virtual calls
  • Documentation
  • Impose restrictions on implementing classes
• Useful for native code and third-party libraries
  • Source not available
• Useful for local reasoning in program analysis
  • Used as summaries for method calls
  • Assumed valid for interprocedural analysis
  • The callee will eventually be checked
Iterator sample revisited
    int result = 0;
    it = p1.GetEnumerator();
    while (it.MoveNext())
        result += it.Current;
    return result;
Annotations:
    [Pure] [Fresh] [Escapes(true)] [GlobalAccess(false)]
    IEnumerator GetEnumerator();
    bool MoveNext();
Iterator sample revisited
    int result = 0;
    it = p1.GetEnumerator();
    while (it.MoveNext())
        result += it.Current;
    return result;
Annotations:
    [Pure] [GlobalAccess(false)] [Escapes(true)] [Fresh]
    IEnumerator GetEnumerator();
    bool MoveNext();
• Omega node: represents the set of objects reachable from the object(s) represented by that node
Iterator sample revisited
    int result = 0;
    it = p1.GetEnumerator();
    while (it.MoveNext())
        result += it.Current;
    return result;
Annotations:
    [Pure] [GlobalAccess(false)] [Escapes(true)] [Fresh]
    IEnumerator GetEnumerator();
    [GlobalAccess(false)]
    bool MoveNext();
GetEnumerator revisited
    int result = 0;
    it = p1.GetEnumerator();
    while (it.MoveNext())
        result += it.Current;
    return result;
Annotations:
    [Pure] [GlobalAccess(false)] [Escapes(true)] [Captures(false)]
    IEnumerator GetEnumerator();
    [GlobalAccess(false)]
    bool MoveNext();
• $ fields: read-only references through fields that are not owned
Iterator sample revisited
    int result = 0;
    it = p1.GetEnumerator();
    while (it.MoveNext())
        result += it.Current;
    return result;
Annotations:
    [Pure] [GlobalAccess(false)] [Escapes(true)] [Captures(false)]
    IEnumerator GetEnumerator();
    [WriteConfined]
    bool MoveNext();
• $ fields: read-only references through fields that are not owned
• C: denotes all nodes reachable within its ownership domain
The Annotation Language
• Fresh: for out parameters and the return value
  • It is a newly created object
• Write: for parameters
  • The method may write objects reachable from the parameter
• WriteConfined: for parameters
  • The method may write objects reachable from the parameter, but only within its ownership domain
• Escapes(bool): for parameters
  • The method may create a link between objects reachable from the parameter and other objects reachable from the return value or another parameter
• Captures(bool): for parameters
  • Will the escaping parameter be owned by some callee argument?
• GlobalRead/GlobalWrite(bool): for methods
  • Does the method read or write a global?
• WriteConfined: for methods
  • The method mutates only the objects owned by its parameters
• Pure: for methods
  • The method cannot mutate the pre-state unless allowed (out parameters)
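To show how a client analysis might consume these annotations, here is a minimal sketch that records them as plain data and answers two questions about a call. The class and function names are hypothetical and do not correspond to the Spec# attribute classes. Every annotation is stated explicitly, because the defaults for unannotated members are not given here.

```python
from dataclasses import dataclass
from enum import Enum


class WriteKind(Enum):
    NONE = "none"          # parameter is not written
    CONFINED = "confined"  # WriteConfined: writes stay within the ownership domain
    ANY = "any"            # Write: anything reachable may be written


@dataclass
class ParamAnn:
    write: WriteKind
    escapes: bool
    captures: bool = False


@dataclass
class MethodAnn:
    pure: bool
    global_write: bool
    fresh_return: bool
    params: dict  # parameter name -> ParamAnn


def may_mutate_caller_state(m):
    """True if the annotations allow the call to change state the caller can see."""
    if m.pure:
        return False
    if m.global_write:
        return True
    return any(p.write is not WriteKind.NONE for p in m.params.values())


def params_reachable_from_result(m):
    """Parameters that the return value may be linked to after the call."""
    return {name for name, p in m.params.items() if p.escapes}
```

For a GetEnumerator-like method marked pure and fresh with an escaping receiver, the first query answers no and the second reports the receiver, which matches the intuition that the new iterator may keep a reference to the collection.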
Conclusions
• We extend an existing points-to analysis
  • To support the .NET memory model
  • To improve precision for non-analyzable calls
    • Using omega nodes and a small annotation language
• We find the annotations useful as documentation for the methods
• Leverages some existing Spec# annotations
  • Pure, ownership
• Initial experiments are encouraging
  • Interesting improvements in precision (purity, aliasing)
Future Work
• Integration with Spec#
  • Generate and use modifies and reads clauses
• Improve precision
  • Recompute $ and ? fields when more information becomes available
  • Use type information as annotations to reduce potential aliasing
• Use omega nodes to “abstract” the points-to graph (scalability)
Additional Slides
• About annotations
• Omega nodes
  • Motivation, definition, integration with annotations
• Experiments
• PT analysis step by step
About annotations
    void m(A p1, A p2) {
        A a;
        A b = m2(p1, p2);
        b.v = 20;
        A c = new A();
        m3(c);
    }
    [Pure] A m2(A p1, A p2) { return p2; }
    void m3(A p2) { p2.v = 0; }
• The [Pure] annotation does not always carry enough information
  • Pure methods can still impact the caller
• A pure method can:
  • Return a non-fresh value (a global variable, a parameter)
  • Make a parameter or a global reachable from outside
• On the other hand, a caller can be pure even if the callee is not
PTG for non-analyzable calls
• Even with annotations, we do not have total control over method behavior
  • If we allow writing to a parameter, how deep may the accesses to its contents go?
• Salcianu’s analysis generates the interprocedural mapping by matching operations on the callee side with operations on the caller side
  • One-to-one traversal of both graphs
Some problems
• Suppose we want to model that an unknown callee might write any object reachable from a parameter
  • p1.f1 = 0;
  • p1.f1.f2 = 0;
  • p1.f1.f2….fn = 0;
• Attempt 1: mark the effect directly on all nodes reachable from the caller
  • Problem: when binding caller with callee, we may not have enough context…
    • M1: m2(a1);
    • The effect of the callee must persist in the caller’s context
• Attempt 2: add to the callee PTG the nodes/edges corresponding to each potential access
  • Problem: there may be infinitely many
Omega nodes
• Omega nodes represent every object reachable from that node
• ? fields represent any field
• When computing μ (binding time):
  • If an omega node appears in μ(n), then add all nodes reachable from n to μ(n)
  • If any of them is a load node, convert it into an omega node
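The binding rule can be sketched under one reading of it: for a callee omega node, μ is closed under reachability in the caller's graph, and any load node picked up along the way is promoted to an omega node. The function name, the node names and this reading are assumptions of the sketch.

```python
def bind_omega(mu, caller_edges, omega_nodes, load_nodes):
    """Close mu(n) under caller reachability for every omega node n.

    mu: callee node -> set of caller nodes
    caller_edges: set of (src, field, dst)
    Returns the extended mapping and the caller load nodes to turn into omega nodes.
    """
    new_mu = {n: set(t) for n, t in mu.items()}
    promoted = set()
    for n in omega_nodes:
        closure = set(new_mu.get(n, ()))
        work = list(closure)
        while work:
            cur = work.pop()
            for (src, _field, dst) in caller_edges:
                if src == cur and dst not in closure:
                    closure.add(dst)
                    work.append(dst)
        new_mu[n] = closure
        promoted |= closure & load_nodes
    return new_mu, promoted
```

Because the closure is computed over the caller's finite graph, this avoids enumerating the unbounded access paths p1.f1.f2….fn on the callee side.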
Annotations and Omega Nodes
• Generate a conservative PTG for non-analyzable calls, using omega nodes for the parameters and “?” edges between them
• Clean the PTG using the information provided by annotations
  • Remove ? edges (or replace them with $ edges)
  • Add omega-confined nodes
  • Add inside nodes (fresh returns)
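The two steps above can be sketched as a worst-case summary followed by a cleaning pass. The concrete cleaning rules below are assumptions chosen for illustration (for example, that Captures(false) turns ? edges into $ edges, and that an unwritten parameter gains no new outgoing edges), and the function and node names are hypothetical.

```python
def conservative_summary(params):
    """Worst case: every parameter and the return value may reach each other."""
    nodes = {p: f"omega({p})" for p in params}
    nodes["ret"] = "omega(ret)"
    edges = {(a, "?", b) for a in nodes.values() for b in nodes.values() if a != b}
    writes = {(nodes[p], "?") for p in params}
    return nodes, edges, writes


def clean(nodes, edges, writes, ann):
    """ann: {'fresh_return': bool, 'params': {name: {'escapes', 'captures', 'write'}}}"""
    nodes, edges, writes = dict(nodes), set(edges), set(writes)

    def rename(old, new):
        nonlocal edges, writes
        swap = lambda x: new if x == old else x
        edges = {(swap(s), f, swap(d)) for (s, f, d) in edges}
        writes = {(swap(n), f) for (n, f) in writes}

    for p, a in ann["params"].items():
        n = nodes[p]
        if not a["escapes"]:
            edges = {e for e in edges if n not in (e[0], e[2])}  # no links at all
        elif not a["captures"]:
            edges = {(s, "$", d) if n in (s, d) else (s, f, d) for (s, f, d) in edges}
        if a["write"] == "none":
            writes.discard((n, "?"))
            edges = {e for e in edges if e[0] != n}  # nothing new is stored into it
        elif a["write"] == "confined":
            nodes[p] = f"omegaC({p})"                # omega-confined node
            rename(n, nodes[p])
    if ann["fresh_return"]:
        rename(nodes["ret"], "inside(ret)")          # fresh return: an inside node
        nodes["ret"] = "inside(ret)"
    return nodes, edges, writes
```

For a receiver that escapes, is not captured and is not written, with a fresh return value, the cleaned summary keeps a single $ edge from the new object to the receiver and no write effects.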
First extension - Language
• We define a subset of an IL-like language that includes managed pointer support
• Necessary for parameter passing by reference and for dealing with struct types
    a = &b
    d = a.f1
First extension
• A new set of nodes: address nodes
  • A new level of indirection
  • Every variable or field is represented by its address
• For objects and primitive values, an outgoing edge (labeled *) means “the contents of”
• For struct types, the outgoing edges are fields
    void m1(A p1, A p2) { p2.f1 = p1.f2; }
Dealing with struct types
• Struct types have value semantics
  • Treated as values
• This has an impact on
  • Assignments
  • Parameter passing
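A small sketch of why value semantics matters for assignments: assigning a reference aliases the same object, while assigning a struct copies its fields. The environment layout and function names are hypothetical, and the copy is shallow, so nested structs are not modeled.

```python
def assign_reference(env, dst, src):
    """Reference types: both variables denote the same object afterwards."""
    env[dst] = env[src]


def assign_struct(env, dst, src):
    """Struct types: the fields are copied, so later writes do not alias."""
    env[dst] = dict(env[src])


def demo():
    env = {"obj": {"f": "n1"}, "st": {"f": "n1"}}
    assign_reference(env, "obj2", "obj")
    assign_struct(env, "st2", "st")
    env["obj2"]["f"] = "n2"  # visible through obj as well
    env["st2"]["f"] = "n2"   # st keeps its own copy
    return env["obj"]["f"], env["st"]["f"]
```

Passing a struct by value behaves like `assign_struct` applied to the formal parameter, which is why the analysis must treat the two kinds of types differently.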
P&PEA intraproc
    void m2(A a) { a = this; D d = new D(); a.f = d; }
P&PEA intraproc
    void m2(A a) { a = this; D d = new D(); a.f = d; }
Write Effects: [PLN.this].f
• this.f
• a.f
P&PEA intraproc: Summary for m2
    void m2(A a) { a = this; D d = new D(); a.f = d; }
Write Effects: [PLN.this].f
• this.f
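The step-by-step run on m2 can be replayed with three transfer functions, one per statement. This is a sketch only: the node names PLN.this, PLN.a and I_D are illustrative labels, and all updates are flow-sensitive on local variables.

```python
def analyze_m2():
    """void m2(A a) { a = this; D d = new D(); a.f = d; }"""
    locals_map = {"this": {"PLN.this"}, "a": {"PLN.a"}}
    inside_edges, write_effects = set(), set()

    # a = this: the local a now points to whatever this points to.
    locals_map["a"] = set(locals_map["this"])

    # D d = new D(): a fresh inside node for the allocation site.
    locals_map["d"] = {"I_D"}

    # a.f = d: one inside edge and one write effect per node a may denote.
    for n in locals_map["a"]:
        write_effects.add((n, "f"))
        for t in locals_map["d"]:
            inside_edges.add((n, "f", t))

    return locals_map, inside_edges, write_effects
```

Since a was reassigned before the store, the only write effect is on the node for this, so the original argument object passed as a is left untouched in the summary.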