1 / 41

Pointer Analysis

Pointer Analysis. G. Ramalingam Microsoft Research, India. A Constant Propagation Example. x = 3; y = 4; z = x + 5;. x is always 3 here can replace x by 3 and replace x+5 by 8 and so on. A Constant Propagation Example With Pointers. x = 3; *p = 4; z = x + 5;.

lars-sharp
Download Presentation

Pointer Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Pointer Analysis G. Ramalingam Microsoft Research, India

  2. A Constant Propagation Example x = 3; y = 4; z = x + 5; • x is always 3 here • can replace x by 3 • and replace x+5 by 8 • and so on

  3. A Constant Propagation ExampleWith Pointers x = 3; *p = 4; z = x + 5; • Is x always 3 here?

  4. A Constant Propagation ExampleWith Pointers p = &y; x = 3; *p = 4; z = x + 5; if (?) p = &x; else p = &y; x = 3; *p = 4; z = x + 5; p = &x; x = 3; *p = 4; z = x + 5; pointers affect most program analyses x is always 3 x is always 4 x may be 3 or 4 (i.e., x is unknown in our lattice)

  5. A Constant Propagation ExampleWith Pointers p = &y; x = 3; *p = 4; z = x + 5; if (?) p = &x; else p = &y; x = 3; *p = 4; z = x + 5; p = &x; x = 3; *p = 4; z = x + 5; p always points-to y p always points-to x p may point-to x or y

  6. Points-to Analysis • Determine the set of targets a pointer variable could point-to (at different points in the program) • “p points-to x” == “*p has l-value x” • targets could be variables or locations in the heap (dynamic memory allocation) • p = &x; • p = new Foo(); or p = malloc (…); • must-point-to vs. may-point-to

  7. A Constant Propagation ExampleWith Pointers Can *p denote the same location as *q? *q = 3; *p = 4; z = *q + 5; what values can this take?

  8. More Terminology • *p and *q are said to be aliases (in a given concrete state) if they represent the same location • Alias analysis • determine if a given pair of references could be aliases at a given program point • *pmay-alias *q • *pmust-alias*q

  9. Pointer Analysis • Points-To Analysis • may-point-to • must-point-to • Alias Analysis • may-alias • must-alias

  10. Points-To Analysis:A Simple Example p = &x; q = &y; if (?) { q = p; } x = &a; y = &b; z = *q;

  11. Points-To Analysis:A Simple Example p = &x; q = &y; if (?) { q = p; } x = &a; y = &b; z = *q;

  12. Points-To Analysis x = &a; y = &b; if (?) { p = &x; } else { p = &y; } *x = &c; *p = &c; x: a x: {a,c} x: a y: {b,c} y: b y: b p: {x,y} p: {x,y} p: {x,y} a: c a: c a: null How should we handle this statement? Weak update Weak update Strong update

  13. Questions • When is it correct to use a strong update? A weak update? • Is this points-to analysis precise? • What does it mean to say • p must-point-to x at pgm point u • p may-point-to x at pgm point u • p must-not-point-to x at u • p may-not-point-to x at u

  14. Points-To Analysis, Formally • We must formally define what we want to compute before we can answer many such questions

  15. Static Program Analysis • A static program analysis computes approximateinformation about the runtime behavior of a given program • The set of valid programs is defined by the programming language syntax • The runtime behavior of a given program is defined by the programming language semantics • The analysis problem defines whatinformation is desired • The analysis algorithm determines what approximation to make

  16. Programming Language: Syntax • A program consists of • a set of variables Var • a directed graph (V,E,entry) with a distinguished entry vertex, with every edge labelled by a primitive statement • A primitive statement is of the form • x = null • x = y • x = *y • x = &y; • *x = y • skip (where x and y are variables in Var) • Omitted (for now) • Dynamic memory allocation • Pointer arithmetic • Structures and fields • Procedures

  17. Example Program 1 Vars = {x,y,p,a,b,c} x = &a; y = &b; if (?) { p = &x; } else { p = &y; } *x = &c; *p = &c; x = &a 2 y = &b 3 p = &x p = &y 4 5 skip skip 6 *x = &c 7 *p = &c 8

  18. Programming Language:Operational Semantics • Operational semantics == an interpreter (defined mathematically) • State • Data-State ::= Var -> (Var U {null}) • PC ::= V (the vertex set of the CFG) • Program-State ::= PC x Data-State • Initial-state: • (entry, \x. null)

  19. Example States Vars = {x,y,p,a,b,c} Initial data-state 1 x: N, y:N, p:N, a:N, b:N, c:N x = &a 2 Initial program-state y = &b x: N, y:N, p:N, a:N, b:N, c:N <1, > 3 p = &x p = &y 4 5 skip skip 6 *x = &c 7 *p = &c 8

  20. Programming Language:Operational Semantics • Meaning of primitive statements • CS[stmt] : Data-State -> Data-State • CS[ x = y ] s = s[x s(y)] • CS[ x = *y ] s = s[x s(s(y))] • CS[ *x = y ] s = s[s(x) s(y)] • CS[ x = null ] s = s[x null] • CS[ x = &y ] s = s[x y] must say what happens if null is dereferenced

  21. Programming Language:Operational Semantics • Meaning of program • a transition relation  on program-states • Program-State X Program-State • state1state2means that the execution of some edge in the program can transform state1 into state2 • Defining  • (u,s) (v,s’) iff the program contains a control-flow edge u->vlabelled with a statement stmt such that M[stmt]s = s’

  22. Programming Language:Operational Semantics • A sequence of states s1s2 … snis said to be an execution (of the program) iff • s1is the Initial-State • sisi+1for 1 <= I < n • A state s is said to be a reachable stateiff there exists some execution s1s2 … snis such that sn= s. • Define RS(u) = { s | (u,s) is reachable }

  23. Programming Language:Operational Semantics • A sequence of states s1s2 … snis said to be an execution (of the program) iff • s1 is the Initial-State • si| si+1 for 1 <= I < n • A state s is said to be a reachable state iff there exists some execution s1s2 … snis such that sn= s. • Define RS(u) = { s | (u,s) is reachable } All of this formalism for this one definition

  24. Ideal Points-To Analysis:Formal Definition • Let u denote a vertex in the CFG • Define IdealMustPT (u) to be { (p,x) | forall s in RS(u). s(p) == x } • Define IdealMayPT (u) to be { (p,x) | exists s in RS(u). s(p) == x }

  25. May-Point-To Analysis:Formal Requirement Specification May Point-To Analysis Compute R: V -> 2Vars’ such that R(u) IdealMayPT(u) (where Var’ = Var U {null}) For every vertex u in the CFG, compute a set R(u) such that R(u) { (p,x) | $sÎRS(u). s(p) == x }

  26. May-Point-To Analysis:Formal Requirement Specification • An algorithm is said to be correct if the solution R it computes satisfies "uÎV. R(u) IdealMayPT(u) • An algorithm is said to be precise if the solution R it computes satisfies "uÎV. R(u)=IdealMayPT(u) • An algorithm that computes a solution R1 is said to be more precise than one that computes a solution R2 if "uÎV. R1(u) ÍR2(u) Compute R: V -> 2Vars’ such that R(u) IdealMayPT(u)

  27. Back To OurMay-Point-To Algorithm p = &x; q = &y; if (?) { q = p; } x = &a; y = &b; z = *q;

  28. (May-Point-To Analysis)Algorithm A • Is this algorithm correct? • Is this algorithm precise? • Let’s first completely and formally define the algorithm.

  29. Algorithm A: A Formal DefinitionThe “Data Flow Analysis” Recipe • Define semi-lattice of abstract-values • AbsDataState ::= Var -> 2Var’ • f1 f2 = \x. (f1 (x) È f2 (x)) • bottom = \x.{} • Define initial abstract-value • InitialAbsState = \x. {null} • Define transformers for primitive statements • AS[stmt] : AbsDataState -> AbsDataState

  30. Algorithm A: A Formal DefinitionThe “Data Flow Analysis” Recipe • Let st(v,u) denote stmt on edge v->u • Compute the least-fixed-point of the following “dataflow equations” • x(entry) = InitialAbsState • x(u) = v->u AS(st(v,u)) x(v) x(v) x(w) v w st(v,u) st(w,u) u x(u)

  31. Algorithm A:The Transformers • Abstract transformers for primitive statements • AS[stmt] : AbsDataState -> AbsDataState • AS[ x = y ] s = s[x s(y)] • AS[ x = null ] s = s[x {null}] • AS[ x = &y ] s = s[x {y}] • AS[ x = *y ] s = s[x s*(s(y))] where s*({v1,…,vn}) = s(v1) È … È s(vn) • AS[ *x = y ] s = ???

  32. Correctness & Precision • We have a complete & formal definition of the problem. • We have a complete & formal definition of a proposed solution. • How do we reason about the correctness & precision of the proposed solution?

  33. Enter: The French Recipe(Abstract Interpretation) • Concrete Domain • Concrete states: C • Semantics: For every statement st, • CS[st] : C -> C • AbstractDomain • Asemi-lattice(A, b) • TransferFunctions • For every statement st, • AS[st] : A -> A • Concrete (Collecting) Domain • Asemi-lattice(2C, b) • TransferFunctions • For every statement st, • CS*[st] : 2C -> 2C a g 2Data-State 2Var x Var’

  34. Points-To Analysis(Abstract Interpretation) a(Y) = { (p,x) | exists s in Y. s(p) == x } MayPT(u) a Í a RS(u) IdealMayPT(u) 2Data-State 2Var x Var’ IdealMayPT (u) = a ( RS(u) )

  35. Approximating Transformers:Correctness Criterion c is said to be correctly approximated by a iff a(c) Ía c1 correctly approximated by a1 f c2 f# correctly approximated by a2 C A

  36. Approximating Transformers:Correctness Criterion concretization g c1 a1 f abstraction a c2 f# a2 requirement: f#(a1) ≥a (f( g(a1)) C A

  37. Concrete Transformers • CS[stmt] : Data-State -> Data-State • CS[ x = y ] s = s[x s(y)] • CS[ x = *y ] s = s[x s(s(y))] • CS[ *x = y ] s = s[s(x) s(y)] • CS[ x = null ] s = s[x null] • CS*[stmt] : 2Data-State -> 2Data-State • CS*[st] X = { CS[st]s | s Î X }

  38. Abstract Transformers • AS[stmt] : AbsDataState -> AbsDataState • AS[ x = y ] s = s[x s(y)] • AS[ x = null ] s = s[x {null}] • AS[ x = *y ] s = s[x s*(s(y))] where s*({v1,…,vn}) = s(v1) È … È s(vn) • AS[ *x = y ] s = ???

  39. Algorithm A: TranformersWeak/Strong Update x: &y y: &x z: &a g x: {&y} y: {&x,&z} z: {&a} x: &y y: &z z: &a f f# *y = &b; *y = &b; x: &b y: &x z: &a x: {&y,&b} y: {&x,&z} z: {&a,&b} x: &y y: &z z: &b a

  40. Algorithm A: TranformersWeak/Strong Update x: &y y: &x z: &a g x: {&y} y: {&x,&z} z: {&a} x: &y y: &z z: &a f f# *x = &b; *x = &b; x: &y y: &b z: &a x: {&y} y: {&b} z: {&a} x: &y y: &b z: &a a

  41. Algorithm A: TransformersWeak/Strong Update • Transformer for “*p = q”

More Related