3-Valued Logic Analyzer (TVP) Part II. Tal Lev-Ami and Mooly Sagiv. Outline. The Shape Analysis Problem Solving Shape Analysis with TVLA Structural Operational Semantics Predicate logic Embedding (Imprecise) Abstract Interpretation Instrumentation Predicates Focus Coerce Bibliography.
3-Valued Logic Analyzer (TVP) Part II • Tal Lev-Ami and Mooly Sagiv
Outline • The Shape Analysis Problem • Solving Shape Analysis with TVLA • Structural Operational Semantics • Predicate logic • Embedding • (Imprecise) Abstract Interpretation • Instrumentation Predicates • Focus • Coerce • Bibliography
Shape Analysis • Determine the possible shapes of a dynamically allocated data structure at a given program point • Relevant questions: • Does a variable point to an acyclic list? • Does a variable point to a doubly-linked list? • Does a variable p point to an allocated element every time p is dereferenced? • Can a procedure create a memory leak?
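NULL dereference • Dereference of NULL pointers
#include <stdbool.h>
#include <stddef.h>

typedef struct element {
    int value;
    struct element *next;
} Elements;

bool search(int value, Elements *c) {
    Elements *elem;
    /* Bug under study: the loop tests c instead of elem, so elem is
       dereferenced after it has run past the end of the list. */
    for (elem = c; c != NULL; elem = elem->next)
        if (elem->value == value)
            return true;
    return false;
}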
Memory leakage
Elements* reverse(Elements *c) {
    Elements *h, *g;
    h = NULL;
    while (c != NULL) {
        g = c->next;
        h = c;          /* leakage of address pointed-by h */
        c->next = h;
        c = g;
    }
    return h;
}
The SWhile Programming Language • Abstract Syntax
sel ::= car | cdr
a ::= x | x.sel | null | n | a1 op_a a2
b ::= true | false | not b | b1 op_b b2 | a1 op_r a2
S ::= [x := a]^l | [x.sel := a]^l | [x := malloc()]^l | [skip]^l | S1 ; S2 | if [b]^l then S1 else S2 | while [b]^l do S
NULL dereference • Dereference of NULL pointers
[elem := c;]1
[found := false;]2
while ([c != null]3 && [!found]4) (
    if ([elem->car = value]5)
    then [found := true]6
    else [elem = elem->cdr]7
)
Structural Operational Semantics for languages with dynamically allocated objects • The program state consists of: • current allocated objects • a mapping from variables into atoms, objects, and null • a car mapping from objects into atoms, objects, and null • a cdr mapping from objects into atoms, objects, and null • … • malloc() allocates more objects • assignments update the state
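Structural Operational Semantics • The program state $S = (O, \mathit{env}, \mathit{car}, \mathit{cdr})$: • current allocated objects $O$ • atoms (integers, Booleans) $A$ • $\mathit{env}: \mathit{Var}_* \to A \cup O \cup \{\mathit{null}\}$ • $\mathit{car}: O \to A \cup O \cup \{\mathit{null}\}$ • $\mathit{cdr}: O \to A \cup O \cup \{\mathit{null}\}$ • The meaning of expressions $\mathcal{A}[\![a]\!]: S \to A \cup O \cup \{\mathit{null}\}$ • $\mathcal{A}[\![at]\!](s) = at$ • $\mathcal{A}[\![x]\!]((O, \mathit{env}, \mathit{car}, \mathit{cdr})) = \mathit{env}(x)$ • $\mathcal{A}[\![x.\mathit{car}]\!]((O, \mathit{env}, \mathit{car}, \mathit{cdr})) = \mathit{car}(\mathit{env}(x))$ • $\mathcal{A}[\![x.\mathit{cdr}]\!]((O, \mathit{env}, \mathit{car}, \mathit{cdr})) = \mathit{cdr}(\mathit{env}(x))$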
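Structural Semantics for SWhile: axioms
$[\mathrm{ass}^{v}_{\mathrm{sos}}]$ $\langle x := a,\ s = (O, e, \mathit{car}, \mathit{cdr}) \rangle \Rightarrow (O,\ e[x \mapsto \mathcal{A}[\![a]\!]s],\ \mathit{car},\ \mathit{cdr})$
$[\mathrm{ass}^{car}_{\mathrm{sos}}]$ $\langle x.\mathit{car} := a,\ s = (O, e, \mathit{car}, \mathit{cdr}) \rangle \Rightarrow (O,\ e,\ \mathit{car}[e(x) \mapsto \mathcal{A}[\![a]\!]s],\ \mathit{cdr})$
$[\mathrm{ass}^{cdr}_{\mathrm{sos}}]$ $\langle x.\mathit{cdr} := a,\ s = (O, e, \mathit{car}, \mathit{cdr}) \rangle \Rightarrow (O,\ e,\ \mathit{car},\ \mathit{cdr}[e(x) \mapsto \mathcal{A}[\![a]\!]s])$
$[\mathrm{ass}^{m}_{\mathrm{sos}}]$ $\langle x := \mathrm{malloc}(),\ (O, e, \mathit{car}, \mathit{cdr}) \rangle \Rightarrow (O \cup \{n\},\ e[x \mapsto n],\ \mathit{car},\ \mathit{cdr})$ where $n \notin O$
$[\mathrm{skip}_{\mathrm{sos}}]$ $\langle \mathrm{skip},\ s \rangle \Rightarrow s$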
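The axioms above can be read as a small interpreter. The following Python sketch is an illustration only and is not part of TVLA: the `State` class, the function names, and the object names `o0`, `o1`, ... are assumptions, and the state is updated in place rather than rebuilt as in the rules.

```python
from dataclasses import dataclass, field

NULL = None  # models the null value


@dataclass
class State:
    objects: set = field(default_factory=set)  # O: allocated objects
    env: dict = field(default_factory=dict)    # variables -> atoms, objects, or NULL
    car: dict = field(default_factory=dict)    # objects -> atoms, objects, or NULL
    cdr: dict = field(default_factory=dict)    # objects -> atoms, objects, or NULL


def assign_var(s, x, value):
    """x := a, with the expression a already evaluated to value."""
    s.env[x] = value


def malloc(s, x):
    """x := malloc(): pick a fresh object n that is not in O."""
    n = f"o{len(s.objects)}"  # fresh because objects are never removed here
    s.objects.add(n)
    s.env[x] = n


def deref(s, x):
    """Return the object x points to; fail if x holds null or an atom."""
    target = s.env.get(x, NULL)
    if target not in s.objects:
        raise RuntimeError(f"illegal dereference of {x}")
    return target


def read_sel(s, x, sel):
    """Meaning of the expression x.sel, where sel is 'car' or 'cdr'."""
    return getattr(s, sel).get(deref(s, x), NULL)


def assign_sel(s, x, sel, value):
    """x.sel := a, with the expression a already evaluated to value."""
    getattr(s, sel)[deref(s, x)] = value


# Hypothetical usage: two cells, x.cdr pointing to y, and a null variable z.
s = State()
malloc(s, "x")
malloc(s, "y")
assign_sel(s, "x", "cdr", s.env["y"])
assign_var(s, "z", NULL)
```

Calling `read_sel(s, "z", "car")` raises an error in this sketch, which is how a concrete execution would expose the null dereference that the analysis is meant to detect statically.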
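Structural Semantics for SWhile: rules
$[\mathrm{if}^{tt}_{\mathrm{sos}}]$ $\langle \mathrm{if}\ b\ \mathrm{then}\ S_1\ \mathrm{else}\ S_2,\ s \rangle \Rightarrow \langle S_1, s \rangle$ if $\mathcal{B}[\![b]\!]s = tt$
$[\mathrm{if}^{ff}_{\mathrm{sos}}]$ $\langle \mathrm{if}\ b\ \mathrm{then}\ S_1\ \mathrm{else}\ S_2,\ s \rangle \Rightarrow \langle S_2, s \rangle$ if $\mathcal{B}[\![b]\!]s = ff$
$[\mathrm{comp}^{1}_{\mathrm{sos}}]$ $\dfrac{\langle S_1, s \rangle \Rightarrow \langle S'_1, s' \rangle}{\langle S_1; S_2,\ s \rangle \Rightarrow \langle S'_1; S_2,\ s' \rangle}$
$[\mathrm{comp}^{2}_{\mathrm{sos}}]$ $\dfrac{\langle S_1, s \rangle \Rightarrow s'}{\langle S_1; S_2,\ s \rangle \Rightarrow \langle S_2,\ s' \rangle}$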
Summary • The SOS is natural • Can handle: • errors, e.g., null dereferences • free • garbage collection • But does not lead to an analysis • The set of potential objects is unbounded • Solution: Three-Valued Kleene Predicate Logic
Predicate Logic • Vocabulary • A finite set of predicate symbols P, each with a fixed arity • A finite set of function symbols • Logical structures S provide meaning for predicates • A set of individuals (nodes) U • $p^S : U^k \to \{0, 1\}$ for each predicate symbol p of arity k • First-order formulas over the vocabulary express properties of logical structures
Using Predicate Logic to describe states in SOS • U = O • For a Boolean variable x define a nullary predicate (proposition) b[x] • b[x] = 1 when env(x) = 1 • For a pointer variable x define a unary predicate p[x] • p[x](u) = 1 when env(x) = u and u is an object • Two binary predicates: • s[car](u1, u2) = 1 when car(u1) = u2 and u2 is an object • s[cdr](u1, u2) = 1 when cdr(u1) = u2 and u2 is an object
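The encoding above can be sketched as a function that turns a concrete state into a 2-valued logical structure. This is a hedged illustration: the function name `to_structure`, the dictionary layout, and the sample state are assumptions made for the example, not TVLA data structures.

```python
def to_structure(objects, env, car, cdr, bool_vars, ptr_vars):
    """Encode a concrete state (O, env, car, cdr) as predicates over U = O."""
    # Nullary predicate b[x]: 1 when the Boolean variable x is true.
    b = {x: 1 if env.get(x) is True else 0 for x in bool_vars}
    # Unary predicate p[x](u): 1 when env(x) = u and u is an object.
    p = {x: {u: 1 if env.get(x) == u else 0 for u in objects}
         for x in ptr_vars}
    # Binary predicates s[car], s[cdr]: 1 when the field of u1 holds object u2.
    s = {name: {(u1, u2): 1 if sel.get(u1) == u2 else 0
                for u1 in objects for u2 in objects}
         for name, sel in (("car", car), ("cdr", cdr))}
    return {"U": set(objects), "b": b, "p": p, "s": s}


# Hypothetical two-cell list: c and elem point to o0, o0.cdr = o1.
structure = to_structure(
    objects={"o0", "o1"},
    env={"c": "o0", "elem": "o0", "found": False},
    car={"o0": 31, "o1": 71},        # car fields hold atoms, not objects
    cdr={"o0": "o1", "o1": None},
    bool_vars=["found"],
    ptr_vars=["c", "elem"],
)
```

Because the car fields hold atoms in this sample state, every `s[car]` tuple is 0: the binary predicates record only object-to-object links.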
Running Example
[elem := c;]1
[found := false;]2
while ([c != null]3 && [!found]4) (
    if ([elem->car = value]5)
    then [found := true]6
    else [elem = elem->cdr]7
)
%s Pvar {elem, c}
%s Bvar {found}
%s Sel {car, cdr}
#include "pred.tvp"
%%
#include "cond.tvp"
#include "stat.tvp"
%%
/* [elem := c;]1 */
l_1 Copy_Var(elem, c) l_2
/* [found := false;]2 */
l_2 Set_False(found) l_3
/* while ([c != null]3 && [!found]4) ( */
l_3 Is_Not_Null_Var(c) l_4
l_3 Is_Null_Var(c) l_end
l_4 Is_False(found) l_5
l_4 Is_True(found) l_end
/* if ([elem->car= value]5) */
l_5 Uninterpreted_Cond() l_6
l_5 Uninterpreted_Cond() l_7
/* then [found := true]6 */
l_6 Set_True(found) l_3
/* else [elem = elem->cdr]7 */
l_7 Get_Sel(cdr, elem, elem) l_3
/* ) */
%%
l_1, l_end

pred.tvp
foreach (z in Bvar) {
  %p b[z]()
}
foreach (z in Pvar) {
  %p p[z](v) unique box
}
foreach (sel in Sel) {
  %p s[sel](v1, v2) function
}
Actions • Use first-order formulae over the vocabulary to express the SOS • Every action can have: • title %t • focus formula %f • precondition formula %p • error messages %message • new formula %new • predicate-update formulas {} • retain formula
cond.tvp (part 1)
%action Uninterpreted_Cond() {
  %t "uninterpreted-Condition"
}
%action Is_True(x1) {
  %t x1
  %p b[x1]()
  { b[x1]() = 1 }
}
%action Is_False(x1) {
  %t "!" + x1
  %p !b[x1]()
  { b[x1]() = 0 }
}

cond.tvp (part 2)
%action Is_Not_Null_Var(x1) {
  %t x1 + " != null"
  %p E(v) p[x1](v)
}
%action Is_Null_Var(x1) {
  %t x1 + " = null"
  %p !(E(v) p[x1](v))
}

stat.tvp (part 1)
%action Skip() {
  %t "Skip"
}
%action Set_True(x1) {
  %t x1 + " := true"
  { b[x1]() = 1 }
}
%action Set_False(x1) {
  %t x1 + " := false"
  { b[x1]() = 0 }
}

stat.tvp (part 2)
%action Copy_Var(x1, x2) {
  %t x1 + " := " + x2
  { p[x1](v) = p[x2](v) }
}

stat.tvp (part 3)
%action Get_Sel(sel, x1, x2) {
  %t x1 + " := " + x2 + "." + sel
  %message (!E(v) p[x2](v)) -> "an illegal dereference to" + sel + " component of " + x2
  { p[x1](v) = E(v_1) p[x2](v_1) & s[sel](v_1, v) }
}

stat.tvp (part 4)
%action Set_Sel_Null(x1, sel) {
  %t x1 + "." + sel + " := null"
  %message (!E(v) p[x1](v)) -> "an illegal dereference to" + sel + " component of " + x1
  { s[sel](v_1, v_2) = s[sel](v_1, v_2) & !p[x1](v_1) }
}
stat.tvp (part 5)
%action Set_Sel(x1, sel, x2) {
  %t x1 + "." + sel + " := " + x2
  %message (E(v, v1) p[x1](v) & s[sel](v, v1)) -> "Internal Error! assume that " + x1 + "." + sel + "==NULL"
  %message (!E(v) p[x1](v)) -> "an illegal dereference to" + sel + " component of " + x1
  { s[sel](v_1, v_2) = s[sel](v_1, v_2) | p[x1](v_1) & p[x2](v_2) }
}
stat.tvp (part 6)
%action Malloc(x1) {
  %t x1 + " := malloc()"
  %new
  { p[x1](v) = isNew(v) }
}
3-Valued Kleene Logic • A logic with 3 values • 0 - false • 1 - true • 1/2 - don't know • Operators are conservatively interpreted • 1/2 means either true or false • Information order: $0 \sqcup 1 = 1/2$ (0 and 1 are each more definite than 1/2) • Logical order: $0 \le 1/2 \le 1$
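The two orders can be made concrete with a few lines of Python. This is a minimal sketch of Kleene's operators; the function names and the use of `Fraction` to represent 1/2 are choices made for the example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the "don't know" value


def k_not(a):
    """Negation swaps 0 and 1 and leaves 1/2 unchanged."""
    return 1 - a


def k_and(a, b):
    """Conjunction is the minimum in the logical order 0 <= 1/2 <= 1."""
    return min(a, b)


def k_or(a, b):
    """Disjunction is the maximum in the logical order."""
    return max(a, b)


def k_exists(values):
    """Existential quantification is the maximum over all individuals."""
    return max(values, default=0)


def join(a, b):
    """Least upper bound in the information order: disagreement gives 1/2."""
    return a if a == b else HALF
```

The conservative reading shows up in cases such as `k_and(0, HALF)`, which is 0 because the conjunction is false whichever definite value 1/2 stands for, while `k_and(1, HALF)` stays 1/2.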
3-Valued Predicate Logic • Vocabulary • A finite set of predicate symbols P • A special unary predicate sm • sm(u) = 0 when u represents a unique concrete node • sm(u) = 1/2 when u may represent more than one concrete node • 3-valued logical structures S provide meaning for predicates • A (bounded) set of individuals (nodes) U • $p^S : U^k \to \{0, 1/2, 1\}$ for each predicate symbol p of arity k • First-order formulas over the vocabulary express properties of logical structures • Interpret $\lor$ and $\exists$ as maximum (and $\land$ and $\forall$ as minimum) in the logical order
The Blur Operation • Abstract an arbitrary structure into a structure of bounded size • Select a set of unary predicates as abstraction-predicates • Map all the nodes with the same value of abstraction predicates into a single summary node • Join the values of other predicates
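The steps above can be sketched for a concrete (2-valued) input structure. This is an illustration under stated assumptions, not TVLA's implementation: the dictionary layout, the name `blur`, and the use of the tuple of abstraction-predicate values as the name of each abstract node are all choices made for the example, and missing binary tuples are read as 0.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def join(values):
    """Join in the information order: one agreed value, otherwise 1/2."""
    values = set(values)
    return values.pop() if len(values) == 1 else HALF


def blur(nodes, unary, binary, abstraction_preds):
    """Merge nodes that agree on all abstraction predicates."""
    groups = {}
    for u in nodes:
        name = tuple(unary[p][u] for p in abstraction_preds)
        groups.setdefault(name, []).append(u)
    new_unary = {p: {c: join(vals[u] for u in members)
                     for c, members in groups.items()}
                 for p, vals in unary.items()}
    new_binary = {p: {(c1, c2): join(vals.get((u1, u2), 0)
                                     for u1 in m1 for u2 in m2)
                      for c1, m1 in groups.items()
                      for c2, m2 in groups.items()}
                  for p, vals in binary.items()}
    # A node that stands for several concrete nodes becomes a summary node.
    sm = {c: HALF if len(members) > 1 else 0 for c, members in groups.items()}
    return set(groups), new_unary, new_binary, sm


# Hypothetical three-cell list n0 -> n1 -> n2 with c pointing to n0.
abs_nodes, abs_unary, abs_binary, abs_sm = blur(
    nodes=["n0", "n1", "n2"],
    unary={"p[c]": {"n0": 1, "n1": 0, "n2": 0}},
    binary={"s[cdr]": {("n0", "n1"): 1, ("n1", "n2"): 1}},
    abstraction_preds=["p[c]"],
)
```

In the result, the node named `(1,)` plays the role of u0 and the node named `(0,)` the role of the summary node u in the `prgm.tvs` structure: `s[cdr]` is 1/2 from u0 to u and from u to itself.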
The Embedding Theorem • If a big structure B can be embedded in a structure S via a surjective (onto) function f such that all predicate values are preserved, i.e., $p^B(u_1, \ldots, u_k) \sqsubseteq p^S(f(u_1), \ldots, f(u_k))$ • Then every formula $\varphi$ is preserved: • $\varphi = 1$ in S $\Rightarrow$ $\varphi = 1$ in B • $\varphi = 0$ in S $\Rightarrow$ $\varphi = 0$ in B • $\varphi = 1/2$ in S $\Rightarrow$ don't know
Naive Program Analysis via 3-valued predicate logic • Chaotic iterations • Start with the initial 3-valued structure • Execute every action in three phases: • check if precondition is satisfied • execute update formulas • execute blur • Command line tvla prgm prgm -action pub
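The iteration scheme above can be sketched as a worklist loop over control-flow edges. This is a simplified illustration, not TVLA's engine: the names are assumptions, abstract states are arbitrary hashable values, an action returns an empty list when its precondition fails, and the toy example tracks only the Boolean `found` instead of a 3-valued structure.

```python
from collections import deque


def chaotic_iteration(edges, entry, initial, blur=lambda s: s):
    """Collect, per program location, every abstract state that reaches it."""
    states = {entry: {blur(s) for s in initial}}
    worklist = deque([entry])
    while worklist:
        loc = worklist.popleft()
        for src, action, dst in edges:
            if src != loc:
                continue
            for s in list(states[loc]):
                for out in action(s):      # precondition check + update
                    out = blur(out)        # keep the state bounded
                    if out not in states.setdefault(dst, set()):
                        states[dst].add(out)
                        if dst not in worklist:
                            worklist.append(dst)
    return states


# Toy actions over a state that is just the value of `found`.
set_false = lambda s: [False]
set_true = lambda s: [True]
is_false = lambda s: [s] if not s else []
is_true = lambda s: [s] if s else []

edges = [
    ("l_1", set_false, "l_2"),
    ("l_2", is_false, "l_3"),
    ("l_2", is_true, "l_end"),
    ("l_3", set_true, "l_2"),
]
result = chaotic_iteration(edges, "l_1", [False])
```

The loop terminates only when the set of abstract states is finite, which is what the blur step provides for 3-valued structures.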
prgm.tvs
%n = {u, u0}
%p = {
  sm = {u:1/2}
  s[cdr] = {u->u:1/2, u0->u:1/2}
  p[c] = {u0}
}
More Precise Shape Analysis • Distinguish between cyclic and acyclic lists • Use Focus to guarantee that important formulas do not evaluate to 1/2 • Use Coerce to maintain global invariants • It all works • Singly linked lists (reverse, insert, delete, del_all) • Sortedness (bubble-sort, insertion-sort, reverse) • Doubly linked lists (insert, delete) • Mobile code (router) • Java multithreading (interference, concurrent-queue)
The Instrumentation Principle • Increase precision by storing the truth-value of some designated formulae • Introduce predicate-update formulae to update the extra predicates
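Example: Heap Sharing • $\mathit{is}[\mathit{cdr}](v) = \exists v_1, v_2 : \mathit{cdr}(v_1, v) \land \mathit{cdr}(v_2, v) \land v_1 \neq v_2$ • [Figure: a list pointed to by x with elements 31, 71, 91, and an abstract structure with nodes u1 and u; every node is labeled is = 0]
Example: Heap Sharing • $\mathit{is}[\mathit{cdr}](v) = \exists v_1, v_2 : \mathit{cdr}(v_1, v) \land \mathit{cdr}(v_2, v) \land v_1 \neq v_2$ • [Figure: a list pointed to by x with elements 31, 71, 91, and an abstract structure with nodes u1 and u; some nodes are labeled is = 1, the others is = 0]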
pred.tvp
foreach (z in Bvar) {
  %p b[z]()
}
foreach (z in Pvar) {
  %p p[z](v) unique box
}
foreach (sel in Sel) {
  %p s[sel](v1, v2) function
}
foreach (sel in Sel) {
  %i is[sel](v) = E(v_1, v_2) s[sel](v_1, v) & s[sel](v_2, v) & v_1 != v_2
}
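On a concrete heap, the defining formula of `is[sel]` can be evaluated directly. The sketch below is only meant to make the definition tangible; the function name `is_shared` and the edge-set representation are assumptions made for the example.

```python
def is_shared(nodes, sel_edges):
    """is[sel](v) = E(v_1, v_2) s[sel](v_1, v) & s[sel](v_2, v) & v_1 != v_2."""
    return {
        v: int(any((v1, v) in sel_edges and (v2, v) in sel_edges and v1 != v2
                   for v1 in nodes for v2 in nodes))
        for v in nodes
    }


# Hypothetical heap: both a and b have their cdr field pointing to c.
sharing = is_shared(["a", "b", "c"], {("a", "c"), ("b", "c")})
```

Here only `c` is heap-shared. In the analysis the predicate is not recomputed this way; it is stored in the structure and maintained by the predicate-update formulae of each action.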
stat.tvp (part 4)
%action Set_Sel_Null(x1, sel) {
  %t x1 + "." + sel + " := null"
  %message (!E(v) p[x1](v)) -> "an illegal dereference to" + sel + " component of " + x1
  {
    s[sel](v_1, v_2) = s[sel](v_1, v_2) & !p[x1](v_1)
    is[sel](v) = is[sel](v) & (!(E(v_1) p[x1](v_1) & s[sel](v_1, v)) | E(v_1, v_2) v_1 != v_2 & (s[sel](v_1, v) & !p[x1](v_1)) & (s[sel](v_2, v) & !p[x1](v_2)))
  }
}
stat.tvp (part 5)
%action Set_Sel(x1, sel, x2) {
  %t x1 + "." + sel + " := " + x2
  %message (E(v, v1) p[x1](v) & s[sel](v, v1)) -> "Internal Error! assume that " + x1 + "." + sel + "==NULL"
  %message (!E(v) p[x1](v)) -> "an illegal dereference to" + sel + " component of " + x1
  {
    s[sel](v_1, v_2) = s[sel](v_1, v_2) | p[x1](v_1) & p[x2](v_2)
    is[sel](v) = is[sel](v) | E(v_1) p[x2](v) & s[sel](v_1, v)
  }
}
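Additional Instrumentation Predicates • reachable-from-variable-x(v) $= \exists v_1 : x(v_1) \land \mathit{cdr}^*(v_1, v)$ • cyclic-along-dimension-d(v) $= \mathit{cdr}^+(v, v)$ • ordered element: inOrder(v) $= \forall v_1 : \mathit{cdr}(v, v_1) \Rightarrow (v\text{->}d \le v_1\text{->}d)$ • doubly linked lists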
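The reachability and cyclicity predicates can likewise be evaluated on a concrete heap. This is a hedged sketch with hypothetical names; `cdr*` is computed as a reflexive transitive closure by graph search, and `cdr+(v, v)` is read as "some cdr-successor of v reaches v".

```python
def successors(edges, u):
    """All nodes w with cdr(u, w)."""
    return [b for (a, b) in edges if a == u]


def reach_star(edges, start):
    """Nodes v with cdr*(start, v), including start itself."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in successors(edges, u):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def reachable_from(nodes, x_target, edges):
    """reachable-from-variable-x(v), where x_target is the node x points to."""
    if x_target is None:  # x is null: nothing is reachable from it
        return {v: 0 for v in nodes}
    reached = reach_star(edges, x_target)
    return {v: int(v in reached) for v in nodes}


def cyclic(nodes, edges):
    """cyclic(v) = cdr+(v, v)."""
    return {v: int(any(v in reach_star(edges, w)
                       for w in successors(edges, v)))
            for v in nodes}


# Hypothetical heap: a -> b -> c -> b, plus an unrelated node d.
heap_nodes = ["a", "b", "c", "d"]
heap_edges = {("a", "b"), ("b", "c"), ("c", "b")}
r_x = reachable_from(heap_nodes, "a", heap_edges)
cyc = cyclic(heap_nodes, heap_edges)
```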
The Focusing Principle • To increase precision • “Bring the predicate-update formula into focus” (Force 1/2 to 0 or 1) • Then apply the predicate-update formulas
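(1) Focus on $\exists v_1 : x(v_1) \land \mathit{cdr}(v_1, v)$ • [Figure: a structure with nodes u1 and u and pointer variables x and y, and the structures produced by Focus; in one of them u is split into u.1 and u.0]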
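The effect of Focus can be sketched for the simplest case, where the focus formula is a single stored unary predicate at one node. This is an assumption-laden simplification of what TVLA does for general formulae: the structure layout and function names are hypothetical, and the split copies u.1 and u.0 both remain summary nodes until a later step sharpens them.

```python
import copy
from fractions import Fraction

HALF = Fraction(1, 2)


def split_node(struct, pred, u):
    """Bifurcate summary node u into u.1 (pred = 1) and u.0 (pred = 0)."""
    u1, u0 = u + ".1", u + ".0"

    def images(v):
        return [u1, u0] if v == u else [v]

    nodes = [w for v in struct["nodes"] for w in images(v)]
    unary = {p: {w: vals[v] for v in struct["nodes"] for w in images(v)}
             for p, vals in struct["unary"].items()}
    unary[pred][u1], unary[pred][u0] = 1, 0
    # Every binary tuple that mentions u is copied to both new nodes.
    binary = {p: {(w1, w2): val
                  for (v1, v2), val in vals.items()
                  for w1 in images(v1) for w2 in images(v2)}
              for p, vals in struct["binary"].items()}
    sm = {w: struct["sm"][v] for v in struct["nodes"] for w in images(v)}
    return {"nodes": nodes, "unary": unary, "binary": binary, "sm": sm}


def focus_node(struct, pred, u):
    """Return structures in which pred(u) is definite (0 or 1)."""
    if struct["unary"][pred][u] != HALF:
        return [struct]
    results = []
    for value in (0, 1):
        s = copy.deepcopy(struct)
        s["unary"][pred][u] = value
        results.append(s)
    if struct["sm"][u] == HALF:  # a summary node may also be split
        results.append(split_node(struct, pred, u))
    return results


# Hypothetical structure: y(u) = 1/2 on the summary node u.
abstract = {
    "nodes": ["u1", "u"],
    "unary": {"y": {"u1": 0, "u": HALF}},
    "binary": {"cdr": {("u1", "u"): HALF, ("u", "u"): HALF}},
    "sm": {"u1": 0, "u": HALF},
}
focused = focus_node(abstract, "y", "u")
```

For this input the sketch produces three structures: one with y(u) = 0, one with y(u) = 1, and one in which u is split into u.1 and u.0.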
(2) Evaluate Predicate-Update Formulae • $x(v) = \exists v_1 : x(v_1) \land \mathit{cdr}(v_1, v)$ • [Figure: the focused structures over nodes u1, u, u.1 and u.0 with pointer variables x and y, before and after the update]
The Coercion Principle • Increase precision by exploiting some structural properties possessed by all stores (Global invariants) • Structural properties captured by constraints • Apply a constraint solver
(3) Apply Constraint Solver • [Figure: the structures over nodes u1, u, u.1 and u.0 with pointer variables x and y, before and after the constraint solver is applied]
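One constraint of this kind is that a pointer variable points to at most one cell. The sketch below applies only that single constraint to one unary predicate; it is a hypothetical illustration of the idea, not TVLA's constraint solver, and the function name and data layout are assumptions.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def coerce_unique(unary_vals, sm):
    """Sharpen a structure under: p(v_1) & p(v_2) implies v_1 == v_2.

    Returns the sharpened (unary_vals, sm), or None when the structure
    violates the constraint and can be discarded.
    """
    definite = [u for u, val in unary_vals.items() if val == 1]
    if len(definite) > 1:
        return None                      # two distinct nodes cannot both hold
    vals, new_sm = dict(unary_vals), dict(sm)
    if definite:
        u = definite[0]
        new_sm[u] = 0                    # u must stand for exactly one cell
        for w in vals:
            if w != u and vals[w] == HALF:
                vals[w] = 0              # no other node can satisfy p
    return vals, new_sm


# Hypothetical values after Focus: y definitely holds on u.1.
coerced = coerce_unique(
    {"u1": HALF, "u.1": 1, "u.0": 0},
    {"u1": 0, "u.1": HALF, "u.0": HALF},
)
```

In this example the node u.1 stops being a summary node and the indefinite value on u1 is sharpened to 0, which is the kind of precision gain the coercion step is meant to recover.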
Conclusion • TVLA allows construction of nontrivial analyses • But it is no panacea • Expressing operational semantics using logical formulas is not always easy • Need instrumentation to be reasonably precise (it sometimes helps efficiency as well) • Open problems: • A debugger for TVLA • Frontends • Algorithmic problems: • Space optimizations
Bibliography • Chapter 2.6 • http://www.cs.uni-sb.de/~wilhelm/foiles/ (Invited talk, CC'2000) • http://www.cs.wisc.edu/~reps/#shape_analysis Parametric Shape Analysis based on 3-valued logics (the general theory) • http://www.math.tau.ac.il/~tla/ The system and its applications