600 likes | 626 Views
Learn about the TVLA approach for abstract interpreters in software verification. This system integrates verification with heap analysis, using logical structures to express concrete semantics and ensure program correctness. Explore how this method detects memory leaks, dangling pointers, and eliminates false alarms.
E N D
TVLA: A System for Generating Abstract Interpreters T. Lev-Ami, R. Manevich, M. Sagiv http://www.cs.tau.ac.il/~tvla A. Loginov, G. Ramalingam, E. Yahav T. Reps & R. Wilhelm
Object Oriented Programs • Extensive use of the heap • Dynamic allocation • Recursive data structures • Destructive reference updates • x.f := y
Verification of OO Programs • Verify properties about the heap • No runtime errors • Null dereferences • Freed memory • Dangling pointers • No memory leaks • Partial correctness
A Common Approach to Pointer Analysis in Software Verification Pointer Analysis Verification Analysis Program • Break verification problem into two phases • Preprocessing phase – separate points-to analysis • typically no distinction between objects allocated at same site • Verification phase – verification using points-to results • May lose precision • May lose ability to perform strong updates • May produce false alarms Property
Loss of Precision in Two-Phase Approach Verify f is not read after it is closed f = new InputStream(); f.read(); f.close(); Straightforward ...
Loss of Precision in Two-Phase Approach while (?) { f = new InputStream(); f.read(); f.close(); } f1 f closed=1/2 False alarm! “read may be erroneous”
The TVLA Approach • Express concrete semantics using first order logic • Abstract Interpretation using 3-valued Kleene’s logic • verification integrated with heap analysis • heap analysis (merging criterion) can adapt to verification Frontend AbstractInterpretation First-OrderTransition Sys Program Property
Concrete States Abstract States f f f closed f closed f closed f closed closed closed closed closed f … The TVLA Approach: An Example while (?) { f = new InputStream(); f.read(); f.close(); }
Outline • Expressing concrete semantics using logic (TVLA input) • Canonincal abstraction • Abstract interpretation using Canonincal abstraction (TVLA engine) • Applications & ongoing Work
90 40 79 50 30 x 79 50 40 0 90 4 3 5 2 1 p Traditional Heap Interpretation • States = Two level stores • env: Var Values • fields: Loc Values • Values=Loc Atoms • Example • env = [x 30, p 79] • next = [30 40, 40 50, 50 79, 79 90] • val = [30 1, 40 2, 50 3, 79 4, 90 5]
Predicate Logic • Vocabulary • A finite set of predicate symbols Peach with a fixed arity • Logical Structures S provide meaning for predicates • A set of individuals (nodes) U • pS: (US)k {0, 1} • FOTC over TC, express logical structure properties
n n n n u1 u2 u3 u4 u5 x p Representing Stores as Logical Structures • Locations Individuals • Program variables Unary predicates • Fields Binary predicates • Example • U = {u1, u2, u3, u4, u5} • x = {u1}, p = {u3} • next = {<u1, u2>, <u2, u3>, <u3, u4>, <u4, u5>}
Invariants • No memory leaksv: active(v) {x PVar} v1: x(v1) next*(v1, v) • Acyclic list(x)v1,v2: x(v1) next*(v1, v2) next+(v2, v1) • Reverse (x)v1,v2, v3: x(v1) next*(v1, v2) next(v2, v3) next’(v3, v2)
1/2 Information order Kleene 3-Valued Logic • 1: True • 0: False • 1/2: Unknown • A join semi-lattice: 0 1 = 1/2 Logical order
3-Valued Logical Structures • A set of individuals (nodes) U • Predicate meaning • pS: (US)k {0, 1, 1/2}
Canonical Abstraction • Partition the individuals into equivalence classes based on the values of their unary predicates • Every individual is mapped into its equivalence class • Collapse predicates via • pS(u’1, ..., u’k) = {pB(u1, ..., uk) | f(u1)=u’1, ..., f(u’k)=u’k) } • At most 2A abstract individuals
n n n n x x t t n n Canonical Abstraction x = NULL; while (…) do { t = malloc(); t next=x; x = t } u2 u2 u1 u1 u3 u3 u2,3 u1 x t
n n u2,3 u1 x t n Canonical Abstraction x = NULL; while (…) do { t = malloc(); t next=x; x = t } n n u2 u1 u3 x t
Canonical Abstraction and Equality • Summary nodes may represent more than one element • (In)equality need not be preserved under abstraction • Explicitly record equality • Summary nodes are nodes with eq(u, u)=1/2
eq Canonical Abstraction and Equality eq eq eq x = NULL; while (…) do { t = malloc(); t next=x; x = t } eq n n u2 u1 u3 eq x t eq eq eq eq n u2,3 u2,3 u1 x t n
n u2,3 u1 x t n Canonical Abstraction x = NULL; while (…) do { t = malloc(); t next=x; x = t } n n u2 u1 u3 x t
Heap Sharing predicate v:is(v) v1,v2: next(v1,v) next(v2,v) eq(v1, v2) is(v)=0 is(v)=0 is(v)=0 … u2 u1 un x n n n t n u2..n u1 x t n is(v)=0 is(v)=0
n n n u2 u3..n u1 x t n is(v)=0 is(v)=1 is(v)=0 Heap Sharing predicate v:is(v) v1,v2: next(v1,v) next(v2,v) eq(v1, v2) is(v)=0 is(v)=1 is(v)=0 … u2 u1 un x n n n t n
Concrete Representation Abstract Representation Concretization Abstract Interpretation Collecting Interpretation st# stc Concrete Representation Abstract Representation Abstraction Best Conservative Interpretation (CC79)
x y y . . . x Evaluate update formulas x y x y Canonincal x x y y . . . “Focus”- Based Transformer (x = x n) x y inverse embedding Canonincal abstraction
x y Focus(x n) x “Partial ” y Evaluate update Formulas (Kleene) x x Canonincal y y x x y y “Focus”-Based Transformer (x = x n) x y
x = NULL n n t t t empty x x n x F T n n t t t t =malloc(..); t n x x x n n n n tnext=x; t t t t n x x x n n n x = t t t t t x n x x n x return x Example: Abstract Interpretation
Cleanness Analysis of Linked Lists (Nurit Dor, SAS 2000) • Static analysis of C programs manipulating linked lists • Defects checked • References to freed memory • Null dereferences • Memory leaks • Existing algorithms are inadequate • Miss errors • Report too many false alarms
Dereference of NULL pointers bool search(int value, Elements *c) {Elements *elem;for (elem = c; c != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE typedef struct element { int value; struct element *next; } Elements
Dereference of NULL pointers bool search(int value, Elements *c) {Elements *elem;for (elem = c; c != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE typedef struct element { int value; struct element *next; } Elements potential null dereference
Dereference of NULL pointers bool search(int value, Elements *c) {Elements *elem;for (elem = c; elem != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE typedef struct element { int value; struct element *next; } Elements potential null dereference
Memory leakage typedef struct element { int value; struct element *next; } Elements Elements* reverse(Elements *c){ Elements *h,*g;h = NULL;while (c!= NULL) { g = c->next; h = c; c->next = h; c = g; }return h;
Memory leakage typedef struct element { int value; struct element *next; } Elements Elements* reverse(Elements *c){ Elements *h,*g;h = NULL;while (c!= NULL) { g = c->next;h = c; c->next = h; c = g; }return h; leakage of address pointed-by h
Memory leakage typedef struct element { int value; struct element *next; } Elements Elements* reverse(Elements *c){ Elements *h,*g;h = NULL;while (c!= NULL) { g = c->next; h = c; c->next = h; c = g; }return h; ✔No memory leaks
Proving Correctness of Sorting Implementations (Lev-Ami, Reps, S, Wilhelm ISSTA 2000) • Partial correctness • The elements are sorted • The list is a permutation of the original list • Termination • At every loop iterations the set of elements reachable from the head is decreased
Example: InsertSort List InsertSort(List x) { List r, pr, rn, l, pl; r = x; pr = NULL; while (r != NULL) { l = x; rn = r n; pl = NULL; while (l != r) { if (l data > r data) { pr n = rn; r n = l; if (pl == NULL) x = r; else pl n = r; r = pr; break; } pl = l; l = l n; } pr = r; r = rn; } return x; } typedef struct list_cell { int data; struct list_cell *n; } *List; • v:inOrder[dle, n](v) v1: next(v, v1) dle(v,v1) assert sorted[x]; assert permutation[x,x’]
Example: InsertSort List InsertSort(List x) { if (x == NULL) return NULL pr = x; r = x->n; while (r != NULL) { pl = x; rn = r->n; l = x->n; while (l != r) { pr->n = rn ; r->n = l; pl->n = r; r = pr; break; } pl = l; l = l->n; } pr = r; r = rn; } typedef struct list_cell { int data; struct list_cell *n; } *List; Run Demo 14
Example: Mark and Sweep void Mark(Node root) { if (root != NULL) { pending = pending = pending {root} marked = while (pending ) { x = SelectAndRemove(pending) marked = marked {x} t = x left if (t NULL) if (t marked) pending = pending {t} t = x right if (t NULL) if (t marked) pending = pending {t} } } assert(marked == Reachset(root)) } void Sweep() { unexplored = Universe collected = while (unexplored ) { x = SelectAndRemove(unexplored) if (x marked) collected = collected {x} } assert(collected == Universe – Reachset(root) ) }
Challenges: Heap & Concurrency[Yahav POPL’01] • Concurrency with the heap is evil… • Java threads are just heap allocated objects • Data and control are strongly related • Thread-scheduling info may require understanding of heap structure (e.g., scheduling queue) • Heap analysis requires information about thread scheduling Thread t1 = new Thread(); Thread t2 = new Thread(); … t = t1; … t.start();
Configurations – Example held_by at[l_1] at[l_C] rval[myLock] rval[myLock] blocked at[l_1] at[l_0] at[l_0] rval[myLock] l_0: while (true) { l_1: synchronized(myLock) { l_C: // critical section l_2: } l_3: }
ConcreteConfiguration held_by at[l_1] at[l_C] rval[myLock] blocked rval[myLock] at[l_1] at[l_0] at[l_0] rval[myLock]
AbstractConfiguration held_by blocked at[l_1] at[l_C] rval[myLock] rval[myLock] at[l_0]
Compile-Time GC for Java(Ran Shaham, SAS’03) • The compiler can issue free when objects are no longer needed • Analysis of Java/JavaCard programs • Generic Java Bytecode Frontend • Precise analysis but expensive (cost of procedures)
More Scalable Solution[Gilad Arnold] • Combine backward and forward analysis • Forward analysis determine the shape • Backward analysis locates heap liveness • Use
Verifying Safety Propertiesusing Separationand Heterogeneous AbstractionsPLDI’04 E. Yahav School of Computer Science Tel-Aviv University G. Ramalingam IBM T.J. Watson Research Center
Quick Overview:Fine Grained Heap Abstraction Heap H Precise ... ... but often too expensive