
TVLA: A System for Generating Abstract Interpreters

An overview of the TVLA approach to generating abstract interpreters for software verification. The system integrates verification with heap analysis: concrete semantics are expressed over logical structures, and abstract interpretation over 3-valued logic checks properties such as the absence of memory leaks, null dereferences, and dangling pointers, with fewer false alarms than a separate points-to phase.


Presentation Transcript


  1. TVLA: A System for Generating Abstract Interpreters • T. Lev-Ami, R. Manevich, M. Sagiv • http://www.cs.tau.ac.il/~tvla • A. Loginov, G. Ramalingam, E. Yahav, T. Reps & R. Wilhelm

  2. Object Oriented Programs • Extensive use of the heap • Dynamic allocation • Recursive data structures • Destructive reference updates • x.f := y

  3. Verification of OO Programs • Verify properties about the heap • No runtime errors • Null dereferences • Freed memory • Dangling pointers • No memory leaks • Partial correctness

  4. A Common Approach to Pointer Analysis in Software Verification • Break verification problem into two phases • Preprocessing phase: separate points-to analysis • Typically no distinction between objects allocated at the same site • Verification phase: verification using points-to results • May lose precision • May lose ability to perform strong updates • May produce false alarms • [Diagram: Program, Property, Pointer Analysis, Verification Analysis]

  5. Loss of Precision in Two-Phase Approach • Verify that f is not read after it is closed • f = new InputStream(); f.read(); f.close(); • Straightforward ...

  6. Loss of Precision in Two-Phase Approach • while (?) { f = new InputStream(); f.read(); f.close(); } • [Diagram: f, node f1, closed = 1/2] • False alarm! “read may be erroneous”
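
The false alarm on slide 6 can be reproduced with a small sketch. It assumes a points-to abstraction that keeps a single abstract node per allocation site and therefore can only apply weak updates; the names `join` and `analyze_loop` are illustrative and do not come from TVLA or any other tool.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # Kleene "unknown"


def join(a, b):
    """Information-order join: equal values are kept, different ones become 1/2."""
    return a if a == b else HALF


def analyze_loop(max_iterations=10):
    """Abstractly execute the loop body until the abstract state stops changing.

    Assumption: every InputStream allocated in the loop is represented by the
    same abstract node (one node per allocation site).
    """
    closed = None  # abstract 'closed' value of the single node; None = no object yet
    alarms = []
    for iteration in range(1, max_iterations + 1):
        before = closed
        # f = new InputStream(): the fresh object is open (closed = 0), but it
        # is merged into the node that also stands for older objects.
        closed = 0 if closed is None else join(closed, 0)
        # f.read(): safe only if the node is definitely not closed.
        if closed != 0:
            alarms.append((iteration, closed))
        # f.close(): weak update, because the node may represent many objects.
        closed = join(closed, 1)
        if closed == before:
            break
    return closed, alarms


final_closed, reported = analyze_loop()
print(final_closed, reported)
```

Under these assumptions the second abstract iteration already sees closed = 1/2 at the read, although no concrete run ever reads a closed stream.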

  7. The TVLA Approach • Express concrete semantics using first-order logic • Abstract interpretation using Kleene’s 3-valued logic • Verification integrated with heap analysis • Heap analysis (merging criterion) can adapt to verification • [Diagram: Program, Property, Frontend, First-Order Transition System, Abstract Interpretation]

  8. The TVLA Approach: An Example • while (?) { f = new InputStream(); f.read(); f.close(); } • [Diagram: Concrete States and Abstract States; nodes labeled f and closed]

  9. Outline • Expressing concrete semantics using logic (TVLA input) • Canonical abstraction • Abstract interpretation using canonical abstraction (TVLA engine) • Applications & ongoing work

  10. Traditional Heap Interpretation • States = two-level stores • env: Var → Values • fields: Loc → Values • Values = Loc ∪ Atoms • Example • env = [x ↦ 30, p ↦ 79] • next = [30 ↦ 40, 40 ↦ 50, 50 ↦ 79, 79 ↦ 90] • val = [30 ↦ 1, 40 ↦ 2, 50 ↦ 3, 79 ↦ 4, 90 ↦ 5] • [Diagram: linked list of five cells at addresses 30, 40, 50, 79, 90 holding the values 1 to 5, with variables x and p]

  11. Predicate Logic • Vocabulary • A finite set of predicate symbols P, each with a fixed arity • Logical structures S provide meaning for predicates • A set of individuals (nodes) U • p^S: (U^S)^k → {0, 1} • FO^TC (first-order logic with transitive closure) expresses properties of logical structures

  12. Representing Stores as Logical Structures • Locations → Individuals • Program variables → Unary predicates • Fields → Binary predicates • Example • U = {u1, u2, u3, u4, u5} • x = {u1}, p = {u3} • next = {<u1, u2>, <u2, u3>, <u3, u4>, <u4, u5>} • [Diagram: nodes u1 to u5 linked by n edges, with variables x and p]
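
The encoding on slide 12 can be sketched as a small conversion from a two-level store to a logical structure. This is only an illustration: the function name `store_to_structure` and the naming scheme u1, u2, ... by increasing address are assumptions, variables are assumed to be non-null, and p is placed at address 50 so that the result coincides with the structure listed on the slide.

```python
def store_to_structure(env, pointer_fields):
    """Encode a two-level store as a 2-valued logical structure.

    env:            variable name -> location (all variables assumed non-null)
    pointer_fields: field name -> {location -> location}
    """
    # Locations become individuals.
    locations = set(env.values())
    for mapping in pointer_fields.values():
        locations.update(mapping.keys(), mapping.values())
    name = {loc: f"u{i}" for i, loc in enumerate(sorted(locations), start=1)}
    universe = set(name.values())

    # Program variables become unary predicates (the set of nodes where they hold).
    unary = {var: {name[loc]} for var, loc in env.items()}

    # Pointer fields become binary predicates (sets of node pairs).
    binary = {
        field: {(name[src], name[dst]) for src, dst in mapping.items()}
        for field, mapping in pointer_fields.items()
    }
    return universe, unary, binary


env = {"x": 30, "p": 50}  # p at 50 is an assumption made to match slide 12
pointer_fields = {"next": {30: 40, 40: 50, 50: 79, 79: 90}}
universe, unary, binary = store_to_structure(env, pointer_fields)
print(sorted(universe), unary, sorted(binary["next"]))
```

Data fields such as val map locations to atoms rather than to locations, so the sketch leaves them out; only pointer-valued fields turn into binary predicates here.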

  13. Concrete Interpretation Rules (Lists)

  14. Invariants • No memory leaks: ∀v: active(v) → ⋁{x ∈ PVar} ∃v1: x(v1) ∧ next*(v1, v) • Acyclic list(x): ∀v1, v2: x(v1) ∧ next*(v1, v2) → ¬next+(v2, v1) • Reverse(x): ∀v1, v2, v3: x(v1) ∧ next*(v1, v2) → (next(v2, v3) ⟺ next’(v3, v2))
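
As a rough illustration, the first two invariants can be evaluated on a concrete (2-valued) structure. The sketch reads them as "every active cell is reachable from some program variable" and "no cell reachable from x can reach the cell x points to again"; the helper names `reach`, `no_memory_leaks`, and `acyclic_list` are hypothetical.

```python
def reach(next_edges, start, strict=False):
    """Nodes v with next*(start, v), or next+(start, v) when strict is True."""
    succ = {}
    for src, dst in next_edges:
        succ.setdefault(src, set()).add(dst)
    seen = set()
    frontier = list(succ.get(start, ()))
    while frontier:
        node = frontier.pop()
        if node not in seen:
            seen.add(node)
            frontier.extend(succ.get(node, ()))
    # next* additionally contains the start node itself (zero steps).
    return seen if strict else seen | {start}


def no_memory_leaks(active, variables, next_edges):
    """Every active cell is reachable (next*) from a cell some variable points to."""
    reachable = set()
    for cells in variables.values():
        for cell in cells:
            reachable |= reach(next_edges, cell)
    return active <= reachable


def acyclic_list(x_cells, next_edges):
    """For x(v1) and next*(v1, v2), next+(v2, v1) must not hold."""
    return all(
        v1 not in reach(next_edges, v2, strict=True)
        for v1 in x_cells
        for v2 in reach(next_edges, v1)
    )


edges = {("u1", "u2"), ("u2", "u3")}
variables = {"x": {"u1"}}
print(no_memory_leaks({"u1", "u2", "u3"}, variables, edges))
print(acyclic_list({"u1"}, edges))
```

TVLA evaluates such formulas on 3-valued structures during abstract interpretation; the sketch only checks them on a single concrete heap.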

  15. Kleene 3-Valued Logic • 1: True • 0: False • 1/2: Unknown • A join semi-lattice: 0 ⊔ 1 = 1/2 • [Diagram: information order, with 0 ⊑ 1/2 and 1 ⊑ 1/2; logical order, with 0 ≤ 1/2 ≤ 1]

  16. Boolean Connectives [Kleene]
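
A minimal sketch of Kleene's connectives, assuming the usual encoding of the three truth values as 0, 1/2, and 1: conjunction is the minimum, disjunction the maximum, and negation is 1 minus the value. The function names are illustrative.

```python
from fractions import Fraction

ZERO, HALF, ONE = Fraction(0), Fraction(1, 2), Fraction(1)
VALUES = (ZERO, HALF, ONE)


def k_and(a, b):
    """Kleene conjunction: minimum in the logical order 0 < 1/2 < 1."""
    return min(a, b)


def k_or(a, b):
    """Kleene disjunction: maximum in the logical order."""
    return max(a, b)


def k_not(a):
    """Kleene negation: swaps 0 and 1, leaves 1/2 unchanged."""
    return 1 - a


def join(a, b):
    """Information-order join: 0 joined with 1 gives 1/2."""
    return a if a == b else HALF


def truth_table(op):
    """Full table of a binary connective over {0, 1/2, 1}."""
    return {(a, b): op(a, b) for a in VALUES for b in VALUES}


for (a, b), result in truth_table(k_and).items():
    print(f"{a} AND {b} = {result}")
```

A definite operand can still decide the result: 0 AND 1/2 is 0 and 1 OR 1/2 is 1, while 1 AND 1/2 stays unknown.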

  17. 3-Valued Logical Structures • A set of individuals (nodes) U • Predicate meaning • p^S: (U^S)^k → {0, 1, 1/2}

  18. Canonical Abstraction • Partition the individuals into equivalence classes based on the values of their unary predicates • Every individual is mapped into its equivalence class • Collapse predicates via ⊔ • p^S(u’1, ..., u’k) = ⊔ { p^B(u1, ..., uk) | f(u1) = u’1, ..., f(uk) = u’k } • At most 2^|A| abstract individuals, where A is the set of unary predicates used for abstraction
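
The steps on slide 18 can be sketched for a concrete input structure as follows. This is an illustrative sketch, not TVLA's implementation: the function names are hypothetical, every unary predicate is assumed to be an abstraction predicate, and the example list (x and t on u1, followed by u2 and u3) is an assumed reading of the diagram on the following slides.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def join(values):
    """Join in the information order: a single value is kept, mixed values give 1/2."""
    distinct = set(values)
    return distinct.pop() if len(distinct) == 1 else HALF


def canonical_abstraction(universe, unary, binary):
    """Merge individuals that agree on all unary predicates.

    unary:  predicate name -> set of individuals where it holds
    binary: predicate name -> set of (individual, individual) pairs
    """
    preds = sorted(unary)
    # f maps each individual to its canonical name: its vector of unary values.
    f = {u: tuple(int(u in unary[p]) for p in preds) for u in universe}
    classes = {}
    for u in universe:
        classes.setdefault(f[u], set()).add(u)

    # Collapse each binary predicate by joining over all represented pairs.
    abstract_binary = {}
    for name, pairs in binary.items():
        abstract_binary[name] = {
            (c1, c2): join(
                int((u1, u2) in pairs)
                for u1, u2 in product(classes[c1], classes[c2])
            )
            for c1, c2 in product(classes, repeat=2)
        }
    return classes, abstract_binary


universe = {"u1", "u2", "u3"}
unary = {"x": {"u1"}, "t": {"u1"}}
binary = {"n": {("u1", "u2"), ("u2", "u3")}}
classes, abstract_binary = canonical_abstraction(universe, unary, binary)
print(classes)
print(abstract_binary["n"])
```

With two unary predicates there can be at most four abstract individuals; here only two arise, and the n edges into and within the merged node become 1/2.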

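  19. Canonical Abstraction • x = NULL; while (…) do { t = malloc(); t->next = x; x = t } • [Diagram: concrete lists over nodes u1, u2, u3 linked by n edges, with x and t; abstract structure with u1 and summary node u2,3]

  20. Canonical Abstraction • x = NULL; while (…) do { t = malloc(); t->next = x; x = t } • [Diagram: concrete list u1, u2, u3 linked by n edges, with x and t; abstract structure with u1 and summary node u2,3]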

  21. Canonical Abstraction and Equality • Summary nodes may represent more than one element • (In)equality need not be preserved under abstraction • Explicitly record equality • Summary nodes are nodes with eq(u, u)=1/2
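
A small sketch of how recording equality identifies summary nodes, assuming an abstract node is given as the set of concrete individuals it represents. The names `abstract_eq` and `is_summary` are illustrative.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def abstract_eq(members_a, members_b):
    """Value of eq between two abstract nodes: join of eq over all represented pairs."""
    values = {int(u == v) for u, v in product(members_a, members_b)}
    return values.pop() if len(values) == 1 else HALF


def is_summary(members):
    """A node is a summary node exactly when eq(u, u) evaluates to 1/2."""
    return abstract_eq(members, members) == HALF


print(is_summary({"u1"}), is_summary({"u2", "u3"}))
```

A node that represents a single individual keeps eq(u, u) = 1, while a node representing two or more individuals mixes equal and unequal pairs and therefore gets 1/2.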

  22. Canonical Abstraction and Equality • x = NULL; while (…) do { t = malloc(); t->next = x; x = t } • [Diagram: concrete list u1, u2, u3 and its abstraction with summary node u2,3, both annotated with eq edges]

  23. Canonical Abstraction • x = NULL; while (…) do { t = malloc(); t->next = x; x = t } • [Diagram: concrete list u1, u2, u3 linked by n edges, with x and t; abstract structure with u1 and summary node u2,3]

  24. Heap Sharing Predicate • ∀v: is(v) ⟺ ∃v1, v2: next(v1, v) ∧ next(v2, v) ∧ ¬eq(v1, v2) • [Diagram: concrete list u1, u2, …, un with is(v) = 0 on every node; abstract structure with u1 and summary node u2..n, both with is(v) = 0]

  25. Heap Sharing Predicate • ∀v: is(v) ⟺ ∃v1, v2: next(v1, v) ∧ next(v2, v) ∧ ¬eq(v1, v2) • [Diagram: concrete list u1, u2, …, un with is(v) values 0, 1, 0; abstract structure with u1, u2, and summary node u3..n with is(v) values 0, 1, 0]
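
On a concrete structure, the heap-sharing predicate holds for a node exactly when it has two distinct next-predecessors. A minimal sketch follows, with a hypothetical function name and node names; it assumes all edge endpoints belong to the universe.

```python
def heap_shared(universe, next_edges):
    """is(v) = 1 iff two different nodes v1 != v2 both have a next edge to v."""
    predecessors = {v: set() for v in universe}
    for src, dst in next_edges:
        predecessors[dst].add(src)
    return {v: int(len(predecessors[v]) >= 2) for v in universe}


# An unshared list u1 -> u2 -> u3, then the same list with an extra
# (hypothetical) node u4 whose next field also points to u2.
plain = heap_shared({"u1", "u2", "u3"}, {("u1", "u2"), ("u2", "u3")})
shared = heap_shared(
    {"u1", "u2", "u3", "u4"},
    {("u1", "u2"), ("u2", "u3"), ("u4", "u2")},
)
print(plain, shared)
```

Because is is a unary predicate, a node with is(v) = 1 falls into a different equivalence class than unshared nodes when it is used as an abstraction predicate, so the two kinds are never merged.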

  26. Best Conservative Interpretation (CC79) • [Diagram: Abstract Representation → (Concretization) → Concrete Representation → (Collecting Interpretation, st_c) → Concrete Representation → (Abstraction) → Abstract Representation; the composite is the Abstract Interpretation st#]

  27. “Focus”-Based Transformer (x = x->n) • [Diagram: structures over x and y; steps labeled inverse embedding, evaluate update formulas, canonical abstraction]

  28. “Focus”-Based Transformer (x = x->n) • [Diagram: structures over x and y; steps labeled Focus(x->n), “Partial”, evaluate update formulas (Kleene), canonical abstraction]

  29. Example: Abstract Interpretation • Statements: x = NULL; loop test (F/T); t = malloc(..); t->next = x; x = t; return x • [Diagram: control-flow graph annotated with the abstract structures over x, t, and n at each program point, starting from the empty structure]

  30. Summary

  31. Cleanness Analysis of Linked Lists (Nurit Dor, SAS 2000) • Static analysis of C programs manipulating linked lists • Defects checked • References to freed memory • Null dereferences • Memory leaks • Existing algorithms are inadequate • Miss errors • Report too many false alarms

  32. Dereference of NULL pointers

      typedef struct element {
          int value;
          struct element *next;
      } Elements;

      bool search(int value, Elements *c) {
          Elements *elem;
          for (elem = c; c != NULL; elem = elem->next)
              if (elem->value == value)
                  return TRUE;
          return FALSE;
      }

  33. Dereference of NULL pointers • Same search code as slide 32 • Potential null dereference: the loop tests c != NULL, so elem is still dereferenced after it has become NULL

  34. Dereference of NULL pointers

      typedef struct element {
          int value;
          struct element *next;
      } Elements;

      bool search(int value, Elements *c) {
          Elements *elem;
          for (elem = c; elem != NULL; elem = elem->next)
              if (elem->value == value)
                  return TRUE;
          return FALSE;
      }

      • The loop now tests elem != NULL, which removes the potential null dereference flagged on slide 33

  35. Memory leakage

      typedef struct element {
          int value;
          struct element *next;
      } Elements;

      Elements* reverse(Elements *c) {
          Elements *h, *g;
          h = NULL;
          while (c != NULL) {
              g = c->next;
              h = c;
              c->next = h;
              c = g;
          }
          return h;
      }

  36. Memory leakage • Same reverse code as slide 35 • Leakage of the address pointed to by h: h = c overwrites h before c->next = h, so the already reversed prefix becomes unreachable

  37. Memory leakage

      Elements* reverse(Elements *c) {
          Elements *h, *g;
          h = NULL;
          while (c != NULL) {
              g = c->next;
              c->next = h;
              h = c;
              c = g;
          }
          return h;
      }

      • ✔ No memory leaks

  38. Proving Correctness of Sorting Implementations (Lev-Ami, Reps, Sagiv, Wilhelm, ISSTA 2000) • Partial correctness • The elements are sorted • The list is a permutation of the original list • Termination • At every loop iteration, the set of elements reachable from the head decreases

  39. Example: InsertSort

      typedef struct list_cell {
          int data;
          struct list_cell *n;
      } *List;

      List InsertSort(List x) {
          List r, pr, rn, l, pl;
          r = x;
          pr = NULL;
          while (r != NULL) {
              l = x;
              rn = r->n;
              pl = NULL;
              while (l != r) {
                  if (l->data > r->data) {
                      pr->n = rn;
                      r->n = l;
                      if (pl == NULL)
                          x = r;
                      else
                          pl->n = r;
                      r = pr;
                      break;
                  }
                  pl = l;
                  l = l->n;
              }
              pr = r;
              r = rn;
          }
          return x;
      }

      • ∀v: inOrder[dle, n](v) ⟺ ∀v1: next(v, v1) → dle(v, v1)
      • assert sorted[x]; assert permutation[x, x’]

  40. Example: InsertSort • List InsertSort(List x) { if (x == NULL) return NULL; pr = x; r = x->n; while (r != NULL) { pl = x; rn = r->n; l = x->n; while (l != r) { pr->n = rn; r->n = l; pl->n = r; r = pr; break; } pl = l; l = l->n; } pr = r; r = rn; } • typedef struct list_cell { int data; struct list_cell *n; } *List;

  41. Example: Mark and Sweep

      void Mark(Node root) {
          if (root != NULL) {
              pending = ∅
              pending = pending ∪ {root}
              marked = ∅
              while (pending ≠ ∅) {
                  x = SelectAndRemove(pending)
                  marked = marked ∪ {x}
                  t = x->left
                  if (t ≠ NULL)
                      if (t ∉ marked)
                          pending = pending ∪ {t}
                  t = x->right
                  if (t ≠ NULL)
                      if (t ∉ marked)
                          pending = pending ∪ {t}
              }
          }
          assert(marked == Reachset(root))
      }

      void Sweep() {
          unexplored = Universe
          collected = ∅
          while (unexplored ≠ ∅) {
              x = SelectAndRemove(unexplored)
              if (x ∉ marked)
                  collected = collected ∪ {x}
          }
          assert(collected == Universe ∖ Reachset(root))
      }
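
The pseudocode on slide 41 translates almost directly into a runnable sketch. The heap encoding (two dictionaries for the left and right fields), the helper `reach_set`, and the example heap are assumptions made for illustration. The slide's assertions are properties that the analysis proves for all heaps; the sketch merely checks them at run time on one heap.

```python
def mark(root, left, right):
    """Mark phase: collect every node reachable from root via left/right."""
    marked = set()
    if root is not None:
        pending = {root}
        while pending:
            x = pending.pop()  # SelectAndRemove(pending)
            marked.add(x)
            for field in (left, right):
                t = field.get(x)
                if t is not None and t not in marked:
                    pending.add(t)
    return marked


def sweep(universe, marked):
    """Sweep phase: collect every node that was not marked."""
    unexplored = set(universe)
    collected = set()
    while unexplored:
        x = unexplored.pop()  # SelectAndRemove(unexplored)
        if x not in marked:
            collected.add(x)
    return collected


def reach_set(root, left, right):
    """Reference definition: least set containing root and closed under left/right."""
    reached = set() if root is None else {root}
    changed = True
    while changed:
        changed = False
        for x in list(reached):
            for field in (left, right):
                t = field.get(x)
                if t is not None and t not in reached:
                    reached.add(t)
                    changed = True
    return reached


# Hypothetical heap: a, b, c are reachable from root a; d and e are garbage.
universe = {"a", "b", "c", "d", "e"}
left = {"a": "b", "b": "a"}
right = {"a": "c", "d": "e"}
marked = mark("a", left, right)
collected = sweep(universe, marked)
print(sorted(marked), sorted(collected))
```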

  42. Challenges: Heap & Concurrency [Yahav POPL’01] • Concurrency with the heap is evil… • Java threads are just heap-allocated objects • Data and control are strongly related • Thread-scheduling info may require understanding of heap structure (e.g., scheduling queue) • Heap analysis requires information about thread scheduling • Thread t1 = new Thread(); Thread t2 = new Thread(); … t = t1; … t.start();

  43. Configurations – Example

      l_0: while (true) {
      l_1:     synchronized(myLock) {
      l_C:         // critical section
      l_2:     }
      l_3: }

      • [Diagram: thread nodes labeled at[l_0], at[l_1], at[l_C]; rval[myLock] edges; held_by and blocked edges]

  44. Concrete Configuration • [Diagram: thread nodes labeled at[l_0], at[l_1], at[l_C]; rval[myLock] edges; held_by and blocked edges]

  45. Abstract Configuration • [Diagram: thread nodes labeled at[l_0], at[l_1], at[l_C]; rval[myLock] edges; held_by and blocked edges]

  46. Examples Verified

  47. Compile-Time GC for Java (Ran Shaham, SAS’03) • The compiler can issue free when objects are no longer needed • Analysis of Java/JavaCard programs • Generic Java bytecode frontend • Precise analysis but expensive (cost of procedures)

  48. More Scalable Solution [Gilad Arnold] • Combine backward and forward analysis • Forward analysis determines the shape • Backward analysis locates heap liveness • Use

  49. Verifying Safety Properties using Separation and Heterogeneous Abstractions (PLDI’04) • E. Yahav, School of Computer Science, Tel-Aviv University • G. Ramalingam, IBM T.J. Watson Research Center

  50. Quick Overview: Fine-Grained Heap Abstraction • [Diagram: Heap H] • Precise ... • ... but often too expensive
