
Putting Static Analysis to Work for Verification: A Case Study

This case study explores the use of static analysis for program verification, specifically focusing on the challenges of handling pointers and dynamically allocated objects. The study presents a prototype implementation in TVLA (Three-Valued Logic Analyzer) and discusses the benefits and limitations of this approach.


Presentation Transcript


  1. Putting Static Analysis to Work for Verification: A Case Study • Tal Lev-Ami, Thomas Reps, Mooly Sagiv, Reinhard Wilhelm

  2. Program Verification • Mathematically prove that the program is “partially” correct on all inputs • Example: Hoare-style verification of the assignment {x := x + 1}, with a precondition relating x to n and a postcondition relating x to n + 1

  3. Why Use Program Verification? • Debugging programs is hard • Testing can only show the presence of errors - not their absence • Can provide counter examples • ...

  4. Obstacles to Program Verification • Hard to specify software • Does not “scale” • Limited program size • Programmer needs to provide loop invariants • Pointers and dynamically allocated objects are not handled

  5. Our Goals • Handle pointers and dynamically allocated objects (unbounded memory and/or multi-threading) • No loop invariants • Input: pre {Procedure} post • Output: a safe approximation p to the strongest postcondition • Issue a warning if p does not imply post • Conservative: never misses an error, but may yield false warnings

  6. Insertion sort on a linked list, annotated with list(x) on entry and olist(x) on exit

       #include <stddef.h>

       typedef struct node {
           int data;
           struct node *n;
       } *L;

       /* list(x) */
       L insert_sort(L x) {
           L r, pr, rn, l, pl;
           r = x;
           pr = NULL;
           while (r != NULL) {
               l = x;
               rn = r->n;
               pl = NULL;
               while (l != r) {
                   if (l->data > r->data) {
                       /* unlink r and re-insert it in front of l */
                       pr->n = rn;
                       r->n = l;
                       if (pl == NULL)
                           x = r;
                       else
                           pl->n = r;
                       r = pr;
                       break;
                   }
                   pl = l;
                   l = l->n;
               }
               pr = r;
               r = rn;
           }
           return x;
       }
       /* olist(x) */
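
The annotations list(x) and olist(x) can also be read as run-time checks on one concrete store. The sketch below assumes that list(x) means "x points to an acyclic singly linked list" and that olist(x) additionally requires non-decreasing data; the names `is_list`, `is_olist`, and `usage_example` are hypothetical and do not come from the talk. Such checks only test a single input, whereas the analysis in the talk aims to establish the property for all inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
    int data;
    struct node *n;
} *L;

/* Assumed reading of list(x): following n from x reaches NULL (no cycle).
   Floyd's two-cursor technique detects a cycle without extra memory. */
bool is_list(L x) {
    L slow = x, fast = x;
    while (fast != NULL && fast->n != NULL) {
        slow = slow->n;
        fast = fast->n->n;
        if (slow == fast)
            return false;   /* the cursors met, so the list is cyclic */
    }
    return true;
}

/* Assumed reading of olist(x): an acyclic list in non-decreasing order. */
bool is_olist(L x) {
    if (!is_list(x))
        return false;
    for (L c = x; c != NULL && c->n != NULL; c = c->n)
        if (c->data > c->n->data)
            return false;
    return true;
}

/* Usage sketch: the stack-allocated list 7 -> 19 -> 96 satisfies both checks. */
void usage_example(void) {
    struct node c = {96, NULL}, b = {19, &c}, a = {7, &b};
    assert(is_list(&a));
    assert(is_olist(&a));
}
```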

  7. A client program, annotated with the property that holds after each call

       int main() {
           L x, y, z, w;
           L create(), insert_sort(L);
           L merge(L, L), reverse(L);
           x = create();          /* list(x) */
           x = insert_sort(x);    /* olist(x) */
           y = create();          /* list(y) */
           y = insert_sort(y);    /* olist(y) */
           z = merge(x, y);       /* olist(z) */
           w = reverse(z);        /* rolist(w) */
       }

  8. Conventional Verification vs. Our Approach
     Conventional Verification:
     • Formulae over program variables express pre- and post-conditions
     • The assignment rule is used to generate the strongest postcondition for non-destructive updates
     • Programmer provides loop invariants
     Our Approach:
     • Finite set of descriptors express pre- and post-conditions
     • Predicate-update formulae specify safe set of descriptors (abstract semantics)
     • Iteratively explore all the descriptors at every program point (abstract interpretation)
     • The ADT designer can provide domain-specific information via instrumentation

  9. Outline of the Rest of this Talk • Concentrate on sorting • Descriptors: compact representation of stores • State-space exploration via abstract interpretation • Prototype implementation in TVLA (Three-Valued Logic Analyzer) • Conclusions

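  10. Logical Representation of Stores • [Figure: the list x → 19 → 7 → 96 → null, shown as cells with data and n fields and as a logical structure with one node per cell] • Predicates: p[x](v), n(v1, v2), dle(v1, v2) • p[x]=1 for the first node and p[x]=0 for the other two; n and dle appear as edges between nodes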

  11. Three-Valued Logic • 1: true • 0: false • ½ = {0, 1}: unknown • Information order: 0 ⊑ ½ and 1 ⊑ ½ • A join semi-lattice: 0 ⊔ 1 = ½
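
The three truth values and their join can be written down directly. This is a minimal sketch, not TVLA's actual representation; the names `tv`, `ZERO`, `HALF`, `ONE`, `info_le`, and `tv_join` are chosen here for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Three truth values; HALF stands for 1/2, i.e. the set {0, 1}. */
typedef enum { ZERO = 0, HALF = 1, ONE = 2 } tv;

/* Information order: a is below b if they are equal or b is the unknown value. */
bool info_le(tv a, tv b) {
    return a == b || b == HALF;
}

/* Join (least upper bound) in the information order: disagreeing values give 1/2. */
tv tv_join(tv a, tv b) {
    return a == b ? a : HALF;
}

/* Usage sketch: joining the two definite values loses the information. */
void usage_example(void) {
    assert(tv_join(ZERO, ONE) == HALF);
    assert(tv_join(ONE, ONE) == ONE);
    assert(info_le(ZERO, HALF) && info_le(ONE, HALF));
    assert(!info_le(HALF, ONE) && !info_le(ZERO, ONE));
}
```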

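  12. Blurred Representation of Stores • [Figure: the list x → 19 → 7 → 96 → null and its logical structure, next to the blurred structure in which the node with p[x]=1 is kept and the two nodes with p[x]=0 are merged into one node; n and dle edges connect the nodes]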

  13. Parametric Abstraction (Blur) • Merge all the nodes with the same unary “abstraction” predicate values into a single summary node • Join predicate values • Convert a structure of arbitrary size into a 3-valued structure of bounded size
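
The blur operation can be sketched in a few lines. The code below assumes a single binary predicate n and a small fixed bound on the number of nodes; the `structure` layout and the names `blur`, `tv_join`, `MAX_NODES`, and `MAX_UNARY` are hypothetical and are not TVLA's data structures.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 16
#define MAX_UNARY 4

typedef enum { ZERO = 0, HALF = 1, ONE = 2 } tv;

static tv tv_join(tv a, tv b) { return a == b ? a : HALF; }

typedef struct {
    int count;                        /* number of nodes */
    int unary_count;                  /* number of unary abstraction predicates */
    tv  unary[MAX_NODES][MAX_UNARY];  /* unary[v][p] */
    tv  n[MAX_NODES][MAX_NODES];      /* binary predicate n(v1, v2) */
    int summary[MAX_NODES];           /* 1 if the node stands for several cells */
} structure;

/* Nodes that agree on every unary abstraction predicate collapse into one node;
   the binary predicate values of all merged pairs are joined. */
void blur(const structure *in, structure *out) {
    int rep[MAX_NODES];               /* rep[v] = node of out that represents v */
    int seen[MAX_NODES][MAX_NODES] = {{0}};

    assert(in->count <= MAX_NODES && in->unary_count <= MAX_UNARY);
    memset(out, 0, sizeof *out);
    out->unary_count = in->unary_count;

    for (int v = 0; v < in->count; v++) {
        size_t bytes = sizeof(tv) * (size_t)in->unary_count;
        int c;
        for (c = 0; c < out->count; c++)
            if (memcmp(in->unary[v], out->unary[c], bytes) == 0)
                break;
        if (c == out->count) {        /* first node with these predicate values */
            memcpy(out->unary[c], in->unary[v], bytes);
            out->summary[c] = in->summary[v];
            out->count++;
        } else {
            out->summary[c] = 1;      /* a second node was merged in */
        }
        rep[v] = c;
    }

    for (int v1 = 0; v1 < in->count; v1++)
        for (int v2 = 0; v2 < in->count; v2++) {
            int a = rep[v1], b = rep[v2];
            out->n[a][b] = seen[a][b] ? tv_join(out->n[a][b], in->n[v1][v2])
                                      : in->n[v1][v2];
            seen[a][b] = 1;
        }
}
```

For a three-cell list whose first cell is pointed to by x, with p[x] as the only abstraction predicate, this yields two nodes: the head node and a summary node. The n edge from the head to the summary node and the n self-loop on the summary node both come out as ½, because some merged pairs are linked and others are not.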

  14. Instrumentation • Explicitly maintains information about distinctions among cells • Leads to less blurring when used as abstraction predicates • Unary predicates defined via a first-order formula with transitive closure • Example: “local order” • inOrder[n](v) = ∀v1: n(v, v1) → dle(v, v1) • inROrder[n](v) = ∀v1: n(v, v1) → dle(v1, v)
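
On a concrete list, the "local order" predicates reduce to a comparison with the next cell. The sketch below assumes that dle(v1, v2) holds when the data of v1 is less than or equal to the data of v2, and that a cell is in order when every n-successor is at least as large (for inROrder, at most as large); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
    int data;
    struct node *n;
} *L;

/* dle(v1, v2): the data of v1 is less than or equal to the data of v2. */
static bool dle(L v1, L v2) { return v1->data <= v2->data; }

/* inOrder[n](v): every n-successor v1 of v satisfies dle(v, v1).
   A cell has at most one n-successor, so the quantifier becomes one test. */
bool in_order(L v) { return v->n == NULL || dle(v, v->n); }

/* inROrder[n](v): every n-successor v1 of v satisfies dle(v1, v). */
bool in_rorder(L v) { return v->n == NULL || dle(v->n, v); }

/* Usage sketch on the list 19 -> 7 -> 96. */
void usage_example(void) {
    struct node c = {96, NULL}, b = {7, &c}, a = {19, &b};
    assert(!in_order(&a));   /* 19 > 7 */
    assert(in_order(&b));    /* 7 <= 96 */
    assert(in_order(&c));    /* no successor */
    assert(in_rorder(&a));   /* 7 <= 19 */
}
```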

  15. Blurred Representation of Stores • [Figure: a three-node logical structure and its two-node blurred structure; every node is labeled with p[x] (1 for the node pointed to by x, 0 otherwise) and with inOrder[n]=1; n and dle edges connect the nodes] • Predicates: inOrder[n](v1), p[x](v), n(v1, v2), dle(v1, v2)

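  16. Lists with inOrder[n]=1 vs. Arbitrary Lists • [Figure: two blurred structures, each with a node pointed to by x (p[x]=1) and a node with p[x]=0, connected by n and dle edges] • First structure: inOrder[n]=1 on both nodes • Arbitrary lists: inOrder[n]=½ on both nodes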

  17. Abstract Interpretation • Iteratively compute a set of structures at every program location • Conservatively interpret statements (conditions) on blurred structures • Must terminate since the number of blurred structures is finite for a given program • Fully automatic • Guaranteed to be sound • But may be overly conservative
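
The termination argument can be seen in a tiny state-space exploration. This sketch simplifies heavily: it tracks one set of abstract states for a single program location, encodes each state as a small integer, and takes the abstract transformer as a function pointer. The names `state_set`, `transfer_fn`, `explore`, and `step` are hypothetical, and `step` is a toy transformer rather than anything from the talk.

```c
#include <assert.h>
#include <stdint.h>

/* A set of abstract states, one bit per state (at most 32 states, by assumption). */
typedef uint32_t state_set;

/* Abstract transformer: maps one state to the set of its possible successors. */
typedef state_set (*transfer_fn)(int state);

/* Repeat "add the successors of every reached state" until nothing changes.
   The set only grows and there are finitely many states, so the loop stops. */
state_set explore(state_set initial, transfer_fn post) {
    state_set reached = initial;
    for (;;) {
        state_set next = reached;
        for (int s = 0; s < 32; s++)
            if (reached & (UINT32_C(1) << s))
                next |= post(s);
        if (next == reached)
            return reached;           /* fixed point */
        reached = next;
    }
}

/* Toy transformer for illustration: state s steps to s + 1, saturating at 3. */
static state_set step(int s) {
    return UINT32_C(1) << (s < 3 ? s + 1 : 3);
}

/* Usage sketch: starting from state 0, states 0..3 are reached. */
void usage_example(void) {
    assert(explore(UINT32_C(1), step) == UINT32_C(0xF));
}
```

A full analyzer would keep one such set per program location and propagate states along control-flow edges, but the fixed-point structure is the same.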

  18. Abstract Interpretation of Insertion Sort • [Figure: two blurred structures, each with a node pointed to by x (p[x]=1) and a node with p[x]=0, connected by n and dle edges; in one structure both nodes have inOrder[n]=½, in the other both have inOrder[n]=1]

  19. The Key Problem • How to interpret statements (conditions) on blurred structures? • Difficult to provide a conservative (and reasonably precise) interpretation • It is difficult to show that specific abstractions are conservative (Sagiv, Reps, Wilhelm, TOPLAS 98) • Long and intimidating proofs • Or no proofs (and bugs)

  20. The 3-Valued Logic Approach • Automatically derives a conservative interpretation of statements and conditions from: • structural operational semantics • written using logical formulae • global properties • abstraction predicates • An experimental system (TVLA) • Correct by construction

  21. Interpreting the condition x->d <= y->d • Formula: ∃v1, v2: p[x](v1) ∧ p[y](v2) ∧ dle(v1, v2) • [Figure: a structure with a node pointed to by x, a node pointed to by y, and a third node with p[x]=0 and p[y]=0, all with inOrder[n]=½; on the true branch the node pointed to by x has inOrder[n]=1 and the other two nodes keep inOrder[n]=½]
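
Evaluating the condition x->d <= y->d on a 3-valued structure amounts to evaluating the formula ∃v1, v2: p[x](v1) ∧ p[y](v2) ∧ dle(v1, v2) with Kleene's connectives. In the sketch below, conjunction is the minimum and the existential quantifier is the maximum over all nodes; the `structure` layout and the function name `eval_x_le_y` are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 8

typedef enum { ZERO = 0, HALF = 1, ONE = 2 } tv;

static tv tv_and(tv a, tv b) { return a < b ? a : b; }
static tv tv_or(tv a, tv b)  { return a > b ? a : b; }

typedef struct {
    int count;                     /* number of nodes */
    tv  px[MAX_NODES];             /* p[x](v) */
    tv  py[MAX_NODES];             /* p[y](v) */
    tv  dle[MAX_NODES][MAX_NODES]; /* dle(v1, v2) */
} structure;

/* Value of: exists v1, v2: p[x](v1) and p[y](v2) and dle(v1, v2). */
tv eval_x_le_y(const structure *s) {
    tv result = ZERO;              /* an existential over no nodes is false */
    assert(s->count <= MAX_NODES);
    for (int v1 = 0; v1 < s->count; v1++)
        for (int v2 = 0; v2 < s->count; v2++) {
            tv body = tv_and(s->px[v1], tv_and(s->py[v2], s->dle[v1][v2]));
            result = tv_or(result, body);
        }
    return result;
}
```

A result of ½ means the structure does not determine the outcome, so a conservative interpreter has to follow both branches.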

  22. From Local Outlook to Global Outlook • (Safety) Every time control reaches a given point: • there are no garbage memory cells • the list is acyclic • each cell is locally ordered • (History) The list is a permutation of the original list

  23. Bugs Found • Pointer manipulations • null dereferences • memory leaks • Forgetting to sort the first element • Swapping equal elements in bubble sort (non-termination)

  24. A buggy insertion sort • [Figure: the resulting blurred structure; the node pointed to by x has inOrder[n]=½, the node with p[x]=0 has inOrder[n]=1]

       L insert_sort_b2(L x) {
           L r, pr, rn, l, pl;
           if (x == NULL)
               return NULL;
           pr = x;
           r = x->n;
           while (r != NULL) {
               pl = x;
               rn = r->n;
               l = x->n;          /* the scan starts after the first element */
               while (l != r) {
                   if (l->d > r->d) {
                       pr->n = rn;
                       r->n = l;
                       pl->n = r;
                       r = pr;
                       break;
                   }
                   pl = l;
                   l = l->n;
               }
               pr = r;
               r = rn;
           }
           return x;
       }

  25. Running Times

  26. Properties Not Proved • (Liveness) Termination • Stability

  27. Related Work • Temporal-logic model checking • Manually extracts finite-state machine • Does not handle dynamically allocated data • But proves stronger properties, e.g., liveness • Bourdoncle 93 • Handles integer arithmetic • Cannot handle pointers

  28. Further Work • Recursive programs (Quicksort) • Experiment with other ADTs (AVL trees) • Automatically derive predicate-update formulae for instrumentation predicates • Scaling to larger programs • User annotations • Class-level analysis • Modular analysis • Space optimizations • Smart front-end that precomputes “cheap” information

  29. Conclusions • It is possible to automatically verify non-trivial properties of complex C programs that manipulate dynamically allocated memory without providing loop invariants • The implementation is automatically generated from TVLA • But scaling is an issue

  30. Other Applications of TVLA • Verifying “cleanness” properties of C programs (Dor, Rodeh, Sagiv 2000) • null dereferences • memory leaks • Verifying safety properties of Mobile Ambients (Nielson, Nielson, Sagiv 2000) • Verifying safety properties of multithreaded Java programs (Yahav 2000) • Deadlocks • Nested monitors • Read/Write interference

  31. Boolean Connectives [Kleene]
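
Kleene's connectives have a compact encoding when the truth values are ordered 0 < ½ < 1: conjunction is the minimum, disjunction is the maximum, and negation mirrors the value around ½. The sketch below uses that encoding; the names `tv`, `tv_and`, `tv_or`, and `tv_not` are chosen here for illustration.

```c
#include <assert.h>

/* Truth values ordered 0 < 1/2 < 1 so that min and max give Kleene's tables. */
typedef enum { ZERO = 0, HALF = 1, ONE = 2 } tv;

tv tv_and(tv a, tv b) { return a < b ? a : b; }   /* minimum */
tv tv_or(tv a, tv b)  { return a > b ? a : b; }   /* maximum */
tv tv_not(tv a)       { return (tv)(ONE - a); }   /* swaps 0 and 1, keeps 1/2 */

/* Usage sketch: a definite operand can still decide the result. */
void usage_example(void) {
    assert(tv_and(ZERO, HALF) == ZERO);  /* false and unknown is false */
    assert(tv_and(ONE, HALF) == HALF);
    assert(tv_or(ONE, HALF) == ONE);     /* true or unknown is true */
    assert(tv_or(ZERO, HALF) == HALF);
    assert(tv_not(HALF) == HALF);
}
```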

  32. The Operational Semantics of x = t->n • x’(v2) = ∃v1: t(v1) ∧ n(v1, v2)
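
The update for x = t->n can be evaluated directly on a 3-valued structure. The sketch below assumes the reading "x points to v2 exactly when some node pointed to by t has an n edge to v2", with Kleene's ∧ as minimum and ∃ as maximum over all nodes; the `structure` layout and the name `assign_x_t_next` are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 8

typedef enum { ZERO = 0, HALF = 1, ONE = 2 } tv;

static tv tv_and(tv a, tv b) { return a < b ? a : b; }
static tv tv_or(tv a, tv b)  { return a > b ? a : b; }

typedef struct {
    int count;                     /* number of nodes */
    tv  x[MAX_NODES];              /* x(v): x points to v */
    tv  t[MAX_NODES];              /* t(v): t points to v */
    tv  n[MAX_NODES][MAX_NODES];   /* n(v1, v2) */
} structure;

/* x = t->n: the new x(v2) is "exists v1: t(v1) and n(v1, v2)". */
void assign_x_t_next(structure *s) {
    tv x_new[MAX_NODES];
    assert(s->count <= MAX_NODES);
    for (int v2 = 0; v2 < s->count; v2++) {
        x_new[v2] = ZERO;
        for (int v1 = 0; v1 < s->count; v1++)
            x_new[v2] = tv_or(x_new[v2], tv_and(s->t[v1], s->n[v1][v2]));
    }
    for (int v = 0; v < s->count; v++)
        s->x[v] = x_new[v];
}
```

On a blurred list where t points to the head and the n edge to the summary node is ½, the new x comes out as ½ on the summary node: the structure alone cannot tell which of the merged cells x now points to.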

  33. The Operational Semantics of x->n = NULL • n’(v1, v2) = n(v1, v2) ∧ ¬x(v1) • inOrder’[dle, n](v) = inOrder[dle, n](v) ∨ x(v) • inROrder’[dle, n](v) = inROrder[dle, n](v) ∨ x(v)

  34. The Operational Semantics of x->n = t • n’(v1, v2) = n(v1, v2) ∨ (x(v1) ∧ t(v2)) • inOrder’[dle, n](v) = (x(v) ? ∀v1: t(v1) → dle(v, v1) : inOrder[dle, n](v)) • inROrder’[dle, n](v) = (x(v) ? ∀v1: t(v1) → dle(v1, v) : inROrder[dle, n](v))
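
The two destructive updates of the n field can be sketched as element-wise updates of the n table. The code assumes the readings n’(v1, v2) = n(v1, v2) ∧ ¬x(v1) for x->n = NULL and n’(v1, v2) = n(v1, v2) ∨ (x(v1) ∧ t(v2)) for x->n = t, evaluated with Kleene's connectives; the instrumentation predicates are left out, and the `structure` layout and function names are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 8

typedef enum { ZERO = 0, HALF = 1, ONE = 2 } tv;

static tv tv_and(tv a, tv b) { return a < b ? a : b; }
static tv tv_or(tv a, tv b)  { return a > b ? a : b; }
static tv tv_not(tv a)       { return (tv)(ONE - a); }

typedef struct {
    int count;                     /* number of nodes */
    tv  x[MAX_NODES];              /* x(v): x points to v */
    tv  t[MAX_NODES];              /* t(v): t points to v */
    tv  n[MAX_NODES][MAX_NODES];   /* n(v1, v2) */
} structure;

/* x->n = NULL: remove every n edge that leaves the node pointed to by x. */
void set_next_null(structure *s) {
    assert(s->count <= MAX_NODES);
    for (int v1 = 0; v1 < s->count; v1++)
        for (int v2 = 0; v2 < s->count; v2++)
            s->n[v1][v2] = tv_and(s->n[v1][v2], tv_not(s->x[v1]));
}

/* x->n = t: add an n edge from the node pointed to by x to the node pointed to by t.
   Each new entry depends only on the old entry at the same position,
   so updating the table in place is safe. */
void set_next_t(structure *s) {
    assert(s->count <= MAX_NODES);
    for (int v1 = 0; v1 < s->count; v1++)
        for (int v2 = 0; v2 < s->count; v2++)
            s->n[v1][v2] = tv_or(s->n[v1][v2], tv_and(s->x[v1], s->t[v2]));
}
```

Calling `set_next_null` and then `set_next_t` on a structure where x points to the first of three cells and t to the third replaces the edge from the first to the second cell with an edge from the first to the third.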
