Putting Static Analysis to Work for Verification A Case Study

Tal Lev-Ami, Thomas Reps, Mooly Sagiv, Reinhard Wilhelm

Program Verification • Mathematically prove that the program is “partially” correct on all inputs • Example: Hoare style verification x  n {x := x + 1} x  n + 1

Why Use Program Verification? • Debugging programs is hard • Testing can only show the presence of errors - not their absence • Can provide counter examples • ...

Obstacles to Program Verification • Hard to specify software • Does not “scale” • Limited program size • Programmer needs to provide loop invariants • Pointers and dynamically allocated objects are not handled

Our Goals • Handle pointers and dynamically allocated objects (unbounded memory and/or multi-threading) • No loop invariants • Input: pre {Procedure} post • Output: a safe approximation p to the strongest postcondition; issue a warning if p ⇏ post • Conservative: never misses an error, but may yield false warnings

typedef struct node {
    int data;
    struct node *n;
} *L;

/* precondition: list(x) */
L insert_sort(L x) {
    L r, pr, rn, l, pl;
    r = x; pr = NULL;
    while (r != NULL) {
        l = x; rn = r->n; pl = NULL;
        while (l != r) {
            if (l->data > r->data) {
                /* unlink r and re-insert it before l */
                pr->n = rn; r->n = l;
                if (pl == NULL) x = r;
                else pl->n = r;
                r = pr;
                break;
            }
            pl = l; l = l->n;
        }
        pr = r; r = rn;
    }
    return x;
}
/* postcondition: olist(x) */

int main() {
    L x, y, z, w;
    L create(), insert_sort(L);
    L merge(L,L), reverse(L);
    x = create();          /* list(x) */
    x = insert_sort(x);    /* olist(x) */
    y = create();          /* list(y) */
    y = insert_sort(y);    /* olist(y) */
    z = merge(x,y);        /* olist(z) */
    w = reverse(z);        /* rolist(w) */
}

Conventional Verification • Formulae over program variables express pre- and post-conditions • The assignment rule is used to generate the strongest postcondition for non-destructive updates • Programmer provides loop invariants
Our Approach • A finite set of descriptors expresses pre- and post-conditions • Predicate-update formulae specify a safe set of descriptors (abstract semantics) • Iteratively explore all the descriptors at every program point (abstract interpretation) • The ADT designer can provide domain-specific information via instrumentation

Outline of the Rest of this Talk • Concentrate on sorting • Descriptors: compact representation of stores • State-space exploration via abstract interpretation • Prototype implementation in TVLA (Three-Valued Logic Analyzer) • Conclusions

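Logical representation of stores • Predicates: p[x](v), n(v1, v2), dle(v1, v2) • [Figure: the list x → 19 → 7 → 96 → null, whose cells have data and n fields, drawn as a logical structure with n and dle edges; p[x]=1 on the first node and p[x]=0 on the other two]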

Three-Valued Logic • 1: True • 0: False • ½ = {0, 1}: Unknown • A join semi-lattice under the information order: 0 ⊔ 1 = ½
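
As a minimal sketch of the information order on this slide, the C fragment below encodes the three truth values and their join. The names tv, tv_join, and tv_leq are illustrative assumptions, not TVLA identifiers.

```c
#include <stdbool.h>

/* The three truth values; TV_HALF stands for the "unknown" value 1/2. */
typedef enum { TV_FALSE, TV_TRUE, TV_HALF } tv;

/* Join in the information order: agreeing values are kept,
   disagreeing values collapse to 1/2. */
tv tv_join(tv a, tv b) { return a == b ? a : TV_HALF; }

/* Information order: a is at least as precise as b
   when they are equal or b is 1/2. */
bool tv_leq(tv a, tv b) { return a == b || b == TV_HALF; }
```

For example, tv_join(TV_FALSE, TV_TRUE) yields TV_HALF, matching 0 ⊔ 1 = ½.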

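Blurred Representation of Stores • [Figure: the list x → 19 → 7 → 96 → null and its logical structure (p[x]=1 on the first node, p[x]=0 on the other two), next to its blurred form: one node with p[x]=1 and one node with p[x]=0, connected by n and dle edges]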

Parametric Abstraction (Blur) • Merge all the nodes with the same unary “abstraction” predicate values into a single summary node • Join predicate values • Convert a structure of arbitrary size into a 3-valued structure of bounded size
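
The three steps above can be sketched in C as follows. This is a simplified illustration, not TVLA's implementation: it assumes a single binary predicate n, treats every unary predicate as an abstraction predicate, and uses hypothetical names (structure, blur, MAX_NODES, MAX_PREDS).

```c
#include <assert.h>

#define MAX_NODES 8
#define MAX_PREDS 4

typedef enum { TV_FALSE, TV_TRUE, TV_HALF } tv;

typedef struct {
    int num_nodes;
    int num_unary;
    tv unary[MAX_PREDS][MAX_NODES];  /* unary[p][v] */
    tv n[MAX_NODES][MAX_NODES];      /* the binary predicate n(v1, v2) */
    int summary[MAX_NODES];          /* nonzero: node stands for several cells */
} structure;

tv tv_join(tv a, tv b) { return a == b ? a : TV_HALF; }

/* Merge all nodes that agree on every unary predicate, and join the
   values of n over the merged nodes. */
structure blur(const structure *s) {
    structure out = {0};
    int rep[MAX_NODES];                      /* rep[v]: abstract node of v */
    int seen[MAX_NODES][MAX_NODES] = {{0}};  /* has out.n[a][b] been set? */

    assert(s->num_nodes <= MAX_NODES && s->num_unary <= MAX_PREDS);
    out.num_unary = s->num_unary;

    for (int v = 0; v < s->num_nodes; v++) {
        int a = 0;
        for (; a < out.num_nodes; a++) {     /* look for a matching node */
            int same = 1;
            for (int p = 0; p < s->num_unary; p++)
                if (s->unary[p][v] != out.unary[p][a]) same = 0;
            if (same) break;
        }
        if (a == out.num_nodes) {            /* first node with these values */
            for (int p = 0; p < s->num_unary; p++)
                out.unary[p][a] = s->unary[p][v];
            out.summary[a] = s->summary[v];
            out.num_nodes++;
        } else {
            out.summary[a] = 1;              /* two or more nodes merged */
        }
        rep[v] = a;
    }

    for (int u = 0; u < s->num_nodes; u++)
        for (int v = 0; v < s->num_nodes; v++) {
            int a = rep[u], b = rep[v];
            out.n[a][b] = seen[a][b] ? tv_join(out.n[a][b], s->n[u][v])
                                     : s->n[u][v];
            seen[a][b] = 1;
        }
    return out;
}
```

Applied to a three-cell list whose first cell is pointed to by x, this sketch returns two nodes: the x node, and a summary node for the remaining cells, with n between them equal to ½. The output size depends only on the number of unary predicates, not on the length of the list.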

Instrumentation • Explicitly maintains information about distinctions among cells • Leads to less blurring when used as abstraction predicates • Unary predicates defined via a first-order formula + transitive closure • Example: “local order” • inOrder[n](v) = ∀v1: n(v, v1) ⇒ dle(v, v1) • inROrder[n](v) = ∀v1: n(v, v1) ⇒ dle(v1, v)
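
On a concrete list, the two "local order" predicates can be checked directly. The sketch below reads them as "the data in a cell is at most (respectively, at least) the data in its n-successor", which holds vacuously for the last cell. The function names in_order, in_rorder, and all_in_order are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
    int data;
    struct node *n;
} *L;

/* inOrder[n](v): v's data is <= the data of its n-successor, if any. */
bool in_order(L v) { return v->n == NULL || v->data <= v->n->data; }

/* inROrder[n](v): the successor's data is <= v's data, if any. */
bool in_rorder(L v) { return v->n == NULL || v->n->data <= v->data; }

/* True when every cell reachable from x is locally ordered. */
bool all_in_order(L x) {
    for (; x != NULL; x = x->n)
        if (!in_order(x)) return false;
    return true;
}
```

For the list 19 → 7 → 96, in_order fails only on the first cell (19 > 7), so the list as a whole is not locally ordered, while the suffix 7 → 96 is.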

Blurred Representation of Stores • Predicates: inOrder[n](v1), p[x](v), n(v1, v2), dle(v1, v2) • [Figure: a three-node logical structure with inOrder[n]=1 on every node (p[x]=1 on the first, p[x]=0 on the other two), and its blurred form: a node with p[x]=1, inOrder[n]=1 and a node with p[x]=0, inOrder[n]=1]

[Figure: a blurred structure with inOrder[n]=1 on both the p[x]=1 node and the p[x]=0 node] vs. Arbitrary Lists [Figure: a blurred structure with inOrder[n]=½ on both the p[x]=1 node and the p[x]=0 node]

Abstract Interpretation • Iteratively compute a set of structures at every program location • Conservatively interpret statements (conditions) on blurred structures • Must terminate since the number of blurred structures is finite for a given program • Fully automatic • Guaranteed to be sound • But may be overly conservative
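
The iteration described above can be sketched as a fixpoint computation over a control-flow graph. The example below is a toy under stated assumptions: the descriptors are just three abstract list lengths (EMPTY, ONE, MANY) rather than blurred structures, and every name (edge, analyze, t_null, t_cons, t_skip) is hypothetical. It illustrates why termination follows from a finite descriptor set: each round either adds a descriptor somewhere or stops.

```c
#include <stddef.h>

enum { EMPTY, ONE, MANY, NUM_STATES };  /* toy descriptors */
typedef unsigned set;                   /* bit i set: descriptor i present */

typedef set (*transfer)(int state);     /* abstract effect of one statement */
typedef struct { int src, dst; transfer f; } edge;

/* Apply a transfer function to every descriptor in a set. */
set apply(transfer f, set in) {
    set out = 0;
    for (int s = 0; s < NUM_STATES; s++)
        if (in & (1u << s)) out |= f(s);
    return out;
}

/* Propagate descriptors along edges until no location gains a new one.
   at[loc] accumulates the set of descriptors reaching location loc. */
void analyze(const edge *edges, size_t num_edges, set *at,
             int entry, set init) {
    int changed = 1;
    at[entry] |= init;
    while (changed) {
        changed = 0;
        for (size_t i = 0; i < num_edges; i++) {
            set out = apply(edges[i].f, at[edges[i].src]);
            if (out & ~at[edges[i].dst]) {
                at[edges[i].dst] |= out;
                changed = 1;
            }
        }
    }
}

/* Toy transfer functions. */
set t_null(int s) { (void)s; return 1u << EMPTY; }                 /* x = NULL */
set t_cons(int s) { return s == EMPTY ? 1u << ONE : 1u << MANY; }  /* add a cell */
set t_skip(int s) { return 1u << s; }                              /* no effect */
```

With the edges entry → head (t_null), head → head (t_cons), and head → exit (t_skip), the analysis stabilizes once the loop head holds all three descriptors.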

Abstract Interpretation of Insertion Sort • [Figure: a blurred structure with inOrder[n]=½ on both the p[x]=1 node and the p[x]=0 node, and a blurred structure with inOrder[n]=1 on both nodes]

The Key Problem • How to interpret statements (conditions) on blurred structures? • Difficult to provide a conservative (and reasonably precise) interpretation • It is difficult to show that specific abstractions are conservative (Sagiv, Reps, Wilhelm, TOPLAS 98) • Long and intimidating proofs • Or no proofs (and bugs)

The 3-Valued Logic Approach • Automatically derives a conservative interpretation of statements and conditions from: structural operational semantics written using logical formulae, global properties, and abstraction predicates • An experimental system (TVLA) • Correct by construction

Condition x->d <= y->d, true branch • The condition is expressed as a formula over v1, v2 that relates p[x](v1), p[y](v2), and dle(v1, v2) • [Figure: a blurred structure with three nodes (p[x]=1; p[y]=1; p[x]=0, p[y]=0), all with inOrder[n]=½, and the structure on the true branch, in which the p[x]=1 node has inOrder[n]=1 and the other two nodes keep inOrder[n]=½]

From Local Outlook to Global Outlook • (Safety) Every time control reaches a given point: • there are no garbage memory cells • the list is acyclic • each cell is locally ordered • (History) The list is a permutation of the original list

Bugs Found • Pointer manipulations: null dereferences, memory leaks • Forgetting to sort the first element • Swapping equal elements in bubble sort (non-termination)

L insert_sort_b2(L x) {
    L r, pr, rn, l, pl;
    if (x == NULL) return NULL;
    pr = x; r = x->n;
    while (r != NULL) {
        /* the scan starts at the second cell, so the first cell is never compared */
        pl = x; rn = r->n; l = x->n;
        while (l != r) {
            if (l->d > r->d) {
                pr->n = rn; r->n = l; pl->n = r;
                r = pr;
                break;
            }
            pl = l; l = l->n;
        }
        pr = r; r = rn;
    }
    return x;
}
[Figure: the resulting blurred structure, in which the p[x]=1 node has inOrder[n]=½ and the p[x]=0 node has inOrder[n]=1]

Running Times

Properties Not Proved • (Liveness) Termination • Stability

Related Work • Temporal-logic model checking • Manually extracts finite-state machine • Does not handle dynamically allocated data • But proves stronger properties, e.g., liveness • Bourdoncle 93 • Handles integer arithmetic • Cannot handle pointers

Further Work • Recursive programs (Quicksort) • Experiment with other ADTs (AVL trees) • Automatically derive predicate-update formulae for instrumentation predicates • Scaling to larger programs • User annotations • Class-level analysis • Modular analysis • Space optimizations • Smart front-end that precomputes “cheap” information

Conclusions • It is possible to automatically verify non-trivial properties of complex C programs that manipulate dynamically allocated memory without providing loop invariants • The implementation is automatically generated from TVLA • But scaling is an issue

Other Applications of TVLA • Verifying “cleanness” properties of C programs (Dor, Rodeh, Sagiv 2000): null dereferences, memory leaks • Verifying safety properties of Mobile Ambients (Nielson, Nielson, Sagiv 2000) • Verifying safety properties of multithreaded Java programs (Yahav 2000): deadlocks, nested monitors, read/write interference

Boolean Connectives [Kleene]
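
The truth tables for this slide are not reproduced in the transcript. A standard way to state Kleene's connectives, sketched here with the values ordered 0 < ½ < 1, is conjunction as minimum, disjunction as maximum, and negation as 1 − v. The identifiers below are illustrative assumptions.

```c
/* Truth values ordered 0 < 1/2 < 1 so that min and max implement and/or. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } kleene;

kleene k_and(kleene a, kleene b) { return a < b ? a : b; }      /* minimum */
kleene k_or(kleene a, kleene b)  { return a > b ? a : b; }      /* maximum */
kleene k_not(kleene a)           { return (kleene)(K_TRUE - a); }
```

On the definite values 0 and 1 these agree with the ordinary Boolean connectives. An unknown operand yields ½ unless the other operand already decides the result, as in 0 ∧ ½ = 0 and 1 ∨ ½ = 1.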

The Operational Semantics of x = t->n • x’(v) = ∃v1: t(v1) ∧ n(v1, v)
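
To illustrate how such a formula is interpreted on a blurred structure, the sketch below evaluates it in Kleene logic, reading the slide's formula as x’(v) = ∃v1: t(v1) ∧ n(v1, v) and taking the existential quantifier as a maximum over all nodes. The array layout and the name assign_from_next are assumptions.

```c
#define MAX_NODES 8

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } kleene;

kleene k_and(kleene a, kleene b) { return a < b ? a : b; }
kleene k_or(kleene a, kleene b)  { return a > b ? a : b; }

/* x = t->n: for each node v, x'(v) is the Kleene "or" over all v1
   of t(v1) "and" n(v1, v). */
void assign_from_next(int num_nodes, const kleene t[],
                      kleene n[][MAX_NODES], kleene x_new[]) {
    for (int v = 0; v < num_nodes; v++) {
        kleene acc = K_FALSE;
        for (int v1 = 0; v1 < num_nodes; v1++)
            acc = k_or(acc, k_and(t[v1], n[v1][v]));
        x_new[v] = acc;
    }
}
```

If t points to node 0 and n(0, 1) = ½ because node 1 is a summary node, the sketch yields x’(1) = ½: the result is safe, but it does not say which of the summarized cells x points to.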

The Operational Semantics of x->n = NULL • n’(v1, v2) = n(v1, v2) ∧ ¬x(v1) • inOrder’[dle, n](v) = inOrder[dle, n](v) ∨ x(v) • inROrder’[dle, n](v) = inROrder[dle, n](v) ∨ x(v)
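
A small sketch of these update formulae evaluated in Kleene logic, reading them as n’(v1, v2) = n(v1, v2) ∧ ¬x(v1) and inOrder’(v) = inOrder(v) ∨ x(v). The structure layout and the name clear_next are assumptions, and inROrder is omitted because it is updated in the same way.

```c
#define MAX_NODES 8

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } kleene;

kleene k_and(kleene a, kleene b) { return a < b ? a : b; }
kleene k_or(kleene a, kleene b)  { return a > b ? a : b; }
kleene k_not(kleene a)           { return (kleene)(K_TRUE - a); }

typedef struct {
    int num_nodes;
    kleene x[MAX_NODES];            /* x points to v */
    kleene in_order[MAX_NODES];     /* instrumentation predicate */
    kleene n[MAX_NODES][MAX_NODES]; /* n(v1, v2) */
} structure;

/* x->n = NULL: remove the n edges leaving x's cell; that cell is then
   trivially in order because it has no successor. */
structure clear_next(const structure *s) {
    structure out = *s;
    for (int v1 = 0; v1 < s->num_nodes; v1++)
        for (int v2 = 0; v2 < s->num_nodes; v2++)
            out.n[v1][v2] = k_and(s->n[v1][v2], k_not(s->x[v1]));
    for (int v = 0; v < s->num_nodes; v++)
        out.in_order[v] = k_or(s->in_order[v], s->x[v]);
    return out;
}
```

On a blurred list where inOrder is ½ everywhere, the update turns inOrder on x's cell into a definite 1 while the summary node keeps ½. This shows how maintaining the instrumentation predicate explicitly recovers precision.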

The Operational Semantics of x->n = t • n’(v1, v2) = n(v1, v2) ∨ (x(v1) ∧ t(v2)) • inOrder’[dle, n](v) = (x(v) ? ∀v1: t(v1) ⇒ dle(v, v1) : inOrder[dle, n](v)) • inROrder’[dle, n](v) = (x(v) ? ∀v1: t(v1) ⇒ dle(v1, v) : inROrder[dle, n](v))
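
The conditional update can be sketched in the same style. The quantifier and connectives here are an assumed reading of the slide, chosen to agree with the definition of inOrder: n’(v1, v2) = n(v1, v2) ∨ (x(v1) ∧ t(v2)), and inOrder’(v) = (x(v) ? ∀v1: t(v1) ⇒ dle(v, v1) : inOrder(v)). The conditional c ? a : b is encoded as (c ∧ a) ∨ (¬c ∧ b), and all names are illustrative.

```c
#define MAX_NODES 8

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } kleene;

kleene k_and(kleene a, kleene b) { return a < b ? a : b; }
kleene k_or(kleene a, kleene b)  { return a > b ? a : b; }
kleene k_not(kleene a)           { return (kleene)(K_TRUE - a); }

/* (c ? a : b), encoded as (c and a) or (not c and b). */
kleene k_ite(kleene c, kleene a, kleene b) {
    return k_or(k_and(c, a), k_and(k_not(c), b));
}

typedef struct {
    int num_nodes;
    kleene x[MAX_NODES], t[MAX_NODES], in_order[MAX_NODES];
    kleene n[MAX_NODES][MAX_NODES], dle[MAX_NODES][MAX_NODES];
} structure;

/* x->n = t: add an n edge from x's cell to t's cell and recompute
   inOrder for x's cell only. */
structure set_next(const structure *s) {
    structure out = *s;
    for (int v1 = 0; v1 < s->num_nodes; v1++)
        for (int v2 = 0; v2 < s->num_nodes; v2++)
            out.n[v1][v2] = k_or(s->n[v1][v2], k_and(s->x[v1], s->t[v2]));
    for (int v = 0; v < s->num_nodes; v++) {
        kleene all = K_TRUE;  /* for all v1: t(v1) implies dle(v, v1) */
        for (int v1 = 0; v1 < s->num_nodes; v1++)
            all = k_and(all, k_or(k_not(s->t[v1]), s->dle[v][v1]));
        out.in_order[v] = k_ite(s->x[v], all, s->in_order[v]);
    }
    return out;
}
```

With two separate cells where x's data is at most t's data, the update links them and keeps inOrder at 1 on x's cell. If the data are the other way round, inOrder on x's cell becomes 0.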
