Assertion Checking Unified

Sumit Gulwani, Microsoft Research, Redmond
Ashish Tiwari, SRI

Example
• Green assertion requires modeling × as uninterpreted.
• Red assertion requires modeling × as commutative, and reasoning about the disequality guard f≠w.
• Blue assertion requires reasoning about the equality guard f=w.
Flowchart nodes:
  *
  u := 0; v := 0; f := w; a := 1; b := 1; z := f+f;
  Assert(a=b)
  Assert(z=2w)
  f≠w (False / True branches)
  a := a×c; b := b×c; u := u+(a×c); v := v+(a×c)+(c×a); f := f-1; z := z-2;
  Assert(v = 2u)

Abstract Program Model / Problem Statement
Expression languages:
• Linear Arithmetic: e = y | c | e1 ± e2 | c e
• Uninterpreted Functions: e = y | F(e1,e2)
• Combination: e = y | c | e1 ± e2 | c e | F(e1,e2)
Flowchart nodes:
• Non-det Conditional: *
• Non-det Assignment: y := ?
• Assignment: y := e
• Disequality Guard: Assume(e1≠e2), with True / False branches
• Assert(e1=e2)

Summary of Results

Outline
• Unification type of theory
• Assertion checking algorithm (unitary/finitary theories)
• coNP-hardness (bitary theories)

Unification Terminology
• A substitution is an (acyclic) mapping of some variables to expressions.
• A substitution σ1 is more general than σ2 if there exists σ such that σ2 = σ ∘ σ1 (apply σ1, then σ).
• A substitution σ is a unifier for an equality e1=e2 if e1[σ(y)/y] = e2[σ(y)/y].
Example
Consider the equality F(u) + F(v) = F(a) + F(b). {u↦a, v↦b} is a unifier for it, and so is {u↦1, a↦1, v↦b}. The former unifier is more general than the latter: composing it with {a↦1} yields the latter.

Unification Terminology Continued …
• A set of unifiers {σ1,…,σk} for e1=e2 is complete if for all unifiers σ of e1=e2, ∃i s.t. σi is more general than σ.
• Let Unif(e1=e2) = ⋁_{i=1..k} ⋀_y (y = σi(y)), where {σ1,…,σk} is a complete set of unifiers for e1=e2.
Example
Consider the equality F(u) + F(v) = F(a) + F(b). {{u↦a, v↦b}, {u↦b, v↦a}} is a complete set of unifiers for it. Hence, Unif(F(u)+F(v)=F(a)+F(b)) = (u=a ∧ v=b) ∨ (u=b ∧ v=a).
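The Unif construction in the example can be sketched in Python for one restricted shape of equality: both sides are sums of F applied to plain variables. The function name `unif_sum_of_F` and the representation (a unifier as a set of variable pairs) are assumptions made for illustration, not part of the talk; this is a minimal sketch, not a general unification procedure.

```python
from itertools import permutations

def unif_sum_of_F(left_args, right_args):
    """Unifiers for F(l1)+...+F(ln) = F(r1)+...+F(rn), where every
    argument is a plain variable name (an assumed restriction).
    Each unifier is a frozenset of (variable, variable) equalities; the
    returned set is read as the disjunction of these conjunctions."""
    if len(left_args) != len(right_args):
        return set()  # summands cannot be paired up: empty disjunction
    unifiers = set()
    for ordering in permutations(right_args):
        # Pair the i-th left summand with the i-th summand of this ordering,
        # dropping trivial equalities such as u = u.
        eqs = frozenset((l, r) for l, r in zip(left_args, ordering) if l != r)
        unifiers.add(eqs)
    return unifiers

# Unif(F(u)+F(v) = F(a)+F(b)) as a set of conjunctions
print(unif_sum_of_F(["u", "v"], ["a", "b"]))
```

Each element of the result corresponds to one disjunct of Unif; here the two elements stand for (u=a ∧ v=b) and (u=b ∧ v=a).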

Unification Type of Theories
• Unitary: All equalities e1=e2 have a complete set of unifiers that is a singleton.
• Finitary: All equalities e1=e2 have a complete set of unifiers whose cardinality is finite.
• Bitary: There exists an equality e1=e2 whose complete set of unifiers has 2 unifiers of the form y↦z1 and y↦z2.

Examples of Bitary Theories
Bitary: There exists an equality e1=e2 whose complete set of unifiers has 2 unifiers of the form y↦z1 and y↦z2.
• Commutative Functions: F(F(y,y),F(z1,z2)) = F(F(y,z1),F(y,z2))
• Combination of Linear Arithmetic + Uninterpreted Functions: F(F(y)+F(y)) + F(F(z1)+F(z2)) = F(F(y)+F(z1)) + F(F(y)+F(z2))
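The commutative example can be checked mechanically. The sketch below assumes a term representation chosen for illustration (variables as strings, applications as tuples) and compares terms modulo commutativity by sorting arguments recursively. It only confirms that y↦z1 and y↦z2 are unifiers; it does not prove that the set is complete.

```python
def subst(term, sigma):
    """Apply substitution sigma (dict: variable -> term) to a term."""
    if isinstance(term, str):
        return sigma.get(term, term)
    return (term[0],) + tuple(subst(arg, sigma) for arg in term[1:])

def canon(term):
    """Canonical form modulo commutativity: sort arguments recursively."""
    if isinstance(term, str):
        return term
    args = sorted((canon(arg) for arg in term[1:]), key=repr)
    return (term[0],) + tuple(args)

# F(F(y,y), F(z1,z2)) = F(F(y,z1), F(y,z2)), with F commutative
LHS = ("F", ("F", "y", "y"), ("F", "z1", "z2"))
RHS = ("F", ("F", "y", "z1"), ("F", "y", "z2"))

def is_unifier(sigma):
    """True if sigma makes both sides equal modulo commutativity of F."""
    return canon(subst(LHS, sigma)) == canon(subst(RHS, sigma))

print(is_unifier({"y": "z1"}), is_unifier({"y": "z2"}), is_unifier({}))
```

The empty substitution is included as a sanity check that the two sides are not already equal.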

Connection between Assertion Checking & Unification
An assertion e1 = e2 holds at a program point iff the assertion Unif(e1=e2) holds at that point.
Example
To prove F(u)+F(v) = F(a)+F(b), we need to prove that (u=a ∧ v=b) ∨ (u=b ∧ v=a) is true.

Assertion Checking Algorithm
• Backward analysis strengthened with Unification
• Perform weakest precondition computation.
• At each step replace the formula φ by Unif(φ), which is a stronger and simpler formula.
• Termination (reach fixpoint across loops)?
• Yes, because of unifier computations.
• PTIME for unitary theories (no disequality guards). Bounded (hence decidable) for finitary theories.

Advantage of Backward Analysis with Unification
• Forward Analysis: needs to maintain an infinite number of facts: F^i(u) + F^i(v) = F^i(a) + F^i(b) at the first join point.
• Backward Analysis: does not terminate: F^i(u) + F^i(v) = F^i(a) + F^i(b)
• Backward Analysis with Unification: Terminates in 2 steps: [(u=a ∧ v=b) ∨ (u=b ∧ v=a)]
Flowchart nodes:
  *
  u := b; v := a;
  u := a; v := b;
  u := F(u); v := F(v); a := F(a); b := F(b);
  *
  Assert(u+v=a+b)
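The fixpoint claim can be illustrated with a small Python sketch. It assumes the flowchart is a nondeterministic branch (u := b; v := a versus u := a; v := b) followed by a nondeterministic loop around the F updates, and it starts from the already unified formula (u=a ∧ v=b) ∨ (u=b ∧ v=a) rather than from u+v=a+b. The helper names and the formula representation are hypothetical; the unification step handles only the case F(s)=F(t), which for uninterpreted F reduces to s=t.

```python
def subst(term, env):
    """Apply an assignment env (variable -> term) to a term."""
    if isinstance(term, str):
        return env.get(term, term)
    return (term[0], subst(term[1], env))

def strip(s, t):
    """Unification step for uninterpreted F: F(s) = F(t) reduces to s = t."""
    while isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0]:
        s, t = s[1], t[1]
    return s, t

def wp_assign(formula, env):
    """Weakest precondition of an assignment block, followed by unification.
    A formula is a set of disjuncts; a disjunct is a frozenset of equalities.
    An empty disjunct stands for 'true'."""
    result = set()
    for disjunct in formula:
        eqs = set()
        for s, t in disjunct:
            s2, t2 = strip(subst(s, env), subst(t, env))
            if s2 != t2:  # drop trivial equalities
                eqs.add((s2, t2))
        result.add(frozenset(eqs))
    return result

phi = {frozenset({("u", "a"), ("v", "b")}), frozenset({("u", "b"), ("v", "a")})}
loop_body = {x: ("F", x) for x in ("u", "v", "a", "b")}

print(wp_assign(phi, loop_body) == phi)                      # stable across the loop
print(frozenset() in wp_assign(phi, {"u": "b", "v": "a"}))   # first branch gives true
print(frozenset() in wp_assign(phi, {"u": "a", "v": "b"}))   # second branch gives true
```

Without the `strip` step the equalities would grow to F(u)=F(a), F(F(u))=F(F(a)), and so on, which is the non-terminating behaviour described for plain backward analysis.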

Handling equality guards
Equality Guards: Assume(x = y) with postcondition φ gets precondition φ ∨ φ[x/y] ∨ φ[y/x].
• Standard weakest precondition will lead to disequalities in formulas.
• Instead we can use heuristics as above.
Disequality Guards: Assume(x ≠ y) with postcondition φ gets precondition φ ∨ x=y.
• We perform standard weakest precondition computation.

Reducing Unsatisfiability to Assertion Checking
ψ: boolean 3-SAT instance with m clauses and k variables
IsUnsatisfiable(ψ) {
  for j=1 to m
    cj := F;
  for i=1 to k do
    if (*)
      ∀j s.t. var i occurs positively in clause j, cj := T;
    else
      ∀j s.t. var i occurs negatively in clause j, cj := T;
  Assert(c1=F ∨ c2=F ∨ … ∨ cm=F);
}
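The reduction can be read as follows: each path through the nondeterministic conditionals picks a truth assignment, and the final assertion fails on a path exactly when every clause has been marked T. The Python sketch below makes this concrete by enumerating all paths by brute force. The function name and the DIMACS-style clause encoding (variable i as the integer i, its negation as -i) are assumptions for illustration; an assertion checker would not enumerate paths this way.

```python
from itertools import product

def assertion_holds_on_all_paths(clauses, num_vars):
    """Simulate IsUnsatisfiable on every path of the nondeterministic program.
    clauses: list of clauses, each a list of nonzero ints (i or -i).
    Returns True iff the final assertion holds on all paths,
    i.e. iff the formula is unsatisfiable."""
    for choices in product([True, False], repeat=num_vars):
        c = [False] * len(clauses)            # cj := F for every clause
        for i, take_then in enumerate(choices, start=1):
            literal = i if take_then else -i  # then-branch: positive occurrences
            for j, clause in enumerate(clauses):
                if literal in clause:
                    c[j] = True               # cj := T
        if all(c):                            # no cj is F: the assertion fails
            return False
    return True

print(assertion_holds_on_all_paths([[1], [-1]], 1))         # x1 and not x1
print(assertion_holds_on_all_paths([[1, 2], [-1, 2]], 2))   # satisfiable by x2
```

The loop over `choices` takes 2^k iterations, which is consistent with the coNP-hardness argument: the hardness comes from having to cover all paths.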

Encoding disjunction
• The check c1=F ∨ c2=F can be encoded by some appropriate assertion e1=e2 in a bitary theory.
• The above trick can be recursively applied to construct an assertion that encodes c1=F ∨ c2=F ∨ … ∨ cm=F.

Conclusion
• Complexity of assertion checking depends on the unification type of the theory of program expressions: Unitary (PTIME), Bitary (coNP-hard), Finitary (Decidable).
• The assertion checking algorithm is based on backward analysis strengthened with unification.
• For some infinite-height abstract domains, a (goal-driven) backward analysis is more efficient than forward analysis.
• Use of unification is yet another non-trivial use of theorem proving in program analysis.

Proof of Termination
• At each program point, the proof obligation has the form: ⋁_{i=1..k} ⋀_y (y = σi(y))
• In each successive loop iteration, the above formula becomes stronger. We prove this cannot happen indefinitely:
• Assign the following measure to the above formula: { # of conjuncts representing unifier σi | i = 1 to k }
• Show this measure decreases in some well-founded ordering.

Discussion
The assertion checking algorithm is based on backward analysis strengthened with unification.
• Backward Analysis vs. Forward Analysis
• Use of Theorem Proving in Program Analysis
• Combining (Forward) Abstract Interpreters [PLDI 06]: using an extension of the Nelson-Oppen combination method.
• This paper: Backward Analysis using Unification
