ECI 2007: Specification and Verification of Object-Oriented Programs

ECI 2007: Specification and Verification of Object-Oriented Programs Lecture 6

φ ∈ Formula := A | ¬φ | φ ∧ φ
A ∈ Atom := b | t = 0 | t < 0 | t ≤ 0
t ∈ Term := c | x | t + t | t – t | c·t | Select(m,t)
m ∈ MemTerm := f | Update(m,t,t)
f ∈ Field
b ∈ SymBoolConst
x ∈ SymIntConst
c ∈ {…,-1,0,1,…}

Memory axiom
For all objects o and o’, values v, and memories m:
  o = o’ ⇒ Select(Update(m,o,v),o’) = v
  o ≠ o’ ⇒ Select(Update(m,o,v),o’) = Select(m,o’)
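As a rough illustration of the two cases of the memory axiom, the sketch below models a memory as a Python dict from objects to values. The names update and select, the object names "a" and "b", and the stored values are assumptions made for this example; they are not part of the lecture.

```python
def update(m, o, v):
    """Update(m, o, v): a copy of memory m in which object o maps to v."""
    new_m = dict(m)  # copy, so the original memory m is left unchanged
    new_m[o] = v
    return new_m


def select(m, o):
    """Select(m, o): the value stored for object o in memory m."""
    return m[o]


f = {"a": 1, "b": 2}  # hypothetical memory for field f
g = update(f, "a", 5)

same_object = select(g, "a")   # case o = o': reads back the written value
other_object = select(g, "b")  # case o ≠ o': reads the value from the old memory
```

Treating Update as a function that returns a new memory, rather than mutating the old one, mirrors the way the axiom talks about terms instead of program state.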

{ b.f = 5 } a.f = 5 { a.f + b.f = 10 } is valid
iff
Select(f,b) = 5 ∧ Select(Update(f,a,5),a) + Select(Update(f,a,5),b) ≠ 10 is unsatisfiable
• theory of arithmetic: 5, 10, +
• theory of arrays: Select, Update, f
Constraints that arise in program verification are mixed!
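The unsatisfiability claim can be sanity-checked by brute force on a tiny universe. The sketch below is an assumption-laden illustration, not a decision procedure: it enumerates two hypothetical objects, a small range of field values, and every choice of a and b (including the aliased case a = b), and looks for a memory that satisfies the precondition but violates the postcondition.

```python
from itertools import product


def counterexamples(objects=("o1", "o2"), values=range(4, 7)):
    """Search a small finite space for a violation of the Hoare triple."""
    found = []
    for a, b in product(objects, repeat=2):          # a and b may alias
        for vals in product(values, repeat=len(objects)):
            f = dict(zip(objects, vals))             # memory for field f
            if f[b] != 5:                            # precondition: Select(f,b) = 5
                continue
            g = {**f, a: 5}                          # Update(f, a, 5)
            if g[a] + g[b] != 10:                    # negated postcondition
                found.append((a, b, f))
    return found


violations = counterexamples()
```

An empty result only shows that no counterexample exists in the enumerated space; the general claim is what the combined decision procedure establishes.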

Theories communicating via equality and variables
Select(f,b) = 5 ∧ Select(Update(f,a,5),a) + Select(Update(f,a,5),b) ≠ 10
Introduce:
• variable w to represent Select(f,b)
• variable x to represent Select(Update(f,a,w),a)
• variable y to represent Select(Update(f,a,w),b)
• variables z and z’ to eliminate the arithmetic disequality
Theory of arithmetic: w = 5, x + y = z, z’ = 10
Theory of arrays: w = Select(f,b), x = Select(Update(f,a,w),a), y = Select(Update(f,a,w),b), z ≠ z’
Equalities communicated: x = w and y = w (derived by the theory of arrays), then z = z’ (derived by the theory of arithmetic), which contradicts z ≠ z’

Theory of arrays
φ ∈ Formula := A | φ ∧ φ
A ∈ Atom := t = t | t ≠ t
t ∈ Term := c | Select(m,t)
m ∈ MemTerm := f | Update(m,t,t)
c ∈ SymConst
For all objects o and o’, values v, and memories m:
  o = o’ ⇒ Select(Update(m,o,v),o’) = v
  o ≠ o’ ⇒ Select(Update(m,o,v),o’) = Select(m,o’)

Theory of Equality with Uninterpreted Functions
φ ∈ Formula := A | φ ∧ φ
A ∈ Atom := t = t | t ≠ t
t ∈ Term := c | f(t,…,t)
c ∈ SymConst
f ∈ Function
For all constants a, b, and c and functions f:
- a = a
- a = b ⇒ b = a
- a = b ∧ b = c ⇒ a = c
- a = b ⇒ f(a) = f(b)

Examples
• f(f(f(f(f(a))))) = a ∧ f(f(f(a))) = a entails f(f(a)) = a, f(a) = a, and f(f(f(f(a)))) = a
• f(a,b) = a ∧ f(f(a,b),b) = b entails f(a,b) = b and a = b

f(f(f(f(f(a))))) = a, f(f(f(a))) = a
[Figure: graph of the terms in the first example]
f(a,b) = a, f(f(a,b),b) = b
[Figure: graph of the terms in the second example]

Congruence closure algorithm
[Figure: e-graph]
Use the union-find algorithm to maintain equivalence classes on terms.

Decision procedure for EUF
1. Construct initial e-graph for all terms appearing in equalities and disequalities.
2. Apply congruence closure ignoring disequalities.
3. If there is a disequality t1 ≠ t2 and an equivalence class containing both t1 and t2, return unsatisfiable.
4. Otherwise, return satisfiable.
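The four steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the lecture's implementation: constants are represented as strings, an application f(t1,…,tn) as the tuple ("f", t1, …, tn), the function name euf_satisfiable is made up for the example, and congruence closure is computed by a naive fixpoint loop over pairs of application terms rather than an efficient worklist.

```python
def subterms(t, acc):
    """Add t and all of its subterms to the set acc."""
    acc.add(t)
    if isinstance(t, tuple):          # application ("f", arg1, ..., argn)
        for arg in t[1:]:
            subterms(arg, acc)


def euf_satisfiable(equalities, disequalities):
    # Step 1: the e-graph contains every term occurring in the constraints.
    terms = set()
    for s, t in list(equalities) + list(disequalities):
        subterms(s, terms)
        subterms(t, terms)
    parent = {t: t for t in terms}    # union-find forest

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    def union(s, t):
        parent[find(s)] = find(t)

    for s, t in equalities:
        union(s, t)

    # Step 2: merge applications whose arguments are pairwise equivalent.
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:
        changed = False
        for i, s in enumerate(apps):
            for t in apps[i + 1:]:
                if (find(s) != find(t) and s[0] == t[0] and len(s) == len(t)
                        and all(find(x) == find(y)
                                for x, y in zip(s[1:], t[1:]))):
                    union(s, t)
                    changed = True

    # Steps 3 and 4: unsatisfiable iff some disequality joins one class.
    return all(find(s) != find(t) for s, t in disequalities)


def fn(n, x="a"):
    """Build f(f(...f(x)...)) with n applications of f."""
    for _ in range(n):
        x = ("f", x)
    return x


# f(f(f(f(f(a))))) = a, f(f(f(a))) = a, f(a) ≠ a
first = euf_satisfiable([(fn(5), "a"), (fn(3), "a")], [(fn(1), "a")])
# f(a,b) = a, f(f(a,b),b) = b, a ≠ b
fab = ("f", "a", "b")
second = euf_satisfiable([(fab, "a"), (("f", fab, "b"), "b")], [("a", "b")])
```

Both calls use the example equalities from the earlier slides, each with one added disequality that the equalities contradict.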

Soundness
Theorem: If the algorithm returns unsatisfiable, the constraints are unsatisfiable.
Lemma: At every step of the congruence closure algorithm, each equality in the e-graph is implied by the original set of equalities.
Proof: By induction on the number of steps.

Completeness
Theorem: If the algorithm returns satisfiable, there is a model satisfying the constraints.

Model
• A (finite or infinite) universe U
• An interpretation I that
  • maps each constant symbol c to an element I(c) ∈ U
  • maps each function symbol f to a function I(f) ∈ (U → U); for an f of arity n, the function takes n arguments from U

Completeness
Theorem: If the algorithm returns satisfiable, there is a model satisfying the constraints.
How do we construct the model?

Example: f(a,b) = a, f(f(a,b),b) = b
[Figure: e-graph for the example]
For any term t in the e-graph, let EC(t) be the equivalence class containing t.
U = set of equivalence classes + a new element (written ⊥ here)
I(c) = EC(c)
I(f)(α) = EC(f(u)), if ∃u ∈ α. f(u) is a term in the e-graph
I(f)(α) = ⊥, otherwise

Convexity
A conjunction of facts is convex if whenever it entails a disjunction of equalities, it also entails at least one of those equalities by itself.
If C ⇒ a1 = b1 ∨ … ∨ an = bn, then there is i ∈ [1,n] such that C ⇒ ai = bi.
A theory is convex if every conjunction of facts in the theory is convex.

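EUF is convex
Suppose C ⇒ u1 = t1 ∨ u2 = t2.
Then C ∧ u1 ≠ t1 ∧ u2 ≠ t2 is unsatisfiable.
The congruence closure algorithm reports unsatisfiability only by finding a single disequality whose two sides lie in the same equivalence class, so it demonstrates that there is some i such that even C ∧ ui ≠ ti is unsatisfiable.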

Uninterpreted theory
Function symbols: f1, f2, … (each with an arity ∈ {0,1,…})
Relation symbols: R1, R2, … (each with an arity ∈ {0,1,…})
Special relation: equality (arity 2)
Variables: x1, x2, …
Boolean facts: x1 = x2, x1 ≠ x2, R(x1, x2), ¬R(x1, x2), ∀x. R(x,y)
A conjunction of facts is consistent iff there is a model (U,I) that satisfies each fact in the conjunction.
e.g., EUF, arrays, lists

Interpreted theory
Function symbols: f1, f2, … (each with an arity ∈ {0,1,…})
Relation symbols: R1, R2, … (each with an arity ∈ {0,1,…})
Special relation: equality (arity 2)
Variables: x1, x2, …
Boolean facts: x1 = x2, x1 ≠ x2, R(x1, x2), ¬R(x1, x2), ∀x. R(x,y)
Fixed model (U,I) providing an interpretation for the function and relation symbols.
A conjunction of facts is consistent iff I can be extended to the free variables of the conjunction so that each fact in the conjunction is satisfied.
e.g., arithmetic over rationals, arithmetic over integers

Communicating theories
• Suppose the only shared symbols between two theories T1 and T2 are equality and variables
• C1 is a conjunction of facts in theory T1
• C2 is a conjunction of facts in theory T2
• Suppose C1 is consistent by itself and C2 is consistent by itself
• Is C1 ∧ C2 consistent?

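Example: f(f(x) – f(y)) ≠ f(z) ∧ x ≤ y ∧ y + z ≤ x ∧ z ≥ 0
C1 (arithmetic): x ≤ y, y + z ≤ x, z ≥ 0, g1 = g2 – g3
C2 (EUF): f(g1) ≠ f(z), g2 = f(x), g3 = f(y)
Equalities communicated, in order: x = y (from C1), g2 = g3 (from C2), g1 = z (from C1)
C1 is consistent
C2 is consistent
But C1 ∧ C2 is not consistent! Once g1 = z reaches C2, it contradicts f(g1) ≠ f(z).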

For any conjunction C1 of facts in the theory of rationals and any conjunction C2 of facts in the theory of EUF, it suffices to communicate equalities over shared variables. What if C1 is a conjunction of facts in the theory of arithmetic over integers?

C1: 1 ≤ x, x ≤ 2, a = 1, b = 2
C2: f(x) ≠ f(a), f(x) ≠ f(b)
C1 ⇒ x = a ∨ x = b
x = a ∨ x = b ⇒ f(x) = f(a) ∨ f(x) = f(b) = ¬C2
C1 entails neither x = a nor x = b by itself, so no single equality is ever communicated.
The equality sharing procedure does not work because the theory of integers is non-convex (although the theory of rationals is convex)!
Fix: Communicate disjunctions of equalities!
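The difference between integers and rationals in this example can be seen by brute force. The sketch below is only an illustration under assumptions: the function has_model and its search strategy are invented for this example, and f is enumerated as a finite table with values in {0, 1, 2}, which is enough to distinguish at most three points.

```python
from fractions import Fraction
from itertools import product


def has_model(candidates_for_x):
    """Look for x and a table for f satisfying both C1 and C2."""
    a, b = 1, 2                                   # C1: a = 1, b = 2
    for x in candidates_for_x:
        if not (1 <= x <= 2):                     # C1: 1 ≤ x, x ≤ 2
            continue
        points = sorted({a, b, x})                # arguments f is applied to
        for values in product(range(3), repeat=len(points)):
            f = dict(zip(points, values))
            if f[x] != f[a] and f[x] != f[b]:     # C2
                return True
    return False


over_integers = has_model(range(-5, 6))           # x can only be 1 or 2
over_rationals = has_model([Fraction(3, 2)])      # x = 3/2 is neither a nor b
```

Over the integers the bounds force x to coincide with a or b, so C2 cannot hold; over the rationals x = 3/2 leaves room for f(x) to differ from both f(a) and f(b).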

C1: 1 ≤ x, x ≤ 2, a = 1, b = 2
C2: f(x) ≠ f(a), f(x) ≠ f(b)
Communicated from C1 to C2: x = a ∨ x = b

Case x = a
C1: 1 ≤ x, x ≤ 2, a = 1, b = 2, x = a
C2: f(x) ≠ f(a), f(x) ≠ f(b), x = a
Unsatisfiable

Case x = b
C1: 1 ≤ x, x ≤ 2, a = 1, b = 2, x = b
C2: f(x) ≠ f(a), f(x) ≠ f(b), x = b
Unsatisfiable

Another Example

C1: 1 ≤ x, x ≤ 2, a = 1, b = 2
C2: f(x) = a, f(a) = b, f(b) = b
Communicated from C1 to C2: x = a ∨ x = b

Case x = a
C1: 1 ≤ x, x ≤ 2, a = 1, b = 2, x = a
C2: f(x) = a, f(a) = b, f(b) = b, x = a
Communicated from C2 to C1: a = b
Unsatisfiable

Case x = b
C1: 1 ≤ x, x ≤ 2, a = 1, b = 2, x = b
C2: f(x) = a, f(a) = b, f(b) = b, x = b
Communicated from C2 to C1: a = b
Unsatisfiable

The procedure returns satisfiable only when
• C1 is consistent
• C2 is consistent
• C1 is convex
• C2 is convex
• C1 entails (x = y) iff C2 entails (x = y)
Theorem: If the procedure returns satisfiable, then there is a model of C1 ∧ C2.
Technical side conditions:
(1) Every consistent formula in T1 has a countably infinite model
(2) Every consistent formula in T2 has a countably infinite model

Proof
Partition variables into equivalence classes Q1, …, Qn such that for all i ∈ [1,n], if x,y ∈ Qi then C1 entails x = y.
Lemma: For all i ∈ [1,n], if x,y ∈ Qi then C2 entails x = y.
For each i ∈ [1,n], pick representative wi ∈ Qi.
Lemma: C1 ∧ ⋀_{1 ≤ i < j ≤ n} (wi ≠ wj) is consistent.
Lemma: C2 ∧ ⋀_{1 ≤ i < j ≤ n} (wi ≠ wj) is consistent.

Proof continued
D1 = C1 ∧ ⋀_{1 ≤ i < j ≤ n} (wi ≠ wj)
D2 = C2 ∧ ⋀_{1 ≤ i < j ≤ n} (wi ≠ wj)
D1 has a countably infinite model (U1, I1)
D2 has a countably infinite model (U2, I2)
Pick an isomorphism K from U1 to U2 that is consistent with variable assignments, i.e., for all x, K(I1(x)) = I2(x).
The interpretations of function and relation symbols can be mapped easily using K.
