Decision-Procedure Based Theorem Provers
Tactic-Based Theorem Proving
Inferring Loop Invariants
CS 294-8 Lecture 12
Prof. Necula
Review
[Diagram: Source language, VCGen, FOL, Theories, ?, Goal-directed proving, Sat. proc 1, …, Sat. proc n]
Combining Satisfiability Procedures
• Consider a set of literals F
  • Containing symbols from two theories T1 and T2
• We split F into two sets of literals
  • F1 containing only literals in theory T1
  • F2 containing only literals in theory T2
  • We name all subexpressions: p1(f2(E)) is split into f2(E) = n ∧ p1(n)
• We have: unsat(F1 ∧ F2) iff unsat(F)
  • unsat(F1) ∨ unsat(F2) ⇒ unsat(F)
  • But the converse is not true
• So we cannot compute unsat(F) with a trivial combination of the sat. procedures for T1 and T2
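The naming step on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the lecture: terms are nested tuples, `theory_of` is a hypothetical table mapping each function or predicate symbol to its theory, and fresh names are generated as n1, n2, ….

```python
import itertools

def purify(term, theory_of, defs, fresh):
    """Replace every subterm owned by a different theory with a fresh name.

    term      : a variable (str) or a tuple (symbol, arg1, ..., argk)
    theory_of : hypothetical table, symbol -> theory name
    defs      : output list of (theory, (subterm, name)), read as "subterm = name"
    fresh     : iterator producing numbers for fresh names
    """
    if isinstance(term, str):
        return term
    head, *args = term
    new_args = []
    for arg in args:
        pure = purify(arg, theory_of, defs, fresh)
        if not isinstance(pure, str) and theory_of[pure[0]] != theory_of[head]:
            name = f"n{next(fresh)}"
            defs.append((theory_of[pure[0]], (pure, name)))
            pure = name
        new_args.append(pure)
    return (head, *new_args)

# The slide's example: p1(f2(E)) becomes f2(E) = n1 together with p1(n1).
theory_of = {"p1": "T1", "f2": "T2"}
defs = []
top = purify(("p1", ("f2", "E")), theory_of, defs, itertools.count(1))
```

After the call, `top` is the T1 literal p1(n1) and `defs` holds the T2 equation f2(E) = n1, so each literal mentions symbols of one theory only.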
Combining Satisfiability Procedures. Example
• Consider equality and arithmetic
[Figure: derivation from f(f(x) - f(y)) ≠ f(z), y ≥ x, x ≥ y + z, z ≥ 0. Arithmetic gives x = y and 0 = z; equality gives f(x) = f(y); arithmetic gives f(x) - f(y) = z; equality gives f(f(x) - f(y)) = f(z); hence false]
Combining Satisfiability Procedures
• Combining satisfiability procedures is nontrivial
• And that is to be expected:
  • Equality was solved by Ackermann in 1924, arithmetic by Fourier even before, but E + A only in 1979!
• Yet in any single verification problem we will have literals from several theories:
  • Equality, arithmetic, lists, …
• When and how can we combine separate satisfiability procedures?
Nelson-Oppen Method (1)
• Represent all conjuncts in the same DAG
    f(f(x) - f(y)) ≠ f(z) ∧ y ≥ x ∧ x ≥ y + z ∧ z ≥ 0
[Figure: DAG with shared nodes f, -, ≥, + over the leaves x, y, z, 0]
Nelson-Oppen Method (2)
• Run each sat. procedure
  • Require it to report all contradictions (as usual)
  • Also require it to report all equalities between nodes
[Figure: the same DAG, with discovered equalities marked between nodes]
Nelson-Oppen Method (3)
• Broadcast all discovered equalities and re-run sat. procedures
  • Until no more equalities are discovered or a contradiction arises
[Figure: the same DAG after propagation, ending in a contradiction]
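The broadcast loop of the three Nelson-Oppen slides can be sketched as follows. The two "procedures" here are stand-in lookup tables that hard-code what arithmetic and equality would derive for the running example f(f(x) - f(y)) ≠ f(z) ∧ y ≥ x ∧ x ≥ y + z ∧ z ≥ 0; they are not real decision procedures, and the names `make_proc` and `nelson_oppen` are assumptions made for illustration.

```python
def make_proc(rules):
    """Build a toy satisfiability procedure from (premises, conclusion) pairs."""
    def proc(known):
        return {concl for premises, concl in rules if premises <= known}
    return proc

# Arithmetic: y >= x, x >= y + z, z >= 0 force x = y and z = 0.
arith = make_proc([
    (frozenset(), "x = y"),
    (frozenset(), "z = 0"),
    (frozenset({"f(x) = f(y)", "z = 0"}), "f(x) - f(y) = z"),
])
# Equality: congruence, plus the clash with f(f(x) - f(y)) != f(z).
equality = make_proc([
    (frozenset({"x = y"}), "f(x) = f(y)"),
    (frozenset({"f(x) - f(y) = z"}), "false"),
])

def nelson_oppen(procs):
    """Broadcast equalities until nothing new appears or a contradiction arises."""
    known = set()
    while True:
        new = set()
        for proc in procs:
            new |= proc(known)      # each procedure sees all broadcast equalities
        new -= known
        if "false" in new:
            return "unsat", known
        if not new:
            return "sat", known     # only trustworthy for convex theories
        known |= new

verdict, equalities = nelson_oppen([arith, equality])
```

On this input the loop needs four rounds: x = y and z = 0, then f(x) = f(y), then f(x) - f(y) = z, then the contradiction.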
What Theories Can be Combined?
• Only theories without common interpreted symbols
  • But OK if one theory takes the symbol uninterpreted
• Only certain theories can be combined
  • Consider (Z, +, ≤) and Equality
  • Consider: 1 ≤ x ≤ 2 ∧ a = 1 ∧ b = 2 ∧ f(x) ≠ f(a) ∧ f(x) ≠ f(b)
  • No equalities and no contradictions are discovered
  • Yet, unsatisfiable
• A theory is non-convex when a set of literals entails a disjunction of equalities without entailing any single equality
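The non-convexity of the integer example can be checked by brute force. The sketch below enumerates integer models of 1 ≤ x ≤ 2 ∧ a = 1 ∧ b = 2 over a small range (an assumption; the range is chosen so that it contains every value the literals allow) and tests which equalities hold in all of them.

```python
from itertools import product

DOMAIN = range(-3, 6)  # assumed sample; covers every value allowed by the literals

# All integer models of the arithmetic literals.
models = [(x, a, b) for x, a, b in product(DOMAIN, repeat=3)
          if 1 <= x <= 2 and a == 1 and b == 2]

def entailed(prop):
    """A property is entailed when it holds in every model."""
    return all(prop(x, a, b) for x, a, b in models)

entails_disjunction = entailed(lambda x, a, b: x == a or x == b)
entails_x_eq_a = entailed(lambda x, a, b: x == a)
entails_x_eq_b = entailed(lambda x, a, b: x == b)
```

The disjunction x = a ∨ x = b is entailed, but neither disjunct is, so arithmetic has no single equality to broadcast and the equality procedure never sees the clash with f(x) ≠ f(a) ∧ f(x) ≠ f(b).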
Handling Non-Convex Theories
• Many theories are non-convex
• Consider the theory of memory and pointers
  • It is non-convex:
      true ⇒ A = A′ ∨ sel(upd(M, A, V), A′) = sel(M, A′)
    (neither of the disjuncts is entailed individually)
• For such theories it can be the case that
  • No contradiction is discovered
  • No single equality is discovered
  • But a disjunction of equalities is discovered
• We need to propagate disjunctions of equalities …
Propagating Disjunction of Equalities
• To propagate disjunctions we perform a case split:
• If a disjunction E_1 ∨ … ∨ E_n is discovered:
    Save the current state of the prover
    for i = 1 to n {
      broadcast E_i
      if no contradiction arises then return "satisfiable"
      restore the saved prover state
    }
    return "unsatisfiable"
Handling Non-Convex Theories
• Case splitting is expensive
  • Must backtrack (performance --)
  • Must implement all satisfiability procedures in incremental fashion (simplicity --)
• In some cases the splitting can be prohibitive:
  • Take pointers for example.
      upd(upd(…(upd(m, i_1, x), …, i_{n-1}, x), i_n, x) = upd(…(upd(m, j_1, x), …, j_{n-1}, x)
      ∧ sel(m, i_1) ≠ x ∧ … ∧ sel(m, i_n) ≠ x
    entails ⋁_{j ≠ k} i_j = i_k
    (a conjunction of length n entails n² disjuncts)
Forward vs. Backward Theorem Proving
Forward vs. Backward Theorem Proving
• The state of a prover can be expressed as: H_1 ∧ … ∧ H_n ⇒? G
  • Given the hypotheses H_i try to derive goal G
• A forward theorem prover derives new hypotheses, in hope of deriving G
  • If H_1 ∧ … ∧ H_n ⇒ H then move to state H_1 ∧ … ∧ H_n ∧ H ⇒? G
  • Success state: H_1 ∧ … ∧ G ∧ … ∧ H_n ⇒? G
• A forward theorem prover uses heuristics to reach G
  • Or it can exhaustively derive everything that is derivable!
Forward Theorem Proving
• Nelson-Oppen is a forward theorem prover:
  • The state is L_1 ∧ … ∧ L_n ∧ ¬L ⇒? false
  • If L_1 ∧ … ∧ L_n ∧ ¬L ⇒ E (an equality) then
  • New state is L_1 ∧ … ∧ L_n ∧ ¬L ∧ E ⇒? false (add the equality)
  • Success state is L_1 ∧ … ∧ L ∧ … ∧ ¬L ∧ … ∧ L_n ⇒? false
• Nelson-Oppen provers exhaustively produce all derivable facts hoping to encounter the goal
• Case splitting can be explained this way too:
  • If L_1 ∧ … ∧ L_n ∧ ¬L ⇒ E ∨ E′ (a disjunction of equalities) then
  • Two new states are produced (both must lead to success)
      L_1 ∧ … ∧ L_n ∧ ¬L ∧ E ⇒? false
      L_1 ∧ … ∧ L_n ∧ ¬L ∧ E′ ⇒? false
Backward Theorem Proving
• A backward theorem prover derives new subgoals from the goal
  • The current state is H_1 ∧ … ∧ H_n ⇒? G
  • If H_1 ∧ … ∧ H_n ∧ G_1 ∧ … ∧ G_m ⇒ G (the G_i are subgoals)
  • Produce m new states (all must lead to success): H_1 ∧ … ∧ H_n ⇒? G_i
• Similar to case splitting in Nelson-Oppen:
  • Consider a non-convex theory:
      H_1 ∧ … ∧ H_n ⇒ E ∨ E′
    is the same as
      H_1 ∧ … ∧ H_n ∧ ¬E ∧ ¬E′ ⇒ false
    (thus we have reduced the goal "false" to subgoals: ¬E ∧ ¬E′)
Programming Theorem Provers
• Backward theorem provers most often use heuristics
• It is useful to be able to program the heuristics
  • Such programs are called tactics, and tactic-based provers have this capability
  • E.g. the Edinburgh LCF was a tactic-based prover whose programming language was called the Meta-Language (ML)
• A tactic examines the state and either:
  • Announces that it is not applicable in the current state, or
  • Modifies the proving state
Programming Theorem Provers. Tactics.
• State = Formula list × Formula
  • A set of hypotheses and a goal
• Tactic = State → (State → α) → (unit → α) → α
  • Continuation passing style
  • Given a state, a tactic invokes either the success continuation with a modified state or the failure continuation
• Example: a congruence-closure based tactic
    cc (h, g) c f =
      let e_1, …, e_n = new equalities in the congruence closure of h
      in c (h ∪ {e_1, …, e_n}, g)
  • A forward chaining tactic
Programming Theorem Provers. Tactics.
• Consider an axiom: ∀x. a(x) ⇒ b(x)
  • Like the clause b(x) :- a(x) in Prolog
• This could be turned into a tactic
    clause (h, g) c f =
      if unif(g, b) = φ then c (h, φ(a))
      else f ()
  • Here φ is the unifying substitution, c the success continuation and f the failure continuation
  • A backward chaining tactic
Programming Theorem Provers. Tacticals.
• Tactics can be composed using tacticals. Examples:
• THEN : tactic → tactic → tactic
    THEN t1 t2 = λs.λc.λf. let newc s′ = t2 s′ c f in t1 s newc f
• REPEAT : tactic → tactic
    REPEAT t = THEN t (REPEAT t)
• ORELSE : tactic → tactic → tactic
    ORELSE t1 t2 = λs.λc.λf. let newf x = t2 s c f in t1 s c newf
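The continuation-passing tactics and tacticals above translate almost directly into Python closures. The sketch below makes several assumptions that are not on the slides: a state is a pair (frozenset of hypotheses, goal), `rule` is a hypothetical propositional stand-in for a backward-chaining clause (no unification), and `repeat` falls back to an identity tactic `id_tac` when its argument no longer applies, a common variant chosen so that the loop can end in success.

```python
def then_(t1, t2):
    """THEN: run t1; on success hand its new state to t2."""
    def tactic(state, succ, fail):
        return t1(state, lambda s2: t2(s2, succ, fail), fail)
    return tactic

def orelse(t1, t2):
    """ORELSE: run t1; if it is not applicable, run t2 on the same state."""
    def tactic(state, succ, fail):
        return t1(state, succ, lambda: t2(state, succ, fail))
    return tactic

def id_tac(state, succ, fail):
    """Identity tactic (assumption): always succeeds, state unchanged."""
    return succ(state)

def repeat(t):
    """REPEAT variant (assumption): apply t as long as it applies."""
    def tactic(state, succ, fail):
        return orelse(then_(t, repeat(t)), id_tac)(state, succ, fail)
    return tactic

def rule(head, body):
    """Hypothetical clause 'head :- body': reduce goal head to goal body."""
    def tactic(state, succ, fail):
        hyps, goal = state
        return succ((hyps, body)) if goal == head else fail()
    return tactic

# Clauses c :- b and b :- a, hypothesis a, goal c.
prolog = repeat(orelse(rule("c", "b"), rule("b", "a")))
start = (frozenset({"a"}), "c")
result = prolog(start, lambda s: s, lambda: None)
no_match = rule("c", "b")((frozenset(), "x"), lambda s: s, lambda: "not applicable")
```

Here `result` ends with goal a, which is among the hypotheses, while `no_match` shows a tactic announcing that it does not apply by calling the failure continuation.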
Programming Theorem Provers. Tacticals
• Prolog is just one possible tactic:
  • Given tactics for each clause: c1, …, cn
  • Prolog : tactic
      Prolog = REPEAT (c1 ORELSE c2 ORELSE … ORELSE cn)
• Nelson-Oppen can also be programmed this way
  • The result is not as efficient as a special-purpose implementation
• This is a very powerful mechanism for semi-automatic theorem proving
  • Used in: Isabelle, HOL, and many others
Techniques for Inferring Loop Invariants
Inferring Loop Invariants
• Traditional program verification has several elements:
  • Function specifications and loop invariants
  • Verification condition generation
  • Theorem proving
• Requiring specifications from the programmer is often acceptable
• Requiring loop invariants is not acceptable
  • Same for specifications of local functions
Inferring Loop Invariants
• A set of cutpoints is a set of program points such that:
  • There is at least one cutpoint on each circular path in the CFG
  • There is a cutpoint at the start of the program
  • There is a cutpoint before the return
• Consider that our function uses n variables x:
• We associate with each cutpoint an assertion I_k(x)
• If a is a path from cutpoint k to j then:
  • R_a(x) : Zⁿ → Zⁿ expresses the effect of path a on the values of x at j as a function of those at k
  • P_a(x) : Zⁿ → B is a path predicate that is true exactly of those values of x at k that will enable the path a
Cutpoints. Example.
• p_01 = true
  r_01 = { A ← A, K ← 0, L ← len(A), S ← 0, m ← m }
• p_11 = K + 1 < L
  r_11 = { A ← A, K ← K + 1, L ← L, S ← S + sel(m, A + K), m ← m }
• p_12 = K + 1 ≥ L
  r_12 = r_11
• Easily obtained through sym. eval.
[Flowchart: cutpoint 0; L = len(A); K = 0; S = 0; cutpoint 1; S = S + A[K]; K++; test K < L (true: back to cutpoint 1; false: cutpoint 2); return S]
Equational Definition of Invariants
• A set of assertions is a set of invariants if:
  • The assertion for the start cutpoint is the precondition
  • The assertion for the end cutpoint is the postcondition
  • For each path from i to j we have
      ∀x. I_i(x) ∧ P_ij(x) ⇒ I_j(R_ij(x))
• Now we have to solve a system of constraints with the unknowns I_1, …, I_{n-1}
  • I_0 and I_n are known
• We will consider the simpler case of a single loop
  • Otherwise we might want to try solving the inner/last loop first
Invariants. Example.
• I_0 ⇒ I_1(r_0(x))
  • The invariant I_1 is established initially
• I_1 ∧ K+1 < L ⇒ I_1(r_1(x))
  • The invariant I_1 is preserved in the loop
• I_1 ∧ K+1 ≥ L ⇒ I_2(r_1(x))
  • The invariant I_1 is strong enough (i.e. useful)
[Flowchart: cutpoint 0; L = len(A); K = 0; S = 0 (effect r_0); cutpoint 1; S = S + A[K]; K++; test K < L (effect r_1); cutpoint 2; return S]
The Lattice of Invariants
• A few predicates satisfy condition 2
  • Are invariant
  • Form a lattice
• Weak predicates satisfy condition 1
  • Are satisfied initially
• Strong predicates satisfy condition 3
  • Are useful
[Figure: lattice of predicates from true (top) to false (bottom)]
Finding The Invariant
• Which of the potential invariants should we try to find?
• We prefer to work backwards
  • Essentially proving only what is needed to satisfy I_n
  • Forward is also possible but sometimes wasteful since we have to prove everything that holds at any point
Finding the Invariant
• Thus we do not know the "precondition" of the loop
• The weakest invariant that is strong enough has the best chance of holding initially
• This is the one that we'll try to find
  • And then check that it is weak enough
[Figure: lattice of predicates from true (top) to false (bottom)]
Induction Iteration Method
• Equation 3 gives a predicate weaker than any invariant
    I_1 ∧ K+1 ≥ L ⇒ I_2(r_1(x))
    I_1 ⇒ (K+1 ≥ L ⇒ I_2(r_1(x)))
    W_0 = K+1 ≥ L ⇒ I_2(r_1(x))
• Equation 2 suggests an iterative computation of the invariant I_1
    I_1 ⇒ (K+1 < L ⇒ I_1(r_1(x)))
[Figure: lattice of predicates from true (top) to false (bottom)]
Induction Iteration Method
• Define a family of predicates
    W_0 = K+1 ≥ L ⇒ I_2(r_1(x))
    W_j = W_0 ∧ (K+1 < L ⇒ W_{j-1}(r_1(x)))
• Properties of W_j
  • W_j ⇒ W_{j-1} ⇒ … ⇒ W_0 (they form a strengthening chain)
  • I_1 ⇒ W_j (they are weaker than any invariant)
  • If W_{j-1} ⇒ W_j then
    • W_j is an invariant (satisfies both equations 2 and 3):
        W_j ⇒ (K+1 < L ⇒ W_j(r_1(x)))
    • W_j is the weakest invariant (recall domain theory: predicates form a domain, and we use the fixed point theorem to obtain least solutions to recursive equations)
Induction Iteration Method
    W = K+1 ≥ L ⇒ I_2(r_1(x))    // This is W_0
    W′ = true
    while (not (W′ ⇒ W)) {
      W′ = W
      W = (K+1 ≥ L ⇒ I_2(r_1(x))) ∧ (K+1 < L ⇒ W′(r_1(x)))
    }
• The only hard part is to check whether W′ ⇒ W
• We use a theorem prover for this purpose
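The loop above can be sketched in Python if the theorem prover is replaced by brute-force enumeration. Everything here is an assumption made for illustration: states are triples (K, L, N) with N standing for len(A), the strength condition is the array-bounds check 0 ≤ K < N, and the implication W′ ⇒ W is only tested on a small finite set of states, which is what makes the loop terminate.

```python
from itertools import product

def w0(s):
    """Strength condition (assumed): the access A[K] is in bounds."""
    k, l, n = s
    return 0 <= k < n

def guard(s):
    """Path predicate of the loop path: K + 1 < L."""
    k, l, n = s
    return k + 1 < l

def r1(s):
    """Effect of one loop iteration on (K, L, N): K is incremented."""
    k, l, n = s
    return (k + 1, l, n)

def step(prev):
    """W_j = W_0 and (K+1 < L  =>  W_{j-1}(r1(x)))."""
    return lambda s: w0(s) and ((not guard(s)) or prev(r1(s)))

def implies(p, q, states):
    """Stand-in for the theorem prover: check p => q on sampled states only."""
    return all((not p(s)) or q(s) for s in states)

def induction_iteration(states, max_rounds=50):
    w_prev = lambda s: True
    w = w0
    for _ in range(max_rounds):
        if implies(w_prev, w, states):
            return w
        w_prev, w = w, step(w)
    return None  # gave up; the method is not guaranteed to terminate

STATES = list(product(range(-2, 6), range(0, 6), range(0, 6)))
inv = induction_iteration(STATES)
```

On these sampled states the result coincides with 0 ≤ K < N ∧ L ≤ N. Over the unbounded integers the same chain never stabilises.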
Induction Iteration.
• The sequence of W_j approaches the weakest invariant from above
• The predicate W_j can quickly become very large
  • Checking W′ ⇒ W becomes harder and harder
• This is not guaranteed to terminate
[Figure: lattice from true (top) to false (bottom) with the descending chain W_0, W_1, W_2, W_3]
Induction Iteration. Example.
• Consider that the strength condition is:
    I_1 ⇒ K ≥ 0 ∧ K < len(A)    (array bounds)
• We compute the W's:
    W_0 = K ≥ 0 ∧ K < len(A)
    W_1 = W_0 ∧ (K+1 < L ⇒ K+1 ≥ 0 ∧ K+1 < len(A))
    W_2 = W_0 ∧ (K+1 < L ⇒ (K+1 ≥ 0 ∧ K+1 < len(A) ∧ (K+2 < L ⇒ K+2 ≥ 0 ∧ K+2 < len(A))))
    …
[Flowchart: cutpoint 0; L = len(A); K = 0; S = 0 (effect r_0); cutpoint 1; S = S + A[K]; K++; test K < L (effect r_1); cutpoint 2; return S]
Induction Iteration. Strengthening.
• We can try to strengthen the inductive invariant
• Instead of:
    W_j = W_0 ∧ (K+1 < L ⇒ W_{j-1}(r_1(x)))
  we compute:
    W_j = strengthen(W_0 ∧ (K+1 < L ⇒ W_{j-1}(r_1(x))))
  where strengthen(P) ⇒ P
• We still have W_j ⇒ W_{j-1} and we stop when W_{j-1} ⇒ W_j
  • The result is still an invariant that satisfies 2 and 3
Strengthening Heuristics
• One goal of strengthening is simplification:
  • Drop disjuncts: P_1 ∨ P_2 → P_1
  • Drop implications: P_1 ⇒ P_2 → P_2
• A good idea is to try to eliminate variables changed in the loop body:
  • If W_j does not depend on variables changed by r_1 (e.g. K, S)
  • W_{j+1} = W_0 ∧ (K+1 < L ⇒ W_j(r_1(x))) = W_0 ∧ (K+1 < L ⇒ W_j)
  • Now W_j ⇒ W_{j+1} and we are done!
Induction Iteration. Strengthening
• We are still in the "strong-enough" area
• We are making bigger steps
• And we might overshoot the weakest invariant
• We might also fail to find any invariant
  • But we do so quickly
[Figure: lattice from true (top) to false (bottom) showing W_0, W_1, W_2, W_3 and the strengthened W′_1, W′_2]
One Strengthening Heuristic for Integers
• Rewrite W_j in conjunctive normal form
    W_1 = K ≥ 0 ∧ K < len(A) ∧ (K+1 < L ⇒ K+1 ≥ 0 ∧ K+1 < len(A))
        = K ≥ 0 ∧ K < len(A) ∧ (K+1 ≥ L ∨ K+1 < len(A))
• Take each disjunction containing arithmetic literals
  • Negate it and obtain a conjunction of arithmetic literals
      K+1 < L ∧ K+1 ≥ len(A)
  • Weaken the result by eliminating a variable (preferably a loop-modified variable)
      E.g., add the two literals: L > len(A)
  • Negate the result and get another disjunction:
      L ≤ len(A)
    W_1 = K ≥ 0 ∧ K < len(A) ∧ L ≤ len(A)    (check that W_1 ⇒ W_2)
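The "add the two literals" step is a single variable elimination on linear inequalities, and can be sketched as below. The representation is an assumption made for illustration: a literal is a dictionary meaning "sum of coeff·var plus const is > 0" when `strict` is true and "≥ 0" otherwise, and N stands for len(A).

```python
def add_literals(a, b):
    """Add two linear literals; the sum is strict if either summand is strict."""
    coeffs = dict(a["coeffs"])
    for var, c in b["coeffs"].items():
        coeffs[var] = coeffs.get(var, 0) + c
    coeffs = {v: c for v, c in coeffs.items() if c != 0}  # drop eliminated variables
    return {"coeffs": coeffs,
            "const": a["const"] + b["const"],
            "strict": a["strict"] or b["strict"]}

# K + 1 < L         written as  L - K - 1 > 0
lit1 = {"coeffs": {"L": 1, "K": -1}, "const": -1, "strict": True}
# K + 1 >= len(A)   written as  K + 1 - N >= 0
lit2 = {"coeffs": {"K": 1, "N": -1}, "const": 1, "strict": False}

combined = add_literals(lit1, lit2)
```

The coefficients of K cancel, leaving L - N > 0, that is L > len(A). Negating it gives the new conjunct L ≤ len(A).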
Induction Iteration
• We showed a way to compute invariants algorithmically
  • Similar to fixed point computation in domains
  • Similar to abstract interpretation on the lattice of predicates
• Then we discussed heuristics that improve the termination properties
  • Similar to widening in abstract interpretation
Theorem Proving. Conclusions.
• Theorem proving strengths
  • Very expressive
• Theorem proving weaknesses
  • Too ambitious
• A great toolbox for software analysis
  • Symbolic evaluation
  • Decision procedures
• Related to program analysis
  • Abstract interpretation on the lattice of predicates