Decision-Procedure Based Theorem Provers. Tactic-Based Theorem Proving. Inferring Loop Invariants. CS 294-8 Lecture 12. Prof. Necula.
Review • (Diagram) Source language → VCGen → FOL; the FOL goals are discharged by goal-directed proving on top of satisfiability procedures (Sat. proc 1 … Sat. proc n) for the individual theories
Combining Satisfiability Procedures • Consider a set of literals F • Containing symbols from two theories T1 and T2 • We split F into two sets of literals • F1 containing only literals in theory T1 • F2 containing only literals in theory T2 • We name all subexpressions: p1(f2(E)) is split into f2(E) = n ∧ p1(n) • We have: unsat(F1 ∧ F2) iff unsat(F) • unsat(F1) ∨ unsat(F2) ⇒ unsat(F) • But the converse is not true • So we cannot compute unsat(F) with a trivial combination of the sat. procedures for T1 and T2
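The subexpression-naming step ("purification") can be sketched in a few lines. This is an illustrative sketch only: the term encoding, the theory tags, and all names are assumptions, not from any particular prover.

```python
import itertools

# Assumed signature split: each function symbol belongs to exactly one theory.
THEORY = {"+": "arith", "-": "arith", "f": "eq", "p": "eq"}

fresh = (f"n{i}" for i in itertools.count())  # fresh names n0, n1, ...

def purify(term, want, defs):
    """Return a term pure in theory `want`. Each alien subterm is replaced
    by a fresh name, and the definition (name, subterm) is recorded in
    `defs`, standing for the new equality literal name = subterm."""
    if isinstance(term, str):          # variables are shared by all theories
        return term
    op = term[0]
    if THEORY[op] != want:             # alien subterm: give it a name
        name = next(fresh)
        defs.append((name, purify(term, THEORY[op], defs)))
        return name
    return (op,) + tuple(purify(a, want, defs) for a in term[1:])

# p(x + y) splits into p(n0) plus the definition n0 = x + y:
defs = []
pure = purify(("p", ("+", "x", "y")), "eq", defs)
```

Running `purify` on the slide's p1(f2(E)) pattern likewise yields p1(n) together with the equality n = f2(E), each pure in a single theory.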
Combining Satisfiability Procedures. Example • Consider equality and arithmetic • Literals: f(f(x) − f(y)) ≠ f(z), x ≤ y, y + z ≤ x, 0 ≤ z • Arithmetic derives y ≤ x and hence x = y; equality then derives f(x) = f(y); arithmetic derives 0 = z and f(x) − f(y) = z; equality derives f(f(x) − f(y)) = f(z): false
Combining Satisfiability Procedures • Combining satisfiability procedures is non-trivial • And that is to be expected: • Equality was solved by Ackermann in 1924, arithmetic by Fourier even before, but E + A only in 1979! • Yet in any single verification problem we will have literals from several theories: • Equality, arithmetic, lists, … • When and how can we combine separate satisfiability procedures?
Nelson-Oppen Method (1) • Represent all conjuncts in the same DAG, so that the literals f(f(x) − f(y)) ≠ f(z), x ≤ y, y + z ≤ x, 0 ≤ z share the nodes for f, −, +, x, y, z, 0
Nelson-Oppen Method (2) • Run each sat. procedure on the shared DAG • Require it to report all contradictions (as usual) • Also require it to report all equalities between nodes
Nelson-Oppen Method (3) • Broadcast all discovered equalities and re-run the sat. procedures • Until no more equalities are discovered or a contradiction arises • In the example, re-running eventually derives f(f(x) − f(y)) = f(z), contradicting the literal f(f(x) − f(y)) ≠ f(z)
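The equality theory's side of this loop is a congruence closure over the shared DAG: it absorbs broadcast equalities and exposes the node equalities they entail. A naive union-find sketch (an assumed simplification; real implementations are far more efficient):

```python
class CongruenceClosure:
    """Terms are variables (strings) or tuples (op, arg1, ...). All subterms
    of the literals are the DAG nodes; union-find tracks equalities."""

    def __init__(self, terms):
        self.nodes = set()
        for t in terms:
            self._add(t)
        self.parent = {t: t for t in self.nodes}

    def _add(self, t):
        self.nodes.add(t)
        if isinstance(t, tuple):
            for a in t[1:]:
                self._add(a)

    def find(self, t):                     # union-find representative
        while self.parent[t] != t:
            t = self.parent[t]
        return t

    def merge(self, a, b):
        """Absorb a broadcast equality a = b, then close under congruence."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb
            self._close()

    def _close(self):
        # congruence rule: f(x...) = f(y...) whenever the arguments are equal
        apps = [t for t in self.nodes if isinstance(t, tuple)]
        changed = True
        while changed:
            changed = False
            for s in apps:
                for t in apps:
                    if (s[0] == t[0] and len(s) == len(t)
                            and self.find(s) != self.find(t)
                            and all(self.find(x) == self.find(y)
                                    for x, y in zip(s[1:], t[1:]))):
                        self.parent[self.find(s)] = self.find(t)
                        changed = True

    def equal(self, a, b):
        return self.find(a) == self.find(b)

# In the running example, once arithmetic broadcasts x = y, the equality
# procedure discovers f(x) = f(y) and can broadcast it back:
fx, fy = ("f", "x"), ("f", "y")
cc = CongruenceClosure([fx, fy])
cc.merge("x", "y")
```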
What Theories Can be Combined? • Only theories without common interpreted symbols • But OK if one theory takes the symbol uninterpreted • Only certain theories can be combined • Consider (Z, +, ·) and Equality • Consider: 1 ≤ x ≤ 2 ∧ a = 1 ∧ b = 2 ∧ f(x) ≠ f(a) ∧ f(x) ≠ f(b) • No equalities and no contradictions are discovered • Yet, unsatisfiable • A theory is non-convex when a set of literals entails a disjunction of equalities without entailing any single equality
Handling Non-Convex Theories • Many theories are non-convex • Consider the theory of memory and pointers • It is non-convex: true ⇒ A = A′ ∨ sel(upd(M, A, V), A′) = sel(M, A′) (neither of the disjuncts is entailed individually) • For such theories it can be the case that • No contradiction is discovered • No single equality is discovered • But a disjunction of equalities is discovered • We need to propagate disjunctions of equalities …
Propagating Disjunctions of Equalities • To propagate disjunctions we perform a case split: • If a disjunction E1 ∨ … ∨ En is discovered:
save the current state of the prover
for i = 1 to n {
  broadcast Ei
  if no contradiction arises then return “satisfiable”
  restore the saved prover state
}
return “unsatisfiable”
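The save/restore pseudocode above can be phrased recursively: passing an immutable literal set down the recursion plays the role of saving and restoring the prover state. A sketch with the prover abstracted to a `check` function; the interface and the hard-coded `toy_check` rules (encoding the 1 ≤ x ≤ 2 example from the previous slide as plain strings) are assumptions for illustration:

```python
def satisfiable(lits, check):
    """check(lits) returns ("unsat",), ("sat",), or ("split", [E1, ..., En]).
    On a split, the literals are satisfiable iff some branch is."""
    verdict = check(lits)
    if verdict[0] != "split":
        return verdict[0] == "sat"
    # case split: try each disjunct; recursion restores state on backtrack
    return any(satisfiable(lits | {e}, check) for e in verdict[1])

def toy_check(lits):
    """Hard-coded rules for the non-convex example 1 <= x <= 2, a = 1,
    b = 2, f(x) != f(a), f(x) != f(b)."""
    if {"x=1", "a=1", "f(x)!=f(a)"} <= lits:
        return ("unsat",)                   # x = a forces f(x) = f(a)
    if {"x=2", "b=2", "f(x)!=f(b)"} <= lits:
        return ("unsat",)                   # x = b forces f(x) = f(b)
    if "1<=x<=2" in lits and not {"x=1", "x=2"} & lits:
        return ("split", ["x=1", "x=2"])    # entailed disjunction of equalities
    return ("sat",)

conj = {"1<=x<=2", "a=1", "b=2", "f(x)!=f(a)", "f(x)!=f(b)"}
```

Both branches of the split contradict, so `satisfiable(conj, toy_check)` reports unsatisfiable; dropping one of the disequalities leaves a satisfiable branch.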
Handling Non-Convex Theories • Case splitting is expensive • Must backtrack (performance --) • Must implement all satisfiability procedures in incremental fashion (simplicity --) • In some cases the splitting can be prohibitive: • Take pointers, for example: upd(upd(…(upd(m, i1, x)), …, in−1, x), in, x) = upd(…(upd(m, j1, x)), …, jn−1, x) ∧ sel(m, i1) ≠ x ∧ … ∧ sel(m, in) ≠ x entails ⋁j≠k ij = ik (a conjunction of length n entails n² disjuncts)
Forward vs. Backward Theorem Proving
Forward vs. Backward Theorem Proving • The state of a prover can be expressed as: H1 ∧ … ∧ Hn ⊢? G • Given the hypotheses Hi try to derive goal G • A forward theorem prover derives new hypotheses, in the hope of deriving G • If H1 ∧ … ∧ Hn ⊢ H then move to state H1 ∧ … ∧ Hn ∧ H ⊢? G • Success state: H1 ∧ … ∧ G ∧ … ∧ Hn ⊢? G • A forward theorem prover uses heuristics to reach G • Or it can exhaustively derive everything that is derivable!
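Exhaustive forward derivation is just saturation to a fixed point. A tiny propositional sketch (facts as strings, rules as premise-set/conclusion pairs; the encoding is an assumption for illustration):

```python
def forward_prove(hyps, rules, goal):
    """Derive new hypotheses until the goal appears or nothing new is
    derivable. Rules are pairs (premises: frozenset, conclusion)."""
    facts = set(hyps)
    changed = True
    while changed and goal not in facts:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)   # move to state H1 /\ ... /\ Hn /\ H
                changed = True
    return goal in facts

rules = [(frozenset({"a"}), "b"), (frozenset({"a", "b"}), "c")]
```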
Forward Theorem Proving • Nelson-Oppen is a forward theorem prover: • The state is L1 ∧ … ∧ Ln ∧ L ⊢? false • If L1 ∧ … ∧ Ln ∧ L ⊢ E (an equality) then • New state is L1 ∧ … ∧ Ln ∧ L ∧ E ⊢? false (add the equality) • Success state is one containing both a literal and its negation: L1 ∧ … ∧ L ∧ … ∧ ¬L ∧ … ∧ Ln ⊢? false • Nelson-Oppen provers exhaustively produce all derivable facts hoping to encounter the goal • Case splitting can be explained this way too: • If L1 ∧ … ∧ Ln ∧ L ⊢ E ∨ E′ (a disjunction of equalities) then • Two new states are produced (both must lead to success): L1 ∧ … ∧ Ln ∧ L ∧ E ⊢? false and L1 ∧ … ∧ Ln ∧ L ∧ E′ ⊢? false
Backward Theorem Proving • A backward theorem prover derives new subgoals from the goal • The current state is H1 ∧ … ∧ Hn ⊢? G • If G1 ∧ … ∧ Gn ⇒ G (the Gi are subgoals) • Produce n new states (all must lead to success): H1 ∧ … ∧ Hn ⊢? Gi • Similar to case splitting in Nelson-Oppen: • Consider a non-convex theory: proving H1 ∧ … ∧ Hn ⊢ false when H1 ∧ … ∧ Hn ⊢ E ∨ E′ is the same as proving both H1 ∧ … ∧ Hn ∧ E ⊢ false and H1 ∧ … ∧ Hn ∧ E′ ⊢ false (thus we have reduced the goal “false” to the subgoals E ⇒ false and E′ ⇒ false)
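The dual direction, in the same toy rule format: reduce the goal to the premises of a rule that concludes it, and require all subgoals to succeed. A sketch that assumes the rule set is acyclic (otherwise the recursion would need loop detection):

```python
def backward_prove(hyps, rules, goal):
    """Succeed if the goal is a hypothesis, or some rule concludes the goal
    and all of its premises (the new subgoals) can be proved."""
    if goal in hyps:
        return True
    return any(all(backward_prove(hyps, rules, g) for g in premises)
               for premises, conclusion in rules
               if conclusion == goal)

rules = [({"a"}, "b"), ({"a", "b"}, "c")]
```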
Programming Theorem Provers • Backward theorem provers most often use heuristics • It is useful to be able to program the heuristics • Such programs are called tactics, and tactic-based provers have this capability • E.g., the Edinburgh LCF was a tactic-based prover whose programming language was called the Meta-Language (ML) • A tactic examines the state and either: • Announces that it is not applicable in the current state, or • Modifies the proving state
Programming Theorem Provers. Tactics. • State = Formula list × Formula • A set of hypotheses and a goal • Tactic = State → (State → α) → (unit → α) → α • Continuation-passing style • Given a state, will invoke either the success continuation with a modified state or the failure continuation • Example: a congruence-closure based tactic cc (h, g) c f = let e1, …, en = the new equalities in the congruence closure of h in c (h ∪ {e1, …, en}, g) • A forward chaining tactic
Programming Theorem Provers. Tactics. • Consider an axiom: ∀x. a(x) ⇒ b(x) • Like the clause b(x) :- a(x) in Prolog • This could be turned into a tactic clause (h, g) c f = if unif(g, b) = σ then c (h, σ(a)) else f () • A backward chaining tactic
Programming Theorem Provers. Tacticals. • Tactics can be composed using tacticals. Examples: • THEN : tactic → tactic → tactic
THEN t1 t2 = λs. λc. λf. let newc s′ = t2 s′ c f in t1 s newc f • REPEAT : tactic → tactic
REPEAT t = THEN t (REPEAT t) • ORELSE : tactic → tactic → tactic
ORELSE t1 t2 = λs. λc. λf. let newf () = t2 s c f in t1 s c newf
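The same continuation-passing tacticals can be transliterated into Python as an illustrative sketch: a tactic is any function taking a state, a success continuation, and a failure continuation. Note that REPEAT below uses the common terminating variant, which succeeds with the last state once the tactic stops applying (taken literally, REPEAT t = THEN t (REPEAT t) fails as soon as t does):

```python
def THEN(t1, t2):
    # run t1; on success feed its state to t2
    return lambda s, c, f: t1(s, lambda s1: t2(s1, c, f), f)

def ORELSE(t1, t2):
    # run t1; on failure retry with t2 from the original state
    return lambda s, c, f: t1(s, c, lambda: t2(s, c, f))

def REPEAT(t):
    # apply t as long as it applies, then succeed with the last state
    return lambda s, c, f: t(s, lambda s1: REPEAT(t)(s1, c, f), lambda: c(s))

# Toy tactics over integer "states": dec applies only to positive numbers.
def dec(s, c, f):
    return c(s - 1) if s > 0 else f()

fail = lambda s, c, f: f()
```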
Programming Theorem Provers. Tacticals • Prolog is just one possible tactic: • Given tactics for each clause: c1, …, cn • Prolog : tactic
Prolog = REPEAT (c1 ORELSE c2 ORELSE … ORELSE cn) • Nelson-Oppen can also be programmed this way • The result is not as efficient as a special-purpose implementation • This is a very powerful mechanism for semi-automatic theorem proving • Used in: Isabelle, HOL, and many others
Techniques for Inferring Loop Invariants
Inferring Loop Invariants • Traditional program verification has several elements: • Function specifications and loop invariants • Verification condition generation • Theorem proving • Requiring specifications from the programmer is often acceptable • Requiring loop invariants is not acceptable • Same for specifications of local functions
Inferring Loop Invariants • A set of cutpoints is a set of program points such that: • There is at least one cutpoint on each circular path in the CFG • There is a cutpoint at the start of the program • There is a cutpoint before the return • Consider that our function uses n variables x: • We associate with each cutpoint an assertion Ik(x) • If a is a path from cutpoint k to j then: • Ra(x) : Zn → Zn expresses the effect of path a on the values of x at j as a function of those at k • Pa(x) : Zn → B is a path predicate that is true exactly of those values of x at k that will enable the path a
Cutpoints. Example. • Program: L = len(A); K = 0; S = 0; then the loop body S = S + A[K]; K++ repeats while K < L; finally return S • Cutpoints: 0 at entry, 1 at the loop head, 2 at the return • p01 = true, r01 = { A ↦ A, K ↦ 0, L ↦ len(A), S ↦ 0, m ↦ m } • p11 = K + 1 < L, r11 = { A ↦ A, K ↦ K + 1, L ↦ L, S ↦ S + sel(m, A + K), m ↦ m } • p12 = K + 1 ≥ L, r12 = r11 • Easily obtained through sym. eval.
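The relations r can indeed be read off by symbolic evaluation: run the path over a store mapping variables to expression trees. A minimal sketch (the tuple encoding of expressions is an assumption):

```python
def subst(e, store):
    """Replace variables in expression tree e by their bindings in store."""
    if isinstance(e, tuple):
        return (e[0],) + tuple(subst(x, store) for x in e[1:])
    return store.get(e, e)   # unbound variables (and constants) stay as-is

def sym_exec(path, store=None):
    """Symbolically execute a list of assignments (var, expr); the result
    maps each assigned variable to its value as a function of the inputs."""
    store = dict(store or {})
    for var, expr in path:
        store[var] = subst(expr, store)   # evaluate RHS over the old store
    return store

# The loop-body path S = S + A[K]; K++ (the memory read written sel(m, A + K)):
body = [("S", ("+", "S", ("sel", "m", ("+", "A", "K")))),
        ("K", ("+", "K", 1))]
r11 = sym_exec(body)
```

Because S is assigned before K is incremented, the resulting store matches r11 on the slide: S is expressed over the old K.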
Equational Definition of Invariants • A set of assertions is a set of invariants if: • The assertion for the start cutpoint is the precondition • The assertion for the end cutpoint is the postcondition • For each path from i to j we have ∀x. Ii(x) ∧ Pij(x) ⇒ Ij(Rij(x)) • Now we have to solve a system of constraints with the unknowns I1, …, In−1 • I0 and In are known • We will consider the simpler case of a single loop • Otherwise we might want to try solving the inner/last loop first
Invariants. Example. • I0 ⇒ I1(r0(x)) • The invariant I1 is established initially • I1 ∧ K+1 < L ⇒ I1(r1(x)) • The invariant I1 is preserved in the loop • I1 ∧ K+1 ≥ L ⇒ I2(r1(x)) • The invariant I1 is strong enough (i.e., useful)
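The three conditions can be exercised on the example loop with a finite-domain sanity check (not a proof). The candidate invariant I1 below, the postcondition I2, and the small test arrays are assumptions for this sketch; note the do-while shape of the loop assumes a nonempty array:

```python
from itertools import product

def I1(a, K, L, S):
    # candidate invariant at cutpoint 1: S is the sum of a[0..K-1]
    return L == len(a) and 0 <= K < L and S == sum(a[:K])

def I2(a, S):
    # postcondition at cutpoint 2: S is the sum of the whole array
    return S == sum(a)

def check(a):
    L = len(a)
    assert I1(a, 0, L, 0)                  # 1: established by r0 (K=0, S=0)
    for K, S in product(range(L), range(-20, 21)):
        if I1(a, K, L, S):
            K1, S1 = K + 1, S + a[K]       # effect r1 of the loop body
            if K1 < L:                     # path predicate p11
                assert I1(a, K1, L, S1)    # 2: preserved in the loop
            else:                          # path predicate p12
                assert I2(a, S1)           # 3: strong enough

for a in [[1], [2, 3], [1, 2, 3, 4]]:
    check(a)
```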
The Lattice of Invariants • A few predicates satisfy condition 2 • They are invariants • They form a lattice (between false and true) • Weak predicates satisfy condition 1 • They are satisfied initially • Strong predicates satisfy condition 3 • They are useful
Finding The Invariant • Which of the potential invariants should we try to find? • We prefer to work backwards • Essentially proving only what is needed to satisfy In • Forward is also possible, but sometimes wasteful, since we would have to prove everything that holds at every point
Finding the Invariant • Thus we do not know the “precondition” of the loop • The weakest invariant that is strong enough has the best chance of holding initially • This is the one that we will try to find • And then check that it is weak enough
Induction Iteration Method • Equation 3 gives a predicate weaker than any invariant: I1 ∧ K+1 ≥ L ⇒ I2(r1(x)), i.e., I1 ⇒ (K+1 ≥ L ⇒ I2(r1(x))), so define W0 = (K+1 ≥ L ⇒ I2(r1(x))) • Equation 2 suggests an iterative computation of the invariant: I1 ⇒ (K+1 < L ⇒ I1(r1(x)))
Induction Iteration Method • Define a family of predicates:
W0 = (K+1 ≥ L ⇒ I2(r1(x)))
Wj = W0 ∧ (K+1 < L ⇒ Wj−1(r1(x)))
• Properties of Wj • Wj ⇒ Wj−1 ⇒ … ⇒ W0 (they form a strengthening chain) • I1 ⇒ Wj (they are weaker than any invariant) • If Wj−1 ⇒ Wj then • Wj is an invariant (satisfies both equations 2 and 3): Wj ⇒ (K+1 < L ⇒ Wj(r1(x))) • Wj is the weakest invariant (recall domain theory: predicates form a domain, and we use the fixed-point theorem to obtain least solutions to recursive equations)
Induction Iteration Method
W = (K+1 ≥ L ⇒ I2(r1(x)))   // This is W0
W′ = true
while (not (W′ ⇒ W)) {
  W′ = W
  W = (K+1 ≥ L ⇒ I2(r1(x))) ∧ (K+1 < L ⇒ W′(r1(x)))
}
• The only hard part is to check whether W′ ⇒ W • We use a theorem prover for this purpose
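This loop can be made runnable on an array-bounds obligation (I1 ⇒ K ≥ 0 ∧ K < len(A)) used as the strength condition. In the sketch below, predicates are Python functions over states (K, L, lenA), and the theorem-prover call W′ ⇒ W is replaced by brute-force enumeration over a small finite box of states; that substitution is an assumption that makes this a sanity check rather than a proof:

```python
from itertools import product

STATES = list(product(range(-2, 6), repeat=3))  # (K, L, lenA), a small box

def implies(p, q):
    # stand-in for the theorem prover: check p => q over the finite box
    return all(q(s) for s in STATES if p(s))

def r1(s):
    # effect of the loop body on (K, L, lenA): only K changes
    K, L, lenA = s
    return (K + 1, L, lenA)

def W0(s):
    # obligation inside the body: the array access A[K] is in bounds
    K, L, lenA = s
    return 0 <= K < lenA

def step(Wprev):
    # W_j = W0 /\ (K + 1 < L => W_{j-1}(r1(x)))
    return lambda s: W0(s) and (not (s[0] + 1 < s[1]) or Wprev(r1(s)))

W, Wprime = W0, (lambda s: True)
while not implies(Wprime, W):   # stop when W_{j-1} => W_j
    W, Wprime = step(W), W
```

On the box, the chain stabilizes at a predicate that accepts in-bounds initial states (e.g., K = 0 with L ≤ len(A)) and rejects states from which the loop overruns the array, matching the L ≤ len(A) conjunct derived later in the lecture.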
Induction Iteration. • The sequence of Wj (W0, W1, W2, W3, …) approaches the weakest invariant from above • The predicate Wj can quickly become very large • Checking W′ ⇒ W becomes harder and harder • This is not guaranteed to terminate
Induction Iteration. Example. • Consider that the strength condition is: I1 ⇒ K ≥ 0 ∧ K < len(A) (array bounds) • We compute the W’s:
W0 = K ≥ 0 ∧ K < len(A)
W1 = W0 ∧ (K+1 < L ⇒ K+1 ≥ 0 ∧ K+1 < len(A))
W2 = W0 ∧ (K+1 < L ⇒ (K+1 ≥ 0 ∧ K+1 < len(A) ∧ (K+2 < L ⇒ K+2 ≥ 0 ∧ K+2 < len(A))))
…
Induction Iteration. Strengthening. • We can try to strengthen the inductive invariant • Instead of: Wj = W0 ∧ (K+1 < L ⇒ Wj−1(r1(x))) we compute: Wj = strengthen (W0 ∧ (K+1 < L ⇒ Wj−1(r1(x)))) where strengthen(P) ⇒ P • We still have Wj ⇒ Wj−1, and we stop when Wj−1 ⇒ Wj • The result is still an invariant that satisfies 2 and 3
Strengthening Heuristics • One goal of strengthening is simplification: • Drop disjuncts: P1 ∨ P2 ↦ P1 • Drop implications: P1 ⇒ P2 ↦ P2 • A good idea is to try to eliminate variables changed in the loop body: • If Wj does not depend on variables changed by r1 (e.g., K, S) • Wj+1 = W0 ∧ (K+1 < L ⇒ Wj(r1(x))) = W0 ∧ (K+1 < L ⇒ Wj) • Now Wj ⇒ Wj+1 and we are done!
Induction Iteration. Strengthening • We are still in the “strong enough” area • We are making bigger steps (W′1, W′2 instead of W1, W2, W3, …) • And we might overshoot the weakest invariant • We might also fail to find any invariant • But we do so quickly
One Strengthening Heuristic for Integers • Rewrite Wj in conjunctive normal form:
W1 = K ≥ 0 ∧ K < len(A) ∧ (K+1 < L ⇒ K+1 ≥ 0 ∧ K+1 < len(A))
   = K ≥ 0 ∧ K < len(A) ∧ (K+1 ≥ L ∨ K+1 < len(A))
• Take each disjunction containing arithmetic literals • Negate it and obtain a conjunction of arithmetic literals: K+1 < L ∧ K+1 ≥ len(A) • Weaken the result by eliminating a variable (preferably a loop-modified variable), e.g., add the two literals: L > len(A) • Negate the result and get another disjunction: L ≤ len(A) • W1 = K ≥ 0 ∧ K < len(A) ∧ L ≤ len(A) (check that W1 ⇒ W2)
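The “add the two literals” step is one step of Fourier-Motzkin elimination: combining an upper bound and a lower bound on K cancels K. A sketch restricted to ±1 coefficients (the representation and that restriction are assumptions; it is enough for the example above):

```python
def eliminate(var, ineqs):
    """One Fourier-Motzkin step. An inequality (coeffs, bound, strict)
    means sum(c*v) < bound if strict else sum(c*v) <= bound; `var` may
    only occur with coefficient +1 or -1."""
    ups = [i for i in ineqs if i[0].get(var, 0) == 1]    # upper bounds on var
    lows = [i for i in ineqs if i[0].get(var, 0) == -1]  # lower bounds on var
    rest = [i for i in ineqs if i[0].get(var, 0) == 0]
    out = list(rest)
    for cu, bu, su in ups:
        for cl, bl, sl in lows:
            # "add the two literals": coefficients of var cancel
            coeffs = {v: cu.get(v, 0) + cl.get(v, 0)
                      for v in (set(cu) | set(cl)) - {var}}
            coeffs = {v: c for v, c in coeffs.items() if c != 0}
            out.append((coeffs, bu + bl, su or sl))
    return out

# K + 1 < L        rewritten as  K - L < -1
# len(A) <= K + 1  rewritten as  lenA - K <= 1
negated = [({"K": 1, "L": -1}, -1, True),
           ({"lenA": 1, "K": -1}, 1, False)]
```

Eliminating K from `negated` leaves lenA − L < 0, i.e., L > len(A); negating that disjunct-wise gives the slide's L ≤ len(A).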
Induction Iteration • We showed a way to compute invariants algorithmically • Similar to fixed-point computation in domains • Similar to abstract interpretation on the lattice of predicates • Then we discussed heuristics that improve the termination properties • Similar to widening in abstract interpretation
Theorem Proving. Conclusions. • Theorem proving strengths • Very expressive • Theorem proving weaknesses • Too ambitious • A great toolbox for software analysis • Symbolic evaluation • Decision procedures • Related to program analysis • Abstract interpretation on the lattice of predicates