
Symbolic Tools for Program Proving

Ken McMillan, Microsoft Research. Presented at MEMOCODE 2012 in Arlington, VA.


Presentation Transcript


  1. Symbolic Tools for Program Proving • Ken McMillan, Microsoft Research • Presented at MEMOCODE 2012 in Arlington, VA.

  2. Program proving as SMT • In program proving, we decorate a program with auxiliary assertions, such as • Loop invariants • Procedure summaries • Environment conditions • Analysis of the program yields logical verification conditions (VC's) that we can discharge with a theorem prover. • Leaving the auxiliary assertions as unknown symbolic constants, the problem of program proof becomes an SMT problem: • Find values of the unknown relations that make the VC's valid. We will examine the consequences of viewing the program analysis as a satisfiability problem.

  3. Example • Program: var := 0; while invariant do := ; done assert ; (the loop invariant is a symbolic assertion) • The verification conditions are: invariant holds on entry; loop preserves invariant; assertion holds on loop exit • To prove the program, we must solve for the symbolic invariant • Duality: a proof corresponds to a model of the VC's.
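
The three verification conditions on this slide can be checked mechanically once a candidate invariant is proposed. The sketch below is only an illustration: the slide's formulas did not survive in the transcript, so the program (`i := 0; while i < 10 do i := i + 2; done; assert i = 10`), the name `failed_vcs`, and the bounded integer range standing in for an SMT solver are all assumptions.

```python
# Bounded range of integers standing in for "all integers" (assumption:
# a real tool would discharge these checks with an SMT solver instead).
DOMAIN = range(-50, 51)


def failed_vcs(inv):
    """Return the names of the VC's that the candidate invariant violates."""
    failed = []
    # Invariant holds on entry:  i = 0  =>  inv(i)
    if not inv(0):
        failed.append("entry")
    # Loop preserves invariant:  inv(i) and i < 10 and i' = i + 2  =>  inv(i')
    if any(inv(i) and i < 10 and not inv(i + 2) for i in DOMAIN):
        failed.append("preserved")
    # Assertion holds on loop exit:  inv(i) and not (i < 10)  =>  i = 10
    if any(inv(i) and not i < 10 and i != 10 for i in DOMAIN):
        failed.append("exit")
    return failed


def good(i):
    """A candidate that satisfies all three VC's of the hypothetical loop."""
    return i % 2 == 0 and i <= 10
```

A candidate that makes all three checks pass is a model of the VC's and, by the duality stated on the slide, a proof of the program. The candidate `lambda i: True` is too weak (it fails the exit condition) and `lambda i: i == 0` is too strong (it is not preserved by the loop).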

  4. Is it really this simple? • We need symbolic solutions, since the unknown relation is an infinite set. • Traditional SMT solvers can't solve for it because of the quantifiers in the VC's. • In general, SMT solvers don't produce models for quantified formulas. • We need specialized tools to solve these problems • We will consider three approaches • Predicate abstraction • Property-driven reachability • Lazy abstraction with interpolants

  5. Advantages of this view • Separates concerns • Interpreting program semantics • Proof search • Simplifies tools • No need to work with compiler intermediate representations • Re-uses tools • Verification condition generators • Logical solvers We need to know whether purely symbolic tools can perform as well as point tools that handle particular programming languages, or program representations.

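  6. Constrained Horn Clause • Many program proof systems produce VC's in the form of constrained Horn clauses: a constraint in a background theory, conjoined with applications of unknown relations, implies an application of an unknown relation. • We will call the number of unknown-relation applications in the body the degree of the clause • If the degree is greater than 1, we say the clause is non-linear • We will next consider some examples of systems that produce VC's in this form, linear and non-linear...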
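
For reference, the usual way of writing a constrained Horn clause is given below. The slide's own formula was lost in the transcript, so the symbols ($\phi$ for the constraint, $P_i$ for the unknown relations, $n$ for the degree) and the three example clauses are assumptions chosen for illustration, not the author's notation.

```latex
% General shape: constraint phi plus n unknown-relation atoms in the body.
\[
  \forall \vec{x}.\;\; \phi(\vec{x}) \,\land\, P_1(\vec{x}_1) \land \cdots \land P_n(\vec{x}_n)
  \;\Rightarrow\; P(\vec{x}_0)
\]
% Hypothetical examples of degree 0, 1 and 2 (the last one is non-linear):
\begin{align*}
  x = 0 &\Rightarrow R(x) \\
  R(x) \land x' = x + 1 &\Rightarrow R(x') \\
  P(x, y) \land P(y, z) &\Rightarrow Q(x, z)
\end{align*}
```

A query has the same body shape, but its head is a constraint rather than an unknown relation, for example $R(x) \Rightarrow x \ge 0$.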

  7. Procedural programs • Consider a simple procedure (without parameters), with a precondition (requires) and a postcondition (ensures): procedure requires ; ensures ; begin := ; end • The VC's are constrained Horn clauses. • Procedural abstraction, Boogie style: the call site assert ; call ; assert is replaced by assert ; assert ; havoc x; assume ; assert

  8. Solving for procedure summaries • The call site assert ; call ; assert is replaced by assert ; assert ; havoc x; assume ; assert • Solving the VC's yields a solution; this is an over-approximate procedure summary

  9. The non-linear case • Suppose we have a symbolic invariant before a procedure: assert ; call ; assert • The VC's have degree 2 • Nonlinear VC's make solving more involved • Counterexamples are trees, not paths

  10. Modular concurrent proofs • Consider two parallel processes: process while * do done process while * do done • Suppose each process has its own locals and we have globals • We have two kinds of symbolic invariants: • local invariants (one per process) • environment assumptions (one per process)

  11. Modular VC's • The VC's for each process are: • initiation • consecution • non-interference • environment abstraction • These VC's are: • in CHC form (the constraints are the transition relations) • non-linear (the non-interference rule)
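
The four rule names on this slide correspond to a familiar rely-guarantee style of proof rule. One common way to write the VC's for the first process is sketched below. The slide's formulas are missing from the transcript, so every symbol here is an assumption: $R_1, R_2$ are local invariants, $E_1$ is the environment assumption of process 1, $T_1, T_2$ are transition relations, $x_i$ are locals and $y$ are globals.

```latex
\begin{align*}
  \mathit{Init}(x_1, y) &\Rightarrow R_1(x_1, y)
    && \text{(initiation)} \\
  R_1(x_1, y) \land T_1(x_1, y, x_1', y') &\Rightarrow R_1(x_1', y')
    && \text{(consecution)} \\
  R_1(x_1, y) \land E_1(y, y') &\Rightarrow R_1(x_1, y')
    && \text{(non-interference)} \\
  R_2(x_2, y) \land T_2(x_2, y, x_2', y') &\Rightarrow E_1(y, y')
    && \text{(environment abstraction)}
\end{align*}
```

In this formulation the non-interference clause has two unknown relations in its body ($R_1$ and $E_1$), which is what makes the system non-linear.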

  12. Generic logic solvers • Type inference for dependently typed functional languages • Example: Liquid types produced TCC's in CHC form • A generic logical solver for CHC's can potentially solve all these inference problems. • We will now consider some approaches to solving CHC's based on different proof search strategies.

  13. Predicate abstraction • Given a set of predicates, we synthesize the strongest satisfying interpretation of the unknown relations, as: • A Boolean combination over the predicates, or • A conjunction of literals over the predicates (Cartesian approximation) • Simple example: procedure P: ensures ; begin if * then call P; end procedure Q: ensures ; begin call P; call P; end procedure main: begin y=x; call Q; assert ; end

  14. PA example, cont. • Our VC's are: • base case of P • recursive case of P • Q is P twice • property to prove (query) • We want to synthesize a solution for these VC's using a fixed set of predicates. • Strategy: start with false and use counterexamples to the VC's to weaken the relational interpretation.

  15. PA execution • Figure labels: relational interpretation; failed VC • All the VC's are now solved, so our property is proved. • This is the strongest solution expressible using our predicates. • But consider a different query: • We can't repair this by weakening a relation. • If it's false, we need more predicates!
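
The "start with false and weaken" strategy for the Cartesian case can be sketched in a few lines. Everything concrete below is an assumption made for illustration, since the slide's formulas are missing: the clauses (P increments x one or more times, Q is P twice), the three predicates, the rule encoding, and the use of bounded enumeration in place of an SMT solver.

```python
from itertools import product

DOMAIN = range(0, 6)  # assumption: bounded enumeration replaces the SMT solver

# Hypothetical predicates over (x, x2) = (value before, value after).
PREDICATES = {
    "x2 > x": lambda x, x2: x2 > x,
    "x2 <= x": lambda x, x2: x2 <= x,
    "x2 == x": lambda x, x2: x2 == x,
}

# A rule is (number of variables, constraint, body atoms, head atom);
# an atom is (relation name, indices of its argument variables).
RULES = [
    # base case of P:       x2 = x + 1              =>  P(x, x2)
    (2, lambda v: v[1] == v[0] + 1, [], ("P", (0, 1))),
    # recursive case of P:  P(x, y) and x2 = y + 1  =>  P(x, x2)
    (3, lambda v: v[2] == v[1] + 1, [("P", (0, 1))], ("P", (0, 2))),
    # Q is P twice:         P(x, y) and P(y, x2)    =>  Q(x, x2)
    (3, lambda v: True, [("P", (0, 1)), ("P", (1, 2))], ("Q", (0, 2))),
]


def holds(interp, rel, args):
    """A relation is interpreted as the conjunction of its predicates."""
    return all(PREDICATES[p](*args) for p in interp[rel])


def solve(rules):
    """Start from 'false' (all predicates) and weaken on counterexamples."""
    interp = {head[0]: set(PREDICATES) for _, _, _, head in rules}
    changed = True
    while changed:
        changed = False
        for nvars, constraint, body, (hrel, hidx) in rules:
            for v in product(DOMAIN, repeat=nvars):
                if not constraint(v):
                    continue
                if not all(holds(interp, r, [v[i] for i in idx]) for r, idx in body):
                    continue
                # Body holds: drop every head predicate this point violates.
                hargs = [v[i] for i in hidx]
                violated = {p for p in interp[hrel] if not PREDICATES[p](*hargs)}
                if violated:
                    interp[hrel] -= violated
                    changed = True
    return interp


def query_holds(interp, rel, prop):
    """Check the query  rel(x, x2) => prop(x, x2)  over the bounded domain."""
    return all(prop(x, x2) for x, x2 in product(DOMAIN, repeat=2)
               if holds(interp, rel, (x, x2)))
```

With these assumptions, `solve(RULES)` keeps only `x2 > x` for both P and Q, which proves the query `x2 > x` for Q. The query `x2 >= x + 2` is true of the clauses but fails under this interpretation, and no further weakening can repair that: the predicate set itself has to be refined.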

  16. Refinement using interpolants • A fact is a CHC with no unknown relations in its body (degree 0). • A fact also expresses a set of ground facts, one for each assignment that satisfies its constraint. • A counterexample is a proof of a ground fact that contradicts a query. • We build a counterexample as a derivation tree.

  17. Derivation tree • Start with negation of query (we want to refute it) • Unify each P-fact with a P-rule • The derivation tree characterizes a set of ground derivations

  18. Solving the derivation tree • By solving the constraints in the derivation tree, we derive a ground fact that contradicts the query. • Figure labels: not true!; satisfiable without query • Note the constraint tree is just a BMC formula. • BMC = solving for a proof of a ground fact!
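
To make "a counterexample is a derivation tree of a ground fact" concrete, the sketch below derives ground facts bottom-up and records one derivation tree per fact. It is an illustration under stated assumptions, not the tool's procedure: the clauses (P increments x one or more times, Q is P twice), the rule encoding, and the finite domain are hypothetical, and a real solver would search for the tree symbolically rather than enumerate facts.

```python
from itertools import product

DOMAIN = range(0, 4)  # assumption: a small finite domain instead of integers

# A rule is (number of variables, constraint, body atoms, head atom);
# an atom is (relation name, indices of its argument variables).
RULES = [
    # x2 = x + 1              =>  P(x, x2)
    (2, lambda v: v[1] == v[0] + 1, [], ("P", (0, 1))),
    # P(x, y) and x2 = y + 1  =>  P(x, x2)
    (3, lambda v: v[2] == v[1] + 1, [("P", (0, 1))], ("P", (0, 2))),
    # P(x, y) and P(y, x2)    =>  Q(x, x2)   (degree 2)
    (3, lambda v: True, [("P", (0, 1)), ("P", (1, 2))], ("Q", (0, 2))),
]


def derive(rules):
    """Map each derivable ground fact to one derivation tree (fact, subtrees)."""
    facts = {}
    changed = True
    while changed:
        changed = False
        for nvars, constraint, body, (hrel, hidx) in rules:
            for v in product(DOMAIN, repeat=nvars):
                if not constraint(v):
                    continue
                premises = [(r, tuple(v[i] for i in idx)) for r, idx in body]
                if not all(p in facts for p in premises):
                    continue
                head = (hrel, tuple(v[i] for i in hidx))
                if head not in facts:
                    facts[head] = (head, [facts[p] for p in premises])
                    changed = True
    return facts


def counterexample(facts, rel, prop):
    """Return a derivation tree of a ground fact violating the query, if any."""
    for (r, args), tree in facts.items():
        if r == rel and not prop(*args):
            return tree
    return None
```

For the deliberately false query `Q(x, x2) => x2 != x + 2`, the returned tree has root `Q(0, 2)` with two children, `P(0, 1)` and `P(1, 2)`. The degree-2 clause gives the counterexample two branches, so it is a tree rather than a path.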

  19. Interpolating the derivation tree • If the constraints are UNSAT, we can compute an interpolant. • Figure label: predicates from interpolants • Interpolant formulas are: • a bottom-up refutation • only over head variables • an upper bound on derivable facts
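
The three properties listed for interpolants can be seen in a toy setting. The sketch below computes the strongest interpolant of two constraints over a finite domain by projecting the first constraint onto the shared variable. This is a stand-in chosen for illustration, not how an interpolating SMT solver works (such solvers extract interpolants from refutation proofs); the function name, the domain, and the example constraints are assumptions.

```python
DOMAIN = range(-4, 5)  # assumption: finite domain so that projection is computable


def strongest_interpolant(a, b):
    """Interpolant for a(x, y) and b(y, z), which share only the variable y.

    Returns the set of y values reachable through a (an upper bound on what
    a can derive, expressed only over the shared variable), or None when
    a and b are jointly satisfiable, in which case there is no interpolant.
    """
    image = {y for x in DOMAIN for y in DOMAIN if a(x, y)}
    if any(b(y, z) for y in image for z in DOMAIN):
        return None  # satisfiable: a counterexample exists instead
    return image


# Hypothetical sub-tree constraint: x >= 0 and y = x + 1.
def subtree(x, y):
    return x >= 0 and y == x + 1


# Hypothetical context constraint (negated query): z = y and z <= 0.
def context(y, z):
    return z == y and z <= 0
```

Here `strongest_interpolant(subtree, context)` is the set `{1, 2, 3, 4}`, that is, `y >= 1` within the domain. It is implied by the sub-tree, inconsistent with the context, and mentions only the shared variable, so it is a candidate predicate for the relation at the head of the sub-tree.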

  20. Predicate abstraction as unwinding • We can think of predicate abstraction as unwinding • Each time inductiveness fails, we add a new instance of a clause

  21. Lazy predicate refinement • When query fails, build a derivation tree for the unwinding, and compute interpolants. • Figure labels: predicates from interpolants; eager propagation; unwinding solved!; solution inductive!

  22. What have we done? • We have given a purely logical account of predicate abstraction with interpolant-based predicate refinement: • Generate the VC's • Unwind the VC's • If a rule fails, create a new instance in the unwinding • Compute consequences expressible with predicates • If a query fails, build derivation tree • If derivation tree is satisfiable, this is a counterexample • Else compute interpolant, refine predicates. • This approach is (more or less) implemented in QARMC

  23. Lazy abstraction with interpolants • In the IMPACT algorithm, we don't compute consequences eagerly using predicate abstraction. Instead, we simply decorate the unwinding with the interpolants from the failed derivation of a counterexample. • We can generalize IMPACT from the linear case to the non-linear • In IMPACT, counterexample derivations are paths • In Duality, they are trees.

  24. Duality algorithm • We unwind the CHC's without any eager deduction • Each time inductiveness fails, we add a new instance of a clause

  25. Fixing the proof • When query fails, build a derivation tree for the unwinding, and compute interpolants, then update the solution with the interpolants. • Figure labels: unwinding solved!; solution inductive!

  26. Duality of proofs and models • Your Zen koan for the day: • We want to prove a program. • To do this, we try to solve the VC's for some unknown relations. • To do this, we try to solve for a refutation (a proof that the VC's have no solution). • If we prove there is no such refutation, the proof (in the form of an interpolant) is a solution of the VC's. • The solution of the VC's is the proof of the program.

  27. What have we done? • A fully symbolic interpolant-based approach to VC solving. • No eager predicate computation. • Method converges when subset of unwinding is inductive. • Re-use of existing facts to speed convergence and reduce BMC cost. • Method is a generalization of IMPACT to the non-linear case • BMC problems and counterexamples are trees, not paths. • Can do inter-procedural, thread modular analysis. • Using efficient interpolating SMT solver, we can handle large-grain VC's. • A single CHC can represent semantics of an entire procedure • SMT solver can efficiently search large space of execution paths • Tool can produce re-usable constructs such as procedure summaries.

  28. Property-driven reachability • We can generalize PDR to the non-linear case [BjornerHoder2012] • In PDR, when we fail to prove a conjecture locally, we form proof sub-goals and propagate them downward. • Figure labels: unwinding solved!; solution inductive!

  29. What have we done? • Solved the VC's using purely local reasoning • Conjectures propagate downward, based on local counterexamples • Counterexamples and proofs propagate upward • Proofs of conjectures must be generalized (research problem) • Compute an interpolant by local proof steps. • Roughly this procedure is implemented in [BjornerHoder2012].

  30. The story so far • We've seen that program analysis can be viewed as solving the VC's • Familiar algorithms can be transferred to this context: • (Lazy) Predicate abstraction • Lazy abstraction with interpolants • Property-driven reachability analysis • In the process we... • Generalize these algorithms to the nonlinear case, so they can compute procedure summaries, modular proofs, refinement types, etc... • Abstract away from program languages and representations. • Allows re-use of VC generation tools • We also have lots of flexibility in generating VC's • Different granularity -- blocks, loops, procedures, etc. • Different proof rules give different proof decompositions • By expressing the auxiliary relations in the right form, we can guide the proof

  31. Performance • The key remaining question is how much performance we may sacrifice to gain this flexibility. • Are there important optimizations we will miss? • In particular, what is lost if we don't explicitly model control flow? • We'll look at two cases of comparison between generic logical tools and highly refined program-specific tools to try to answer this question.

  32. Verifying Boolean programs • We compare two tools for inter-procedural analysis of Boolean programs [Bjorner and Hoder, 2012]: • Bebop (a BDD-based tool used in SLAM) • CHC solver using PDR

  33. Full device driver verification • We compare Duality with Yogi, a software model checker extensively tuned for this application domain. • Benchmarks: randomly selected SDV examples • Procedure-level VC's generated by Boogie • Solved using the Duality algorithm with interpolating Z3.

  34. Adding localization reduction • Hypothesis: large overhead due to encoding of heap using many global maps (one per structure field). • Test: Localize using bounded model checking (a standard technique). • This shows what potentially could be achieved by integrating localization incrementally, or perhaps by using a different heap encoding.

  35. Conclusion • We've seen that program verification can be viewed as solving the VC's to infer the necessary auxiliary constructs such as loop invariants, procedure summaries, non-interference conditions and so on. • Many existing verification techniques can be applied to this problem • Generalizing to the non-linear case • Allowing application to many proof systems and languages • This allows a separation of concerns between programming language interpretation and verification algorithms • Re-use existing VC generators (Boogie, VCC, etc...) • Reduce barrier to entry in the field • Preliminary experiments indicate that the additional flexibility we gain may not come at a significant cost in performance.

  36. Re-using facts about sub-trees • If we prove a fact about one instance of P, it might be true of another • We simply pose this conjecture as a query and try to verify it • An instance that is covered by another is not used in the inductive subset • This eager inference helps to bound the unwinding.

  37. Reusing facts about contexts • Suppose we have already proved a fact about a predicate. • We can re-use this over-approximation in the derivation tree. • Facts useful in one context may be useful in another. • Suppose we have already deduced a fact: replace the subtree by this over-approximation. • If the over-approximation is UNSAT, we can compute interpolants; else, we can expand the sub-tree.

  38. Compositional verification • Re-use of previously derived facts is essential to avoid explosion of the derivation trees • Note that a derivation tree can be exponential in the unwinding size. • This is a form of compositional reasoning. • By conjecturing a value for one predicate, we divide the verification problem in half. • The conjecture may be too strong because its sub-tree is under-approximate • It may be too weak because its context is under-approximate. • Interesting heuristic issues • In case the tree is SAT, which approximated leaf do we expand? • BMC tools such as Corral and FunFrog also face this issue.
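
The exponential gap between the unwinding and a derivation tree is easy to quantify for a simple family of clauses. The sketch below assumes a hypothetical chain of relations in which each clause uses the next relation twice in its body, and in which the unwinding needs only one instance per relation because an instance can be shared by both uses. Both function names are illustrative.

```python
def tree_size(depth):
    """Nodes in a derivation tree when each clause uses the next relation twice."""
    if depth == 1:
        return 1
    return 1 + 2 * tree_size(depth - 1)


def unwinding_size(depth):
    """Relation instances in the unwinding when each instance is shared."""
    return depth
```

Under these assumptions a chain of 10 relations gives an unwinding of 10 instances but a derivation tree of 1023 nodes, which is why replacing sub-trees by previously derived facts matters.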
