Towards a language design for modular software verification. Aleks Nanevski Microsoft Research, Cambridge Joint with Greg Morrisett (Harvard), Lars Birkedal (ITU Copenhagen), Amal Ahmed (TTI-Chicago) Workshop on Effects and Type Theory Tallinn, December 13, 2007.
How to design a programming language from scratch with verification in mind? • Simple types have been very successful in preventing a class of programming errors. • But many errors are outside of their reach. • index-out-of-bounds • division-by-zero • invariants on mutable state, or almost anything involving effects • Can a language enforce these deeper properties? • While supporting usual features from programming practice. • Be conservative over simply-typed languages.
Two foundational approaches to program specification and verification • Hoare Logic • starts with an existing language • usually imperative, untyped, first-order • recent extensions to simply-typed functional languages [Honda’05],[Krishnaswami’06],[Birkedal’05] • Dependent type theory • targets pure higher-order lambda calculus • types may capture deep semantic properties of data • integer is even, list has 5 elements, etc. • I want to argue that we essentially want a combination of both.
What limitations of simple types to address? • Simple types cannot specify effects. • These operations are naturally partial, but here they must be “completed”: • perform run-time check • possibly raise exception • Simple types do not capture this partiality.
How to specify effect behavior? • Type-and-effect systems: refine the type with the effect annotation.
Semantic disconnect in type-and-effect systems • The following term would be labeled as throwing DivByZero in most type-and-effect systems. • Also, execution of div x n will repeat the check for n>0, even if it doesn’t need to. • Also, how to specify dynamically generated exceptions? • this immediately requires dependent types
How to reconnect type-and-effects with semantics? • Idea: draw effect annotations from logic. • y > 0 is a precondition that must be proved before running div x y. • we will also require postconditions, like in Hoare logic • and proofs • Important: Pre/post-conditions become embedded in types.
Why embed specifications into types? • Captures partiality • e.g., no need to define div x y in case y ≤ 0. • hence, strictly more expressive than Hoare Logic • Enables trade-offs between proving and efficiency • I.e. we can immediately define: • Uniform abstraction over terms, types, specs. • essential for information hiding and scalability • essential for higher-order and local state
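The trade-off between proving and run-time checking can be sketched in Python. This is only an illustration under stated assumptions: Python cannot discharge a precondition statically, so the `assert` in `div_pre` stands in for the proof obligation that HTT would demand at type-checking time, and the names `div_pre`, `div_checked` and `DivByZero` are hypothetical, not taken from the talk.

```python
class DivByZero(Exception):
    """Raised by the 'completed' (total) version of division."""


def div_pre(x: int, y: int) -> int:
    # Precondition style: y > 0 is the caller's obligation.
    # The assert is a stand-in for a static proof; in HTT no run-time
    # check would remain once the proof has been supplied.
    assert y > 0, "precondition y > 0 violated"
    return x // y


def div_checked(x: int, y: int) -> int:
    # Checked style, defined on top of div_pre: test once at run time,
    # and the successful test itself establishes the precondition.
    if y > 0:
        return div_pre(x, y)
    raise DivByZero(f"divisor must be positive, got {y}")
```

A caller that already knows `y > 0` can use `div_pre` directly and skip the repeated test; a caller that knows nothing about `y` pays for the check in `div_checked`.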
Which logic to use for specifications? • It should be able to support all kinds of programming features: • practical data structures (e.g., hash-tables). • higher-order functions, polymorphism. • pointers, aliasing, state ownership • recursion, callcc, IO, concurrency. • Thus, the logic had better be very expressive. • Type theory (like Coq) seems perfect. • But need to reconcile it with effects.
Hoare Type Theory (HTT) • Introduce a type corresponding to specs in Hoare Logic (for partial correctness). • Hoare type ST{P}x:A{Q} stands for • stateful programs with • precondition P • postcondition Q • result type A • The simply-typed fragment is (almost) core Haskell.
Hoare Type Theory (cont’d) • Fruitful combination of some fundamental PL ideas: • Dijkstra’s predicate transformer. • Curry-Howard isomorphism. • Monads (as in Haskell). • Separation Logic of Reynolds, O’Hearn, et al. • Provably compositional: • components can be specified and checked in isolation. • Prototype under construction as extension of Coq. • Execution by code extraction.
Type theories are unsound if effects are added naively • Propositions like (10 < 0) are types. • Effectful programs can often be given any type (the “awkward squad” from Haskell): • divergence via infinite recursion • exceptions • mutable state • IO • concurrency • An effectful program can prove that (10 < 0)! • Hence, the system is inconsistent.
A solution: Monads • Like in Haskell, distinguish purity with types • pure fragment – the underlying type theory • e : nat • e is an integer value • e : ST nat • e is a delayed effectful computation. • when executed, it may change the state and diverge. • but since it is delayed, it is actually considered pure. • hence, it can safely appear in types, predicates, proofs. • e : ST (10 < 0) • a computation which must diverge when executed.
Refine the monad with pre/post-condition to capture effectful behavior and partiality • Hoare type is a dependent (or indexed) monad. • Formation rule • ST{P}x:A{Q} : Type if • P : heap → Prop • A : Type • x:A |- Q : heap → heap → Prop, where heap = loc → option (Σα:Type. α), and loc = nat. • Note: postcondition is a binary relation on heaps. Variant of VDM notation.
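The shape of the formation rule can be mimicked dynamically. The sketch below is an assumption-laden model, not HTT itself: heaps are Python dicts from locations to (type tag, value) pairs, a missing key plays the role of `None` in the option type, predicates are Boolean functions, and a spec is checked on one concrete run instead of being proved. The names `run_checked`, `SpecViolation` and the example location `X` are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

Loc = int                               # loc = nat
Heap = Dict[Loc, Tuple[type, Any]]      # loc -> (type tag, value)
Pre = Callable[[Heap], bool]            # P : heap -> Prop
Post = Callable[[Any, Heap, Heap], bool]  # Q : A -> heap -> heap -> Prop


class SpecViolation(Exception):
    """A precondition or postcondition failed on a concrete run."""


def ptsto(x: Loc, v: Any, h: Heap) -> bool:
    # True if location x holds the value v (with v's type) in heap h.
    return h.get(x) == (type(v), v)


def run_checked(pre: Pre, post: Post, cmd: Callable[[Heap], Any], h: Heap):
    # Check the precondition on the initial heap, run cmd on a copy,
    # then check the postcondition, which relates initial and final heap.
    if not pre(h):
        raise SpecViolation("precondition does not hold")
    m = dict(h)
    result = cmd(m)
    if not post(result, h, m):
        raise SpecViolation("postcondition does not hold")
    return result, m


# Hypothetical example: reading location X, which must store an int.
X = 1
read_pre = lambda h: X in h and h[X][0] is int
read_post = lambda r, i, m: m == i and ptsto(X, r, i)
read_cmd = lambda m: m[X][1]
```

Because the postcondition receives both the initial heap `i` and the final heap `m`, it can state that a read leaves the heap unchanged, which is the point of making it a binary relation on heaps.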
Example: specify function that increments location contents and returns old value • where the points-to predicate (written ptsto x v h in the Coq syntax used later in this talk) is true if x points to v:A in h. • Note: before running inc x, must prove that x stores a nat. • because x may store a value of some other type. • because x may be a dangling pointer.
Implementation of inc in Haskell-style do-notation. • HTT implementation typechecks inc as follows: • Compute P,Q=weakest pre/strongest post for the do-block • Then emit obligation to prove the consequence:
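A dynamic rendering of the inc example, as a rough sketch. The heap is assumed to be a dict from locations to (type tag, value) pairs; `inc_pre` and `inc_post` are hand-written readable specs of the kind the consequence obligation would relate to the computed weakest pre/strongest post, and all three names are hypothetical.

```python
def inc(x):
    """Return a command that increments the contents of x and returns the old value."""
    def run(h):
        _, xv = h[x]            # xv <- read x
        h[x] = (int, xv + 1)    # write x (xv + 1)
        return xv               # return xv
    return run


def inc_pre(x):
    # x must be allocated and must store a nat (modelled as int).
    return lambda h: x in h and h[x][0] is int


def inc_post(x):
    # r is the old contents of x; the final heap differs from the
    # initial one only at x, which now holds r + 1.
    return lambda r, i, m: i[x] == (int, r) and m == {**i, x: (int, r + 1)}
```

The precondition rules out both failure modes named on the previous slide: a location holding a value of another type, and a dangling pointer.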
Typing of primitive commands designed to compute weakest pre and strongest post • Memory read • (Strong) Memory update
Typing of primitive commands designed to compute weakest pre and strongest post • Memory allocation • Memory deallocation
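The four primitive commands can be sketched over a dict-based heap, with each precondition checked at run time instead of proved. This is an assumed model for illustration: the function names, the `PreconditionError` class and the fresh-location strategy (one past the largest allocated location) are hypothetical and do not reproduce the typing rules from the slides.

```python
class PreconditionError(Exception):
    """The command was run in a heap that violates its precondition."""


def read(h, x):
    # Precondition: x is allocated. The heap is left unchanged.
    if x not in h:
        raise PreconditionError(f"read: location {x} is dangling")
    return h[x]


def write(h, x, v):
    # Strong update: x must be allocated, but the new value may have
    # a different type from the old one.
    if x not in h:
        raise PreconditionError(f"write: location {x} is dangling")
    h[x] = v


def new(h, v):
    # Allocation has no precondition; it returns a location not in h.
    r = max(h, default=0) + 1
    h[r] = v
    return r


def dealloc(h, x):
    # Precondition: x is allocated. Afterwards x is dangling again.
    if x not in h:
        raise PreconditionError(f"dealloc: location {x} is dangling")
    del h[x]
```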
Fixpoints are a little bit different… • Pre/posts must be given explicitly (for now) • Corresponds to giving loop invariants in Hoare Logic • But should be possible to write a rule that infers the strongest invariant! Future work.
Monadic primitives (unit) • Roughly, corresponds to Hoare Logic rule of variable assignment.
Monadic primitives (bind) • Rule of sequential composition (but higher-order) • Note: quantification over pre/posts and heaps is essential for obtaining tightest specs.
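How bind combines the specs of its two components can be sketched with Boolean predicates. The shapes of `bind_pre` and `bind_post` below follow the structure visible in the Coq output shown later in the talk (a conjunction with a universally quantified implication for the precondition, an existential over the intermediate value and heap for the postcondition). The finite lists `vals` and `heaps` are an assumption of this sketch: Python cannot quantify over all values and heaps, so the quantifiers range over small explicit universes.

```python
def bind_pre(p1, q1, p2, vals, heaps):
    # The first command may run (p1 i), and every intermediate result x
    # and heap m it can produce satisfies the second precondition.
    return lambda i: p1(i) and all(
        p2(x, m) for x in vals for m in heaps if q1(x, i, m)
    )


def bind_post(q1, q2, vals, heaps):
    # There exist an intermediate value x and heap h linking the two posts.
    return lambda y, i, m: any(
        q1(x, i, h) and q2(x, y, h, m) for x in vals for h in heaps
    )


# Hypothetical example on a one-cell heap {0: n}: read cell 0, then
# write back the read value plus one.
vals = list(range(4))
heaps = [{}] + [{0: n} for n in range(4)]

read_pre = lambda i: 0 in i
read_post = lambda x, i, m: m == i and i.get(0) == x
write_pre = lambda xv, i: 0 in i
write_post = lambda xv, y, i, m: y is None and m == {**i, 0: xv + 1}

pre = bind_pre(read_pre, read_post, write_pre, vals, heaps)
post = bind_post(read_post, write_post, vals, heaps)
```

Here `pre` holds of `{0: 1}` but not of the empty heap, and `post` relates `{0: 1}` to `{0: 2}` only.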
Monadic primitives (Haskell-style do) • Rule of consequence • Interesting fact: “do” is not ordinary coercion • it is an introduction form for Hoare type • bind is corresponding elimination
Example: counter • Allocate a private location x • Export function that increments x • Executing f ← counter; x0 ← f; x1 ← f; x2 ← f will bind 0,1,2 to x0,x1,x2, respectively. • What is the spec for counter?
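The behaviour of counter can be sketched with a Python closure over an explicit heap. This is a hypothetical rendering: the heap is a dict passed in by the caller, and the fresh location is chosen as one past the largest allocated location.

```python
def counter(heap):
    """Allocate a private location x in heap and return a function that increments it."""
    x = max(heap, default=0) + 1   # fresh location
    heap[x] = 0

    def f():
        v = heap[x]        # read the current count
        heap[x] = v + 1    # increment the private location
        return v           # return the old value

    return f
```

Calling the returned function three times yields 0, 1, 2 in turn. The location `x` is not visible to the caller by name, which is exactly what makes the spec hard to write: it has to talk about state that is out of scope.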
A specification with nested Hoare types • Problem: x is out of scope in return type.
Hide private state by existential abstraction • Introduce invariant into code to hide how count is kept. • Another problem: • fst(f) 0 h states (x ↦ 0) h, but we lost connection with i • We will need Separation Logic to handle this.
Weakest pre and strongest post precisely capture the semantics of a program. • Problem: these may not be easy to read! • Remember the example 3-line program:
Here is the computed tightest spec for inc, in Coq syntax. inc : forall x : loc, ST (fun i : heap => (fun i0 : heap => exists v : nat, ptsto x v i0) i /\ (forall (x0 : nat) (m : heap), (fun (y : nat) (i0 m0 : heap) => m0 = i0 /\ ptsto x y i0) x0 i m -> (fun (xv : nat) (i0 : heap) => (fun i1 : heap => exists B : Type, exists w : B, ptsto x w i1) i0 /\ (forall (x1 : unit) (m0 : heap), (fun (_ : unit) (i1 m1 : heap) => m1 = update x (xv + 1) i1) x1 i0 m0 -> (fun (_ : unit) (_ : heap) => True) x1 m0)) x0 m)) (fun (y : nat) (i m : heap) => exists x0 : nat, exists h : heap, (fun (y0 : nat) (i0 m0 : heap) => m0 = i0 /\ ptsto x y0 i0) x0 i h /\ (fun (xv y0 : nat) (i0 m0 : heap) => exists x1 : unit, exists h0 : heap, (fun (_ : unit) (i1 m1 : heap) => m1 = update x (xv + 1) i1) x1 i0 h0 /\ (fun (_ : unit) (r : nat) (i1 f : heap) => r = xv /\ f = i1) x1 y0 h0 m0) x0 y h m)
Luckily, the spec has a lot of structure! • It literally represents the program as a predicate. • We apply the proving strategy from Hoare Logic: • symbolically evaluate the program, one step at a time. • at each step, discharge the verification condition that enables the next evaluation step. • With a twist: Evaluation/VC-generation can be implemented as a set of lemmas. • proving the lemmas verifies the VC-gen implementation.
Example lemma for symbolic evaluation (in Coq syntax) • If the program starts with a read from location x: • first prove that x is initialized (ptsto x v i) • then proceed to prove the spec of the continuation. • Other lemmas are similar (evals_bind_write, evals_bind_new…) • The applicable lemma can be determined by a tactic.
Lemma evals_bind_read :
  forall (A B : Type) (x : loc) (v : A) (p2 : A -> heap -> Prop)
         (q2 : A -> B -> heap -> heap -> Prop) (i : heap) (q : B -> heap -> Prop),
    ptsto x v i ->
    (p2 v i /\ forall y m, q2 v y i m -> q y m) ->
    (bind_pre (read_pre A x) (read_post A x) p2 i /\
     forall y m, bind_post (read_pre A x) (read_post A x) p2 q2 y i m -> q y m).
Large footprints in Hoare Logic • Let inc: • Q: What is known after inc runs in a heap with locations x and y? • A: Only that x ↦ v+1, but all info about y is lost. • Spec should explicitly say that y is not changed. • possible to write in ST, but quite inconvenient
Small footprints and Separation Logic • Specs should only describe what the program changes [O’Hearn,Reynolds,Pym,…] • If e : STsep{P}x:A{Q}, then e can run in • any heap containing a subheap i such that P i • e either diverges, or returns a subheap m such that Q i m • the part of the initial heap outside i is not accessible. • Easier to use than large footprints, but more difficult meta theory.
Separation logic adds two new things: • Separating conjunction (easily definable in HTT): (P * Q) holds of heap h iff P and Q hold of disjoint parts of h • Frame rule of inference (in its standard form): If {P} e {Q} then {P * R} e {Q * R} • Can we add the Frame rule to HTT? How to prove that Frame is sound?
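The definition of separating conjunction can be made concrete over finite dict heaps. In this minimal sketch, `star` enumerates every way of splitting a heap into two disjoint parts. The helper names `splits`, `pts` and `emp` are hypothetical, and `pts` is assumed to be the exact, singleton-heap reading of points-to that is usual in Separation Logic.

```python
from itertools import combinations


def splits(h):
    """Yield every pair (h1, h2) of disjoint heaps whose union is h."""
    keys = list(h)
    for k in range(len(keys) + 1):
        for left in combinations(keys, k):
            h1 = {x: h[x] for x in left}
            h2 = {x: h[x] for x in h if x not in left}
            yield h1, h2


def star(p, q):
    # (P * Q) h  iff  h splits into disjoint h1, h2 with P h1 and Q h2.
    return lambda h: any(p(h1) and q(h2) for h1, h2 in splits(h))


def pts(x, v):
    # Exact points-to: the heap consists of the single cell x holding v.
    return lambda h: h == {x: v}


emp = lambda h: h == {}   # the empty heap
```

Because the two parts must be disjoint, `star(pts(1, 'a'), pts(1, 'a'))` fails on the heap `{1: 'a'}`: the same cell cannot be owned twice, which is how separating conjunction rules out aliasing.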
Employ a type-theoretic idea to expedite… • Impose that well-typed programs must satisfy Frame! • Define new monad STsep, over ST: • Then re-type the stateful commands, using rule of consequence.
Programs remain the same, but specs become much simpler • Example: allocation • empty subheap is consumed and replaced by r ↦ v • r must be fresh (as new can’t access existing state) • Example: deallocation • subheap x ↦ - is consumed and replaced by empty. • Analogy with linear logic.
STsep monad correctly handles private state • Now (fst f) 0 replaces empty from the precondition. • Meaning: initial heap is extended with x ↦ 0
Meta-theoretic properties: soundness, compositionality, equations
Verification in HTT reduces to typechecking • Theorem: If e:ST{P}r:A{Q}, then e evaluates as expected. • Proved via Preservation and Progress lemmas. • but much more demanding! • Preservation: evaluation preserves types, normal forms, and postconditions. • e.g: if e:ST{T}r:int{r = 55} then e does produce 55. • Progress demands soundness of assertion logic • Requires a denotational model for HTT.
Type checking is syntax directed • Program properties independent of context. • No need for whole program reasoning. • Proofs by induction on program structure. • Program is a proof of its spec: • in the pure case, by Curry-Howard. • in the impure case, by weakest pre/strongest post. • Formal statements of compositionality • In the pure case, substitution principles. • In the impure case, Hoare’s rule of composition.
Denotational models • Denotation for e : ST{P}x:A{Q x} is a predicate transformer: • takes p : heap → Prop such that ∀h. p h → P h • returns q : A → heap → Prop such that ∀x h. q x h → ∃i. p i ∧ Q x i h • is monotone • Model suffices for soundness, but too large • e.g., does not support storing monads into heaps • also, requires showing monotonicity before taking fix. • Better, realizability model [Petersen,Birkedal’08]. • But not implemented in Coq, and seems very hard to!
Summary • HTT reflects effect information into types via Hoare-style pre/post conditions. • Generalization of monadic type-and-effect systems, but effect annotations are logical predicates over heaps. • Types determine in which context a program may be used (in a context satisfying the precondition). • This is a uniquely type-theoretic property, generalizing ordinary Hoare Logics. • Combines usefully with higher-order features of a type theory like Coq, to represent modes of use of state, like: • freshness, aliasing, ownership (via Separation Logic) • higher-order and shared local state (via existential abstraction).
Related work • Extended static checking: • ESC/Java, JML, Spec#, SPlint, Cyclone, Sage • Hoare-like annotations verified during typechecking. • Restrictive strategies for dealing with undecidability • Dependent types and effects • [Augustsson’98],[Mandelbaum’03],[Zhu,Xi’05],[Shao’05], [Sheard’05],[Westbrook’06],[Taha’07],[Condit’07]. • Programs and specs cannot share pure code (phase separation) • Hoare Logics for higher-order functions: • [Schoeder’02],[Honda’05],[Krishnaswami’06],[Birkedal’04] • Simply-typed underlying languages (with effects) • Hoare triples do not integrate into types.
HTT in comparison to related work. [Figure: chart with axes “Programming features” and “Spec expressiveness”, positioning Java, C#, Haskell, O’Caml; Hoare specs (ESC, JML, Spec#, Cyclone); light dependent types (Cayenne, DML, ATS, Omega); typed lambda calculus; dependent type theory (Coq, Epigram, NuPRL…); HTT; and fully verified software.]
Future work: gain more experience with implementation in Coq • A lot of scaffolding for verification is in place • symbolic evaluation lemmas • tactics for Separation Logic reasoning (were tricky to nail down at first; several wrong starts) • Getting ready to attack larger programs. • Probably start with libraries for imperative data structures. • Largest so far: Hash-table module, Stack module, Parsing combinators. • Experience encouraging: • proofs/code ratio quite large • but proofs were not difficult
Future work: other effects • First attempts at formulating Haskell-style monad for transactional concurrency. • Separate state into private and shared • Reasoning like O’Hearn’s concurrent separation logic • Hoare type is a 4-tuple STM{I}{P}x:A{Q} • I – invariant of shared state • Other notions of concurrency? Auxiliary variables, history/prophecy variables? Predicate transformers for concurrency? • IO monad? • Specifications must be limited to statements that are invariant against outside changes to the world. • Continuation monad? (first attempts made)
Future work: better models and axiomatizations • Can we encode equality over effectful code as some reasonable judgment? • Without having to implement involved categorical models.
Hopefully in future not too far, far away… [Figure: the same chart as in the comparison to related work, with axes “Programming features” and “Spec expressiveness”, showing Java, C#, Haskell, O’Caml; Hoare specs (ESC, JML, Spec#, Cyclone); light dependent types (Cayenne, DML, ATS, Omega); typed lambda calculus; dependent type theory (Coq, Epigram, NuPRL…); HTT; and fully verified software.]