Secure Compiler Seminar 11/7
Survey: Modular Development of Certified Program Verifiers with a Proof Assistant
Toshihiro YOSHINO (D1, Yonezawa Lab.) <tossy-2@yl.is.s.u-tokyo.ac.jp>
Today’s Paper
• A. Chlipala (UC Berkeley). Modular Development of Certified Program Verifiers with a Proof Assistant. ICFP ’06.
• The implementation can be downloaded from the web site below:
  ⇒ http://proofos.sourceforge.net/
Overview of the Paper
• Case study in developing a certified program verifier with Coq
  • Verifies memory safety of x86 machine code
  • Its soundness is machine-checked
• Modular development with reusable functors
  • A new verifier based on another type system can be created at low cost
Constructing Certified Verifiers
• Design and implement with Coq
  • Use the “extraction” feature of Coq to obtain a working verifier
• A verifier can be formalized as:
  • load: program -> state loads a program
    • The type program represents the binary file format
  • safe: state -> Prop is the safety property we wish to verify for programs
  • [[P]] is notation for poption P
    • The counterpart of option (O’Caml) or Maybe (Haskell) for the domain Prop
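The bullets above can be put together in a few lines of Coq. This is only a sketch reconstructed from the bullets, not the paper's source: program, state, load and safe are left abstract, and the name rejectAll is hypothetical. It shows that a verifier returning "I am not sure" for every input already has the required type, so soundness is enforced by the type alone.

```coq
(* poption: the analogue of option for the domain Prop *)
Inductive poption (P : Prop) : Set :=
| PSome : P -> poption P
| PNone : poption P.

Section Verifier.
  (* Abstract ingredients named on the slide *)
  Variable program : Set.
  Variable state : Set.
  Variable load : program -> state.
  Variable safe : state -> Prop.

  (* A verifier either returns a proof that the loaded program
     is safe, or gives up. [[P]] on the slide is poption P. *)
  Definition verifier : Set :=
    forall p : program, poption (safe (load p)).

  (* Hypothetical example: the trivially sound verifier
     that never accepts anything. *)
  Definition rejectAll : verifier :=
    fun p => PNone (safe (load p)).
End Verifier.
```

A useful verifier differs from rejectAll only in how often it manages to return PSome.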
Constructing Certified Verifiers
• Abstraction refinement in multiple stages
  • Each stage (component) is a functor which transforms target states into source states
  • Later components reason at higher levels of abstraction
• Coq’s module system is used to implement this modular design
Formalization of x86 Instruction Set
• PCC-style formalization
• Subset of the x86 instruction set + an ERROR instruction
  • mov, jcc, …
• Safety ≡ ERROR is unreachable
  • In combination with assertions, many properties can be proven
  • Can be formalized coinductively
    • Copes with infinite derivations
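A coinductive safety predicate of this kind might look as follows. The sketch assumes a deterministic step function and a predicate isError marking states about to execute ERROR; both names, and the two-state toy machine used to exercise the definition, are hypothetical and much simpler than the paper's x86 model. The lemma shows why coinduction is needed: a program that loops forever has an infinite safety derivation.

```coq
Section Safety.
  Variable state : Set.
  Variable step : state -> state.      (* assumed deterministic *)
  Variable isError : state -> Prop.    (* about to execute ERROR *)

  (* s is safe if it is not an error state and its successor is safe.
     CoInductive allows derivations of infinite depth. *)
  CoInductive safe : state -> Prop :=
  | SafeStep : forall s,
      ~ isError s -> safe (step s) -> safe s.
End Safety.

(* Toy machine: Loop steps to itself forever, Err is the error state. *)
Inductive st : Set := Loop | Err.
Definition stepf (s : st) : st := s.
Definition err (s : st) : Prop := s = Err.

Lemma loop_safe : safe st stepf err Loop.
Proof.
  cofix CH.
  apply SafeStep.
  - intro H. unfold err in H. discriminate H.
  - exact CH.   (* guarded: CH sits directly under SafeStep *)
Qed.
```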
Types and Extraction in Coq
• Coq manipulates terms of a dependently typed lambda calculus
  • A proposition is represented as a type, its proof as a term of that type
  • Well known as the Curry-Howard isomorphism
• Proving corresponds to constructing a term of the given type
  • Given a goal, refine it interactively into subgoals, and eliminate the holes
  • The rules used for these steps are called tactics
Types and Extraction in Coq
• Program extraction from Coq code
  • In short, extraction erases terms of sorts other than Set
• Brief example: isEven

    Definition isEven : forall (n:nat), poption (even n).
      refine (fix isEven (n:nat) : poption (even n) :=
        match n return … with
        | O => PSome _ _
        | S (S n) => …
        | _ => PNone _
        end); auto.
    Qed.

  is extracted to the following O’Caml code:

    let rec isEven (n:nat) =
      match n with
      | O -> true
      | S (S n) -> isEven n
      | _ -> false
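For readers who want to step through the example, here is one self-contained way to write it. It is a sketch, not the slide's elided code: the definition of even, the constructor names even_O and even_SS, the variable n', and the closing tactic are all assumptions made for illustration.

```coq
Inductive poption (P : Prop) : Set :=
| PSome : P -> poption P
| PNone : poption P.

(* Assumed definition of evenness *)
Inductive even : nat -> Prop :=
| even_O : even 0
| even_SS : forall n, even n -> even (S (S n)).

Definition isEven : forall n : nat, poption (even n).
  refine (fix isEven (n : nat) : poption (even n) :=
    match n return poption (even n) with
    | O => PSome _ _
    | S (S n') =>
        (* Recurse, then rebuild the proof for S (S n') *)
        match isEven n' with
        | PSome _ pf => PSome _ _
        | PNone _ => PNone _
        end
    | _ => PNone _
    end); constructor; assumption.
Defined.

(* To extract with PSome/PNone mapped to booleans, something like
   the following should work on recent Coq versions:
     Require Extraction.
     Extract Inductive poption => "bool" [ "true" "false" ].
     Recursive Extraction isEven.
*)
```

The two proof holes left by refine are even 0 and even (S (S n')); the trailing tactic closes them with the constructors of even, and extraction then discards them.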
poption: “option” for Domain “Prop”
• Two constructors: PNone and PSome
  • PSome is given a proof of P
  • Literally, PSome means “P holds and I have a proof for that” and PNone “I am not sure”
• Can be used as a failure monad
  • PNone >>= _ = PNone
  • PSome p >>= f = f p
• In extraction, PSome corresponds to true, and PNone to false
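The two monad equations translate directly into Coq. In this sketch the names pbind and pand are hypothetical; pand shows the typical use, chaining two partial checks so that the result carries a proof of the conjunction only if both checks succeed.

```coq
Inductive poption (P : Prop) : Set :=
| PSome : P -> poption P
| PNone : poption P.

(* Bind of the failure monad:
     PNone   >>= _ = PNone
     PSome p >>= f = f p      *)
Definition pbind (P Q : Prop)
    (x : poption P) (f : P -> poption Q) : poption Q :=
  match x with
  | PSome _ p => f p
  | PNone _ => PNone Q
  end.

(* Hypothetical use: combine two checks into one. *)
Definition pand (P Q : Prop)
    (x : poption P) (y : poption Q) : poption (P /\ Q) :=
  pbind P (P /\ Q) x (fun p =>
  pbind Q (P /\ Q) y (fun q =>
  PSome (P /\ Q) (conj p q))).
```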
soption
• soption extends poption with a parameter
  • A proposition about a term of domain T (of sort Set)
• soption, too, can be used as a failure monad
• In the theoretical part of the paper, it is written as {{ x : T | P }}
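A possible definition, with a small function that uses it, is sketched below. The constructor names SSome and SNone and the example safePred are assumptions for illustration: on success the result carries both a value of type T and a proof of the property about that value.

```coq
(* soption T P: either a value x : T together with a proof of P x,
   or failure. Written {{ x : T | P }} in the paper's notation. *)
Inductive soption (T : Set) (P : T -> Prop) : Set :=
| SSome : forall x : T, P x -> soption T P
| SNone : soption T P.

(* Hypothetical example: the predecessor of n, if there is one,
   with a proof that it really is the predecessor. *)
Definition safePred (n : nat) : soption nat (fun m => n = S m) :=
  match n return soption nat (fun m => n = S m) with
  | O => SNone nat (fun m => 0 = S m)
  | S k => SSome nat (fun m => S k = S m) k eq_refl
  end.
```

After extraction only the value part would remain, so SSome behaves like Some and SNone like None.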
Coq’s Module System
• Used to build re-usable verification components
• Frequent pattern:

    Record state : Set := { stRegs32 : regs32; … }.
    Inductive instr : Set := Arith : … | … .
    Inductive exec : … := … .

    Module Type MACHINE.
      Parameter mstate : Set.
      Parameter minitState : mstate -> Prop.
      …
    End MACHINE.

    Module M86 <: MACHINE.
      Definition mstate := state.
      Definition minstr := instr.
      …
    End M86.
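To see how a functor fits into this pattern, consider the following toy sketch. MACHINE is cut down to the two fields visible on the slide, and the modules Counter, Pairing and Counter2 are hypothetical; they are not components of the paper's library. The point is only that a functor takes any module matching MACHINE and produces another one.

```coq
Module Type MACHINE.
  Parameter mstate : Set.
  Parameter minitState : mstate -> Prop.
End MACHINE.

(* A toy machine whose state is a single counter. *)
Module Counter <: MACHINE.
  Definition mstate : Set := nat.
  Definition minitState (s : mstate) : Prop := s = 0.
End Counter.

(* A functor: extend any machine's state with an extra counter. *)
Module Pairing (M : MACHINE) <: MACHINE.
  Definition mstate : Set := (M.mstate * nat)%type.
  Definition minitState (s : M.mstate * nat) : Prop :=
    M.minitState (fst s) /\ snd s = 0.
End Pairing.

(* Functor application yields a new machine. *)
Module Counter2 := Pairing Counter.
```

Because Pairing again produces a MACHINE, such components can be stacked, which is the pipeline style the paper relies on.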
Module ModelCheck
• Provides fundamental methods of model checking
  • Methods to prove theorems about infinite state systems through exhaustive exploration
• The model is refined in each of the following stages
[Figure: the stages, ordered from Abstract to Concrete]
Module ModelCheck: Introduced Elements
• absState: a set of abstract states
  • An abstract state is managed with “hypotheses”, states that are known to be safe
  • A hypothesis is used, for example, to formalize the return pointer of a function
• A relation describes the correspondence between machine states and abstract states
  • The context (Γ) is erased when extracting a verifier
• init is a set (actually a list) of initial states
  • It must be a set because one real machine state may correspond to multiple abstract states
  • There must be some element of init that has no hypothesis
Module ModelCheck: Introduced Elements
• step describes an execution step
  • Executes an instruction from the specified state
  • soption is used because the execution may get stuck
• Progress and Preservation must hold
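One plausible shape for these elements is sketched below as a Coq module type. It is an assumption-laden simplification, not the paper's interface: the hypotheses context is omitted, plain option stands in for soption, and the names mstep, describes, initOk, progress and preservation are hypothetical. The exact statements of Progress and Preservation in the paper may differ.

```coq
Require Import List.

Module Type MACHINE.
  Parameter mstate : Set.
  Parameter minitState : mstate -> Prop.
  Parameter mstep : mstate -> mstate -> Prop.  (* hypothetical *)
End MACHINE.

Module Type ABSTRACTION (M : MACHINE).
  Parameter absState : Set.
  (* describes s a: abstract state a describes machine state s *)
  Parameter describes : M.mstate -> absState -> Prop.
  Parameter init : list absState.
  (* None means the abstract execution got stuck *)
  Parameter step : absState -> option (list absState).

  (* Every initial machine state is covered by some element of init *)
  Axiom initOk : forall s, M.minitState s ->
    exists a, In a init /\ describes s a.

  (* Progress: if the abstract step succeeds, the machine can step *)
  Axiom progress : forall s a succs,
    describes s a -> step a = Some succs ->
    exists s', M.mstep s s'.

  (* Preservation: every machine successor is described
     by one of the abstract successors *)
  Axiom preservation : forall s a succs s',
    describes s a -> step a = Some succs -> M.mstep s s' ->
    exists a', In a' succs /\ describes s' a'.
End ABSTRACTION.
```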
Module ModelCheck: The Concept Illustrated
[Figure: the state space of a real machine (MACHINE, the input to the module) and the abstract state space (absState), each with its initial states; step acts on the abstract states]
Module Reduction
• Translates x86 machine language into a simpler RISC-style instruction set (SAL)
  • x86 machine language is too complex and not suitable for verification purposes
    • One instruction may perform several basic operations
    • The same basic operations show up in the workings of many instructions
• The Reduction module also provides a model checking layer for SAL programs
Module Reduction: SAL (Simplified Assembly Language)
• Named after the language used in Proof-Carrying Code [Necula 1997]
• RISC-style instruction set
  • Arithmetic is extended to allow expressions with parentheses and infix operators
  • Additional temporary registers TMPi
Module FixedCode
• Ensures that the code region is not overwritten by the code itself
  • To simplify the verification framework
• The definition is in the form of ModelCheck
  • An additional check is performed only on stores to memory
Module TypeSystem
• Support for a standard approach to type systems
  • A set of types is introduced and typing rules for values are described
  • A subtype relation is also introduced
    • The definition in the figure suffices because Coq takes care of that part
  • Each register is associated with a type
Module TypeSystem
• viewShift represents a shift in the view of types
  • Occurs where a program crosses an abstraction boundary
  • For example, in function calls, when the stack frame changes
  • Introducing an existential is also a kind of view shift
Module WeakUpdate
• Introduces a type system with weak update
  • Each memory cell has an associated type, and this type does not change during a run
  • A cell can be overwritten only with a value of its type
• Dynamic memory management is out of scope
  • In a real setting, memory is frequently reclaimed and reused
  • Garbage collector or malloc/free
The Rest of the Modules
• Module StackTypes
  • Keeps track of the types of stack slots
• Module SimpleFlags
  • Keeps track of flag values
  • x86, too, has no single instruction that performs a conditional test and a jump at once
  • Crucial for ensuring that a pointer is valid (not null) or for checking array bounds
Case Study: A Verifier for Algebraic Datatypes
• Implemented the library and a sample verifier with Coq
  • http://proofos.sourceforge.net/
• Approx. 20K(+α) LoC
  • The main implementation consists of only 600 LoC
  • 7,000 LoC for implementing library components
  • 10,000 for generic utilities
  • 1,000 for bitvectors and fixed-precision arithmetic
  • 1,000 for a subset of x86 machine code
• Auxiliary library from an O’Caml implementation (not counted here)
  • x86 binary parsing, etc.
Related Work
• Foundational PCC [Appel 2001]
  • Reduces the TCB and also improves the flexibility of PCC by constructing the system on a logical framework
  • However, efficiency is sacrificed for generality
    • Theoretical issues seem to take priority over pragmatics
• Epigram [McBride, McKinna 2004], ATS [Chen, Xi 2005], RSP [Westbrook et al. 2005] and GADTs [Sheard 2004]
  • Incorporate dependent types into programming languages
  • But the foundations of Coq’s implementation and metatheory are simpler than theirs
Summary (of the Paper)
• Designed a structure for modular certified verifiers
  • Components are reusable functors
  • Pipeline-style design
• Implemented library components with Coq
• As a case study, a memory safety verifier for x86 machine code is constructed
Relevance to My Research
• I have been studying a framework to build verifiers for low-level languages
  • First formalize the common language ADL
  • Verification is done on the translated program (in ADL)
• Trying to prove the correctness of the translation
  • Currently ongoing with Coq
Relevance to My Research
• Both projects take very similar approaches
  • ADL and SAL are both designed along minimalist lines
  • The verification logic is built on top of the common language’s semantics
  • To achieve high portability and flexibility
• From this viewpoint, my project is covered by his… (x_x)
  • Correctness of translation is also proven with Coq in proofos
  • On the positive side, my direction was not so wrong
Relevance to My Research
• Comparison of the two projects…