Scalable Contract Checking for Systems Software using SMT solvers
Shaz Qadeer
RiSE, Microsoft Research
Joint work with Jeremy Condit and Shuvendu Lahiri
http://research.microsoft.com/en-us/projects/havoc/
Context: Scalable module verification

Target: OS components (kernel, drivers, file systems)
• ~100 KLOC with >1000 procedures

Module
• A set of public/entry procedures
• A set of private/internal procedures

Specs
• Interface specification
  • Specs for public methods
  • Specs for external modules
• Property assertion

Harness
    Initialize(..);
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);
Desirable goals
• Find bugs
  • Violations of property assertions
  • Low false alarms
• Use contracts
  • Modular checking for scalability
  • Readable contracts are formal documentation
• Reduce testing cost by providing high assurance in the verifier
  • Formal documentation of assumptions
  • Simple meta-theory for proofs
Existing methods on these examples

Large difference between theory and practice

Imprecise
• Modeling of lists/arrays

Unsound
• Modeling of lists/arrays
• Aliasing, pointer arithmetic
• Restricted harness

Complex “proof” calculus
• Combination of analyses

Harness
    Initialize(..);
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);
Full functional correctness is not a goal.
Neither is minimizing the trusted computing base.
Proof method: Floyd-Hoare triple
• Floyd-Hoare triple {P} S {Q}
  • P, Q: predicates/properties
  • S: a program
• From a state satisfying P, if S executes,
  • No assertion in S fails, and
  • Terminating executions end in a state satisfying Q
Program verification

The triple
    { b.f = 5 } a.f = 5 { a.f + b.f = 10 }
is valid iff the formula
    Select(f1,b) = 5 ∧ f2 = Store(f1,a,5) ⇒ Select(f2,a) + Select(f2,b) = 10
is valid.

• Theory of equality: =
• Theory of arithmetic: 5, 10, +
• Theory of arrays: Select, Store
• [Nelson & Oppen ’79]
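The Select/Store encoding can be sanity-checked by brute force on a tiny heap. The following is a minimal sketch, not how an SMT solver works: it assumes three objects and field values in 4..6, and the names `field_map`, `select_f`, `store_f`, and `triple_holds_on_small_heaps` are hypothetical. A bounded check like this can only find counterexamples; the solver establishes validity for all heaps.

```c
#include <assert.h>
#include <stdbool.h>

#define NOBJ 3  /* assumed tiny heap: object ids 0..NOBJ-1 */

/* A field f is modeled as a map from object id to int. */
typedef struct { int at[NOBJ]; } field_map;

/* Select(f, x): read field f of object x. */
static int select_f(const field_map *f, int x) { return f->at[x]; }

/* Store(f, x, v): a new map equal to f everywhere except at x. */
static field_map store_f(const field_map *f, int x, int v) {
    field_map g = *f;
    g.at[x] = v;
    return g;
}

/* Checks { b.f = 5 } a.f = 5 { a.f + b.f = 10 } for every choice of
   a and b (including a == b) and every initial map with values 4..6. */
static bool triple_holds_on_small_heaps(void) {
    for (int a = 0; a < NOBJ; a++)
      for (int b = 0; b < NOBJ; b++)
        for (int v0 = 4; v0 <= 6; v0++)
          for (int v1 = 4; v1 <= 6; v1++)
            for (int v2 = 4; v2 <= 6; v2++) {
                field_map f1 = {{v0, v1, v2}};
                if (select_f(&f1, b) != 5) continue;   /* precondition fails */
                field_map f2 = store_f(&f1, a, 5);     /* a.f = 5 */
                if (select_f(&f2, a) + select_f(&f2, b) != 10)
                    return false;                      /* postcondition fails */
            }
    return true;
}
```

The aliased case a == b is the interesting one: the store overwrites b.f with 5, so the postcondition still holds.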
Satisfiability-Modulo-Theory (SMT)
• Boolean satisfiability solving + theory reasoning
• Ground theories
  • Equality, arithmetic, Select/Store
• NP-complete logics
• Powerful methods to combine decision procedures for theories
  • [Nelson & Oppen ’79]
• Phenomenal progress in the past few years
  • Yices, Z3, Mathsat, ….
Simple type-state property
• Allocation type-state of DEV_OBJ
  • Device Objects (DEV_OBJ) allocated and freed
• Property to check for a module
  • IoDeleteDevice() only called on elements in MyDevObj

[Type-state diagram: ~MyDevObj --IoCreateDevice()--> MyDevObj --IoDeleteDevice()--> ~MyDevObj]
Simple property simple invariants

    typedef struct _DEV_OBJ {
        DEV_EXT *DevExt;
        …
    } DEV_OBJ;

    typedef struct _DEV_EXT {
        DEV_OBJ *Self;
        …
    } DEV_EXT;

    requires (do ∈ MyDevObj)
    NT_STATUS PnP(DEV_OBJ do, IRP *pirp) {
        PDEV_EXT data = do->DevExt;
        ….
        switch (pirp->MajorFn) {
        case IRP_MN_REMOVE_DEVICE:
            IoDeleteDevice(data->Self);
            …
        }
    }

[Diagram: DEV_OBJ --DevExt--> DEV_EXT --Self--> DEV_OBJ]

• ∀x ∈ MyDevObj. x->DevExt->Self = x
Simple property simple invariants

    NT_STATUS Unload(…) {
        ….
        iter = hd->First;
        while (iter != null) {
            RemoveEntryList(iter);
            iter = iter->Next;
            IoDeleteDevice(iter->Self);
        }
        ….
    }

[Diagram: hd --First--> DEV_EXT --Next--> DEV_EXT --Next--> …; each DEV_EXT --Self--> DEV_OBJ and each DEV_OBJ --DevExt--> DEV_EXT]

• ∀x ∈ DevExt. x->Self ∈ MyDevObj
• ∀x ∈ Btwn(Next, hd->First, NULL). x ∈ DevExt
• ∀x ∈ DevExt. x->Self->DevExt = x
Limitations of SMT solvers
• No support for precise reasoning with reachability predicate
  • Incompleteness in Floyd-Hoare proofs for straight-line code
• Brittle support for quantifiers
  • Complexity: NP-complete (ground) → undecidable
  • Leads to unpredictable behavior of verifiers
    • Proof times, proof success rate
Limitations of SMT solvers
• Answer the query {P} S {Q} for loop-free and call-free programs
• To handle loops and procedures, contracts are needed
  • Loop invariants
  • Pre/post-conditions
• Infeasible to manually supply internal contracts for large modules
Contributions
• Efficient decision procedure for verifying list-based programs
• Verifying and exploiting C type annotations
• Annotation inference for large modules
Reachability predicate: Btwn_f

[Figure: cells linked by next pointers from x to y, illustrating Btwn(next, x, y); other edges in the figure are labeled f and g]
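A membership test for Btwn(next, x, y) can be sketched on an explicit heap. This assumes the inclusive reading suggested by the `\ {null}` subtractions on the "Expressive logic" slide: the set contains every cell on the next-path from x to y, both endpoints included, and is empty when y is not reachable from x. The array encoding, `NIL`, and the name `in_btwn` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NIL (-1)  /* assumed encoding of the null pointer */

/* Returns true iff z is in Btwn(next, x, y). next[i] is the successor
   of cell i, and n is the number of cells in the heap. */
static bool in_btwn(const int *next, int n, int x, int y, int z) {
    bool seen_z = false;
    int cur = x;
    /* A path that reaches y visits at most n cells plus NIL. */
    for (int step = 0; step <= n; step++) {
        if (cur == z) seen_z = true;
        if (cur == y) return seen_z;   /* reached y: z is in the set iff seen */
        if (cur == NIL) return false;  /* fell off the list before reaching y */
        cur = next[cur];
    }
    return false;                      /* cycle that never reaches y */
}

/* Acyclic list 0 -> 1 -> 2 -> NIL. */
static const int demo_list[3] = {1, 2, NIL};
/* Cyclic list 0 -> 1 -> 2 -> 0. */
static const int demo_ring[3] = {1, 2, 0};
```

For example, `in_btwn(demo_ring, 3, 1, 0, 2)` asks whether cell 2 lies in Btwn(next, next(hd), hd) with hd = 0, the cyclic pattern used on the slides.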
Expressive logic
• Express properties of collections
    ∀x ∈ Btwn(next, next(hd), hd). state(x) = LOCKED   //cyclic
• Arithmetic reasoning on data (e.g. sortedness)
    ∀x ∈ Btwn(next, hd, null) \ {null}.
      ∀y ∈ Btwn(next, x, null) \ {null}. d(x) ≤ d(y)
Efficient decision procedure
• Decides the validity of {P} S {Q}
• Worst-case exponential time but works well in practice
• Decision problem is NP-complete
  • Cannot expect any better with propositional logic
  • Retains the complexity of current SMT logics
• Implemented in the Z3 SMT solver
  • Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions…)
Contributions
• Efficient decision procedure for verifying list-based programs
• Verifying and exploiting C type annotations
• Annotation inference for large modules
C language

C types
• Scalars (int, long, char, short)
• Pointers (int*, struct T*, ..)
• Nested structs and unions
• Array (struct T a[10];)
• Function pointers
• void *

Difficult to establish type safety in presence of pointer arithmetic, casts
• Type safety ⇒ (spatial) memory safety
• Important default property to check

Lack of types hurts property checking
• Difficult to disambiguate heap pointers
• Difficult to write concise type invariants
Example: Type Checking

[Figure: two IRP records linked through their ListEntry fields (Flink, Blink); p points to a ListEntry and q to the enclosing IRP]

    q = CONTAINING_RECORD(p, IRP, ListEntry)
      = (IRP*)((char*)p - (ULONG_PTR)&((IRP*)0)->ListEntry)

Type Checker: Does variable q have type IRP*?
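The pointer arithmetic behind CONTAINING_RECORD can be tried out in portable C with `offsetof`. This is a minimal sketch: the `irp` and `list_entry` structures are simplified stand-ins for the Windows types, and `CONTAINING_RECORD_SKETCH`, `irp_of_entry`, and `roundtrip_ok` are hypothetical names chosen to avoid clashing with the real WDK macro.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the structures on the slide. */
typedef struct list_entry {
    struct list_entry *Flink;
    struct list_entry *Blink;
} list_entry;

typedef struct irp {
    int Data1;
    list_entry ListEntry;
    int Data2;
} irp;

/* Subtract the field's byte offset from the field's address to get
   the address of the enclosing record. */
#define CONTAINING_RECORD_SKETCH(ptr, type, field) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* Recovers the enclosing irp from a pointer to its ListEntry field. */
static irp *irp_of_entry(list_entry *p) {
    return CONTAINING_RECORD_SKETCH(p, irp, ListEntry);
}

/* Usage: the round trip yields the original record. */
static int roundtrip_ok(void) {
    irp r = { .Data1 = 1, .Data2 = 2 };
    list_entry *p = &r.ListEntry;
    irp *q = irp_of_entry(p);
    return q == &r && q->Data2 == 2;
}
```

The C compiler accepts the cast without checking that p really points into an irp, which is why the type checker has to answer the question on the slide.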
Example: Property Checking

[Figure: two IRP records, pointed to by q and r, each with fields Data1, ListEntry (Flink, Blink), and Data2]

    ...
    q->Data2 = 42;

Property Checker: Is r->Data1 unchanged?
Example: Property Checking

[Figure: the records reached through q and r, with fields Data1, ListEntry (Flink, Blink), and Data2; one slot is labeled "Data2 / Data1"]

For all we know, Data1 and Data2 could be aliased!
Our Approach
• Implement a type checker in HAVOC
  • Provide formal semantics for C and its types
• Use types to improve the property checker
  • Provide Java-style field disambiguation
• Fully automated using Z3 SMT solver
Formalizing Type Safety

A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type.

    Mem : addr -> value
    Type : addr -> type
    HasType : value x type -> bool

    for all a in addr, HasType(Mem(a), Type(a))
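The Mem/Type/HasType formulation can be exercised on a toy heap. The sketch below assumes a language with only two types, int and pointer-to-int, and a four-cell address space; the definition of `has_type` is an illustrative simplification, not the definition HAVOC uses, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NADDR 4  /* assumed address space: 1..NADDR, with 0 as NULL */

typedef enum { T_INT, T_PTR_INT } type_tag;  /* assumed two-type language */

typedef struct {
    int mem[NADDR + 1];        /* Mem  : addr -> value */
    type_tag type[NADDR + 1];  /* Type : addr -> type  */
} heap;

/* HasType(v, t): any value is an int; a pointer-to-int value is NULL
   or the address of a cell whose type is int. */
static bool has_type(const heap *h, int v, type_tag t) {
    if (t == T_INT) return true;
    return v == 0 || (v >= 1 && v <= NADDR && h->type[v] == T_INT);
}

/* for all a in addr, HasType(Mem(a), Type(a)) */
static bool type_safe(const heap *h) {
    for (int a = 1; a <= NADDR; a++)
        if (!has_type(h, h->mem[a], h->type[a])) return false;
    return true;
}

/* Usage: a well-typed heap is accepted; redirecting a pointer cell to
   another pointer cell is rejected. */
static bool demo_type_safety(void) {
    heap h = {
        .mem  = {0, 7, 9, 1, 0},
        .type = {T_INT, T_INT, T_INT, T_PTR_INT, T_PTR_INT},
    };
    bool safe_before = type_safe(&h);  /* cell 3 holds 1, an int cell */
    h.mem[3] = 4;                      /* cell 3 now points at a pointer cell */
    bool safe_after = type_safe(&h);
    return safe_before && !safe_after;
}
```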
Example

    #define ENCL(x) CONTAINING_RECORD(x, record, node)

    requires( HasType(ENCL(p), record*) && ENCL(p) != NULL )
    void init_record(list *p) {
        record *r = CONTAINING_RECORD(p, record, node);
        r->data2 = 42;
    }

    requires( forall(q, Btwn(next, p, NULL),
                     q != NULL ==> HasType(ENCL(q), record*) && ENCL(q) != NULL) )
    void init_all_records(list *p) {
        while (p != NULL) {
            init_record(p);
            p = p->next;
        }
    }
Decision Procedure
• Translation results in verification conditions that refer to Mem, Type, and HasType
• Can be encoded into an NP-complete logic
  • No worse than SAT solving
• Provide decision procedure using an SMT solver
Experiments
• Implementation supports full C language
  • Supports polymorphism
  • Supports user-defined, dependent types
• Fancier type invariants => slower checking
  • Pay only for what you use!
• Annotated and checked four Windows drivers
  • Sample drivers provided with Windows DDK
  • About 2.3 KLOC total, with 225 annotations
  • Checking time: ~1 minute each
Contributions
• Efficient decision procedure for verifying list-based programs
• Verifying and exploiting C type annotations
• Annotation inference for large modules
Simple property simple invariants

    NT_STATUS Unload(…) {
        ….
        iter = hd->First;
        while (iter != null) {
            RemoveEntryList(iter);
            iter = iter->Next;
            IoDeleteDevice(iter->Self);
        }
        ….
    }

[Diagram: hd --First--> DEV_EXT --Next--> DEV_EXT --Next--> …; each DEV_EXT --Self--> DEV_OBJ and each DEV_OBJ --DevExt--> DEV_EXT]

• ∀x ∈ DevExt. x->Self ∈ MyDevObj
• ∀x ∈ Btwn(Next, hd->First, NULL). x ∈ DevExt
• ∀x ∈ DevExt. x->Self->DevExt = x
Need to simplify the problem

Module
• A set of public/entry procedures
• A set of private/internal procedures

Specs
• Interface specification
• Property assertion

Require the user to provide a module invariant

Harness
    Initialize(..);
    [loop_inv moduleInv]
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);
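The harness with a module invariant at the loop head can be mimicked by an executable test driver. This is a sketch under loud assumptions: the module (a bounded pool with counter `in_use`), its preconditions, and the invariant are invented for illustration, `next_choice` is a small deterministic generator standing in for nondet(), and `[assume pre_i]` is modeled by skipping the call when the precondition is false. A verifier considers every choice sequence; this driver samples one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical module: a bounded pool with two public procedures. */
#define CAPACITY 8
static int in_use;

static void Initialize(void) { in_use = 0; }
static void Public_1(void)   { in_use++; }  /* acquire; pre_1: in_use < CAPACITY */
static void Public_2(void)   { in_use--; }  /* release; pre_2: in_use > 0 */
static void Cleanup(void)    { in_use = 0; }

/* Module invariant, checked where the slide writes [loop_inv moduleInv]. */
static bool module_inv(void) { return 0 <= in_use && in_use <= CAPACITY; }

/* Deterministic stand-in for nondet(): returns 0 or 1. */
static unsigned next_choice(unsigned *state) {
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 1u;
}

/* Runs a bounded number of harness iterations and reports whether the
   module invariant held at every loop head and at loop exit. */
static bool run_harness(unsigned seed, int iterations) {
    Initialize();
    for (int i = 0; i < iterations; i++) {
        if (!module_inv()) return false;
        if (next_choice(&seed) == 0) {
            if (in_use < CAPACITY) Public_1();  /* [assume pre_1] */
        } else {
            if (in_use > 0) Public_2();         /* [assume pre_2] */
        }
    }
    bool ok = module_inv();
    Cleanup();
    return ok;
}
```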
Module invariants
• Module invariants
  • Invariant about all objects of a given type
  • Invariants on global variables
  • Preserved by the public functions
• Low overhead
  • Describe the “steady state” and are therefore succinct
  • Only need to be written at module level
Intra-module inference
• Given module M, interface specs, property and module invariants
  • Infer annotations on internal procedures and loops
  • Use annotations to verify property and module invariant
• Challenges
  • Module invariants are temporarily broken
  • Inference has to be scalable
Module invariant broken

    requires (TypeInvDO)
    ensures (TypeInvDO)
    void publicFoo () {
        PDEV_OBJ do = NewDEV_OBJ();
        privateBar(do);
    }

    requires (TypeInvDOExcept(do))
    requires (TypeInvDO)
    ensures (TypeInvDO)
    void privateBar (PDEV_OBJ do) {
        do->DevExt->Self = do;
    }

    #define TypeInvDO \
        ∀x ∈ MyDevObj. x->DevExt->Self = x

[Diagram: x is a DEV_OBJ with DEV_OBJ --DevExt--> DEV_EXT --Self--> DEV_OBJ]
Houdini algorithm (Flanagan-Leino 01)
• Problem statement
  • Given a set of procedures P1, …, Pn
  • A set C of candidate annotations for each procedure
  • Returns a subset of the candidate annotations such that each procedure satisfies its annotations
  • Also known as “monomial predicate abstraction”
• Algorithm
  • Computes a greatest fixed point starting from all annotations
    • Remove annotations that are violated
  • Requires a quadratic (n * |C|) number of theorem prover calls
  • Uses a modular checker
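The greatest-fixed-point loop can be sketched as follows. Everything concrete here is an assumption for illustration: the real algorithm calls a modular checker backed by a theorem prover, whereas `checker_proves` is a stand-in that proves a candidate when it is locally true and every candidate it relies on is still assumed. Candidate sets are bitmasks, so the sketch assumes fewer than 32 candidates.

```c
#include <assert.h>
#include <stdbool.h>

#define NCAND 4  /* assumed number of candidate annotations */

typedef struct {
    bool locally_true;    /* would hold if all assumptions were granted */
    unsigned relies_on;   /* bitmask of candidates used as assumptions */
} candidate;

/* Stand-in for one modular checker call on candidate i. */
static bool checker_proves(const candidate *c, int i, unsigned assumed) {
    return c[i].locally_true && (c[i].relies_on & ~assumed) == 0;
}

/* Starts from all candidates and drops refuted ones until nothing
   changes. Returns the surviving candidates as a bitmask. */
static unsigned houdini(const candidate *c, int n) {
    unsigned assumed = (1u << n) - 1u;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < n; i++) {
            if ((assumed & (1u << i)) && !checker_proves(c, i, assumed)) {
                assumed &= ~(1u << i);  /* remove the violated annotation */
                changed = true;
            }
        }
    }
    return assumed;
}

static const candidate demo_candidates[NCAND] = {
    { true,  0u      },  /* 0: holds on its own */
    { false, 0u      },  /* 1: refuted outright */
    { true,  1u << 1 },  /* 2: relies on 1, so it falls once 1 is removed */
    { true,  1u << 0 },  /* 3: relies on 0, which survives */
};
```

On the demo data, removing candidate 1 cascades to candidate 2, and the result is the set {0, 3}. Each removal only shrinks the assumed set, which is why the loop terminates.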
Candidate assertions
• Candidate assertions
  • Type-states in module invariants
    • Over parameters, globals, locals and their fields
  • Module invariant exceptions (next slide)
• Conditional annotations
  • Disjunction of above annotations
Module invariant exceptions

    TypeInvDOExcept({do,de->Self},requires)
    TypeInvDOExcept({do,de->Self},ensures)
    void privateBar (PDEV_OBJ do, PDEV_EXT de) {
        …
        do->DevExt->Self = do;
    }

Exceptions come from parameters, return, globals, fields

    #define TypeInvDO \
        ∀x ∈ MyDevObj. x->DevExt->Self = x

    #define TypeInvDOExcept({a,b},ANNOT) \
        ANNOT(∀x ∈ MyDevObj. x = a ∨ x = b ∨ x->DevExt->Self = x) \
        ANNOT(a->DevExt->Self = a) \
        ANNOT(b->DevExt->Self = b)
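The quantified part of TypeInvDOExcept can be written as an executable check over an explicit set of objects. This is a sketch only: the array standing in for MyDevObj, the lowercase struct names, and the functions `type_inv_except` and `demo_invariant_exception` are hypothetical, and the parameter is called `d` because `do` is a reserved word in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dev_ext;
typedef struct dev_obj { struct dev_ext *DevExt; } dev_obj;
typedef struct dev_ext { dev_obj *Self; } dev_ext;

/* Every x in objs other than a and b satisfies x->DevExt->Self == x.
   Passing NULL for both a and b checks the full invariant TypeInvDO. */
static bool type_inv_except(dev_obj *const *objs, size_t n,
                            const dev_obj *a, const dev_obj *b) {
    for (size_t i = 0; i < n; i++) {
        const dev_obj *x = objs[i];
        if (x == a || x == b) continue;  /* excepted object */
        if (x->DevExt == NULL || x->DevExt->Self != x) return false;
    }
    return true;
}

/* The private procedure from the slide: re-establishes the invariant for d. */
static void privateBar(dev_obj *d) { d->DevExt->Self = d; }

/* Usage: the invariant is broken for a fresh object, holds with that
   object excepted, and holds in full after privateBar. */
static bool demo_invariant_exception(void) {
    dev_ext e0, e1;
    dev_obj o0 = { &e0 }, o1 = { &e1 };
    dev_obj *my_dev_obj[] = { &o0, &o1 };
    e0.Self = &o0;
    e1.Self = NULL;                      /* o1 is created but not yet linked */
    bool broken_before = !type_inv_except(my_dev_obj, 2, NULL, NULL);
    bool holds_except  = type_inv_except(my_dev_obj, 2, &o1, NULL);
    privateBar(&o1);
    bool holds_after   = type_inv_except(my_dev_obj, 2, NULL, NULL);
    return broken_before && holds_except && holds_after;
}
```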
Observations
• Able to synthesize most intermediate invariants
  • “close” to the module invariant (simple)
  • readable
• Invariants contain quantifiers, Boolean structure
  • Checking all Boolean combinations is expensive (from NP-complete to PSPACE-complete [CADE’09])
  • Retains scalability of the Houdini inference
Experiments
• Benchmarks
  • 4 device drivers (~7 KLOC each), containing lists and arrays
  • #Internal methods: ~30
  • #Loops: ~20
• Properties
  • double-free, lock-usage
• User provides module invariant
• Tool infers intermediate invariants and modifies clauses
Results
• Verified the properties with 0 false alarms
• Module invariant overhead
  • Number of module invariants: ~5-10
  • Reused across multiple drivers
• Most internal annotations inferred
  • Approx. 1500 inferred annotations per driver
  • Fewer than 5 manual annotations per driver
  • Mostly conditional annotations (e.g. predicated on return value)
• Inference time < 5X of the checking time
Contributions
• Efficient decision procedure for verifying list-based programs
• Verifying and exploiting C type annotations
• Annotation inference for large modules