SMT based predictable analysis of systems code Shuvendu Lahiri Microsoft Research, Redmond Joint work with: S. Qadeer (MSR) J. Condit, B. Hackett, Z. Rakamaric, T. Wies, J. Voung, J. Galeotti
Problem Modular property checking of C modules • Device drivers, file systems, kernel components,… • Double-free, lock usage, memory safety, user-provided assertions • Goal: Predictable analysis using SMT solvers • Efficiently decidable logics
HAVOC • Property checker for C programs • Active [’06-’09] • Found 100+ errors in various kernel components
HAVOC modular checker [Pipeline diagram: C program + annotations → C-to-Boogie translation (memory model) → Boogie program → Boogie VC generation → SMT formula → SMT solver (Z3), with decision procedures for types, lists, arrays → Verified / Warning]
Challenges imposed for analyzing C Additional challenges (over Java/C#) • Lack of type safety • Presence of low-level data structures • Explicit memory management (free) • Bit-wise operations • ……
Example: Type Checking [Figure: two IRP structures linked through their embedded ListEntry fields (Flink, Blink); p points to a ListEntry] typedef struct _LIST_ENTRY { struct _LIST_ENTRY *Flink, *Blink; } LIST_ENTRY, *PLIST_ENTRY; typedef struct _IRP { …. LIST_ENTRY ListEntry; … } IRP, *PIRP;
Example: Type Checking [Figure: the same two IRP structures; p points to a ListEntry, q to the start of the enclosing IRP] q = CONTAINING_RECORD(p, IRP, ListEntry) = (IRP*)((char*)p - (size_t)&((IRP*)0)->ListEntry) Type Checker: Does variable q have type IRP*?
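The pointer arithmetic behind CONTAINING_RECORD can be sketched in portable C as follows. This is a minimal illustration, not the Windows header definition: the struct layouts are simplified stand-ins, `Data1`/`Data2` are placeholder payload fields, and `irp_of_entry` is a hypothetical helper name. The macro is spelled with the standard `offsetof` instead of the null-pointer idiom.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the types on the slide (not the real Windows definitions). */
typedef struct list_entry {
    struct list_entry *Flink, *Blink;
} LIST_ENTRY;

typedef struct irp {
    int Data1;             /* placeholder payload field */
    LIST_ENTRY ListEntry;  /* embedded list node */
    int Data2;             /* placeholder payload field */
} IRP;

/* Step back from the address of a field to the start of the enclosing record. */
#define CONTAINING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

/* Hypothetical helper: returns the IRP that embeds the given list node.
 * The result is only meaningful if p really points at the ListEntry
 * field of an IRP; nothing in the C type system checks that. */
IRP *irp_of_entry(LIST_ENTRY *p) {
    assert(p != NULL);
    return CONTAINING_RECORD(p, IRP, ListEntry);
}
```

The comment on `irp_of_entry` is the point of the slide: the cast always compiles, so whether q really has type IRP* depends on a fact about p that the C compiler never checks.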
Example: Property Checking [Figure: two IRP structures, each with Data1, ListEntry (Flink, Blink) and Data2; q and r each point to an IRP] ... q->Data2 = 42; Property Checker: Is r->Data1 unchanged?
Example: Property Checking [Figure: the same two structures drawn overlapping, so that the Data2 word of q's structure coincides with the Data1 word of r's structure] For all we know, Data1 and Data2 could be aliased!
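A small sketch of why the property checker cannot rule out aliasing without type information. It models the heap as an untyped array of words, with addresses as indices; the field offsets and the names `write_data2`, `read_data1` and `write_clobbers` are assumptions made for this illustration.

```c
#include <assert.h>

/* Assumed word offsets of the fields inside the record. */
enum { OFF_DATA1 = 0, OFF_FLINK = 1, OFF_BLINK = 2, OFF_DATA2 = 3, HEAP_WORDS = 16 };

/* Untyped heap: one int per word, addresses are array indices. */
int Mem[HEAP_WORDS];

/* Models q->Data2 = v. */
void write_data2(int q, int v) { Mem[q + OFF_DATA2] = v; }

/* Models reading r->Data1. */
int read_data1(int r) { return Mem[r + OFF_DATA1]; }

/* Returns 1 if writing q->Data2 changes the value of r->Data1. */
int write_clobbers(int q, int r) {
    int before = read_data1(r);
    write_data2(q, before + 1);
    return read_data1(r) != before;
}
```

With q = 0 and r = 3 the two fields occupy the same word, so the write is visible through r. Only knowing that q and r both point to the start of well-formed records excludes that case.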
Types in C programs • Types in C programs cannot be trusted • Unsafe type casts, pointer arithmetic • Typical type checking in C compilers cannot ensure memory safety • Lack of types hurts property checking • Disambiguation
Simple type-state property • Allocation type-state of DEV_OBJ • Device Objects (DEV_OBJ) allocated and freed • Property to check for a module • IoDeleteDevice() only called on MyDevObj [State diagram: ~MyDevObj --IoCreateDevice()--> MyDevObj --IoDeleteDevice()--> ~MyDevObj]
Simple property simple invariants NT_STATUS Unload(…){ …. iter = hd->First; while(iter != null) { RemoveEntryList(iter); iter = iter->Next; IoDeleteDevice(iter->Self); } …. } [Figure: hd->First points to a list of DEV_EXT nodes linked by Next; each DEV_EXT's Self field points to a DEV_OBJ, whose DevExt field points back to the DEV_EXT] Pointers from the list point to distinct objects
Lists • Prevalent in most systems code • Manipulated by explicit pointer operations • Updates to next fields
This talk • Focus on two of these challenges • Lack of type-safety • Presence of low-level data structures • Solution • New efficient SMT theories for the above problems
Overview • Motivation • Background • Exploiting types [POPL’09] • Logic for lists [POPL’08] • Application [CAV’09]
Program Correctness: Floyd-Hoare Triple • Floyd-Hoare triple {P} S {Q} P, Q : predicates/property S : a program • From a state satisfying P, if S executes, • No assertion in S fails, and • Terminating executions end up in a state satisfying Q
Program verification Formula { b.f = 5 } a.f = 5 { a.f + b.f = 10 } is valid iff (Select(f1,b) = 5 ∧ f2 = Store(f1,a,5)) ⇒ Select(f2,a) + Select(f2,b) = 10 is valid • theory of equality: f, = • theory of arithmetic: 5, 10, + • theory of arrays: Select, Store • [Nelson & Oppen ’79]
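To make the verification condition concrete, here is a sketch that evaluates it by brute force over a tiny domain. An SMT solver proves validity symbolically for all maps; this only tests 27 small maps, and `FieldMap`, `vc_holds` and `vc_valid_on_small_domain` are names invented for the illustration.

```c
#include <assert.h>

enum { N_OBJ = 3 };  /* assumed number of objects in the toy domain */

/* A field map f: object -> value, as in the theory of arrays. */
typedef struct { int val[N_OBJ]; } FieldMap;

int Select(const FieldMap *f, int obj) { return f->val[obj]; }

/* Store returns an updated copy; the argument map is passed by value and left unchanged. */
FieldMap Store(FieldMap f, int obj, int v) { f.val[obj] = v; return f; }

/* Evaluates (Select(f1,b) = 5 and f2 = Store(f1,a,5)) => Select(f2,a) + Select(f2,b) = 10. */
int vc_holds(const FieldMap *f1, int a, int b) {
    if (Select(f1, b) != 5) return 1;  /* precondition false: implication holds */
    FieldMap f2 = Store(*f1, a, 5);
    return Select(&f2, a) + Select(&f2, b) == 10;
}

/* Checks the formula for every a, b and every map with values drawn from {4, 5, 6}. */
int vc_valid_on_small_domain(void) {
    static const int vals[] = {4, 5, 6};
    for (int code = 0; code < 27; code++) {   /* 3^N_OBJ maps */
        FieldMap f1;
        int c = code;
        for (int i = 0; i < N_OBJ; i++) { f1.val[i] = vals[c % 3]; c /= 3; }
        for (int a = 0; a < N_OBJ; a++)
            for (int b = 0; b < N_OBJ; b++)
                if (!vc_holds(&f1, a, b)) return 0;
    }
    return 1;
}
```

The case a == b is covered too: the store then overwrites b's field with 5, which the precondition already required.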
Satisfiability-Modulo-Theory (SMT) • Boolean satisfiability solving + theory reasoning • Ground theories • Equality, arithmetic, arrays, bit-vectors, …. • Powerful methods to combine decision procedures for theories • [Nelson & Oppen ’79] • Phenomenal progress in the past few years • Z3, MathSAT, Yices, …. • Works best for NP-complete theories
Overview • Motivation • Background • Exploiting types • Logic for lists • Case study
Memory model for C • Each pointer is an integer • Heap as a map // Mutable Mem: int -> int Alloc: int -> {UNALLOCATED, ALLOCATED, FREED} // Immutable Base: int -> int // base address of each pointer
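The three maps can be mimicked in a toy executable model. This is a sketch under strong assumptions and not HAVOC's encoding: the heap is pre-partitioned into fixed-size blocks so that Base stays a fixed function, and `model_malloc`, `model_free` and `can_access` are hypothetical names.

```c
#include <assert.h>

enum { BLOCK = 8, N_BLOCKS = 8, HEAP_SIZE = BLOCK * N_BLOCKS };

typedef enum { UNALLOCATED, ALLOCATED, FREED } AllocState;

int Mem[HEAP_SIZE];           /* mutable: contents of each address */
AllocState Alloc[HEAP_SIZE];  /* mutable: state, indexed by base address; starts UNALLOCATED */

/* Immutable: base address of the block containing p (assumed fixed block layout). */
int Base(int p) { return p - p % BLOCK; }

/* Returns the base address of a fresh block, or 0 if none is left.
 * Block 0 is reserved so that address 0 can play the role of NULL. */
int model_malloc(void) {
    for (int b = BLOCK; b < HEAP_SIZE; b += BLOCK)
        if (Alloc[b] == UNALLOCATED) { Alloc[b] = ALLOCATED; return b; }
    return 0;
}

/* Returns 1 on success and 0 when the call would be an error:
 * freeing an interior pointer, a never-allocated block, or a double free. */
int model_free(int p) {
    if (p <= 0 || p >= HEAP_SIZE) return 0;
    if (Base(p) != p || Alloc[p] != ALLOCATED) return 0;
    Alloc[p] = FREED;
    return 1;
}

/* A dereference of p is safe only while its enclosing block is ALLOCATED. */
int can_access(int p) {
    return p > 0 && p < HEAP_SIZE && Alloc[Base(p)] == ALLOCATED;
}
```

Properties such as double free then become plain assertions over Alloc and Base, which is the form a property checker needs.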
C → Boogie
C: typedef struct { int g[10]; int f; } DATA; DATA *create() { int a; DATA *d = (DATA*) malloc(sizeof(DATA)); init(d->g, 10, &a); d->f = a; d->g[1] = 2; return d; }
Boogie: function f_DATA: int -> int; forall u: int :: f_DATA(u) = u + 40; procedure create() returns d: int { var @a: int; @a := malloc(4); d := call malloc(44); call init(g_DATA(d), 10, @a); Mem[f_DATA(d)] := Mem[@a]; Mem[g_DATA(d) + 1*4] := 2; free(@a); return; }
Missing part: Types? • Types in C programs can’t be trusted • Lack of types hurts property checking
Our Approach • [POPL’09] • Type checking → assertion checking • Provide formal semantics for C and its types • Use types to improve the property checker • Provide Java-style field disambiguation • Provide decision procedures for the assertion checking
Formalizing Type Safety A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type. Mem : addr -> value Type : addr -> type HasType : value x type -> bool for all a in addr, HasType(Mem(a), Type(a))
Modeling the Heap • Gives value stored at each heap location • Values are integers • Gives declared type for each heap location • Types include Int, Ptr(Int), … Mem : addr -> value Type : addr -> type
“Match” Predicate Match: addr x type -> bool • Lifts the Type map to multi-word types • Match(a, t) holds iff Type[a … n] matches t
C type and HAVOC axiom:
int: Match(a, Int) <==> Type[a] == Int
int*: Match(a, Ptr(Int)) <==> Type[a] == Ptr(Int)
struct foo { int n; int m; int *p; }: Match(a, Foo) <==> Match(a, Int) && Match(a+1, Int) && Match(a+2, Ptr(Int))
Example Type map: … 99: Int, 100: Int, 101: Ptr(Int), 102: Int, 103: Ptr(Foo) …
Then: Match(99, Foo), Match(99, Int), Match(101, Ptr(Int)), ¬Match(101, Foo)
“HasType” Predicate HasType: value x type -> bool • Defines which values belong to each type • HasType(v, t) holds iff v is a value of type t
C type and HAVOC axiom:
int: HasType(v, Int) <==> true
t*: HasType(v, Ptr(t)) <==> v == 0 || (v > 0 && Match(v, t))
Example Type map: … 99: Int, 100: Int, 101: Ptr(Int), 102: Int, 103: Ptr(Foo) …
Then: HasType(99, Ptr(Foo)), ¬HasType(101, Ptr(Foo))
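The Match and HasType axioms can be executed directly on the five-address example Type map (99 to 103). The sketch below does that; the enum encoding of types and the `T_NONE` result for addresses outside the example range are assumptions of this illustration.

```c
#include <assert.h>

/* Types of the example: Int, Ptr(Int), Foo, Ptr(Foo); T_NONE marks unknown addresses. */
typedef enum { T_INT, T_PTR_INT, T_FOO, T_PTR_FOO, T_NONE } Ty;

enum { LO = 99, HI = 103 };

/* Example Type map: 99: Int, 100: Int, 101: Ptr(Int), 102: Int, 103: Ptr(Foo). */
static const Ty type_tab[] = { T_INT, T_INT, T_PTR_INT, T_INT, T_PTR_FOO };

Ty Type(int a) { return (a >= LO && a <= HI) ? type_tab[a - LO] : T_NONE; }

/* Match(a, t): the declared types starting at address a spell out t. */
int Match(int a, Ty t) {
    switch (t) {
    case T_INT:
    case T_PTR_INT:
    case T_PTR_FOO:
        return Type(a) == t;  /* single-word types */
    case T_FOO:               /* struct foo { int n; int m; int *p; } */
        return Match(a, T_INT) && Match(a + 1, T_INT) && Match(a + 2, T_PTR_INT);
    default:
        return 0;
    }
}

/* HasType(v, t): v is a legal value of type t. */
int HasType(int v, Ty t) {
    switch (t) {
    case T_INT:     return 1;  /* every word is a valid int */
    case T_PTR_INT: return v == 0 || (v > 0 && Match(v, T_INT));
    case T_PTR_FOO: return v == 0 || (v > 0 && Match(v, T_FOO));
    default:        return 0;
    }
}
```

Note that 99 is both a valid Ptr(Foo) and a valid Ptr(Int), since the first word of a Foo is an Int.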
Type Safety Invariant • Part of preconditions, postconditions, loop invariants • Assert at every program point • Add similar assertions for locals (if desired) for all a in addr, HasType(Mem(a), Type(a))
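On a finite toy heap the invariant can be checked by a loop over all addresses. The sketch below assumes a universe with only Int and Ptr(Int) and an 8-word heap; `type_safe` and `setup_example` are hypothetical names. The real checker asserts the quantified formula symbolically instead of enumerating addresses.

```c
#include <assert.h>

typedef enum { T_INT, T_PTR_INT } Ty;

enum { HEAP = 8 };  /* assumed heap size; address 0 plays the role of NULL */

int Mem[HEAP];   /* value stored at each address */
Ty  Type[HEAP];  /* declared type of each address */

/* HasType for the two-type universe of this sketch. */
int HasType(int v, Ty t) {
    if (t == T_INT) return 1;
    /* Ptr(Int): null, or an in-range address whose declared type is Int. */
    return v == 0 || (v > 0 && v < HEAP && Type[v] == T_INT);
}

/* The invariant: for all a in addr, HasType(Mem(a), Type(a)). */
int type_safe(void) {
    for (int a = 0; a < HEAP; a++)
        if (!HasType(Mem[a], Type[a])) return 0;
    return 1;
}

/* All cells are Int and hold 0, except cell 3, a Ptr(Int) pointing to cell 5. */
void setup_example(void) {
    for (int a = 0; a < HEAP; a++) { Mem[a] = 0; Type[a] = T_INT; }
    Type[3] = T_PTR_INT;
    Mem[3] = 5;
}
```

Writing any integer into an Int cell preserves the invariant, while storing the address of a pointer cell into a Ptr(Int) cell breaks it.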
Decision Procedure • Verification conditions refer to Mem, Type, Match, HasType, Type-safety invariant • Decision problem: NP-complete • Provide decision procedure using an SMT solver • Suffices to instantiate the quantifiers in these axioms on a fixed set of terms
Example: Type Checking [Figure: two IRP structures linked through their embedded ListEntry fields; p points to a ListEntry, q to the start of the enclosing IRP] q = CONTAINING_RECORD(p, IRP, ListEntry) = (IRP*)((char*)p - (size_t)&((IRP*)0)->ListEntry) Type Checker: Does variable q have type IRP*?
Solution: Add Preconditions
#define ENCL(x) CONTAINING_RECORD(x, record, node)
requires( HasType(ENCL(p), record*) && ENCL(p) != NULL )
void init_record(list *p) { record *r = CONTAINING_RECORD(p, record, node); r->data2 = 42; }
Field Safety Invariant • Field safety • Refinement of type safety • Disambiguate two fields of same type • Change • HasType/Match are refined to distinguish different field names of same type
Adding Field Names struct list { list *prev; list *next; } struct record { int data1; list node; int data2; } Match(a, List) <==> Match(a, Ptr(List)) && Match(a+1, Ptr(List)) Match(a, Record) <==> Match(a, int) && Match(a+1, List) && Match(a+3, int) Match(a, Ptr(List)) <==> Type[a] == Ptr(List) HasType(v, Ptr(List))<==> v == 0 || (v > 0 && Match(v, List)) Match(a, int) <==> Type[a] == int HasType(v, int) <==> true same definition as Int … same for Next and Data2 …
Adding Field Names struct list { list *prev; list *next; } struct record { int data1; list node; int data2; } Match(a, List) <==> Match(a, Prev) && Match(a+1, Next) Match(a, Record) <==> Match(a, Data1) && Match(a+1, List) && Match(a+3, Data2) Match(a, Prev) <==> Type[a] == Prev HasType(v, Prev) <==> v == 0 || (v > 0 && Match(v, List)) Match(a, Data1) <==> Type[a] == Data1 HasType(v, Data1) <==> true same definition as Int … same for Next and Data2 …
Experiments • Implementation supports full C language • Supports polymorphism • Supports user-defined, dependent types • Annotated and checked four Windows drivers • Sample drivers provided with Windows DDK
Enables field splitting • Disambiguates writes to fields + faster checking • Can split the heap for “field-safe” programs • One heap map per word-type field and pointer type (almost!) Mem_f : addr -> val Mem_g : addr -> val Mem_T* : addr -> val • Simple example • C code x->f = 1; • Boogie code Mem_f[x + Offset(f)] := 1;
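A sketch of what the split buys: with one map per field, a write to f cannot change any g value, whatever the addresses are, so the prover needs no aliasing argument. The offsets and the accessor names below are assumptions for the illustration.

```c
#include <assert.h>

enum { HEAP = 16, OFF_F = 0, OFF_G = 1 };  /* assumed word offsets of fields f and g */

/* One heap map per field instead of a single Mem. */
int Mem_f[HEAP];
int Mem_g[HEAP];

/* x->f = v   becomes   Mem_f[x + Offset(f)] := v */
void write_f(int x, int v) { Mem_f[x + OFF_F] = v; }

/* x->g = v   becomes   Mem_g[x + Offset(g)] := v */
void write_g(int x, int v) { Mem_g[x + OFF_G] = v; }

int read_f(int x) { return Mem_f[x + OFF_F]; }
int read_g(int x) { return Mem_g[x + OFF_G]; }
```

With x = 4 and y = 5, the addresses x + Offset(g) and y + Offset(f) are both 5. In a single Mem map a write to y->f would overwrite x->g; in the split model it does not. The split is sound only for programs where that overlap cannot occur, which is what field safety establishes.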
Why almost? struct A { int a; int b; }; struct B { int c; int d; int e; }; void P(struct B *x) { struct A *y = (struct A*) x; y->a = 1; assert(x->c == 1); } Field safety assertion will fail • Have to merge {a, c} and {b, d}
Summary • Types as an additional part of the state • Type safety checking → assertion checking • Efficiently decidable (NP) logic • Separation of concerns for property checking • Can exploit field disambiguation for “field-safe” programs
Overview • Motivation • Background • Exploiting types • Logic for lists • Case study
Logic for lists • SMT theory with new predicate symbols
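Reachability predicate: Btwn_f [Figure: a doubly linked list of cells with next, prev and data fields; x is an earlier cell and y a later one] Btwn_next(x, y): the cells from x to y along next. Btwn_prev(y, x): the cells from y to x along prev.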
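One possible executable reading of the predicate is sketched below. It is an assumption of this illustration, not the paper's definition: Btwn_next(x, y) is taken to be the set of cells visited when walking next from x up to and including the first occurrence of y, and the empty set when y is not reachable from x. Cells are array indices and 0 stands for null.

```c
#include <assert.h>

enum { N = 6, NIL = 0 };  /* cells 1..5; index 0 stands for null */

/* Returns 1 iff z is in Btwn_next(x, y) under the reading described above.
 * next[c] is the successor of cell c and must be an index in 0..N-1. */
int in_btwn(const int next[N], int x, int y, int z) {
    int seen_z = 0;
    int cur = x;
    /* N iterations examine up to N cells, enough to find y if it is reachable. */
    for (int steps = 0; steps < N; steps++) {
        if (cur == z) seen_z = 1;
        if (cur == y) return seen_z;  /* reached y: z is in the set iff already seen */
        if (cur == NIL) return 0;     /* fell off the list before reaching y */
        cur = next[cur];
    }
    return 0;  /* stuck in a cycle that does not contain y */
}
```

For the list 1 → 2 → 3 → 4 → null, cell 2 lies between 1 and 3, cell 4 does not, and nothing lies between 3 and 1 because 1 is not reachable from 3.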
Inverse of a function: f⁻¹ [Figure: the same list; the data fields of cells x and y both point to w] data⁻¹(w) = {x, y}
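On a finite heap the inverse image is a simple scan. The sketch below uses an array-indexed heap with 0 as null; `inverse_image` is a hypothetical helper name.

```c
#include <assert.h>

enum { N = 6 };  /* cells 1..5; index 0 stands for null and has no fields */

/* Computes data^-1(w): writes every cell c with data[c] == w into out,
 * in increasing order, and returns how many were found. */
int inverse_image(const int data[N], int w, int out[N]) {
    int count = 0;
    for (int c = 1; c < N; c++)
        if (data[c] == w) out[count++] = c;
    return count;
}
```

If cells 1 and 3 both have their data field pointing to cell 5, then data⁻¹(5) = {1, 3}. Requiring such a set to be a singleton is how the logic states that no two cells share a target.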
Expressive logic • Express properties of collections ∀x ∈ Btwn_f(f(hd), hd). state(x) = LOCKED //cyclic • Arithmetic reasoning on data (e.g. sortedness) ∀x ∈ Btwn_f(hd, null) \ {null}. ∀y ∈ Btwn_f(x, null) \ {null}. d(x) ≤ d(y) • Type/object invariants ∀x ∈ Type⁻¹(“__logentry”). logtype(x) > 0 file_name(x) != null
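The sortedness formula, ∀x ∈ Btwn_f(hd, null) \ {null}. ∀y ∈ Btwn_f(x, null) \ {null}. d(x) ≤ d(y), can be evaluated on a concrete list with two nested walks. The sketch assumes that Btwn_f(hd, null) is empty when null is not reachable from hd, so the formula then holds vacuously; `reaches_nil` and `is_sorted` are names invented here.

```c
#include <assert.h>

enum { N = 6, NIL = 0 };  /* cells 1..5; index 0 stands for null */

/* Returns 1 iff null is reachable from x by following next (bounded walk). */
int reaches_nil(const int next[N], int x) {
    for (int steps = 0; steps < N; steps++) {
        if (x == NIL) return 1;
        x = next[x];
    }
    return 0;
}

/* Evaluates the sortedness formula for the list starting at hd. */
int is_sorted(const int next[N], const int d[N], int hd) {
    if (!reaches_nil(next, hd)) return 1;  /* assumed: empty set, holds vacuously */
    for (int x = hd; x != NIL; x = next[x])      /* x ranges over Btwn(hd, null) \ {null} */
        for (int y = x; y != NIL; y = next[y])   /* y ranges over Btwn(x, null) \ {null} */
            if (d[x] > d[y]) return 0;
    return 1;
}
```

The inner walk starts at x itself, so y can equal x. That is why the comparison is non-strict.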
Can express desired invariants NT_STATUS Unload(…){ …. iter = hd->First; while(iter != null) { RemoveEntryList(iter); iter = iter->Next; IoDeleteDevice(iter->Self); } …. } [Figure: hd->First points to a list of DEV_EXT nodes linked by Next; each DEV_EXT's Self field points to a DEV_OBJ, whose DevExt field points back to the DEV_EXT] • ∀x ∈ Btwn_Next(hd->First, NULL). x->Self->DevExt = x OR • ∀x ∈ Btwn_Next(hd->First, NULL). Self⁻¹(x->Self) = {&x->Self}
Precise and efficient • [POPL ‘08] • Precision • Given a Floyd-Hoare triple {P} S {Q}, • P/Q are in the assertion logic, and S is a loop-free, call-free code fragment • There is a formula in the assertion logic • Linear in the size of the triple • Valid iff the triple holds • Efficiency • The decision problem is NP-complete
Ground Logic / Logic
t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t)
G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G
S ∈ Set ::= f⁻¹(t) | Btwn_f(t1, t2)
F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F
Ground decision procedure • Provide a set of 10 rewrite rules for Btwn_f • Sound, complete and terminating • E.g. Transitivity3: from t1 ∈ Btwn_f(t0, t2) and t ∈ Btwn_f(t0, t1), derive t ∈ Btwn_f(t0, t2) and t1 ∈ Btwn_f(t, t2)
Logic: bounded quantification over interpreted sets
t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t)
G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G
S ∈ Set ::= f⁻¹(t) | Btwn_f(t1, t2)
F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F