550 likes | 693 Views
Refinements to techniques for verifying shape analysis invariants in Coq. Kenneth Roe GBO Presentation 9/30/2013 The Johns Hopkins University. Summary. Formal Methods for Imperative Languages Such as C Many bugs caused by corruption of data structures
E N D
Refinements to techniques for verifying shape analysis invariants in Coq Kenneth Roe GBO Presentation 9/30/2013 The Johns Hopkins University
Summary • Formal Methods for Imperative Languages Such as C • Many bugs caused by corruption of data structures • Use formal methods to document data structure invariants and then verify correct program execution • Framework being developed in the Coq interactive theorem prover
Research contribution • Extension of Coq based program verification to larger programs with more complex data structures. • Existing systems only work on small examples
A tree traversal example in C Struct list { t = NULL; struct list *fld_n; } else { struct tree *fld_t; list *tmp = i->n; }; t = i->fld_t; free(l); Struct tree { i = tmp; struct tree *fld_l, *fld_r; } int value; } else if (t->r==NULL) { }; t = t->fld_l; } else if (t->l==NULL) { struct list *p; t = t->fld_r; void build_pre_order(struct tree *r) { } else { struct list *i = NULL, *n, *x; n = i; struct tree *t = r; i = malloc( p = NULL; sizeof(struct list)); while (t) { i->fld_n = n; n = p; x = t->fld_r; p = malloc(sizeof(struct list)); i->fld_t = x; p->fld_t = t; t = t->fld_l; p->fld_n = n; } if (t->fld_l==NULL && t->fld_r==NULL) { } if (i==NULL) { }
What this program does r p i t 10 • 1 • Nil • Nil 12 18 • 2 • 5 16 20 14 • 3 • 4 • 6
What this program does r p i t 10 • 1 • 10 • Nil 12 18 • 2 • 5 • Nil 16 20 14 • 3 • 4 • 6
What this program does r p i 10 • 1 • 10 • 18 t 12 18 • 2 • 5 • Nil • Nil 16 20 14 • 3 • 4 • 6
What this program does r p i 10 • 1 • 12 • 18 12 18 t • 2 • 5 • 10 • Nil 16 20 14 • 3 • 4 • 6 • Nil
What this program does r p i 10 • 1 • 12 • 16 12 18 • 2 • 5 • 10 • 18 16 20 14 t • 3 • 4 • 6 • Nil • Nil
What this program does r p i 10 • 1 • 14 • 16 12 18 • 2 • 5 • 12 • 18 16 20 t 14 • 3 • 4 • 6 • 10 • Nil • Nil
What this program does r p i 10 • 1 • 14 • 18 12 18 • 2 • 5 • 12 • Nil 16 20 14 • 3 • 4 • 6 • 10 • Nil t
What this program does r p i 10 • 1 • 16 • 18 12 18 • 2 • 5 • 14 • Nil 16 20 14 • 3 • 4 • 6 • 12 • 10 t • Nil
What this program does r p i t 10 • 1 • 16 • Nil 12 18 • 2 • 5 • 14 16 20 14 • 3 • 4 • 6 • 12 • 10 • Nil
What this program does r p i t 10 • 1 • 18 • Nil 12 18 • 2 • 5 • 16 16 20 14 • 3 • 4 • 6 • 14 • 12 • 10 • Nil
What this program does r p i 10 • 1 • 18 • Nil 12 18 • 2 • 5 • 16 16 20 14 • 3 • 4 • 6 • 14 • 12 t • 10 • Nil
What this program does r p i 10 • 1 • 20 • Nil 12 18 • 2 • 5 • 18 16 20 14 • 3 • 4 • 6 • 16 • 14 t • 12 • 10 • Nil
What this program does r p i 10 • 1 • 20 • Nil 12 18 • 2 • 5 • 18 16 20 14 • 3 • 4 • 6 • 16 • 14 t • 12 • 10 • Nil
Invariants to be formally proven • The program maintains two well formed linked lists, the heads of which are pointed to by iand p. • By well formed we mean that memory on the heap is properly allocated for the lists and there are no loops in the data structures. • The program maintains a well formed tree pointed to by r. • t always points to an element in the tree rooted at r. • The two lists and the tree do not share any nodes. • Other than the memory used for the two lists and the tree, no other heap memory is allocated. • The fld_tfield of every element in both list structures points to an element in the tree.
Coq Goal { ? } WHILE not (T == 0) DO N := P; NEW P, 2; ... { ? }
Program state h = {10 → 12, 11 → 18, 12 → 14, 13 → 16, 14 → 0, 15 → 0, 16 → 0, 17 → 0, 18 → 20, 19 → 0, 20 → 0, 21 → 0, …} Environment R=10 I=30 P=40 T=10 Heap e = { R → 10, I → 20, P → 30, T → 10 }
Program state h = {10 → 12, 11 → 18, 12 → 14, 13 → 16, 14 → 0, 15 → 0, 16 → 0, 17 → 0, 18 → 20, 19 → 0, 20 → 0, 21 → 0} Environment R=10 I=30 P=40 T=10 Heap v0=[10, [12 ,[14,[0],[0]], [16,[0],[0]]], [18, [20,[0],[0]], [0]]] X e = { R → 10, I → 20, P → 30, T → 10 } struct tree { struct tree *left; struct tree * right; } ∃v0 . (e,h) ⊨ TREE(R,v0,2,[0,1])
Separation logic h={10 → 12,11 → 18,12 → 14,13 → 16,14 → 0,15 → 0,16 → 0,17 → 0,18 → 20, 19→ 0,20 → 0,21 → 0,30 → 32,31 → 10,32 → 0,33 → 12,40 → 42,41 → 14, 42 → 44,43 → 12,44 → 0,45 → 10} (e,h) ⊨ ∃v0 v1 v2 TREE(R,v0,2,[0,1]) * TREE(I,v1,2[0]) * TREE(P,v2,2,[0])
Separation logic (e, h) ⊨ s1 ∗s2 if and only if ∃h′ , h′′ . (e,h′)⊨ s1 ⋀ (e,h′′)⊨ s2 ⋀ dom(h1)∩dom(h2)=∅ ⋀ h=h′ ∪ h′′
Data structure relationships (e,h) ⊨ ∃v0 v1 v2TREE(R,v0,2,[0,1]) * TREE(I,v1,2,[0]) * TREE(P,v2,2,[0])
Data structure relationships (e,h) ⊨ ∃v0 v1 v2TREE(R,v0,2,[0,1]) * TREE(I,v1,2[0]) * TREE(P,v2,2,[0]) * ∀ v3 ∈ TreeRecords(v1). [nth(find(v1,v3),2) inTree v0]
Data structure relationships (e,h) ⊨ ∃v0 v1 v2TREE(R,v0,2,[0,1]) * TREE(I,v1,2[0]) * TREE(P,v2,2,[0]) * ∀ v3 ∈ TreeRecords(v1). [nth(find(v1,v3),2) inTree v0] * ∀ v3 ∈ TreeRecords(v2). [nth(find(v2,v3),2) inTree v0]
Data structure relationships T → (e,h) ⊨ ∃v0 v1 v2TREE(R,v0,2,[0,1]) * TREE(I,v1,2[0]) * TREE(P,v2,2,[0]) * ∀ v3 ∈ TreeRecords(v1). [nth(find(v1,v3),2) inTree v0] * ∀ v3 ∈ TreeRecords(v2). [nth(find(v2,v3),2) inTree v0] * [T = 0∨T inTree v0]
Deep model • Process • Create data structure for predicates • Write semantic interpretation function • Write customized tactics • Advantage: greater flexibility in designing tactics • Tactics can be any function that transforms the data structure • Tactic is proven correct once and used for all verifications
Summary of tactics • Forward propagation • Fold/unfold • Merge • Works by pairing off identical pieces • Simplify • State implications • Also works by pairing off
Results (so far) • Tree traversal • Code size: ~30 lines • Invariant size: ~10 lines • Proof check time: ~5 minutes • Main proof size: ~220 lines • Status: top level complete, lemmas need to be proven • DPLL (A decision procedure for sentential satisfiability) • Code size: ~200 lines • Invariant size: ~52 lines • Status: Proof incomplete
Research contributions • Extension of Coq separation logic reasoning to larger programs with more complex data structures. • Creation of a library of useful predicates, functions and tactics • Deep model allows greater control over the design of tactics • Key challenge: Performance tuning • Tradeoffs between performance and automation • Development of a powerful simplification tactic • Simplification tactic executed after every major proof step • Based on term rewriting (with contextual rewriting) concepts • Automates reasoning about associativity, communtivity and other simple property classes • Design decisions in creating canonical form addressed
Proposed work for PhD • Finish DPLL verification • Prove all underlying theorems • Create improved presentation framework for the environment
Deep model • Consider proving: a+c = a+b+c−b
Deep Model type expr= | Constint | Var id | Plus expr×expr | Minus expr×expr | Times expr×expr Fixpointeval (env : id → nat) (e : expr) := match e with |Const c =⇒ c |Var v =⇒ env v |Plus e1 e2 =⇒ (evalenv e1) + (evalenv e2) |Minus e1 e2 =⇒ (evalenv e1) − (evalenv e2) |Times e1 e2 =⇒ (evalenv e1) ∗ (evalenv e2) Definition simplify e := … Theorem ∀env.∀e. evalenv (simplify e) = evalenv e
Deep Model Prove the following: Plus a b = simplify (Plus a (Plus b (Plus (Minus c b)))
Deep Model • Parameterized predicate data types type expr= | Constint | Varid | Fun id × list expr
Verification of initialization {∃v0 .TREE(R, v0 , 2, [0,1])} T := R;I := 0;P := 0; {∃v0 .∃v1 .∃v2 . TREE(R, v0 , 2, [0,1])∗ TREE(I, v1 , 2, [0])∗ TREE(P, v2 , 2, [0])∗ ∀ v3 ∈ TreeRecords(v1).[nth(find(v1,v3),2) inTree v0]∗ ∀ v3 ∈ TreeRecords(v2).[nth(find(v2,v3),2) inTree v0]∗ [T = 0∨T inTree v0]}
Verification of initialization {∃v0 .TREE(R, v0 , 2, [0,1])} T := R;I := 0;P := 0; {?1234}
Verification of initialization {∃v0 .TREE(R, v0 , 2, [0,1])} T := R;I := 0;P := 0; {?1234}
Verification of initialization {∃v0 .TREE(R, v0 , 2, [0,1])* [T=R]} I := 0;P := 0; {?1234}
Verification of initialization {∃v0 .TREE(R, v0 , 2, [0,1])* [T=R]} I := 0;P := 0; {?1234}
Verification of initialization {∃v0 .TREE(R, v0 , 2, [0,1])* [T=R] *[I = 0]}P := 0; {?1234}
Verification of initialization {∃v0 .TREE(R, v0 , 2, [0,1])* [T=R] * [I = 0]}P := 0; {?1234}
Verification of initialization ∃v0 .TREE(R, v0 , 2, [0,1])* [T=R] * [I = 0]* [P = 0] -> ?1234
Verification of initialization ∃v0 .TREE(R, v0 , 2, [0,1])* [T=R] * [I = 0]* [P = 0] -> ?1234 ?1234 = ∃v0 .TREE(R, v0 , 2, [0,1]) * [T=R] * [I = 0] * [P = 0]
Verification of initialization ?1234 →∃v0 .∃v1 .∃v2 . TREE(R, v0 , 2, [0,1])∗ TREE(I, v1 , 2, [0])∗ TREE(P, v2 , 2, [0])∗∀ v3 ∈ TreeRecords(v1). [nth(find(v1,v3),2) inTree v0]∗ ∀ v3 ∈ TreeRecords(v2). [nth(find(v2,v3),2) inTree v0]∗ [T = 0 ∨ T inTree v0]
Verification of initialization ∃v0 .TREE(R, v0 , 2, [0,1]) * [T=R] * [I = 0] * [P = 0] →∃v0 .∃v1 .∃v2 . TREE(R, v0 , 2, [0,1])∗ TREE(I, v1 , 2, [0])∗ TREE(P, v2 , 2, [0])∗∀ v3 ∈ TreeRecords(v1). [nth(find(v1,v3),2) inTree v0]∗ ∀ v3 ∈ TreeRecords(v2). [nth(find(v2,v3),2) inTree v0]∗ [T = 0 ∨ T inTree v0]
Unfold example {∃v0∃v1∃v2[Tmp l = 0]∗[l /= 0]∗[tmp r = 0]∗ [Tmp r = 0 ∨ Tmp r ∈ TreeRecords(v0)]∗ [nth(nth(find(v0,T)),2),0) = (Tmp r)]∗ [nth(nth(find(v0 , T )), 1), 0) = 0]∗ [T ∈ TreeRecords(v0)]∗ P + 0 → N ∗ P + 1 → T ∗ [T /= 0]∗ TREE(R, v0 , 2, [0,1]) ∗ TREE(I, v1 , 2, [0])* TREE(N, v2 , 2, [0])∗ ∀ v3 ∈ TreeRecords(v1). [nth(find(v1,v3),2) inTree v0]∗ ∀ v3 ∈ TreeRecords(v2). [nth(find(v2,v3),2) inTree v0]} ∗ T := ∗(I+1); … {?1234}
Unfold example ∃v0 ∃v1 ∃v2 ∃v3 ∃v4 I + 1 → v1 ∗ I → nth(v0, 0) ∗ TREE(nth(v0, 0), nth([I, v0, v1], 1), 2, [0]) [Tmp r = 0 ∨ Tmp r ∈ TreeRecords(v0)]∗ [nth(nth(find(v2,T)),2),0) = (Tmp r)]∗ [nth(nth(find(v2 , T )), 1), 0) = 0]∗ [T ∈ TreeRecords(v2)]∗ P+0→N∗P+1→ T∗[T /= 0]∗ TREE(I, v1 , 2, [0]) * [I + 1 → v1 ∗ I → nth(v0, 0)] ∗ TREE(nth(v0, 0), nth([I, v0, v1], 1), 2, [0]) ∗ TREE(R, v2 , 2, [0,1]) ∗ Empty * TREE(N,v4,2,[0]) * ∀ v5 ∈ TreeRecords([I,v0,v1]). [nth(find([I,v0,v1)]),v5),2) inTree v2]∗ ∀ v5 ∈ TreeRecords(v4). [nth(find(v4,v5),2) inTree v2]} T := ∗(I+1); ...{?1234}