Recursive Proofs for Inductive Tree Data-Structures. Xiaokang Qiu, with P. Madhusudan and Andrei Stefanescu, University of Illinois at Urbana-Champaign. POPL 2012, Philadelphia, PA, USA.
functional verification of heap-manipulating programs • Expressive logics: separation logics, HOL, matching logic, etc. • Decidable logics: LISBQ, CSL, STRANDdec, etc. • Our goal: keep expressiveness, give up decidability (sound but incomplete), preserve automaticity • [Figure: expressive logics, decidable logics and our goal, with the labels "Expressive", "Automatic", "Proving programs correct" and "Bug-finding"]
our strategy • Handle a logic that is very expressive (inevitably undecidable) • Identify a class C of simple and natural proofs such that • many correct programs can be proved using a proof in class C • the class C is effectively searchable (searching thoroughly for a proof in C is efficiently decidable) • In this paper, we apply the above strategy to inductive tree data-structures • Exhibit a procedure that can automatically prove routines on BSTs, red-black trees, AVL trees, binomial heaps, etc., written as imperative programs, fully functionally correct • [Figure: the class C inside the set of all proofs]
motivation for simple proofs • Assisted proof of BST-search: • ( unfold bst(x) and keys(x) ) • ( view bst and keys as uninterpreted functions, or use unification techniques [Suter et al., 2010] )
our contribution • A new recursive extension of FOL, called DRYAD • combines quantifier-free logic with recursive predicates/functions defined on trees • recursive predicates/functions allow stating complex properties of heaps without explicit quantification • A VC-generation algorithm • given a recursive imperative program with proof annotations in DRYAD (with several key restrictions) • symbolic execution of the program over a footprint structure • unfold recursive definitions to the frontier of the footprint • Checking the validity of the generated VC • formula abstraction: replace recursive definitions with uninterpreted functions
dryad example: AVL-search
bool find(Node t, Int v)
//@requires
//@ensures
{
  if (t = NIL) return false;
  tv := t.value;
  if (v = tv) return true;
  else if (v < tv) {
    w := t.left;
    r := find(w, v); }
  else {
    w := t.right;
    r := find(w, v); }
  return r;
}
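The search routine and the kind of recursive definitions that DRYAD annotations refer to can be mimicked in ordinary Python. This is only a sketch: the `Node` class, the definitions `keys`, `height` and `avl`, and the contract checked by `check_contract` are assumptions made for illustration, since the requires/ensures formulas of the slide are not shown in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def keys(t):
    """Recursive definition: the set of keys stored in the tree rooted at t."""
    if t is None:
        return frozenset()
    return keys(t.left) | {t.value} | keys(t.right)


def height(t):
    """Recursive definition: number of nodes on the longest root-to-leaf path."""
    return 0 if t is None else 1 + max(height(t.left), height(t.right))


def avl(t):
    """Recursive definition: t is a search tree whose subtree heights differ by at most 1."""
    if t is None:
        return True
    return (avl(t.left) and avl(t.right)
            and all(k < t.value for k in keys(t.left))
            and all(k > t.value for k in keys(t.right))
            and abs(height(t.left) - height(t.right)) <= 1)


def find(t, v):
    """Direct transcription of the slide's recursive search."""
    if t is None:
        return False
    tv = t.value
    if v == tv:
        return True
    w = t.left if v < tv else t.right
    return find(w, v)


def check_contract(t, v):
    """Runtime check of an ASSUMED contract: requires avl(t);
    ensures the result equals (v in keys(t)) and the tree is unchanged."""
    assert avl(t)
    before = keys(t)
    r = find(t, v)
    return r == (v in before) and avl(t) and keys(t) == before
```

A runtime check like `check_contract` only tests one input at a time; the point of the talk is to prove such a contract for all inputs.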
pre- and post-conditions • Program functions: loc f(loc v, int z1, …, int zn) • Stringent restrictions: • only one input location parameter (v), which must subtend a tree • when a location is returned, either • old_v and ret_loc point to disjoint trees, or • old_v is “havoc”-ed (nothing is known in the post-state about v or any location reachable from v) • A pre-condition is of the form tree(v) /\ ψ(v, z1, …, zn) • A post-condition for f(v, z1, …, zn) is of the form either • havoc(old_v) /\ ψ(old_v, old_z1, …, old_zn, ret_loc) or • old_v # ret_loc /\ ψ(old_v, old_z1, …, old_zn, ret_loc) • [Figure: panels a) and b) showing v and ret_loc]
programs and basic blocks • We consider annotated imperative programs with recursion only (no while-loops, and therefore no loop invariants) • We verify linear blocks of code, called basic blocks (conditionals are replaced with assume statements) • [Figure: a program split into basic blocks bb1, bb2, bb3]
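Replacing conditionals with assume statements can be sketched as a path enumeration over a loop-free program. The tuple-based statement encoding and the names `basic_blocks` and `FIND` below are hypothetical and chosen only for this illustration; they are not the representation used by the authors' tool.

```python
def basic_blocks(stmts):
    """Enumerate the linear paths (basic blocks) through a loop-free statement list.

    Statements are tuples: ("stmt", text), ("return", text),
    or ("if", condition, then_branch, else_branch).
    """
    if not stmts:
        return [[]]
    head, rest = stmts[0], stmts[1:]
    kind = head[0]
    if kind == "return":
        return [[head[1]]]  # a return ends the block
    if kind == "if":
        _, cond, then_branch, else_branch = head
        blocks = []
        for guard, branch in ((f"assume ({cond})", then_branch),
                              (f"assume (!({cond}))", else_branch)):
            for tail in basic_blocks(branch + rest):
                blocks.append([guard] + tail)
        return blocks
    # plain statement: prepend it to every continuation
    return [[head[1]] + tail for tail in basic_blocks(rest)]


# The search routine from the AVL-search slide, in the hypothetical encoding.
FIND = [
    ("if", "t = nil", [("return", "return false")], []),
    ("stmt", "tv := t.value"),
    ("if", "v = tv", [("return", "return true")], [
        ("if", "v < tv",
         [("stmt", "w := t.left"), ("stmt", "r := find(w, v)")],
         [("stmt", "w := t.right"), ("stmt", "r := find(w, v)")]),
    ]),
    ("return", "return r"),
]

for block in basic_blocks(FIND):
    print("; ".join(block))
```

Each printed line is one straight-line block; every block then yields its own Hoare triple.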
verification conditions • Each basic block bb gives a Hoare triple • (φpre, bb, φpost) • We track how the footprint (the portion of the heap touched explicitly by the program) evolves • a footprint = a symbolic heap + a DRYAD formula • the symbolic heap is a graph structure denoting a portion of the concrete heap
symbolic heaps • A symbolic heap SH contains concrete nodes and symbolic nodes; symbolic nodes have no outgoing pointer or data fields. • A concrete heap CH with nodes N corresponds to SH if there is a homomorphism h such that • for every symbolic node s, h(s) is the root of a tree • for every two distinct symbolic nodes s and s’, h(s) and h(s’) are the roots of two disjoint trees • [Figure: a symbolic heap (concrete nodes C1, C2; symbolic nodes S1, S2; edges labelled l and r; labels x, y) and a corresponding concrete heap, related by h]
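The correspondence conditions above can be checked mechanically for a given candidate mapping. The sketch below assumes a dictionary encoding of heaps (node name to its `l`/`r` fields, with `None` for nil) and hypothetical names `subtree_nodes` and `corresponds`; the field-preservation check is one reading of "homomorphism", not a definition taken from the slides.

```python
def subtree_nodes(heap, root):
    """Return the set of nodes reachable from root, or None if some node
    is reached twice (sharing or a cycle), i.e. root does not subtend a tree."""
    seen = set()
    stack = [root]
    while stack:
        n = stack.pop()
        if n is None:  # nil
            continue
        if n in seen:
            return None
        seen.add(n)
        stack.extend(heap[n].values())
    return seen


def corresponds(sh, symbolic, ch, h):
    """Check that concrete heap ch corresponds to symbolic heap sh under h.

    sh: fields of the concrete nodes of the symbolic heap
    symbolic: the symbolic nodes (they have no fields)
    ch: fields of every node of the concrete heap
    h: mapping from symbolic-heap nodes to concrete-heap nodes
    """
    def image(n):
        return None if n is None else h[n]

    # Assumed reading of "homomorphism": fields of concrete nodes are preserved.
    for c, fields in sh.items():
        for name, target in fields.items():
            if ch[h[c]][name] != image(target):
                return False
    # Symbolic nodes must map to roots of pairwise disjoint trees.
    covered = set()
    for s in symbolic:
        nodes = subtree_nodes(ch, h[s])
        if nodes is None or nodes & covered:
            return False
        covered |= nodes
    return True


SH = {"c1": {"l": "s1", "r": "s2"}}
H = {"c1": "a", "s1": "b", "s2": "c"}
CH_OK = {"a": {"l": "b", "r": "c"}, "b": {"l": None, "r": None},
         "c": {"l": "d", "r": None}, "d": {"l": None, "r": None}}
# In CH_SHARED both subtrees reach d, so they are not disjoint.
CH_SHARED = {"a": {"l": "b", "r": "c"}, "b": {"l": "d", "r": None},
             "c": {"l": "d", "r": None}, "d": {"l": None, "r": None}}
```

With these heaps, `corresponds(SH, {"s1", "s2"}, CH_OK, H)` holds, while the shared node in `CH_SHARED` makes the check fail.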
crucial property of symbolic heaps: checking tree-ness of nodes • We can determine that certain nodes subtend trees by checking the symbolic heap. • Lemma: If s is the root of a tree in the underlying graph of SH, then h(s) also subtends a tree in any corresponding concrete heap CH. • [Figure: a symbolic heap with l/r edges and a cnil node; nodes that subtend trees are marked T]
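expanding the footprint • Expand the footprint with respect to a node n, where n is not nil • each new symbolic node is different from the others • unfold recursive definitions on n • [Figure: the footprint before and after expansion; n gains l and r fields pointing to new nodes nl and nr]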
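One expansion step can be sketched on a dictionary-based symbolic heap. Everything here is a simplifying assumption for illustration: the function name `expand`, the string encoding of facts, the use of `keys*` as the unfolded definition, and modelling "different from the others" as freshness of the new node names.

```python
def expand(concrete, symbolic, n):
    """Turn symbolic node n into a concrete node with fresh symbolic children.

    concrete: dict mapping each concrete node to its {"l": ..., "r": ...} fields
    symbolic: set of symbolic node names (mutated in place)
    Returns the facts recorded by the expansion, as strings.
    """
    if n not in symbolic:
        raise ValueError(f"{n} is not a symbolic node")
    nl, nr = n + "l", n + "r"
    existing = set(concrete) | symbolic
    if nl in existing or nr in existing:
        raise ValueError("child names are not fresh")
    symbolic.remove(n)
    concrete[n] = {"l": nl, "r": nr}
    symbolic.update((nl, nr))
    return [
        f"{n} != nil",
        # one-step unfolding of a recursive definition at n
        f"keys*({n}) = keys*({nl}) U {{{n}.value}} U keys*({nr})",
    ]


concrete = {"x": {"l": "n", "r": "m"}}
symbolic = {"n", "m"}
facts = expand(concrete, symbolic, "n")
```

After the call, `n` is a concrete node whose fields point to the new symbolic nodes `nl` and `nr`, and `facts` holds the disequality with nil together with the unfolded definition.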
handling function calls • ( assume the function call does not havoc v ) • the formula becomes φ with some adaptation • incorporate the post-condition • [Figure: the footprint before and after the call, with node m and the returned location r]
footprint evolving • ( avl*(t) ) • assume (t ≠ nil); • tv := t.value; • assume (tv ≠ v); • assume (v < tv); • w := t.left; • r := find(w, v); • return r; • ( avl*(t) /\ keys*(t) = keys*(old_t) /\ h*(t) = h*(old_t) /\ … ) • [Figure: the symbolic heap after each statement, with nodes n0, n1, n2, l/r edges, variables t, w, r, and data values ( v : i1 ), ( t.val : i2 ), ( tv : i4 ), ( ret : i5 ), ( r : i6 )]
formula abstraction • How do we check the verification condition • (SH, φVC) ⇒ ψVC? • Procedure: • Drop SH after checking the tree-ness of nodes required by the post-condition, then check φVC ⇒ ψVC • Replace recursive predicates/functions with uninterpreted predicates/functions to obtain φabs ⇒ ψabs • Check the validity of φabs ⇒ ψabs in the theory combining uninterpreted functions, linear arithmetic and sets/multisets of integers. (Decidable, NP-complete [Kuncak et al., 2010]) • Soundness: • If φabs ⇒ ψabs is valid, then φVC ⇒ ψVC is also valid. • Incompleteness: • E.g., height(x) = 3 ⇒ size(x) ≤ 7 is valid when interpreted, but is invalid when uninterpreted.
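The incompleteness example can be reproduced by brute force. The sketch below assumes that height counts the nodes on the longest path (so the empty tree has height 0), and it models "uninterpreted" simply as letting height(x) and size(x) range over arbitrary integers; the helper names are hypothetical.

```python
def shapes(max_height):
    """All binary tree shapes of height <= max_height; None is the empty tree."""
    if max_height == 0:
        return [None]
    smaller = shapes(max_height - 1)
    return [None] + [(l, r) for l in smaller for r in smaller]


def height(t):
    return 0 if t is None else 1 + max(height(t[0]), height(t[1]))


def size(t):
    return 0 if t is None else 1 + size(t[0]) + size(t[1])


def implication(h, s):
    """height(x) = 3  =>  size(x) <= 7, over plain integer values."""
    return h != 3 or s <= 7


# Interpreted: every actual tree satisfies the implication
# (only trees of height exactly 3 matter, and all of them are enumerated).
interpreted_valid = all(implication(height(t), size(t)) for t in shapes(3))

# Abstracted: height(x) and size(x) are unrelated integers, so a
# counter-model exists and the abstracted formula is not valid.
counter_model = next(
    (h, s) for h in range(10) for s in range(10) if not implication(h, s)
)

print(interpreted_valid, counter_model)  # True (3, 8)
```

Soundness goes the other way: the real definitions of height and size are one particular interpretation, so a formula valid under every interpretation is also valid under the real one.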
experiments • We verify several inductive tree data-structures appearing in the classical textbook [CLRS: Cormen et al.] • Sorted list, binary heap, treap, AVL tree, red-black tree, B-tree, and binomial heap • Annotate each standard operation (insert/delete/rotate/merge) with pre- and post-conditions specifying complex structural and data properties, e.g., for binomial-heap-merge, we check that: • the result is still a binomial heap • the set of keys stored is the union of the two inputs • the order of the binomial heap increases by at most 1 • Examine the validity of the VCs in the uninterpreted theory, using a simple decision procedure that employs Z3, a state-of-the-art SMT solver.
experiment results http://cs.uiuc.edu/~madhu/dryad/
related work • Separation logic + recursive predicates [Chin et al., 2011] • Formulas are quantified; employs Isabelle and Mona; less efficient. • Bedrock [Chlipala, 2011] • Mostly automated; requires proof tactics given by the user. • VeriFast [Jacobs & Piessens, 2008] • Partially automated tool that accepts proof tactics from the user.
conclusion • A scheme for finding simple and natural proofs automatically and efficiently for tree data-structures • Future work: • Extend beyond trees, to arbitrary data-structures? • Handling while-loops: functional programs [Suter et al., 2010] → imperative recursive programs [this paper] → imperative while programs • Challenge: Can we build automatic procedures that can verify all data-structure algorithms we hand out to undergraduate CS students?