A practical and complete approach to predicate abstraction
Ranjit Jhala, UCSD
Ken McMillan, Cadence Berkeley Labs
Completeness of abstraction
• An abstraction is a restricted language L
• Example: predicate abstraction
• L is the language of Boolean combinations of predicates in P
• We try to compute the strongest inductive invariant of a program in L
• An abstraction refinement heuristic chooses a sequence of sublanguages L0 ⊆ L1 ⊆ ... from a broader language L.
• Example: predicate abstraction
• L is the set of quantifier-free FO formulas (QF)
• Li is characterized by a set of atomic predicates Pi
An abstraction refinement heuristic is complete for language L iff it always eventually chooses a sublanguage Li ⊆ L containing a safety invariant whenever L contains a safety invariant.
Divergence
• Existing refinement heuristics for predicate abstraction are incomplete.
• They can produce an infinite sequence of refinements even when a safety invariant exists in L.
Program: x=i; y=i; while(x!=0) {x--; y--;} assert y==0;
Good predicates: x=y
Bad predicates: x=0, y=0, x=1, y=1, x=2, y=2, ...
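The contrast between the good and bad predicates can be checked by brute force. The following is a minimal sketch, assuming a bounded integer range; `BOUND`, `is_inductive` and `bad_predicates` are illustrative names that do not appear in the slides. It checks that x=y is an inductive safety invariant of the program, and counts how many value-tracking predicates are needed to follow a run that starts at a given i.

```python
from itertools import product

BOUND = 20  # assumption: only states with |x|, |y| <= BOUND are explored


def invariant(x, y):
    """The 'good' predicate from the slide."""
    return x == y


def is_inductive(inv, bound=BOUND):
    """Check initiation, consecution and safety of inv on the bounded state space."""
    rng = range(-bound, bound + 1)
    # Initiation: after x=i; y=i the invariant must hold for every i.
    if not all(inv(i, i) for i in rng):
        return False
    for x, y in product(rng, rng):
        if not inv(x, y):
            continue
        # Consecution: one loop iteration (guard x != 0) preserves inv.
        if x != 0 and not inv(x - 1, y - 1):
            return False
        # Safety: at loop exit (x == 0) the assertion y == 0 must hold.
        if x == 0 and y != 0:
            return False
    return True


def bad_predicates(i):
    """Value-tracking predicates needed to follow the run starting at i >= 0."""
    preds = []
    for k in range(i + 1):
        preds += [f"x={k}", f"y={k}"]
    return preds


print(is_inductive(invariant))
print(len(bad_predicates(3)), len(bad_predicates(10)))
```

The single predicate x=y works for every initial value, while the list returned by `bad_predicates` grows with i, which is the divergence the slide describes.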
Cause: CEGAR loop
• CounterExample Guided Abstraction Refinement (CEGAR)
• An abstract counterexample is a sequence of minterms in Li.
• Refinement adds predicates sufficient to refute the counterexample.
• Not complete, since the predicates can diverge as the number of loop iterations in the counterexample increases.
[Figure: CEGAR loop. Verify checks the abstraction Li and reports safe or unsafe; on unsafe, the abstract counterexample goes to Refine, which updates the abstraction.]
Example: refinement using weakest precondition (WP)
Program: x=i; y=i; while(x!=0) {x--; y--;} assert y==0;
[Figure: error path x=i,y=i → [x!=0] → x--, y-- → [x=0] → [y!=0] → Error!, with labels x=0,y=0; x≠0,y≠0; x=0,y≠0; x=0,y≠0. Caption: add these predicates.]
Divergence example
• Most heuristics derive predicates in some way from the refutation of the counterexample.
[Figure: refutation annotations i=1 ⇒ i=1, x=1 ⇒ y=1, x=0 ⇒ y=0, False]
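To see why WP-based refinement diverges on this program, the predicates can be computed backward along the error path for a growing number of loop iterations. This is a sketch under simplifying assumptions: atoms are encoded as hypothetical triples `(var, op, const)`, only the statements of this one program are handled, and the initial assignment x=i; y=i is left out.

```python
def wp_predicates(iterations):
    """Collect atoms (var, op, const) seen while computing WP backward
    along the error path with the given number of loop iterations."""
    # Backward from the error: guards [y != 0] and [x == 0] after loop exit.
    current = {("y", "!=", 0), ("x", "==", 0)}
    seen = set(current)
    for _ in range(iterations):
        # WP of "x--; y--": substituting v-1 for v turns "v op c" into "v op c+1".
        current = {(v, op, c + 1) for (v, op, c) in current}
        # WP of the loop guard [x != 0] conjoins the guard itself.
        current.add(("x", "!=", 0))
        seen |= current
    return seen


def max_constant(iterations):
    """Largest constant mentioned by any collected atom."""
    return max(c for (_, _, c) in wp_predicates(iterations))


for n in (1, 2, 5):
    print(n, max_constant(n), len(wp_predicates(n)))
```

Each extra loop iteration in the counterexample introduces atoms with a larger constant, so the predicate set never stabilizes.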
Enforcing completeness
1. Stratify L into finite languages L0 ⊆ L1 ⊆ ...
2. Refute counterexample at lowest possible stratum
[Figure: lattice of sublanguages of L with strata L0, L1, L2, ...; x=0 lies in L0, x=1 in L1, x=2 in L2; the predicate x=y is also marked.]
If a safety invariant exists in Lk, then we never exit Lk. Since this is a finite language, abstraction refinement must converge.
Split prover approach
• To restrict the refutation of counterexamples, we use a "split prover"
• Each prover component knows just one time frame
• Components can only communicate facts in Lk
[Figure: chain of prover components R1, R2, R3, ..., Rn communicating in L]
By restricting the language of communication between time frames, we prevent the prover from using larger constants as the number of loop iterations increases, and force it to generalize.
Programs
• A program is a pair (T, P) where
• T is a set of symbolic transition relations (program statements)
• P ⊆ T* is a regular language defining the unsafe runs of the program
Predicate abstraction
• Definitions:
• Given a set of formulas L, let B(L) be the set of Boolean combinations of L
• Let sp_L(σ)(φ), where σ ∈ T, be the strongest σ-postcondition of φ in B(L)
• Let sp_L(π1⋯πn) = sp_L(π1) ∘ ⋯ ∘ sp_L(πn)
• A program path π is L-refutable when sp_L(π)(True) = False
Example: let L be {x=y}:
[Figure: path x=i,y=i → [x!=0] → x--, y-- → [x=0] → [y!=0] → Error!, with labels x=y, x=y, False, True]
Fact: Predicate abstraction with predicates β proves a program safe exactly when every unsafe path π ∈ P is β-refutable.
A counterexample for predicate abstraction is a non-β-refutable path in P.
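The definition of L-refutability can be illustrated with a brute-force abstract postcondition. This is a sketch, not the authors' procedure: it assumes a small bounded domain for concretization (`DOMAIN`), represents an element of B(L) as a set of minterms (truth assignments to the predicates), and uses hypothetical helper names throughout. The predicate sets in the usage lines are chosen for illustration.

```python
from itertools import product

DOMAIN = range(-3, 4)  # assumption: concretize only over this bounded range


# A statement maps a state (x, y) to its successor states (empty if a guard fails).
def assign_i(s):
    return [(i, i) for i in DOMAIN]  # x=i; y=i for an unknown i


def guard_x_ne0(s):
    return [s] if s[0] != 0 else []


def dec(s):
    return [(s[0] - 1, s[1] - 1)]  # x--; y--


def guard_x_eq0(s):
    return [s] if s[0] == 0 else []


def guard_y_ne0(s):
    return [s] if s[1] != 0 else []


def alpha(state, preds):
    """Minterm of a concrete state: the truth value of every predicate."""
    return tuple(p(state) for p in preds)


def sp_abs(stmt, minterms, preds):
    """Abstract post: concretize the minterms, apply stmt, abstract again."""
    out = set()
    for s in product(DOMAIN, DOMAIN):
        if alpha(s, preds) in minterms:
            out |= {alpha(t, preds) for t in stmt(s)}
    return out


def refutable(path, preds):
    """A path is refutable when the abstract post of True is False (no minterms)."""
    minterms = set(product([False, True], repeat=len(preds)))  # True
    for stmt in path:
        minterms = sp_abs(stmt, minterms, preds)
    return not minterms


error_path = [assign_i, guard_x_ne0, dec, guard_x_eq0, guard_y_ne0]
x_eq_y = lambda s: s[0] == s[1]
y_eq_0 = lambda s: s[1] == 0

print(refutable(error_path, [x_eq_y, y_eq_0]))
print(refutable(error_path, [y_eq_0]))
```

Statements are applied in path order, so the loop in `refutable` plays the role of the composition of the per-statement postconditions.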
Consequence finders
• A consequence finder takes as input a set of hypotheses Γ and returns a set of consequences of Γ.
• Consequence finder R is complete for L-generation iff, for any φ ∈ L, Γ ⊨ φ implies R(Γ) ∩ L ⊨ φ
That is, the consequence finder need not generate all consequences of Γ in L, but the generated L-consequences must imply all others.
Split prover
Divide the prover into a sequence of communicating consequence finders...
[Figure: chain of provers R1, R2, R3, ..., Rn]
• Each Ri knows just πi (*)
• Ri and Ri+1 exchange only facts in L(πi) ∩ L(πi+1)
• ∘i Ri is the composition of the Ri's
Theorem: If each Ri is complete for L(πi+1)-generation, then ∘i Ri is complete for refutation [McIlraith & Amir, 2001].
*Actually, here we mean πi instantiated at time i, as in BMC
L-restricted split prover
• In the L-restricted composition, ∘L Ri, the provers can exchange only formulas in L.
[Figure: chain of provers R1, R2, R3, ..., Rn with each link labeled L]
Theorem: If each Ri is complete for L ∩ L(Ti+1)-generation, then path π is L-refutable exactly when π is refuted by ∘L Ri.
Corollary: Let β' be the set of atomic predicates exchanged by ∘L Ri in refuting π. Then π is β'-refutable.
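Complete heuristic
• Given finite languages L0 ⊆ L1 ⊆ ... ⊆ L where ∪ Li = QF...
[Flowchart:
1. β ← {}
2. Pred Abs with β: if safe, stop; otherwise obtain a path π that is not β-refutable
3. k ← 0
4. Is there β' ⊆ Lk s.t. π is β'-refutable? If no, k ← k+1 and repeat step 4; if yes, β ← β ∪ β' and go to step 2]
Theorem: This procedure is complete for QF. That is, if a safety invariant exists in QF, we conclude "safe".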
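The control flow of this procedure can be sketched as a short loop. This is an illustration only: `pred_abs` and `refute_in` are hypothetical callbacks standing in for the predicate abstraction engine and the Lk-restricted split prover, and the `max_stratum` cutoff is an added assumption so the sketch terminates on paths that no stratum refutes (the slide does not show that exit).

```python
def complete_cegar(pred_abs, refute_in, max_stratum=None):
    """Stratified refinement loop.

    pred_abs(beta)     -> None if the abstraction proves safety,
                          else a path that is not beta-refutable.
    refute_in(k, path) -> a set of predicates in L_k refuting path, or None.
    Both callbacks are hypothetical placeholders.
    """
    beta = frozenset()
    while True:
        path = pred_abs(beta)
        if path is None:
            return "safe", beta
        k = 0  # restart at the lowest stratum for every counterexample
        new_preds = refute_in(k, path)
        while new_preds is None:
            k += 1
            if max_stratum is not None and k > max_stratum:
                return "unknown", path  # assumption: give up beyond the cutoff
            new_preds = refute_in(k, path)
        beta = beta | frozenset(new_preds)


# Toy callbacks: the abstraction is safe once "x=y" is known,
# and "x=y" first becomes available in stratum 1.
def toy_pred_abs(beta):
    return None if "x=y" in beta else "error-path"


def toy_refute_in(k, path):
    return {"x=y"} if k >= 1 else None


print(complete_cegar(toy_pred_abs, toy_refute_in))
```

Resetting k to 0 for each new counterexample is what keeps refinement in the lowest stratum that suffices.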
An efficient split prover
• Complete consequence generation could be expensive!
• We will consider QF formulas with
• integer difference bound constraints (e.g., x ≤ y + c)
• equality and uninterpreted functions
• restricted use of array operations "select" and "store"
• Our restriction language Lk will be determined by
• The finite set CD of allowed constants c in x ≤ y + c
• The finite set CB of allowed constants c in x ≤ c
• The bound bf on the depth of nesting of function symbols
Note that for a finite vocabulary, Lk is finite, and as long as the constant and depth bounds are increasing, every formula is included in some Lk.
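One possible way to realize such a stratification is sketched below. The specific choice CD = CB = {-k, ..., k} and bf = k is an assumption made for illustration; the slides only require the bounds to be finite and increasing. The term and atom encodings are also hypothetical, and only bound constraints are covered.

```python
def depth(term):
    """Nesting depth of function symbols.

    Variables are strings; applications are tuples (f, arg1, ...).
    """
    if isinstance(term, str):
        return 0
    return 1 + max((depth(arg) for arg in term[1:]), default=0)


def in_stratum(atom, k):
    """atom = (lhs, rhs, c) encodes lhs <= rhs + c; rhs None encodes lhs <= c.

    Assumed stratum k: constants in {-k..k}, function nesting depth <= k.
    """
    lhs, rhs, c = atom
    terms = [lhs] if rhs is None else [lhs, rhs]
    return abs(c) <= k and all(depth(t) <= k for t in terms)


def lowest_stratum(atom):
    """Smallest k such that the atom belongs to L_k."""
    k = 0
    while not in_stratum(atom, k):
        k += 1
    return k


print(lowest_stratum(("x", "y", 2)))                # x <= y + 2
print(lowest_stratum((("f", ("f", "x")), None, 0)))  # f(f(x)) <= 0
```

Because both bounds grow with k, `lowest_stratum` terminates for every atom in this encoding, mirroring the note that every formula is included in some Lk.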
Prover architecture
• SAT solver generates propositionally satisfying minterms
• Split prover refutes this minterm
• Hypotheses of split prover are thus literals, not clauses
• Convexity: theory is convex if all consequences are Horn
• In convex case, provers only exchange literals [Nelson & Oppen, 1980]
• Simple proof rules for complete unit consequence finding
• In case of a non-Horn consequence, split cases in SAT solver
• Integers and array store operations introduce non-convexities.
• Multiple theories handled by hierarchical decomposition
[Figure: per-time-frame provers, each split into an "=, f" component and a "≤, +" component, communicating in L]
These and other optimizations can result in a relatively efficient prover...
Performance comparison
• Refuting counterexamples for the two hardest Windows device driver examples in the Blast benchmark set.
• Compare split prover against Nelson-Oppen style, same SAT solver
Some "trivial" benchmarks
example: substring copy
    main(){
      char x[*], z[*];
      int from,to,i,j,k;
      i = from;
      j = 0;
      while(x[i] != 0 && i < to){
        z[j] = x[i];
        i++;
        j++;
      }
      /* prove strlen(z) >= j */
      assert !(k >= 0 && k < j && z[k] == 0);
    }
Results X = refine fail, = bug, = diverge, TO = timeout, = verified safe
Conclusions
• An abstraction refinement heuristic is complete for language L if it is guaranteed to find a safety invariant whenever one exists in L
• Existing PA heuristics are incomplete and diverge on trivial programs
• CEGAR can be made complete by...
• Stratifying L into a hierarchy of finite sublanguages L0, L1, ...
• Refuting counterexamples as low as possible in the hierarchy
• Using an Lk-restricted split prover
• A split prover can be made efficient enough to use in practice
• (at least for some useful theories)
• Future work:
• New theories (transitive closure?)
• Quantified invariants, indexed predicate abstraction
• Interpolant-based software model checker (coming soon)