Learn about lazy abstraction and interpolants in model checking and abstraction techniques to fight the state-explosion problem. Understand the concept of CEGAR and how to model programs using formal logic.
Lazy Abstraction with Interpolants Yakir Vizel (based on the work and slides of K. L. McMillan at CAV06)
Agenda • Introduction • Model Checking • Abstraction and CEGAR • Software MC • Lazy Abstraction • Lazy Abstraction with Interpolants
Model Checking • Given a system and a specification, does the system satisfy the specification? • There are efficient algorithms that receive a model and a formula and return: • “True” if the system satisfies the specification. • “False” and a counterexample otherwise. • For programs (infinite-state systems) the problem is undecidable in general. • Suffers from the state-explosion problem
Abstraction • A way to fight the state-explosion problem. • Preserves properties: universal (e.g. safety) properties that hold in the abstract model also hold in the concrete model (over-approximation). • An abstract state represents a set of concrete states; abstract transitions are defined existentially. • Types of abstraction: Localization Reduction, Data Abstraction, Predicate Abstraction
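As a minimal sketch of "transitions defined existentially": an abstract transition (a, b) exists whenever some concrete transition (s, t) maps onto it. The counter system and the `low`/`high` abstraction function below are hypothetical examples, not taken from the slides.

```python
def existential_abstraction(transitions, h):
    """Abstract transition (a, b) exists iff some concrete (s, t) has h(s) = a and h(t) = b."""
    return {(h(s), h(t)) for (s, t) in transitions}

# Hypothetical concrete system: a counter that steps from 0 up to 5.
concrete = {(i, i + 1) for i in range(5)}

# Hypothetical abstraction function: merge states into two abstract states.
def h(s):
    return "low" if s < 3 else "high"

abstract = existential_abstraction(concrete, h)
print(sorted(abstract))
```

The abstract model allows the infinite path low, low, low, ... even though the concrete counter leaves the low region after three steps, which is exactly the over-approximation the slide refers to.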
CEGAR • One problem with over-approximations is spurious counterexamples. • The solution: Counter-Example Guided Abstraction Refinement. • Given an abstract model M’ and a spurious counterexample, refine M’ automatically into a new abstract model M’’ ≥ M’ that does not contain the given spurious counterexample.
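CEGAR (cont.) • [Figure: the CEGAR loop. From M and the specification, generate an initial abstraction M’ and model check M’. If M’ satisfies the specification, stop. Otherwise generate a counterexample C and check whether C is spurious: if C is not spurious, stop; if C is spurious, refine the abstraction and model check again.]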
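The loop can be sketched on a tiny explicit-state system. This is only an illustration under assumptions: states are plain Python values, the abstraction is a partition of the state set, and refinement splits a block into the states actually reached and the rest. The function names `abstract_counterexample` and `cegar` are hypothetical.

```python
from collections import deque

def abstract_counterexample(partition, init, trans, bad):
    """BFS over blocks; abstract transitions are defined existentially."""
    block_of = {s: b for b in partition for s in b}
    succ = {}
    for s, t in trans:
        succ.setdefault(block_of[s], set()).add(block_of[t])
    start = [b for b in partition if b & init]
    parent = {b: None for b in start}
    queue = deque(start)
    while queue:
        b = queue.popleft()
        if b & bad:                      # abstract error state reached
            path = []
            while b is not None:
                path.append(b)
                b = parent[b]
            return path[::-1]
        for c in succ.get(b, ()):
            if c not in parent:
                parent[c] = b
                queue.append(c)
    return None

def cegar(states, init, trans, bad):
    """Return True if no bad state is reachable, False otherwise."""
    # Initial abstraction M': one block of bad states, one block of all others.
    partition = [b for b in (frozenset(bad), frozenset(states - bad)) if b]
    while True:
        path = abstract_counterexample(partition, init, trans, bad)
        if path is None:
            return True                  # M' satisfies the property
        reach = set(init) & path[0]      # concrete states following the path
        spurious = False
        for prev, block in zip(path, path[1:]):
            nxt = {t for (s, t) in trans if s in reach} & block
            if not nxt:
                # Spurious: split the block where the concrete run dies.
                partition.remove(prev)
                partition += [frozenset(reach), prev - reach]
                spurious = True
                break
            reach = nxt
        if not spurious:
            return False                 # counterexample is real

states = {0, 1, 2, 3, 4}
print(cegar(states, {0}, {(0, 1), (1, 0), (2, 3)}, {3}))
print(cegar(states, {0}, {(0, 1), (1, 0), (1, 2), (2, 3)}, {3}))
```

In the first call the abstract model initially reports a path to the bad block, but state 2 is not reachable from state 0, so two refinements remove the spurious counterexample. In the second call the counterexample survives concretization.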
Modeling Programs • FOL formulas are used to characterize programs. • S is the set of individual variables, constants, etc. • A state formula is a formula in L(S), where L(S) is the set of well-formed formulas over the vocabulary S. • For a non-logical symbol s, s’ represents s at the next time step. • s with n primes represents s at n time units in the future. • For a formula f, the notation f^(n) denotes the addition of n primes to every symbol in f. • A transition formula is a formula in L(S ∪ S’). • Example: x’ = x + 1 • A program is usually represented using a Control Flow Graph (CFG).
Modeling Programs (2) • A program is a tuple (λ, Δ, li, lf) where λ is a finite set of program locations, Δ is a set of actions, li is the initial location and lf is the error location (both in λ). • An action is a triple (l,T,m) where l and m are respectively the entry and exit locations of the action and T is a transition formula. • A path π of a program is a sequence of transitions of the form (l0,T0,l1),…,(ln-1,Tn-1,ln). • The unfolding U(π) of path π is the sequence of formulas T0^(0), T1^(1), …, Tn-1^(n-1). • A path π is feasible when ΛU(π) is consistent.
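A small sketch of priming and unfolding, assuming formulas are kept as plain strings and the vocabulary S is given as a set of symbol names. The helper names `add_primes` and `unfolding`, the location names and the `&` connective are illustrative choices only.

```python
import re

def add_primes(formula, n, symbols):
    """Return f^(n): the formula with n extra primes on every symbol in `symbols`."""
    names = sorted(symbols, key=len, reverse=True)
    pattern = r"\b(" + "|".join(map(re.escape, names)) + r")\b('*)"
    return re.sub(pattern, lambda m: m.group(1) + m.group(2) + "'" * n, formula)

def unfolding(path, symbols):
    """U(pi) = T0^(0), T1^(1), ... for a path given as (l, T, m) triples."""
    return [add_primes(T, i, symbols) for i, (_, T, _) in enumerate(path)]

S = {"L", "old", "new"}
path = [
    ("l0", "L' = 0", "l1"),
    ("l1", "L' = 1 & old' = new", "l2"),
    ("l2", "L' = 0 & new' = new + 1", "l3"),
]
for formula in unfolding(path, S):
    print(formula)
```

Each step of the path talks about its own copy of the variables, so the conjunction of the unfolded formulas is satisfiable exactly when the path can be executed.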
Example • Program fragment: do{ lock(); old = new; if(*){ unlock(); new++; } } while (new != old); • Control-flow graph edges: "L=0", "[L!=0]", "L=1; old=new", "L=0; new++", "[new!=old]", "[new==old]" • ΛU(π) = L=0 Λ L’=1 Λ old’=new Λ T Λ new = old’ • ΛU(π) = L=0 Λ L’=1 Λ old’=new Λ L’’=0 Λ new’ = new + 1 Λ new’ != old’ Λ L’’ != 0
Modeling Programs (3) • A program is said to be safe if every error path of the program (a path from li to lf) is infeasible. • An inductive invariant of a program is a map F: λ → L(S) such that: • F(li) = TRUE • For every action (l,T,m) in Δ, F(l) /\ T implies F(m)’. • A safety invariant of a program is an inductive invariant such that F(lf) = FALSE. • Existence of a safety invariant of a program implies that the program is safe.
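The three conditions can be checked mechanically on sample states. The sketch below is a bounded check, not a proof: it enumerates a small range of integer states instead of calling a theorem prover. The encoding of the lock example (location names, the `skip` branch for the untaken `if(*)`, and the candidate map `F`) is one reading of the example program and should be treated as an assumption.

```python
from itertools import product

# A state is a tuple (L, old, new). An action maps a state to its successor,
# or to None when its guard fails. Location names are hypothetical.
ACTIONS = [
    ("init", lambda s: (0, s[1], s[2]), "head"),              # L = 0
    ("head", lambda s: s if s[0] != 0 else None, "err"),      # [L != 0]
    ("head", lambda s: (1, s[2], s[2]), "body"),              # L = 1; old = new
    ("body", lambda s: (0, s[1], s[2] + 1), "tail"),          # L = 0; new++
    ("body", lambda s: s, "tail"),                            # skip
    ("tail", lambda s: s if s[2] != s[1] else None, "head"),  # [new != old]
    ("tail", lambda s: s if s[2] == s[1] else None, "exit"),  # [new == old]
]

# Candidate safety invariant F: location -> state predicate.
F = {
    "init": lambda s: True,
    "head": lambda s: s[0] == 0,
    "err": lambda s: False,
    "body": lambda s: s[0] == 1 and s[1] == s[2],
    "tail": lambda s: (s[0] == 0 and s[1] != s[2]) or (s[0] == 1 and s[1] == s[2]),
    "exit": lambda s: True,
}

def is_safety_invariant(F, actions, li, lf, states):
    """Check F(li) = TRUE, F(lf) = FALSE and F(l) /\\ T implies F(m)' on `states`."""
    states = list(states)
    if not all(F[li](s) for s in states):
        return False
    if any(F[lf](s) for s in states):
        return False
    for l, step, m in actions:
        for s in states:
            if F[l](s):
                t = step(s)
                if t is not None and not F[m](t):
                    return False
    return True

SAMPLE = list(product(range(-1, 3), range(-3, 4), range(-3, 4)))
print(is_safety_invariant(F, ACTIONS, "init", "err", SAMPLE))
```

Weakening `F["head"]` to TRUE makes the check fail, because the action `[L != 0]` then leads from a state allowed at `head` into the error location.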
Program Unwinding • An unwinding of a program A = (λ, Δ, li, lf) is a quadruple (V,E,Mv,Me), where (V,E) is a directed tree rooted at ε, Mv: V → λ is the vertex map, and Me: E → Δ is the edge map, such that: • Mv(ε) = li • For every non-leaf vertex v in V, for every action (Mv(v),T,m) in Δ, there exists an edge (v,w) in E such that Mv(w) = m and Me(v,w) = T • For two vertices v and w of a tree, w < v denotes that w is a proper ancestor of v.
Unwinding the CFG • An unwinding is a tree with an embedding in the CFG • [Figure: an unwinding tree with vertices 0, 1, 2, 3, 4, 8, mapped by Mv and Me onto the CFG with edges "L=0", "[L!=0]", "L=1; old=new", "L=0; new++", "[new!=old]", "[new==old]".]
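Expansion • Every non-leaf vertex of the unwinding must be fully expanded: if a vertex is not a leaf and the CFG has an action leaving its location, then the corresponding edge and child vertex exist in the unwinding. • We allow unexpanded leaves (i.e., we are building a finite prefix of the infinite unwinding). • [Figure: vertex 0 with an edge L=0 to vertex 1, shown next to the matching CFG edge via Mv and Me.]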
Program Unwinding (2) • A labeled unwinding of a program A = (λ, Δ, li, lf) is a triple (U,ψ,C) where • U = (V,E,Mv,Me) is an unwinding of A • ψ: V → L(S) is called the vertex labeling, and • C is a relation in V x V and is called the covering relation • A vertex v is said to be covered iff there exists (w,x) in C such that w ≤ v. • An unwinding is said to be safe iff for all vertices v in V, Mv(v) = lf implies ψ(v) ≡ FALSE. • An unwinding is said to be complete iff every leaf v in V is covered.
Labeled unwinding • A labeled unwinding is equipped with: • a labeling function ψ: V → L(S) • a covering relation C in V x V • [Figure: an unwinding with vertices 0 to 7 carrying labels T, F and L=0. Two nodes are covered (each has an ancestor at the tail of a covering arc).]
Program Unwinding (3) • A labeled unwinding (U,ψ,C) of a program A = (λ, Δ, li, lf), where U = (V,E,Mv,Me), is said to be well-labeled iff: • ψ(ε) ≡ TRUE, and • For every edge (v,w) in E, ψ(v) Λ Me(v,w) implies ψ(w)’, and • For all (v,w) in C, ψ(v) ⇒ ψ(w), and w is not covered • Main Theorem: If there exists a safe, complete, well-labeled unwinding of program A, then A is safe.
Well-Labeled Unwinding • An unwinding is well-labeled when... • ψ(ε) = True • every edge is a valid Hoare triple • if (x,y) in C then y is not covered • [Figure: the labeled unwinding with vertices 0 to 7 and labels T, F and L=0 over the CFG edges "L=0", "[L!=0]", "L=1; old=new", "L=0; new++", "[new!=old]", "[new==old]".]
Safe and Complete • safe if every error vertex is labeled False • complete if every non-terminal leaf is covered • Theorem: A CFG with a safe complete unwinding is safe. • [Figure: the full unwinding with vertices 0 to 10, labels T, F, L=0 and old=new, and covering arcs.]
Why a Covered Vertex Cannot Cover? • y covers x, w covers v ⇒ y is covered (v ≤ y) • ψ(x) ⇒ ψ(y) • Every state reachable from x is reachable from y. • ψ(v) ⇒ ψ(w) • Every state reachable from v is reachable from w. • Any state reachable from y should be reachable from w through its descendant z. • NOT every state reachable from x is also reachable from z. • z is the only vertex that is not covered. • [Figure: vertices w, v, y, x, z with labels p and T.]
Proof (Main Theorem) • Let U’ be the set of uncovered vertices, and let the function M map each location l to \/{ψ(v) | Mv(v) = l, v in U’} • We show that M is a safety invariant for A: • M(li) = TRUE: given by the definition of a well-labeled unwinding (ψ(ε) ≡ TRUE). • M(lf) = FALSE: the unwinding is safe.
Proof (2) • Let (l,T,m) be an action: • Case 1: for every v in U’, Mv(v) ≠ l ⇒ M(l) = FALSE ⇒ M(l) Λ T implies M(m)’. • Case 2: there is a v in U’ such that Mv(v) = l. • Let us assume that there is an action (l,T,m) such that M(l) Λ T does not imply M(m)’. Then for every w in V such that Mv(w) = m, w is covered. • Let (v,w) be an edge in E. We know: ψ(v) Λ T implies ψ(w)’. • Let u be the covering vertex of w. Then ψ(w) ⇒ ψ(u) and u is NOT covered. • By adding the assumption that if u covers w then Mv(u) = Mv(w), we get a contradiction. (ψ(v) Λ T implies ψ(u)’ ⇒ M(l) Λ T implies M(m)’)
Predicate Abstraction • Predicates are defined over system states • X = Y; counter < 100; etc. • Keeps track of certain predicates on data. • Captures relationships between variables. • States satisfying the same predicates are equivalent in the abstract model and are merged into the same abstract state. • Abstract state space is always finite.
Predicate Abstraction (2) • Computes the strongest post-condition expressible over the given set of predicates. • Abstract post computation is very expensive. • For N predicates there are possibly 2^(2N) transitions (2^N abstract states on each side). • Information computed about predicates may be irrelevant.
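A rough sketch of an abstract post computation, assuming two predicates (L == 0 and old == new) over states (L, old, new). A real tool would query a theorem prover; here a bounded enumeration of sample states stands in for it, so the result is only illustrative. All names are hypothetical.

```python
from itertools import product

# Predicates over a state (L, old, new): L == 0 and old == new.
PREDICATES = [lambda s: s[0] == 0, lambda s: s[1] == s[2]]

def alpha(s, preds):
    """Abstract state of s: the truth value of every predicate."""
    return tuple(p(s) for p in preds)

def abstract_post(abs_state, step, preds, states):
    """Abstract successors of abs_state under `step`, by enumeration."""
    out = set()
    for s in states:
        if alpha(s, preds) == abs_state:
            t = step(s)
            if t is not None:
                out.add(alpha(t, preds))
    return out

def lock(s):          # L = 1; old = new
    return (1, s[2], s[2])

def unlock_inc(s):    # L = 0; new++
    return (0, s[1], s[2] + 1)

SAMPLE = list(product(range(0, 2), range(-2, 3), range(-2, 3)))

print(abstract_post((True, False), lock, PREDICATES, SAMPLE))
print(abstract_post((False, True), unlock_inc, PREDICATES, SAMPLE))
print(2 ** (2 * len(PREDICATES)), "possible abstract transitions")
```

Even with two predicates there are 16 candidate abstract transitions per action, which is why the image computation dominates the cost as the predicate set grows.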
PA with CEGAR Loop • [Figure: the CEGAR loop. Choose initial M’, then model check abstraction M’. If true, done. If a Cex is found, ask: can the Cex be extended from M’ to M? If yes, report the Cex. If no, add predicates to M’ and repeat.] • Choose predicates to refute cex's • Generalizes failures • Still performs expensive deduction without justification • strongest Boolean post-condition • Fails to learn from past • Start fresh each iteration • Forgets expensive deductions
Lazy Predicate Abstraction • Unwind the program CFG into a tree • Refine paths as needed to refute errors • Add predicates along path to allow refutation of error • Refinement is local to an error path • Search continues after refinement • Do not start fresh -- no big CEGAR loop • Previously useful predicates applied to new vertices • [Figure: an unwinding tree annotated with predicates x=y, x=y, y=0 along a path to ERR!]
Lazy Predicate Abstraction (2)
• Procedure Expand (v in V)
    if v is an uncovered leaf then
      for all actions (Mv(v),T,m) in Δ
        add a new vertex w to V and a new edge (v,w) to E;
        Mv(w) = m and ψ(w) = postPA(ψ(v),T);
        Me(v,w) = T
• Procedure Refine (v in V)
    if Mv(v) = lf and ψ(v) != FALSE then
      let π = (v0,T0,v1)…(vn-1,Tn-1,vn) be the unique path from ε to v.
      pivot = BackwardCexAnalysis(v, π);
      if pivot != NULL then AddNewPredicates(pivot);
      else abort (program is unsafe)
• Procedure Cover (v,w in V)
    if v is uncovered and Mv(v) = Mv(w) and not (v ≤ w) then
      if ψ(v) ⇒ ψ(w) then
        add (v,w) to C
        delete all (x,y) in C s.t. v ≤ y
Example - Lazy PA • Error is hit: check if the path is feasible • Backwards CEX analysis (weakest precondition) • Refinement… • [Figure: the control-flow graph and its unwinding with vertices 0 to 6. Backward analysis annotates the error path with {L=0 /\ new != new}, {L=1 /\ new != old}, {L=1 /\ new != old}, {L=1}, next to the labels L=0 and L=1; a pivot vertex is marked.]
Example (2) - Refinement • A theorem prover is used to do the emptiness check at the pivot point. • Path: L’=1 Λ old’=new Λ T Λ new != old’ Λ L’!=0 • R: L=0 • Formula: R Λ Path • The result is UNSAT • The reason: old’=new, new != old’ • Add the predicate old=new • [Figure: the unwinding with vertices 0 to 6, the pivot vertex marked, and the backward-analysis annotations {L=0} /\ new != new, {L=1} /\ new != old, {L=1}.]
Example (3) - MC • [Figure: the control-flow graph and the unwinding with vertices 0 to 10 after adding the predicate old=new. Vertex labels include T, F, L=0, L=1 Λ old=new and L=0 Λ old!=new.]
LA – Is It Really That Good? • Predicates are stored locally according to their relevance to the program location. • Computation of the post-condition transformer is less expensive. • Still, it computes the strongest post-condition transformer. • The refinement is local. • No fresh restart after refinement. • Part of the information computed is reused. • LA spends most of its time in the predicate image operation. • The solution…
Interpolants from Proofs • Given a pair of formulas (A,B) such that A /\ B is inconsistent, an interpolant for (A,B) is a formula Ā with the following properties: • A implies Ā, • Ā /\ B is unsatisfiable, and • Ā is in L(A) ∩ L(B) • An interpolant always exists for inconsistent formulas in FOL (Craig’s lemma)
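The three properties can be checked by brute force for propositional formulas. This is a sketch under assumptions: each formula is represented as a pair (set of variable names, Python predicate over an assignment), and the example formulas A, B and I are made up for illustration.

```python
from itertools import product

def assignments(names):
    """All truth assignments over the given variable names."""
    names = sorted(names)
    for values in product([False, True], repeat=len(names)):
        yield dict(zip(names, values))

def is_interpolant(A, B, I):
    """A, B, I are pairs (variables, predicate). Checks the three properties."""
    (va, fa), (vb, fb), (vi, fi) = A, B, I
    if not vi <= (va & vb):              # I uses only the common vocabulary
        return False
    for env in assignments(va | vb):
        if fa(env) and not fi(env):      # A implies I
            return False
        if fi(env) and fb(env):          # I /\ B is unsatisfiable
            return False
    return True

A = ({"p", "q"}, lambda e: e["p"] and e["q"])
B = ({"q", "r"}, lambda e: (not e["q"]) and e["r"])
I = ({"q"}, lambda e: e["q"])

print(is_interpolant(A, B, I))
```

Here `q` is the only shared symbol, and it is exactly the fact about A that contradicts B.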
Interpolants for Sequences • We want to handle program paths, therefore a generalization of interpolants is needed. • Given a sequence of formulas Γ = A1,A2,…,An, we say that Ā0, Ā1, …, Ān is an interpolant for Γ when: • Ā0 = TRUE and Ān = FALSE, • For all 1≤i≤n, Āi-1 /\ Ai implies Āi, and • For all 1≤i<n, Āi is in L(A1,…,Ai) ∩ L(Ai+1,…,An) • If Γ is quantifier-free we can derive a quantifier-free interpolant for Γ (from the refutation of Γ)
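Interpolants for Sequences (2) • An intuition… • [Figure: a chain True, Ā1, Ā2, Ā3, …, Āk-1, False, where each step from one interpolant to the next is justified by the corresponding formula A1, A2, A3, …, Ak.] • So this is a structured refutation of A1,…,Ak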
Interpolants as Floyd-Hoare proofs • Path: x=y; y++; [x=y] • SSA sequence: x1=y0, y1=y0+1, x1=y1 • Interpolants: True, x1=y0, y1>x1, False • 1. Each formula implies the next • 2. Each is over common symbols of prefix and suffix • 3. Begins with true, ends with false • [Figure: the path refinement procedure: Path, SSA sequence, Prover, proof, Interpolation, structured proof.]
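The three properties can be checked for the path x=y; y++; [x=y] and the interpolant sequence True, x1=y0, y1>x1, False. The sketch below is a bounded check over a small integer range rather than a proof, and the pair representation (variable set, predicate) is an assumption made for the example.

```python
from itertools import product

VARS = ("x1", "y0", "y1")

# SSA sequence A1, A2, A3 as (variables, predicate) pairs.
A = [
    ({"x1", "y0"}, lambda e: e["x1"] == e["y0"]),       # x = y
    ({"y1", "y0"}, lambda e: e["y1"] == e["y0"] + 1),   # y++
    ({"x1", "y1"}, lambda e: e["x1"] == e["y1"]),       # [x = y]
]

# Candidate interpolant sequence of length n + 1.
ITP = [
    (set(), lambda e: True),
    ({"x1", "y0"}, lambda e: e["x1"] == e["y0"]),
    ({"x1", "y1"}, lambda e: e["y1"] > e["x1"]),
    (set(), lambda e: False),
]

def is_sequence_interpolant(A, itp, envs):
    n = len(A)
    if len(itp) != n + 1:
        return False
    envs = list(envs)
    # 3. Begins with true, ends with false.
    if not all(itp[0][1](e) for e in envs) or any(itp[n][1](e) for e in envs):
        return False
    # 1. Each interpolant together with the next formula implies the next interpolant.
    for i in range(1, n + 1):
        if any(itp[i - 1][1](e) and A[i - 1][1](e) and not itp[i][1](e) for e in envs):
            return False
    # 2. Each inner interpolant is over the common symbols of prefix and suffix.
    for i in range(1, n):
        prefix = set().union(*(v for v, _ in A[:i]))
        suffix = set().union(*(v for v, _ in A[i:]))
        if not itp[i][0] <= (prefix & suffix):
            return False
    return True

ENVS = [dict(zip(VARS, vals)) for vals in product(range(-3, 4), repeat=3)]
print(is_sequence_interpolant(A, ITP, ENVS))
```

Read as program annotations, the sequence says: after x=y we know x=y, after y++ we know y>x, so the guard [x=y] cannot pass.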
Lazy PA with Interpolants (2)
• Procedure Expand (v in V)
    if v is an uncovered leaf then
      for all actions (Mv(v),T,m) in Δ
        add a new vertex w to V and a new edge (v,w) to E;
        Mv(w) = m and ψ(w) = TRUE;
        Me(v,w) = T;
• Procedure Refine (v in V)
    if Mv(v) = lf and ψ(v) != FALSE then
      let π = (v0,T0,v1)…(vn-1,Tn-1,vn) be the unique path from ε to v.
      if U(π) has an interpolant A’0,…,A’n then
        for i=0…n:
          let Φ = A’i^(-i)
          if ψ(vi) does not imply Φ then
            remove all pairs (·,vi) from C
            set ψ(vi) = ψ(vi) Λ Φ
      else abort (program is unsafe)
The Example • Property: lock() is not called if the lock is already being held. • Program fragment: do{ lock(); old = new; if(*){ unlock(); new++; } } while (new != old); • Control-flow graph edges: "L=0", "[L!=0]", "L=1; old=new", "L=0; new++", "[new!=old]", "[new==old]"
Unwinding the CFG • Label error state with false, by refining labels on path • [Figure: the control-flow graph and an unwinding with vertices 0, 1, 2; the labels T on the path to the error vertex are refined to T, L=0 and F.]
Unwinding the CFG • Covering: state 5 is subsumed by state 1. • [Figure: the control-flow graph and the unwinding extended with vertices 3, 4, 5, 6, with labels T, F and L=0 and a covering arc from vertex 5 to vertex 1.]
Unwinding the CFG • Another cover. Unwinding is now complete. • [Figure: the control-flow graph and the full unwinding with vertices 0 to 11, with labels T, F, L=0 and old=new and two covering arcs.]
Covering Step • If ψ(x) ⇒ ψ(y)... • add covering arc (x, y) to C • remove all (z, w) in C for w descendant of x • We restrict covers to be descending in a suitable total order on vertices. This prevents covering from diverging. • [Figure: a vertex labeled x=y covered by a vertex labeled x≤y; an existing covering arc is removed (X).]
Covering Step (2) • Covering one vertex may result in uncovering others. • Applying covering non-deterministically may not terminate. • A total order ◄ is defined on the vertices. • It respects the ancestor relation: v < w ⇒ v ◄ w • It can be defined by a preorder traversal of the tree. • Cover is only applied to (v,w) if w ◄ v. • If by adding (v,w) we remove (x,y), where v ≤ y, then by transitivity we get v ◄ x. • Covering a vertex v can therefore only uncover vertices greater than v. • Therefore, we cannot apply covering infinitely.
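One way to obtain such an order is to number the vertices in preorder. The sketch below assumes the tree is given as a dictionary from a vertex to its list of children; the tree shape and the helper names are hypothetical.

```python
def preorder_index(children, root):
    """Number the vertices of a tree in preorder (ancestors before descendants)."""
    order, stack = {}, [root]
    while stack:
        v = stack.pop()
        order[v] = len(order)
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(children.get(v, [])))
    return order

def may_cover(order, v, w):
    """Cover(v, w) is only attempted when w comes strictly before v in the order."""
    return order[w] < order[v]

# Hypothetical unwinding shape: 0 -> 1 -> {2, 3}, 3 -> 4 -> {5, 6}.
children = {0: [1], 1: [2, 3], 3: [4], 4: [5, 6]}
order = preorder_index(children, 0)

print(order)
print(may_cover(order, 5, 1), may_cover(order, 1, 5))
```

Because every covering arc points backwards in this order, a cover can only invalidate arcs whose source is later in the order, which bounds the number of covering steps.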
Refinement Step • Label an error vertex False by refining the path to that vertex with an interpolant for that path. • By refining with interpolants, we avoid predicate image computation. • Refinement may remove covers. • [Figure: an unwinding with edges x = 0, [x≠y], [x=y], y++, y=2, [y=0]; labels T along the error path are refined to x=0, y=0, y≠0 and F, and a covering arc is removed (X).]
The Complete Algorithm
• A vertex v is said to be closed if either it is covered or no covering arc (v,w) can be added to C (while maintaining well-labeledness).
• Procedure Close(v in V)
    for all w in V s.t. w ◄ v and Mv(v) = Mv(w):
      Cover(v,w)
• Procedure DFS(v in V)
    Close(v)
    if v is uncovered then
      if Mv(v) = lf then
        Refine(v);
        for all w ≤ v: Close(w)
      Expand(v);
      for all children w of v: DFS(w)
The Complete Algorithm (2)
• Procedure Unwind()
    V = {ε}, E = ∅, ψ(ε) = True, C = ∅
    while there exists an uncovered leaf v in V:
      for all w in V s.t. w ≤ v: Close(w);
      DFS(v);
• Theorem: If procedure Unwind terminates without aborting on a program A, then A is safe.
• Proof: Expand, Refine and Cover are the only procedures that alter the unwinding, and all of them preserve well-labeledness, so the resulting unwinding is well-labeled.
• All error vertices are refined ⇒ the unwinding is safe.
• Terminates when there are no more uncovered leaves ⇒ the unwinding is complete.
Something about Interpolants • A(X,Y) Λ B(Y,Z) ≡ FALSE • There exists I(Y) such that • A(X,Y) ⇒ I(Y) • I(Y) Λ B(Y,Z) ≡ FALSE • The “best” (strongest) interpolant: ∃X. A(X,Y) • Interpolation is an existential quantification
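For propositional formulas the link to existential quantification can be made concrete by enumeration: project the models of A onto the shared variables Y. The formulas A, B and J and the function name `strongest_interpolant` below are hypothetical examples, not part of the original slides.

```python
from itertools import product

def envs(names):
    """All truth assignments over the given variable names."""
    names = sorted(names)
    return [dict(zip(names, vals))
            for vals in product([False, True], repeat=len(names))]

def strongest_interpolant(A, X, Y):
    """Return I(Y) = exists X. A(X, Y) as a predicate over assignments."""
    models = {frozenset(ey.items())
              for ey in envs(Y)
              if any(A({**ex, **ey}) for ex in envs(X))}
    return lambda e: frozenset((y, e[y]) for y in Y) in models

X, Y, Z = {"x"}, {"y1", "y2"}, {"z"}
A = lambda e: e["x"] and e["y1"]
B = lambda e: (not e["y1"]) and (not e["y2"]) and e["z"]

I = strongest_interpolant(A, X, Y)
J = lambda e: e["y1"] or e["y2"]   # another, weaker interpolant for (A, B)

for e in envs(X | Y | Z):
    assert (not A(e)) or I(e)      # A implies I
    assert not (I(e) and B(e))     # I /\ B is unsatisfiable
    assert (not I(e)) or J(e)      # I implies the other interpolant J
print("I is equivalent to y1:", all(I(e) == e["y1"] for e in envs(Y)))
```

Quantifying X away keeps everything A says about Y and nothing more, which is why every other interpolant, such as J here, is implied by it.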