Integrating Arithmetic Constraint-Based Verification and Shape Analysis

Integrating Arithmetic Constraint Based Verification and Shape Analysis Tevfik Bultan Joint work with Tuba Yavuz-Kahveci Department of Computer Science University of California, Santa Barbara bultan@cs.ucsb.edu http://www.cs.ucsb.edu/~bultan/composite

Motivation • Concurrent programming is difficult and error prone • Sequential programming: states of the variables • Concurrent programming: states of the variables and the processes • Linked list manipulation is difficult and error prone • States of the heap: possibly infinite • We would like to guarantee properties of a concurrent linked list implementation

More Specific Problem • There has been work on verification of concurrent systems with integer variables (and linear constraints) • [Boigelot 98], [Bultan, Gerber and Pugh, TOPLAS 99], [Delzanno and Podelski, STTT 01] • There has been work on verification of (concurrent) linked lists • [Sagiv,Reps, Wilhelm TOPLAS 98], [Yahav POPL 01] • What can we do for concurrent systems: • where both integer and heap variables influence the control flow • or the properties we wish to verify involve both integer and heap variables?

Our Approach • Use symbolic verification techniques • Use polyhedra to represent the states of the integer variables • Use BDDs to represent the states of the boolean and enumerated variables • Use shape graphs to represent the states of the heap • Use a composite representation to combine them • Use forward-fixpoint computations to compute reachable states • Truncated fixpoint computations can be used to detect errors • Over-approximation techniques can be used to prove properties • Polyhedra widening • Summarization in shape graphs

Action Language Tool Set • Students who work with me on this project: Tuba Yavuz-Kahveci, Constantinos Bartzis, Aysu Betin-Can • [Figure: tool set architecture. Labels: Action Language Specification; Guarded Commands; Translator to Action Language; Action Language Parser; Action Language Verifier; Code Generator; Composite Symbolic Library; Omega Library (Presburger Arithmetic Manipulator); CUDD Package (BDD Manipulator); MONA (Automata Manipulator); Verified code (Java monitor classes)]

Related Publications • Composite Symbolic Library, Integration of polyhedra representation with BDDs • [Yavuz-Kahveci, Tuncer, Bultan, TACAS 01], [Yavuz-Kahveci, Bultan, STTT] • Action Language Verifier • [Bultan ICSE 00], [Bultan, Yavuz-Kahveci ASE 01] • Verification of Concurrency Control Components using Action Language Verifier • [Yavuz-Kahveci, Bultan ISSTA 02] • Using automata representation for Presburger arithmetic in Composite Symbolic Library • [Bartzis and Bultan, CIAA 02], [Bartzis and Bultan, IJFCS] [Bartzis, Bultan CAV 03]

Outline • Specification of concurrent linked lists • Action Language • Symbolic verification • Composite representation • Approximation techniques • Summarization • Widening • Counting abstraction • Experimental results • Related Work • Conclusions

Action Language [Bultan ICSE 00] [Yavuz-Kahveci, Bultan ASE 01] • A state based language • Actions correspond to state changes • States correspond to valuations of variables • Integer (possibly unbounded), heap, boolean and enumerated variables • Parameterized constants are allowed • Transition relation is defined using actions • Atomic actions: Predicates on current and next state variables • Action composition: synchronous (&) or asynchronous (|) • Modular • Modules can have submodules • Properties to be verified • Invariant(p) : p always holds

Composite Formulas: State Formulas • We use state formulas to express the properties we need to check • No primed variables in state formulas • State formulas are boolean combinations (∧, ∨, ¬, ⇒, ⇔) of integer, boolean and heap formulas • Example: numItems > 2 => top.next != null, where numItems > 2 is an integer formula and top.next != null is a heap formula

State formulas • Boolean formulas • Boolean variables and constants (true, false) • Relational operators: =, ≠ • Boolean connectives (∧, ∨, ¬, ⇒, ⇔) • Integer formulas (linear arithmetic) • Integer variables and constants • Arithmetic operators: +, −, and * with a constant • Relational operators: =, ≠, >, <, ≤, ≥ • Boolean connectives (∧, ∨, ¬, ⇒, ⇔) • Heap formulas • Heap variable, heap-variable.selector, heap constant null • Relational operators: =, ≠ • Boolean connectives (∧, ∨, ¬, ⇒, ⇔)

Composite Formulas: Transition Formulas • We use transition formulas to express the actions • In transition formulas, primed variables denote the next-state values and unprimed variables denote the current-state values • Example: pc=l2 and numItems=0 and top’=add and numItems’=1 and pc’=l3; here pc, numItems and add are current-state variables, and top’, numItems’ and pc’ are next-state variables

Transition Formulas • Transition formulas are in the form: • boolean-formula ∧ integer-formula ∧ heap-transition-formula • Heap transition formulas are in the form: • guard-formula ∧ update-formula

Heap Transition Formulas • A guard formula is a boolean combination of terms in the form: id1 = id2, id1 ≠ id2, id1.f = id2, id1.f ≠ id2, id1.f = id2.f, id1.f ≠ id2.f, id1 = null, id1 ≠ null, id1.f = null, id1.f ≠ null • An update formula is a term in the form: id’1 = id2, id’1 = id2.f, id’1.f = id2, id’1.f = id2.f, id’1 = null, id’1.f = null, id’1 = new, id’1.f = new

Stack Example
• Variable declarations define the state space of the system
• Initial states
• Atomic actions: primed variables denote the next-state variables
• Transition relation of the push module is defined as the asynchronous composition of its atomic actions

module main()
  heap {next} top, add, get, newTop;
  boolean mutex;
  integer numItems;
  initial: top=null and mutex and numItems=0;
  module push()
    enumerated pc {l1, l2, l3, l4};
    initial: pc=l1 and add=null;
    push1: pc=l1 and mutex and !mutex’ and add’=new and pc’=l2;
    push2: pc=l2 and top=null and top’=add and numItems’=1 and pc’=l3;
    push3: pc=l3 and top’.next=null and mutex’ and pc’=l1;
    push4: pc=l2 and top!=null and add’.next=top and pc’=l4;
    push5: pc=l4 and top’=add and numItems’=numItems+1 and mutex’ and pc’=l1;
    push: push1 | push2 | push3 | push4 | push5;
  endmodule

Stack (Cont’d)
• Transition relation of main is defined as the asynchronous composition of two pop and two push processes
• Invariants to be verified

  module pop()
    enumerated pc {l1, l2, l3};
    initial: pc=l1 and get=null and newTop=null;
    pop1: pc=l1 and mutex and top!=null and newTop’=top.next and !mutex’ and pc’=l2;
    pop2: pc=l2 and get’=top and pc’=l3;
    pop3: pc=l3 and top’=newTop and mutex’ and numItems’=numItems-1 and pc’=l1;
    pop: pop1 | pop2 | pop3;
  endmodule
  main: pop() | pop() | push() | push();
  spec: invariant([mutex => (numItems=0 <=> top=null)])
  spec: invariant([mutex => (numItems>2 => top.next!=null)])
endmodule

Stack (with integer guards)

module main()
  heap {next} top, add, get, newTop;
  boolean mutex;
  integer numItems;
  initial: top=null and mutex and numItems=0;
  module push()
    enumerated pc {l1, l2, l3, l4};
    initial: pc=l1 and add=null;
    push1: pc=l1 and mutex and !mutex’ and add’=new and pc’=l2;
    push2: pc=l2 and numItems=0 and top’=add and numItems’=1 and pc’=l3;
    push3: pc=l3 and add’.next=null and mutex’ and pc’=l1;
    push4: pc=l2 and numItems>0 and add’.next=top and pc’=l4;
    push5: pc=l4 and top’=add and numItems’=numItems+1 and mutex’ and pc’=l1;
    push: push1 | push2 | push3 | push4 | push5;
  endmodule

Outline • Specification of concurrent linked lists • Action Language • Symbolic verification • Composite representation • Approximation techniques • Summarization • Widening • Counting abstraction • Experimental results • Related Work • Conclusions

Symbolic Verification: Forward Fixpoint • Forward fixpoint for the reachable states can be computed by iteratively manipulating symbolic representations • We need forward-image (post-condition), union, and equivalence check computations

ReachableStates(I: Set of initial states, T: Transition relation) {
  RS := I;
  repeat {
    RSold := RS;
    RS := RSold ∪ forwardImage(RSold, T);
  } until (RS ≡ RSold)
}
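A minimal explicit-state sketch of the fixpoint loop, assuming plain Python sets stand in for the symbolic representations (the tool itself manipulates BDDs, polyhedra and shape graphs). The names `forward_image` and `reachable_states`, and the `max_iterations` cut-off, are illustrative and not part of the Action Language Verifier.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Optional, Tuple

State = Hashable


def forward_image(states: FrozenSet[State],
                  transition: Callable[[State], Iterable[State]]) -> FrozenSet[State]:
    """Post-condition: every state reachable from `states` in one step."""
    return frozenset(nxt for s in states for nxt in transition(s))


def reachable_states(initial: Iterable[State],
                     transition: Callable[[State], Iterable[State]],
                     max_iterations: Optional[int] = None) -> Tuple[FrozenSet[State], bool]:
    """Iterate RS := RS ∪ forwardImage(RS, T) until nothing changes.

    Returns (states, converged). With max_iterations set, the loop is a
    truncated fixpoint and may stop before convergence.
    """
    rs = frozenset(initial)
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        rs_old = rs
        rs = rs_old | forward_image(rs_old, transition)
        iterations += 1
        if rs == rs_old:  # equivalence check
            return rs, True
    return rs, False


# Finite example: a counter that adds 2 modulo 5.
finite, finite_done = reachable_states({0}, lambda n: [(n + 2) % 5])

# Infinite example: an unbounded counter never converges, so we truncate.
truncated, truncated_done = reachable_states({0}, lambda n: [n + 1], max_iterations=3)
```

The second call illustrates why an unbounded integer variable prevents convergence: every iteration adds a new state, so only a cut-off ends the loop.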

Symbolic Verification: Symbolic Representations • We use symbolic representations for encoding sets of states • Boolean logic formulas (stored as BDDs) represent the sets of states of the boolean variables: pc=l1 ∧ mutex • Presburger arithmetic formulas (stored as polyhedra) represent the sets of states of integer variables: numItems > 0

Symbolic Representation: Shape Graphs • Sets of shape graphs represent the states of the heap variables and the heap • Each node in the shape graph represents a dynamically allocated memory location • Heap variables point to nodes of the shape graph (if they are not null) • The edges between the nodes show the locations pointed to by the fields of the nodes • Example [Figure: shape graph with nodes n1 and n2 and a next edge from n1 to n2]: heap variables add and top point to node n1 • add.next is node n2 • top.next is also node n2 • add.next.next is null

Composite Representation • Each variable type is mapped to a symbolic representation type • Boolean and enumerated types → BDD representation • Integer variables → Polyhedra • Heap variables → Shape graphs • Each conjunct in a transition formula operates on a single symbolic representation • Composite representation: A disjunctive representation to combine different symbolic representations • Union, subsumption check and forward-image computations are performed on this disjunctive representation

Composite Representation • A composite representation A is a disjunction $A = \bigvee_{i=1}^{n} \bigwedge_{j=1}^{t} a_{ij}$ where • n is the number of composite atoms in A • t is the number of basic symbolic representations • Each composite atom is a conjunction • Each conjunct corresponds to a different symbolic representation

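Composite Representation: Example
• Each composite atom conjoins a BDD, a set of polyhedra, and a set of shape graphs over add and top (the shape graphs were drawn as figures and are not reproduced here)
  (pc=l1 ∧ mutex ∧ numItems=2 ∧ [shape graph])
  ∨ (pc=l2 ∧ ¬mutex ∧ numItems=2 ∧ [shape graph])
  ∨ (pc=l4 ∧ ¬mutex ∧ numItems=2 ∧ [shape graph])
  ∨ (pc=l1 ∧ mutex ∧ numItems=3 ∧ [shape graph])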

Composite Symbolic Library [Yavuz-Kahveci, Tuncer, Bultan TACAS01], [Yavuz-Kahveci, Bultan STTT] • Composite Library implements this approach using an object-oriented design • An abstract class defines the common interface for symbolic representations • Easy to extend with new symbolic representations • Enables polymorphic verification • As a BDD library we use Colorado University Decision Diagram Package (CUDD) [Somenzi et al] • As an integer constraint manipulator we use Omega Library [Pugh et al] • For encoding the states of the heap variables and the heap we use shape graphs encoded as BDDs (using CUDD)

Composite Symbolic Library: Class Diagram
• Symbolic (abstract interface): +union(), +isSatisfiable(), +isSubset(), +forwardImage()
• BoolSym: –representation: BDD; +union() ... (uses CUDD Library)
• IntSym: –representation: list of Polyhedra; +union() ... (uses OMEGA Library)
• HeapSym: –representation: list of ShapeGraph; +union() ...
• CompSym: –representation: list of compAtom; +union() ...
• compAtom: –atom: *Symbolic
• ShapeGraph: –atom: *Symbolic

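Satisfiability Checking for the Composite Representation • Given a composite representation $A = \bigvee_{i=1}^{n} \bigwedge_{j=1}^{t} a_{ij}$ • We can check satisfiability as follows: $\mathit{isSatisfiable}(A) = \bigvee_{i=1}^{n} \bigwedge_{j=1}^{t} \mathit{isSatisfiable}(a_{ij})$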
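A small sketch of the satisfiability check on a disjunction of composite atoms, assuming each basic symbolic representation is replaced by an explicit `frozenset` of values (empty means unsatisfiable). The function names are hypothetical; the real library dispatches to CUDD and the Omega Library behind a common interface.

```python
from typing import FrozenSet, Hashable, Sequence, Tuple

# One conjunct per basic representation, e.g. (boolean part, integer part, heap part).
CompositeAtom = Tuple[FrozenSet[Hashable], ...]


def atom_is_satisfiable(atom: CompositeAtom) -> bool:
    """A composite atom is a conjunction: every conjunct must be satisfiable."""
    return all(len(part) > 0 for part in atom)


def composite_is_satisfiable(atoms: Sequence[CompositeAtom]) -> bool:
    """A composite representation is a disjunction: one satisfiable atom suffices."""
    return any(atom_is_satisfiable(atom) for atom in atoms)


# First atom has an empty integer part, second atom is fully satisfiable.
dead_atom = (frozenset({("l1", True)}), frozenset(), frozenset({"g0"}))
live_atom = (frozenset({("l2", False)}), frozenset({2}), frozenset({"g1"}))
```

Because the conjuncts of an atom range over disjoint variable types, checking each conjunct separately is enough; no cross-representation reasoning is needed in this toy setting.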

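Forward Image Computation for the Composite Representation • Given composite representations for a set of states and a transition relation: $A = \bigvee_{i=1}^{n_A} \bigwedge_{j=1}^{t} a_{ij}$ and $R = \bigvee_{k=1}^{n_R} \bigwedge_{j=1}^{t} r_{kj}$ • We can compute the forward image as follows: $\mathit{forwardImage}(A, R) = \bigvee_{i=1}^{n_A} \bigvee_{k=1}^{n_R} \bigwedge_{j=1}^{t} \mathit{forwardImage}(a_{ij}, r_{kj})$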
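The componentwise forward image can be sketched as follows, assuming explicit sets again: a state conjunct is a `frozenset` of values and a transition conjunct is a `frozenset` of (current, next) pairs. The helper names and the string labels used for heap shapes are illustrative assumptions, not the library's API.

```python
from typing import FrozenSet, Hashable, List, Sequence, Tuple

StatePart = FrozenSet[Hashable]
RelationPart = FrozenSet[Tuple[Hashable, Hashable]]


def post_part(states: StatePart, relation: RelationPart) -> StatePart:
    """Forward image within a single basic representation."""
    return frozenset(nxt for cur, nxt in relation if cur in states)


def post_composite(state_atoms: Sequence[Tuple[StatePart, ...]],
                   transition_atoms: Sequence[Tuple[RelationPart, ...]]
                   ) -> List[Tuple[StatePart, ...]]:
    """Pair every state atom with every transition atom, conjunct by conjunct."""
    result = []
    for a in state_atoms:
        for r in transition_atoms:
            image = tuple(post_part(a_j, r_j) for a_j, r_j in zip(a, r))
            if all(image):  # drop atoms with an empty (unsatisfiable) conjunct
                result.append(image)
    return result


# A step shaped like push5: pc=l4 -> l1, mutex released, numItems incremented, top'=add.
states = [(frozenset({("l4", False)}), frozenset({2}), frozenset({"add_before_top"}))]
push5 = [(
    frozenset({(("l4", False), ("l1", True))}),
    frozenset((n, n + 1) for n in range(10)),
    frozenset({("add_before_top", "add_is_top")}),
)]
after_push5 = post_composite(states, push5)
```

The split works only because each conjunct of a transition formula operates on a single symbolic representation, as stated on the Composite Representation slide.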

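Forward-Image Computation: Example
• Set of states: pc=l4 ∧ ¬mutex ∧ numItems=2 ∧ [shape graph over add and top]
• Transition relation: pc=l4 and mutex’ and pc’=l1 (boolean part), numItems’=numItems+1 (integer part), top’=add (heap part)
• Forward image: pc=l1 ∧ mutex ∧ numItems=3 ∧ [shape graph over add and top]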

Forward-Fixpoint Computation (Repeatedly Applies Forward-Image)
• Each composite atom also carries a shape graph over add and top (drawn as a figure, not reproduced here)
  (pc=l1 ∧ mutex ∧ numItems=0 ∧ [shape graph])
  ∨ (pc=l2 ∧ ¬mutex ∧ numItems=0 ∧ [shape graph])
  ∨ (pc=l3 ∧ ¬mutex ∧ numItems=1 ∧ [shape graph])
  ∨ (pc=l1 ∧ mutex ∧ numItems=1 ∧ [shape graph])
  ∨ (pc=l2 ∧ ¬mutex ∧ numItems=1 ∧ [shape graph])
  ∨ (pc=l4 ∧ ¬mutex ∧ numItems=1 ∧ [shape graph])
  ∨ (pc=l1 ∧ mutex ∧ numItems=2 ∧ [shape graph])
  ∨ (pc=l2 ∧ ¬mutex ∧ numItems=2 ∧ [shape graph])
  ∨ (pc=l4 ∧ ¬mutex ∧ numItems=2 ∧ [shape graph])
  ∨ (pc=l4 ∧ ¬mutex ∧ numItems=3 ∧ [shape graph])
  ∨ . . .

Forward-Fixpoint does not Converge • We have two reasons for non-termination • integer variables can increase without a bound • the number of nodes in the shape graphs can increase without a bound • The state space is infinite • Even if we ignore the heap variables, reachability is undecidable when we have unbounded integer variables • So, we use conservative approximations

Outline • Specification of concurrent linked lists • Action Language • Symbolic verification • Composite representation • Approximation techniques • Summarization • Widening • Counting Abstraction • Experimental results • Related Work • Conclusions

Conservative Approximations • To verify or falsify a property p • Compute a lower (RS⁻) or an upper (RS⁺) approximation to the set of reachable states RS, so that RS⁻ ⊆ RS ⊆ RS⁺ • There are three possibilities: • RS⁺ ⊆ p, and therefore RS ⊆ p: “The property is satisfied” • RS⁻ contains reachable states which violate the property p: “The property is false” • RS⁻ ⊆ p but RS⁺ contains states outside p: “I don’t know”
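The three outcomes can be written as a small decision procedure. This is a sketch over explicit finite sets, assuming `lower` under-approximates and `upper` over-approximates the reachable states; the function name `verdict` is made up for illustration.

```python
from typing import Callable, Hashable, Iterable


def verdict(lower: Iterable[Hashable],
            upper: Iterable[Hashable],
            prop: Callable[[Hashable], bool]) -> str:
    """Classify property `prop` using lower and upper approximations of RS."""
    # Every state in the lower approximation is truly reachable,
    # so a violation found there is a real counterexample.
    if any(not prop(s) for s in lower):
        return "The property is false"
    # The upper approximation contains all reachable states,
    # so if it satisfies the property, the reachable states do too.
    if all(prop(s) for s in upper):
        return "The property is satisfied"
    # Violations appear only in states that may be spurious.
    return "I don't know"


lower_rs = {0, 1}
upper_rs = {0, 1, 2, 3}
```

For example, with these sets the property `x < 5` is proved, `x < 1` is refuted, and `x < 3` stays inconclusive because the only violating state, 3, might not be reachable.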

Computing Upper and Lower Bounds for Reachable States • Truncated fixpoint computation • To compute a lower bound for a least-fixpoint computation • Stops after a fixed number of iterations • Widening • To compute an upper bound for the least-fixpoint computation • We use a generalization of the polyhedra widening operator by [Cousot and Halbwachs POPL’78] • Summarization • Generate summary nodes in the shape graphs which represent more than one concrete node • Materialization: we need to generate concrete nodes from the summary nodes when needed

Summarization • The nodes that form a chain are mapped to a summary node • No heap variable points to any concrete node that is mapped to a summary node • Each concrete node mapped to a summary node is pointed to only by a concrete node which is also mapped to the same summary node • During summarization, we also introduce an integer variable which counts the number of concrete nodes mapped to a summary node ...

Summarization Example
• Before summarization: pc=l1 ∧ mutex ∧ numItems=3 ∧ [shape graph over add and top, with the nodes to be summarized marked]
• After summarization, it becomes: pc=l1 ∧ mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph over add and top, with a summary node]
• summaryCount is a new integer variable representing the number of concrete nodes encoded by the summary node
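A simplified sketch of chain summarization for a singly linked list, assuming the heap is given as a `next` map and that a node may be summarized when no heap variable points to it and it has exactly one predecessor. This is one plausible reading of the conditions on the slides, not the tool's actual algorithm; `summarize` and the `summary(...)` node names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def summarize(variables: Dict[str, Optional[str]],
              next_of: Dict[str, Optional[str]]
              ) -> Tuple[Dict[str, Optional[str]], Dict[str, int]]:
    """Collapse each maximal chain of unnamed nodes into one summary node.

    Returns the new `next` map and a count of concrete nodes per summary node.
    """
    pointed = {n for n in variables.values() if n is not None}
    preds: Dict[str, List[str]] = {}
    for node, succ in next_of.items():
        if succ is not None:
            preds.setdefault(succ, []).append(node)

    def can_summarize(node: str) -> bool:
        return node not in pointed and len(preds.get(node, [])) == 1

    rep: Dict[str, str] = {}     # concrete node -> summary node
    counts: Dict[str, int] = {}  # summary node -> number of concrete nodes
    for node in next_of:
        # Start only at the head of a chain: summarizable, predecessor is not.
        if not can_summarize(node) or can_summarize(preds[node][0]):
            continue
        summary = f"summary({node})"
        cur, count = node, 0
        while cur is not None and can_summarize(cur):
            rep[cur] = summary
            count += 1
            cur = next_of[cur]
        counts[summary] = count

    new_next: Dict[str, Optional[str]] = {}
    for node, succ in next_of.items():
        src = rep.get(node, node)
        dst = None if succ is None else rep.get(succ, succ)
        if src in counts and dst == src:
            continue  # edge internal to a summary node
        new_next[src] = dst
    return new_next, counts


# The stack with three items: add and top point to n1, followed by n2 and n3.
heap_vars = {"add": "n1", "top": "n1"}
heap_next = {"n1": "n2", "n2": "n3", "n3": None}
summarized_next, summary_counts = summarize(heap_vars, heap_next)
```

On this input the two unnamed nodes collapse into one summary node with count 2, matching summaryCount=2 for numItems=3 in the example.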

Summarization • Summarization guarantees that the number of different shape graphs that can be generated is finite • However, the summary counts can still increase without bound • We use the polyhedral widening operation to force the fixpoint computation to converge

Let’s Continue the Forward-fixpoint
• Each composite atom also carries a shape graph over add and top (drawn as a figure, not reproduced here)
  (pc=l1 ∧ mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph])
  ∨ (pc=l2 ∧ ¬mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph])
  ∨ (pc=l4 ∧ ¬mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph])
  ∨ (pc=l1 ∧ mutex ∧ numItems=4 ∧ summaryCount=2 ∧ [shape graph])
• We need to do summarization again

Summarization
• Before: pc=l1 ∧ mutex ∧ numItems=4 ∧ summaryCount=2 ∧ [shape graph over add and top]
• After summarization, it becomes: pc=l1 ∧ mutex ∧ numItems=4 ∧ summaryCount=3 ∧ [shape graph over add and top]

Simplification
• After each fixpoint iteration we try to merge as many composite atoms as possible
• For example, the following composite atoms can be merged:
  pc=l1 ∧ mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph over add and top]
  pc=l1 ∧ mutex ∧ numItems=4 ∧ summaryCount=3 ∧ [shape graph over add and top]

Simplification
  (pc=l1 ∧ mutex ∧ numItems=3 ∧ summaryCount=2 ∧ [shape graph])
  ∨ (pc=l1 ∧ mutex ∧ numItems=4 ∧ summaryCount=3 ∧ [shape graph])
  = pc=l1 ∧ mutex ∧ ((numItems=4 ∧ summaryCount=3) ∨ (numItems=3 ∧ summaryCount=2)) ∧ [shape graph]

Simplification on the integer part
  pc=l1 ∧ mutex ∧ ((numItems=4 ∧ summaryCount=3) ∨ (numItems=3 ∧ summaryCount=2)) ∧ [shape graph]
  = pc=l1 ∧ mutex ∧ numItems=summaryCount+1 ∧ 3 ≤ numItems ∧ numItems ≤ 4 ∧ [shape graph]

Widening • Forward-fixpoint computation still will not converge since numItems and summaryCount keep increasing without a bound • We use the widening operation: • Given two composite atoms c1 and c2 in consecutive fixpoint iterates, assume that c1 = b1 ∧ i1 ∧ h1 and c2 = b2 ∧ i2 ∧ h2, where b1 = b2 and h1 = h2 and i1 ⊆ i2 • Also assume that i1 is a single polyhedron (i.e. a conjunction of arithmetic constraints) and i2 is also a single polyhedron

Widening • Then • i1 ∇ i2 is defined as: all the constraints in i1 which are also satisfied by i2 • Replace i2 with i1 ∇ i2 in c2 • This generates an upper approximation to the forward-fixpoint computation

Widening Example
  (pc=l1 ∧ mutex ∧ numItems=summaryCount+1 ∧ 3 ≤ numItems ∧ numItems ≤ 4 ∧ [shape graph])
  ∇ (pc=l1 ∧ mutex ∧ numItems=summaryCount+1 ∧ 3 ≤ numItems ∧ numItems ≤ 5 ∧ [shape graph])
  = pc=l1 ∧ mutex ∧ numItems=summaryCount+1 ∧ 3 ≤ numItems ∧ [shape graph]
• Now, the forward-fixpoint converges
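The rule "keep the constraints of i1 that i2 also satisfies" can be sketched on the example values. As a simplifying assumption, i1 is given as a list of linear inequalities and i2 as the vertices of a bounded polytope, so a constraint holds on all of i2 exactly when it holds at every vertex. The names `satisfies` and `widen` are illustrative; the tool relies on the Omega Library for the real polyhedral operations.

```python
from typing import Dict, List, Tuple

# A constraint (coeffs, bound) means: sum(coeffs[v] * value[v]) <= bound.
Constraint = Tuple[Dict[str, int], int]
Point = Dict[str, int]


def satisfies(point: Point, constraint: Constraint) -> bool:
    coeffs, bound = constraint
    return sum(c * point[v] for v, c in coeffs.items()) <= bound


def widen(i1: List[Constraint], i2_vertices: List[Point]) -> List[Constraint]:
    """Keep each constraint of i1 that every vertex of i2 satisfies."""
    return [c for c in i1 if all(satisfies(p, c) for p in i2_vertices)]


# i1: numItems = summaryCount + 1 (as two inequalities), 3 <= numItems <= 4.
i1 = [
    ({"numItems": 1, "summaryCount": -1}, 1),    # numItems - summaryCount <= 1
    ({"numItems": -1, "summaryCount": 1}, -1),   # numItems - summaryCount >= 1
    ({"numItems": -1}, -3),                      # numItems >= 3
    ({"numItems": 1}, 4),                        # numItems <= 4
]
# i2: the segment from (3, 2) to (5, 4), i.e. the same relation with numItems <= 5.
i2_vertices = [
    {"numItems": 3, "summaryCount": 2},
    {"numItems": 5, "summaryCount": 4},
]
widened = widen(i1, i2_vertices)
```

Only the upper bound on numItems is violated by i2, so it is the one constraint dropped; the result no longer grows in later iterations.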

Dealing with Arbitrary Number of Processes • Use counting abstraction [Delzanno CAV’00] • Create an integer variable for each local state of a process • Each variable will count the number of processes in a particular state • Local states of the processes have to be finite • Shared variables of the monitor can be unbounded • Counting abstraction can be automated

Stack After Counting Abstraction
• Variables for counting the number of processes in each state
• Parameterized constant representing the number of processes
• Initialize the initial state counter to the number of processes. Initialize the other counters to 0.
• When the local state changes, decrement the current local state counter and increment the next local state counter

module main()
  heap top, add, get, newTop;
  boolean mutex;
  integer numItems;
  integer l1C, l2C, l3C, l4C;
  parameterized integer numProc;
  initial: top=null and mutex and numItems=0 and l1C=numProc and l2C=0 and l3C=0 and l4C=0;
  restrict: numProc>0;
  module push()
    //enumerated pc {l1, l2,l3,l4};
    initial: add=null;
    push1: l1C>0 and mutex and !mutex' and add'=new and l1C'=l1C-1 and l2C'=l2C+1;
    push2: l2C>0 and top=null and top'=add and numItems'=1 and l2C'=l2C-1 and l3C'=l3C+1;
    ...
    push: push1 | push2 | push3 | push4 | push5;
  endmodule
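The counter updates can be illustrated with a tiny sketch, assuming the counters live in a dictionary keyed by local state. The function `counting_step` is hypothetical; it mirrors the guard `l1C>0` and the updates `l1C'=l1C-1`, `l2C'=l2C+1` from push1, and ignores the shared variables.

```python
from typing import Dict, Optional


def counting_step(counters: Dict[str, int], src: str, dst: str) -> Optional[Dict[str, int]]:
    """Move one process from local state `src` to `dst`, if any process is in `src`."""
    if counters[src] <= 0:
        return None  # guard fails: no process is in the source state
    updated = dict(counters)
    updated[src] -= 1
    updated[dst] += 1
    return updated


num_proc = 3  # stands in for the parameterized constant numProc
initial_counters = {"l1": num_proc, "l2": 0, "l3": 0, "l4": 0}
after_push1 = counting_step(initial_counters, "l1", "l2")
```

Each step preserves the sum of the counters, which stays equal to the number of processes; the verifier treats that number symbolically rather than fixing it as this sketch does.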

Verified Properties
