440 likes | 583 Views
Abstraction and Modular Reasoning for the Verification of Software. Corina Pasareanu, October, 2001. Thesis Committee: Matthew Dwyer, Major Advisor David Schmidt Michael Huth George Strecker Kenneth Kemp, Outside Chairperson. OK. Finite-state system. Verification tool. or.
E N D
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu, October, 2001 Thesis Committee: Matthew Dwyer, Major Advisor David Schmidt Michael Huth George Strecker Kenneth Kemp, Outside Chairperson
OK Finite-state system Verification tool or Error trace (F W) Line 5: … Line 12: … Line 15:… Line 21:… Line 25:… Line 27:… … Line 41:… Line 47:… Specification Finite-state Verification
Finite-state Verification of Software Techniques for checking correctness of concurrent programs: • Formal verification • That a program satisfies a complete specification • Dynamic methods/testing • That examine the outcome of executions of the program • That check assertions during executions • Finite state verification (FSV) • Automated (like testing) • Guarantees satisfaction of properties (like formal verification) • Useful for debugging (counter-examples) • FSV is a potential very useful technique for insuring high-quality software
Challenges: Solutions: Finite State Verification of Software Creation of tractable models … software has large/infinite state space Modular reasoning about software components … using tools that reason about complete systems Abstract interpretation … to reduce the data domains of a program Program completion … with code for missing components … use environment assumptions, assume-guarantee paradigm
Goals of our work … Develop multiple forms of tool support for … … abstraction and program completion … applicable to program source code … largely automated, usable by non-experts We concentrate on building … … finite-state models, not finite-state tools We evaluate the effectiveness of this tool support through… … implementation … application to real Java and Ada programs
Universal and synthesized environments Heuristics for abstraction selection yes no Property True! Choose-free search and guided simulation SPIN, SMV, JPF… General Methodology Partial Program, Property Y, Assumption F Program Completion Data Abstraction Refine Selection Abstracted Completed Program Translator Model of Program Counter-example Analysis Model Checking <F,Y> Property False!
Contributions • Integration of existing technologies into a coherent methodology that enables FSV of software systems • Development of new technologies: • Use of theorem proving to automatically derive abstractions of data types • Instrumentation of source code to enable assume-guarantee reasoning about partial programs • Two techniques for analysis of abstract counter-examples • Formalization and implementation of these technologies • Evaluation of technologies on case studies (Java, Ada)
Context • The Bandera toolset (KSU) • Collection of program analysis and transformation components • Allows users to analyze properties of Java programs • Tailor analysis to property to minimize analysis time • The Ada translation toolset • Java PathFinder (NASA Ames) • Model checker for Java programs • Built on top of a custom made Java Virtual Machine • Checks for deadlock and violations of assertions
Related Projects • The Microsoft Research SLAM project • G. Holzmann’s Fever Tool (Bell Labs) • Stoller’s stateless checker for Java programs • Godefroid’s Verisoft (Bell Labs) • David Dill’s Hardware Verification Group • Eran Yahav’ s Java checking tool
Outline of the Talk • Introduction • Data Abstraction (FSE’98, ICSE’01) • Program Completion • Abstract Counter-example Analysis • Formal Justification • Case Studies: Java and Ada Programs • Conclusions, Limitations and Future Work
Finite-state Verification • Effective for analyzing properties of hardware systems Widespread success andadoption in industry • Recent years have seen many efforts to apply those techniques to software Limited success due to the enormous state spaces associated with most software systems
symbolic state represents a set of states Original system Abstract system abstraction Safety: The set of behaviors of the abstract system over-approximates the set of behaviors of the original system Abstraction: the key to scaling up
int (n<0) : NEG (n==0): ZERO (n>0) : POS Signs Signs x = ZERO; if (Signs.eq(x,ZERO)) x = Signs.add(x,POS); NEG ZERO POS Data Type Abstraction Collapses data domains via abstract interpretation: Code Data domains int x = 0; if (x == 0) x = x + 1;
Our hypothesis (?) Abstraction of data domains is necessary Automated support for • Defining abstract domains (and operators) • Selecting abstractions for program components • Generating abstract program models • Interpreting abstract counter-examples will make it possible to • Scale property verification to realistic systems
Bandera Abstraction Specification Language PVS Concrete Type Abstract Type Inferred Type Abstraction Definition Variable x int y int done bool count int BASL Compiler …. o Object b Buffer Program Abstracted Program Abstract Code Generator Abstraction in Bandera Signs Signs Signs bool Abstraction Library int …. Point Buffer
Automatic Generation Example: Start safe, then refine:+(NEG,NEG)={NEG,ZERO,POS} Forall n1,n2: neg?(n1) and neg?(n2) implies not pos?(n1+n2) Proof obligations submitted to PVS... Forall n1,n2: neg?(n1) and neg?(n2) implies not zero?(n1+n2) Forall n1,n2: neg?(n1) and neg?(n2) implies not neg?(n1+n2) Definition of Abstractions in BASL operator + add begin (NEG , NEG) -> {NEG} ; (NEG , ZERO) -> {NEG} ; (ZERO, NEG) -> {NEG} ; (ZERO, ZERO) -> {ZERO} ; (ZERO, POS) -> {POS} ; (POS , ZERO) -> {POS} ; (POS , POS) -> {POS} ; (_,_) -> {NEG,ZERO,POS}; /* case (POS,NEG),(NEG,POS) */ end abstraction Signs abstracts int begin TOKENS = { NEG, ZERO, POS }; abstract(n) begin n < 0 -> {NEG}; n == 0 -> {ZERO}; n > 0 -> {POS}; end
Compiled Compiling BASL Definitions abstraction Signs abstracts int begin TOKENS = { NEG, ZERO, POS }; abstract(n) begin n < 0 -> {NEG}; n == 0 -> {ZERO}; n > 0 -> {POS}; end operator + add begin (NEG , NEG) -> {NEG} ; (NEG , ZERO) -> {NEG} ; (ZERO, NEG) -> {NEG} ; (ZERO, ZERO) -> {ZERO} ; (ZERO, POS) -> {POS} ; (POS , ZERO) -> {POS} ; (POS , POS) -> {POS} ; (_,_)-> {NEG, ZERO, POS}; /* case (POS,NEG), (NEG,POS) */ end public class Signs { public static final int NEG = 0; // mask 1 public static final int ZERO = 1; // mask 2 public static final int POS = 2; // mask 4 public static int abs(int n) { if (n < 0) return NEG; if (n == 0) return ZERO; if (n > 0) return POS; } public static int add(int arg1, int arg2) { if (arg1==NEG && arg2==NEG) return NEG; if (arg1==NEG && arg2==ZERO) return NEG; if (arg1==ZERO && arg2==NEG) return NEG; if (arg1==ZERO && arg2==ZERO) return ZERO; if (arg1==ZERO && arg2==POS) return POS; if (arg1==POS && arg2==ZERO) return POS; if (arg1==POS && arg2==POS) return POS; return Bandera.choose(7); /* case (POS,NEG), (NEG,POS) */ }
Data Type Abstractions • Library of abstractions for base types contains: • Range(i,j),i..j modeled precisely, e.g., Range(0,0) is the signs abstraction • Modulo(k), Set(v,…) • Point maps all concrete values to unknown • User extendable for base types • Array abstractions • Specified by an index abstraction and an element abstraction • Class abstractions • Specified by abstractions for each field
Comparison to Related Work • Predicate abstraction (Graf, Saidi) • We use PVS to abstract operator definitions, not complete systems • We can reuse abstractions for different systems • Tool support for program abstraction • e.g., SLAM, JPF, Feaver • Abstraction at the source-code level • Supports multiple checking tools • e.g., JPF, Java Checker/Verisoft, FLAVERS/Java, …
Outline of the Talk • Introduction • Data Abstraction • Program Completion (FSE’98, SPIN’99) • Abstract Counter-example Analysis • Formal Justification • Case Studies: Java and Ada Programs • Conclusions, Limitations and Future Work
Program Completion • Software systems are collections of software components • Approach similar to unit testing of software • Applied to components (units) that are code complete • Stubs and drivers simulate environment for unit • We complete a system with a source code representation for missing components • Universal environments • Environment assumptions used to refine definitions of missing components
Example of Universal Environment • The stub for an Ada partial program, with an Insert procedure and a Removefunction procedure stub is choice: Integer; theObject: Object_Type; begin loop case choice is when 1 => Insert(theObject); when 2 => theObject:=Remove; when 3 => null; when others => exit; end case; end loop; end stub;
Problem: check property Y vs. , assuming F System Linear temporal paradigm: 2nd approach: Synthesized environment F (LTL) Universal environment F -> Y vs. || System Y vs. || System (LTL) (LTL) (ACTL) Assume-guarantee Model Checking • A system specification is a pair <F,Y> • Y describes the property of the system • F describes the assumption about the environment
Synthesis of Environments • Environment assumptions F • Encode behavioral info. about system interfaces • Written in LTL • “Classical” tableau construction for LTL • Modified to assume that only one interface operation can occur at a time • Builds a model of F : • I.e., a graph with arcs labeled by interface operations • Can be easily translated into source code
0 Insert t Insert Insert t 9 8 t Remove Remove Insert 16 t Graph Generated from Assumption: (! Remove) U Insert
Synthesized Code when 9 => case choice is when 1 => Insert(theObject);state:=8; when 2 => null;state:=9; when others => exit; end case; when 16=> case choice is when 1 => Insert(theObject);state:=16; when 2 => theObject:=Remove;state:=16; when 3 => null; state:=16; when others => exit; end case; end case; end loop; end stub; procedure stub is state,choice: Integer; theObject: Object_Type; begin state:=0; loop case state is when 0 => case choice is when 1 => Insert(theObject);state:=8; when 2 => null;state:=9; when others => exit; end case; when 8 => case choice is when 1 => Insert(theObject);state:=8; when 2 => theObject:=Remove;state:=16; when 3 => null; state:=8; when others => exit; end case;
Comparison to Related Work • Much theoretical work on compositional verification • Avrunin, Dillon, Corbett: • Analysis of real-time systems, described as a mixture of source code and specifications • Colby, Godefroid, Jagadeesan: • Automatable approach to complete a partially specified system • Use the VeriSoft toolset • No environment assumptions • Long: • Synthesis of environments from ACTL
Outline of the Talk • Introduction • Data Abstraction • Program Completion • Abstract Counter-example Analysis (TACAS’01) • Formal Justification • Case Studies: Java and Ada Programs • Conclusions, Limitations and Future Work
Abstract Counter-example Analysis • For an abstracted program, a counter-example may be infeasible because: • Over-approximation introduced by abstraction • Example: x = -2; if(x + 2 == 0) then ... x = NEG; if(Signs.eq(Signs.add(x,POS),ZERO)) then ... {NEG,ZERO,POS}
Our Solutions • Choice-bounded State Space Search • “on-the-fly”, during model checking • Abstract Counter-example Guided Concrete Simulation • Exploit implementations of abstractions for Java programs • Effective in practice
Choose-free state space search • Theorem [Saidi:SAS’00] Every path in the abstracted program where all assignments are deterministic is a path in the concrete program. • Bias the model checker • to look only at paths that do not include instructions that introduce non-determinism • JPF model checker modified • to detect non-deterministic choice (i.e. calls to Bandera.choose()); backtrack from those points
State space searched Undetectable Violation Detectable Violation Choice-bounded Search choose() X X
Counter-example guided simulation (?) • Use abstract counter-example to guide simulation of concrete program • Why it works: • Correspondence between concrete and abstracted program • Unique initial concrete state (Java defines default initial values for all data)
Comparison to Related Work • Previous work: • After model checking; analyze the counter-example to see if it is feasible • Pre-image computations; theorem prover based (InVest) • Forward simulation (CMU) • Symbolic execution (SLAM)
Outline of the Talk • Introduction • Data Abstraction • Program Completion • Abstract Counter-example Analysis • Formal Justification • Case Studies: Java and Ada Programs • Conclusions, Limitations and Future Work
Formal Justification • Our techniques build safe abstractions of software systems P < P’ means “ P’ is a safe abstraction of P” • < is a simulation relation (Milner) • Properties are expressedin universal temporal logics (LTL, ACTL) preserved by < • Data Abstraction builds abstractions Pabs of systems P s.t. P < Pabs • Universal Environments U: P||E < P||U, for all environments E • Synthesized Environments TF from assumptions F: P||E < P||TF, for all environments E that satisfy F
Outline of the Talk • Introduction • Data Abstraction • Program Completion • Abstract Counter-example Analysis • Formal Justification • Case Studies: Java and Ada Programs • Conclusions, Limitations and Future Work
Case Studies • Filter-based Model Checking of a Reusable Parameterized Programming Framework (FSE’98) • Ada implementation of the Replicated Workers Framework • Model Checking Generic Container Implementations (GP’98) • Ada implementations of generic data structures: • queue, stack and priority queue • Abstracted using data type abstraction • Completed with code for universal environments • Environments’ behavior refined based on LTL assumptions • We used the SPIN tool: • To check 8/5 properties; 5/3 required use of assumptions • To detect seeded errors
Case Studies (contnd.) • Assume-guarantee Model Checking of Software: A Comparative Case Study (SPIN’99) • Ada implementations of software components: • The Replicated Workers Framework, the Generic Containers • The Gas Station, the Chiron Client • Out of 39 properties, 15 required use of assumptions • For < F,Y >, we compared two ways of completing systems: • Universal environment: check F->Y (LTL) • using SPIN • Synthesized environment from F(LTL): check Y (LTL/ACTL) • using SPIN/SMV • Synthesized environments enable faster verification
Case Studies (contnd.) • Analysis of the Honeywell DEOS Operating System (NASA’00,ICSE’01) • A real time operating system for integrated modular avionics • Non-trivial concurrent program (1433 lines of code, 20 classes, 6 threads) • Written in C++, translated into Java and Promela • With a known bug • Verification of the system exhausted 4 Gigabytes of memory without completion • Abstracted using data type abstraction • Completed with code for • User threads that run on the kernel • A system clock and a system timer • Environments’ behavior refined based on LTL assumptions • Checked using JPF and SPIN • Defect detected using choice-bounded search
Case Studies (contnd.) • Finding Feasible Counter-examples when Model Checking Abstracted Java Programs (TACAS’01) • We applied the 2 counter-example analysis techniques to Java defective applications: • Remote agent experiment, Pipeline, Readers-Writers, DEOS • Both techniques are fast: • Choose-free search is depth-bounded • Cost of simulation depends on the length of the counter-example • Choose-free counter-examples … • are common • are short • enable more aggressive abstraction
Summary • We integrated several technologies into a coherent methodology that enables FSV of complex, concurrent software systems • We developed technologies for… • Data type abstraction: • Enable non-experts to generate models that make FSV less costly • Facilities to define, select and generate abstracted programs • Program completion • Enable verification of properties of individual components, collections of components and entire systems • Take into account assumptions about environment behavior • Analysis of abstract counter-examples • Choose-free state space search • Counter-example guided simulation • We implemented and formalized these technologies • We evaluated them on extensive case studies (Ada and Java)
Future Work • Generalize choice-bounded search technique • CTL* model checking of abstracted programs • Extend abstraction support for objects • Heap abstractions tohandle an unbounded number of dynamically allocated objects • Extend automation • Automated selection and refinement of abstractions based on counter-example analysis
Conclusions • Tool support for: • Abstraction of data domains • Program completion • Interpretation of counter-examples … is essential for the verification of realistic software systems • Ensures the safety of the verification process