Finite State Verification: An Emerging Technology for Validating Software Systems. Lori A. Clarke, University of Massachusetts. Clarke@cs.umass.edu, http://laser.cs.umass.edu/. UMASS Laboratory for Advanced Software Engineering Research
Outline of Presentation • Lay of the Land: Testing, Theorem-proving based verification, Finite state verification (FSV) • Overview of FSV • Look at 3 Different Approaches to FSV: Model Checking, Flow Equations, Data Flow Analysis • Major Challenges to be Addressed
Sorry State of Affairs • Testing consumes about half the cost of s/w development • Maintenance consumes about 80% of the full life cycle costs--much of that devoted to testing • Most companies use ad hoc QA practices • Unhappy with the results; Unhappy with the cost • Failed projects • Delayed product releases
Testing • can: • Uncover failures • Show specifications are (not) met for specific test cases • Be an indication of overall reliability • cannot: • Prove that a program will/will not behave in a particular way
Must do better! • Increasing number of high assurance applications • Medical applications • Flight control software • Electronic commerce • Increasing number of complex systems • Systems of systems • Distributed systems
Distributed Systems • Better performance, better flexibility, but there is a cost • Distributed systems are more difficult to test than sequential systems • Number of execution paths can grow exponentially with the number of processes • Testing cannot even demonstrate that a system works on the selected/executed test data
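To give a feel for the exponential growth in execution paths, here is a minimal Python sketch that counts the interleavings of independent sequential processes. The function name `count_interleavings` and the assumption that the processes never synchronize are illustrative choices, not part of the talk.

```python
from math import factorial


def count_interleavings(steps_per_process):
    """Count the distinct interleavings of independent processes.

    steps_per_process lists the number of atomic steps in each process.
    Assumption: the processes never synchronize, so every ordering that
    preserves each process's own step order is a possible execution.
    The count is then a multinomial coefficient.
    """
    total = factorial(sum(steps_per_process))
    for steps in steps_per_process:
        total //= factorial(steps)
    return total


# Growth as more 4-step processes are added.
for n in range(1, 6):
    print(n, "processes:", count_interleavings([4] * n), "interleavings")
```

Synchronization between processes rules out some of these orderings, so the count is an upper bound for real systems, but the growth trend is the point.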
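Complexity of Distributed Systems [Figure: control flow of two tasks, T1 (nodes 1 to 5) and T2 (nodes 6 to 9), next to the graph of their combined states: (1,6), (2,6), (1,7), (3,6), (2,7), (1,8), (3,7), (2,8), (3,8), (5,9)]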
Uncertainty of Testing [Figure: the same two tasks, T1 and T2, and their combined state graph; the tasks contain the assignments X := 2 and X := 1, and a later node asks X == ?]
Formal Verification: An Alternative to Testing • Theorem Proving Based Verification • Use mathematical reasoning • Prove properties about all possible executions • Difficult and error prone • Finite State Verification • Reason about a finite model of the system • Prove properties about all possible executions, but not as powerful as theorem proving • Almost a totally automated process
Spectrum of Difficulty: Ad-hoc Testing, Systematic Testing, Finite State Verification, Theorem Proving • Arbitrary test cases • Requirements-based test planning • Requirements captured as properties • Properties guaranteed on all possible executions
Finite State Verification (FSV) • Holds the promise of providing a cost-effective way of verifying important properties about a system • Not all faults are created equal • Invest effort into most important properties • Several promising prototypes • Reachability based: SPIN or Symbolic Model Checking (SMV) • Flow equations: Integer Necessary Conditions (INCA) • Data flow analysis: FLAVERS
High-Level Architecture of FSV Systems [Figure: Property → Property Translator → Property Representation; System → System Translator → System Model; both feed the Reasoning Engine, which reports either Property Verified or Counter Examples for Model]
Conservative Analysis • If property verified, property holds for all possible executions of the system • If property not verified: an error OR a spurious result • System model abstracts information to be tractable • Conservative abstractions over-approximate behavior • If inconsistency relies upon over-approximations, then a spurious result • e.g., counter example corresponds to an infeasible path
System Model • Depends on property being verified • Eliminate information that does not impact the proof • Abstraction techniques allow “states” in the model to be reduced/collapsed
Some Properties of Properties • State-based versus event-based • Once temperature is greater than 100 degrees, lock is true • Elevator door closes before elevator moves • Single locations versus (sub)paths • Deadlock or race conditions • Sequences of states or events • Safety versus Liveness
A quick look at three approaches to FSV • Model Checking • Flow Equations • Data Flow Analysis Big Disclaimer!
Model Checking: some history • Originally proposed for hardware • Early 80’s: E. Clarke and Emerson; Queille and Sifakis • Late 80’s: Improved algorithms and property notations (E. Clarke, Emerson, Sistla) • 90’s: Symbolic Model Checking (SMV) and other optimizations (Burch, E. Clarke, Dill, Long, and McMillan) • Current: Hybrid approaches
Model Checking • Properties usually expressed in a temporal logic • System represented as a (possibly “abstracted”) reachability graph • State based • Reasoning engine propagates valid subformulas through the graph
High-Level Architecture of Model Checking [Figure: Temporal Logic Property → Property Translator → Property Representation; System → System Translator → State-based Reachability Graph; both feed Subformula propagation, which reports either Property Verified or Counter Examples for Model]
Representing Properties • CTL operators • G - globally • F - future • X - next • U - until • At a state in the model: • AG p means that for all paths from this state, p is true and will remain true • EF p means that for some path from this state, p will eventually be true
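The two readings above can be sketched in Python over an explicit graph. This is an illustration under stated assumptions: the four-state graph, the state names, and the function names `ef` and `ag` are hypothetical, and the simple loop is a plain fixed-point iteration, not an optimized checker.

```python
def ef(graph, p_states):
    """States satisfying EF p: some path from the state reaches a p-state."""
    result = set(p_states)
    changed = True
    while changed:
        changed = False
        for state, successors in graph.items():
            if state not in result and any(s in result for s in successors):
                result.add(state)
                changed = True
    return result


def ag(graph, p_states):
    """States satisfying AG p, using the duality AG p = not EF (not p)."""
    all_states = set(graph)
    return all_states - ef(graph, all_states - set(p_states))


# Hypothetical model: each state maps to its list of successors.
graph = {"s0": ["s1", "s2"], "s1": ["s1"], "s2": ["s3"], "s3": ["s3"]}
p = {"s2", "s3"}  # states where proposition p holds

print("EF p holds in:", sorted(ef(graph, p)))
print("AG p holds in:", sorted(ag(graph, p)))
```

Here s0 satisfies EF p (it can move to s2) but not AG p (p is false in s0 itself, and the path into s1 never sees p).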
Propagating Propositions [Figure: graph whose nodes are labeled p and AF p, showing AF p propagated outward from a node where p holds]
Example: mutual exclusion protocol* [Figure: reachability graph with states (n1,n2,turn=0), (n1,t2,turn=2), (t1,n2,turn=1), (n1,c2,turn=2), (t1,t2,turn=2), (c1,n2,turn=1), (t1,t2,turn=1), (c1,t2,turn=1), (t1,c2,turn=2)] *McMillan
Example Property • AG(t1=>AF c1) • If process1 tries (t1) to get the lock then eventually it gets into its critical region (c1) • Note, would like to prove this for all processes but FSV approaches usually must instantiate property (and system)
Example: propagation of AG(t1=>AF c1) [Figure: the mutual exclusion reachability graph with nodes annotated AF c1 and t1=> AF c1 as the subformulas are propagated]
Formula Propagation • Propagate until no change • propagate from smaller to larger subformulas • “smart” algorithm: linear in the size of model and size of the formula • Many optimization techniques • Symbolic model checking • Use efficient algorithms that propagate subformula for sets of values
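A rough Python sketch of "propagate until no change" for AG(t1=>AF c1) on the mutual exclusion example follows. The nine states are the ones named on the earlier slide; the transition relation is an assumption, since the slide text lists only the states. The loop is the naive fixed-point iteration, not the "smart" linear-time algorithm.

```python
# States are (process 1, process 2, turn). The edges below are ASSUMED;
# only the state names come from the slides.
GRAPH = {
    ("n1", "n2", 0): [("t1", "n2", 1), ("n1", "t2", 2)],
    ("t1", "n2", 1): [("c1", "n2", 1), ("t1", "t2", 1)],
    ("n1", "t2", 2): [("n1", "c2", 2), ("t1", "t2", 2)],
    ("c1", "n2", 1): [("n1", "n2", 0), ("c1", "t2", 1)],
    ("t1", "t2", 1): [("c1", "t2", 1)],
    ("t1", "t2", 2): [("t1", "c2", 2)],
    ("n1", "c2", 2): [("n1", "n2", 0), ("t1", "c2", 2)],
    ("c1", "t2", 1): [("n1", "t2", 2)],
    ("t1", "c2", 2): [("t1", "n2", 1)],
}


def af(graph, p_states):
    """States satisfying AF p: every path from the state reaches a p-state.

    Start from the p-states, then repeatedly add any state whose
    successors are all already labeled, until nothing changes.
    """
    result = set(p_states)
    changed = True
    while changed:
        changed = False
        for state, successors in graph.items():
            if (state not in result and successors
                    and all(s in result for s in successors)):
                result.add(state)
                changed = True
    return result


c1_states = {s for s in GRAPH if s[0] == "c1"}
af_c1 = af(GRAPH, c1_states)

# t1 => AF c1 holds in a state if process 1 is not trying or AF c1 holds.
implication = {s for s in GRAPH if s[0] != "t1" or s in af_c1}

# Every state is reachable from the initial state in this graph, so the AG
# formula holds exactly when the implication holds in every state.
print("AG(t1 => AF c1) holds:", implication == set(GRAPH))
```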
Symbolic Model Checking • With abstraction, nodes may represent sets of values • BDD • Worst case bound exponential in size of the model • For some examples, able to deal with 10^120 states [Figure: binary decision tree and the corresponding reduced BDD for the function ab+c]
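To show why a BDD can be much smaller than a full decision tree, here is a small Python sketch that builds a reduced ordered BDD for ab+c. The names `build_bdd` and `evaluate` are hypothetical. The construction enumerates every assignment, which is only reasonable for tiny illustrations; real symbolic model checkers build BDDs compositionally.

```python
from itertools import product


def build_bdd(func, variables):
    """Build a reduced ordered BDD for func over the given variable order.

    Terminals are the integers 0 and 1. Internal nodes get ids >= 2 and
    are stored in `nodes` as id -> (variable, low child, high child).
    """
    unique = {}  # (variable, low, high) -> node id, to share equal subgraphs
    nodes = {}

    def make_node(var, low, high):
        if low == high:  # the test on var is redundant, skip it
            return low
        key = (var, low, high)
        if key not in unique:
            node_id = len(unique) + 2
            unique[key] = node_id
            nodes[node_id] = key
        return unique[key]

    def build(index, assignment):
        if index == len(variables):
            return int(bool(func(**assignment)))
        var = variables[index]
        low = build(index + 1, {**assignment, var: False})
        high = build(index + 1, {**assignment, var: True})
        return make_node(var, low, high)

    return build(0, {}), nodes


def evaluate(root, nodes, assignment):
    """Follow the BDD from the root to a terminal for one assignment."""
    node = root
    while node not in (0, 1):
        var, low, high = nodes[node]
        node = high if assignment[var] else low
    return node


def f(a, b, c):
    return (a and b) or c


root, nodes = build_bdd(f, ["a", "b", "c"])
print("internal BDD nodes:", len(nodes))  # the full decision tree has 7
```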
Some observations: Model Checking • Worst case bound linear in size of the model • Model exponential • Experimentally often very effective • Not clear if model checking or symbolic model checking is superior • Depends on the problem
Flow Equations: some history • Originally proposed for designs • Early 80’s: Initial development (Avrunin, Dillon, and Wileden) • 90’s: Optimized and extended to real-time (Avrunin, Buy, Corbett, Dillon, and Wileden) • Current: INCA prototype (Avrunin, Corbett, and Siegel)
Flow Equations • Model system as finite state automata • Use extended network flow inequalities to capture legal flow through a concurrent system • Represent negation of the property as a set of inequalities
Solving the Set of Inequalities • Determine if combined system of inequalities is consistent • Use integer linear programming • If consistent, there is a set of flows through automata that violate the property • Provides guidance for trace through the model (but may not be executable)
High-Level Architecture of INCA [Figure: System → System Translator → FSA’s → FSA Translator → Set of Inequalities; Property → Property Translator → Set of Inequalities; both sets feed Integer Linear Programming, which reports either Property Verified (no solution) or Counter Examples for Model (solution)]
Example: Process Flow Equations [Figure: three automata with edges labeled x0 to x11 and events a, b, a’, b’] • x1 = x0 + x2; x1 = x2 + x3; x0 = 1; x3 = 1 • x5 + x7 = x4 + x6; x5 = x6; x4 = 1; x7 = 1 • x9 = x8 + x10; x9 = x10 + x11; x8 = 1; x11 = 1
Example: Inter-process Flow Equations [Figure: the same three automata with edges labeled x0 to x11 and events a, b, a’, b’] • x1 = x5 • x9 = x6
Solving for a property • x1 = x0 + x2; x1 = x2 + x3; x0 = 1; x3 = 1 • x5 + x7 = x4 + x6; x5 = x6; x4 = 1; x7 = 1 • x9 = x8 + x10; x9 = x10 + x11; x8 = 1; x11 = 1 • x1 = x5; x9 = x6 • ∀j: 0 ≤ xj • Property: For all paths, event a occurs more than event b • Represent complement: ¬(x1 > x9) = (x1 ≤ x9) • Solution exists, e.g., x2, x10 = 0, all other xi = 1 => property does not hold
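The system above is small enough to check by hand or with a few lines of Python. The sketch below encodes the slide's equations plus the negated property and confirms that the quoted solution satisfies them. It then enumerates small integer values by brute force, which stands in for the integer linear programming solver and is not how INCA works; the names `satisfies` and `solutions` and the search bound of 0 to 2 are assumptions.

```python
from itertools import product


def satisfies(x):
    """True if x[0]..x[11] satisfy the flow equations and the negated property."""
    return (
        # process flow equations, one group per automaton
        x[1] == x[0] + x[2] and x[1] == x[2] + x[3] and x[0] == 1 and x[3] == 1
        and x[5] + x[7] == x[4] + x[6] and x[5] == x[6] and x[4] == 1 and x[7] == 1
        and x[9] == x[8] + x[10] and x[9] == x[10] + x[11]
        and x[8] == 1 and x[11] == 1
        # inter-process flow equations
        and x[1] == x[5] and x[9] == x[6]
        # all flows are non-negative
        and all(v >= 0 for v in x)
        # complement of the property "a occurs more than b"
        and x[1] <= x[9]
    )


# The solution quoted on the slide: x2 = x10 = 0, every other xi = 1.
slide_solution = [1] * 12
slide_solution[2] = 0
slide_solution[10] = 0
print("slide solution is consistent:", satisfies(slide_solution))

# Brute-force search with the six fixed flows set to 1 and the other six
# variables ranging over 0..2 (an arbitrary small bound).
free = [1, 2, 5, 6, 9, 10]
solutions = []
for values in product(range(3), repeat=len(free)):
    x = [1] * 12
    for index, value in zip(free, values):
        x[index] = value
    if satisfies(x):
        solutions.append(x)
print("solutions found within the bound:", len(solutions))
```

Any solution means the inequalities are consistent, so the property is reported as not holding.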
Seeing the counter example • Property: For all paths, event a occurs more than event b • x2, x10 = 0, all other xi = 1 [Figure: the three automata with edges labeled x0 to x11 and events a, b, a’, b’, showing the flows of this solution]
Some Limitations • Integer Linear Programming has an exponential worst case bound • Inter-process order information is not preserved • only checks whether event counts are consistent • Like most static techniques, may produce spurious results
Some Benefits • Does not enumerate the state space! • Integer Linear Programming is often very efficient • Empirical evidence: linear inequality systems usually grow linearly and take sub-exponential time to solve • In practice, INCA is usually an effective technique
Data Flow Based Verification: some history • Mid-70’s: Originally proposed for def-ref anomalies in FORTRAN (Osterweil and Fosdick) • Early 80’s: Extended to general properties (Olender and Osterweil) & concurrency (Taylor and Osterweil) • 90’s: Deadlock detection (Masticola and Ryder); Efficient representation of concurrency & incremental precision improvement (Dwyer and L. Clarke) • Recent: Optimizations, Java (Avrunin, L. Clarke, Cobleigh, Naumovich, and Osterweil)
Data Flow Analysis: FLAVERS • Represents property as a finite state automaton • System model is collection of annotated control flow graphs • Inter-process communication and interleavings are represented with additional edges • does not enumerate all reachable states • over-approximates relevant executable behaviors • Reasoning engine based on data flow analysis
High-Level Architecture of FSV Systems [Figure: Property → Property Translator → FSA; System → System Translator → Collection of annotated CFG’s; both feed State Propagation, which reports either Property Verified or Counter Examples from Model]
Modeling the System • State explosion [Figure: tasks T1 (nodes 1 to 5) and T2 (nodes 6 to 9) and the graph of their combined states: (1,6), (2,6), (1,7), (3,6), (2,7), (1,8), (3,7), (2,8), (3,8), (5,9)]
Modeling the System • Automatically creates the program model from source code • Instead of the state space, explicitly represents interleaved execution via edges • Smaller model • Loss of precision [Figure: tasks T1 (nodes 1 to 5) and T2 (nodes 6 to 9) connected by additional edges]
Representing Properties • Example: [Figure: property FSA with states 0, 1, and 2 and transitions labeled close, move, and open; state 2 loops on close, open, move]
State Propagation • States of the property are propagated through the model • The property is proved if only accepting (non-accepting) states are contained in the final node of the model
Example
public static void main (String [] args) {
    …
    if (elevatorStopped) {
        ...
        openDoors();
    }
    recordState();
    if (elevatorStopped) {
        ...
        closeDoors();
    }
    moveToNextFloor();
}
[Figure: the corresponding control flow graph with nodes if, open, if, close, move]
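Example [Figure: the property FSA (states 0, 1, 2; transitions close, open, move) next to the control flow graph, annotated with the property states propagated to each node: if {0}, open {1}, if {0,1}, close {0}, move {0,2}]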
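The state sets in this example can be reproduced with a short worklist propagation in Python. This is a sketch, not FLAVERS itself: the property FSA transitions in `DELTA` are my reading of the slide's figure (state 2 taken as the violation state), and the node names `if1` and `if2` are hypothetical labels for the two branches.

```python
from collections import deque

# Property FSA as read from the figure (an ASSUMPTION): 0 = doors closed,
# 1 = doors open, 2 = violation (moved with the doors open).
DELTA = {
    (0, "open"): 1, (0, "close"): 0, (0, "move"): 0,
    (1, "open"): 1, (1, "close"): 0, (1, "move"): 2,
    (2, "open"): 2, (2, "close"): 2, (2, "move"): 2,
}

# Control flow graph of the elevator example; branch nodes carry no event.
EVENT = {"if1": None, "open": "open", "if2": None, "close": "close", "move": "move"}
SUCC = {
    "if1": ["open", "if2"],
    "open": ["if2"],
    "if2": ["close", "move"],
    "close": ["move"],
    "move": [],
}


def propagate(entry="if1", initial=0):
    """Propagate property states through the CFG until nothing changes.

    Returns, for each node, the set of property states possible after it.
    """
    incoming = {node: set() for node in SUCC}
    out = {node: set() for node in SUCC}
    incoming[entry].add(initial)
    worklist = deque([entry])
    while worklist:
        node = worklist.popleft()
        event = EVENT[node]
        new_out = {
            (state if event is None else DELTA[(state, event)])
            for state in incoming[node]
        }
        if new_out <= out[node]:
            continue  # nothing new to push to the successors
        out[node] |= new_out
        for successor in SUCC[node]:
            incoming[successor] |= out[node]
            worklist.append(successor)
    return out


out = propagate()
for node in SUCC:
    print(node, sorted(out[node]))
print("property verified:", 2 not in out["move"])
```

The violation state reaches the final node along the path that opens the doors and then skips the close, a path the real program cannot take because both branches test the same condition. That is a spurious result of the kind conservative analysis can produce.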
Incrementally Improving Precision [Figure: Property → Property Translator → FSA; System → System Translator → System model; both feed State Propagation, which reports either Property Verified or Counter Examples for Model; Constraints ... are added as a further input]
Example with Constraints [Figure: the property FSA (states 0, 1, 2), a constraint FSA with transitions S==true and S==false and a viol state, and the control flow graph with its branches labeled S==true / S==false; nodes are annotated with (property state, constraint state) tuples such as (0,0), (0,1), (1,1), and (1,viol)]
Example with Constraints [Figure: the same property FSA, constraint FSA, and control flow graph, fully annotated with tuples: (0,0) at entry; (0,1) on S==true and (0,2) on S==false; (1,1) after open; {(1,1), (0,2)} at the second if; {(1,1), (0,viol)} on S==true and {(1,viol), (0,2)} on S==false; {(0,1)} after close; {(0,1), (0,2)} after move]
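The tuple sets in this example can be reproduced by propagating (property state, constraint state) pairs and discarding any pair whose constraint state becomes viol. The sketch below is illustrative only: the `PROP` and `CONS` transition tables are my reading of the figures, the node names are hypothetical, and it assumes nothing between the two branches changes S.

```python
from collections import deque

# Property FSA (ASSUMED from the figure): 2 is the violation state.
PROP = {
    (0, "open"): 1, (0, "close"): 0, (0, "move"): 0,
    (1, "open"): 1, (1, "close"): 0, (1, "move"): 2,
    (2, "open"): 2, (2, "close"): 2, (2, "move"): 2,
}

# Constraint FSA for S (ASSUMED): 0 = unknown, 1 = S is true, 2 = S is false.
CONS = {
    (0, "S==true"): 1, (0, "S==false"): 2,
    (1, "S==true"): 1, (1, "S==false"): "viol",
    (2, "S==false"): 2, (2, "S==true"): "viol",
}

EVENT = {"if1": None, "open": "open", "if2": None, "close": "close", "move": "move"}

# node -> list of (successor, branch condition on that edge or None)
EDGES = {
    "if1": [("open", "S==true"), ("if2", "S==false")],
    "open": [("if2", None)],
    "if2": [("close", "S==true"), ("move", "S==false")],
    "close": [("move", None)],
    "move": [],
}


def propagate(entry="if1", initial=(0, 0)):
    """Propagate (property, constraint) tuples; drop infeasible ones."""
    incoming = {node: set() for node in EDGES}
    out = {node: set() for node in EDGES}
    incoming[entry].add(initial)
    worklist = deque([entry])
    while worklist:
        node = worklist.popleft()
        event = EVENT[node]
        new_out = {
            ((p if event is None else PROP[(p, event)]), c)
            for p, c in incoming[node]
        }
        if new_out <= out[node]:
            continue
        out[node] |= new_out
        for successor, condition in EDGES[node]:
            for p, c in out[node]:
                next_c = c if condition is None else CONS[(c, condition)]
                if next_c != "viol":  # the path contradicts itself on S
                    incoming[successor].add((p, next_c))
            worklist.append(successor)
    return out


out = propagate()
print("tuples after move:", sorted(out["move"]))
print("property verified:", all(p != 2 for p, _ in out["move"]))
```

With the constraint in place, the path that opens the doors and skips the close is pruned, so only property state 0 reaches the final node.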
Some Observations: Data Flow Analysis • Overall complexity is O(N^2 S) • N is the # nodes in the model • S is the number of states: property × constraints • Experimentally: performance subexponential • Usually requires several iterations to determine needed constraints • Constraints • Many automatically generated on request • Can be used to model other information