FATAL ERROR: Unforeseeable Event … or Not (original title: FATAL ERROR: Evento Imprevedibile … o no). Enrico Tronci, Dipartimento di Informatica, Università di Roma “La Sapienza”, Via Salaria 113, 00198 Roma, Italy, tronci@dsi.uniroma1.it, http://www.dsi.uniroma1.it/~tronci
Incontri 2003-2004 con la Facoltà di Scienze Matematiche Fisiche e Naturali (2003-2004 Meetings with the Faculty of Mathematical, Physical and Natural Sciences), 3 December 2003
This program has executed an illegal operation and will be terminated Error code: 56 Press any key to restart your system
Verification Game
Given: a system Sys (e.g. hardware, software, hybrid, etc.) and a specification Spec for Sys (e.g. what Sys should or should not do).
We want to know: whether system Sys satisfies the given specification Spec.
Examples
• "This program has executed an illegal operation and will be terminated. Error code: 56. Press any key to restart your system" … is an undesired state for Windows XX.
• A wrong arithmetic result (on the slide: 1.1 1.1 = 3.7) … is an undesired state for ANY microprocessor FPU (… Pentium 2 included …).
• A CRASH is an undesired state for rockets (e.g. Ariane 5), airplanes, trains, cars, …
An approximate answer: BUG HUNTING (Testing + Simulation)
• Define the initial state and the parameters.
• Feed an input sequence (stimulus) u(0), u(1), u(2), u(3), … to the System (Model).
• Compute the output sequence y(0), y(1), y(2), y(3), … by simulation, or by running the actual system when possible.
• An observer checks that the output sequence is ok.
Example (Testing 1)
x(t + 1) = if x(t) <= 3 then x(t) + u(t) else x(t) - u(t), u(t) = 1, 2. x(0) = 0
[Figure: transition graph, with edges labeled by the input u.]
Simulation length: 10. Input sequence: 1, 2, 1, 2, 1, 1, 2, 2, 2, 1
Spec: x(t) < 5, i.e. no state with x(t) >= 5 is reachable.
The spec does not fail on this run.
Example (Testing 2)
x(t + 1) = if x(t) <= 3 then x(t) + u(t) else x(t) - u(t), u(t) = 1, 2. x(0) = 0
[Figure: transition graph, with edges labeled by the input u.]
Simulation length: 6. Input sequence: 1, 2, 1, 2, 1, 2
Spec: x(t) < 5, i.e. no state with x(t) >= 5 is reachable.
Spec FAIL
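The two testing runs above can be replayed with a few lines of Python. This is only a sketch: the function names `step`, `simulate` and `spec_holds` are hypothetical and do not come from the slides; the system, the input sequences and the spec are the ones stated in the two examples.

```python
def step(x, u):
    """One transition of the example system: add u while x <= 3, else subtract u."""
    return x + u if x <= 3 else x - u


def simulate(inputs, x0=0):
    """Return the state trajectory x(0), x(1), ... produced by an input sequence."""
    trace = [x0]
    for u in inputs:
        trace.append(step(trace[-1], u))
    return trace


def spec_holds(trace, bound=5):
    """Observer: the spec x(t) < 5 must hold in every visited state."""
    return all(x < bound for x in trace)


run1 = simulate([1, 2, 1, 2, 1, 1, 2, 2, 2, 1])  # Testing 1, length 10
run2 = simulate([1, 2, 1, 2, 1, 2])              # Testing 2, length 6
print(run1, spec_holds(run1))
print(run2, spec_holds(run2))
```

The first run never leaves the states 0 to 4, so the observer reports no failure; the second run ends in x = 5 and violates the spec. Whether a bug shows up depends entirely on which input sequence happens to be tried.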
Testing: Obstructions (1)
Generation of targeted testing sequences can be costly (human resources + time). Methods improving ATPG (Automatic Test Pattern Generation) are needed.
[Figure: Requirements; Hand Made + ATPG + Random Walk + …; Testing Sequences.]
Testing: Obstructions (2)
Testing is SLOW because to get a reasonable coverage we need to run many testing sequences. This is a problem when time-to-market is an issue (and for most products this IS an issue). Methods to speed up testing are needed.
[Figure: input sequences u1(0), u1(1), u1(2), u1(3), … through un(0), un(1), un(2), un(3), … enter the System (Model); the outputs y1(0), y1(1), y1(2), y1(3), … through yn(0), yn(1), yn(2), yn(3), … are computed by simulation or by running the actual system when possible.]
The value of n can easily be in the order of 10^6. Note that for each input sequence an output sequence has to be generated and checked for conformity.
Testing: Obstructions (3)
Testing without automation tends to discover errors towards the end of the design flow. Error fixing is very expensive at that point and may delay product release. Methods to discover errors as soon as possible are needed.
[Figure: errors caught (percent) and number of times more expensive to fix, from early development to implementation. Source: Mercury Interactive, Siebel Siemens.]
Testing: Obstructions (4)
Presently more than 50% of the cost of the final product is testing, and this cost is growing. Thus keeping the cost of testing low (not much more than 50%) is a key issue in a competitive market.
Testing: False Negatives
Testing can only cover a SMALL part of the set of reachable system states. This may lead to false negatives (… unforeseeable circumstances).
Typically corner cases (i.e. states that have a low probability of being reached) are not visited during testing. Thus errors that have a low probability of showing up are hard (impossible) to detect using testing.
Unfortunately such low-probability errors may be costly to fix at later design stages, and/or their consequences can be very costly (Pentium 2 bug: billions of dollars …). Moreover, today's complex designs are full of such cases …
Summing up
• Speed up bug hunting (to decrease costs and time-to-market).
• Improve coverage (to increase quality … and to decrease the probability of losing markets).
HOW ???
Automatic Verification Game
Given: a system Sys (e.g. hardware, software, hybrid, etc.) and a specification Spec for Sys (e.g. what Sys should or should not do).
Check automatically: whether system Sys satisfies the given specification Spec.
This is equivalent to running ALL possible testing sequences!!
Automatic Verification Game (2)
Sys: definition of the system under consideration using your beloved language, e.g.: VHDL, Verilog, SDL, StateCharts, C, C++, Java, MATLAB, Simulink, …
Spec: definition of what Sys should do and should not do using your beloved language (again) and/or your favorite logic, e.g.: Temporal Logic (CTL, CTL*, …), First Order Logic, etc.
Formal Verification via Model Checking
The main goal of Formal Verification is to verify that a given system (hardware and/or software) meets its specifications. Thus formal verification is conceptually equivalent to testing with 100% coverage.
Exhaustive testing is not feasible even for small systems. Thus verification methods rely on a suitable analysis of the system definition and system specifications to produce their answers. As a result formal verification, unlike testing, applies only to the system description. Testing also applies to the physical system (when it exists).
Formal verification can be interactive or automatic. Model Checking is an automatic method for formal verification of Finite State Systems. Note that many hardware and/or software systems can be modeled as finite state systems.
Model Checking Dream
Inputs to the Model Checker (equivalent to exhaustive testing):
• Sys (VHDL, Verilog, C, C++, Java, MATLAB, Simulink, …)
• BAD (CTL, CTL*, LTL, …)
Outputs:
• PASS, i.e. no sequence of events (states) can possibly lead to an undesired state.
• FAIL, with a counterexample showing what went wrong, i.e. a sequence of events (states) leading to an undesired state.
Example (Model Checking) (1)
x(t + 1) = if x(t) <= 3 then x(t) + u(t) else x(t) - u(t), u(t) = 1, 2. x(0) = 0
[Figure: transition graph on states 0 to 5, with edges labeled by the input u.]
Spec: x(t) < 5, i.e. no state with x(t) >= 5 is reachable.
Spec FAIL
Example (Model Checking) (2)
x(t + 1) = if x(t) <= 3 then x(t) + u(t) else x(t) - u(t), u(t) = 1, 2. x(0) = 0
[Figure: full transition graph on states 0 to 5, with edges labeled by the input u.]
Spec: x(t) < 5, i.e. no state with x(t) >= 5 is reachable.
Spec FAIL
Spec ok if u(t) = 0, 1.
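To make the FAIL verdict and its counterexample concrete, here is a minimal sketch that searches all input choices of the example system and rebuilds a shortest violating run. The helper name `find_counterexample` and the parent-pointer bookkeeping are illustrative assumptions, not something the slides or any model checker prescribe.

```python
from collections import deque


def step(x, u):
    """Example system: add u while x <= 3, else subtract u."""
    return x + u if x <= 3 else x - u


def find_counterexample(inputs, x0=0, bound=5):
    """Search every input choice breadth-first.

    Returns (input sequence, state sequence) of a shortest run that reaches
    a state with x >= bound, or None if no such state is reachable.
    """
    parent = {x0: None}          # state -> (previous state, input used)
    queue = deque([x0])
    while queue:
        x = queue.popleft()
        if x >= bound:           # spec x < bound violated: rebuild the run
            us, xs = [], [x]
            while parent[x] is not None:
                x, u = parent[x]
                us.append(u)
                xs.append(x)
            return us[::-1], xs[::-1]
        for u in inputs:
            nxt = step(x, u)
            if nxt not in parent:
                parent[nxt] = (x, u)
                queue.append(nxt)
    return None


print(find_counterexample((1, 2)))   # inputs u = 1, 2
print(find_counterexample((0, 1)))   # inputs u = 0, 1
```

With u(t) in {1, 2} the search returns a three-step run ending in x = 5; with u(t) in {0, 1} it returns None, matching the remark that the spec is ok for that input set.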
A Larger System
x(t + 1) = case
    x(t) - 2 + u(t)   when x(t) + y(t) > 4
    x(t) - 1 + u(t)   when x(t) + y(t) = 4
    x(t) + u(t)       when x(t) + y(t) = 3
    x(t) + 1 + u(t)   when x(t) + y(t) = 2
    x(t) + 2 + u(t)   when x(t) + y(t) < 2
esac
y(t + 1) = u(t)
u(t) = -1, 0, 1
[Figure: transition graph over (x, y) states, with edges labeled by the input u.]
Verification/Testing
• Verification, from system model Sys AND specifications Spec, produces a sequence of stimuli (events) j, if any, leading Sys to violate Spec.
  + faster than testing (good to improve time-to-market)
  + gives full coverage (good to improve quality)
  + early error detection (decreases costs)
  - is computationally VERY expensive (because of state explosion)
• Testing, from system model Sys AND a sequence of stimuli (events) j, shows where j leads (in |j| steps).
  + is computationally inexpensive
  - many testing sequences may be needed to get good coverage
  - false negatives
  - late error detection
Transition Graph/Transition Relation
x(t + 1) = f(x(t), u(t)), written x' = f(x, u)
x' = if (u = 0) then (x + 1) mod 3 else (x - 1) mod 3; x(0) = 0; u = 0, 1
[Figure: transition graph on states 0, 1, 2, with edges labeled u = 0 and u = 1.]
Transition Graph = Transition Relation
[Table: one row per (u, x) pair: (0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2).]
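As a small illustration of "transition graph = transition relation", the mod-3 system above can be tabulated as an explicit set of triples. The names `f`, `relation` and `N` below are chosen for this sketch only; the triple layout (u, x, x') is an assumption about how the slide's table was organized.

```python
def f(x, u):
    """x' = (x + 1) mod 3 if u = 0, else (x - 1) mod 3."""
    return (x + 1) % 3 if u == 0 else (x - 1) % 3


# Transition relation as an explicit set of (u, x, x') triples.
relation = {(u, x, f(x, u)) for u in (0, 1) for x in range(3)}


def N(u, x, x_next):
    """Characteristic function of the transition relation."""
    return (u, x, x_next) in relation


for triple in sorted(relation):
    print(triple)
```

Each edge of the graph is one triple, and the boolean function `N` answers membership queries, which is the view a symbolic model checker works with.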
Layman model checking usage
We show, with a small running example (mutex), a typical layman's "verification via model checking" session.
Mutual Exclusion (Mutex)
[Figure: process S1 with states n1, t1, c1; process S2 with states n2, t2, c2; turn variable T with values 1, 2. Guards shown on the edges: S1=n1 & S2=t2; S2 = n2; S1 = n1; S1=t1 & T=2; S2=t2 & T=1; S2=n2 & S1=t1.]
States (S1, S2, T):
n1, n2, 1 | t1, n2, 1 | c1, n2, 1 | n1, t2, 1 | t1, t2, 1 | c1, t2, 1 | n1, c2, 1 | t1, c2, 1 | c1, c2, 1
n1, n2, 2 | t1, n2, 2 | c1, n2, 2 | n1, t2, 2 | t1, t2, 2 | c1, t2, 2 | n1, c2, 2 | t1, c2, 2 | c1, c2, 2
Properties:
• Mutual exclusion: AG (S1 != c1 | S2 != c2) … true
• Negation of mutual exclusion: EF (S1 = c1 & S2 = c2) … false
• No starvation S1: AG (S1 = t1 --> AF (S1 = c1)) … true
• No starvation S2: AG (S2 = t2 --> AF (S2 = c2)) … true
• State (t1, n2, *) reachable: AG (S1 != t1 | S2 != n2) … false
Mutex 2 (arbitrary initial state)
[Figure: the mutex automata for S1, S2 and T, as on the Mutual Exclusion (Mutex) slide.]
• Mutual exclusion: AG (S1 != c1 | S2 != c2) …
• Negation of mutual exclusion: EF (S1 = c1 & S2 = c2) …
• No starvation S1: AG (S1 = t1 --> AF (S1 = c1)) …
• No starvation S2: AG (S2 = t2 --> AF (S2 = c2)) …
SMV output (mutex 2)
-- AG (S1 != c1 | S2 != c2) is false
-- as demonstrated by the following execution sequence
state 1.1:
S1 = c1
S2 = c2
turn = 2

-- EF (S1 = c1 & S2 = c2) is false
-- AG (S1 = t1 -> AF S1 = c1) is true
-- AG (S2 = t2 -> AF S2 = c2) is true

resources used:
user time: 0.03 s, system time: 0.04 s
BDD nodes allocated: 730
Bytes allocated: 1245184
BDD nodes representing transition relation: 31 + 6
Mutex 3 (~ arbitrary initial state)
[Figure: the mutex automata for S1, S2 and T, as on the Mutual Exclusion (Mutex) slide.]
• Mutual exclusion: AG (S1 != c1 | S2 != c2) …
• Negation of mutual exclusion: EF (S1 = c1 & S2 = c2) …
• No starvation S1: AG (S1 = t1 --> AF (S1 = c1)) …
• No starvation S2: AG (S2 = t2 --> AF (S2 = c2)) …
SMV output (mutex 3)
-- specification AG (S1 != c1 | S2 != c2) is true
-- specification EF (S1 = c1 & S2 = c2) is false
-- specification AG (S1 = t1 -> AF S1 = c1) is true
-- specification AG (S2 = t2 -> AF S2 = c2) is true

resources used:
user time: 0.02 s, system time: 0.04 s
BDD nodes allocated: 635
Bytes allocated: 1245184
BDD nodes representing transition relation: 31 + 6
Mutex 4 (with arbitrary delays)
[Figure: the mutex automata for S1, S2 and T, as on the Mutual Exclusion (Mutex) slide.]
• Mutual exclusion: AG (S1 != c1 | S2 != c2) …
• Negation of mutual exclusion: EF (S1 = c1 & S2 = c2) …
• No starvation S1: AG (S1 = t1 --> AF (S1 = c1)) …
• No starvation S2: AG (S2 = t2 --> AF (S2 = c2)) …
SMV output (mutex 4)
-- AG (S1 != c1 | S2 != c2) is true
-- AG (S1 = t1 -> AF S1 = c1) is false
-- as demonstrated by the following execution sequence
state 2.1:
S1 = c1
S2 = n2
turn = 2

state 2.2:
S1 = n1
S2 = t2

-- loop starts here --
state 2.3:
S1 = t1
S2 = c2

state 2.4:
SMV output (mutex 4) … cntd
-- AG (S2 = t2 -> AF S2 = c2) is false
-- as demonstrated by the following execution sequence
state 3.1:
S1 = c1
S2 = n2
turn = 2

-- loop starts here --
state 3.2:
S2 = t2

state 3.3:

resources used:
user time: 0.03 s, system time: 0.04 s
BDD nodes allocated: 799
Bytes allocated: 1245184
BDD nodes representing transition relation: 34 + 6
SMV (mutex 4 + FAIRNESS)
FAIRNESS !(S1 = n1)
FAIRNESS !(S1 = t1)
FAIRNESS !(S1 = c1)
FAIRNESS !(S2 = n2)
FAIRNESS !(S2 = t2)
FAIRNESS !(S2 = c2)
SPEC AG((S1 != c1) | (S2 != c2))
SPEC EF((S1 = c1) & (S2 = c2))
SPEC AG((S1 = t1) -> AF (S1 = c1))
SPEC AG((S2 = t2) -> AF (S2 = c2))
SMV output (mutex 4 + FAIRNESS)
-- AG (state1 != c1 | state2 != c2) is true
-- EF (state1 = c1 & state2 = c2) is false
-- AG (state1 = t1 -> AF state1 = c1) is true
-- AG (state2 = t2 -> AF state2 = c2) is true

resources used:
user time: 0.03 s, system time: 0.04 s
BDD nodes allocated: 615
Bytes allocated: 1245184
BDD nodes representing transition relation: 34 + 6
A look under the hood
Inputs: Spec (e.g. CTL, CTL*, LTL, …) and Sys (VHDL, Verilog, C, C++, Java, MATLAB, Simulink, …).
Sys can be described by boolean functions:
• initial states: I(x) = 1 iff x is an initial state of Sys;
• transition relation: N(x, x') = 1 iff there exists a transition from x to x'.
From Spec we can define a function F from (I, N) to the boolean values {0, 1} s.t. F(I, N) is identically 1 iff Spec is satisfied.
Model checking: check whether F(I, N) is identically 1.
Obstructions
• The representation of Sys (i.e. (I, N)) may be too big (easily gigabytes … of RAM). Note: the transition graph of Sys can easily have more than 10^20 nodes (state explosion).
• Even if (I, N) is not too big, we may run out of memory when checking whether F(I, N) is identically 1.
The above obstructions cannot be eliminated. However, there are algorithms that can actually mitigate them. Such algorithms are effective in many practical cases (… although they are exponential with probability 1).
Model Checking as State Space Exploration
For safety properties (i.e. no bad state is reachable) the model checking problem becomes the reachability problem on the transition graph of the system to be analyzed.
Given a Finite State System S = (S, I, Next), where:
• S: finite set of states;
• I: set of initial states;
• Next: function mapping a state to the set of its successors;
visit all states that the system can reach from I, in order to check whether there is a bad reachable state (i.e. a state that violates our specs).
Model Checking Flavors
Explicit
• The set Reach of visited states is stored in a hash table.
• The explicit approach typically works well for protocols, hybrid systems and software-like systems (i.e. asynchronous systems).
• E.g.: SPIN (Bell Labs), Murphi (Stanford), COSPAN (Bell Labs).
Symbolic
• The set Reach of visited states is represented with its characteristic function f, that is f(s) = if (s is in Reach) then 1 else 0. States are bit vectors, thus f is a Boolean function.
• Ordered Binary Decision Diagrams (OBDDs) are used to efficiently represent and manipulate f.
• The symbolic approach typically works well for hardware-like systems (i.e. synchronous systems).
• E.g.: SMV (CMU), VIS (CU + Berkeley), CUDD (CU), FORTE (INTEL), SLAM (Microsoft), RuleBase (IBM).
Explicit State Space Exploration
From the system definition (e.g. given using VHDL or C) we get the following functions:
• Next(s), returning the set of successors of system state s;
• Start(), returning the set of initial states;
• Inv(s), returning true iff state s satisfies the invariants (our spec).
With such functions we can define a State Space Exploration function. E.g. we can use a BFS (Breadth First Search) or a DFS (Depth First Search).
BFS
[Figure: System Transition Graph, Hash Table T (visited states), Queue Q (visited states still to be expanded), and the successors s1, s2, s3 of state s.]
1. Get a new state s to expand from queue Q.
2. Check inv for s.
3. If s1 (s2, s3) is not already in T, insert s1 (s2, s3) in T and Q.

Hash_Table T;
Queue Q;

bfs() {
    for each start state s { insert(T, s); enqueue(Q, s); }
    while (Q is not empty) {
        s = dequeue(Q);
        check invariants for s;
        for all s' in Next(s)
            if (s' is not in T) {   /* fresh state */
                insert(T, s'); enqueue(Q, s');
            }
    }
}
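The pseudocode above translates almost line by line into runnable Python. The sketch below assumes the three functions Start, Next and Inv are passed in as plain callables (here named `start`, `next_states` and `inv`, names chosen for this example), uses a Python `set` in place of hash table T, and collects violating states instead of stopping at the first one.

```python
from collections import deque


def bfs(start, next_states, inv):
    """Explore every state reachable from start(); return (visited, violations)."""
    table = set()        # plays the role of hash table T
    queue = deque()      # plays the role of queue Q
    violations = []
    for s in start():
        if s not in table:
            table.add(s)
            queue.append(s)
    while queue:
        s = queue.popleft()
        if not inv(s):               # check invariants for s
            violations.append(s)
        for t in next_states(s):
            if t not in table:       # fresh state
                table.add(t)
                queue.append(t)
    return table, violations


def step(x, u):
    """Running example: add u while x <= 3, else subtract u."""
    return x + u if x <= 3 else x - u


visited, bad = bfs(
    start=lambda: [0],
    next_states=lambda x: [step(x, u) for u in (1, 2)],
    inv=lambda x: x < 5,
)
print(sorted(visited), bad)
```

On the running example the search visits the six states 0 to 5 and flags state 5 as the only violation of the invariant x < 5.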
Example (BFS) (1)
x(t + 1) = if x(t) <= 3 then x(t) + u(t) else x(t) - u(t), u(t) = 1, 2. x(0) = 0
[Figure: full transition graph on states 0 to 5, with edges labeled by the input u.]
Example (BFS) (2)
x(t + 1) = if x(t) <= 3 then x(t) + u(t) else x(t) - u(t), u(t) = 1, 2. x(0) = 0
[Figure: transition graph on states 0 to 5, with edges labeled by the input u.]
Hash Compaction
States may take hundreds of bytes. To save on RAM we can store in T just state signatures h(s). Usually a state signature takes 5 bytes or so. It can be proved that the omission probability is very low.
[Figure: hash compaction maps a long state bit vector (011000111001010101010101100001111111001010101010101010101010101010101) to a short signature (001010101001000010101000).]
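A minimal sketch of hash compaction follows. The choice of BLAKE2b as the signature function h, the use of `repr` to serialize a state, and the helper names are all assumptions made for illustration; the slides only say that a signature of about 5 bytes is stored instead of the full state.

```python
import hashlib
from collections import deque


def signature(state, nbytes=5):
    """Compress a state to an nbytes-long signature (BLAKE2b is an assumed choice)."""
    data = repr(state).encode("utf-8")
    return hashlib.blake2b(data, digest_size=nbytes).digest()


def explore_compacted(start_states, next_states, nbytes=5):
    """BFS that keeps only signatures in the table; returns the number of states expanded."""
    table = set()
    queue = deque()
    for s in start_states:
        sig = signature(s, nbytes)
        if sig not in table:
            table.add(sig)
            queue.append(s)
    expanded = 0
    while queue:
        s = queue.popleft()
        expanded += 1
        for t in next_states(s):
            sig = signature(t, nbytes)
            if sig not in table:     # a signature collision would silently omit t
                table.add(sig)
                queue.append(t)
    return expanded


def step(x, u):
    """Running example: add u while x <= 3, else subtract u."""
    return x + u if x <= 3 else x - u


print(explore_compacted([0], lambda x: [step(x, u) for u in (1, 2)]))
```

The table holds 5 bytes per state regardless of how large the state is. The price is that two distinct states with the same signature are treated as one, so a state (and everything reachable only through it) may be omitted; the queue in this sketch still holds full states.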
Caching
To save even more RAM we can forget some of the state signatures in hash table T. Experimental results show that we can forget about 50% of the states in T and still get termination. This works because protocol transitions are local.
[Figure: hash table T before and after a collision; the previously stored state signature is forgotten.]
Danger: we may revisit the same state forever and ever: no termination!!
Locality
A transition from s to s' is a K-transition iff level(s') - level(s) = K.
A transition is k-local iff |level(s') - level(s)| <= k.
[Figure: transition graph with states arranged by level (0 to 4), edges labeled with their K values.]
Locality
Our experimental results show that, for all protocol-like systems, for most states, most transitions (typically more than 75%) are 1-local.
State Sampling
Let d(s, k) be the fraction of transitions from state s that are k-transitions. Thus d(s, k) is the probability of getting a k-transition when picking at random a transition from state s.
Consider the experiment of selecting at random a state s and then returning d(s, k). In this way we get a random variable that we denote with d(k). The expected value of d(k) is the average value of d(s, k) on all reachable states.
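The quantities d(s, k) and the expected value of d(k) can be computed exactly on a small system. The sketch below uses the running example x(t + 1) = if x(t) <= 3 then x(t) + u(t) else x(t) - u(t) with u = 1, 2, and it assumes that level(s) means the BFS depth of s from the initial state, which the slides do not state explicitly. Function names are hypothetical, and states are picked uniformly among the reachable ones.

```python
from collections import deque
from fractions import Fraction

INPUTS = (1, 2)


def step(x, u):
    """Running example: add u while x <= 3, else subtract u."""
    return x + u if x <= 3 else x - u


def bfs_levels(x0=0):
    """Assumption: level(s) is the BFS depth of s from the initial state."""
    level = {x0: 0}
    queue = deque([x0])
    while queue:
        x = queue.popleft()
        for u in INPUTS:
            y = step(x, u)
            if y not in level:
                level[y] = level[x] + 1
                queue.append(y)
    return level


def d(s, k, level):
    """Fraction of the transitions leaving s that are k-transitions."""
    jumps = [level[step(s, u)] - level[s] for u in INPUTS]
    return Fraction(sum(1 for j in jumps if j == k), len(jumps))


def mean_d(k, level):
    """Expected value of d(k): average of d(s, k) over all reachable states."""
    return sum(d(s, k, level) for s in level) / len(level)


level = bfs_levels()
for k in (-1, 0, 1):
    print(k, mean_d(k, level))
```

In this toy system every transition changes the level by -1, 0 or +1, so all transitions are 1-local.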
Symbolic Model Checking
The set of initial states is represented with a boolean function I s.t. I(x) = 1 iff x is an initial state.
The transition graph is represented with the transition relation, i.e. a boolean function N s.t. N(x, x') = 1 iff there is a transition from x to x'.
Reachable states: least solution to the following (functional) fixpoint equation (unknown: R):
$R(x) = I(x) \lor \exists y\, [R(y) \land N(y, x)]$
Computing the set of reachable states
Problem: how do we solve the equation $R(x) = I(x) \lor \exists y\, [R(y) \land N(y, x)]$?
Answer (classical):
$R^{(0)}(x) = 0$
$R^{(k+1)}(x) = I(x) \lor \exists y\, [R^{(k)}(y) \land N(y, x)]$, k = 0, 1, 2, …
Stop when $R^{(k+1)} = R^{(k)}$. This eventually happens …
Obstructions:
• Efficient manipulation of boolean functions
• Efficient check of functional equality
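The classical iteration can be sketched with plain Python sets standing in for the boolean functions: the set R plays the role of the characteristic function R(x), and the image computation plays the role of the existential quantification over y. A real symbolic model checker would use OBDDs here; the sets, the function names and the choice of the running example x(t + 1) = if x(t) <= 3 then x(t) + u(t) else x(t) - u(t) are assumptions of this sketch.

```python
INPUTS = (1, 2)


def step(x, u):
    """Running example: add u while x <= 3, else subtract u."""
    return x + u if x <= 3 else x - u


def image(R):
    """All x such that some y in R has a transition to x (the 'exists y' part)."""
    return {step(y, u) for y in R for u in INPUTS}


def reachable(I):
    """Iterate R(k+1) = I | image(R(k)) from R(0) = empty set until a fixpoint."""
    R = frozenset()
    iterates = [R]
    while True:
        R_next = frozenset(I) | image(R)
        if R_next == R:          # functional equality check: fixpoint reached
            return R, iterates
        R = R_next
        iterates.append(R)


R, iterates = reachable({0})
for k, Rk in enumerate(iterates):
    print(k, sorted(Rk))
```

Each iterate contains the previous one, so on a finite state space the sequence must stop growing. The two obstructions listed above show up as the cost of `image` and of the equality test `R_next == R`.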
Ordered Binary Decision Diagrams (OBDDs: an efficient representation for boolean functions)
[Figure: decision diagrams over the variables x1 and x2, with 0/1 edge labels and terminal nodes 0 and 1.]
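To show why OBDDs make the functional equality check cheap, here is a toy reduced OBDD built by Shannon expansion with a unique table. It is a teaching sketch under stated assumptions: the class name `ROBDD`, the node numbering and the brute-force `build` (which enumerates all assignments and is exponential) are illustrative and are not how packages such as CUDD construct diagrams.

```python
class ROBDD:
    """Toy reduced ordered BDD over variables x1..xn, tested in index order."""

    def __init__(self, nvars):
        self.nvars = nvars
        self.unique = {}   # (var, low, high) -> node id; ids 0 and 1 are the terminals

    def mk(self, var, low, high):
        if low == high:                  # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in self.unique:       # share isomorphic subgraphs
            self.unique[key] = len(self.unique) + 2
        return self.unique[key]

    def build(self, f, assignment=()):
        """Shannon expansion of the predicate f (exponential, for illustration only)."""
        var = len(assignment)
        if var == self.nvars:
            return 1 if f(*assignment) else 0
        low = self.build(f, assignment + (False,))
        high = self.build(f, assignment + (True,))
        return self.mk(var, low, high)


bdd = ROBDD(2)
a = bdd.build(lambda x1, x2: x1 and x2)
b = bdd.build(lambda x1, x2: not (not x1 or not x2))   # same function, written differently
print(a == b, len(bdd.unique))
```

Because the reduced diagram is canonical for a fixed variable order, two equivalent functions end up at the same node id, so checking functional equality is a single comparison of identifiers.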