Logic Model Checking Lecture Notes 18:18 Caltech CS 118 January-March 2006 Course Text: The Spin Model Checker: Primer and Reference Manual Addison-Wesley 2003, ISBN 0-321-22862-6, 608 pgs.
model extraction... scaling model checking
details: Automating Software Feature Verification, Bell Labs Technical Journal, Vol. 5, No. 2, April-June 2000, pp. 72-87. http://spinroot.com/gerard/pdf/bltj2000.pdf
Lucent PathStar voice+data switch (1998-2001)
• distributed system design
• multiple asynchronous threads of execution per call
• software cleanly separates
  • control-oriented code (state/event processing), and
  • non-control-oriented (sequential) code
Logic Model Checking [18 of 18]
PathStar software: ~2 million lines of code
• call processing: 13,099 lines of C
• "core piece" control code at the heart of the switch: ~1,500 lines of C
• POTS + approx. 50 features (call waiting, call forwarding, call screening, etc.), allowing ~250 possible feature combinations...
a verification "test harness"
[Figure: the harness and its numbered components, labeled: C code; model extraction, implemented as a lookup table; database of ~200 logic requirements, formalized in LTL; test drivers, written in Promela; bug reporting from the standard Spin model]
modeling a program... (Chapter 10 in the book)
[Figure: diagram with labels main, beginthread, endthread, exit]
abstraction/filtering
• the labels of the extracted automata represent declarations, statements, and expressions from the C source
• each label is classified (with a lookup table) as:
  • irrelevant to the property: hide
  • partially relevant: map (i.e., replace with a user-defined string)
  • fully relevant: keep
• a modified C parser (modex) uses the table to generate a Spin verification model
• in the PathStar application: 30% hide, 10% map, 60% keep
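The hide/map/keep classification can be pictured as a plain table lookup. The following is a minimal sketch in C, not the modex implementation: the type names, the `translate` function, the three table entries, and the rule that unlisted labels are kept are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule kinds, named after the three classes on the slide. */
typedef enum { HIDE, MAP, KEEP } action_t;

typedef struct {
    const char *label;   /* C source text attached to an automaton edge */
    action_t    action;  /* how the extractor should treat that label */
    const char *mapped;  /* replacement text, used only for MAP */
} rule_t;

/* Illustrative entries only; these are not taken from the PathStar tables. */
static const rule_t table[] = {
    { "log_event(ev)",    HIDE, NULL },
    { "send(line, DIAL)", MAP,  "dialtone = true" },
    { "state = OFFHOOK",  KEEP, NULL },
};

/* Return the text to emit into the model for a label, or NULL if the
 * label is hidden. Labels missing from the table are kept unchanged,
 * which is an assumption of this sketch. */
const char *translate(const char *label)
{
    assert(label != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].label, label) != 0)
            continue;
        if (table[i].action == HIDE)
            return NULL;
        if (table[i].action == MAP)
            return table[i].mapped;
        return label;    /* KEEP */
    }
    return label;
}
```

Calling `translate("send(line, DIAL)")` yields the mapped string, while a hidden label yields NULL and so contributes nothing to the generated model.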
program properties
a sample property: "when the subscriber goes offhook, dialtone is generated."
a failure to satisfy the property:
    <>   eventually, the subscriber goes offhook
    /\   and
    X    thereafter, no dialtone is generated
    U    until the next onhook
LTL: <> (offhook /\ X ( !dialtone U onhook))
[Figure: automaton for the formula, with edge labels true, offhook, !dialtone, onhook]
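To make the reading of the formula concrete, here is a small sketch in C that evaluates it over a finite recorded trace. This is a simplification for illustration: Spin checks the property on (possibly infinite) runs using an automaton derived from the formula, not with code like this. The `step_t` type and both function names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One observation of the three propositions used in the formula. */
typedef struct { bool offhook, dialtone, onhook; } step_t;

/* !dialtone U onhook, evaluated from position i of a finite trace:
 * onhook must occur at some j >= i, with no dialtone at i..j-1. */
static bool no_dialtone_until_onhook(const step_t *t, size_t n, size_t i)
{
    for (size_t j = i; j < n; j++) {
        if (t[j].onhook)
            return true;
        if (t[j].dialtone)
            return false;
    }
    return false;    /* the trace ended before the next onhook */
}

/* <> (offhook /\ X (!dialtone U onhook)) on a finite trace:
 * true if the trace contains the error scenario. */
bool violates(const step_t *t, size_t n)
{
    assert(t != NULL || n == 0);
    for (size_t i = 0; i + 1 < n; i++)
        if (t[i].offhook && no_dialtone_until_onhook(t, n, i + 1))
            return true;
    return false;
}
```

A trace offhook, (nothing), onhook is reported as a violation; inserting a dialtone step between offhook and onhook makes it pass.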
finding and reporting bugs
[Figure: tool flow with labels: C program, lookup table, modex; LTL requirement, logical negation, spin -f; error scenario (no dialtone generated), reverse lookup]
automating the verification process
[Figure: labels: ltl formulae, client/server sockets code, scripts]
status display: tracking progress
performance (Sept. 1999)
(using 16 500 MHz PCs with 512 MB each)
• first error report in 2 minutes
• 15 error reports in 3 minutes
• over half of all reports generated within the first 10 minutes
[Figure: percent of violations reported (number of violations reported, up to 60) versus minutes since start of check (10 to 40)]
some variants of model checking
1: classic model checking
this is the basic model checking method, used since the early 1980s
main challenges:
• constructing faithful models
• tracking changes in the original design/code
[Figure: labels: source code, compiler, executable code; hand-built logic model, properties, model checker (Spin), property violations]
some variants of model checking
2: model extraction methods
Examples:
• C: FeaVer (PathStar application, 1998-2000); SLAM (Microsoft, 2000)
• Java: Pathfinder-1 (Ames, Havelund 1997); Bandera (KSU, Dwyer/Hatcliff 1997)
main challenge: setting up the model extraction context
[Figure: labels: source code, compiler, executable code; source code (fragment), model extractor, abstraction, logic model + embedded C, tracked C data, properties, model checker (Spin), property violations]
some variants of model checking
3: using a virtual machine (e.g., JVM)
Example: Java Pathfinder-2 (Ames, Visser et al.)
main challenges:
• complexity/tractability
• dealing with abstraction errors
[Figure: labels: source code, compiler, executable code, special purpose virtual machine]
synopsis of new approach
4: model-driven verification of executable code
Examples of this approach (C):
• Mars Pathfinder verification (NASA/JPL 2001)
• JPL Mars Resource Arbiter verification (NASA/JPL 2003)
main challenge: accurate state capture & definition of abstractions
differences from source-level model extraction:
• higher-level and coarser
• main functions in the code are executed atomically (interleaving is at function level, not at statement level)
[Figure: labels: source code, compiler, executable code, test interface, state information, non-deterministic interface driver, data abstraction, model checker (Spin), errors]
support for embedded C code in Spin v4
(implemented Sep. 2000, added to the Spin distribution Jan. 2003)
five extra Promela primitives:
data declaration
• c_decl: to define new C data types that can be used later in c_state or c_track primitives
• c_state: to declare new C variables, inside the state vector (Global, Local) or outside the state vector (Hidden); meant to be used by model extractors only
• c_track: to report C data objects, declared outside the model (in C code), as holding state information
behavior specification
• c_expr: blocking; execute a C expression and use the boolean return value in the model
• c_code: non-blocking; execute a fragment of C code atomically and deterministically (like a d_step)
embedded C data objects: the good and the bad
    c_code { float x; }
    active proctype simple()
    {
        c_code { x = 2.0; };
        if
        :: c_code { x = x+2.0; }; assert(x<5.0)
        :: c_code { x = x*3.0; }; assert(x<7.0)
        fi
    }

    $ spin -a simple.pr
    $ cc -o pan pan.c
    $ pan
    pan: assertion violated (x < 7.0)
    pan: wrote simple.pr.trail
    ..
    ..(shows x equals 12.0...!?!)
[Figure: depth-first search tree: start, x=2.0 (state x==2.0), then the two options x=x+2.0 with assert(x<5.0) and x=x*3.0 with assert(x<7.0), stop]
the embedded variable x holds state information; the failure to track it can confuse the model checker:
1. it may erroneously report a state match (due to missing information on x)
2. it can execute erroneous paths (due to its inability to restore the value accurately): here x is still 4.0 when the search backs up and tries the second option, so x*3.0 gives 12.0
tracking external data objects
    c_code { float x; }
    c_track "&x" "sizeof(float)"
    active proctype simple()
    {
        c_code { x = 2.0; };
        if
        :: c_code { x = x+2.0; }; assert(x<5.0)
        :: c_code { x = x*3.0; }; assert(x<7.0)
        fi
    }

    $ spin -a simple.pr
    $ cc -o pan pan.c
    $ pan
    (...no errors...)
[Figure: depth-first search tree: start, x=2.0 (state x==2.0), then the two options x=x+2.0 with assert(x<5.0) and x=x*3.0 with assert(x<7.0), stop]
x is now correctly tracked and stored
1. all state matching operations are now accurate
2. the value of x is now restored on reverse moves in the dfs; erroneous paths are now impossible
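The difference between the untracked and the tracked run can be replayed by hand in a few lines of C. This is only a sketch of the effect, not of how pan stores or restores state; the function names and the `tracked` flag are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

static float x;    /* stands in for the embedded C variable x */

/* Replay the search order of the example: take the first option, back
 * up, take the second option. With 'tracked' set, x is saved before
 * the forward move and restored on the reverse move. Returns the
 * value of x that assert(x<7.0) would see. */
float explore(bool tracked)
{
    x = 2.0f;
    float saved = x;           /* snapshot taken before the forward move */

    x = x + 2.0f;              /* first option */
    assert(x < 5.0f);          /* holds: x is 4.0 */

    if (tracked)
        x = saved;             /* reverse move restores x to 2.0 */

    x = x * 3.0f;              /* second option */
    return x;
}

/* Outcome of the second assertion, assert(x<7.0). */
bool second_assert_holds(bool tracked)
{
    return explore(tracked) < 7.0f;
}
```

Without the restore step the second option starts from 4.0 and reaches 12.0; with it, the second option starts from 2.0 and reaches 6.0, so the assertion holds.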
the game of tic-tac-toe
details: http://spinroot.com/gerard/pdf/spin04.pdf
[Figure: sample board position with o and x marks]
data structure for a classic Spin model:
    typedef Row { byte s[3]; }
    typedef Board { Row r[3]; }
    Board b;
• 2 players make alternate moves
• a toggle bit variable z determines whose move it is
• the current player non-deterministically selects a valid move
• and marks an empty square with a 1 or a 2 (z+1 in the model; 0 means empty)
• check after each move if the new position is a win
• if yes stop, if no repeat
tic-tac-toe.pml
    #define SQ(x,y)  !b.r[x].s[y] -> b.r[x].s[y] = z+1
    #define H(v,w)   b.r[v].s[0]==w && b.r[v].s[1]==w && b.r[v].s[2]==w
    #define V(v,w)   b.r[0].s[v]==w && b.r[1].s[v]==w && b.r[2].s[v]==w
    #define UD(w)    b.r[0].s[0]==w && b.r[1].s[1]==w && b.r[2].s[2]==w
    #define DD(w)    b.r[2].s[0]==w && b.r[1].s[1]==w && b.r[0].s[2]==w

    typedef Row { byte s[3]; };
    typedef Board { Row r[3]; };
    Board b;
    bit z, won;

    init {
        do
        :: atomic {    /* do not store intermediate states */
            !won ->
            if    /* all valid moves */
            :: SQ(0,0) :: SQ(0,1) :: SQ(0,2)
            :: SQ(1,0) :: SQ(1,1) :: SQ(1,2)
            :: SQ(2,0) :: SQ(2,1) :: SQ(2,2)
            :: else -> break    /* a draw: game over */
            fi;
            if    /* winning positions */
            :: H(0,z+1) || H(1,z+1) || H(2,z+1) ||
               V(0,z+1) || V(1,z+1) || V(2,z+1) ||
               UD(z+1) || DD(z+1) ->
                /* print winning position */
                printf("%d %d %d\n%d %d %d\n%d %d %d\n",
                    b.r[0].s[0], b.r[0].s[1], b.r[0].s[2],
                    b.r[1].s[0], b.r[1].s[1], b.r[1].s[2],
                    b.r[2].s[0], b.r[2].s[1], b.r[2].s[2]);
                won = true    /* and force a stop */
            :: else -> z = 1 - z    /* continue */
            fi;
           }    /* end of atomic */
        od
    }
"verification"
game analysis:
• 765 unique board positions (taking out rotation, mirror, etc. symmetries)
• 135 winning positions
• maximally 9 moves in a game
this means that the minimum problem complexity is:
    states: 765    depth: 9    wins: 135
the classic Spin model (see paper) explores:
    states: 5,510  depth: 40   wins: 942 positions
conclusion:
• we could benefit from considering symmetries
• we're not using abstraction effectively
real-life verification context
• what if the data structure (the game board) and the move generator were defined in C, and not in Promela?
• the real application runs with concrete (non-abstracted) data
• the verifier wants to run with abstracted data
• can we reconcile these two?
  • running the real application, using concrete data for execution
  • driven by the model checker, using abstract data for verification
front-end for C function: a Spin model "driver"
    #define SQ(a,b)  c_expr { (!board[a][b]) } -> x=a; y=b

    c_decl {
        extern short board[3][3];
        extern short play(int, int, int);
    };
    c_track "&board[0][0]" "sizeof(board)";    /* default: "Matched" */

    byte x, y, z, won;

    init {
        do
        :: atomic {
            !won ->
            if    /* all valid moves, pick x and y */
            :: SQ(0,0) :: SQ(0,1) :: SQ(0,2)
            :: SQ(1,0) :: SQ(1,1) :: SQ(1,2)
            :: SQ(2,0) :: SQ(2,1) :: SQ(2,2)
            :: else -> break    /* draw */
            fi;
            c_code {
                switch (play(now.x, now.y, now.z+1)) {
                case 1: now.z = 1 - now.z; break;    /* toggle */
                case 2: now.won = 1; break;          /* we won */
                }
                now.x = now.y = 0;    /* reset */
            }
           }
        od
    }
annotations:
• c_decl: externally declared C data, the description of the game board
• c_track: for the game board
• if ... fi: non-deterministic move generation
• c_code: the move update as an external C function
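The driver above only declares `board` and `play` as external C objects; the slides do not show their definitions. The following is a hypothetical C side that is consistent with how the driver uses the return value (1: game continues, 2: the mover has won). The return value 0 for an occupied square, the argument checks, and the helper `line3` are assumptions of this sketch.

```c
#include <assert.h>

short board[3][3];    /* 0 = empty, 1 and 2 = the two players */

/* True if all three squares hold the mark of player 'who'. */
static int line3(short a, short b, short c, int who)
{
    return a == who && b == who && c == who;
}

/* Place the mark of player 'who' (1 or 2) on square (x,y).
 * Returns 2 if the move wins, 1 if the game continues, and 0 if the
 * square is already occupied (the 0 case is an assumption). */
short play(int x, int y, int who)
{
    assert(x >= 0 && x < 3 && y >= 0 && y < 3);
    assert(who == 1 || who == 2);

    if (board[x][y] != 0)
        return 0;
    board[x][y] = (short)who;

    for (int i = 0; i < 3; i++) {
        if (line3(board[i][0], board[i][1], board[i][2], who))
            return 2;    /* row i */
        if (line3(board[0][i], board[1][i], board[2][i], who))
            return 2;    /* column i */
    }
    if (line3(board[0][0], board[1][1], board[2][2], who))
        return 2;        /* diagonal */
    if (line3(board[2][0], board[1][1], board[0][2], who))
        return 2;        /* anti-diagonal */
    return 1;
}
```

Because the SQ guard in the driver only enables moves onto empty squares, the driver never sees the 0 case; it is there so the function is safe when called from other test code.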
comparison
    minimum:                          states: 765    depth: 9    wins: 135 positions
    classic Spin model:               states: 5,510  depth: 40   wins: 942
    Spin model with embedded C code:  states: 5,510  depth: 31   wins: 942
• this version remains close to the handwritten pure Spin model
• we're still not using abstraction, but now we have fewer choices to do so, because the data is in C, and the move update is a given piece of C code that we do not want to have to rewrite...
method: track concrete data values, but match abstract data values
    c_decl {
        extern short board[3][3];
        extern short play(int, int, int);
        extern short abstract;    /* abstract board values */
    };
    c_track "&board[0][0]" "sizeof(board)" "UnMatched";
    c_track "&abstract" "sizeof(short)" "Matched";

    c_code { ... }    /* declare C function: abstract_value() */

    byte x, y, z, won;

    init {
        do
        :: atomic {
            !won ->
            if
            :: SQ(0,0) :: SQ(0,1) :: SQ(0,2)
            :: SQ(1,0) :: SQ(1,1) :: SQ(1,2)
            :: SQ(2,0) :: SQ(2,1) :: SQ(2,2)
            :: else -> break
            fi;
            c_code {
                switch (play(now.x, now.y, now.z+1)) {
                case 1: now.z = 1 - now.z; break;
                case 2: now.won = 1; break;
                }
                now.x = now.y = 0;
                abstract_value();    /* compute abstract board value */
            }
           }
        od
    }
• concrete data values are tracked, but not stored in the state vector (i.e., they are not matched)
• abstract data is tracked and stored in the state vector, and recomputed after every state transition that could affect the concrete data values
logical soundness of abstraction:
(1) the abstraction defines a bisimulation relation on states
(2) any two bisimilar states satisfy the same propositional formulae
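The body of `abstract_value()` is elided on the slide. One abstraction that fits the game analysis is a canonical board code that is the same for all rotations and mirror images of a position. The sketch below assumes that choice (it is not taken from the slides or the paper) and pairs it with a small hand-written search that restores the concrete board on backtracking but matches only on the abstract value. The names `encode`, `won`, `dfs`, `count_states`, `n_states`, and `n_wins` are hypothetical.

```c
#include <assert.h>
#include <string.h>

short board[3][3];     /* concrete data: 0 = empty, 1 and 2 = players */
short abstract;        /* abstract value: the only thing that is matched */
int n_states, n_wins;  /* counters filled in by count_states() */

static unsigned char seen[19683];    /* 3^9 possible abstract values */

/* Base-3 code of the board read through symmetry s (0..7): s&3 quarter
 * turns, followed by a mirror image if s&4 is set. */
static short encode(int s)
{
    int v = 0;
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++) {
            int a = r, b = c, t;
            for (int k = 0; k < (s & 3); k++) { t = a; a = b; b = 2 - t; }
            if (s & 4) b = 2 - b;
            v = v * 3 + board[a][b];
        }
    return (short)v;
}

/* Assumed abstraction: the smallest code over all eight symmetries. */
void abstract_value(void)
{
    abstract = encode(0);
    for (int s = 1; s < 8; s++) {
        short v = encode(s);
        if (v < abstract) abstract = v;
    }
}

/* True if player w has three in a row. */
static int won(int w)
{
    for (int i = 0; i < 3; i++) {
        if (board[i][0] == w && board[i][1] == w && board[i][2] == w) return 1;
        if (board[0][i] == w && board[1][i] == w && board[2][i] == w) return 1;
    }
    return board[1][1] == w &&
           ((board[0][0] == w && board[2][2] == w) ||
            (board[0][2] == w && board[2][0] == w));
}

/* Depth-first search; 'who' is the player about to move. */
static void dfs(int who)
{
    abstract_value();
    assert(abstract >= 0 && abstract < 19683);
    if (seen[abstract]) return;              /* state match on abstract data */
    seen[abstract] = 1;
    n_states++;
    if (won(3 - who)) { n_wins++; return; }  /* the previous mover won */
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            if (board[r][c] == 0) {
                board[r][c] = (short)who;    /* forward move */
                dfs(3 - who);
                board[r][c] = 0;             /* restore concrete data */
            }
}

/* Explore the whole game from the empty board. */
int count_states(void)
{
    memset(board, 0, sizeof board);
    memset(seen, 0, sizeof seen);
    n_states = n_wins = 0;
    dfs(1);
    return n_states;
}
```

Under these assumptions `count_states()` counts board positions up to symmetry, so it should arrive at the 765 positions and 135 wins given as the minimum in the game analysis. A Spin run can report a slightly different state count, because its state vector also contains the model variables.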
result
    minimum:                                               states: 765    depth: 9    wins: 135 positions
    classic Spin model:                                    states: 5,510  depth: 40   wins: 942
    Spin model with embedded C code:                       states: 5,510  depth: 31   wins: 942
    Spin model with embedded C code and data abstraction:  states: 771    depth: 31   wins: 135
other examples
sample application: verifying the MER Arbiter code
• 11 user threads
• 15 shared resources
• arbiter code: 2,960 lines of C
• resolution algorithm + lookup table
[Figure: message sequence between Arb, U1, and U2, with messages request, grant, rescind, cancel, deny]
    #define _K ARB_ACT_RESCIND
    #define _R ARB_ACT_DENY
    #define _Q ARB_ACT_PEND
    #define _G ARB_ACT_GRANT
    #define _X ARB_ACT_RESCIND_AND_DENY
    #define _N ARB_ACT_NONSENSE
    /*          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 */
    DEF_ACTION(_N,_K,_G,_G,_G,_G,_K,_G,_G,_G,_G,_G,_G,_G,_G) /* 1 - HGA_COMM */
    DEF_ACTION(_R,_Q,_G,_G,_G,_G,_R,_G,_G,_G,_G,_G,_G,_G,_G) /* 2 - HGA_NOM */
    DEF_ACTION(_G,_G,_N,_K,_K,_K,_K,_K,_G,_G,_G,_G,_G,_G,_G) /* 3 - MAST_COMM */
    DEF_ACTION(_G,_G,_Q,_Q,_Q,_Q,_R,_G,_G,_G,_G,_G,_G,_G,_G) /* 4 - MAST_NOM */
    DEF_ACTION(_G,_G,_R,_Q,_Q,_Q,_R,_G,_G,_G,_G,_G,_G,_G,_G) /* 5 - MAST_ATT */
    DEF_ACTION(_G,_G,_Q,_Q,_Q,_N,_R,_G,_G,_G,_G,_G,_G,_G,_G) /* 6 - MAST_MINI_TES *
    DEF_ACTION(_R,_R,_R,_R,_X,_R,_N,_R,_R,_G,_G,_G,_G,_G,_G) /* 7 - DRIVE */
    DEF_ACTION(_G,_G,_Q,_G,_Q,_G,_R,_N,_R,_G,_G,_G,_G,_G,_G) /* 8 - ARM */
    DEF_ACTION(_G,_G,_G,_G,_G,_G,_R,_X,_N,_G,_G,_G,_G,_G,_G) /* 9 - RAT */
    DEF_ACTION(_G,_G,_G,_G,_G,_G,_G,_G,_G,_N,_R,_G,_G,_G,_G) /* 10 - UP_LOSS */
    DEF_ACTION(_G,_G,_G,_G,_G,_G,_G,_G,_G,_K,_N,_G,_G,_G,_G) /* 11 - LOW_PWR */
    DEF_ACTION(_G,_G,_G,_G,_G,_G,_G,_G,_G,_G,_G,_N,_K,_K,_K) /* 12 - SURF_MCLK */
    DEF_ACTION(_G,_G,_G,_G,_G,_G,_G,_G,_G,_G,_G,_R,_N,_K,_K) /* 13 - SURF_UPL */
    DEF_ACTION(_G,_G,_G,_G,_G,_G,_G,_G,_G,_G,_G,_R,_R,_N,_K) /* 14 - SURF_PWR */
    DEF_ACTION(_G,_G,_G,_G,_G,_G,_G,_G,_G,_G,_G,_R,_R,_R,_N) /* 15 - SURF_COMM */
sample requirement: communication always takes precedence over driving
verifying the MER resource arbiter
some ways to tackle this problem
• default: direct test of arbiter C code, random test runs, + 11 user threads
• classic: Spin model of arbiter + 3 user threads
• new: unmodified arbiter C code, Spin front-end, + 11 user threads
verifying the arbiter
1: using the original arbiter C code, random test
• Approach:
  • use the original flight code of the arbiter, written in C, as is
  • made separately compilable by stubbing calls to the rest of the MER code
  • approx. 457 lines of C added for 27 stubbed function calls
  • add a random test interface, simulating requests from 11 user processes
  • add property-checking demons as additional C functions
  • compile and run (ran the random test for 17 hours straight)
• Problem:
  • only works for safety properties (our sample requirement is a liveness property)
  • no confidence in the coverage of the test, no matter how long the test is continued
• Possible resolution:
  • find a way to use the model checker to drive the tester
  • this can deliver guarantees of coverage and allows us to test any logic property
verifying the arbiter
2: Spin model of arbiter algorithm
• Approach:
  • pure Spin v4 model of the arbiter algorithm + user processes
  • embedded the original C lookup table into the model
• Model Size:
  • 280 lines of Promela
  • 37 lines of original lookup table in C (the only part in C)
  • 2 users: 2,702 reachable states; verification in <<0.1 sec
  • 3 users: 345,352 states; verification in 3.2 sec
  • easily finds potential property violations (none considered important)
  • (for 3 users, finds the first violation in <<0.1 sec after exploring <1,000 states)
• Problem:
  • finds bugs, but
  • how do we know if the hand-built Spin model is accurate?
• Possible resolution:
  • use the original flight code, and limit the number of artifacts built to test it
model-driven verification with abstraction
3: original full arbiter code in C + non-deterministic Spin driver
Approach:
• 2,960 lines of original flight code for the arbiter (algorithm, table, + support functions)
• 27 stubbed function calls to the rest of the MER code
• 147 lines for a minimal front-end Spin model, to serve as the test driver
• the test driver models requests/responses for all 11 user threads and all 15 resources
• Spin captures and tracks (an abstraction of) the arbiter's state
  • avoids redundancies from random testing
  • can now systematically explore the full arbiter state space
• data abstraction can be used to reduce complexity
  • e.g., the order in which reservations are stored in the arbiter's linked list is irrelevant: it is mapped to a fixed order before state matching in Spin (a form of symmetry reduction)
Result:
• finds both safety and liveness violations (none serious)
• example: LTL: !([]p -> (p U (p && q)))
  • p: (user[CBM_ARBUSER]:rid == HGA_COMM && user[CBM_ARBUSER]@wait)
  • q: (arb?[Grant, CBM_ARBUSER, HGA_COMM])
• violation found after inspecting 1.14 × 10^7 states in 9 min 19 s (~20K states/sec, >>5K scenarios/sec)
[Figure: automaton for the property, with edge labels true, p, p && !q]
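The "fixed order before state matching" step can be sketched as follows: copy the reservations out of the linked list into a fixed-size array and sort it, so that two lists holding the same reservations in a different order produce identical bytes. Everything here is hypothetical (the node layout, the field names, `MAX_RES`, and `canonical`); the flight code's actual data structures are not shown in the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RES 15    /* assumed upper bound on stored reservations */

/* Hypothetical list node; the real arbiter's layout is not shown. */
typedef struct node {
    short user, resource;
    struct node *next;
} node_t;

/* One entry of the order-independent (abstract) representation. */
typedef struct { short user, resource; } entry_t;

/* Order entries by user first, then by resource. */
static int cmp_entry(const void *p, const void *q)
{
    const entry_t *a = p, *b = q;
    if (a->user != b->user)
        return a->user < b->user ? -1 : 1;
    if (a->resource != b->resource)
        return a->resource < b->resource ? -1 : 1;
    return 0;
}

/* Copy the list into out[] in a fixed order. Unused slots are zeroed,
 * so two arrays can be compared byte for byte. Returns the number of
 * reservations found. */
size_t canonical(const node_t *list, entry_t out[MAX_RES])
{
    size_t n = 0;
    memset(out, 0, MAX_RES * sizeof out[0]);
    for (; list != NULL; list = list->next) {
        assert(n < MAX_RES);
        out[n].user = list->user;
        out[n].resource = list->resource;
        n++;
    }
    qsort(out, n, sizeof out[0], cmp_entry);
    return n;
}
```

In the setting of these slides, an array like `out` would play the role of the matched abstract data, while the linked list itself would be tracked but left unmatched.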