480 likes | 580 Views
The Yogi Project Software property checking via verification and testing. Aditya V. Nori, Sriram K. Rajamani Programming Languages and Tools Microsoft Research India. The plan. Part 1: Our philosophy, the basic algorithm, related work coffee break
E N D
The Yogi ProjectSoftware property checking via verification and testing Aditya V. Nori, Sriram K. Rajamani Programming Languages and Tools Microsoft Research India
The plan • Part 1: Our philosophy, the basic algorithm, related work • coffee break • Part 2: Handling programs with pointers and procedures • coffee break • Part 3: The engineering Download: http://research.microsoft.com/yogi
Thanks to our collaborators • AwsAlbarghouthi • Nels Beckman • Patrice Godefroid • Bhargav Gulavani • Tom Henzinger • YaminiKannan • Rahul Kumar • Robert Simmons • Sai Deep Tetali • Aditya Thakur
Property checking Question Is the error unreachable for all possible inputs? void f(int *p, int *q) { 0: *p = 4; 1: *q = 5; 2: if () 3: error(); } • More generally, we are interested in the query
Possible solution: Testing • The “old-fashioned” and practical method of validating software • Generate test inputs and see if we can find a test that violates the assertion
What’s wrong with testing? If we view testing as a “black-box” activity, Dijkstra is right! • After executing many tests, we still don’t • know if there is another test that can violate • the assertion
Our hypothesis If we view testing as a “white-box” activity, and “observe” what happens inside the program, we can do interesting things: • For a buggy program, we can generate tests in a directed manner to find the bug • For a correct program, we can prove that the assertion holds for all inputs!
Tests: presence of bugs void foo(int a) { 0: i= 0; 1: c = 0; 2: while (i < 1000) { 3: c = c + i; 4:i= i + 1; } 5: if (a <= 0) 6: error(); }
Tests: presence of bugs void foo(int a) { 0: i= 0; 1: c = 0; 2: while (i < 1000) { 3: c = c + i; 4:i= i + 1; } 5: if (a <= 0) 6: error(); } …
Proofs: absence of bugs Exponential number of tests required! void foo(int y1, int y2) { 0: state = 1; 1: if (y1) { 2: x0 = x0 + 1; } else { 3: x0 = x0 – 1; } 4: if (y2) { 5: x1 = x1 + 1; } else { 6: x1 = x1 – 1; } 7: if(state != 1) 8: error(); }
Proofs: absence of bugs 0 void foo(int y1, int y2) { 0: state = 1; 1: if (y1) { 2: x0 = x0 + 1; } else { 3: x0 = x0 – 1; } 4: if (y2) { 5: x1 = x1 + 1; } else { 6: x1 = x1 – 1; } 7: if(state != 1) 8: error(); } overapproximation state = 1 1 assume(y1) 2
Proofs: absence of bugs 0 void foo(int y1, int y2) { 0: state = 1; 1: if (y1) { 2: x0 = x0 + 1; } else { 3: x0 = x0 – 1; } 4: if (y2) { 5: x1 = x1 + 1; } else { 6: x1 = x1 – 1; } 7: if(state != 1) 8: error(); } state = 1 1 assume(y1) assume(!y1) 2 3 x0=x0+1 x0=x0-1 4 assume(y2) assume(!y2) 5 6 x1=x1+1 x1=x1-1 7 assume(state != 1) assume(state == 1) 9 8:error
Proofs: absence of bugs 0:state=1 void foo(int y1, int y2) { 0: state = 1; 1: if (y1) { 2: x0 = x0 + 1; } else { 3: x0 = x0 – 1; } 4: if (y2) { 5: x1 = x1 + 1; } else { 6: x1 = x1 – 1; } 7: if(state != 1) 8: error(); } linear proof exists! 1:state=1 2:state=1 3:state=1 4:state=1 5:state=1 6:state=1 7:state=1 8:error
Yogi: Proofs from Tests • Algorithm uses only test case generation operations • Maintains two data structures: • A forest of reachable concrete states (tests) • Under-approximates executions of the program … … …
Yogi: Proofs from Tests 0 • Algorithm uses only test case generation operations • Maintains two data structures: • A forest of reachable concrete states (tests) • Under-approximates executions of the program • A region graph (an abstraction) • Over-approximates all executions of the program state = 1 1 assume(y1) assume(!y1) 2 3 x0=x0+1 x0=x0-1 4 assume(y2) assume(!y2) 5 6 x1=x1+1 x1=x1-1 7 assume(state != 1) assume(state == 1) 9 8:error
Yogi: Proofs from Tests • Algorithm uses only test case generation operations • Maintains two data structures: • A forest of reachable concrete states (tests) • Under-approximates executions of the program • A region graph (an abstraction) • Over-approximates all executions of the program • Our goal: bug finding and proving • If a test reaches an error, we have found bug • If we refine the abstraction so that there is *no* path from the initial region to error region, we have a proof • Key ideas • Frontier: Boundary between tested regions and untested regions • uses only aliases α that are present along concrete tests that are executed
Key ideas 0 Frontier = edge that crosses from tested to untested region Step 1: Try to generate a test that crosses the frontier • Perform symbolic simulation on the path until the frontier and generate a constraint • Conjoin with the condition needed to cross frontier • Is satisfiable? 1 2 3 frontier 4 5 7 8 6 10 9
Key ideas 0 Step 1: Try to generate a test that crosses the frontier • Perform symbolic simulation on the path until the frontier and generate a constraint • Conjoin with the condition needed to cross frontier • Is satisfiable? [YES] Step 2: run the test and extend the frontier 1 2 3 frontier 4 5 7 8 6 10 9
Key ideas 0 Step 1: Try to generate a test that crosses the frontier • Perform symbolic simulation on the path until the frontier and generate a constraint • Conjoin with the condition needed to cross frontier • Is satisfiable? [NO] Step 2: use to refine so that the frontier moves back! 1 2 frontier 3 4: 4: 5 7 8 6 10 9
The Yogi algorithm Input: Program Property Construct initial abstraction Construct random tests Test succeeded? yes Bug! no Abstraction succeeded? yes Proof! no τ = error path in abstraction f = frontier of error path Can extend test beyond frontier? yes no Refine abstraction
The Yogi algorithm Input: Program Property × 0 Construct initial abstraction Construct random tests 1 void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: } 2 Test succeeded? yes Bug! 3 no Abstraction succeeded? yes Proof! 4 no τ = error path in abstraction f = frontier of error path 5 7 6 8 Can extend test beyond frontier? yes ) 9 no Symbolic execution + Theorem proving Refine abstraction frontier 10
Symbolic execution • symbol table void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: }
Symbolic execution • symbol table void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: }
Symbolic execution • symbol table void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: }
Symbolic execution • symbol table void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: }
Symbolic execution • symbol table void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: } • constraints
Symbolic execution • symbol table void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: } • constraints
The Yogi algorithm Input: Program Property × 0 Construct initial abstraction Construct random tests 1 void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: } 2 Test succeeded? yes Bug! 3 no Abstraction succeeded? yes Proof! 4 no τ = error path in abstraction f = frontier of error path 5 7 6 8 Can extend test beyond frontier? yes ) 9 no Refine abstraction NO frontier 10
Refinement 0 1 8:¬ρ 8 8:ρ refine 2 assume(lock != 1) 9 9 3 4 Candidates for refinement predicate Strongest postcondition (SP) along the path up to the frontier Weakest precondition (WP): Any formula separating the two above (e.g., an interpolant) 5 7 6 8 9 10
Refinement 0 1 8:¬ρ 8 8:ρ refine 2 assume(lock != 1) 9 9 3 4 5 7 6 8: 8:p 9 10
Next iteration Input: Program Property × 0 Construct initial abstraction Construct random tests 1 void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: } 2 Test succeeded? yes Bug! 3 no Abstractionsucceeded? yes Proof! 4 no τ = error path in abstraction f = frontier of error path 5 7 6 8: 8:p Can extend test beyond frontier? yes 9 no Refine abstraction frontier 10
Correct, the program is … 1 2 0 void f(int y) { 0:intlock=0, x; 1: do { 2: lock = 1; 3: x = y; 4: if (*) { 5: lock = 0; 6: y = y+1; } 7: } while (x != y) 8: if (lock != 1) 9: error(); 10: } 3 4⋀s 4⋀¬s 5⋀s 5⋀¬s 6⋀¬r 6⋀r 7⋀q 7⋀¬q 8⋀p 8⋀¬p 9 10
Soundness and Termination • Theorem: Suppose we run Yogi on any program and property • If Yogi returns , then the abstract program simulates, and thus is a proof that does not reach • IfYogireturns , then is an error trace
SLAM and CEGAR boolean program c2bp prog. P prog. P’ slic bebop SLIC rule predicates path newton
CEGAR (SLAM) on an example 0 void foo(int a) { 0: i = 0; 1: c = 0; 2: while (i < 1000) { 3: c = c + i; 4: i = i + 1; } 5: if(a <= 0) 6: error(); } 1 CEGAR will need to refine the loop 1000 times! 2 3 4 5 6
Yogi on the same example Symbolic execution produces the constraint 0 void foo(int a) { 0: i = 0; 1: c = 0; 2: while (i < 1000) { 3: c = c + i; 4: i = i + 1; } 5: if(a <= 0) 6: error(); } 1 2 3 4 5 frontier 6
Directed testing with DART • void test_me(int x, int y) • { • 1: int z = 2*x; • 2: if (z==y) { • 3: if (y == x+10) • 4: error(); • } • 5: return; • } • Step 1: • Try random test: DART: Directed Automated Random Testing Patrice Godefroid, Nils Klarlund, KoushikSen, PLDI 2005
Directed testing with DART • void test_me(int x, int y) • { • 1: int z = 2*x; • 2: if (z==y) { • 3: if (y == x+10) • 4: error(); • } • 5: return; • } • Step 1: • Try random test: • Step 2: • Do a “parallel” symbolic execution • Collect constraints and negate the last branch • Solve for the constraint: • Try next test DART: Directed Automated Random Testing Patrice Godefroid, Nils Klarlund, KoushikSen, PLDI 2005
Directed testing with DART • void test_me(int x, int y) • { • 1: int z = 2*x; • 2: if (z==y) { • 3: if (y == x+10) • 4: abort(); • } • 5: return; • } • Step 1: • Try test: • Step 2: • Do a “parallel” symbolic execution • Collect constraints and negate the last branch • Solve for constraint: • Try next test • Find error! DART: Directed Automated Random Testing Patrice Godefroid, Nils Klarlund, KoushikSen, PLDI 2005
Path explosion with DART • void foo(int x1, int x2,…, intxn) • { • lock = 1 • if (x1) • g = g + 1; • else • g = g – 1; • if (x2) • g = g + 1; • else • g = g – 1; • ... • if (xn) • g = g + 1; • else • g = g – 1; • if (lock != 1) • error(); • } • Full path coverage expensive • Yogi uses abstraction to ignore irrelevant states and efficiently computes a proof …
Partition refinement, Bisimulations and the Lee-Yannakakis algorithm Init Error
Bisimulations Between concrete states Between regions Init Error • Partition is stable if for every region we have that doesn’t split any other region in a non-trivial manner • Bisimulation = coarsest stable partition, precise abstraction for property checking
Partition refinement • Partition refinement: • For every region compute and split every partition as needed • Until no splits can be generated
Partition refinement • On termination generates a bisimulation: • For every pair of regions and we have that iffthere is some state and some state such that • Partition refinement: • For every region compute and split every partition as needed • Until no splits can be generated On termination Yogi generates a simulation: For every pair of regions and ,we have if there is some state and some state such that then
Lee-Yannakakis 92 andPasaraneu-Palanek-Visser 05 • Compute reachable bisimulation quotient • Do splitting only for reachable portion of the state space • Generate witnesses for reachability of regions and split only regions which have reachable witnesses • On termination reachable partitions are a bisimulation: • For every pair of regions and we have that iffthere is some state and some state such that Can split Error Init Can’t split
Comparison with Yogi • Yogi terminates with this abstraction shown • In contrast, Lee-Yannakakis will refine regions and with predicates forever • Reachable bisumulation quotient for this program is infinite • But bisimulation quotient is not necessary to prove the property. Simulation is sufficient, and Yogi is able to find such a simulation void foo(int y) { 0: while(y>0){ 1: y = y -1; } 2: if(false) 3: error(); } 0 1 2 3
Summary so far … • Yogi combines testing and verification • Maintains a partition (region graph) just like SLAM • Generates witnesses using test case generation like DART • Core operation: • If frontier can be extended by a test case, extend • If frontier cannot be extended, refine the pre-state region • Combines advantages of SLAM and DART • Uses regions to avoid exploring exponential number of paths (like SLAM) • Uses tests to explore complicated code (like DART) • Computes simulations that can do proofs • Terminates in more cases than algorithms that compute bisimulationquotients, such as Lee-Yannakakis