Discover proof of validity or invalidity of a program by using a machine learning algorithm based on probabilistic inference.
Program Verification as Probabilistic Inference • Sumit Gulwani, Nebojsa Jojic • Microsoft Research, Redmond
Problem of Program Verification • Given a program with a pre/post-condition pair, discover proof of validity or invalidity. • Proof is in the form of an invariant at each program point that can be locally verified.
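Example 1: Proof of Validity [Flowchart, program points 1 to 8] • Entry (precondition): x = 0 • y := 50; • Loop test x < 100: on False, exit with postcondition y = 100 • On True, test x < 50: on True, x := x + 1; on False, x := x + 1; y := y + 1; then control returns to the loop test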
Machine Learning Algorithm for Program Verification • Initialize invariants at all program points to any element (from an abstract domain over which the proof exists) • Pick a program point (randomly) whose invariant is locally inconsistent & update it to make it less inconsistent.
Outline • Inconsistency Measure • Algorithm • Experiments
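Consistency of an invariant I at program point π • I is consistent at π iff (Post(π) ⇒ I) ∧ (I ⇒ Pre(π)) • Post(π) is the strongest postcondition of “the invariants at the predecessors of π” at π • Pre(π) is the weakest precondition of “the invariants at the successors of π” at π • Example [figure]: point 1 (invariant P) is followed by statement s and then point 2 (invariant I); point 2 branches on condition c to points 3 and 4, with invariant Q on the true branch and R on the false branch • Post(2) = StrongestPost(P, s) • Pre(2) = (c ⇒ Q) ∧ (¬c ⇒ R)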
Measuring Inconsistency of an invariant I at π • Local inconsistency of invariant I at program point π = IM(Post(π), I) + IM(I, Pre(π)) • Here the inconsistency measure IM(φ1, φ2) is some approximation of the number of program states that violate φ1 ⇒ φ2
Example of an inconsistency measure IM • Consider the abstract domain of Boolean formulas (with the usual implication as the partial order). • Let φ1 ≡ a1 ∨ … ∨ an in DNF and φ2 ≡ b1 ∧ … ∧ bm in CNF • IM(φ1, φ2) = Σi Σj δ(ai, bj), summing over i = 1..n and j = 1..m, where δ(ai, bj) = 0 if ai ⇒ bj, and 1 otherwise
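A minimal Python sketch of this DNF/CNF measure, under the assumption that atoms are independent propositional variables, so that "conjunct implies disjunct" can be decided syntactically. The names `Lit`, `conj_implies_disj`, and `im` are illustrative and do not come from the talk. For arithmetic atoms such as x ≤ 50, the implication check would need a decision procedure instead.

```python
from typing import FrozenSet, List, Tuple

# Assumption: a literal is (atom name, polarity); atoms are opaque and independent.
Lit = Tuple[str, bool]
Clause = FrozenSet[Lit]


def negate(lit: Lit) -> Lit:
    return (lit[0], not lit[1])


def conj_implies_disj(a: Clause, b: Clause) -> bool:
    """Decide a => b, where a is a conjunction and b a disjunction of literals.

    For independent atoms this holds iff a is unsatisfiable, b is a
    tautology, or a and b share a literal.
    """
    a_unsat = any(negate(l) in a for l in a)
    b_taut = any(negate(l) in b for l in b)
    return a_unsat or b_taut or bool(a & b)


def im(dnf: List[Clause], cnf: List[Clause]) -> int:
    """Count the (disjunct, conjunct) pairs (a_i, b_j) for which a_i => b_j fails."""
    return sum(0 if conj_implies_disj(a, b) else 1 for a in dnf for b in cnf)


def clause(*lits: str) -> Clause:
    """Helper: 'p' is a positive literal, '!p' a negative one."""
    return frozenset((l.lstrip("!"), not l.startswith("!")) for l in lits)


# phi1 = (p AND q) OR r        phi2 = (p OR r) AND (q OR s)
phi1 = [clause("p", "q"), clause("r")]
phi2 = [clause("p", "r"), clause("q", "s")]
print(im(phi1, phi2))  # only the pair (r, q OR s) fails
```

The measure is 0 exactly when every disjunct of the first formula implies every conjunct of the second, which is when the implication between the two formulas holds.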
Outline • Inconsistency Measure & Penalty Function • Algorithm • Experiments
Algorithm • Search for proof of validity and invalidity in parallel. • Same algorithm with different boundary conditions. • Proof of Validity • Iexit = Postcondition • Ientry = Precondition • Proof of Invalidity • Iexit = ¬Postcondition • Ientry ⇒ Precondition, and Ientry is satisfiable • This assumes that the program terminates on all inputs.
Algorithm (Continued) • Initialize invariant Ij at program point j to any element (from an abstract domain over which the proof exists) • While the invariant at some point is locally inconsistent: • Choose j randomly s.t. Ij is inconsistent at j • Update Ij s.t. inconsistency of Ij at j is minimized [Sandwich Step] • More precisely, Ij is chosen randomly with probability inversely proportional to its inconsistency at j (to avoid getting stuck in a local minimum). But now, termination is only probabilistic.
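The loop above can be sketched in Python on a toy instance. Everything concrete here is an assumption made for illustration, not part of the talk: the program is `while x < 3: x := x + 1` with precondition x = 0 and postcondition x = 3, the state space is x ∈ {0..4}, the abstract domain is every subset of that state space, IM counts violating states exactly, and the sampling weight is 1/(eps + inconsistency) with a small smoothing constant `eps` to avoid division by zero.

```python
import random
from itertools import combinations

STATES = range(5)                      # toy state space: x in {0, ..., 4}
ENTRY, HEAD, BODY, EXIT = 0, 1, 2, 3   # hypothetical program points

# Control-flow edges: (source, target, guard, update).
EDGES = [
    (ENTRY, HEAD, lambda x: True,   lambda x: x),
    (HEAD,  BODY, lambda x: x < 3,  lambda x: x),
    (BODY,  HEAD, lambda x: True,   lambda x: x + 1),
    (HEAD,  EXIT, lambda x: x >= 3, lambda x: x),
]

# Abstract domain (assumption): all subsets of STATES.
DOMAIN = [frozenset(c) for r in range(len(STATES) + 1)
          for c in combinations(STATES, r)]


def post(inv, p):
    """Strongest postcondition at p of the predecessors' invariants."""
    return {upd(s) for (u, v, g, upd) in EDGES if v == p
            for s in inv[u] if g(s)}


def pre(inv, p):
    """Weakest precondition at p of the successors' invariants."""
    return {s for s in STATES
            if all(upd(s) in inv[v]
                   for (u, v, g, upd) in EDGES if u == p and g(s))}


def im(a, b):
    """Number of states that violate a => b."""
    return len(set(a) - set(b))


def inconsistency(inv, p, candidate=None):
    """IM(Post(p), I) + IM(I, Pre(p)) for the current or a candidate invariant."""
    i = inv[p] if candidate is None else candidate
    return im(post(inv, p), i) + im(i, pre(inv, p))


def search(pre_states, post_states, rng, max_steps=100_000, eps=0.1):
    """Randomized repair loop; returns consistent invariants or None."""
    inv = {ENTRY: frozenset(pre_states), EXIT: frozenset(post_states),
           HEAD: rng.choice(DOMAIN), BODY: rng.choice(DOMAIN)}
    for _ in range(max_steps):
        bad = [p for p in (HEAD, BODY) if inconsistency(inv, p) > 0]
        if not bad:
            return inv                 # every point is locally consistent
        p = rng.choice(bad)            # pick an inconsistent point at random
        # Sandwich step: sample a new invariant, favoring low inconsistency.
        weights = [1.0 / (eps + inconsistency(inv, p, c)) for c in DOMAIN]
        inv[p] = rng.choices(DOMAIN, weights=weights, k=1)[0]
    return None


proof = search({0}, {3}, random.Random(0))
if proof is not None:
    print(sorted(proof[HEAD]), sorted(proof[BODY]))
```

For this toy program the only locally consistent assignment is {0, 1, 2, 3} at the loop head and {0, 1, 2} in the loop body, so any run that terminates before the step cap returns exactly that proof. The step cap reflects the point made above: termination is only probabilistic.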
Comparison with Interpolants • Interpolant: Given φ1, φ2 such that φ1 ⇒ φ2, find φ such that: • φ1 ⇒ φ ⇒ φ2 • Vars(φ) ⊆ Vars(φ1) ∩ Vars(φ2) • Sandwich Step: Given φ1, φ2, find φ such that: • IM(φ1, φ) + IM(φ, φ2) is minimum (i.e., # of states violating φ1 ⇒ φ ⇒ φ2 is minimum) • φ is from a given abstract domain
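A small Python sketch of the sandwich step on explicit state sets. The setup is an assumption chosen for illustration: states are the integers 0..9, the abstract domain is the set of integer intervals (plus the empty set), and IM counts violating states exactly. The function names are hypothetical.

```python
def im(a, b):
    """Number of states that violate a => b (states in a but not in b)."""
    return len(a - b)


def intervals(n):
    """Assumed abstract domain: the empty set and all intervals [lo, hi] within 0..n-1."""
    yield frozenset()
    for lo in range(n):
        for hi in range(lo, n):
            yield frozenset(range(lo, hi + 1))


def sandwich(phi1, phi2, domain):
    """Return the minimum of IM(phi1, c) + IM(c, phi2) and all elements c attaining it."""
    scored = [(im(phi1, c) + im(c, phi2), c) for c in domain]
    best = min(score for score, _ in scored)
    return best, [c for score, c in scored if score == best]


phi1 = frozenset({2, 3, 7})
phi2 = frozenset({1, 2, 3, 4, 5})
best, choices = sandwich(phi1, phi2, list(intervals(10)))
print(best, sorted(sorted(c) for c in choices))
```

Here the first set does not imply the second (state 7 lies outside it), so no interpolant exists, yet the sandwich step still returns the intervals that violate the two implications in the fewest states.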
Intersection of Forward & Backward Analysis [Flowchart of Example 1, program points 1 to 8] • Assume abstract elements can have at most 3 conjuncts. • Post(8): x ≥ 0 ∧ x ≤ 100 ∧ (x ≤ 50 ∨ x = y) ∧ (y = 50 ∨ x ≥ 51). Dropping any conjunct is a valid choice at 8 in a forward analysis. • But backward guidance from 2 calls for keeping x ≤ 100 and (x ≤ 50 ∨ x = y)
Outline • Inconsistency Measure & Penalty Function • Algorithm • Experiments
Example 1: Proof of Validity [Same flowchart as on the earlier Example 1 slide]
Stats: Proof vs Incremental Proof of Validity • Black: Proof of Validity • Grey: Incremental Proof of Validity • Incremental proof requires fewer updates
Stats: Different Sizes of Boolean Formulas • Grey: 5*3, Black: 4*3, White: 3*2 • n*m denotes n conjuncts & m disjuncts • Larger size requires fewer updates
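Example 2: Proof of Validity [Flowchart, program points 1 to 8] • Entry (precondition): true • x := 0; m := 0; • Loop test x < n: on False, exit with postcondition n ≤ 0 ∨ 0 ≤ m < n • On True, a nondeterministic branch (*) optionally executes m := x; then x := x + 1; and control returns to the loop test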
Stats: Proof of Validity • Example 2 is “easier” than Example 1. • Easier example requires fewer updates.
Related Work: Probabilistic Techniques • Used successfully in several areas of computer science. • Yield more efficient, more precise, and even simpler algorithms. • An earlier technique: Random Interpretation [POPL ’03-’05] • Discovers program invariants • Monte Carlo Algorithm: May generate invalid invariants with a small probability. Running time is bounded. • “Random Testing” + “Abstract Interpretation” • This talk: Machine Learning • Discovers proof of validity/invalidity of a Hoare triple. • Las Vegas Algorithm: Generates a correct proof. Running time is probabilistic. • “Forward Analysis” + “Backward Analysis”
Conclusion • Combining Randomized & Symbolic techniques is powerful • Interprocedural Random Interpretation [POPL ’05] • DART [PLDI ’05], Yogi [FSE ’06] • This work • Machine Learning Algorithm • Inconsistency Measure for an abstract domain: How far are two abstract elements from satisfying the partial order? • Algorithm: Pick a program point (randomly) whose invariant is locally inconsistent & update it to make it less inconsistent. • Intersection of forward and backward analysis.