This research paper explores SAT-based decision procedures for subsets of first-order logic, specifically focused on separation logic. It discusses various encoding techniques and their impact on the efficiency of SAT solvers for verification purposes. The paper also presents a revised selection strategy for choosing between different encoding methods based on the input formula's features.
SAT-Based Decision Procedures for Subsets of First-Order Logic Part II: Separation Logic Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant
Outline • Background • SAT-based Decision Procedures • Equality with Uninterpreted Functions • Translating to propositional formula • Exploiting positive equality and sparse transitivity • Separation Logic • Translating to propositional formula • Hybrid encoding techniques
Separation Logic with Uninterpreted Functions (SUF)
• Suitable for verifying a wider class of systems
• Terms (T): integer expressions
  • ITE(F, T1, T2): if-then-else
  • Fun(T1, …, Tk): function application
  • T + 1: increment
  • T - 1: decrement
• Formulas (F): Boolean expressions
  • ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives
  • T1 = T2: equation
  • T1 < T2: inequality
  • Pred(T1, …, Tk): predicate application
SUF → Separation Logic
• Eliminate function and predicate applications using fresh variables and ITE expressions [Bryant, German, Velev, CAV’99]
  • f(x) → v1 and f(y) → ITE(x = y, v1, v2)
• Terms (T): integer expressions
  • ITE(F, T1, T2): if-then-else
  • Fun(T1, …, Tk): function application, replaced by an integer variable v
  • T + 1: increment
  • T - 1: decrement
• Formulas (F): Boolean expressions
  • ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives
  • T1 = T2 (equation) and T1 < T2 (inequality): separation predicates
  • Pred(T1, …, Tk): predicate application, replaced by a Boolean variable b
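The elimination step above can be sketched in a few lines. This is a minimal illustration, not the cited implementation: it assumes single-argument applications, represents terms as strings and nested tuples, and the name `eliminate_function` and the `v1, v2, …` naming scheme are hypothetical.

```python
def eliminate_function(args_list, prefix="v"):
    """Replace the i-th application f(args_list[i]) by a nested ITE term.

    args_list: argument terms (strings) of successive applications of one function f.
    Returns one replacement per application: either a fresh variable name or a
    tuple ("ITE", ("=", a, b), then_term, else_term).
    """
    fresh = [f"{prefix}{i + 1}" for i in range(len(args_list))]
    replacements = []
    for i, arg in enumerate(args_list):
        term = fresh[i]  # default: this application gets its own fresh variable
        # Wrap from the inside out so that earlier applications are tested first:
        # if arg equals an earlier argument, reuse that application's variable.
        for j in range(i - 1, -1, -1):
            term = ("ITE", ("=", args_list[j], arg), fresh[j], term)
        replacements.append(term)
    return replacements


# f(x) and f(y), as on the slide
print(eliminate_function(["x", "y"]))
```

For `["x", "y"]` the second replacement is the tuple form of ITE(x = y, v1, v2), matching the slide. A third application f(z) would nest one level deeper, testing x = z before y = z.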
Eager Boolean Encoding Methods for Separation Logic
• Flow: Separation Logic Formula → Boolean Formula → SAT Solver → satisfiable/unsatisfiable
• Encoding methods
  • Small Domain Encoding (SD)
  • Per-Constraint Encoding (EIJ)
Small Domain Encoding (SD) [Bryant, Lahiri, Seshia, CAV’02]
• Example predicates: x ≥ y, y ≥ z, z ≥ x+1
• Observation: To check satisfiability, need to consider all possible relative orderings of finitely many expressions
• Can use Boolean encoding of finite range of values
• 4 values in this case, so 2-bit encoding: x as x1x0, y as y1y0, z as z1z0, x+1 as 0x1x0 + 1
• (Figure: the expressions x, x+1, y, z placed along an axis of increasing values)
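The small-domain idea can be illustrated by plain enumeration in place of a bit-level SAT encoding. The sketch below is only a stand-in under stated assumptions: the example formula is taken to be the conjunction x ≥ y ∧ y ≥ z ∧ z ≥ x+1, the domain size 4 is the one quoted on the slide, and `satisfiable_over_small_domain` is a hypothetical helper name.

```python
from itertools import product


def satisfiable_over_small_domain(formula, variables, domain_size):
    """Try every assignment of the variables to 0 .. domain_size-1.

    formula: function taking a dict {variable: value} and returning a bool.
    Returns a satisfying assignment, or None if none exists in the domain.
    """
    for values in product(range(domain_size), repeat=len(variables)):
        env = dict(zip(variables, values))
        if formula(env):
            return env
    return None


def example(env):
    # Assumed connective: conjunction of the three predicates
    x, y, z = env["x"], env["y"], env["z"]
    return x >= y and y >= z and z >= x + 1


result = satisfiable_over_small_domain(example, ["x", "y", "z"], 4)
print(result)  # None: no assignment in the small domain satisfies the conjunction
```

An actual SD encoding would represent each variable by a bit-vector (2 bits here) and hand the resulting Boolean formula to a SAT solver instead of enumerating the 64 assignments.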
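Per-Constraint Encoding (EIJ) [Strichman, Seshia, Bryant, CAV’02]
• Example predicates: x ≥ y, y ≥ z, z ≥ x+1
• One Boolean variable per separation predicate: e1 for x ≥ y, e2 for y ≥ z, e3 for z ≥ x+1
• Overall Boolean encoding: the original formula over e1, e2, e3
• New separation predicate: e4 for x ≥ z
• Transitivity constraints: e1 ∧ e2 ⇒ e4, and a constraint relating e4 and e3 (x ≥ z and z ≥ x+1 cannot both hold)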
Enforcing Transitivity Constraints
• Graph Representation of Separation Constraints
  • Directed multigraph where edges labeled by constants
  • Edge between x and y labeled c1 represents x ≥ y + c1
• Fourier-Motzkin Elimination
  • Eliminate nodes in succession
  • Possibly exponential growth in edges
• (Figure: nodes x, y, z with edges labeled c1, c2, c3, c4; eliminating a node adds edges labeled c1 + c2, c1 + c4, c3 + c2, c3 + c4)
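One node-elimination step on the constraint graph can be sketched as follows. The sketch assumes each edge (u, v, c) stands for the predicate u ≥ v + c, handles only the positive (non-negated) case, and uses a hypothetical function name `eliminate_node`; the full procedure in the cited paper also has to account for negated predicates.

```python
def eliminate_node(edges, node):
    """Eliminate one node and collect the transitivity constraints it induces.

    edges: set of (u, v, c) triples, each read as the predicate u >= v + c.
    Returns (remaining_edges, constraints); each constraint (p, q, r) reads
    "p and q together imply r".
    """
    incoming = [e for e in edges if e[1] == node and e[0] != node]
    outgoing = [e for e in edges if e[0] == node and e[1] != node]
    remaining = {e for e in edges if node not in (e[0], e[1])}
    constraints = []
    for (a, _, c1) in incoming:
        for (_, b, c2) in outgoing:
            if a == b:
                # a >= a + c holds exactly when c <= 0, so the consequent is a constant
                derived = (c1 + c2) <= 0
            else:
                # a >= node + c1 and node >= b + c2 give a >= b + (c1 + c2)
                derived = (a, b, c1 + c2)
                remaining.add(derived)
            constraints.append(((a, node, c1), (node, b, c2), derived))
    return remaining, constraints


# x >= y, y >= z, z >= x + 1; eliminate y
edges = {("x", "y", 0), ("y", "z", 0), ("z", "x", 1)}
remaining, constraints = eliminate_node(edges, "y")
print(remaining)
print(constraints)
```

Each pair of an incoming and an outgoing edge yields one new edge, which is why repeated elimination can grow the edge set exponentially.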
Introducing New Predicates
• (Figure: the elimination graph over x, y, z with edges labeled c1, c2, c3, c4 and derived edges c1 + c2, c1 + c4, c3 + c2, c3 + c4)
• Sample Predicates
• Sample Transitivity Constraint
• Sample Ordering Constraint (for c1 < c2)
Comparing Eager Encoding Methods • Of SD and EIJ encoding methods, which one is better? • Comparison with respect to • Size of resulting Boolean formula • Performance of SAT solver
Size of Boolean Encoding: SD better than EIJ
• Let N be size of original separation logic formula
  • Size of a directed acyclic graph representation
• SD encoding size is worst-case O(N^2)
• EIJ encoding size is worst-case O(2^N)
  • Can generate O(2^N) transitivity constraints
• Example: N = 6813
  Method    Boolean Encoding Size
  EIJ       > 1000000
  SD        54465
Impact on SAT problem: SD vs EIJ • Experimentally compared zChaff performance on SD and EIJ encodings of several unsatisfiable formulas • Sample result: EIJ better than SD for zChaff
Impact on SAT: Why is EIJ better than SD? • Conjecture: For SD, SAT solver has to “discover” transitivity constraints as conflict clauses • Violation of transitivity constraint might be discovered only after assigning bits of several bit-vectors • EIJ adds all such constraints a priori • Less learning and backtracking required by the SAT solver
Eager Encoding Tradeoffs • SD encoding • Polynomial size encoding • Worse for SAT solvers • EIJ encoding • Worst-case exponential size encoding • Better for SAT solvers • Can we automatically select between SD and EIJ based on the input formula?
Selection Strategy [Seshia, Lahiri, Bryant, DAC ‘03]
• Estimate number of transitivity constraints, C
  • C > T ? YES: Use SD encoding. NO: Use EIJ encoding
• Problem:
  • Computationally hard to estimate number of transitivity constraints
  • Can we use a different metric?
• Idea: Identify feature of the input formula that varies monotonically with run-time of EIJ (but not with run-time of SD)
Revised Selection Strategy
• Count number of separation predicates, m
  • m > T ? YES: Use SD encoding. NO: Use EIJ encoding
• Easy to count number of separation predicates
• Very approximate measure of # of transitivity constraints
  • Constraints only relate predicates that share variables
• Also need to automate setting of threshold T
  • Statistically estimate from “training” set of benchmarks
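Identifying Variable Classes
• Example formula: a combination, using ∧ and ∨, of the predicates x ≥ y, y ≥ z, z ≥ x+1, u ≥ v, u = v-2
• {x,y,z} shared among x ≥ y, y ≥ z, z ≥ x+1
• {u,v} shared among u ≥ v, u = v-2
• Assignments to {u,v} are independent of those to {x,y,z}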
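Variable classes of this kind are the connected components of the "appears in the same predicate" relation, so a union-find pass is one way to compute them. The sketch below is an illustration under that assumption; `variable_classes` is a hypothetical name and each predicate is reduced to the pair of variables it mentions.

```python
def variable_classes(predicates):
    """Group variables linked, directly or indirectly, by a shared predicate.

    predicates: iterable of (var_a, var_b) pairs, one per separation predicate.
    Returns a list of sets of variable names.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
            v = parent[v]
        return v

    for a, b in predicates:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two classes

    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


# Predicates from the slide: x>=y, y>=z, z>=x+1, u>=v, u=v-2
classes = variable_classes([("x", "y"), ("y", "z"), ("z", "x"), ("u", "v"), ("u", "v")])
print(classes)
```

The two resulting classes, {x, y, z} and {u, v}, can then be encoded independently.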
Hybrid Encoding Technique
• Input: Separation Logic Formula
• Compute
  1. Variable classes based on predicates
  2. Number of separation predicates for each class
• Per-class decision, e.g. class {x,y,z} with m1 predicates, …, class {u,v} with mk predicates
  • mi > T ? YES: SD. NO: EIJ
• Encode each class using SD or EIJ based on local decision
• Output: Encoded Boolean Formula
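The per-class decision can be sketched as a single pass that merges overlapping variable sets while counting predicates. This is a minimal sketch, assuming each predicate is given as a pair of variables and that the listed predicates are distinct; `choose_encodings` is a hypothetical name, and the threshold value in the example is arbitrary.

```python
def choose_encodings(predicates, threshold):
    """Pick SD or EIJ for each variable class by counting its predicates.

    predicates: list of (var_a, var_b) pairs, one per separation predicate.
    threshold: the value T; classes with more than T predicates use SD.
    Returns {frozenset_of_variables: "SD" or "EIJ"}.
    """
    classes = []  # list of [set_of_variables, predicate_count]
    for a, b in predicates:
        merged_vars, merged_count = {a, b}, 1
        untouched = []
        for vars_, count in classes:
            if vars_ & merged_vars:
                # This class shares a variable with the new predicate: merge it
                merged_vars |= vars_
                merged_count += count
            else:
                untouched.append([vars_, count])
        classes = untouched + [[merged_vars, merged_count]]
    return {
        frozenset(vars_): ("SD" if count > threshold else "EIJ")
        for vars_, count in classes
    }


# Predicates from the variable-class example; T = 2 is an arbitrary illustration
decision = choose_encodings(
    [("x", "y"), ("y", "z"), ("z", "x"), ("u", "v"), ("u", "v")], threshold=2
)
print(decision)
```

With T = 2, the class {x, y, z} (three predicates) is sent to SD and the class {u, v} (two predicates) to EIJ.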
Automatically Selecting a Threshold Value: Intuition EIJ run time increases drastically beyond a certain number of separation predicates
Automatically Selecting a Threshold Value using Clustering Cluster total time (Y-axis) values, minimizing variance of each cluster
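A two-cluster split of the run times that minimizes within-cluster variance can be sketched as below. The slide does not say how the threshold is read off the clusters, so taking the largest predicate count in the "fast" cluster is an assumption of this sketch, as is the name `select_threshold`; at least two samples are required.

```python
from statistics import pvariance


def select_threshold(samples):
    """Estimate a threshold T from training benchmarks.

    samples: list of (num_separation_predicates, total_time) pairs, length >= 2.
    Splits the run times into a fast and a slow cluster so that the summed
    within-cluster variance (weighted by cluster size) is smallest, then
    returns the largest predicate count found in the fast cluster.
    """
    ordered = sorted(samples, key=lambda s: s[1])  # order benchmarks by total time
    times = [t for _, t in ordered]
    best_split, best_cost = None, float("inf")
    for k in range(1, len(times)):
        low, high = times[:k], times[k:]
        cost = len(low) * pvariance(low) + len(high) * pvariance(high)
        if cost < best_cost:
            best_split, best_cost = k, cost
    fast_cluster = ordered[:best_split]
    return max(m for m, _ in fast_cluster)
```

Because the run times are one-dimensional, trying every split point of the sorted list finds the best two-cluster partition exactly, with no iterative clustering needed.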
Experimental Evaluation Setup • Compared Hybrid against • SD and EIJ encodings • Cooperating Validity Checker (CVC) based on lazy encoding method [Stump et al.’02] • Stanford Validity Checker (SVC) – non SAT-based [Barrett et al. ’96] • CVC & SVC can handle more expressive logics than SUF • Benchmarks • 49 unsatisfiable SUF formulas • Load-store unit, out-of-order unit, device driver code, compiler validation, DLX pipeline • Threshold value calculated from subset of 16 benchmarks • Worked well for 39 out of the 49 benchmarks • Setup • Used zChaff SAT solver • Imposed timeout of 1800 sec. on total time (Encoding+SAT)
Hybrid vs. SD (39/49 benchmarks) (Plot regions: Hybrid better, SD better)
Hybrid vs. EIJ (39/49 benchmarks) (Plot regions: Hybrid better, EIJ better)
Hybrid vs. Lazy Encoding (CVC) (39/49 benchmarks) (Plot regions: Hybrid better, CVC better)
Hybrid vs. Non-SAT-based Procedure (SVC) (39/49 benchmarks) (Plot regions: Hybrid better, SVC better)
SD outperforms Hybrid on 10/49 benchmarks (Plot regions: Hybrid better, SD better)
Conclusions & Ongoing Work • Hybrid combination of EIJ and SD encodings • is robust to formula variations • outperforms lazy encoding methods (CVC) • outperforms non-SAT-based methods (SVC) • Ongoing & Future work • Alternate estimators for number of transitivity constraints • Threshold setting technique based on clustering applies to other CAD problems too • Combination of lazy and eager encoding techniques might perform well on satisfiable formulas? • More on UCLID project webpage http://www.cs.cmu.edu/~uclid