Learn about Boolean Satisfiability, theory solvers, SMT solving approaches, applications in hardware verification, software testing, and more. Discover logical theory, syntax, and useful theories like linear arithmetic and equality theory.
Introduction to Satisfiability Modulo Theories (SMT)
Clark Barrett, NYU
Sanjit A. Seshia, UC Berkeley
ICCAD Tutorial, November 2, 2009
Boolean Satisfiability (SAT)
[Figure: a Boolean circuit φ built from ∧, ∨, and ¬ gates over the inputs p1, p2, …, pn]
Is there an assignment to the variables p1, p2, …, pn such that φ evaluates to 1?
ICCAD 2009 Tutorial
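The question on this slide can be made concrete with a tiny brute-force check. The following is only a sketch: the function name `brute_force_sat` and the example formula are assumptions made for illustration, and real SAT solvers use far more sophisticated search than exhaustive enumeration.

```python
from itertools import product

def brute_force_sat(formula, num_vars):
    """Return a satisfying assignment as a tuple of bools, or None if none exists."""
    for assignment in product([False, True], repeat=num_vars):
        if formula(*assignment):
            return assignment
    return None

# Hypothetical example formula: (p1 or not p2) and (p2 or p3) and (not p1)
phi = lambda p1, p2, p3: (p1 or not p2) and (p2 or p3) and (not p1)
model = brute_force_sat(phi, 3)
print(model)
```

The enumeration visits 2^n assignments, which is why it only serves as an illustration of the problem statement.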
Satisfiability Modulo Theories
[Figure: the same circuit φ, with the Boolean inputs p1, p2, …, pn replaced by theory atoms such as x = y, x + 2z ≥ 1, w & 0xFFFF = x, and x % 26 = v]
Is there an assignment to the variables x, y, z, w such that φ evaluates to 1?
Satisfiability Modulo Theories
• Given a formula in first-order logic, with associated background theories, is the formula satisfiable?
  • Yes: return a satisfying solution
  • No [generate a proof of unsatisfiability]
Applications of SMT
• Hardware verification at higher levels of abstraction (RTL and above)
• Verification of analog/mixed-signal circuits
• Verification of hybrid systems
• Software model checking
• Software testing
• Security: finding vulnerabilities, verifying electronic voting machines, …
• Program synthesis
• …
References
• Clark Barrett, Roberto Sebastiani, Sanjit A. Seshia, and Cesare Tinelli. Satisfiability Modulo Theories. Chapter 8 in the Handbook of Satisfiability, Armin Biere, Hans van Maaren, and Toby Walsh, editors, IOS Press, 2009. (available from our webpages)
• SMTLIB: A repository for SMT formulas (common format) and tools
• SMTCOMP: An annual competition of SMT solvers
Roadmap for this Tutorial
• Background and Notation
• Survey of Theories
• Theory Solvers
• Approaches to SMT Solving
  • Lazy Encoding to SAT
  • Eager Encoding to SAT
• Conclusion
First-Order Logic
• A formal notation for mathematics, with expressions involving
  • Propositional symbols
  • Predicates
  • Functions and constant symbols
  • Quantifiers
• In contrast, propositional (Boolean) logic only involves propositional symbols and operators
First-Order Logic: Syntax
• As with propositional logic, expressions in first-order logic are made up of sequences of symbols.
• Symbols are divided into logical symbols and non-logical symbols (or parameters).
• Example: (x = y) ∧ (y = z) ∧ (f(z) ≥ f(x) + 1)
First-Order Logic: Syntax
• Logical symbols
  • Propositional connectives: ∨, ∧, ¬, →, ↔
  • Variables: v1, v2, …
  • Quantifiers: ∀, ∃
• Non-logical symbols/parameters
  • Equality: =
  • Functions: +, -, %, bit-wise &, f(), concat, …
  • Predicates: ≤, is_substring, …
  • Constant symbols: 0, 1.0, null, …
Quantifier-free Subset
• We will largely restrict ourselves to formulas without quantifiers (∀, ∃)
• This is called the quantifier-free subset/fragment of first-order logic with the relevant theory
Logical Theory
• Defines a set of parameters (non-logical symbols) and their meanings
• This definition is called a signature.
• Example of a signature: theory of linear arithmetic over integers
  • Signature is (0, 1, +, -, ≤) interpreted over Z
Roadmap for this Tutorial
• Background and Notation
• Survey of Theories
• Theory Solvers
• Two Approaches to SMT Solving
  • Lazy Encoding to SAT
  • Eager Encoding to SAT
• Conclusion
Some Useful Theories
• Equality (with uninterpreted functions)
• Linear arithmetic (over Q or Z)
• Difference logic (over Q or Z)
• Finite-precision bit-vectors
  • integer or floating-point
• Arrays / memories
• Misc.: non-linear arithmetic, strings, inductive datatypes (e.g. lists), sets, …
Theory of Equality and Uninterpreted Functions (EUF)
• Also called the “free theory”
  • Because function symbols can take any meaning
  • Only property required is congruence: these symbols map identical arguments to identical values, i.e., x = y ⇒ f(x) = f(y)
• SMTLIB name: QF_UF
Data and Function Abstraction with EUF
• Bit-vectors to abstract domain (e.g. Z): the bits x0, x1, x2, …, xn-1 are abstracted to a single term x
• Functional units to uninterpreted functions: an ALU becomes f, with a = x ∧ b = y ⇒ f(a,b) = f(x,y)
• Common operations
  • If-then-else: ITE(p, x, y) selects x when p is 1 and y when p is 0
  • Test for equality: x = y
Hardware Abstraction with EUF
• For any block that transforms or evaluates data:
  • Replace with generic, unspecified function
  • Also view instruction memory as function
[Figure: pipelined processor datapath (IF/ID, ID/EX, EX/WB stages; PC, instruction memory, register file, ALU, control) with data-transforming blocks replaced by uninterpreted functions F1, F2, F3]
Example QF_UF (EUF) Formula
(x = y) ∧ (y = z) ∧ (f(x) ≠ f(z))
• Transitivity: (x = y) ∧ (y = z) ⇒ (x = z)
• Congruence: (x = z) ⇒ (f(x) = f(z))
• Together these yield f(x) = f(z), contradicting f(x) ≠ f(z), so the formula is unsatisfiable.
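The transitivity and congruence reasoning on this slide can be sketched as a naive congruence-closure check. This is an illustrative sketch, not the algorithm of any particular solver: the function name `euf_satisfiable` and the term representation (variable names as strings, applications as tuples such as `("f", "x")`) are assumptions, and the quadratic fixpoint loop is chosen for brevity rather than speed.

```python
def euf_satisfiable(equalities, disequalities):
    """Decide a conjunction of equalities and disequalities over EUF terms.

    Terms are variable names (str) or applications written as tuples,
    e.g. ("f", "x") for f(x). Both arguments are lists of (lhs, rhs) pairs.
    """
    parent = {}

    def add(term):
        # Register a term and, recursively, its arguments.
        if term not in parent:
            parent[term] = term
            if isinstance(term, tuple):
                for arg in term[1:]:
                    add(arg)

    def find(term):
        # Union-find lookup with path halving.
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for lhs, rhs in equalities + disequalities:
        add(lhs)
        add(rhs)
    for lhs, rhs in equalities:
        parent[find(lhs)] = find(rhs)

    # Congruence: merge applications of the same symbol whose arguments are equal.
    apps = [t for t in parent if isinstance(t, tuple)]
    changed = True
    while changed:
        changed = False
        for s in apps:
            for t in apps:
                if (find(s) != find(t) and s[0] == t[0] and len(s) == len(t)
                        and all(find(a) == find(b) for a, b in zip(s[1:], t[1:]))):
                    parent[find(s)] = find(t)
                    changed = True

    return all(find(lhs) != find(rhs) for lhs, rhs in disequalities)

fx, fz = ("f", "x"), ("f", "z")
print(euf_satisfiable([("x", "y"), ("y", "z")], [(fx, fz)]))
```

For the slide's formula the check reports unsatisfiable; dropping the equality y = z makes it satisfiable, since nothing then forces f(x) and f(z) together.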
Equivalence Checking of Program Fragments
int fun1(int y) { int x, z; z = y; y = x; x = z; return x*x; }
int fun2(int y) { return y*y; }
SMT formula φ (satisfiable iff the programs are non-equivalent):
(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = x1*x1) ∧ (ret2 = y*y) ∧ (ret1 ≠ ret2)
What if we use SAT to check equivalence?
Equivalence Checking of Program Fragments
int fun1(int y) { int x, z; z = y; y = x; x = z; return x*x; }
int fun2(int y) { return y*y; }
SMT formula φ (satisfiable iff the programs are non-equivalent):
(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = x1*x1) ∧ (ret2 = y*y) ∧ (ret1 ≠ ret2)
Using SAT to check equivalence (w/ Minisat):
• 32 bits for y: did not finish in over 5 hours
• 16 bits for y: 37 sec.
• 8 bits for y: 0.5 sec.
Equivalence Checking of Program Fragments
int fun1(int y) { int x, z; z = y; y = x; x = z; return x*x; }
int fun2(int y) { return y*y; }
SMT formula φ’ (multiplication abstracted by the uninterpreted function sq):
(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = sq(x1)) ∧ (ret2 = sq(y)) ∧ (ret1 ≠ ret2)
Using EUF solver: 0.01 sec
Equivalence Checking of Program Fragments
int fun1(int y) { int x; x = x ^ y; y = x ^ y; x = x ^ y; return x*x; }
int fun2(int y) { return y*y; }
Does EUF still work?
• No! Must reason about bit-wise XOR.
• Need a solver for bit-vector arithmetic.
• Solvable in less than a second with a current bit-vector solver.
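To see why the XOR version is still equivalent to fun2, the two fragments can be compared exhaustively at a small bit width. This is a sketch under stated assumptions: the width is fixed at 8 bits, the uninitialized `x` is treated as an arbitrary input `x0`, and arithmetic wraps modulo 2^8 (C leaves signed overflow undefined, so wraparound is a modeling choice). A bit-vector solver reaches the same conclusion symbolically instead of by enumeration.

```python
WIDTH = 8                  # assumed bit width for illustration
MASK = (1 << WIDTH) - 1

def fun1(y, x0):
    """XOR-swap version; x0 stands for the arbitrary initial value of x."""
    x = x0
    x = (x ^ y) & MASK
    y = (x ^ y) & MASK
    x = (x ^ y) & MASK
    return (x * x) & MASK

def fun2(y):
    return (y * y) & MASK

def find_counterexample():
    """Return (y, x0) on which the fragments differ, or None if they agree everywhere."""
    for y in range(1 << WIDTH):
        for x0 in range(1 << WIDTH):
            if fun1(y, x0) != fun2(y):
                return (y, x0)
    return None

print(find_counterexample())
```

Enumeration is feasible here only because 8 bits give 65,536 input pairs; the slide's point is that word-level reasoning scales where this does not.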
Finite-Precision Bit-Vector Arithmetic (QF_BV)
• Fixed-width data words
  • Can model int, short, long, etc.
• Arithmetic operations
  • E.g., add/subtract/multiply/divide & comparisons
  • Two’s complement and unsigned operations
• Bit-wise logical operations
  • E.g., and/or/xor, shift/extract and equality
• Boolean connectives
Linear Arithmetic (QF_LRA, QF_LIA)
• Boolean combination of linear constraints of the form a1 x1 + a2 x2 + … + an xn ∼ b
• The xi could be in Q or Z, and ∼ ∈ {≥, >, ≤, <, =}
• Many applications, including:
  • Verification of analog circuits
  • Software verification, e.g., of array bounds
Difference Logic (QF_IDL, QF_RDL)
• Boolean combination of linear constraints of the form xi - xj ∼ cij or xi ∼ ci, where ∼ ∈ {≥, >, ≤, <, =} and the xi are in Q or Z
• Applications:
  • Software verification (most linear constraints are of this form)
  • Processor datapath verification
  • Job shop scheduling / real-time systems
  • Timing verification for circuits
Arrays/Memories
• SMT solvers can also be very effective in modeling data structures in software and hardware
  • Arrays in programs
  • Memories in hardware designs: e.g. instruction and data memories, CAMs, etc.
Theory of Arrays (QF_AX): Select and Store
• Two interpreted functions: select and store
  • select(A,i): read from A at index i
  • store(A,i,d): write d to A at index i
• Two main axioms:
  • select(store(A,i,d), i) = d
  • select(store(A,i,d), j) = select(A,j) for i ≠ j
• One other axiom:
  • (∀ i. select(A,i) = select(B,i)) ⇒ A = B
Equivalence Checking of Program Fragments
int fun1(int y) { int x[2]; x[0] = y; y = x[1]; x[1] = x[0]; return x[1]*x[1]; }
int fun2(int y) { return y*y; }
SMT formula φ’’:
[ x1 = store(x,0,y) ∧ y1 = select(x1,1) ∧ x2 = store(x1,1,select(x1,0)) ∧ ret1 = sq(select(x2,1)) ] ∧ (ret2 = sq(y)) ∧ (ret1 ≠ ret2)
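The select/store axioms and the array version of fun1 can be mimicked with immutable-style dictionaries. This is a sketch only: modeling an array as a Python dict, returning None for unwritten cells, and the helper name `fun1_return_argument` are all assumptions made for illustration; an SMT solver reasons about these terms symbolically rather than by evaluating them.

```python
def store(array, index, value):
    """Return a new array equal to `array` except that `index` maps to `value`."""
    updated = dict(array)
    updated[index] = value
    return updated

def select(array, index):
    """Read `array` at `index`; unwritten cells read as None (an arbitrary choice)."""
    return array.get(index)

def fun1_return_argument(x, y):
    """Follow the store/select chain of the array version of fun1.

    Returns the value that fun1 finally squares, i.e. select(x2, 1).
    """
    x1 = store(x, 0, y)
    x2 = store(x1, 1, select(x1, 0))
    return select(x2, 1)

A = {0: 5, 1: 7}
print(select(store(A, 0, 9), 0), select(store(A, 0, 9), 1), fun1_return_argument({}, 42))
```

Because the value squared by fun1 is always y, the arguments of sq coincide in both programs, which is what makes ret1 ≠ ret2 impossible.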
Over to Clark…
Eager Approach to SMT
[Figure: eager encoding flow. Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]
• SAT solver is involved in theory reasoning
• Key ideas: small-domain encoding; constrain model search; rewrite rules; abstraction-based methods (eager + lazy)
• Example solvers: UCLID, STP, Spear, Boolector, Beaver, …
Theories
• Eager encoding methods have been demonstrated for the following theories:
  • Equality & uninterpreted functions
  • Integer linear arithmetic
  • Restricted lambda expressions
    • Arrays, memories, etc.
  • Finite-precision bit-vector arithmetic
  • Strings
UCLID Operation
• Operation
  • Series of transformations leading to a Boolean formula
  • Each step is validity (satisfiability) preserving
  • Each step performs optimizations
[Figure: UCLID flow. Input Formula → Lambda Expansion for Arrays → λ-free Formula → Function & Predicate Elimination → Linear/Bit-vector Arithmetic Formula → Encoding Arithmetic → Boolean Formula → Boolean Satisfiability]
http://uclid.eecs.berkeley.edu
Rewrites: Eliminating Function Applications
• Two applications of an uninterpreted function f in a formula: f(x1) and f(x2)
• Ackermann’s encoding: replace f(x1) with vf1 and f(x2) with vf2, and add the constraint x1 = x2 ⇒ vf1 = vf2
• Bryant, German, Velev’s encoding: replace f(x1) with vf1 and f(x2) with ITE(x1 = x2, vf1, vf2)
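Ackermann's encoding, as named on this slide, generalizes to any number of applications of a unary function: one fresh variable per application plus one functional-consistency constraint per pair. The sketch below illustrates that pattern; the function name `ackermannize` and the string form of the constraints are assumptions for illustration and do not reflect any solver's internal representation.

```python
from itertools import combinations

def ackermannize(args):
    """Eliminate applications f(a) for each argument name a in `args`.

    Returns (fresh, constraints): `fresh` maps each argument to the variable
    replacing f(argument), and `constraints` lists the consistency
    conditions that must be conjoined with the rewritten formula.
    """
    fresh = {arg: f"vf{i}" for i, arg in enumerate(args, start=1)}
    constraints = [
        f"({a} = {b}) => ({fresh[a]} = {fresh[b]})"
        for a, b in combinations(args, 2)
    ]
    return fresh, constraints

fresh, constraints = ackermannize(["x1", "x2"])
print(fresh, constraints)
```

The number of constraints grows quadratically in the number of applications, which is one motivation for alternative encodings such as the ITE-based one.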
Small-Domain Encoding
• Consider an SMT formula φ(x1, x2, …, xn) where xi ∈ Di
• Small-domain encoding/finite instantiation: derive a finite set Si ⊂ Di s.t. |Si| ≪ |Di|
  • In some cases, Si is finite where Di is infinite
• Encode each xi to take values only in Si
  • Could be done by encoding to SAT
• Example: Integer Linear Arithmetic (QF_LIA)
Solving QF_LIA is NP-complete
• In NP:
  • If a satisfying solution exists, then one exists within a bound d
  • log d is polynomial in input size
• Expression for d [Papadimitriou, ’82]: (n + m) · (bmax + 1) · (m · amax)^(2m+3)
• Input size:
  • m: number of constraints
  • n: number of variables
  • bmax: largest constant (absolute value)
  • amax: largest coefficient (absolute value)
Small-domain encoding / Finite Instantiation: Naïve approach
• Steps
  • Calculate the solution bound d
  • Encode each integer variable with ⌈log d⌉ bits & translate to Boolean formula
  • Run SAT solver
• Problem: for QF_LIA, d is Ω(m^m)
  • Ω(m log m) bits per variable
• Solution: exploit special cases and domain-specific structure
Special Case 1: Equality Logic
• Linear constraints are equalities xi = xj
• Result: d = n [Pnueli et al., Information and Computation, 2002]
• x1 = x2 ∧ x2 ≠ x3 ∧ x1 ≠ x3: can find a solution with domain {1, 2}
• x1 ≠ x2 ∧ x2 ≠ x3 ∧ x1 ≠ x3: a 3-valued domain is needed: {1, 2, 3}
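The two equality-logic examples can be checked by searching for the smallest domain {1, …, s} that admits a solution. This is an illustrative brute-force sketch: the function name `min_domain_size`, the 0-based variable indices, and the restriction to a conjunction of equalities and disequalities are assumptions, not the construction from the cited paper.

```python
from itertools import product

def min_domain_size(num_vars, equalities, disequalities):
    """Smallest s such that the constraints have a solution over {1, ..., s}.

    `equalities` and `disequalities` are lists of (i, j) index pairs.
    Returns None if no domain of size up to num_vars works.
    """
    for size in range(1, num_vars + 1):
        for values in product(range(1, size + 1), repeat=num_vars):
            if (all(values[i] == values[j] for i, j in equalities)
                    and all(values[i] != values[j] for i, j in disequalities)):
                return size
    return None

# x1 = x2, x2 != x3, x1 != x3 (indices shifted to start at 0)
print(min_domain_size(3, [(0, 1)], [(1, 2), (0, 2)]))
# x1 != x2, x2 != x3, x1 != x3
print(min_domain_size(3, [], [(0, 1), (1, 2), (0, 2)]))
```

Stopping the search at `num_vars` reflects the slide's result d = n: if no solution exists over n values, none exists at all.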
Special Case 2: Difference Logic
• Boolean combination of difference-bound constraints
  • xi ≥ xj + b, ±xi ≥ b
• Result: d = n · (bmax + 1) [Bryant, Lahiri, Seshia, CAV’02]
• Proof sketch: satisfying solution corresponds to shortest path in constraint graph
  • Longest such path has length ≤ n · (bmax + 1)
• Tighter formula-specific bounds possible
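The constraint-graph view behind the proof sketch can be illustrated with a Bellman-Ford pass over a conjunction of difference constraints. This sketch makes several assumptions: it handles only conjunctions (not arbitrary Boolean combinations), constraints are normalized to the form x_i - x_j <= c (so xi ≥ xj + b becomes x_j - x_i <= -b), and the function name `solve_difference_constraints` is hypothetical. It is a direct graph algorithm, not the small-domain SAT encoding itself.

```python
def solve_difference_constraints(num_vars, constraints):
    """Solve a conjunction of constraints x_i - x_j <= c over the integers.

    `constraints` is a list of (i, j, c) triples. Returns a list of values
    (shortest-path distances from a virtual source) or None if a negative
    cycle makes the conjunction unsatisfiable.
    """
    dist = [0] * num_vars  # virtual source reaches every variable at cost 0
    for _ in range(num_vars):
        changed = False
        for i, j, c in constraints:
            if dist[j] + c < dist[i]:
                dist[i] = dist[j] + c
                changed = True
        if not changed:
            return dist
    return None  # still relaxing after n passes: negative cycle

# x1 >= x0 + 1 and x2 >= x1 + 1, written as x0 - x1 <= -1 and x1 - x2 <= -1
chain = [(0, 1, -1), (1, 2, -1)]
print(solve_difference_constraints(3, chain))
# Adding x2 - x0 <= 1 closes a cycle of total weight -1.
print(solve_difference_constraints(3, chain + [(2, 0, 1)]))
```

Each returned value is a shortest-path length, so its magnitude is bounded by the number of variables times the largest constant, which is the intuition behind the bound on the slide.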
Special Case 3: Generalized 2SAT
• Generalized 2SAT constraints
  • xi + xj ≥ b, -xi - xj ≥ b, xi - xj ≥ b, xi ≥ b
• d = 2 · n · (bmax + 1) [Seshia, Subramani, Bryant, ’04]
Full Integer Linear Arithmetic
• Can we avoid the m^m blow-up?
• In fact, yes. The idea is to derive a new parameterized solution bound d
  • Formalize parameters that the bound really depends on
  • Parameters characterize sparse structure
  • Occurs especially in software verification; also in many high-level hardware models
• [Seshia & Bryant, LICS’04, LMCS’05]
Structure of Linear Constraints in Software Verification
• Characteristics of studied benchmarks
  • Mostly difference constraints
    • Only 3% of constraints were NOT difference constraints
  • Non-difference constraints are sparse
    • At most 6 variables per constraint (total number of variables in 1000s)
• Some similar observations: Pratt’77, ESC/Java-Simplify-TR’03
Parameterized Solution Bound
• New parameters:
  • k non-difference constraints
  • w variables per constraint (width)
• Our solution bound: n · (bmax + 1) · (w · amax)^k
• Previous: (n + m) · (bmax + 1) · (m · amax)^(2m+3)
• Direct dependence on m eliminated (and k ≪ m)
Example
[Figure: a Boolean combination (∧, ∨, ¬) of the three constraints below]
• x1 - x2 ≥ 1
• x1 + 2 x2 + x3 > -3
• x2 - x4 ≥ 0
• d = 96
• Previous d = 282,175,488
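The two bounds in this example can be recomputed directly from the formulas given earlier in the tutorial. The parameter values below are read off the three constraints and are an interpretation rather than something the slide states explicitly: n = 4 variables, m = 3 constraints, amax = 2, bmax = 3, k = 1 non-difference constraint, and width w = 3. The function names are hypothetical.

```python
def papadimitriou_bound(n, m, a_max, b_max):
    """Previous bound: (n + m) * (b_max + 1) * (m * a_max)^(2m + 3)."""
    return (n + m) * (b_max + 1) * (m * a_max) ** (2 * m + 3)

def parameterized_bound(n, k, w, a_max, b_max):
    """Parameterized bound: n * (b_max + 1) * (w * a_max)^k."""
    return n * (b_max + 1) * (w * a_max) ** k

# Assumed parameters for the example constraints (see the paragraph above).
previous_d = papadimitriou_bound(n=4, m=3, a_max=2, b_max=3)
new_d = parameterized_bound(n=4, k=1, w=3, a_max=2, b_max=3)
print(previous_d, new_d)                              # 282175488 96
print(previous_d.bit_length(), new_d.bit_length())    # 29 7
```

Under these assumed parameters the computed values match the slide's numbers, and the bit lengths show the practical effect: 7 bits per variable instead of 29 when values up to d are encoded in binary.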
Summary of d Values
Abstraction-Based Methods
• For some logics, one cannot easily compute a closed-form expression for the small domain
  • Example: Bit-Vector Arithmetic
• In such cases, an abstraction-refinement approach can be used to compute formula-specific small domains
Bit-Vector Arithmetic: Some History
• B.C. (Before Chaff)
  • String operations (concatenate, field extraction)
  • Linear arithmetic with bounds checking
  • Modular arithmetic
• SAT-based “bit blasting”
  • Generate Boolean circuit based on bit-level behavior of operations
  • Handles arbitrary operations
  • Check with best available SAT solver
  • Effective in many applications
    • CBMC [Clarke, Kroening, Lerda, TACAS ’04]
    • Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV ’05]
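Bit blasting replaces each word-level operation by a Boolean circuit over the individual bits. The sketch below illustrates the idea for addition with a ripple-carry adder, evaluated on concrete bits rather than handed to a SAT solver; the helper names `to_bits`, `from_bits`, and `blast_add` and the least-significant-bit-first ordering are assumptions for illustration.

```python
def to_bits(value, width):
    """Little-endian list of booleans for the low `width` bits of `value`."""
    return [bool((value >> i) & 1) for i in range(width)]

def from_bits(bits):
    """Inverse of to_bits: rebuild the unsigned integer."""
    return sum(1 << i for i, bit in enumerate(bits) if bit)

def blast_add(a_bits, b_bits):
    """Ripple-carry adder using only Boolean operations.

    The final carry is dropped, so the result wraps at the input width.
    """
    carry = False
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                 # sum bit
        carry = (a and b) or (carry and (a ^ b))  # carry out
    return out

print(from_bits(blast_add(to_bits(11, 4), to_bits(7, 4))))  # (11 + 7) mod 16
```

In a real bit-blasting flow each Boolean expression would become clauses over fresh propositional variables; multiplication and division produce much larger circuits, which is part of what motivates word-level alternatives.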
Research Challenge
• Is there a better way than bit blasting?
• Requirements
  • Provide same functionality as with bit blasting
    • Must support all bit-vector operators
  • Exploit word-level structure
  • Improve on performance of bit blasting
• Current approaches are based on two core ideas:
  • Simplification: simplify the input formula using word-level rewrite rules and solvers
  • Abstraction: can use automatic abstraction-refinement to solve the simplified formula