Introduction to Satisfiability Modulo Theories (SMT)
Clark Barrett, NYU
Sanjit A. Seshia, UC Berkeley
ICCAD Tutorial, November 2, 2009
Boolean Satisfiability (SAT)
[Figure: a Boolean formula φ drawn as a circuit of ∨, ∧, and ¬ gates over the inputs p1, p2, …, pn]
Is there an assignment to the variables p1, p2, …, pn such that φ evaluates to 1?
ICCAD 2009 Tutorial
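The question on this slide can be made concrete with a brute-force check. The sketch below is only an illustration, not how production SAT solvers work: it assumes the formula is given in CNF as lists of signed integers (a DIMACS-like convention chosen here, not taken from the slides), and the name `brute_force_sat` is hypothetical.

```python
from itertools import product


def brute_force_sat(num_vars, clauses):
    """Return a satisfying assignment as a tuple of bools, or None.

    Assumed input format: `clauses` is a CNF formula, a list of clauses,
    each a list of non-zero integers. Literal k means variable p_k is true,
    literal -k means p_k is false.
    """
    # Enumerate all 2**num_vars assignments (exponential, illustration only).
    for assignment in product([False, True], repeat=num_vars):
        if all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return assignment
    return None


# (p1 or not p2) and (p2 or p3) and (not p1 or not p3)
model = brute_force_sat(3, [[1, -2], [2, 3], [-1, -3]])
# p1 and not p1 has no satisfying assignment
no_model = brute_force_sat(1, [[1], [-1]])
```

The exponential enumeration is exactly what SAT solvers avoid in practice; the point here is only the yes/no question and the returned assignment.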
Satisfiability Modulo Theories
[Figure: a circuit φ of ∨, ∧, and ¬ gates in which the Boolean inputs p1, p2, …, pn are replaced by theory atoms such as x = y, x + 2z ≥ 1, w & 0xFFFF = x, and x % 26 = v]
Is there an assignment to the variables x, y, z, w such that φ evaluates to 1?
Satisfiability Modulo Theories
• Given a formula in first-order logic, with associated background theories, is the formula satisfiable?
  • Yes: return a satisfying solution
  • No: [generate a proof of unsatisfiability]
Applications of SMT
• Hardware verification at higher levels of abstraction (RTL and above)
• Verification of analog/mixed-signal circuits
• Verification of hybrid systems
• Software model checking
• Software testing
• Security: finding vulnerabilities, verifying electronic voting machines, …
• Program synthesis
• …
References
• Clark Barrett, Roberto Sebastiani, Sanjit A. Seshia, and Cesare Tinelli. Satisfiability Modulo Theories. Chapter 8 in the Handbook of Satisfiability, Armin Biere, Hans van Maaren, and Toby Walsh, editors, IOS Press, 2009. (available from our webpages)
• SMTLIB: A repository for SMT formulas (common format) and tools
• SMTCOMP: An annual competition of SMT solvers
Roadmap for this Tutorial
• Background and Notation
• Survey of Theories
• Theory Solvers
• Approaches to SMT Solving
  • Lazy Encoding to SAT
  • Eager Encoding to SAT
• Conclusion
First-Order Logic
• A formal notation for mathematics, with expressions involving
  • Propositional symbols
  • Predicates
  • Functions and constant symbols
  • Quantifiers
• In contrast, propositional (Boolean) logic only involves propositional symbols and operators
First-Order Logic: Syntax
• As with propositional logic, expressions in first-order logic are made up of sequences of symbols.
• Symbols are divided into logical symbols and non-logical symbols, or parameters.
• Example: (x = y) ∧ (y = z) ∧ (f(z) ≥ f(x) + 1)
First-Order Logic: Syntax
• Logical symbols
  • Propositional connectives: ∨, ∧, ¬, →, ↔
  • Variables: v1, v2, …
  • Quantifiers: ∀, ∃
• Non-logical symbols / parameters
  • Equality: =
  • Functions: +, -, %, bit-wise &, f(), concat, …
  • Predicates: ≤, is_substring, …
  • Constant symbols: 0, 1.0, null, …
Quantifier-free Subset
• We will largely restrict ourselves to formulas without quantifiers (∀, ∃)
• This is called the quantifier-free subset (or fragment) of first-order logic with the relevant theory
Logical Theory
• Defines a set of parameters (non-logical symbols) and their meanings
• This definition is called a signature.
• Example of a signature: the theory of linear arithmetic over the integers
  • Signature is (0, 1, +, -, ≤), interpreted over Z
Some Useful Theories
• Equality (with uninterpreted functions)
• Linear arithmetic (over Q or Z)
• Difference logic (over Q or Z)
• Finite-precision bit-vectors
  • integer or floating-point
• Arrays / memories
• Misc.: non-linear arithmetic, strings, inductive datatypes (e.g. lists), sets, …
Theory of Equality and Uninterpreted Functions (EUF)
• Also called the "free theory"
  • Because function symbols can take any meaning
  • The only property required is congruence: these symbols map identical arguments to identical values, i.e., x = y ⇒ f(x) = f(y)
• SMTLIB name: QF_UF
Data and Function Abstraction with EUF
• Bit-vectors to abstract domain (e.g. Z): the bits x0, x1, x2, …, xn-1 become a single term x
• Functional units to uninterpreted functions: an ALU becomes f, with a = x ∧ b = y ⇒ f(a,b) = f(x,y)
• Common operations
  • If-then-else: ITE(p, x, y)
  • Test for equality: x = y
Hardware Abstraction with EUF
[Figure: a pipelined processor with IF/ID, ID/EX, and EX/WB pipeline registers, PC, instruction memory, register file, ALU, and control blocks; the data-transforming blocks are replaced by functions F1, F2, F3]
• For any block that transforms or evaluates data:
  • Replace it with a generic, unspecified function
  • Also view instruction memory as a function
Example QF_UF (EUF) Formula
(x = y) ∧ (y = z) ∧ (f(x) ≠ f(z))
• Transitivity: (x = y) ∧ (y = z) ⇒ (x = z)
• Congruence: (x = z) ⇒ (f(x) = f(z))
• Together these give f(x) = f(z), which contradicts f(x) ≠ f(z), so the formula is unsatisfiable.
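The transitivity and congruence steps on this slide can be mechanized with a small congruence-closure procedure. The following is a minimal sketch, assuming a conjunction of equalities and disequalities only; the term representation (strings for variables, tuples for function applications) and the name `euf_satisfiable` are choices made for this illustration, and the quadratic fixpoint loop is far simpler than what real EUF solvers use.

```python
def euf_satisfiable(equalities, disequalities):
    """Decide a conjunction of EUF equalities and disequalities.

    A term is a string (variable) or a tuple (function_name, arg1, ..., argk).
    Both arguments are lists of (term, term) pairs.
    """
    terms = set()

    def collect(term):
        # Register the term and all of its subterms.
        terms.add(term)
        if isinstance(term, tuple):
            for arg in term[1:]:
                collect(arg)

    for left, right in equalities + disequalities:
        collect(left)
        collect(right)

    parent = {term: term for term in terms}  # union-find forest

    def find(term):
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    def union(a, b):
        parent[find(a)] = find(b)

    # Equalities merge classes; union-find gives transitivity for free.
    for left, right in equalities:
        union(left, right)

    # Congruence: merge f(s...) and f(t...) whenever all arguments are equal.
    applications = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:
        changed = False
        for s in applications:
            for t in applications:
                if (find(s) != find(t) and s[0] == t[0] and len(s) == len(t)
                        and all(find(p) == find(q)
                                for p, q in zip(s[1:], t[1:]))):
                    union(s, t)
                    changed = True

    # Satisfiable iff no disequality relates two terms in the same class.
    return all(find(a) != find(b) for a, b in disequalities)


# (x = y) and (y = z) and (f(x) != f(z))
slide_formula = euf_satisfiable(
    [("x", "y"), ("y", "z")],
    [(("f", "x"), ("f", "z"))],
)
```

Dropping the equality y = z leaves x and z in different classes, so the same disequality becomes satisfiable.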
Equivalence Checking of Program Fragments
    int fun1(int y) {
      int x, z;
      z = y;
      y = x;
      x = z;
      return x*x;
    }
    int fun2(int y) {
      return y*y;
    }
SMT formula φ (satisfiable iff the programs are non-equivalent):
(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = x1*x1) ∧ (ret2 = y*y) ∧ (ret1 ≠ ret2)
What if we use SAT to check equivalence?
Equivalence Checking of Program Fragments
Using SAT to check the equivalence of fun1 and fun2 above (with MiniSat):
• 32 bits for y: did not finish in over 5 hours
• 16 bits for y: 37 sec.
• 8 bits for y: 0.5 sec.
Equivalence Checking of Program Fragments
Replacing the multiplication in fun1 and fun2 above with an uninterpreted function sq gives SMT formula φ′:
(z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = sq(x1)) ∧ (ret2 = sq(y)) ∧ (ret1 ≠ ret2)
Using an EUF solver: 0.01 sec
Equivalence Checking of Program Fragments
    int fun1(int y) {
      int x;
      x = x ^ y;
      y = x ^ y;
      x = x ^ y;
      return x*x;
    }
    int fun2(int y) {
      return y*y;
    }
• Does EUF still work? No! We must reason about bit-wise XOR.
• We need a solver for bit-vector arithmetic.
• Solvable in less than a second with a current bit-vector solver.
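For very small word sizes the XOR-swap equivalence can be checked by exhaustive enumeration, which shows what a bit-vector solver has to establish symbolically. This sketch assumes 8-bit words with wrap-around multiplication (an assumption made for the illustration; the C programs use int) and treats the uninitialized x in fun1 as an arbitrary input.

```python
WIDTH = 8                    # assumed word size for this illustration
MASK = (1 << WIDTH) - 1


def fun1(x, y):
    """XOR-swap version; x models the uninitialized local as a free input."""
    x = x ^ y
    y = x ^ y                # y now holds the original x
    x = x ^ y                # x now holds the original y
    return (x * x) & MASK    # multiplication modulo 2**WIDTH


def fun2(y):
    return (y * y) & MASK


def find_counterexample():
    """Return an (x, y) pair on which the programs differ, or None."""
    for x in range(1 << WIDTH):
        for y in range(1 << WIDTH):
            if fun1(x, y) != fun2(y):
                return (x, y)
    return None


counterexample = find_counterexample()  # None means equivalent at this width
```

Enumeration needs 2^(2·WIDTH) evaluations, so it stops being practical well before 32 bits; that gap is what word-level reasoning is meant to close.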
Finite-Precision Bit-Vector Arithmetic (QF_BV)
• Fixed-width data words
  • Can model int, short, long, etc.
• Arithmetic operations
  • E.g., add/subtract/multiply/divide and comparisons
  • Two's complement and unsigned operations
• Bit-wise logical operations
  • E.g., and/or/xor, shift/extract, and equality
• Boolean connectives
Linear Arithmetic (QF_LRA, QF_LIA)
• Boolean combination of linear constraints of the form (a1 x1 + a2 x2 + … + an xn ⋈ b)
• The xi are in Q or Z, and ⋈ ∈ {≥, >, ≤, <, =}
• Many applications, including:
  • Verification of analog circuits
  • Software verification, e.g., of array bounds
Difference Logic (QF_IDL, QF_RDL)
• Boolean combination of linear constraints of the form xi - xj ⋈ cij or xi ⋈ ci
• ⋈ ∈ {≥, >, ≤, <, =}, and the xi are in Q or Z
• Applications:
  • Software verification (most linear constraints are of this form)
  • Processor datapath verification
  • Job shop scheduling / real-time systems
  • Timing verification for circuits
Arrays/Memories
• SMT solvers can also be very effective in modeling data structures in software and hardware
  • Arrays in programs
  • Memories in hardware designs: e.g. instruction and data memories, CAMs, etc.
Theory of Arrays (QF_AX): Select and Store
• Two interpreted functions: select and store
  • select(A,i): read from A at index i
  • store(A,i,d): write d to A at index i
• Two main axioms:
  • select(store(A,i,d), i) = d
  • select(store(A,i,d), j) = select(A,j) for i ≠ j
• One other axiom:
  • (∀ i. select(A,i) = select(B,i)) ⇒ A = B
Equivalence Checking of Program Fragments
    int fun1(int y) {
      int x[2];
      x[0] = y;
      y = x[1];
      x[1] = x[0];
      return x[1]*x[1];
    }
    int fun2(int y) {
      return y*y;
    }
SMT formula φ″:
[x1 = store(x,0,y) ∧ y1 = select(x1,1) ∧ x2 = store(x1,1,select(x1,0)) ∧ ret1 = sq(select(x2,1))] ∧ (ret2 = sq(y)) ∧ (ret1 ≠ ret2)
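The select/store reading of this program can be exercised on concrete data. The sketch below models arrays as Python dictionaries with a functional (copying) store, checks the two read-over-write axioms on a small finite domain, and then evaluates the chain of store and select terms from the formula. Two things are assumptions of this illustration: sq is interpreted as ordinary squaring (in the formula it is uninterpreted), and the helper names `check_read_over_write` and `fun1_via_arrays` are hypothetical.

```python
from itertools import product


def store(array, index, value):
    """Return a new array equal to `array` except that `index` holds `value`."""
    updated = dict(array)
    updated[index] = value
    return updated


def select(array, index):
    """Read the element at `index`."""
    return array[index]


def check_read_over_write(indices, values):
    """Check both read-over-write axioms on every array over a finite domain."""
    for initial in product(values, repeat=len(indices)):
        array = dict(zip(indices, initial))
        for i, j, d in product(indices, indices, values):
            written = store(array, i, d)
            if i == j and select(written, j) != d:
                return False
            if i != j and select(written, j) != select(array, j):
                return False
    return True


def fun1_via_arrays(x, y):
    """Evaluate the store/select terms of the formula on concrete values."""
    x1 = store(x, 0, y)
    y1 = select(x1, 1)               # assigned but never used by the result
    x2 = store(x1, 1, select(x1, 0))
    return select(x2, 1) ** 2        # sq interpreted as squaring (assumption)


axioms_hold = check_read_over_write([0, 1], [0, 1, 2])
same_result = all(
    fun1_via_arrays({0: a, 1: b}, y) == y * y
    for a in range(3) for b in range(3) for y in range(-3, 4)
)
```

A solver reaches the same conclusion symbolically: the axioms reduce select(x2,1) to select(x1,0) and then to y, so ret1 = sq(y) = ret2 and the formula is unsatisfiable.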
Over to Clark…
Eager Approach to SMT
[Figure: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]
• Eager encoding: the SAT solver is involved in theory reasoning
• Key ideas:
  • Small-domain encoding: constrain the model search
  • Rewrite rules
  • Abstraction-based methods (eager + lazy)
• Example solvers: UCLID, STP, Spear, Boolector, Beaver, …
Theories
• Eager encoding methods have been demonstrated for the following theories:
  • Equality & Uninterpreted Functions
  • Integer Linear Arithmetic
  • Restricted Lambda expressions
    • Arrays, memories, etc.
  • Finite-precision Bit-Vector Arithmetic
  • Strings
UCLID Operation
[Figure: Input Formula → Lambda Expansion for Arrays → λ-free Formula → Function & Predicate Elimination → Linear/Bitvector Arithmetic Formula → Encoding Arithmetic → Boolean Formula → Boolean Satisfiability]
• Operation
  • Series of transformations leading to a Boolean formula
  • Each step is validity (satisfiability) preserving
  • Each step performs optimizations
http://uclid.eecs.berkeley.edu
Rewrites: Eliminating Function Applications
• Two applications of an uninterpreted function f in a formula: f(x1) and f(x2)
• Ackermann's encoding
  • f(x1) → vf1
  • f(x2) → vf2
  • Add the constraint x1 = x2 ⇒ vf1 = vf2
• Bryant, German, Velev's encoding
  • f(x1) → vf1
  • f(x2) → ITE(x1 = x2, vf1, vf2)
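Both rewrites can be sketched as string-level transformations for a unary function with any number of applications. This is a rough illustration: the slide shows only two applications, so the nested-ITE generalization to more applications follows the usual presentation of the Bryant, German, Velev scheme rather than anything on the slide, and the function names and output format are made up for this example.

```python
from itertools import combinations


def ackermann(function_name, arguments):
    """Ackermann's encoding for applications f(arguments[0]), f(arguments[1]), ...

    Returns (fresh_variables, constraints): application k is replaced by
    fresh_variables[k], and one functional-consistency constraint is added
    per pair of applications.
    """
    fresh = [f"v{function_name}{k}" for k in range(1, len(arguments) + 1)]
    constraints = [
        f"({arguments[i]} = {arguments[j]}) => ({fresh[i]} = {fresh[j]})"
        for i, j in combinations(range(len(arguments)), 2)
    ]
    return fresh, constraints


def bryant_german_velev(function_name, arguments):
    """Nested-ITE encoding: no side constraints, consistency is built in."""
    replacements = []
    for k, arg in enumerate(arguments, start=1):
        term = f"v{function_name}{k}"
        # Wrap so that the comparison with the earliest argument is outermost.
        for j in range(k - 1, 0, -1):
            term = (f"ITE({arguments[j - 1]} = {arg}, "
                    f"v{function_name}{j}, {term})")
        replacements.append(term)
    return replacements


fresh_vars, side_constraints = ackermann("f", ["x1", "x2"])
ite_terms = bryant_german_velev("f", ["x1", "x2"])
```

Ackermann's encoding adds a quadratic number of constraints next to the formula, while the ITE form pushes the same case split into the terms themselves.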
Small-Domain Encoding
• Consider an SMT formula φ(x1, x2, …, xn) where xi ∈ Di
• Small-domain encoding / finite instantiation: derive a finite set Si ⊂ Di s.t. |Si| ≪ |Di|
  • In some cases, Si is finite where Di is infinite
• Encode each xi to take values only in Si
  • Could be done by encoding to SAT
• Example: Integer Linear Arithmetic (QF_LIA)
Solving QF_LIA is NP-complete
• In NP:
  • If a satisfying solution exists, then one exists within a bound d
  • log d is polynomial in the input size
• Expression for d [Papadimitriou, '82]: (n + m) · (bmax + 1) · (m · amax)^(2m+3)
• Input size:
  • m: number of constraints
  • n: number of variables
  • bmax: largest constant (absolute value)
  • amax: largest coefficient (absolute value)
Small-domain encoding / Finite Instantiation: Naïve approach
• Steps
  • Calculate the solution bound d
  • Encode each integer variable with ⌈log d⌉ bits and translate to a Boolean formula
  • Run the SAT solver
• Problem: for QF_LIA, d is Ω(m^m)
  • Ω(m log m) bits per variable
• Solution: exploit special cases and domain-specific structure
Special Case 1: Equality Logic
• Linear constraints are equalities xi = xj
• Result: d = n [Pnueli et al., Information and Computation, 2002]
• x1 = x2 ∧ x2 ≠ x3 ∧ x1 ≠ x3: can find a solution with domain {1, 2}
• x1 ≠ x2 ∧ x2 ≠ x3 ∧ x1 ≠ x3: a 3-valued domain is needed: {1, 2, 3}
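The two examples on this slide can be reproduced by searching increasingly large domains, stopping at the bound d = n. The sketch below handles only conjunctions of equalities and disequalities, and the function name and the 0-based index-pair input format are assumptions of this illustration.

```python
from itertools import product


def min_domain_size(num_vars, equalities, disequalities):
    """Smallest k such that the conjunction has a solution over {1, ..., k}.

    Variables are numbered 0 .. num_vars-1; each constraint is an index pair.
    Only domains up to size num_vars are tried, following the bound d = n;
    None is returned if none of them admits a solution.
    """
    for k in range(1, num_vars + 1):
        for values in product(range(1, k + 1), repeat=num_vars):
            if (all(values[i] == values[j] for i, j in equalities)
                    and all(values[i] != values[j] for i, j in disequalities)):
                return k
    return None


# x1 = x2 and x2 != x3 and x1 != x3
two_values = min_domain_size(3, [(0, 1)], [(1, 2), (0, 2)])
# x1 != x2 and x2 != x3 and x1 != x3
three_values = min_domain_size(3, [], [(0, 1), (1, 2), (0, 2)])
```

Once the domain is bounded by n, each variable needs only about log n bits in a Boolean encoding.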
Special Case 2: Difference Logic
• Boolean combination of difference-bound constraints
  • xi ≥ xj + b, ±xi ≥ b
• Result: d = n · (bmax + 1) [Bryant, Lahiri, Seshia, CAV'02]
• Proof sketch: a satisfying solution corresponds to a shortest path in the constraint graph
  • The longest such path has length ≤ n · (bmax + 1)
• Tighter formula-specific bounds are possible
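The shortest-path view in the proof sketch can be illustrated for a plain conjunction of constraints xi ≥ xj + b. The sketch below runs Bellman-Ford from an implicit source connected to every variable with weight 0; it is a standard textbook procedure used here for illustration, not the encoding from the cited paper, and it does not handle Boolean combinations or the single-variable form ±xi ≥ b.

```python
def solve_difference_constraints(num_vars, constraints):
    """Solve a conjunction of constraints x_i >= x_j + b over the integers.

    `constraints` is a list of (i, j, b) triples. Each one is read as
    x_j - x_i <= -b, i.e. an edge i -> j of weight -b in the constraint graph.
    Returns a list of values (shortest-path distances) or None if the
    constraints are contradictory (negative cycle).
    """
    # Distance 0 everywhere models a virtual source with 0-weight edges.
    dist = [0] * num_vars
    for _ in range(num_vars + 1):
        changed = False
        for i, j, b in constraints:
            if dist[i] - b < dist[j]:
                dist[j] = dist[i] - b
                changed = True
        if not changed:
            return dist  # fixpoint reached: every constraint is satisfied
    return None  # still relaxing after num_vars + 1 passes: negative cycle


# x0 >= x1 + 1 and x1 >= x2 + 2
solution = solve_difference_constraints(3, [(0, 1, 1), (1, 2, 2)])
# x0 >= x1 + 1 and x1 >= x0 + 1 is contradictory
no_solution = solve_difference_constraints(2, [(0, 1, 1), (1, 0, 1)])
```

Each computed value is the weight of a path with at most n edges, so its magnitude is at most n · bmax, which fits within the bound d = n · (bmax + 1).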
Special Case 3: Generalized 2SAT
• Generalized 2SAT constraints
  • xi + xj ≥ b, -xi - xj ≥ b, xi - xj ≥ b, xi ≥ b
• d = 2 · n · (bmax + 1) [Seshia, Subramani, Bryant, '04]
Full Integer Linear Arithmetic
• Can we avoid the m^m blow-up?
• In fact, yes. The idea is to derive a new parameterized solution bound d
  • Formalize the parameters that the bound really depends on
  • The parameters characterize sparse structure
  • Occurs especially in software verification; also in many high-level hardware models
• [Seshia & Bryant, LICS'04, LMCS'05]
Structure of Linear Constraints in Software Verification
• Characteristics of studied benchmarks
  • Mostly difference constraints
    • Only 3% of constraints were NOT difference constraints
  • Non-difference constraints are sparse
    • At most 6 variables per constraint (total number of variables in the 1000s)
• Some similar observations: Pratt'77, ESC/Java-Simplify-TR'03
Parameterized Solution Bound
• New parameters:
  • k non-difference constraints
  • w variables per constraint (width)
• Our solution bound: n · (bmax + 1) · (w · amax)^k
• Previous: (n + m) · (bmax + 1) · (m · amax)^(2m+3)
• Direct dependence on m eliminated (and k ≪ m)
Example
A Boolean combination (∧, ∨, ¬) of the constraints
• x1 - x2 ≥ 1
• x1 + 2x2 + x3 > -3
• x2 - x4 ≥ 0
Parameterized bound: d = 96
Previous bound: d = 282,175,488
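The two d values in this example can be recomputed from the bound formulas, (n + m) · (bmax + 1) · (m · amax)^(2m+3) for the previous bound and n · (bmax + 1) · (w · amax)^k for the parameterized one. The parameter values below are a reading of the three constraints made for this sketch (n = 4 variables, m = 3 constraints, bmax = 3, amax = 2, k = 1 non-difference constraint of width w = 3); the slide does not list them explicitly, but they reproduce both numbers.

```python
def previous_bound(n, m, b_max, a_max):
    """(n + m) * (b_max + 1) * (m * a_max) ** (2m + 3)"""
    return (n + m) * (b_max + 1) * (m * a_max) ** (2 * m + 3)


def parameterized_bound(n, b_max, a_max, k, w):
    """n * (b_max + 1) * (w * a_max) ** k"""
    return n * (b_max + 1) * (w * a_max) ** k


def bits_per_variable(d):
    """ceil(log2(d)) for d >= 2, computed with exact integer arithmetic."""
    return (d - 1).bit_length()


# Parameters inferred from the example constraints (assumption, see lead-in).
n, m, b_max, a_max, k, w = 4, 3, 3, 2, 1, 3

d_previous = previous_bound(n, m, b_max, a_max)    # 282175488
d_new = parameterized_bound(n, b_max, a_max, k, w)  # 96
bits_previous = bits_per_variable(d_previous)       # 29
bits_new = bits_per_variable(d_new)                 # 7
```

In a small-domain encoding this is the difference between 29 and 7 Boolean variables per integer variable.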
Summary of d Values
[Table not preserved in this transcript]
Abstraction-Based Methods
• For some logics, one cannot easily compute a closed-form expression for the small domain
  • Example: Bit-Vector Arithmetic
• In such cases, an abstraction-refinement approach can be used to compute formula-specific small domains
Bit-Vector Arithmetic: Some History
• B.C. (Before Chaff)
  • String operations (concatenate, field extraction)
  • Linear arithmetic with bounds checking
  • Modular arithmetic
• SAT-based "bit blasting"
  • Generate a Boolean circuit based on the bit-level behavior of operations
  • Handles arbitrary operations
  • Check with the best available SAT solver
  • Effective in many applications
    • CBMC [Clarke, Kroening, Lerda, TACAS '04]
    • Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV '05]
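To make "bit-level behavior of operations" concrete, the sketch below expresses fixed-width addition as a ripple-carry circuit using only XOR, AND, and OR on individual bits. It evaluates the circuit on concrete inputs; an actual bit-blaster would instead emit the same gates as clauses over Boolean variables for a SAT solver. The helper names and the LSB-first bit order are choices made for this illustration.

```python
def to_bits(value, width):
    """Little-endian (LSB first) list of bools for a non-negative integer."""
    return [(value >> i) & 1 == 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(1 << i for i, bit in enumerate(bits) if bit)


def blast_add(a_bits, b_bits):
    """Ripple-carry adder built from XOR/AND/OR gates.

    The final carry is dropped, so the result is the sum modulo 2**width,
    matching fixed-width bit-vector addition.
    """
    carry = False
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                 # sum bit of a full adder
        carry = (a and b) or (carry and (a ^ b))  # carry-out of a full adder
    return out


WIDTH = 4
adder_matches = all(
    from_bits(blast_add(to_bits(a, WIDTH), to_bits(b, WIDTH)))
    == (a + b) % (1 << WIDTH)
    for a in range(1 << WIDTH)
    for b in range(1 << WIDTH)
)
```

An adder needs a number of gates linear in the width, whereas a multiplier blasted the same way grows quadratically, which is one reason word-level reasoning can pay off.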
Research Challenge
• Is there a better way than bit blasting?
• Requirements
  • Provide the same functionality as bit blasting
    • Must support all bit-vector operators
  • Exploit word-level structure
  • Improve on the performance of bit blasting
• Current approaches are based on two core ideas:
  • Simplification: simplify the input formula using word-level rewrite rules and solvers
  • Abstraction: use automatic abstraction-refinement to solve the simplified formula