Introduction to Satisfiability Modulo Theories (SMT)

Clark Barrett, NYU
Sanjit A. Seshia, UC Berkeley
ICCAD Tutorial, November 2, 2009

Boolean Satisfiability (SAT)
• φ is a Boolean formula: a combination of ∨, ∧ and ¬ operators over the propositional variables p1, p2, …, pn
• Is there an assignment to the variables p1, p2, …, pn such that φ evaluates to 1?
ICCAD 2009 Tutorial
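The question on this slide can be made concrete with a tiny exhaustive search. The sketch below is only an illustration: the formula `phi` is a made-up example (not from the tutorial), and real SAT solvers do not enumerate all 2^n assignments.

```python
from itertools import product

def brute_force_sat(phi, n):
    """Return a satisfying assignment of phi over n Boolean variables, or None."""
    for assignment in product([False, True], repeat=n):
        if phi(*assignment):
            return assignment
    return None

# Hypothetical example: (p1 or p2) and (not p1 or p3) and (not p2 or not p3)
phi = lambda p1, p2, p3: (p1 or p2) and (not p1 or p3) and (not p2 or not p3)
model = brute_force_sat(phi, 3)

# A formula with no satisfying assignment: p and not p
no_model = brute_force_sat(lambda p: p and not p, 1)
```

The search is exponential in n, which is why the rest of the tutorial cares about how many Boolean variables an encoding introduces.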

Satisfiability Modulo Theories
• [Figure: the same Boolean formula φ, with the propositional inputs p1, p2, …, pn replaced by theory atoms such as x = y, x + 2z ≥ 1, w & 0xFFFF = x, and x % 26 = v]
• Is there an assignment to the x, y, z, w variables such that φ evaluates to 1?

Satisfiability Modulo Theories
• Given a formula in first-order logic, with associated background theories, is the formula satisfiable?
  • Yes: return a satisfying solution
  • No [generate a proof of unsatisfiability]

Applications of SMT
• Hardware verification at higher levels of abstraction (RTL and above)
• Verification of analog/mixed-signal circuits
• Verification of hybrid systems
• Software model checking
• Software testing
• Security: Finding vulnerabilities, verifying electronic voting machines, …
• Program synthesis
• …

References
• Satisfiability Modulo Theories. Clark Barrett, Roberto Sebastiani, Sanjit A. Seshia, and Cesare Tinelli. Chapter 8 in the Handbook of Satisfiability, Armin Biere, Hans van Maaren, and Toby Walsh, editors, IOS Press, 2009. (available from our webpages)
• SMTLIB: A repository for SMT formulas (common format) and tools
• SMTCOMP: An annual competition of SMT solvers

Roadmap for this Tutorial
• Background and Notation
• Survey of Theories
• Theory Solvers
• Approaches to SMT Solving
  • Lazy Encoding to SAT
  • Eager Encoding to SAT
• Conclusion

First-Order Logic
• A formal notation for mathematics, with expressions involving
  • Propositional symbols
  • Predicates
  • Functions and constant symbols
  • Quantifiers
• In contrast, propositional (Boolean) logic only involves propositional symbols and operators

First-Order Logic: Syntax
• As with propositional logic, expressions in first-order logic are made up of sequences of symbols.
• Symbols are divided into logical symbols and non-logical symbols or parameters.
• Example: (x = y) ∧ (y = z) ∧ (f(z) ≥ f(x) + 1)

First-Order Logic: Syntax
• Logical Symbols
  • Propositional connectives: ∨, ∧, ¬, →, ↔
  • Variables: v1, v2, …
  • Quantifiers: ∀, ∃
• Non-logical symbols/Parameters
  • Equality: =
  • Functions: +, -, %, bit-wise &, f(), concat, …
  • Predicates: ≤, is_substring, …
  • Constant symbols: 0, 1.0, null, …

Quantifier-free Subset
• We will largely restrict ourselves to formulas without quantifiers (∀, ∃)
• This is called the quantifier-free subset/fragment of first-order logic with the relevant theory

Logical Theory
• Defines a set of parameters (non-logical symbols) and their meanings
• This definition is called a signature.
• Example of a signature: Theory of linear arithmetic over integers
  • Signature is (0, 1, +, -, ≤) interpreted over Z

Roadmap for this Tutorial
• Background and Notation
• Survey of Theories
• Theory Solvers
• Two Approaches to SMT Solving
  • Lazy Encoding to SAT
  • Eager Encoding to SAT
• Conclusion

Some Useful Theories
• Equality (with uninterpreted functions)
• Linear arithmetic (over Q or Z)
• Difference logic (over Q or Z)
• Finite-precision bit-vectors
  • integer or floating-point
• Arrays / memories
• Misc.: Non-linear arithmetic, strings, inductive datatypes (e.g. lists), sets, …

Theory of Equality and Uninterpreted Functions (EUF)
• Also called the “free theory”
  • Because function symbols can take any meaning
  • Only property required is congruence: these symbols map identical arguments to identical values, i.e., x = y ⇒ f(x) = f(y)
• SMTLIB name: QF_UF

Data and Function Abstraction with EUF
• Bit-vectors to abstract domain (e.g. Z): the individual bits x0, x1, x2, …, xn-1 are replaced by a single term x
• Functional units to uninterpreted functions: a block such as an ALU is replaced by a function symbol f, with a = x ∧ b = y ⇒ f(a,b) = f(x,y)
• Common operations
  • If-then-else: ITE(p, x, y) selects x when p = 1 and y when p = 0
  • Test for equality: x = y

Hardware Abstraction with EUF
• For any Block that Transforms or Evaluates Data:
  • Replace with generic, unspecified function
  • Also view instruction memory as function
• [Figure: pipelined processor datapath (pipeline registers IF/ID, ID/EX, EX/WB; PC, instruction memory, register file, ALU, control logic), with data-transforming blocks replaced by functions F1, F2, F3]

Example QF_UF (EUF) Formula
• (x = y) ∧ (y = z) ∧ (f(x) ≠ f(z))
• Transitivity: (x = y) ∧ (y = z) ⇒ (x = z)
• Congruence: (x = z) ⇒ (f(x) = f(z))
• The derived f(x) = f(z) contradicts f(x) ≠ f(z), so the formula is unsatisfiable
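The transitivity and congruence steps above can be mechanized with a small union-find. The following is a minimal sketch for conjunctions of equalities and disequalities only, not the algorithm of any particular solver; the term representation (strings for variables, tuples such as `('f', 'x')` for applications) and the function name `euf_satisfiable` are assumptions made for illustration.

```python
def euf_satisfiable(terms, equalities, disequalities):
    """Decide a conjunction of EUF literals by naive congruence closure.

    terms must list every term and subterm that occurs in the literals.
    """
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t

    def union(a, b):
        parent[find(a)] = find(b)

    # Merging classes for each asserted equality gives transitivity for free.
    for a, b in equalities:
        union(a, b)

    # Congruence: same function symbol and pairwise-equal arguments => equal results.
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:
        changed = False
        for s in apps:
            for t in apps:
                if (find(s) != find(t) and s[0] == t[0] and len(s) == len(t)
                        and all(find(a) == find(b) for a, b in zip(s[1:], t[1:]))):
                    union(s, t)
                    changed = True

    # Satisfiable iff no disequality relates two terms in the same class.
    return all(find(a) != find(b) for a, b in disequalities)

terms = ['x', 'y', 'z', ('f', 'x'), ('f', 'z')]
slide_formula = euf_satisfiable(terms, [('x', 'y'), ('y', 'z')], [(('f', 'x'), ('f', 'z'))])
weaker_formula = euf_satisfiable(terms, [('y', 'z')], [(('f', 'x'), ('f', 'z'))])
```

Here `slide_formula` is the formula from the slide, and `weaker_formula` drops x = y, so nothing forces f(x) and f(z) together.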

Equivalence Checking of Program Fragments
• Program 1: int fun1(int y) { int x, z; z = y; y = x; x = z; return x*x; }
• Program 2: int fun2(int y) { return y*y; }
• SMT formula φ, satisfiable iff the programs are non-equivalent:
  (z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = x1*x1) ∧ (ret2 = y*y) ∧ (ret1 ≠ ret2)
• What if we use SAT to check equivalence?

Equivalence Checking of Program Fragments
• Program 1: int fun1(int y) { int x, z; z = y; y = x; x = z; return x*x; }
• Program 2: int fun2(int y) { return y*y; }
• SMT formula φ, satisfiable iff the programs are non-equivalent:
  (z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = x1*x1) ∧ (ret2 = y*y) ∧ (ret1 ≠ ret2)
• Using SAT to check equivalence (w/ Minisat)
  • 32 bits for y: Did not finish in over 5 hours
  • 16 bits for y: 37 sec.
  • 8 bits for y: 0.5 sec.

Equivalence Checking of Program Fragments
• Program 1: int fun1(int y) { int x, z; z = y; y = x; x = z; return x*x; }
• Program 2: int fun2(int y) { return y*y; }
• SMT formula φ′, with multiplication abstracted by the uninterpreted function sq:
  (z = y ∧ y1 = x ∧ x1 = z ∧ ret1 = sq(x1)) ∧ (ret2 = sq(y)) ∧ (ret1 ≠ ret2)
• Using EUF solver: 0.01 sec

Equivalence Checking of Program Fragments
• Program 1: int fun1(int y) { int x; x = x ^ y; y = x ^ y; x = x ^ y; return x*x; }
• Program 2: int fun2(int y) { return y*y; }
• Does EUF still work? No!
  • Must reason about bit-wise XOR.
  • Need a solver for bit-vector arithmetic.
  • Solvable in less than a sec. with a current bit-vector solver.
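To see why bit-level reasoning about XOR settles this example, the two fragments can be compared exhaustively at a small word width. This is only a sanity-check sketch, not what a bit-vector solver does internally; the 8-bit width (`MASK`) and the explicit `x` parameter standing for the uninitialized local are assumptions of the illustration.

```python
MASK = 0xFF  # assume 8-bit words so that exhaustive enumeration stays cheap

def fun1(x, y):
    """XOR-swap version; x models the arbitrary initial value of the local."""
    x = (x ^ y) & MASK
    y = (x ^ y) & MASK
    x = (x ^ y) & MASK
    return (x * x) & MASK

def fun2(y):
    return (y * y) & MASK

# Search all 2^16 input pairs for a case where the results differ.
counterexample = next(
    ((x, y) for x in range(256) for y in range(256) if fun1(x, y) != fun2(y)),
    None,
)
```

No counterexample exists because the three XOR steps swap x and y, so fun1 squares the original y. Enumeration like this does not scale to 32-bit operands, which is where a bit-vector solver is needed.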

Finite-Precision Bit-Vector Arithmetic (QF_BV)
• Fixed width data words
  • Can model int, short, long, etc.
• Arithmetic operations
  • E.g., add/subtract/multiply/divide & comparisons
  • Two’s complement and unsigned operations
• Bit-wise logical operations
  • E.g., and/or/xor, shift/extract and equality
• Boolean connectives

Linear Arithmetic (QF_LRA, QF_LIA)
• Boolean combination of linear constraints of the form (a1 x1 + a2 x2 + … + an xn ∼ b)
• xi’s could be in Q or Z, ∼ ∈ {≥, >, ≤, <, =}
• Many applications, including:
  • Verification of analog circuits
  • Software verification, e.g., of array bounds

Difference Logic (QF_IDL, QF_RDL)
• Boolean combination of linear constraints of the form xi - xj ∼ cij or xi ∼ ci
• ∼ ∈ {≥, >, ≤, <, =}, xi’s in Q or Z
• Applications:
  • Software verification (most linear constraints are of this form)
  • Processor datapath verification
  • Job shop scheduling / real-time systems
  • Timing verification for circuits

Arrays/Memories
• SMT solvers can also be very effective in modeling data structures in software and hardware
  • Arrays in programs
  • Memories in hardware designs: e.g. instruction and data memories, CAMs, etc.

Theory of Arrays (QF_AX): Select and Store
• Two interpreted functions: select and store
  • select(A,i): Read from A at index i
  • store(A,i,d): Write d to A at index i
• Two main axioms:
  • select(store(A,i,d), i) = d
  • select(store(A,i,d), j) = select(A,j) for i ≠ j
• One other axiom (extensionality):
  • (∀ i. select(A,i) = select(B,i)) ⇒ A = B

Equivalence Checking of Program Fragments
• Program 1: int fun1(int y) { int x[2]; x[0] = y; y = x[1]; x[1] = x[0]; return x[1]*x[1]; }
• Program 2: int fun2(int y) { return y*y; }
• SMT formula φ″:
  [ x1 = store(x,0,y) ∧ y1 = select(x1,1) ∧ x2 = store(x1,1,select(x1,0)) ∧ ret1 = sq(select(x2,1)) ] ∧ ( ret2 = sq(y) ) ∧ ( ret1 ≠ ret2 )
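The select/store view of this program can be tried out on a concrete model of arrays. The sketch below is illustrative only: arrays are modeled as Python dicts, unwritten cells read as 0, and `sq` is given a concrete meaning (squaring), whereas in the formula it is an uninterpreted function. None of these choices come from the tutorial.

```python
def store(A, i, d):
    """Return a copy of array A with index i overwritten by d."""
    B = dict(A)
    B[i] = d
    return B

def select(A, i):
    """Read A at index i; unwritten cells read as 0 (an assumption of this sketch)."""
    return A.get(i, 0)

def sq(v):
    return v * v

def fun1_ret(x, y):
    """Follow the conjuncts of the array formula step by step."""
    x1 = store(x, 0, y)
    y1 = select(x1, 1)                     # mirrors y = x[1]; not used afterwards
    x2 = store(x1, 1, select(x1, 0))
    return sq(select(x2, 1))

A = {0: 7, 1: 9}
axiom1 = select(store(A, 0, 5), 0) == 5                  # read-over-write, same index
axiom2 = select(store(A, 0, 5), 1) == select(A, 1)       # read-over-write, other index
equivalent = all(
    fun1_ret({0: a, 1: b}, y) == sq(y)
    for a in range(3) for b in range(3) for y in range(-3, 4)
)
```

By the two main axioms, select(x2,1) reduces to select(x1,0) and then to y, so ret1 and ret2 agree for any interpretation of sq; the loop above only samples a few concrete cases.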

Over to Clark…

Eager Approach to SMT
• Flow (eager encoding): Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable
• SAT solver is involved in theory reasoning
• Key ideas:
  • Small-domain encoding
  • Constrain model search
  • Rewrite rules
  • Abstraction-based methods (eager + lazy)
• Example solvers: UCLID, STP, Spear, Boolector, Beaver, …

Theories
• Eager encoding methods have been demonstrated for the following theories:
  • Equality & Uninterpreted Functions
  • Integer Linear Arithmetic
  • Restricted Lambda expressions
    • Arrays, memories, etc.
  • Finite-precision Bit-Vector Arithmetic
  • Strings

UCLID Operation
• Operation
  • Series of transformations leading to Boolean formula
  • Each step is validity (satisfiability) preserving
  • Each step performs optimizations
• Flow: Input Formula → Lambda Expansion for Arrays → λ-free Formula → Function & Predicate Elimination → Linear/Bitvector Arithmetic Formula → Encoding Arithmetic → Boolean Formula → Boolean Satisfiability
• http://uclid.eecs.berkeley.edu

Rewrites: Eliminating Function Applications
• Two applications of an uninterpreted function f in a formula: f(x1) and f(x2)
• Ackermann’s encoding
  • Replace f(x1) by vf1 and f(x2) by vf2
  • Add the constraint x1 = x2 ⇒ vf1 = vf2
• Bryant, German, Velev’s encoding
  • Replace f(x1) by vf1
  • Replace f(x2) by ITE(x1 = x2, vf1, vf2)
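Both rewrites can be checked on a toy formula, x1 = x2 ∧ f(x1) ≠ f(x2), which is unsatisfiable in EUF. The sketch below is an assumption-laden illustration: the test formula is made up, and satisfiability is decided by enumerating a two-valued domain, which is enough for this particular formula but is not a general procedure.

```python
from itertools import product

DOMAIN = range(2)  # two values suffice for this toy formula (an assumption)

def satisfiable(formula):
    """Enumerate all values of (x1, x2, vf1, vf2) over DOMAIN."""
    return any(formula(*vals) for vals in product(DOMAIN, repeat=4))

def ite(cond, a, b):
    return a if cond else b

# Replacing f(x1), f(x2) by fresh variables without any side constraint is unsound.
naive = lambda x1, x2, vf1, vf2: x1 == x2 and vf1 != vf2

# Ackermann: add the functional-consistency constraint x1 = x2 => vf1 = vf2.
ackermann = lambda x1, x2, vf1, vf2: (
    (x1 == x2 and vf1 != vf2) and (x1 != x2 or vf1 == vf2)
)

# Bryant, German, Velev: the second application becomes ITE(x1 = x2, vf1, vf2).
bgv = lambda x1, x2, vf1, vf2: x1 == x2 and vf1 != ite(x1 == x2, vf1, vf2)

naive_sat = satisfiable(naive)
ackermann_sat = satisfiable(ackermann)
bgv_sat = satisfiable(bgv)
```

The naive replacement wrongly reports a model, while both encodings preserve unsatisfiability.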

Small-Domain Encoding
• Consider an SMT formula φ(x1, x2, …, xn) where xi ∈ Di
• Small-domain encoding/Finite instantiation: Derive finite set Si ⊂ Di s.t. |Si| ≪ |Di|
  • In some cases, Si is finite where Di is infinite
• Encode each xi to take values only in Si
  • Could be done by encoding to SAT
• Example: Integer Linear Arithmetic (QF_LIA)

Solving QF_LIA is NP-complete
• In NP:
  • If a satisfying solution exists, then one exists within a bound d
  • log d is polynomial in input size
• Expression for d [Papadimitriou, ’82]: (n+m) · (bmax+1) · (m·amax)^(2m+3)
• Input size:
  • m: number of constraints
  • n: number of variables
  • bmax: largest constant (absolute value)
  • amax: largest coefficient (absolute value)

Small-domain encoding / Finite Instantiation: Naïve approach
• Steps
  • Calculate the solution bound d
  • Encode each integer variable with ⌈log d⌉ bits & translate to Boolean formula
  • Run SAT solver
• Problem: For QF_LIA, d is Ω(m^m)
  • Ω(m log m) bits per variable
• Solution: Exploit special cases and domain-specific structure

Special Case 1: Equality Logic
• Linear constraints are equalities xi = xj
• Result: d = n [Pnueli et al., Information and Computation, 2002]
• Example: x1 ≠ x2 ∧ x2 ≠ x3 ∧ x1 ≠ x3
  • 3-valued domain is needed: {1, 2, 3}
• Example: x1 = x2 ∧ x2 ≠ x3 ∧ x1 ≠ x3
  • Can find solution with domain {1, 2}
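The two examples on this slide can be replayed by searching a restricted domain directly. This is a brute-force sketch for illustration, not the range-allocation method of the cited paper; the constraint encoding `(i, j, is_equal)` with 0-based variable indices is an assumption of the sketch.

```python
from itertools import product

def solve_over_domain(n, constraints, domain):
    """Find values for x1..xn in `domain` satisfying all (i, j, is_equal) literals."""
    for vals in product(domain, repeat=n):
        if all((vals[i] == vals[j]) == is_equal for i, j, is_equal in constraints):
            return vals
    return None

# x1 != x2 and x2 != x3 and x1 != x3
all_distinct = [(0, 1, False), (1, 2, False), (0, 2, False)]
# x1 = x2 and x2 != x3 and x1 != x3
one_equal = [(0, 1, True), (1, 2, False), (0, 2, False)]

distinct_in_two = solve_over_domain(3, all_distinct, [1, 2])
distinct_in_three = solve_over_domain(3, all_distinct, [1, 2, 3])
equal_in_two = solve_over_domain(3, one_equal, [1, 2])
```

With n = 3 variables, the domain {1, 2, 3} of size d = n is always enough, and the second formula shows that a formula-specific domain can be smaller.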

Special Case 2: Difference Logic
• Boolean combination of difference-bound constraints
  • xi ≥ xj + b, ±xi ≥ b
• Result: d = n · (bmax + 1) [Bryant, Lahiri, Seshia, CAV’02]
• Proof sketch: satisfying solution corresponds to shortest path in constraint graph
  • Longest such path has length ≤ n · (bmax + 1)
• Tighter formula-specific bounds possible
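The shortest-path view in the proof sketch can be illustrated for the simplest case, a plain conjunction of constraints xi ≥ xj + b. The sketch below uses Bellman-Ford relaxation; it is an assumption-level illustration (function name, 0-based indices, and the restriction to conjunctions are all choices of this sketch), not the encoding from the cited paper.

```python
def solve_difference_constraints(n, constraints):
    """constraints: list of (i, j, b), each meaning x_i >= x_j + b.

    Returns a list of integer values for x_0..x_{n-1}, or None if unsatisfiable.
    """
    # Each constraint is an edge i -> j of weight -b. Starting every distance
    # at 0 acts like a virtual source with 0-weight edges to all nodes.
    dist = [0] * n
    for _ in range(n + 1):
        changed = False
        for i, j, b in constraints:
            if dist[i] - b < dist[j]:
                dist[j] = dist[i] - b
                changed = True
        if not changed:
            return dist  # shortest-path distances satisfy every constraint
    return None  # still relaxing after n + 1 passes: negative cycle

chain = [(0, 1, 2), (1, 2, 1)]        # x0 >= x1 + 2, x1 >= x2 + 1
cycle = chain + [(2, 0, 0)]           # additionally x2 >= x0, a contradiction

chain_model = solve_difference_constraints(3, chain)
cycle_model = solve_difference_constraints(3, cycle)
```

Every value in a returned model is a sum of at most n constants along a path, which is the intuition behind a bound of the form n · (bmax + 1).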

Special Case 3: Generalized 2SAT
• Generalized 2SAT constraints
  • xi + xj ≥ b, -xi - xj ≥ b, xi - xj ≥ b, xi ≥ b
• d = 2 · n · (bmax + 1) [Seshia, Subramani, Bryant, ’04]

Full Integer Linear Arithmetic
• Can we avoid the m^m blow-up?
• In fact, yes. The idea is to derive a new parameterized solution bound d
• Formalize parameters that the bound really depends on
• Parameters characterize sparse structure
  • Occurs especially in software verification; also in many high-level hardware models
• [Seshia & Bryant, LICS’04, LMCS’05]

Structure of Linear Constraints in Software Verification
• Characteristics of studied benchmarks
  • Mostly difference constraints
    • Only 3% of constraints were NOT difference constraints
  • Non-difference constraints are sparse
    • At most 6 variables per constraint (total number of variables in 1000s)
• Some similar observations: Pratt’77, ESC/Java-Simplify-TR’03

Parameterized Solution Bound
• New parameters:
  • k non-difference constraints
  • w variables per constraint (width)
• Our solution bound: n · (bmax+1) · (w·amax)^k
• Previous: (n+m) · (bmax+1) · (m·amax)^(2m+3)
• Direct dependence on m eliminated (and k ≪ m)

Example
• Formula: a Boolean combination (∧, ∨, ¬) of the atoms x1 - x2 ≥ 1, x1 + 2 x2 + x3 > -3, x2 - x4 ≥ 0
• Parameterized bound: d = 96
• Previous bound: d = 282,175,488
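The two d values in this example can be reproduced from the bound formulas stated earlier. The parameter values below (n = 4, m = 3, bmax = 3, amax = 2, k = 1, w = 3) are read off the three atoms and are an interpretation made for this sketch, not numbers given explicitly on the slide; the function names are likewise hypothetical.

```python
def previous_bound(n, m, b_max, a_max):
    """(n+m) * (bmax+1) * (m*amax)^(2m+3)"""
    return (n + m) * (b_max + 1) * (m * a_max) ** (2 * m + 3)

def parameterized_bound(n, b_max, a_max, k, w):
    """n * (bmax+1) * (w*amax)^k"""
    return n * (b_max + 1) * (w * a_max) ** k

def bits_needed(d):
    """Number of bits to represent d distinct values, i.e. ceil(log2(d)) for d >= 1."""
    return (d - 1).bit_length()

# 4 variables, 3 constraints, largest constant 3, largest coefficient 2;
# one non-difference constraint (x1 + 2 x2 + x3 > -3) with 3 variables.
d_prev = previous_bound(n=4, m=3, b_max=3, a_max=2)
d_new = parameterized_bound(n=4, b_max=3, a_max=2, k=1, w=3)

bits_prev = bits_needed(d_prev)
bits_new = bits_needed(d_new)
```

Under these assumed parameters the small-domain encoding needs 7 bits per variable with the parameterized bound, against 29 bits with the previous one.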

Summary of d Values

Abstraction-Based Methods
• For some logics, one cannot easily compute a closed-form expression for the small domain
  • Example: Bit-Vector Arithmetic
• In such cases, an abstraction-refinement approach can be used to compute formula-specific small domains

Bit-Vector Arithmetic: Some History
• B.C. (Before Chaff)
  • String operations (concatenate, field extraction)
  • Linear arithmetic with bounds checking
  • Modular arithmetic
• SAT-Based “Bit Blasting”
  • Generate Boolean circuit based on bit-level behavior of operations
  • Handles arbitrary operations
  • Check with best available SAT solver
  • Effective in many applications
    • CBMC [Clarke, Kroening, Lerda, TACAS ’04]
    • Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV ’05]

Research Challenge
• Is there a better way than bit blasting?
• Requirements
  • Provide same functionality as with bit blasting
  • Must support all bit-vector operators
  • Exploit word-level structure
  • Improve on performance of bit blasting
• Current approaches are based on two core ideas:
  • Simplification: Simplify input formula using word-level rewrite rules and solvers
  • Abstraction: Can use automatic abstraction-refinement to solve simplified formula
