Explore decision procedures customized for formal verification, including SAT-based methods and data abstraction applied to hardware systems. Learn about modeling memories, buffers, and term-level techniques for hardware verification.
Decision Procedures Customized for Formal Verification Randal E. Bryant Carnegie Mellon University http://www.cs.cmu.edu/~bryant Contributions by former graduate students: Sanjit Seshia, Shuvendu Lahiri
Outline • Context • Infinite state models of hardware systems • Verification techniques • Needs • Requirements for decision procedures • Dealing with quantifiers • Our Solution • SAT-based procedure • “Eager” Boolean encoding
Verification Example • Task • Verify that microprocessor correctly implements instruction set definition • Even though heavily pipelined [Figure: Alpha 21264 Microprocessor; Microprocessor Report, Oct. 28, 1996]
Existing Hardware Verification Methods • Simulators, equivalence checkers, model checkers, … • All Operate at Bit Level • View each register or memory bit as state variable • Behavior of each state variable defined by Boolean function • Strengths • Finite-state systems conceptually simple • BDDs & SAT procedures allow high degrees of automation • Limitations • State space can be very large • Only verify fixed instantiation of system • Specific memory sizes, number of processes, buffer lengths, …
Verification Challenges • Sources of Complexity • Lots of internal state • Complex control logic • Opportunities • Most of the logic serves to store, select, and communicate data [Figure: Alpha 21264 Microprocessor; Microprocessor Report, Oct. 28, 1996]
Applying Data Abstraction to Hardware Verification • Idea • Abstract details of data encodings and operations • Keep control logic precise • Applications • Verify overall correctness of system • Assuming individual functional units correct • Advantages of Abstraction • Abstract infinite-state system easier to verify than detailed finite-state one • Parametric representation allows verification of many different system variants • Arbitrary number of processes, buffer lengths, etc.
Word Abstraction • Data: Abstract details of form & functions • Control: Keep at bit level • Timing: Keep at cycle level [Figure: Control Logic; Data Path with Com. Log. 1 and Com. Log. 2]
Data Abstraction #1: Bits → Terms • View Data as Symbolic Words • Arbitrary integers • No assumptions about size or encoding • Classic model for reasoning about software • Can store in memories & registers [Figure: bits x0, x1, x2, …, xn-1 viewed as a single term x]
Abstracting Data Bits • What do we do about logic functions? [Figure: Control Logic; Data Path with Com. Log. 1 and Com. Log. 2, each marked “?”]
Abstraction #2: Uninterpreted Functions • For any Block that Transforms or Evaluates Data: • Replace with generic, unspecified function • Only assumed property is functional consistency: a = x ∧ b = y ⇒ f(a, b) = f(x, y) [Figure: ALU replaced by function f]
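A minimal Python sketch of functional consistency, added for illustration. The class name `UninterpretedFunction` and the fresh-symbol naming scheme are hypothetical, and this is a concrete stand-in rather than how a decision procedure treats function symbols: each new argument tuple gets an arbitrary fresh result, and equal arguments always get the same result.

```python
import itertools

class UninterpretedFunction:
    """Generic function symbol whose only guaranteed property is functional consistency."""

    def __init__(self, name):
        self.name = name
        self._table = {}                 # argument tuple -> symbolic result
        self._fresh = itertools.count()  # source of fresh result names

    def __call__(self, *args):
        # Equal arguments always map to the same (otherwise arbitrary) result.
        if args not in self._table:
            self._table[args] = f"{self.name}!{next(self._fresh)}"
        return self._table[args]

alu = UninterpretedFunction("alu")
a, b, x, y = 3, 7, 3, 7
same = alu(a, b) == alu(x, y)        # a = x and b = y, so the results must agree
different = alu(a, b) == alu(b, a)   # nothing forces these two results to agree
```

Here `same` is true because the arguments are pairwise equal, while `different` is false because the sketch hands out a distinct fresh symbol for the swapped arguments.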
Abstracting Functions • For Any Block that Transforms Data: • Replace by uninterpreted function • Ignore detailed functionality • Conservative approximation of actual system [Figure: Control Logic; Data Path with Com. Log. blocks replaced by functions F1 and F2]
Abstraction #3: Modeling Memories as Mutable Functions • Memory M Modeled as Function • M(a): Value at location a • Initially • Arbitrary state • Modeled by uninterpreted function m0 [Figure: memory M read at address a; initial contents given by m0]
Effect of Memory Write Operation • Writing Transforms Memory • M′ = Write(M, wa, wd) • Reading from updated memory: • Address wa will get wd • Otherwise get what’s already in M • Express with Lambda Notation • M′ = λ a . ITE(a = wa, wd, M(a)) [Figure: M′ selects wd when a = wa, otherwise M(a)]
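The write operation can be sketched with Python closures. This is an illustrative sketch only: the helper names `write` and `m0` mirror the slide's notation, and the tuple returned by `m0` is an assumed stand-in for an arbitrary initial value.

```python
def write(M, wa, wd):
    """Return the updated memory M' = lambda a . ITE(a = wa, wd, M(a))."""
    return lambda a: wd if a == wa else M(a)

def m0(a):
    # Stand-in for the uninterpreted initial state: an opaque value per address.
    return ("m0", a)

M1 = write(m0, 4, "d1")   # write d1 to address 4
M2 = write(M1, 8, "d2")   # then write d2 to address 8
M3 = write(M2, 4, "d3")   # overwrite address 4
```

Reading `M2(8)` yields the written data, reading `M2(5)` falls through to the initial state, and `M3(4)` sees only the most recent write.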
Systems with Buffers Circular Queue Unbounded Buffer • Modeling Method • Mutable function to describe buffer contents • Integers to represent head & tail pointers • Parameterize buffer capacity with symbolic value Max
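A rough sketch of the modeling method for the unbounded-buffer case, assuming integer head and tail counters that only grow. The names `Queue`, `push`, and `pop`, and the choice to ignore a push when the buffer is full, are illustrative assumptions; `max_size` plays the role of the symbolic value Max.

```python
from typing import Callable, NamedTuple

class Queue(NamedTuple):
    content: Callable[[int], object]  # mutable function: index -> stored value
    head: int                         # index of the oldest element
    tail: int                         # index of the next free slot

def push(q, value, max_size):
    """Store value at the tail unless the buffer already holds max_size elements."""
    if q.tail - q.head >= max_size:
        return q                      # assumed behavior: ignore the push when full
    old, t = q.content, q.tail
    return Queue(lambda i: value if i == t else old(i), q.head, q.tail + 1)

def pop(q):
    """Return the new queue and the removed value (None when empty)."""
    if q.head == q.tail:
        return q, None
    return Queue(q.content, q.head + 1, q.tail), q.content(q.head)

q = Queue(lambda i: None, 0, 0)
for item in ("a", "b", "c"):          # capacity 2, so "c" is dropped
    q = push(q, item, 2)
q, first = pop(q)
q, second = pop(q)
q, third = pop(q)
```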
Some History of Term-Level Modeling • Historically • Standard model used for program verification • Unbounded integer data types • Widely used with theorem-proving approaches to hardware verification • E.g., Hunt ’85 • Automated Approaches to Hardware Verification • Burch & Dill, ’95 • Tool for verifying pipelined microprocessors • Implemented by form of symbolic simulation • Continued application to pipelined processor verification
UCLID • Seshia, Lahiri, Bryant, CAV ‘02 • Term-Level Verification System • Language for describing systems • Inspired by CMU SMV • Symbolic simulator • Generates integer expressions describing system state after sequence of steps • Decision procedure • Determines validity of formulas • Support for multiple verification techniques • Available by Download http://www.cs.cmu.edu/~uclid
Required Logic • Scalar Data Types • Formulas (F): Boolean Expressions • Control signals • Terms (T): Integer Expressions • Data values • Functional Data Types • Functions (Fun): Integer → Integer • Immutable: Functional units • Mutable: Memories • Predicates (P): Integer → Boolean • Immutable: Data-dependent control • Mutable: Bit-level memories
CLU Logic • Counter Arithmetic, Lambda Expressions and Uninterpreted Functions • Terms (T): Integer Expressions • ITE(F, T1, T2): If-then-else • Fun(T1, …, Tk): Function application • succ(T): Increment • pred(T): Decrement • succ and pred are included to support pointer operations • Formulas (F): Boolean Expressions • ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives • T1 = T2: Equation • T1 < T2: Inequality • P(T1, …, Tk): Predicate application
CLU Logic (Cont.) • Functions (Fun): Integer → Integer • f: Uninterpreted function symbol • λ x1, …, xk . T: Function definition • Predicates (P): Integer → Boolean • p: Uninterpreted predicate symbol • λ x1, …, xk . F: Predicate definition
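To make the grammar concrete, here is a small evaluator sketch for CLU terms and formulas under one fixed interpretation. The tuple-based syntax tree and the operator tags (`"var"`, `"app"`, `"papp"`, and so on) are assumptions made for this example; lambda definitions are left out, with function and predicate symbols supplied directly as Python callables.

```python
def eval_term(t, env):
    """Evaluate a term; env maps variable names to ints and symbols to callables."""
    op = t[0]
    if op == "var":
        return env[t[1]]
    if op == "succ":
        return eval_term(t[1], env) + 1
    if op == "pred":
        return eval_term(t[1], env) - 1
    if op == "ite":
        branch = t[2] if eval_formula(t[1], env) else t[3]
        return eval_term(branch, env)
    if op == "app":   # function application Fun(T1, ..., Tk)
        return env[t[1]](*[eval_term(a, env) for a in t[2:]])
    raise ValueError(f"unknown term operator: {op}")

def eval_formula(f, env):
    """Evaluate a formula to a Python bool under the same interpretation."""
    op = f[0]
    if op == "not":
        return not eval_formula(f[1], env)
    if op == "and":
        return eval_formula(f[1], env) and eval_formula(f[2], env)
    if op == "or":
        return eval_formula(f[1], env) or eval_formula(f[2], env)
    if op == "eq":
        return eval_term(f[1], env) == eval_term(f[2], env)
    if op == "lt":
        return eval_term(f[1], env) < eval_term(f[2], env)
    if op == "papp":  # predicate application P(T1, ..., Tk)
        return bool(env[f[1]](*[eval_term(a, env) for a in f[2:]]))
    raise ValueError(f"unknown formula operator: {op}")

x, y = ("var", "x"), ("var", "y")
# ITE(x < y, succ(f(x)), pred(y))
term = ("ite", ("lt", x, y), ("succ", ("app", "f", x)), ("pred", y))
value = eval_term(term, {"x": 1, "y": 5, "f": lambda v: 10 * v})
```

A decision procedure reasons over all interpretations of `f` at once; the evaluator only shows what a single interpretation assigns to each construct.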
Verifying Safety Properties • State Machine Model • State encoded as Booleans, integers, and functions • Next state function expresses how state is updated on each step • Prove: System will never reach bad state [Figure: state machine with Present State, Inputs (Arbitrary), Next State, and Reset; Reset States within Reachable States, separate from Bad States]
Bounded Model Checking • Repeatedly Perform Image Computations • Set of all states reachable by one more state transition • Underapproximation of Reachable State Set • But, typically catch most bugs with 8–10 steps [Figure: nested sets Reset States, R1, R2, …, Rn within Reachable; Bad States]
Implementing BMC • Construct verification condition formula for step n by symbolically simulating system for n cycles • Check with decision procedure • Do as many cycles as tractable [Figure: system started from Reset and unrolled with inputs X1, X2, …, Xn to state S; is Bad satisfiable?]
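The bounded search can be sketched on a toy system. Note the substitution: this sketch enumerates states explicitly instead of building a verification condition and calling a decision procedure, and the function `bmc` and the toy counter system are hypothetical.

```python
def bmc(reset_states, inputs, next_state, is_bad, max_steps):
    """Return the first step count at which a bad state is reachable, else None."""
    frontier = set(reset_states)          # states reachable in exactly n steps
    for n in range(max_steps + 1):
        if any(is_bad(s) for s in frontier):
            return n
        # Image computation: every state reachable by one more transition.
        frontier = {next_state(i, s) for s in frontier for i in inputs}
    return None

# Toy system: a counter that advances by 1 or 2 per step, starting at 0.
def step(i, s):
    return s + i

found_at = bmc({0}, (1, 2), step, lambda s: s == 7, max_steps=10)
not_found = bmc({0}, (1, 2), step, lambda s: s == 7, max_steps=3)
```

After n steps the counter lies between n and 2n, so the bad value 7 first becomes reachable at step 4; a bound of 3 misses it, which illustrates why the bounded result underapproximates the reachable set.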
True Model Checking • Reach Fixed-Point • Rn = Rn+1 = Reachable • Impractical for Term-Level Models • Many systems never reach fixed point • Can keep adding elements to buffer • Convergence test undecidable (Bryant, Lahiri, Seshia, CHARME ’03) [Figure: nested sets Reset States, R1, R2, …, Rn; Bad States]
Inductive Invariant Checking • Key Properties of System that Make it Operate Correctly • Formulate as formula I • Prove Inductive • Holds initially: I(s0) • Preserved by all state changes: I(s) ⇒ I(δ(i, s)) [Figure: invariant I encloses Reachable States and Reset States and excludes Bad States]
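The two proof obligations can be illustrated with a brute-force check over a bounded sample of states. This is only a sketch: a real check uses a decision procedure over all states, and the head/tail buffer system and the helper `is_inductive` are assumptions made for the example.

```python
def is_inductive(I, initial_states, sample_states, inputs, delta):
    """Check both obligations over a finite sample of states (not a proof)."""
    base = all(I(s0) for s0 in initial_states)           # holds initially
    step = all(I(delta(i, s))                            # preserved by every step
               for s in sample_states if I(s) for i in inputs)
    return base and step

# Toy system: state is (head, tail) of a buffer.
def delta(i, s):
    head, tail = s
    if i == "push":
        return (head, tail + 1)
    if i == "pop" and head < tail:
        return (head + 1, tail)
    return s

states = [(h, t) for h in range(-3, 4) for t in range(-3, 4)]
ok = is_inductive(lambda s: s[0] <= s[1], [(0, 0)], states, ["push", "pop"], delta)
bad = is_inductive(lambda s: s[1] < 3, [(0, 0)], states, ["push", "pop"], delta)
```

The candidate head ≤ tail passes both obligations on the sample, while tail < 3 holds initially but is broken by a push from a state with tail = 2.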
Inductive Invariants • Formulas I1, …, In • Ij(s0) holds for any initial state s0, for 1 ≤ j ≤ n • I1(s) ∧ I2(s) ∧ … ∧ In(s) ⇒ Ij(s′) for any current state s and successor state s′, for 1 ≤ j ≤ n • Overall Correctness • Follows by induction on time • Restricted form of invariants • ∀x1 ∀x2 … ∀xk Φ(x1…xk) • Φ(x1…xk) is a CLU formula without quantifiers • x1…xk are integer variables free in Φ(x1…xk) • Express properties that hold for all buffer indices, register IDs, etc.
Proving Invariants • Proving invariants inductive requires quantifiers • |= [∀x1 ∀x2 … ∀xk Φ(x1…xk)] ⇒ [∀y1 ∀y2 … ∀ym Ψ(y1…ym)] • Prove unsatisfiability of formula • ∀x1 ∀x2 … ∀xk Φ(x1…xk) ∧ ¬Ψ(y1…ym) • Undecidable Problem • In logic with uninterpreted functions and equality
Invariant Checking: Out-of-Order Processor Designs • Generating invariants requires considerable human effort • Impractical for realistic designs
Constructing Invariants from Predicates • Predicates • rob.head ≤ reg.tag(r) • reg.valid(r) • reg.tag(r) = t • rob.dest(t) = r • Invariant • ∀r,t. ¬reg.valid(r) ∧ reg.tag(r) = t ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(t) = r) • Result: Correctness
Automatic Predicate Abstraction • Graf & Saïdi, CAV ’97 • Idea • Given set of predicates P1(s), …, Pk(s) • Boolean formulas describing properties of system state • View as abstraction mapping: States → {0,1}^k • Defines abstract FSM over state set {0,1}^k • Form of abstract interpretation • Do reachability analysis similar to symbolic model checking • Early Implementations Inefficient • Guess at possible next abstract states • Test with call to decision procedure
P.A. as Invariant Generator • Reach Fixed-Point on Abstract System • Termination guaranteed, since finite state • Equivalent to Computing Invariant for Concrete System • Strongest possible invariant that can be expressed by formula over these predicates [Figure: Concrete System (Reset States, R1, R2, …, Rn) mapped by abstraction A to the Abstract System; the abstract fixed point is concretized by C to the invariant I]
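A small sketch of abstract reachability over predicates, again on a toy head/tail buffer. As a stated simplification, the abstract transition relation is computed by enumerating a bounded set of concrete states rather than by calls to a decision procedure; the names `abstract` and `abstract_reach` are hypothetical.

```python
def abstract(s, preds):
    """Abstraction mapping: concrete state -> tuple of predicate truth values."""
    return tuple(p(s) for p in preds)

def abstract_reach(reset_states, concrete_states, inputs, delta, preds):
    """Iterate abstract image computations until a fixed point is reached."""
    reached = {abstract(s, preds) for s in reset_states}
    while True:
        image = {abstract(delta(i, s), preds)
                 for s in concrete_states if abstract(s, preds) in reached
                 for i in inputs}
        if image <= reached:          # fixed point: nothing new was added
            return reached
        reached |= image

def delta(i, s):
    head, tail = s
    if i == "push":
        return (head, tail + 1)
    if i == "pop" and head < tail:
        return (head + 1, tail)
    return s

preds = [lambda s: s[0] == s[1],      # P1: buffer empty
         lambda s: s[0] <= s[1]]      # P2: head does not pass tail
states = [(h, t) for h in range(6) for t in range(6)]
reached = abstract_reach([(0, 0)], states, ["push", "pop"], delta, preds)
```

The fixed point contains only abstract states in which P2 holds, so head ≤ tail falls out as an invariant expressible over these predicates. Termination is guaranteed because there are at most 2^k abstract states.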
Symbolic Formulation of Predicate Abstraction • Lahiri, Bryant, Cook, CAV ’03 • Basic Operation • Compute set of legal abstract next states ρ′(B′) given current abstract states ρ(B) • B, B′: Abstract current and next-state variables • ρ, ρ′: Boolean formulas • Create formula of form φ(S, B′) • Possible combinations of current concrete state S and next abstract state B′ • Formulate as Quantifier Elimination Problem • Generate formula of form ρ′(B′) ⇔ ∃S φ(S, B′) • S: Integer variables • For interpretation of B′, formula true iff φ(S, B′) satisfiable
Decision Procedure Needs • Bounded Model Checking • Satisfiability of quantifier-free CLU formula • Handled by decision procedure • Invariant Checking • Satisfiability of quantified CLU formula • Undecidable • Predicate Abstraction • Eliminate quantifiers from CLU formula • Role of Decision Procedure • Apply in sound, but incomplete way
UCLID Decision Procedure Operation • Series of transformations leading to propositional formula • Except for lambda expansion, each has polynomial complexity • Pipeline: CLU Formula → Lambda Expansion → λ-free Formula → Function & Predicate Elimination → Term Formula → Finite Instantiation → Boolean Formula → Boolean Satisfiability
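One known way to carry out the function elimination step is to replace each application of a function symbol with a nested ITE over fresh variables, which builds functional consistency into the term itself. The slide does not say which scheme is used, so treat the following as an assumed illustration; `eliminate_function` and the fresh-variable naming are hypothetical, and terms are plain strings.

```python
def eliminate_function(name, arg_terms):
    """Replace applications name(a1), ..., name(an) by nested ITEs over fresh variables."""
    fresh = [f"v{name}{k + 1}" for k in range(len(arg_terms))]
    replacements = []
    for k, arg in enumerate(arg_terms):
        expr = fresh[k]
        # Build from the inside out so that earlier applications take priority:
        # if this argument equals an earlier one, reuse the earlier result.
        for j in range(k - 1, -1, -1):
            expr = f"ITE({arg} = {arg_terms[j]}, {fresh[j]}, {expr})"
        replacements.append(expr)
    return replacements

terms = eliminate_function("f", ["a", "b", "c"])
```

For the three applications f(a), f(b), f(c), the third replacement compares c against a and then b before falling back to its own fresh variable, so equal arguments are forced to yield equal results.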
SAT-based Decision Procedures • Eager Encoding • Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable • Lazy Encoding • Input Formula → Approximate Boolean Encoder → Boolean Formula → SAT Solver → unsatisfiable, or satisfying assignment • Satisfying assignment → First-order Conjunctions SAT Checker → satisfiable, or unsatisfiable with additional clause returned to SAT Solver
Eager Encoding Characteristics • Must encode all information about domain properties into Boolean formula • Some properties can give exponential blowup • Lets SAT solver do all of the work • Good Approach for Some Domains • Modern SAT solvers have remarkable capacity • Good at extracting relevant portions out of very large formulas • Learns about formula properties as search proceeds [Figure: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]
Encoding Methods • Difference Logic Formula → Small Domain Encoding (SD) or Per-Constraint Encoding (PC) → Boolean Formula → SAT Solver → satisfiable/unsatisfiable
Small Domain Encoding (SD) [Bryant, Lahiri, Seshia, CAV ’02] • Formula: x ≥ y ∧ y ≥ z ∧ z ≥ x+1 • Encoded: 0x1x0 ≥ 0y1y0 ∧ 0y1y0 ≥ 0z1z0 ∧ 0z1z0 ≥ 0x1x0 + 1 • Observation: To check satisfiability, need to consider all possible relative orderings of finitely-many expressions • Can use Boolean encoding of finite range of values • 4 values in this case, so 2-bit encoding [Figure: x, x+1, y, z placed along an axis of increasing values]
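The small-domain idea can be sketched by brute force, taking the slide's example to be x ≥ y ∧ y ≥ z ∧ z ≥ x+1 (the comparison operators are an assumption, since they are not legible in the extracted text). Enumerating all assignments over the 4-value range stands in for the Boolean encoding plus SAT solver; `sat_small_domain` is a hypothetical helper.

```python
from itertools import product

def sat_small_domain(constraint, num_vars, domain_size):
    """Search every assignment over {0, ..., domain_size - 1}; return one or None."""
    for values in product(range(domain_size), repeat=num_vars):
        if constraint(*values):
            return values
    return None

# Assumed example formula: x >= y, y >= z, z >= x + 1 (a contradictory cycle).
cyclic = lambda x, y, z: x >= y and y >= z and z >= x + 1
# Relaxed variant for contrast: the last constraint becomes z + 1 >= x.
relaxed = lambda x, y, z: x >= y and y >= z and z + 1 >= x

unsat_result = sat_small_domain(cyclic, 3, 4)
sat_result = sat_small_domain(relaxed, 3, 4)
```

With three variables and a range of 4 values there are only 64 assignments, which corresponds to 2 bits per variable in the Boolean encoding.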
Per-Constraint Encoding (PC) [Strichman, Seshia, Bryant, CAV ’02] • Formula: x ≥ y ∧ y ≥ z ∧ z ≥ x+1 • Predicate variables • e1: x ≥ y • e2: y ≥ z • e3: z ≥ x+1 • Overall Boolean Encoding: e1 ∧ e2 ∧ e3 • New Difference Predicate • e4: x ≥ z • Transitivity Constraints • e1 ∧ e2 ⇒ e4 • e4 ⇒ ¬e3
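A sketch of the per-constraint view, assuming the example constraints are e1: x ≥ y, e2: y ≥ z, e3: z ≥ x+1 and the derived predicate is e4: x ≥ z (the operators are an assumption). A brute-force loop over the four Boolean variables stands in for the SAT solver; `pc_satisfiable` is a hypothetical helper.

```python
from itertools import product

def pc_satisfiable(with_transitivity=True):
    """Return a satisfying assignment (e1, e2, e3, e4) of the encoding, or None."""
    for e1, e2, e3, e4 in product([False, True], repeat=4):
        overall = e1 and e2 and e3            # Boolean skeleton of the formula
        trans1 = (not (e1 and e2)) or e4      # e1 and e2 imply e4
        trans2 = (not e4) or (not e3)         # e4 implies not e3
        if overall and (not with_transitivity or (trans1 and trans2)):
            return (e1, e2, e3, e4)
    return None

with_constraints = pc_satisfiable(True)
without_constraints = pc_satisfiable(False)
```

Without the transitivity constraints the Boolean skeleton is trivially satisfiable; adding them rules out every assignment, matching the unsatisfiability of the arithmetic formula.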
Size of Boolean Encoding: SD better than PC • Let N be size of original difference logic formula • Size of a directed acyclic graph representation • SD encoding size is worst-case O(N²) • PC encoding size is worst-case O(2^N) • Can generate O(2^N) transitivity constraints • Example: N = 6813 • PC: Boolean encoding size > 1000000 • SD: Boolean encoding size 54465
Impact on SAT problem: SD vs PC • Experimentally compared zChaff performance on SD and PC encodings of several unsatisfiable formulas • Sample result: PC better than SD for zChaff
How to Choose Encoding • Hybrid Strategy • Partition variables into classes • Which ones are compared to each other • For each class, choose encoding method • PC except SD when PC blows up • How to Determine Whether PC Will Work • Try to predict based on formula characteristics • Number of constraints, density, … • Selection procedure trained by machine learning
Some Lessons We’ve Learned About Decision Procedures • Preserve Boolean Structure • Other approaches require collapsing to conjunctions of predicates (or extracting them dynamically) • Exploit Problem Characteristics • Sparseness • Polarity structure • Let SAT Solver Do the Work • Eager encoding: provide sufficient set of constraints to prove / disprove formula • SAT solvers are good at digesting large volumes of information
Invariant Checking Revisited • Prove Unsatisfiability of Formula • ∀x1 ∀x2 … ∀xk Φ(x1…xk) ∧ ¬Ψ(y1…ym) • General Form: ∀X Φ(X) ∧ ¬Ψ(Y) • Quantifier Instantiation • Generate expressions E1(Y), …, En(Y) • Using terms that appear in Q • Expand as Φ(E1(Y)) ∧ … ∧ Φ(En(Y)) ∧ ¬Ψ(Y) • If unsatisfiable, then so is quantified formula • Sound, but incomplete • Trade-off • Be clever about instantiation, or • Instantiate many terms and rely on decision procedure capacity
Predicate Abstraction Revisited • Formulate as Quantifier Elimination Problem • Generate formula of form ρ′(B′) ⇔ ∃S φ(S, B′) • S: Integer variables • Use Eager SAT Encoding of φ • Get formula ∃A P(A, B′) • A: Boolean variables • Satisfying solutions for P w.r.t. B′ same as those for φ • Core problem of symbolic model checking
Quantifier Elimination for P.A. • Formula ∃A P(A, B′) • A: Boolean variables • Typically: 200+ variables for A, ~20 for B′ • BDD-Based • Use partitioning techniques developed for symbolic model checking • Typically too many total Boolean variables • SAT Enumeration • Find satisfying solution α(A) ∧ β(B′) to P • Enumerate solution β(B′) • Reformulate P as P ∧ ¬β(B′) • Performance: about 1000 solutions / second
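The SAT enumeration loop can be sketched as follows. A brute-force search stands in for the SAT solver, and the names `find_solution` and `enumerate_projection`, together with the example formula P, are assumptions made for illustration. Each found solution is projected onto the B variables and then blocked so the next search must produce a new projection.

```python
from itertools import product

def find_solution(P, n_a, n_b, blocked):
    """Stand-in for a SAT solver: find (a, b) satisfying P with b not yet blocked."""
    for a in product([False, True], repeat=n_a):
        for b in product([False, True], repeat=n_b):
            if b not in blocked and P(a, b):
                return a, b
    return None

def enumerate_projection(P, n_a, n_b):
    """Collect every assignment to B for which some assignment to A satisfies P."""
    blocked = set()
    while True:
        solution = find_solution(P, n_a, n_b, blocked)
        if solution is None:
            return blocked
        blocked.add(solution[1])      # corresponds to conjoining the negated B solution

# Example P(A, B): at least one A bit set, b0 copies a0, b1 is a0 AND a1.
def P(a, b):
    return (a[0] or a[1]) and b[0] == a[0] and b[1] == (a[0] and a[1])

projection = enumerate_projection(P, 2, 2)
```

The result is the set of B assignments that survive existential quantification of A. The number of solver calls equals the number of such assignments plus one, which is why the approach scales with the size of the projection rather than with the number of A variables.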
Why Verification Tasks Feasible • CLU Logic Fairly Simple • Equality, uninterpreted functions, difference constraints • Small model property • “Deep” Reasoning Not Required • Formulas large and messy, but straightforward • Verifying systems that are designed to have constrained behaviors • Only checking effect of a few cycles of system operation
Decision Procedures Revisited • SAT-Based Approaches Effective • Good performance as decision procedures • Key to implementing predicate abstraction • Quantifier elimination • Eager Encoding Gives Good Performance • Avoids many iterations of theory-specific checkers • Extends to linear integer arithmetic • Seshia & Bryant, LICS ‘04 • Quantifier-free Presburger • Small domain encoding exploiting sparseness
Areas of Research • Bit-Vector Decision Procedures • True model for hardware & low-level software • Bit-field extraction • Bit-wise Boolean operations • Overflow effects • Automatically apply abstractions • Abstract to symbolic terms whenever possible • Boolean Quantifier Elimination • SAT enumeration still not good enough • Limits predicate abstraction to ~25 predicates • Core problem for symbolic model checking