* Carnegie Mellon University † IBM

Exploiting Positive Equality in a Logic of Equality with Uninterpreted Functions
Randal E. Bryant*, Steven German†, Miroslav Velev*
*Carnegie Mellon University †IBM
http://www.cs.cmu.edu/~bryant

Outline
• Application Domain
  • Verify correctness of a pipelined processor
  • Based on Burch-Dill correspondence checking
    • Burch & Dill, CAV ‘94
• Verification Task
  • Abstracted representation of data manipulation
  • Must decide validity of formula in logic of Equality with Uninterpreted Functions (EUF)
• New Contribution
  • Exploit properties of formulas to reduce verification complexity
  • Significant performance improvement when modeling microprocessor operation

Microprocessor Modeling
• Simplified RISC pipeline
• Described at RTL level
• Words viewed as bit vectors
• Bit-level functionality
[Figure: pipeline datapath with IF/ID, ID/EX, and EX/WB registers, PC, instruction memory, register file (Ra, Rb, Rd, Adat, Bdat), ALU, and control logic]

Abstracting Data
• View Data as Symbolic “Terms”
  • No particular properties or operations
    • Except for equations: x = y
  • Can store in memories & registers
  • Can select with multiplexors
    • ITE: If-Then-Else operation
[Figure: a word of bits x0, x1, x2, …, xn-1 abstracted to a single term x; a multiplexor with control p selecting between x and y, written ITE(p, x, y)]

Abstraction Via Uninterpreted Functions
• For any Block that Transforms or Evaluates Data:
  • Replace with generic, unspecified function
  • Assume functional consistency: x = y ⇒ f(x) = f(y)
[Figure: the pipeline datapath with data-transforming blocks replaced by uninterpreted functions F1, F2, and F3]

Decision Problem
• Logic of Equality with Uninterpreted Functions (EUF)
  • Domain Values (solid lines in the figure)
    • Uninterpreted functions
    • If-Then-Else operation
  • Truth Values (dashed lines in the figure)
    • Uninterpreted predicates
    • Logical connectives
    • Equations
• Task
  • Determine whether formula is universally valid
  • True for all interpretations of variables and function symbols
[Figure: example formula graph with function f, ITE nodes, equations, and connectives ∧, ∨, ¬ over variables such as x and d0]

Some History
• Ackermann, 1954
  • Quantifier-free decision problem can be decided based on finite instantiations
• Automatic Theorem Proving
  • Tradition of using uninterpreted functions when modeling hardware
  • E.g., Warren Hunt, 1985
• Burch & Dill, CAV ‘94
  • Automatic decision procedure
    • Davis-Putnam enumeration
    • Congruence closure to enforce functional consistency
  • Verified single-issue DLX
    • Simple 5-stage RISC pipeline
  • Becomes less effective for more complex processors
    • Burch, DAC ‘96 & FMCAD ‘96

Previous Attempts to Use BDDs
• Hojati, et al., IWLS ‘97
  • Generate binary encodings of limited-range integer variables
  • Hit exponential blow-up
• Goel, et al., CAV ‘98
  • Encode equality relation among variables as propositional variables
  • Results not compelling
• Velev & Bryant, FMCAD ‘98
  • Work with modified RTL model
  • Replace memory & function blocks with special behavioral blocks
  • Exponential blow-up for processor with branch or load/store instructions

Why Did BDDs Fail?
• Result of Load instruction used in address computation
  • Similar effect for branch instruction
• Impossible to have good BDD variable ordering
  • Variables encoding addresses must precede those encoding data
  • Leads to circular constraints on ordering
[Figure: data memory with address and data ports connected in a loop through the pipeline logic]

Decision Problem Example #1
[Figure: formula graph built from ¬, ∨, two equations, and applications of g and h over variables x and y]

EUF Syntax
• Logic of Equality with Uninterpreted Functions
• Terms
    ITE(F, T1, T2)          If-then-else
    f(T1, …, Tk)            Function application
• Formulas
    ¬F, F1 ∧ F2, F1 ∨ F2    Boolean connectives
    T1 = T2                 Equation
    p(T1, …, Tk)            Predicate application
• Special Cases
    v                       Domain variable (order-0 function)
    a                       Propositional variable (order-0 predicate)
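The grammar above maps directly onto a small abstract syntax tree. The following is a minimal sketch, assuming a dictionary-based interpretation that maps each symbol name to a Python callable; the class names (`App`, `Ite`, `Eq`, `Pred`, `Not`, `And`, `Or`) are illustrative and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical AST for EUF; all names here are illustrative assumptions.

@dataclass(frozen=True)
class App:                      # f(T1, ..., Tk); k = 0 gives a domain variable
    f: str
    args: Tuple = ()

@dataclass(frozen=True)
class Ite:                      # ITE(F, T1, T2)
    cond: object
    then: object
    other: object

@dataclass(frozen=True)
class Eq:                       # T1 = T2
    lhs: object
    rhs: object

@dataclass(frozen=True)
class Pred:                     # p(T1, ..., Tk); k = 0 gives a propositional variable
    p: str
    args: Tuple = ()

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class And:
    lhs: object
    rhs: object

@dataclass(frozen=True)
class Or:
    lhs: object
    rhs: object

def eval_term(t, interp):
    """Evaluate a term under one interpretation (symbol name -> callable)."""
    if isinstance(t, App):
        return interp[t.f](*(eval_term(a, interp) for a in t.args))
    chosen = t.then if eval_formula(t.cond, interp) else t.other
    return eval_term(chosen, interp)

def eval_formula(phi, interp):
    """Evaluate a formula under one interpretation."""
    if isinstance(phi, Eq):
        return eval_term(phi.lhs, interp) == eval_term(phi.rhs, interp)
    if isinstance(phi, Pred):
        return bool(interp[phi.p](*(eval_term(a, interp) for a in phi.args)))
    if isinstance(phi, Not):
        return not eval_formula(phi.arg, interp)
    if isinstance(phi, And):
        return eval_formula(phi.lhs, interp) and eval_formula(phi.rhs, interp)
    return eval_formula(phi.lhs, interp) or eval_formula(phi.rhs, interp)

# Functional consistency as a formula: not (x = y) or f(x) = f(y)
x, y = App("x"), App("y")
consistency = Or(Not(Eq(x, y)), Eq(App("f", (x,)), App("f", (y,))))
```

Evaluating under a single interpretation only tests one case; validity requires the formula to hold under every interpretation, which is what the decision procedure later in the talk addresses.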

PEUF Syntax
• Logic of Positive Equality with Uninterpreted Functions
• Formulas (General)
    ¬F, F1 ∧ F2, F1 ∨ F2
    GT1 = GT2
    p(PT1, …, PTk)
• P-Formulas (Special)
    F
    PF1 ∧ PF2, PF1 ∨ PF2
    PT1 = PT2
• G-Terms (General)
    ITE(F, GT1, GT2)
    fg(PT1, …, PTk)
• P-Terms (Special)
    GT
    ITE(F, PT1, PT2)
    fp(PT1, …, PTk)
• Key Properties
  • P-formulas cannot be negated & cannot control ITEs
  • P-terms are used only as function arguments and in positive equations
  • Applications of p-function symbols occur only in p-terms

Analyzing Example #1
• P-Function Symbols: g, h
• G-Function Symbols: x, y
  • Appear in negated equation
[Figure: the Example #1 formula graph with regions marked as formulas, P-formulas, G-terms, and P-terms]
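The classification can be computed mechanically by walking the formula and tracking whether each equation sits in a positive position. The sketch below is one way to do that, using nested tuples as an illustrative encoding. The formula used for Example #1, ¬(x = y) ∨ h(g(x), g(g(x))) = h(g(y), g(g(x))), is a reconstruction from the slide figure and should be read as an assumption.

```python
# Illustrative tuple encoding (not from the slides):
#   terms:    ("app", f, arg1, ..., argk)    ("ite", F, T1, T2)
#   formulas: ("not", F)  ("and", F1, F2)  ("or", F1, F2)
#             ("eq", T1, T2)  ("pred", p, arg1, ..., argk)

def g_symbols(formula):
    """Return the function symbols that have an application in a g-term position."""
    found = set()

    def visit_term(t, general):
        if t[0] == "ite":
            visit_formula(t[1], positive=False)    # ITE controls are general formulas
            visit_term(t[2], general)
            visit_term(t[3], general)
        else:                                       # ("app", f, args...)
            if general:
                found.add(t[1])
            for arg in t[2:]:
                visit_term(arg, general=False)      # function arguments are p-terms

    def visit_formula(phi, positive):
        op = phi[0]
        if op == "not":
            visit_formula(phi[1], positive=False)   # anything under a negation is general
        elif op in ("and", "or"):
            visit_formula(phi[1], positive)
            visit_formula(phi[2], positive)
        elif op == "eq":
            # Positive equations compare p-terms; all other equations compare g-terms.
            visit_term(phi[1], general=not positive)
            visit_term(phi[2], general=not positive)
        else:                                       # ("pred", p, args...)
            for arg in phi[2:]:
                visit_term(arg, general=False)

    visit_formula(formula, positive=True)
    return found

def function_symbols(node):
    """Collect every function symbol applied anywhere in a term or formula."""
    if node[0] == "app":
        return {node[1]}.union(*(function_symbols(a) for a in node[2:]))
    start = 2 if node[0] == "pred" else 1
    return set().union(*(function_symbols(c) for c in node[start:]))

# Assumed reconstruction of Example #1.
x, y = ("app", "x"), ("app", "y")
gx, gy = ("app", "g", x), ("app", "g", y)
ggx = ("app", "g", gx)
example1 = ("or", ("not", ("eq", x, y)),
                  ("eq", ("app", "h", gx, ggx), ("app", "h", gy, ggx)))

g_syms = g_symbols(example1)
p_syms = function_symbols(example1) - g_syms
```

Under this reconstruction, `g_syms` contains x and y and `p_syms` contains g and h, matching the classification on the slide.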

Example #2
[Figure: formula graph containing an ITE controlled by an equation, with applications of g and h over variables x and y]

Analyzing Example #2
• ITE control must be formula
• “Interesting” things happen when false
[Figure: the Example #2 formula graph with regions marked as formula, P-formula, G-terms, and P-terms]

Maximally Diverse Interpretations
• P-Function Symbols
  • Equal results only for equal arguments
• G-Function Symbols
  • Potentially yield equal results for unequal arguments
• Property
  • Formula is valid if and only if it is true under all maximally diverse interpretations

    Terms                 Equal?
    x, y                  Potentially
    g(x), g(y)            Only if x = y
    g(x), y               No
    g(g(x)), g(y)         No
    g(g(x)), g(x)         No

[Figure: the Example #1 formula graph]
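One way to picture a maximally diverse interpretation of a p-function symbol is a lookup table that hands out a fresh value for every new argument tuple. The sketch below assumes fresh values start at 100 so that they cannot collide with the small integers used for domain variables; the class name `DiverseFunction` is hypothetical.

```python
import itertools

class DiverseFunction:
    """Interpretation of a p-function symbol: equal results only for equal arguments."""

    # Shared counter so results never collide across different function symbols.
    # Starting at 100 is an assumption that keeps results apart from variable values.
    _fresh = itertools.count(100)

    def __init__(self):
        self.table = {}

    def __call__(self, *args):
        if args not in self.table:
            self.table[args] = next(DiverseFunction._fresh)
        return self.table[args]

g = DiverseFunction()
x, y = 0, 1                      # x and y are g-function symbols, so they may also be equal
same_only_if_args_equal = (g(x) == g(y)) == (x == y)
```

With this interpretation, g(x) and g(y) agree exactly when x and y do, while g(g(x)) differs from g(x), g(y), x, and y, as in the table above.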

Justification of Maximal Diversity Property
• Create Worst Case for Validity
  • Falsify positive equation
  • Function applications yield distinct results
  • Function arguments distinct
• Key Argument
  • For every interpretation I, there is a maximally diverse interpretation I′ such that I′[F] ⇒ I[F]
  • So any interpretation that falsifies F yields a maximally diverse one that also falsifies F
[Figure: the Example #1 formula graph]

Equations in Processor Verification

    Data Type              Equations
    Register Ids           Control stalling & forwarding; addresses for register file
    Instruction Address    Only top-level verification condition
    Program Data           Only top-level verification condition

[Figure: the pipeline datapath with IF/ID, ID/EX, and EX/WB registers]

Modeling Memories
• Conventional Expansion of Memory Operations
  • Effects of writes represented as nested ITEs
  • Initial memory state represented by uninterpreted function fM
  • Example sequence: Write(a1, d1); Write(a2, d2); Write(a3, d3); Read(a)
• Problem
  • Equations over addresses control ITEs
  • Addresses must be g-terms
  • OK for register file, but not for data memory
[Figure: chain of ITEs comparing a with a1, a2, a3 and selecting d1, d2, d3, with fM(a) as the default]
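The conventional expansion can be sketched as a short routine that builds the nested ITE term as a string. The function name and the string representation are illustrative choices; only the symbols fM, a, ai, and di come from the slide.

```python
def read_after_writes(writes, addr, initial="fM"):
    """Expand Read(addr) after a list of (address, data) writes into nested ITEs."""
    term = f"{initial}({addr})"        # value held in the initial memory state
    for a, d in writes:                # later writes wrap earlier ones, so they are tested first
        term = f"ITE({addr} = {a}, {d}, {term})"
    return term

expanded = read_after_writes([("a1", "d1"), ("a2", "d2"), ("a3", "d3")], "a")
# expanded == "ITE(a = a3, d3, ITE(a = a2, d2, ITE(a = a1, d1, fM(a))))"
```

Every ITE control here is an equation over addresses, which is why the addresses must be g-terms in this model.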

Data Memory Modeling
• Generic State Machine
  • Memory state represented as term
  • Initial state given by variable vM
• Write operation causes arbitrary state change
  • Uninterpreted function fu
• Read operation function of address & state
  • Uninterpreted function fr
[Figure: memory state register updated by fu from Waddr and Wdata (write); fr computes Rdata from Raddr and the memory state (read)]

Data Memory Modeling (Cont.)
• Example sequence: Write(a1, d1); Write(a2, d2); Write(a3, d3); Read(a)
• No equations over addresses!
  • Can keep as p-terms
• Limitations
  • Does not capture full semantics of memory
  • Only works when processor preserves program order for:
    • Writes relative to each other
    • Reads relative to writes
[Figure: chain of three fu applications starting from vM, each taking ai and di, feeding fr together with the read address a]
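For comparison with the nested-ITE expansion, the generic state machine model produces a term with no equations at all. This is a minimal sketch; the argument order of fu (state, address, data) and fr (state, address) is an assumption, as the slides show only which signals feed each function.

```python
def read_after_writes_abstract(writes, addr, initial="vM"):
    """Model Read(addr) after a list of (address, data) writes with fu and fr."""
    state = initial
    for a, d in writes:
        state = f"fu({state}, {a}, {d})"   # each write yields a new abstract memory state
    return f"fr({state}, {addr})"

abstract = read_after_writes_abstract([("a1", "d1"), ("a2", "d2"), ("a3", "d3")], "a")
# abstract == "fr(fu(fu(fu(vM, a1, d1), a2, d2), a3, d3), a)"
```

Because addresses appear only as function arguments, they can remain p-terms.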

Function Symbols in Processor Verification
• G-Function Symbols
  • Register Ids
  • 20–25% of function applications
• P-Function Symbols
  • Program data
  • Data & instruction addresses
  • Opcodes
  • 75–80% of function applications
• Effect
  • Breaks dependency loop that caused exponential blow-up

Decision Procedure
• Steps
  • Eliminate function applications
  • Assign limited ranges to domain variables
  • Encode domain variables as bit vectors
  • Translate into propositional logic
[Figure: the Example #1 formula graph]

Eliminating Function Applications
• Replacing Application
  • Introduce new domain variable
  • Nested ITE structure maintains functional consistency
[Figure: f(x1), f(x2), f(x3) replaced by vf1, ITE(x2 = x1, vf1, vf2), and ITE(x3 = x1, vf1, ITE(x3 = x2, vf2, vf3))]
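The nested ITE replacement can be generated with a short loop. This is a sketch for a single-argument function symbol whose arguments are already free of function applications; the function name and the string output format are illustrative.

```python
def eliminate_applications(f, args):
    """Replace f(args[0]), f(args[1]), ... by nested ITEs over fresh variables vf1, vf2, ..."""
    replacements = []
    for i, x in enumerate(args):
        term = f"v{f}{i + 1}"                    # fresh variable for this application
        for j in range(i - 1, -1, -1):           # wrap from the innermost ITE outwards
            term = f"ITE({x} = {args[j]}, v{f}{j + 1}, {term})"
        replacements.append(term)
    return replacements

terms = eliminate_applications("f", ["x1", "x2", "x3"])
# terms[0] == "vf1"
# terms[1] == "ITE(x2 = x1, vf1, vf2)"
# terms[2] == "ITE(x3 = x1, vf1, ITE(x3 = x2, vf2, vf3))"
```

Each application reuses the variable of the earliest earlier application with an equal argument, which is what keeps the replacement functionally consistent.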

Exploiting Positive Equality
• Property
  • P-function symbol f
  • Introduce variables vf1, …, vfn during elimination
  • Consider only diverse interpretations for variables vf1, …, vfn
    • vfi ≠ v for any other variable v
• Example
  • Assuming vf1 ≠ vf2: the equation vf1 = ITE(x2 = x1, vf1, vf2) holds iff x1 = x2
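The example can be checked exhaustively over a small domain. The sketch below assumes the nested-ITE replacement f(x1) → vf1 and f(x2) → ITE(x2 = x1, vf1, vf2); the domain size of 4 is arbitrary.

```python
from itertools import product

def ite(cond, a, b):
    return a if cond else b

def equation_tracks_arguments(domain=range(4)):
    """Check that, when vf1 != vf2, f(x1) = f(x2) holds exactly when x1 = x2."""
    for x1, x2, vf1, vf2 in product(domain, repeat=4):
        if vf1 == vf2:
            continue                         # only diverse interpretations are considered
        fx1 = vf1
        fx2 = ite(x2 == x1, vf1, vf2)
        if (fx1 == fx2) != (x1 == x2):
            return False
    return True

holds = equation_tracks_arguments()
```

This is why fixed, distinct values can be assigned to the variables introduced for p-function symbols.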

 =  F x1 vf1  = vf2 x2 f f Compare: Ackermann’s Method • Replacing Application • Introduce new domain variable • Enforce functional consistency by global constraints • Unclear how to generate diverse interpretations

Eliminating Function Symbol g
[Figure: the Example #1 formula graph before and after replacing the applications of g with ITE structures over variables vg1, vg2, vg3]

Eliminate Function Symbol h
• Final Form
  • Only domain and propositional variables
[Figure: the formula graph before and after replacing the applications of h with an ITE over variables vh1 and vh2]

Instantiating Variables
• Can assign fixed interpretations to variables arising from eliminating p-function applications
• Need to consider only two different cases
  • y = 0 vs. y = 1
• Domains: x ∈ {0}, y ∈ {0, 1}, vg1 ∈ {2}, vg2 ∈ {3}, vg3 ∈ {4}, vh1 ∈ {5}, vh2 ∈ {6}
[Figure: the function-free formula graph with each variable labeled by its domain]

Evaluating Formula
• Actual implementation uses BDD evaluation
• Domains: x ∈ {0}, y ∈ {0, 1}, vg1 ∈ {2}, vg2 ∈ {3}, vg3 ∈ {4}, vh1 ∈ {5}, vh2 ∈ {6}
[Figure: the function-free formula graph annotated with evaluated values, including y = 0, y ≠ 0, ITE(y=0, 2, 3), ITE(y=0, 5, 6), and the constants 2, 4, 5]
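The evaluation can be reproduced by direct enumeration instead of BDDs. The sketch below assumes Example #1 is ¬(x = y) ∨ h(g(x), g(g(x))) = h(g(y), g(g(x))), a reconstruction from the figures, and uses the fixed values 2 through 6 for the variables introduced during elimination.

```python
def ite(cond, a, b):
    return a if cond else b

def example_formula(x, y, vg1=2, vg2=3, vg3=4, vh1=5, vh2=6):
    """Function-free form of the assumed Example #1, evaluated on concrete values."""
    # Applications of g in elimination order: g(x), g(y), g(g(x))
    gx = vg1
    gy = ite(y == x, vg1, vg2)
    ggx = ite(gx == x, vg1, ite(gx == y, vg2, vg3))
    # Applications of h: h(g(x), g(g(x))) and h(g(y), g(g(x)))
    h1 = vh1
    h2 = ite(gy == gx and ggx == ggx, vh1, vh2)
    return (not x == y) or h1 == h2

# Only two cases remain: x = 0 with y = 0 or y = 1.
valid = all(example_formula(0, y) for y in (0, 1))
```

With y = 0 the second equation reduces to 5 = 5, and with y = 1 the negated equation ¬(x = y) is already true, so the formula holds in both cases.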

Pnueli, et al., CAV ‘99
• Similarities
  • Examine structure of equations
    • Whether used in positive or negative form
  • Exploit structure to limit variable domains
• Differences in Their Approach
  • Examine equation structure after function applications eliminated
  • Use Ackermann’s method to eliminate function applications

Ackermann’s Method Example
• Many more equations
  • 2 → 8
• P-formula / P-term structure destroyed
[Figure: the Example #1 formula after Ackermann’s method, with consistency constraints over x, y, vg1, vg2, vg3, vh1, vh2 conjoined as antecedents]
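The growth in equations comes from the pairwise consistency constraints that Ackermann’s method adds. The following sketch generates them for a single-argument function symbol; the function name and output format are illustrative, and the argument list used for g assumes the applications g(x), g(y), g(g(x)) of the reconstructed Example #1.

```python
from itertools import combinations

def ackermann_constraints(f, args):
    """One constraint per pair of applications: equal arguments imply equal results."""
    return [
        f"({a} = {b}) => (v{f}{i + 1} = v{f}{j + 1})"
        for (i, a), (j, b) in combinations(enumerate(args), 2)
    ]

# g(x) -> vg1, g(y) -> vg2, g(g(x)) = g(vg1) -> vg3
g_constraints = ackermann_constraints("g", ["x", "y", "vg1"])
# ["(x = y) => (vg1 = vg2)", "(x = vg1) => (vg1 = vg3)", "(y = vg1) => (vg2 = vg3)"]
```

Each constraint places its equations in the antecedent of an implication, so they occur negatively, which is how the P-formula / P-term structure is lost.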

Comparison to Pnueli, et al.
• Relative Advantage of Their Method
  • Better at exploiting equation structure among g-terms
  • Worse at exploiting structure among p-terms

Experimental Results
• Verify Modified RTL Circuits
  • Replace memories, latches, and function blocks by special functional models
    • Velev & Bryant, FMCAD ‘98
  • Small modification to generate fixed bit patterns for p-function block
• Simplified MIPS Processor

    Instruction classes                  Before            After
    Reg-Reg and Reg-Immediate only       48 s / 7 MB       6 s / 2 MB
    RR, RI + Load/Store                  Space-out         12 s / 1.8 MB
    RR, RI, L/S, Branch                  Space-out         169 s / 7.5 MB

Conclusion
• Exploiting Positive Equality
  • Greatly reduces number of interpretations to consider
  • Our function elimination scheme provides encoding mechanism
  • Enables verification of complete processor using BDDs
• Ongoing Work
  • New implementation using pure term-level models
    • Velev & Bryant, CHARME ‘99
  • Single-issue DLX now takes 0.15 s
  • Dual-issue DLX takes 35 s
