1 Tel Aviv University 2 University of Massachusetts-Amherst 3 University of Wisconsin-Madison

Simulating Reachability using First-Order Logic with Applications to Verification of Linked Data Structures • Tal Lev-Ami (1), Neil Immerman (2), Tom Reps (3), Mooly Sagiv (1), Siddharth Srivastava (2), and Greta Yorsh (1) • (1) Tel Aviv University, (2) University of Massachusetts-Amherst, (3) University of Wisconsin-Madison • CADE 2005

Applications of TC in verification • Transitive closure is natural for reasoning about linked data structures • Element (v) of a list (pointed to by x) • ∃w. x(w) ∧ n*(w,v) • Acyclicity • ∀v1,v2. n(v1,v2) → ¬n*(v2,v1) • Unreachable objects (garbage) • ∃v2. ∀v1. Var(v1) → ¬f*(v1,v2) • Deadlocks

Automated reasoning for FOL • Powerful tools available for automated reasoning in FOL (with equality) • Resolution • SPASS, Vampire, … • Nelson-Oppen • Simplify, Zapato, … • … • Prove, disprove (or diverge)

What about FOL+TC? • No known tools for automated reasoning in full FOL+TC • No surprise: TC is very powerful, and even small fragments of FOL (e.g., C2) become undecidable with the addition of TC • No R.E. axiomatization of TC in FOL

Agenda • Verifying heap-manipulating programs • Initial axiomatization • Induction axiom scheme • Automating axiom instantiation • Conclusion

Verifying heap-manipulating programs • Heap objects: Individuals • Reference variables: Unary relation symbols • x(v), y(v) – if v is pointed to by x, y • Fields: Binary relation symbols • n(v,w) – the n field of v points to w

Reflexive transitive closure • n*(v1,v2) • v2 is reachable from v1 by following 0 or more n-fields • n*(v1,v2) is the least fixed point of ntc in • ∀v1,v2. ntc(v1,v2) ↔ (v1=v2) ∨ ∃w. n(v1,w) ∧ ntc(w,v2) or • ∀v1,v2. ntc(v1,v2) ↔ (v1=v2) ∨ ∃w. ntc(v1,w) ∧ n(w,v2)
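The least-fixed-point reading can be made concrete on a finite heap. The following is a minimal sketch, assuming nodes are plain Python values and the field n is a set of pairs; the function name and the sample heap are illustrative and not from the talk.

```python
def reflexive_transitive_closure(nodes, n):
    """Least fixed point of: ntc(a, b) <-> a == b or exists w. n(a, w) and ntc(w, b)."""
    ntc = {(v, v) for v in nodes}  # base case: v1 = v2
    changed = True
    while changed:
        changed = False
        for (a, w) in n:
            # Snapshot ntc so we can add pairs while scanning it.
            for (w2, b) in list(ntc):
                if w == w2 and (a, b) not in ntc:
                    ntc.add((a, b))  # unfold one n-step on the left
                    changed = True
    return ntc


# Illustrative heap: a -> b -> c
nodes = {"a", "b", "c"}
n = {("a", "b"), ("b", "c")}
ntc = reflexive_transitive_closure(nodes, n)
```

Starting from the identity pairs and adding only what the unfolding forces is what makes the result the least solution; any larger relation satisfying the same equivalence is excluded by construction.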

Verification example • A list pointed to by x • A list pointed to by y • Show that • x ≠ y → the lists are disjoint

Premise • Unary reachability (shorthand) • ∀v. r_{z,n}(v) ↔ ∃w. z(w) ∧ n*(w,v) • No heap sharing • ∀v,v1,v2. n(v1,v) ∧ n(v2,v) → v1=v2 • No incoming edges to x and y • ∀v,w. x(v) ∨ y(v) → ¬n(w,v) • x and y are unique and different • ∀v1,v2. x(v1) ∧ x(v2) → v1=v2 • ∀v1,v2. y(v1) ∧ y(v2) → v1=v2 • ∀v. ¬(x(v) ∧ y(v))

Goal • The lists pointed to by x and y are disjoint • ∀v. r_{x,n}(v) → ¬r_{y,n}(v)
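To see what the premise and goal say, they can be evaluated on one concrete heap. This is only a sanity check on a single finite model, not a proof; the heap, the helper names, and the encoding of x and y as sets are assumptions made for illustration.

```python
def rtc(nodes, n):
    """Reflexive transitive closure of the edge set n over nodes."""
    closure = {(v, v) for v in nodes}
    changed = True
    while changed:
        changed = False
        for (a, w) in n:
            for (w2, b) in list(closure):
                if w == w2 and (a, b) not in closure:
                    closure.add((a, b))
                    changed = True
    return closure


def reachable_from(var, closure):
    """r_{z,n}: all v such that some w with z(w) has n*(w, v)."""
    return {v for (w, v) in closure if w in var}


# Illustrative heap: list 1 -> 2 -> 3 pointed to by x, list 4 -> 5 pointed to by y.
nodes = {1, 2, 3, 4, 5}
n = {(1, 2), (2, 3), (4, 5)}
x, y = {1}, {4}

closure = rtc(nodes, n)
r_x = reachable_from(x, closure)
r_y = reachable_from(y, closure)

# Premise, clause by clause.
no_sharing = all(v1 == v2 for (v1, v) in n for (v2, u) in n if u == v)
no_incoming = all(v not in x and v not in y for (_, v) in n)
unique_and_different = len(x) <= 1 and len(y) <= 1 and not (x & y)

# Goal: the two lists are disjoint.
disjoint = not (r_x & r_y)
```

On this heap every premise clause holds and so does the goal. The verification task is to show the implication for all heaps, which is where the axiomatization of ntc comes in.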

Approximating TC in FOL • Extend vocabulary with new binary relation symbol ntc • Replace all occurrences of n* with ntc • Add ‘Natural’ axioms • ∀v1,v2. ntc(v1,v2) ↔ (v1=v2) ∨ ∃w. n(v1,w) ∧ ntc(w,v2) • ∀v1,v2. ntc(v1,v2) ↔ (v1=v2) ∨ ∃w. ntc(v1,w) ∧ n(w,v2) • The problem: minimality • Least fixed point is not expressible in FOL

TC-models • TC-model: a model M s.t. • if n and ntc are in the vocabulary of M, then • (ntc)^M = (n^M)*, i.e., M interprets ntc as the reflexive, transitive closure of its interpretation of n • A set of axioms (axiomatization) Γ is • TC-valid if Γ is true in every TC-model • TC-complete if for every formula φ that is true in all TC-models, Γ ⊨ φ

Approximating TC in FOL • Natural axiomatization is TC-complete for acyclic finite models • Not TC-complete otherwise • Negative occurrences of TC are the problem • TC-valid formulas with only positive occurrences of TC are implied by the natural axiomatization

Problems: cycles • [Figure: two models over nodes u1, u2, u3, u4 with n and ntc edges: a TC-model where n* = ntc, and a model where n* ≠ ntc] • ∀v1,v2. ntc(v1,v2) ↔ (v1=v2) ∨ ∃w. n(v1,w) ∧ ntc(w,v2) • ∀v1,v2. ntc(v1,v2) ↔ (v1=v2) ∨ ∃w. ntc(v1,w) ∧ n(w,v2)
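The cycle problem can be reproduced on a small finite model. The sketch below assumes two disjoint 2-cycles over u1..u4 (one plausible reading of the figure, not confirmed by the slide text) and interprets ntc as the full relation; the checker name is illustrative.

```python
nodes = ["u1", "u2", "u3", "u4"]
# Assumed shape: u1 <-> u2 and u3 <-> u4 are two separate n-cycles.
n = {("u1", "u2"), ("u2", "u1"), ("u3", "u4"), ("u4", "u3")}


def satisfies_natural_axioms(nodes, n, ntc):
    """Check both unfolding axioms for every pair (v1, v2)."""
    for v1 in nodes:
        for v2 in nodes:
            holds = (v1, v2) in ntc
            left = v1 == v2 or any((v1, w) in n and (w, v2) in ntc for w in nodes)
            right = v1 == v2 or any((v1, w) in ntc and (w, v2) in n for w in nodes)
            if holds != left or holds != right:
                return False
    return True


# The true closure: each node reaches itself and its cycle partner.
true_closure = {(v, v) for v in nodes} | n

# A non-standard interpretation: every node "reaches" every node.
too_large = {(a, b) for a in nodes for b in nodes}

ok_true = satisfies_natural_axioms(nodes, n, true_closure)
ok_large = satisfies_natural_axioms(nodes, n, too_large)
```

Both interpretations pass the check, although only the first is a TC-model. Because every node has an n-successor and an n-predecessor, the unfolding axioms can never force a pair out of ntc, which is the minimality gap the slide points at.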

Problems: infinite models • [Figure: n-chains from x and y that continue indefinitely ("…"), shown in two models: a TC-model where n* = ntc, and a model where n* ≠ ntc]

Problems: infinite models • Existing FOL theorem provers cannot be restricted to finite models • Finiteness is not FOL expressible

Induction axiom scheme • IND[P,Z,n] = (∀w. Z(w) → P(w)) ∧ (∀w1,w2. P(w1) ∧ n(w1,w2) → P(w2)) → (∀w1,w2. Z(w1) ∧ ntc(w1,w2) → P(w2)) • Incomplete • Complete axiomatization is non-R.E. • How to choose Z and P?

Choosing axiom instantiations • Hard to find Z and P to instantiate IND directly • Introduce new axiom schemes provable from IND in FOL • Add enough axioms to Γ to prove target formula • Used in practice to prove interesting examples

Ideas towards solution • Reasoning about edges toward reasoning about paths • Reasoning about one type of paths toward reasoning about another type

Coloring axioms • Start with transitivity • ∀w1,w2,w3. ntc(w1,w2) ∧ ntc(w2,w3) → ntc(w1,w3) • Add instances of coloring axiom schemes • NoExit • NewStart

NoExit • [Figure: a region colored A] • NoExit[A,n] = (∀w1,w2. A(w1) ∧ n(w1,w2) → A(w2)) → (∀w1,w2. A(w1) ∧ ntc(w1,w2) → A(w2))
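NoExit is one of the schemes provable from IND. A short derivation, offered as a sketch using only the definitions of IND and NoExit given in these slides, instantiates both Z and P with the color A:

```latex
\begin{align*}
\mathrm{IND}[A,A,n]
  &= \bigl(\forall w.\ A(w) \to A(w)\bigr)
     \land \bigl(\forall w_1,w_2.\ A(w_1) \land n(w_1,w_2) \to A(w_2)\bigr) \\
  &\qquad \to \bigl(\forall w_1,w_2.\ A(w_1) \land n_{tc}(w_1,w_2) \to A(w_2)\bigr) \\
  &\equiv \bigl(\forall w_1,w_2.\ A(w_1) \land n(w_1,w_2) \to A(w_2)\bigr)
     \to \bigl(\forall w_1,w_2.\ A(w_1) \land n_{tc}(w_1,w_2) \to A(w_2)\bigr) \\
  &= \mathrm{NoExit}[A,n]
\end{align*}
```

The first conjunct is a tautology once Z = P, so it drops out and the remaining implication is exactly NoExit[A,n].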

[Figure: the infinite-model example again, with n-chains from y and x: a TC-model where n* = ntc, and a model where n* ≠ ntc]

Example revisited • Two lists pointed to by x and y respectively • NoExit[r_{x,n}, n] • Axiom premise: ∀v1,v2. r_{x,n}(v1) ∧ n(v1,v2) → r_{x,n}(v2) • Axiom conclusion: ∀v1,v2. r_{x,n}(v1) ∧ ntc(v1,v2) → r_{x,n}(v2) • ⇒ disjointness: ∀v. r_{x,n}(v) → ¬r_{y,n}(v) • [Figure: nodes u, u', v, w connected by n, ntc, and ¬ntc edges]

NewStart • [Figure: a region colored A, with g and f edges and the paths gtc and ftc] • Premise: ∀w1,w2. A(w1) ∧ A(w2) ∧ g(w1,w2) → f(w1,w2) • NewStart[A,g,f] = (∀w1,w2. A(w1) ∧ A(w2) ∧ g(w1,w2) → f(w1,w2)) → ∀w1,w2. gtc(w1,w2) ∧ ¬ftc(w1,w2) → ∃w. ¬A(w) ∧ gtc(w1,w) ∧ gtc(w,w2)

NewStart • Important when updating fields • Prove no fields changed within A • Prove no incoming or no outgoing paths to A • Conclude no paths changed within A

Instantiating coloring axiom schemes • Coloring axioms are effective only if they can be automatically instantiated • Verification of imperative programs • Use boolean combinations of program variables and unary reachability • Exponential number of axioms

Incremental algorithm • Axioms are built as Premise → Conclusion • Both are closed formulas • Try to prove Premise and only then introduce Conclusion • Try boolean combinations in BFS order
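The control flow of the incremental algorithm can be sketched as follows. This is an illustration under stated assumptions, not the prototype's code: colors are conjunctions of literals over named unary predicates, the scheme is NoExit, and `prove_premise` is a hypothetical stand-in that evaluates the premise on one finite heap where the real system would call a theorem prover.

```python
from itertools import combinations, product


def candidate_colors(names, max_size):
    """Yield conjunctions of literals, smallest first (BFS by size).
    A color is a tuple of (predicate_name, polarity) pairs."""
    for size in range(1, max_size + 1):
        for subset in combinations(names, size):
            for polarities in product([True, False], repeat=size):
                yield tuple(zip(subset, polarities))


def instantiate_incrementally(names, max_size, prove_premise):
    """Collect colors whose NoExit premise was proved; only for these
    would the NoExit conclusion be added to the axiom set."""
    accepted = []
    for color in candidate_colors(names, max_size):
        if prove_premise(color):
            accepted.append(color)
    return accepted


def make_model_checker(n, preds):
    """Hypothetical stand-in for the prover: checks the NoExit premise
    (A is closed under n-edges) on a single finite heap."""
    def holds(color, v):
        return all((v in preds[name]) == polarity for name, polarity in color)

    def prove_premise(color):
        return all(holds(color, b) for (a, b) in n if holds(color, a))

    return prove_premise


# Illustrative heap: 1 -> 2 -> 3 (pointed to by x) and 4 -> 5.
n = {(1, 2), (2, 3), (4, 5)}
preds = {"r_x": {1, 2, 3}, "x": {1}}
accepted = instantiate_incrementally(
    ["r_x", "x"], 1, make_model_checker(n, preds)
)
```

With single-literal colors, the color x is rejected because the edge from node 1 leaves it, while r_x is accepted. Proving the premise first keeps the axiom set small even though the space of candidate colors is exponential.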

Prototype implementation • Used to automatically prove partial correctness (given loop invariants) of several interesting programs • Destructive reversal of singly linked list • Destructive append • Simple mark & sweep garbage collector • Use SPASS as underlying theorem prover

Completeness • TC-complete with respect to a theory • Finiteness is expressible with TC • TC-complete axiomatization implies FINITE-VALIDITY is decidable • No R.E. TC-complete axioms with respect to logic with 2 binary relation symbols encoding partial functions

Related work • Nelson’s axiomatization [Nelson ‘83] • Incomplete and follows from IND • Mark & Sweep • Updating transitive closure using FO [Dong, Su ‘95], [Hesse ‘03] • Induction [Bundy ’01] • Inductionless induction [Lankford ‘81] [Comon ‘01] • Decidable logics with TC (e.g. MSO)

Future work • New axioms • Finiteness • END[n]: ∀v. ∃w. ntc(v,w) ∧ ((∀u. ¬n(w,u)) ∨ (∃u. n(w,u) ∧ ntc(u,w))) • Fragments of FOL where axiomatization is possible • Integration with TVLA

Thank you
