The logic-automaton connection and applications Nils Klarlund Michael I. Schwartzbach Anders Møller Mona project Initiated at the University of Aarhus (BRICS) Google: Mona
Overview • Introduction • Pointer reasoning and start of project • Verification of protocols • Fast parsing with declarative constraints over trees • What is WS1S? • The Mona tool in use • What else has been accomplished
Automata • Regular expressions ↔ automata • Useful, right? • They solve problems such as • Expressing text patterns • Expressing paths in graphs • Are regular languages limiting our use of automata? • With a complement operator, regular expressions emulate propositional logic! • But no quantification with REs?
QPL • Quantified Propositional Logic is fundamental in verification • Boil your problem down to QPL, then solve • A compositional framework for “modeling” phenomena (albeit of limited expressive power) • How to solve? • Use BDDs • But they are automata • Albeit not general ones, they are acyclic
Mona in Essence: Extend QPL, Tie to Automata, Solve More Problems! • WS1S is the answer • It ties the class of all automata to a logic • It becomes a vehicle for the operations • Cross product, • Determinization, • Subset construction, • Projection, • Complementation
That Verification Problems or Data Types or Invariants Can Be Expressed As Regular Languages: Not a New Idea • E. W. Dijkstra (parameterized verification) • N. D. Jones & S. Muchnick (tree grammars) • A. Gupta (parameterized hardware) • Early 1960s: logic and automata for describing temporal behavior of sequential circuits
Motivation I: Pointers • Pointer manipulation in programs is very difficult to get right • Surely it shouldn’t be too difficult to verify that shapes in the heap stay invariant over a few operations? • No dangling pointers, all allocated memory accessible, no sharing of structures supposed to be separate • WF(store) • Let X be the nodes reachable from x and Y the nodes reachable from y • Intersection of X and Y is empty • Union of X and Y is all of the store
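The WF(store) condition above can be checked directly on a concrete heap. The following is a minimal sketch, assuming a hypothetical store represented as a dictionary from each cell to the list of cells it points to; it is a plain graph traversal for illustration, not the logic-based method the talk goes on to describe.

```python
def reachable(store, root):
    """Return the set of cells reachable from root by following pointers."""
    seen = set()
    stack = [root]
    while stack:
        cell = stack.pop()
        if cell in seen:
            continue
        seen.add(cell)
        stack.extend(store[cell])
    return seen


def well_formed(store, x, y):
    """WF(store): no dangling pointers, X and Y disjoint, X union Y is the store."""
    # A dangling pointer is a target that is not an allocated cell.
    if any(t not in store for targets in store.values() for t in targets):
        return False
    X = reachable(store, x)
    Y = reachable(store, y)
    return X.isdisjoint(Y) and (X | Y) == set(store)


# Hypothetical example stores (cell -> cells it points to).
good = {1: [2], 2: [], 3: [4], 4: []}    # two separate lists
shared = {1: [2], 2: [], 3: [2]}         # cell 2 is shared by both lists
leaky = {1: [], 2: [], 3: []}            # cell 3 is allocated but unreachable

print(well_formed(good, 1, 3), well_formed(shared, 1, 3), well_formed(leaky, 1, 2))
```

Such a check only covers one concrete store; the point of the logical formulation is to establish the property for all stores at once.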
Floyd-Hoare Logic of Pointers • F = (WF(S) & S → S’) => WF(S’) • Where → is the transition relation that reflects pointer surgery • Is this even decidable? • Yes, because we can formulate it in WS1S through predicate transformations • So, let’s build (1994) • A decision procedure for WS1S • A tool for translating WF predicates to WS1S • Establishing that F holds takes 4 hours to compute!
Additional work • Automatic Verification of Pointer Programs using Monadic Second-order Logic [PLDI ’97] • Pointer Assertion Logic Engine [PLDI ’01] • Related work on shape analysis • We still didn’t explain WS1S
Motivation II: Parameterized verification: Sliding Window Protocol (with Mark A. Smith) • A sequence number is used as an acknowledgement • The window size is the max. number of messages in transit • We model • Unbounded queues • Unbounded channels • Dynamic window size
We must prove: What goes out is what comes in
• Variables (D is a finite domain)
    SendBuf: Seq[D] := {},
    hSendBuf: Int := 1,
    W: Int := choose n where (n > 0),
    RetranBuf: Seq[D] := {},
    hRetranBuf: Int := 1,
    readyToSend: Bool := false,
    segment: D,
    seqNum: Int := 0,
    RcvBuf: Seq[D] := {},
    hRcvB: Int := 1,
    sendAck: Bool := false,
    temp: D,
    transitSR: Map[Int, Mset[D]] := empty,
    transitRS: Map[Int, Mset[A]] := empty
• Some variables flex in one, some even in two dimensions
What kind of code?
    internal prepareNewSeg(d)
      pre readyToSend = false /\ hSendBuf <= len(SendBuf) /\ d = SendBuf[hSendBuf] /\ len(RetranBuf) < W
      eff RetranBuf := RetranBuf |- d;
          seqNum := hSendBuf;
          hSendBuf := hSendBuf + 1;
          readyToSend := true;
          segment := d
    internal sendpktSR(d)
      pre readyToSend = true /\ d = segment
      eff readyToSend := false;
          transitSR := update(transitSR, seqNum, insert(d, transitSR[seqNum]))
We Note • Operations work on both ends of linear lists • We maintain pointers and length information • The rest is nitty-gritty, boring stuff • How is this related to regular languages? • The system as it evolves over time is not a regular language! • Are configurations regular languages? • Yes, if everything stretches in one dimension and • Indexing operations are not ‘too complicated’ • WS1S will make this precise • Do changes to configurations, that is, operations, preserve regularity? • WS1S again can help us understand
Motivation III: YakYak---A Fast Parser With Constraints on Parse Trees • Logic notations for parsing • > 69 different Yacc-like parsers available… • So what’s new: a concise, declarative way of specifying constraints on parse trees • That also yields a fast parser
YakYak • Consider HTML • An a element denotes a text anchor • Text is in bold if inside a bold element • Here are two constraints • “For all positions p with p an a element there is not a position q below p that is an a element” • “If any part of a text within an a element is in bold then all anchor texts must be in bold”
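To make the two constraints concrete, here is a small sketch that checks them by walking a parse tree directly. The tree encoding (a pair of tag and child list, with `#text` leaves) and the function names are assumptions made for illustration; YakYak itself compiles such constraints into automata rather than traversing the tree like this.

```python
def walk(node, in_a=False, in_b=False):
    """Yield (tag, in_a, in_b) for every node; the flags describe its proper ancestors."""
    tag, children = node
    yield tag, in_a, in_b
    for child in children:
        yield from walk(child, in_a or tag == "a", in_b or tag == "b")


def no_nested_anchors(tree):
    """Constraint 1: no a element occurs below another a element."""
    return not any(tag == "a" and in_a for tag, in_a, _ in walk(tree))


def anchor_bold_consistent(tree):
    """Constraint 2: if some anchor text is bold, then all anchor texts are bold."""
    bold = [in_b for tag, in_a, in_b in walk(tree) if tag == "#text" and in_a]
    return not any(bold) or all(bold)


# Hypothetical parse trees: (tag, [children]).
ok = ("p", [("a", [("b", [("#text", [])])]),
            ("b", [("a", [("#text", [])])])])
nested = ("a", [("p", [("a", [])])])
mixed = ("p", [("a", [("b", [("#text", [])]), ("#text", [])])])

print(no_nested_anchors(ok), anchor_bold_consistent(ok))
print(no_nested_anchors(nested), anchor_bold_consistent(mixed))
```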
How to Turn Such Constraints Into Automata? • Note we need tree automata • An XPath formulation is possible (if the parse tree were XML) • But an XML parse tree would be slow • And XPath query evaluation would be slow • Goal: one transition per production per constraint in a pre-computed automaton that works bottom-up • We need to go from formulas to tree automata!
What Is WS1S? • Weak Second-order theory of 1 Successor • First-order terms t • 0, p, t’ + 1 • Second-order terms T • Empty, P, T union T’, T intersection T’ • Formulas • f & f’, ~f, f v f’ • t = t’, t < t’, t in T, b • ex2 P: f • ex1 p: f • ex0 b: f
A. Meyer’s Result • Deciding WS1S is non-elementary • No finite stack of exponentials can limit the growth • Each quantifier bumps you up one exponential • Recall: first experiment for very simple example took 4 hours to complete • Some people have suggested that we should have given up at this point • For more on this viewpoint • Google: Klarlund madman
Example
    var2 P,Q;
    P\Q = {0,4} union {1,2};
    var1 x;
    var0 A;
    ex2 Q: x in Q
           & (all1 q: (0 < q & q <= x) => (q in Q => q - 1 notin Q) & (q notin Q => q - 1 in Q))
           & 0 in Q;
    A & x notin P;
Mona Output
    A counter-example of least length (1) is:
    P  X X
    Q  X X
    x  X 1
    A  0 X

    P = {}, Q = {}, x = 0, A = false

    A satisfying example of least length (7) is:
    P  X 1110100
    Q  X 000X0XX
    x  X 0000001
    A  1 XXXXXXX

    P = {0,1,2,4}, Q = {}, x = 6, A = true
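The least satisfying length reported above can be cross-checked by brute force. The sketch below enumerates all assignments over positions 0..n-1 for growing n and evaluates the example formula directly. It assumes that a model of length n only uses positions below n, and the helper names are made up for illustration. Mona itself does not enumerate; it translates the formula to an automaton.

```python
from itertools import combinations


def subsets(n):
    """All subsets of the positions {0, ..., n-1}."""
    for k in range(n + 1):
        for combo in combinations(range(n), k):
            yield frozenset(combo)


def inner(x, n):
    """ex2 Q: x in Q & (all1 q: 0 < q <= x => Q alternates at q) & 0 in Q."""
    for Q in subsets(n):
        # (q in Q => q-1 notin Q) & (q notin Q => q-1 in Q) is an exclusive or.
        alternates = all((q in Q) != (q - 1 in Q) for q in range(1, x + 1))
        if x in Q and 0 in Q and alternates:
            return True
    return False


def models(n):
    """Yield every (P, Q, x, A) over positions < n satisfying the whole example."""
    target = frozenset({0, 1, 2, 4})            # {0,4} union {1,2}
    inner_ok = {x: inner(x, n) for x in range(n)}
    for P in subsets(n):
        for Q in subsets(n):
            if P - Q != target:                 # P\Q = {0,4} union {1,2}
                continue
            for x in range(n):
                if inner_ok[x] and x not in P:  # A & x notin P forces A = true
                    yield P, Q, x, True


def least_model_length(limit=8):
    """Return the smallest n with a model, together with all models of that length."""
    for n in range(1, limit + 1):
        found = list(models(n))
        if found:
            return n, found
    return None, []


n, found = least_model_length()
print(n, sorted({x for _, _, x, _ in found}))
```

The quantified subformula says that Q alternates starting with 0 in Q, so it holds exactly when x is even; with 0, 2 and 4 forced into P, the smallest admissible x is 6, which needs a string of length 7.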
A BDD Represents a Boolean Function of Boolean Variables • BDD = Binary Decision Diagram • Example: x1 or (x2 iff x3) • Often the diagram is very sparse
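As an illustration, the sketch below builds a reduced ordered BDD for the slide's function x1 or (x2 iff x3) using the variable order x1, x2, x3. It expands the full truth table, which is exponential and only suitable for tiny examples; real BDD packages, including the one inside Mona, build diagrams compositionally. The function and variable names are assumptions.

```python
def build_bdd(f, num_vars):
    """Build a reduced ordered BDD for f by Shannon expansion over x1, x2, ..."""
    unique = {}  # (var, low, high) -> node id, so identical subgraphs are shared

    def node(var, low, high):
        if low == high:                    # the test on var is redundant: skip it
            return low
        key = (var, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2  # ids 0 and 1 are the terminal nodes
        return unique[key]

    def expand(assignment):
        if len(assignment) == num_vars:
            return int(f(*assignment))
        low = expand(assignment + (False,))
        high = expand(assignment + (True,))
        return node(len(assignment) + 1, low, high)

    return expand(()), unique


root, table = build_bdd(lambda x1, x2, x3: x1 or (x2 == x3), 3)
for (var, low, high), ident in sorted(table.items(), key=lambda item: item[1]):
    print(f"node {ident}: x{var}, low -> {low}, high -> {high}")
```

Of the eight truth-table rows, only four internal nodes remain: one for x1, one for x2, and two for x3.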
Now Formulate Algorithms • Keep automata determinized and minimal • Cross product (for & and v) • Projection for existential quantification • Subset construction for determinization • Minimization
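A rough sketch of three of these operations on explicit automata follows. It assumes complete DFAs stored as transition dictionaries, and all class and function names are illustrative. Projection (dropping one track of the alphabet) is what makes an automaton nondeterministic, which is why the subset construction is needed afterwards; minimization is omitted here.

```python
from itertools import product as pairs


class DFA:
    def __init__(self, alphabet, delta, start, accepting):
        self.alphabet, self.delta = alphabet, delta
        self.start, self.accepting = start, set(accepting)
        self.states = {start} | {q for q, _ in delta} | set(delta.values())

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.delta[state, symbol]
        return state in self.accepting


def complement(a):
    """Negation: swap accepting and non-accepting states of a complete DFA."""
    return DFA(a.alphabet, a.delta, a.start, a.states - a.accepting)


def intersection(a, b):
    """Conjunction: cross product running both automata in lockstep."""
    delta = {((p, q), s): (a.delta[p, s], b.delta[q, s])
             for p, q in pairs(a.states, b.states) for s in a.alphabet}
    accepting = {(p, q) for p in a.accepting for q in b.accepting}
    return DFA(a.alphabet, delta, (a.start, b.start), accepting)


def determinize(alphabet, nfa_delta, start, accepting):
    """Subset construction; nfa_delta maps (state, symbol) to a set of states."""
    start_set = frozenset({start})
    delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        current = todo.pop()
        for s in alphabet:
            target = frozenset(t for q in current for t in nfa_delta.get((q, s), ()))
            delta[current, s] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    final = {S for S in seen if S & set(accepting)}
    return DFA(alphabet, delta, start_set, final)


# Hypothetical examples over the alphabet {0, 1}.
even_ones = DFA("01", {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}, 0, {0})
ends_in_one = DFA("01", {("a", "0"): "a", ("a", "1"): "b",
                         ("b", "0"): "a", ("b", "1"): "b"}, "a", {"b"})
both = intersection(even_ones, ends_in_one)
# NFA accepting words whose second-to-last symbol is 1.
second_last = determinize("01", {(0, "0"): {0}, (0, "1"): {0, 1},
                                 (1, "0"): {2}, (1, "1"): {2}}, 0, {2})
print(both.accepts("11"), complement(both).accepts("1"), second_last.accepts("110"))
```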
Three and Six-valued Logic • To really make Mona work, we had to overcome spurious state space explosions • They were a direct consequence of working in only a two-valued logic! • The problem: say you want to model {green, blue, red}. You need two bits, say X and Y • 00=green, 01=blue, 10=red. Then, what is the truth status of formulas when XY=11? • For more, see [J. HOSC, to appear] • Many more tricks in [IJFCS 2002]
Applications of Mona • Debian/GNU package; also AIX package • Integrated with PVS, a leading theorem proving environment, SRI • Used as essential tool in Ph.D. theses and other research such as • natural language processing (Ohio) • duration calculus verifier (Mumbai) • Mona as decision procedure for description logics (Dresden) • verification of parameterized systems (Kiel) • verification and reachability (Uppsala) • multimedia applications (Kent) • automata-based representations for arithmetic (Santa Barbara) • Presburger arithmetic (Synopsys) • automata in control synthesis (Aarhus) • acceleration of counter automata (Cachan) • verification of structures in imperative programs (Tel Aviv) • high-level language for verification (Toulouse) • a WS2S specification language (Freiburg) • YakYak parser generator (Aarhus) • PALE pointer engine (Aarhus) • Google Mona for home page with many papers online
Explain Automatic Pointer Reasoning I • points-to(a,b) iff the cell at a contains a pointer to b • This predicate is definable for a well-formed (WF) store (because of list/tree assumptions) • Assume we want to verify {P}S{Q} • S is straight-line code, say p^.next := x • The store after is the same as before except that the predicate points-to(a,b) has changed for a=p
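The effect of p^.next := x on points-to can be sketched as follows, assuming a hypothetical store in which every cell has a single next field (a dictionary from cell to successor, with None for nil). The second function states the updated predicate purely in terms of the old store, which is the shape of rewriting needed to express the post-state in the logic.

```python
def assign_next(points_to, p, x):
    """Return the store after p^.next := x, leaving the input store unchanged."""
    after = dict(points_to)
    after[p] = x
    return after


def points_to_after(points_to, p, x, a, b):
    """points-to'(a, b), expressed over the store before the assignment."""
    if a == p:
        return b == x                 # only the cell at p has changed
    return points_to.get(a) == b      # every other cell is as before


# Hypothetical list 1 -> 2 -> 3 -> nil; the assignment makes cell 1 skip cell 2.
before = {1: 2, 2: 3, 3: None}
after = assign_next(before, 1, 3)
print(after, points_to_after(before, 1, 3, 1, 3), points_to_after(before, 1, 3, 1, 2))
```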
Explain Automatic Pointer Reasoning II • Let Q’ be Q, rewritten to account for the a=p situation • The WF property can be expressed using least fixed points in WS1S (or WS2S) based on the points-to predicate • WF is assumed in the initial store by the storage layout model • So, we need to verify P => Q’ & WF’
Explain Automatic Pointer Reasoning III • Sometimes we need an invariant • x is only empty if y is empty, and p points to the last element of z • (x = nil => y = nil) & z <next*> p & (z <> nil => p^.next = nil)