
The Benefits of Exposing Calls and Returns

The Benefits of Exposing Calls and Returns. Rajeev Alur University of Pennsylvania. CONCUR/SPIN, August 2005. Software Model Checking. Control flow graph + Boolean vars (Pushdown automata). Observables. Predicate abstraction. Abstractor. Code. Model. Temporal logics/Automata



Presentation Transcript


  1. The Benefits of Exposing Calls and Returns Rajeev Alur University of Pennsylvania CONCUR/SPIN, August 2005

  2. Software Model Checking • Code is passed through an abstractor (predicate abstraction over observables) to obtain a model: control flow graph + Boolean vars (pushdown automata) • Specification: temporal logics/automata (regular!) • Verifier: on-the-fly explicit state, or symbolic fixpoint evaluation • Verifier output: yes, or a counter-example

  3. Regular specifications not expressive enough • Classical Hoare-style pre/post conditions • If p holds when procedure A is invoked, q holds upon return • Total correctness: every invocation of A terminates • Integral part of emerging standard JML • Stack inspection properties: security/access control • If a setuid bit is being set, process root must be in the call stack • Inter-procedural data-flow analysis • An expression e is very busy at a control point p if on all paths from p, e will be used before any of its variables (possibly local in current procedure) are modified. These need matching of calls with returns, or finding pending calls, or local paths: context-free properties!

  4. Checking Context-free Specifications • Obstacles • Context-free languages are not closed under intersection • Checking context-free properties against context-free models is undecidable • However, many such properties are verifiable • Existing work in security that handles some stack inspection properties [JMT99, JKS03] • Adding assert statements in the program (with additional local variables, if needed), and then checking regular properties (e.g. reachability) amounts to checking context-free properties • Inter-procedural data-flow analysis algorithms [RHS95]

  5. Exposing Calls and Returns • What’s common to the checkable properties? • Both model and property have their own stack, but the two stacks are synchronized and grow/shrink together! • As a generator, program exposes its calls and returns, and as an acceptor, property pushes on calls and pops on returns Formalization of this intuition: Visibly Pushdown Languages A surprisingly robust class of languages with properties like the regular languages and potentially many applications

  6. Talk Outline • Visibly Pushdown Languages • Temporal Logic CaRet and Model Checking • Ongoing Work References: • Visibly pushdown languages. Alur, Madhusudan; STOC’04 • A temporal logic of nested calls and returns. Alur, Etessami, Madhusudan; TACAS’04 • Congruences for visibly pushdown languages. Alur, Kumar, Madhusudan, Viswanathan; ICALP’05

  7. Context-free Languages: Recap (1/2) • Given an alphabet Σ, a language L is a set of finite words over Σ • A pushdown automaton (PDA) has a finite control and a stack, and while reading a word, it can push/pop stack symbols while updating control state • Configuration of a PDA: control state + a string of stack symbols • Acceptance defined by empty stack or final state • A language L is a context-free language (CFL) if there is a pushdown automaton that accepts it • Sample CFLs • All regular languages • Set of words of the form a^n b^n, for some n • Set of words with equal number of a and b symbols • Non-CFL: Set of words of the form a^n b^n c^n

  8. Context-free Languages: Recap (2/2) • Alternative characterization: Context-free grammars • Natural and popular for defining syntax • Nondeterministic PDAs are more expressive than deterministic ones • Emptiness of a PDA solvable in polynomial-time • Closed under union, but not closed under intersection or complementation • Language inclusion, emptiness of intersection undecidable • Applications: Parsing, Natural language processing, Program analysis…

  9. Exposing Calls and Returns • Pushdown alphabet: partitioned into 3 disjoint sets, Σ = Σpush ∪ Σpop ∪ Σlocal • Pushdown words: finite words over a pushdown alphabet Σ • A visibly pushdown automaton over a pushdown alphabet Σ is a pushdown automaton that • pushes a symbol onto the stack on a symbol in Σpush • pops the stack on a symbol in Σpop • cannot change the stack on a symbol in Σlocal. Key: Stack size at any time is determined by the input word, but not by the control state or stack content
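
The key observation on this slide can be illustrated with a minimal Python sketch. The function name `stack_heights` and the rule that a pop on an empty stack leaves the height at 0 are assumptions made for this example; the symbols are borrowed from the program-analysis slide later in the talk.

```python
from typing import Iterable, List, Set


def stack_heights(word: Iterable[str], push: Set[str], pop: Set[str]) -> List[int]:
    """Return the stack height after each input symbol.

    Only the word and the partition are consulted: no control state and
    no stack content. Assumed convention: a pop on an empty stack leaves
    the height at 0.
    """
    heights = []
    height = 0
    for symbol in word:
        if symbol in push:
            height += 1
        elif symbol in pop:
            height = max(height - 1, 0)
        # any other symbol is treated as local and leaves the stack alone
        heights.append(height)
    return heights


PUSH = {"call-p", "call-q"}
POP = {"ret-p", "ret-q"}
example = ["call-p", "skip", "call-q", "mod-y", "ret-q", "used-e", "ret-p"]
print(stack_heights(example, PUSH, POP))  # [1, 1, 2, 2, 1, 1, 0]
```

Any two visibly pushdown automata over the same partition therefore have stacks of the same height at every point of the same input, which is what later makes product constructions possible.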

  10. Visibly pushdown languages (VPL) • A language L is a VPL over a pushdown alphabet Σ, if there is a visibly pushdown automaton that accepts it (acceptance by final state) • The language {a^n b^n | for some n} • VPL if a is in Σpush and b is in Σpop • Not a VPL for other partitions • The language of words with equal number of a and b symbols is not a VPL (independent of partition) • Every regular language L is a VPL independent of partitioning • Dyck language (words with well-balanced parentheses) is a VPL provided left/right parentheses are in Σpush/Σpop resp.
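
As a hedged illustration of the a^n b^n example, here is a small deterministic visibly pushdown automaton in Python. The class `DVPA`, the state names `s`, `q`, `r`, `f`, the stack symbols `Z` and `A`, and the choice to include n = 0 are all assumptions of this sketch, not taken from the talk.

```python
BOTTOM = "$"  # what a pop "sees" when the stack is empty


class DVPA:
    """Deterministic visibly pushdown automaton, acceptance by final state."""

    def __init__(self, push, pop, start, finals, d_push, d_pop, d_local=None):
        self.push, self.pop = set(push), set(pop)
        self.start, self.finals = start, set(finals)
        self.d_push = d_push            # (state, a) -> (next_state, stack_symbol)
        self.d_pop = d_pop              # (state, a, top_or_BOTTOM) -> next_state
        self.d_local = d_local or {}    # (state, a) -> next_state

    def accepts(self, word) -> bool:
        state, stack = self.start, []
        for a in word:
            if a in self.push:
                step = self.d_push.get((state, a))
                if step is None:
                    return False        # missing transition: reject
                state, gamma = step
                stack.append(gamma)
            elif a in self.pop:
                top = stack[-1] if stack else BOTTOM
                state = self.d_pop.get((state, a, top))
                if state is None:
                    return False
                if stack:
                    stack.pop()
            else:
                state = self.d_local.get((state, a))
                if state is None:
                    return False
        return state in self.finals


# a is a push symbol, b is a pop symbol. "Z" marks the first a, so the
# automaton can tell when the last b has been matched.
anbn = DVPA(
    push={"a"}, pop={"b"}, start="s", finals={"s", "f"},
    d_push={("s", "a"): ("q", "Z"), ("q", "a"): ("q", "A")},
    d_pop={("q", "b", "A"): "r", ("r", "b", "A"): "r",
           ("q", "b", "Z"): "f", ("r", "b", "Z"): "f"},
)

print(anbn.accepts("aabb"), anbn.accepts("aab"))  # True False
```

Note that the automaton never decides whether to push or pop; the partition of the alphabet does. That is why the same language is not a VPL under other partitions.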

  11. VPLs in Program Analysis bool P(u:int) { global int x; local int y; … a: if Q { x = (x+y) }; … } bool Q { local int y; if { …. y++; return 1;} else return P(x) } To figure out whether the expression e=(x+y) is very busy at program point a, use the pushdown alphabet Σpush = {call-p, call-q}, Σpop = {ret-p, ret-q}, Σlocal = {used-e, mod-x, mod-y, skip}. Executions are pushdown words, e.g. call-q, skip, mod-y, ret-q, used-e, mod-x, skip, ret-p. The set of executions starting at location a is a VPL: La. The set of executions in which e is very busy is also a VPL: Le. e is very busy at a if La is included in Le

  12. VPLs for Document Processing: XML document query processing. XML document: <conference> <name> CONCUR 2005 </name> <location> <city> San Francisco </city> <hotel> Stanford Court </hotel> </location> <sponsor> CISCO </sponsor> <sponsor> Microsoft </sponsor> … </conference> Pushdown alphabet: Σpush = {<name>, <location>, …}, Σpop = {</name>, </location>, …}, Σlocal = {San Francisco, Microsoft, …}. A document d is a pushdown word. Sample query: find documents related to conferences sponsored by Microsoft in San Francisco. Specify the query as a VPL L. Analysis: membership question. Does document d satisfy query L? Use VPAs instead of tree automata! (typically, no recursion, but only hierarchy)
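
The view of a document as a pushdown word can be sketched in Python as follows. This is an illustration under stated assumptions: the tokenizer regex, the function names `to_pushdown_word` and `matches_query`, and the reading of the sample query as "city is San Francisco and some sponsor is Microsoft" are choices made for this example. The evaluator is hand-written for this one query; it is not a VPA compiled from a query language.

```python
import re
from typing import List, Tuple

# <tag> is a push symbol, </tag> a pop symbol, text in between is local.
TOKEN = re.compile(r"<(/?)([A-Za-z][\w-]*)>|([^<>]+)")


def to_pushdown_word(doc: str) -> List[Tuple[str, str]]:
    word = []
    for m in TOKEN.finditer(doc):
        closing, tag, text = m.group(1), m.group(2), m.group(3)
        if tag is not None:
            word.append(("pop" if closing else "push", tag))
        elif text.strip():
            word.append(("local", text.strip()))
    return word


def matches_query(word: List[Tuple[str, str]]) -> bool:
    """One left-to-right pass; the stack holds the currently open tags."""
    stack: List[str] = []
    city_ok = sponsor_ok = False
    for kind, value in word:
        if kind == "push":
            stack.append(value)
        elif kind == "pop":
            if not stack or stack.pop() != value:
                return False  # tags are not well matched
        else:
            if stack == ["conference", "location", "city"] and value == "San Francisco":
                city_ok = True
            if stack == ["conference", "sponsor"] and value == "Microsoft":
                sponsor_ok = True
    return not stack and city_ok and sponsor_ok


doc = ("<conference> <name> CONCUR 2005 </name> <location> <city> San Francisco "
       "</city> <hotel> Stanford Court </hotel> </location> <sponsor> CISCO "
       "</sponsor> <sponsor> Microsoft </sponsor> </conference>")
print(matches_query(to_pushdown_word(doc)))  # True
```

Because the stack moves are dictated by the tags themselves, the check runs online over the token stream without building a tree.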

  13. Closure Properties Note: can’t combine languages wrt different partitions • Closed under intersection: Given two VPAs A and B, build a product C accepting intersection of L(A) and L(B) • State of C: (state of A, state of B) • Stack symbol of C: (stack symbol of A, stack symbol of B) • C can simulate the stacks of A and B together • Closed under union • Closed under complementation • Closed under concatenation and Kleene-* • Closed under partition-preserving homomorphisms
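
The product construction for intersection can be sketched in Python. The `VPA` dataclass, the dictionary encoding of transitions, and the two example automata (balanced parentheses, and an even number of `x` symbols) are assumptions introduced for this illustration; both automata here are deterministic, and a missing transition means rejection.

```python
from dataclasses import dataclass

BOTTOM = "$"  # what a pop "sees" when the stack is empty


@dataclass
class VPA:
    push: frozenset
    pop: frozenset
    start: object
    finals: frozenset
    d_push: dict   # (state, a) -> (next_state, stack_symbol)
    d_pop: dict    # (state, a, top_or_BOTTOM) -> next_state
    d_local: dict  # (state, a) -> next_state

    def accepts(self, word) -> bool:
        state, stack = self.start, []
        for a in word:
            if a in self.push:
                step = self.d_push.get((state, a))
                if step is None:
                    return False
                state, gamma = step
                stack.append(gamma)
            elif a in self.pop:
                top = stack[-1] if stack else BOTTOM
                state = self.d_pop.get((state, a, top))
                if state is None:
                    return False
                if stack:
                    stack.pop()
            else:
                state = self.d_local.get((state, a))
                if state is None:
                    return False
        return state in self.finals


def product(A: VPA, B: VPA) -> VPA:
    """States and stack symbols of the product are pairs."""
    assert A.push == B.push and A.pop == B.pop, "partitions must agree"
    d_push = {((p, q), a): ((p2, q2), (g, h))
              for (p, a), (p2, g) in A.d_push.items()
              for (q, b), (q2, h) in B.d_push.items() if a == b}
    d_pop = {}
    for (p, a, g), p2 in A.d_pop.items():
        for (q, b, h), q2 in B.d_pop.items():
            # The stacks grow and shrink together, so both are empty or neither is.
            if a == b and (g == BOTTOM) == (h == BOTTOM):
                top = BOTTOM if g == BOTTOM else (g, h)
                d_pop[((p, q), a, top)] = (p2, q2)
    d_local = {((p, q), a): (p2, q2)
               for (p, a), p2 in A.d_local.items()
               for (q, b), q2 in B.d_local.items() if a == b}
    finals = frozenset((p, q) for p in A.finals for q in B.finals)
    return VPA(A.push, A.pop, (A.start, B.start), finals, d_push, d_pop, d_local)


balanced = VPA(frozenset("("), frozenset(")"), "e", frozenset({"e"}),
               {("e", "("): ("n", "Z"), ("n", "("): ("n", "P")},
               {("n", ")", "P"): "n", ("n", ")", "Z"): "e"},
               {("e", "x"): "e", ("n", "x"): "n"})
even_x = VPA(frozenset("("), frozenset(")"), 0, frozenset({0}),
             {(0, "("): (0, "_"), (1, "("): (1, "_")},
             {(s, ")", t): s for s in (0, 1) for t in ("_", BOTTOM)},
             {(0, "x"): 1, (1, "x"): 0})
both = product(balanced, even_x)
print(both.accepts("(x)x"), both.accepts("(x)"))  # True False
```

The construction relies on the shared partition: with different partitions the two stacks would not stay synchronized, which is the reason for the note at the top of the slide.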

  14. Determinization • Given a nondeterministic VPA A, we can construct a deterministic VPA B that accepts the same language and has size exponential in A • Potentially useful for building runtime monitors for checking program executions, and online algorithms for XML query processing • VPLs are a subclass of DCFLs (languages defined by deterministic PDAs) • DCFLs not closed under union • Equivalence problem for DCFLs decidable, but complex

  15. Determinization: Sketch of the construction • Determinization of nondeterministic automata uses subset construction: a state R of B is a set of states of A (the states that A can be in, having read the word w so far) • Subset construction does not apply to stack • But we can do subsets of summaries: if w is a well-matched word, (q,q’) is a summary of A on w, if A can go from (q,$) to (q’,$), where $ is stack bottom • More precisely, if w = w_1 c_1 w_2 c_2 … c_n w_{n+1}, where the c_i’s are calls and the w_i’s are well-matched words, then after reading w, the determinized automaton B has • Stack (S_n,R_n,c_n), …, (S_1,R_1,c_1) $ • Control state (S_{n+1},R_{n+1}) • R_i = set of all states A can be in after w_1 c_1 … w_i • S_i = set of all summaries of A on the segment w_i
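
The summary-based construction can be sketched in Python by running the determinized automaton on the fly over one input word, rather than building all of its states up front. The class `NVPA`, the dictionary encoding, and the example language (some call whose matching return is immediately followed by `a`) are assumptions of this sketch; the update rules at returns follow the standard summary composition and are not spelled out on the slide.

```python
BOTTOM = "$"  # what a return "sees" when the stack is empty


class NVPA:
    """Nondeterministic VPA; transition dicts map to sets of successors."""

    def __init__(self, states, push, pop, start, finals, d_call, d_ret, d_loc):
        self.states, self.push, self.pop = set(states), set(push), set(pop)
        self.start, self.finals = start, set(finals)
        self.d_call = d_call  # (q, c) -> {(q2, gamma)}
        self.d_ret = d_ret    # (q, r, gamma_or_BOTTOM) -> {q2}
        self.d_loc = d_loc    # (q, a) -> {q2}


def run_determinized(A: NVPA, word) -> bool:
    ident = frozenset((q, q) for q in A.states)
    S, R = ident, frozenset({A.start})  # summaries and reachable states
    stack = []                          # entries (S_i, R_i, c_i)
    for a in word:
        if a in A.push:
            stack.append((S, R, a))
            R = frozenset(q2 for q in R for (q2, _g) in A.d_call.get((q, a), ()))
            S = ident                   # a fresh well-matched segment starts
        elif a in A.pop and stack:
            S1, R1, c = stack.pop()
            # Summaries of the whole segment "c w a", with w summarized by S.
            update = {(q, q3)
                      for q in A.states
                      for (q1, g) in A.d_call.get((q, c), ())
                      for (p, q2) in S if p == q1
                      for q3 in A.d_ret.get((q2, a, g), ())}
            S = frozenset((q, q3) for (q, p) in S1 for (p2, q3) in update if p == p2)
            R = frozenset(q3 for q in R1 for (p2, q3) in update if q == p2)
        elif a in A.pop:                # return on the empty stack
            S = frozenset((q, q3) for (q, q2) in S
                          for q3 in A.d_ret.get((q2, a, BOTTOM), ()))
            R = frozenset(q3 for q in R for q3 in A.d_ret.get((q, a, BOTTOM), ()))
        else:
            S = frozenset((q, q3) for (q, q2) in S for q3 in A.d_loc.get((q2, a), ()))
            R = frozenset(q3 for q in R for q3 in A.d_loc.get((q, a), ()))
    return bool(R & A.finals)


# Guess a call (stack symbol "g"), wait in w for its return, then expect "a".
guess = NVPA(
    states={"s", "w", "t", "f"}, push={"c"}, pop={"r"}, start="s", finals={"f"},
    d_call={("s", "c"): {("s", "n"), ("w", "g")}, ("w", "c"): {("w", "n")},
            ("f", "c"): {("f", "n")}},
    d_ret={("s", "r", "n"): {"s"}, ("s", "r", BOTTOM): {"s"},
           ("w", "r", "n"): {"w"}, ("w", "r", "g"): {"t"},
           ("f", "r", "n"): {"f"}, ("f", "r", BOTTOM): {"f"}},
    d_loc={("s", "a"): {"s"}, ("w", "a"): {"w"}, ("t", "a"): {"f"},
           ("f", "a"): {"f"}},
)
print(run_determinized(guess, "ccrra"), run_determinized(guess, "crca"))  # True False
```

The stack of the deterministic run has exactly the height of the nondeterministic one, so the run is suitable for monitoring: it never backtracks and keeps one set of summaries per pending call.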

  16. Decision Problems • Emptiness: Given a VPA A, is its language empty? • Same as for PDAs: Polynomial-time complete (cubic) • Language inclusion (or equivalence): Given VPAs A and B, is language of A contained in that of B? • Determinize B, take its complement, take product with A, and test for emptiness • Exponential-time complete • Recall: Inclusion is PSPACE-complete for (nondeterministic) finite automata, and undecidable for PDAs

  17. VPL Properties Summary
      Class     Closed under ∪   Closed under ∩   Closed under complement   Emptiness   Inclusion
      Regular   Yes              Yes              Yes                       NLOG        PSPACE
      CFL       Yes              No               No                        PTIME       Undec
      DCFL      No               No               Yes                       PTIME       Undec
      VPL       Yes              Yes              Yes                       PTIME       EXPTIME

  18. Pushdown Words as Binary Trees Let w = i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3 [Figure: the stack tree of w]

  19. VPL: Connection to tree languages Tree-language characterization: Let L be a set of pushdown words and let ST(L) be the set of stack trees that correspond to L. Theorem: L is a VPL iff ST(L) is a regular tree language Note: It is well-known that the set of parse trees corresponding to a context-free grammar is a regular tree language

  20. Finite word automata that can jump Let w = i5 c1 i1 c2 i4 i3 i3 r2 c1 i1 r1 r1 i5 i3 • Summary Automata • Finite-state automaton that reads a pushdown word • While reading a call, can send a copy to the matching return • δ(q,a) is a set of pairs of states if a is in Σpush • Nondeterministic summary automata are expressively equivalent to VPAs • Deterministic VPA (= VPL) > Deterministic summary automata > Deterministic tree automata (on stack trees)

  21. Robustness: Alternative Characterizations • Monadic second order logic with matching predicate • m(x,y) means x is a call and y is the matching return • Sample formula: forall x. if p(x) then exists y,z. ( q(y) and x<y<z and m(x,z) ) • Thm: MSO + matching predicate interpreted over pushdown words is expressively equivalent to VPLs • Thm: Every CFL is a homomorphic image of a VPL • Context-free grammar based characterization • Two types of non-terminals: V0 (matched words) and V1 • All productions are of the form • X → a, where if X is in V0 then a must be local • X → a Y b Z, where a is a call, b is a return, Y is in V0, and if X is in V0 then Z must be in V0
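
The matching predicate and the sample formula can be evaluated directly on a finite pushdown word, as in the following Python sketch. Treating p and q as sets of letters, and the function names `matching` and `sample_formula`, are assumptions made for this illustration.

```python
from typing import Dict, Sequence, Set


def matching(word: Sequence[str], push: Set[str], pop: Set[str]) -> Dict[int, int]:
    """Return {x: y} for every pair of positions with m(x, y)."""
    stack, m = [], {}
    for i, a in enumerate(word):
        if a in push:
            stack.append(i)
        elif a in pop and stack:
            m[stack.pop()] = i
    return m


def sample_formula(word, push, pop, p: Set[str], q: Set[str]) -> bool:
    """forall x. p(x) -> exists y, z. q(y) and x < y < z and m(x, z)."""
    m = matching(word, push, pop)
    for x, a in enumerate(word):
        if a in p:
            z = m.get(x)  # m(x, z) can only hold if x is a matched call
            if z is None or not any(word[y] in q for y in range(x + 1, z)):
                return False
    return True


print(matching("ccrr", {"c"}, {"r"}))                       # {1: 2, 0: 3}
print(sample_formula("ccqrr", {"c"}, {"r"}, {"c"}, {"q"}))  # True
print(sample_formula("cqrcr", {"c"}, {"r"}, {"c"}, {"q"}))  # False
```

With p taken to be the call symbol, the formula says that every call returns and that a q-position occurs strictly between the call and its return.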

  22. “Regular-like” properties continue.. • Congruences and minimization (Myhill-Nerode Theorem) central to theory of regular languages • Given a language L, for well-matched words u and v, define u ~L v iff for all words x and y, xuy in L iff xvy in L • Theorem: A language L of well-matched words is a VPL iff the congruence ~L is of finite index • Minimization • No unique minimal deterministic VPA in general, but… • Minimization of (single-entry) RSMs (i.e. procedural boolean programs) possible. Partitioning into k procedures/modules is adequate to get canonicity!

  23. ω-VPL: Extension to Infinite Words • A Büchi VPA: • VPA over infinite pushdown words • A word is accepted if, along some run, states in the set F are seen infinitely often • ω-VPL: class of languages accepted by Büchi VPAs • ω-VPL is closed under all Boolean operations • The characterizations using regular trees and MSO still hold • However, ω-VPLs are not determinizable! • Let L be the set of all words such that the stack is repeatedly bounded, i.e. for some n, the stack depth is n infinitely often • L is an ω-VPL, but there is no deterministic (Muller) VPA for it • Language inclusion and equivalence are still decidable

  24. Talk Outline • Visibly Pushdown Languages • Temporal Logic CaRet and Model Checking • Ongoing Work

  25. Software Model Checking • Code is passed through an abstractor (predicate abstraction over observables) to obtain a model: control flow graph + Boolean vars (pushdown automata) • Specification: CaRet/VPAs • Verifier output: yes, or a counter-example

  26. Abstracting Software. Program: int x, y; if x>0 { ……. y=x+1 .…… } else { …… y=x+1 …… } Predicates: bx: x>0, by: y>0. Boolean program: bool bx, by; if bx { ……… by=true ……… } else { ………… by={true,false} ………. }
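
The abstraction of the assignment y = x+1 under the predicates bx: x>0 and by: y>0 can be illustrated by a small Python experiment. This is not how a predicate abstraction tool works (such tools typically query a theorem prover); enumerating a finite sample of integers is a stand-in chosen for this sketch, and the function name `abstract_assignment` and the sample range are assumptions.

```python
def abstract_assignment(update, guard, predicate, samples):
    """Truth values the predicate can take after `update`, over the
    sampled inputs that satisfy `guard`."""
    return {predicate(update(x)) for x in samples if guard(x)}


samples = range(-50, 51)

# then-branch: bx holds (x > 0), then y = x + 1
then_by = abstract_assignment(lambda x: x + 1, lambda x: x > 0,
                              lambda y: y > 0, samples)
# else-branch: bx does not hold (x <= 0), then y = x + 1
else_by = abstract_assignment(lambda x: x + 1, lambda x: x <= 0,
                              lambda y: y > 0, samples)

print(sorted(then_by))  # [True]
print(sorted(else_by))  # [False, True]
```

The results agree with the Boolean program on the slide: by=true in the then-branch, and the nondeterministic choice by={true,false} in the else-branch, because x = 0 gives y = 1 while x < 0 gives y <= 0.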

  27. Abstracting Modular Programs. Program: main() { bool y; … x = P(y); … z = P(x); … } bool P(u: bool) { … return Q(u); } bool Q(w: bool) { if … else return P(~w) } Model: recursive state machine (RSM) / pushdown automaton. [Figure: RSM with components A1, A2, A3; boxes stand for function calls, and each component has entry/input and exit/output nodes]

  28. Linear-time Propositional Temporal Logic Q ::= p | not Q | Q or Q’ | Next Q | Always Q | Eventually Q | Q Until Q’ Interpreted over (infinite) sequences. The set of models of an LTL formula is an ω-regular language. Useful for stating sequencing properties: • If req happens, then req holds until it is granted: Always ( req → (req Until grant) ) • An exception is never raised: Always ( not Exception )

  29-34. CARET: A temporal logic for Calls and Returns • Expresses context-free properties • Global successor: the successor used by LTL • Local successor (also called abstract successor): jump from calls to returns; otherwise global successor at the same level • Local path: the path obtained by following local successors • Caller modality: jump to the caller of the current module; defined for every position except top-level ones • Caller path gives the stack content! [Figure: an execution through nested modules A, B, C, with global, local, and caller successors marked]
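
The three successor notions can be computed on a finite prefix of an execution, as in this Python sketch. The encoding of positions as `"call"`, `"ret"`, or `"int"` and the function name `caret_successors` are assumptions; the convention that a return position sits at the level of its call (so its caller is the caller of that call) is this sketch's reading of "global successor at the same level".

```python
from typing import List, Optional, Sequence, Tuple

Succ = List[Optional[int]]


def caret_successors(kinds: Sequence[str]) -> Tuple[Succ, Succ, Succ]:
    """Return (global_succ, local_succ, caller), one entry per position.

    None means "undefined" on this finite prefix.
    """
    n = len(kinds)
    match: Succ = [None] * n
    caller: Succ = [None] * n
    stack: List[int] = []           # positions of pending calls
    for i, kind in enumerate(kinds):
        if kind == "ret" and stack:
            match[stack.pop()] = i  # a return closes the innermost pending call
        caller[i] = stack[-1] if stack else None
        if kind == "call":
            stack.append(i)

    global_succ: Succ = [i + 1 if i + 1 < n else None for i in range(n)]
    local_succ: Succ = []
    for i, kind in enumerate(kinds):
        if kind == "call":
            local_succ.append(match[i])      # jump over the called module
        elif i + 1 < n and kinds[i + 1] != "ret":
            local_succ.append(i + 1)         # stay at the same level
        else:
            local_succ.append(None)          # the module ends here
    return global_succ, local_succ, caller


kinds = ["int", "call", "int", "call", "int", "ret", "int", "ret", "int"]
g, l, c = caret_successors(kinds)
print(l)  # [1, 7, 3, 5, None, 6, None, 8, None]
print(c)  # [None, None, 1, 1, 3, 1, 1, None, None]
```

Following `caller` repeatedly from a position lists the pending calls from innermost to outermost, which is the sense in which the caller path gives the stack content.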

  35. CARET Definition Syntax: Q ::= p | not Q | Q or Q’ | Next Q | Always Q | Eventually Q | Q Until Q’ | Local-Next Q | Local-Always Q | Local-Eventually Q | Q Local-Until Q’ | Caller Q | CallerPath-Always Q | CallerPath-Eventually Q | Q CallerPath-Until Q’ • Local- and Caller- versions of all temporal operators • All these operators can be nested

  36. Expressing properties in Caret Pre-post conditions: If P holds when A is called, then Q must hold when the call returns: Always ( (P and call-to-A) → Local-Next Q ) Integrating Manna/Pnueli-style reasoning for reactive computations with Hoare-style reasoning for structured programs
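
The pre/post property can be checked on a finite trace with a single stack, as in the following Python sketch. CARET itself is interpreted over infinite executions; the finite-trace check, the event encoding `(kind, procedure, propositions)`, the function name `pre_post_ok`, and the decision to count a call that never returns as a violation are assumptions of this example.

```python
from typing import Iterable, List, Set, Tuple

Event = Tuple[str, str, Set[str]]  # (kind, procedure name, propositions that hold)


def pre_post_ok(trace: Iterable[Event], proc: str, pre: str, post: str) -> bool:
    """Check Always((pre and call-to-proc) -> Local-Next post) on a finite trace."""
    obligations: List[bool] = []  # one flag per pending call: must post hold on return?
    for kind, name, props in trace:
        if kind == "call":
            obligations.append(name == proc and pre in props)
        elif kind == "ret" and obligations:
            if obligations.pop() and post not in props:
                return False  # the matching return violates the postcondition
    # A pending call with an obligation has no matching return in this trace.
    return not any(obligations)


good = [("call", "A", {"P"}), ("int", "", set()), ("ret", "A", {"Q"})]
bad = [("call", "A", {"P"}), ("call", "B", set()), ("ret", "B", {"Q"}),
       ("ret", "A", set())]
print(pre_post_ok(good, "A", "P", "Q"), pre_post_ok(bad, "A", "P", "Q"))  # True False
```

In the second trace Q holds when the inner call to B returns, but not at the return that matches the call to A, so the property fails. A regular property could not tell these two returns apart.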

  37. Expressing properties in Caret If A is called with low priority, then it cannot access the file: Always ( (call-to-A and low-priority) → Local-Always ( not access-file ) )

  38. Expressing properties in Caret Stack inspection properties: If a variable x is accessed, then A must be on the call stack: Always ( access-to-x → CallerPath-Eventually call-to-A )
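
On a finite trace, the stack inspection property amounts to looking at the pending calls whenever x is accessed. The following Python sketch shows this; the event encoding `(kind, label)`, the function name `stack_inspection_ok`, and the default labels are assumptions made for the illustration.

```python
from typing import Iterable, List, Tuple


def stack_inspection_ok(trace: Iterable[Tuple[str, str]],
                        resource: str = "access-to-x",
                        required: str = "A") -> bool:
    """Check Always(resource -> CallerPath-Eventually call-to-required)."""
    pending: List[str] = []  # procedures on the call stack, outermost first
    for kind, label in trace:
        if kind == "call":
            pending.append(label)
        elif kind == "ret":
            if pending:
                pending.pop()
        elif label == resource and required not in pending:
            return False     # x is accessed while A is not on the call stack
    return True


inside = [("call", "main"), ("call", "A"), ("int", "access-to-x"),
          ("ret", "A"), ("ret", "main")]
outside = [("call", "main"), ("call", "A"), ("ret", "A"),
           ("int", "access-to-x"), ("ret", "main")]
print(stack_inspection_ok(inside), stack_inspection_ok(outside))  # True False
```

Both traces contain a call to A before the access, so a regular "A was called earlier" property accepts both; only the caller path distinguishes them.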

  39. Model checking CARET • Given: a (Boolean) recursive state machine / visibly pushdown automaton M, and a CARET formula Q • Model-checking: Do all runs of M satisfy the specification Q? • CARET can be model-checked in time that is polynomial in M and exponential in Q: |M|^3 · 2^{O(|Q|)} • Complexity class same as that for LTL! • Generalization of the Vardi-Wolper construction

  40. Model-checking CARET: Intuition • The specification matches calls and returns of the program, so the runs of the program and models of the formula are both visibly pushdown languages • Given M and formula Q, • Build a Büchi pushdown automaton that accepts words exhibited by M that satisfy (not Q) • Check this pushdown automaton for emptiness • Construction builds on the classical tableaux for LTL [Figure: handling Local-Next Q1: at a call, push the state s together with Q1; at the matching return, pop s and Q1, and check Q1]

  41. Conclusions and Ongoing Work • Exposing calls and returns lets you hide the stack! • VPLs seem robust and adequate to model software analysis problems • VPL-triggered research • Dynamic logic with VPL (Löding, Serre) • Visibly pushdown games (Löding, Madhusudan, Serre) • XML query processing (Pitcher) • Third-order Algol with iteration (Murawski, Walukiewicz) • Active area of current research • DTDs, XML, and query languages • Branching-time logics, Fixpoint calculus, and visibly pushdown tree automata (Alur, Chaudhuri, Madhusudan) • Expressive completeness of temporal operators • Implementing a model checker for VPL monitors
