Runtime Verification and Software Fault Protection with Eagle. Allen Goldberg, Klaus Havelund. Kestrel Technology, Palo Alto, USA
Overview • Run-time Monitoring • About EAGLE • Software Fault Protection • Summary
Motivation for Runtime Verification • Model checking and Theorem Proving are rigorous, but: • Not fully automated: • model creation is often manual • Abstraction is often manual in the case of model checking • Lemma discovery is often manual in the case of theorem proving • Therefore: not very scalable • Applied Testing is scalable and widely used, but is ad hoc: • Lack of formal coverage • Lack of formal conformance checking • Combine Formal Methods and Testing
Run-time Verification • Combine temporal logic specification and testing: • Specify properties in some temporal logic. • Instrument program to generate events. • Monitor properties against a trace of events emitted by the running program.
A Model-Based Verification Architecture [Figure labels: Model; test & property generation; test inputs; behavioral properties; Rqmts; implemented system under test; instrumentation; event stream; dispatch; Observer; Deadlock; Dataraces; reports]
Static and Dynamic Analysis [Figure labels: Specification; Test case generation; Runtime verification; Program; Input; Output; Program instrumentation]
So many logics … • What is the most basic, yet general, specification language suitable for monitoring? EAGLE is our answer.
Logics • Assertions (Java 1.4) • Pre-post conditions, invariants (Eiffel, JML) • Temporal logic (Mac, Temporal Rover) • Time lines (Smith, Dillon) • Live Sequence Charts (Harel) • Automata (Büchi, Alternating, Timed) • Regular expressions (Rosu, Lee) • Process algebra (Jass) • Mathematical modeling/programming languages (Alloy, Prolog, Maude, Haskell, Specware)
Eagle’s Core Concepts • Finite trace monitoring logic • Propositional logic plus three temporal connectives: • Next: @F • Previous: #F • Concatenation: F1;F2 • Recursive parameterized rules: Always(Term t) = t /\ @Always(t) . • Each state is an environment mapping variables to values [Figure: timeline showing #F, now, @F, and F1;F2]
Introducing EAGLE Can encode: • future time temporal logic • past-time logic • extended regular expressions • µ-calculus • real-time, data-binding, statistics…. • Monitoring formulas are evaluated online over a given input trace, on a state by state basis • identify failure as early as reasonably possible • Evaluation proceeds by checking facts about the past and generating obligations about the future.
Basic Propositional Example
// LTL rules (library):
max Alws(Term t) = t /\ @ Alws(t) .
min Even(Term t) = t \/ @ Even(t) .
min Prev(Term t) = t \/ # Prev(t) .
// Monitor:
mon M = Alws(y>0 -> (Prev(x>0) /\ Even(z>0))) .
[Figure: trace timeline marked x>0, y>0, z>0]
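The finite-trace reading of these rules can be sketched offline in Java. This is only an illustration under stated assumptions, not EAGLE's evaluation engine: the names `XyzState` and `FiniteTraceLtl` are hypothetical, the whole trace is assumed to be available as a list, a max rule is taken to be true beyond the last state, and a min rule (and anything before the first state) is taken to be false.

```java
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical state: an environment mapping x, y, z to values. */
final class XyzState {
    final int x, y, z;
    XyzState(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }
}

final class FiniteTraceLtl {
    /** Alws(p) at position i: p holds at every position i..n-1 (max rule: true past the end). */
    static boolean alws(int i, int n, IntPredicate p) {
        for (int j = i; j < n; j++) if (!p.test(j)) return false;
        return true;
    }

    /** Even(p) at position i: p holds at some position i..n-1 (min rule: false past the end). */
    static boolean even(int i, int n, IntPredicate p) {
        for (int j = i; j < n; j++) if (p.test(j)) return true;
        return false;
    }

    /** Prev(p) at position i: p holds at some position 0..i (false before the first state). */
    static boolean prev(int i, IntPredicate p) {
        for (int j = i; j >= 0; j--) if (p.test(j)) return true;
        return false;
    }

    /** mon M = Alws(y>0 -> (Prev(x>0) /\ Even(z>0))) evaluated over a complete trace. */
    static boolean monitor(List<XyzState> t) {
        int n = t.size();
        return alws(0, n, i ->
            t.get(i).y <= 0
            || (prev(i, j -> t.get(j).x > 0) && even(i, n, j -> t.get(j).z > 0)));
    }
}
```

Calling `FiniteTraceLtl.monitor` on the trace (x=1), (y=1), (z=1) returns true, while a trace that starts with y>0 before any x>0 returns false.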
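Data Bindings
min R(int k) = Prev(x==k) /\ Even(z==k) .
// Monitor:
mon M = Alws(y>0 -> R(y)) .
// Wrong: without the rule parameter, y is not bound to its value in the triggering state
mon Wrong = Alws(y>0 -> (Prev(x==y) /\ Even(z==y))) .
[Figure: trace timeline marked x==k, y>0 (k := y), z==k]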
Time is Just Data
min Before(long t, Term F) = clock <= t /\ (F \/ @ Before(t,F)) .
min Within(long t, Term F) = Before(t+clock, F) .
// clock is a variable defined in the state
mon M = Alws(x>0 -> Within(4,y>0)) .
[Figure: trace timeline with two x>0 … y>0 pairs, each gap labeled < 4]
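A hand-written online check for Alws(x>0 -> Within(bound, y>0)) might look like the following sketch. It is not generated by EAGLE; `WithinMonitor` and its methods are hypothetical names, and the code assumes the clock never decreases, so only the earliest pending deadline needs to be remembered.

```java
/** Hypothetical online monitor for Alws(x>0 -> Within(bound, y>0)). */
final class WithinMonitor {
    private final long bound;
    private boolean pending = false;   // is a Before(deadline, y>0) obligation open?
    private long deadline;             // earliest open deadline (valid only when pending)
    private boolean violated = false;

    WithinMonitor(long bound) { this.bound = bound; }

    /** Feed one state; returns false once the property has been violated. */
    boolean step(long clock, boolean xPositive, boolean yPositive) {
        if (violated) return false;
        // Within(t, F) = Before(t + clock, F): the deadline is fixed when x>0 is seen.
        if (xPositive && !pending) {
            pending = true;
            deadline = clock + bound;
        }
        if (pending) {
            // Before(t, F) = clock <= t /\ (F \/ @Before(t, F))
            if (clock > deadline) violated = true;
            else if (yPositive) pending = false;
        }
        return !violated;
    }

    /** End of trace: Before is a min rule, so an open obligation counts as a failure. */
    boolean end() { return !violated && !pending; }
}
```

Because later deadlines are never earlier than the one already pending, a single `deadline` field is enough here; a general implementation would keep one obligation per binding.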
Automata
[Figure: two-state automaton with states S1 and S2 and transitions init, open, access, close]
max S1() = (init -> @ S1()) /\ (open -> @ S2()) /\ ~(access \/ close) .
min S2() = (access -> @ S2()) /\ (close -> @ S1()) /\ ~(init \/ open) .
mon M = S1() .
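The two rules read directly as a transition table. The sketch below is a plain Java rendering under the assumption that each state of the trace carries exactly one of the four events; `FileAutomaton`, `Event`, and `St` are hypothetical names. Ending in S1 is accepted because S1 is a max rule, and ending in S2 is rejected because S2 is a min rule.

```java
/** Hypothetical explicit automaton for the S1/S2 rules. */
final class FileAutomaton {
    enum Event { INIT, OPEN, ACCESS, CLOSE }
    enum St { S1, S2, FAIL }

    /** One transition; any event a rule forbids leads to the FAIL sink. */
    static St next(St s, Event e) {
        switch (s) {
            case S1:
                if (e == Event.INIT) return St.S1;
                if (e == Event.OPEN) return St.S2;
                return St.FAIL;             // ~(access \/ close)
            case S2:
                if (e == Event.ACCESS) return St.S2;
                if (e == Event.CLOSE) return St.S1;
                return St.FAIL;             // ~(init \/ open)
            default:
                return St.FAIL;
        }
    }

    /** Runs the whole trace from S1; only S1 is accepting at the end of the trace. */
    static boolean accepts(Event... trace) {
        St s = St.S1;
        for (Event e : trace) s = next(s, e);
        return s == St.S1;
    }
}
```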
Timed Automata
[Figure: the two-state automaton with clock reset x := 0 and guard x < 10]
max S1() = (init -> @ S1()) /\ (open -> @ S2(clock)) /\ ~(access \/ close) .
min S2(long t) = clock <= t+10 /\ (access -> @ S2(t)) /\ (close -> @ S1()) /\ ~(init \/ open) .
Combining Automata and Temporal Logic
[Figure: the two-state automaton with the open transition guarded by open /\ Prev(init)]
max S1() = (init -> @ S1()) /\ (open -> (Prev(init) /\ @ S2())) /\ ~(access \/ close) .
min S2() = (access -> @ S2()) /\ (close -> @ S1()) /\ ~(init \/ open) .
mon M = S1() .
EAGLE by example: statistical logics
Monitor that state property F holds with at least probability p:
max Empty() = false .
min Last() = @Empty() .
min A(Term F, float p, int f, int t) =
  ( Last() /\ ( (F /\ (1 - f/t) >= p) \/ (¬F /\ (1 - (f+1)/t) >= p) ) )
  \/ ( ¬Last() /\ ( (F → @A(F, p, f, t+1)) /\ (¬F → @A(F, p, f+1, t+1)) ) ) .
mon AtLeast(Term F, float p) = A(F, p, 0, 1) .
Regular Expressions
Property: File accesses are always enclosed by open and close operations.
M = (idle*;open;access*;close)*
max S(Term t) = t /\ @ S(t) . // Star
min P(Term t) = t /\ @ P(t) . // Plus
mon M = S(S(idle());open();S(access());close()) .
Grammars
Property: Lock acquisitions and releases are properly nested.
[Figure: example trace lock, release, lock, lock, release, release]
// Match rule:
max Match(Term l, Term r) = Empty() \/ (l;Match(l,r);r;Match(l,r)) .
// Monitor:
mon M = Match(lock(),release()) .
EAGLE by example: beyond regular
Monitor a sequence of login and logout events: at no point should there be more logouts than logins, and they should match at the end of the trace.
min Match(Term F1, Term F2) = Empty() \/ F1;Match(F1, F2);F2;Match(F1, F2) .
mon M1 = Match(login, logout) .
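For a trace that contains only login and logout events, the Match property amounts to a balanced-nesting check, which a single counter can decide. The sketch below uses that counter in place of EAGLE's recursive rule; `MatchMonitor` and `Event` are hypothetical names.

```java
/** Hypothetical counter-based check equivalent to Match(login, logout) on login/logout-only traces. */
final class MatchMonitor {
    enum Event { LOGIN, LOGOUT }

    static boolean matches(Event... trace) {
        int open = 0;                              // logins not yet matched by a logout
        for (Event e : trace) {
            if (e == Event.LOGIN) {
                open++;
            } else if (--open < 0) {
                return false;                      // more logouts than logins so far
            }
        }
        return open == 0;                          // everything must match at the end of the trace
    }
}
```

The counter is what puts this property beyond regular languages: no finite-state automaton can track an unbounded nesting depth.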
Some EAGLE facts • EAGLE-LTL (past and future): a monitoring formula of size m has space complexity bounded by $m^2 \, 2^m \log m$ • EAGLE with data binding has a worst case that depends on the length of the input trace • EAGLE without data is at least context-free
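EAGLE interface
The user defines these classes:
class Event { int x, y; }

class MyState extends EagleState {
    int x, y;
    // Copy the fields of the latest event into the monitored state.
    void update(Event e) { x = e.x; y = e.y; }
}

class Monitors {
    Formula M1, M2;
    // Advance every monitor by one state.
    void apply(EagleState s) { M1.apply(s); M2.apply(s); }
}

class Observer {
    Monitors mons;
    MyState state;
    // Called for each event e1, e2, e3, … emitted by the instrumented program.
    void eventHandler(Event e) { state.update(e); mons.apply(state); }
}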
Eagle Implementation • Built an initial prototype implementation and used it for a number of applications • test case monitoring scenario • monitoring the behavior of a planning system • High performance algorithm • pay for what you use • e.g. state machine • symbolic manipulation -> automata-like solutions
Online Algorithm • Start with the initial formula M to monitor • “see” a new state • Transform M to M’ so that M is valid on the whole trace iff M’ is valid on the remaining trace
Basic Algorithm: Future Time LTL
max Alws(Term t) = t /\ @ Alws(t) .
min Even(Term t) = t \/ @ Even(t) .
mon M = Alws(y>0 -> Even(z>0)) .
M0 = Alws(y>0 -> Even(z>0))
state 1 (y=-1, z=0): M1 = Alws(y>0 -> Even(z>0))
state 2 (y=2, z=0): M2 = Alws(y>0 -> Even(z>0)) /\ Even(z>0)
state 3 (y=-1, z=0): M3 = Alws(y>0 -> Even(z>0)) /\ Even(z>0)
state 4 (y=-1, z=2): M4 = Alws(y>0 -> Even(z>0))
state 5 (y=2, z=0): M5 = Alws(y>0 -> Even(z>0)) /\ Even(z>0)
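For this particular monitor the residual formula can take only two shapes, so the rewriting collapses to one boolean. The sketch below hard-codes that observation for Alws(y>0 -> Even(z>0)); it is not EAGLE's general term-rewriting algorithm, and `ResponseMonitor` is a hypothetical name.

```java
/** Hypothetical hand-derived monitor for Alws(y>0 -> Even(z>0)). */
final class ResponseMonitor {
    // true  : residual is Alws(y>0 -> Even(z>0)) /\ Even(z>0)
    // false : residual is Alws(y>0 -> Even(z>0))
    private boolean pendingEven = false;

    /** Consume one state and rewrite the residual formula. */
    void step(int y, int z) {
        // An Even(z>0) obligation exists if one was carried over or y>0 creates one now.
        boolean obligation = pendingEven || y > 0;
        // Even(t) = t \/ @Even(t): z>0 discharges it, otherwise it moves to the next state.
        pendingEven = obligation && !(z > 0);
    }

    boolean pending() { return pendingEven; }

    /** End of trace: Even is a min rule, so a pending obligation fails; Alws (max) succeeds. */
    boolean end() { return !pendingEven; }
}
```

Feeding the five states from the slide gives pending values false, true, true, false, true, matching M1 through M5.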
Toward an Algorithm The previous example suggests how monitoring may be reduced to state machine execution • For propositional future time LTL, by employing normal forms and simplification, the set of derivable monitors is finite. This is the idea behind generating a Büchi automaton for model checking propositional future time LTL. • However, Eagle also has: • Past time operators • Data parameters
Past Time
max Alws(Term t) = t /\ @ Alws(t) .
min Prev(Term t) = t \/ # Prev(t) .
mon M = Alws(y>0 -> Prev(x>0)) .
Cache: C = Prev(x>0)
C0 = false   M0 = Alws(y>0 -> Prev(x>0))
C1 = false   M1 = Alws(y>0 -> Prev(x>0))
C2 = true    M2 = false
Trace: (y=-1, x=0), (y=2, x=2), (y=-1, x=0), (y=-1, x=2), (y=2, x=0)
Algorithm with Past Time • Static Analysis • determine what formulas are needed in cache • Evaluate formulas in cache in state 0 • for each state: • Evaluate monitors, referring to cache to look up values of formulas headed by # • update the cache • End of trace • evaluate monitor for “beyond the last state”
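The cache-based steps above can be sketched for the single monitor Alws(y>0 -> Prev(x>0)). This is an assumption-laden illustration rather than EAGLE's implementation: `PastTimeMonitor` is a hypothetical name, the cache holds one boolean (the value of Prev(x>0) in the previous state), and that value is taken to be false before the first state.

```java
/** Hypothetical cache-based monitor for Alws(y>0 -> Prev(x>0)). */
final class PastTimeMonitor {
    private boolean cachePrev = false; // value of Prev(x>0) in the previous state
    private boolean ok = true;

    /** Evaluate the monitor in one state, then update the cache. */
    boolean step(int x, int y) {
        // Prev(t) = t \/ #Prev(t): the #-headed subformula is looked up in the cache.
        boolean prevNow = (x > 0) || cachePrev;
        if (y > 0 && !prevNow) ok = false;
        cachePrev = prevNow;
        return ok;
    }

    /** Alws is a max rule, so nothing further is owed beyond the last state. */
    boolean end() { return ok; }
}
```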
Improvements • Conjunctive normal form • Subsumption: (a \/ b) /\ a => a • Term caching • assign a unique id to each normalized term • Define various caches of the form cache:TermId -> TermId • subsumption cache • rule application cache • Normalization folded into evaluation • Ordering evaluation of conjuncts and disjuncts • Terms which do not (hereditarily) contain instances of the @ operator will fully evaluate to true/false. • Evaluate such terms first
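The subsumption rule (a \/ b) /\ a => a can be illustrated on a formula in conjunctive normal form represented as a set of clauses, each clause being a set of literal names. This is a minimal sketch with hypothetical names (`Subsumption`, `simplify`); it uses a quadratic scan rather than the term-id caches the slide describes.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical subsumption pass over a CNF formula given as a set of clauses. */
final class Subsumption {
    /** Drops every clause that strictly contains another clause of the formula. */
    static Set<Set<String>> simplify(Set<Set<String>> cnf) {
        Set<Set<String>> out = new HashSet<>();
        for (Set<String> c : cnf) {
            boolean subsumed = false;
            for (Set<String> d : cnf) {
                // d subsumes c when d's literals are a strict subset of c's.
                if (!d.equals(c) && c.containsAll(d)) {
                    subsumed = true;
                    break;
                }
            }
            if (!subsumed) out.add(c);
        }
        return out;
    }
}
```

With clauses {a, b} and {a}, only {a} survives, which is the slide's example.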
Improvements: Automata Construction • On-the-fly automata construction • For terms that are conjunctions, disjunctions, or rule applications, dynamically create a state machine (decision tree) for evaluation. • map termId -> Automaton [Figure: decision tree branching on w==y, z==y, and x>0, with true/false edges and leaves "result is termID1" and "result is termID2"]
Improvements: Reflection Removal • Removal of use of reflection for Java expression evaluation • Alws(y>0 -> Prev(x>0)) • static boolean greater(int a, int b) • Defined in user-defined State class • Generate during static analysis an interface class that dispatches on term id of methodCall terms to directly call the user-supplied method • all arguments must be available, otherwise it must be symbolically evaluated • must treat substitution instances
On the way to… • An Eagle compiler • static analysis of the monitor specification to generate an alternating automaton with no term manipulation and no use of reflection
Instrument Specification • An instrument specification is a collection of rules lhs → rhs • LHS: a conjunction of syntactic conditions on a program point (local conditions only) • RHS: a set of actions that log reporting events • Each bytecode “statement” is examined to see if it satisfies an LHS. The RHS actions of all matching rules are collected and inserted into the bytecode at that “point”.
Using AOP for instrumentation
F ::= true | false | ~ F | F /\ F | F \/ F | F → F | @ F | # F | F;F | E | E => F | E & F
E ::= [Nm:]Nm.Id(Nm,*)[returns [Nm]]
Id ::= Java identifier
Nm ::= Id[?] | *
Example Using AOP
observer BufferObs {
  max Always(Term t) = t /\ @ Always(t) .
  min Eventually(Term t) = t \/ @ Eventually(t) .
  min Previously(Term t) = t \/ # Previously(t) .
  var Buffer b;
  var Object o;
  monitor InOut = Always(put(b?,o?) => Eventually(get(b) returns o)) .
  monitor OutIn = Always(get(b?) returns o? => Previously(put(b,o))) .
}
Software Fault Protection: Motivation To obtain higher levels of assurance and quality for safety- and mission-critical software. [Figure: marginal cost to remove the next error plotted against residual error rate per line of code, with 10 errors/KLOC and 3 errors/KLOC marked]
Approach • Instead of climbing the error curve, build mechanisms into the code to protect against inevitable residual software errors. • Software Fault Protection.
Three Levels of Fault Protection • Caution and Warning Systems • Warning generated for off-nominal measurements of system state and resources • Autonomic response • Circuit breaker • MER low-battery warning stopped repeated reboot cycle and transition into safe mode. • Model-Based Diagnosis: fault detection, isolation, and repair • HAL reports a module will go critical in 7 days, requiring EVA
Differences between System Engineering and Software • Software does not wear out, but suffers from design and coding errors. • Software has not traditionally been designed using fault-containment concepts • E.g. on MER, hardware exceptions generated by out-of-memory faults were not explicitly dealt with. • A natural concept of component based on physicality exists for hardware. These components have known failure modes • E.g. a valve may be stuck on or a pipe may leak.
Component • Component is • Organizing notion for system composition/decomposition • Goal is to identify a failed component and possibly reconfigure components maximizing functional capability • Spectrum of component choices • Low granularity: a statement is a component. • Medium granularity : a procedure/method is a component. • High granularity: a component in a modeling framework.
Model Software Fault protection is like V&V in one important respect: • It must be cost effective to formulate model • Model must be useful in identifying and isolating faults. • Model has a delicate relationship to code: • structural + limited behavioral code synthesis, • monitor behavioral properties that are beyond synthesis. Consistency check between code and model
Approach • High granularity components • difficult to model, diagnose and repair at lower levels of granularity • Addresses higher-payoff, system of system and system integration issues. • Model • structural: component/connector models properties • interface behavior: • patterns of event interactions over connectors • real time response • data validity (pre- post- conditions) • resource usage • memory, object creation, disk, communication devices, quality of service, deadlock and other concurrency problems
Software Fault Protection Fault Detection, Isolation and Repair (FDIR) • Detection: instrument and monitor system and identify an error state • Isolation: identify the faulty component • Repair: take corrective action
Detection • Detection = property violation • Use Eagle to define and monitor properties
Isolation (Diagnosis) • Process of moving from symptom to identification or isolation of faulty component. • Our simplistic approach: • Associate a failed component with each property.
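The simple isolation scheme, associating a failed component with each property, can be pictured as a lookup table from property names to component names. The sketch below is purely illustrative: the class `Isolation`, its methods, and the use of strings as identifiers are assumptions, not part of EAGLE.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical property-to-component table used to turn violations into suspects. */
final class Isolation {
    private final Map<String, String> componentOf = new HashMap<>();

    /** Declare which component is blamed when the named property is violated. */
    void associate(String property, String component) {
        componentOf.put(property, component);
    }

    /** Map the violated properties to the set of suspected components. */
    Set<String> suspects(Collection<String> violatedProperties) {
        Set<String> out = new TreeSet<>();
        for (String p : violatedProperties) {
            String c = componentOf.get(p);
            if (c != null) out.add(c);   // properties with no associated component are skipped
        }
        return out;
    }
}
```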