Verification Based on Run-Time, Field-Data, and Beyond Séverine Colin Laboratoire d’Informatique (LIFC) Université de Franche-Comté-CNRS-INRIA Leonardo Mariani Dipartimento di Informatica, Sistemistica e Comunicazione (DISCo) Università di Milano Bicocca Tope Omitola Computer Laboratory University of Cambridge, UK
Outline • Traditional Run-Time Verification Techniques • checking properties on execution data at run-time • Test and Verification Techniques based on Field-Data • gathering execution data to increase effectiveness of (off-line) test and verification techniques • Discussion on Test, Verification and Model-Checking • Conclusions
Run-Time Verification Techniques • Basic idea: extract an execution trace from an executing program and analyze it to detect errors • To check classical error patterns (data races, deadlocks) • To verify a program against a formal specification
Data Race Detection • Data race: two concurrent threads access a shared variable at the same time and at least one of the accesses is a write • The Eraser tool dynamically detects data races • It enforces that every shared variable is protected by some lock • The Eraser algorithm is used by PathExplorer and Visual Threads
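As a rough illustration, here is a minimal sketch of the lockset idea behind Eraser (the class and method names are ours, not Eraser's actual implementation): each shared variable keeps the set of locks that has protected every access to it so far, and an empty set signals a potential race.

```python
# Minimal lockset sketch (hypothetical helper names, not Eraser's real code):
# each shared variable starts with all currently held locks as candidates; on
# every access the candidate set is intersected with the locks held by the
# accessing thread. An empty set means no single lock protects the variable.

class LocksetChecker:
    def __init__(self):
        self.candidate_locks = {}   # variable -> locks that protected every access so far
        self.held_locks = {}        # thread id -> locks currently held

    def acquire(self, thread, lock):
        self.held_locks.setdefault(thread, set()).add(lock)

    def release(self, thread, lock):
        self.held_locks.setdefault(thread, set()).discard(lock)

    def access(self, thread, variable):
        held = self.held_locks.get(thread, set())
        if variable not in self.candidate_locks:
            self.candidate_locks[variable] = set(held)   # first access: initialize candidates
        else:
            self.candidate_locks[variable] &= held        # refine: keep only locks always held
        if not self.candidate_locks[variable]:
            print(f"potential data race on {variable!r} (no common lock)")

# Example: two threads touch 'x' under different locks, so a warning is reported.
checker = LocksetChecker()
checker.acquire("t1", "L1"); checker.access("t1", "x"); checker.release("t1", "L1")
checker.acquire("t2", "L2"); checker.access("t2", "x"); checker.release("t2", "L2")
```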
Deadlock Detection • Deadlock: can occur whenever multiple shared resources are required to accomplish a task • A model of the program is constructed during the program's execution • Deadlock: a cycle (circularity) in the lock dependency graph • Used by Visual Threads and PathExplorer
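The following is a small, hedged sketch of this kind of check (our own simplification, not the code of Visual Threads or PathExplorer): lock acquisitions observed at run time build a lock-order graph, and a cycle in that graph signals a potential deadlock.

```python
# Lock-order graph sketch: whenever a thread acquires lock B while already
# holding lock A, add the edge A -> B; a cycle means the locks can be taken in
# conflicting orders, hence a deadlock risk.

from collections import defaultdict

class LockOrderGraph:
    def __init__(self):
        self.edges = defaultdict(set)  # lock -> locks acquired while holding it
        self.held = defaultdict(list)  # thread -> stack of held locks

    def acquire(self, thread, lock):
        for outer in self.held[thread]:
            self.edges[outer].add(lock)
        self.held[thread].append(lock)

    def release(self, thread, lock):
        self.held[thread].remove(lock)

    def has_cycle(self):
        # Depth-first search over the lock-order graph.
        WHITE, GREY, BLACK = 0, 1, 2
        color = defaultdict(int)

        def visit(node):
            color[node] = GREY
            for succ in self.edges[node]:
                if color[succ] == GREY or (color[succ] == WHITE and visit(succ)):
                    return True
            color[node] = BLACK
            return False

        return any(color[n] == WHITE and visit(n) for n in list(self.edges))

# Thread 1 takes A then B, thread 2 takes B then A: cycle A -> B -> A.
g = LockOrderGraph()
g.acquire("t1", "A"); g.acquire("t1", "B"); g.release("t1", "B"); g.release("t1", "A")
g.acquire("t2", "B"); g.acquire("t2", "A")
print(g.has_cycle())  # True: potential deadlock
```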
Monitoring and Checking (MaC) • System requirements are formalised • A monitoring script is constructed: • to instrument the code • to establish a mapping from low-level information to high-level events • At run-time, the generated events are monitored for compliance with the requirements specification
MaC: Events and Conditions • Events occur instantaneously during the system execution • Conditions are pieces of information that hold over a duration of time • Three-valued logic: true, false, undefined • PEDL (Primitive Event Definition Language): language for monitoring scripts • MEDL (Meta Event Definition Language): language for safety requirements
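Purely as an illustration of the event/condition distinction and the three-valued logic (this does not reproduce PEDL or MEDL syntax; the event and condition names below are hypothetical):

```python
# Events happen at an instant; conditions have a value over an interval and use
# three-valued logic: True, False, or None (undefined before the first update).

class Condition:
    def __init__(self):
        self.value = None            # undefined until some event defines it

class Monitor:
    def __init__(self):
        self.too_fast = Condition()  # hypothetical safety condition

    def on_event(self, name, timestamp, payload):
        # Map a low-level event into an update of a high-level condition.
        if name == "speed_sample":
            self.too_fast.value = payload > 55
        print(f"[{timestamp}] {name}: too_fast = {self.too_fast.value}")

m = Monitor()
m.on_event("engine_start", 0, None)   # too_fast is still undefined (None)
m.on_event("speed_sample", 1, 42)     # too_fast becomes False
m.on_event("speed_sample", 2, 70)     # too_fast becomes True
```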
PathExplorer (1/2) • An instrumentation module (based on Jtrek): it emits the relevant events • An interaction module: it sends events to the observer module • An observer module: it verifies the requirement specification
PathExplorer (2/2) • Requirements are written in past-time LTL, extended with monitoring operators: ↑F, ↓F, [F,F)s, [F,F)w • The algorithm exploits the recursive nature of past-time temporal logic: the satisfaction of a formula can be computed along the execution trace looking only one step backwards (see our paper for the algorithm)
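A minimal sketch of this one-step-backwards evaluation for a simple formula of the form p -> (q S r); this is our own illustration, not PathExplorer's implementation:

```python
# Monitor a past-time LTL formula one event at a time, keeping only the
# previous step's value of each past-time subformula.

def monitor(trace):
    prev_since = False                  # value of (q S r) at the previous step
    for i, state in enumerate(trace):
        p, q, r = state["p"], state["q"], state["r"]
        # Recursive definition: (q S r) = r or (q and previous (q S r)).
        since = r or (q and prev_since)
        ok = (not p) or since           # p -> (q S r)
        if not ok:
            print(f"violation at step {i}: p holds but (q S r) does not")
        prev_since = since              # only one step of history is kept

# A toy trace: r establishes the obligation, q maintains it, p queries it.
monitor([
    {"p": False, "q": False, "r": True},
    {"p": True,  "q": True,  "r": False},   # ok: q S r still holds
    {"p": True,  "q": False, "r": False},   # violation: the chain of q's is broken
])
```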
T&V Techniques based on Field-Data • Field-data: “run-time data collected from the field” • Why collect field data for Test and Verification? • limited knowledge about the final system • e.g., software components are usually developed in isolation, assembled with third-party components and, finally, deployed in unknown environments • uncertainty of the final environment • e.g., in the case of ubiquitous computing, pervasive computing, mobile computing, and wireless networks, it is not possible to predict in advance every possible situation • dynamic environments • e.g., in the case of mobile code, self-adaptive systems and peer-to-peer systems, resources suddenly appear and disappear
Existing Approaches • Field-data has been collected for: • Evaluating usability of an application (usability testing) • Modelling usage of the system • which components, modules and functionalities are used? • Learning properties of the implementation • Modelling program faults • which failures have been recognized on the target system?
Evaluating Usability • Traditionally, data for usability testing has been gathered by running testing sessions • Novel approaches: silent data-gathering systems • Automatic Navigability Testing System (ANTS) [Rod02] • Web Variable Instrumented Program (Webvip) [VG] • Gamma System [OLHL02]
Silent Data-Gathering Systems (1/2) [Figure: architectures of ANTS and Webvip. In both, a client-side agent delivered with the page content (script, multimedia content) records the user's actions and uploads a session file over HTTP to the ANTS server / data server]
Silent Data-Gathering Systems (2/2) [Figure: the Gamma system; figure appeared in [OLHL02]]
Modelling Usage of the System (1/2) • for performing system-specific impact analysis • Law and Rothermel's impact analysis [LR03] • the program is instrumented to produce execution traces representing the procedure-level execution flow, e.g., MBrACDrErrrrx • the impacted set for procedure P is computed by selecting the procedures that are called after P and the procedures that are on the call stack when P returns • Orso et al.'s impact analysis [OAH03] • entity-level instrumentation: an execution trace is a sequence of traversed entities • a change c on entity e potentially affects all entities of traces containing e • the impact set is given by the intersection of the potentially affected entities and the result of a forward slice whose slicing criterion is the variables used in change c
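To make the trace-based idea concrete, here is a hedged sketch (our simplification, not Law and Rothermel's actual algorithm) that computes an impact set from a procedure-level trace such as the one above, where "r" marks a return and "x" the program exit:

```python
# The impact set of P contains procedures that start executing after P is
# entered, plus procedures still on the call stack when P returns.

def impact_set(trace, target):
    stack, impacted, seen_target = [], set(), False
    for event in trace:
        if event == "r":                      # procedure return
            returned = stack.pop()
            if returned == target:
                impacted.update(stack)        # P returns into everything still on the stack
        elif event == "x":                    # program exit
            break
        else:                                 # procedure entry
            stack.append(event)
            if seen_target:
                impacted.add(event)           # entered after P started executing
            if event == target:
                seen_target = True
                impacted.add(event)
    return impacted

# The trace from the slide, split into events: M B r A C D r E r r r r x
trace = ["M", "B", "r", "A", "C", "D", "r", "E", "r", "r", "r", "r", "x"]
print(impact_set(trace, "A"))   # {'A', 'C', 'D', 'E', 'M'}
```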
Modelling Usage of the System (2/2) • Information from impact analysis can be used in regression testing • Orso et al.'s regression testing [OAH03] • entity-level instrumentation • the test suite T' is initialized with all the test cases of the existing test suite T that traverse the change • T' is augmented with test cases covering the uncovered impacted entities, computed with Orso et al.'s impact analysis technique • test-suite prioritization is performed by privileging test cases that cover more impacted entities • for increasing confidence in the program • Pavlopoulou and Young's perpetual testing [PY99] • normal executions are considered as tests • instrumentation measures statement coverage of the not-yet-covered blocks, even in the final environment • the program can be iteratively regenerated to reduce the instrumentation
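A small illustrative sketch of this selection and prioritization step, with our own naming and data layout rather than Orso et al.'s implementation:

```python
# Select regression tests from entity-level coverage data and an impact set,
# then order them so that tests covering more impacted entities run first.

def select_and_prioritize(coverage, changed, impacted):
    """coverage: test name -> set of entities it traverses."""
    # T' starts with every test that traverses the changed entities.
    selected = {t for t, ents in coverage.items() if ents & changed}
    # Augment T' with tests that cover impacted entities not yet covered.
    covered = set()
    for t in selected:
        covered |= coverage[t]
    missing = impacted - covered
    for t, ents in coverage.items():
        if ents & missing:
            selected.add(t)
    # Prioritize: tests covering more impacted entities come first.
    return sorted(selected, key=lambda t: len(coverage[t] & impacted), reverse=True)

coverage = {
    "t1": {"A", "B"},
    "t2": {"A", "C", "D"},
    "t3": {"E"},
}
print(select_and_prioritize(coverage, changed={"A"}, impacted={"A", "C", "D"}))
# -> ['t2', 't1']  (t3 touches nothing related to the change)
```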
Learning Properties (1/2) • Automatic synthesis of properties/invariants • Ernst et al.'s approach [ECGN01] • initially, a large set of candidate invariants is assumed to hold over the monitored variables • each execution can falsify some invariants; falsified invariants are deleted • for each surviving invariant, the probability that it “randomly holds” is computed • if this probability is below a given threshold, the invariant is accepted • the synthesized properties are defined by the set of accepted invariants • Automatic synthesis of programs • Many approaches come from machine learning, but they learn very simple functions • Lau et al.'s approach [LDW03] • it is still simple, but it learns small computer programs • based on accurate execution traces and programming constructs
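The invariant-filtering loop can be sketched as follows (a toy simplification of Ernst et al.'s approach; the candidate grammar and the confidence test below are deliberately crude stand-ins, not Daikon's actual ones):

```python
# Start from candidate invariants over the observed variables, drop every
# candidate falsified by some execution sample, and accept the survivors only
# if they are unlikely to hold by chance.

candidates = {
    "x >= 0":      lambda s: s["x"] >= 0,
    "x > y":       lambda s: s["x"] > s["y"],
    "y == 0":      lambda s: s["y"] == 0,
    "x % 2 == 0":  lambda s: s["x"] % 2 == 0,
}

def infer_invariants(samples, chance_threshold=0.05):
    surviving = dict(candidates)
    for sample in samples:                        # each observed program state
        surviving = {name: check for name, check in surviving.items()
                     if check(sample)}            # falsified candidates are dropped
    accepted = []
    for name in surviving:
        # Crude stand-in for the confidence test: assume a coin-flip model, so
        # the chance of surviving n samples at random is 0.5 ** n.
        if 0.5 ** len(samples) < chance_threshold:
            accepted.append(name)
    return accepted

samples = [{"x": 4, "y": 0}, {"x": 6, "y": 2}, {"x": 10, "y": 3},
           {"x": 2, "y": 1}, {"x": 8, "y": 5}]
print(infer_invariants(samples))   # ['x >= 0', 'x > y', 'x % 2 == 0']
```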
Learning Properties (2/2) • Synthesized properties, invariants and programs can be used to • check the implementation with respect to the specification • verify the safety of updates (in terms of component replacements) • Ernst et al.'s approach has been used to verify the pre-conditions, post-conditions and invariants corresponding to the implemented services when replacing components [ME03] • derive test suites • give the programmer confidence in the implementation
Test, Verification and Model-Checking (TVM) • Evolution of Testing, Model Checking, and Run-time Verification • We will mention their advantages and disadvantages • We will outline a future research agenda • Conclusion
TVM • It started with “The Software Crisis” [NATO, 1968] • Led to calls for software “Engineering” [Bauer, 1968] • Focus on methodology for constructing software (e.g. Structured Programming [Dijkstra, 1969]; Chief Programmer Team [Harlan Mills @ IBM, 1973])
TVM • Higher-level languages were viewed as a panacea (C, Java, ML, Meta-ML) • Buggy software was still being produced • Focus shifted to detecting and preventing mistakes during software construction --- Testing
TVM - Testing • Two main approaches to Testing: Reliability Growth Modelling (RGM) and Random Testing • In RGM, the program is tested, fails, is corrected, tested again, and this cycle is repeated many times • The MTBF (Mean Time Between Failures) is fed into a mathematical model derived from previous experience
TVM - Testing • When the model indicates a very long MTBF, we stop testing and ship the product • Pitfalls of RGM: • Very tenuous (weak) link between past development processes and the current one • Correction of a bug can introduce new bugs, which reduces dependability, and
TVM - Testing • Industrial practice found that extremely large amounts of failure-free testing are needed • It is thereby not cost-effective • Random Testing: test cases are selected randomly from a domain of possible inputs • Advantages of Random Testing over RGM: • Selection is random and therefore automatable, you are more likely to find errors, and
TVM - Testing • Random testing draws on tools from information theory to analyse the results • Pitfalls of Random Testing: • The distribution of random test cases may not match the real usage of the system • Random testing takes no account of program size: a 10-line program is treated the same as a 10,000-line program
TVM - Program Review • Buggy software was still being produced • Another panacea tried was Program Review (Software Inspection) • It depends on humans making the right decisions • It is therefore vulnerable to human error
TVM - Program Proving (Theorem Provers) • The solution then became Formal Deductive Reasoning – Program Proving • Automated theorem provers (e.g. Isabelle [Camb]) were developed to prove programs correct • A main problem with theorem provers is the impracticality of proving all layers of the system, from software programs down to hardware and circuits
TVM - Model Checking • An alternative approach to theorem provers is model checking • In model checking, the specification of a system is expressed in temporal logic, the system is modelled as a graph of finite state transitions, and a model checker checks whether the graph satisfies the temporal logic specification
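For a safety property, the core idea can be illustrated with a toy reachability check (this is not the algorithm of any particular model checker, and the example system is made up): explore all reachable states and return a counterexample path if a bad state is found.

```python
# Exhaustive exploration of a finite transition system for an invariant.

from collections import deque

def check_invariant(initial, transitions, invariant):
    """transitions: state -> list of successor states; invariant: state -> bool."""
    queue = deque([(initial, [initial])])
    visited = {initial}
    while queue:
        state, path = queue.popleft()
        if not invariant(state):
            return path                      # counterexample: a path to a bad state
        for succ in transitions.get(state, []):
            if succ not in visited:
                visited.add(succ)
                queue.append((succ, path + [succ]))
    return None                              # the invariant holds in every reachable state

# A small mutual-exclusion-like example where the state "cs1_cs2" is forbidden.
transitions = {
    "idle":    ["cs1", "cs2"],
    "cs1":     ["idle", "cs1_cs2"],
    "cs2":     ["idle"],
    "cs1_cs2": [],
}
print(check_invariant("idle", transitions, lambda s: s != "cs1_cs2"))
# -> ['idle', 'cs1', 'cs1_cs2']  (a counterexample trace)
```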
TVM - Model Checking • Advantages over theorem provers: • Algorithmic, so the user need only press a button and wait for the result, whereas with a theorem prover the user may need to direct the prover towards a proof • Gives counterexamples when the formula is not satisfied
TVM - Model Checking • Disadvantages of model checking: • Computational complexity, and • Some information about the system is lost when a system with an infinite number of states is abstracted into one with a finite number of states • Hence there are calls for Run-Time Verification of software
TVM - Run-Time Verification (RTV) • Some ideas of this were presented above • Observations on some RTV tools: • Some are simply debuggers with fancy features • Others provide good tracing mechanisms • Encouraging observations on RTV tools: • Some use LTL (or extensions of it) to describe the program monitor
TVM - RTV • Some use LTL as the basis for a Property Specification Language, such as PEDL and MEDL • This may serve as a basis for understanding and for theory
Call to Arms - Future Research Agenda • We need a Theory of Testing • Such a theory should integrate the good aspects of testing, model checking, and run-time verification • I shall mention some approaches (references are in our paper)
Some Approaches to a Theory of Testing • Type Systems/Abstract Interpretation • Work from compilation and type systems directed towards code optimisation can provide good information to direct the selection of test cases • Polymorphism and linearity can help • Very little work has been done so far on the Semantics of Testing (there is encouraging work from this workshop)
Some Approaches to a Theory of Testing • Developing semantic structures (e.g. domains) that facilitate testing may be something to look at • Semantics of A.I. Planning could provide a basis for a semantics of run-time verification (ref. in our paper) • Domain theory in concurrency could provide semantics for distributed-system testing (ref. in our paper)
Conclusions • A call to arms for theory builders and tool builders • Come up with good theories and better tools • Provide tools that software professionals can use to specify, design, build, test, audit, and monitor systems • Let's do it!