MOGENTES 3rd and Final Review
Reporting Period January 2010 – March 2011
Cologne, 26 May 2011
WP5 – Summary of Project Achievements
Wolfgang Herzner, AIT
Further Project Results • Not (directly) covered by the presentations so far
Comparison of Tools: OOAS and UPPAAL • Specifics • CAS model used for both • OOAS track: mutation-based; UPPAAL: state and transition coverage • different time models: • OOAS: timed observations, delay parameter on events • UPPAAL: timed automata • Results • 35 UPPAAL test cases (several test suites combined) • detected 88 of 110 mutants (80%; 83% once the 4 equivalent mutants are discounted – see the calculation below) • actually, only 3 test cases contributed • 15 OOAS test cases • full transition coverage, obtained by mutating each transition guard to FALSE • include mutants undetected by the UPPAAL test cases, which stop before reaching a discriminating observable
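The kill rates quoted above follow from the usual mutation score, in which equivalent mutants (undetectable by any test case) are removed from the denominator. A minimal sketch, purely illustrative and not part of the MOGENTES tooling:

```python
def mutation_score(killed: int, total: int, equivalent: int = 0) -> float:
    """Fraction of non-equivalent mutants killed by a test suite.

    Equivalent mutants cannot be detected by any test case,
    so they are removed from the denominator."""
    return killed / (total - equivalent)

# Figures from the CAS comparison above:
print(f"{mutation_score(88, 110):.0%}")     # -> 80% of all 110 mutants
print(f"{mutation_score(88, 110, 4):.0%}")  # -> 83% with the 4 equivalent mutants discounted
```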
Comparison of Tools: UML and Simulink • Specifics • CAS model used for both, as well as its mutation • UML (OOAS): "requirements" coverage; Simulink: "code" coverage • UML: non-determinism and asynchronicity supported; Simulink: synchronous; the UML model was modified to avoid non-determinism • Results • 5 Simulink test cases • 3 of them killed 63 of 78 UML mutants (3 mutants equivalent) • 6 UML test cases • 5 of them detected 59 of 82 Simulink mutants (7 mutants equivalent) • indication: neither set of test cases subsumes the other
Progress over Existing TCG Techniques • Mutation-Based Techniques • UML (example: CAS): • 35 TCs generated by UPPAAL detect 80% of 110 mutants • at least 8 TCs generated by AS detect 100% of these mutants • Simulink (example: SAC): • 10 test cases achieve 100% structural coverage • but detect only 107 of 881 mutants (~50% equivalent) • random testing produces larger tests with less coverage • Minimal Cut Sets (MCS) – Oriented Techniques • Fault Injection (FI): MODIFI compared to other MIFI approaches • TCs directly usable for FI on the real system • availability of > 30 fault models for different data types (boolean, integer, float) • more effective fault selection (~20% effective faults vs. less than 6%) • iLock: formal verification-based MCS generation • framework for programmable fault injection • MCS generation based on model checking (incl. heuristics) • test case export support for different formats, including XML • safety bag generation for validation of generated test cases
To reduce testing effort by at least 20% – Considerations (beyond "saved labour effort (person hours) and/or elapsed time") • Modelling versus test case writing: a fair comparison would require the same level of experience and skill on both sides, but such people are difficult to find in projects like MOGENTES • Increase of coverage: manually written test cases are usually "dedicated", e.g. to specific aspects or to faults found earlier (regression) • Quality (efficiency) of test cases: MOGENTES in particular aims at the detection of potential faults, but this may require more test cases than conventional structural coverage such as transitions or branches • Maintenance: comparing the maintenance costs of manually written test cases with those of (test/requirements) models is a long-term task
To reduce testing effort by at least 20% – Indications / 1 • Ford (FFA): • CAS example: 0.75 days of TC writing vs. 0.5 days of modelling, a 33% improvement (and higher fault coverage) • in the long range, total effort savings between 0% and 25% estimated • but savings from any avoided recall are not considered here! • Re:LAB: • at first glance, at least the same effort for MBTCG as for manual TCG • but: the manual approach reached coverage of only ~20%, and the testing process was slow and extremely test-operator dependent; minimal test sets shorten testing time significantly • ProtoTest Director (PTD) speeds testing up by a factor of 6 • combining MBTCG and PTD can • save about 50% of testing effort • improve testing quality significantly • allow use of less trained testing staff
To reduce testing effort by at least 20% – Indications / 2 • PROLAN: • test writing effort: when tests fail, in 50-90% of the cases the newly written test case itself is wrong • this causes not only additional correction effort, but also a psychological burden on the test team • cyclomatic complexity (CC, McCabe) as complexity measure: • manual: an engineer writes 20 test cases per day, 1 control element per test case • UML modelling: 5 days, 182 test cases with > 2 control elements per test case • efficiency increase of about 50%; modifications take 1-2 hours instead of 1 day before • test quality: • automatically generated tests also cover transitions, not only states • code coverage better than with manually written tests • TRSS: • large set of test cases, grown over years: high maintenance and execution costs • optimised, automatically generated test cases would reduce both (but there was not enough opportunity to scrutinise this)
To reduce testing effort by at least 20% – Indications / 3 • FI: • MIFI is about 10 times faster than HIFI • in densely used memory areas, about 20% of all FI tests produce effects; these can be identified with MIFI, saving about 80% of the HIFI time • injecting only into used memory yields a significant further reduction of testing time • the first two points save about 70% of the HIFI time (see the calculation sketched below); with the third point, even more HIFI test cases can be avoided • ProveriLock (Minimal Cut Sets): • comparison difficult due to the lack of a reference set of test cases • illustrating example for a real interlocking model with MCS of size 2: • 3481 safety requirements (generic requirements × element instances) • 985 faults injected • 59 safety requirements falsified • 64 test cases derived • Fault-based Testing: ratios of 1:10 between test cases and detected mutants are possible
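The ~70% figure can be reproduced with a back-of-the-envelope calculation using the numbers from this slide; the campaign size of 1000 injections is a made-up placeholder:

```python
# MIFI as a pre-filter for HIFI (speed factor and effectiveness rate from this slide).
hifi_cost = 1.0   # normalised cost of one hardware fault injection (HIFI)
mifi_cost = 0.1   # MIFI is about 10x faster than HIFI
n = 1000          # hypothetical number of fault injection tests
effective = 0.20  # ~20% of injections produce an observable effect

pure_hifi = n * hifi_cost
# Pre-filter everything in MIFI, then repeat only the effective tests on hardware.
prefiltered = n * mifi_cost + n * effective * hifi_cost
print(f"saving: {1 - prefiltered / pure_hifi:.0%}")  # -> saving: 70%
```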
To generate efficient test cases from system and fault models, using new, domain-specific coverage metrics, for functional and non-functional system properties of new and existing embedded systems / 1 • To generate efficient test cases from system and fault models • System models: Simulink and UML; see D5.2 .. D5.5 • Fault models: fault-based model mutation is a central MOGENTES topic, see also D3.1 • Efficiency: see the discussion before • Using new, domain-specific coverage metrics • The fault models used are modelling-language-specific rather than domain-specific, but their application can be semantically interpreted in the respective application domain – examples on the next slide • This approach was chosen because it is very difficult to identify domain-specific coverage metrics that can be regarded as (sufficiently) complete and cannot be mapped to other coverage metrics • Domain-specific modelling languages would support this, but were out of scope of MOGENTES
To generate efficient test cases from system and fault models, using new, domain-specific coverage metrics, for functional and non-functional system properties of new and existing embedded systems / 2 • Example of domain-specific semantics of fault models (state machine excerpt from the interlocking model, sketched in code below) • original: transition Idle → AdmissibilityCheck on requ_trainroute, guarded by [#elem(disturbed) = 0] • mutant "wrong trigger" (release_trainroute): the wrong event causes initiation of train route checking • mutant "guard removed": the admissibility check is started in any case (even if the route contains disturbed elements) • mutant "wrong target" (Idle → Setup): no admissibility check at all (set-up started immediately)
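The three mutants above are instances of simple transition mutation operators (wrong trigger, dropped guard, wrong target). A minimal sketch of such operators, assuming a toy transition structure; the names mirror the interlocking example, but the representation is illustrative and not the MOGENTES model format:

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional

@dataclass(frozen=True)
class Transition:
    source: str
    target: str
    trigger: str
    guard: Optional[Callable[[dict], bool]]  # None = unguarded

# Original: the admissibility check starts only if no element is disturbed.
orig = Transition("Idle", "AdmissibilityCheck", "requ_trainroute",
                  lambda env: len(env["disturbed"]) == 0)

wrong_trigger = replace(orig, trigger="release_trainroute")  # mutant 1: wrong event starts the check
dropped_guard = replace(orig, guard=None)                    # mutant 2: check starts in any case
wrong_target  = replace(orig, target="Setup")                # mutant 3: set-up without any check
```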
To generate efficient test cases from system and fault models, using new, domain-specific coverage metrics, for functional and non-functional system properties of new and existing embedded systems / 3 • For functional and non-functional system properties • Functional (also functional safety): addressed in the UML and Simulink tracks • Non-functional: fault tolerance and robustness addressed in the FI and iLock tracks • Of new and existing embedded systems • New: FFA's SAC, RELAB's bucket control • Existing: FFA's CAS, TRSS' ELEKTRA, PROL's ELPULT
To establish a framework for integration of involved tools, including model transformations to prepare inputs for model checkers etc., which can be easily used by domain experts • Tool integration framework (WP2) – see D2.2c • Model transformation tools included in the framework, e.g. • UML/OCL → OOAS • AS → LTS • Simulink/C → goto programs • TCG tools: integrated by means of tool adapters, which eases the use of the tools by domain experts (see the adapter sketch below)
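A sketch of the tool adapter idea, assuming a hypothetical adapter interface; the framework's actual interfaces are defined in D2.2c:

```python
from abc import ABC, abstractmethod

class TcgToolAdapter(ABC):
    """Uniform facade over one test case generator, so domain experts see
    the same interface regardless of the underlying tool."""

    @abstractmethod
    def import_model(self, path: str) -> None: ...

    @abstractmethod
    def generate_test_cases(self) -> list[str]: ...

class UppaalAdapter(TcgToolAdapter):
    def import_model(self, path: str) -> None:
        # A real adapter would run the model transformation chain here.
        self.model_path = path

    def generate_test_cases(self) -> list[str]:
        # A real adapter would invoke the tool and collect its traces.
        return [f"trace generated from {self.model_path}"]
```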
To provide traceability of requirements and match them to test analysis results • UML tracks (black box TCG): • requirement references are added to the UML model (stereotypes + tags) • each step in the tool chain provides a tracing map between its input and output artefact elements (e.g. a csv file) • the traceability manager of the framework collects these maps and combines them into a tracing graph (a collection sketch follows below) • this graph can be used by other tools, e.g. the AIT Generic Test Bench, for mapping requirements (via mutations) to test cases • See also D2.2c
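A minimal sketch of how per-step tracing maps could be collected into a tracing graph, assuming two-column CSV maps; the file layout and function names are illustrative, not the framework's actual API:

```python
import collections
import csv

def load_tracing_map(path: str) -> list[tuple[str, str]]:
    """Read one tool-chain step's map: rows of (input element, output element)."""
    with open(path, newline="") as f:
        return [(row[0], row[1]) for row in csv.reader(f)]

def build_tracing_graph(step_maps: list[list[tuple[str, str]]]) -> dict[str, list[str]]:
    """Chain the per-step maps into one graph, so a requirement can be
    followed via its mutations down to the generated test cases."""
    graph: dict[str, list[str]] = collections.defaultdict(list)
    for tracing_map in step_maps:
        for src, dst in tracing_map:
            graph[src].append(dst)
    return dict(graph)
```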
To foster application of automated testing for satisfying functional safety standards requirements • IEC 61508 revision • model-based testing (MBT) and test case generation (MBTCG) added as a recommended technique (AIT) • ISO/DIS 26262 • analogous input (AIT via the Austrian Standards Institute) • SMT-Lib (Satisfiability Modulo Theories) • proposals for additional standard theories submitted (ETH): • a theory of sets, lists and maps • a theory of floating-point arithmetic • ISO 11783 (ISOBUS), ISO 25119, IEC 62061 (machinery sector), CEN 50126, CEN 50128 and CEN 50129, AUTOSAR • concepts of MBT and MBTCG introduced by AIT, SP, FFA and other MOGENTES partners • EWICS TC7, ERCIM, Artemis Standards WG, EPoSS domain WGs • MOGENTES outcomes presented (AIT)
In general, to increase the confidence in safety-relevant embedded systems by improving their testing and proving their conformance with safety standards / 1 • Long-term goal, not fully measurable during the project • However, some first indications are available: • ETH applied their TCG procedure to Simulink models from various industrial domains, such as avionics (AIRBUS), automotive (Toyota) and railway • RELAB: • replacing manual by automatic TCG increases trust in the test cases: in the manual case, oracles are defined in close cooperation with the SW developer, which violates the independence of testing from development
In general, to increase the confidence in safety-relevant embedded systems by improving their testing and proving their conformance with safety standards / 2 • PROL: • it is difficult to build an independent, well-motivated test team that wants to understand the program and is strong enough to stand its ground in debates with the developers • with model-based test case generation, the tool chain can replace this independent test team • in the maintenance phase, the reaction time can be radically decreased (since "restarting" the test team is no longer necessary), without decreasing the quality of the tests • erroneous newly written manual test cases: see also slide 9 (psychological side effects)
Mutation-based TCG: State Space Explosion • Partial Order Reduction • avoid insignificant sequences of interleavings (illustrated below) • the alternative "single observable" is not always possible • Symbolic (Concolic) Execution • deal with large sets of parameter values more efficiently (equivalence classes) • Time Handling • deal with timed behaviour more efficiently ("real-time ioco") • AS: currently, timed behaviour is encoded as action parameters and guards • Online Testing • use the implementation's response to avoid non-determinism from the test model • Model Simplification (Splitting, Sub-Models) • e.g.: separate functionality or fix some input parameters • may require test case composition and handling of emerging/lost behaviour • Qualitative Fault Modelling • hierarchical handling of the complexity problem of test generation
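To see why partial order reduction pays off: n mutually independent actions give n! interleavings, of which a single representative order suffices. A small illustration of the blow-up:

```python
from math import factorial

# n mutually independent actions have n! interleavings; partial order
# reduction explores one representative order instead of all of them.
for n in (4, 8, 12):
    print(n, factorial(n))  # 4 -> 24, 8 -> 40320, 12 -> 479001600
```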
Mutation-based TCG for White Box Testing • Avoidance of Equivalent Mutants • Note: a general issue (it also causes state space explosion), but particularly relevant for white box TCG • approaches (ETH/UOXF): • k-induction for equivalence checking of Simulink models, based on a recent formulation of k-induction for programs • static analysis to derive invariants from the Simulink model structure • Generate Long Test Cases (using Bounded Model Checking, BMC) • reasons: few long TCs are often preferred over many short TCs; robustness testing • approach: concatenate counterexample traces from multiple BMC iterations (the final states of preceding traces are the initial states of subsequent traces) – see the sketch below • can also be used in the UML/OOAS track • Precise Automated Verification of Floating-Point Arithmetic (FPA) • based on the standard theory for FPA submitted to SMT-Lib v4 • assessment of the compatibility of IEEE 754 with ISO 11783 (ISOBUS), ISO 25119, IEC 62061 (machinery sector), ISO 26262, AUTOSAR etc. • development of SMT solvers with FPA support + benchmarks (SMT-Lib)
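A sketch of the trace-concatenation idea for long test cases; `bmc_counterexample` stands in for whatever BMC back end produces a bounded counterexample trace and is purely hypothetical:

```python
def generate_long_test(model, initial_state, max_iterations, bmc_counterexample):
    """Concatenate counterexample traces from several BMC runs: the final
    state of each trace becomes the initial state of the next run."""
    test_case = []
    state = initial_state
    for _ in range(max_iterations):
        trace = bmc_counterexample(model, start=state)  # hypothetical BMC call
        if not trace:            # no further counterexample within the bound
            break
        test_case.extend(trace)
        state = trace[-1]        # continue from where the last trace stopped
    return test_case
```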
Mutation-based TCG for Black Box Testing • "Good" Test Models • the "right" level of abstraction • avoid TCs that are more restrictive than the requirements demand (ioco supports this) • automated de-factoring to support more abstract modelling • "Good" Requirements • avoid ambiguity and under-specification • ioco allows unspecified behaviour; even behaviour meant to be prohibited is allowed if it is not explicitly forbidden • use formalised notations, e.g. URN (Rec. ITU-T Z.151, 2008) • Reduction of Modelling Effort • automated integration of separately modelled requirements possible? • automated generation of models from requirements • Fault Models / Mutations / Equivalent Mutants (EqM) • reachability analysis to avoid mutation of unreachable model parts (see the sketch below) • select "effective" fault models by experience: generate no/few EqM
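A minimal sketch of the reachability filter, assuming transitions are plain (source, target) pairs; the real analysis works on UML/OOAS models:

```python
import collections

def reachable_states(transitions, initial):
    """BFS over (source, target) pairs. Mutating only reachable parts avoids
    mutants that are trivially equivalent because they can never be executed."""
    seen, queue = {initial}, collections.deque([initial])
    while queue:
        state = queue.popleft()
        for src, dst in transitions:
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen

transitions = [("Idle", "AdmissibilityCheck"), ("AdmissibilityCheck", "Setup"),
               ("Orphan", "Idle")]  # "Orphan" is never reached from "Idle"
print(reachable_states(transitions, "Idle"))  # {'Idle', 'AdmissibilityCheck', 'Setup'}
```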
Challenges for MIFI • Extend MODIFI for • regression tests, to identify effects on modified models • FI into constants • support of state charts • timing and control-flow errors in an efficient way (currently, FI into the data flow is supported; a bit-flip sketch follows below) • i.e. such that they have a relevant connection to the respective HW faults, e.g. bit-flip faults in the program counter • Design Support • components for fault tolerance and robustness, e.g. robust integrators, double storage, voters for replica determinism, available during model design • also for other non-functional properties such as power consumption or use of computational resources
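A bit-flip fault model on data-flow signals, of the kind MODIFI applies, can be illustrated as follows; a minimal sketch for integer and double-precision signals (MODIFI itself provides more than 30 fault models):

```python
import struct

def flip_bit_int(value: int, bit: int) -> int:
    """Single bit-flip fault model for an integer signal."""
    return value ^ (1 << bit)

def flip_bit_double(value: float, bit: int) -> float:
    """Single bit-flip on the IEEE 754 double representation of a signal."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return flipped

print(flip_bit_int(8, 0))        # 9: bit 0 flipped
print(flip_bit_double(1.0, 0))   # 1.0000000000000002: least significant mantissa bit
print(flip_bit_double(1.0, 62))  # inf: highest exponent bit flipped
```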
Conclusions • Achievement of the main goal – at least 20% savings of test effort at the same or a higher quality level – demonstrated • First results already used in industry, both within the project (e.g. RELAB, FFA, PROLAN) and outside (e.g. Airbus, Toyota) • All project partners will continue working on MOGENTES topics, on both the application and the research side • Improvements and extensions identified (see previous slides) • Some follow-up activities already on track or initiated • Austrian national project TRUFAL (with all Austrian MOGENTES partners) • FP7 Call 7: FAMOS submitted