An Experimental Evaluation on Reliability Features of N-Version Programming Presented by Onur TÜRKYILMAZ Authors Xia Cai, Michael R. Lyu and Mladen A. Vouk International Symposium on Software Reliability Engineering 2005 (ISSRE’05)
Outline • Introduction • Motivation • Experimental evaluation • Fault analysis • Failure probability • Fault density • Reliability improvement • Discussions • Conclusion
Introduction • N-version programming is one of the main techniques for software fault tolerance • It has been adopted in some mission-critical applications • Yet, its effectiveness is still an open question • What is the reliability enhancement? • Is the fault correlation between multiple versions a big issue that affects the final reliability?
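The core idea above, running N independently developed versions and voting on their outputs, can be sketched in a few lines. This is an illustrative sketch only; the function name and the simple majority rule are assumptions, not the paper's actual decision mechanism.

```python
# Minimal sketch of N-version majority voting (illustrative; the
# names and the plain-majority rule are assumptions, not the paper's).
from collections import Counter

def nvp_vote(outputs):
    """Return the majority output of N independently developed versions,
    or None when no strict majority exists (signalling system failure)."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None

# Three versions: two agree, one produces a faulty result.
print(nvp_vote([42, 42, 41]))  # -> 42
print(nvp_vote([1, 2, 3]))     # -> None (no majority)
```

Note that a real voter for the avionics application would compare floating-point sensor estimates within a tolerance rather than requiring exact equality.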
Research questions • What is the reliability improvement of NVP? • Is fault correlation a big issue that will affect the final reliability? • What kind of empirical data can be comparable with previous investigations?
Motivation • To address the reliability and fault correlation issues in NVP • To conduct a comparable experiment with previous empirical studies • To investigate the “variant” and “invariant” features in NVP
Experimental background • Some features about the experiment • Complexity • Large population • Well-defined • Statistical failure and fault records • Previous empirical studies • UCLA Six-Language project • NASA 4-University project • Knight and Leveson’s experiment • Lyu-He study
Experimental setup • RSDIMU avionics application • 34 program versions • A team of 4 students • Comprehensive testing exercised • Acceptance testing: 800 functional test cases and 400 random test cases • Operational testing: 100,000 random test cases • Failures and faults collected and studied • Qualitative as well as quantitative comparisons with NASA 4-University project performed
Experimental description • Geometry: estimating the vehicle acceleration using eight redundant accelerometers (sensors) • Sensors mounted on the four triangular faces of a semioctahedron
Comparisons between the two projects • Qualitative comparisons • General features • Fault analysis in development phase & operational test • Quantitative comparisons • Failure probability • Fault density • Reliability improvement
Fault analysis in development phase • Common related faults • Display module (easiest part) • Calculation in wrong frame of reference • Initialization problems • Missing certain scaling computation • Faults in NASA project only • Division by zero • Incorrect conversion factor • Wrong coordinate system problem
Fault analysis in development phase (cont’) • Both cause and effect of some related faults remain the same • Related faults occurred in both easy and difficult subdomains • Some common problems, e.g., initialization problem, exist for different programming languages • The most fault-prone module is the easiest part of the application
Input/Output domain classification • Normal operations are classified as: S_{i,j} = { i sensors previously failed and j of the remaining sensors fail }, for i = 0, 1, 2 and j = 0, 1 • Exceptional operations: S_others
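The subdomain partition above maps directly to a small classifier. A sketch (function name and string labels are illustrative assumptions):

```python
def classify(prev_failed, newly_failed):
    """Map a test case to subdomain S_{i,j}: i sensors previously
    failed, j of the remaining sensors fail now. Cases outside the
    normal ranges (i in 0..2, j in 0..1) fall into S_others."""
    if prev_failed in (0, 1, 2) and newly_failed in (0, 1):
        return f"S{prev_failed},{newly_failed}"
    return "S_others"

print(classify(0, 1))  # -> S0,1
print(classify(3, 0))  # -> S_others
```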
Failures in operational test • States S_{0,0}, S_{1,0} and S_{2,0} are more reliable than states S_{0,1}, S_{1,1} and S_{2,1} • The exceptional state reveals most of the failures • The failure probability in S_{0,1} is the highest • The programs exhibit high reliability on average
Coincident failures • Two or more versions fail on the same test case, whether or not their outputs are identical • The percentage of coincident failures versus total failures is low: • Version 22: 25/618=4% • Version 29: 32/2760=1.2% • Version 32: (25+32)/1351=4.2%
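The percentages quoted above follow directly from the reported counts:

```python
# Reproducing the coincident-failure percentages quoted on the slide:
# coincident failures divided by that version's total failures.
def pct(coincident, total):
    return 100 * coincident / total

print(round(pct(25, 618), 1))        # -> 4.0  (version 22)
print(round(pct(32, 2760), 1))       # -> 1.2  (version 29)
print(round(pct(25 + 32, 1351), 1))  # -> 4.2  (version 32)
```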
Failure bounds for 2-version system • Lower and upper bounds for the coincident failure probability under the Popov et al. model • DP1: normal test cases without sensor failures dominate all the test cases • DP3: test cases evenly distributed across all subdomains • DP2: between DP1 and DP3
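The flavour of such subdomain-based bounds can be sketched as follows. The exact formulas are an assumption here, not taken from the paper: the upper bound uses the fact that the joint failure probability in a subdomain cannot exceed the smaller of the two per-version probabilities, while the lower estimate assumes independence within each subdomain.

```python
# Hedged sketch of bounds on the coincident-failure probability of a
# 2-version system over a subdomain profile (formulas are assumptions
# in the spirit of the Popov et al. model, not the paper's own).
def coincident_bounds(profile, p_a, p_b):
    """profile: subdomain usage probabilities (sum to 1);
    p_a, p_b: per-subdomain failure probabilities of the two versions."""
    # Upper bound: P(A and B fail) <= min(p_a, p_b) in each subdomain.
    upper = sum(q * min(a, b) for q, a, b in zip(profile, p_a, p_b))
    # Lower estimate: independence within each subdomain.
    lower = sum(q * a * b for q, a, b in zip(profile, p_a, p_b))
    return lower, upper

# DP1-style profile: normal no-sensor-failure cases dominate.
lo, hi = coincident_bounds([0.9, 0.08, 0.02],
                           [1e-4, 1e-3, 1e-2],
                           [2e-4, 5e-3, 2e-2])
print(lo <= hi)  # -> True
```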
Quantitative comparison in operational test • NASA 4-university project: 7 out of 20 versions passed the operational testing • Coincident failures were found among 2 to 8 versions • 5 out of 7 faults were not observed in our project
Invariants • Reliable program versions with low failure probability • Similar number of faults and fault density • Distinguishable reliability improvement for NVP, with 10^2 to 10^4 times enhancement • Related faults observed in both difficult and easy parts of the application
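The 10^2 to 10^4 enhancement factor is simply the ratio of single-version to fault-tolerant-system failure probability. A back-of-envelope illustration with hypothetical numbers (not the paper's measurements):

```python
# Illustrative reliability-improvement factor; both probabilities
# below are assumed values, not figures reported in the study.
single_version_pfd = 1e-3   # assumed average single-version failure prob.
nvp_system_pfd = 1e-6       # assumed failure prob. after majority voting
print(round(single_version_pfd / nvp_system_pfd))  # -> 1000
```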
Variants • Compared with the NASA project, our project shows: • Some faults not observed • Fewer failures • Fewer coincident failures • Only 2-version coincident failures (rather than 2- to 8-version failures) • The overall reliability improvement is an order of magnitude larger
Discussions • The improvement of the project may be attributed to: • stable specification • better programming training • experience in NVP experiments • cleaner development protocol • different programming languages & platforms
Discussions (cont’) • The hard-to-detected faults are only hit by some rare input domains • New testing strategy is needed to detect such faults: • Code coverage? • Domain analysis?
Conclusion • An empirical investigation was performed to evaluate reliability features through a comprehensive comparison of two NVP projects • NVP can provide a distinguishable improvement in final reliability according to the empirical study conducted • The small number of coincident failures provides supporting evidence for NVP • Possible attributes that may affect the reliability improvement are discussed
Thank you ! Q & A