Initiative Title: Integrating Formal Methods and Testing in a Quantitative Software Reliability Assessment Framework
Towards Practical Software Reliability Assessment for IV&V Projects
B. Cukic, E. Gunel, H. Singh, V. Cortellessa
Department of Computer Science and Electrical Engineering
Department of Statistics
West Virginia University
September 2001
Overview
• Introduction and Motivation
• Application example: DOLILU II
• Assessment of Software Reliability
• Statistical sampling
• Bayesian approach with ignorance priors
• Bayesian approach with non-ignorance priors
• Bayesian Hypothesis Testing
• Conclusions and Further Work
Introduction
• Quantification of V&V activities is always desirable, but
• Is software reliability assessment practical for IV&V of safety/mission critical systems?
• Time and cost considerations may limit the appeal
• Reliability growth models are applicable only to integration testing, the tail end of IV&V
• Estimation of operational usage profiles is rare
• Lifecycle V&V results are ignored
• Opinions may limit the appeal too
Introduction
• Regulatory view: DO-178B (Software Considerations in Airborne Systems and Equipment Certification)
“… methods for estimating the post-verification probabilities of software errors were examined. The goal was to develop numerical requirements for such probabilities for software in computer-based airborne systems or equipment. The conclusion reached, however, was that currently available methods do not provide results in which confidence can be placed to the level required for this purpose... If the applicant proposes to use software reliability models for certification credit, rationale for the model should be included in the Plan for Software Aspects of Certification, and agreed with by the certification authority.”
Why impractical?
[Figure: lifecycle diagram. Labels: Req, Design, Code, Test (Verification & Validation), Unit, Integration, Acceptance, Implementation, IV&V.]
• Most verification and validation techniques are qualitative in nature.
• Typical approaches to software reliability estimation are based exclusively on operational (system) testing.
• This neglects the investment made in other V&V techniques
• Requirements readings, inspections, problem reports and tracking, unit level tests…
Motivation
• Can software reliability assessment benefit from:
• Lifecycle-long IV&V activities?
• Qualitative (formal and informal) V&V methods?
• Can the amount of testing needed to assess mission critical reliability levels be reduced?
• Realistic case study
Case Study
[Figure: Integrated Day-Of-Launch I-Load Update (DOLILU) System block diagram. Labels: Executive; Generate Wind Profile; Design Guidance Cmds; Simulate Trajectory; Verify Trajectory and Guidance Cmds; Evaluate Results; Generate Range Data; Transmit Guidance Cmds and Range Data; DIVDT; Range Safety; Shuttle Orbiter; Wind Data; Tracking Data; Decision Data; Range Data; Guidance Cmds.]
DOLILU II Assessment Goals
• Failure probability under 10^-4
• Due to the criticality of the program, required confidence level should surpass 0.99
• Available methods for reliability estimation:
• Formal verification: virtually impossible
• Rigorous inspections, fault based and white box testing performed by an independent IV&V team
• Done, but observations were never quantified
• Reliability growth models cannot be used
Reliability Assessment Framework
• Random testing and Bayesian inference chosen for assessment
• Assessment must take into account failure free operational use of DOLILU II, and the results of performed V&V activities
• Bayesian inference
• Allows inclusion of a subjective probability of failure
• Subjective estimate based on observed behavior, reflects beliefs
• Hypothesis on the event occurrence probability is combined with new evidence, which may change the degree of belief
• In reliability assessment, Beta distribution is frequently used due to its mathematical flexibility and tractability
• Beta distributions form a conjugate family
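The conjugacy noted above means a Beta prior on the failure probability q stays a Beta distribution after observing random tests. A minimal sketch in Python, assuming independent tests with a constant failure probability; the function names and the sample numbers (a uniform Beta(1, 1) prior, 1000 failure free tests) are hypothetical and not taken from the slides:

```python
from typing import Tuple


def beta_update(a: float, b: float, tests: int, failures: int) -> Tuple[float, float]:
    """Conjugate update: Beta(a, b) prior on q becomes Beta(a + f, b + n - f)."""
    if tests < 0 or failures < 0 or failures > tests:
        raise ValueError("need 0 <= failures <= tests")
    return a + failures, b + (tests - failures)


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical illustration: uniform prior, then 1000 failure free tests.
prior = (1.0, 1.0)
posterior = beta_update(*prior, tests=1000, failures=0)
print("posterior parameters:", posterior)
print("posterior mean of q:", beta_mean(*posterior))
```

The update only adds counts to the two parameters, which is the tractability the slide refers to.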
Statistical Assessment (no assumptions)
• P(q < 10^-4) >= 0.99. Required testing effort (N), from random sampling:
• Number of test cases as a function of the required failure rate, with C = 0.99

Required failure rate    Number of tests
10^-2                    458
10^-3                    4,602
10^-4                    46,048
10^-5                    460,514
10^-6                    4,605,167

Required testing effort is not realistic.
Bayesian assessment (non-ignorance priors)
• DOLILU underwent extensive IV&V
• Partial correctness proofs, requirements & design readings, code inspections, rigorous development practices
• Sound formulation of prior beliefs is subject to further research
• Historical data on failure occurrences under the same IV&V regime
• Historical data on failure occurrence reduction following the application of the specific verification techniques
• Process effectiveness measures [Smidts 98]
• Represent the application of a specific verification method by an appropriate number of random tests [Miller 94]
Bayesian estimation (non-ignorance priors)
• Assume that we can say that the system has achieved desired reliability prior to certification testing.
• This “guess” should be “reasonably accurate”
• Use random tests (operational profile) to corroborate assumed system failure probability
• How many random tests U should be performed?
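One way to make the question about U concrete is sketched below. It is an illustration under stated assumptions, not the authors' procedure: prior belief is modeled as a uniform Beta(1, 1) prior plus a hypothetical number of "equivalent" failure free tests credited to earlier IV&V, so the prior is Beta(1, 1 + prior_tests). After U further failure free tests the posterior is Beta(1, 1 + prior_tests + U), whose CDF at q0 is 1 - (1 - q0)^(1 + prior_tests + U). All names and the figure of 40,000 equivalent tests are invented for the example.

```python
import math


def posterior_confidence(q0: float, prior_tests: int, new_tests: int) -> float:
    """P(q < q0) under a Beta(1, 1 + prior_tests + new_tests) posterior (no failures seen)."""
    exponent = 1 + prior_tests + new_tests
    return -math.expm1(exponent * math.log1p(-q0))  # 1 - (1 - q0)**exponent


def corroboration_tests(q0: float, confidence: float, prior_tests: int) -> int:
    """Smallest U of failure free tests reaching the target confidence."""
    needed_exponent = math.ceil(math.log(1.0 - confidence) / math.log1p(-q0))
    return max(0, needed_exponent - 1 - prior_tests)


# Hypothetical: IV&V evidence credited as 40,000 equivalent failure free tests.
u = corroboration_tests(q0=1e-4, confidence=0.99, prior_tests=40_000)
print("additional failure free tests U:", u)
print("resulting confidence:", posterior_confidence(1e-4, 40_000, u))
```

The result is only as good as the number of equivalent tests assigned to the prior, which is the open research question the slides raise.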
Benefits
• What if corroboration testing is not failure free?
• Keep adjusting the value of U [Littlewood 97]
Bayesian Hypothesis Testing (BHT)
• Problem of Bayesian estimation:
• Categorical assumption that the program meets required reliability must be made.
• We can put a probability on this assumption!
• Certification testing now searches for the evidence in favor of the hypothesized reliability
• H0: q <= q0 (null hypothesis)
• H1: q > q0 (alternative hypothesis)
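A minimal sketch of how such a test could be computed, assuming (this is not stated on the slide) that q is uniformly distributed on [0, q0] under H0 and on (q0, 1] under H1, and that all n certification tests are failure free. The prior probability of H0 and the test counts below are hypothetical. Under these assumptions the marginal likelihoods are m0 = (1 - (1 - q0)^(n+1)) / ((n + 1) q0) and m1 = (1 - q0)^n / (n + 1), and Bayes' rule gives the posterior probability of H0.

```python
def posterior_h0(q0: float, prior_h0: float, n: int) -> float:
    """Posterior P(H0: q <= q0) after n failure free tests, piecewise-uniform prior on q."""
    if not (0.0 < q0 < 1.0 and 0.0 < prior_h0 < 1.0 and n >= 0):
        raise ValueError("need 0 < q0 < 1, 0 < prior_h0 < 1, n >= 0")
    # Marginal likelihood of n passes, averaging (1 - q)**n over each hypothesis.
    m0 = (1.0 - (1.0 - q0) ** (n + 1)) / ((n + 1) * q0)
    m1 = (1.0 - q0) ** n / (n + 1)
    return prior_h0 * m0 / (prior_h0 * m0 + (1.0 - prior_h0) * m1)


# Hypothetical: 90% prior belief in H0 from IV&V, q0 = 10^-4.
for n in (0, 1_000, 5_000, 20_000):
    print(n, posterior_h0(q0=1e-4, prior_h0=0.9, n=n))
```

With no tests the posterior equals the prior, and each failure free test shifts belief toward H0, so a stronger prior reaches a given posterior with fewer tests.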
Summary
• Bayesian framework for reliability assessment allows:
• Inclusion of IV&V activities into the reliability assessment.
• A significant reduction in the number of tests.
• Software reliability of DOLILU can be assessed with a reasonable effort.
• CAUTION: Do you trust your (I)V&V methods?
• Research directions
• Sound formulation of prior beliefs from IV&V.
• Can prior beliefs be based on the “formality” of the V&V methods (formal methods)?
• Inclusion of CRITICALITY and RISK parameters.
• Other case studies!!!