A Framework for Early Reliability Assessment (WVU UI: Integrating Formal Methods and Testing in a Quantitative Software Reliability Assessment Framework, 2003) Bojan Cukic, Erdogan Gunel, Harshinder Singh, Lan Guo, Dejan Desovski (West Virginia University); Carol Smidts, Ming Li (University of Maryland)
Overview • Introduction and Motivation. • Software Reliability Corroboration Approach. • Case Studies. • Applying Dempster-Shafer Inference to NASA datasets. • Summary and Further Work.
Introduction • Quantification of the effects of V&V activities is always desirable. • Is software reliability quantification practical for safety/mission-critical systems? • Time and cost considerations may limit its appeal. • Reliability growth models apply only to integration testing, the tail end of V&V. • Estimation of operational usage profiles is rare.
Is SRE Impractical for NASA IV&V? • Most IV&V techniques are qualitative in nature. • Mature software reliability estimation methods are based exclusively on testing. • Can IV&V techniques be utilized for reliability? • Requirements readings, inspections, problem reports and tracking, unit-level tests… [Figure: traditional software reliability assessment techniques cover only the test phases (unit, integration, acceptance) of the lifecycle (requirements, design, implementation, test), while IV&V is lifecycle-long.]
Contribution • Develop software reliability assessment methods that build on: • Stable and mature development environments. • Lifecycle-long IV&V activities. • Utilize all relevant available information: • Static (SIAT), dynamic, requirements problems, severities. • Qualitative (formal and informal) IV&V methods. • Strengthening the case for IV&V across the NASA enterprise. • Accurate, stable reliability measurement and tracking. • Available throughout the development lifecycle.
Assessment vs. Corroboration • Current thinking: • Software reliability is “tested into” the product through integration and acceptance testing. • Our thinking: • Why “waste” the results of all the qualitative IV&V activities? • Testing should corroborate that the lifecycle-long IV&V techniques are giving the “usual” results, i.e., that the project follows the usual quality patterns.
Approach [Figure: the corroboration pipeline. Software Quality Measures (SQM1 … SQMj), collected throughout the software development lifecycle, feed Reliability Prediction Systems (RPS1 … RPSm); RPS combination techniques (experience, learning, Dempster-Shafer, …) yield the null hypothesis H0 and alternative hypothesis Ha; BHT-based software reliability corroboration testing then produces a trustworthy software reliability measure.]
Software Quality Measures (roots) • The following were used in experiments (see the sketch below): • Lines of code. • Defect density: the number of defects that remain unresolved after testing, divided by the LOC. • Test coverage: LOCtested / LOCtotal. • Requirements traceability: RT = #_requirements_implemented / #_original_requirements. • Function points. • … • In principle, any available measures could/should be taken into account, by defining appropriate Reliability Prediction Systems (RPS).
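A minimal sketch of the three root-measure definitions given above, in Python (function and variable names are illustrative, not from the original):

```python
def defect_density(unresolved_defects: int, loc: int) -> float:
    """Defects remaining unresolved after testing, per line of code."""
    return unresolved_defects / loc

def test_coverage(loc_tested: int, loc_total: int) -> float:
    """Fraction of implemented lines of code exercised by the tests."""
    return loc_tested / loc_total

def requirements_traceability(implemented: int, original: int) -> float:
    """RT = #requirements implemented / #original requirements."""
    return implemented / original
```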
Reliability Prediction Systems • An RPS is a complete set of measures from which software reliability can be predicted. • The bridge between an RPS and software reliability is a MODEL. • Therefore, select (and collect) those measures that have the highest relevance to reliability. • Relevance to reliability ranked from expert opinions [Smidts 2002].
RPS for Test Coverage • Root measure: test coverage. • Support measures: implemented LOC (LOCI), tested LOC (LOCT), number of defects found by test (N0), missing function points (FPM), backfiring coefficient (k), defects found by test (DT), linear execution time (TL), execution time per demand (t), fault exposure ratio (K). • Model notation: C0 = defect coverage; C1 = test coverage (statement coverage); a0, a1, a2 = model coefficients; N0 = number of defects found by test; N = number of defects remaining; K = fault exposure ratio; TL = linear execution time; t = average execution time per demand.
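The model equations on this slide did not survive extraction. As a hedged reconstruction, the notation above is consistent with Musa's exponential execution-time model combined with a fitted coverage model; the functional form g linking C1 to C0 is an assumption:

```latex
C_0 = g(C_1;\, a_0, a_1, a_2)   % defect coverage estimated from test coverage via a fitted model g
N   = N_0 \,\frac{1 - C_0}{C_0} % defects remaining, given N_0 defects found by test
R   = \exp\!\left(-\frac{K \, N \, t}{T_L}\right) % per-demand reliability (Musa's exponential model)
```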
Approach [Figure: the corroboration pipeline repeated from the earlier Approach slide, here leading to the software reliability measure.]
Reliability “worthiness” of different RPS [Table: 32 measures ranked by five experts according to their relevance to reliability.]
Combining RPS • Weighted sums used in initial experiments (see the sketch below): • RPS results weighted by the expert-opinion index. • Removing inherent dependencies/correlations. • Dempster-Shafer (D-S) belief-network approach developed: • Network built automatically from datasets by the induction algorithm. • Existence of suitable NASA datasets? • Pursuing leads with several CMM level 5 companies.
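A minimal sketch of the weighted-sum combination, assuming each RPS yields a failure-rate estimate and carries an expert-opinion relevance weight (all names and numbers below are illustrative):

```python
def combine_rps(predictions: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted-sum combination of per-RPS failure-rate predictions.

    Each RPS prediction is weighted by its expert-opinion relevance index,
    normalized so the weights over the available RPS sum to one.
    """
    total = sum(weights[name] for name in predictions)
    return sum(predictions[name] * weights[name] / total for name in predictions)

# Illustrative values only:
predictions = {"test_coverage": 0.08, "defect_density": 0.10, "req_traceability": 0.07}
weights = {"test_coverage": 0.9, "defect_density": 0.8, "req_traceability": 0.6}
theta_0 = combine_rps(predictions, weights)  # hypothesized failure rate fed into BHT
```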
Approach [Figure: the corroboration pipeline repeated, here leading to the software reliability prediction.]
Bayesian Inference • Allows for the inclusion of an imprecise (subjective) probability of failure. • The subjective estimate reflects beliefs. • A hypothesis on the event occurrence probability is combined with new evidence, which may change the degree of belief.
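The belief-updating step is Bayes' theorem; for the two hypotheses used here:

```latex
P(H_0 \mid E) \;=\; \frac{P(E \mid H_0)\,P(H_0)}{P(E \mid H_0)\,P(H_0) + P(E \mid H_a)\,P(H_a)}
```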
Bayesian Hypothesis Testing (BHT) • The hypothesized reliability H0 comes as a result of the RPS combination. • Based on the level of (in)experience, a degree of belief P(H0) is assigned. • Corroboration testing then looks for evidence in favor of the hypothesized reliability. • H0: q ≤ q0 (null hypothesis); Ha: q > q0 (alternative hypothesis).
The number of corroboration tests according to BHT theory
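A minimal sketch of how such a test count can be computed, under a simplified two-point prior (q = q0 with prior belief p0, q = q1 > q0 otherwise); this is an illustrative model, not necessarily the exact BHT derivation used in the paper:

```python
import math

def corroboration_tests(q0: float, q1: float, p0: float, c: float) -> int:
    """Failure-free demands needed for the posterior belief in H0 (q = q0)
    to reach confidence c, under a two-point prior:
    q = q0 with probability p0, q = q1 > q0 with probability 1 - p0.

    Posterior odds after n failure-free demands:
        [p0 / (1 - p0)] * [(1 - q0) / (1 - q1)]**n  >=  c / (1 - c)
    """
    n = math.log((c / (1 - c)) * ((1 - p0) / p0)) / math.log((1 - q0) / (1 - q1))
    return max(0, math.ceil(n))

# Illustrative: corroborate q0 = 1e-4 against q1 = 1e-3,
# with prior belief 0.8 in H0, to 99% posterior confidence.
print(corroboration_tests(q0=1e-4, q1=1e-3, p0=0.8, c=0.99))  # roughly 3,600 tests
```

For comparison, a purely black-box demonstration of q0 = 10^-4 at 99% confidence needs ln(0.01) / ln(1 - 10^-4), roughly 46,000 failure-free tests; in this illustrative model, prior belief earned through lifecycle-long IV&V shrinks that burden by an order of magnitude.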
Controlled Experiments • Two independently developed versions of PACS (a smart-card-based access control system). • A controlled requirements document (NSA specifications).
RPS Experimentation • RPS predictions of system failure rates: • Predicted failure rate: 0.084. • Actual failure rate: 0.09.
Reliability Corroboration • Accurate predictors appear adequate. • Levels of trust in the prediction accuracy remain low. • No experience with repeatability at this point in time.
“Research Side Products” • A significant amount of time was spent studying and developing Dempster-Shafer inference networks. • “No hope” of demonstrating this work within the scope of integrating RPS results, given the availability of suitable datasets. • But some datasets are available, so use them for a D-S demo! • Predicting fault-prone modules in two NASA projects (KC2, JM1): • KC2 contains over 3,000 modules, 520 of research interest. • 106 modules have errors, ranging from 1 to 13. • 414 modules are error-free. • JM1 contains 10,883 modules. • 2,105 modules have errors, ranging from 1 to 26. • 8,778 modules are error-free. • Each dataset contains 21 software metrics, mainly McCabe and Halstead.
How D-S Networks Work • Distinct sources of evidence are combined by the D-S scheme (see the sketch below). • D-S networks are built by prediction logic: • Nodes are connected by implication rules. • Each implication rule is assigned a specific weight. • Belief is updated for the corresponding nodes, then propagated to the neighboring nodes and throughout the entire network. • A D-S network can be tuned for a wide range of verification requirements.
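A minimal sketch of Dempster's rule of combination, the evidence-fusion step named above (the frame of discernment and the mass values are illustrative):

```python
from itertools import product

def dempster_combine(m1: dict[frozenset, float], m2: dict[frozenset, float]) -> dict[frozenset, float]:
    """Combine two mass functions over the same frame of discernment using
    Dempster's rule; conflicting mass (empty intersections) is normalized away."""
    combined: dict[frozenset, float] = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    k = 1.0 - conflict  # normalization constant
    return {s: m / k for s, m in combined.items()}

# Illustrative: two sources of evidence about a module being fault-prone ("F")
# or fault-free ("OK"); the full frame {"F", "OK"} carries the "don't know" mass.
FP, OK, ANY = frozenset({"F"}), frozenset({"OK"}), frozenset({"F", "OK"})
m1 = {FP: 0.6, ANY: 0.4}
m2 = {FP: 0.5, OK: 0.2, ANY: 0.3}
print(dempster_combine(m1, m2))  # belief mass concentrates on fault-prone
```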
D-S Networks vs. Logistic Regression [Figures: comparison results on the KC2 and JM1 datasets.]
D-S Networks vs. See5 [Figures: comparison results on the KC2 and JM1 datasets.]
D-S Networks vs. WEKA [Figure: comparison results on the KC2 dataset.]
D-S Networks vs. WEKA [Figure: comparison results on the JM1 dataset.]
Status and Perspectives • Software reliability corroboration allows: • Inclusion of IV&V quality measures and activities into the reliability assessment. • A significant reduction in the number of (corroboration) tests. • Software reliability of safety/mission critical systems can be assessed with a reasonable effort. • Research directions. • Further experimentation (data sets, measures, repeatability). • Defining RPS based on the “formality” of the IV&V methods.