290 likes | 445 Views
Using Excel to Implement Software Reliability Models. Norman F. Schneidewind Naval Postgraduate School 2822 Racoon Trail, Pebble Beach, California, 93953, USA Voice: (831) 656-2719 Fax: (831) 372-0445 nschneid@nps.navy.mil. Outline. Introduction Characteristics of Excel Implementation
E N D
Using Excel to Implement Software Reliability Models • Norman F. Schneidewind • Naval Postgraduate School • 2822 Racoon Trail, • Pebble Beach, California, 93953, USA • Voice: (831) 656-2719 • Fax: (831) 372-0445 • nschneid@nps.navy.mil
Outline • Introduction • Characteristics of Excel Implementation • Combined Software Reliability Tools – Excel Approach • Structure of Combined Approach • Notation for Prediction Worksheet • Equations for Prediction and Comparison Worksheets • Example Prediction Worksheet • Analysis of Prediction Worksheet • Notation for Actual – Prediction Comparisons Worksheet • Example Actual – Prediction Comparisons Worksheet • Analysis of Comparison Worksheet • Cumulative Failure Prediction Plots • Validation of Failure Count Predictions • Time to Failure Plot • Validation of Time to Failure Predictions • Conclusions • Excel Demo
Introduction • CASRE and SMERFS, hereafter referred to as SRT (software reliability tools), were developed prior to the availability of mature spreadsheet programs. • Programs like Excel were not an option, but things have changed. • In Excel, the user can create equations, do data and statistical analysis, make plots, an do programming, using Visual Basic. • In SRT, the programming of the models has been done for the user, but the functionality is fixed until the next revision.
Characteristics of Excel Implementation #1 • Advantages: • Almost all practitioners have Excel. A minority of practitioners have SRT. • Easier for practitioners to use than SRT. • Typically, failure data is provided by practitioners in Excel. • Improve technology transfer: • Predictions can be made by the researcher in the spreadsheet and returned to the practitioner in the same spreadsheet. • Formatted Excel data can be imported into Word and PowerPoint for creating reports and presentations.
Characteristics of Excel Implementation # 2 • Advantages: • User has more control over formatting of data, prediction results, and plots. • A large set of built-in mathematical and statistical functions are available for reliability analysis. • SRT limited to functions like Chi-square. • User can construct his own reliability equations. • SRT equations are fixed, based on the models implemented. • More flexibility in changing term in equations. • Change cell values; copy and paste equations.
Characteristics of Excel Implementation # 3 • Disadvantages: • Column and cell orientation of spreadsheets is cumbersome. • It is not a natural mathematical format. • Need to repeat parameter entries for iterations of equations. • Variable names are not case sensitive. • Variable names cannot be the same as column or cell names. • Thus, some variables must renamed to avoid naming conflicts.
Characteristics of Excel Implementation # 4 • Disadvantages: • Mathematical library is not as extensive as Fortran and C++ libraries used in SRT. • Does not have sophisticated model evaluation criteria of SRT. • However, error analysis between actuals and predictions (i.e., validation) can be done in Excel.
Combined Software Reliability Tools – Excel Approach • Best approach may be to combine SRT with Excel. • SRT provides model parameter estimation. • Beyond the capabilities of Excel unless programmed in Visual Basic. • Copy and paste parameters from SRT into spreadsheet. • Excel extends capabilities of SRT by allowing user provided equations, statistical analysis, and plots.
Structure of Combined Approach • Worksheets: • Definitions: • Notation • Equations • Predictions • Analysis • Actual – Prediction Comparisons • Analysis • Plots • Validation • Examples of this approach follow.
Equations for Prediction and Comparison WorksheetsTime to Next Failure(s) Predicted at Time t Remaining Failures Predicted at Time t: r(t) = (/) – Xs,t Cumulative Number of Failures Detected at Time T: D(T) = (α/β)[1 – exp (-β ((T –s + 1)))] + Xs-1 Cumulative Number of Failures Detected Over Life of Software TL: D(TL) = / + Xs-1 References: [1, 2, 3].
Analysis of Prediction Worksheet # 1 • s, , and obtained from SMERFS. • One interval = one week of calendar time. • Project 1: • Optimal s = 1 for both failure count and time to failure predictions. • t=26: interval when time to next failure prediction made This is also the last interval of observed failure data. • X26 = 130: observed failure count in the range [1,26]. • F1 = 1: given number of failures to occur after interval 26. • TF(26) = 3.96 intervals: time to next failure predicted at time 26 intervals.
Analysis of Prediction Worksheet #2 • Project 1: • r(26) = 2.14: remaining failures predicted at time 26 intervals. • T = 27 intervals: test time. • D(27) = 130.32: cumulative number of failures detected at time 27 intervals. • D() = 132.14: cumulative number of failures detected over life of software (conservatively, infinity). • r(26) = D() - X26 = 132.14– 130 = 2.14 remaining failures, as in the above.
Analysis of Prediction Worksheet #3 • Project 2: • Total range of 35 weeks divided into Parameter Estimation Range = 1, 23 weeks and Prediction Range = 24, 35 weeks for the purpose of model validation. • Model fit using historical data does not demonstrate validity! • Estimate model parameters in range 1, 23 weeks. • Accuracy of future predictions demonstrates validity. • Predict in range 24, 35 weeks and compare with actuals. • Optimal s = 12 for both failure count and time to failure predictions.
Analysis of Prediction Worksheet #4 • Project 2: • t=23: interval when time to next failure prediction made • X11 = 39: observed failure count in the range [1,11]. • X12,23 = 32: observed failure count in the range [12,23]. • X23 = 71: observed failure count in the range [1,23]. • F1 = 5, …, 20: given number of failures to occur after interval 23. • TF(23) = 2.63, …, 13.14 intervals: time to next failures predicted at time 23 intervals.
Analysis of Prediction Worksheet #5 • Project 2: • r(23) = 44.96: remaining failures predicted at time 23 intervals. • T = 23, …, 35 intervals: test time. • D(23, …, 35) = 71.00, …, 89.69 cumulative number of failures detected at time 23, …, 35 intervals. • D() = 115.96: cumulative number of failures detected over life of software (conservatively, infinity). • r(23) = D() - X23 = 115.96–71 = 44.96 remaining failures, as in the above.
Notation for Actual – Prediction Comparisons Worksheet • Parameter Estimation Range = 1, 23 weeks; Prediction Range = 24, 35 weeks; s = 12 weeks. • D(T) Actual = Actual Cumulative Count, from Interval 1, in Prediction Range • D(T) Pred = Predicted Cumulative Count, from Interval1, in Prediction Range • Interval Actual = Difference in D(T) Actual • Interval Pred = Difference in D(T) Pred • Int Act Cum = Interval Actual Cumulative Count, from Interval 24, in Prediction Range • Int Pred Cum = Interval Predicted Cumulative Count, from Interval 24, in Prediction Range • TF(t) Actual = Actual Time to Next Given Number of Failures in the Int Act Cum column • TF(t) Pred = Predicted Time to Next Given Number of Failures in the Int Act Cum column
Analysis of Comparison Worksheet # 1 • Project 2 • D(T)Actual is compared with D(T) Prediction. • Failure counts are accumulated from Interval1in the parameter estimation range, but are compared in the prediction range. • Interval Actual is compared with Interval Prediction. • Interval failure counts are compared in the prediction range. • Int Act Cum is compared with Int Pred Cum. • Interval failure counts are accumulated from Interval 24 in the prediction range and compared in the prediction range.
Analysis of Comparison Worksheet # 2 • Project 2 • Make plots in predictionrange: • Actualand Predicted Cumulative Failures in Range 1, 35 Weeks. • Actual and Predicted Cumulative Failures in Range 24,35 Weeks. • Validation of Failure Count Predictions. • Residuals: (Predicted – Actual) versus week. • Residuals do not show bias (i.e., trend in either positive or negative direction). • Average Residual = -0.55 failures indicates optimistic prediction on average.
Cumulative Failures in Range 1, 35 Weeks:Parameter Estimation Range plus Prediction Range
Validation of Failure Count Predictions Average Residual = -0.55 failures
Analysis of Comparison Worksheet # 3 • Project 2 • Make plot in predictionrange: • Actual and Predicted Time to Next Failures versus given number of failures. • Validation of Time to Failure Predictions. • Residuals: (Predicted – Actual) versus given number of failures. • Residuals show bias starting at 15 failures (week 32) as it becomes difficult to predict further out into the future. • Average Residual = 0.87 weeks indicates optimistic prediction on average.
Validation of Time to Failure Predictions Average Residual = 0.87 weeks
Conclusions • Spreadsheet technology can effectively support software reliability modeling and prediction. • Advantages relative to SRT are: • Easier transfer of technology to practitioners. • More user control of program’s operation. • Many built-in mathematical and statistical functions. • Disadvantages relative to SRT are: • Cell format is not conducive to mathematical modeling. • No built-in model evaluation criteria. • SRT and Excel can be combined to advantage: • SRT for reliability model parameter estimation. • Excel for reliability prediction.
References • [1] Norman F. Schneidewind, "Reliability Modeling for Safety Critical Software", IEEE Transactions on Reliability, Vol. 46, No.1, March 1997, pp.88-98. • [2] Norman F. Schneidewind, "Software Reliability Model with Optimal Selection of Failure Data", IEEE Transactions on Software Engineering, Vol. 19, No. 11, November 1993, pp. 1095-1104. • [3] Norman F. Schneidewind and T. W. Keller, "Application of Reliability Models to the Space Shuttle", IEEE Software, Vol. 9, No. 4, July 1992 pp. 28-33.