Comparative Assessment of Software Quality Classification Techniques: An Empirical Case Study

TAGHI M. KHOSHGOFTAAR, NAEEM SELIYA
Empirical Software Engineering Laboratory, Dept. of Computer Science and Engineering, Florida Atlantic University
Empirical Software Engineering, 9, 229-257, 2004. © 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.

Software metrics-based quality classification (SMQC) models
• Predict a software module as fault-prone (fp) or not fault-prone (nfp)
• Using SMQC models early in development improves cost-effectiveness
• Enable planned, better-targeted use of testing and improvement measures
• Common SMQC models available:
  • logistic regression
  • case-based reasoning
  • classification and regression trees (CART)
  • tree-based classification with S-PLUS
  • Sprint-Sliq
  • C4.5
  • Treedisc

The R & D
• Comparative evaluation of 7 classification techniques and/or tools
• Introduction of expected cost of misclassification (ECM)
  • A unified measure for comparing the performance of different software quality classification models
• ECM is computed for different cost ratios using:
  • Type I: cost of an nfp module misclassified as fp
  • Type II: cost of an fp module misclassified as nfp
• ECM can help choose the appropriate SMQC model for modules that have varying likelihoods of being misclassified as fp or nfp
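The slides do not give the ECM formula, so the following is a minimal sketch under a stated assumption: ECM is taken in its usual form, the Type I cost times the Type I error rate times the proportion of nfp modules, plus the Type II cost times the Type II error rate times the proportion of fp modules. The class name, function name, and all counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of modules by actual class and predicted class (illustrative)."""
    nfp_as_nfp: int  # nfp modules correctly classified
    nfp_as_fp: int   # Type I misclassifications
    fp_as_nfp: int   # Type II misclassifications
    fp_as_fp: int    # fp modules correctly classified


def expected_cost(counts: ConfusionCounts, cost_type1: float, cost_type2: float) -> float:
    """ECM under the assumed form: C_I * Pr(fp|nfp) * p_nfp + C_II * Pr(nfp|fp) * p_fp."""
    n_nfp = counts.nfp_as_nfp + counts.nfp_as_fp
    n_fp = counts.fp_as_nfp + counts.fp_as_fp
    if n_nfp == 0 or n_fp == 0:
        raise ValueError("both classes must be present to estimate error rates")
    total = n_nfp + n_fp
    type1_rate = counts.nfp_as_fp / n_nfp  # share of nfp modules flagged as fp
    type2_rate = counts.fp_as_nfp / n_fp   # share of fp modules missed
    prior_nfp = n_nfp / total
    prior_fp = n_fp / total
    return cost_type1 * type1_rate * prior_nfp + cost_type2 * type2_rate * prior_fp


# Made-up counts for one model, evaluated at several Type II / Type I cost ratios.
example = ConfusionCounts(nfp_as_nfp=70, nfp_as_fp=10, fp_as_nfp=5, fp_as_fp=15)
for ratio in (1, 10, 50):
    print(ratio, round(expected_cost(example, cost_type1=1.0, cost_type2=float(ratio)), 4))
```

Fixing the Type I cost at 1 and varying only the ratio keeps the comparison between models in relative cost units, which is how a single measure can rank models that trade Type I errors against Type II errors differently.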

The Paper
• Introduction
• Description of the classification modeling methods
• Description of the case study used in the paper
• Modeling objective, methodology, and techniques employed in comparing the different classification models
• Results
• Conclusions of the comparative study

Models
• CART (classification and regression trees) - a decision tree system used, e.g., for data mining; automatically finds significant patterns in large, complex databases
• S-PLUS - a solution for advanced data analysis, data mining, and statistical modeling; provides regression tree-based models
• C4.5 algorithm - employs decision trees to represent a quality model
• Treedisc algorithm - constructs a regression tree from an input data set that predicts a specified categorical response variable based on one or more predictors
• Sprint-Sliq - builds classification tree models
• Logistic regression - a statistical modeling technique that can be adapted for classification
• Case-based reasoning (CBR) - finds solutions to new problems based on "cases" in a "case library": a module currently under development is probably fp if a module with similar attributes in an earlier release (or a similar project) was fp
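The CBR idea in the last bullet can be sketched as a nearest-neighbour lookup in a case library. This is only an illustration: the Euclidean distance, the majority vote over k cases, the two-metric tuples, and the function name are all assumptions, since the slides do not say how the paper's CBR model measures similarity or combines cases.

```python
import math
from collections import Counter


def classify_by_cases(case_library, new_module, k=3):
    """Label a module by majority vote among its k most similar past cases.

    case_library: list of (metrics_tuple, label) pairs from an earlier release.
    new_module:   metrics tuple of the module under development.
    """
    if not case_library:
        raise ValueError("case library is empty")
    # Rank past cases by Euclidean distance to the new module (assumed similarity measure).
    nearest = sorted(case_library, key=lambda case: math.dist(case[0], new_module))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical case library: (lines of code, number of branches) -> observed class.
library = [
    ((100, 5), "nfp"),
    ((120, 6), "nfp"),
    ((150, 8), "nfp"),
    ((900, 40), "fp"),
    ((1100, 55), "fp"),
]
print(classify_by_cases(library, (1000, 50)))
print(classify_by_cases(library, (110, 5)))
```

In practice the metrics would normally be standardised first so that a large-valued metric such as lines of code does not dominate the distance.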

The Study
• 4 successive releases of a large legacy telecommunications system
• Release 1 is the baseline for the 7 classification models
• Faults are measured after release and counted when they require a code change
• These faults are costly, as they require visits and repair
• A module is nfp if it has no post-release faults, otherwise fp
• Faults relative to lines of code (LOC) are also calculated
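The labelling rule above is simple enough to state directly in code. The sketch below uses hypothetical function names, and expressing faults relative to LOC as faults per thousand lines is an assumption; the slides only say the relation was calculated.

```python
def label_module(post_release_faults: int) -> str:
    """Return 'nfp' for a module with no post-release faults, else 'fp'."""
    if post_release_faults < 0:
        raise ValueError("fault count cannot be negative")
    return "nfp" if post_release_faults == 0 else "fp"


def faults_per_kloc(post_release_faults: int, loc: int) -> float:
    """Post-release faults per 1000 lines of code (assumed normalisation)."""
    if loc <= 0:
        raise ValueError("LOC must be positive")
    return 1000 * post_release_faults / loc


# Illustrative modules: (post-release faults, lines of code).
for faults, loc in [(0, 1200), (2, 500)]:
    print(label_module(faults), faults_per_kloc(faults, loc))
```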

Metrics
• Number of post-release faults
• LOC
• Other metrics

Conclusions
• Quality estimation models do work
• Models need to be evaluated
• "fp" prediction is vital
• ECM was tested
• Results cannot be generalised (single system)
• Effect of release number is unclear
• Reasons for the differences in performance are unclear
• A new study is needed with a different type of software
