Software Metrics and Defect Prediction • Ayşe Başar Bener
Problem 1 • How to tell if the project is on schedule and within budget? • Earned-value charts.
Problem 2 • How hard will it be for another organization to maintain this software? • McCabe Complexity
Problem 3 • How to tell when the subsystems are ready to be integrated? • Defect Density Metrics.
Problem Definition • Software development lifecycle: • Requirements • Design • Development • Test (takes ~50% of overall time) • Detect and correct defects before delivering software. • Test strategies: • Expert judgment • Manual code reviews • Oracles/predictors as secondary tools
Defect Prediction • 2-Class Classification Problem. • Non-defective • If error = 0 • Defective • If error > 0 • 2 things needed: • Raw data: Source code • Software Metrics -> Static Code Attributes
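The labelling rule above can be sketched in C. This is a minimal illustration, not the authors' tooling: the `ModuleRecord` type, its fields, and both function names are hypothetical, and the error count is assumed to be a non-negative integer per module.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one module: a few static code attributes
   plus the number of errors logged against it. */
typedef struct {
    int loc;          /* lines of code */
    int cyclomatic;   /* cyclomatic complexity */
    int error_count;  /* reported errors; assumed non-negative */
} ModuleRecord;

/* Class label from the slide: defective if error > 0, else non-defective. */
bool is_defective(const ModuleRecord *m) {
    return m->error_count > 0;
}

/* Count how many modules in a data set fall in the defective class. */
size_t count_defective(const ModuleRecord *modules, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (is_defective(&modules[i])) {
            ++count;
        }
    }
    return count;
}
```

Collapsing the error count to a binary label discards defect density; the regression models mentioned later in the deck address that quantity instead.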
Static Code Attributes • void main() • { • // This is a sample code • // Declare variables • int a, b, c; • // Initialize variables • a=2; • b=5; • // Find the sum and display c if greater than zero • c=sum(a,b); • if (c > 0) • printf("%d\n", c); • return; • } • int sum(int a, int b) • { • // Returns the sum of two numbers • return a+b; • } • LOC: Lines of Code • LOCC: Lines of Commented Code • V: Number of unique operands and operators • CC: Cyclomatic Complexity
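As a worked example of the CC attribute, McCabe's cyclomatic complexity can be computed for the two routines in the sample. The symbols E, N, P and d are introduced here for illustration; the second line uses the standard shortcut for a single structured routine (number of decision points plus one).

```latex
\begin{align*}
V(G) &= E - N + 2P
  && \text{($E$ edges, $N$ nodes, $P$ connected components of the control-flow graph)} \\
V(G) &= d + 1
  && \text{($d$ decision points in one structured routine)} \\
\texttt{main}&:\ d = 1 \ (\text{the single } \texttt{if}) \ \Rightarrow\ V(G) = 2 \\
\texttt{sum}&:\ d = 0 \ \Rightarrow\ V(G) = 1
\end{align*}
```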
Research on Defect Prediction • Defect prediction using machine learning techniques • How effectively can we estimate defect density? • Regression models • First classification, then regression • Defect prediction in multi-version software • Defect prediction in embedded software • B. Turhan and A. Bener, "A Multivariate Analysis of Static Code Attributes for Defect Prediction", QSIC 2007, Portland, USA, October 11-12, 2007. • A.D. Oral and A. Bener, "Defect Prediction for Embedded Software", ISCIS 2007, Ankara, Turkey, November 9-11, 2007. • E. Ceylan, O. Kutlubay and A. Bener, "Software Defect Identification Using Machine Learning Techniques", EUROMICRO SEAA, Dubrovnik, Croatia, August 28 - September 1, 2006. • B. Turhan and O. Kutlubay, "Mining Software Data", Data Mining and Business Intelligence Workshop in ICDE'07, İstanbul, April 2007. • O. Kutlubay, B. Turhan and A. Bener, "A Two-Step Model for Defect Density Estimation", EUROMICRO SEAA, Lübeck, Germany, August 2007. • Y. Kastro and A. Bener, "A Defect Prediction Method for Software Versioning", Software Quality Journal (in print). • O. Kutlubay, B. Turhan and A. Bener, "Software Defect Density Estimation Using Static Code Attributes: A Two Step Model", Eng. App. of AI (under review).
Constructing Predictors • Baseline: Naive Bayes. • Why? Best reported results so far (Menzies et al., 2007) • Remove assumptions and construct different models. • Independent attributes -> Multivariate distribution • Attributes of equal importance -> Weighted Naive Bayes • B. Turhan and A. Bener, "Software Defect Prediction: Heuristics for Weighted Naïve Bayes", ICSOFT 2007, Barcelona, Spain, July 2007. • B. Turhan, "Software Defect Prediction Modeling", IDOESE 2007, Madrid, Spain, September 2007. • "Yazılım Hata Kestirimi için Kaynak Kod Ölçütlerine Dayalı Bayes Sınıflandırması" [Bayesian Classification Based on Source Code Metrics for Software Defect Prediction], UYMS 2007, Ankara, September 2007. • B. Turhan and A. Bener, "A Multivariate Analysis of Static Code Attributes for Defect Prediction", QSIC 2007, Portland, USA, October 2007.
Weighted Naive Bayes • Naive Bayes (equation) • Weighted Naive Bayes (equation)
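Performance Measures • Accuracy: (A+D)/(A+B+C+D) • Pd (Hit Rate): D / (B+D) • Pf (False Alarm Rate): C / (A+C) • From the formulas: A = non-defective modules predicted non-defective, B = defective modules predicted non-defective, C = non-defective modules predicted defective, D = defective modules predicted defective.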
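The three measures can be sketched in C directly from the formulas. The `Confusion` type and function names are hypothetical. The reading of the four cells is inferred from the formulas themselves: Pd divides D by all defective modules and Pf divides C by all non-defective modules.

```c
#include <assert.h>

/* Hypothetical container for the four confusion-matrix cells. */
typedef struct {
    double a; /* non-defective, predicted non-defective */
    double b; /* defective, predicted non-defective */
    double c; /* non-defective, predicted defective */
    double d; /* defective, predicted defective */
} Confusion;

/* Accuracy = (A + D) / (A + B + C + D) */
double accuracy(Confusion m) {
    double total = m.a + m.b + m.c + m.d;
    assert(total > 0.0);
    return (m.a + m.d) / total;
}

/* Pd (hit rate) = D / (B + D): share of defective modules detected. */
double pd(Confusion m) {
    assert(m.b + m.d > 0.0);
    return m.d / (m.b + m.d);
}

/* Pf (false alarm rate) = C / (A + C): share of non-defective
   modules wrongly flagged as defective. */
double pf(Confusion m) {
    assert(m.a + m.c > 0.0);
    return m.c / (m.a + m.c);
}
```

Accuracy alone can mislead on skewed data: with 90 non-defective and 10 defective modules, a predictor that labels everything non-defective (A = 90, B = 10, C = 0, D = 0) reaches an accuracy of 0.9 while its Pd is 0.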
Within-Company (WC) vs Cross-Company (CC) Data? • When to use WC or CC data? • How much data do we need to construct a model? • ICSOFT'07
Thank You http://softlab.boun.edu.tr