Revisiting Software Development Effort Estimation Based on Early Phase Development Activities. MSR 2013 Masateru Tsunoda, Koji Toda, Kyohei Fushida, Yasutaka Kamei, Meiyappan Nagappan, Naoyasu Ubayashi
Revisiting Software Development Effort Estimation Based on Early Phase Development Activities
MSR 2013
Masateru Tsunoda, Koji Toda, Kyohei Fushida, Yasutaka Kamei, Meiyappan Nagappan, Naoyasu Ubayashi
(Kinki University, Japan; Fukuoka Institute of Technology, Japan; NTT DATA Corporation, Japan; Kyushu University, Japan; Queen's University, Canada)
Effort is estimated using software size
• Total effort is estimated from software size in order to decide staffing and schedule.
• High accuracy is needed to avoid project failure.
• Linear regression is one of the common methods for building the estimation model [1].
• Size is settled by a method such as function point analysis.
Example model: effort (estimated) = 1.5 + 0.069 × size
Example staffing: an estimated effort of 4 person-months = 2 developers × 2 months
[1] L. Briand, T. Langley, and I. Wieczorek, “A replicated assessment and comparison of common software cost modeling techniques,” In Proc. of International Conference on Software Engineering (ICSE), pp. 377–386, Limerick, Ireland, June 2000.
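The two examples on this slide can be written out as a minimal sketch. The function names are hypothetical, the coefficients 1.5 and 0.069 are only the illustrative values shown on the slide, and the slide does not state the units of size or effort.

```python
def estimate_effort_from_size(size, intercept=1.5, slope=0.069):
    """Linear size-based estimate: effort = intercept + slope * size.

    The default coefficients are the illustrative values from the slide,
    not fitted values.
    """
    return intercept + slope * size


def schedule_months(effort_person_months, developers):
    """Calendar months needed if the effort is shared evenly by the developers."""
    if developers <= 0:
        raise ValueError("developers must be positive")
    return effort_person_months / developers


# Slide example: 4 person-months shared by 2 developers takes 2 months.
months = schedule_months(4, 2)

# Hypothetical project size of 100 (unit not stated on the slide).
effort = estimate_effort_from_size(100)
```

In practice the intercept and slope would be fitted by regression on an organization's own past projects.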
Estimation using early phase activity
• Practitioners use the ratio of early phase activities to the whole phase.
• The ratio of effort before the design phase (the estimation timing) is 33% on average, so: effort (estimated) = 3 × early phase effort
• Which shows higher estimation accuracy at the estimation timing?
  • effort (estimated) = 3 × early phase effort
  • effort (estimated) = 1.5 + 0.069 × size
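The ratio-based estimate amounts to dividing the early phase effort by its average share of total effort. A minimal sketch follows, with a hypothetical function name and the slide's 33% average as the default ratio.

```python
def estimate_total_from_early(early_effort, early_ratio=0.33):
    """Scale early phase effort up to total effort using its average share.

    early_ratio is the fraction of total effort spent before the design
    phase; 0.33 is the average quoted on the slide.
    """
    if not 0 < early_ratio <= 1:
        raise ValueError("early_ratio must be in (0, 1]")
    return early_effort / early_ratio


# Hypothetical project with 10 units of early phase effort.
total = estimate_total_from_early(10)
```

With a ratio of 33% the multiplier is 1 / 0.33, slightly above 3, which the slide rounds to "3 × early phase effort".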
What should be clarified?
• RQ1: When a model is built using either software size or early phase effort, which shows higher accuracy?
• Collecting data requires effort, so some organizations do not have detailed data.
• RQ2: When other variables are added to the models of RQ1, which model shows higher accuracy?
[Figure: two models, early phase effort → total effort and software size → total effort; programming language is added to each for RQ2]
What should be clarified? (contd.)
• RQ3: When both software size and early phase effort are used, is estimation accuracy improved? Does using them together cause multicollinearity?
[Figure: one model, software size and early phase effort → total effort]
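One common way to check for multicollinearity between two explanatory variables is the variance inflation factor (VIF). The slides do not say which diagnostic the authors used, so the following is only an illustrative sketch. The function names and sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)


# Hypothetical values standing in for software size and early phase effort.
size = [1, 2, 3, 4, 5]
early_effort = [2, 1, 4, 3, 5]
vif = vif_two_predictors(size, early_effort)
```

A VIF near 1 indicates little shared variance. Values above roughly 5 to 10 are often treated as a warning sign, although that threshold is a rule of thumb rather than something stated in the slides.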
Built models using ISBSG dataset
• 118 projects (data points) in the ISBSG dataset
  • Collected from organizations in 20 countries by ISBSG.
• Definition of early phase effort
  • Planning effort
  • Planning-and-analysis effort
• Built models
  • FP (software size): baseline
  • planning effort
  • planning-and-analysis effort
  • planning effort, FP
  • planning-and-analysis effort, FP
How to compare estimation accuracy
• Accuracy was evaluated by the difference in average balanced relative error (BRE) from the baseline (FP model).
Example of BRE
• Actual effort: 100 hours; estimated effort: 50 hours
• BRE = |100 − 50| / 50 = 100%
Example of the difference of average BRE
• effort = a × FP (baseline): average BRE = 100%
• effort = b × planning effort: average BRE = 50%
• Difference of average BRE = 100% − 50% = 50% (improved)
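The BRE example and the comparison against the baseline can be sketched as follows. The sketch assumes the usual definition of BRE, which divides the absolute error by the smaller of the actual and estimated values. The slide's example is consistent with that definition but does not spell it out. Function names are hypothetical.

```python
def balanced_relative_error(actual, estimated):
    """BRE = |actual - estimated| / min(actual, estimated)."""
    if actual <= 0 or estimated <= 0:
        raise ValueError("effort values must be positive")
    return abs(actual - estimated) / min(actual, estimated)


def average_bre(actuals, estimates):
    """Mean BRE over paired actual and estimated efforts."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [balanced_relative_error(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def improvement_over_baseline(baseline_avg_bre, model_avg_bre):
    """Positive when the model has a lower average BRE than the baseline."""
    return baseline_avg_bre - model_avg_bre


# Slide example: actual 100 hours, estimated 50 hours.
bre = balanced_relative_error(100, 50)

# Slide example: baseline at 100% average BRE, planning effort model at 50%.
gain = improvement_over_baseline(1.0, 0.5)
```

Dividing by the smaller value penalizes underestimates and overestimates symmetrically: estimating 50 for an actual of 100 gives the same BRE as estimating 100 for an actual of 50.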
Early phase effort improves accuracy
• A model using early phase effort is more accurate than a model using software size.
• The models were built without other variables.
• Multicollinearity did not arise when using both early phase effort and FP.
• Size (FP) and early phase effort > Size (Ans. to RQ3: when other variables are not used)
[Figure: chart with axis labeled "Difference"]
Other variables did not work well
• Adding other variables improved the accuracy of the FP model.
• But it did not improve the accuracy of the other models.
• Other variables: development type, platform type, language type
• The baseline was set to the FP model with these variables and compared with the other models without them.
Example
• effort = a × FP + c × platform type + … (baseline): avg. BRE = 100%
• effort = b × planning effort: avg. BRE = 50%
• Difference of average BRE = 100% − 50% = 50% (improved)
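Variables such as platform type are categorical, so they need a numeric encoding before they can enter a regression model as a term like "c × platform type". The slides do not say how this was done. Dummy (one-hot) coding with a dropped reference level is one common option, sketched below with a hypothetical function name and made-up category labels.

```python
def dummy_encode(values):
    """Encode a categorical column as 0/1 dummy columns.

    The alphabetically first category is dropped and acts as the
    reference level, so no column is redundant with the intercept.
    Returns (column_names, rows).
    """
    categories = sorted(set(values))
    columns = categories[1:]
    rows = [[1 if value == column else 0 for column in columns] for value in values]
    return columns, rows


# Made-up platform labels for three projects.
columns, rows = dummy_encode(["PC", "MF", "PC"])
```

Each dummy column then receives its own coefficient in the regression, which is why adding several categorical variables requires more detailed project data.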
Early phase effort is still effective
• A model using planning-and-analysis effort is more accurate than the software size model, but a model using planning effort is not. (Ans. to RQ2)
• Using both early phase effort and software size improves accuracy, and multicollinearity does not arise. (Ans. to RQ3)
• Size and early phase effort > Size
[Figure: chart with axis labeled "Difference"]
How to build an estimation model?
• It is preferable to build a model that uses only early phase effort as an explanatory variable.
  • If an organization does not collect data in detail
  • It might not be preferable to use the variables that we used as additional variables.
• In organizations that collect other data in detail
  • Using both early phase effort and software size improves accuracy without multicollinearity.
  • If software size is settled precisely by a method such as function point analysis