Using Machine Learning to Predict Project Effort: Empirical Case Studies in Data-starved Domains. Gary D. Boetticher, Department of Software Engineering, University of Houston - Clear Lake. What Customers Want. What Requirements Tell Us.
Standish Group [Standish94] • Projects exceeded planned budget by 90% • Exceeded planned schedule by 222% • More than 50% of the projects had less than 50% of their requirements
Underlying Problems • 85% are at CMM level 1 or 2 [CMU CMM95, Curtis93] • Scarcity of data
Consequences Early life-cycle estimates use a factor of 4 [Boehm81, Heemstra92]
Why are Machine Learning algorithms not used more often for estimating early in the life cycle?
Goal • Apply machine learning (a neural network) early in the software life cycle, against empirical data
Data • B2B Electronic Commerce Data (Delphi-based, 104 vectors) • Fleet Management Software (Delphi-based, 433 vectors)
Extrapolation issue • Largest SLOC values divided by each other: 4398 / 2796 ≈ 1.57
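The ratio on this slide can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, plus a hypothetical helper that flags inputs larger than anything a model saw during training. The function names are illustrative assumptions, and the slide does not say which maximum belongs to which data set.

```python
from typing import List, Sequence


def extrapolation_ratio(largest_a: float, largest_b: float) -> float:
    """Divide the larger of two maximum SLOC values by the smaller one."""
    hi, lo = max(largest_a, largest_b), min(largest_a, largest_b)
    if lo <= 0:
        raise ValueError("SLOC values must be positive")
    return hi / lo


def beyond_training_range(train_sloc: Sequence[float],
                          test_sloc: Sequence[float]) -> List[float]:
    """Hypothetical helper: return test SLOC values above the training maximum.

    Predictions for these values require extrapolation rather than
    interpolation.
    """
    ceiling = max(train_sloc)
    return [s for s in test_sloc if s > ceiling]


# The two maxima quoted on the slide.
ratio = extrapolation_ratio(4398, 2796)
print(round(ratio, 2))
```

A ratio well above 1 means a model trained on the smaller-range data would be asked to predict effort for modules larger than any it has seen.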
Conclusions • Bottom-up approach produced very good results on a per-project basis • Results were comparable between the neural network and statistical approaches • Scaling helped • Estimation approach is suitable for prototype/iterative development
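As a rough illustration of two ideas named in the conclusions, the sketch below sums component-level estimates into a project-level figure (bottom-up) and rescales a feature to the range 0 to 1. The slides do not say which scaling method was used, so min-max scaling is an assumption here, and all numbers are made up for illustration rather than taken from the study.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to [0, 1] (assumed scaling method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def project_effort(component_estimates: Sequence[float]) -> float:
    """Bottom-up estimate: the sum of per-component effort predictions."""
    return sum(component_estimates)


def relative_error(predicted: float, actual: float) -> float:
    """Absolute error as a fraction of the actual effort."""
    return abs(predicted - actual) / actual


# Illustrative, made-up values (not data from the case studies).
component_estimates = [12.0, 7.5, 20.5]
total = project_effort(component_estimates)
print(total, relative_error(total, 50.0))
print(min_max_scale([10, 20, 30]))
```

Summing at the project level lets over- and under-estimates of individual components partly cancel, which is one plausible reason a bottom-up estimate can look better per project than per component.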
Future Directions • Explore an extrapolation function • Apply other ML algorithms • Collect additional metrics • Integrate with COCOMO II • Conduct more experiments (additional data)