Understanding the Human Estimator
Gary D. Boetticher, Boetticher@uhcl.edu, Univ. of Houston - Clear Lake, Houston, TX, USA
Nazim Lokhandwala, Lokhandwala@uhcl.edu, Univ. of Houston - Clear Lake, Houston, TX, USA
James C. Helm, Helm@uhcl.edu, Univ. of Houston - Clear Lake, Houston, TX, USA
http://nas.cl.uh.edu/boetticher/publications.html
The 2nd International Predictor Models in Software Engineering (PROMISE) Workshop
Introduction
• Chaos Chronicles [Standish03]
• 300 billion dollars
• 250,000 new projects
• 1.2 million dollars per project
Boehm’s 4X
Types of Estimation [Jorgenson04]
• 7 - 16% Algorithmic and Machine Learners
• 63 - 86% Human-Based
Research Focus
• Number of Papers On Software Estimation in IEEE [Jorgenson02]
• Human-Based Estimation (17%)
• Other (83%)
Statement of Problem
• How do human demographics affect human-based estimation?
• Can predictive models be constructed using human demographics?
Investigation Procedure
• Collect demographics from participants
• Request participants to estimate software components
• Build models (Estimates vs. Actuals)
Which Demographics?
• Basic Demographics
• Academic Background
• Work Experience
• Domain Experience
The Survey
http://nas.cl.uh.edu/boetticher/EffortEstimationSurvey.html
Competitive Procurement Software
[Figure: system diagram with Supplier Software (Supplier1, Supplier2, ..., Suppliern), Buyer Software (Buyer Admin, Buyer1, ..., Buyern), and a Distribution Server]
Sample Estimation Screenshots
Survey Results Screenshots
Data Collection
• Invitations
• Filtered Incomplete Records
• 122 Final Records
Participant Educational Background
Most of the participants hold Bachelor's or Master's degrees.
Participant Work Experience
Participant Domain Experience
Domain Experience         Mean (Years)   Maximum (Years)   Standard Deviation
Procurement and Billing   0.6209         10                1.3818
Process Industry          0.7274         20                2.2512
Data Preparation
INPUT =
• 69% zeros… Needs Consolidation
  Courses, Workshops, Conferences, Programming Exp.
  45 attributes reduced to 14 attributes
• Highest Degree Achieved… Needs Transformation
OUTPUT =
  MRE = Abs(Total Actual – Total Est.) / (Total Actual)
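The MRE output defined on this slide can be computed directly from a participant's totals. The following is a minimal sketch, assuming both totals are plain numbers in the same unit; the function name `mre` and the sample values are hypothetical and are not taken from the study data.

```python
def mre(total_actual: float, total_estimate: float) -> float:
    """Magnitude of relative error: Abs(Total Actual - Total Est.) / Total Actual."""
    if total_actual == 0:
        raise ValueError("total_actual must be non-zero")
    return abs(total_actual - total_estimate) / total_actual


# Hypothetical participant: actual total effort 200, estimated total 150.
print(mre(200, 150))  # 0.25
```

Because of the absolute value, an overestimate and an underestimate of the same size yield the same MRE.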
Build Models
• Linear Regression (Excel)
• Non-Linear Regression (DataFit)
• Genetic Programming (GDB_GP)
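To make the two reported statistics concrete, here is a small sketch of a least-squares fit that returns R squared and the standard error of the estimate. It is not the authors' Excel workflow: it assumes a single predictor for brevity (the study used multiple demographic attributes), and the data, function names, and variable names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared_and_std_error(xs, ys, intercept, slope):
    """Return (R squared, standard error of the estimate) for a fitted line."""
    n = len(xs)
    mean_y = sum(ys) / n
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    std_error = math.sqrt(ss_res / (n - 2))  # n - 2: two fitted parameters
    return r2, std_error


# Hypothetical data: years of experience vs. MRE (not from the study).
years = [0.0, 1.0, 2.0, 3.0, 4.0]
mres = [5.0, 4.1, 2.9, 2.2, 0.8]
b0, b1 = fit_line(years, mres)
r2, se = r_squared_and_std_error(years, mres, b0, b1)
print(f"intercept={b0:.3f} slope={b1:.3f} R^2={r2:.4f} std_error={se:.4f}")
```

A higher R squared together with a lower standard error indicates a closer fit, which is how the result slides compare the three model types.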
GP Configuration
3 Settings
• 1000 Chromosomes, 50 Generations
• 512 Chromosomes, 128 Generations
• 1000 Chromosomes, 128 Generations
20 Trials each
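The three settings and the trial count can be written down as data, which makes the size of the experiment easy to see. This is an illustrative sketch only: the class and field names are hypothetical, it does not reflect how GDB_GP is actually configured, and treating chromosomes times generations as the number of fitness evaluations is a rough assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPSetting:
    chromosomes: int  # population size (assumed meaning of "Chromosomes")
    generations: int

    def evaluations_per_trial(self) -> int:
        # Rough proxy: every chromosome is scored once per generation.
        return self.chromosomes * self.generations


TRIALS_PER_SETTING = 20

SETTINGS = [
    GPSetting(chromosomes=1000, generations=50),
    GPSetting(chromosomes=512, generations=128),
    GPSetting(chromosomes=1000, generations=128),
]

total_trials = TRIALS_PER_SETTING * len(SETTINGS)
for setting in SETTINGS:
    print(setting, setting.evaluations_per_trial())
print("total trials:", total_trials)
```

Under these assumptions the experiment amounts to 60 GP runs per set of demographic factors.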
Results: All Demographic Factors

Best Values of R Squared with Min. Std. Error
             Linear Regression   Genetic Programming   Non-Linear Regression
R Squared    0.1550              0.9174                0.8847
Std. Error   4.4580              1.3875                1.6470

T-Test between Average R Square Values
             Linear Regression   Genetic Programming   Non-Linear Regression
Mean         0.1550              0.5592                0.8847
T-test: 3.45E-17, 1.87E-15
Results: Educational Factors

Best Values of R Squared with Min. Std. Error
             Linear Regression   Genetic Programming   Non-Linear Regression
R Squared    0.0373              0.2784                0.2136
Std. Error   4.6101              3.9738                4.1667

T-Test between Average R Square Values
             Linear Regression   Genetic Programming   Non-Linear Regression
Mean         0.0373              0.1973                0.2136
T-test: 2.74E-13, 0.0486
Results: Work Experience

Best Values of R Squared with Min. Std. Error
             Linear Regression   Genetic Programming   Non-Linear Regression
R Squared    0.0596              0.7572                0.3698
Std. Error   4.5169              2.2855                4.0644

T-Test between Average R Square Values
             Linear Regression   Genetic Programming   Non-Linear Regression
Mean         0.0596              0.5564                0.3698
T-test: 2.73E-19, 1.54E-11
Results: Domain Experience

Best Values of R Squared with Min. Std. Error
             Linear Regression   Genetic Programming   Non-Linear Regression
R Squared    0.0243              0.5911                0.3260
Std. Error   4.5425              2.9283                3.9091

T-Test between Average R Square Values
             Linear Regression   Genetic Programming   Non-Linear Regression
Mean         0.0243              0.5405                0.3260
T-test: 3.27E-23, 4.55E-16
Summary of All Experiments R Square Values
Too Much of a Good Thing? Best Equation: All Factors. r2 = 0.9174 ((Log (TechGradCourses + (TechGradCourses ^ ((Log TotWShops)/(Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (((ProcIndExp + (Log (Sin MgmtGradCourses)))/(Sin SWPMExp)) + (Sin ((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Sin SWPMExp)))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((Log SWProjEstExp) / (((Log (ProcIndExp + (Log (TechGradCourses ^ ((Log SWProjEstExp) / (Log SWProjEstExp)))))) - 3) / (ProcIndExp + (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (TechGradCourses ^ (Log SWProjEstExp))))) / (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp)))))))))))))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) + ((Log SWProjEstExp) / (Log SWProjEstExp)))))) / (Log (Log (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp))))))))))))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Log ((((Log TotLangExp) / (Log SWProjEstExp)) / 
(Log SWProjEstExp)) / (Sin SWPMExp))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))))))) + (((((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + ((TechGradCourses ^ (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))) / (Sin SWPMExp))))))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (Sin SWPMExp)))
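An evolved equation like the one above is a tree of nested operators over the demographic attributes. The sketch below shows one way such a tree could be evaluated, using the recurring sub-expression (TechGradCourses ^ (Log SWProjEstExp)) as the example. It is not GDB_GP's evaluator: the nested-tuple encoding, the `evaluate` function, the use of the natural logarithm for Log, and the participant values are all assumptions made for illustration.

```python
import math


def evaluate(node, variables):
    """Recursively evaluate a nested-tuple expression tree."""
    if isinstance(node, (int, float)):
        return float(node)
    if isinstance(node, str):
        return float(variables[node])
    op, *args = node
    values = [evaluate(arg, variables) for arg in args]
    if op == "+":
        return values[0] + values[1]
    if op == "-":
        return values[0] - values[1]
    if op == "/":
        return values[0] / values[1]
    if op == "^":
        return values[0] ** values[1]
    if op == "Log":
        return math.log(values[0])  # natural log is an assumption
    if op == "Sin":
        return math.sin(values[0])
    if op == "Cos":
        return math.cos(values[0])
    raise ValueError(f"unknown operator: {op}")


# (TechGradCourses ^ (Log SWProjEstExp)), a sub-expression that recurs above.
fragment = ("^", "TechGradCourses", ("Log", "SWProjEstExp"))

# Hypothetical participant values, not taken from the survey data.
participant = {"TechGradCourses": 2.0, "SWProjEstExp": math.e}
print(evaluate(fragment, participant))
```

Inputs outside an operator's domain, such as Log of zero or division by zero, raise an exception in this sketch; how the original tool handled such cases is not stated in the slides.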
Conclusions
• Viability of a human-based est. model
• Model assessment
• Non-linear GP
• Impact on Human Based Estimation
1) All Factors
2) Domain Experience, Work Experience
3) Education
Future Directions
• Equation Optimizer for GP
• Collect More Data
• Further analysis without consolidation
• Detailed Effect of Educational Factors
• Use other statistical indicators
• Build other models
• Hybrid (Non-linear and GP)
• Classifiers
• Impact of process on estimation
Questions?
Thank You!