
Reduced-Parameter Modeling (RPM) for Cost Estimation Models



Presentation Transcript


  1. Reduced-Parameter Modeling (RPM) for Cost Estimation Models Zhihao Chen zhihaoch@cse.usc.edu

  2. Reduced-Parameter Modeling (RPM) • What Is RPM? • Why Is It Useful? • How Does It Work? • When Should You Not Use It?

  3. What is RPM? • A machine learning technique for determining a minimum-essential set of cost model parameters • Using an organization’s particular project data points • Assuming that the organization’s project data points will be representative of its future projects

  4. Why Is It Useful? • Simplifies cost model usage and data collection • Often improves estimation accuracy • Eliminates highly-correlated, weak-dispersion, or noisy-data parameters • Identifies organization’s most important cost drivers for productivity improvement

  5. Organizations Have Different Data Distributions • [Figure: correlation analysis of NASA Project02, 22 projects] • [Figure: correlation analysis of COCOMO81, 63 projects]
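
The slide does not say how its correlation analysis was computed. As a minimal sketch, assuming Pearson correlation over numeric cost-driver ratings, the following flags driver pairs that move together closely enough that one of them may be redundant for a given organization. The function names, the column names, and the 0.9 threshold are illustrative assumptions, not part of the presentation.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # constant column: correlation is undefined, treated as 0 here
    return cov / (sd_x * sd_y)


def correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `columns` maps a cost-driver name to its ratings, one per project.
    The threshold is an assumed cut-off, not a value from the slides.
    """
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return pairs
```

Running this separately on each organization's data set would show which pairs are redundant for that organization; the slide's point is that the answer differs from one data set to another.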

  6. Under-sampling: A Case Study for CPLX in NASA60 • Is software complexity a useful cost driver in this domain? • In the NASA60 data set, CPLX is usually high • So this parameter carries little information • Consider dropping the parameter • If the projects with even higher complexity are the most important ones to NASA, redefine the complexity scale for highly complex NASA systems instead
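
A check like the CPLX one above can be sketched as follows: for each parameter, find the share of projects that carry its most common rating, and flag the parameter when that share is large. The 0.8 threshold, the function names, and the sample ratings are assumptions for illustration; the slides do not give a numeric cut-off.

```python
from collections import Counter


def dominant_share(values):
    """Return the most common value and the fraction of entries that have it."""
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


def low_information_parameters(projects, threshold=0.8):
    """Flag parameters whose ratings barely vary across the projects.

    `projects` is a list of dicts mapping parameter name to rating.
    The threshold is an assumed cut-off, not a value from the slides.
    """
    flagged = {}
    for name in projects[0]:
        value, share = dominant_share([p[name] for p in projects])
        if share >= threshold:
            flagged[name] = (value, share)
    return flagged


# Hypothetical ratings: CPLX is "high" in 9 of 10 projects, PCAP varies.
projects = [{"cplx": "high", "pcap": "nominal" if i % 2 else "high"} for i in range(9)]
projects.append({"cplx": "very_high", "pcap": "nominal"})
print(low_information_parameters(projects))
```

A flagged parameter is a candidate for dropping, or for a redefined rating scale if the rare values are the ones the organization cares about most.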

  7. How Does It Work – Technically? • Organization collects critical mass of similar project data • RPM tool starts with Size, tests which additional parameter produces most accurate estimates • By calibrating many times to random data subsets, testing on holdout data points • RPM tool continues to add next best parameters until accuracy starts to decrease • This produces best RPM for the data set
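
The procedure on this slide is a form of forward feature selection with repeated random holdout testing. The sketch below is one possible reading of it, not the RPM tool itself. It assumes a log-linear effort model fitted by ordinary least squares, mean magnitude of relative error (MMRE) as the accuracy measure, and a 30% holdout; none of these choices are stated in the presentation, and all function and parameter names are hypothetical.

```python
import math
import random


def fit_least_squares(X, y):
    """Solve the normal equations (X^T X) w = X^T y by Gaussian elimination."""
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    for i in range(n):
        A[i][i] += 1e-9  # tiny ridge term to reduce the risk of a zero pivot
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w


def holdout_error(rows, efforts, features, trials=30, holdout=0.3, seed=0):
    """Mean relative error over repeated random train/holdout splits.

    The same seed is used on every call, so every candidate feature set
    is scored on identical splits.
    """
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    k = max(1, int(len(idx) * holdout))

    def design(i):
        return [1.0] + [rows[i][f] for f in features]

    errors = []
    for _ in range(trials):
        rng.shuffle(idx)
        test, train = idx[:k], idx[k:]
        w = fit_least_squares([design(i) for i in train],
                              [math.log(efforts[i]) for i in train])
        for i in test:
            predicted = math.exp(sum(wi * xi for wi, xi in zip(w, design(i))))
            errors.append(abs(predicted - efforts[i]) / efforts[i])
    return sum(errors) / len(errors)


def reduced_parameter_model(rows, efforts, start="log_size"):
    """Start from size, then greedily add the parameter that most lowers
    holdout error; stop as soon as no candidate improves accuracy."""
    selected = [start]
    remaining = [f for f in rows[0] if f != start]
    best = holdout_error(rows, efforts, selected)
    while remaining:
        score, candidate = min(
            (holdout_error(rows, efforts, selected + [f]), f) for f in remaining)
        if score >= best:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = score
    return selected, best
```

Here `rows` is a list of dicts, one per project, holding the log of size under `"log_size"` plus one numeric rating per cost driver, and `efforts` holds the actual efforts. The call `reduced_parameter_model(rows, efforts)` returns the selected parameter names and their holdout error. Because the search is greedy, it finds a good subset for the data set but is not guaranteed to find the best of all possible subsets.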

  8. Real and Large Industry Data • Research is supported by CSE and NASA/JPL • Two datasets are public and available from PROMISE Software Engineering Repository - http://promise.site.uottawa.ca/ • 63 projects in Cocomo81/Software cost estimation • 60 projects NASA/Software cost estimation • Two datasets from COCOMO II database • 161 projects in COCOMO II 2000 • 119 projects in COCOMO II 2004 • More data are coming • 30 more projects from JPL • The techniques can be applied and basic results generalized to any model

  9. Example Result

  10. When Should You Not Use It? • Do not remove parameters that are important. • In many domains, expert business users hold more knowledge in their heads than is available in historical databases • Do not remove parameters you might still need. • Users may need some of the removed parameters to make a business decision.

  11. Published Results Some results have recently been published on the use of data mining and machine learning techniques to analyze cost estimation models and data • Chen, Menzies, Port, and Boehm. "Finding the Right Data for Software Cost Modeling", IEEE Software 11/2005. • Menzies, Port, Chen, and Hihn. "Specialization and Extrapolation of Software Cost Models", ASE 2005, Long Beach, California, 11/2005. • Menzies, Port, Chen, Hihn, and Stukes. "Validation Methods for Calibrating Software Effort Models", ICSE 2005, 05/2005, St. Louis, Missouri. • Yang, Chen, Valerdi, and Boehm. "Effect of Schedule Compression on Project Effort", ISPA 2005, 06/2005, Denver, Colorado. • Chen, Menzies, Port, and Boehm. "Feature Subset Selection Can Improve Software Cost Estimation Accuracy", PROMISE 2005, 05/2005, St. Louis, Missouri. • Menzies, Chen, Port, and Hihn. "Simple Software Cost Analysis: Safe or Unsafe?", PROMISE 2005, 05/2005, St. Louis, Missouri. All papers are available from http://www.ssei.org/chen/papers/papers.html

  12. Question and Answer
