COCOMO II Integrated with Crystal Ball® Risk Analysis Software. Clate Stansbury, MCR, LLC, cstansbury@mcri.com, (703) 506-4600. Prepared for the 19th International Forum on COCOMO Software Cost Modeling, University of Southern California, Los Angeles, CA, 26-29 October 2004.
Contents • Purpose: Describing Uncertainty • Representing Uncertain Inputs • Simulating Costs • Correlating Inputs and Costs • Summary
Estimators Must Describe Uncertainty • Report Cost As a Statistical Quantity, Not a Point • Cost of Any Incomplete Program Is Uncertain • Estimator Must Report That Uncertainty as Part of His or Her Delivered Estimate • Cost-risk Analysis Allows Estimator to Report Cost As a Probability Distribution, So Decision-maker Is Made Aware of • Expected Cost (Mean) • 50th Percentile Cost (Median) • 80th Percentile Cost • Overrun Probability of Project Budget
Representing Uncertain Inputs Using Triangular Distributions
Triangular Distribution of Element Cost, Reflecting Uncertainty in “Best” Estimate [Figure: triangular cost distribution with labels: Optimistic Cost; Best-Estimate Cost (Mode = Most Likely); Cost Implication of Technical, Programmatic Assessment]
COCOMO Cost Drivers as Triangular Distributions • For Each COCOMO II Input … • Input Request Interpreted as a Triangular Distribution • User Estimates Optimistic, Most Likely, and Pessimistic Values (Which Need Not All Differ from Each Other) • User Provides Three Values for Each COCOMO II Input, as Though There Were Three Separate Projects [Figure: triangular probability density over cost, with Optimistic, Most Likely (mode), and Pessimistic values marked]
COCOMO Cost Drivers as Triangular Distributions: Why Triangular Distribution? • Triangular Distribution is Simple and Malleable • Parameters (Optimistic, Most Likely, Pessimistic) Are Easy to Define and Explain • If More is Known About the Distributions, User Could Instead Provide Parameters for, e.g., Normal, Lognormal, Exponential, Uniform, or Beta Distributions • Good Topic for Further Research
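The three-value input described above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's tool: the function name `sample_input` and the 40/50/75 KSLOC size estimate are hypothetical, and the standard-library `random.triangular` stands in for a Crystal Ball assumption cell.

```python
import random

def sample_input(optimistic, most_likely, pessimistic, rng):
    """Draw one value from the triangular distribution defined by three estimates."""
    low = min(optimistic, pessimistic)
    high = max(optimistic, pessimistic)
    if low == high:
        # The three estimates coincide, so the input is not uncertain.
        return low
    return rng.triangular(low, high, most_likely)

# Hypothetical size estimate in KSLOC: optimistic 40, most likely 50, pessimistic 75.
rng = random.Random(2004)
ksloc_draws = [sample_input(40.0, 50.0, 75.0, rng) for _ in range(10_000)]
mean_ksloc = sum(ksloc_draws) / len(ksloc_draws)

# The theoretical mean of a triangular distribution is (low + mode + high) / 3,
# here 55 KSLOC, which is above the most likely value of 50.
print(f"sample mean: {mean_ksloc:.1f} KSLOC")
```

Because the pessimistic tail is longer than the optimistic one, the mean lies above the most likely value, which is why the mode alone understates expected size.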
How to Process Triangular Distributions? • How to Take the Product of Effort Multipliers When Each EM is a Triangular Distribution? • How to Compute Rest of COCOMO II Algorithm? • How to Sum Code Counts for All CSCIs?
Traditional “Roll-Up” Method (Too Simple) • Define “Best Estimate” of Each Cost Element to be the Most Likely Cost of that Element • List Cost Elements in a Work-Breakdown Structure (WBS) • Calculate “Best Estimate” of Cost for Each Element • Sum All Best Estimates • Define Result to be “Best Estimate” of Total Project Cost • Unfortunately, It Turns Out That Things are Not as Simple as They Seem – There are a Lot of Problems with This Approach
Why “Roll-up” Doesn’t Work [Figure: WBS-element triangular cost distributions, each with its Most Likely cost marked, merge into a total-cost normal distribution; the roll-up of most likely WBS-element costs and the most likely total cost are marked as separate points]
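The gap between the roll-up and the simulated total can be illustrated numerically. The sketch below assumes five hypothetical, right-skewed WBS elements with independent costs; all figures are invented for illustration.

```python
import random

# Hypothetical WBS elements: (optimistic, most likely, pessimistic) cost.
elements = [
    (90, 100, 150),
    (40, 50, 90),
    (180, 200, 320),
    (25, 30, 60),
    (70, 80, 130),
]

# Traditional roll-up: sum of the most likely element costs.
roll_up = sum(mode for _, mode, _ in elements)

# Monte Carlo: sum one random draw per element, repeated many times.
rng = random.Random(1)
totals = [
    sum(rng.triangular(low, high, mode) for low, mode, high in elements)
    for _ in range(20_000)
]
mean_total = sum(totals) / len(totals)
prob_within_roll_up = sum(t <= roll_up for t in totals) / len(totals)

print(f"roll-up of most likely costs: {roll_up}")
print(f"simulated mean total cost:    {mean_total:.1f}")
print(f"P(total <= roll-up):          {prob_within_roll_up:.3f}")
```

With skewed elements like these, the roll-up sits well below the simulated mean, so a budget set at the roll-up would have only a small chance of being sufficient.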
What Information a Cost Estimate Should Provide Statistical Information Output About the Cost • Probability Density (Frequency Distribution or Histogram) • S-curve (Cumulative Probability Distribution) • Percentiles • Min, Max, Mode, Mean
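As a rough sketch, the statistics listed above can be computed directly from a list of simulated costs. The helper names `percentile` and `summarize`, the sample distribution, and the budget of 200 are all hypothetical; the nearest-rank percentile rule is one of several common conventions and may differ slightly from what a given tool reports.

```python
import math
import random
import statistics

def percentile(sorted_values, p):
    """Nearest-rank percentile (p in 0..100) of an already sorted list."""
    if not sorted_values:
        raise ValueError("no values")
    rank = max(1, math.ceil(p / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]

def summarize(costs, budget):
    """Summary statistics a cost estimate might report, plus overrun probability."""
    ordered = sorted(costs)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        "median": percentile(ordered, 50),
        "p80": percentile(ordered, 80),
        "overrun_probability": sum(c > budget for c in ordered) / len(ordered),
    }

# Hypothetical simulated costs and budget, for illustration only.
rng = random.Random(7)
simulated = [rng.triangular(150.0, 240.0, 185.0) for _ in range(5_000)]
summary = summarize(simulated, budget=200.0)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Sorting the costs once also gives the S-curve directly: the k-th sorted value plotted against k divided by the number of trials.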
What a Cost Estimate Should Look Like (Crystal Ball Outputs) [Figure: Crystal Ball frequency chart for Forecast A8, 10,000 trials, 71 outliers, annotated “Density Curve”, with cost axis running from 462.43 to 761.35; companion cumulative chart annotated “S-Curve”]
Cost-Risk Analysis Works by Simulating System Cost • In Engineering Work, Computer Simulation of System Performance is Standard Practice, with Key Performance Characteristics Modeled by Monte Carlo Analysis as Random Variables, e.g. • Data Throughput • Time to Lock • Time Between Data Receipt and Delivery • Atmospheric Conditions • Cost-Risk Analysis Enables the Cost Analyst to Conduct a Computer Simulation of System Cost • WBS-element Costs Are Modeled As Random Variables • Total System Cost Distribution is Determined by Monte Carlo Simulation • Cost is Treated as a Performance Criterion
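A minimal sketch of such a simulation for a single CSCI follows, using the published COCOMO II.2000 post-architecture form PM = A × Size^E × ΠEM with E = B + 0.01 × ΣSF (A = 2.94, B = 0.91). Everything else is an assumption made for illustration: the input triples are invented, the inputs are sampled independently, and multiplier values are sampled directly rather than derived from rating levels.

```python
import math
import random

A, B = 2.94, 0.91  # COCOMO II.2000 calibration constants

def cocomo_effort(ksloc, scale_factors, effort_multipliers):
    """Person-months from the COCOMO II post-architecture effort equation."""
    exponent = B + 0.01 * sum(scale_factors)
    return A * ksloc ** exponent * math.prod(effort_multipliers)

def draw(triple, rng):
    """Sample an (optimistic, most likely, pessimistic) triple as a triangular."""
    low, mode, high = triple
    return mode if low == high else rng.triangular(low, high, mode)

# Hypothetical inputs, each given as (optimistic, most likely, pessimistic).
size = (40.0, 50.0, 70.0)
scale_factors = [(2.48, 3.72, 4.96), (2.03, 3.04, 4.05), (2.83, 4.24, 5.65),
                 (2.19, 3.29, 4.38), (3.12, 4.68, 6.24)]
effort_multipliers = [(0.92, 1.00, 1.10), (1.00, 1.17, 1.34), (0.85, 1.00, 1.19)]

# Point estimate using only the most likely values.
point_estimate = cocomo_effort(
    size[1], [sf[1] for sf in scale_factors], [em[1] for em in effort_multipliers]
)

# Monte Carlo: rerun the whole equation on a fresh draw of every input.
rng = random.Random(19)
efforts = sorted(
    cocomo_effort(
        draw(size, rng),
        [draw(sf, rng) for sf in scale_factors],
        [draw(em, rng) for em in effort_multipliers],
    )
    for _ in range(5_000)
)
median = efforts[len(efforts) // 2]
p80 = efforts[int(0.8 * len(efforts))]
print(f"point estimate {point_estimate:.0f} PM, median {median:.0f} PM, 80th pct {p80:.0f} PM")
```

Running the full equation on each trial sidesteps the question of how to multiply or exponentiate triangular distributions analytically: only the sampled numbers are ever combined.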
Crystal Ball Risk-Analysis Software • Commercially Available Third-Party Software Add-on to Excel, Marketed by Decisioneering, Inc., 2530 S. Parker Road, Suite 220, Aurora, CO 80014, (800) 289-2550 • Inputs • Parameters Defining WBS-Element Distributions • Rank Correlations Among WBS-Element Cost Distributions • Mathematics • Monte-Carlo (Random) or Latin Hypercube (Stratified) Statistical Sampling • Virtually All Probability Distributions That Have Names Can Be Used • Suggests Adjustments to Inconsistent Input Correlation Matrix • Outputs • Percentiles and Other Statistics of Program Cost • Cost Probability Density and Cumulative Distribution Graphics
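The difference between random and stratified sampling can be sketched as follows. This is a generic Latin hypercube construction, not Crystal Ball's implementation; the function names and the 40/50/75 example are hypothetical.

```python
import math
import random

def triangular_ppf(u, low, mode, high):
    """Inverse CDF of a triangular distribution (requires low < high)."""
    cut = (mode - low) / (high - low)
    if u < cut:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))

def latin_hypercube(n_trials, n_inputs, rng):
    """Return n_trials points in the unit cube, stratified in every dimension.

    Each input's range [0, 1) is cut into n_trials equal strata, one uniform
    value is drawn inside each stratum, and the strata are shuffled
    independently per input before being paired up into trials.
    """
    columns = []
    for _ in range(n_inputs):
        strata = [(k + rng.random()) / n_trials for k in range(n_trials)]
        rng.shuffle(strata)
        columns.append(strata)
    return list(zip(*columns))

rng = random.Random(3)
design = latin_hypercube(1_000, 3, rng)

# Map the first input of every trial onto a hypothetical 40/50/75 triangular.
sizes = [triangular_ppf(point[0], 40.0, 50.0, 75.0) for point in design]
print(f"stratified sample mean: {sum(sizes) / len(sizes):.2f}")
```

Because every stratum of every input is visited exactly once, the tails are covered even with few trials, which is the usual reason to prefer stratified over purely random sampling.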
How CB Simulations Work [Figure: on each trial (Trial 1, Trial 2, …, Trial 5000) a value is drawn for each Assumption Cell, e.g., G5, and the Forecast cell Total Cost =SUM($G$4:$G$8) is recalculated]
Risks are Correlated • Resolving One WBS Element’s Risk Issues by Spending More Money on It Often Involves Increasing Cost of Several Other Elements as Well • For Example, Excessive Complexity in One CSCI Impacts Effort Required to Develop Other CSCIs that Interface with It • Schedule Slippage Due to Problems in One CSCI Leads to Cost Growth and Schedule Slippage in Other Elements (“Standing Army Effect”) • Hardware Problems Discovered Late in Program Often Have to Be Circumvented by Making Expensive Last-minute Fixes to the Software • As We Will Soon See, Inter-Element Correlation Tends to Increase the Variance of the Total-Cost Probability Distribution • Numerical Values of Inter-WBS-Element Correlations are Difficult to Estimate, but That’s Another Story
Maximum Possible Underestimation of Total-Cost Sigma • Percent Underestimated When Correlation Assumed to be 0 Instead of ρ (n = Number of WBS Elements)
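The quantity on this chart can be tabulated under the simplifying assumptions of the deck's “Correlation Matters” slide: n elements with equal standard deviation and one common pairwise correlation (written `rho` in the code). In that case the total-cost sigma is understated by 100 × (1 − 1/√(1 + (n − 1)·rho)) percent when the correlation is taken as zero. The function name and the grid of n and rho values are chosen here for illustration.

```python
import math

def percent_underestimated(n, rho):
    """Percent by which total-cost sigma is understated when rho is taken as 0.

    Assumes n cost elements with equal sigma and a common pairwise
    correlation rho, so that the true variance is n*sigma^2*(1 + (n-1)*rho)
    while the zero-correlation variance is n*sigma^2.
    """
    return 100.0 * (1.0 - 1.0 / math.sqrt(1.0 + (n - 1) * rho))

for n in (10, 30, 100):
    row = ", ".join(
        f"rho={rho:.1f}: {percent_underestimated(n, rho):5.1f}%"
        for rho in (0.1, 0.2, 0.5, 0.9)
    )
    print(f"n={n:3d}  {row}")
```

Even a modest correlation of 0.2 across 10 elements understates sigma by roughly 40 percent, and the effect grows with the number of elements.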
Selection of Correlation Values • “Ignoring” Correlation Issue is Equivalent to Assuming that Risks are Uncorrelated, i.e., that All Correlations are Zero • Square of Correlation (namely, R²) Represents Percentage of Variation in one WBS Element’s Cost that is Attributable to Influence of Another’s • Reasonable Choice of Nonzero Values Brings You Closer to Truth • Most Elements are, in Fact, Pairwise Correlated • 0.2 is at “Knee” of Curve on Previous Charts, thereby Providing Most of the Benefits at Least Commitment
Determining Correlations Among COCOMO II Cost Drivers • Default Correlations • Correlations of Intra-CSCI Inputs to Default to 0.5 • Correlations of Inter-CSCI Efforts to Default to 0.2 • More Detailed Default Correlations? • Higher Correlation Between RELY and DOCU? • COCOMO II Security Extension Cost Driver Related to Existing Cost Drivers
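One way to draw correlated cost-driver values is sketched below. It uses a one-factor Gaussian copula, which is a substitute technique chosen for brevity and is not Crystal Ball's rank-correlation method. It assumes a single common, non-negative correlation among all inputs of a CSCI (0.5, the intra-CSCI default above), and the two driver triples are illustrative values only. The correlation that results between the sampled drivers is somewhat lower than the `rho` applied to the underlying normals.

```python
import math
import random
from statistics import NormalDist

def triangular_ppf(u, low, mode, high):
    """Inverse CDF of a triangular distribution (requires low < high)."""
    cut = (mode - low) / (high - low)
    if u < cut:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))

def correlated_uniforms(n_inputs, rho, rng):
    """Uniform draws whose underlying normals share pairwise correlation rho >= 0."""
    std_normal = NormalDist()
    shared = rng.gauss(0.0, 1.0)  # factor common to every input of the CSCI
    draws = []
    for _ in range(n_inputs):
        z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        draws.append(std_normal.cdf(z))
    return draws

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative (optimistic, most likely, pessimistic) triples for two drivers.
drivers = [(0.92, 1.00, 1.10), (0.91, 1.00, 1.11)]
rng = random.Random(22)
first, second = [], []
for _ in range(5_000):
    u1, u2 = correlated_uniforms(2, 0.5, rng)
    first.append(triangular_ppf(u1, *drivers[0]))
    second.append(triangular_ppf(u2, *drivers[1]))

observed = pearson(first, second)
print(f"observed correlation between the two drivers: {observed:.2f}")
```

A full correlation matrix with different values per pair would need a matrix factorization instead of the single shared factor used here.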
Summary • Estimator Must Model Uncertainty • Describe Uncertainty by Representing COCOMO Inputs as Triangular Distributions • Calculate Implications of Uncertainty by Using Monte Carlo or Latin Hypercube Simulations to Perform COCOMO II Algorithm • Consider Correlation Among CSCI Risks and Costs • Professional Software, e.g., Crystal Ball, is Available to do Computations
Acronyms • AA: Assessment and Assimilation • AT: Automatically Translated Code • CB: Crystal Ball • CM: Percent of Code Modified • COCOMO: Constructive Cost Model • CSCI: Computer Software Configuration Item • DM: Percent of Design Modified • EI: External Input • EIF: External Interface File • EO: External Output • EQ: External Inquiry • ILF: Internal Logical File • IM: Effort for Integration • KSLOC: Thousands of Source Lines of Code • MS: Microsoft • O, M, P: Optimistic, Most Likely, Pessimistic • SCED: Schedule Compression/Expansion Rating • SLOC: Source Lines of Code • SU: Software Understanding • UFP: Unadjusted Function Point • UNFM: Programmer Unfamiliarity Rating • USC: University of Southern California • WBS: Work Breakdown Structure
Correlation Matters • Suppose for Simplicity • There are $n$ Cost Elements $C_1, \dots, C_n$ • Each $\mathrm{Var}(C_i) = \sigma^2$ • Each $\mathrm{Corr}(C_i, C_j) = \rho < 1$ • Total Cost $C = \sum_{i=1}^{n} C_i$
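Under these simplifying assumptions ($n$ cost elements, each with variance $\sigma^2$, and a common pairwise correlation $\rho$), the variance of the total cost $C$ can be worked out directly. The derivation below is a standard result offered as a sketch, not a reproduction of the original slide.

```latex
\begin{align*}
\mathrm{Var}(C)
  &= \sum_{i=1}^{n} \mathrm{Var}(C_i)
     + 2 \sum_{i<j} \mathrm{Corr}(C_i, C_j)\,\sigma\,\sigma \\
  &= n\sigma^2 + n(n-1)\rho\,\sigma^2 \\
  &= n\sigma^2 \bigl(1 + (n-1)\rho\bigr).
\end{align*}
% Assuming \rho = 0 instead gives Var(C) = n\sigma^2, so the total-cost sigma
% is understated by the fraction
\[
  1 - \frac{\sqrt{n}\,\sigma}{\sqrt{n}\,\sigma\sqrt{1 + (n-1)\rho}}
    = 1 - \frac{1}{\sqrt{1 + (n-1)\rho}} .
\]
```

For example, with $n = 10$ and $\rho = 0.2$ the factor $1 + (n-1)\rho$ equals 2.8, so ignoring correlation understates sigma by about 40 percent.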
Effort Cost Estimate Frequency Chart • Approximation of Cost-Probability Distribution
Effort Cost Estimate Cumulative-Probability Function • Probability of Cost Being Less Than x
Cost Estimate Statistics

Statistical Information
  Trials                  1500
  Mean                    190.12
  Median                  189.36
  Mode                    ---
  Standard Deviation      13.96
  Variance                195.01
  Skewness                0.15
  Kurtosis                2.84
  Coeff. of Variability   0.07
  Range Minimum           152.86
  Range Maximum           237.42
  Range Width             84.56
  Mean Std. Error         0.36

Confidence Levels
  Percentile   Effort      Percentile   Effort
  0%           152.86      55%          191.11
  5%           167.41      60%          193.12
  10%          171.84      65%          195.36
  15%          175.86      70%          197.51
  20%          178.55      75%          199.69
  25%          180.77      80%          202.34
  30%          182.66      85%          204.86
  35%          184.49      90%          208.09
  40%          185.99      95%          213.71
  45%          187.57      100%         237.42
  50%          189.36
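The percentile table can also be read in the other direction, to estimate the probability of overrunning a given budget. The sketch below interpolates linearly between the tabulated percentiles; the budget of 200 effort units is hypothetical, and linear interpolation is an approximation to the simulated S-curve.

```python
from bisect import bisect_right

# (percentile, effort) pairs copied from the Confidence Levels table.
percentiles = [
    (0, 152.86), (5, 167.41), (10, 171.84), (15, 175.86), (20, 178.55),
    (25, 180.77), (30, 182.66), (35, 184.49), (40, 185.99), (45, 187.57),
    (50, 189.36), (55, 191.11), (60, 193.12), (65, 195.36), (70, 197.51),
    (75, 199.69), (80, 202.34), (85, 204.86), (90, 208.09), (95, 213.71),
    (100, 237.42),
]

def prob_at_or_below(effort):
    """Approximate P(Effort <= effort) by linear interpolation in the table."""
    efforts = [e for _, e in percentiles]
    if effort <= efforts[0]:
        return 0.0
    if effort >= efforts[-1]:
        return 1.0
    i = bisect_right(efforts, effort)
    (p0, e0), (p1, e1) = percentiles[i - 1], percentiles[i]
    return (p0 + (p1 - p0) * (effort - e0) / (e1 - e0)) / 100.0

budget = 200.0  # hypothetical budget, in the same units as the table
overrun = 1.0 - prob_at_or_below(budget)
print(f"P(effort > {budget:.0f}) is roughly {overrun:.1%}")
```

A budget of 200 falls just above the 75th percentile (199.69), so the estimated overrun probability is a little under 25 percent.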
Correlation Matrices Allow User to Adjust Correlations • One Matrix for Each CSCI Allows Estimator to Set Correlations Among Cost Drivers for that CSCI