520 likes | 738 Views
Background. At the MDA* Risk Working Group of 29/30 May 01, Schedule Risk was a major topicAction Item:Investigate Schedule Risk Content variation Cost risk* PERTTime and budget constraints* The subject of this paper. This work was conducted for and funded by the IC CAIG and MDA. The Hyp
E N D
1. Schedule and Cost Growth R. L. Coleman, J. R. Summerville, M. E. Dameron
35th ADoDCAS
2. Background At the MDA* Risk Working Group of 29/30 May 01, Schedule Risk was a major topic
Action Item:
Investigate Schedule Risk
Content variation
Cost risk*
PERT
Time and budget constraints
* The subject of this paper
3. The Hypothesis Many people believe1 a graph of cost growth vs. schedule growth as illustrated below:
4. The Data We analyzed data from the RAND Cost Growth Database with both the following characteristics:
Programs with E&MD only
Because growth is different for those with and without PDRR
Programs with schedule data in the requisite fields
There were 59 points. The analysis follows.
5. Descriptive Statistics for Schedule Growth We will look at these descriptive statistics in the following slides
Distribution shape
Scatter plots
Dollar weighting
6. Schedule Growth Distribution
7. Basic Statistics of Schedule ChangeAnalyzed data only Mean 1.29
Standard Deviation 0.54
CV 42%
75th %-ile 1.46
61st %-ile 1.29
50th %-ile 1.11
25th %-ile 1.00
Shrinkers 9/59 15.3%
Steady 12/59 20.3%
Stretchers 38/59 64.4%
8. Basic Scatterplots – SGF & Sked vs. Dollar Size
9. The “1/x Pattern”
10. CGF and SGF vs. Cost Size
11. Basic Scatterplots – Dollar Size vs. Length
12. Basic Scatterplots – Length vs. $ Size
13. Basic Scatterplots – Cost Growth
14. Basic Scatterplots - Length
15. Weighting by Length- and Dollar-Size
16. Sorted Graphs
17. Correlation and Other Joint Effects
18. Correlation and Other Joint Effects Between Schedule Growth and Cost Growth We will look for correlation
Parametric
Non-parametric
Trends in sorted data
We will investigate the hypothesis for schedule growth vs. cost growth
We will normalize by dollar size to eliminate any inadvertent distortion
19. Correlation - Parametric
20. Correlation – Non-Parametric Test
Cox Stewart Test for Trend test statistic of 18 is within the critical values of 8.41 and 18.59
The non-parametric test cannot reject no correlation
Used CGF Sort because CGF had less ties, thus less ambiguity
Previous parametric test cannot reject no correlation
Moving averages of CGF do not show a rise
Conclusion: Cannot reject “no correlation”
Visual presentations follow
21. Patterns in SGF and CGF
22. Investigating the Hypothesis
23. CGF by Regime
24. CGF by Regime
25. There is a PatternbutIs There a Curve?
26. Is there a curve? There is no pattern on either side of the data
27. Is there a Curve?
28. Normalizing for Dollar SizeTo Remove Inadvertent Dollar Size Distortion
29. Size Normalization We know there is a size effect in CGF
We think there is a size effect in SGF
We must investigate schedule effects free from size effects
First we will look at a scatter plot
Then we will normalize1 all programs for dollar size, and compare to actuals
If there is a pattern in any regime, we will worry
If there is no regime pattern, we can conclude there is no dollar size distortion
We chose to correct out dollar-size because it is stronger, and because we were worried about a length and SGF correlation causing mischief if we tried to correct it out
30. Is there a Dollar-Size Bias?
31. Normed vs Actual CGFs by Regime
32. Normed vs Actual CGFs by Regime
33. Correction Factors We must correct for schedule growth, if we can predict it. The form of the correction is unclear:
34. Hypothesis – The Answer The Hypothesis was about right
The below is all we can say for sure
Some liberties have been taken with the graph
35. Conclusions Schedule growth is less extreme than cost growth
But patterns are the same
There is a cost-size and length effect, just as for cost growth
Dollar-larger programs lengthen less
Longer programs lengthen less
Neither cost nor length predict the other
There is a difference in cost growth by schedule-growth regime
Relative to Relative to
Regime CGF No Change Average
Programs that shorten 1.42 1.25 1.14
Programs that stay the same 1.13 1.00 0.91
Programs that lengthen 1.24 1.09 1.00
36. Modeling Schedule Duration of Networks
37. Schedule Growth Distributions
38. Best Fits vs. Empirical Data
39. Why are Values of 1 more Common?And who cares? There is intense pressure to complete on time, and late finishes are easily discerned
The consequence of an early finish is to “ship” a flawed system
Flaws can be fixed after testing
There is a temptation to drag out work if you are done early
Perhaps the implication is that the customer should put less emphasis on finish time and more on test results?
In any event, it is altogether likely that there would be cosmetic 1.0 SGFs, and the data would seem to reflect that
We will find a way to deal with this in the analysis, and recommend a modeling approach
40. Extreme Value Distribution Fit The CDF of the data is oddly shaped due to a large number of 1.0’s and fails a Kolmogorov-Smirnov test for the Extreme Value Distribution
We believe the disproportionate amount of 1.0’s is politically motivated and not a natural occurrence
This causes a “gap” between the empirical and fitted distributions
We will next examine a hypothetical distribution with the 1.0’s redistributed along the “gap” area (using the Ext Val fit)
41. The Hypothetical “Natural” CDF
42. What the test showsAnd what it doesn’t show The redistributed data pass a K-S test
But, the test cannot take the redistribution of data into account
This is analogous to loss of degrees of freedom, but the literature provides no remedy
We fully realize that this is not a “valid statistical test”
But it strongly suggests that the underlying distribution is the Extreme Value distribution
43. Hybrid Distribution Alternative The hypothetical natural (re-distributed) distribution is reasonable for use
But, if you wish to capture the effects of too many programs appearing to finish “on schedule” then a hybrid distribution should be examined
To do this we must consider the probability of 1.0 vs. the rest of the outcomes as discrete cases
P(1.0) = 12/59 = 20.3%
P(Extreme Value) = 79.7%
The Extreme Value parameters would then be estimated from the data with the 1.0’s removed
44. Hybrid Distribution Alternative
45. Distribution Conclusions We have shown that the Extreme Value distribution is well supported as the natural distribution
We have shown that the pieces of the hybrid distribution fit the data
And, the hybrid reproduces the actuals well
We recommend using the hybrid
But if “political” or “cosmetic” effects are absent, we recommend using the hypothetical natural distribution
46. Backup
47. Size Adjustments
48. Prediction Equation - RAND RDT&E
49. Prediction Equation - RAND RDT&E
50. Dispersion – Bounds
51. Dispersion – Bounds
52. Basic Statistics of Schedule ChangeAll available schedule data compared to analyzed data Statistic Analyzed All Observations
Mean 1.29 1.25
Standard Deviation 0.54 0.51
CV 42% 41%
n 59 98
75th %-ile 1.46 1.365
%-ile of the mean 61% 63%
50th %-ile 1.11 1.03
25th %-ile 1.00 1.00
Shrinkers 15.3% 20.4%
Steady 20.3% 22.4%
Stretchers 64.4% 57.1%