Improving Group Decision Making Under Uncertain Circumstances: Applications in Defense Acquisition. Dennis Goldenson & Bob Stoddard (SEI), Ricardo Valerdi (University of Arizona). COCOMO 2013, 23 October 2013. Information Flow for Early Lifecycle Estimation (QUELCE).
Improving Group Decision Making Under Uncertain Circumstances: Applications in Defense Acquisition • Dennis Goldenson & Bob Stoddard (SEI) • Ricardo Valerdi (University of Arizona) • COCOMO 2013 • 23 October 2013
Information Flow for Early Lifecycle Estimation (QUELCE) • Proposed Material Solution & Analysis of Alternatives • Information from Analogous Programs/Systems • Operational Capability Trade-offs • Technology Development Strategy • System Characteristics Trade-offs • KPP selection • Systems Design • Sustainment issues... • Program Execution Change Drivers • Mission / CONOPS • Capability Based Analysis • ... • Production Quantity • Acquisition Mgt • Scope definition/responsibility • Contract Award • Expert Judgements • Driver States & Probabilities • Plans, Specifications, Assessments • Probabilistic Modeling (BBN) & Monte Carlo Simulation • Program Execution Scenarios with conditional probabilities of drivers/states • Cost Estimates • engineering • CERs • analogy • parametric
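The probabilistic modeling and Monte Carlo step in this flow can be illustrated with a very small sketch. It assumes only two change drivers, A and B, where the state of B depends on the state of A, and it turns sampled scenarios into a cost distribution. Every number and name here (`P_A`, `P_B_GIVEN_A`, `GROWTH`, `BASE_COST`) is a hypothetical placeholder, not a value from the study, and a real QUELCE model would use a full Bayesian belief network with many more drivers and states.

```python
import random

# Hypothetical inputs for illustration only; none of these come from the study.
P_A = 0.30               # assumed P(driver A goes off nominal)
P_B_GIVEN_A = 0.75       # assumed P(B off nominal | A off nominal)
P_B_GIVEN_NOT_A = 0.10   # assumed P(B off nominal | A nominal)
BASE_COST = 100.0        # assumed baseline cost, arbitrary units
GROWTH = {"A": 0.20, "B": 0.35}  # assumed fractional cost growth per driver


def simulate_once(rng):
    """Sample one program execution scenario and return its cost."""
    a_off = rng.random() < P_A
    # B's probability is conditional on the sampled state of A.
    b_off = rng.random() < (P_B_GIVEN_A if a_off else P_B_GIVEN_NOT_A)
    cost = BASE_COST
    if a_off:
        cost *= 1 + GROWTH["A"]
    if b_off:
        cost *= 1 + GROWTH["B"]
    return cost


def simulate(n_trials=10_000, seed=1):
    """Run many scenarios and report the 10th, 50th and 90th percentile costs."""
    rng = random.Random(seed)
    costs = sorted(simulate_once(rng) for _ in range(n_trials))
    return {
        "p10": costs[int(0.10 * (n_trials - 1))],
        "p50": costs[int(0.50 * (n_trials - 1))],
        "p90": costs[int(0.90 * (n_trials - 1))],
    }


result = simulate()
```

The output is a range of cost outcomes, not a single point estimate, which is the form of result the flow above feeds into cost estimation.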
Issues with Expert Judgment • Most people are significantly overconfident and overoptimistic in their judgment! • Calibrated = more realistic size and wider range to reflect true expert uncertainty • [Figure: An Estimate of SW Size]
Cost Estimation Research • Previous calibration research • Current research in progress • Future research & applications
Calibration Training • A series of training exercises • Typically 3 or 4 in sequence • Each exercise includes: • A battery of factual questions • Asking for upper and lower bounds within which people are 90 percent certain the correct answer lies • Sometimes true/false questions where people provide their confidence in their answers • Brief reviews of the correct answers • Group discussions of why the participants answered as they did • Guidance with heuristics about ways to explicitly consider interdependencies among related factors that might affect the basis of one’s best judgments under uncertain circumstances
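One way to score a battery of 90 percent interval questions is to count how often the stated bounds contain the correct answer. The sketch below is an assumption about how such scoring could be done, not the scoring used in the study; `IntervalAnswer`, `hit_rate` and `calibration_gap` are hypothetical names, and the sample answers are made up.

```python
from dataclasses import dataclass


@dataclass
class IntervalAnswer:
    lower: float  # respondent's lower bound
    upper: float  # respondent's upper bound
    truth: float  # correct answer revealed in the review


def hit_rate(answers):
    """Fraction of intervals that contain the correct answer."""
    if not answers:
        raise ValueError("need at least one answer")
    hits = sum(1 for a in answers if a.lower <= a.truth <= a.upper)
    return hits / len(answers)


def calibration_gap(answers, stated_confidence=0.90):
    """Stated confidence minus observed hit rate; positive means overconfident."""
    return stated_confidence - hit_rate(answers)


# Made-up answers for illustration: three of the five intervals contain the truth.
answers = [
    IntervalAnswer(1900, 1950, 1969),  # miss
    IntervalAnswer(100, 500, 330),     # hit
    IntervalAnswer(5, 10, 12),         # miss
    IntervalAnswer(1000, 9000, 8849),  # hit
    IntervalAnswer(20, 60, 46),        # hit
]
```

A well-calibrated respondent's hit rate on 90 percent intervals should be close to 0.90. In this made-up battery the hit rate is 0.6, so the gap of about 0.3 indicates overconfidence.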
A Study of Accuracy versus Precision • Which would you rather have? • Someone whose recognized bounds of uncertainty include the correct answer... • Someone who’s a little overconfident but is closer to being accurate...
Relative Accuracy Improves • Domain Specific Tests (n=29) • Generic Tests (n=14) • Experiments confirm that expert judgment can be calibrated.
Training Leads to Better Recognition of Uncertainty • Domain Specific Tests • Generic Tests
Experts Improved with Training • Test 1: Inaccurate & imprecise • Test 2: Accurate & imprecise • Test 3: Accurate & precise
“Change Drivers” Explain Program Execution • Categories of unanticipated change events that often occur in MDAPs over the acquisition lifecycle: • Often a result of previous changes • Leading to subsequent changes • Or affecting program outcomes (which themselves can be drivers of further change) • The status of MDAP activities that are proceeding as planned is not a change driver. • Intended use • To enable DoD domain-specific expert judgment training • Initially in QUELCE workshops • Other uses may be possible if we are successful in populating a larger DoD domain-specific reference point repository, e.g.: • “Deep dives” earlier in pre-Milestone A • Program planning & risk analysis throughout the acquisition lifecycle
Domain Reference Points Aid Judgment • “There is a 90% probability that MDAPs with certain characteristics will experience off-nominal change drivers A, B, and C.” • “When change driver A goes off-nominal, there is a 75% probability change driver B will go off-nominal.” • “When change drivers A, B, and C go off-nominal, there is a 90% probability that change driver D will go off-nominal.” • “When specific change drivers go off-nominal, specified impacts have occurred.” • “When specific change drivers go off-nominal, other change drivers are influenced or impacts felt within a certain amount of calendar time.”
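Reference points of this form are conditional probabilities, and one plausible way to derive them is from counts over historical program records. The sketch below assumes each record is simply the set of change drivers that went off-nominal in one program; `conditional_probability` is a hypothetical helper and the records are invented for illustration, not drawn from any real MDAP data.

```python
def conditional_probability(records, given, target):
    """Estimate P(target off-nominal | all drivers in `given` off-nominal).

    `records` is a list of sets; each set names the change drivers that
    went off-nominal in one historical program.
    """
    given = set(given)
    matching = [r for r in records if given <= r]
    if not matching:
        return None  # no program in the sample satisfies the condition
    return sum(1 for r in matching if target in r) / len(matching)


# Invented records for illustration only.
records = [
    {"A", "B"},
    {"A", "B", "C", "D"},
    {"A"},
    {"A", "B"},
    {"C"},
    set(),
]

p_b_given_a = conditional_probability(records, ["A"], "B")
p_d_given_abc = conditional_probability(records, ["A", "B", "C"], "D")
```

With so few matching programs an estimate like `p_d_given_abc` rests on a single record, so in practice the number of matching programs should be reported alongside each probability.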
A Reference Point Repository for DoD • Categorizing & tagging textual information about change events • From program documents such as SAR & DAES • Identify DoD domain-specific reference points mapped to QUELCE change drivers • Joining the tags & text excerpts with existing data • MDAP domain characteristics • Program performance outcomes, e.g., cost, schedule & scope of deliverables • Using the categories & text excerpts: • To assist judgments by QUELCE workshop teams based on experiences in analogous programs • For use in individual expert calibration experiments & group resolution of differences among team members • If we’re successful: Also used to support other activities • Both earlier in Milestone A & throughout the program lifecycle
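The join between tagged excerpts and program characteristics could look something like the following sketch. The record layout, the field names, and the `find_reference_points` helper are all assumptions made for illustration; the slides do not describe the repository's actual schema, and the program identifiers and excerpt text are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Excerpt:
    program: str        # program identifier (hypothetical)
    source: str         # source document type, e.g. "SAR" or "DAES"
    change_driver: str  # tag mapped to a QUELCE change driver
    text: str           # excerpt describing the change event


def find_reference_points(excerpts, programs, change_driver, **characteristics):
    """Return excerpts tagged with `change_driver` from programs whose
    characteristics match every given key/value pair."""
    return [
        e for e in excerpts
        if e.change_driver == change_driver
        and all(programs.get(e.program, {}).get(k) == v
                for k, v in characteristics.items())
    ]


# Placeholder data for illustration only.
programs = {
    "P1": {"domain": "aircraft"},
    "P2": {"domain": "ship"},
}
excerpts = [
    Excerpt("P1", "SAR", "Production Quantity", "placeholder excerpt 1"),
    Excerpt("P2", "DAES", "Production Quantity", "placeholder excerpt 2"),
    Excerpt("P1", "SAR", "Contract Award", "placeholder excerpt 3"),
]

matches = find_reference_points(
    excerpts, programs, "Production Quantity", domain="aircraft"
)
```

A workshop team considering a new aircraft program could use a query like this to pull only the change events reported by analogous programs.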
What’s Next for Expert Judgment Research? • A focus on DoD domain-specific questions & reference points • Seed a queryable reference point repository with DoD data • Shift our focus to experiments on resolution of differences among members of expert groups • Quantify benefit of access to domain reference points • Comparing algorithmic & group consensus methods with respect to accuracy, recognition of uncertainty, & time required to resolve differences among team members • Upgrade our existing software support: • To capture individual judgments & eventually resolve differences without the need for face-to-face meetings
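As a point of comparison for group consensus, an algorithmic method combines individual judgments by a fixed rule with no discussion. The slides do not say which algorithm is intended, so the sketch below uses one simple and common choice as an assumption: the median of each bound across experts. `combine_intervals` is a hypothetical name.

```python
from statistics import median


def combine_intervals(intervals):
    """Combine experts' (lower, upper) intervals by taking the median of
    the lower bounds and the median of the upper bounds."""
    if not intervals:
        raise ValueError("need at least one interval")
    lowers = [lo for lo, _ in intervals]
    uppers = [hi for _, hi in intervals]
    return median(lowers), median(uppers)


# Three experts' made-up 90 percent intervals for the same quantity.
combined = combine_intervals([(10, 50), (20, 40), (5, 100)])
```

The median limits the influence of a single very wide or very narrow interval, but whether it outperforms a discussed consensus on accuracy and recognition of uncertainty is exactly the open question the experiments are meant to answer.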
Leveraging the Delphi Planning Process • Given historical work • Wideband Delphi applied to cost estimation enabling discussion & a broader communications channel to produce more accurate results (Boehm, 1981) • Recent research in software project estimation shows that estimates that benefit from group discussion tend to be more accurate (Cohn, 1997; Moløkken & Jørgensen, 2004). • We will research improved group decision-making judgment • Leverage expertise to forecast uncertainties related to costs and risks of program execution • Revisit conventional Delphi discouragement of discussion between rounds, introducing discussion of domain reference points
Additional Considerations in Judgment Experiments • Heuristics such as anchoring & adjustment, overconfidence, blind spot bias, and others commonly bias individual experts’ judgments • An individual estimator may first make a “best estimate” of duration for a program element ... then adjust it to form long-duration and short-duration estimates giving a range of likely outcomes • Such adjustments are commonly known to be too small (Fischhoff, 1994) • Resulting in too-tight range estimates & hugely over-frequent 1% and 5% tail occurrences • However, explicit prompting of the estimator’s imagination can substantially reduce this tightness (Connolly & Deane, 1997)
Summary • The eventual target is to apply these & related group reconciliation methods in our current research on QUELCE • QUELCE works by codifying expert judgment for cost estimates prior to Milestone A (Ferguson et al., 2011) • However, improving group decision making is equally important for program planning and risk analysis throughout the lifecycle. • We will validate & enhance our previous research on calibrating individual judgment (Goldenson & Stoddard, 2013) by: • Developing DoD domain-specific questions for a series of test batteries & associated training exercises • Investigating the value of DoD domain-specific reference points that provide more detailed contextual background about programs analogous to the programs being considered in calibration test questions • We welcome collaborators for the expert judgment experiments!