A Multi-Expert Scenario Analysis for Systematic Comparison of Expert Weighting Approaches*
Umit Guvenc, Mitchell Small, Granger Morgan
Carnegie Mellon University
CEDM Annual Meeting, Pittsburgh, PA, May 20, 2012
*Work supported under a cooperative agreement between NSF and Carnegie Mellon University through the Center for Climate and Energy Decision Making (SES-0949710)
Multi-Expert Weighting: A Common Challenge in Public Policy
• Within the climate change context, many critical quantities and probability distributions are elicited from multiple experts (e.g., climate sensitivity)
• There is no consensus on the best methodology for aggregating multiple, sometimes conflicting, expert opinions
• It is critical to demonstrate the advantages and disadvantages of different approaches under different circumstances
General Issues Regarding Multi-Expert Weighting
• Should we aggregate expert judgments at all?
• If we do, should we use a differential weighting scheme?
• If we do, should we use “seed questions” to assess expert skill?
• If we do, how should we choose “appropriate” seed questions?
• If we do, how do different weighting schemes perform under different circumstances?
  • Equal weights
  • Likelihood weights
  • “Classical” (Cooke) weights
Presentation Outline
• Alternative Weighting Methods
  • Likelihood, “Classical”, and Equal weighting schemes
• Our Approach
  • Characterizing Experts: Bias, Precision, Confidence
  • Multi-Expert Scenario Analysis
• Conclusions
Likelihood Weights
• Traditional approach for multi-model aggregation in classical statistics
• Equivalent to Bayesian model aggregation with uninformative priors
• Uses relative likelihoods for Prob[true value | expert estimate]
• We assume an expert’s actual likelihood depends on his or her skill: Bias and Precision
• An expert’s self-perceived likelihood depends on his or her Confidence
• A parametric error distribution function is required
  • Normal distribution assumed in the analysis that follows (many risk-related quantities are ~lognormal, so the approach applies directly to them after a log transform)
• “Micro” validation incorporated (performance assessed question by question)
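A minimal Python sketch of likelihood weighting as described above, assuming normal error distributions fit to each expert’s reported percentiles. The function names and the percentile-to-normal mapping are illustrative assumptions, not the authors’ code.

```python
# Sketch: likelihood weights under a normal error assumption.
# Each expert reports 5th/50th/95th percentiles for seed questions
# whose true values are known. All names here are illustrative.
import math

Z_90 = 1.645  # standard-normal z for the 95th percentile

def implied_normal(p5, p50, p95):
    """Fit a normal to an expert's reported percentiles."""
    mu = p50
    sigma = (p95 - p5) / (2 * Z_90)  # spread implied by the 90% interval
    return mu, sigma

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def likelihood_weights(estimates, true_values):
    """Relative likelihood of each expert given seed-question outcomes.

    estimates: one entry per expert, each a list of (p5, p50, p95)
               tuples, one tuple per seed question.
    true_values: realized value of each seed question.
    """
    likelihoods = []
    for expert in estimates:
        L = 1.0
        for (p5, p50, p95), x_true in zip(expert, true_values):
            mu, sigma = implied_normal(p5, p50, p95)
            L *= normal_pdf(x_true, mu, sigma)  # joint likelihood over seeds
        likelihoods.append(L)
    total = sum(likelihoods)
    return [L / total for L in likelihoods]  # normalize to weights
```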
“Classical” Weights
• Cooke RM (1991), Experts in Uncertainty, Oxford University Press, Oxford
• Cooke RM and Goossens LLHJ (2008), “TU Delft Expert Judgment Database”, Reliability Engineering and System Safety, v.93, p.657–674
  • Per study: 7–55 seeds, 6–47 “effective” seeds, 4–77 experts
• Parameters chosen to maximize expert weights
  • Within-sample validation
• “Macro” validation only
  • Based on frequencies across percentiles, pooled over all questions
• Non-parametric, based on the Chi-square distribution
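A compact sketch of the “classical” (Cooke) weights, following the calibration-times-information structure described in Cooke (1991). Details such as the intrinsic-range overshoot are simplified here, and all function names are illustrative; percentiles are assumed to lie strictly inside the stated range.

```python
# Sketch: Cooke-style "classical" weights = calibration * information,
# with weights zeroed below a significance cutoff alpha (simplified).
import math
from scipy.stats import chi2

P_EXPECTED = [0.05, 0.45, 0.45, 0.05]  # mass between the 5/50/95 percentiles

def bin_counts(quantiles, true_values):
    """Count how many realizations fall in each inter-quantile bin."""
    counts = [0, 0, 0, 0]
    for (p5, p50, p95), x in zip(quantiles, true_values):
        if x < p5:
            counts[0] += 1
        elif x < p50:
            counts[1] += 1
        elif x < p95:
            counts[2] += 1
        else:
            counts[3] += 1
    return counts

def calibration(quantiles, true_values):
    """P-value of the chi-square test that the expert is well calibrated."""
    n = len(true_values)
    s = [c / n for c in bin_counts(quantiles, true_values)]
    # 2n * KL(s || p) is asymptotically chi-square with 3 degrees of freedom
    stat = 2 * n * sum(si * math.log(si / pi)
                       for si, pi in zip(s, P_EXPECTED) if si > 0)
    return 1 - chi2.cdf(stat, df=3)

def information(quantiles, lo, hi):
    """Mean KL divergence from a uniform background on [lo, hi] (simplified)."""
    total = 0.0
    for (p5, p50, p95) in quantiles:
        widths = [p5 - lo, p50 - p5, p95 - p50, hi - p95]  # must be positive
        total += sum(p * math.log(p / (w / (hi - lo)))
                     for p, w in zip(P_EXPECTED, widths))
    return total / len(quantiles)

def classical_weights(all_quantiles, true_values, lo, hi, alpha=0.0):
    """Normalized weight per expert; calibration below alpha gets weight 0."""
    raw = []
    for q in all_quantiles:
        c = calibration(q, true_values)
        raw.append(c * information(q, lo, hi) if c >= alpha else 0.0)
    total = sum(raw) or 1.0
    return [w / total for w in raw]
```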
Our Approach
• Monte Carlo simulation with 10 hypothetical questions
• Experts characterized along three dimensions:
  • Bias
  • Precision
  • Confidence
• Multi-Expert Scenario Analysis
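A minimal sketch of how a hypothetical expert might be simulated under this three-way characterization. The parameter names, the normal forms, and the treatment of confidence as a ratio of stated to actual spread are illustrative assumptions, not the authors’ exact simulation code.

```python
# Sketch: simulate one hypothetical expert's answer to a question
# whose true value is 0, given Bias, Precision, and Confidence.
import random

Z_90 = 1.645  # standard-normal z for the 95th percentile

def simulate_expert_answer(bias, precision, confidence, rng):
    """Return (p5, p50, p95) for one question with true value 0.

    bias       : systematic offset of the expert's best estimate
    precision  : sd of the expert's best estimate around (0 + bias)
    confidence : ratio of stated sd to actual sd (here taken as the
                 precision); <1 means over-, >1 means under-confident
    """
    best_estimate = rng.gauss(bias, precision)  # expert's perceived mean
    stated_sd = confidence * precision          # spread the expert reports
    return (best_estimate - Z_90 * stated_sd,
            best_estimate,
            best_estimate + Z_90 * stated_sd)

rng = random.Random(0)
# An "ideal" expert: no bias, unit precision, proper confidence, 10 questions
ideal = [simulate_expert_answer(0.0, 1.0, 1.0, rng) for _ in range(10)]
```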
Characterizing Experts: Bias, Precision, Confidence
[Figure: two normal curves. The first, f_µ, shows how the expert thinks about the mean (i.e., the best estimate): its center µ_mean is offset from the true value (0) by the Bias, and its spread σ_mean is the Precision. The second, f_X, shows the expert’s stated distribution for the variable X, with percentiles X5%, X50%, X95%: its spread σ_X reflects the Confidence, and the likelihood is the stated density evaluated at the true value, L = f_X(0).]
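The figure’s quantities can be written out as follows; this is a hedged reconstruction from the figure labels, with the normal forms assumed as in the analysis.

```latex
% Expert's best estimate of the mean: centered off the true value by the
% Bias, with spread equal to the Precision
\mu_{\mathrm{mean}} \sim N\!\left(X_{\mathrm{True}} + \mathrm{Bias},\; \sigma_{\mathrm{mean}}^{2}\right),
\qquad \mathrm{Precision} \equiv \sigma_{\mathrm{mean}}

% Expert's stated distribution for X, with spread set by the Confidence
X \sim N\!\left(\mu,\; \sigma_{X}^{2}\right),
\qquad \mathrm{Confidence} \equiv \sigma_{X}

% Likelihood = stated density at the true value (taken to be 0)
L = f_X(0) = \frac{1}{\sigma_X\sqrt{2\pi}}\,
    \exp\!\left(-\frac{\mu^{2}}{2\sigma_{X}^{2}}\right)
```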
Multi-Expert Scenario Analysis
• 9 experts, characterized by Bias, Precision, and Confidence
• 10 hypothetical questions (i = 1 to 10)
  • True value XTrue(i) = 0
  • Expert estimate XEstimate(i): X5%, X50%, X95%
• Predictive Error(i) = XTrue(i) − XGuess(i), where XGuess(i) is the weighted aggregate; performance summarized by MSE
• Leave one question out at a time to predict it (cross-validation); determine expert weights using the remaining 9 questions (sketched below)
• Compare weights and predictive error for an assumed group of experts
  • Equal Weights
  • Likelihood Weights
  • “Classical” Weights
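A sketch of the leave-one-out loop: for each question i, fit the weights on the other nine questions, predict question i, and accumulate the squared error. Aggregating by the weighted mean of the experts’ medians is an assumption made here for illustration.

```python
# Sketch: leave-one-out cross-validation of a weighting scheme.
def loo_mse(all_quantiles, true_values, weight_fn):
    """all_quantiles[e][i] = (p5, p50, p95) for expert e, question i."""
    n = len(true_values)
    sq_errors = []
    for i in range(n):
        train_idx = [j for j in range(n) if j != i]
        train_q = [[q[j] for j in train_idx] for q in all_quantiles]
        train_t = [true_values[j] for j in train_idx]
        w = weight_fn(train_q, train_t)  # weights from the other 9 seeds
        # predict question i as the weighted mean of the experts' medians
        guess = sum(wi * q[i][1] for wi, q in zip(w, all_quantiles))
        sq_errors.append((true_values[i] - guess) ** 2)
    return sum(sq_errors) / n            # mean squared predictive error
```

For example, `loo_mse(estimates, trues, likelihood_weights)` would use the earlier likelihood sketch; passing an equal-weights or classical-weights function of the same signature allows the schemes to be compared on the same simulated panel.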
Multi-Expert Scenarios
• Base Case
• Impact of Bias
• Impact of Precision
• Impact of Confidence
• Experts with Bias, Precision, and Confidence all varying
Scenario #1: Base Case
• Model validation: equal weights are assigned to experts with equal skills
Scenario #2: Impact of Bias
• When small and moderate bias is introduced to multiple experts, the weights change to penalize bias (more prominently in the likelihood method)
Scenario #3: Impact of Precision
• When Bias = 0 for all experts and imprecision is introduced to several of them, the weights change to reward precision and penalize imprecision (more prominently in the likelihood method)
Scenario #4: Impact of Confidence
• When Bias = 0 for all experts and over- and under-confidence are introduced to several of them, the weights change to penalize inappropriate confidence (more prominently in the likelihood method for under-confidence)
Scenario #5a: Impact of Precision & Confidence (Bias = 0 for all)
• When Bias = 0 and imprecision and over- and under-confidence are introduced to multiple experts:
  • Weights change to reward the “ideal” expert (more prominently in the likelihood method)
  • For the “Classical” method, proper confidence can somewhat compensate for imprecision; not so for the Likelihood method (imprecise experts are penalized heavily, even if they know they are imprecise)
Scenario #5b: Impact of Precision & Confidence (Bias for all)
• When all experts are biased, and varying amounts of imprecision and improper relative confidence are introduced:
  • Likelihood weights shift to reward relatively precise but under-confident experts
  • Classical weights shift to reward imprecise experts
Scenario #5c: Precision & Confidence (Bias for 3 Experts)
• When there is moderate bias in a subset of “good” experts, and both imprecision and over- and under-confidence are introduced to all:
  • Likelihood rewards the “best” expert heavily
  • Classical spreads the weights much more broadly across experts
Conclusions (1)
• Overall: the Likelihood and “Classical” methods show similar performance (much better than equal weights), but assign very different weights to experts with different degrees of bias, precision, and relative confidence
• Model Check: both assign equal weights to experts with equal skill (equal bias, precision, and relative confidence)
• Bias: both penalize biased experts, with a stronger penalty under Likelihood
• Precision: both penalize imprecise experts, again with a stronger penalty under Likelihood
• Confidence: “Classical” penalizes over- and under-confidence equally; Likelihood penalizes over-confidence by a similar amount, but under-confidence much more heavily
Conclusions (2)
• Precision & Confidence: for “Classical”, proper (or under-) confidence can compensate somewhat for imprecision; not so for Likelihood weights (and over-confidence remains better for Likelihood weighting)
• Future Directions:
  • Consider 3-parameter distributions fit to an expert’s 5th, 50th, and 95th percentile values, to enable a more flexible Likelihood approach
  • Conduct an elicitation in which 2- and 3-parameter likelihood functions are used and compared
  • Consider how new information affects experts’ performance on seed questions (explore VOI for correcting experts’ biases, imprecision, and under- or over-confidence)