Dipak K. Dey University of Connecticut Some parts joint with: Junfeng Liu

Prior Elicitation from Expert Opinion Dipak K. Dey University of Connecticut Some parts joint with: Junfeng Liu Case Western Reserve University

Elicitation • Elicitation is the process of extracting expert knowledge about some unknown quantity of interest, or the probability of some future event, which can then be used to supplement any numerical data that we may have. • If the expert in question does not have a statistical background, as is often the case, translating their beliefs into a statistical form suitable for use in our analyses can be a challenging task.

Introduction • Prior elicitation is an important and yet under-researched component of Bayesian statistics. • In any statistical analysis there will typically be some form of background knowledge available in addition to the data at hand. • For example, suppose we are investigating the average lifetime of a component. We can run tests on a sample of components to learn about their average lifetime, but the designer or engineer of the component may have their own expectations about its performance.

Introduction • If we can represent the expert's uncertainty about the lifetime through a probability distribution, then this additional (prior) knowledge can be utilized within the Bayesian framework. • With a large quantity of data, prior knowledge tends to have less of an effect on final inferences. Given this fact, and the various techniques available for representing prior ignorance, practitioners of Bayesian statistics are frequently spared the effort of thinking about the available prior knowledge.

Introduction • It will not always be the case that we have sufficient data to be able to ignore prior knowledge; examples include uncertainty analysis for computer models and the modeling of extreme events. • Uncertain model input parameters are often assigned probability distributions entirely on the basis of expert judgments. In addition, certain parameters in statistical models can be hard to make inferences about, even with a reasonable amount of data.

Introduction • The amount of research in eliciting prior knowledge is relatively low, and various proposed techniques are often targeted at specific applications. At the same time, recent advances in Bayesian computation have allowed far greater flexibility in modeling prior knowledge. In general, elicitation can be made difficult by the fact that we cannot expect the expert to provide probability distributions for quantities of interest directly.

Introduction • The challenge is then to find appropriate questions to ask the expert in order to extract their knowledge, and then to determine a suitable probabilistic description of the variables we are interested in based on the information we have learned from them.

Motivation • Three approaches: • [1] Direct prior elicitation: relative frequency and quantile based elicitation (Berger, 1985). • [2] Predictive prior probability space, which requires simple priors and may be burdened with additional uncertainties arising from the response model (Kadane et al., 1980; Garthwaite and Dickey, 1988; Al-Awadhi and Garthwaite, 1998; etc.). • [3] Nonparametric elicitation (Oakley and O’Hagan, 2002).

Symmetric Prior Elicitation • Double bisection method: the expert provides the three quartiles q(.25), q(.5) and q(.75). • IQR = q(.75)-q(.25). • Normal prior: let Z(q) be the IQR of the standard normal (about 1.349); then the prior mean and standard deviation are q(.5) and IQR/Z(q), respectively.
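A minimal sketch of the normal-prior rule above, using only the Python standard library. The function name `elicit_normal_prior` and the example quartiles are illustrative assumptions, not part of the original slides.

```python
from statistics import NormalDist

def elicit_normal_prior(q25, q50, q75):
    """Match a normal prior to an expert's quartiles (hypothetical helper).

    mean = q(.5); sd = IQR / Z(q), where Z(q) is the IQR of the standard normal.
    """
    std = NormalDist()  # standard normal, mean 0 and sd 1
    z_iqr = std.inv_cdf(0.75) - std.inv_cdf(0.25)  # Z(q), about 1.349
    iqr = q75 - q25
    return q50, iqr / z_iqr

# Hypothetical expert answers from the double bisection method.
prior_mean, prior_sd = elicit_normal_prior(q25=8.0, q50=10.0, q75=12.0)
print(f"prior mean = {prior_mean:.3f}, prior sd = {prior_sd:.3f}")
```

Only q(.5) and the IQR enter the rule, so an expert whose quartiles are clearly asymmetric around the median is a sign that a symmetric prior may not be appropriate.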

Student’s t Prior • Three non-redundant quantiles are required to estimate the df ν. Kadane et al. (1980) suggested obtaining q(.5), q(.75) and q(.9375). • The ratio a(x) = (q(.9375)-q(.5))/(q(.75)-q(.5)) depends on the df ν only. • The df is determined from a look-up table of a(x) vs. df ν.

Student’s t Prior • After elicitation of the df, obtain t_{ν,0.75}, the .75 quantile of the standard Student’s t with ν df. • Calculate S(q) = (q(.75)-q(.5))^2 / t_{ν,0.75}^2 for elicitation of the scale parameter σ (S(q) estimates σ^2). • This idea can be applied to any general location-scale family.
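The look-up-table step can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: it replaces the table with bisection on ν, computes Student's t quantiles by Simpson integration of the density (standard library only), and restricts ν to [1, 100]. All function names and the example quantiles are hypothetical.

```python
import math

def t_pdf(x, nu):
    """Density of the standard Student's t with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

def t_cdf(x, nu, n=200):
    """CDF at x >= 0: 0.5 plus a composite Simpson integral of the density on [0, x]."""
    h = x / n
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3

def t_quantile(p, nu):
    """Quantile for 0.5 < p < 1, found by bracketing and then bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < p:
        hi *= 2
    for _ in range(30):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def elicit_t_prior(q50, q75, q9375, nu_lo=1.0, nu_hi=100.0):
    """Return (location, scale, df) from the three elicited quantiles."""
    target = (q9375 - q50) / (q75 - q50)  # the ratio a(x), free of location and scale

    def ratio(nu):
        return t_quantile(0.9375, nu) / t_quantile(0.75, nu)

    if target >= ratio(nu_lo):      # tails heavier than the search range allows
        nu = nu_lo
    elif target <= ratio(nu_hi):    # essentially normal tails
        nu = nu_hi
    else:
        lo, hi = nu_lo, nu_hi
        for _ in range(30):
            mid = (lo + hi) / 2
            if ratio(mid) > target:  # the ratio decreases as nu grows
                lo = mid
            else:
                hi = mid
        nu = (lo + hi) / 2
    scale = (q75 - q50) / t_quantile(0.75, nu)  # square root of S(q)
    return q50, scale, nu

# Hypothetical expert quantiles.
loc, scale, nu = elicit_t_prior(q50=10.0, q75=11.45, q9375=13.7)
print(f"location = {loc:.2f}, scale = {scale:.3f}, df = {nu:.2f}")
```

Clamping ν at the ends of the search range is a design choice of this sketch; a ratio below the normal limit (about 2.27) would indicate tails lighter than any Student's t.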

Lognormal Prior • Garthwaite (1989) used the split-normal distribution; O’Hagan (1998) used the 1/6, 3/6 and 5/6 quantiles. • Proposition: If X has a log-normal distribution, i.e., log X ~ N(µ, σ^2), then the variance σ^2 = [log(q(.75)/q(.25)) / Z(q)]^2 and the mean µ = log q(.5), where Z(q) is the IQR for the standard normal distribution and q(.5) is the median of X.
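Because log X is normal when X is log-normal, the quartile rule for the normal prior carries over on the log scale. The sketch below illustrates that fact; the helper name `elicit_lognormal_prior` and the example quartiles are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def elicit_lognormal_prior(q25, q50, q75):
    """Return (mu, sigma) of log X from an expert's quartiles of X (hypothetical helper)."""
    std = NormalDist()
    z_iqr = std.inv_cdf(0.75) - std.inv_cdf(0.25)  # IQR of the standard normal
    mu = math.log(q50)                     # the median of X is exp(mu)
    sigma = math.log(q75 / q25) / z_iqr    # IQR of log X divided by Z(q)
    return mu, sigma

# Hypothetical expert quartiles for a positive quantity.
mu, sigma = elicit_lognormal_prior(q25=50.0, q50=100.0, q75=200.0)
print(f"mu = {mu:.4f}, sigma = {sigma:.4f}")

# Consistency check: an exactly log-normal expert satisfies q50**2 == q25 * q75.
print("geometric symmetry gap:", 100.0**2 - 50.0 * 200.0)
```

The three quartiles overdetermine the two parameters, so the gap in the last line is a cheap check of whether the log-normal family fits the expert's answers.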

Direct Prior Elicitation (1) Simple and limited prior family with only location and scale parameters (normal, exponential, etc.) (2) Location-scale-shape (µ-σ-λ) parameter joint elicitation (gamma, skew-normal, Student’s t, etc.)

Symmetric and Asymmetric Priors • Normal: location-scale, symmetric • Student’s t: no location-scale but shape, symmetric • Log-normal: location-scale, asymmetric • Skew-normal: location-scale-shape, asymmetric • Normal-exponential: location-scale-shape, asymmetric • Skew-Student’s t: location-scale-shape, asymmetric

Shape Parameter Elicitation This is the most challenging part. Presumably, the interquantile range ratio, IQRR = [q(.75)-q(.5)]/[q(.5)-q(.25)], is a monotone function of the shape parameter. We have two cases: (1) The shape parameter is in the non-sensitive region, absolute value larger than 1. (2) The shape parameter is in the sensitive region, absolute value smaller than 1.

Nonsensitive and sensitive regions (Skew-normal) [Figure: IQRR (interquantile range ratio) vs. shape parameter, with the non-sensitive and sensitive regions marked]

Shape Parameter Sensitive Region: Gamma Case

Parameter Elicitation Guideline: The elicitation input is IQRR and the hyperparameter is the shape parameter λ. We prefer a moderate sensitivity index (SI): hyperparameter change / elicitation input change, SI = ∂(IQRR)/∂(λ). We look for SI close to 1. Sensitive region: shape parameter is small in magnitude.

Parameter Elicitation on Shape Parameter Non-Sensitive Region (1) Elicit the shape parameter λ from the plot of IQRR(λ) vs. λ. (2) Scale parameter σ = IQR/IQR(λ), where IQR is the interquantile range from the expert and IQR(λ) is the standardized IQR with the elicited λ from (1), σ = 1 and µ = 0. (3) The location parameter is µ = Q(0.75) - σ Q(0.75,λ), where Q(0.75) is the .75 quantile from the expert, σ comes from (2), and Q(0.75,λ) is the standardized .75 quantile with the elicited λ from (1), σ = 1 and µ = 0.

Note: The sensitivity index in “IQR(λ) vs. λ” and “Q(0.75,λ) vs. λ” is usually moderate.
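The three steps above can be sketched for the skew-normal family. This is an illustration under stated assumptions rather than the authors' code: the plot look-up in step (1) is replaced by bisection on λ, IQRR is assumed to increase with λ, and standardized quantiles are computed by Simpson integration of the density 2φ(x)Φ(λx). All function names and the example quartiles are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)

def sn_pdf(x, lam):
    """Standardized skew-normal density 2*phi(x)*Phi(lam*x)."""
    phi = math.exp(-0.5 * x * x) / SQRT2PI
    big_phi = 0.5 * (1.0 + math.erf(lam * x / SQRT2))
    return 2.0 * phi * big_phi

def sn_cdf(x, lam, a=-8.0, n=200):
    """CDF by composite Simpson integration of the density on [a, x]."""
    if x <= a:
        return 0.0
    h = (x - a) / n
    total = sn_pdf(a, lam) + sn_pdf(x, lam)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * sn_pdf(a + i * h, lam)
    return total * h / 3

def sn_quantile(p, lam):
    """Standardized quantile Q(p, lam) by bisection on [-8, 8]."""
    lo, hi = -8.0, 8.0
    for _ in range(30):
        mid = (lo + hi) / 2
        if sn_cdf(mid, lam) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def iqrr(lam):
    """IQRR(lam) = [q(.75)-q(.5)] / [q(.5)-q(.25)] for the standardized density."""
    q25, q50, q75 = (sn_quantile(p, lam) for p in (0.25, 0.5, 0.75))
    return (q75 - q50) / (q50 - q25)

def elicit_skew_normal(Q25, Q50, Q75, lam_lo=-10.0, lam_hi=10.0):
    """Return (mu, sigma, lam) from the expert's quartiles."""
    target = (Q75 - Q50) / (Q50 - Q25)
    lo, hi = lam_lo, lam_hi
    for _ in range(30):  # step (1): solve IQRR(lam) = target
        mid = (lo + hi) / 2
        if iqrr(mid) < target:  # assumes IQRR increases with lam
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    q25, q75 = sn_quantile(0.25, lam), sn_quantile(0.75, lam)
    sigma = (Q75 - Q25) / (q75 - q25)  # step (2): IQR / IQR(lam)
    mu = Q75 - sigma * q75             # step (3): Q(0.75) - sigma * Q(0.75, lam)
    return mu, sigma, lam

# Hypothetical expert quartiles with a longer upper tail.
mu, sigma, lam = elicit_skew_normal(Q25=1.5, Q50=2.3, Q75=3.3)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, lambda = {lam:.3f}")
```

If the expert's IQRR lies outside the range the family can attain (roughly 0.75 to 1.34 for the skew-normal), the bisection simply runs to the edge of the search interval, which is a signal to consider a different family.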

Approximate Scale Parameter Elicitation from Taylor’s Expansion (1: Basics) General approach for any location, scale and shape family: [1] g(*) is the characteristic point of the density f(x|µ,σ,λ), say mean, median, mode, etc. [2] g(*) = µ+σg(λ), where g(λ) is the standardized characteristic point. [3] f(g(*)|µ,σ,λ) = (1/σ)f(g(λ)|0,1,λ).

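Approximate Scale Parameter Elicitation from Taylor’s Expansion (2: Method) Expand the CDF F around g(*): (1) F(q(.75)) = F(g(*)) + f(g(*)|µ,σ,λ)[q(.75)-g(*)] + ... (2) F(q(.25)) = F(g(*)) + f(g(*)|µ,σ,λ)[q(.25)-g(*)] + ... Taking (1)-(2) and keeping only the first 2 terms on each right hand side, we get p = F(q(.75))-F(q(.25)) ≈ f(g(*)|µ,σ,λ) IQR = (1/σ)f(g(λ)|0,1,λ) IQR, with p = 0.5. Ignoring the higher-order terms, the approximate scale parameter is σ ≈ IQR f(g(λ)|0,1,λ)/p.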

Relative Error in Student’s t Prior Elicitation (1: Values) From Taylor’s expansion, the approximate scale is σ ≈ (IQR/p) Γ((v+1)/2)/[√(vπ) Γ(v/2)], and the exact scale is σ = IQR/(2 t_{v,0.75}), where [1] v is the degrees of freedom, [2] IQR is the interquantile range from the expert, [3] p = 0.5, and [4] t_{v,0.75} is the .75 quantile of Student’s t distribution with v degrees of freedom.

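Approximate Scale Parameter Elicitation from Taylor’s Expansion (3: Relative Error) Now (1)-(2): denote R(λ) = f(g(λ)|0,1,λ) IQR(λ)/p (only related to λ), the ratio of the approximate scale IQR f(g(λ)|0,1,λ)/p to the exact scale IQR/IQR(λ). The relative error is R(λ)-1.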

Relative Error in Student’s t Prior Elicitation (2: Plot) (1) "approximate" represents the Taylor expansion value σ ≈ (IQR/p) Γ((v+1)/2)/[√(vπ) Γ(v/2)]; (2) "exact" represents the exact value σ = IQR/(2 t_{v,0.75}); (3) "normal" represents σ = IQR/Z(q), with Z(q) as the interquantile range for the standardized normal distribution. The ratio (1):(2) approaches 1.0763 as v goes to infinity.
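The limiting ratio 1.0763 can be checked numerically. The sketch below assumes the approximate scale is IQR·f(0)/p with p = 0.5 (f(0) being the standardized density at its center) and the exact scale is IQR/(2·q), where q is the standardized .75 quantile. That reading is an assumption, adopted here because it reproduces the stated limit. Closed-form .75 quantiles are used for v = 1 and v = 2, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def t_density_at_zero(v):
    """Standard Student's t density at 0: Gamma((v+1)/2) / (sqrt(v*pi) * Gamma(v/2))."""
    return math.exp(math.lgamma((v + 1) / 2) - math.lgamma(v / 2)) / math.sqrt(v * math.pi)

def approx_to_exact_ratio(f0, q75, p=0.5):
    """(IQR * f0 / p) divided by (IQR / (2 * q75)); the elicited IQR cancels."""
    return f0 * 2.0 * q75 / p

std = NormalDist()
ratios = {
    # v = 1 is the Cauchy distribution, whose .75 quantile is exactly 1.
    "v=1": approx_to_exact_ratio(t_density_at_zero(1), 1.0),
    # v = 2 has the closed-form quantile (2p-1)/sqrt(2p(1-p)) at p = .75.
    "v=2": approx_to_exact_ratio(t_density_at_zero(2), 0.5 / math.sqrt(0.375)),
    # v -> infinity is the standard normal.
    "normal limit": approx_to_exact_ratio(std.pdf(0.0), std.inv_cdf(0.75)),
}
for label, value in ratios.items():
    print(f"{label}: approximate/exact = {value:.4f}, relative error = {value - 1:.4f}")
```

Under these assumptions the first-order approximation overstates the scale by about 27% for v = 1, 15% for v = 2, and 7.6% in the normal limit, so the relative error shrinks as the tails get lighter.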

An Important Observation When shape parameter is highly sensitive to IQRR, the approximate scale parameter elicitation by Taylor’s expansion will be very stable in terms of relative error.

Elicitation of Shape Parameter on Sensitive Region (Skew-normal, Iteration on characteristic points) Iteration based on Taylor’s expansion at median , mode or mean . (1) Start with current l, from high-proportional- fidelity by Taylor expansion, we have (2) The skew(shape) parameter can be obtained by plotting (3) Go to (1) until convergence (complete and ) (4) Location parameter

Elicitation on Shape Parameter Sensitive Region (Skew-normal, Iteration on IQRs) Iteration based on IQRs (1) Start with current , we look up , then (2) The skew (shape) parameter can be obtained by plot Since (3) Go to (1) until convergence (complete and ) (4) Location parameter

Graphical Comparison 1 (reference: IQR based iteration)

Graphical Comparison 2 (reference: median based iteration)

Graphical Comparison 3 (reference: mean based iteration)

Graphical Comparison 4 (reference: mode based iteration)

Another Important Observation The IQR based iteration is close to the mean based iteration for the skew-normal case, since the mean is explicit rather than numerically solved.

Non-Parametric Prior Elicitation • To estimate prior density directly such that • , Suppose, = parametric family of distributions, where = vector of hyper parameters = underlying parameters in

Non-Parametric Prior Elicitation =(correlation function) = 1 if decreasing function of otherwise. ensures that prior variance covariance matrix of any set of observation or functional of is positive semi-definite.

Choice of Covariance function specifies the true density function. controls smoothness of the density. b large implies is large.

Hierarchical prior (Gaussian Process Prior) Special Case : then Then Prior:

Let D = elicited summaries relating to = {data} • H is a function of • A and is a function of

This implies, with

Posterior n = # of elements in D use MCMC to obtain samples from

Other Choices of Centering a) b) c) Gamma or Log-normal etc. d)

Side Conditions • Given derivatives or quantiles, D will be changed accordingly. In fact, D can incorporate all the constraints specified in the prior, e.g., moments.

Psychological Perspective of Imprecise Subjective Probabilities • Numerical probability estimates (N) • Ranges of numerical values (R) • Verbal phrases (V) • Objective: translate the triplet (N,R,V) to a decision maker’s model

Imprecisely Assessed Distributions Contamination: Class of Bi-modal distribution

Future problems • Prior elicitation in extreme value modeling • Quantile and graphical approaches for the GEV model, Coles and Powell (1996) • Prior elicitation for short- and long-tailed distributions • Spatial modeling • High dimensional modeling

References
1. Daneshkhah, A. (2004). Psychological Aspects Influencing Elicitation of Subjective Probability. BEEP working paper.
2. Dey, D. K. and Liu, J. (2007). A quantitative study of quantile based direct prior elicitation from expert opinion. Bayesian Analysis, 2, 137-166.
3. Garthwaite, P. H., Kadane, J. B. and O'Hagan, A. (2005). Statistical methods for eliciting probability distributions. Journal of the American Statistical Association, 100, 680-701.
4. Jenkinson, D. (2005). The Elicitation of Probabilities: A Review of the Statistical Literature. BEEP working paper.
5. Kadane, J. B., Dickey, J. M., Winkler, R. L., Smith, W. S. and Peters, S. C. (1980). Interactive elicitation of opinion for a normal linear model. Journal of the American Statistical Association, 75, 845-854.
6. Oakley, J. and O'Hagan, A. (2005). Uncertainty in prior elicitations: a non-parametric approach. Revised version of Research Report No. 521/02, Department of Probability and Statistics, University of Sheffield.
7. O'Hagan, A. (2005). Research in elicitation. Research Report No. 557/05, Department of Probability and Statistics, University of Sheffield. Invited article for a volume entitled Bayesian Statistics and its Applications.
8. O'Hagan, A., Buck, C. E., Daneshkhah, A., Eiser, J. E., Garthwaite, P. H., Jenkinson, D. J., Oakley, J. E. and Rakow, T. (2006). Uncertain Judgements: Eliciting Expert Probabilities. To be published by John Wiley and Sons in July 2006.

THANK YOU
