
Emulation, Elicitation and Calibration



  1. Emulation, Elicitation and Calibration
  UQ12 Minitutorial
  Presented by: Tony O’Hagan, Peter Challenor, Ian Vernon

  2. Outline of the minitutorial
  Three sessions of about 2 hours each
  • Session 1: Monday, 2pm – 4pm, State C
    • Overview of UQ; total UQ; introduction to emulation; elicitation
  • Session 2: Tuesday, 2pm – 4pm, State C
    • Building and using an emulator; sensitivity analysis
  • Session 3: Wednesday, 2pm – 4pm, State C
    • Calibration and history matching; galaxy formation case study
  Intended to introduce the applied maths/engineering UQ community to UQ methods developed in the statistics community

  3. Session 1
  Introduction and elicitation

  4. Outline
  • Introduction
    • UQ and Total UQ
    • Managing uncertainty
    • A brief case study
    • Emulators
  • Elicitation
    • Elicitation principles
    • Elicitation practice

  5. UQ and Total UQ

  6. What is UQ?
  • Uncertainty quantification
    • A term that seems to have been devised by engineers
    • Faced with uncertainty in some particular kinds of analyses
    • Characterising how uncertainty about inputs to a complex computer model induces uncertainty about outputs
    • Large body of work in engineering and applied maths
  • Uncertainty quantification
    • What statisticians do!
    • And have always done
    • In every field of application, for all kinds of analyses
    • In particular, statisticians have developed methods for propagating and quantifying output uncertainty
    • And lots more relating to the use of complex simulation models

  7. Simulators
  • In almost all fields of science, technology, industry and policy making, people use mechanistic models
    • For understanding, prediction, control
    • Huge variety
  • A model simulates a real-world, usually complex, phenomenon as a set of mathematical equations
  • Models are usually implemented as computer programs
  • We will refer to a computer implementation of a model as a simulator

  8. Why worry about uncertainty?
  • Simulators are increasingly being used for decision-making
    • Taking very seriously the implied claim that the simulator represents and predicts reality
    • How accurate are model predictions?
  • There is growing concern about uncertainty in model outputs
    • Particularly where simulator predictions are used to inform scientific debate or environmental policy
    • Are their predictions robust enough for high-stakes decision-making?

  9. For instance …
  • Models for climate change produce different predictions for the extent of global warming or other consequences
    • Which ones should we believe?
    • What error bounds should we put around these?
    • Are simulator differences consistent with the error bounds?
  • Until we can answer such questions convincingly, why should anyone have faith in the science?

  10. The simulator as a function
  • In order to talk about the uncertainty in model predictions we need some simple notation
  • Using computer language, a simulator takes a number of inputs and produces a number of outputs
  • We can represent any output y as a function y = f(x) of a vector x of inputs

  11. Where is the uncertainty?
  • How might the simulator output y = f(x) differ from the true real-world value z that the simulator is supposed to predict?
  • Error in inputs x
    • Initial values
    • Forcing inputs
    • Model parameters
  • Error in model structure or solution
    • Wrong, inaccurate or incomplete science
    • Bugs, solution errors

  12. Quantifying uncertainty
  • The ideal is to provide a probability distribution p(z) for the true real-world value
    • The centre of the distribution is a best estimate
    • Its spread shows how much uncertainty about z is induced by the uncertainties on the previous slide
  • How do we get this?
    • Input uncertainty: characterise p(x), propagate through to p(y)
    • Structural uncertainty: characterise p(z – y)

  13. More uncertainties
  • It is important to recognise two more uncertainties that arise when working with simulators
  • The act of propagating input uncertainty is imprecise
    • Approximations are made
    • Introducing additional code uncertainty
  • A key task in managing uncertainty is to use observations of the real world to tune or calibrate the model
    • We need to acknowledge uncertainty due to measurement error

  14. Code uncertainty – Monte Carlo
  • The simplest way to propagate uncertainty is Monte Carlo
    • Take a large random sample of realisations from p(x)
    • Run the simulator at each sampled x to get a sample of outputs
    • This is a random sample from p(y)
    • E.g. the sample mean estimates E(Y)
  • Even with a very large sample, MC computations are not exact
    • The sample is an approximation of the population
    • The standard error of the sample mean is the population s.d. over root n
    • This is code uncertainty
  • MC has a built-in statistical quantification of code uncertainty (see the sketch below)
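
  To make this concrete, here is a minimal Monte Carlo propagation sketch in Python. The toy simulator f, the input distribution and the sample size are all assumptions chosen for illustration; a real application would substitute its own model and its elicited p(x).

    import numpy as np

    rng = np.random.default_rng(0)

    def f(x):
        # Hypothetical toy simulator: a stand-in for an expensive computer model.
        return np.sin(x[0]) + 0.5 * x[1] ** 2

    # Take a large random sample of realisations from p(x)
    # (here, two independent normals: an assumed input distribution).
    n = 10_000
    xs = rng.normal(loc=[0.5, 1.0], scale=[0.1, 0.2], size=(n, 2))

    # Run the simulator at each sampled x: a random sample from p(y).
    ys = np.array([f(x) for x in xs])

    # The sample mean estimates E(Y); its standard error (sample s.d.
    # over root n) is the built-in quantification of code uncertainty.
    mean = ys.mean()
    se = ys.std(ddof=1) / np.sqrt(n)
    print(f"E(Y) is approximately {mean:.4f} +/- {se:.4f} (MC standard error)")

  Halving the standard error requires four times as many runs, which is why plain Monte Carlo becomes expensive for slow simulators.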

  15. Code uncertainty – alternatives to MC
  • MC is impractical for simulators that require significant resources, so other methods have been developed
  • Polynomial chaos methods
    • PC expansions are always truncated
    • The truncation error is where the main code uncertainty lies
    • Also in solving the Galerkin equations
  • Surrogate models (e.g. emulators)
    • Approximations to the true f(.)
    • Code uncertainty lies in the approximation error

  16. How to quantify uncertainty
  • To quantify uncertainty in the true real-world value that the simulator is trying to predict we need the following steps (all four appear in the sketch after this slide)
    • Quantify uncertainty in inputs, p(x)
    • Propagate to uncertainty in output, p(y)
    • Quantify and account for code uncertainty
    • Quantify and account for model discrepancy uncertainty
  • Engineering/applied maths UQ apparently only deals with the second step
    • Ironically, this is the one step that doesn’t actually involve quantifying uncertainty!
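
  A sketch of all four steps, extending the Monte Carlo example above; the discrepancy distribution p(z – y) used here (a zero-mean normal with s.d. 0.1) is a purely illustrative assumption.

    import numpy as np

    rng = np.random.default_rng(1)

    def f(x):
        # Same hypothetical toy simulator as in the earlier sketch.
        return np.sin(x[0]) + 0.5 * x[1] ** 2

    n = 10_000
    # Step 1: quantify input uncertainty p(x) (assumed normals).
    xs = rng.normal(loc=[0.5, 1.0], scale=[0.1, 0.2], size=(n, 2))
    # Step 2: propagate to output uncertainty p(y).
    ys = np.array([f(x) for x in xs])
    # Step 3: quantify code uncertainty; for plain MC, the standard error.
    se = ys.std(ddof=1) / np.sqrt(n)
    # Step 4: account for model discrepancy by adding a draw from p(z - y).
    zs = ys + rng.normal(0.0, 0.1, size=n)
    print(f"p(z): mean {zs.mean():.3f}, s.d. {zs.std(ddof=1):.3f}; "
          f"MC standard error on the mean: {se:.4f}")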

  17. Total UQ
  • Here are my key demands
    • UQ for any quantity of interest must quantify all components of uncertainty
    • All UQ must be in the form of explicit, quantified probability distributions
    • All quantifications of uncertainty should be credible representations of what is, and is not, known
  • None of this is easy but we should at least try
  • I call these aspirations the Total UQ Manifesto

  18. Managing uncertainty

  19. UQ is not enough
  • The presence of uncertainty creates several important tasks
    • Engineering/applied maths UQ addresses only one of these
  Managing uncertainty
  • Uncertainty analysis – how much uncertainty do we have?
    • This is the basic UQ task
  • Sensitivity analysis – which sources of uncertainty drive overall uncertainty, and how?
    • Understanding the system, prioritising research
  • Calibration – how can we reduce uncertainty?
    • Use of observations
    • Tuning, data assimilation, history matching, inverse problems
    • Experimental design

  20. Decision-making under uncertainty – can we cope with uncertainty?
  • Robust engineering design
  • Optimisation under uncertainty

  21. MUCM
  • Managing Uncertainty in Complex Models
    • Large 4-year UK research grant
    • June 2006 to September 2010
    • 7 postdoctoral research associates, 4 project PhD students
    • Objective: to develop BACCO (Bayesian analysis of computer code output) methods into a basic technology, usable and widely applicable
  • MUCM2: New directions for MUCM
    • Smaller 2-year grant to September 2012
    • Scoping and developing research proposals

  22. Primary MUCM deliverables
  • Methodology and papers moving the technology forward
    • Papers both in statistics and application-area journals
  • The MUCM toolkit
    • Documentation of the methods and how to use them
    • With emphasis on what is found to work reliably across a range of modelling areas
    • Web-based
  • Case studies
    • Three substantial case studies
    • Showcasing methods and best practice
    • Linked to the toolkit
  • Events
    • Workshops – conceptual and hands-on
    • Short courses
    • Conferences – UCM 2010 and UCM 2012 (July 2-4)

  23. Focus on the toolkit
  • The toolkit is a ‘recipe book’
    • The good sort that encourages you to experiment
    • There are recipes (procedures) but also lots of explanation of concepts and discussion of choices
  • It is not a software package
    • Software packages are great if they are in your favourite language
    • But it probably wouldn’t be!
    • Packages are dangerous without basic understanding
  • The purpose of the toolkit is to build that understanding
    • And it enables you to easily develop your own code

  24. Resources
  • Introduction to emulators
    • O'Hagan, A. (2006). Bayesian analysis of computer code outputs: a tutorial. Reliability Engineering and System Safety 91, 1290-1300.
  • The MUCM website
    • http://mucm.ac.uk
  • The MUCM toolkit
    • http://mucm.ac.uk/toolkit
  • The UCM 2012 conference
    • http://mucm.ac.uk/UCM2012.html

  25. This minitutorial
  • This minitutorial covers the key elements of Total UQ and uncertainty management
  • Emulators
    • Surrogate models that include quantification of code uncertainty
    • Brief outline in this session, then details in session 2
  • Elicitation
    • Tools for rigorous quantification of fundamental uncertainties
    • Introduction to this big field in this session
  • Management tools
    • Sensitivity analysis in session 2
    • Calibration and history matching in session 3

  26. A brief case study
  Complex emulation and expert elicitation were essential components of this exercise

  27. Example: UK carbon flux in 2000
  • Vegetation model predicts carbon exchange from each of 700 pixels over England & Wales in 2000
    • Principal output is Net Biosphere Production (NBP)
  • Accounting for uncertainty in inputs
    • Soil properties
    • Properties of different types of vegetation
    • Land usage
  • Also code uncertainty
    • But not structural uncertainty
  • Aggregated to England & Wales total
    • Allowing for correlations
    • Estimate 7.46 Mt C (± 0.54 Mt C)

  28. Maps [figure: maps of the mean NBP and its standard deviation over England & Wales]

  29. England & Wales aggregate [figure]

  30. Emulators

  31. So far, so good, but …
  • In principle, Total UQ is straightforward
  • In practice, there are many technical difficulties
    • Formulating uncertainty on inputs – elicitation of expert judgements
    • Propagating input uncertainty
    • Modelling structural error
    • Anything involving observational data! (the last two are intricately linked)
    • And computation

  32. The problem of big models
  • Tasks like uncertainty propagation and calibration require us to run the simulator many times
  • Uncertainty propagation
    • Implicitly, we need to run f(x) at all possible x
    • Monte Carlo works by taking a sample of x from p(x)
    • Typically needs thousands of simulator runs
  • Calibration
    • Traditionally done by searching x space for good fits to the data
  • Both become impractical if the simulator takes more than a few seconds to run
    • 10,000 runs at 1 minute each take a week of computer time
  • We need a more efficient technique

  33. More efficient methods
  • This is what UQ theory is mostly about
  • Engineering/applied maths UQ
    • Polynomial chaos expansions of random variables
    • Approximate by truncating (see the sketch below)
    • Thereby build an expansion of the outputs
    • Compute by Monte Carlo etc. using this surrogate representation
  • Statistics UQ
    • Gaussian process emulation of the simulator
    • A different kind of surrogate
    • Propagate input uncertainty through the surrogate
    • By Monte Carlo or analytically
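
  For a feel of the polynomial chaos route, here is a minimal sketch for a single standard normal input, using probabilists’ Hermite polynomials; the toy output function and the truncation order are assumptions chosen for illustration, not any particular package’s approach.

    import math
    import numpy as np
    from numpy.polynomial import hermite_e as He

    def f(xi):
        # Hypothetical toy output as a function of one standard normal input.
        return np.exp(xi)

    # Gauss-Hermite quadrature for the weight exp(-xi^2 / 2); dividing the
    # weights by sqrt(2*pi) turns the sums into expectations under N(0, 1).
    nodes, weights = He.hermegauss(40)
    weights = weights / np.sqrt(2.0 * np.pi)

    # PC coefficients c_n = E[f(xi) He_n(xi)] / n!, truncated at order N.
    N = 5
    coeffs = []
    for n in range(N + 1):
        basis = np.zeros(n + 1)
        basis[n] = 1.0  # select the degree-n Hermite polynomial He_n
        Hn = He.hermeval(nodes, basis)
        coeffs.append(float((weights * f(nodes) * Hn).sum()) / math.factorial(n))

    # Orthogonality of the He_n gives mean and variance from the coefficients.
    mean = coeffs[0]
    var = sum(c * c * math.factorial(n) for n, c in enumerate(coeffs) if n > 0)
    print(f"PC order {N}: mean {mean:.6f}, variance {var:.6f}")
    print(f"Exact:      mean {math.exp(0.5):.6f}, "
          f"variance {(math.e - 1) * math.e:.6f}")

  The gap between the truncated and exact variances is the truncation error, which slide 15 identified as the main home of code uncertainty in PC methods.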

  34. Gaussian process representation
  • More efficient approach
    • First work in early 1980s (DACE)
  • Represent the code as an unknown function
    • f(.) becomes a random process
    • We generally represent it as a Gaussian process (GP)
    • Or its second-order moment version
  • Training runs
    • Run the simulator for a sample of x values
    • Condition the GP on the observed data
    • Typically requires many fewer runs than Monte Carlo
    • And the x values don’t need to be chosen randomly

  35. Emulation
  • The analysis is completed by prior distributions for, and posterior estimation of, hyperparameters
  • The posterior distribution is known as an emulator of the computer simulator
    • The posterior mean estimates what the simulator would produce for any untried x (prediction)
    • With uncertainty about that prediction given by the posterior variance
    • Correctly reproduces the training data
  • Gets its UQ right!
    • An essential requirement of credible quantification

  36. [figure: emulator of a one-input, one-output simulator after 2 code runs]
  Consider one input and one output: the emulator estimate interpolates the data, and emulator uncertainty grows between data points

  37. [figure: the same emulator after 3 code runs]
  Adding another point changes the estimate and reduces uncertainty

  38. [figure: the same emulator after 5 code runs]
  And so on
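
  The behaviour in these figures can be reproduced with a few lines of linear algebra. Below is a minimal Gaussian process emulator sketch in Python; the one-input toy simulator, the squared-exponential covariance and the fixed hyperparameters are all assumptions for illustration (a real emulator would estimate the hyperparameters, as discussed in session 2).

    import numpy as np

    def cov(a, b, var=1.0, length=0.3):
        # Squared-exponential covariance between 1-D input arrays a and b.
        d = a[:, None] - b[None, :]
        return var * np.exp(-0.5 * (d / length) ** 2)

    def emulate(x_train, y_train, x_new, jitter=1e-10):
        # Condition a zero-mean GP on the training runs. The posterior mean
        # interpolates the data; the posterior s.d. (code uncertainty) is
        # (near) zero at the training points and grows between them.
        K = cov(x_train, x_train) + jitter * np.eye(len(x_train))
        k = cov(x_train, x_new)
        mean = k.T @ np.linalg.solve(K, y_train)
        var = cov(x_new, x_new) - k.T @ np.linalg.solve(K, k)
        return mean, np.sqrt(np.maximum(np.diag(var), 0.0))

    def f(x):
        # Hypothetical toy simulator standing in for an expensive code.
        return np.sin(3.0 * x) + x

    # Five training runs, as in the last figure; the x values are chosen,
    # not sampled randomly.
    x_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
    y_train = f(x_train)

    # Predict at the training points and at the gaps between them.
    x_new = np.linspace(0.0, 1.0, 9)
    mean, sd = emulate(x_train, y_train, x_new)
    for xn, m, s in zip(x_new, mean, sd):
        print(f"x = {xn:.3f}: emulator {m:+.3f} +/- {2 * s:.3f}")

  The printed ±2 s.d. band collapses at the five training inputs and widens in between, which is exactly the pattern sketched in the figures above.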

  39. Then what?
  • Given enough training data points we can in principle emulate any simulator output accurately
    • So that the posterior variance is small “everywhere”
  • Typically, this can be done with orders of magnitude fewer model runs than traditional methods
    • At least in relatively low-dimensional problems
  • Use the emulator to make inferences about other things of interest
    • E.g. uncertainty analysis, sensitivity analysis, calibration
  • The key feature that distinguishes an emulator from other kinds of surrogate
    • Code uncertainty is quantified naturally
    • And credibly

  40. Elicitation principles

  41. Where do probabilities come from?
  • Consider the probability distribution for a model input
    • Like the hydraulic conductivity K in a geophysical model
  • Suppose we ask an expert, Mary
    • Mary gives a probability distribution for K
  • We might be particularly interested in one probability in that distribution
    • Like the probability that K exceeds 10^-3 (cm/sec)
    • Mary’s distribution says Pr(K > 10^-3) = 0.3

  42. How can K have probabilities?
  • Almost everyone learning probability is taught the frequency interpretation
    • The probability of something is the long-run relative frequency with which it occurs in a very long sequence of repetitions
  • How can we have repetitions of K?
    • It’s a one-off, and will only ever have one value
    • It’s that unique value we’re interested in
  • Mary’s distribution can’t be a probability distribution in that sense
    • So what do her probabilities actually mean?
    • And does she know?

  43. Mary’s probabilities
  • Mary’s probability 0.3 that K > 10^-3 is a judgement
    • She thinks it’s more likely to be below 10^-3 than above
    • So in principle she would bet even money on it
    • In fact she would bet $2 to win $1: such a bet is fair when the probability is 2/3, and her 0.7 exceeds that
  • Her expectation of around 10^-3.5 is a kind of best estimate
    • Not a long-run average over many repetitions
  • Her probabilities are an expression of her beliefs
    • They are personal judgements
    • You or I would have different probabilities
    • We want her judgements because she’s the expert!
  • We need a new definition of probability

  44. Subjective probability
  • The probability of a proposition E is a measure of a person’s degree of belief in the truth of E
    • If they are certain that E is true then Pr(E) = 1
    • If they are certain it is false then Pr(E) = 0
    • Otherwise Pr(E) lies between these two extremes
  • Exercise

  45. Subjective includes frequency
  • The frequency and subjective definitions of probability are compatible
    • If the results of a very long sequence of repetitions are available, they agree
    • Frequency probability equates to the long-run frequency
    • All observers who accept the sequence as comprising repetitions will assign that frequency as their (personal or subjective) probability for the next result in the sequence
  • Subjective probability extends frequency probability
    • But also seamlessly covers propositions that are not repeatable
    • It’s also more controversial

  46. It doesn’t include prejudice etc.!
  • The word “subjective” has derogatory overtones
  • Subjectivity should not admit prejudice, bias, superstition, wishful thinking, sloppy thinking, manipulation ...
  • Subjective probabilities are judgements, but they should be careful, honest, informed judgements
    • As “objective” as possible without ducking the issue
    • Using best practice: formal elicitation methods, Bayesian analysis
  • Probability judgements go along with all the other judgements that a scientist necessarily makes
    • And should be argued for in the same careful, honest and informed way

  47. But people are poor probability judges
  • Our brains evolved to make quick decisions
    • Heuristics are short-cut reasoning techniques
    • They allow us to make good judgements quickly in familiar situations
  • Judgement of probability is not something that we evolved to do well
    • The old heuristics now produce biases
      • Anchoring and adjustment
      • Availability
      • Representativeness
      • The range-frequency compromise
      • Overconfidence

  48. Anchoring and adjustment
  • When asked to make two related judgements, the second is affected by the first
    • The second is judged relative to the first
    • By adjustment away from the first judgement
    • The first is called the anchor
  • Adjustment is typically inadequate
    • The second response ends up too close to the first (the anchor)
  • Anchoring can be strong even when the anchor is obviously not really relevant to the second question

  49. Availability
  • An event is judged more probable if we can quickly bring instances of it to mind
    • Things that are more memorable are deemed more probable
  • High-profile train accidents in the UK lead people to imagine rail travel is more risky than it really is
  • My judgement of the risk of dying from a particular disease will be increased if I know (of) people who have the disease or have died from it

  50. Representativeness
  • An event is considered more probable if the components of its description fit together
    • Even when the juxtaposition of many components is actually improbable
  • “Linda is 31, single, outspoken and very bright. She studied philosophy at university and was deeply concerned with issues of discrimination and social justice. Is Linda …
    • … a bank teller?
    • … a bank teller and active in the feminist movement?”
  • The second is often judged more probable than the first
    • We are a story-telling species
  • This is also called the conjunction fallacy
