A BRIEF OVERVIEW OF SOME STATISTICAL RESEARCH RELATED TO COMPUTER MODELS

Max D. Morris
Dept. of Statistics, Dept. of Industrial and Manufacturing Systems Engineering, Iowa State University

OVERVIEW
• (Very incomplete) summary of some statistical approaches to the empirical study of computer models, organized by:
  – Assessment of relative importance of inputs
  – Quantitative understanding of computer models
  – Relationships between models and their context
• Some “twists” and current points of interest concerning GaSP models
• Potential links to more general problems and algorithms

INPUT ASSESSMENT
• Input sampling plans
  – e.g. Latin hypercube sampling
  – Variance reduction techniques to improve sampling properties of simple statistics
• Uncertainty Analysis
  – Variance propagation via other restricted random samples
  – Where should input uncertainty be reduced so as to most reduce output uncertainty?
• “Quasi-Modeling” based on ranks
  – Standard data analysis techniques after replacing data with ranks
    • “Linearize” monotonic relationships, so that simple modeling techniques work
  – No attempt to predict/approximate output
    • Fairly ad hoc, but often effective
• Dimension-Reduction methods
  – “Screening” to discover the (hopefully small) subset of important inputs
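A minimal sketch of two of the ideas above, Latin hypercube sampling followed by a rank-based look at input importance. It uses only the Python standard library; `toy_model` is a hypothetical stand-in for an expensive computer model, and the sample size and seed are arbitrary choices, not anything taken from the talk.

```python
import math
import random

def latin_hypercube(n, d, rng):
    """Return n points in [0, 1)^d with exactly one point per stratum in each dimension."""
    columns = []
    for _ in range(d):
        # One uniform draw inside each of the n equal-width strata...
        col = [(i + rng.random()) / n for i in range(n)]
        # ...then shuffle so strata are paired at random across dimensions.
        rng.shuffle(col)
        columns.append(col)
    # Transpose the d columns into a list of n points.
    return [list(p) for p in zip(*columns)]

def ranks(values):
    """Ranks 1..n of the values (ties are not handled; continuous data assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def pearson(a, b):
    """Ordinary correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def toy_model(x):
    # Hypothetical model: strongly nonlinear but monotone in x[0],
    # weakly dependent on x[1], and unaffected by x[2].
    return math.exp(4.0 * x[0]) + 0.5 * x[1]

rng = random.Random(0)
design = latin_hypercube(50, 3, rng)
y = [toy_model(x) for x in design]

# Rank correlation of each input with the output: correlation computed on ranks,
# which "linearizes" the monotone relationship in x[0].
rank_corr = [pearson(ranks([x[j] for x in design]), ranks(y)) for j in range(3)]
print(rank_corr)
```

Replacing the data with ranks lets an ordinary correlation coefficient pick up the exponential dependence on the first input; no attempt is made to predict the output itself.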

QUANTITATIVE UNDERSTANDING OF MODELS
• “Meta-models” based on (e.g.) classification/regression trees
  – Focus on partitioning input space into homogeneous sectors
• Flexible (input-)“spatial” modeling
  – Better justified for mimicking unknown functional forms … especially:
• Spatial stochastic process models
  – Most often Gaussian Stochastic Process models (GaSP)
    • Multi-dimensional analogue of time-series modeling
    • Akin to spatial analysis techniques, e.g. Kriging
    • Most applicable when some qualitative assumptions can be made about the model (continuity, smoothness …)
• Design for GaSP models
  • Entropy
    – Information content of statistical model
  • Asymptotic arguments
    – Distance between input vectors
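As a rough illustration of the GaSP (Kriging-type) idea, the sketch below builds an interpolating predictor for a one-dimensional input from a handful of model runs. Everything specific here is an assumption made for illustration: the squared-exponential correlation, the fixed length scale of 0.3, the plug-in constant mean (the sample average rather than a generalized-least-squares estimate), and the sine function standing in for the computer model. Prediction uncertainty, which a full GaSP analysis also provides, is omitted.

```python
import math

def corr(a, b, length=0.3):
    """Squared-exponential correlation between two scalar inputs (assumed form)."""
    return math.exp(-((a - b) / length) ** 2)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    # Back substitution.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / M[i][i]
    return z

def fit_gasp(xs, ys):
    """Return a predictor x -> mean + r(x)' R^{-1} (y - mean)."""
    mean = sum(ys) / len(ys)  # plug-in constant mean (a simplification)
    R = [[corr(a, b) for b in xs] for a in xs]
    weights = solve(R, [y - mean for y in ys])
    def predict(x):
        return mean + sum(w * corr(x, xi) for w, xi in zip(weights, xs))
    return predict

# Five runs of a hypothetical smooth model on an evenly spaced design.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [math.sin(2.0 * math.pi * x) for x in xs]
predict = fit_gasp(xs, ys)

print(predict(0.25), ys[1])  # the predictor reproduces the training runs
print(predict(0.6))          # prediction at an input that was not run
```

Because the computer model is deterministic, no noise term is included, so the predictor passes exactly through the observed runs (up to rounding).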

MODELS AND CONTEXT
• Issues concerning relationships between
  • Model: y = M(x, θ)
  • Reference: yR ← xR
• Calibration: What θ will make y match yR when x matches xR?
  • Statistical issues: “Honest” assessment of uncertainty
  • e.g. groundwater flow, incomplete porosity information
• Validation: (How well / Where) does y match yR when x = xR?
  • Statistical issues: Uncertainty about θ and/or xR
  • Of particular concern with limited reference data, e.g. weapons stockpile
• Inverse problems: What xR led to this yR?
  • Statistical issues: Uncertainty about fidelity of M(·, θ)
  • e.g. subterranean void detection via acoustic pinging
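The calibration question can be sketched as a small least-squares search. This is only a point estimate under illustrative assumptions: the exponential-decay `model`, the synthetic noise-free reference data, and the grid of candidate θ values are all hypothetical, and the sketch does nothing toward the “honest” uncertainty assessment that the slide identifies as the real statistical issue.

```python
import math

def model(x, theta):
    # Hypothetical computer model M(x, theta): exponential decay with rate theta.
    return math.exp(-theta * x)

def discrepancy(theta, x_ref, y_ref):
    """Sum of squared differences between model output and reference data."""
    return sum((model(x, theta) - y) ** 2 for x, y in zip(x_ref, y_ref))

def calibrate(x_ref, y_ref, grid):
    """Return the grid value of theta that best matches the reference data."""
    return min(grid, key=lambda t: discrepancy(t, x_ref, y_ref))

# Synthetic reference data generated from the model itself (no noise, no model bias).
x_ref = [0.0, 0.5, 1.0, 1.5, 2.0]
true_theta = 1.5
y_ref = [model(x, true_theta) for x in x_ref]

grid = [i / 100 for i in range(301)]  # candidate theta values 0.00, 0.01, ..., 3.00
theta_hat = calibrate(x_ref, y_ref, grid)
print(theta_hat)
```

With real reference data the discrepancy would not reach zero, and both measurement error and model inadequacy would have to be accounted for before any statement about θ could be called calibrated.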

“TWISTS” CONCERNING GaSP MODELS
• Use of derivative information
  – Motivated by “augmented” codes (e.g. adjoint-equation form)
• Modeling at interface of statistics and numerical analysis
  – Statistically model an “intermediate” quantity that is:
    • more regular/well-behaved than output
    • highly reliable as a simple predictor of output
  – e.g. truncation errors
• Knowledge/Model/Model/Physical data-merging (e.g. LANL)
• System-of-systems modeling
  – “Patching” a meta-model when one component of a modular system is changed
  – e.g. supply chain models

POSSIBLE LINKS TO MORE GENERAL ALGORITHMS
• Physical model validation
• “Benchmark” comparisons?
• Joint analysis of related models/algorithms
• “Sampling and experimental design for asymptotic analysis”?
• “Statistical methods for assessing convergence”?
• “Species Discovery”, “x-partitioning” (rather than “y-prediction”)
• Success rate / “basin of attraction” questions about random starts
• Experimental design for software reliability assessment, Spatial “bump-hunting”
• “Modeling the relationship between input parameters and performance”?

ABSTRACT
I'll summarize statistical ideas and methods that have been developed for empirical studies involving computer models. The presentation will focus on three general areas: (1) assessment of the relative importance of inputs, (2) quantitative understanding of input-output relationships, and (3) questions involving relationships between models and their context. Most of the specific work to which I'll refer is motivated by problems involving computer models constructed to represent a “reality” of some sort; I'll conclude with some thoughts concerning how these and other statistical ideas might be useful in evaluations of more general models and algorithms.

A FEW REFERENCES
• Dalal, S.R., and C.L. Mallows (1998) “Factor-Covering Designs for Testing Software,” Technometrics 40, 234-243.
  – Brief discussion of how software testing can be viewed as an experimental design problem.
• Dean, A., and S. Lewis (eds.) Screening: Methods for Experimentation in Industry, Drug Discovery, and Genetics, Springer.
  – Recent volume of statistical screening ideas applied in several experimental application areas.
• Diaconis, P. (1988) “Bayesian Numerical Analysis,” Statistical Decision Theory and Related Topics IV, J. Berger, S. Gupta (eds.), Springer-Verlag.
  – Introduction to the idea of what the title says.
• Kennedy, M., and A. O’Hagan (2000) “Predicting the Output from a Complex Computer Code when Fast Approximations are Available,” Biometrika 87, 1-13.
  – Heavily referenced paper on joint analysis of similar models.
• Sacks, J., W. Welch, T. Mitchell, and H. Wynn (1989) “Design and Analysis of Computer Experiments,” Statistical Science 4, 409-423.
  – Early description of using stochastic processes to examine computer models.
• Saltelli, A., K. Chan, and E. Scott (2000) Sensitivity Analysis, John Wiley and Sons.
  – Summary of input sampling techniques to support relatively simple sensitivity analyses.
• Santner, T.J., B.J. Williams, and W.I. Notz (2003) The Design and Analysis of Computer Experiments, Springer, ISBN 0-387-95420-1.
  – General overview of the use of GaSP models in computer experiments.
