Recent advances in Global Sensitivity Analysis techniques S. Kucherenko Imperial College London, UK s.kucherenko@imperial.ac.uk France 2008
Outline
• Introduction to Global Sensitivity Analysis and Sobol’ Sensitivity Indices
• Why are Quasi Monte Carlo methods (Sobol’ sequence sampling) much more efficient than Monte Carlo (random sampling)?
• Effective dimensions and their link with Sobol’ Sensitivity Indices
• Classification of functions based on global sensitivity indices
• Link between Sobol’ Sensitivity Indices and Derivative based Global Sensitivity Measures
• Quasi Random Sampling - High Dimensional Model Representation with polynomial approximation
• Application of parametric GSA for optimal experimental design
Propagation of uncertainty
[Figure: input factors x1, x2, x3, x4, …, xk (xi: input factors) enter the model, which produces the output y]
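Sensitivity Indices (SI)
Consider a model $Y = f(x)$, where $x = (x_1, \dots, x_n)$ is a vector of input variables and $Y$ is the model output.
ANOVA decomposition (HDMR): $f(x) = f_0 + \sum_i f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j) + \dots + f_{12 \dots n}(x_1, \dots, x_n)$
Variance decomposition: $D = \sum_i D_i + \sum_{i<j} D_{ij} + \dots + D_{12 \dots n}$
Sobol’ SI: $S_{i_1 \dots i_s} = D_{i_1 \dots i_s} / D$, so that $\sum_i S_i + \sum_{i<j} S_{ij} + \dots + S_{12 \dots n} = 1$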
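Sobol’ Sensitivity Indices (SI)
• Definition: $S_{i_1 \dots i_s} = D_{i_1 \dots i_s} / D$, where $D_{i_1 \dots i_s}$ are the partial variances and $D$ is the variance
• Requires $2^n$ integral evaluations for calculations
• Sensitivity indices for subsets of variables, $x = (y, z)$: $D_y$ is the sum of all partial variances $D_{i_1 \dots i_s}$ whose indices all belong to $y$
• Introduction of the total variance: $D_y^{tot} = D - D_z$
• Corresponding global sensitivity indices: $S_y = D_y / D$, $S_y^{tot} = D_y^{tot} / D$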
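How to use Sobol’ Sensitivity Indices?
• $S_y^{tot} - S_y$ accounts for all interactions between $y$ and $z$, $x = (y, z)$.
• The important indices in practice are $S_i$ and $S_i^{tot}$
• $S_i^{tot} = 0$: $f(x)$ does not depend on $x_i$;
• $S_i = 1$: $f(x)$ depends only on $x_i$;
• $S_i = S_i^{tot}$ corresponds to the absence of interactions between $x_i$ and other variables
• If $\sum_i S_i = 1$ then the function has additive structure: $f(x) = f_0 + \sum_i f_i(x_i)$
• Fixing unessential variables
• If $S_z^{tot} \ll 1$, then $f(x)$ essentially does not depend on $z$, so it can be fixed
• Complexity reduction, from $n$ to $n - n_z$ variables, where $n_z$ is the number of fixed variables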
Evaluation of Sobol’ Sensitivity Indices
Straightforward use of the ANOVA decomposition requires $2^n$ integral evaluations, which is not practical!
There are efficient formulas for evaluation of Sobol’ Sensitivity Indices (Sobol’ 1990): $D_y = \int f(x)\, f(y, z')\, dx\, dz' - f_0^2$, with $D_y^{tot} = D - D_z$, where $x = (y, z)$ and $z'$ is an independent copy of $z$.
Evaluation is reduced to high-dimensional integration. Monte Carlo method is the only way to deal with such problems.
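The Monte Carlo evaluation of the indices can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulas shown on the slides: it assumes inputs uniform on the unit cube, uses two estimators that are common in the literature (a Saltelli-type estimator for the first-order index and the Jansen estimator for the total index), and the helper name `sobol_indices` is hypothetical. The cost is N(n+2) model evaluations.

```python
import random


def sobol_indices(f, n, N, seed=0):
    """Estimate first-order and total Sobol' indices of f on [0, 1]^n."""
    rng = random.Random(seed)
    # Two independent N x n sample matrices (pseudo-random here, an assumption)
    A = [[rng.random() for _ in range(n)] for _ in range(N)]
    B = [[rng.random() for _ in range(n)] for _ in range(N)]
    fA = [f(x) for x in A]
    fB = [f(x) for x in B]
    f0 = sum(fA) / N
    D = sum((v - f0) ** 2 for v in fA) / N  # total variance
    first, total = [], []
    for i in range(n):
        # Points equal to A except that coordinate i is taken from B
        fABi = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # First-order partial variance; centring f(B) by f0 only reduces noise
        D_i = sum((fb - f0) * (fab - fa)
                  for fa, fb, fab in zip(fA, fB, fABi)) / N
        # Total partial variance (Jansen estimator)
        D_i_tot = sum((fa - fab) ** 2 for fa, fab in zip(fA, fABi)) / (2 * N)
        first.append(D_i / D)
        total.append(D_i_tot / D)
    return first, total


if __name__ == "__main__":
    # Additive test model: the exact values are S = S_tot = (0.2, 0.8)
    S, S_tot = sobol_indices(lambda x: x[0] + 2.0 * x[1], n=2, N=20000)
    print("S     =", S)
    print("S_tot =", S_tot)
```

For an additive model the first-order and total estimates should agree up to sampling noise, which gives a quick sanity check of the implementation.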
Original vs. improved formulae for evaluation of Sobol’ Sensitivity Indices
Improved formula for Sobol’ Sensitivity Indices
Comparison of deterministic and Monte Carlo integration methods
Monte Carlo integration methods
How to improve MC?
Sobol’ sequences vs. random numbers and regular grid
Unlike random numbers, successive Sobol’ points “know” about the position of previously sampled points and fill the gaps between them.
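The gap-filling behaviour can be illustrated with a small sketch. Note the substitution: a true Sobol’ generator needs tables of direction numbers, so this example uses the Halton sequence (van der Corput radical inverses in bases 2 and 3) as a simple stand-in low discrepancy sequence. The function names and the test integrand are assumptions made for illustration.

```python
import random


def radical_inverse(i, base):
    """Van der Corput radical inverse of the integer i in the given base."""
    inv, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        inv += digit / denom
    return inv


def halton(N, bases=(2, 3)):
    """First N points of the Halton sequence (indices 1..N)."""
    return [[radical_inverse(i, b) for b in bases] for i in range(1, N + 1)]


def integrate(f, points):
    """Plain sample average of f over the given points."""
    return sum(f(p) for p in points) / len(points)


if __name__ == "__main__":
    # Each new 1D point lands in the largest gap left by the earlier ones
    print("first 1D points:", [radical_inverse(i, 2) for i in range(1, 8)])

    def f(p):
        return p[0] * p[1]  # exact integral over the unit square is 0.25

    N = 1024
    rng = random.Random(0)
    mc_points = [[rng.random(), rng.random()] for _ in range(N)]
    print("QMC error:", abs(integrate(f, halton(N)) - 0.25))
    print("MC  error:", abs(integrate(f, mc_points) - 0.25))
```

In base 2 the first points are 1/2, 1/4, 3/4, 1/8, 5/8, and so on, which is the "filling the gaps" behaviour the slide describes.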
Quasi random sequences
What is the optimal way to arrange N points in two dimensions?
[Figure panels: Regular Grid; Sobol’ Sequence]
Low-dimensional projections of low discrepancy sequences are better distributed than higher-dimensional projections.
Comparison between Sobol’ sequences and random numbers
Normally distributed Sobol’ Sequences
Uniformly distributed Sobol’ sequences can be transformed to any other distribution with a known distribution function.
[Figure panels: Normal probability plots; Histograms]
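The transformation can be sketched with the inverse distribution function. In this illustration a one-dimensional van der Corput sequence stands in for the first coordinate of a low discrepancy sequence, and `statistics.NormalDist.inv_cdf` from the Python standard library maps each uniform point to a normal one. The helper names are hypothetical.

```python
from statistics import NormalDist


def van_der_corput(i, base=2):
    """Radical inverse of the integer i: a 1D low discrepancy point in (0, 1)."""
    inv, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        inv += digit / denom
    return inv


def normal_points(N, mu=0.0, sigma=1.0):
    """Map the first N uniform points to N(mu, sigma^2) via the inverse CDF."""
    dist = NormalDist(mu, sigma)
    # Indices start at 1 so that no point equals 0, where inv_cdf is undefined
    return [dist.inv_cdf(van_der_corput(i)) for i in range(1, N + 1)]


if __name__ == "__main__":
    print(normal_points(7))
```

The same pattern applies to any distribution whose inverse distribution function can be evaluated, one coordinate at a time.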
Are QMC methods efficient for high-dimensional problems?
“For high-dimensional problems (n > 12), QMC offers no practical advantage over Monte Carlo” (Bratley, Fox, and Niederreiter, 1992) ?!
Discrepancy I. Low Dimensions
Discrepancy II. High Dimensions
MC in high dimensions has smaller discrepancy
Is MC more efficient for high-dimensional problems than QMC?
• Pros:
  - MC in high dimensions has smaller discrepancy
  - Some studies show degradation of the convergence rate of QMC methods in high dimensions to O(1/√N)
• Cons:
  - Huge success of QMC methods in finance: QMC methods proved to be much more efficient than MC even for problems with thousands of variables
  - Many tests showed superior performance of QMC methods for high-dimensional integration
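Effective dimension
• Superposition sense: the smallest integer $d_S$ such that $\sum_{0 < |u| \le d_S} S_u \ge 1 - \varepsilon$
• Truncation sense: the smallest integer $d_T$ such that $\sum_{\emptyset \ne u \subseteq \{1, \dots, d_T\}} S_u \ge 1 - \varepsilon$
Here $S_u$ is the Sobol’ SI of the subset of variables $u$ and $\varepsilon$ is a small threshold.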
Approximation errors
For many problems only low order terms in the ANOVA decomposition are important.
• Theorem 1 links the approximation error to the effective dimension in the superposition sense.
• Theorem 2 links the approximation error to the effective dimension in the truncation sense, and gives the condition under which a set of variables can be regarded as not important.
Classification of functions
• Type A. Variables are not equally important
• Types B, C. Variables are equally important
  - Type B. Dominant low order indices
  - Type C. Dominant higher order indices
Sensitivity indices for type A functions
Integration error vs. N. Type A
(a) $f(x) = \sum_{i=1}^{n} (-1)^i \prod_{j=1}^{i} x_j$, n = 360
(b) $f(x) = \prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$, n = 100
Sensitivity indices for type B functions
Dominant low order indices
Integration error vs. N. Type B
Dominant low order indices
Sensitivity indices for type C functions
Dominant higher order indices
The integration error vs. N. Type C
Dominant higher order indices
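The Morris method
Model: $y = f(x_1, \dots, x_k)$
Elementary Effect for the ith input factor at a point $X^0$: $EE_i(X^0) = \frac{f(x_1^0, \dots, x_i^0 + \Delta, \dots, x_k^0) - f(X^0)}{\Delta}$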
The EEi is still a local measure. Solution: take the average of several EE.
r elementary effects $EE_i^1, EE_i^2, \dots, EE_i^r$ are computed at $X^1, \dots, X^r$ and then averaged.
• Average of the EEi’s: μ(xi)
• Standard deviation of the EEi’s: σ(xi)
A graphical representation of results
Factors can be screened on the (μ(xi), σ(xi)) plane.
Implementation of the Morris method
[Figure: a trajectory of the EE design]
• r trajectories of (k+1) sample points are generated, each providing one EE per input
• Total cost = r(k + 1) model evaluations
• r is in the range 4-10
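The trajectory design can be sketched as below. This is a simplified illustration, not the exact design from the slides: base points are drawn uniformly (rather than on a p-level grid), every step is +Δ, and the function name `morris` and its default arguments are assumptions. It returns μ, μ* (the mean of the absolute elementary effects), σ, and the number of model evaluations, which equals r(k + 1).

```python
import random
from statistics import mean, pstdev


def morris(f, k, r=10, delta=0.25, seed=0):
    """Screen k inputs of f on [0, 1]^k with r one-at-a-time trajectories."""
    rng = random.Random(seed)
    effects = [[] for _ in range(k)]
    n_evals = 0
    for _ in range(r):
        # Base point chosen so that every +delta step stays inside the cube
        x = [rng.uniform(0.0, 1.0 - delta) for _ in range(k)]
        y = f(x)
        n_evals += 1
        order = list(range(k))
        rng.shuffle(order)  # perturb the inputs in random order
        for i in order:
            x_next = list(x)
            x_next[i] += delta
            y_next = f(x_next)
            n_evals += 1
            effects[i].append((y_next - y) / delta)  # elementary effect EE_i
            x, y = x_next, y_next
    mu = [mean(e) for e in effects]
    mu_star = [mean([abs(v) for v in e]) for e in effects]
    sigma = [pstdev(e) for e in effects]
    return mu, mu_star, sigma, n_evals


if __name__ == "__main__":
    def model(x):
        return 2.0 * x[0] + x[1] * x[2]  # x0 is linear, x1 and x2 interact

    mu, mu_star, sigma, n_evals = morris(model, k=3, r=10)
    print("mu*   =", mu_star)
    print("sigma =", sigma)
    print("evaluations =", n_evals)
```

A purely linear input shows σ close to zero, while inputs involved in interactions or nonlinearity show σ > 0, which is what the (μ, σ) screening plot exploits.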
A comparison with variance-based methods
• μ*(xi) is related to STi
• Test: the g-function of Sobol’ [Figure panels: a = 99, a = 9, a = 0.9]
• μ*(xi) and STi give similar ranking
• Problems: large Δ -> incorrect μ*(xi)
Derivative based Global Sensitivity Measures
Morris measure in the limit Δ → 0.
Sample Sobol’ points $X^1, \dots, X^r$, estimate the finite differences $E_i^1, E_i^2, \dots, E_i^r$, and average them.
Average of the Ei’s: M*(xi)
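A minimal sketch of this measure, assuming M*(xi) is the average of |∂f/∂xi| over the unit cube, approximated by forward finite differences with a small step. Pseudo-random points are used here instead of Sobol’ points to keep the example dependency-free, and the name `dgsm` and its arguments are hypothetical.

```python
import random


def dgsm(f, n, N=5000, h=1e-6, seed=0):
    """Average absolute partial derivatives of f over [0, 1]^n."""
    rng = random.Random(seed)
    m_star = [0.0] * n
    for _ in range(N):
        # Keep x + h inside the unit cube
        x = [rng.random() * (1.0 - h) for _ in range(n)]
        y = f(x)
        for i in range(n):
            x_step = list(x)
            x_step[i] += h
            # Forward finite difference, i.e. an elementary effect with tiny delta
            m_star[i] += abs((f(x_step) - y) / h)
    return [m / N for m in m_star]


if __name__ == "__main__":
    # For f = x0^2 + 3*x1 the exact values are M* = (1, 3)
    print(dgsm(lambda x: x[0] ** 2 + 3.0 * x[1], n=2))
```

The cost is N(n + 1) model evaluations, the same structure as the Morris design but with a step small enough to approximate the derivative.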
The integration error vs. N. Type A
g-function of Sobol’
Comparison of Sobol’ SI and Derivative based Global Sensitivity Measures
There is a link between the total Sobol’ SI and the derivative based global sensitivity measures.
Comparison of Sobol’ SI and Derivative based Global Sensitivity Measures
1. Small values of the derivative based measures imply small values of $S_i^{tot}$.
2. For highly nonlinear functions, ranking based on global SI can be very different from that based on derivative based sensitivity measures.
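Quasi Random Sampling HDMR
For many problems only low order terms in the ANOVA decomposition are important.
$f(x) \approx h(x) = f_0 + \sum_i f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j)$ is a metamodel (HDMR), Rabitz et al.
It is assumed that the effective dimension in the superposition sense is $d_S = 2$.
Sobol’ SI: $S_i = \frac{1}{D} \int_0^1 f_i^2(x_i)\, dx_i$, $S_{ij} = \frac{1}{D} \int_0^1 \int_0^1 f_{ij}^2(x_i, x_j)\, dx_i\, dx_j$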
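Polynomial Approximation
Orthonormal polynomial base: $f_i(x_i) \approx \sum_r \alpha_r^i \varphi_r(x_i)$, $f_{ij}(x_i, x_j) \approx \sum_p \sum_q \beta_{pq}^{ij} \varphi_p(x_i) \varphi_q(x_j)$
• Properties: $\int_0^1 \varphi_r(x)\, dx = 0$, $\int_0^1 \varphi_r^2(x)\, dx = 1$, $\int_0^1 \varphi_p(x) \varphi_q(x)\, dx = 0$ for $p \ne q$
• First few Legendre polynomials: $\varphi_1(x) = \sqrt{3}\,(2x - 1)$, $\varphi_2(x) = 6\sqrt{5}\,(x^2 - x + \tfrac{1}{6})$, $\varphi_3(x) = 20\sqrt{7}\,(x^3 - \tfrac{3}{2}x^2 + \tfrac{3}{5}x - \tfrac{1}{20})$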
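The orthonormality properties can be checked numerically. The sketch below assumes the base consists of shifted Legendre polynomials on [0, 1], scaled by √(2r + 1), and builds them with the standard three-term recurrence; the names `phi` and `inner` are hypothetical helpers.

```python
import math


def phi(r, x):
    """Orthonormal shifted Legendre polynomial of degree r on [0, 1]."""
    if r == 0:
        return 1.0
    t = 2.0 * x - 1.0  # map [0, 1] to [-1, 1]
    p_prev, p = 1.0, t
    for m in range(1, r):
        # Recurrence: (m + 1) P_{m+1} = (2m + 1) t P_m - m P_{m-1}
        p_prev, p = p, ((2 * m + 1) * t * p - m * p_prev) / (m + 1)
    return math.sqrt(2 * r + 1) * p


def inner(p, q, M=2000):
    """Midpoint-rule approximation of the integral of phi_p * phi_q on [0, 1]."""
    return sum(phi(p, (s + 0.5) / M) * phi(q, (s + 0.5) / M)
               for s in range(M)) / M


if __name__ == "__main__":
    for p in range(1, 4):
        print([round(inner(p, q), 6) for q in range(1, 4)])
```

The printed matrix should be close to the identity, and `inner(r, 0)` close to zero, which corresponds to the zero-mean property.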
Global Sensitivity Analysis (HDMR)
• The number of function evaluations is
  - N(n+2) for the original Sobol’ method
  - N for sensitivity indices based on RS-HDMR
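Why a single sample of N points is enough can be seen from a sketch of the first-order part of the method. It assumes each component function is expanded in the orthonormal Legendre base, that the coefficients are estimated as sample averages of f(x)·φ_r(x_i), and that S_i is approximated by the sum of squared coefficients divided by D. Pseudo-random points replace the quasi random points of the slides, and the function names and default `max_order` are assumptions.

```python
import math
import random


def phi(r, x):
    """Orthonormal shifted Legendre polynomials of degree 1 to 3 on [0, 1]."""
    if r == 1:
        return math.sqrt(3.0) * (2.0 * x - 1.0)
    if r == 2:
        return 6.0 * math.sqrt(5.0) * (x * x - x + 1.0 / 6.0)
    if r == 3:
        return 20.0 * math.sqrt(7.0) * (x ** 3 - 1.5 * x * x + 0.6 * x - 0.05)
    raise ValueError("only orders 1 to 3 are provided in this sketch")


def rs_hdmr_first_order(f, n, N, max_order=3, seed=0):
    """First-order Sobol' indices from one sample of N model evaluations."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n)] for _ in range(N)]
    Y = [f(x) for x in X]  # the only model evaluations
    f0 = sum(Y) / N
    D = sum((y - f0) ** 2 for y in Y) / N
    S = []
    for i in range(n):
        # alpha_r^i ~ average of (f - f0) * phi_r(x_i); centring reduces noise
        alphas = [sum((y - f0) * phi(r, x[i]) for x, y in zip(X, Y)) / N
                  for r in range(1, max_order + 1)]
        S.append(sum(a * a for a in alphas) / D)
    return S


if __name__ == "__main__":
    # Additive test model: the exact values are S = (0.2, 0.8)
    print(rs_hdmr_first_order(lambda x: x[0] + 2.0 * x[1], n=2, N=40000))
```

All indices are computed from the same N evaluations, in contrast to the N(n+2) evaluations of the direct Sobol’ approach.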
How to define the maximum polynomial order?
• Homma-Saltelli function
RMSE (root mean square error) for Homma-Saltelli function
• QMC outperforms MC
• RS-HDMR converges faster than the Sobol’ SI method
Sobol’ g-function
• g-function: $f(x) = \prod_{i=1}^{n} g_i$ with $g_i = \frac{|4x_i - 2| + a_i}{1 + a_i}$
• 2 important and 8 unimportant variables
• QRS-HDMR converges faster
• Values of $S_i^{tot}$ can be inaccurate.
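The g-function is convenient as a test case because its indices are known in closed form. The sketch below uses the standard analytic results for inputs uniform on [0, 1]: $D_i = 1/(3(1 + a_i)^2)$, $D = \prod_i (1 + D_i) - 1$, and $D_i^{tot} = D_i \prod_{j \ne i}(1 + D_j)$. The coefficient vector in the usage example (two values of 0 and eight values of 99) is a hypothetical choice giving 2 important and 8 unimportant variables; the slides do not state the values used.

```python
def g_function(x, a):
    """Sobol' g-function: product of (|4 x_i - 2| + a_i) / (1 + a_i)."""
    result = 1.0
    for xi, ai in zip(x, a):
        result *= (abs(4.0 * xi - 2.0) + ai) / (1.0 + ai)
    return result


def g_function_indices(a):
    """Analytic first-order and total Sobol' indices of the g-function."""
    d = [1.0 / (3.0 * (1.0 + ai) ** 2) for ai in a]  # partial variances D_i
    prod = 1.0
    for di in d:
        prod *= 1.0 + di
    D = prod - 1.0  # total variance
    S = [di / D for di in d]
    S_tot = [di * prod / (1.0 + di) / D for di in d]
    return S, S_tot


if __name__ == "__main__":
    a = [0.0, 0.0] + [99.0] * 8  # hypothetical coefficients
    S, S_tot = g_function_indices(a)
    print("S     =", [round(s, 4) for s in S])
    print("S_tot =", [round(s, 4) for s in S_tot])
```

Small values of a_i mark important variables and large values mark unimportant ones, so the analytic indices give a reference against which sampling-based estimates can be compared.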
Function Approximation
Sobol’ g-function
Error measure:
Computational costs
The QRS-HDMR method requires 10 to $10^3$ times fewer model evaluations than the Sobol’ SI method!
Optimal experimental design (OED) for parameter estimation
Find values of experimentally manipulable variables (controls) and the time sampling strategy for a set of Nexp experiments which provide maximum information for the subsequent parameter estimation problem,
subject to:
• System dynamics (ODEs, DAEs)
• Other algebraic constraints
• Upper and lower bounds
This is a non-linear programming problem (NLP) with partial differential-algebraic equation (PDAE) constraints.
Case study: fed-batch reactor
• Parameters to be estimated: p1, p2, with 0.05 < p1 < 0.98 and 0.05 < p2 < 0.98
• Control variables: u1, u2
  - Dilution factor: 0.05 < u1 < 0.5
  - Feed substrate concentration: 5 < u2 < 50
• Model equations for biomass, substrate, and reaction rate
OED traditional approach
• Fisher Information Matrix (FIM) based criteria: A criterion, D criterion, E criterion, Modified-E criterion
Main drawback: these criteria are based on local SI, which rely on non-realistic linear and local assumptions.
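As an illustration of such criteria, the sketch below evaluates four scalar measures for a two-parameter problem like the case study (p1, p2). The definitions are assumptions, since the slide formulas were not recovered and conventions vary between authors: A is taken as the trace of the inverse FIM, D as the determinant, E as the smallest eigenvalue, and Modified-E as the ratio of largest to smallest eigenvalue. The FIM is built from local output sensitivities, which is exactly the local assumption named as the main drawback. All function names are hypothetical.

```python
import math


def fim_from_sensitivities(rows, sigma=1.0):
    """2x2 FIM = sum over samples of s s^T / sigma^2, with s = (dy/dp1, dy/dp2)."""
    f11 = sum(s1 * s1 for s1, _ in rows) / sigma ** 2
    f12 = sum(s1 * s2 for s1, s2 in rows) / sigma ** 2
    f22 = sum(s2 * s2 for _, s2 in rows) / sigma ** 2
    return ((f11, f12), (f12, f22))


def fim_criteria(F):
    """Scalar design criteria of a symmetric, non-singular 2x2 FIM."""
    (a, b), (_, c) = F
    det = a * c - b * b
    tr = a + c
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    lam_min, lam_max = tr / 2.0 - disc, tr / 2.0 + disc
    return {
        "A": tr / det,  # trace of the inverse of a 2x2 matrix
        "D": det,
        "E": lam_min,
        "modified_E": lam_max / lam_min,
    }


if __name__ == "__main__":
    # Hypothetical local sensitivities at three sampling times
    F = fim_from_sensitivities([(1.0, 0.2), (0.5, 1.0), (0.1, 2.0)])
    print(fim_criteria(F))
```

A design search would maximise or minimise one of these scalars over the controls and sampling times, depending on the convention chosen.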