Extracting Complex Nonlinear Response Surfaces from Deterministic Models Davood Shahsavani and Anders Grimvall Linköping University SAMO 2007, 21 Jun 2007
Motivation
Many applications of computer code models require repeated model runs for different sets of inputs.
Response surface methodologies can:
• Help to learn about the model
• Facilitate the development of computationally cheap decision support tools
The two steps in extracting response surfaces from computationally expensive computer-code models
Step 1: Choose a suitable design of the computer experiment
Step 2: Choose an interpolation method that enables accurate prediction of the model output at previously untried inputs
Features of currently used techniques for extracting response surfaces
• The design criteria favour regular fractional factorial designs or an almost uniform coverage of the input domain
• The interpolation is usually based on a global model for the entire input domain
Study objective
Extraction of response surfaces whose curvature varies strongly over the input domain
• Our designs are space-filling and have particularly good coverage of regions in which the response surface is rough or strongly nonlinear
• We fit local models to the responses computed for subsets of design points
Splitting procedure
Design points after two splits
A sequential design algorithm for box-shaped input domains
1. Initiate the design algorithm by selecting a (slightly extended) corner-centre design
2. Start a loop in which the input domain is split into sub-boxes, and new corners and centres are added to the design
• A measure of roughness or nonlinearity is used to determine which box is split into two halves
• A direction criterion is used to determine the direction in which the selected sub-box is split
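The loop described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the roughness proxy `box_score` and the direction rule `split_direction` are hypothetical stand-ins (the formulas used in the talk are not preserved in this transcript), and the sketch simply keeps every point at which the model is run as a design point.

```python
import itertools

def corners_and_centre(lo, hi):
    """Return the 2^p corners and the centre of the box [lo, hi]."""
    corners = list(itertools.product(*zip(lo, hi)))
    centre = tuple((a + b) / 2 for a, b in zip(lo, hi))
    return corners, centre

def box_score(f, lo, hi):
    """Hypothetical roughness proxy: volume-weighted absolute error, at the
    centre, of a linear predictor (here the mean of the corner values)."""
    corners, centre = corners_and_centre(lo, hi)
    linear_guess = sum(f(c) for c in corners) / len(corners)
    volume = 1.0
    for a, b in zip(lo, hi):
        volume *= b - a
    return abs(f(centre) - linear_guess) * volume

def split_direction(f, lo, hi):
    """Hypothetical direction rule: the axis with the largest absolute
    second difference of f through the centre of the box."""
    _, centre = corners_and_centre(lo, hi)

    def second_difference(k):
        below = centre[:k] + (lo[k],) + centre[k + 1:]
        above = centre[:k] + (hi[k],) + centre[k + 1:]
        return abs(f(below) - 2 * f(centre) + f(above))

    return max(range(len(lo)), key=second_difference)

def sequential_design(model, lo, hi, n_splits):
    """Split the roughest sub-box in half n_splits times.
    Returns the final sub-boxes and the sorted list of design points."""
    runs = {}  # cache: each model run is stored and kept as a design point

    def f(x):
        if x not in runs:
            runs[x] = model(x)
        return runs[x]

    boxes = [(tuple(lo), tuple(hi))]
    for _ in range(n_splits):
        # Select the sub-box with the largest roughness score.
        idx = max(range(len(boxes)), key=lambda i: box_score(f, *boxes[i]))
        b_lo, b_hi = boxes.pop(idx)
        k = split_direction(f, b_lo, b_hi)
        mid = (b_lo[k] + b_hi[k]) / 2
        # Halve the selected box along coordinate k.
        boxes.append((b_lo, b_hi[:k] + (mid,) + b_hi[k + 1:]))
        boxes.append((b_lo[:k] + (mid,) + b_lo[k + 1:], b_hi))
    for b_lo, b_hi in boxes:
        box_score(f, b_lo, b_hi)  # run the model at corners and centres of all final boxes
    return boxes, sorted(runs)
```

For example, `sequential_design(lambda x: x[0] ** 5 + x[1] ** 5, (0.0, 0.0), (1.0, 1.0), 20)` should place the smaller sub-boxes where the fifth-power terms bend most strongly. Caching the runs matters in practice because each call to the model is assumed to be expensive.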
Quantifying the roughness of a response surface
• The integrated roughness R_f(D) of a function f(x_1, …, x_p) in a subset D of the input domain is usually defined as [equation not preserved]
• The integrated roughness of a second-order polynomial is [equation not preserved]
• We estimate the integrated roughness of any function by computing [equation not preserved], where the coefficients are estimated parameters in a fitted polynomial
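The formulas on this slide are not preserved, so the following sketch rests on an assumption: it uses the common definition of integrated roughness as the integral over D of the sum of squared second-order partial derivatives of f. Under that definition, a second-order polynomial with coefficient b_ii on x_i^2 and b_ij on x_i x_j (i < j) has a constant integrand, 4 Σ b_ii^2 + 2 Σ b_ij^2, so its roughness is that constant times the volume of D. The function names and the small least-squares solver are hypothetical helpers, not taken from the talk.

```python
import itertools

def quadratic_terms(p):
    """Index pairs (i, j) with i <= j for the second-order terms x_i * x_j."""
    return list(itertools.combinations_with_replacement(range(p), 2))

def design_row(x):
    """Row of the regression matrix for a full second-order polynomial."""
    return [1.0] + list(x) + [x[i] * x[j] for i, j in quadratic_terms(len(x))]

def solve_least_squares(rows, y):
    """Least squares via the normal equations and Gauss-Jordan elimination
    with partial pivoting (adequate for small, well-conditioned problems)."""
    m = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(m)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(m)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(m):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
    return [a[i][m] / a[i][i] for i in range(m)]

def estimated_roughness(points, values, volume):
    """Fit a quadratic to (points, values) and return the assumed roughness
    volume * (4 * sum b_ii^2 + 2 * sum_{i<j} b_ij^2)."""
    p = len(points[0])
    beta = solve_least_squares([design_row(x) for x in points], values)
    total = 0.0
    for (i, j), b in zip(quadratic_terms(p), beta[1 + p:]):
        total += (4.0 if i == j else 2.0) * b * b
    return volume * total
```

As a check of the arithmetic, a quadratic with b_11 = 3 and b_12 = 1 on the unit square gives 4·9 + 2·1 = 38 under the assumed definition.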
A roughness measure that focuses on integrated absolute errors in linear predictors
• Consider a box in which the ith side has length h_i
• We estimate the roughness of the response surface by computing [equation not preserved]
Possible splits of a box
Splitting direction
Suppose that the sub-box D* is to be split, and let [symbol not preserved] be the polynomial fitted to data in or on the border of that box.
Then we split the selected sub-box along the kth coordinate, where [equation not preserved]
Design points for a simple response surface f(x_1, x_2) = x_1^5 + x_2^5
The INCA-N model (Integrated Nitrogen in Catchments)
Inputs:
• Model parameters: initial conditions, nitrogen transformation rates, hydrogeological parameters
• Daily weather data
Outputs:
• Daily estimates of water discharge and NO3 and NH4 concentrations in river water
• Average annual riverine load of inorganic nitrogen
Examples of response surfaces produced by the INCA-N model
[Figure: two surface plots of average annual nitrogen loss; input axes: max. nitrate uptake rate, denitrification rate, plant nitrate uptake rate]
Design points for the INCA-N model
[Figure: average annual nitrogen loss against denitrification rate and plant nitrate uptake rate]
Design points for the INCA-N model
Interpolation
• Local interpolation is preferably based on simple models
• Constant or linear predictors can be too simplistic for nonlinear deterministic functions
• We started by fitting quadratic polynomials to minimal neighbourhoods of the point at which the response is to be predicted
Prediction errors for the function f(x_1, x_2, x_3, x_4, x_5) = x_1^5 + x_2^5 + 0.1(x_3 + x_4 + x_5)
Interpolation from 1024 (4^5) design points
More carefully selected local neighbourhoods
• Determine the minimal number of design points needed to produce a full-rank design matrix for the polynomial regression
• Add a fixed number of design points
• Use the shape of the sub-box surrounding the new point to define a suitable local distance measure
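The neighbourhood selection and local quadratic fit can be sketched as below. Several choices here are assumptions made for illustration: the local distance is a Euclidean distance scaled by the side lengths `h` of the surrounding sub-box, the minimal neighbourhood size is taken to be the number of coefficients in a full quadratic, (p + 1)(p + 2)/2, without checking the rank of the design matrix, and `extra` is a hypothetical name for the fixed number of added points.

```python
import itertools

def solve_least_squares(rows, y):
    """Least squares via the normal equations and Gauss-Jordan elimination
    with partial pivoting (adequate for small, well-conditioned problems)."""
    m = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(m)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(m)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(m):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
    return [a[i][m] / a[i][i] for i in range(m)]

def local_quadratic_predict(points, values, x0, h, extra=3):
    """Predict the response at x0 from a quadratic fitted to the nearest
    design points, with distances scaled by the sub-box side lengths h."""
    p = len(x0)
    pairs = list(itertools.combinations_with_replacement(range(p), 2))
    n_min = 1 + p + len(pairs)  # coefficients in a full quadratic

    def local(x):
        # Coordinates centred at x0 and scaled by the sub-box shape.
        return [(xi - ci) / hi for xi, ci, hi in zip(x, x0, h)]

    ranked = sorted(zip(points, values),
                    key=lambda pv: sum(u * u for u in local(pv[0])))
    chosen = ranked[:n_min + extra]
    rows = []
    for x, _ in chosen:
        u = local(x)
        rows.append([1.0] + u + [u[i] * u[j] for i, j in pairs])
    beta = solve_least_squares(rows, [v for _, v in chosen])
    return beta[0]  # in centred coordinates the intercept is the prediction at x0
```

Centring the coordinates at the prediction point keeps the fit well conditioned and makes the intercept the predicted value. A fuller implementation would enlarge the neighbourhood until the design matrix has full rank, as the first bullet describes.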
Prediction errors for the function f(x_1, x_2, x_3, x_4, x_5) = x_1^5 + x_2^5 + 0.1(x_3 + x_4 + x_5)
[Figure panels: minimal local neighbourhood; extended local neighbourhood]
Interpolation from 1024 (4^5) design points
Predicted values for simple nonlinear functions
• f(x_1, x_2, x_3, x_4, x_5) = x_1^5 + x_2^5 + 0.1(x_3 + x_4 + x_5), 1024 (4^5) design points
• f(x_1, x_2, x_3, x_4, x_5, x_6, x_7) = x_1^5 + x_2^5 + 0.1(x_3 + x_4 + x_5 + x_6 + x_7), 2187 (3^7) design points
Efficiency of local quadratic approximation of f(x_1, x_2, x_3, x_4, x_5) = x_1^5 + x_2^5 + 0.1(x_3^5 + x_4^5 + x_5^5)
[Figure annotation: "Break down"]
Efficiency of local quadratic approximation of INCA-N
Sequential design with extended local neighbourhoods
Efficiency of local quadratic approximation of INCA-N
Sequential design and a regular grid design
Accommodation of correlated inputs and arbitrarily shaped input domains
Substitute [expression not preserved] for [expression not preserved], where g(x) is the joint probability density of the inputs
Main conclusions
• Our sequential design automatically adapts to the nonlinear features of the response surface under consideration
• Local interpolation using quadratic polynomials performs satisfactorily, provided that local neighbourhoods are selected with care (appropriate size, local distance measure)
Other conclusions
• Both the design algorithm and the interpolation technique are conceptually simple and computationally cheap
• The derived surrogate model forms a good basis for sensitivity analyses and user-friendly decision support tools
• Our procedure is particularly suitable for studies of a single output from strongly nonlinear models with 2 to 7 inputs
Sensitivity analysis
• The input parameters were divided into three groups: initial conditions, hydrogeological parameters, and nitrogen transformation rates
• Variance-based sensitivity analyses were first carried out for each group of parameters and then for the six most influential parameters
• Although the hydrogeological parameters influenced the timing of the nitrogen losses, the average annual loss was almost exclusively determined by the nitrogen transformation rates
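The slides do not say which estimator was used for the variance-based analysis. As a minimal illustration of the idea, the sketch below estimates first-order sensitivity indices with a standard pick-freeze Monte Carlo scheme. It assumes independent inputs that are uniform on [0, 1] and uses a toy function in place of INCA-N or its surrogate; the function name and its parameters are hypothetical.

```python
import random

def first_order_indices(model, p, n=50_000, seed=0):
    """Estimate first-order variance-based sensitivity indices for a model
    with p independent inputs, each assumed uniform on [0, 1]."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(p)] for _ in range(n)]
    b = [[rng.random() for _ in range(p)] for _ in range(n)]
    fa = [model(x) for x in a]
    fb = [model(x) for x in b]
    mean = sum(fa) / n
    var = sum((v - mean) ** 2 for v in fa) / n
    indices = []
    for i in range(p):
        total = 0.0
        for xa, xb, ya, yb in zip(a, b, fa, fb):
            # Sample A with its ith coordinate replaced by the one from B.
            mixed = xa[:i] + [xb[i]] + xa[i + 1:]
            # Outputs are centred to reduce the Monte Carlo noise.
            total += (yb - mean) * (model(mixed) - ya)
        indices.append(total / n / var)
    return indices
```

For the toy model `lambda x: x[0] + 2 * x[1]` the exact first-order indices are 0.2 and 0.8, since the variances contributed by the two inputs are 1/12 and 4/12. Running such an analysis on a cheap surrogate rather than on the original model is what makes the large number of evaluations affordable.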
Sensitivity indices for the six most influential inputs to the INCA-N model
Nitrogen transformation parameters
The INCA-N model
Model inputs: nitrogen transformation rates
1. Denitrification
2. Plant nitrate uptake rate
3. Max nitrate uptake rate
4. Nitrification
5. Mineralisation
6. Immobilisation
7. Plant ammonium uptake rate
Nitrification and all initial conditions and hydrogeological parameters are fixed at the mid value
Model output: average annual riverine load of inorganic nitrogen