CHEE801 Module 5: Nonlinear Regression
Notation
Model specification:
• the model equation is y_i = f(x_i, θ) + ε_i
  • x_i – explanatory variables, the i-th run conditions
  • θ – p-dimensional vector of parameters
  • ε_i – random noise component
• with n experimental runs, we have η(θ) = [f(x_1, θ), …, f(x_n, θ)]^T
  • η(θ) defines the expectation surface
• the nonlinear regression model is y = η(θ) + ε
Parameter Estimation – Gauss-Newton Iteration
Least squares estimation – minimize S(θ) = [y − η(θ)]^T [y − η(θ)]
A numerical optimization procedure is required. One possible method:
• Linearization about the current estimate of the parameters
• Solution of the linearized regression problem to obtain the next parameter estimate
• Iteration until a convergence criterion is satisfied
Linearization about a nominal parameter vector
Linearize the expectation function η(θ) in terms of the parameter vector θ about a nominal vector θ0:
η(θ) ≈ η(θ0) + V0 (θ − θ0)
where the matrix V0 = ∂η/∂θ^T, evaluated at θ0:
• is the Sensitivity Matrix
• is the Jacobian of the expectation function
• contains first-order sensitivity information
Parameter Estimation – Gauss-Newton Iteration
Iterative procedure consisting of:
• Linearization about the current estimate θ^(j) of the parameters
• Solution of the linearized regression problem to obtain the next parameter estimate update:
  θ^(j+1) = θ^(j) + (V_j^T V_j)^(−1) V_j^T [y − η(θ^(j))]
  where V_j is the sensitivity matrix evaluated at θ^(j)
• Iteration until the parameter estimates converge
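The iteration above can be written compactly in code. The following is a minimal Python/NumPy sketch, not the course's own implementation: the model f, the synthetic data, the starting guess, and the finite-difference Jacobian are all illustrative assumptions.

```python
# A minimal Gauss-Newton sketch. The model f, the synthetic data, and the
# finite-difference Jacobian are illustrative assumptions, not course code.
import numpy as np

def jacobian(f, x, theta, h=1e-6):
    """Finite-difference approximation of the sensitivity matrix V."""
    eta0 = f(x, theta)
    V = np.empty((eta0.size, theta.size))
    for j in range(theta.size):
        tp = theta.copy()
        tp[j] += h
        V[:, j] = (f(x, tp) - eta0) / h
    return V

def gauss_newton(f, x, y, theta0, tol=1e-8, max_iter=50):
    theta = np.array(theta0, dtype=float)
    for _ in range(max_iter):
        r = y - f(x, theta)                  # residuals y - eta(theta)
        V = jacobian(f, x, theta)            # linearize about current estimate
        delta, *_ = np.linalg.lstsq(V, r, rcond=None)  # linearized LS step
        theta += delta
        if np.linalg.norm(delta) < tol * (1.0 + np.linalg.norm(theta)):
            break                            # convergence criterion satisfied
    return theta

# Illustrative use: y = theta1 * (1 - exp(-theta2 * x)) plus noise
f = lambda x, th: th[0] * (1.0 - np.exp(-th[1] * x))
x = np.linspace(0.5, 10.0, 20)
y = f(x, np.array([2.0, 0.4])) + 0.05 * np.random.default_rng(0).standard_normal(x.size)
theta_hat = gauss_newton(f, x, y, theta0=[1.0, 1.0])
```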
Computational Issues in Gauss-Newton Iteration
The Gauss-Newton iteration can be subject to poor numerical conditioning for some parameter values:
• Conditioning problems arise in inverting V^T V
• Solution – use a decomposition technique (see the sketches below):
  • QR decomposition
  • Singular Value Decomposition (SVD)
• Use a different optimization technique
• Don't try to estimate so many parameters:
  • Simplify the model
  • Fix some parameters at reasonable values
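A sketch of the decomposition-based remedies, assuming V and r are the sensitivity matrix and residual vector at the current iterate (function names are illustrative):

```python
# Sketches of decomposition-based solutions of the linearized step
# V delta ~= r, avoiding explicit inversion of V^T V. V and r are the
# sensitivity matrix and residual vector at the current iterate.
import numpy as np

def gn_step_qr(V, r):
    """QR: avoids squaring the condition number, which forming the
    normal equations V^T V delta = V^T r would do."""
    Q, R = np.linalg.qr(V)              # V = Q R, Q has orthonormal columns
    return np.linalg.solve(R, Q.T @ r)  # back-substitute R delta = Q^T r

def gn_step_svd(V, r, rcond=1e-10):
    """SVD: directions with tiny singular values (poorly identified
    parameter combinations) can be inspected or truncated."""
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    keep = s > rcond * s[0]             # drop near-singular directions
    return Vt[keep].T @ ((U[:, keep].T @ r) / s[keep])
```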
Other numerical estimation methods
• Nonlinear least squares is a minimization problem
• Any good optimization technique can be used to find the parameter estimates that minimize the sum of squares of the residuals
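For instance, a general-purpose solver can be handed the residual function directly. A sketch with SciPy, where the model, data, and starting guess are illustrative placeholders:

```python
# Sketch: handing the same minimization to a general-purpose solver.
# The model, data, and starting guess are illustrative placeholders.
import numpy as np
from scipy.optimize import least_squares

f = lambda x, th: th[0] * (1.0 - np.exp(-th[1] * x))
x = np.linspace(0.5, 10.0, 20)
y = f(x, np.array([2.0, 0.4])) + 0.05 * np.random.default_rng(0).standard_normal(x.size)

result = least_squares(lambda th: y - f(x, th), x0=[1.0, 1.0])
theta_hat = result.x    # least squares parameter estimates
V_hat = -result.jac     # Jacobian of residuals is -V (residual = y - eta)
```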
Inference – Joint Confidence Regions
• Approximate confidence regions for parameters and predictions can be obtained by using a linearization approach
• Approximate covariance matrix for the parameter estimates:
  cov(θ̂) ≈ σ² (V̂^T V̂)^(−1)
  where V̂ is the Jacobian of η(θ) evaluated at the least squares parameter estimates
• This covariance matrix approaches the true covariance matrix for the parameter estimates asymptotically, as the number of data points becomes infinite
• 100(1−α)% joint confidence region for the parameters:
  (θ − θ̂)^T V̂^T V̂ (θ − θ̂) ≤ p s² F_{p, n−p, α}
• Compare to the linear regression case, in which X plays the role of V̂ and the region is exact
Inference – Marginal Confidence Intervals
• Marginal confidence intervals on individual parameters:
  θ̂_i ± t_{n−p, α/2} s_{θ̂_i}
  where s_{θ̂_i} is the approximate standard error of the parameter estimate
• s_{θ̂_i}² = s² [(V̂^T V̂)^(−1)]_{ii} – uses the i-th diagonal element of the approximate parameter estimate covariance matrix, with the noise variance estimated as in the linear case
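A minimal sketch of these linearization-based interval calculations, assuming V_hat, r, and theta_hat come from a converged fit such as the Gauss-Newton sketch above:

```python
# Sketch of the linearization-based marginal intervals. V_hat, r, and
# theta_hat are assumed to come from a converged fit (see above).
import numpy as np
from scipy import stats

def marginal_intervals(V_hat, r, theta_hat, alpha=0.05):
    n, p = V_hat.shape
    s2 = (r @ r) / (n - p)                     # MSE estimate of noise variance
    cov = s2 * np.linalg.inv(V_hat.T @ V_hat)  # approximate covariance matrix
    se = np.sqrt(np.diag(cov))                 # approximate standard errors
    t = stats.t.ppf(1.0 - alpha / 2.0, n - p)  # t-quantile with n-p dof
    return theta_hat - t * se, theta_hat + t * se
```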
Precision of the Predicted Responses – Linear Case
From the linear regression module: the predicted response from an estimated model has uncertainty, because it is a function of the parameter estimates, which have uncertainty.
e.g., Solder Wave Defect Model – first response at the point (−1, −1, −1)
If the parameter estimates were uncorrelated, the variance of the predicted response would be the sum of the individual contributions:
Var(ŷ) = Σ_j x_j² Var(θ̂_j)
Why?
Precision of the Predicted Responses – Linear
In general, both the variances and covariances of the parameter estimates must be taken into account. For prediction at the k-th data point:
Var(ŷ_k) = x_k^T cov(θ̂) x_k
Precision of the Predicted Responses – Nonlinear
Linearize the prediction equation about the least squares estimate:
ŷ(x_k, θ) ≈ f(x_k, θ̂) + v_k^T (θ − θ̂)
For prediction at the k-th data point:
Var(ŷ_k) ≈ v_k^T cov(θ̂) v_k
Note – v_k^T is the k-th row of the Jacobian V̂, so the expression has the same form as the linear case with v_k in place of x_k.
Estimating Precision of Predicted Responses
Use an estimate of the inherent noise variance:
• replicates
• external estimate
• MSE:
  linear: s² = [y − Xθ̂]^T [y − Xθ̂] / (n − p)
  nonlinear: s² = [y − η(θ̂)]^T [y − η(θ̂)] / (n − p)
The degrees of freedom for the estimated variance of the predicted response are those of the estimate of the noise variance.
Confidence Limits for Predicted Responses
Linear and Nonlinear Cases: follow an approach similar to that for the parameters. 100(1−α)% confidence limits for the mean value of a predicted response are:
ŷ_k ± t_{ν, α/2} s_{ŷ_k}
• degrees of freedom ν are those of the inherent noise variance estimate
If the prediction is for a new data value, the confidence intervals are:
ŷ_k ± t_{ν, α/2} (s² + s²_{ŷ_k})^(1/2)
Why?
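A sketch of both interval types, assuming cov, s2, and dof come from the calculations above and v_k is the Jacobian row at the prediction point:

```python
# Sketch of confidence limits for a predicted response. v_k is the
# Jacobian row at the prediction point; cov, s2, and dof are the
# approximate parameter covariance, noise variance estimate, and its
# degrees of freedom from the calculations above.
import numpy as np
from scipy import stats

def prediction_limits(y_pred, v_k, cov, s2, dof, alpha=0.05, new_obs=False):
    var_pred = v_k @ cov @ v_k      # variance of the predicted mean response
    if new_obs:
        var_pred += s2              # new observation adds its own inherent noise
    half = stats.t.ppf(1.0 - alpha / 2.0, dof) * np.sqrt(var_pred)
    return y_pred - half, y_pred + half
```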
Properties of LS Parameter Estimates
Key point – parameter estimates are random variables:
• stochastic variation in the data propagates through the estimation calculations
• parameter estimates therefore have a variability pattern – probability distribution and density functions
Unbiased:
• the “average” of repeated data collection / estimation sequences will be the true value of the parameter vector
Properties of Parameter Estimates
Linear Regression Case – least squares estimates are:
• Unbiased
• Consistent
• Efficient
Nonlinear Regression Case – least squares estimates are:
• Asymptotically unbiased – as the number of data points becomes infinite
• Consistent
• Efficient
Diagnostics for nonlinear regression
• Similar to linear case
• Qualitative – residual plots; residuals vs.:
  • Factors in model
  • Sequence (observation) number
  • Factors not in model (covariates)
  • Predicted responses
• Things to look for:
  • Trend remaining
  • Non-constant variance
• Qualitative – plot of observed and predicted responses
  • Predicted vs. observed – slope of 1
  • Predicted and observed – as functions of the independent variable(s)
Diagnostics for nonlinear regression
• Quantitative diagnostics
• Ratio tests:
  • the 3 tests are the same as for the linear case
• R-squared
  • coarse measure of significant trend
  • squared correlation of observed and predicted values
• adjusted R-squared
  • same measure, penalized for the number of parameters in the model
Diagnostics for nonlinear regression
• Quantitative diagnostics
• Parameter confidence intervals:
  • Examine marginal intervals for parameters (based on linear approximations)
  • Can also use hypothesis tests
  • Consider dropping parameters that aren't statistically significant
• What should we do if parameters are:
  • Not significantly different from zero?
  • Not significantly different from the initial guesses?
• In nonlinear models, parameters are more likely to be involved in complex expressions with factors and other parameters
  • e.g., Arrhenius reaction rate expression
• If possible, examine joint confidence regions
Diagnostics for nonlinear regression
• Quantitative diagnostics
• Parameter estimate correlation matrix:
  • Examine the correlation matrix for parameter estimates (based on the linear approximation)
  • Compute the covariance matrix, then normalize each entry by the corresponding pair of standard deviations
  • Note significant correlations and keep these in mind when retaining/deleting parameters using marginal significance tests
  • Significant correlation between some parameter estimates may indicate over-parameterization relative to the data collected
  • Consider dropping some of the parameters whose estimates are highly correlated
• Further discussion – Chapter 3 of Bates and Watts (1988), Chapter 5 of Seber and Wild (1989)
Practical Considerations
• What kind of stopping conditions should be used to determine convergence?
• Problems with local minima?
• Reparameterization to reduce correlation between parameter estimates
• Ensuring physically realistic parameter estimates
  • Common problem – we know that some parameters should be positive, or should be bounded between reasonable values
  • Solutions:
    • Constrained optimization algorithm to enforce non-negativity of parameters
    • Reparameterization tricks – estimate κ instead of θ (see the sketch below):
      • θ = e^κ – positive for any real κ
      • θ = 1/(1 + e^(−κ)) – bounded between 0 and 1
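A sketch of the two reparameterization tricks: the optimizer works on an unconstrained κ, which is mapped back to the constrained parameter inside the model (function names are illustrative).

```python
# Sketch of the reparameterization tricks: the optimizer works on an
# unconstrained kappa, mapped back to the constrained parameter inside
# the model. Function names are illustrative.
import numpy as np

def positive(kappa):
    """theta = exp(kappa) is positive for any real kappa."""
    return np.exp(kappa)

def unit_interval(kappa):
    """theta = 1/(1 + exp(-kappa)) lies strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-kappa))
```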
Practical considerations
• Correlation between parameter estimates
• Reduce by reparameterization
• Exponential example – for y = θ1 e^(−θ2 x), centring x about a reference value x0 and estimating φ1 = θ1 e^(−θ2 x0) in place of θ1 reduces the correlation between the two estimates
Practical considerations
• Particular example – Arrhenius rate expression:
  k = k0 e^(−E/(RT)) reparameterized as k = k_ref e^(−(E/R)(1/T − 1/T_ref))
  with k_ref = k0 e^(−E/(R T_ref)) the rate constant at a reference temperature T_ref within the experimental range
• Reduces correlation between the parameter estimates and improves the conditioning of the estimation problem
Practical considerations
• Scaling – of parameters and responses
• Choices:
  • Scale by nominal values
    • Nominal values – design centre point, typical value over the range, average value
  • Scale by standard errors or initial uncertainty ranges for the parameters
    • Parameters – estimate of the standard deviation of the parameter estimate
    • Responses – standard deviation of the observations, i.e., the noise standard deviation
• Scaling can improve the conditioning of the estimation problem (e.g., scale the sensitivity matrix V, as in the sketch below), and can facilitate comparison of terms on similar (dimensionless) bases
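As an illustration of scaling by nominal parameter values (a sketch, with illustrative names):

```python
# Sketch: column-scaling the sensitivity matrix V by nominal parameter
# values gives dimensionless sensitivities (response change per
# fractional parameter change) and can improve conditioning.
import numpy as np

def scale_sensitivities(V, theta_nominal):
    return V * np.asarray(theta_nominal)   # scales column j by theta_nominal[j]

# Compare conditioning before and after:
# np.linalg.cond(V) vs. np.linalg.cond(scale_sensitivities(V, theta0))
```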
Practical considerations
• Initial parameter guesses are required:
  • From prior scientific knowledge
  • From prior estimation results
  • By simplifying the model equations
Things to learn in CHEE 811
• Estimating parameters in differential equation models
• Estimating parameters in multi-response models
• Deriving model equations from chemical engineering knowledge and qualitative accounts of what is happening
• Solving model equations numerically
• Deciding which parameters to estimate and which to leave at initial guesses when data are limited