Nonlinear Regression

Ecole Nationale Vétérinaire de Toulouse. Nonlinear Regression. Didier Concordet, d.concordet@envt.fr. ECVPT Workshop, April 2011. Can be downloaded at http://www.biostat.envt.fr/

An example

Questions • What does nonlinear mean? • What is nonlinear kinetics? • What is a nonlinear statistical model? • For a given model, how do we fit the data? • Is this model relevant?

What does nonlinear mean? • Definition: an operator P is linear if: • for all objects x, y on which it operates, P(x + y) = P(x) + P(y) • for all numbers a and all objects x, P(a x) = a P(x). When an operator is not linear, it is nonlinear.

Examples: among the operators below, which ones are nonlinear? • P(t) = a t • P(t) = a • P(t) = a + b t • P(t) = a t + b t² • P(a, b) = a t + b t² • P(A, a) = A exp(- a t) • P(A) = A exp(- 0.1 t) • P(t) = A exp(- a t)
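The definition can be probed numerically. The sketch below tests additivity and homogeneity on a few sample points for some of the one-argument operators listed above. The helper name `looks_linear` and the constants are hypothetical, chosen only for illustration. A failed check proves nonlinearity; a passed check is only an indication, not a proof.

```python
import math

def looks_linear(op, samples=(0.5, 1.0, 3.0), scalars=(2.0, -3.5), tol=1e-9):
    """Return False as soon as additivity or homogeneity fails on the samples."""
    for x in samples:
        for y in samples:
            if abs(op(x + y) - (op(x) + op(y))) > tol:
                return False          # P(x + y) != P(x) + P(y)
        for c in scalars:
            if abs(op(c * x) - c * op(x)) > tol:
                return False          # P(c x) != c P(x)
    return True

# Hypothetical constants, chosen only for illustration
a, b, A, t_fixed = 2.0, 1.0, 5.0, 3.0

checks = {
    "P(t) = a t": looks_linear(lambda t: a * t),
    "P(t) = a + b t": looks_linear(lambda t: a + b * t),
    "P(A) = A exp(-0.1 t)": looks_linear(lambda amp: amp * math.exp(-0.1 * t_fixed)),
    "P(t) = A exp(-a t)": looks_linear(lambda t: A * math.exp(-a * t)),
}
for name, ok in checks.items():
    print(name, "passes the check" if ok else "is nonlinear")
```

Note that the same expression can be linear in one argument and nonlinear in another: A exp(-0.1 t) passes as a function of A, while A exp(-a t) fails as a function of t.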

What is nonlinear kinetics? Let C(t, D) be the concentration at time t for a given dose D. The kinetics is linear when the operator P : D → C(t, D) is linear. When P(D) is not linear, the kinetics is nonlinear.

What is nonlinear kinetics? Examples:

What is a nonlinear statistical model? A statistical model: Y_i = f(θ, x_i) + ε_i, where Y_i is the observation (dependent variable), θ the parameters, x_i the covariates (independent variables), ε_i the error (residual) and f the function.

What is a nonlinear statistical model? A statistical model is linear when the operator P : θ → f(θ, x), which maps the parameters θ to the mean function f, is linear. When P is not linear, the model is nonlinear.

What is a nonlinear statistical model ? Example : Y = Concentration t = time The model : is linear

Examples: among the statistical models below, which ones are nonlinear?

How to fit the data ? Proceed in three main steps • Write a (statistical) model • Choose a criterion • Minimize the criterion
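The three steps can be sketched end to end as follows. This is a minimal illustration, assuming a mono-exponential mean model A exp(-a t), a constant error variance, and simulated data. The function names, starting values and data are hypothetical, and `scipy.optimize.least_squares` is only one of several possible minimizers.

```python
import numpy as np
from scipy.optimize import least_squares

# Step 1: write a model. Mean model: mono-exponential decline (illustrative choice);
# the errors are assumed to have a constant variance.
def mean_model(theta, t):
    A, a = theta
    return A * np.exp(-a * t)

# Step 2: choose a criterion. With a constant variance: ordinary least squares.
def residuals(theta, t, y):
    return y - mean_model(theta, t)

def ols_criterion(theta, t, y):
    return float(np.sum(residuals(theta, t, y) ** 2))

# Simulated data (hypothetical values)
rng = np.random.default_rng(0)
t = np.linspace(0.5, 24.0, 12)
y = mean_model([10.0, 0.2], t) + rng.normal(scale=0.1, size=t.size)

# Step 3: minimize the criterion
# (least_squares minimizes half the sum of squared residuals, same minimizer)
fit = least_squares(residuals, x0=[5.0, 0.1], args=(t, y))
theta_hat = fit.x
print(theta_hat, ols_criterion(theta_hat, t, y))
```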

Write a (statistical) model • Find a function of covariate(s) to describe the mean variation of the dependent variable (mean model). • Find a function of covariate(s) to describe the dispersion of the dependent variable about the mean (variance model).

Example: the error is assumed Gaussian with a constant variance (homoscedastic model)

How to choose the criterion to optimize? • Homoscedasticity: Ordinary Least Squares (OLS). Under normality, OLS is equivalent to maximum likelihood. • Heteroscedasticity: Weighted Least Squares (WLS), Extended Least Squares (ELS)

Homoscedastic models The Ordinary Least-Squares criterion Define :
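For reference, the usual textbook form of the ordinary least-squares criterion is given below. The notation is an assumption made here: Y_i is the i-th observation, x_i its covariates, f the mean model, θ the parameter vector, and SS the name of the criterion.

```latex
\[
  SS(\theta) = \sum_{i=1}^{n} \bigl( Y_i - f(\theta, x_i) \bigr)^2 ,
  \qquad
  \hat{\theta} = \arg\min_{\theta} SS(\theta)
\]
```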

Heteroscedastic models: the Weighted Least-Squares criterion. Define :

How to choose the weights? When the model is heteroscedastic (i.e. the variance of the error is not constant with i) It is possible to rewrite it as where does not depend on i The weights are chosen as

Example with The model can be rewritten as with The weights are chosen as
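A weighted criterion can be sketched as follows. As an assumption made purely for illustration, the error standard deviation is taken to be proportional to the mean response, so the weights are the inverse squared fitted values. The data, the fitted values and the function name are hypothetical.

```python
import numpy as np

def wls_criterion(y, fitted, weights):
    """Weighted least squares: sum of w_i * (y_i - f_i)^2."""
    return float(np.sum(weights * (y - fitted) ** 2))

# Hypothetical observations and model predictions
y = np.array([9.8, 5.1, 2.4, 1.3])
fitted = np.array([10.0, 5.0, 2.5, 1.25])

# Assumed variance model: SD proportional to the mean, so w_i = 1 / f_i^2
weights = 1.0 / fitted ** 2

ols_value = wls_criterion(y, fitted, np.ones_like(y))   # unit weights give OLS
wls_value = wls_criterion(y, fitted, weights)
print(ols_value, wls_value)
```

With these weights, a deviation of 0.1 around a prediction of 2.5 counts as much as a deviation of 0.4 around a prediction of 10, so low concentrations are no longer dominated by high ones.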

Extended (Weighted) Least Squares Define :
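One common form of the extended least-squares criterion adds a log-variance penalty to the weighted sum of squares, so that the variance parameters can be estimated together with the mean parameters. The sketch below assumes that form; the variance model (variance proportional to the squared mean), the value of `sigma2` and the data are hypothetical.

```python
import numpy as np

def els_criterion(y, fitted, variance):
    """Sum of (y_i - f_i)^2 / v_i + ln(v_i), with v_i the modelled variance."""
    return float(np.sum((y - fitted) ** 2 / variance + np.log(variance)))

# Hypothetical observations and model predictions
y = np.array([9.8, 5.1, 2.4, 1.3])
fitted = np.array([10.0, 5.0, 2.5, 1.25])

# Assumed variance model: v_i = sigma2 * f_i^2 (constant coefficient of variation)
sigma2 = 0.0004
els_value = els_criterion(y, fitted, sigma2 * fitted ** 2)

# With unit variances the log term vanishes and the criterion reduces to OLS
els_unit = els_criterion(y, fitted, np.ones_like(y))
print(els_value, els_unit)
```

Under normality this quantity equals - 2 ln(likelihood) up to an additive constant.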

Summary

The criterion properties • It converges • It leads to consistent (asymptotically unbiased) estimates • It leads to efficient estimates • It has several minima

It converges: when the sample size increases, the criterion concentrates about a value of the parameter. Example: consider the homoscedastic model. The criterion to use is the least squares criterion.

It converges [Figure: the criterion for a small sample size and for a large sample size]
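The concentration of the criterion can be illustrated by simulation. The sketch below assumes a one-parameter model exp(-a t) with constant-variance Gaussian errors. It evaluates the least-squares criterion on a grid and measures the width of the set of values whose criterion lies within a fixed amount of the minimum. All numerical settings (true value, noise level, threshold, grid) are hypothetical.

```python
import numpy as np

def ss_on_grid(n, grid, rng, a_true=0.2, sigma=0.1):
    """Least-squares criterion on a grid of candidate values, for n observations."""
    t = np.linspace(0.5, 24.0, n)
    y = np.exp(-a_true * t) + rng.normal(scale=sigma, size=n)
    pred = np.exp(-grid[:, None] * t[None, :])      # one row per candidate value
    return np.sum((y[None, :] - pred) ** 2, axis=1)

rng = np.random.default_rng(1)
grid = np.linspace(0.05, 0.5, 451)
widths = {}
for n in (10, 1000):
    ss = ss_on_grid(n, grid, rng)
    close = grid[ss - ss.min() <= 0.05]             # values near the minimum
    widths[n] = close.max() - close.min()
    print(n, grid[ss.argmin()], widths[n])
```

The set of near-optimal values is much narrower for the large sample: the criterion becomes sharper around its minimum as n grows.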

It leads to consistent estimates The criterion concentrates about the true value

It leads to efficient estimates: for a fixed n, the variance of a consistent estimator is always greater than or equal to a limit (the Cramér-Rao lower bound). In other words, for a fixed n, the "precision" of a consistent estimator is bounded. An estimator is efficient when its variance equals this lower bound.

Geometric interpretation [Figure: the criterion] This ellipsoid is a confidence region for the parameter

It leads to efficient estimates: for a given large n, there is no criterion giving consistent estimates that is more "convex" than - 2 ln(likelihood). [Figure: the - 2 ln(likelihood) criterion compared with another criterion]

It has several minima [Figure: the criterion]
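A criterion with several minima is easy to produce. The sketch below uses a hypothetical periodic model sin(w t), chosen only because it makes the effect obvious, and counts the local minima of the least-squares criterion on a grid of candidate frequencies.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 50)
y = np.sin(2.0 * t)                        # noise-free hypothetical data, true w = 2

grid = np.linspace(0.5, 4.0, 351)          # candidate values of w
pred = np.sin(grid[:, None] * t[None, :])
ss = np.sum((y[None, :] - pred) ** 2, axis=1)

# Interior grid points lower than both neighbours are local minima
is_local_min = (ss[1:-1] < ss[:-2]) & (ss[1:-1] < ss[2:])
local_minima = grid[1:-1][is_local_min]
global_minimum = grid[ss.argmin()]
print(local_minima, global_minimum)
```

A descent algorithm started near one of the secondary minima would stop there, which is why the starting value matters.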

Minimize the criterion: suppose that the criterion to optimize has been chosen. We are looking for the value of θ, denoted θ̂, which achieves the minimum of the criterion. We need an algorithm to minimize such a criterion.

Example: consider the homoscedastic model. We are looking for the value of θ, denoted θ̂, which achieves the minimum of the criterion

Isocontours

Different families of algorithms • Zero order algorithms : computation of the criterion • First order algorithms : computation of the first derivative of the criterion • Second order algorithms : computation of the second derivative of the criterion

Zero order algorithms • Simplex algorithm • Grid search and Monte-Carlo methods

Simplex algorithm
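A simplex search only needs values of the criterion, no derivatives. A minimal sketch, assuming the Nelder-Mead simplex variant as implemented in `scipy.optimize.minimize` and a hypothetical mono-exponential data set:

```python
import numpy as np
from scipy.optimize import minimize

t = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 24.0])
y = 10.0 * np.exp(-0.2 * t)                # noise-free hypothetical data

def ss(theta):
    """Least-squares criterion for the model A exp(-a t)."""
    A, a = theta
    return float(np.sum((y - A * np.exp(-a * t)) ** 2))

# Zero order: the algorithm only compares criterion values at the simplex vertices
res = minimize(ss, x0=[5.0, 0.1], method="Nelder-Mead")
print(res.x, res.fun)
```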

Monte-Carlo algorithm
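A Monte-Carlo search can be sketched as a plain random search: draw many candidate parameter values, evaluate the criterion for each, keep the best. The search box, the number of draws and the data below are hypothetical.

```python
import numpy as np

t = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 24.0])
y = 10.0 * np.exp(-0.2 * t)                # noise-free hypothetical data

rng = np.random.default_rng(42)
n_draws = 20000

# Candidates drawn uniformly in an assumed box: 1 <= A <= 20, 0.01 <= a <= 1
A_cand = rng.uniform(1.0, 20.0, size=n_draws)
a_cand = rng.uniform(0.01, 1.0, size=n_draws)

# Criterion evaluated for every candidate (one row per candidate)
pred = A_cand[:, None] * np.exp(-a_cand[:, None] * t[None, :])
crit = np.sum((y[None, :] - pred) ** 2, axis=1)

best = crit.argmin()
A_best, a_best = A_cand[best], a_cand[best]
print(A_best, a_best, crit[best])
```

The result is coarse but does not depend on a starting value, so it can be used to initialise a faster first or second order algorithm.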

First order algorithms • Line search algorithm • Conjugate gradient

First order algorithms: the derivatives of the criterion vanish at its optima. Suppose that there is only one parameter θ to estimate. The criterion (e.g. SS) depends only on θ. How do we find the value(s) of θ where the derivative of the criterion vanishes?

Line search algorithm [Figure: derivative of the criterion versus θ, with successive iterates 1 and 2]
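In one dimension, one simple way to locate a zero of the derivative is to bracket it and halve the bracket repeatedly. The sketch below uses bisection, which is an assumption made here and not necessarily the scheme drawn in the figure. The model exp(-θ t) and the data are hypothetical.

```python
import math

t_obs = [0.5, 1.0, 2.0, 4.0, 8.0]
y_obs = [math.exp(-0.2 * t) for t in t_obs]     # noise-free hypothetical data

def d_criterion(theta):
    """Derivative of SS(theta) = sum (y - exp(-theta t))^2 with respect to theta."""
    return sum(2.0 * (y - math.exp(-theta * t)) * t * math.exp(-theta * t)
               for t, y in zip(t_obs, y_obs))

def bisect_root(f, lo, hi, tol=1e-10):
    """Find a zero of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("the derivative must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # the sign change is in the left half
        else:
            lo, f_lo = mid, f_mid    # the sign change is in the right half
    return 0.5 * (lo + hi)

theta_hat = bisect_root(d_criterion, 0.01, 1.0)
print(theta_hat)
```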

Second order algorithms • Gauss-Newton (Newton-type descent method) • Marquardt

Second order algorithms: the derivatives of the criterion vanish at its optima. When the criterion is (locally) convex, there is a path to reach the minimum: the direction of steepest descent.

Gauss-Newton (one dimension) [Figure: derivative of the criterion versus θ, with successive iterates 1, 2, 3] The criterion is convex

Gauss-Newton (one dimension) [Figure: derivative of the criterion versus θ, with successive iterates 1, 2] The criterion is not convex

Gauss-Newton
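With several parameters, a Gauss-Newton iteration linearises the model around the current value and solves the resulting linear least-squares problem. A minimal sketch, assuming the model A exp(-a t), hypothetical noise-free data, and a starting value close enough to the solution for the plain iteration to converge:

```python
import numpy as np

t = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 24.0])
y = 10.0 * np.exp(-0.2 * t)                # noise-free hypothetical data

def model(theta):
    A, a = theta
    return A * np.exp(-a * t)

def jacobian(theta):
    """Derivatives of the model with respect to A and a, one row per observation."""
    A, a = theta
    e = np.exp(-a * t)
    return np.column_stack([e, -A * t * e])

theta = np.array([8.0, 0.15])              # starting value (assumed reasonable)
for _ in range(20):
    r = y - model(theta)                   # current residuals
    # Gauss-Newton step: least-squares solution of J step = r
    step, *_ = np.linalg.lstsq(jacobian(theta), r, rcond=None)
    theta = theta + step
    if np.linalg.norm(step) < 1e-12:
        break
print(theta)
```

No safeguard is included: from a poor starting value the plain iteration can overshoot or diverge.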

Marquardt: deals with the case where the criterion is not convex. When the second derivative is negative (the first derivative decreases), it is set to a positive value. [Figure: derivative of the criterion versus θ, with successive iterates 1, 2, 3]
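A damped iteration in the spirit of Marquardt can be sketched by adding a positive term to the curvature matrix and adapting it: a step is accepted only if the criterion decreases, otherwise the damping is increased. The version below adds lam times the identity, multiplies or divides lam by 10, and starts from a deliberately poor value. These are illustrative choices, not the only possible ones, and the data are hypothetical.

```python
import numpy as np

t = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 24.0])
y = 10.0 * np.exp(-0.2 * t)                # noise-free hypothetical data

def model(theta):
    A, a = theta
    return A * np.exp(-a * t)

def jacobian(theta):
    A, a = theta
    e = np.exp(-a * t)
    return np.column_stack([e, -A * t * e])

def ss(theta):
    return float(np.sum((y - model(theta)) ** 2))

theta = np.array([5.0, 0.5])               # deliberately poor starting value
lam = 1e-2                                 # damping parameter
current = ss(theta)
for _ in range(200):
    J = jacobian(theta)
    r = y - model(theta)
    # Damped normal equations: (J'J + lam I) step = J'r
    step = np.linalg.solve(J.T @ J + lam * np.eye(2), J.T @ r)
    trial = theta + step
    trial_ss = ss(trial)
    if trial_ss < current:                 # improvement: accept, trust the model more
        theta, current = trial, trial_ss
        lam /= 10.0
    else:                                  # no improvement: reject, damp more
        lam *= 10.0
    if np.linalg.norm(step) < 1e-12:
        break
print(theta, current)
```

With a large damping the step is short and points downhill; with a small damping it approaches the Gauss-Newton step.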

Summary

Is this model relevant? • Graphical inspection of the residuals: mean model (f), variance model (g) • Inspection of numerical results: variance-correlation matrix of the estimator, Akaike information criterion (AIC)
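The numerical results mentioned above can be computed from the Jacobian of the mean model and the residuals at the estimate. The sketch below uses the usual large-sample approximation for the covariance matrix and one common least-squares form of the Akaike criterion, n ln(SS / n) + 2 p. Both are assumptions made here, and the function name, data and residuals are hypothetical.

```python
import numpy as np

def fit_diagnostics(jacobian, residuals):
    """Approximate covariance and correlation of the estimator, and an AIC value."""
    n, p = jacobian.shape
    ss = float(residuals @ residuals)
    sigma2 = ss / (n - p)                              # residual variance estimate
    cov = sigma2 * np.linalg.inv(jacobian.T @ jacobian)
    se = np.sqrt(np.diag(cov))                         # standard errors
    corr = cov / np.outer(se, se)
    aic = n * np.log(ss / n) + 2 * p                   # least-squares form, up to a constant
    return cov, corr, aic

# Hypothetical fit of A exp(-a t) with estimates A = 10, a = 0.2
t = np.array([1.0, 2.0, 4.0, 8.0])
e = np.exp(-0.2 * t)
J = np.column_stack([e, -10.0 * t * e])
res = np.array([-0.2, 0.1, -0.1, 0.05])

cov, corr, aic = fit_diagnostics(J, res)
print(np.sqrt(np.diag(cov)), corr[0, 1], aic)
```

A correlation close to 1 in absolute value suggests that the two parameters are poorly separated by the data. AIC values are only meaningful when compared between models fitted to the same observations.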

Graphical inspection of the residuals For the model Calculate the weighted residuals : and draw vs
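Weighted residuals are commonly obtained by dividing each raw residual by its modelled standard deviation, then plotted against the fitted values or the covariate. A minimal sketch under that assumption, with a hypothetical 2 % coefficient of variation and hypothetical data:

```python
import numpy as np

def weighted_residuals(y, fitted, sd):
    """Raw residuals scaled by the modelled standard deviation."""
    return (y - fitted) / sd

y = np.array([9.8, 5.1, 2.4, 1.3])
fitted = np.array([10.0, 5.0, 2.5, 1.25])
sd = 0.02 * fitted                         # assumed variance model: SD = 2 % of the mean

wres = weighted_residuals(y, fitted, sd)
for f_i, w_i in zip(fitted, wres):
    print(f_i, w_i)                        # pairs to plot: fitted value, weighted residual
```

If both the mean model and the variance model are adequate, these points should scatter around zero with no trend and a roughly constant spread.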
