Parameter Redundancy in Ecological Models
Diana Cole, University of Kent
Byron Morgan, University of Kent
Rachel McCrea, University of Kent
Ben Hubbard, University of Kent
Stephen Freeman, Centre for Ecology and Hydrology
Rémi Choquet, Centre d'Ecologie Fonctionnelle et Evolutive
Mike Titterington, University of Glasgow
Ted Catchpole, University of New South Wales
Photos: Mallard (Dawn Balmer, BTO), Sandwich Tern (Jill Pakenham, BTO), Herring Gulls, Striped Sea Bass, Wandering Albatross
Introduction
• If a model is parameter redundant, you cannot estimate all of its parameters.
• Parameter redundancy is equivalent to non-identifiability of the parameters.
• A model that is not parameter redundant is identifiable somewhere (it could be globally or locally identifiable).
• Parameter redundancy can be detected by symbolic algebra.
• Ecological models, and models in other areas, are getting more complex. Computer algebra packages can then fail to complete the symbolic calculation, and numerical methods are used instead.
• In this talk we use ecological examples to show some of the tools that can overcome this problem.
Introductory Example: Cormack-Jolly-Seber (CJS) Model, Capture-Recapture
Herring Gulls (Larus argentatus): capture-recapture data for 1983 to 1986 (Lebreton et al., 1995)
[Table: numbers ringed and numbers recaptured; ringing years 83, 84, 85; recapture years 84, 85, 86]
Photos: James McCrea
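Introductory Example: CJS Model
• $\phi_i$: probability a bird survives from occasion $i$ to $i+1$
• $p_i$: probability a bird is recaptured on occasion $i$
• Parameters: $\theta = [\phi_1, \phi_2, \phi_3, p_2, p_3, p_4]$
• Recapture probabilities (rows: ringing year; columns: recapture year): $Q = \begin{bmatrix} \phi_1 p_2 & \phi_1 (1-p_2) \phi_2 p_3 & \phi_1 (1-p_2) \phi_2 (1-p_3) \phi_3 p_4 \\ 0 & \phi_2 p_3 & \phi_2 (1-p_3) \phi_3 p_4 \\ 0 & 0 & \phi_3 p_4 \end{bmatrix}$
• $\phi_3$ and $p_4$ only ever appear as the product $\phi_3 p_4$, so only that product can be estimated: the model is parameter redundant (non-identifiable).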
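Derivative Method (Catchpole and Morgan, 1997)
• Calculate the derivative matrix $D$, whose $(i, j)$ entry is the derivative of the $j$th recapture probability with respect to the $i$th parameter.
• Number of estimable parameters = rank($D$).
• Deficiency = $p$ - rank($D$), where $p$ is the number of parameters.
• CJS example: rank($D$) = 5, so there are 5 estimable parameters and the deficiency is 6 - 5 = 1.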
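The rank calculation on this slide can be reproduced with a computer algebra system. The following is a minimal sketch using SymPy (the talk itself refers to Maple); the variable names and the ordering of the six recapture probabilities are assumptions made for illustration.

```python
import sympy as sp

# CJS parameters for three ringing years and three recapture years.
phi1, phi2, phi3, p2, p3, p4 = sp.symbols("phi1 phi2 phi3 p2 p3 p4")
theta = [phi1, phi2, phi3, p2, p3, p4]

# Non-zero recapture probabilities, stacked into one vector.
# The ordering of the terms is an arbitrary choice.
probs = sp.Matrix([
    phi1 * p2,
    phi1 * (1 - p2) * phi2 * p3,
    phi1 * (1 - p2) * phi2 * (1 - p3) * phi3 * p4,
    phi2 * p3,
    phi2 * (1 - p3) * phi3 * p4,
    phi3 * p4,
])

# Derivative matrix: entry (i, j) is d probs[j] / d theta[i],
# i.e. the transpose of the Jacobian.
D = probs.jacobian(theta).T

# Symbolic rank; simplify=True asks SymPy to simplify candidate pivots.
rank = D.rank(simplify=True)
deficiency = len(theta) - rank
print(rank, deficiency)
```

The reported rank and deficiency should agree with the values on the slide (5 and 1).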
Derivative or Jacobian Rank Test
• The Jacobian is the transpose of the derivative matrix, so the two have the same rank and are interchangeable.
• Uses of the rank test:
  • Catchpole and Morgan (1997): exponential family models; mostly used in ecological statistics.
  • Rothenberg (1971): original general use; examples from econometrics.
  • Goodman (1974): latent class models.
  • Shapiro (1986): non-linear regression models.
  • Pohjanpalo (1982): first use for compartment models.
Derivative or Jacobian Rank Test
• The key to the symbolic method for detecting parameter redundancy is to find a derivative matrix and its rank.
• Models are getting more complex, so the derivative matrix is structurally more complex and Maple runs out of memory calculating its rank.
• Examples: Hunter and Caswell (2009), Jiang et al. (2007).
• How do you proceed?
  • Numerically: the result is only valid for the specific parameter values used, it does not show which combinations of parameters can be estimated, and it cannot be generalised.
  • Symbolically: this involves extending the theory. It again uses a derivative matrix and its rank, but the derivative matrix is structurally simpler.
• Wandering Albatross: multi-state models for sea birds.
• Striped Sea Bass: age-dependent tag-return models for fish.
Exhaustive Summaries (Cole, Morgan and Titterington, 2010, Mathematical Biosciences)
• An exhaustive summary, $\kappa$, is a vector that uniquely defines the model (Walter and Lecourtier, 1982).
• Derivative matrix: $D = [\partial \kappa_j / \partial \theta_i]$, where $\theta$ is the vector of parameters.
• $r = \mathrm{rank}(D)$ is the number of estimable parameters in a model.
• With $p$ parameters, $d = p - r$ is the deficiency of the model (how many parameters you cannot estimate). If $d = 0$ the model is full rank (not parameter redundant, identifiable somewhere). If $d > 0$ the model is parameter redundant (non-identifiable).
• More than one exhaustive summary exists for a model.
• CJS example:
Exhaustive Summaries (Cole, Morgan and Titterington, 2010, Mathematical Biosciences)
• Choosing a simpler exhaustive summary will simplify the derivative matrix.
• CJS example:
• Computer packages such as Maple can find the symbolic rank of the derivative matrix if it is structurally simple.
• Exhaustive summaries can be simplified by any one-to-one transformation, such as multiplying by a constant or taking logs, and by removing repeated terms.
• A simpler exhaustive summary can also be found using reparameterisation.
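As an illustration of simplifying by a one-to-one transformation, the sketch below takes logs of the CJS recapture probabilities and compares the two derivative matrices. It uses SymPy rather than Maple, and the choice of the recapture probabilities as the starting exhaustive summary is an assumption made for this example, not necessarily the summary shown on the slide.

```python
import sympy as sp

phi1, phi2, phi3, p2, p3, p4 = sp.symbols("phi1 phi2 phi3 p2 p3 p4", positive=True)
theta = [phi1, phi2, phi3, p2, p3, p4]

# Assumed starting exhaustive summary: the non-zero recapture probabilities.
kappa = sp.Matrix([
    phi1 * p2,
    phi1 * (1 - p2) * phi2 * p3,
    phi1 * (1 - p2) * phi2 * (1 - p3) * phi3 * p4,
    phi2 * p3,
    phi2 * (1 - p3) * phi3 * p4,
    phi3 * p4,
])

# One-to-one transformation: take logs term by term, so products become sums.
kappa_log = kappa.applyfunc(lambda term: sp.expand_log(sp.log(term), force=True))

# Derivative matrices (rows: parameters, columns: summary terms).
D = kappa.jacobian(theta).T
D_log = kappa_log.jacobian(theta).T

rank_original = D.rank(simplify=True)
rank_log = D_log.rank(simplify=True)

# Entries of D_log are single reciprocals such as 1/phi1 or -1/(1 - p2).
sp.pprint(D_log)
print(rank_original, rank_log)
```

Both ranks should be the same, since a one-to-one transformation of the summary does not change which parameters can be estimated; only the entries of the matrix become simpler.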
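Methods For Use With Exhaustive Summaries: What can you estimate in a parameter redundant model?
Exponential family models: Catchpole, Morgan and Freeman (1998). Compartment models: Chappell and Gunn (1998) and Evans and Chappell (2000). Exhaustive summaries: Cole, Morgan and Titterington (2010, Mathematical Biosciences).
• A model has $p$ parameters, rank $r$ and deficiency $d = p - r$.
• There will be $d$ linearly independent non-zero solutions, $\alpha$, to $\alpha^T D = 0$.
• Zeros in the $\alpha$s indicate estimable parameters.
• CJS example: regardless of which exhaustive summary is used, $\alpha^T \propto [0, 0, -\phi_3/p_4, 0, 0, 1]$ for $\theta = [\phi_1, \phi_2, \phi_3, p_2, p_3, p_4]$.
• Solve PDEs to find the full set of estimable parameters.
• CJS example: the PDE is $-\frac{\phi_3}{p_4} \frac{\partial f}{\partial \phi_3} + \frac{\partial f}{\partial p_4} = 0$. Can estimate: $\phi_1$, $\phi_2$, $p_2$, $p_3$ and $\phi_3 p_4$.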
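The steps on this slide can be sketched in SymPy for the CJS example. This is an illustrative sketch, not the authors' Maple code: the exhaustive summary is assumed to be the vector of recapture probabilities, and the candidate function `phi3 * p4` is checked against the PDE rather than derived by solving it.

```python
import sympy as sp

phi1, phi2, phi3, p2, p3, p4 = sp.symbols("phi1 phi2 phi3 p2 p3 p4", positive=True)
theta = [phi1, phi2, phi3, p2, p3, p4]

# Assumed exhaustive summary: the non-zero CJS recapture probabilities.
kappa = sp.Matrix([
    phi1 * p2,
    phi1 * (1 - p2) * phi2 * p3,
    phi1 * (1 - p2) * phi2 * (1 - p3) * phi3 * p4,
    phi2 * p3,
    phi2 * (1 - p3) * phi3 * p4,
    phi3 * p4,
])
D = kappa.jacobian(theta).T  # rows: parameters, columns: summary terms

# alpha^T D = 0 is the same as D^T alpha = 0, so alpha spans the null space of D^T.
null_vectors = D.T.nullspace(simplify=True)
alpha = null_vectors[0].applyfunc(sp.simplify)

# Parameters whose alpha entry is zero are individually estimable.
estimable = [par for par, a in zip(theta, alpha) if a == 0]

# PDE: sum_i alpha_i * df/dtheta_i = 0. Check that f = phi3 * p4 satisfies it.
candidate = phi3 * p4
pde_lhs = sp.simplify(sum(a * sp.diff(candidate, par) for par, a in zip(theta, alpha)))

print(alpha.T, estimable, pde_lhs)
```

The number of null vectors equals the deficiency, and any function of `phi3 * p4` satisfies the same PDE, which is why the product is the estimable combination.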
Methods For Use With Exhaustive Summaries: Extension Theorem
(Catchpole and Morgan, 1997; extended to exhaustive summaries in Cole, Morgan and Titterington, 2010, Mathematical Biosciences)
• Suppose a model has exhaustive summary $\kappa_1$ and parameters $\theta_1$.
• Now extend that model by adding extra exhaustive summary terms $\kappa_2$ and extra parameters $\theta_2$ (e.g. add more years of ringing/recovery). The new model's exhaustive summary is $\kappa = [\kappa_1\ \kappa_2]^T$ and its parameters are $\theta = [\theta_1\ \theta_2]^T$.
• If $D_1 = [\partial \kappa_1 / \partial \theta_1]$ is full rank and $D_2 = [\partial \kappa_2 / \partial \theta_2]$ is full rank, the extended model will be full rank. The result can be further generalised by induction.
• The method can also be used for parameter redundant models by first rewriting the model in terms of its estimable set of parameters.
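One way to see why the theorem holds, sketched here under the convention that rows of the derivative matrix correspond to parameters and columns to exhaustive summary terms: the original summary terms do not depend on the new parameters, so the derivative matrix of the extended model is block upper triangular.

```latex
D = \frac{\partial \kappa}{\partial \theta}
  = \begin{bmatrix}
      \partial \kappa_1 / \partial \theta_1 & \partial \kappa_2 / \partial \theta_1 \\
      \partial \kappa_1 / \partial \theta_2 & \partial \kappa_2 / \partial \theta_2
    \end{bmatrix}
  = \begin{bmatrix}
      D_1 & \partial \kappa_2 / \partial \theta_1 \\
      0   & D_2
    \end{bmatrix}

% Suppose a^T D = 0 with a = (a_1, a_2).
% First block column:   a_1^T D_1 = 0, so a_1 = 0 because D_1 has full row rank.
% Second block column:  a_2^T D_2 = 0, so a_2 = 0 because D_2 has full row rank.
% Hence D has full row rank:
\operatorname{rank}(D) = \dim(\theta_1) + \dim(\theta_2)
```

The practical consequence is that only the small matrices $D_1$ and $D_2$ need to be examined, however many years of data are added.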
Methods For Use With Exhaustive Summaries: Extension Theorem
(Catchpole and Morgan, 1997; extended to exhaustive summaries in Cole, Morgan and Titterington, 2010, Mathematical Biosciences)
• Example: ring-recovery, Mallards.
• Birds are ringed and then recovered dead.
• Parameters:
• $D_1 = [\partial \kappa_1 / \partial \theta_1]$ is of full rank 6.
• Adding an extra year of recovery adds:
• $D_2 = [\partial \kappa_2 / \partial \theta_2]$ has rank 1, which is not full rank.
• Adding an extra year of ringing and recovery simultaneously adds:
• $D_2 = [\partial \kappa_2 / \partial \theta_2]$ is of full rank 2.
• In general $r = 2n_1$; $d = 0$ if $n_1 = n_2$.
Methods For Use With Exhaustive Summaries: The PLUR decomposition (Cole, Morgan and Titterington, 2010, Mathematical Biosciences)
• Write a derivative matrix of full rank $r$ as $D = PLUR$, where $P$ is a square permutation matrix, $L$ is a lower triangular square matrix with 1s on the diagonal, $U$ is an upper triangular square matrix, and $R$ is a matrix in reduced echelon form.
• If $\det(U) = 0$ at any point, the model is parameter redundant at that point (as long as $R$ is defined there). The deficiency of $U$ evaluated at that point is the deficiency of that nested model.
• Example: ring-recovery model: rank($D$) = 5. Therefore the nested model is parameter redundant with deficiency 1.
Finding simpler exhaustive summaries (Cole, Morgan and Titterington, 2010, Mathematical Biosciences)
• Choose a reparameterisation, $s$, that simplifies the model structure.
• CJS model (revisited):
• Reparameterise the exhaustive summary: rewrite the exhaustive summary, $\kappa(\theta)$, in terms of the reparameterisation, giving $\kappa(s)$.
Finding simpler exhaustive summaries (Cole, Morgan and Titterington, 2010, Mathematical Biosciences)
• Calculate the derivative matrix $D_s = [\partial \kappa_j(s) / \partial s_i]$.
• The number of estimable parameters = rank($D_s$).
• If $D_s$ is full rank, i.e. rank($D_s$) = dim($s$), then $s = s_{re}$ is a reduced-form exhaustive summary. If $D_s$ is not full rank, solve a set of PDEs to find a reduced-form exhaustive summary, $s_{re}$.
• CJS example: rank($D_s$) = 5, so there are 5 estimable parameters. There are 5 $s_i$ and rank($D_s$) = 5, so $D_s$ is full rank and $s$ is a reduced-form exhaustive summary.
Finding simpler exhaustive summaries (Cole, Morgan and Titterington, 2010, Mathematical Biosciences)
• Use $s_{re}$ as an exhaustive summary.
• CJS example: using the reduced-form exhaustive summary $s_{re} = s$, rank($D_2$) = 5, so there are 5 estimable parameters.
• Solving the PDEs, the estimable parameters are $\phi_1$, $\phi_2$, $p_2$, $p_3$ and $\phi_3 p_4$.
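The reparameterisation steps for the CJS model can be sketched as follows. The particular choice of $s$ below is an assumption made for illustration (the choice shown on the original slide is not recoverable from this transcript); it was picked because it has 5 terms and makes every recapture probability a simple product.

```python
import sympy as sp

phi1, phi2, phi3, p2, p3, p4 = sp.symbols("phi1 phi2 phi3 p2 p3 p4", positive=True)
theta = [phi1, phi2, phi3, p2, p3, p4]

# Original exhaustive summary: the non-zero recapture probabilities.
kappa_theta = sp.Matrix([
    phi1 * p2,
    phi1 * (1 - p2) * phi2 * p3,
    phi1 * (1 - p2) * phi2 * (1 - p3) * phi3 * p4,
    phi2 * p3,
    phi2 * (1 - p3) * phi3 * p4,
    phi3 * p4,
])

# Step 1: choose a reparameterisation s (hypothetical choice for this sketch).
s_of_theta = sp.Matrix([phi1 * p2, phi1 * (1 - p2), phi2 * p3, phi2 * (1 - p3), phi3 * p4])
s1, s2, s3, s4, s5 = sp.symbols("s1:6", positive=True)
s = [s1, s2, s3, s4, s5]

# Step 2: rewrite the exhaustive summary in terms of s.
kappa_s = sp.Matrix([s1, s2 * s3, s2 * s4 * s5, s3, s4 * s5, s5])
back_substituted = kappa_s.subs(dict(zip(s, s_of_theta)))
rewrite_is_exact = all(sp.expand(a - b) == 0 for a, b in zip(back_substituted, kappa_theta))

# Step 3: derivative matrix with respect to s and its rank.
D_s = kappa_s.jacobian(s).T
rank_Ds = D_s.rank(simplify=True)

# Step 4: if D_s is full rank, s is a reduced-form exhaustive summary,
# so differentiate s itself with respect to the original parameters.
D_2 = s_of_theta.jacobian(theta).T
rank_D2 = D_2.rank(simplify=True)

print(rewrite_is_exact, rank_Ds, rank_D2)
```

Both derivative matrices in this sketch are much sparser than the one obtained by differentiating the recapture probabilities directly, which is the point of the method for larger models.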
Reparameterisation: Multi-state Example (Cole, Morgan and Titterington, 2010)
Wandering Albatross
• Hunter and Caswell (2009) examine parameter redundancy of multi-state mark-recapture models, but cannot evaluate the symbolic rank of the derivative matrix (they developed a numerical method instead).
• 4-state breeding success model. States: 1 = success, 2 = failure, 3 = post-success, 4 = post-failure.
• Parameters: survival, breeding given survival, successful breeding, recapture.
Reparameterisation: Multi-state Example (Cole, Morgan and Titterington, 2010)
• Choose a reparameterisation, $s$, that simplifies the model structure.
• Rewrite the exhaustive summary, $\kappa(\theta)$, in terms of the reparameterisation, giving $\kappa(s)$.
Reparameterisation: Multi-state Example (Cole, Morgan and Titterington, 2010)
• Calculate the derivative matrix $D_s$.
• The number of estimable parameters = rank($D_s$). Here rank($D_s$) = 12, so there are 12 estimable parameters and the deficiency is 14 - 12 = 2.
• If $D_s$ is full rank, $s = s_{re}$ is a reduced-form exhaustive summary. If $D_s$ is not full rank, solve a set of PDEs to find a reduced-form exhaustive summary, $s_{re}$.
Reparameterisation: Multi-state Example (Cole, Morgan and Titterington, 2010)
• Use $s_{re}$ as an exhaustive summary.
Age-dependent tag-return models for estimating fishing mortality (Cole and Morgan, 2010, JABES)
Striped Sea Bass
• Jiang et al. (2007, JABES) developed an age-dependent fisheries model with parameters $F$ (fishing mortality), $M$ (natural mortality), $Sel_a$ (selectivity) and a reporting probability.
• A numerical method applied to $N = 3$ years of data with $K = 3$ age classes found the general model to be parameter redundant, with rank 15 and deficiency 1. It was assumed that this meant the full model was parameter redundant.
• In fact, for $N > 3$ and $K > 1$ the model is full rank, with rank = $3N + 3K - 2$.
• Problem of near redundancy.
• With all parameters constant ($Sel_a = 1$): rank = 2, deficiency = 1.
Parameter redundancy with covariates (Cole and Morgan, 2010, Biometrika)
• One approach to removing parameter redundancy is to include covariates in a model.
• Suppose that the model with covariates has $p_c$ parameters, and that the equivalent model without covariates has rank $q$.
• The rank of the model with covariates = $\min(p_c, q)$.
• Example: conditional ring-recovery model with parameters $\phi_1$, $\phi_a$, $\lambda_t$.
• For 4 years of ringing and 4 years of recovery:
  • Model without covariates: $p = 6$, $r = 4$, so the deficiency is 2.
  • Model with covariates: $\lambda_t = 1/\{1 + \exp(\alpha + \beta t)\}$, so $p_c = 4$, $r = \min(4, 4) = 4$ and $d = 0$.
Parameter redundancy in mark-recovery models (Cole, Morgan, Catchpole and Hubbard, submitted)
• The probability of an animal being ringed in year $i$ and recovered in year $j$ is $P_{i,j}$, a function of the survival probabilities and the recovery probabilities.
• Likelihood:
• Model notation y/z: y represents the survival probability and z represents the reporting probability, each of which can be constant (C) or dependent on age (A), time (T) or both (A,T).
• The rank of any ring-recovery model is limited by the number of terms in the exhaustive summary.
• For $n_1$ years of ringing and $n_2$ years of recovery, $E = n_1 n_2 - \tfrac{1}{2} n_1^2 + \tfrac{1}{2} n_1$.
Parameter redundancy in mark-recovery models (Cole, Morgan, Catchpole and Hubbard, submitted)
• How does the data affect parameter redundancy?
• Suppose there are $a$ main diagonals of data: $N_{i,j} = 0$ if $j - i + 1 > a$.
• An exhaustive summary consists of the $P_{i,j}$ with $N_{i,j} \neq 0$ and the probabilities of never being seen again.
• The reparameterisation method is used to find general results.
• There is now a maximum rank of $E_a = E - \tfrac{1}{2} (n_2 - a)(n_2 - a + 1)$.
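The two counting expressions can be checked by enumerating the cells of a ring-recovery table directly. The sketch below is plain Python with hypothetical function names; it assumes ringing years $i = 1, \dots, n_1$, recovery years $j = i, \dots, n_2$, and compares $E_a$ with the enumeration only for $n_1 \ge n_2 - a$ (which includes $n_1 = n_2$), an assumption about where the slide's expression is meant to apply.

```python
def count_cells(n1, n2, a=None):
    """Count table cells (i, j) with j >= i that can hold data.

    If a is given, only the first a main diagonals are kept,
    i.e. cells with j - i + 1 <= a.
    """
    count = 0
    for i in range(1, n1 + 1):
        for j in range(i, n2 + 1):
            if a is None or j - i + 1 <= a:
                count += 1
    return count


def E(n1, n2):
    """E = n1*n2 - n1^2/2 + n1/2, written with integer arithmetic."""
    return n1 * n2 - n1 * (n1 - 1) // 2


def E_a(n1, n2, a):
    """E_a = E - (n2 - a)(n2 - a + 1)/2."""
    return E(n1, n2) - (n2 - a) * (n2 - a + 1) // 2


# Compare the closed forms with direct enumeration.
for n2 in range(1, 9):
    for n1 in range(1, n2 + 1):
        assert E(n1, n2) == count_cells(n1, n2)
        for a in range(1, n2 + 1):
            if n1 >= n2 - a:
                assert E_a(n1, n2, a) == count_cells(n1, n2, a)

print(E(5, 5), E_a(5, 5, 2))
```

For example, with 5 years of ringing and recovery the full table has 15 cells, and keeping only 2 main diagonals leaves 9.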
Parameter redundancy in mark-recovery models (Cole, Morgan, Catchpole and Hubbard, submitted)
• Similar tables of results are also available for x/y/z models, where x represents first-year survival, y represents adult survival and z represents reporting probability.
• There are 24 models:
  • 3 remain unchanged for $a \geq 1$
  • 10 remain unchanged for $a \geq 2$
  • 3 remain unchanged for $a \geq 3$
  • 8 are limited by $E$/$E_a$
• A lot of data values can be zero while the number of estimable parameters remains unchanged.
Estimating age-specific survival rates from historical ring-recovery data (joint work with Stephen Freeman)
• Prior to 2000, BTO ringing data were submitted on paper forms, which have not yet been computerised.
• Free-flying birds can be categorised as:
  • Juveniles (birds in their first year of life)
  • "Adults" (birds over a year old)
• There are more than 700 000 paper records, listed by ringing number rather than by species.
• Each record indicates whether a bird was a juvenile or an adult at ringing.
• Recovered birds can be looked up and assigned to their age class at ringing.
• However, the totals in each category cannot easily be tabulated.
• There are also separate pulli data (birds ringed in the nest), for which totals are known.
Estimating age-specific survival rates from historical ring-recovery data (joint work with Stephen Freeman)
• Example ring-recovery data (simulated data)
Estimating age-specific survival rates from historical ring-recovery data (joint work with Stephen Freeman)
• Robinson (2010, Ibis) uses historical Sandwich Tern (Sterna sandvicensis) data as a case study.
• Robinson (2010) assumes a fixed proportion in each age class. For the Sandwich Terns this is 38% juvenile birds, based on the average proportion in the 2000-2007 computerised data, where the totals in each age class are known (range 25-47%).
• Using parameter redundancy theory, we show that this proportion can actually be estimated as an additional parameter.
Estimating age-specific survival rates from historical ring-recovery data (joint work with Stephen Freeman)
• The probability that a juvenile bird ringed in year $i$ is recovered in year $t$:
• The probability that an adult bird ringed in year $i$ is recovered in year $t$:
• Likelihood (includes a term for the number of birds never seen again):
Estimating age-specific survival rates from historical ring-recovery data (joint work with Stephen Freeman)
• Data simulated from a $\phi_1$, $\phi_a$, $\lambda$, $p$ model with $n_1 = 5$ and $n_2 = 5$.
• Results from 1000 simulations.
• The historic model is almost as good as the standard model.
Age-dependent mixture models for animals marked at unknown age (McCrea, Morgan and Cole, invited revision, Applied Statistics)
• Mallard data: birds ringed as first years, and birds ringed as adults of unknown age.
• If the data sets are modelled together in the standard way, adult survival is restricted to a single age class.
• Notation, indexed by time $t$ and age $a$:
  • $\phi_t(a)$: survival probability of an individual aged $a$.
  • $\lambda_t(a)$: reporting probability of an individual aged $a$.
  • The proportion of individuals marked that are age $a$.
• Recovery probabilities, indexed by marking time $i$ and recovery time $t$:
  • for age $a$: the probability an individual is marked at time $i$ at age $a$ and recovered dead at time $t$;
  • for unknown age ($u$): the probability an individual of unknown age is marked at time $i$ and recovered dead at time $t$.
Age-dependent mixture models for animals marked at unknown age (McCrea, Morgan and Cole, invited revision, Applied Statistics)
• If only adult-marked data are available, the model is always parameter redundant.
• However, it is possible to estimate adult annual survival.
• If we combine first-year data with adult data, we can estimate all parameters for many models.
• Notation: X/Y/Z/W denotes first-year survival / adult survival / reporting / q parameters, with $J+1$ age classes starting at age $J_0$, and $I$ years of ringing and years of recovery.
Which is the best method to use?
• The symbolic method is now possible in structurally complex models using reparameterisation, but the method is not automatic.
• Numerical methods can be automatic, but can be inaccurate.
• Develop general simpler exhaustive summaries, e.g. for multi-state models.
• Hybrid symbolic-numerical method.
Multi-state mark-recapture models for sea birds (Cole, to appear in Journal of Ornithology)
Wandering Albatross
• State 1: breeding site 1
• State 2: breeding site 2
• State 3: non-breeding (animals are unobservable in state 3)
• Parameters: survival; breeding; breeding at site 1, with one minus that probability giving breeding at site 2.
Multi-state mark-recapture models for sea birds (Cole, to appear in Journal of Ornithology)
Wandering Albatross
• The general multi-state model has $S$ states, the last $U$ of which are unobservable, with $N$ years of data.
• Probabilities for animals released in year $r$ and captured in year $c$:
• The transition matrix at time $t$ is an $S \times S$ matrix of transition probabilities, with entries $a_{i,j}(t)$.
• $P_t$ is an $S \times S$ diagonal matrix of capture probabilities $p_t$.
• $p_t = 0$ for an unobservable state.
Multi-state mark-recapture models for sea birds (Cole, to appear in Journal of Ornithology)
Wandering Albatross
• Rank $r = 10N - 17$; deficiency $d = N + 3$.
A Hybrid Symbolic-Numerical Method for Determining Model Structure (Choquet and Cole, invited review, Mathematical Biosciences)
• The derivative matrix is evaluated symbolically, and its rank is then determined at 5 random points. The model rank is equal to the maximum rank over the 5 points.
• The method can also determine which parameters can be estimated in parameter redundant models.
• Example: additive trap-dependence, with $\phi_t = \mathrm{logit}^{-1}(\beta_t)$, $p^*_t = \mathrm{logit}^{-1}(\beta_{t+3} + m)$ and $p_t = \mathrm{logit}^{-1}(\beta_{t+3})$, for $t = 1, \dots, 4$.
• Gimenez et al. (2003) found rank = 9; there was a problem with Maple not simplifying the logit functions.
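The idea can be illustrated on the small CJS example from the start of the talk. This is a sketch of the general approach only, not the authors' implementation: it uses SymPy, exact rational random points instead of floating-point values, and an arbitrary seed, all of which are assumptions made so that the example is repeatable.

```python
import random
import sympy as sp

phi1, phi2, phi3, p2, p3, p4 = sp.symbols("phi1 phi2 phi3 p2 p3 p4")
theta = [phi1, phi2, phi3, p2, p3, p4]

# Exhaustive summary: non-zero CJS recapture probabilities.
kappa = sp.Matrix([
    phi1 * p2,
    phi1 * (1 - p2) * phi2 * p3,
    phi1 * (1 - p2) * phi2 * (1 - p3) * phi3 * p4,
    phi2 * p3,
    phi2 * (1 - p3) * phi3 * p4,
    phi3 * p4,
])

# Symbolic part: differentiate once, with no symbolic rank calculation.
D = kappa.jacobian(theta).T

# Numerical part: evaluate D at 5 random points inside (0, 1) and take the
# maximum rank. Exact rationals avoid choosing a floating-point tolerance.
rng = random.Random(2010)  # arbitrary fixed seed
ranks = []
for _ in range(5):
    point = {par: sp.Rational(rng.randint(1, 999), 1000) for par in theta}
    ranks.append(D.subs(point).rank())
model_rank = max(ranks)
deficiency = len(theta) - model_rank

# Estimable parameters: zero entries in the null vector of D^T at a point
# (here the last random point drawn).
alpha = D.subs(point).T.nullspace()[0]
estimable = [par for par, a in zip(theta, alpha) if a == 0]

print(ranks, model_rank, deficiency, estimable)
```

Because the derivative matrix is never row-reduced symbolically, this approach avoids the memory problems of the fully symbolic method, at the cost of relying on the random points being typical.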
Other and future work
• Random effect models (joint with Rémi Choquet).
• Joint mark-recapture-recovery models (Ben Hubbard).
• Pledger et al. (2009)'s stopover models (Eleni Matechou). Photo: Sandpiper (stop-over models).
• (Discrete) state-space models for census data, e.g. Jose Lahoz-Monfort's integrated population model for Atlantic puffin: parameters $F_j$ confounded; $\phi_a$ estimable. Photo: Atlantic puffin (Jose Lahoz-Monfort).
Conclusion
• Exhaustive summaries offer a more general framework for symbolic detection of parameter redundancy.
• Parameter redundancy can be investigated symbolically by examining a derivative matrix and its rank.
• In the symbolic method we can find the estimable parameter combinations (via PDEs).
• The symbolic method can easily be generalised using the extension theorem.
• Parameter redundant nested models can be found using a PLUR decomposition of any full rank derivative matrix.
• Reparameterisation allows us to produce structurally much simpler exhaustive summaries, so that parameter redundancy can be examined symbolically in much more complex models.
• The methods are general and can in theory be applied to any parametric model.
References
• Catchpole, E. A. and Morgan, B. J. T. (1997) Biometrika, 84, 187-196.
• Catchpole, E. A., Morgan, B. J. T. and Freeman, S. N. (1998) Biometrika, 85, 462-468.
• Chappell, M. J. and Gunn, R. N. (1998) Mathematical Biosciences, 148, 21-41.
• Choquet, R. and Cole, D. J. (2010) A Hybrid Symbolic-Numerical Method for Determining Model Structure. University of Kent Technical Report UKC/SMSAS/10/016.
• Cole, D. J., Morgan, B. J. T. and Titterington, D. M. (2010) Determining the Parametric Structure of Non-Linear Models. Mathematical Biosciences, 228, 16-30.
• Cole, D. J. (2010) Determining parameter redundancy of multi-state mark-recapture models for sea birds. To appear in Journal of Ornithology.
• Cole, D. J. and Morgan, B. J. T. (2010a) A note on determining parameter redundancy in age-dependent tag return models for estimating fishing mortality, natural mortality and selectivity. JABES, 15, 431-434.
• Cole, D. J. and Morgan, B. J. T. (2010b) Parameter redundancy with covariates. Biometrika, 97, 1002-1005.
• Gimenez, O., Choquet, R. and Lebreton, J. D. (2003) Parameter redundancy in multi-state capture-recapture models. Biometrical Journal, 45, 704-722.
• McCrea, R. S., Morgan, B. J. T. and Cole, D. J. (2010) Age-dependent models for recovery data on animals marked at unknown age. Technical Report UKC/SMSAS/10/020.
• Robinson, R. A. (2010) Estimating age-specific survival rates from historical data. Ibis, 152, 651-653.
• Evans, N. D. and Chappell, M. J. (2000) Mathematical Biosciences, 168, 137-159.
• Goodman, L. A. (1974) Biometrika, 61, 215-231.
• Hunter, C. M. and Caswell, H. (2009) Environmental and Ecological Statistics, 3, 797-825.
• Jiang, H., Pollock, K. H., Brownie, C., et al. (2007) JABES, 12, 177-194.
• Lebreton, J., Morgan, B. J. T., Pradel, R. and Freeman, S. N. (1995) Biometrics, 51, 1418-1428.
• Pledger, S., Efford, M., Pollock, K., Collazo, J. and Lyons, J. (2009) Environmental and Ecological Statistics Series: Volume 3.
• Pohjanpalo, H. (1982) Technical Research Centre of Finland Research Report No. 56.
• Rothenberg, T. J. (1971) Econometrica, 39, 577-591.
• Shapiro, A. (1986) Journal of the American Statistical Association, 81, 142-149.
• Walter, E. and Lecourtier, Y. (1982) Mathematics and Computers in Simulation, 24, 472-482.