Lecture 6 review
• Bioenergetics models either
  (1) predict growth from predictions of food consumption and metabolism/reproduction, or
  (2) back-calculate food consumption from growth and predictions of metabolism/reproduction
• Back-calculation of food intake (needed for analysis of trophic interactions) can be done from growth curves
• Bioenergetics models that account for seasonal variation in food and for temperature effects are needed for interpretation of tagging data
• Allocation of energy to reproduction is critical for understanding growth curves
Lecture 7: Environmental tolerances and niches, statistical concepts (likelihood, etc.)
• In fisheries, discussions about environmental tolerances and niches are about what is now widely called “essential fish habitat” (EFH)
• Such discussions should focus on the effects of habitat change on recruitment, rather than the traditional concern with explaining why fish have been distributed as they have in the past
• Most EFH discussions have failed to clearly define “habitat”, to distinguish between preferences and requirements, and to fully account for life history trajectories
• This lecture will examine niche/habitat issues from the perspective of applied prediction: what models work?
Why you really need to worry about habitat prediction
• For every dollar spent on population studies, management agencies now spend about ten dollars on “habitat management”
• Habitat models have mostly not been developed by people with population dynamics understanding
• Habitat change is a convenient excuse for harvest mismanagement and for technological fixes comparable to hatcheries
• If you are too stupid to do anything else, become a habitat biologist (e.g. California MLPA design).
What are the main physical habitat variables X_i that limit fish use (define physical niche dimensions)?
[Figure: tolerance limits along niche axes X1, X2, X3]
• Temperature (tarpon)
• Oxygen (where is the biggest dead zone?)
• Salinity (estuaries: tolerance or refuge?)
• Velocity (not just streams)
Can you correctly map fish distributions using physical habitat variables and niche tolerances?
• This is the “PHABSIM” approach.
• Distributions reflect preference as well as requirement, especially for species that have shown range contraction under fishing (other fish are part of “habitat”).
• What about biotic niche dimensions, particularly predation risk and food availability?
• Bottom line: physical variables tell us little.
What about prediction of distributions based on observed associations with habitat “types”, e.g. rocky bottoms, seagrass?
• Types are difficult to define, e.g. patchy vs continuous seagrass
• The types where you catch fish are usually their resting/hiding places, not all of the habitat they need!
Bell et al. (2001) Biol. Cons. 100: 115-123
Brad Robbins (Mote) seagrass studies: fish prefer patchy, not continuous grass
Examples of where simple habitat restoration thinking (e.g. natural flow paradigm) could be causing more harm than good
• Grand Canyon: will warming the water restore native fish, or exotic fish?
• Bridge River: salmon abundance in steep canyon much higher under regulated flows
• Macrophyte development in southern lakes and reservoirs: breeding bluegill at the expense of bass
• Louisiana: planting marsh grasses on subsiding landscapes
• Salmon streams: replacing nursery habitats with unneeded spawning areas
• Shrimp trawling: bycatch reduction helps restore benthic communities, but at the expense of shrimp production
Lessons from these examples
• Biotic “habitat” factors are at least as important as physical ones, especially when exotic species are present
• Improving conditions for one life stage does not mean improving them for all stages
• “Natural” habitat conditions are not always the best: it is a blatant logical fallacy to assert that being best adapted to a particular habitat structure implies that particular structure to be the best one.
Using the Beverton-Holt recruitment model to integrate predicted impacts of changes in food, predation, and habitat size
• The B-H model R = aE/(1+bE) can be written as
  [equation not preserved in this transcript]
• Here, P_t is predation risk, C_t is food supply, and H_t is habitat size
• (P, C, H can be measured in any convenient relative units; fit to data is determined by k_1, k_2.)
A good way to look at the Beverton-Holt prediction is by plotting ln(R_t/E_t) vs E_t
[Figure: two panels of ln(R_t/E_t) against E_t, one showing changes in P or C, the other showing changes in H]
• ln(R_t/E_t) is the log of survival rate from egg to recruitment; it is predicted to vary with predation, food, and habitat as
  [equation not preserved in this transcript]
• Which looks nasty, but has a simple shape when you plot it (nearly linear decrease)
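The shape of the plot can be sketched from the generic form R = aE/(1+bE), which gives ln(R/E) = ln(a) - ln(1+bE). The sketch below uses only that generic form; the values of a and b are hypothetical, and the way P, C, and H enter a and b is not reproduced here.

```python
import math

def bh_recruits(eggs, a, b):
    """Beverton-Holt recruitment R = a*E / (1 + b*E)."""
    return a * eggs / (1.0 + b * eggs)

def log_survival(eggs, a, b):
    """Log egg-to-recruit survival, ln(R/E) = ln(a) - ln(1 + b*E)."""
    return math.log(bh_recruits(eggs, a, b) / eggs)

# Hypothetical parameter values, chosen for illustration only
a, b = 0.05, 2.0e-4
egg_levels = [500.0, 1000.0, 1500.0, 2000.0]
curve = [log_survival(e, a, b) for e in egg_levels]

for e, y in zip(egg_levels, curve):
    print(f"E={e:7.0f}  ln(R/E)={y:.3f}")
```

Over this range of E the values fall steadily and close to a straight line, which is the simple shape the slide refers to.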
Example: Several Bristol Bay sockeye stocks increased after 1970; was this due to change in P/C or H? (gray points and line are pre-1970, black points and line are more recent years)
Key statistical concepts
• What is a likelihood function, and why do we use them?
• In defining likelihood functions, what do we mean by an “observation”, and why is this very different from what we mean by a “measurement”?
• What is the distinction between a likelihood function and a Bayes posterior probability?
• Given various model choices, how can we decide a “best” choice, besides using something misleading like the Akaike criterion (AIC)?
Why use the dreaded likelihood function instead of simple SS, when fitting models to data?
• To impress your colleagues
• To get the most information from your data, if you know the form of the data distribution
• To combine heterogeneous types of data so as to get the most information from all of them simultaneously
• To combine your data with prior information about parameters (prior distributions for parameters), in which case the likelihood+prior function becomes a “Bayes posterior distribution”
• To allow use of information-theoretic criteria for comparing alternative models (AIC, etc.)
So what is a likelihood function?
• It is just a probability equation (model) for the data x, given the parameters, e.g. for a normally distributed observation: L(x|u,σ²) = (2πσ²)^(-1/2) exp[-0.5(x-u)²/σ²]
• If we treat the observation x as fixed, varying the mean u tells us what value of u would make x most probable; this value of u is the most parsimonious estimate of u, i.e. the maximum likelihood estimate of u
We always work with log likelihoods (they scale better, and can be added over independent observations)
• Normal: lnL(x|u,σ²) = -0.5(x-u)²/σ² - 0.5 ln(σ²)
• Note here that you can drop any additive log terms (e.g. -0.5 ln(2π)) that involve only constants or depend only on the data; such terms have no effect on comparisons of parameter values (e.g. finding parameter values that maximize lnL, comparing particular parameter values using likelihood ratios)
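A quick numerical check of the point about additive constants, as a minimal sketch: the full normal log density carries the constant -0.5 ln(2π), and dropping it leaves the log likelihood difference between two candidate means unchanged. The observation, variance, and candidate means are hypothetical.

```python
import math

def lnl_full(x, u, sigma2):
    """Normal log density including the constant -0.5*ln(2*pi)."""
    return (-0.5 * (x - u) ** 2 / sigma2
            - 0.5 * math.log(sigma2)
            - 0.5 * math.log(2.0 * math.pi))

def lnl_reduced(x, u, sigma2):
    """Same log likelihood with the parameter-free constant dropped."""
    return -0.5 * (x - u) ** 2 / sigma2 - 0.5 * math.log(sigma2)

# Hypothetical observation, variance, and two candidate means
x, sigma2 = 3.2, 1.5
u_a, u_b = 3.0, 4.0

diff_full = lnl_full(x, u_a, sigma2) - lnl_full(x, u_b, sigma2)
diff_reduced = lnl_reduced(x, u_a, sigma2) - lnl_reduced(x, u_b, sigma2)
print(diff_full, diff_reduced)  # the two log likelihood ratios agree
```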
The common log likelihood functions in fisheries (4 of them)
• For any n observations that are assumed to have the same distribution, SS = Σ(x_i - u_i)² (note each x_i may have a different predicted mean u_i)
• Normal, fixed variance: lnL = -0.5 SS/σ²
• Normal, m.l. est of variance: lnL = -(n/2) ln(SS) (this is also the Student’s T distribution)
• Poisson, counts x_i: lnL = -Σu_i + Σx_i ln(u_i)
• Multinomial, counts x_i: lnL = Σx_i ln(p_i), where p_i is the probability of a type i event
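The four forms above translate almost line for line into code. This is a minimal sketch with parameter-free constants dropped, as on the slide; the function names and the example counts are illustrative choices, not from the lecture.

```python
import math

def sum_squares(x, u):
    """SS = sum of (x_i - u_i)^2; each x_i may have its own predicted mean u_i."""
    return sum((xi - ui) ** 2 for xi, ui in zip(x, u))

def lnl_normal_fixed_var(x, u, sigma2):
    """Normal with known variance: lnL = -0.5 * SS / sigma^2."""
    return -0.5 * sum_squares(x, u) / sigma2

def lnl_normal_ml_var(x, u):
    """Normal with variance at its m.l. estimate: lnL = -(n/2) * ln(SS)."""
    return -0.5 * len(x) * math.log(sum_squares(x, u))

def lnl_poisson(x, u):
    """Poisson counts: lnL = -sum(u_i) + sum(x_i * ln(u_i))."""
    return -sum(u) + sum(xi * math.log(ui) for xi, ui in zip(x, u))

def lnl_multinomial(x, p):
    """Multinomial counts: lnL = sum(x_i * ln(p_i)); zero counts contribute nothing."""
    return sum(xi * math.log(p_i) for xi, p_i in zip(x, p) if xi > 0)

# Hypothetical counts: compare two candidate Poisson means
counts = [3, 5, 4]
lnl_mean_4 = lnl_poisson(counts, [4.0] * 3)
lnl_mean_2 = lnl_poisson(counts, [2.0] * 3)
print(lnl_mean_4 > lnl_mean_2)  # the sample mean (4) is better supported
```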
Why assume the normal distribution?
• Normal arises when any sum x1+x2+… of independent random variables is formed (the sum is approximately normal, Central Limit Theorem)
• Most “observations” (e.g. cpue) are sums or products of more primitive measurements.
• Normal is also the “minimum information” distribution (makes the weakest possible assumptions about the data besides finite mean and variance); just because the normal equation is complex does not mean that it makes complex assumptions about the data
Leading and nuisance parameters
• Model parameter sets, e.g. (u, σ²) of the normal, can be partitioned into two subsets
  • “Leading” parameters of interest (u, No, K)
  • “Nuisance” observation parameters (q, σ²)
• By differentiating lnL with respect to the nuisance parameters, we can often obtain analytical expressions for their m.l. estimates:
  σ² = SS/n
  q = exp{(1/n) Σ log(cpue_t/N_t)}
• Putting these expressions directly into the calculation of lnL then allows us to use numerical search methods only to find best estimates of the leading parameters
• Plotting lnL values vs parameter values for such nuisance-maximized lnL’s is called “likelihood profiling”
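The procedure above can be sketched as follows: for each trial value of a leading parameter, plug in the analytical estimates of q and σ², then record the resulting lnL. The sketch assumes lognormal observation error on cpue and a simple depletion model for N_t; the depletion model, the catch and cpue numbers, and the grid are all hypothetical.

```python
import math

def abundance(n0, catches):
    """Hypothetical depletion model: N_t is N0 minus the catch removed before t."""
    n, out = n0, []
    for c in catches:
        out.append(n)
        n -= c
    return out

def concentrated_lnl(n0, cpue, catches):
    """lnL with the nuisance parameters q and sigma^2 set to their m.l. estimates."""
    n_t = abundance(n0, catches)
    n = len(cpue)
    log_ratio = [math.log(y / nt) for y, nt in zip(cpue, n_t)]
    log_q = sum(log_ratio) / n                     # q = exp{(1/n) sum log(cpue_t/N_t)}
    ss = sum((z - log_q) ** 2 for z in log_ratio)  # SS on the log scale
    sigma2 = ss / n                                # sigma^2 = SS/n
    return -0.5 * n * math.log(ss), math.exp(log_q), sigma2

# Hypothetical data
catches = [100.0, 150.0, 120.0, 130.0]
cpue = [2.1, 1.75, 1.55, 1.22]

# Profile over the leading parameter N0 on a grid from 600 to 2000
grid = [600.0 + 50.0 * i for i in range(29)]
profile = [(n0, concentrated_lnl(n0, cpue, catches)[0]) for n0 in grid]
best_n0, best_lnl = max(profile, key=lambda pair: pair[1])
print(best_n0, best_lnl)
```

Plotting the second element of each `profile` pair against the first gives the lnL profile for N0; only N0 is searched numerically, while q and σ² are recomputed analytically at every grid point.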
Likelihood profiles are a great way to describe uncertainty about parameter estimates
[Figure: two panels, “The lnL Profile” and “The L Profile”, exp(lnL)]
• The L profile is much more informative
• Y axis scaling (units of measurement) is completely irrelevant in comparing hypotheses about the mean u
Comparing alternative models
• Two fundamentally different approaches
  • Fitting/data prediction (information theory) criteria: AIC = -2lnL + 2K, K = number of parameters
  • Policy parameter estimation error criteria
• Burnham and Anderson have made AIC criteria very popular, without apparently realizing that our aim in applied analysis is typically not to predict or explain observations, but rather to make policy prescriptions.
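For reference, the AIC formula quoted above is trivial to evaluate. The sketch below uses two hypothetical fits to show how the 2K penalty can outweigh a small gain in lnL; the lnL values and parameter counts are invented for illustration.

```python
def aic(lnl, k):
    """Akaike criterion: AIC = -2*lnL + 2*K, with K the number of parameters."""
    return -2.0 * lnl + 2.0 * k

# Hypothetical fits: the larger model fits slightly better but uses more parameters
aic_simple = aic(-52.3, 2)
aic_complex = aic(-51.9, 4)
print(aic_simple, aic_complex)  # the lower value is preferred under this criterion
```

Whether the model preferred this way also gives the best policy parameter estimate is a separate question, which is the point of the slide.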
Estimating policy parameter error
• Suppose a policy parameter H can be calculated from model parameters, i.e. H = f(θ) where θ is a set of parameters θ = (Ro, K, …)
• For a given model, we can calculate an approximate covariance matrix Vθ for the parameters: Vθ ≈ σ²(J’J)⁻¹, where J is the matrix of sensitivities of the model predictions to the parameters, σ² measures goodness of fit, and (J’J)⁻¹ measures parameter confounding.
• Var(H) ≈ g’Vθg where g = {∂H/∂θ}, i.e. how sensitive the estimate of H is to each parameter in θ.
• So uncertainty about H (variance of the H estimate) can be calculated from σ² and (J’J)⁻¹ along with information about the sensitivity of H to each θ.
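The Var(H) ≈ g’Vθg calculation can be sketched as below, with g obtained by central differences. The policy function H = rK/4 (logistic MSY), the parameter estimates, and the covariance matrix are all hypothetical stand-ins chosen for illustration; they are not taken from the lecture.

```python
def numerical_gradient(f, theta, rel_step=1e-6):
    """Central-difference estimate of g = dH/dtheta."""
    grad = []
    for i, value in enumerate(theta):
        h = rel_step * max(abs(value), 1.0)
        up = list(theta)
        down = list(theta)
        up[i] = value + h
        down[i] = value - h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

def delta_variance(f, theta, v_theta):
    """Var(H) ~ g' V g, with V the parameter covariance matrix."""
    g = numerical_gradient(f, theta)
    n = len(theta)
    return sum(g[i] * v_theta[i][j] * g[j] for i in range(n) for j in range(n))

def msy(theta):
    """Hypothetical policy parameter: H = r*K/4 for a logistic model."""
    r, k = theta
    return r * k / 4.0

# Hypothetical estimates of (r, K) and their covariance matrix
theta_hat = [0.4, 1000.0]
v_theta = [[0.01, -4.0],
           [-4.0, 10000.0]]

var_h = delta_variance(msy, theta_hat, v_theta)
print(var_h)
```

The negative off-diagonal term illustrates parameter confounding: here the negative covariance between r and K reduces Var(H) below what the two variances alone would give.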
Here is the basic problem with using more complex models to estimate policy parameters H
[Figure: components of Vθ and Var(H) plotted against the number of parameters in θ]
• Elements of (X’X)⁻¹: each parameter becomes more uncertain when more parameters are estimated
• Model fit error: gets smaller when more parameters are used to explain the data
How it all comes home to roost
• Highly detailed model: admits major uncertainties and is unbiased, but can be highly in error
• Precise Gulland model with too few parameters: always overestimates MSY
Another example: mark-recapture experiments to determine ESA jeopardy
• Population estimates below some threshold trigger large investments, e.g. research programs, fishery closures, habitat mitigation actions
• Suppose a closed mark-recapture experiment is done, and N consists of categories of animals N_i that have different capture probabilities P_i (e.g. small fish often have lower P_i)
• In analysis of results from such experiments, the minimum AIC model is typically (always?) to assume the same P_i for all classes, i.e. pool the recapture data. This typically (always?) causes downward bias in the estimate of N = ΣN_i
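The downward bias from pooling can be illustrated with expected values alone, as a minimal sketch. It assumes a two-sample experiment analyzed with the Lincoln-Petersen estimator (marks × catch / recaptures), the same capture probability in both samples for each class, and expected counts in place of random draws; the estimator choice, class sizes, and capture probabilities are assumptions made for illustration.

```python
def expected_petersen(n_by_class, p_by_class):
    """Pooled and per-class Petersen estimates computed from expected counts."""
    # Expected marks per class; also the expected second-sample catch per class
    marked = [n * p for n, p in zip(n_by_class, p_by_class)]
    # Expected recaptures per class: caught in both samples
    recaptured = [n * p * p for n, p in zip(n_by_class, p_by_class)]
    pooled = sum(marked) * sum(marked) / sum(recaptured)
    stratified = sum(m * m / r for m, r in zip(marked, recaptured))
    return pooled, stratified

# Hypothetical population: 600 small fish (P = 0.1) and 400 large fish (P = 0.4)
n_by_class = [600.0, 400.0]
p_by_class = [0.1, 0.4]

pooled, stratified = expected_petersen(n_by_class, p_by_class)
print(round(pooled, 1), round(stratified, 1), sum(n_by_class))
```

With these numbers the pooled estimate falls well short of the true total of 1000, while estimating each class separately recovers it. With real, noisy data the per-class estimates would of course be much less precise.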
Another example: mark-recapture experiments to determine ESA jeopardy
[Figure: P(N) against N for an unbiased but imprecise estimator (“too many” N_i estimated) and for a biased, precise estimator for just total N (best AIC); cost against assumed N]
So if likelihood functions are so great, what goes wrong in practice?
• CONTAMINATION
  • Actual distributions often much wider than assumed (estimates appear too precise)
  • Unmodeled process errors (components of variation in the u_i)
  • Bad sampling assumptions (e.g. multinomial effective sample size typically much smaller than number of items classified)
• CONTRADICTION
  • Structural assumption(s) violated for at least one data type, e.g. c.p.u.e. series where one series goes up, another goes down
  • DO NOT COMBINE SUCH DATA SETS, UNLESS YOU WANT TO BE A NMFS STOCK ASSESSMENT SCIENTIST!