Introduction to statistical estimation methods. Finse Alpine Research Center, 10-11 September 2010. OUTLINE. Focus of the course: Introduce essential methods for statistical modelling in ecology Construction of biologically sound models
Introduction to statistical estimation methods Finse Alpine Research Center, 10-11 September 2010
OUTLINE
• Focus of the course:
  • Introduce essential methods for statistical modelling in ecology
  • Construction of biologically sound models
  • Estimation of parameter values and associated uncertainties
  • Interpretation of results
• Introduce concepts that are important for the course next week
TODAY: Mostly maximum likelihood
TOMORROW: Mostly Bayesian statistics
MATERIAL ON: http://www.finse.uio.no/
Many exercises, some lectures, tutoring
Be active! Ask questions. Help each other.
SUNDAY: Day off / Glacier hike
MONDAY TO FRIDAY: Occupancy modelling workshop (3 new lecturers, joining on the glacier hike)
Most studies in ecology require quantification in some way.
Quantification of:
• relationships between variables
• differences between groups of individuals
• the effect of experimental treatments
• predictions for the future (effects of climate change)
• the effect of management strategies
• and not least: uncertainty!
Quantification of anything requires:
• some sort of model
• ways to estimate parameters / distributions of random variables
Claim: • In ecology, the main question is seldom IF something has an effect • The questions are more about HOW and HOW MUCH
Example: How does habitat quality affect energy expenditure?
[Figure: diagram of variables affecting energy expenditure (field metabolic rate): HABITAT (marked "?"), body mass, sex, reproductive state, activity / behaviour, weather, temperature, season, and OTHER THINGS (biological things + measurement error)]
• The question should not be IF these variables have an effect: from biological theory we can be almost certain that all of them do.
• Relationships in ecology are almost infinitely complex (there is no true model)
• “All models are wrong, but some are useful” (Box)
“Typical approach”:
• Put everything into a linear model (multiple regression), without thinking about HOW the various predictor variables can affect the response variable
• Remove non-significant effects, without thinking about what you are really interested in
• Report p-values, without quantifying HOW MUCH the predictor variables affect the response variable, and without thinking about BIOLOGICAL SIGNIFICANCE
[Figure: the same diagram of variables affecting energy expenditure (field metabolic rate) as on the previous slide]
Statistical significance vs. biological relevance
Null-hypothesis tests are often used erroneously to classify results as “no effect” (not significant) or “significant effect”, with no consideration of the potential biological significance (a somewhat thoughtless process).
E.g. statements like “Predator density did not affect prey survival” with no further detail on effect size.
[Figure: 5 different confidence intervals on an effect-size axis (small to large), labelled NS, p<0.05, NS, p<0.05 and p<0.001, with regions marked “Not biologically significant”, “Could be important – more data needed” and “Biologically significant”]
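The distinction drawn on this slide can be sketched in a few lines of Python. This is only an illustration, not the lecturers' method: the function name `classify_effect` and the `threshold` argument (the smallest effect size one would consider biologically relevant) are assumptions introduced here, and choosing that threshold is a biological judgement, not a statistical one.

```python
def classify_effect(ci_low, ci_high, threshold):
    """Classify a confidence interval for an effect size.

    threshold is a hypothetical value (an assumption of this sketch):
    the smallest absolute effect regarded as biologically relevant.
    Returns (statistically_significant, relevance_label).
    """
    if ci_low > ci_high or threshold <= 0:
        raise ValueError("need ci_low <= ci_high and threshold > 0")

    # "Significant" in the null-hypothesis sense: the interval excludes zero.
    statistically_significant = ci_low > 0 or ci_high < 0

    if -threshold < ci_low and ci_high < threshold:
        # The whole interval lies within the range of negligible effects.
        relevance = "not biologically significant"
    elif ci_low >= threshold or ci_high <= -threshold:
        # The whole interval lies beyond the relevance threshold.
        relevance = "biologically significant"
    else:
        # The interval overlaps both negligible and relevant effects.
        relevance = "could be important, more data needed"
    return statistically_significant, relevance
```

With a threshold of 1.0, for instance, an interval of (0.1, 0.3) is statistically significant yet biologically negligible, while (-0.5, 2.0) is "NS" but cannot rule out an important effect.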
Number of papers questioning the utility of null hypothesis testing in scientific research (Anderson et al. 2000)
• Null-hypothesis testing in ecological science:
  • Yoccoz, N. G. 1991. Use, overuse, and misuse of significance tests in evolutionary biology and ecology. Bulletin of the Ecological Society of America 72:106-111.
  • Anderson, D. R., K. P. Burnham, and W. L. Thompson. 2000. Null hypothesis testing: Problems, prevalence, and an alternative. Journal of Wildlife Management 64:912-923.
• Web page: http://www.cnr.colostate.edu/~anderson/
[Figure: bias² and variance (uncertainty) plotted against number of parameters (K), ranging from “too simple model” to “too complex model”; frequency (Freq.) distributions of predictions are shown relative to the truth]
AIC = deviance + 2 × no. parameters + small sample correction
Using p-values for model selection is a different thing.
In a set of models, the model with the lowest AIC will, on average, be the model with the lowest Kullback-Leibler (K-L) distance (i.e., give predictions closest to the truth).
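The AIC expression on this slide can be written out as a minimal Python sketch. It assumes that "deviance" means −2 × the maximised log-likelihood and that the small sample correction is the usual AICc term 2K(K+1)/(n−K−1); the slide itself does not spell out either, and the function names are illustrative.

```python
def aic(log_likelihood, k):
    """AIC = deviance + 2K, taking deviance as -2 * maximised log-likelihood."""
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood, k, n):
    """AIC plus the small-sample correction 2K(K+1)/(n-K-1).

    k is the number of estimated parameters, n the sample size.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + (2.0 * k * (k + 1)) / (n - k - 1)
```

The correction term shrinks towards zero as n grows relative to K, so the two criteria agree for large samples.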
Think about HOW things are related …
Example: Influence of testosterone on size of home range in voles.
• 60 sites; testosterone-treated males and control males
• Do testosterone-treated males have larger home ranges at high densities? What are the effects at low densities?
Response variable: Home range size, measured by radio-telemetry
Predictor variables: Treatment, Density, Body mass
Response variable: home range size

Full model:
                         F value   Pr(F)
  Body mass               15.79    <0.001
  Treatment                0.99     0.32
  Density                104.90    <0.001
  Body mass × Treatment    0.003    0.96
  Density × Treatment      2.66     0.11

Step 1:
                         F value   Pr(F)
  Body mass               15.95    <0.001
  Treatment                1.00     0.32
  Density                105.95    <0.001
  Density × Treatment      2.63     0.11

Step 2:
                         F value   Pr(F)
  Body mass               15.68    <0.001
  Treatment                0.99     0.32
  Density                104.18    <0.001

Step 3:
                         F value   Pr(F)
  Body mass               15.06    <0.001
  Density                 96.15    <0.001

Conclusion: There is no significant effect of ‘Treatment’ (p = 0.32) or a ‘Density × Treatment’ interaction (p = 0.11).
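Home range size vs. body mass (M):
  $y = a M^{b}$
  On the log scale: $\log(y) = \log(a) + b \log(M)$
Home range size vs. density (D):
  $D < c$: $y = \text{constant}$
  $D \ge c$: $y = c\,(D - c)^{b}$
  On the log scale: $D < c$: $\log(y) = \text{constant}$; $D \ge c$: $\log(y) = \log(c) + b \log(D - c)$
[Figure: home range size plotted against density and against body mass, on the original scale and on the log-log scale]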
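As a small illustration of why the log scale is convenient here, the sketch below generates noise-free values from the body-mass relationship y = aM^b and recovers the parameters with a straight-line fit on log-transformed values. The numbers (a = 2, b = 0.75, and the masses) are made up for the example and are not vole data; the function names are likewise illustrative.

```python
import numpy as np


def home_range_from_mass(mass, a, b):
    """Power-law relationship y = a * M**b from the slide."""
    return a * np.asarray(mass, dtype=float) ** b


def fit_power_law(mass, home_range):
    """Fit log(y) = log(a) + b*log(M) by least squares; return (a, b)."""
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(np.log(mass), np.log(home_range), 1)
    return float(np.exp(intercept)), float(slope)


# Hypothetical values, chosen only to illustrate the transformation.
mass = np.array([10.0, 20.0, 40.0, 80.0])
home_range = home_range_from_mass(mass, a=2.0, b=0.75)
a_hat, b_hat = fit_power_law(mass, home_range)
```

On real data the fit would of course not be exact, and fitting on the log scale implies multiplicative rather than additive error.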
Candidate models
• Treatment + Treatment × Density
• Treatment + Density
• Treatment × Density
[Figure: log(Home range size) plotted against log(Population density) for the candidate models]
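A rough sketch of how such a candidate set might be compared by AIC is given below. Several things here are assumptions rather than content from the slides: the data are simulated (not the vole data), the error model is Gaussian on the log scale, and the design matrices reflect one possible reading of the three model labels (separate intercepts with a common slope; a common intercept with separate slopes; separate intercepts and slopes).

```python
import numpy as np


def gaussian_aic(y, X):
    """AIC of a linear model fitted by least squares with Gaussian errors."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    k = p + 1  # regression coefficients plus the error variance
    return -2.0 * log_lik + 2.0 * k


# Simulated data for illustration only (60 observations, two treatment groups).
rng = np.random.default_rng(1)
n = 60
treatment = np.repeat([0.0, 1.0], n // 2)
log_density = rng.uniform(0.0, 3.0, n)
log_home_range = (5.0 - 0.8 * log_density
                  + 0.3 * treatment * log_density
                  + rng.normal(0.0, 0.3, n))

ones = np.ones(n)
interaction = treatment * log_density
# Design matrices under the assumed reading of the model labels.
candidates = {
    "Treatment + Density": np.column_stack([ones, treatment, log_density]),
    "Treatment x Density": np.column_stack([ones, log_density, interaction]),
    "Treatment + Treatment x Density": np.column_stack(
        [ones, treatment, log_density, interaction]),
}
aic_values = {name: gaussian_aic(log_home_range, X)
              for name, X in candidates.items()}
best_model = min(aic_values, key=aic_values.get)
```

The point of the design is that each candidate encodes a biological hypothesis about HOW treatment and density act, so the comparison is among hypotheses rather than a stepwise removal of non-significant terms.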