Testing Predictive Performance of Ecological Niche Models
A. Townsend Peterson, STOLEN FROM Richard Pearson
Niche Model Validation
• Diverse challenges …
• Not a single loss function or optimality criterion
• Different uses demand different criteria
• In particular, the relative weights applied to omission and commission errors when evaluating models
• Nakamura: “which way is relevant to adopt is not a mathematical question, but rather a question for the user”
• Asymmetric loss functions
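One way to make the relative weighting of omission and commission errors explicit is a weighted sum of the two error rates. The sketch below is only an illustration of an asymmetric loss: the function name `weighted_error`, the weights `w_omission` and `w_commission`, and the two example models are assumptions, not something defined in the slides.

```python
def weighted_error(omission_rate, commission_rate, w_omission=1.0, w_commission=1.0):
    """Combine the two error rates with user-chosen weights (hypothetical helper)."""
    return w_omission * omission_rate + w_commission * commission_rate


# Two made-up models: A omits few presences but over-predicts, B does the opposite.
model_a = {"omission": 0.05, "commission": 0.40}
model_b = {"omission": 0.25, "commission": 0.10}

# A user who treats omission as four times worse than commission.
loss_a = weighted_error(model_a["omission"], model_a["commission"], w_omission=4.0)
loss_b = weighted_error(model_b["omission"], model_b["commission"], w_omission=4.0)

# A user who weights both error types equally.
equal_a = weighted_error(model_a["omission"], model_a["commission"])
equal_b = weighted_error(model_b["omission"], model_b["commission"])

print("omission-averse user prefers:", "A" if loss_a < loss_b else "B")
print("equal-weight user prefers:", "A" if equal_a < equal_b else "B")
```

The same two models are ranked differently under the two weightings, which is the sense in which the choice of criterion is a question for the user.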
Model calibration and evaluation strategies: resubstitution
[Diagram: all available data (100%) are used for both calibration and evaluation; projection to the same region, a different region, a different time, or a different resolution.]
(after Araújo et al. 2005 Gl. Ch. Biol.)
Model calibration and evaluation strategies: independent validation
[Diagram: all available data (100%) are used for calibration; the projection (same region, different region, different time, or different resolution) is evaluated separately.]
(after Araújo et al. 2005 Gl. Ch. Biol.)
Model calibration and evaluation strategies: data splitting
[Diagram: calibration data (70%) are used for calibration and test data (30%) for evaluation; projection to the same region, a different region, a different time, or a different resolution.]
(after Araújo et al. 2005 Gl. Ch. Biol.)
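The data-splitting strategy can be sketched as a random 70/30 partition of the occurrence records. This is a minimal illustration, not the procedure of Araújo et al.: the function name `split_records`, the fixed seed, and the made-up coordinates are all assumptions.

```python
import random


def split_records(records, calibration_fraction=0.7, seed=0):
    """Randomly partition records into calibration and test sets (hypothetical helper)."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a repeatable split
    n_calibration = round(calibration_fraction * len(shuffled))
    return shuffled[:n_calibration], shuffled[n_calibration:]


# Made-up occurrence records as (longitude, latitude) pairs.
records = [(-47.0 + 0.1 * i, -18.0 - 0.1 * i) for i in range(20)]

calibration, test = split_records(records)
print(len(calibration), "calibration records;", len(test), "test records")
```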
The four types of results that are possible when testing a distribution model (see Pearson NCEP module 2007)
Presence-absence confusion matrix

                     Recorded present      Recorded (or assumed) absent
Predicted present    a (true positive)     b (false positive)
Predicted absent     c (false negative)    d (true negative)
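As a small illustration of how the four cells are tallied, the sketch below counts a, b, c and d from paired binary predictions and records. The function name `confusion_counts` and the example vectors are assumptions made for this example.

```python
def confusion_counts(predicted_present, recorded_present):
    """Return (a, b, c, d) from two equal-length sequences of booleans."""
    a = b = c = d = 0
    for pred, obs in zip(predicted_present, recorded_present):
        if pred and obs:
            a += 1  # true positive
        elif pred and not obs:
            b += 1  # false positive
        elif not pred and obs:
            c += 1  # false negative
        else:
            d += 1  # true negative
    return a, b, c, d


# Made-up test localities: model prediction and field record at each one.
predicted = [True, True, False, True, False, False, True, False]
recorded = [True, False, False, True, True, False, True, False]

a, b, c, d = confusion_counts(predicted, recorded)
print("a, b, c, d =", a, b, c, d)
```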
Selecting a decision threshold (presence/absence data) (Liu et al. 2005 Ecography 28:385-393)
Selecting a decision threshold (presence/absence data)
• Omission (proportion of presences predicted absent): c/(a + c)
• Commission (proportion of absences predicted present): b/(b + d)
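The trade-off between the two error rates can be seen by sweeping the threshold over a set of continuous model outputs. The sketch below is illustrative only: `error_rates`, the scores, and the three candidate thresholds are assumptions, and a locality is treated as predicted present when its score is at or above the threshold.

```python
def error_rates(scores, recorded_present, threshold):
    """Return (omission, commission) = (c/(a+c), b/(b+d)) at one threshold."""
    a = b = c = d = 0
    for score, obs in zip(scores, recorded_present):
        pred = score >= threshold  # assumed rule: present if score >= threshold
        if pred and obs:
            a += 1
        elif pred and not obs:
            b += 1
        elif not pred and obs:
            c += 1
        else:
            d += 1
    return c / (a + c), b / (b + d)


# Made-up model outputs and records at eight test localities.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
recorded = [True, True, False, True, True, False, False, False]

rates = {t: error_rates(scores, recorded, t) for t in (0.25, 0.5, 0.75)}
for t, (omission, commission) in rates.items():
    print(f"threshold {t}: omission {omission:.2f}, commission {commission:.2f}")
```

Raising the threshold lowers commission at the cost of higher omission, so the choice of threshold depends on how the two errors are weighted.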
Selecting a decision threshold (presence-only data)
[Figure panels: LPT, T10]
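The slide does not define its two labels. Assuming LPT stands for "lowest presence threshold", that is, the lowest model output found at any calibration presence record, it could be sketched as below. That reading, the function name `lowest_presence_threshold`, and the scores are assumptions; T10 is left aside because its definition is not given here.

```python
def lowest_presence_threshold(scores_at_presences):
    """Lowest model output at any calibration presence (assumed meaning of LPT)."""
    return min(scores_at_presences)


# Made-up model outputs at five calibration presence records.
calibration_scores = [0.62, 0.35, 0.81, 0.47, 0.90]
lpt = lowest_presence_threshold(calibration_scores)

# Apply the threshold to made-up outputs at three other localities.
new_scores = [0.20, 0.35, 0.50]
predicted_present = [score >= lpt for score in new_scores]
print("LPT =", lpt, "->", predicted_present)
```

Under this reading, every calibration presence is predicted present by construction, so omission on the calibration data is zero.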
Presence-absence test statistics
Confusion matrix cells: a (true positive), b (false positive), c (false negative), d (true negative).
Proportion (%) correctly predicted (or ‘accuracy’, or ‘correct classification rate’): (a + d)/(a + b + c + d)
Presence-absence test statistics
Confusion matrix cells: a (true positive), b (false positive), c (false negative), d (true negative).
Cohen’s Kappa: [(a + d) - ((a + c)(a + b) + (b + d)(c + d))/n] / [n - ((a + c)(a + b) + (b + d)(c + d))/n], where n = a + b + c + d
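Both presence-absence statistics can be computed directly from the four cell counts. The sketch below uses the standard definition of Cohen's Kappa (observed agreement corrected for the agreement expected by chance). The function names and the example counts are assumptions.

```python
def accuracy(a, b, c, d):
    """Proportion correctly predicted: (a + d) / (a + b + c + d)."""
    return (a + d) / (a + b + c + d)


def cohens_kappa(a, b, c, d):
    """Cohen's Kappa from the confusion-matrix cells (undefined if expected == 1)."""
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement: product of row and column totals, summed over both classes.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


# Made-up counts for 100 test localities.
a, b, c, d = 40, 10, 5, 45
acc = accuracy(a, b, c, d)
kappa = cohens_kappa(a, b, c, d)
print(f"accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```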
Presence-only test statistics
Confusion matrix cells: a (true positive), b (false positive), c (false negative), d (true negative).
• Proportion of observed presences correctly predicted (or ‘sensitivity’, or ‘true positive fraction’): a/(a + c)
• Proportion of observed presences incorrectly predicted (or ‘omission rate’, or ‘false negative fraction’): c/(a + c)
Presence-only test statistics: testing for statistical significance
Leaf-tailed gecko (Uroplatus), two predictions for U. sikorae:
• U. sikorae: success rate 4 from 7; proportion predicted present 0.231; binomial p = 0.0546
• U. sikorae: success rate 6 from 7; proportion predicted present 0.339; binomial p = 0.008
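The significance test on this slide can be sketched as a one-sided exact binomial test: the probability of getting at least the observed number of successes if each test locality fell in the predicted area with probability equal to the proportion of the study area predicted present. That one-sided reading is an assumption, as is the function name `binomial_upper_tail`. With the slide's rounded inputs it gives values close to, but not exactly, the reported p-values.

```python
from math import comb


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Inputs taken from the slide: successes out of 7 test records,
# and the proportion of the study area predicted present.
p_first = binomial_upper_tail(4, 7, 0.231)
p_second = binomial_upper_tail(6, 7, 0.339)
print(f"4 from 7 at 0.231: p = {p_first:.4f}")
print(f"6 from 7 at 0.339: p = {p_second:.4f}")
```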
Absence-only test statistics
Confusion matrix cells: a (true positive), b (false positive), c (false negative), d (true negative).
• Proportion of observed (or assumed) absences correctly predicted (or ‘specificity’, or ‘true negative fraction’): d/(b + d)
• Proportion of observed (or assumed) absences incorrectly predicted (or ‘commission rate’, or ‘false positive fraction’): b/(b + d)
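The presence-only and absence-only statistics can be computed together from the four cell counts. A minimal sketch follows; the function name `class_rates` and the example counts are assumptions.

```python
def class_rates(a, b, c, d):
    """Per-class rates from the confusion-matrix cells."""
    return {
        "sensitivity": a / (a + c),  # presences correctly predicted
        "omission": c / (a + c),     # presences predicted absent
        "specificity": d / (b + d),  # absences correctly predicted
        "commission": b / (b + d),   # absences predicted present
    }


# Made-up counts: 45 recorded presences, 55 recorded (or assumed) absences.
r = class_rates(40, 10, 5, 45)
for name, value in r.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and omission rate sum to 1, as do specificity and commission rate, so each pair carries the same information.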
AUC: a threshold-independent test statistic
Confusion matrix cells: a (true positive), b (false positive), c (false negative), d (true negative).
• sensitivity = a/(a + c) (1 - omission rate)
• specificity = d/(b + d)
• 1 - specificity = b/(b + d) (fraction of absences predicted present)
Threshold-independent assessment: The Receiver Operating Characteristic (ROC) Curve
[Figure, panels A to C: frequency distributions of predicted probability of occurrence (0 to 1) for the set of ‘presences’ and the set of ‘absences’; ROC curve with sensitivity on the vertical axis and 1 - specificity on the horizontal axis.]
(check out: http://www.anaesthetist.com/mnm/stats/roc/Findex.htm)
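The area under the ROC curve equals the probability that a randomly chosen presence receives a higher predicted value than a randomly chosen absence, with ties counted as one half. The sketch below computes AUC that way, by direct pairwise comparison, rather than by integrating the curve. The function name `auc` and the scores are assumptions.

```python
def auc(presence_scores, absence_scores):
    """AUC as the fraction of presence-absence pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for q in absence_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


# Made-up predicted probabilities at test presences and test absences.
presence_scores = [0.9, 0.8, 0.6, 0.4]
absence_scores = [0.7, 0.3, 0.2, 0.1]

value = auc(presence_scores, absence_scores)
print(f"AUC = {value:.3f}")
```

A value of 0.5 corresponds to fully overlapping distributions of the two sets, and 1.0 to complete separation.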