(Almost) everything you always wanted to know (or maybe not…) about Geographically Weighted Regressions. Rodolphe Devillers. JCU Stats Group, March 2012.
Outline • Background • Spatial autocorrelation • Spatial non-stationarity • Geographically Weighted Regressions (GWR)
GeoCod Project (2006-…) • Goal: Get a better understanding of the spatial and temporal dynamics of some fish/shellfish species in the NW Atlantic region, and their relationship with the physical environment • Biological data: scientific surveys, fisheries observers; 4 species, > 800 000 records • Environmental data: temperature, salinity, remote sensing; > 300 GB
GeoCod project • Steps: 1. Collection, 2. Integration, 3. Analysis, 4. Visualization • Diagram labels: Environmental data, Fisheries data, Other data (Bathy, etc.), Normalized database
Context • A number of statistical methods can be used • Testing spatial statistics • Species ? Environment
Outline • Background • Spatial autocorrelation • Spatial non-stationarity • Geographically Weighted Regressions (GWR)
Spatial autocorrelation • “…the property of random variables taking values, at pairs of locations a certain distance apart, that are more similar (positive autocorrelation) or less similar (negative autocorrelation) than expected for randomly associated pairs of observations.” (Legendre, 1993)
Spatial autocorrelation - Basics • Positive (Neighbours more similar) • Neutral (Random) • Negative (Neighbours less similar) • http://www.spatialanalysisonline.com/
Spatial autocorrelation – is it common? • Elevation • Air/water temperature • Air humidity • Disease distribution • Species abundance • Housing value • Etc.
Spatial autocorrelation – why bother? • Spatial autocorrelation in the data leads to spatial autocorrelation in the residuals
Spatial autocorrelation – why bother? • Most statistics are based on the assumption that the values of observations in each sample are independent of one another • Consequence: spatial autocorrelation violates the assumption of independent residuals and calls into question the validity of hypothesis testing • Main effects: • Standard errors are underestimated • t-scores are overestimated (= increased chance of a Type I error = incorrect rejection of a null hypothesis) • Sometimes inverts the slope of relationships.
Spatial autocorrelation – how to measure it? • Measures of spatial autocorrelation: • Moran’s I • Geary’s C • Others (e.g. Getis’ G)
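As an illustration of the first measure listed above, here is a minimal sketch of global Moran's I in plain Python. The function names (`morans_i`, `chain_weights`), the binary "neighbours on a line" weight matrix and the toy values are all assumptions made for this example; they are not taken from the talk.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]            # deviations from the mean
    s0 = sum(sum(row) for row in weights)       # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


def chain_weights(n):
    """Binary contiguity weights for n sites on a line (hypothetical toy layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]


w = chain_weights(6)
clustered = [1, 1, 1, 5, 5, 5]      # neighbours mostly similar
alternating = [1, 5, 1, 5, 1, 5]    # neighbours always different

print("clustered:  ", morans_i(clustered, w))
print("alternating:", morans_i(alternating, w))
```

On this toy layout the clustered series gives I of about 0.6 (positive autocorrelation) and the alternating series gives I of about -1 (negative autocorrelation). Real analyses would normally use distance-based or polygon-contiguity weights and a significance test.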
Spatial autocorrelation – How can I deal with it? • Many ways to handle this: • Subsampling, adjusting type I error, adjusting the effective sample size, etc. (Dale and Fortin (2002) Ecoscience 9(2)) • Autocovariate regressions, spatial eigenvector mapping (SEVM), generalised least squares (GLS), conditional autoregressive models (CAR), simultaneous autoregressive models (SAR), generalised linear mixed models (GLMM), generalised estimating equations (GEE), etc. (More details: Dormann et al. (2007) Ecography 30) • If spatial autocorrelation is not stationary: GWR
Outline • Background • Spatial autocorrelation • Spatial non-stationarity • Geographically Weighted Regressions (GWR)
Stationarity • Classical regression models are valid under the assumptions that phenomena are stationary temporally and spatially (=statistical parameters such as the mean, the variance or the spatial autocorrelation do not vary depending on the geographic position) • E.g. Coral bleaching = 0.55 Temperature + 0.37 Nutrients + … - … • Studies (in various fields, including terrestrial ecology) have shown that they are rarely stationary
Global vs Local Statistics • Simpson's Paradox
Local spatial statistics • Local Indicators of Spatial Association (LISA) • Local Moran’s I (used to detect clustering) • Getis-Ord Gi* (hotspot analysis) • Look at GeoDa (free software from Luc Anselin - http://geodacenter.asu.edu/) • Local regressions: GWR
Outline • Background • Spatial autocorrelation • Spatial non-stationarity • Geographically Weighted Regressions (GWR)
GWR • Brunsdon, Fotheringham and Charlton
GWR • Increasingly used in various fields (mostly since 2006, and even more since it was integrated into ArcGIS) • Sally: yes, it is also available in R… (spgwr)
GWR • Criticized by some authors (e.g. Wheeler 2005, Cho et al. 2009) when using collinear data, potentially leading to: • Occasional inflation of the variance • Rare inversion of the sign of the regression
Windle, M., Rose, G., Devillers, R. and Fortin, M.-J. Exploring spatial non-stationarity of fisheries survey data using geographically weighted regression (GWR): an example from the Northwest Atlantic. ICES Journal of Marine Science, 67: 145-154.
GWR • Geographically Weighted Regression (GWR) • (μ,ν): geographic coordinates of the samples • Multiple regression model (global) • y: dependent variable, x1 to xp: independent variables, β0: intercept, β1 to βp: coefficients, ε: error.
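The equation images from this slide did not survive extraction. As an assumption based on the symbol definitions above, the two models in their usual textbook form would read as follows; the exact notation on the original slide may have differed.

```latex
% Global multiple regression: one set of coefficients for the whole study area
y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \varepsilon

% GWR: every coefficient is a function of the sample location (\mu, \nu)
y = \beta_0(\mu,\nu) + \beta_1(\mu,\nu)\, x_1 + \dots + \beta_p(\mu,\nu)\, x_p + \varepsilon
```

The only difference is that GWR lets each coefficient vary with location, which is what allows it to describe non-stationary relationships.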
Method • Government fisheries scientific survey data (Fisheries and Oceans Canada) • Cod presence/absence (threshold at 5 kg) for Fall 2001
Method • Year 2001 layers: Temperature, Cod, Crab, Shrimp • Combining data in a single point data file • Exporting data points in a file (.dbf)
GWR software (version 3.0) • About 25 minutes per file of 5500 points • 200 km used for tests
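To make the idea concrete, here is a minimal sketch of what a GWR fit does, written in plain Python. It is not the GWR 3.0 software used in the study: it handles a single predictor, uses a fixed Gaussian kernel, and treats the 200 km figure as a kernel bandwidth, which is an assumption. The function names and the toy data (two clusters of stations with opposite relationships) are invented for illustration.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Fixed Gaussian kernel: weight decays smoothly with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def gwr_simple(coords, x, y, bandwidth):
    """Local weighted regression y = b0 + b1*x fitted at every sample location.

    Returns a list of (intercept, slope) pairs, one per location.
    """
    results = []
    for (u0, v0) in coords:
        # Weight every sample by its distance to the regression point.
        w = [gaussian_weight(math.hypot(u - u0, v - v0), bandwidth)
             for (u, v) in coords]
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xbar) * (yi - ybar)
                  for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx
        results.append((ybar - slope * xbar, slope))
    return results


# Hypothetical stations (coordinates in km): a western and an eastern cluster.
coords = [(0, 0), (50, 0), (100, 0), (150, 0),
          (1000, 0), (1050, 0), (1100, 0), (1150, 0)]
x = [1, 2, 3, 4, 1, 2, 3, 4]
y = [2, 4, 6, 8, 9, 8, 7, 6]   # west: y = 2x, east: y = 10 - x

local_fits = gwr_simple(coords, x, y, bandwidth=200.0)
# A huge bandwidth gives every sample the same weight, i.e. a global fit.
global_fit = gwr_simple(coords, x, y, bandwidth=1e9)[0]

print("global slope:", round(global_fit[1], 3))
for (u, v), (b0, b1) in zip(coords, local_fits):
    print(f"station at {u:>4} km: local slope {b1:.3f}")
```

On this toy data the global slope is close to 0.5, while the local slopes are close to +2 in the west and -1 in the east, so the single global coefficient hides two opposite local relationships.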
[Figure panels: Fixed, Variable]
Results • Test of spatial stationarity of independent variables used in the regression • [Figure legend: Spatial non-stationarity, Spatial stationarity]
Results: spatial stationarity • Stationarity of bottom temperature used to model shrimp biomass • Windle et al. (accepted) - MEPS
Results • Comparison of regression models
Results • Test of the spatial autocorrelation of the residuals
Results • K-means clustering of the t values of the GWR coefficients • Positive relationship between crab and shrimp, weak relationship with the coast • Negative relationship with crab and distance, positive with shrimp • Stronger negative relationship with crab
Results • AIC: Akaike Information Criterion • GAM systematically has lower AIC values, suggesting a non-linear relationship between cod and the variables used in the analysis • [Figure legend: Weak to Strong]
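For reference, the slide does not spell out the criterion itself. Its standard definition, with k the number of estimated parameters and L-hat the maximised likelihood of the model, is:

```latex
\mathrm{AIC} = 2k - 2\ln(\hat{L})
```

Lower values indicate a better trade-off between goodness of fit and model complexity, which is why the consistently lower AIC of the GAM is read as support for a non-linear relationship.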