A Summary of An Introduction to Statistical Problem Solving in Geography Chapter 12: Inferential Spatial Statistics. Prepared by W. Bullitt Fitzhugh Geography 3000 - Advanced Geographic Statistics Dr. Paul Sutton Winter Quarter, 2010.
Inferential Spatial Statistics: Point Pattern and Area Pattern Analysis • In geography, points and areas are commonly used to represent spatial phenomena • Methods exist to determine whether a sample point pattern or sample area pattern follows a random spatial distribution • Randomly distributed spatial patterns are usually not of interest because the underlying phenomenon has “no spatial logic”
Types of Spatial Patterns • Spatial patterns may be: clustered, random, or dispersed • Clustered points or areas have high densities in some locations and low or zero densities in other locations • Dispersed points or areas are (nearly) uniformly distributed across a study area • Random points or areas do not tend towards a clustered or dispersed spatial distribution • In real life, the type of patterns encountered in a research problem may be ambiguous with elements of multiple pattern types
Spatial Patterns Example: Lakeside Trees and Condo Development
Spatial Autocorrelation • Positive spatial autocorrelation occurs when nearby points or areas have similar values (clustered patterns) • Negative spatial autocorrelation occurs when nearby points or areas have dissimilar values (dispersed patterns) • No spatial autocorrelation exists when point or area patterns are randomly distributed
Point Pattern Analysis • Point pattern analyses are useful analytic tools for geographic research problems where the variable(s) at hand are represented by points on a map • “Nearest Neighbor Analysis” is a statistical procedure for understanding the spacing of points on a map • “Quadrat Analysis” focuses on the nature of the spatial distribution of the point pattern within the overall study area • Both methods aim to shed light on the underlying process(es) behind the geographic results
Nearest Neighbor Analysis • Nearest Neighbor Analysis (NNA) can be used as a descriptive statistic or as a method to test hypotheses about the population from which the sample points were taken • NNA uses the average nearest neighbor distance (NND) as an index of point spacing (descriptive) • For a random spatial pattern, average nearest neighbor distance NND(r) = 1/[2*SQRT(Density)] • For a perfectly dispersed (uniform) spatial pattern, average nearest neighbor distance NND(d) = 1.07453/SQRT(Density) • For a perfectly clustered spatial pattern, average nearest neighbor distance NND(c) = 0 (all the points are at the same coordinates) • Density = Number of Points/Study Area • A standardized nearest neighbor index R is used to compare results from different data sets • R = Observed Mean Nearest Neighbor Distance/Random Mean Nearest Neighbor Distance = NND/NND(r)
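The descriptive quantities on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked procedure: the function names and the four sample points are hypothetical, and the nearest neighbor search is a simple brute-force loop that suits small data sets only.

```python
import math

def nearest_neighbor_distances(points):
    """Distance from each point to its nearest other point (brute force)."""
    distances = []
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        distances.append(nearest)
    return distances

def nna_summary(points, area):
    """Observed NND, the random and dispersed benchmarks, and the index R."""
    n = len(points)
    density = n / area                            # Density = Number of Points/Study Area
    nnd_observed = sum(nearest_neighbor_distances(points)) / n
    nnd_random = 1 / (2 * math.sqrt(density))     # NND(r)
    nnd_dispersed = 1.07453 / math.sqrt(density)  # NND(d)
    return {
        "density": density,
        "nnd_observed": nnd_observed,
        "nnd_random": nnd_random,
        "nnd_dispersed": nnd_dispersed,
        "R": nnd_observed / nnd_random,
    }

# Hypothetical data: four points on a square lattice inside a 2 x 2 study area.
points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
summary = nna_summary(points, area=4.0)
print(summary["R"])  # 2.0, well above 1, so the pattern is more dispersed than random
```

An R near 1 suggests a random pattern, values toward 0 suggest clustering, and values above 1 suggest dispersion.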
Nearest Neighbor Analysis, Contd. • The observed average nearest neighbor distance of a data set can be compared to a theoretical average nearest neighbor distance (assuming a random spatial distribution) to test the hypothesis that points are randomly distributed • Ho: NND = NND(r) • Ha: NND not= NND(r) OR NND > NND(r) OR NND(r) > NND • Choice of Ha depends on having a rationale for thinking the pattern may be clustered or dispersed
Nearest Neighbor Analysis, Example • Area of Study Area = (Max(X) – Min(X)) * (Max(Y) – Min(Y)) • Ho: NND = NND(r) (point pattern is random) • Ha: NND > NND(r) (point pattern is more dispersed than random) • z(NND) = 2.99 => p-value = .001, Ho rejected
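The test on this slide can be sketched as follows. The slides do not give the standard error of the mean nearest neighbor distance, so the sketch assumes the commonly cited form 0.26136/SQRT(n*Density); treat that constant, the function names, and the sample inputs as assumptions rather than the text's exact procedure. The p-value uses the upper tail of the standard normal, matching Ha: NND > NND(r).

```python
import math

def bounding_box_area(points):
    """Study area as (Max(X) - Min(X)) * (Max(Y) - Min(Y))."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def upper_tail_p(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def nna_z_test(nnd_observed, n, area):
    """z statistic and one-tailed (upper) p-value for Ho: NND = NND(r)."""
    density = n / area
    nnd_random = 1 / (2 * math.sqrt(density))
    # Assumed standard error of the mean nearest neighbor distance.
    std_error = 0.26136 / math.sqrt(n * density)
    z = (nnd_observed - nnd_random) / std_error
    return z, upper_tail_p(z)

# Hypothetical inputs: 100 points in a study area of 100 square units.
z, p = nna_z_test(nnd_observed=0.55, n=100, area=100.0)
print(round(z, 2), round(p, 3))

# The z value reported on the slide corresponds to an upper-tail p of about .001.
print(round(upper_tail_p(2.99), 4))
```

For a two-tailed Ha the p-value would be doubled, and for Ha: NND(r) > NND the lower tail would be used instead.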
Quadrat Analysis • Quadrat Analysis focuses on the frequency of points occurring in defined parts of the study area • Quadrats (cells) are superimposed over the study area, and the number of points in each quadrat is examined • Based on quadrat (cell) frequency variability • Point pattern in the whole study area is described through the analysis of point frequencies in each quadrat
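Superimposing cells and counting points can be sketched as below. This assumes square cells on a regular grid and points that all fall inside the study area; the function name, parameters, and sample points are hypothetical.

```python
def cell_counts(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Count the points falling in each square cell of a regular grid.

    Cells are returned row by row, starting from the cell at (x_min, y_min).
    Points lying exactly on the upper or right edge go to the last cell.
    """
    counts = [0] * (n_cols * n_rows)
    for x, y in points:
        col = min(int((x - x_min) / cell_size), n_cols - 1)
        row = min(int((y - y_min) / cell_size), n_rows - 1)
        counts[row * n_cols + col] += 1
    return counts

# Hypothetical data: three points on a 2 x 2 grid of unit cells.
points = [(0.5, 0.5), (0.6, 0.4), (1.5, 1.5)]
counts = cell_counts(points, x_min=0.0, y_min=0.0, cell_size=1.0, n_cols=2, n_rows=2)
print(counts)  # [2, 0, 0, 1]
```

The resulting list of cell frequencies is the input to the variability measures used in the rest of the analysis.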
Quadrat Analysis, Contd. • Variance Mean Ratio (VMR) = (Variance of Cell Frequencies)/(Mean Cell Frequency) • Dispersed distribution of points: cell frequencies should be similar • VMR ~= 0 • Clustered distribution of points: cell frequencies should be low or zero for most cells, with a few cells having many points • VMR is large • Random distribution of points: variance of cell frequencies should be near the mean cell frequency • VMR ~= 1
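A short sketch of the VMR for two extreme cases follows. It assumes the sample variance (divisor m - 1, where m is the number of cells), and the cell frequencies are invented for illustration.

```python
from statistics import mean, variance

def variance_mean_ratio(cell_frequencies):
    """VMR = variance of cell frequencies / mean cell frequency."""
    # statistics.variance is the sample variance (divides by m - 1).
    return variance(cell_frequencies) / mean(cell_frequencies)

# Hypothetical cell frequencies: 18 points spread over 9 cells in two ways.
dispersed = [2, 2, 2, 2, 2, 2, 2, 2, 2]   # identical counts in every cell
clustered = [0, 0, 0, 0, 18, 0, 0, 0, 0]  # all points in a single cell

print(variance_mean_ratio(dispersed))  # 0.0
print(variance_mean_ratio(clustered))  # 18.0
```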
Quadrat Analysis, Contd. • VMR can be used as an inferential test statistic • Chi-square • Function of VMR and number of cells m • X^2 = VMR*(m-1) • Ho: VMR = 1 (point pattern is random) • Ha: VMR not= 1 OR VMR > 1 OR VMR < 1 • Need a large number of points spread across a large number of cells for Quadrat Analysis to be a worthwhile approach
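The test statistic can be sketched as below. This assumes the VMR is built from the sample variance, in which case VMR*(m-1) equals the sum of squared deviations from the mean divided by the mean. The degrees of freedom are taken as m - 1, which is the usual choice for this test but is not stated on the slide. The cell frequencies are hypothetical, and the critical value would still come from a chi-square table.

```python
from statistics import mean, variance

def vmr_chi_square(cell_frequencies):
    """Return (VMR, chi-square statistic, degrees of freedom)."""
    m = len(cell_frequencies)
    vmr = variance(cell_frequencies) / mean(cell_frequencies)  # sample variance
    chi_square = vmr * (m - 1)
    return vmr, chi_square, m - 1  # df = m - 1 is an assumption

# Hypothetical cell frequencies: 16 points over 8 cells.
frequencies = [3, 1, 0, 2, 4, 2, 1, 3]
vmr, chi_square, df = vmr_chi_square(frequencies)

# Cross-check against the direct form: sum of squared deviations / mean.
mean_frequency = mean(frequencies)
direct = sum((f - mean_frequency) ** 2 for f in frequencies) / mean_frequency
print(round(vmr, 3), round(chi_square, 3), df, round(direct, 3))
```

A VMR below 1, as in this example, points toward dispersion, and whether it differs significantly from 1 depends on the chosen Ha and significance level.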
Area Pattern Analysis • Aspects of Area Pattern Analysis are analogous to Point Pattern Analysis • Basic statistic for analysis of area patterns is the “join count” • A join is operationally defined as two areas with a common boundary • Measure of spatial autocorrelation for nominal, areal data • Familiar GIS function • Binary categories used in pattern analysis • Each individual area assigned a black or white value • Clustering occurs if areas with the same binary value are contiguous • Dispersion occurs if the number of black-white joins is greater than the number of same-category joins • Randomness occurs if the numbers of similar and dissimilar joins are roughly equal
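Join counting can be sketched for the simplest case, a rectangular grid of cells where a join is a shared edge (rook adjacency). Real study areas are usually irregular polygons with an adjacency list, so the grid, the 'B'/'W' coding, and the function name here are illustrative assumptions.

```python
def count_joins(grid):
    """Count black-black, white-white, and black-white joins on a grid.

    `grid` is a list of equal-length strings made of 'B' and 'W'.
    Two cells are joined when they share an edge.
    """
    counts = {"BB": 0, "WW": 0, "BW": 0}
    n_rows, n_cols = len(grid), len(grid[0])
    for r in range(n_rows):
        for c in range(n_cols):
            # Look only right and down so each join is counted once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols:
                    a, b = grid[r][c], grid[rr][cc]
                    counts[a + b if a == b else "BW"] += 1
    return counts

# Hypothetical 3 x 3 patterns, each with 12 joins in total.
checkerboard = ["BWB", "WBW", "BWB"]  # dispersed: every join is black-white
blocky = ["BBW", "BBW", "WWW"]        # clustered: same-category joins dominate

print(count_joins(checkerboard))  # {'BB': 0, 'WW': 0, 'BW': 12}
print(count_joins(blocky))        # {'BB': 4, 'WW': 4, 'BW': 4}
```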
Area Pattern Analysis, Contd. • How do you determine the number of expected black-white joins? • “Free sampling” approach relies on theoretical background to inform the probability of a given area having a black or white value • Probability of a black or white value for a cell corresponds to a binomial distribution, with probability p of taking one value and q = 1-p of taking the other • “Non-free sampling” approach does not rely on any information outside of the study area • Probability of a black or white cell is based on the number of black and white cells in the study area • When unsure which approach to take, use non-free sampling
Area Pattern Analysis: Non-free Sampling Test • Ho: Observed number of black-white joins (OBW) = expected number of random black-white joins (EBW) • Ha: OBW not= EBW OR OBW > EBW OR EBW > OBW • Choice of Ha depends on having a rationale • If there is no rationale for choosing Ha in either direction, choose a two-tailed test
Area Pattern Analysis: Non-free Sampling Test, Contd. • EBW = 2*J*B*W/[N*(N – 1)] • J = Total Number of Joins • B = Number of Black Areas • W = Number of White Areas • N = Total Number of Areas • Test statistic Z = (OBW – EBW)/s(BW) • s(BW) is the standard error of the expected number of black-white joins • Given by formula 12.12 in McGrew (messy) • Worked example found on pages 185 – 189 of text
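The expected count and test statistic can be sketched as follows. Because the standard error formula is not reproduced in these slides, s(BW) is passed in as a value rather than computed. The input numbers, including the standard error, are hypothetical placeholders, not results from the text's worked example.

```python
def expected_bw_joins(total_joins, n_black, n_white):
    """EBW under non-free sampling: 2*J*B*W / [N*(N - 1)], with N = B + W."""
    n_areas = n_black + n_white
    return 2 * total_joins * n_black * n_white / (n_areas * (n_areas - 1))

def join_count_z(observed_bw, expected_bw, std_error_bw):
    """Z = (OBW - EBW) / s(BW); s(BW) must be supplied by the caller."""
    return (observed_bw - expected_bw) / std_error_bw

# Hypothetical study area: 9 areas (4 black, 5 white) with 12 joins,
# of which 4 are observed to be black-white.
ebw = expected_bw_joins(total_joins=12, n_black=4, n_white=5)
z = join_count_z(observed_bw=4, expected_bw=ebw, std_error_bw=1.5)  # 1.5 is a placeholder
print(round(ebw, 3), round(z, 3))
```

A negative Z means fewer black-white joins than expected under randomness, which is the direction associated with clustering.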