
Chapter 5


kevina


Presentation Transcript


  1. Chapter 5 Part A: Spatial data exploration www.spatialanalysisonline.com

  2. Spatial data exploration • Spatial analysis and data models (Anselin, 2002)

  3. Spatial data exploration
     • Sampling frameworks
       • Pure random sampling
       • Stratified random – by class/strata (proportionate, disproportionate)
       • Randomised within defined grids
       • Uniform
       • Uniform with randomised offsets
     • Sampling and declustering
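The sampling frameworks listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not code from the presentation: the function names, the `jitter` parameter and the "at least one unit per stratum" rule are assumptions made for the example.

```python
import random

def pure_random(n, xmin, ymin, xmax, ymax, rng):
    """Pure random sampling: n points drawn uniformly over a bounding rectangle."""
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

def stratified_proportionate(units_by_stratum, n, rng):
    """Stratified random sampling, proportionate to stratum size.

    units_by_stratum maps a stratum name to a list of sampling units.
    Assumption: every stratum contributes at least one unit.
    """
    total = sum(len(units) for units in units_by_stratum.values())
    sample = {}
    for name, units in units_by_stratum.items():
        k = max(1, round(n * len(units) / total))
        sample[name] = rng.sample(units, min(k, len(units)))
    return sample

def grid_sample(nx, ny, cell, rng, jitter=0.0):
    """One point per square grid cell.

    jitter is a fraction of the cell size (hypothetical parameter):
      0.0       -> uniform sampling at cell centres
      0.0..0.5  -> uniform with randomised offsets
      0.5       -> randomised anywhere within each cell
    """
    points = []
    for i in range(nx):
        for j in range(ny):
            cx, cy = (i + 0.5) * cell, (j + 0.5) * cell
            dx = rng.uniform(-jitter, jitter) * cell
            dy = rng.uniform(-jitter, jitter) * cell
            points.append((cx + dx, cy + dy))
    return points
```

Passing a seeded `random.Random` instance keeps a sampling design reproducible. Disproportionate stratified sampling would replace the proportional `k` with a per-stratum quota.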

  4. Spatial data exploration • Sampling frameworks – point sampling

  5. Spatial data exploration
     • Sampling frameworks – within zones
       • Grid generation (hexagonal): selection of 1 point per cell, random offset from centre
       • Grid generation: square grid within field boundaries
       • Selection of 5 random points per zone

  6. Spatial data exploration

  7. Spatial data exploration • Random points on a network
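One simple way to place random points on a network is to pick an edge with probability proportional to its length and then pick a uniform position along it. The sketch below assumes the network is given as straight-line segments; the function name and the input layout are illustrative choices, not taken from the slides.

```python
import math
import random

def random_points_on_network(segments, n, rng):
    """Return n points placed uniformly by length along a set of segments.

    segments: list of ((x1, y1), (x2, y2)) straight edges (assumed layout).
    """
    lengths = [math.dist(a, b) for a, b in segments]
    # Longer edges are proportionally more likely to receive a point.
    chosen = rng.choices(segments, weights=lengths, k=n)
    points = []
    for (x1, y1), (x2, y2) in chosen:
        t = rng.random()  # fractional position along the edge
        points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return points
```

For example, `random_points_on_network([((0, 0), (10, 0)), ((10, 0), (10, 5))], 50, random.Random(1))` puts roughly two thirds of the points on the longer edge.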

  8. Spatial data exploration
     • EDA, ESDA and ESTDA
     • EDA – basic aims (after NIST)
       • maximize insight into a data set
       • uncover underlying structure
       • extract important variables
       • detect outliers and anomalies
       • test underlying assumptions
       • develop parsimonious models
       • determine optimal factor settings

  9. Spatial data exploration
     • ESDA (see GeoDa and STARS)
       • Extending EDA ideas to the spatial domain (lattice/zone models)
       • Brushing
       • Linking
       • Mapped histograms
       • Outlier mapping
       • Box plots
       • Conditional choropleth plots
       • Rate mapping

  10. Spatial data exploration • ESDA: Brushing & linking

  11. Spatial data exploration • ESDA: Histogram linkage

  12. Spatial data exploration • ESDA: Parallel coordinate plot & star plot

  13. Spatial data exploration • ESDA: Mapped box plots

  14. Spatial data exploration • ESDA: Conditional choropleth mapping

  15. Spatial data exploration • ESDA: Mapped point data

  16. Spatial data exploration • ESDA: Trend analysis (continuous spatial data)

  17. Spatial data exploration
      • ESDA: Cluster hunting – GAM/K (steps)
        Step 1: Read data for the population at risk
        Step 2: Identify the MBR (minimum bounding rectangle) containing the data, the starting circle radius, and the degree of overlap
        Step 3: Generate a grid covering the MBR
        Step 4: For each grid intersection, generate a circle of radius r
        Step 5: Retrieve two counts: the population at risk and the variable of interest
        Step 6: Apply some “significance” test procedure
        Step 7: Keep the result if significant
        Step 8: Repeat Steps 5 to 7 until all circles have been processed
        Step 9: Increase the circle radius by dr and return to Step 3; once the largest radius has been processed, go to Step 10
        Step 10: Create a smoothed density surface of excess incidence for the significant circles
        Step 11: Map this surface and inspect the results
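Steps 1 to 9 of the procedure can be sketched as below. This is a simplified illustration rather than the GAM/K implementation itself: the Poisson upper-tail test, the grid spacing rule `2 * r * (1 - overlap)`, the `alpha` threshold and the record layout `(x, y, population, cases)` are all assumptions made for the example, and the smoothing and mapping of Steps 10 and 11 are left out.

```python
import math

def poisson_sf(k, mu):
    """P(X >= k) for X ~ Poisson(mu), computed by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)
    cdf = term
    for i in range(1, k):
        term *= mu / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def gam(records, r_start, r_max, dr, overlap=0.5, alpha=0.01):
    """records: list of (x, y, population, cases). Returns significant circles."""
    xs = [rec[0] for rec in records]
    ys = [rec[1] for rec in records]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)  # Step 2: MBR
    rate = sum(rec[3] for rec in records) / sum(rec[2] for rec in records)
    hits = []
    r = r_start
    while r <= r_max:
        step = 2 * r * (1 - overlap)          # Step 3: grid spacing (assumed rule)
        nx = int((xmax - xmin) / step) + 2
        ny = int((ymax - ymin) / step) + 2
        for i in range(nx):
            for j in range(ny):
                cx, cy = xmin + i * step, ymin + j * step   # Step 4: circle centre
                pop = cases = 0
                for x, y, p, c in records:                  # Step 5: two counts
                    if math.hypot(x - cx, y - cy) <= r:
                        pop += p
                        cases += c
                if pop == 0:
                    continue
                expected = rate * pop
                # Steps 6-7: keep circles with a significant excess of cases.
                if cases > expected and poisson_sf(cases, expected) < alpha:
                    hits.append((cx, cy, r, cases, expected))
        r += dr                                             # Step 9: next radius
    return hits
```

Because many overlapping circles are tested, the retained circles are best read as an exploratory map of excess incidence rather than as independent significance tests.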

  18. Spatial data exploration
      • Grid-based statistics
        • Univariate analysis of attribute data (non-spatial metrics)
        • Cross-classification and cross-tab analyses
        • Spatial pattern analysis for grid data (including Landscape metrics)
          • Patch metrics; Class-level metrics; Landscape-level metrics
        • Quadrat analysis
        • Multi-grid regression analysis

  19. Spatial data exploration
      • Grid-based statistics
        • Landscape metrics
          • Non-spatial
            • Proportional abundance; Richness; Evenness; Diversity
          • Spatial
            • Patch size distribution and density; Patch shape complexity; Core Area; Isolation/Proximity; Contrast; Dispersion; Contagion and Interspersion; Subdivision; Connectivity
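The non-spatial landscape metrics depend only on how many cells fall into each class, not on where they are. The sketch below computes them for a classified grid; the slide does not say which diversity index is intended, so Shannon's index (and evenness as Shannon diversity divided by its maximum) is an assumption here, as is the function name.

```python
import math
from collections import Counter

def composition_metrics(grid):
    """Non-spatial metrics for a classified grid (list of rows of class labels).

    Returns (abundance, richness, shannon, evenness).
    """
    counts = Counter(cell for row in grid for cell in row)
    total = sum(counts.values())
    abundance = {cls: c / total for cls, c in counts.items()}  # proportional abundance
    richness = len(counts)                                     # number of classes present
    shannon = -sum(p * math.log(p) for p in abundance.values())
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return abundance, richness, shannon, evenness
```

A grid split equally between two classes, such as `[[1, 1], [2, 2]]`, has richness 2 and evenness 1; rearranging the cells leaves all four values unchanged, which is what distinguishes these from the spatial metrics.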

  20. Spatial data exploration
      • Point (event) based statistics
        • Typically analysis of point-pair distances
        • Points vs events
        • Distance metrics: Euclidean, spherical, Lp or network
        • Weighted or unweighted events
        • Events, NOT computed points (e.g. centroids)
        • Classical statistical models vs Monte Carlo and other computational methods
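Three of the distance metrics named above are easy to write down directly. In this sketch the Lp metric is the Minkowski distance (p = 2 gives Euclidean, p = 1 gives Manhattan) and the spherical metric is the haversine great-circle distance; the function names and the mean Earth radius of 6371 km are assumptions for illustration. Network distance needs a shortest-path computation and is not shown.

```python
import math

def lp_distance(a, b, p=2.0):
    """Minkowski (Lp) distance between two planar points a and b."""
    return (abs(a[0] - b[0]) ** p + abs(a[1] - b[1]) ** p) ** (1.0 / p)

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Spherical (great-circle) distance via the haversine formula.

    Inputs are in decimal degrees; radius_km is an assumed mean Earth radius.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))
```

The choice matters for every point-pair statistic that follows, since nearest neighbour and K-function results are computed from whichever distance matrix is supplied.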

  21. Spatial data exploration
      • Point (event) based statistics
        • Basic Nearest neighbour (NN) model
          • Input coordinates of all points
          • Compute the (symmetric) distance matrix D
          • Sort the distances to identify the 1st, 2nd, ... kth nearest values
          • Compute the mean of the observed 1st, 2nd, ... kth nearest values
          • Compare this mean with the expected mean under the Complete Spatial Randomness (CSR or Poisson) model
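The basic NN procedure above can be sketched as follows. This is a brute-force illustration with hypothetical function names; it ignores edge effects, and the CSR comparison covers only the first-order case, using the standard expected mean 1/(2√m) with density m = n/A.

```python
import math

def kth_nn_means(points, k_max=1):
    """Mean observed distance to the 1st..k_max-th nearest neighbour."""
    n = len(points)
    sums = [0.0] * k_max
    for i, p in enumerate(points):
        # Row i of the distance matrix (excluding the point itself), sorted.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for k in range(k_max):
            sums[k] += dists[k]
    return [s / n for s in sums]

def expected_nn_csr(n, area):
    """Expected mean 1st-order NN distance under CSR, density m = n / area."""
    return 0.5 / math.sqrt(n / area)

def nn_index(points, area):
    """Ratio of observed to expected mean 1st-order NN distance."""
    return kth_nn_means(points, 1)[0] / expected_nn_csr(len(points), area)
```

An index well below 1 suggests clustering and well above 1 suggests regular spacing, subject to the boundary and sample-size caveats that apply to NN methods.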

  22. Spatial data exploration • Point (event) based statistics – NN model

  23. Spatial data exploration
      • Point (event) based statistics – NN model
        • Mean NN distance:
        • Variance:
        • NN Index (Ratio):
        • Z-transform:
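For reference, the standard first-order results under CSR for these four quantities are given below. This assumes the usual Clark and Evans formulation, with n events in a study area A, density m = n/A, and observed mean NN distance written as r̄_o; the notation on the slide itself may differ.

```latex
\begin{align*}
\text{Mean NN distance:}\quad & \bar{r}_e = \frac{1}{2\sqrt{m}}, \qquad m = \frac{n}{A} \\
\text{Variance:}\quad & \sigma^2 = \frac{4-\pi}{4\pi m} \\
\text{NN Index (Ratio):}\quad & R = \frac{\bar{r}_o}{\bar{r}_e} \\
\text{Z-transform:}\quad & z = \frac{\bar{r}_o - \bar{r}_e}{\sigma_e},
  \qquad \sigma_e = \sqrt{\frac{4-\pi}{4\pi m n}} \approx \frac{0.26136}{\sqrt{n m}}
\end{align*}
```

Here σ² is the variance of a single NN distance and σ_e is the standard error of the mean of n such distances. R < 1 points towards clustering and R > 1 towards regularity.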

  24. Spatial data exploration
      • Point (event) based statistics
        • Issues
          • Are observations n discrete points?
          • Sample size (esp. for kth order NN, k>1)
          • Model requires density estimation, m
          • Boundary definition problems (density and edge effects) – affects all methods
          • NN reflexivity of point sets
          • Limited use of frequency distribution
          • Validity of Poisson model vs alternative models

  25. Spatial data exploration
      • Frequency distribution of nearest neighbour distances, i.e.
        • The frequency of NN distances in distance bands, say 0-1 km, 1-2 km, etc.
      • The cumulative frequency distribution is usually denoted
        • G(d) = #(d_i < d)/n, where the d_i are the NN distances and n is the number of measurements, or
        • F(d) = #(d_i < d)/m, where m is the number of random points used in sampling and the d_i are the distances from those random points to their nearest events

  26. Spatial data exploration
      • Computing G(d) [computing F(d) is similar]
        • Find all the NN distances
        • Rank them and form the cumulative frequency distribution
        • Compare to expected cumulative frequency distribution:
        • Similar in concept to K-S test with quadrat model, but compute the critical values by simulation rather than table lookup
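A minimal sketch of this computation follows. The function names are hypothetical, the theoretical curve is the standard CSR result G(d) = 1 − exp(−mπd²) for event density m, and the simulation envelope assumes a rectangular study region with no edge correction.

```python
import math
import random

def nn_distances(points):
    """Distance from each event to its nearest neighbouring event."""
    return [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]

def g_function(points, d_values):
    """Empirical G(d): share of NN distances shorter than each d."""
    nn = nn_distances(points)
    return [sum(1 for x in nn if x < d) / len(nn) for d in d_values]

def g_csr(d_values, density):
    """Theoretical G(d) under CSR for a given event density."""
    return [1.0 - math.exp(-density * math.pi * d * d) for d in d_values]

def g_envelope(n, width, height, d_values, sims, rng):
    """Pointwise min/max of G(d) over `sims` simulated CSR patterns."""
    lo = [1.0] * len(d_values)
    hi = [0.0] * len(d_values)
    for _ in range(sims):
        pts = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        g = g_function(pts, d_values)
        lo = [min(a, b) for a, b in zip(lo, g)]
        hi = [max(a, b) for a, b in zip(hi, g)]
    return lo, hi
```

An observed G(d) that rises above the upper envelope at short distances indicates more close neighbours than CSR would produce, which is the simulation-based analogue of looking up a critical value.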

  27. Spatial data exploration
      • Point (event) based statistics – clustering (ESDA)
        • Is the observed clustering due to natural background variation in the population from which the events arise?
        • Over what spatial scales does clustering occur?
        • Are clusters a reflection of regional variations in underlying variables?
        • Are clusters associated with some feature of interest, such as a refinery, waste disposal site or nuclear plant?
        • Are clusters simply spatial or are they spatio-temporal?

  28. Spatial data exploration
      • Point (event) based statistics – clustering
        • kth order NN analysis
        • Cumulative distance frequency distribution, G(d)
        • Ripley K (or L) function – single or dual pattern
        • PCP
        • Hot spot and cluster analysis methods

  29. Spatial data exploration
      • Point (event) based statistics – Ripley K or L
        • Construct a circle, radius d, around each point (event), i
        • Count the number of other events, labelled j, that fall inside this circle
        • Repeat these first two stages for all points i, and then sum the results
        • Increment d by a small fixed amount
        • Repeat the computation, giving values of K(d) for a set of distances, d
        • Adjust to provide ‘normalised measure’ L:
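The counting procedure above translates almost directly into code. The sketch below is a naive estimator without edge correction; the scaling K(d) = (A/n²) × count and the normalisation L(d) = √(K(d)/π) are one common form and are assumptions here, since variants exist (for example dividing by n(n−1), or reporting L(d) − d).

```python
import math

def ripley_k(points, area, d_values):
    """Naive Ripley K estimate for each distance in d_values (no edge correction)."""
    n = len(points)
    # Ordered pairs (i, j), i != j: each event looks out at every other event.
    pair_dists = [math.dist(points[i], points[j])
                  for i in range(n) for j in range(n) if i != j]
    return [area * sum(1 for x in pair_dists if x <= d) / (n * n) for d in d_values]

def ripley_l(k_values):
    """Normalised measure L(d) = sqrt(K(d) / pi); under CSR, L(d) is close to d."""
    return [math.sqrt(k / math.pi) for k in k_values]
```

Plotting L(d) against d, together with an envelope from simulated CSR patterns, shows the distance scales at which the pattern is more clustered (L above the line) or more regular (L below it).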

  30. Spatial data exploration • Point (event) based statistics – Ripley K

  31. Spatial data exploration
      • Point (event) based statistics – comments
        • CSR vs PCP vs other models
        • Data: location, time, attributes, error, duplicates
        • Duplicates: deliberate rounding, data resolution, genuine duplicate locations, agreed surrogate locations, deliberate data modification
        • Multi-approach analysis is beneficial
        • Methods: choice of methods and parameters
        • Other factors: borders, areas, metrics, background variation, temporal variation, non-spatial factors
        • Rare events and small samples
        • Process-pattern vs cause-effect
        • ESDA in most instances

  32. Spatial data exploration
      • Hot spot and cluster analysis – questions
        • where are the main (most intensive) clusters located?
        • are clusters distinct or do they merge into one another?
        • are clusters associated with some known background variable?
        • is there a common size to clusters or are they variable in size?
        • do clusters themselves cluster into higher order groupings?
        • if comparable data are mapped over time, do the clusters remain stable or do they move and/or disappear?

  33. Spatial data exploration
      • Hot spot (and cool-spot) analysis
        • Visual inspection of mapped patterns
        • Scale issues
        • Proximal and duplicate points
        • Point representation (size)
        • Background variation/controls (risk adjustment)
        • Weighted or unweighted
        • Hierarchical or non-hierarchical
        • Kernel & K-means methods
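As an illustration of the kernel approach to hot spot mapping, the sketch below evaluates a Gaussian kernel density surface at the centres of a square grid. The Gaussian kernel, the fixed bandwidth and the function and parameter names are assumptions for the example; it applies no risk adjustment for background variation, and optional weights cover the weighted case.

```python
import math

def kernel_density(events, bandwidth, cell, nx, ny, weights=None):
    """Gaussian kernel density on an nx-by-ny grid of square cells.

    events: list of (x, y); the grid origin is assumed to be (0, 0).
    Returns surface[row][col], evaluated at cell centres.
    """
    if weights is None:
        weights = [1.0] * len(events)          # unweighted events
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    surface = [[0.0] * nx for _ in range(ny)]
    for row in range(ny):
        for col in range(nx):
            gx, gy = (col + 0.5) * cell, (row + 0.5) * cell
            total = 0.0
            for (x, y), w in zip(events, weights):
                d2 = (x - gx) ** 2 + (y - gy) ** 2
                total += w * norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
            surface[row][col] = total
    return surface
```

The bandwidth plays the role of the scale issue noted above: a small value picks out tight local hot spots, while a large value merges neighbouring clusters into broad regions.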

  34. Spatial data exploration
      • Hot spot analysis – Hierarchical NN
        • Cancer incidence data
        • 1st and 2nd order clusters
