
Spatial Statistics and GIS for Environmental Health

Spatial Statistics and GIS for Environmental Health. Frank C. Curriero Environmental Health Sciences and Biostatistics Bloomberg School of Public Health MPT Winter Colloquium January 2006. Objectives. Provide exposure to the field of spatial statistics and the tools of GIS.


Presentation Transcript


  1. Spatial Statistics and GIS for Environmental Health. Frank C. Curriero, Environmental Health Sciences and Biostatistics, Bloomberg School of Public Health. MPT Winter Colloquium, January 2006.

  2. Objectives • Provide exposure to the field of spatial statistics and the tools of GIS. • Keeping it simple, application oriented. • Geography is a source of variation worth considering in environmental health investigations.

  3. What is Spatial Statistics? Statistics for the analysis of spatial data (“spatial” here meaning “geographic”). What is Spatial Data? The “where,” in addition to the “what” that was observed or measured, is important and recorded with the data. Location information (the “where”) can vary. What is GIS? Stands for Geographic Information System. Anything more depends on who you ask!

  4. Western Maryland Superfund Site: DDE Soil Sample Data

     Sample #   Easting   Northing   DDE (ppm)
     1          1108420   725173     160
     2          1108300   725378     4
     110        1108490   725038     92
     ...        ...       ...        ...

  5. Substantive Questions: Does the site exceed regulated levels of DDE contamination, and is it in need of remediation? What is the level of DDE in my backyard?

  6. Kriged DDE Predictions. Kriging: spatial prediction at unsampled locations based on data from sampled locations. Environmental health applications of kriging: exposure maps.
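As a rough illustration of what "spatial prediction at unsampled locations" involves, the sketch below implements ordinary kriging with NumPy for the three DDE samples listed on slide 4. The slides do not say which kriging variant or covariance model was used, so the exponential covariance, the `sill` and `corr_range` values, the target location, and the function names `exp_cov` and `ordinary_krige` are all assumptions made for this example.

```python
import numpy as np

def exp_cov(h, sill, corr_range):
    """Exponential covariance model (an assumed choice): sill * exp(-h / corr_range)."""
    return sill * np.exp(-np.asarray(h, dtype=float) / corr_range)

def ordinary_krige(coords, values, target, sill=1.0, corr_range=200.0):
    """Return (prediction, weights) for ordinary kriging at a single target location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Distances between all sampled locations, and from each sample to the target.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - target, axis=1)

    # Kriging system: covariance block bordered by ones, which (through a
    # Lagrange multiplier) forces the weights to sum to one.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_cov(d, sill, corr_range)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exp_cov(d0, sill, corr_range)

    weights = np.linalg.solve(A, b)[:n]
    return float(weights @ values), weights

# The three samples shown on slide 4 (easting, northing, DDE in ppm).
coords = [(1108420, 725173), (1108300, 725378), (1108490, 725038)]
dde_ppm = [160, 4, 92]

# Hypothetical unsampled location ("my backyard"); not taken from the slides.
pred, w = ordinary_krige(coords, dde_ppm, target=(1108400, 725200))
print(f"predicted DDE: {pred:.1f} ppm, weights sum to {w.sum():.3f}")
```

The prediction is a weighted average of the observed values, with weights driven by distance through the covariance model. In practice the covariance parameters would be estimated from the full data set rather than fixed by hand as they are here.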

  7. Baltimore County Lyme Disease: 1989-1990 (legend: Lyme Case, Lyme Control)

     Lyme Disease Cases and Controls

     Cases                    Controls
     Longitude   Latitude     Longitude   Latitude
     -76.4047    39.3421      -76.4054    39.3419
     -76.3433    39.3736      -76.3522    39.3718
     -76.7592    39.3265      -76.7665    39.3119
     ...         ...          ...         ...

  8. Baltimore County Lyme Disease: 1989-1990 (legend: Lyme Case, Lyme Control). Substantive Questions: Do cases of Lyme Disease tend to cluster, generally or as localized “hot spots”? Does the risk of Lyme Disease vary spatially over Baltimore County? Identify and quantify environmental risk factors associated with Lyme Disease.

  9. Baltimore County Lyme Disease Risk: 1989-1990. Spatial Case/Control Analysis • Spatial density estimate of cases divided by spatial density estimate of controls (nonparametric kernel approach). • Logistic regression approach to include covariates.
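The density-ratio idea in the first bullet can be sketched in a few lines of NumPy. This is only a minimal illustration: the Gaussian kernel, the shared `bandwidth` of 0.05 degrees, the evaluation points, and the function names are assumptions, and longitude/latitude are treated as planar coordinates for simplicity. The case and control locations are the three pairs shown on slide 7.

```python
import numpy as np

def kernel_density(points, grid, bandwidth):
    """Gaussian kernel density estimate of `points`, evaluated at each `grid` location."""
    points = np.asarray(points, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Squared distance from every grid location to every observed point.
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2 * bandwidth**2)) / (2 * np.pi * bandwidth**2)
    return kernel.mean(axis=1)

def relative_risk_surface(cases, controls, grid, bandwidth):
    """Ratio of case density to control density; values above 1 suggest elevated risk."""
    return kernel_density(cases, grid, bandwidth) / kernel_density(controls, grid, bandwidth)

# Case and control locations (longitude, latitude) as listed on slide 7.
cases = [(-76.4047, 39.3421), (-76.3433, 39.3736), (-76.7592, 39.3265)]
controls = [(-76.4054, 39.3419), (-76.3522, 39.3718), (-76.7665, 39.3119)]

# Hypothetical evaluation points; a real analysis would use a fine grid over the county.
grid = [(-76.40, 39.34), (-76.35, 39.37), (-76.76, 39.32)]
print(relative_risk_surface(cases, controls, grid, bandwidth=0.05))
```

Far from every control the denominator approaches zero and the ratio becomes unstable, which is one reason bandwidth choice matters in this kind of analysis.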

  10. Baltimore City 1995 Sexually Transmitted Infections. Population at Risk.

      Baltimore City STI Data

      Census Tract ID   STI Cases
      270703            5
      270501            1
      250402            18
      ...               ...

  11. Statistical Methods Exist to Address • Do cases (events) show a tendency to cluster? • Identifying “clusters” or “hot spots.” • Does risk of disease (or outcome of interest) vary spatially? • Is disease risk elevated near a particular point source? • Spatial prediction of outcomes at unobserved locations. • Risk factor estimation in the presence of residual spatial variation.

  12. Types of Spatial Data. 1. Geostatistical Data: Basic structure is data tagged with locations. Outcomes are usually continuous, but not necessarily. Locations can essentially exist anywhere. Referred to as continuous spatial variation. Example: MD Superfund Site DDE.

  13. 2. Point Pattern Data: Locations are the data, denoting occurrence of events. Common to aggregate to area-level data. Examples: Baltimore County Lyme Disease Cases; Baltimore County Lyme Disease Controls. 3. Area-level Data: Data summarized to an area unit. Rarely arises naturally. Often an aggregate form of point pattern data. Referred to as discrete spatial variation. Example: Baltimore City Census Tract STI.

  14. Why Collect Locations as Part of Data? • Sometimes locations are the only data (as in point patterns). • Risk (or outcome of interest) may vary spatially. • Location can serve as an information gateway to other linked data sources: environmental, demographic, social, etc. • Data are spatially dependent and locations are used in statistical methods that account for this dependence. • In general things can vary spatially and geography (location) may be a source of variation worth considering.

  15. Temporal Dependence • Time series or longitudinal data. • Past/present direction inherent in temporal data. Spatial Dependence • Dimensions > 1 and loss of directional component. • Observations closer together in space are more similar than observations farther apart (clustering). (“In space” here means “on the earth.”)
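One common way to check whether nearby observations really are more alike than distant ones is an empirical semivariogram, which averages half the squared difference between pairs of values within distance bins. The slides do not name this diagnostic, so it is offered here only as one possible illustration; the function name, the distance bins, and the synthetic gridded data are assumptions.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Mean of 0.5 * (z_i - z_j)^2 over all pairs whose separation falls in each distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)          # each unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_sq_diff = 0.5 * (values[i] - values[j]) ** 2

    gamma = np.full(len(bin_edges) - 1, np.nan)       # NaN marks an empty bin
    for b in range(len(bin_edges) - 1):
        in_bin = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = half_sq_diff[in_bin].mean()
    return gamma

# Synthetic, spatially smooth data on a 10 x 10 grid (not from the slides).
xs, ys = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([xs.ravel(), ys.ravel()])
values = coords[:, 0] + coords[:, 1]

print(empirical_semivariogram(coords, values, bin_edges=[0, 2, 4, 6]))
```

For spatially dependent data the semivariance is small at short distances and grows with separation; for data with no spatial structure it stays roughly flat across bins.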

  16. Spatial Dependence (clustering) in Environmental Health Data. Could be due to: • A contagious agent of the outcome under investigation. • The spatial variation in the population at risk. • An underlying shared environmental characteristic, measured or unmeasured, that also varies spatially (Shared Environment Effect).

  17. What’s so Special about Spatial Data • Spatial data can be complex: - Size: large number of data records (remotely sensed data) and/or large number of variables. - Scale: spatial data often exist in 2 or higher dimensions, so scale and resolution can become an issue. - Structure: features such as point landmark locations, roads, and boundaries (e.g. census tracts), and coordinate systems (the earth is not flat) further complicate spatial data. - Spatial Dependence: spatial data tend to cluster.

  18. What’s so Special about Spatial Data (cont.) • Spatial investigations often involve generating and combining data from numerous sources. • Spatial data can always be represented as a map. • Spatial data analytic results are often better communicated, and/or need to be communicated, via a map. • Maps make you think.

  19. What is a GIS? One-word definition: Database. Two-word definition: Visual Database. A visual database for geographic data that • Stores • Manipulates • Manages • Queries • Creates • Displays . . . MAPS. “Layer cake of information.”

  20. What else: - A computer system (piece of software) with a tremendous amount of capability for storing, querying, combining, presenting, . . . , spatial data. - GIS is designed specifically for spatial data and hence built to handle all of its complicated features. - GIS is a generic name, like word processor. ArcGIS, MapInfo, and Idrisi are examples of different GIS. - The earth does not have to be the backdrop for every GIS application, but it is certainly the most common.

  21. What else (cont.) - Public health was not the first, and will probably not be the last, application of GIS and spatial statistics. - GIS as a mechanism for generating hypotheses (exploratory spatial data analysis). - GIS is a tool, a very powerful and valuable tool when working with spatial data. - GIS: a technological tool or a science?

  22. What GIS is Not • A complete system for statistical or scientific inference. • Maps, the most basic and fundamental concept in GIS, are not statistical inference. • A GIS map of one variable is analogous to a histogram display; a map of two variables overlaid is analogous to an x-y scatterplot or 2x2 table. • In statistics we go beyond histograms and scatterplots.

  23. An Important Distinction: In the GIS literature, “analysis” or “spatial analysis” often means spatial data manipulation, which is something different from statistical analysis.

  24. US Waterborne Disease Outbreaks, 1948-1994

      Outbreak Data

      Location         Longitude   Latitude   Month   Year
      AL, Anniston     -85.83      33.65      Oct     1953
      AL, Center Pt.   -86.68      33.63      Nov     1958
      WY, Cody         -109.06     44.53      July    1986
      ...              ...         ...        ...     ...

  25. US Waterborne Disease Outbreaks, 1948-1994. Substantive Questions: Do outbreaks occur at random across the US? Are outbreaks preceded by extreme precipitation events? Does the risk of an outbreak vary spatially, and is it related to watershed vulnerability?

  26. GIS Demonstration. With Kathryn Kulbicki, GIS Database Specialist, Environmental Health Sciences, Bloomberg School of Public Health.

  27. Objective: Association between extreme precip. and outbreaks. Methods: Overlay map of outbreaks and extreme precip events • 2,105 watersheds (USGS) • 16,000+ weather stations (NCDC) • define extreme precipitation • aggregate precip and outbreak to watershed. Results: 51% of outbreaks were coincident with extreme levels of precip (in the highest 90th percentile) within a 2 month lag preceding the outbreak month. Conclusion: Is this evidence of an association?
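The closing question on slide 27 can be made concrete by asking how often an outbreak would coincide with extreme precipitation purely by chance. The sketch below is one hedged way to frame that comparison, not the method used in the study. It assumes that "extreme" means a monthly total above the 90th percentile for the watershed, that the window is the outbreak month plus the two preceding months, and that months are independent. The function names and the toy precipitation series are hypothetical.

```python
import numpy as np

def extreme_flags(monthly_precip, pct=90.0):
    """Flag months whose precipitation exceeds the pct-th percentile of the series."""
    precip = np.asarray(monthly_precip, dtype=float)
    return precip > np.percentile(precip, pct)

def coincident(flags, outbreak_index, lag=2):
    """True if the outbreak month or any of the `lag` preceding months is flagged extreme."""
    start = max(0, outbreak_index - lag)
    return bool(flags[start:outbreak_index + 1].any())

def chance_coincidence(extreme_prob=0.10, lag=2):
    """Probability of at least one extreme month in the window, assuming independent months."""
    return 1.0 - (1.0 - extreme_prob) ** (lag + 1)

# Toy monthly precipitation series for one watershed (not from the slides).
precip = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
flags = extreme_flags(precip)

print(coincident(flags, outbreak_index=9))   # outbreak in the extreme month itself
print(coincident(flags, outbreak_index=5))   # outbreak with no extreme month in its window
print(f"chance coincidence: {chance_coincidence():.3f}")
```

Under these assumptions the chance figure is 1 - 0.9^3, or about 27%, which gives the reported 51% something to be compared against. Real precipitation is seasonal and autocorrelated, so this is only a rough yardstick and a formal analysis would need a more careful reference distribution.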

  28. US Waterborne Disease Outbreaks, 1948-1994 • Map generation included many involved GIS tasks on numerous data sources (GIS spatial analysis). • Statistically speaking, though, it represents risk factor data. • Spatial statistics often considers the map as a starting point, which in GIS is often an endpoint.

  29. Closing Remarks • Spatial Science for Environmental Health • Spatial Statistics and GIS • Location as an information gateway • Biography and Geography of Public Health
