Statistical approaches for detecting unexplained clusters of disease. Spatial Aggregation. Thomas Talbot, New York State Department of Health, Environmental Health Surveillance Section. Albany School of Public Health, GIS & Public Health Class, March 3, 2009.
Cluster • A number of similar things grouped closely together (Webster’s Dictionary) • Unexplained concentrations of health events in space and/or time (Public Health Definition)
Occupation • Sex, Age • Socioeconomic class • Behavior (smoking) • Race • Time • Space
Spatial Autocorrelation • “Everything is related to everything else, but near things are more related than distant things.” - Tobler’s first law of geography • Negative autocorrelation • Positive autocorrelation
Moran’s I • A test for spatial autocorrelation in disease rates. • When nearby areas tend to have similar rates of disease, Moran’s I is greater than 0: positive spatial autocorrelation. • When nearby areas are dissimilar, Moran’s I is less than 0: negative spatial autocorrelation. • Values near 0 indicate no spatial autocorrelation.
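The sign behaviour of Moran’s I can be illustrated with a minimal sketch. It assumes a binary adjacency weight matrix and four hypothetical areas arranged in a row; the function name `morans_i` and the example rates are illustrative and do not come from the slides.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of area rates and a spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all spatial weights.
    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations over every pair of areas.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Hypothetical layout: four areas in a row, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

similar_neighbours = morans_i([1.0, 2.0, 3.0, 4.0], chain)     # smooth gradient
dissimilar_neighbours = morans_i([1.0, 4.0, 1.0, 4.0], chain)  # alternating rates

print(similar_neighbours > 0, dissimilar_neighbours < 0)
```

In practice the significance of the statistic is usually judged against a permutation or normal-approximation null distribution, which this sketch leaves out.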
Detecting Clusters • Consider scale • Consider zone • Control for multiple testing
Cluster Questions • Does a disease cluster in space? • Does a disease cluster in both time and space? • Where is the most likely cluster? • Where is the most likely cluster in both time and space?
More Cluster Questions • At what geographic or population scale do clusters appear? • Are cases of disease clustered in areas of high exposure?
Nearest Neighbor Analysis: Cuzick & Edwards Method • Count the number of cases whose nearest neighbors are cases and not controls. • When cases are clustered, the nearest neighbor to a case will tend to be another case, and the test statistic will be large.
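The counting step can be sketched as follows for the simplest setting, where only the single nearest neighbor of each case is examined. The function name `cuzick_edwards_t1` and the coordinates are hypothetical, and straight-line distance is an assumption.

```python
import math

def cuzick_edwards_t1(points, is_case):
    """Count cases whose single nearest neighbour is also a case."""
    count = 0
    for i, (xi, yi) in enumerate(points):
        if not is_case[i]:
            continue  # only cases contribute to the statistic
        # Index of the closest other point, case or control.
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.hypot(points[j][0] - xi, points[j][1] - yi),
        )
        if is_case[nearest]:
            count += 1
    return count

# Hypothetical data: the first three points are cases, the last three controls.
labels = [True, True, True, False, False, False]
clustered = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.2, 5), (9, 9)]
dispersed = [(0, 0), (5, 5), (9, 9), (0.1, 0), (5.1, 5), (9.1, 9)]

t_clustered = cuzick_edwards_t1(clustered, labels)
t_dispersed = cuzick_edwards_t1(dispersed, labels)
print(t_clustered, t_dispersed)
```

A significance test would compare the observed count with counts obtained after randomly relabelling which points are cases, which is not shown here.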
Advantages • Accounts for the geographic variation in population density • Accounts for confounders through judicious selection of controls • Can detect clustering with many small clusters
Disadvantages • Must have spatial locations of cases & controls • Doesn’t show location of the clusters
Spatial Scan Statistic (Martin Kulldorff) • Determines the location with an elevated rate that is statistically significant. • Adjusts for multiple testing of the many possible locations and area sizes of clusters. • Uses Monte Carlo testing techniques.
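The three ideas above can be sketched in a simplified form. This is not SaTScan: it assumes a Poisson model, circular windows centred on area centroids, a window limit of half the total population, and a hypothetical 3 x 3 grid of areas. All names (`llr`, `best_window`, `scan_p_value`) are illustrative.

```python
import math
import random

def llr(c, e, total):
    """Poisson log likelihood ratio for a window with c observed, e expected cases."""
    if c <= e:
        return 0.0  # only windows with an elevated rate are of interest
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside

def best_window(coords, cases, pop, max_pop_frac=0.5):
    """Return (max LLR, member indices) over circles centred on each area."""
    total_cases, total_pop = sum(cases), sum(pop)
    best = (0.0, [])
    for cx, cy in coords:
        # Grow the circle by adding areas in order of distance from the centre.
        order = sorted(range(len(coords)),
                       key=lambda j: math.hypot(coords[j][0] - cx, coords[j][1] - cy))
        c = p = 0
        for k, j in enumerate(order):
            c += cases[j]
            p += pop[j]
            if p > max_pop_frac * total_pop:
                break
            e = total_cases * p / total_pop
            score = llr(c, e, total_cases)
            if score > best[0]:
                best = (score, order[:k + 1])
    return best

def scan_p_value(coords, cases, pop, n_sim=199, seed=0):
    """Monte Carlo p-value for the most likely cluster."""
    rng = random.Random(seed)
    observed, members = best_window(coords, cases, pop)
    total_cases = sum(cases)
    exceed = 0
    for _ in range(n_sim):
        # Null hypothesis: each case falls in an area with probability
        # proportional to that area's population.
        sim = [0] * len(pop)
        for j in rng.choices(range(len(pop)), weights=pop, k=total_cases):
            sim[j] += 1
        if best_window(coords, sim, pop)[0] >= observed:
            exceed += 1
    return observed, members, (exceed + 1) / (n_sim + 1)

# Hypothetical 3 x 3 grid: equal populations, one area with many more cases.
grid_coords = [(x, y) for y in range(3) for x in range(3)]
grid_pop = [1000] * 9
grid_cases = [5, 5, 5, 5, 30, 5, 5, 5, 5]

score, members, p_value = scan_p_value(grid_coords, grid_cases, grid_pop)
print(members, p_value)
```

Because each simulated data set is compared on its own maximum over all windows, the single p-value already accounts for the many locations and sizes that were examined.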
The Space-Time Scan Statistic • Cylindrical window with a circular geographic base and a height corresponding to time. • Cylindrical window is moved in space and time. • P value for each cylinder calculated.
Knox Method: test for space-time interaction • Test statistic: the number of pairs of cases that are near in both time and space. • When space-time interaction is present, cases that are near in space will also be near in time, and the test statistic will be large.
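A minimal sketch of the Knox count follows. The cutoffs that define "near" in space and time are chosen by the analyst; the values, the event list, and the function name `knox_statistic` are hypothetical.

```python
import math
from itertools import combinations

def knox_statistic(events, space_cutoff, time_cutoff):
    """Count pairs of cases that are close in both space and time.

    events is a list of (x, y, t) tuples, one per case.
    """
    close_pairs = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(events, 2):
        near_in_space = math.hypot(x1 - x2, y1 - y2) <= space_cutoff
        near_in_time = abs(t1 - t2) <= time_cutoff
        if near_in_space and near_in_time:
            close_pairs += 1
    return close_pairs

# Hypothetical cases: three close together in space and time, plus two that
# are close in space but far apart in time.
events = [(0, 0, 1), (0.5, 0, 2), (0.2, 0.3, 3), (10, 10, 2), (10.4, 10, 40)]
knox = knox_statistic(events, space_cutoff=1.0, time_cutoff=5)
print(knox)
```

Significance is commonly assessed by shuffling the times among the locations and recomputing the count, which is omitted here.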
Focal tests for clustering • Cross sectional or cohort approach: Is there a higher rate of disease in populations living in contaminated areas compared to populations in uncontaminated areas? (Relative risk) • Case/control approach: Are there more cases than controls living in a contaminated area? (Odds ratio)
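The two measures named above can be worked through with a short sketch. All counts are invented for illustration, and the function names are hypothetical.

```python
def relative_risk(exposed_cases, exposed_pop, unexposed_cases, unexposed_pop):
    """Cohort or cross-sectional approach: ratio of disease rates."""
    return (exposed_cases / exposed_pop) / (unexposed_cases / unexposed_pop)

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Case/control approach: odds of exposure in cases vs. controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical cohort: 30 cases among 10,000 residents of a contaminated
# area, 100 cases among 100,000 residents elsewhere.
rr = relative_risk(30, 10_000, 100, 100_000)

# Hypothetical case/control study: 20 of 100 cases and 10 of 100 controls
# live in the contaminated area.
oratio = odds_ratio(20, 80, 10, 90)

print(round(rr, 2), oratio)
```

Values above 1 suggest more disease in the contaminated area; confidence intervals, which this sketch omits, are needed before drawing conclusions.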
Focal Case-Control Design 500 m. 250 m. Case Control
Regression Analysis • Control for known risk factors before analyzing for spatial clustering. • Analyze for unexplained clusters. • Follow up in areas with large regression residuals with traditional case-control or cohort studies. • Obtain additional risk factor data to account for the large residuals.
At what geographic or population scale do clusters appear? Multiresolution mapping.
A cluster of cases in a neighborhood has a different epidemiological meaning than a cluster of cases across several adjacent counties. Results can change dramatically with the scale of analysis.
References • Talbot TO, Kulldorff M, Forand SP, Haley VB. Evaluation of Spatial Filters to Create Smoothed Maps of Health Data. Statistics in Medicine. 2000, 19:2451-2467 • Forand SP, Talbot TO, Druschel C, Cross PK. Data Quality and the Spatial Analysis of Disease Rates: Congenital Malformations in New York. Health and Place. 2002, 8:191-199 • Haley VB, Talbot TO. Geographic Analysis of Blood Lead Levels in New York State Children Born 1994-1997. Environmental Health Perspectives. 2004, 112(15):1577-1582 • Kulldorff M, National Cancer Institute. SaTScan User Guide. www.satscan.org
Geographic Aggregation of Health Data, by Thomas Talbot, NYS Department of Health, Environmental Health Surveillance Section
Health data can be shown at different geographic scales • Residential address • Census blocks and tracts • Towns • Counties • State
Concerns about release of small area data • Risk of disclosure of confidential information. • Rates of disease are unreliable due to small numbers.
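The instability of rates based on small numbers can be made concrete with a short sketch. It assumes case counts follow a Poisson distribution, so the relative standard error of a rate is roughly 1 divided by the square root of the case count; the populations and counts are hypothetical.

```python
import math

def rate_with_rse(cases, population, per=100_000):
    """Return (rate per `per` people, relative standard error of the rate).

    Assumes Poisson-distributed counts, so RSE is about 1 / sqrt(cases).
    """
    rate = cases / population * per
    rse = 1 / math.sqrt(cases)
    return rate, rse

# Same underlying rate, very different reliability.
small_rate, small_rse = rate_with_rse(cases=4, population=2_000)
large_rate, large_rse = rate_with_rse(cases=400, population=200_000)

print(small_rate, small_rse)
print(large_rate, large_rse)
```

Both areas show 200 cases per 100,000, but the small-area rate carries a relative standard error of 50% against 5% for the larger area.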
Rate maps with small numbers provide very little information. http://www.nyhealth.gov/statistics/ny_asthma/hosp/zipcode/hamil_t2.htm http://www.nyhealth.gov/statistics/ny_asthma/hosp/zipcode/pdf/hamil_m2.pdf
Disclosure of confidential information Census Blocks
Smoothed or Aggregated Count & Rate Maps • Protect Confidentiality so data can be shared. • Reduce random fluctuations in rates due to small numbers.