Geographic linkages between National Center for Health Statistics’ population health surveys and air quality measures. NSF-Census Research Network Meeting, September 11, 2014. Jennifer D. Parker, National Center for Health Statistics.
Background • Air pollution has been linked to numerous acute and chronic health outcomes, including mortality, asthma exacerbation, emergency department visits, and other cardiovascular and respiratory outcomes • The Environmental Protection Agency (EPA) regulates air quality, largely on the basis of health impacts • Few national studies are available
National Center for Health Statistics • 1960 - National Office of Vital Statistics and the National Health Survey merged to form National Center for Health Statistics • 1987 - NCHS became part of the Centers for Disease Control and Prevention • Official Federal Health Statistics Agency and part of the Federal Statistical System • Data products • A principal source of health statistics for the Department of Health and Human Services • Public use micro-data: individual records to enable data users to calculate estimates • Integrated Data Tools: reports and online tools with pre-tabulated indicators • Restricted use micro-data: individual records with information that could potentially be used to identify an individual • Geographic identifiers and dates • Genetic information, detailed race and ethnicity • Can be accessed in the NCHS Research Data Center (RDC)
Population Health Surveys: National Health Interview Survey • Household interview survey – detailed demographic, health status, health insurance, access to care, and economic information • Generally chronic health outcomes rather than acute • Approximately 35,000 – 40,000 households, 80,000 – 100,000 individuals per year • Data years can be combined to increase sample size or study trends • Complex sample design (stratified, clustered, oversampling) • Collected since 1957 • Interview dates and geocoded residential locations available on restricted use files • Locations can be used to examine characteristics of locations, not actual places
Population Health Surveys: National Health and Nutrition Examination Survey • National examinations since 1960; an ongoing survey since 1999 • Physical measures, blood, urine, DNA • Household interview and 3-4 hr exam • Complex survey design • About 5,000 total survey respondents each year • About 15 locations each year, 30 locations in a two-year data release; data can be combined across releases • Exam dates, interview dates and geocoded residential locations available on restricted use files • Locations can be used to examine characteristics of locations, not actual places • Data collected at each location over a few weeks
National Vital Statistics System (NVSS): Death and Birth Data • Vital statistics data can be used to assess events during specific time windows, at specific locations, aggregated locations, and/or for specific endpoints • Death data have been combined with daily pollution data at the county level to study acute effects of air pollution • Birth data have been combined with county-level pollution data to study birth weight, infant mortality, preterm birth
National Health Care Surveys • Provider surveys designed to provide a comprehensive look at medical care provided in the U.S. • Information generally obtained from administrative records (e.g., claims), not interviews • Selected covariates: age, race, expected source of payment, procedures and diagnoses • Data can be combined across years to increase power • Restricted use files can contain dates, locations of hospitals, and zip codes of patients • Locations can be used to examine characteristics of locations, not actual places
EPA Air Quality System • EPA is a regulatory agency • Selected pollution data • Criteria pollutants – ozone, sulfur dioxide, carbon monoxide, particulate matter, nitrogen dioxide, lead • Collected at monitors throughout the U.S. under local authority • Both raw numbers and annual summaries available • Air Quality System (AQS) is EPA’s repository for ambient air quality data • Monitors, however, are not located everywhere • Placement is not random; monitors are sited for regulatory purposes and convenience • Not all locations monitor for all pollutants, not all pollutants are monitored at the same frequency, and the ability to extrapolate from a monitor to a point some distance away varies by pollutant • As the AQS has become more user-friendly, the number of studies using these data has greatly expanded • More recently, EPA’s Community Multiscale Air Quality (CMAQ) model provides estimates for locations with no monitors
Hypothetical sampled segments and locations of monitors within a hypothetical NCHS sampled county • Not all monitors are near NCHS survey participants • Exposure estimates are highly clustered
Combining NCHS survey and EPA data • Complex sample survey designs • Not all counties are in the sample • While the surveys are designed to be nationally representative, the merged data may not be representative with respect to pollution • Only ~ 30 counties sampled in a two-year NHANES release • Health measures and pollution measures may not correspond in time • Oversampling of subpopulations increases sample size for understudied groups
Combining NCHS survey and EPA data, continued • Surveys currently geocoded to latitude and longitude of residence and/or administrative units • Air quality data • Latitude and longitude of monitor locations • Administrative units (county) • Modeled air pollution data available at lat/long and/or smaller administrative units • Basic kriging • EPA’s CMAQ – Community Multiscale Air Quality • Others, including Land Use Regression (LUR) • Geographically linked data cannot be used for maps or to identify sensitive locations, only as a contextual variable for analysis (e.g., areas with higher compared to lower levels of air pollution)
Evolution of geographic linkages between NCHS survey data and EPA data
Childhood asthma and county-level PM2.5 12-month average air pollution, 2001-2004 NHIS. Panels: current asthma; asthma attack in past 12 months; ER visit in past 12 months
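The 12-month average exposure named in the slide title can be sketched as follows. This is a minimal illustration, not the method used in the analysis: it assumes the window is the 12 calendar months preceding the interview month and that county-level monthly PM2.5 means are already available. The function name `prior_12_month_mean`, the `monthly_pm25` dictionary layout, and the `min_months` completeness rule are all hypothetical.

```python
from datetime import date

def prior_12_month_mean(monthly_pm25, interview_date, min_months=12):
    """Average county PM2.5 over the 12 calendar months before the interview month.

    monthly_pm25: dict mapping (year, month) -> county mean PM2.5 (ug/m3).
    Returns None when fewer than `min_months` months have data
    (an assumed completeness rule, not one stated in the slides).
    """
    values = []
    year, month = interview_date.year, interview_date.month
    for _ in range(12):
        # Step back one calendar month, wrapping across the year boundary.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
        if (year, month) in monthly_pm25:
            values.append(monthly_pm25[(year, month)])
    if len(values) < min_months:
        return None
    return sum(values) / len(values)

# Hypothetical county series: complete data for 2002 only.
county_series = {(2002, m): float(m) for m in range(1, 13)}
exposure = prior_12_month_mean(county_series, date(2003, 1, 15))
```

A respondent interviewed in March 2003 would get `None` here, because two of the twelve months in the window have no data. This is the same trade-off between a tightly defined exposure window and the number of respondents who can be linked.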
Percent NHIS respondents linked to pollutants by linkage method: 2000-2003 NHIS
Percent distribution race/ethnicity. NHIS linked to PM2.5, by linkage method: 2000-2003 NHIS
Percent with PM2.5 data by linkage method and time from examination: 1999-2004 NHANES. Linkage methods shown: 5 miles, 20 miles, county. SOURCE: NHANES 1999-2004, unpublished data
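A distance-based linkage of the kind compared above can be sketched in a few lines. The sketch assumes that the "5 miles" and "20 miles" methods mean averaging all monitors within that radius of the geocoded residence; the slides do not spell out the rule. The function names, the monitor record layout, and the coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def link_by_distance(residence, monitors, radius_miles):
    """Mean PM2.5 of monitors within `radius_miles` of the residence.

    Returns None when no monitor is in range, i.e. the respondent is unlinked.
    """
    lat, lon = residence
    nearby = [
        m["pm25"]
        for m in monitors
        if haversine_miles(lat, lon, m["lat"], m["lon"]) <= radius_miles
    ]
    return sum(nearby) / len(nearby) if nearby else None

# Hypothetical monitors: one under a mile away, one roughly 13 miles away.
monitors = [
    {"lat": 39.00, "lon": -77.00, "pm25": 10.0},
    {"lat": 39.20, "lon": -77.00, "pm25": 14.0},
]
home = (39.01, -77.00)
pm_5mi = link_by_distance(home, monitors, 5)    # uses only the near monitor
pm_20mi = link_by_distance(home, monitors, 20)  # averages both monitors
```

Widening the radius links more respondents but assigns them exposure from monitors that are farther away, which is why the share of respondents with data differs by linkage method.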
Adjusted* odds of fair/poor health status by PM2.5 (per 10 µg/m3) by geographic linkage method, 2000-2003 NHIS. *Adjusted for poverty, age, region, urban/rural, and race/ethnicity
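For readers unfamiliar with "per 10 µg/m3" scaling, the following shows how such an odds ratio is typically obtained. It assumes a logistic regression with PM2.5 entered as a linear term, which the slide does not state explicitly; the symbols $\beta_1$, $\boldsymbol{\gamma}$, and $\mathbf{x}_i$ are introduced here for illustration only.

```latex
% Y_i = 1 if respondent i reports fair/poor health; PM_i is linked PM2.5 in ug/m^3;
% x_i holds the adjustment covariates (poverty, age, region, urban/rural, race/ethnicity).
\[
  \operatorname{logit}\Pr(Y_i = 1) = \beta_0 + \beta_1\,\mathrm{PM}_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i
\]
% Odds ratio for a 10 ug/m^3 increase in PM2.5, holding x_i fixed:
\[
  \mathrm{OR}_{10} = \exp(10\,\beta_1) = \bigl(\exp(\beta_1)\bigr)^{10}
\]
```

Under this assumed model, comparing linkage methods amounts to comparing the estimates of $\beta_1$ (and their standard errors) obtained when $\mathrm{PM}_i$ comes from each method.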
Application of kriging used to obtain air measures for the NHIS
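As a rough illustration of what kriging does, the sketch below implements ordinary kriging for a single target location with NumPy. It is not the procedure used for the NHIS linkage: the exponential semivariogram and its parameters are assumed rather than fitted, coordinates are treated as planar (for example, kilometers in a projected system), and all names and values are hypothetical.

```python
import numpy as np

def exp_variogram(h, sill=1.0, range_km=50.0):
    """Exponential semivariogram with no nugget (parameters assumed, not fitted)."""
    return sill * (1.0 - np.exp(-h / range_km))

def ordinary_krige(monitor_xy, monitor_values, target_xy, variogram=exp_variogram):
    """Ordinary kriging estimate at one target point.

    Solves [[Gamma, 1], [1^T, 0]] [w, mu]^T = [gamma_0, 1]^T, where Gamma holds
    semivariances between monitors and gamma_0 those between monitors and target.
    Returns (estimate, weights); the weights sum to 1.
    """
    xy = np.asarray(monitor_xy, dtype=float)
    z = np.asarray(monitor_values, dtype=float)
    target = np.asarray(target_xy, dtype=float)
    n = len(z)

    # Pairwise monitor-to-monitor distances, then the bordered kriging matrix.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0

    # Right-hand side: monitor-to-target semivariances plus the unbiasedness constraint.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - target, axis=1))

    solution = np.linalg.solve(A, b)
    weights = solution[:n]  # last element is the Lagrange multiplier
    return float(weights @ z), weights

# Hypothetical monitors (planar km coordinates) and annual PM2.5 means.
monitor_xy = [(0, 0), (30, 0), (0, 40), (25, 35)]
monitor_pm25 = [8.0, 12.0, 10.0, 15.0]
estimate, weights = ordinary_krige(monitor_xy, monitor_pm25, (10, 10))
```

With no nugget term the estimate reproduces the observed value at a monitor location, and because the weights sum to one a constant field is returned unchanged. A real application would fit the variogram to the monitor data and report the kriging variance alongside the estimate.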
Relationship between asthma and PM2.5 using kriged PM2.5 estimates. NHIS 2002-2005
What was learned comparing distance and county measures • The more tightly controlled the exposure window in terms of space and time, the fewer survey respondents have monitor data • Large literature on ‘exposure measurement error’ • Growing interest in time-use patterns • The more pollutants examined simultaneously, the fewer survey respondents would be included • More urban and disadvantaged populations would be more likely to be included • Impact on policy inferences unknown for all outcomes, but may be small • Adjustment of sample weights may partially address this issue • Random effects and Bayesian statistical models may address these issues
Modeled Data from CDC/National Center for Environmental Health • Downscaler (DS) model based on Bayesian space-time methods • Combines monitored data from AQS with EPA CMAQ data • Daily, county, tract level • Available for ozone and PM2.5 for 2001-2008
Hazard ratios for association between heart disease mortality and PM2.5 (2001), black and white adults. 1986-2004 NHIS with mortality follow-up through 2006
What we’ve learned (so far) from DS model • Inferences for mortality and PM2.5 are similar for monitor data using AQS and the DS data • Confidence intervals narrower for DS models • Inferences for black adults similar to those for white adults • Availability of DS data in more locations may allow for inference by geographic indicators (e.g., region of the country, levels of urbanization) • Model error should be propagated into epidemiological models with complex survey data, but the best method for doing so is unknown
Conclusion • Linkages between national health data and monitored and modeled air pollution data are resources for studying environmental health • Nationally representative data include population subgroups less often included in other studies • Wide range of health outcomes • Numerous methodological challenges • Temporal correspondence between exposure measures and health outcome • Geographic correspondence between exposure measures and health outcomes • Monitor locations • Residential and daily mobility • Incorporation of estimation and model errors into inference analyses
Thanks to my collaborators! • Nataliya Kravets • Ambarish Vaidyanathan • Lara Akinbami • Keeve Nachman • Tracey Woodruff
Contact: Jennifer D. Parker Chief, Special Projects Branch Office of Analysis and Epidemiology National Center for Health Statistics jdparker@cdc.gov (301) 458-4419