450 likes | 599 Views
Hierarchical Models for Pooling: A Case Study in Air Pollution Epidemiology. Francesca Dominici. NMMAPS. National Morbidity and Mortality Air Pollution Study (NMMAPS) Daily data on cardiovascular/respiratory mortality in 10 largest cities in U.S. Daily particulate matter (PM10) data
E N D
Hierarchical Models for Pooling: A Case Study in Air Pollution Epidemiology Francesca Dominici BIO656 Multilevel Models
NMMAPS • National Morbidity and Mortality Air Pollution Study (NMMAPS) • Daily data on cardiovascular/respiratory mortality in 10 largest cities in U.S. • Daily particulate matter (PM10) data • Log-linear regression estimate relative risk of mortality per 10 unit increase in PM10 for each city • Estimate and statistical standard error for each city BIO656 Multilevel Models
Relative Risks* for Six Largest Cities Approximate values read from graph in Daniels, et al. 2000. AJE BIO656 Multilevel Models
Notation BIO656 Multilevel Models
Sources of Variation BIO656 Multilevel Models
Notation BIO656 Multilevel Models
Estimating Overall Mean • Idea: give more weight to more precise values • Specifically, weight estimates inversely proportional to their variances BIO656 Multilevel Models
Estimating the Overall Mean BIO656 Multilevel Models
Calculations for Empirical Bayes Estimates a= .27* 0.25 + .18*1.4 + .27*0.60 + .07*0.25 + .11*0.45 + 0.9*1.0 = 0.65 Var(a) = 1/Sum(1/TVc) = 0.164^2 BIO656 Multilevel Models
Software in R beta.hat <-c(0.25,1.4,0.50,0.25,0.45,1.0) se <- c(0.13,0.25,0.13,0.55,0.40,0.45) NV <- var(beta.hat) - mean(se^2) TV <- se^2 + NV tmp<- 1/TV ww <- tmp/sum(tmp) v.alphahat <- sum(ww)^{-1} alpha.hat <- v.alphahat*sum(beta.hat*ww) BIO656 Multilevel Models
Two Extremes • Natural variance >> Statistical variances • Weights wc approximately constant = 1/n • Use ordinary mean of estimates regardless of their relative precision • Statistical variances >> Natural variance • Weight each estimator inversely proportional to its statistical variance BIO656 Multilevel Models
Estimating Relative Risk for Each City • Disease screening analogy • Test result from imperfect test • Positive predictive value combines prevalence with test result using Bayes theorem • Empirical Bayes estimator of the true value for a city is the conditional expectation of the true value given the data BIO656 Multilevel Models
Empirical Bayes Estimate BIO656 Multilevel Models
Calculations for Empirical Bayes Estimates BIO656 Multilevel Models
Maximum likelihood estimates Empirical Bayes estimates BIO656 Multilevel Models
Key Ideas • Better to use data for all cities to estimate the relative risk for a particular city • Reduce variance by adding some bias • Smooth compromise between city specific estimates and overall mean • Empirical-Bayes estimates depend on measure of natural variation • Assess sensivity to estimate of NV BIO656 Multilevel Models
Daily time series of air pollution, mortality and weather in Baltimore 1987-1994 BIO656 Multilevel Models
90 Largest Locations in the USA BIO656 Multilevel Models
Statistical Methods • Semi-parametric regressions for estimating associations between day-to-day variations in air pollution and mortality controlling for confounding factors • Hierarchical Models for estimating: • national-average relative rate • national-average exposure-response relationship • exploring heterogeneity of air pollution effects across the country BIO656 Multilevel Models
Hierarchical Models for Estimating a National AverageRelative Rate of Mortality BIO656 Multilevel Models
Pooling City-specific relative rates are pooled across cities to: • estimate a national-average air pollution effect on mortality; • explore geographical patterns of variation of air pollution effects across the country BIO656 Multilevel Models
Pooling • Implement the old idea of borrowing strength across studies • Estimate heterogeneity and its uncertainty • Estimate a national-average effect which takes into account heterogeneity BIO656 Multilevel Models
City-specific and regional estimates City-specific and regional estimates BIO656 Multilevel Models
Spatial Model for Relative Rates BIO656 Multilevel Models
Three Models • “Three stage”- as in previous slide • “Two stage”- ignore region effects; assume cities have exchangeable random effects • Two stage with “spatial” correlation-city random effects have isotropic exponentially decaying autocorrelation function BIO656 Multilevel Models
Estimating a national-average relative rate Dominici, Zeger, Samet RSSA 2000 Samet, Dominici, Zeger et al. NEJM 2000 BIO656 Multilevel Models
Epidemiological Evidence from NMMAPS BIO656 Multilevel Models
Maximum likelihood and Bayesian estimates of air pollution effects Borrow strength across cities Use only city-specific information Dominici, McDermott, Zeger, Samet EHP 2003 BIO656 Multilevel Models
Shrinkage BIO656 Multilevel Models
Posterior Distribution of National Average BIO656 Multilevel Models
Results Stratified by Cause of Death BIO656 Multilevel Models
Regional map of air pollution effects Partition of the United States used in the 1996 Review of the NAAQS BIO656 Multilevel Models
Findings • NMMAPS has provided at least four important findings about air pollution and mortality • There is evidence of an association between acute exposure to particulate air pollution and mortality • This association is strongest for cardiovascular and respiratory mortality • The association is strongest in the Northeast region of the USA • The exposure-response relationship is linear BIO656 Multilevel Models
Caveats • Used simplistic methods to illustrate the key ideas: • Treated natural variance and overall estimate as known when calculating uncertainty in EB estimates • Assumed normal distribution or true relative risks • Can do better using Markov Chain Monte Carlo methods – more to come BIO656 Multilevel Models