
Inference from ecological models: air pollution and stroke using data from Sheffield, England.

Inference from ecological models: air pollution and stroke using data from Sheffield, England. Ravi Maheswaran, Guangquan Li, Jane Law, Robert Haining, Marta Blangiardo, Sylvia Richardson, Nicky Best. Outline: Background to the Sheffield study and results presented at Geomed 2005.


Presentation Transcript


  1. Inference from ecological models: air pollution and stroke using data from Sheffield, England. Ravi Maheswaran, Guangquan Li, Jane Law, Robert Haining, Marta Blangiardo, Sylvia Richardson, Nicky Best

  2. Outline: • Background to the Sheffield study and results presented at Geomed 2005. • From the Poisson to the Binomial model • Results • Conclusions

  3. 1. Nitrogen oxides (NOx) and stroke mortality in Sheffield, England (Geomed 2005). • Strokes account for 8%-12% of UK deaths • Some evidence of a link between air pollution and stroke: • studies of severe air pollution episodes (e.g. the 1952 London smog); • analysis of daily time series (e.g. Kan et al. (2003): Shanghai); • cohort studies (e.g. Nafstad et al. (2004): Norwegian males).

  4. Since the absolute number of deaths is small, the power of tests is limited even in large cohort studies, particularly for a factor that may not have a large effect. • Small area ecological studies may help: • - by providing another way of looking at the relationship; • - by allowing the analysis of very large populations at a much lower cost than a cohort study; • - because small areas are likely to be more homogeneous (than large areas) in terms of population characteristics, thus reducing the risk of ecological bias.

  5. Data. Stroke mortality data: • ICD9 codes 430-438; • 1994-8: c. 3k stroke deaths in a population of c. 200k aged over 45; • Aggregated by Enumeration District (c. 150 households), age (5-year cohorts from 45 to 85+) and sex; • Average of 2.89 deaths per ED (min expected: 0.1; max expected: 10.9)

  6. Population data: • (i) 1991 Census data on demography and deprivation (Townsend index); • Recorded at the Enumeration District level (n=1030) • (ii) Sheffield Health and Illness Prevalence survey (2000): • Random sample stratified by ward; • >10k respondents of whom >9.5k gave complete age, sex and smoking information. • Average of 2.43 smokers per ED (Min expected: 0.19; max expected: 19.24)

  7. Environmental data: Quantifying NOx exposure. The Indic-Airviro model:

  8. Average annual mean pollution levels 1994-9 (exc. 1998): NOx (µg/m³)

  9. Areal Interpolation (from grid to ED): point in polygon – weighted PostPoint

  10. NOx data transferred to the enumeration district framework after application of the weighted PostPoint method of areal interpolation
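The slides do not spell out the weighting used in the PostPoint method, so the following is only a minimal sketch under stated assumptions: each postcode point has already been assigned to an ED by a point-in-polygon test, carries the NOx value of the grid cell it falls in, and is weighted by a hypothetical household count. The function name, field layout and numbers are all illustrative.

```python
from collections import defaultdict

def interpolate_to_areas(points):
    """Weighted mean of point values within each area.

    points: iterable of (area_id, value, weight) tuples, where area_id is the
    area containing the point (point-in-polygon step assumed already done).
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for area_id, value, weight in points:
        weighted_sum[area_id] += value * weight
        weight_total[area_id] += weight
    # Areas whose points all have zero weight are skipped.
    return {a: weighted_sum[a] / weight_total[a]
            for a in weighted_sum if weight_total[a] > 0}

# Hypothetical postcode points: (ED, NOx at the point, households at the point)
postcode_points = [("ED1", 40.0, 10), ("ED1", 60.0, 30), ("ED2", 50.0, 5)]
ed_nox = interpolate_to_areas(postcode_points)
print(ed_nox)
```

Weighting by households pulls the ED value towards the NOx level where most residents live, rather than towards the geometric centre of the ED.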

  11. Poisson Model: $y_i$ = number of stroke deaths in area $i$. $y_i \sim \mathrm{Poisson}(\mu_i)$, $\mu_i = r_i E_i$. $r_i$ = underlying true area-$i$-specific relative risk. $E_i$ = expected number of deaths in area $i$ standardized for age, sex and socio-economic deprivation: $E_i = \sum_m \lambda_m n_{i,m}$. $\lambda_m$ = age-sex-deprivation specific mortality rate for population subgroup $m$. $n_{i,m}$ = size of population subgroup $m$ in area $i$.
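As an illustration of the standardization step, the expected count for each area is the subgroup mortality rates applied to that area's subgroup populations. The sketch below uses made-up rates and populations (three subgroups, two areas); the function name and all numbers are hypothetical.

```python
import numpy as np

def expected_counts(rates, populations):
    """Expected deaths per area: sum over subgroups m of rate_m * n_{i,m}.

    rates: shape (M,), subgroup-specific mortality rates.
    populations: shape (I, M), population of subgroup m in area i.
    """
    rates = np.asarray(rates, dtype=float)
    populations = np.asarray(populations, dtype=float)
    return populations @ rates

# Hypothetical inputs, not the Sheffield data
rates = np.array([0.001, 0.005, 0.020])
populations = np.array([[200, 100, 50],
                        [400, 50, 10]])
observed = np.array([3, 1])

expected = expected_counts(rates, populations)
crude_relative_risk = observed / expected  # observed/expected ratio per area
print(expected, crude_relative_risk)
```

With expected counts this small, the observed/expected ratio is very noisy, which is one motivation for the hierarchical smoothing used later in the talk.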

  12. Generalized linear model: $x_i$ = NOx level in area $i$. $z_i^{ave}$ = smoking prevalence ratio in area $i$ (spatial moving average using the observed and expected counts).

  13. Poisson regression controlling for age, sex, deprivation and smoking prevalence.

  14. Bayesian hierarchical spatial model: Fitted to allow for overdispersion due to: - small area population heterogeneity; - missing covariates (that may be spatially autocorrelated). To allow for the uncertainty associated with the smoking data (small counts; missing values), an errors-in-variables model was used for $z_i$.

  15. $e_i$ = unexplained area-specific log relative risk in area $i$ after adjusting for $x$ and $z^{est}$: $e_i = v_i + s_i$. $v_i$ = unstructured random effects (zero-mean normal prior). $s_i$ = spatially structured random effects (zero-mean intrinsic conditional autoregressive prior). $z_i^{est} = \log[\mathrm{smoke}.r_i] = \mathrm{smoke}.\alpha + \mathrm{smoke}.v_i + \mathrm{smoke}.s_i$

  16. Priors: - flat priors used for the intercept and regression coefficients; - gamma(0.5, 0.0005) used for the precision parameters of the random effect terms. Spatial fraction (SF): $\mathrm{SF} = \mathrm{Var}(s_i)/[\mathrm{Var}(s_i) + \mathrm{Var}(v_i)]$, the ratio of the estimate of the marginal variance of the spatial random effect to the sum of the estimated marginal variances of the spatial and the unstructured random effects. SF → 1 implies spatial heterogeneity dominates; SF → 0 implies unstructured heterogeneity dominates.
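One way the spatial fraction might be computed from MCMC output is sketched below. It assumes posterior draws of the two random effects are available as arrays with one row per draw and one column per area, and it takes the marginal variance to be the empirical variance across areas within each draw. The function name and array layout are assumptions, not details from the slides.

```python
import numpy as np

def spatial_fraction(s_draws, v_draws):
    """Per-draw spatial fraction Var(s) / (Var(s) + Var(v)).

    s_draws, v_draws: arrays of shape (n_draws, n_areas) holding posterior
    draws of the spatial and unstructured random effects.
    """
    s_var = np.var(np.asarray(s_draws, dtype=float), axis=1, ddof=1)
    v_var = np.var(np.asarray(v_draws, dtype=float), axis=1, ddof=1)
    return s_var / (s_var + v_var)

# Tiny hypothetical example: one draw, three areas
s_draws = np.array([[-1.0, 0.0, 1.0]])   # empirical variance 1
v_draws = np.array([[-2.0, 0.0, 2.0]])   # empirical variance 4
sf = spatial_fraction(s_draws, v_draws)
print(sf)
```

Summarising the resulting vector (for example by its posterior mean and a credible interval) gives an SF estimate with its uncertainty.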

  17. Poisson regression with spatial random effects, controlling for age, sex, deprivation and smoking prevalence

  18. Conclusions: • Evidence of an association between NOx and stroke mortality: • threshold level for an effect; • effect size diminishes after including random effects to allow for overdispersion and missing variables; • spatially smoothing NOx to allow for local journeys did not make a difference to the size of the effect; • Unable to allow for long and short term population movements. • No association with smoking prevalence (effect of definition?; small sample sizes in some EDs?)

  19. 2. Fitting a Binomial Model • Stroke is not contagious, so outcomes for individuals are independent Bernoulli random variables and, at the area level, they aggregate to Binomial random variables. • Because stroke is relatively rare, the Poisson assumption should give similar results, but it is only an approximation. • We also have data on the proportion exposed to different levels of NOx at the ED level, which were not previously used.
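The closeness of the Poisson approximation for a rare outcome can be checked numerically. The sketch below compares the two probability mass functions for a hypothetical area of 200 people with a 1.5% death probability (a mean of 3 deaths); these numbers are illustrative only.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, mu):
    """P(Y = k) for Y ~ Poisson(mu)."""
    return exp(-mu) * mu**k / factorial(k)

n, p = 200, 0.015  # hypothetical area size and individual death probability
# Compare over counts 0..30; both distributions have negligible mass beyond that.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, n * p))
              for k in range(31))
print(f"largest pointwise difference: {max_gap:.5f}")
```

The gap shrinks as p gets smaller for a fixed mean, which is why the Poisson model works well for rare events while remaining an approximation.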

  20. Ecological analysis. [Diagram legend: Unknown (but of interest); Observed (not previously used); Observed (and used in the previous analysis)]

  21. Within-ED population distribution by PostPoint.

  22. Dichotomised individual level model: $x_{i,j}$ is 0 (if individual $j$ in area $i$ is not exposed) or 1 (if individual $j$ in area $i$ is exposed). The model has separate stroke risks for the not-exposed group and the exposed group in area $i$. $z_i$ denotes other area level covariates (e.g. deprivation). $v_i \sim N(0, \sigma^2)$: an unstructured random effect to account for unmeasured covariates.

  23. Depending on the exposure status of the individual, the person is in either the not-exposed group or the exposed group. This can be extended to a categorical exposure variable with more than 2 levels. Various extensions of the model, such as incorporating continuous exposure, can be found in Jackson et al. (2006). Reference: Jackson, C. H., Best, N. G. and Richardson, S. (2006). Improving ecological inference using individual-level data. Statistics in Medicine 25(12): 2136-2159.

  24. An area-level model incorporating the distribution of within-area exposure, where $\phi_i$ = proportion of the population in area $i$ in the exposed category and $p_i$ = probability of stroke death in area $i$, regardless of exposure.
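For a binary exposure, the area-level probability can be written as a mixture of the two group-specific risks, which follows from the law of total probability. The symbols $N_i$ (population at risk), $p_{i0}$ and $p_{i1}$ (risks in the not-exposed and exposed groups) and $\phi_i$ (proportion exposed) are notation assumed here for illustration, not necessarily that of the slides.

```latex
y_i \sim \mathrm{Binomial}(N_i,\, p_i), \qquad
p_i = (1 - \phi_i)\, p_{i0} + \phi_i\, p_{i1}
```

The mixture is linear in the proportion exposed on the probability scale, not on the scale of a log or logit link.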

  25. Remark: Note that applying a Binomial model with the proportion of exposed individuals as a covariate is not, in general, equivalent to the area-level model derived from an individual level model; the former is subject to ecological bias.
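The gap between the two formulations can be seen numerically. The sketch below assumes a logit link for the individual-level model (the slides do not state the link) and uses hypothetical coefficients `a` and `b`; it compares the area risk obtained by averaging the two group risks with the risk from a model that simply uses the proportion exposed as a covariate.

```python
from math import exp

def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-t))

def aggregated_risk(phi, a, b):
    """Area risk derived from the individual-level model:
    a mixture of the not-exposed and exposed group risks."""
    return (1 - phi) * expit(a) + phi * expit(a + b)

def naive_risk(phi, a, b):
    """Area risk when the proportion exposed is used as a covariate."""
    return expit(a + b * phi)

a, b = -4.0, 1.0  # hypothetical baseline log-odds and exposure effect
for phi in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(phi, round(aggregated_risk(phi, a, b), 5), round(naive_risk(phi, a, b), 5))
```

The two agree when nobody or everybody is exposed and differ in between, so fitting the naive form to data generated by the individual-level model would give a biased estimate of the exposure effect.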

  26. 3. Results Binomial regression controlling for age, sex (18 strata), deprivation and incorporating the within area distribution of exposure.

  27. A dichotomised-exposure Binomial regression model controlling for age, sex (4 strata; 18 strata) and deprivation and incorporating data on the within area distribution of exposure. • The exposed category comprises NOx categories 4 and 5 in the previous slide; • The non-exposed category comprises categories 1, 2 and 3.

  28. 4. Conclusions: • Incorporation of information on within area exposure resulted in a reduction of the estimated relative risk compared to the earlier set of results. • Lower risks in categories 2 and 3 in the Binomial model with 5 exposure categories may indicate that some confounding effects have not been accounted for in the current model; in the absence of additional information, these effects could be “averaged out” by combining some exposure categories. • Fitting a reduced model with two exposure categories does indicate a significant effect in the exposed group after adjusting for age, sex and deprivation. • Increasing the number of age-sex cohorts from 4 to 18 in the dichotomous-exposure model reduced the estimated relative risk to 1.14 (95% CI: 1.00, 1.30), but there is still evidence of a significant effect.

  29. Differences between the current approach and the earlier modelling. • The Poisson model is prone to ecological bias since, for exposure, only aggregated information was used. • Here we attempt to reduce the bias by utilizing data on the within-area distribution of exposure, i.e., the proportion of people in the exposed and non-exposed groups. • Deprivation was absorbed into the expected number of cases in the earlier work; here it has been included as a covariate. We could adjust for deprivation in the baseline risks. • There was no adjustment for smoking prevalence since it was not significant in the earlier modelling. The possibility exists of using lung cancer mortality as a proxy for smoking instead.
