
TWO-STAGE CASE-CONTROL STUDIES USING EXPOSURE ESTIMATES FROM A GEOGRAPHICAL INFORMATION SYSTEM

Jonas Björk¹ & Ulf Strömberg². ¹Competence Center for Clinical Research; ²Occupational and Environmental Medicine, Lund University Hospital.



  1. TWO-STAGE CASE-CONTROL STUDIES USING EXPOSURE ESTIMATES FROM A GEOGRAPHICAL INFORMATION SYSTEM. Jonas Björk¹ & Ulf Strömberg². ¹Competence Center for Clinical Research; ²Occupational and Environmental Medicine, Lund University Hospital

  2. OUTLINE OF TALK • Previous project: What have we done? (Jonas Björk) • Ongoing project: What shall we do? (Ulf Strömberg)

  3. Two-stage procedure for case-control studies • 1st stage: complete data obtained from registries: disease status, general characteristics, group affiliation (e.g. occupation or residential area) → group-level exposure XG • 2nd stage: individual exposure data for a subset of the 1st stage sample

  4. Exposure database group-level exposure • JEM = Job Exposure Matrix Occupational group  proportion exposed • GIS Residential group (area)  average concentration of an air pollutant

  5. JEM: proportion exposed • Most data typically lie in groups with low XG

  6. Linear Relation between Proportion Exposed and Relative Risk • No confounding between/within groups Example: RR (exposed vs. unexposed) = 2.0
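
The linearity on this slide can be checked with a small calculation. The sketch below assumes, as the slide does, no confounding between or within groups, so a group with exposed proportion XG is simply a mixture of unexposed subjects (relative risk 1) and exposed subjects (relative risk RR). The function name `group_relative_risk` is hypothetical and not part of the talk.

```python
def group_relative_risk(x_g, rr_exposed=2.0):
    """Relative risk of a group with exposed proportion x_g,
    compared with a completely unexposed group.

    Assumes no confounding between or within groups (as on the slide).
    """
    if not 0.0 <= x_g <= 1.0:
        raise ValueError("x_g must be a proportion between 0 and 1")
    # Mixture of unexposed subjects (weight 1 - x_g, RR = 1)
    # and exposed subjects (weight x_g, RR = rr_exposed).
    return (1.0 - x_g) * 1.0 + x_g * rr_exposed


if __name__ == "__main__":
    for x_g in (0.0, 0.25, 0.5, 1.0):
        print(x_g, group_relative_risk(x_g))
```

The mixture simplifies to 1 + (RR - 1) × XG, which is a straight line in XG running from 1 at XG = 0 to RR at XG = 1.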

  7. Linear OR model: OR(XG) = 1 + β XG, where XG = exposure proportion • OR for exposed vs. unexposed = OR(1) = 1 + β • Most data typically lie in groups with low XG • [Figure: OR(XG) plotted against XG from 0 to 1; the OR axis is marked at 1 and OR(1)]

  8. Confounding between groups • General confounders (e.g., gender and age) can normally be adjusted for • Assuming no confounding within groups and no effect modification in any stratum sk: OR(XG; s1, s2, ..., sk) = (1 + β XG) exp(Σ γk sk)

  9. Combining 1st and 2nd stage data • Assumption: 2nd stage data are missing at random, conditional on disease status and 1st stage group affiliation • For subjects with missing 2nd stage data: use 1st stage data to calculate the expected number of exposed/unexposed • Expectation-maximization (EM) algorithm

  10. EM algorithm (Wacholder & Weinberg 1994) 1. Select a starting value, e.g. OR = 1 2. E-step: among the non-participants, calculate the expected number of exposed/unexposed cases and controls in each group 3. M-step: maximize the likelihood for observed + expected cell frequencies using the chosen risk model for individual-level data (not necessarily linear) → new OR estimate 4. Repeat 2. and 3. until convergence
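
The four steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. It assumes a single dichotomous exposure with no confounders, so the M-step reduces to the cross-product odds ratio of the completed 2×2 table; a real analysis would fit the chosen risk model instead. The E-step uses the expected-count expressions from slide 11. The names `GroupData`, `e_step` and `em_odds_ratio` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GroupData:
    x_g: float            # group-level exposure proportion (1st stage)
    cases_exp: float      # 2nd stage cases observed as exposed
    cases_unexp: float    # 2nd stage cases observed as unexposed
    ctrls_exp: float      # 2nd stage controls observed as exposed
    ctrls_unexp: float    # 2nd stage controls observed as unexposed
    cases_missing: float  # m1: cases without 2nd stage data
    ctrls_missing: float  # m0: controls without 2nd stage data


def e_step(groups, or_hat):
    """Completed 2x2 table (observed + expected counts), pooled over groups."""
    a1 = b1 = a0 = b0 = 0.0
    for g in groups:
        # Expected exposure probability for a case with missing data.
        p_case = g.x_g * or_hat / (1.0 + (or_hat - 1.0) * g.x_g)
        a1 += g.cases_exp + g.cases_missing * p_case
        b1 += g.cases_unexp + g.cases_missing * (1.0 - p_case)
        # Controls with missing data are exposed with probability x_g.
        a0 += g.ctrls_exp + g.ctrls_missing * g.x_g
        b0 += g.ctrls_unexp + g.ctrls_missing * (1.0 - g.x_g)
    return a1, b1, a0, b0


def em_odds_ratio(groups, start=1.0, tol=1e-10, max_iter=1000):
    """Iterate E- and M-steps until the OR estimate stabilises."""
    or_hat = start
    for _ in range(max_iter):
        a1, b1, a0, b0 = e_step(groups, or_hat)
        # M-step (simplifying assumption): cross-product ratio of the
        # completed table; fails if a completed cell is zero.
        new_or = (a1 * b0) / (b1 * a0)
        if abs(new_or - or_hat) < tol:
            return new_or
        or_hat = new_or
    raise RuntimeError("EM did not converge")
```

For example, a single group with XG = 0.5, observed cases 75 exposed / 25 unexposed, observed controls 50 / 50, and 40 cases and 100 controls lacking 2nd stage data converges to an OR of 3, the value with which those counts are consistent.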

  11. E-step in our situation (Strömberg & Björk, submitted) • ÔR = current OR estimate • Complete the data in each group G: • m0 controls with missing 2nd stage data → expected number of exposed = m0 × XG • m1 cases with missing 2nd stage data → expected number of exposed = m1 × XG × ÔR / [1 + (ÔR − 1) × XG]
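
The expected count for cases follows from a short odds calculation. The derivation below is a sketch that assumes the controls in group G represent the source population, so that a control is exposed with probability XG, and that the current estimate ÔR applies within every group.

```latex
\begin{align*}
\text{odds}(\text{exposed} \mid \text{control}, G) &= \frac{X_G}{1 - X_G} \\
\text{odds}(\text{exposed} \mid \text{case}, G) &= \widehat{OR} \cdot \frac{X_G}{1 - X_G} \\
P(\text{exposed} \mid \text{case}, G)
  &= \frac{\widehat{OR}\, X_G}{(1 - X_G) + \widehat{OR}\, X_G}
   = \frac{\widehat{OR}\, X_G}{1 + (\widehat{OR} - 1)\, X_G}
\end{align*}
```

Multiplying this probability by m1 gives the expected number of exposed among the cases with missing 2nd stage data. With ÔR = 1 it reduces to m1 × XG, the same form as for controls.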

  12. Simulated case-control studies • 400 cases, 1200 controls in the 1st stage • 2nd stage participation: 75% of the cases, 25% of the controls • Selective participation of 2nd stage controls: Corr(Participation, XG) = 0, > 0, or < 0 • 1000 replications in each scenario • True OR = 3
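
One replication of such a scenario could be generated roughly as follows. This is a hedged sketch only: the slide does not give the number of groups, their XG values, or how selective participation was induced, so `group_exposures`, the uniform group distribution among controls, and the linear `slope` parameter are all illustrative assumptions, and `simulate_first_stage` is a hypothetical name.

```python
import random


def simulate_first_stage(n_cases=400, n_controls=1200, true_or=3.0,
                         p_case=0.75, p_control=0.25, slope=0.0,
                         group_exposures=(0.05, 0.1, 0.2, 0.4, 0.8),
                         seed=0):
    """Return one simulated data set as a list of dicts.

    'exposed' is None for subjects without 2nd stage data.
    slope > 0 (< 0) makes control participation increase (decrease)
    with x_g; slope = 0 gives non-selective participation.
    """
    rng = random.Random(seed)
    mean_x = sum(group_exposures) / len(group_exposures)
    records = []

    # Controls: groups drawn uniformly (assumption), exposure ~ Bernoulli(x_g).
    for _ in range(n_controls):
        x_g = rng.choice(group_exposures)
        exposed = rng.random() < x_g
        p_part = min(1.0, max(0.0, p_control + slope * (x_g - mean_x)))
        observed = rng.random() < p_part
        records.append({"case": False, "x_g": x_g,
                        "exposed": exposed if observed else None})

    # Cases: the group distribution is tilted by the group-level
    # OR, 1 + (OR - 1) * x_g, relative to the controls.
    case_weights = [1.0 + (true_or - 1.0) * x for x in group_exposures]
    for _ in range(n_cases):
        x_g = rng.choices(group_exposures, weights=case_weights)[0]
        p_exposed = x_g * true_or / (1.0 + (true_or - 1.0) * x_g)
        exposed = rng.random() < p_exposed
        observed = rng.random() < p_case
        records.append({"case": True, "x_g": x_g,
                        "exposed": exposed if observed else None})
    return records
```

Repeating the call with 1000 different seeds, and with `slope` set to zero, a positive value, or a negative value, would mimic the three participation scenarios listed on the slide.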

  13. Simulations - Results SD = Empirical standard deviation of the ln(OR) estimates Coverage = Coverage of 95% confidence intervals
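
The two summary measures defined on this slide could be computed from the replications along these lines. The sketch assumes Wald-type 95% intervals on the ln(OR) scale (estimate ± 1.96 × standard error), which the slide does not state; the function name `summarize_replications` is hypothetical.

```python
import math
import statistics


def summarize_replications(ln_or_estimates, standard_errors,
                           true_or=3.0, z=1.96):
    """Empirical SD of the ln(OR) estimates and coverage of the
    (assumed Wald-type) 95% confidence intervals."""
    true_ln_or = math.log(true_or)
    covered = sum(
        1 for est, se in zip(ln_or_estimates, standard_errors)
        if est - z * se <= true_ln_or <= est + z * se
    )
    return {
        "mean_ln_or": statistics.mean(ln_or_estimates),
        "sd": statistics.stdev(ln_or_estimates),  # empirical SD
        "coverage": covered / len(ln_or_estimates),
    }
```

A coverage well below 0.95 signals bias or underestimated standard errors.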

  14. Simulations - Conclusions • Combining 1st and 2nd stage data using the EM method can: 1. Improve precision 2. Remove bias from selective participation • Method is sensitive to errors in the (1st stage) external exposure data!

  15. Simulations – Conclusions II • EM method is sensitive to: 1. Violations of the MAR assumption (conditional on disease status and 1st stage group affiliation) 2. Errors in the (1st stage) external exposure data

  16. Ongoing methodological research project • Focus on exposure estimates from a GIS

  17. GIS data: NO2 (Scania)

  18. Two-stage exposure assessment procedure • 1st stage: XG represents mean exposure levels rather than proportion exposed (e.g. groups with XG = 4.8, XG = 10.1, XG = 20.1, ...) • 2nd stage: xi, the individual exposure within each group, is a continuous, rather than a dichotomous, exposure variable

  19. Assume a linear relation between xi and disease odds (cf. radon exposure and lung cancer [Weinberg et al., 1996]) • [Figure: odds plotted against xi] • For the "only 1st stage" subjects: no bias expected by using their XG values (Berkson errors), provided MAR in each group, independent of disease status • EM method? Exposure variation in each group?
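
The "no bias expected" remark can be made explicit with a short expectation argument. The sketch below introduces notation that is not on the slide (baseline odds α, slope β, error term e_i) and assumes the Berkson structure: an individual's exposure equals the group mean plus an error with mean zero within the group.

```latex
\begin{align*}
\text{odds}(x_i) &= \alpha\,(1 + \beta x_i), \qquad
x_i = X_G + e_i, \quad E[e_i \mid G] = 0 \\
E[\text{odds}(x_i) \mid G]
  &= \alpha\,\bigl(1 + \beta\, E[x_i \mid G]\bigr)
   = \alpha\,(1 + \beta X_G)
\end{align*}
```

Because the model is linear in x_i, averaging over the within-group exposure distribution leaves β unchanged, so substituting XG for the unobserved x_i does not distort the slope. The argument relies on linearity; it would not carry over exactly to a log-linear (exponential) odds model.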

  20. Two-stage exposure assessment procedure – related work • Multilevel studies with applications to a study of air pollution [Navidi et al., 1994]: pooling exposure effect estimates based on individual-level and group-level models, respectively

  21. Collecting data on confounders or effect modifiers at 2nd stage • 1st stage: XG = mean exposure levels (e.g. groups with XG = 4.8, XG = 10.1, XG = 20.1, ...) • 2nd stage: ci, collected for individuals within each group, is a covariate, e.g. smoking history

  22. Data on confounders or effect modifiers at 2nd stage – estimation of exposure effect • Confounder adjustment based on logistic regression: pseudo-likelihood approach [Cain & Breslow, 1988] • More general approach: EM method [Wacholder & Weinberg, 1994]

  23. Design stage ("stage 0") • 1st stage: How many geographical areas (groups: Group 1, Group 2, Group 3, ...)? How many subjects in each group? • 2nd stage: What fractions of the 1st stage cases and controls?

  24. Design stage – related work • Two-stage exposure assessment: power depends more strongly on the number of groups than on the number of subjects per group [Navidi et al., 1994]

  25. References I • Björk & Strömberg. Int J Epidemiol 2002;31:154-60. • Strömberg & Björk. “Incorporating group-level exposure information in case-control studies with missing data on dichotomous exposures”. Submitted.

  26. References II • Cain & Breslow. Am J Epidemiol 1988;128:1198-1206. • Navidi et al. Environ Health Perspect 1994;102(Suppl 8):25-32. • Wacholder & Weinberg. Biometrics 1994;50:350-7. • Weinberg et al. Epidemiology 1996;7:190-7.
