
EPI 5240: Introduction to Epidemiology Case-control Studies November 30, 2009

EPI 5240: Introduction to Epidemiology Case-control Studies November 30, 2009. Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa. Case-control studies. More correctly called: case-referent studies.



  1. EPI 5240: Introduction to Epidemiology. Case-control Studies. November 30, 2009. Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa

  2. Case-control studies • More correctly called: case-referent studies. • Compare a group of cases to a referent group which reflects the exposure experience of the underlying population which gave rise to the cases. • Cases • Prevalent • Incident • Controls • Prevalent • Incident • Density sampled

  3. Case-control (1) • Key feature is that subjects in the ‘case’ group are selected after they have developed the outcome of interest. • Interviews are done after the fact • Limits the potential for some measures (e.g. biomarkers, psychological state) • Subject to biases • Need a comparison group (control group or reference group) • Choosing a suitable group is a major challenge.

  4. Case-control (1A) • Can be done prospectively or retrospectively • Prospective: Start recruiting cases on day study starts as new cases of the outcome are diagnosed • Slows down the study but yields better data • Retrospective: choose a date prior to the start of the study and recruit newly diagnosed cases • Faster but limits interview options • Deaths • Do not select prevalent cases (those alive on a target date) • Strong potential for bias.

  5. Case-control studies (2) • Some ‘names’ or labels • Case-control • Case-cohort • Case-base • Case-only • Case-crossover

  6. Case-control studies (3) • Situations where case-control designs are used • Exposure data are difficult or expensive to obtain • Nested case-control • Case-cohort • Disease is rare • Disease has long induction and latent period • Little is known about the disease • Underlying population is dynamic

  7. Advantages of Case-Control Studies • Relatively quick and cheap (not always; depends on the design used) • Appropriate for studying rare outcomes. • Require a smaller number of subjects than cohort study (assuming you can find enough cases) • Allow study of multiple potential exposure factors in the same study

  8. Disadvantages of Case-Control Studies • Cannot determine incidence directly (except in special circumstances). • Not appropriate for studying rare exposures. • Higher risk of biases in exposure estimation, etc. • Selection of appropriate comparison group can be hard. • They have a bad reputation • Complex design and methodological features

  9. Case-control (4) • So far, we’ve discussed the traditional view of case-control studies • Select a group of people with the outcome (cases) • Select a group of people to whom to compare the cases (controls) • Compare exposure profiles in the two groups. • No clear rule to logically link the two groups • Originally, people thought this didn’t matter • Sometimes thought of as a backwards cohort study • TROHOC • Logic is ‘backwards’ • From effect → cause

  10. Case-control (5) • The ‘Modern view’ • An alternate view of doing a cohort study which • Studies only a sample of the members of the cohort who do not get the outcome • Provides a logical link between the two study groups • Cases and referents • Is more efficient

  11. Case-control (6) • Suppose we wanted to do a cohort study to find out if DDE exposure increases the risk of breast cancer in women. • Recruit 100,000 women without breast cancer and follow them for 20 years. • Collect a blood sample at baseline to determine DDE exposure level. • Analyze the blood samples of all 100,000 women to generate this table of results (count data)

  12. Case-control (7)
      DDE      BC+      BC-       Total
      High     500      14,500    15,000
      Low      1,500    83,500    85,000
      Total    2,000    98,000    100,000
      CIR = (500/15,000) / (1,500/85,000) = 1.88
      OR = 1.92 (1.73 - 2.13)
      COST: $500/DDE sample.
      TOTAL cost = $500 * 100,000 = $50,000,000
      BC+ cost   = $500 * 2,000   = $1,000,000
      BC- cost   = $500 * 98,000  = $49,000,000
      CAN WE DO THIS CHEAPER?
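
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch using the counts and the $500 assay cost given on the slide; the variable names are illustrative. Note that the unrounded CIR is about 1.889, which the slide reports as 1.88.

```python
# Counts from the slide's 2x2 table (rows: DDE high/low; columns: BC+/BC-).
a, b = 500, 14_500      # high DDE: cases, non-cases
c, d = 1_500, 83_500    # low DDE: cases, non-cases
COST_PER_SAMPLE = 500   # dollars per DDE assay, as stated on the slide

n_total = a + b + c + d

# Cumulative incidence ratio: risk in the high group over risk in the low group.
cir = (a / (a + b)) / (c / (c + d))

# Odds ratio: cross-product of the 2x2 table.
odds_ratio = (a * d) / (b * c)

# Cost of assaying everyone, and the share of subjects who never became cases.
total_cost = COST_PER_SAMPLE * n_total
noncase_share = (b + d) / n_total

print(f"CIR = {cir:.3f}, OR = {odds_ratio:.2f}")
print(f"Total cost = ${total_cost:,}; non-cases are {noncase_share:.0%} of subjects")
```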

  13. Case-control (8) • 98% of cost is going to study women who didn’t get breast cancer. Do we really need 98,000 of them? • Suppose we could reduce the number of BC negative women to 4,000. • Then cost would be only $3,000,000 (rather than $50,000,000). • Can’t do this at the start of the cohort because we don’t know which women will develop breast cancer. • BUT, if we wait until the end to do our lab studies, we will know which women developed breast cancer. • Select a random sample of 4,000 BC negative women • Keep all 2000 BC positive women. • Now, only need to do lab tests of 6,000 women.

  14. Case-control (9)
      Select 4,000 of the 98,000 BC negative women and generate this table:
      DDE      BC+      BC-      Total
      High     500      592      (1,092)
      Low      1,500    3,408    (4,908)
      Total    2,000    4,000    6,000
      Sampled study: OR = 1.92 (1.68 - 2.19), cost = $3,000,000
      Full study:    OR = 1.92 (1.73 - 2.13), cost = $50,000,000
      Can’t compute the CIR from this study (1.50 ≠ 1.88)
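
The sampled table can be reproduced as a rough sketch. The expected control counts come from applying the sampling fraction 4,000/98,000 to the full cohort's non-cases. The confidence interval uses Woolf's log-based method, which is an assumption about how the slide's intervals were obtained (it does reproduce them to two decimals). The helper name `odds_ratio_ci` is illustrative.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio for a 2x2 table with a Woolf (log-based) confidence interval."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

# Full-cohort counts from the earlier slide.
cases_high, cases_low = 500, 1_500
noncases_high, noncases_low = 14_500, 83_500
n_controls = 4_000

# Expected controls in each exposure group under simple random sampling.
frac = n_controls / (noncases_high + noncases_low)
ctrl_high = round(noncases_high * frac)
ctrl_low = n_controls - ctrl_high

or_s, lo_s, hi_s = odds_ratio_ci(cases_high, ctrl_high, cases_low, ctrl_low)

# Treating the sampled table as if it were a cohort gives a biased "CIR",
# because non-cases were sampled at 4,000/98,000 while all cases were kept.
pseudo_cir = (cases_high / (cases_high + ctrl_high)) / (
    cases_low / (cases_low + ctrl_low)
)

cost = 500 * (cases_high + cases_low + n_controls)
print(f"OR = {or_s:.2f} ({lo_s:.2f} - {hi_s:.2f}), pseudo-CIR = {pseudo_cir:.2f}, cost = ${cost:,}")
```

The odds ratio survives the sampling because both exposure groups of non-cases are thinned by the same fraction, which cancels in the cross-product; the wider interval is the price of using fewer controls.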

  15. Case-control (10) • This is a NESTED CASE-CONTROL study • The basis for the modern framing of the case-control study. • Comparison group is selected from the people who belong to the cohort • They were candidates to be ‘cases’ if they had got the outcome. • Controls were selected from people who remained disease-free throughout follow-up • Prevalent controls • Not the best but close to the traditional case-control approach. • How else could we get a comparison group? • Select a random sample of the entire cohort! • Yes, some people might be included twice in study.

  16. Case-control (11)
      Select 4,000 of the 100,000 cohort members to generate this table:
      DDE      BC+      BC-      Total
      High     500      600      (1,100)
      Low      1,500    3,400    (4,900)
      Total    2,000    4,000    6,000
      Sampled study: OR = 1.88, cost = $3,000,000
      Full study:    CIR = 1.88, cost = $50,000,000
      CASE-COHORT study design

  17. Case-control (12) • CASE-COHORT study (or case-base) • Select the reference group from all people in the cohort • If someone is selected for the reference group and then gets the outcome, they remain in the study twice • Once as case • Once as referent • The OR from a case-cohort study is algebraically identical to the CIR from the underlying cohort study
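
The algebraic identity can be sketched as follows. The notation is introduced here for illustration and is not from the slides: $a$ and $c$ are the cases in the high and low exposure groups, $N_1$ and $N_0$ are the sizes of those groups at baseline, and $f$ is the fraction of the whole cohort sampled as referents. The equality holds for the expected referent counts $fN_1$ and $fN_0$; an actual sample will vary around them.

```latex
\begin{align*}
\mathrm{CIR} &= \frac{a / N_1}{c / N_0}, \\[4pt]
\mathrm{OR}_{\text{case-cohort}}
  &= \frac{a \cdot (f N_0)}{c \cdot (f N_1)}
   = \frac{a / N_1}{c / N_0}
   = \mathrm{CIR}.
\end{align*}
```

With the numbers from the DDE example ($a = 500$, $c = 1{,}500$, $N_1 = 15{,}000$, $N_0 = 85{,}000$, $f = 0.04$), the referent counts are 600 and 3,400, and both sides equal $(500 \times 3{,}400)/(1{,}500 \times 600) \approx 1.89$, which the slides report as 1.88.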

  18. Case-control (13) • Third method of selecting referent group • During the 20 years of follow-up, every time a case occurs, select a referent group member • Candidates for this selection are all people who are in the cohort and are outcome-free and still under follow-up • The ‘RISK SET’ • This is density sampling. • The OR from this design is identical to the Rate Ratio from the underlying cohort study.
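
A small numerical sketch of why density sampling recovers the rate ratio. The slides do not give person-time, so the figures below rest on a loud assumption: non-cases contribute the full 20 years and cases contribute 10 years on average. Under density sampling, the expected exposure odds among referents equal the ratio of exposed to unexposed person-time, and that is what the code uses in place of an actual random draw.

```python
# Counts from the DDE cohort example in the slides.
cases_high, cases_low = 500, 1_500
noncases_high, noncases_low = 14_500, 83_500
YEARS = 20

# ASSUMPTION (not in the slides): non-cases are followed for all 20 years,
# and cases are diagnosed halfway through on average (10 years each).
pt_high = noncases_high * YEARS + cases_high * YEARS / 2
pt_low = noncases_low * YEARS + cases_low * YEARS / 2

# Rate ratio from the (hypothetical) full cohort person-time.
rate_ratio = (cases_high / pt_high) / (cases_low / pt_low)

# One referent per case, as described on the slide. Expected referent counts
# are proportional to the person-time contributed by each exposure group.
n_referents = cases_high + cases_low
ref_high = n_referents * pt_high / (pt_high + pt_low)
ref_low = n_referents - ref_high

odds_ratio = (cases_high * ref_low) / (cases_low * ref_high)
print(f"Rate ratio = {rate_ratio:.3f}, density-sampled OR = {odds_ratio:.3f}")
```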

  19. BAD CHOICE

  20. Case-control (14) • Modern case-referent study • Linked to an underlying cohort study • May exist (primary base) • May be conceptual (secondary base) • Comparison group is selected to represent the exposure experience of the underlying cohort. • Provides a basis to decide if the referent group is any good • Can select controls from people free of the outcome at the • Start • End • Through-out study

  21. Key Design Points • Selecting the cases • Selecting the controls • Determining exposure status • Sample size and power.

  22. Study Base (1) • The set of persons or person-time in which disease subjects become cases. • The members of the source population • Primary base • Investigator defines the population experience of interest (e.g. the 1st example) • Closed vs. dynamic • Secondary base • Defined implicitly as the population which gave rise to the cases • All people who would have been diagnosed at the Ottawa hospital if they had got the disease under study

  23. Study Base (2) • Cases should be exclusively people in the base • All or a random sample • Controls (referents) estimate the exposure experience of the base • Primary base • Main challenge is complete case ascertainment • Secondary base • Main challenge is definition of study base and control selection

  24. Selecting the Cases • Incident vs. prevalent cases • Incident cases are preferred • Can be hard to establish ‘point of onset’. • Chronic disease • Sub-clinical phase • Latency periods

  25. Defining a Case (1) • Existing entity • Severity (mild vs. severe) • Disease heterogeneity • Criteria to establish diagnosis (e.g. Rheumatoid Arthritis)

  26. Defining a Case (2) • Existing entity • Severity (mild vs. severe) • Disease heterogeneity • Criteria to establish diagnosis (e.g. Rheumatoid Arthritis) • Incubation period • Subjective vs. objective criteria

  27. Defining a Case (3) • New disease • No clear guidelines • Depends on clinical insights and formation of homogeneous groups • AIDS/HIV initial case definition limited to homosexual men • efficient design to find cases • limited etiological focus to lifestyle issues vs. infection

  28. Identifying a Case (1) • Goal is to identify all cases meeting criteria. Ideally, population based (Primary base). Could be hospital/clinic/etc based (Secondary base) • All true cases should have equal probability of being chosen. • Text states that complete ascertainment from base is not needed • True, but only if you can define the base population so you can: • Select a random sample of cases from the base • Selecting a convenience sample is not OK in most cases, especially when the proportion of selected cases is low.

  29. Identifying a Case (1A) • Selection Biases • Berkson's bias • Neyman fallacy (prevalence-incidence bias) • Detection bias

  30. Identifying a Case (2) • Sources for Cases • Death Certificates • Registries • Hospital/clinic lists • Pathology records • Advertising

  31. Selection of Controls (1) ‘Without controls, there can be no case-control studies but with the wrong controls, there can only be regrettable case-control studies.’ Oleckno

  32. Selection of Controls (2) UNDERLYING REQUIREMENT • The control group should represent the exposure experience of the subjects (cohort) which gave rise to the case group. • Very hard to achieve this goal when using a secondary base approach.

  33. Selection of Controls (3) General Control Selection Methods • Survivor Sampling • Only subjects who are disease free at the end of the cohort are eligible. • Base sampling • All subjects at the start of the cohort are eligible • Risk set sampling • Controls are selected through-out follow-up/recruitment from those who are disease-free and under follow-up • A subject can be both a case and a control
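
The three sampling schemes differ only in who is eligible to be drawn as a control. The sketch below makes that concrete on a made-up five-person cohort; the data, field names, and function names are all hypothetical and chosen purely for illustration.

```python
END_OF_FOLLOW_UP = 20.0  # years; matches the 20-year cohort example

# Hypothetical toy cohort. event_time is when the outcome occurred
# (None = never); exit_time is when follow-up ended for that person.
cohort = [
    {"id": 1, "event_time": 5.0, "exit_time": 5.0},
    {"id": 2, "event_time": None, "exit_time": 20.0},
    {"id": 3, "event_time": 12.0, "exit_time": 12.0},
    {"id": 4, "event_time": None, "exit_time": 8.0},   # lost to follow-up
    {"id": 5, "event_time": None, "exit_time": 20.0},
]

def survivor_pool(people):
    """Survivor sampling: disease-free at the end of follow-up."""
    return [p["id"] for p in people
            if p["event_time"] is None and p["exit_time"] >= END_OF_FOLLOW_UP]

def base_pool(people):
    """Base sampling: everyone present at the start, future cases included."""
    return [p["id"] for p in people]

def risk_set(people, t):
    """Risk set sampling: outcome-free and still under follow-up at time t."""
    return [p["id"] for p in people
            if p["exit_time"] >= t
            and (p["event_time"] is None or p["event_time"] > t)]

print("Survivor pool:", survivor_pool(cohort))
print("Base pool:", base_pool(cohort))
print("Risk set when subject 1 becomes a case (t=5):", risk_set(cohort, 5.0))
print("Risk set when subject 3 becomes a case (t=12):", risk_set(cohort, 12.0))
```

Subject 3 appears in the risk set at year 5 and becomes a case at year 12, which is how one person can be both a control and a case under risk set sampling.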

  34. Selection of Controls (4) • Wacholder et al lists four key principles of control selection: • The study base principle • Deconfounding principle • Comparable accuracy principle • Efficiency

  35. Selection of Controls (5) Study Base Principle • Primary base: pre-defined group (population experience) which is to be studied. Cases are derived only from people in the 'experience' • major challenge is complete case ascertainment. Can be infeasible for mild outcomes (e.g. male infertility) • Can ascertain cases through clinics, etc. if they capture all cases in ‘cohort’ • Easier to select a valid control group • Commonly a population-based study

  36. Selection of Controls (6) Study Base Principle • Secondary base. Defined implicitly as the 'group of people who would have become study cases if they had acquired the outcome during the course of the study'. • Hard to define to avoid selection bias problems. • referral filters • Usually a hospital or clinic based study • Cases can come from a wide geographic area without complete coverage

  37. Selection of Controls (7) Selection of Controls from Study Base • Usually use simple random sample but can be more complex (e.g. 2 stage sample) • Controls need to be representative of the base population not of the general population • Exclusions applied to both cases and controls are fine. Those applied only to controls (or only to cases) produce bias. • BAD: • exclude controls with dementia (can’t get exposure info) • Keep cases with dementia (since you can get exposure info from the hospital chart). • External controls can be OK.
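
One way to keep exclusions symmetric is to route cases and controls through the same eligibility function before drawing the simple random sample. The sketch below assumes a roster of the study base is available; the roster, the eligibility rule (age 40 to 79, no dementia), and all names are hypothetical.

```python
import random

def is_eligible(person):
    """Hypothetical eligibility rule, applied identically to cases and controls."""
    return 40 <= person["age"] <= 79 and not person["dementia"]

def select_controls(base_roster, case_ids, n_controls, seed=0):
    """Simple random sample of eligible non-cases from the study base."""
    pool = [p for p in base_roster
            if is_eligible(p) and p["id"] not in case_ids]
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(pool, n_controls)

# Made-up roster of 200 people standing in for the study base.
roster = [
    {"id": i, "age": 30 + i % 60, "dementia": i % 10 == 0}
    for i in range(1, 201)
]
case_ids = {3, 17, 40, 88}

# The same rule removes ineligible cases (id 3 is too young, id 40 has dementia).
eligible_cases = [p for p in roster if p["id"] in case_ids and is_eligible(p)]
controls = select_controls(roster, case_ids, n_controls=8)

print("Eligible cases:", [p["id"] for p in eligible_cases])
print("Sampled controls:", sorted(p["id"] for p in controls))
```

Dropping the dementia check for cases only, while keeping it for controls, would reproduce the BAD pattern described on the slide.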

  38. Selection of Controls (8) Deconfounding Principle • Measured confounders can be controlled in the analysis. • Can select controls to control unmeasured confounders (e.g. neighbourhood controls or sibling controls). • Can impact on study efficiency.

  39. Selection of Controls (9) Comparable Accuracy Principle • Aim is to produce non-differential misclassification • Try and collect information from cases and controls in the same manner • Using clinic charts for cases and personal interview for controls would be a problem. • Dead cases. • Don't select dead controls • Use proxies but using proxies for controls doesn’t work • Unavailable cases. • Use proxies.

  40. Sources of Controls (1) • Population Controls • Main method used in study with primary base. • Roster based selection (very limited options in 2009) • Census • Property taxation rolls • Medical insurance files • Driver’s licence files • Random Digit Dialling • Main method used at present due to privacy restrictions • Neighbourhood controls

  41. Sources of Controls (2) • Population Controls (cont) • Advantages • Same study base as cases. • Easier to include exclusion criteria • Permits extrapolation to base to produce estimates of risk. • Disadvantages • Problematic if case ascertainment is incomplete. • Inconvenient • Recall bias • Motivation

  42. Sources of Controls (3) • Hospital/registry Controls • Commonly used with secondary base. • Apply all eligibility rules to both the cases and controls • Condition used to define control group MUST not be related to exposure • Don’t use COPD controls in a study of smoking and lung cancer • Often choose more than one condition

  43. Sources of Controls (4) • Hospital/registry Controls (cont) • Advantages • Useful if a large number of potential cases don’t get recruited (e.g. due to distance from study). • Comparable quality of information. • Convenience/cost • Disadvantages • Controls often have different catchment area from cases. • Berkson’s bias

  44. Sources of Controls (5) • Medical Practice Controls • May be a good match for secondary base referral patterns; BUT • Exposure profile may differ from true base due to selection effects of interventions by HCPs.
