Spring 2008

Spring 2008 Bias, Confounding, and Effect Modification STAT 6395 Filardo and Ng

Bias Any systematic error in the design or conduct of a study that results in a mistaken estimate of the association between an exposure and a disease. Bias is often a major problem in observational epidemiologic studies.

Systematic error (bias) is different than random error • Example: an association between an exposure and a disease in which the true relative risk is 2.0

Systematic error (bias) is different than random error • If the design and conduct of a study are unbiased, and there is no confounding, and we repeat the study an infinite number of times, the mean relative risk will be 2.0, with the individual relative risks from the different studies fluctuating around 2.0

Systematic error (bias) is different than random error • If the design or conduct of the study is biased, and we repeat the study an infinite number of times, the mean relative risk will differ from 2.0 (for example, it may be 1.2), with the individual relative risks from the different studies fluctuating around 1.2

Systematic error (bias) is different than random error • Due to random variation, an association that is far from the truth can be observed in an unbiased study, but it usually won’t be.

Systematic error (bias) is different than random error • Due to random variation, the true association can be observed in a biased study, but it usually won’t be
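The contrast between random and systematic error can be sketched with a small simulation. This is only an illustration under stated assumptions: the risks (0.20 and 0.10, giving a true relative risk of 2.0), the group size, the 300 repetitions standing in for "an infinite number of times", and the bias mechanism (only 60% of disease among the exposed is detected, which pulls the expected relative risk to 1.2) are all hypothetical choices, not taken from the slides.

```python
import random
from statistics import mean


def simulate_rr(rng, n_per_group, risk_exposed, risk_unexposed, detect_exposed=1.0):
    """Simulate one cohort study and return its estimated relative risk.

    detect_exposed < 1.0 models a hypothetical systematic error: only that
    fraction of disease among the exposed is recorded.
    """
    p_exposed = risk_exposed * detect_exposed
    cases_exposed = sum(1 for _ in range(n_per_group) if rng.random() < p_exposed)
    cases_unexposed = sum(1 for _ in range(n_per_group) if rng.random() < risk_unexposed)
    return (cases_exposed / n_per_group) / (cases_unexposed / n_per_group)


rng = random.Random(2008)  # fixed seed so the sketch is reproducible

# Unbiased design: estimates fluctuate around the true RR of 2.0.
unbiased = [simulate_rr(rng, 2000, 0.20, 0.10) for _ in range(300)]

# Biased design: estimates fluctuate around 0.20 * 0.6 / 0.10 = 1.2 instead.
biased = [simulate_rr(rng, 2000, 0.20, 0.10, detect_exposed=0.6) for _ in range(300)]

print("mean RR, unbiased design:", round(mean(unbiased), 2))
print("mean RR, biased design:  ", round(mean(biased), 2))
print("range, unbiased design:  ", round(min(unbiased), 2), "to", round(max(unbiased), 2))
```

Individual studies in either list scatter around their own mean; repeating the biased study more often narrows the scatter around 1.2 but never moves it toward 2.0.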

Systematic error (bias) is different than random error Statistical significance does not protect against bias

Two major categories of bias • Selection bias • Information bias

Selection bias Error that results from criteria or procedures used to select study subjects or from factors that influence study participation. With selection bias, the relation between exposure and disease differs between those who are selected for and participate in the study and all those who are theoretically eligible to participate.

Selection bias Selection bias can occur as a result of: • Incorrect selection criteria for study subjects • Differences in characteristics between eligible subjects who agree to participate and eligible subjects who do not participate

Information bias Error due to collection of incorrect information about study subjects. Due to this incorrect information, subjects are classified into incorrect exposure or disease categories.

Selection bias is a major issue in case-control studies • Source population: the population that gives rise to the cases

Selection bias is a major issue in case-control studies • Cases should be selected such that the distribution of the exposures of interest among the cases selected for the study is the same as it is among all cases that arise in the source population. The cases should be representative of all cases that arise in the source population with respect to the exposures of interest.

Selection bias in case-control studies (cont.) • Controls should be selected such that the distribution of the exposures of interest among the controls is the same as it is in the source population. The controls should be representative of the source population with respect to the exposures of interest.

Selection bias in case-control studies (cont.) • Selection bias occurs when either: • The cases are not representative of all cases that arise in the source population with respect to the exposures of interest and/or • The controls are not representative of the source population with respect to the exposures of interest.

Selection bias in case-control studies: how it works • In the hypothetical data depicted in the following tables, we will assume there is: • no information bias, • no confounding, and • no random variability, so that all differences are due to differences in the selection of cases or controls

Hypothetical case-control study including all cases and all non-cases from Source Population A Gold standard OR = 4.5

Hypothetical case-control study including a 70% unbiased sample of the cases and a 0.5% unbiased sample of the controls from Source Population A. Unbiased OR = (350 × 4,500)/(500 × 700) = 4.5. This is an unbiased odds ratio because the selection of cases and controls was unrelated to exposure.

Selection bias in choosing controls in a hypothetical case-control study including a 70% sample of the cases and a 0.5% sample of the controls from Source Population A. Biased OR = (350 × 4,050)/(950 × 700) = 2.13. Selection of controls was related to exposure: over-selecting exposed controls biases the OR downward.
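The arithmetic behind these odds ratios can be checked with a short sketch. The source-population counts below are an assumption: the tables are not reproduced in this transcript, so the counts are back-calculated from the stated 70% and 0.5% sampling fractions and the cell counts that appear in the odds-ratio formulas. The function and variable names are illustrative only.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


# Assumed Source Population A, back-calculated from the sampling fractions
# (350 / 0.70 = 500, 700 / 0.70 = 1,000, 500 / 0.005 = 100,000, 4,500 / 0.005 = 900,000).
SOURCE = {
    "exposed_cases": 500,
    "unexposed_cases": 1_000,
    "exposed_noncases": 100_000,
    "unexposed_noncases": 900_000,
}


def sampled_or(source, f_exposed_cases, f_unexposed_cases, f_exposed_controls, f_unexposed_controls):
    """Odds ratio after sampling each cell of the source population at its own fraction."""
    return odds_ratio(
        round(source["exposed_cases"] * f_exposed_cases),
        round(source["unexposed_cases"] * f_unexposed_cases),
        round(source["exposed_noncases"] * f_exposed_controls),
        round(source["unexposed_noncases"] * f_unexposed_controls),
    )


# All cases and all non-cases: the gold standard.
gold_or = sampled_or(SOURCE, 1.0, 1.0, 1.0, 1.0)

# Selection unrelated to exposure: 70% of cases, 0.5% of non-cases.
unbiased_or = sampled_or(SOURCE, 0.70, 0.70, 0.005, 0.005)

# Exposed controls over-selected (950 instead of 500), unexposed under-selected (4,050 instead of 4,500).
control_biased_or = sampled_or(SOURCE, 0.70, 0.70, 0.0095, 0.0045)

print(gold_or, unbiased_or, round(control_biased_or, 2))
```

Sampling fractions that are equal within the cases and equal within the controls cancel out of the cross-product, which is why the second odds ratio matches the first.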

Selection bias in choosing controls in a case-control study due to incorrect criteria for control selection Example: A hospital-based case-control study of the relation of smoking to a given disease.

Selection bias in choosing controls in a case-control study due to incorrect criteria for control selection If the control group includes persons hospitalized for smoking-related diseases (e.g., cardiovascular disease), the control group would likely have a higher proportion of smokers than the source population, and the resultant odds ratio would be biased downward

Selection bias in choosing controls in a case-control study due to a difference in participation rates between exposed controls and nonexposed controls • Example: Case-control study of the relation between housing characteristics and lead poisoning among children 6 years of age or younger who are screened for blood lead levels at the Hill Health Center in New Haven

Selection bias in choosing controls in a case-control study due to a difference in participation rates between exposed controls and nonexposed controls • Cases: all children with a blood lead level of >10 micrograms/dL • Controls: a systematic sample of children with a blood lead level of <10 micrograms/dL

Housing characteristics and lead poisoning (cont.) • Incentive for participation: the parents of the children were offered a free lead inspection of their homes • Participation rate among cases: 91% (parents were motivated by their child’s elevated blood lead level to have the inspection)

Housing characteristics and lead poisoning (cont.) • Participation rate among controls: 69% (parents did not have the same motivation to participate) The condition of the housing of the control parents who refused to participate was better than the condition of the housing of the control parents who did participate

Housing characteristics and lead poisoning (cont.) • The housing of the controls selected for the study was in poorer condition than the housing of the source population The odds ratio for the association between measures of dilapidated housing and childhood lead poisoning would be biased downward

Housing characteristics and lead poisoning (cont.) • Although the criteria for selecting controls were sound, the difference in participation rate between exposed controls and nonexposed controls resulted in a biased odds ratio
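A numerical sketch can make the direction of this bias concrete. Only the overall participation rates (91% of cases, 69% of controls) come from the example; everything else is hypothetical and chosen for illustration: the numbers of eligible families, the prevalence of dilapidated housing in each group, and the split of control participation by housing condition (83% versus 63%, which averages to 69% overall).

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


# Hypothetical eligible subjects ("exposed" = dilapidated housing).
eligible_cases = {"exposed": 600, "unexposed": 400}
eligible_controls = {"exposed": 300, "unexposed": 700}

# Cases: 91% participate regardless of housing condition (rate from the example).
case_participation = {"exposed": 0.91, "unexposed": 0.91}

# Controls: hypothetical split in which families in poorer housing are more
# willing to accept the free inspection; it averages to the 69% in the example.
control_participation = {"exposed": 0.83, "unexposed": 0.63}

participating_cases = {k: eligible_cases[k] * case_participation[k] for k in eligible_cases}
participating_controls = {k: eligible_controls[k] * control_participation[k] for k in eligible_controls}

overall_control_rate = sum(participating_controls.values()) / sum(eligible_controls.values())

true_or = odds_ratio(
    eligible_cases["exposed"], eligible_cases["unexposed"],
    eligible_controls["exposed"], eligible_controls["unexposed"],
)
observed_or = odds_ratio(
    participating_cases["exposed"], participating_cases["unexposed"],
    participating_controls["exposed"], participating_controls["unexposed"],
)

print("overall control participation:", round(overall_control_rate, 2))
print("OR among all eligible subjects:", round(true_or, 2))
print("OR among participants:", round(observed_or, 2))
```

Because the participating controls are more often exposed than the eligible controls, the odds ratio among participants falls below the odds ratio among all eligible subjects, even though the selection criteria themselves were sound.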

Selection bias in choosing cases in a hypothetical case-control study including a 70% sample of the cases and a 0.5% sample of the non-cases from Source Population A. Biased OR = (450 × 4,500)/(500 × 600) = 6.75. Selection of cases was related to exposure: over-selecting exposed cases biases the OR upward.

Selection bias in choosing cases in a case-control study • Example: Population-based case-control study of pancreatic cancer • Hypothesis: vitamin C protects against development of pancreatic cancer. Vitamin C intake assessed by food frequency questionnaire

Selection bias in choosing cases in a case-control study • Median interval between diagnosis and interview: 9 months • One-year case fatality rate of pancreatic cancer: 80% Many cases would die before being interviewed

Selection bias in choosing cases in a case-control study Suppose vitamin C intake improves survival from pancreatic cancer • Then vitamin C intake among cases selected for the study would be higher than vitamin C intake among all cases • Over-selection of exposed cases would bias the OR upward

Compensating Selection Bias To avoid biased odds ratios, investigators often attempt to equalize selection bias between cases and controls by selecting cases and controls undergoing the same selection processes

Compensating bias in choosing cases and controls in a hypothetical case-control study including a 70% sample of the cases and a 0.5% sample of the non-cases from Source Population A. Unbiased OR = (450 × 4,286)/(714 × 600) ≈ 4.5. Equal over-selection (1.5×, on the odds scale) of exposed cases and exposed controls leaves the OR unbiased.
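Why equal over-selection cancels can be seen from a short derivation. The notation is introduced here for illustration and is not from the slides: a, b, c, d are the source-population counts of exposed cases, exposed non-cases, unexposed cases, and unexposed non-cases, and s_a, s_b, s_c, s_d are the fractions of each cell that end up in the study.

```latex
\[
\mathrm{OR}_{\text{observed}}
  = \frac{(s_a a)(s_d d)}{(s_b b)(s_c c)}
  = \underbrace{\frac{a\,d}{b\,c}}_{\mathrm{OR}_{\text{true}}}
    \times
    \underbrace{\frac{s_a / s_c}{s_b / s_d}}_{\text{selection factor}}
\]

% Over-selecting exposed cases only: s_a / s_c = 1.5, s_b / s_d = 1
\[
\mathrm{OR}_{\text{observed}} = 4.5 \times \frac{1.5}{1} = 6.75
\]

% Over-selecting exposed cases and exposed controls equally:
% s_a / s_c = 1.5 and s_b / s_d \approx 1.5
\[
\mathrm{OR}_{\text{observed}} \approx 4.5 \times \frac{1.5}{1.5} = 4.5
\]
```

The odds ratio is distorted only when the relative over-selection of the exposed differs between cases and controls; when the two ratios match, the selection factor equals 1.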

Hypothetical case-control study including a 70% unbiased sample of the cases and a 0.5% unbiased sample of the controls from Source Population A. Unbiased OR = (350 × 4,500)/(500 × 700) = 4.5. This is the original table.

Cases and controls undergoing the same selection processes in a case-control study of breast cancer • Example: Cases and controls selected from among women attending a breast cancer screening program. These women are likely to have a high prevalence of known breast cancer risk factors (family history of breast cancer, history of benign breast disease, late age at first birth)

Cases and controls undergoing the same selection processes in a case-control study of breast cancer • Example: Cases and controls selected from among women attending a breast cancer screening program If cases from this population were compared to controls from the general population, an overestimate of the magnitude of some risk factors would probably occur

Cases and controls undergoing the same selection processes in a case-control study of breast cancer • Selecting both cases and controls from the screening program should make the bias the same in both groups, leading to unbiased odds ratios This is another way of saying that controls should be selected from the source population that gave rise to the cases

Minimizing selection bias in case-control studies • In the study design stage, carefully consider the criteria for selection of cases and controls, particularly with respect to ensuring internal validity

Minimizing selection bias in case-control studies • Choose study procedures aimed at maximizing the participation rate of the subjects selected for the study

Selection bias in cohort studies using internal comparison groups is unlikely • Selection bias would occur if participation were related to both exposure and the subsequent development of disease • Because study participants are selected before the development of disease, this is unlikely The exposed group and nonexposed comparison group were drawn from the same source population and went through the same selection process

Selection bias in cohort studies using internal comparison groups is unlikely • The nurses who participated in the Nurses’ Health Study most likely differed from the nurses who did not, but since the same selection process was used to select the exposed group and the nonexposed internal comparison group, the relative risk estimates should be unbiased.

Cohort studies using external comparison groups are prone to selection bias • Exposed cohort and nonexposed external comparison group are not selected from the same source population The exposed cohort may be selected such that it is at higher or lower risk for disease than the external comparison group for a reason other than the exposure of interest

Healthy worker effect • A selection bias in occupational cohort studies using a general population external comparison group Persons selected for employment are usually healthier than and have lower mortality rates than the general population, which includes the sick and disabled.

Healthy worker effect • A selection bias in occupational cohort studies using a general population external comparison group The healthy worker effect makes any excess disease or mortality associated with an occupational exposure more difficult to detect than it would have been if a valid comparison group had been used, biasing the estimates of relative risk downward
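The downward bias can be written as a simple factorization. The symbols are introduced here for illustration: I_1 is the disease (or mortality) rate in the exposed workers, I_0 the rate in a valid comparison group of unexposed workers, and I_G the rate in the general population. The numbers in the example line are hypothetical.

```latex
\[
\mathrm{RR}_{\text{observed}}
  = \frac{I_1}{I_G}
  = \frac{I_1}{I_0} \times \frac{I_0}{I_G}
  = \mathrm{RR}_{\text{true}} \times \frac{I_0}{I_G}
\]

% Hypothetical example: a true RR of 1.5, with unexposed workers having
% 80% of the general-population rate
\[
\mathrm{RR}_{\text{observed}} = 1.5 \times 0.8 = 1.2
\]
```

Because employed persons usually have lower rates than the general population, the factor I_0/I_G is typically below 1, so the observed relative risk understates the true one.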

Losses to follow-up in cohort studies are analogous to selection bias in case-control studies • When a subject in a cohort study is lost to follow-up, we do not know whether that subject developed the disease of interest during the remainder of the study’s follow-up period

Losses to follow-up in cohort studies are analogous to selection bias in case-control studies • If the subjects lost to follow-up have a different incidence of the disease of interest than the subjects not lost to follow-up, the estimates of the incidence rate of the disease of interest in the cohort will be biased

Losses to follow-up in cohort studies are analogous to selection bias in case-control studies • However, relative risk estimates will be unbiased if the bias in the incidence rate estimates is the same in the exposed and nonexposed groups. A biased relative risk estimate will occur only if losses to follow-up are related to both disease and exposure

Losses to follow-up in cohort studies are analogous to selection bias in case-control studies • The best defense against bias due to losses to follow-up is to make intense efforts to locate each cohort member, and thus minimize losses
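The effect of different loss patterns on the relative risk can be sketched with a hypothetical cohort. All counts and retention fractions below are invented for illustration, and the sketch uses cumulative incidence among subjects who remain under follow-up. Under these assumptions, losses related only to exposure leave the relative risk unchanged, losses related only to disease leave it close to the true value (the agreement is approximate and is better when the disease is rare), and losses related to both distort it.

```python
def observed_rr(cohort, retained):
    """Relative risk computed only from subjects who were not lost to follow-up.

    cohort[(exposure, outcome)] is the true count in each cell;
    retained[(exposure, outcome)] is the fraction of that cell still followed.
    """
    def risk(exposure):
        cases = cohort[(exposure, "diseased")] * retained[(exposure, "diseased")]
        noncases = cohort[(exposure, "healthy")] * retained[(exposure, "healthy")]
        return cases / (cases + noncases)

    return risk("exposed") / risk("unexposed")


# Hypothetical cohort: risk 4% in the exposed, 1% in the unexposed, true RR = 4.0.
cohort = {
    ("exposed", "diseased"): 400, ("exposed", "healthy"): 9_600,
    ("unexposed", "diseased"): 100, ("unexposed", "healthy"): 9_900,
}


def retention(exposed_diseased, exposed_healthy, unexposed_diseased, unexposed_healthy):
    """Build a retention table from four fractions (helper for readability)."""
    return {
        ("exposed", "diseased"): exposed_diseased, ("exposed", "healthy"): exposed_healthy,
        ("unexposed", "diseased"): unexposed_diseased, ("unexposed", "healthy"): unexposed_healthy,
    }


rr_full = observed_rr(cohort, retention(1.0, 1.0, 1.0, 1.0))
rr_exposure_only = observed_rr(cohort, retention(0.7, 0.7, 0.9, 0.9))  # losses depend on exposure only
rr_disease_only = observed_rr(cohort, retention(0.6, 0.9, 0.6, 0.9))   # losses depend on disease only
rr_both = observed_rr(cohort, retention(0.5, 0.9, 0.9, 0.9))           # exposed, diseased subjects lost most

for label, value in [
    ("complete follow-up", rr_full),
    ("losses related to exposure only", rr_exposure_only),
    ("losses related to disease only", rr_disease_only),
    ("losses related to both", rr_both),
]:
    print(f"{label}: RR = {value:.2f}")
```

In the last scenario the subjects most likely to be lost are exposed subjects who develop the disease, so the observed relative risk falls well below the true value.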

Hypothetical cohort study with 100% follow-up (to keep the examples simple, we will not use the person-years method, but will use 10-year cumulative incidence) Gold standard RR = 49.75/11.10 = 4.48
