Bias can get by us
November 2, 2004
Epidemiology 511
W. A. Kukull
Bias
• Systematic error that leads to an incorrect estimate of an association
• Anticipate and eliminate or minimize it in the study design phase
• May be impossible to account for in analysis
• Usually introduced by the investigator (or subjects)
• Main categories: selection bias and information bias
Bias is a systematic error (diagram after Rothman, 2002)
• Random error decreases with study size; systematic error remains
[Figure: error (vertical axis) against study size (horizontal axis), with one curve for random error and one for systematic error]
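
The contrast in the diagram can be sketched numerically. This is a minimal illustration, not Rothman's figure: random error is represented by the standard error of a proportion, and the prevalence `p` and the fixed `bias` offset are hypothetical values chosen only for the example.

```python
import math

def total_error(n, p=0.30, bias=0.05):
    """Return (random_error, systematic_error) for a study of size n.

    random_error: standard error of an estimated proportion p (hypothetical).
    bias: a hypothetical fixed offset that does not depend on n.
    """
    random_error = math.sqrt(p * (1 - p) / n)
    return random_error, bias

# Random error shrinks as n grows; the systematic component does not move.
for n in (100, 1_000, 10_000, 100_000):
    rand, syst = total_error(n)
    print(f"n={n:>7}: random error={rand:.4f}, systematic error={syst:.4f}")
```

Enlarging the study narrows the first column only, which is why bias has to be handled in design rather than by sample size.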
Control of Bias
• Careful study design is primary
• Selection bias: permanent flaw
• Choice of study groups
• Data collection; data sources
  • objective, closed-ended questions
  • trained interviewers: reliability assessment
  • collect a wide variety of factors to “blind” interviewer and subject to the hypothesis
Selection Bias
• Selection of “cases” or “controls” leads to an apparent disease-exposure association
• Selection, or follow-up (f/u) and diagnosis (dx), of “exposed” or “unexposed” leads to an apparent disease-exposure association
• The “apparent” association is due to a systematic error in the design or conduct of the study
Selection bias
• Common element:
  • The association between exposure and disease is different for those who are studied than for those who would be eligible but are not studied
• Case-control: subject selection is influenced by the probability of exposure history
• Cohort: non-random loss to follow-up influences the association measure (RR)
[Figure: a “population” base (Framinghammer City) followed over time, showing the non-diseased; loss, death, and refusals before disease develops; study enrollees; and disease cases]
Selection Bias
[Figure: 2x2 table (Exp / Not Exp by Dis / No Dis) for the reference population, with the study sample drawn from it]
• Non-reference probabilities of being included in the study within exposure (or disease)
Example: selection bias (after Szklo & Nieto, 2000)
True reference population
            Disease   No disease
Exp             500         1800
Not Exp         500         7200
OR = 4.0
Unbiased sample re: exposure status
• Sample 50% of diseased and 10% of not diseased, keeping the true reference proportions of “exposed” in each
              D     Not D
Exp         250       180
Not Exp     250       720
OR = 4.0
Biased exposure probability sampling among “diseased” ONLY (60% exposed, not the true 50%) due to a flawed design or strategy
            Dis   Not Dis
Exp         300       180
Not Exp     200       720
OR = 6.0
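
The three odds ratios in this example can be checked with a few lines of arithmetic. This is a sketch under one stated assumption: the reference table is read as 500 exposed and 500 unexposed among the diseased, and 1800 exposed and 7200 unexposed among the non-diseased, which is the arrangement consistent with OR = 4.0 and with the 50% / 10% sampling fractions. The helper name `odds_ratio` is illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed, diseased      b = exposed, not diseased
    c = unexposed, diseased    d = unexposed, not diseased
    """
    return (a * d) / (b * c)

reference = (500, 1800, 500, 7200)  # assumed reading of the reference table
unbiased = (250, 180, 250, 720)     # 50% of diseased, 10% of non-diseased
biased = (300, 180, 200, 720)       # exposed cases over-sampled (60%, not 50%)

for name, table in (("reference", reference),
                    ("unbiased sample", unbiased),
                    ("biased sample", biased)):
    print(f"{name}: OR = {odds_ratio(*table):.1f}")
```

Sampling cases and controls at different fractions leaves the OR intact; only sampling that depends on exposure within a disease group shifts it.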
Basic example: Case-control study (after Hernan et al, 2004)
• Is prior HRT use associated with MI?
• Select women with incident MI as cases
• Controls are (unintentionally) selected from women with a high frequency of hip fracture
• HRT is known to decrease osteoporosis
• Is the HRT-MI association likely to be biased? Why/how?
Hospital-based case-control study: Berkson’s bias (after Schwartzbaum et al, 2003)
• Premise: diseases have different probabilities of hospital admission
  • Pr(brain injury) > Pr(allergic rhinitis)
  • Pr(>2 diseases) > Pr(1 disease)
• Diseases unassociated in the population could be associated in hospitalized patients
• Then, a risk factor for one disease could appear to be a risk factor for the other
Berkson’s bias / Admission bias (after Sackett, 1979)
Hospitalized in last 6 months
                     Bone disease
Resp. disease        Yes       No
Yes                    5       15
No                    18      219
OR = 4.06
General population
                     Bone disease
Resp. disease        Yes       No
Yes                   17      207
No                   184     2376
OR = 1.06
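
A short calculation shows where the hospital association comes from. This sketch assumes the hospitalized counts are a subset of the general-population counts, so that dividing one by the other gives an admission fraction per cell; that reading and the variable names are assumptions made for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table given as (a, b, c, d) = rows then columns."""
    return (a * d) / (b * c)

# Cell order: both diseases, respiratory only, bone only, neither
labels = ("both diseases", "respiratory only", "bone only", "neither")
hospitalized = (5, 15, 18, 219)
general_population = (17, 207, 184, 2376)

print(f"Hospitalized OR:       {odds_ratio(*hospitalized):.2f}")
print(f"General population OR: {odds_ratio(*general_population):.2f}")

# Assumed: hospitalized persons are drawn from the general-population table.
admission_fraction = {
    label: h / g
    for label, h, g in zip(labels, hospitalized, general_population)
}
for label, fraction in admission_fraction.items():
    print(f"{label:>17}: {fraction:.1%} admitted")
```

Under that assumption, people with both diseases are admitted far more often than any other group, and that unequal selection is enough to create the association among inpatients.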
Loss to follow-up: Selection bias in a cohort study
• Effects of anti-retroviral therapy history on AIDS risk in HIV+ patients
• Patients with more symptoms may drop out early
• Patients with more therapy side effects may drop out
• Restricting analysis to non-dropouts can produce a biased result
• Subject dropout is rarely “at random”
• Statistical missing-data strategies
Selection Biases
• Non-response/missing data bias: characteristics may differ between early, late, and non-responders
• Missing data proportions differ
• Analyses restricted to complete data will be biased
• Non-responders in case-control studies may have different exposure histories
Healthy Worker selection bias
• Do rubber industry workers have excess mortality compared with the U.S. population of the same age and sex?
• SMR = 82 for rubber workers
• The general population includes people who are unable to work because of illness
• All-cause death rates are usually higher in the general population than among workers
• Use unexposed workers as a comparison group
Contributors to selection bias
• Choice of comparison group or sampling frame
• Self-selection, volunteers
• Loss to follow-up (cohort)
• Initial non-response
  • primarily case-control studies
• Selective survival
• Differences in disease detection (surveillance or detection bias)
Examples
• Unmasking bias:
  • physicians followed OC users more closely because of use-related cautions and thus detected more thrombophlebitis
  • Frequent visits => more comorbidity
• Prevalent case and survival bias
  • Smoking and Alzheimer’s disease (AD)
  • Among AD cases, smokers may have shorter survival than non-smokers
Prevalent case bias
• Longer disease duration increases the chance of selection
[Figure: cross-sectional sample taken along a time axis]
Example: volunteer/self-selection
• Leukemia in troops present at atomic test site
• 76% of all troops were traced
  • of the 76%, 82% were tracked down by investigators
  • of the 76%, 18% contacted investigators on their own initiative
• 4 leukemia cases were among the 18% and 4 among the 82%: self-referral bias?
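
The arithmetic behind the question can be made explicit. The slide gives only shares of the traced troops, not head counts, so this sketch compares cases per unit share; the function name is illustrative.

```python
def relative_case_frequency(cases_a, share_a, cases_b, share_b):
    """Ratio of cases per unit of group share, group A versus group B.

    Because both groups come from the same traced total, the unknown
    total cancels and only the shares are needed.
    """
    return (cases_a / share_a) / (cases_b / share_b)

# Self-referred: 4 cases in 18% of traced troops.
# Investigator-traced: 4 cases in 82% of traced troops.
ratio = relative_case_frequency(4, 0.18, 4, 0.82)
print(f"Leukemia was about {ratio:.1f} times as frequent among self-referred troops")
```

An equal number of cases in a group less than a quarter the size suggests that having leukemia made troops more likely to come forward.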
Information Bias
• Inadequacies and inaccuracies in data collection or measurement
• Common to all subjects?
  • Will reduce the observed association
• Different in each comparison group?
  • may exaggerate the association
Information Bias
• Systematic errors in obtaining needed exposure (or diagnosis) information
• Non-differential misclassification, “random” error
  • usually biases toward the “null”
• Differential misclassification: different between the study groups
  • may cause error in the estimated effect in either direction
Example: True classification of family history for a hypothetical disease ‘X’
                       Disease X   No Disease
Positive Family Hx           240           80
No Family Hx                 160          320
Total                        400          400
OR = 6.0
Example: Non-differential misclassification
Family Hx accuracy: cases 65%; controls 65%
                Disease X   No X
Family Hx             156     52
No Family Hx          244    348
Total                 400    400
OR = 4.3
Example: Differential misclassification
Family Hx accuracy: cases 85%; controls 25%
                Disease X   No X
Family Hx             204     20
No Family Hx          196    380
Total                 400    400
OR = 19.8
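
The three family-history tables can be reproduced from the true counts. This sketch assumes that “accuracy” means the fraction of truly positive family histories that are reported as positive, with no false positives; that reading reproduces the slide counts. True counts are taken as 240 positive / 160 negative among cases and 80 positive / 320 negative among controls. Function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)

def apply_sensitivity(true_pos, true_neg, sensitivity):
    """Report only a fraction of true positives; the rest are counted as negative."""
    reported_pos = round(true_pos * sensitivity)
    reported_neg = true_neg + (true_pos - reported_pos)
    return reported_pos, reported_neg

def observed_or(sens_cases, sens_controls):
    """Observed OR when family history is under-reported in each group."""
    case_pos, case_neg = apply_sensitivity(240, 160, sens_cases)
    ctrl_pos, ctrl_neg = apply_sensitivity(80, 320, sens_controls)
    return odds_ratio(case_pos, ctrl_pos, case_neg, ctrl_neg)

print(f"True:             OR = {observed_or(1.00, 1.00):.1f}")
print(f"Non-differential: OR = {observed_or(0.65, 0.65):.1f}")
print(f"Differential:     OR = {observed_or(0.85, 0.25):.1f}")
```

Equal under-reporting pulls the estimate toward the null, while better reporting by cases than by controls pushes it away from the null.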
Cohort study: true classification of persons who hypothetically develop ER (after Koepsell & Weiss, Chapter 10)
What if only 90% of the true cases were identified due to diagnostic inaccuracy?
What if 1.0% of the well persons were misdiagnosed as having ER when they did not?
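
The two “what if” questions can be explored with a small calculation. The cohort counts below are hypothetical, invented only for illustration (the slide tables are not reproduced here), and the misdiagnosis is assumed to apply equally to exposed and unexposed groups.

```python
def observed_cases(n, true_cases, sensitivity=1.0, false_positive_rate=0.0):
    """Cases recorded when diagnosis is imperfect."""
    detected = true_cases * sensitivity
    false_positives = (n - true_cases) * false_positive_rate
    return detected + false_positives

def risk_ratio(n_exp, cases_exp, n_unexp, cases_unexp, **diagnosis):
    """Observed RR under the given diagnostic sensitivity / false-positive rate."""
    risk_exp = observed_cases(n_exp, cases_exp, **diagnosis) / n_exp
    risk_unexp = observed_cases(n_unexp, cases_unexp, **diagnosis) / n_unexp
    return risk_exp / risk_unexp

# Hypothetical cohort: 10,000 exposed with 200 true cases,
# 10,000 unexposed with 100 true cases (true RR = 2.0).
cohort = (10_000, 200, 10_000, 100)

rr_true = risk_ratio(*cohort)
rr_missed = risk_ratio(*cohort, sensitivity=0.90)
rr_false_pos = risk_ratio(*cohort, false_positive_rate=0.01)

print(f"True RR:                         {rr_true:.2f}")
print(f"90% of true cases identified:    {rr_missed:.2f}")
print(f"1% of well persons misdiagnosed: {rr_false_pos:.2f}")
```

In this sketch, missing 10% of cases in both groups leaves the RR unchanged, while a 1% false-positive rate among the much larger group of well persons pulls it toward 1.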
Information Bias
• Example: MI and smoking
  • smokers with new MI may be less likely to respond to a mailed questionnaire than non-smokers with new MI
  • if the non-response is related to exposure and disease, the potential for bias exists
• Proxy reports of exposure
  • Relationship and proximity influence agreement
Information Biases (after Sackett)
• Diagnostic suspicion bias: knowledge of the subject’s prior history influences the intensity of diagnostic effort
• Exposure suspicion bias: a disease with a “known” cause may increase the search for that cause
Information Biases (after Sackett)
• Recall bias: cases more (or less) likely to report than controls
• Family information bias: information from a family is stimulated by a new case in the family, and by their need to explain why
Exposure and disease, viewed through (after Maclure & Schneeweiss, 2001):
• Background random factors (chance)
• Correlated causes, confounding
• Diagnostic inaccuracy
• Exposure accuracy
• Missing data, database errors
• Group/hypothesis formation
• Case-control selection
• Cohort loss to f/u
• Analysis, modeling, interpretation
• Publication bias
• Editors and experts
Evaluation of Bias: What would the RR look like if ???
• What is the direction and likely effect if bias is active?
  • IS A TRUE ASSOCIATION MASKED?
  • IS A SPURIOUS ASSOCIATION REPORTED?
• Can the potential for recall bias be estimated?
  • second control group with another illness?
Is Selection Bias Present? (after Grimes and Schulz, Lancet 2002;359:248-52)
• In a cohort study, are participants in the exposed and unexposed groups similar in all respects except for exposure?
• In a case-control study, are cases and controls similar in important respects except for the disease in question?
Is Information Bias Present? (after Grimes and Schulz, Lancet 2002;359:248-52)
• In a cohort study, is information about outcome obtained in the same way for those exposed and unexposed?
• In a case-control study, is information about exposure gathered in the same way for cases and controls?
Is Confounding Present? (after Grimes and Schulz, Lancet 2002;359:248-52)
• Could the results be accounted for by the presence of another factor (e.g., age, smoking, sexual behavior, diet) associated with the exposure and outcome but not directly in the causal pathway?
• Confounding is the subject of another lecture…
If not bias or confounding, are results due to “chance”? (after Grimes and Schulz, Lancet 2002;359:248-52)
• What is the RR or OR and the 95% confidence interval? Does the CI include 1.0?
• Is the difference (association) statistically significant, and if not, did the study have adequate power to find a clinically important difference (association)?
• What is the p-value?
• Is the type I error inflated by multiple comparisons?
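
Checking whether a confidence interval includes 1.0 can be sketched as follows. The interval uses the common log-based (Woolf) approximation, which is one of several methods and not necessarily the one intended in the lecture; the cell counts are hypothetical.

```python
import math

def or_confidence_interval(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf / log method).

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical small study: 20 exposed cases, 10 exposed controls,
# 80 unexposed cases, 90 unexposed controls.
estimate, lower, upper = or_confidence_interval(20, 10, 80, 90)
includes_null = lower <= 1.0 <= upper
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}, includes 1.0: {includes_null}")
```

With these counts the point estimate is well above 1 but the interval still reaches the null, which is the situation where the power question in the second bullet matters.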
Bias and study designs: Important sources
• Case-control
  • Knowledge of disease status may influence determination of exposure status
  • Knowledge of exposure status may influence which subjects are selected
  • Recall bias
• Cohort
  • Loss to follow-up; differential misdiagnosis
  • Information bias
Epidemiologic Reasoning
• Use the tools, statistics, and calculations
• Use knowledge of biology, behavior, and disease pathogenesis
• Make educated guesses about the effect of bias and confounding to guide study design and analysis and eliminate untoward effects
• Try to make causal inferences
Conclusion
• What sources of bias are common to which study designs?
• How can we evaluate bias?
  • “Sensitivity analysis”: “What if….”
• Confounding may still impact results even if bias is eliminated, but it can be dealt with in analysis.
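
One simple form of “what if” sensitivity analysis divides an observed OR by a selection bias factor built from assumed selection probabilities. This is a sketch of a standard bias-analysis formula, not a method stated in the lecture. The starting values reuse the selection-bias example (300 of 500 exposed cases and 200 of 500 unexposed cases sampled; 10% of both control groups); the alternative scenarios are hypothetical.

```python
def selection_bias_factor(s_case_exp, s_case_unexp, s_ctrl_exp, s_ctrl_unexp):
    """Factor by which selection multiplies the true OR.

    Each argument is the probability that a person in that cell
    is included in the study.
    """
    return (s_case_exp * s_ctrl_unexp) / (s_case_unexp * s_ctrl_exp)

def corrected_or(observed_or, *selection_probs):
    """Observed OR divided by the selection bias factor."""
    return observed_or / selection_bias_factor(*selection_probs)

observed = 6.0  # OR from the biased sample in the selection-bias example

# What if exposed cases were selected with probability 0.4, 0.5, or 0.6,
# while unexposed cases (0.4) and both control groups (0.1) stay fixed?
for s_case_exp in (0.4, 0.5, 0.6):
    adjusted = corrected_or(observed, s_case_exp, 0.4, 0.1, 0.1)
    print(f"P(select exposed case) = {s_case_exp:.1f} -> corrected OR = {adjusted:.1f}")
```

Only the 0.6 scenario recovers the reference OR of 4.0; the spread across scenarios shows how strongly the conclusion depends on selection probabilities that usually have to be guessed.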