
The Myth of Small Numbers (and Other Sources of Bias)

Bob L. Larson, DVM, PhD, Kansas State University

Presentation Transcript


  1. The Myth of Small Numbers (and Other Sources of Bias)

    Bob L. Larson, DVM, PhD Kansas State University
  2. I. Normal Variation

  3. ½ below the mean ½ above the mean Mean ± standard deviation What does Normal variation mean to animal populations?
  4. What does Normal variation mean to animal populations? 68% within 1 SD of the mean Mean ± 1 standard deviation
  5. What does Normal variation mean to animal populations? 95% within 2 SD of the mean Mean ± 2 standard deviations
  6. What does Normal variation mean to animal populations? 99.7% within 3 SD of the mean Mean ± 3 standard deviations
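As a quick check on the 68 / 95 / 99.7% figures in the slides above, the share of a Normal population lying within k standard deviations of the mean can be computed from the error function. This is a minimal sketch; the function name `normal_coverage` is illustrative and not part of the presentation.

```python
import math


def normal_coverage(k: float) -> float:
    """Fraction of a Normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints the coverage for 1, 2 and 3 standard deviations as a percentage.
    print(f"within {k} SD of the mean: {normal_coverage(k):.1%}")
```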
  7. Tempting Way To Think! Goal: "Something is wrong! I must do something different." / "Everything is OK. I must be doing everything right." (time to raise fees)
  8. Product / Output Naïve View of Biology (Animal Population) People Methods Material Environment Equipment
  9. Real View of Biology (Animal Population ) People Methods Material Environment Equipment Product / Output
  10. Alternate Way To Think: 1. What is the mean? 2. What is the standard deviation? 3. Where does this data fall? (X)
  11. Law of Large Numbers: With increasing data points, the sample mean and distribution approach the true population mean and distribution. 1,500 data points randomly drawn from a larger population will give a mean and distribution that are basically equivalent to those of the entire population (used by pollsters).
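The Law of Large Numbers can be illustrated with a small simulation. The sketch below assumes a hypothetical Normal population (mean 50, SD 7, chosen only for illustration and not taken from the presentation's demonstration) and measures how much the sample mean wanders from one sample to the next at several sample sizes.

```python
import random
import statistics


def spread_of_sample_means(n, repeats=200, mu=50.0, sigma=7.0, seed=1):
    """Standard deviation of the sample mean across `repeats` samples of size n.

    mu and sigma describe a hypothetical population, assumed for illustration.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


for n in (5, 15, 50, 1500):
    # Larger samples should give sample means that cluster more tightly.
    print(f"n = {n:>4}: sample means vary with SD of about {spread_of_sample_means(n):.2f}")
```

With only 5 data points the sample mean can land several points away from the population mean; with 1,500 it barely moves.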
  12. Results of Demonstration: n = 50; Mean = 48.88%; SD = 6.82; Mean ± 2 SD = 35.2-62.5%
  13. Myth of Small Numbers Myth = a small data set tells me something about the population Reality = small data sets can be very misleading
  14. Results of Demonstration: n = 5; Mean = 44.00%. A small number of samples is not Normally distributed. A small number of samples gives no indication of Mean or Standard Deviation.
  15. Results of Demonstration: n = 15; Mean = 46.53%; SD = 6.91; Mean ± 2 SD = 32.7-60.4%
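The ranges quoted on the n = 50 and n = 15 slides can be reproduced from the reported means and SDs. A small sketch (the helper `interval` is illustrative, not from the presentation) suggests that the quoted ranges correspond to the mean plus or minus two standard deviations:

```python
def interval(mean, sd, k):
    """Return (mean - k*sd, mean + k*sd)."""
    return mean - k * sd, mean + k * sd


reported = {
    "n=50": (48.88, 6.82),  # mean and SD as reported on the slide
    "n=15": (46.53, 6.91),
}

for label, (mean, sd) in reported.items():
    for k in (1, 2):
        low, high = interval(mean, sd, k)
        print(f"{label}: mean +/- {k} SD = {low:.2f} to {high:.2f}")
```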
  16. Myth of Small Numbers: So how do I make a wise conclusion without data or with a small amount of data? You can’t… You are guessing. Luckily, we are seldom proven wrong. Unfortunately, guessing doesn’t differentiate you from other people providing the same service.
  17. Errors in Research Findings (and Reasoning)

  18. Errors in Research Findings. “Mistakes”: wrong study design, wrong analysis strategy, conclusions that do not follow from results, etc.; extremely common in medical research literature. Statistical (chance) errors: Type I and Type II errors. Bias: a systematic error that causes a conclusion to be incorrect. Myth of Small Numbers & Other Issues.
  19. Bias & Confounding (a form of bias) Issues of internal validity This is important ! So you don’t get fooled by wrong information from research studies due to bias or confounding
  20. Bias & Confounding (a form of bias): Issues of internal validity. Bias: systematic error (vs. random error) that results in mistaken conclusions regarding the relationship between the exposure (or explanatory factors) and the outcome. Random (non-systematic) errors are not bias; these errors are randomly distributed amongst groups/observations. Lack of bias → internal validity.
  21. Bias & Confounding (a form of bias) Issues of internal validity Bias Confounding The mixing of the effects of one risk factor with another Identifying a spurious relationship between a risk factor and a disease that is due to the effects of a separate factor
  22. Bias & Confounding (a form of bias) Why is this important to understand? To understand what research studies mean (and don’t mean) – to be an educated consumer of research information Clinical practice – To better understand the causes of disease and appropriate treatments Research – To use appropriate study design, analysis, and interpretation
  23. Bias & Confounding (a form of bias) Why is this important to understand? Bias and confounding can (and do!) completely distort study results and can lead to interpretations that are completely wrong!! Why does this happen? Multi-factorial nature of disease Lack of understanding of the roles of bias and confounding by researchers and clinicians
  24. Bias & Confounding (a form of bias) Challenge for researchers and health practitioners? Obtain valid study results i.e. results that represent the true nature of the relationship between exposure and disease This requires consideration of all possible errors due to bias and/or confounding
  25. Bias & Confounding (a form of bias) How to control for bias and confounding Appropriate study design Statistical analytic techniques Understanding and accommodating for limitations (don’t over-interpret!)
  26. Bias & Confounding (a form of bias) Take Home Beware! When you read results from a health study…an apparent link between a risk factor and a disease may be real, or just an anomaly of how the study was done.
  27. Bias Three main types of bias Selection bias Information bias Confounding (not always considered ‘bias’)
  28. Bias Three main types of bias Selection bias Information bias Confounding (not always considered ‘bias’) Note: lack of generalizability not usually considered type of bias “Bias” usually related to internal validity Generalizability related to external validity
  29. Bias Three main types of bias Selection bias Information bias Confounding (not always considered ‘bias’) Note: lack of generalizability not usually considered type of bias “Bias” usually related to internal validity Generalizability related to external validity Distinctions between types of biases not always clear-cut
  30. Selection Bias
  31. Selection Bias Distortion in the estimate of a relationship between exposure and disease that is the result of how subjects are selected for the study Distortions that arise from... The procedures used to select subjects Factors that influence study participation
  32. Selection Bias Distortions that arise from… The procedures used to select subjects Factors that influence study participation Systematic error in selecting subjects… If relationship between exposure/explanatory factor and disease is different… Between cases and controls Between participants and those who should be eligible for the study but don’t participate (are not selected) Examples are…
  33. Selection Bias Self-selection bias Self-selection may be associated with the outcome under study Volunteers may be more likely to have disease you are interested in e.g. If one did a survey on dogs with leg pain – would owners who have recognized lameness in their dogs be more likely to respond to the survey?
  34. Selection Bias Problematic in selecting control group Want them to differ only on the exposure (for cohort and some cross-sectional studies) Want them to differ only on outcome (for case-control and some cross-sectional studies) By excluding animals that don’t have the exposure or outcome of interest – we are at risk of creating selection bias
  35. Selection Bias Diagnostic bias Also occurs before subjects are identified for study Diagnoses may be influenced by veterinarians’ knowledge of exposure Legitimate for process of diagnosis and treatment, but inconvenient for research This makes medical records and data-bases less valuable for research than one might think (case-control and retrospective cohort)
  36. Selection Bias Response bias Differential loss to follow-up Differential consent rates Especially problematic in prospective cohort studies Retrospective cohorts also require ascertainment of outcome in cohort
  37. Selection Bias vs. Selective Sample. Selection bias: selective differences between groups that impact the relationship between explanatory factors/exposure and outcome; violates internal validity. Selective sample: strict inclusion / exclusion criteria; not a threat to internal validity (may enhance internal validity); not necessarily representative of population as a whole; potential threat to external validity.
  38. Controlling Selection Bias Appropriate study design Random selection of subjects from subject “pool” Once study is completed (or even started) – it is impossible to correct for selection bias Study should be thrown out/destroyed/ignored But it probably won’t be
  39. Information Bias
  40. Information Bias Method of gathering information which yields systematic errors regarding exposures and outcomes Using an “invalid” measure e.g., database that has not been validated Is this information bias? Yes, if information is more likely to be wrong for one group than for another Some would consider “not biased” if inaccuracies randomly distributed
  41. Information Bias Method of gathering information which yields systematic errors regarding exposures and outcomes Using an “invalid” measure e.g., database that has not been validated Is this information bias? Yes, if information is more likely to be wrong for one group than for another Some would consider “not biased” if inaccuracies randomly distributed Either way, study is seriously flawed
  42. Information Bias Examples: Misclassification bias Observer or Interviewer bias Recall bias Reporting bias (wish bias) Surveillance bias Observer bias
  43. Information Bias Examples: Misclassification bias Observer or Interviewer bias Recall bias Reporting bias (wish bias) Surveillance bias Observer bias Measurement error Loss to follow-up
  44. Information Bias Misclassification of exposures A problem that occurs when study subjects are erroneously categorized according to the disease and/or the exposure being studied
  45. Information Bias Misclassification of exposures Differential Proportion of misclassified depends on exposure e.g., exposure = exposure to “kennel cough” Owners of dogs who develop kennel cough are more likely to identify any potential exposure than those who do not (recall bias)
  46. Information Bias Misclassification of exposures Differential Proportion of misclassified depends on exposure Non-differential Misclassification independent of exposure (same between treatment and control) Acts to dilute true effects Can also act to inflate effects (non-effects)
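The dilution effect of non-differential misclassification can be sketched with expected counts. The example below assumes a hypothetical case-control table and a hypothetical exposure measure with 80% sensitivity and 80% specificity applied equally to cases and controls; none of these numbers come from the presentation.

```python
def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio for a 2x2 case-control table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts when exposure is measured imperfectly."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


# Hypothetical true counts: (exposed, unexposed)
true_cases = (60, 40)
true_controls = (40, 60)

true_or = odds_ratio(*true_cases, *true_controls)

# The same error rates apply to both groups, so the misclassification is non-differential.
observed_cases = misclassify(*true_cases, sensitivity=0.8, specificity=0.8)
observed_controls = misclassify(*true_controls, sensitivity=0.8, specificity=0.8)
observed_or = odds_ratio(*observed_cases, *observed_controls)

print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

Under these assumptions the observed odds ratio sits between 1 and the true value, which is the dilution case named on the slide.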
  47. Information Bias Misclassification of outcome e.g., outcome = failure to thrive exposure = use of animal health products Owners asked about “failure to thrive” Dogs conscientiously treated with animal health products may be misclassified more often as failure to thrive (intensity of owner interaction) Differential misclassification of outcome
  48. Information Bias Observer or Interviewer Bias Recall bias Problem for retrospective studies (case-control and cohort). Study subjects are required to report specific experiences or exposures that happened in the past Cases are more likely to recall potential exposures than are controls Higher OR than the true association - can result in a study that shows an exposure ‘causing’ an outcome even if it does not
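A rough numerical sketch of recall bias: assume (hypothetically) that exposure has no true effect, that half of cases and half of controls were truly exposed, and that cases recall 90% of true exposures while controls recall only 60%. The recall rates and counts are invented for illustration.

```python
def reported_table(exposed, unexposed, recall):
    """Expected reported counts when only a fraction `recall` of truly
    exposed subjects remember (and report) their exposure."""
    reported_exposed = recall * exposed
    reported_unexposed = unexposed + (1 - recall) * exposed
    return reported_exposed, reported_unexposed


def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio for a 2x2 case-control table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


# Hypothetical truth: exposure is unrelated to disease (true OR = 1).
true_cases = (50, 50)      # (exposed, unexposed)
true_controls = (50, 50)
true_or = odds_ratio(*true_cases, *true_controls)

# Cases search their memory harder than controls do.
cases_reported = reported_table(*true_cases, recall=0.9)
controls_reported = reported_table(*true_controls, recall=0.6)
observed_or = odds_ratio(*cases_reported, *controls_reported)

print(f"true OR = {true_or:.2f}, OR from reported exposure = {observed_or:.2f}")
```

Even with no real association, the difference in recall alone pushes the observed odds ratio well above 1.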
  49. Information Bias Observer or Interviewer Bias Recall bias Problem for retrospective studies (case-control and cohort). Study subjects are required to report specific experiences or exposures that happened in the past Cases are more likely to recall potential exposures than are controls 1999 Febreze question (or just internet hoax)
  50. Information Bias Observer or Interviewer Bias Recall bias Reporting bias (aka Wish bias) Owners and veterinarians may see what they want to see (and not report what they are embarrassed to share) Important concern in case-control studies and poorly blinded experimental trials
  51. Information Bias Observer or Interviewer Bias Recall bias Reporting bias (aka Wish bias) Surveillance bias Animals with a particular exposure may be more closely monitored than animals without the exposure e.g. Women with a family history of breast cancer may be more closely monitored for breast cancer than the general population – therefore, even if no familial risk exists – the closely monitored population is more likely to be diagnosed
  52. Information Bias Observer or Interviewer Bias Recall bias Reporting bias (aka Wish bias) Surveillance bias Animals with a particular exposure may be more closely monitored than subjects without the exposure Important concern in cohort studies
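Surveillance bias can be shown with simple arithmetic. The sketch below assumes a hypothetical true disease risk of 10% in both groups, with 90% of cases detected in the closely monitored (exposed) group and 50% detected in the routinely monitored group; all values are illustrative assumptions.

```python
def observed_risk(true_risk, detection_probability):
    """Risk that appears in the records when only some cases are detected."""
    return true_risk * detection_probability


true_risk = 0.10  # hypothetical, identical in exposed and unexposed animals

closely_monitored = observed_risk(true_risk, detection_probability=0.9)
routinely_monitored = observed_risk(true_risk, detection_probability=0.5)

# Ratio of diagnosed risks, even though the true risk ratio is 1.
apparent_risk_ratio = closely_monitored / routinely_monitored

print(f"apparent risk ratio = {apparent_risk_ratio:.2f} (true risk ratio = 1.00)")
```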
  53. Information Bias Observer or Interviewer Bias Recall bias Reporting bias (aka Wish bias) Surveillance bias Observer bias Potential problem where judgment is required in assessing exposure or outcome Important concern anytime medical records are used (veterinary personnel are not ‘blinded’ to history, other exposures, etc.)
  54. Information Bias Observer or Interviewer Bias Recall bias Reporting bias (aka Wish bias) Surveillance bias Observer bias Measurement bias (error) Use of invalid or poorly validated measurements (i.e. diagnostic test for either the exposure or outcome)
  55. Controlling Information Bias Appropriate study design Careful collection of information/data Blinding
  56. Blinding of Treatment Groups Single-blinded trial The person giving the treatment (owner, technician, veterinarian) is blinded (does not know which treatment the animal received) Often by use of placebo
  57. Blinding of Treatment Groups Single-blinded trial Double-blinded trial Both the person giving the treatment and the person evaluating the animal are unaware of which treatment each animal received
  58. Blinding of Treatment Groups Single-blinded trial Double-blinded trial Triple-blinded trial The person giving the treatment, the animal evaluator, and the diagnostician or statistician are all unaware of which treatment is received
  59. Blinding of Treatment Groups Assures that patients in different treatment groups are not assessed differently – a potential source of bias (“wish” bias) Errors due to patient pre-conceptions or investigator bias will be avoided or equally distributed between treatment groups
  60. Blinding of Treatment Groups It is not always possible to blind treatments, e.g. if the ‘treatment’ is surgery. Fatal error: non-blinded + subjective outcome
  61. Controlling Information Bias Appropriate study design Careful collection of information/data Blinding Limit interpretation of flawed study VERY WEAK evidence of anything
  62. Bias & Confounding (a form of bias) Take Home Beware! When you read results from a health study…an apparent link between a risk factor and a disease may be real, or just an anomaly of how the study was done.
  63. Confounding

  64. Confounding A third factor which is related to both exposure and outcome, and which accounts for some/all of the observed relationship between the two Confounder is not a result of the exposure e.g. association between birth rank and Down syndrome confounder = mother’s age
  65. Confounding Exposure ? Disease
  66. Confounding ANOTHER PATHWAY TO GET TO THE DISEASE (a mixing of effects) Exposure ? Confounding Variable Disease
  67. Confounding Confounder is distributed differently between exposed and un-exposed populations Exposure ? Confounding Variable Disease
  68. Confounding Exposure ? Confounding Variable Confounder must be a risk factor or surrogate for a cause of the disease, independent of the exposure of interest Disease
  69. Confounding Confounder can not be in the causal pathway between the exposure and disease (that is Interaction) Exposure ? Confounding Variable Disease
  70. Birth order & Down Syndrome Risk
  71. Maternal age & Down Syndrome Risk
  72-86. Birth order & Down Syndrome Risk (series of chart slides; only the slide titles survive in this transcript)
  87. Confounding Exposure = lighter in front pocket Outcome = lung cancer Smoking a confounder? Is smoking associated with lung cancer (outcome)? Y/N Is smoking associated with carrying a lighter (i.e. is smoking unequally distributed between people who do and don’t carry lighters)? Y/N If yes to both – smoking is a potential confounder
  88. Confounding Exposure = fed colostrum/milk replacer Outcome = diarrhea Dystocia a confounder? Is dystocia associated with diarrhea (outcome)? Y/N Is dystocia associated with colostrum feeding (i.e. is dystocia unequally distributed between colostrum-fed and non-colostrum-fed animals)? Y/N If yes to both – dystocia is a potential confounder
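The two-question check can be made concrete with invented numbers for the lighter example. In the hypothetical population below, carrying a lighter has no effect on lung cancer within smokers or within non-smokers, yet the crude comparison suggests a strong association because smokers carry lighters far more often. All counts are assumptions for illustration only.

```python
# Hypothetical counts: (people, lung cancer cases) by smoking status and lighter carrying.
counts = {
    "smoker":     {"lighter": (800, 80), "no_lighter": (200, 20)},
    "non_smoker": {"lighter": (100, 1),  "no_lighter": (900, 9)},
}


def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    (n_exposed, cases_exposed), (n_unexposed, cases_unexposed) = exposed, unexposed
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


def pooled(group):
    """Add up people and cases across smoking strata for one exposure group."""
    people = sum(counts[stratum][group][0] for stratum in counts)
    cases = sum(counts[stratum][group][1] for stratum in counts)
    return people, cases


# Crude comparison ignores smoking entirely.
crude_rr = risk_ratio(pooled("lighter"), pooled("no_lighter"))

# Stratum-specific comparisons hold smoking status fixed.
stratum_rr = {
    stratum: risk_ratio(groups["lighter"], groups["no_lighter"])
    for stratum, groups in counts.items()
}

print(f"crude risk ratio: {crude_rr:.2f}")
for stratum, rr in stratum_rr.items():
    print(f"risk ratio among {stratum}s: {rr:.2f}")
```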
  89. Methods to Prevent Confounding Exposure X ? Confounding Variable X Disease
  90. Methods to Prevent Confounding Exposure X ? Confounding Variable X Disease
  91. Methods to Prevent Confounding In RCT, random allocation controls for confounding If it is possible to randomize – do it….it is the best method to reduce confounding Distribution of any variable is theoretically the same in the exposed and unexposed groups This is almost always true with large samples (but may be violated with small sample size)
  92. Methods to Prevent Confounding Exposure X ? Confounding Variable Disease
  93. Randomization to Reduce Confounding (diagram: All subjects → Randomize → Exposed / Unexposed) Applicable only for intervention (experimental) studies. Randomization controls for both known and unknown confounding factors, because the distribution of any variable is theoretically the same across randomization groups. Other methods to control confounding can only deal with known (suspected) confounders.
  94. Randomization to Reduce Confounding (diagram: All subjects → Randomize → Exposed / Unexposed) Applicable only for intervention (experimental) studies. Randomization controls for both known and unknown confounding factors! Does not, however, always eliminate confounding! By chance alone, there can be imbalance. Less of a problem in large studies. Techniques exist to ensure balance of certain variables.
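The point about chance imbalance can be sketched with a simulation. The code below assumes a hypothetical binary confounder present in about half of the subjects, randomly splits the subjects into two equal groups, and reports the average difference in confounder prevalence between the groups. The prevalence, seed, and sample sizes are illustrative assumptions.

```python
import random


def mean_imbalance(n_subjects, repeats=200, prevalence=0.5, seed=7):
    """Average absolute difference in confounder prevalence between two
    randomly allocated groups of equal size (n_subjects should be even)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repeats):
        # Each subject carries the hypothetical confounder with the given probability.
        has_factor = [rng.random() < prevalence for _ in range(n_subjects)]
        rng.shuffle(has_factor)  # random allocation: first half exposed, second half unexposed
        half = n_subjects // 2
        exposed, unexposed = has_factor[:half], has_factor[half:]
        total += abs(sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed))
    return total / repeats


for n in (20, 200, 2000):
    print(f"{n:>4} subjects: average imbalance = {mean_imbalance(n):.3f}")
```

Small trials routinely end up with noticeably unequal groups; large trials rarely do.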
  95. Methods to Prevent Confounding In RCT, random allocation controls for confounding In other study types - control confounding by: Exclusion criteria (aka Restriction or Specification) Restrict enrollment to only those subjects who have a specific value/range of the confounding variable e.g. when age is confounder – include only subjects of same narrow age range
  96. Methods to Prevent Confounding In RCT, random allocation controls for confounding In other study types - control confounding by: Exclusion criteria (aka Restriction or Specification) Restrict enrollment to only those subjects who have a specific value/range of the confounding variable Not always effective, reasonable, practical, or useful to exclude all potential confounders
  97. Restriction to Reduce Confounding Exposure X ? Confounding Variable X Disease
  98. Methods to Prevent Confounding In RCT, random allocation controls for confounding In other study types - control confounding by: Exclusion criteria (aka Restriction or Specification) Advantage – very straightforward Disadvantages Reduces the number of animals that are eligible Takes more work to sift through animals to find those with the level of confounder you want (inefficient) Moderate restriction may not work Reduces the generalizability of the study
  99. Methods to Prevent Confounding In RCT, random allocation controls for confounding In other study types - control confounding by: Exclusion criteria (aka Restriction or Specification) Matching Advantages Good for complex nominal variable (e.g. herd/kennel) Statistical precision because number of cases and controls is balanced
  100. Methods to Prevent Confounding In RCT, random allocation controls for confounding In other study types - control confounding by: Exclusion criteria (aka Restriction or Specification) Matching Disadvantages Finding matches may be difficult or time-consuming May have to throw out a case if an appropriate match cannot be found In a case-control study – the factor used to match subjects cannot be evaluated as a risk factor Decisions are irrevocable – if you match on an intermediary factor, you lose ability to evaluate it
  101. Methods to Prevent Confounding In RCT, random allocation controls for confounding In other study types - control confounding by: Exclusion criteria Matching Statistical analysis
  102. Statistical Control of Confounding Multivariable analyses Any analysis technique that simultaneously adjusts for several variables Known potential confounders should be included in the model ANCOVA, MANCOVA Generalized Linear Models Multiple linear regression Multivariate logistic regression Multivariate Cox Proportional Hazards Regression etc.
  103. Multivariable results Provides relationship between outcome and exposure, adjusted for the potential confounders such as: Gender Age Herd Diet Health related factors (body condition score, etc.) Other diseases, health conditions etc. Must have gathered the information during the experiment!
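As one illustration of what "adjusted for the potential confounders" means, a logistic regression model can be written as follows. The symbols are assumptions introduced here for illustration: D is the disease outcome, E the exposure, and C_1 ... C_k the measured potential confounders.

```latex
\operatorname{logit} P(D = 1 \mid E, C_1, \dots, C_k)
  = \beta_0 + \beta_1 E + \gamma_1 C_1 + \dots + \gamma_k C_k,
\qquad
\mathrm{OR}_{\text{adjusted}} = e^{\beta_1}
```

Here the exponentiated exposure coefficient is the odds ratio for the exposure with the listed confounders held fixed, which is why the confounders must have been recorded during the study.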
  104. Statistical Control of Confounding Multivariable analyses ANCOVA, MANCOVA Generalized Linear Models Multiple linear regression Multivariate logistic regression Multivariate Cox Proportional Hazards Regression Etc. Stratification Stratify results by confounder
  105. Statistical Control of Confounding Stratification Stratify results by confounder Create strata that are homogeneous with respect to the different levels of the confounder Results in a mini-restriction within each stratum Effective for small number of confounders
  106. Stratification of Results Example: Exposure = gender Outcome = acceptance to professional school At a particular university, 38.5% of women applying to the professional schools (medicine, veterinary medicine, and law) and 47.9% of men applying to professional schools are admitted. Is there evidence for a gender-bias lawsuit (prior cases used a 15% difference between genders to establish sex-bias)?
  107. Stratification of Results Example: Exposure = gender Outcome = acceptance to professional school. Non-stratified results (% acceptance by gender): Male 47.9%; Female 38.5%; 19.6% reduction in acceptance risk
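The 19.6% figure is a relative reduction, computed from the two acceptance percentages on the slide. A short sketch of the arithmetic (variable names are illustrative):

```python
men_acceptance = 0.479    # from the slide
women_acceptance = 0.385  # from the slide

# Difference in percentage points, then expressed relative to the men's rate.
absolute_difference = men_acceptance - women_acceptance
relative_reduction = absolute_difference / men_acceptance

print(f"absolute difference: {absolute_difference:.1%} (percentage points)")
print(f"relative reduction:  {relative_reduction:.1%}")
```

The absolute gap is 9.4 percentage points; the relative reduction is what exceeds the 15% threshold mentioned in the example.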
  108. Stratification to Reduce Confounding Gender Types of Professional Schools ? Professional School Acceptance
  109. Stratified Results
  110. Statistical Control of Confounding Stratification Stratify results by confounder
  111. Stratification: Gender, Application School, & Likelihood of Acceptance
  112. Stratification: Gender, Application School, & Likelihood of Acceptance Interpretation: Men are 48% more likely to be admitted to professional school compared to women (statistically significant since 95% CI does not include 1)
  113. Stratification: Gender, Application School, & Likelihood of Acceptance Interpretation: No statistical difference between men and women for admission to medical school (women with numerical advantage)
  114. Stratification: Gender, Application School, & Likelihood of Acceptance Interpretation: No statistical difference between men and women for admission to vet school (women with numerical advantage)
  115. Stratification: Gender, Application School, & Likelihood of Acceptance Interpretation: No statistical difference between men and women for admission to law school (women with numerical advantage)
  116. Statistical Control of Confounding Stratification Stratify results by confounder Create a single un-confounded (adjusted) estimate for the relationship in question Summarize the un-confounded estimates from the two (or more) strata to form a single overall un-confounded “summary estimate”
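One common way to form such a summary estimate is the Mantel-Haenszel odds ratio; the transcript does not say which method the presentation used, so this is only one possible choice. The counts below are invented for illustration (the tables from the slides are not in the transcript): men apply mostly to the school with high acceptance, women mostly to the school with low acceptance.

```python
# Hypothetical counts per school:
# (men admitted, men rejected, women admitted, women rejected)
strata = {
    "school A": (240, 160, 62, 38),
    "school B": (120, 180, 123, 177),
    "school C": (20, 80, 84, 316),
}


def odds_ratio(a, b, c, d):
    """Odds of admission for men relative to women in one 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Crude table: add the strata together, ignoring which school was applied to.
crude_table = tuple(sum(column) for column in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
summary_or = mantel_haenszel_or(strata.values())

print(f"crude OR (men vs women): {crude_or:.2f}")
for school, table in strata.items():
    print(f"{school}: OR = {odds_ratio(*table):.2f}")
print(f"Mantel-Haenszel summary OR: {summary_or:.2f}")
```

With these invented counts the crude odds ratio favors men, while every stratum and the summary estimate sit close to 1, mirroring the pattern the stratification slides describe.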
  117. Stratification: Gender, Application School, & Likelihood of Acceptance
  118. Stratification: Gender, Application School, & Likelihood of Acceptance Interpretation: No statistical difference between men and women for admission to professional school
  119. Controlling Confounding Study design Randomization (in clinical trials) Restriction (known confounders) Matching (known confounders) Data analysis Multivariate analysis (appropriate models) Stratification (analyze data by subgroups) Studies often use a combination of these
  120. Bias, Confounding (a form of bias), and Interaction Take Home Beware! When you read results from a health study…an apparent link between a risk factor and a disease may be real, or just an anomaly of how the study was done.