Traps and pitfalls in medical statistics

Presentation Transcript


  1. Traps and pitfalls in medical statistics Arvid Sjölander

  2. Motivating example • You are involved in a project to find out if snus causes ulcer. • A questionnaire is sent out to 300 randomly chosen subjects. • 200 subjects respond (the table of responses is not reproduced in this transcript). • We can use the relative risk (RR) to measure the association between snus and ulcer; the observed RR is 0.7. • Can we safely conclude that snus prevents ulcer?
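
As a minimal sketch of how an RR is computed from a 2x2 table: the counts below are hypothetical (the study's own table is not available here) and were chosen only so that the RR comes out at 0.7. The function name `relative_risk` is an illustrative choice.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk of disease among the exposed divided by the risk among the unexposed."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed

# Hypothetical counts, NOT the study data:
# 7 ulcer cases among 100 snus users, 10 ulcer cases among 100 non-users.
rr = relative_risk(7, 100, 10, 100)
print(round(rr, 2))
```

An RR below 1 means the risk is lower among snus users than among non-users in the observed data; it says nothing yet about why.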

  3. Outline • Systematic errors: selection bias, confounding, randomization, reverse causation. • Random errors: confidence interval, p-value, hypothesis test, significance level, power.

  4. One possible explanation • It is a widespread hypothesis that snus causes ulcer. • Snus users who develop ulcer may therefore feel somewhat guilty, and may be reluctant to participate in the study. • Hence, RR<1 may be (partly) explained by an underrepresentation of snus users with ulcer among the responders. • This is a case of selection bias.

  5. Selection bias • We only observe the RR among the potential responders. • The RR among the responders (observed) may not be equal to the population RR (unobserved). [Figure labels: Population, Sample, Potential non-responders, Potential responders]
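
The mechanism can be illustrated with a small calculation on expected counts. This is a sketch under invented assumptions: a hypothetical population in which snus has no effect at all, and made-up response probabilities in which snus users with ulcer respond half as often as everyone else.

```python
def risk(cases, non_cases):
    return cases / (cases + non_cases)

def rr(table):
    """RR of ulcer for snus users versus non-users in a table of counts."""
    return (risk(table[("snus", "ulcer")], table[("snus", "no ulcer")])
            / risk(table[("no snus", "ulcer")], table[("no snus", "no ulcer")]))

# Hypothetical population in which snus has NO effect: 10% ulcer risk in both groups.
population = {
    ("snus", "ulcer"): 1000, ("snus", "no ulcer"): 9000,
    ("no snus", "ulcer"): 1000, ("no snus", "no ulcer"): 9000,
}
# Assumed response probabilities: snus users with ulcer respond half as often.
response_prob = {
    ("snus", "ulcer"): 0.35, ("snus", "no ulcer"): 0.70,
    ("no snus", "ulcer"): 0.70, ("no snus", "no ulcer"): 0.70,
}
# Expected number of responders in each cell.
responders = {cell: population[cell] * response_prob[cell] for cell in population}

population_rr = rr(population)   # the unobserved population RR
responder_rr = rr(responders)    # the RR we would actually observe
print(round(population_rr, 2), round(responder_rr, 2))
```

Under these assumptions the responders show an RR well below 1 even though the population RR is exactly 1, purely because of who chose to respond.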

  6. How do we avoid selection bias? • Make sure that the sample is drawn randomly from the whole population of interest - must trace the non-responders. • Send out the questionnaire again, make follow-up phone calls, etc. [Figure labels: Population, Sample, Potential non-responders, Potential responders]

  7. Another possible explanation • Because of age trends, young people use snus more often than old people. • For biological reasons, young people have a smaller risk of ulcer than old people. • Hence, RR<1 may be (partly) explained by snus users being in “better shape” than non-users. • This is a case of confounding, and age is called a confounder.

  8. Confounding • The RR measures the association between snus and ulcer. • The association depends on both the causal effect and the influence of age. • In particular, even in the absence of a causal effect, there will be an (inverse) association between snus and ulcer (RR ≠ 1).
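
A small numerical illustration, using hypothetical counts that are not study data: within each age group snus makes no difference to the ulcer risk, yet the crude RR over both groups is well below 1, because most snus users are young and the young have a low risk.

```python
# Hypothetical counts per age stratum, NOT study data:
# (ulcer cases among snus users, snus users, ulcer cases among non-users, non-users)
strata = {
    "young": (40, 800, 10, 200),    # 5% ulcer risk with or without snus
    "old": (40, 200, 160, 800),     # 20% ulcer risk with or without snus
}

def relative_risk(a, n1, c, n0):
    return (a / n1) / (c / n0)

# RR within each age group: no association.
stratum_rr = {age: relative_risk(*counts) for age, counts in strata.items()}

# Crude RR: add the counts over the age groups and ignore age.
totals = [sum(column) for column in zip(*strata.values())]
crude_rr = relative_risk(*totals)

print(stratum_rr, round(crude_rr, 2))
```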

  9. How do we avoid confounding? • At the design stage: randomization, i.e. assigning “snus” and “no snus” by “the flip of a coin”. • + reliable; it eliminates the influence of all confounders. • - expensive and possibly unethical. • At the analysis stage: adjust (the observed association) for (the influence of) age, e.g. by stratification, matching, or regression modeling. • + cheap and ethical. • - not fully reliable; cannot adjust for unknown or unmeasured confounders.
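
One way to adjust by stratification is the Mantel-Haenszel pooled RR, sketched below. The slides do not say which adjustment method the author has in mind, so this is only one possible choice, and the counts are hypothetical: here snus doubles the risk within each age group, but the crude RR hides this.

```python
# Hypothetical counts per age stratum, NOT study data:
# (ulcer cases among snus users, snus users, ulcer cases among non-users, non-users)
strata = [
    (80, 800, 10, 200),     # young: risk 10% with snus, 5% without
    (80, 200, 160, 800),    # old: risk 40% with snus, 20% without
]

def relative_risk(a, n1, c, n0):
    return (a / n1) / (c / n0)

def mantel_haenszel_rr(strata):
    """Mantel-Haenszel RR: a weighted combination of the stratum-specific RRs."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator

crude_rr = relative_risk(*[sum(column) for column in zip(*strata)])
adjusted_rr = mantel_haenszel_rr(strata)
print(round(crude_rr, 2), round(adjusted_rr, 2))
```

The adjustment only works because age was measured; an unmeasured confounder could not be handled this way.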

  10. Yet another explanation • It is a widespread hypothesis among physicians that snus causes and aggravates ulcer. • Snus users who suffer from ulcer may therefore be advised by their physicians to quit. • Hence, RR<1 may be (partly) explained by a tendency among people with ulcer to quit using snus. • This is a case of reverse causation.

  11. Reverse causation • Reverse causation can be avoided by randomization. [Figure labels: Snus, Ulcer, “?”]

  12. Systematic errors • Selection bias, confounding, and reverse causation are referred to as systematic errors, or bias. • “You don’t measure what you are interested in”. • How can you tell if your study is biased? • You can’t! (At least not from the observed data). • It is important to design the study carefully and “think ahead” to avoid bias. • What might the reasons be for response or non-response? • How can we trace the non-responders? • Which are the possible confounders? • Do we need to randomize the study? Would randomization be ethical and practically possible?

  13. Example cont’d • Assume that we believe that the study is unbiased (no selection bias, no confounding and no reverse causation). • Can we safely conclude that snus prevents ulcer?

  14. Random errors • True RR = observed RR? • True RR ≠ observed RR! [Figure labels: Population (true RR), Sample (observed RR = 0.7)]

  15. Confidence interval • Where can we expect the true RR to be? • The 95% Confidence Interval (CI) answers this question. • It is a range of plausible values for the true RR. • Example: RR=0.7, 95% CI: (0.5,0.9). • The narrower the CI, the less uncertainty about the true RR. • The width of the CI depends on the sample size: the larger the sample, the narrower the CI. • How do we compute a CI? Ask a statistician!
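
For readers who want to see what such a computation can look like, here is a minimal sketch using a common normal approximation on the log scale. The slides do not state which method was used for the intervals they report, so this is an assumption, and the counts are hypothetical. The two calls use the same observed RR of 0.7 but sample sizes that differ by a factor of ten.

```python
import math

def rr_confidence_interval(a, n1, c, n0, z=1.96):
    """RR with an approximate 95% CI (normal approximation for log RR).

    a, n1: cases and group size among the exposed;
    c, n0: cases and group size among the unexposed.
    """
    rr = (a / n1) / (c / n0)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical counts, NOT the study data; the observed RR is 0.7 in both cases.
small_study = rr_confidence_interval(7, 100, 10, 100)
large_study = rr_confidence_interval(70, 1000, 100, 1000)

for label, (rr, lower, upper) in [("small", small_study), ("large", large_study)]:
    print(f"{label}: RR={rr:.2f}, 95% CI: ({lower:.2f},{upper:.2f})")
```

With these counts the small study's interval includes 1, while the large study's interval lies entirely below 1.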

  16. CI for our data • RR=0.7, 95% CI: (0.16,2.74). • Conclusion?

  17. P-value • Often, we specifically want to know whether the true RR is equal to 1 (no association between snus and ulcer). • The hypothesis that the true RR = 1 is called the “null hypothesis”, H0. • The p-value (p) is an objective measure of the strength of evidence in the observed data against H0. • 0 < p < 1. • The smaller the p-value, the stronger the evidence against H0. • How do we compute p? Ask a statistician!

  18. Factors that determine the p-value • What do you think p depends on? • The sample size: for a given observed association, the larger the sample, the smaller p. • The magnitude of the observed association: for a given sample size, the stronger the association, the smaller p. • A common mistake: “The p-value is low, but the sample size is small so we cannot trust the results”. • Yes you can! • The p-value takes the sample size into account. Once the p-value is computed, the sample size carries no further information.
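
Both factors can be seen in a small sketch. It uses a Wald test on the log RR, which is one common choice and not necessarily the test behind the p-value reported in these slides; all counts are hypothetical.

```python
import math

def rr_p_value(a, n1, c, n0):
    """Two-sided p-value for H0: RR = 1, from a Wald test on log RR."""
    log_rr = math.log((a / n1) / (c / n0))
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    z = log_rr / se_log_rr
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts, NOT the study data.
p_small_sample = rr_p_value(7, 100, 10, 100)       # RR = 0.7, 100 per group
p_large_sample = rr_p_value(70, 1000, 100, 1000)   # RR = 0.7, 1000 per group
p_strong_assoc = rr_p_value(3, 100, 10, 100)       # RR = 0.3, 100 per group

print(p_small_sample, p_large_sample, p_strong_assoc)
```

The same RR gives a smaller p in the larger sample, and at the same sample size the stronger association gives a smaller p.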

  19. P-value for our data • P = 0.81 • Conclusion?

  20. Making a decision • The p-value is an objective measure of the strength of evidence against H0. • The smaller the p-value, the stronger the evidence against H0. • Sometimes, we have to make a formal decision of whether or not to reject H0. • This decision process is formally called hypothesis testing. • We reject H0 when the evidence against H0 is “strong enough”, • i.e. when the p-value is “small enough”.

  21. Significance level • The rejection threshold is called the significance level. • E.g. “5% significance level” means that we have decided to reject H0 if p<0.05. • Using a low significance level means that we require strong evidence against H0 for rejection. • Using a high significance level means that we are satisfied with weak evidence against H0 for rejection. • What is the advantage of using a low significance level? What about a high significance level?

  22. A parallel to the courtroom • H0 = the defendant is innocent. • p-value = the strength of evidence against H0. • Low significance level = need strong evidence to send someone to jail. • Few innocent in jail, but many guilty go free. • High significance level = weak evidence is sufficient to send someone to jail. • Many guilty in jail, but many innocent in jail as well.

  23. Type I and type II errors • A type I error is rejecting H0 when H0 is true; a type II error is failing to reject H0 when H0 is false. • There is always a trade-off between the risk for type I and the risk for type II errors. • Low significance level (difficult to reject H0) ⇒ small risk for type I errors, but large risk for type II errors. • High significance level (easy to reject H0) ⇒ small risk for type II errors, but large risk for type I errors. • By convention, we use 5% significance level (reject H0 if p<0.05).

  24. Relation between significance level and type I errors • In fact, the significance level = the risk for type I errors. • If we follow the convention and use 5% significance level (reject H0 if p<0.05), then we have a 5% risk of rejecting H0 when H0 is true. • What does this mean, more concretely?
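
One concrete reading: if many studies were run in a world where H0 is true, about 5% of them would reject H0 at the 5% level. The simulation below sketches this under assumed values (a 20% ulcer risk in both groups, 200 subjects per group, 2000 repeated studies) and uses a pooled two-proportion z-test, which is an illustrative choice of test.

```python
import math
import random

def two_proportion_p_value(x1, n1, x0, n0):
    """Two-sided p-value of the pooled z-test for equal risks in two groups."""
    pooled = (x1 + x0) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    if se == 0:
        return 1.0  # no cases (or only cases) in both groups: no evidence against H0
    z = (x1 / n1 - x0 / n0) / se
    return math.erfc(abs(z) / math.sqrt(2))

def simulate_rejection_rate(risk1, risk0, n_per_group, alpha, n_studies, rng):
    """Fraction of simulated studies in which H0 is rejected at level alpha."""
    rejections = 0
    for _ in range(n_studies):
        x1 = sum(rng.random() < risk1 for _ in range(n_per_group))
        x0 = sum(rng.random() < risk0 for _ in range(n_per_group))
        if two_proportion_p_value(x1, n_per_group, x0, n_per_group) < alpha:
            rejections += 1
    return rejections / n_studies

rng = random.Random(2024)  # fixed seed so the run is reproducible
# H0 is true here: the risk is 20% in both groups (assumed values).
type_one_rate = simulate_rejection_rate(0.2, 0.2, 200, 0.05, 2000, rng)
print(type_one_rate)  # should land near 0.05, up to simulation noise
```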

  25. Power • Power = the chance of being able to reject H0, when H0 is false. • Relation between significance level and power: • High significance level (easy to reject H0) ⇒ high power. • Low significance level (difficult to reject H0) ⇒ low power.

  26. Power calculations • It is important to determine the power of the study before data is collected. • Low power means that we will probably not find what we are looking for. • Direct calculation of the power is beyond the scope of this course. • Ask a statistician!

  27. Power calculations, cont’d • Heuristically, the power of the study is determined by three factors: • The significance level; a higher significance level gives higher power. • The true RR; a stronger association gives higher power. • The sample size; a larger sample gives higher power. • Typically, we want to have a power of at least 80%. • In practice, the significance level is fixed at 5%. • We also typically have an idea of which deviations from H0 are scientifically relevant to detect (e.g. RR > 1.5). • We determine the sample size that we need to have the desired power.
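
The three factors and the sample-size step can be sketched with a normal approximation for comparing two risks. This is a rough illustration, not the method a statistician would necessarily choose; the baseline risk of 10% among non-users is an assumed value, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def approx_power(risk0, rr, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    risk0: assumed risk among the unexposed; rr: true RR to detect.
    """
    risk1 = risk0 * rr
    pooled = (risk0 + risk1) / 2
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se_null = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    se_alt = math.sqrt((risk0 * (1 - risk0) + risk1 * (1 - risk1)) / n_per_group)
    return STD_NORMAL.cdf((abs(risk1 - risk0) - z_alpha * se_null) / se_alt)

def sample_size_for_power(risk0, rr, target_power=0.80, alpha=0.05):
    """Smallest number of subjects per group that reaches the target power."""
    n = 2
    while approx_power(risk0, rr, n, alpha) < target_power:
        n += 1
    return n

# Assumed baseline risk of 10%; detect RR = 1.5 with 80% power at the 5% level.
n_needed = sample_size_for_power(0.10, 1.5)
print(n_needed, round(approx_power(0.10, 1.5, n_needed), 3))
```

Calling `approx_power` with a larger `n_per_group`, a larger `rr`, or a larger `alpha` gives a higher value, matching the three factors listed above.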

  28. Systematic vs random errors • There are two qualitative differences between systematic and random errors. • #1 • Data can tell us if an observed association is possibly due to random errors - check the p-value. • Data can never tell us if an observed association is due to systematic errors. • #2 • Uncertainty due to random errors can be reduced by increasing the sample size ⇒ narrower confidence intervals. • Systematic errors result from a poor study design, and cannot be reduced by increasing the sample size.

  29. Summary • In medical research, we are often interested in the causal effect of one variable on another. • An observed association between two variables does not necessarily imply that one causes the other. • Always be aware of the following pitfalls: • Selection bias • Confounding • Reverse causation • Random errors
