
Statistics: the ten main mistakes

Ecole Nationale Vétérinaire de Toulouse. Didier Concordet, d.concordet@envt.fr. July 2005.


Presentation Transcript


  1. Ecole Nationale Vétérinaire de Toulouse. Statistics: the ten main mistakes. Didier Concordet, d.concordet@envt.fr. July 2005.

  2. Statistical mistakes are frequent • Many surveys of statistical errors in the medical literature report error rates ranging from 30% to 90% (Altman, 1991; Gore et al., 1976; Pocock et al., 1987; MacArthur, 1984) • Reviews of the biomedical literature have consistently found that about half the articles use incorrect statistical methods (Glantz, 1980)

  3. When do they occur? • When designing the experiment • When collecting data • When analysing data • When interpreting results

  4. Design • Lack of a proper randomisation • the inference space is not defined • poor balance of the groups to be compared • lack of control group (maybe less frequent now) • there exist confounding factors • Lack of power • the sample size is not large enough to answer the question • the statistical unit is not well defined

  5. Inference space definition (M1). An experiment in 2-year-old beagles showed that the temperature of dogs treated with the antipyretic drug A decreased by 2 °C. Does this result still hold for: • all 2-year-old beagles? • 3-year-old beagles? • beagles? • dogs? • man?

  6. Poor balance (M2). Clinical trial: comparison of 2 antipyretics, rectal temperature after treatment.
     REFERENCE: mean = 39, N = 100, SD = 1
     New TRT: mean = 37, N = 100, SD = 1
     Reference < New TRT (P<0.001), i.e. the new treatment appears to lower temperature more.

  7. Poor balance. Clinical trial: comparison of 2 antipyretics, rectal temperature after treatment.
     Clinical trial 1: REFERENCE mean = 30, N = 10, SD = 1; New TRT mean = 32, N = 50, SD = 1. New TRT < Ref (P<0.001)
     Clinical trial 2: REFERENCE mean = 40, N = 90, SD = 1; New TRT mean = 42, N = 50, SD = 1. New TRT < Ref (P<0.001)
     Conclusion: Reference > New TRT
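The reversal between slides 6 and 7 can be checked with a few lines of arithmetic. The sketch below uses only the means and group sizes quoted on the slides; the names `trials` and `pooled_mean` are illustrative and not part of the presentation.

```python
# Means and group sizes as quoted on slide 7 (rectal temperature after treatment).
trials = {
    "trial 1": {"reference": (30.0, 10), "new": (32.0, 50)},
    "trial 2": {"reference": (40.0, 90), "new": (42.0, 50)},
}


def pooled_mean(arm):
    """Size-weighted mean of one treatment arm across all trials."""
    total = sum(t[arm][0] * t[arm][1] for t in trials.values())
    size = sum(t[arm][1] for t in trials.values())
    return total / size


pooled_reference = pooled_mean("reference")
pooled_new = pooled_mean("new")

# Within each trial the reference arm ends with the lower temperature...
reference_better_within = all(
    t["reference"][0] < t["new"][0] for t in trials.values()
)
# ...but pooling ignores the unbalanced allocation and reverses the ranking.
new_better_pooled = pooled_new < pooled_reference

print(pooled_reference, pooled_new, reference_better_within, new_better_pooled)
```

Pooling the two trials gives back the means of slide 6 (39 for the reference, 37 for the new treatment): the apparent advantage of the new treatment comes only from the unbalanced allocation, since most reference animals were enrolled in the trial with the higher temperatures.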

  8. Power (M3). A clinical study to compare the efficacy of two treatments (Ref. and Test). For the efficacy variable: expected difference between the treatments = 4, SD  2. A parallel two-group design is planned with 5 dogs in each group. What to think about this study? 35% power for a type I risk of 5%. Even if the expected difference exists, only 35% of samples (of 5 dogs per group) will actually exhibit it!
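The meaning of "power" on slide 8 can be illustrated by simulation: draw many pairs of samples in which the difference truly exists and count how often the t-test detects it. The sketch below is a minimal Monte Carlo version using only the Python standard library. The standardised effect size of 1.0 (difference divided by SD) is an assumption chosen for illustration, because the SD on the slide is not legible in this transcript; the code is not claimed to reproduce the slide's 35% figure. The function name `simulated_power` is hypothetical.

```python
import math
import random


def simulated_power(effect_size, n_per_group, t_critical, n_sim=20000, seed=1):
    """Monte Carlo power of a two-sided, two-sample Student t-test.

    effect_size: true difference between group means divided by the common SD.
    t_critical:  two-sided critical value of Student's t for 2*n_per_group - 2 df.
    """
    rng = random.Random(seed)
    df = 2 * n_per_group - 2
    rejections = 0
    for _ in range(n_sim):
        ref = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        test = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        mean_ref = sum(ref) / n_per_group
        mean_test = sum(test) / n_per_group
        ss_ref = sum((x - mean_ref) ** 2 for x in ref)
        ss_test = sum((x - mean_test) ** 2 for x in test)
        pooled_var = (ss_ref + ss_test) / df
        se_diff = math.sqrt(pooled_var * 2 / n_per_group)
        if abs(mean_test - mean_ref) / se_diff > t_critical:
            rejections += 1
    return rejections / n_sim


# Two-sided 5% critical values of Student's t: 2.306 (8 df) and 2.024 (38 df).
# ASSUMPTION: effect size 1.0 is illustrative, not taken from the slide.
power_5_dogs = simulated_power(effect_size=1.0, n_per_group=5, t_critical=2.306)
power_20_dogs = simulated_power(effect_size=1.0, n_per_group=20, t_critical=2.024)
print(power_5_dogs, power_20_dogs)
```

Under this assumed effect size, most simulated experiments with 5 dogs per group fail to reach significance even though the difference is real, while the same difference is detected far more often with 20 dogs per group.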

  9. Power. Efficacy variable on two groups of dogs:
     Ref: N = 5, Mean = 20.0, SD = 2.6
     Test: N = 5, Mean = 15.4, SD = 2.4
     Student t-test: P = 0.18. Actually no conclusion.

  10. A real story. A study was performed to assess the effect of diet on several biochemical compounds (about 20). To this end, a dog was fed a "normal" diet for 3 months and then the new diet for 3 months. Every two days, a blood sample was taken and the biochemical compounds were assayed. At the end of the experiment, 90 data points were available for each biochemical compound. There was a significant difference between the effects of the two diets for 10 biochemical compounds (P<0.001). This result was obtained with a sample size of 90.

  11. Statistical unit (M4). The statistical unit (an individual) is a statistical object that cannot be divided. We want to generalise results obtained on a finite collection of units (a sample) to a population of units. Despite the appearance of "wealth", the sample size in the diet study was equal to 1, not 90. At the end of the experiment, the only dog in the experiment was well known, but what about the other dogs of the population?

  12. Experiment • Missing data not adequately reported • Extreme values excluded • Data ignored because they did not support the hypothesis ?

  13. Analysis • Failure to check assumptions of the statistical methods (M5) • homoscedasticity (for a t-test, a linear regression,…) • using a linear regression without first establishing linearity… • correlation • Ignoring informative "missing" data • death and its consequences • data below LOQ • Choosing the question to get an answer • Multiple comparisons

  14. Homoscedasticity (M5): what the t-test can see. [Figure: clearance by treatment (1 and 2).] t-test: P-value = 0.56. After log-transformation: P-value = 0.026.

  15. Linearity/Correlation (M5). [Figure: two scatter plots, each with a fitted linear regression; correlation R = -0.93 in one and R = -0.002 in the other.]

  16. Linearity/Correlation. [Figure: two panels. Linear regression on all data: correlation R = 0.84. A linear model with 3 groups: within-group correlation R = -0.92.]
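The contrast on slide 16 (positive overall correlation, negative correlation within each group) is easy to reproduce. The sketch below builds three small groups of made-up points; the data are hypothetical and are not those of the slide's figure. The helper `pearson` is written out so that the block needs only the standard library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL data: three groups whose centres increase in both x and y,
# while inside each group y decreases as x increases.
offsets = (-1.0, -0.5, 0.0, 0.5, 1.0)
groups = []
for centre in (0.0, 5.0, 10.0):
    xs = [centre + d for d in offsets]
    ys = [centre - d for d in offsets]
    groups.append((xs, ys))

within_r = [pearson(xs, ys) for xs, ys in groups]

all_x = [x for xs, _ in groups for x in xs]
all_y = [y for _, ys in groups for y in ys]
pooled_r = pearson(all_x, all_y)

print(within_r, pooled_r)
```

A single regression line through the pooled points would report a strong positive relationship, although the relationship inside every group is negative. Plotting the data with the groups identified is what reveals the problem.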

  17. Ignoring data (M6)

  18. Ignoring data

  19. Choosing the question to get an answer (M7). This occurs frequently in the presentation of clinical trial results. The question becomes random: it changes with the sample of animals. The question is chosen with its answer in hand… Think of a coin-flip game in which you win 1 € when heads or tails occurs, and you choose the decision rule once you know the result of the flip! Such an approach increases the number of false discoveries.

  20. Multiple comparisons (M8). One wants to compare the average daily gain (ADG) obtained with 5 different diets in pigs: ten t-tests. With a risk of 5% for each comparison, the global risk can be very large.
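The size of the "global risk" on slide 20 can be approximated with a short calculation. The sketch below assumes the ten tests are independent, which is not strictly true for pairwise comparisons that share groups, so the result is an order of magnitude rather than an exact value. The Bonferroni threshold at the end is one common correction, added here for illustration; it is not mentioned on the slide.

```python
import math

n_diets = 5
alpha = 0.05  # type I risk for each individual comparison

# Number of pairwise comparisons between 5 diets.
n_tests = math.comb(n_diets, 2)

# Probability of at least one false positive among the tests,
# ASSUMING the tests are independent (an approximation here).
global_risk = 1 - (1 - alpha) ** n_tests

# Bonferroni correction (not from the slide): test each pair at alpha / n_tests.
bonferroni_alpha = alpha / n_tests

print(n_tests, round(global_risk, 3), bonferroni_alpha)
```

Under the independence assumption, the chance of at least one spurious "significant" difference among the ten comparisons is about 40%, eight times the nominal 5%.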

  21. Interpretation/presentation • Standard error and standard deviation • P values : non significant effects • False causality

  22. Standard error / standard deviation (M9). The clearance of the drug was equal to 68 ± 5 mL/min. Two possible meanings, depending on the meaning of 5. If 5 is the standard error of the mean (SE), there is a 95% chance that the population mean clearance belongs to [68 - 2 × 5 ; 68 + 2 × 5]. If 5 is the standard deviation (SD), 95% of animals have their clearance within [68 - 2 × 5 ; 68 + 2 × 5].
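The two readings of "68 ± 5" on slide 22 lead to very different pictures once the sample size is taken into account, because SE = SD / √n. The sketch below works this out for a hypothetical sample of 25 animals; the sample size is an assumption, since the slide does not give one.

```python
import math

mean_clearance = 68.0  # mL/min, from the slide
reported = 5.0         # the ambiguous "± 5"
n_animals = 25         # HYPOTHETICAL sample size, not given on the slide

# The interval written on the slide is the same in both readings.
interval = (mean_clearance - 2 * reported, mean_clearance + 2 * reported)

# Reading 1: 5 is the SE. Then SD = SE * sqrt(n), and individual animals
# spread over a much wider range than the interval above.
sd_if_se = reported * math.sqrt(n_animals)
animal_range_if_se = (mean_clearance - 2 * sd_if_se, mean_clearance + 2 * sd_if_se)

# Reading 2: 5 is the SD. Then SE = SD / sqrt(n), and the population mean
# is known much more precisely than the interval above.
se_if_sd = reported / math.sqrt(n_animals)
mean_interval_if_sd = (mean_clearance - 2 * se_if_sd, mean_clearance + 2 * se_if_sd)

print(interval, animal_range_if_se, mean_interval_if_sd)
```

With 25 animals, [58 ; 78] describes the population mean if 5 is an SE (individual animals then range roughly from 18 to 118), but describes individual animals if 5 is an SD (the mean is then located within roughly [66 ; 70]). Stating which quantity is reported removes the ambiguity.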

  23. P values (M10). NO: "The difference between the effects of drugs A and B is not significant (P = 0.56), therefore drug A can be substituted by drug B." The only conclusion that can be drawn from such a P value is that you didn't see any difference between the effects of drugs A and B. That does not mean that such a difference does not exist. Absence of evidence is not evidence of absence.

  24. P values (M10). NO: "Drug A has a higher efficacy than drug B (P = 0.001). Drug C has a higher efficacy than drug B (P = 0.04). Since 0.001 < 0.04, drug A has a higher efficacy than drug C." The only conclusion that can be drawn from such P values is that you are sure that A > B and less sure that C > B. This does not presume anything about the amplitude of the differences. Significant does not mean important.

  25. False causality: lying with statistics. There is a strong positive correlation between the number of firefighters present at a fire and the amount of fire damage. Thus, the firefighters present at a fire create higher fire damage! The correlation coefficient is nothing more than a measure of the strength of a linear relationship between 2 variables. Correlation cannot establish causality. A strong correlation between X and Y can occur when: • "X" causes "Y" • "Y" causes "X" • "Z" causes "X" and "Y" (Z = fire size in the previous example) • incidentally, with small sample sizes, when X and Y are independent
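The third case on slide 25, where "Z" causes both "X" and "Y", can be simulated directly. In the sketch below, fire size drives both the number of firefighters and the damage, and neither of those two variables influences the other. All coefficients and noise levels are hypothetical values chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
n_fires = 2000

# HYPOTHETICAL model: Z (fire size) causes both X and Y; X never enters Y.
fire_size = [rng.uniform(1.0, 10.0) for _ in range(n_fires)]
firefighters = [2.0 * z + rng.gauss(0.0, 1.0) for z in fire_size]
damage = [3.0 * z + rng.gauss(0.0, 1.0) for z in fire_size]

# Raw correlation between firefighters and damage.
r_raw = pearson(firefighters, damage)

# Remove the contribution of fire size from each variable
# (the true coefficients are known here because the data are simulated).
firefighters_adj = [x - 2.0 * z for x, z in zip(firefighters, fire_size)]
damage_adj = [y - 3.0 * z for y, z in zip(damage, fire_size)]
r_adjusted = pearson(firefighters_adj, damage_adj)

print(round(r_raw, 3), round(r_adjusted, 3))
```

The raw correlation is strong even though firefighters have no effect on damage in this model, and it essentially vanishes once fire size is accounted for.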

  26. How to avoid these mistakes? • Consult your preferred statistician for help in the design of complicated experiments • Use basic descriptive statistics first (graphics, summary statistics,…) • Use common sense • Consider learning more statistics
