
The Presentation of Statistics in Clinical and Health Psychology Research




  1. The Presentation of Statistics in Clinical and Health Psychology Research Jeremy Miles Department of Health Sciences Susanne Hempel Centre for Reviews and Dissemination

  2. Introduction • Statistics in clinical and health psychology • Appropriate statistics used • Statistics appropriately presented • Graphical display • Verbal presentation

  3. Methodology • Reviewed 2003 volumes (4 issues) of • British Journal of Clinical Psychology • British Journal of Health Psychology • Looking for • Errors of statistical presentation / interpretation • Potential areas of improvement

  4. Results • BJCP: 29 papers reviewed • BJHP: 31 papers reviewed • 5 excluded (qualitative, narrative review) • Wide range of problems identified • Emerging themes • P-values • Inferential statistics • Effect Sizes • Reliability • Other Issues • 2 papers with no issues

  5. Statistical Significance

  6. Statistical Significance • Confusing and controversial issue • Misunderstood by students, researchers, teachers, textbook authors • (Broadly) two rival approaches to probability: • Fisher: report exact significance value • Neyman-Pearson: <0.05, or not • These are incompatible(!) • (Ignoring Bayes; ignoring meanings of probability)

  7. A Bastardised Approach • (From Gigerenzer, 1992) • The two approaches are misunderstood, and combined • “We must report the exact p” • “We must present results as <0.xx” • Recommended: • Exact probability values (e.g. Wilkinson et al., 1999)

  8. Results of p-value reporting • BJCP: 8 out of 29 reported exact p-values • 1 used strict N-P approach • BJHP: 4 out of 26 reported exact p-values

  9. More on P-Values • 2 papers reported p < 0 (.00) • True values were 0.000040, 0.000007 • Several reported arbitrary cutoffs • <0.07, <0.02 • Incorrect, but not deceptive
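
One way to avoid reports such as "p = .00" is to format the exact value so that it can never round down to zero. The sketch below is only an illustration: the function name `format_p`, the 0.001 switch-over point, and the two-significant-figure default are assumptions, not something the reviewed journals prescribe.

```python
def format_p(p: float, sig: int = 2) -> str:
    """Format an exact p-value without ever rounding it down to zero.

    Hypothetical helper: the 0.001 threshold and `sig` default are
    illustrative choices, not a journal requirement.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    if p >= 0.001:
        # Three decimals are enough to tell 0.048 from 0.057.
        return f"p = {p:.3f}"
    # Below 0.001, use scientific notation instead of printing 0.000.
    mantissa, exponent = f"{p:.{sig - 1}e}".split("e")
    return f"p = {mantissa} x 10^{int(exponent)}"


if __name__ == "__main__":
    # The last two values are the "true values" quoted on the slide.
    for value in (0.057, 0.048, 0.000040, 0.000007):
        print(format_p(value))
```

With this formatting, the two very small values are reported as 4.0 x 10^-5 and 7.0 x 10^-6 rather than as zero.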

  10. Misleading? • Not using exact p-values sometimes appears fishy: • Exact p-values for all except where p = 0.049, reported as p < 0.05 • Gave p > 0.05 (p = 0.057), p < 0.05 (p = 0.048) • p < 0.01 when p = 1 × 10⁻¹⁹ (others in same paper reported as p < 0.001) • p = 0.0104, described as “< 0.01”; p = 0.0123, described as “< 0.05”

  11. Finally: Mistakes • Good old errors • Very hard for readers and reviewers to spot, but still … • “F (1, 69) = 4.58, p < 0.001” • No, p = 0.035 • “F (1.76, 142.51) = 3.026, p = .058.” • No, p = 0.084 • F = 4.02, (df not given, but are 2, 62), p = 0.05. (information in table) • No, p = 0.022
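
Checks of this kind can be reproduced by recomputing p from the reported F and its degrees of freedom. The sketch below uses only the Python standard library: it evaluates the upper tail of the F distribution through the regularised incomplete beta function, integrated numerically with Simpson's rule. The function name `f_upper_tail_p` and the step count are assumptions made for illustration, and the approach is only intended for F > 0 and denominator df > 2. A statistics package would normally be used instead.

```python
import math


def f_upper_tail_p(f_stat: float, df1: float, df2: float, steps: int = 20000) -> float:
    """Upper-tail p-value for an F statistic (illustrative sketch).

    Uses p = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * F), where I_x is
    the regularised incomplete beta function, evaluated by Simpson's rule.
    """
    if f_stat <= 0 or df1 <= 0 or df2 <= 2:
        raise ValueError("sketch assumes F > 0, df1 > 0 and df2 > 2")
    if steps % 2:
        raise ValueError("Simpson's rule needs an even number of steps")
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_stat)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def integrand(t: float) -> float:
        if t <= 0.0:
            return 0.0  # t**(a - 1) vanishes at zero because a > 1
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log1p(-t) - log_beta)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    # Two of the reported results from the slide.
    print(round(f_upper_tail_p(4.58, 1, 69), 3))
    print(round(f_upper_tail_p(4.02, 2, 62), 3))
```

Both results should come out close to the corrected values quoted on the slide, and far from the p-values printed in the papers.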

  12. Inferential Statistics

  13. Reporting Test Statistics • Most people can’t interpret a test statistic • Even fewer are interested • Why report a test statistic exactly, and not the exact p? • “[no] significant interaction of both variables, F (1,67) = .289.” No p-value given (it’s 0.59) • F without df • No use at all (unless df can be worked out, but can be tricky or ambiguous)

  14. Standard Errors • Standard error is the standard deviation of the sampling distribution • Used to calculate t (and hence p-value) and CIs • 95% CIs given by: estimate ± t_{α/2} × SE • Value of t_{α/2} depends on df • df = 5, t_{α/2} = 2.57 • df = 100, t_{α/2} = 1.98 • Reported on its own, the standard error has little use.
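
The dependence of the interval on df can be made concrete with a small calculation. In the sketch below the two critical values are the ones quoted on the slide. The estimate (10.0), the standard error (2.0) and the function name `confidence_interval` are hypothetical, chosen only to show how the same SE turns into intervals of different width.

```python
def confidence_interval(estimate: float, standard_error: float, t_critical: float):
    """Return (lower, upper) limits as estimate ± t_critical * SE."""
    half_width = t_critical * standard_error
    return estimate - half_width, estimate + half_width


# Two-tailed 95% critical values of t, as quoted on the slide.
T_CRITICAL_95 = {5: 2.57, 100: 1.98}

if __name__ == "__main__":
    # Hypothetical estimate and standard error, for illustration only.
    for df, t_crit in T_CRITICAL_95.items():
        lower, upper = confidence_interval(10.0, 2.0, t_crit)
        print(f"df = {df}: 95% CI = {lower:.2f}, {upper:.2f}")
```

A reader given only "mean ± 1 SE" has to do this arithmetic, and needs the df, before the interval can be read off.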

  15. Graph shows mean ± 1 SE. The SE of the mean is not showing anything useful

  16. Graph shows mean +/- standard error. Data are repeated measures.

  17. Confidence Intervals • Generally recommended that confidence intervals are reported • Better idea of the likely value in the population • Not significant ≠ no effect • Appropriate confidence intervals: • BJCP: 3 (of 29) • BJHP: 4 (of 26)

  18. Inappropriate Confidence Intervals / Standard Errors • Compare two groups • Appropriate standard error / confidence interval is of the difference , not of each group

  19. Independent groups study: Significant difference? Yes. t = 2.7, df = 18, p = 0.016, difference 2.7, 95% CIs = 0.60, 4.80

  20. Repeated measures study: Significant difference? t = 2.25, df = 9, p = 0.051, difference = 2.7, 95% CIs = -0.02, 5.42. Trick question. It’s the same graph, and I haven’t given you enough information
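
The point of the two preceding slides can be sketched numerically: the same pair of means gives a different standard error of the difference depending on the design. The data and function names below are hypothetical, the independent-groups version assumes equal variances (pooled), and the paired version works on the within-person differences.

```python
import math
import statistics


def independent_se(group_a, group_b):
    """SE of the difference between two independent means (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return math.sqrt(pooled_var * (1 / n_a + 1 / n_b))


def paired_se(before, after):
    """SE of the mean difference when each person is measured twice."""
    diffs = [y - x for x, y in zip(before, after)]
    return statistics.stdev(diffs) / math.sqrt(len(diffs))


# Hypothetical scores, used only to illustrate the contrast.
before = [10, 12, 14, 16, 18]
after = [11, 14, 15, 18, 20]
mean_diff = statistics.mean(after) - statistics.mean(before)

if __name__ == "__main__":
    for label, se in (("independent", independent_se(before, after)),
                      ("paired", paired_se(before, after))):
        print(f"{label}: difference = {mean_diff:.2f}, SE = {se:.3f}, t = {mean_diff / se:.2f}")
```

The group means and their individual error bars are identical in both analyses, which is why a graph of per-group standard errors cannot tell the reader whether the difference is significant.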

  21. Effect Sizes

  22. Effect Sizes • More statistically significant = larger, more important effect? • No • Effect sizes describe the size of the effect • r, d, η², R²
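
A short calculation shows why significance and effect size are different things. The sketch below assumes two equal-sized independent groups and uses Cohen's d with a pooled standard deviation. The function names and the example numbers are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardised mean difference (mean_a - mean_b) / pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


def t_from_d(d: float, n_per_group: int) -> float:
    """t statistic implied by a given d with n_per_group people in each group."""
    return d * math.sqrt(n_per_group / 2)


if __name__ == "__main__":
    # The same small effect (d = 0.2) at two hypothetical sample sizes.
    for n in (20, 2000):
        print(f"n per group = {n}: d = 0.2, t = {t_from_d(0.2, n):.2f}")
```

The effect is equally small in both cases, but the larger sample turns it into a very large t and hence a very small p-value.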

  23. Reliability Reporting

  24. Reliability Reporting • Small, but important • Reliability is not a property of a test • It is a property of a test, in a population, at a particular time • Reliability should always be evaluated, and presented
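
Evaluating reliability in the sample at hand need not be laborious. As one possible illustration, the sketch below computes Cronbach's alpha, a commonly used internal-consistency coefficient. The slides do not name a specific coefficient, so this choice, the function name and the item scores are all assumptions.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items.

    item_scores holds one list per item, each containing one score per
    respondent, with respondents in the same order in every list.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(statistics.variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / statistics.variance(totals))


if __name__ == "__main__":
    # Hypothetical responses of five people to a three-item scale.
    sample = [
        [1, 2, 3, 4, 5],
        [2, 2, 3, 5, 5],
        [1, 3, 3, 4, 4],
    ]
    print(f"alpha in this sample = {cronbach_alpha(sample):.2f}")
```

Running the same function on a different sample of respondents will generally give a different value, which is the reason for reporting it for each study rather than quoting the test manual.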

  25. Stepwise Regression • Almost never appropriate • Small differences in samples can lead to large differences in results • 1 paper discusses differences between two stepwise regressions • Df are wrong (hence F, and p are also wrong) • Use of stepwise regression: • BJCP: 1 • BJHP: 2 (one not described as stepwise)

  26. A Collection of Smaller Issues

  27. Distributional Assumptions • Very few tests assume normal distribution of the variables • When sample sizes are at least moderate, normal distribution unimportant • Kolmogorov-Smirnov test examines significant difference from normality • Not important difference from normality (Field?) • 2 papers (BJCP) used the KS test • Non-parametric tests

  28. Other Miscellany • Mann-Whitney test described as comparing medians (it doesn’t necessarily) • Principal components analysis described as exploratory factor analysis (it’s not) • Expected values of chi-square test violated • Arithmetical errors in chi-square test • Correlation used as measure of agreement • We all know that it isn’t • Inappropriate dichotomisation of continuous variables • Never necessary
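
The point about correlation and agreement can be shown with a few numbers. In the hypothetical example below, one rater always scores five points higher than the other: the correlation is perfect, yet the raters never agree. The data and the function name `pearson_r` are illustrative only.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient computed from first principles."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ratings: rater B is always exactly 5 points above rater A.
rater_a = [1, 2, 3, 4, 5]
rater_b = [6, 7, 8, 9, 10]

r = pearson_r(rater_a, rater_b)
mean_gap = statistics.mean([b - a for a, b in zip(rater_a, rater_b)])

if __name__ == "__main__":
    print(f"r = {r:.2f}, mean difference between raters = {mean_gap}")
```

A measure of agreement has to be sensitive to the systematic gap, which the correlation ignores by construction.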

  29. Hall of Shame

  30. Conclusions

  31. Summary • Picture isn’t rosy • Errors are not limited to psychology • Garcia-Berthou and Alcaraz (2004) found errors in Nature and the British Medical Journal • There are a lot of areas for improvement

  32. Solutions? Short Term • More statistical refereeing? • More guidelines for reviewers • More reviewers with expertise in statistics • BJCP and BJEP have statistical reviewers • Rapid response? • Could be set up with the electronic journals • Work in other fields

  33. Solutions? Long Term • Statistical / methodological training? • Undergraduate? Postgraduate? CPD? • Work more closely with statisticians? • Common in other fields – MSc in Medical Statistics is possible, MSc in Psychological Statistics is not

  34. Final Thought Aaagggghhhhh! We just did a piece of qualitative research?
