Emily’s Experiment
Drawing by Pat Linse, Skeptics Society
A SKEPTICAL THINKER LOOKS AT SKEPTIC’S WORK PRODUCTS: A RE-ANALYSIS OF EMILY ROSA’S THERAPEUTIC TOUCH STUDY DATA
Australian College of Holistic Nurses, 5th International Conference 2002
Touching the Spirit: Ancient Wisdom in the Art and Science of Future Nursing
Hahndorf, SA, Australia
Thomas Cox RN, MS, MSW, MS (Nursing)
Doctoral Candidate, Virginia Commonwealth University School of Nursing
November 2002
OBJECTIVES
- Review Emily's experiment
- Review the conclusions of the authors
- Review the data
- Re-analyze the data
- Compare the conclusions to the data
- Do a power analysis: was the ‘experiment’ ever intended to be fair?
- Questions… maybe answers
Methodology of Emily's "Experiment"
- Practitioners were recruited for a 4th grader's science project
- Self-described TT practitioners; no verification of TT credentials
- Willing to engage in a "childish" research project
- Broad range of professions (phlebotomist?)
- Not clearly tied to the way TT is practiced: the testing method violated most TT assumptions
- Not clearly tied to nursing: too few nurses among the participants to draw inferences about nurses
ETHICAL ISSUES
- Guidelines for human subjects research
  - Informed consent
  - Institutional review board
- Intent to embarrass participants
- Participants deceived about the project:
  - Who was doing it
  - Why it was being done
  - What would happen with the results
- Why would failures from the first phase, confronted by researchers, agree to taping for a TV show?
- Failure to report results that would lead an objective, “ethical” researcher to reject the random guessing hypothesis
METHODOLOGICAL ISSUES
- Positioning of subjects
  - Inherently uncomfortable position
  - Not free to move or explore the human energy field (HEF)
- Subjects, not the researcher, were blinded
- Imagine testing soccer players' skill this way:
  - Player has their eyes covered
  - Player wears leg straps so they can’t move freely
  - Spin them around; don’t let them see the goal
  - Miss the goal: failure, "no skill"
  - Combine test scores across players
ASSERTIONS MADE BY THE AUTHORS
- The difference between left-hand and right-hand performance is statistically insignificant
- 123 or 122 (???) successes in 280 trials were consistent with random guessing
- TT is ineffective and should not be used by nurses
- Their testing procedures were fair
- Skills do not improve over time
RE-ANALYSIS OF THE ROSA DATA 1
Is the second-phase data, the phase videotaped for the Scientific American series hosted by Alan Alda, consistent with random guessing?
- One participant scores 1: unlikely, though in an unexpected direction
- Far too many of the participants score 5 or less: 12 of 13
- The average (0.408 correct) across all 130 trials in phase 2 is unlikely under random guessing, as the authors’ own test, reported in JAMA, showed; the authors ignored this evidence in their conclusion
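The first two bullets can be checked with a short binomial calculation. This is only a sketch: it assumes each of the 13 phase 2 participants did 10 trials (13 x 10 = 130), that guessing succeeds with probability 0.5, and that participants are independent. The function and variable names are illustrative, not from the slides.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# Assumption: 13 participants, 10 trials each, pure guessing (p = 0.5).
TRIALS_PER_PERSON = 10
PARTICIPANTS = 13
GUESS_RATE = 0.5

# Chance that one given guesser scores 1 or less out of 10.
p_score_le_1 = binom_cdf(1, TRIALS_PER_PERSON, GUESS_RATE)

# Chance that one given guesser scores 5 or less out of 10.
p_score_le_5 = binom_cdf(5, TRIALS_PER_PERSON, GUESS_RATE)

# Chance that at least 12 of 13 independent guessers score 5 or less.
p_12_or_more = 1 - binom_cdf(11, PARTICIPANTS, p_score_le_5)

print(f"P(score <= 1 of 10)        = {p_score_le_1:.4f}")
print(f"P(score <= 5 of 10)        = {p_score_le_5:.4f}")
print(f"P(at least 12 of 13 <= 5)  = {p_12_or_more:.4f}")
```

Note that the first figure is for a single, pre-specified participant; the chance that at least one of 13 guessers scores that low is larger.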
EMILY ROSA’S PHASE I RESULTS
RESULTS ARE CLEARLY ‘SYMMETRIC’ AROUND 5
EMILY ROSA’S PHASE II RESULTS
RESULTS ARE CLEARLY NOT SYMMETRIC AROUND ‘5’
- Authors should have been highly suspicious of their results
- Never should have published such questionable results
EMILY ROSA’S COMBINED RESULTS
RESULTS ARE NOT CLEARLY SYMMETRIC AROUND ‘5’
- Authors should have been suspicious of their results
- Never should have published such strong conclusions
PHASE II RESULTS
- Successes = 53
- Trials = 130
- Success proportion = 53/130 = 0.408
- P(Successes <= 53) = 0.0216
- About two chances in 100 that random guessers would score so low
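The tail probability on this slide is a lower-tail binomial sum. A minimal sketch of the calculation, assuming 130 independent trials and a guessing rate of 0.5 (the variable names are illustrative):

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


successes, trials = 53, 130  # phase 2 totals from the slide

proportion = successes / trials
p_lower_tail = binom_cdf(successes, trials)  # P(X <= 53) under pure guessing

print(f"Success proportion = {proportion:.3f}")
print(f"P(X <= {successes}) under guessing = {p_lower_tail:.4f}")
```

This is a one-sided probability. A two-sided test, which also counts equally extreme high scores, would roughly double it.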
WHAT THE DATA TELL US
- Authors’ confidence interval excludes 5.0, the expected number of successes in 10 trials under random guessing
- A better estimate of the success rate, .426, is lower than theirs and even less plausible
- How to calculate a better point estimate:
  - 21 independent estimates of the success rate, some based on 10, some on 20, and one on 30 repetitions
  - Combine these into a lower-variance estimate
- t value = 2.47, an unlikely result for binomial trials
- Why did the authors conclude consistency with guessing?
- The data are screaming “Something is Wrong”
- Authors should probably have been trying to hide this ‘data’, not publishing it
RE-ANALYSIS OF THE ROSA DATA 2
The hand data:
- Hand preference is not a TT principle; it may have been a total fabrication by the authors
- There was no need to test hand data. The authors:
  - ‘created’ a need to test
  - produced the data for the test
  - failed to test it correctly
  - misrepresented the implications of the data
- The data are inconsistent with their conclusions
- The Chi-square and Fisher’s Exact tests are significant; the authors misrepresented a major conclusion that they never should have even explored
CHI-SQUARE TEST DATA
Chi-square = 3.99
WHAT THE DATA TELL US 4
- Authors: no difference between hands
- Contingency table analysis: Chi-square = 3.99
- The Chi-square test is significant
- The authors misrepresent another significant result. Why?
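To see why a Chi-square of 3.99 counts as significant, it can be converted to a p-value. The sketch below assumes a 2 x 2 table (hand by outcome), which gives 1 degree of freedom; the slide does not state the degrees of freedom, so that is an assumption, and the function name is illustrative.

```python
from math import erfc, sqrt


def chi2_upper_tail_1df(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    With 1 df the statistic is a squared standard normal, so
    P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(x / 2))


chi_square = 3.99      # value reported on the slide
CRITICAL_05 = 3.841    # standard 5% critical value for 1 df (assumed 2 x 2 table)

p_value = chi2_upper_tail_1df(chi_square)

print(f"Chi-square = {chi_square}, p = {p_value:.4f}")
print("Exceeds the 5% critical value:", chi_square > CRITICAL_05)
```

The statistic only just clears the 5% threshold, so the result is sensitive to choices such as a continuity correction.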
WHAT THE DATA TELL US 5
- Rebecca Long, a skeptic, did a replication of Emily’s experiment
- Long’s results suggest that the ratio of correct answers to trials should have been more like 70 to 75%
- Using Long’s rate as the probability of success in Emily’s study: P[Emily’s data | Long’s data] = 0.0
- Probability that Emily’s data are contrived/cooked/fake = 1.0 if Long’s data give an accurate success rate
- Long’s results are also unlikely if Emily’s data are accurate, i.e. two skeptics do essentially the same study and not only do they not agree, their studies are inconsistent with each other
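The "0.0" above is a rounded binomial tail probability. A rough sketch of that calculation, assuming 123 successes in 280 trials (the total quoted from the authors, who also give 122) and treating Long's 70% and 75% as the true success rate; the variable names are illustrative:

```python
from math import comb


def binom_cdf(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Assumption: combined totals from the authors' report (123 of 280).
successes, trials = 123, 280

# Chance of scoring this low if the true success rate were Long's 70% or 75%.
p_at_70 = binom_cdf(successes, trials, 0.70)
p_at_75 = binom_cdf(successes, trials, 0.75)

print(f"P(X <= {successes} | rate = 0.70) = {p_at_70:.3e}")
print(f"P(X <= {successes} | rate = 0.75) = {p_at_75:.3e}")
```

The code only measures how poorly the two data sets fit a common success rate; it does not by itself say which study, if either, is at fault.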
THE AUTHORS SHOULD HAVE CONSIDERED
- Subjects do not need to score more than 50% to demonstrate a skill
- Random guessing will average 50% over the long run; results should have been about 50% in these trials, not so low
- Instances where less than 50% is OK:
  - Stockbroker: 20% winners might do well
  - Cure rates: broad-spectrum antibiotic treatment for earache
- 50% could be very, very good: Larry Bird's career field goal percentage was 49.6%
- The ‘value’ of a skill is determined not just by how frequently it is demonstrated but by the difference in well-being that results: the “Theory of Utility” in economics
Authors Do Not Consider Alternative Explanations
- May be just a calibration problem: could test subjects ‘learn’ how to report results correctly?
- Results are not consistent with random guessing; they actually support TTPs
- The authors believe TTPs should consider that they are randomly guessing, but they do not feel the need to consider that TTPs are not randomly guessing, despite their own data
Concerns about Authors' Conclusions & Data
- Skill with hands is significant
- Overall results are inconsistent with random guessing
- Should not have reported/published their results
- Are the authors credible? Do the authors have integrity?
- Why have they not withdrawn this article?
- This study will probably never be replicated
NEXT STEPS
- To be taken seriously, this study would have to be replicable
- Would have to conform to TT principles and procedures; centering is an essential aspect
- Researcher bias must be controlled: ‘skeptics’ are too biased and, since Rosa, too ethically suspect to do such research
- Participants may need to be trained to report findings correctly
- There has to be a search to identify practitioners who perform well; again, skeptics cannot do this because they want to look for people who will fail, not succeed (the Long study ignored success)
- Have to improve methods over time; can’t jump the gun on publishing findings, either in support or contrary
- If we can find subjects who do well in this sort of procedure:
  - Study what they do and how they do it
  - Use the information in training people to use TT
  - Use the information in evaluating competency?
CONCLUDING REMARKS 1
- Authors have shown a significant non-detection rate
- JAMA should withdraw support for the article’s conclusions
- Unlikely that JAMA did any statistical review at all
- What has happened at JAMA since the article was published:
  - Editor of JAMA fired
  - Guidelines for authorship radically changed
  - Authors cautioned to be prepared to defend their work
- Authors should explain why it happened
- They seem to think that acceptance of the article by JAMA is the only criterion for evaluating merit, and refuse to discuss it
- Authors refuse to discuss discrepancies, which is unscientific
CONCLUDING REMARKS 2
- Serious study should proceed without skeptical assistance or interference
- The authors’ work cannot be replicated and nobody should even try
- This type of research needs to be a search for excellence, not an effort to disprove by finding that no average effect exists; one actual person who could do this, as occurred in the Long study, would provide compelling evidence
- Efforts to control against successful guessing create as many problems as they solve; must be more creative than the authors