reporting ‘almost significant’ results: follow-up to PRIMENT stats seminar (27 August 2013); stats methodologists meeting, 10 September 2013
plan • brief summary of Priment seminar • deconstructing the phrase “…trend towards statistical significance” • investigating how p-values move • what p-values tell us • how best to report near-significant results?
The trap of trends to statistical significance: how likely it really is that a near significant P value becomes more significant with extra data. John Wood, Nick Freemantle, Michael King, Irwin Nazareth
“…a trend towards statistical significance…” • …is a very popular way of reporting ‘non-significant’ results where the p-values weren’t ‘too far’ above some threshold (usually p=0.05) • (e.g.) “…there was a trend toward a lower risk of any treatment failure … (hazard ratio ... 0.86; 95% CI, 0.73 to 1.01; P = 0.06)” • is this a reasonable use of words? • does it make sense to call it a ‘trend’?
‘trends’ imply movement • we’ve collected data comparing 2 treatments and found the 2-sided p-value (2p) to be just above 0.05 (say) • if this is a ‘trend towards significance’ then the following should be true: • if we run the experiment longer (k% more data)… • …then the p-value ‘should’ drop (get more significant) • what are the chances that it does?
(aside) how we might calculate that
• current data $\{x_i\}$, each $x_i \sim N(\mu, 1)$, is 100 (pairs of) observations; each contributes an estimate of the treatment effect
• overall current estimate $\bar{x} \sim N(\mu, 0.01)$ is greater than 0 with 2-sided significance $2p$
• can express $\bar{x}$ in terms of $p$: $\bar{x} = 0.1\,\Phi^{-1}(1-p)$
• our current knowledge about $\mu$ is reasonably represented by the likelihood, so (loosely) $\mu \sim N(\bar{x}, 0.01)$
• now add in an extra $k$ pairs of observations ($k$% more data), which will have a mean of $\bar{y}$: $\bar{y} \mid \mu \sim N(\mu, 1/k)$, so $\bar{y} \sim N(\bar{x}, 0.01 + 1/k)$
• significance is unchanged if (updated mean)/(updated SEM) = (old mean)/(old SEM), i.e. $\frac{100\bar{x} + k\bar{y}}{100 + k}\,\sqrt{100 + k} = 10\,\bar{x}$
• have the distribution of $\bar{y}$, so can calculate chance of significance moving ‘backwards’
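The calculation in the aside can be sketched in a few lines of Python. This is a minimal illustration under the slide's stated assumptions (unit-variance observations, 100 current pairs, and the loose treatment of μ as N(x̄, 0.01)); the function name `prob_p_moves_backwards` and its parameters are hypothetical and are not taken from the seminar.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def prob_p_moves_backwards(two_sided_p, k, n=100):
    """Chance that adding k more pairs makes the p-value larger.

    Assumes (as on the slide) unit-variance observations, n current pairs,
    and a predictive distribution y_bar ~ N(x_bar, 1/n + 1/k).
    """
    # Current z-statistic implied by the 2-sided p-value (slide: p = two_sided_p / 2).
    z = STD_NORMAL.inv_cdf(1 - two_sided_p / 2)
    x_bar = z / sqrt(n)  # x_bar = SEM * z, with SEM = 1 / sqrt(n)

    # Break-even mean of the new data: solve
    # [(n*x_bar + k*y_bar) / (n + k)] * sqrt(n + k) = sqrt(n) * x_bar for y_bar.
    y_break_even = x_bar * (sqrt(n) * sqrt(n + k) - n) / k

    # Predictive distribution of the mean of the k new pairs.
    predictive = NormalDist(mu=x_bar, sigma=sqrt(1 / n + 1 / k))

    # The p-value moves 'backwards' when y_bar falls below the break-even value.
    return predictive.cdf(y_break_even)


if __name__ == "__main__":
    for extra in (10, 50, 100):
        chance = prob_p_moves_backwards(0.05, extra)
        print(f"k = {extra:3d}: chance p gets larger = {chance:.3f}")
```

Under these assumptions, a result sitting at 2p = 0.05 that is extended with 10% more data (k = 10) has a chance of roughly 0.38 of becoming less significant. The chance falls as k grows, but only slowly.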
summary • a p-value ‘on the brink’ would be quite likely to move the ‘wrong’ way if we were able to add more data • therefore, talking of ‘trends to significance’ gives a misleading impression • p-values have much more variability associated with them than we’d like to think (and not just when H0 is true)
investigating how p-values move • simple comparative trial; up to n = 250 per group; effect size = 0.3
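One way to explore this is to simulate a single trial and recompute the p-value as each pair of participants is added. The slide gives only the setup (two groups, up to 250 per group, effect size 0.3); the sketch below additionally assumes normally distributed outcomes with known unit SD and a two-sided z-test, and the names `p_value_path` and `fraction_of_upward_steps` are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value_path(effect_size=0.3, max_n=250, start_n=10, seed=0):
    """Return [(n, two_sided_p), ...] as one simulated trial accrues data.

    Assumption: outcomes are N(0, 1) in group A and N(effect_size, 1) in
    group B, and the SD is treated as known (z-test rather than t-test).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    sum_a = 0.0
    sum_b = 0.0
    path = []
    for n in range(1, max_n + 1):
        # Recruit one more participant into each group.
        sum_a += rng.gauss(0.0, 1.0)
        sum_b += rng.gauss(effect_size, 1.0)
        if n >= start_n:
            mean_diff = (sum_b - sum_a) / n
            z = mean_diff / sqrt(2.0 / n)  # SE of a difference of two means
            p = 2.0 * (1.0 - std_normal.cdf(abs(z)))
            path.append((n, p))
    return path


def fraction_of_upward_steps(path):
    """Share of consecutive steps on which the p-value got larger."""
    ups = sum(1 for (_, p0), (_, p1) in zip(path, path[1:]) if p1 > p0)
    return ups / (len(path) - 1)


if __name__ == "__main__":
    for seed in range(3):
        path = p_value_path(seed=seed)
        final_n, final_p = path[-1]
        print(f"seed {seed}: final p at n={final_n} is {final_p:.4f}; "
              f"p rose on {fraction_of_upward_steps(path):.0%} of steps")
```

Plotting p against n for several seeds shows how erratically individual paths move, even when the true effect is not zero.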
what question do p-values answer? • not: “are the effects of A and B different?” (with “no” as a possible answer) • but: “can we be confident of the direction from A to B: is it ‘up’, ‘down’ or ‘uncertain’?...” • …the follow-up question is about ‘how much’ • J. W. Tukey (1991). The Philosophy of Multiple Comparisons. Statistical Science 6, 100–116.
how should you report near-significant results? • not as ‘trends towards significance’ • but this is certainly not an argument for ignoring ‘interesting hints’ (Tukey again) • so, a word like ‘hint’ perhaps, and always with the CI • views?