Life After P-hacking (APS May 2013, Washington DC). With minor edits for posting. Photo not necessary. Leif Nelson, UC Berkeley. Uri Simonsohn, Penn (gave the talk). Joe Simmons, Penn also. Definition. p-hacking: exploiting researchers’ degrees of freedom seeking p<.05.
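To make the definition concrete, here is a minimal simulation sketch of one researcher degree of freedom, data peeking (testing, adding subjects, and testing again). It is not from the talk: the look schedule (20, 30, 40, 50 per cell), the number of simulations, and the use of a z-test with known SD = 1 in place of a t-test are all assumptions chosen to keep the example dependency-free.

```python
import random
from statistics import NormalDist, fmean

def z_test_p(a, b):
    """Two-sided p-value for a difference in means.
    Assumption: equal group sizes and known SD = 1 in both groups."""
    n = len(a)
    z = (fmean(a) - fmean(b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate(n_sims=2000, looks=(20, 30, 40, 50), alpha=0.05, seed=1):
    """Return (false-positive rate with peeking, rate with one fixed test).
    Both groups are drawn from the same distribution, so every
    significant result is a false positive."""
    rng = random.Random(seed)
    peek_hits = fixed_hits = 0
    for _ in range(n_sims):
        a = [rng.gauss(0, 1) for _ in range(looks[-1])]
        b = [rng.gauss(0, 1) for _ in range(looks[-1])]
        # Test after each batch of subjects (hypothetical look schedule).
        p_values = [z_test_p(a[:n], b[:n]) for n in looks]
        peek_hits += any(p < alpha for p in p_values)   # stop at first p < alpha
        fixed_hits += p_values[-1] < alpha              # single test at final n
    return peek_hits / n_sims, fixed_hits / n_sims

if __name__ == "__main__":
    peek_rate, fixed_rate = simulate()
    print(f"peeking: {peek_rate:.3f}  fixed n: {fixed_rate:.3f}")
```

Because a significant final test also counts as a hit under peeking, the peeking rate can never be lower than the fixed-n rate on the same simulated data; the open question is only how much higher it is.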
Life after p-hacking • n>50 • Direct replications • 21 words • Compromise writing • Who to hire • What about Bayesian?
~ Median study: n=20 • False-Positive Psych: n>20 • What can you reliably detect with n=20? • Mturk study. • N=674 • Why not published ds?
n=20 is enough for: • Men taller than women n=6 • People above median age closer to retirement n=10 • Women, more shoes than men n=15
n=20 is not enough for: • People who like spicy food are more likely to like Indian food n = 27 • Liberals rate social equality as more important than do conservatives n = 34 • People who like eggs report eating egg salad more often n = 47 • Men weigh more than women n = 47 • Smokers think smoking is less likely to kill someone than do non-smokers n = 146
Are you studying a bigger effect than: • Men weigh more than women? • If not, use n>50
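The slides do not say how the n values were computed. As a rough sketch of the underlying trade-off, the standard normal-approximation formula for a two-cell comparison relates the standardized effect size d to the per-cell sample size. The 80% power target, the two-sided α = .05, and the function names below are assumptions for illustration, and the approximation runs slightly below an exact t-test calculation.

```python
from math import ceil, sqrt
from statistics import NormalDist

def _z_total(alpha, power):
    """Sum of the critical z for a two-sided test and the z for the target power."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)

def n_per_cell(d, alpha=0.05, power=0.80):
    """Approximate per-cell n needed to detect effect size d (Cohen's d)."""
    return ceil(2 * (_z_total(alpha, power) / d) ** 2)

def detectable_d(n, alpha=0.05, power=0.80):
    """Approximate smallest d detectable with n subjects per cell."""
    return _z_total(alpha, power) * sqrt(2 / n)

if __name__ == "__main__":
    print(n_per_cell(0.5))               # per-cell n for a "medium" effect
    print(round(detectable_d(20), 2))    # what n=20 per cell can detect
```

Under these assumptions, n=20 per cell reliably detects only effects of roughly d = 0.9 or larger, which is in the spirit of the slide's question about whether your effect is bigger than the weight difference between men and women.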
Life after p-hacking • n>50 • Direct replications • 21 words • Compromise writing • Who to hire • What about Bayesian?
Estimates are way off Subjects confused? Big outliers
Study 1? p < .03
Run calories study again. • Same exclusion rule.
Why not just conceptual replication? • Restart p-hacking clock • Failures do not count
Replications • Conceptual • Rule out confounds • Rule in generalizability • Direct • Rule out false-positive
Life after p-hacking • n>50 • Direct replications • 21 words (Google it) • Compromise writing • Who to hire • What about Bayesian?
How can an organic researcher compete? • If you determined sample size in advance: say it. • If you did not drop variables: say it. • If you did not drop conditions: say it.
21 Word Solution (get .pdf here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2160588) • Organic Farmer • Organic Researcher • Footnote 1: “We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study.”
Life after p-hacking • n>50 • Direct replications • 21 words • Compromise writing • Who to hire • What about Bayesian?
Compromise writing • While reviewers still in dark ages. • Have it both ways. • “Clean” version in main text • All studies “worked” & < 2500 words • Supplement/footnote • n=100 n=150 • p=.08 w/o exclusion • Data and materials online • Only reformers read small print • Organic 21 words applies. • Everybody likes the paper
Life after p-hacking • n>50 • Direct replications • 21 words • Compromise writing • Who to hire • What about Bayesian?
What’s the alternative to counting papers? • Rookies: Best 1 • Tenure: Best 3 • Full: Best 5 Try it. It is a powerful question. What’s her best paper?
Life after p-hacking • n>50 • Direct replications • 21 words • Compromise writing • Who to hire • What about Bayesian? Only speak for myself here. My prior: Bayesians will be unhappy in 3 2 1
P-hacking also invalidates Bayesian results • Let me say that again
Bayesian proposals for Psych 1) Bayesian t-test • Replications use it sometimes • Turns out • α = 5% 2) Bayesian estimation • Latest JEP:G • Turns out • Changes nothing 1%
t-test “vs” Bayesian Estimation changes nothing • How similar? Results change by less than if we dropped 1 observation at random.
But! • Isn’t data-peeking OK for Bayes? • Not when used for hypothesis testing • Also: • Dropped subjects, measures, conditions invalidate all inference.
P-hacking : Bayesian stats :: Drunk driving : leather seats • Good reasons to go Bayesian do not include p-hacking.
Life after p-hacking • n>50 • Direct replications • 21 words • Compromise writing • Who to hire • What about Bayesian? Leif Nelson UC Berkeley Joe Simmons Penn Only speak for myself here.