Statistical power in experiments in which samples of participants respond to samples of stimuli

Jake Westfall, University of Colorado Boulder
David A. Kenny, University of Connecticut
Charles M. Judd, University of Colorado Boulder

Studies involving participants responding to stimuli
[Table: hypothetical data matrix, indexed by subject number]

Just in the domain of implicit prejudice and stereotyping:
• IAT (Greenwald et al.)
• Affective Priming (Fazio et al.)
• Shooter task (Correll et al.)
• Affect Misattribution Procedure (Payne et al.)
• Go/No-Go task (Nosek et al.)
• Primed Lexical Decision task (Wittenbrink et al.)
• Many non-paradigmatic studies

Hard questions
• “How many stimuli should I use?”
• “How similar or variable should the stimuli be?”
• “When should I counterbalance the assignment of stimuli to conditions?”
• “Is it better to have all participants respond to the same set of stimuli, or should each participant receive different stimuli?”
• “Should participants make multiple responses to each stimulus, or should every response by a participant be to a unique stimulus?”

Power analysis in crossed designs
• Power is determined by several parameters:
  • 1 effect size (Cohen’s d)
  • 2 sample sizes
    • p = # of participants
    • q = # of stimuli
  • A set of Variance Partitioning Coefficients (VPCs)
    • VPCs describe what proportion of the random variation in the data comes from which sources
    • Different designs depend on different VPCs
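As a rough illustration of what a VPC is, the sketch below divides each variance component by the total random variance so that the shares sum to 1. The component names and numeric values are hypothetical and chosen only for illustration; they are not estimates from the talk or the article.

```python
# Hypothetical variance components for a crossed participants-by-stimuli design.
# Both the names and the values are illustrative assumptions, not results.
variance_components = {
    "participant_intercept": 0.6,
    "stimulus_intercept": 0.3,
    "participant_by_condition": 0.2,
    "stimulus_by_condition": 0.1,
    "participant_by_stimulus": 0.3,
    "residual": 0.5,
}


def variance_partitioning_coefficients(components):
    """Return each component's share of the total random variance."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: value / total for name, value in components.items()}


vpcs = variance_partitioning_coefficients(variance_components)
for name, share in vpcs.items():
    print(f"{name}: {share:.2f}")
```

Working with proportions rather than raw variances means the same set of VPCs can be reused across outcome measures with different scales.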

Four common experimental designs

For power = 0.80, need q ≈ 50

For power = 0.80, need p ≈ 20

Maximum attainable power
• In crossed designs, power asymptotes at a maximum theoretically attainable value that depends on:
  • Effect size
  • Number of stimuli
  • Stimulus variability
• Under realistic assumptions, maximum attainable power can be quite low!
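One way to see why power levels off is to let the number of participants grow without bound, so that only stimulus sampling error remains. The sketch below is a simplified approximation, not the calculation used in the article or the power app. It assumes q stimuli split evenly between two conditions, each stimulus appearing in only one condition, total random variance standardized to 1, and a normal approximation in place of a t-test (which overstates power when q is small). The function name and the `stimulus_vpc` parameter are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def max_attainable_power(d, q, stimulus_vpc, alpha=0.05):
    """Approximate power in the limit of infinitely many participants.

    Assumptions (illustrative, not taken from the slides):
    - q stimuli in total, q/2 per condition, each used in one condition only
    - stimulus_vpc is the share of random variance due to stimulus intercepts
    - normal approximation to the test statistic (ignores finite df)
    """
    if q < 2 or not 0 < stimulus_vpc <= 1:
        raise ValueError("need q >= 2 and 0 < stimulus_vpc <= 1")
    # Each condition mean averages q/2 stimuli: variance stimulus_vpc / (q/2).
    # The difference of two such means has variance 4 * stimulus_vpc / q.
    std_error = 2.0 * sqrt(stimulus_vpc / q)
    noncentrality = d / std_error
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Two-sided rejection probability under the alternative.
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


# Adding participants cannot push power past these ceilings; only q or
# stimulus variability can move them.
for q in (8, 16, 32, 64):
    print(q, round(max_attainable_power(d=0.5, q=q, stimulus_vpc=0.3), 3))
```

Because the standard error in this limit depends only on q and the stimulus variability, the ceiling is fixed once the stimulus set is chosen.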

To obtain max. power = 0.9…
• Pessimist: q = 86
• Realist: q = 20 to 50
• Optimist: q = 11

Implications of maximum attainable power
• Think hard about your experimental stimuli before you begin collecting data!
• Once data collection begins, maximum attainable power is pretty much determined.
• Even the most optimistic assumptions imply that we should use at least 11 stimuli per between-stimulus condition
  • Based on achieving max. power = 0.9 to detect a medium effect size (d = 0.5)

What about time-consuming stimulus presentation?
• Assume that responses to each stimulus take about 10 minutes (e.g., film clips).
• Power analysis says we need q = 60 to reach power = 0.8 (based on having p = 60)
• But then it would take about 10 hours for a participant to respond to every stimulus!
• The highest feasible number of responses per participant is, say, 6 (about one hour)
• Are we doomed to have low power? No!

Stimuli-within-Block designs
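The slide's diagram is not preserved here, so the following is only one reading of the idea: participants and stimuli are both split into blocks, and each participant responds only to the stimuli in their own block. The sketch uses the numbers from the film-clip example (p = 60, q = 60, 6 responses per participant, about 10 minutes each). The function name, the fixed seed, and the dictionary layout are assumptions made for illustration.

```python
import random


def assign_blocks(n_participants, n_stimuli, n_blocks, seed=0):
    """Randomly partition participants and stimuli into equal-sized blocks."""
    if n_participants % n_blocks or n_stimuli % n_blocks:
        raise ValueError("participants and stimuli must divide evenly into blocks")
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    participants = list(range(n_participants))
    stimuli = list(range(n_stimuli))
    rng.shuffle(participants)
    rng.shuffle(stimuli)
    p_per = n_participants // n_blocks
    s_per = n_stimuli // n_blocks
    return [
        {
            "participants": participants[b * p_per:(b + 1) * p_per],
            "stimuli": stimuli[b * s_per:(b + 1) * s_per],
        }
        for b in range(n_blocks)
    ]


# 60 participants and 60 stimuli in 10 blocks: 6 participants x 6 stimuli each.
blocks = assign_blocks(n_participants=60, n_stimuli=60, n_blocks=10)
minutes_per_participant = len(blocks[0]["stimuli"]) * 10
print(len(blocks), "blocks;", minutes_per_participant, "minutes per participant")
```

Under this layout every one of the 60 stimuli is still used in the experiment, while each session stays at about one hour.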

Standard error reduced by factor of 2.3!

The end
URL for power app: JakeWestfall.org/power/
Article reference: Westfall, J., Kenny, D. A., & Judd, C. M. (in press). Statistical Power and Optimal Design in Experiments in Which Samples of Participants Respond to Samples of Stimuli. Journal of Experimental Psychology: General.
