Understand the importance of specific contrasts, effect sizes, and multiplicity in evaluating research outcomes. Learn about linear contrasts, pairwise comparisons, and the distinction between planned and unplanned comparisons. Explore the development of motor skills in children through a practical study example.
Next Friday (Week 9) • Evaluating research, class test • First ten minutes of lecture (2.05-2.15) • Please come a little early • Please sit one seat space apart if possible • Please do not talk once seated, until the test finishes • There will be a lecture after the test
Learning objectives • specific contrasts are sometimes more useful than ANOVA main effects • linear contrasts and pairwise comparisons are important examples of contrasts • effect size can be more relevant than significance • multiplicity affects the interpretation of results • the distinction between planned and unplanned comparisons affects the interpretation of the p-value
Study: Development of motor skill • 50 children at five ages (11, 12, 13, 14, 15) • record how well they play a new video game • Is Age a good predictor of Game Score? • Age 11 12 13 14 15 • Score 25 30 40 50 55 • Example from Rosenthal, Rosnow, and Rubin (2000)
ANOVA
Source         SS       df    MS      F      p
Age levels      6,500    4    1,625   1.03   .40
Within error   70,875   45    1,575
• Not significant! • Should we conclude that age is not a useful predictor?
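As a minimal sketch (not from the original slides): the one-way ANOVA above could be requested in SPSS with something like the following, assuming the data are in long format with hypothetical variable names score and age_group.
ONEWAY score BY age_group
  /STATISTICS DESCRIPTIVES.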
Should we conclude that age is not a useful predictor? • The ANOVA main effect did not use the information about the order of the ages • ANOVA tests an unfocused question: "are there any differences among the five age levels?" • A more focused question gives a more powerful test
A specific contrast • Choose • a weight for each level • weights reflect the contrast you want to test • weights add up to zero • Age 11 12 13 14 15 • Contrast -2 -1 0 1 2 • The contrast weights represent a specific model: the form you expect the relationship to take
ANOVA: scores are different at different ages • Linear contrast: scores go up in a straight line as age increases • In this example, the linear contrast is statistically significant: t(45) = 2.02, p = .025
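A minimal sketch of requesting this linear contrast in SPSS, again assuming the hypothetical variables score and age_group, with the five age groups coded 1-5 in increasing order; SPSS then reports a t-test for the contrast alongside the overall ANOVA.
ONEWAY score BY age_group
  /CONTRAST= -2 -1 0 1 2.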
Pairwise comparisons • Overall main effect often not an especially interesting hypothesis • Week 5 ANOVA tested whether the average comfort score was different for different drugs (main effect of 'Drug') • Effect significant, but what can you conclude? • "The drugs did not all have the same effect"
Pairwise comparisons • A more interesting question would be: 'Is aspirin more effective than tylenol?' • When two groups are compared, it's called a pairwise comparison • You can express a pairwise comparison as a contrast too: • Drug: Aspirin Tylenol Nuprin Bufferin • Contrast: +1 -1 0 0
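The same pairwise contrast sketched in SPSS syntax, assuming hypothetical variables comfort and drug, with drug coded 1 = aspirin, 2 = tylenol, 3 = nuprin, 4 = bufferin.
ONEWAY comfort BY drug
  /CONTRAST= 1 -1 0 0.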
Effect size • If aspirin is significantly better than tylenol, should we stop ordering tylenol for the pharmacy? • Significance level (p-value) & sample size • a very large sample can detect tiny effects • too small a sample can miss even a large effect • A very small p (e.g. p = .001) does not in itself mean a strong effect • Significance and effect size are different things
To measure effect size • d = (M1 - M2) / s • where M1 and M2 are the respective group means and s is an estimate of the population standard deviation • 0.2 is a "small" effect; 0.8 is a "large" effect (Cohen, 1977)
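A purely hypothetical illustration of the formula: if the aspirin group mean were 70, the tylenol group mean 62, and the pooled standard deviation 20, then d = (70 - 62) / 20 = 0.40, somewhere between Cohen's "small" and "large" benchmarks.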
Multiplicity • Take 15 measures of individual differences • Correlate each with all the others • There will be 105 different correlations • So we expect about 5 of them to reach the 5% p-value (.05) even if there are no real relationships
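The arithmetic behind those figures: 15 measures give 15 × 14 / 2 = 105 distinct pairs, and with a .05 criterion the expected number of "significant" correlations under the null hypothesis is 105 × .05 ≈ 5.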
Not appropriate to claim statistical significance for results in such circumstances • Choice: • use a stricter, more conservative, criterion • attempt to replicate your result
More conservative criterion • Bonferroni adjustment • For 105 comparisons • set required p-value to 0.05 / 105 • Simple approach, wide applicability
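Worked out for this example: the adjusted criterion is .05 / 105 ≈ .00048, so an individual correlation would need p < .00048 before it could be declared significant.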
Replication • Does the result continue to appear? • If it is real, it should appear again in another study • Meta-analysis takes this method further by aggregating results from several studies
Planned and unplanned comparisons • Planned (“a priori”) • contrast envisaged at the outset • follows from the logic of the study design • Treat significance values straightforwardly
Unplanned comparisons • Unplanned (“post hoc” tests) • chosen on the basis of looking at the data • often – is an unexpected difference or pattern statistically reliable? • Multiplicity issue • -- even if you actually do just one, effectively you looked at them all
Unplanned comparisons • Choice: • use a stricter, more conservative, criterion • Bonferroni-adjusted tests • special-purpose tests, e.g. Tukey HSD • attempt to replicate your result
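A minimal sketch of requesting such adjusted pairwise tests in SPSS, using the hypothetical comfort and drug variables from the earlier example; both the Tukey HSD and Bonferroni-adjusted comparisons then appear in the output.
ONEWAY comfort BY drug
  /POSTHOC= TUKEY BONFERRONI ALPHA(0.05).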
Learning objectives • specific contrasts are sometimes more useful than ANOVA main effects • linear contrasts and pairwise comparisons are important examples of contrasts • effect size can be more relevant than significance • multiplicity affects the interpretation of results • the distinction between planned and unplanned comparisons affects the interpretation of the p-value
Getting a contrast in SPSS • Syntax window (start setting up the ANOVA, then choose Paste) • For a two-way ANOVA • IVs: a (2 levels) x b (4 levels) • DV: y • To contrast the four means within b: • Note: the F-ratio for this contrast is bigger than if you analysed b on its own in a one-way ANOVA, because including the extra predictor a reduces the error variance.
glm y by a b
  /contrast(b) = special (0 0 1 -1).
• Actually, the following is what GLM produces if you set up a two-way ANOVA, and the same contrast can be added:
UNIANOVA
  y BY a b
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /CRITERIA = ALPHA(.05)
  /DESIGN = a b a*b
  /contrast(b) = special (0 0 1 -1).