Analysis of Variance (ANOVA) Quantitative Methods in HPELS 440:210
Agenda • Introduction • The Analysis of Variance (ANOVA) • Hypothesis Tests with ANOVA • Post Hoc Analysis • Instat • Assumptions
Introduction • Recall: There are two possible scenarios when obtaining two sets of data for comparison: • Independent samples: The data in the first sample are completely INDEPENDENT of the data in the second sample. • Dependent/Related samples: The two sets of data are DEPENDENT on one another. There is a relationship between the two sets of data.
Introduction • Three or more data sets? • If the three or more sets of data are independent of one another → Analysis of Variance (ANOVA) • If the three or more sets of data are dependent on one another → Repeated Measures ANOVA
Introduction: Terminology • Factor: Synonym of independent variable • Level: The treatment conditions that make up the factor or independent variable • Example: What is the effect of grade (1st, 2nd, 3rd) on IQ? • Dependent variable: IQ • Factor: Grade • Levels (3): 1st, 2nd and 3rd grades
Introduction: Terminology • Between-Treatment Variance: Variance between the treatments/levels • As the between-treatment variance increases: • The statistic increases • The p-value decreases • Greater chance of rejecting the H0
Introduction: Terminology • Within-Treatment Variance: Variance within the treatments/levels • As the within-treatment variance increases: • The statistic decreases • The p-value increases • Lesser chance of rejecting the H0
Recall the Independent-Measures t-Test • If there was a large difference between the means (between variance), t got bigger • Why? • t = (M1 – M2) / s(M1–M2) • The t formula can be thought of as a ratio of: • Between variance (M1 – M2) • Within variance (s(M1–M2)) • Several scenarios can occur:
• Small between variance • Large within variance • t = BV / WV = value near zero • Accept or reject the H0?
• Large between variance • Large within variance • t = BV / WV = value near 1.0 • Accept or reject the H0?
• Small between variance • Small within variance • t = BV / WV = value near 1.0 • Accept or reject the H0?
• Large between variance • Small within variance • t = BV / WV = value greater than 1.0 • Accept or reject the H0?
Introduction: The F-Ratio • The ANOVA statistic (F) is a ratio of between variance to within variance • Distinction from the t-test: Three or more groups
The F Distribution • Plot all possible F-ratios → F distribution • There is a family of F distributions • As df increases, the distribution becomes narrower • F-ratios are always positive in value • Computed with two variances • Variances are always positive! • F distribution is skewed • Most values cluster around 1.0 • Figure 13.8 (p 413)
ANOVA • Statistical Notation: • k = number of treatment conditions (levels) • n = number of scores in each treatment level • N = total number of scores • N = kn if sample sizes are equal • T = ΣX for any given treatment level • G = ΣT (grand total) • MS = mean square = variance
ANOVA • Formula Considerations: • SSbetween = Σ(T²/n) – G²/N • SSwithin = ΣSS inside each treatment • SStotal = SSwithin + SSbetween • SStotal = ΣX² – G²/N
ANOVA • Formula Considerations: • dftotal = N – 1 • dfbetween = k – 1 • dfwithin = Σ(n – 1) • dfwithin = Σdf in each treatment
ANOVA • Formula Considerations: • MSbetween = s²between = SSbetween / dfbetween • MSwithin = s²within = SSwithin / dfwithin • F = MSbetween / MSwithin
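The notation and formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the course's own procedure: the function name `one_way_anova` and the three small samples are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def one_way_anova(groups):
    """Return SS, df and F for an independent-measures one-way ANOVA.

    `groups` is a list of samples (one list of scores per treatment level).
    """
    k = len(groups)                      # number of treatment levels
    N = sum(len(g) for g in groups)      # total number of scores
    G = sum(sum(g) for g in groups)      # grand total, G = sum of the T values

    # SSbetween = sum(T^2 / n) - G^2 / N
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - G ** 2 / N

    # SSwithin = sum of the SS inside each treatment
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ss_within += sum((x - mean) ** 2 for x in g)

    df_between = k - 1
    df_within = sum(len(g) - 1 for g in groups)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    f_ratio = ms_between / ms_within
    return ss_between, ss_within, df_between, df_within, f_ratio


# Hypothetical scores for three treatment levels (not from the textbook).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(groups)
print(f"SSbetween = {ss_b:.2f}, SSwithin = {ss_w:.2f}")
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

With these made-up scores the treatment totals are 6, 9 and 18, so SSbetween = 147 – 121 = 26, SSwithin = 2 + 2 + 2 = 6, and F = 13 / 1 = 13.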
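Independent-Measures Designs • Static-Group Comparison Design: • Administer treatment to two or more groups and perform posttest • Perform posttest on control group • Compare groups • Design notation (X = treatment, O = observation): • Treatment group 1: X1 O • Treatment group 2: X2 O • Control group: O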
Independent-Measures Designs • Quasi-Experimental Pretest Posttest Control Group Design: • Perform pretest on three or more groups • Administer treatments to treatment groups • Perform posttests on all groups • Compare delta (Δ) scores • Design notation: • Treatment group 1: O X1 O Δ • Treatment group 2: O X2 O Δ • Control group: O O Δ
Independent-Measures Designs • Randomized Pretest Posttest Control Group Design: • Randomly select subjects from three or more populations • Perform pretest on all groups • Administer treatments to treatment groups • Perform posttests on all groups • Compare delta (Δ) scores • Design notation (R = randomization): • Treatment group 1: R O X1 O Δ • Treatment group 2: R O X2 O Δ • Control group: R O O Δ
Hypothesis Test: ANOVA • Example 13.1 (p 415) • Overview: • Researchers are interested in the effectiveness of different pain relievers (A, B and C) compared with a placebo (D) • N = 20 subjects randomly assigned to the four treatments (n = 5 per treatment) • Amount of time (s) each subject could withstand a painfully hot stimulus was measured
Hypothesis Test: ANOVA • Questions: • What is the experimental design? • What is the independent variable/factor? • How many levels are there? • What is the dependent variable?
Step 1: State Hypotheses • Non-directional • H0: µA = µB = µC = µD • H1: At least one mean is different from the others • Directional? Too many to list
Step 2: Set Criteria • Alpha (α) = 0.05 • Critical value: Use F Distribution Table, Appendix B.4 (p 693) • Information needed: • dfbetween = k – 1 • dfwithin = Σ(n – 1)
Step 3: Collect Data and Calculate Statistic
Total Sum of Squares • SStotal = ΣX² – G²/N • SStotal = 262 – 60²/20 • SStotal = 262 – 180 • SStotal = 82
Sum of Squares Between • SSbetween = Σ(T²/n) – G²/N • SSbetween = 5²/5 + 10²/5 + 20²/5 + 25²/5 – 60²/20 • SSbetween = (5 + 20 + 80 + 125) – 180 • SSbetween = 50
Sum of Squares Within • SSwithin = ΣSS inside each treatment • SSwithin = 8 + 8 + 6 + 10 • SSwithin = 32
Step 3: Collect Data and Calculate Statistic
Mean Square Between • MSbetween = SSbetween / dfbetween • MSbetween = 50 / 3 • MSbetween = 16.67
Mean Square Within • MSwithin = SSwithin / dfwithin • MSwithin = 32 / 16 • MSwithin = 2
F-Ratio • F = MSbetween / MSwithin • F = 16.67 / 2 • F = 8.33
Step 4: Make Decision
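The Step 3 arithmetic can be double-checked with a short Python sketch. It uses only the summary values quoted on the slides (treatment totals 5, 10, 20 and 25; ΣX² = 262; SS inside each treatment 8, 8, 6 and 10); the variable names are illustrative, not part of the textbook example.

```python
# Summary values from Example 13.1 as quoted on the slides.
T = [5, 10, 20, 25]          # treatment totals for A, B, C, D
n = 5                        # scores per treatment
ss_inside = [8, 8, 6, 10]    # SS inside each treatment
sum_x_squared = 262          # sum of all squared scores

k = len(T)
N = k * n                    # N = kn because sample sizes are equal
G = sum(T)                   # grand total

ss_total = sum_x_squared - G ** 2 / N
ss_between = sum(t ** 2 / n for t in T) - G ** 2 / N
ss_within = sum(ss_inside)

df_between = k - 1
df_within = k * (n - 1)      # sum of (n - 1) over the k treatments

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"SStotal = {ss_total:.0f}, SSbetween = {ss_between:.0f}, SSwithin = {ss_within}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

The check SStotal = SSbetween + SSwithin (82 = 50 + 32) is a quick way to catch arithmetic slips before computing F.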
Post Hoc Analysis • What ANOVA tells us: • Rejection of the H0 tells you that there is a high PROBABILITY that AT LEAST ONE difference exists SOMEWHERE • What ANOVA doesn’t tell us: • Where the differences lie • Post hoc analysis is needed to determine which mean(s) is(are) different
Post Hoc Analysis • Post Hoc Tests: Additional hypothesis tests performed after a significant ANOVA test to determine where the differences lie. • Post hoc analysis IS NOT PERFORMED unless the initial ANOVA H0 was rejected!
Post Hoc Analysis: Type I Error • Type I error: Rejection of a true H0 • Pairwise comparisons: Multiple post hoc tests comparing the means of all “pairwise combinations” • Problem: Each post hoc hypothesis test carries its own chance of a type I error • As multiple tests are performed, the chance of a type I error accumulates • Experimentwise alpha level: Overall probability of a type I error that accumulates over a series of pairwise post hoc hypothesis tests • How is this accumulation of type I error controlled?
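To get a feel for how quickly type I error accumulates, the sketch below counts the pairwise comparisons for k treatments and estimates the experimentwise alpha as 1 – (1 – α)^c. That formula assumes the c tests are independent, which pairwise tests on shared data are not, so the numbers are a rough guide only. The function name `experimentwise_alpha` is hypothetical.

```python
from math import comb


def experimentwise_alpha(k, alpha=0.05):
    """Return (number of pairwise tests, approximate experimentwise alpha).

    Assumes independent tests, so the result is only an approximation.
    """
    c = comb(k, 2)                       # k(k - 1) / 2 pairwise comparisons
    return c, 1 - (1 - alpha) ** c


for k in (3, 4, 5):
    c, ew_alpha = experimentwise_alpha(k)
    print(f"k = {k}: {c} pairwise tests, experimentwise alpha about {ew_alpha:.3f}")
```

Even with only four treatments there are six pairwise tests, and the approximate chance of at least one type I error is several times the per-test alpha of 0.05.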
Two Methods • Bonferroni or Dunn’s Method: • Perform multiple t-tests of desired comparisons or contrasts • Make each decision relative to α / number of tests • This reduction of alpha controls for the inflation of type I error • Specific post hoc tests: • Note: There are many different post hoc tests that can be used • Our book covers only two (Tukey and Scheffé)
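The Bonferroni/Dunn decision rule can be sketched as follows. The p-values are made up for illustration (they do not come from the textbook), and `bonferroni_decisions` is a hypothetical helper name.

```python
def bonferroni_decisions(p_values, alpha=0.05):
    """Compare each p-value with alpha divided by the number of tests."""
    adjusted_alpha = alpha / len(p_values)
    decisions = [p < adjusted_alpha for p in p_values]   # True = reject H0
    return adjusted_alpha, decisions


# Hypothetical p-values from three pairwise t-tests (A vs B, A vs C, B vs C).
p_values = [0.010, 0.030, 0.200]
adjusted_alpha, decisions = bonferroni_decisions(p_values)

print(f"Adjusted alpha = {adjusted_alpha:.4f}")
for p, reject in zip(p_values, decisions):
    print(f"p = {p:.3f}: {'reject H0' if reject else 'fail to reject H0'}")
```

Note that p = 0.030 would be significant at the unadjusted 0.05 level but not at the adjusted level of 0.05 / 3, which is the price paid for controlling the experimentwise error rate.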
Tukey’s Honestly Significant Difference (HSD) Test • Overview: • Computes a single value that determines the minimum difference (HSD) between any two means necessary for rejection of the H0 • Compares the HSD value to all of the contrast results • If the contrast result exceeds the HSD, the H0 of that particular contrast is rejected
Tukey’s HSD Calculation • Formulas: • Equal sample sizes • HSD = q√(MSwithin / n) • Unequal sample sizes • HSD = q√[(MSwithin / 2)(1/n1 + 1/n2)]
Tukey’s HSD Calculations • Formula Considerations: • q = value found in Table B.5 (p 696) • Left column: dfwithin • Top row: k treatments • Body: • Regular font: α = 0.05 • Bold font: α = 0.01 • MSwithin = value from ANOVA calculation • n = number of subjects in each treatment • Example 13.5 (p 427)
Step 1: State Hypotheses • Null • H0: µA = µB • H0: µA = µC • H0: µB = µC • Alternative • H1: µA ≠ µB • H1: µA ≠ µC • H1: µB ≠ µC
Step 2: Set Criteria • Alpha (α) = 0.05
Step 3: Calculate Statistic • Get q from Table B.5 • Information needed: • dfwithin = 24 • k = 3 • α = 0.05 • q = 3.53
Table 13.6 • Calculate Tukey’s HSD Value • HSD = q√(MSwithin / n) • HSD = 3.53√(4 / 9) • HSD = 2.36
Step 4: Make Decision • A significantly greater than B: MA – MB = 2.44 > 2.36 • A significantly greater than C: MA – MC = 4.00 > 2.36 • B not significantly different from C: MB – MC = 1.56 < 2.36
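The HSD computation and the three decisions can be reproduced with a small Python sketch. The inputs (q = 3.53, MSwithin = 4, n = 9 and the three mean differences) are taken from the slides; the function names are hypothetical. Computed without intermediate rounding, the HSD comes out at about 2.35 rather than the 2.36 shown on the slide, and the decisions are unchanged.

```python
from math import sqrt


def tukey_hsd(q, ms_within, n):
    """HSD for equal sample sizes: q * sqrt(MSwithin / n)."""
    return q * sqrt(ms_within / n)


def tukey_hsd_unequal(q, ms_within, n1, n2):
    """HSD for unequal sample sizes: q * sqrt((MSwithin / 2)(1/n1 + 1/n2))."""
    return q * sqrt((ms_within / 2) * (1 / n1 + 1 / n2))


q, ms_within, n = 3.53, 4, 9
hsd = tukey_hsd(q, ms_within, n)

# Absolute mean differences quoted on the slide.
mean_differences = {"A vs B": 2.44, "A vs C": 4.00, "B vs C": 1.56}
decisions = {pair: diff > hsd for pair, diff in mean_differences.items()}

print(f"HSD = {hsd:.3f}")
for pair, reject in decisions.items():
    print(f"{pair}: {'reject H0' if reject else 'fail to reject H0'}")
```

When n1 = n2, the unequal-n formula reduces to the equal-n formula, which is a useful sanity check on either implementation.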
Scheffé • Overview: • Most conservative/cautious of all post hoc tests • Uses an F-ratio (like ANOVA) on only two treatments at a time • Controls for type I error: • Uses the k value from the original ANOVA, so dfbetween (the numerator df of the F-ratio) for the Scheffé test is still k – 1 • Uses the same critical value used for the ANOVA • Calculation of Scheffé is identical to the ANOVA, except: • SSbetween uses only the two means of interest • Example 13.6 (p 428)
Step 1: State Hypotheses • Null • H0: µA = µB • H0: µA = µC • H0: µB = µC • Alternative • H1: µA ≠ µB • H1: µA ≠ µC • H1: µB ≠ µC
Step 2: Set Criteria • Alpha (α) = 0.05 • dfbetween = 2 • dfwithin = 24 • Critical value = 3.40
Step 3: Calculate Statistic (contrast A vs. B) • Sum of squares between: • SSbetween = Σ(T²/n) – G²/N • SSbetween = (27²/9 + 49²/9) – 76²/18 • SSbetween = (81 + 266.78) – 320.89 • SSbetween = 26.89 • SSwithin from original ANOVA = 96 • Mean square between and within: • MSbetween = SSbetween / dfbetween • MSbetween = 26.89 / 2 = 13.45 • MSwithin from original ANOVA = 4 • F = MSbetween / MSwithin • F = 13.45 / 4 • F = 3.36 (df = 2, 24)
Step 4: Make Decision • F = 3.36 < 3.40 (critical value) • Accept or reject? • Repeat for the other two contrasts: • H0: µA = µC • H0: µB = µC
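The Scheffé calculation for a single contrast can be sketched in Python as below. The inputs (T = 27 and 49, n = 9, k = 3, MSwithin = 4, critical value 3.40) come from the slides; the function name `scheffe_f` is hypothetical, and the sketch assumes equal sample sizes in the two treatments being compared.

```python
def scheffe_f(t1, t2, n, k, ms_within):
    """Scheffé F-ratio for one pairwise contrast with equal n per treatment.

    SSbetween uses only the two treatments of interest, but dfbetween
    (k - 1) and MSwithin come from the original ANOVA.
    """
    g = t1 + t2                                   # total for the two treatments
    ss_between = t1 ** 2 / n + t2 ** 2 / n - g ** 2 / (2 * n)
    ms_between = ss_between / (k - 1)
    return ms_between / ms_within


f_ab = scheffe_f(27, 49, n=9, k=3, ms_within=4)
critical_value = 3.40                             # F table value for df = 2, 24
reject_h0 = f_ab > critical_value

print(f"F = {f_ab:.2f}, critical value = {critical_value:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The same function can be called with the totals for the other two contrasts (A vs. C and B vs. C) once those totals are taken from the textbook example.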
Instat • Type dependent variable data from the three or more samples into one column: • Label column appropriately • In a second column, type in the grouping variable (independent variable) next to each data point: • Label column appropriately • Convert the grouping column into a “factor” column: • Highlight the grouping column. • Choose “Manage” • Choose “Column Properties” • Choose “Factor” • Select the appropriate column to be converted • Indicate the number of levels in the factor • Click OK
Instat • Choose “Statistics” • Choose “Analysis of Variance” • Choose “One-Way” • Y-Variate: • Choose the dependent variable • Factor: • Choose the factor column or grouping/independent variable • Plots: • Not necessary to choose any • Click OK. • Interpret the p-value!!! • Post Hoc Analysis: • Perform multiple Independent-Measures t-Tests with the Bonferroni/Dunn correction method
Reporting ANOVA Results • Information to include: • Value of the F statistic • Degrees of freedom: • Between: k – 1 • Within: Σ(n – 1) • p-value • Examples: • A significant treatment effect was observed (F(2, 24) = 8.33, p = 0.02)
Reporting ANOVA Results • An ANOVA summary table is often included
Assumptions of ANOVA • Independent Observations • Normal Distribution • Scale of Measurement • Interval or ratio • Equal variances (homogeneity)
Violation of Assumptions • Nonparametric version → Kruskal-Wallis Test (Chapter 17) • When to use the Kruskal-Wallis Test: • Independent-Measures design with three or more groups • Scale of measurement assumption violation: • Ordinal data • Normality assumption violation: • Regardless of scale of measurement