Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) • Comparison of multiple sample means • t-test: μ1 = μ2, or μ1 − μ2 = 0 • ANOVA: μ1 = μ2 = μ3 = μ4 • Running multiple t-tests makes the probability of a Type I error too great

Assumptions • 1. Normality-The dependent variable is assumed to be normally distributed in the population from which the samples are drawn.

Assumptions • 2. Homogeneity of Variance: The variances of the dependent variable scores in each population are assumed to be equal.

Assumptions • 3. Random assignment - distributes the effects of extraneous variables equally over all levels of the independent variable. Using random assignment allows you to conclude that a difference between means is caused by a difference in levels of the independent variable

When one-way (simple) ANOVA is used to compare the means of several samples, the variability between samples is compared to the variability within samples. All differences in sample means are judged statistically significant (or not) by comparing them to the variation within samples.

Factorial ANOVA • Factorial ANOVA allows the use of more than one independent variable and allows for the analysis of the interaction between the independent variables. • An “A x B x C” factorial allows for determination of 7 different F ratios (3 main effects, 3 two-way interactions, and 1 three-way interaction). • An “A x B” factorial produces 3 F ratios (2 main effects and 1 interaction).
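The F-ratio counts on this slide can be checked by listing every non-empty combination of factors, since each main effect and each interaction gets its own F ratio in a fully crossed design. The following is a minimal Python sketch; the function name `factorial_effects` is hypothetical and used only for illustration.

```python
from itertools import combinations

def factorial_effects(factors):
    """List every main effect and interaction for a fully crossed design."""
    effects = []
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            effects.append(" x ".join(combo))
    return effects

two_way = factorial_effects(["A", "B"])
three_way = factorial_effects(["A", "B", "C"])

print(len(two_way), two_way)      # main effects A, B and the A x B interaction
print(len(three_way), three_way)  # 3 main effects, 3 two-way, 1 three-way
```

In general a design with k crossed factors yields 2^k − 1 such effects.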

Factorial ANOVA Limitations • 1. Number of scores per cell must be equal • 2. The cells must be independent- no form of correlated data may be used. Each subject is randomly assigned to only one cell • 3. The levels of both factors are fixed- chosen by the experimenter

One Way Correlated measures ANOVA • Allows you to calculate and then discard the variability that comes from differences between subjects.

Repeated measures advantages • Efficiency- requires fewer subjects than other designs • Power- any procedure that reduces MSE increases the F ratio. Therefore the correlated measures design is both powerful and efficient.
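To see why removing between-subject variability raises F, the sketch below analyzes one data set two ways. It uses the five-row, three-column score table from this deck and assumes, purely for illustration, that each row is one subject measured under all three conditions. Variable names are hypothetical.

```python
# Scores from the deck's data table. Treating each row as the same subject
# measured under A1, A2, A3 is an assumption made for this illustration.
scores = [
    [4, 7, 7],  # subject a
    [3, 6, 6],  # subject b
    [3, 3, 7],  # subject c
    [3, 4, 5],  # subject d
    [2, 5, 5],  # subject e
]
n = len(scores)      # number of subjects
k = len(scores[0])   # number of conditions

grand = sum(sum(row) for row in scores) / (n * k)
cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
subj_means = [sum(row) / k for row in scores]

ss_total = sum((x - grand) ** 2 for row in scores for x in row)
ss_between = n * sum((m - grand) ** 2 for m in cond_means)
ss_within = ss_total - ss_between
# Variability due to stable differences between subjects.
ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)
# What remains after subject differences are discarded.
ss_error = ss_within - ss_subjects

ms_between = ss_between / (k - 1)
f_independent = ms_between / (ss_within / (n * k - k))
f_correlated = ms_between / (ss_error / ((n - 1) * (k - 1)))

print(round(f_independent, 2), round(f_correlated, 2))
```

The numerator is the same in both analyses. Only the error term shrinks when subject variability is removed, which is where the gain in power comes from.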

Correlated measures Limits • Carryover effects • The model and the F ratio formulas change depending on whether fixed effects or random effects are used.

Why not use a t-test? • Let’s assume we want to compare finish times in a forty-yard sprint between 5th grade students from 6 schools (A, B, C, D, E, and F). • Could we use a t-test to make the comparisons?
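A quick calculation shows the cost of answering this question with pairwise t-tests. The sketch below assumes α = 0.05 per test and treats the tests as independent, which is a simplification; neither value is stated on the slide.

```python
from math import comb

schools = ["A", "B", "C", "D", "E", "F"]
alpha = 0.05  # assumed per-test Type I error rate

# Every pair of schools needs its own t-test.
n_tests = comb(len(schools), 2)

# Chance of at least one false rejection across all tests,
# under the simplifying assumption that the tests are independent.
familywise_error = 1 - (1 - alpha) ** n_tests

print(n_tests, round(familywise_error, 3))
```

With 15 comparisons the chance of at least one Type I error is above one half under these assumptions, which is why a single overall test is preferred.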

Solution • Use ANOVA to determine differences by comparing the sources of variability. • Use one preset level for α (the rejection region). • Use an F ratio instead of a t ratio.

F = MSbetween / MSwithin

MS error is an unbiased estimate of the population variance whether the null hypothesis is true or false. • MS between groups is a good estimate of the variance only if the null hypothesis is true, because it is based on the variation between the group means.

If the null is false, MS between will tend to be large and the value of F will increase. • If the null is true, both MS error and MS between will give similar estimates and the F value will be near unity.
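This behavior of F can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not part of the original slides: it draws three groups of five scores from normal populations with equal variance, once with equal population means and once with means 0, 1, and 2, and averages F over many replications. The function names, sample sizes, and seed are hypothetical choices.

```python
import random

def f_ratio(groups):
    """One-way ANOVA F ratio: MS between divided by MS within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def mean_f(pop_means, n=5, sd=1.0, reps=2000, seed=0):
    """Average F over repeated samples from normal populations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        groups = [[rng.gauss(mu, sd) for _ in range(n)] for mu in pop_means]
        total += f_ratio(groups)
    return total / reps

mean_f_null = mean_f([0.0, 0.0, 0.0])  # null hypothesis true
mean_f_alt = mean_f([0.0, 1.0, 2.0])   # null hypothesis false

print(round(mean_f_null, 2), round(mean_f_alt, 2))
```

Under a true null the average F sits close to 1 (slightly above it in small samples, since the expected value of F is df_within / (df_within − 2)), while under a false null it is several times larger.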

Example Suppose we have three groups of kids that are evaluated by judges on a particular skills test after being taught by three different methods (tape alone, teacher, teacher + tape). We want to know if the scores from the three groups (methods) differ.

Scores


Subject   A1   A2   A3
a          4    7    7
b          3    6    6
c          3    3    7
d          3    4    5
e          2    5    5

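Summary Table for ANOVA

Source            SS           df       MS                    F
Between groups    SSbetween    K − 1    SSbetween / (K − 1)   MSbetween / MSwithin
Within groups     SSwithin     N − K    SSwithin / (N − K)
Total             SStotal      N − 1

K = number of groups; N = total number of subjects across all groups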

Explanation of Summary Table • Between-group variance is the variability among the sample means. • Within-group variance is the variability among individual scores within the samples. • Mean squares (MS) are calculated by dividing the sum of squares by the appropriate degrees of freedom (df). • The F ratio is the ratio of MSbetween (numerator) to MSwithin (denominator). (11.65/1.33)
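The summary-table quantities can be recomputed directly from raw scores. The sketch below assumes that the A1, A2, and A3 columns listed earlier in this deck are the three independent groups behind the quoted ratio; that link is an inference, not something the slides state. Computed at full precision, the mean squares come out near 11.67 and 1.33, close to the 11.65/1.33 quoted above.

```python
# Scores per group, taken from the deck's A1/A2/A3 columns (assumed to be
# the data behind the summary table).
groups = {
    "A1": [4, 3, 3, 3, 2],
    "A2": [7, 6, 3, 4, 5],
    "A3": [7, 6, 7, 5, 5],
}

k = len(groups)                                  # K: number of groups
n_total = sum(len(g) for g in groups.values())   # N: total subjects
grand_mean = sum(sum(g) for g in groups.values()) / n_total
group_means = {name: sum(g) / len(g) for name, g in groups.items()}

# Between groups: spread of the group means around the grand mean.
ss_between = sum(
    len(g) * (group_means[name] - grand_mean) ** 2 for name, g in groups.items()
)
# Within groups: spread of individual scores around their own group mean.
ss_within = sum(
    (x - group_means[name]) ** 2 for name, g in groups.items() for x in g
)

df_between = k - 1
df_within = n_total - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(df_between, df_within, round(ms_between, 2), round(ms_within, 2), round(f_ratio, 2))
```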

Interpretation of ANOVA Results If the calculated F ratio is greater than the critical F value from the F table, then we can state that there is a difference between at least two of the means.
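The comparison with the critical value can be sketched without an F table in the special case of three groups, because the F distribution with 2 numerator degrees of freedom has a simple closed-form tail probability. The sketch assumes α = 0.05 and uses F = 8.75 with 12 within-group degrees of freedom, which is what the deck's three-group scores yield when recomputed at full precision. The function names are hypothetical, and the formulas apply only when the numerator df equals 2.

```python
def f_tail_prob_df1_2(f_value, df_within):
    """P(F > f_value) for an F distribution with 2 and df_within df."""
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)

def f_critical_df1_2(alpha, df_within):
    """Critical F for 2 and df_within df, found by inverting the tail formula."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)

alpha = 0.05       # assumed significance level
f_observed = 8.75  # recomputed from the deck's example scores
df_within = 12     # N - K = 15 - 3

f_critical = f_critical_df1_2(alpha, df_within)
p_value = f_tail_prob_df1_2(f_observed, df_within)
reject_null = f_observed > f_critical

print(round(f_critical, 2), round(p_value, 4), reject_null)
```

For designs with a different number of groups, the critical value would come from an F table or a statistics library instead.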

Tukey • We now know that at least 2 means are different, but we don’t know which two. An additional procedure is required to locate the difference. There are many possible tests; the one we will use is called Tukey’s HSD.
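A minimal sketch of the HSD step follows, assuming equal group sizes and the usual formula HSD = q × sqrt(MSwithin / n). The group means and MSwithin are those computed from the deck's example scores, and the studentized range value q = 3.77 (α = 0.05, 3 groups, 12 df) is taken from a standard table as an assumption; it does not appear in the slides.

```python
from itertools import combinations
from math import sqrt

# Group means and within-group mean square from the deck's example scores.
means = {"A1": 3.0, "A2": 5.0, "A3": 6.0}
ms_within = 16 / 12
n_per_group = 5

# Assumed table value of the studentized range statistic
# for alpha = 0.05, 3 groups, 12 within-group df.
q_crit = 3.77

# Smallest difference between two means that counts as significant.
hsd = q_crit * sqrt(ms_within / n_per_group)

significant = {
    (a, b): abs(means[a] - means[b]) > hsd for a, b in combinations(means, 2)
}

print(round(hsd, 2))
for pair, flag in significant.items():
    print(pair, flag)
```

Under these assumptions A1 differs from both A2 and A3, while A2 and A3 do not differ from each other. The A1 versus A2 difference (2.0) only narrowly exceeds the HSD, so that conclusion is sensitive to the table value used.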
