Understanding ANOVA and F-Test in Statistics

General Linear Model 2 Intro to ANOVA

Questions • ANOVA makes assumptions about error for significance tests. What are the assumptions? • What might happen (why would it be a problem) if the assumption of {normality, equality of error, independence of error} turned out to be false? • What is an expected mean square? Why is it important? • Why do we use the F test to decide whether means are equal in ANOVA?

Questions (2) • Correctly interpret ANOVA summary tables. • Find correct values of critical F from tabled values for a given test. • Suppose someone has worked out that a one-way ANOVA with 6 levels has a power of .80 for the overall F test. What does this mean? • Describe (make up) a concrete example of a one-way ANOVA where it makes sense to use an overall F test. Explain why ANOVA (not t, chi-square or something else) is the best method for the analysis.

New Distributions • So far, the normal (z) and its short, fat relative, the t distribution. • The normal has two children, chi-square ($\chi^2$) and F. • Chi-square with $\nu$ degrees of freedom is the sum of $\nu$ squared, independent deviates from the unit normal. It essentially shows the sampling distribution of the variance. • F is the ratio of two independent chi-squares, each divided by its degrees of freedom.
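
The relationships on this slide can be written out as follows. This is a sketch in standard textbook notation; the symbols $z_i$, $\nu_1$, $\nu_2$ and $s^2$ are assumptions, since the slide's own notation did not survive.

```latex
% Chi-square: sum of nu squared independent unit-normal deviates
\chi^2_{\nu} = \sum_{i=1}^{\nu} z_i^2, \qquad z_i \sim N(0, 1) \text{ independent}

% Link to the sampling distribution of the variance (normal population, sample size n)
\frac{(n - 1)\, s^2}{\sigma^2} \sim \chi^2_{n-1}

% F: ratio of two independent chi-squares, each divided by its degrees of freedom
F_{\nu_1, \nu_2} = \frac{\chi^2_{\nu_1} / \nu_1}{\chi^2_{\nu_2} / \nu_2}
```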

ANOVA Assumptions • Recall we can partition total SS into between (treatment) and within (error) SS. No assumptions needed. • To conduct tests about population effects, have to make assumptions: • Within cells (treatments) error is normal. • Homogeneity of error variance. • Independent errors.

Assumptions • Normality – needed for the sampling distributions of means and variances; violations are not bad if N is large; e.g., reaction time (typically skewed) • Homogeneity – error variance is pooled into a single estimate of the population value. Where are means different? Assumed equal error for each. E.g., ceiling effects in training. • Independence – sampling distribution again; e.g., cheating on exam, nesting (schools, labs)

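Mean Square Between Groups • Mean square = SS/df = variance estimate. • MS between = $SS_{between}/(J-1)$ (J treatments). • E(MS between) = $\sigma_e^2 + n\sum_j \tau_j^2/(J-1)$, with n observations per group and treatment effects $\tau_j = \mu_j - \mu$. • If there is no treatment effect, MS between estimates the error variance. • If there is a treatment effect, MS between tends to be bigger than the error variance.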

Mean Square Within Groups (N is total sample size and J is number of groups.) • MS within = $SS_{within}/(N-J)$ • E(MS within) = $\sigma_e^2$ • Expected mean square for error is $\sigma_e^2$. Expected mean square for treatment is the same plus treatment effect: $\sigma_e^2 + n\sum_j \tau_j^2/(J-1)$ (n per group, $\tau_j = \mu_j - \mu$). • When there is no treatment effect, between and within estimate the same thing.
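
A small simulation can make the expected mean squares concrete. The sketch below is illustrative only: the design (3 groups, 5 per group, error variance 1), the effect sizes, the seed, and the helper names are all assumptions, not values from the lecture. Under the null both mean squares should average about 1; with effects (-1, 0, 1) the standard result predicts E(MS between) = 1 + 5(2)/2 = 6, while MS within stays near 1.

```python
import random

def mean_squares(groups):
    """Return (MS between, MS within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    n_total, n_groups = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n_total
    cell_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, cell_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, cell_means) for x in g)
    return ss_between / (n_groups - 1), ss_within / (n_total - n_groups)

def average_mean_squares(effects, n_per_group=5, sigma=1.0, reps=4000, seed=1):
    """Average MS between and MS within over many simulated experiments.

    `effects` holds one hypothetical treatment effect per group.
    """
    rng = random.Random(seed)
    total_b = total_w = 0.0
    for _ in range(reps):
        groups = [[tau + rng.gauss(0.0, sigma) for _ in range(n_per_group)]
                  for tau in effects]
        ms_b, ms_w = mean_squares(groups)
        total_b += ms_b
        total_w += ms_w
    return total_b / reps, total_w / reps

# No treatment effect: both averages should be close to the error variance.
null_b, null_w = average_mean_squares([0.0, 0.0, 0.0])
# Treatment effect present: only MS between should grow.
alt_b, alt_w = average_mean_squares([-1.0, 0.0, 1.0])
print("null:  ", null_b, null_w)
print("effect:", alt_b, alt_w)
```

The averages are simulation estimates, so they will sit near the expected values rather than exactly on them.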

Review • ANOVA makes assumptions about error for significance tests. What are the assumptions? • What might happen (why would it be a problem) if the assumption of {normality, equality of error, independence of error} turned out to be false? • What is an expected mean square? Why is it important?

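The F Test (1) • Suppose $H_0: \mu_1 = \mu_2 = \dots = \mu_J$. • The null is equivalent to: $H_0: \tau_j = 0$ for all j, against $H_1: \tau_j \neq 0$ for some j, where $\tau_j = \mu_j - \mu$ is the treatment effect for group j. • If the null is true, then MS between and MS within both estimate the error variance, and the ratio of the two variance estimates, F = MS between / MS within, will be distributed as F with J-1 and N-J degrees of freedom.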

The F Test (2) This is a big deal because we can use variance estimates to test the hypothesis that any number of population means are equal. Testing equality of means is the same as testing whether the population treatment effect(s) are zero. For a treatment effect to be detected, F must be larger than 1, by enough to exceed the critical value. F is one-tailed in the tables, which show upper-tail values of F given the two df.

F Table – Critical Values

Review • Why do we use the F test to decide whether means are equal in ANOVA? • Suppose we have an ANOVA design with 3 cells and 5 people per cell. What is the critical value of F at alpha = .05?
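
For the second question, the degrees of freedom follow directly from the design, and when the numerator df is exactly 2 the upper-tail critical value has a closed form, so no table is needed. The sketch below relies on that special case only; the function name is hypothetical, and for other numerator df you would use a table or a statistics library.

```python
def critical_f_two_numerator_df(df_within, alpha=0.05):
    """Upper-tail critical F when the numerator df is exactly 2.

    For 2 numerator df, P(F > x) = (1 + 2x / df_within) ** (-df_within / 2),
    which can be solved for x directly. Not valid for other numerator df.
    """
    return (df_within / 2.0) * (alpha ** (-2.0 / df_within) - 1.0)

n_groups, n_per_group = 3, 5
df_between = n_groups - 1                      # J - 1
df_within = n_groups * n_per_group - n_groups  # N - J
f_crit = critical_f_two_numerator_df(df_within, alpha=0.05)
print(df_between, df_within, round(f_crit, 2))
```

With 2 and 12 df this agrees with the tabled critical value of 3.89 used in the ANOVA summary later in the lecture.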

Calculating F – 1 Way ANOVA Sums of squares (squared deviations from the mean) tell the story of variance. The simple ANOVA designs have 3 sums of squares. The total sum of squares comes from the distance of all the scores from the grand mean. This is the total; it’s all you have. The within-group or within-cell sum of squares comes from the distance of the observations to the cell means. This indicates error. The between-cells or between-groups sum of squares tells of the distance of the cell means from the grand mean. This indicates IV effects.
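
The three sums of squares described above can be computed directly from raw scores. This is a minimal sketch; the scores are made up for illustration and are not the caffeine data from the lecture, and the function name is hypothetical.

```python
def sums_of_squares(groups):
    """Return (SS total, SS within, SS between) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    cell_means = [sum(g) / len(g) for g in groups]
    # Total: distance of every score from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Within: distance of each score from its own cell mean (error).
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, cell_means) for x in g)
    # Between: distance of each cell mean from the grand mean, weighted by cell size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, cell_means))
    return ss_total, ss_within, ss_between

# Hypothetical scores for three groups of five people each.
scores = [[1, 2, 3, 4, 5], [3, 4, 5, 6, 7], [6, 7, 8, 9, 10]]
ss_total, ss_within, ss_between = sums_of_squares(scores)
print(ss_total, ss_within, ss_between)
```

Whatever data are supplied, the within and between pieces should add up to the total.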

Computational Example: Caffeine on Test Scores

Total Sum of Squares

Within Sum of Squares

Between Sum of Squares

ANOVA Source (Summary) Table

ANOVA Summary • Calculate SS (total, between, within) • Each SS has associated df to calculate MS • F is ratio of MSb to MSw • Compare obtained F (12.5) to critical value (3.89). Significant if obtained F is larger than critical. • One-tailed test makes sense for F.

Review • Suppose we have 4 groups and 10 people per group. We find that SSB = 60 and SSW = 40. Construct an ANOVA summary table and test for significance of the overall effect.
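
One way to lay out the arithmetic for this exercise is sketched below. The function name and the table layout are assumptions; the SS values and the design (4 groups of 10) come from the question.

```python
def anova_summary(ss_between, ss_within, n_groups, n_total):
    """Build one-way ANOVA summary rows: (source, SS, df, MS, F)."""
    df_between = n_groups - 1
    df_within = n_total - n_groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return [
        ("Between", ss_between, df_between, ms_between, ms_between / ms_within),
        ("Within", ss_within, df_within, ms_within, None),
        ("Total", ss_between + ss_within, n_total - 1, None, None),
    ]

def fmt(value):
    """Format a table cell, leaving empty cells blank."""
    return "" if value is None else f"{value:.2f}"

table = anova_summary(ss_between=60, ss_within=40, n_groups=4, n_total=40)
print(f"{'Source':<8}{'SS':>8}{'df':>5}{'MS':>8}{'F':>8}")
for source, ss, df, ms, f_ratio in table:
    print(f"{source:<8}{ss:>8.2f}{df:>5d}{fmt(ms):>8}{fmt(f_ratio):>8}")

f_obtained = table[0][4]
```

The obtained F is then compared with the tabled critical value for 3 and 36 df at the chosen alpha; it is significant if it is larger.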

ANOVA Descriptive Stats • Because SStot = SSb + SSw we can figure proportion of total variance due to treatment. • Proportion of total variance due to treatment is: R² = SSb/SStot. • Varies from 0 (no effect) to 1 (no error). • Sample value is biased (too large).

Estimating Power • Power for what? For one-way ANOVA, power usually means power for the overall F, i.e., the probability of a significant F when at least 1 group mean is different from the others. • Howell uses noncentral F for sample size calculation, with effect size $\phi' = \sqrt{\frac{\sum_j (\mu_j - \mu)^2 / k}{\sigma_e^2}}$ and $\phi = \phi'\sqrt{n}$, where k is the number of treatment groups; n is sample size per group. Variance of error ($\sigma_e^2$) is MSE in the population (variance of DV within cells). $\mu_j$ are treatment means; $\mu$ is grand mean.

SAS Power calculation • SAS will compute sample size requirements for a given scenario. • You input the expected means and a common (within cell) standard deviation, (along with alpha and desired power) and it will tell you the sample size you need.

SAS Input
**********************************************************
* Power computation example from Howell, 2010, p. 350.
* Note the standard deviation is the square root of the
* provided MSE: sqrt(240.35) = ~ 15.5.
**********************************************************;
proc power;
   onewayanova
      groupmeans = 34 | 50.8 | 60.33 | 48.5 | 38.1
      stddev = 15.5
      alpha = 0.05
      npergroup = .
      power = .8;
run;

SAS Output
The POWER Procedure
Overall F Test for One-Way ANOVA

Fixed Scenario Elements
Method                Exact
Alpha                 0.05
Group Means           34 50.8 60.33 48.5 38.1
Standard Deviation    15.5
Nominal Power         0.8

Computed N Per Group
Actual Power    N Per Group
0.831           8
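
As a rough cross-check on the inputs above, the effect size behind this example can be computed by hand. The sketch assumes the usual noncentral-F effect-size index, phi' = sqrt((sum of squared deviations of the treatment means from the grand mean, divided by k) / error variance), with phi = phi' * sqrt(n). The function name is hypothetical, and the code does not compute power itself; that still requires noncentral F tables or software such as PROC POWER.

```python
import math

def effect_size_phi(group_means, error_variance, n_per_group):
    """Return (phi_prime, phi) for a one-way design with equal n per group."""
    k = len(group_means)
    grand_mean = sum(group_means) / k
    # Variance of the treatment means around the grand mean (divided by k).
    effect_variance = sum((m - grand_mean) ** 2 for m in group_means) / k
    phi_prime = math.sqrt(effect_variance / error_variance)
    return phi_prime, phi_prime * math.sqrt(n_per_group)

# Group means and MSE from the SAS example; n = 8 is the computed N per group.
means = [34, 50.8, 60.33, 48.5, 38.1]
phi_prime, phi = effect_size_phi(means, error_variance=240.35, n_per_group=8)
print(round(phi_prime, 2), round(phi, 2))
```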

Review • Suppose someone has worked out that a one-way ANOVA with 6 levels has a power of .80 for the overall F test. What does this mean? • Describe (make up) a concrete example of a one-way ANOVA where it makes sense to use an overall F test. Explain why ANOVA (not t, chi-square or something else) is the best method for the analysis.
