Learn about ANOVA and its uses in hypothesis testing, regression analysis, and mean comparison. Understand sources of variation, degrees of freedom, variance, F-test statistic, and how to create and interpret an ANOVA table.
Analysis of Variance Introduction
Analysis of Variance • The Analysis of Variance is abbreviated as ANOVA • Used for hypothesis testing in • Simple Regression • Multiple Regression • Comparison of Means
Sources • There is variation whenever the data values are not all identical • This variation can come from different sources, such as the model or the factor • There is always some left-over variation that can’t be explained by any of the other sources. This source is called the error
Variation • Variation is the sum of squares of the deviations of the values from the mean of those values • As long as the values are not identical, there will be variation • Abbreviated as SS for Sum of Squares
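As a minimal sketch of the definition above, the variation (SS) of a small data set can be computed directly. The data values and the helper name `sum_of_squares` are hypothetical, chosen only for illustration.

```python
from statistics import mean

def sum_of_squares(values):
    """Variation (SS): sum of squared deviations from the mean of the values."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)

# Hypothetical data, not taken from the slides
data = [4, 7, 5, 9, 10]
ss = sum_of_squares(data)

# Identical values have no variation
ss_identical = sum_of_squares([3, 3, 3])

print(ss, ss_identical)
```

The second call illustrates the point that variation is zero only when every value is the same.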
Degrees of Freedom • The degrees of freedom are the number of values that are free to vary once certain parameters have been established • Usually, this is one less than the sample size, but in general, it’s the number of values minus the number of parameters being estimated • Abbreviated as df
Variance • The sample variance is, roughly, the average squared deviation from the mean • Found by dividing the variation by the degrees of freedom (not by the sample size) • Variance = Variation / df • Abbreviated as MS for Mean Square • MS = SS / df
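The relation MS = SS / df can be checked against Python's built-in sample variance. This is a sketch on hypothetical data, assuming a single sample, so that df is one less than the sample size.

```python
from statistics import mean, variance

# Hypothetical data, not taken from the slides
data = [4, 7, 5, 9, 10]

m = mean(data)
ss = sum((x - m) ** 2 for x in data)   # variation (SS)
df = len(data) - 1                     # one parameter (the mean) was estimated
ms = ss / df                           # variance (MS)

# statistics.variance uses the same n - 1 divisor
builtin = variance(data)

print(ss, df, ms, builtin)
```

Both `ms` and `builtin` should agree, since the standard-library function also divides by the degrees of freedom rather than by the sample size.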
F • F is the F test statistic • There will be an F test statistic for each source except for the error and total • F is the ratio of two sample variances • The MS column contains variances • The F test statistic for each source is the MS for that row divided by the MS of the error row
F • F requires a pair of degrees of freedom, one for the numerator and one for the denominator • The numerator df is the df for the source • The denominator df is the df for the error row • F is always a right tail test
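The two F slides can be summarized in symbols. The notation below is an assumption made for this summary, not the slides' own: α is the chosen significance level and F with a subscript α is the right-tail critical value.

```latex
F = \frac{MS_{\text{source}}}{MS_{\text{error}}},
\qquad
df_{\text{num}} = df_{\text{source}},
\qquad
df_{\text{den}} = df_{\text{error}}

\text{Reject } H_0 \text{ when } F > F_{\alpha;\, df_{\text{num}},\, df_{\text{den}}}
\quad\Longleftrightarrow\quad
p = P\!\left(F_{df_{\text{num}},\, df_{\text{den}}} \ge F\right) < \alpha
```

Because the test is right-tailed, only large values of F count as evidence against the null hypothesis.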
The ANOVA Table • The ANOVA table is composed of rows, and each row represents one source of variation • For each source of variation … • The variation is in the SS column • The degrees of freedom are in the df column • The variance is in the MS column • The MS value is found by dividing the SS by the df
ANOVA Table • The complete ANOVA table can be generated by most statistical packages and spreadsheets • We’ll concentrate on understanding how the table works rather than the formulas for the variations
The ANOVA Table The explained* variation has different names depending on the particular type of ANOVA problem
Example 1 The Sum of Squares and Degrees of Freedom are given. Complete the table.
Example 1 – Find Totals Add the SS and df columns to get the totals.
Example 1 – Find MS Divide SS by df to get MS.
Example 1 – Find F F = 6.30 / 4.50 = 1.4
Notes about the ANOVA • The MS(Total) isn’t actually part of the ANOVA table, but it represents the sample variance of the response variable, so it’s useful to find • The total df is one less than the sample size • You would either need to find a Critical F value or the p-value to finish the hypothesis test
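To illustrate the last note, here is a rough sketch of a right-tail p-value using only the standard library. It numerically integrates the F density with Simpson's rule. This is a stand-in technique, not how statistical packages compute it, and it assumes the numerator df is greater than 2 so the density is zero at the origin. The function names are hypothetical. The inputs F = 2.60 with 5 and 26 degrees of freedom are the values that appear in Example 2 of this deck.

```python
import math

def f_pdf(x, df1, df2):
    """Density of the F distribution (sketch assumes df1 > 2)."""
    if x <= 0:
        return 0.0
    a, b = df1 / 2, df2 / 2
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_pdf = (a * math.log(df1 / df2)
               + (a - 1) * math.log(x)
               - (a + b) * math.log1p(df1 * x / df2)
               - log_beta)
    return math.exp(log_pdf)

def f_right_tail(f_stat, df1, df2, steps=2000):
    """Right-tail area P(F >= f_stat) via composite Simpson's rule on [0, f_stat].

    steps must be even.
    """
    h = f_stat / steps
    total = f_pdf(0.0, df1, df2) + f_pdf(f_stat, df1, df2)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * f_pdf(i * h, df1, df2)
    return 1.0 - total * h / 3

# F = 2.60 with numerator df = 5 and denominator df = 26
p_value = f_right_tail(2.60, 5, 26)
print(round(p_value, 4))
```

The resulting p-value is compared with the chosen significance level. In practice a statistical package or an F table would be used instead of this hand-rolled integration.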
Example 2 Complete the table
Example 2 – Step 1 SS / df = MS, so 106.6 / df = 21.32. Solving for df gives df = 5. F = MS(Source) / MS(Error), so 2.60 = 21.32 / MS(Error). Solving gives MS(Error) = 8.20.
Example 2 – Step 2 For the error row, SS / df = MS, so SS / 26 = 8.20. Solving for SS gives SS = 213.2. The total df is the sum of the other df, so 5 + 26 = 31.
Example 2 – Step 3 Find the total SS by adding the two SS values: 106.6 + 213.2 = 319.8
Example 2 – Step 4 Find the MS(Total) by dividing SS by df. 319.8 / 31 = 10.32
Example 2 – Notes • Since there are 31 df, the sample size was 32 • Since the sample variance was 10.32 and the standard deviation is the square root of the variance, the sample standard deviation is 3.21
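The four steps of Example 2 and the notes above can be retraced in a few lines. This is a sketch that starts from the entries the worked solution treats as given (SS and MS for the explained source, F, and the error df). The variable names are assumptions made for this illustration.

```python
import math

# Entries treated as given in the Example 2 solution
ss_source = 106.6
ms_source = 21.32
f_stat = 2.60
df_error = 26

# Step 1: df = SS / MS, and MS(Error) = MS(Source) / F
df_source = round(ss_source / ms_source)
ms_error = ms_source / f_stat

# Step 2: SS(Error) = MS(Error) * df(Error); total df is the sum of the df
ss_error = ms_error * df_error
df_total = df_source + df_error

# Step 3: total SS is the sum of the SS values
ss_total = ss_source + ss_error

# Step 4: MS(Total) is the sample variance of the response
ms_total = ss_total / df_total

# Notes: sample size and sample standard deviation
n = df_total + 1
sd = math.sqrt(ms_total)

print(df_source, round(ms_error, 2), round(ss_error, 1), df_total)
print(round(ss_total, 1), round(ms_total, 2), n, round(sd, 2))
```

Each printed value can be compared with the corresponding number in the steps and notes.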
Example 3 The sample size is n = 20. Work this one out on your own!
Example 3 - Solution How did you do?