Analysis of Variance Introduction
Analysis of Variance • The Analysis of Variance is abbreviated as ANOVA • Used for hypothesis testing in • Simple Regression • Multiple Regression • Comparison of Means
Sources • There is variation whenever the data values are not all identical • This variation can come from different sources, such as the model or the factor • There is always some left-over variation that can’t be explained by any of the other sources. This source is called the error
Variation • Variation is the sum of squares of the deviations of the values from the mean of those values • As long as the values are not identical, there will be variation • Abbreviated as SS for Sum of Squares
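As a minimal sketch of this definition, the variation (SS) can be computed directly in Python. The function name `variation` and the data values are made up for illustration and do not come from the slides.

```python
def variation(values):
    """Return the sum of squared deviations from the mean (SS)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

# Hypothetical data, chosen only for illustration
data = [2.0, 4.0, 4.0, 6.0]
ss = variation(data)                      # deviations are -2, 0, 0, 2

# When every value is identical, there is no variation
identical = variation([5.0, 5.0, 5.0])
```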
Degrees of Freedom • The degrees of freedom are the number of values that are free to vary once certain parameters have been established • Usually, this is one less than the sample size, but in general, it’s the number of values minus the number of parameters being estimated • Abbreviated as df
Variance • The sample variance is roughly the average squared deviation from the mean • Found by dividing the variation by the degrees of freedom (not by the sample size) • Variance = Variation / df • Abbreviated as MS for Mean Square • MS = SS / df
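The relationship MS = SS / df can be checked against Python's standard library. This is only a sketch with a hypothetical sample; it assumes the usual single-sample case where df is one less than the sample size.

```python
import statistics

# Hypothetical sample, chosen only for illustration
data = [2.0, 4.0, 4.0, 6.0]

mean = sum(data) / len(data)
ss = sum((x - mean) ** 2 for x in data)   # variation (SS)
df = len(data) - 1                        # one less than the sample size
ms = ss / df                              # variance (MS) = SS / df

# statistics.variance uses the same n - 1 divisor, so the two should agree
assert abs(ms - statistics.variance(data)) < 1e-12
```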
F • F is the F test statistic • There will be an F test statistic for each source except for the error and total • F is the ratio of two sample variances • The MS column contains variances • The F test statistic for each source is the MS for that row divided by the MS of the error row
F • F requires a pair of degrees of freedom, one for the numerator and one for the denominator • The numerator df is the df for the source • The denominator df is the df for the error row • F is always a right tail test
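Written as a formula, the F statistic and its right-tail decision rule look roughly like this. The subscripts and the significance level α are notation assumed here for illustration; the slides do not use these symbols.

```latex
F = \frac{MS_{\text{source}}}{MS_{\text{error}}},
\qquad
\text{reject } H_0 \text{ if } F > F_{\alpha;\; df_{\text{source}},\; df_{\text{error}}}
```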
The ANOVA Table • The ANOVA table is composed of rows, and each row represents one source of variation • For each source of variation … • The variation is in the SS column • The degrees of freedom are in the df column • The variance is in the MS column • The MS value is found by dividing the SS by the df
ANOVA Table • The complete ANOVA table can be generated by most statistical packages and spreadsheets • We’ll concentrate on understanding how the table works rather than the formulas for the variations
The ANOVA Table The explained* variation has different names depending on the particular type of ANOVA problem
Example 1 The Sum of Squares and Degrees of Freedom are given. Complete the table.
Example 1 – Find Totals Add the SS and df columns to get the totals.
Example 1 – Find MS Divide SS by df to get MS.
Example 1 – Find F Divide the MS of the source by the MS of the error: F = 6.30 / 4.50 = 1.40
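The three steps of Example 1 (totals, then MS, then F) can be sketched as one small function. The function name `complete_table` and the SS and df inputs below are hypothetical, since the Example 1 table itself is not reproduced in this text; only the final line uses the two MS values quoted in the example.

```python
def complete_table(ss_source, df_source, ss_error, df_error):
    """Fill in totals, MS, and F for a two-row ANOVA table."""
    ms_source = ss_source / df_source     # MS = SS / df
    ms_error = ss_error / df_error
    return {
        "ss_total": ss_source + ss_error,  # totals are column sums
        "df_total": df_source + df_error,
        "ms_source": ms_source,
        "ms_error": ms_error,
        "f": ms_source / ms_error,         # F = MS(Source) / MS(Error)
    }

# Hypothetical inputs, not the values from the Example 1 table
table = complete_table(ss_source=40.0, df_source=4, ss_error=100.0, df_error=20)

# The F ratio from the two MS values quoted in Example 1
f_example_1 = round(6.30 / 4.50, 2)
```

Keeping the error row as explicit arguments mirrors the table layout: every F in the table shares the same denominator.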
Notes about the ANOVA • The MS(Total) isn’t actually part of the ANOVA table, but it represents the sample variance of the response variable, so it’s useful to find • The total df is one less than the sample size • To finish the hypothesis test, you would need to find either a critical F value or the p-value
Example 2 Complete the table
Example 2 – Step 1 SS / df = MS, so 106.6 / df = 21.32. Solving for df gives df = 5. F = MS(Source) / MS(Error), so 2.60 = 21.32 / MS(Error). Solving gives MS(Error) = 8.20.
Example 2 – Step 2 For the error row, SS / df = MS, so SS / 26 = 8.20. Solving for SS gives SS = 213.2. The total df is the sum of the other df, so 5 + 26 = 31.
Example 2 – Step 3 Find the total SS by adding the other SS values: 106.6 + 213.2 = 319.8
Example 2 – Step 4 Find the MS(Total) by dividing SS by df. 319.8 / 31 = 10.32
Example 2 – Notes • Since the total df is 31, the sample size was 32 • Since the sample variance was 10.32 and the standard deviation is the square root of the variance, the sample standard deviation is 3.21
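The arithmetic in Example 2 can be retraced in a few lines of Python. This is a sketch that uses only the four numbers quoted in the steps (106.6, 21.32, 2.60, and 26); the variable names are chosen here for readability.

```python
import math

# Values given in Example 2
ss_source = 106.6
ms_source = 21.32
f_stat = 2.60
df_error = 26

# Step 1: work backwards from MS = SS / df and F = MS(Source) / MS(Error)
df_source = round(ss_source / ms_source)   # df = SS / MS
ms_error = ms_source / f_stat              # MS(Error) = MS(Source) / F

# Step 2: error SS and total df
ss_error = ms_error * df_error             # SS = MS * df
df_total = df_source + df_error

# Steps 3 and 4: total SS and MS(Total)
ss_total = ss_source + ss_error
ms_total = ss_total / df_total             # sample variance of the response

# Notes: sample size and sample standard deviation
n = df_total + 1                           # total df is one less than n
std_dev = math.sqrt(ms_total)

print(df_source, round(ms_error, 2), round(ss_error, 1))
print(df_total, round(ss_total, 1), round(ms_total, 2))
print(n, round(std_dev, 2))
```

The code takes the square root of the unrounded MS(Total), while the notes use the rounded 10.32; both give the same value to two decimal places.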
Example 3 The sample size is n = 20. Work this one out on your own!
Example 3 - Solution How did you do?