Introduction to ANOVA
Introduction to Statistics, Chapter 13
Apr 13-15, 2010, Classes #23-24
Where are we? • Concluded material on t-tests & introduction of hypothesis testing • Good conceptual & computational foundation for more advanced inferential statistics • Turn now to ANOVA – more complex statistic • Good preparation for more complex situations
A slightly different type of research design… • You have three different groups of people, and want to compare the outcomes of these three groups
What are you expecting? • 1. You might be predicting specific differences (e.g., group A will have a higher score than both group B and group C) • 2. You might be predicting a difference somewhere, without specifying exactly where
What to do? • If you have a concrete idea of where the differences lie, based on theory and previous research, you can conduct planned comparisons • Directly test just for where you think the differences are
What to do? • If you think there’s a difference somewhere, but you want to be able to make all the possible comparisons to see where it might be, you can’t use this strategy
Why not? • With each statistically significant test, there is a probability p that the result is due to chance alone, even if the null hypothesis is correct • More tests, more likelihood of a Type I error
What to do? • Need a new test, to make comparisons across all levels of predictor variable • ANOVA • Stands for analysis of variance • Just like t-tests, different types • Will discuss 3 types
First type of ANOVA • Comparable to independent samples t-test • Used with one predictor variable • Used with continuous criterion variable • Used with between-subjects design
The idea behind ANOVA • Key question = where does variability lie? • Two sources: • Within people in each group or condition • Between groups or conditions • If research hypothesis is true, where will there be the most variability? • What if the null hypothesis is true?
Illustrate Logic of ANOVA We want to evaluate the effects of 4 different drugs on participants' level of depression as measured by the Beck Depression Inventory. An ANOVA allows us to quantify how far apart the sample means must be before we are no longer willing to say they are all "approximately" equal.
Introduction to ANOVA • ANOVA – the ANalysis Of Variance • (1) Inferential hypothesis-testing procedure • (2) Tremendous advantage over t-tests: • used to compare MULTIPLE (two or more) treatments • (3) Provides researchers with much greater flexibility in design and analysis of experiments
Introduction to ANOVA • ANOVA – the ANalysis Of Variance • (4) Multiple Forms – In Chapter 13 we’ll look at the simplest: Single-factor, independent measures ANOVA • (a) factor: new name for the independent variable • (b) independent measures: separate sample for each treatment • (c) level: the individual treatment conditions that make up a factor
Factors and Levels • Can be multiple factors (IVs) and levels (variations of each factor) • Expressed as factors × levels (e.g., a 2 × 3 design) • How many factors? • How many levels?
Example of ANOVA • Four different test times (8am, 12pm, 4pm, and 8pm) • Does time of test affect scores? • ANOVA uses variance to assess differences among the sample means
The Logic of ANOVA • (1) First, determine total variability for data set
The Logic of ANOVA • (2) Next, break this variability into two components: • (a) Between-Treatments variance – two sources: • Treatment Effect: Differences are caused by treatments. • Chance: Differences simply due to chance. • (b) Within-Treatments variance – one source: • Chance: Differences simply due to chance.
Partitioning variance • Math behind ANOVA = • Variance between groups (MS between) • Divided by • Variance within groups (MS within, or MS error; like pooled variance from independent samples t-test) • This ratio referred to as F value
Forming an F-Ratio • (3) Finally, determine the variance due to the treatments alone by forming an F-ratio.

F = Variance Between-Treatments / Variance Within-Treatments

Or, in terms of sources:

F = (Treatment Effect + Differences Due to Chance) / Differences Due to Chance

• If no treatment effect exists, F should be near 1.00 • If there IS some treatment effect, F > 1.00 (but not automatically statistically significant)
Thinking about F • Variance can't be negative, so F is always positive • If F ≈ 1, there is about the same amount of variance between groups as within groups → keep the null hypothesis • If F > 1, there is more variance between groups than within groups → if F is large enough, reject the null
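To build intuition for why F hovers around 1 when the null hypothesis is true, here is a small simulation sketch in Python (assuming NumPy and SciPy are available; the population mean of 50, SD of 10, group size of 10, and 10,000 repetitions are arbitrary choices made only for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulate the NULL hypothesis: three groups drawn from the SAME population.
f_values = []
for _ in range(10_000):
    g1, g2, g3 = (rng.normal(loc=50, scale=10, size=10) for _ in range(3))
    f_stat, _ = stats.f_oneway(g1, g2, g3)
    f_values.append(f_stat)

# With no real treatment effect, the F values average out close to 1,
# and only about 5% of them exceed the .05 critical value.
print(round(float(np.mean(f_values)), 2))
print(round(stats.f.ppf(0.95, dfn=2, dfd=27), 2))  # critical F for df = (2, 27)
```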
How big is big enough? • Just like other test statistics, F has a critical value • We will be using Table B.4 (pages 590-592) • The critical value of F depends on: • Significance level (the table uses either .01 or .05; the ANOVA F-test is non-directional, so there is no separate one-tailed/two-tailed version) • Number of groups being compared (numerator df = number of groups − 1) • Number of participants (denominator df = sum of df across all groups, or sample size − number of groups)
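If the printed table isn't handy, the same critical value can be looked up in software. A minimal sketch, assuming SciPy is installed (the df values shown match Example 1 later in these slides):

```python
from scipy import stats

alpha = 0.05
df_between = 3   # numerator df = number of groups - 1
df_within = 8    # denominator df = N - k

f_critical = stats.f.ppf(1 - alpha, dfn=df_between, dfd=df_within)
print(round(f_critical, 2))   # about 4.07, matching Table B.4
```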
Error Term • Error due to chance • Does the treatment effect (difference among means) produce greater variability between groups than that expected by chance? • The denominator in the F ratio
New Terms and Symbols • k = number of treatment conditions (levels of the factor). For an independent-measures study, k = # of separate samples. • n = number of scores in a treatment condition • N = total number of scores in the whole study (N = nk when sample sizes are equal) • T = sum of scores for each treatment condition • G = sum of all scores in the study (Grand Total)
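A minimal sketch of this notation in Python; the three treatment conditions and their scores here are made up purely for illustration:

```python
# Hypothetical scores for three treatment conditions (k = 3).
groups = {
    "treatment_1": [3, 5, 4],
    "treatment_2": [6, 8, 7],
    "treatment_3": [2, 4, 3],
}

k = len(groups)                                              # number of treatment conditions
n = {name: len(scores) for name, scores in groups.items()}   # scores per condition
N = sum(n.values())                                          # total number of scores (N = nk)
T = {name: sum(scores) for name, scores in groups.items()}   # sum of scores per condition
G = sum(T.values())                                          # grand total

print(k, N, G)   # 3, 9, 42 for this made-up data
```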
Hypothesis Testing with ANOVA (4 steps) • STEP 1: State the hypotheses • H0: µ1 = µ2 = µ3 = … = µk (k = number of factor levels) • H1: At least one µ is different from the others (the population means are not all equal)
Hypothesis Testing with ANOVA • STEP 2: Locate the critical region (symbols k, n, N, T, G as defined above) • α = .05 • Calculate df_between = k − 1 • Calculate df_within = N − k • Calculate df_total = N − 1 • Critical F will be provided for you • df_between + df_within = df_total (always!) • Begin to fill in the Source Table (ANOVA Table)
Hypothesis Testing with ANOVA: STEP 2 continued…

Basic ANOVA Table

Source     SS           df      MS           F
Between    SS_between   k − 1   MS_between   F_calculated
Within     SS_within    N − k   MS_within
Total      SS_total     N − 1
Hypothesis Testing with ANOVA • STEP 3: Collect data and compute sample statistics
• SS_between = Σ(T²/n) − G²/N
• SS_within = SS inside each treatment = SS1 + SS2 + SS3 + ... + SSk
• SS_total = ΣX² − G²/N, or SS_between + SS_within
• MS_between = SS_between / df_between
• MS_within = SS_within / df_within
• F = MS_between / MS_within
• Fill in the source table (ANOVA Table)
*Note: SS_between + SS_within = SS_total (always!); n, N, T, and G are as defined above
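Here is a sketch of these Step 3 formulas in plain Python, using the Example 1 data that appears below (groups A-D correspond to the four class times); it is just one way to organize the arithmetic, not a required procedure:

```python
# Scores for the four treatment conditions from Example 1.
groups = {
    "A": [25, 28, 22],
    "B": [30, 29, 30],
    "C": [27, 20, 21],
    "D": [22, 27, 24],
}

k = len(groups)
N = sum(len(scores) for scores in groups.values())
G = sum(sum(scores) for scores in groups.values())
sum_x_squared = sum(x * x for scores in groups.values() for x in scores)

def ss(scores):
    """SS inside one treatment: sum of squared deviations from its mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

ss_between = sum(sum(s) ** 2 / len(s) for s in groups.values()) - G ** 2 / N
ss_within = sum(ss(s) for s in groups.values())
ss_total = sum_x_squared - G ** 2 / N            # equals ss_between + ss_within

df_between, df_within = k - 1, N - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

# About 80.92, 60.0, and 3.6 for this data set.
print(round(ss_between, 2), round(ss_within, 2), round(f_ratio, 2))
```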
Hypothesis Testing with ANOVA • STEP 4: Make a decision • Given the critical F value (F_critical), which will be provided, decide whether or not to reject the null. • F_obtained < F_critical → fail to reject H0. • F_obtained > F_critical → reject H0. • Use Appendix B.4 (pages 592-594) to find F_critical • Bold-faced values = F_critical for α = 0.01; light-faced values = F_critical for α = 0.05 • df numerator = df_between • df denominator = df_within
Hypothesis Testing with ANOVA: Need to Organize Lots of Calculations

Basic ANOVA Table

Source     SS           df      MS           F
Between    SS_between   k − 1   MS_between   F_calculated
Within     SS_within    N − k   MS_within
Total      SS_total     N − 1
Example 1 A researcher is interested in whether class time affects exam scores. There are four different class times being examined: 8am, 12pm, 4pm, and 8pm. Run an ANOVA, α = .05, to see if a significant difference exists between these treatments. Null Hypothesis: H0: µ1 = µ2 = µ3 = µ4 Alternative Hypothesis: HA: At least one µ is different from the others (the means are not all equal)
Example 1 DATA

        Trt. 1   Trt. 2   Trt. 3   Trt. 4
          25       30       27       22
          28       29       20       27
          22       30       21       24
M         25       29.67    22.67    24.33
T         75       89       68       73
SS        18       0.67     28.67    12.67
n          3        3        3        3

ΣX² = 7893    G = 305    N = 12    k = 4
Example 1 Calculations

SS_between = Σ(T²/n) − G²/N
SS_between = (75²/3 + 89²/3 + 68²/3 + 73²/3) − (93,025/12)
SS_between = (5625/3 + 7921/3 + 4624/3 + 5329/3) − 7752.083
SS_between = (1875 + 2640.33 + 1541.33 + 1776.33) − 7752.083
SS_between = 7832.99 − 7752.083
SS_between = 80.91

SS_within = SS1 + SS2 + SS3 + SS4
SS_within = 18 + 0.67 + 28.67 + 12.67
SS_within = 60.01

SS_total = ΣX² − G²/N, or SS_between + SS_within
SS_total = 7893 − 7752.083, or 80.91 + 60.01
SS_total = 140.92
Example 1 ANOVA and Decision

Source     SS        df    MS       F_calculated
Between     80.91     3    26.97    3.596
Within      60.01     8     7.50
TOTAL      140.92    11

F_critical = 4.07
F_calculated < F_critical → fail to reject H0
3.596 < 4.07 → fail to reject H0

Use Appendix B.4 (pp. 592-594) for F_critical
df numerator = 3 (df for between)
df denominator = 8 (df for within)
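As an optional check, the same result comes out of SciPy's built-in one-way ANOVA; this is a verification sketch (assuming SciPy is installed), not part of the original hand calculation:

```python
from scipy import stats

# Example 1 data: exam scores for the four class times.
trt1 = [25, 28, 22]   # 8am
trt2 = [30, 29, 30]   # 12pm
trt3 = [27, 20, 21]   # 4pm
trt4 = [22, 27, 24]   # 8pm

f_stat, p_value = stats.f_oneway(trt1, trt2, trt3, trt4)
print(round(f_stat, 3), round(p_value, 3))
# F is about 3.596 and p is greater than .05, so we fail to reject H0,
# matching the table-based decision above.
```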
Example 2 A researcher is interested in whether a new drug affects the activity level of lab animals. There are three different doses being examined: low, medium, and high. Run an ANOVA, α = .05, to see if a significant difference exists between these doses. Null Hypothesis: Alternative:
Example 2 DATA

      Dose 1 (lo)   Dose 2 (med)   Dose 3 (hi)
          0              1              5
          1              3              8
          3              4              6
          0              1              4
          1              1              7
M
T
SS
n         5              5              5

ΣX² =       G =       N =       k =
Example 2 Calculations

SS_between = Σ(T²/n) − G²/N
SS_between =
SS_between =
SS_between =
SS_between =

SS_within = SS1 + SS2 + SS3
SS_within =
SS_within =

SS_total = ΣX² − G²/N, or SS_between + SS_within
SS_total =
SS_total =
Example 2 ANOVA and Decision

Source     SS    df    MS    F_calculated
Between
Within
TOTAL

F_critical =
If F_obtained < F_critical → fail to reject H0
df numerator =     (df for between)
df denominator =     (df for within)
Effect size • Don't use Cohen's d anymore… • Instead, use r² – as always, this refers to the amount of variance explained by knowing which group someone belongs to: SS between treatments divided by SS total (SPSS will compute it – check off "effect size estimation" under "options")

r² = SS_between / (SS_between + SS_within) = SS_between / SS_total

• When computed for ANOVA, r² is frequently referred to as eta squared (η²)
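A quick sketch of this calculation, using the Example 1 sums of squares from earlier:

```python
# Eta squared (r^2) from the Example 1 source table.
ss_between = 80.91
ss_within = 60.01
ss_total = ss_between + ss_within

eta_squared = ss_between / ss_total
print(round(eta_squared, 3))   # about 0.574: roughly 57% of the variance
                               # in exam scores is associated with class time
```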
There’s a difference – now, where is it? • If your F value is greater than your critical value, you can reject the null hypothesis • But, because you have more than two groups, you just know there's a difference somewhere
Post-hoc tests • These look for where the differences are
One option: Tukey Honestly Significant Difference Test (HSD) • Strategy: test computes how large the difference between two groups needs to be, based on variance and sample size; then, any two groups whose difference exceeds this are considered to be significantly different
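For reference, one way to run Tukey's HSD in software is sketched below, using the Example 1 data; this assumes the statsmodels package is installed and is only an illustration of the idea, not part of the original slides:

```python
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Example 1 scores flattened into one list, with a group label per score.
scores = [25, 28, 22, 30, 29, 30, 27, 20, 21, 22, 27, 24]
groups = ["8am"] * 3 + ["12pm"] * 3 + ["4pm"] * 3 + ["8pm"] * 3

# Tukey HSD: every pair of groups is compared, and a difference is flagged
# as significant only if it exceeds the honestly significant difference.
result = pairwise_tukeyhsd(endog=scores, groups=groups, alpha=0.05)
print(result)
```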
A second option: Scheffé • More conservative than Tukey • The strategy: uses the overall variance estimate (MS error) from the overall ANOVA as the denominator, and uses the numerator df from the overall ANOVA → keeps the critical F higher • Uses MS between computed for the specific comparison between two groups at a time as the numerator
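A sketch of the Scheffé idea as described on this slide, comparing the 8am and 12pm groups from Example 1; the plugged-in values (MS within = 7.50, k = 4) come from the earlier Example 1 ANOVA, and this is meant to illustrate the logic rather than serve as a full implementation:

```python
# Scheffe comparison of two groups, using the overall ANOVA's error term
# and numerator df so the test stays conservative.
t1, t2 = [25, 28, 22], [30, 29, 30]          # 8am and 12pm scores
n1, n2 = len(t1), len(t2)
T1, T2 = sum(t1), sum(t2)
G_pair, N_pair = T1 + T2, n1 + n2

# SS between computed for just these two groups
ss_between_pair = (T1 ** 2 / n1 + T2 ** 2 / n2) - G_pair ** 2 / N_pair

k = 4                       # number of groups in the overall ANOVA
ms_within_overall = 7.50    # MS error from the overall Example 1 ANOVA
ms_between_pair = ss_between_pair / (k - 1)   # overall numerator df, not 1

f_scheffe = ms_between_pair / ms_within_overall
print(round(f_scheffe, 2))  # about 1.45; compare against the overall
                            # critical F of 4.07, so not significant
```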
Example 3 • A high school girls' basketball coach is unhappy with the free-throw shooting percentage of her team. In fact, last year her team finished last in the league in this category. This season she wants significant improvement, so she has hired a sport psychologist to implement new techniques during preseason practices and determine which method will best help her players improve. • She teaches them two focusing strategies: first an internal strategy, and second an external strategy • She then allows half to continue using their preferred strategy while requiring the other half to change their focus
Example 3 • What is the DV? • What is the IV? • Make a diagram of this design. • How many groups are being tested?
Example 3 • For a fair comparison, during preseason competition she records only the first 15 free throws taken by each of her 16 players. The numbers of shots made are listed.
Example 3 • Are the groups different?