Experimental Statistics - week 5 Chapter 9: Multiple Comparisons Chapter 15: Randomized Complete Block Design (15.3)
PC SAS on Campus: Library, BIC, Student Center
SAS Learning Edition: $125, http://support.sas.com/rnd/le/index.html
1-Factor ANOVA Model
$y_{ij} = \mu_i + \varepsilon_{ij}$ or $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
where $y_{ij}$ is the observed data, $\mu_i$ (equivalently $\mu + \alpha_i$) is the mean for the $i$th treatment, and $\varepsilon_{ij}$ is the unexplained part.
In words:
TSS (total SS) = total sample variability among the $y_{ij}$ values
SSB (SS “between”) = variability explained by differences in group means
SSW (SS “within”) = unexplained variability (within groups)
Analysis of Variance Table
Note: unequal sample sizes are allowed.
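The verbal definitions above correspond to the standard one-way ANOVA formulas, sketched below. The notation is an assumption, since the slide's own table did not survive extraction: $k$ groups, $n_i$ observations in group $i$, $N = \sum_i n_i$, group means $\bar{y}_{i\cdot}$ and grand mean $\bar{y}_{\cdot\cdot}$. Unequal $n_i$ are allowed, as the note says.

```latex
\begin{align*}
\text{TSS} &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2
  && \text{df} = N - 1 \\
\text{SSB} &= \sum_{i=1}^{k} n_i\,(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2
  && \text{df} = k - 1 \\
\text{SSW} &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2
  && \text{df} = N - k \\
\text{TSS} &= \text{SSB} + \text{SSW},
  \qquad F = \frac{\text{SSB}/(k-1)}{\text{SSW}/(N-k)}
\end{align*}
```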
CAR DATA Example
For this analysis, 5 gasoline types (A - E) were to be tested. Twenty cars were selected for testing and were assigned randomly to the groups (i.e., the gasoline types). Thus, in the analysis, each gasoline type was tested on 4 cars. A performance-based octane reading was obtained for each car, and the question is whether the gasolines differ with respect to this octane reading.
A   91.7   91.2   90.9   90.6
B   91.7   91.9   90.9   90.9
C   92.4   91.2   91.6   91.0
D   91.8   92.2   92.0   91.4
E   93.1   92.9   92.4   92.4
Problem 1. Descriptive Statistics for CAR Data
The MEANS Procedure
Analysis Variable: octane
Mean         Std Dev     Minimum      Maximum
91.7100000   0.7062876   90.6000000   93.1000000
Problem 3. Descriptive Statistics by Gasoline
The MEANS Procedure
Analysis Variable: octane
gas   Mean         Std Dev     Minimum      Maximum
A     91.1000000   0.4690416   90.6000000   91.7000000
B     91.3500000   0.5259911   90.9000000   91.9000000
C     91.5500000   0.6191392   91.0000000   92.4000000
D     91.8500000   0.3415650   91.4000000   92.2000000
E     92.7000000   0.3559026   92.4000000   93.1000000
Gasoline Example - Completely Randomized Design -- All 5 Gasolines
The GLM Procedure
Dependent Variable: octane

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              4       6.10800000    1.52700000      6.80   0.0025
Error             15       3.37000000    0.22466667
Corrected Total   19       9.47800000

R-Square   Coeff Var   Root MSE   octane Mean
0.644440   0.516836    0.473990   91.71000

Source   DF   Type I SS    Mean Square   F Value   Pr > F
gas       4   6.10800000   1.52700000       6.80   0.0025
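As a cross-check on the GLM table, the sums of squares can be recomputed directly from the car data. This is a minimal Python sketch, not the course's SAS workflow; the function name `one_way_anova` and the dictionary layout are illustrative assumptions.

```python
# Octane readings from the CAR DATA slide, keyed by gasoline type.
octane = {
    "A": [91.7, 91.2, 90.9, 90.6],
    "B": [91.7, 91.9, 90.9, 90.9],
    "C": [92.4, 91.2, 91.6, 91.0],
    "D": [91.8, 92.2, 92.0, 91.4],
    "E": [93.1, 92.9, 92.4, 92.4],
}


def one_way_anova(groups):
    """Return SSB, SSW, their degrees of freedom, and F for a dict of samples.

    Hypothetical helper for illustration; group sizes may be unequal.
    """
    all_values = [y for ys in groups.values() for y in ys]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    ssb = 0.0  # variability explained by differences in group means
    ssw = 0.0  # unexplained variability within groups
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        ssb += len(ys) * (group_mean - grand_mean) ** 2
        ssw += sum((y - group_mean) ** 2 for y in ys)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ssb / df_between) / (ssw / df_within)
    return {"ssb": ssb, "ssw": ssw, "df_between": df_between,
            "df_within": df_within, "f": f_value}


result = one_way_anova(octane)
print(f"SSB = {result['ssb']:.3f} on {result['df_between']} df")
print(f"SSW = {result['ssw']:.3f} on {result['df_within']} df")
print(f"F   = {result['f']:.2f}")
```

The Model and Error rows of the SAS table correspond to `ssb` and `ssw` here. The p-value is not computed, because that would need an F distribution routine that the standard library does not provide.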
Problem 6. 1-factor ANOVA for first 3 GAS Types
The GLM Procedure
Dependent Variable: octane

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2       0.40666667    0.20333333      0.69   0.5248
Error              9       2.64000000    0.29333333
Corrected Total   11       3.04666667

R-Square   Coeff Var   Root MSE   octane Mean
0.133479   0.592996    0.541603   91.33333

Source   DF   Type I SS    Mean Square   F Value   Pr > F
gas       2   0.40666667   0.20333333       0.69   0.5248
Question 1: Which gasolines are different?
Question 2: Why didn’t we just do t-tests to compare all combinations of gasolines, i.e., compare A vs B, A vs C, ..., D vs E?
Simulation: using a computer to generate data under certain known conditions and observing the outcomes.
Setting: Normal population with $\mu = 20$ and $\sigma = 5$.
Simulation Experiment: Generate 2 samples of size n = 10 from this population and run a t-test to compare the sample means, i.e., test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \neq \mu_2$.
Question: What do we expect to happen?
Simulation Results:
Sample   Mean   Std Dev
1        21.6   4.0
2        21.1   5.4
t-test procedure ($\alpha = .05$): Reject $H_0$ if |t| > 2.101
t = .235, so we do not reject $H_0$.
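The t value on this slide can be reproduced from the two sample summaries. The sketch below assumes the pooled-variance two-sample t statistic, which is consistent with the 18 degrees of freedom implied by the critical value 2.101. The function name `pooled_t` is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error


T_CRIT = 2.101  # two-sided 5% critical value for 18 df, as quoted on the slide

t_stat = pooled_t(21.6, 4.0, 10, 21.1, 5.4, 10)
reject_h0 = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```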
Now suppose we obtain 10 samples and test.
Simulation results:
Sample   Mean   Std Dev
1        21.6   4.0
2        21.1   5.4
3        20.9   6.2
4        18.3   3.2
5        23.1   6.7
6        18.6   4.8
7        22.2   5.8
8        19.1   5.9
9        20.3   2.5
10       19.3   3.2
Note: Comparing means 4 vs 5 we get t = 2.33. What does this mean?
Suppose we run all possible t-tests at significance level $\alpha = .05$ to compare the means of 10 samples, each of size n = 10, from this population. It can be shown that there is a 63% chance that at least one pair of means will be declared significantly different from each other.
The F-test in ANOVA controls the overall significance level.
Probability of finding at least 2 of k means significantly different using multiple t-tests at the $\alpha = .05$ level when all means are actually equal.
k    Prob.
2    .05
3    .13
4    .21
5    .29
10   .63
20   .92
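The inflation shown in this table can be explored with a small simulation in the spirit of the earlier slides. The Python sketch below draws k samples of size 10 from the same normal population (mean 20, standard deviation 5), runs every pairwise pooled t-test, and records how often at least one test rejects. The replication count, the seed, and the use of separate pooled t-tests for each pair are assumptions; the slide's tabulated values may have been derived differently, so the estimates should only be expected to land in the same neighborhood.

```python
import math
import random
from itertools import combinations

T_CRIT = 2.101  # two-sided 5% critical value for 18 df (valid only for n = 10 per sample)


def summarize(sample):
    """Return (mean, sample variance) of a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return mean, variance


def pooled_t(stats1, stats2, n):
    """Pooled two-sample t statistic for two samples of equal size n."""
    (mean1, var1), (mean2, var2) = stats1, stats2
    pooled_var = (var1 + var2) / 2  # equal sample sizes, so a simple average
    return (mean1 - mean2) / math.sqrt(pooled_var * 2 / n)


def familywise_error(k, n=10, reps=2000, seed=1):
    """Estimate P(at least one pairwise t-test rejects) when all k means are equal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        stats = [summarize([rng.gauss(20, 5) for _ in range(n)]) for _ in range(k)]
        if any(abs(pooled_t(a, b, n)) > T_CRIT for a, b in combinations(stats, 2)):
            hits += 1
    return hits / reps


rate_2 = familywise_error(2)
rate_10 = familywise_error(10)
print(f"k = 2:  estimated error rate {rate_2:.3f}")
print(f"k = 10: estimated error rate {rate_10:.3f}")
```

With only two groups there is a single test, so the estimate should sit near the nominal .05. With ten groups there are 45 tests, and the chance of at least one false rejection is many times larger.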
Fisher’s Least Significant Difference (LSD)
Protected LSD: Preceded by an F-test for overall significance. Only use the LSD if F is significant.
Unprotected LSD: Not preceded by an F-test (like individual t-tests).
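PROC GLM;   /* or PROC ANOVA */
  CLASS gas;            /* gas is the treatment (grouping) factor */
  MODEL octane = gas;   /* 1-factor model for the octane reading */
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / LSD;      /* Fisher's LSD comparisons of the gas means */
RUN;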
Gasoline Example - Completely Randomized Design
The GLM Procedure
t Tests (LSD) for octane
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                            0.05
Error Degrees of Freedom         15
Error Mean Square                0.224667
Critical Value of t              2.13145
Least Significant Difference     0.7144

Means with the same letter are not significantly different.
t Grouping   Mean      N   gas
A            92.7000   4   E
B            91.8500   4   D
B C          91.5500   4   C
B C          91.3500   4   B
C            91.1000   4   A
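The Least Significant Difference in this output can be reproduced by hand. The sketch below assumes the usual equal-sample-size form, LSD = t × sqrt(2 × MSE / n), and takes the critical t value directly from the SAS output rather than computing it. The variable names are illustrative.

```python
import math
from itertools import combinations

# Values copied from the SAS output above.
MSE = 0.224667       # Error Mean Square
N_PER_GROUP = 4      # cars per gasoline
T_CRIT = 2.13145     # Critical Value of t (15 error df, alpha = .05)

means = {"E": 92.70, "D": 91.85, "C": 91.55, "B": 91.35, "A": 91.10}

# Least significant difference for two means based on n observations each.
lsd = T_CRIT * math.sqrt(2 * MSE / N_PER_GROUP)

# A pair is declared different when its mean difference exceeds the LSD.
lsd_pairs = [(a, b) for a, b in combinations(means, 2)
             if abs(means[a] - means[b]) > lsd]

print(f"LSD = {lsd:.4f}")
print("Significantly different pairs:", lsd_pairs)
```

The resulting pairs agree with the letter grouping: E differs from every other gasoline, D differs from A, and no other pair is separated.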
Bonferroni Multiple Comparisons (BSD) Number of Pairwise Comparisons
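PROC GLM;   /* or PROC ANOVA */
  CLASS gas;            /* gas is the treatment (grouping) factor */
  MODEL octane = gas;   /* 1-factor model for the octane reading */
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / BON;      /* Bonferroni comparisons of the gas means */
RUN;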
Gasoline Example - Completely Randomized Design
The GLM Procedure
Bonferroni (Dunn) t Tests for octane
NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha                             0.05
Error Degrees of Freedom          15
Error Mean Square                 0.224667
Critical Value of t               3.28604
Minimum Significant Difference    1.1014

Means with the same letter are not significantly different.
Bon Grouping   Mean      N   gas
A              92.7000   4   E
A B            91.8500   4   D
B              91.5500   4   C
B              91.3500   4   B
B              91.1000   4   A
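The Bonferroni output can be checked the same way as the LSD output. The sketch below assumes the standard adjustment: with k means there are k(k-1)/2 pairwise comparisons, and each is tested at alpha divided by that count. The larger critical t value (3.28604) is copied from the SAS output rather than computed. Names are illustrative.

```python
import math
from itertools import combinations

# Values copied from the SAS output above.
MSE = 0.224667        # Error Mean Square
N_PER_GROUP = 4       # cars per gasoline
T_CRIT_BON = 3.28604  # Critical Value of t after the Bonferroni adjustment
ALPHA = 0.05

means = {"E": 92.70, "D": 91.85, "C": 91.55, "B": 91.35, "A": 91.10}

# Number of pairwise comparisons among k means: k(k-1)/2.
n_comparisons = math.comb(len(means), 2)
alpha_per_comparison = ALPHA / n_comparisons

# Minimum significant difference, same form as the LSD but with the larger t.
msd = T_CRIT_BON * math.sqrt(2 * MSE / N_PER_GROUP)

bon_pairs = [(a, b) for a, b in combinations(means, 2)
             if abs(means[a] - means[b]) > msd]

print(f"{n_comparisons} comparisons, each at alpha = {alpha_per_comparison:.4f}")
print(f"Minimum significant difference = {msd:.4f}")
print("Significantly different pairs:", bon_pairs)
```

Compared with the LSD, the Bonferroni procedure no longer separates E from D or D from A, which illustrates the price paid for controlling the experimentwise error rate.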
Extracted from Ex. 8.2, pages 390-391
3 Methods for Reducing Hostility
12 students displaying similar hostility were randomly assigned to 3 treatment methods. Scores (HLT) at the end of the study were recorded.
Method 1   96   79   91   85
Method 2   77   76   74   73
Method 3   66   73   69   66
Test: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: at least two of the means differ
ANOVA Table Output - hostility data - calculations done in class
Source            SS       df   MS       F      p-value
Between samples   767.17    2   383.58   16.7   <.001
Within samples    205.74    9    22.86
Totals            972.91
Hostility Data - Completely Randomized Design
The GLM Procedure
t Tests (LSD) for score
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                            0.05
Error Degrees of Freedom         9
Error Mean Square                22.86111
Critical Value of t              2.26216
Least Significant Difference     7.6482

Means with the same letter are not significantly different.
t Grouping   Mean     N   method
A            87.750   4   M1
B            75.000   4   M2
B            68.500   4   M3
Hostility Data - Completely Randomized Design
The GLM Procedure
Bonferroni (Dunn) t Tests for score
NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha                             0.05
Error Degrees of Freedom          9
Error Mean Square                 22.86111
Critical Value of t               2.93332
Minimum Significant Difference    9.9173

Means with the same letter are not significantly different.
Bon Grouping   Mean     N   method
A              87.750   4   M1
B              75.000   4   M2
B              68.500   4   M3
Some Multiple Comparison Techniques in SAS
FISHER’S LSD (LSD)
BONFERRONI (BON)
STUDENT-NEWMAN-KEULS (SNK)
DUNCAN
DUNNETT
RYAN-EINOT-GABRIEL-WELCH (REGWQ)
SCHEFFE
TUKEY
Balloon Data
Col. 1-2 - observation number
Col. 3 - color (1=pink, 2=yellow, 3=orange, 4=blue)
Col. 4-7 - inflation time in seconds
 1122.4
 2324.6
 3120.3
 4419.8
 5324.3
 6222.2
 7228.5
 8225.7
 9320.2
10119.6
11228.8
12424.0
13417.1
14419.3
15324.2
16115.8
17218.3
18117.5
19418.7
20322.9
21116.3
22414.0
23416.6
24218.1
25218.9
26416.0
27220.1
28322.5
29316.0
30119.3
31115.9
32320.3
ANOVA --- Balloon Data
General Linear Models Procedure
Dependent Variable: TIME

Source            DF   Sum of Squares   Mean Square    F Value   Pr > F
Model              3     126.15125000   42.05041667       3.85   0.0200
Error             28     305.64750000   10.91598214
Corrected Total   31     431.79875000

R-Square   C.V.       Root MSE    TIME Mean
0.292153   16.31069   3.3039343   20.256250

Source   DF   Type I SS      Mean Square   F Value   Pr > F
Color     3   126.15125000   42.05041667      3.85   0.0200
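The fixed-column layout of the balloon records can be decoded and checked against this ANOVA table. The Python sketch below assumes the observation number is right-justified in columns 1-2, so single-digit observations carry a leading blank. The names `RECORDS` and `parse_record` are illustrative and not part of the original SAS program.

```python
# Balloon records in fixed columns: 1-2 observation, 3 color, 4-7 time (seconds).
# Assumption: observation numbers are right-justified in columns 1-2.
RECORDS = [
    " 1122.4", " 2324.6", " 3120.3", " 4419.8", " 5324.3", " 6222.2",
    " 7228.5", " 8225.7", " 9320.2", "10119.6", "11228.8", "12424.0",
    "13417.1", "14419.3", "15324.2", "16115.8", "17218.3", "18117.5",
    "19418.7", "20322.9", "21116.3", "22414.0", "23416.6", "24218.1",
    "25218.9", "26416.0", "27220.1", "28322.5", "29316.0", "30119.3",
    "31115.9", "32320.3",
]
COLORS = {1: "pink", 2: "yellow", 3: "orange", 4: "blue"}


def parse_record(record):
    """Split one fixed-width record into (observation, color name, time)."""
    obs = int(record[0:2])
    color = COLORS[int(record[2])]
    time = float(record[3:7])
    return obs, color, time


# Group the inflation times by balloon color.
times_by_color = {}
for rec in RECORDS:
    _, color, time = parse_record(rec)
    times_by_color.setdefault(color, []).append(time)

all_times = [t for ts in times_by_color.values() for t in ts]
grand_mean = sum(all_times) / len(all_times)

# Model (between-color) and error (within-color) sums of squares.
ss_model = sum(len(ts) * (sum(ts) / len(ts) - grand_mean) ** 2
               for ts in times_by_color.values())
ss_error = sum((t - sum(ts) / len(ts)) ** 2
               for ts in times_by_color.values() for t in ts)

df_model = len(times_by_color) - 1
df_error = len(all_times) - len(times_by_color)
f_value = (ss_model / df_model) / (ss_error / df_error)

print(f"Model SS = {ss_model:.5f} on {df_model} df")
print(f"Error SS = {ss_error:.5f} on {df_error} df")
print(f"F = {f_value:.2f}")
```

Each color ends up with 8 balloons, and the recomputed sums of squares can be compared line by line with the Model and Error rows above.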