Experimental Statistics - week 6. Chapter 15: Randomized Complete Block Design (15.3), Factorial Models (15.5)
Caution: Chapter 15 introduces some new notation - i.e. changes notation already defined
Recall: Sum-of-Squares Identity, 1-Factor ANOVA. In words: Total SS = SS between samples + SS within samples
Recall: Sum-of-Squares Identity 1-Factor ANOVA - new notation for Chapter 15 In words: Total SS = SS for “treatments” + SS for “error”
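Written out, the identity takes the following standard form. The symbols are assumptions chosen to match the Chapter 15 table labels: $y_{ij}$ is the jth observation on treatment i, $n_i$ is the number of observations on treatment i, $\bar{y}_{i.}$ is the mean for treatment i, and $\bar{y}_{..}$ is the overall mean.

```latex
\underbrace{\sum_{i=1}^{t}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}_{..}\right)^2}_{TSS}
  \;=\;
\underbrace{\sum_{i=1}^{t} n_i\left(\bar{y}_{i.}-\bar{y}_{..}\right)^2}_{SST}
  \;+\;
\underbrace{\sum_{i=1}^{t}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}_{i.}\right)^2}_{SSE}
```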
Revised ANOVA Table for 1-Factor ANOVA (Ch. 15 terminology - p. 857)
Source       SS    df      MS                  F
Treatments   SST   t - 1   MST = SST/(t - 1)   MST/MSE
Error        SSE   N - t   MSE = SSE/(N - t)
Total        TSS   N - 1
Recall CRD Model for Gasoline Data: $y_{ij} = \mu_i + \epsilon_{ij}$ or $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$
- $y_{ij}$: observed octane
- $\mu_i$: mean for ith gasoline
- $\epsilon_{ij}$: unexplained part -- car-to-car differences -- temperature -- etc.
Gasoline Data Question: What if car differences are obscuring gasoline differences? Similar to diet t-test example: Recall: person-to-person differences obscured effect of diet
Possible Alternative Design: Test all 5 gasolines on the same car - in essence we test the gasoline effect directly and remove effect of car-to-car variation Question: How would you randomize an experiment with 4 cars?
Blocking an Experiment - dividing the observations into groups (called blocks) where the observations in each block are collected under relatively similar conditions - comparisons can often be made more precisely this way
Terminology is based on Agricultural Experiments. Consider the problem of testing fertilizers on a crop - t fertilizers - n observations on each
Completely Randomized Design
Fertilizer assigned to each of the 15 plots: B A C A B B A C A C C B C A B
t = 3 fertilizers, n = 5 replications
- randomly select 15 plots
- randomly assign fertilizers to the 15 plots
Randomized Complete Block Strategy
A | C | B
B | A | C
C | B | A
A | B | C
C | A | B
t = 3 fertilizers
- select 5 “blocks”
- randomly assign the 3 treatments to each block
Note: The 3 “plots” within each block are similar - similar soil type, sun, water, etc.
Randomized Complete Block Design: Randomly assign each treatment once to every block
Car Example
Car 1: randomly assign each gas to this car
Car 2: .... etc.
Agricultural Example
Randomly assign each fertilizer to one of the 3 plots within each block
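One way to carry out this randomization is to shuffle the treatment order separately inside each block. The sketch below is an illustration only: the function name `randomize_rcb`, the car labels, and the seed are hypothetical choices, not part of the course material.

```python
import random

def randomize_rcb(treatments, blocks, seed=None):
    """Return {block: run order}, with every treatment appearing once per block."""
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # an independent shuffle for each block
        plan[block] = order
    return plan

gasolines = ["A", "B", "C", "D", "E"]
cars = ["Car 1", "Car 2", "Car 3", "Car 4"]

# The seed is arbitrary; it only makes the plan reproducible.
plan = randomize_rcb(gasolines, cars, seed=6)
for car, order in plan.items():
    print(car, order)
```

In a completely randomized design the shuffle would instead run once over all 20 gasoline-car runs, which is why the two designs need different analyses.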
Model For Randomized Complete Block (RCB) Design: $y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$
- $\alpha_i$: effect of ith treatment (gasoline)
- $\beta_j$: effect of jth block (car)
- $\epsilon_{ij}$: unexplained error -- temperature -- etc.
Back to CAR data: Suppose that instead of 20 cars, there were only 4 cars, and we tested each gasoline on each car.
Old Data Format (4 observations per gas)
Gas
A    91.7  91.2  90.9  90.6
B    91.7  91.9  90.9  90.9
C    92.4  91.2  91.6  91.0
D    91.8  92.2  92.0  91.4
E    93.1  92.9  92.4  92.4
“Restructured” CAR Data
             Car
Gas    1     2     3     4
A    91.7  91.2  90.9  90.6
B    91.7  91.9  90.9  90.9
C    92.4  91.2  91.6  91.0
D    91.8  92.2  92.0  91.4
E    93.1  92.9  92.4  92.4
Recall: Sum-of-Squares Identity 1-Factor ANOVA - new notation for Chapter 15 In words: Total SS = SS for “treatments” + SS for “error”
A New Sum-of-Squares Identity In words: Total SS = SS for treatments + SS for blocks + SS for error
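In symbols, the identity can be written as follows. This is the standard form for one observation per treatment-block combination; the notation is an assumption: $y_{ij}$ is the observation on treatment i in block j, with t treatments, b blocks, treatment means $\bar{y}_{i.}$, block means $\bar{y}_{.j}$, and overall mean $\bar{y}_{..}$.

```latex
\underbrace{\sum_{i=1}^{t}\sum_{j=1}^{b}\left(y_{ij}-\bar{y}_{..}\right)^2}_{TSS}
  \;=\;
\underbrace{b\sum_{i=1}^{t}\left(\bar{y}_{i.}-\bar{y}_{..}\right)^2}_{SST}
  \;+\;
\underbrace{t\sum_{j=1}^{b}\left(\bar{y}_{.j}-\bar{y}_{..}\right)^2}_{SSB}
  \;+\;
\underbrace{\sum_{i=1}^{t}\sum_{j=1}^{b}\left(y_{ij}-\bar{y}_{i.}-\bar{y}_{.j}+\bar{y}_{..}\right)^2}_{SSE}
```

Compared with the 1-factor identity, SST is unchanged; the block term SSB is carved out of what used to be error.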
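Hypotheses: To test for treatment effects - i.e. gas differences - we test $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$ vs. $H_a$: at least one $\alpha_i \neq 0$. To test for block effects - i.e. car differences (not usually the research hypothesis) - we test $H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0$ vs. $H_a$: at least one $\beta_j \neq 0$.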
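Randomized Complete Block Design ANOVA Table
Source       SS    df               MS                           F
Treatments   SST   t - 1            MST = SST/(t - 1)            MST/MSE
Blocks       SSB   b - 1            MSB = SSB/(b - 1)            MSB/MSE
Error        SSE   (b - 1)(t - 1)   MSE = SSE/((b - 1)(t - 1))
Total        TSS   bt - 1
See page 866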
“Restructured” CAR Data - SAS Format
A B1 91.7
A B2 91.2
A B3 90.9
A B4 90.6
B B1 91.7
B B2 91.9
B B3 90.9
B B4 90.9
C B1 92.4
C B2 91.2
C B3 91.6
C B4 91.0
D B1 91.8
D B2 92.2
D B3 92.0
D B4 91.4
E B1 93.1
E B2 92.9
E B3 92.4
E B4 92.4
The first variable (A - E) indicates gas, as it did with the Completely Randomized Design. The second variable (B1 - B4) indicates car.
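SAS file - Randomized Complete Block Design for CAR Data
* DATA step, data set name (car), and @@ added here so the fragment runs on its own;
DATA car;
  * trailing @@ lets one data line hold several observations;
  INPUT gas $ block $ octane @@;
  DATALINES;
A B1 91.7  A B2 91.2  A B3 90.9  A B4 90.6
B B1 91.7  B B2 91.9  B B3 90.9  B B4 90.9
C B1 92.4  C B2 91.2  C B3 91.6  C B4 91.0
D B1 91.8  D B2 92.2  D B3 92.0  D B4 91.4
E B1 93.1  E B2 92.9  E B3 92.4  E B4 92.4
;
PROC GLM;
  CLASS gas block;
  MODEL octane = gas block;
  TITLE 'Gasoline Example - Randomized Complete Block Design';
  MEANS gas / LSD;
RUN;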
CRD ANOVA Table Output - car data
Source              SS      df   MS      F      p-value
Gas (treatments)    6.108    4   1.527   6.80   0.0025
Error               3.370   15   0.225
Totals              9.478   19
RCB ANOVA Table Output - car data
Source              SS      df   MS      F       p-value
Gas (treatments)    6.108    4   1.527   15.58   0.0001
Cars (blocks)       2.194    3   0.731    7.46   0.0044
Error               1.176   12   0.098
Totals              9.478   19
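The sums of squares in the CRD and RCB tables can be checked by hand from the restructured car data. The following is a minimal sketch in plain Python, not the SAS analysis used in the course; the variable names are illustrative, and p-values are left out because they need an F distribution routine.

```python
# Octane readings: one row per gasoline (treatment), one column per car (block).
octane = {
    "A": [91.7, 91.2, 90.9, 90.6],
    "B": [91.7, 91.9, 90.9, 90.9],
    "C": [92.4, 91.2, 91.6, 91.0],
    "D": [91.8, 92.2, 92.0, 91.4],
    "E": [93.1, 92.9, 92.4, 92.4],
}
t = len(octane)  # number of treatments (gasolines)
b = 4            # number of blocks (cars)

values = [y for row in octane.values() for y in row]
grand = sum(values) / len(values)
gas_means = [sum(row) / b for row in octane.values()]
car_means = [sum(row[j] for row in octane.values()) / t for j in range(b)]

tss = sum((y - grand) ** 2 for y in values)
sst = b * sum((m - grand) ** 2 for m in gas_means)
ssb = t * sum((m - grand) ** 2 for m in car_means)

# CRD: everything not explained by gasoline is error.
sse_crd = tss - sst
f_crd = (sst / (t - 1)) / (sse_crd / (t * b - t))

# RCB: the car-to-car variation is removed from the error term.
sse_rcb = tss - sst - ssb
mse_rcb = sse_rcb / ((t - 1) * (b - 1))
f_gas = (sst / (t - 1)) / mse_rcb
f_car = (ssb / (b - 1)) / mse_rcb

print(f"TSS={tss:.3f} SST={sst:.3f} SSB={ssb:.3f}")
print(f"CRD: SSE={sse_crd:.3f} F={f_crd:.2f}")
print(f"RCB: SSE={sse_rcb:.3f} F(gas)={f_gas:.2f} F(car)={f_car:.2f}")
```

SST is identical in the two analyses; only the error term changes, which is why the F statistic for gas rises once the car effect is taken out.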
SAS Output -- RCB CAR Data
Dependent Variable: OCTANE
                              Sum of         Mean
Source             DF        Squares       Square   F Value   Pr > F
Model               7     8.30200000   1.18600000     12.10   0.0001
Error              12     1.17600000   0.09800000
Corrected Total    19     9.47800000

R-Square       C.V.    Root MSE   OCTANE Mean
0.875923   0.341347   0.3130495     91.710000

Source    DF     Anova SS   Mean Square   F Value   Pr > F
GAS        4   6.10800000    1.52700000     15.58   0.0001
BLOCK      3   2.19400000    0.73133333      7.46   0.0044
CAR Data -- LSD Results
CRD Analysis
t Grouping      Mean   N   gas
        A    92.7000   4   E
        B    91.8500   4   D
        B
   C    B    91.5500   4   C
   C    B
   C    B    91.3500   4   B
   C
   C         91.1000   4   A
RCB Analysis
t Grouping      Mean   N   gas
        A    92.7000   4   E
        B    91.8500   4   D
        B
   C    B    91.5500   4   C
   C
   C         91.3500   4   B
   C
   C         91.1000   4   A
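The two groupings differ because the least significant difference, LSD = t * sqrt(2 MSE / n), is smaller under the RCB analysis. The sketch below recomputes it from the ANOVA tables. The t critical values are assumptions taken from a standard t table (two-sided 5% level, 15 and 12 error df), and the helper names are hypothetical.

```python
import math

def lsd(t_crit, mse, n):
    """Least significant difference for two means of n observations each."""
    return t_crit * math.sqrt(2 * mse / n)

# Assumed table values: upper 2.5% points of t with 15 and 12 df.
T_CRIT = {15: 2.131, 12: 2.179}
n = 4  # observations per gasoline

lsd_crd = lsd(T_CRIT[15], 3.370 / 15, n)  # MSE from the CRD table
lsd_rcb = lsd(T_CRIT[12], 1.176 / 12, n)  # MSE from the RCB table

means = {"E": 92.70, "D": 91.85, "C": 91.55, "B": 91.35, "A": 91.10}

def significant_pairs(means, threshold):
    """Pairs of gasolines whose mean difference exceeds the threshold."""
    names = sorted(means, key=means.get, reverse=True)
    return [(hi, lo)
            for i, hi in enumerate(names)
            for lo in names[i + 1:]
            if means[hi] - means[lo] > threshold]

pairs_crd = significant_pairs(means, lsd_crd)
pairs_rcb = significant_pairs(means, lsd_rcb)
print(f"LSD (CRD) = {lsd_crd:.3f}", pairs_crd)
print(f"LSD (RCB) = {lsd_rcb:.3f}", pairs_rcb)
```

The D vs. B comparison (a difference of 0.50) is the one pair that is declared different under RCB but not under CRD, matching the change in the letter groupings.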
CAR Data -- Bonferroni Results
CRD Analysis
Bon Grouping      Mean   N   gas
          A    92.7000   4   E
          A
     B    A    91.8500   4   D
     B
     B         91.5500   4   C
     B
     B         91.3500   4   B
     B
     B         91.1000   4   A
RCB Analysis
Bon Grouping      Mean   N   gas
          A    92.7000   4   E
          B    91.8500   4   D
          B
          B    91.5500   4   C
          B
          B    91.3500   4   B
          B
          B    91.1000   4   A
STIMULUS EXAMPLE: A personal computer presents a stimulus, and a person responds. Study of how RESPONSE TIME is affected by a WARNING given prior to the stimulus. 2 factors of interest:
- Warning Type --- auditory or visual
- Time between warning and stimulus -- 5 sec, 10 sec, or 15 sec
Warning Time   Auditory   Visual
5 sec          .204       .257
               .170       .279
               .181       .269
10 sec         .167       .283
               .182       .235
               .187       .260
15 sec         .202       .256
               .198       .281
               .236       .258
Note: “Sort of like RCB” -- what is the difference?
Question: How would you randomize? - 18 subjects - 1 subject
Stimulus Data: $y_{ijk}$ = observed data (response time), where i = level of Factor A (warning type), j = level of Factor B (time), k = replication
2-Factor ANOVA Data (table of observations cross-classified by Factor A and Factor B)
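A Possible Model for STIMULUS Data: $y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$, i = type, j = time
Note: $E(y_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j$, so according to this model $\mu_{1j} - \mu_{2j} = \alpha_1 - \alpha_2$ for every j
Note: The model assumes that the difference between types is the same for all times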
Hypothetical Cell Means (plot; labels: Auditory, Visual; 5, 10, 15)
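Similarly, with $\mu_{ij}$ the mean for type i and time j, $\mu_{i1} - \mu_{i2} = \beta_1 - \beta_2$ for every i, i.e. the model says the difference between times is the same for all types. We may not want to make these assumptions!!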
Hypothetical Cell Means (two plots; labels on each: Auditory, Visual; 5, 10, 15)
Sum-of-Squares Breakdown (2-factor ANOVA): TSS = SSA + SSB + SSAB + SSE
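Each term can be written out as follows. This is the standard balanced two-factor form, and the notation is an assumption: $y_{ijk}$ is replication k at level i of Factor A and level j of Factor B, with a levels of A, b levels of B, and n replications per cell; dots mark the subscripts averaged over.

```latex
\begin{aligned}
TSS  &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(y_{ijk}-\bar{y}_{...}\right)^2 \\
SSA  &= bn\sum_{i=1}^{a}\left(\bar{y}_{i..}-\bar{y}_{...}\right)^2 \\
SSB  &= an\sum_{j=1}^{b}\left(\bar{y}_{.j.}-\bar{y}_{...}\right)^2 \\
SSAB &= n\sum_{i=1}^{a}\sum_{j=1}^{b}\left(\bar{y}_{ij.}-\bar{y}_{i..}-\bar{y}_{.j.}+\bar{y}_{...}\right)^2 \\
SSE  &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(y_{ijk}-\bar{y}_{ij.}\right)^2
\end{aligned}
```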
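2-Factor ANOVA Table (2-Factor Completely Randomized Design)
Source          SS     df               MS                             F
Main Effects
  A             SSA    a - 1            MSA = SSA/(a - 1)              MSA/MSE
  B             SSB    b - 1            MSB = SSB/(b - 1)              MSB/MSE
Interaction
  AB            SSAB   (a - 1)(b - 1)   MSAB = SSAB/((a - 1)(b - 1))   MSAB/MSE
Error           SSE    ab(n - 1)        MSE = SSE/(ab(n - 1))
Total           TSS    abn - 1
See page 900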
*****************************************************
* Two-Way ANOVA using PROC GLM                      *
* showing Interaction Plots                         *
*****************************************************;
data stimulus;
  input type $ time response;
  datalines;
A 5 .204
A 5 .170
A 5 .181
A 10 .167
A 10 .182
A 10 .187
A 15 .202
A 15 .198
A 15 .236
V 5 .257
V 5 .279
V 5 .269
V 10 .283
V 10 .235
V 10 .260
V 15 .256
V 15 .281
V 15 .258
;
PROC GLM;
  CLASSES type time;
  MODEL response = type time type*time;
  TITLE 'Stimulus Data';
run;
* Cell means, one row per type-time combination, saved in data set cells;
PROC SORT; BY type time;
PROC MEANS; BY type time;
  OUTPUT OUT=cells MEAN=response;
RUN;
* OUTPUT MEAN INTERACTION PLOTS;
PROC GPLOT;
  PLOT response*type=time;
  SYMBOL1 V=CIRCLE I=JOIN C=BLACK;
  SYMBOL2 V=DOT I=JOIN C=BLACK;
  SYMBOL3 V=BOX I=JOIN C=BLACK;
RUN;
PROC GPLOT;
  PLOT response*time=type;
  SYMBOL1 V=CIRCLE I=JOIN C=BLACK;
  SYMBOL2 V=DOT I=JOIN C=BLACK;
RUN;
PROC PRINT;
RUN;
GLM Output
Stimulus Data
The GLM Procedure
Dependent Variable: response
                              Sum of
Source             DF        Squares   Mean Square   F Value   Pr > F
Model               5     0.02554894    0.00510979     17.66   <.0001
Error              12     0.00347200    0.00028933
Corrected Total    17     0.02902094

R-Square   Coeff Var   Root MSE   response Mean
0.880362    7.458622   0.017010        0.228056

Source       DF    Type I SS   Mean Square   F Value   Pr > F
type          1   0.02354450    0.02354450     81.38   <.0001
time          2   0.00115811    0.00057906      2.00   0.1778
type*time     2   0.00084633    0.00042317      1.46   0.2701
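The sums of squares in this output can be reproduced directly from the stimulus data with the balanced two-factor formulas. The following is a minimal sketch in plain Python rather than PROC GLM; the helper `mean` and the variable names are illustrative, and p-values are omitted because they need an F distribution routine.

```python
# Response times keyed by (warning type, time in seconds), 3 replications each.
data = {
    ("A", 5):  [.204, .170, .181],
    ("A", 10): [.167, .182, .187],
    ("A", 15): [.202, .198, .236],
    ("V", 5):  [.257, .279, .269],
    ("V", 10): [.283, .235, .260],
    ("V", 15): [.256, .281, .258],
}
types, times, n = ["A", "V"], [5, 10, 15], 3
a, b = len(types), len(times)

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

grand = mean(y for ys in data.values() for y in ys)
cell = {key: mean(ys) for key, ys in data.items()}
type_mean = {ty: mean(y for tm in times for y in data[(ty, tm)]) for ty in types}
time_mean = {tm: mean(y for ty in types for y in data[(ty, tm)]) for tm in times}

ssa = b * n * sum((type_mean[ty] - grand) ** 2 for ty in types)
ssb = a * n * sum((time_mean[tm] - grand) ** 2 for tm in times)
ssab = n * sum((cell[(ty, tm)] - type_mean[ty] - time_mean[tm] + grand) ** 2
               for ty in types for tm in times)
sse = sum((y - cell[key]) ** 2 for key, ys in data.items() for y in ys)
tss = sum((y - grand) ** 2 for ys in data.values() for y in ys)

mse = sse / (a * b * (n - 1))
f_type = (ssa / (a - 1)) / mse
f_time = (ssb / (b - 1)) / mse
f_inter = (ssab / ((a - 1) * (b - 1))) / mse

print(f"SSA={ssa:.8f} SSB={ssb:.8f} SSAB={ssab:.8f} SSE={sse:.8f}")
print(f"F(type)={f_type:.2f} F(time)={f_time:.2f} F(type*time)={f_inter:.2f}")
```

Because the design is balanced (3 replications in every cell), these values agree with the Type I SS column, and the four pieces add up to the corrected total.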