Experimental Statistics - week 7 Chapter 15: Factorial Models (15.5) Chapter 17: Random Effects Models
Testing Procedure Revisited: 2-Factor CRD Design
Step 1. Test for interaction.
Step 2. (a) IF there IS NOT a significant interaction - test the main effects.
        (b) IF there IS a significant interaction - compare the a x b cell means (by hand).
Main Idea: We are trying to determine whether the factors affect the response, either individually or collectively.
Statistics 5372: Experimental Statistics
Assignment Report Form
Name:
Data Set or Problem Description
Key Results of the Analysis
Conclusions in the Language of the Problem
Appendices:
A. Tables and Figures Cited in the Report
B. SAS Log from the Final SAS Run
Notes:
1. All assignments should be typed using a word processor according to the format above.
2. SAS output should consist only of tables and figures cited in the report. The report should refer to these tables and figures using numbers you assign, e.g. Table 1.
3. The data should be listed somewhere in the report (within the SAS code is OK).
Warning Time    Auditory    Visual
5 sec             .204       .257
                  .170       .279
                  .181       .269
10 sec            .167       .283
                  .182       .235
                  .187       .260
15 sec            .202       .256
                  .198       .281
                  .236       .258
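Note: For balanced designs, the overall mean equals the average of the marginal means of each factor. For the STIMULUS data, .228 = (.227 + .219 + .239)/3 = (.192 + .264)/2, where .227, .219 and .239 are the means for the 5, 10 and 15 sec warning times and .192 and .264 are the auditory and visual means.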
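Now Consider: the same data with one observation missing, so that the cells no longer all contain the same number of observations.

Warning Time    Auditory    Visual
5 sec             .204       .257
                  .170       .279
                  .181       .269
10 sec            .167       .283
                  .182       .235
                  .187       .260
15 sec            .202       .256
                  .198       .281
                  .236       .258

(The GLM output for these data is based on 17 observations. Its response mean, 0.229471, corresponds to the 18 values above with the first auditory 5 sec value, .204, removed.)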
Balanced Experimental Designs
• Every Combination of the Factor Levels has an Equal Number of Repeats
• Sums of Squares
  • Uniquely Calculated
  • Usual Textbook Formulas
Unbalanced Experimental Designs
• Not Every Combination of the Factor Levels has an Equal Number of Repeats
• Sums of Squares
  • Not Uniquely Calculated
  • Usual Textbook Formulas Are Not Valid
Unbalanced Experimental Designs
Many software programs cannot properly calculate sums of squares for unbalanced designs - they typically use "Textbook Formulas".
SAS:
- must use Proc GLM, not Proc ANOVA
- Type I and Type III sums-of-squares results will not generally agree
- use Type III sums of squares -- the analysis is closest to that for "Balanced Experiments"
Unbalanced Data -- GLM Output

The GLM Procedure
Dependent Variable: response

                              Sum of
Source              DF       Squares    Mean Square   F Value   Pr > F
Model                5    0.02547774     0.00509555     19.13   <.0001
Error               11    0.00293050     0.00026641
Corrected Total     16    0.02840824

R-Square    Coeff Var    Root MSE    response Mean
0.896843     7.112913    0.016322         0.229471

Source              DF     Type I SS    Mean Square   F Value   Pr > F
type                 1    0.02309680     0.02309680     86.70   <.0001
time                 2    0.00122742     0.00061371      2.30   0.1460
type*time            2    0.00115351     0.00057676      2.16   0.1611

Source              DF   Type III SS    Mean Square   F Value   Pr > F
type                 1    0.02367796     0.02367796     88.88   <.0001
time                 2    0.00130085     0.00065042      2.44   0.1326
type*time            2    0.00115351     0.00057676      2.16   0.1611
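One way to see how the two sets of sums of squares differ is simply to add them up. The check below uses only the numbers printed in the output above: the Type I (sequential) sums of squares add to the Model sum of squares (up to rounding), while the Type III sums of squares do not when the design is unbalanced.

```latex
\begin{aligned}
\text{Type I:}\quad   & 0.02309680 + 0.00122742 + 0.00115351 = 0.02547773 \approx SS_{\text{Model}} = 0.02547774 \\
\text{Type III:}\quad & 0.02367796 + 0.00130085 + 0.00115351 = 0.02613232 \neq SS_{\text{Model}}
\end{aligned}
```

The type*time line is the same in both tables because the interaction is the last term entered in the model, so its sequential sum of squares is already adjusted for every other term.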
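Model for 3-factor Factorial Design

$$y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \epsilon_{ijkl}$$

where

$$\sum_i \alpha_i = \sum_j \beta_j = \sum_k \gamma_k = 0$$

and also, the sum over any subscript of a 2 or 3 factor interaction is zero.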
Sum-of-Squares Breakdown (3-factor ANOVA)

$$TSS = SSA + SSB + SSC + SSAB + SSAC + SSBC + SSABC + SSE$$
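3-Factor ANOVA Table (3-Factor Completely Randomized Design)

Source          SS      df                    MS                               F
Main Effects
  A             SSA     a - 1                 MSA = SSA/(a - 1)                MSA/MSE
  B             SSB     b - 1                 MSB = SSB/(b - 1)                MSB/MSE
  C             SSC     c - 1                 MSC = SSC/(c - 1)                MSC/MSE
Interactions
  AB            SSAB    (a - 1)(b - 1)        MSAB = SSAB/[(a - 1)(b - 1)]     MSAB/MSE
  AC            SSAC    (a - 1)(c - 1)        MSAC = SSAC/[(a - 1)(c - 1)]     MSAC/MSE
  BC            SSBC    (b - 1)(c - 1)        MSBC = SSBC/[(b - 1)(c - 1)]     MSBC/MSE
  ABC           SSABC   (a - 1)(b - 1)(c - 1) MSABC = SSABC/[(a - 1)(b - 1)(c - 1)]  MSABC/MSE
Error           SSE     abc(n - 1)            MSE = SSE/[abc(n - 1)]
Total           TSS     abcn - 1

See page 908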
Popcorn Data
Response variable -- % of kernels that popped
Factors
• (A) Brand (3 brands)
• (B) Power of Microwave (500, 600 watts)
• (C) Time (4, 4.5 minutes)
n = 2 replications per cell
Popcorn Data

brand  power  time  percent
  1     500   4.5    70.3
  1     500   4.5    91.0
  1     500   4      72.7
  1     500   4      81.9
  1     600   4.5    78.7
  1     600   4.5    88.7
  1     600   4      74.1
  1     600   4      72.1
  2     500   4.5    93.4
  2     500   4.5    76.3
  2     500   4      45.3
  2     500   4      47.6
  2     600   4.5    92.2
  2     600   4.5    84.7
  2     600   4      66.3
  2     600   4      45.7
  3     500   4.5    50.1
  3     500   4.5    81.5
  3     500   4      51.4
  3     500   4      67.7
  3     600   4.5    71.5
  3     600   4.5    80.0
  3     600   4      64.0
  3     600   4      77.0
SAS GLM Code – 3 Factor Model

  PROC GLM;
    CLASS brand power time;
    /* full factorial model: main effects, 2-factor and 3-factor interactions */
    MODEL percent = brand power time brand*power brand*time power*time brand*power*time;
    TITLE 'Popcorn Example -- 3-Factor ANOVA';
    /* LSD comparisons of the main-effect means */
    MEANS brand power time / LSD;
  RUN;

The statement
  MODEL percent = brand power time brand*power brand*time power*time brand*power*time;
can be written as
  MODEL percent = brand | power | time;
The GLM Procedure
Dependent Variable: percent

                                Sum of
Source                DF       Squares    Mean Square   F Value   Pr > F
Model                 11   3589.988333     326.362576      2.71   0.0503
Error                 12   1444.170000     120.347500
Corrected Total       23   5034.158333

R-Square    Coeff Var    Root MSE    percent Mean
0.713126     15.27011    10.97030        71.84167

Source                DF     Type I SS    Mean Square   F Value   Pr > F
brand                  2    566.690833     283.345417      2.35   0.1372
power                  1    180.401667     180.401667      1.50   0.2443
time                   1   1545.615000    1545.615000     12.84   0.0038
brand*power            2    125.125833      62.562917      0.52   0.6074
brand*time             2   1127.672500     563.836250      4.69   0.0314
power*time             1      0.015000       0.015000      0.00   0.9913
brand*power*time       2     44.467500      22.233750      0.18   0.8336
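As a check on how this output lines up with the general 3-factor ANOVA table, the degrees of freedom can be worked out by hand for the popcorn design (a = 3 brands, b = 2 power levels, c = 2 times, n = 2 replications per cell). This is a sketch using the standard degrees-of-freedom formulas for a 3-factor CRD; the brand*time F ratio is included as one example of MS/MSE.

```latex
\begin{aligned}
df_A &= a-1 = 2, \quad df_B = b-1 = 1, \quad df_C = c-1 = 1 \\
df_{AB} &= (a-1)(b-1) = 2, \quad df_{AC} = (a-1)(c-1) = 2, \quad df_{BC} = (b-1)(c-1) = 1 \\
df_{ABC} &= (a-1)(b-1)(c-1) = 2 \\
df_{\text{Error}} &= abc(n-1) = 12, \qquad df_{\text{Total}} = abcn - 1 = 23 \\
MSE &= 1444.17/12 = 120.3475, \qquad F_{\text{brand*time}} = 563.83625/120.3475 \approx 4.69
\end{aligned}
```

The seven model terms account for 2 + 1 + 1 + 2 + 2 + 1 + 2 = 11 degrees of freedom, which matches the Model line.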
Testing Procedure: 3-Factor CRD Design
Step 1. Test for 3rd order interaction.
  IF there IS a significant 3rd order interaction - compare cell means
  IF there IS NOT a significant 3rd order interaction - test 2nd order interactions
    IF there IS a significant 2nd order interaction - compare associated cell means
    IF there IS NOT a significant 2nd order interaction - test the main effects
In general -- test main effects only for variables not involved in a significant 2nd or 3rd order interaction
Applying the testing procedure to the popcorn GLM output above:
- brand*power*time is not significant (p = 0.8336)
- of the 2nd order interactions, only brand*time is significant (p = 0.0314): examine the brand x time cell means
- power is not involved in a significant interaction: examine the Power main effect
To complete the analysis:
1. The F-test for Power was not significant (p = .2443).
2. Compare the 6 cell means plotted in the interaction plot using a procedure analogous to the one used for the pilot plant data.

  PROC SORT DATA=one; BY brand time;
  PROC MEANS MEAN STD DATA=one; BY brand time;
    VAR percent;   /* restrict the analysis to the response so MEAN=percent is its cell mean */
    OUTPUT OUT=cells MEAN=percent;
    TITLE 'Brand x Time Cell Means for Popcorn Data';
  RUN;

Obs   brand   time   _TYPE_   _FREQ_   percent
 1      1     4        0        4      75.200
 2      1     4.5      0        4      82.175
 3      2     4        0        4      51.225
 4      2     4.5      0        4      86.650
 5      3     4        0        4      65.025
 6      3     4.5      0        4      70.775

LSD =
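A hedged sketch of how the LSD for comparing these six cell means might be computed. The assumptions here are not stated in the original: a significance level of α = .05, the MSE and error degrees of freedom from the 3-factor ANOVA (120.3475 on 12 df), and 4 observations per brand x time cell mean (the _FREQ_ column).

```latex
\begin{aligned}
LSD &= t_{.025,\,12}\,\sqrt{\frac{2\,MSE}{4}}
     = 2.179\,\sqrt{\frac{2(120.3475)}{4}}
     \approx 2.179 \times 7.757 \approx 16.9 \\[4pt]
\text{Brand 1:}\quad & 82.175 - 75.200 = 6.975 < 16.9 \\
\text{Brand 2:}\quad & 86.650 - 51.225 = 35.425 > 16.9 \\
\text{Brand 3:}\quad & 70.775 - 65.025 = 5.750 < 16.9
\end{aligned}
```

Under these assumptions, the difference between 4 and 4.5 minutes exceeds the LSD only for Brand 2, which is one way to read the significant brand*time interaction.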
Popcorn Data: each brand x time cell mean averages the 4 observations in that cell (2 power levels x 2 replications). For example,

(70.3 + 91.0 + 78.7 + 88.7)/4 = 82.175 = cell mean for Brand 1 and Time 4.5
Models with Random Effects
Fixed-Effects Models
-- the models we've studied to this point
-- factor levels have been specifically selected
- investigator is interested in testing effects of these specific levels on the response variable
Examples:
-- CAR data - interested in performance of these 5 gasolines
-- Pilot Plant data - interested in the specific temperatures (160° and 180°) and catalysts (C1 and C2)
Random-Effect Factor
-- the factor has a large number of possible levels
-- the levels used in the analysis are a random sample from the population of all possible levels
- investigator wants to draw conclusions about the population from which these levels were chosen (not the specific levels themselves)
Fixed Effects vs Random Effects
This determination affects
- the model
- the hypothesis tested
- the conclusions drawn
- the F-tests involved (sometimes)
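1-Factor Random Effects Model

$$y_{ij} = \mu + \alpha_i + \epsilon_{ij}, \qquad i = 1, \ldots, t; \; j = 1, \ldots, n$$

Assumptions:
- the $\alpha_i$ are independent $N(0, \sigma_\alpha^2)$
- the $\epsilon_{ij}$ are independent $N(0, \sigma^2)$
- the $\alpha_i$ and $\epsilon_{ij}$ are independent of each other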
Hypotheses:
H0: $\sigma_\alpha^2 = 0$
Ha: $\sigma_\alpha^2 > 0$
H0 says (considering the variability of the $y_{ij}$'s):
- the component of variance due to the factor is zero, i.e. there is no factor "level-to-level" variation
- all of the variability observed is just unexplained subject-to-subject variation; none of it is explained by variation due to the factor
These are data from an experiment studying the effect of four operators (chosen randomly) on the output of a particular machine.
t = 4 operators, n = 4 observations per operator

  DATA one;
    INPUT operator output;
    DATALINES;
  1 175.4
  1 171.7
  1 173.0
  1 170.5
  2 168.5
  2 162.7
  2 165.0
  2 164.1
  3 170.1
  3 173.4
  3 175.7
  3 170.7
  4 175.2
  4 175.7
  4 180.1
  4 183.7
  ;
  PROC GLM;
    CLASS operator;
    MODEL output = operator;
    RANDOM operator;   /* prints the expected mean squares */
    TITLE 'Operator Data: One Factor Random Effects Model';
  RUN;
One Factor Random Effects Model

The GLM Procedure
Dependent Variable: output

                              Sum of
Source              DF       Squares    Mean Square   F Value   Pr > F
Model                3   371.8718750    123.9572917     14.91   0.0002
Error               12    99.7925000      8.3160417
Corrected Total     15   471.6643750

R-Square    Coeff Var    Root MSE    output Mean
0.788425     1.674472    2.883755       172.2188

Source              DF     Type I SS    Mean Square   F Value   Pr > F
operator             3   371.8718750    123.9572917     14.91   0.0002

The GLM Procedure
Source      Type III Expected Mean Square
operator    Var(Error) + 4 Var(operator)
Conclusion: We reject H0: $\sigma_\alpha^2 = 0$ (p = .0002) and conclude that there is variability due to operator.
Note: Multiple comparisons are not used in random effects analyses -- we are interested in whether there is variability due to operator, not in which operators performed better, etc. (they were randomly chosen).
RECALL: 1-Factor (Fixed-Effects) ANOVA Table (page 389)
Rationale for F-test and critical region, with F = MST/MSE:
- MSE estimates $\sigma^2$
- MST estimates $\sigma^2 + \text{constant} \times \sum_i \alpha_i^2$
- if no factor effects, we expect F ≈ 1
- if factor effects, we expect F > 1
Expected Mean Squares for 1-Factor ANOVAs (p. 979)

                                            EMS
Source        SS    df         MS     Fixed Effects                                Random Effects
Treatments    SST   t - 1      MST    $\sigma^2 + n\sum_i \alpha_i^2/(t-1)$        $\sigma^2 + n\sigma_\alpha^2$
Error         SSE   t(n - 1)   MSE    $\sigma^2$                                   $\sigma^2$
Total         TSS   tn - 1

Rationale for Test Statistic and Critical Region is the Same: Fixed or Random
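Estimating Variance Components

Since $E(MST) = \sigma^2 + n\sigma_\alpha^2$ and $E(MSE) = \sigma^2$, solving for $\sigma_\alpha^2$ we get:

$$\sigma_\alpha^2 = \frac{E(MST) - E(MSE)}{n}$$

so, we estimate $\sigma_\alpha^2$ by

$$\hat{\sigma}_\alpha^2 = \frac{MST - MSE}{n}$$

Also,

$$\hat{\sigma}^2 = MSE$$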
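As a worked illustration for the operator data, the mean squares printed in the GLM output (MST = 123.9572917, MSE = 8.3160417) can be equated to the expected mean square "Var(Error) + 4 Var(operator)". This is a sketch of the usual ANOVA (method-of-moments) estimates; the final "share of variance" line is an added summary, not something shown in the original output.

```latex
\begin{aligned}
\hat{\sigma}^2 &= MSE = 8.3160 \\
\hat{\sigma}_\alpha^2 &= \frac{MST - MSE}{n} = \frac{123.9572917 - 8.3160417}{4} = \frac{115.64125}{4} \approx 28.910 \\
\frac{\hat{\sigma}_\alpha^2}{\hat{\sigma}_\alpha^2 + \hat{\sigma}^2} &\approx \frac{28.910}{28.910 + 8.316} \approx 0.78
\end{aligned}
```

By these estimates, roughly three quarters of the variability in output is attributable to operator-to-operator differences.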
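Expected Mean Squares for 2-Factor ANOVA with Fixed Effects:

Source   Expected MS                                                          F-test
A        $\sigma^2 + bn\sum_i \alpha_i^2/(a-1)$                               MSA/MSE
B        $\sigma^2 + an\sum_j \beta_j^2/(b-1)$                                MSB/MSE
AB       $\sigma^2 + n\sum_i\sum_j (\alpha\beta)_{ij}^2/[(a-1)(b-1)]$         MSAB/MSE
Error    $\sigma^2$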
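2-Factor Random Effects Model

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$$

Assumptions:
- the $\alpha_i$ are independent $N(0, \sigma_\alpha^2)$
- the $\beta_j$ are independent $N(0, \sigma_\beta^2)$
- the $(\alpha\beta)_{ij}$ are independent $N(0, \sigma_{\alpha\beta}^2)$
- the $\epsilon_{ijk}$ are independent $N(0, \sigma^2)$
- all of these random components are independent of one another

Sums of squares are obtained as in the Fixed-Effects case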
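Expected Mean Squares for 2-Factor ANOVA with Random Effects:

Source   Expected MS
A        $\sigma^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2$
B        $\sigma^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$
AB       $\sigma^2 + n\sigma_{\alpha\beta}^2$
Error    $\sigma^2$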
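To Test:
H0: $\sigma_\alpha^2 = 0$ we use F = MSA/MSAB
H0: $\sigma_\beta^2 = 0$ we use F = MSB/MSAB
H0: $\sigma_{\alpha\beta}^2 = 0$ we use F = MSAB/MSE
Note: Test each of these 3 hypotheses (no matter whether H0: $\sigma_{\alpha\beta}^2 = 0$ is rejected)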
2-Factor Random Effects ANOVA Table

Source          SS      df                MS      F
Main Effects
  A             SSA     a - 1             MSA     MSA/MSAB
  B             SSB     b - 1             MSB     MSB/MSAB
Interaction
  AB            SSAB    (a - 1)(b - 1)    MSAB    MSAB/MSE
Error           SSE     ab(n - 1)         MSE
Total           TSS     abn - 1
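Estimating Variance Components
2-Factor Random Effects Model

$$\hat{\sigma}^2 = MSE, \qquad \hat{\sigma}_{\alpha\beta}^2 = \frac{MSAB - MSE}{n}, \qquad \hat{\sigma}_\alpha^2 = \frac{MSA - MSAB}{bn}, \qquad \hat{\sigma}_\beta^2 = \frac{MSB - MSAB}{an}$$

(note error on page 986)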
Filtration Process:
Response - % material lost through filtration
A – Operator (randomly selected) (a = 4)
B – Filter (randomly selected) (b = 3)
n = 3

                Operator
Filter     1      2      3      4
  1      16.2   15.9   15.6   14.9
         16.8   15.1   15.9   15.2
         17.1   14.5   16.1   14.9
  2      16.6   16.0   16.1   15.4
         16.9   16.3   16.0   14.6
         16.8   16.5   17.2   15.9
  3      16.7   16.5   16.4   16.1
         16.9   16.9   17.4   15.4
         17.1   16.8   16.9   15.6

  DATA one;
    INPUT operator filter loss;
    DATALINES;
  1 1 16.2
  1 1 16.8
  1 1 17.1
  1 2 16.6
  1 2 16.9
  1 2 16.8
  . . .
  4 1 14.9
  4 2 15.4
  4 2 14.6
  4 2 15.9
  4 3 16.1
  4 3 15.4
  4 3 15.6
  ;
  PROC GLM;
    CLASS operator filter;
    MODEL loss = operator filter operator*filter;
    TITLE '2-Factor Random Effects Model';
    /* TEST requests F-tests that use the random-effects error terms */
    RANDOM operator filter operator*filter / TEST;
  RUN;
SAS Random-Effects Output (Filtration Data)

2-Factor Random Effects Model
General Linear Models Procedure
Dependent Variable: LOSS

                              Sum of           Mean
Source              DF       Squares         Square   F Value   Pr > F
Model               11   16.60888889     1.50989899      8.16   0.0001
Error               24    4.44000000     0.18500000
Corrected Total     35   21.04888889

R-Square        C.V.     Root MSE    LOSS Mean
0.789062    2.664175    0.4301163    16.144444

Source              DF   Type III SS    Mean Square   F Value   Pr > F
OPERATOR             3   10.31777778     3.43925926     18.59   0.0001
FILTER               2    4.63388889     2.31694444     12.52   0.0002
OPERATOR*FILTER      6    1.65722222     0.27620370      1.49   0.2229

Source             Type III Expected Mean Square
OPERATOR           Var(Error) + 3 Var(OPERATOR*FILTER) + 9 Var(OPERATOR)
FILTER             Var(Error) + 3 Var(OPERATOR*FILTER) + 12 Var(FILTER)
OPERATOR*FILTER    Var(Error) + 3 Var(OPERATOR*FILTER)
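The expected mean squares printed above can be turned into variance component estimates by equating each observed mean square to its expectation and solving. The following is a sketch of that calculation using only the mean squares in this output; the subscripts O, F and OF (operator, filter, operator*filter) are notation introduced here for illustration.

```latex
\begin{aligned}
\hat{\sigma}^2 &= MSE = 0.1850 \\
\hat{\sigma}_{OF}^2 &= \frac{0.27620370 - 0.18500000}{3} = \frac{0.09120370}{3} \approx 0.0304 \\
\hat{\sigma}_{O}^2 &= \frac{3.43925926 - 0.27620370}{9} = \frac{3.16305556}{9} \approx 0.3515 \\
\hat{\sigma}_{F}^2 &= \frac{2.31694444 - 0.27620370}{12} = \frac{2.04074074}{12} \approx 0.1701
\end{aligned}
```

The divisors 3, 9 and 12 are the coefficients SAS prints in the expected mean squares (n, bn and an with a = 4 operators, b = 3 filters, n = 3).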
SAS Random-Effects Output – continued (RANDOM statement with the /TEST option)

Tests of Hypotheses for Random Model Analysis of Variance
Dependent Variable: LOSS

Source: OPERATOR
Error: MS(OPERATOR*FILTER)
                       Denominator    Denominator
DF    Type III MS               DF             MS    F Value    Pr > F
 3    3.4392592593               6    0.2762037037    12.4519    0.0055

Source: FILTER
Error: MS(OPERATOR*FILTER)
                       Denominator    Denominator
DF    Type III MS               DF             MS    F Value    Pr > F
 2    2.3169444444               6    0.2762037037     8.3885    0.0183

Source: OPERATOR*FILTER
Error: MS(Error)
                       Denominator    Denominator
DF    Type III MS               DF             MS    F Value    Pr > F
 6    0.2762037037              24           0.185     1.4930    0.2229
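The F values in these tests can be reproduced by hand from the mean squares, which makes the choice of denominator explicit. For comparison, the default GLM table for the same data reports F = 18.59 for operator and F = 12.52 for filter, because it divides by MS(Error) as in the fixed-effects analysis.

```latex
\begin{aligned}
F_{\text{OPERATOR}} &= \frac{MS(\text{OPERATOR})}{MS(\text{OPERATOR*FILTER})} = \frac{3.4392592593}{0.2762037037} \approx 12.45 \quad (df = 3,\ 6) \\
F_{\text{FILTER}} &= \frac{MS(\text{FILTER})}{MS(\text{OPERATOR*FILTER})} = \frac{2.3169444444}{0.2762037037} \approx 8.39 \quad (df = 2,\ 6) \\
F_{\text{OPERATOR*FILTER}} &= \frac{MS(\text{OPERATOR*FILTER})}{MS(\text{Error})} = \frac{0.2762037037}{0.185} \approx 1.49 \quad (df = 6,\ 24)
\end{aligned}
```

Both main effects remain significant at the .05 level under the random-effects tests, but with larger p-values (0.0055 and 0.0183) than the 0.0001 and 0.0002 reported in the default table.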