Always be contented, be grateful, be understanding and be compassionate.
Blocking
• We add a factor, even if it is not of interest, so that the primary factors are studied under more homogeneous conditions. This factor is called a “block”. Most of the time, the block does not interact with the primary factors.
• Popular block factors are “location”, “gender”, and so on.
• A two-factor design with one block factor is called a “randomized block design”.
RBD Model (Section 15.2)
• A randomized (complete) block design (RBD) is an experimental design for comparing t treatments (or levels) in b blocks. Treatments are randomly assigned to units within each block, without replication.
• The probability model of an RBD is the same as the two-way ANOVA model with no interaction term, so multiple comparisons can be conducted for each factor separately.
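For reference, the no-interaction two-way model mentioned above can be written out as follows. The symbols (μ for the overall mean, τ for treatment effects, β for block effects, ε for error) are notation assumed here, not taken from the slides.

```latex
y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij},
\qquad i = 1,\dots,t;\quad j = 1,\dots,b
% With one observation per treatment-block cell, the error term has
% (t - 1)(b - 1) degrees of freedom.
```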
For example, suppose that we are studying worker absenteeism as a function of the age of the worker, with three age levels: 25-30, 40-55, and 55-60. However, a worker’s gender may also affect his/her amount of absenteeism. Even though we are not particularly concerned with the impact of gender, we want to ensure that the gender factor does not pollute our conclusions about the effect of age. Moreover, it seems unlikely that “gender” interacts with “age”. We therefore include “gender” as a block factor.
O/L: Example 15.1
• Goal: To compare the effects of 3 different insecticides on a variety of string beans.
• Condition: It was necessary to use 4 different plots of land.
• Response of interest: the number of seedlings that emerged per row.
Data:
insecticide  plot  seedlings
1            1     56
1            2     48
1            3     66
1            4     62
2            1     83
2            2     78
2            3     94
2            4     93
3            1     80
3            2     72
3            3     83
3            4     85
Minitab >> General Linear Model; response: seedlings; model: insecticide and plot

General Linear Model: seedlings versus insecticide, plot

Analysis of Variance for seedlings, using Adjusted SS for Tests

Source       DF   Seq SS   Adj SS   Adj MS       F      P
insecticide   2  1832.00  1832.00   916.00  211.38  0.000
plot          3   438.00   438.00   146.00   33.69  0.000
Error         6    26.00    26.00     4.33
Total        11  2296.00

S = 2.08167   R-Sq = 98.87%   R-Sq(adj) = 97.92%

Unusual Observations for seedlings

Obs  seedlings      Fit  SE Fit  Residual  St Resid
 11    83.0000  86.0000  1.4720   -3.0000    -2.04 R

R denotes an observation with a large standardized residual.
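As a cross-check on the Minitab table, the sums of squares and F ratios can be recomputed by hand from the Example 15.1 data. The following is a minimal sketch in plain Python. The variable names are illustrative, and it does not compute p-values.

```python
# Seedling counts from Example 15.1: one list per insecticide, ordered by plot 1..4.
data = {
    1: [56, 48, 66, 62],
    2: [83, 78, 94, 93],
    3: [80, 72, 83, 85],
}

rows = [data[i] for i in sorted(data)]
t = len(rows)      # number of treatments (insecticides)
b = len(rows[0])   # number of blocks (plots)

grand = sum(sum(r) for r in rows) / (t * b)
trt_means = [sum(r) / b for r in rows]
blk_means = [sum(r[j] for r in rows) / t for j in range(b)]

# Sums of squares for the no-interaction two-way layout.
ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
ss_blk = t * sum((m - grand) ** 2 for m in blk_means)
ss_tot = sum((y - grand) ** 2 for r in rows for y in r)
ss_err = ss_tot - ss_trt - ss_blk

df_trt, df_blk = t - 1, b - 1
df_err = df_trt * df_blk
ms_err = ss_err / df_err

f_trt = (ss_trt / df_trt) / ms_err
f_blk = (ss_blk / df_blk) / ms_err

print("insecticide:", ss_trt, round(f_trt, 2))
print("plot:       ", ss_blk, round(f_blk, 2))
print("error:      ", ss_err, "on", df_err, "df")
```

The printed sums of squares and F ratios should agree with the insecticide, plot, and Error rows of the Minitab output.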
RBD with random blocks
• We would like our conclusions to apply to a large pool of blocks.
• We are able to sample blocks randomly.
• Example: Minitab unit 5
• Goal: to study the differences among 3 appraisers in their appraised values
• Blocks: 5 randomly selected properties
Latin Square Design (Section 15.3)
Example: Three factors, A (block factor), B (block factor), and C (treatment factor), each at three levels. A possible arrangement:

      B1   B2   B3
A1    C1   C1   C1
A2    C2   C2   C2
A3    C3   C3   C3
Notice, first, that these designs are squares: all factors have the same number of levels, though there is no restriction on the nature of the levels themselves. Notice that these squares are balanced: each letter (level) appears the same number of times; this ensures unbiased estimates of main effects. How is this achieved in a Latin square? Each treatment appears exactly once in every row and every column. Notice that these designs are incomplete: of the 27 possible combinations of three factors each at three levels, we use only 9.
Example: Three factors, A (block factor), B (block factor), and C (treatment factor), each at three levels, in a Latin Square design; nine combinations.

      B1   B2   B3
A1    C1   C2   C3
A2    C2   C3   C1
A3    C3   C1   C2
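The "once in every row and every column" property can be checked mechanically. Below is a small sketch: `cyclic_latin_square` and `is_latin_square` are hypothetical helper names, and the cyclic shift is just one simple way to build a valid square (it happens to reproduce the 3 x 3 arrangement above). In practice a square would be chosen at random.

```python
def cyclic_latin_square(m):
    """Build an m x m square by cyclically shifting the levels 1..m."""
    return [[(i + j) % m + 1 for j in range(m)] for i in range(m)]


def is_latin_square(square):
    """True if every level 1..m appears exactly once in each row and column."""
    m = len(square)
    levels = set(range(1, m + 1))
    rows_ok = all(len(row) == m and set(row) == levels for row in square)
    cols_ok = rows_ok and all(
        {square[i][j] for i in range(m)} == levels for j in range(m)
    )
    return rows_ok and cols_ok


latin = cyclic_latin_square(3)
for row in latin:
    print(row)

# An arrangement with the same treatment level across each row is balanced
# in counts but is not a Latin square: treatment is confounded with rows.
confounded = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(is_latin_square(latin), is_latin_square(confounded))
```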
Example with 4 Levels per Factor
FACTORS
• Automobiles A: four levels
• Tire positions B: four levels
• Tire treatments C: four levels
VARIABLE
• Lifetime of a tire (days)

        B1        B2        B3        B4
A1    C4 855    C3 877    C2 890    C1 997
A2    C1 962    C2 817    C3 845    C4 776
A3    C3 848    C4 841    C1 784    C2 776
A4    C2 831    C1 952    C4 806    C3 871
The Model for (Unreplicated) Latin Squares
Three factors, $\rho$, $\tau$, and $\gamma$, each at m levels:

$$y_{ijk} = \mu + \rho_i + \tau_j + \gamma_k + \varepsilon_{ijk}, \qquad i = 1,\dots,m;\quad j = 1,\dots,m;\quad k = 1,\dots,m$$

Example: Y = A + B + C + e
Note that interaction (AB, AC, BC, ABC) is not present in the model.
Same three assumptions: normality, constant variance, and randomness.
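Putting in Estimates:

$$y_{ijk} = \bar{y}_{\cdot\cdot\cdot} + (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}) + R$$

or, bringing $\bar{y}_{\cdot\cdot\cdot}$ to the left-hand side,

$$(y_{ijk} - \bar{y}_{\cdot\cdot\cdot}) = (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}) + R,$$

that is,
Total variability among yields = Variability among yields associated with Rows + Variability among yields associated with Columns + Variability among yields associated with Inside Factor + R,
where

$$R = y_{ijk} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot k} + 2\bar{y}_{\cdot\cdot\cdot}$$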
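Actually,

$$R = y_{ijk} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot k} + 2\bar{y}_{\cdot\cdot\cdot} = (y_{ijk} - \bar{y}_{\cdot\cdot\cdot}) - (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) - (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) - (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}),$$

an “interaction-like” term. (After all, there’s no replication!)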
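The analysis of variance (omitting the mean squares, which are the ratios of the sums of squares to the degrees of freedom), and expectations of mean squares:

$$\begin{array}{llll}
\text{Source of variation} & \text{Sum of squares} & \text{Degrees of freedom} & \text{Expected value of mean square} \\
\text{Rows} & m \sum_{i=1}^{m} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 & m - 1 & \sigma^2 + V_{\text{Rows}} \\
\text{Columns} & m \sum_{j=1}^{m} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 & m - 1 & \sigma^2 + V_{\text{Col}} \\
\text{Inside factor} & m \sum_{k=1}^{m} (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})^2 & m - 1 & \sigma^2 + V_{\text{Inside factor}} \\
\text{Error} & \text{by subtraction} & (m - 1)(m - 2) & \sigma^2 \\
\text{Total} & \sum_{i}\sum_{j}\sum_{k} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2 & m^2 - 1 &
\end{array}$$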
The expected values of the mean squares immediately suggest the F ratios appropriate for testing null hypotheses on rows, columns and inside factor.
Our Example: (Inside factor = Tire Treatment)
[Table: tire lifetimes by automobile (rows) and tire position (columns)]
General Linear Model: Lifetime versus Auto, Postn, Trtmnt

Factor  Type   Levels  Values
Auto    fixed       4  1 2 3 4
Postn   fixed       4  1 2 3 4
Trtmnt  fixed       4  1 2 3 4

Analysis of Variance for Lifetime, using Adjusted SS for Tests

Source  DF  Seq SS  Adj SS  Adj MS     F      P
Auto     3   17567   17567    5856  2.17  0.192
Postn    3    4679    4679    1560  0.58  0.650
Trtmnt   3   26722   26722    8907  3.31  0.099
Error    6   16165   16165    2694
Total   15   65132

Unusual Observations for Lifetime

Obs  Lifetime      Fit  SE Fit  Residual  St Resid
 11   784.000  851.250  41.034   -67.250    -2.12 R
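The Latin square sums of squares can also be recomputed directly from the formulas in the ANOVA table. The sketch below uses the 16 tire lifetimes as read from the 4-level example table (each record is lifetime, automobile, position, treatment). The variable and function names are illustrative, and p-values are not computed.

```python
# (lifetime, auto, position, treatment), read from the 4 x 4 tire example.
records = [
    (855, 1, 1, 4), (962, 2, 1, 1), (848, 3, 1, 3), (831, 4, 1, 2),
    (877, 1, 2, 3), (817, 2, 2, 2), (841, 3, 2, 4), (952, 4, 2, 1),
    (890, 1, 3, 2), (845, 2, 3, 3), (784, 3, 3, 1), (806, 4, 3, 4),
    (997, 1, 4, 1), (776, 2, 4, 4), (776, 3, 4, 2), (871, 4, 4, 3),
]
m = 4  # levels per factor
grand = sum(r[0] for r in records) / len(records)


def factor_ss(col):
    """Sum of squares for the factor stored in tuple position `col`."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[col], []).append(rec[0])
    return sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in groups.values())


ss_auto = factor_ss(1)
ss_posn = factor_ss(2)
ss_trt = factor_ss(3)
ss_total = sum((r[0] - grand) ** 2 for r in records)
ss_error = ss_total - ss_auto - ss_posn - ss_trt  # "by subtraction"

df_factor = m - 1
df_error = (m - 1) * (m - 2)
ms_error = ss_error / df_error

f_auto = (ss_auto / df_factor) / ms_error
f_posn = (ss_posn / df_factor) / ms_error
f_trt = (ss_trt / df_factor) / ms_error

for name, ss, f in [("Auto", ss_auto, f_auto),
                    ("Postn", ss_posn, f_posn),
                    ("Trtmnt", ss_trt, f_trt)]:
    print(name, round(ss, 1), round(f, 2))
print("Error", round(ss_error, 1), "on", df_error, "df")
```

Up to rounding, the results should match the Seq SS and F columns of the Minitab output.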
Minitab DATA ENTRY
VAR1  VAR2  VAR3  VAR4
855   1     1     4
962   2     1     1
848   3     1     3
831   4     1     2
877   1     2     3
817   2     2     2
.     .     .     .
.     .     .     .
.     .     .     .
871   4     4     3
Latin Square with REPLICATION
• Case One: using the same rows and columns for all Latin squares.
• Case Two: using different rows and columns for different Latin squares.
• Case Three: using the same rows but different columns for different Latin squares.
Treatment Assignments for n Replications
• Case One: repeat the same Latin square n times.
• Case Two: randomly select one Latin square for each replication.
• Case Three: randomly select one Latin square for each replication.
Example: n = 2, m = 4, trtmnt = A, B, C, D
Case One
• Row = 4 tire positions; column = 4 cars
Case Two
• Row = clinics; column = patients; letter = drugs for flu
Case Three
• Row = 4 tire positions; column = 8 cars
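ANOVA for Case 1
SSBR, SSBC, and SSBIF are computed the same way as before, except that the multiplier changes: for rows, say, $m \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$ becomes $mn \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$, and the degrees of freedom for error become

$$(nm^2 - 1) - 3(m - 1) = nm^2 - 3m + 2$$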
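As a quick worked instance of the error degrees of freedom, take the earlier example sizes n = 2 and m = 4. Setting n = 1 also recovers the unreplicated value (m - 1)(m - 2).

```latex
% n = 2 replications, m = 4 levels:
(nm^2 - 1) - 3(m - 1) = (2 \cdot 16 - 1) - 3 \cdot 3 = 31 - 9 = 22
% n = 1 (no replication):
m^2 - 3m + 2 = (m - 1)(m - 2), \quad \text{e.g. } 16 - 12 + 2 = 6 \text{ for } m = 4
```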
ANOVA for other cases:
• SS: please refer to the book Statistical Principles of Research Design and Analysis by R. Kuehl.
• DF: (number of levels - 1) for all terms except error. DF of error = total DF minus the sum of the DFs of the other terms.
Using Minitab in the same way gives the ANOVA tables for all cases.
Three or More Factors
Notation:
• Y = response; A, B, C, … = input factors
• AB = interaction between A and B
• ABC = interaction between A, B, and C
• A term involving k factors has order k, e.g., AB is an order 2 term and ABC is an order 3 term.
Full model = the model that includes all factors and all their interactions, denoted as
(1) Two factors: A|B (= A + B + AB)
(2) Three factors: A|B|C (= A + B + C + AB + AC + BC + ABC)
(3) And so on.
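The expansion of the "|" notation is simply every non-empty subset of the factors. A small sketch, where `full_model_terms` is a hypothetical helper name and not a Minitab function:

```python
from itertools import combinations


def full_model_terms(factors):
    """List all main effects and interactions, ordered by term order."""
    return [
        "".join(combo)
        for order in range(1, len(factors) + 1)
        for combo in combinations(factors, order)
    ]


two = full_model_terms("AB")
three = full_model_terms("ABC")
print(" + ".join(two))    # expansion of A|B
print(" + ".join(three))  # expansion of A|B|C
print(len(full_model_terms("ABCD")))  # k factors give 2**k - 1 terms
```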
Backward Model Selection
1. Fit the full model and delete the most insignificant highest order term.
2. Fit the reduced model from step 1 and delete the most insignificant highest order term.
3. Repeat step 2 until all remaining highest order terms are significant.
4. Repeat the same procedure (deleting the most insignificant term each time, until no insignificant terms remain) for the 2nd highest order, then the 3rd highest order, …, and finally the order 1 terms.
5. Determine the final model and check its assumptions.
Note. If a term is in the current model, then all lower order terms involving factors in that term must not be deleted, even if they are insignificant. E.g., if ABC is significant (so it is in the model), then A, B, C, AB, AC, and BC cannot be deleted.
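The backward procedure, together with the rule in this note, can be sketched as a loop. This is only an illustration under stated assumptions: `pvalues_for` is a hypothetical callback that would refit the model and return a p-value per term (in practice, from Minitab or another package), and the p-values in the demo are made up and held fixed, whereas real p-values change after every refit.

```python
def backward_select(terms, pvalues_for, alpha=0.05):
    """Backward selection by term order, never deleting a term that is
    contained in a higher order term still in the model."""
    model = set(terms)
    for order in range(max(len(t) for t in model), 0, -1):
        while True:
            # Terms of this order that no remaining higher order term contains.
            candidates = [
                t for t in model
                if len(t) == order
                and not any(set(t) < set(u) for u in model)
            ]
            if not candidates:
                break
            pvals = pvalues_for(model)  # hypothetical refit of the current model
            worst = max(candidates, key=lambda t: pvals[t])
            if pvals[worst] <= alpha:
                break  # every deletable term of this order is significant
            model.remove(worst)
    return model


# Made-up, fixed p-values purely for demonstration.
demo_p = {"A": 0.20, "B": 0.03, "C": 0.50,
          "AB": 0.01, "AC": 0.40, "BC": 0.30, "ABC": 0.60}
final = backward_select(demo_p.keys(), lambda model: demo_p)
print(sorted(final, key=lambda t: (len(t), t)))
```

In the demo, A stays in the final model despite its large p-value because AB is retained, which is exactly the protection the note describes.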
Note. Backward model selection can be very time-consuming if the number of factors, k, is large. In such cases, we delete all insignificant terms together when processing terms of order 4 or higher.
• Examples are in Minitab unit 11.