Design and Analysis of Multi-Factored Experiments Two-level Factorial Designs DOE Course
The 2^k Factorial Design
• Special case of the general factorial design: k factors, all at two levels
• The two levels are usually called low and high (they can be either quantitative or qualitative)
• Very widely used in industrial experimentation
• Forms a basic “building block” for other very useful experimental designs (DNA)
• Special (short-cut) methods for analysis
• We will make use of Design-Expert for analysis
Chemical Process Example
A = reactant concentration, B = catalyst amount, y = recovery
The Simplest Case: The 2^2
“-” and “+” denote the low and high levels of a factor, respectively.
Low and high are arbitrary terms.
Geometrically, the four runs form the corners of a square.
Factors can be quantitative or qualitative, although their treatment in the final model will be different.
Estimating effects in two-factor two-level experiments
Estimate of the effect of A:
  a1b1 - a0b1   estimate of effect of A at high B
  a1b0 - a0b0   estimate of effect of A at low B
  sum/2         estimate of effect of A over all B
Or: average of the high-A yields minus average of the low-A yields.
Estimate of the effect of B:
  a1b1 - a1b0   estimate of effect of B at high A
  a0b1 - a0b0   estimate of effect of B at low A
  sum/2         estimate of effect of B over all A
Or: average of the high-B yields minus average of the low-B yields.
Estimating effects in two-factor two-level experiments
Estimate of the interaction of A and B:
  a1b1 - a0b1    estimate of effect of A at high B
  a1b0 - a0b0    estimate of effect of A at low B
  difference/2   estimate of the effect of B on the effect of A, called the interaction of A and B
  a1b1 - a1b0    estimate of effect of B at high A
  a0b1 - a0b0    estimate of effect of B at low A
  difference/2   estimate of the effect of A on the effect of B, called the interaction of B and A
Or: average of the yields with like signs (a1b1, a0b0) minus average of the yields with unlike signs (a1b0, a0b1).
Estimating effects, contd.
Note that the two differences in the interaction estimate are identical; by definition, the interaction of A and B is the same as the interaction of B and A. In a given experiment the experimenter may prefer one of the two verbal statements of the interaction to the other, but both have the same numerical value.
Remarks on effects and estimates
• Note the use of all four yields in the estimates of the effect of A, the effect of B, and the effect of the interaction of A and B; all four yields are needed and are used in each estimate.
• Note also that the effect of each of the factors and their interaction can be and are assessed separately, even though both factors vary simultaneously in the experiment.
• Note that with respect to the two factors studied, the factors themselves together with their interaction are, logically, all that can be studied.
These are among the merits of these factorial designs.
Remarks on interaction
Many scientists feel the need for experiments which will reveal the effect, on the variable under study, of factors acting jointly. This is what we have called interaction. The simple experimental design discussed here evidently provides a way of estimating such interaction, with the latter defined in a way which corresponds to what many scientists have in mind when they think of interaction. It is useful to note that interaction was not invented by statisticians. It is a joint effect existing, often prominently, in the real world. Statisticians have merely provided ways and means to measure it.
Symbolism and language
A is called a main effect. Our estimate of A is often simply written A.
B is called a main effect. Our estimate of B is often simply written B.
AB is called an interaction effect. Our estimate of AB is often simply written AB.
So the same letter is used, generally without confusion, to describe the factor, to describe its effect, and to describe our estimate of its effect. Keep in mind that it is only for economy in writing that we sometimes speak of an effect rather than an estimate of the effect. We should always remember that all quantities formed from the yields are merely estimates.
Table of signs
The following table is useful:
        a0b0   a0b1   a1b0   a1b1
  A      -      -      +      +
  B      -      +      -      +
  AB     +      -      -      +
Notice that in estimating A, the two treatments with A at high level are compared to the two treatments with A at low level. Similarly for B. This is, of course, logical.
Note that the signs of treatments in the estimate of AB are the products of the signs of A and B.
Note that in each estimate, plus and minus signs are equal in number.
[Figure: Examples 1 to 4, each plotting yield Y against the level of A (low, high) with separate lines for B- and B+]
Discussion of examples: Notice that in examples 2 and 3 the interaction is as large as or larger than the main effects.
*A = [-(1) - b + a + ab]/2 = [-10 - 12 + 13 + 15]/2 = 3
• Change of scale, by multiplying each yield by a constant, multiplies each estimate by the constant but does not affect the relationship of estimates to each other.
• Addition of a constant to each yield does not affect the estimates.
• The numerical magnitude of the estimates is not important here; what matters is their relationship to each other.
Modern notation and Yates’ order
Modern notation: a0b0 = 1, a0b1 = b, a1b0 = a, a1b1 = ab
We also introduce Yates’ (standard) order of treatments and yields: each letter in turn, followed by all combinations of that letter and letters already introduced. This will be the preferred order for the purpose of analysis of the yields. It is not necessarily the order in which the experiment is conducted; that will be discussed later.
For a two-factor two-level factorial design, Yates’ order is: 1, a, b, ab
Using modern notation and Yates’ order, the estimates of effects become:
  A  = (-1 + a - b + ab)/2
  B  = (-1 - a + b + ab)/2
  AB = ( 1 - a - b + ab)/2
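The three formulas above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `effects_2x2` is hypothetical, and the yields (1) = 10, a = 13, b = 12, ab = 15 are the ones used in the worked footnote for the effect of A in the examples slide.

```python
def effects_2x2(y1, ya, yb, yab):
    """Return the estimates (A, B, AB) from yields given in Yates' order: (1), a, b, ab."""
    A = (-y1 + ya - yb + yab) / 2
    B = (-y1 - ya + yb + yab) / 2
    AB = (y1 - ya - yb + yab) / 2
    return A, B, AB


# Yields taken from the worked footnote: (1) = 10, a = 13, b = 12, ab = 15
A, B, AB = effects_2x2(10, 13, 12, 15)
print(A, B, AB)

# Adding a constant to every yield leaves the estimates unchanged
print(effects_2x2(110, 113, 112, 115))
```

The second call illustrates the earlier remark that adding a constant to each yield does not affect the estimates.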
Three factors each at two levels
Example: The variable is the yield of a nitration process. The yield forms the base material for certain dyestuffs and medicines.
  Factor                                Low        High
  A  time of addition of nitric acid    2 hours    7 hours
  B  stirring time                      1/2 hour   4 hours
  C  heel                               absent     present
Treatments (also yields): (i) old notation, (ii) new notation
  (i)   a0b0c0  a0b0c1  a0b1c0  a0b1c1  a1b0c0  a1b0c1  a1b1c0  a1b1c1
  (ii)  1       c       b       bc      a       ac      ab      abc
Yates’ order: 1 a b ab c ac bc abc
Effects in the 2^3 Factorial Design
Estimating effects in three-factor two-level designs (2^3)
Estimate of A:
  (1) a - 1      estimate of A, with B low and C low
  (2) ab - b     estimate of A, with B high and C low
  (3) ac - c     estimate of A, with B low and C high
  (4) abc - bc   estimate of A, with B high and C high
A = (a + ab + ac + abc - 1 - b - c - bc)/4
  = (-1 + a - b + ab - c + ac - bc + abc)/4 (in Yates’ order)
Estimate of AB
  [effect of A with B high - effect of A with B low], all at C high
  plus
  [effect of A with B high - effect of A with B low], all at C low
Note that interactions are averages. Just as our estimate of A is an average of the response to A over all B and all C, so our estimate of AB is an average of the response to AB over all C.
AB = {[(4) - (3)] + [(2) - (1)]}/4
   = (1 - a - b + ab + c - ac - bc + abc)/4, in Yates’ order
or, = [(abc + ab + c + 1) - (a + b + ac + bc)]/4
Estimate of ABC
  interaction of A and B at C high minus interaction of A and B at C low
ABC = {[(4) - (3)] - [(2) - (1)]}/4
    = (-1 + a + b - ab + c - ac - bc + abc)/4, in Yates’ order
or, = [(abc + a + b + c) - (1 + ab + ac + bc)]/4
This is our first encounter with a three-factor interaction. It measures the impact, on the yield of the nitration process, of interaction AB when C (heel) goes from C absent to C present. Or it measures the impact on yield of interaction AC when B (stirring time) goes from 1/2 hour to 4 hours. Or finally, it measures the impact on yield of interaction BC when A (time of addition of nitric acid) goes from 2 hours to 7 hours. As with two-factor two-level factorial designs, the formation of estimates in three-factor two-level factorial designs can be summarized in a table.
Sign Table for a 2^3 design
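A sign table of this kind can be generated rather than written out by hand. The sketch below is an illustration only, assuming the rule stated earlier that the sign of a treatment in an interaction is the product of its signs in the constituent main effects; the helper names `yates_labels` and `sign_table` are hypothetical.

```python
def yates_labels(factors):
    """Treatment labels in Yates' order, e.g. 'abc' -> (1), a, b, ab, c, ac, bc, abc."""
    labels = ["(1)"]
    for f in factors:
        # Each new letter is followed by its combinations with all labels so far.
        labels += [f if lab == "(1)" else lab + f for lab in labels]
    return labels


def sign_table(factors):
    """Map each treatment to its sign (+1 or -1) in every effect estimate."""
    treatments = yates_labels(factors)
    effects = [lab.upper() for lab in treatments[1:]]  # A, B, AB, C, ...
    table = {}
    for t in treatments:
        row = {}
        for e in effects:
            sign = 1
            for letter in e:
                # A factor is at its high level when its lowercase letter is in the label.
                sign *= 1 if letter.lower() in t else -1
            row[e] = sign
        table[t] = row
    return treatments, effects, table


treatments, effects, table = sign_table("abc")
print("      " + " ".join(f"{e:>4}" for e in effects))
for t in treatments:
    print(f"{t:>5} " + " ".join(f"{table[t][e]:>+4d}" for e in effects))
```

Calling `sign_table("abcd")` would give the corresponding table for a 2^4 design in the same order.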
Example
Yield of the nitration process discussed earlier:
  Treatment   1     a     b     ab    c     ac    bc    abc
  Y           7.2   8.4   2.0   3.0   6.7   9.2   3.4   3.7
  A   = main effect of nitric acid time  =  1.25
  B   = main effect of stirring time     = -4.85
  AB  = interaction of A and B           = -0.60
  C   = main effect of heel              =  0.60
  AC  = interaction of A and C           =  0.15
  BC  = interaction of B and C           =  0.45
  ABC = interaction of A, B, and C       = -0.50
NOTE: ac = largest yield; AC = smallest effect
We describe several of these estimates, though on later analysis of this example, taking into account the unreliability of estimates based on a small number (eight) of yields, some estimates may turn out to be so small in magnitude as not to contradict the conjecture that the corresponding true effect is zero. The largest estimate is -4.85, the estimate of B; an increase in stirring time, from 1/2 to 4 hours, is associated with a decline in yield. The interaction AB = -0.6; an increase in stirring time from 1/2 to 4 hours reduces the effect of A, whatever it is (A = 1.25), on yield. Or equivalently,
an increase in nitric acid time from 2 to 7 hours reduces (makes more negative) the already negative effect (B = -4.85) of stirring time on yield. Finally, ABC = -0.5. Going from no heel to heel, the negative interaction effect AB on yield becomes even more negative. Or going from low to high stirring time, the positive interaction effect AC is reduced. Or going from low to high nitric acid time, the positive interaction effect BC is reduced. All three descriptions of ABC have the same numerical value; but the chemist would select one of them, then say it better.
Number and kinds of effects
We introduce the notation 2^k. This means a k-factor design with each factor at two levels. The number of treatments in an unreplicated 2^k design is 2^k. The following table shows the number of each kind of effect for each of the six two-level designs shown across the top.
                        2^2   2^3   2^4   2^5   2^6   2^7
  Main effect            2     3     4     5     6     7
  2 factor interaction   1     3     6    10    15    21
  3 factor interaction         1     4    10    20    35
  4 factor interaction               1     5    15    35
  5 factor interaction                     1     6    21
  6 factor interaction                           1     7
  7 factor interaction                                 1
  Total                  3     7    15    31    63   127
In a 2^k design, the number of r-factor effects is C(k, r) = k!/[r!(k-r)!]
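The counts follow directly from the binomial coefficient. As a quick check (a sketch only; the function name `effect_counts` is hypothetical), Python's standard `math.comb` reproduces the totals 3, 7, 15, 31, 63, 127 for k = 2 to 7:

```python
from math import comb


def effect_counts(k):
    """Number of r-factor effects in a 2^k design, for r = 1 (main effects) up to k."""
    return {r: comb(k, r) for r in range(1, k + 1)}


for k in range(2, 8):
    counts = effect_counts(k)
    # The total is always 2^k - 1, one less than the number of treatments.
    print(f"2^{k}: {counts}  total = {sum(counts.values())}")
```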
Notice that the total number of effects estimated in any design is always one less than the number of treatments.
In a 2^2 design, there are 2^2 = 4 treatments; we estimate 2^2 - 1 = 3 effects.
In a 2^3 design, there are 2^3 = 8 treatments; we estimate 2^3 - 1 = 7 effects.
One need not repeat the earlier logic to determine the forms of estimates in 2^k designs for higher values of k. A table going up to 2^5 follows.
[Table: signs of each treatment (rows) in each effect estimate (columns) for the 2^2, 2^3, 2^4 and 2^5 designs]
Yates’ Forward Algorithm (1)
1. Applied to Complete Factorials (Yates, 1937)
A systematic method of calculating estimates of effects.
For complete factorials, first arrange the yields in Yates’ (standard) order.
To form each new column, add adjacent pairs of entries from the previous column, then subtract them (second entry minus first); the sums fill the top half of the new column and the differences the bottom half.
The addition and subtraction operations are repeated until 2^k terms appear in each line: for a 2^k design there will be k columns of calculations.
Yates’ Forward Algorithm (2)
Example: Yield of a nitration process
  Tr.   Yield   1st Col   2nd Col   3rd Col
  1     7.2     15.6      20.6      43.6     Contrast of µ
  a     8.4      5.0      23.0       5.0     Contrast of A
  b     2.0     15.9       2.2     -19.4     Contrast of B
  ab    3.0      7.1       2.8      -2.4     Contrast of AB
  c     6.7      1.2     -10.6       2.4     Contrast of C
  ac    9.2      1.0      -8.8       0.6     Contrast of AC
  bc    3.4      2.5      -0.2       1.8     Contrast of BC
  abc   3.7      0.3      -2.2      -2.0     Contrast of ABC
Again, note the line-by-line correspondence between treatments and estimates; both are in Yates’ order.
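The column-by-column procedure can be sketched in a few lines of Python. This is an illustrative implementation, not the course's own code; the function name `yates` is hypothetical. It assumes the yields are already in Yates' order and that differences are taken as second entry minus first, which is what the worked table shows. Dividing each contrast after the first by 2^(k-1) = 4 recovers the effect estimates quoted earlier for the nitration example.

```python
def yates(yields):
    """Return the final contrast column of Yates' forward algorithm.

    `yields` must be in Yates' order and have length 2**k.
    """
    n = len(yields)
    k = n.bit_length() - 1
    if k < 1 or n != 2 ** k:
        raise ValueError("number of yields must be a power of two (at least 2)")
    col = list(yields)
    for _ in range(k):
        # Sums of adjacent pairs fill the top half, differences the bottom half.
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs
    return col


labels = ["total", "A", "B", "AB", "C", "AC", "BC", "ABC"]
nitration = [7.2, 8.4, 2.0, 3.0, 6.7, 9.2, 3.4, 3.7]

contrasts = yates(nitration)
# Effect estimate = contrast / 2^(k-1); here k = 3, so divide by 4.
effects = {lab: c / 4 for lab, c in zip(labels[1:], contrasts[1:])}

for lab, c in zip(labels, contrasts):
    print(f"{lab:>5}: contrast = {c:7.2f}")
for lab, e in effects.items():
    print(f"{lab:>5}: effect = {e:6.2f}")
```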
Main effects in the face of large interactions
Several writers have cautioned against making statements about main effects when the corresponding interactions are large; interactions describe the dependence of the impact of one factor on the level of another; in the presence of large interaction, main effects may not be meaningful.
Example (Adapted from Kempthorne)
Yields are in bushels of potatoes per plot. The two factors are nitrate (N) and phosphate (P) fertilizers.
  Factor   Low level (-1)    High level (+1)
  N (A)    blood             sulphate of ammonia
  P (B)    superphosphate    steamed bone flour
The yields are: 1 = 746.75, n = 625.75, p = 611.00, np = 656.00
The estimates are: N = -38.00, P = -52.75, NP = 83.00
In the face of such high interaction we now specialize the main effect of each factor to particular levels of the other factor.
  Effect of N at high level of P = np - p = 656.00 - 611.00 = 45.0
  Effect of N at low level of P = n - 1 = 625.75 - 746.75 = -121.0
These appear to be more valuable for fertilizer policy than the mean (-38.00) of such disparate numbers.
[Figure: yield Y against N, with separate lines for P- and P+; keeping both factors low is best]
Note that answers to these specialized questions are based on fewer than 2^k yields. In our numerical example, with interaction NP prominent, we have only two of the four yields in our estimate of N at each level of P. In general we accept high interactions wherever found and seek to explain them; in the process of explanation, main effects (and lower-order interactions) may have to be replaced in our interest by more meaningful specialized or conditional effects.
Specialized or Conditional Effects
Whenever there are large interactions, check:
• Effect of A at high level of B = A+ = A + AB
• Effect of A at low level of B = A- = A - AB
• Effect of B at high level of A = B+ = B + AB
• Effect of B at low level of A = B- = B - AB
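These four identities can be verified against the potato example, where the conditional effects were computed directly from pairs of yields. The sketch below is illustrative; the function name `conditional_effects` and the dictionary keys are hypothetical, and the yields are those quoted in the Kempthorne example.

```python
def conditional_effects(A, B, AB):
    """Conditional (specialized) effects from the main effects and their interaction."""
    return {
        "A at high B": A + AB,
        "A at low B": A - AB,
        "B at high A": B + AB,
        "B at low A": B - AB,
    }


# Potato yields in Yates' order: (1), n, p, np
y1, yn, yp, ynp = 746.75, 625.75, 611.00, 656.00

N = (-y1 + yn - yp + ynp) / 2   # main effect of nitrate
P = (-y1 - yn + yp + ynp) / 2   # main effect of phosphate
NP = (y1 - yn - yp + ynp) / 2   # interaction

cond = conditional_effects(N, P, NP)
for name, value in cond.items():
    print(f"{name}: {value:.2f}")

# Direct differences of yields give the same numbers
print(ynp - yp, yn - y1)
```

Here factor A plays the role of N and factor B the role of P, so "A at high B" is the effect of N at the high level of P.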
Factors not studied
In any experiment, factors other than those studied may be influential. Their presence is sometimes acknowledged under the dubious title “experimental error”. They may be neglected, but the usual cost of neglect is high. For they often have uneven impact, systematically affecting some treatments more than others, and thereby seriously confounding inferences on the studied factors. It is important to deal explicitly with them; even more, it is important to measure their impact. How?
1. Hold them constant.
2. Randomize their effects.
3. Estimate their magnitude by replicating the experiment.
4. Estimate their magnitude via side or earlier experiments.
5. Argue (convincingly) that the effects of some of these non-studied factors are zero, either in advance of the experiment or in the light of the yields.
6. Confound certain non-studied factors.
Simplified Analysis Procedure for 2-level Factorial Design
• Estimate factor effects
• Formulate model using important effects
• Check for goodness-of-fit of the model
• Interpret results
• Use model for prediction
Example: Shooting baskets
Consider an experiment with 3 factors: A, B, and C. Let the response variable be Y. For example:
• Y = number of baskets made out of 10
• Factor A = distance from basket (2 m or 5 m)
• Factor B = direction of shot (0° or 90°)
• Factor C = type of shot (set or jumper)
  Factor   Name        Units   Low Level (-1)   High Level (+1)
  A        Distance    m       2                5
  B        Direction   Deg.    0                90
  C        Shot type           Set              Jump
Treatment Combinations and Results
  Order   A    B    C    Combination   Y
  1       -1   -1   -1   (1)           9
  2       +1   -1   -1   a             5
  3       -1   +1   -1   b             7
  4       +1   +1   -1   ab            3
  5       -1   -1   +1   c             6
  6       +1   -1   +1   ac            5
  7       -1   +1   +1   bc            4
  8       +1   +1   +1   abc           2
Estimating Effects
  Order   A    B    AB   C    AC   BC   ABC   Comb   Y
  1       -1   -1   +1   -1   +1   +1   -1    (1)    9
  2       +1   -1   -1   -1   -1   +1   +1    a      5
  3       -1   +1   -1   -1   +1   -1   +1    b      7
  4       +1   +1   +1   -1   -1   -1   -1    ab     3
  5       -1   -1   +1   +1   -1   -1   +1    c      6
  6       +1   -1   -1   +1   +1   -1   -1    ac     5
  7       -1   +1   -1   +1   -1   +1   -1    bc     4
  8       +1   +1   +1   +1   +1   +1   +1    abc    2
Effect A = (a + ab + ac + abc)/4 - (1 + b + c + bc)/4
         = (5 + 3 + 5 + 2)/4 - (9 + 7 + 6 + 4)/4 = -2.75
Effects and Overall Average
Using the sign table, all 7 effects can be calculated:
  Effect A   = -2.75
  Effect B   = -2.25
  Effect C   = -1.75
  Effect AC  =  1.25
  Effect AB  = -0.25
  Effect BC  = -0.25
  Effect ABC = -0.25
The overall average value = (9 + 5 + 7 + 3 + 6 + 5 + 4 + 2)/8 = 5.13
Formulate Model
The most important effects are: A, B, C, and AC
Model: Y = b0 + b1 X1 + b2 X2 + b3 X3 + b13 X1 X3
  b0  = overall average = 5.13
  b1  = Effect [A]/2  = -2.75/2 = -1.375
  b2  = Effect [B]/2  = -2.25/2 = -1.125
  b3  = Effect [C]/2  = -1.75/2 = -0.875
  b13 = Effect [AC]/2 =  1.25/2 =  0.625
Model in coded units: Y = 5.13 - 1.375 X1 - 1.125 X2 - 0.875 X3 + 0.625 X1 X3
Checking for goodness-of-fit
  Actual Value   Predicted Value
  9.00           9.13
  5.00           5.13
  7.00           6.88
  3.00           2.87
  6.00           6.13
  5.00           4.63
  4.00           3.88
  2.00           2.37
Amazing fit!!
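The predicted values can be reproduced from the run table. The following is a minimal sketch in plain Python, not output from Design-Expert; the helper names `coefficient` and `predict` are hypothetical. It uses the unrounded overall average 5.125 rather than the 5.13 shown on the slides, so predictions may differ from the table in the last displayed digit.

```python
# (X1, X2, X3, Y) for the eight runs in standard order
runs = [
    (-1, -1, -1, 9), (+1, -1, -1, 5), (-1, +1, -1, 7), (+1, +1, -1, 3),
    (-1, -1, +1, 6), (+1, -1, +1, 5), (-1, +1, +1, 4), (+1, +1, +1, 2),
]


def coefficient(term):
    """Coefficient = effect / 2 = sum(sign * y) / 8 for the sign column given by `term`."""
    return sum(term(x1, x2, x3) * y for x1, x2, x3, y in runs) / len(runs)


b0 = coefficient(lambda x1, x2, x3: 1)         # overall average
b1 = coefficient(lambda x1, x2, x3: x1)        # A
b2 = coefficient(lambda x1, x2, x3: x2)        # B
b3 = coefficient(lambda x1, x2, x3: x3)        # C
b13 = coefficient(lambda x1, x2, x3: x1 * x3)  # AC


def predict(x1, x2, x3):
    """Reduced model in coded units: main effects of A, B, C plus the AC interaction."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x3 + b13 * x1 * x3


predicted = [predict(x1, x2, x3) for x1, x2, x3, _ in runs]
residuals = [y - p for (_, _, _, y), p in zip(runs, predicted)]

for (_, _, _, y), p, r in zip(runs, predicted, residuals):
    print(f"actual = {y:4.2f}  predicted = {p:5.3f}  residual = {r:+.3f}")
```

Because the three omitted effects (AB, BC, ABC) are all -0.25, every residual is small, which is what the comparison of actual and predicted values shows.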
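Interpreting Results
[Figure: mean number of baskets out of 10 against B (direction, 0° and 90°)]
Mean at B = 0°: (9+5+6+5)/4 = 6.25; mean at B = 90°: (7+3+4+2)/4 = 4
Effect of B = 4 - 6.25 = -2.25
[Figure: interaction plot of number of baskets out of 10 against A (distance, 2 m and 5 m), with separate lines for C (shot type): C(-1) and C(+1)]
Interaction of A and C = 1.25
At 5 m, jump and set shots are about the same, BUT at 2 m the set shot gave higher values than the jump shot.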
Design and Analysis of Multi-Factored Experiments
Analysis of 2^k Experiments: Statistical Details
Errors of estimates in 2^k designs
1. Meaning of σ²
Assume that each treatment has variance σ². This has the following meaning: consider any one treatment and imagine many replicates of it. As all factors under study are constant throughout these repetitions, the only sources of any variability in yield are the factors not under study. Any variability in yield is due to them and is measured by σ².
Errors of estimates in 2^k designs, contd.
2. Effect of the number of factors on the error of an estimate
What is the variance of an estimate of an effect? In a 2^k design, 2^k treatments go into each estimate; the signs of the treatments are + or -, depending on the effect being estimated.
So, any estimate = [1/2^(k-1)] × [generalized (+ or -) sum of 2^k treatments]
σ²(any estimate) = [1/2^(2k-2)] × [2^k σ²] = σ²/2^(k-2)
The larger the number of factors, the smaller the error of each estimate.
Note: for a constant c, σ²(cx) = c² σ²(x)
Errors of estimates in 2^k designs, contd.
3. Effect of replication on the error of an estimate
What is the effect of replication on the error of an estimate? Consider a 2^k design with each treatment replicated n times.
[Table: treatments 1, a, b, ... across the top, each with its n replicate yields listed below]
Errors of estimates in 2^k designs, contd.
Any estimate = [1/2^(k-1)] × [sum of 2^k terms, all of them means based on samples of size n]
σ²(any estimate) = [1/2^(2k-2)] × [2^k σ²/n] = σ²/(n 2^(k-2))
The larger the replication per treatment, the smaller the error of each estimate.
So, the error of an estimate depends on k (the number of factors studied) and n (the replication per treatment). It also (obviously) depends on σ². The variance σ² can be reduced by holding some of the non-studied factors constant. But, as has been noted, this gain is offset by reduced generality of any conclusions.
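The dependence on k and n can be tabulated with a short sketch. The function names below are hypothetical; the second function rebuilds the same quantity term by term from the 2^k contrast coefficients, assuming (as the derivation above does) that the treatment means are independent with common variance σ²/n.

```python
def effect_variance(sigma2, k, n=1):
    """Variance of any effect estimate in a 2^k design with n replicates per treatment."""
    return sigma2 / (n * 2 ** (k - 2))


def effect_variance_from_contrast(sigma2, k, n=1):
    """Same quantity built from the 2^k contrast coefficients of size 1/2^(k-1)."""
    coeff = 1 / 2 ** (k - 1)   # each treatment mean enters with weight +coeff or -coeff
    var_of_mean = sigma2 / n   # variance of one treatment mean
    return (2 ** k) * coeff ** 2 * var_of_mean


# Variance of an effect estimate in units of sigma^2, for several k and n
for k in range(2, 6):
    row = ", ".join(f"n={n}: {effect_variance(1.0, k, n):.4f}" for n in (1, 2, 4))
    print(f"k={k}: {row}")
```

Each additional factor halves the variance of an estimate, and so does each doubling of the replication.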