Complex Experimental Design and Simple Data Analysis: A Pharmaceutical Example

Joseph G Pigeon, Villanova University

Introduction • Designs with restricted randomization have multiple error measures • Pharmaceutical example where the split plot structure is even more complex • Whole plot structure in two dimensions • Correlation structure in two dimensions • Caveats • Limited understanding of the biology involved • No originality of statistical methods claimed

Split Plot Designs • Originated in agricultural experiments where • Levels of some factors are applied to whole plots • Levels of other factors are applied to sub plots • Separate randomizations to whole plots and sub plots • Two types of experimental units • Two types of error measures • Correlation among the observations

Split Plot Designs • Also common in industrial experiments when • Complete randomization does not occur • Some factor levels may be impractical, inconvenient or too costly to change • This restriction on randomization results in some whole plot factors and some sub plot factors • Data analysis needs to account for this restricted randomization or split plot structure

Split Plot Example • Consider a paper manufacturer who wants to study • Effects of 3 pulp preparation methods • Effects of 4 temperatures • Response is tensile strength • Pilot plant is capable of 12 runs per day • One replicate on each of three days

Split Plot Example • Initially, we might consider this to be a 4 x 3 factorial in a randomized block design • If true, then the order of experimentation within a block should have been completely randomized • However, this was not feasible; data were not collected this way

Split Plot Example • Experiment was conducted as follows: • A batch of pulp was produced by one of the three methods • The batch was divided into four samples • Each sample was cooked at one of the four temperatures • Split plot design with • Pulp preparation method as whole plot treatment • Temperature as sub (split) plot treatment

Split Plot Example • Subplot error is less than whole plot error (typical)

Split Plot Example Lessons • We must carefully consider how the data were collected and incorporate all randomization restrictions into the analysis • Whole plot effects measured against whole plot error • Sub plot effects measured against sub plot error
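The two error measures can be made explicit with the usual textbook linear model for a split plot run in replicates. The notation is an assumption introduced here for illustration and does not appear in the slides: i indexes the three days (replicates), j the three pulp preparation methods, and k the four temperatures.

```latex
y_{ijk} = \mu + \rho_i + \alpha_j + \delta_{ij} + \beta_k + (\alpha\beta)_{jk} + \varepsilon_{ijk},
\qquad \delta_{ij} \sim N(0, \sigma_W^2), \quad \varepsilon_{ijk} \sim N(0, \sigma_S^2)
```

Under this sketch the method effect \alpha_j is tested against the whole plot error \delta_{ij} (day × method, 4 degrees of freedom), while temperature \beta_k and the method × temperature interaction are tested against the sub plot error \varepsilon_{ijk} (18 degrees of freedom). Observations from the same batch share \delta_{ij}, which is the source of the correlation among them.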

Description of Example – MQPA Assay • Multivalent Q-PCR based Potency Assay • Used to assign potencies (independently) to each of five reassortants of a pentavalent vaccine • Relies on the quantitation of viral nucleic acid generated in 24 hours • Two major components • Biological component (infection of the standard and sample viruses) • Biochemical component (quantitative PCR reaction where PCR = Polymerase Chain Reaction)

Polymerase Chain Reaction (PCR)

Description of Example – Biological Component • Vero cell maintenance and set up • Serial dilutions of known standard and unknown sample are incubated with trypsin • Dilutions are used to infect 4 replicate wells of Vero cell monolayers seeded in a 96 well plate • Infection proceeds for 24 hours and is then halted with the addition of a detergent and storage at –70 °C

Description of Example – Biochemical Component • Lysate is thawed and diluted • Preparation of a “master mix” • Preparation of Q-PCR plate (master mix + diluted lysates) • Configuration of the Q-PCR detection system • Potency is determined by parallel line analysis of standard and test samples • Specific interest is in optimization of the PCR portion of the assay

PCR Optimization Design • Discussions with Biologists identified 13 factors • 8 factors associated with preparation of master mix • 5 factors associated with configuration of PCR detection system (instrument) • Discussions with Biologists identified 3 responses • Lowest cycle time (range: 1 – 40) • Least variability between replicates • Valid amplification plot (range: 0 – 4) • Completion of experiments and analysis immediately!

PCR Optimization Design Considerations • Interactions not expected to exist • Experiments performed in a 96 well plate • Each plate can accommodate at most 15 master mix combinations • 12 run PB design for 8 factors

PCR Optimization Design Considerations • Time constraints imply at most 16 plates (instrument settings) • 2^(5-1) fractional factorial for 5 factors (5 = 1234) • Concern about using only 12 of 2^8 combinations • Half of the plates use a 12 run PB design (123 = 45 = +1) • Half of the plates use the foldover PB design (123 = 45 = –1)
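The plate-level design can be sketched in a few lines. The block below is an illustrative construction, not the author's code: it builds the 16 instrument settings from the generator 5 = 1234 and splits the plates by the sign of 123. The function and variable names are hypothetical.

```python
from itertools import product

def half_fraction():
    """Return the 16 runs of a 2^(5-1) design with generator 5 = 1234."""
    runs = []
    for x1, x2, x3, x4 in product((-1, 1), repeat=4):
        x5 = x1 * x2 * x3 * x4  # fifth factor is set by the generator
        runs.append((x1, x2, x3, x4, x5))
    return runs

plates = half_fraction()

# Since 12345 = +1 on every run, the products 123 and 45 always share a sign,
# so the sign of 123 alone splits the plates into two groups of eight.
pb_plates = [r for r in plates if r[0] * r[1] * r[2] == 1]
foldover_plates = [r for r in plates if r[0] * r[1] * r[2] == -1]
```

Splitting on 123 (equivalently 45) gives eight plates for the PB design and eight for its foldover, which matches the "half of the plates" description above.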

Plackett-Burman Design
Factors: 8   Replicates: 1   Design: 12   Runs: 12   Center pts (total): 0

Data Matrix (randomized)

Run  A  B  C  D  E  F  G  H
  1  -  +  +  +  -  +  +  -
  2  +  +  -  +  -  -  -  +
  3  +  -  +  -  -  -  +  +
  4  -  +  +  -  +  -  -  -
  5  +  +  -  +  +  -  +  -
  6  +  -  +  +  -  +  -  -
  7  -  +  -  -  -  +  +  +
  8  +  -  -  -  +  +  +  -
  9  -  -  +  +  +  -  +  +
 10  -  -  -  -  -  -  -  -
 11  +  +  +  -  +  +  -  +
 12  -  -  -  +  +  +  -  +
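A quick check of the properties that make this design easy to analyze: every column is balanced and every pair of columns is orthogonal. The sketch below copies the sign matrix from the table above and also builds the foldover by reversing all signs. The helper names are hypothetical, and the remark about two-factor interactions is a standard property of foldover designs rather than a claim made in the slides.

```python
from itertools import combinations

# Rows 1-12 of the Plackett-Burman data matrix, columns A-H.
PB_ROWS = [
    "-+++-++-", "++-+---+", "+-+---++", "-++-+---",
    "++-++-+-", "+-++-+--", "-+---+++", "+---+++-",
    "--+++-++", "--------", "+++-++-+", "---+++-+",
]
pb = [[1 if c == "+" else -1 for c in row] for row in PB_ROWS]

def is_orthogonal(design):
    """True if every column sums to zero and every column pair has zero dot product."""
    k = len(design[0])
    balanced = all(sum(run[i] for run in design) == 0 for i in range(k))
    uncorrelated = all(
        sum(run[i] * run[j] for run in design) == 0
        for i, j in combinations(range(k), 2)
    )
    return balanced and uncorrelated

# The foldover reverses every sign of the original design.
foldover = [[-x for x in run] for run in pb]
combined = pb + foldover
master_mixes = {tuple(run) for run in combined}

# In the combined 24 runs every main effect column is orthogonal to every
# two-factor interaction column, because each triple product cancels between
# a run and its mirror image.
main_clear_of_2fi = all(
    sum(run[i] * run[j] * run[k] for run in combined) == 0
    for i, j, k in combinations(range(8), 3)
)
```

The 12 original runs and their 12 mirror images are all distinct, so the experiment uses 24 different master mix combinations.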

Half Fraction Design
Factors: 5   Base Design: 5, 16   Resolution: V
Runs: 16   Replicates: 1   Fraction: 1/2
Blocks: none   Center pts (total): 0
Design Generators: E = ABCD

Row  StdOrder  RunOrder   A   B   C   D   E
  1         1         7  -1  -1  -1  -1   1
  2         2         8   1  -1  -1  -1  -1
  3         3         3  -1   1  -1  -1  -1
  4         4        15   1   1  -1  -1   1
  5         5        13  -1  -1   1  -1  -1
  6         6         9   1  -1   1  -1   1
  7         7        10  -1   1   1  -1   1
  8         8         6   1   1   1  -1  -1
  9         9        16  -1  -1  -1   1  -1
 10        10         2   1  -1  -1   1   1
 11        11         4  -1   1  -1   1   1
 12        12        12   1   1  -1   1  -1
 13        13         5  -1  -1   1   1   1
 14        14        11   1  -1   1   1  -1
 15        15        14  -1   1   1   1  -1
 16        16         1   1   1   1   1   1

PCR Optimization Design Layout • Each cell of the layout represents a 12 run PB design • 16 × 12 = 192 observations

PCR Optimization Design Layout • Whole plot structure in two dimensions

PCR Optimization Results • Biologists provided this summary of the 21 runs with an amplification plot rating of 4

PCR Optimization Results

Tallies (N = 21 for each variable)

plate  Count      mm  Count
    3      3       5      2
    4      4       6      3
    5      1       8      5
    7      2       9      3
   10      1      14      2
   11      2      19      5
   12      3      22      1
   14      1
   15      3
   16      1

Factor  Count at -1  Count at 1
mm1              11          10
mm2              16           5
mm3               6          15
mm4              16           5
mm5               7          14
mm6               9          12
mm7               0          21
mm8              14           7
instr1           12           9
instr2           10          11
instr3            8          13
instr4           19           2
instr5           13           8

PCR Optimization Analysis Log • mm7 = 1; instr4 = –1

PCR Optimization Results

Tallies (N = 63 for each variable)

plate  Count      mm  Count
    1      4       1      6
    2      4       2      6
    4      3       3      3
    5      5       4      4
    6      6       7      7
    7      5      11      5
    8      8      13      5
    9      4      15      2
   10      6      16      6
   11      3      17      5
   12      1      18      3
   13      3      20      2
   14      5      21      4
   15      3      22      3
   16      3      23      2

Factor  Count at -1  Count at 1
mm1              31          32
mm2              26          37
mm3              47          16
mm4              28          35
mm5              30          33
mm6              31          32
mm7              41          22
mm8              21          42
instr1           34          29
instr2           31          32
instr3           42          21
instr4           26          37
instr5           29          34

PCR Optimization Analysis Log • mm7 = 1; instr4 = –1 • mm3 = 1; mm7 = 1; mm8 = –1; instr3 = 1

PCR Optimization Results

Fractional Factorial Fit: ctgm
Estimated Effects and Coefficients for ctgm (coded units)

Term            Effect    Coef  SE Coef      T      P
Constant                33.919   0.3852  88.06  0.000
instr1          -1.264  -0.632   0.3852  -1.64  0.103
instr2           0.596   0.298   0.3852   0.77  0.440
instr3          -2.157  -1.078   0.3852  -2.80  0.006
instr4           1.152   0.576   0.3852   1.50  0.137
instr5           0.667   0.333   0.3852   0.87  0.388
instr1*instr2    0.892   0.446   0.3852   1.16  0.249
instr1*instr3    0.424   0.212   0.3852   0.55  0.582
instr1*instr4   -0.221  -0.110   0.3852  -0.29  0.775
instr1*instr5   -0.276  -0.138   0.3852  -0.36  0.721
instr2*instr3   -1.110  -0.555   0.3852  -1.44  0.151
instr2*instr4    0.240   0.120   0.3852   0.31  0.756
instr2*instr5    1.522   0.761   0.3852   1.98  0.050
instr3*instr4    0.484   0.242   0.3852   0.63  0.531
instr3*instr5    0.182   0.091   0.3852   0.24  0.814
instr4*instr5    0.027   0.014   0.3852   0.04  0.972

PCR Optimization Analysis Log • mm7 = 1; instr4 = –1 • mm3 = 1; mm7 = 1; mm8 = –1; instr3 = 1 • instr3 = 1; instr2 and instr5 should have opposite signs?

PCR Optimization Results

Estimated Effects and Coefficients for ctgm (coded units)

Term      Effect    Coef  SE Coef       T      P
Constant          33.947   0.3206  105.90  0.000
mm1       -0.304  -0.152   0.3206   -0.47  0.636
mm2        0.699   0.350   0.3206    1.09  0.277
mm3       -4.070  -2.035   0.3206   -6.35  0.000
mm4        0.222   0.111   0.3206    0.35  0.730
mm5       -0.341  -0.171   0.3206   -0.53  0.595
mm6       -0.027  -0.013   0.3206   -0.04  0.967
mm7       -4.525  -2.263   0.3207   -7.06  0.000
mm8        2.061   1.030   0.3206    3.21  0.002
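The columns of this output are tied together by simple arithmetic: in coded units the coefficient is half the effect, and T is the coefficient divided by its standard error. The sketch below checks those relationships on the master mix fit, using values copied from the table. The tolerances are assumptions chosen to absorb rounding in the printed output.

```python
# (term, effect, coef, se_coef, t) copied from the master mix fit for ctgm.
MM_FIT = [
    ("mm1", -0.304, -0.152, 0.3206, -0.47),
    ("mm2", 0.699, 0.350, 0.3206, 1.09),
    ("mm3", -4.070, -2.035, 0.3206, -6.35),
    ("mm4", 0.222, 0.111, 0.3206, 0.35),
    ("mm5", -0.341, -0.171, 0.3206, -0.53),
    ("mm6", -0.027, -0.013, 0.3206, -0.04),
    ("mm7", -4.525, -2.263, 0.3207, -7.06),
    ("mm8", 2.061, 1.030, 0.3206, 3.21),
]

def row_is_consistent(effect, coef, se, t, coef_tol=1e-3, t_tol=1e-2):
    """Check coef = effect / 2 and t = coef / se, up to print rounding."""
    return abs(coef - effect / 2) < coef_tol and abs(t - coef / se) < t_tol

consistent = {
    term: row_is_consistent(effect, coef, se, t)
    for term, effect, coef, se, t in MM_FIT
}

# Rank the terms by |T| to see which master mix factors stand out.
ranked = sorted(MM_FIT, key=lambda row: abs(row[4]), reverse=True)
top_three = [row[0] for row in ranked[:3]]
```

Ranking by |T| puts mm7, mm3 and mm8 well ahead of the rest. Negative effects are desirable here because the goal is the lowest cycle time, so mm3 and mm7 are set to 1 and mm8 to –1.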

PCR Optimization Analysis Log • mm7 = 1; instr4 = –1 • mm3 = 1; mm7 = 1; mm8 = –1; instr3 = 1 • instr3 = 1; instr2 and instr5 should have opposite signs? • mm3 = 1; mm7 = 1; mm8 = –1

PCR Optimization Results

Row  plate  mm    ct1    ct2    ct3    ct4   ctgm  well1  well2  amprating
  1      3  14  26.88  27.33  27.25  27.13  27.15  37.98     40          4
  2      3  19  27.62  28.10  28.02  27.40  27.78  40.00     40          4
  3      4   5  29.20  29.04  29.39  28.70  29.08  40.00     40          4
  4     11  14  27.53  26.97  28.04  27.90  27.61  40.00     40          4
  5     11  19  28.25  28.57  28.64  28.09  28.39  40.00     40          4
  6     12   5  28.13  28.93  28.39  28.51  28.49  40.00     40          3

Row  mm1  mm2  mm3  mm4  mm5  mm6  mm7  mm8  instr1  instr2  instr3  instr4  instr5
  1    1    1    1   -1   -1   -1    1   -1       1       1       1      -1      -1
  2   -1   -1    1   -1    1    1    1   -1       1       1       1      -1      -1
  3   -1    1    1    1   -1    1    1   -1      -1      -1       1      -1      -1
  4    1    1    1   -1   -1   -1    1   -1      -1       1       1      -1       1
  5   -1   -1    1   -1    1    1    1   -1      -1       1       1      -1       1
  6   -1    1    1    1   -1    1    1   -1       1      -1       1      -1       1
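The same tallying idea can be applied to this listing of six runs. The sketch below finds the factors that sit at a single level in all six rows. It also compares ctgm with the geometric mean of ct1 to ct4; reading ctgm as a geometric mean is an assumption inferred from the column name, not something stated in the slides. All names in the code are hypothetical.

```python
import math

FACTORS = ["mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7", "mm8",
           "instr1", "instr2", "instr3", "instr4", "instr5"]

# For each listed run: (ct1..ct4), listed ctgm, factor levels in FACTORS order.
RUNS = [
    ((26.88, 27.33, 27.25, 27.13), 27.15, (1, 1, 1, -1, -1, -1, 1, -1, 1, 1, 1, -1, -1)),
    ((27.62, 28.10, 28.02, 27.40), 27.78, (-1, -1, 1, -1, 1, 1, 1, -1, 1, 1, 1, -1, -1)),
    ((29.20, 29.04, 29.39, 28.70), 29.08, (-1, 1, 1, 1, -1, 1, 1, -1, -1, -1, 1, -1, -1)),
    ((27.53, 26.97, 28.04, 27.90), 27.61, (1, 1, 1, -1, -1, -1, 1, -1, -1, 1, 1, -1, 1)),
    ((28.25, 28.57, 28.64, 28.09), 28.39, (-1, -1, 1, -1, 1, 1, 1, -1, -1, 1, 1, -1, 1)),
    ((28.13, 28.93, 28.39, 28.51), 28.49, (-1, 1, 1, 1, -1, 1, 1, -1, 1, -1, 1, -1, 1)),
]

def geometric_mean(values):
    """Geometric mean computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Factors held at one level in every listed run, with that level.
constant_factors = {
    name: RUNS[0][2][i]
    for i, name in enumerate(FACTORS)
    if len({levels[i] for _, _, levels in RUNS}) == 1
}

# Gap between the recomputed geometric mean and the listed ctgm for each run.
gm_gaps = [abs(geometric_mean(cts) - ctgm) for cts, ctgm, _ in RUNS]
```

The factors that come out constant are mm3, mm7, mm8, instr3 and instr4, at the levels recorded in the analysis log. The listed ctgm values agree with the recomputed geometric means to within rounding.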

PCR Optimization Summary • No complex models – all simple analyses • 5 factors were found to be significant (mm3, mm7, mm8, instr3 and instr4) • These factors were further studied using response surface experiments • Scientists seem quite happy with the results of the PCR optimization experiments

Concluding Remarks • Many industrial experiments have a split or strip plot structure, which means multiple and possibly complex error measures • This structure arises from the conduct of an experiment and/or any restrictions on the randomization • We need to incorporate these considerations into a proper analysis and interpretation of experimental data

Concluding Remarks • Experimental designs with balance, symmetry and orthogonality permit simple but effective graphical analyses (even with some missing data) • Much can be learned from simple analyses following suitable experimental design • All models are wrong, but some models are useful • All models are wrong, but some models are more wrong than others