NATIONAL VETERINARY SCHOOL TOULOUSE
Statistics in bioequivalence
Didier Concordet, d.concordet@envt.fr
May 4-5 2004
Statistics in bioequivalence
• Parametric or non-parametric?
• Transformation of parameters
• Experimental design: parallel and crossover
• Confidence intervals and bioequivalence
• Sample size in bioequivalence trials
Parametric?
A statistical property of the distribution of the data: all data are drawn from a distribution that can be completely described by a finite number of parameters (see sufficiency).
Example: the ln AUC obtained in a dog for a formulation is a value drawn from a N(m, s²) distribution. The parameters m and s² define the distribution of the AUC (of its ln) that can be observed in this dog.
Non-parametric?
The distribution of the data is not defined by a finite number of parameters. It is defined by its shape, number of modes, regularity... The number of parameters used to estimate the distribution from n data increases with n.
In practice: these distributions have no specific name. The goal of a statistical study is often to show that some distributions are (or are not) different. It suffices to show that a parameter that contributes to the description of the distribution (e.g. the median) is not the same for the compared distributions.
Parametric: normality
Usually, the data are assumed to be drawn from a Gaussian distribution, or from a mixture of Gaussian distributions, up to a monotone transformation.
Example: the ln AUC obtained in a dog for a formulation is drawn from a N(3.5, 0.5²) distribution. The monotone transformation is the logarithm. The ln AUC obtained in another dog for the same formulation is drawn from a N(3.7, 0.5²) distribution. The distribution of the data that can be observed in these 2 dogs is a mixture of the N(3.5, 0.5²) and N(3.7, 0.5²) distributions.
Parametric methods
• Methods designed to analyse data from parametric distributions
• Standard methods rely on 3 assumptions (detailed later):
  - homoscedasticity
  - independence
  - normality
In practice, for bioequivalence studies: AUC and CMAX are analysed with parametric methods.
Non-parametric methods
• Used when parametric methods cannot be used (e.g. heteroscedasticity)
• Usually less powerful than their parametric counterparts (it is more difficult to show bioequivalence when it holds)
• Rely on assumptions about the shape, number of modes, regularity...
In practice, for bioequivalence studies: the distribution of (ln) TMAX is assumed to be symmetrical.
may 4-5 2004 Statistics in bioequivalence Parametric or non-parametric ? Transformation of parameters Experimental design : parallel and crossover Confidence intervals and bioequivalence Sample size in bioequivalence trials
Transformations of parameters
[Flowchart: Data → do the assumptions hold? If yes: parametric methods. If no: transformation, then check the assumptions again. If yes: parametric methods. If no: non-parametric methods.]
Three fundamental assumptions
• Homoscedasticity: the variance of the dependent variable is constant; it does not vary with the independent variables (formulation, animal, period).
• Independence: the random variables involved in the analysis are independent.
• Normality: the random variables involved in the analysis are normally distributed.
Fundamental assumptions: homoscedasticity
Homoscedasticity: the variance of the dependent variable is constant; it does not vary with the independent variables (formulation, animal, period).
Example: parallel group design, 2 groups, 10 dogs per group. Group 1: Reference; Group 2: Test.
[Figure: AUC and ln AUC for the Ref and Test groups]
Fundamental assumptions: homoscedasticity
• Probably the most important assumption
• Analysis of variance is not robust to heteroscedasticity
• More or less easy to check in practice:
  - graphical inspection of the data (residuals)
  - multiple comparisons of variances (Cochran, Bartlett, Hartley...); these tests are not very powerful
• Crucial for the bioequivalence problem: the width of the confidence interval depends mainly on the quality of the variance estimate.
Fundamental assumptions: independence
• Independence (important): the random variables involved in the analysis are independent.
• In a parallel group design: the observations obtained on different animals are independent.
• In a cross-over design:
  - the animals are independent;
  - the differences between the observations obtained in each animal with the different formulations are independent.
In practice: difficult to check; it has to be assumed.
Fundamental assumptions: normality
• Normality: the random variables involved in the analysis are normally distributed.
• In a parallel group design: the observations for each formulation come from a Gaussian distribution.
• In a cross-over design:
  - the animal effect is assumed to be Gaussian (we are working on a sample of animals);
  - the observations obtained in each animal for each formulation are assumed to be normally distributed.
Fundamental assumptions: normality
• Not important in practice:
  - when the sample size is large enough, the central limit theorem protects us;
  - when the sample size is small, the tests used to detect non-normality are not powerful (they fail to detect it).
• The analysis of variance is robust to non-normality
• Difficult to check:
  - graphical inspection of the residuals: probability plot (P-plot)
  - Kolmogorov-Smirnov test, chi-square test...
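The graphical inspection mentioned above can be sketched as follows. This is only an illustration: the residuals are hypothetical, the function name is ours, and the plotting positions (i - 0.5)/n are one common convention among several. Points lying close to a straight line suggest that normality is a reasonable assumption.

```python
from statistics import NormalDist

def probability_plot_points(residuals):
    """Return (theoretical normal quantile, ordered residual) pairs."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, SD 1
    # Plotting positions (i - 0.5) / n: an assumed, commonly used convention.
    return [(std_normal.inv_cdf((i - 0.5) / n), r)
            for i, r in enumerate(ordered, start=1)]

# Hypothetical residuals from an analysis of variance on ln AUC.
residuals = [-0.62, -0.41, -0.25, -0.10, -0.03, 0.04, 0.12, 0.27, 0.39, 0.59]
points = probability_plot_points(residuals)
for quantile, residual in points:
    print(f"{quantile:7.3f}  {residual:6.2f}")
```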
In practice for bioequivalence
Log transformation:
• AUC: to stabilise the variance and to obtain a symmetric distribution
• CMAX: to stabilise the variance and to obtain a symmetric distribution
• TMAX (sometimes): to obtain a symmetric distribution; heteroscedasticity usually remains
Without transformation:
• TMAX (sometimes); usually heteroscedastic
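A minimal sketch of why the log transformation can stabilise the variance. The AUC values are hypothetical (not the data of the slides): the Test group is deliberately built as 3 times the Ref group, so that the standard deviation is proportional to the mean. The helper `variance_ratio` is ours.

```python
import math
import statistics

def variance_ratio(a, b):
    """Ratio of the larger sample variance to the smaller one (always >= 1)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)

# Hypothetical AUC values; Test = 3 x Ref (purely multiplicative difference).
ref = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
test = [3.0 * x for x in ref]

ratio_raw = variance_ratio(ref, test)
ratio_log = variance_ratio([math.log(x) for x in ref],
                           [math.log(x) for x in test])

print(f"variance ratio on AUC:    {ratio_raw:.3f}")
print(f"variance ratio on ln AUC: {ratio_log:.3f}")
```

On the raw scale the variances differ by the square of the multiplicative factor; on the ln scale the factor becomes an additive shift, which leaves the variance unchanged.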
The ln transformation: side effect
If ln X ~ N(m, s²), then m is the population mean of ln X and exp(m) is the population median of X.
After a logarithmic transformation, bioequivalence methods compare the medians (not the means) of the parameters obtained with each formulation.
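A small simulation illustrating this side effect. It is a sketch under assumptions: the values m = 3.5 and s = 0.5 are borrowed from the earlier ln AUC example, and the sample size and seed are arbitrary. Back-transforming the mean of ln AUC estimates the median of AUC, while the mean of a lognormal variable is exp(m + s²/2), which is larger.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so that the sketch is reproducible

m, s = 3.5, 0.5  # assumed mean and SD of ln AUC (illustrative values)
ln_auc = [random.gauss(m, s) for _ in range(100_000)]
auc = [math.exp(v) for v in ln_auc]

back_transformed = math.exp(statistics.fmean(ln_auc))  # geometric mean of AUC
sample_median = statistics.median(auc)
sample_mean = statistics.fmean(auc)

print(f"exp(mean of ln AUC): {back_transformed:.2f}")
print(f"median of AUC:       {sample_median:.2f}  (theory: exp(m) = {math.exp(m):.2f})")
print(f"mean of AUC:         {sample_mean:.2f}  (theory: exp(m + s^2/2) = {math.exp(m + s * s / 2):.2f})")
```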
may 4-5 2004 Statistics in bioequivalence Parametric or non-parametric ? Transformation of parameters Experimental design : parallel and crossover Confidence intervals and bioequivalence Sample size in bioequivalence trials
Parallel and cross-over designs
[Figure: a parallel design (one group receives Test, the other Ref.) next to a 2×2 cross-over design (sequences 1 and 2 by periods 1 and 2)]
Parallel vs cross-over design
Parallel
• Advantages: easy to organise, easy to analyse, easy to interpret
• Drawbacks: the comparison is carried out between animals, so it is not very powerful
Cross-over
• Advantages: the comparison is carried out within animals, so it is powerful
• Drawbacks: difficult to organise, possible unequal carry-over, difficult to analyse
Analysis of parallel and cross-over designs: why?
• To check whether or not the assumptions (especially homoscedasticity) hold
• To check that there is no carry-over (cross-over design)
• To obtain a good estimate of:
  - the mean of each formulation
  - the variance of interest: between subjects for the parallel design, within subject for the cross-over
• To assess bioequivalence with a Student t-test or Fisher test: NO
Why?
Classical hypotheses for the Student t-test and the Fisher test (ANOVA), where µT and µR are the population means for the test and reference formulations respectively:
H0: µT = µR
H1: µT ≠ µR
Hypotheses for the bioequivalence test:
H0: |µT - µR| > Δ (bioinequivalence)
H1: |µT - µR| ≤ Δ (bioequivalence)
Analysis of parallel designs
Step 1: Check (at least graphically) homoscedasticity. Is a transformation needed?
Step 2: Estimate the mean for each formulation and the between-subject variance.
Example
[Figure: data for the Test and Ref groups]
Variances comparison: P = 0.026. Heteroscedasticity.
Example on log-transformed data
[Figure: log-transformed data for the Test and Ref groups]
Variances comparison: P = 0.66. Homoscedasticity.
Example: pooled variance
[Figure: Test and Ref groups with the pooled variance]
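The pooled variance of a two-group parallel design can be sketched as below. The ln AUC values are hypothetical (the data behind the slide are not available here), and the function name is ours. The formula is the usual one: each group variance is weighted by its degrees of freedom, which assumes homoscedasticity.

```python
import math
import statistics

def pooled_variance(group_a, group_b):
    """Pooled variance of two independent groups (assumes equal variances)."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    return ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)

# Hypothetical ln AUC values, 10 dogs per group (not the data of the slides).
ln_auc_test = [3.1, 3.4, 3.6, 3.8, 3.9, 4.0, 4.1, 4.3, 4.5, 4.8]
ln_auc_ref = [3.2, 3.5, 3.6, 3.9, 4.0, 4.1, 4.2, 4.4, 4.6, 4.7]

s2 = pooled_variance(ln_auc_test, ln_auc_ref)
# Standard error of the difference between the two formulation means.
se_diff = math.sqrt(s2 * (1 / len(ln_auc_test) + 1 / len(ln_auc_ref)))
print(f"pooled variance = {s2:.4f}, SE of the difference = {se_diff:.4f}")
```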
Another way to proceed: ANOVA
Write an ANOVA model to analyse the data: useless here, but useful to understand the cross-over.
Notations: Yij = ln AUC for the jth animal that received formulation i; formulation 1 = Test, formulation 2 = Ref; i = 1..2 ; j = 1..10
Yij = µ + Fi + eij (for example, y11 = 4.37)
µ = population mean
Fi = effect of the ith formulation
eij = independent random effects assumed to be drawn from N(0, s²)
Another way to proceed: ANOVA
Note: this output does not give any information about bioequivalence.
Effects coding used for categorical variables in model.
Categorical values encountered during processing are:
FORMUL$ (2 levels): Ref, Test
Dep Var: LN_AUC   N: 20   Multiple R: 0.095661810   Squared multiple R: 0.009151182
Analysis of Variance
Source     Sum-of-Squares   df   Mean-Square   F-ratio       P
FORMUL$    0.062709687       1   0.062709687   0.166242589   0.688281535
Error      6.789922946      18   0.377217941
Least squares means
                 LS Mean       SE            N
FORMUL$ = Ref    3.936724342   0.194220993   10
FORMUL$ = Test   3.824733550   0.194220993   10
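As a check on how the entries of this table relate to each other, the short sketch below recomputes the derived quantities from the sums of squares and least squares means printed above. The variable names are ours; only the numbers come from the table.

```python
import math

# Values copied from the ANOVA table of the parallel design.
ss_formul, df_formul = 0.062709687, 1
ss_error, df_error = 6.789922946, 18
n_per_group = 10
ls_mean_ref, ls_mean_test = 3.936724342, 3.824733550

ms_formul = ss_formul / df_formul
ms_error = ss_error / df_error                   # pooled between-subject variance
f_ratio = ms_formul / ms_error
r_squared = ss_formul / (ss_formul + ss_error)
se_ls_mean = math.sqrt(ms_error / n_per_group)   # SE of each formulation mean

# With two groups, the F-ratio is the square of the two-sample t statistic.
diff = ls_mean_test - ls_mean_ref
se_diff = math.sqrt(2 * ms_error / n_per_group)
t_stat = diff / se_diff

print(f"MS error = {ms_error:.9f}, F = {f_ratio:.9f}, R^2 = {r_squared:.9f}")
print(f"SE of LS mean = {se_ls_mean:.9f}, t^2 = {t_stat ** 2:.9f}")
```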
Analysis of cross-over designs
Difficult to analyse by hand, especially when the experimental design is unbalanced. A model is needed to analyse the data.
• Step 1: Write the model to analyse the cross-over
• Step 2: Check (at least graphically) homoscedasticity. Is a transformation needed?
• Step 3: Check the absence of a carry-over effect
• Step 4: Estimate the mean for each formulation and the within (intra) subject variance.
A model for the 2×2 crossover design
Notations: formulation 1 = Test, formulation 2 = Ref
Sequence 1: Test then Ref; Sequence 2: Ref then Test
AUCi,j,k(i,j),l = AUC for the lth animal of sequence j when it received formulation i at period k(i,j)
i = 1..2 ; j = 1..2 ; k(1,1) = 1 ; k(1,2) = 2 ; k(2,1) = 2 ; k(2,2) = 1 ; l = 1..10
[Figure: AUC by formulation for each sequence]
A model for the 2×2 crossover design
Yi,j,k(i,j),l = µ + Fi + Sj + Pk(i,j) + Anl|Sj + ei,j,k,l (for example, Y1,1,1,1 = 78.8)
µ = population mean
Fi = effect of the ith formulation
Sj = effect of the jth sequence
Pk(i,j) = effect of the kth period
Anl|Sj = random effect of the lth animal of sequence j; these effects are assumed independent and normally distributed with mean 0 and a common (between-animal) variance
ei,j,k,l = independent random effects assumed to be drawn from N(0, s²)
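To make the indexing of this model concrete, here is a minimal simulation sketch. All numerical values (effects and standard deviations) are illustrative assumptions, and the function names are ours; only the structure (formulation, sequence, period k(i,j), animal nested in sequence, residual) follows the notations above.

```python
import random

def period(i, j):
    """k(i, j): period at which formulation i is given in sequence j."""
    return 1 if i == j else 2

def simulate_crossover(mu, F, S, P, sd_animal, sd_resid, n_per_seq=10, seed=0):
    """Draw one data set from Y = mu + F_i + S_j + P_k(i,j) + An_l|S_j + e."""
    rng = random.Random(seed)
    rows = []
    for j in (1, 2):                             # sequence
        for l in range(1, n_per_seq + 1):        # animal within the sequence
            animal = rng.gauss(0.0, sd_animal)   # random animal effect
            for i in (1, 2):                     # 1 = Test, 2 = Ref
                k = period(i, j)
                y = mu + F[i] + S[j] + P[k] + animal + rng.gauss(0.0, sd_resid)
                rows.append((i, j, k, l, y))
    return rows

# All numerical values below are illustrative assumptions.
data = simulate_crossover(mu=4.0, F={1: -0.1, 2: 0.1}, S={1: 0.0, 2: 0.0},
                          P={1: 0.05, 2: -0.05}, sd_animal=0.8, sd_resid=0.2)
print(len(data), "observations; first row:", data[0])
```

Each animal contributes two observations that share the same animal effect, which is why the comparison of formulations is made within animals.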
Homoscedasticity?
Anl|Sj: assumed independent and normally distributed with mean 0 and a common variance. In particular: Var(An|S1) = Var(An|S2)
[Figure: average AUC per animal, Sequence 1 vs Sequence 2]
Comparison of interindividual variances: P = 0.038
Usually this test is not powerful.
Homoscedasticity?
ei,j,k,l = independent random effects assumed to be drawn from N(0, s²)
After a ln transformation...
[Figure: ln AUC by formulation for Sequence 1 (Test then Ref) and Sequence 2 (Ref then Test)]
Comparison of interindividual variances (Seq. 1 vs Seq. 2): P = 0.137
Homoscedasticity seems reasonable.
ANOVA table
Note: this output does not give any information about bioequivalence.
Effects coding used for categorical variables in model.
Categorical values encountered during processing are:
FORMUL$ (2 levels): Ref, Test
PERIOD (2 levels): 1, 2
SEQUENCE (2 levels): 1, 2
ANIMAL (20 levels): 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
Dep Var: LN_AUC   N: 40   Multiple R: 0.978999514   Squared multiple R: 0.958440048
Analysis of Variance
Source             Sum-of-Squares   df   Mean-Square   F-ratio       P
FORMUL$            3.077972446       1   3.077972446   6.06297E+01   0.000000526
PERIOD             0.293816162       1   0.293816162   5.787574765   0.027801823
SEQUENCE           1.987295663       1   1.987295663   3.91456E+01   0.000008686
ANIMAL(SEQUENCE)   1.34946E+01      18   0.749700010   1.47676E+01   0.000000479
Error              0.863034164      18   0.050766716
Least squares means
                 LS Mean       SE            N
FORMUL$ = Ref    4.077673811   0.050381899   20
FORMUL$ = Test   3.507676363   0.053107185   20
Period effect
The period effect is significant:
• It does not invalidate a crossover design
• It affects the 2 formulations in the same way
• Origin: environment, equal carry-over
Sequence effect: carryover effect
Is the differential carryover effect significant?
• For all statistical software, the only random variables of a model are the residuals e
• The ANOVA table is built assuming that all other effects are fixed
However, we are working on a sample of animals: the animal effects are independent random variables.
Testing the carryover effect
The test for the carryover (sequence) effect has to be corrected.
Analysis of Variance
Source             Sum-of-Squares   df   Mean-Square   F-ratio       P
FORMUL$            3.077972446       1   3.077972446   6.06297E+01   0.000000526
PERIOD             0.293816162       1   0.293816162   5.787574765   0.027801823
SEQUENCE           1.987295663       1   1.987295663   3.91456E+01   0.000008686
ANIMAL(SEQUENCE)   1.34946E+01      18   0.749700010   1.47676E+01   0.000000479
Error              0.863034164      18   0.050766716
Test for effect called: SEQUENCE
Test of Hypothesis
Source       SS            df   MS            F             P
Hypothesis   1.987295663    1   1.987295663   2.650787831   0.120875160
Error        1.34946E+01   18   0.749700010
The correct P value is the one from this test of hypothesis: P = 0.120875160.
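The correction amounts to changing the denominator of the F-ratio: the sequence mean square is compared with the ANIMAL(SEQUENCE) mean square instead of the residual mean square. The sketch below recomputes both ratios from the mean squares of the table. The P value is obtained with a hand-written helper (ours, not a library routine) that integrates the Student t density numerically, using the fact that an F statistic with 1 numerator degree of freedom is the square of a Student t.

```python
import math

def t_two_sided_p(t, df, steps=2000):
    """Two-sided P value of Student's t, by Simpson integration of its density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def density(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps                       # steps must be even for Simpson
    total = density(0.0) + density(abs(t))
    for n in range(1, steps):
        total += (4 if n % 2 else 2) * density(n * h)
    central_area = 2 * total * h / 3         # P(-|t| < T < |t|)
    return 1.0 - central_area

# Mean squares copied from the ANOVA table.
ms_sequence = 1.987295663
ms_animal_in_sequence = 0.749700010          # 18 df
ms_error = 0.050766716

f_naive = ms_sequence / ms_error                    # all effects treated as fixed
f_corrected = ms_sequence / ms_animal_in_sequence   # animals treated as random
p_corrected = t_two_sided_p(math.sqrt(f_corrected), df=18)

print(f"naive F = {f_naive:.4f}, corrected F = {f_corrected:.6f}")
print(f"corrected P = {p_corrected:.6f}")
```

The printed values can be compared with the F-ratio and P value of the "Test of Hypothesis" output.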
Testing the carryover effect
• The test for a carryover effect should be declared significant when P < 0.1
• In the previous example P = 0.12: the carryover effect is not significant
How to interpret the (differential) carryover effect?
• A carryover effect is the effect of the drug administered in a previous period (pollution).
• In a 2×2 crossover, it is differential when it is not the same for the sequences TR and RT.
• A non-differential carryover effect translates into a period effect.
• The differential carryover effect is confounded with the groups of animals; consequently, a poor randomisation can be wrongly interpreted as a carryover effect.
What to do if the carryover effect is significant?
• The kinetic parameters obtained in period 2 are unequally polluted by the treatment administered in period 1.
• In a 2×2 crossover, it is not possible to estimate the pollution.
• When the carryover effect is significant, the data of period 2 should be discarded.
• In such a case, the design becomes a parallel group design.
How to avoid a carryover effect?
• Its origin is a washout period that is too short
• The washout period should be long enough to ensure that no drug is present at the next period of the experiment
ANOVA table (same output as the ANOVA table of the crossover design shown earlier, with annotations)
• This output does not give any information about bioequivalence
• SEQUENCE: the corrected P value is 0.120875160
• ANIMAL(SEQUENCE): inter-animal variability
Summary
• The fundamental assumptions hold
• There is no carryover (crossover design)
• The mean of each formulation and the between-subject (parallel) or within-subject (crossover) variance have been estimated.
may 4-5 2004 Statistics in bioequivalence Parametric or non-parametric ? Transformation of parameters Experimental design : parallel and crossover Confidence intervals and bioequivalence Sample size in bioequivalence trials