Cross-trial estimation of control effect and false positive issues in active-control trials
Abdul J Sankoh, PhD
sanofi-aventis, Bridgewater, NJ 08807, USA
abdul.j.sankoh@sanofi-aventis.com
Tel: (908) 231 2825; fax: (908) 231 2151
Active control trials: Princeton-Trenton ASA-2006
Presentation Outline
• Background and presentation motivation
• Types of active-control study designs
• Notation and hypotheses setting
• Cross-trial estimation of control effect
• Properties and simulation results summary on efficiency of the two most popular methods (PE and CI) for quantifying the inferiority margin
• Other plausible approaches: an overview
• Multiplicative odds ratio: a real clinical data example
• Bayesian modeling: an example
1. Background
In a two-arm active-control trial:
• Even superiority of the experimental drug to the active control doesn't guarantee superiority of either drug to placebo!
To minimize the chance of approving drugs that are no different from placebo, an active-control candidate must have:
• Substantial historical clinical and statistical evidence from randomized (R), double-blind (DB), placebo-controlled (PC), well-conducted trials
  ° showing that similarly designed and conducted trials of the active control regularly demonstrated superiority to placebo, with substantial and similar treatment effect sizes.
Assay sensitivity of the trial!
2. Types of active-control designs
Two-arm active-control design (E vs. C), the most common:
• Assay sensitivity is not measurable in the active trial: the study cannot show its ability to distinguish active from inactive therapy.
Three-or-more-arm active-control design, not common:
• Dose-ranging active-control design ($E_1, \dots, E_K$ vs. C, $K \ge 2$, where $E_1$ is a safe but sub-optimal dose)
• Concurrent active- and placebo-control design (E vs. P; C vs. P; E vs. C)
• Assay sensitivity is measurable in the active trial: the study can show its ability to distinguish active from inactive therapy. Generally preferred by agencies!
Assay sensitivity
R. Temple, on fundamental problems with non-inferiority/equivalence trials: "Critical assumption that trial had 'assay sensitivity'"
This assumption:
• is not necessarily true for all effective drugs
• is not directly testable in the (two-arm) active-control trial
• thus requires an active-control study to have elements of a historically controlled study.
3. Notation & formulation
• $\pi$ = event rate = Pr(Event occurs | Treatment)
  – $\pi_P$ = Pr(Event occurs | P): placebo group response rate (historical trials)
  – $\pi_C$ = Pr(Event occurs | C): control group response rate (both historical and active trials)
  – $\pi_E$ = Pr(Event occurs | E): experimental group response rate
• Define the treatment difference $\delta$:
  – $\delta_{C-P} = \pi_C - \pi_P$ = control effect (efficacy of C), from historical trials
  – $\delta_{E-C} = \pi_E - \pi_C$ = experimental effect (relative efficacy of E), from the active trial
  – $\delta_{E-P} = \delta_{E-C} + \delta_{C-P}$ = treatment effect (E effect), by cross-trial inference
Some notation & formulation
Note: Treatment = Control + Experimental
$(\pi_E - \pi_P) = (\pi_C - \pi_P) + (\pi_E - \pi_C)$, i.e.,
$\delta_{E-P} = \delta_{C-P} + \delta_{E-C}$, so $\delta_{E-C} = \delta_{E-P} - \delta_{C-P}$.
Let $\delta_N = \delta_{M-C} = \pi_M - \pi_C < 0$ be the non-inferiority margin, where $\pi_M$ is the (imaginary) response level at the boundary between acceptable and unacceptable levels of inferiority; i.e., $\delta_N$ = maximum allowable loss of efficacy associated with E relative to C in the active trial.
Hypotheses formulation on $\delta_{E-C} = \pi_E - \pi_C$, where $\pi$ = event rate = Pr[event occurs | treatment]
Non-inferiority: $H_0: \delta_{E-C} \ge \delta_N$ vs. $H_A: \delta_{E-C} < \delta_N$
  [Number line with regions "E is non-inferior" and "E is inferior"; marks at 0 and N]
Superiority: $H_0: \delta_{E-C} \ge 0$ vs. $H_A: \delta_{E-C} < 0$
  [Number line with regions "E is superior" and "E is not different or worse"; marks at -S and 0]
Ex: if $\pi$ = Pr(event occurs), then a small $\delta_{E-C} - \delta_N$ ($\delta_N$ fixed) favors $H_A$.
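Accounting for P in M: percent of C effect retained to ensure efficacy of E
Assume small is better.
• Effect of C: $\pi_C - \pi_P = \delta_{C-P}$
• Effect of E: $\pi_E - \pi_P = \delta_{E-P}$; let $\lambda \in (0,1)$
• Proportion of C effect retained by E: $\lambda = \delta_{C-P}/\delta_{E-P}$
• Proportion of C effect lost by E: $1-\lambda = \delta_{E-C}/\delta_{E-P}$
Ex: $\delta_{C-P}$ = 15% & $\delta_{E-P}$ = 20%; so $\lambda$ = 15/20 = 75%; $1-\lambda$ = 25%.
$H_A$: $\delta_{E-P} < \lambda\,\delta_{C-P}$ (E retains at least $100\lambda$% of C effect)
$\delta_{E-C} > (1-\lambda)\,\delta_{C-P}$ (E loses $100(1-\lambda)$% of C effect)
• Since $\delta_{M-C} = (1-\lambda)\,\delta_{C-P}$, $H_A$: $\delta_{C-E} > (1-\lambda)\,\delta_{M-C}$ (E effect relative to C, $100(1-\lambda)$%); $\delta_{E-C} < \delta_N$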
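The retention arithmetic in the example above can be checked with a minimal sketch. The function name `retention` and its argument names are illustrative, not part of the presentation; the ratio follows the slide's definition of the retained proportion as the C effect over the E effect.

```python
def retention(delta_cp, delta_ep):
    """Return (retained, lost) proportions of the control effect.

    Follows the slide's definition: retained = delta_CP / delta_EP,
    lost = 1 - retained = delta_EC / delta_EP.
    """
    retained = delta_cp / delta_ep
    return retained, 1.0 - retained


# Slide example: C effect 15%, E effect 20%.
retained, lost = retention(15.0, 20.0)
delta_ec = 20.0 - 15.0  # identity: delta_EC = delta_EP - delta_CP
print(retained, lost, delta_ec / 20.0)
```

The last printed value confirms that the "lost" proportion equals $\delta_{E-C}/\delta_{E-P}$, as the additive identity requires.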
4. Two most popular approaches for choosing the non-inferiority margin $\delta_N$
Assume the statistical approach is to conclude non-inferiority of E to C if $\delta_N$ < LL (lower limit) of the $100(1-2\alpha)$% confidence interval (CI) on $\delta_{E-C}$.
Generally, estimate $\delta_{C-P}$ from historical studies (meta-analysis) and choose $\delta_N$ using either:
• Point estimate (PE): $|\delta_N| \le r\,|\hat\delta_{C-P}|$, $0 < r \le 0.5$ ($\delta_N$ cannot exceed the smallest effect size from historical studies)
• CI: $\delta_N$ < LL of the $100(1-\alpha')$% CI on $\delta_{C-P}$ ($1\% \le \alpha' \le 2.5\%$) (useful in addressing within-study variability)
Efficiency issue: PE is too liberal, CI too conservative for controlling the type I error rate for concluding E effect!
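As a rough illustration of how the two rules differ, the sketch below computes an upper bound on the margin's magnitude under each. It assumes a normal approximation, and the historical estimate (0.10) and standard error (0.03) are hypothetical numbers chosen only for the example; the function names are likewise illustrative.

```python
from statistics import NormalDist


def pe_margin(effect_hat, r=0.5):
    """Point-estimate rule: |margin| <= r * |historical effect estimate|."""
    return r * abs(effect_hat)


def ci_margin(effect_hat, se, alpha_prime=0.025):
    """CI rule: |margin| bounded by the confidence limit closest to zero.

    Normal approximation (an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(1 - alpha_prime)
    return max(abs(effect_hat) - z * se, 0.0)


# Hypothetical historical control effect and standard error.
pe = pe_margin(0.10, r=0.5)
ci = ci_margin(0.10, se=0.03, alpha_prime=0.025)
print(f"PE bound: {pe:.4f}, CI bound: {ci:.4f}")
```

With a noisier historical estimate the CI bound shrinks toward zero while the PE bound is unchanged, which is one way to see why the slide calls the CI rule conservative and the PE rule liberal.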
Efficiency and type I error rate with estimation of C and E effect in active trial using PE approach
Point estimate (PE): $|\delta_N| \le r\,|\hat\delta_{C-P}|$, $0 < r \le 0.5$
– efficient under constancy and 100% retention of control effect
– but type I error rate inflation for concluding efficacy of E if the constancy assumption is not tenable or the C effect in the active trial is less than in historical trials (see Fig. 1)
In general, the type I error rate for concluding E effect with the 97.5% PE is
$$\Pr[\hat\delta_{E-P} \le \lambda\,\hat\delta_{C-P} \mid H_0] \approx 1 - \Phi(1.96 f), \qquad E[\,\cdot \mid H_0] = f = \left[1 + \{(1-\lambda)\,\sigma_0\,\sigma_{E-C}^{-1}\}^2\right]^{-1/2} < 1$$
where $\sigma_0^2 = \sigma^2_{C-P}$ is the estimate of control effect variance from historical data, $\sigma^2_{E-C}$ is estimated from the active trial, and $\hat\delta$ & $\delta$ are the observed & true differences.
Fig. 1. Overall type I error rate with the PE approach for concluding efficacy of E and non-inferiority to C, for given % retention and efficiency ratio ($\sigma_{C-P}/\sigma_{E-C}$ = 1.5, 1.1, 1.0, 0.91, 0.67)
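The PE type I error expression, $1 - \Phi(1.96 f)$ with $f = [1 + \{(1-\lambda)\sigma_0/\sigma_{E-C}\}^2]^{-1/2}$, can be evaluated directly. The sketch below is not the simulation behind Fig. 1; it only tabulates the closed-form approximation at the ratios listed in the caption, with the retention level 0.5 picked as an illustrative assumption.

```python
from statistics import NormalDist


def pe_type1_error(retention, sd_ratio, z=1.96):
    """Approximate type I error for the point-estimate approach.

    retention: proportion of control effect retained (lambda).
    sd_ratio:  sigma_0 / sigma_EC (historical vs. active-trial SD).
    """
    f = (1 + ((1 - retention) * sd_ratio) ** 2) ** -0.5
    return 1 - NormalDist().cdf(z * f)


# Ratios from the Fig. 1 caption; retention 0.5 is an assumed example value.
for ratio in (1.5, 1.1, 1.0, 0.91, 0.67):
    print(ratio, round(pe_type1_error(0.5, ratio), 4))
```

At 100% retention $f = 1$ and the nominal one-sided 2.5% level is recovered; below that, larger historical variability relative to the active trial pushes the rate further above nominal.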
Efficiency and type I error rate for estimation of C and E effect in active trial using CI approach. Preferred by agencies? (Hung et al, Wang et al, Chen et al, Tsong et al, Rothman et al, Ng et al)
Confidence interval (CI): $\delta_N$ < LL of the $100(1-\alpha')$% CI on $\delta_{C-P}$
– ultra-conservative even with constancy of control effect
– generally serious deflation of type I error rate unless 100% retention of control effect is achieved (Fig. 2)
In general, the type I error rate for concluding E effect with the 97.5% CI is
$$\Pr[\hat\delta_{E-P} \le \lambda\,\hat\delta_{C-P} \mid H_0] \approx 1 - \Phi(1.96 h), \qquad E[\,\cdot \mid H_0] = h = \left[1 + (1-\lambda)\,\sigma_0\,\sigma_{E-C}^{-1}\right]\left[1 + \{(1-\lambda)\,\sigma_0\,\sigma_{E-C}^{-1}\}^2\right]^{-1/2} > 1$$
where $\sigma_0^2 = \sigma^2_{C-P}$ is the estimate of control effect variance from historical data, $\sigma^2_{E-C}$ is estimated from the active trial, and $\hat\delta$ & $\delta$ are the observed & true differences.
Fig. 2. CI approach: overall type I error rate for concluding efficacy of E and non-inferiority to C, for given % retention and efficiency ratio ($\sigma_{C-P}/\sigma_{E-C}$ = 1.1, 1.0, 0.91, 0.67)
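The corresponding closed form for the CI approach, $1 - \Phi(1.96 h)$ with $h = [1 + k][1 + k^2]^{-1/2}$ and $k = (1-\lambda)\sigma_0/\sigma_{E-C}$, can be tabulated the same way. Again this is a sketch of the approximation, not the simulation behind Fig. 2, and the retention level 0.5 is an assumed example value.

```python
from statistics import NormalDist


def ci_type1_error(retention, sd_ratio, z=1.96):
    """Approximate type I error for the confidence-interval approach.

    retention: proportion of control effect retained (lambda).
    sd_ratio:  sigma_0 / sigma_EC (historical vs. active-trial SD).
    """
    k = (1 - retention) * sd_ratio
    h = (1 + k) / (1 + k ** 2) ** 0.5
    return 1 - NormalDist().cdf(z * h)


# Ratios from the Fig. 2 caption; retention 0.5 is an assumed example value.
for ratio in (1.1, 1.0, 0.91, 0.67):
    print(ratio, round(ci_type1_error(0.5, ratio), 4))
```

Because $h > 1$ whenever retention is below 100%, the rate falls below the nominal 2.5%, matching the "deflation" described on the slide.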
Summary of simulation results (Figs. 1 & 2)
• In general, the nominal type I error rate ($\alpha$) is maintained if historical control effect ($\delta^0_{C-P}$) = active control effect ($\delta_{C-P}$)!
• Generally, when $\delta_N$ is a fixed known constant, $\Pr[\hat\delta_{E-P} \le \lambda\,\hat\delta_{C-P} \mid H_0] \le \alpha$.
• For both methods, efficiency of the active study is highly dependent on external factors outside the control of the active study.
• For PE, higher historical (C) variability vs. E implies a more liberal test.
• For CI, higher historical (C) variability vs. E implies a more conservative test.
• For both, smaller historical (C) variability vs. E implies a more efficient active study.
Any plausible less conservative approach? Critical Path Initiative
FDA Considering "Less Conservative" Approaches To Non-Inferiority Trials, PINK-SHEET, June 27, 2005, pp 10.
• "One of the things we've been thinking about is whether there are somewhat less conservative approaches" to non-inferiority trials "than we've been inclined to use," Temple said at the 6/15/2005 FDA Cardio-Renal ACM.
• FDA has "thought about things like narrowing the confidence interval for certain measures, using less stringent insistence on convention in a wide variety of ways".
• One way to narrow confidence intervals would be "to incorporate prior data".
BAYESIAN approach?
Numerical Ex 2: ESSENCE
Active-control, DB, R, PG study comparing Enoxaparin + Aspirin (E+A) vs. Heparin + Aspirin (H+A) in UA pts. 2ndary endpoint: composite of death or MI at Day 14.
2-sided p-value = 0.019 for OR of E+A vs. H+A (for the primary composite endpoint of recurrent angina, MI, or death).
For the 2ndary composite endpoint of MI or death:
• $p_{H+A}$ = 96/1564 = 0.06 = Pr(MI or Death | C ≡ H+A)
• $p_{E+A}$ = 79/1607 = 0.05 = Pr(MI or Death | E ≡ E+A)
• OR (E+A) vs. (H+A) = 0.791; 95% CI (0.582, 1.074); p-value = 0.132.
Q: Is E+A better than A alone, had there been an A arm in the active trial? To answer, cross-trial estimation of the Aspirin response?
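The odds ratio, interval, and p-value quoted for the secondary endpoint can be reproduced from the two event counts. The sketch below uses the standard large-sample (Woolf) interval on the log odds ratio; the slide does not say which method produced its numbers, so that choice is an assumption, and the function name is illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(events_e, n_e, events_c, n_c, level=0.95):
    """Odds ratio of E vs. C with a Woolf (log-scale) interval and p-value."""
    a, b = events_e, n_e - events_e   # experimental: events, non-events
    c, d = events_c, n_c - events_c   # control: events, non-events
    or_hat = (a / b) / (c / d)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(or_hat) - z * se)
    upper = exp(log(or_hat) + z * se)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(log(or_hat)) / se))
    return or_hat, lower, upper, p_two_sided


# E+A: 79/1607 events; H+A: 96/1564 events (death or MI at Day 14).
or_hat, lower, upper, p_val = odds_ratio_ci(79, 1607, 96, 1564)
print(f"OR={or_hat:.3f}, 95% CI=({lower:.3f}, {upper:.3f}), p={p_val:.3f}")
```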
Numerical Examples: 1. ESSENCE
Active-control, DB, R, PG study comparing Enoxaparin + Aspirin vs. Heparin + Aspirin in UA pts. 2ndary endpoint: composite of death or MI at Day 14.
▪ Used meta-analysis to estimate H+A ($C_0$) effects from historical data on reduction in death or MI in UA patients.
Incidence of MI or death and associated OR from 6 historical studies

Historical data
Study    Aspirin (A ≡ P)    H+A (H+A ≡ C0)
1        4/121              2/122
2        7/189              3/210
3        1/32               0/37
4        9/109              4/105
5        40/131             42/154
6        7/73               4/70
Total    68/655 (0.104)     55/698 (0.079)
OR (H+A) vs. A = 0.665; 95% CI: (0.443, 0.992); p = .045 (six 2×2 tables, StatXact)

Active study
H+A (C)           E+A (E)
96/1564 (0.06)    79/1607 (0.05)
OR (E+A) vs. (H+A) = 0.791; 95% CI: (0.58, 1.07); 2-sided p-value = 0.132
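A pooled odds ratio across the six historical 2×2 tables can be approximated with the Mantel-Haenszel estimator. This is a swapped-in technique: the slide's 0.665 comes from an exact stratified analysis in StatXact, so the sketch is expected to land close to it rather than match it. The first H+A entry is taken as 2/122, the denominator implied by the printed total of 55/698.

```python
# (events_A, n_A, events_HA, n_HA) for the six historical studies.
historical = [
    (4, 121, 2, 122),
    (7, 189, 3, 210),
    (1, 32, 0, 37),
    (9, 109, 4, 105),
    (40, 131, 42, 154),
    (7, 73, 4, 70),
]


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio of H+A vs. aspirin alone."""
    num = den = 0.0
    for ev_a, n_a, ev_h, n_h in tables:
        total = n_a + n_h
        num += ev_h * (n_a - ev_a) / total   # H+A events x aspirin non-events
        den += (n_h - ev_h) * ev_a / total   # H+A non-events x aspirin events
    return num / den


mh_or = mantel_haenszel_or(historical)
totals = [sum(col) for col in zip(*historical)]
print(f"MH OR = {mh_or:.3f}; totals = {totals}")
```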
Ex 2: Active-control, DB, R, PG study comparing E+A vs. H+A. 2ndary endpoint: composite of death or MI at Day 14.
Multiplicative odds ratio (OR), from epidemiology
To estimate the OR of E+A vs. A alone:
OR (E+A) vs. A = [OR (E+A) vs. (H+A)] × [OR (H+A) vs. A]   (active × historical)
               = 0.791 × 0.665 ≈ 0.52
95% CI (computed via log(OR)): (0.31, 0.89); 2-sided p-value = 0.016 (ESSENCE Study FDA Statistical review, 1998).
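One way to approximate the cross-trial interval is to add the two log odds ratios and their variances, treating the active and historical estimates as independent. In the sketch below the standard errors are backed out of the two printed 95% intervals under a normal approximation; both steps are assumptions of this example, not the method documented in the FDA review, so the result is close to but not identical to the slide's interval and p-value.

```python
from math import exp, log, sqrt
from statistics import NormalDist

Z975 = NormalDist().inv_cdf(0.975)


def se_from_ci(lower, upper):
    """Back out the SE of log(OR) from a printed 95% interval."""
    return (log(upper) - log(lower)) / (2 * Z975)


def combine_or(or_active, ci_active, or_hist, ci_hist):
    """Multiply two independent ORs; combine variances on the log scale."""
    log_or = log(or_active) + log(or_hist)
    se = sqrt(se_from_ci(*ci_active) ** 2 + se_from_ci(*ci_hist) ** 2)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(log_or) / se))
    return exp(log_or), exp(log_or - Z975 * se), exp(log_or + Z975 * se), p_two_sided


or_ea, lower, upper, p_val = combine_or(0.791, (0.582, 1.074), 0.665, (0.443, 0.992))
print(f"OR={or_ea:.3f}, 95% CI=({lower:.2f}, {upper:.2f}), p={p_val:.3f}")
```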
Bayesian modeling in two-arm active design
• Bayesian methods have been used in traditional two-arm active-control designs to show implicit superiority of the test drug over (putative) placebo (Simon, Biometrics 1999).
• Determine (predictive) posterior probabilities based on appropriate priors and conclude non-inferiority if the predictive posterior probability is greater than 0.975.
• If Pr[E effect relative to putative P | data] ≥ 0.95, then E >> P.
• If Pr[C effect +ve & E preserves 100(1- )% of C effect] ≥ 0.975, then conclude E preserves (1- )% of C effect.
• Compute one-sided p-value = 1 - posterior probability.
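Predictive (posterior) models for calculation of posterior probabilities
Ex: one-sample predictive (posterior) categorical model. From prior clinical info, estimate N = 30 (1:1 to 2 treatment groups).
▪ Partition the sample space: $(y_{j1}, y_{j2}, y_{j3}) = (x_{j1}, x_{j2}, x_{j3}) + (z_{j1}, z_{j2}, z_{j3})$, with x observed and z unobserved (y's categorical & exchangeable)
▪ Assume $y_{jr} \mid p_j = p_{jr}$ and $p_{jr} \sim \text{Dirichlet}(\alpha_{jr})$; $r = 1, \dots, M$
▪ Given sample info $x_j = (x_{j1}, x_{j2}, x_{j3})$ with $\sum_{1 \le r \le 3} x_{jr} = n_j$, then $y_{jr} \mid (p_j, x_j) \sim \text{Dir}(\alpha_{jr} + x_{jr})$
▪ $\hat p_{jr} = \alpha_{jr}/\alpha_{j\cdot}$, $\alpha_{j\cdot} = \alpha_{j1} + \dots + \alpha_{jM}$: prior to sample info
▪ $\hat p_{jr} \mid x = (\alpha_{jr} + x_{jr})/(\alpha_{j\cdot} + n_j)$, $\alpha_{j\cdot} + n_j = \alpha_{j1} + x_{j1} + \dots + \alpha_{jM} + x_{jM}$: given sample info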
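Ex: one sample of the categorical model (continued)
Assume that at an interim look, of 7 subjects randomized and treated with treatment j, we observe $x_j = (2, 4, 1)$.
If $\alpha_{jr} = 2$ for all r, then $\hat p_{jr} \mid \alpha_j = \alpha_{jr}/\alpha_{j\cdot} = 2/(2+2+2) = 0.333$ is our prior estimate before sample info, and
$\hat p_j \mid (\alpha_j, x) = \{(\alpha_{jr} + x_{jr})/(\alpha_{j\cdot} + n_j)\} = (.307, .462, .230)$ is our updated estimate given sample info $x_j = (2, 4, 1)$.
Compare with the frequentist estimate $\hat p_j = (.286, .571, .143)$, which corresponds to $\alpha_{jr} = 0$.
For $N_j = 20$, $z = (5, 5, 3)$, $\{\alpha_j\} = (1, 3, 2)$: $\{\alpha_{jr}/\alpha_{j\cdot}\} = (.167, .500, .333)$; $\{(\alpha_{j1} + x_{j1})/(\alpha_{j\cdot} + n_j)\} = (.133, .467, .267)$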
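The prior, posterior, and frequentist estimates in the first example follow from the conjugate Dirichlet update and can be recomputed in a few lines. The function name `dirichlet_mean` is illustrative; the data are the slide's prior (2, 2, 2) and interim counts (2, 4, 1).

```python
def dirichlet_mean(alpha, counts=None):
    """Mean of a Dirichlet(alpha) prior, or of the posterior
    Dirichlet(alpha + counts) when counts are supplied."""
    if counts is None:
        counts = [0] * len(alpha)
    post = [a + x for a, x in zip(alpha, counts)]
    total = sum(post)
    return [p / total for p in post]


alpha = [2, 2, 2]   # prior parameters from the slide
x = [2, 4, 1]       # interim counts from the slide (n = 7)

prior_mean = dirichlet_mean(alpha)
posterior_mean = dirichlet_mean(alpha, x)
frequentist = [xi / sum(x) for xi in x]

print([round(p, 3) for p in prior_mean])
print([round(p, 3) for p in posterior_mean])
print([round(p, 3) for p in frequentist])
```

The posterior mean sits between the prior mean and the sample proportions, with the pull toward the prior governed by the prior total (here 6) relative to the sample size (here 7).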
Fig. 3. For N = 15, predictive $PP[z \mid x; \alpha]$ vs. multinomial $Mn[z; p]$ future success probabilities for prior info a = (2, 2, 2) & b = (2, 3, 1); sample info x = (2, 4, 1); future success $z = (z_{11}, z_{12}, z_{13})$, $z_{11} + z_{12} + z_{13} = 8$
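The comparison in Fig. 3 can be sketched numerically: with a Dirichlet posterior, the predictive distribution of the 8 future outcomes is Dirichlet-multinomial, which can be set against a plain multinomial with plug-in probabilities. Using the posterior mean as the plug-in p is an assumption of this sketch (the figure does not say which p was used), and the function names are illustrative.

```python
from math import exp, factorial, lgamma, prod


def predictive_pmf(z, post):
    """Dirichlet-multinomial pmf of future counts z given posterior parameters."""
    m, a = sum(z), sum(post)
    log_p = lgamma(m + 1) - sum(lgamma(k + 1) for k in z)
    log_p += lgamma(a) - lgamma(a + m)
    log_p += sum(lgamma(ar + k) - lgamma(ar) for ar, k in zip(post, z))
    return exp(log_p)


def multinomial_pmf(z, p):
    """Multinomial pmf of counts z with fixed cell probabilities p."""
    coef = factorial(sum(z))
    for k in z:
        coef //= factorial(k)
    return coef * prod(pr ** k for pr, k in zip(p, z))


prior, x, m = (2, 2, 2), (2, 4, 1), 8          # prior a, sample info, future n
post = [a + xi for a, xi in zip(prior, x)]       # posterior parameters (4, 6, 3)
plug_in = [a / sum(post) for a in post]          # assumed plug-in probabilities

outcomes = [(i, j, m - i - j) for i in range(m + 1) for j in range(m + 1 - i)]
pp_total = sum(predictive_pmf(z, post) for z in outcomes)
mn_total = sum(multinomial_pmf(z, plug_in) for z in outcomes)
print(predictive_pmf((2, 4, 2), post), multinomial_pmf((2, 4, 2), plug_in))
```

The predictive distribution is more spread out than the plug-in multinomial because it carries the remaining uncertainty about p.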
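Ex: Extension to non-categorical model
$y_{jk}$ = outcome of the kth subject in treatment j ($j = 1, 2$ and $k = 1, \dots, N_j$), with $y_{jk}$ any value on the real line, R.
• $y_j = (y_{j1}, \dots, y_{jN_j})$ is a random sample from $F_j$
• If $F_j \sim \alpha_j(\cdot)/\alpha_j(R)$, a Dirichlet process with shape measure $\alpha_j(\cdot)$ defined on R, $\alpha_j(R) = \alpha_j(-\infty, \infty)$
• Then $y_j$ is a Dirichlet process with shape measure $\alpha_j(\cdot)$
• Given sample information $x_j = (x_{j1}, \dots, x_{jn_j})$, $y_j \mid x \sim \text{Dir}(\alpha_j(\cdot) + n_j(\cdot))$, where $n_j(\cdot) = \sum_k \delta_{x_k}(\cdot)$ and $\delta_x$ is the unit mass at x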
Strength of prior clinical evidence
• $\alpha(R) \ge 0$ is a measure of the strength of prior clinical evidence (e.g., Phase I/IIa, historical trial sample size)
• $\alpha(R) = 0$ reflects a complete lack of clinical information
• Thus $\alpha(R)$ ranges over $(0, \infty)$
• $\alpha(R)$ could be dominated by sample information
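Posterior estimates
Let $g(y)$ denote the mean treatment difference: $g(y) = (y_{11} + \dots + y_{1N_1})/N_1 - (y_{21} + \dots + y_{2N_2})/N_2$.
• $\mu_n = E[g(y) \mid x] = \omega\,\hat\mu_n + (1-\omega)\,\mu_0$, $0 \le \omega \le 1$
  = $\omega$ (mean of observed) + $(1-\omega)$ (mean of unobserved)
  – compare $\omega^*$ (stage 1 estimate) + $(1-\omega^*)$ (stage 2 estimate), the frequentist's weighted statistic
• $\sigma^2_n = \text{Var}[g(y) \mid x] = w_1 s^2 + w_2\,\alpha(R)\,(\hat\mu_n - \mu_0)^2 + w_3 \sigma^2$, which reduces to $w_n s^2$ for $\alpha(R) = 0$
where $s^2$ & $\sigma^2$ = sample & prior variances; $\hat\mu_n$ & $\mu_0$ = sample & prior means; the w's are constants.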
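The weighted-average form of the posterior mean can be illustrated for a single treatment group. The slide does not give the weight explicitly; the sketch below assumes the standard Dirichlet-process result for the posterior mean of a distribution's mean, where the weight on the sample mean is $n/(\alpha(R) + n)$. The data values and the function name are hypothetical.

```python
from statistics import mean


def dp_posterior_mean(sample, prior_mean, prior_strength):
    """Posterior mean under a Dirichlet-process prior (single group).

    Assumed weight on the sample mean: n / (prior_strength + n),
    where prior_strength plays the role of alpha(R).
    """
    n = len(sample)
    weight = n / (prior_strength + n)
    return weight * mean(sample) + (1 - weight) * prior_mean


sample = [1.2, 0.8, 1.5, 1.1]   # hypothetical observed outcomes
prior_mu = 0.5                  # hypothetical prior mean

no_prior = dp_posterior_mean(sample, prior_mu, prior_strength=0.0)
balanced = dp_posterior_mean(sample, prior_mu, prior_strength=4.0)
strong = dp_posterior_mean(sample, prior_mu, prior_strength=1e9)
print(no_prior, balanced, strong)
```

With prior strength 0 the estimate is the sample mean, and as the prior strength grows it moves to the prior mean, which is the sense in which $\alpha(R)$ measures the strength of prior evidence and can be dominated by sample information.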
Establishing non-inferiority
• With a posterior model (categorical/non-categorical), one can determine (predictive) posterior probabilities and conclude non-inferiority if the predictive posterior probability is greater than 0.975.
• If Pr[E effect relative to putative P | data] ≥ 0.95, then E >> P.
• If Pr[C effect +ve & E preserves 100(1- )% of C effect] ≥ 0.975, then conclude E preserves (1- )% of C effect.
• Compute one-sided p-value = 1 - posterior probability.
Other possible applications
Given a clinically acceptable treatment effect size:
• For fixed N ($= \sum_{1 \le j \le J} N_j$), a test for futility is possible (group sequential)
• If N is not fixed, sample size re-estimation is possible (adaptive)
• Dynamic treatment allocation (play the winner)
Concluding Remarks
• It is generally difficult to design and conduct efficient active-control trials.
• Active-control trial design and hypotheses formulation are not only disease, endpoint, and analysis-method dependent, but also too dependent on historical data.
• Equivalence or non-inferiority trials always raise a question: is at least $\lambda$% of the control effect size retained in the active trial? If not, an equivalence or non-inferiority conclusion is meaningless, as the experimental drug (E) could have no effect at all.
• Thus if the control effect size is not "substantial enough" and/or "constant", a superiority and NOT a non-inferiority objective should be pursued; else the stringent requirement of '100% retention of control effect' is inescapable!
• The constancy assumption for the C effect and trial conditions is not time-invariant, and is unrealistic for active-control trial success.
Concluding Remarks (continued)
• Simulation results show adequate type I error rate control for concluding E effect when the point estimate is used to select the inferiority margin ($\delta_N$) with 75% retention of C effect. But the CI method is ultra-conservative for concluding E effect even with 75% retention of C effect.
• Since the primary concern is assay sensitivity, regulators should provide lists of compounds (by class) with acceptable assay sensitivity, for which it is acceptable to focus on establishing non-inferiority of E to C rather than demonstrating the E effect.
• For compounds with poor assay sensitivity, agencies should recommend a three-arm dose-ranging active design, or else clearly document the minimum C effect that must be reproduced in the active trial to ensure efficacy of E in the active study.
Need to explore alternative approaches to establish E effect!
Some References
• Tsong Y, Zhang J (2005). Testing superiority and non-inferiority hypotheses in active controlled clinical trials. Biometrical Journal, 47, 62-74.
• Hung HMJ, Wang S-J, Tsong Y, Lawrence J, O'Neill RT (2003). Some fundamental issues with non-inferiority testing in active controlled trials. Statistics in Medicine, 22, 213-225.
• ICH Efficacy Document No. E-9 (1997). Statistical Principles of Clinical Trials. http://www.fda.gov/cder/guidance.
• DerSimonian R, Laird N (1986). Meta-analysis in clinical trials. Controlled Clinical Trials, 7, 177-188.
• Sankoh AJ, Huque MF. Impact of Multiple Endpoints on Type I Error Rate and Power of Test Statistic in Non-superiority Clinical Trials. Far East Journal of Theoretical Statistics, 13(1), 47-65.
• Sankoh AJ, Al-Osh M, Huque MF (1999). On the utility of the Dirichlet distribution for meta-analysis of clinical studies. Journal of Biopharmaceutical Statistics, 9, 289-306.
• Röhmel J (1998). Therapeutic equivalence investigations: statistical considerations. Statistics in Medicine, 17, 1703-1714.
• Holmgren EB (1999). Establishing equivalence by showing that a specified percentage of the effect of the active control over placebo is maintained. Journal of Biopharmaceutical Statistics, 9(4), 651-659.