Cross-trial estimation of control effect and false positive issues in active-control trials

Abdul J Sankoh, PhD
sanofi-aventis, Bridgewater, NJ 08807, USA
abdul.j.sankoh@sanofi-aventis.com
Tel: (908) 231 2825; fax: (908) 231 2151
Active control trials: Princeton-Trenton ASA-2006

Presentation Outline Background and presentation motivation  Types of active-control study designs Notation and hypotheses setting Cross-trial estimation of control effect  Properties & simulation results summary on efficiency of two most popular (PE&CI) methods for quantifying inferiority margin Other plausible approaches- an overview  Multiplicative odds ratio, a real clinical data example  Bayesian Modeling- an example

1. Background In a two-arm active-control trial, •evensuperiority of experimental drug to active-control doesn’t guarantee superiority of either drug to placebo! To minimize chances of approving drugs not different than placebo, active-control candidate must have •Substantialhistorical clinical and statistical evidence from R, DB, PC, and well-conducted trials ° that show similarly designed and conducted trials of active-control regularly demonstrated superiority to placebo with substantial and similar treatment effect sizes, Assay sensitivity of the trial!

Problem and motivation

2. Types of active-control designs
• Two-arm active-control design (E vs. C), the most common:
  – Assay sensitivity is not measurable in the active trial: it cannot show the study's ability to distinguish active from inactive therapy.
• Three-or-more-arm active-control study designs, not common:
  – Dose-ranging active-control study design ($E_1, \dots, E_K$ vs. C, $K \ge 2$, where $E_1$ is a safe but sub-optimal dose)
  – Concurrent active- and placebo-control study design (E vs. P; C vs. P; E vs. C)
  – Assay sensitivity is measurable in the active trial: it can show the study's ability to distinguish active from inactive therapy. Generally preferred by agencies!

Assay sensitivity:- R. Temple: Fundamental problems with non-inferiority/ equivalence trials: “Critical assumption that trial had ‘assay sensitivity’” This assumption: • not necessarily true for all effective drugs •not directly testable in the (2-arm) active-control trial • thus requires an active-control study to have elements of a historically controlled study.

3. Notation & formulation
• $\pi$ = event rate = Pr(Event occurs | Treatment):
  – $\pi_P$ = Pr(Event occurs | P), placebo group response rate (historical)
  – $\pi_C$ = Pr(Event occurs | C), control group response rate (both)
  – $\pi_E$ = Pr(Event occurs | E), experimental group response rate
• Define the treatment difference $\delta$:
  – $\delta_{C-P} = \pi_C - \pi_P$ = control effect (efficacy of C), from historical trials
  – $\delta_{E-C} = \pi_E - \pi_C$ = experimental effect (relative efficacy of E), from the active trial
  – $\delta_{E-P} = \delta_{E-C} + \delta_{C-P}$ = treatment effect (E effect), by cross-trial inference

Some notation & formulation
• Note: Treatment = Control + Experimental:
  $(\pi_E - \pi_P) = (\pi_C - \pi_P) + (\pi_E - \pi_C)$, i.e., $\delta_{E-P} = \delta_{C-P} + \delta_{E-C}$, so $\delta_{E-C} = \delta_{E-P} - \delta_{C-P}$
• Let $\delta_N = \pi_M - \pi_C < 0$ be the non-inferiority margin, where $\pi_M$ is the (imaginary) response level at the boundary between acceptable and unacceptable levels of inferiority; i.e., $\delta_N$ = maximum allowable loss of efficacy associated with E relative to C in the active trial.

Hypotheses formulation on $\delta_{E-C} = (\pi_E - \pi_C)$
• $\pi$ = event rate = Pr[event occurs | treatment]
• Non-inferiority: $H_0: \delta_{E-C} \ge \delta_N$ vs. $H_A: \delta_{E-C} < \delta_N$
• Superiority: $H_0: \delta_{E-C} \ge 0$ vs. $H_A: \delta_{E-C} < 0$
• Non-inferiority axis (marked at 0, then $\delta_N$): E is non-inferior below $\delta_N$; E is inferior at or beyond $\delta_N$.
• Superiority axis (marked at $-\delta_S$, then 0): E is superior below 0; E is not different or worse at or beyond 0.

Ex: $\pi$ = Pr(event occurs); then a small $\delta_{E-C} - \delta_N$ ($\delta_N$ fixed) favors $H_A$.

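Accounting for $\pi_P$ in $\pi_M$: percent of C effect retained to ensure efficacy of E
• Assume small $\pi$ is better.
• Effect of C: $\pi_C - \pi_P = \delta_{C-P}$
• Effect of E: $\pi_E - \pi_P = \delta_{E-P}$; let $\lambda \in (0,1)$
• Proportion of C effect retained by E: $\lambda = \delta_{C-P}/\delta_{E-P}$
• Proportion of C effect lost by E: $1-\lambda = \delta_{E-C}/\delta_{E-P}$
• Ex: $\delta_{C-P}$ = 15% & $\delta_{E-P}$ = 20%; so $\lambda$ = 15/20 = 75%; $1-\lambda$ = 25%.
• $H_A$: $\delta_{E-P} < \lambda\,\delta_{C-P}$ (E retains at least $\lambda$% of C effect)
  – $\delta_{E-C} > (1-\lambda)\,\delta_{C-P}$ (E loses $\le (1-\lambda)$% of C effect)
• Since $\delta_{M-C} = (1-\lambda)\,\delta_{C-P}$:
  – $H_A$: $\delta_{C-E} > (1-\lambda)\,\delta_{M-C}$ (E effect relative to C $\ge (1-\lambda)$%)
  – $\delta_{E-C} < \delta_N$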

4. Two most popular approaches for choosing the non-inferiority margin $\delta_N$
• Assume the statistical approach is to conclude non-inferiority of E to C if $\delta_N$ < LL (lower limit) of the $100(1-2\alpha)$% confidence interval (CI) on $\delta_{E-C}$.
• Generally, estimate $\delta_{C-P}$ from historical studies (meta-analysis) and choose $\delta_N$ using either:
  – Point estimate (PE): $|\delta_N| \le r\,|\delta_{C-P}|$, $0 < r \le 0.5$ ($\delta_N$ cannot exceed the smallest effect size from historical studies)
  – CI: $\delta_N$ < LL of the $100(1-\alpha')$% CI on $\delta_{C-P}$ ($1\% \le \alpha' \le 2.5\%$) (useful in addressing within-study variability)
• Efficiency issue: PE is too liberal, CI too conservative for controlling the type I error rate for concluding E effect!

Efficiency and type I error rate with estimation of C and E effect in active trial using PE approach
• Point estimate (PE): $|\delta_N| \le r\,|\delta_{C-P}|$, $0 < r \le 0.5$
  – efficient under constancy and 100% retention of control effect
  – but type I error rate inflation for concluding efficacy of E if the constancy assumption is not tenable or the C effect in the active trial is less than in historical trials (see Fig. 1)
• In general, the type I error rate for concluding E effect with the 97.5% PE is
  $\Pr[\text{conclude E effect} \mid H_0] \approx 1 - \Phi(1.96\,f)$; $E[\hat\delta \mid H_0] = \delta$
  $f = \left[1 + \{(1-\lambda)\,\sigma_0\,\sigma_{E-C}^{-1}\}^2\right]^{-1/2} < 1$
• $\sigma_0^2 = \sigma^2_{C-P}$ is the estimate of the control effect variance from historical data.
• $\sigma^2_{E-C}$ is estimated from the active trial; $\hat\delta$ is the observed and $\delta$ the true difference.

Fig. 1. Overall type I error rate with the PE approach for concluding efficacy of E and non-inferiority to C, for given % retention and efficiency ratio ($\sigma_{C-P}/\sigma_{E-C}$ = 1.5, 1.1, 1.0, 0.91, 0.67)

Efficiency and type I error rate for estimation of C and E effect in active trial using CI approach. Preferred by agencies? (Hung et al, Wang et al, Chen et al, Tsong et al, Rothman et al, Ng et al)
• Confidence interval (CI): $\delta_N$ < LL of the $100(1-\alpha')$% CI on $\delta_{C-P}$
  – ultra-conservative even with constancy of control effect
  – generally serious deflation of the type I error rate unless 100% retention of control effect is achieved (Fig. 2)
• In general, the type I error rate for concluding E effect with the 97.5% CI is
  $\Pr[\text{conclude E effect} \mid H_0] \approx 1 - \Phi(1.96\,h)$; $E[\hat\delta \mid H_0] = \delta$
  $h = \left[1 + (1-\lambda)\,\sigma_0\,\sigma_{E-C}^{-1}\right]\left[1 + \{(1-\lambda)\,\sigma_0\,\sigma_{E-C}^{-1}\}^2\right]^{-1/2} > 1$
• $\sigma_0^2 = \sigma^2_{C-P}$ is the estimate of the control effect variance from historical data.
• $\sigma^2_{E-C}$ is estimated from the active trial; $\hat\delta$ is the observed and $\delta$ the true difference.

Fig. 2. Overall type I error rate with the CI approach for concluding efficacy of E and non-inferiority to C, for given % retention and efficiency ratio ($\sigma_{C-P}/\sigma_{E-C}$ = 1.1, 1.0, 0.91, 0.67)
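The two type I error expressions for the PE and CI approaches can be evaluated numerically. The following is a minimal sketch, not the simulation code behind Figs. 1 and 2. It assumes the garbled slide formulas read $1-\Phi(1.96 f)$ with $f = [1+k^2]^{-1/2}$ and $1-\Phi(1.96 h)$ with $h = (1+k)[1+k^2]^{-1/2}$, where $k = (1-\lambda)\sigma_0/\sigma_{E-C}$. The function names and the argument names `lam` (for $\lambda$) and `sd_ratio` (for $\sigma_0/\sigma_{E-C}$) are illustrative only.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def type1_point_estimate(lam: float, sd_ratio: float, z: float = 1.96) -> float:
    """Type I error 1 - Phi(z * f) for the point-estimate (PE) margin.

    lam      -- lambda in the slide formulas (assumed reading of the lost glyph)
    sd_ratio -- sigma_0 / sigma_{E-C}, historical vs. active-trial standard error
    """
    k = (1.0 - lam) * sd_ratio
    f = 1.0 / math.sqrt(1.0 + k * k)  # f <= 1, so the error rate is >= nominal
    return 1.0 - norm_cdf(z * f)


def type1_confidence_interval(lam: float, sd_ratio: float, z: float = 1.96) -> float:
    """Type I error 1 - Phi(z * h) for the confidence-interval (CI) margin."""
    k = (1.0 - lam) * sd_ratio
    h = (1.0 + k) / math.sqrt(1.0 + k * k)  # h >= 1, so the error rate is <= nominal
    return 1.0 - norm_cdf(z * h)


if __name__ == "__main__":
    # Efficiency ratios taken from the Fig. 1 caption; lam values are illustrative.
    for ratio in (1.5, 1.1, 1.0, 0.91, 0.67):
        for lam in (0.5, 0.75, 1.0):
            pe = type1_point_estimate(lam, ratio)
            ci = type1_confidence_interval(lam, ratio)
            print(f"ratio={ratio:4.2f} lam={lam:4.2f}  PE={pe:.4f}  CI={ci:.4f}")
```

Under this reading, both expressions collapse to the nominal one-sided 2.5% when $k = 0$, and they move in opposite directions (PE above, CI below) as $k$ grows, which is the liberal-versus-conservative contrast the slides describe.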

Summary of simulation results (Figs. 1 & 2)
• In general, the nominal type I error rate ($\alpha$) is maintained if historical control effect ($\delta^0_{C-P}$) = active control effect ($\delta_{C-P}$)!
• Generally, when $\delta_N$ is a fixed known constant, $\Pr[\text{conclude E effect} \mid H_0] \approx \alpha$; $E[\hat\delta \mid H_0] = \delta$
• For both methods, efficiency of the active study is highly dependent on external factors outside the control of the active study.
• For PE, higher historical (C) variability vs. E means a more liberal test.
• For CI, higher historical (C) variability vs. E means a more conservative test.
• For both, smaller historical (C) variability vs. E means a more efficient active study.

Any plausible less conservative approach? Critical Path Initiative
• "FDA Considering 'Less Conservative' Approaches To Non-Inferiority Trials", Pink Sheet, June 27, 2005, p. 10.
• "One of the things we've been thinking about is whether there are somewhat less conservative approaches" to non-inferiority trials "than we've been inclined to use," Temple said at the 6/15/2005 FDA Cardio-Renal ACM.
• FDA has "thought about things like narrowing the confidence interval for certain measures, using less stringent insistence on convention in a wide variety of ways".
• One way to narrow confidence intervals would be "to incorporate prior data".
• BAYESIAN approach?

Numerical Ex 2: ESSENCE. Active-control, DB, R, PG study comparing Enoxaparin + Aspirin vs. Heparin + Aspirin in unstable angina (UA) patients. Secondary endpoint: composite of death or MI at Day 14.
• 2-sided p-value = 0.019 for the OR of E+A vs. H+A (for the primary composite endpoint of recurrent angina, MI, or death).
• For the secondary composite endpoint of MI or death:
  – $p_{H+A}$ = 96/1564 = 0.06 = Pr(MI or Death | C ≡ H+A)
  – $p_{E+A}$ = 79/1607 = 0.05 = Pr(MI or Death | E ≡ E+A)
  – OR (E+A) vs. (H+A) = 0.791; 95% CI (0.582, 1.074); p-value = 0.132.
• Q: Is E+A better than A alone, had there been an A arm in the active trial?
• To answer, use cross-trial estimation of the Aspirin response?
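The active-study odds ratio for the secondary endpoint can be checked from the two reported counts. The sketch below assumes a standard Wald interval on the log odds ratio; the slides do not say which method produced the reported interval, and the function name is illustrative.

```python
import math


def odds_ratio_wald(events_e, n_e, events_c, n_c, z=1.96):
    """Odds ratio of E vs. C with a Wald CI and two-sided p-value on the log scale."""
    a, b = events_e, n_e - events_e  # E arm: events, non-events
    c, d = events_c, n_c - events_c  # C arm: events, non-events
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2.0))
    return math.exp(log_or), lower, upper, p_value


if __name__ == "__main__":
    # Counts from the slide: E+A 79/1607, H+A 96/1564 (MI or death at Day 14)
    or_hat, lo, hi, p = odds_ratio_wald(79, 1607, 96, 1564)
    print(f"OR = {or_hat:.3f}, 95% CI ({lo:.3f}, {hi:.3f}), p = {p:.3f}")
```

With these counts the Wald calculation agrees with the reported OR and interval to three decimals, and gives a p-value close to the reported 0.132.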

Numerical Examples: 1. ESSENCE. Active-control, DB, R, PG study comparing Enoxaparin + Aspirin vs. Heparin + Aspirin in unstable angina (UA) patients. Secondary endpoint: composite of death or MI at Day 14.
▪ Used meta-analysis to estimate H+A ($C_0$) effects from historical data on reduction in death or MI in UA patients.
Incidence of MI or death and associated OR from 6 historical studies

Historical data
Study    Aspirin (A ≡ P)    H+A (≡ C0)
1        4/121              2/122
2        7/189              3/210
3        1/32               0/37
4        9/109              4/105
5        40/131             42/154
6        7/73               4/70
Total    68/655 (0.104)     55/698 (0.079)

Active study
H+A (C): 96/1564 (0.06)
E+A (E): 79/1607 (0.05)
OR (E+A) vs. (H+A) = 0.791; 95% CI: (0.58, 1.07); 2-sided p-value = 0.132

• OR (H+A) vs. A = 0.665; 95% CI: (0.443, 0.992); p = 0.045 (six 2×2 tables, StatXact)
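The historical H+A vs. A odds ratio can be approximated from the six study-level tables. The slide reports an exact stratified analysis in StatXact; the sketch below swaps in the simpler Mantel-Haenszel pooled odds ratio, so it is expected to land near, not exactly on, the reported 0.665. Study 1's H+A denominator is entered as 122, the value that makes the column total equal the reported 55/698. Function and variable names are illustrative.

```python
# Each row: (H+A events, H+A patients, Aspirin events, Aspirin patients)
# NOTE: study 1 H+A is entered as 2/122 so that the column sums to 55/698.
HISTORICAL = [
    (2, 122, 4, 121),
    (3, 210, 7, 189),
    (0, 37, 1, 32),
    (4, 105, 9, 109),
    (42, 154, 40, 131),
    (4, 70, 7, 73),
]


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across stratified 2x2 tables.

    This is a stand-in for the exact stratified method cited on the slide.
    """
    numerator = 0.0
    denominator = 0.0
    for events_t, n_t, events_r, n_r in tables:
        a, b = events_t, n_t - events_t  # treated: events, non-events
        c, d = events_r, n_r - events_r  # reference: events, non-events
        n = n_t + n_r
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


if __name__ == "__main__":
    print(f"MH OR (H+A vs. A) = {mantel_haenszel_or(HISTORICAL):.3f}")
```

A zero cell (study 3) poses no problem for the Mantel-Haenszel estimator, which is one reason it is a convenient check for sparse tables like these.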

Ex 2: Active-control, DB, R, PG study comparing E+A vs. H+A. Secondary endpoint: composite of death or MI at Day 14.
Multiplicative odds ratio (OR), from epidemiology
• To estimate the OR of E+A vs. A alone:
  OR (E+A) vs. A = [OR (E+A) vs. (H+A), active study] × [OR (H+A) vs. A, historical studies] = 0.791 × 0.665 = 0.52
• 95% CI, computed on the log(OR) scale: (0.31, 0.89); 2-sided p-value = 0.016 (ESSENCE Study FDA Statistical review, 1998).
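The multiplicative odds ratio can be sketched as an adjusted indirect comparison on the log scale. The code below assumes the two estimates are independent and recovers each standard error from its reported 95% CI under a normal approximation; the FDA review may have used a different (for example exact) method, so the interval and p-value here are only expected to be close to the reported ones. All function names are illustrative.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a reported CI (normal approximation)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def indirect_odds_ratio(or_active, ci_active, or_hist, ci_hist, z=1.96):
    """OR of E vs. putative placebo = OR(E vs. C) * OR(C vs. P), with CI and p-value.

    Assumes the active and historical estimates are independent.
    """
    log_or = math.log(or_active) + math.log(or_hist)
    se = math.sqrt(se_from_ci(*ci_active, z) ** 2 + se_from_ci(*ci_hist, z) ** 2)
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2.0))  # two-sided
    return math.exp(log_or), lower, upper, p_value


if __name__ == "__main__":
    # Values reported on the slides
    result = indirect_odds_ratio(0.791, (0.582, 1.074), 0.665, (0.443, 0.992))
    print("OR = %.3f, 95%% CI (%.3f, %.3f), p = %.3f" % result)
```

Because the two log odds ratios are summed, their variances add, which is why the indirect interval is wider than either input interval on the log scale.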

Bayesian modeling in the two-arm active design?
• Bayesian methods have been used in traditional two-arm active-control designs to show implicit superiority of the test drug over (putative) placebo (Simon, Biometrics 1999).
• Determine (predictive) posterior probabilities based on appropriate priors and conclude non-inferiority if the predictive posterior probability is greater than 0.975.
• If Pr[E effect relative to putative P | data] ≥ 0.95, then E >> P.
• If Pr[C effect +ve & E preserves ≥ $100(1-\lambda)$% of C effect] ≥ 0.975, then conclude E preserves ≥ $(1-\lambda)$% of C effect.
• Compute one-sided p-value = 1 − posterior probability.

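Predictive (posterior) models for calculation of posterior probabilities
Ex: One-sample predictive (posterior) categorical model:
• From prior clinical info, estimate N = 30 (1:1 to 2 treatment groups).
▪ Partition the sample space: $(y_{j1}, y_{j2}, y_{j3}) = (x_{j1}, x_{j2}, x_{j3}) + (z_{j1}, z_{j2}, z_{j3})$, i.e., observed + unobserved ($y$'s categorical & exchangeable)
• Assume $y_{jr} \mid p_j = p_{jr}$ and $p_{jr} \sim \mathrm{Dirichlet}(\alpha_{jr})$; $r = 1, \dots, M$
• Given sample info $x_j = (x_{j1}, x_{j2}, x_{j3})$ with $\sum_{1 \le r \le 3} x_{jr} = n_j$, then $y_{jr} \mid (p_j, x_j) \sim \mathrm{Dir}(\alpha_{jr} + x_{jr})$.
▪ $p_{jr} \approx \alpha_{jr}/\alpha_j$, $\alpha_j = \alpha_{j1} + \dots + \alpha_{jM}$ (prior to sample info)
▪ $p_{jr} \mid x \approx (\alpha_{jr} + x_{jr})/(\alpha_j + n_j)$, $\alpha_j + n_j = \alpha_{j1} + x_{j1} + \dots + \alpha_{jM} + x_{jM}$ (given sample info)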

Ex: One sample of the categorical model (continued)
• Assume that at an interim look, of 7 patients randomized and treated with treatment j, we observe $x_j = (2, 4, 1)$.
• If $\alpha_{jr} = 2$, then $p_{jr} \mid \alpha_j \approx 0.333 = \alpha_{jr}/\alpha_j = 2/(2+2+2)$, for all r, is our prior estimate before sample info, and $p_j \mid (\alpha_j, x) \approx (.308, .462, .231) = \{(\alpha_{jr} + x_{jr})/(\alpha_j + n_j)\}$ is our updated estimate given sample info $x_j = (2, 4, 1)$.
• Compare with the frequentist's estimate $p_j = (.286, .571, .143)$, $\alpha_{jr} = 0$.
• For $N_j = 20$, $z = (5, 5, 3)$, $\{\alpha_j\} = (1, 3, 2)$; $\{\alpha_{jr}/\alpha_j\} = (.167, .500, .333)$; $\{(\alpha_{jr} + x_{jr})/(\alpha_j + n_j)\} = (.133, .467, .267)$
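The prior-to-posterior update in this example is a one-line calculation. The sketch below assumes the lost glyph is the usual Dirichlet parameter (called `alpha` here) and reproduces the interim-look numbers for $x_j = (2, 4, 1)$; the function name is illustrative.

```python
def dirichlet_posterior_mean(alpha, counts):
    """Posterior mean of category probabilities under a Dirichlet(alpha) prior.

    Each component is (alpha_r + x_r) / (sum(alpha) + n). With alpha = 0 this
    reduces to the frequentist sample proportions x_r / n.
    """
    total = sum(alpha) + sum(counts)
    return [(a + x) / total for a, x in zip(alpha, counts)]


if __name__ == "__main__":
    x = (2, 4, 1)  # interim counts for treatment j, n_j = 7
    print("prior mean:    ", [a / 6 for a in (2, 2, 2)])
    print("posterior mean:", dirichlet_posterior_mean((2, 2, 2), x))
    print("frequentist:   ", dirichlet_posterior_mean((0, 0, 0), x))
```

The posterior mean is a weighted average of the prior mean and the sample proportions, with weights proportional to the prior total (6 here) and the sample size (7 here).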

Fig. 3. For N = 15, predictive $PP[z \mid x; \alpha]$ vs. multinomial $Mn[z; p]$ future success probabilities for prior info a = (2, 2, 2) & b = (2, 3, 1); sample info x = (2, 4, 1); future success $z = (z_{11}, z_{12}, z_{13})$, $z_{11} + z_{12} + z_{13} = 8$
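The comparison in Fig. 3 can be sketched by computing both probability mass functions over the 8 future patients. The code below assumes the predictive distribution is the Dirichlet-multinomial with parameters `alpha + x`, and that the multinomial comparator plugs in the sample proportions `x / n`; the figure does not state which `p` was used, so that choice is an assumption. Function names are illustrative.

```python
import math


def log_multinomial_coef(z):
    """log of m! / (z_1! * ... * z_M!) with m = sum(z)."""
    return math.lgamma(sum(z) + 1) - sum(math.lgamma(v + 1) for v in z)


def predictive_pmf(z, alpha, x):
    """Dirichlet-multinomial probability of future counts z given prior alpha and data x."""
    post = [a + xi for a, xi in zip(alpha, x)]  # posterior Dirichlet parameters
    total, m = sum(post), sum(z)
    log_p = (
        log_multinomial_coef(z)
        + math.lgamma(total)
        - math.lgamma(total + m)
        + sum(math.lgamma(a + zi) - math.lgamma(a) for a, zi in zip(post, z))
    )
    return math.exp(log_p)


def multinomial_pmf(z, p):
    """Multinomial probability of counts z for fixed category probabilities p."""
    log_p = log_multinomial_coef(z) + sum(
        zi * math.log(pi) for zi, pi in zip(z, p) if zi > 0
    )
    return math.exp(log_p)


def all_outcomes(m):
    """All three-category count vectors summing to m."""
    return [(i, j, m - i - j) for i in range(m + 1) for j in range(m + 1 - i)]


if __name__ == "__main__":
    x = (2, 4, 1)                # observed counts, n = 7
    p_hat = [v / 7 for v in x]   # plug-in proportions (assumed comparator)
    for z in [(2, 5, 1), (3, 4, 1), (0, 8, 0)]:
        print(z, predictive_pmf(z, (2, 2, 2), x), predictive_pmf(z, (2, 3, 1), x),
              multinomial_pmf(z, p_hat))
```

The predictive distribution spreads probability more widely than the plug-in multinomial because it carries the remaining uncertainty about the category probabilities.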

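Ex: Extension to non-categorical model
• $y_{jk}$ = outcome of the kth subject in treatment j (j = 1, 2 and k = 1, …, $N_j$), with $y_{jk}$ any value on the real line, R.
• $y_j = (y_{j1}, \dots, y_{jN_j})$ is a random sample from $F_j$.
• If $F_j \approx \alpha_j(\cdot)/\alpha_j(R)$, a Dirichlet process with shape measure $\alpha_j(\cdot)$ defined on R, $\alpha_j(R) = \alpha_j(-\infty, \infty)$,
• then $y_j$ is a Dirichlet process with shape measure $\alpha_j(\cdot)$.
• Given sample information $x_j = (x_{j1}, \dots, x_{jn_j})$, $y_j \mid x \sim \mathrm{Dir}(\alpha_j(\cdot) + n_j(\cdot))$, where $n_j(\cdot) = \sum_k \delta_{x_k}(\cdot)$ and $\delta_x$ is the unit mass at x.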

Strength of prior clinical evidence
• $\alpha(R) \ge 0$ is a measure of the strength of prior clinical evidence (e.g., Phase I/IIa, historical trial sample size).
• $\alpha(R) = 0$ reflects complete lack of clinical information.
• Thus $\alpha(R) \in (0, \infty)$.
• $\alpha(R)$ could be dominated by sample information.

Posterior estimates Let g() denote mean treatment difference: g() = [(y11+…+y1N1)/N1 - (y21+…+y2 N2)/N2]. • • n = E[g()|x] = n+ (1-)0,01= (mean of observed) + (1-)(mean of unobserved) –  *(stage 1 estimate)+(1-*)(stage 2 estimate) - frequentist‘s weighted statistic • n = Var[g()|x] = w1s2 + w2(R)(n-0)2) + w32 = wns2 for (R) = 0 . Where s2 & 2 = sample & prior variances; n & 0 = sample & prior means; w’s constants.

Establishing non-inferiority
• With the posterior model (categorical/non-categorical), one can determine (predictive) posterior probabilities and conclude non-inferiority if the predictive posterior probability is greater than 0.975.
• If Pr[E effect relative to putative P | data] ≥ 0.95, then E >> P.
• If Pr[C effect +ve & E preserves ≥ $100(1-\lambda)$% of C effect] ≥ 0.975, then conclude E preserves ≥ $(1-\lambda)$% of C effect.
• Compute one-sided p-value = 1 − posterior probability.

Other possible applications
Given a clinically acceptable treatment effect size:
• For fixed N ($= \sum_{1 \le j \le J} N_j$), a test for futility is possible (group sequential).
• If N is not fixed, sample size re-estimation is possible (adaptive).
• Dynamic treatment allocation (play the winner).

Concluding Remarks
• It is generally difficult to design and conduct efficient active-control trials.
• Active-control trial design and hypotheses formulation depend not only on disease, endpoint, and analysis method, but also too heavily on historical data.
• Equivalence or non-inferiority trials always raise a question: is at least $\lambda$% of the control effect size retained in the active trial? If not, an equivalence or non-inferiority conclusion is meaningless, as the experimental drug (E) could have no effect at all.
• Thus, if the control effect size is not "substantial enough" and/or "constant", a superiority and NOT a non-inferiority objective should be pursued; otherwise the stringent requirement of "100% retention of control effect" is inescapable!
• The C effect and trial conditions are not time-invariant, so the constancy assumption on which active-control trial success depends is unrealistic.

Concluding Remarks (continued)
• Simulation results show adequate type I error rate control for concluding E effect when the point estimate is used to select the inferiority margin ($\delta_N$) and there is ≥ 75% retention of the C effect. But the CI method is ultra-conservative for concluding E effect even with ≥ 75% retention of the C effect.
• Since the primary concern is assay sensitivity, regulators should provide lists of compounds (by class) with acceptable assay sensitivity, for which it is acceptable to focus on establishing non-inferiority of E to C rather than on demonstrating the E effect.
• For compounds with poor assay sensitivity, agencies should recommend a three-arm dose-ranging active design, or else clearly document the minimum C effect that must be reproduced in the active trial to ensure efficacy of E in the active study.
• Need to explore alternative approaches to establish the E effect!

Some References
• Tsong Y, Zhang J (2005). Testing superiority and non-inferiority hypotheses in active controlled clinical trials. Biometrical Journal, 47, 62-74.
• Hung HMJ, Wang S-J, Tsong Y, Lawrence J, O'Neill RT (2003). Some fundamental issues with non-inferiority testing in active controlled trials. Statistics in Medicine, 22, 213-225.
• ICH Efficacy Document No. E-9 (1997). Statistical Principles for Clinical Trials. http://www.fda.gov/cder/guidance.
• DerSimonian R, Laird N (1986). Meta-analysis in clinical trials. Controlled Clinical Trials, 177-188.
• Sankoh AJ, Huque MF. Impact of multiple endpoints on type I error rate and power of test statistic in non-superiority clinical trials. Far East Journal of Theoretical Statistics, 13 (1), 47-65.
• Sankoh AJ, Al-Osh M, Huque MF (1999). On the utility of the Dirichlet distribution for meta-analysis of clinical studies. Journal of Biopharmaceutical Statistics, 9, 289-306.
• Röhmel J (1998). Therapeutic equivalence investigations: statistical considerations. Statistics in Medicine, 17, 1703-1714.
• Holmgren EB (1999). Establishing equivalence by showing that a specified percentage of the effect of the active control over placebo is maintained. Journal of Biopharmaceutical Statistics, 9(4), 651-659.
