Bayesian posterior predictive probability - what do interim analyses mean for decision making?

Oscar Della Pasqua & Gijs Santen
Clinical Pharmacology Modelling & Simulation, GlaxoSmithKline, UK
Division of Pharmacology, Leiden University, The Netherlands

Time course of HAMD in Depression

Linear mixed-effects model
• HAMD response: $Y_{ij} = \mathrm{baseline}_i \cdot \mathrm{baseffect}_j + \mathrm{tmteffect}_{z,j} + \eta_{1i} + \eta_{2i} \cdot j + \varepsilon_{ij}$
• Fixed effects
  - Interaction baseline*time
  - Interaction treatment effect*time
• Two random effects
  - Multivariate distribution with mean 0 and unknown variance-covariance matrix
• Residual error
See Santen et al., Clin Pharmacol Ther, Sept 2009.
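
As a rough illustration of the model structure above, the sketch below simulates one patient's HAMD trajectory. It is not the authors' implementation. Assumptions not in the slides: i indexes patients, j is the visit index (1, 2, ...), z is the treatment arm, the two random effects are drawn as independent normals (the slide specifies a full, unknown variance-covariance matrix), and every numeric value is hypothetical.

```python
import random


def simulate_patient(baseline, base_effect, tmt_effect, sd_eta1, sd_eta2, sd_eps, rng):
    """Simulate HAMD scores for one patient at visits j = 1..J.

    base_effect[j-1] is the baseline*time fixed effect at visit j,
    tmt_effect[j-1] is the treatment*time fixed effect for this patient's arm.
    """
    eta1 = rng.gauss(0.0, sd_eta1)  # random intercept (assumed independent of eta2)
    eta2 = rng.gauss(0.0, sd_eta2)  # random slope on the visit index
    scores = []
    for j, (b, t) in enumerate(zip(base_effect, tmt_effect), start=1):
        eps = rng.gauss(0.0, sd_eps)  # residual error
        scores.append(baseline * b + t + eta1 + eta2 * j + eps)
    return scores


# Hypothetical values for six visits; none of these numbers come from the slides.
rng = random.Random(2009)
base_effect = [0.90, 0.80, 0.72, 0.66, 0.58, 0.52]
tmt_effect = [-0.3, -0.7, -1.0, -1.3, -1.7, -2.0]
print(simulate_patient(24.0, base_effect, tmt_effect, 3.0, 0.5, 2.5, rng))
```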

Model fitting

Diagnostics – Goodness-of-fit

Diagnostics – NPDE
• One random effect (same MLE as MMRM)
• New model with two random effects

Typical clinical trial design
• 2 active treatment arms, one placebo arm
• 150 patients per arm
• Trial duration of 6-8 weeks
• Observations every 1-2 weeks
• Endpoint: HAMD
• Statistical analysis: LOCF / MMRM

Interim analysis: current situation
• ~50% of trials fail. Early detection of failing trials is worthwhile!
• Important factors for an interim analysis include recruitment rate, treatment duration and timing of the IA.
• Even though the recruitment rate is not known at the start of the study, the criteria for and timing of the interim analysis are defined a priori.
• Inaccurate expectations about the informative value and the risk of making a wrong decision.

Major issues for an interim analysis (IA): recruitment rate, study duration and timing of IA

Timing & enrolment: impact of recruitment rate
[Figure: two panels for different recruitment rates. x-axis: time (days from start of enrolment), 0 to 180, with day 56 marked; y-axis: patients in study, 0 to 450, with 150 marked; each panel shows a ‘completers line’.]
The slower the recruitment, the higher the impact of an interim analysis.

Timing & enrolment: impact of treatment duration
[Figure: patients in study (0 to 450, with 150 marked) against time (days from start of enrolment, 0 to 180, with day 56 marked), showing a ‘completers line’ for a shorter treatment duration and a ‘completers line’ for a longer treatment.]
Shorter treatment duration → earlier interim analysis, more impact
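
The interplay between recruitment rate, treatment duration and the completers line can be sketched with simple arithmetic. The example below assumes a constant recruitment rate and no dropout, neither of which is stated in the slides; the two rates are hypothetical, while the 450 patients, 56 days and 25% completers are taken from the figures and the worked example in this deck.

```python
def enrolled(day, rate, n_total):
    """Patients enrolled by `day` under a constant recruitment rate (patients/day)."""
    return min(n_total, rate * day)


def completers(day, rate, n_total, duration):
    """Patients who have finished `duration` days of treatment by `day` (no dropout)."""
    return min(n_total, max(0.0, rate * (day - duration)))


def first_day_with_completers(fraction, rate, n_total, duration):
    """First whole day on which at least `fraction` of planned patients are completers.

    Requires rate > 0 and 0 <= fraction <= 1, otherwise the loop would not end.
    """
    target = fraction * n_total
    day = 0
    while completers(day, rate, n_total, duration) < target:
        day += 1
    return day


# Hypothetical rates: 450 planned patients, 56 days of treatment, IA at 25% completers.
for rate in (2.5, 5.0):
    day = first_day_with_completers(0.25, rate, 450, 56)
    still_to_enrol = 450 - enrolled(day, rate, 450)
    print(f"rate={rate}/day: IA on day {day}, {still_to_enrol:.0f} patients not yet enrolled")
```

In this toy setting the slower rate reaches the 25% mark later, but with far more patients still to be enrolled, which is one way to read the statement that slower recruitment gives an interim analysis more impact.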

Interim analysis
• Which parameter should be used to infer decisions?
• What about the timing of the interim analysis?
  - When is enough information available?
• How to best compare different decision criteria?

1. Simulate datasets from historical trials with:
   - a negative treatment arm (ΔHAMD = 0)
   - a positive treatment arm (ΔHAMD = 2)
2. Interim analysis is performed on the simulated datasets using the actual enrolment data (incoming data on enrolment).
3. Best performing decision criteria and timing are selected.
4. Decision is made whether the analysis is performed now or is postponed.

Posterior Predictive Power
1. Data obtained until time t is analysed using the longitudinal model (WinBUGS MCMC) → posterior distributions.
2. 1000 new trials are simulated with the projected number of patients from these posterior distributions.
3. Conditional power is computed.
Diagram label: BOOTSTRAP
Posterior Predictive Power: ..%
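
A heavily simplified sketch of the simulation step follows. It swaps the longitudinal model and WinBUGS for a single end-of-trial treatment difference tested with a z-test, so it only illustrates the idea of averaging power over posterior uncertainty. The function name, the posterior draws, the standard deviation and the significance threshold are all hypothetical.

```python
import math
import random


def predictive_power(posterior_draws, n_per_arm, sd, rng, n_trials=1000, z_crit=1.96):
    """Percentage of simulated future trials that reach significance.

    posterior_draws: samples of the true treatment difference versus placebo
    (positive = benefit), standing in for MCMC output from an interim fit.
    """
    se = sd * math.sqrt(2.0 / n_per_arm)  # standard error of a two-arm difference
    wins = 0
    for _ in range(n_trials):
        delta = rng.choice(posterior_draws)  # one plausible true effect
        observed = rng.gauss(delta, se)      # estimate from one simulated trial
        if observed / se > z_crit:           # significant in the direction of benefit
            wins += 1
    return 100.0 * wins / n_trials


# Hypothetical posterior centred on a 2-point HAMD difference.
rng = random.Random(42)
draws = [rng.gauss(2.0, 0.8) for _ in range(2000)]
print(f"Predictive power: {predictive_power(draws, 150, 7.0, rng):.1f}%")
```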

Interim analysis: Decisions
[Figure: density of posterior predictive power (%), marking the goalpost for stopping for futility, the goalpost for stopping for efficacy, and the surface required to trigger a decision.]
• Decision criteria to be determined:
  - Futility goalpost (e.g. 50%)
  - Efficacy goalpost (e.g. 90%)
  - Degree of evidence required to trigger a decision (e.g. 85%)
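
One possible reading of these three criteria is sketched below: a decision is triggered when the share of the posterior predictive power distribution lying beyond a goalpost reaches the required degree of evidence. This interpretation, the function name and the idea that the distribution is available as a list of samples are assumptions; the default values are the examples given on the slide.

```python
def interim_decision(ppp_samples, futility_goalpost=50.0, efficacy_goalpost=90.0, evidence=0.85):
    """Return a decision from samples of posterior predictive power (in %)."""
    n = len(ppp_samples)
    below = sum(p < futility_goalpost for p in ppp_samples) / n  # surface left of futility goalpost
    above = sum(p > efficacy_goalpost for p in ppp_samples) / n  # surface right of efficacy goalpost
    if above >= evidence:
        return "stop for efficacy"
    if below >= evidence:
        return "stop for futility"
    return "continue"


# Hypothetical samples for three treatment arms.
print(interim_decision([95.0] * 9 + [80.0]))
print(interim_decision([10.0] * 9 + [60.0]))
print(interim_decision([10.0, 95.0, 60.0, 70.0]))
```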

Choice of decision criteria
• Main goal is to maximise the difference between ‘power’ and ‘type I error’
• Type I error may never be higher than 5%; type II error should remain below 20%
• This is done separately for futility and efficacy testing
  → STOP efficacious treatment arms for efficacy, but not at the cost of inflating the false positive rate
  → STOP non-efficacious treatment arms for futility without inflating the false negative rate
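
The selection for efficacy testing could be organised as a small grid search, sketched below under stated assumptions: each simulated trial is summarised by a list of posterior predictive power samples, an arm is stopped when the share of samples above the goalpost reaches the degree of evidence, and the toy inputs are invented for illustration. The 5% cap and the objective (power minus type I error) come from the slide; the function names are hypothetical.

```python
def stop_rate(trials, goalpost, evidence):
    """Fraction of simulated trials in which the arm is stopped for efficacy."""
    stopped = 0
    for ppp_samples in trials:
        above = sum(p > goalpost for p in ppp_samples) / len(ppp_samples)
        if above >= evidence:
            stopped += 1
    return stopped / len(trials)


def best_efficacy_goalpost(null_trials, positive_trials, candidates, evidence=0.85, max_type1=0.05):
    """Pick the goalpost maximising power - type I error, subject to the type I cap.

    Returns (goalpost, score, power, type1), or None if no candidate meets the cap.
    """
    best = None
    for goalpost in candidates:
        type1 = stop_rate(null_trials, goalpost, evidence)      # stops when the arm is ineffective
        power = stop_rate(positive_trials, goalpost, evidence)  # stops when the arm is effective
        if type1 > max_type1:
            continue
        score = power - type1
        if best is None or score > best[1]:
            best = (goalpost, score, power, type1)
    return best


# Toy inputs: 40 simulated null trials and 20 simulated positive trials.
null_trials = [[10.0, 20.0, 30.0]] * 39 + [[70.0, 75.0, 80.0]]
positive_trials = [[65.0, 70.0, 80.0]] * 10 + [[40.0, 50.0, 55.0]] * 10
print(best_efficacy_goalpost(null_trials, positive_trials, [30.0, 60.0, 90.0]))
```

The same search would be run separately for the futility goalpost, with the type II error cap of 20% in place of the type I cap.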

Interim Analysis - An example
• 3 treatment arms
• 150 patients / arm
• Paroxetine CR 12.5 mg, 25 mg and placebo
• Study design includes clinical assessments at weeks 1, 2, 3, 4, 6 and 8
• An interim analysis is initially proposed with at least 25% completers, around day 70 from the start of enrolment.
• Assess impact of recruitment rate on timing
• Determine optimal decision criteria for the IA

Selection of timing & criteria
[Figure: (power – type I error) against decision boundary (%PPP); curves labelled 90% and 95%.]

Determining timing & criteria
[Figure: recruitment rate, shown as cumulative patient enrolment against day, together with (power – type I error).]
• Parameters:
  - Futility goalpost at 45%
  - Efficacy goalpost at 60%
  - Degree of evidence at 85% (both)
• Additional conditions:
  - Inefficacious treatment arm should be stopped for efficacy in <5% (Type I error)
  - Treatment arm with Δ = 2 points HAMD should be stopped for futility in <20% (Type II error)
Use of the proposed implementation for the interim analysis of data from the actual trial did result in the correct decision!

Conclusions
• Decisions about futility and efficacy during an IA are affected by enrolment rate.
• Historical clinical data can be used in a Bayesian framework to optimise an interim analysis.
• In contrast to adaptive design protocols, the proposed method optimises the criteria and the timing at which decisions should be made about futility and efficacy.
• The uncertainty of the parameter estimates obtained at the interim analysis is factored into the Bayesian framework.
• Work is in progress to show the application of the methodology in other therapeutic indications.

The success of R&D to address unmet medical needs does not depend only on finding new targets, it depends on better decision making.
