Learn how to optimize futility stopping rules in clinical trials for quicker decisions, reduced costs, and patient safety. Case study on an HIV drug trial. Includes decision analysis and practical insights.
Informing the selection of futility stopping thresholds: case study from a late-phase clinical trial
Hughes S, Cuffe RL, Lieftucht A, Nichols WG. Pharmaceutical Statistics 2009; 8: 25-37
Sara Hughes, GSK Head of Clinical Statistics
For PSI/DIA Journal Club, June 2012
sara.h.hughes@gsk.com
Research Example
• Constant drive for more efficient clinical trial designs
  • Quicker decisions
  • Reduced financial & human investment in ‘futile’ drugs / doses
  • Patient safety
• Adaptive designs receiving much press and research
  • Futility designs became viable in late 1970s
  • But, limited examples of application in clinical trial literature (at time of this paper)
• Futility case study
  • General futility definitions and case study background
  • Useful graphical tools created to demonstrate risks of futility design
  • Decision analysis developed to aid selection of futility stopping rules
Futility Dictionary of Terms
• Futility interim analysis: the option to stop a study if, at the interim stage, the possibility of ultimately getting a positive result is remote
  • i.e. “it’s futile to continue - the data looks so bad that no amount of further data will reverse that - let’s quit now”
• Stopping threshold: what result would make us quit?
  • Various statistical methods exist to quantify probability of future success (POS), but little guidance is available for selecting optimal values for stopping thresholds
  • High threshold: few bad trials continue, but some good trials are stopped
  • Low threshold: most good trials continue, but so do some failures
HIV Futility Case Study
• GSK has an EU license to sell HIV drug Telzir at dose 700mg twice-daily with Ritonavir 100mg twice-daily boosting
• Interested in investigating Telzir 1400mg once-daily with Ritonavir 100mg once-daily boosting
  • Once-daily dosing would offer increased convenience
  • Reduced Ritonavir dose may offer improved safety profile
• Study to assess this is large, lengthy and costly
  • Futility design reduces the risk of a failed study
  • Without high probability of success, can redirect resources to other research & stop prescribing ineffective dosing regimen
Study Design
• Primary endpoint: non-inferiority on efficacy (proportion with undetectable HIV viral load). Stop after Stage 1 if POS < X%
• Key powered secondary endpoint: superiority on safety (difference of ≥13 mg/dL in non-HDL cholesterol). Stop after Stage 1 if POS < Y%
[Design diagram: 1:1 randomisation to investigational dose or standard dose in both stages; Stage One (N=200), Stage Two (N=528); 24-week interim futility analysis; 48-week final analysis]
“POS” for Case Study
• A variety of statistical stopping methods can be used for calculating POS (probability of future success):
  • frequentist conditional power (calculated under H0, H1, or current trend)
  • semi-Bayesian predictive power
  • formal group sequential methods
• Case study POS: conditional power under current trend
  • “Based on the results so far - and assuming these results reflect the truth - what is the probability of successfully achieving the study objectives at the end of the study?”
• Choice of stopping thresholds is more important than choice of method. We had two challenges:
  • How to convey features & risks of futility design to non-statistical colleagues?
  • How to derive optimal stopping thresholds?
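A minimal sketch of conditional power under the current trend, written here for illustration and not taken from the paper. It assumes a one-sided test on a standardised Z statistic (for a non-inferiority endpoint, Z would already include the margin) and uses the standard Brownian-motion ("B-value") approximation. The function name, its parameters and the default alpha are assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def conditional_power_current_trend(z_interim, info_fraction, alpha=0.025):
    """Probability of a significant final result, assuming the effect
    estimated at the interim is the true effect (hypothetical helper).

    z_interim     -- interim Z statistic (larger values favour success)
    info_fraction -- proportion of the final information available, in (0, 1)
    alpha         -- one-sided significance level of the final test
    """
    if not 0.0 < info_fraction < 1.0:
        raise ValueError("info_fraction must lie strictly between 0 and 1")
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    # B-value: the interim statistic rescaled to behave like Brownian motion.
    b_value = z_interim * sqrt(info_fraction)
    # Current trend: drift estimated from the interim data alone.
    drift = b_value / info_fraction
    # Expected final statistic if that drift continues to the end.
    expected_final = b_value + drift * (1.0 - info_fraction)
    # The remaining increment has variance (1 - info_fraction).
    return 1.0 - _STD_NORMAL.cdf(
        (z_crit - expected_final) / sqrt(1.0 - info_fraction)
    )
```

With a futility rule of the form "stop if POS < X%", the trial would stop whenever this function returns a value below X/100. How the information fraction is measured in the case study is not stated on the slides, so any value passed in is an assumption.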
Impact Of “When” The Interim Occurs
[Figure; axes: chance of false stop (%), patients recruited]
Impact of Interim on Trial’s Power for Primary Efficacy Endpoint
Note: no impact on type I error for futility designs
Quantifying Risks of Design
Setting a stopping threshold of 70% POS will lead to a 27% chance of stopping at the interim if the drug works and a 10% chance of continuing if it doesn’t work
[Figure; axes: probability of false stop (%), probability of false go (%)]
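The trade-off between false stops and false gos can be sketched analytically for a simplified design. The code below is an illustration under stated assumptions, not the calculation behind the slide: it assumes a single one-sided Z test, a current-trend conditional power rule, "drug works" meaning the effect the trial was powered for, and "doesn't work" meaning zero drift. The function name and defaults are hypothetical, and it will not necessarily reproduce the 27% and 10% quoted above.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def interim_error_rates(threshold, info_fraction, alpha=0.025, power=0.90):
    """Return (false_stop, false_go) for the rule
    'stop if current-trend conditional power < threshold'.

    false_stop -- P(stop at interim | effect equals the powered alternative)
    false_go   -- P(continue past interim | no effect)
    All inputs are illustrative assumptions.
    """
    if not 0.0 < threshold < 1.0 or not 0.0 < info_fraction < 1.0:
        raise ValueError("threshold and info_fraction must lie in (0, 1)")
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha)
    # Interim Z value below which conditional power falls under the threshold
    # (obtained by inverting the current-trend conditional power formula).
    z_stop = sqrt(info_fraction) * (
        z_alpha - sqrt(1.0 - info_fraction) * _STD_NORMAL.inv_cdf(1.0 - threshold)
    )
    # Drift of the final Z statistic when the design has the stated power.
    drift_h1 = z_alpha + _STD_NORMAL.inv_cdf(power)
    # The interim Z is normal with mean drift * sqrt(info_fraction), variance 1.
    false_stop = _STD_NORMAL.cdf(z_stop - drift_h1 * sqrt(info_fraction))
    false_go = 1.0 - _STD_NORMAL.cdf(z_stop)
    return false_stop, false_go


# Example: tabulate the trade-off for a few candidate thresholds,
# assuming (hypothetically) that half the information is available.
for pos_threshold in (0.5, 0.6, 0.7, 0.8):
    fs, fg = interim_error_rates(pos_threshold, info_fraction=0.5)
    print(f"threshold {pos_threshold:.0%}: false stop {fs:.1%}, false go {fg:.1%}")
```

Raising the threshold increases the false-stop rate and lowers the false-go rate, which is the pattern the graph is meant to convey.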
Issues
• Clear graphs illustrated risks & benefits of varying stopping thresholds and timing of interim analysis
But:
• Not every ‘successful’ trial is equally good
  • e.g. some results are more or less likely to lead to license approval
• Wanted to quantitatively include in decision making the information we already had on this new regimen’s performance (PK and small pilot studies)
• Decision analysis combined all these factors in order to weigh up benefits and risks of each stopping threshold
Decision Analysis
Step 1: Categorise possible outcomes & elicit prior expectations
[Figure: two bar charts of prior probability (axis 0.0 to 0.5) over the outcome categories bad, mediocre, good, excellent. Efficacy panel: proportion of responders at Wk 24. Safety panel: improvement in non-HDL cholesterol (mg/dL)]
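Decision Analysis
Step 2: Calculate predicted distribution of trial outcomes for each choice of stopping threshold (shown for 80% POS)
[Figure: pie charts for efficacy and safety; slice values as extracted, assignment to slices not recoverable: 18%, 22%, 39%, 50%, 20%, 30%, 1%, 9%, 8%, 3%]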
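The bookkeeping behind Step 2 can be sketched as follows. This is a simplified illustration and not the authors' method: it assumes each outcome category has a known probability of passing the interim for a given threshold, whereas in practice those probabilities would come from simulation or calculation under the design. All names and numbers below are hypothetical.

```python
def predicted_outcomes(prior, p_continue):
    """Combine prior beliefs with interim pass rates (hypothetical helper).

    prior      -- dict: outcome category -> prior probability (sums to 1)
    p_continue -- dict: outcome category -> P(trial continues | category)
                  for one choice of stopping threshold
    Returns a dict with the probability of stopping at the interim and of
    continuing in each category; the values sum to 1.
    """
    if set(prior) != set(p_continue):
        raise ValueError("prior and p_continue must use the same categories")
    if abs(sum(prior.values()) - 1.0) > 1e-9:
        raise ValueError("prior probabilities must sum to 1")
    outcomes = {
        "stopped at interim": sum(
            prior[c] * (1.0 - p_continue[c]) for c in prior
        )
    }
    for category in prior:
        outcomes[f"continued, {category}"] = prior[category] * p_continue[category]
    return outcomes


# Hypothetical inputs, NOT the elicited priors from the case study.
example_prior = {"bad": 0.3, "mediocre": 0.3, "good": 0.3, "excellent": 0.1}
example_pass_rates = {"bad": 0.1, "mediocre": 0.4, "good": 0.8, "excellent": 0.95}

for label, probability in predicted_outcomes(example_prior, example_pass_rates).items():
    print(f"{label}: {probability:.1%}")
```

Repeating the calculation with the pass rates for each candidate threshold gives one predicted distribution per threshold, which is what the pie charts compare.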
Results of Decision Analysis
• Using pie-charts for primary efficacy endpoint:
  • 50% probability of study continuation given 80% POS stopping threshold
  • Relaxing POS threshold to 70% progresses an additional 5% of trials, 53% of which go on to good/excellent results
  • Relaxing POS threshold to 60% progresses a further 5% of trials, 48% of which go on to good/excellent results
  • Relaxing POS threshold to 50% progresses a further 5% of trials, 43% of which go on to good/excellent results
  • …
Final Stopping Thresholds Selected
• Efficacy endpoint: 70% POS
• Safety endpoint: 60% POS
• Based on our assumptions, we had 38% overall probability of continuing the study to Stage Two
• If study continued, estimated 62% probability of final good/excellent results for both endpoints
  • Compared to 33% probability of good/excellent results with no futility interim analysis
• If stopped correctly for futility, spared 528/2 = 264 subjects from a possibly inferior regimen and saved the company approx. £8 million in wasted R&D funds
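The figures on this slide can be combined with a little arithmetic. The sketch below uses only the numbers quoted above. The derived quantities are an interpretation and not results reported in the presentation: they assume the 33% and 62% refer to the same definition of good/excellent results on both endpoints. The function name is hypothetical.

```python
def summarise_selected_design(
    p_continue=0.38,             # overall probability of reaching Stage Two
    p_good_given_continue=0.62,  # P(good/excellent | study continued)
    p_good_no_interim=0.33,      # P(good/excellent) with no futility interim
    stage_two_subjects=528,      # Stage Two sample size, randomised 1:1
):
    """Derive joint probabilities from the slide's figures (illustrative)."""
    # Chance that a trial both continues and ends good/excellent.
    p_continue_and_good = p_continue * p_good_given_continue
    # Share of all good/excellent trials that survive the interim.
    share_good_retained = p_continue_and_good / p_good_no_interim
    # With 1:1 randomisation, half of Stage Two receives the new regimen.
    subjects_spared = stage_two_subjects // 2
    return {
        "p_stop": 1.0 - p_continue,
        "p_continue_and_good": p_continue_and_good,
        "share_good_retained": share_good_retained,
        "subjects_spared": subjects_spared,
    }


summary = summarise_selected_design()
for name, value in summary.items():
    print(name, round(value, 3))
```

Under that reading, roughly 0.38 × 0.62 ≈ 0.24 of all trials continue and end well, so about 71% of the good/excellent trials would be retained, in exchange for stopping 62% of trials at the interim.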
Case Study Conclusions
• Futility designs under-utilised but have great potential:
  • “playing the winner”, maximising use of limited resources
• Depending on phase of trial and nature of disease and drug being studied, stopping threshold level may vary considerably
• Selection of optimal stopping threshold challenging
  • Lack of practical guidance in statistical literature
  • Can motivate discussion via informative graphs, simulations and decision analysis, making this design far more appealing and acceptable to non-statistical colleagues
• Statistical team led the study design development & choice of threshold work