Statistical & Design Considerations for Non-inferiority trials

Presentation Transcript


  1. Statistical & Design Considerations for Non-inferiority trials. Andrew Nunn, MRC Clinical Trials Unit, London

  2. Outline • What is a non-inferiority trial? • How do they differ from superiority trials? • What do the regulators say? • How large do the trials need to be? • How should the trials be conducted and analysed? TB Forum December 2005

  3. What is a non-inferiority trial? • How is it different from an equivalence trial? • Does non-significant imply non-inferior? • Does non-inferior imply non-significant? • Are non-inferiority trials always larger than superiority trials? • Can a failed superiority trial be turned into a non-inferiority trial? • Why do we need these trials in TB?

  4. A little bit of history: 35 years ago • Study R, the first East African/BMRC trial of short-course chemotherapy, could be regarded as a non-inferiority trial. • 2STH/16TH worked well under strict trial conditions. • The main objective was to see if a six-month regimen was at least as good as the standard treatment; better would be a bonus. S = streptomycin, T = thiacetazone, H = isoniazid

  5. Study R – 30-month relapse-free rates [Figure: confidence intervals for the difference from the control 2STH/16TH regimen, shown for the 6SHR, 6SHZ, 6SHT and 6SH regimens. Axis marks: -35%, -5%, 0 (no difference), 5%. A possible δ is indicated.]

  6. A point to remember • “It is never correct to claim that treatments have no effect or that there is no difference in the effects of treatments. It is impossible to prove … that two treatments have the same effect. There will always be uncertainty surrounding estimates of treatment effects, and a small difference can never be excluded.” Alderson P, Chalmers I. BMJ 2003:326:1691-8.

  7. Why non-inferiority for new TB drugs? • Under a wide variety of trial conditions the gold standard 2EHRZ/4HR regimen is at least 95% effective. • Nomads in the Algerian Sahara • Recently published IUATLD study in African and Asian centres. • We are very unlikely to better it. • We would require a total of 2600 evaluable patients to demonstrate a reduction from 5% to 2.5% relapses.
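To see where a figure of this size comes from, here is a minimal sketch of the usual normal-approximation sample size for a superiority comparison of two proportions. The slide does not state the significance level, the power or the exact formula, so the two-sided 5% level, the 90% power and the function name are assumptions. Under them the total comes out at roughly 2400, the same order as the 2600 quoted; a continuity correction or a different power would move it.

```python
import math
from statistics import NormalDist


def superiority_n_per_arm(p_control, p_new, alpha=0.05, power=0.90):
    """Patients per arm to detect p_control vs p_new with a two-sided test.

    Normal approximation with unpooled variances. alpha and power are
    assumptions; they are not given on the slide.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_new * (1 - p_new)
    diff = p_control - p_new
    return math.ceil((z_alpha + z_beta) ** 2 * variance / diff ** 2)


# Relapse rate reduced from 5% to 2.5%, as on the slide.
n = superiority_n_per_arm(0.05, 0.025)
print(n, "per arm,", 2 * n, "in total")
```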

  8. Our goal • Our goal is to reduce treatment duration to a maximum of 4 months and preferably less. • How much are we prepared to pay, if anything, for such a reduction? • must the new regimen be as good as the standard? • would we be satisfied with a regimen that was almost as good? • if so, how good is almost?

  9. EMEA quote • “If no degree of possible inferiority of the test [new regimen] to the reference [control] is acceptable, then the development of products with equal efficacy to a comparator by means of non-inferiority trials would become impossible.” EMEA/CPMP/EWP/2158/99

  10. Does non-significant = non-inferior? • No! Definitely not. • Common sense tells us that a non-significant result from an under-powered study is, in the extreme case, of little value. • BUT non-inferior does not necessarily mean non-significant!

  11. How do we do it? • We need a null hypothesis. • The situation is the reverse of what is required in a superiority design. • For superiority • H0 is that there is no difference. • For non-inferiority • H0 is that there is a difference. • The alternative hypothesis is also reversed.
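The reversal can be written out formally. This is a sketch under one assumed orientation: Δ is the treatment difference oriented so that negative values mean the new regimen is worse, and δ > 0 is the pre-specified margin. Both symbols are notation introduced here for illustration.

```latex
% Assumption: \Delta is oriented so that negative values mean the new
% regimen is worse; \delta > 0 is the non-inferiority margin.
\begin{align*}
\text{Superiority:}     \quad & H_0 : \Delta = 0         & \text{vs.}\quad & H_1 : \Delta \neq 0 \\
\text{Non-inferiority:} \quad & H_0 : \Delta \le -\delta & \text{vs.}\quad & H_1 : \Delta > -\delta
\end{align*}
```

Rejecting the non-inferiority null therefore means concluding that the new regimen is not worse than the control by δ or more.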

  12. Equivalence & non-inferiority: What’s the difference?

  13. Determining equivalence • First step in establishing equivalence: define ‘limits of equivalence’ (± δ). • Having conducted the trial, calculate the 95% confidence interval for the difference between the control and the new treatment. • If the confidence interval is entirely within ± δ then equivalence is established.
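The three steps can be sketched in code. This is only an illustration: it assumes a simple Wald interval for a difference in proportions (one of several possible interval methods), and the function names and patient counts are hypothetical.

```python
import math
from statistics import NormalDist


def diff_ci(events_control, n_control, events_new, n_new, level=0.95):
    """Wald confidence interval for (control rate - new rate)."""
    p_c = events_control / n_control
    p_n = events_new / n_new
    se = math.sqrt(p_c * (1 - p_c) / n_control + p_n * (1 - p_n) / n_new)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p_c - p_n
    return diff - z * se, diff + z * se


def is_equivalent(ci, delta):
    """True when the whole interval lies strictly inside (-delta, +delta)."""
    lower, upper = ci
    return -delta < lower and upper < delta


# Hypothetical counts, invented purely for illustration.
ci = diff_ci(events_control=20, n_control=400, events_new=24, n_new=400)
print(ci, is_equivalent(ci, delta=0.05))
```

With the same data a narrower margin, say δ = 3%, would no longer be met, which is why δ has to be fixed before the data are seen.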

  14. Non-inferiority • Equivalence requires that the difference (control - new intervention) is both > -δ and < δ: the new treatment must be neither worse nor better than the control by a fixed amount. • In contrast to equivalence, with non-inferiority we are only interested in determining whether the new treatment is no worse by an amount δ.

  15. Non-inferiority The 95% CIs for the difference between the control and the intervention all lie above -δ, i.e. non-inferiority demonstrated. [Figure: confidence intervals on an axis marked -δ and 0 (no difference).]

  16. Non-inferiority The lower 95% confidence limits for the difference between the control and the intervention are all > -δ, i.e. non-inferiority demonstrated. The lower 95% confidence limit is < -δ: non-inferiority has not been demonstrated. [Figure: confidence intervals on an axis marked -δ and 0 (no difference).]

  17. Non-inferiority and superiority The 95% CIs for the difference between the control and the intervention all lie above -δ, i.e. non-inferiority demonstrated. In this case both non-inferiority and superiority have been demonstrated. [Figure: confidence intervals on an axis marked -δ and 0 (no difference).]

  18. Non-inferiority and inferiority The 95% CIs for the difference between the control and the intervention all lie above -δ, i.e. non-inferiority demonstrated. In this case both non-inferiority and superiority have been demonstrated. In this case both non-inferiority and inferiority have been demonstrated. [Figure: confidence intervals on an axis marked -δ and 0 (no difference).]
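The cases shown on these slides can be summarised as a small decision rule. The sketch below assumes the difference is oriented so that negative values mean the new treatment is worse; the function name and the example limits are hypothetical.

```python
def interpret(lower, upper, delta):
    """Interpret a confidence interval against -delta and 0.

    Assumed orientation: negative values mean the new treatment is worse.
    """
    if lower <= -delta:
        return "non-inferiority not demonstrated"
    if lower > 0:
        return "non-inferior and superior"
    if upper < 0:
        return "non-inferior, yet significantly worse than control"
    return "non-inferior"


# Hypothetical intervals with a margin of 10 percentage points.
for lower, upper in [(-0.04, 0.06), (-0.13, 0.02), (0.01, 0.09), (-0.08, -0.01)]:
    print((lower, upper), "->", interpret(lower, upper, delta=0.10))
```

The last example is the one that tends to surprise: the interval excludes zero, so the new treatment is significantly worse, yet the whole interval is above -δ.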

  19. Choosing δ • The value of δ must be chosen before the trial begins. • Its value will depend on clinical, statistical and possibly regulatory considerations.

  20. Example: 2NN Study • van Leth, Phanuphak et al (Lancet 2004), a study of first-line antiretroviral therapy in HIV • Main comparison between nevirapine twice daily and efavirenz (plus stavudine and lamivudine) in terms of ‘treatment failure’ (based on virology, disease progression, therapy change) • Primary objective was to establish the non-inferiority of nevirapine twice daily (δ=10%)

  21. Example: 2NN Study • Confidence intervals for failure rates (E-2NN) • All data (-12.8%, 0.9%) • Only those starting med. (-14.6%, -0.8%) • Concurrently randomised (-11.9%, 3.4%) • None of these intervals lies completely above the -δ value of -10%; one interval also excludes zero.
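The check described in the last bullet is mechanical and can be reproduced from the limits quoted on the slide. In this sketch only the numbers come from the slide; the variable names are illustrative.

```python
DELTA = 10.0  # non-inferiority margin, in percentage points

# Confidence limits as quoted on the slide (failure rates, E - 2NN).
intervals = {
    "All data": (-12.8, 0.9),
    "Only those starting med.": (-14.6, -0.8),
    "Concurrently randomised": (-11.9, 3.4),
}

results = {}
for label, (lower, upper) in intervals.items():
    above_margin = lower > -DELTA            # whole interval above -delta?
    excludes_zero = lower > 0 or upper < 0   # significant difference?
    results[label] = (above_margin, excludes_zero)
    print(f"{label}: above -delta = {above_margin}, excludes zero = {excludes_zero}")
```

Every lower limit falls below -10%, so non-inferiority is not shown in any of the three analyses, and the second interval lies wholly below zero.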

  22. Example: 2NN Study • BUT, the authors concluded: ‘Antiviral therapy with nevirapine or efavirenz showed similar efficacy, so triple-drug regimens with either … are valid for first-line treatment’ Lancet 2004, 363:1253-63

  23. Does it matter? • A non-inferiority trial can demonstrate significant benefit from the new treatment (cf. Study A). • But is it possible to have non-inferiority and a significantly worse outcome in the new treatment? • Yes! Provided δ is acceptable to clinicians. • If N is large enough, any difference can be shown to be significant!

  24. Adverse effects • Assessment of adverse effects is particularly important in equivalence trials. • It is not enough to prove non-inferiority in terms of efficacy. • A new treatment must be as safe as, or safer than, the old one.

  25. Choosing δ • On 27 July 2005 the European Medicines Agency (EMEA) issued a new European “Guideline on the choice of the non-inferiority margin”. • This guideline comes into effect in January 2006. EMEA/CPMP/EWP/2158/99

  26. Quote from EMEA document “The lower limit of the confidence interval [of the difference between the new regimen and the control] ... represents a lower bound and is usually interpreted as the degree of inferiority to the reference that can be excluded based on the data presented… EMEA/CPMP/EWP/2158/99

  27. Quote from EMEA document “Of course this is not an actual lower bound and the magnitude of inferiority could be greater. However it is generally considered that the chance of the true difference being worse than that suggested by this bound is acceptably small.” EMEA/CPMP/EWP/2158/99

  28. General EMEA recommendations • If possible, three study arms should be included (test, reference and placebo); this allows validation of the non-inferiority margin. • The margin should be such that there is assurance that the test arm has a clinically relevant effect. • The primary focus is the relative effect of the test arm and the reference arm. • The choice of the margin should be justified in the protocol. • The choice of the margin should be independent of power considerations.

  29. Design consideration • It is important to ensure that the design of equivalence trials, including definitions of a favourable response, is as similar as possible to that of earlier trials assessing the control regimen.

  30. Internal validity • In a superiority trial there is a strong incentive to ensure high quality of conduct. • In contrast, in a non-inferiority trial the conclusion of non-inferiority could be reached because of poor discriminatory power. • In a TB trial this could occur if follow-up rates were poor and/or there was failure in the lab to detect all relapses.

  31. But • If there are already many treatments being used interchangeably for the disease under consideration a possible approach might be to consider the information available from all of them. From this a delta may be constructed which summarises the information known about the relative efficacy of these products, and the new trial can be designed to provide a similar level of knowledge of the relative efficacy of the new product.

  32. Accepting a larger δ • “In the situation where the test product is anticipated to have a safety advantage over the reference it is likely that a larger delta could be justified as some loss of efficacy might be accepted in exchange for the safety benefits”

  33. Is there a case for a larger δ if treatment can be shortened? • “It may be possible to justify a wider non-inferiority margin for efficacy if the product has an advantage in some other aspect of its profile. This margin should not, however, be so wide that superiority to placebo is left in doubt”

  34. How large a δ would you accept? • If treatment could be shortened from 6 to 4 months would an increase in the failure/relapse rate from 5% to 10% be acceptable?

  35. How large a δ would you accept? • If treatment could be shortened from 6 to 4 months would an increase in the failure/relapse rate from 5% to 10% be acceptable? - provided that the failures and relapses could be satisfactorily retreated.

  36. FDA position • FDA (as described in FDA’s 1992 Points to Consider document) originally used a ‘step function’:
      Cure rate    δ
      ≥ 90%        10%
      80-89%       15%
      < 80%        20%
  • A more flexible approach has since been adopted.
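The step function is easy to express as a lookup. This sketch reads the first band as cure rates of 90% and above, which is an assumption where the symbol on the slide is unclear; the function name is hypothetical.

```python
def fda_1992_delta(cure_rate_pct):
    """Margin in percentage points for a given control cure rate (%).

    Bands follow the step function on the slide. Non-integer rates between
    89 and 90 are not covered there and are placed in the 15% band here.
    """
    if cure_rate_pct >= 90:
        return 10
    if cure_rate_pct >= 80:
        return 15
    return 20


for rate in (95, 85, 75):
    print(f"cure rate {rate}% -> delta {fda_1992_delta(rate)}%")
```

A rule of this kind allows a wider margin exactly when the control is less effective, which is one reason a more flexible approach is attractive.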

  37. Example: Pediatric Meningitis Trial, Investigational Drug vs. Active Control
                                     Sponsor’s proposal   FDA proposal
      Projected response rate        80%                  80%
      Delta                          15%                  10%
      Evaluable total sample size    224                  504
      Projected % evaluable          70%                  70%
      Total to be enrolled           320                  720
      Projected enrollment time      2-4 years            4-6 years
      FDA proposed study considered not to be feasible. Note: α = 5%, power = 80%
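The sample sizes in this example can be reproduced with the standard normal-approximation formula for a non-inferiority comparison of two proportions. The sketch assumes that both arms truly respond at the projected rate and that the 5% in the note is a two-sided significance level; that reading is an assumption, chosen because it matches the totals shown. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def ni_n_per_arm(p, delta, alpha=0.05, power=0.80):
    """Evaluable patients per arm, assuming both arms respond at rate p.

    Uses the z value for a two-sided alpha (an assumption, see lead-in).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil((z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / delta ** 2)


def total_to_enrol(evaluable_total, pct_evaluable):
    """Inflate the evaluable total for the projected percentage evaluable."""
    return math.ceil(evaluable_total * 100 / pct_evaluable)


for label, delta in [("Sponsor", 0.15), ("FDA", 0.10)]:
    evaluable = 2 * ni_n_per_arm(p=0.80, delta=delta)
    print(label, evaluable, total_to_enrol(evaluable, 70))
# prints "Sponsor 224 320" and "FDA 504 720"
```

Because the required size scales with 1/δ², tightening the margin from 15% to 10% multiplies the trial size by 2.25, which is what made the FDA proposal infeasible here.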

  38. What confidence level? • Traditionally we use 95% confidence in superiority trials (thanks to RA Fisher!) • Guidelines for pharmacokinetic equivalence have traditionally used 90% CI. • In regulatory situations the choice is based on level of risk regulators are prepared to accept. • Could be appropriate to use 90%, 95% or even 99%. • Need for flexibility.
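The choice of confidence level can change the conclusion from the same data. The sketch below uses a hypothetical estimate, standard error and margin, invented purely for illustration, and a normal-approximation interval.

```python
from statistics import NormalDist


def z_multiplier(level):
    """Two-sided normal multiplier for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


# Hypothetical values: estimated difference, its standard error, margin.
estimate, se, delta = -0.02, 0.03, 0.075

conclusions = {}
for level in (0.90, 0.95, 0.99):
    lower = estimate - z_multiplier(level) * se
    conclusions[level] = lower > -delta
    print(f"{level:.0%} CI lower limit {lower:.3f}: non-inferior = {conclusions[level]}")
```

With these numbers non-inferiority is shown at 90% confidence but not at 95% or 99%, so the level needs to be agreed in advance along with δ.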

  39. Calculating power - an example • Given an expected range of, say, 3-6% relapse rates in the control 2EHRZ/4HR regimen, • what study size would we require for a range of δ?

  40. 5% relapse, δ = 10%, 100 per arm

  41. 5% relapse, δ = 5%, 400 per arm; 5% relapse, δ = 10%, 100 per arm
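The question of study size across a range of relapse rates and margins can be explored with a short calculation. This sketch uses the normal-approximation formula with a two-sided 5% significance level and 90% power; those settings are assumptions, chosen because they give the 100 and 400 per arm quoted for a 5% relapse rate. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def ni_n_per_arm(p_control, delta, alpha=0.05, power=0.90):
    """Patients per arm, assuming both regimens share the same relapse rate.

    alpha (two-sided) and power are assumptions, not stated on the slides.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = 2 * p_control * (1 - p_control)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


deltas = (0.05, 0.10)
print("relapse  " + "  ".join(f"delta={d:.0%}" for d in deltas))
for relapse in (0.03, 0.04, 0.05, 0.06):
    sizes = [ni_n_per_arm(relapse, d) for d in deltas]
    print(f"{relapse:.0%}      " + "  ".join(f"{n:>9}" for n in sizes))
```

Halving δ quadruples the number needed per arm, and higher control relapse rates increase it further.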

  42. But ... • These power calculations do not allow for additional numbers required for a Per Protocol analysis, or patients excluded because they do not have TB, or because they have MDR disease. • Neither do they allow for losses to follow-up.

  43. How should we analyse non-inferiority trials? • Superiority trials are analysed by intention to treat (ITT) because it is the most conservative approach and the least likely to be biased. • ITT analysis of non-inferiority trials is not conservative: there is a bias towards no difference. • Per protocol (PP) analysis is biased since not all randomised patients are included. • It is recommended that non-inferiority trials should be analysed by both ITT and PP.

  44. Defining ITT and PP • Definitions vary. • For ITT some definitions exclude patients who either do not have a confirmed diagnosis, or who never received treatment. • PP includes all patients receiving a full course of treatment with no major protocol violations. • What definitions are appropriate for TB trials? • CPMP: ‘similar conclusions from both the ITT and PP are required in a non-inferiority trial.’ • ‘Sample size computations should ensure sufficient numbers in the PP population’. CPMP: Committee for Proprietary Medicinal Products (2000)

  45. CAVE! • Drop-outs from the two regimens need to be carefully evaluated. • Suppose patients not responding dropped out early from one treatment arm, or • possibly there was a differential withdrawal rate because of adverse events. • This would suggest there may be important differences between the treatments.

  46. Interim analyses • Do we need them? • Probably not if it is to consider stopping early for strong evidence of non-inferiority. • Such evidence would support a case for the possible superiority of the new treatment to the control - a strong incentive to keep on.

  47. Conclusions 1 • A major concern among regulators in many NI trials is that the efficacy of the control is not well established. • This is NOT the case with the control regimen 2EHRZ/4HR. One advantage of no new drugs for 40 years!! • In the event of establishing a 4-month regimen to be non-inferior it would be unwise to use that regimen as the control in the next NI trial: biocreep. Biocreep: a slightly inferior treatment becomes the control for the next generation of NI trials.

  48. Conclusions 2 • NI trials must be conducted with rigour. • The value of δ needs to be determined before the start of the trial and should take into account both clinical and statistical considerations. • Both the value of δ and other aspects of design need to be discussed with regulators. • Non-inferiority needs to be demonstrated not only for efficacy but also for safety.

  49. Regulatory Guidance • ICH E9 ‘Note for Guidance on Statistical Principles for Clinical Trials’, September 1998 • ICH E10 ‘Note for Guidance on Choice of Control Group’, July 2000 • CPMP ‘Note for Guidance on the Investigation of Bioavailability and Bioequivalence’, July 2001 • CPMP ‘Points to Consider on Switching between Superiority and Non-Inferiority’, July 2000 • CHMP ‘Guideline on the Choice of the Non-Inferiority Margin’, July 2005
