Equivalence, Similarity, and Non-inferiority Clinical Trials in Neurotherapeutics

ASENT 10th Annual Meeting, March 2008 • Marc K. Walton, M.D., Ph.D. • Senior Medical Policy Advisor, Office of Policy, Office of the Commissioner, FDA • The views expressed are those of the author, and do not represent an official FDA position

Equivalence / Similarity / Noninferiority • Distinctions of study circumstances and study goal • Need to be clearly identified • Two treatments • Both known efficacious • Want to know comparative efficacy • One treatment with unknown efficacy • Second treatment known efficacious • Want to prove first is also efficacious

Comparative Efficacy • Can the two treatments be shown Equivalent? • Are the two agents essentially the same? • Implies same amount of efficacy for each treatment to within some level of disinterest • Neither drug is better than the other • Two sided interest • Not a regulatory requirement for marketing • Not common regulatory interest (i.e., sponsor interest, possibly with desire for regulatory affirmation) • “The Same”? • With respect to characteristics planned for rigorous evaluation

Proof of Efficacy for an Unproven Treatment • Is the new treatment efficacious? • In the same manner as the established treatment • Often a goal for regulatory decision • Based on evidence that new drug has some efficacy • No explicit requirement for same efficacy as other drugs, or any predefined fraction of efficacy of other drugs for same disorder • One sided comparison of interest • Precise comparative efficacy is not a goal • Non-inferiority: A misnomer • Goal is to show “sufficient efficacy” • Accomplished by showing “not unacceptably inferior” to control • Does not imply truly not inferior

Similarity • Perhaps not equivalent, but close enough • ‘Close enough’? • In the eye of the beholder • May imply a two sided interest as well • Individual judge dependent • Difficult to define/describe • Principles of Equivalence / Noninferiority studies apply • More laxity in interpretation granted on an individual person basis

Equivalency & Noninferiority: Active Control Studies • Why / When do Active Control studies? • Comparison of effects of two agents • May not need to exclude Placebo • Issues of assay sensitivity (later) and validity of interpretation much easier if placebo included • Assessment of effects of one agent when placebo control not permissible • Earlier lecture touched on circumstances • Ethics of withholding a known effective treatment with life-saving or irreversible life-altering effect • Can be dose-ranging study with single agent • Ethical need to avoid non-effective doses • May not be a practicable approach • Remainder of talk assumes active control study of two drugs

Active Control Studies • Study aspects enabling data to be validly interpreted • Validity of Design • Integrity of Conduct • Adherence to protocol • Same overall issues with standard placebo-control studies • Additional complexities in Active Control studies

Increased Complexities • Assay sensitivity of the design • Interpretation of the quantitative result • Determination of margin of acceptable difference • Margin • Built on combination of • Historical knowledge of comparator’s efficacy • Clinical judgment of what is an ‘acceptable’ difference

Assay Sensitivity • Can study detect a difference if one exists? • Unrecognized failure of assay sensitivity leads to: • Type II error for superiority study • Type I error for non-inferiority study • Factors that reduce sensitivity work against the study organizer’s goal in a superiority study, but in favor of it in a non-inferiority study • Assay sensitivity promotes ability to distinguish between • Evidence of Absence • Absence of Evidence • Laboratory assays often include positive and negative controls; this can rarely be done in clinical trials

Interpretation of Study Results • Results analyzed as comparison of two groups • Need numeric criterion to form interpretation • Placebo control & other superiority studies have same requirement, but easy to define • Show between-group difference > 0 • New drug superior to placebo / other control • Superiority of new treatment over old treatment difficult to achieve with active agents • Thus non-inferiority approach attractive • Need to have quantitative comparison criterion to interpret study result, allowing for potentially the same efficacy

Margin of Acceptability • How much less efficacious can the new drug be than old drug, and still deem it acceptable to use (or equivalent)? • Two components to consider • First – what is the efficacy the old drug provides? • Statistical analysis of existing information • Cannot allow new drug to be worse than old drug by that amount, or it provides no efficacy • Second – how much of that quantitative amount of efficacy is it permissible to give up? • Clinical judgment • Usually easier to assess once the first component is known • Express as an absolute amount or relative amount

Margin of Acceptability • Two components – M1, M2 • Separate basis, separate purpose • Margin of acceptable inferiority cannot be larger than: • Statistical component – M1 • Clinical component – M2 • Overall margin (M) • If M2 << M1: M = M2 • If M2 > M1: Either no need for AC study (use placebo control study) or non-inferiority not feasible • If M2 is only modestly smaller than M1: leads to complexities and anxieties
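The rule that the overall margin cannot exceed either component can be sketched in a few lines of Python. This is a minimal illustration, not a regulatory procedure: the function name `overall_margin`, the returned note strings, and the numbers in the usage line are all hypothetical, and both components are assumed to be positive amounts on the same outcome scale.

```python
def overall_margin(m1: float, m2: float) -> tuple[float, str]:
    """Combine the statistical (M1) and clinical (M2) margin components.

    Both are positive amounts on the outcome scale. The overall margin M
    may not be larger than either one, so it is the smaller of the two.
    """
    if m1 <= 0 or m2 <= 0:
        raise ValueError("M1 and M2 must both be positive")
    if m2 > m1:
        # Per the slide: use a placebo-control study instead, or
        # non-inferiority is not feasible. M is capped at M1 regardless.
        return m1, "M2 exceeds M1: reconsider the design"
    return m2, "M2 is the binding component"


# Hypothetical values: control effect conservatively 6 points, 2 points acceptable to lose.
print(overall_margin(m1=6.0, m2=2.0))
```

The function does not try to resolve the case where M2 is only modestly smaller than M1; that remains a judgment call.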

Active Control Agent Existing Knowledge • Solid knowledge of efficacy of active control agent is essential • From prior studies – historical placebo control studies • Preferably from multiple different prior studies • Historical studies each conducted in a specific manner • Regimen of use, population of use • Concomitant care, alternative therapies • Existing knowledge applicable in circumstances of prior experience • May not be reliable in other circumstances • Quality of historical studies needs to be considered • Adherence to protocol • Quality of data collection for outcome now of interest (may not have been primary outcome in study design)

M1 – Statistical Margin Component • What is the treatment effect of the comparator agent? • Typical? • Reasonably likely present? • Highly likely present? • There will be no ability to actually confirm in new AC trial • Generally derived from some form of meta-analysis • With allowance for uncertainty in historical quantitative estimates (i.e., variance of each effect estimate; variance of meta-estimate) • The more precisely treatment effect was estimated in historical trials and the more trials there were, the more precisely the M1 value can be determined
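One common way to turn historical placebo-controlled results into a conservative M1 is to pool them by inverse-variance weighting and take the lower confidence bound of the pooled effect. The sketch below assumes a fixed-effect model, which the slides do not specify and which ignores between-study variability. The function name `pooled_effect` and the input numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pooled_effect(estimates, std_errors, level=0.95):
    """Inverse-variance (fixed-effect) pooling of historical effect estimates.

    Each estimate is control minus placebo, with larger values meaning more
    benefit. Returns (pooled estimate, its standard error, lower bound).
    """
    weights = [1.0 / se**2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = sqrt(1.0 / total)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    return pooled, pooled_se, pooled - z * pooled_se


# Hypothetical historical trials: two studies, effects of 10 and 12 points, SE 2 each.
estimate, se, m1_candidate = pooled_effect([10.0, 12.0], [2.0, 2.0])
print(estimate, se, m1_candidate)
```

Adding trials or tightening their standard errors shrinks the pooled standard error and moves the lower bound closer to the point estimate. That is the sense in which more and better historical data allow a more precise M1.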

M1 – Estimate from Historical Evidence • Objectively done meta-analysis is important • What studies to include • What portions to include (e.g., patient subsets) • Selection to improve strength of meta-analysis and applicability to planned new study use of drug • How relevant is historical M1 estimate to a new study: • Relates to issues outside of purely mathematical variance – Not in meta-analysis • Population unchanged? • Planned eligibility criteria • Unplanned shift in available population related to changes in medical practice (e.g., development of alternative treatments) in intervening time • Active Control’s treatment regimen unchanged? • Concomitant care impact changed?

Uncertainties Affecting Interpretation of New Study • Imprecision of control agent’s effect in each of the historical studies • Partially estimated by statistical variance of each study • Variability of treatment effect size between historical studies • Partially estimated by meta-analysis methods • Uncertainty of quantitative extrapolation of historical treatment effect to treatment effect of control agent in new study • The Constancy Assumption – relevance of estimate • Not statistically calculable • Can create an (arbitrary) adjustment inserted into mathematical procedures • Uncertainty of comparison between control and new agent in the new study • Estimated with statistical variance in new study • As always, statistical approach adequate only if non-statistical sources of bias are ignorable

Circumstances for Confidence in M1 Estimate • Good quality of prior placebo control studies – adequate and well-controlled (A&WC) • Multiple prior placebo controlled trials with comparator • Good grounds for combining data in a meta-analysis (all randomized, combinable doses/regimen; outcome measure defined, assessed the same way) • Consistent results across all the prior studies • Prior studies done over extended period of time, but completed not long ago • Allows for some changes in populations, concomitant care, etc., showing this is not critical • Prior studies done with some variations in design • Implies drug treatment effect not highly sensitive to design • Better with large treatment-associated efficacy effect

Historical Knowledge for Non-inferiority Purposes • May be source of difficulty • Active control design being used because placebo control cannot be used • Due to nature of benefit of established drug • May be few (one?) placebo controlled studies with a life-saving drug before it becomes an established, necessary standard of care • Limited ability to form precise estimate of treatment effect

M2 – Clinical Component • How much efficacy is clinically acceptable to give up in the new agent and still allow new treatment to be medically acceptable to use in place of an already proven active agent • Clinical judgment, given importance of endpoint, nature of disorder, amount of efficacy of the active control agent • Absolute (% of patients, points on scale) or relative to control’s effect (% of control’s efficacy) • M2 does not serve to make a hazy M1-statistical analysis more reassuring by making it more stringent; M2 is from solely clinical meaning point of view • ½ often used • No automatic reason why this is the correct M2 in different settings • Each case should be considered independently

Non-inferiority Analysis: Conceptual Methods • Sequential CI method • Dual (Double) CI method • Fixed (Pre-determined) Margin method • Synthesis method • Putative Placebo method • Strengths and Weaknesses to each • How confident can ‘we’ be in effect estimates from historical data? • How many AC studies will we have to consider? • Given nature of benefit, how much risk of being wrong in favor of new drug is acceptable? • How much risk of being wrong by discrediting new drug is acceptable? • Statistical purity vs. discretized logic steps

Conceptual Methods • Sequential CI method • M1 estimate of active comparator’s effect from meta-analysis confidence interval • Apply M2 limit to form overall margin (M) • New agent’s comparative effect estimated with confidence interval from new study • Show C.I. does not exceed overall margin M • Synthesis method • Combines historical placebo-comparator data and new study comparator-test agent data into single calculation. • Result is numeric value indicating putative placebo-controlled efficacy estimate of test agent. • Chief difference is conceptual approach of how to address inter-study variability and strength of constancy assumption
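The two approaches can be contrasted in a short sketch. It assumes normally distributed estimates, an outcome where larger is better, a two-sided 95% level, and M2 expressed as a fraction of the control effect that may be given up. The function names and all numbers are hypothetical, and real analyses typically add adjustments, for example for doubts about constancy, that are omitted here.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value


def fixed_margin_ni(d_cp, se_cp, d_tc, se_tc, fraction_given_up=0.5):
    """Sequential CI idea: the margin is fixed from history, then the new study is tested.

    d_cp: historical control-minus-placebo effect; d_tc: new-study test-minus-control.
    """
    m1 = d_cp - Z * se_cp            # conservative control effect (lower bound)
    margin = fraction_given_up * m1  # overall margin M after applying M2
    return (d_tc - Z * se_tc) > -margin


def synthesis_ni(d_cp, se_cp, d_tc, se_tc, fraction_given_up=0.5):
    """Synthesis idea: one test statistic combining historical and new data."""
    estimate = d_tc + fraction_given_up * d_cp
    se = sqrt(se_tc**2 + (fraction_given_up * se_cp) ** 2)
    return estimate / se > Z


# Hypothetical inputs: control beat placebo by 10 (SE 2); test trails control by 1 (SE 1.5).
args = dict(d_cp=10.0, se_cp=2.0, d_tc=-1.0, se_tc=1.5)
print(fixed_margin_ni(**args), synthesis_ni(**args))  # → False True
```

With these inputs the two methods disagree. The fixed-margin calculation discounts the historical estimate to its lower bound before the new data are examined, whereas the synthesis calculation pools both sources of variance in one step. Setting `fraction_given_up=1.0` in `synthesis_ni` tests the putative placebo comparison (test minus placebo greater than zero).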

Quality of Study Data & Analysis • Constancy assumption needs good study quality • Effects of study flaws • Anti-conservative; opposite of superiority study • Flaws • Adherence to study protocol • Cross-overs • Completeness of data • Dropouts, missing data • Analysis population • ITT keeps errors in but is statistically pure • Per Protocol keeps errors out, but impure • Probably best to do both and show that all ways of looking at dataset are consistent

Interpretation of Results • Examples of CIs and assessment • Apparent paradox of non-inferiority success with an inferior drug • [Figure: example confidence intervals, cases a through f]
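How a confidence interval is read against zero and against the margin can be sketched as a small classifier. The slide's own cases a through f are not recoverable here, so the categories and numbers below are illustrative assumptions. The interval is for new minus control, with larger values favoring the new drug, and `margin` is the positive overall margin M.

```python
def classify(ci_low: float, ci_high: float, margin: float) -> str:
    """Classify a confidence interval for (new minus control) against 0 and -M."""
    if margin <= 0 or ci_low > ci_high:
        raise ValueError("need margin > 0 and ci_low <= ci_high")
    if ci_low > 0:
        return "superior"
    if ci_low > -margin and ci_high < 0:
        # The apparent paradox: the whole interval is below zero, yet within the margin.
        return "non-inferior by margin, yet statistically inferior"
    if ci_low > -margin:
        return "non-inferior"
    if ci_high < -margin:
        return "inferior beyond margin"
    return "non-inferiority not shown"


# Hypothetical intervals, all judged against a margin of 2 points.
for low, high in [(0.5, 3.0), (-1.5, -0.2), (-1.0, 1.0), (-3.0, 1.0), (-5.0, -3.0)]:
    print((low, high), classify(low, high, margin=2.0))
```

The second category is the paradox named on the slide: the new drug is demonstrably worse than the control, but not by more than the amount deemed acceptable.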

Closing Comments • Size of non-inferiority studies • Often large to achieve sufficient precision in estimate of New-Active effect difference so that CI of comparison falls within the limited range defined by margin • When M is small, study is large; when M is large, study can be moderate in size • Biggest factor can be efficacy of active comparator • Equivalence studies • If this implies greater closeness of effect than “not-unacceptably-inferior”, then M is smaller based on M2 component • Consequently sample size of study will increase
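The link between margin and study size can be made concrete with the standard normal-approximation sample size formula for comparing two means, n per group = 2σ²(z₁₋α + z₁₋β)² / (M + δ)². The sketch below assumes a continuous outcome with a common standard deviation, a one-sided α of 0.025, and 90% power. The function name and numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sigma, margin, true_diff=0.0, alpha=0.025, power=0.90):
    """Approximate per-group sample size for a non-inferiority comparison of means.

    sigma: common standard deviation of the outcome
    margin: overall non-inferiority margin M (positive)
    true_diff: assumed true (new minus control) difference, 0 if truly the same
    """
    distance = margin + true_diff  # distance from the assumed truth to -M
    if distance <= 0:
        raise ValueError("assumed true difference lies at or beyond the margin")
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * sigma**2 * (z_alpha + z_beta) ** 2 / distance**2)


# Hypothetical outcome SD of 10 points; halving the margin roughly quadruples the study.
print(n_per_group(sigma=10, margin=5.0))   # → 85
print(n_per_group(sigma=10, margin=2.5))   # → 337
```

Because M enters as a square in the denominator, a tighter margin, as in an equivalence claim, inflates the sample size quickly. The same happens when the comparator's own effect, and hence M1, is small.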

Closing Comments • Active Control-Noninferiority studies of successive new agents (risk of ‘bio-creep’) • M based on quantitative data applicable to the comparator being used. Risk of using New1, based on single AC study vs. Std, as the AC agent for New2 assessment, and so on • New1 might be inferior to original comparator by a small amount (in itself negligible) • Negligible inferiority added twice (Std-New1; New1-New2) may not be negligible; frequently only 1-2 studies with New1, so that meta-analysis of New1-Std-Pbo gives larger CI, and less assured efficacy • If ignored, New3 may have little to no efficacy yet pass the erroneously applied non-inferiority margin
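The arithmetic of bio-creep can be illustrated with made-up numbers. The sketch assumes the standard agent truly beats placebo by 10 points, and that each successive agent is truly worse than its immediate comparator by 3.5 points, a loss small enough to pass a reused 4-point margin. The function name and all values are hypothetical.

```python
def effects_over_generations(std_effect, loss_per_step, steps):
    """True effect versus placebo of New1, New2, ... when each agent loses
    `loss_per_step` relative to the agent it was compared against."""
    effects = []
    current = std_effect
    for _ in range(steps):
        current -= loss_per_step
        effects.append(current)
    return effects


# Std beats placebo by 10; each new agent is 3.5 worse than its comparator.
print(effects_over_generations(std_effect=10.0, loss_per_step=3.5, steps=3))
# → [6.5, 3.0, -0.5]
```

Each step is within the 4-point margin, but by the third generation the true effect versus placebo is no longer positive. This is why the margin needs to rest on data for the comparator actually used.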

Closing Comments • Planning and conducting a non-inferiority study does not prohibit achieving a conclusion of superiority if such is the case. Analytic plan of study can allow for this when done in the proper manner. • Safety issues always need considering as well • Often not as enticing as working on efficacy aspect • Reading and Reporting of Active Control Studies • Planning and analysis aspects unique to AC studies • Great difficulty in assessing study result if these aspects are not well described in publication
