An Update on Statistical Issues Associated with the International Harmonization of Technical Standards for Clinical Trials (ICH) Robert O’Neill, Ph.D. Director, Office of Biostatistics, CDER, FDA 22nd Spring Symposium, New Jersey Chapter of ASA, Wed. June 6, 2001
Outline of talk • International Harmonization of technical standards: efficacy, safety, quality • statistics - where does it fit in • Resources - who are the people and what are the processes • A focus on a few ICH Guidances of interest • A few issues of particular statistical concern • The future - where do we go from here
Harmonization of technical standards • ICH (Europe, Japan, United States) • Began in 1989; ICH 1 in Brussels 1991 • ICH continues today • Outside of ICH • APEC - Bridging study initiative, Taipei meeting • Canada, observers, WHO
Statistical Resources in the ICH regions • United States • CDER, CBER • Europe • U.K., Germany, Sweden • CPMP • Japan • MHW; advisors, university • China, Taiwan, Canada, Korea
Web addresses for information and guidances www.fda.gov/cder/guidance/index.htm www.ifpma.org/ich1 www.emea.eu.int/
ICH Guidances with statistical content • E1; Extent of population exposure to assess clinical safety • E3; structure and content of clinical study reports (CONSORT statement) • E4; Dose-response information to support drug registration • E5; Ethnic factors in the acceptability of foreign clinical data • E9; Statistical principles for clinical trials • E10; Choice of control group • E11; Clinical investigation of medicinal products in the pediatric population
ICH Guidances with statistical content • Safety • carcinogenicity • Quality • Stability (expiration dating) : Q1A, Q1E
New initiatives from the European Regulators (CPMP) - Points to Consider Documents • On Validity and Interpretation of Meta-Analyses, and One Pivotal Study (Jan. 2001) • On Missing Data (April 2001) • On Choice of delta • On switching between superiority and non-inferiority • On some multiplicity issues and related topics in clinical trials
Efficacy Working Party (EWP) Points to Consider • CPMP/EWP/1776/99 Points to Consider on Missing Data (Released for Consultation January 2001) • CPMP/EWP/2330/99 Points to Consider on Validity and Interpretation of Meta-Analyses, and One Pivotal Study (Released for Consultation October 2000) • CPMP/EWP/482/99 Points to Consider on Switching between Superiority and Non-inferiority (Adopted July 2000)
ICH E9: Statistical Principles for Clinical Trials: Contents • Introduction (Purpose, scope, direction) • Considerations for Overall Clinical Development • Study Design Considerations • Study Conduct • Data Analysis • Evaluation of safety and tolerability • Reporting • Glossary of terms
Study Design: A Major Focus of the Guideline • Prior planning • Protocol considerations
Prospective Planning • Design of the trial • Analysis of outcomes
Confirmatory Study vs. Exploratory Study • A hypothesis stated in advance and evaluated • Data driven findings
Design Issues • Endpoints • Comparisons • Choice of study type • Choice of control group • Superiority • Non-inferiority • Equivalence • Sample size • Assumptions, sensitivity analysis
Choice of Study Type • Parallel group design • Cross-over design • Factorial design • Multicenter design
Analysis: Outcome Assessment • Multiple endpoints • Adjustments
Assessing Bias and Robustness of Study Results Analysis sets
Analysis Sets • ITT principle • All randomized population • Full Analysis population • Per Protocol
Data Analysis Considerations • Prespecification of the Analysis • Analysis sets • Full analysis set • Per Protocol Set • Roles of the Different Analysis Sets • Missing Values and Outliers
Statistical Analysis Plan (SAP) • A more technical and detailed elaboration of the principal features stated in the protocol. • Detailed procedures for executing the statistical analysis of the primary and secondary variables and other data. • Should be reviewed and possibly updated during blind review, and finalized before breaking the blind. • Results from analyses envisaged in the protocol (including amendments) regarded as confirmatory. • May be written as a separate document.
Analysis Sets • The ideal: the set of subjects whose data are to be included in the analysis: • all subjects randomized into the trial • satisfied entry criteria • followed all trial procedures perfectly • no loss to follow-up • complete data records
Full Analysis Set • Used to describe the analysis set which is as complete as possible and as close as possible to the intention-to-treat principle • May be reasonable to eliminate from the set of ALL randomized subjects those who fail to take at least one dose, or those without data post randomization. • Reasons for eliminating any randomized subject should be justified, and the analysis is not complete unless the potential biases arising from exclusions are addressed and reasonably dismissed.
Per Protocol Set • Sometimes described as: • Valid cases, efficacy sample, evaluable subjects • Defines a “subset” of the subjects in the full analysis set • May maximize the opportunity for a new treatment to show additional efficacy • May or may not be conservative • Bias arises from adherence to protocol related to treatment and/or outcome
Roles of the Different Analysis Sets • Advantageous to demonstrate a lack of sensitivity of the principal trial results to alternative choices of the set of subjects analyzed. • The full analysis set and per protocol set play different roles in superiority trials, and in equivalence or non-inferiority trials. • Full analysis set is primary analysis in superiority trials - avoids optimistic efficacy estimate from per protocol which excludes non-compliers. Full analysis set not always conservative in equivalence trial
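As a rough illustration of how the two analysis sets relate, the sketch below derives a full analysis set and a per protocol set from a list of randomized subjects. The record fields (`doses_taken`, `has_post_randomization_data`, `major_protocol_violation`) and the four example subjects are hypothetical, chosen only to mirror the exclusions named on the slides; a real protocol would define and justify its own criteria.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    # All fields are hypothetical illustrations, not taken from the guideline.
    subject_id: str
    arm: str
    doses_taken: int
    has_post_randomization_data: bool
    major_protocol_violation: bool


def full_analysis_set(randomized):
    """All randomized subjects, minus those who took no dose or have no
    post-randomization data (the exclusions the slide calls potentially reasonable)."""
    return [
        s for s in randomized
        if s.doses_taken >= 1 and s.has_post_randomization_data
    ]


def per_protocol_set(randomized):
    """Subset of the full analysis set without major protocol violations."""
    return [
        s for s in full_analysis_set(randomized)
        if not s.major_protocol_violation
    ]


randomized = [
    Subject("001", "test", 10, True, False),
    Subject("002", "test", 0, False, False),     # never dosed: outside both sets
    Subject("003", "control", 10, True, True),   # violation: full analysis set only
    Subject("004", "control", 8, True, False),
]

fas_ids = [s.subject_id for s in full_analysis_set(randomized)]
pps_ids = [s.subject_id for s in per_protocol_set(randomized)]
print(fas_ids)  # ['001', '003', '004']
print(pps_ids)  # ['001', '004']
```

Running the primary analysis on both sets and comparing the results is one way to show the lack of sensitivity that the slide recommends demonstrating.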
Impact on Drug Development • On sponsor design and analysis of clinical trials used as evidence to support claims • On regulatory advice and evaluation of sponsor protocols and completed clinical trials • On maximizing quality and utility of clinical studies in later phases of drug development • On multidisciplinary understanding of key concepts and issues • Enhanced attention to planning and protocol considerations
Will the Guideline Help to Avoid Problem Areas in the Future? Maybe! • Not a substitute for professional advice; it will require professional understanding and implementation of the principles stated • Will not assure correct analysis and interpretation • Most of the guideline topics reflect areas where problems have been observed frequently in clinical trials in drug development
ICH : Chemistry • Q1E: Bracketing and Matrixing Designs for Stability Testing of Drug Substances and Drug Products: • Considerable new work, including extensive simulations to evaluate size of studies and the ability to detect important changes to expiration date setting (incomplete blocks, alias, etc).
ICH E10: Choice of Control Group and Related Design Issues in Clinical Trials • Section 1.5 is very statistically oriented, involving issues like: • Assay sensitivity • Historical evidence of sensitivity to drug effects • Choice of a margin for a non-inferiority ("don’t show a difference") trial.
Assay Sensitivity in Non-inferiority designs • Assay sensitivity is a property of a clinical trial defined as the ability to distinguish an effective treatment from a less effective or ineffective treatment • Note that this property is more than just the statistical power of a study to demonstrate an effect - it also deals with the conduct and circumstances of a trial
The presence of assay sensitivity in a non-inferiority trial may be deduced from two determinations 1) Historical evidence of sensitivity to drug effects, i.e., that similarly designed trials in the past regularly distinguished effective treatments from less effective or ineffective treatments, and 2) Appropriate trial conduct, i.e., that the conduct of the current trial did not undermine its ability to distinguish effective treatments from less effective or ineffective treatments. [This can be fully evaluated only after the active control non-inferiority trial is completed.]
Successful use of a non-inferiority trial thus involves four critical steps 1) Determining that historical evidence of sensitivity to drug effect exists. Without this determination, demonstration of efficacy from a showing of non-inferiority is not possible and should not be attempted. 2) Designing a trial. Important details of the trial design, e.g. study population, concomitant therapy, endpoints, run-in periods, should adhere closely to the design of the placebo-controlled trials for which historical sensitivity to drug effects has been determined.
Successful use of a non-inferiority trial thus involves four critical steps (cont.) 3) Setting a margin. An acceptable non-inferiority margin should be defined, taking into account the historical data and relevant clinical and statistical considerations. 4) Conducting the trial. The trial conduct should also adhere closely to that of the historical trials and should be of high quality.
Choosing the Non-inferiority margin • Prior to the trial, a non-inferiority margin, sometimes called a delta, is selected. • This margin is the degree of inferiority of the test treatments to the control that the trial will attempt to exclude statistically. • The margin chosen cannot be greater than the smallest effect size that the active drug would be reliably expected to have compared with placebo in the setting of the planned trial. [based on both statistical reasoning and clinical judgement, should reflect uncertainties in evidence and be suitably conservative.]
Outline of the Issues • What is the non-inferiority design • What are the various objectives of the design • Complexities in choosing the margin of treatment effect - it depends upon the strength of evidence for the treatment effect of the active control • Literature on historical controls, and on the heterogeneity of treatment effects among studies • The statistical approaches to each objective, and their critical assumptions • Cautions and concluding remarks
Non-Inferiority Design • A study design used to show that a new treatment produces a therapeutic response that is not worse than that of a proven treatment (active control) by more than a pre-specified amount, from which it is then inferred that the new treatment is effective. The new treatment could be similar to or more effective than the existing proven treatment • A non-inferiority margin is pre-selected as the allowable reduction in therapeutic response. The margin is chosen based on the historical evidence of the efficacy of the active control and other clinical and statistical considerations relevant to the new treatment and the current study. • ICH - E10: “This delta can not be greater than the smallest effect size that the active drug would be reliably expected to have compared with placebo in the setting of a planned trial.” - the concept of reliably and repeatedly being able to demonstrate a treatment effect of a specified size!
Non-Inferiority Design (cont’d) • A test treatment is declared clinically non-inferior to the active control if: • the trial has the necessary assay sensitivity for the trial to be valid for non-inferiority testing • the one-sided 97.5% confidence interval for the treatment difference lies entirely to the right of -δ, where δ is the non-inferiority margin
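The confidence-interval criterion can be sketched for a binary endpoint as follows. This is a minimal sketch under stated assumptions: the treatment difference is taken as test minus control in response rates, the bound is a simple Wald (normal approximation) bound, and the counts and margins are made up for illustration. The slides do not prescribe a particular interval method, and the assay sensitivity condition is a judgment about the trial that no calculation of this kind can establish.

```python
from math import sqrt
from statistics import NormalDist


def lower_confidence_bound(x_test, n_test, x_ctrl, n_ctrl, level=0.975):
    """One-sided lower Wald bound for the difference in response rates
    (test minus control). The Wald form is an assumption of this sketch."""
    p_test = x_test / n_test
    p_ctrl = x_ctrl / n_ctrl
    se = sqrt(p_test * (1 - p_test) / n_test + p_ctrl * (1 - p_ctrl) / n_ctrl)
    z = NormalDist().inv_cdf(level)
    return (p_test - p_ctrl) - z * se


def is_noninferior(lower_bound, margin):
    """True when the lower bound lies entirely to the right of -margin."""
    return lower_bound > -margin


# Hypothetical data: 160/200 responders on test, 164/200 on active control.
lower = lower_confidence_bound(160, 200, 164, 200)
print(is_noninferior(lower, margin=0.10))  # True
print(is_noninferior(lower, margin=0.05))  # False
```

The same data support non-inferiority with a margin of 0.10 but not with a margin of 0.05, which is why the margin has to be fixed before the trial.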
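Inference for Non-Inferiority: Delta Limits & 95% Confidence Intervals [Figure: 95% confidence intervals for the treatment difference, drawn against reference lines at -δ and 0. Horizontal axis: Treatment Difference, with Control Better to the left and Test Agent Better to the right. Interval labels: Non-inferiority shown; Non-inferiority shown; Non-inferiority not shown; Non-inferiority shown / superiority could be claimed.]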
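The outcomes labelled in the figure can be expressed as a small decision rule on the lower confidence limit. This is only a sketch: the function name, the label strings, and the example limits are illustrative, and the difference is assumed to be oriented so that positive values favor the test agent.

```python
def classify_lower_limit(lower, margin):
    """Classify a confidence interval for (test minus control) by its lower limit.

    `margin` is the positive non-inferiority margin (delta).
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if lower > 0:
        return "non-inferiority shown; superiority could be claimed"
    if lower > -margin:
        return "non-inferiority shown"
    return "non-inferiority not shown"


# Hypothetical lower limits, with delta = 0.10.
for lower in (-0.04, -0.15, 0.02):
    print(lower, "->", classify_lower_limit(lower, margin=0.10))
```

Only the lower limit matters for the non-inferiority claim; the upper limit affects how precisely the difference is estimated, not which label applies.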
What are the various objectives of the non-inferiority design • To prove efficacy of test treatment by indirect inference from the active control treatment • To establish a similarity of effect to a known very effective therapy - e.g. anti-infectives • To infer that the test treatment would have been superior to an ‘imputed placebo’, i.e., had a placebo group been included for comparison in the current trial - a new and controversial area - choice of margin is the key
What is the Evidence supporting the treatment effect of the active control, and how convincing is it ? • Large treatment effects vs. small or modest effects • Large treatment effects - anti-infectives • Modest treatment effects - difficulties in reliably demonstrating the effect - Sensitivity to drug effects • Amount of prior study data available to estimate an effect • One single study • Several studies, of different sizes and quality • No estimate or study directly on the comparator - standard of care
How is the margin “δ” chosen based upon prior study data • For a large treatment effect, it is easier - a clinical decision of how similar a response rate is needed to justify efficacy of a test treatment - e.g. anti-infectives. • For modest and variable effects, it is more difficult, and some approaches suggest margin selection based upon several objectives.
Complexities in choosing the margin (how much of the control treatment effect to give up) • Margins can be chosen depending upon which of these questions is addressed: • how much of the treatment effect of the comparator can be preserved in order to indirectly conclude the test treatment is effective - a clinical decision for very large effects; a statistical problem for small and modest effects • how much of a treatment effect would one require for the test treatment to be superior to placebo, had a placebo been used in the current active control study - a lesser standard than the above
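One way to make the two questions concrete is to write the margin as the share of a conservatively estimated control effect that one is willing to give up. The sketch below assumes, as an illustration rather than as the method of the talk, that the control effect is taken at the lower confidence limit of a historical active-versus-placebo estimate; the function names and numbers are hypothetical. Setting `fraction_preserved` to 0 corresponds to the lesser standard of merely beating an imputed placebo.

```python
from statistics import NormalDist


def conservative_control_effect(estimate, std_error, confidence=0.95):
    """Lower two-sided confidence limit of the historical control-vs-placebo effect."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - z * std_error


def margin_for_fraction_preserved(control_effect, fraction_preserved):
    """Margin = the part of the control effect that may be given up."""
    if control_effect <= 0:
        raise ValueError("no reliably demonstrated control effect; no margin can be set")
    if not 0 <= fraction_preserved < 1:
        raise ValueError("fraction_preserved must be in [0, 1)")
    return (1 - fraction_preserved) * control_effect


# Hypothetical historical estimate: 12-point improvement over placebo, SE 3 points.
effect = conservative_control_effect(0.12, 0.03)
margin_half = margin_for_fraction_preserved(effect, 0.5)     # preserve half the effect
margin_placebo = margin_for_fraction_preserved(effect, 0.0)  # imputed-placebo standard
print(round(margin_half, 4), round(margin_placebo, 4))
```

When the historical lower limit is at or below zero the sketch refuses to return a margin, which mirrors the point that a non-inferiority design should not be attempted without historical evidence of sensitivity to drug effects.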
How convincing is the prior evidence of a treatment effect? • Do clinical trials of the comparator treatment consistently and reliably demonstrate a treatment effect - when they do not, what is the reason? • Study is too small to detect the effect - underpowered for a modest effect size • The treatment effect is variable, and the estimate of the magnitude will vary from study to study, sometimes with NO effect in a given study - a BIG problem for active controlled studies (Sensitivity to drug effect)
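The first explanation, an underpowered study, can be checked with a quick power calculation. The sketch below uses the standard normal approximation for comparing two response rates with equal arm sizes; the response rates and sample sizes are hypothetical and were chosen only to represent a modest effect.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_ctrl, p_trt, n_per_arm, alpha_two_sided=0.05):
    """Approximate power of a two-sided test comparing two response rates,
    ignoring the negligible chance of rejecting in the wrong direction."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha_two_sided / 2)
    se = sqrt(p_ctrl * (1 - p_ctrl) / n_per_arm + p_trt * (1 - p_trt) / n_per_arm)
    return nd.cdf(abs(p_trt - p_ctrl) / se - z_alpha)


# Hypothetical modest effect: 30% response on placebo, 40% on active drug.
for n in (100, 400):
    print(n, round(power_two_proportions(0.30, 0.40, n), 2))
```

With these assumed rates, a trial of 100 subjects per arm would miss a real effect more often than it finds it, so a string of "negative" small trials does not by itself show that the effect is variable.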
How do you know which treatment effect size is appropriate for the current active control? How much protection should be built into the choice of the margin to account for unknown bias and uncertainty in study differences?
Inherently, the answer relies upon historical controls and their applicability to the current study • Choice of the margins should take into account all sources of variability as well as the potential biases associated with non-comparability of the current study with the historical comparisons. • A need to balance the building in of ‘bias’ in the comparison and quantifying the ‘amount of treatment effect preserved’, as a function of the relative amount of data from the historical studies and the current study
Use of historical controls in current RCTs • Pocock, S. The combination of randomized and historical controls in clinical trials. J. Chronic Diseases, 29, 175-188 (1976) • Lists 6 conditions to be met for valid use of historical controls with controls in current trial • “Only if all these conditions are met can one safely use the historical controls as part of a randomized trial. Otherwise, the risk of a substantial bias occurring in treatment comparisons cannot be ignored.”
Importance of the assumption of constancy of the active control treatment effect derived from historical studies • It is relevant to the design and sample size of the current study, to the choice of the margin, to the amount of bias built into the comparisons, to the amount of effect size one can preserve (both of these are likely confounded), and to the statistical uncertainty of the conclusion. • Before one can decide on how much of the effect to preserve, one should estimate an effect size for which there is evidence of a consistent demonstration that effect size exists.
Explaining Heterogeneity among independent studies: Lessons from meta-analyses • Variation in baseline risk as an explanation of heterogeneity in meta-analysis. S.D. Walter, Stat. in Medicine, 16, 2883-2900 (1997) • An empirical study of the effect of the control rate as a predictor of treatment efficacy in meta-analysis of clinical trials. Schmid, Lau, McIntosh and Cappelleri, Stat. in Medicine, 17, 1923-1942 (1998)
Explaining Heterogeneity among independent studies: Lessons from meta-analyses (cont.) • Explaining heterogeneity in meta-analysis: a comparison of methods. Thompson and Sharp, Stat. in Medicine, 18, 2693-2708 (1999) • Assessing the potential for bias in meta-analysis due to selective reporting of subgroup analyses within studies. Hahn, Williamson, Hutton, Garner and Flynn, Stat. in Medicine, 19, 3325-3336 (2000)
Explaining Heterogeneity among independent studies: Lessons from meta-analyses (cont.) • Large trials vs. meta-analysis of smaller trials - How do their results compare? Cappelleri, Ioannidis, Schmid, de Ferranti, Aubert, Chalmers, Lau. JAMA, 276(16), 1332-1338 (1996) • Discordance between meta-analysis and large-scale randomized controlled trials: examples from the management of acute myocardial infarction. Borzak and Ridker, Ann. Internal Med., 123, 873-877 (1995) • Discrepancies between meta-analysis and subsequent large randomized controlled trials. LeLorier, Gregoire, Benhaddad, Lapierre, Derderian. NEJM, 337, 536-542 (1997)
Use of meta-analysis - necessary but not sufficient • Distinguish underpowered studies from well-powered studies for a common effect size - if possible • How many trials are consistent with no effect, rather than an effect of some size • Determine between-trial variability as an additional factor to consider in choosing a conservative margin • How do you know if the current study comes from the same trial population, and where does it rest in the trial distribution - critical to assumptions for control group rate and constancy of treatment effect • Resorting to meta-analysis of all studies, when few individual studies reject the null, tells you something!
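Two of the quantities named above, the between-trial variability and the number of trials individually consistent with no effect, can be sketched as follows. The DerSimonian-Laird moment estimator used here is a common choice that this sketch swaps in for illustration; the talk does not name an estimator, and the two trial results are made up.

```python
from math import sqrt
from statistics import NormalDist


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2.

    Returns (pooled_effect, pooled_std_error, tau_squared).
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, sqrt(1 / sum(w_re)), tau2


def trials_consistent_with_no_effect(effects, variances, confidence=0.95):
    """Count trials whose own confidence interval includes zero."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return sum(
        1 for y, v in zip(effects, variances)
        if y - z * sqrt(v) <= 0 <= y + z * sqrt(v)
    )


# Hypothetical historical trials: effect estimates and their variances.
effects = [0.0, 0.3]
variances = [0.01, 0.01]
pooled, se, tau2 = dersimonian_laird(effects, variances)
n_null = trials_consistent_with_no_effect(effects, variances)
print(round(pooled, 3), round(tau2, 3), n_null)
```

A nonzero `tau2` widens the uncertainty around the pooled control effect, which is one reason to choose a more conservative margin than the pooled estimate alone would suggest.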