Clinical Trials Using Pharmacogenetic Information: a Statistician's View. Rick Chappell, Ph.D., Professor, Department of Biostatistics and Medical Informatics, University of Wisconsin Medical School. Stat 641 - 12/13/2010
Outline • Definitions • Motivation • Statistical Background • Types of Clinical Trial Designs for Predictive Biomarker Validation • Examples • Conclusions • References
Definitions 1. Genomics and Pharmacology (Sen, 2009): • “Pharmacology is the science of drugs including materia medica, toxicology, and therapeutics; • ... Pharmacodynamics deals with reactions between drugs and living structures; • ... Pharmacokinetics relates to the study of the bodily absorption, distribution, metabolism, and excretion of drugs;
Definitions 1. Genomics and Pharmacology (Sen, 2009, cont.): • ... Pharmacogenetics deals with genetic variation underlying differential response to drugs as well as drug metabolism. • ... The whole complex constitutes the discipline: Pharmacogenomics.”
Genomics and Pharmacology (Sen, 2009, cont.): • Summary definition: “... genomics looks at the vast network of genes, over time, to determine how they interact, manipulate and influence biological pathways.” • My comments: This is a multidimensional definition: a “vast network” of genes, plus time. Add more dimensions: drugs and how they interact with these genes and pathways. • And, really, other factors too: radiation, devices, procedures ...
Definitions 2. Pharmacogenetic mechanisms (Palmer, 2009): • Variation in the metabolism of a drug among individuals, especially in enzymes involved in the catabolism or excretion of a drug; • Variation among population members with respect to drug adverse effects; and • Variation in the drug treatment target or target pathways, conceptually dividing a population into “Responders” and “Nonresponders”.
Definitions 3. Biomarker (Chakraverty, 2005): • A characteristic that is objectively measured and evaluated as an indicator of biologic processes or response to a therapeutic intervention. • Here, biomarkers will be restricted to be classifiers, which means that they cannot change over the course of the study. There are two types (Chang, 2008).
Definitions 4. Types of constant biomarkers (classifiers) • Prognostic Biomarkers: • Inform clinical outcomes, independent of treatment; • Provide information about course of disease in all individuals, whether or not they have received the treatment under study; • Can be used to separate good- and poor-prognosis patients at the time of diagnosis; • If separation is good, can be used to aid the treatment decision, in particular its aggressiveness. Prognostic biomarkers create a staging system!
Definitions 4. Types of constant biomarkers (classifiers) • Predictive Biomarkers: • Inform treatment effect on the clinical endpoint; • Can be used to determine treatment.
B. Motivation - 2. A (personally) early example of medically early “biomarkers” As a graduate student, I analyzed a retrospective dataset of colorectal tumors in order to refine Dukes’ staging (Michelassi, Block, Vanucci, Montag & Chappell, 1988). I was surprised at the magnitude of association between 5-year survival and the predictors:
B. Motivation - 3. Recent Developments with Iressa BOSTON, May 29 (UPI, from the Wall Street Journal) -- “The gene-targeting drug Iressa is proving beneficial for lung-cancer patients who are Asian or non-smokers, a Harvard Medical School oncologist said. The success of Iressa is spurring pharmaceutical companies to develop genetically targeted drugs to improve the treatment of numerous forms of cancer, said Lecia Sequist, an oncologist at Harvard and Massachusetts General Hospital Cancer Center.”
“… While Iressa made little impression as an overall lung cancer drug when introduced by AstraZeneca in 2002, it has proved effective among Asians and non-smokers who have lung cancer, said David Carbone, an oncologist at the Vanderbilt-Ingram Cancer Center in Nashville. "It is a great example of how genetics can be used to guide therapy,” Carbone said. …”
What do these two examples have in common? The first looks at the influence of prognostic markers. The second example identifies predictive markers. But they are both based on clinical or demographic characteristics (tumor stage, histology, pathology, race, smoking status), not directly measured genomic information.
Some investigators (Simon, 2004a) worry that lack of progress in cancer therapy is due to the failure to jump to genomic markers: “The development of traditional prognostic and diagnostic biomarkers has been largely disappointing. The extensive literature on this topic is often contradictory and relatively few oncologic biomarkers have been adopted in clinical practice.”
The title alone of Hilsenbeck (1992) reflects dissatisfaction with traditional prognostic information: “Why do so many prognostic factors fail to pan out?”
How do we move into the new age of pharmacogenomics? Move straight to the source of genetic variation by looking directly at genes. In both examples, the observed markers are likely to be important only because they are confounders for genetic markers. By “confounders” I mean that the observed markers are uninformative given knowledge about the genetic ones.
The effect of Iressa on lung cancer certainly has been examined with respect to direct genomic information: Tumors with mutations in the kinase domain of their epidermal growth factor receptor gene are highly sensitive to EGFR inhibitors like Iressa. This was no surprise. But what about other drugs/treatments (e.g., radiation) with less obvious genomic predictors?
Simon goes on to point out the difficulties in readying a gene expression biomarker for clinical use due to the very large number of candidate biomarkers, which may be orders of magnitude larger than the number of cases.
B. Motivation - 4. Follicular Lymphoma Released Proteins • Candidate List of yoUr Biomarkers (CLUB) contains 56 candidate lists of biomarkers, • Including one by Vaughn et al. (link in references) with 391 proteins for FL along with the relevant genes. • “CLUB is a freely available online system which allows the biomarkers research community to share, compare and analyze their list of candidate genes, transcripts or proteins.”
CLUB entry • Name: Candidate List of Follicular Lymphoma Released Proteins • Description: Using a two-peptide minimum per protein and standard criteria, 391 proteins (5.6% maximum predicted error rate) released from the FL cells were identified. • Sample Information: Sample Type: Cell Line; Sample Description: Follicular lymphoma-derived cell line SU-DHL-4 was used. • Experiment Details: Detection Type: Protein Expression; Experiment Design: Normal vs Diseased Comparison; Description: Tandem mass spectrometry analysis was used for the identification of proteins. • Candidates: ... and 388 others ...
A Gross Simplification: • This presentation (in accordance with the current state of research) will assume a single quantity to be tested in conjunction with a treatment to determine their effects upon patient outcomes. • This quantity can either be a single marker or a multigene expression profile, also called a “prediction function.”
Simon (2004b) suggested that pharmacogenomic predictors be developed using phase II data and validated in a phase III trial. • In a purely clinical sense, we don’t care about including unnecessary genes in the prediction profile as long as we have predictive accuracy. • “The phase III trial is free from the problems of data dredging” because it relies on this single prespecified predictor.
In the following (and often in the design literature), we will assume a single binary predictor and call it a “marker”, regardless of whether it is based on a single gene or a multigene predictor profile. • I focus on the design of Phase III clinical trials of a treatment in the presence of a single marker.
C. Statistical Background Suppose we have a binary outcome (e.g., “response / no response”). The outcome might depend on two binary factors, treatment T (+/-) and marker M (+/-). We denote the probability of response as P(resp.; levels of T and M). (Similar results will also hold for time-to-event and continuous responses.)
Three quantities of interest: 1. Treatment effect = P(resp.; T+) - P(resp.; T-). 2. Marker’s prognostic effect = P(resp.; M+) - P(resp.; M-).
Three quantities of interest: 3. Marker’s predictive effect (interaction of Marker, Treatment) = {Treatment effect with M+} - {Treatment effect with M-} = {P(resp.; T+, M+) - P(resp.; T-, M+)} - {P(resp.; T+, M-) - P(resp.; T-, M-)}.
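The three quantities can be read off a 2x2 table of response probabilities. The following is a minimal Python sketch, not taken from the slides: the cell probabilities, the 50% marker prevalence, and the 1:1 randomization (independent of marker status) are all hypothetical values chosen only for illustration.

```python
# Hypothetical response probabilities P(resp.; T, M); illustrative values only.
p = {
    ("T+", "M+"): 0.60, ("T-", "M+"): 0.40,
    ("T+", "M-"): 0.35, ("T-", "M-"): 0.30,
}
prevalence = 0.5  # assumed P(M+)
alloc = 0.5       # assumed P(T+) under 1:1 randomization

def marginal_over_marker(t):
    """P(resp.; T=t), averaging over marker status."""
    return prevalence * p[(t, "M+")] + (1 - prevalence) * p[(t, "M-")]

def marginal_over_treatment(m):
    """P(resp.; M=m), averaging over the randomized treatment."""
    return alloc * p[("T+", m)] + (1 - alloc) * p[("T-", m)]

# 1. Treatment effect = P(resp.; T+) - P(resp.; T-)
treatment_effect = marginal_over_marker("T+") - marginal_over_marker("T-")

# 2. Marker's prognostic effect = P(resp.; M+) - P(resp.; M-)
prognostic_effect = marginal_over_treatment("M+") - marginal_over_treatment("M-")

# 3. Marker's predictive effect = difference of within-marker treatment effects
predictive_effect = (p[("T+", "M+")] - p[("T-", "M+")]) - (
    p[("T+", "M-")] - p[("T-", "M-")]
)

print(f"treatment effect:  {treatment_effect:.3f}")
print(f"prognostic effect: {prognostic_effect:.3f}")
print(f"predictive effect: {predictive_effect:.3f}")
```

With these made-up inputs all three effects are nonzero, which illustrates that a marker can be prognostic and predictive at the same time.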
Three quantities of interest: Because treatment effects are estimated with standard methods from randomized clinical trials, and prognostic effects are estimable from single-sample observational studies, I will now concentrate on studies which assess or allow for a marker’s predictive effect. These all require treatment randomization.
Who is interested in these quantities (for example, for T = Iressa, M = EGFR)? • Patients’ self-interest is served by knowing whether the treatment works for them. They may not care whether a marker is prognostic or even predictive as long as they significantly benefit from the treatment. They would care if a marker could show them to be in a group for which the treatment is ineffective. • AstraZeneca, manufacturer of Iressa, is interested in finding indications for it and is happy if a marker helps it do so. It is also primarily interested in the treatment effect, either in all patients or in a subgroup. • Genzyme, manufacturer of the EGFR gene amplification test, would want to demonstrate a treatment effect and marker prognostic value.
How many patients does it take to estimate these quantities? 1. Treatment effect = P(resp.; T+) - P(resp.; T-). Estimating this with good (90%) power to detect a true treatment difference of 10% requires about 500 + 500 = 1000 patients total.
How many patients does it take to estimate these quantities? 2. Marker’s prognostic effect = P(resp.; M+) - P(resp.; M-). Estimating this with good (90%) power to detect a true prognostic difference of 10% also requires about 500 + 500 = 1000 patients total.
How many patients does it take to estimate these quantities? 3. Marker’s predictive effect = {Treatment effect with M+} - {Treatment effect with M-} = {P(resp.; T+, M+) - P(resp.; T-, M+)} - {P(resp.; T+, M-) - P(resp.; T-, M-)}. Estimating this with good (90%) power to detect a true predictive difference of 10% requires about 1000 + 1000 + 1000 + 1000 = 4000 patients total if we use a factorial design (four equal-sized samples) for maximum efficiency. An interaction is a difference of differences: its estimate combines four cell proportions rather than two, so its variance is about twice that of a main effect at the same sample size per group. Estimating interactions therefore requires a lot of subjects.
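The round figures above can be approximated with the usual normal-approximation sample size formula. The sketch below is a rough check under stated assumptions, not the calculation used for the slides: it assumes a two-sided 5% significance level and response probabilities near 50% (neither is given in the slides), and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def z_total(alpha=0.05, power=0.90):
    """Sum of the normal quantiles for a two-sided test at level alpha."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)

def n_per_arm(p0, p1, alpha=0.05, power=0.90):
    """Patients per arm to detect p1 - p0 between two groups."""
    variance = p0 * (1 - p0) + p1 * (1 - p1)
    return ceil(z_total(alpha, power) ** 2 * variance / (p1 - p0) ** 2)

def n_per_cell(p_tpmp, p_tmmp, p_tpmm, p_tmmm, alpha=0.05, power=0.90):
    """Patients per cell of a 2x2 factorial to detect the interaction.

    Arguments are P(resp.) for (T+,M+), (T-,M+), (T+,M-), (T-,M-).
    """
    interaction = (p_tpmp - p_tmmp) - (p_tpmm - p_tmmm)
    variance = sum(q * (1 - q) for q in (p_tpmp, p_tmmp, p_tpmm, p_tmmm))
    return ceil(z_total(alpha, power) ** 2 * variance / interaction ** 2)

# Assumed scenario: 50% response in control, 60% where the treatment works.
main = n_per_arm(0.50, 0.60)
cell = n_per_cell(0.60, 0.50, 0.50, 0.50)
print(f"main effect: {main} per arm, {2 * main} total")
print(f"interaction: {cell} per cell, {4 * cell} total")
```

Under these assumptions the result is a little over 500 per arm for a main effect and a little over 1000 per cell for the interaction, consistent with the approximate totals of 1000 and 4000.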
D. Types of Clinical Trial Designs for Predictive Biomarker Validation (Mandrekar & Sargent, 2009) 1. Retrospective designs - use of data from a previously conducted randomized clinical trial to validate a marker. • Randomization is necessary in order to obtain an unbiased estimate of the treatment effect, without which any conclusions about the biomarker prediction are invalid. • Practical - can use existing data. • Not a “fishing license”: markers and hypotheses must be prespecified. • Samples must be available from all/most patients in order to avoid selection bias.
2. Enrichment (targeted) designs - all prospective subjects are screened for a marker or marker profile, and only those with (or without) certain molecular features are included. • Prospective - inclusion criteria must be specified in advance. • Appropriate when preliminary evidence hints that patients with (or without) a profile benefit from treatment. • Do not yield information about the omitted patients. Results will only tell you if the treatment is effective for the targeted patients.
3. Unselected (“all-comers”) designs - eligible patients of any biomarker profile are admitted into the trial. • Also prospective. • Ability to provide adequate tissue may be an eligibility criterion for these designs, but not the specific biomarker result. • Can directly validate the biomarker profile. • Subdivided based not on the design but on the prespecified analysis methods.
Unselected Designs A. Sequential testing strategy designs - hypothesis of treatment effectiveness is tested in both the overall population and the prospectively planned, marker-defined subgroup. • Similar in principle to a standard RCT design with a single primary hypothesis. • Can either test the entire population first or test in the subgroup first and then the whole group if treatment is significant in the subgroup (closed testing procedure). • Both analyses preserve type I error (not biased by multiple comparisons).
Unselected Designs A. Sequential testing strategy designs (continued) • “Entire population first” analysis is useful when marker is of secondary importance. • “Marker + (or -) subgroup first” analysis is best when there is strong prior evidence that marker is predictive and also subgroup has sufficient power. • The choice of procedures is important; you may be sorry (too late) if you pick the “wrong one”. • It is possible to use an adjustment for multiple comparisons which lets you pick the more significant result, but the price is a lower significance threshold for each test (equivalently, p-values adjusted upwards).
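The trade-off between the two approaches can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the p-values are made up, and a Bonferroni split is assumed as one simple example of an adjustment for multiple comparisons (the slides do not name a specific method).

```python
def subgroup_first(p_subgroup, p_overall, alpha=0.05):
    """Fixed-sequence test: the subgroup is tested at the full alpha, and the
    overall population is tested only if the subgroup test is significant."""
    subgroup_sig = p_subgroup <= alpha
    overall_sig = subgroup_sig and p_overall <= alpha
    return {"subgroup": subgroup_sig, "overall": overall_sig}

def pick_more_significant(p_subgroup, p_overall, alpha=0.05):
    """Bonferroni split (assumed adjustment): both hypotheses are tested,
    each at alpha / 2, so either one can be claimed on its own."""
    threshold = alpha / 2
    return {"subgroup": p_subgroup <= threshold,
            "overall": p_overall <= threshold}

# Made-up p-values: the marker subgroup fails, the overall population succeeds.
print(subgroup_first(0.20, 0.001))         # sequence stops at the subgroup
print(pick_more_significant(0.20, 0.001))  # overall effect can still be claimed

# Made-up p-values just under 0.05: the reduced threshold is the price.
print(subgroup_first(0.03, 0.04))
print(pick_more_significant(0.03, 0.04))
```

The first pair of calls shows how a wrong choice of ordering can waste a strong overall result; the second pair shows what the adjustment costs when both results are only moderately significant.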
Unselected Designs B. Marker by treatment interaction designs - a marker by treatment factorial design in which patients are stratified by marker status and then randomized to treatment/control within each marker group. • Hypothesis of marker predictive validity is formally tested by examining the Marker/Treatment interaction. • Mandrekar and Sargent call it “Clearly, a prospective and definitive marker validation trial.” • As above, requires a large sample size.
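One simple way to examine the Marker/Treatment interaction for a binary outcome is a large-sample z-test on the difference of differences. The sketch below assumes that approach (a logistic regression with an interaction term is a common alternative); the function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def interaction_z_test(responders, totals):
    """Wald z-test for the marker-by-treatment interaction on the risk
    difference scale. Both arguments are dicts keyed by (treatment, marker)."""
    phat = {cell: responders[cell] / totals[cell] for cell in totals}
    estimate = (phat[("T+", "M+")] - phat[("T-", "M+")]) - (
        phat[("T+", "M-")] - phat[("T-", "M-")]
    )
    # The four cells are independent samples, so their variances add.
    se = sqrt(sum(phat[c] * (1 - phat[c]) / totals[c] for c in totals))
    z = estimate / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return estimate, se, z, p_value

# Hypothetical trial: 1000 patients per cell; treatment helps only M+ patients.
totals = {("T+", "M+"): 1000, ("T-", "M+"): 1000,
          ("T+", "M-"): 1000, ("T-", "M-"): 1000}
responders = {("T+", "M+"): 600, ("T-", "M+"): 500,
              ("T+", "M-"): 500, ("T-", "M-"): 500}

estimate, se, z, p_value = interaction_z_test(responders, totals)
print(f"interaction = {estimate:.3f}, SE = {se:.4f}, z = {z:.2f}, p = {p_value:.4f}")
```

The standard error sums the variances of all four cell proportions, which is why the interaction needs far more patients than a main effect of the same size.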
Unselected Designs C. Marker-based strategy designs - patients are randomized to have their treatment either based on or independent of marker status. • For example, all M+ patients receive the treatment and M- patients are randomized to the treatment vs. control. • Essentially a clinical trial in which the intervention is the marker assay; it directly tests the assay’s use. • Because some patients in both the M+ and M- arms receive the treatment, this design is inefficient. • Requires an even larger sample size than the interaction designs. • I recommend interaction designs for testing the utility of a marker assay.
4. Hybrid randomized/nonrandomized designs - only marker + (or -) patients are randomized to treatment/control; all others receive only the control. • Also prospective. • Useful when it is unethical to randomize some patients. • Powered to detect differences only in the randomized group, as in enrichment designs. • Cannot directly validate marker’s predictive ability because the treatment effect is not estimable in one group. • Can provide data on other markers for future studies.
5. Adaptive designs (Chang, 2008) • A general term for designs with two or more stages in which the structure of later stages can depend upon the results of the earlier ones. The following design parameters can be modified: • Marker cut points: optimal cut points can be chosen based on performance in the first stage. • Marker subgroups in which patients are randomized: the trial begins with patients accrued in all subgroups, and futility analyses determine whether some subgroups are discontinued. • Overall sample size.
Advantages and disadvantages of adaptive designs • Flexibility in the face of uncertainty about treatment effect, overall event rates, marker cut points, and marker effects. • Requirement for quick outcomes, ruling out time-to-event endpoints in most cases. • Requirement for statistical adjustment in order to use the same data for definition and validation of cut points, or for the design and analysis of trial populations. • Logistical complexity.
E. Examples • Retrospective analysis of a marker in existing RCTs: • M = KRAS (wild type vs. mutant) • T = panitumumab / cetuximab (vs. best supportive care) • Disease = advanced colorectal cancer • Primary outcome = progression-free survival (PFS)