
On the Road to Genomic Predictive Medicine An Interim Analysis

On the Road to Genomic Predictive Medicine An Interim Analysis. Richard Simon Chief, Biometric Research Branch National Cancer Institute. How I got involved in genomics. In the late 1990’s genomic data was for me the most exciting scientific data of our generation



  1. On the Road to Genomic Predictive Medicine An Interim Analysis Richard Simon Chief, Biometric Research Branch National Cancer Institute

  2. How I got involved in genomics • In the late 1990’s genomic data was for me the most exciting scientific data of our generation • Analysis of that data shouldn’t be left to amateurs • We had a great cadre of statisticians involved in clinical trials and we know how to do reliable clinical trials, but the drugs are often disappointing • Statisticians should be involved in basic research, pre-clinical target discovery and policy

  3. Biomedical leaders were looking to computer scientists and physicists for help, not to statisticians • Statisticians were viewed as useful for testing hypotheses and computing p values, not for discovery

  4. Many statisticians tend to see themselves as methods developers, not as scientists focused on a subject-matter area

  5. Imatinib chronology • 1960 – Philadelphia chromosome described (P Nowell) • 1973 – Ph characterized as translocation of ABL on chromosome 9 with BCR on chromosome 22 (J Rowley) • 1986 – BCR-ABL fusion gene characterized as constitutively activated kinase (D Baltimore)

  6. Imatinib chronology • 1988-1995 CIBA-GEIGY develops kinase inhibitors (A Matter, N Lydon, J Zimmermann, E Buchdunger) • 1996 B Druker (Dana-Farber -> Oregon) screens compounds provided by Novartis in ex vivo tumors and normal lymphocytes and convinces the company to sponsor clinical trials in CML in spite of only 5000 cases/yr in US

  7. Success depended on collaboration between industry and academia • Delayed development resulted from reluctance of field to accept hypothesis that kinases can be selectively inhibited or that inhibiting a single gene could be very effective • Industry involvement dependent on vision of a small leadership group in one company • Clinical translation dependent on vision of one oncologist

  8. Success depends on serendipity • Academic medicine (NIH) is a bottom-up system not optimized for risk taking or exploiting scientific leads for translating basic research to clinical products or for mounting large cooperative programs for overcoming bottlenecks in translation • Academic medicine is very dependent on industry but industry has its own constraints

  9. Predictive Medicine • Germline genetics • GWAS • 23andMe • Tumor genomics • Tumor Cell Genome Atlas

  10. Ioannidis et al. JNCI 102:846 (2010) • 56 GWAS • 92 statistically significant associations between cancer phenotype and genetic variant • Median OR = 1.22 • IQR OR = 1.15 – 1.36
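
To give a feel for the size of an odds ratio near the median reported on this slide, the sketch below converts an odds ratio into an absolute risk. The baseline risk is a hypothetical input, not a figure from the slides, and the function name `risk_with_variant` is introduced here only for illustration.

```python
def risk_with_variant(baseline_risk, odds_ratio):
    """Absolute risk for carriers, given the risk in non-carriers and an odds ratio."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    carrier_odds = baseline_odds * odds_ratio
    return carrier_odds / (1 + carrier_odds)


# Hypothetical baseline risk of 10% combined with the median OR from the slide.
print(round(risk_with_variant(0.10, 1.22), 4))
```

Under that assumed 10% baseline, an OR of 1.22 moves the risk to roughly 11.9%, which illustrates how small a shift such an association implies for an individual.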

  11. Cancers of a given histologic diagnosis are genomically heterogeneous • Cancers are mostly caused by somatic mutations not genetic polymorphisms • Most of the information about the disease is in the tumor genome, not the germ-line genome

  12. Biomarkers for Early Detection • Because of the long time between first mutation and clinical diagnosis of human solid tumors, there would seem to be great opportunity for early detection

  13. Phase II trials of early detection have used samples from patients at diagnosis • Effective detection must have long lead time and high specificity for tumors which will evolve to be life threatening

  14. Biomarkers for Informing Treatment Selection • Prognostic biomarkers • Measured before treatment to indicate long-term outcome for patients untreated or receiving standard treatment • To identify which patients have excellent prognosis on conservative treatment • Predictive biomarkers • Measured before treatment to identify who is likely or unlikely to benefit from a particular treatment

  15. Prognostic Markers • Vast literature on prognostic markers • Very few used in practice • Most studies motivated by desire to learn about disease biology • Broad selection of cases • Little focus on intended use • Little focus on analytical validation of assay

  16. Validation of Biomarkers • Analytical validity • Measures what it is supposed to • Reproducible • Clinical validity • Correlates with something clinically • Clinical utility • Is actionable • Measuring marker leads to action that benefits patient • Requires clarity on intended use

  17. If you don’t know where you are going, you might not get there (Yogi Berra)

  18. Prognostic Markers • OncotypeDx: Which patients with node-negative, ER-positive breast cancer who are receiving tamoxifen will have such good prognosis that they do not need cytotoxic chemotherapy? • Analysis focused on whether marker identifies such a subset, not on statistical significance

  19. [Figure: B-14 Results, Relapse-Free Survival. Three groups of 338, 149, and 181 pts; p<0.0001. Paik et al, SABCS 2003]

  20. Major problems with prognostic studies of gene expression signatures • Inadequate focus on intended use • Cases selected based on availability of specimens rather than for relevance to intended use • Heterogeneous sample of patients with mixed stages and treatments. Attempt to disentangle effects using regression modeling • Overemphasis on statistical significance and hazard ratios. • Over-fitting data

  21. For p>n problems • Fit of a model to the same data used to develop it is no evidence of prediction accuracy for independent data

  22. Validation of Prognostic Model • Completely independent validation dataset • Splitting dataset into training and testing sets • Cross-validation

  23. Partition data set D into K equal parts D1, D2, ..., DK • First training set T1 = D - D1 • Develop completely specified prognostic model M1 using only data T1 • Compute prognostic score for cases in D1 • Develop model M2 using only T2 = D - D2 and then score cases in D2

  24. Repeat for ... TK -> MK -> DK • Group patients into risk groups (e.g. 2 or more) based on their cross-validated scores • Calculate Kaplan-Meier survival curve for each risk-group
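
The procedure on the two slides above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: `develop_model` is a hypothetical stand-in for whatever model development is used (here a single covariate, ignoring censoring), the median cut-point for two risk groups is an assumption, and all function names are introduced for this sketch.

```python
import random


def kfold_parts(n_cases, k, seed=0):
    """Shuffle case indices and split them into k nearly equal parts D_1..D_K."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def develop_model(x_train, t_train):
    """Hypothetical stand-in for developing a model on a training set T_k.

    Uses one covariate and ignores censoring: the score is oriented so that
    a higher score goes with shorter survival in the training data.
    """
    mx = sum(x_train) / len(x_train)
    mt = sum(t_train) / len(t_train)
    cov = sum((x - mx) * (t - mt) for x, t in zip(x_train, t_train))
    sign = -1.0 if cov > 0 else 1.0
    return lambda x: sign * x


def cross_validated_scores(x, times, k=5, seed=0):
    """Score every case with the model developed without that case's part."""
    scores = [None] * len(x)
    for part in kfold_parts(len(x), k, seed):
        held_out = set(part)
        train = [i for i in range(len(x)) if i not in held_out]
        model = develop_model([x[i] for i in train], [times[i] for i in train])
        for i in part:
            scores[i] = model(x[i])
    return scores


def risk_groups(scores):
    """Split cases into two groups at the median cross-validated score."""
    cut = sorted(scores)[len(scores) // 2]
    return ["high" if s >= cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each event time; events[i] is 1 (event) or 0 (censored)."""
    at_risk, surv, curve = len(times), 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def km_by_group(times, events, groups):
    """Kaplan-Meier curve for each cross-validated risk group."""
    curves = {}
    for g in set(groups):
        sel = [i for i, gi in enumerate(groups) if gi == g]
        curves[g] = kaplan_meier([times[i] for i in sel], [events[i] for i in sel])
    return curves
```

A typical call sequence would be `scores = cross_validated_scores(x, times)`, then `groups = risk_groups(scores)`, then `km_by_group(times, events, groups)`. The key point is that each case's score comes from a model that never saw that case.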

  25. Complete Cross-Validation • Cross-validation simulates the process of separately developing a model on one set of data and predicting for a test set of data not used in developing the model • All aspects of the model development process must be repeated for each loop of the cross-validation, including: • Feature selection • Tuning parameter optimization

  26. Prediction on Simulated Null Data (Simon et al. J Natl Cancer Inst 95:14, 2003) • Generation of Gene Expression Profiles • 20 specimens (P_i is the expression profile for specimen i) • Log-ratio measurements on 6000 genes • P_i ~ MVN(0, I_6000) • Can we distinguish between the first 10 specimens (Class 1) and the last 10 (Class 2)? • Prediction Method • Compound covariate predictor built from the log-ratios of the 10 most differentially expressed genes.
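
A rough sketch of this null simulation is given below, comparing the error counted on the same data used to build the predictor with a leave-one-out cross-validated error in which gene selection is repeated inside every loop. Several details are assumptions made for illustration rather than taken from the slides: genes are ranked by a pooled-variance t-statistic, the compound covariate is the t-weighted sum of the selected genes, and the classification threshold is the midpoint of the two class means. All function names are introduced here.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def t_stat(a, b):
    """Two-sample t-statistic with pooled variance."""
    ma, mb = mean(a), mean(b)
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    pooled = ss / (len(a) + len(b) - 2)
    return (ma - mb) / (pooled * (1 / len(a) + 1 / len(b))) ** 0.5


def fit_ccp(data, labels, n_top=10):
    """Select the n_top genes by |t| and return a compound covariate classifier."""
    c1 = [row for row, y in zip(data, labels) if y == 1]
    c2 = [row for row, y in zip(data, labels) if y == 2]
    t = [t_stat([r[g] for r in c1], [r[g] for r in c2]) for g in range(len(data[0]))]
    top = sorted(range(len(t)), key=lambda g: abs(t[g]), reverse=True)[:n_top]

    def score(row):
        return sum(t[g] * row[g] for g in top)

    # Threshold halfway between the mean scores of the two training classes.
    cut = (mean(score(r) for r in c1) + mean(score(r) for r in c2)) / 2
    return lambda row: 1 if score(row) > cut else 2


def resubstitution_errors(data, labels, n_top=10):
    """Errors when the predictor is evaluated on the data used to build it."""
    predict = fit_ccp(data, labels, n_top)
    return sum(predict(row) != y for row, y in zip(data, labels))


def loocv_errors(data, labels, n_top=10):
    """Leave-one-out errors with gene selection redone inside each loop."""
    errors = 0
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        predict = fit_ccp(train, train_labels, n_top)
        errors += predict(data[i]) != labels[i]
    return errors


def simulate(n_genes=6000, n_specimens=20, seed=0):
    """Independent N(0, 1) log-ratios, i.e. no real difference between classes."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_specimens)]
    half = n_specimens // 2
    labels = [1] * half + [2] * (n_specimens - half)
    return resubstitution_errors(data, labels), loocv_errors(data, labels)
```

Calling `simulate()` returns the pair (resubstitution errors, cross-validated errors) out of 20 specimens. Since the classes are indistinguishable by construction, one would expect the first count to be misleadingly close to zero and the second to be near chance, though the exact counts vary with the seed.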

  27. Cross-Validation • The cross-validated estimate of misclassification error is an estimate of the prediction error for the model fit by applying the specified algorithm to the full dataset

  28. Statistical significance of the difference in survival among risk groups is usually not the point • But to evaluate significance, the log-rank test cannot be used for cross-validated Kaplan-Meier curves because the survival times are not independent
