An Array of FDA Efforts in Pharmacogenomics. Weida Tong, Director, Center for Toxicoinformatics, NCTR/FDA, Weida.tong@fda.hhs.gov. CAMDA 08, Boku University, Vienna, Austria, Dec 4-6, 2008.
An Array of FDA Efforts in Pharmacogenomics Weida Tong Director, Center for Toxicoinformatics, NCTR/FDA Weida.tong@fda.hhs.gov CAMDA 08, Boku University, Vienna, Austria, Dec 4-6, 2008
Pipeline Problem: Spending More, Getting Less
[Figure: research spending (R&D spending, NIH budget) plotted against NDAs and BLAs received by FDA (NMEs, BLAs)]
While research spending (pharma and NIH) has increased, fewer NMEs and BLAs have been submitted to FDA.
The FDA Critical Path to New Medical Products
• Pharmacogenomics and toxicogenomics have been identified as crucial in advancing:
  • Medical product development
  • Personalized medicine
Guidance for Industry: Pharmacogenomic Data Submissions www.fda.gov/cder/genomics www.fda.gov/cder/genomics/regulatory.htm
A Novel Data Submission Path - Voluntary Genomics Data Submission (VGDS)
• Defined in the Guidance for Industry on Pharmacogenomics (PGx) Data Submission (draft document released in 2003; final publication, 2005)
• To encourage sponsors to interact with FDA by submitting PGx data on a voluntary basis
• To provide a forum for scientific discussions with the FDA outside of the application review process
• To establish a regulatory environment (both the tools and expertise) within the FDA for receiving, analyzing and interpreting PGx data
VGDS Status
• More than 40 submissions have been received
• The submissions contain PGx data from:
  • DNA microarrays
  • Proteomics
  • Metabolomics
  • Genotyping, including genome-wide association studies (GWAS)
  • Others
• Bioinformatics has played an essential role in accomplishing:
  • Objective 1: Data repository
  • Objective 2: Reproduce the sponsor’s results
  • Objective 3: Conduct alternative analysis
FDA Genomic Tool: ArrayTrack – Support FDA regulatory research and review
• Developed by NCTR/FDA
  • Develop 1: An integrated solution for microarray data management, analysis and interpretation
  • Develop 2: Support meta data analysis across various omics platforms and study data
  • Develop 3: SNPTrack, a sister product in collaboration with Rosetta
• FDA agency wide application
  • Review tool for the FDA VGDS data submission
  • >100 FDA reviewers and scientists have participated in the training
  • Integrating with Janus for e-Submission
ArrayTrack: An Integrated Solution for omics research
[Diagram: ArrayTrack at the center, connected to microarray data, proteomics data, metabolomics data, public data, clinical and non-clinical data, and chemical data]
[Figure panels: Protein, Gene, Metabolite]
Specific Functionality Related to VGDS
• Phenotypic anchoring
• Systems approach
[Figure: gene data (gene names hidden) shown alongside clinical pathology data (ClinChem names hidden)]
ArrayTrack: Freely Available to the Public
[Figure: number of unique users, calculated quarterly, for web access and local installation]
• To be consistent with the common practice in the research community
• Over 10 training courses have been offered, including two in Europe
• Education: part of bioinformatics courses at UCLA, UMDNJ and UALR
• Eli Lilly chose ArrayTrack to support its clinical gene-expression studies after rigorously assessing its architectural structure, functionality, security and customer support
ArrayTrack Website http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack/
MicroArray Quality Control (MAQC) - An FDA-Led Community Wide Effort to Address the Challenges and Issues Identified in VGDS
• QC issue – How good is good enough?
  • Assessing the best achievable technical performance of microarray platforms (QC metrics and thresholds)
• Analysis issue – Can we reach a consensus on analysis methods?
  • Assessing the advantages and disadvantages of various data analysis methods
• Cross-platform issue – Do different platforms generate different results?
  • Assessing cross-platform consistency
MAQC Way of Working
• Participants: Everyone was welcome; however, cutoff dates had to be imposed.
• Cost-sharing: Every participant contributed, e.g., arrays, RNA samples, reagents, time and resources in generating and analyzing the MAQC data.
• Decision-making:
  • Face-to-face meetings (1st, 2nd, 3rd, and 4th)
  • Biweekly, regular MAQC teleconferences (>20 times)
  • Smaller-scale teleconferences on specific issues (many)
• Outcome: Peer-reviewed publication
  • Followed the normal journal-defined publication process
  • 9 papers submitted to Nature Biotechnology; 6 accepted and 3 rejected
• Transparency:
  • MAQC data are freely available at GEO, ArrayExpress, and ArrayTrack
  • RNA samples are available from commercial vendors
MicroArray Quality Control (MAQC) project – Phase I
• MAQC-I: Technical Performance
  • Reliability of microarray technology
  • Cross-platform consistency
  • Reproducibility of microarray results
• MAQC-II: Practical Application
  • Molecular signatures (or classifiers) for risk assessment and clinical application
  • Reliability, cross-platform consistency and reproducibility
  • Develop guidance and recommendations
[Timeline: Feb 2005, MAQC-I (137 scientists from 51 organizations), Sept 2006, MAQC-II (>400 scientists from >150 organizations), Dec 2008]
Results from the MAQC-I Study Published in Nature Biotechnology in Sept/Oct 2006
• Six research papers (Nat. Biotechnol. 24(9) and 24(10s), 2006):
  • MAQC Main Paper
  • Validation of Microarray Results
  • RNA Sample Titrations
  • One-color vs. Two-color Microarrays
  • External RNA Controls
  • Rat Toxicogenomics Validation
• Plus:
  • Editorial: Nature Biotechnology
  • Foreword: Casciano DA and Woodcock J
  • Stanford Commentary: Ji H and Davis RW
  • FDA Commentary: Frueh FW
  • EPA Commentary: Dix DJ et al.
Key Findings from the MAQC-I Study
When standard operating procedures (SOPs) are followed and the data are analyzed properly, the following is demonstrated:
• High within-lab and cross-lab reproducibility
• High cross-platform comparability, including one- vs two-color platforms
• High correlation between quantitative gene expression (e.g. TaqMan) and microarray platforms
• The few discordant measurements were mainly due to differences in probe sequence and thus target location
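The "high correlation" finding above is the kind of statement that can be checked with a simple correlation between the two platforms' log ratios. The following is a minimal sketch, assuming Pearson correlation as the measure and using invented log2 ratios for five genes; the function name `pearson_r` and all values are hypothetical and are not MAQC data or the MAQC analysis code.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log2 ratios (sample A vs. sample B) for the same five genes,
# measured once by TaqMan and once by a microarray platform.
taqman_log2_ratio = [-2.0, -1.0, 0.5, 1.5, 3.0]
array_log2_ratio = [-1.6, -0.9, 0.3, 1.2, 2.2]

r = pearson_r(taqman_log2_ratio, array_log2_ratio)
print(f"Pearson r between platforms: {r:.3f}")
```

A rank-based measure such as Spearman correlation would be a reasonable alternative when the two platforms compress ratios differently.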
How to determine differentially expressed genes (DEGs) - Do we really know what we know?
• A circular path for DEGs
  • Fold Change (FC) – biologist initiated (frugal approach)
    • Magnitude difference
    • Biological significance
  • P-value – statistician joined in (expensive approach)
    • Specificity and sensitivity
    • Statistical significance
  • FC (P) – a MAQC finding (statistics got to know its limitations)
    • FC ranking with a nonstringent P-value cutoff, FC (P), should be considered for class comparison studies
    • Reproducibility
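The FC (P) recommendation above can be sketched in a few lines: filter genes with a nonstringent P-value cutoff, then rank the survivors by the magnitude of their fold change. This is a minimal illustration, assuming per-gene log2 fold changes and P-values have already been computed; the names `GeneStat` and `rank_fc_with_p_filter`, the cutoff of 0.05, and the example values are all hypothetical.

```python
from typing import NamedTuple

class GeneStat(NamedTuple):
    gene: str
    log2_fc: float  # log2 fold change between the two classes
    p_value: float  # P-value from a per-gene test (e.g. a t-test)

def rank_fc_with_p_filter(stats, p_cutoff=0.05, top_n=None):
    """FC (P): keep genes passing a nonstringent P cutoff, rank by |log2 FC|."""
    passed = [s for s in stats if s.p_value < p_cutoff]
    ranked = sorted(passed, key=lambda s: abs(s.log2_fc), reverse=True)
    return ranked if top_n is None else ranked[:top_n]

# Invented example values, not real measurements.
stats = [
    GeneStat("geneA", 2.1, 0.030),
    GeneStat("geneB", -3.0, 0.200),  # large change but fails the P filter
    GeneStat("geneC", 0.4, 0.001),   # very significant but small change
    GeneStat("geneD", -1.5, 0.010),
]

degs = rank_fc_with_p_filter(stats, p_cutoff=0.05, top_n=2)
print([s.gene for s in degs])  # prints ['geneA', 'geneD']
```

Note how geneC, the most statistically significant gene, is ranked last among the survivors: the P-value only acts as a gate, and the ordering comes from fold change.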
[Figure: journal titles shown: Nature, Science, Nature Methods, Cell, Analytical Chemistry]
Post-MAQC-I Study on Reproducibility of DEGs - A Statistical Simulation Study
[Figure: Lab 1 vs. Lab 2, P vs. FC sorting; panels labeled sensitivity, 1-specificity, and POG (reproducibility)]
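POG in the figure labels above refers to the percentage of overlapping genes between two DEG lists, the reproducibility measure used in the MAQC work. A minimal sketch follows, assuming each lab reports a list of gene identifiers; dividing by the shorter list when the lengths differ is an assumption made here for illustration, and the function name `pog` and gene names are hypothetical.

```python
def pog(list_a, list_b):
    """Percentage of overlapping genes (POG) between two DEG lists.

    Assumption: when the lists differ in length, the overlap is expressed
    relative to the shorter list.
    """
    set_a, set_b = set(list_a), set(list_b)
    if not set_a or not set_b:
        raise ValueError("both gene lists must be non-empty")
    shorter = min(len(set_a), len(set_b))
    return 100.0 * len(set_a & set_b) / shorter

# Invented DEG lists from two labs running the same comparison.
lab1_degs = ["g1", "g2", "g3", "g4"]
lab2_degs = ["g2", "g3", "g5", "g6"]

print(pog(lab1_degs, lab2_degs))  # prints 50.0
```

Computing POG at several list lengths (top 10, top 50, top 100 genes, and so on) shows how reproducibility depends on how many genes are selected.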
How to determine DEGs - Do we really know what we don’t know?
• A struggle between reproducibility and specificity/sensitivity
  • A monotonic relationship between specificity and sensitivity
  • A “???” relationship between reproducibility and specificity/sensitivity
More on Reproducibility
• General impressions (conclusions):
  • Reproducibility is a complicated phenomenon
  • No straightforward way to assess the reproducibility of DEGs
• Reproducibility and statistical power
  • More samples, higher reproducibility
• Reproducibility and statistical significance
  • Inverse relationship, but not a simple trade-off
• Reproducibility and the gene list length
  • A complex relationship with the length of the DEG list
• Irreproducible does not mean biologically irrelevant
  • If the DEGs from two replicated studies are not reproducible, both could still be true discoveries
MicroArray Quality Control (MAQC) project – Phase II
• MAQC-I: Technical Performance
  • Reliability of microarray technology
  • Cross-platform consistency
  • Reproducibility of microarray results
• MAQC-II: Practical Application
  • Molecular signatures (or classifiers) for risk assessment and clinical application
  • Reliability, cross-platform consistency and reproducibility
  • Develop guidance and recommendations
[Timeline: Feb 2005, MAQC-I (137 scientists from 51 organizations), Sept 2006, MAQC-II (>400 scientists from >150 organizations), Dec 2008]
Application of Predictive Signature
[Diagram: Clinical application (Pharmacogenomics): diagnosis, prognosis, treatment outcome; treatment leading to long-term effect. Safety assessment (Toxicogenomics): short-term exposure leading to long-term effect; prediction, phenotypic anchoring]
Challenge 1
[Diagram: modeling pipeline, with the open questions at each step]
• Data Set QC: batch effect; which QC methods
• Normalization: e.g., raw data, MAS5, RMA, dChip, PLIER
• Preprocessing: how to generate an initial gene pool for modeling
• Feature Selection: P, FC, p(FC), FC(p) …
• Classifier: which methods: KNN, NC, SVM, DT, PLS …
• Validation: how to assess the success
  • Chemical based prediction
  • Animal based prediction
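To make the feature selection, classifier, and validation steps listed above concrete, here is a minimal sketch that selects features on the training set only, fits a nearest-centroid (NC) classifier, and scores it on held-out samples. It assumes two classes and already-normalized log2 intensities; the mean-difference feature ranking is a simple stand-in for the FC-style criteria named above, and every function name and data value is hypothetical.

```python
import math

def centroids(samples, labels):
    """Mean expression profile per class (the nearest-centroid model)."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def top_features_by_mean_difference(samples, labels, k):
    """Rank features by |difference of class means|; assumes two classes."""
    model = centroids(samples, labels)
    class_a, class_b = sorted(model)
    diffs = [abs(u - v) for u, v in zip(model[class_a], model[class_b])]
    return sorted(range(len(diffs)), key=lambda i: diffs[i], reverse=True)[:k]

def project(x, idx):
    """Keep only the selected feature columns of one sample."""
    return [x[i] for i in idx]

def predict(model, x):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda y: math.dist(model[y], x))

# Invented log2 intensities: rows are samples, columns are three genes.
train_x = [[8.0, 5.1, 7.0], [8.2, 4.9, 7.1], [5.0, 5.0, 9.0], [5.2, 5.2, 9.2]]
train_y = ["tox", "tox", "ctrl", "ctrl"]
test_x = [[7.9, 5.0, 7.2], [5.1, 5.3, 8.8]]
test_y = ["tox", "ctrl"]

# Feature selection and training use the training set only.
idx = top_features_by_mean_difference(train_x, train_y, k=2)
model = centroids([project(x, idx) for x in train_x], train_y)

# Validation: predict held-out samples with the frozen features and model.
preds = [predict(model, project(x, idx)) for x in test_x]
accuracy = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
print(idx, preds, accuracy)
```

Selecting features before splitting off the test samples would leak information into the validation step, which is why the selection here only ever sees `train_x`.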
Challenge 2: Assessing the Performance of a Classifier
• Prediction Accuracy: sensitivity, specificity
• Robustness: reproducibility of signatures
• Mechanistic Relevance: biological understanding
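Of the three criteria above, prediction accuracy is the most direct to compute. A minimal sketch, assuming binary labels where `1` marks the positive class; the function name and the example labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # true positive
            else:
                fn += 1  # false negative
        else:
            if pred == positive:
                fp += 1  # false positive
            else:
                tn += 1  # true negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)

# Invented labels for eight blind-test samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # prints 0.75 0.75
```

Reporting both numbers matters for imbalanced endpoints, where overall accuracy can look high even if one class is rarely predicted correctly.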
[Diagram: modeling pipeline and how each part was handled]
• Dataset, Set QC: a consensus approach (12 teams)
• Normalization, Preprocessing, Feature Selection, Classifier: freedom of choice (35 analysis teams)
• Validation: validation, validation and validation!
What We Are Looking For
[Diagram: Dataset, Set QC, Normalization, Preprocessing, Feature Selection, Classifier, Validation]
• Which factors (or parameters) are critical to the performance of a classifier
• A standard procedure to determine these factors
  • The procedure should be dataset independent
• A best practice that could be used as guidance for developing microarray-based classifiers
Three-Step Approach
• Step 1: Training set
• Step 2: Blind test set
• Step 3: Future sets (new exp for selected endpoints)
[Diagram labels: Classifiers, Sig. genes, DAPs; Frozen; Prediction; Assessment; Best Practice; Validate the Best Practice]
MAQC-II Data Sets
[Table: clinical data and toxicogenomics data sets]
Where We Are
• Step 1: Training set
• Step 2: Blind test set
• Step 3: Future sets (new exp for selected endpoints)
[Diagram labels: Classifiers, Sig. genes, DAPs; Frozen; Prediction; Assessment; Best Practice; Validate the Best Practice]
18 Proposed Manuscripts
[Diagram: Dataset, Set QC, Normalization, Preprocessing, Feature Selection, Classifier, Validation; assessed for Prediction Accuracy, Robustness, Mechanistic Relevance]
• Main manuscript - Study design and main findings
• Assessing Modeling Factors (4 proposals)
• Prediction Confidence (5 proposals)
• Robustness (3 proposals)
• Mechanistic Relevance (2 proposals)
• Consensus Document (3 proposals)
Consensus Document (3 proposals)
• Principles of classifier development: Standard Operating Procedures (SOPs)
• Good Clinical Practice (GCP) in using microarray gene expression data
• MAQC, VXDS and FDA guidance on genomics
[Diagram labels: Modeling, Assessing, Consensus, Guidance]
Best Practice Document
• One of the VGDS and MAQC objectives is to communicate with private industry and the research community to reach consensus on:
  • How to exchange genomic data (data submission)
  • How to analyze genomic data
  • How to interpret genomic data
• Lessons learned from VGDS and MAQC have led to the development of a Best Practice Document (led by Federico Goodsaid)
  • Companion to Guidance for Industry on Pharmacogenomic Data Submission (Docket No. 2007D-0310) (http://www.fda.gov/cder/genomics/conceptpaper_20061107.pdf)
  • More than 10 pharmaceutical companies have provided comments
An Array of FDA Endeavors - Integrated Nature of VGDS, ArrayTrack, MAQC and Best Practice Document
[Diagram: Best Practice Document, MAQC, VGDS, ArrayTrack]