Identifying Changes in Signaling from High-Throughput Data
Michael Ochs, Fox Chase Cancer Center
The “New” Paradigm: Targeted Therapies, Personalized Medicine
[Figure: overall survival (years, 0 to 10) for Group 1 and Group 2 patients; caption "Your Chromosomes Here"]
Outline
• Signaling and Gene Expression
• Bayesian Decomposition
• Examples of Analyses
Cellular Signaling
[Figure labels: Extracellular Signal, Signal Transduction, Metabolic Changes, Transcription] (Downward, Nature, 411, 759, 2001)
Gene Expression
Identifying Pathways
[Figure: diagram with elements labeled A, B, C, D, E, F, H, M]
Goal of Analysis
• Take measurements of thousands of genes, some of which are responding to stimuli of interest
• Find the correct set of basis vectors that link to pathways
• Then identify the pathways
Biological Model
Blocking a protein-protein interaction leads to loss of some transcripts and reduction of others, depending on the active signaling pathways.
But the gene lists are incomplete, as are the network diagrams!
Issues to Solve
• Overlapping Signals
  • Genes are involved in multiple processes
  • Various processes are active simultaneously in any observed data
• Identification of Process Behind Signal
  • If we find a signal, what is its cause?
  • Do the identification without a complete model
Outline
• Signaling and Gene Expression
• Bayesian Decomposition
• Examples of Analyses
Data
[Figure: expression data shown as a matrix of measurements] (Spellman et al, Mol Biol Cell, 9, 3273, 1999) (Cho et al, Mol Cell, 2, 65, 1998)
BD: Identification of Signals
Complex behavior is explained as combinations of simpler behaviors.
Data (gene 1 to gene N by condition 1 to condition M, vs Mock) = Distribution of Patterns (gene 1 to gene N by pattern 1 to pattern k) X Patterns of Behavior (pattern 1 to pattern k by condition 1 to condition M)
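The decomposition on this slide can be illustrated with a toy forward model: given a gene-by-pattern matrix and a pattern-by-condition matrix, their product gives the data matrix. This is only a sketch; the numbers, the sizes (3 genes, 2 patterns, 4 conditions), and the helper name `matmul` are made up for illustration. Bayesian Decomposition solves the harder inverse problem of recovering both factors from the data.

```python
# Toy illustration of Data = Distribution of Patterns X Patterns of Behavior.
# All values are invented for illustration; they are not from the talk.

def matmul(a, p):
    """Multiply a (genes x patterns) by p (patterns x conditions)."""
    n_patterns = len(p)
    n_conditions = len(p[0])
    return [
        [sum(row[k] * p[k][m] for k in range(n_patterns)) for m in range(n_conditions)]
        for row in a
    ]

# Patterns of Behavior: k = 2 patterns across M = 4 conditions.
patterns = [
    [1.0, 2.0, 1.0, 0.0],  # pattern 1: early response
    [0.0, 0.0, 1.0, 2.0],  # pattern 2: late response
]

# Distribution of Patterns: N = 3 genes, one weight per pattern.
distribution = [
    [1.0, 0.0],  # gene 1 follows pattern 1 only
    [0.0, 1.0],  # gene 2 follows pattern 2 only
    [0.5, 0.5],  # gene 3 takes part in both processes
]

data = matmul(distribution, patterns)
for gene, row in enumerate(data, start=1):
    print(f"gene {gene}: {row}")
```

Gene 3 shows the overlapping-signal case: its observed profile is a mixture of both patterns, so it would not sit cleanly in either group under a one-gene-one-cluster method.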
Markov Chain Monte Carlo
• We cannot always solve the problem directly; we can only estimate the relative probabilities of possible solutions
• Markov Chain Monte Carlo is used to explore the possible solutions
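As a rough illustration of exploring solutions using only relative probabilities, here is a minimal Metropolis sampler. It is a generic sketch, not the sampler used by Bayesian Decomposition (the slides do not specify it). The one-dimensional target `unnormalized_density`, the step size, and the seed are assumptions chosen for the example.

```python
import math
import random

def unnormalized_density(x):
    """Relative probability of a candidate solution x (toy target with a standard normal shape)."""
    return math.exp(-0.5 * x * x)

def metropolis(n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler over a single real-valued parameter."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step_size, step_size)
        # Only the ratio of probabilities is needed, never the normalizing constant.
        accept_prob = min(1.0, unnormalized_density(proposal) / unnormalized_density(x))
        if rng.random() < accept_prob:
            x = proposal
        samples.append(x)
    return samples

samples = metropolis(20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean {mean:.2f}, sample variance {var:.2f}")
```

The chain visits solutions in proportion to their probability, so summary statistics of the samples approximate those of the target distribution.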
Bayesian Statistics
p(model | data) = p(data | model) p(model) / p(data)
[Figure: the data matrix (gene 1 to gene N by condition 1 to condition M) written as a gene-by-pattern matrix X a pattern-by-condition matrix, for pattern 1 to pattern k]
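A small worked example of Bayes' rule over two candidate models may help. The model names, priors, and likelihood values below are invented for illustration; `posterior` is a hypothetical helper, not part of any Bayesian Decomposition software.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule over a finite set of candidate models."""
    joint = {m: likelihoods[m] * priors[m] for m in priors}  # p(data | model) p(model)
    evidence = sum(joint.values())                           # p(data)
    return {m: joint[m] / evidence for m in joint}           # p(model | data)

# Two equally likely models a priori; the data are four times more likely under model_A.
priors = {"model_A": 0.5, "model_B": 0.5}
likelihoods = {"model_A": 0.2, "model_B": 0.05}

post = posterior(priors, likelihoods)
print(post)
```

Here p(data) = 0.2 × 0.5 + 0.05 × 0.5 = 0.125, so p(model_A | data) = 0.1 / 0.125 = 0.8.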
Outline
• Signaling and Gene Expression
• Bayesian Decomposition
• Examples of Analyses
Acknowledgements
• Tom Moloshok (Cell Cycle, Mouse)
• Ghislain Bidaut (Yeast Deletion Mutants)
• Andrew Kossenkov (TFs, YDMs)
• Bill Speier, DJ Datta, Daniel Chung, Ryan Goldstein, Matt Lewandowski
Cell Cycle
[Figure] (Tobin and Morel, Asking About Cells, Harcourt Brace, 1997)
Data
• Data: Expression data for 788 yeast cell-cycle-regulated genes [Cho, 1998] across 17 time points were analyzed.
• Coregulation: 11 groups of 5 to 17 genes each (67 genes in total, 18 of which belong to more than one group) were assembled from a literature review (not cell cycle literature).
• Analysis: performed with and without coregulation information
Validation (Cherepinsky et al, PNAS, 100, 9668, 2003)
ROC Analysis
ROC: Receiver Operating Characteristic
• Sensitivity = TP / (TP + FN): fraction of actual positives that are called positive
• Specificity = TN / (TN + FP): fraction of actual negatives that are called negative
• TP: true positive; TN: true negative; FP: false positive; FN: false negative
• The curve plots Sensitivity against 1 - Specificity
• Area under the curve is the measurement of algorithm efficacy
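The ROC quantities can be sketched in a few lines. The scores and labels below are invented, and the helper names `roc_points` and `auc` are hypothetical. The sketch assumes all scores are distinct and that both classes are present.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, sweeping the threshold from high to low."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label:
            tp += 1  # a true positive is called at this threshold
        else:
            fp += 1  # a false positive is called at this threshold
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Higher score = stronger call that the item is a positive; label 1 = actual positive.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
print(f"AUC = {area:.3f}")
```

An area of 1.0 means every positive outranks every negative; 0.5 is what random ranking would give. In this toy case 8 of the 9 positive-negative pairs are ordered correctly, so the area is 8/9.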
Hierarchical Clustering ROC Curve
[Figure: ROC curve] (Cherepinsky et al, PNAS, 100, 9668, 2003)
Bayesian Decomposition
[Figure: Sensitivity plotted against 1 - Specificity]
Deletion Mutant Data Set (Hughes et al, Cell, 102, 109, 2000)
• 300 Deletion Mutants in S. cerevisiae
• Biological/Technical Replicates with Gene Specific Error Model
• Filter Genes
  • >25% Data Missing in Ratios or Uncertainties
  • < 2 Experiments with 3 Fold Change
• Filter Experiments
  • < 2 Genes Changing by 3 Fold
Remaining after filtering: 228 Experiments/764 Genes
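The filtering criteria above could be applied roughly as follows. This is a sketch under several assumptions that the slide does not state: values are positive expression ratios, missing data are stored as `None` or NaN, only missing ratios are checked (not missing uncertainties), a change of exactly 3-fold counts, and genes are filtered before experiments. The function names are hypothetical.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing measurements."""
    return value is None or math.isnan(value)

def fold_change(ratio):
    """Fold change in either direction: a ratio of 1/3 counts as a 3-fold change."""
    return max(ratio, 1.0 / ratio)

def filter_matrix(ratios, max_missing=0.25, fold=3.0, min_hits=2):
    """ratios[g][e] is the expression ratio of gene g in experiment e."""
    n_exp = len(ratios[0])

    def changed(value):
        return not is_missing(value) and fold_change(value) >= fold

    # Drop genes with too much missing data or too few strong changes.
    keep_genes = [
        g for g, row in enumerate(ratios)
        if sum(is_missing(v) for v in row) / n_exp <= max_missing
        and sum(changed(v) for v in row) >= min_hits
    ]
    # Drop experiments in which too few of the remaining genes change.
    keep_exps = [
        e for e in range(n_exp)
        if sum(changed(ratios[g][e]) for g in keep_genes) >= min_hits
    ]
    return keep_genes, keep_exps

ratios = [
    [3.5, 0.2, 1.0, 1.1],    # gene 0: changes 3-fold in two experiments
    [4.0, 5.0, 1.0, 0.9],    # gene 1: changes 3-fold in two experiments
    [1.0, 1.2, 0.9, 1.1],    # gene 2: never changes 3-fold
    [None, None, 3.0, 3.0],  # gene 3: 50% of the data missing
]
keep_genes, keep_exps = filter_matrix(ratios)
print(keep_genes, keep_exps)
```

In this toy matrix genes 0 and 1 survive, and only the first two experiments have at least two of those genes changing by 3-fold.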
BD: Matrix Decomposition
Data (gene 1 to gene N by Mutant 1 to Mutant M) = Distribution of Patterns (gene 1 to gene N by pattern 1 to pattern k) X Patterns of Behavior (pattern 1 to pattern k by Mutant 1 to Mutant M)
• Distribution of Patterns: what genes are in patterns
• Patterns of Behavior: does mutant contain pattern
Analysis
• Bayesian Decomposition
  • Identify patterns and linked genes
  • Use genes to determine function
• Interpretation of Functions
  • Gene Ontology
  • Transcription factor data
• Validation
Use of Ontology: Pattern 13
[Figure; labels: 13, 15]
The Other Pattern: 15
[Figure; labels: 13, 15]
Signaling Pathways to Transcription Factors to mRNA Changes
Genes from Pattern 13
*Fig1, *Prm6, *Fus1, *Ste2, *Aga1, *Fus3, Pes4, *Prm1, ORF, *Bar1
Legend: * known to be involved in mating response; a second marker (not preserved) flags genes known to be regulated by Ste12p
Validation
Amount of Behavior Explained by Mating Pathway for Mutants (Posas et al, Curr Opin Microbiology, 1, 175, 1998)
Pattern 13 Mutants
Pattern 15 Mutants
Conclusions
• Transcriptional Response Provides Signatures of Pathway Activity
• Ontologies Can Guide Interpretation
• Bayesian Decomposition Can Dissect Strongly Overlapping Signatures
Acknowledgements
• Tom Moloshok, Jeffrey Grant, Yue Zhang, Elizabeth Goralczyk, Liat Shimoni, Luke Somers (UPenn), Olga Tchuvatkina, Michael Slifker, Sinoula Apostolou, Brendan Reilly
• Collaborators: A. Godwin (FCCC), A. Favorov (GosNIIGenetika), J.-M. Claverie (CNRS), G. Parmigiani (JHU), O. Favorova (RMSU)
• Fox Chase: Ghislain Bidaut (UPenn CBIL), Andrew Kossenkov, Vladimir Minayev (MPEI), Garo Toby (Dana Farber), Yan Zhou, Aidan Petersen, Bill Speier (Johns Hopkins), Daniel Chung (Columbia), DJ Datta (UCSF), Elizabeth Faulkner (UPenn), Frank Manion, Bob Beck
Patterns as Basis Vectors
[Figure panels: Fuzzy Clustering, PCA, BD]
Making Proteins (Phenotype)
ROSETTA DATA
• From 5 to 20 patterns were posited in the analysis.
• Results were checked against metabolic pathway information from the Saccharomyces Genome Database: 11 groups of 4 to 6 genes known to be involved in the same metabolic pathways.
• ROC analysis was performed.
ROSETTA DATA