Ming-Chih Kao, PhD University of Michigan Medical School mckao@med.umich

Integrating Cross-Platform Microarray Data by Second-order Analysis: Functional Annotation and Network Reconstruction. Ming-Chih Kao, PhD University of Michigan Medical School mckao@med.umich.edu. Wing Hung Wong Professor of Statistics and of Health Research and Policy Stanford University.


Presentation Transcript


  1. Integrating Cross-Platform Microarray Data by Second-order Analysis: Functional Annotation and Network Reconstruction Ming-Chih Kao, PhD University of Michigan Medical School mckao@med.umich.edu

  2. Wing Hung Wong, Professor of Statistics and of Health Research and Policy, Stanford University; Xianghong Jasmine Zhou, Assistant Professor of Biological Sciences, USC

  3. 2nd-Order Analysis: Current Challenges in Microarray Data Analysis
     • How to effectively combine expression data sets generated with different technology/laboratory platforms?
     • How to identify functionally related genes that lack a co-expression pattern?
     • How to identify transcription cascades?

  4. 2nd-Order Analysis: Multiple Microarray Technology Platforms [Figure: microarray platforms]

  5. 2nd-Order Analysis: Public Microarray Data Sources

  6. [Figure: diagram of Transcription Factors 1, 2, and 3 and genes gene1 to gene7, with two "?" marks; labeled "Amplification of signal"]

  7. [Figure: panels plotting expression and expression correlation for gene pairs G1-G2 and G3-G4 across experimental groups (Copper, Zinc, Cell Cycle, Stress, Osmotic, Starvation); labels: "First-order correlation", "Second-order Correlation"]
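
One possible reading of the two correlation levels named on this slide, as a minimal Python sketch. It assumes that first-order correlation is the Pearson correlation of two genes within one data set, and that second-order correlation is the Pearson correlation between two gene pairs' first-order profiles across data sets. The function names, gene indices, and toy data are hypothetical and do not come from the presentation.

```python
import numpy as np

def first_order_profile(datasets, gene_a, gene_b):
    """Pearson correlation of two genes, computed separately in each data set.

    Each data set is an array of shape (n_genes, n_conditions); the number of
    conditions may differ between data sets, since no values are pooled.
    """
    return np.array([np.corrcoef(d[gene_a], d[gene_b])[0, 1] for d in datasets])

def second_order_correlation(datasets, pair_1, pair_2):
    """Pearson correlation between the first-order profiles of two gene pairs."""
    profile_1 = first_order_profile(datasets, *pair_1)
    profile_2 = first_order_profile(datasets, *pair_2)
    return float(np.corrcoef(profile_1, profile_2)[0, 1])

# Toy data (assumption): four data sets of different sizes. Genes 0-1 and
# genes 2-3 are each positively correlated in data sets 1 and 3 and negatively
# correlated in data sets 2 and 4. Genes 0 and 2 are drawn independently.
rng = np.random.default_rng(0)
toy_datasets = []
for n_conditions, sign in [(8, 1.0), (12, -1.0), (10, 1.0), (15, -1.0)]:
    g0 = rng.normal(size=n_conditions)
    g2 = rng.normal(size=n_conditions)
    toy_datasets.append(np.vstack([g0, sign * g0, g2, sign * g2]))

print(first_order_profile(toy_datasets, 0, 1))
print(second_order_correlation(toy_datasets, (0, 1), (2, 3)))
```

In this toy setup the two pairs switch their correlation on and off in the same data sets, so their second-order correlation is close to 1 even though genes 0 and 2 were not constructed to be co-expressed. Because each data set contributes only a correlation value, arrays from different platforms never need to be placed on a common expression scale.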

  8. 2nd-Order Analysis: An Example. Regulation of Cell Cycle: POG1-MPT5 and SDA1-CDC5 [Figure: panels "Expression of POG1-MPT5", "Expression of SDA1-CDC5", and "Expression Correlation POG1-MPT5, SDA1-CDC5" across experimental groups; group labels as extracted: Heat Steady Chromatin Silencing Amino acid Starvation Gamma Radiation Protein Metabolism DNA Damage]

  9. 2nd-Order Analysis: Validation
     • Can the method group functionally related genes that may not exhibit similar expression patterns?
     • Data
       • Stanford Microarray Database (cDNA array)
       • NCBI GEO Database (Affymetrix array)
       • Rosetta Compendium (cDNA array)
     39 experimental groups subjected to different types of perturbations, such as cell cycle, heat shock, osmotic pressure, starvation, zinc, nitrogen depletion, etc.

  10. 2nd-Order Analysis: Validation: Scheme
     • 43 functional classes
     • 2,429 genes
     • 5,142 doublets
     • 278,799 quadruplets
       • Homogeneous quadruplets: 84%
       • Heterogeneous quadruplets: 16%
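
The homogeneous/heterogeneous split on this slide can be illustrated with a small Python sketch. It assumes that a doublet is a gene pair annotated to one functional class, that a quadruplet is a pair of doublets, and that a quadruplet counts as homogeneous when both doublets share the same class. These definitions are not spelled out in the transcript, and the gene names and toy data below are hypothetical.

```python
def homogeneous_fraction(quadruplets, doublet_class):
    """Fraction of quadruplets whose two doublets share a functional class.

    quadruplets   -- list of (doublet, doublet) tuples
    doublet_class -- dict mapping each doublet to its functional class
    """
    if not quadruplets:
        raise ValueError("need at least one quadruplet")
    same = sum(
        1 for d1, d2 in quadruplets if doublet_class[d1] == doublet_class[d2]
    )
    return same / len(quadruplets)

# Hypothetical doublets and their functional classes.
toy_doublet_class = {
    ("g1", "g2"): "mitosis",
    ("g3", "g4"): "mitosis",
    ("g5", "g6"): "cation transport",
    ("g7", "g8"): "cation transport",
    ("g9", "g10"): "mitosis",
}

# Hypothetical quadruplets: three homogeneous, one heterogeneous.
toy_quadruplets = [
    (("g1", "g2"), ("g3", "g4")),
    (("g1", "g2"), ("g9", "g10")),
    (("g5", "g6"), ("g7", "g8")),
    (("g1", "g2"), ("g5", "g6")),
]

print(homogeneous_fraction(toy_quadruplets, toy_doublet_class))
```

Under these assumptions, the 84% on the slide would be this fraction computed over the 278,799 quadruplets.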

  11. 2nd-Order Analysis: Validation: Comparison

  12. 2nd-Order Analysis: Validation: Results
     • 2nd-order analysis groups functionally related genes
     • The derived quadruplets give rise to a set of 2,597 distinct and novel gene pairs
     • 97% of the 2,597 pairs are missed by the standard methods
     • Reasons for the poor performance of the 1st-order method
       • Inter-dataset variations
       • Cross-doublet gene pairs need not show high expression correlation
       • Sensitivity to gene pairs which are only co-expressed in a subset of the data sets

  13. [Figure: six panels of diagrams with nodes labeled a to f, one per experimental group: Heat shock, Starvation, Cell Cycle, Osmotic pressure, Radiation, Nitrogen Depletion]

  14. 2nd-Order Analysis: Interaction Modules

  15. 2nd-Order Analysis: Interaction Modules

  16. 2nd-Order Analysis: Interaction Modules: Leave-one-out Cross Validation
     • For each gene occurring in the 100 tightest and most stable clusters of known genes, we masked its function, made a prediction using our 2-step procedure, and compared the predicted function with the true function.
     • We made predictions for 179 doublets, of which 163 were correct: a 91% success ratio
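
The leave-one-out loop described on this slide can be sketched in Python as follows. The presentation's 2-step prediction procedure is not described in the transcript, so a simple majority vote among the other annotated genes in the same cluster stands in for it here; that predictor, the function names, and the toy clusters are all assumptions for illustration only.

```python
from collections import Counter

def predict_by_majority(cluster, annotations, masked_gene):
    """Stand-in predictor (assumption): most common function among the other
    annotated genes of the cluster; None if no other gene is annotated."""
    votes = Counter(
        annotations[g] for g in cluster if g != masked_gene and g in annotations
    )
    return votes.most_common(1)[0][0] if votes else None

def leave_one_out(clusters, annotations):
    """Mask each annotated gene in turn, predict it, and count correct calls."""
    correct = total = 0
    for cluster in clusters:
        for gene in cluster:
            if gene not in annotations:
                continue
            predicted = predict_by_majority(cluster, annotations, gene)
            if predicted is None:
                continue
            total += 1
            correct += int(predicted == annotations[gene])
    return correct, total

# Hypothetical clusters and annotations.
toy_clusters = [["a", "b", "c"], ["d", "e", "f", "g"]]
toy_annotations = {
    "a": "mitosis", "b": "mitosis", "c": "mitosis",
    "d": "cation transport", "e": "cation transport",
    "f": "cation transport", "g": "mitosis",
}

correct, total = leave_one_out(toy_clusters, toy_annotations)
print(correct, total)

# The success ratio reported on the slide: 163 correct out of 179 predictions.
print(round(100 * 163 / 179))
```

In the toy data only gene "g" is mispredicted, because its cluster mates all carry a different annotation.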

  17. 2nd-Order Analysis: Interaction Modules: Functional Prediction
     • 79 functions of 69 unknown yeast genes involved in diverse biological processes
     • Experimental studies in the literature and in our laboratory
       • YLR183C in “mitosis”
         • Regulation of G1/S transition
       • YLL051C in “cation transport”
         • Ferric-chelate reductase activity and iron-regulated expression

  18. 2nd-Order Analysis: Frequently Occurring Tight Clusters [Figure label: Transcription Factors]

  19. 2nd-Order Analysis: Frequently Occurring TCs with 2nd-Order Correlation

  20. Cooperativity [Figure: Transcription Factor Set 1 and Transcription Factor Set 2]

  21. 3 types of transcription cascades

  22. 2nd-Order Analysis: ChIP-Chip

  23. 2nd-Order Analysis: Transcription Module Results
     • 60 transcription modules identified
     • 34 pairs showed high 2nd-order correlation
     • 29% (P < 10^-5) of those module pairs are participants in transcription cascades
     • 2 pairs in Type I cascades
     • 8 pairs in Type II cascades
     • 3 pairs in Type III cascades
     • These transcription cascades interconnect into a partial cellular regulatory network

  24. 2nd-Order Analysis: Leu3 and Met4 Transcription Cascade [Figure: two panels for the Leu3 module vs. the Met4 module, "Avg. Expression" and "Avg. Expression Correlation", each on an axis from -1.0 to 1.0]

  25. 2nd-Order Analysis: Hierarchical clustering of transcriptional modules

  26. 2nd-Order Analysis: Assigning transcription factor to pathways
     For an unknown transcription factor in a module cluster, we can annotate its function by integrating 2 types of evidence:
     • the functions of known genes in its target module
     • the functions of known transcription factors regulating other modules in the same cluster

  27. 2nd-Order Analysis: Summary
     We developed a framework to integrate many microarray data sets in a platform-independent way, and investigated its properties and applications.
     • Group together functionally related genes without direct expression similarity
     • Cluster the functional interactions into modules and functionally annotate unknown genes
     • Reveal the cooperativity in the regulatory network and reconstruct transcription cascades
