
Introduction

HUMAN-MOUSE CONSERVED COEXPRESSION NETWORKS PREDICT CANDIDATE DISEASE GENES. Ala U., Piro R., Grassi E., Damasco C., Silengo L., Brunner H., Provero P. and Di Cunto F. (ugo.ala@unito.it), Molecular Biotechnology Center, University of Torino.


Presentation Transcript


  1. HUMAN-MOUSE CONSERVED COEXPRESSION NETWORKS PREDICT CANDIDATE DISEASE GENES
     Ala U., Piro R., Grassi E., Damasco C., Silengo L., Brunner H., Provero P. and Di Cunto F.
     ugo.ala@unito.it
     Molecular Biotechnology Center, University of Torino

  2. Introduction
     • Massive repositories of gene expression data obtained with microarray technology represent an extremely rich source of biological information.
     • Since genes involved in the same functions tend to show very similar expression profiles, co-expression analysis of these datasets could be a very powerful approach for inferring functional relationships among genes and for predicting the involvement of specific sequences in human genetic diseases.
     • However, gene co-expression has so far not proved to be a particularly useful criterion for disease gene identification.

  3. Functional relationships inferred on the basis of co-expression in a single species contain a large majority of false positive predictions.
     Reasons:
     • Microarray data are noisy
     • Many genes showing very similar expression profiles are not functionally related (Spellman et al, 2002)

  4. A powerful help: phylogenetic conservation
     Since gene regulatory regions evolve faster than coding regions, two genes whose co-expression is evolutionarily conserved are much more likely to be functionally related. The confidence level increases with the phylogenetic distance between the species compared.
     [Figure: a gene co-expression network constructed with expression data from distant species (H. sapiens, C. elegans, D. melanogaster, S. cerevisiae) (Stuart et al, 2003)]

  5. A powerful help: phylogenetic conservation
     Human-mouse conserved co-expression represents an excellent compromise between sensitivity and specificity for predicting functional relationships among mammalian genes (Pellegrino et al, 2004)

  6. Construction of human-mouse conserved coexpression networks for disease gene prediction
     Step one: single-species networks (built separately for Mus musculus and Homo sapiens)
     • Start from single-species datasets of microarray experiments, based on probes which can be linked to Entrez Gene IDs
     • Evaluate the correlation of gene expression profiles among all the probes by Pearson's coefficient
     • Link every probe with the probes which are in the first percentile of its ranked list
     • Merge links between probes by Entrez Gene identifiers
     Result: human gene co-expression networks (H-GCN) and mouse gene co-expression networks (M-GCN)
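The single-species step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: "first percentile" is read as the top 1% of each probe's list ranked by descending Pearson correlation, links are treated as undirected, and the names `single_species_network`, `profiles`, `probe_to_gene` and `top_fraction` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def single_species_network(profiles, probe_to_gene, top_fraction=0.01):
    """Build a gene-level co-expression network for one species.

    profiles:      probe ID -> list of expression values (one per experiment)
    probe_to_gene: probe ID -> Entrez Gene ID
    Returns a set of undirected gene-gene edges (frozensets of two gene IDs).
    Assumption: each probe is linked to the top `top_fraction` of the other
    probes, ranked by descending Pearson correlation (at least one probe).
    """
    probes = list(profiles)
    edges = set()
    for p in probes:
        others = [q for q in probes if q != p]
        ranked = sorted(
            others,
            key=lambda q: pearson(profiles[p], profiles[q]),
            reverse=True,
        )
        k = max(1, int(len(ranked) * top_fraction))
        for q in ranked[:k]:
            gene_p, gene_q = probe_to_gene[p], probe_to_gene[q]
            if gene_p != gene_q:  # merging probes by gene drops self-links
                edges.add(frozenset((gene_p, gene_q)))
    return edges


# Toy usage: four probes, two of which (p2, p4) map to the same gene.
toy_profiles = {
    "p1": [1, 2, 3, 4],
    "p2": [2, 4, 6, 8.5],
    "p3": [4, 3, 2, 1],
    "p4": [1, 3, 2, 5],
}
toy_mapping = {"p1": "G1", "p2": "G2", "p3": "G3", "p4": "G2"}
toy_network = single_species_network(toy_profiles, toy_mapping)
```

The all-against-all ranking is quadratic in the number of probes, so a dataset of the size quoted in this presentation would need a vectorised correlation matrix rather than this loop.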

  7. Construction of human-mouse conserved coexpression networks for disease gene prediction
     Step two: human-mouse networks
     • Inputs: human gene co-expression networks (H-GCN) and mouse gene co-expression networks (M-GCN)
     • Select the links found in both co-expression networks, matching genes across species according to HomoloGene
     Result: human-mouse co-expression network
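A minimal sketch of the conservation filter, assuming each network is a set of undirected gene pairs and that orthology is available as a simple one-to-one mouse-to-human dictionary. The dictionary stands in for a HomoloGene lookup; the function and parameter names are hypothetical.

```python
def conserved_network(human_edges, mouse_edges, mouse_to_human):
    """Keep only the human links whose orthologous mouse link also exists.

    human_edges, mouse_edges: sets of frozensets, each holding two gene IDs
    mouse_to_human: mouse gene ID -> human ortholog ID (assumed one-to-one)
    """
    mapped = set()
    for edge in mouse_edges:
        if len(edge) != 2:
            continue  # ignore malformed edges and self-links
        a, b = tuple(edge)
        human_a = mouse_to_human.get(a)
        human_b = mouse_to_human.get(b)
        # Skip mouse genes without a known human ortholog.
        if human_a is None or human_b is None or human_a == human_b:
            continue
        mapped.add(frozenset((human_a, human_b)))
    return human_edges & mapped


# Toy usage: only the H1-H2 link is supported in both species.
human = {frozenset(("H1", "H2")), frozenset(("H2", "H3"))}
mouse = {frozenset(("m1", "m2")), frozenset(("m3", "m4"))}
orthologs = {"m1": "H1", "m2": "H2", "m3": "H3"}  # m4 has no ortholog here
conserved = conserved_network(human, mouse, orthologs)
```

Genes with several orthologs would need a one-to-many mapping; the sketch leaves that case out.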

  8. Conserved co-expression networks: Data retrieval
     Experiments based on cDNA platforms and performed mostly on tumor cell lines:
     • 4129 experiments for 102296 EST probes for human
     • 467 experiments for 80595 EST probes for mouse
     Experiments based on Affymetrix platforms and performed on normal tissues:
     • 353 experiments for 46241 probesets for human (Roth et al, 2006)
     • 122 experiments for 19692 probesets for mouse (Su et al, 2004)

  9. Conserved co-expression networks: Results
     • 8512 nodes (genes); 56397 edges
     • 12766 nodes (genes); 155403 edges
     We concentrate our network analysis on co-expression clusters (CCs), each defined as the nearest neighbors of one node of a network, thus obtaining one CC for each gene.
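As an illustration, one CC per gene can be read directly off an edge list. In this sketch the central gene is not counted as a member of its own cluster, which is an assumption; the function name `coexpression_clusters` is hypothetical.

```python
from collections import defaultdict


def coexpression_clusters(edges):
    """Map each gene to its co-expression cluster (its direct neighbors).

    edges: iterable of frozensets, each holding two gene IDs.
    Assumption: a gene's CC does not include the gene itself.
    """
    clusters = defaultdict(set)
    for edge in edges:
        if len(edge) != 2:
            continue  # ignore malformed edges and self-links
        a, b = tuple(edge)
        clusters[a].add(b)
        clusters[b].add(a)
    return dict(clusters)


# Toy usage: gene A is linked to B and C.
toy_clusters = coexpression_clusters(
    {frozenset(("A", "B")), frozenset(("A", "C"))}
)
```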

  10. Conserved co-expression networks: Comparison with other networks
     [Figure: A-GCN, S-GCN and Random compared across in vivo, in vitro and yeast-two-hybrid interactions]
     Good protein-protein predictors: both networks exhibit a highly significant overlap with protein-protein interactions reported in the Human Protein Reference Database.

  11. Conserved co-expression networks: GO Analysis
     [Figure: A-CCN, S-CCN and Random]
     Good criterion to identify functionally related genes: A-CCN and S-CCN show a strong enrichment for functional annotation, compared with random permutations.

  12. Predicting human disease genes
     [Figure: A-CCN, S-CCN and Random]
     MimMiner (Van Driel et al, 2006), a text-mining database of phenotype similarity relationships, provides a very useful way to merge co-expression data with disease information. A-CCN and S-CCN also show a strong enrichment for OMIM IDs characterizing disease phenotypes.

  13. How the algorithm works (1)
     [Diagram: CCs (conserved co-expression clusters); OMIM locus (phenotype description)]

  14. How the algorithm works (2)
     [Diagram: CCs (conserved co-expression clusters); OMIM locus (phenotype description); DRCCs (disease-related co-expression clusters)]

  15. How the algorithm works (3)
     [Diagram: DRCCs (disease-related co-expression clusters); OMIM locus (phenotype description)]
     These genes become our candidate disease genes.
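The slides do not spell out the selection rule, so the following is only one possible reading, offered as a sketch. It assumes that a CC counts as disease-related (a DRCC) for a target phenotype when it contains at least `min_known` genes already known to cause phenotypes similar to the target, and that the candidates are the DRCC members lying inside the target's mapped locus. The threshold, the function name and every parameter name are hypothetical.

```python
def candidate_genes(clusters, locus_genes, similar_disease_genes, min_known=2):
    """Propose candidate genes for one phenotype mapped to a locus.

    clusters:              gene -> set of neighbor genes (one CC per gene)
    locus_genes:           genes located in the phenotype's mapped locus
    similar_disease_genes: genes known to cause phenotypes similar to the
                           target (for example, selected by phenotype
                           similarity)
    min_known:             assumed minimum number of such genes for a CC to
                           be treated as disease-related (hypothetical)
    """
    candidates = set()
    for centre, neighbours in clusters.items():
        cluster = neighbours | {centre}
        # Assumed DRCC criterion: enough known similar-disease genes inside.
        if len(cluster & similar_disease_genes) >= min_known:
            # Members of a DRCC that fall in the locus become candidates.
            candidates |= cluster & locus_genes
    return candidates


# Toy usage: only the cluster around A holds two similar-disease genes.
toy_clusters = {"A": {"B", "C", "X"}, "D": {"E", "Y"}}
toy_candidates = candidate_genes(
    toy_clusters,
    locus_genes={"X", "Y"},
    similar_disease_genes={"B", "C", "E"},
)
```

A stricter variant could require a statistical enrichment test instead of a fixed count; the fixed count is used here only to keep the example short.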

  16. Leave-one-out
     Leave-one-out cross-validation tests over all known disease genes have shown good performance.

  17. Predicting human disease genes: Results
     We applied our procedure to 850 OMIM phenotype entries with unknown molecular basis (but mapped to one or more genetic loci). This yielded 321 candidates, covering a set of 81 loci (65 from A-CCN, 6 from S-CCN and 10 from both networks).

  18. Examples and discussion of some candidates

  19. Conclusions
     Our approach, based on conserved co-expression analysis, has proved particularly successful in providing reliable predictions of potential disease-causing genes, because of two main factors:
     • the phylogenetic filter
     • the integration with quantitative phenotype correlation data
     In conclusion, we propose that our method and our list of candidates will provide useful support for the identification of new disease-causing genes.

  20. Our real network … Damasco C. Di Cunto F. Piro R. Ala U. Brunner H. Grassi E. Provero P. Silengo L.
