
Functional Linkages between Proteins


cerise

Presentation Transcript


  1. Functional Linkages between Proteins

  2. Introduction • E. coli, S. cerevisiae, Drosophila • Piles of Information: AGCATCCGACTAGCATCAGCTAGCAGCAGACTCACGATGTGACTGCATGCGTCATTATCTAGTATGAAAAAAGCCATGCTAGGCTAGTCAGCGACATGAGCCATGACTAGCGCAGCATCAGTCATCAGTCAGCGGAGCGAGGAGAGAGAGACGACTGACTAGCATGCACACATGCATGACGTCATGACTGCATGACTGACTGACTGACTGCATGCATGATATTTTTTTTTTCATGCATGCAGCATGCTACCCAGCTACAGTGCACAGCAGGTACGACGCATCAGCATACGTACGGCATGACGACTCAGACTACGCATACGACTACGAC • Flakes of Knowledge

  3. Data Analysis • Traditional Methods (Experiments & Sequence Homology) → The function of a protein • New Computational Methods → Functional linkages between proteins

  4. What does Functional Linkage mean? 1) A common structural complex 2) A common metabolic pathway 3) A common biological process 4) All answers are correct

  5. New Computational Methods • Phylogenetic Profile Method • Rosetta Stone Method • Chromosomal Proximity Method • COG Database

  6. Phylogenetic Profile Method

  7. Phylogenetic Profile Method: Why Should it Work? • Biologically: Similar profile → likelihood of a common pathway or complex • Mathematically: N genomes → 2^N possible profiles → A unique characterization
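
The profile idea on this slide can be illustrated with a minimal sketch. It assumes a profile is a vector of presence/absence flags, one per genome, and that proteins with identical profiles are candidates for a functional link. The protein names and presence data are hypothetical, and the exact-match grouping is a simplification rather than the method's actual similarity criterion.

```python
from collections import defaultdict


def build_profiles(presence):
    """Map each protein to a tuple of 0/1 flags, one flag per genome."""
    genomes = sorted({g for found_in in presence.values() for g in found_in})
    profiles = {
        protein: tuple(1 if g in found_in else 0 for g in genomes)
        for protein, found_in in presence.items()
    }
    return genomes, profiles


def group_by_profile(profiles):
    """Return only the profiles shared by more than one protein."""
    groups = defaultdict(list)
    for protein, profile in profiles.items():
        groups[profile].append(protein)
    return {p: sorted(m) for p, m in groups.items() if len(m) > 1}


# Hypothetical data: which genomes contain a homolog of each protein.
presence = {
    "P1": {"E. coli", "S. cerevisiae"},
    "P2": {"E. coli", "S. cerevisiae"},
    "P3": {"E. coli", "Drosophila"},
    "P4": {"Drosophila"},
}

genomes, profiles = build_profiles(presence)
linked = group_by_profile(profiles)

# With N genomes there are 2**N possible profiles.
n_possible = 2 ** len(genomes)
```

Here P1 and P2 share a profile, so they are the only candidate pair. With three genomes there are only 8 possible profiles, which is why many genomes are needed before a shared profile becomes informative.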

  8. Rosetta Stone Method

  9. Rosetta Stone Method (= Domain Fusion Analysis) • Interacting proteins have homologs in another organism fused into a single protein chain
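
The domain fusion idea can be sketched as follows, assuming that sequence-similarity hits have already been computed elsewhere. Each hit records that a query protein aligns to a region of a candidate fused protein in another organism. The tuple layout, the protein names, and the strict non-overlap rule are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def rosetta_stone_pairs(hits):
    """Find pairs of proteins that align to non-overlapping regions
    of the same protein in another organism.

    hits: iterable of (query, fused_protein, start, end) tuples.
    """
    by_target = defaultdict(list)
    for query, target, start, end in hits:
        by_target[target].append((query, start, end))

    pairs = set()
    for target, regions in by_target.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(regions, 2):
            if q1 == q2:
                continue  # same protein hitting twice is not a pair
            if e1 < s2 or e2 < s1:  # the two aligned regions do not overlap
                pairs.add((min(q1, q2), max(q1, q2), target))
    return sorted(pairs)


# Hypothetical hits: A and B cover separate parts of fused protein F.
hits = [
    ("A", "F", 1, 200),
    ("B", "F", 210, 450),
    ("C", "F", 150, 300),  # overlaps both A's and B's regions
    ("D", "G", 1, 100),    # only one protein hits G
]

predicted = rosetta_stone_pairs(hits)
```

Only the pair (A, B) is reported, with F as the linking protein. C is excluded because its aligned region overlaps the other two.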

  10. Rosetta Stone Method

  11. Rosetta Stone Method: Why Should it Work? • Experimentally: E. coli, ~4300 proteins → ~6800 pairs similar to a single protein • Biologically:

  12. Rosetta Stone Method: Validation Tests (E. coli) • Annotation of proteins from the SWISS-PROT database (68% vs. 15%) • Database of Interacting Proteins (6.4%) • Phylogenetic Profile Method (5% vs. 0.6%)

  13. Models’ Success & Failure (figure: found vs. predicted)

  14. Rosetta Stone Method: False Negatives 1) Interactions that have evolved through other mechanisms, i.e. there never was a fusion 2) The fused protein has disappeared during evolution

  15. Rosetta Stone Method: False Positives 1) Proteins have been fused to regulate co-expression 2) Can’t distinguish between binding and non-binding homologs 3) Functional interaction rather than a physical interaction

  16. Rosetta Stone Method • Reducing Errors

  17. Rosetta Stone Method • Reconstruction of metabolic pathways

  18. Functional Protein Networks

  19. Orthologs vs. Paralogs • Orthologs: genes in different species that evolved from a common ancestral gene by speciation • Paralogs: genes related by duplication within a genome

  20. Chromosomal Proximity • Proximate genes: on the same strand and within 300 bp, or their respective paralogs are within 300 bp • Inferred link: genes whose orthologs are close in at least three phylogenetic groups

  21. Chromosomal Proximity • Direct Link: two proximate genes that are also proximate in at least two other phylogenetic groups • Indirect Link: genes whose orthologs are close in at least three other phylogenetic groups
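
The proximity and direct-link definitions above can be sketched in code. This is a minimal illustration, assuming one genome per phylogenetic group and a precomputed ortholog family label on every gene. The `Gene` record, the family names, and the coordinates are hypothetical, and the paralog clause of the proximity definition is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    family: str  # ortholog family label (hypothetical)
    strand: str  # "+" or "-"
    start: int   # first base of the gene
    end: int     # last base of the gene


def are_proximate(a, b, max_gap=300):
    """Same strand and at most max_gap bp between the two genes."""
    if a.strand != b.strand:
        return False
    gap = max(a.start, b.start) - min(a.end, b.end)
    return gap <= max_gap  # a negative gap means the genes overlap


def groups_with_proximity(fam_x, fam_y, genomes):
    """Names of the groups in which the two families are proximate."""
    found = set()
    for group, genes in genomes.items():
        xs = [g for g in genes if g.family == fam_x]
        ys = [g for g in genes if g.family == fam_y]
        if any(are_proximate(x, y) for x in xs for y in ys):
            found.add(group)
    return found


def is_direct_link(fam_x, fam_y, genomes, home_group, min_other_groups=2):
    """Proximate at home and in at least min_other_groups other groups."""
    found = groups_with_proximity(fam_x, fam_y, genomes)
    return home_group in found and len(found - {home_group}) >= min_other_groups


# Hypothetical genomes, one per phylogenetic group.
genomes = {
    "group1": [Gene("X", "+", 100, 1000), Gene("Y", "+", 1200, 2000)],
    "group2": [Gene("X", "+", 0, 900), Gene("Y", "+", 1000, 1800)],
    "group3": [Gene("X", "-", 5000, 5900), Gene("Y", "-", 4000, 4800)],
    "group4": [Gene("X", "+", 100, 900), Gene("Y", "-", 1000, 1800)],
}
```

With this data, `is_direct_link("X", "Y", genomes, "group1")` holds because X and Y are also proximate in group2 and group3. The same call for group4 fails because the two genes there lie on opposite strands.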

  22. Chromosomal Proximity

  23. Chromosomal Proximity: Why Should it Work? • Biologically: Conservation of proximity across multiple genomes → Linked function • Logically: How likely is it that two genes are randomly proximate?
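
The "Logically" question can be given a rough back-of-the-envelope answer. The sketch below is a toy null model, not the published analysis: it assumes two genes are placed uniformly at random on a circular chromosome, treats strand as a coin flip, ignores gene length, and treats genomes as independent. The genome lengths are hypothetical.

```python
import math


def p_random_proximity(genome_length_bp, window_bp=300):
    """Chance that two random genes are on the same strand within window_bp.

    The second gene falls within +/- window_bp of the first with
    probability 2 * window / length, and shares its strand with
    probability 1/2, giving window / length overall.
    """
    return min(1.0, window_bp / genome_length_bp)


def p_conserved_by_chance(genome_lengths_bp, window_bp=300):
    """Chance of random proximity in every listed genome (independence assumed)."""
    return math.prod(p_random_proximity(n, window_bp) for n in genome_lengths_bp)


# Hypothetical genome sizes of 3 Mb each.
p_one = p_random_proximity(3_000_000)
p_three = p_conserved_by_chance([3_000_000] * 3)
```

Under these assumptions a single chance co-location is already rare, and requiring it in three groups multiplies the small probabilities together. Real genomes are related by descent, so the independence assumption overstates the effect for closely related species, which is presumably why the slides count phylogenetic groups rather than individual genomes.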

  24. Chromosomal Proximity Method’s Reliability:

  25. Chromosomal Proximity Validation: • 1586 links were detected between ortholog families • KEGG: 80% in the same biological pathway • COG: 67% in the same functional category

  26. Chromosomal Proximity • Total validated links per genome: 380 direct, 352 inferred

  27. Chromosomal Proximity

  28. The COG Database • Clusters of Orthologous Groups • COGs creation • Each COG contains proteins that have evolved from an ancestral protein

  29. The COG Database: Current Numbers (2004) • 43 complete genomes • 30 phylogenetic groups • 2223 phylogenetic patterns • 17 functional categories • 3307 COGs • 74059 proteins, 71% of total

  30. The COG Database

  31. The COG Database: How can we use it? Direct Information • Annotation of Proteins (group and individual) • Phylogenetic Patterns • Multiple Alignment

  32. The COG Database: How can we use it? Detecting Missed Genes • Patterns that contain all but one • Mostly small proteins
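
The "all but one" idea can be sketched as a simple set difference. The example assumes each COG's phylogenetic pattern is available as the set of genomes in which it has a member. The COG labels and genome names are hypothetical, and a reported gap is only a candidate, since the gene may be truly absent rather than missed by annotation.

```python
def missed_gene_candidates(cog_patterns, genomes):
    """List (cog, genome) pairs where the COG is present everywhere
    except in exactly one genome.
    """
    all_genomes = set(genomes)
    candidates = []
    for cog, present in sorted(cog_patterns.items()):
        missing = all_genomes - set(present)
        if len(missing) == 1:
            candidates.append((cog, next(iter(missing))))
    return candidates


# Hypothetical phylogenetic patterns.
genomes = ["g1", "g2", "g3", "g4"]
cog_patterns = {
    "COG_A": {"g1", "g2", "g3"},        # absent only from g4
    "COG_B": {"g1", "g2", "g3", "g4"},  # present everywhere
    "COG_C": {"g1", "g2"},              # absent from two genomes
}

candidates = missed_gene_candidates(cog_patterns, genomes)
```

Only COG_A is flagged, with g4 as the genome to re-examine for an unannotated gene.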

  33. The COG Database • Growth in the number of groups • Are we approaching saturation?

  34. COG on the WWW

  35. Reliability of the Methods • Major validation: Experimentally known linkages • Validation by “keyword recovery” search
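
The slide does not define how keyword recovery is scored. As one plausible reading, the sketch below measures how much the annotation keywords of two linked proteins overlap, using Jaccard overlap as a stand-in score, and averages it over the predicted pairs. The function name, the score, and the keyword sets are assumptions for illustration only.

```python
def keyword_recovery(pairs, keywords):
    """Mean keyword overlap (Jaccard) across predicted protein pairs.

    Pairs where either protein has no keywords are skipped.
    """
    scores = []
    for a, b in pairs:
        ka = keywords.get(a, set())
        kb = keywords.get(b, set())
        if not ka or not kb:
            continue
        scores.append(len(ka & kb) / len(ka | kb))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical annotation keywords.
keywords = {
    "P1": {"transport", "membrane"},
    "P2": {"transport"},
    "P3": {"kinase"},
}

score = keyword_recovery([("P1", "P2"), ("P1", "P3")], keywords)
```

A score like this only becomes meaningful when compared with the same score computed on randomly chosen pairs, which is the kind of "predicted vs. background" contrast the validation percentages on the earlier slides report.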

  36. References • Eisenberg D, Marcotte EM, Xenarios I, Yeates TO. Protein function in the post-genomic era. Nature. 2000 405:823-826. Review. • Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D. Detecting protein function and protein-protein interactions from genome sequences. Science. 1999 285:751-753. • Yanai I, Mellor JC, DeLisi C. Identifying functional links between genes using conserved chromosomal proximity. Trends Genet. 2002 18:176-179. • Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001 29:22-28. • Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997 278:631-637. • http://www.ncbi.nlm.nih.gov/COG
