Data Content of the BioCyc Databases

kasie
Presentation Transcript


  1. Data Content of the BioCyc Databases

  2. BioCyc Tier 1 Databases

  3. EcoCyc Project – EcoCyc.org • E. coli Encyclopedia • Review-level Model-Organism Database for E. coli • Tracks evolving annotation of the E. coli genome and cellular networks • The two paradigms of EcoCyc • “Multi-dimensional annotation of the E. coli K-12 genome” • Positions of genes; functions of gene products – 76% / 66% exp • Gene Ontology terms; MultiFun terms • Gene product summaries and literature citations • Evidence codes • Multimeric complexes • Metabolic pathways • Regulation of gene expression and of protein activity • Karp, Gunsalus, Collado-Vides, Paulsen • Nuc. Acids Res. 35:7577 2007 • ASM News 70:25 2004 • Science 293:2040

  4. EcoCyc = E. coli Dataset + Pathway/Genome Navigator • URL: EcoCyc.org • EcoCyc v13.6 • Pathways: 246 • Reactions: Metabolic: 1,394; Transport: 246 • Compounds: 1,830 • Citations: 19,000 • Proteins: 4,479 • Complexes: 895 • RNAs: 285 • Gene Regulation: Operons: 3,369; Trans Factors: 196; Promoters: 1,796; TF Binding Sites: 2,205 • Genes: 4,492

  5. EcoCyc Gene and Protein Information • Gene locations and protein functions updated through literature curation and in collaboration with RefSeq, EcoGene, and UniProt • EcoCyc curators author minireview summaries for gene products, complexes, pathways, and transcription units • Gene Ontology terms curated by EcoCyc and imported regularly from UniProt • Protein features regularly imported from UniProt

  6. EcoCyc Regulation • Multiple types of regulatory information present in EcoCyc • Transcriptional regulation and operon organization • Attenuation • Regulation of translation by small RNAs and proteins • Regulation of protein activity by covalent and non-covalent means

  7. Other E. coli Genomes in BioCyc • Currently BioCyc contains ~40 other E. coli and Shigella genomes • New genomes will be included from RefSeq as BioCyc expands • SRI is building orthology-based curation tools that will allow us to propagate curation from EcoCyc to these other strains

  8. EcoCyc Accelerates Science • Experimentalists • E. coli experimentalists • Experimentalists working with other microbes • Analysis of expression data • Computational biologists • Biological research using computational methods • Genome annotation • Study connectivity of E. coli metabolic network • Study phylogenetic extent of metabolic pathways and enzymes in all domains of life • Bioinformaticists • Training and validation of new bioinformatics algorithms – predict operons, promoters, protein functional linkages, protein-protein interactions • Metabolic engineers • “Design of organisms for the production of organic acids, amino acids, ethanol, hydrogen, and solvents” • Educators

  9. EcoliHub Resource • www.ecolihub.org • Hub search • Simultaneously searches 12 different E. coli databases • EcoliHub Omics • Omics data repository and analysis for E. coli • EcoliHouse • Queryable MySQL server containing multiple E. coli databases • EcoliWiki • Community contributed content about E. coli

  10. MetaCyc: Metabolic Encyclopedia • Describe a representative sample of every experimentally determined metabolic pathway • Describe properties of metabolic enzymes • Literature-based DB with extensive references and commentary • Pathways, reactions, enzymes, substrates • Jointly developed by • P. Karp, R. Caspi, C. Fulcher, SRI International • L. Mueller, A. Pujar, Boyce Thompson Institute • S. Rhee, P. Zhang, Carnegie Institution • Nucleic Acids Research 2008

  11. MetaCyc Data -- Version 14.0

  12. Taxonomic Distribution of MetaCyc Pathways – version 13.1

  13. MetaCyc Pathway Ontology • Provides a classification system for metabolic pathways

  14. Biosynthesis [902] • Amino acids Biosynthesis [105] • Aromatic Compounds Biosynthesis [13] • Carbohydrates Biosynthesis [70] • Cell structures Biosynthesis [31] • Cofactors, Prosthetic Groups, Electron Carriers Biosynthesis [160] • Hormones Biosynthesis [40] • Fatty Acids and Lipids Biosynthesis [101] • Metabolic Regulators Biosynthesis [4] • Nucleosides and Nucleotides Biosynthesis [20] • Amines and Polyamines Biosynthesis [32] • Secondary Metabolites Biosynthesis [351] • Antibiotic Biosynthesis [20] • Fatty Acid Derivatives Biosynthesis [7] • Flavonoids Biosynthesis [70] • Nitrogen-Containing Secondary Compounds Biosynthesis [64] • Alkaloids Biosynthesis [43] • Phenylpropanoid Derivatives Biosynthesis [46] • Phytoalexins Biosynthesis [25] • Sugar Derivatives Biosynthesis [10] • Terpenoids Biosynthesis [103] • Siderophore Biosynthesis [7]

  15. Degradation/Utilization/Assimilation [639] • Alcohols Degradation [14] • Aldehyde Degradation [12] • Amines and Polyamines Degradation [40] • Amino Acids Degradation [113] • Aromatic Compounds Degradation [152] • C1 Compounds Utilization and Assimilation [24] • Carbohydrates Degradation [52] • Carboxylates Degradation [30] • Chlorinated Compounds Degradation [39] • Cofactors, Prosthetic Groups, Electron Carriers Degradation [2] • Fatty Acid and Lipids Degradation [18] • Inorganic Nutrients Metabolism [72] • Nitrogen Compounds Metabolism [15] • Phosphorus Compounds Metabolism [3] • Sulfur Compounds Metabolism [54] • Nucleosides and Nucleotides Degradation and Recycling [9] • Secondary Metabolites Degradation [58] • Nitrogen Containing Secondary Compounds Degradation [13] • Sugar Derivatives Degradation [31] • Terpenoids Degradation [10]

  16. Detoxification [16] • Acid Resistance [2] • Arsenate Detoxification [3] • Mercury Detoxification [1] • Methylglyoxal Detoxification [8]

  17. Generation of precursor metabolites and energy [124] • Chemoautotrophic Energy Metabolism [14] • Hydrogen Oxidation [2] • Electron Transfer [11] • Fermentation [34] • Glycolysis [6] • Methanogenesis [12] • Pentose Phosphate Pathways [4] • Photosynthesis [6] • Respiration [25] • Aerobic Respiration [9] • Anaerobic Respiration [14] • TCA cycle [9]
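The pathway ontology in slides 13 to 17 is a class hierarchy in which each class carries a pathway count. As a minimal sketch of that data structure, the Python below encodes the Detoxification class from slide 16. The `PathwayClass` type and the `render` and `find` helpers are hypothetical names invented for illustration, not part of any BioCyc or Pathway Tools API, and treating the four listed classes as direct children of Detoxification is an assumption read off the flattened slide text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PathwayClass:
    """One node of a pathway classification tree (hypothetical type)."""
    name: str
    count: int  # number of pathways shown in brackets on the slide
    children: List["PathwayClass"] = field(default_factory=list)


# Counts copied from slide 16; the parent/child nesting is assumed.
detox = PathwayClass("Detoxification", 16, [
    PathwayClass("Acid Resistance", 2),
    PathwayClass("Arsenate Detoxification", 3),
    PathwayClass("Mercury Detoxification", 1),
    PathwayClass("Methylglyoxal Detoxification", 8),
])


def render(node: PathwayClass, depth: int = 0) -> List[str]:
    """Return the tree as indented 'Name [count]' lines, depth-first."""
    lines = [f"{'  ' * depth}{node.name} [{node.count}]"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def find(node: PathwayClass, name: str) -> Optional[PathwayClass]:
    """Depth-first search for a class by exact name; None if absent."""
    if node.name == name:
        return node
    for child in node.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None
```

Calling `"\n".join(render(detox))` gives the indented outline back. The four listed subclass counts sum to 14 rather than 16; the slide does not explain the difference, so the sketch stores each count as given instead of deriving parent totals from children.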

  18. Tier 3 Databases

  19. Curation Level • EcoCyc and MetaCyc have many types of data that you will not see in Tier 3 databases • Examples: • Regulation • Minireview summaries • Citations • GO terms • Protein features

  20. BioCyc Ortholog Data • BioCyc ortholog data are currently obtained from CMR all-vs-all protein BLAST comparisons • Require bidirectional best BLAST hits, at least 10% identity, at least 40% similarity, P-value under 1 • Not all organisms contain ortholog data currently • CMR lacks entries for some organisms • Some BioCyc genomes not obtained from CMR
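The ortholog criteria on slide 20 (bidirectional best BLAST hits, at least 10% identity, at least 40% similarity, P-value under 1) can be sketched in a few lines of Python. This is an illustrative sketch, not the CMR or BioCyc implementation. The `Hit` record, its field names, and the function names are hypothetical. Ranking hits by a `score` field and applying the thresholds before choosing the best hit are both assumptions, since the slide does not say how "best" is defined or in what order the filters apply.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Hit:
    """One BLAST hit (hypothetical record; field names are assumptions)."""
    query: str
    subject: str
    identity: float    # percent identity
    similarity: float  # percent similarity
    p_value: float
    score: float       # used here to rank hits; an assumption


def passes(hit: Hit, min_identity: float = 10.0,
           min_similarity: float = 40.0, max_p: float = 1.0) -> bool:
    """Apply the slide's thresholds: >=10% identity, >=40% similarity, P < 1."""
    return (hit.identity >= min_identity
            and hit.similarity >= min_similarity
            and hit.p_value < max_p)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject among passing hits."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if not passes(hit):
            continue
        current = best.get(hit.query)
        if current is None or hit.score > current.score:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def bidirectional_best_hits(a_vs_b: Iterable[Hit],
                            b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

Given hits from genome A against genome B and from B against A, `bidirectional_best_hits(a_vs_b, b_vs_a)` returns only the protein pairs that choose each other in both directions. A protein whose only hits fall below the thresholds gets no ortholog.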
