Overview of Microbial Pathway and Genome Databases
Overview
• Survey of other databases and web sites that integrate hundreds of microbial genomes and pathway information
• Most of these resources are described in publications that can be found via PubMed
• The resources differ in:
  • Genomes included
  • What other information is integrated with the genome data
  • Value-added computational processing applied to each genome
  • Query, visualization, and analysis tools available at each site
Overall Comparison to BioCyc
• Many of the other databases contain more genomes than BioCyc
  • This will change in 2011 as BioCyc transitions to RefSeq as its genome source
• BioCyc Tier 1 and Tier 2 databases are more highly curated than other databases
• BioCyc has more extensive query, visualization, and analysis tools than other sites
• The BioCyc desktop version can be installed locally and allows editing of PGDBs
• Some other sites re-annotate the genomes, which may or may not improve data quality
Microbial Genome Resources
• CMR – Comprehensive Microbial Resource
• Entrez
• IMG – Integrated Microbial Genomes
• KEGG – Kyoto Encyclopedia of Genes and Genomes
• PATRIC
• SEED/NMPDR
• UMBBD – University of Minnesota Biocatalysis/Biodegradation Database
CMR – Comprehensive Microbial Resource (J. Craig Venter Institute)
• http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi
• ~700 genomes
• Genome data only, no pathways
• Genome browser, gene pages
• Many comparative operations
• Will be discontinued later in 2010
Entrez Genomes (National Center for Biotechnology Information)
• http://www.ncbi.nlm.nih.gov/sites/genome
• Web portal to GenBank genomes
• Genome browser, gene pages
IMG – Integrated Microbial Genomes (Joint Genome Institute)
• http://img.jgi.doe.gov/cgi-bin/pub/main.cgi
• 1,911 microbial genomes (approximately half are draft quality)
• Genome browser, gene pages
• Many comparative operations
• Genome context analyses available
PATRIC (Virginia Bioinformatics Institute)
• http://patric.vbi.vt.edu/
• Genome browser, gene pages
• KEGG pathways
SEED / NMPDR (Argonne National Laboratory)
• http://www.nmpdr.org/FIG/wiki/view.cgi
• 782 microbial genomes
• Funding ended in 2009
• Unique features:
  • Systems
  • Essential genes
  • Comparative genomics tools
  • Community annotation
UMBBD (University of Minnesota)
• http://umbbd.msi.umn.edu/
• Database of ~150 microbial biodegradation pathways
• Does not include full microbial genomes
KEGG – Kyoto Encyclopedia of Genes and Genomes (Kyoto University)
• http://www.genome.ad.jp/kegg/
• 1,382 organisms
• KEGG reannotates each genome
• Static reference pathway maps are colored with the genes present in each organism
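The idea of coloring a static reference map with the genes present in one organism can be sketched as a simple set lookup. This is a minimal illustration only, not KEGG's implementation: the function name, the reaction IDs, and the ortholog IDs below are all hypothetical toy values.

```python
def color_reference_map(reference_map, organism_orthologs):
    """Mark each reaction on a reference map as present or absent in one organism.

    reference_map: dict mapping reaction ID -> set of ortholog group IDs
    organism_orthologs: set of ortholog group IDs annotated in the organism
    """
    return {
        reaction: bool(orthologs & organism_orthologs)
        for reaction, orthologs in reference_map.items()
    }


# Hypothetical toy data, not taken from KEGG.
reference_map = {
    "R1": {"K_a", "K_b"},
    "R2": {"K_c"},
    "R3": {"K_d"},
}
organism_orthologs = {"K_b", "K_d", "K_x"}

colored = color_reference_map(reference_map, organism_orthologs)
for reaction, present in colored.items():
    print(reaction, "colored" if present else "uncolored")
```

Note that the map itself never changes in this sketch; only the coloring differs per organism, which is the sense in which the maps are static.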
Comparison with KEGG
• KEGG vs MetaCyc: reference pathway collections
  • KEGG maps are not pathways (Nucleic Acids Res 34:3687, 2006)
    • KEGG maps contain multiple biological pathways
    • Two genes chosen at random from a BioCyc pathway are more likely to be related according to genome context methods than two genes chosen from a KEGG map
  • KEGG maps are composites of pathways in many organisms; they do not identify which specific pathways were elucidated in which organisms
  • KEGG has no literature citations, no comments, and less enzyme detail
  • KEGG assigns half as many reactions to pathways as MetaCyc
• KEGG vs organism-specific PGDBs
  • KEGG does not curate or customize pathway networks for each organism
  • Highly curated PGDBs now exist for important organisms such as E. coli, yeast, mouse, and Arabidopsis
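The gene-pair comparison above can be illustrated with a small sketch that computes the fraction of gene pairs within a gene set that are linked by some genome context method. This is not the procedure from the cited paper; the function name, gene names, and the set of related pairs are hypothetical toy values chosen to show why a composite map tends to score lower than a single pathway.

```python
from itertools import combinations


def related_pair_fraction(genes, related_pairs):
    """Fraction of unordered gene pairs in `genes` that appear in `related_pairs`.

    related_pairs: set of frozensets, each holding two gene names that some
    genome context method (assumed, not specified here) has linked.
    """
    pairs = [frozenset(p) for p in combinations(sorted(set(genes)), 2)]
    if not pairs:
        return 0.0
    return sum(p in related_pairs for p in pairs) / len(pairs)


# Hypothetical toy data: two separate pathways whose genes are linked
# within each pathway but not across pathways.
pathway_a = ["a1", "a2", "a3"]
pathway_b = ["b1", "b2"]
related_pairs = {
    frozenset(("a1", "a2")),
    frozenset(("a1", "a3")),
    frozenset(("a2", "a3")),
    frozenset(("b1", "b2")),
}

# A composite map that merges both pathways into one diagram.
composite_map = pathway_a + pathway_b

single_score = related_pair_fraction(pathway_a, related_pairs)
composite_score = related_pair_fraction(composite_map, related_pairs)
print(single_score, composite_score)
```

In this toy case every pair inside the single pathway is related, while the composite map also contains cross-pathway pairs that are not, so its fraction is lower.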
Comparison of Pathway Tools to KEGG
• Inference tools
  • KEGG does not predict presence or absence of pathways
  • KEGG lacks pathway hole filler, operon predictor
• Curation tools
  • KEGG does not distribute curation tools
  • No ability to customize pathways to the organism
  • Pathway Tools schema much more comprehensive
• Visualization and analysis
  • KEGG does not perform automatic pathway layout
  • KEGG metabolic-map diagram extremely limited
  • No comparative pathway analysis