Development, maintenance and sharing of small-scale databases for genome research
Volker Brendel
Department of Genetics, Development and Cell Biology; Department of Statistics
Iowa State University
… the joys and perils of molecular data mining
http://genome-www5.stanford.edu/MicroArray/SMD/
Blader IJ, et al. (2001) Microarray Analysis Reveals Previously Unknown Changes in Toxoplasma gondii-infected Human Cells. J Biol Chem 276(26):24223-31.
The Molecular Biology Database Collection: 2003 update
Andreas D. Baxevanis
Database Categories List
• Major Sequence Repositories
• Comparative Genomics
• Gene Expression
• Gene Identification and Structure
• Genetic and Physical Maps
• Genomic Databases
• Intermolecular Interactions
• Metabolic Pathways and Cellular Regulation
• Mutation Databases
• Pathology
• Protein Databases
• Protein Sequence Motifs
• Proteome Resources
• RNA Sequences
• Retrieval Systems and Database Structure
• Structure
• Transgenics
• Varied Biomedical Content
Alphabetical Database List
• 16S and 23S Ribosomal RNA Mutation Database
• AAindex: Physicochemical properties of peptides
• ACeDB: C. elegans, S. pombe, and human sequences and genomic information
. . .
• ZmDB: Maize genome database
Molecular Databases - Problems
• Content (accuracy; currency)
• Multiplicity (> 1,000 specialized databases!?)
• Lack of standards (e.g., ZmDB: ACeDB, MySQL, FileMaker Pro)
• Accessibility (web!?)
Why are there so many distinct databases in molecular biology?
NSF 99-171
PLANT GENOME RESEARCH PROGRAM - COLLABORATIVE RESEARCH ON FUNCTIONAL GENOMICS
Program Announcement
DIRECTORATE FOR BIOLOGICAL SCIENCES
LETTER OF INTENT: NOVEMBER 8, 1999
PROPOSAL DEADLINE: JANUARY 7, 2000
NATIONAL SCIENCE FOUNDATION
Informatics: Include a detailed description of all informatics components of the project. This section should describe the informatics tools used for internal data management as well as the distribution of information to the scientific community. Technical descriptions must be sufficiently detailed to allow adequate review by informatics experts. All data must be released to the public in an accessible and useable form. If project includes development of a new database or expansion of an existing database, a plan for its long-term maintenance must be described.
Case study: ZmDB – a maize genome database
www.zmdb.iastate.edu
End of funding! End of project!? End of database!? End of data integrity!?
MaizeGDB www.maizedb.org
PlantGDB www.plantgdb.org
AtGDB www.plantgdb.org/AtGDB
Matt Wilkerson, Shannon Schlueter
NSF Plant Genome Research Project(s)
Genome Annotation – The Need for User Contributions!
Examples for Arabidopsis
• At1g28080 (missed 5' UTR)
• At2g40840 (intergenic region? [U12 intron!]) User Contributed Annotation [Alan Myers!?]
• At1g14370 (overlapping 3'-UTRs? [no!]) User Contributed Annotation [Your Name here!!]