
Value-adding, Access, and Use: Biological Databases as a Case Study


nevin

Presentation Transcript


  1. Value-adding, Access, and Use: Biological Databases as a Case Study

  2. Genes…

  3. …make proteins

  4. Proteins form complex 3D structures

  5. Molecules interact

  6. The right molecules need to be present at the right time

  7. EMBL-Bank (DNA sequences)

  8. EMBL-Bank (DNA sequences); SWISS-PROT + TrEMBL; InterPro

  9. EMBL-Bank (DNA sequences); SWISS-PROT + TrEMBL; InterPro; EnsEMBL (Metazoan Genome Gene Annotation)

  10. EMBL-Bank (DNA sequences); Array-Express (Microarray Expression Data); SWISS-PROT + TrEMBL; InterPro; EnsEMBL (Metazoan Genome Gene Annotation)

  11. EMBL-Bank (DNA sequences); Array-Express (Microarray Expression Data); SWISS-PROT + TrEMBL; InterPro; EnsEMBL (Metazoan Genome Gene Annotation)

  12. EMBL-Bank (DNA sequences); Array-Express (Microarray Expression Data); SWISS-PROT + TrEMBL; InterPro; EnsEMBL (Metazoan Genome Gene Annotation); EMSD (Macromolecular Structure Data)

  13. EMBL-Bank (DNA sequences); Array-Express (Microarray Expression Data); SWISS-PROT + TrEMBL; InterPro; EnsEMBL (Metazoan Genome Gene Annotation); EMSD (Macromolecular Structure Data)

  14. EMBL-Bank (DNA sequences); Array-Express (Microarray Expression Data); SWISS-PROT + TrEMBL; InterPro; EnsEMBL; IntAct (Protein-Protein Interaction Data); EMSD (Macromolecular Structure Data)

  15. Integr8

  16. EMBL-Bank (DNA sequences); Array-Express (Microarray Expression Data); SWISS-PROT + TrEMBL; InterPro; EnsEMBL; IntAct (Protein-Protein Interaction Data); EMSD (Macromolecular Structure Data)

  17. EMBL-Bank (DNA sequences); SWISS-PROT + TrEMBL; InterPro; IntAct (Protein-Protein Interaction Data)

  18. Running a database project: Database design; End Users; Service Tools; Service DB; Genomes; Genes; Patents; Updates; Submitters; Add value (computation); Releases & Updates; Q/C etc.; Add value (review etc.)

  19. Running a database project: Database design; End Users; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submitters; Add value (computation); Releases & Updates; Q/C etc.; Add value (review etc.)

  20. Running a database project: Database design; End Users; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submission tools; Submitters; Add value (computation); Releases & Updates; Q/C etc.; Add value (review etc.)

  21. Running a database project: Database design; End Users; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submission tools; Submitters; Add value (computation); Releases & Updates; Q/C etc.; Add value (review etc.)

  22. Running a database project: Database design; End Users; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submission tools; Submitters; Add value (computation); Releases & Updates; Q/C etc.; Add value (review etc.)

  23. Running a database project: Database design; End Users; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submission tools; Submitters; Add value (computation); Releases & Updates; Q/C etc.; Add value (review etc.)

  24. Running a database project: Database design; End Users; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submission tools; Submitters; Data Distrib.; Add value (computation); Releases & Updates; Q/C etc.; Add value (review etc.)

  25. Running a database project: Other archives; Database design; End Users; Data exchange; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submission tools; Submitters; Data Distrib.; Releases & Updates; Q/C etc.; Add value (review etc.)

  26. Running a database project: Other archives; Database design; Development DB; End Users; Data exchange; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submission tools; Submitters; Data Distrib.; Releases & Updates; Q/C etc.; Add value (review etc.)

  27. Running a database project: Other archives; Database design; Development DB; End Users; Data exchange; Service Tools; Production DB; Service DB; Genomes; Genes; Patents; Updates; Submission tools; Submitters; Data Distrib.; Add value (computation); Releases & Updates; Q/C etc.; Add value (review etc.)

  28. EMBL nucleotide sequence database

  29. Dataflow

  30. EMBL Flat File

  31. EMBL Relational Schema: Sequence Info; Reference Info; Location Info; Taxonomy Info; Feature Info

  32. Data Access and Use
      • Network services
      • Sequence Retrieval System (SRS), integrating and linking the main nucleotide and protein databases plus many specialized databases
      • Database releases are produced quarterly, via FTP (inc. mirror sites) and CD-ROM
      • Daily and cumulative updates via FTP
      • Sequence search servers

  33. April 2003: TrEMBL 23.4 + SWISS-PROT 41.2
      • 829,111 TrEMBL entries
      • 123,721 SWISS-PROT entries
      • Weekly production of a non-redundant and comprehensive protein sequence database consisting of SWISS-PROT, TrEMBL, and TrEMBLnew: ftp.ebi.ac.uk/pub/databases/sp_tr_nrdb/
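The slide does not say how the weekly non-redundant set is built. As a minimal sketch only, the code below assumes that "non-redundant" means collapsing entries with identical sequences and that sources are merged in priority order (SWISS-PROT first). The function name `merge_non_redundant`, the record layout, and the accessions are hypothetical and chosen for illustration.

```python
def merge_non_redundant(*sources):
    """Merge (database_name, records) pairs, keeping one entry per distinct sequence.

    Each record is an (accession, sequence) tuple. Sources are processed in the
    order given, so the first source containing a sequence supplies the
    representative entry; later copies are only recorded as duplicates.
    """
    merged = {}  # normalised sequence -> representative entry
    for database, records in sources:
        for accession, sequence in records:
            key = sequence.upper()  # assumption: case differences are not meaningful
            if key in merged:
                merged[key]["also_seen_as"].append(accession)
            else:
                merged[key] = {
                    "database": database,
                    "accession": accession,
                    "sequence": key,
                    "also_seen_as": [],
                }
    # dicts preserve insertion order, so representatives come out in priority order
    return list(merged.values())


# Hypothetical toy data, not real database entries.
swissprot = [("P1", "MKV")]
trembl = [("Q1", "MKV"), ("Q2", "MAA")]
tremblnew = [("Q3", "maa")]

nrdb = merge_non_redundant(
    ("SWISS-PROT", swissprot),
    ("TrEMBL", trembl),
    ("TrEMBLnew", tremblnew),
)
for entry in nrdb:
    print(entry["accession"], entry["database"], entry["also_seen_as"])
```

Processing the curated source first means a reviewed entry is preferred as the representative whenever the same sequence also appears in the automatically generated sets.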

  34. Goals
      • High level of annotation
      • Minimal redundancy
      • High level of integration with other databases
      • Complete and up-to-date
      • Availability

  35. Automatic annotation of TrEMBL
      • Data-mining to extract conditions from InterPro
      • Extract SWISS-PROT reference entries fulfilling the conditions
      • Extract common annotation
      • Store conditions and common annotation in RuleBase
      • Group TrEMBL by conditions
      • Add common annotation to TrEMBL
      Diagram labels: InterPro; SWISS-PROT; TrEMBL; RuleBase
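The steps on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual RuleBase implementation: a "condition" is modelled as a set of InterPro signature IDs that must all match, "common annotation" as the intersection of annotation terms across the matching SWISS-PROT entries, and the data-mining step that proposes conditions is taken as given. The `Entry` class, the function names, and all IDs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical, simplified protein entry (not the real flat-file format)."""
    accession: str
    interpro_hits: frozenset  # InterPro signature IDs matched by this protein
    annotation: set = field(default_factory=set)  # e.g. keywords


def build_rulebase(conditions, swissprot):
    """Map each condition to the annotation shared by all matching curated entries."""
    rulebase = {}
    for condition in conditions:
        # Reference entries: curated entries whose hits include every required ID.
        references = [e for e in swissprot if condition <= e.interpro_hits]
        if not references:
            continue
        # Common annotation: terms present on every reference entry.
        common = set(references[0].annotation)
        for entry in references[1:]:
            common &= entry.annotation
        if common:
            rulebase[frozenset(condition)] = common
    return rulebase


def annotate(trembl, rulebase):
    """Add the stored common annotation to every TrEMBL entry meeting a condition."""
    for entry in trembl:
        for condition, common in rulebase.items():
            if condition <= entry.interpro_hits:
                entry.annotation |= common
    return trembl


# Hypothetical toy data; "IPR_A" etc. are placeholders, not real InterPro IDs.
swissprot = [
    Entry("SP1", frozenset({"IPR_A"}), {"Hydrolase", "Zinc"}),
    Entry("SP2", frozenset({"IPR_A", "IPR_B"}), {"Hydrolase", "Membrane"}),
]
trembl = [
    Entry("TR1", frozenset({"IPR_A"})),
    Entry("TR2", frozenset({"IPR_C"})),
]

rulebase = build_rulebase([{"IPR_A"}], swissprot)
annotate(trembl, rulebase)
for entry in trembl:
    print(entry.accession, sorted(entry.annotation))
```

Taking the intersection keeps only terms that every reference entry agrees on, which is one conservative reading of "common annotation"; a real system might instead accept terms above some agreement threshold.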

  36. Cross-references

  37. UniProt NREF50; UniProt NREF90; UniProt NREF100; UniProt Knowledgebase: TrEMBL + SWISS-PROT; Literature Based Annotation; Automated Annotation; Classification; UniProt Archive; DDBJ/EMBL/GenBank; SWISS-PROT; Other Data…; Patent Data; TrEMBL; RefSeq; PIR; EnsEMBL; PDB

  38. Funding
      • EMBL
      • European Commission
      • NIH
      • Industrial licenses
      • MRC
      • IUPHAR

  39. SWISS-PROT, TrEMBL, InterPro, etc., at EBI and SIB
      • Group leaders: Rolf Apweiler, Amos Bairoch
      • Co-ordinators: Wolfgang Fleischmann, Henning Hermjakob, Michele Magrane, Maria-Jesus Martin, Nicola Mulder, Claire O’Donovan, Manuela Pruess
      • Annotators/curators: Philippe Aldebert, Andrea Auchincloss, Kirsty Bates, Marie-Claude Blatter Garin, Brigitte Boeckmann, Silvia Braconi Quintaj, Paul Browne, Evelyn Camon, Danielle Coral, Elisabeth Coudert, Tania de Oliveria Lima, Kirill Degtyarenko, Sylvie Dethiollaz, Ann Estreicher, Livia Famiglietti, Nathalie Farriol-Mathis, Stephanie Federico, Serenella Ferro, Gill Fraser, Raffaella Gatto, Vivienne Gerritsen, Arnaud Gos, Nadine Gruaz-Gumowski, Ursula Hinz, Chantal Hulo, Janet James, Florence Jungo, Vivien Junker, Youla Karavidopoulou, Maria Krestyaninova, Kati Laiho, Minna Lehvaslaiho, Karine Michoud, Virginie Mittard, Madelaine Moinat, Sandra Orchard, Sandrine Pilbout, Sylvain Poux, Sorogini Reynaud, Catherine Rivoire, Bernd Röchert, Michel Schneider, Christian Sigrist, Andre Stutz, Shyamala Sundaram, Michael Tognolli, Sandra van den Broek, Bob Vaughan, Eleanor Whitfield
      • Programmers: Daniel Barrell, David Binns, Michael Darsow, Ujjwal Das, Eduardo de Castro, Alexander Fedotov, Astrid Fleischmann, Elisabeth Gasteiger, Alain Gateau, Andre Hackmann, Ivan Ivanyi, Eric Jain, Alexander Kanapin, Paul Kersey, Ernst Kretschmann, Corinne Lachaize, Chris Lewington, Xavier Martin, John Maslen, Peter McLaren, Rupinder Singh Mazara, Lorna Morris, John O’Rourke, Isabelle Phan, Astrid Rakow, Kai Runte, Florence Servant, Allyson Williams, Dan Wu
      • Research staff: Kristian Axelsen, Pierre-Alain Binz, Nicolas Hulo, Anne-Lise Veuthey
      • Clerical/secretarial assistance: Veronique Mangold, Claudia Sapsezian, Margaret Shore-Nye, Veronique Verbegue
      • Students: Pavel Dobrokhotov, Alexandre Gattiker, various MCF, etc.
