
Database Publishing at Nature

Database Publishing at Nature. Timo Hannay Nature Publishing Group 7 October 2005. Overview. Publishing collaborations: Making databases more like journals NPG New Technology: Making journals more like databases Tagging and social bookmarking: New methods of annotation and navigation.


Presentation Transcript


  1. Database Publishing at Nature Timo Hannay Nature Publishing Group 7 October 2005

  2. Overview • Publishing collaborations: Making databases more like journals • NPG New Technology: Making journals more like databases • Tagging and social bookmarking: New methods of annotation and navigation

  3. Database publishing at NPG • The AfCS-Nature Signaling Gateway (http://www.signaling-gateway.org/) • The CMC-Nature Cell Migration Gateway (http://www.cellmigration.org/) • Forthcoming collaborations with NCI and several other groups

  4. The AfCS-Nature Signaling Gateway • A freely available online resource for anyone interested in cellular signalling • A collaboration with the research community through the Alliance for Cellular Signaling • An experiment in the next generation of online, database-driven scientific publications

  5. The Signaling Gateway • Home, Info & News • Signaling Update: News & comment written and commissioned by NPG editors • Molecule Pages: Facts and figures on major cell signaling proteins (3,700+); continually updated by selected experts (~1000); peer review run by NPG • AfCS Data Center: Repository for raw experimental data from AfCS; tools for viewing and analyzing data (online & offline) • Hardware & software hosted at San Diego Supercomputer Center

  6. The Molecule Pages • Comprehensive, structured data for 3,700+ proteins involved in cellular signalling • Some information automatically fed in from other online databases and updated monthly • Other information entered by selected expert authors and updated annually • Author-entered data peer-reviewed by NPG • Fully citable using digital object identifiers (DOIs)

  7. Using Digital Object Identifiers • Example citation: Nature 409, 860-921 (2001), doi:10.1038/35057062 • http://dx.doi.org/10.1038/35057062 is resolved through the IDF/CrossRef databases to the correct URL at the publisher’s website • Allows unambiguous identification of paper • Allows readers to find the paper online • Allows publishers to cross-link reference lists • Guaranteed not to change (even if the publisher changes)
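
The lookup on this slide can be sketched in a few lines of Python. This is a minimal illustration, not NPG's or CrossRef's code: it only builds the resolver URL in the form shown on the slide and does not contact the network. The function names `normalize_doi` and `doi_to_url` are made up for this example.

```python
from urllib.parse import quote

# Resolver address in the form shown on the slide.
RESOLVER = "http://dx.doi.org/"


def normalize_doi(raw: str) -> str:
    """Strip whitespace and an optional 'doi:' prefix (hypothetical helper)."""
    doi = raw.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Every DOI starts with the directory indicator "10." and has a "/" suffix part.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def doi_to_url(raw: str) -> str:
    """Return the resolver URL; the resolver redirects to the publisher's page."""
    return RESOLVER + quote(normalize_doi(raw), safe="/")


if __name__ == "__main__":
    print(doi_to_url("doi:10.1038/35057062"))
```

Because the reader only ever stores the DOI, the publisher can move the article without breaking the citation; only the resolver's record has to change.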

  8. The Molecule Pages: A scientific publication The Molecule Pages has the same features as a traditional journal, except that the information it contains is more highly structured and queryable.

  9. Overview • Publishing collaborations: Making databases more like journals • NPG New Technology: Making journals more like databases • Tagging and social bookmarking: New methods of annotation and navigation

  10. Great underestimated technologies of our age (technology: purported use; eventual impact) • Steam engines (early 1700s): pumping water from coal mines; the Industrial Revolution • Alternating current (1880s): executing criminals; the electrically powered society • Web-based scientific publishing (2004): a new charging model for scientific papers; redefining the concept of the scientific paper

  11. Scientific papers as structured data objects (diagram spanning circa 2000 to circa 2006) • Print journal • Online facsimile • Article metadata database • Structured data sets • Structured, interactive and queryable figures and text (<rdf>, <svg>)

  12. Experimental article metadata database Initial data to be included: • Author and institute details • Scientific: • Molecules (InChI) • Genes (Entrez Gene) • Proteins (UniProt) • Cellular processes, functions, locations (GO) • Species (NCBI) • Citation annotations (controlled vocabulary)
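
One way to picture a record in such a metadata database is sketched below. The slide names only the kinds of identifier, so the class name, field names and example DOIs here are assumptions made for illustration; only the accession P03378 comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class ArticleMetadata:
    """Hypothetical record shape; the slide does not specify a schema."""
    doi: str
    authors: list[str] = field(default_factory=list)
    institutes: list[str] = field(default_factory=list)
    molecules: list[str] = field(default_factory=list)  # InChI strings
    genes: list[str] = field(default_factory=list)      # Entrez Gene IDs
    proteins: list[str] = field(default_factory=list)   # UniProt accessions
    go_terms: list[str] = field(default_factory=list)   # GO identifiers
    species: list[str] = field(default_factory=list)    # NCBI taxonomy IDs


def articles_mentioning_protein(records, accession):
    """Return the DOIs of all records annotated with a UniProt accession."""
    return [r.doi for r in records if accession in r.proteins]


if __name__ == "__main__":
    # Placeholder DOIs, not real articles.
    records = [
        ArticleMetadata(doi="10.9999/example-a", proteins=["P03378"]),
        ArticleMetadata(doi="10.9999/example-b"),
    ]
    print(articles_mentioning_protein(records, "P03378"))
```

Storing shared identifiers rather than free-text names is what makes a query such as "all articles mentioning this protein" possible.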

  13. Support for structured data sets • Developing support for: Systems Biology Markup Language, CellML, Chemical Markup Language, others • Preview in browser • Download to desktop software • Search for more data

  14. SVG: Figures as interactive data objects • Plot graph on axes of choice • Overlay data sets of choice • Zoom and pan to view detail • Click to download raw data

  15. Automated scientific markup and linking

  16. Increasing structure in text markup (1) The old way (no semantic markup): “<p>...gp120 binding to CXCR4 or CCR5 activates PYK2 and FAK…</p>” Now (key entities and concepts marked up): “<p>...<protein id="urn:lsid:uniprot.org:uniprot:P03378">gp120</protein> <action id="urn:lsid:geneontology.org:go:000548">binding</action> to <protein id="urn:lsid:uniprot.org:uniprot:P48061">CXCR4</protein> or <protein id="urn:lsid:uniprot.org:uniprot:P10147">CCR5</protein> <action id="urn:lsid:geneontology.org:go:0008047">activates</action> <protein id="urn:lsid:uniprot.org:uniprot:O43150">PYK2</protein> and <protein id="urn:lsid:uniprot.org:uniprot:Q05397">FAK</protein>…</p>”
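
Once entities are tagged this way, software can pull them out of the text directly. The sketch below parses a shortened copy of the slide's example with Python's standard `xml.etree.ElementTree`. The function name `extract_entities` is made up for this example, and the identifiers are copied from the slide as-is.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the marked-up sentence from the slide.
PARAGRAPH = (
    '<p>...<protein id="urn:lsid:uniprot.org:uniprot:P03378">gp120</protein> '
    '<action id="urn:lsid:geneontology.org:go:000548">binding</action> to '
    '<protein id="urn:lsid:uniprot.org:uniprot:P48061">CXCR4</protein>...</p>'
)


def extract_entities(xml_text):
    """Return (tag, identifier, surface text) for each tagged child element."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.get("id"), el.text) for el in root if el.get("id")]


if __name__ == "__main__":
    for tag, lsid, text in extract_entities(PARAGRAPH):
        print(f"{tag:8} {text:8} {lsid}")
```

With the unmarked version of the sentence, the same task would need name recognition on free text, which is far less reliable.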

  17. Increasing structure in text markup (2) The new way (full RDF/XML): <p>... <rdf:Graph xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:go="urn:lsid:geneontology.org:go:" xmlns:uniprot="urn:lsid:uniprot.org:uniprot:"> <go:000548> <uniprot:Protein rdf:resource="urn:lsid:uniprot.org:uniprot:P03378"/> <uniprot:Protein rdf:resource="urn:lsid:uniprot.org:uniprot:P48061"/> <go:0008047 rdf:resource="urn:lsid:uniprot.org:uniprot:O43150"/> <go:0008047 rdf:resource="urn:lsid:uniprot.org:uniprot:Q05397"/> </go:000548> <go:000548> <uniprot:Protein rdf:resource="urn:lsid:uniprot.org:uniprot:P03378"/> <uniprot:Protein rdf:resource="urn:lsid:uniprot.org:uniprot:P10147"/> <go:0008047 rdf:resource="urn:lsid:uniprot.org:uniprot:O43150"/> <go:0008047 rdf:resource="urn:lsid:uniprot.org:uniprot:Q05397"/> </go:000548> <rdf:label>gp120 binding to CXCR4 or CCR5 activates PYK2 and FAK</rdf:label> </rdf:Graph> …</p> With RDF markup, the article XML itself literally becomes a relational database
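
To make the database analogy concrete, the statements can be loaded into a small table and queried. The sketch below is an illustration under stated assumptions, not the presenter's implementation. It flattens the slide's sentence into subject-predicate-object rows, which simplifies the original by attributing the activation to gp120 itself rather than to the binding event. The predicate names "binds" and "activates" and the table layout are invented for this example.

```python
import sqlite3

# UniProt accessions copied from the slide, abbreviated with a prefix.
GP120, CXCR4, CCR5 = "uniprot:P03378", "uniprot:P48061", "uniprot:P10147"
PYK2, FAK = "uniprot:O43150", "uniprot:Q05397"

# Assumed flattening of "gp120 binding to CXCR4 or CCR5 activates PYK2 and FAK".
TRIPLES = [
    (GP120, "binds", CXCR4),
    (GP120, "binds", CCR5),
    (GP120, "activates", PYK2),
    (GP120, "activates", FAK),
]


def build_db(triples):
    """Load the triples into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statement (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO statement VALUES (?, ?, ?)", triples)
    return conn


def objects_of(conn, subject, predicate):
    """Return all objects linked to a subject by a predicate, sorted."""
    rows = conn.execute(
        "SELECT object FROM statement "
        "WHERE subject = ? AND predicate = ? ORDER BY object",
        (subject, predicate),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_db(TRIPLES)
    print(objects_of(conn, GP120, "binds"))
```

A question such as "what does gp120 bind?" then becomes an ordinary query rather than a reading task.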

  18. Why go to all this effort?

  19. Views from the database side “Before the end of the next decade, pathway databases will become scientific journals and journals will become databases. Biologists will be greatly empowered, and bioinformatics will continue its long evolution.” Lincoln Stein (Reactome) “Is a biological database any different than a biological journal? I am working toward reaching an answer of, no, there is no difference.” Phil Bourne (Protein Data Bank)

  20. Overview • Publishing collaborations: Making databases more like journals • NPG New Technology: Making journals more like databases • Tagging and social bookmarking: New methods of annotation and navigation

  21. A few uses for Connotea • Keeping bookmarks and references in order • Sharing links and ideas within a team (perhaps geographically dispersed) • Providing readers with a (dynamic) list of further or related reading • Encouraging readers to share relevant links with the author and with each other
