
Lyubomir Penev, Terry Catapano, Donat Agosti, Teodor Georgiev, Guido Sautter, Pavel Stoev

Implementation of TaxPub, a JATS extension for domain-specific markup in taxonomy: the experience of a biodiversity publisher. Lyubomir Penev, Terry Catapano, Donat Agosti, Teodor Georgiev, Guido Sautter, Pavel Stoev. JATS-Con, 16-17 Oct 2012. Plazi.


Presentation Transcript


  1. Implementation of TaxPub, a JATS extension for domain-specific markup in taxonomy: the experience of a biodiversity publisher. Lyubomir Penev, Terry Catapano, Donat Agosti, Teodor Georgiev, Guido Sautter, Pavel Stoev. JATS-Con, 16-17 Oct 2012. Plazi

  2. This presentation will focus on: • Implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing • Semantic tagging of and enhancements to published texts • Dissemination of published information to aggregators • Current and future development of TaxPub

  3. Quick facts about Plazi • Plazi, founded in 2008: Swiss-based NGO with members in Switzerland, Germany, the US and Iran • Plazi is a research-based think tank with the mission to promote open access to scientific content • Plazi has four pillars: legal advice, technical solutions (e.g., TaxPub), maintenance of a treatment repository, advocacy • Plazi GmbH, founded in 2012 as a service SME owned by Plazi, provides document conversion services and consultation • Funding from public donors (e.g., the EU) and private sources • Clients are global

  4. Context • Conservation: global biodiversity crisis; increasing loss of species, but no tools to measure and document it • Science: ca. 1.8M species described, ca. 8M expected • Scientific publications: ca. 17,000 species described per annum; ca. 100,000 redescriptions per annum -> rich content; highly fragmented, with over 2,500 journals and books involved -> difficult access • Solution: Open Access and semantically enhanced publications allow immediate registration of new taxa and dissemination of content -> TaxPub JATS/DTD

  5. This presentation will focus on: • Implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing • Semantic tagging of and enhancements to published texts • Dissemination of published information to aggregators • Current and future development of TaxPub

  6. TaxPub • Lightweight extension of the Blue DTD • Described at JATS-Con 2010: “TaxPub: An Extension of the NLM/NCBI Journal Publishing DTD for Taxonomic Descriptions” (http://www.ncbi.nlm.nih.gov/books/NBK47081/) • Treatments (i.e., species descriptions) • <tp:taxon-treatment>, <tp:nomenclature>, <tp:treatment-sec> • Domain-specific content • <taxon-name>: taxonomic names • <materials-citation>: references to specimens • <descriptive-statement>: descriptions of morphological features

  7. <tp:taxon-treatment> <tp:nomenclature> <tp:taxon-name> <tp:taxon-name-part taxon-name-part-type="genus">Platyscelio</tp:taxon-name-part> <tp:taxon-name-part taxon-name-part-type="species">mzantsi</tp:taxon-name-part> <object-id>urn:lsid:zoobank.org:act:D084EF48-4736-444F-916F-2C8CDE23E29B</object-id> <object-id>urn:lsid:biosci.ohio-state.edu:osuc_concepts:242617</object-id> </tp:taxon-name> <tp:taxon-authority>Taekul &amp; Johnson</tp:taxon-authority> <tp:taxon-status>sp. n.</tp:taxon-status> </tp:nomenclature> <tp:treatment-sec sec-type="materials_examined"> ...
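A minimal sketch of how a consumer (an aggregator, for instance) might read the nomenclature block shown on this slide. The slide shows only the `tp` prefix, so the namespace URI below is an assumption to check against the TaxPub DTD in use. The function name `read_nomenclature` and the shortened sample are illustrative, not part of TaxPub.

```python
import xml.etree.ElementTree as ET

# Assumption: the tp prefix is bound to this URI; the slide does not show
# the namespace declaration, so verify it against the actual DTD/article.
TP_NS = "http://www.plazi.org/taxpub"
NS = {"tp": TP_NS}

# Shortened copy of the slide's fragment, closed so that it is well-formed.
SAMPLE = f"""
<tp:taxon-treatment xmlns:tp="{TP_NS}">
  <tp:nomenclature>
    <tp:taxon-name>
      <tp:taxon-name-part taxon-name-part-type="genus">Platyscelio</tp:taxon-name-part>
      <tp:taxon-name-part taxon-name-part-type="species">mzantsi</tp:taxon-name-part>
      <object-id>urn:lsid:zoobank.org:act:D084EF48-4736-444F-916F-2C8CDE23E29B</object-id>
    </tp:taxon-name>
    <tp:taxon-authority>Taekul &amp; Johnson</tp:taxon-authority>
    <tp:taxon-status>sp. n.</tp:taxon-status>
  </tp:nomenclature>
</tp:taxon-treatment>
""".strip()


def read_nomenclature(xml_text):
    """Return name, authority, status and identifiers of one treatment."""
    root = ET.fromstring(xml_text)
    nom = root.find("tp:nomenclature", NS)
    name = nom.find("tp:taxon-name", NS)
    # Map each name part to its type, e.g. {"genus": "Platyscelio", ...}.
    parts = {
        p.get("taxon-name-part-type"): p.text
        for p in name.findall("tp:taxon-name-part", NS)
    }
    return {
        "name": " ".join(parts[k] for k in ("genus", "species") if k in parts),
        "authority": nom.findtext("tp:taxon-authority", namespaces=NS),
        "status": nom.findtext("tp:taxon-status", namespaces=NS),
        # object-id is a plain JATS element, so it carries no tp prefix.
        "ids": [o.text for o in name.findall("object-id")],
    }


record = read_nomenclature(SAMPLE)
print(record["name"], record["authority"], record["status"])
```

Because the name parts are typed, the binomial can be assembled without any text parsing, which is the practical payoff of the markup.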

  8. <tp:treatment-sec sec-type="materials_examined"> <p> <tp:material-citation> <tp:type-status>Holotype</tp:type-status> worker. <tp:taxon-type-location>King Saud Museum of Arthropods (KSMA), College of Food and Agriculture Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia.</tp:taxon-type-location> <tp:collecting-event> <tp:collecting-location>SAUDI ARABIA, Al Bahah province, Amadan forest, Al Mandaq governorate, </tp:collecting-location> <named-content content-type="dwc:verbatimCoordinates">20°12'N, 41°13'E</named-content> , 1881 m.a.s.l. 19.V.2010 (M. R. Sharaf &amp; A. S. Aldawood Leg.); </tp:collecting-event> </tp:material-citation> </p> </tp:treatment-sec>
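The `dwc:verbatimCoordinates` value above is kept exactly as printed. A downstream consumer would typically also want decimal degrees, as in Darwin Core's `decimalLatitude` and `decimalLongitude` terms. The sketch below is one hypothetical way to do that conversion; it assumes the degrees-and-minutes pattern seen on the slide and is not part of TaxPub or of any Pensoft or Plazi tool.

```python
import re

# Matches degree/minute pairs such as 20°12'N; seconds are not handled
# because the slide's example has none (an assumption of this sketch).
_PART = re.compile(r"(\d+)°(\d+)'\s*([NSEW])")


def verbatim_to_decimal(text):
    """Convert a string like 20°12'N, 41°13'E to (lat, lon) in decimal degrees."""
    coords = {}
    for deg, minutes, hemi in _PART.findall(text):
        value = int(deg) + int(minutes) / 60
        if hemi in "SW":
            value = -value  # southern and western hemispheres are negative
        coords["lat" if hemi in "NS" else "lon"] = value
    if set(coords) != {"lat", "lon"}:
        raise ValueError(f"could not read a latitude and a longitude from {text!r}")
    return coords["lat"], coords["lon"]


lat, lon = verbatim_to_decimal("20°12'N, 41°13'E")
print(round(lat, 4), round(lon, 4))
```

Keeping the verbatim string in the XML and deriving the decimal form on export means the published text is never altered by the conversion.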

  9. TaxPub: Recent and Future Developments • Largely stable • <x> • Greenfication • Interest from journals: • European Journal of Taxonomy • Zootaxa (via EOL) • Markup of morphological descriptions

  10. <p>Spreading shrub; stems erect, <Categorical uri="http://ontology.org/plant/stem-color"> <State uri="http://ontology.org/plant/greenish">greenish</State> </Categorical>. Leaves deciduous early in summer (particularly when infected with Diseasomyces), oblong, apex obtuse, glabrous or weakly hirsute; stipules sharply pointed, <Quantitative uri="http://ontology.org/plant/stipule-width"><value value="3.2">3,2mm</value></Quantitative> wide, <Categorical uri="http://ontology.org/plant/stipule-color"> <State uri="http://ontology.org/plant/black">black</State> or <State uri="http://ontology.org/plant/brown">darkish brown,</State></Categorical> extremely rarely yellow, often shallowly joined around the node; spines stout.</p>
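To illustrate what this inline markup makes possible, here is a small sketch that pulls the tagged characters out of such a paragraph. It relies only on the element and attribute names visible on the slide (`Categorical`, `State`, `Quantitative`, `value`, `uri`). The sample is abridged, and `extract_characters` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the slide's paragraph; the ontology URIs are the
# placeholder ones used on the slide.
SAMPLE = (
    '<p>Spreading shrub; stems erect, '
    '<Categorical uri="http://ontology.org/plant/stem-color">'
    '<State uri="http://ontology.org/plant/greenish">greenish</State>'
    '</Categorical>; stipules sharply pointed, '
    '<Quantitative uri="http://ontology.org/plant/stipule-width">'
    '<value value="3.2">3,2mm</value></Quantitative> wide, '
    '<Categorical uri="http://ontology.org/plant/stipule-color">'
    '<State uri="http://ontology.org/plant/black">black</State> or '
    '<State uri="http://ontology.org/plant/brown">darkish brown,</State>'
    '</Categorical> extremely rarely yellow.</p>'
)


def extract_characters(xml_text):
    """Return (character URI, list of states or numeric values) in document order."""
    root = ET.fromstring(xml_text)
    rows = []
    for el in root.iter():
        if el.tag == "Categorical":
            rows.append((el.get("uri"), [s.get("uri") for s in el.findall("State")]))
        elif el.tag == "Quantitative":
            # The machine-readable number is in the attribute; the element
            # text keeps the printed form ("3,2mm") untouched.
            rows.append((el.get("uri"), [float(v.get("value")) for v in el.findall("value")]))
    return rows


for character, values in extract_characters(SAMPLE):
    print(character, values)
```

The printed text ("3,2mm", "darkish brown,") stays as the author wrote it, while the attributes carry the normalized values a database would store.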

  11. TaxPub: Challenges • Maintenance • Sourceforge • Volunteer effort, little time, no funding… • Supported by Plazi • Documentation • Comments with ad hoc markup in extension files • Converted to HTML by NCBI Tool • Maintained at Species-ID wiki

  12. Quick facts about Pensoft & ZooKeys • Pensoft, founded in 1992: more than 700 books published; two offices, in Sofia and Moscow; 16 employees • ZooKeys launched in July 2008 as the first mandatory Open Access journal in taxonomy; 205 issues, 20,000 pages IN FOUR YEARS • All new taxa registered in ZooBank and supplied to EOL, Plazi and the wiki Species-ID • CrossRef member; covered by ISI and Scopus; indexed in Zoological Record, DOAJ, CABI Abstracts, Google Scholar; archived in PubMedCentral and CLOCKSS • Pensoft Journal System: XML-based online editorial system; publishing services offered to society and institutional journals

  13. ZooKeys growth

  14. The XML landscape for legacy and prospective taxonomic literature • PROSPECTIVE PUBLISHING | HISTORICAL LITERATURE • TaxonX, taXMLit schemas • Plazi's GoldenGATE editor • Content management systems & repositories (e.g., EOL, GBIF, Scratchpads) • TaxPub XML schema • Pensoft markup tool • Automated submission; peer review • Marked-up publications: PDF, HTML and XML • Unified marked-up final output: taxon treatments, keys, images, localities • Archiving • END USERS • Wikis: Species-ID, Wikispecies, Wikipedia • Aggregators (EOL, GBIF) • Electronic archives; data centers • Indexing (IPNI, ZooBank, MycoBank, GNA)

  15. Four stages of the XML-based editorial workflow • SUBMISSION: XML-tagged or non-tagged manuscripts? • PEER REVIEW/EDITORIAL PROCESS: the technical challenges of XML markup • PUBLICATION: different publishing formats, and to whom are they addressed? • DISSEMINATION: how to achieve maximum distribution of published information

  16. Nomenclature • Literature • Descriptions • Images • Occurrences • But why mark up? Is it really needed? Who will be using it? Plazi

  17. What does XML give readers that the usual PDF does not?

  18. Semantic enhancements to published texts

  19. Semantic enhancements to published texts

  20. Archiving in PubMedCentral

  21. Automated export of species descriptions to Encyclopedia of Life (EOL) XML MARK UP

  22. Automated harvesting and deposition of taxon treatments in Plazi

  23. Export of content to the Wiki environment

  24. Species descriptions on Wikispecies and Wikimedia Commons

  25. The Future of TaxPub and its implementations • More semantic Web enhancements! • Pensoft Writing Tool (PWT): a collaborative article writing platform • Community-based and open peer review process • Biodiversity Data Journal will publish any kind of “small data”: checklists, nomenclatural acts, taxon treatments

  26. The collaborative article authoring tool

  27. Why is the Biodiversity Data Journal needed?

  28. RE-USE of CONTENT • Publishing and sharing of primary data • Drawings: Slavena Peneva • Primary data

  29. Biodiversity Data Journal • All data matter: no lower or upper limit of manuscript size! • ALL within a single online collaborative platform, including the writing of the manuscript! • Collaborative article authoring tool • Community peer review with “open” and “public” options, on top of conventional peer review • Online editorial process and version control • Standard-compliant (Darwin Core, Dublin Core, NLM JATS, etc.) • Pre-defined, biological Code-compliant article templates

  30. Life cycle of data published in the BDJ • Biodiversity manuscript: genome data, occurrence data, phylogenetic data, morphometric data, image galleries, environmental data, any other data • XML MARK UP • Structured text (data!): taxon names, taxon treatments, occurrence data, ARTICLES, bibliographies • COL, Plazi, Wiki, BHL

  31. The lessons learned. The main difficulties are caused by: • The specificity of the domain (e.g., taxon names, synonyms, instability of nomenclature, lack of a global LSID infrastructure, etc.) • Markup of occurrence data (certainly a great challenge) • Cost efficiency of the markup process • Sociological barriers: the majority of authors are not willing to change their writing habits; most are still not aware of the tremendous advantages of Web 2.0 technologies • Most small taxonomy publishers (and some bigger ones) have no experience in XML-based editorial workflows, or they simply can’t afford it

  32. “Semi-automatically generated semantic, enhanced e-publications are the only way to describe the missing 10 M species, and to deal with an increasing flood of data.” Donat Agosti • It is not easy, but... • ...it is exciting... • ...however possible only through Open Access!
