
The Future of Microalgal Taxonomy. Anne Thessen (athessen@mbl.edu), David Patterson (dpatterson@mbl.edu). Data Conservancy, Life Sciences.


Presentation Transcript


  1. The Future of Microalgal Taxonomy. Anne Thessen, athessen@mbl.edu; David Patterson, dpatterson@mbl.edu (Data Conservancy, Life Sciences)

  2. Scientist’s Dream: Computer, what is the trajectory of the planet Seti Alpha 5?

  3. Taxonomist’s Dream: How many algal species can be found on this planet?

  4. Taxonomist’s Dream: What species is this?

  5. Taxonomist’s Dream

  6. Taxonomist’s Dream

  7. Setting the stage for a ‘big new biology’
     • BIG = data-centric (like particle physics and astronomy)
     • Characterized by data sharing via a virtual pool
     • New = new skill sets, tools, cyber-infrastructure to exploit the data pool
     • Data-driven discovery as a new means of understanding
     • GenBank as a model within the Life Sciences

  8. Small science Small number of providers with lots of data. Large number of providers with small amounts of data.

  9. Names: Limulus polyphemus, Kiwa hirsuta, Trypanosoma brucei, Aa paleacea, Homo sapiens, Pieris rapae, Kingia australis, Pieris japonica, Osedax frankpressi

  10. Many names for one taxon: Didymosphenia geminata, Didimosphenia geminata, Gomphonema vulgare, Gomphonema geminatum, Echinella geminata, Rock snot, Didymo

  11. Reconciliation Group: Didymosphenia geminata, Didimosphenia geminata, Didymo, Rock Snot, Echinella geminata, Gomphonema geminatum, Gomphonema vulgare

  12. Reconciliation Group: Didymosphenia geminata, Didimosphenia geminata, Didymo, Rock Snot, Echinella geminata, Gomphonema geminatum, Gomphonema vulgare
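
The reconciliation group on slides 11 and 12 can be pictured as a lookup from any variant to one preferred name. The following is a minimal Python sketch under stated assumptions: the table is built by hand from the names on the slide, the first name listed is treated as the preferred one, and `RECONCILIATION_GROUPS` and `reconcile` are hypothetical names for illustration, not part of any Global Names service.

```python
# Hypothetical reconciliation table: one preferred name mapped to the
# variants listed on the slide (synonyms, a misspelling, vernacular names).
# Treating "Didymosphenia geminata" as the preferred name is an assumption.
RECONCILIATION_GROUPS = {
    "Didymosphenia geminata": [
        "Didimosphenia geminata",
        "Didymo",
        "Rock snot",
        "Echinella geminata",
        "Gomphonema geminatum",
        "Gomphonema vulgare",
    ],
}


def _normalize(name):
    """Lowercase and collapse whitespace so lookups ignore case and spacing."""
    return " ".join(name.lower().split())


# Reverse index: every normalized variant (and the preferred name itself)
# points to the preferred name of its group.
_INDEX = {}
for preferred, variants in RECONCILIATION_GROUPS.items():
    for name in [preferred, *variants]:
        _INDEX[_normalize(name)] = preferred


def reconcile(name):
    """Return the preferred name for a known variant, or None if unknown."""
    return _INDEX.get(_normalize(name))


print(reconcile("rock  snot"))
print(reconcile("Gomphonema geminatum"))
```

Data attached to any variant can then be pooled under the single preferred name.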

  13. One name for many taxa: Cyclophora
      • Cyclophora Castracane 1878. Species: Cyclophora tenuis. Contextual data: Diatom, Chloroplast, Frustule, Benthic, Marine
      • Cyclophora Hübner 1822. Species: Cyclophora porata. Contextual data: Food, Moth, Wings, Exoskeleton, Caterpillar
      • Disambiguate by authority, species, contextual data
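
Disambiguation by contextual data, as on slide 13, can be sketched as scoring each candidate taxon by how many of its context terms appear in the surrounding text. This is an illustrative Python sketch only: the assignment of context terms to each Cyclophora follows the grouping on the slide, and `CANDIDATES` and `disambiguate` are hypothetical names, not an existing API.

```python
# Two taxa that share the genus name Cyclophora, as listed on the slide.
CANDIDATES = [
    {
        "name": "Cyclophora",
        "authority": "Castracane 1878",
        "species": {"Cyclophora tenuis"},
        "context": {"diatom", "chloroplast", "frustule", "benthic", "marine"},
    },
    {
        "name": "Cyclophora",
        "authority": "Hübner 1822",
        "species": {"Cyclophora porata"},
        "context": {"food", "moth", "wings", "exoskeleton", "caterpillar"},
    },
]


def disambiguate(text, candidates=CANDIDATES):
    """Pick the candidate whose context terms overlap most with the text.

    Returns None when no candidate matches or when the best score is tied.
    """
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    scores = [len(c["context"] & words) for c in candidates]
    best = max(scores)
    if best == 0 or scores.count(best) > 1:
        return None
    return candidates[scores.index(best)]


hit = disambiguate("Cyclophora cells with a frustule, collected from a marine benthic mat.")
print(hit["authority"] if hit else "ambiguous")
```

A fuller version would also check the authority string and the species epithet, which the slide lists as the other two disambiguation cues.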

  14. Global Names Architecture [diagram: data and service consumers; consumer services; GNA; experts; provider services; data and service providers]

  15. Managing names to manage biodiversity data
      • All names (scientific, vernacular, surrogate)
      • For all organisms
      • Many names for one species reconciled
      • One name for many species disambiguated
      • Global Names Architecture: a virtual layer, using names services to link together distributed data
      • Globalnames.org
      • Micro*scope (microscope.mbl.edu) and Encyclopedia of Life (eol.org)
      Names-based cyberinfrastructure

  16. Legacy Data: Narrative tradition in biology. Too much for a human. Can we get a machine to do the work? NLP!!!

  17. Legacy Data: Use NLP/machine learning to extract names and characters (Hong Cui)
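
As a much simpler stand-in for the NLP and machine-learning extraction named on slide 17, the sketch below finds names in legacy text by dictionary matching. It is not the method referred to on the slide; `KNOWN_NAMES` and `find_names` are hypothetical, and the sample sentence is invented for illustration.

```python
import re

# Hypothetical dictionary of names to look for (taken from earlier slides).
KNOWN_NAMES = ["Didymosphenia geminata", "Gomphonema geminatum", "Spirogyra"]


def find_names(text, known_names=KNOWN_NAMES):
    """Return (offset, name) pairs for every known name found in the text."""
    hits = []
    for name in known_names:
        # Word boundaries prevent matches inside longer words.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            hits.append((match.start(), name))
    return sorted(hits)


sample = "Blooms of Didymosphenia geminata were reported; Spirogyra was also present."
for offset, name in find_names(sample):
    print(offset, name)
```

Dictionary matching only finds names it already knows, which is one reason statistical or rule-based NLP is attractive for large legacy corpora.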

  18. Legacy Data: Spirogyra:chloroplasts:present

  19. Legacy Data: Spirogyra:chloroplasts:present:attribution
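
The colon-separated statements on slides 18 and 19 can be read as small structured records. A minimal Python sketch follows, assuming the fields mean taxon, character, state, and an optional attribution; the field names, the `Statement` type, and `parse_statement` are illustrative assumptions, not a format defined in the talk.

```python
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    """One extracted fact; field names are assumed for illustration."""
    taxon: str
    character: str
    state: str
    attribution: Optional[str] = None


def parse_statement(text):
    """Parse 'taxon:character:state[:attribution]' into a Statement.

    Only the first three colons split fields, so an attribution may
    itself contain colons.
    """
    parts = [part.strip() for part in text.split(":", 3)]
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected taxon:character:state, got {text!r}")
    return Statement(*parts)


print(parse_statement("Spirogyra:chloroplasts:present"))
print(parse_statement("Spirogyra:chloroplasts:present:hypothetical source"))
```

Keeping the attribution with each statement lets an aggregated fact be traced back to the publication it was extracted from.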

  20. Coffee Ontology [diagram; visible labels: "is a", "coffee drink"]

  21. Existing Ontology

  22. Semantic Web

  23. Data Discovery and Aggregation

  24. Future Data Triple Store
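
To make the triple store on slide 24 concrete, here is a toy in-memory version in Python with wildcard pattern matching. It is a sketch only: the `TripleStore` class and the predicate names are hypothetical, and a production system would use an RDF store with a query language rather than this structure.

```python
class TripleStore:
    """Toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(want is None or want == got for want, got in zip(pattern, triple))
        )


store = TripleStore()
# Facts drawn from earlier slides; the predicate names are assumptions.
store.add("Spirogyra", "chloroplasts", "present")
store.add("Cyclophora tenuis", "habitat", "marine")
store.add("Cyclophora tenuis", "habitat", "benthic")

print(store.match(subject="Cyclophora tenuis"))
print(store.match(predicate="chloroplasts", obj="present"))
```

Because every fact has the same three-part shape, data from many providers can be pooled and queried with the same patterns.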

  25. The New Workforce
      • Informatics/computing training
      • Modified workflows
      • Importance of data management and preservation

  26. In Summary
      • Big New Biology is coming; taxonomy can benefit from being a part of it
      • Existing data can be made machine-readable using information extraction algorithms
      • Existing workflows can be modified to capture data close to the source
      • Data can be shared using the semantic web

  27. Acknowledgments: Dima Mozzherin, David Shorthouse, Sayeed Choudhury, Pete DeVries
