Biodiversity Informatics at the Natural History Museum. Ed Baker Terrestrial Invertebrates, Department of Life Sciences & NHM Informatics Initiative. Science as a Slow Cooker. Only the surface visible Lid kept on for extended periods of time Uses cheap cuts of raggy meat
Biodiversity Informatics at the Natural History Museum Ed Baker Terrestrial Invertebrates, Department of Life Sciences & NHM Informatics Initiative
Science as a Slow Cooker • Only the surface visible • Lid kept on for extended periods of time • Uses cheap cuts of raggy meat • Ingredients lose their nutritional value • Children at risk due to high temperatures http://ispiders.blogspot.co.uk/2011/11/realtime-web.html
We like data • 70 million+ specimens collected over 400 years • 350,00+ books • ??? Unpublished datasets in archives, notebooks, computers • ??? In the minds of staff
How do we provide access? • Digitisation of specimens and associated data • Scanning and transcribing books, journals, archives • Providing tools for managing the data life cycle • Changing the way we publish: data publication
Flowing Data Collection Curation Use Publication
Flowing Data Collection Curation Sits in desk drawer or on a hard drive until…. Somebody retires Somebody dies Project is cancelled
Flowing Data Collection Curation Use Data Publication Publication Re-use Re-use Re-use Re-use
Flowing Data: from collection to reuse Collection Curation Use Data Publication Publication Re-use Re-use Re-use Re-use
Collection Citizen Science Automated identification and monitoring Traditional taxonomic sources
Flowing Data: from collection to reuse Curation Use Data Publication Publication Re-use Re-use Re-use Re-use
Curation • Websites for communities to publish and curate: • Taxonomy / nomenclature • Bibliographies • Specimen information • Character matrices
Flowing Data: from collection to reuse Use Data Publication Publication Re-use Re-use Re-use Re-use
Flowing Data: from collection to reuse Data Publication Publication Re-use Re-use Re-use Re-use
Publication (Data) • Datasets • Single species descriptions • Checklists • Software
Flowing Data: from collection to reuse Publication Re-use Re-use Re-use Re-use
Publication (Research) • Traditional research • Systematic zoology • Phylogeny • Biogeography
Flowing Data: from collection to reuse Re-use Re-use Re-use Re-use
The Problem of Scale • Data is being generated by tens of thousands of researchers, in thousands of institutions • Hard to find what you need • Hard to know if what you need actually exists • Impossible to go through researcher by researcher
NHM Data Portal • Aggregator for NHM science data • Visualisation tools for datasets • Allows export of NHM data for re-use
The Informatics Landscape >18K specimen records (local small scale coverage) >276M specimen records (worldwide coverage)
The Informatics Landscape A webpage for every species Aggregate specimen and observation data globally
Wikimedian in Residence • Make NHM content available under open licenses for use on Wikimedia projects (and elsewhere) • Reach of Wikipedia: BBC, Encyclopedia of Life • Wikisource: Transcription and translation crowd-sourcing
"Everybody makes mistakes. And if you don't expose your raw data, nobody will find your mistakes." Jean-Claude Bradley http://bit.ly/146ugIv