Cynthia Parr Phenotype RCN NESCent 25 February 2013. @cydparr @eol. EOL aggregates and curates across topics, across the tree of life. Scientific Databases, including BHL, GBIF, ALA, INBio, COL, Scratchpads, LifeDesks. Scientific Journals. Curate. Aggregate. Comment
Cynthia Parr Phenotype RCN NESCent 25 February 2013 @cydparr @eol
EOL aggregates and curates across topics, across the tree of life
• Scientific Databases, including BHL, GBIF, ALA, INBio, COL, Scratchpads, LifeDesks
• Scientific Journals
• Curate
• Aggregate
• Comment
• Rate, Collect
• eol.org
• API
• Quality control
• Third party apps
EOL summarizes knowledge
Erosaria caputserpentis, Serpent's Head Cowrie (from Moorea Biocode)
Depth range based on 51 specimens in 2 taxa. Water temperature and chemistry ranges based on 40 samples.
Environmental ranges
• Depth range (m): -5 - 67
• Temperature range (°C): 23.011 - 28.496
• Nitrate (umol/L): 0.048 - 0.923
• Salinity (PPS): 33.821 - 35.837
• Oxygen (ml/l): 4.349 - 4.825
• Phosphate (umol/l): 0.088 - 0.228
• Silicate (umol/l): 0.983 - 4.026
From GBIF
From OBIS
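A range summary like the one on this slide can be derived from individual sample records by grouping them per variable and taking the minimum, maximum, and count. The following is a minimal sketch of that idea, not EOL's actual pipeline: the record layout, the function name `summarize_ranges`, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (variable, unit, value).
# The values are invented for illustration, not taken from GBIF or OBIS.
samples = [
    ("Temperature", "°C", 23.5),
    ("Temperature", "°C", 28.1),
    ("Temperature", "°C", 26.0),
    ("Salinity", "PPS", 34.2),
    ("Salinity", "PPS", 35.6),
]


def summarize_ranges(records):
    """Group values by (variable, unit) and report min, max, and sample count."""
    grouped = defaultdict(list)
    for variable, unit, value in records:
        grouped[(variable, unit)].append(value)
    return {
        key: {"min": min(values), "max": max(values), "n": len(values)}
        for key, values in grouped.items()
    }


summary = summarize_ranges(samples)
for (variable, unit), stats in summary.items():
    print(f"{variable} range ({unit}): {stats['min']} - {stats['max']} (n={stats['n']})")
```

Reporting the count alongside each range mirrors the slide's "based on 40 samples" note, so a reader can judge how well supported a range is.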
Statistics
Today
• 3.3 million pages – one (or more) per taxon
• 5 million CC-licensed data objects
• Over 1 million pages with objects
• 200+ partner databases
• 1200 curators/1000s contributors/~64,000 members
2 years ago
• 2.8 million pages – one (or more) per taxon
• 2 million data objects
• 500 thousand pages with objects
• 100+ partner databases
• 700 curators/1000s contributors/~46,000 members
We have an infrastructure . . .
• Aggregation mechanisms
• Names resolution
• Curation mechanisms
• Public and machine interfaces
• User-created collections
What are the next use cases to tackle?
How could ontologies & annotations help?
See structured info on EOL pages
Discover and identify: “find taxa with these characteristics”
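Once trait information is structured, a query such as "find taxa with these characteristics" reduces to matching key/value pairs. The sketch below illustrates this with an in-memory list; the taxon names, trait keys, and the `find_taxa` helper are hypothetical placeholders, not an EOL interface.

```python
# Hypothetical structured trait records; names and values are placeholders.
taxa = [
    {"taxon": "Taxon A", "traits": {"color": "blue", "habitat": "marine"}},
    {"taxon": "Taxon B", "traits": {"color": "red", "habitat": "marine"}},
    {"taxon": "Taxon C", "traits": {"color": "blue", "habitat": "alpine"}},
]


def find_taxa(records, **wanted):
    """Return names of taxa whose traits match every requested key/value pair."""
    return [
        record["taxon"]
        for record in records
        if all(record["traits"].get(key) == value for key, value in wanted.items())
    ]


matches = find_taxa(taxa, color="blue", habitat="marine")
print(matches)
```

Exact string matching only works if providers use the same terms for the same trait, which is one place where shared ontologies and annotations could help.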
Browse the whole page semantically, link to related resources (LOD: linked open data)
• Google Summer of Code with Phenoscape (Alex Ginsca)
• Using DBPedia Spotlight to extract associations among taxa and add to Linked Open Data cloud (Devries and Thessen)
• Linking names, literature, phylogeny (Page)
• Resolving archeological data on animal domestication in the Near East (Alexandria Archive Institute)
Promote NLP text mining and crowdsourcing
• Altitude Specificity of Flower Coloration (Wright)
• Species Interaction Datasets: Integration, Visualization, and Analysis (Poelen and Mungall)
• Crowd-sourced data to examine morphological impacts of extinction risk in ray-finned fishes (Chang)
• Macroecological patterns in butterfly-hostplant associations (Ferrer-Parris)
• Discovering habitat terms in EOL contents (Pafilis)
Easy access to analyzable data
“Are blue organisms more common in high altitudes?”
“How can I predict vulnerability to climate change based on life history characteristics?”
“What organisms should I collect to fill in gaps in genome quality data?”
• Look for data, download for all taxa
• Create a collection of taxa, download all data
• Use Reol: an R interface to EOL (Banbury, O'Meara) http://barbbanbury.info/barbbanbury/Reol.html
• Find more specialized data repositories
Dynamic online knowledge
• Support summaries with networks of evidence
• E.g. Bergmann’s rule: animals living in higher latitudes have larger body size
• As evidence grows or changes, change the knowledge summary
• Flag evidence that is in conflict with the summary
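One way to picture a summary backed by evidence is a claim object that recomputes its status whenever evidence is added and can list the items that disagree with it. This is only a sketch under stated assumptions: the class names, the simple majority rule in `status`, and the "dataset-N" sources are invented for illustration and do not describe an existing EOL feature.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    source: str     # hypothetical identifier of a dataset or publication
    supports: bool  # True if the observation agrees with the claim


@dataclass
class KnowledgeSummary:
    claim: str
    evidence: list = field(default_factory=list)

    def add(self, item):
        """Attach a new piece of evidence to the claim."""
        self.evidence.append(item)

    def status(self):
        """Recompute the summary status from current evidence (assumed majority rule)."""
        if not self.evidence:
            return "no evidence"
        support = sum(1 for e in self.evidence if e.supports)
        conflict = len(self.evidence) - support
        if conflict == 0:
            return "supported"
        if support == 0:
            return "contradicted"
        return "supported with exceptions" if support > conflict else "contested"

    def conflicting(self):
        """Return the evidence items that disagree with the claim."""
        return [e for e in self.evidence if not e.supports]


rule = KnowledgeSummary("Animals living in higher latitudes have larger body size")
rule.add(Evidence("dataset-1", True))
rule.add(Evidence("dataset-2", True))
status_before = rule.status()

# New evidence arrives that disagrees; the summary changes and the item is flagged.
rule.add(Evidence("dataset-3", False))
status_after = rule.status()
flagged = [e.source for e in rule.conflicting()]
print(status_before, "->", status_after, "| flagged:", flagged)
```

A real system would likely weight evidence by quality and sample size rather than count items equally.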
Summarize data across providers
Flag outlier data
Salinity envelope (n=40)
Erosaria caputserpentis, Serpent's Head Cowrie
From OBIS
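Flagging outliers in a pooled envelope can be sketched with a standard interquartile-range (IQR) fence. The slide does not say which method EOL uses, so the IQR rule, the multiplier `k=1.5`, and the salinity values below are assumptions chosen for illustration.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical salinity readings pooled from several providers.
# The value 12.0 stands in for a suspect record, e.g. a unit or entry error.
salinity = [34.1, 34.5, 34.8, 35.0, 35.2, 35.4, 35.6, 12.0]
outliers = flag_outliers(salinity)
print(outliers)
```

Flagged values are returned rather than dropped, so a curator can decide whether each one is an error or a genuine extreme.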
The big picture
In progress:
• Marine computable data
• Draft phylogenetic tree from Open Tree of Life project
• TraitBank: access to computable descriptive information across the tree of life
Thanks to
Our funders
• John D. and Catherine T. MacArthur Foundation
• Alfred P. Sloan Foundation
• Smithsonian Institution
• Marine Biological Laboratory
• Harvard University
• David Rubenstein and other funders and donors
All our content providers and global partners
Volunteer curators and individual contributors via Flickr, Wikimedia, and members of EOL