Hannu Saarenmaa – University of Eastern Finland

Organising data flows and modelling for the Essential Biodiversity Variables. Hannu Saarenmaa – University of Eastern Finland. GEO BON, WG8 – Data Integration and Interoperability; EU BON, WP2 – Data Integration and Interoperability; BioVeL, WP2 – Workflows for Scientific Research.


Presentation Transcript


  1. Organising data flows and modelling for the Essential Biodiversity Variables
     Hannu Saarenmaa – University of Eastern Finland
     GEO BON, WG8 – Data Integration and Interoperability
     EU BON, WP2 – Data Integration and Interoperability
     BioVeL, WP2 – Workflows for Scientific Research
     GEO-X Plenary, Geneva, 14 January 2014

  2. Essential Biodiversity Variables
     • Conceived by GEO BON collaborators (Pereira et al. (2013) “Essential Biodiversity Variables”, Science, Vol. 339, 18 Jan 2013).
     • EBVs facilitate data integration by providing an intermediate abstraction layer between primary observations and indicators.
     • Computed from a large number of inputs (monitoring/incidental data).
     • EBVs aim to help observation communities harmonise monitoring by identifying how variables should be sampled and measured.
     • EBVs standardise an ontology for biodiversity and harmonise measurements, observations, and protocols.
     • Endorsed by the Convention on Biological Diversity (CBD) and in line with the 2020 Aichi Targets.
     • Provide a focus for GEO BON and hence for the interoperability thrust within GEO BON.
     • A use case that GEO BON, EU BON and BioVeL focus on.

  3. Where does the data come from?
     • In Europe there are about 2000 biodiversity observation networks (only 643 listed by EUMON).
     • GBIF has 10,000 data sets, openly accessible, conforming to GEOSS Data Sharing Principles.
     • LTER/DataONE has 1,000s of biodiversity data sets.
     • EU BON is carrying out a gap analysis:
       • There is a massive duplication of effort in data management, and a lack of data sharing.
       • There are very few data sets whose "quality" (coverage, accuracy, etc.) has been documented and guaranteed.
       • A so-called "data core" in biodiversity has not yet been defined.

  4. Biodiversity Virtual e-Laboratory: BioVeL processing services and workflows
     • “Workflows” (series of data analysis steps) make it possible to process vast amounts of data.
     • Build your own workflow: select and apply successive “services” (data processing techniques).
     • Import data from one’s own research and/or from existing libraries (e.g. GBIF, Catalogue of Life).
     • Access a library of workflows and re-use existing workflows.
     • Cut down research time and overhead expenses.
     [Figure: Part of a workflow to study the ecological niche of the horseshoe crab]

  5. Aim: Predictive modelling of biodiversity change
     The analytical cycle: Data discovery; Data assembly, cleaning, and refinement; Ecological Niche Modelling; Statistical analysis
     Available tools from a growing family of ENM workflows, released to the public at www.biovel.eu
     • Data Refinement Workflow (DRW) for pre-processing
       • Taxonomic name resolution / occurrence retrieval
       • Geo-temporal data selection using ‘BioSTIF’
       • Data quality checks / filtering using ‘Google Refine’
     • Ecological Niche Modelling Workflow (ENM)
       • Classic ENM with 15 algorithms
       • Separate BioClim workflow (requires special inputs)
     • ENM Statistical Workflow (ESW) for post-processing
       • DIFF: extent and intensity of change
       • STACK: extent, intensity, and a cumulated potential
       • SHIFT: shift of the centre of gravity (direction, and length in kilometres)
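The DIFF and SHIFT outputs listed on this slide can be illustrated with a minimal sketch. The slide does not define how the ESW computes them, so the definitions below are assumptions: DIFF is taken as a cell-wise difference between two suitability grids, and SHIFT as the displacement of the suitability-weighted centroid. The function names, the `cell_km` parameter, and the toy grids are all hypothetical and not part of BioVeL.

```python
import math

def diff_stats(before, after, threshold=0.0):
    """Assumed DIFF: cell-wise difference between two equally sized grids.

    Returns (extent, intensity): the number of cells whose value changed by
    more than `threshold`, and the summed absolute change over all cells.
    """
    deltas = [a - b
              for row_b, row_a in zip(before, after)
              for b, a in zip(row_b, row_a)]
    extent = sum(1 for d in deltas if abs(d) > threshold)
    intensity = sum(abs(d) for d in deltas)
    return extent, intensity

def centre_of_gravity(grid):
    """Suitability-weighted centroid as (row, col) in grid-cell units.

    Assumes the grid holds at least one non-zero value.
    """
    total = sum(sum(row) for row in grid)
    r = sum(i * v for i, row in enumerate(grid) for v in row) / total
    c = sum(j * v for row in grid for j, v in enumerate(row)) / total
    return r, c

def shift(before, after, cell_km=1.0):
    """Assumed SHIFT: (length in km, compass bearing in degrees) of the centroid move.

    Assumes square cells of side `cell_km` and row 0 at the northern edge.
    """
    r0, c0 = centre_of_gravity(before)
    r1, c1 = centre_of_gravity(after)
    d_north = (r0 - r1) * cell_km
    d_east = (c1 - c0) * cell_km
    length = math.hypot(d_north, d_east)
    bearing = math.degrees(math.atan2(d_east, d_north)) % 360
    return length, bearing

# Toy grids (made-up values): suitability moves from the south-west corner
# to the north-west corner.
before = [[0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [1.0, 0.0, 0.0]]
after = [[1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]

extent, intensity = diff_stats(before, after)
length_km, bearing_deg = shift(before, after, cell_km=10.0)
```

A real analysis would work on projected rasters, where distances depend on the map projection; the constant cell size here only keeps the sketch short.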

  6. Seamless exchange of data layers
     http://openmodeller.cria.org.br/

  7. Use case: The spruce bark beetle, Ips typographus, disturbance of forest ecosystems
     [Map panels: Pre 2002; Year 2050; Difference]
     • Statistical processing of the difference in Finland indicates that the susceptibility of spruce forests to Ips typographus damage will increase five-fold by 2050.
     • Policy advice: stricter forest hygiene through tougher legislation, so that Ips populations are kept at a minimum, because of the increased risk.
     • Papers for Silva Fennica and INTECOL session proceedings at Journal of Ecology.

  8. Outline of the use case
     • Running the Ecological Niche Modelling (ENM) workflow for a large number of species
       • Process data points for hundreds of species (e.g. plants, butterflies, …)
       • Use data mostly from GBIF, but also from elsewhere
       • Each individual species may have on the order of 10^5 data points
       • Run openModeller-based ENM for all the data points
       • Choose predictive layers from WorldClim and GEOSS sources
     • Generate summary statistics that can answer questions such as:
       • How many species are increasing? How many are decreasing? Does the flora/fauna move in any direction? Is the distribution fragmenting? Is the distribution shrinking? How many populations are becoming marginalised?
     • Prototype automatic data processing for computing the Essential Biodiversity Variables (EBVs)
     [Figure label: EBVs?]
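The first two summary questions on this slide (how many species are increasing, how many are decreasing) could be answered along the following lines. This is only a sketch under stated assumptions: each species is reduced to two numbers, a predicted suitable area now and in the future, and a 5 % tolerance band separates "stable" from a real change. The function names, the tolerance, and the example figures are invented for illustration and do not come from the presentation.

```python
from collections import Counter

def classify_trend(area_now, area_future, tolerance=0.05):
    """Label one species by the relative change in predicted suitable area."""
    if area_now == 0:
        # No current area: any future area counts as an increase.
        return "increasing" if area_future > 0 else "stable"
    change = (area_future - area_now) / area_now
    if change > tolerance:
        return "increasing"
    if change < -tolerance:
        return "decreasing"
    return "stable"

def summarise(species_areas, tolerance=0.05):
    """Count species per trend class.

    `species_areas` maps a species name to (area_now, area_future).
    """
    return Counter(classify_trend(now, future, tolerance)
                   for now, future in species_areas.values())

# Made-up example values (e.g. square kilometres), not real model output.
example = {
    "species A": (1000.0, 1500.0),
    "species B": (800.0, 400.0),
    "species C": (500.0, 510.0),
}
summary = summarise(example)
```

The remaining questions (direction of movement, fragmentation, marginalisation) need spatial information rather than a single area figure per species.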

  9. Status of the current BioVeL ENM workflow
     • Current openModeller-based ENM workflows work at a smaller scale, with a focus on one or a few selected species.
     • The current workflow requires frequent interaction with the user (many clicks if we simply multiply runs).
     • We need a system that is scalable and automated to run ENM for hundreds of species.
     • We need a system that can perform a summary analysis across all the species based on the individual ENM runs.
     • The 2nd generation BioVeL portal will provide the required capabilities.
       • To be released publicly in January 2014 (currently in beta mode)

  10. Envisaged application structure
     Diagram elements: Selected species; ENM parameter sets for species
     • Multiple species may use the same ENM parameter set (e.g. Mediterranean dryland plants).
     • Parameter sets are generated and tested with another workflow (see next slide).
     Diagram elements: EUMON query; LTER query; GBIF query; ...
     • Some species may need other offline data, or private data (uploaded from the user side).
     Diagram elements: ENM workflow; ENM workflow; ENM workflow; ...
     • One ENM workflow predicts the impact of environmental changes on the distribution of one species.
     Diagram elements: ENM output file; ENM output file; ENM output file
     • The portal offers the files for download.
     Diagram element: Summary analysis
     • Performed with an R-based custom tool outside the portal.
     • EBV production by combining data from different models.
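The fan-out structure on this slide, where species sharing a parameter set each get their own ENM workflow run, might be sketched as below. Everything in the code is hypothetical: `run_enm` is a stand-in that only returns a record, not a call to the BioVeL portal or openModeller, and the parameter set contents, group name, and species names are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def run_enm(species, params):
    """Hypothetical stand-in for one ENM workflow run.

    A real run would query occurrence data and fit a model; this stub only
    returns a record naming the output file it would produce.
    """
    return {
        "species": species,
        "params": params,
        "output": f"{species.replace(' ', '_')}_enm.out",
    }

def run_sweep(species_to_group, parameter_sets, max_workers=4):
    """Run one ENM job per species, reusing the parameter set of its group."""
    jobs = [(species, parameter_sets[group])
            for species, group in species_to_group.items()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the submitted jobs.
        return list(pool.map(lambda job: run_enm(*job), jobs))

# Placeholder inputs: one shared parameter set used by two species.
parameter_sets = {
    "dryland_plants": {"algorithm": "example_algorithm",
                       "layers": ["layer_1", "layer_2"]},
}
species_to_group = {
    "Species one": "dryland_plants",
    "Species two": "dryland_plants",
}
results = run_sweep(species_to_group, parameter_sets)
```

Keeping the per-species run as a plain function makes it easy to swap the thread pool for a batch system; the summary analysis would then read the collected output records.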

  11. ENM parameter optimisation workflow
     Diagram element: Parameter matrix
     • Possible parameter combinations.
     Diagram elements: Selected species; Parameter test and selection job; Parameter test and selection job; Parameter test and selection job; ...
     Diagram element: ENM parameter sets for species
     • The optimal parameter input for the large ENM workflow (see previous slide).
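A parameter matrix followed by test-and-selection jobs amounts to a grid search. The sketch below assumes that each combination can be scored by a single number and that the highest score wins; the slide does not say how BioVeL evaluates combinations. The parameter names, their values, and `toy_score` are invented placeholders.

```python
from itertools import product

def parameter_grid(matrix):
    """Yield every combination from a {parameter name: candidate values} matrix."""
    names = sorted(matrix)
    for values in product(*(matrix[name] for name in names)):
        yield dict(zip(names, values))

def select_best(matrix, score):
    """Return the combination with the highest score."""
    return max(parameter_grid(matrix), key=score)

# Hypothetical parameter matrix; names and values are placeholders.
matrix = {
    "regularisation": [0.5, 1.0, 2.0],
    "background_points": [1000, 10000],
}

def toy_score(params):
    # Placeholder scoring: a real test job would fit and evaluate a model.
    return (-abs(params["regularisation"] - 1.0)
            + params["background_points"] / 100000)

best = select_best(matrix, toy_score)
```

With the toy score, `best` is the combination with `regularisation` 1.0 and 10000 background points. In practice each scoring call is an expensive model run, which is presumably why the slide shows the jobs as separate units.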

  12. Initialising the data sweep on portal

  13. Results of the data sweep, ready to be mapped and statistically analysed

  14. Example product: Accumulated invasive potential for ecological groups
     • 20 blacklisted species divided into 4 ecological regimes:
       • Zoobenthos
       • Phytobenthos
       • Zoopelagial
       • Phytopelagial
     [Figure: Example: stack of combined macrozoobenthic invasion heatmaps]
     Slide by Matthias Obst, BioVeL
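Stacking heatmaps per ecological group can be sketched as a cell-wise sum of the species grids in each group. Whether the actual product uses a plain sum, a weighted sum, or a thresholded count is not stated on the slide, so the plain sum is an assumption, and the species names and grid values are made up.

```python
def stack(grids):
    """Cell-wise sum of equally sized grids (assumed meaning of a 'stack')."""
    rows, cols = len(grids[0]), len(grids[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for grid in grids:
        for i in range(rows):
            for j in range(cols):
                total[i][j] += grid[i][j]
    return total

def stack_by_group(species_grids, species_group):
    """Group the species grids by ecological regime and stack each group."""
    grouped = {}
    for species, grid in species_grids.items():
        grouped.setdefault(species_group[species], []).append(grid)
    return {group: stack(grids) for group, grids in grouped.items()}

# Made-up 2x2 suitability grids for three placeholder species.
species_grids = {
    "species 1": [[0.25, 0.0], [0.5, 1.0]],
    "species 2": [[0.25, 0.0], [0.5, 0.0]],
    "species 3": [[0.0, 0.75], [0.0, 0.0]],
}
species_group = {
    "species 1": "Zoobenthos",
    "species 2": "Zoobenthos",
    "species 3": "Phytobenthos",
}
stacked = stack_by_group(species_grids, species_group)
```

High values in a stacked grid mark cells where several species of the same group are predicted to find suitable conditions.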

  15. Questions?
     www.earthobservations.org/geobon.shtml
     www.eubon.eu
     www.biovel.eu
