
GLOBAL BIODIVERSITY

Global Biodiversity Information Facility. Designing a Global Network to Accommodate Contributions from all Sources and Technical Abilities. Tim Robertson, GBIF Secretariat. Content: How the GBIF index is built; Joining the GBIF network; Technical requirements.


Presentation Transcript


  1. GLOBAL BIODIVERSITY INFORMATION FACILITY Designing a Global Network to Accommodate Contributions from all Sources and Technical Abilities Tim Robertson, GBIF Secretariat

  2. Content • How the GBIF index is built • Joining the GBIF network • Technical requirements • Documentation on services and standards • The use of current protocols for data harvesting • Simplified full dataset harvesting • The new GBIF integrated publishing toolkit • Extending the model – Simple Transfer Schema task group

  3. Today: How the network is structured

  4. Today: Entry requirements

  5. Basis of Record: Data served (Source: GBIF Data Portal October 2008)

  6. Basis of Record: What the standards say

  7. Comparison: International Organization for Standardization (ISO) • 2-letter country codes (ISO 3166) • Multilingual (English, French + external translations) • Simple tab-delimited file format • Loads straight into a database for reuse • As simple as it needs to be… For controlled vocabularies, could this approach be adopted? Could removing complex technical schemas allow for easier contribution?
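To illustrate what "loads straight into database" can mean in practice, here is a minimal sketch in Python. It assumes a tab-delimited vocabulary file with a header row; the column names (`code`, `name_en`, `name_fr`), the three sample rows, and the `country` table are illustrative assumptions, not the layout of any official ISO file.

```python
import csv
import io
import sqlite3

# Hypothetical sample of a tab-delimited country code list (assumed layout).
SAMPLE = (
    "code\tname_en\tname_fr\n"
    "DK\tDenmark\tDanemark\n"
    "FR\tFrance\tFrance\n"
    "DE\tGermany\tAllemagne\n"
)


def load_vocabulary(text, conn):
    """Parse tab-delimited text and insert every row into a 'country' table."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS country "
        "(code TEXT PRIMARY KEY, name_en TEXT, name_fr TEXT)"
    )
    rows = [(r["code"], r["name_en"], r["name_fr"]) for r in reader]
    conn.executemany("INSERT INTO country VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_vocabulary(SAMPLE, conn)
# Look up the French name for one code to show the vocabulary is reusable.
name_fr = conn.execute(
    "SELECT name_fr FROM country WHERE code = ?", ("DK",)
).fetchone()[0]
print(count, name_fr)
```

No schema parsing or XML tooling is needed: the standard library's `csv` module and any SQL database are enough, which is the simplicity the slide points at.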

  8. Harvesting: Using existing protocols • Provider has a TAPIR wrapper • Wrapper allows 200 records per request • 260,000 records to harvest • 1,300 request/response pairs • 9 hours in total • 500 MB of XML transferred • Extracted to a 32 MB delimited file for the index • Compressed to 3 MB • Why not produce this file on the provider?
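The figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the derived values (seconds per request, size ratios) are calculations from those numbers, not measurements reported in the talk.

```python
import math

# Figures quoted on the slide.
total_records = 260_000
page_size = 200          # records per TAPIR request
total_hours = 9
xml_mb = 500             # XML transferred during the harvest
delimited_mb = 32        # delimited file extracted for the index
compressed_mb = 3        # the same file after compression

# Number of request/response round trips needed to page through the dataset.
requests = math.ceil(total_records / page_size)

# Average wall-clock time per round trip.
seconds_per_request = total_hours * 3600 / requests

# How much larger the XML is than the data actually kept for the index.
xml_vs_delimited = xml_mb / delimited_mb

# Overall saving if the provider shipped the compressed file directly.
xml_vs_compressed = xml_mb / compressed_mb

print(requests)                        # 1300
print(round(seconds_per_request, 1))   # 24.9
print(round(xml_vs_delimited, 1))      # 15.6
print(round(xml_vs_compressed, 1))     # 166.7
```

On these numbers each page takes about 25 seconds, and the XML payload is more than 150 times larger than the compressed file that carries the same indexed content.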

  9. Harvesting: Streamlining the process • Benefits • Indexes can be more up-to-date • better for the user • benefits provider • Provider systems can be left to answer specific real queries • the original purpose for the wrapper software • Easy for small data publishers to produce • Already done in an ad-hoc manner for very large providers • Not dissimilar to Sitemaps protocol
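A provider-side export of the kind suggested here could be as small as the following Python sketch. It is an assumption about how such a dump might be produced, not the format GBIF defined: the function names `write_dump` and `read_dump`, the choice of gzip, and the sample records are all hypothetical, and the column names are borrowed from Darwin Core purely for illustration.

```python
import csv
import gzip
import io

# Illustrative column names and records (assumed, not a prescribed layout).
FIELDS = ["occurrenceID", "scientificName", "countryCode"]
RECORDS = [
    {"occurrenceID": "1", "scientificName": "Puma concolor", "countryCode": "US"},
    {"occurrenceID": "2", "scientificName": "Quercus robur", "countryCode": "DK"},
]


def write_dump(records, fieldnames):
    """Serialise records as gzip-compressed, tab-delimited bytes."""
    buffer = io.BytesIO()
    with gzip.open(buffer, "wt", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter="\t")
        writer.writeheader()
        writer.writerows(records)
    return buffer.getvalue()


def read_dump(data):
    """Harvester side: decompress and parse the dump back into dicts."""
    with gzip.open(io.BytesIO(data), "rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


dump = write_dump(RECORDS, FIELDS)
restored = read_dump(dump)
print(len(dump), "bytes;", len(restored), "records")
```

In a real deployment the bytes would be written to a file at a published URL and fetched in a single request, which is where the comparison with the Sitemaps protocol comes from.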

  10. Harvesting: Streamlining the process If this is already being done in an ad-hoc manner, should it be defined as a standard?

  11. GBIF: The integrated publishing toolkit (IPT) • Publishing of • Occurrence data • Checklist data • Taxonomic data • Dataset descriptive data (metadata) • Key features • Embedded data cache • takes load off the "LIVE" system • allows for file-based importing • Web application to search and browse data • TAPIR, WFS, WMS, TCS, EML, RSS, "Local DwC Index" • Simple extensions – the "star schema" • Can be used in a hosting environment
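The slide names the "star schema" without spelling it out. One common reading, assumed here, is a single core record type with any number of extension tables whose rows point back to a core record by identifier. The Python sketch below follows that reading; the field names (`id`, `coreid`), the extension names, and the sample data are hypothetical and are not taken from the IPT itself.

```python
from collections import defaultdict

# Core records: one row per occurrence (illustrative data).
CORE = [
    {"id": "occ-1", "scientificName": "Puma concolor"},
    {"id": "occ-2", "scientificName": "Quercus robur"},
]

# Extension tables: each row refers to a core record through 'coreid'.
EXTENSIONS = {
    "vernacular": [
        {"coreid": "occ-1", "name": "cougar", "language": "en"},
    ],
    "image": [
        {"coreid": "occ-1", "url": "http://example.org/puma-1.jpg"},
        {"coreid": "occ-1", "url": "http://example.org/puma-2.jpg"},
    ],
}


def join_star(core_rows, extensions):
    """Attach every extension row to the core record it points at."""
    by_core = defaultdict(lambda: defaultdict(list))
    for ext_name, rows in extensions.items():
        for row in rows:
            attrs = {k: v for k, v in row.items() if k != "coreid"}
            by_core[row["coreid"]][ext_name].append(attrs)
    return [
        {**core, "extensions": dict(by_core.get(core["id"], {}))}
        for core in core_rows
    ]


joined = join_star(CORE, EXTENSIONS)
for record in joined:
    print(record["id"], sorted(record["extensions"]))
```

Under this reading, adding a new kind of data means adding a new extension table; the core records and existing extensions are untouched, which is what keeps the extensions "simple".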

  12. GBIF: The integrated publishing toolkit (IPT)

  13. GBIF: The integrated publishing toolkit (IPT) • Ready for "alpha" testing – please enquire! • Demonstrations by Markus Döring and Tim Robertson all week • Poster • Lunchtime session Tuesday

  14. Extending the model: More data types • The data being mobilised is largely “single core entity” • the “Occurrence Record” • Integrating with other areas? • Earth observation networks • Ecological networks • Task group to investigate specific use cases to determine a Common Transfer Schema: • Primarily data modeling experience • Technical implementation • Presentation to TDWG community • Perhaps multiple core entities, each extensible?

  15. Extending the model: More data types

  16. Extending the model: More data types

  17. Contact Tim Robertson GBIF Secretariat Universitetsparken 15 2100 Copenhagen Denmark trobertson@gbif.org
