
Increasing Usability of Biodiversity Databases through Semantic Enrichment

Klaus Riede, Zoologisches Forschungsinstitut & Museum Alexander Koenig (ZFMK), Adenauerallee 150-164, 53113 Bonn, Germany.


Presentation Transcript


  1. Increasing Usability of Biodiversity Databases through Semantic Enrichment. Klaus Riede, Zoologisches Forschungsinstitut & Museum Alexander Koenig (ZFMK), Adenauerallee 150-164, 53113 Bonn, Germany

  2. Semantic Enrichment: Some examples. Huge biodiversity databases already exist. They cover distinct organisms (FishBase, Orthoptera Species File) or distinct themes: threat (IUCN Red List Database, www.redlist.org) and migration (Global Register of Migratory Species, www.groms.de). Why do we need semantic enrichment?

  3. Semantic Enrichment: Some examples. Try to search for the number of "Extinct Tropical Timber Trees". Database: IUCN Red List Database (www.redlist.org). Query: Tropical tree. Problem: plants are not classified according to life-form. Plant families such as TAXODIACEAE comprise trees (e.g. Taiwania cryptomerioides - VULNERABLE), whereas CUPRESSACEAE contain shrubs (Actinostrobus) AND trees (Thuja spp.).

  4. Semantic Enrichment: Searching for Red-Listed Trees. To search the IUCN Red List Database (www.redlist.org) for "Threatened" trees, you have to know plant taxonomy:
     • Searching the Order CONIFERALES (containing Taxodiaceae trees) returns 16 Critically Endangered, 43 Endangered, and 93 Vulnerable taxa, but some of those are shrubs (Cupressaceae: Actinostrobus).
     • Threatened Cupressaceae: 2 Critically Endangered (e.g. Thuja sutchuenensis), 15 Endangered (e.g. Juniperus cedrus), 25 Vulnerable (e.g. Cupressus gigantea).

  5. Semantic enrichment is necessary to search for "Trees": http://www.botanik.uni-bonn.de/conifers/index.htm
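
The idea can be sketched as a join between a threat-status table and a separate life-form annotation. This is a minimal illustration only, not the IUCN schema: the table names `red_list` and `life_form`, the column names, and the placeholder row for Actinostrobus (including its status) are assumptions. The other taxa and their categories are taken from the slides above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE red_list  (taxon TEXT PRIMARY KEY, family TEXT, status TEXT);
CREATE TABLE life_form (taxon TEXT PRIMARY KEY, form TEXT);  -- the enrichment
""")

# Threat categories as named on the slides; the Actinostrobus row is a
# placeholder (hypothetical status) standing in for a threatened shrub.
conn.executemany("INSERT INTO red_list VALUES (?, ?, ?)", [
    ("Taiwania cryptomerioides", "Taxodiaceae", "VU"),
    ("Thuja sutchuenensis", "Cupressaceae", "CR"),
    ("Juniperus cedrus", "Cupressaceae", "EN"),
    ("Cupressus gigantea", "Cupressaceae", "VU"),
    ("Actinostrobus sp.", "Cupressaceae", "VU"),
])

# Life-form annotation added from an external source.
conn.executemany("INSERT INTO life_form VALUES (?, ?)", [
    ("Taiwania cryptomerioides", "tree"),
    ("Thuja sutchuenensis", "tree"),
    ("Juniperus cedrus", "tree"),
    ("Cupressus gigantea", "tree"),
    ("Actinostrobus sp.", "shrub"),
])

# With the annotation in place, "threatened trees" is a plain SQL query
# and no longer requires knowledge of plant taxonomy.
threatened_trees = [row[0] for row in conn.execute("""
    SELECT r.taxon
    FROM red_list AS r
    JOIN life_form AS l ON l.taxon = r.taxon
    WHERE l.form = 'tree' AND r.status IN ('CR', 'EN', 'VU')
    ORDER BY r.taxon
""")]
print(threatened_trees)
```

Searching by family alone would have returned the shrub as well; the extra `life_form` table is what makes the life-form filter possible.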

  6. Two Worlds: Relational databases and complex data sets
     • Relational databases: Digital Orthoptera Specimen Access, SYSTAX, GROMS (Global Register of Migratory Species)
     • Complex data sets: sounds, pictures, gene sequences (links), geographic coordinates, maps (GIS data: shapes)

  7. Example #1: Data-mining for Knowledge Gaps. The "Global Register of Migratory Species" database (www.groms.de) contains literature citations on migration. Knowledge gaps were detected by searching for text strings such as: poor*, little known, unknown

  8. The relational organisation of the GROMS database allows application of SQL queries for text-mining. A many:many relation connects references and species names through a joint table:
     • References Table (5,500 entries): ID; Author, Title, etc.; Text: [... migration ... unknown ...]
     • Joint Table (8,500 entries): Lit_ID, Species_ID (1:many from References, many:1 to Species)
     • Species Table (4,355 entries): ID, Taxon name, Migration, Red List status, etc.

  9. SQL statement: Searching for non-passerine birds with poorly known migration behaviour:
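
The statement itself is not preserved in this transcript. The following is a minimal sketch of what such a query could look like, not the original GROMS SQL. It assumes hypothetical table names (`refs`, `ref_species`, `species`) and columns (`taxon_class`, `taxon_order`) modelled on the three-table layout described above. All rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE refs        (id INTEGER PRIMARY KEY, author TEXT, title TEXT, text TEXT);
CREATE TABLE species     (id INTEGER PRIMARY KEY, taxon_name TEXT,
                          taxon_class TEXT, taxon_order TEXT);
CREATE TABLE ref_species (lit_id INTEGER REFERENCES refs(id),
                          species_id INTEGER REFERENCES species(id));
""")

# Invented example rows (placeholder taxa, not GROMS data).
conn.executemany("INSERT INTO species VALUES (?, ?, ?, ?)", [
    (1, "Bird A", "Aves", "Charadriiformes"),
    (2, "Bird B", "Aves", "Passeriformes"),
    (3, "Bird C", "Aves", "Anseriformes"),
    (4, "Fish D", "Actinopterygii", "Salmoniformes"),
])
conn.executemany("INSERT INTO refs VALUES (?, ?, ?, ?)", [
    (1, "Author 1", "Title 1", "Migration routes poorly documented."),
    (2, "Author 2", "Title 2", "Movements unknown."),
    (3, "Author 3", "Title 3", "Regular long-distance migrant; well studied."),
    (4, "Author 4", "Title 4", "Spawning migration little known."),
])
conn.executemany("INSERT INTO ref_species VALUES (?, ?)",
                 [(1, 1), (2, 2), (3, 3), (4, 4)])

# The search strings "poor*", "little known" and "unknown" become LIKE patterns;
# the joint table resolves the many:many relation between references and species.
poorly_known = [row[0] for row in conn.execute("""
    SELECT DISTINCT s.taxon_name
    FROM species AS s
    JOIN ref_species AS j ON j.species_id = s.id
    JOIN refs AS r ON r.id = j.lit_id
    WHERE s.taxon_class = 'Aves'
      AND s.taxon_order <> 'Passeriformes'
      AND (r.text LIKE '%poor%'
           OR r.text LIKE '%little known%'
           OR r.text LIKE '%unknown%')
    ORDER BY s.taxon_name
""")]
print(poorly_known)
```

Plain substring matching of this kind will also hit phrases such as "not unknown", so results from a real database would still need manual checking.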

  10. Result: 349 birds with insufficiently known migration behaviour (www.groms.de), mainly based on "Handbook of the Birds of the World" (del Hoyo et al. 1992-2003)

  11. Example #2: Automatic Annotation of Sound Parameters. All 5,000 sound files in the Orthoptera Song Repository of the DORSA project were annotated automatically with sound parameters. The parameters were added to the SysTax database, which stores specimen data from various museum databases, including herbaria. The annotated SysTax Oracle database is now searchable for sound parameters such as carrier frequency and pulse rate.

  12. Deutsche Orthopteren Sammlungen - www.dorsa.de: Orthoptera type material in German museums.

  13. Deutsche Orthopteren Sammlungen - www.dorsa.de
     • Checking, identification, and verification of data on type material
     • Locating "historical" types
     • Designation of lectotypes

  14. Deutsche Orthopteren Sammlungen - www.dorsa.de: Mutual links connect the taxonomic database (OSF: Orthoptera Species File, USA) and the specimens (German museums and sound archives, www.dorsa.de).

  15. Extraction of sound parameters (carrier frequency, pulse rate) using MATLAB software. In cooperation with: Dept. of Neuroinformatics, Ulm

  16. Enriched sound file table: pulse distance, length, frequency, etc. were added to the SYSTAX table
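
To make the two parameters concrete, here is a small sketch that measures carrier frequency and pulse rate on a synthetic pulsed tone. It is not the MATLAB procedure used in the project, which is not described in the slides. The signal, the function names, the candidate frequency grid, and the threshold are all assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 20_000   # samples per second (assumed)
CARRIER_HZ = 4_000     # tone inside each pulse (assumed)
PULSE_PERIOD = 400     # samples between pulse onsets, i.e. 50 pulses/s
PULSE_ON = 200         # samples for which each pulse is "on"
N_SAMPLES = 4_000      # 0.2 s of signal

# Synthetic song: a tone gated on and off at a fixed pulse rate.
signal = [
    math.sin(2 * math.pi * CARRIER_HZ * i / SAMPLE_RATE)
    if i % PULSE_PERIOD < PULSE_ON else 0.0
    for i in range(N_SAMPLES)
]

def carrier_frequency(x, sample_rate, candidates):
    """Return the candidate frequency with the largest spectral power."""
    def power(freq):
        w = 2 * math.pi * freq / sample_rate
        re = sum(v * math.cos(w * i) for i, v in enumerate(x))
        im = sum(v * math.sin(w * i) for i, v in enumerate(x))
        return re * re + im * im
    return max(candidates, key=power)

def pulse_rate(x, sample_rate, window=20, rel_threshold=0.3):
    """Estimate pulses per second from the onsets of the amplitude envelope."""
    rectified = [abs(v) for v in x]
    # Trailing moving average of the rectified signal serves as the envelope.
    envelope = [
        sum(rectified[max(0, i - window + 1): i + 1]) / window
        for i in range(len(rectified))
    ]
    threshold = rel_threshold * max(envelope)
    onsets, active = [], False
    for i, level in enumerate(envelope):
        if level > threshold and not active:
            onsets.append(i)      # envelope rises above threshold: new pulse
            active = True
        elif level <= threshold:
            active = False
    intervals = [b - a for a, b in zip(onsets, onsets[1:])]
    return sample_rate / (sum(intervals) / len(intervals))

# One row of annotation, as it might be written to a sound-parameter table.
params = {
    "carrier_frequency_hz": carrier_frequency(signal, SAMPLE_RATE,
                                              range(1_000, 9_001, 100)),
    "pulse_rate_hz": pulse_rate(signal, SAMPLE_RATE),
}
print(params)
```

Real recordings contain noise and irregular pulses, so a production analysis would need filtering and a finer frequency resolution than this coarse grid.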

  17. Automated bioacoustic classification of ethospecies allows Rapid Assessment. Mapping with microphones makes it possible to answer important research questions, such as:
     • species ranges / endemism
     • species abundance
     • species turnover
     • community patterns
     • activity patterns
     • vulnerability to habitat degradation
     • extinction rates

  18. Example #3: Enriching databases with geographic information
     • Adding lat-lon coordinates by georeferencing
     • GIS analysis of complex geometries (shapes) by intersection with other GIS layers and subsequent update

  19. Georeferencing is necessary to update place names with lat-lon data

  20. Geographic coordinates were added to place names, using Times Atlas or gazetteers (Getty, Alexandria Project)
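
A gazetteer lookup of this kind can be sketched as follows. The gazetteer entries (approximate city coordinates), the specimen records, and the field names are illustrative assumptions and are not taken from the DORSA data or from the gazetteers named above.

```python
# Tiny stand-in gazetteer: normalised place name -> (lat, lon), approximate values.
GAZETTEER = {
    "bonn": (50.73, 7.10),
    "brisbane": (-27.47, 153.03),
}

def georeference(records, gazetteer):
    """Return copies of the records with 'lat' and 'lon' added where a match exists."""
    enriched = []
    for rec in records:
        key = rec["locality"].strip().casefold()   # tolerate case and stray spaces
        lat, lon = gazetteer.get(key, (None, None))
        enriched.append({**rec, "lat": lat, "lon": lon})
    return enriched

# Hypothetical specimen records with free-text locality labels.
specimens = [
    {"specimen_id": "S1", "locality": "Bonn"},
    {"specimen_id": "S2", "locality": " brisbane "},
    {"specimen_id": "S3", "locality": "Unknown Creek"},
]

enriched = georeference(specimens, GAZETTEER)
# Localities without a gazetteer match stay empty and can be queued for manual work.
unresolved = [r["specimen_id"] for r in enriched if r["lat"] is None]
print(enriched)
print(unresolved)
```

Exact-name matching is the simplest possible strategy. Historical spellings and ambiguous place names on old specimen labels usually need manual review.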

  21. Mapping requires specimen data enriched with geographic coordinates. The DORSA mapserver is available at www.dorsa.de

  22. Deutsche Orthopteren Sammlungen - www.dorsa.de: Countries of origin of the type material in German museums

  23. Example #3: Enriching databases with geographic information based on GIS calculation of range territories. Distribution maps (shapes) are available at www.groms.de

  24. Import of intersection results: intersecting 1,000 mapped species with 2,522 administrative units yielded 340,000 combinations (dbf attribute table of province-species pairs). Queensland search results:
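
The intersect-then-import step can be illustrated with a deliberately simplified sketch. Real range maps and administrative units are polygons handled by GIS software. Here both are replaced by crude bounding rectangles that are not real boundaries, and the species names, table name, and coordinates are hypothetical.

```python
import sqlite3

# Rectangles as (min_lon, min_lat, max_lon, max_lat); crude stand-ins for shapes.
ADMIN_UNITS = {
    "Queensland": (138.0, -29.0, 154.0, -10.0),
    "New South Wales": (141.0, -37.5, 154.0, -28.0),
}
SPECIES_RANGES = {
    "Species A": (145.0, -20.0, 150.0, -15.0),
    "Species B": (148.0, -35.0, 152.0, -30.0),
    "Species C": (150.0, -32.0, 153.0, -26.0),
}

def intersects(a, b):
    """True if two axis-aligned rectangles overlap with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

# Step 1: the "GIS intersection" yields one row per overlapping pair.
combinations = [
    (province, species)
    for province, p_box in ADMIN_UNITS.items()
    for species, s_box in SPECIES_RANGES.items()
    if intersects(p_box, s_box)
]

# Step 2: import the result table into the relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE province_species (province TEXT, species TEXT)")
conn.executemany("INSERT INTO province_species VALUES (?, ?)", combinations)

# Step 3: the spatial question is now an ordinary SQL query.
queensland_species = [row[0] for row in conn.execute(
    "SELECT species FROM province_species WHERE province = ? ORDER BY species",
    ("Queensland",),
)]
print(queensland_species)
```

The design point is that the expensive spatial computation runs once, outside the database, and only its tabular result is stored for retrieval.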

  25. Summary: Semantic enrichment of relational databases is possible by automatic annotation:
     • Link the relational database to an external data set (sounds, GIS)
     • Run an annotation program (e.g. GIS intersection)
     • Import the result table, giving an enriched relational database table with annotation results
     Enrichment allows SQL retrieval of complex data parameters
