This presentation describes the architecture of the OBIS Portal and its potential use as a basis for regional OBIS nodes. It covers the mapping tools, data providers, search options, and the OBIS Index and Cache.
OBIS Portal Architecture Concepts
plus potential for utilization as a basis for Regional OBIS Nodes
Tony Rees, CSIRO Marine Research, Hobart (and OBIS Technical Subcommittee)
for: OBIS Nodes meeting, Halifax, September 2004
OBIS Architecture – Version 1 (2002)
[Diagram. Components: www users 1–3 (etc.); OBIS Portal; mapping tools (C-squares mapper, Mapping tool 2, Mapping tool 3); data providers 1–3 (etc.). Legend: symbol = custom database wrapper. Flow label: all queries.]
Problem: many searches return no data (the user does not know what is a viable query).
New concept (1) developed in 2003/4 – the “OBIS Index”
What is in the Index ...
Name index
• scientific “names with data” from data providers
• “names without data” from Catalogue of Life
• common names from Cat. of Life
• “near match” versions of scientific names
• allocation of all taxa to OBIS taxonomic categories
• synonyms transferred to current names
• metadata on each species (how many records, etc.)
• filters on marine vs. non-marine species, etc.
Spatial index
• list of geographic units (squares) in which each species occurs – see next slide
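The name index described above can be pictured as a small lookup structure. The following is a minimal sketch only, not the OBIS implementation: the class and field names (`NameEntry`, `NameIndex`, `record_count`), the category label, the record counts and the use of Python's `difflib` for “near match” lookups are all assumptions made for illustration.

```python
import difflib
from dataclasses import dataclass


@dataclass
class NameEntry:
    current_name: str          # current (accepted) scientific name
    category: str              # OBIS taxonomic category (label here is invented)
    record_count: int          # 0 means a "name without data"
    common_names: tuple = ()


class NameIndex:
    """Hypothetical name index: exact, synonym, common-name and near-match lookup."""

    def __init__(self):
        self._entries = {}     # lower-cased current name -> NameEntry
        self._aliases = {}     # lower-cased synonym or common name -> current-name key

    def add(self, entry, synonyms=()):
        key = entry.current_name.lower()
        self._entries[key] = entry
        for alias in tuple(synonyms) + tuple(entry.common_names):
            self._aliases[alias.lower()] = key

    def lookup(self, query, cutoff=0.8):
        key = query.strip().lower()
        key = self._aliases.get(key, key)          # synonym -> current name
        if key not in self._entries:
            # "Near match": fall back to the closest known spelling, if any.
            candidates = list(self._entries) + list(self._aliases)
            close = difflib.get_close_matches(key, candidates, n=1, cutoff=cutoff)
            if not close:
                return None
            key = self._aliases.get(close[0], close[0])
        return self._entries[key]

    def names_with_data(self):
        return sorted(e.current_name for e in self._entries.values()
                      if e.record_count > 0)


# Illustrative content only; the record counts are invented.
index = NameIndex()
index.add(NameEntry("Thunnus maccoyii", "fishes", 1200, ("southern bluefin tuna",)),
          synonyms=["Thunnus thynnus maccoyii"])
index.add(NameEntry("Hoplostethus atlanticus", "fishes", 0, ("orange roughy",)))

entry = index.lookup("Thunus maccoyi")             # misspelled query
print(entry.current_name, entry.record_count)
```

Because synonyms and common names resolve to the current name before the entry is fetched, every route into the index ends at the same metadata record.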
OBIS Index – spatial indexing units
Spatial indexing units used for OBIS: 0.5 x 0.5° squares (approx. 50 km resolution), stored in “c-squares” notation (www.marine.csiro.au/csquares)
[Figure labels: example c-squares codes 7500, 7500:4, 7500:499, 7500:499:4]
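The example codes 7500, 7500:4, 7500:499 and 7500:499:4 appear to be one location expressed at 10°, 5°, 1° and 0.5° resolution. The sketch below shows how such a code could be derived from a latitude/longitude pair, following the published c-squares notation as understood here (leading digit 1 = NE, 3 = SE, 5 = SW, 7 = NW; each later cycle starts with an intermediate quadrant digit 1–4). The function name `csquare` and its parameters are assumptions, not part of any OBIS code.

```python
def csquare(lat, lon, resolution=0.5):
    """Return the c-squares code for a point at 10, 5, 1 or 0.5 degree resolution."""
    if resolution not in (10, 5, 1, 0.5):
        raise ValueError("resolution must be 10, 5, 1 or 0.5 degrees")

    # Global quadrant digit: 1 = NE, 3 = SE, 5 = SW, 7 = NW.
    if lat >= 0:
        quadrant = 1 if lon >= 0 else 7
    else:
        quadrant = 3 if lon >= 0 else 5

    # Work with absolute values; keep the pole and the 180 degree line
    # inside the last valid square.
    alat = min(abs(lat), 89.99999)
    alon = min(abs(lon), 179.99999)

    # 10 degree square: quadrant, tens of latitude, tens of longitude (2 digits).
    code = f"{quadrant}{int(alat // 10)}{int(alon // 10):02d}"
    if resolution == 10:
        return code

    # Units digits select the 5 degree quadrant (1-4) and the 1 degree square.
    lat_d, lon_d = int(alat) % 10, int(alon) % 10
    inter = 2 * (lat_d >= 5) + (lon_d >= 5) + 1
    if resolution == 5:
        return f"{code}:{inter}"
    code = f"{code}:{inter}{lat_d}{lon_d}"
    if resolution == 1:
        return code

    # Tenths digits select the 0.5 degree quadrant within the 1 degree square.
    lat_t, lon_t = int(alat * 10) % 10, int(alon * 10) % 10
    inter = 2 * (lat_t >= 5) + (lon_t >= 5) + 1
    return f"{code}:{inter}"


for res in (10, 5, 1, 0.5):
    print(res, csquare(59.75, -9.75, res))
```

Each coarser code is a prefix of the finer one, which is what lets a text index of 0.5° codes answer queries about larger squares. Points that fall exactly on a square boundary are sensitive to floating-point rounding, so a production encoder would need a stated boundary convention.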
OBIS Index – part of the spatial index (actually from another database, which employs finer-resolution squares)
[Screenshot of index rows]
New search options – current OBIS version
“Stage 1” searches – return lists of names, with metadata (numbers of records, etc.), taxonomic groups, common names, and “quick maps”
For each species ... (NB: all this information is from the Index)
[Screenshot: example result row, with these annotations]
• from Catalogue of Life information
• allocation to OBIS taxonomic category
• metadata (from parsing the Cache content)
• from Cat. of Life info
• pre-formatted link to “Stage 2 search”
• from the Spatial Index – species distribution by 0.5 x 0.5 degree squares
New concept (2) developed in 2003/4 – the “OBIS Cache”
What is in the Cache ...
• One record per OBIS data point, with a copy of the key information fields for that point
• Cache content is built by crawling the data providers (and refreshing at intervals)
• Purpose is to provide insulation from providers being offline at any time, and to improve data retrieval speed
• Also, as a by-product, makes the task of keeping the Index up to date relatively simple (crawling of the provider content is already done).
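The crawl-and-refresh behaviour listed above can be sketched as follows. This is an illustration under stated assumptions rather than the actual Cache design: `ObisCache`, `ProviderOffline`, the fetch-function interface and the record fields are all hypothetical.

```python
class ProviderOffline(Exception):
    """Raised by a (hypothetical) provider fetch function that cannot be reached."""


class ObisCache:
    """Keeps a local copy of each provider's records, one record per data point."""

    def __init__(self):
        self._by_provider = {}          # provider name -> list of record dicts

    def refresh(self, providers):
        """Crawl every provider; keep the previous copy for any that are offline."""
        refreshed = []
        for name, fetch in providers.items():
            try:
                records = list(fetch())
            except ProviderOffline:
                continue                # stale copy stays available to users
            self._by_provider[name] = records
            refreshed.append(name)
        return refreshed

    def records_for(self, species):
        """Serve data from the local copy, without contacting any provider."""
        return [r for records in self._by_provider.values()
                for r in records if r["species"] == species]

    def counts_by_species(self):
        """By-product of crawling: per-species record counts for the Index."""
        counts = {}
        for records in self._by_provider.values():
            for r in records:
                counts[r["species"]] = counts.get(r["species"], 0) + 1
        return counts
```

A refresh that fails for one provider leaves that provider's earlier records in place, which is the “insulation” property, and `counts_by_species` shows why keeping the Index metadata current needs no extra crawling.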
OBIS Architecture – version 2
[Diagram. Components: www users 1–3 (etc.); OBIS Portal (search application); OBIS Index; OBIS Cache; C-squares mapper (“Quick maps”); Mapping tool 2; Mapping tool 3; data providers 1–3 (etc.); Cat. of Life (global names list, partially complete). Legend: symbol = DiGIR translation software. Flow labels: “Stage 1” (get info) queries; “Stage 2” (get data) queries; provider crawling; metadata refresh; Index building.]
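One way to read the two query stages is that “Stage 1” is answered from the Index alone, while “Stage 2” fetches records from the Cache only for names the Index reports as having data. The sketch below assumes exactly that routing; the `Portal` class, its method names and the plain-dictionary Index and Cache are hypothetical simplifications.

```python
class Portal:
    """Hypothetical search application sitting in front of an Index and a Cache."""

    def __init__(self, index, cache):
        self.index = index      # species name -> number of records (metadata)
        self.cache = cache      # species name -> list of cached data records

    def stage1(self, prefix):
        """Get info: list matching names and record counts, using the Index only."""
        p = prefix.lower()
        return [(name, count) for name, count in sorted(self.index.items())
                if name.lower().startswith(p)]

    def stage2(self, name):
        """Get data: consult the Cache only when the Index says data exist."""
        if self.index.get(name, 0) == 0:
            return []           # no viable query, so no data request is made
        return list(self.cache.get(name, []))


# Illustrative content only.
portal = Portal(
    index={"Species A": 2, "Species B": 0},
    cache={"Species A": [{"lat": 59.75, "lon": -9.75},
                         {"lat": 58.25, "lon": -8.75}]},
)
print(portal.stage1("spec"))
print(len(portal.stage2("Species A")))
```

Because the user sees record counts before asking for data, the empty searches that troubled Version 1 can be avoided.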
Implications ...
• Index is a self-contained guide to what data are available in the OBIS system (metadata layer) – makes the Portal “intelligent” (content aware)
• Can use the Index as a standalone tool to answer:
  • For what species does OBIS have data (and how much), at any particular time
  • What is the distribution of species “X” (by 0.5 x 0.5 degree squares) – displayed as “Quick Map” using the c-squares mapper
  • Which species occur in region “Y” (0.5 degree square or larger – default query from main map entry point is currently set to 10 x 10 degrees)
  • Browse OBIS content, e.g. by category, genus, alphabetical, plus show summary statistics (e.g. numbers of records by category)
  • Auto-complete scientific names, correct misspellings, etc.
  (above are “Stage 1” [Index] searches)
• Provide 2 entry points to “Get OBIS Data” (Stage 2 search):
  • Pre-formatted hyperlink to retrieve all data for a species (1–40,000 records), no additional typing required
  • Click on any “Quick Map” to retrieve spatially filtered subset for a species.
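The two spatial questions above (“distribution of species X” and “which species occur in region Y”) can both be answered from a spatial index that maps each species to its set of 0.5° c-squares codes. The sketch below assumes such a mapping and relies on the hierarchical form of the codes, where a larger square's code is a prefix of the codes of the squares it contains; the function names and the species data are invented for illustration.

```python
def distribution(spatial_index, species):
    """Squares in which a species occurs (the input to a 'Quick Map')."""
    return sorted(spatial_index.get(species, ()))


def species_in_region(spatial_index, region_code):
    """Species with at least one square inside the given (equal or larger) square.

    Relies on c-squares codes being hierarchical: '7500' contains '7500:4',
    which contains '7500:499', which contains '7500:499:4'.
    """
    return sorted(species for species, squares in spatial_index.items()
                  if any(square.startswith(region_code) for square in squares))


# Illustrative index: species -> set of 0.5 degree c-squares codes.
spatial_index = {
    "Species A": {"7500:499:4", "7500:488:1"},
    "Species B": {"3414:227:3"},
    "Species C": {"7500:110:2"},
}

print(species_in_region(spatial_index, "7500"))      # a 10 x 10 degree query
print(distribution(spatial_index, "Species A"))
```

A 10 x 10 degree query such as the default from the main map entry point is then a four-character prefix match, with no geometry calculation needed.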
Possible OBIS node – minimalist configuration
[Diagram. Components: www users 1–3 (etc.); OBIS Portal (search application); OBIS Index; OBIS Cache; C-squares mapper (“Quick maps”); Mapping tool 2; Mapping tool 3. Flow labels: “Stage 1” (get info) queries; “Stage 2” (get data) queries. Annotation: (at remote location).]
Possible OBIS node – expanded configuration
[Diagram. Components: www users 1–3 (etc.); OBIS Portal (search application); OBIS Index; OBIS Cache; C-squares mapper (“Quick maps”); Mapping tool 2; Mapping tool 3. Flow labels: “Stage 1” (get info) queries; “Stage 2” (get data) queries.]
[Diagram. Labels, with repeats merged:]
• “OBIS Central” – Rutgers
• Regional Node (Standard Configuration) (x3), each with: OBIS search application – local entry point
• Regional Node + full mirror function
• OBIS search application (master); OBIS search application (copy)
• OBIS Index (master); OBIS Index (copy)
• OBIS Cache (master); OBIS Cache (copy)
• Common OBIS Data Set
• RON sourced data
• Independent data providers 1–3 (etc.)
• Catalogue of Life
• Common OBIS / 3rd party tools (web accessible): C-squares mapper (“Quick maps”), Mapping tool 2, Mapping tool 3
• www users
• Flow labels: provider crawling; Index building