Full BIEN Workflow

Modified by NS/SD from MS/BB

Swimlanes: Provider, BIEN

Workflow steps:
1.0 Data Mapping and provider services
2.0 BIEN harvester and loader
3.0 Internal validation, Taxon scrubbing, Geospatial validation
4.0 Transform, Audit change, Integrate, Rule set, Rule framework
5.0 Populating Analytical DB
6.0 User/web interface API
7.0 Reapplication of updated Names

Data stores and sources:
Heterogeneous source database(s) of Plots/Specimens/Occurrences
Staging DB
BIEN Confederated DB (S)
Native BIEN Traits
Analytical DB(s)
Reference taxonomy
External data sources

Exchange formats: VegX, DwC

Data associated with taxon: Traits, Names, Synonymy, Phylogenies, Sequences, Other data associated with taxon

Geospatial data linking: Integrate geospatial & environmental data associated with location.

Notes on the diagram:
*** There may be linkages to other data BIEN wants to integrate into core data.
There may be trait data BIEN wants to integrate into core data.
External data are mediated through TNRS & Geo scrubbing.
Later changes to taxon names are applied to BIEN (step 7.0).
*** Feedback to original data providers. Filtered push/BiSciCol? Annotation and feedback, QC.
Provider feedback: Errors, annotation etc. Data-source Steward.

Open question: How are linkages to external resources made? If it is through Names, then is a TNRS mediation step required against the external resources?
1.0 Data mapping and provider services

Swimlanes: Provider, BIEN

Steps:
1.1 Data mapping tool
1.2 VegX programmatic mapping
1.3 Provider data web service
1.4 DwC programmatic mapping
1.5 TAPIR/IPT Provider web service

Provider data stores: Specimen multi-dataset DB, Veg Obs multi-dataset DB, Single datasets, Provider Cache DwC/A, Provider Cache VegX

Exchange formats: VegX, DwC

Manual workflow: This workflow represents a provider using a manual process to map and transfer data to BIEN. ‘Manual’ schema-mapping tools. Transfer may occur via a website or FTP process?

Automated workflow: This workflow represents a provider using an automated harvesting process to map and transfer data to BIEN.
Harvesting protocol: OAI PMH
TAPIR or IPT protocol. Would follow the TDWG specifications and architecture for harvesting DwC.
Provider Data Notification services?

Next step: 2.0 BIEN Harvest process
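The slide names OAI PMH as a harvesting protocol. As a rough sketch of what a harvester request could look like, the function below builds a ListRecords request URL using the standard OAI-PMH arguments (verb, metadataPrefix, from, resumptionToken). The base URL and the "vegx" metadata prefix are assumptions made for illustration; the slides do not say which endpoint or prefix a provider would expose.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (sketch only)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            # Selective harvesting: only records changed since this date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical provider endpoint and metadata prefix.
first_page = list_records_url("https://provider.example.org/oai", "vegx",
                              from_date="2011-01-01")
```

A harvester would fetch first_page, read any resumptionToken from the response, and repeat with that token until none is returned.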
2.0 BIEN harvester and loader

Swimlanes: Provider, BIEN

Input from 1.0 Data mapping and provider services:
Manual process to map and transfer data to BIEN. Transfer: website or FTP process? VegX files are transferred to the BIEN file system (temporary XML file store).
Automated harvesting process to map and transfer data to BIEN. Transfer: Web services (REST, SOAP …).

Steps:
2.1 BIEN harvester: VegX & DwC
The BIEN harvester retrieves VegX files from the file system.
2.2 First level schema validation
Is the document well formed? Are mandatory data present?
2.3 Import into staging DB
Parse XML to RDBMS. Assign IDs, gather metadata, set status flags, initialise versions, insert data into staging tables, update audit tracking.

Output: BIEN staging DB

Feedback: Provider feedback, BIEN feedback
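Steps 2.2 and 2.3 can be illustrated with a minimal Python sketch, assuming a simplified XML layout. The element names (observation, plotName, taxonName), the table and column names, and the status value are all hypothetical and are not taken from the real VegX schema or from any BIEN database; the sketch only shows the shape of a well-formedness check, a mandatory-data check, and a staging insert with IDs, status flags, versions and audit tracking.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mandatory elements; the real VegX schema differs.
MANDATORY = ("plotName", "taxonName")


def first_level_validate(xml_text):
    """Step 2.2 sketch: return a list of problems (empty means passed)."""
    try:
        root = ET.fromstring(xml_text)  # raises ParseError if not well formed
    except ET.ParseError as exc:
        return [f"not well formed: {exc}"]
    problems = []
    for i, obs in enumerate(root.iter("observation")):
        for name in MANDATORY:
            if not (obs.findtext(name) or "").strip():
                problems.append(f"observation {i}: missing {name}")
    return problems


def import_to_staging(conn, xml_text, source):
    """Step 2.3 sketch: parse XML into staging tables and log an audit row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_observation ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, source TEXT, "
        "plot_name TEXT, taxon_name TEXT, status TEXT, version INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit_log (staging_id INTEGER, action TEXT)"
    )
    ids = []
    for obs in ET.fromstring(xml_text).iter("observation"):
        cur = conn.execute(
            "INSERT INTO staging_observation "
            "(source, plot_name, taxon_name, status, version) "
            "VALUES (?, ?, ?, 'unvalidated', 1)",  # status flag, initial version
            (source, obs.findtext("plotName"), obs.findtext("taxonName")),
        )
        ids.append(cur.lastrowid)  # ID assigned by the staging DB
        conn.execute("INSERT INTO audit_log VALUES (?, 'imported')",
                     (cur.lastrowid,))
    conn.commit()
    return ids
```

A loader built this way would call first_level_validate first and only pass documents with no reported problems to import_to_staging; rejected documents would go back through the provider feedback path.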
3.0 Internal validation, Taxon scrubbing, Geospatial validation

Steps:
3.1 Internal validation
3.2 Taxon scrubbing
3.3 Geospatial validation

Services and rule sets: Validation rule sets, TNRS, Geo Resolution Service, Geo-validation rule sets

Reference data: Reference taxonomy, Names, Synonymy

To be completed
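Since this slide is still marked as to be completed, the following is only a guess at what the simplest forms of taxon scrubbing and geospatial validation might look like. The synonym table is a local stand-in with made-up names; it does not call or imitate the TNRS, and the coordinate rules are generic range checks rather than BIEN's actual geo-validation rule sets.

```python
# Made-up synonym entries standing in for a reference taxonomy lookup.
SYNONYMS = {"Examplea antiqua": "Examplea nova"}


def scrub_name(name, synonyms):
    """Normalise whitespace, then map a synonym to its accepted name."""
    normalized = " ".join(name.split())
    return synonyms.get(normalized, normalized)


def validate_coordinates(lat, lon):
    """Return a list of rule violations for a latitude/longitude pair."""
    problems = []
    if not -90 <= lat <= 90:
        problems.append("latitude out of range")
    if not -180 <= lon <= 180:
        problems.append("longitude out of range")
    if lat == 0 and lon == 0:
        # 0,0 is frequently a placeholder for missing coordinates.
        problems.append("coordinates are exactly 0,0 (often a placeholder)")
    return problems
```

Returning a list of violations rather than a single pass/fail value keeps the result usable for the provider feedback and annotation path shown in the full workflow.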
4.0 Transform, Audit change, Integrate, Rule set, Rule framework To be done
5.0 Populating Analytical DB

Steps:
5.1 Create dimensional extracts
5.2 Create individual observation extracts
5.3 Transform and load
5.4 Build and compute aggregate tables
5.5 Build analytical DBs
5.6 Archiving

Sources: Confederated DB (S), Other data sources

Data stores: DW, Archived DW, DM (two), Archived DM (two)

Other diagram labels: Geo, Time, Taxon, Agg plot, Specimen, Raw plot

To be completed
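Step 5.4 (build and compute aggregate tables) could, under simple assumptions, amount to grouping individual observations by plot and taxon. The sketch below uses SQLite with hypothetical table and column names (individual_observation, agg_plot_taxon, n_individuals); the slide does not specify the actual analytical schema.

```python
import sqlite3


def create_observations(conn, rows):
    """Load (plot_name, taxon_name) rows into a hypothetical extract table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS individual_observation "
        "(plot_name TEXT, taxon_name TEXT)"
    )
    conn.executemany("INSERT INTO individual_observation VALUES (?, ?)", rows)
    conn.commit()


def build_aggregate(conn):
    """Rebuild an aggregate table counting individuals per plot and taxon."""
    conn.execute("DROP TABLE IF EXISTS agg_plot_taxon")
    conn.execute(
        "CREATE TABLE agg_plot_taxon AS "
        "SELECT plot_name, taxon_name, COUNT(*) AS n_individuals "
        "FROM individual_observation "
        "GROUP BY plot_name, taxon_name"
    )
    conn.commit()
```

Dropping and rebuilding the aggregate on each run is the simplest option; an incremental refresh would be a separate design decision.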
6.0 User/web interface API To be done
7.0 Reapplication of updated Names To be done