Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data
Enrico Boldrini (a), Daniela Luzi (b), Stefano Nativi (a), Fabrizio Pecoraro (b)
(a) Institute of Atmospheric Pollution Research, National Research Council (CNR-IIA), Sesto Fiorentino, Italy
(b) Institute for Research on Population and Social Policies, National Research Council (CNR-IRPPS), Rome, Italy
CRIS2014 - Rome, 13-15 May 2014
Index
• Aims
• Background
• Two-way Crosswalk
• From ISO 19115 INSPIRE profile to CERIF
• From CERIF to ISO 19115
• Proposal of CERIF extension
• Proposal of a CERIF profile in ISO 19115
• Implementation in a brokering framework
• Discussion
Aim
Proposal of different solutions to integrate research context information with environmental datasets
• Two-way crosswalk:
  • From ISO to CERIF: providing a CERIF guideline for describing datasets according to the INSPIRE profile of ISO 19115
  • From CERIF to ISO: proposing an ISO profile for contextual research information based on CERIF concepts
• Extension of the brokering approach used in environmental e-infrastructures with contextual research information based on CERIF
ISO 19115 Geographical Information metadata
• Part of the geographical information suite of standards (19100 series)
• Description of geographic information and services: identification, extent, quality, spatial and temporal schema, spatial reference and distribution of digital geographic data
• More than 400 metadata elements
• Provision of rules for valid metadata extensions
• INSPIRE Metadata Implementing Rules
• EU Directive to implement ISO 19115 to create a European Union spatial data infrastructure
• Core set of mandatory and optional metadata and related constraints
INSPIRE profile of ISO 19115
CERIF
• Comprehensive conceptual model of research information and related processes, suitable for different purposes: management, scientific exchange, evaluation …
• E-R-based, flexible model built on:
  • Base entities
  • Semantic layer
  • Multiple relationships
• Constantly maintained by the euroCRIS community
CERIF version 1.6
Challenges
Different domains, scopes, structures, semantics
CERIF entities shown in the diagram: Funding, Equipment, Facility, ExpertiseAndSkills, Service, Qualification, Prize, ElectronicAddress, CV, PostalAddress, Citation, Metrics, Indicator, Measurement, Country, Event, Language, Currency
Mapping from INSPIRE ISO 19115 profile to CERIF
• Straightforward: INSPIRE elements have semantically corresponding elements in the CERIF data model
• Inferential mapping: both INSPIRE and CERIF can refer to a data dictionary/vocabulary that contains semantically shared terms
• Convention: CERIF metadata elements can be adapted to express some mandatory INSPIRE elements by convention between the parties exposing their metadata
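The three mapping strategies can be recorded as data, so that each crosswalk rule states how it was derived. The sketch below is illustrative only: `MappingKind`, `CrosswalkRule`, `rules_by_kind` and the CERIF target paths are hypothetical names chosen for this example, not the crosswalk table produced by the authors.

```python
from dataclasses import dataclass
from enum import Enum


class MappingKind(Enum):
    STRAIGHTFORWARD = "straightforward"  # direct semantic counterpart in CERIF
    INFERENTIAL = "inferential"          # resolved through a shared vocabulary
    CONVENTION = "convention"            # agreed between the exposing parties


@dataclass(frozen=True)
class CrosswalkRule:
    inspire_element: str  # name of the INSPIRE metadata element
    cerif_target: str     # hypothetical CERIF entity/attribute path
    kind: MappingKind


# Illustrative rules only; the targets are assumptions for this sketch.
RULES = [
    CrosswalkRule("Resource title", "cfResProd/cfName", MappingKind.STRAIGHTFORWARD),
    CrosswalkRule("Responsible party role", "cfClassId (role term)", MappingKind.INFERENTIAL),
    CrosswalkRule("Lineage", "cfResProd/cfDescr", MappingKind.CONVENTION),
]


def rules_by_kind(rules):
    """Group INSPIRE element names by the strategy used to map them."""
    grouped = {kind: [] for kind in MappingKind}
    for rule in rules:
        grouped[rule.kind].append(rule.inspire_element)
    return grouped
```

Keeping the strategy next to each rule makes it easy to report how many elements fall under each kind and to review the convention-based rules separately, since those depend on agreement between parties.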
Straightforward mapping
• Semantically corresponding notation with the CERIF entity cfResProd and some related elements
• Automatic discovery and interpretation of datasets exposed in RISs using the CERIF model
Inferential mapping
• Information can be inferred using:
  • CERIF semantic layer (cfClassId …) and link entities
  • ISO CodeList dictionary
• Important to express roles and topics unambiguously
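A minimal sketch of an inferential lookup, assuming a shared vocabulary that pairs ISO role codes with CERIF classification identifiers. The keys are values from the ISO 19115 `CI_RoleCode` code list; the identifiers on the right are placeholders invented for this example, and `ROLE_VOCABULARY`, `infer_cf_class_id` and `is_univocal` are hypothetical names.

```python
# Hypothetical shared vocabulary: ISO CI_RoleCode value -> CERIF cfClassId.
# The cfClassId values are placeholders, not identifiers from a real CERIF vocabulary.
ROLE_VOCABULARY = {
    "author": "class-author-placeholder",
    "pointOfContact": "class-contact-placeholder",
    "owner": "class-owner-placeholder",
}


def infer_cf_class_id(iso_role_code, vocabulary=ROLE_VOCABULARY):
    """Return the classification id for an ISO role code, failing loudly if unknown."""
    try:
        return vocabulary[iso_role_code]
    except KeyError:
        raise ValueError(f"role {iso_role_code!r} has no shared vocabulary entry") from None


def is_univocal(vocabulary):
    """True when no two ISO codes share a classification id (one-to-one mapping)."""
    return len(set(vocabulary.values())) == len(vocabulary)
```

The one-to-one check reflects the point about expressing roles and topics unambiguously: if two ISO codes collapsed onto the same classification, the reverse mapping from CERIF to ISO could not recover the original role.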
Convention
• Mapping of information on:
  • dataset quality and lineage
  • temporal reference
  • language
A proposal of CERIF extension
• New entities related to research products expressing:
  • conditions of access and use
  • limitations on public access
  • dataset language
  • dataset character codes
• + optional ISO information related to the metadata used
Mapping from CERIF to ISO 19115 profile
• Proposal of extensions according to ISO methodology based on CERIF:
  • project entity
  • publications linked to dataset
• + expansion of ISO concepts providing more information on Organisations and Persons
GI-cat discovery broker
GI-cat enables scientific data search across different, heterogeneous data sources. Results are profiled according to the desired model.
• GI-cat broker technology powers different projects and initiatives:
  • Italian Antarctic Data Center (IADC)
  • Italian Special project NextData
  • CNR GIIDA
  • ISPRA catalog of catalogs
  • AfroMaison
  • Global Earth Observation System of Systems (GEOSS)
  • …
Implementation results - GI-cat extensions for CERIF
• Brokers CERIF datasets (CERIF Docs) published according to the CERIF XML Schema
• Exposes the brokered resources, returning documents that conform to the CERIF XML Schema
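The brokering idea can be sketched as two kinds of plug-in: accessors that read from a source into a neutral record, and profilers that render records in the model a client asks for. This is not GI-cat's actual API; `Record`, `Broker`, the in-memory sources and the output strings are all hypothetical stand-ins, and real profilers would emit CERIF XML or ISO 19115 documents.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """Neutral internal representation shared by accessors and profilers."""
    identifier: str
    title: str


Accessor = Callable[[str], List[Record]]  # search term -> matching records
Profiler = Callable[[Record], str]        # record -> rendered document


class Broker:
    def __init__(self) -> None:
        self._accessors: List[Accessor] = []
        self._profilers: Dict[str, Profiler] = {}

    def add_accessor(self, accessor: Accessor) -> None:
        self._accessors.append(accessor)

    def add_profiler(self, name: str, profiler: Profiler) -> None:
        self._profilers[name] = profiler

    def search(self, term: str, profile: str) -> List[str]:
        """Query every source, then render all hits with the requested profile."""
        render = self._profilers[profile]
        return [render(rec) for acc in self._accessors for rec in acc(term)]


def _matching(records: List[Record], term: str) -> List[Record]:
    return [r for r in records if term.lower() in r.title.lower()]


# Hypothetical in-memory sources standing in for a CERIF repository and a CSW catalog.
def cerif_repository(term: str) -> List[Record]:
    return _matching([Record("rp-1", "Arctic aerosol dataset")], term)


def csw_catalog(term: str) -> List[Record]:
    return _matching([Record("md-7", "Aerosol optical depth grid")], term)


broker = Broker()
broker.add_accessor(cerif_repository)
broker.add_accessor(csw_catalog)
# Placeholder renderings; real profilers would produce schema-valid XML.
broker.add_profiler("cerif", lambda r: f"CERIF[{r.identifier}] {r.title}")
broker.add_profiler("iso", lambda r: f"ISO19115[{r.identifier}] {r.title}")
```

Calling `broker.search("aerosol", "iso")` renders hits from both sources in the ISO stand-in format. Because sources and output models are registered independently, adding one accessor or one profiler makes it available in combination with every component already present.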
Test case #1: Publishing CERIF products for INSPIRE
CERIF documents stored in an XML repository are brokered by GI-cat and republished according to ISO 19115 through the CSW/ISO discovery interface required by INSPIRE.
Diagram labels: CERIF Docs, ISO profiler
Aim: CERIF result products are made available according to INSPIRE CSW/ISO
Test case #2: Porting INSPIRE information to CERIF
INSPIRE datasets stored in a CSW/ISO catalog can be discovered and converted to CERIF XML documents. The CERIF profiler enables discovery through an OpenSearch interface.
Diagram labels: INSPIRE Catalog, CSW/ISO, CSW/ISO accessor
Aim: INSPIRE datasets are discovered and returned according to the CERIF XML Schema
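A client of an OpenSearch interface builds its request by filling the URL template published in the service's description document. The sketch below follows the OpenSearch 1.1 template syntax, where `{name}` is required and `{name?}` is optional. The endpoint, the query parameter names and `fill_template` are assumptions for illustration; the actual GI-cat endpoint and its template are not given in the slides.

```python
import re
from urllib.parse import quote

# Hypothetical template; a real one comes from the service's description document.
TEMPLATE = "https://example.org/opensearch?q={searchTerms}&start={startIndex?}&count={count?}"


def fill_template(template, **params):
    """Substitute OpenSearch template parameters, URL-encoding each value."""

    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in params:
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # optional parameters may be replaced by the empty string
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{(\w+)(\??)\}", substitute, template)


url = fill_template(TEMPLATE, searchTerms="air quality", count=10)
```

Reading the template from the description document, rather than hard-coding parameter names, keeps the client independent of how a particular service names its query parameters.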
Summarising some results … (1)
• Data elements mapped:
  • 16/20 INSPIRE mandatory elements
    • 7 straightforward
    • 3 inferential
    • 6 by convention
  • 6/8 optional elements
• Discovery of primary data elements based on CERIF Result Product
• The CERIF semantic layer facilitates a flexible application of the model in heterogeneous environments, but needs specific constraints and rules to establish consistent semantic integration
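The counts above can be checked with a few lines of arithmetic. The numbers are taken from the slide; the variable names are chosen for this example.

```python
# Counts reported on the slide.
mandatory_by_kind = {"straightforward": 7, "inferential": 3, "convention": 6}
mandatory_total = 20
optional_mapped, optional_total = 6, 8

# The three strategies together account for the mapped mandatory elements.
mandatory_mapped = sum(mandatory_by_kind.values())

mandatory_coverage = mandatory_mapped / mandatory_total  # share of mandatory elements mapped
optional_coverage = optional_mapped / optional_total     # share of optional elements mapped
unmapped_mandatory = mandatory_total - mandatory_mapped  # mandatory elements left without a CERIF target
```

The three strategies sum to the 16 mapped mandatory elements, giving 80% coverage of mandatory elements and 75% of optional ones, with 4 mandatory elements left unmapped.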
Summarising some results … (2)
• Proposal to introduce a CERIF profile that extends ISO concepts with contextual research information:
  • Projects
  • Datasets associated with publications
• Implementation and successful testing of the GI-cat extensions allow, without additional implementation effort:
  • Integration of INSPIRE datasets in RISs
  • Integration of RISs with environmental dataset systems
Future work: service discovery, extending the mapping to ISO 19119
Discussion
Proposal of different solutions to be submitted to the euroCRIS community
• Some further suggestions:
  • Introduction of a specific entity to unambiguously identify datasets as research products
  • Establishment of a set of rules/procedures to create valid CERIF metadata extensions