The NATURE-SDIplus Validation methodology
Presentation outline
• Overall testing and validation approach
• Validation of data specification and data encoding
• Data accessibility and usability testing
• Data Quality evaluation
• Data generalisation
Nature-SDIplus main outcomes
• Data Models for 3 Annex III themes (BR, HB, SD)
• Harmonised DS & MD (PS, BR, HB, SD)
• GEOPORTAL (network services)
• NatSDI MD profile(s)
Nature-SDIplus Datasets and Metadata
• Before Task 4.1: DS + MD (PS, BR, HB, SD) before harmonisation
• After Task 4.1: Harmonised DS + MD (PS, BR, HB, SD) after harmonisation
WP5: tasks and inter-relationships
• T5.1: INSPIRE validation
• T5.2: Test on data accessibility & usability
• T5.3: Quality evaluation and dataset generalisation
Nature-SDIplus validation methodology
• Generic validation process
• Covers both:
  • Validation of specification encoding
  • Validation of data encoding
• NATURE-SDIplus specifications and test data as examples
Validation of specification encoding
The required steps:
• Validate Schema
• Check transposition of specification
• Check validatability
[Figure: the process]
Metadata Validation
The required steps:
• Syntactic validation
• Semantic validation
[Figure: the process]
Data Validation
The required steps:
• Syntactic validation
• Semantic validation
[Figure: the process]
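The two validation steps can be sketched as a two-stage check. This is a minimal illustration, not the project's tooling. It assumes that syntactic validation means checking the form of the document (reduced here to XML well-formedness; validation against an XSD would need a schema-aware validator) and that semantic validation means checking content rules. The element names in `REQUIRED_ELEMENTS` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mandatory element names, for illustration only; a real
# check would take them from the relevant specification or profile.
REQUIRED_ELEMENTS = ("title", "abstract", "lineage")


def validate(xml_text):
    """Return a list of error strings; an empty list means both stages passed."""
    # Stage 1: syntactic validation (here only XML well-formedness).
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"syntactic: {exc}"]

    # Stage 2: semantic validation (here: each mandatory element is a
    # direct child of the root and carries non-empty text).
    errors = []
    for name in REQUIRED_ELEMENTS:
        element = root.find(name)
        if element is None or not (element.text or "").strip():
            errors.append(f"semantic: missing or empty <{name}>")
    return errors
```

Running the syntactic stage first means the semantic rules only ever see a document that could be parsed, so each reported error belongs to exactly one stage.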
Statistics of the remodelling
Tools used to harmonise the 63 datasets for which the validation has been completed
Assessing Data Usability
• STEP 1: Design of online questionnaires
• STEP 2: Distribution and survey
• STEP 3: Result gathering and analysis
• STEP 4: Reporting
Data Usability on-line questionnaire (1/2)
First part, to collect information about the user:
• extent of the geographical AOI used
• group of stakeholders the user belongs to
• type of professional activity / field of expertise
• data theme assessed
• key-words used during data search
Data Usability on-line questionnaire (2/2)
Second part, to collect information about how usable the data relevant to a given theme are:
• within the geoportal (using its functionalities)
• outside the geoportal (downloading the data via the geoportal and using them inside your application, and/or consuming the WMS/WFS directly in your application)
The user is asked to rate her/his level of satisfaction (poor, moderate, good or excellent) with:
• the overall Geoportal functionalities
• the specific search functionalities
• the data within the Geoportal
• the data outside the Geoportal
Built using Google Docs tools
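A minimal sketch of how the four-level ratings might be tallied during result analysis. The item key `geoportal_overall` and the sample responses are invented for illustration; they are not survey results, and the slides do not describe the actual processing.

```python
from collections import Counter

# The four satisfaction levels named in the questionnaire, in ascending order.
RATING_SCALE = ("poor", "moderate", "good", "excellent")


def tally(responses, item):
    """Count how often each rating level was given for one questionnaire item.

    Responses that skip the item or use a value outside the scale are ignored.
    """
    counts = Counter(r[item] for r in responses if r.get(item) in RATING_SCALE)
    return {level: counts.get(level, 0) for level in RATING_SCALE}


# Made-up responses, one dict per returned questionnaire.
sample = [
    {"geoportal_overall": "good"},
    {"geoportal_overall": "poor"},
    {"geoportal_overall": "good"},
    {},  # respondent skipped the item
]
print(tally(sample, "geoportal_overall"))
```

Returning every level, including those with a zero count, keeps the results for different items directly comparable.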
Questionnaires Processing (1/4) INSPIRE Conference 2011
Data Quality evaluation
Main objective of Task 5.3 in terms of quality evaluation: to assess the quality of the harmonised vs. the source datasets
A four-step methodology has been developed and applied
Data Quality evaluation methodology
Step 1: In-depth analysis of the background documentation:
• the international standards EN ISO 19113, EN ISO 19114 and ISO/TS 19138
• the data quality issues covered by INSPIRE
• the NatureSDIplus Metadata profile
from which the data quality elements and subelements, together with the corresponding measures and their reporting, have been extracted
Step 2: Elaboration of a set of guidelines enabling the quality evaluation of spatial datasets belonging to the four INSPIRE themes covered by NatureSDIplus (PS, BR, HB, SD)
Step 3: Adaptation of the step 2 guidelines in order to use the selected data quality elements and subelements to assess the quality of the NatureSDIplus harmonised vs. source datasets
Step 4: Application of the step 3 guidelines to 4 harmonised datasets (1 harmonised dataset for each of the four INSPIRE themes: PS, BR, HB, SD) and reporting of the quality evaluation results
Quality of DS & MD
• INSPIRE DS Req’s and Rec’s
• MD DQ
• EN ISO 19113 Geographic information – Quality principles
• EN ISO 19114 Geographic information – Quality evaluation procedures
• ISO/TS 19138 Geographic information – Data quality measures
• EN ISO 19115 Geographic information – Metadata
Methodology followed to develop the guidelines for DQ evaluation
• Select the DQ elements and sub-elements (cross-checking INSPIRE PS Data Specifications, NatureSDIplus MD profile and EN ISO 19113)
• For each sub-element, define a DQ measure (in adherence to ISO/TS 19138)
• For each sub-element, define the DQ reporting (in adherence to EN ISO 19114 and EN ISO 19115)
• For each sub-element, provide an example of DQ evaluation
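As an illustration of what a DQ measure for one sub-element can look like, the sketch below computes a rate of missing items for the completeness/omission sub-element, in the spirit of the measure of that name in ISO/TS 19138. Choosing this sub-element is an assumption made for the example; the slides do not list which sub-elements the guidelines selected. The function name and the identifier lists are hypothetical.

```python
def rate_of_missing_items(expected_ids, present_ids):
    """Share of expected items that are absent from the dataset (0.0 to 1.0).

    expected_ids: identifiers of the items that should be in the dataset.
    present_ids:  identifiers of the items actually found in the dataset.
    Items present but not expected (commission) are not counted here.
    """
    expected = set(expected_ids)
    if not expected:
        raise ValueError("expected_ids must not be empty")
    missing = expected - set(present_ids)
    return len(missing) / len(expected)


# Hypothetical example: four sites expected, two of them found.
print(rate_of_missing_items(["s1", "s2", "s3", "s4"], ["s1", "s3"]))
```

Computing the same measure on a source dataset and on its harmonised counterpart gives two directly comparable values, which is the kind of comparison the task is after.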
Data Quality evaluation additional results
• The Data Quality elements and subelements have been structured according to the EN ISO 19115 formalisms, so that they can later be encoded as metadata according to CEN ISO/TS 19139
• The results achieved can also be easily applied to the other data themes, therefore providing a basis for addressing Data Quality issues in the INSPIRE context
Datasets generalisation
Main objective: to assess issues related to datasets generalisation from the local level to the national/European level
Method: design of an off-line questionnaire to collect the feedback of the NatureSDIplus Data Providers (DPs) about the usability of the PS, BR, HB and SD Data Models and of the NatureSDIplus Metadata Profile when harmonising data and metadata at local level and aiming at generalising them from the local to the national/European level
Datasets generalisation questionnaire
In particular, the feedback focused on two main aspects:
• whether DPs have noticed the need/opportunity to extend/modify the target data models, in order to better take into account local aspects
• whether DPs have noticed the need/opportunity to extend/modify the source data models, in order to facilitate INSPIRE compliance
The first aspect is consistent with Annex F (Example for an extension to an INSPIRE application schema) of the INSPIRE Data Specification D2.5 Generic Conceptual Model, according to which the data model of an INSPIRE data specification can be modified at local level in order to take local aspects into account.
The feedback collected on the second aspect can support local communities engaged in implementing INSPIRE.
Datasets generalisation questionnaire main results
19 questionnaires filled in by 19 different DPs, replies analysed and results processed:
• Need/opportunity to extend/modify the target data models, in order to better take into account local aspects
• Need/opportunity to extend/modify the source data models, in order to facilitate INSPIRE compliance
Datasets generalisation feedback (1/2)
Some feedback about the need/opportunity to extend/modify the target data models, in order to better take into account local aspects:
• “The information contained in the source PS dataset site Protection Classification is divided to 20 values. In NATURE-SDIplus data model 7 values. It is difficult to select the suitable one”
• “We noticed that the Habitats and Biotopes target data Model doesn’t cover the whole information we have describing each habitats. Our information relates several habitats to one geographical feature and the target data model expects only the relation 1 – 1. Other ways, like duplicating geographical information could be taken into account but from our point of view is not the best solution in the future.”
• “Metadata: I would leave out Data quality - Thematic and temporal accuracy and Acquisition method. The latter because it can be described already in the lineage part. Data models: BGR: I would leave out the detailed class description parameters such as temperature, rainfall, etc… HB: Also here I would leave out a number of attribute such as elevation, activities and impacts, development Stage, monitoring Assessment.”
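The first comment describes collapsing a larger source code list into a smaller target one. A minimal sketch of such a mapping, which also reports the source values that have no agreed target, could look as follows. All code values and function names are hypothetical; the real lists are not given in the slides.

```python
def map_code(source_value, mapping):
    """Return the target code for a source code, or None if none is agreed."""
    return mapping.get(source_value)


def unmapped_values(source_values, mapping):
    """Return, sorted, the distinct source codes that lack a target code."""
    return sorted({value for value in source_values if value not in mapping})


# Hypothetical excerpt: several local classes collapse into one target class.
mapping = {
    "local_class_01": "target_class_A",
    "local_class_02": "target_class_A",
    "local_class_03": "target_class_B",
}
dataset_values = ["local_class_01", "local_class_03", "local_class_17"]

print([map_code(value, mapping) for value in dataset_values])
print(unmapped_values(dataset_values, mapping))
```

Listing the unmapped values explicitly makes the difficulty raised by the data provider visible, instead of forcing an unsuitable target value silently.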
Datasets generalisation feedback (2/2)
Some feedback about the need/opportunity to extend/modify the source data models, in order to facilitate INSPIRE compliance:
• “Some source datasets are missing a lot of mandatory information to be INSPIRE compliant. The need to restructure the datasets into a database (and not a collection of flat files) is crucial for some data providers”
• “The information contained in the source dataset is not sufficient to populate the corresponding attributes of the target data model. E.g.: The attribute ‘MANAGPL’ of the source dataset contains, for some sites, information about the type of site management, whilst it should contain the URL or citation of a document describing the site management plans. Moreover, for other sites, the attribute contain references to many documents.”
• “Our datasets are simple shapefiles with attributes.”