GeoDI Final Meeting
Yassine Lassoued <y.lassoued@ucc.ie>
GeoDI Final Meeting, GSI, Dublin
24 October 2011

Layout
• Overall Project Objectives
• Methodology
• Reports
• Recommendations
• IGIS
• Exploitation and Outreach
• Difficulties
GeoDI Final Meeting - GSI - Dublin

Overall Project Objectives
• Review existing geoscientific datasets
• Review international best practice
• Specify integrated data model
• Develop ontologies for geoscientific resources
• Specify and develop automatic ETL tools
• Specify automated processes for the generation of metadata
• Specify data delivery methods

Overall Project Objectives
• Identify process changes that would improve data management
• Implement a prototype data storage and retrieval system
• Assess potential analyses and services
• Identify and evaluate tools and approaches to facilitate geospatial analysis and querying of the geoscientific data

Methodology
WP0: Project Management
• T0.1 Co-ordination of technical implementation of the project
• T0.2 Administration and financial management
• T0.3 Communication and dissemination strategy
WP1: Initiate & Identify
• T1.1 Review data and applications
• T1.2 User needs and prioritised data
• T1.3 Metadata and quality issues
• T1.4 Data familiarisation and review
• T1.5 Technical issues
WP2: Review
• T2.1 Review ontologies and CVs
• T2.2 Identification of standards and models
• T2.3 Review ontology languages & tools
• T2.4 Review international best practice
• T2.5 Review state-of-the-art ontology matching
WP3: Specify Models
• T3.1 Specify integrated data models
• T3.2 Specify ontologies
• T3.3 Ontologies and models validation
• T3.4 Prototype & benchmarking
• T3.5 Ontology matching techniques
WP4: Specify System
• T4.1 Data delivery methods
• T4.2 Process changes for improved management
• T4.3 System specification
WP5: Implement & Evaluate
• T5.1 Implement ontology server
• T5.2 Implement ETL tools
• T5.3 Prototype integration & evaluation
WP6: Synthesise
• T6.1 Assess analyses and services
• T6.2 Tools for geospatial analysis

Reports
• D1.1. Review of Geoscientific Datasets
• D2.1. Review of Ontologies and Controlled Vocabularies
• D2.2. Identification of Standards and Models
• D2.3. Review of Ontology Languages and Tools
• D2.4. Review of International Best Practice for the Management of Large Geoscientific Databases
• D2.5. Review of ETL and Ontology/Schema Matching Techniques and Tools

Reports
• D3.1. Selected Data Model and Ontologies for Geoscientific Data
• D4.1. Data Delivery Methods
• D4.2. Process Changes for the Improvement of Data Management
• D4.3. System Specification
• D5.3. System Evaluation Report
• D6.1. Potential Analyses, Services, and Tools for Geospatial Analysis and Querying

Recommendations
• Recommendations based on international standards and best practice (including INSPIRE) regarding the following areas:
  • Data model
  • Data management
  • Ontologies and controlled vocabularies
  • Metadata
  • Data delivery methods

Recommendations
• Data Model
  • Investigated three data models:
    • GeoSciML
    • Arc Geology
    • Arc Marine
  • Recommended Arc Marine
    • Supports the marine feature types
    • Used by the Marine Data Repository and the Biological Integrated Database (BIDI project)

Recommendations
• Data Management (D2.4 and D4.3)
  • In general, we recommended following the OGC TC and DWG developments
  • Make bathymetric data available in the Bathymetric Attributed Grid (BAG) format
  • Database performance
    • Indexing and spatial indexing
      • Recommendations regarding the index architectures to use
    • Table partitioning
    • Query optimisation
      • Consider database statistics collected by SQL Server

Recommendations
• Data Management
  • General considerations in:
    • Database design
    • Scalability
    • Network
    • Memory
    • Backup and recovery
    • Security
  • Other useful tips:
    • Uniform management of data and metadata
    • Grid computing
    • Torrents

Recommendations
• Ontologies and Controlled Vocabularies (D2.1, D2.3, and D4.3)
  • Content should build on:
    • INSPIRE themes
    • SeaDataNet and BODC vocabularies
    • NASA’s Global Change Master Directory (GCMD)
    • General Multilingual Environmental Thesaurus (GEMET)
    • BGS vocabularies
  • An ontology structure was proposed

Recommendations
• Ontologies and Controlled Vocabularies
  • Model: Simple Knowledge Organization System (SKOS)
  • Language: RDF and OWL
  • Encoding: RDF/XML
  • Ontology Server: Jena
  • General SKOS and OWL rules
  • Term Collection
    • Use an Excel spreadsheet and a conversion tool to generate RDF/XML using SKOS
    • Alternative: the Protégé ontology editor, which requires technical skills and is not scalable

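The term-collection route above (spreadsheet in, SKOS RDF/XML out) can be sketched roughly as follows. This is a minimal illustration and not the project's conversion tool: it assumes the spreadsheet has been exported to CSV, and the column names (id, label, broader), the base URI and the sample hierarchy are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)

# Hypothetical base URI; the real vocabulary namespace is not given here.
BASE = "http://example.org/vocab#"

# Hypothetical CSV export of the term spreadsheet (columns are assumptions,
# and the broader/narrower relation is made up for illustration).
SAMPLE = (
    "id,label,broader\n"
    "SeabedTexture,Seabed Texture,\n"
    "GrainSorting,Grain Sorting,SeabedTexture\n"
)


def terms_to_skos(csv_text: str, base: str = BASE) -> str:
    """Turn rows of (id, label, broader) into SKOS concepts in RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for row in csv.DictReader(io.StringIO(csv_text)):
        concept = ET.SubElement(
            root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": base + row["id"]}
        )
        label = ET.SubElement(concept, f"{{{SKOS}}}prefLabel", {XML_LANG: "en"})
        label.text = row["label"]
        # An empty "broader" cell means the term is a top-level concept.
        if row.get("broader"):
            ET.SubElement(
                concept,
                f"{{{SKOS}}}broader",
                {f"{{{RDF}}}resource": base + row["broader"]},
            )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(terms_to_skos(SAMPLE))
```

Keeping terms in a flat table lets domain experts edit them without ontology tooling, which is the trade-off the slide draws against Protégé.
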
Recommendations
• Ontologies and Controlled Vocabularies
  • A strategy for ontology versioning was proposed
  • A governance structure was proposed

Recommendations
• Metadata
  • Standards: ISO 19115 & ISO 19139
  • Profile: ISDE profile (INSPIRE compliant)
  • Data and service bindings
  • Semantic annotations

Recommendations
• Data Delivery Methods
  • Mainly OGC services as recommended by INSPIRE
  • Data:
    • WFS: vector data
    • WCS: grid data
    • WMS and WMTS: maps
    • KML to display seabed classification data in Google Earth
  • Metadata:
    • CSW (GeoNetwork)
  • Ontologies:
    • Jena as an ontology server
    • Semantic Web Service for providing high-level operations over HTTP

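As an illustration of what delivering vector data through WFS looks like from a client, the sketch below builds a WFS 2.0.0 GetFeature request in key-value form. The endpoint and the feature type name are hypothetical placeholders, not the project's actual service.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and feature type, used only for illustration.
ENDPOINT = "http://example.org/geoserver/wfs"
FEATURE_TYPE = "igdb:sediment_samples"


def wfs_getfeature_url(endpoint: str, type_name: str, count: int = 10) -> str:
    """Build a WFS 2.0.0 GetFeature URL using key-value-pair encoding."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,  # WFS 2.0.0 name; 1.1.0 uses typeName
        "count": count,          # WFS 2.0.0 name; 1.1.0 uses maxFeatures
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    print(wfs_getfeature_url(ENDPOINT, FEATURE_TYPE))
```
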
IGIS
• Integrated Geoscientific Information System
• Objective
  • Manage, integrate and access the various geoscientific resources: data, metadata, and semantic knowledge (ontologies)
• Approach
  • Service-oriented architecture
  • Standards-based system (INSPIRE, OGC, ISO, W3C)
[Diagram: IGIS lets users manage, discover and access geoscientific resources (data, ontologies, metadata) through standard protocols (OGC, W3C, INSPIRE)]

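[Diagram: IGIS architecture]
• Web User, via the Web: GeoDataOnline Web Portal, comprising Discover GUI (Geo Finder), Meta Viewer, Ontology Browser (OB), Data Access, and Map
• Services: CSW Mediator (CSWM), Semantic Web Service (SWS), CSW, Data Delivery Interfaces, Ontology Server
• Resources: Integrated Database, Geoscientific Ontologies, Metadata
• MI Internal User: local use of the Integrated Database via SQL
• ETL Tool (semi-automatic): takes a New Dataset; its arrows to the resources are labelled "uses", "generates", and "feeds"
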
Database
[Diagram: integrated data and the dataset views defined over it]
• Integrated Data: Measurements, Geo Points, Geo Lines, Sediment Samples, Rasters, Business Information, …
• Dataset Views:
  • View 1: Galway Bay Multibeam Grid
  • …
  • View i: CV 200701 Tracks, Galway Bay
  • …
  • View n: CV 200701, Galway Bay, Sediment Samples

Ontologies
• Domain Ontologies
  • Disciplines, Themes, Instruments, Parameters, Data Types, Places, Strata, Projects
  • Used by metadata, domain ontology browser, and CSW mediator for data discovery
• Data Model Ontology
  • Tables, Relationships, Attributes
  • Used by ETL for transforming data
• Data Controlled Vocabularies
  • Folk Classification, Seabed Sample Structure, Seabed Texture, Grain Sorting, etc.
  • Used within data to standardise field values
• Mappings (shown twice in the diagram) link these components

Metadata
[Diagram: each metadata record (Metadata 1, …) carries identification keywords taken from the Geoscientific Domain Ontology, plus WMS, WFS and WCS delivery entries, for one dataset view (View 1: Galway Bay Multibeam Grid; View i: CV 200701 Tracks, Galway Bay; View n: CV 200701, Galway Bay, Sediment Samples). The dataset views sit over the integrated data in the Integrated Geoscientific Database (IGDB).]

Metadata
• Semantic Annotations – Option 1 (Preferred)

<!--A list of keywords from the same thesaurus-->
<gmd:MD_Keywords>
  <!--One keyword-->
  <gmd:keyword>
    <gmx:Anchor xlink:href="http://geodi.ucc.ie/ont/20110429/geoscience.owl#SVProfiles">Sound Velocity Profiles</gmx:Anchor>
  </gmd:keyword>
  <!--You may have as many keywords as you wish-->
  ...
</gmd:MD_Keywords>

Metadata
• Semantic Annotations – Option 2

<!--A list of keywords from the same thesaurus-->
<gmd:MD_Keywords>
  <!--One keyword-->
  <gmd:keyword>
    <gco:CharacterString>http://geodi.ucc.ie/ont/20110429/geoscience.owl#SVProfiles</gco:CharacterString>
  </gmd:keyword>
  <!--You may have as many keywords as you wish-->
  <!--Keyword type-->
  <gmd:type>
    <gmd:MD_KeywordTypeCode codeList="http://www.isotc211.org/2005/resources/codeList.xml#MD_KeywordTypeCode" codeListValue="theme"/>
  </gmd:type>
  <gmd:thesaurusName>
    ...
  </gmd:thesaurusName>
</gmd:MD_Keywords>

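A client reading such metadata needs to recover the concept URI from either encoding. The sketch below is one possible way to do that with the Python standard library; it is not part of the project's tooling. The namespace URIs are the standard ISO 19139 and XLink ones, and the sample fragment is trimmed from the two options above.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
    "gmx": "http://www.isotc211.org/2005/gmx",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Trimmed fragment with one keyword per option, for illustration only.
SAMPLE = """
<gmd:MD_Keywords xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco"
                 xmlns:gmx="http://www.isotc211.org/2005/gmx"
                 xmlns:xlink="http://www.w3.org/1999/xlink">
  <gmd:keyword>
    <gmx:Anchor xlink:href="http://geodi.ucc.ie/ont/20110429/geoscience.owl#SVProfiles">Sound Velocity Profiles</gmx:Anchor>
  </gmd:keyword>
  <gmd:keyword>
    <gco:CharacterString>http://geodi.ucc.ie/ont/20110429/geoscience.owl#SVProfiles</gco:CharacterString>
  </gmd:keyword>
</gmd:MD_Keywords>
"""


def keyword_concepts(xml_text: str) -> list[str]:
    """Return the concept URI of each keyword, whichever option encodes it."""
    root = ET.fromstring(xml_text)
    uris = []
    for keyword in root.iter(f"{{{NS['gmd']}}}keyword"):
        # Option 1: the URI is the xlink:href of a gmx:Anchor.
        anchor = keyword.find("gmx:Anchor", NS)
        if anchor is not None and anchor.get(XLINK_HREF):
            uris.append(anchor.get(XLINK_HREF))
            continue
        # Option 2: the URI is the text of a gco:CharacterString.
        text = keyword.findtext("gco:CharacterString", default="", namespaces=NS)
        if text.strip().startswith("http"):
            uris.append(text.strip())
    return uris


if __name__ == "__main__":
    print(keyword_concepts(SAMPLE))
```

Option 1 keeps a human-readable label next to the URI, which a plain-text keyword in Option 2 cannot do.
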
Semantic Framework
[Diagram: request flow through the semantic framework]
• The Ontology Browser sends an SWS request (e.g., GetRelatedConcepts) to the Semantic Web Service (SWS) and receives an SWS response (RDF/XML)
• The SWS sends a SPARQL request to the Ontology Server (Jena) and receives results in the SPARQL Query Results XML Format
• The Ontology Loader loads the ontologies (RDF/XML files) into the triple store

Semantic Web Service
[Diagram: clients of the semantic services]
• The Ontology Browser (GUI), the CSW Mediator or Data Discovery Interface, and external applications call SWS operations (GetConceptScheme, GetConceptHierarchy…)
• Advanced users can bypass the SWS:
  • URI = URL: RDF ontologies encoded in RDF/XML
  • SPARQL: SPARQL Protocol for RDF, with results in the SPARQL Query Results XML Format
• The Ontology Server holds the geoscientific ontologies in a triple store

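The exchange between the SWS and the ontology server can be sketched as below: one function builds a SPARQL query that a GetRelatedConcepts-style operation might issue, another reads URIs out of a response in the SPARQL Query Results XML Format. How the real SWS maps its operations to SPARQL is not stated in the slides, so the use of skos:related, the function names and the sample response are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
SR = "{http://www.w3.org/2005/sparql-results#}"

# Hypothetical response, shaped like a SPARQL Query Results XML document.
SAMPLE_RESULTS = """
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="related"/></head>
  <results>
    <result>
      <binding name="related"><uri>http://example.org/ont#Shipwrecks</uri></binding>
    </result>
    <result>
      <binding name="related"><uri>http://example.org/ont#Geoparks</uri></binding>
    </result>
  </results>
</sparql>
"""


def related_concepts_query(concept_uri: str) -> str:
    """Build a SPARQL query for concepts linked by skos:related (assumed)."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?related WHERE {\n"
        f"  <{concept_uri}> skos:related ?related .\n"
        "}"
    )


def parse_uri_bindings(results_xml: str, variable: str) -> list[str]:
    """Collect the URI bound to `variable` in each result row."""
    root = ET.fromstring(results_xml)
    uris = []
    for binding in root.iter(f"{SR}binding"):
        if binding.get("name") != variable:
            continue
        uri = binding.find(f"{SR}uri")
        if uri is not None and uri.text:
            uris.append(uri.text.strip())
    return uris


if __name__ == "__main__":
    print(related_concepts_query("http://example.org/ont#Heritage"))
    print(parse_uri_bindings(SAMPLE_RESULTS, "related"))
```
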
CSW Mediator
[Diagram: keyword expansion by the CSW Mediator]
• A CSW or CSWM request arrives at the CSW Mediator (CSWM), e.g. … Keyword=Heritage …
• The CSWM consults the Semantic Web Service (SWS) and forwards rewritten CSW or CSWM requests, e.g.:
  • … Keyword=Heritage Or Keyword=Shipwrecks …
  • … Keyword=Heritage Or Keyword=Geoparks …
• Targets: MI CSW, Griffith CSW, other CSWM, …

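The keyword rewriting shown on this slide can be sketched as follows. The lookup table stands in for a call to the SWS, and the per-catalogue vocabularies (including which catalogue receives which expansion) are hypothetical; the slide does not say how the mediator chooses the terms for each target.

```python
# Hypothetical stand-in for the SWS: related concepts per keyword.
RELATED = {"Heritage": ["Shipwrecks", "Geoparks"]}

# Hypothetical vocabularies; which catalogue knows which term is an assumption.
CATALOGUE_TERMS = {
    "MI CSW": {"Heritage", "Shipwrecks"},
    "Griffith CSW": {"Heritage", "Geoparks"},
}


def expand_keyword(keyword: str, known_terms: set[str]) -> str:
    """Rewrite one keyword into an Or-filter limited to a catalogue's terms."""
    candidates = [keyword] + RELATED.get(keyword, [])
    terms = []
    for term in candidates:
        # Always keep the original keyword; keep related terms only if known.
        if (term == keyword or term in known_terms) and term not in terms:
            terms.append(term)
    return " Or ".join(f"Keyword={term}" for term in terms)


def mediate(keyword: str) -> dict[str, str]:
    """Produce the rewritten filter for every target catalogue."""
    return {
        name: expand_keyword(keyword, terms)
        for name, terms in CATALOGUE_TERMS.items()
    }


if __name__ == "__main__":
    for catalogue, query in mediate("Heritage").items():
        print(catalogue, "->", query)
```
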
Data Delivery Services
[Diagram: data delivery paths]
• Clients over the Web: Geoscientific Web Portal (Data Access Module) and external applications
• Delivery interfaces: Bathymetric Attributed Grid (BAG) files, WFS, WCS
• MI Internal User: SQL access to the Integrated Geoscientific Database
• Integrated Geoscientific Database:
  • Vector Dataset Views (V1 … Vn)
  • Coverage Dataset Views (C1 … Cm)

GeoDataOnline Portal
[Diagram: portal components and the services behind them]
• Web User: GeoDataOnline Web Portal
• Portal components: Discover GUI (Geo Finder), Meta Viewer, Ontology Browser (OB), Map, Data Access
• Services: Semantic Web Service (SWS), CSW Mediator (CSWM)
• Data Delivery Interfaces & Files: WMS / WMTS, WCS, WFS, BAG files

ETL Tool
[Diagram: ETL tool structure]
• Inputs (New Dataset): Shapefile, Access DB, Excel File, CSV File, …
• Stages: Extractor, Transformer, Loader
• Matchers applied to the dataset schema: String Based Matcher, Linguistic Matcher, Constraint Based Matcher, Data Type Matcher
• Connected to the Ontology Server
• Feeds the IGDB

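To give a feel for the string-based matcher, the sketch below pairs columns of an incoming dataset schema with attributes of the target data model by string similarity. It uses difflib from the Python standard library as a stand-in; the project's actual matching algorithm, threshold and attribute names are not given in the slides, so everything named here is an assumption.

```python
from difflib import SequenceMatcher

# Hypothetical target attributes; not taken from the real data model.
TARGET_ATTRIBUTES = ["SampleId", "GrainSorting", "SeabedTexture"]


def normalise(name: str) -> str:
    """Lower-case a name and drop separators so 'Sample_ID' ~ 'SampleId'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def best_matches(source_columns, target_attributes, threshold=0.6):
    """Map each source column to its most similar target attribute.

    Columns whose best similarity falls below `threshold` map to None,
    which leaves them for manual processing.
    """
    matches = {}
    for column in source_columns:
        scored = [
            (SequenceMatcher(None, normalise(column), normalise(attr)).ratio(), attr)
            for attr in target_attributes
        ]
        score, attr = max(scored, default=(0.0, None))
        matches[column] = attr if score >= threshold else None
    return matches


if __name__ == "__main__":
    columns = ["Sample_ID", "grain sorting", "xyz"]
    print(best_matches(columns, TARGET_ATTRIBUTES))
```

A single similarity measure is rarely enough, which is presumably why the diagram combines string-based, linguistic, constraint-based and data-type matchers.
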
Exploitation and Outreach
• The GeoDI semantic framework and CSW mediator have been taken over by NETMAR and ICAN
• The SWS specification will be further advanced by NETMAR and submitted as a GEOSS best practice
• A journal paper is to be planned, ideally co-authored by CMRC, MI, and GSI
• The IGIS may benefit ISDE and USGS

Difficulties
• Technical
  • ETL
    • Data processing is very complex
    • Manual processing is always required
    • Need a mechanism to generate business information

Difficulties
• Non-technical
  • Knock-on delays in 2008
  • The time initially allocated to ETL, data loading and implementation was not sufficient

End Yassine Lassoued <y.lassoued@ucc.ie>