RDA Metadata Semantics

This paper explores the complicated procedure of mapping metadata schemas to ontologies in order to achieve rich metadata semantics for both human and computer understanding. It discusses semantic approaches needed for metadata schemas and the challenges and opportunities in using metadata semantics widely. The paper also highlights the importance of formal semantics in metadata schemas for discovery, queries, mediation/linking, and reasoning.

cherylholt

Presentation Transcript


  1. Metadata and Semantics Research Conference, since 2005. RDA Metadata Semantics: rich metadata semantics are needed for human AND computer understanding, but mapping metadata schemas to ontologies can be a complicated procedure. Gary Berg-Cross, SOCoP, RDA US Advisory Committee

  2. Outline of Topics • Metadata: many standards and some ambitious MD requirements • RDA metadata-semantic discussions & background • Rich metadata semantics needed for human AND computer understanding • Semantic approaches needed for MD schemas • Adding formal semantics to metadata schemas for discovery, queries, mediation/linking and reasoning use can be a complicated procedure • Illustrating 2 semantic approaches • Semantic annotation • Example of an ontological schema • Are we ready for metadata semantics to be widely used? • Where are the opportunities? • Can we agree on common or domain principles (like modularity or building blocks) or some formal semantic requirements?

  3. Metadata & Standards. Evolution from file system names/types & describing DB fields to MD schemas for exchange. Dublin Core: attaching categorical tags and descriptions via a MD schema. An attempt to make data more human understandable: capture agreed-upon MD that affords understanding. The MD effort now requires many interacting pieces, including Metadata Application Profiles and workflow-like entities.

  4. Strategy of “Modular” Theory of General and Domain Specific MD (and Ontologies). Trans-domain (general consensus) metadata: ID, time.... ISO MD_Keywords: Discipline, Place, Stratum, Temporal, Theme? “Harmonized” and packaged together to support interoperability. Independent?? Domain modules: standardized Geo-specific metadata, standardized BioMed-specific metadata, standardized EarthScience-specific metadata. Modules should be easier to create, validate, understand and maintain. They may be substituted, used and reused for composition.

  5. There are specific “standards” in domains. General MD: • [ISO 19115:2003] Geographic information -- Metadata • [ISO 19115-2:2009] Geographic information -- Metadata -- Part 2: Extensions for imagery and gridded data. Other MD: OGC Object Types: axis, axisDirection, datum, dataType, derivedCRSType, documentType, ellipsoid, featureType, group. Meaning.... • In OGC’s O&M model, Earth Observations generate “products” that have metadata. • These are organized into a metadata profile expressed as a schema. Goals: support bridging heterogeneity to achieve interoperability; support data integration.

  6. Some Metadata Challenges (Earth Science, from Ilya Zaslavsky, CINERGI* pipeline). Common deficiencies in existing metadata descriptions: • Different metadata models and profiles • Different requirements for mandatory and optional fields (Dublin Core vs ISO) • Different meaning of fields and initial purpose/emphasis of data collection • Different local interpretations of how these fields should be filled out (e.g. “authors” and “contacts” are often mixed up) • Different classifications of resource types • Common resource types are: Organization, Webpage, Collection, Dataset (EPOS: Users, SW services, computing services) • Title may be non-descriptive • insufficiently unique (“Roads”) • meaningful, but opaque naming patterns (e.g. “AXXX34nn1”) • Keywords • may be missing or may be too specific to domain • may lack references to a thesaurus/CV or are freeform text • Info missing, such as Abstract, Contact saying “call”, location, time without reference, wrong URL • Grouping: a range of metadata records from a single source may be very similar (differing in only one parameter, e.g. location); they may be better discovered as a group of records • Duplicates • Several metadata records from different catalogs may point to the same physical dataset (or have overlapping subsets of distributions). Provenance issue? * Community Inventory of EarthCube Resources for Geosciences Interoperability (http://workspace.earthcube.org/cinergi)
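Several of the deficiencies listed on this slide can be checked mechanically. The following is a minimal Python sketch, not the CINERGI pipeline's method: the record layout (plain dicts with `title`, `abstract`, `keywords` and `url` keys) and the function names `find_issues` and `group_duplicates` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical field names; real profiles (Dublin Core, ISO 19115) name
# and nest these fields differently.
REQUIRED_FIELDS = ("title", "abstract", "keywords", "url")


def find_issues(record):
    """Return a list of human-readable problems found in one metadata record."""
    issues = []
    for field_name in REQUIRED_FIELDS:
        if not record.get(field_name):
            issues.append(f"missing {field_name}")
    title = record.get("title") or ""
    # A one-word title such as "Roads" is unlikely to be unique or descriptive.
    if title and len(title.split()) < 2:
        issues.append("title may be non-descriptive")
    return issues


def group_duplicates(records):
    """Group records that point to the same dataset URL.

    The normalization (trim, drop trailing slash, lowercase) is deliberately
    crude; URL paths can be case sensitive, so treat matches as candidates.
    """
    by_url = defaultdict(list)
    for record in records:
        url = (record.get("url") or "").strip().rstrip("/").lower()
        if url:
            by_url[url].append(record)
    return {url: recs for url, recs in by_url.items() if len(recs) > 1}
```

Checks like these only flag candidates for review. Deciding whether two catalog records really describe the same physical dataset still depends on provenance information that the records may not carry.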

  7. Broad View of Metadata (Schema) Status & Argument for More Semantics. Richness issue: • Even when done well, simple annotations and structured metadata are not rich enough to support ad hoc use & certainly not reasoning based on meaning. • There are many MD schemas and a broad challenge is to link/integrate them. • “Metadata schemas are created for resources’ identification and description and - most of the times - they do not express rich semantics. Even though the meaning of the metadata information can be processed by humans and its relationship to the described resource can be understood, for machine processing the actual relationships are frequently not obvious. In contrast to metadata schemas, ontologies provide rich constructs to express the meaning of data” • Stasinopoulou, Thomais, et al. "Ontology-based metadata integration in the cultural heritage domain." Asian Digital Libraries. Looking Back 10 Years and Forging New Frontiers. Springer Berlin Heidelberg, 2007. 165-175.

  8. RDA Background & Outreach on Semantics • A growing interest in the topic of semantic interoperability. • The centrality of semantic issues was, for example, noted following the 1st Plenary. • Semantic issues and technologies are already part of the discussion on the RDA Forum. “Research communities need to adopt and deploy technologies that help them get the most from their data, understand context, and infer meaning. The semantic web community has much to contribute to an enabling global infrastructure and it would be great to see greater involvement in the RDA.” • Fran Berman (Professor of Computer Science, RPI, Chair of the Research Data Alliance/U.S.) • RDA should take on this issue but how? And who will participate?

  9. RDA Metadata and Semantics Intersect • Data Foundations and Terminology (WG & IG) • Data in Context IG • Data Fabric IG • Geospatial IG • Marine Data Harmonization IG (ISO 19115 etc.) • Broker IG • Research Data Provenance... • Semantic Interoperability BoF at RDA P3 • 3 presentations to illustrate key concepts of SI & use of ontologies: Gary Berg-Cross & Yann Le Franc • Discussed Ontology Design Patterns and lightweight methods • EUON effort • What is a quality ontology? • 1st European Ontology Network (EUON) Workshop co-located at P4 • http://www.eudat.eu/euon/euon-2014-workshop

  10. The Need for Some Semantics is (somewhat) Understood Restrictive • MD need to be a first-class, processable system, like a conceptual model, easier to use and manage, and follow efforts to make data more understandable by computers. • Semantics helps address what MD annotations mean • What the shared meanings are • What the assumptions, such as relations between MD items, are and • How links to other data can be included. Source: Principles and Foundations of Ontologies and Semantic Grids - Session 48, July 15th, 2009, Oscar Corcho (Universidad Politécnica de Madrid) http://www.slideshare.net/ISSGC/session-48-principles-of-semantic-metadata-management

  11. How do we add Semantics to MD? Depends on Intended Use: Example of Semantic Annotations (HTML -> RDFa). For data description and context, the semantics added can be like a formal, conceptual model. For search, it can be like a better annotation of keywords using RDF. • Start with a collection of XHTML attributes in a web page • Embed RDF annotations in the web pages using things like • DC and FOAF vocabularies, easily used for most simple annotations, e.g. creator, title, contact info. [Figure omitted: the original markup “becomes” the annotated markup.] From Introduction to Semantic Technologies, Ontologies and the Semantic Web, Aug 2010, #39
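The annotation step described on this slide can be sketched in a few lines of Python. This is an illustrative assumption, not the markup shown in the original presentation: the function name `annotate`, its parameters and the example URLs are hypothetical. The `dc` and `foaf` namespace URIs and the RDFa attributes (`about`, `property`, `rel`, `typeof`) are the standard ones.

```python
from html import escape

# Namespace URIs for the Dublin Core element set and FOAF.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def annotate(about, title, creator_name, creator_mbox):
    """Return an XHTML fragment whose RDFa attributes describe one resource.

    The fragment states a dc:title and a dc:creator for `about`; the creator
    is a foaf:Person with a foaf:name and a foaf:mbox (mailto link).
    """
    xmlns = " ".join(f'xmlns:{p}="{uri}"' for p, uri in PREFIXES.items())
    return (
        f'<div {xmlns} about="{escape(about, quote=True)}">\n'
        f'  <h1 property="dc:title">{escape(title)}</h1>\n'
        f'  <p rel="dc:creator">\n'
        f'    <span typeof="foaf:Person">\n'
        f'      <span property="foaf:name">{escape(creator_name)}</span>\n'
        f'      <a rel="foaf:mbox" href="mailto:{escape(creator_mbox, quote=True)}">contact</a>\n'
        f'    </span>\n'
        f'  </p>\n'
        f'</div>'
    )


if __name__ == "__main__":
    # Hypothetical resource and contact, for illustration only.
    print(annotate("http://example.org/data/1", "Stream flow readings",
                   "A. Person", "a.person@example.org"))
```

The page still renders as ordinary XHTML for human readers, while an RDFa-aware parser can extract the title, creator and contact statements as RDF triples.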

  12. Beyond Vocabularies: Good Semantics Needs Appropriate Conceptualization of Properties. Connect properties like stream flow, level, pollutants, evapotranspiration etc. in a schema. [Diagram, flattened in transcript. Node labels: Water Body, Chesapeake Bay, Water Density, Water Density Unit, Grams/cm3, Unit, Area, Area Quantity, Real Number, Sq Miles. Property labels: hasLayer, hasDensity, hasConstituent, hasFeature, hasUnit, hasValue, hasQuantity, IsA.] • For connecting to Chem/BioChem ontologies there might be sub-categories of Physical Features for elements: optical, hardness, color • See Dumontier Lab ontologies to represent bio-scientific concepts and relations. • http://dumontierlab.com/?page=ontologies
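To make the idea of connected properties concrete, the sketch below stores a small schema as subject-property-object triples and follows them to find the unit attached to each feature of a water body. The edges are one assumed reading of the slide's diagram labels, since the transcript does not preserve which nodes the arrows connected. The instance names (such as `ChesapeakeBayArea`) and the helper names `match` and `units_of_features` are hypothetical, and no measured values are included.

```python
# Each statement is a (subject, property, object) triple, as in RDF.
# ASSUMPTION: these edges are a guess at the diagram's structure.
TRIPLES = [
    ("ChesapeakeBay", "isA", "WaterBody"),
    ("ChesapeakeBay", "hasFeature", "ChesapeakeBayArea"),
    ("ChesapeakeBayArea", "isA", "Area"),
    ("ChesapeakeBayArea", "hasQuantity", "ChesapeakeBayAreaQuantity"),
    ("ChesapeakeBayAreaQuantity", "hasUnit", "SqMiles"),
    ("ChesapeakeBay", "hasFeature", "ChesapeakeBayDensity"),
    ("ChesapeakeBayDensity", "isA", "WaterDensity"),
    ("ChesapeakeBayDensity", "hasUnit", "GramsPerCm3"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every argument that is not None."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def units_of_features(triples, body):
    """Map each feature of `body` to the unit its values are expressed in."""
    result = {}
    for _, _, feature in match(triples, s=body, p="hasFeature"):
        # The unit may hang off the feature itself or off its quantity node.
        holders = [feature] + [
            q for _, _, q in match(triples, s=feature, p="hasQuantity")
        ]
        for holder in holders:
            for _, _, unit in match(triples, s=holder, p="hasUnit"):
                result[feature] = unit
    return result
```

With a flat keyword list, the fact that "Sq Miles" belongs to the area and not to the density would be lost; the explicit properties are what let a program recover it.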

  13. Ontology Design Patterns (ODPs) of Semantic Trajectory: Hydro/Ocean Observations as Annotations • ODPs (aka microtheories) are small, modular, & coherent schemas. • Relatively autonomous but conceivably composable with other schemas. • Environmental observations fit into this schema. • Fixes may be hydrometric feature observations at some PoI (and offset Fix) for some point or period of time denoting important activities • Observations including time series sets might be applied to something like streamflow or temperature plots or a pollution plume or data from an ocean glider • You may query the schema: • “Show locations within Gulf of Mexico fishing area with colored dissolved organic matter” [Diagram labels: Hydro Obs/Device; Hydro Var & attr/data or value type of interest; Paths & POIs have geometries, including polygon areas; Hydro object or moving device.] Source: A Geo-Ontology Design Pattern for Semantic Trajectories, COSIT 2013, Yingjie Hu et al.
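The example query on this slide can be approximated with a very small data model. The sketch below is a simplification under stated assumptions, not the COSIT 2013 pattern itself: the class names `Fix` and `BoundingBox`, the function `fixes_with` and the variable key `"cdom"` (for colored dissolved organic matter) are hypothetical, and a rectangular bounding box stands in for the polygon geometry a real fishing area would need.

```python
from dataclasses import dataclass, field


@dataclass
class Fix:
    """A time-stamped position, optionally annotated with observed variables."""
    time: str    # ISO 8601 timestamp
    lat: float   # degrees north
    lon: float   # degrees east (negative values are west)
    observations: dict = field(default_factory=dict)  # variable name -> value


@dataclass
class BoundingBox:
    """A rectangular area of interest; real areas would be polygons."""
    south: float
    north: float
    west: float
    east: float

    def contains(self, fix):
        """True if the fix lies inside the box, edges included."""
        return (self.south <= fix.lat <= self.north
                and self.west <= fix.lon <= self.east)


def fixes_with(trajectory, area, variable):
    """Return the fixes inside `area` that carry an observation of `variable`."""
    return [
        fix for fix in trajectory
        if area.contains(fix) and variable in fix.observations
    ]
```

A call such as `fixes_with(glider_track, gulf_area, "cdom")` would then return the matching locations, where `glider_track` is a list of `Fix` objects and `gulf_area` a `BoundingBox`, both supplied by the caller. The value of the pattern is that the same structure works for a stream gauge, an ocean glider or a pollution plume.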

  14. Are we ready for metadata semantics to be widely used? • How do we bring current MD practice and semantic practice together? • What is a practical vision of this enhanced MD? • Where are the opportunities? • E.g. is semantic annotation the sweet spot? • Do we just expand MD tags to semantic annotations, and if so how? • What about ontology design patterns (ODPs)? Where are they useful? • Thoughts on where to add semantics and its technology to MD in the data/MD cycle? • How does it affect how data/MD repositories function? • Some/considerable confusion about how MD should be integrated into information systems. • Can we agree on common or domain principles (like modularity or building blocks), practices and tools to employ?
