DFT Basic Digital & Data Concepts - data is inherently collective data

Semantic Interoperability for Data in Context IG, RDA Plenary 3: Friday 28th March 2014 (Day 3). Gary Berg-Cross (SOCoP, RDA DFT WG co-chair), gbergcross@gmail.com.

Presentation Transcript


  1. Semantic Interoperability for Data in Context IG, RDA Plenary 3: Friday 28th March 2014 (Day 3). Gary Berg-Cross (SOCoP, RDA DFT WG co-chair), gbergcross@gmail.com

  2. DFT Basic Digital & Data Concepts - data is inherently collective data
     • Digital Data refers to a structured sequence of bits/bytes that represents information content. In many contexts "digital data" and "data" are used interchangeably, implying both the bits and the content.
     • Real-Time Data is data (or a data collection) that is produced on its own schedule and has a tight time relation to the processes that create it and that require immediate action. Timeliness, such as real time, is an attribute of data.
     • Dynamic Data is a type of data that changes frequently and asynchronously.
     • Note: Dynamic data has also been used in the context of workflows: a workflow that is executed can be called a "dynamic data object", or the results from executing the workflow can be called a "dynamic data object".
     • Referable Data is a type of data (digital or not) that is persistently stored and is referred to by a persistent identifier. Digital data may be accessed by the identifier. Some data object references may access a service on the object (OAI-ORE).
     • Citable Data is a type of referable data that has undergone quality assessment and can be cited in publications.
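The relationship between referable and citable data in the definitions above can be sketched in a few lines of Python. This is only an illustration, not part of the DFT vocabulary: the `DataObject` class, its field names, and the example identifier are all hypothetical, and persistent storage is simply assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataObject:
    """A hypothetical record for a stored piece of data."""
    content: bytes                   # the bits/bytes representing the information
    pid: Optional[str] = None        # persistent identifier, if one was assigned
    quality_assessed: bool = False   # True once a quality assessment has been done


def is_referable(obj: DataObject) -> bool:
    # Referable: (assumed persistently stored and) referred to by a persistent identifier.
    return obj.pid is not None


def is_citable(obj: DataObject) -> bool:
    # Citable: referable data that has also undergone quality assessment.
    return is_referable(obj) and obj.quality_assessed


if __name__ == "__main__":
    # "hdl:example/123" is a made-up identifier used only for illustration.
    raw = DataObject(b"\x00\x01")
    stored = DataObject(b"\x00\x01", pid="hdl:example/123")
    reviewed = DataObject(b"\x00\x01", pid="hdl:example/123", quality_assessed=True)
    for obj in (raw, stored, reviewed):
        print(is_referable(obj), is_citable(obj))
```

Under these assumptions every citable object is referable, but not the other way round, which mirrors "citable data is a type of referable data".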

  3. Background to this Semantic Interoperability
     • Long-standing work on “data integration and sharing”.
     • Semantics is FEATURED in the Application layer of OSI.
     • Intensive work in the AI & knowledge engineering areas.
     • But to many the goal of semantic interoperability remains elusive.
     • More recently the Semantic Web thrust pursued the goal of robust semantic interoperability & robust exchange of data.
     • Needs deep knowledge and support of reasoning to fulfill the SW vision.

  4. SI has a Socio-Tech Aspect
     Use an agile approach, based on sets of competency questions?
     Re-use and integration of data from heterogeneous sources within and across discipline boundaries has not been routinely achieved.
     Use metadata semantic annotation?
     Special technologies that infer, relate, interpret, and classify the implicit meanings of digital content are not easily adapted to the topical research interest or enfolded in traditional architectures.
     Don’t try too hard to train a Domain Expert in Gold Standard formal semantics?
     Since meaning is a cognitive agent phenomenon, semantic interoperability is the technical analogue to human communication and cooperation. That makes it intrinsically HARD.

  5. Graphic Overview of Semantic/Ontology Manifesto (EarthCube)
     • Guiding principles
     • Use cases
     • Lightweight, opportunistic methods
     • Semantic interoperability with semantic heterogeneity
     • Bottom-up & top-down approaches
     • Domain - ontology engineer teams
     • Formalized bodies of knowledge across science domains
     • Broader “Reasoning” services
     Diagram labels: Architecture & Workflow; Between “Insertion”; Community Understanding of Semantic role and value; Knowledge Infrastructure
     Vision Paper at http://stko.geog.ucsb.edu/gibda2012/gibda2012_submission_6.pdf
     Based on the work of (alphabetically) Gary Berg-Cross, Isabel Cruz, Mike Dean, Tim Finin, Mark Gahegan, Pascal Hitzler, Hook Hua, Krzysztof Janowicz, Naicong Li, Philip Murphy, Bryce Nordgren, Leo Obrst, Mark Schildhauer, Amit Sheth, Krishna Sinha, Anne Thessen, Nancy Wiegand, and Ilya Zaslavsky

  6. Lightweight Methods & Products
     • Choose lightweight approaches to support application needs and reduce entry barriers.
     • Low-hanging fruit leverages initial vocabularies & existing conceptual models to ensure that a semantics-driven infrastructure is available for early use.
     Diagram labels: Triple-like parts; Simple parts/patterns & direct relations to data; A useful set of ideas that supports a useful subset of (approximate) reasoning; More relation types here; Bottom Up; SI
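As a rough illustration of what "triple-like parts" with direct relations to data might look like in practice, the sketch below stores facts as (subject, predicate, object) tuples and answers simple pattern queries. It is a minimal sketch only: the entity and relation names are invented for the example and the `match` helper is hypothetical, not an API from any of the projects mentioned.

```python
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts, invented purely for illustration.
TRIPLES: List[Triple] = [
    ("stream_1", "flowsInto", "river_1"),
    ("lake_1", "flowsInto", "river_1"),
    ("river_1", "isA", "River"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    # Which things flow into river_1?
    for triple in match(TRIPLES, p="flowsInto", o="river_1"):
        print(triple)
```

A plain list of tuples supports only lookup; anything resembling reasoning (for example, following `flowsInto` chains) would need extra rules layered on top.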

  7. GeoSpatial Data & Web Feature Service Standardizes Terms but Lacks Semantics
     Terminologies created independently are based on different conceptual models, differing not only in terms/vocabulary but also in meanings.

  8. Better Conceptualization of Properties - for Interoperability (CUAHSI)
     Organize properties like size as a physical quality, since size inheres in a physical object. Treat qualities such as physical, bulk, & measured properties (stream flow, level, pollutants, evapotranspiration, etc.) as usable concepts rather than level concepts.
     • Currently CUAHSI has them at many levels
     • E.g. 2291 Major, bulk properties 4 hasLayer …..
     Diagram labels: Water Body, Water Density, Water Density Unit, Grams/cm3, Area, Area Quantity, Real Number, Chesapeake Bay, Sq Miles; relations: hasConstituent, hasUnit, hasFeature, usesStandard, IsA, hasValue, hasQuantity
     • For connecting to Chem/BioChem ontologies there might be sub-categories of Physical for elements – optical, hardness, color
     • See Dumontier Lab ontologies to represent bio-scientific concepts and relations.
     • http://dumontierlab.com/?page=ontologies
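One possible reading of the labels on this slide (Area, Sq Miles, hasValue, hasUnit) is that a quality inheres in a physical object and carries a quantity made of a real number and a unit. The Python sketch below follows that reading; the class names, the `to_sq_km` helper, and the numeric value are assumptions made for illustration and do not reproduce the CUAHSI model or any real measurement.

```python
from dataclasses import dataclass

SQ_MI_TO_SQ_KM = 2.589988  # one square mile expressed in square kilometres


@dataclass(frozen=True)
class Quantity:
    value: float  # plays the role of hasValue (a real number)
    unit: str     # plays the role of hasUnit, e.g. "sq mi" or "g/cm3"


@dataclass(frozen=True)
class Quality:
    name: str           # e.g. "Area" or "Water Density"
    bearer: str         # the physical object the quality inheres in
    quantity: Quantity  # plays the role of hasQuantity


def to_sq_km(q: Quantity) -> Quantity:
    """Convert an area quantity from square miles to square kilometres."""
    if q.unit != "sq mi":
        raise ValueError(f"expected unit 'sq mi', got {q.unit!r}")
    return Quantity(q.value * SQ_MI_TO_SQ_KM, "sq km")


if __name__ == "__main__":
    # 100.0 is a placeholder, NOT the real area of Chesapeake Bay.
    area = Quality("Area", "Chesapeake Bay", Quantity(100.0, "sq mi"))
    print(area.name, "of", area.bearer, "=", to_sq_km(area.quantity))
```

Keeping the unit next to the value is what lets two data sets that report the same quality in different units be compared at all.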

  9. Incrementally Adding Better Semantic Relations/Properties
     Data models & SKOS offer some relations, but they are limited. SKOS is more useful for terms than concepts. Consider irreflexive, anti-symmetric & transitive constructs that capture common understanding.
     Observation – Streams and lakes flow into rivers.
     • Property “flows-into” is irreflexive
       • any one river cannot flow into itself as a loop
     • “flows-into” is also anti-symmetric
       • if one river flows into the second, the second one can’t flow into the first.
     • Transitive property for Regions, to say that the subRegionOf property between regions is transitive
       <owl:TransitiveProperty rdf:ID="subRegionOf">
         <rdfs:domain rdf:resource="#Region"/>
         <rdfs:range rdf:resource="#Region"/>
       </owl:TransitiveProperty>
     If Logan, Cache County and Utah are regions, and Logan is a subRegion of Cache County, and Cache County is a subRegion of Utah, then Logan is also a subRegion of Utah.
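The three constructs on this slide can be checked mechanically when a relation is given as a set of (a, b) pairs. The following Python sketch is an illustration under that assumption, not an OWL reasoner: the function names are hypothetical, the water-body names are invented, and only the Logan / Cache County / Utah example comes from the slide.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def is_irreflexive(pairs: Set[Pair]) -> bool:
    # No element is related to itself (no river flows into itself).
    return all(a != b for a, b in pairs)


def is_antisymmetric(pairs: Set[Pair]) -> bool:
    # If (a, b) and (b, a) are both present, then a and b must be the same element.
    return all(a == b or (b, a) not in pairs for a, b in pairs)


def transitive_closure(pairs: Set[Pair]) -> Set[Pair]:
    # Repeatedly add (a, d) whenever (a, b) and (b, d) are present, until nothing new appears.
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


if __name__ == "__main__":
    # Hypothetical water bodies, invented for illustration.
    flows_into = {("stream_a", "river_b"), ("lake_c", "river_b")}
    print(is_irreflexive(flows_into), is_antisymmetric(flows_into))

    # The region example from the slide.
    sub_region_of = {("Logan", "Cache County"), ("Cache County", "Utah")}
    print(("Logan", "Utah") in transitive_closure(sub_region_of))
```

A relation that is both irreflexive and anti-symmetric can never hold in both directions between two things, which is exactly the "second one can't flow into the first" behaviour described for flows-into.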

  10. Grafton Street Dublin in Context
     All such references are usually outside a computer.
     Grafton Street (Irish: Sráid Grafton) is one of the two principal shopping streets in Dublin city centre. Do we refer to it as a pedestrian mall or a shopping street? Is it a road object but with motor traffic restrictions? Or a public place? Or a non-identifiable part of the city surface?
     [Map: OpenStreetMap]

  11. What Grafton Street is Depends on its Setting – When We Are Talking About, and What Features
     Grafton Street in 1814 AD or 2014? Transport or commerce features?

  12. Semantics in Context: Connecting 3 Views for Geography/GIScience Knowledge
     Diagram labels: This is different than regular land and water. Philosophy Psychology Perspectives ….. Knowledge/GeoConcepts Maybe there is more than 1 type of boundary Understand Reality: Data evidence Model to express what you understand Models representing Geo-Knowledge GeoReality Task - Regiment Language Wetland….geo-entity..what boundary? Flows Into isa Type of connected-to Boundary segments = straight lines, so overall boundary is a polyline… Name Reality

  13. Example of (Powerful) Challenges – Semantic Mismatches, Inclusions & Alignments
     • Pragmatics of Intentions & goals (also Grafton example)
       • We have different goals, so application & use are targeted. We need to adjust conceptualization to accommodate these.
     • Ontology level (Grafton example)
       • Different conceptualizations such as different class scope, hierarchy level differences, coverage or granularity.
       • Scientists use different concepts & categories;
       • What does it mean to say that concept P includes concept S?
       • What does it mean to say that concepts P and S are semantically close?
       • Scientific understanding often requires existing concepts to be revised or supplanted in the field
       • Perspective – 4D vs. 3D, roads as straight lines or curves, time as interval or ratio…..
       • Tacit assumptions (when messaging, an agent has in mind a number of “unspoken,” implicit consequences of that message.) – “You can’t drive on Grafton”….
     • Language level for expressing semantics
       • Syntax and logical representation differences of the past should be handled by standardization & rule translations.
       • Different expressivity (OWL vs. Common Logic) might be harder.

  14. One View of Semantic Representation & Heterogeneity
     A challenge of deep semantic interoperability is that:
     • A global, one-size-fits-all (Gold Standard) representation for each distinct situation, such as Grafton St. represented by data, is not realistic,
     • and its procrustean nature may not be desirable if it ignores real heterogeneity
     • The judgment of some (cf. John Sowa) is that different representations might be optimal for different use cases
     • Different levels of detail or granularity, along with different kinds of data entry options, seem in practice suitable for different domains and settings.
     • Since scientific research is diverse and evolving, what approach to granular standards can be developed for use?
     • Perhaps it is to use formal semantics to narrow the range of ambiguity for particular purposes.
