
Ontologies in Data and Application Integration – an Update

Kai Lin, Bertram Ludäscher. Knowledge-Based Information Systems Lab, Data and Knowledge Systems (DAKS), San Diego Supercomputer Center, University of California San Diego. http://www.geongrid.org


Presentation Transcript


  1. Ontologies in Data and Application Integration – an Update Kai Lin Bertram Ludäscher Knowledge-Based Information Systems Lab Data and Knowledge Systems (DAKS) San Diego Supercomputer Center University of California San Diego http://www.geongrid.org

  2. Outline • Motivation • Ontology Cheat Sheet • Ontology-enabled Prototypes and Tools • Data & Service Registration (Structural + Semantic) • Scientific Workflows

  3. Ontology Cheat Sheet (1/2) • What is an ontology? An ontology usually … • specifies a theory (a set of models) by … • defining and relating … • concepts representing features of a domain of interest • Also an overloaded (sometimes sloppy) term for: • Controlled vocabularies • Database schema (relational, XML, …) • Conceptual schema (ER, UML, …) • Thesauri (synonyms, broader term/narrower term) • Taxonomies • Informal/semi-formal representations • “Concept spaces”, “concept maps” • Labeled graphs / semantic networks (RDF) • Formal ontologies, e.g., in [Description] Logic (OWL) • “formalization of a specification” → constrains possible interpretation of terms

  4. A Multi-Hierarchical Rock Classification “Ontology” (GSC) Genesis Fabric Composition Texture

  5. Ontology Cheat Sheet (2/2) • What are ontologies used for? • Conceptual models of a domain or application, (communication means, system design, …) • Classification of … • concepts (taxonomy) and • data/object instances through classes • Analysis of ontologies e.g. • Graph queries (reachability, path queries, …) • Reasoning (concept subsumption, consistency checking, …) • Targets for semantic data registration • Conceptual indexes and views for • searching, • browsing, • querying, and • integration of registered data
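
The graph queries and subsumption checks listed on this slide can be illustrated with a minimal Python sketch. It assumes a tiny, hypothetical is-a graph of rock concepts (the concept names are illustrative and not taken from the GSC classification), and it only follows asserted is-a edges; a description-logic reasoner over OWL would also derive subsumptions that are not asserted.

```python
from collections import deque

# Hypothetical is-a edges (concept -> direct superconcepts). A concept may
# have several parents, as in a multi-hierarchical classification.
IS_A = {
    "Granite": ["PlutonicRock"],
    "Basalt": ["VolcanicRock"],
    "PlutonicRock": ["IgneousRock"],
    "VolcanicRock": ["IgneousRock"],
    "IgneousRock": ["Rock"],
}


def ancestors(concept, is_a=IS_A):
    """Reachability query: every concept reachable via is-a edges."""
    seen = set()
    queue = deque(is_a.get(concept, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(is_a.get(current, []))
    return seen


def is_subsumed_by(sub, sup, is_a=IS_A):
    """True if `sub` equals `sup` or `sup` is reachable from `sub`."""
    return sub == sup or sup in ancestors(sub, is_a)
```

With these edges, `is_subsumed_by("Granite", "Rock")` holds, while the reverse direction does not.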

  6. Application Example: Geologic Map Integration • GEON Metamorphism Equation: Geoscientists + Computer Scientists, +/- Energy, +/- a few hundred million years, yields Igneous Geoinformaticists • [Diagram labels: domain knowledge; knowledge representation; Ontologies!?; Nevada]

  7. Geologic Map Integration in the Portal • After registering datasets, ontologies (here: “classes”), and an application (“OMI”), the datasets can be searched and displayed in an integrated way.

  8. Concept-Based Queries and Analysis • After registering a source with one or more ontologies, concept-based queries and analysis can be launched • Here: light-weight client-side processing (SVG)

  9. Ontologies and Data Management • Where do ontologies fit within data management architectures? • Several answers, specifically: • An ontology is similar to a schema or conceptual model if one exists, but is • Developed independently of a particular application • Probably given in a different language • Inherently more general • Usually not a very good schema (weak structure)

  10. Ontologies and Data Management( watch out for Semantic Data Registration later) Ontology use concepts from (explicitly or implicitly) Design Artifact Conceptual Model Conceptual Model Schema Schema Schema Schema  Metadata Data

  11. Creating and Sharing Concept Maps (here: Seismology concept map & Cmap tool) • Lock up scientists for 2+ days • Add CS/KRDB types • Create concept maps • Refine • Iterate → from napkin drawings, to concept maps, to ontologies

  12. Graph (RDF) Queries on Ontologies • RQL query: show all “products” • [Screenshots: ontology visualisation; query results]

  13. Community-Based Ontology Development • Draft of a geochemistry ontology developed by scientists • Current concept maps and • emerging ontologies: • Igneous Rocks/Plutons • Seismology • Geochemistry

  14. Protégé (… not so ezOWL yet…)

  15. Sparrow (a poor man’s OWL tool …) Simple ASCII-based RDF and OWL entry and manipulation

  16. Semantic Data Registration (joint work w/ Shawn Bowers)

  17. What is Data/Ontology/… Registration? • A mechanism by which data sources, ontologies, services, … • … are published in a repository/registry • for the purpose of “smart” discovery, querying, integration

  18. Things to Register • Data files (individual files) • Shapefile as a blob (+ file type) • Collections (of files; nested; e.g., satellite data) • Databases (has schema and can be queried) • Shapefile with schema registered • Ontologies • Services (web + grid services) • Other/external applications
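
As a rough illustration of registration for discovery, the following Python sketch models a registry as an in-memory list of resources, each tagged with a kind and a set of ontology concepts. The class names, resource names, and the `discover` method are all hypothetical; nothing here reflects the actual GEON registry interface.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    kind: str  # e.g. "file", "collection", "database", "ontology", "service"
    concepts: set = field(default_factory=set)  # concepts it is registered to


class Registry:
    """Hypothetical in-memory registry; a real one would be persistent."""

    def __init__(self):
        self._resources = []

    def register(self, resource):
        self._resources.append(resource)

    def discover(self, concept, kind=None):
        """Names of resources registered to `concept`, optionally by kind."""
        return [
            r.name
            for r in self._resources
            if concept in r.concepts and (kind is None or r.kind == kind)
        ]


registry = Registry()
# Illustrative entries; the resource names are made up for this sketch.
registry.register(
    Resource("abundance_table", "database", {"SpeciesAbundance", "SBLTERSite"})
)
registry.register(Resource("abundance_plotter", "service", {"SpeciesAbundance"}))
```

A call such as `registry.discover("SpeciesAbundance", kind="service")` then returns only the matching service. Combining this lookup with a subsumption check would let a query for a general concept also find resources registered to more specific ones.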

  19. Connecting Datasets to Ontologies • How can we “register” the dataset to concepts in the ontology?

      Ontology (snippet):

      • DataCollectionEvent ⊑ contains.Measurement
      • Measurement ⊑ measureOf.MeasurableItem ⊓ hasContext.MeasurementContext
      • MeasurementContext ⊑ hasTime.DateTime ⊓ hasLocation.Location
      • MeasurableItem ⊑ hasUnit.Unit ⊓ hasValue.UnitValue
      • SpeciesCount ⊑ MeasurableItem ⊓ hasSpecies.Species ⊓ hasUnit.RatioUnit …
      • SpeciesAbundance ⊑ Measurement ⊓ measureOf.SpeciesCount
      • AbundanceCollectionEvent ⊑ DataCollectionEvent ⊓ contains.SpeciesAbundance
      • Location ⊑ position.Coordinate
      • LTERSite ⊑ Location
      • SBLTERSite ⊑ LTERSite ⊓ position.SBLTERCoordinate
      • {naples,…} ⊑ SBLTERSite

      Dataset:

      | Date       | Site | Transect | SP_Code | Count |
      |------------|------|----------|---------|-------|
      | 2000-09-08 | CARP | 1        | CRGI    | 0     |
      | 2000-09-08 | CARP | 4        | LOCH    | 0     |
      | 2000-09-08 | CARP | 7        | MUCA    | 1     |
      | 2000-09-22 | NAPL | 7        | LOCH    | 1     |
      | 2000-09-18 | NAPL | 1        | PAPA    | 5     |
      | 2000-09-28 | BULL | 1        | CYOS    | 57    |

  20. Step 1: Selecting Relevant Concepts • Concepts from an ontology: • DataCollectionEvent • AbundanceCollectionEvent • Measurement • Abundance • SpeciesAbundance • MeasurementContext • … • Location • LTERSite • SBLTERSite • naples • Species • … • MeasurableItem • SpeciesCount

      Dataset:

      | Date       | Site | Transect | SP_Code | Count |
      |------------|------|----------|---------|-------|
      | 2000-09-08 | CARP | 1        | CRGI    | 0     |
      | 2000-09-08 | CARP | 4        | LOCH    | 0     |
      | 2000-09-08 | CARP | 7        | MUCA    | 1     |
      | 2000-09-22 | NAPL | 7        | LOCH    | 1     |
      | 2000-09-18 | NAPL | 1        | PAPA    | 5     |
      | 2000-09-28 | BULL | 1        | CYOS    | 57    |

  21. Step 1: Selecting Relevant Concepts (continued; same concept list and dataset as slide 20)

  22. Step 2: Generate Object Model • Concepts from an ontology: • DataCollectionEvent • AbundanceCollectionEvent • Measurement • Abundance • SpeciesAbundance • MeasurementContext • … • Location • LTERSite • SBLTERSite • naples • Species • … • MeasurableItem • SpeciesCount • [Object model diagram: nodes AbundanceCollectionEvent, SpeciesAbundance, SpeciesCount, Species, RatioUnit, RatioValue, SBLTERSite, DateTime; edges contains, measureOf, hasValue, hasSpecies, hasUnit, hasTime, hasLoc]
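
One way to picture the generated object model is to turn each dataset row into a nested structure whose node and edge names follow the diagram. The Python sketch below makes several assumptions that the slides do not confirm: the column-to-concept assignment (Date to DateTime, Site to SBLTERSite, SP_Code to Species, Count to RatioValue), attaching `hasTime` and `hasLoc` to the SpeciesAbundance node, and leaving out Transect and `hasUnit` because the slide gives no target or unit for them.

```python
import csv
import io

# The dataset from the slides, as CSV text.
DATASET = """\
Date,Site,Transect,SP_Code,Count
2000-09-08,CARP,1,CRGI,0
2000-09-08,CARP,4,LOCH,0
2000-09-08,CARP,7,MUCA,1
2000-09-22,NAPL,7,LOCH,1
2000-09-18,NAPL,1,PAPA,5
2000-09-28,BULL,1,CYOS,57
"""


def row_to_event(row):
    """Build one AbundanceCollectionEvent object from a dataset row.

    The column-to-concept assignment is an assumption of this sketch.
    """
    return {
        "type": "AbundanceCollectionEvent",
        "contains": {
            "type": "SpeciesAbundance",
            "hasTime": {"type": "DateTime", "value": row["Date"]},
            "hasLoc": {"type": "SBLTERSite", "value": row["Site"]},
            "measureOf": {
                "type": "SpeciesCount",
                "hasSpecies": {"type": "Species", "value": row["SP_Code"]},
                "hasValue": {"type": "RatioValue", "value": int(row["Count"])},
            },
        },
    }


events = [row_to_event(row) for row in csv.DictReader(io.StringIO(DATASET))]
```

Each of the six rows yields one event object, so queries phrased against ontology concepts can be answered by walking these objects instead of the raw columns.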

  23. Applications of Semantic Registration • Mentioned before: • Smart data discovery, integration, etc. • New application: • Generating data transformations semi-automatically for chaining together computational services

  24. Problem: Service Reusability • Unless “designed to fit,” independent services are structurally incompatible • Generally, the source output type will not be a subtype of the target input type • [Diagram: the output port Ps of the source service (StructuralType Ps) is incompatible (⋠) with the input port Pt of the target service (StructuralType Pt), blocking the desired connection]

  25. Service Reusability • A data transformation mapping is required to connect the services, artificially creating subtype compatibility • If such a mapping exists, the services are “structurally feasible” • [Diagram: StructuralType Ps is incompatible (⋠) with StructuralType Pt, but the mapping applied to Ps is a subtype (≺) of Pt, which enables the desired connection between source service and target service]
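
The idea of structural feasibility can be sketched in Python under strong simplifications: structural types are modeled as flat records (field name to type name) instead of XML schemas, subtyping is plain width subtyping, and the mapping is a field renaming. The field names are borrowed from the XQuery example on slide 29; the field types and the helper names are assumptions of this sketch.

```python
def is_structural_subtype(source_type, target_type):
    """Width subtyping on flat records.

    Every field the target expects must exist in the source with the
    same type; extra source fields are allowed.
    """
    return all(
        source_type.get(name) == typ for name, typ in target_type.items()
    )


def apply_mapping(source_type, renames):
    """Rename fields of a record type (old name -> new name)."""
    return {renames.get(name, name): typ for name, typ in source_type.items()}


# Hypothetical port types: output of the source, input of the target.
PS = {"cnt": "int", "lsp": "string"}
PT = {"obs": "int", "phase": "string"}

# Hypothetical transformation mapping between the two.
MAPPING = {"cnt": "obs", "lsp": "phase"}
```

Here `is_structural_subtype(PS, PT)` is false, so the services cannot be connected directly, while `is_structural_subtype(apply_mapping(PS, MAPPING), PT)` is true, which is what "structurally feasible" amounts to in this simplified setting.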

  26. Service Reusability • Idea: • annotate services with semantic types (concept expressions), primarily for discovery of services • [Diagram: SemanticType Ps and SemanticType Pt are drawn from ontologies (OWL) and are compatible (⊑), supporting the desired connection between source service and target service]

  27. Service Reusability • Services can be semantically compatible, but structurally incompatible • [Diagram: SemanticType Ps and SemanticType Pt are compatible (⊑) with respect to the ontologies (OWL), while StructuralType Ps and StructuralType Pt are incompatible (⋠); a mapping applied to Ps that is a subtype (≺) of Pt is needed for the desired connection]

  28. The Ontology-Driven Framework (work w/ Shawn Bowers, SEEK) • [Diagram: the ports of the source and target services are connected to ontologies (OWL) through registration mappings (input and output); SemanticType Ps and SemanticType Pt are compatible (⊑); a correspondence between StructuralType Ps and StructuralType Pt is derived and used to generate the transformation applied to Ps, which realizes the desired connection]
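
How registration mappings could yield structural correspondences is sketched below in Python. The sketch assumes that each registration mapping is a dictionary from a structural path to an ontology concept, and that a source path corresponds to a target path when the source concept is subsumed by the target concept. The is-a edges come from the ontology snippet on slide 19; the paths, the `site` and `where` elements, and the function names are hypothetical, and the actual framework may derive correspondences differently.

```python
# Direct superconcepts taken from the slide-19 ontology snippet.
SUPER = {
    "SpeciesCount": ["MeasurableItem"],
    "SBLTERSite": ["LTERSite"],
    "LTERSite": ["Location"],
}


def subsumed_by(sub, sup):
    """True if `sub` equals `sup` or reaches it via asserted is-a edges."""
    if sub == sup:
        return True
    return any(subsumed_by(parent, sup) for parent in SUPER.get(sub, []))


def correspondences(source_reg, target_reg):
    """Pair source and target paths whose concepts are compatible."""
    return [
        (s_path, t_path)
        for s_path, s_concept in source_reg.items()
        for t_path, t_concept in target_reg.items()
        if subsumed_by(s_concept, t_concept)
    ]


# Hypothetical registration mappings (structural path -> concept).
SOURCE_REG = {
    "/population/sample/meas/cnt": "SpeciesCount",
    "/population/sample/site": "SBLTERSite",
}
TARGET_REG = {
    "/cohortTable/measurement/obs": "MeasurableItem",
    "/cohortTable/measurement/where": "Location",
}

pairs = correspondences(SOURCE_REG, TARGET_REG)
```

The resulting path pairs are the raw material from which a transformation query could then be generated.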

  29. Example Generated Data Transformation (in XQuery) • Based on the structural correspondences and certain assumptions, we derive the transformation query:

          <cohortTable> {
            for $s in /population/sample
            return
              <measurement>
                { for $c in $s/meas/cnt return <obs>{$c/text()}</obs> }
                { for $l in $s/lsp return <phase>{$l/text()}</phase> }
              </measurement>
          } </cohortTable>
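
For readers without an XQuery processor at hand, the same restructuring can be approximated in Python with the standard `xml.etree.ElementTree` module. The function name and the sample input document are assumptions of this sketch; only the element names and paths are taken from the query above, and `.text` stands in for `text()` on elements with simple text content.

```python
import xml.etree.ElementTree as ET


def to_cohort_table(population):
    """Mirror the XQuery: one <measurement> per <sample>."""
    table = ET.Element("cohortTable")
    for sample in population.findall("sample"):
        measurement = ET.SubElement(table, "measurement")
        # $s/meas/cnt becomes <obs>
        for cnt in sample.findall("meas/cnt"):
            ET.SubElement(measurement, "obs").text = cnt.text
        # $s/lsp becomes <phase>
        for lsp in sample.findall("lsp"):
            ET.SubElement(measurement, "phase").text = lsp.text
    return table


# Hypothetical input; the values are invented for illustration.
SAMPLE = (
    "<population><sample><meas><cnt>5</cnt></meas>"
    "<lsp>adult</lsp></sample></population>"
)

result = ET.tostring(to_cohort_table(ET.fromstring(SAMPLE)), encoding="unicode")
```

For the sample input, `result` contains a single `<measurement>` element with one `<obs>` child followed by one `<phase>` child.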

  30. Scientific Workflows (Efrat Jaeger et al.)

  31. Reverse Engineering a Scientific Workflow using the KEPLER Tool (Efrat Jaeger)

  32. A Scientific Workflow in Kepler • [Workflow annotations: extract mineral composition for row Id; Igneous Rock Diagrams information; Rock Name]

  33. A Scientific Workflow in Kepler

  34. A Scientific Workflow in Kepler

  35. Reverse-Engineered the Geological Map Integration in Kepler

  36. DataMapper Sub-Workflow

  37. Result launched via the BrowserUI actor

  38. KEPLER and YOU • Kepler … • is a community-based, cross-project, open source collaboration • for “minute made” application integration using web (grid) services as basic building blocks • has a joint CVS repository, mailing lists, web site, … • is gaining momentum thanks to contributors and contributions • BSD-style license allows commercial spin-offs • a pre-packaged, shrink-wrapped version (“Kepler-to-GO”) coming soon to a place near you…

  39. F I N – Questions?

  40. Additional Material

  41. The KEPLER GUI (Vergil from Ptolemy II) Drag and drop utilities, director and actor libraries.

  42. Running the workflow

  43. Distributed Workflows in KEPLER • Web and Grid Service plug-ins • WSDL • ProxyInit, GlobusGridJob, GridFTP, DataAccessWizard • SRB • SSH, SCP • Web Service Harvester • Imports all the operations of a specific WS (or of all the WSs in a UDDI repository) as Kepler actors • XSLT and XQuery transformers to link non-fitting services together • Web Service Deployment (…ongoing work…)
