Contrasting typical SW and DB approaches to semantic integration

Presentation Transcript


  1. Contrasting typical SW and DB approaches to semantic integration. Arnon Rosenthal

  2. Two versions of a common problem
     • Schema matching ≈ aligning classes/properties in an ontology: two meta-models, similar core problem
     • Start with either:
       • Two domain models
       • Two schemas (for systems)
       • One domain model and one schema
     • Goal: identify the relationships between
       • their concepts
       • their instance sets
       • same as, IS-A, “usable for” seem the main ones helpful to a “customer”
     • May need to transform to make things match
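The relationship kinds named on slide 2 (same as, IS-A, “usable for”) can be recorded as typed correspondences between elements of two models. The following is a minimal sketch, not something from the talk: the `Correspondence` record, the `usable_sources` helper, and all element names such as `hr.Employee` are hypothetical. It also assumes that "same as" and IS-A each imply that the source can be used where the target is expected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Relation(Enum):
    SAME_AS = "same as"
    IS_A = "IS-A"
    USABLE_FOR = "usable for"


@dataclass(frozen=True)
class Correspondence:
    source: str                      # element of the first model or schema
    target: str                      # element of the second model or schema
    relation: Relation
    transform: Optional[str] = None  # note on any conversion needed to make things match


def usable_sources(target: str, correspondences: List[Correspondence]) -> List[str]:
    """Source elements usable where `target` is expected.

    Assumption: all three relation kinds imply usability from source to target.
    """
    return sorted(c.source for c in correspondences if c.target == target)


# Hypothetical correspondences between an HR model and a payroll schema.
CORRESPONDENCES = [
    Correspondence("hr.Manager", "payroll.Staff", Relation.IS_A),
    Correspondence("hr.Employee", "payroll.Staff", Relation.SAME_AS),
    Correspondence("hr.salary_eur", "payroll.pay_usd", Relation.USABLE_FOR,
                   transform="convert currency"),
]

print(usable_sources("payroll.Staff", CORRESPONDENCES))
```

Keeping the `transform` note on the correspondence itself reflects the slide's last bullet: some elements match only after a conversion.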

  3. Decades have elapsed!
     • Database side: survey of schema matching research (Batini et al., 1986)
       • Target schema may be constructed from inputs
       • Envisioned end product is a SQL view
       • Focus is on “where can we find clues”
     • Sem-Web precursors: ISI, MCC – domain model (in logic) plus articulation axioms
       • Constraints are within the logic
       • Reasoning-based; each project had its own formalism
     Obvious question: Why no robust products yet?

  4. Leaping ahead to my conclusions (SW competitor)
     • For enterprise systems today, lean toward DB and XML tools, unless you really exploit ontologies’ greater expressive power (value taxonomies, IS-A)
     • Maturing sem-web environments will (by definition) import knowledge from big data integration products

  5. Correspondence topology
     • Direct approaches
     • Neutral form approaches (can be multiple)
     [Diagram label: Domain model]
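One common argument for the neutral-form topology on slide 5 is the number of mappings that must be maintained. The sketch below is a standard counting argument rather than anything stated in the talk. It assumes one undirected mapping per pair of systems in the direct case and exactly one neutral model in the other; the function names are hypothetical.

```python
def direct_mappings(n_systems: int) -> int:
    """Mappings needed when every pair of systems is related directly."""
    return n_systems * (n_systems - 1) // 2


def neutral_mappings(n_systems: int) -> int:
    """Mappings needed when each system is related only to one neutral model."""
    return n_systems


for n in (3, 10, 50):
    print(n, direct_mappings(n), neutral_mappings(n))
```

Under these assumptions, 10 systems need 45 direct mappings but only 10 mappings to a neutral model. The cost of the neutral form is that someone has to build and agree on that model first.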

  6. Emerging work – not associated with systems
     • Multiple intermediaries
     • Which to use when creating? describing?
     [Diagram labels: Domain model 1, Domain model 2, Domain model 3]

  7. Compare typical DB vs. AI approaches (1): Formalisms to describe concepts & relationships
     DB (Schema)
     • Basic unit: relation or tree scheme
     • Record is a good chunk for storage or display
     • Sets are present, implicit
     • Describe a system or a physical message
     AI (Ontology)
     • Basic unit: atomic concept (object or property)
     • Small chunks → easy to relate & reuse
     • Describe a domain model
     • Robust for multiple uses

  8. Compare typical DB vs. AI approaches (2): Formalisms to describe concepts & relationships
     DB (Schema)
     • Direct relationships and flows between systems
     • Instant gratification (funding is usually for an application, not for integration)
     • Differences in real data lead to improved definitions
     • Tools examine the data
     • $billion industry → feature-rich, scalable tools
     AI (Ontology)
     • Relate via neutral definitions
     • Reuse is easier
     • Will administrators understand “foreign” or abstract concept definitions?

  9. Compare typical AI vs. DB approaches (5)
     DB
     • Homegrown logic
     • Even simple Datalogs won’t interoperate
     • Extensible
     • Mappings are in popular query languages
     • Efficient: parallel, query optimizers
     • Deployable, e.g., change management
     AI
     • OWL has both theory and tool communities
     • Extensible
     • Execute by inference engine? Not tuned to query processor strengths
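Slide 9's point that DB-side mappings are written in popular query languages, like slide 3's "SQL view" end product, can be illustrated with a small view definition. This is a sketch only, using Python's standard `sqlite3` module. The `emp` table, the `staff` view, and all column names are hypothetical and do not come from the talk.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a source table and a target-shaped view that acts as the mapping."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical source schema.
        CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary_cents INTEGER);
        INSERT INTO emp VALUES (1, 'Ada', 1250000), (2, 'Grace', 990000);

        -- The mapping: rename columns and convert units to fit the target schema.
        CREATE VIEW staff AS
            SELECT id AS staff_id,
                   name AS full_name,
                   salary_cents / 100.0 AS salary_dollars
            FROM emp;
        """
    )
    return conn


conn = build_demo()
rows = conn.execute(
    "SELECT staff_id, full_name, salary_dollars FROM staff ORDER BY staff_id"
).fetchall()
print(rows)
```

Because the mapping is an ordinary view, the DBMS query optimizer executes it like any other query, which is the efficiency and deployability argument the slide makes for the DB side.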

  10. Compare typical AI vs. DB approaches (3)
     DB
     • Relationships among sets, via {informal or formal logic assertions} or query language
     • More powerful (data exchange logicians)
     • Terminology: TGD (tuple-generating dependency)
     • View definitions are big: hard to edit (and to reuse)
     AI
     • Relationships among concepts: “Usable_for”
     • Relationships use a formalism very similar to ontologies
     • IS-A is “native”, i.e., part of the regular model
     • IS-A logically merges the ontologies
     • OWL is insufficient, rule languages overkill
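For readers unfamiliar with the TGD terminology on slide 10, the usual textbook shape of a tuple-generating dependency is sketched below. This is the common definition from the data exchange literature, not a formula from the talk. Here $\varphi$ and $\psi$ stand for conjunctions of atoms over the source and target schemas, and the relation names `Emp` and `Staff` in the instance are hypothetical.

```latex
% General form: every match of \varphi in the source forces a matching \psi in the target.
\forall \bar{x}\, \bigl( \varphi(\bar{x}) \rightarrow \exists \bar{y}\, \psi(\bar{x}, \bar{y}) \bigr)

% Hypothetical instance: each source employee must appear as a target staff row
% with some (unspecified) department d.
\forall i\, \forall n\, \bigl( \mathit{Emp}(i, n) \rightarrow \exists d\, \mathit{Staff}(i, n, d) \bigr)
```

The existential variable is what separates a TGD from a plain view definition: the target may hold values, such as $d$ here, that the source does not determine.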

  11. Compare typical AI vs. DB approaches (4)
     DB
     • Exchange semantics examined from user viewpoint, precise
     • Hard to learn or communicate
     • Discards tuples unnecessarily?
     AI
     • Exchange semantics: whatever my engine infers!!!
     • Is this tolerable? Why (not)?

  12. How can they combine
     • Formalism:
       • OWL ontologies
       • Need a standard construct for “can be used for”: super-property ≈ tuple-generating dependencies
     • Direct or via neutral model:
       • Mix and match, share info and infer over both
     • Execution environment: DBMS
       • Parallel, query optimization, deployment
       • Already bilingual (SQL, XML), add RDF when it reaches critical mass ($Bs)

  13. Why “Alignment” research is hard to transfer
     • Conspicuous lack of widely-used products, from either community
     • Aligners/matchers automate some work of an integration engineer, but can’t 90+% solve a major “customer” problem
     • Without a robust mediator, there ain’t no customer!
     • Lesson: Touch the end users, downstream (someone outside the IT dept)
       • 95% reduction in their work as schemas evolve
       • Generate code for end users 80% faster

  14. Summary
     • Two communities addressing similar problems
     • More standards, cleaner formalisms on the Sem-Web side
     • More pragmatics and richer suites on the DB side
       • Largely formalism independent, could be imported, esp. “Instant gratification”
