210 likes | 349 Views
CS690L Ontologies Interoperability (Integration, Mapping, Query). Yugi Lee STB #555 (816) 235-5932 leeyu@umkc.edu www.sice.umkc.edu/~leeyu. Semantic Web Fabric. Bootstrapping, Creation and Maintenance of Semantic Knowledge Collaborative and Sociological Processes, Statistical Techniques
E N D
CS690LOntologies Interoperability (Integration, Mapping, Query) Yugi Lee STB #555 (816) 235-5932 leeyu@umkc.edu www.sice.umkc.edu/~leeyu CS690L - Lecture 4
Semantic Web Fabric • Bootstrapping, Creation and Maintenance of Semantic Knowledge • Collaborative and Sociological Processes, Statistical Techniques • Ontology Building, Maintenance and Versioning Tools • Re-use of Existing Semantic Knowledge (Ontologies) • Annotation/Association/Extraction of Knowledge with/from Underlying Data • Information Retrieval and Analysis (Distributed Querying, Search, Inference Middleware) • Semantic Discovery and Composition of Services • Distributed Computing/Communication Infrastructures • Component based technologies, Agent based systems, Web Services Repositories for managing data and semantic knowledge • Relational Databases, Content Management Systems, Knowledge Base Systems [V. Kashyap, 2002] CS690L - Lecture 4
What DB researchers have done ? • Data Mining • Probabilistic Databases • Workflow-based Coordination Systems • Security in Database Systems • Multimedia Databases • Text and Information Retrieval Systems • Image Databases • Semantic Data Models • Multi-database Schema Heterogeneity • Multi-database/Federated Database Schema Integration • Schema Evolution • Object Oriented/XML/Deductive Databases/Rule Based Systems • Mediators and Wrappers • Multidatabase/Federated Database Query Processing • DB Research is well positioned to contribute to the Semantic Web, but: • there has been little interest in issues related to Semantics in the DB community • the Semantic Web can be the underlying theme that ties in all the disparate pieces of work [V. Kashyap, 2002] CS690L - Lecture 4
What are the missing gaps ? • Ontology Integration/Interoperation • Problem is different from Schema Integration • Need to address “semantics” of relationships such as “synonyms”, “hyponyms”, etc. • Ontology Impedance/Mismatch • Relax the requirements of consistency and completeness • Should be able to characterize the “information error/loss” that occurs.. • Dynamic Ontologies • Need to relax the assumption of the “staticness” of database schemas Inferences based on Semantics of the Data • Has been relatively ignored by the DB community [V. Kashyap, 2002] CS690L - Lecture 4
What are the missing gaps ? • Semantics of Multimedia Data • Need to focus more on non-traditional data such as text, images, etc. • Need to focus on “annotation mechanisms” as an addition to wrappers/mediators • Semantics of Processes/Plans/Workflows • Performance/Scalability • A traditional strong point of DB research • The next wave of research (esp. in the context of the Semantic Web) will focus on re-use of pre-existing data models/schemas/ontologies that describes the content of information sources… [V. Kashyap, 2002] CS690L - Lecture 4
Inter-ontological relationships • Synonyms • leads to semantics preserving translations • Hyponyms/Hypernyms • lead to semantics altering translations • typically results in loss of recall and precision • List of Hyponyms • technical-manualhyponym manual • bookhyponym book • proceedings hyponym book • thesishyponym book • misc-publicationhyponym book • technical-reports hyponym book • presshyponym periodical-publication • periodicalhyponym periodical-publication [V. Kashyap, 2002] CS690L - Lecture 4
CS690L - Lecture 4 [V. Kashyap, 2002]
CS690L - Lecture 4 [V. Kashyap, 2002]
[V. Kashyap, 2002] CS690L - Lecture 4
Role of Ontologies • Content explication Ontologies are used for the explicit description of the information source Approaches: • Single ontology • Multiple ontology • Hybrid ontology • Query model • Verification (query containment) [H. Wache, 2002] CS690L - Lecture 4
Single Ontology Approach • SIMS • One global ontology • Hierarchical terminological database • Combination of several specialized ontolgies (for modularization) • Can be used when all information sources to be integrated provide nearly the same view on a domain • Minimal ontology commitment • Susceptible to changes in the information sources [H. Wache, 2002] CS690L - Lecture 4
Multiple Ontologies • OBSERVER • Each information source is described by its own ontology (source ontology) • No shared vocabulary • No common and minimal ontology commitment is needed • Simplifies integration and supports changes in sources • Difficult to compare different source ontologies • Inter-ontology mapping is needed [H. Wache, 2002] CS690L - Lecture 4
Multiple Ontologies • COIN • Semantics of each source is described by its own ontology • Built from a a global shared vocabulary • Shared vocabulary contains basic terms of a domain • New sources can easily be added • Supports acquisition and evolution of ontologies • Source ontologies are comparable because of shared vocabulary • Existing ontologies can not easily be reused, but have to be redeveloped from scratch [H. Wache, 2002] CS690L - Lecture 4
Query Model • Integrated global view • Global query schema • User formulates query in terms of the ontology • System reformulates queries in terms of sub-queries for each source • Structure of the query model should be more intuitive for the user [H. Wache, 2002] CS690L - Lecture 4
Mappings Connecting to Information Sources • Relate the ontologies to the actual content of an information source • Approaches • Structure resemblance Produce a one-to-one copy of the structure of the database and encode it in a language that makes automated reasoning possible • Definition of terms Use ontology to define terms from the database or the database scheme • Structure enrichment (most common) A logical model is built that resembles the structure of the information source and contains additional definitions and concepts Can be done using DLs • Meta-annotation Add semantic information to an information source ontobroker, SHOE [H. Wache, 2002] CS690L - Lecture 4
Inter-Ontological Mapping Defined Mappings (KRAFT) • special customized mediator agents • Great flexibility • Fails to ensure a preservation of semantics - no verification Lexical Relations (OBSERVER) • Extend a common DL model by quantified inter-ontology relationships • Synonym, hypernym, overlap, covering, disjoint • Do not have formal semantics [H. Wache, 2002] CS690L - Lecture 4
Inter-Ontological Mapping Top-level grounding (DWQ) • Relate all ontolgies used to a single top-level ontology • Inheriting concepts from a common top-level ontology • Can resolve conflicts and ambiguities Semantic correspondences • Rely on a common vocabulary • Uses semantic labels in order to compute correspondences • Subsumption reasoning can be used to establish relations between different terminolgies [H. Wache, 2002] CS690L - Lecture 4
Conclusions • Data Models/Schemas/Ontologies will form the critical infrastructure for the Semantic Web • Re-use of pre-existing data models/schemas/ontologies is crucial in describing the semantics of various information sources • There is a need to relax consistency and completeness requirements and estimate the “error” in the results returned. • Semantics of information should be used to minimize “error” in the information obtained • The new environment is likely to be more “dynamic” in nature – schemas, workflows, queries, etc. can no longer be assumed to be static… • DB research is well positioned to participate in the Semantic Web if it “adapts” to these new requirements…. CS690L - Lecture 4
References • Vipul Kashyap, The Semantic Web:Has the DB Community Missed the Bus (again ?) NSF Workshop on DB & IS Research for Semantic Web and Enterprises, April 3, 2002 • H.Wache, T.Vogele, U.Visser, H.Stuckenschmidt, G.Schuster, H.Neumann and S.Hubner, Ontology-Based Integration of Information: A Survey of Existing Approaches CS690L - Lecture 4