Block Matching for Ontologies
Wei Hu and Yuzhong Qu
School of Computer Science and Engineering, Southeast University, P.R. China
XObjects Group - Southeast University

Outline
• Introduction
• Overview of the Approach
• Relatedness among Domain Entities
• Partitioning for Block Matching
• Evaluation
• Related Work
• Concluding Remarks

Introduction
• Ontology matching
  • Enabling interoperability among different but related ontologies
  • In practice, establishing mappings between domain entities
• Block matching
  • The usual relationship cardinality of mappings is 1:1.
  • However, mappings between sets of domain entities are more pervasive.
  • A block is a set of domain entities.
  • A block mapping is a pair of matched blocks from different ontologies.
  • Block matching is the process of discovering block mappings.

Introduction – Examples
• From a microscopic point of view
  • Given two ontologies O1 and O2, suppose O1 contains three domain entities Month, Day, and Year, while O2 contains a single domain entity Date. It is more natural to match the block {Month, Day, Year} in O1 with the block {Date} in O2.
• From a macroscopic point of view
  • Block matching provides a general picture at a higher level to explore the correspondences between the main topics of ontologies.

Introduction (Cont’d.)
• The block matching problem is a special partitioning problem.
  • All the block mappings together form a partitioning of all the domain entities from the two given ontologies.
  • The partitioning quality – cohesiveness & coupling
  • In addition, the mapping quality is inherently difficult to guarantee.
• At present, most of the algorithms proposed in the literature aim to find 1:1 mappings.
  • One exception – PBM
    • It only copes with mappings between classes, so it is not a general solution.
    • Its mapping quality is not good enough for complicated ontologies.
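
The partitioning view above can be made concrete with a small sketch. It assumes a block is represented as a frozen set of entity names and a block mapping as a pair of blocks; the type names and the function `is_partitioning` are illustrative and do not come from the slides. The data reuses the Month/Day/Year and Date example.

```python
from typing import FrozenSet, List, Set, Tuple

# Assumed representation: a block is a set of domain entities, and a
# block mapping pairs a block of one ontology with a block of the other.
Block = FrozenSet[str]
BlockMapping = Tuple[Block, Block]


def is_partitioning(mappings: List[BlockMapping],
                    o1: Set[str], o2: Set[str]) -> bool:
    """Return True if every entity of o1 and o2 lies in exactly one block mapping."""
    seen1 = [e for b1, _ in mappings for e in b1]
    seen2 = [e for _, b2 in mappings for e in b2]
    # No entity may appear in two different block mappings.
    no_overlap = (len(seen1) == len(set(seen1))
                  and len(seen2) == len(set(seen2)))
    # Every entity of both ontologies must be covered.
    return no_overlap and set(seen1) == o1 and set(seen2) == o2


o1 = {"Month", "Day", "Year"}
o2 = {"Date"}
mappings = [(frozenset({"Month", "Day", "Year"}), frozenset({"Date"}))]
covers_everything = is_partitioning(mappings, o1, o2)
```

A set of block mappings that leaves an entity uncovered, or places it in two blocks, fails this check.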

Introduction – Our Approach
• We therefore propose a new partitioning-based approach to address the block matching problem.
• The relatedness measure – Virtual Documents
  • Novelty – both the mapping quality & the partitioning quality can be guaranteed simultaneously.
• The partitioning algorithm – A Hierarchical Bisection Algorithm
  • Novelty – providing block mappings at different levels of granularity.
  • Flat partitioning – extracting the optimal mappings with a given number of block mappings.

Overview of the Approach
• Our approach takes two ontologies as input and, after four processing stages, outputs block mappings.
  • Constructing virtual documents for domain entities
  • Computing relatedness among domain entities
  • Partitioning by a hierarchical bisection algorithm
  • Extracting the optimal block mappings

A Toy Example
[Figure: the two example ontologies, Onto1 and Onto2]

Step 1 – Construction of Virtual Documents
• A virtual document is a collection of weighted tokens that reflects the intended meaning of a domain entity.
• The virtual document of a domain entity contains not only its local description but also neighboring information.
  • Local description – for a literal node / a URIref / a blank node
  • Neighboring information – subject / predicate / object neighbors
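
A minimal sketch of the idea, under stated assumptions: tokens from the local description get weight 1.0 and tokens from each neighbor's description get a smaller weight. The function name `virtual_document`, the `neighbor_weight` value of 0.5, and the dictionary inputs are hypothetical; the slides do not give the actual weighting scheme. The two entities come from the toy example, with the neighbor lists reduced to one neighbor each.

```python
from collections import Counter
from typing import Dict, List


def virtual_document(entity: str,
                     local_desc: Dict[str, List[str]],
                     neighbors: Dict[str, List[str]],
                     neighbor_weight: float = 0.5) -> Dict[str, float]:
    """Build a weighted bag of tokens for one domain entity."""
    vd: Counter = Counter()
    # Local description: the entity's own tokens, weight 1.0 (assumed).
    for token in local_desc.get(entity, []):
        vd[token] += 1.0
    # Neighboring information: tokens of neighbors, down-weighted (assumed).
    for other in neighbors.get(entity, []):
        for token in local_desc.get(other, []):
            vd[token] += neighbor_weight
    return dict(vd)


local_desc = {"onto1:Report": ["report"], "onto1:Reference": ["reference"]}
neighbors = {"onto1:Report": ["onto1:Reference"],
             "onto1:Reference": ["onto1:Report"]}
vd_report = virtual_document("onto1:Report", local_desc, neighbors)
vd_reference = virtual_document("onto1:Reference", local_desc, neighbors)
```

With these inputs both virtual documents contain the tokens "report" and "reference", only with different weights, which is what lets a vector-based similarity relate the two entities.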

Step 2 – Computation of Relatedness
• The similarity between two virtual documents is measured by the cosine of the angle between their vectors in the Vector Space Model.
• A relatedness matrix is generated by computing the similarity among virtual documents within each of the two ontologies as well as across the two ontologies.
  • Both linguistic and structural relatedness within each of the two ontologies are reflected in W11 and W22.
  • Linguistic relatedness across the ontologies is characterized by W12.
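
The computation can be sketched as follows, assuming each virtual document is a dictionary from token to weight. The function names and the sample documents are illustrative. The matrix is built over the entities of the first ontology followed by those of the second, so its top-left block plays the role of W11, its top-right block W12, and its bottom-right block W22.

```python
import math
from typing import Dict, List


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse token-weight vectors."""
    dot = sum(w * b.get(token, 0.0) for token, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relatedness_matrix(docs1: List[Dict[str, float]],
                       docs2: List[Dict[str, float]]) -> List[List[float]]:
    """Pairwise cosine similarity over the documents of both ontologies."""
    docs = docs1 + docs2
    return [[cosine(x, y) for y in docs] for x in docs]


# Hypothetical virtual documents: two entities in onto1, one in onto2.
docs1 = [{"report": 1.0, "reference": 0.5},
         {"reference": 1.0, "report": 0.5}]
docs2 = [{"entry": 1.0, "book": 0.5}]
W = relatedness_matrix(docs1, docs2)
W11 = [row[:2] for row in W[:2]]   # within onto1
W12 = [row[2:] for row in W[:2]]   # across the two ontologies
W22 = [row[2:] for row in W[2:]]   # within onto2
```

Because cosine similarity is symmetric, the resulting matrix is symmetric as well, so the bottom-left block is just the transpose of W12.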

Illustration by the Toy Example
• VD(onto1:Report)
  • Local Description = “report”
  • Des(onto1:Reference) = “reference”
• VD(onto1:Reference)
  • Local Description = “reference”
  • Des(onto1:Report) = “report”, Des(onto1:Book), Des(onto1:hasInstitution)
• VD(onto2:Entry)
  • Local Description = “entry”
  • Des(onto2:Article), Des(onto2:Book), Des(onto2:hasInstitution)
The relatedness between onto1:Report and onto1:Reference is revealed through the shared words (“report” & “reference”) that each obtains from its neighbors in the Vector Space Model. The relatedness between onto1:Reference and onto2:Entry is revealed by the shared words “book” and “institution”.

Step 3 – The Hierarchical Bisection Algorithm
• The min-max cut (Mcut) function is adopted as the criterion function.
• Why a hierarchical algorithm?
  • It is easy to depict the partitioning for a given domain.
  • There may be several correct answers.
• Overview of our partitioning algorithm
  • Input: a relatedness matrix W
  • It recursively bisects a matrix into two submatrices by finding the minimum Mcut.
  • Output: a dendrogram consisting of layers of block mappings.
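
The recursion can be illustrated with a small sketch. It uses the usual min-max cut definition, Mcut(A, B) = cut(A, B) / W(A) + cut(A, B) / W(B), where cut(A, B) sums the weights between the two sides and W(A) sums the weights inside A (diagonal included here, which is an assumption). The slides do not say how the minimum is searched for, so this sketch simply tries every bisection, which is only practical for a handful of entities. All function names and the sample matrix are hypothetical.

```python
from itertools import combinations
from typing import Any, List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def _weight(W: Matrix, rows: Sequence[int], cols: Sequence[int]) -> float:
    """Sum of the entries of W restricted to the given rows and columns."""
    return sum(W[i][j] for i in rows for j in cols)


def mcut(W: Matrix, a: Sequence[int], b: Sequence[int]) -> float:
    """Min-max cut value of the bisection (a, b)."""
    cut = _weight(W, a, b)
    wa, wb = _weight(W, a, a), _weight(W, b, b)
    if wa == 0.0 or wb == 0.0:
        return float("inf")
    return cut / wa + cut / wb


def best_bisection(W: Matrix,
                   items: List[int]) -> Tuple[float, List[int], List[int]]:
    """Exhaustively find the bisection of items with minimum Mcut."""
    first, rest = items[0], items[1:]
    best = None
    # Keep `first` on side a so mirrored bisections are not tried twice;
    # k < len(rest) guarantees that side b is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            a = [first, *extra]
            b = [i for i in rest if i not in extra]
            score = mcut(W, a, b)
            if best is None or score < best[0]:
                best = (score, a, b)
    return best


def bisect_hierarchy(W: Matrix, items: List[int]) -> Any:
    """Recursively bisect; the dendrogram is returned as nested pairs."""
    if len(items) == 1:
        return items[0]
    _, a, b = best_bisection(W, items)
    return (bisect_hierarchy(W, a), bisect_hierarchy(W, b))


# Hypothetical relatedness matrix with two tightly related pairs.
W = [[1.0, 0.9, 0.1, 0.0],
     [0.9, 1.0, 0.0, 0.1],
     [0.1, 0.0, 1.0, 0.8],
     [0.0, 0.1, 0.8, 1.0]]
dendrogram = bisect_hierarchy(W, [0, 1, 2, 3])
```

Each internal node of the nested result corresponds to one bisection, so reading the structure from the outside in gives coarser to finer groupings.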

Step 4 – Extraction of the Optimal Block Mappings
• Obtaining a flat partitioning with a given number of block mappings p, where g is the objective function
[Formula: shown as an image on the original slide]

Illustration by the Toy Example (Cont’d.)
• The dendrogram for onto1 & onto2 is shown as follows.
• If extracting 3 block mappings, then the selected ones are those marked with √.
[Figure: dendrogram for onto1 & onto2 with three block mappings marked √]

Illustration by the Toy Example (Cont’d.)

Evaluation – Experimental Methodology
• We implemented our approach in Java as a tool called BMO.
• BMO focuses on the domain entities at the conceptual level.
• We evaluate the performance of BMO in three experiments:
  • The mapping quality of BMO
  • The partitioning quality of BMO
  • A comparison of BMO with PBM, for both the mapping quality and the partitioning quality

Evaluation – Case Study
• Two pairs of ontologies – Russia12 and TourismAB
• Russia12
  • Russia1 – 151 classes & 76 properties
  • Russia2 – 162 classes & 81 properties
  • 85 reference alignments (1:1)
• TourismAB
  • TourismA – 340 classes & 97 properties
  • TourismB – 474 classes & 100 properties
  • 226 reference alignments (1:1)

Evaluation – Evaluation Metrics
• The mapping quality – observing how the correctness varies with the number of block mappings.
  • Rationale – the higher the quality of the block mappings, the more reference alignments can be found in them.
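
One plausible reading of this rationale, offered only as an assumption since the exact formula is not given in the transcript, is to count a 1:1 reference alignment as found when both of its entities fall inside the same block mapping, and to report the fraction found. The function name `correctness` and the entity names are hypothetical.

```python
from typing import List, Set, Tuple


def correctness(mappings: List[Tuple[Set[str], Set[str]]],
                reference: List[Tuple[str, str]]) -> float:
    """Fraction of 1:1 reference alignments contained in some block mapping."""
    if not reference:
        return 0.0
    found = sum(
        1 for e1, e2 in reference
        if any(e1 in b1 and e2 in b2 for b1, b2 in mappings)
    )
    return found / len(reference)


# Hypothetical block mappings and reference alignments.
mappings = [({"A1", "A2"}, {"B1"}), ({"A3"}, {"B2"})]
reference = [("A1", "B1"), ("A3", "B1")]
score = correctness(mappings, reference)
```

Under this reading, a single block mapping that contains every entity trivially scores 1.0, which is why the score is observed as the number of block mappings varies.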

Evaluation – Evaluation Metrics (Cont’d.)
• The partitioning quality – comparing the block mappings computed by BMO with manual ones set up by volunteers.
  • The f-measure is defined as a combination of precision and recall.
  • The entropy considers the distribution of the domain entities in block mappings and reflects the overall partitioning quality.
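
The slides do not spell out either formula, so the sketch below falls back on definitions commonly used for clustering evaluation and should be read as an assumption: the f-measure as the balanced harmonic mean of precision and recall between one computed block and one manual block, and the entropy as the size-weighted average, over computed blocks, of the entropy of each block's entities across the manual blocks (unnormalized, in bits). Function names and sample data are hypothetical.

```python
import math
from typing import List, Set


def f_measure(computed: Set[str], manual: Set[str]) -> float:
    """Harmonic mean of precision and recall of a computed block against a manual one."""
    overlap = len(computed & manual)
    if overlap == 0:
        return 0.0
    precision = overlap / len(computed)
    recall = overlap / len(manual)
    return 2 * precision * recall / (precision + recall)


def entropy(computed: List[Set[str]], manual: List[Set[str]]) -> float:
    """Size-weighted entropy of computed blocks over the manual blocks (0 is best)."""
    total = sum(len(c) for c in computed)
    if total == 0:
        return 0.0
    result = 0.0
    for c in computed:
        if not c:
            continue
        h = 0.0
        for m in manual:
            frac = len(c & m) / len(c)
            if frac > 0:
                h -= frac * math.log2(frac)
        result += len(c) / total * h
    return result


manual = [{"a", "b"}, {"c", "d"}]
f = f_measure({"a", "b", "c"}, {"a", "b"})
h_same = entropy([{"a", "b"}, {"c", "d"}], manual)
h_merged = entropy([{"a", "b", "c", "d"}], manual)
```

A computed partitioning identical to the manual one has entropy 0, and merging the two manual blocks into a single computed block raises it.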

Evaluation – Experimental Results
• The correctness with the variation of the number of block mappings n
• The partitioning quality of BMO

Evaluation – Experimental Results (Cont’d.)
• The comparison between BMO and PBM
  • The partitioning quality of the two approaches is almost the same.
  • However, the mapping quality of BMO is much better than that of PBM.

Related Work
• Ontology matching
  • Very few approaches raise the issue of block matching.
    • PBM – only for class hierarchies & the mapping quality isn’t good enough
  • In the field of schema matching
    • iMap – complex mapping, hard to specify the domain knowledge in some cases
    • Artemis – similar to our framework, but the partitioning quality isn’t so good
• Ontology partitioning
  • Many existing works only provide a flat partitioning on a single ontology.
  • Our work is hierarchical & partitions two ontologies simultaneously.

Concluding Remarks
• We discussed the block matching problem and suggested that both the mapping quality and the partitioning quality should be considered in block matching.
• We proposed a relatedness measure based on virtual documents that simultaneously captures both linguistic and structural characteristics of domain entities.
• We presented a hierarchical bisection algorithm to provide block mappings at different levels of granularity. We also described a method to automatically extract the optimal block mappings.
• We set up two kinds of metrics to evaluate the quality of block matching. The experimental results demonstrated that our approach is feasible.

Concluding Remarks – Future Work
• We would like to find other possible approaches to block matching, and compare them with each other.
• We look forward to setting up systematic test cases for block matching.
• We plan to address the block matching issue for very large ontologies.

Thanks for your attention! Any comments are welcome!