RiMOM: A Dynamic Multistrategy Ontology Alignment Framework. By: Juanzi Li, Jie Tang, Yi Li and Qiong Luo. Presenter: Abhijit Gali.
RiMOM • A systematic approach that quantitatively estimates the similarity characteristics of each alignment task • A strategy selection method that automatically combines the matching strategies based on two estimated factors (the label and structure similarity factors)
Problems faced by ontology alignment • Combination of different strategies for ontology alignment • When to use the combination strategies
Ontology • Definition 1: An ontology is a formal specification of a shared conceptualization. We describe the ontology as a 6-tuple: O = {C, P, Hc, Hp, Ao, I} • OWL provides vocabularies to define the formal semantics of an ontology • owl:Class and rdfs:subClassOf define concepts and subconcepts • rdf:Property and rdfs:subPropertyOf define properties and subproperties • rdfs:domain and rdfs:range of a property define which concepts can have the property and which instances of the concepts can be values of the property
Entity descriptions • Concept description, Description(c): A concept c ∈ C is described by a 4-tuple: Description(c) = {Meta(c), Hier(c), Rest(c), Inst(c)} • Property description, Description(p): A property p ∈ P is described by a 5-tuple: Description(p) = {Meta(p), Hier(p), Doma(p), Rang(p), Inst(p)}
Ontology alignment • Definition: Given two ontologies O1 and O2, an alignment (or alignment task) finds, for each entity in O1, a corresponding entity in O2. O1 is called the source ontology and O2 the target ontology. Align(O1, O2) = {(e_i1, e_i2, con_i, relation_i) | e_i1 ∈ O1, e_i2 ∈ O2, con_i ∈ [0,1], relation_i ∈ {exact, narrower, broader, overlap}}
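One possible way to hold the tuples of Align(O1, O2) in code is sketched below. The class name, field names, and example entity names are illustrative assumptions, not part of RiMOM; only the value ranges come from the definition above.

```python
from dataclasses import dataclass

# The four relation types listed in the alignment definition.
RELATIONS = {"exact", "narrower", "broader", "overlap"}


@dataclass(frozen=True)
class Correspondence:
    """One tuple (e_i1, e_i2, con_i, relation_i) of an alignment (hypothetical type)."""
    source: str        # entity from the source ontology O1
    target: str        # entity from the target ontology O2
    confidence: float  # con_i, must lie in [0, 1]
    relation: str = "exact"

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation!r}")


# Made-up example entities, for illustration only.
alignment = [
    Correspondence("O1:Author", "O2:Writer", 0.92),
    Correspondence("O1:Paper", "O2:Publication", 0.75, "narrower"),
]
```

Validating the confidence and relation at construction time keeps malformed tuples out of later steps such as alignment refinement.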
Dynamic Multistrategy Ontology Alignment • Goal: to determine which strategies to select for a given alignment task and how confident we should be about each strategy • Tasks: a) Definition of criteria for strategy selection b) Dynamic selection of multiple strategies
Similarity Factors between Two Ontologies • Label similarity factor: similarity between two ontologies based on the entities' names F_LS(O1, O2) = (#iden_conc_label + #iden_prop_label) / max(|C1|+|P1|, |C2|+|P2|) • Structure similarity factor: similarity of two ontologies based on their structure information F_SS(O1, O2) = (#comm_nonl_conc + #comm_nonl_prop) / max(#nonl_C1 + #nonl_P1, #nonl_C2 + #nonl_P2)
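The two factors are simple ratios, so a short sketch may help. It assumes that "identical labels" means exact string matches and that labels are unique within each ontology; the slide does not define how common non-leaf entities are counted, so the structure factor here takes those counts as plain inputs. All function and parameter names are hypothetical.

```python
def label_similarity_factor(concept_labels_1, property_labels_1,
                            concept_labels_2, property_labels_2):
    """F_LS: identical concept and property labels over the larger ontology size."""
    # Assumption: a label pair is "identical" when the strings match exactly.
    iden_conc = len(set(concept_labels_1) & set(concept_labels_2))
    iden_prop = len(set(property_labels_1) & set(property_labels_2))
    denom = max(len(concept_labels_1) + len(property_labels_1),
                len(concept_labels_2) + len(property_labels_2))
    return (iden_conc + iden_prop) / denom if denom else 0.0


def structure_similarity_factor(comm_nonl_conc, comm_nonl_prop,
                                nonl_c1, nonl_p1, nonl_c2, nonl_p2):
    """F_SS: common non-leaf entities over the larger non-leaf entity count."""
    denom = max(nonl_c1 + nonl_p1, nonl_c2 + nonl_p2)
    return (comm_nonl_conc + comm_nonl_prop) / denom if denom else 0.0


# Made-up toy ontologies.
f_ls = label_similarity_factor(
    ["Person", "Paper", "Author"], ["writes", "hasTitle"],
    ["Person", "Article", "Author", "Reviewer"], ["writes", "title"],
)
f_ss = structure_similarity_factor(2, 1, 3, 1, 4, 2)
print(f_ls, f_ss)
```

In the toy data, three labels match exactly and the larger ontology has six entities, so F_LS is 3/6; F_SS is likewise 3/6 from the supplied counts.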
Entity similarity • For two concepts : sim(e1,e2)= f( sim_Meta(e1,e2), sim_Hier(e1,e2), sim_Rest(e1,e2), sim_Inst(e1,e2) ) • For two properties: sim(e1,e2)= f (sim_Meta(e1,e2), sim_Hier(e1,e2), sim_Doma(e1,e2), sim_Rang(e1,e2),sim_Inst(e1,e2))
Overview of RiMOM • Preprocessing • Linguistic-based ontology alignment • Similarity combination • Similarity propagation • Alignment generation and refinement
Linguistic-Based Strategies • Edit-Distance-Based Strategy: computes sim_Name(w1, w2) between words and sim_Name(e1, e2) between entities • Vector-Distance (VD)-Based Strategy
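The slide does not give the formula for sim_Name(w1, w2), so the sketch below makes an assumption: it uses the Levenshtein edit distance normalized by the length of the longer word. The function names are hypothetical.

```python
def edit_distance(w1, w2):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(w2) + 1))
    for i, ch1 in enumerate(w1, start=1):
        curr = [i]
        for j, ch2 in enumerate(w2, start=1):
            cost = 0 if ch1 == ch2 else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sim_name(w1, w2):
    """Assumed normalization: 1 - distance / length of the longer word."""
    longest = max(len(w1), len(w2))
    if longest == 0:
        return 1.0  # two empty words are treated as identical
    return 1.0 - edit_distance(w1, w2) / longest


print(sim_name("kitten", "sitting"))
```

With this normalization the score stays in [0, 1], which matches the range expected of the other similarity values in the deck.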
Structure-Based Strategies • Pairwise Connectivity Graph (PCG) construction and similarity propagation • A Directed Labeled Graph (DLG) has edges represented by triples (s, p, o) • Construction of the DLG of each ontology using the HasSubConcept, HasSibling, HasProperty, HasRange, and HasSubProperty relations • Construction of the PCG, whose nodes are entity pairs from the two ontologies that have some structural relationship in common
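The PCG construction can be sketched as follows, assuming the usual pairwise connectivity rule from similarity flooding: an edge ((s1, s2), p, (o1, o2)) is added whenever (s1, p, o1) is an edge of the first DLG and (s2, p, o2) is an edge of the second. The function name and the example triples are made up for illustration.

```python
def build_pcg(dlg1, dlg2):
    """Build PCG edges from two DLGs given as collections of (s, p, o) triples."""
    pcg = set()
    for s1, p1, o1 in dlg1:
        for s2, p2, o2 in dlg2:
            # Pair two edges only when they carry the same relation label.
            if p1 == p2:
                pcg.add(((s1, s2), p1, (o1, o2)))
    return pcg


dlg_o1 = {
    ("Person", "HasSubConcept", "Author"),
    ("Author", "HasProperty", "writes"),
}
dlg_o2 = {
    ("Human", "HasSubConcept", "Writer"),
    ("Writer", "HasProperty", "authorOf"),
}

for edge in sorted(build_pcg(dlg_o1, dlg_o2)):
    print(edge)
```

Each PCG node is an entity pair such as ("Person", "Human"); similarity propagation then passes scores along these edges between neighboring pairs.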
Feature Selection in Vector-Distance-Based Strategy • Determination of Hierarchical Information Use: hierarchical information is used when F_SS > threshold ε1 • Enhancement of Structure Information: Depends on the path length from the root concept, the number of properties, and the number of subconcepts of the current entity
Weight Calculation of Similarity Combination • sim(e1, e2) = (w_name σ(sim_Name(e1, e2)) + w_vec σ(sim_Vec(e1, e2))) / (w_name + w_vec) • σ(x) = 1/(1 + exp(-5(x - α))), where α = 0.5 • w_name = F_LS / max(F_LS, F_SS) • w_vec = F_SS / max(F_LS, F_SS)
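The combination formula can be written out directly. This is a minimal sketch: the function names are hypothetical, and returning 0.0 when both factors are zero is an assumption, since the slide does not cover that case.

```python
import math


def sigmoid(x, alpha=0.5):
    """sigma(x) = 1 / (1 + exp(-5 (x - alpha))), with alpha = 0.5 by default."""
    return 1.0 / (1.0 + math.exp(-5.0 * (x - alpha)))


def combine(sim_name_value, sim_vec_value, f_ls, f_ss):
    """Weighted combination of the name-based and vector-based similarities."""
    top = max(f_ls, f_ss)
    if top == 0:
        return 0.0  # assumption: no usable factor means no evidence
    w_name = f_ls / top
    w_vec = f_ss / top
    return (w_name * sigmoid(sim_name_value)
            + w_vec * sigmoid(sim_vec_value)) / (w_name + w_vec)


# Labels are more reliable than structure here: F_LS = 0.8, F_SS = 0.4.
score = combine(0.9, 0.5, f_ls=0.8, f_ss=0.4)
print(round(score, 4))  # 0.7539
```

Because both weights are divided by max(F_LS, F_SS), the stronger factor always gets weight 1 and the weaker one a proportionally smaller weight.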
Selection of Similarity Propagation Strategy: • Concept-Concept(CC)- HasSubclass and HasConceptSibling relations • Concept-Property(CP)- HasRange and HasProperty relations • Property-Property(PP)- HasSubproperty and HasPropertySibling relations • Parameter Setting
Test Sets and Evaluation Methods • Benchmark Data Set in OAEI 2006: name, comments, specialization hierarchy, instances, properties, classes, additions of 4 real ontologies • Directory and Food Data Sets in OAEI 2006: i) SKOS version of the United Nations Food ii) SKOS version of the United States National Agricultural Library • Evaluation Metrics: i) Precision (P) ii) Recall (R)
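Precision and recall for an alignment can be computed as below, using their standard definitions: the share of found correspondences that are correct, and the share of reference correspondences that were found. The function name and the toy pairs are illustrative only.

```python
def precision_recall(found, reference):
    """Return (precision, recall) of found correspondences against a reference."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Made-up entity pairs.
found = {("a", "x"), ("b", "y"), ("c", "z")}
reference = {("a", "x"), ("b", "y"), ("d", "w"), ("e", "v")}

p, r = precision_recall(found, reference)
print(p, r)
```

Here two of the three found pairs are correct and two of the four reference pairs are recovered.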
[Figure: effects of F_SS. (a) F_SS in the VD-based strategy (b) F_SS in SF (c) Combined effects of F_SS]
Results on OAEI 2007 • [Figure: graphs of precision and recall. (a) OAEI 2006 (b) OAEI 2007]
Summary • High performance • Effectiveness of strategy selection • Contribution of the SF strategy • Inefficiency for dealing with large-scale ontologies
Related Work • Schema matching: COMA, Rondo, and Cupid are three composite methods • Ontology alignment and the combination of multiple ontology alignment strategies • Structure-based ontology alignment • Relationship with other alignment methods
CONCLUSION • A multistrategy framework, RiMOM, to automatically and dynamically compose strategies for individual ontology alignment tasks was proposed • Experimental results on the data sets from OAEI 2006 and OAEI 2007 demonstrate that the system performs better than most of the participants and is among the top three performers on the benchmark data sets