Ontology Matching and Schema Integration using Node Ranking. Authors: Asankhaya Sharma, Dr. D.V.L.N. Somayajulu. Affiliation: Department of Computer Science and Engineering, National Institute of Technology Warangal, Warangal, AP, India.
Ontology • An ontology is a specification of a conceptualization • e.g. concept hierarchies, classifications, schemas • An ontology can be represented as a graph-like structure • There are usually many ontologies for similar things, arising from different sources • One of the major problems is to match these ontologies • This problem is similar to the one encountered while integrating data from heterogeneous databases (schema integration) • Ontologies are the preferred way to describe information in the Semantic Web and in web services.
Ontology Matching • There are several existing ontology matchers, such as COMA++ and CROSI • COMA uses taxonomy-based matching, but it requires that taxonomies be standardized and include all possible hypernyms • CROSI has an ontology matching system which combines several matchers to give a better alignment • CROSI can incorporate other methods as well • Some form of lexical analysis is helpful in ontology matching, as specifications are usually lexically similar
Node Ranking • The node ranking algorithm works in two steps • We assume the ontology is represented as a graph • Each node is assigned a lexical similarity measure (called lex_sim) by comparing it lexically with the nodes of the ontology graph to be matched • We use a lexical similarity algorithm such as the longest common subsequence to determine lex_sim • Each node now has an attribute called lex_sim associated with it; we assert that lexical similarity is propagated down the hierarchy
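The first step can be sketched in Python as follows. This is only an illustration under stated assumptions: the slides name the longest common subsequence as one possible measure but do not say how it is normalized or how comparisons against several nodes are combined. Here the LCS length is divided by the length of the longer label, labels are compared case-insensitively, and a node's lex_sim is the best score over all labels in the other ontology. The function names `lcs_length` and `lex_sim` are hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lex_sim(label: str, other_labels: list[str]) -> float:
    """Best normalized LCS score of `label` against the other ontology's labels.

    Assumptions (not from the slides): case-insensitive comparison,
    normalization by the longer label, and taking the maximum score.
    """
    label = label.lower()
    best = 0.0
    for other in other_labels:
        other = other.lower()
        longest = max(len(label), len(other))
        if longest == 0:
            continue  # two empty labels carry no information
        best = max(best, lcs_length(label, other) / longest)
    return best


# Score the label "Author" against the labels of a second ontology.
score = lex_sim("Author", ["Writer", "Authors", "Title"])
```

Any other string similarity measure could be substituted for `lcs_length` without changing the rest of the procedure.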
Node Ranking • The node rank of a node A is calculated by the following formula: node_rank(A) = α*lex_sim(A) + β*lex_sim(parent(A)) + γ*lex_sim(grandparent(A)) • α, β, and γ are constants which determine the degree of propagation of the lexical similarity to descendant nodes • Node ranking is done such that it is a normal distribution of the lexical similarity of the ancestors of that node • Node ranking provides a mechanism of doing contextual analysis driving solely • After node ranking, the nodes are clustered based on their node ranks, and that determines the matching
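A minimal sketch of the node rank formula, assuming the hierarchy is stored as a child-to-parent dictionary and lex_sim values have already been computed. The weights 0.6, 0.3 and 0.1 are placeholder values chosen for illustration; the slides do not give values for α, β and γ. Treating a missing parent or grandparent as contributing zero is also an assumption. The name `node_rank` follows the formula; `parent` and `sims` are hypothetical data structures.

```python
def node_rank(node, parent, sims, alpha=0.6, beta=0.3, gamma=0.1):
    """node_rank(A) = alpha*lex_sim(A) + beta*lex_sim(parent(A))
                      + gamma*lex_sim(grandparent(A)).

    parent: dict mapping each node to its parent (roots are absent).
    sims:   dict mapping each node to its lex_sim value.
    Missing ancestors contribute 0.0 (an assumption, not from the slides).
    """
    p = parent.get(node)
    g = parent.get(p) if p is not None else None
    return (alpha * sims[node]
            + beta * sims.get(p, 0.0)
            + gamma * sims.get(g, 0.0))


# Toy hierarchy: Thing -> Agent -> Person
parent = {"Person": "Agent", "Agent": "Thing"}
sims = {"Thing": 0.5, "Agent": 1.0, "Person": 0.5}

ranks = {n: node_rank(n, parent, sims) for n in sims}
```

In this toy hierarchy "Person" and "Thing" have the same lex_sim, but "Person" receives a higher rank because its ancestors also match well. This is the contextual effect the formula is meant to capture.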
Results • Ontologies were taken from http://www.daml.org/ontologies • A match is considered successful when more than 50% of the nodes in the given ontologies are matched correctly.
Conclusions • The algorithm is not dependent on the method chosen for lexical analysis • The node ranking algorithm works as long as some initial information is available about the fields to be matched within the nodes (such as the lexical similarity) • Other methods can be used to calculate the initial rank, including bootstrapping the algorithm itself • This method is very fast, since it doesn't require taxonomy lookups like other methods • Node ranking can be used with other algorithms such as CROSI to give a better alignment
References • COMA++/COMA, http://dbs.unileipzig.de/Research/coma.html, University of Leipzig, Germany • CROSI, http://www.aktors.org/crosi/, University of Southampton/Hewlett Packard Labs, UK • MetaQuerier, http://metaquerier.cs.uiuc.edu/, University of Illinois at Urbana-Champaign, USA • http://www.ontologymatching.org, online resource for information related to ontology matching