This presentation discusses the use of a node ranking algorithm for ontology matching and schema integration, which provides a mechanism for contextual analysis and improves alignment. It covers the concepts of ontology and ontology matching, and then the node ranking algorithm itself. The algorithm does not depend on the method chosen for lexical analysis and can be combined with other matching algorithms for better alignment. In the results, a match is counted as successful when more than 50% of the nodes in the given ontologies are matched correctly.
Ontology Matching and Schema Integration using Node Ranking • Authors: Asankhaya Sharma, Dr. D.V.L.N. Somayajulu • Affiliation: Department of Computer Science and Engineering, National Institute of Technology Warangal, Warangal, AP, India
Ontology • An ontology is a specification of a conceptualization • e.g. concept hierarchies, classifications, schemas • An ontology can be represented as a graph-like structure • There are usually many ontologies for similar things, arising from different sources • One of the major problems is to match these ontologies • This problem is similar to the one encountered while integrating data from heterogeneous databases (schema integration) • Ontologies are the preferred way to describe information in the Semantic Web and in web services
Ontology Matching • There are several commercial ontology matchers, such as COMA++ and CROSI • COMA uses taxonomy-based matching, but it requires that taxonomies be standardized and include all possible hypernyms • CROSI has an ontology matching system which combines several matchers to give better alignment • CROSI can incorporate other methods as well • Some form of lexical analysis is helpful in ontology matching, as specifications are usually similar lexically
Node Ranking • The node ranking algorithm works in two steps • We assume the ontology is represented as a graph • Each node is assigned a lexical similarity measure (called lex_sim) by comparing it lexically with the nodes in the ontology graph to be matched • A lexical similarity algorithm such as longest common subsequence is used to determine lex_sim • Each node now has an attribute called lex_sim associated with it; the assertion is made that lexical similarity is propagated down the hierarchy
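The first step above, assigning lex_sim with a longest common subsequence measure, could be sketched roughly as follows. The slides do not say how the raw LCS length is turned into a score, so several choices here are assumptions made for illustration only: labels are lowercased, the LCS length is divided by the length of the longer label, and a node takes its best score over all labels in the other ontology. The function names `lcs_length` and `lex_sim` are hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lex_sim(label: str, other_labels: list[str]) -> float:
    """Best normalized LCS score of `label` against the other ontology's labels.

    Assumption (not from the slides): case is ignored, the score is
    LCS length / length of the longer label, and the maximum is kept.
    """
    a = label.lower()
    best = 0.0
    for other in other_labels:
        b = other.lower()
        longest = max(len(a), len(b))
        if longest:
            best = max(best, lcs_length(a, b) / longest)
    return best
```

With this normalization the score lies between 0 and 1, and two labels that differ only in case score 1.0. Any other lexical measure could be substituted here without changing the ranking step.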
Node Ranking • The node rank of a node A is calculated by the following formula: node_rank(A) = α*lex_sim(A) + β*lex_sim(parent(A)) + γ*lex_sim(grandparent(A)) • α, β, and γ are constants which determine the degree of propagation of the lexical similarity to descendant nodes • Node ranking is done such that it is a normal distribution of the lexical similarity of the ancestors of that node • Node ranking provides a mechanism for contextual analysis • After node ranking, the nodes are clustered based on their node ranks, and that clustering determines the matching
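As a rough illustration of the node rank formula, the sketch below computes the weighted sum for one node from a child-to-parent map. The slides do not give values for α, β, and γ, nor do they say what happens for nodes without a parent or grandparent, so the default weights (0.5, 0.3, 0.2) and the choice to let a missing ancestor contribute zero are assumptions. The clustering step is not shown because the slides do not specify a method for it.

```python
from typing import Dict, Optional


def node_rank(
    node: str,
    parent: Dict[str, Optional[str]],
    lex_sim: Dict[str, float],
    alpha: float = 0.5,  # assumed weight for the node itself
    beta: float = 0.3,   # assumed weight for the parent
    gamma: float = 0.2,  # assumed weight for the grandparent
) -> float:
    """alpha*lex_sim(A) + beta*lex_sim(parent(A)) + gamma*lex_sim(grandparent(A)).

    Assumption: a missing parent or grandparent contributes nothing.
    """
    p = parent.get(node)
    gp = parent.get(p) if p is not None else None
    rank = alpha * lex_sim[node]
    if p is not None:
        rank += beta * lex_sim[p]
    if gp is not None:
        rank += gamma * lex_sim[gp]
    return rank


# Hypothetical three-level hierarchy: Thing > Publication > Book
parent = {"Book": "Publication", "Publication": "Thing", "Thing": None}
scores = {"Book": 1.0, "Publication": 0.5, "Thing": 0.0}
book_rank = node_rank("Book", parent, scores)
```

In this example the rank of "Book" works out to 0.5*1.0 + 0.3*0.5 + 0.2*0.0, so a node whose ancestors also match well lexically ranks higher than one that matches in isolation.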
Results • Ontologies were taken from http://www.daml.org/ontologies • A match is considered successful when more than 50% of the nodes in the given ontologies are matched correctly.
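The success criterion can be written down as a small check. This is only a sketch: the function name `is_successful_match` is hypothetical, and it assumes the percentage is taken over the total number of nodes considered, which the slides do not spell out.

```python
def is_successful_match(correct_nodes: int, total_nodes: int) -> bool:
    """True when strictly more than 50% of the nodes are matched correctly.

    Assumption: the ratio is correct_nodes / total_nodes.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    return correct_nodes / total_nodes > 0.5
```

Note that exactly half of the nodes matched correctly does not count as a success under this reading of "more than 50%".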
Conclusions • The algorithm does not depend on the method chosen for lexical analysis • The node ranking algorithm works as long as some initial information (such as the lexical similarity) is available about the fields to be matched within the nodes • Other methods can be used to calculate the initial rank, including bootstrapping the algorithm itself • This method is very fast, since it does not require taxonomy lookups like other methods • Node ranking can be used with other algorithms like CROSI to give better alignment
References • COMA++/COMA, http://dbs.unileipzig.de/Research/coma.html, University of Leipzig, Germany • CROSI, http://www.aktors.org/crosi/, University of Southampton/Hewlett Packard Labs, UK • MetaQuerier, http://metaquerier.cs.uiuc.edu/, University of Illinois at Urbana-Champaign, USA • http://www.ontologymatching.org, online resource for information related to ontology matching