Federated Ontology Search • Vasco Calais Pedro, Eric Nyberg and Jaime Carbonell • Presenter: Pushkar Acharya
Overview • Introduction • Ontological Search • Ontology description and selection • Merging • Scoring • Results • Related Work • Challenges and Future Work • Conclusion
Introduction • Large number of open-domain ontologies available • Cyc, SUMO, Omega, ThoughtTreasure, Swoogle, etc. • Offer easily accessible, open-domain information • Challenges • Information merging and reuse • Different frameworks and languages
Introduction • SOLUTIONS?? • At the Ontology provider side – Absorb all knowledge into a single ontology beforehand • Establish Full mapping between concepts and relations • Absorb other ontologies
Introduction • At the Ontology provider side – Absorb all knowledge into a single ontology beforehand • Drawbacks – • Non-scalability • Losing autonomy of ontological knowledge • Language level mismatches • Ontology level mismatches • Updating mappings as Ontologies are updated
Introduction • At the Ontology provider side – Absorb all knowledge into a single ontology beforehand • At application developer side – Querying each ontology individually • Middleware • Query multiple ontologies and merge results • Form ontological chains and inferences
Introduction • At the Ontology provider side – Absorb all knowledge into a single ontology beforehand • At application developer side – Querying each ontology individually • Middleware • Only for small fragments of ontologies • On demand basis • Take advantage of redundant and complementary knowledge to improve performance • Parallelize query execution
Ontological Search • This approach succeeds only if the “search” is separated from both the information need and the ontology • Abstracts away the formal query representation required by each ontology • Defines three operators: • Rel(a, b, rels) • Parents(a) • Children(a) • By defining operators, we delegate their execution to the ontologies • Each ontology remains free to use its extended features
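The operator abstraction above can be sketched as a small adapter interface. This is a minimal illustration, not the authors' implementation: the names `OntologyAdapter`, `DictOntology` and `federated` are hypothetical, and the semantics assumed for Rel(a, b, rels) (return which of the requested relations hold between a and b) is a guess at the operator's intent.

```python
from abc import ABC, abstractmethod


class OntologyAdapter(ABC):
    """Hypothetical wrapper: each ontology executes the operators its own way."""

    @abstractmethod
    def rel(self, a, b, rels): ...

    @abstractmethod
    def parents(self, a): ...

    @abstractmethod
    def children(self, a): ...


class DictOntology(OntologyAdapter):
    """Toy in-memory backend storing (child, relation, parent) triples."""

    def __init__(self, triples):
        self.triples = set(triples)

    def rel(self, a, b, rels):
        # Assumed semantics: which of the requested relations hold from a to b.
        return {r for (x, r, y) in self.triples if x == a and y == b and r in rels}

    def parents(self, a):
        return {y for (x, r, y) in self.triples if x == a and r == "is-a"}

    def children(self, a):
        return {x for (x, r, y) in self.triples if y == a and r == "is-a"}


def federated(adapters, op, *args):
    """Run one operator on every ontology and keep the results per source."""
    return {name: getattr(adapter, op)(*args) for name, adapter in adapters.items()}
```

The caller only names an operator and its arguments; how each backend answers is left to the adapter, which is the separation the slide asks for.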
Ontological Search • Constraint: the output of query execution must be a rooted directed acyclic graph
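A result can be checked against this constraint with a short test. The sketch below assumes "rooted" means exactly one node with no incoming edges, from which every other node is reachable; the function name `is_rooted_dag` and the node/edge-list representation are illustrative choices, not part of the original work.

```python
from collections import deque


def is_rooted_dag(nodes, edges):
    """Return True if the graph is acyclic and has exactly one root.

    In a finite acyclic graph every node descends from some node with
    in-degree 0, so a single such node reaches all the others.
    """
    succ = {n: [] for n in nodes}
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    roots = [n for n in nodes if indeg[n] == 0]
    if len(roots) != 1:
        return False

    # Kahn's algorithm: every node gets removed only if there is no cycle.
    queue = deque(roots)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return visited == len(nodes)
```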
Ontological Search • Two sub-problems – • Ontology Description and Selection • Merging and Scoring
Ontology Description and Selection • Goal: select the subset of ontologies relevant to a query • Can be modeled as P(O,q) • Difficult when ontologies are constantly updated • Use of inference engines and logic mechanisms • Evaluate the relative utility of different ontologies by comparing the results they generate for a given input query • Comparison against a gold set of queries • Time-consuming process • The paper uses a parameter to model the general accuracy of a given resource • Use of machine learning algorithms such as expectation maximization
Merging • Reduces the problem of merging ambiguous concepts • Primary goal is to find complementary information in the results • Makes the result more complete • When the graphs are non-isomorphic, merging involves inexact graph matching and maximum common subgraph problems • These are NP-complete problems
Merging • Isomorphic graphs
Merging • Graph similarity • Cost Based Distance • Use of edit operations • Feature Based Distance • Use a set of invariants established from the graph structural description • Maximum Common Subgraph • Maximum clique detection
Merging • Localized Confidence Boosting • Confidence is indicative of the reliability of the association. • Graphs are broken into tuples (cx, cy, r) and merged if the tuples are similar. Confidence is boosted when merging using Soft Or –
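The tuple-level merge can be sketched as follows. The slide's formula for Soft Or is not reproduced in this transcript, so the code assumes the common probabilistic-sum form, 1 - (1 - c1)(1 - c2); that choice, the function names, and the default exact-match similarity test are all assumptions made for illustration.

```python
def soft_or(c1, c2):
    """Assumed Soft Or: probabilistic sum of two confidences in [0, 1]."""
    return 1.0 - (1.0 - c1) * (1.0 - c2)


def merge_tuples(results, similar=lambda t, u: t == u):
    """Merge ((cx, cy, r), confidence) pairs from several ontologies.

    Tuples judged similar are collapsed into one entry whose confidence
    is boosted with soft_or; all other tuples are kept unchanged.
    """
    merged = []  # list of [tuple, confidence]
    for tup, conf in results:
        for entry in merged:
            if similar(entry[0], tup):
                entry[1] = soft_or(entry[1], conf)
                break
        else:
            merged.append([tup, conf])
    return [(t, c) for t, c in merged]
```

Under this assumed form, two sources that agree on a tuple with confidences 0.6 and 0.5 yield a merged confidence of about 0.8, higher than either source alone, while a tuple found in only one source keeps its original confidence.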
Merging • Tuple Similarity • Based on the linear combination of edge similarity and concept similarity • Uses Q-Gram distance for comparing concepts or relations
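A rough sketch of this similarity measure is given below. It assumes a q-gram distance defined as the L1 distance between q-gram count profiles, padded with "#" and normalized into a similarity in [0, 1]; the padding, the normalization, the equal default weight `w_concept=0.5`, and the averaging of the two concept similarities are illustrative assumptions rather than values from the paper.

```python
from collections import Counter


def qgrams(s, q=2):
    """Count the q-grams of s, padded with '#' so short strings still have grams."""
    padded = "#" * (q - 1) + s.lower() + "#" * (q - 1)
    return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))


def qgram_similarity(s, t, q=2):
    """1 minus the normalized L1 distance between the two q-gram profiles."""
    a, b = qgrams(s, q), qgrams(t, q)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 1.0
    dist = sum(abs(a[g] - b[g]) for g in set(a) | set(b))
    return 1.0 - dist / total


def tuple_similarity(t, u, w_concept=0.5):
    """Linear combination of concept and edge similarity for (cx, cy, r) tuples."""
    concept_sim = (qgram_similarity(t[0], u[0]) + qgram_similarity(t[1], u[1])) / 2
    edge_sim = qgram_similarity(t[2], u[2])
    return w_concept * concept_sim + (1.0 - w_concept) * edge_sim
```

A function like this could serve as the similarity test that decides whether two tuples from different ontologies are merged, by comparing its value against a threshold.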
Scoring • Score the outcome of each operator before computing the final score • Each operator focuses on either precision or recall • Precision operator: relation • Recall operator: similarity • Precision: fraction of retrieved results that are relevant • Recall: fraction of relevant instances that are retrieved
Scoring • Precision = (relevant instances in outcome) / (total outcome) • Recall = (relevant instances in outcome) / (relevant results)
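The two ratios can be computed directly from a retrieved set and a gold set. This is a minimal sketch; the function name and the convention of returning 0.0 for an empty denominator are assumptions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a gold set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant instances in the outcome
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if four results are retrieved and two of them appear in a gold set of three, precision is 2/4 and recall is 2/3.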
Scoring • Precision scoring metric • Recall scoring metric
Results • Experimental setup: type checking • Ontologies used: WordNet and ThoughtTreasure • 9558 pairs from the Javelin question answering system in TREC QA • A gold standard was created for a subset of the pairs in order to test accuracy
Results • Figure: improved confidence after merging • Figure: recall • Figure: precision and recall
Related Work • Different ways to approach the same problem • FCA-Merge algorithm • IF-Map method • PROMPT system • Swoogle • DRAGO
Challenges and Future Work • The current approach is not robust when relations in different ontologies differ significantly • Possible remedy: compare the structures in which the two concepts occur to determine similarity • Describing ontologies that change constantly is difficult • Future work: model an ontology by issuing random queries to determine its domain
Conclusions • Approach discussed here presents several benefits over full merge • Helps mitigate the issue of dynamic ontologies • Establishes a parallel to federated search