Zhisheng Huang, Frank van Harmelen. Vrije University Amsterdam. Karlsruhe, Oct 28th 2008. Using Semantic Distances for Reasoning with Inconsistent Ontologies
"One cannot live without inconsistency." Carl Jung (1875-1961) "There is nothing constant in this world but inconsistency." Jonathan Swift (1667-1745)
The importance of the inconsistency problem • A key ingredient of the Semantic Web vision is to avoid imposing a single ontology. Hence, merging ontologies is a key step. • Merging multiple ontologies can quickly lead to inconsistencies [Hameed 2003]. • Migration and evolution also lead to inconsistencies [Schlobach et al. 2003, Haase et al. 2005].
The importance of the inconsistency problem (cont.) • Many ontologies are semantically so lightweight (e.g. expressible in RDF Schema only) that the inconsistency problem doesn't arise. • Many of these semantically lightweight ontologies make implicit assumptions such as the Unique Name Assumption, or assume that sibling classes are disjoint. • If such assumptions are made explicit, many ontologies turn out to be inconsistent.
Outline of This Talk • Framework of Reasoning with Inconsistent ontologies • Syntactic Approach • Semantic Approach • Implementation, Test, and Evaluation • Conclusions
Processing Inconsistent Ontologies • Debugging inconsistent ontologies: diagnose and repair the ontology when inconsistencies are encountered (Schlobach, IJCAI 2003). • Reasoning with inconsistent ontologies: simply avoid the inconsistency and apply a non-standard reasoning method to obtain meaningful answers (Huang, van Harmelen, and ten Teije, IJCAI 2005).
What an inconsistency reasoner is expected to do • Given an inconsistent ontology, return meaningful answers to queries. • General solution: use non-standard reasoning to deal with inconsistency. • |= : the standard inference relation • |≈ : a non-standard inference relation
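Formal notions of Reasoning with Inconsistent Ontologies • Various Answers • Accepted: the query is derived and its negation is not. • Rejected: the negation of the query is derived and the query is not. • Over-determined: both the query and its negation are derived. • Undetermined: neither the query nor its negation is derived. • Soundness: only classically justified results. • Meaningful: sound and never over-determined.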
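The four answer types can be told apart by two yes/no facts: whether the reasoner derives the query, and whether it derives the negation of the query. The following is a minimal Python sketch of that case split; the names `Answer` and `classify` are hypothetical and do not come from the talk.

```python
from enum import Enum


class Answer(Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    OVER_DETERMINED = "over-determined"
    UNDETERMINED = "undetermined"


def classify(derives_query: bool, derives_negation: bool) -> Answer:
    """Map the two derivability facts to one of the four answer types."""
    if derives_query and derives_negation:
        return Answer.OVER_DETERMINED
    if derives_query:
        return Answer.ACCEPTED
    if derives_negation:
        return Answer.REJECTED
    return Answer.UNDETERMINED
```

A reasoner that is meaningful in the sense above would never produce `Answer.OVER_DETERMINED`.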
Reasoning with inconsistent ontologies: Main Idea Starting from the query, • select a consistent sub-theory by using a relevance-based selection function. • apply standard reasoning on the selected sub-theory to find meaningful answers. • If it cannot give a satisfying answer, the selection function relaxes the relevance degree to extend the consistent sub-theory for further reasoning.
Over-determined Processing • If the selected data set is so large that it leads to inconsistencies, we need some kind of backtracking, called over-determined processing. • Blind over-determined processing vs. informed over-determined processing with threshold • First Maximal Consistent set (FMC) approach
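The select, reason, extend loop can be sketched on a toy propositional theory. Everything in this sketch is an assumption made for illustration: the clause encoding, the penguin example, the brute-force `consistent` check that stands in for a real description logic reasoner, and the choice to give up with "undetermined" as soon as the extended selection becomes inconsistent (no over-determined processing is attempted).

```python
from itertools import product

# Toy stand-in for an ontology: a clause is a frozenset of literals, and a
# literal is an atom name, prefixed with "~" when negated. Hypothetical data.
ONTOLOGY = [
    frozenset({"penguin"}),
    frozenset({"~penguin", "bird"}),
    frozenset({"~bird", "has_wings"}),
    frozenset({"~has_wings", "flies"}),
    frozenset({"~penguin", "flightless"}),
    frozenset({"~flightless", "~flies"}),
]


def atoms(clauses):
    return {lit.lstrip("~") for clause in clauses for lit in clause}


def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit


def consistent(clauses):
    """Brute-force satisfiability check over all truth assignments."""
    names = sorted(atoms(clauses))
    for values in product([False, True], repeat=len(names)):
        model = dict(zip(names, values))
        if all(any(model[l.lstrip("~")] != l.startswith("~") for l in c)
               for c in clauses):
            return True
    return False


def entails(clauses, lit):
    """Standard entailment: the clauses plus the negated literal are unsatisfiable."""
    return not consistent(set(clauses) | {frozenset({negate(lit)})})


def relevant(c1, c2):
    """Direct relevance: the two clauses share an atom name."""
    return bool(atoms([c1]) & atoms([c2]))


def answer(ontology, query):
    # Start from the clauses directly relevant to the query.
    selected = {c for c in ontology if relevant(c, frozenset({query}))}
    while True:
        if not consistent(selected):
            return "undetermined"  # simplification: no backtracking here
        if entails(selected, query):
            return "accepted"
        if entails(selected, negate(query)):
            return "rejected"
        # Relax the relevance degree by one step.
        extended = selected | {c for c in ontology
                               if any(relevant(c, s) for s in selected)}
        if extended == selected:
            return "undetermined"
        selected = extended
```

On this toy theory the whole clause set is inconsistent, yet `answer(ONTOLOGY, "bird")` is settled on a consistent sub-theory before the conflicting clause about flying is ever selected.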
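One plausible reading of the First Maximal Consistent set idea is a greedy pass: keep the last consistent selection, then add the newly selected formulas one at a time, skipping any formula that would break consistency. The sketch below follows that reading and is not claimed to be the procedure used in the talk's implementation; `first_maximal_consistent` and the toy `literal_consistent` check are hypothetical names, and formulas are reduced to plain literals.

```python
from typing import Callable, Iterable, Set


def literal_consistent(formulas: Set[str]) -> bool:
    """Toy check: a set of literals is consistent unless it holds p and ~p."""
    return not any(("~" + f) in formulas
                   for f in formulas if not f.startswith("~"))


def first_maximal_consistent(
    base: Set[str],
    candidates: Iterable[str],
    is_consistent: Callable[[Set[str]], bool],
) -> Set[str]:
    """Greedily extend a consistent base with candidates, in the given order."""
    result = set(base)
    for formula in candidates:
        if is_consistent(result | {formula}):
            result.add(formula)
    return result
```

Because consistency can only be lost, never regained, by adding formulas, every skipped candidate also conflicts with the final result, so the returned set is maximal among the consistent sets between the base and the full candidate list. Ordering the candidates by relevance would be one way to make the pass informed rather than blind.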
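Syntactic Relevance • Direct syntactic relevance (0-relevance): two formulas φ and ψ share a common name: C(φ) ∩ C(ψ) ≠ ∅ or R(φ) ∩ R(ψ) ≠ ∅ or I(φ) ∩ I(ψ) ≠ ∅, where C, R and I give the concept, role and individual names occurring in a formula. • K-relevance: there exist formulas ψ0, ψ1, …, ψk such that φ and ψ0, ψ0 and ψ1, …, ψk and ψ are directly relevant.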
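Syntactic relevance can be sketched as a breadth-first search over formulas, where two formulas are linked when they share a name. In this sketch each formula is represented only by the set of names it mentions (an assumption; concept, role and individual names are not distinguished), and `chain_length` counts the intermediate formulas on the shortest chain, so direct relevance gives 0. The slide's own indexing of k may differ from this count by one.

```python
from collections import deque


def directly_relevant(f, g):
    """Two formulas are directly relevant when they share a name."""
    return bool(f & g)


def chain_length(ontology, source, target):
    """Fewest intermediate formulas linking source to target, or None."""
    queue = deque([(source, 0)])  # (formula, intermediates used so far)
    seen = {source}
    while queue:
        current, k = queue.popleft()
        if directly_relevant(current, target):
            return k
        for f in ontology:
            if f not in seen and directly_relevant(current, f):
                seen.add(f)
                queue.append((f, k + 1))
    return None


# Hypothetical formulas, each reduced to the names it mentions.
F1 = frozenset({"MadCow", "Cow"})
F2 = frozenset({"Cow", "Vegetarian"})
F3 = frozenset({"Vegetarian", "Plant"})
F4 = frozenset({"Car", "Wheel"})
FORMULAS = [F1, F2, F3, F4]
```

Here `F1` and `F3` share no name but are linked through `F2`, while `F4` is unreachable from the others.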
Semantic Relevance • Relevance is measured by using semantic information of data. • Selection functions are defined in terms of Semantic Distance SD(x,y).
Using Semantic Distances for Reasoning with Inconsistent Ontologies • Google distances are used to develop semantic relevance functions to reason with inconsistent ontologies. • Assumption: the more frequently two concepts appear on the same web page, the more semantically relevant they are to each other.
Google Distances (Cilibrasi and Vitanyi 2004) • Google distance is measured in terms of the co-occurrence of two search items on the Web, as reported by the Google search engine. • Normalized Google Distance (NGD) is introduced to measure similarity, i.e. light-weight semantic relevance. • NGD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log M - min{log f(x), log f(y)}), where f(x) is the number of Google hits for x, f(x, y) is the number of Google hits for the tuple of search items x and y, and M is the number of web pages indexed by Google.
Normalized Google Distances • NGD(x, y) can be understood intuitively as a measure for the symmetric conditional probability of co-occurrence of the search terms x and y.
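The NGD formula translates directly into a few lines of Python. In this sketch the hit counts are passed in as plain numbers rather than fetched from a search engine, and the function name `ngd` and the handling of a zero co-occurrence count (returning infinity) are assumptions of the sketch.

```python
import math


def ngd(fx: float, fy: float, fxy: float, m: float) -> float:
    """Normalized Google Distance from hit counts.

    fx, fy: hits for each term; fxy: hits for both terms together;
    m: number of indexed pages. The log base cancels in the ratio.
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term needs at least one hit")
    if fxy <= 0:
        return math.inf  # the terms never co-occur
    log_fx, log_fy = math.log(fx), math.log(fy)
    numerator = max(log_fx, log_fy) - math.log(fxy)
    denominator = math.log(m) - min(log_fx, log_fy)
    return numerator / denominator
```

With made-up counts `ngd(100, 10_000, 10, 1_000_000)` the numerator is log 10000 - log 10 and the denominator is log 1000000 - log 100, giving 3/4. Two terms that always occur together, as in `ngd(1000, 1000, 1000, 1_000_000)`, are at distance 0.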
Semantic Distances between two formulas • Define semantic distances (SD) between two formulas in terms of semantic distances between two concepts/roles/individuals (NGD)
Semantic Distances by NGD • Semantic distance is measured by the ratio of the summed distance of the difference between two formulae to the maximal distance between two formulae.
Proposition • The semantic distance SD satisfies the properties Range, Reflexivity, Symmetry, Maximum Distance, and Intermediate Values.
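One reading of this ratio, offered here as an assumption rather than as the talk's exact definition: sum NGD over all pairs made of a name that occurs only in the first formula and a name that occurs only in the second, then divide by the product of the two name counts, which is the largest value the sum could reach if every NGD were 1. The sketch below uses the two NGD values quoted for the MadCow example; the name sets assigned to formulas and the function name `semantic_distance` are hypothetical.

```python
from itertools import product

# NGD values quoted for the MadCow example; pairs are unordered.
NGD_TABLE = {
    frozenset({"MadCow", "Grass"}): 0.7229,
    frozenset({"MadCow", "Sheep"}): 0.6120,
}


def lookup_ngd(x, y):
    return NGD_TABLE[frozenset({x, y})]


def semantic_distance(names_phi, names_psi):
    """Summed NGD over the differing names, normalised by the maximal sum.

    Both arguments are non-empty sets of the names occurring in a formula.
    """
    only_phi = names_phi - names_psi
    only_psi = names_psi - names_phi
    total = sum(lookup_ngd(x, y) for x, y in product(only_phi, only_psi))
    return total / (len(names_phi) * len(names_psi))
```

Under this reading a formula is at distance 0 from itself, the measure is symmetric, and a formula mentioning only MadCow is closer to one mentioning only Sheep than to one mentioning only Grass.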
Example: MadCow NGD(MadCow, Grass)=0.7229 NGD(MadCow, Sheep)=0.6120
Implementation: PION PION: Processing Inconsistent ONtologies http://wasp.cs.vu.nl/sekt/pion
Answer Evaluation • Intended Answer (IA): Query answer = Intuitive answer. • Cautious Answer (CA): Query answer is 'undetermined', but Intuitive answer is 'accepted' or 'rejected'. • Reckless Answer (RA): Query answer is 'accepted' or 'rejected', but Intuitive answer is 'undetermined'. • Counter Intuitive Answer (CIA): Query answer is 'accepted' but Intuitive answer is 'rejected', or vice versa.
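The four evaluation categories form a simple case split on the pair (query answer, intuitive answer). A minimal Python sketch, assuming both answers are given as the strings 'accepted', 'rejected' or 'undetermined'; the function names `evaluate` and `tally` are hypothetical.

```python
from collections import Counter

VALID = {"accepted", "rejected", "undetermined"}


def evaluate(query_answer: str, intuitive_answer: str) -> str:
    """Return IA, CA, RA or CIA for one query."""
    if query_answer not in VALID or intuitive_answer not in VALID:
        raise ValueError("answers must be accepted, rejected or undetermined")
    if query_answer == intuitive_answer:
        return "IA"   # intended
    if query_answer == "undetermined":
        return "CA"   # cautious
    if intuitive_answer == "undetermined":
        return "RA"   # reckless
    return "CIA"      # counter-intuitive: accepted vs. rejected


def tally(pairs):
    """Count the categories over (query answer, intuitive answer) pairs."""
    return Counter(evaluate(q, i) for q, i in pairs)
```

Running `tally` over a test set gives the per-category counts from which a quality comparison between two reasoning approaches can be read off.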
Syntactic approach vs. Semantic approach: quality of query answers
Summary • The run-time of the semantic approach is much better than that of the syntactic approach, while the quality remains comparable. • The semantic approach can be parameterised so as to stepwise further improve the run-time with only a very small drop in quality.
Summary (cont.) • The semantic approach for reasoning with inconsistent ontologies trades off computational cost against inferential completeness, and provides attractive scalability.
Questions? Thank you for your attention!