An Efficient Method for Computing Alignment Diagnoses
Christian Meilicke, Heiner Stuckenschmidt
University of Mannheim, Lehrstuhl für Künstliche Intelligenz
{christian, heiner}@informatik.uni-mannheim.de
Computing a local optimal diagnosis

Problem Statement
• Automatically and manually (!) generated ontology alignments are often incoherent
  • See the OAEI-2008 results of the conference track
• => Incoherent alignments are a problem in many application scenarios*
  • Instance migration results in inconsistent ontologies
  • Query translation results in "a priori" empty result sets
• Find a way to repair incoherent alignments automatically and very efficiently, because …
  • "Agents on the web" require coherent alignments on the fly
  • Large ontologies require efficient algorithms
* C. Meilicke and H. Stuckenschmidt. Incoherence as a Basis for Measuring the Quality of Ontology Mappings. OM-08.

Outline
• Alignment Semantics
  • Incoherence of an alignment, MIPS alignments
• Alignment Diagnosis
  • Diagnosis, Minimal Hitting Set, Local Optimal Diagnosis
• Computing a Local Optimal Diagnosis (LOD)
  • Brute-Force LOD and Efficient LOD
• Experimental Results
  • Runtime, Quality of the Diagnosis

"Natural" Semantics
Given: an alignment A and two ontologies O1 and O2.
Correspondences in A:
<1#Person, 2#Person, =, 0.98>
<1#hasName, 2#name, =, 0.87>
<1#writtenBy, 2#docWrittenBy, =, 0.7>
<1#authorOf, 2#hasWritten, =, 0.56>
<1#firstAuthor, 2#Author, ⊑, 0.56>
Merged ontology O1 ∪A O2: the axioms of O1 and O2 plus the correspondences read as axioms, e.g.
1#Person ≡ 2#Person
1#firstAuthor ⊑ 2#Author
…

Incoherence of an Alignment
Definition: Incoherence of an Alignment
An alignment A between ontologies O1 and O2 is incoherent iff there exists a concept i#C or property i#R that is satisfiable in Oi, with i ∈ {1,2}, but unsatisfiable in O1 ∪A O2. (The satisfiability of a property i#R can be reduced to the satisfiability of ∃i#R.⊤.)
Definition: MIPS Alignment (minimal conflict set)
Let A be an incoherent alignment between ontologies O1 and O2. A subalignment M ⊆ A is a MIPS alignment (= minimal incoherence preserving subalignment) iff M is incoherent and there exists no M' ⊂ M such that M' is incoherent.

"Terminology" (figure legend)
• Alignment
• Correspondence
• Alignment with MIPS shown as subsets
• Alignment as a sequence ordered by confidence, with MIPS depicted by red-dotted links

Alignment Diagnosis
Definition: Alignment Diagnosis
An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff A \ ∆ is coherent with respect to O1 and O2 and, for each ∆' ⊂ ∆, the alignment A \ ∆' is incoherent with respect to O1 and O2.
Proposition: Alignment Diagnosis and Minimal Hitting Sets
An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff ∆ is a minimal hitting set over all MIPS in A.
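
The proposition can be checked on a toy instance. The sketch below is only an illustration: correspondences are plain integers, and the reasoner is replaced by a hypothetical oracle `coherent` that calls an alignment coherent iff it contains no set from a hand-written MIPS list. All names are assumptions made for this example.

```python
def coherent(alignment, mips_list):
    """Toy stand-in for a reasoner: coherent iff no MIPS is fully contained."""
    return not any(m <= alignment for m in mips_list)


def is_minimal_hitting_set(delta, mips_list):
    def hits(d):
        return all(d & m for m in mips_list)
    # Supersets of hitting sets are hitting sets, so testing the removal
    # of single elements is enough to establish minimality.
    return hits(delta) and all(not hits(delta - {c}) for c in delta)


def is_diagnosis(delta, alignment, mips_list):
    if not coherent(alignment - delta, mips_list):
        return False
    # Incoherence is monotone, so putting back one correspondence at a
    # time covers every proper subset of delta.
    return all(not coherent(alignment - (delta - {c}), mips_list) for c in delta)


A = frozenset(range(1, 6))
MIPS = [frozenset({1, 3}), frozenset({3, 4}), frozenset({2, 5})]

for delta in (frozenset({2, 3}), frozenset({1, 2, 3}), frozenset({3})):
    print(sorted(delta), is_diagnosis(delta, A, MIPS), is_minimal_hitting_set(delta, MIPS))
```

On this instance both tests agree for every candidate: {2, 3} is accepted, {1, 2, 3} is rejected as not minimal, and {3} is rejected because it leaves the MIPS {2, 5} untouched.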

Local Optimal Diagnosis (LOD)
[Figure: alignment shown as a sequence from high confidence to low confidence]
• Definition: Accused correspondence
  • A correspondence c ∈ A is accused by A iff there exists a MIPS M in A with c ∈ M such that for all c' ≠ c in M it holds that
    • (1) conf(c') > conf(c) and
    • (2) c' is not accused by A. (important!)
• Definition: Local optimal diagnosis (LOD)
  • The set of all accused correspondences is referred to as the local optimal diagnosis (LOD).
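
The recursive definition can be evaluated directly when the MIPS are known. The following is a minimal sketch under two assumptions that are not on the slide: confidences are pairwise distinct, and the MIPS are given explicitly as sets. Because condition (1) only looks at correspondences with higher confidence, processing the alignment in descending confidence order resolves the recursion. The names `local_optimal_diagnosis`, `conf` and `mips` are made up for this example.

```python
def local_optimal_diagnosis(conf, mips_list):
    """conf: dict correspondence -> confidence; mips_list: list of sets."""
    accused = set()
    # Higher-confidence correspondences are decided first, so the
    # "not accused" test in condition (2) is already final when used.
    for c in sorted(conf, key=conf.get, reverse=True):
        for m in mips_list:
            if c in m and all(
                conf[other] > conf[c] and other not in accused
                for other in m if other != c
            ):
                accused.add(c)
                break
    return accused


conf = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.5}
mips = [{"a", "c"}, {"c", "d"}, {"b", "e"}]
print(sorted(local_optimal_diagnosis(conf, mips)))  # ['c', 'e']
```

Correspondence d illustrates why condition (2) matters: it is the weakest member of the MIPS {c, d}, but c is already accused, so that MIPS is resolved and d stays in the alignment.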

Algorithm 1
[Animation over the correspondences 1 to 10: after each step the question "Coherent?" is asked. The answers shown are YES, YES, NO (followed by "Now it is!"), YES, YES, NO (followed by "Now it is!"), … continue the same way.]

Algorithm 1: Result
• … and after a few more slides we would end up like this: [figure: correspondences 1 to 10]
• Note:
  • Coherence is checked 10 times to construct a local optimal diagnosis, which is a minimal hitting set over all MIPS
  • We have not computed a single MIPS alignment!
First sketch: Meilicke, Völker, Stuckenschmidt. Learning Disjointness for Debugging Mappings between Lightweight Ontologies (EKAW-08).
With a focus on the relation to belief revision, discussed in: Qi, Ji, Haase. A Conflict-based Operator for Mapping Revision (ISWC-09).
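
A brute-force procedure of this kind can be sketched as follows. This is an illustration under assumptions, not the authors' code: the alignment is a list already sorted by descending confidence, and the reasoner is replaced by a hypothetical counting oracle built from hidden conflict sets that the algorithm itself never inspects.

```python
def make_oracle(conflicts):
    """Hypothetical stand-in for a reasoner; counts coherence checks."""
    calls = {"n": 0}

    def is_coherent(alignment):
        calls["n"] += 1
        present = set(alignment)
        return not any(m <= present for m in conflicts)

    return is_coherent, calls


def brute_force_lod(alignment, is_coherent):
    """alignment: correspondences sorted by descending confidence."""
    kept, diagnosis = [], []
    for c in alignment:
        # Tentatively add c; if the result is incoherent, c goes into the diagnosis.
        if is_coherent(kept + [c]):
            kept.append(c)
        else:
            diagnosis.append(c)
    return diagnosis


conflicts = [{1, 3}, {2, 6}, {3, 4}, {5, 9}]  # hidden from the algorithm
is_coherent, calls = make_oracle(conflicts)
lod = brute_force_lod(list(range(1, 11)), is_coherent)
print(lod, calls["n"])  # [3, 6, 9] 10
```

The conflict {3, 4} never triggers a removal, because correspondence 3 is already gone by the time 4 is tested. The run uses one coherence check per correspondence and never enumerates a conflict set.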

"Pattern-based" reasoning
• Idea: use an incomplete method for incoherence detection in A' ⊆ A
• Classify O1 and O2 once, then check for each pair of correspondences in A' whether a certain pattern occurs
  • If the pattern occurs for some pair in an alignment A', then A' is incoherent
  • If no pattern occurs, A' can nevertheless be incoherent!
[Figure: pattern between ontologies Oi and Oj]
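
The slide does not spell out the pattern, so the sketch below assumes one plausible instance for illustration: two equivalence correspondences x ≡ x' and y ≡ y' conflict if x ⊑ y holds in one ontology while x' and y' are disjoint in the other. The classification results are modelled as precomputed sets of subsumption and disjointness pairs. All names and the example correspondences are hypothetical.

```python
from itertools import combinations


def conflicting_pair(c1, c2, sub_i, disj_j):
    """Assumed pattern: x ⊑ y (or y ⊑ x) in O_i while x', y' are disjoint in O_j."""
    (x, x2), (y, y2) = c1, c2
    subsumed = (x, y) in sub_i or (y, x) in sub_i
    return subsumed and frozenset((x2, y2)) in disj_j


def pattern_incoherent(alignment, sub1, disj1, sub2, disj2):
    """Incomplete test: True proves incoherence, False proves nothing."""
    for c1, c2 in combinations(alignment, 2):
        if conflicting_pair(c1, c2, sub1, disj2):
            return True
        # Same pattern with the roles of the two ontologies swapped.
        if conflicting_pair(c1[::-1], c2[::-1], sub2, disj1):
            return True
    return False


# Classified ontologies, reduced to the facts needed here (illustrative only).
sub1 = {("1#Review", "1#Document")}
disj1 = set()
sub2 = {("2#Reviewer", "2#Person")}
disj2 = {frozenset(("2#Reviewer", "2#Document"))}

alignment = [("1#Review", "2#Reviewer"), ("1#Document", "2#Document")]
print(pattern_incoherent(alignment, sub1, disj1, sub2, disj2))      # True
print(pattern_incoherent(alignment[:1], sub1, disj1, sub2, disj2))  # False
```

Each check is a set lookup per pair, so no reasoning in the merged ontology is needed. The price is incompleteness: a conflict that involves three or more correspondences can never match a pairwise pattern.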

That doesn't work …
• Use the efficient coherence test instead of complete reasoning in the algorithm described above
  • Checking A' ⊆ A no longer requires reasoning in O1 ∪A' O2; it is replaced by iterating over all pairs in A'
• However: the resulting alignment might still be incoherent, and ∆ is not a LOD
  • Missing one MIPS might result in a chain of incorrect follow-up decisions!
  • Thus, removing the missed MIPS afterwards does not work!
• How can we exploit the efficient method while still constructing a LOD?

Algorithm 2: Example
[Figure: correspondences 1 to 10 with MIPS links. Legend: detectable by efficient method; only detectable by complete method; resolved due to removal of correspondence.]
• Run the BF algorithm with efficient reasoning. Still incoherent?
• Verification step: use binary search to detect the correspondence k such that A[0 … k-1] is coherent and A[0 … k] is incoherent (k = 8 in the example)
  • Safe part: efficient reasoning did not fail up to k
  • Incorrect part: recompute!
• Run the main algorithm again with efficient reasoning for A[k+1 … n], where ∆1-k ∪ A[k] for A[1 … k] is a fixed part of the resulting diagnosis
• Still incoherent? If yes, we have k_new > k_old; repeat the same verification step
• The final result is a LOD.
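
The interplay of the efficient pass, the verification step and the restart can be sketched as below. This is one reading of the slides, not the authors' implementation: `fast_conflict` is a hypothetical sound but incomplete pairwise test, `is_coherent` is a hypothetical complete oracle that counts its calls, and the conflict sets are toy data. The binary search relies on incoherence being monotone.

```python
def efficient_lod(alignment, fast_conflict, is_coherent):
    """alignment: correspondences sorted by descending confidence."""
    kept_fixed, removed_fixed, start = [], [], 0  # verified decisions (indices)
    while True:
        kept, removed = list(kept_fixed), list(removed_fixed)
        # Brute-force pass over the undecided suffix, using only the fast test.
        for i in range(start, len(alignment)):
            current = [alignment[j] for j in kept]
            (removed if fast_conflict(current, alignment[i]) else kept).append(i)
        # Verification with the complete method.
        if is_coherent([alignment[j] for j in kept]):
            return [alignment[i] for i in removed]
        # Binary search: kept[:lo] is coherent, kept[:hi] is incoherent.
        lo, hi = len(kept_fixed), len(kept)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if is_coherent([alignment[j] for j in kept[:mid]]):
                lo = mid
            else:
                hi = mid
        bad = kept[hi - 1]  # first correspondence the fast test wrongly kept
        kept_fixed = kept[:hi - 1]
        removed_fixed = [i for i in removed if i < bad] + [bad]
        start = bad + 1  # everything after it is recomputed


FAST_PAIRS = [{1, 3}, {5, 7}, {6, 9}]  # detectable by the efficient method
COMPLETE_ONLY = [{2, 4, 5}]            # only detectable by the complete method
calls = {"complete": 0}


def fast_conflict(kept, c):
    return any({k, c} in FAST_PAIRS for k in kept)


def is_coherent(alignment):
    calls["complete"] += 1
    present = set(alignment)
    return not any(m <= present for m in FAST_PAIRS + COMPLETE_ONLY)


lod = efficient_lod(list(range(1, 11)), fast_conflict, is_coherent)
print(lod, calls["complete"])  # [3, 5, 9] 5
```

In the first pass the fast test misses the conflict {2, 4, 5}, keeps 5, and then wrongly removes 7 because of the pair {5, 7}. The verification step locates 5, fixes the decisions up to that point, and the second pass keeps 7. The result equals what the brute-force procedure yields with the complete oracle on this instance, using five complete checks where the brute-force run would use ten.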

Runtime Considerations (Theory)
• n = size of alignment A
• m = number of times the binary search is applied
• The more complete the pattern-based reasoning is, the fewer verification steps/iterations are necessary
• The runtime of the pattern-based reasoning itself hardly matters for the overall runtime!
• Runtime comparison
  • Brute-Force LOD: O(n)
  • Efficient LOD: O(log(n) * m)
• Do we have m << n?
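
To make the comparison concrete, the following back-of-the-envelope count assumes that cost is measured in calls to the complete coherence check, that each binary search needs at most ⌈log₂ n⌉ such calls, and that each iteration adds one verification call. The numbers n = 1000 and m = 5 are purely illustrative and not taken from the experiments.

```latex
\begin{align*}
T_{\mathrm{BF}}(n) &= n \\
T_{\mathrm{Eff}}(n, m) &\le m \,\lceil \log_2 n \rceil + (m + 1) \\[4pt]
n = 1000,\; m = 5:\quad
T_{\mathrm{BF}} &= 1000, \qquad
T_{\mathrm{Eff}} \le 5 \cdot 10 + 6 = 56
\end{align*}
```

Under these assumptions the efficient variant pays off whenever m(⌈log₂ n⌉ + 1) + 1 < n, which is why the question of whether m << n holds in practice is the decisive one.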

Results: Runtime
• Based on experiments with the OAEI conference ontologies and submissions from 2007/08
  • Expressivity: SHIN(D), ELI(D), SIF(D), ALCIF(D)
  • Four different state-of-the-art matching systems
[Table with columns n and m; values not preserved]
• Better results for the benchmark datasets: 5 to 10 times faster

Results: Quality of Diagnosis
• Removing the LOD results in an alignment with increased precision and slightly decreased recall => slightly increased f-measure
• For alignments with low precision, the positive effects are very strong
• In rare cases, an incorrect correspondence annotated with high confidence has negative effects

Summary
• Algorithm 1: algorithm for computing a LOD
  • Without computing MIPS or MUPS!
• Algorithm 2: general approach for improving algorithms of type 1
  • Shown for the natural interpretation of correspondences as axioms and a specific type of incomplete reasoning
  • In principle applicable to any semantics for which we can find a similar efficient reasoning approach!
• Good results for natural interpretation + pattern-based reasoning: between 2 and 10 times faster!

Thanks for your attention. Questions?

Back-Up Slides

Property Pattern Example
O1: ∃reviewOfPaper.⊤ ⊑ Review ⊑ Document
O2: ∃readPaper.⊤ ⊑ Reviewer, Reviewer ⊑ Person, Document ⊑ ¬Person
Correspondences: reviewOfPaper ≡ readPaper, Document ≡ Document
[Figure: diagram linking ∃reviewOfPaper.⊤, ∃readPaper.⊤ and the two Document concepts, with two relations marked "disjoint"]