
An Efficient Method for Computing Alignment Diagnoses

Christian Meilicke, Heiner Stuckenschmidt. University of Mannheim, Lehrstuhl für Künstliche Intelligenz. {christian, heiner}@informatik.uni-mannheim.de


Presentation Transcript


  1. An Efficient Method for Computing Alignment Diagnoses. Christian Meilicke, Heiner Stuckenschmidt. University of Mannheim, Lehrstuhl für Künstliche Intelligenz. {christian, heiner}@informatik.uni-mannheim.de

  2. Computing a local optimal diagnosis: Problem Statement
     • Automatically and manually (!) generated ontology alignments are often incoherent.
     • See the OAEI-2008 results of the conference track.
     • => Incoherent alignments are a problem in many application scenarios.*
     • Instance migration results in inconsistent ontologies.
     • Query translation results in 'a priori' empty result sets.
     • Goal: find a way to repair incoherent alignments automatically and very efficiently, because …
     • 'Agents on the web' require coherent alignments on the fly.
     • Large ontologies require efficient algorithms.
     * C. Meilicke and H. Stuckenschmidt. Incoherence as a Basis for Measuring the Quality of Ontology Mappings. OM-08.

  3. Outline
     • Alignment Semantics: incoherence of an alignment, MIPS alignments
     • Alignment Diagnosis: diagnosis, minimal hitting set, local optimal diagnosis
     • Computing a Local Optimal Diagnosis (LOD): brute-force LOD and efficient LOD
     • Experimental Results: runtime, quality of the diagnosis

  4. "Natural" Semantics
     Given: an alignment A and two ontologies O1 and O2.
     Correspondences in A:
     • <1#Person, 2#Person, =, 0.98>
     • <1#hasName, 2#name, =, 0.87>
     • <1#writtenBy, 2#docWrittenBy, =, 0.7>
     • <1#authorOf, 2#hasWritten, =, 0.56>
     • <1#firstAuthor, 2#Author, ⊑, 0.56>
     Axioms in the merged ontology O1 ∪A O2:
     • 1#Person ≡ 2#Person
     • …
     • 1#firstAuthor ⊑ 2#Author

  5. Incoherence of an Alignment
     Definition: Incoherence of an Alignment. An alignment A between ontologies O1 and O2 is incoherent iff there exists a satisfiable concept i#C or property i#R in Oi, i ∈ {1,2}, that is unsatisfiable in O1 ∪A O2. (The satisfiability of a property i#R can be reduced to the satisfiability of ∃i#R.⊤.)
     Definition: MIPS Alignment (minimal conflict set). Given an incoherent alignment A between ontologies O1 and O2, a subalignment M ⊆ A is a MIPS alignment (= minimal incoherence preserving subalignment) iff M is incoherent and there exists no M' ⊂ M such that M' is incoherent.
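
The MIPS definition can be checked mechanically once a coherence test is available. The following is a minimal sketch, not the authors' implementation: `make_toy_oracle` is a hypothetical stand-in for a real reasoner that declares an alignment incoherent iff it contains one of a few hand-written conflict sets. Because incoherence is monotone (removing correspondences never makes a coherent alignment incoherent), it is enough to test the subsets obtained by dropping a single correspondence.

```python
from typing import Callable, Iterable


def make_toy_oracle(conflicts: Iterable[Iterable[str]]) -> Callable[[Iterable[str]], bool]:
    """Hypothetical stand-in for a reasoner: coherent iff no conflict set is contained."""
    conflict_sets = [frozenset(c) for c in conflicts]

    def is_coherent(alignment: Iterable[str]) -> bool:
        present = set(alignment)
        return not any(c <= present for c in conflict_sets)

    return is_coherent


def is_mips(m: Iterable[str], is_coherent: Callable[[Iterable[str]], bool]) -> bool:
    """M is a MIPS iff M is incoherent and every proper subset is coherent.

    By monotonicity it suffices to check the subsets M minus one correspondence.
    """
    m = frozenset(m)
    if is_coherent(m):
        return False
    return all(is_coherent(m - {c}) for c in m)


oracle = make_toy_oracle([{"c1", "c2"}])
print(is_mips({"c1", "c2"}, oracle))
print(is_mips({"c1", "c2", "c3"}, oracle))
```

In this toy setting the first call reports a MIPS, while the second set is incoherent but not minimal.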

  6. "Terminology" (figure legend)
     • Correspondence
     • Alignment
     • Alignment with MIPS shown as subsets
     • Alignment as a sequence ordered by confidence, with MIPS depicted by red dotted links

  7. Alignment Diagnosis
     Definition: Alignment Diagnosis. An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff A \ ∆ is coherent with respect to O1 and O2 and for each ∆' ⊂ ∆ the alignment A \ ∆' is incoherent with respect to O1 and O2.
     Proposition: Alignment Diagnosis and Minimal Hitting Sets. An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff ∆ is a minimal hitting set over all MIPS in A.
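
The proposition can be illustrated on a toy example. This is a sketch under the assumption that all MIPS of A are already known and are given as plain Python sets of hypothetical correspondence names; it only tests the hitting-set property and does not call a reasoner.

```python
from typing import Iterable, List, Set


def is_hitting_set(delta: Set[str], mips: List[Set[str]]) -> bool:
    """A hitting set shares at least one correspondence with every MIPS."""
    return all(delta & m for m in mips)


def is_minimal_hitting_set(delta: Iterable[str], mips: List[Set[str]]) -> bool:
    """Minimal: a hitting set that stops hitting some MIPS when any element is removed."""
    delta = set(delta)
    if not is_hitting_set(delta, mips):
        return False
    return all(not is_hitting_set(delta - {c}, mips) for c in delta)


# Hypothetical MIPS over correspondences a, b, c.
mips = [{"a", "b"}, {"b", "c"}]
for candidate in ({"b"}, {"a", "c"}, {"a", "b"}, {"a"}):
    print(sorted(candidate), is_minimal_hitting_set(candidate, mips))
```

Both {b} and {a, c} are minimal hitting sets here, so a diagnosis is in general not unique; the local optimal diagnosis selects one of them using the confidence values.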

  8. Local Optimal Diagnosis (LOD)
     • Definition: Accused correspondence. A correspondence c ∈ A is accused by A iff there exists a MIPS M in A with c ∈ M such that for all c' ≠ c in M it holds that
       (1) conf(c') > conf(c) and
       (2) c' is not accused by A.
     • Definition: Local optimal diagnosis (LOD). The set of all accused correspondences is referred to as the local optimal diagnosis (LOD).
     (Figure labels: correspondences ordered from high confidence to low confidence; callout: important!)
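
The recursive definition of "accused" can be evaluated directly when the MIPS are known. The sketch below assumes pairwise distinct confidence values and a hypothetical dictionary `conf`; it visits correspondences from high to low confidence, so that whenever a correspondence is examined, the accused status of every higher-confidence correspondence has already been decided.

```python
from typing import Dict, List, Set


def lod_from_mips(mips: List[Set[str]], conf: Dict[str, float]) -> Set[str]:
    """Return the set of accused correspondences (assumes distinct confidences)."""
    accused: Set[str] = set()
    for c in sorted(conf, key=conf.get, reverse=True):
        for m in mips:
            if c not in m:
                continue
            others = m - {c}
            # (1) all others are more confident, (2) none of them is accused.
            if all(conf[o] > conf[c] and o not in accused for o in others):
                accused.add(c)
                break
    return accused


# Hypothetical example: b is accused via {a, b}; c is not, because b already is.
conf = {"a": 0.9, "b": 0.8, "c": 0.7}
mips = [{"a", "b"}, {"b", "c"}]
print(lod_from_mips(mips, conf))
```

Condition (2) is what keeps c in the alignment: the MIPS {b, c} is already resolved by removing b.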

  9. Algorithm 1 [figure: correspondences 1 to 10 in a sequence]

  10. Algorithm 1. Coherent? YES!

  11. Algorithm 1. Coherent? YES!

  12. Algorithm 1. Coherent? NO!

  13. Algorithm 1. Coherent? Now it is!

  14. Algorithm 1. Coherent? YES!

  15. Algorithm 1. Coherent? YES!

  16. Algorithm 1. Coherent? NO!

  17. Algorithm 1. Coherent? Now it is! … continue the same way

  18. Algorithm 1: Result
     • … and after a few more slides we would end up like this: [figure: correspondences 1 to 10]
     • Note: coherence is checked only 10 times to construct a local optimal diagnosis, which is a minimal hitting set over all MIPS.
     • We have not computed a single MIPS alignment!
     First sketch: Meilicke, Völker, Stuckenschmidt. Learning Disjointness for Debugging Mappings between Lightweight Ontologies (EKAW-08).
     The relation to belief revision is discussed in: Qi, Ji, Haase. A Conflict-based Operator for Mapping Revision (ISWC-09).
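
The walkthrough above can be read as a single greedy pass over the alignment. The following sketch is one reading of the slides, not the authors' code: correspondences are visited in descending order of confidence, each one is tentatively added, and it is removed again if the coherence test fails. The coherence test `is_coherent` is a hypothetical callback; the toy oracle used here simply counts its calls and reports incoherence when a hand-written conflict set is contained.

```python
from typing import Callable, Dict, List, Set


def brute_force_lod(conf: Dict[str, float],
                    is_coherent: Callable[[List[str]], bool]) -> Set[str]:
    """Greedy pass: one coherence check per correspondence, no MIPS computed."""
    kept: List[str] = []
    diagnosis: Set[str] = set()
    for c in sorted(conf, key=conf.get, reverse=True):
        if is_coherent(kept + [c]):
            kept.append(c)       # "Coherent? YES!"
        else:
            diagnosis.add(c)     # "Coherent? NO!" -> drop c, "Now it is!"
    return diagnosis


# Toy setup (all names and conflict sets are made up for illustration).
conflicts = [{"c2", "c4"}, {"c4", "c5"}, {"c6", "c8"}]
calls = {"n": 0}


def counting_oracle(alignment: List[str]) -> bool:
    calls["n"] += 1
    present = set(alignment)
    return not any(c <= present for c in conflicts)


conf = {f"c{i}": 1.0 - 0.05 * i for i in range(1, 11)}
result = brute_force_lod(conf, counting_oracle)
print(sorted(result), calls["n"])
```

For an alignment of size n the pass issues exactly n coherence checks, which is the count mentioned in the note above.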

  19. "Pattern-based" reasoning
     • Idea: use an incomplete method for incoherence detection in A' ⊆ A.
     • Classify O1 and O2 once, then check for each pair of correspondences in A' whether a certain pattern occurs.
     • If the pattern occurs for some pair in an alignment A', then A' is incoherent.
     • If no pattern occurs, A' can nevertheless be incoherent!
     [figure: ontologies Oi and Oj]
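
A pairwise pattern check of this kind might look as follows. The slides do not spell out the patterns at this point, so the pattern used here is an assumption chosen for illustration: for two equivalence correspondences A ≡ B and C ≡ D, if O1 entails A ⊑ C and O2 entails that B and D are disjoint, then A is unsatisfiable in the merged ontology. The precomputed sets `subsumed_in_o1` and `disjoint_in_o2` stand for the result of classifying each ontology once; all entity names are hypothetical.

```python
from itertools import permutations
from typing import FrozenSet, List, Set, Tuple

Correspondence = Tuple[str, str]  # (entity in O1, entity in O2), read as equivalence


def pattern_incoherent(alignment: List[Correspondence],
                       subsumed_in_o1: Set[Tuple[str, str]],
                       disjoint_in_o2: Set[FrozenSet[str]]) -> bool:
    """Sound but incomplete: True means incoherent, False means 'no pattern found'."""
    for (a, b), (c, d) in permutations(alignment, 2):
        # A ⊑ C in O1, A ≡ B, C ≡ D, and B disjoint from D in O2 => A is unsatisfiable.
        if (a, c) in subsumed_in_o1 and frozenset((b, d)) in disjoint_in_o2:
            return True
    return False


# Hypothetical classification results and alignment.
subsumed_in_o1 = {("1#Reviewer", "1#Person")}
disjoint_in_o2 = {frozenset(("2#Document", "2#Person"))}
alignment = [("1#Reviewer", "2#Document"), ("1#Person", "2#Person")]
print(pattern_incoherent(alignment, subsumed_in_o1, disjoint_in_o2))
```

The check only looks up precomputed facts, so no reasoning in O1 ∪A' O2 is needed; a result of False must not be read as a proof of coherence.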

  20. That doesn't work …
     • Use the efficient coherence test instead of complete reasoning in the algorithm described above.
     • Reasoning about A' ⊆ A then does not require reasoning in O1 ∪A' O2; it is replaced by iterating over all pairs in A'.
     • However: the resulting alignment might still be incoherent, and ∆ is not a LOD.
     • Missing one MIPS might result in a chain of incorrect follow-up decisions!
     • Thus, removing the missed MIPS afterwards does not work!
     • How can the efficient method be exploited while still constructing a LOD?

  21. Algorithm 2: Example [figure: correspondences 1 to 10; legend: MIPS detectable by efficient method; MIPS only detectable by complete method; MIPS resolved due to removal of a correspondence]

  22. Algorithm 2: Example
     • Run the brute-force (BF) algorithm with efficient reasoning. Still incoherent?
     • Verification step: use binary search to detect the correspondence k such that A[0 … k-1] is coherent and A[0 … k] is incoherent.
     • Safe part: efficient reasoning did not fail up to k (here k = 8).
     • Incorrect part: recompute!

  23. Algorithm 2: Example
     • Run the main algorithm again with efficient reasoning for A[k+1 … n], where ∆1-k ∪ A[k] for A[1 … k] is a fixed part of the resulting diagnosis.
     • Still incoherent? If yes, we have k_new > k_old; repeat the same verification step.
     [figure: A[1 … k] and A[k+1 … n]]

  24. Algorithm 2: Example. The final result is a LOD.
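
The example above can be sketched as a loop that alternates a cheap greedy pass with a verification step. This is a reconstruction from the slides under stated assumptions, not the authors' implementation: `fast_incoherent` is a sound but incomplete test (it never reports a coherent alignment as incoherent), `is_coherent` is the complete test, and both are hypothetical callbacks backed here by toy conflict sets. The binary search relies on incoherence being monotone along the confidence-ordered sequence.

```python
from typing import Callable, Dict, List, Set


def efficient_lod(conf: Dict[str, float],
                  fast_incoherent: Callable[[List[str]], bool],
                  is_coherent: Callable[[List[str]], bool]) -> Set[str]:
    order = sorted(conf, key=conf.get, reverse=True)
    pos = {c: i for i, c in enumerate(order)}
    kept: List[str] = []      # correspondences currently retained
    removed: Set[str] = set()  # diagnosis built so far
    start = 0                  # decisions before this index are verified
    while True:
        # Greedy pass using only the efficient (incomplete) test.
        for c in order[start:]:
            if fast_incoherent(kept + [c]):
                removed.add(c)
            else:
                kept.append(c)
        # Verification with complete reasoning.
        if is_coherent(kept):
            return removed
        # Binary search: smallest k with kept[:k] coherent and kept[:k+1] incoherent.
        lo, hi = 0, len(kept) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if is_coherent(kept[:mid + 1]):
                lo = mid + 1
            else:
                hi = mid
        culprit = kept[lo]
        start = pos[culprit] + 1
        # Keep the safe part, fix the culprit as removed, recompute the rest.
        removed = {c for c in removed if pos[c] < start} | {culprit}
        kept = kept[:lo]


def make_oracle(conflicts):
    """Toy coherence test: coherent iff no listed conflict set is contained."""
    sets = [frozenset(c) for c in conflicts]
    return lambda alignment: not any(s <= set(alignment) for s in sets)


# Hypothetical data: {a, c} is a MIPS the efficient method cannot see.
conf = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.5}
complete = make_oracle([{"a", "b"}, {"a", "c"}, {"c", "d"}])
fast = make_oracle([{"a", "b"}, {"c", "d"}])
print(sorted(efficient_lod(conf, lambda al: not fast(al), complete)))
```

In this toy run the first pass wrongly keeps c and therefore wrongly removes d, which is the chain of incorrect follow-up decisions described earlier; the verification step locates c, and the second pass restores d.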

  25. Runtime Considerations (Theory)
     • n = size of alignment A
     • m = number of times the binary search is applied
     • The more complete the pattern-based reasoning is, the fewer verification steps/iterations are necessary.
     • The runtime of the pattern-based reasoning itself hardly matters for the overall runtime!
     • Runtime comparison:
     • Brute-force LOD: O(n)
     • Efficient LOD: O(log(n) * m)
     • Do we have m << n?
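
A small worked instance may help to read the comparison. It assumes that both bounds count calls to complete reasoning and uses made-up values n = 100 and m = 3; the one additional complete check per iteration that answers "still incoherent?" is left out of the estimate.

```latex
\begin{align*}
\text{Brute-force LOD:} \quad & n = 100 \ \text{complete coherence checks} \\
\text{Efficient LOD:} \quad & m \cdot \lceil \log_2 n \rceil = 3 \cdot 7 = 21 \ \text{complete coherence checks}
\end{align*}
```

Under these assumptions the efficient variant pays off whenever m stays small relative to n / log n.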

  26. Results: Runtime
     • Based on experiments with OAEI conference ontologies and submissions from 2007/08
     • Expressivity: SHIN(D), ELI(D), SIF(D), ALCIF(D)
     • Four different state-of-the-art matching systems [table with columns n and m not preserved]
     • Better results for benchmark datasets: 5 to 10 times faster

  27. Results: Quality of Diagnosis
     • Removing the LOD results in an alignment with increased precision and slightly decreased recall => slightly increased f-measure.
     • For alignments with low precision, the positive effects are very strong.
     • In rare cases, an incorrect correspondence annotated with high confidence has negative effects.

  28. Summary
     • Algorithm 1: algorithm for computing a LOD
     • Without computing MIPS or MUPS!
     • Algorithm 2: general approach for improving algorithms of type 1
     • Shown for the natural interpretation of correspondences as axioms and a specific type of incomplete reasoning
     • In principle applicable to any semantics for which a similar efficient reasoning approach can be found!
     • Good results for natural interpretation + pattern-based reasoning: between 2 and 10 times faster!

  29. Thanks for your attention. Questions?

  30. Back-Up Slides

  31. Property Pattern Example [figure; recoverable content below]
     • O1: ∃reviewOfPaper.⊤ ⊑ Review ⊑ Document
     • O2: ∃readPaper.⊤ ⊑ Reviewer; Reviewer ⊑ Person; Document ⊑ ¬Person
     • Correspondences: readPaper ≡ reviewOfPaper; Document ≡ Document
     • Figure labels: ∃readPaper.⊤, ∃reviewOfPaper.⊤, Document, "disjoint" (twice)
