
Ontology mapping needs context & approximation

Frank van Harmelen, Vrije Universiteit Amsterdam. Or: how to make ontology mapping less like database integration and more like a social conversation. Three obvious intuitions, starting with: ontology mapping needs background knowledge.


Presentation Transcript


  1. Ontology mapping needs context & approximation • Frank van Harmelen • Vrije Universiteit Amsterdam

  2. Or: • How to make ontology mapping less like database integration • and more like a social conversation

  3. Three obvious intuitions • Ontology mapping needs background knowledge • The Semantic Web needs ontology mapping • Ontology mapping needs approximation

  4. Which Semantic Web? • Version 1: “Semantic Web as Web of Data” (TBL) • recipe: expose databases on the web, use RDF, integrate • meta-data from: • expressing DB schema semantics in machine-interpretable ways • enable integration and unexpected re-use

  5. Which Semantic Web? • Version 2: “Enrichment of the current Web” • recipe: annotate, classify, index • meta-data from: • automatically producing markup: named-entity recognition, concept extraction, tagging, etc. • enable personalisation, search, browse, ...

  6. Which Semantic Web? • Version 1: “Semantic Web as Web of Data” (data-oriented) • Version 2: “Enrichment of the current Web” (user-oriented) • Different use-cases • Different techniques • Different users

  7. Which Semantic Web? • Version 1: “Semantic Web as Web of Data” • Version 2: “Enrichment of the current Web” • But both need ontologies for semantic agreement: between sources, and between source & user

  8. Ontology research is almost done... • we know what they are: “consensual, formalised models of a domain” • we know how to make and maintain them (methods, tools, experience) • we know how to deploy them (search, personalisation, data-integration, …) • Main remaining open questions: • Automatic construction (learning) • Automatic mapping (integration)

  9. Three obvious intuitions • The Semantic Web needs ontology mapping • Ontology mapping needs background knowledge (example: Ph.D. student ?= AIO) • Ontology mapping needs approximation (example: young researcher ?≈ post-doc)

  10. This work with Zharko Aleksovski & Michel Klein

  11. Does context knowledge help mapping?

  12. The general idea • [Diagram: source and target are each linked by anchoring to background knowledge; inference within the background knowledge yields the mapping from source to target]

  13. A realistic example • Two Amsterdam hospitals (OLVG, AMC) • Two Intensive Care Units, different vocabularies • Want to compare quality of care • OLVG-1400: • 1400 terms in a flat list • used in the first 24 hours of stay • some implicit hierarchy (e.g. 6 types of Diabetes Mellitus) • some redundancy (spelling mistakes) • AMC: similar list, but from a different hospital

  14. Context ontology used • DICE: • 2500 concepts (5000 terms), 4500 links • Formalised in DL • five main categories: • tractus (e.g. nervous_system, respiratory_system) • aetiology (e.g. virus, poisoning) • abnormality (e.g. fracture, tumor) • action (e.g. biopsy, observation, removal) • anatomic_location (e.g. lungs, skin)

  15. Baseline: Linguistic methods • Combine lexical analysis with hierarchical structure • 313 suggested matches, around 70 % correct • 209 suggested matches, around 90 % correct • High precision, low recall (“the easy cases”)

  16. Now use background knowledge • [Diagram: OLVG (1400, flat) as source and AMC (1400, flat) as target are each linked by anchoring to DICE (2500 concepts, 4500 links); inference within DICE yields the mapping]

  17. Example found with context knowledge (beyond lexical)

  18. Example 2

  19. Anchoring strength • Anchoring = substring + trivial morphology
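The slide defines anchoring only as "substring + trivial morphology". A minimal sketch of what such a matcher could look like is given below. The tokenisation, the plural-stripping rule, and the function names `normalise` and `anchors` are assumptions made for illustration; they are not taken from the talk.

```python
import re


def normalise(term: str) -> list[str]:
    """Lower-case, split on non-letters, strip a trailing 's' from longer tokens.

    This is a deliberately crude stand-in for "trivial morphology".
    """
    tokens = re.findall(r"[a-z]+", term.lower())
    return [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens]


def anchors(term: str, concepts: list[str]) -> list[str]:
    """Return the background concepts whose normalised name occurs as a
    contiguous run of tokens inside the normalised term (token-level substring)."""
    term_tokens = normalise(term)
    hits = []
    for concept in concepts:
        c = normalise(concept)
        n = len(c)
        if n and any(term_tokens[i:i + n] == c
                     for i in range(len(term_tokens) - n + 1)):
            hits.append(concept)
    return hits


# Hypothetical ICU term anchored against a few DICE-style concept names.
print(anchors("Fractures of the skin", ["fracture", "skin", "lungs"]))
```

Matching on tokens rather than raw characters avoids spurious hits such as "skin" inside "skinny", at the cost of missing compounds written as one word.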

  20. Experimental results • Source & target = flat lists of ±1400 ICU terms each • Background = DICE (2300 concepts in DL) • Manual Gold Standard (n=200)

  21. Does more context knowledge help?

  22. Adding more context • Only lexical • DICE (2500 concepts) • MeSH (22000 concepts) • ICD-10 (11000 concepts) • Anchoring strength:

  23. Results with multiple ontologies • Monotonic improvement • Independent of order • Linear increase of cost

  24. does structured context knowledge help?

  25. Exploiting structure • CRISP: 700 concepts, broader-than • MeSH: 1475 concepts, broader-than • FMA: 75.000 concepts, 160 relation-types (we used: is-a & part-of) • [Diagram: CRISP (738) as source and MeSH (1475) as target are each linked by anchoring to FMA (75.000); inference within FMA yields the mapping]

  26. Using the structure or not? • $(S <_a B) \land (B < B') \land (B' <_a T) \rightarrow (S <_i T)$

  27. Using the structure or not? • $(S <_a B) \land (B < B') \land (B' <_a T) \rightarrow (S <_i T)$ • No use of structure • Only stated is-a & part-of • Transitive chains of is-a, and transitive chains of part-of • Transitive chains of is-a and part-of • One chain of part-of before one chain of is-a
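The inference rule on this slide can be sketched as a graph search: a source term S anchored to a background concept B is mapped to a target term T whenever B reaches some B' through a chain of broader-than edges and B' is anchored to T. The sketch below assumes a single merged relation (the "transitive chains of is-a and part-of" variant) and also accepts a zero-length chain (B = B'). The function names and the toy data are hypothetical.

```python
def reachable(start, broader):
    """All concepts reachable from `start` via zero or more broader-than edges."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for parent in broader.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_mappings(source_anchors, target_anchors, broader):
    """Derive (S, T) pairs with S <_i T.

    source_anchors: source term S -> set of background concepts B   (S <_a B)
    target_anchors: background concept B' -> set of target terms T  (B' <_a T)
    broader:        background concept -> directly broader concepts (B < B')
    """
    inferred = set()
    for s, concepts in source_anchors.items():
        for b in concepts:
            for b2 in reachable(b, broader):
                for t in target_anchors.get(b2, ()):
                    inferred.add((s, t))
    return inferred


# Toy background hierarchy (invented for illustration, not taken from FMA).
broader = {"aorta": {"artery"}, "artery": {"blood_vessel"}}
source_anchors = {"aortic wall": {"aorta"}}
target_anchors = {"blood_vessel": {"Blood Vessels"}}
print(infer_mappings(source_anchors, target_anchors, broader))
```

The other variants listed on the slide would restrict which edges `reachable` may follow, for example only directly stated edges, or part-of edges first and is-a edges afterwards.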

  28. Examples

  29. Examples

  30. Matching results (CRISP to MeSH) (Gold Standard, n=30)

  31. Three obvious intuitions • The Semantic Web needs ontology mapping • Ontology mapping needs background knowledge • Ontology mapping needs approximation (example: young researcher ?≈ post-doc)

  32. This work with Zharko Aleksovski, Risto Gligorov, Warner ten Kate

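  33. Approximating subsumptions (and hence mappings) • query: $A \sqsubseteq B$ ? • $B = B_1 \sqcap B_2 \sqcap B_3$, so: $A \sqsubseteq B_1$, $A \sqsubseteq B_2$, $A \sqsubseteq B_3$ ? • [Diagram labels: A, B, B1, B2, B3]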

  34. Approximating subsumptions • Use “Google distance” to decide which subproblems are reasonable to focus on • Google distance (formula shown on the slide), where $f(x)$ is the number of Google hits for $x$, $f(x,y)$ is the number of Google hits for the pair of search terms $x$ and $y$, and $M$ is the number of web pages indexed by Google • symmetric conditional probability of co-occurrence • estimate of semantic distance • estimate of “contribution” to $B_1 \sqcap B_2 \sqcap B_3$
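The formula on this slide is not preserved in the transcript. The standard definition of the Normalized Google Distance (Cilibrasi and Vitányi), which uses exactly the quantities listed, is NGD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log M - min{log f(x), log f(y)}). That this is the exact form shown on the slide is an assumption. A minimal sketch, with hit counts passed in as plain numbers rather than fetched from a search engine:

```python
import math


def ngd(fx: float, fy: float, fxy: float, m: float) -> float:
    """Normalized Google Distance from hit counts.

    fx, fy: hits for x and for y; fxy: hits for x and y together;
    m: number of indexed pages. Returns infinity when any count is zero,
    since the logarithm is then undefined (a convention chosen here).
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(m) - min(lx, ly))


# Invented counts: two terms that always co-occur have distance 0.
print(ngd(1000, 1000, 1000, 10**6))
```

Terms that always occur together get distance 0, and the distance grows as co-occurrence becomes rarer relative to the individual frequencies.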

  35. Google distance • [Figure: example terms animal, plant, sheep, cow, vegetarian, madcow]

  36. Google for sloppy matching • Algorithm for $A \sqsubseteq B$ (with $B = B_1 \sqcap B_2 \sqcap B_3$) • determine $NGD(B, B_i) = \sigma_i$, $i = 1,2,3$ • incrementally: • increase sloppiness threshold $\epsilon$ • allow to ignore $A \sqsubseteq B_i$ with $\sigma_i \le \epsilon$ • match if remaining $A \sqsubseteq B_j$ hold
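The algorithm above can be sketched as follows. The sketch assumes that the outcome of each subsumption test against a conjunct and each distance score are already available as dictionaries; the names `sloppy_match` and `smallest_threshold` and the example values are hypothetical.

```python
def sloppy_match(holds, sigma, epsilon):
    """Decide A ⊑ B approximately, where B is a conjunction of conjuncts B_i.

    holds:   conjunct name -> True if A ⊑ B_i holds classically
    sigma:   conjunct name -> score for that conjunct, e.g. NGD(B, B_i)
    epsilon: sloppiness threshold; conjuncts with sigma <= epsilon are ignored
    """
    remaining = [b for b in holds if sigma[b] > epsilon]
    return all(holds[b] for b in remaining)


def smallest_threshold(holds, sigma, thresholds):
    """Raise the threshold step by step; return the first value that yields a match."""
    for eps in sorted(thresholds):
        if sloppy_match(holds, sigma, eps):
            return eps
    return None


# Invented example: A is subsumed by B1 and B3 but not by B2.
holds = {"B1": True, "B2": False, "B3": True}
sigma = {"B1": 0.6, "B2": 0.2, "B3": 0.8}
print(smallest_threshold(holds, sigma, [0, 0.1, 0.2, 0.5, 1]))
```

Because raising the threshold can only remove conjuncts from the check, the set of accepted matches can only grow as the threshold increases.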

  37. Properties of sloppy matching • When the sloppiness threshold $\epsilon$ goes up, the set of matches grows monotonically • $\epsilon = 0$: classical matching • $\epsilon = 1$: trivial matching • Ideally: compute $\sigma_i$ such that: • desirable matches become true at low $\epsilon$ • undesirable matches become true only at high $\epsilon$ • Use random selection of $B_i$ as baseline?

  38. Experiments in music domain • very sloppy terms ⇒ good • Sources: ArtistGigs, CDNow (Amazon.com), All Music Guide, MusicMoz, Yahoo, CD baby, Artist Direct Network • Sizes shown: 403, 1073, 222, 2410, 382, 96 and 465 classes • Depths shown: 2, 3, 7, 5, 4, 2 and 2 levels

  39. Experiment • Manual Gold Standard, N=50, random pairs • [Plot: precision and recall for classical, random and NGD-based matching; labels recovered: $\epsilon = 0.53$, 97, $\epsilon = 0.5$, 60, 20] • 16-05-2006

  40. wrapping up

  41. Three obvious intuitions • Ontology mapping needs background knowledge • The Semantic Web needs ontology mapping • Ontology mapping needs approximation

  42. So that • shared context & approximation make ontology mapping a bit more like a social conversation

  43. Future: Distributed/P2P setting • [Diagram: source and target are each linked by anchoring to background knowledge; inference within the background knowledge yields the mapping from source to target]

  44. Frank.van.Harmelen@cs.vu.nl • http://www.cs.vu.nl/~frankh • Questions & discussion
