
Integrating ontological and linguistic knowledge for Conceptual Information Extraction



Presentation Transcript


  1. Integrating ontological and linguistic knowledge for Conceptual Information Extraction. Roberto Basili, Michele Vindigni, Fabio Massimo Zanzotto, Università di Roma “Tor Vergata”, Italy

  2. Motivation. [Figure labels: professor, course, teacherOf.] professor(“XYZ”), course(“Database Theory”), teacherOf(professor(“XYZ”), course(“Database Theory”))

  3. Motivation • Linguistic interfaces help end users deal with standardised conceptualisations (or ontologies) • Information Extraction systems may be used for this purpose. These require conceptualisations with: • high domain specificity • high coverage • High-coverage, domain-specific conceptualisations are difficult to build, so reuse of pre-existing knowledge is a must.

  4. Reusing conceptualisations... Let us then take a Domain Concept Hierarchy, e.g. Medical Subject Headings (MeSH), and a concept: Dendrite. Paths in the hierarchy: Dendrite → Neuron → Nervous System; Dendrite → Cell Surface Extension → Cellular Structure → Cell; Dendrite → Neuron → Cell. Example sentences: “None of the dendrites were cut.” “...but the dendrites and axons are often cut.” “Researchers don't know why, but for some reason, the ends of dendrites tangle and knot.” [Figure labels: fiber, nerve fiber, dendrite.]

  5. Target Problem • Integration of linguistic information (Lexical Knowledge Base, LKB) with domain knowledge (Domain Concept Hierarchy, DCH) • Need to harmonise linguistic processing (i.e. feature detection in text) with the available resource (DCH) • Need to annotate texts with semantic information, that is, to build a linguistic interface to the DCH

  6. Target problem [Figure: Domain Concept Hierarchy and Lexical Knowledge Base]

  7. Inspiring Principles (P1): Extensional Nature of the Domain Concept Hierarchy. Subsumption in the DCH has an extensional interpretation in the LKB. [Figure: Domain Concept Hierarchy and Lexical Knowledge Base, with LKB nodes a1 to a5]

  8. Inspiring Principles (P2): Intensional Strength in the Lexical Knowledge Base. Given a set of words W whose senses are subsumed by a node a in the LKB, the intensional strength measures the trade-off between • the generalization required to model all the words • the capability of separating individual word senses in W

  9. Intensional Strength in LKB: Conceptual Density. CD = (Mean Tree Area of n Words) / (Actual Tree Area). [Figure: LKB tree with nodes numbered 1 to 15 and the words Word1, Word2, Word3]
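
The ratio on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the "mean tree area" for n marked senses is approximated as the sum of `nhyp ** i` for i from 0 to n-1, where `nhyp` is taken to be the mean number of children per internal node of the subtree, and the "actual tree area" is the number of nodes in the subtree. The exact published formula may differ; the function names and the toy tree are hypothetical.

```python
from typing import Dict, List, Set

Tree = Dict[str, List[str]]  # node -> direct children (hyponyms)


def subtree(tree: Tree, root: str) -> List[str]:
    """Return every node in the subtree rooted at `root`, root included."""
    nodes, stack = [], [root]
    while stack:
        node = stack.pop()
        nodes.append(node)
        stack.extend(tree.get(node, []))
    return nodes


def conceptual_density(tree: Tree, root: str, senses: Set[str]) -> float:
    """Expected area for the marked senses divided by the actual area.

    Assumption: expected area = sum(nhyp ** i for i in range(n)), with
    nhyp = mean number of children per internal node under `root`.
    """
    nodes = subtree(tree, root)
    n = sum(1 for node in nodes if node in senses)
    if n == 0:
        return 0.0
    internal = [node for node in nodes if tree.get(node)]
    nhyp = (sum(len(tree[node]) for node in internal) / len(internal)
            if internal else 1.0)
    expected_area = sum(nhyp ** i for i in range(n))
    return expected_area / len(nodes)


# Hypothetical toy tree: node "2" covers both marked senses tightly,
# node "1" covers them only through a larger subtree.
toy_tree = {"1": ["2", "3"], "2": ["4", "5"], "3": ["6", "7"]}
marked = {"4", "5"}

dense = conceptual_density(toy_tree, "2", marked)
sparse = conceptual_density(toy_tree, "1", marked)
print(dense > sparse)
```

In this toy tree the smaller subtree scores higher, which is the trade-off P2 describes: a node should generalise over the words without swallowing unrelated parts of the LKB.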

  10. Mapping Algorithm: preliminary definitions. Extension of C: ext(C) = { t_C' in DCH | C subsumes C' in DCH }. Linguistic Generalisation of C: lgen(C) = { a in LKB | t in ext(C) and a_t is subsumed by a in LKB }
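
The two definitions can be sketched in Python. This is a minimal illustration with several assumptions: each DCH concept is identified with its own term, subsumption is treated as reflexive (a concept subsumes itself, a sense is subsumed by itself), and the toy hierarchies and sense identifiers such as `dendrite#1` are hypothetical, not real MeSH or WordNet entries.

```python
from typing import Dict, List, Set


def reachable(edges: Dict[str, List[str]], start: str) -> Set[str]:
    """`start` plus every node reachable from it by following `edges`."""
    found, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(edges.get(node, []))
    return found


def ext(dch_children: Dict[str, List[str]], concept: str) -> Set[str]:
    """ext(C): terms of C and of every concept that C subsumes in the DCH."""
    return reachable(dch_children, concept)


def lgen(dch_children: Dict[str, List[str]],
         lkb_parents: Dict[str, List[str]],
         senses_of: Dict[str, List[str]],
         concept: str) -> Set[str]:
    """lgen(C): LKB nodes that subsume a sense of some term in ext(C)."""
    result: Set[str] = set()
    for term in ext(dch_children, concept):
        for sense in senses_of.get(term, []):
            result |= reachable(lkb_parents, sense)
    return result


# Hypothetical toy data (not real MeSH or WordNet identifiers).
dch_children = {"Neuron": ["Dendrite", "Axon"]}
senses_of = {
    "Neuron": ["neuron#1"],
    "Dendrite": ["dendrite#1"],
    "Axon": ["axon#1"],
}
lkb_parents = {
    "dendrite#1": ["nerve_fiber#1"],
    "axon#1": ["nerve_fiber#1"],
    "nerve_fiber#1": ["fiber#1"],
    "neuron#1": ["cell#1"],
}

print(sorted(ext(dch_children, "Neuron")))
print(sorted(lgen(dch_children, lkb_parents, senses_of, "Dendrite")))
```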

  11. Mapping Algorithm, based on P1 (Extensional Nature of DCH) and P2 (Intensional Strength of LKB). merge(DCH, LKB, T): for each C in T: Step 1: Determine the linguistic generalisations lgen(C) for the part of the DCH made of all descendants of C. Step 2: Compute the optimal mapping G(C) ⊆ lgen(C) by a greedy selection maximizing the conceptual density. Step 3: Attach t_C to the senses in G(C). Step 4: For each t in ext(C), attach t to a sense of t in the LKB iff some node in G(C) subsumes that sense in the LKB.
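
Steps 2 and 4 can be sketched as below. The slide does not spell out the greedy strategy, so this is one possible reading, stated as an assumption: candidates in lgen(C) are visited by decreasing conceptual density, and a candidate is kept only if it covers a term that is not yet covered. The densities, the coverage sets, and the names `greedy_mapping` and `attach` are hypothetical; the node labels loosely follow the a4, a5, a6 and t1 to t4 of the slides.

```python
from typing import Dict, List, Set


def greedy_mapping(density: Dict[str, float],
                   covers: Dict[str, Set[str]]) -> List[str]:
    """Step 2 (assumed reading): pick LKB nodes by decreasing density
    until every coverable term of ext(C) is covered."""
    uncovered: Set[str] = set().union(*covers.values())
    chosen: List[str] = []
    for node in sorted(density, key=density.get, reverse=True):
        if not uncovered:
            break
        if covers.get(node, set()) & uncovered:
            chosen.append(node)
            uncovered -= covers[node]
    return chosen


def attach(chosen: List[str],
           senses_of: Dict[str, List[str]],
           subsumers: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Step 4: keep, for each term, the senses subsumed by a chosen node."""
    selected = set(chosen)
    return {
        term: [s for s in senses if subsumers.get(s, set()) & selected]
        for term, senses in senses_of.items()
    }


# Hypothetical densities and coverage for candidates in lgen(t).
density = {"a6": 0.3, "a5": 0.9, "a4": 0.8}
covers = {
    "a6": {"t1", "t2", "t3", "t4"},
    "a5": {"t1", "t2"},
    "a4": {"t3", "t4"},
}
mapping = greedy_mapping(density, covers)

# Hypothetical senses: t1 has one sense under a5 and one unrelated sense.
senses_of = {"t1": ["t1#1", "t1#2"], "t3": ["t3#1"]}
subsumers = {"t1#1": {"a5", "a6"}, "t1#2": {"a9"}, "t3#1": {"a4", "a6"}}
attached = attach(mapping, senses_of, subsumers)
print(mapping, attached)
```

With these toy values the two dense nodes are preferred over the more general a6, and the sense of t1 that lies outside G(t) is discarded, which is how the mapping doubles as sense selection.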

  12. Mapping Algorithm (example). Step 1: C = t, ext(t) = {t1, t2, t3, t4}, lgen(t) = { ,..., }. Step 2 (Conceptual Density): G(t) = {a5, a4}. [Figure: Domain Concept Hierarchy with nodes t, t1, t2, t3, t4; Lexical Knowledge Base with nodes a1 to a6]

  13. Mapping Algorithm (example, continued). Step 3: Attach t_C to senses in G(C). Step 4: for each t in ext(C), attach t to LKB. [Figure: Domain Concept Hierarchy with nodes t, t1, t2, t3, t4; Lexical Knowledge Base with nodes a1 to a6]

  14. A case study: mapping MeSH in WordNet • Medical Subject Headings (MeSH) as Domain Concept Hierarchy • WordNet as Lexical Knowledge Base

  15. Mapping results: an excerpt

  16. Summary • Target Problem: Mapping a DCH in a LKB for Information Extraction • Solution: • Inspiring Principles: • Extensional Nature of DCH • Intensional Strength in LKB • The notion of conceptual density (Agirre & Rigau, 1996) • A novel mapping algorithm between DCH and LKB • A case study: MeSH in WordNet

  17. Conclusions and future work • If a Domain Concept Hierarchy is available, the presented method is a viable solution to integrate it in WordNet. But what if it is not available? How can taxonomical relations be learned from text collections? Moreover, how can different kinds of relations between concepts be induced?
