
Oana Adriana Şoica

Building and Ordering a SenDiS Lexicon Network. Oana Adriana Şoica. SenDiS operates on a specific lexicon network (LexNet) based on “sense tagged glosses” relations; lexicon networks can also be obtained from other semantic/lexical relations.


Presentation Transcript


  1. Building and Ordering a SenDiS Lexicon Network Oana Adriana Şoica

  2. Outline
     • SenDiS operates on a specific lexicon network (LexNet)
       - “sense tagged glosses” relations
       - lexicon networks obtained from other semantic/lexical relations
     • Obtaining a SenDiS LexNet:
       - build a “sense tagged glosses” LexNet (manually annotate the lexicon with a specific tool)
       - import a “sense tagged glosses” LexNet (WordNet tagged glosses, as of 2008)
     • Preprocessing (ordering) the SenDiS LexNet (before WSD)
       - truncation of the LexNet
       - leveling the LexNet

  3. Semantic/Lexical Relations • synonyms • antonyms • hypernyms • hyponyms • holonyms • meronyms • coordinate terms • troponyms • entailment • similar to • has part

  4. Semantic/Lexical relations: WordNet. An excerpt of the WordNet semantic network (source: Navigli, R. 2009. Word sense disambiguation: A survey. ACM Comput. Surv. 41, 2, Article 10)

  5. Semantic/Lexical relations: GRAALAN

  6. Obtaining a SenDiS LexNet
     • manually annotating the glosses from a lexicon (using a specific tool that can ease the process)
     • importing an existing “gloss tagged” lexicon net (also obtained manually or semi-automatically); this usually translates into a dependency on a specific list of meanings/glosses

  7. Creating the SenDiS LexNet
     • implied a significant effort, usually measured in months, involving several trained linguists
     • using a specialized collaborative tool (BuildLNTool, the Build Lexicon Network Tool)
     • enriching the “gloss tagged” relation with three relative degrees of importance (in the gloss context): weak, medium, strong; or ignoring the gloss word
     • SenDiS objective, two LexNets:
       - “gloss tagged” LexNet for the Romanian language
       - “gloss tagged” LexNet for the English language

  8. BuildLNTool
     BuildLNTool (Build Lexicon Network Tool) provides:
     • a visual and effective mechanism to manually annotate the lexicon glosses
     • a synchronized overview of the already created relations
     • a browsing mechanism for inspecting the already tagged glosses and relations

  9. BuildLNTool – Sections • “Lemma/MWE Info” • “Lemmas & MWEs” • “Competence & Definition Trees” • Messages and progress • “Root & Leaf Meanings”

  10. BuildLNTool – Sections II
     • “Lemmas & MWEs”: list of lexicon entries
     • “Root & Leaf Meanings”: list of roots and leaves for the lexicon network
     • “Lemma/MWE Info”: current lexicon entry being analyzed
     • “Competence & Definition Trees”: spanning trees for a given meaning over the current lexicon net
     • section for messages and progress

  11. BuildLNTool – Lemmas & MWEs • selection of lexicon entry type • selection of viewing interval • selection of unfinished lexicon entries • filter text • filter • lexicon entry text • lexicon entry status

  12. BuildLNTool – Selection of a current lexicon entry (double click)

  13. BuildLNTool – Browsing the meanings of the current lexicon entry • lexicon entry text • morphologic interpretation • list of meanings • filters • meaning/gloss fully tagged • meaning/gloss partially tagged • meaning/gloss not tagged

  14. BuildLNTool – Selection of a current meaning for tagging (double click)

  15. BuildLNTool – Gloss constituent without interpretations • unrecognized gloss constituent • ‘Enter’

  16. BuildLNTool – Degrees of relevance (in gloss context). Default setting: Medium

  17. BuildLNTool – Degrees of relevance II • ‘Strong’ tokens • ‘Medium’ tokens • ‘Weak’ tokens • Ignored (X) tokens

  18. BuildLNTool – Gloss tagging • Saved annotations • Unsaved annotations

  19. BuildLNTool – Gloss tagging protocol
     • view of meaning tagging tree
     • selection of constituent / group of gloss constituents
     • edit text of gloss constituent
     • set/modify relevance degree
     • without sense interpretations
     • current gloss constituent
     • select/modify the sense for the gloss constituent
     • further annotate meaning / save annotations
     • further on, choose the next meaning
     • save annotations

  20. Built LexNets for Romanian and English

  21. Imported WordNet tagged glosses
     • WordNet (3.0) is organized in synsets:
       - 117,659 synsets
       - 155,287 words (lexicon entries)
       - 206,941 word-sense pairs (gloss + usage examples)
     • the synsets were split and transformed into a classical lexicon format
     • the lexicon network imported:

  22. Ordering a SenDiS LexNet
     • “gloss tagged” lexicon nets are large and dense graphs
       - between 100,000 and 200,000 vertices
       - over 1,000,000 edges/arcs
     • to ease operations on such graphs, “gloss tagged” lexicon nets can be preprocessed and optimized
       - truncation of a lexicon net
       - leveling of a lexicon net
     • aims when optimizing a lexicon net
       - elimination of loops or strongly connected components
       - a minimum number of removed edges
       - leveling on a minimum number of levels
       - minimization/maximization of root/leaf vertices

  23. Unordered LexNet. A minimal lexicon net in the original form [figure: elements labeled e1 to e9]

  24. Ordered (leveled) LexNet. The same minimal lexicon net leveled [figure: elements labeled e1 to e11 arranged on levels 1 to 11, between B and V]

  25. Results on leveling experimental LexNets
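
The truncation and leveling steps named in the slides on ordering a LexNet can be illustrated with a minimal Python sketch. It is not the SenDiS implementation: the function names `truncate` and `level`, the toy graph, and the arc direction (a meaning points to the meanings used in its gloss) are all assumptions made for illustration. Truncation here drops the back edges found by a depth-first search, which removes every loop but does not guarantee the minimum number of removed edges. Leveling then puts leaf meanings on level 1 and every other meaning one level above the highest meaning it points to, which uses the fewest levels possible for the truncated graph.

```python
from collections import defaultdict, deque


def truncate(vertices, edges):
    """Drop DFS back edges so the graph becomes acyclic (heuristic, not minimal)."""
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in vertices}
    removed = []
    for root in vertices:
        if color[root] != WHITE:
            continue
        color[root] = GREY
        stack = [(root, iter(succ[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if color[nxt] == GREY:
                    # nxt is on the current DFS path: this arc closes a loop
                    removed.append((node, nxt))
                elif color[nxt] == WHITE:
                    color[nxt] = GREY
                    stack.append((nxt, iter(succ[nxt])))
                    break
            else:
                # all successors visited: node is finished
                color[node] = BLACK
                stack.pop()
    removed_set = set(removed)
    kept = [e for e in edges if e not in removed_set]
    return kept, removed


def level(vertices, edges):
    """Longest-path leveling of an acyclic graph: leaves get level 1."""
    pred = defaultdict(list)
    out_deg = {v: 0 for v in vertices}
    for u, v in edges:
        pred[v].append(u)
        out_deg[u] += 1
    levels = {v: 1 for v in vertices}
    queue = deque(v for v in vertices if out_deg[v] == 0)
    done = 0
    while queue:
        v = queue.popleft()
        done += 1
        for u in pred[v]:
            levels[u] = max(levels[u], levels[v] + 1)
            out_deg[u] -= 1
            if out_deg[u] == 0:
                queue.append(u)
    if done != len(vertices):
        raise ValueError("graph still contains a loop; truncate it first")
    return levels


# Hypothetical toy network: a -> b -> c -> a is a loop, d is a leaf meaning.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
kept, removed = truncate(vertices, edges)
levels = level(vertices, kept)
print(removed)  # [('c', 'a')]
print(levels)   # {'a': 4, 'b': 3, 'c': 2, 'd': 1}
```

In this sketch the leveling runs in time linear in the number of vertices and arcs, which matters for graphs of the size quoted in the slides. Finding a truly minimum set of arcs to remove is a much harder problem (minimum feedback arc set), so the depth-first heuristic above should be read only as a stand-in.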
