
Word sense disambiguation: Study on WordNet ontology

Word sense disambiguation: Study on WordNet ontology. Akilan Velmurugan, Computer Networks – CS 790G. Overview: What is WSD? How WordNet is analyzed as a complex network. What are the results? Project methodology. Area of study. Key findings/results. New approaches. Improvement techniques.


Presentation Transcript


  1. Word sense disambiguation: Study on WordNet ontology. Akilan Velmurugan, Computer Networks – CS 790G

  2. Overview • What is WSD? • How WordNet is analyzed as a complex network • What are the results? • Project methodology • Area of study • Key findings/results • New approaches • Improvement techniques • Conclusion

  3. Project Description • Objective • Study on WSD • Effects of WSD in word sense ontology • Characteristics of WordNet • Results • How to match words with other words • Parameters taken for the study of word sense • Improve them by making necessary changes • Study network characteristics

  4. WordNet - overview • Machine-readable semantic dictionary interlinked by semantic relations • Developed at Princeton University as a large lexical database for the English language • One of the most widely used linguistic resources • Free for public use (distributed under the WordNet license, not the GPL) • Forms a scale-free network with a small average shortest path, having words as nodes and concepts as links • Easily navigable

  5. WordNet (Structure) • Covers nouns, verbs, adjectives and adverbs, and shows relations between them such as • Synonym • Hypernym (the word is a kind of …) • Hyponym (… is a kind of the word) • Troponym (particular ways to …) • Meronym (parts of …) • About 25 relations in total • Also available for online navigation

  6. WordNet online - by Princeton University

  7. WordNet Browser

  8. WordNet (working) • WSD: • Corpus-based approaches • Set of samples that enables the system • Knowledge-based approaches • Machine-readable dictionary with relations • WordNet research • Open source • Ranking of synsets derived from word frequencies in the British National Corpus • Top 1000 • Content manipulation of text • Dataset I – controlled and calibrated study • Dataset II – collected using Mechanical Turk, using pairs

  9. Word Sense Disambiguation (WSD) • Task of determining the meaning of an ambiguous word in the given context • Bank • Edge of a river or • Financial institution that accepts money • Refers to the resolution of lexical semantic ambiguity and its goal is to attribute the correct senses to words (AI-complete problem)
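The "bank" example above can be made concrete with a very small knowledge-based disambiguator. This is a sketch of a simplified Lesk-style gloss overlap, a technique the slides do not name; the two glosses, the stopword list, and the function names are all hypothetical and hand-written for illustration, not read from the WordNet database.

```python
import re

# Hypothetical, hand-written glosses for two senses of "bank"
# (paraphrasing the slide; not taken from the WordNet database).
SENSES = {
    "bank/river": "sloping land at the edge of a river or stream",
    "bank/finance": "financial institution that accepts money deposits and makes loans",
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "at", "that", "to", "in", "on", "we", "i", "my"}


def tokens(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(context, senses=SENSES):
    """Return the sense whose gloss shares the most words with the context.

    Ties are resolved in favour of the sense listed first.
    """
    ctx = tokens(context)
    return max(senses, key=lambda s: len(ctx & tokens(senses[s])))


print(disambiguate("She deposits money at the bank every week"))      # bank/finance
print(disambiguate("We sat on the bank and watched the river flow"))  # bank/river
```

Counting shared words is crude, but it shows why a machine-readable dictionary is enough to start resolving an ambiguous word from its context.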

  10. WSD: Area of Research • Assigning the correct sense to words, using an electronic dictionary as the source of word definitions • Open research field in Natural Language Processing (NLP) • Hard problem that is a popular area for research • Used in speech synthesis by identifying the correct sense of the word

  11. JavaScript Visual WordNet

  12. Visual Thesaurus

  13. WordNet – Theoretical aspects • WordNet – word sense ontology • Symbols are words • Synset: list of words and semantic relations between them • Word sense disambiguation • WordNet structure using latent semantics • Variable lexical notation for a concept • Citibase – Thesaurus • Semantic relatedness • And a few others…

  14. WSD: using latent semantics • Measures the semantic distance of concepts • Relatedness and betweenness are calculated • Matrix form of the WordNet data structure is used • Can be integrated with other applications • Uses the Singular Value Decomposition (SVD) algorithm • Example: multiple synsets are • {car, gondola} • {car, railway car} • {car, automobile} • {Motor vehicle}, {Coupe}, {Sedan}, {Taxi}
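For reference, the SVD step named above is usually written as follows in latent semantic analysis. This is the textbook formulation, assumed here rather than taken from the slides; the matrix $A$ (for example a word-by-synset matrix), the rank $k$, and the row vectors $\mathbf{w}_i$ are illustrative symbols.

```latex
A = U \Sigma V^{\top}, \qquad
A_k = U_k \Sigma_k V_k^{\top} \quad (k \ll \operatorname{rank} A), \qquad
\operatorname{rel}(i, j) =
  \frac{\mathbf{w}_i \cdot \mathbf{w}_j}
       {\lVert \mathbf{w}_i \rVert \, \lVert \mathbf{w}_j \rVert},
\quad \mathbf{w}_i = \text{row } i \text{ of } U_k \Sigma_k
```

Under this reading, two senses of "car" that share few neighbours in the matrix end up with a low cosine score in the reduced space, which is one way to interpret "semantic distance".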

  15. MDS example [Figure: geodesic distance matrix S → MDS → k-means, giving the clusters {1, 2, 3, 4, 10, 12} and {5, 6, 7, 8, 9, 11, 13}] Source: Lecture 18, Community Structure, by Prof. Gunes
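The first stage of that pipeline, the geodesic distance matrix S, can be sketched with a breadth-first search from every node. The six-node graph below is hypothetical (it is not the graph from the lecture figure), and the MDS and k-means stages are left out because they would need a numerical library.

```python
from collections import deque


def geodesic_matrix(adj):
    """All-pairs shortest-path lengths (in hops) for an unweighted graph.

    adj maps each node to the list of its neighbours.
    Unreachable pairs keep the distance float("inf").
    """
    nodes = sorted(adj)
    dist = {u: {v: float("inf") for v in nodes} for u in nodes}
    for src in nodes:
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[src][v] == float("inf"):  # first visit is the shortest
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


# Hypothetical graph: two triangles (1-2-3 and 4-5-6) joined by the edge 3-4.
ADJ = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}

S = geodesic_matrix(ADJ)
print(S[1][6])  # 3 hops: 1 -> 3 -> 4 -> 6
```

Nodes inside one triangle are one hop apart and nodes in different triangles are two or three hops apart, which is the kind of gap a clustering step can pick up.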

  16. WSD: using latent semantics

  17. WSD: variable lexical notations for a concept • Generic concept notation: $D = I \cup J \cup K$, therefore $J = D - (I \cup K) = (D - I) \cap (D - K) = D \cap \overline{(I \cup K)}$, so $J = D \cap (\bar{I} \cap \bar{K})$. Since $B = D \cup E \cup F$, likewise $D = B - (E \cup F) = (B - E) \cap (B - F) = B \cap \overline{(E \cup F)}$, so $D = B \cap (\bar{E} \cap \bar{F})$. Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications

  18. WSD: variable lexical notations for a concept • $J = D \cap (\bar{I} \cap \bar{K}) = (B \cap (\bar{E} \cap \bar{F})) \cap (\bar{I} \cap \bar{K})$, so $J = B \cap ((\bar{E} \cap \bar{F}) \cap (\bar{I} \cap \bar{K}))$, where J = fly, D = fish lure, I = spinner, K = troll • Introducing Boolean operators: AND for $\cap$, OR for $\cup$, NOT for the overbar. Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications
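The set identities on these two slides can be checked mechanically with Python sets. The member names below are hypothetical placeholders, and the check assumes that the sibling concepts are pairwise disjoint and that complements are taken relative to B, which is what the subtraction steps in the derivation seem to rely on.

```python
def complement(universe, subset):
    """Complement of subset relative to the given universe."""
    return universe - subset


# Hypothetical toy members; only disjointness matters for the identity.
I = {"spinner-1", "spinner-2"}
J = {"fly-1", "fly-2"}
K = {"troll-1"}
E = {"ground-bait-1"}
F = {"stool-pigeon-1"}
D = I | J | K   # D = I ∪ J ∪ K
B = D | E | F   # B = D ∪ E ∪ F

# J = D − (I ∪ K)
step1 = D - (I | K)
# J = D ∩ (Ī ∩ K̄)
step2 = D & (complement(B, I) & complement(B, K))
# J = B ∩ ((Ē ∩ F̄) ∩ (Ī ∩ K̄))
step3 = B & ((complement(B, E) & complement(B, F))
             & (complement(B, I) & complement(B, K)))

print(step1 == step2 == step3 == J)  # True
```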

  19. WSD: variable lexical notations for a concept • (“fly”) becomes: (“fisherman's lure” OR “fish lure”) AND ((NOT “spinner”) AND (NOT “troll”)) • Then, with B = lure, E = ground bait, F = stool pigeon, (“fly”) becomes: (“bait” OR “decoy” OR “lure”) AND (((NOT “ground bait”) AND (NOT “stool pigeon”)) AND ((NOT “spinner”) AND (NOT “troll”))) Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications
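The first expansion of "fly" above follows a fixed pattern: OR together the labels of the parent concept, then AND in a NOT for every sibling. A minimal sketch of that pattern, with hypothetical helper names (`any_of`, `none_of`, `expand`) and the term lists typed in by hand rather than read from WordNet:

```python
def any_of(labels):
    """Join the alternative labels of one concept with OR."""
    return "(" + " OR ".join(f'"{label}"' for label in labels) + ")"


def none_of(labels):
    """Exclude every listed sibling concept: (NOT a) AND (NOT b) ..."""
    return "(" + " AND ".join(f'(NOT "{label}")' for label in labels) + ")"


def expand(parent_labels, sibling_labels):
    """Describe a concept as 'its parent, but none of its siblings'."""
    return f"{any_of(parent_labels)} AND {none_of(sibling_labels)}"


# Labels taken from the slide: the parent of "fly" and its two siblings.
query = expand(["fisherman's lure", "fish lure"], ["spinner", "troll"])
print(query)
# ("fisherman's lure" OR "fish lure") AND ((NOT "spinner") AND (NOT "troll"))
```

The second expansion on the slide repeats the same step one level up, replacing the parent labels with their own parent and adding that level's siblings to the exclusions.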

  20. Thesaurus as a complex network • As a directed graph • Sinks: the 73,046 terms with k_out = 0 • Sources (root words): the 30,260 terms with at least one outgoing link (k_out > 0) • Absolute source: no incoming links (k_in = 0) • Normal source: k_out > 0 and k_in > 0 • Bridge source: no outgoing links to root words (k_out(source) = 0) • Figure legend: 1 – normal source, 2 – bridge source, 3 – absolute source, 4 – sink Source: arXiv:cond-mat/0312586 v1 2003
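The four node types can be computed directly from in- and out-degrees. The sketch below is one reading of the definitions on the slide: the order in which the tests are applied (sink, then absolute source, then bridge source, then normal source) is an assumption, since the slide does not say how overlapping cases are resolved, and the five-term graph is hypothetical.

```python
def classify(graph):
    """Label every term of a directed thesaurus graph.

    graph maps a term to the list of terms it links to.
    Precedence of the tests below is an assumption of this sketch.
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    k_out = {n: len(set(graph.get(n, []))) for n in nodes}
    k_in = {n: 0 for n in nodes}
    for targets in graph.values():
        for v in set(targets):
            k_in[v] += 1
    sources = {n for n in nodes if k_out[n] > 0}  # root words

    labels = {}
    for n in nodes:
        if k_out[n] == 0:
            labels[n] = "sink"
        elif k_in[n] == 0:
            labels[n] = "absolute source"
        elif not any(v in sources for v in graph[n]):
            labels[n] = "bridge source"  # links only to sinks
        else:
            labels[n] = "normal source"
    return labels


# Hypothetical toy thesaurus.
GRAPH = {
    "start": ["big"],
    "big": ["large", "huge"],
    "large": ["big", "vast"],
    "huge": ["vast"],
}

labels = classify(GRAPH)
for term in sorted(labels):
    print(term, "->", labels[term])
```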

  21. WSD: Semantic relatedness and word sense disambiguation • Concepts that occur more frequently and closer to each other are “more related” than concepts that appear less frequently and farther apart Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications

  22. WordNet Relationship • Semantic relatedness • Involves relationships among words • car-wheel (meronym) • hot-cold (antonym) • pencil-paper (functional) • penguin-Antarctica (association) • bank-trust company (synonym) • Probability and distance calculation • Frequency of synsets or words • Performance in NLP applications

  23. WordNet Relationship Browser

  24. WordNet Connect • Program to find all possible connections between two words in WordNet • Used in computing semantic opposition within the word sense ontology • The WordNet lexical database is used to read the semantic relations • Capabilities such as the number of paths, the shortest path, and the overall network structure are studied
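The two path queries listed above (shortest path and all connections between two words) can be sketched on a small graph. This is not the WordNet Connect program itself; the word graph is hypothetical and hand-written, and `max_len` is an assumed cap that keeps the enumeration finite on larger graphs.

```python
from collections import deque


def shortest_path(adj, start, goal):
    """Breadth-first search; returns one shortest path, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            path = []
            while u is not None:  # walk back to the start
                path.append(u)
                u = parents[u]
            return path[::-1]
        for v in adj.get(u, []):
            if v not in parents:
                parents[v] = u
                queue.append(v)
    return None


def all_simple_paths(adj, start, goal, max_len=6):
    """Enumerate cycle-free paths with at most max_len nodes."""
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == goal:
            paths.append(path)
            continue
        if len(path) < max_len:
            for v in adj.get(path[-1], []):
                if v not in path:  # never revisit a word
                    stack.append(path + [v])
    return paths


# Hypothetical undirected word graph (relations are illustrative only).
WORDS = {
    "car": ["motor vehicle", "wheel"],
    "motor vehicle": ["car", "vehicle", "taxi"],
    "wheel": ["car", "bicycle"],
    "vehicle": ["motor vehicle", "bicycle"],
    "taxi": ["motor vehicle"],
    "bicycle": ["wheel", "vehicle"],
}

print(shortest_path(WORDS, "car", "bicycle"))
print(len(all_simple_paths(WORDS, "car", "bicycle")))
```

On this toy graph the shortest connection goes through "wheel", and a second, longer connection goes through "motor vehicle" and "vehicle".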

  25. WordNet Connect

  26. WordNet Connect

  27. WordNet Connect

  28. Future work • WordNet structure in terms of a complex network • Key assumptions • WordNet lexical dictionary analyzed in terms of a source node and a target node, with an additional reference node • Achieve a cost-effective path which is conditionally related to the mean reference node • Control the path traversal with a relation of focus • Include Common File Number to make it more efficient

  29. Conclusion • A single visualization cannot reveal the entire structure of WordNet • There are different ways of analyzing the effectiveness of the overall system • A new method to evaluate the usefulness of the WordNet network structure

  30. Questions and Comments
