1 / 42

CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 4– Wordnet and Word Sense Disambiguation, cn

CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 4– Wordnet and Word Sense Disambiguation, cntd ). Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th Jan , 2011. Foundation of Wordnet: Lexical Matrix. Meaning-form relationship. Meanings container.

hammer
Download Presentation

CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 4– Wordnet and Word Sense Disambiguation, cn

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS460/626 : Natural Language Processing/Speech, NLP and the Web(Lecture 4– Wordnet and Word Sense Disambiguation, cntd) Pushpak BhattacharyyaCSE Dept., IIT Bombay 11th Jan, 2011

  2. Foundation of Wordnet: Lexical Matrix

  3. Meaning-form relationship Meanings container Word forms container M1 W1 M2 W2 M3 W3 . . . . . . Mk-1 Wk-1 Mk Wk Many to many relationship

  4. Meaning-form Meaning-form Homonymy (accidental identity or word borrowing) Polysemy (shades of meaning) Eg: Fall The kingdom fell The fruit fell Homography (same picture) Eg: river bank and financial bank Homophony (same sound) Eg: write and right

  5. Distributional Similarity • Words which are semantically similar tend to appear in syntactically similar contexts. • The neighbors of the words tend to be the same. • Technically known as Distributional Similarity.

  6. Synset+Gloss+Example Crucially needed for concept explication, wordnet building using another wordnet and wordnet linking. English Synset: {earthquake, quake, temblor, seism} -- (shaking and vibration at the surface of the earth resulting from underground movement along a fault plane of from volcanic activity) Hindi Synset: {भूकंप, भूचाल, भूडोल, जलजला, भूकम्प, भू-कंप, भू-कम्प, ज़लज़ला, भूमिकंप, भूमिकम्प  - प्राकृतिक कारणों से पृथ्वी के भीतरी भाग में कुछ उथल-पुथल होने से ऊपरी भाग के सहसा हिलने की क्रिया   "२००१ में गुज़रात में आये भूकंप में काफ़ी लोग मारे गये थे" (shaking of the surface of earth; many were killed in the earthquake in Gujarat) Marathi Synset:धरणीकंप,भूकंप - पृथ्वीच्या पोटात द्रव्यक्षोभ होऊन पृष्ठभाग हालण्याची क्रिया "२००१ साली गुजरातमध्ये झालेल्या धरणीकंपात अनेक लोक मृत्युमुखी पडले"

  7. Semantic Relations • Hypernymy and Hyponymy • Relation between word senses (synsets) • X is a hyponym of Y if X is a kind of Y • Hyponymy is transitive and asymmetrical • Hypernymy is inverse of Hyponymy (lion->animal->animate entity->entity)

  8. Semantic Relations (continued) • Meronymy and Holonymy • Part-whole relation, branch is a part of tree • X is a meronymy of Y if X is a part of Y • Holonymy is the inverse relation of Meronymy {kitchen} ………………………. {house}

  9. Lexical Relation • Antonymy • Oppositeness in meaning • Relation between word forms • Often determined by phonetics, word length etc. ({rise, ascend} vs. {fall, descend})

  10. WordNet Sub-Graph Hyponymy Dwelling,abode Hypernymy Meronymy kitchen Hyponymy bckyard bedroom M e r o n y m y house,home veranda A place that serves as the living quarters of one or mor efamilies Hyponymy study guestroom hermitage cottage Gloss

  11. Troponym and Entailment • Entailment {snoring – sleeping} • Troponym {limp, strut – walk} {whisper – talk}

  12. Entailment Snoring entails sleeping. Buying entails paying. • Proper Temporal Inclusion. Inclusion can be in any way. Sleeping temporally includes snoring. Buying temporally includes paying. • Co-extensiveness. (Troponymy) Limping is a manner of walking.

  13. Opposition among verbs. • {Rise,ascend} {fall,descend} Tie-untie (do-undo) Walk-run (slow,fast) Teach-learn (same activity different perspective) Rise-fall (motion upward or downward) • Opposition and Entailment. Hit or miss (entail aim) . Backward presupposition. Succeed or fail (entail try.)

  14. The causal relationship. Show- see. Give- have. Causation and Entailment. Giving entails having. Feeding entails eating.

  15. Kinds of Antonymy

  16. Kinds of Meronymy

  17. Gradation

  18. Metonymy • Associated with Metaphors which are epitomes of semantics • Oxford Advanced Learners Dictionary definition: “The use of a word or phrase to mean something different from the literal meaning” • Does it mean Careless Usage?!

  19. Insight from Sanskritic Tradition • Power of a word • Abhidha, Lakshana, Vyanjana • Meaning of Hall: • The hall is packed (avidha) • The hall burst into laughing (lakshana) • The Hall is full (unsaid: and so we cannot enter) (vyanjana)

  20. Metaphors in Indian Tradition • upamana and upameya • Former: object being compared • Latter: object being compared with • Puru was like a lion in the battle with Alexander (Puru: upameya; Lion: upamana)

  21. Upamana, rupak, atishayokti • upamana: Explicit comparison • Puru was like a lion in the battle with Alexander • rupak: Implicit comparison • Puru was a lion in the battle with Alexander • Atishayokti (exaggeration): upamana and upameya dropped • Puru’s army fled. But the lion fought on.

  22. Modern study (1956 onwards, Richards et. al.) • Three constituents of metaphor • Vehicle (items used metaphorically) • Tenor (the metaphorical meaning of the former) • Ground (the basis for metaphorical extension) • “The foot of the mountain” • Vehicle: :foot” • Tenor: “lower portion” • Ground: “spatial parallel between the relationship between the foot to the human body and the lower portion of the mountain with the rest of the mountain”

  23. Interaction of semantic fields(Haas) • Core vs. peripheral semantic fields • Interaction of two words in metonymic relation brings in new semantic fields with selective inclusion of features • Leg of a table • Does not stretch or move • Does stand and support

  24. Lakoff’s (1987) contribution • Source Domain • Target Domain • Mapping Relations

  25. Mapping Relations: ontological correspondences • Anger is heat of fluid in container

  26. Image Schemas • Categories: Container Contained • Quantity • More is up, less is down: Outputs rose dramatically; accidents rates were lower • Linear scales and paths: Ram is by far the best performer • Time • Stationary event: we are coming to exam time • Stationary observer: weeks rush by • Causation: desperation drove her to extreme steps

  27. Patterns of Metonymy • Container for contained • The kettle boiled (water) • Possessor for possessed/attribute • Where are you parked? (car) • Represented entity for representative • The government will announce new targets • Whole for part • I am going to fill up the car with petrol

  28. Patterns of Metonymy (contd) • Part for whole • I noticed several new faces in the class • Place for institution • Lalbaug witnessed the largest Ganapati Question: Can you have part-part metonymy

  29. Purpose of Metonymy • More idiomatic/natural way of expression • More natural to say the kettle is boiling as opposed to the water in the kettle is boiling • Economy • Room 23 is answering (but not *is asleep) • Ease of access to referent • He is in the phone book (but not *on the back of my hand) • Highlighting of associated relation • The car in the front decided to turn right (but not *to smoke a cigarette)

  30. Feature sharing not necessary • In a restaurant: • Jalebiikoabhidudhchaiye(no feature sharing) • The elephant now wants some coffee (feature sharing)

  31. Proverbs • Describes a specific event or state of affairs which is applicable metaphorically to a range of events or states of affairs provided they have the same or sufficiently similar image-schematic structure

  32. WSD APPROACHES

  33. WSD – Problem Definition • Obtain the sense of • A set of target words, or of • All words (all word WSD, more difficult) • Against a • Sense repository (like the wordnet), or • A thesaurus (not same as wordnet, does not have semantic relations) • Using the • Context in which the word appears.

  34. Example word: operation • operation ((computer science) data processing in which the result is completely specified by a rule (especially the processing that results from a single instruction)) "it can perform millions of operations per second" • operation, military operation (activity by a military or naval force (as a maneuver or campaign)) "it was a joint operation of the navy and air force" • operation, surgery, surgical operation, surgical procedure, surgical process (a medical procedure involving an incision with instruments; performed to repair damage or arrest disease in a living body) "they will schedule the operation as soon as an operating room is available"; "he died while undergoing surgery" • mathematical process, mathematical operation, operation ((mathematics) calculation by mathematical methods) "the problems at the end of the chapter demonstrated the mathematical processes involved in the derivation"; "they were learning the basic operations of arithmetic"

  35. KNOWLEDEGE BASED v/s MACHINE LEARNING BASED v/s HYBRID APPROACHES • Knowledge Based Approaches • Rely on knowledge resources like WordNet, Thesaurus etc. • May use grammar rules for disambiguation. • May use hand coded rules for disambiguation. • Machine Learning Based Approaches • Rely on corpus evidence. • Train a model using tagged or untagged corpus. • Probabilistic/Statistical models. • Hybrid Approaches • Use corpus evidence as well as semantic relations form WordNet. 36

  36. SELECTIONAL PREFERENCES(INDIAN TRADITION) 37 • “Desire” of some words in the sentence (“aakaangksha”). • I saw the boy with long hair. • The verb “saw” and the noun “boy” desire an object here. • “Appropriateness” of some other words in the sentence to fulfil that desire (“yogyataa”). • I saw the boy with long hair. • The PP “with long hair” can be appropriately connected only to “boy” and not “saw”. • In case, the ambiguity is still present, “proximity” (“sannidhi”) can determine the meaning. • E.g. I saw the boy with a telescope. • The PP “with a telescope” can be attached to both “boy” and “saw”, so ambiguity still present. It is then attached to “boy” using the proximity check.

  37. SELECTIONAL PREFERENCES(RECENT LINGUISTIC THEORY) 38 • There are words which demand arguments, like, verbs, prepositions, adjectives and sometimes nouns. These arguments are typically nouns. • Arguments must have the property to fulfil the demand. They must satisfy selectional preferences. • Example • Give (verb) • agent – animate • obj – direct • obj – indirect • I gave him the book • I gave him the book (yesterday in the school) -> adjunct • How does this help in WSD? • One type of contextual information is the information about the type of arguments that a word takes.

  38. WSD USING SELECTIONAL PREFERENCES AND ARGUMENTS Sense 1 Sense 2 • This airlines serves dinner in the evening flight. • serve (Verb)‏ • agent • object – edible • This airlines serves the sector between Agra & Delhi. • serve (Verb)‏ • agent • object – sector Requires exhaustive enumeration of: • Argument-structure of verbs. • Selectional preferences of arguments. • Description of properties of words such that meeting the selectional preference criteria can be decided. E.g. This flight serves the “region” between Mumbai and Delhi How do you decide if “region” is compatible with “sector” 39 39

  39. OVERLAP BASED APPROACHES • Require a Machine Readable Dictionary (MRD). • Find the overlap between the features of different senses of an ambiguous word (sense bag) and the features of the words in its context (context bag). • These features could be sense definitions, example sentences, hypernyms etc. • The features could also be given weights. • The sense which has the maximum overlap is selected as the contextually appropriate sense. 40 40

  40. LESK’S ALGORITHM Sense Bag: contains the words in the definition of a candidate sense of the ambiguous word. Context Bag: contains the words in the definition of each sense of each context word. E.g. “On burning coal we get ash.” Ash Coal • Sense 1 Trees of the olive family with pinnate leaves, thin furrowed bark and gray branches. • Sense 2 The solid residue left when combustible material is thoroughly burned or oxidized. • Sense 3 To convert into ash • Sense 1 A piece of glowing carbon or burnt wood. • Sense 2 charcoal. • Sense 3 A black solidcombustible substance formed by the partial decomposition of vegetable matter without free access to air and under the influence of moisture and often increased pressure and temperature that is widely used as a fuel for burning 41 41 In this case Sense 2 of ash would be the winner sense.

  41. Sense1: Finance Sense2: Location Money +1 0 Interest +1 0 Fetch 0 0 Annum +1 0 Total 3 0 WALKER’S ALGORITHM • A Thesaurus Based approach. • Step 1:For each sense of the target word find the thesaurus category to which that sense belongs. • Step 2: Calculate the score for each sense by using the context words. A context words will add 1 to the score of the sense if the thesaurus category of the word matches that of the sense. • E.g. The money in this bank fetches an interest of 8% per annum • Target word: bank • Clue words from the context: money, interest, annum, fetch Context words add 1 to the sense when the topic of the word matches that of the sense 42

More Related