WSD using Optimized Combination of Knowledge Sources

Authors: Yorick Wilks and Mark Stevenson
Presenter: Marian Olteanu

Introduction
• Regular approaches
  • All words
  • Sample (small trial section)
• Problems
  • Ambiguity, especially at fine granularity
  • New senses in text that are not in the dictionary

Approach
• Integrates partial sources of information
  • Part-of-speech
  • Dictionary definitions
  • Pragmatic codes
  • Selectional restrictions
• Integration
  • Filters
  • Partial selectors (taggers)

Dictionary for senses
• Longman Dictionary of Contemporary English (LDOCE)
• Two levels:
  • Homograph
  • Sense

Methodology
• Preprocessing
  • Part-of-speech tagger (Brill)
• Part-of-speech
  • Filter: eliminate all incompatible homographs
  • If no sense remains, keep all senses
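The part-of-speech filter and its fallback can be sketched as follows. This is a minimal illustration, not the authors' code: the `Sense` record, the `filter_by_pos` function and the toy entry for "bank" are all made up for the example and do not reflect real LDOCE data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    homograph: int  # homograph number within the dictionary entry
    pos: str        # part of speech of that homograph, e.g. "n" or "v"
    sense_id: str   # label for the individual sense


def filter_by_pos(senses, tagged_pos):
    """Keep only senses whose homograph matches the tagger's part of speech.

    If the filter would remove every sense, all senses are kept instead.
    """
    kept = [s for s in senses if s.pos == tagged_pos]
    return kept if kept else list(senses)


# Toy entry (hypothetical, not taken from LDOCE).
bank = [
    Sense(1, "n", "bank_1_1"),
    Sense(1, "n", "bank_1_2"),
    Sense(2, "v", "bank_2_1"),
]

print([s.sense_id for s in filter_by_pos(bank, "v")])
print(len(filter_by_pos(bank, "adj")))  # no match, so nothing is filtered out
```

The fallback matters because a tagger error would otherwise leave a word with no candidate senses at all.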

Methodology (cont.)
• Dictionary definitions
  • Partial tagger:
    • Count the number of words that appear in both the definition and the context
    • Normalize by the length of the definition
    • Return a list of candidate senses
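A rough sketch of the overlap count described above. It assumes whitespace tokenization, lowercasing, and no stop-word removal, none of which is specified on the slide; the function names and the two toy definitions are invented for illustration.

```python
def overlap_score(definition, context):
    """Fraction of definition tokens that also occur in the context."""
    def_words = definition.lower().split()
    if not def_words:
        return 0.0
    context_words = set(context.lower().split())
    shared = sum(1 for w in def_words if w in context_words)
    return shared / len(def_words)  # normalize by definition length


def rank_senses(definitions, context):
    """Return (sense_id, score) pairs, best score first."""
    scored = [(sid, overlap_score(text, context)) for sid, text in definitions.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy definitions (hypothetical wording, not quoted from LDOCE).
definitions = {
    "bank_money": "a place where money is kept",
    "bank_river": "land along the side of a river",
}
context = "he kept his money in the bank"

for sense_id, score in rank_senses(definitions, context):
    print(sense_id, round(score, 3))
```

Normalizing by length keeps long definitions from winning simply because they contain more words.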

Methodology (cont.)
• Pragmatic codes
  • Partial tagger: uses the hierarchy of LDOCE pragmatic codes (subject area)
  • Modified simulated annealing
    • Optimize the number of pragmatic codes of the same type in the sentence
    • Whole paragraph; only for nouns?
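To make the optimization idea concrete, here is a sketch using plain simulated annealing, not the modified variant the slide refers to. Several things are assumptions: the objective (number of word pairs sharing a code), the cooling schedule, the flat treatment of codes (the hierarchy is ignored), and the code labels, which are placeholders rather than real LDOCE codes.

```python
import math
import random
from collections import Counter


def agreement(assignment):
    """Number of word pairs that were assigned the same code."""
    counts = Counter(assignment)
    return sum(c * (c - 1) // 2 for c in counts.values())


def anneal(candidates, steps=2000, start_temp=1.0, cooling=0.995, seed=0):
    """Pick one code per word so that as many words as possible agree.

    candidates[i] is the list of codes allowed for word i.
    """
    rng = random.Random(seed)
    current = [rng.choice(codes) for codes in candidates]
    best = list(current)
    temp = start_temp
    for _ in range(steps):
        # Propose changing the code of one randomly chosen word.
        i = rng.randrange(len(candidates))
        proposal = list(current)
        proposal[i] = rng.choice(candidates[i])
        delta = agreement(proposal) - agreement(current)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = proposal
            if agreement(current) > agreement(best):
                best = list(current)
        temp = max(temp * cooling, 1e-6)
    return best


# Placeholder code labels for four nouns in one paragraph.
candidates = [["ECON", "SPORT"], ["ECON", "GEOG"], ["ECON"], ["MED", "SPORT"]]
result = anneal(candidates)
print(result, agreement(result))
```

With so few words an exhaustive search would do; annealing only pays off when the number of code combinations is too large to enumerate.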

Methodology (cont.)
• Selectional restrictions
  • Filter
  • LDOCE senses: 35 semantic classes (H = human, M = human male, P = plant, etc.)
  • Nouns: their own type; adjectives: the type of the object they modify; adverbs: the type of their modifier; verbs: the types of their subject (S), direct object (DO) and indirect object (IO)
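A small sketch of how such a filter might check a noun sense against a verb's restriction. The class codes H, M and P come from the slide; the single hierarchy link (M as a kind of H), the function names and the toy senses are assumptions made for the example.

```python
# Hypothetical fragment of the semantic-class hierarchy: child -> parent.
PARENT = {"M": "H"}  # assumes M (human male) counts as a kind of H (human)


def satisfies(noun_class, required_class):
    """True if noun_class is required_class or one of its descendants."""
    cls = noun_class
    while cls is not None:
        if cls == required_class:
            return True
        cls = PARENT.get(cls)  # climb one level; None at the top
    return False


def filter_noun_senses(noun_senses, required_class):
    """Drop noun senses whose class violates the verb's restriction."""
    return [sid for sid, cls in noun_senses.items() if satisfies(cls, required_class)]


# Toy senses of an ambiguous noun (made up for illustration).
noun_senses = {"sense_person": "M", "sense_plant": "P"}

# Suppose the verb requires a human (H) subject.
print(filter_noun_senses(noun_senses, "H"))
```

The check is one-directional: a more specific class satisfies a general restriction, but not the reverse.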

Methodology (cont.)
• Combine knowledge sources
  • Decision lists
  • Can assign a sense to unknown words, if there is a definition in LDOCE
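A decision list is an ordered set of rules in which the first rule that matches decides the outcome. The sketch below shows only that mechanism: the feature names and the three rules are written by hand for illustration and are not the rules used in the paper.

```python
def apply_decision_list(rules, features, default):
    """Return the decision of the first rule whose condition holds."""
    for condition, decision in rules:
        if condition(features):
            return decision
    return default


# Hypothetical rules over the outputs of the filters and partial taggers
# for one candidate sense. Order matters: earlier rules take priority.
RULES = [
    (lambda f: not f["passes_restrictions"], "reject"),
    (lambda f: f["definition_top"] and f["pragmatic_top"], "accept"),
    (lambda f: f["definition_top"], "accept"),
]

candidate = {
    "passes_restrictions": True,  # survived the selectional-restriction filter
    "definition_top": True,       # ranked first by definition overlap
    "pragmatic_top": False,       # not chosen by the pragmatic-code tagger
}

print(apply_decision_list(RULES, candidate, default="reject"))
```

Because each rule only reads features computed from the dictionary entry, the same list can be applied to a word never seen in training, as long as LDOCE defines it.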

Evaluation
• Create a corpus based on SemCor (200,000 words; tagged with WordNet senses)
• SENSUS: a merger of LDOCE and WordNet (for machine translation)
• Still ambiguity
  • 36,869 out of 85,747 words (personal opinion: strongly biased)

Results
• Baseline: 49.8%
• 70% of the 1st sense: correctly tagged
• 83.4% accuracy = 92.8% accuracy on all words (!!!)
• Test by voting
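The step from 83.4% to the all-words figure can be checked with a little arithmetic. This sketch assumes that the 36,869 words from the evaluation slide are the ambiguous ones and that every unambiguous word counts as correctly tagged; neither assumption is stated explicitly in the slides.

```python
total_words = 85_747
ambiguous_words = 36_869      # assumed: the words that still have several senses
accuracy_on_ambiguous = 0.834

# Assumed: a word with a single sense is always tagged correctly.
unambiguous_words = total_words - ambiguous_words
correct = unambiguous_words + accuracy_on_ambiguous * ambiguous_words
overall = correct / total_words

print(f"{overall:.1%}")
```

Under these assumptions the result is within rounding of the 92.8% quoted above, which suggests that more than half of the gain comes from words that were never ambiguous.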
