An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation • Presenter: Kung, Chien-Hao • Authors: Yoong Keok Lee and Hwee Tou Ng • EMNLP 2002
Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • Natural language is inherently ambiguous. • A word can have multiple meanings (or senses).
Objectives • This paper evaluates a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data.
Methodology • Knowledge Sources • Part-of-Speech (POS) of Neighboring Words • Single Words in the Surrounding Context • Syntactic Relations • Local Collocations
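Methodology • Part-of-Speech (POS) of Neighboring Words • This paper uses 7 features to encode this knowledge source: the POS tags of the three tokens before w, of w itself, and of the three tokens after w. • Sentence segmentation program (Reynar and Ratnaparkhi, 1997) • POS tagger (Ratnaparkhi, 1996) • Example: Reid/NNP saw/VBD me/PRP looking/VBG at/IN the/DT iron/NN bars/NNS ./. • For w = bars, the feature vector is <IN, DT, NN, NNS, ., , >; the last two positions are empty because they fall beyond the sentence boundary.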
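The POS window on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `pos_features`, the use of `None` for positions beyond the sentence boundary, and the hand-written tags (copied from the slide's example rather than produced by a tagger) are all assumptions.

```python
def pos_features(tags, index, window=3):
    """Collect the POS tags at offsets -window..+window around tags[index].

    Positions that fall outside the sentence are encoded as None
    (an assumed stand-in for the empty slots shown on the slide).
    """
    features = []
    for offset in range(-window, window + 1):
        pos = index + offset
        features.append(tags[pos] if 0 <= pos < len(tags) else None)
    return features


# Tagged example sentence from the slide (tags written by hand here).
tagged = [("Reid", "NNP"), ("saw", "VBD"), ("me", "PRP"), ("looking", "VBG"),
          ("at", "IN"), ("the", "DT"), ("iron", "NN"), ("bars", "NNS"), (".", ".")]
tags = [tag for _, tag in tagged]
target = 7  # index of the ambiguous word "bars"

features = pos_features(tags, target)
print(features)
```

With a window of 3 on each side plus the target word itself, the function yields the 7 features mentioned on the slide.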
Methodology • Single Words in the Surrounding Context • Feature selection method • Parameter: M2 • Example: Reid saw me looking at the iron bars. • For w = bars with the selected keyword set {chocolate, iron, beer}, the feature vector is <0, 1, 0>.
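A minimal sketch of the binary surrounding-word vector from the example above, assuming the keyword list has already been chosen by the feature selection step (the selection itself, governed by M2, is not shown). The function name `surrounding_word_features` and the lowercasing step are illustrative assumptions.

```python
def surrounding_word_features(tokens, target_index, keywords):
    """Return one 0/1 feature per keyword: 1 if the keyword occurs in the context.

    The target word itself is excluded from the context; tokens are
    lowercased before matching (an assumption made for this sketch).
    """
    context = {tok.lower() for i, tok in enumerate(tokens) if i != target_index}
    return [1 if kw in context else 0 for kw in keywords]


tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
keywords = ["chocolate", "iron", "beer"]  # keyword set from the slide
target = 7  # index of "bars"

vector = surrounding_word_features(tokens, target, keywords)
print(vector)
```

Only "iron" appears in the sentence, so only the middle position is set.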
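Methodology • Local Collocations • This paper extracted 11 features, where C(i,j) is the sequence of tokens from offset i to offset j relative to w: C(-1,-1), C(1,1), C(-2,-2), C(2,2), C(-2,-1), C(-1,1), C(1,2), C(-3,-1), C(-2,1), C(-1,2), C(1,3) • Example: Reid saw me looking at the iron bars. • For w = bars, C(-2,-1) takes the value the_iron out of the possible values {a_chocolate, the_wine, the_iron}.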
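The 11 collocation features can be sketched as follows. This is a hedged illustration rather than the paper's implementation: the names `OFFSETS`, `collocation`, and `collocation_features` are hypothetical, and two details are assumptions of this sketch: the target word is skipped when a span crosses offset 0, and positions outside the sentence are represented by an empty string.

```python
# The 11 (start, end) offset pairs listed on the slide.
OFFSETS = [(-1, -1), (1, 1), (-2, -2), (2, 2), (-2, -1), (-1, 1),
           (1, 2), (-3, -1), (-2, 1), (-1, 2), (1, 3)]


def collocation(tokens, target_index, start, end):
    """Join the tokens from offset start to offset end (relative to the target)."""
    parts = []
    for offset in range(start, end + 1):
        if offset == 0:
            continue  # assumption: the target word itself is not part of the feature
        pos = target_index + offset
        # assumption: out-of-sentence positions become empty strings
        parts.append(tokens[pos].lower() if 0 <= pos < len(tokens) else "")
    return "_".join(parts)


def collocation_features(tokens, target_index):
    """Map a label such as 'C-2,-1' to its collocation string."""
    return {f"C{s},{e}": collocation(tokens, target_index, s, e) for s, e in OFFSETS}


tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
features = collocation_features(tokens, 7)  # 7 is the index of "bars"
print(features["C-2,-1"])
```

Each collocation string would then be matched against the values seen in training, such as {a_chocolate, the_wine, the_iron} in the slide's example.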
Methodology • Syntactic Relations • Shows w and its POS • Shows the sentence where w occurs • Shows the feature vector corresponding to syntactic relations
Methodology • Learning Algorithms • Support Vector Machines • AdaBoost • Naïve Bayes • Decision Trees • Evaluation Data Sets • SENSEVAL-2 • SENSEVAL-1
Conclusions • Combining all of these knowledge sources with SVM achieves accuracy higher than the best official scores on both SENSEVAL-2 and SENSEVAL-1 test data.
Comments • Advantages • This paper is easy to read. • Applications • WSD