
Part-Of-Speech Tagging using Neural Networks

Part-Of-Speech Tagging using Neural Networks. Ankur Parikh LTRC IIIT Hyderabad ankur.parikh85@gmail.com. Outline. 1.Introduction 2.Background and Motivation 3.Experimental Setup 4.Preprocessing 5.Representation 6.Single-neuro tagger 7.Experiments 8.Multi-neuro tagger 9.Results


Presentation Transcript


  1. Part-Of-Speech Tagging using Neural Networks Ankur Parikh LTRC IIIT Hyderabad ankur.parikh85@gmail.com

  2. Outline 1. Introduction 2. Background and Motivation 3. Experimental Setup 4. Preprocessing 5. Representation 6. Single-neuro tagger 7. Experiments 8. Multi-neuro tagger 9. Results 10. Discussion 11. Future Work

  3. Introduction • POS tagging: the process of assigning a part-of-speech tag to each word of natural-language (NL) text, based on both the word's definition and its context. • Uses: parsing of sentences, machine translation (MT), information retrieval (IR), word sense disambiguation, speech synthesis, etc. • Methods: 1. Statistical approach 2. Rule-based approach

  4. Background: Previous Approaches • Much work has been done using machine learning algorithms such as • TNT • CRF for Hindi. • Trade-off: performance versus training time - Lower precision affects later stages - For a new domain or a new corpus, parameter tuning is a non-trivial task.

  5. Background: Previous Approaches & Motivation • Empirically chosen context. • Effective handling of corpus-based features • Need of the hour: - Good performance - Less training time - Multiple contexts - Effective use of corpus-based features • Two approaches and their comparison with TNT and CRF • Word-level tagging

  6. Experimental Setup: Corpus statistics • Tag set of 25 tags

  7. Experimental Setup: Tools and Resources • Tools - CRF++ - TNT - Morfessor Categories-MAP • Resources - Universal word – Hindi Dictionary - Hindi WordNet - Morph analyzer

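  8. Preprocessing • The XC tag is removed (Gadde et al., 2008), leaving 24 of the 25 tags. • Lexicon - For each unique word w of the training corpus => ENTRY(t1, …, t24) - where tj = c(posj, w) / c(w), i.e. the number of times w is tagged posj divided by the number of occurrences of w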
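The lexicon construction on this slide can be sketched as follows. This is a minimal illustration, not the author's code: the function name `build_lexicon`, the three-tag toy tag set, and the toy corpus are all assumptions made for the example (the slide uses 24 tags).

```python
from collections import Counter, defaultdict


def build_lexicon(tagged_corpus, tagset):
    """Map each word w to ENTRY(t1..tn), where tj = c(pos_j, w) / c(w)."""
    word_counts = Counter()             # c(w)
    pair_counts = defaultdict(Counter)  # c(pos_j, w), indexed as [w][pos_j]
    for word, tag in tagged_corpus:
        word_counts[word] += 1
        pair_counts[word][tag] += 1
    return {
        word: [pair_counts[word][tag] / total for tag in tagset]
        for word, total in word_counts.items()
    }


# Hypothetical toy data: three tags instead of the 24 on the slide.
TAGSET = ["NN", "VM", "JJ"]
CORPUS = [("khel", "NN"), ("khel", "VM"), ("khel", "NN"), ("accha", "JJ")]
lexicon = build_lexicon(CORPUS, TAGSET)
```

Each entry is a relative-frequency vector over the tag set, so its components sum to 1 for every word seen in training.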

  9. Representation: Encoding & Decoding • Each word w is encoded as an n-element vector INPUT(t1, t2, …, tn), where n = size of the tag set. • INPUT(t1, t2, …, tn) comes from the lexicon if the training corpus contains w. • If w is not in the training corpus: - N(w) = number of possible POS tags for w - tj = 1/N(w) if posj is a candidate, tj = 0 otherwise
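The input encoding above might look like the sketch below. The names `encode_word`, `LEXICON`, and `CANDIDATES` are hypothetical, and the slide does not say where candidate tags for unseen words come from; here they are assumed to be supplied by an external resource such as a dictionary. Falling back to all tags when no candidates are known is also an assumption of this sketch.

```python
def encode_word(word, lexicon, candidates, tagset):
    """Return INPUT(t1..tn) for a word, following the slide's two cases."""
    if word in lexicon:
        # Known word: reuse the relative-frequency entry from the lexicon.
        return list(lexicon[word])
    # Unknown word: spread the mass uniformly over its candidate tags.
    # Assumption: with no candidate information, every tag is a candidate.
    possible = candidates.get(word) or set(tagset)
    weight = 1.0 / len(possible)  # 1 / N(w)
    return [weight if tag in possible else 0.0 for tag in tagset]


# Hypothetical toy data for illustration only.
TAGSET = ["NN", "VM", "JJ", "PSP"]
LEXICON = {"khel": [0.75, 0.25, 0.0, 0.0]}
CANDIDATES = {"daud": {"NN", "VM"}}

known = encode_word("khel", LEXICON, CANDIDATES, TAGSET)
unseen = encode_word("daud", LEXICON, CANDIDATES, TAGSET)
no_info = encode_word("xyz", LEXICON, CANDIDATES, TAGSET)
```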

  10. Representation: Encoding & Decoding • For each word w, the desired output is encoded as D = (d1, d2, …, dn). - dj = 1 if posj is the desired output, dj = 0 otherwise • In testing, for each word w, an n-element vector OUTPUT(o1, …, on) is returned. - Result = posj, if oj = max(OUTPUT)
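A minimal sketch of the target encoding and the decoding rule described above. The function names and the toy tag set are assumptions; ties in the output vector are resolved in favour of the first tag, which the slide does not specify.

```python
def encode_desired(tag, tagset):
    """Desired output D: 1 at the position of the correct tag, 0 elsewhere."""
    return [1.0 if t == tag else 0.0 for t in tagset]


def decode(output, tagset):
    """Return pos_j such that o_j = max(OUTPUT); the first maximum wins ties."""
    best = max(range(len(output)), key=output.__getitem__)
    return tagset[best]


# Hypothetical toy tag set and network output.
TAGSET = ["NN", "VM", "JJ"]
desired = encode_desired("VM", TAGSET)
result = decode([0.10, 0.70, 0.20], TAGSET)
```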

  11. Single – neuro tagger: Structure

  12. Single – neuro tagger: Training & Tagging • Error back-propagation learning algorithm • Weights are initialized with random values • Sequential mode (weights are updated after each training pattern) • Momentum term • Learning rate Eta = 0.4 and momentum coefficient Alpha = 0.1 • In tagging, it can give multiple outputs or a sorted list of all tags.
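The training procedure listed above can be illustrated with a small sketch of back-propagation in sequential mode with a momentum term, using Eta = 0.4 and Alpha = 0.1 from the slide. Everything else is an assumption, since the structure slide did not survive extraction: the single hidden layer, its size, sigmoid units, squared error, the initialization range, and the class name `TinyMLP` are illustrative choices, not the author's architecture.

```python
import math
import random

ETA = 0.4    # learning rate (from the slide)
ALPHA = 0.1  # momentum coefficient (from the slide)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One-hidden-layer network; layer sizes and init range are assumptions."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Random initial weights; the extra column in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]
        # Previous weight changes, needed for the momentum term.
        self.dw1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb.append(1.0)
        out = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return xb, hb, out

    def train_pattern(self, x, d):
        """Sequential mode: update the weights after this single pattern."""
        xb, hb, out = self.forward(x)
        delta_o = [(dj - oj) * oj * (1.0 - oj) for dj, oj in zip(d, out)]
        # Hidden deltas use the output weights from before the update.
        delta_h = [
            hb[i] * (1.0 - hb[i])
            * sum(delta_o[k] * self.w2[k][i] for k in range(len(out)))
            for i in range(len(hb) - 1)  # skip the bias unit
        ]
        self._update(self.w2, self.dw2, delta_o, hb)
        self._update(self.w1, self.dw1, delta_h, xb)
        return sum((dj - oj) ** 2 for dj, oj in zip(d, out))

    @staticmethod
    def _update(weights, previous, deltas, inputs):
        for j, delta in enumerate(deltas):
            for i, value in enumerate(inputs):
                # Weight change = eta * delta * input + alpha * previous change
                step = ETA * delta * value + ALPHA * previous[j][i]
                weights[j][i] += step
                previous[j][i] = step


# Hypothetical toy patterns: (INPUT vector, desired output D).
PATTERNS = [([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
            ([0.0, 0.5, 0.5], [0.0, 1.0, 0.0])]
net = TinyMLP(n_in=3, n_hidden=4, n_out=3)
errors = [sum(net.train_pattern(x, d) for x, d in PATTERNS)
          for _ in range(500)]
```

The `errors` list holds the summed squared error per epoch, which gives a rough learning curve for the toy data.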

  13. Experiments: Development Data

  14. Development of the system

  15. Multi – neuro tagger: Structure

  16. Multi – neuro tagger: Training

  17. Multi – neuro tagger: Learning curves

  18. Multi – neuro tagger: Results

  19. Multi – neuro tagger: Comparison • Precision after voting : 92.19%
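The transcript does not describe the voting scheme behind this figure. As one plausible reading, the sketch below shows plain majority voting over the tags proposed by several taggers for the same word. The function name `vote` and the tie-breaking rule (the earliest prediction among the most frequent tags wins) are assumptions, not the author's method.

```python
from collections import Counter


def vote(predictions):
    """Majority vote over per-tagger predictions for a single word."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Assumed tie-break: the earliest prediction with the top count wins.
    for tag in predictions:
        if counts[tag] == top:
            return tag


# Hypothetical predictions from three taggers with different contexts.
winner = vote(["NN", "VM", "NN"])
tied = vote(["VM", "NN"])
```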

  20. Conclusion • Single versus Multi-neuro tagger • Multi-neuro tagger versus TNT and CRF • Corpus and dictionary based features • More parameters need to be tuned • 24^5 = 7,962,624 possible 5-grams of tags, compared with 250,560 weights • Well suited for Indian languages

  21. Future Work • Better voting schemes (confidence point based) • Finding the right context (probability based) • Various structures and algorithms - Sequential Neural Network - Convolutional Neural Network - Combination with SVM

  22. Queries??? Thank You!!
