
CS460/626 : Natural Language Processing/Speech, NLP and the Web Lecture 35: Closing

Pushpak Bhattacharyya, CSE Dept., IIT Bombay. 12th Nov, 2012.


Presentation Transcript


  1. CS460/626: Natural Language Processing/Speech, NLP and the Web. Lecture 35: Closing. Pushpak Bhattacharyya, CSE Dept., IIT Bombay. 12th Nov, 2012

  2. NLP: Two pictures [Figure labels only] • Picture 1: NLP shown with Vision and Speech; Statistics and Probability + Knowledge Based • Picture 2 (NLP Trinity): Problem axis (Morph Analysis, Part of Speech Tagging, Parsing, Semantics); Language axis (Hindi, Marathi, English, French); Algorithm axis (HMM, MEMM, CRF)

  3. NLP layers (in order of increasing complexity of processing) • Morphology • POS tagging • Chunking • Parsing • Semantics Extraction • Discourse and Coreference

  4. Things done: POS tagging • POS Tagging • Generative Framework: HMM+Viterbi, Baum Welch, Forward-Backward • Assignments • Comparison with A* • Comparison with Discriminative • Evaluation: P, R, F-score, Error Analysis
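As a reminder of how HMM decoding works for POS tagging, here is a minimal Viterbi sketch. The two-tag model and all its probabilities are invented toy values (not from the course assignments), and the function name `viterbi` and its parameters are illustrative choices. It assumes every probability it looks up is non-zero, since it works in log space.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_state_sequence) under an HMM."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Follow back-pointers from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path

# Toy model (hypothetical numbers): N = noun, V = verb
states = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {
    "N": {"people": 0.6, "fish": 0.3, "swim": 0.1},
    "V": {"people": 0.1, "fish": 0.4, "swim": 0.5},
}

log_prob, tags = viterbi(["people", "fish"], states, start_p, trans_p, emit_p)
print(tags, math.exp(log_prob))
```

With these toy numbers the best tag sequence for "people fish" is N V, with path probability 0.7 × 0.6 × 0.7 × 0.4 = 0.1176.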

  5. Things done: Parsing • Parsing • Classical Parsing • Probabilistic Parsing • PCFG, Inside-Outside algorithm • Challenges for Parsing • Embedding, Multiple categories, length • Assignments • Projection (study project along with NLTK)
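The inside half of the Inside-Outside algorithm can be sketched as a CYK-style dynamic program. The sketch below assumes a PCFG in Chomsky normal form; the grammar, its probabilities, and the function name `inside_probability` are made up for illustration.

```python
from collections import defaultdict

def inside_probability(words, lexical, binary, start="S"):
    """Total probability of `words` under a CNF PCFG (inside algorithm).

    lexical: {(A, word): prob} for rules A -> word
    binary:  {(A, B, C): prob} for rules A -> B C
    """
    n = len(words)
    # inside[(i, j)][A] = P(A derives words[i:j])
    inside = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                inside[(i, i + 1)][a] += p
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for (a, b, c), p in binary.items():
                    left = inside[(i, k)].get(b, 0.0)
                    right = inside[(k, j)].get(c, 0.0)
                    if left and right:
                        # Sum (not max) over all derivations of the span
                        inside[(i, j)][a] += p * left * right
    return inside[(0, n)].get(start, 0.0)

# Toy grammar (hypothetical probabilities)
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
lexical = {("NP", "people"): 0.5, ("NP", "fish"): 0.5, ("V", "eat"): 1.0}

print(inside_probability(["people", "eat", "fish"], lexical, binary))
```

For this grammar the sentence has a single parse, so the inside probability equals the parse probability 1.0 × 0.5 × 1.0 × 1.0 × 0.5 = 0.25. Replacing the sum with a max (and storing back-pointers) would give probabilistic CYK for the best parse instead.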

  6. Things done: Wordnet and WSD • Wordnet • Lexical Matrix Principle • Lexical and Semantic Relations • Indowordnet • Word Sense Disambiguation • Supervised, Knowledge Based, Unsupervised • Assignment • YAGO (wordnet+wikipedia) • Navigation
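For the knowledge-based family of WSD methods, one well-known baseline is simplified Lesk: pick the sense whose gloss shares the most words with the context. The sketch below uses a hand-written two-sense inventory for "bank" (hypothetical glosses, not actual WordNet or IndoWordNet entries) rather than a real lexical resource.

```python
def simplified_lesk(word, context, sense_glosses, stopwords=frozenset()):
    """Return the sense whose gloss overlaps most with the context words."""
    target = word.lower()
    context_words = {w.lower() for w in context if w.lower() != target} - stopwords
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        gloss_words = set(gloss.lower().split()) - stopwords
        overlap = len(context_words & gloss_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Hypothetical sense inventory for illustration only
BANK_SENSES = {
    "bank_finance": "a financial institution that accepts deposits and lends money",
    "bank_river": "sloping land beside a body of water such as a river",
}
STOPWORDS = frozenset({"a", "an", "the", "of", "and", "to", "on", "that"})

sentence = "he sat on the bank of the river and watched the water".split()
print(simplified_lesk("bank", sentence, BANK_SENSES, STOPWORDS))
```

Here the river sense wins because its gloss shares "river" and "water" with the context, while the finance gloss shares nothing once stopwords are removed.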

  7. Things done: Speech, Phonetics and Phonology, Transliteration • Mechanisms of sound production in speech • Phonetics: consonants, vowels, place and manner of articulation • Syllables and syllabification, CMU pronunciation dictionary • Transliteration • Alignment of graphemes and phonemes

  8. Things done: EM (quite deeply) • Convexity • Concavity of log (second derivative < 0), equivalently convexity of -log • Log likelihood • Expectation of log likelihood • IMP: MLE of multinomial, multivariate mixture distribution (its wide applicability) • E and M steps of the above MLE
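The E and M steps for a mixture of multinomials can be sketched as follows. This is a minimal illustration, not the derivation from the lectures: the data are invented coin-toss counts, the starting parameters are arbitrary, and the multinomial coefficient is dropped because it cancels when responsibilities are normalised.

```python
import math

def _component_logs(x, weights, thetas):
    # log( pi_c * prod_d theta_cd^x_d ) for each component c
    return [math.log(w) + sum(n * math.log(p) for n, p in zip(x, th))
            for w, th in zip(weights, thetas)]

def log_likelihood(counts, weights, thetas):
    """Observed-data log likelihood, up to an additive constant."""
    total = 0.0
    for x in counts:
        logs = _component_logs(x, weights, thetas)
        m = max(logs)
        total += m + math.log(sum(math.exp(l - m) for l in logs))
    return total

def em_multinomial_mixture(counts, weights, thetas, iterations=50):
    """counts: one count vector per observation; returns (weights, thetas)."""
    k, v = len(weights), len(thetas[0])
    for _ in range(iterations):
        # E step: posterior responsibility of each component for each observation
        resp = []
        for x in counts:
            logs = _component_logs(x, weights, thetas)
            m = max(logs)
            unnorm = [math.exp(l - m) for l in logs]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])
        # M step: re-estimate mixing weights and category probabilities
        weights = [sum(r[c] for r in resp) / len(counts) for c in range(k)]
        thetas = []
        for c in range(k):
            expected = [sum(r[c] * x[d] for r, x in zip(resp, counts))
                        for d in range(v)]
            total = sum(expected)
            thetas.append([e / total for e in expected])
    return weights, thetas

# Toy data: [heads, tails] counts from sessions of 10 tosses of an unknown coin
counts = [[5, 5], [9, 1], [8, 2], [4, 6], [7, 3]]
weights, thetas = em_multinomial_mixture(
    counts, [0.5, 0.5], [[0.6, 0.4], [0.5, 0.5]], iterations=20)
print(weights, thetas)
```

Each iteration cannot decrease the log likelihood, which is the property that follows from the concavity of log via Jensen's inequality.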

  9. Student seminars • Covered quite a lot of breadth • Summarization • Smoothing • Neuro linguistics • Multimodal sentiment analysis • Text Entailment • Watson • Computers and Art • Semantic Web and (others to follow)

  10. Proposed seminar topics

  11. Goals of the M.Tech seminar • Independent study: breadth, depth, recency • Coherent study and write up on things read/discussed: a very good report • Final presentation to examiner(s) • Regular review of papers, working on foundations • May create the foundation for MTP • IMP: complete awareness of the area, excellent report, linking with literature and presentation

  12. Seminar Topics • Empirical: Cross Lingual IR, Text Entailment, Machine Translation, Sentiment Analysis, Lexical Resources & Semantic Web • Theoretical: Projections, Quantum Computing and NLP

  13. Details: Theoretical • Quantum Computing and NLP • Study of quantum mechanical techniques, especially for Shallow Parsing (POS tagging and Chunking) • Projections • Projecting probabilistic parameters from annotated data of one language to another • Projecting annotation structures from one language to another

  14. Details: CLIR • Study of EM based Query Expansion • Engineering of Cross Lingual Search with Indian Language in focus • Query Processing • Crawl • Named Entities • Multiwords

  15. Details: Machine Translation • Classical Machine Translation • Interlingua based • Transfer based • Statistical MT • Alignment • Decoding • Factored SMT • Tree based SMT
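The alignment step of statistical MT is commonly introduced through IBM Model 1, which learns word translation probabilities with EM. The sketch below is a simplified version under stated assumptions: no NULL word, uniform initialisation, and a three-sentence German-English toy corpus chosen for illustration. The function name `ibm_model1` is an illustrative choice.

```python
from collections import defaultdict

def ibm_model1(pairs, iterations=10):
    """Estimate t[(f, e)] = P(foreign word f | English word e) with EM.

    pairs: list of (foreign_tokens, english_tokens) sentence pairs.
    """
    f_vocab = {f for fs, _ in pairs for f in fs}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) alignments
        total = defaultdict(float)  # expected count of e being aligned at all
        # E step: distribute each foreign word over the English words in its pair
        for fs, es in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M step: renormalise expected counts per English word
        t = defaultdict(float, {(f, e): c / total[e]
                                for (f, e), c in count.items()})
    return t

# Toy parallel corpus (foreign = German, target = English)
pairs = [
    (["das", "Haus"], ["the", "house"]),
    (["das", "Buch"], ["the", "book"]),
    (["ein", "Buch"], ["a", "book"]),
]
t = ibm_model1(pairs, iterations=20)
print(round(t[("Haus", "house")], 3), round(t[("das", "the")], 3))
```

Even though "das" and "Haus" co-occur equally often with "house", the model learns to prefer "Haus", because "das" is better explained by "the" in the second sentence pair.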

  16. Details: Sentiment Analysis • Foundations and Approaches • Bag of Words based • Syntax based • Deep Semantics based • Cross Lingual Sentiment Analysis • Challenges • Thwarting (ML with skewed data) • Sarcasm (Multimodal SA)

  17. Details: Text Entailment • Definition and Approaches • Bag of Words based • Syntax based • Shallow Semantics based • Challenges • Use of deep semantics
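A bag-of-words entailment baseline can be sketched as the fraction of hypothesis words that also occur in the text. The threshold value, the whitespace tokenisation, and the function names below are assumptions made for illustration, not a method specified in the lecture.

```python
def overlap_score(text, hypothesis):
    """Fraction of hypothesis tokens that also appear in the text."""
    text_tokens = set(text.lower().split())
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        return 0.0
    covered = sum(1 for tok in hyp_tokens if tok in text_tokens)
    return covered / len(hyp_tokens)

def entails(text, hypothesis, threshold=0.8):
    """Predict entailment when word overlap reaches the (arbitrary) threshold."""
    return overlap_score(text, hypothesis) >= threshold

print(overlap_score("a man is playing a guitar on stage", "a man is playing a guitar"))
print(overlap_score("the dog bit the man", "the man bit the dog"))
```

Both calls print 1.0. The second is a false positive: word order is ignored, so "the dog bit the man" appears to entail "the man bit the dog", which is the kind of failure that motivates the syntax-based and semantics-based approaches on this slide.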

  18. Details: Semantic Web and Lexical Resources • Linked Open Data • Multilingual Projections • Disambiguation using LOD • Integration of wordnets and ontologies

  19. It was great teaching you God bless and God speed!
