CS460/626: Natural Language Processing/Speech, NLP and the Web. Lecture 35: Closing. Pushpak Bhattacharyya, CSE Dept., IIT Bombay. 12th Nov, 2012
NLP: Two pictures • [Figure: NLP shown alongside Vision and Speech; label: Statistics and Probability + Knowledge Based] • [Figure: NLP Trinity with three axes. Problem: Morph Analysis, Part of Speech Tagging, Parsing, Semantics. Language: Hindi, Marathi, English, French. Algorithm: HMM, MEMM, CRF]
NLP layers, in order of increasing complexity of processing • Morphology • POS tagging • Chunking • Parsing • Semantics Extraction • Discourse and Coreference
Things done: POS tagging • POS Tagging • Generative Framework: HMM+Viterbi, Baum Welch, Forward-Backward • Assignments • Comparison with A* • Comparison with Discriminative • Evaluation: P, R, F-score, Error Analysis
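As a reminder of how HMM+Viterbi decoding works for POS tagging, here is a minimal sketch in Python. The two-tag tagset, the two-word sentence, and all probabilities are made-up toy values chosen for illustration; they are not taken from the course assignments.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (best_tag_sequence, log_probability) for `obs` under an HMM."""

    def lg(p):
        # Work in log space; treat zero probability as -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best tag path ending in state s at time t
    best = [{s: lg(start_p[s]) + lg(emit_p[s].get(obs[0], 0.0)) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + lg(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lg(emit_p[s].get(obs[t], 0.0))
            back[t][s] = prev

    # Follow back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Toy model (assumed values): N = noun, V = verb.
STATES = ["N", "V"]
START_P = {"N": 0.8, "V": 0.2}
TRANS_P = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
EMIT_P = {"N": {"people": 0.7, "laugh": 0.3}, "V": {"people": 0.2, "laugh": 0.8}}

tags, log_prob = viterbi(["people", "laugh"], STATES, START_P, TRANS_P, EMIT_P)
print(tags, math.exp(log_prob))
```

With these toy numbers the best path is N followed by V, with probability 0.8 × 0.7 × 0.7 × 0.8 = 0.3136. Log probabilities are used so that longer sentences do not underflow.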
Things done: Parsing • Parsing • Classical Parsing • Probabilistic Parsing • PCFG, Inside-Outside algorithm • Challenges for Parsing • Embedding, Multiple categories, length • Assignments • Projection (study project along with NLTK)
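The "inside" half of the Inside-Outside algorithm can be sketched as a CKY-style dynamic program that sums the probabilities of all parses of a sentence. The sketch below assumes a PCFG in Chomsky normal form; the grammar, its probabilities, and the example sentence are illustrative toy values, and the function name `inside_probability` is hypothetical.

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, start="S"):
    """Total probability of `words` under a CNF PCFG, summed over all parses.

    lexical: {(A, word): prob} for rules A -> word
    binary:  {(A, B, C): prob} for rules A -> B C
    """
    n = len(words)
    # beta[(i, j)][A] = P(A derives words[i:j])
    beta = defaultdict(lambda: defaultdict(float))

    # Spans of length 1 come from lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[(i, i + 1)][a] += p

    # Longer spans: sum over split points k and binary rules A -> B C.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = beta[(i, k)].get(b, 0.0)
                    right = beta[(k, j)].get(c, 0.0)
                    if left and right:
                        beta[(i, j)][a] += p * left * right

    return beta[(0, n)].get(start, 0.0)


# Toy grammar (assumed values) with a PP-attachment ambiguity.
LEXICAL = {
    ("P", "with"): 1.0,
    ("V", "saw"): 1.0,
    ("NP", "astronomers"): 0.1,
    ("NP", "ears"): 0.18,
    ("NP", "saw"): 0.04,
    ("NP", "stars"): 0.18,
    ("NP", "telescopes"): 0.1,
}
BINARY = {
    ("S", "NP", "VP"): 1.0,
    ("PP", "P", "NP"): 1.0,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.4,
}

sentence = "astronomers saw stars with ears".split()
print(inside_probability(sentence, LEXICAL, BINARY))
```

The sentence has two parses under this grammar (the PP attaches to the noun phrase or to the verb phrase), with probabilities 0.0009072 and 0.0006804, so the inside probability of S over the whole sentence is their sum, 0.0015876.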
Things done: WordNet and WSD • WordNet • Lexical Matrix Principle • Lexical and Semantic Relations • IndoWordNet • Word Sense Disambiguation • Supervised, Knowledge Based, Unsupervised • Assignment • YAGO (WordNet + Wikipedia) • Navigation
Things done: Speech, Phonetics and Phonology, Transliteration • Mechanisms of sound production in speech • Phonetics: consonants, vowels, place and manner of articulation • Syllables and syllabification, CMU pronunciation dictionary • Transliteration • Alignment of graphemes and phonemes
Things done: EM (quite deeply) • Convexity and concavity • Concavity of log (second derivative <= 0; equivalently, -log is convex) • Log likelihood • Expectation of log likelihood • IMP: MLE of multinomial, multivariate mixture distribution (its wide applicability) • E and M steps of the above MLE
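The E and M steps can be illustrated on the simplest mixture case: each trial is a batch of coin flips produced by one of two coins of unknown bias, and which coin was used is hidden. The sketch below is an assumption-laden toy, not the derivation from the lectures: the data, the initial parameter values, and the function name `em_two_coins` are all hypothetical.

```python
import math


def em_two_coins(heads, n_flips, theta_a=0.6, theta_b=0.5, pi_a=0.5, iters=50):
    """EM for a mixture of two coins.

    heads:   number of heads observed in each trial of `n_flips` flips
    theta_a: P(head) for coin A, theta_b: P(head) for coin B
    pi_a:    prior probability that a trial used coin A
    Returns (theta_a, theta_b, pi_a, log_likelihood_history).
    """
    history = []
    for _ in range(iters):
        # E-step: posterior probability that each trial came from coin A.
        # The binomial coefficient is the same for both coins, so it is
        # dropped; the logged likelihood is correct up to an additive constant.
        resp = []
        loglik = 0.0
        for h in heads:
            la = pi_a * theta_a ** h * (1 - theta_a) ** (n_flips - h)
            lb = (1 - pi_a) * theta_b ** h * (1 - theta_b) ** (n_flips - h)
            resp.append(la / (la + lb))
            loglik += math.log(la + lb)
        history.append(loglik)

        # M-step: weighted maximum-likelihood estimates.
        weight_a = sum(resp)
        weight_b = len(heads) - weight_a
        theta_a = sum(r * h for r, h in zip(resp, heads)) / (weight_a * n_flips)
        theta_b = sum((1 - r) * h for r, h in zip(resp, heads)) / (weight_b * n_flips)
        pi_a = weight_a / len(heads)
    return theta_a, theta_b, pi_a, history


# Toy data (assumed): heads out of 10 flips in five trials.
theta_a, theta_b, pi_a, history = em_two_coins([5, 9, 8, 4, 7], 10)
print(theta_a, theta_b, pi_a)
print(history[0], history[-1])
```

The log likelihood recorded in `history` should never decrease from one iteration to the next, which is a useful sanity check when implementing EM for any mixture model.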
Student seminars • Covered quite a lot of breadth • Summarization • Smoothing • Neurolinguistics • Multimodal sentiment analysis • Text Entailment • Watson • Computers and Art • Semantic Web and (others to follow)
Goals of the M.Tech seminar • Independent study: breadth, depth, recency • Coherent study and write up on things read/discussed: a very good report • Final presentation to examiner(s) • Regular review of papers, working on foundations • May create the foundation for MTP • IMP: complete awareness of the area, excellent report, linking with literature and presentation
Seminar Topics • Empirical: Cross Lingual IR, Text Entailment, Machine Translation, Sentiment Analysis, Lexical Resources & Semantic Web • Theoretical: Projections, Quantum Computing and NLP
Details: Theoretical • Quantum Computing and NLP • Study of quantum mechanical techniques, especially for Shallow Parsing (POS tagging and Chunking) • Projections • Projecting probabilistic parameters from annotated data of one language to another • Projecting annotation structures from one language to another
Details: CLIR • Study of EM based Query Expansion • Engineering of Cross Lingual Search with Indian Language in focus • Query Processing • Crawl • Named Entities • Multiwords
Details: Machine Translation • Classical Machine Translation • Interlingua based • Transfer based • Statistical MT • Alignment • Decoding • Factored SMT • Tree based SMT
Details: Sentiment Analysis • Foundations and Approaches • Bag of Words based • Syntax based • Deep Semantics based • Cross Lingual Sentiment Analysis • Challenges • Thwarting (ML with skewed data) • Sarcasm (Multimodal SA)
Details: Text Entailment • Definition and Approaches • Bag of Words based • Syntax based • Shallow Semantics based • Challenges • Use of deep semantics
Details: Semantic Web and Lexical Resources • Linked Open Data • Multilingual Projections • Disambiguation using LOD • Integration of wordnets and ontologies
It was great teaching you. God bless and Godspeed!