CS60057 Speech & Natural Language Processing • Autumn 2005 • Lecture 1 • 21 July 2005 Natural Language Processing
Course Information • Instructor: Sudeshna Sarkar • Course Web Page: http://www.facweb.iitkgp.ernet.in/~sudeshna/courses/nlp/ • Teaching Assistants: Monojit Choudhury
Today’s slides adapted from • Ilyas Cicekli’s slides: http://www.cs.ucf.edu/~ilyas/Courses/CAP6640 • Martin & Jurafsky’s book
Preliminaries Required • Basic formal language theory • We will introduce the basics for those not familiar • Knowledge of linguistic terminology will be useful. • You can get a good overview of the field from “Survey of the State of the Art in Human Language Technology” http://cslu.cse.ogi.edu/HLTsurvey/HLTsurvey.html (1996) • Assignment 1: Read the Survey
Text Books • Daniel Jurafsky and James H. Martin, "Speech and Language Processing", Prentice Hall, 2000. Other References • James Allen, "Natural Language Understanding", Second edition, The Benjamin/Cummings Publishing Company Inc., 1995. • Christopher D. Manning and Hinrich Schütze, "Foundations of Statistical Natural Language Processing", The MIT Press, 1999.
Goal of NLP • Develop techniques and tools to build practical and robust systems that can communicate with users in one or more natural languages
Course Topics • Some linguistics: phonology, morphology, syntax, semantics, discourse • Words & the lexicon • Probability & Information Theory • Language Modeling: N-gram models, parameter estimation • Morphological Processing • Part-of-Speech Tagging • Parsing Algorithms for Context-Free Languages • Features and Augmented Grammars • Lexicalized and Probabilistic Parsing • Semantic Analysis • Lexical Semantics and Word Sense Disambiguation • Discourse • Natural Language Generation • Machine Translation
NLP • The ultimate research goal: • To develop an automated language understanding system • What is NLP? • The process of computer analysis of input provided in a human language (natural language), and conversion of this input into a useful form of representation. • The field of NLP is concerned with • Primarily: getting computers to perform useful and interesting tasks with human languages. • Secondarily: helping us come to a better understanding of human language. • Why is this useful?
Motivation for Natural Language Processing / Understanding • Getting computers to perform useful and interesting tasks with human languages • Enables communication: • Human-computer interaction, e.g., information retrieval (IR) • Computer-assisted human-human communication, e.g., machine translation (MT) • Computer modeling of NLP helps us to: • Understand language processing in humans • Understand other human cognitive processes • Challenging task: • Requires a high level of knowledge about the world • Requires the ability to represent that knowledge and reason with it
Forms of Natural Language • The input/output of an NLP system can be: • Written text: newspaper articles, letters, manuals, prose, … • Speech: read speech (radio, TV, dictations), conversational speech, commands, … • To process written text, we need: • lexical, syntactic, and semantic knowledge about the language • discourse information • real-world knowledge
Forms of Natural Language • To process written text, we need: • lexical, syntactic, semantic knowledge about the language • discourse information, real world knowledge • To process spoken language, we need • everything above plus • speech recognition • speech synthesis
Components of NLP • Natural Language Understanding • Mapping the given input in the natural language into a useful representation. • Different levels of analysis are required: morphological analysis, syntactic analysis, semantic analysis, discourse analysis, … • Natural Language Generation • Producing output in the natural language from some internal representation. • Different levels of synthesis are required: • deep planning (what to say), • syntactic generation • Which is harder?
Why is NL Understanding hard? • Natural language is extremely rich in form and structure, and very ambiguous. • How to represent meaning, • Which structures map to which meaning structures. • One input can mean many different things. Ambiguity can be at different levels. • Lexical (word level) ambiguity -- different meanings of words • Syntactic ambiguity -- different ways to parse the sentence • Interpreting partial information -- how to interpret pronouns • Contextual information -- the context of a sentence may affect its meaning. • Many inputs can mean the same thing. • Interaction among components of the input is not clear. • Noisy input (e.g. speech)
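Syntactic ambiguity can be made concrete by counting parse trees. The sketch below is illustrative only: the toy grammar, the lexicon, and the example sentence are assumptions that do not come from the slides. It uses a CKY-style chart over a grammar in Chomsky normal form to count how many distinct trees cover a sentence.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]

# Toy lexicon: word -> categories it can take (also an assumption).
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Return the number of distinct parse trees rooted in `start`."""
    n = len(words)
    # chart[(i, j, symbol)] = number of trees for `symbol` over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[(i, i + 1, category)] += 1
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for parent, left, right in BINARY_RULES:
                    chart[(i, j, parent)] += (
                        chart[(i, k, left)] * chart[(k, j, right)]
                    )
    return chart[(0, n, start)]


if __name__ == "__main__":
    for sentence in ["I saw the man", "I saw the man with the telescope"]:
        print(sentence, "->", count_parses(sentence.split()), "parse(s)")
```

Under this toy grammar the longer sentence gets two trees: one where "with the telescope" modifies the seeing, and one where it modifies the man. A real grammar would typically license many more.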
Knowledge of Language • Phonology – concerns how words are related to the sounds that realize them. • Morphology – concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language. • Syntax – concerns how words can be put together to form correct sentences, and determines what structural role each word plays in the sentence and what phrases are subparts of other phrases. • Semantics – concerns what words mean and how these meanings combine in sentences to form sentence meaning. The study of context-independent meaning.
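The idea that words are built from morphemes can be sketched with a naive affix stripper. The prefix and suffix lists, the minimum stem length, and the function name `segment` are all assumptions made for illustration. Real morphological analysis must also handle spelling changes and irregular forms, which this sketch ignores.

```python
# Tiny affix inventories (hypothetical; far from complete for English).
PREFIXES = ["un", "re"]
SUFFIXES = ["ness", "ing", "ed", "ly", "s"]
MIN_STEM = 3  # do not strip an affix if the remaining stem would be shorter


def segment(word):
    """Split `word` into [prefix?, stem, suffix...] by naive affix stripping."""
    prefixes, suffixes = [], []
    stem = word

    # Strip at most one prefix.
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM:
            prefixes.append(prefix)
            stem = stem[len(prefix):]
            break

    # Repeatedly strip the longest matching suffix.
    stripped = True
    while stripped:
        stripped = False
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM:
                suffixes.insert(0, suffix)
                stem = stem[:-len(suffix)]
                stripped = True
                break

    return prefixes + [stem] + suffixes


if __name__ == "__main__":
    for w in ["unkindness", "walked", "cats", "bus"]:
        print(w, "->", segment(w))
```

The minimum stem length is a crude guard that keeps "bus" from being split into "bu" + "s". It does not stop the sketch from over-segmenting words such as "under", which is one reason practical systems rely on a lexicon.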
Knowledge of Language • Pragmatics – concerns how sentences are used in different situations and how use affects the interpretation of the sentence. • Discourse – concerns how the immediately preceding sentences affect the interpretation of the next sentence. For example, interpreting pronouns and interpreting the temporal aspects of the information. • World Knowledge – includes general knowledge about the world, and what each language user must know about the other’s beliefs and goals.