This course provides an overview of Natural Language Processing (NLP) and its applications in engineering and science. It covers interdisciplinary aspects, basic problems, levels of linguistic knowledge, ambiguity, and algorithms used in NLP. The course also explores various approaches, topics, and tools in NLP, such as semantics, statistics, information extraction, machine translation, and information retrieval.
CSA3180 Natural Language Processing: Introduction and Course Overview CSA3180 NLP
Acknowledgement • Material for some of these slides is taken from J. Nivre, University of Gothenburg, Sweden
Why Language and Computers • Engineering • NLP is concerned with the design and implementation of effective NL input and output components for computational systems (Robert Dale 2000) • Scientific • The use of computers for linguistic research and applications
NLP is Interdisciplinary • Linguistics • Theoretical • Applied • Computer Science • Algorithms • Compiling Techniques • Artificial Intelligence • Understanding, reasoning • Intelligent Action
Uszkoreit’s (2000) Five Points • Solving the human language puzzle • by implementing complex theories directly • Teaching computers to communicate with people • by exploiting natural modes of communication • Friendly software should listen and speak • through development of multimodal communication • Machines can help people communicate with each other • by developing multilingual applications • Language is the fabric of the web • through language technology for knowledge management
Application Areas • Document Processing • Classification • Summarisation • Information Extraction • Question Answering • Information Retrieval • Dialogue • Multilinguality • Machine Translation • Translation tools • Multimodality • speech • intonation • image
Basic Problems • Analysis • Conversion of NL input to internal representations • Generation • Conversion of internal representations to NL output • Issues • What kind of input/output/representations? • Evaluation • Learning
Levels of Linguistic Knowledge • Phonetics/Phonology: sound structure • Morphology: word structure • Syntax: sentence structure • Semantics: meanings • Pragmatics: use of language in context • Discourse: paragraphs, texts, dialogues
Ambiguity • Morpho-syntactic: We saw her duck • Lexical semantic: They went to the bank • Structural semantic: Young men and women • Referential: She did it • Pragmatic: Can you pass the salt
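The morpho-syntactic example on this slide can be made concrete with a small parse counter. The following is a minimal sketch, not part of the course material: the grammar, the lexicon, the category names (`SC` for a small clause, `VI` for a bare intransitive verb) and the function `count_parses` are all assumptions chosen for illustration. It uses a CKY-style chart to count how many analyses a toy grammar assigns to a sentence.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),   # saw [her duck], with "duck" as a noun
    ("VP", "V", "SC"),   # saw [her duck], with "duck" as a verb
    ("SC", "NP", "VI"),  # small clause: pronoun + bare verb
    ("NP", "Det", "N"),  # possessive determiner + noun
]

# Toy lexicon: "her" and "duck" each belong to two categories.
LEXICON = {
    "we": {"NP"},
    "saw": {"V"},
    "her": {"Det", "NP"},
    "duck": {"N", "VI"},
}


def count_parses(words, start="S"):
    """Count the parses of `words` rooted in `start` using a CKY chart."""
    n = len(words)
    # chart[(i, k)][cat] = number of ways to derive words[i:k] from cat
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for cat in LEXICON.get(word.lower(), ()):
            chart[(i, i + 1)][cat] += 1
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            k = i + width
            for j in range(i + 1, k):  # split point
                for parent, left, right in BINARY_RULES:
                    n_left = chart[(i, j)].get(left, 0)
                    n_right = chart[(j, k)].get(right, 0)
                    if n_left and n_right:
                        chart[(i, k)][parent] += n_left * n_right
    return chart[(0, n)].get(start, 0)


print(count_parses("We saw her duck".split()))
```

Under this toy grammar the sentence receives two analyses, one per reading of "duck"; a realistic grammar would of course need far more rules and categories.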
Ways of Studying NLP • By Application: MT, IE, IR etc. • By Approach: rational vs. empirical • By Linguistic Level: morphology, syntax etc. • By Algorithm
Algorithms • State Machines • automata and transducers • Rule Systems • regular and context-free grammars • Search • top-down/bottom-up parsing • Probabilistic algorithms
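As an illustration of the first item, a deterministic finite-state automaton can be written as a transition table plus a short loop. This is a minimal sketch under stated assumptions: the toy language ("b", then two or more "a", then "!"), the state numbering, and the names `TRANSITIONS`, `ACCEPTING` and `accepts` are hypothetical and not taken from the course.

```python
# Transition table for a toy language: "b", two or more "a", then "!".
# Keys are (current state, input symbol); values are the next state.
TRANSITIONS = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,  # loop: any number of further "a"
    (3, "!"): 4,
}
ACCEPTING = {4}


def accepts(string):
    """Return True if the automaton ends in an accepting state."""
    state = 0  # start state
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING


for candidate in ["baa!", "baaaa!", "ba!", "baa"]:
    print(candidate, accepts(candidate))
```

Keeping the machine as data rather than as nested conditionals means the same loop can run any automaton; a transducer would extend each table entry with an output symbol.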
Approach in this Course: Part I - Algorithms • Words [3] • Finite State Algorithms • Morphological Processing • Sentences [3] • Parsing • (Generation) • Texts [3] • Tagging • Chunking
Approach in this Course: Part II - Topics and Tools • Semantics [6] • Statistics [6] • Information Extraction [6] • Machine Translation [4] • Information Retrieval [3]
Course Information • Course Website: www.cs.um.edu.mt/~mros/csa3180 • Reference Text: Jurafsky and Martin • Tools • Prolog: SWI Prolog • NLTK: nltk.sourceforge.net