Introduction to the Module John Barnden School of Computer Science University of Birmingham Natural Language Processing 1 2015/16 Semester 2
I/Me/Mine • John Barnden is my name • And natural language processing is my game ... • Specifically and mainly: metaphor theory & processing • I’m Professor of Artificial Intelligence • I’m also Diversity & Equality Officer for the School. • Coords: • Room 136 • Tel. 4-3816 • J.A.Barnden@cs.bham.ac.uk
Demonstrator • Mohab Elkaref, M.E.A.R.Elkaref@cs.bham.ac.uk, room 218 • His job: • Help you with any aspect of the module • Incl.: understanding the material, getting a start on exercises (even when assessed), using some computer programs that will be available, helping with marking, giving a lecture on his own work. • NB: I’m also getting three other PhD students to give lectures on their work and the technology they use.
You • What degrees are you on? • Why did you choose this module? • What have you heard about NLP?
Syllabus Page and Website • FIND and READ the syllabus page for this module!! • In the Relevant Links section, follow the link to my own top webpage for the module. • Mainly, the Canvas page will just point to that page and include the recordings of the lectures. • READ that top webpage. • Lecture slides, exercises, etc. will probably hang from it, not directly from the Canvas page. • I will have slides up a day or two before a lecture, but probably not earlier, as I like to allow lots of flexibility in class.
Assessment • 1.5-hour exam (80%). • NB: in its detail, the exam will differ considerably from previous exams. • Mid-term test (10%), on the Thursday in week 5 of this term, i.e. Thurs 11th Feb. • Exercise-set as homework, Weeks 9-11 (10%) • To be done individually, with limited collaboration (to be clarified later). • Be aware of the plagiarism documentation in the student handbook on the School website!!
Official Aims of Module (plus Notes by me) • Introduce Natural Language Processing as one of the components of Artificial Intelligence, both from engineering and cognitive viewpoints. Note: • NLP gives insight into mind, and into AI in general. • Provide a basis for the programming of NLP techniques ... Notes: • The module is not a software workshop, and only aims to give you abstract algorithms and other background for NLP programming. • Emphasis will be more on the underlying concepts, theory, problems, and understanding of algorithms. • But you will also be introduced to some practical tools.
More Notes on Aims of Module • The module will largely be about processing of textual language. • Only occasional comments will be made about processing of speech. • The language-processing field is largely divided into textual and speech-processing aspects. • Speech brings in a host of extra technical problems. • Text processing is (more than) enough for (more than) one module! • The main module textbook contains much information about speech processing (optional reading). • The module will (very briefly) mention ramifications into sign language and manual gesture. • There will be some attention to variations such as textese.
Unofficial Aims of Module • To make you aware of language as a really fun thing to think about! • To show you it acts strangely and wonderfully all around us all the time! • To show you it’s technically challenging to deal with, in all sorts of fascinating ways!
Textbook and Its Relationship to Module • Main textbook is the Jurafsky & Martin 2009 book on syllabus page. • Plays an important role in the module. • In many cases the lectures can only give a brief intro to a more detailed treatment in the textbook. • Assessed work will assume a (reasonable level of) knowledge of specified parts of the textbook. • Lectures will cover some things not covered in the textbook, and will further illuminate some things that are. • You can of course ask me or the demonstrator privately for help with understanding textbook material.
Nature of Class Sessions • Mainly lecture, but with • Occasional in-class exercises (formative) • Mid-term in-class test (assessed -- 10%). • You are strongly encouraged to ask questions or make comments in class. • I will have detailed lecture slides (accessible via my module website), but may say important things that are not on the slides. • These slides will always be on the web. • I will occasionally supply additional notes (electronic), including answer notes about exercises.
What the Study of Language Covers, 1 (NB: not all covered in this module!) • What language is, as distinct from other things we do or use. • But also how it’s related to some such things. • Whether other creatures use language. • Speech aspects, textual aspects, signing aspects, gestural aspects. • Connection of language to diagrams, pictures, music, thought ... • Poetic and other artistic aspects of language. • Specific purposes of language such as persuasion and intimacy-building. • Learning/teaching of language (either naturally or deliberately). • Development of language over history.
What the Study of Language Covers, 2 • How we get meaning (in the broadest sense, including things like emotion) from discourse. • How discourse is broken down into components (e.g., sentences, phrases, words, parts of words). • How the meaning of a phrase, sentence or complex discourse segment depends on the meanings of the parts and other information. • How the above differs between: text, speech, signing, ... • Translation between different languages.
Language Technology • Any use of language processing by a computer system. Some main topical examples, all of extensive, current practical importance: • Machine translation. • Document summarization. • Information extraction. • Text mining. • Information retrieval (usually = retrieval of whole documents). • Conversational agents, whether for • general chat as in fronting of sites (IKEA, US Army, ...), chatrooms and artificial companions • or for specific tasks such as booking tickets, therapy, or other life help. • Sentiment analysis: extracting the emotional/evaluative tone of language objects such as product reviews, customer complaints or user interactions with an HCI system. • Web searching.
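To make the sentiment-analysis item concrete, here is a minimal sketch of the crudest possible approach: counting words from a hand-made list. The word lists and the function name `sentiment_score` are invented for illustration and are not taken from the module or the textbook; real systems are far more sophisticated.

```python
import re

# Tiny hand-made lexicons (assumptions for illustration only).
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "broken"}


def sentiment_score(text):
    """Return (#positive words) - (#negative words) found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    # A bool minus a bool gives +1, 0 or -1 for each word.
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


if __name__ == "__main__":
    for review in ("Great phone, love the screen",
                   "Poor battery and a broken charger",
                   "Not good"):
        print(review, "->", sentiment_score(review))
```

Note that "Not good" receives a positive score, because word counting ignores negation. That is one small sign of why extracting evaluative tone needs more than a word list.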
A Standard Breakdown • Language is traditionally (and still currently) viewed as having the following aspects or levels: • Phonological / orthographical (and the analogous level in sign language): • The patterns of sounds, letters or hand/body movements in basic units such as words, and what happens to them when words (etc.) are put together. • Morphological: • Largely about how words are broken down into conceptually significant segments (i.e. not just into letters, etc.). • Syntactic: • The patterns of words of various types found in bigger units such as sentences. • Semantic: • The primary meanings of words, phrases and sentences. • Pragmatic: • More subtle and/or context-dependent aspects of the way in which meaning and other effects arise from language. Often extends beyond sentence boundaries.
But This Breakdown is Broken-Down! • There is no sharp distinction between morphology and syntax. • For one thing, what counts as a word is unclear. And words can be built from other words. The nature of the distinction varies between languages. • The syntax/semantics distinction is somewhat difficult and theory-laden. • Even defining what the traditional “parts of speech” (nouns, verbs, etc.) are in an objective way is tricky, and brings in both syntax and semantics. • The semantics/pragmatics distinction is hugely contentious and theory-laden. • There are many different versions of what sort of meaning semantics gets at, and of what pragmatics adds. • Even if the breakdown could be theoretically maintained, it would not imply that language processing would, should, or even could, be correspondingly divided, because of the extensive interaction between the different aspects.
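The point that "what counts as a word is unclear" can be seen even in a one-line example. The sketch below applies three plausible tokenisation rules to the same sentence; the function names and the example sentence are invented for illustration, and none of the rules is claimed to be the correct one.

```python
import re


def whitespace_tokens(text):
    """Words are whatever is separated by spaces (punctuation stays attached)."""
    return text.split()


def alnum_tokens(text):
    """Words are maximal runs of letters or digits only."""
    return re.findall(r"[A-Za-z0-9]+", text)


def hyphen_apostrophe_tokens(text):
    """Like alnum_tokens, but internal hyphens and apostrophes join a word."""
    return re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text)


SENTENCE = "Don't pre-book the ice cream."

if __name__ == "__main__":
    for rule in (whitespace_tokens, alnum_tokens, hyphen_apostrophe_tokens):
        tokens = rule(SENTENCE)
        print(rule.__name__, len(tokens), tokens)
```

The three rules disagree on whether "Don't" is one word or two, and whether "pre-book" is one word or two. None of them treats "ice cream" as a single unit, although many speakers would.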
Rough Set of Topics • What counts as a word? • (Morphology) • Simple Grammar and Parts of Speech (POSs) • POS Analysis • Syntactic Analysis • Some Logic needed for ... • Semantic Analysis • Pragmatics and Other Advanced Topics
Some Intriguing Exercises You do “Introductory Exercise-Set A.” If there’s time, we discuss those exercises. You do “Introductory Exercise-Set B.” That will lead into the next segment of the module ...