This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.
CS 479, section 1: Natural Language Processing
Lecture #22: Part of Speech Tagging, Hidden Markov Models
Thanks to Dan Klein of UC Berkeley for many of the materials used in this lecture.
Announcements
• Reading Report #9: Hidden Markov Models
  • Due: now
• Project #2, Part 1 loose ends
  • Resolve issues raised by the TA and respond promptly
• Project #2, Part 2
  • Early: Friday
  • Due: Monday
• Mid-course Evaluation
  • Your feedback is important
• Colloquium by Dr. Jordan Boyd-Graber
  • Thursday at 11am
Objectives
• New general problem: labeling sequences
• First application: Part-of-Speech Tagging
• Introduce first technique: Hidden Markov Models (HMMs)
Parts-of-Speech
• Syntactic classes of words: where do they come from?
• Useful distinctions vary from language to language
• Tag-sets even vary from corpus to corpus [see M&S p. 142]
• Some tags from the Penn tag-set used in this lecture: NN (singular noun), NNS (plural noun), NNP (proper noun), VB/VBD/VBN/VBZ/VBP (verb forms), JJ (adjective), DT (determiner), IN (preposition), RP (particle), CD (cardinal number)
Part-of-Speech Ambiguity
• Favorite example: "Fed raises interest rates 0.5 percent"
• Candidate tags for each word:
  • Fed: NNP, VBN, VBD
  • raises: NNS, VBZ, VB
  • interest: NN, VBP
  • rates: NNS, VBZ
  • 0.5: CD
  • percent: NN
Why POS Tagging?
• Text-to-speech: record [v] vs. record [n]; lead [v] vs. lead [n]; object [v] vs. object [n]
• Lemmatization: saw [v] → see; saw [n] → saw
• Quick-and-dirty NP-chunk detection: grep {JJ | NN}* {NN | NNS} (see the sketch below)
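The grep pattern above can be read as a scan over POS tags. Here is a rough Python sketch of that idea; the tagged sentence and the scanning loop are illustrative assumptions, not course-provided code:

```python
# A rough sketch of the "{JJ | NN}* {NN | NNS}" chunk pattern as a simple scan.
# The tagged sentence is made up for illustration.
tagged = [("The", "DT"), ("Georgia", "NNP"), ("branch", "NN"),
          ("had", "VBD"), ("taken", "VBN"), ("on", "RP"),
          ("loan", "NN"), ("commitments", "NNS")]

chunks, cur = [], []
for word, tag in tagged:
    if tag in {"JJ", "NN", "NNS"}:               # tags that may appear in a chunk
        cur.append((word, tag))
    else:
        if cur and cur[-1][1] in {"NN", "NNS"}:  # a chunk must end in a noun
            chunks.append(" ".join(w for w, _ in cur))
        cur = []
if cur and cur[-1][1] in {"NN", "NNS"}:          # flush a chunk at the sentence end
    chunks.append(" ".join(w for w, _ in cur))

print(chunks)   # ['branch', 'loan commitments']
```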
Why POS Tagging?
• Useful as a pre-processing step for parsing
• Less tag ambiguity means fewer parses
• However, some tag choices are better decided by parsers!
  • The/DT Georgia/NNP branch/NN had/VBD taken/VBN on/RP loan/NN commitments/NNS … (should "on" be tagged RP or IN?)
  • The/DT average/NN of/IN interbank/NN offered/VBD rates/NNS plummeted/VBD … (should "offered" be tagged VBD or VBN?)
Part-of-Speech Ambiguity
• Back to our example: "Fed raises interest rates 0.5 percent" (with the candidate tags shown earlier)
• What information sources would help?
• Two basic sources of constraint:
  • Grammatical environment
  • Identity of the current word
• Many more possible features:
  • … but we won't be able to use them just yet
How?
• Recall our two basic sources of information:
  • Grammatical environment
  • Identity of the current word
• How can we use these insights in a joint model?
  • previous tag → tag
  • own tag → word (remember, think generative!)
• What would that look like?
Hidden Markov Model (HMM)
• A generative model over tag sequences and observations
• Assume: the tag sequence is generated by an order-d Markov chain
• Assume: words are chosen independently, conditioned only on the tag
• E.g., order 2: P(t_1 … t_n, x_1 … x_n) = ∏_i P(t_i | t_{i-2}, t_{i-1}) · P(x_i | t_i)
• Need two "local models":
  • Transitions: P(t_i | t_{i-2}, t_{i-1})
  • Emissions: P(x_i | t_i)
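To make the generative story concrete, here is a minimal Python sketch of an order-2 HMM generating tags and then words. The tiny transition and emission tables are made-up assumptions for the running example, not estimates from any corpus:

```python
import random

# Order-2 HMM generative story with toy, made-up probability tables.
transitions = {                          # P(tag | two previous tags)
    ("<s>", "<s>"):  {"NNP": 1.0},
    ("<s>", "NNP"):  {"VBZ": 1.0},
    ("NNP", "VBZ"):  {"NN": 1.0},
    ("VBZ", "NN"):   {"NNS": 1.0},
    ("NN", "NNS"):   {"CD": 0.5, "STOP": 0.5},
    ("NNS", "CD"):   {"NN": 1.0},
    ("CD", "NN"):    {"STOP": 1.0},
}
emissions = {                            # P(word | tag)
    "NNP": {"Fed": 1.0},
    "VBZ": {"raises": 1.0},
    "NN":  {"interest": 0.5, "percent": 0.5},
    "NNS": {"rates": 1.0},
    "CD":  {"0.5": 1.0},
}

def sample(dist):
    """Draw one outcome from a {outcome: probability} dictionary."""
    r, total = random.random(), 0.0
    for outcome, p in dist.items():
        total += p
        if r < total:
            return outcome
    return outcome                       # guard against rounding error

def generate():
    """Generate (tags, words): first the tag chain, then one word per tag."""
    tags, words = ["<s>", "<s>"], []
    while True:
        tag = sample(transitions[(tags[-2], tags[-1])])
        if tag == "STOP":
            return tags[2:], words
        tags.append(tag)
        words.append(sample(emissions[tag]))

print(generate())   # e.g. (['NNP', 'VBZ', 'NN', 'NNS', ...], ['Fed', 'raises', ...])
```

Generation uses exactly the two local models on the slide: a transition lookup conditioned on two previous tags, and an emission lookup conditioned on the current tag.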
Parameter Estimation
• Transition model: P(t_i | t_{i-2}, t_{i-1})
  • Use standard smoothing methods to estimate transition scores, e.g., by interpolating trigram, bigram, and unigram tag estimates (see the sketch below)
• Emission model: P(x_i | t_i)
  • Trickier. What about …
    • Words we've never seen before
    • Words which occur with tags we've never seen
  • One option: break out the Good-Turing smoothing
  • But words aren't black boxes: 343,127.23   11-year   Minteria   reintroducible
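As one concrete (assumed, not course-mandated) instance of "standard smoothing methods", here is a sketch of transition estimation by linear interpolation of trigram, bigram, and unigram relative frequencies over tags. The toy tag corpus and the lambda weights are illustrative; real weights would be tuned on held-out data:

```python
from collections import Counter

# Linearly interpolated transition estimates from a toy tagged corpus.
tag_corpus = [
    ["NNP", "VBZ", "NN", "NNS", "CD", "NN", "."],
    ["DT", "NNP", "NN", "VBD", "VBN", "RP", "NN", "NNS", "."],
]

tri, tri_ctx = Counter(), Counter()   # counts of (t-2, t-1, t) and (t-2, t-1)
bi, bi_ctx = Counter(), Counter()     # counts of (t-1, t) and (t-1)
uni = Counter()                       # counts of t
total = 0

for tags in tag_corpus:
    padded = ["<s>", "<s>"] + tags + ["STOP"]
    for i in range(2, len(padded)):
        tri[(padded[i-2], padded[i-1], padded[i])] += 1
        tri_ctx[(padded[i-2], padded[i-1])] += 1
        bi[(padded[i-1], padded[i])] += 1
        bi_ctx[padded[i-1]] += 1
        uni[padded[i]] += 1
        total += 1

def p_transition(tag, prev2, prev1, lambdas=(0.6, 0.3, 0.1)):
    """Interpolated estimate of P(tag | prev2, prev1) from relative frequencies."""
    l3, l2, l1 = lambdas
    p3 = tri[(prev2, prev1, tag)] / tri_ctx[(prev2, prev1)] if tri_ctx[(prev2, prev1)] else 0.0
    p2 = bi[(prev1, tag)] / bi_ctx[prev1] if bi_ctx[prev1] else 0.0
    p1 = uni[tag] / total
    return l3 * p3 + l2 * p2 + l1 * p1

print(p_transition("NN", "NNP", "VBZ"))   # boosted because the trigram was observed
```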
Disambiguation
• Tagging is disambiguation: finding the best tag sequence
• Roughly, think of this as sequential classification, where each choice also depends on the uncertain decision made in the previous step
• Given an HMM (i.e., distributions for transitions and emissions), we can score any word sequence and tag sequence
  • E.g., Fed/NNP raises/VBZ interest/NN rates/NNS 0.5/CD percent/NN ./. STOP
• In principle, we're done! We have a tagger (sketched below):
  • We could enumerate all possible tag sequences
  • Score them all
  • Pick the best one
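The "enumerate, score, pick the best" tagger can be written down directly. This sketch uses made-up transition and emission tables and a crude probability floor in place of real smoothing; it only illustrates the brute-force idea, not the course's implementation:

```python
from itertools import product

# Brute-force order-2 HMM tagger: score every tag sequence and keep the best.
TRANSITIONS = {                       # P(tag | prev2, prev1); unlisted -> FLOOR
    ("<s>", "<s>", "NNP"): 0.6,
    ("<s>", "NNP", "VBZ"): 0.5, ("<s>", "NNP", "NNS"): 0.3,
    ("NNP", "VBZ", "NN"): 0.4,  ("NNP", "NNS", "NN"): 0.3,
    ("VBZ", "NN", "NNS"): 0.5,
    ("NN", "NNS", "STOP"): 0.5,
}
EMISSIONS = {                         # P(word | tag); unlisted -> FLOOR
    ("Fed", "NNP"): 0.9, ("Fed", "VBN"): 0.05, ("Fed", "VBD"): 0.05,
    ("raises", "VBZ"): 0.5, ("raises", "NNS"): 0.3, ("raises", "VB"): 0.2,
    ("interest", "NN"): 0.7, ("interest", "VBP"): 0.3,
    ("rates", "NNS"): 0.6, ("rates", "VBZ"): 0.4,
}
TAGSET = ["NNP", "VBN", "VBD", "VBZ", "NNS", "VB", "NN", "VBP"]
FLOOR = 1e-4                          # crude stand-in for smoothing

def score(words, tags):
    """Joint probability of a word sequence and a tag sequence under the HMM."""
    padded = ["<s>", "<s>"] + list(tags) + ["STOP"]
    p = 1.0
    for i in range(2, len(padded)):
        p *= TRANSITIONS.get(tuple(padded[i-2:i+1]), FLOOR)
    for word, tag in zip(words, tags):
        p *= EMISSIONS.get((word, tag), FLOOR)
    return p

def tag(words):
    """Enumerate all |TAGSET|^n tag sequences and return the best-scoring one."""
    best = max(product(TAGSET, repeat=len(words)), key=lambda t: score(words, t))
    return list(zip(words, best))

print(tag(["Fed", "raises", "interest", "rates"]))
# -> [('Fed', 'NNP'), ('raises', 'VBZ'), ('interest', 'NN'), ('rates', 'NNS')]
```

Enumerating all |tag set|^n sequences is exponential in sentence length, which is exactly why the next lecture turns to the Viterbi algorithm.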
Next
• Efficient use of Hidden Markov Models for POS tagging: the Viterbi algorithm