Part-of-Speech Tagging Torbjörn Lager Department of Linguistics Stockholm University
Part-of-Speech Tagging: Definition • From Jurafsky & Martin 2000: • Part-of-speech tagging is the process of assigning a part-of-speech or other lexical class marker to each word in a corpus. • The input to a tagging algorithm is a string of words and a specified tagset. The output is a single best tag for each word. • A bit too narrow for my taste...
Part-of-Speech Tagging: Example 1 • Input: He can can a can • Output: He/pron can/aux can/vb a/det can/n • Another possible output: He/{pron} can/{aux,n} can/{vb} a/{det} can/{n,vb}
Tag Sets • The Penn Treebank tag set (see appendix in handout)
Why Part-of-Speech Tagging? • A first step towards parsing • A first step towards word sense disambiguation • Provides clues to pronunciation • "object" -> OBject or obJECT • (but note the Swedish stress pair: BAnan vs baNAN) • Research in Corpus Linguistics
Part-of-Speech Tagging: Example 2 • I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire.
Relevant Information • Lexical information • Local contextual information
Part-of-Speech Tagging: Example 2 • I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire. • I/PRP can/MD light/VB a/DT fire/NN and/CC you/PRP can/MD open/VB a/DT can/NN of/IN beans/NNS ./. Now/RB the/DT can/NN is/VBZ open/JJ and/CC we/PRP can/MD eat/VB in/IN the/DT light/NN of/IN the/DT fire/NN ./.
Part-of-Speech Tagging [Diagram: Text -> Processor -> POS-tagged text; the Processor draws on Knowledge] • Needed: - Some strategy for representing the knowledge - Some method for acquiring the knowledge - Some method of applying the knowledge
Approaches to PoS Tagging • The bold approach: 'Use all the information you have and guess!' • The whimsical approach: 'Guess first, then change your mind if necessary!' • The cautious approach: 'Don't guess, just eliminate the impossible!'
Some POS-Tagging Issues [Diagram: Text -> Processor -> POS-tagged text; the Processor draws on Knowledge] • Accuracy • Speed • Space requirements • Learning • Intelligibility
Cutting the Cake
• Tagging methods
  - Rule based
  - Statistical
  - Mixed
  - Other methods
• Learning methods
  - Supervised learning
  - Unsupervised learning
HMM Tagging • The bold approach: 'Use all the information you have and guess!' • Statistical method • Supervised (or unsupervised) learning
The Naive Approach and its Problem • Traverse all the paths compatible with the input and then pick the most probable one • Problem: • There are 27 paths in the HMM for S = "he can can a can" • Doubling the length of S (with a conjunction in between) -> 729 paths • Doubling S again -> 531441 paths! • Exponential time complexity!
Solution • Use the Viterbi algorithm • Tagging can be done in time proportional to the length of the input. • How and why does the Viterbi algorithm work? We save this for later...
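Although the full story is saved for later, here is a minimal sketch of the idea (not from the slides): keep, at each input position, only the best-scoring path per tag. The model access predicates start_p/2, trans_p/3 and out_p/3 are assumed here purely for illustration.

  % start_p(T, P): P(T) sentence-initially (assumed fact).
  % trans_p(T1, T2, P): P(T2|T1) (assumed fact).
  % out_p(W, T, P): P(W|T) (assumed fact).
  viterbi([W|Ws], Tags) :-
      findall(cand(T, P, [T]),
              ( start_p(T, Ps), out_p(W, T, Po), P is Ps*Po ),
              Cands0),
      viterbi_steps(Ws, Cands0, Cands),
      best(Cands, cand(_, _, RevTags)),
      reverse(RevTags, Tags).

  viterbi_steps([], Cands, Cands).
  viterbi_steps([W|Ws], Prev, Cands) :-
      % One surviving candidate per tag: this pruning is what makes
      % the time linear in the length of the input.
      findall(cand(T, P, [T|Path]),
              ( out_p(W, T, Po),
                best_prev(Prev, T, Pb, Path),
                P is Pb*Po ),
              Next),
      viterbi_steps(Ws, Next, Cands).

  % Pick the predecessor candidate maximizing P(prev path) * P(T|Tprev).
  best_prev(Prev, T, Pb, Path) :-
      findall(P-Path0,
              ( member(cand(T0, P0, Path0), Prev),
                trans_p(T0, T, Pt),
                P is P0*Pt ),
              Pairs),
      max_member(Pb-Path, Pairs).

  % The best complete candidate at the end of the sentence.
  best(Cands, Best) :-
      findall(P-cand(T, P, L), member(cand(T, P, L), Cands), Keyed),
      max_member(_-Best, Keyed).

With suitable model facts, ?- viterbi([he,can,can,a,can], Tags). returns the single best tag sequence.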
Training an HMM • Estimate probabilities from relative frequencies. • Output probabilities P(w|t): the number of occurrences of w tagged as t, divided by the number of occurrences of t. • Transition probabilities P(t2|t1): the number of occurrences of t1 followed by t2, divided by the number of occurrences of t1. • Use smoothing to overcome the sparse data problem (unknown words, uncommon words, uncommon contexts)
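As a sketch of these estimates (not from the slides), with the corpus represented as a list of Word/Tag pairs; all predicate names are illustrative, and smoothing is omitted:

  % P(w|t) = count(w tagged t) / count(t)
  output_prob(Corpus, W, T, P) :-
      include(==(W/T), Corpus, Hits), length(Hits, CWT),
      tag_count(Corpus, T, CT), CT > 0,
      P is CWT/CT.

  % P(t2|t1) = count(t1 immediately followed by t2) / count(t1)
  trans_prob(Corpus, T1, T2, P) :-
      bigram_count(Corpus, T1, T2, C12),
      tag_count(Corpus, T1, C1), C1 > 0,
      P is C12/C1.

  tag_count(Corpus, T, N) :-
      include(has_tag(T), Corpus, Hits), length(Hits, N).
  has_tag(T, _/T).

  bigram_count([], _, _, 0).
  bigram_count([_], _, _, 0).
  bigram_count([_/A, W/B | Rest], T1, T2, N) :-
      bigram_count([W/B | Rest], T1, T2, N0),
      ( A == T1, B == T2 -> N is N0+1 ; N = N0 ).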
Transformation-Based Learning • The whimsical approach: 'Guess first, then change your mind if necessary!' • Rule-based tagging, statistical learning • Supervised learning • Method due to Eric Brill (1995)
A Small PoS Tagging Example
Rules:
  tag:NN>VB <- tag:TO@[-1] o
  tag:VB>NN <- tag:DT@[-1] o
  ...
Input: She decided to table her data
Lexicon: she:PN decided:VB to:TO table:NN,VB her:PN data:NN
Initial (most frequent) tags: PN VB TO NN PN NN
After the first rule (NN>VB after TO): PN VB TO VB PN NN
Lexicon for Brill Tagging (most frequent tag first)
  I: PRP
  Now: RB
  a: DT
  and: CC
  beans: NNS
  can: MD
  eat: VB
  fire: NN VB
  in: IN
  is: VBZ
  light: NN JJ VB
  of: IN
  open: JJ VB
  the: DT
  we: PRP
  you: PRP
  .: .
A Rule Sequence
  tag:'NN'>'VB' <- tag:'TO'@[-1] o
  tag:'VBP'>'VB' <- tag:'MD'@[-1,-2,-3] o
  tag:'NN'>'VB' <- tag:'MD'@[-1,-2] o
  tag:'VB'>'NN' <- tag:'DT'@[-1,-2] o
  tag:'VBD'>'VBN' <- tag:'VBZ'@[-1,-2,-3] o
  tag:'VBN'>'VBD' <- tag:'PRP'@[-1] o
  tag:'POS'>'VBZ' <- tag:'PRP'@[-1] o
  tag:'VB'>'VBP' <- tag:'NNS'@[-1] o
  tag:'IN'>'RB' <- wd:as@[0] & wd:as@[2] o
  tag:'IN'>'WDT' <- tag:'VB'@[1,2] o
  tag:'VB'>'VBP' <- tag:'PRP'@[-1] o
  tag:'IN'>'WDT' <- tag:'VBZ'@[1] o
  ...
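As an illustration (not Brill's own code), the first rule in the sequence can be applied left to right over a list of Word/Tag pairs like this; the predicate name is invented:

  % Apply tag:NN>VB <- tag:TO@[-1]: retag NN as VB right after TO.
  apply_nn_vb([], []).
  apply_nn_vb([W1/to, W2/nn | Rest], [W1/to | Out]) :- !,
      apply_nn_vb([W2/vb | Rest], Out).
  apply_nn_vb([WT | Rest], [WT | Out]) :-
      apply_nn_vb(Rest, Out).

For example, ?- apply_nn_vb([she/pn, decided/vbd, to/to, table/nn], Out). gives Out = [she/pn, decided/vbd, to/to, table/vb].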
[Figure: Transformation-Based Painting, an illustration of transformation-based learning (K. Samuel 1998)]
Transformation-Based Learning • see appendix in handout
Constraint-Grammar Tagging • Due to Fred Karlsson et al. • The cautious approach: 'Don't guess, just eliminate the impossible!' • Rule based • No learning ('learning by injection')
Constraint Grammar Example • I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire. • I/{PRP} can/{MD,NN} light/{JJ,NN,VB} a/{DT} fire/{NN} and/{CC} you/{PRP} can/{MD,NN} open/{JJ,VB} a/{DT} can/{MD,NN} of/{IN} beans/{NNS} ./{.} Now/{RB} the/{DT} can/{MD,NN} is/{VBZ} open/{JJ,VB} and/{CC} we/{PRP} can/{MD,NN} eat/{VB} in/{IN} the/{DT} light/{JJ,NN,VB} of/{IN} the/{DT} fire/{NN} ./{.}
Constraint Grammar Example
  tag:red 'RP' <- wd:in@[0] & tag:'NN'@[-1] o
  tag:red 'RB' <- wd:in@[0] & tag:'NN'@[-1] o
  tag:red 'VB' <- tag:'DT'@[-1] o
  tag:red 'NP' <- wd:'The'@[0] o
  tag:red 'VBN' <- wd:said@[0] o
  tag:red 'VBP' <- tag:'TO'@[-1,-2] o
  tag:red 'VBP' <- tag:'MD'@[-1,-2,-3] o
  tag:red 'VBZ' <- wd:'\'s'@[0] & tag:'NN'@[1] o
  tag:red 'RP' <- wd:in@[0] & tag:'NNS'@[-1] o
  tag:red 'RB' <- wd:in@[0] & tag:'NNS'@[-1] o
  ...
Constraint Grammar Example • I can light a fire and you can open a can of beans. Now the can is open and we can eat in the light of the fire. • I/{PRP} can/{MD} light/{JJ,VB} a/{DT} fire/{NN} and/{CC} you/{PRP} can/{MD} open/{JJ,VB} a/{DT} can/{MD,NN} of/{IN} beans/{NNS} ./{.} Now/{RB} the/{DT} can/{MD,NN} is/{VBZ} open/{JJ} and/{CC} we/{PRP} can/{MD} eat/{VB} in/{IN} the/{DT} light/{NN} of/{IN} the/{DT} fire/{NN} ./{.}
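To make the eliminative style concrete, here is a sketch (invented names, not the actual CG implementation) of one constraint over words carrying sets of candidate tags, corresponding to tag:red 'VB' <- tag:'DT'@[-1] above:

  % Word/Tags pairs, where Tags is the set of remaining candidate tags.
  remove_vb_after_dt([], []).
  remove_vb_after_dt([W1/Ts1, W2/Ts2 | Rest], [W1/Ts1 | Out]) :-
      Ts1 == [dt],                   % left neighbour is unambiguously DT
      selectchk(vb, Ts2, Ts2R),      % drop the VB reading...
      Ts2R \== [],                   % ...but never empty the tag set
      !,
      remove_vb_after_dt([W2/Ts2R | Rest], Out).
  remove_vb_after_dt([WT | Rest], [WT | Out]) :-
      remove_vb_after_dt(Rest, Out).

For example, ?- remove_vb_after_dt([the/[dt], light/[jj,nn,vb]], Out). gives Out = [the/[dt], light/[jj,nn]].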
Evaluation • Two reasons for evaluating: • Compare with other people's methods/systems • Compare with earlier versions of your own system • Accuracy (recall and precision) • Baseline • Ceiling • N-fold cross-validation methodology => Good use of the data + More statistically reliable results.
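Per-token accuracy is just the fraction of tags agreeing with a gold standard; a sketch (predicate names invented):

  accuracy(GoldTags, SystemTags, Acc) :-
      correct(GoldTags, SystemTags, C),
      length(GoldTags, N), N > 0,
      Acc is C/N.

  correct([], [], 0).
  correct([G|Gs], [S|Ss], C) :-
      correct(Gs, Ss, C0),
      ( G == S -> C is C0+1 ; C = C0 ).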
Assessing the Taggers • Accuracy • Speed • Space requirements • Learning • Intelligibility
Demo Taggers • Transformation-Based Tagger: • www.ling.gu.se/~lager/Home/brilltagger_ui.html • Constraint-Grammar Tagger • www.ling.gu.se/~lager/Home/cgtagger_ui.html • Featuring tracing facilities! • Try it yourself!
Parsing Torbjörn Lager Department of Linguistics Stockholm University
Parsing • Parsing with a phrase structure grammar • Shallow parsing
A Simple Phrase Structure Grammar
Fragment (Swedish): lisa springer ('Lisa runs'), lisa skjuter en älg ('Lisa shoots a moose')
Grammar:
  s --> np, vp.
  np --> pn.
  np --> det, n.
  vp --> v.
  vp --> v, np.
  pn --> [kalle].
  pn --> [lisa].
  det --> [en].
  n --> [älg].
  v --> [springer].
  v --> [skjuter].
Recognition and Parsing
• Recognition
  ?- s([lisa,springer],[]).
  yes
  ?- s([springer,lisa],[]).
  no
• Parsing
  ?- s(Tree,[lisa,springer],[]).
  Tree = s(np(pn(lisa)),vp(v(springer)))
A Top-Down Parser in Prolog
  % Assumes the grammar is available as plain (A --> B) clauses,
  % i.e. not translated away by the DCG preprocessor.
  parse(A, P0, P, A/Trees) :-            % nonterminal: expand a rule for A
      (A --> B),
      parse(B, P0, P, Trees).
  parse((B,Bs), P0, P, (Tree,Trees)) :-  % sequence: parse B, then the rest
      parse(B, P0, P1, Tree),
      parse(Bs, P1, P, Trees).
  parse([Word], [Word|P], P, Word).      % terminal: consume Word from the input
Trying It Out
  s --> np, vp.          det --> [en].
  np --> pn.             n --> [älg].
  np --> det, n.         v --> [skjuter].
  vp --> v, np.          pn --> [lisa].

  ?- parse(s, [lisa,skjuter,en,älg], [], Tree).
  Tree = s/(np/pn/lisa, vp/(v/skjuter, np/(det/en, n/älg)))
The Resulting Tree
  Tree = s/
           np/
             pn/lisa,
           vp/
             v/skjuter,
             np/
               det/en,
               n/älg
Syntactic Ambiguity • Den gamla damen träffade killen med handväskan (Swedish: 'The old lady hit the guy with the handbag') • John saw a man in the park with a telescope • Råttan åt upp osten och hunden och katten jagade råttan (Swedish: 'The rat ate the cheese and the dog and the cat chased the rat')
Local Ambiguity • The old man the boats • The horse raced past the barn fell
Indeterminism and Search • A depth-first, top-down, left-to-right, backtracking parser can handle both forms of ambiguity. • Parsing as a form of search
A Problem
• Left-recursive rules
  np --> np, pp.
  np --> np, conj, np.
• Indirect left-recursion
  A --> B, C.
  B --> A, D.
A top-down parser expands np by calling np again without having consumed any input, so it loops forever; one fix is sketched below.
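One standard remedy, sketched here (not from the slides; np_base and pp_star are invented names), is to rewrite the left recursion as right recursion over the same strings:

  % Instead of np --> np, pp. :
  np --> np_base, pp_star.
  np_base --> det, n.        % the non-recursive NP rules from the grammar above
  np_base --> pn.
  pp_star --> [].
  pp_star --> pp, pp_star.   % zero or more PPs, consuming input each round
  pp --> p, np.
  p --> [med].               % Swedish 'with' (illustrative)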
Another Problem
  s --> np, vp.
  vp --> v, np.
  vp --> v, np, pp.
  vp --> v, np, vp.
  ...
• Ex: John saw the man talk with the actress
• Parsing is exponential in the worst case!
Solution • Use a table (chart) in which parsed constituents are stored. A constituent is never added to the chart if it is already there. • Parsing can be done in O(n³) time (where n is the length of the input).
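In SWI-Prolog, one cheap way to get this memoizing behaviour is tabling, sketched below for ordinary DCG rules (a rule np --> ... compiles to np/2); tabled execution also terminates on left-recursive rules:

  :- table np/2, pp/2.
  np --> [lisa].
  np --> np, pp.            % left recursion is safe under tabling
  pp --> [med], np.

  % ?- phrase(np, [lisa, med, lisa]).
  % true.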
Some Parsing Issues [Diagram: Text -> Processor -> Parsed text; the Processor draws on Knowledge] • Accuracy • Speed • Space requirements • Robustness • Learning
Problems with Traditional Parsers • Bad coverage • Brittleness • Slowness • Too many trees!
Problems with Traditional Parsers • Correct low-level parses are often rejected because they do not fit into a global parse -> brittleness • Ambiguity -> indeterminism -> search -> slow parsers • Ambiguity -> sometimes hundreds of thousands of parse trees, and what can we do with these?
Another strategy (Abney) • Start with the simplest constructions ('easy-first parsing') and be as careful as possible when parsing them -> 'islands of certainty' • 'islands of certainty' -> do not reject these parses even if they do not fit into a global parse -> robustness • When you are almost sure of how to resolve an ambiguity, do it! -> determinism • When you are uncertain of how to resolve an ambiguity, don't even try! -> 'containment of ambiguity' -> determinism • determinism -> no search -> speed
Shallow Parsers • Work on part-of-speech tagged data • Produce analyses that are less complete than conventional parser output • Identify some phrasal constituents (e.g. NPs), without indicating their internal structure or their function in the sentence, • or identify the functional role of some of the words, such as the main verb and its direct arguments (see the NP chunker sketch below)
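A minimal sketch of an NP chunk recognizer over POS-tag sequences (not from the slides; the tagset follows the examples above and all rule names are invented):

  % An NP chunk: optional determiner, any adjectives, one or more nouns.
  np_chunk --> det, adjs, nouns.
  det --> [dt].
  det --> [].
  adjs --> [].
  adjs --> [jj], adjs.
  nouns --> [nn].
  nouns --> [nn], nouns.

  % ?- phrase(np_chunk, [dt, jj, nn]).
  % true.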
Deterministic bottom-up parsing
• Adapted from Karttunen 1996:
  define NP [(d) a* n+];
  regex NP @-> "[NP" ... "]"
        .o.
        v "[NP" NP "]" @-> "[VP" ... "]";
  apply down dannvaan
  [NP dann][VP v [NP aan]]
• Note the use of the longest-match operator!