Chapter 12: Lexicalized and Probabilistic Parsing
Guoqiang Shan, University of Arizona
November 30, 2006
Outline
• Probabilistic Context-Free Grammars
• Probabilistic CYK Parsing
• PCFG Problems
Probabilistic Context-Free Grammars
• Intuition
  • To find the "correct" parse for ambiguous sentences
    • e.g. "can you book TWA flights?"
    • e.g. "the flights include a book"
• Definition of a Context-Free Grammar
  • 4-tuple G = (N, Σ, P, S)
  • N: a finite set of non-terminal symbols
  • Σ: a finite set of terminal symbols, where N ∩ Σ = ∅
  • P: a set of productions A → β, where A is in N and β is in (N ∪ Σ)*
  • S: the start symbol, a member of N
• Definition of a Probabilistic Context-Free Grammar
  • 5-tuple G = (N, Σ, P, S, D)
  • D: a function P → [0, 1] that assigns a probability to each rule in P
  • Rules are written A → β [p], where p = D(A → β)
  • e.g. A → a B [0.6], B → C D [0.3]
PCFG Example

Lexical rules:
Det → that [.5]   Det → the [.8]   Det → a [.15]
Noun → book [.1]   Noun → flights [.5]   Noun → meal [.4]
Verb → book [.3]   Verb → include [.3]   Verb → want [.4]
Aux → can [.4]   Aux → does [.3]   Aux → do [.3]
ProperN → TWA [.4]   ProperN → Denver [.6]
Pronoun → you [.4]   Pronoun → I [.6]

Syntactic rules:
S → NP VP [.8]   S → Aux NP VP [.15]   S → VP [.05]
NP → Det Nom [.2]   NP → ProperN [.35]   NP → Nom [.05]   NP → Pronoun [.4]
Nom → Noun [.75]   Nom → Noun Nom [.2]   Nom → ProperN Nom [.05]
VP → Verb [.55]   VP → Verb NP [.4]   VP → Verb NP NP [.05]
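For concreteness, the grammar above can be held in two dictionaries, one for lexical rules and one for syntactic rules. This is a minimal sketch, not the implementation referenced later in the slides; the names GRAMMAR and LEXICON are illustrative.

```python
# Minimal sketch: the example PCFG as Python dicts mapping a left-hand side
# non-terminal to {right-hand side: rule probability D(A -> beta)}.

GRAMMAR = {   # syntactic rules, A -> B C ... [p]
    "S":   {("NP", "VP"): 0.80, ("Aux", "NP", "VP"): 0.15, ("VP",): 0.05},
    "NP":  {("Det", "Nom"): 0.20, ("ProperN",): 0.35, ("Nom",): 0.05, ("Pronoun",): 0.40},
    "Nom": {("Noun",): 0.75, ("Noun", "Nom"): 0.20, ("ProperN", "Nom"): 0.05},
    "VP":  {("Verb",): 0.55, ("Verb", "NP"): 0.40, ("Verb", "NP", "NP"): 0.05},
}

LEXICON = {   # lexical rules, A -> w [p]
    "Det":     {"that": 0.5, "the": 0.8, "a": 0.15},
    "Noun":    {"book": 0.1, "flights": 0.5, "meal": 0.4},
    "Verb":    {"book": 0.3, "include": 0.3, "want": 0.4},
    "Aux":     {"can": 0.4, "does": 0.3, "do": 0.3},
    "ProperN": {"TWA": 0.4, "Denver": 0.6},
    "Pronoun": {"you": 0.4, "I": 0.6},
}
```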
Probability of a Sentence in a PCFG
• Probability of a parse tree T of a sentence S
  • P(T, S) = Π D(r(n)), where n ranges over the nodes of T and r(n) is the rule used to expand n
• Relating P(T, S) and P(T)
  • P(T, S) = P(T) · P(S | T)
  • A parse tree T uniquely determines its sentence S, so P(S | T) = 1
  • Hence P(T) = P(T, S)
• Probability of a sentence
  • P(S) = Σ P(T), where T ranges over τ(S), the set of all parse trees of S
  • In particular, for an unambiguous sentence, P(S) = P(T)
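As a small illustration of P(T) = Π D(r(n)), the sketch below walks a parse tree encoded as nested tuples and multiplies the probabilities of the rules used at each node. The tuple encoding, the function name, and the rule_prob lookup are illustrative assumptions, not part of the slides.

```python
# Minimal sketch, assuming a parse tree is a nested tuple
# (label, child1, child2, ...) with words as plain strings, and that
# rule_prob(lhs, rhs) returns D(lhs -> rhs) from a PCFG rule table.

def tree_probability(tree, rule_prob):
    """P(T) = product of D(r(n)) over every non-terminal node n of T."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_prob(label, rhs)           # rule expanding this node
    for child in children:
        if not isinstance(child, str):  # recurse into non-terminal children only
            p *= tree_probability(child, rule_prob)
    return p

# Tiny demo: P of the subtree Nom -> Noun -> "flights" is 0.75 * 0.5 = 0.375
demo_rules = {("Nom", ("Noun",)): 0.75, ("Noun", ("flights",)): 0.5}
print(tree_probability(("Nom", ("Noun", "flights")),
                       lambda lhs, rhs: demo_rules[(lhs, rhs)]))  # 0.375
```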
Example
• Tleft and Tright are the two parse trees of the ambiguous sentence "can you book TWA flights"
• P(Tleft) = 0.15 × 0.40 × 0.05 × 0.05 × 0.35 × 0.75 × 0.40 × 0.40 × 0.30 × 0.40 × 0.50 = 3.78 × 10⁻⁷
• P(Tright) = 0.15 × 0.40 × 0.40 × 0.05 × 0.05 × 0.75 × 0.40 × 0.40 × 0.30 × 0.40 × 0.50 = 4.32 × 10⁻⁷
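The two products can be checked mechanically; this snippet just multiplies the rule probabilities listed above (Python 3.8+ for math.prod).

```python
import math

# Rule probabilities read off the two parse trees above.
left  = [0.15, 0.40, 0.05, 0.05, 0.35, 0.75, 0.40, 0.40, 0.30, 0.40, 0.50]
right = [0.15, 0.40, 0.40, 0.05, 0.05, 0.75, 0.40, 0.40, 0.30, 0.40, 0.50]

print(math.prod(left))   # ~3.78e-07
print(math.prod(right))  # ~4.32e-07
```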
Probabilistic CYK Parsing of a PCFG
• Bottom-up approach
• Dynamic programming: fill tables of partial solutions to sub-problems until they contain the solutions to the entire problem
• Input
  • A grammar in CNF: ε-free, every production of the form A → B C or A → w (a single terminal)
  • n words w1, w2, …, wn
• Data structures
  • π[i, j, A]: the maximum probability of a constituent with non-terminal A spanning j words starting at wi
  • β[i, j, A] = {k, B, C}: back pointer, where A → B C and B spans the first k words of the constituent starting at wi (used to rebuild the parse tree)
• Output
  • The maximum-probability parse has probability π[1, n, 1], where index 1 denotes the start symbol S
  • The root of that parse tree is S, and it spans the entire string
CYK Algorithm
• Base case
  • Consider input spans of length one (single words)
  • Fill π[i, 1, A] directly from the lexical rules A → wi
• Recursive case
  • For a span of j > 1 words starting at wi, consider every rule A → B C and every split point k with 0 < k < j
    • B spans the first k words starting at wi (already computed)
    • C spans the remaining j − k words starting at wi+k (already computed)
    • The probability of the span under this choice is D(A → B C) × π[i, k, B] × π[i+k, j−k, C]
  • If more than one rule A → B C or split point k applies, keep the one that maximizes the probability of the span, and record {k, B, C} in β[i, j, A]
• My implementation is on lectura under /home/shan/538share/pcyk.c (a Python sketch follows below)
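Here is a compact Python sketch of the recurrence just described; it is not the C implementation in pcyk.c. The grammar format (separate dictionaries for lexical and binary rules, as in the earlier sketch) and 0-based indexing, rather than the 1-based table of the slides, are assumptions.

```python
from collections import defaultdict

def pcyk(words, lexicon, binary_rules, start="S"):
    """Probabilistic CYK for a CNF grammar.

    lexicon:      dict  A -> {word: prob}    (rules A -> w)
    binary_rules: dict  A -> {(B, C): prob}  (rules A -> B C)
    Returns (best probability of a start-symbol parse, back-pointer table).
    """
    n = len(words)
    pi   = defaultdict(float)   # pi[(i, j, A)]: best prob. of A over words[i:i+j]
    back = {}                   # back[(i, j, A)] = (k, B, C)

    # Base case: spans of length 1, filled from the lexical rules A -> w_i.
    for i, w in enumerate(words):
        for A, emissions in lexicon.items():
            if w in emissions:
                pi[(i, 1, A)] = emissions[w]

    # Recursive case: spans of length j > 1, split after k words (0 < k < j).
    for j in range(2, n + 1):
        for i in range(0, n - j + 1):
            for A, rhss in binary_rules.items():
                for (B, C), p in rhss.items():
                    for k in range(1, j):
                        prob = p * pi[(i, k, B)] * pi[(i + k, j - k, C)]
                        if prob > pi[(i, j, A)]:   # keep the best rule and split
                            pi[(i, j, A)] = prob
                            back[(i, j, A)] = (k, B, C)

    return pi[(0, n, start)], back
```

With the CNF grammar from the next slide loaded into these two dictionaries, a call such as pcyk(["can", "you", "book", "TWA", "flights"], lexicon, binary_rules) would return the probability of the best parse, and the back pointers let the parse tree be rebuilt.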
PCFG Example – Revisited: the grammar is the same as shown earlier; the next slide rewrites it into CNF.
Example (CYK Parsing) – Rewriting the Grammar as CNF
Rules that are not in CNF are shown in parentheses, followed by the CNF rules that replace them.

S → NP VP [.8]
(S → Aux NP VP [.15])  ⇒  S → Aux NV [.15],  NV → NP VP [1.0]
(S → VP [.05])  ⇒  S → book [.00825],  S → include [.00825],  S → want [.011],  S → Verb NP [.02],  S → Verb DNP [.0025]
NP → Det Nom [.2]
(NP → ProperN [.35])  ⇒  NP → TWA [.14],  NP → Denver [.21]
(NP → Nom [.05])  ⇒  NP → book [.00375],  NP → flights [.01875],  NP → meal [.015],  NP → Noun Nom [.01],  NP → ProperN Nom [.0025]
(NP → Pronoun [.4])  ⇒  NP → you [.16],  NP → I [.24]
(Nom → Noun [.75])  ⇒  Nom → book [.075],  Nom → flights [.375],  Nom → meal [.3]
Nom → Noun Nom [.2]
Nom → ProperN Nom [.05]
(VP → Verb [.55])  ⇒  VP → book [.165],  VP → include [.165],  VP → want [.22]
VP → Verb NP [.4]
(VP → Verb NP NP [.05])  ⇒  VP → Verb DNP [.05],  DNP → NP NP [1.0]
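A quick sanity check on the rewritten probabilities: collapsing a chain of unit productions multiplies the probabilities along the chain, and binarizing a ternary rule introduces a new symbol whose rule gets probability 1.0 so the product is preserved. The numbers come from the grammar above.

```python
# Collapsing the unit-production chain S -> VP -> Verb -> book into S -> book:
print(0.05 * 0.55 * 0.3)   # 0.00825 (up to float rounding), matching S -> book [.00825]

# Binarizing the ternary rule S -> Aux NP VP [.15] with the new symbol NV:
# S -> Aux NV [.15] and NV -> NP VP [1.0]; the product preserves the probability.
print(0.15 * 1.0)          # 0.15
```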
PCFG Problems
• Independence assumption
  • Assumption: the expansion of one non-terminal is independent of the expansion of any other.
  • However, corpus examination shows that how a node expands depends on where the node sits in the tree:
    • 91% of subjects are pronouns:
      • "She's able to take her baby to work with her." (91%)
      • "Uh, my wife worked until we had a family." (9%)
    • but only 34% of objects are pronouns:
      • "Some laws absolutely prohibit it." (34%)
      • "All the people signed confessions." (66%)
PCFG Problems
• Lack of sensitivity to words
  • Lexical information in a PCFG can only be represented via the probabilities of pre-terminal nodes (such as Verb, Noun, Det).
  • However, lexical information and dependencies turn out to be important for modeling syntactic probabilities.
  • Example: "Moscow sent more than 100,000 soldiers into Afghanistan."
    • The PP "into Afghanistan" may attach to the NP ("more than 100,000 soldiers") or to the VP ("sent").
    • The correct attachment here is to the VP, but corpus statistics show NP attachment is more common overall (67% in one corpus, 52% in another), so the PCFG produces the incorrect parse.
    • Why? The word "send" subcategorizes for a destination, which can be expressed with the preposition "into".
    • In fact, when the verb is "send", "into" always attaches to it.
PCFG Problems
• Coordination ambiguity
  • Example: "dogs in houses and cats"
  • Semantically, "dogs" is a better conjunct for "cats" than "houses" is.
  • Thus the parse [dogs in [NP houses and cats]] intuitively sounds unnatural and should be dispreferred.
  • However, a PCFG assigns both parses the same probability, since the two structures use exactly the same rules.
Questions? Thank You!