Probabilistic Parsing Ling 571 Fei Xia Week 4: 10/18-10/20/05
Outline • Misc: Hw3 and Hw4: lexicalized rules • CYK recap • Converting CFG into CNF • N-best • Quiz #2 • Common prob equations • Independence assumption • Lexicalized models
Converting CFG into CNF • CNF • Extended CNF • CFG in general vs. CFG for natural languages • Converting CFG into CNF • Converting PCFG into CNF • Recovering parse trees
Definition of CNF • A, B, C are non-terminals, a is a terminal, S is the start symbol • Definition 1: • A → B C • A → a • S → ε, where B, C are not start symbols • Definition 2: ε-free grammar • A → B C • A → a
Extended CNF • Definition 3: • A → B C • A → a or A → B • We use Def 3: • Unit rules such as NP → N are allowed. • No need to remove unit rules during conversion. • The CYK algorithm needs to be modified.
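A quick way to see what Def 3 allows: the sketch below (not from the slides; the rule format and the lowercase-terminal convention are my assumptions) checks whether every rule of a grammar is in extended CNF.

```python
# A minimal sketch: check whether a grammar is in extended CNF (Def 3).
# Assumed rule format: (lhs, rhs) pairs where rhs is a tuple of symbols;
# terminals are assumed to be lowercase strings.

def is_terminal(sym):
    return sym.islower()

def in_extended_cnf(rules):
    for lhs, rhs in rules:
        if len(rhs) == 1:
            continue              # A -> a or A -> B (unit rules are allowed)
        if len(rhs) == 2 and not any(is_terminal(s) for s in rhs):
            continue              # A -> B C
        return False              # anything else (e.g., ternary rules) is out
    return True

# NP -> N is fine under Def 3; VP -> V NP PP is not.
rules = [("NP", ("N",)), ("N", ("books",)), ("VP", ("V", "NP", "PP"))]
print(in_extended_cnf(rules))     # False, because of the ternary VP rule
```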
CYK algorithm with Def 2 • For every rule A → w_i, set π[i][i][A] = P(A → w_i) • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: if P(A → B C) * π[begin][m][B] * π[m+1][end][C] > π[begin][end][A], then update π[begin][end][A] and its back pointer
CYK algorithm with Def 3 • For every position i: for all A, if A → w_i, update π[i][i][A]; for all A and B, if A → B, update π[i][i][A] using π[i][i][B] • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: update π[begin][end][A] as with Def 2; then for all non-terminals A and B, if A → B, update π[begin][end][A] using π[begin][end][B]
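A minimal Python sketch of this probabilistic CYK variant, assuming the grammar is stored as three dictionaries: lex[(A, w)] = P(A → w), unary[(A, B)] = P(A → B), and binary[(A, B, C)] = P(A → B C), with "S" as the start symbol. These names and the data format are illustrative, not from the slides; back pointers are omitted, so only the probability of the best parse is returned.

```python
from collections import defaultdict

def cyk(words, lex, unary, binary):
    """Return the probability of the best parse of `words` rooted in "S"."""
    n = len(words)
    pi = defaultdict(float)                     # pi[(begin, end, A)] = best prob

    def apply_unit_rules(begin, end):
        # Handle A -> B once per cell (enough if unit-rule chains have length 1).
        for (A, B), p in unary.items():
            cand = p * pi[(begin, end, B)]
            if cand > pi[(begin, end, A)]:
                pi[(begin, end, A)] = cand

    for i, w in enumerate(words, start=1):      # spans of length 1
        for (A, wd), p in lex.items():          # A -> w_i
            if wd == w and p > pi[(i, i, A)]:
                pi[(i, i, A)] = p
        apply_unit_rules(i, i)

    for span in range(2, n + 1):
        for begin in range(1, n - span + 2):
            end = begin + span - 1
            for m in range(begin, end):
                for (A, B, C), p in binary.items():        # A -> B C
                    cand = p * pi[(begin, m, B)] * pi[(m + 1, end, C)]
                    if cand > pi[(begin, end, A)]:
                        pi[(begin, end, A)] = cand
            apply_unit_rules(begin, end)

    return pi[(1, n, "S")]                      # back pointers omitted here
```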
CFG • CFG in general: • G=(N, T, P, S) • Rules: A → α, where A ∈ N and α ∈ (N ∪ T)* • CFG for natural languages: • G=(N, T, P, S) • Pre-terminals: POS tags (e.g., N, V, P) • Rules: • Syntactic rules: A → B1 … Bn, where each Bi is a non-terminal • Lexicon: A → w, where A is a pre-terminal and w is a word
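For concreteness, a tiny toy grammar in this style; the rules and the tuple representation are made up for illustration, not taken from the slides.

```python
# A made-up toy grammar: syntactic rules expand non-terminals (including
# pre-terminals), and the lexicon maps pre-terminals to words.

syntactic_rules = [
    ("S",  ("NP", "VP")),
    ("NP", ("Pron",)),        # a unit rule
    ("VP", ("V", "NP")),
]
lexicon = [
    ("Pron", ("he",)),
    ("Pron", ("her",)),
    ("V",    ("likes",)),
]
```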
Conversion from CFG to CNF • CFG (in general) to CNF (Def 1): • Add S0 → S • Remove ε-rules • Remove unit rules • Replace n-ary rules with binary rules • CFG (for NL) to CNF (Def 3): • CFG (for NL) has no ε-rules • Unit rules are allowed in CNF (Def 3) • Only the last step is necessary
An example • VP → V NP PP PP • To recover the parse tree w.r.t. the original CFG, just remove the added non-terminals.
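A minimal sketch of that recovery step, assuming trees are (label, children) tuples with strings at the leaves and that the added non-terminals are the only labels starting with "X"; both assumptions are mine, not from the slides.

```python
# Splice out added non-terminals (X1, X2, ...) to recover the original tree.
# Assumes no original non-terminal starts with "X".

def unbinarize(node):
    if isinstance(node, str):                 # a word
        return node
    label, children = node
    new_children = []
    for child in children:
        child = unbinarize(child)
        if not isinstance(child, str) and child[0].startswith("X"):
            new_children.extend(child[1])     # lift the X node's children
        else:
            new_children.append(child)
    return (label, new_children)

tree = ("VP", [("V", ["bought"]),
               ("X1", [("NP", ["books"]),
                       ("X2", [("PP", ["..."]), ("PP", ["..."])])])])
print(unbinarize(tree))
# ('VP', [('V', ['bought']), ('NP', ['books']), ('PP', ['...']), ('PP', ['...'])])
```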
Converting PCFG into CNF • VP → V NP PP PP 0.1 => • VP → V X1 0.1 • X1 → NP X2 1.0 • X2 → PP PP 1.0
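A minimal sketch of this step (the rule format (lhs, rhs, prob) is an assumption): the new non-terminals get probability 1.0, so the product over the binary rules equals the probability of the original rule.

```python
from itertools import count

_new_ids = count(1)                            # names the added non-terminals X1, X2, ...

def binarize(rule):
    """Replace an n-ary PCFG rule with a chain of binary rules."""
    lhs, rhs, prob = rule
    out = []
    while len(rhs) > 2:
        new_nt = "X%d" % next(_new_ids)
        out.append((lhs, (rhs[0], new_nt), prob))
        lhs, rhs, prob = new_nt, rhs[1:], 1.0  # the added rules get prob 1.0
    out.append((lhs, tuple(rhs), prob))
    return out

print(binarize(("VP", ("V", "NP", "PP", "PP"), 0.1)))
# [('VP', ('V', 'X1'), 0.1), ('X1', ('NP', 'X2'), 1.0), ('X2', ('PP', 'PP'), 1.0)]
```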
N-best parse trees • Best parse tree: T* = argmax_T P(T, S) • N-best parse trees: the N trees with the highest P(T, S)
CYK algorithm for N-best • For every rule A → w_i, initialize the N-best array for [i][i][A] • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: for each candidate val built from entry i of [begin][m][B] and entry j of [m+1][end][C]: if val > one of the probs in the N-best array of [begin][end][A], then remove the last (smallest) element of that array and insert val, and remove the last element of B[begin][end][A] and insert the back pointer (m, B, C, i, j).
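A minimal sketch of the per-cell bookkeeping, assuming each cell keeps its N best entries as a min-heap of (prob, back pointer) pairs; the heap representation and N=5 are my choices, not from the slides.

```python
import heapq

N = 5   # how many analyses to keep per (span, non-terminal) cell

def update_cell(cell, val, backptr):
    """cell: min-heap of (prob, backptr); keep only the N best entries."""
    if len(cell) < N:
        heapq.heappush(cell, (val, backptr))
    elif val > cell[0][0]:                    # better than the worst kept entry
        heapq.heapreplace(cell, (val, backptr))

# backptr = (m, B, C, i, j): split point, children labels, and which of the
# children's N-best entries were combined.
cell = []
update_cell(cell, 0.02, (3, "NP", "VP", 1, 2))
```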
Three types of probability • Joint prob: P(x,y) = prob of x and y happening together • Conditional prob: P(x|y) = prob of x given a specific value of y • Marginal prob: P(x) = Σ_y P(x,y) = prob of x over all possible values of y
An example • #(words)=100, #(nouns)=40, #(verbs)=20 • “books” appears 10 times: 3 times as a verb, 7 times as a noun • P(w=books)=0.1 • P(w=books, t=noun)=0.07 • P(t=noun | w=books)=0.7 • P(t=noun)=0.4 • P(w=books | t=noun)=7/40
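The same numbers, checked in a few lines of Python:

```python
# Joint, conditional, and marginal probabilities from the counts above.
n_words, n_nouns = 100, 40
n_books, n_books_noun = 10, 7

p_books = n_books / n_words                   # P(w=books)          = 0.1
p_books_noun = n_books_noun / n_words         # P(w=books, t=noun)  = 0.07
p_noun_given_books = p_books_noun / p_books   # P(t=noun | w=books) = 0.7
p_books_given_noun = n_books_noun / n_nouns   # P(w=books | t=noun) = 7/40

assert abs(p_noun_given_books - 0.7) < 1e-12
```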
Independence assumption • Two variables A and B are independent if • P(A,B)=P(A)*P(B) • P(A)=P(A|B) • P(B)=P(B|A) • Two variables A and B are conditionally independent given C if • P(A,B|C)=P(A|C) * P(B|C) • P(A|B,C)=P(A|C) • P(B|A,C)=P(B|C) • An independence assumption is used to remove some conditioning factors, which reduces the number of parameters in a model.
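A quick illustration of the parameter savings, with made-up variable sizes: modeling P(A,B|C) directly needs one big table, while the conditional independence assumption only needs two smaller ones.

```python
# Made-up sizes for the ranges of A, B, and C.
size_A, size_B, size_C = 50, 50, 20

full_table = size_A * size_B * size_C            # P(A,B|C): 50,000 entries
factored   = size_A * size_C + size_B * size_C   # P(A|C), P(B|C): 2,000 entries
print(full_table, factored)
```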
PCFG parsers • P(T, S) = ∏_i P(r_i), where r_1, …, r_n are the rules used in the derivation of T • This assumes each rule is applied independently of the other rules
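A minimal sketch of this assumption in code: the probability of a derivation is just the product of the rule probabilities. The rules and probabilities below are made up for illustration.

```python
from math import prod   # Python 3.8+

def tree_prob(rules_used, rule_probs):
    """Product of the probabilities of the rules used in a derivation."""
    return prod(rule_probs[r] for r in rules_used)

rule_probs = {("S", ("NP", "VP")): 1.0,
              ("NP", ("Pron",)): 0.3,
              ("VP", ("V", "NP")): 0.4}
print(tree_prob(list(rule_probs), rule_probs))   # 1.0 * 0.3 * 0.4 ≈ 0.12
```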
Problems of independence assumptions • Lexical independence: • P(VP → V, V → bought) = P(VP → V) * P(V → bought) • See Table 12.2 on M&S p. 418.
Problems of independence assumptions (cont) • Structural independence: • P(S → NP VP, NP → Pron) = P(S → NP VP) * P(NP → Pron) • See Table 12.3 on M&S p. 420.
Dealing with the problems • Lexicalized rules: • P(VP → V | V=come) • P(VP → V | V=think) • Adding context info: condition each rule on its context, using a function that groups contexts into equivalence classes.
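A minimal sketch of how such lexicalized rule probabilities could be estimated by relative frequency; the counts below are invented purely to mirror the come/think contrast.

```python
from collections import Counter

rule_and_verb = Counter()   # count(VP expansion, head verb)
verb_total = Counter()      # count(head verb under VP)

# (VP expansion, head verb) pairs, with made-up counts.
observations = (
    [(("V",), "come")] * 8 + [(("V", "NP"), "come")] * 2
    + [(("V",), "think")] * 1 + [(("V", "S"), "think")] * 9
)
for expansion, verb in observations:
    rule_and_verb[(expansion, verb)] += 1
    verb_total[verb] += 1

def p_expansion_given_verb(expansion, verb):
    """Relative-frequency estimate of P(VP -> expansion | V = verb)."""
    return rule_and_verb[(expansion, verb)] / verb_total[verb]

print(p_expansion_given_verb(("V",), "come"))    # 0.8
print(p_expansion_given_verb(("V",), "think"))   # 0.1
```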
PCFG • Assumes each rule is applied independently of the other rules
An example • he likes her
Remaining problems • he likes her • P(T, S) is the same if the sentence is changed to “her likes he”.
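A small sketch of why this happens (all rule probabilities are made up): both word orders use exactly the same multiset of rules, so a plain PCFG gives them the same score.

```python
from math import isclose, prod

rule_probs = {("S", ("NP", "VP")): 1.0,
              ("NP", ("Pron",)): 0.3,
              ("VP", ("V", "NP")): 0.4,
              ("Pron", ("he",)): 0.1,
              ("Pron", ("her",)): 0.08,
              ("V", ("likes",)): 0.05}

def sentence_rules(subj, obj):
    """Rules used by the S -> NP VP, VP -> V NP analysis of 'subj likes obj'."""
    return [("S", ("NP", "VP")), ("NP", ("Pron",)), ("Pron", (subj,)),
            ("VP", ("V", "NP")), ("V", ("likes",)),
            ("NP", ("Pron",)), ("Pron", (obj,))]

p1 = prod(rule_probs[r] for r in sentence_rules("he", "her"))
p2 = prod(rule_probs[r] for r in sentence_rules("her", "he"))
print(isclose(p1, p2))   # True: the PCFG cannot prefer "he likes her"
```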
New formula • he likes her • Condition each rule on lexical/contextual information (e.g., head words), so that the two word orders no longer receive the same probability.