PROBABILISTIC CFGs & PROBABILISTIC PARSING Università di Venezia, 3 October 2003
Probabilistic CFGs • Context-Free Grammar rules are of the form: • S → NP VP • In a Probabilistic CFG, we assign a probability to each rule: • S → NP VP, P(S → NP VP | S)
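Because each rule probability is conditioned on its left-hand side, the probabilities of all rules expanding the same nonterminal must sum to one:

\[ \sum_{\beta} P(A \to \beta \mid A) = 1 \]

For instance, if VP → V NP has probability 0.7, the remaining VP rules must share the other 0.3.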
Why PCFGs? • DISAMBIGUATION: probabilities let us choose the most likely parse among all those the grammar allows • ROBUSTNESS: rather than excluding unusual constructions outright, a PCFG can simply assign them a very low probability • LEARNING: plain CFGs cannot be learned from positive data only, whereas PCFGs can be
PCFGs in Prolog (courtesy Doug Arnold)

s(P0, [s,NP,VP]) --> np(P1,NP), vp(P2,VP), { P0 is 1.0*P1*P2 }.
…
vp(P0, [vp,V,NP]) --> v(P1,V), np(P2,NP), { P0 is 0.7*P1*P2 }.
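A self-contained version of the same idea, with an invented three-word lexicon and illustrative probabilities (this extension is a sketch, not Arnold's original code):

% PCFG as a DCG: each nonterminal threads its subtree's probability
% and builds the parse tree. All probabilities here are invented.
s(P0, [s,NP,VP])  --> np(P1,NP), vp(P2,VP), { P0 is 1.0*P1*P2 }.
vp(P0, [vp,V,NP]) --> v(P1,V), np(P2,NP),   { P0 is 0.7*P1*P2 }.
vp(P0, [vp,V])    --> v(P1,V),              { P0 is 0.3*P1 }.
% Lexicon: preterminal rules carrying their probabilities.
np(0.4, [np,astronomers]) --> [astronomers].
np(0.6, [np,stars])       --> [stars].
v(1.0,  [v,saw])          --> [saw].

Querying ?- s(P, Tree, [astronomers,saw,stars], []). succeeds with P = 0.168 (= 1.0 × 0.4 × 0.7 × 1.0 × 0.6) and the corresponding tree; backtracking would enumerate any alternative parses with their probabilities.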
Independence assumptions • PCFGs specify a language model, just like n-grams • Once again we need independence assumptions: the probability of a subtree is independent of • where in the sentence the words it dominates occur (place invariance) • everything in the tree outside the subtree (context-freeness) • how its root nonterminal was itself derived (ancestor-freeness)
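Under these assumptions the probability of a parse tree is simply the product of the probabilities of the rules used to build it, and the probability of a sentence is the sum over its parse trees:

\[ P(t) = \prod_{(A \to \beta) \in t} P(A \to \beta \mid A), \qquad P(w_1 \dots w_m) = \sum_{t} P(t) \]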
Using PCFGs to disambiguate: “Astronomers saw stars with ears”
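The ambiguity is whether the PP "with ears" attaches to the NP "stars" or to the VP "saw stars". With illustrative rule probabilities (assumed here, in the style of the textbook example in Manning and Schütze: P(S → NP VP) = 1.0, P(VP → V NP) = 0.7, P(VP → VP PP) = 0.3, P(NP → NP PP) = 0.4, P(PP → P NP) = 1.0, and lexical probabilities P(NP → astronomers) = 0.1, P(NP → stars) = P(NP → ears) = 0.18, P(V → saw) = P(P → with) = 1.0), each parse is scored by multiplying its rule probabilities:

\[ P(t_{\text{NP-attach}}) = 1.0 \times 0.1 \times 0.7 \times 1.0 \times 0.4 \times 0.18 \times 1.0 \times 1.0 \times 0.18 \approx 0.0009 \]
\[ P(t_{\text{VP-attach}}) = 1.0 \times 0.1 \times 0.3 \times 0.7 \times 1.0 \times 0.18 \times 1.0 \times 1.0 \times 0.18 \approx 0.0007 \]

so the model prefers the NP-attachment reading: the stars have ears.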
Parsing with PCFGs: a comparison with HMMs • An HMM defines a REGULAR GRAMMAR: its states play the role of nonterminals, and each emitting transition corresponds to a right-linear rule X → a Y (with X → a to stop) • A PCFG generalizes this by allowing arbitrary context-free rules
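One standard way to state the correspondence (a sketch, using arc-emission HMMs): moving from state X to state Y while emitting symbol a acts as a probabilistic right-linear rule

\[ P(X \to a\,Y) = P(a, Y \mid X) \]

so the forward and backward probabilities used with HMMs are the regular-grammar special case of the inside and outside probabilities introduced next.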
Inside and outside probabilities (cf. forward and backward probabilities for HMMs)
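In Manning and Schütze's notation, with the sentence written $w_1 \dots w_m$ and $N^j_{pq}$ meaning that nonterminal $N^j$ dominates words $p$ through $q$:

\[ \text{inside:}\quad \beta_j(p,q) = P(w_p \dots w_q \mid N^j_{pq}) \]
\[ \text{outside:}\quad \alpha_j(p,q) = P(w_1 \dots w_{p-1},\; N^j_{pq},\; w_{q+1} \dots w_m) \]

The inside probability of the start symbol over the whole string gives the sentence probability, $P(w_1 \dots w_m) = \beta_1(1,m)$, and together the two quantities drive EM re-estimation of rule probabilities (the inside-outside algorithm), just as forward and backward probabilities do for HMMs.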
Learning probabilities • Read off the rules used in the Treebank analyses • Estimate probabilities by relative frequency: P(A → β | A) = C(A → β) / C(A)
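A worked instance with invented counts: if VP appears 1000 times in the treebank and 700 of those expansions are VP → V NP, then

\[ P(VP \to V\ NP \mid VP) = \frac{C(VP \to V\ NP)}{C(VP)} = \frac{700}{1000} = 0.7 \]

which is where a figure like the 0.7 in the Prolog fragment above would come from.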
Lexicalised PCFGs (Collins, 1997; Charniak, 2000)
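The core idea (sketched here; the formulation below is not from these slides): annotate each nonterminal with its lexical head, so that rule probabilities are conditioned on actual words, e.g.

\[ VP(saw) \to V(saw)\ NP(stars) \]

Such head-conditioned rules are far too sparse to estimate directly from a treebank, so Collins (1997) and Charniak (2000) decompose them into smaller, smoothed distributions (for instance, generate the head first, then its dependents).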
Readings • Manning and Schütze, chapters 11 and 12
Acknowledgments • Some slides and the Prolog code are borrowed from Doug Arnold • Thanks also to Chris Manning & Diego Molla