PROBABILISTIC CFGs & PROBABILISTIC PARSING Università di Venezia, 3 October 2003
Probabilistic CFGs • Context-Free Grammar rules are of the form: • S → NP VP • In a Probabilistic CFG, we assign a probability to each rule: • S → NP VP, P(S → NP VP | S)
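Because each rule probability is conditioned on its left-hand side, the probabilities of all rules expanding the same nonterminal must sum to one:

\[ \sum_{\beta} P(A \to \beta \mid A) = 1 \]

For instance, if VP → V NP has probability 0.7, the remaining VP rules must share the other 0.3.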
Why PCFGs? • DISAMBIGUATION: probabilities let us choose the most likely parse among all those the grammar allows • ROBUSTNESS: rather than excluding unusual constructions outright, a PCFG can simply assign them a very low probability • LEARNING: plain CFGs cannot be learned from positive data only, whereas PCFGs can be
PCFGs in Prolog (courtesy Doug Arnold)

s(P0, [s,NP,VP]) --> np(P1,NP), vp(P2,VP), { P0 is 1.0*P1*P2 }.
…
vp(P0, [vp,V,NP]) --> v(P1,V), np(P2,NP), { P0 is 0.7*P1*P2 }.
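A self-contained version of the same idea, with an invented three-word lexicon and illustrative probabilities (this extension is a sketch, not Arnold's original code):

% PCFG as a DCG: each nonterminal threads its subtree's probability
% and builds the parse tree. All probabilities here are invented.
s(P0, [s,NP,VP])  --> np(P1,NP), vp(P2,VP), { P0 is 1.0*P1*P2 }.
vp(P0, [vp,V,NP]) --> v(P1,V), np(P2,NP),   { P0 is 0.7*P1*P2 }.
vp(P0, [vp,V])    --> v(P1,V),              { P0 is 0.3*P1 }.
% Lexicon: preterminal rules carrying their probabilities.
np(0.4, [np,astronomers]) --> [astronomers].
np(0.6, [np,stars])       --> [stars].
v(1.0,  [v,saw])          --> [saw].

Querying ?- s(P, Tree, [astronomers,saw,stars], []). succeeds with P = 0.168 (= 1.0 × 0.4 × 0.7 × 1.0 × 0.6) and the corresponding tree; backtracking would enumerate any alternative parses with their probabilities.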
Independence assumptions • PCFGs specify a language model, just like n-grams • Once again we need independence assumptions: the probability of a subtree is independent of • where in the sentence the words it dominates occur (place invariance) • everything in the tree outside the subtree (context-freeness) • how its root nonterminal was itself derived (ancestor-freeness)
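Under these assumptions the probability of a parse tree is simply the product of the probabilities of the rules used to build it, and the probability of a sentence is the sum over its parse trees:

\[ P(t) = \prod_{(A \to \beta) \in t} P(A \to \beta \mid A), \qquad P(w_1 \dots w_m) = \sum_{t} P(t) \]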
Using PCFGs to disambiguate: “Astronomers saw stars with ears”
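The ambiguity is whether the PP "with ears" attaches to the NP "stars" or to the VP "saw stars". With illustrative rule probabilities (assumed here, in the style of the textbook example in Manning and Schütze: P(S → NP VP) = 1.0, P(VP → V NP) = 0.7, P(VP → VP PP) = 0.3, P(NP → NP PP) = 0.4, P(PP → P NP) = 1.0, and lexical probabilities P(NP → astronomers) = 0.1, P(NP → stars) = P(NP → ears) = 0.18, P(V → saw) = P(P → with) = 1.0), each parse is scored by multiplying its rule probabilities:

\[ P(t_{\text{NP-attach}}) = 1.0 \times 0.1 \times 0.7 \times 1.0 \times 0.4 \times 0.18 \times 1.0 \times 1.0 \times 0.18 \approx 0.0009 \]
\[ P(t_{\text{VP-attach}}) = 1.0 \times 0.1 \times 0.3 \times 0.7 \times 1.0 \times 0.18 \times 1.0 \times 1.0 \times 0.18 \approx 0.0007 \]

so the model prefers the NP-attachment reading: the stars have ears.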
Parsing with PCFGs: a comparison with HMMs • An HMM defines a REGULAR GRAMMAR: its states play the role of nonterminals, and each emitting transition corresponds to a right-linear rule X → a Y (with X → a to stop) • A PCFG generalizes this by allowing arbitrary context-free rules
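One standard way to state the correspondence (a sketch, using arc-emission HMMs): moving from state X to state Y while emitting symbol a acts as a probabilistic right-linear rule

\[ P(X \to a\,Y) = P(a, Y \mid X) \]

so the forward and backward probabilities used with HMMs are the regular-grammar special case of the inside and outside probabilities introduced next.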
Inside and outside probabilities (cf. forward and backward probabilities for HMMs)
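In Manning and Schütze's notation, with the sentence written $w_1 \dots w_m$ and $N^j_{pq}$ meaning that nonterminal $N^j$ dominates words $p$ through $q$:

\[ \text{inside:}\quad \beta_j(p,q) = P(w_p \dots w_q \mid N^j_{pq}) \]
\[ \text{outside:}\quad \alpha_j(p,q) = P(w_1 \dots w_{p-1},\; N^j_{pq},\; w_{q+1} \dots w_m) \]

The inside probability of the start symbol over the whole string gives the sentence probability, $P(w_1 \dots w_m) = \beta_1(1,m)$, and together the two quantities drive EM re-estimation of rule probabilities (the inside-outside algorithm), just as forward and backward probabilities do for HMMs.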
Learning probabilities • Read off the rules used in the Treebank analyses • Estimate probabilities by relative frequency: P(A → β | A) = C(A → β) / C(A)
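A worked instance with invented counts: if VP appears 1000 times in the treebank and 700 of those expansions are VP → V NP, then

\[ P(VP \to V\ NP \mid VP) = \frac{C(VP \to V\ NP)}{C(VP)} = \frac{700}{1000} = 0.7 \]

which is where a figure like the 0.7 in the Prolog fragment above would come from.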
Lexicalised PCFGs (Collins, 1997; Charniak, 2000)
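The core idea (sketched here; the formulation below is not from these slides): annotate each nonterminal with its lexical head, so that rule probabilities are conditioned on actual words, e.g.

\[ VP(saw) \to V(saw)\ NP(stars) \]

Such head-conditioned rules are far too sparse to estimate directly from a treebank, so Collins (1997) and Charniak (2000) decompose them into smaller, smoothed distributions (for instance, generate the head first, then its dependents).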
Readings • Manning and Schütze, chapters 11 and 12
Acknowledgments • Some slides and the Prolog code are borrowed from Doug Arnold • Thanks also to Chris Manning & Diego Molla