Incrementally Learning Parameter of Stochastic CFG using Summary Stats

Written by: Brent Heeringa, Tim Oates

Goals:
• To learn the syntax of utterances

Approach:
• SCFG (Stochastic Context-Free Grammar) M = <V, E, R, S>
• V: finite set of non-terminals
• E: finite set of terminals
• R: finite set of rules; each rule r has a probability p(r), and the p(r) of all rules with the same left-hand side sum to 1
• S: start symbol
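The normalization constraint on R can be checked mechanically. The following is a minimal sketch, assuming a toy grammar stored as a dictionary; the grammar, the name `RULES` and the function `is_normalized` are illustrative and do not come from the slides.

```python
from collections import defaultdict
import math

# Hypothetical toy grammar: (left-hand side, right-hand side tuple) -> p(r).
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("dogs",)): 0.6,
    ("NP", ("cats",)): 0.4,
    ("VP", ("bark",)): 0.7,
    ("VP", ("sleep",)): 0.3,
}

def is_normalized(rules, tol=1e-9):
    """Return True if p(r) sums to 1 for every left-hand side."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(math.isclose(t, 1.0, abs_tol=tol) for t in totals.values())
```

Calling `is_normalized(RULES)` returns True for this grammar; changing any single probability makes it return False.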

Problems with most SCFG learning algorithms
1) Expensive storage: a corpus of complete sentences must be stored
2) Time-consuming: the algorithms need repeated passes over all the data

Learning SCFG
• Inducing context-free structure from a corpus of sentences
• Learning the production (rule) probabilities

Learning SCFG (cont.)
General method: Inside/Outside algorithm
• Expectation-Maximization (EM)
• Find the expectation of the rules
• Maximize the likelihood given both the expectation and the corpus
Disadvantage of the Inside/Outside algorithm
• The entire sentence corpus must be stored using some representation (e.g. a chart parse)
• Expensive storage (unrealistic for a human agent!)
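To make the cost of this method concrete, here is a minimal sketch of the inside pass only, which computes the probability that the start symbol derives a sentence. It assumes a grammar in Chomsky normal form; the function name `inside_probability` and the two rule dictionaries are illustrative, not taken from the slides. The outside pass and the EM re-estimation step are omitted.

```python
from collections import defaultdict

def inside_probability(words, binary_rules, lexical_rules, start="S"):
    """Inside pass for a CNF grammar.

    binary_rules:  {(A, B, C): p} for rules A -> B C
    lexical_rules: {(A, word): p} for rules A -> word
    """
    n = len(words)
    # beta[(i, j)][A] = probability that A derives words[i:j]
    beta = defaultdict(lambda: defaultdict(float))

    # Spans of length 1 come from the lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical_rules.items():
            if word == w:
                beta[(i, i + 1)][a] += p

    # Longer spans combine two adjacent sub-spans; the three nested
    # position loops (span, i, k) give the O(n^3) dependence on length.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary_rules.items():
                    left = beta[(i, k)].get(b, 0.0)
                    right = beta[(k, j)].get(c, 0.0)
                    if left and right:
                        beta[(i, j)][a] += p * left * right

    return beta[(0, n)].get(start, 0.0)
```

With the rules S -> NP VP (1.0), NP -> dogs (0.6) and VP -> bark (0.7), the sentence "dogs bark" gets a probability of approximately 0.42. EM has to repeat this computation for every stored sentence on every iteration, which is why the corpus must be kept.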

Proposed Algorithm
• Use Unique Normal Form (UNF)
• Replace each terminal rule A -> z with 2 new rules:
• A -> D, with p[A -> D] = p[A -> z]
• D -> z, with p[D -> z] = 1
• No two productions have the same right-hand side
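The terminal-rule replacement can be sketched as below. This is an illustration under stated assumptions: rules are stored as a dictionary, one new non-terminal is introduced per distinct terminal, and the `D_<terminal>` naming scheme and both function names are hypothetical. The sketch performs only the replacement step; `has_unique_rhs` merely checks the uniqueness property and does not enforce it.

```python
def to_unf_terminals(rules):
    """Replace each A -> z by A -> D_z (same probability) and D_z -> z (probability 1)."""
    nonterminals = {lhs for (lhs, _rhs) in rules}
    out = {}
    for (lhs, rhs), p in rules.items():
        is_terminal_rule = len(rhs) == 1 and rhs[0] not in nonterminals
        if is_terminal_rule:
            d = "D_" + rhs[0]          # hypothetical name for the new non-terminal
            out[(lhs, (d,))] = p       # p[A -> D] = p[A -> z]
            out[(d, rhs)] = 1.0        # p[D -> z] = 1
        else:
            out[(lhs, rhs)] = p
    return out

def has_unique_rhs(rules):
    """Return True if no two productions share a right-hand side."""
    rhs_list = [rhs for (_lhs, rhs) in rules]
    return len(rhs_list) == len(set(rhs_list))

# Hypothetical toy grammar: (left-hand side, right-hand side tuple) -> p(r).
TOY = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("dogs",)): 0.6,
    ("NP", ("cats",)): 0.4,
    ("VP", ("bark",)): 1.0,
}
UNF = to_unf_terminals(TOY)
```

In `UNF`, the rule NP -> dogs has become NP -> D_dogs with probability 0.6 plus D_dogs -> dogs with probability 1.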

Learning SCFG: Proposed Algorithm (cont.)
• Use histograms
• Each rule r has 2 histograms (Hor, HLr)

Proposed Algorithm (cont.)
• Hor: constructed when parsing the sentences in O
• HLr: continually updated throughout the learning process
• HLr is rescaled to a fixed size h
• Why? So that recently used rules have more impact on the histogram
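One way to picture the rescaling is the sketch below. It assumes a histogram stored as a dictionary from bin index to accumulated mass and a rule of "scale everything down whenever the total exceeds h"; the slides do not specify the exact rescaling scheme, and the function name `update_and_rescale` is hypothetical.

```python
def update_and_rescale(hist, bin_index, h):
    """Add one observation to bin_index, then rescale so the total mass is at most h."""
    hist = dict(hist)                                  # leave the caller's histogram untouched
    hist[bin_index] = hist.get(bin_index, 0.0) + 1.0   # the new observation enters with weight 1
    total = sum(hist.values())
    if total > h:
        scale = h / total                              # assumed scheme: uniform shrink to size h
        hist = {k: v * scale for k, v in hist.items()}
    return hist
```

Once the histogram is full, every new observation shrinks the mass contributed by all earlier ones by the same factor, so the influence of old observations decays geometrically while the storage per rule stays fixed.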

Comparing HLr and Hor
• Relative entropy
• T decreases: increase the probability of the rules used
• (if s is large, increase the probability of the rules used when parsing the last sentence)
• T increases: decrease the probability of the rules used (e.g. p_{t+1}(r) = 0.01 * p_{t+1}(r))
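Relative entropy between two histograms can be computed as in the sketch below. It assumes both histograms are dictionaries from bin index to mass, normalizes each into a distribution, and adds a small smoothing constant `eps` so that empty bins do not cause a division by zero. The smoothing, the argument order and the function name are assumptions of this sketch, not details from the slides.

```python
import math

def relative_entropy(hist_p, hist_q, eps=1e-9):
    """Relative entropy D(P || Q) between two histograms after normalization."""
    bins = set(hist_p) | set(hist_q)
    # Smooth every bin so that log(p / q) is always defined (assumed choice).
    p_raw = {b: hist_p.get(b, 0.0) + eps for b in bins}
    q_raw = {b: hist_q.get(b, 0.0) + eps for b in bins}
    p_total = sum(p_raw.values())
    q_total = sum(q_raw.values())
    result = 0.0
    for b in bins:
        p = p_raw[b] / p_total
        q = q_raw[b] / q_total
        result += p * math.log(p / q)
    return result
```

Identical histograms give a value of 0, and the value grows as the two histograms diverge, which is the signal used to decide whether rule probabilities should move up or down.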

Comparing the Inside/Outside algorithm with the proposed algorithm
Inside/Outside: O(n^3)
• Good: 3-5 iterations
• Bad: needs to store the complete sentence corpus
Proposed algorithm: O(n^3)
• Good: memory requirement is constant!
• Bad: 500-1000 iterations
