Incrementally Learning Parameters of Stochastic CFG Using Summary Stats
Written by: Brent Heeringa, Tim Oates
Goals:
• To learn the syntax of utterances
Approach:
• SCFG (Stochastic Context-Free Grammar) M = <V, E, R, S>
• V: finite set of non-terminals
• E: finite set of terminals
• R: finite set of rules; each rule r has a probability p(r), and the p(r) of all rules with the same left-hand side sum to 1
• S: start symbol
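A minimal sketch of the rule set R and its normalization constraint, assuming a toy grammar. The grammar and the helper name `is_normalized` are illustrative assumptions and do not come from the slides.

```python
from collections import defaultdict
import math

# Toy SCFG (an assumption for illustration): each rule is keyed by
# (left-hand side, right-hand side tuple) and maps to its probability p(r).
RULES = {
    ("S", ("A", "B")): 1.0,
    ("A", ("a",)): 0.7,
    ("A", ("A", "A")): 0.3,
    ("B", ("b",)): 1.0,
}

def is_normalized(rules, tol=1e-9):
    """Return True if p(r) sums to 1 for every left-hand side."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(math.isclose(t, 1.0, abs_tol=tol) for t in totals.values())
```

Calling `is_normalized(RULES)` checks the constraint "sum of p(r) for the same left-hand side = 1" on the toy grammar.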
Problems with most SCFG learning algorithms
1) Expensive storage: a corpus of complete sentences must be stored
2) Time-consuming: the algorithms need repeated passes over all the data
Learning SCFG
• Inducing context-free structure from a corpus of sentences
• Learning the production (rule) probabilities
Learning SCFG (cont.)
General method: Inside/Outside algorithm
• Expectation-Maximization (EM)
• Find the expectations of the rules
• Maximize the likelihood given both the expectations and the corpus
Disadvantages of the Inside/Outside algorithm
• The entire sentence corpus must be stored in some representation (e.g. a chart parse)
• Expensive storage (unrealistic for a human agent!)
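To make the chart computation concrete, here is a sketch of only the inside pass that Inside/Outside builds on, for a grammar in Chomsky normal form. It is not the full EM procedure. The function name and the `unary`/`binary` rule encodings are assumptions chosen for this example.

```python
from collections import defaultdict

def inside_probability(words, unary, binary, start="S"):
    """Probability that `start` derives `words` under a CNF SCFG.

    unary:  {(A, word): p}   for rules A -> word
    binary: {(A, B, C): p}   for rules A -> B C
    """
    n = len(words)
    # beta[i][j][A] = probability that A derives words[i..j] (inclusive)
    beta = [[defaultdict(float) for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        for (a, word), p in unary.items():
            if word == w:
                beta[i][i][a] += p
    # Three nested position loops (span, start, split) give the cubic cost.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for (a, b, c), p in binary.items():
                    left = beta[i][k].get(b, 0.0)
                    right = beta[k + 1][j].get(c, 0.0)
                    if left and right:
                        beta[i][j][a] += p * left * right
    return beta[0][n - 1].get(start, 0.0) if n else 0.0
```

The chart `beta` is the kind of per-sentence representation that has to be kept around for every sentence in the corpus, which is the storage cost the slide points at.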
Proposed Algorithm
• Use Unique Normal Form (UNF)
• Replace each terminal rule A -> z with two new rules:
• A -> D, with p[A -> D] = p[A -> z]
• D -> z, with p[D -> z] = 1
• No two productions have the same right-hand side
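The terminal rewrite above can be sketched as follows. This covers only the A -> z replacement step; the uniqueness of right-hand sides is checked, not enforced. Creating one fresh non-terminal per rewritten rule, the names `D1`, `D2`, ..., and both function names are assumptions for illustration, and the input grammar is assumed not to use those names already.

```python
def to_unf_terminals(rules, terminals):
    """Rewrite each terminal rule A -> z as A -> D (same probability)
    and D -> z (probability 1), leaving all other rules unchanged."""
    out = {}
    counter = 0
    for (lhs, rhs), p in rules.items():
        if len(rhs) == 1 and rhs[0] in terminals:
            counter += 1
            fresh = f"D{counter}"  # hypothetical naming scheme for new non-terminals
            out[(lhs, (fresh,))] = p      # p[A -> D] = p[A -> z]
            out[(fresh, rhs)] = 1.0       # p[D -> z] = 1
        else:
            out[(lhs, rhs)] = p
    return out

def has_unique_rhs(rules):
    """Return True if no two productions share a right-hand side."""
    rhs_list = [rhs for (_lhs, rhs) in rules]
    return len(rhs_list) == len(set(rhs_list))
```

In this sketch the probability mass of each original terminal rule moves to A -> D, so the per-left-hand-side sums stay at 1.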
Learning SCFG: Proposed Algorithm (cont.)
• Use histograms
• Each rule r has two histograms (Hor, HLr)
Proposed Algorithm (cont.)
• Hor: constructed when parsing the sentences in O
• HLr: continually updated throughout the learning process
• HLr is rescaled to a fixed size h
• Why? So that recently used rules have more impact on the histogram
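One way to read the rescaling step is sketched below. The slides do not specify how the histogram is stored or rescaled, so the dict-of-bins layout, the function name `update_and_rescale`, and the "scale all bins by h / total" rule are all assumptions.

```python
def update_and_rescale(hist, bin_key, h):
    """Add one observation to `hist`, then rescale so the total mass is at most h.

    hist maps a bin label to its (possibly fractional) mass.
    """
    hist = dict(hist)  # work on a copy so the caller's histogram is untouched
    hist[bin_key] = hist.get(bin_key, 0.0) + 1.0
    total = sum(hist.values())
    if total > h:
        scale = h / total
        # Every rescale shrinks older mass again, so recent updates weigh more.
        hist = {k: v * scale for k, v in hist.items()}
    return hist
```

Under this reading, an observation added long ago has been shrunk by many rescales, while a recent one has been shrunk only a few times, which matches the slide's remark that recently used rules have more impact.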
Comparing HLr and Hor
• Relative entropy
• T decreases: increase the probability of the rules used
• (if s is large, increase the probability of the rules used when parsing the last sentence)
• T increases: decrease the probability of the rules used (e.g. p_{t+1}(r) is scaled to 0.01 * p_{t+1}(r))
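A minimal sketch of a relative entropy (Kullback-Leibler divergence) between two histograms. The slides do not state the direction of the comparison or how empty bins are handled, so the direction (first argument relative to the second), the `eps` smoothing, and the function name are assumptions.

```python
import math

def relative_entropy(h_o, h_l, eps=1e-9):
    """Relative entropy D(P || Q), where P and Q are the normalized
    (and lightly smoothed) versions of histograms h_o and h_l."""
    bins = set(h_o) | set(h_l)
    # Smoothing keeps every bin non-zero so the logarithm is always defined.
    total_o = sum(h_o.values()) + eps * len(bins)
    total_l = sum(h_l.values()) + eps * len(bins)
    d = 0.0
    for b in bins:
        p = (h_o.get(b, 0.0) + eps) / total_o
        q = (h_l.get(b, 0.0) + eps) / total_l
        d += p * math.log(p / q)
    return d
```

Because both histograms are normalized first, a rescaled histogram with the same shape gives a value near zero; the value grows as the two shapes diverge.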
Comparing the Inside/Outside algorithm with the proposed algorithm
Inside/Outside
• O(n^3)
• Good: 3-5 iterations
• Bad: needs to store the complete sentence corpus
Proposed algorithm
• O(n^3)
• Bad: 500-1000 iterations
• Good: memory requirement is constant!