An Attempt at Unsupervised Learning of Hierarchical Dependency Parsing via the Dependency Model with Valence (DMV)
Motivation
• Dependency Parsing:
  • Search Query Refinement
  • Statistical Machine Translation
• Unsupervised Learning:
  • Availability of Large Quantities of Data
DMV
• Pick a direction (left or right).
• Generate the first child, or stop.
• Generate more children, until stop.
• Repeat in the other direction.
• Recurse…
• P_order
• P_stop
• P_attach
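The generative story on this slide can be sketched as a small sampler. This is only an illustration under stated assumptions: the tag set, every probability value, and the names `p_order_left_first`, `p_stop`, `P_ATTACH`, `generate`, and `yield_of` are hypothetical, and conditioning the stop decision on whether the head already has a child in that direction is an assumption about how the model is parameterized.

```python
import random

# Toy parameters, invented for illustration only (not taken from the slides).
def p_order_left_first(head):
    """Order distribution: probability of expanding the left side first."""
    return 0.5

def p_stop(head, direction, adjacent):
    """Stop distribution: probability of generating no (more) children."""
    if head == "D":                      # in this toy model determiners never take children
        return 1.0
    return 0.4 if adjacent else 0.8      # stopping is likelier once a child exists

# Attach distribution: which child tag a head generates.
P_ATTACH = {
    "V": {"N": 0.9, "D": 0.1},
    "N": {"D": 0.8, "N": 0.2},
}

def generate(head, rng):
    """Sample a tree (head, left_children, right_children), top-down."""
    first = "left" if rng.random() < p_order_left_first(head) else "right"
    second = "right" if first == "left" else "left"
    children = {"left": [], "right": []}
    for direction in (first, second):
        adjacent = True                  # no child generated yet in this direction
        while rng.random() >= p_stop(head, direction, adjacent):
            tags, weights = zip(*P_ATTACH[head].items())
            child = rng.choices(tags, weights=weights)[0]
            children[direction].append(generate(child, rng))  # recurse
            adjacent = False
    return (head, children["left"], children["right"])

def yield_of(tree):
    """Linearize a sampled tree into its tag sequence."""
    head, left, right = tree
    words = []
    for child in reversed(left):         # left children were generated head-outward
        words.extend(yield_of(child))
    words.append(head)
    for child in right:
        words.extend(yield_of(child))
    return words
```

Calling `yield_of(generate("V", random.Random(0)))` returns one sampled tag sequence rooted at a verb; different seeds give different trees.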
EM
• Inside-Outside Algorithm:
  • Inside: P_i(i,X,j) = P(X derives i…j)
  • Outside: P_o(i,X,j) = P(S derives 0…i X j…l)
• Re-Estimation:
  • Frequency of sub-tree (i,X,j) = P_i(i,X,j) * P_o(i,X,j)
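To make the two quantities concrete, here is a minimal sketch of inside-outside for a generic PCFG in Chomsky normal form. It is not the DMV-specific implementation from the slides: the grammar encoding, the function names `inside_outside` and `posterior`, and the toy grammar are assumptions for illustration. The posterior divides the inside-outside product by the sentence probability, which is the usual way to turn the product into an expected count.

```python
from collections import defaultdict

def inside_outside(words, binary, lexical, root="S"):
    """Inside and outside tables keyed by (i, X, j) for spans words[i:j].

    binary  maps (A, B, C) to P(A -> B C)
    lexical maps (A, word) to P(A -> word)
    """
    n = len(words)
    inside = defaultdict(float)
    outside = defaultdict(float)

    # Inside, width 1: lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                inside[i, a, i + 1] += p

    # Inside, wider spans: bottom-up over all split points k.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    inside[i, a, j] += p * inside[i, b, k] * inside[k, c, j]

    # Outside: top-down, widest parent spans first.
    outside[0, root, n] = 1.0
    for width in range(n, 1, -1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    out = outside[i, a, j]
                    outside[i, b, k] += p * out * inside[k, c, j]
                    outside[k, c, j] += p * out * inside[i, b, k]
    return inside, outside

def posterior(inside, outside, i, x, j, n, root="S"):
    """Expected count of constituent (i, x, j), normalized by P(sentence)."""
    return inside[i, x, j] * outside[i, x, j] / inside[0, root, n]

# Hypothetical toy grammar with exactly one parse for the sentence below.
BINARY = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
LEXICAL = {("NP", "dogs"): 0.5, ("NP", "cats"): 0.5, ("V", "chase"): 1.0}
SENTENCE = ["dogs", "chase", "cats"]
INSIDE, OUTSIDE = inside_outside(SENTENCE, BINARY, LEXICAL)
```

Because the toy sentence has a single parse, every constituent in that parse has posterior 1; with an ambiguous grammar the posteriors become fractional, and summing them over a corpus gives the counts used for re-estimation.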
Evaluation
• Head-percolation of Penn Treebank parses;
• % edges correct (directed or undirected) in the best (P)CFG parse…
• Zero Knowledge: 14.4 (29.9)
• Adjacent Word Heuristic: 33.6
• Klein & Manning: 43.2 (63.7)
• Oracle: 75.5 (77.5)
  • - P_attach: 60.0 (63.3)
  • - P_stop: 53.9 (57.7)
  • - P_stopA: 50.0 (54.8)
  • - P_stopN: 12.5 (30.8)
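One way to compute the two edge scores is sketched below. The conventions are assumptions, not necessarily those behind the numbers above: each parse is a list of head indices with -1 marking the root, the root token is included in the denominator, and an edge counts as undirected-correct when the same pair of words is linked in the gold parse in either direction. The name `edge_accuracy` is hypothetical.

```python
def edge_accuracy(gold_heads, pred_heads):
    """Return (directed %, undirected %) over all tokens of one sentence.

    gold_heads[k] and pred_heads[k] give the head index of token k,
    or -1 if token k is the root (an assumed encoding).
    """
    assert len(gold_heads) == len(pred_heads)
    n = len(gold_heads)
    directed = 0
    undirected = 0
    for child, (g, p) in enumerate(zip(gold_heads, pred_heads)):
        if p == g:
            # Same head (or both root): correct under both measures.
            directed += 1
            undirected += 1
        elif p != -1 and gold_heads[p] == child:
            # Right pair of words, but the arrow points the wrong way.
            undirected += 1
    return 100.0 * directed / n, 100.0 * undirected / n
```

For example, with `gold_heads = [1, -1, 1]` and `pred_heads = [-1, 0, 1]`, one of three tokens has the right head and a second has a reversed edge, so the function returns roughly 33.3 directed and 66.7 undirected.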
EM
• Didn’t work out… always made things worse, even when initialized with very good solutions.
• Starting from Zero Knowledge, EM already reaches 18.4 (38.4) after 1 iteration, then worsens.
• Starting from an ad-hoc harmonic initialization of P_attach, it reaches 21.5 (47.1) after 1 iteration, then worsens; the same happens even when starting from the Oracle solution…
• Summary:
  • DMV: useful, simple, extensible model;
  • EM: more thorough debugging needed.