Approximate Factoring for A* Search

Aria Haghighi, John DeNero, and Dan Klein
Computer Science Division, University of California, Berkeley

Inference for NLP Tasks A* Search

Inference as Search
[Figure: a partial hypothesis y built from actions a1, a2, a3]

Bitext Parsing as Search
• Sentence pair (source and target): "translation is hard" / "la traducción es difícil"
• Weighted synchronous grammar parsing: O(n^6)
• Modified CKY over bi-spans (X[i,j], X'[i',j'])
[Figure: paired parse trees, S over NP VP and S' over NP' VP']

A* Search
A hypothesis y is prioritized by its score so far plus an estimate of its completion score.
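The priority that A* assigns to a hypothesis, score so far plus an estimated completion score, can be sketched in a few lines. This is a generic illustration rather than the parser from the talk: the names `a_star`, `successors`, and `heuristic` are hypothetical, and scores are treated as costs to be minimized.

```python
import heapq
from itertools import count


def a_star(start, is_goal, successors, heuristic):
    """Return (cost, goal_state) for the cheapest goal, or None if none is reachable.

    successors(state) yields (next_state, step_cost) pairs.
    heuristic(state) estimates the remaining (completion) cost.
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(heuristic(start), next(tie), 0.0, start)]
    best = {start: 0.0}  # cheapest known cost so far for each state
    while frontier:
        _, _, cost, state = heapq.heappop(frontier)
        if cost > best.get(state, float("inf")):
            continue  # stale queue entry; a cheaper path was found later
        if is_goal(state):
            return cost, state
        for nxt, step in successors(state):
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                # priority = cost so far + estimated completion cost
                priority = new_cost + heuristic(nxt)
                heapq.heappush(frontier, (priority, next(tie), new_cost, nxt))
    return None
```

With an admissible heuristic (one that never overestimates the completion cost), the first goal popped from the queue is optimal; a tighter heuristic means fewer hypotheses are expanded before that happens.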

A* Search
Heuristic Design
• Tight (small gap)
• Admissible
• Efficient to compute
[Figure: "A* Heuristic Man" pointing a hypothesis toward the optimal result: "This way, hypothesis!"]

A* Example: Bitext Search
For a bi-span, the cost so far is its Viterbi inside score.

A* Bitext Search
The completion score is the Viterbi outside score. This is the ideal heuristic, but computing it is O(n^6).

Of Stately Projections
[Figure: a projection π maps bitext states over paired symbols (S/S', NP/NP', VP/VP') onto single-language states]

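A* Bitext Search
[Figure: a synchronous derivation alongside its source-side tree (S, NP, VP) and target-side tree (S', NP', VP'). The equations after "Suppose" and "Then" are missing from the transcript.]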

Projection Heuristic
Replace the O(n^6) bitext outside score with two monolingual outside scores, each O(n^3). Klein and Manning [2003]
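As a rough sketch of how such a projection heuristic could be assembled: if the source-side and target-side outside costs have already been computed by two monolingual O(n^3) passes, the estimate for a bi-span is the sum of the two projected estimates. The function name, the table layout, and the representation of a bi-span as a `(source_item, target_item)` pair are assumptions made for illustration.

```python
def make_projection_heuristic(source_outside, target_outside):
    """Build h(bispan) from two precomputed monolingual outside-cost tables.

    source_outside / target_outside map a projected item, e.g. ("NP", 0, 2),
    to its Viterbi outside cost in the corresponding monolingual model.
    """
    def heuristic(bispan):
        source_item, target_item = bispan  # the two projections of the bi-span
        return source_outside[source_item] + target_outside[target_item]

    return heuristic
```

When the model's cost factors exactly into a source part and a target part, this sum never exceeds the true bitext outside cost, which is what makes the estimate admissible.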

When models don’t factorize

When models don’t factorize
[Figure: an action a with cost c(a) between states x and y; projected states π_s(x), π_s(y), π_t(x), π_t(y); factored costs φ_s(a) and φ_t(a)]
Pointwise Admissibility

When models don’t factorize
[Figure: a state y with its projections π_t(y) and π_s(y)]
Admissibility

Finding Factored Costs
How to find φ_s and φ_t?
Pointwise Gap

Finding Factored Costs Small gaps

Finding Factored Costs Pointwise Admissibility

Finding Factored Costs
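To make the pointwise gap concrete, here is a small sketch. It assumes that pointwise admissibility means φ_s(π_s(a)) + φ_t(π_t(a)) ≤ c(a) for every action a, a condition the transcript names but does not write out. The greedy procedure below is a simple stand-in that guarantees this condition; it is not the optimization problem solved in the talk, and all function names are hypothetical.

```python
def greedy_factor(costs, proj_s, proj_t):
    """Greedily choose factored costs that are pointwise admissible.

    costs maps each action a to c(a); proj_s / proj_t project an action
    onto its source-side and target-side counterpart.
    """
    # phi_s: the cheapest full cost among actions sharing a source projection.
    phi_s = {}
    for a, c in costs.items():
        s = proj_s(a)
        phi_s[s] = min(phi_s.get(s, float("inf")), c)
    # phi_t: the smallest leftover cost among actions sharing a target projection.
    phi_t = {}
    for a, c in costs.items():
        t = proj_t(a)
        residual = c - phi_s[proj_s(a)]
        phi_t[t] = min(phi_t.get(t, float("inf")), residual)
    return phi_s, phi_t


def pointwise_gaps(costs, proj_s, proj_t, phi_s, phi_t):
    """Gap c(a) - phi_s - phi_t per action; non-negative means admissible at a."""
    return {a: c - phi_s[proj_s(a)] - phi_t[proj_t(a)] for a, c in costs.items()}
```

By construction φ_t never exceeds the leftover c(a) - φ_s for any action, so every gap is non-negative. Nothing here tries to make the gaps small overall, which is the job of the optimization problem described in the slides.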

Bitext Experiments
Synchronous Tree-to-Tree Transducer
• Trained on 40k sentences of English-Spanish Europarl [Galley et al., 2004]
• Rare words replaced with POS tags
• Tested on 1,200 sentences, max length 5-15
Optimization Problem
• Solved only once per grammar
• 206K variables
• 160K constraints
• 29 minutes

Bitext Experiments

Bitext Experiments Zhang and Gildea (2006)

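Lexicalized Parsing
[Figure: the lexicalized tree S-(is,VBZ) over NP-(translation,NN) and VP-(is,VBZ), shown with its unlexicalized tree S over NP VP and the head pair (translation, NN), (is, VBZ)]
Klein and Manning [2003]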

Lexicalized Parsing

Lexicalized Parsing Too many constraints to efficiently solve! Over 64e13 possible lexicalized rules

Lexicalized Parsing

Lexicalized Model Experiments
Standard Setup
• Train on sections 2-21 of the treebank
• Test on section 23 (length ≤ 40)
Models Tested
• Factored model [Klein and Manning, 2003]
• Non-factored model

Lexicalized Parsing Factored Model [Klein and Manning, 2003]

Lexicalized Parsing Non-Factored Model

Conclusions
• General technique for generating A* estimates
• Can explicitly control the trade-off between admissibility and tightness
• Future work: explore different objectives and applications

Thanks http://nlp.cs.berkeley.edu
