
Fast Finite-state Relaxation Method for Enforcing Global Constraints on Sequence Decoding

This presentation describes a fast finite-state relaxation method for enforcing global constraints on sequence decoding, motivated by tasks such as agreement in named entity recognition, label structure in bibliography parsing, and semantic role labeling. The approach exploits the quality of local models and dynamically applies only those global constraints that the decoded labeling violates. On Roth & Yih's semantic role labeling setup, it decodes considerably faster than the ILP-based approach.

Presentation Transcript


  1. A Fast Finite-state Relaxation Method for Enforcing Global Constraints on Sequence Decoding. Roy Tromble & Jason Eisner, Johns Hopkins University

  2. We know what the labels should look like! • Agreement: Named Entity Recognition and seminar announcements (Finkel et al., ACL 2005) • Label structure: bibliography parsing (Peng & McCallum, HLT-NAACL 2004) • Semantic Role Labeling (Roth & Yih, ICML 2005): one role per string, one string per role • Example announcement: "Seminar – Friday, April 1. Speaker: Monty Hall. Location: Auditorium #1. 'Let's Make a Dilemma': Monty Hall will host a discussion of his famous paradox."

  3. Finite-state constraint relaxation: combine local models and global constraints, balancing sequence modeling quality against decoding runtime. Exploit the quality of the local models!

  4. Semantic Role Labeling • Label each argument to a verb • Six core argument types (A0-A5) • CoNLL-2004 shared task • Penn Treebank section 20 • 4305 propositions • Follow Roth & Yih (ICML 2005) • Example: "Sales for the quarter rose to $ 1.63 billion from $ 1.47 billion." with A1 = Sales for the quarter, A4 = $ 1.63 billion, A3 = $ 1.47 billion, and all other words labeled O.

  5. Encoding constraints as finite-state automata

  6. Roth & Yih’s constraints as FSAs [^A0]*A0*[^A0]* [^A1]*A1*[^A1]* NO DUPLICATE ARGUMENTS Each argument type (A0, A1, ...) can label at most one sub-sequence of the input.

  7. Roth & Yih’s constraints as FSAs • Regular expressions on any sequences: • grepfor sequence models O*[^O]?* AT LEAST ONE ARGUMENT The label sequence must contain at least one instance that is not O.

  8. Roth & Yih’s constraints as FSAs DISALLOW ARGUMENTS Only allow argument types that are compatible with the proposition’s verb.

  9. Roth & Yih’s constraints as FSAs KNOWN VERB POSITION The proposition’s verb must be labeled O.

  10. Roth & Yih’s constraints as FSAs Any constraints on bounded-length sequences ARGUMENT CANDIDATES Certain sub-sequences must receive a single label.

  11. Roth & Yih's local model as a lattice: a unigram model! Its scores act as "soft constraints" or "features."
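
Because the local model is a unigram model, its lattice can be pictured as nothing more than a table of per-position, per-label scores; the numbers below are made up for illustration and are not Roth & Yih's features.

```python
# Hypothetical scores, just to show the shape of a unigram lattice:
# scores[i][label] = local model's score for putting `label` at position i.
scores = [
    {"O": 1.2, "A1": 2.0, "A4": 0.1},   # e.g. "Sales for the quarter"
    {"O": 3.0, "A1": 0.5, "A4": 0.2},   # e.g. "rose to"
    {"O": 0.4, "A1": 1.1, "A4": 1.0},   # e.g. "$ 1.63 billion"
]

# With no global constraints, decoding is just an independent argmax:
best_path = [max(pos, key=pos.get) for pos in scores]
print(best_path)   # ['A1', 'O', 'A1'] -- may violate global constraints
```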

  12. A brute-force FSA decoder: intersect the local model's lattice for the sentence with all of the global constraints, then decode to produce the labeling.
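
Here is a sketch of what "intersect, then decode" amounts to under the assumptions of the previous sketches: a per-position score table, and constraint DFAs given as (transition function, start state, accepting states). This is illustrative Python, not the FSA toolkit's weighted intersection.

```python
def decode_with_constraints(scores, constraints):
    """scores: list of {label: score} dicts, one per position.
    constraints: list of (delta, start_state, accepting_states) DFAs."""
    # The chart maps a tuple of constraint-DFA states to the best-scoring
    # prefix labeling that reaches those states (Viterbi over the product).
    chart = {tuple(c[1] for c in constraints): (0.0, [])}
    for pos in scores:
        new_chart = {}
        for states, (score, path) in chart.items():
            for label, label_score in pos.items():
                next_states = []
                for (delta, _, _), q in zip(constraints, states):
                    q = delta(q, label)
                    if q is None:                    # this constraint rejects
                        break
                    next_states.append(q)
                else:                                # every constraint survived
                    key = tuple(next_states)
                    cand = (score + label_score, path + [label])
                    if key not in new_chart or cand[0] > new_chart[key][0]:
                        new_chart[key] = cand
        chart = new_chart
    # Keep only labelings that end in an accepting state of every constraint.
    finals = [v for states, v in chart.items()
              if all(q in c[2] for c, q in zip(constraints, states))]
    return max(finals, key=lambda v: v[0]) if finals else None

# AT LEAST ONE ARGUMENT as a (delta, start, accepting) triple.
at_least_one = (lambda q, label: 1 if label != "O" else q, 0, {1})

scores = [{"O": 3.0, "A1": 1.0}, {"O": 2.0, "A1": 0.5}]
print(decode_with_constraints(scores, [at_least_one]))
# (3.5, ['O', 'A1']) -- the unconstrained best path ['O', 'O'] is ruled out
```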

  13. NO DUPLICATE A0

  14. NO DUPLICATE A0, A1

  15. NO DUPLICATE A0, A1, A2

  16. NO DUPLICATE ARGUMENTS: satisfying global constraints is NP-hard. Any approach would blow up in the worst case!

  17. Handling an NP-hard problem Roth & Yih (ICML 2005): • Express path decoding and global constraints as an integer linear program (ILP). • Apply ILP solver: • Relax ILP to (real-valued) LP. • Apply polynomial-time LP solver. • Branch and bound to find optimal integer solution.

  18. The ILP solver doesn't know it's labeling sequences • Path constraints: state 0: outflow ≤ 1; state 3: inflow ≤ 1; states 1 & 2: outflow = inflow • At least one argument: arcs labeled O: flow ≤ 1
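
For comparison, here is a hedged sketch of the ILP view. It is not Roth & Yih's Xpress-MP arc-flow formulation from the slide: it uses one binary indicator per (position, label) instead of flow variables, it is written with the PuLP library purely as an assumption (any ILP front end would do), the scores are made up, and the block-start trick used to linearize NO DUPLICATE ARGUMENTS is a common encoding rather than necessarily theirs.

```python
# Simplified, illustrative ILP (indicator variables rather than arc flows).
from pulp import LpProblem, LpVariable, LpMaximize, lpSum

scores = [                                   # made-up unigram scores
    {"O": 1.2, "A1": 2.0, "A4": 0.1},
    {"O": 3.0, "A1": 0.5, "A4": 0.2},
    {"O": 0.4, "A1": 1.1, "A4": 1.0},
]
labels = ["O", "A1", "A4"]
n = len(scores)

prob = LpProblem("srl_decoding", LpMaximize)
x = {(i, y): LpVariable(f"x_{i}_{y}", cat="Binary")
     for i in range(n) for y in labels}

# Objective: total local-model score of the chosen labeling.
prob += lpSum(scores[i][y] * x[i, y] for i in range(n) for y in labels)

# "Path" constraints: exactly one label per position.
for i in range(n):
    prob += lpSum(x[i, y] for y in labels) == 1

# AT LEAST ONE ARGUMENT: not every position may be labeled O.
prob += lpSum(x[i, "O"] for i in range(n)) <= n - 1

# NO DUPLICATE ARGUMENTS: at most one contiguous block per argument type,
# linearized with binary "block start" indicators s_i >= x_i - x_{i-1}.
for y in ("A1", "A4"):
    s = [LpVariable(f"start_{i}_{y}", cat="Binary") for i in range(n)]
    prob += s[0] >= x[0, y]
    for i in range(1, n):
        prob += s[i] >= x[i, y] - x[i - 1, y]
    prob += lpSum(s) <= 1

prob.solve()
print([max(labels, key=lambda y: x[i, y].varValue) for i in range(n)])
# ['A1', 'O', 'A4'] -- the unconstrained argmax ['A1', 'O', 'A1'] is infeasible
```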

  19. Maybe we can fix the brute-force decoder?

  20. Local model usually violated no constraints

  21. Most constraints were rarely violated

  22. Finite-state constraint relaxation • Local models already capture much structure. • Relax the constraints instead! • Find the best path using a linear-time decoding algorithm. • Apply only those global constraints that the path violates.

  23. Recap of the brute-force algorithm: intersect the local model's lattice for the sentence with all of the global constraints, then decode.

  24. The constraint relaxation algorithm: decode the local model's lattice for the sentence, then test the resulting labeling against each global constraint (C1, C2, C3, ...). If no constraint is violated, the labeling is optimal; otherwise, intersect only the violated constraints into the lattice and decode again. Constraints that are never violated are never intersected.
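
The loop itself is short. Below is an illustrative sketch of it under the same assumptions as the earlier decoder sketch (constraints as (delta, start, accepting) triples and a decoder like `decode_with_constraints` above); the names are hypothetical, and the authors' actual implementation uses the FSA toolkit's weighted automata.

```python
# Illustrative constraint-relaxation loop, not the authors' implementation.

def satisfies(constraint, labels):
    """Does the labeling end in an accepting state of the constraint DFA?"""
    delta, state, accepting = constraint
    for label in labels:
        state = delta(state, label)
        if state is None:
            return False
    return state in accepting

def relaxation_decode(scores, all_constraints, decode):
    # Assumes at least one labeling satisfies all constraints.
    active = []                                    # constraints intersected so far
    while True:
        score, labeling = decode(scores, active)   # e.g. decode_with_constraints
        violated = [c for c in all_constraints
                    if c not in active and not satisfies(c, labeling)]
        if not violated:                           # satisfies everything: optimal
            return score, labeling
        active += violated                         # intersect only what it violated

# Usage, reusing the earlier sketches:
#   relaxation_decode(scores, [at_least_one, ...], decode_with_constraints)
```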

  25. Finite-state constraint relaxation is faster than the ILP solver. Why? • Both use state-of-the-art implementations: Xpress-MP for ILP, and the FSA toolkit (Kanthak & Ney, ACL 2004) for constraint relaxation.

  26. No sentences required more than a few iterations. Many took only one iteration even though two constraints were violated.

  27. Buy one, get one free: Sales for the quarter rose to $ 1.63 billion from $ 1.47 billion. (labels shown: A1, A1, A4, A3)

  28. Lattices remained small: arcs at each iteration vs. arcs in the brute-force lattice, for examples that required 5 intersections.

  29. Take-home message • Global constraints usually aren't doing that much work for you: typical examples violate only a small number of them under the local model. • They shouldn't have to slow you down so much, even though they're NP-hard in the worst case: figure out dynamically which ones need to be applied.

  30. Future work • General soft constraints (We discuss binary soft constraints in the paper.) • Choose order to test and apply constraints, e.g. by reinforcement learning. • k-best decoding

  31. Thanks • to Scott Yih for providing both data and runtime, and • to Stephan Kanthak for FSA.
