Modeling Discriminative Global Inference. Nick Rizzolo and Dan Roth, University of Illinois at Urbana-Champaign. ICSC ’07.
Motivation Interesting systems are increasingly complex • Single classifier systems: document categorizer, sentence boundary detector • “Local” classifiers + constraints: chunking, named entity recognition, semantic role labeling, coreference
Motivation Interesting systems are increasingly complex • Each new system, a massive programming effort • Yet they all solve the same problem: Given related objects… Find the “best” legal labeling • Feature extraction • Learning • Inference
Learning Based Java Eases the programmer’s burden • Organized, integrated feature extraction • Learning operator abstracts training • Convenient constraint programming language
Outline • Motivation • A Model for Discriminative Inference • A Detailed Example • Formalization • Learning Based Java • Support for Integer Linear Programming
Example: Chunking [Diagram not preserved. Its labels: training data as word windows (w_{i-1}, w_i, w_{i+1}); feature extraction over forms and POS tags; learning of a classifier open(w_i) ∈ {“yes”, “no”}; inference over a sentence w1 … w6 subject to constraints.]
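The feature extraction and training-data steps named on this slide can be sketched in a few lines. This is a minimal illustration in plain Python, not LBJ syntax. It assumes that open(w_i) answers whether a chunk starts at w_i, and the names `window_features`, `training_examples`, `PAD`, and `chunk_starts` are hypothetical, introduced only for this sketch.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, str]  # (word form, POS tag)
PAD: Token = ("<pad>", "<pad>")  # hypothetical filler for positions outside the sentence


def window_features(sentence: List[Token], i: int) -> Dict[str, str]:
    """Collect the forms and POS tags of w_{i-1}, w_i, w_{i+1} as named features."""
    features: Dict[str, str] = {}
    for offset in (-1, 0, 1):
        j = i + offset
        form, pos = sentence[j] if 0 <= j < len(sentence) else PAD
        features[f"form[{offset:+d}]"] = form.lower()
        features[f"pos[{offset:+d}]"] = pos
    return features


def training_examples(
    sentence: List[Token], chunk_starts: Set[int]
) -> List[Tuple[Dict[str, str], str]]:
    """Pair each word's window features with a "yes"/"no" label for open(w_i).

    Assumption: the label is "yes" exactly when a chunk starts at position i.
    """
    return [
        (window_features(sentence, i), "yes" if i in chunk_starts else "no")
        for i in range(len(sentence))
    ]
```

For a sentence such as `[("The", "DT"), ("dog", "NN"), ("barked", "VBD")]` with chunks starting at positions 0 and 2, each word yields six features (three forms, three tags) and one label.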
A Discriminative Inference Model • Constrained optimization • Objective: maximize [equation not preserved; its callouts identify the scoring function, an object that needs a label, a possible label, and an inference variable] • We use first-order logic (FOL) to represent constraints • Inference algorithms: ILP, beam search, … • Learning operator
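To make the constrained-optimization view concrete, the sketch below finds the highest-scoring labeling that satisfies a constraint. It is an illustration under stated assumptions, not the talk's method. Exhaustive enumeration stands in for ILP or beam search. The objective is assumed to be the sum of per-object scores. The B/I/O labels, the score table, and the names `best_legal_labeling` and `no_orphan_inside` are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Hashable, Optional, Sequence

Labeling = Dict[Hashable, str]


def best_legal_labeling(
    objects: Sequence[Hashable],
    labels: Sequence[str],
    score: Callable[[Hashable, str], float],
    is_legal: Callable[[Labeling], bool],
) -> Optional[Labeling]:
    """Return the legal labeling with the highest total score, or None if none is legal.

    Enumerates every assignment, so it is only practical for tiny problems.
    """
    best, best_total = None, float("-inf")
    for assignment in product(labels, repeat=len(objects)):
        labeling = dict(zip(objects, assignment))
        if not is_legal(labeling):
            continue
        total = sum(score(o, labeling[o]) for o in objects)
        if total > best_total:
            best, best_total = labeling, total
    return best


# Hypothetical local classifier scores for four word positions.
SCORES = {
    0: {"B": 0.6, "I": 0.1, "O": 0.3},
    1: {"B": 0.2, "I": 0.5, "O": 0.3},
    2: {"B": 0.1, "I": 0.3, "O": 0.6},
    3: {"B": 0.35, "I": 0.4, "O": 0.25},
}


def no_orphan_inside(labeling: Labeling) -> bool:
    """Constraint: an "I" label must directly follow a "B" or another "I"."""
    previous = "O"
    for position in sorted(labeling):
        if labeling[position] == "I" and previous == "O":
            return False
        previous = labeling[position]
    return True


result = best_legal_labeling(
    [0, 1, 2, 3], ["B", "I", "O"], lambda o, l: SCORES[o][l], no_orphan_inside
)
print(result)  # {0: 'B', 1: 'I', 2: 'O', 3: 'B'}
```

Without the constraint, the per-position argmax would label position 3 as "I" directly after an "O". The constraint makes that labeling illegal, so inference settles on the next-best choice, "B".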
Example: Chunking [LBJ source code shown on the slide is not preserved; its callouts read:] • Modular feature extraction • Learn to mimic an oracle • From data • Using particular features • With an algorithm • Application of the learned function • Declarative, FOL-style syntax
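The callouts on this slide describe a learning operator: learn to mimic an oracle, from data, using particular features, with an algorithm, then apply the learned function. The sketch below mirrors those four ingredients in plain Python. It is not LBJ syntax, and the `Perceptron` class and the `learn` function are hypothetical stand-ins chosen for brevity.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


class Perceptron:
    """Minimal multiclass perceptron over sparse binary features (illustrative only)."""

    def __init__(self, labels: Iterable[str]) -> None:
        self.labels: List[str] = list(labels)
        self.weights: Dict[str, Dict[str, float]] = {
            label: defaultdict(float) for label in self.labels
        }

    def score(self, features: Sequence[str], label: str) -> float:
        return sum(self.weights[label][f] for f in features)

    def predict(self, features: Sequence[str]) -> str:
        # Ties go to the label listed first.
        return max(self.labels, key=lambda label: self.score(features, label))

    def update(self, features: Sequence[str], gold: str) -> None:
        # Standard mistake-driven update: promote the gold label, demote the guess.
        guess = self.predict(features)
        if guess != gold:
            for f in features:
                self.weights[gold][f] += 1.0
                self.weights[guess][f] -= 1.0


def learn(oracle, data, extract, algorithm, epochs: int = 5) -> Callable:
    """Train `algorithm` to mimic `oracle` on `data`, seen through `extract`."""
    examples = list(data)
    for _ in range(epochs):
        for example in examples:
            algorithm.update(extract(example), oracle(example))
    # The result is an ordinary function that can be applied to new objects.
    return lambda example: algorithm.predict(extract(example))


# Hypothetical toy task: does this (form, POS) token start a noun phrase?
Token = Tuple[str, str]
DATA: List[Token] = [("the", "DT"), ("dog", "NN"), ("a", "DT"), ("cat", "NN")]
starts_np = learn(
    oracle=lambda tok: "yes" if tok[1] == "DT" else "no",
    data=DATA,
    extract=lambda tok: [f"form={tok[0]}", f"pos={tok[1]}"],
    algorithm=Perceptron(["yes", "no"]),
)
```

Keeping the oracle, the data, the feature extractor, and the algorithm as separate arguments is the design point: any one of them can be swapped without touching the others.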
Example: Part of Speech Tagging • 97.4% accuracy on Penn Treebank WSJ • Closed world: baseline feature computed from all data • Resulting function returns a conjunctive value • LBJ source file: 54 total lines • Java program: LBJ functions become classes; simply instantiate … and invoke • Effortless integration
Integer Linear Programming • Expressive • Linear inequality constraints are hard to design
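To illustrate why hand-designing linear inequalities is laborious, here is how a few small Boolean constraints are commonly encoded over 0/1 indicator variables. The symbols $x_a$, $x_b$, and $x_{o,l}$ are assumptions introduced for this illustration and do not come from the slides; each variable takes a value in $\{0, 1\}$, with 1 meaning the proposition is true (or that object $o$ receives label $l$).

```latex
\begin{align*}
a \Rightarrow b &\quad\Longleftrightarrow\quad x_a \le x_b \\
a \lor b &\quad\Longleftrightarrow\quad x_a + x_b \ge 1 \\
\lnot (a \land b) &\quad\Longleftrightarrow\quad x_a + x_b \le 1 \\
\text{exactly one label for object } o &\quad\Longleftrightarrow\quad \sum_{l} x_{o,l} = 1
\end{align*}
```

Each encoding is easy to check by trying the four 0/1 assignments, but nested or quantified constraints quickly require auxiliary variables and many inequalities, which is the design burden the slide refers to.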
Final Thoughts • Designers of intelligent systems think about: • What classifiers to learn • How to combine them • LBJ enables programming at this level • Download LBJ from: http://l2r.cs.uiuc.edu/~cogcomp • Future work • More explicit modeling of structure • Applications in NLP and elsewhere
Example: Semantic Role Labeling • Declarative, FOL-style constraints written in terms of functions applied to Java objects • Inference produces new functions that respect the constraints