Enhanced Answer Type Inference from Questions using Sequential Models
Vijay Krishnan, Sujatha Das, Soumen Chakrabarti
IIT Bombay
Types in question answering
• Factoid questions
  • What country's president won a Fields medal?
  • What is Bill Clinton's wife's profession?
  • How much does a rhino weigh?
• Nontrivial to anticipate the answer type
  • Broad clues like how much can be misleading: How much does a rhino cost?
  • Must drive carefully around possessives
• Motivation: initial passage scoring and screening via a semi-structured query, e.g. atype={weight#n#1, hasDigits} NEAR "rhino" (sketched below)
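To make the semi-structured query concrete, here is a minimal sketch of atype-plus-proximity passage screening. The predicates `has_digits` and `near`, and the window size, are hypothetical stand-ins for illustration, not the paper's actual retrieval system:

```python
import re

def has_digits(token: str) -> bool:
    """Surface-pattern atype check: token contains a digit."""
    return any(ch.isdigit() for ch in token)

def near(tokens, i, keyword, window=5):
    """True if `keyword` occurs within `window` tokens of position i."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return any(t.lower() == keyword for t in tokens[lo:hi])

def passage_matches(passage: str, keyword: str = "rhino") -> bool:
    """Screen a passage: some token satisfying the atype predicate
    must appear NEAR the question keyword."""
    tokens = re.findall(r"\w+", passage)
    return any(has_digits(t) and near(tokens, i, keyword)
               for i, t in enumerate(tokens))

print(passage_matches("An adult rhino weighs about 2300 kg."))   # True
print(passage_matches("Rhinos are native to Africa and Asia."))  # False
```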
Answer types and informer spans
• Space of atypes: any type membership you can recognize reliably in the corpus
  • WordNet synsets, named (i.e., typed) entities
  • Orthography and other surface patterns
• How tall is Tom Cruise?
  • NUMBER:distance (UIUC atype taxonomy)
  • hasDigits (surface), linear_measure#n#1 (WordNet)
• A single dominant informer span is almost always enough on standard benchmarks
  • Name the largest producer of wheat
  • Which country is the largest producer of wheat?
Exploiting atypes in QA
[Pipeline diagram] The question ("How tall is the Eiffel Tower?") feeds (1) an informer span tagger (CRF), which extracts "how tall", and (2) an atype classifier (SVM), which outputs NUMBER:distance / hasDigits. Informers and stopwords are filtered out, and the remaining keywords ("Eiffel Tower") plus the atype go to an IR system supporting type tags and proximity, which returns short-listed candidate passages. A sketch of the pipeline follows.
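A minimal end-to-end sketch of the two-stage pipeline, assuming `tag_informer`, `classify_atype`, and `retrieve` are hypothetical stand-ins for the CRF tagger, the SVM classifier, and the typed IR back end:

```python
STOPWORDS = {"how", "is", "the", "what", "which", "a", "of"}

def answer_pipeline(question, tag_informer, classify_atype, retrieve):
    """Two-stage pipeline: (1) a CRF tags the informer span,
    (2) an SVM maps informer + question to an atype; then informers
    and stopwords are dropped and a typed IR query is issued."""
    tokens = question.lower().rstrip("?").split()
    informer = tag_informer(tokens)              # e.g. ["how", "tall"]
    atype = classify_atype(tokens, informer)     # e.g. "NUMBER:distance"
    keywords = [t for t in tokens
                if t not in informer and t not in STOPWORDS]
    return retrieve(atype=atype, keywords=keywords)  # candidate passages
```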
Part 1: Sequence tagging of informers
[Diagram: 3-state vs. 2-state label generators for the CRF]
• Parse the question
• Extract features from the parse tree at many levels of detail
• Use a CRF to learn discriminative feature weights (a sketch follows)
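A minimal sketch of CRF-based informer tagging using the third-party sklearn-crfsuite package (not the authors' implementation); the per-token features here are simplified POS-style stand-ins for the multi-resolution parse-tree features described next:

```python
# pip install sklearn-crfsuite
import sklearn_crfsuite

def token_features(tokens, pos_tags, i):
    """Simplified per-token features; the paper instead uses
    multi-resolution parse-tree features (IsTag/IsNum)."""
    return {
        "word": tokens[i].lower(),
        "pos": pos_tags[i],
        "prev_pos": pos_tags[i - 1] if i > 0 else "BOS",
        "next_pos": pos_tags[i + 1] if i < len(tokens) - 1 else "EOS",
    }

# Toy training data: label "1" marks informer tokens, "0" the rest.
questions = [(["how", "tall", "is", "the", "Eiffel", "Tower"],
              ["WRB", "JJ", "VBZ", "DT", "NNP", "NNP"],
              ["1", "1", "0", "0", "0", "0"])]

X = [[token_features(toks, pos, i) for i in range(len(toks))]
     for toks, pos, _ in questions]
y = [labels for _, _, labels in questions]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, y)
print(crf.predict(X)[0])  # per-token informer labels
```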
Example parse tree and feature ideas
• Being an informer token is correlated with part of speech: "IsTag" features
• ...and with whether the token is in the first chunk of its kind, or the second: "IsNum" features, plus neighborhood info
• Example: the tags for "capital" at successive parse-tree levels are NN, NP, null, NP, SQ, SBARQ; "Japan" is part of the second NP at level 2
A multi-resolution feature table
• Training data too sparse to lexicalize
• Offset i fires boolean feature IsTag(y, t, ℓ) iff its label is y and the parse-tree node covering it at level ℓ has tag t
  • E.g. position 4 fires IsTag(1, NP, 2)
• Offset i fires boolean feature IsNum(y, n, ℓ) iff its label is y and it lies in the n-th chunk of its tag type at level ℓ
• Lots of multi-resolution features, great for a CRF! (sketched below)
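A sketch of how such multi-resolution features might be generated, assuming a question is represented by a per-token matrix of tags at each parse-tree level; the tag matrix and informer labels below are a hypothetical toy, and the chunk-counting interpretation of IsNum is an assumption based on the "second NP at level 2" example:

```python
def multilevel_features(tag_matrix, labels):
    """tag_matrix[level][i] = tag of the parse node covering token i at
    that level; labels[i] is the token's CRF state. Yields IsTag and
    IsNum boolean features per token."""
    features = [set() for _ in labels]
    for level, tags in enumerate(tag_matrix):
        chunk_counts = {}        # chunks of each tag type seen so far
        prev_tag = None
        for i, tag in enumerate(tags):
            if tag != prev_tag:  # a new chunk of this tag type begins
                chunk_counts[tag] = chunk_counts.get(tag, 0) + 1
                prev_tag = tag
            features[i].add(("IsTag", labels[i], tag, level))
            features[i].add(("IsNum", labels[i], chunk_counts[tag], level))
    return features

# Toy: "What is the capital of Japan" with two tag levels.
tags = [["WP", "VBZ", "DT", "NN", "IN", "NNP"],    # level 1 (POS)
        ["WHNP", "VP", "NP", "NP", "PP", "NP"]]    # level 2 (chunks)
labels = [0, 0, 1, 1, 0, 0]                        # "the capital" = informer
for i, f in enumerate(multilevel_features(tags, labels)):
    print(i, sorted(f))   # "Japan" gets IsNum(0, 2, 1): second NP at level 2
```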
Experimental setup
• 5500 training and 500 test questions from UIUC (Li and Roth)
• 6 coarse atypes, 50 fine atypes
• We tagged informer spans by hand, with almost perfect agreement
• Accuracy measures for informer tagging:
  • Exact match score: predicted informer token set exactly equals the true set
  • Jaccard score = |X ∩ Y| / |X ∪ Y|, where X is the predicted informer token set and Y the true set
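For concreteness, the two informer-tagging metrics as a short sketch:

```python
def exact_match(pred: set, true: set) -> bool:
    """Exact match: predicted informer token set equals the true set."""
    return pred == true

def jaccard(pred: set, true: set) -> float:
    """Jaccard score |X ∩ Y| / |X ∪ Y| between predicted and true sets."""
    if not pred and not true:
        return 1.0
    return len(pred & true) / len(pred | true)

print(jaccard({"largest", "producer"}, {"producer"}))  # 0.5
```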
Contributions of CRF features
[Ablation table: multilevel POS tags; known vs. predicted offset-within-chunk type; neighbor tags; Markov transitions]
• IsNum gives a soft bias toward earlier NPs
• Neighbor tags tune to VBZ, IN, POS(sessives)
• IsEdge makes a big difference
• Modeling the Markov state transition is essential
Heuristic baseline
Effective mapping heuristics often used in QA systems:
• For What, Which, and Name questions, use the head of the NP adjoining the wh-word
• For How questions, tag how and the subsequent word
• For other questions (When, Where, Who, etc.), choose the wh-word
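A minimal sketch of this baseline; extracting the NP head is simplified here to "first noun after the wh-word" rather than a real parse, so treat it as an approximation of the heuristic, not the exact rule:

```python
def heuristic_informer(tokens, pos_tags):
    """Heuristic informer tagging: NP head for what/which/name,
    'how X' for how-questions, otherwise the wh-word itself."""
    cue = tokens[0].lower()
    if cue in {"what", "which", "name"}:
        # Approximate the head of the adjoining NP by the first noun.
        for tok, pos in zip(tokens[1:], pos_tags[1:]):
            if pos.startswith("NN"):
                return [tok]
        return [tokens[0]]
    if cue == "how" and len(tokens) > 1:
        return tokens[:2]              # e.g. ["how", "tall"]
    return [tokens[0]]                 # when / where / who / ...

print(heuristic_informer(["How", "tall", "is", "Tom", "Cruise"],
                         ["WRB", "JJ", "VBZ", "NNP", "NNP"]))
```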
Breakup by question cue
[Bar chart: Jaccard (%) per question cue]
• Major improvement for what and who questions, which have diverse atypes
• Heuristic informers are not nearly as good
• The 3-state CRF is much better than the 2-state CRF
Robustness to wrong parses
Our learning approach is much more robust to slightly incorrect sentence parses. For example, the parser output
(X (X (WP What)) (NP (NP (NN passage)) (SBAR (S (VP (VBZ has) (NP (DT the) (CD Ten) (NNS Commandments)))))))
should instead be
((WHNP (WH What) (NN Passage)) … )
Part 2: Mapping informers to atypes
[Pipeline diagram, as before, now focused on stage 2: the informer span identifier (CRF) feeds the atype classifier (SVM); "how tall" maps to NUMBER:distance / hasDigits, and the typed IR system returns short-listed candidate passages]
Learning SVMs with informer features
• Choose q-grams from each "field", "named apart" (i.e., informer q-grams get feature names distinct from other question q-grams)
• Also add WordNet hypernyms of informers
  • Map scientist/president/CEO/… to the feature person#n#1
  • The target HUMAN:individual is coarse-grained
  • The target is better correlated with generalizations of the informer
• (A feature-extraction sketch follows)
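A sketch of the feature extraction using NLTK's WordNet interface (a stand-in for the paper's setup, with the field-prefix naming and first-sense restriction as assumptions): informer bigrams are "named apart" from ordinary question bigrams by a field prefix, and informer nouns contribute their WordNet hypernyms.

```python
# pip install nltk; then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

def question_features(tokens, informer):
    """Bigram features named apart by field, plus WordNet hypernyms
    of informer tokens (e.g. president -> person.n.01)."""
    feats = set()
    for i in range(len(tokens) - 1):
        field = "INF" if tokens[i] in informer else "Q"
        feats.add(f"{field}:bigram:{tokens[i]}_{tokens[i+1]}")
    for tok in informer:
        for syn in wn.synsets(tok, pos=wn.NOUN)[:1]:   # first sense only
            for hyper in syn.closure(lambda s: s.hypernyms()):
                feats.add(f"INF:hyper:{hyper.name()}")
    return feats

feats = question_features(
    ["what", "is", "bill", "clinton", "s", "wife", "s", "profession"],
    informer={"profession"})
print(sorted(f for f in feats if "hyper" in f))  # occupation-like hypernyms
```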
SVM meta-learning results
• A bigram linear SVM is close to the best reported so far
• Small gains from parse-tree kernels
• If human-annotated informers are used by the SVM, we beat all existing numbers by a large margin
• Even with some errors committed by the CRF, we retain most of the benefit
Linear SVM feature ablation
• Bigrams beat all other q-grams
• Both informer bigrams and hypernyms help
• Good to hedge bets by retaining ordinary bigrams
• Hypernyms of all question tokens do not help at all
Atype accuracy (%) by question cue
• What and which questions show the biggest improvement
• Heuristic informers are less effective
Observations
• We retain most of the benefit of "perfect" informers
• Significantly more accurate atype classification
• Heuristic informers yield relatively small gains
• We frequently fix errors in what/which questions, whose answer types are harder to infer
• Hypernyms of informer tokens help, but hypernyms of all question tokens don't
• Therefore, the notion of a minimal informer span is important and non-trivial
Summary and ongoing work
• Informers and atypes are important and non-trivial aspects of factoid questions
• Simple, clean model for exploiting question syntax and sequential dependencies
• CRF + SVM meta-learner → high-accuracy informer and atype prediction
• Can map informers directly to the WordNet noun hierarchy, improving precision further
• Can supplement WordNet with KnowItAll-style compilations, improving recall further