Question Answering From Zero to Hero

Question Answering: From Zero to Hero • Elena Eneva • 11 Oct 2001 • Advanced IR Seminar

Sources • TREC-9. 2001. • http://la.lti.cs.cmu.edu/Javelin • [V] E. Voorhees. "The Overview of the TREC-9 Question Answering track." • [P] J. Prager, E. Brown, A. Coden and D. Radev. "Question answering by predictive annotation." SIGIR '00. • [C] C.L.A. Clarke, G.V. Cormack and T.R. Lynam. "Exploiting redundancy in question answering." In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2001. • The papers are cited below as V, P and C.

Question Answering • IR: successful in large-scale text search problems; retrieves full documents • IE: successful in extracting very precise answers from text; works on pre-specified domains • QA: combining the strengths of IR and IE

QA track in TREC • Collection of unstructured documents (Table 1 in V) • Short factual questions in English, e.g. "Why can't ostriches fly?" and "Where did Bill Gates go to college?" (see also Figure 1 in V) • Return the answer as a ranked list of 5 document fragments (2 categories: 50 and 250 bytes)

Evaluation • Judged by people • Reciprocal rank of the first correct answer, or 0 if none of the 5 fragments is correct • % answers which were found • Strict and lenient scores (supported and unsupported judgments) • Short and long versions
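The reciprocal-rank measure can be illustrated with a minimal sketch. The function names and the input layout (one list of correct/incorrect judgments per question, in rank order) are assumptions made for this example and do not come from the TREC evaluation software. Averaging over questions gives the mean reciprocal rank.

```python
def reciprocal_rank(judgments, max_answers=5):
    """Return 1/rank of the first correct answer, or 0.0 if none is correct.

    judgments: booleans in rank order (True = judged correct).
    Only the first max_answers entries are considered.
    """
    for rank, correct in enumerate(judgments[:max_answers], start=1):
        if correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(all_judgments):
    """Average the per-question reciprocal ranks over a set of questions."""
    if not all_judgments:
        return 0.0
    return sum(reciprocal_rank(j) for j in all_judgments) / len(all_judgments)
```

For instance, a question whose first correct fragment is at rank 2 contributes 0.5, and a question with no correct fragment among the 5 contributes 0.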

2 QA TREC systems • Question Answering by Predictive Annotation - Prager, Brown, Coden (IBM) and Radev (U of Michigan) • Exploiting Redundancy in Question Answering - Clarke, Cormack, Lynam (U of Waterloo) • Ranking - Table 2 in V

Exploiting Redundancy in Question Answering • Figure 1 in C • Question -> a query for submission to a passage retrieval component • Question -> a set of selection rules that guide the process of extracting answers from the passages (answer category) • Get a list of k passages • Identify possible answers • Rank the possible answers • Question analysis – IR – IE

3 features with greatest contribution • Flexibility of the parser • Passage retrieval technique (high quality passages) • Redundancy in the answer selection component – contribution of evidence from multiple passages to identify the most likely answer

Passage Retrieval techniques • Each document D is an ordered sequence of terms D = d1 d2 d3 … dm • Extent (u, v): the span of terms from du to dv (minimal) • Query Q generated from the question, Q = {q1, q2, q3, …} • Compute the score for each extent (u, v) that is a cover for a term set T ⊆ Q • Higher scores go to passages whose probability of occurrence is lower
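As a rough illustration of the extent idea, the sketch below finds every minimal extent that contains all terms of a given term set T. It reads "minimal" as "no shorter extent inside (u, v) also contains every term of T", which is an interpretation of the slide. Positions are 0-based here rather than the 1-based d1 … dm above, and the function name is hypothetical. The scoring step is not shown, since the slide does not give the formula.

```python
def minimal_covers(doc_terms, term_set):
    """Return all minimal extents (u, v) that contain every term in term_set.

    doc_terms: the document as an ordered list of terms.
    term_set:  the set T of query terms to be covered.
    Positions are 0-based and inclusive.
    """
    last_seen = {}   # most recent position of each term of T
    covers = []
    prev_start = -1  # start of the most recently emitted cover
    for v, term in enumerate(doc_terms):
        if term not in term_set:
            continue
        last_seen[term] = v
        if len(last_seen) < len(term_set):
            continue  # some term of T has not occurred yet
        # The shortest extent ending at v starts at the oldest "last seen" position.
        u = min(last_seen.values())
        # A repeated start means a shorter cover with the same start was already emitted.
        if u > prev_start:
            covers.append((u, v))
            prev_start = u
    return covers
```

With the document "bill gates went to harvard and gates left harvard" and T = {gates, harvard}, the sketch yields the extents (1, 4), (4, 6) and (6, 8).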

Redundancy • Each candidate term t is assigned a weight that takes into account the number of distinct passages in which the term appears, as well as the relative frequency of the term in the database • Wt = Ct log (N/ft) • Ct is the number of distinct passages in which t appears • Score a candidate answer by summing the weights of all its terms • Select the top-scoring candidate, reduce the weights of its terms to 0, and repeat until 5 answers are selected • Figure 2 in C
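The weighting and selection loop can be sketched as follows. This is a minimal illustration rather than the Waterloo implementation. It assumes N is the size of the corpus in terms and ft is the number of occurrences of t in the corpus, which the slide does not spell out. The function names, the representation of candidates as tuples of terms, and the requirement that every passage term has a nonzero entry in corpus_freq are also assumptions.

```python
import math


def term_weights(passages, corpus_freq, corpus_size):
    """Compute Wt = Ct * log(N / ft) for every term in the retrieved passages.

    passages:    list of passages, each a list of terms.
    corpus_freq: dict term -> ft (assumed: occurrences of t in the corpus).
    corpus_size: N (assumed: total number of terms in the corpus).
    """
    passage_count = {}  # Ct: number of distinct passages containing t
    for passage in passages:
        for term in set(passage):
            passage_count[term] = passage_count.get(term, 0) + 1
    return {
        term: count * math.log(corpus_size / corpus_freq[term])
        for term, count in passage_count.items()
    }


def select_answers(candidates, weights, k=5):
    """Greedily pick k candidates, zeroing the weights of terms already used."""
    weights = dict(weights)  # work on a copy so the caller's weights survive
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda cand: sum(weights.get(t, 0.0) for t in cand))
        selected.append(best)
        remaining.remove(best)
        for term in best:
            weights[term] = 0.0  # later candidates get no credit for these terms
    return selected
```

Zeroing the weights after each pick keeps near-duplicate answers from filling all five slots: a candidate that only repeats terms from an earlier pick scores 0 on the next round.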

Exploiting redundancy • “Who” questions • 100 GB corpus • K depth, W width • Figure 2 in C

Who wants to be a Millionaire? • Real life example • 70% correct overall • Figure 5 in C

Question answering by predictive annotation • IBM system • Shallow NLP • System structure Figure 1 in P • Annotation • Indexing
