Cross-Language French-English Question Answering using the DLT System at CLEF 2003 Aoife O’Gorman Igal Gabbay Richard F.E. Sutcliffe Documents and Linguistic Technology Group University of Limerick
Outline • Objectives • System architecture • Key components • Task performance evaluation • Findings
Objectives • Learn the issues involved in multilingual QA • Combine the components of our existing English and French monolingual QA systems
System architecture Query classification Query translation (Google) & re-formulation Named entity recognition Text retrieval (dtSearch) Answer entity selection
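A minimal sketch of how these five stages might be chained, written in Python purely for illustration (the actual DLT components are not Python); every callable passed in is a hypothetical stand-in for the corresponding module:

```python
# Illustrative wiring of the five pipeline stages listed above. Each callable
# is a hypothetical stand-in for the corresponding DLT component.
def answer_question(french_query, classify, translate, retrieve, recognise, select):
    category = classify(french_query)            # query classification
    search_terms = translate(french_query)       # Google translation + re-formulation
    passages = retrieve(search_terms)            # dtSearch text retrieval
    candidates = recognise(passages, category)   # named entity recognition
    return select(candidates, search_terms)      # answer entity selection
```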
Query classification • Categories based on translated TREC 2002 queries • Keyword-based classification • Example category: what_country • De quel pays le jeu de croquet est-il originaire ? (From which country does the game of croquet originate?) • De quelle nation... ? (From which nation...?) • Queries that match no category are classified as Unknown
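As a rough Python illustration of this kind of keyword-based classification (the patterns, category list, and classify_query name below are assumptions, not the actual DLT rules):

```python
import re

# Hypothetical keyword rules: a French surface pattern maps the query to a
# category; anything that matches no rule falls back to the Unknown category.
CATEGORY_PATTERNS = [
    ("what_country", re.compile(r"\bde quel(le)? (pays|nation)\b", re.IGNORECASE)),
    ("who",          re.compile(r"^\s*qui\b", re.IGNORECASE)),
    ("when",         re.compile(r"^\s*quand\b", re.IGNORECASE)),
]

def classify_query(query):
    for category, pattern in CATEGORY_PATTERNS:
        if pattern.search(query):
            return category
    return "unknown"

print(classify_query("De quel pays le jeu de croquet est-il originaire ?"))  # what_country
```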
Query translation and re-formulation • Submitting the French query in its original form on the Google Language Tools page • Tokenisation • Selective removal of stopwords • Example: • Qui a été élu gouverneur de la California? • Who was elected governor of California? • [ ‘elected’, ‘governor’, ‘California’]
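A small Python sketch of the re-formulation step; the English translation is assumed to have been obtained already from the Google Language Tools page, and the stopword list below is illustrative rather than the actual DLT list:

```python
# Assumed stopword list; the real system's selective list may differ.
STOPWORDS = {"who", "was", "the", "of", "a", "an", "in", "on", "to", "did", "what"}

def reformulate(translated_query):
    tokens = translated_query.replace("?", " ").split()        # tokenisation
    return [t for t in tokens if t.lower() not in STOPWORDS]   # selective stopword removal

print(reformulate("Who was elected governor of California?"))
# ['elected', 'governor', 'California']
```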
Text Retrieval: Submitting queries to dtSearch • dtSearch indexed the document collection based on <DOC> tags • Inserting a w/1 connector between two capitalised words • Submitting untranslated quotations for exact match • Inserting an AND connector between all other terms (Boolean) • Limited verb expansion based on common verbs used in TREC questions
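The connector rules can be sketched as follows; the w/1 and AND connectors come from the slide, while the assembly function itself (build_boolean_query) and its handling of edge cases are assumptions:

```python
def build_boolean_query(terms):
    """Join adjacent capitalised terms with w/1 and everything else with AND."""
    parts = []
    i = 0
    while i < len(terms):
        if terms[i][0].isupper():
            # Collect a run of adjacent capitalised words and link them with w/1.
            run = [terms[i]]
            while i + 1 < len(terms) and terms[i + 1][0].isupper():
                i += 1
                run.append(terms[i])
            parts.append(" w/1 ".join(run))
        else:
            parts.append(terms[i])
        i += 1
    return " AND ".join(parts)

print(build_boolean_query(["elected", "governor", "California"]))
# elected AND governor AND California
print(build_boolean_query(["Robert", "Frost", "born"]))
# Robert w/1 Frost AND born
```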
Named Entity Recognition: General Names • Captures any instance of a general name in cases where we are not sure what to look for • A general_name is defined in our system as up to five capitalised terms interspersed with optional prepositions • Examples: Limerick City • University of Limerick
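One way to approximate the general_name pattern in Python (the exact grammar of the DLT recogniser is not given on the slide, so the regular expression and preposition list below are a hedged approximation):

```python
import re

# Up to five capitalised terms, optionally joined by a short preposition;
# the preposition list is an assumption.
GENERAL_NAME = re.compile(
    r"\b[A-Z][a-z]+(?:\s+(?:of|de|du|la|the)\s+[A-Z][a-z]+|\s+[A-Z][a-z]+){0,4}\b"
)

text = "A talk given at the University of Limerick, just outside Limerick City."
print(GENERAL_NAME.findall(text))
# ['University of Limerick', 'Limerick City']
```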
Answer entity selection • highest_scoring • What year was Robert Frost born? • in entity(date,[1,8,7,5],[[],[],[],[],[1,8,7,5]],[],[],[]), poet target([Robert]) target([Frost]) was target([born]) in San Francisco • most_frequent • When did “The Simpsons” first appear on television? • When target([The]) target([Simpsons]) was target([first]) broadcast in entity(date,[1,9,8,9],[[],[],[],[],[],[1,9,8,9],[],[]])
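A brief Python sketch of the most_frequent strategy (the candidate representation and function name are assumptions; highest_scoring would instead rank candidates by a passage score):

```python
from collections import Counter

def most_frequent(candidates):
    """Pick the candidate value that occurs most often across retrieved passages.

    candidates: list of (entity_type, value) pairs of the expected answer type.
    """
    counts = Counter(value for _, value in candidates)
    value, _ = counts.most_common(1)[0]
    return value

print(most_frequent([("date", "1989"), ("date", "1990"), ("date", "1989")]))  # 1989
```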
Task performance evaluation Adapted from Magnini (2003)
Findings • Query classification: unexpected formulations of queries; too few categories • Translation: problems with names and titles • Better query-specific translation is needed • Localisation of names and titles • Possibly limit translation to the search terms
Findings • Text retrieval: allow relaxation and more sophisticated expansion of search queries • Named entity recognition: find better alternatives to answer questions of type Unknown • Answer entity selection: take into account distance and density of query terms • Usability issue: answers may need to be translated back to French