French Question Answering in Technical and Open Domains
Aoife O’Gorman
Documents and Linguistic Technology Group
University of Limerick
Outline
• Information Retrieval vs Question Answering
• Monolingual Question Answering
• Cross-Lingual Question Answering
• Problems and Possible Solutions
• Research Objectives
Information Retrieval vs Question Answering
• Information Retrieval (IR): “responding to a user’s need for information by retrieving a small number of documents within which the relevant information is to be found.” [van Rijsbergen, 1999]
• (Factoid) Question Answering (QA): often the user wants not whole documents but brief answers to specific questions.
E.g.:
Q: When was the storming of the Bastille?
A: July 14, 1789
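The difference between the two tasks can be illustrated with a toy example. This is only a sketch under stated assumptions: the three-sentence collection, the stopword list, and the function names `retrieve` and `answer_when` are hypothetical and are not part of any TREC system. The IR step returns whole documents; the QA step goes one step further and pulls a short date string out of them.

```python
import re

# Toy collection; the sentences are illustrative, not taken from a TREC corpus.
DOCS = {
    "d1": "The storming of the Bastille took place on July 14, 1789 in Paris.",
    "d2": "The Bastille was a fortress in Paris used as a state prison.",
    "d3": "Alaska is the largest state of the United States by area.",
}

STOPWORDS = {"when", "was", "the", "of", "a", "in", "on", "is"}

# Matches dates written like "July 14, 1789".
DATE = re.compile(
    r"(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}"
)


def tokens(text):
    """Lower-cased content words of a text, as a set."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def retrieve(question, docs, k=2):
    """IR: return the ids of the k documents sharing most words with the question."""
    q = tokens(question)
    ranked = sorted(docs, key=lambda d: len(q & tokens(docs[d])), reverse=True)
    return ranked[:k]


def answer_when(question, docs):
    """QA: return the first date found in the retrieved documents, or None."""
    for doc_id in retrieve(question, docs):
        match = DATE.search(docs[doc_id])
        if match:
            return match.group(0)
    return None


question = "When was the storming of the Bastille?"
print(retrieve(question, DOCS))      # document ids (IR output)
print(answer_when(question, DOCS))   # short answer string (QA output)
```

The IR call leaves the user to read the returned documents, while the QA call returns only the answer string.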
Monolingual Question Answering
• Text REtrieval Conference (TREC) - Q&A Track
• DLT TREC System - English → English
• FYP system - French → French
Examples
English → English
Q: What year did Alaska become a state? [TREC 2002]
A: January 3, 1959
Q: How did Einstein die? [TREC 2003]
A: Ruptured abdominal aortic aneurysms
French → French
Q: Quand Mike Tyson a-t-il mordu l’oreille de Holyfield?
A: le 28 juin, 1997
Q: Quelle est la capitale de l’Algérie?
A: Alger
[TREC 2002 queries translated by Caroline Corsini for the FYP project]
Cross-Lingual Question Answering
• Cross-Language Evaluation Forum (CLEF)
• European equivalent to TREC
• Multilingual IR tasks including French-English QA
Q: Combien d'Oscars le film "Sur les quais" a-t-il remportés?
A: eight
Q: Quelle est la capitale de la Tchétchénie? [CLEF 2003]
A: Grozny
CLEF System Architecture
• Query classification
• Query translation (Google) & re-formulation
• Named entity recognition
• Text retrieval (dtSearch)
• Answer entity selection
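The stages above can be sketched as a chain of small functions. This is a minimal illustration, not the DLT system: every function name is hypothetical, the translation step is a hand-written lookup standing in for Google, the retrieval step is a keyword filter standing in for dtSearch, and the named entity step is a crude capitalised-word pattern. The slide does not say which text the named entity stage runs over; the sketch assumes it runs over the retrieved English text.

```python
import re

# Stand-in for machine translation: a tiny hand-written lookup (assumption).
TOY_TRANSLATIONS = {
    "Quelle est la capitale de la Tchétchénie?": "What is the capital of Chechnya?",
}

# Toy English collection (assumption), one sentence per "document".
DOCS = [
    "Grozny is the capital of Chechnya.",
    "Algiers is the capital of Algeria.",
]

STOPWORDS = {"what", "is", "the", "of", "a"}


def classify(question):
    """Query classification: guess the expected answer type from the French query."""
    q = question.lower()
    if q.startswith("quand"):
        return "DATE"
    if q.startswith("combien"):
        return "NUMBER"
    if "capitale" in q:
        return "LOCATION"
    return "OTHER"


def translate(question):
    """Query translation: look the query up, falling back to the input unchanged."""
    return TOY_TRANSLATIONS.get(question, question)


def reformulate(english):
    """Re-formulation: reduce the translated query to content keywords."""
    return [w for w in re.findall(r"\w+", english.lower()) if w not in STOPWORDS]


def retrieve(keywords, docs):
    """Text retrieval: keep documents containing every keyword."""
    return [d for d in docs if all(k in d.lower() for k in keywords)]


def named_entities(text):
    """Named entity recognition: capitalised words only (very crude)."""
    return re.findall(r"\b[A-Z][a-z]+\b", text)


def answer(question, docs):
    """Answer entity selection: first entity in a hit that is not a query keyword."""
    answer_type = classify(question)
    keywords = reformulate(translate(question))
    for hit in retrieve(keywords, docs):
        for entity in named_entities(hit):
            if entity.lower() not in keywords:
                return answer_type, entity
    return answer_type, None


print(answer("Quelle est la capitale de la Tchétchénie?", DOCS))
```

A real system would also use the answer type to filter candidate entities; the sketch only carries it along to show where classification feeds in.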
Problems identified in CLEF
• Verbs and their arguments are idiomatically linked
faire la pêche → make a fish
• Sometimes proper names should be translated, sometimes not
Tchétchénie → Chechnya
Grand Prix → Grand Prize
• Titles
La Belle et la Bête → The beautiful one and the animal
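One way to keep machine translation away from names and titles like those above is to mask known terms before translation and restore a fixed English equivalent afterwards. The following is a minimal sketch, assuming a hand-built glossary. `GLOSSARY`, `mask_terms` and `unmask_terms` are hypothetical names, and the entry "Beauty and the Beast" is the conventional English title supplied here for illustration, not something stated in the slides. The sketch also assumes the translation engine passes placeholders through unchanged, which is not guaranteed.

```python
# Hypothetical glossary: French term -> English form to use in the query.
# "Grand Prix" maps to itself because it should NOT be translated.
GLOSSARY = {
    "Tchétchénie": "Chechnya",
    "Grand Prix": "Grand Prix",
    "La Belle et la Bête": "Beauty and the Beast",
}


def mask_terms(text, glossary):
    """Replace each glossary term found in text with a placeholder.

    Longer terms are tried first so a multi-word title is masked as a unit.
    Returns the masked text and a placeholder -> English mapping.
    """
    slots = {}
    for i, term in enumerate(sorted(glossary, key=len, reverse=True)):
        if term in text:
            placeholder = f"__TERM{i}__"
            text = text.replace(term, placeholder)
            slots[placeholder] = glossary[term]
    return text, slots


def unmask_terms(text, slots):
    """Put the fixed English equivalents back after translation."""
    for placeholder, english in slots.items():
        text = text.replace(placeholder, english)
    return text


masked, slots = mask_terms("Quelle est la capitale de la Tchétchénie?", GLOSSARY)
# The masked query would be sent to the translation engine here.
# Below, a hand-written string stands in for the engine's output.
placeholder = next(iter(slots))
translated = f"What is the capital of {placeholder}?"
print(unmask_terms(translated, slots))
```

The design choice is to decide per term, in the glossary, whether a name is translated or kept, so the translation engine never has to make that decision.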
Problems identified in CLEF
• Slang words in source language that may not exist in target language
E.g.: “carjacking” / “actes de piraterie routière”
• Anaphoric references
E.g.: “Einstein died of one…..” (ruptured abdominal aortic aneurysm)
Possible Solution
• Predictive Annotation:
• document collection is analysed
• output = dictionary containing the collection vocabulary
• proper names and multi-word terms
E.g.: the sculpture “Chicken Boy” (sculpture de « garçon à tête de poulet »)
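The idea of analysing the collection ahead of time can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how terms are detected, so the sketch uses a simple pattern for runs of capitalised words, and the two sample sentences and the name `build_term_dictionary` are hypothetical.

```python
import re
from collections import Counter

# A run of two or more capitalised words, e.g. "Chicken Boy".
# Assumption: this pattern stands in for real proper-name and term detection.
NAME = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def build_term_dictionary(documents, min_count=1):
    """Scan the collection once and return {multi-word term: frequency}."""
    counts = Counter()
    for doc in documents:
        counts.update(NAME.findall(doc))
    return {term: n for term, n in counts.items() if n >= min_count}


# Illustrative English documents (not from the CLEF collection).
documents = [
    "An article about the sculpture Chicken Boy.",
    "Chicken Boy is mentioned again here.",
]

terms = build_term_dictionary(documents)
print(terms)

# At query time, a candidate translation can be checked against the dictionary:
# a term that occurs in the collection is a more plausible target than a
# word-by-word rendering that never occurs.
print("Chicken Boy" in terms)
```

Because the dictionary is built from the target collection itself, any term it contains is guaranteed to be retrievable from that collection.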
Research Objectives
• Focus on translation in Cross-Lingual QA
• Identify French names and titles in queries
• Combine data from Corpora, Dictionaries and Web
• Predict likely English equivalents
• Evaluate in context of CLEF competition
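Combining evidence from several sources to predict an English equivalent could, for instance, be done by weighted voting. The sketch below is one possible scheme, not a method described in the slides: the source weights, the per-candidate scores, the candidate "Beauty and the Beast", and the name `rank_equivalents` are all illustrative assumptions.

```python
from collections import defaultdict

# Assumed relative trust in each source; these values are made up.
WEIGHTS = {"dictionary": 0.3, "corpus": 0.4, "web": 0.3}


def rank_equivalents(evidence, weights=WEIGHTS):
    """Rank candidate translations by weighted sum of per-source scores.

    evidence maps source name -> {candidate: score between 0 and 1}.
    Sources without a weight contribute nothing.
    """
    totals = defaultdict(float)
    for source, candidates in evidence.items():
        for candidate, score in candidates.items():
            totals[candidate] += weights.get(source, 0.0) * score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for the title "La Belle et la Bête".
evidence = {
    "dictionary": {"The beautiful one and the animal": 1.0},
    "corpus": {"Beauty and the Beast": 0.9},
    "web": {"Beauty and the Beast": 0.8, "The beautiful one and the animal": 0.1},
}

for candidate, score in rank_equivalents(evidence):
    print(f"{score:.2f}  {candidate}")
```

With these assumed numbers the established title outranks the word-by-word rendering, because two sources support it even though the dictionary alone does not.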