The goal of a question answering (QA) system is to give precise answers to questions formulated in natural language. This differs from the more widely known search engines, such as Google, which retrieve documents based on a set of keywords.
Answering Portuguese Questions
Luís Fernando Costa & Luís Miguel Cabral
{Luis.costa, Luis.M.Cabral}@sintef.no
Linguateca / SINTEF ICT, PB 124, Blindern, NO-0314 Oslo, Norway
http://www.linguateca.pt

• A general-domain QA system is not specially tuned or prepared to answer questions in a particular domain or subject.

Esfinge
• General-domain Portuguese QA system
• Uses information redundancy to retrieve documents (CHAVE collection, Wikipedia and the Web)
• Anaphor resolution using PALAVRAS [Bick, 2000]
• Multiple question generation, multiple answers
• Experiments with several types of search patterns
• The named entity recognizer SIEMÊS [Sarmento, 2006] is used to retrieve candidate answers to questions that imply answers of particular NE types
• Web interface and the source code of some of the system's modules are available at http://www.linguateca.pt/Esfinge/

Question Answering Architecture
[Figure: Esfinge architecture diagram. Labels: Question; Anaphor Resolution; Question Reformulation; Iterate over all the alternative questions; Search Doc Collections (News, Wikipedia); Search Web; NER; N-Grams; Database of Co-occurrences; Filters; Search Supporting Documents; Choice of longer answers; Answer Selection; Answer; Answer(s); Answer=NIL]

CLEF Results
• The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access.
• Esfinge has participated in CLEF since 2004.
• Errors at CLEF 2007:
  • Wrong or incomplete search patterns (63/165 wrong answers)
  • Document retrieval failure (33/165 wrong answers)
  • Missing patterns to identify the type of answer
  • Search in Wikipedia

Experiments
• More complete search patterns (added noun phrases)
• Remove the verbs from the search
• Combine two types of search patterns simultaneously
• Example: "Que país declarou a independência em 1291?" ("Which country declared independence in 1291?")
• Predefined text patterns:
  • "declarou a independência em 1291" país / 20
  • país declarou a independência em 1291 / 1
• Patterns generated using PALAVRAS:
  • declarou; a independência em 1291; país
  • a independência em 1291; país (without verbs)

Test set 1: 200 questions from QA@CLEF 2007 for PT-PT
Table 1. Results of the experiments (F: 170 factoid questions; D: 30 definition questions)
Table 2. Causes of wrong answers in the best run

Test set 2: 200 questions from QA@CLEF 2008 for PT-PT
Table 3. Results of the experiments (F: 171 factoid questions; D: 29 definition questions)

Conclusions
• Using patterns without verbs as a backup strategy yielded better results with both the 2007 and the 2008 QA@CLEF questions, but only for factoid questions.
• The benefits of combining two types of search patterns were not confirmed by the experiment with the 2008 questions.
• Errors moved to a later stage in the system's execution.

References
Bick, E.: The Parsing System "Palavras": Automatic Grammatical Analysis of Portuguese in a Constraint Grammar Framework. Aarhus: Aarhus University Press (2000)
Sarmento, L.: SIEMÊS - a named entity recognizer for Portuguese relying on similarity rules. In: 7th Workshop on Computational Processing of Written and Spoken Language (PROPOR'2006), Itatiaia, RJ, Brasil, 13-17 May 2006. Springer, pp. 90-99.

This work was done in the scope of Linguateca, contract no. 339/1.3/C/NAC, a project jointly funded by the Portuguese Government and the European Union.
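
The idea of using verb-free search patterns as a backup strategy can be illustrated with a minimal Python sketch. This is not Esfinge's implementation: the `Chunk` type, `chunk_pattern`, `search_with_backup` and `toy_search` are hypothetical names, the chunks are written by hand rather than produced by PALAVRAS, and the trigger for the backup (the full pattern retrieves no documents) is an assumption, since the poster does not say when the backup is applied.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Chunk:
    """A question constituent; in a real system `is_verb` would come from a parser."""
    text: str
    is_verb: bool = False


def chunk_pattern(chunks: Sequence[Chunk], keep_verbs: bool = True) -> List[str]:
    """Return the chunk texts used as search terms, optionally dropping verbs."""
    return [c.text for c in chunks if keep_verbs or not c.is_verb]


def search_with_backup(
    chunks: Sequence[Chunk],
    search: Callable[[List[str]], List[str]],
) -> List[str]:
    """Search with the full pattern first; if nothing is retrieved (assumed
    trigger), retry with the verb-free pattern."""
    docs = search(chunk_pattern(chunks, keep_verbs=True))
    if docs:
        return docs
    return search(chunk_pattern(chunks, keep_verbs=False))


# Hand-written chunks for the example question
# "Que país declarou a independência em 1291?"
EXAMPLE = [
    Chunk("declarou", is_verb=True),
    Chunk("a independência em 1291"),
    Chunk("país"),
]

# Toy one-document collection, invented for illustration only.
TOY_COLLECTION = ["O país celebra a independência em 1291."]


def toy_search(terms: List[str]) -> List[str]:
    """Hypothetical retrieval: a document matches if it contains every term."""
    return [
        doc for doc in TOY_COLLECTION
        if all(term.lower() in doc.lower() for term in terms)
    ]


if __name__ == "__main__":
    print(chunk_pattern(EXAMPLE))                    # pattern with verbs
    print(chunk_pattern(EXAMPLE, keep_verbs=False))  # pattern without verbs
    print(search_with_backup(EXAMPLE, toy_search))
```

In this toy collection the full pattern fails because the document does not contain the verb "declarou", while the verb-free pattern still matches, which is the kind of case the backup strategy is meant to recover.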