Question Answering Technologies. Lyubomyr Havrylyuk. University of Konstanz. Feb 07,2011. Outline. QA Systems Historical Introduction. Modern approaches. Question focus recognition. Intelligent numerical answers generation. Answer recognition and extraction.
Question Answering Technologies
Lyubomyr Havrylyuk
University of Konstanz
Feb 07, 2011
Outline
• QA Systems Historical Introduction
• Modern approaches
• Question focus recognition
• Intelligent numerical answers generation
• Answer recognition and extraction
• Performance issues and error analysis
• Conclusions
Question Answering systems Information Retrieval - Konstanz Uni
QA systems historical review
A Question Answering (QA) system is a system that automatically answers questions posed in natural language. QA systems originated in the early 1960s as systems for answering questions about a particular domain of knowledge. By 1965, a first generation of fifteen experimental question-answering systems had already been reviewed. These included a social-conversation machine, systems that translated from English into a limited logical calculus, and programs that attempted to answer questions from English text.
"First generation"
The two most famous QA systems of that time:
• BASEBALL: answered questions about the US baseball league over a period of one year.
• LUNAR: answered questions about the geological analysis of rocks returned by the Apollo moon missions.
The common feature of these systems is that they had a core database or knowledge system hand-written by experts in the chosen domain.
First generation systems were often handicapped by:
• the lack of adequate linguistic models
• being written in low-level languages such as FAP and IPL
• frequent errors, because the systems did not store previously acquired information, which also made them hard to update
"Second generation" QA systems
Second generation QA systems were:
• programmed in higher-level languages (e.g. Lisp, SNOBOL, ALGOL)
• easier to update, thanks to limited features for remembering
• fact-retrieval systems that generalized several earlier approaches over previously mentioned topics and facts
DEDUctive COMmunicator (DEDUCOM) example
Data statements:
• There are 5 fingers on a hand
• There is one hand on an arm
• There are 2 arms on a man
Inference rules are conditional statements with variables:
1. If there are m X's on a V and if there are n V's on a Y, then there are m*n X's on a Y.
Question: How many fingers are on a man?
Answer: 10 (5 * 1 * 2, by applying the rule twice)
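The DEDUCOM inference rule above can be sketched in a few lines of Python. This is only an illustration of the m*n rule, not a reconstruction of the original program; the `facts` dictionary layout and the `count_on` helper are assumptions made for this example.

```python
# Facts of the form "there are n X's on a Y", stored as (part, whole) -> count.
# The dictionary layout is an assumption made for this sketch.
facts = {
    ("finger", "hand"): 5,
    ("hand", "arm"): 1,
    ("arm", "man"): 2,
}


def count_on(part, whole, facts, seen=frozenset()):
    """Return how many `part`s are on a `whole`, or None if not derivable.

    Applies the rule: m X's on a V and n V's on a Y give m*n X's on a Y.
    """
    if (part, whole) in facts:
        return facts[(part, whole)]
    for (x, v), m in facts.items():
        # Follow a known fact from `part` to an intermediate object `v`,
        # skipping intermediates already visited to avoid cycles.
        if x == part and v not in seen:
            n = count_on(v, whole, facts, seen | {v})
            if n is not None:
                return m * n
    return None


print(count_on("finger", "man", facts))
```

Chaining the rule from finger to hand to arm to man multiplies 5, 1 and 2, which reproduces the answer on the slide.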
Modern QA Systems
Interest in QA is currently growing due to:
- the popularity of Internet QA services (e.g. Ask.com, TrueKnowledge, EAGLi)
- the recent evaluations of domain-independent QA systems organized in the context of the Text REtrieval Conference (TREC)
TREC restrictions:
1. At least one document in the test collection contains the answer to each test question
2. Answer length is limited (e.g. 250 bytes)
QA online systems examples
- QA online service, with a list of relevant online answers
- QA online system, with the answer in an excerpt from an online document
Finding answer
To find the answer to a question, several steps must be taken:
• the question semantics need to be captured by identifying:
  - the expected answer type
  - the question keywords
• the index of the document collection must be used
• the answer must be extracted
Question representation
Possible issues:
- establishing the possible answer type, e.g. PERSON, LOCATION, TIME, ORGANISATION, DATE, MONEY, NUMBER
- finding interdependencies between question keywords
In the example, the answer type is the object of the verb visit, which is defined by the semantic category LANDMARK. The answer type replaces the question stem.
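As a rough illustration of answer type detection, the sketch below maps a question stem to one of the answer types listed above. The rule table `STEM_RULES` and the function name are hypothetical and far simpler than what a real system would use; the point is that stem rules alone cannot type a "What" question, which is why the dependency between the stem and the verb is needed.

```python
import re

# Hypothetical stem-to-type rules, invented for this sketch.
STEM_RULES = [
    (r"^who\b", "PERSON"),
    (r"^where\b", "LOCATION"),
    (r"^when\b", "DATE"),
    (r"^how many\b", "NUMBER"),
]


def expected_answer_type(question):
    """Guess the expected answer type from the question stem alone."""
    q = question.strip().lower()
    for pattern, answer_type in STEM_RULES:
        if re.search(pattern, q):
            return answer_type
    # "What ..." questions land here: the stem carries no type information,
    # so the type must come from the rest of the question (e.g. the verb).
    return "UNKNOWN"


for q in ["Who was the first governor of Alaska?",
          "How many inhabitants are there in France?",
          "What could I see in Reims?"]:
    print(q, "->", expected_answer_type(q))
```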
Semantic mapping
Syntactic dependencies vary across question reformulations and equivalent answers, which are made possible by the productive nature of natural language.
Question ET2: What could I see in Reims?
The verbs see and visit are synonyms; visitor can be replaced by the possible actor pronoun I.
The unifying mapping of ET1 and ET2:
- helps to recognize equivalent answers when lexical and semantic alternations are allowed
- establishes dependency relations and defines the search space based on alternations of the question and answer concepts
Feedback supporting open-domain QA
Justifying the correctness of an answer relies on a lexico-semantic knowledge base (e.g. WordNet). Sometimes answer fusion is needed.
Question focus recognition
The question focus is a noun phrase (NP) that is likely to be present in the answer.
Question: Who was the first governor of Alaska?
FOCUS = the first governor of Alaska
FOCUS-HEAD = governor
MODIFIERS-FOCUS-HEAD = ADJ first, COMP Alaska
NPs containing synonyms of the question focus head are also looked for. Once delimited, NPs can be assigned a score for relevance ranking.
"This score takes into account the origin of the NP and the modifiers found in the question: when the NP contains the modifiers present in the question, its score is increased. The best score is obtained when all of them are present." [4]
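The scoring idea quoted above can be sketched as follows. The weights are invented for illustration, the origin of the NP is ignored, and `focus_score` is a hypothetical helper rather than the scoring function of [4]; it only shows that the score grows with each question modifier found in the candidate NP.

```python
def focus_score(candidate_np, focus_head, modifiers, head_synonyms=()):
    """Score a candidate NP against the question focus.

    Returns 0.0 if neither the focus head nor a synonym occurs in the NP;
    otherwise 1.0 plus one point per question modifier found in the NP
    (hypothetical weighting).
    """
    words = candidate_np.lower().split()
    heads = {focus_head.lower(), *(s.lower() for s in head_synonyms)}
    if not heads & set(words):
        return 0.0
    present = sum(1 for m in modifiers if m.lower() in words)
    return 1.0 + present


modifiers = ["first", "Alaska"]
for np in ["the first governor of Alaska", "the governor of Alaska", "the senator"]:
    print(np, "->", focus_score(np, "governor", modifiers))
```

The NP containing both modifiers gets the best score, matching the behaviour described in the quotation.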
Answer from a set of candidate answers
Most systems provide the user with:
- either a set of potential answers (ranked or not)
- or the "best" answer according to some relevance criteria.
What about the information contained in the whole set of candidate answers?
Example 1: How many inhabitants are there in France?
- Population census in France (1999): 60184186.
- 61.7: number of inhabitants in France in 2004.
Example 2: What is the average age of marriage of women in 2004?
- In Iran, the average age of marriage of women was 21 years in 2004.
- In 2004, Moroccan women get married at the age of 27.
Numeric results variation criteria
Variation exists if there are at least k different numerical values with different criteria (time, place, other restrictions) among the N retrieved frames or snippets (e.g. k = N / 4).
A numerical value can vary according to time, place, or other restrictions.
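One possible reading of this variation test is sketched below. The frame layout (a dict with `value`, `time`, `place` and `restriction` keys) and the exact counting rule are assumptions; the slide only fixes the threshold k as a fraction of N.

```python
def has_variation(frames, ratio=0.25):
    """Return True if the retrieved frames show variation.

    Assumed reading: there must be more than one distinct value, and both
    the number of distinct values and the number of distinct criteria
    (time, place, restriction) must reach k = ratio * N.
    """
    n = len(frames)
    if n == 0:
        return False
    k = n * ratio
    values = {f["value"] for f in frames}
    criteria = {(f.get("time"), f.get("place"), f.get("restriction")) for f in frames}
    return len(values) > 1 and min(len(values), len(criteria)) >= k


# Frames built from Example 2 on the previous slide.
frames = [
    {"value": 21, "time": 2004, "place": "Iran", "restriction": "women"},
    {"value": 27, "time": 2004, "place": "Morocco", "restriction": "women"},
]
print(has_variation(frames))
```

Here the value varies with the place, so the function reports variation.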
Variation criteria
Building a trend
In case of variation over time, a trend can be drawn, and an explanation can be generated using a correlation coefficient (e.g. the Pearson correlation coefficient r).
Variation mode: How many inhabitants are there in France?
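A minimal sketch of how a trend explanation could be derived from the Pearson correlation coefficient between time and value. The 0.6 threshold, the wording of the explanations and the sample series are invented for illustration and do not come from the slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


def describe_trend(r, threshold=0.6):
    """Turn r into a short explanation (the threshold is an assumption)."""
    if r >= threshold:
        return "increasing"
    if r <= -threshold:
        return "decreasing"
    return "no clear trend"


# Invented sample series: time steps and the values extracted for each.
times = [1, 2, 3, 4]
values = [10, 12, 15, 19]
print(describe_trend(pearson_r(times, values)))
```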
Numerical answer generation
Once the extracted numerical values are characterized, a cooperative answer can be generated. It is composed of two parts:
- a direct answer, if available
- an explanation of the value variation
Direct answer generation is mainly guided by constraints, if any are explicitly stated in the question:
Ct: constraints on time
Cp: constraints on place
Cr: constraints on restriction
C = {Ct, Cp, Cr}
Numerical answer generation
A direct answer has to be generated from the set of snippets AC that satisfy the set of constraints C.
Example
Question: What is the average age of marriage in France?
A = {AC1; AC2} with:
AC1 = {a1; a3; a5}, subset for restriction women,
AC2 = {a2; a4; a6}, subset for restriction men,
having:
a1 = 27.7, a2 = 29.8, a3 = 28, a4 = 30, a5 = 28.5, a6 = 30.6
Direct answer after the aggregation process: In 2000, the average age of marriage in France was about 30 years for men and 28 years for women.
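The aggregation step in this example can be reproduced with a short sketch. The slides do not say which aggregation function is used; taking the rounded mean of each subset is an assumption that happens to give the values in the direct answer.

```python
from statistics import mean

# Candidate values from the example, grouped by restriction.
candidates = {
    "women": [27.7, 28, 28.5],  # a1, a3, a5
    "men": [29.8, 30, 30.6],    # a2, a4, a6
}


def aggregate(values):
    """Aggregate candidate values (assumed here: mean, rounded to an integer)."""
    return round(mean(values))


direct = {restriction: aggregate(vals) for restriction, vals in candidates.items()}
print(f"about {direct['men']} years for men and {direct['women']} years for women")
```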
Serial system representation
A QA system can be represented as a serial system of modules.
Distribution of errors per system module
Conclusion
• QA systems have been extended in recent years to explore critical new scientific and practical dimensions: automatic answering of temporal and geospatial questions, definitional questions, biographical questions, multilingual questions, and questions about different multimedia items.
• Nevertheless, the overall performance of QA systems is directly related to the depth of NLP resources, and it is significantly enhanced by lexico-semantic information from large lexical databases of English and from online documents.
• Bottlenecks of QA systems:
  - the derivation of the expected answer type
  - the keyword expansion
• The main problem is the lack of powerful schemes and algorithms for modeling complex questions in order to derive as much information as possible, and for performing a well-guided search through thousands of text documents.
References
1) Robert F. Simmons. 1970. Natural language question-answering systems: 1969. Commun. ACM 13, 1 (January 1970), 15-30.
2) Marius Paşca and Sanda M. Harabagiu. 2001. Answer mining from on-line documents. In Proceedings of the workshop on Open-domain question answering - Volume 12 (ODQA '01). Association for Computational Linguistics, Stroudsburg, PA, USA, 1-8.
3) Véronique Moriceau. 2006. Generating intelligent numerical answers in a question-answering system. In Proceedings of the Fourth International Natural Language Generation Conference (INLG '06). Association for Computational Linguistics, Stroudsburg, PA, USA, 103-110.
4) O. Ferret, B. Grau, M. Hurault-Plantet, G. Illouz, L. Monceaux, I. Robba, and A. Vilnat. 2001. Finding an Answer Based on the Recognition of the Question Focus. In 10th Text Retrieval Conference.
5) Dan Moldovan, Sanda Harabagiu, and Mihai Surdeanu. 2003. Performance issues and error analysis in an open-domain question answering system. ACM Trans. Inf. Syst. 21, 2 (April 2003), 133-154.
Thank you for your attention!