TRUST & QRISTAL
(TRUST = Text Retrieval Using Semantic Technologies)
(QRISTAL = Questions-Réponses Intégrant un Système de Traitement Automatique des Langues, i.e. Question Answering Integrating a Natural Language Processing System)
Présentation M-CAST, 10 janvier 2005, Synapse Développement, D. LAURENT
1. TRUST Presentation
2. QRISTAL Presentation
3. QRISTAL Evaluation
1. TRUST Presentation
• TRUST is an R&D project co-financed by the EU Commission, under Synapse's technological leadership, addressing a multilingual QA system. It was submitted by a consortium of 6 SMEs:
• Synapse Développement, Toulouse, France
• Expert System Solutions, Modena, Italy
• Priberam, Lisbon, Portugal
• TiP, Katowice, Poland
• Convis, Berlin, Germany & Paris, France
• Sémiosphère, Toulouse, France (coordination)
TRUST started in November 2001 and was completed in October 2003. It was designed as an industrial project, with the aim of commercialising, in B2B and B2C, QA software allowing any user to retrieve one or several answers to a general-purpose or factual question. It was meant to answer questions on a finite corpus (hard disk, set of documents…), or questions addressed to the Internet, via a meta-engine, using the most popular engines (Google, MSN, AltaVista, AOL, etc.).
The targeted languages were French, Italian, Polish and Portuguese. English was not part of TRUST but was developed in parallel. The pivot language, allowing a question to be asked in one language and answered in another, is English. All partners owned a syntactic analyser and substantial linguistic resources. Synapse, as technology transferor, had at its disposal a previously commercialised indexing and retrieval engine (called Chercheur).
(Architecture diagram. Question side: spelling correction, syntactic and conceptual analysis, keyword extraction, question-type detection, anaphora resolution and, if multilingual, translation; then search in the index, selection and ordering of blocks, sentence selection and ranking with synonyms and converses, and answer extraction with coherence and justification checks. Document side: blocks are sliced and, after spelling correction, syntactic and conceptual analysis, metaphor detection and anaphora resolution, indexed by named entities, derivation heads, concepts, domains, block keywords and question-answer types.)
Trust Engine Description
• At completion, the TRUST engine had several original features:
• indexation is carried out on words, expressions and named entities, but also on concepts, domains and question-answer types;
• excerpt search and answer extraction use a very deep and precise syntactic, conceptual and semantic analysis.
A Modular Architecture
(Diagram: French, Italian, Portuguese, Polish and English linguistic modules plug into the indexing engine and the text-block extraction engine; documents feed the index, and results are displayed to the user.)
Document Indexation
TRUST indexes numerous document formats (.html, .doc, .pdf, .ps, .sgml, .xml, .hlp, .dbx, etc.) as well as archived/compressed files (.zip) and ASCII texts. Automated spell checking may be carried out beforehand. Beyond the usual indexation of terms, a semantic and syntactic analysis performs the indexation of concepts and of the typology of answers (e.g. a date of birth, a title or an occupation for a person, etc.).
Simple words are indexed by « head of derivation », i.e. words such as « symétrie », « symétriques », « asymétrie », « dissymétrique », « symétriseraient » or « symétrisable » are all indexed under the same heading « symétrie ». This technique reduces the size of the indexes and facilitates the grouping of neighbouring notions, thus avoiding the classical « term expansion » step at query time.
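The head-of-derivation idea can be sketched as follows; the tiny form-to-head dictionary below is an illustrative stand-in for the large derivational resources the slides describe, not the actual data.

```python
# Sketch of "head of derivation" indexing: all derived forms of a word are
# indexed under one canonical head, so a query never needs term expansion.
from collections import defaultdict

# Hypothetical excerpt of a derivation dictionary (surface form -> head).
DERIVATION_HEADS = {
    "symétrie": "symétrie",
    "symétriques": "symétrie",
    "asymétrie": "symétrie",
    "dissymétrique": "symétrie",
    "symétriseraient": "symétrie",
    "symétrisable": "symétrie",
}

def index_block(block_id, words, index):
    """Index each word of a text block under its derivation head."""
    for word in words:
        head = DERIVATION_HEADS.get(word.lower(), word.lower())
        index[head].add(block_id)

index = defaultdict(set)
index_block(1, ["Asymétrie", "mesurée"], index)
index_block(2, ["formes", "symétriques"], index)

# A query on "symétrie" now reaches both blocks without any term expansion.
print(sorted(index["symétrie"]))  # [1, 2]
```

A single index entry thus covers the whole derivational family, which is what keeps the index small.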
Technical Characteristics
Currently, indexation is performed in 1 KB blocks, i.e. texts are sliced into 1 KB blocks and each head of derivation is indexed with an occurrence count (e.g. found 3 times in a block, the occurrence is 3). Indexation speed varies widely by language: about 300 MB/hour for French and Polish, about 240 MB/hour for Portuguese, about 100 MB/hour for English and about 10 MB/hour for Italian.
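The block slicing and per-block occurrence counting can be sketched like this; real slicing would respect word boundaries and index derivation heads rather than raw tokens, so this is a simplified assumption.

```python
# Minimal sketch of 1 KB block slicing with per-block occurrence counts.
from collections import Counter

BLOCK_SIZE = 1024  # 1 KB, as in the slides

def slice_into_blocks(text, block_size=BLOCK_SIZE):
    """Cut the text into fixed-size blocks (simplified: may split words)."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]

def occurrences_per_block(blocks):
    """Per block, a Counter of token occurrences (stand-in for heads)."""
    return [Counter(block.lower().split()) for block in blocks]

text = "la symétrie " * 200          # ~2.4 KB of toy text
blocks = slice_into_blocks(text)
counts = occurrences_per_block(blocks)
print(len(blocks), counts[0]["symétrie"])  # 3 85
```

Each (head, block, occurrence) triple is what the index would actually store.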
Conceptual Indexation and Ontology
• TRUST shares a common ontology with the linguistic modules of all the attached languages. This ontology, developed by Synapse, includes 5 hierarchical levels corresponding to:
• 28 categories at the top level
• 94 categories at the second level
• 256 categories at the third level
• 3,387 categories at the fourth level
• over 71,000 terms (including 25,000 meanings for 9,000 words) and over 50,000 « syntagmes » (phrases) at the base level.
Indexation & Types of Questions
TRUST indexes the types of questions: when analysing each block of text, each linguistic module attempts to detect/profile the possible answer for each type of question (person, date, event, cause, aim, etc.). The present taxonomy of question types comprises 86 categories. It goes beyond the purely « factual », since it includes notions such as « usefulness », « comparison » and « judgment », as well as categories like « yes/no » or classification.
Analysis of the Question
When a question is keyed in by the user, its language is detected automatically, and the matching linguistic module performs the semantic and syntactic analysis of the question. When some words of the question have several meanings, the most probable meaning is chosen, but the user may force the meaning of each word. The same linguistic module determines the domain, the concepts and, above all, the type of the question.
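A toy version of the first two steps (language detection, then question-type detection) might look like this; the real modules use full syntactic and semantic analysis, so the cue words and the four-pattern taxonomy below are illustrative assumptions only.

```python
# Sketch of question analysis: guess the language from cue words, then map
# the interrogative to a question type (a tiny stand-in for 86 categories).
import re

LANGUAGE_CUES = {
    "fr": {"qui", "quand", "où", "pourquoi", "quel", "quelle", "comment"},
    "en": {"who", "when", "where", "why", "which", "what", "how"},
}

TYPE_PATTERNS = [
    (r"^(qui|who)\b", "person"),
    (r"^(quand|when)\b", "date"),
    (r"^(où|where)\b", "place"),
    (r"^(pourquoi|why)\b", "cause"),
]

def detect_language(question):
    """Pick the language whose cue words overlap the question the most."""
    words = set(re.findall(r"\w+", question.lower()))
    return max(LANGUAGE_CUES, key=lambda lang: len(words & LANGUAGE_CUES[lang]))

def detect_type(question):
    """Return the question type matched by the leading interrogative."""
    q = question.lower().strip()
    for pattern, qtype in TYPE_PATTERNS:
        if re.match(pattern, q):
            return qtype
    return "other"

q = "Quand est né Victor Hugo ?"
print(detect_language(q), detect_type(q))  # fr date
```

The detected type then drives both the index lookup and the later answer extraction.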
The Text Search
From the data obtained through the analysis of the question (heads of derivation, named entities, domains, concepts, question profile/type), the search engine extracts from the index the blocks of text best matching this set of data. The different available data are weighted against each other, so that a disambiguation error on the meaning or on the type of the question does not, on its own, prevent the retrieval of blocks of text that may contain an answer.
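The weighted combination of signals can be sketched as below: no single mismatched feature (for instance a wrongly guessed question type) zeroes out a block's score. The weights and field names are assumptions for illustration, not the product's actual values.

```python
# Illustrative weighted scoring of indexed blocks against a question's
# features: summing weighted overlaps keeps a block in the running even if
# one feature (e.g. the question type) was disambiguated incorrectly.

WEIGHTS = {"heads": 3.0, "named_entities": 2.0, "concepts": 1.5,
           "domains": 1.0, "question_type": 1.0}

def score_block(block, query):
    """Sum weighted overlaps between query features and block features."""
    score = 0.0
    for field, weight in WEIGHTS.items():
        overlap = len(set(query.get(field, [])) & set(block.get(field, [])))
        score += weight * overlap
    return score

query = {"heads": ["naissance", "hugo"], "question_type": ["date"]}
blocks = [
    {"id": 1, "heads": ["hugo", "naissance"], "question_type": ["date"]},
    {"id": 2, "heads": ["hugo"], "question_type": ["person"]},
]
ranked = sorted(blocks, key=lambda b: score_block(b, query), reverse=True)
print([b["id"] for b in ranked])  # [1, 2]
```

Block 2's wrong question type lowers its rank but does not exclude it, which is exactly the robustness the slide describes.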
Extraction of Answers
For a given question, after optional spell checking and syntactic, semantic and conceptual analysis, then detection of the question type, the heads of derivation, named entities, concepts, domains and question-answer types are compared against the corresponding indexes. The best-ranked blocks are analysed and answers extracted. The extraction of the answer is performed by searching for named entities or syntactic groups in « answer position ».
Response Time
For a question keyed in on a closed corpus (hard disk, corpus, Intranet), the answer is provided in less than 3 seconds in French; with other languages it can take up to 10 seconds. For a question addressed to the Internet, the response time may be anything between 2 and 14 seconds, depending on the language used, the number of pages analysed (user-definable) and the type of the question (a few answers are retrieved very quickly from the available abstract or short description alone).
2. QRISTAL Presentation
QRISTAL (acronym of Questions-Réponses Intégrant un Système de Traitement Automatique des Langues) is the B2C version of TRUST. It is priced at 99 € and sold in retail computer outlets and in large consumer-market distributors such as Virgin Stores or FNAC. Fruit of 6 years of development, QRISTAL goes beyond the limits set for TRUST, but undoubtedly arises from this project.
QRISTAL may be used in 2 major ways:
• Provide exact answers to questions on « closed corpora » (hard disk, emails, Intranet, etc.), these being previously indexed so as to extract the answers from the blocks of text matching the analysis of the question.
• Provide exact answers to questions addressed to the Internet (web). In this case, Qristal converts the questions into requests « understandable » by the standard engines, retrieves the returned pages and their short descriptions, analyses them and computes the answers.
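The conversion of a natural-language question into an engine-readable request can be sketched as stripping interrogative and function words and keeping the content words; the stop-word list below is an illustrative assumption, not Qristal's actual linguistic resource.

```python
# Sketch of turning a question into an "understandable request" for a
# standard web engine: drop interrogatives and function words, keep the rest.
FRENCH_STOPWORDS = {"qui", "que", "quand", "où", "est", "sont", "né",
                    "le", "la", "les", "de", "du", "des", "un", "une"}

def to_web_query(question):
    """Reduce a natural-language question to a bag of content keywords."""
    tokens = question.lower().replace("?", " ").split()
    keywords = [t for t in tokens if t not in FRENCH_STOPWORDS]
    return " ".join(keywords)

print(to_web_query("Quand est né Victor Hugo ?"))  # victor hugo
```

The pages returned for such a request are then re-analysed in full to extract the exact answer, as the slide describes.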
In Qristal, special attention has been given to user configurability. By design, Qristal targets users unfamiliar with SQL or web requests who wish to obtain an answer directly while formulating their questions in ordinary natural language. The interface therefore had to be very user-friendly and as simple as possible, so that such users can tailor Qristal to their habits and wishes. For more experienced users, files of questions as well as work on several indexes permit more advanced usage.
Commercialisation
QRISTAL has been on the market since December 2004 and registered a base of more than 700 users in that single month. Users are satisfied with the results obtained in French, while their judgment of the results in the other languages is (a bit unfairly) critical. Qristal appears to be very « reliable and stable » and user-friendly, as the very few calls to customer support attest. Users' expectations are very high, and satisfying them will require considerable effort on our part.
Press
Article in « La Dépêche du Midi » of 4 January 2005
Perspectives
• QRISTAL will be updated in the coming years, with the following improvements:
• improve the rate of exact answers, eliminate noise
• use the notoriety of pages to order them
• carry out more precise inferences to extract the answers
• allow « user profiles »
• include other languages (German, Spanish)
• better differentiate the answer modes (single, all)
• better situate the answers in their context
3. QRISTAL Evaluation
QRISTAL was evaluated in a contest called EQUER, a campaign for the evaluation of QA systems within the EVALDA project (www.technolangue.net). The EVALDA and Technolangue projects were initiated by the French Ministry for Industry, Research and Culture. The EQUER campaign was organised by ELDA (Evaluations and Language resources Distribution Agency, www.elda.org) and ran between January 2003 and December 2004.
The EQUER campaign, very similar in its principles to TREC-QA (USA) or NTCIR (Japan), included 2 different tests:
• 500 all-domain, mainly factual questions on a journalistic and administrative corpus of 1.5 GB.
• 200 questions, very often non-factual, on a medical corpus of scientific articles and web pages of about 50 MB.
The 500 general-purpose questions broke down into:
• 407 simple factual questions (ex.: Comment s'appelle le fils de Juliette Binoche ?)
• 31 questions with a list as answer (ex.: Quels sont les trois pays qui bordent la Bosnie-Herzégovine ?)
• 32 questions with a definition as answer (ex.: Qu'est-ce que la NSA ?)
• 30 binary questions, with Yes/No as answer (ex.: La carte d'identité existe-t-elle au Royaume-Uni ?)
The EQUER contestants were:
• 4 commercial companies (the first 2 are very large firms):
• Commissariat à l'Énergie Atomique, Saclay, France
• France Telecom, Lannion, France
• Sinequa, Paris, France
• Synapse Développement, Toulouse, France
• 3 university laboratories:
• LIA & SMART, Avignon, France
• LIMSI, Orsay, France
• Université de Neuchâtel, Switzerland
Procedure and Scoring Metrics
The metric used to score the results was MRR (Mean Reciprocal Rank), i.e. 1 for an exact answer in first position, 1/2 for an exact answer in second position, 1/3 for an exact answer in third position, etc. Only the first 5 answers were taken into account, except for binary questions, where a single exact, justified answer was accepted. For questions with a list as answer, the metric used was NIAP (Non-Interpolated Average Precision).
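The MRR scoring just described can be computed as follows (the three toy runs are invented examples, not EQUER data):

```python
# MRR as described above: an exact answer at rank 1 scores 1, at rank 2
# scores 1/2, etc.; only the first 5 answers count; no exact answer in the
# top 5 scores 0; the final score is the mean over all questions.

def reciprocal_rank(ranked_answers, correct, cutoff=5):
    for rank, answer in enumerate(ranked_answers[:cutoff], start=1):
        if answer in correct:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """runs: list of (ranked_answers, set_of_correct_answers) pairs."""
    return sum(reciprocal_rank(a, c) for a, c in runs) / len(runs)

runs = [
    (["1802", "1803"], {"1802"}),          # correct at rank 1 -> 1
    (["Paris", "Besançon"], {"Besançon"}), # correct at rank 2 -> 1/2
    (["?", "?", "?"], {"Lyon"}),           # not found -> 0
]
print(mean_reciprocal_rank(runs))  # 0.5
```

The list questions use NIAP instead, which averages precision over the positions of the correct list items rather than over a single best rank.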
The Synapse QA system evaluated during the EQUER campaign was a « pre-version » of QRISTAL, lacking some of the functionalities for extracting the exact answer. With EQUER, Synapse participated in its first ever QA evaluation campaign, while many other contestants had prior experience of TREC-QA or CLEF-QA for English- or French-language QA.
Technical Performance
The full set of 500 questions on the general corpus was processed in 23 minutes and 17 seconds, hence less than 3 seconds per question. The speed of the linguistic analysis of the blocks was about 400 MB/hour for indexation, i.e. 18,000 words/second. The speed of analysis and answer extraction was about 230 MB/hour, i.e. 10,000 words/second. Over the 500 questions, the question type was correctly determined in 98% of cases. These speed tests were carried out on a 3 GHz Pentium with 1 GB of RAM.
(Result graphs: EQUER evaluation scores of the participating systems.)
As shown in the previous graphs, the best EQUER QA system (i.e. Synapse) performs as well as the best systems in TREC or NTCIR (MRR of 0.58 versus 0.68 and 0.61) for exact answers. This level is, in all cases, superior to the second best in TREC or NTCIR. These results confirm the theoretical choices made and the quality of the resources developed within TRUST and implemented in QRISTAL.
Other Evaluations
During the evaluation, a set of 100 text references, originating from a standard engine, was provided for each question. With these data, Synapse's engine scored 0.64 (versus 0.70) for « passages » and 0.48 (versus 0.58) for exact answers. An in-house test later showed that « inhibiting » the « question type » function made the MRR fall from 0.70 to 0.46 for « passages », underlining the importance of this functionality.
After extracting the texts enclosing the answers from the 1.5 GB general corpus, we managed to reduce it to 180 MB. It is noticeable that the results for the 500 questions are very close on each of the 2 corpora. This leads us to think that the size of the corpus may be negligible for the quality of the results, contrary to a commonly accepted idea in information retrieval. The corpus of questions included « reformulations ». A benchmark comparing the answers to the questions in their initial form versus after « reformulation » showed that the results are very close (93% of the answers in first position are identical).
Future Evaluations
Synapse intends to participate in CLEF-QA, in both the monolingual and the multilingual tracks, in 2005. Currently, no other evaluation campaign is planned in France as a follow-up to EQUER, but an evaluation on a transcript of an oral corpus should take place in the coming months.
FIN / End
Merci ! / Thank you!