
ResPubliQA 2010: QA on European Legislation



Presentation Transcript


  1. ResPubliQA 2010: QA on European Legislation Anselmo Peñas, UNED, Spain Pamela Forner, CELCT, Italy Richard Sutcliffe, U. Limerick, Ireland Alvaro Rodrigo, UNED, Spain http://celct.isti.cnr.it/ResPubliQA/

  2. Outline • The Multiple Language Question Answering Track at CLEF – a bit of History • ResPubliQA this year • What is new • Participation, Runs and Languages • Assessment and Metrics • Results • Conclusions ResPubliQA 2010, 22 September, Padua, Italy

  3. Multiple Language Question Answering at CLEF Started in 2003: eighth year • Era I: 2003-2006 Ungrouped, mainly factoid questions asked against monolingual newspapers; exact answers returned • Era II: 2007-2008 Grouped questions asked against newspapers and Wikipedia; exact answers returned • Era III: 2009-2010 ResPubliQA – Ungrouped questions against multilingual parallel-aligned EU legislative documents; passages returned ResPubliQA 2010, 22 September, Padua, Italy

  4. ResPubliQA 2010 – Second Year But also some novelties… • Key points: • same set of questions in all languages • same document collections: parallel-aligned documents • Same objectives: • to move towards a domain of potential users • to allow the direct comparison of performance across languages • to allow QA technologies to be evaluated against IR approaches • to promote the use of validation technologies ResPubliQA 2010, 22 September, Padua, Italy

  5. What’s new • New task (Answer Selection) • New document collection (EuroParl) • New question types • Automatic evaluation ResPubliQA 2010, 22 September, Padua, Italy

  6. The Tasks NEW • Paragraph Selection (PS): to extract a relevant paragraph of text that completely satisfies the information need expressed by a natural language question • Answer Selection (AS): to demarcate the shorter string of text corresponding to the exact answer, supported by the entire paragraph ResPubliQA 2010, 22 September, Padua, Italy

  7. The Collections NEW • Subset of JRC-Acquis (10,700 docs per language) • EU treaties, EU legislation, agreements and resolutions • Between 1950 and 2006 • Parallel-aligned at the document level (not always at paragraph level) • XML-TEI.2 encoding • Small subset of EUROPARL (~150 docs per language) • Proceedings of the European Parliament • Translations into Romanian from January 2009 • Debates (CRE) from 2009 and Texts Adopted (TA) from 2007 • Parallel-aligned at the document level (not always at paragraph level) • XML encoding ResPubliQA 2010, 22 September, Padua, Italy

  8. The EuroParl Collection • is compatible with the Acquis domain • allows the scope of the questions to be widened • Unfortunately: • small number of texts • documents are not fully translated The specific fragments of JRC-Acquis and EuroParl used by ResPubliQA are available at http://celct.isti.cnr.it/ResPubliQA/Downloads ResPubliQA 2010, 22 September, Padua, Italy

  9. Questions • two new question categories: • OPINION What did the Council think about the terrorist attacks on London? • OTHER What is the e-Content program about? • Reason and Purpose categories merged together: Why was Perwiz Kambakhsh sentenced to death? • And also Factoid, Definition, Procedure ResPubliQA 2010, 22 September, Padua, Italy

  10. ResPubliQA Campaigns More participants and more submissions ResPubliQA 2010, 22 September, Padua, Italy

  11. ResPubliQA 2010 Participants: 13 participants, 8 countries, 4 new participants ResPubliQA 2010, 22 September, Padua, Italy

  12. Submissions by Task and Language ResPubliQA 2010, 22 September, Padua, Italy

  13. System Output • Two options: • Give an answer (paragraph or exact answer) • Return NOA as response = no answer is given because the system is not confident about the correctness of its candidate answer • Objective: • avoid returning an incorrect answer • reduce only the portion of wrong answers, not of correct ones ResPubliQA 2010, 22 September, Padua, Italy
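A hypothetical sketch of the decision a participating system might make when producing its output; the function, confidence score and threshold below are illustrative and not part of the track specification:

    def produce_response(candidate, confidence, threshold=0.5):
        # Return the candidate paragraph (or exact answer) only when the
        # system is confident enough; otherwise return NOA so that a
        # potentially wrong answer is withheld instead of being counted wrong.
        if candidate is not None and confidence >= threshold:
            return candidate
        return "NOA"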

  14. Evaluation Measure c@1 = (nR + nU · (nR / n)) / n • nR: number of questions correctly answered • nU: number of questions unanswered • n: total number of questions (200 this year) • If nU = 0 then c@1 = nR/n → Accuracy ResPubliQA 2010, 22 September, Padua, Italy
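The measure credits unanswered questions in proportion to the accuracy obtained on the answered ones; a minimal sketch of the computation in Python (function and variable names are mine):

    def c_at_1(n_r, n_u, n):
        # c@1 as defined above: unanswered (NOA) questions earn partial
        # credit proportional to the overall accuracy nR/n.
        return (n_r + n_u * (n_r / n)) / n

    # With no NOA responses the measure reduces to plain accuracy:
    #   c_at_1(120, 0, 200)   -> 0.60
    # Withholding 40 answers while keeping 120 correct raises the score:
    #   c_at_1(120, 40, 200)  -> 0.72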

  15. Assessment Two steps: • Automatic evaluation • responses automatically compared against a manually produced Gold Standard • answers that exactly match the Gold Standard are marked correct (R) • correctness requires an exact match of the document identifier, the paragraph identifier, and the text retrieved by the system with respect to those in the Gold Standard • 31% of the answers were automatically marked as correct • Manual assessment • non-matching paragraphs / answers judged by human assessors • anonymous and simultaneous for the same question ResPubliQA 2010, 22 September, Padua, Italy
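A minimal sketch of the automatic step, assuming each system response and gold-standard entry carries a document identifier, a paragraph identifier and the paragraph text (field names are illustrative):

    def auto_assess(response, gold):
        # Mark Right (R) only on an exact match of document id, paragraph id
        # and retrieved text; everything else is left for manual assessment.
        if (response["doc_id"] == gold["doc_id"]
                and response["para_id"] == gold["para_id"]
                and response["text"].strip() == gold["text"].strip()):
            return "R"
        return None  # undecided: passed on to the human assessors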

  16. Assessment for Paragraph Selection (PS) • binary assessment: • Right (R) • Wrong (W) • NOA answers: • automatically filtered and marked as U (Unanswered) • discarded candidate answers were also evaluated: • NoA R: NoA, but the candidate answer was correct • NoA W: NoA, and the candidate answer was incorrect • NoA Empty: NoA and no candidate answer was given • evaluators were guided by the initial “gold” paragraph, used only as a hint ResPubliQA 2010, 22 September, Padua, Italy

  17. Assessment for Answer Selection (AS) • R (Right): the answer-string consists of an exact and correct answer, supported by the returned paragraph • X (ineXact): the answer-string contains either part of a correct answer present in the returned paragraph, or all of the correct answer plus unnecessary additional text • M (Missed): the answer-string does not contain a correct answer even in part, but the returned paragraph does contain a correct answer • W (Wrong): the answer-string does not contain a correct answer and the returned paragraph does not contain it either, or it contains an unsupported answer ResPubliQA 2010, 22 September, Padua, Italy
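For scoring, these judgements feed the c@1 measure above: only R responses count towards nR, unanswered (NOA) responses count towards nU, and the remaining labels are treated as wrong. A small illustrative tally (the helper is mine, not part of the official evaluation code):

    from collections import Counter

    def counts_for_c_at_1(labels):
        # labels: one judgement per question, e.g. "R", "X", "M", "W",
        # or "U" for an unanswered (NOA) question.
        tally = Counter(labels)
        return tally["R"], tally["U"], len(labels)

    # n_r, n_u, n = counts_for_c_at_1(labels); score = c_at_1(n_r, n_u, n)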

  18. Monolingual Results for PS ResPubliQA 2010, 22 September, Padua, Italy

  19. Improvement in Performance on the Monolingual PS Task ResPubliQA 2010, 22 September, Padua, Italy

  20. Cross-language Results for PS • In comparison to ResPubliQA 2009: • More cross-language runs (+2) • Improvement in the best performance: from c@1 = 0.18 to 0.36 ResPubliQA 2010, 22 September, Padua, Italy

  21. Results for the AS Task ResPubliQA 2010, 22 September, Padua, Italy

  22. Conclusions • Successful continuation of ResPubliQA 2009 • AS task: few groups and poor results • Overall improvement of results • New document collection and new question types • The c@1 evaluation metric encourages the use of a validation module ResPubliQA 2010, 22 September, Padua, Italy

  23. More on System Analyses and Approaches MLQA’10 Workshop on Wednesday, 14:30 – 18:00 ResPubliQA 2010, 22 September, Padua, Italy

  24. ResPubliQA 2010: QA on European Legislation Thank you!
