Joint SemEval/CLEF tasks: Contribution of WSD to CLIR
UBC: Agirre, Lopez de Lacalle, Otegi, Rigau
FBK: Magnini
Irion Technologies: Vossen
CLEF 2007 - Budapest
WSD and SemEval
• Word Sense Disambiguation
  • "When I went to bed at around two o'clock that night, everyone else was still out in the party."
  • party:N:1 political organization
  • party:N:2 social event (see the sketch below)
  • Potential for more precise expansion (translation)
• SemEval 2007
  • Framework for semantic evaluations
  • Under the auspices of SIGLEX (ACL)
  • 19 tasks incl. WSD, SRL, full frames, people, …
  • > 100 attendees at the ACL workshop
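As a minimal sketch of what the sense inventory looks like in practice, the snippet below lists the candidate noun senses a WSD system has to choose between for "party". It assumes NLTK's WordNet 3.0 interface, not the WordNet 1.6 inventory used in the task.

```python
# Minimal sketch: enumerate the candidate noun senses of "party".
# Assumes NLTK with the WordNet corpus installed (nltk.download('wordnet'));
# this is WordNet 3.0, not the WordNet 1.6 inventory used in the task.
from nltk.corpus import wordnet as wn

for i, synset in enumerate(wn.synsets('party', pos=wn.NOUN), start=1):
    print(f"party:N:{i}  {synset.name():18s}  {synset.definition()}")
```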
Motivation for the task
• WSD perspective
  • In-vitro evaluations not fully satisfactory
  • In-vivo evaluations in applications (MT, IR, …)
• IR perspective
  • Usefulness of WSD for IR/CLIR is disputed, but …
    • Real rather than artificial experiments
    • Expansion rather than just WSD
    • Weighted list of senses rather than the single best sense
    • Controlling which words to disambiguate
  • WSD technology has improved
    • Coarser-grained senses (90% acc. on SemEval 2007)
Motivation for the task
• Combining WSD and IR:
  • Many possible variations
  • Unfeasible for a single research team
  • A public common dataset allows the community to explore different combinations
• Tasks where we could hope for a positive impact:
  • High-recall IR scenarios
  • Short-passage IR scenarios
  • Q&A
  • CLIR
• We selected CLIR because of the previous expertise of some of the organizers.
Two-stage framework
• First stage (SemEval 2007 task 01):
  • Participants: submit WSD results
    • Sense inventory: WordNet 1.6 (for multilinguality)
  • Organizers:
    • Expansion / translation strategy fixed
    • IR/CLIR system fixed (IR as upper bound)
• Second stage (proposed CLEF 2008 track):
  • Organizers: provide several WSD annotations
  • Participants: submit CLIR results with/without WSD annotations
Outline
• Description of the SemEval task (1st stage)
• Evaluation of results (1st stage)
• Conclusions (1st stage)
• Next step (2nd stage)
Description of the task: Datasets
• CLEF data:
  • Documents in English: LA94, GH95
    • 170,000 documents, 580 MB of raw text
  • 300 topics: both in English and Spanish
  • Existing relevance judgments
• Due to time limitations of the exercise:
  • 16.6% of the document collection (we will have 100% shortly)
  • Subset of relevance judgments, 201 topics
Description of the task: Two subtasks for participants
• English WSD of the following:
  • the document collection
  • the topics
• We limit to English for the time being.
• Return WN 1.6 senses.
Description of the task: Steps of the CLIR/IR system
• Step 1: Participants return WSD results
• Step 2: Expansion / Translation
  • Multilingual Central Repository (based on EuroWordNet)
    • 5 languages tightly connected
    • To ILI concepts (WN 1.6 synsets)
    • Mappings to other WN versions
  • Example: car, sense 1 (see the sketch below)
    • Expanded to synonyms: automobile
    • Translated to Spanish equivalents: auto, coche
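A minimal sketch of this expansion/translation step, assuming NLTK's WordNet 3.0 plus the Open Multilingual Wordnet in place of the Multilingual Central Repository and WordNet 1.6 actually used: given a disambiguated sense, its English lemmas provide the expansion and its Spanish lemmas provide the translation.

```python
# Illustrative only: expansion and translation from one disambiguated sense.
# Assumes nltk.download('wordnet') and nltk.download('omw-1.4');
# uses WordNet 3.0 + Open Multilingual Wordnet, not the MCR / WN 1.6 of the task.
from nltk.corpus import wordnet as wn

sense = wn.synsets('car', pos=wn.NOUN)[0]     # car, sense 1

expansion = sense.lemma_names()               # English synonyms, e.g. ['car', 'auto', 'automobile', ...]
translation = sense.lemma_names('spa')        # Spanish equivalents, e.g. ['auto', 'coche', ...]

print(expansion)
print(translation)
```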
Description of the task: Steps of the CLIR/IR system
• Step 3: IR/CLIR system
  • Adaptation of TwentyOne (Irion)
  • Pre-processing: XML
  • Indexing: detected noun phrases only
  • Title and description used for queries
  • Stripped down to vector-space matching (see the sketch below)
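The following is only an illustration of the kind of stripped-down vector-space matching described above, using scikit-learn TF-IDF and cosine similarity; it is not the TwentyOne system, and the documents and query are hypothetical placeholders.

```python
# Illustrative sketch of vector-space retrieval: rank documents by cosine
# similarity between TF-IDF vectors of the documents and the query.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "the party nominated a new candidate",          # placeholder document texts
    "the birthday party lasted until two o'clock",
]
query = "political party candidate"

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)
query_vector = vectorizer.transform([query])

scores = cosine_similarity(query_vector, doc_matrix).ravel()
ranking = scores.argsort()[::-1]                    # best-matching document first
print([(int(i), float(scores[i])) for i in ranking])
```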
Description of the task: Three evaluation settings
• IR with WSD of documents (English)
  • WSD of English documents
  • Expansion of senses in the documents
• IR with WSD of topics (English)
  • WSD of English topics
  • Expansion of senses in the topics
  • IR serves as an upper bound for CLIR
• CLIR with WSD of documents:
  • WSD of English documents
  • Translation of English documents
  • Retrieval using Spanish topics
• CLIR with WSD of topics: not included (would require Spanish WSD)
Evaluation and results: Participant systems
• Participants returned sense-tagged documents and topics
• Two systems participated:
  • PUTOP from Princeton, unsupervised
  • UNIBA from Bari, knowledge-based, using WordNet
• In-house system:
  • ORGANIZERS: supervised, kNN classifiers
• Other baselines (the two simplest are sketched below):
  • noexp: original text, no expansion
  • fullexp: expand to all senses
  • wsdrand: return a sense at random
  • 1st: return the first sense in WordNet
  • wsd50: the 50% best senses (in-house WSD system only)
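For concreteness, a minimal sketch of the two simplest baselines (not the participants' systems), again on WordNet 3.0 via NLTK: "1st" takes the first listed sense, "wsdrand" picks one of the candidate senses at random.

```python
# Sketch of the "1st" (first-sense) and "wsdrand" (random-sense) baselines.
# Assumes nltk.download('wordnet'); WordNet 3.0, not the task's WN 1.6.
import random
from nltk.corpus import wordnet as wn

def first_sense(word, pos=wn.NOUN):
    senses = wn.synsets(word, pos=pos)
    return senses[0] if senses else None

def random_sense(word, pos=wn.NOUN):
    senses = wn.synsets(word, pos=pos)
    return random.choice(senses) if senses else None

print(first_sense('party'))     # e.g. Synset('party.n.01')
print(random_sense('party'))
```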
Evaluation and results: S2AW and S3AW control
• S2AW / S3AW: the Senseval-2 and Senseval-3 all-words datasets, used as a control
• Indication of the performance of the WSD systems
• Not necessarily correlated with IR/CLIR results
• Supervised system (ORG) fares better
Evaluation and results: Results (Mean Average Precision, MAP; defined below)
• IR: noexp best
• CLIR:
  • fullexp best
  • ORG close behind
  • far from the IR (upper bound) results
• Expansion and IR/CLIR system too simple
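For reference, the measure reported above is standard mean average precision over the topic set; the compact textbook definition below is not specific to this task.

```latex
% Average precision for one topic q, and MAP over the topic set Q.
% R_q = number of relevant documents for q, P_q(k) = precision at rank k,
% rel_q(k) = 1 if the document at rank k is relevant, else 0.
\mathrm{AP}(q) \;=\; \frac{1}{R_q} \sum_{k=1}^{n} P_q(k)\,\mathrm{rel}_q(k),
\qquad
\mathrm{MAP} \;=\; \frac{1}{|Q|} \sum_{q \in Q} \mathrm{AP}(q)
```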
Analysis: # words in the expansion of documents
• IR: the fewer the better (but)
  • MAP: noexp > ORG > UNIBA
  • MW: noexp < … < ORG
• CLIR: the more the better (but)
  • MAP: fullexp > ORG > ORG(50)
  • MW: fullexp > ORG(50) > ORG
• WSD allows for a more informed expansion
Conclusions
• Main goals met:
  • First attempt at evaluating WSD in CLIR
  • Large dataset prepared and preprocessed
  • WSD allows for more informed expansion
• On the negative side:
  • Participation was low
    • SemEval overload; 10 had expressed interest
  • No improvement over the baseline
    • Expansion and IR/CLIR system naive
Next stage: CLEF 2008
• WSD results provided:
  • WSD of the whole collection
  • Best WSD systems from SemEval 2007
• CLEF teams will be able to try more sophisticated IR/CLIR methods
• Feasibility of a Q&A exercise
• Suggestions for cooperation on other tasks are welcome

Thank you for your attention!
http://ixa2.si.ehu.es/semeval-clir