
Presentation Transcript


  1. Robust Task at CLEF 2007? Break-Out Session
  Thomas Mandl
  7th Workshop of the Cross-Language Evaluation Forum (CLEF)
  Alicante, 22 Sept. 2006

  2. Purpose of Robust 2006: Robustness in CLIR
  • Robustness in multilingual retrieval
  • Stable performance over all topics instead of high average performance (as at TREC, but for other languages)
  • Stable performance over all topics for multilingual retrieval
  • Stable performance over different languages (so far at CLEF?)
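The contrast between high average performance and stable per-topic performance is exactly the difference between MAP (arithmetic mean of per-topic average precision) and GMAP (geometric mean, which TREC Robust used with a small floor for zero-AP topics). A minimal sketch with hypothetical AP scores, not CLEF data:

```python
import math

EPSILON = 1e-5  # floor for zero-AP topics, following TREC Robust convention

def mean_ap(ap_scores):
    """MAP: arithmetic mean, rewards high average performance."""
    return sum(ap_scores) / len(ap_scores)

def geometric_map(ap_scores):
    """GMAP: geometric mean, punishes failures on individual topics."""
    logs = [math.log(max(ap, EPSILON)) for ap in ap_scores]
    return math.exp(sum(logs) / len(logs))

# Hypothetical per-topic AP scores for two systems:
spiky  = [0.90, 0.85, 0.80, 0.02, 0.01]  # strong on average, fails two topics
stable = [0.55, 0.50, 0.52, 0.48, 0.45]  # moderate but robust

for name, scores in [("spiky", spiky), ("stable", stable)]:
    print(name, round(mean_ap(scores), 3), round(geometric_map(scores), 3))
# The spiky system wins on MAP; the stable system wins on GMAP.
```

The stable system's GMAP advantage is the sense of robustness the track is after.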

  3. Ultimate Goal
  • "More work needs to be done on customizing methods for each topic" (Harman 2005)

  4. Positive Aspects
  • Many runs were submitted
    • though by only 8 groups
  • A large topic set has been assembled
    • e.g. sensitivity analyses can be done with a larger topic set
  • Cheap
    • no relevance assessments were necessary

  5. Negative Aspects
  • Probably too many sub-tasks
  • Not much work specifically on robustness
  • Inconsistencies in the data were criticized
  • Robustness over languages has not been tackled
    • probably impractical, so this seems not to be an option for another robust task

  6. Results
  • High correlation between MAP and GMAP (higher than at TREC)
    • Is a robustness analysis necessary at all?
  • Problems finding difficult topics in multilingual retrieval
    • Topics are not inherently difficult, but only in combination with a collection
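One way to read "high correlation between MAP and GMAP" is that the two measures rank systems almost identically, which would weaken the case for a separate robustness analysis. A self-contained sketch that ranks hypothetical systems by both measures and computes Kendall's tau (none of the numbers are CLEF results):

```python
from itertools import combinations
from statistics import mean, geometric_mean

def gmap(ap_scores, eps=1e-5):
    """Geometric MAP with the zero-AP floor used at TREC Robust."""
    return geometric_mean([max(ap, eps) for ap in ap_scores])

def kendall_tau(xs, ys):
    """Kendall's tau over paired scores (no tie handling, for brevity)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

systems = {  # hypothetical per-topic AP scores for three systems
    "sysA": [0.90, 0.85, 0.80, 0.02, 0.01],
    "sysB": [0.55, 0.50, 0.52, 0.48, 0.45],
    "sysC": [0.30, 0.35, 0.25, 0.28, 0.33],
}
maps  = [mean(aps) for aps in systems.values()]
gmaps = [gmap(aps) for aps in systems.values()]
print("tau(MAP, GMAP) =", round(kendall_tau(maps, gmaps), 3))
```

A tau close to 1 over the real submitted runs would mean GMAP adds little information beyond MAP.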

  7. Future of the Robust CLEF task?
  • No more robust task
  • Robustness between different collections
  • Enforce topic-specific treatment
  • Repeat a similar robust task
  • Robustness over different user models
  • Participants need to determine topic difficulty (see the sketch after this slide)
  • In-depth failure analysis for individual topics (RIA)
  • Robustness over different tasks / team up with another track
  • New measures
  (Notes on the slide: proved hard at TREC; hard to organize in a CLEF style; hard to organize)
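For the option of letting participants determine topic difficulty, the query-performance-prediction literature offers cheap pre-retrieval predictors; one classic example is the average IDF of the query terms. A minimal sketch; the document-frequency table and collection size are hypothetical, and the track prescribed no particular predictor:

```python
import math

def avg_idf(query_terms, doc_freq, num_docs):
    """Average inverse document frequency of the query terms.

    Low values mean the query consists of common terms, which often
    signals a harder (more ambiguous) topic.
    """
    idfs = [math.log(num_docs / (doc_freq.get(t, 0) + 1)) for t in query_terms]
    return sum(idfs) / len(idfs)

doc_freq = {"oil": 12000, "spill": 800, "alaska": 1500}  # hypothetical counts
score = avg_idf(["oil", "spill", "alaska"], doc_freq, num_docs=450_000)
print("avg IDF:", round(score, 2))  # higher values: likely an easier topic
```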

  8. Robustness over different user models?
  • high recall vs. high precision
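The two user models can be made concrete by scoring the same ranked list under a precision-oriented measure (P@10, for a user who inspects only the first page) and a recall-oriented one (recall at a deep cutoff, for a user who needs the whole relevant set). A small sketch over a hypothetical run and hypothetical qrels:

```python
def precision_at(k, ranking, relevant):
    """Fraction of the top-k documents that are relevant."""
    return sum(1 for d in ranking[:k] if d in relevant) / k

def recall_at(k, ranking, relevant):
    """Fraction of all relevant documents retrieved within the top k."""
    return sum(1 for d in ranking[:k] if d in relevant) / len(relevant)

ranking  = [f"doc{i}" for i in range(1, 101)]           # hypothetical run
relevant = {"doc2", "doc5", "doc40", "doc77", "doc90"}  # hypothetical qrels

print("P@10       =", precision_at(10, ranking, relevant))   # 0.2
print("recall@100 =", recall_at(100, ranking, relevant))     # 1.0
# A run that is robust over user models would have to score well on both.
```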

  9. Robustness between different collections
  • Similar to the human-assisted runs at the last robust track at TREC
  • Use relevance assessments from one collection to find documents in another
    • Routing
  • Can easily be done cross-lingually
  • Is it about robustness?
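Such a routing run could look roughly like the Rocchio-style sketch below: documents judged relevant on collection A are averaged into a term-weight profile, which then ranks the documents of collection B. Raw term counts stand in for proper TF-IDF vectors, all documents are toy examples, and for the cross-lingual case the profile (or the judged documents) would first have to be translated:

```python
from collections import Counter

def profile(relevant_docs):
    """Centroid of the judged-relevant documents as averaged term counts."""
    total = Counter()
    for doc in relevant_docs:
        total.update(doc.split())
    return {t: c / len(relevant_docs) for t, c in total.items()}

def score(doc, prof):
    """Dot product between a document's term counts and the profile."""
    counts = Counter(doc.split())
    return sum(counts[t] * w for t, w in prof.items())

judged_relevant_A = ["oil spill tanker alaska", "tanker oil cleanup"]  # toy
collection_B = ["alaska oil tanker accident", "football cup final"]   # toy

prof = profile(judged_relevant_A)
for doc in sorted(collection_B, key=lambda d: score(d, prof), reverse=True):
    print(round(score(doc, prof), 2), doc)
```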

  10. Enforce topic-specific treatment
  • Force participants to submit runs
    • where the topic set is split into two or more subsets
    • which are treated with specific methods
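An enforced split could be realized as a simple dispatch: some criterion partitions the topic set, and each subset is run through its own configuration. Everything below (the length-based criterion and the two method stubs) is a hypothetical placeholder, not anything the track specified:

```python
def run_topic(topic, predict_hard, method_hard, method_easy):
    """Route a topic to one of two retrieval configurations."""
    return method_hard(topic) if predict_hard(topic) else method_easy(topic)

predict_hard = lambda t: len(t.split()) <= 2         # hypothetical criterion
method_hard  = lambda t: f"expanded query for: {t}"  # e.g. query expansion
method_easy  = lambda t: f"plain query for: {t}"     # e.g. baseline run

for topic in ["olympic games", "oil spill alaska coast"]:
    print(run_topic(topic, predict_hard, method_hard, method_easy))
```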

  11. Repeat a similar robust task?
  • People want to work on a task for at least a second year
  • Fewer sub-tasks
    • No bilingual sub-task
    • Drop German (data inconsistencies)
  • New topic split?

  12. Thanks for your attention! I am looking forward to the discussion.
