240 likes | 398 Views
Impressions of 10 years of CLEF. Donna Harman Scientist Emeritus National Institute of Standards and Technology. TREC to CLEF. TREC ran the first cross-language evaluation in TREC-6 (1997), using Swiss newswire that had been obtained by the Swiss Federal Institute of Technology, Zurich
E N D
Impressions of 10 years of CLEF Donna Harman Scientist Emeritus National Institute of Standards and Technology
TREC to CLEF • TREC ran the first cross-language evaluation in TREC-6 (1997), using Swiss newswire that had been obtained by the Swiss Federal Institute of Technology, Zurich • The languages were English, French and German and there were 13 participating groups • NIST hired two “tri-lingual” assessors to build the topics and to do the assessments • The track was well received, good research happened, but the topic building/assessing was a disaster!!
TREC-7 and 8 • For this reason, the next two years of the CLIR task was run in a distributed manner across 4 groups, with each group building topics in their own language and also doing the relevance assessments. • The groups were • NIST (English) • University of Zurich (French) • Social Science Information Centre, Bonn and the University of Koblenz (German) • CNR, Pisa (Italian)
The move to Europe • This distributed method worked fine but it became obvious after three years that the U.S. participants did not have the background (or maybe the interest) to progress much further and it was decided to move to Europe • Peter Schäuble convinced Carol Peters to take this on and CLEF started in 2000.
Monolingual, Bilingual and Multilingual Research • Monolingual runs in all of these languages, adding new sources of linguistic tools such as stemmers, decompounders, etc. • Bilingual runs across many languages, including some “unusual” pairs; new sources of bilingual data often found • Multilingual runs that require merging of results across all target languages
Research groups across Europe and around the World • 2000: 20 groups from Netherlands, Switzerland, Germany, France, Italy, USA, UK, Canada, Spain, Finland • 2001: 34 groups, adding Hong Kong, Thailand, Japan, Taiwan, Sweden • 2002: 37 groups including 9 new ones • 2003: 42 groups including 16 new ones, & group from Ireland • 2004: 26 groups including 2 new groups from Portugal • 2005: 23 groups including 5 new groups and 3 new countries (Indonesia, Russia and Hungary) • 2006: 25 groups including 10 new groups and 2 new countries (Brazil and India) • 2007: 22 groups including groups from the Czech Republic
CLEF 2000 – 2009Participation per Track CLEF 2009 Workshop 30 September – 2 October, Corfu, Greece
ImageCLEF (2003) • Data was 30,000 photographs from well-known Scottish photographers; all photographs have manual English captions • 50 topics were built in English and then translated by native speakers to Spanish, Dutch, German, French and Italian • The task was to work from these languages to the target English captions • 4 groups participated
ImageCLEF (2004) • Continued work with Scottish photos • Added medical retrieval task using 8,725 medical images, such as scans and X-rays • Most of these images had case notes in French or English • Topics included query-by-example images for the medical collection • 18 groups participated using both CLIR and image retrieval methodologies
ImageCLEF (2005) • Continued work with Scottish photos; 19 groups participated using 26 different topic languages • More medical images: 50,000 images total with annotations in “assorted” languages (English, French and German) • Medical topics more complex, specifically targetted for either image retrieval, text retrieval, or a need for both • Added automatic annotation task for the medical images
ImageCLEF (2006) • ImageCLEFphoto: new collection of 20,000 “touristic” photos with captions in English, German and Spanish • 60 topics for this collection based on a real log file • 36 participating groups using 12 topic languages • 20 groups for a general annotation task (21 classes) • Same 50,000 medical images, but medical topics taken from log files, however specifically selected to cover two of four “categories” • Anatomic region shown in image • Image modality (x-ray, MRI, etc.) • Pathology or disease shown in image • Abnormal visual observation
QA@CLEF • 2003: 8 groups did factoid question answering for 200 questions (translated into 5 languages) against target text in 3 languages • 2004: 18 groups did factoid and definition questions (700 questions), with the questions in 9 languages against target text in 7 languages
QA@CLEF • 2005: 24 groups did factoid, definition, and NIL questions, with the questions in 10 languages against target text in 9 languages • 2006: 30 groups did “snippet” answers for 200 questions in 11 source languages; also 2 pilot tasks • Answer validation and WiQA • 2007: “main task” adding Wikipedia data plus an answer validation plus a pilot for QA using spoken questions
iCLEF • 2002,2003: user studies on CLIR document selection (e.g., new interfaces to help with translation) or new interactive tools of doing CLIR • 2004: interactive cross-language QA • 2005: continued interactive QA plus interactive search of Scottish photos • 2006, 2007: interactive Flickr searching
Structured Data Retrieval • TRECs 7 and 8 CLIR worked with GIRT-2, including over 50,000 structured documents (bibliographic data including manual index terms); plus an English-German thesaurus • 9 years of CLEF saw this data grow to 151,319 German documents (also translated to English), plus 20,000 English documents and 145,802 Russian documents; plus vocabulary mappings across this collection • CLEF 2008 used the TEL collection of library catalogs in English, French and German
WebCLEF • 2005: EuroGOV collection of over 3.5 million pages in 20 languages; tasks were known-item topics, homepages and named pages; 11 teams built topics, did assessments and did experiments • 2006: 1,940 known item topics, some automatically generated • 2007 and 2008: 30 specially-built topics to mimic building a Wikipedia article or survey
GeoCLEF • 2005: English and German CLEF newspapers with 25 “old” CLEF topics retrofitted with geospatial information • 2006: 25 topics in 5 languages with target source documents in English, German, Portuguese and Spanish • 2007: much deeper analysis of what actually makes a topic geospatial! • 2008: included a Wikipedia pilot task
Assorted@CLEF • speech retrieval • 2003, 2004 using topics in 6 languages against the English TREC speech data • 2005, 2006, 2007 searching spontaneous speech in Czech and English • Non-European languages • 2007: topics in Bengali, Hindi, Marathi, Tamil and Telugu against the English data • 2008: topics and documents in Persian
CLEF contributions • First large test collections for 13 European languages • Extensive research in IR within and across these languages • Training for many new IR groups across Europe (and elsewhere) • Lots of new linguistic tools developed as a result of CLEF
More CLEF contributions • The first major evaluation of image retrieval, both in the photographic area and in the medical area • This led not only to these major image collections but to new groups and lots of excellent research in image retrieval • First step in geospatial retrieval • Major spontaneous speech retrieval effort, • ETC. ETC. ETC