Welcome to CLEF 2008
Carol Peters, ISTI-CNR, Pisa, Italy
CLEF Objectives
• Stimulate the development of multilingual IR systems (for European languages!)
• Create a CLIR/MLIA community
• Construct publicly available test-suites
• Conduct annual evaluation campaigns
• Design tracks/tasks to meet emerging needs and to stimulate research in the "right" direction
Objective: truly multilingual/multimedia systems
CLEF 2008 Workshop, Aarhus, Denmark, 17-19 September 2008
Come to CLEF – and see Europe!
• CLEF 2000 Lisbon
• CLEF 2001 Darmstadt
• CLEF 2002 Rome
• CLEF 2003 Trondheim
• CLEF 2004 Bath
• CLEF 2005 Vienna
• CLEF 2006 Alicante
• CLEF 2007 Budapest
• CLEF 2008 Aarhus
CLEF 2008 Coordination
CLEF is coordinated by the Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Pisa.
The following institutions are contributing to the organisation of the different tracks of the CLEF 2008 campaign:
• Athena Research Center, Greece
• Business Information Systems, U. Applied Sciences Western Switzerland, Sierre, Switzerland
• Centre for Evaluation of Human Language & Multimodal Communication Technologies (CELCT), Italy
• Centrum voor Wiskunde en Informatica, Amsterdam, NL
• Computer Science Department, U. Basque Country, Spain
• Computer Vision and Multimedia Lab, U. Geneva, CH
• Data Base Research Group, U. Tehran, Iran
• Dept. of Computer Science, U. Indonesia
• Dept. of Computer Science & Medical Informatics, RWTH Aachen U., Germany
• Dept. of Computer Science and Information Systems, U. Limerick, Ireland
• Dept. of Medical Informatics and Clinical Epidemiology, Oregon Health and Science U., USA
• Dept. of Information Engineering, U. Padua, Italy
• Dept. of Information Science, U. Hildesheim, Germany
• Dept. of Information Studies, U. Sheffield, UK
• Dept. Medical Informatics, U. Hospitals and University of Geneva, Switzerland
• Evaluations and Language Resources Distribution Agency, Paris, France
• German Research Centre for Artificial Intelligence (DFKI)
• GESIS-IZ Social Science Information Centre, Germany
• Information and Language Processing Systems, U. Amsterdam, The Netherlands
• Information Science, U. Groningen, The Netherlands
• Institute of Computer Aided Automation, Vienna University of Technology, Austria
• Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), Orsay, France
• U. Nacional de Educación a Distancia, Madrid, Spain
• Linguateca, Sintef, Oslo, Norway
• Linguistic Modelling Lab., Bulgarian Acad. Sci.
• Microsoft Research Asia
• NIST, USA
• Research Computing Center of Moscow State U.
• Research Inst. Linguistics, Hungarian Acad. Sciences
• School of Computer Science and Mathematics, Victoria U., Australia
• School of Computing, DCU, Ireland
• TALP, U. Politècnica de Catalunya, Barcelona, Spain
• UC Data Archive and School of Information Management and Systems, UC Berkeley, USA
• U. "Alexandru Ioan Cuza", Iasi, Romania
CLEF Steering Committee
• Maristella Agosti, U. Padova, Italy
• Martin Braschler, Zurich, Switzerland
• Amedeo Cappelli, ISTI-CNR & CELCT, Italy
• Hsin-Hsi Chen, National Taiwan U., Taipei, Taiwan
• Khalid Choukri, ELRA/ELDA, Paris, France
• Paul Clough, University of Sheffield, UK
• Thomas Deselaers, RWTH Aachen University, Germany
• Giorgio Di Nunzio, U. Padova, Italy
• David A. Evans, Clairvoyance Corporation, USA
• Nicola Ferro, U. Padova, Italy
• Christian Fluhr, CEA-LIST, Fontenay-aux-Roses, France
• Norbert Fuhr, University of Duisburg, Germany
• Fredric C. Gey, U.C. Berkeley, USA
• Julio Gonzalo, LSI-UNED, Madrid, Spain
• Donna Harman, NIST, USA
• Gareth Jones, Dublin City University, Ireland
• Franciska de Jong, University of Twente, Netherlands
• Noriko Kando, NII, Tokyo, Japan
• Jussi Karlgren, SICS, Sweden
• Michael Kluck, German Institute for International and Security Affairs, Berlin, Germany
• Natalia Loukachevitch, Moscow State University, Russia
• Bernardo Magnini, ITC-irst, Trento, Italy
• Thomas Mandl, U. Hildesheim, Germany
• Paul McNamee, Johns Hopkins University, USA
• Henning Müller, University & University Hospitals of Geneva, Switzerland
• Douglas W. Oard, University of Maryland, USA
• Anselmo Peñas, LSI-UNED, Madrid, Spain
• Maarten de Rijke, University of Amsterdam, Netherlands
• Diana Santos, Linguateca, Sintef, Oslo, Norway
• Jacques Savoy, University of Neuchatel, Switzerland
• Peter Schäuble, Eurospider Information Technologies, Switzerland
• Richard Sutcliffe, University of Limerick, Ireland
• Max Stempfhuber, Informationszentrum Sozialwissenschaften Bonn, Germany
• Hans Uszkoreit, German Research Center for Artificial Intelligence (DFKI), Germany
• Felisa Verdejo, LSI-UNED, Madrid, Spain
• José Luis Vicedo, University of Alicante, Spain
• Ellen Voorhees, NIST, USA
• Christa Womser-Hacker, University of Hildesheim, Germany
CLEF 2008: Track Coordinators
• Ad Hoc: Abolfazl AleAhmad, Hadi Amiri, Eneko Agirre, Giorgio Di Nunzio, Nicola Ferro, Thomas Mandl, Nicolas Moreau, Vivien Petras
• Domain-Specific: Vivien Petras, Stefan Baerisch
• iCLEF: Paul Clough, Julio Gonzalo, Jussi Karlgren
• QA@CLEF: Danilo Giampiccolo, Anselmo Peñas, Pamela Forner, Iñaki Alegria, Corina Forăscu, Nicolas Moreau, Petya Osenova, Prokopis Prokopidis, Paulo Rocha, Bogdan Sacaleanu, Richard Sutcliffe, Erik Tjong Kim Sang, Alvaro Rodrigo, Jordi Turmo, Pere Comas, Sophie Rosset, Lori Lamel, Djamel Mostefa
• ImageCLEF: Allan Hanbury, Paul Clough, Thomas Arni, Mark Sanderson, Henning Müller, Thomas Deselaers, Thomas Deserno, Michael Grubinger, Jayashree Kalpathy-Cramer, William Hersh
• WebCLEF: Valentin Jijkoun, Maarten de Rijke
• GeoCLEF: Thomas Mandl, Fredric Gey, Giorgio Di Nunzio, Nicola Ferro, Ray Larson, Mark Sanderson, Diana Santos, Paula Carvalho
• VideoCLEF: Martha Larson, Gareth Jones
• INFILE: Djamel Mostefa
• DIRECT: Marco Dussin, Giorgio Di Nunzio, Nicola Ferro
CLEF 2008: Participating Groups
CLEF: Trend in Participation
Groups by region in 2008 (2007 figures in parentheses): Europe = 69 (51); N. America = 12 (14); Asia = 15 (14); S. America = 3 (1); Africa = 1 (0)
CLEF 2008 Tracks
• Multilingual textual document retrieval (Ad Hoc)
• Mono- and cross-language information retrieval on structured scientific data (Domain-Specific)
• Interactive cross-language retrieval (iCLEF)
• Multiple language question answering (QA@CLEF)
• Cross-language retrieval in image collections (ImageCLEF)
• Multilingual retrieval of web documents (WebCLEF)
• Cross-language geographical information retrieval (GeoCLEF)
Pilots:
• Cross-language Video Retrieval (VideoCLEF)
• Multilingual Information Filtering (INFILE)
No. of Participants per Track (2007 figures in parentheses)
• Ad Hoc: 26 (22)
• Domain-Specific: 6 (5)
• iCLEF: 6 (na)
• QA@CLEF: 29 (28)
• ImageCLEF: 42 (35)
• WebCLEF: 3 (4)
• GeoCLEF: 11 (13)
• plus VideoCLEF: 5; INFILE: 1
CLEF 2000 – 2008: Participation per Track
CLEF 2008: Test Collections
2000
• News documents in 4 languages
• GIRT German Social Science database
2008
• CLEF multilingual comparable corpus of more than 3M news docs in 15 languages: BG, CZ, DE, EN, ES, EU, FI, FR, HU, IT, NL, RU, SV, PT and Persian
• The European Library data in DE, EN, FR (>3M docs)
• GIRT-4 social science database in EN and DE; Russian ISISS collection; Cambridge Sociological Abstracts
• Online Flickr database
• IAPR TC-12 photo database (20,000 images, captions in EN, DE)
• ARRS Goldminer database (200,000 medical images)
• IRMA: 10,000 images for automatic medical image annotation
• INEX Wikipedia image collection (150,000 images)
• Dutch / English documentary TV videos
• Agence France-Presse (AFP) newswire in Arabic, French & English
CLEF 2008: Highlights
• Big rise in participation: 100 groups in 2008 (81 in 2007); workshop: >150 participants (115 in 2007)
• Expansion of test-suites
• Ad Hoc: new collections – TEL & Persian – new tasks
• Domain-specific holds its own!
• Enormous success of ImageCLEF
• Confirmation of interest in QA@CLEF, GeoCLEF
• iCLEF – lots of interest
• WebCLEF & INFILE – what happened???
• CLEF 2008 Proceedings published
CLEF 2008 Proceedings
Advances in Multilingual and Multimodal Information Retrieval
8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Budapest, Hungary, September 19-21, 2007, Revised Selected Papers
Series: Lecture Notes in Computer Science, Vol. 5152
Peters, C.; Jijkoun, V.; Mandl, Th.; Müller, H.; Oard, D.W.; Peñas, A.; Petras, V.; Santos, D. (Eds.)
2008, XXI, 922 p. With online files/update. Softcover. ISBN: 978-3-540-85759-4
CLEF 2000 – 2008 Results
• Creation of a strong CLIR research community (increase in participation over the years)
• Promotion of research in key areas (multilingual IR; results merging; cross-language access in multimedia; interactive query formulation and results presentation)
• Encouraged take-up of techniques/resources between research groups
• Stimulated synergy between researchers from different areas (IR, NLP, Image Processing, User Interfaces, …)
• Literature: Working Notes, Proceedings and other publications report state of the art plus emerging trends
• Production of language resources; test suites
Points for Discussion
• What new tasks/evaluation methodologies are needed to address more advanced information requirements?
• How can we best reduce the gap between research and application communities?
• Who are the users?
Does CLEF have a future?
CLEF & TrebleCLEF
• CLEF is an activity of the TrebleCLEF Coordination Action under the Seventh Framework Programme of the European Commission.
• TrebleCLEF organises a set of dissemination activities in the multilingual information access field.
Treble-CLEF
The CLEF research results have led to the development of a new generation of multilingual retrieval system prototypes, BUT there is a lack of technology transfer.
Treble-CLEF extends the CLEF activity by:
• continuing to promote MLIA R&D via evaluation campaigns;
• providing a consistent training activity: tutorials, workshops, summer school;
• producing best practice guidelines for system implementation;
• providing resources to encourage multilingual system development.
www.trebleclef.eu
Approach
• Evaluation
  • test collections and laboratory evaluation
  • user evaluation
  • log analysis
• Best Practices & Guidelines
  • system-oriented aspects of MLIA applications
  • collaborative user studies
  • user-oriented aspects of MLIA interfaces
• Dissemination and Training
  • tutorials
  • workshops
  • summer school
Treble-CLEF Events
• Workshop on Novel Methodologies for Evaluation in Information Retrieval, ECIR'08, Glasgow, Scotland
• Best Practices Workshops
  • Workshop on Best Practices for the Development of Multilingual Information Access Systems, Segovia, Spain, June 2008
  • Workshop on Best Practices for System Developers: Bringing Multilingual Information Access to Operational Systems, Winterthur, Switzerland, October 2008
  • Workshop on Best Practices in Query Log Analysis, Spring 2009
• MLIA Technology Day – Dissemination of results of Best Practices Workshops, Fall 2009
Workshop Memory Stick
• Workshop Programme
• List of Participants
• Book of Abstracts
• CLEF 2008 Questionnaire
• Map at Workshop Venue
• Social Dinner - 17 September 2008
• TrebleCLEF
• Other
Treble-CLEF Summer School
Focus: how to build effective MLIA systems and how to evaluate them
The program will cover the following areas:
• Multilingual Text Processing (language-specific tokenization, indexing, stemming)
• Cross-Language Information Retrieval (approaches and technologies used for CLIR)
• Multilingual Information Retrieval and MultiModality (querying, retrieving & presenting results from a multilingual/multimedia collection)
• System Architectures and Multilinguality (theory & practice)
• Resources for MLIA (information on language processing tools and linguistic resources)
• Best Practices in User-oriented MLIA Evaluation for Multilingual Systems and Components
CLEF 2008 Questionnaire
The aim of the questionnaire is to collect information on the current needs of MLIA system developers in terms of applications, resources and evaluation activities.
Fill in the questionnaire online at www.trebleclef.eu/clef_2008_questionnaire.php
CLEF 2008
Thank you for your attention and ENJOY THE Workshop!