150 likes | 368 Views
TREC Video Retrieval Evaluation TRECVID. Paul Over* Alan Smeaton (Dublin City University) George Awad* Wessel Kraaij (TNO, Radboud University Nijmegen) Lori Buckland* Darrin Dimmick * Georges Quénot ( Laboratoire d’Informatique de Grenoble)
E N D
TREC Video Retrieval EvaluationTRECVID Paul Over* Alan Smeaton (Dublin City University) George Awad* Wessel Kraaij (TNO, Radboud University Nijmegen) Lori Buckland* Darrin Dimmick* Georges Quénot (Laboratoired’Informatique de Grenoble) Jonathan Fiscus** Stephanie Strassel+ Brian Antonishek** Amanda Morris+ Martial Michel^ et al * Retrieval Group / ** Multimodal Information Group Information Access Division Information Technology Laboratory NIST ^ Systems Plus + Linguistic Data Consortium Rockville, MD
What is TRECVID? Workshop series (2001 – present) http://trecvid.nist.gov to promote research/progress in content-based video analysis/exploitation Foundation for large-scale laboratory testing Forum for the • exchange of research ideas • discussion of approaches – what works, what doesn’t, and why. Focus: content-based approaches • search / detection / summarization / segmentation / … Aims for realistic system tasks and test collections • unfiltered data • focus on relatively high-level functionality (e.g. interactive search) • measurement against human abilities Provides data, tasks, and uniform, appropriate scoring procedures TRECVID @ NIST
TRECVID Yearly Cycle Data Procurement Post-workshop experiments, final papers (Dec – Feb) Call for Participation (1. February) Task definitions complete (1. April) TRECVID Workshop (mid-November) Results analysis and workshop paper/presentation preparation Search topic, ground truth development System building & experimentation; Community contributions (shots, training data, ASR, MT, etc.) Results Evaluation TRECVID @ NIST
TRECVID’s Evolution …2003 2004 2005 2006 2007 2008 2009 2010 2011 New development or test data as added Data: (hours) English TV News BBC rushes S&V Shot boundaries ■■■■■■■■■■■■■■■■■■■■■■■■■■■■ ■■■■■■■■■■ Ad hoc search ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ Features/semantic indexing ■■■■■■■■■■■■■■■■■ ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ ■■■■■■■■■■■■■■■■■■■■■ Stories■■■■■■■■■■■■■ Camera motion ■■■■■■■ BBC rushes - - - - ■■■■■■■■■■■■■■■■■■■■■Summaries ■■■■■■■■■■■■ Copy detection - - - - - - - - - - - - - - ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ Surveillance events - - - - - - - - - - - - ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ Known-item search- - - - - - - - - - - - - - - - - - - - -■■■■■■■■■■■■■■■■■■■■■ Instance search pilot - - - - - - - - - - - - - - - - - - - ■■■■■■■■■■ ■■■■■■■■■■ Multimedia event detection (MED) - - - - - - - - - - - - - - - -■■■■■■■■■■■■■■■■■■■■■ Tasks: Participanting teams: TRECVID @ NIST
TRECVID 2010 Tasks and Data TRECVID @ NIST
TV2010 Finishers Aalto University School of Science and Technology Aristotle University of Thessaloniki Asahikasei Co. AT&T Labs - Research Beijing Jiaotong University Beijing University of Posts and Telecom.-MCPRL Brno University of Technology Carnegie Mellon University - INF Chinese Academy of Sciences - MCG City University of Hong Kong Columbia University / UCF Computer Research Inst. of Montreal DFKI-MADM Dublin City University EURECOM Florida International University France Telecom Orange Labs (Beijing) Fudan University Fuzhou University Hungarian Academy of Sciences IBM T. J. Watson Research Center / Columbia Informatics and Telematics Inst. INRIA-TEXMEX INRIA-willow Inst. de Recherche en Informatique de Toulouse - Equipe SAMoVA TRECVID @ NIST
TV2010 Finishers Inst. for Infocomm Research Istanbul Technical University JOANNEUM RESEARCH KB Video Retrieval KDDI R&D Labs and SRI International Laboratoire d'Informatique Fondamentale de Marseille Laboratoire d'Informatique de Grenoble for IRIM LSIS / UMR CNRS & USTV Mayachitra, Inc. Nanjing University National Chung Cheng University National Inst. of Informatics National Taiwan University National University of Singapore NHK Science and Technical Research Laboratories Nikon Corporation NTNU and Academia Sinica NTT Communication Science Laboratories-CSL NTT Communication Science Laboratories-NII NTT Communication Science Laboratories-UT Oxford/IIIT Peking University-IDM Quaero consortium Queen Mary, University of London Ritsumeikan University TRECVID @ NIST
TV2010 Finishers Shandong University SHANGHAI JIAOTONG UNIVERSITY-IS Simon Fraser University Sun Yat-sen University - GITL Telefonica Research Tianjin University TNO ICT - Multimedia Technology Tokushima University Tokyo Inst. of Technology + Georgia Inst. of Technology Tsinghua University-IMG TUBITAK - Space Technologies Research Inst. Universidad Carlos III de Madrid University of Amsterdam University of Brescia University of Chile University of Electro-Communications University of Illinois at Urbana-Champaign & NEC Labs.America University of Klagenfurt University of Marburg University of Sfax Waseda University Xi'an Jiaotong University York University TRECVID @ NIST
TV2010 Finishers TRECVID @ NIST
Some impacts … • Continuing improvement in feature detection (automatic tagging) • in the University of Amsterdam’s MediaMill system • Performance on 36 features doubled: 2006 –> 2009 • Within domain (train and test) MAP 0.22 -> 0.41 • Cross domains MAP 0.13 -> 0.27 • Bibliometric study of TRECVID’s scholarly impact: 2003 - 2009 • (Dublin City University & University College, Dublin) • 310 workshop papers • 2073 peer-reviewed journal/conference papers • 2010 RTI International economic impact study of TREC/TRECVID • “…for every $1 that NIST and its partners invested in TREC[/TRECVID], at least $3.35 to $5.07 in benefits accrued to IR [Information Retrieval] • researchers” TRECVID @ NIST
Brewster Kahle (Internet Archive's founder) and R. Manmatha (U. Mass, Amherst) suggested in December of 2008 that TRECVID take another look at the resources of the Archive. Cara Binder and Raj Kumar @ archive.org helped explain how to query and download automatically from the Internet Archive. Georges Quénot with Franck Thollard, Andy Tseng, BahjatSafadi from LIG and Stéphane Ayache from LIF shared coordination of the semantic indexing task and organized additional judging with support from the Quaero program Georges Quénot and Stéphane Ayache again organized a collaborative annotation of 130 features. Shin'ichi Satoh at NII along with Alan Smeaton and Brian Boyle at DCU arranged for the mirroring of the video data Support Funders: • National Institute of Standards and Technology (NIST) • Intelligence Advanced Research Projects Activity (IARPA) • Department of Homeland Security (DHS) • Contributors: • Colum Foley and Kevin McGuinness (DCU) helped segment the instance search topic examples and set up the oracle at DCU for interactive systems in the known-item search task. • The LIMSI Spoken Language Processing Group and VexSys Research provided ASR for the IACC.1 videos. • Laurent Joyeux (INRIA-Roquencourt) updated the copy detection query generation code. • MatthijsDouze from INRIA-LEAR volunteered a camcorder simulator to automate the camcording transformation for the copy detection task. • EmineYilmaz (Microsoft Research) and EvangelosKanoulas (U. Sheffield) updated their xinfAP code (sample_eval.pl) to estimate additional values and made it available. TRECVID @ NIST
Event detection at TRECVID before MED • No formal definition of event • Just one of many concept or search target types: objects, people, locations, events • Usually seen in various combinations with objects, people, locations • Often recognizable for a human from one frame • Run (2003-2009) against professionally produced video • Temporal nature posed problems for sparse sampling approaches (e.g., 1 keyframe per shot) TRECVID @ NIST
Event detection at TRECVID before MED Ad hoc event search topics: 2003 a basket being made 2003 an airplane taking off 2003 a helicopter in flight or on the ground 2003 a rocket or missile taking off 2003 a person diving into some water 2003 a locomotive approaching the viewer 2004 a street scene ,multiple pedestrians & vehicles in motion 2004 people and dogs walking together. 2004 fingers striking the keys on a keyboard 2004 people moving a stretcher 2004 a person hitting a golf ball that then goes into the hole 2004 a handheld weapon firing. 2004 a tennis player contacting the ball with the tennis racket. 2004 horses in motion 2004 skiers skiing a slalom course 2005 people shaking hands 2005 a goal being made in a soccer match 2006 emergency vehicles in motion 2006 people leaving or entering a vehicle 2006 soldiers, police, or guards escorting a prisoner 2006 George W. Bush, Jr. walking 2006 a greeting by at least one kiss on the cheek 2007 people walking up stairs 2007 a door being opened 2007 a person sitting at a keyboard typing or using a mouse 2007 a person talking on a telephone 2007 a train in motion 2007 people walking with one or more dogs 2007 a boat moves past 2007 a woman talking to the camera in an interview 2007 a road taken from a moving vehicle thru the front windshield 2007 one or more people playing musical instruments 2008 a person opening a door 2008 a road taken from a moving vehicle, looking to the side 2008 vehicles passing the camera 2008 people walking into a building 2008 a vehicle moving away from the camera 2008 a person on the street, talking to the camera 2008 waves breaking onto rocks 2008 a person pushing a child in a stroller or baby carriage 2008 a person talking on a telephone 2008 people riding a bicycle 2008 a person talking behind a microphone 2008 a person getting out of or getting into a vehicle 2008 people, singing and/or playing a musical instrument 2009 closeup of a hand, writing, drawing, coloring, or painting. 2009 people shaking hands 2009 person pointing. 2009 person playing a piano. 2009 a train in motion, seen from outside. TRECVID @ NIST
MED • Focus on events • Much more variety in the video data • Much more complexity in the events • MUCH more data • More to do - detect with future event recounting task (MER) in mind Significant leap forward required … TRECVID @ NIST