The TREC Conferences • http://trec.nist.gov • Ellen Voorhees
TREC Philosophy • TREC is a modern example of the Cranfield tradition • system evaluation based on test collections • Emphasis on advancing the state of the art from evaluation results • TREC’s primary purpose is not competitive benchmarking • experimental workshop: sometimes experiments fail!
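To make the Cranfield-style setup concrete, here is a minimal sketch of test-collection evaluation: a ranked run is scored against relevance judgments (qrels) using mean average precision. The data structures and toy topics/documents are illustrative assumptions, not official TREC formats; in practice participants score runs with trec_eval against the released qrels.

```python
# Minimal sketch of Cranfield-style evaluation: score a ranked run against
# relevance judgments (qrels). The in-memory formats and toy data below are
# simplified stand-ins for the real TREC run/qrels files.

def average_precision(ranked_docs, relevant):
    """Average precision for a single topic."""
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(run, qrels):
    """run: {topic: ranked list of doc ids}; qrels: {topic: set of relevant doc ids}."""
    scores = [average_precision(run.get(topic, []), rel) for topic, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0

if __name__ == "__main__":
    qrels = {"401": {"d3", "d7"}, "402": {"d1"}}          # hypothetical judgments
    run = {"401": ["d3", "d9", "d7"], "402": ["d4", "d1"]}  # hypothetical system output
    print(f"MAP = {mean_average_precision(run, qrels):.3f}")
```

The point of the abstraction is exactly this: once the judgments exist, any system's ranked output can be scored the same way, making results comparable across experiments.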
Cranfield at Fifty • Evaluation methodology is still valuable… • carefully calibrated level of abstraction • has sufficient fidelity to real user tasks to be informative • general enough to be broadly applicable, feasible, relatively inexpensive • …but is showing some signs of age • size is overwhelming our ability to evaluate • new abstractions need to carefully accommodate variability to maintain power
Evaluation Difficulties • Variability • despite stark abstraction, the user effect still dominates Cranfield results • Size matters • effective pooling has a corpus-size dependency • test collection construction costs depend on the number of judgments • Model coarseness • even slightly different tasks may not be a good fit • e.g., legal discovery, video features
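To illustrate why judgment costs scale the way they do, below is a hedged sketch of depth-k pooling as it is commonly described for TREC: the documents to be judged are the union of the top-k documents from every submitted run, so the number of judgments grows with the number of runs and with the pool depth needed to cover a larger corpus. The run names, topic ids, and depth value are made up for illustration.

```python
# Sketch of depth-k pooling: only documents in the pool are judged, and the
# pool is the union of the top-k documents from each submitted run.

def build_pool(runs, depth=100):
    """runs: {run_name: {topic: ranked list of doc ids}} -> {topic: set of doc ids to judge}."""
    pool = {}
    for ranking_by_topic in runs.values():
        for topic, ranked_docs in ranking_by_topic.items():
            pool.setdefault(topic, set()).update(ranked_docs[:depth])
    return pool

# Example: two toy runs on one topic; the pool to judge is the union of their top 2.
runs = {
    "runA": {"401": ["d1", "d2", "d3"]},
    "runB": {"401": ["d2", "d4", "d5"]},
}
print(build_pool(runs, depth=2))  # -> {'401': {'d1', 'd2', 'd4'}} (set order may vary)
```

On a much larger corpus, a fixed pool depth covers a smaller fraction of the potentially relevant documents, which is the corpus-size dependency the slide refers to.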
TREC 2009 • All tracks used some new, large document set • Different trade-offs in adapting evaluation strategy • tension between evaluating current participants’ ability to do the task and building reusable test collections • variety of tasks that are not simple ranked-list retrieval
ClueWeb09 Document Set • Snapshot of the WWW in early 2009 • crawled by CMU with support from NSF • distributed through CMU • used in four TREC 2009 tracks: Web, Relevance Feedback, Million Query, and Entity • Full corpus • about one billion pages and 25 terabytes of text • about half is in English • Category B • English-only subset of about 50 million pages (including Wikipedia) to permit wider participation
The TREC Tracks (timeline 1992–2009) • Personal documents: Blog, Spam • Retrieval in a domain: Chemical IR, Genomics • Answers, not documents: Novelty, QA, Entity • Searching corporate repositories: Legal, Enterprise • Size, efficiency, & web search: Terabyte, Million Query, Web, VLC • Beyond text: Video, Speech, OCR • Beyond just English: Cross-language, Chinese, Spanish • Human-in-the-loop: Interactive, HARD, Feedback • Streamed text: Filtering, Routing • Static text: Ad Hoc, Robust
TREC 2010 • Blog, Chemical IR, Entity, Legal, Relevance Feedback, and Web tracks continuing • Million Query merged with Web • New “Sessions” track: investigate search behavior over a series of queries (query series of length two for the first running in 2010)
TREC 2011 • Track proposals due Monday (Sept 27) • A new track on searching the free-text fields of medical records is likely