Planning for the TREC 2008 Legal Track



1. Planning for the TREC 2008 Legal Track
Douglas Oard, Stephen Tomlinson, Jason Baron

2. Agenda
• Track goals
• Deciding on a document collection
• “Beating Boolean”
• Handling nasty OCR
• Making the best use of the metadata
• Ad hoc task design
• Interactive task design
• Relevance feedback task design
• Other issues

3. Track Goals
• Develop a reusable test collection
  • Documents, topics, evaluation measures
• Foster formation of a research community
• Establish baseline results

4. Choosing a Collection
• FERC Enron (w/ attachments, full headers)
  • Somewhat larger than CMU
  • Email is the real killer app for E-discovery
• IIT CDIP version 1.0 (same as 2006/07)
  • We have 83 topics. Do we need more?
• State Department Cables
  • Task model would be FOIA, not E-discovery

5. TREC Topic Number: 1
• Title: Marketers or Traders of Electricity on the Financial Market
• Description: Identify Enron employees who bought and sold electricity on California’s financial (long-term sales) energy market, solely for the purpose of re-buying/re-selling this energy later for a profit.
• Narrative: A relevant document must at a minimum identify the name and email address of the marketer, as well as the Enron subsidiary to which he/she belonged. The marketer’s phone number would be helpful as well, to help analysis of the corresponding Enron voice dataset.
• Hint: Enron Power Marketing, Inc. (EPMI), Enron Energy Services, Inc. and Enron Energy Marketing Corporation all appear to have conducted long-term marketing services for Enron. This observation is based on the fact that Enron submitted information for all three of these subsidiaries in its reply to FERC’s data request 2 (DR2). (DR2 asked Enron to submit information about its short-term and long-term sales. Enron replied with data from these three subsidiaries.) (38, pp. 1-2, plus personal analysis.) It would be good, however, to know for sure which entities or persons did marketing at Enron.
• Query Possibilities (one is run in the sketch below):
  • (marketer or marketers or “Enron Power Marketing” or EPMI or “Enron Energy Services” or “Enron Energy Marketing Corporation”)
  • (marketer or marketers or “Enron Power Marketing” or EPMI or “Enron Energy Services” or “Enron Energy Marketing Corporation”) and (MW or KW or watt* or MwH or KwH)
    o This is to target electricity sales rather than natural gas sales. All the subsequent electricity queries can be similarly modified.
  • (marketer or marketers or EPMI) and (short or long)
    o As in having a long or short position in sales/purchases.
  • (marketer or marketers or EPMI) and (NYMEX or CBOT or “Mid-Columbia” or COB or “California-Oregon Border” or “Four Corners” or “Palo Verde” or EOL)
    o The electricity futures hubs were Mid-Columbia, COB, Four Corners, and Palo Verde, as best the author can tell. (85) NYMEX and CBOT ran these. (89; 15, p. 78)
    o EOL was the forward market trading place. (36, p. 3)
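
The query possibilities above are plain Boolean. As a concrete illustration, here is a minimal Python sketch (an assumption of this write-up, not the track's actual retrieval engine) that evaluates the second query possibility against raw document text; the word-boundary matching, phrase handling, and prefix-wildcard semantics are all simplifying assumptions.

```python
import re

def matches(doc_text: str, terms: list[str]) -> bool:
    """True if any term in the OR-group appears in the document.
    Quoted phrases are matched literally; a trailing '*' is treated
    as a prefix wildcard (e.g. 'watt*' matches 'wattage')."""
    text = doc_text.lower()
    for term in terms:
        term = term.lower()
        if term.endswith("*"):
            if re.search(r"\b" + re.escape(term[:-1]) + r"\w*", text):
                return True
        elif re.search(r"\b" + re.escape(term) + r"\b", text):
            return True
    return False

# The two AND-ed OR-groups of the second query possibility.
entities = ["marketer", "marketers", "Enron Power Marketing", "EPMI",
            "Enron Energy Services", "Enron Energy Marketing Corporation"]
units = ["MW", "KW", "watt*", "MwH", "KwH"]

def boolean_hit(doc_text: str) -> bool:
    return matches(doc_text, entities) and matches(doc_text, units)

print(boolean_hit("EPMI sold 500 MW forward at Palo Verde"))  # True
```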

6. Identity Modeling in Enron
[Figure: identity-resolution graph linking name and address variants of one employee, e.g. “susan m scott”, “susan scott”, “sue”, “suebob”, “sscott”, m..scott@enron.com, sscott5@enron.com]
• 82,084 addr-name links
• 3,151 addr-nickname links
• 19,708 addr-addr links
• 66,715 identity models
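
The counts above suggest the modeling step: raw links between names, nicknames, and addresses are merged until each cluster stands for one person, which is how 82,084 addr-name links can collapse into 66,715 identity models. Below is a minimal union-find sketch of that idea; the observation pairs are invented for illustration and are not the actual Enron link data.

```python
from collections import defaultdict

# Invented observation pairs linking a mention string (name, nickname,
# or address) to the email address it appeared with in a header.
observations = [
    ("susan m scott", "m..scott@enron.com"),
    ("sue", "m..scott@enron.com"),
    ("suebob", "sscott5@enron.com"),
    ("sscott", "sscott5@enron.com"),
    ("susan scott", "m..scott@enron.com"),
    ("susan scott", "sscott5@enron.com"),  # shared name merges the two addresses
]

def build_identity_models(obs):
    """Union-find over addresses: any two addresses seen with the same
    mention string are merged into one identity model."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    def union(a, b):
        parent[find(a)] = find(b)

    by_mention = defaultdict(list)
    for mention, addr in obs:
        by_mention[mention].append(addr)
    for addrs in by_mention.values():
        for other in addrs[1:]:
            union(addrs[0], other)

    models = defaultdict(set)
    for mention, addr in obs:
        models[find(addr)].update({mention, addr})
    return list(models.values())

for model in build_identity_models(observations):
    print(sorted(model))  # one line per resolved person
```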

7. Enron Identity Test Collections
• Enron-all
• Enron-subset
• Sager
• Shapiro

8. Example Document
Scanned OCR text (errors intact, as on the slide):
Philip Moxx's. U.S.A. x.dr~am~c. cvrrespoaa.aa Benffrts Departmext Rieh>pwna, Yfe&ia Ta: Dishlbutfon Data aday 90,1997. From: Lisa Fislla Sabj.csr CIGNA WeWedng Newsbttsr -Yntsre StratsU During our last CIGNA Aatfoa Plan meadng, tlu iasuo of wLetSae to i0op per'Irw+ng artieles aod discontinue mndia6 CIGNA Well-Being aawslener to om employees was a msiter of disanision . I Imvm done somme reaearc>>, and wanted to pruedt you with my Sadings and pcdiminary recwmmeadatioa for PM's atratezy Ieprding l4aas aewelattee* . I believe .vayone'a input is valusble, and would epproolate hoarlng fmaa aaeh of you on whetlne you concur with my reeommendatioa …
Metadata:
Title: CIGNA WELL-BEING NEWSLETTER - FUTURE STRATEGY
Organization Authors: PMUSA, PHILIP MORRIS USA
Person Authors: HALLE, L
Document Date: 19970530
Document Type: MEMO, MEMORANDUM
Bates Number: 2078039376/9377
Page Count: 2
Collection: Philip Morris

9. State Department Cables
• 791,857 records, 550,983 of which are full text

10. State Department Cables [figure]

11. Handling Nasty OCR
• Index pruning
• Error estimation
• Character n-grams (sketch below)
• Duplicate detection
• Expansion using a cleaner collection
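
Of the options above, character n-grams are the most mechanical to illustrate: a single mis-recognized character corrupts only the few grams that overlap it, so the remaining grams still match a clean query. A small sketch, using Dice overlap as one plausible scoring choice:

```python
def char_ngrams(text: str, n: int = 4) -> set[str]:
    """Overlapping character n-grams; whitespace is stripped so grams
    can also bridge word boundaries."""
    text = "".join(text.lower().split())
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def gram_overlap(a: str, b: str, n: int = 4) -> float:
    """Dice coefficient over n-gram sets: 2|A & B| / (|A| + |B|)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    return 2 * len(ga & gb) / (len(ga) + len(gb)) if ga or gb else 0.0

# The OCR damage in 'Newsbttsr' corrupts only the grams that touch it;
# the rest of the string still matches the clean form.
print(gram_overlap("CIGNA Well-Being Newsbttsr", "CIGNA Well-Being newsletter"))
```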

12. How to “Beat Boolean”
• Work from the reference Boolean result set?
  • Swap out low-ranked-in for high-ranked-out (sketch below)
• Relax Boolean somehow?
  • Cover density, proximity perturbation, …
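
The swap idea in the first bullet can be made concrete: hold the Boolean result set's size fixed, but trade its weakest members, as ordered by a ranked-retrieval system, for the strongest documents the Boolean query excluded. Any gain over the baseline is then attributable to the swap. A sketch, assuming the ranked system orders the full collection:

```python
def swap_boolean(boolean_set: set[str], ranking: list[str], k: int) -> set[str]:
    """Same result-set size as the reference Boolean run: drop the k
    Boolean hits the ranked system scores lowest, add the k highest-
    ranked documents the Boolean query missed. Assumes `ranking`
    orders every document in the collection."""
    ranked_in = [d for d in ranking if d in boolean_set]
    ranked_out = [d for d in ranking if d not in boolean_set]
    drop = set(ranked_in[-k:]) if k > 0 else set()   # weakest Boolean hits
    add = set(ranked_out[:k])                        # strongest Boolean misses
    return (boolean_set - drop) | add
```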

13. Using Metadata
• Title (term match)
• Author (social network)
• Bates number (sequence; sketch below)
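
Bates numbers are sequential production stamps, so “sequence” presumably means exploiting adjacency: documents numbered near a known relevant one were produced together and are natural expansion candidates. A sketch using the Bates format from the example document slide; treating the stamp as a plain integer counter is an assumption.

```python
def bates_neighbors(bates: str, window: int = 2) -> list[str]:
    """Documents with nearby Bates stamps were produced together, so
    neighbors of a relevant document are expansion candidates. The
    '2078039376/9377' start/end format follows the example slide;
    treating the stamp as one integer counter is an assumption."""
    start = bates.split("/")[0]
    width, n = len(start), int(start)
    return [str(n + d).zfill(width)
            for d in range(-window, window + 1) if d != 0]

print(bates_neighbors("2078039376/9377"))
# ['2078039374', '2078039375', '2078039377', '2078039378']
```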

14. Ad Hoc Task Design
• Evaluation measures
  • R@B? P@R? Index size?
  • Error bars / statistical significance testing
  • Limits on post-hoc use of the collection?
  • What are “meaningful” differences?
• Topic design
  • Negotiation transcript?
  • Inter-annotator agreement
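
For reference, R@B is recall at depth B, where B is the size of the topic's reference Boolean result set, and P@R is precision at depth R, where R is the number of known relevant documents. A sketch of both over a simple ranked list; the sampling and estimation machinery behind the error-bar discussion is omitted.

```python
def recall_at_B(ranking: list[str], relevant: set[str], B: int) -> float:
    """R@B: fraction of the relevant documents found in the top B,
    where B is the size of the topic's reference Boolean result set."""
    if not relevant:
        return 0.0
    return len(set(ranking[:B]) & relevant) / len(relevant)

def precision_at_R(ranking: list[str], relevant: set[str]) -> float:
    """P@R (R-precision): precision at depth R, where R is the number
    of relevant documents for the topic."""
    R = len(relevant)
    return len(set(ranking[:R]) & relevant) / R if R else 0.0
```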

15. Interactive Track Design
• Evaluation measure
  • Precision-oriented?
  • Recall-oriented?
• Effect of assessor disagreement

16. Relevance Feedback Task
• Evaluation measure
  • Residual recall at B_Residual? (sketch below)
• Two-stage feedback?
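
“Residual recall at B_Residual” is not spelled out on the slide; a common residual-collection reading is to remove the documents already supplied as feedback from both the ranking and the relevant set before measuring recall at the residual cutoff. A sketch under that assumption:

```python
def residual_recall(ranking: list[str], relevant: set[str],
                    judged: set[str], B_residual: int) -> float:
    """One plausible reading of 'residual recall at B_Residual':
    remove feedback documents ('judged') from both the ranking and
    the relevant set, then take recall at the residual cutoff."""
    residual_ranking = [d for d in ranking if d not in judged]
    residual_relevant = relevant - judged
    if not residual_relevant:
        return 0.0
    found = set(residual_ranking[:B_residual]) & residual_relevant
    return len(found) / len(residual_relevant)
```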

17. Some Open Questions
• Test collection reusability
  • Unbiased estimates? Tight error bars?
• Why can’t we beat Boolean???
  • Different strategies? Detailed failure analysis?
• Can we improve topic formulation?
  • Structured relevance feedback?
• Is OCR masking effects we need to see?
• Is it time for a new collection?
  • Must it be de-duped? Is metadata needed?
• Does Δscope invalidate the interactive task?
