This paper discusses structured queries and their effectiveness in the legal search domain, using the TREC 2007 Legal Track as a case study.
Structured Queries for Legal Search (TREC 2007 Legal Track) Yangbo Zhu, Le Zhao, Jamie Callan, Jaime Carbonell Language Technologies Institute, School of Computer Science, Carnegie Mellon University 11/06/2007
Agenda • Introduction • Main task – ad hoc search • Routing task – relevance feedback
What is legal search • Goal: retrieve all documents responsive to a production request. • Production request: describes a set of documents that the plaintiff forces the defendant to produce. • Recall-oriented: high risk (value) of missing (finding) important documents. • Sample request text: All documents discussing, referencing, or relating to company guidelines, strategies, or internal approval for placement of tobacco products in movies that are mentioned as G-rated. • Final query (Boolean tree from the slide's diagram): (guide OR strategy OR approval) AND ((family OR "G rated") W/5 (movie OR film))
Data set • 7 million business records from tobacco companies and research institutes • Metadata: title, author, organizations, etc. • OCR text: contains errors • 50 topics generated from four hypothetical complaints created by lawyers
Main task – Ad hoc search: Indri query formulation • Without Boolean constraint: #combine(ranking function) • With Boolean constraint: #filreq( #band(boolean constraint) #combine(ranking function) )
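As a rough illustration of how these two formulations can be assembled (not the authors' actual code; the helper names and the example constraint string are hypothetical), a minimal Python sketch:

```python
# Minimal sketch: assembling the two Indri formulations named above.
# The #combine / #band / #filreq usage follows the slide; function names
# and the example inputs are illustrative assumptions only.

def ranking_only(ranking_terms):
    """Ranked retrieval with no Boolean constraint."""
    return "#combine( " + " ".join(ranking_terms) + " )"

def boolean_filtered(boolean_constraint, ranking_terms):
    """Rank only documents that satisfy the Boolean constraint."""
    return ("#filreq( #band( " + boolean_constraint + " ) "
            + ranking_only(ranking_terms) + " )")

# Example terms from the sample G-rated-movies request.
terms = ["guide", "strategy", "approval", "family", "#1(G rated)", "movie", "film"]
print(ranking_only(terms))
# Placeholder constraint string; the real constraint is the translated Final Query.
print(boolean_filtered("guide strategy approval", terms))
```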
Boolean constraint • Translate the Final Query, (guide OR strategy OR approval) AND ((family OR "G rated") W/5 (movie OR film)), into Indri's #band Boolean filter.
Ranking functions • Bag of words: (guide strategy approval family G rated movie film) • Respect phrase operators: (guide strategy approval family #1(G rated) movie film) • Group synonyms together: (#syn(guide strategy approval) #syn(family #1(G rated)) #syn(movie film))
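To make the three variants concrete, here is a hedged Python sketch that generates them from the sample request's synonym groups; the grouping comes from the slide above, while the code itself is only an illustration, not the authors' implementation:

```python
import re

# Synonym groups for the sample request (taken from the slide above).
groups = [
    ["guide", "strategy", "approval"],
    ["family", "#1(G rated)"],      # #1(...) is Indri's exact-phrase operator
    ["movie", "film"],
]

def plain(term):
    """Turn '#1(G rated)' back into plain words for the bag-of-words variant."""
    m = re.fullmatch(r"#1\((.+)\)", term)
    return m.group(1) if m else term

flat = [t for g in groups for t in g]

bag_of_words  = "#combine( " + " ".join(plain(t) for t in flat) + " )"
with_phrases  = "#combine( " + " ".join(flat) + " )"
with_synonyms = "#combine( " + " ".join(
    "#syn( " + " ".join(g) + " )" for g in groups) + " )"

for q in (bag_of_words, with_phrases, with_synonyms):
    print(q)
```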
Experiments and findings • Boolean constraints improve recall and precision. • Structured queries outperform bag-of-words ones. • * B is the number of documents matching the Final Query; its average value is 5,000.
Per-topic performance (difference to the median of 29 manual runs), reported for est_RB and est_PB [per-topic charts not reproduced here]
Routing task of Legal Track 2007 • Structured queries are known to be hard to construct manually; not so with supervision. • Questions: Do weighted queries help? Do metadata and annotations help? • A definitive answer from Supervised Structured Query Construction
Structured query • #weight( w1 t1 w2 t2 … wn tn)
Supervised Structured Query Construction • Relevance feedback => supervised learning • Train a linear SVM with keyword and keyword.field features • SVM classifier: score(d) = Σ_i w_i f_i(d), where the per-term feature values f_i are chosen to be tf.idf/LM scores and the weights w_i are learned from the feedback documents • Retrieval: #weight( w1 t1 w2 t2 … ), applying the learned weights to the same per-term tf.idf/LM scores • Advantage: given enough training data, we know for sure whether one type of feature helps
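A minimal sketch of this pipeline, assuming bag-of-words count features, scikit-learn's LinearSVC, and that the learned coefficients are used directly as the #weight term weights; all of these are assumptions about one plausible realization, not the authors' system:

```python
# Sketch: turning relevance feedback into a weighted Indri query.
# Assumed details: plain term-count features (not the slide's tf.idf/LM scores),
# scikit-learn's LinearSVC, and learned coefficients reused as query weights.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

def build_weight_query(docs, labels, top_k=20):
    """docs: feedback documents; labels: 1 = relevant, 0 = non-relevant."""
    vec = CountVectorizer()            # keyword features; keyword.field features
    X = vec.fit_transform(docs)        # would need field-aware text, omitted here
    clf = LinearSVC().fit(X, labels)
    terms = vec.get_feature_names_out()
    weighted = sorted(zip(clf.coef_[0], terms), reverse=True)[:top_k]
    # Keep only positively weighted terms; #weight expects non-negative weights.
    pairs = " ".join(f"{w:.3f} {t}" for w, t in weighted if w > 0)
    return f"#weight( {pairs} )"

# Toy usage with made-up feedback documents:
docs = ["chocolate candy cigarettes marketing", "quarterly revenue report"]
print(build_weight_query(docs, [1, 0], top_k=5))
```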
Example Query • <RequestNumber>13</RequestNumber> • <RequestText>All documents to or from employees of a tobacco company or tobacco organization referring to the marketing, placement, or sale of chocolate candies in the form of cigarettes.</RequestText> • <FinalQuery>(cand! OR chocolate) w/10 cigarette!</FinalQuery>
Annotations • Named entity: e.g. bush.person • Sentence-level: e.g. violate.sent • Metadata: e.g. television.title • Feedback query: [example feedback query figure not reproduced here]
Performance • On 39 topics of Legal 2006 (2/3 of judged documents used for training, the rest for testing) • On 10 topics of the Legal 2007 routing task
Routing Conclusions • A principled way of constructing structured queries with annotations and query term weights • Answers from a supervised learning algorithm: weights help; annotations help less.
Thank you! Questions?