
Structured Queries for Legal Search



Presentation Transcript


    1. Structured Queries for Legal Search. TREC 2007 Legal Track. Yangbo Zhu, Le Zhao, Jamie Callan, Jaime Carbonell. Language Technologies Institute, School of Computer Science, Carnegie Mellon University. 11/06/2007

    2. Agenda: Introduction; Main task (ad hoc search); Routing task (relevance feedback).

    3. What is legal search? Goal: retrieve all documents responsive to a production request. Production request: a description of a set of documents that the plaintiff requires the defendant to produce. Recall-oriented: missing an important document carries high risk, and finding one carries high value.

    4. Data set: 7 million business records from tobacco companies and research institutes. Metadata: title, author, organizations, etc. OCR text: contains errors. Topics: 50, generated from four hypothetical complaints created by lawyers.

    5. Main task: ad hoc search. Indri query formulation. Without Boolean constraints: #combine(ranking function). With Boolean constraints: #filreq( #band(boolean constraint) #combine(ranking function) ).
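The two query shapes on this slide can be sketched as plain string construction. This is a minimal illustration, not the authors' code: the helper names `combine` and `with_boolean_filter` and the sample terms are hypothetical, and the Boolean constraint is assumed to be a flat list of required terms.

```python
def combine(terms):
    """Build the ranking part of the query: #combine(t1 t2 ...)."""
    return "#combine(" + " ".join(terms) + ")"


def with_boolean_filter(required_terms, ranking_query):
    """Wrap a ranking query in #filreq so that only documents matching
    the #band constraint are ranked.

    Assumption: the constraint is a flat conjunction of terms; real
    constraints translated from a Final Query may be nested.
    """
    constraint = "#band(" + " ".join(required_terms) + ")"
    return "#filreq(" + constraint + " " + ranking_query + ")"


# Hypothetical terms, for illustration only.
ranking = combine(["chocolate", "candy", "cigarette"])
print(ranking)
print(with_boolean_filter(["chocolate", "cigarette"], ranking))
```

Keeping the filter and the ranking function as separate pieces makes it easy to run the same ranking function with and without the constraint, which is the comparison the main task relies on.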

    6. Boolean constraint: translate the Final Query.

    7. Ranking functions. Bag of words: (guide strategy approval family G rated movie film). Respect phrase operators: (guide strategy approval family #1(G rated) movie film). Group synonyms together: (#syn(guide strategy approval) #syn(family #1(G rated)) #syn(movie film)).
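The three ranking-function variants differ only in how much query structure they keep. The sketch below generates all three from one grouped representation of the slide's example. The data layout (`GROUPS`, with tuples marking phrases) and the function names are assumptions made for illustration, and each variant is wrapped in #combine as slide 5 suggests.

```python
# Each concept group lists its alternatives; a tuple marks a phrase.
GROUPS = [
    ["guide", "strategy", "approval"],
    ["family", ("G", "rated")],
    ["movie", "film"],
]


def render(item, keep_phrases):
    """Render one alternative; a phrase becomes #1(...) when kept."""
    if isinstance(item, tuple):
        words = " ".join(item)
        return "#1(" + words + ")" if keep_phrases else words
    return item


def bag_of_words(groups):
    """Flatten everything into single terms."""
    return "#combine(" + " ".join(render(i, False) for g in groups for i in g) + ")"


def with_phrases(groups):
    """Keep phrases as #1 operators, but ignore the grouping."""
    return "#combine(" + " ".join(render(i, True) for g in groups for i in g) + ")"


def with_synonyms(groups):
    """Keep phrases and wrap each concept group in #syn."""
    syns = ["#syn(" + " ".join(render(i, True) for i in g) + ")" for g in groups]
    return "#combine(" + " ".join(syns) + ")"


for build in (bag_of_words, with_phrases, with_synonyms):
    print(build(GROUPS))
```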

    8. Experiments and findings: Boolean constraints improve recall and precision. Structured queries outperform bag-of-words queries.

    9. Per-topic performance (difference from the median of 29 manual runs). Metric shown: est_RB.

    10. Routing task of Legal Track 2007. Structured queries are known to be hard to construct, but not with supervision. Questions: Do weighted queries help? Do metadata and annotations help? Supervised structured query construction gives a definitive answer.

    11. Structured query: #weight( w1 t1 w2 t2 … wn tn )

    12. Supervised structured query construction. Relevance feedback => supervised learning. Train a linear SVM with keyword and keyword.field features. SVM classifier: fi are the training weights for terms, chosen to be tfidf/LM scores. Retrieval: #weight( w1 t1 w2 t2 … ), where fi are the tfidf/LM scores for terms. Advantage: given enough training data, we know for sure whether a given type of feature helps.
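Once a linear classifier has been trained, turning its weight vector into a #weight query is mostly formatting. The sketch below assumes the learned weights are already available as a dictionary keyed by keyword or keyword.field feature. The weights shown are made up, and the choices to drop non-positive weights and to keep only the top `top_k` terms are illustrative assumptions, not something the slides state.

```python
def weights_to_query(weights, top_k=10):
    """Format learned feature weights as an Indri #weight query.

    weights: dict mapping a feature name (keyword or keyword.field)
             to its learned weight.
    Assumptions: non-positive weights are dropped and only the top_k
    largest weights are kept.
    """
    positive = [(feat, w) for feat, w in weights.items() if w > 0]
    positive.sort(key=lambda fw: fw[1], reverse=True)
    parts = ["{:.3f} {}".format(w, feat) for feat, w in positive[:top_k]]
    return "#weight(" + " ".join(parts) + ")"


# Hypothetical weights, standing in for a trained linear SVM.
learned = {
    "chocolate": 1.25,
    "candy": 0.8,
    "cigarette.title": 0.5,
    "memo": -0.3,
}
print(weights_to_query(learned))
# prints: #weight(1.250 chocolate 0.800 candy 0.500 cigarette.title)
```

Because field-restricted terms are just additional features, comparing a model trained with and without them is one way to read the slide's claim that supervision can tell whether a type of feature helps.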

    13. Example Query <RequestNumber>13</RequestNumber> <RequestText>All documents to or from employees of a tobacco company or tobacco organization referring to the marketing, placement, or sale of chocolate candies in the form of cigarettes.</RequestText> <FinalQuery>(cand! OR chocolate) w/10 cigarette!</FinalQuery>
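A topic like the one above carries three fields that later stages need: the request number, the natural-language request text, and the Boolean Final Query. A minimal sketch for pulling them out with regular expressions follows. The function name `parse_topic` is hypothetical, and the sketch assumes each tag appears at most once and is not nested.

```python
import re

TOPIC = (
    "<RequestNumber>13</RequestNumber> "
    "<RequestText>All documents to or from employees of a tobacco company "
    "or tobacco organization referring to the marketing, placement, or sale "
    "of chocolate candies in the form of cigarettes.</RequestText> "
    "<FinalQuery>(cand! OR chocolate) w/10 cigarette!</FinalQuery>"
)


def parse_topic(text):
    """Return the three topic fields as a dict (None if a tag is missing)."""
    fields = {}
    for tag in ("RequestNumber", "RequestText", "FinalQuery"):
        match = re.search(r"<{0}>(.*?)</{0}>".format(tag), text, re.DOTALL)
        fields[tag] = match.group(1).strip() if match else None
    return fields


topic = parse_topic(TOPIC)
print(topic["RequestNumber"])
print(topic["FinalQuery"])
```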

    14. Annotations. Feedback query:

    15. Performance

    16. Routing conclusions. A principled way of constructing structured queries from annotations and query term weights. Answers from a supervised learning algorithm: weights help; annotations help less.

    17. Thank you! Questions?
