How to Make Manual Conjunctive Normal Form Queries Work in Patent Search

Le Zhao and Jamie Callan, Language Technologies Institute, School of Computer Science, Carnegie Mellon University

Technology Survey Task @ Chem • Document Collection • 1.3 million patents + 0.18 million scientific articles • Tend to be long, have XML field structure • Topics • 6 topics (last year only 2 groups submitted runs, not reusable) • About use/detection of chemicals (in certain applications) • Similar to Ad hoc retrieval queries

Example Topic: TS-20 • <title>tests for HCG hormone</title><narrative>The hormone Human Chorionic Gonadotrophin (HCG) is produced when a women becomes pregnant. Tests are usually carried out by analysing blood or urine. We are looking for articles and patents on these pregnancy test kits or the chemical tests used to produce them.</narrative><details><chemicals>Human Chorionic Gonadotrophin OR HCG</chemicals><condition>pregnancy</condition><target>Human Chorionic Gonadotrophin OR HCG</target></details>
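As a rough illustration of how the topic fields could be pulled apart before weighting or combining them, here is a minimal Python sketch. The wrapping <topic> element and the names TOPIC_TS20 and topic_fields are assumptions made for this example, not part of the track's tooling; the narrative is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Title and details copied from topic TS-20; the narrative is omitted for brevity.
TOPIC_TS20 = (
    "<title>tests for HCG hormone</title>"
    "<details>"
    "<chemicals>Human Chorionic Gonadotrophin OR HCG</chemicals>"
    "<condition>pregnancy</condition>"
    "<target>Human Chorionic Gonadotrophin OR HCG</target>"
    "</details>"
)


def topic_fields(topic_xml):
    """Map each non-empty topic field to its list of OR-separated alternatives."""
    # The fragment has no single root element, so wrap it in a
    # hypothetical <topic> element before parsing.
    root = ET.fromstring("<topic>" + topic_xml + "</topic>")
    fields = {}
    for elem in root.iter():
        text = (elem.text or "").strip()
        if elem.tag != "topic" and text:
            fields[elem.tag] = [alt.strip() for alt in text.split(" OR ")]
    return fields


fields = topic_fields(TOPIC_TS20)
```

Each field then holds its own list of alternatives, which is one convenient starting shape for either a per-field weighted query or a manually built CNF query.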

Our Runs • Automatic Queries • Unweighted bag-of-words baseline • Weighting and combining words from different query fields • Manual Queries • Interactive search using Boolean CNF queries • (test OR check OR detection OR detect) AND (HCG OR “Human Chorionic Gonadotrophin” OR “Chorionic Gonadotropin” OR Choriogonadotropin OR Choriogonin) • Effective; used by lawyers and librarians, and in medical search and IR • Built with thesauri (MeSH etc.) and interaction (checking top-ranked results)
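To make the CNF structure concrete, the example query can be written as a list of conjuncts, each a list of OR-ed synonyms, and checked against a document with plain Boolean matching. This is a minimal Python sketch, not the retrieval system used for the runs; CNF_QUERY, tokenize, contains_phrase and matches_cnf are names invented here, and matching is on exact lowercase tokens with no stemming.

```python
import re

# A CNF query: every conjunct (inner list) must be satisfied (AND),
# and a conjunct is satisfied by any one of its alternatives (OR).
CNF_QUERY = [
    ["test", "check", "detection", "detect"],
    ["HCG", "Human Chorionic Gonadotrophin", "Chorionic Gonadotropin",
     "Choriogonadotropin", "Choriogonin"],
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(tokens, phrase_tokens):
    """True if phrase_tokens occurs as a contiguous run inside tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))


def matches_cnf(text, cnf):
    """True if the text satisfies every conjunct of the CNF query."""
    tokens = tokenize(text)
    return all(
        any(contains_phrase(tokens, tokenize(alt)) for alt in conjunct)
        for conjunct in cnf
    )
```

A document that mentions only HCG, with no word from the first conjunct, is rejected; adding a synonym to a conjunct can only widen the set of matching documents, which is why synonym identification drives recall.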

Lemur CGI • Identify synonyms • 0.5 hours per topic

Results at Large (xinfAP) • Not much difference on average • Worst manual queries have reasonable AP • Manual queries lower some high-AP topics slightly • Figure credit: Mihai Lupu

Observations • Weighting different query fields helped. • Boolean CNF query (manual interaction) • Good • Expressive • Helps a lot for hard (low AP) queries • Bad • Takes time & care to create & interact • Manual error in formulating those queries • Phrase or window restrictions improve top precision, but destroy lower-level recall/precision • Difficult to identify from top-ranked results; new tools needed
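The trade-off with window restrictions can be seen in a small Python sketch of an unordered window check: a document passes only if one occurrence of every term fits inside a span of a given number of tokens. The function names and the sample sentence are made up for illustration, and this is not the operator implementation of any particular search engine.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def positions(tokens, term):
    """Token offsets at which the term occurs."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def within_window(text, terms, window):
    """True if one occurrence of every term fits in a span of `window` tokens."""
    tokens = tokenize(text)
    pos_lists = [positions(tokens, t.lower()) for t in terms]
    if any(not p for p in pos_lists):
        return False  # a term is missing, so plain AND already fails
    # Try every combination of one position per term (fine for short texts).
    return any(max(c) - min(c) < window for c in product(*pos_lists))


# Hypothetical document: "test" and "HCG" are nine tokens apart.
DOC = "The test strip changes colour when the urine sample contains HCG."
```

With a window of 5 tokens the sample sentence is rejected even though it is about an HCG test, while a window of 20 accepts it. Tight windows keep only documents where the terms sit close together, which tends to help the top of the ranking but silently drops relevant documents further down.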

Comparisons with Best Runs • Fraunhofer-SCAI • Semantic search (similar to our CNF queries) • IPC classification filtering • Document-field-based term weighting • Topics where our manual queries did better • TS-22 detect => detection test predict check determine determination • TS-29 minimum inhibitory concentration => … • Expanded all terms, but not all resulted in

Thanks to track organizers • NSF grant IIS-1018317 • Questions?
