PathBinderH: a Tool for Linnaean Taxonomy-Aware Literature Searches
L. Hughes, J. Ding, K. Viswanathan, D. Berleant, A. Fulmer, and P.S. Schnable
Approach
• Provide a MEDLINE-based resource with taxonomy awareness and sentence-based retrieval
• Users can make queries that retrieve sentences from PubMed
• Clicking a sentence retrieves its containing abstract
• Specify a biological taxon to constrain searches
• Specifying plants (Viridiplantae) searches within abstracts mentioning any plant species or other plant taxon
• Plant taxa are currently handled; the principle is extensible to the full taxonomy

Figure 3. Overview of PathBinderH. The major operations of selecting the query terms and setting the taxonomy filter are shown.

Results of a Sample Query
• A query for embryo and development, with the species filter set to plants (Viridiplantae), was posed to PathBinderH
• The analogous query embryo [text word] development [text word] plant [text word] was posed to PubMed
• PathBinderH returned 542 abstracts
• PubMed returned 890 abstracts
• 383 of the 542 PathBinderH abstracts mentioned a plant subtaxon but not “plant”
  • These were not returned by PubMed
  • Some were relevant to plant embryo development; some were not
• Only 159 abstracts were returned by both PathBinderH and PubMed
• 251 of the 542 PathBinderH abstracts were relevant to embryo development in plants
• 102 of the non-overlapping 383 were relevant (and not found by PubMed)
• When PubMed was allowed to create its own “intelligently” expanded queries from the user-typed query embryo plant development:
  • Recall improved
  • Precision worsened
  • PathBinderH still found numerous abstracts that PubMed could not
• Conclusion: taxonomy-aware literature access is useful and important

Summary
• PathBinderH provides taxonomy-sensitive searches
• PathBinderH provides sentence-focused searches
• Evaluation shows that PubMed can fail to find numerous relevant abstracts
• PathBinderH can help
• PathBinderH is accessible by anyone over the web
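The counts reported for the sample query can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the derived figures (precision values, the number of relevant abstracts in the overlap) are not reported on the poster and assume that the same relevance judgment was applied to all 542 PathBinderH abstracts.

```python
# Counts reported for the sample query.
pathbinderh_total = 542   # abstracts returned by PathBinderH
pubmed_total = 890        # abstracts returned by PubMed
pathbinderh_only = 383    # PathBinderH abstracts not returned by PubMed
overlap = 159             # abstracts returned by both systems
relevant_total = 251      # relevant abstracts among the 542
relevant_only = 102       # relevant abstracts among the 383

# The PathBinderH-only set and the overlap should partition the 542.
assert pathbinderh_only + overlap == pathbinderh_total

# Derived values (assumption: one relevance criterion for all abstracts).
precision_total = relevant_total / pathbinderh_total   # precision over all 542
precision_only = relevant_only / pathbinderh_only      # precision over the 383
relevant_overlap = relevant_total - relevant_only      # relevant abstracts in the overlap
pubmed_only = pubmed_total - overlap                   # PubMed abstracts PathBinderH missed

print(f"Precision over all PathBinderH results: {precision_total:.3f}")
print(f"Precision over PathBinderH-only results: {precision_only:.3f}")
print(f"Relevant abstracts in the overlap: {relevant_overlap}")
print(f"Abstracts returned only by PubMed: {pubmed_only}")
```

The relevance of the PubMed-only abstracts is not reported, so recall for either system cannot be computed from these counts alone.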
Objective: Taxonomy-Sensitive Retrieval
• Find sentences only in abstracts that mention taxa of interest (e.g., species)
• The hierarchical nature of the taxonomy expands a user-specified taxon to all its subtaxa
• Names are expanded to allow for synonyms, both scientific and common names
• See Figure 1

Figure 1. Selecting green plants (and therefore all plant species and other subtaxa).

Objective: Sentence-Based Retrieval
• Display sentences that contain the query terms
• Clicking a sentence displays its abstract at the PubMed site
• Query terms that co-occur in the same sentence are likely to be explicitly connected conceptually
• Query terms match occurrences of the terms and their synonyms
• See Figure 2

Figure 2. Sentences (including titles) containing two specified terms. Only sentences from PubMed entries mentioning a green plant species or other taxon are listed.

Figure 4. Architectural overview of PathBinderH.

Acknowledgments
This research was funded in part by The Procter & Gamble Co. Support was also provided by Hatch Act and State of Iowa funds. Computer support was provided in part by the Virtual Reality Applications Center (VRAC) at Iowa State University.

Availability
• Please try PathBinderH! The URL is www.plantgenomics.iastate.edu/PathBinderH
• A tutorial is at the web site
• A paper is at: class.ee.iastate.edu/berleant/s/paperPathBinderHreport.pdf
• Source code and databases are available on request
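The two retrieval objectives can be illustrated with a minimal sketch. It is not the PathBinderH implementation: the toy taxonomy, the synonym table, the function names, and the naive sentence splitter are all assumptions made for illustration. The sketch expands a user-specified taxon to its subtaxa and synonyms, keeps only abstracts that mention one of those names, and then returns the sentences in which all query terms co-occur.

```python
import re

# Hypothetical, heavily simplified taxonomy: taxon -> child taxa
# (intermediate ranks are omitted for brevity).
CHILDREN = {
    "Viridiplantae": ["Poaceae", "Brassicaceae"],
    "Poaceae": ["Zea mays"],
    "Brassicaceae": ["Arabidopsis thaliana"],
}

# Hypothetical synonym table: scientific name -> common names.
SYNONYMS = {
    "Viridiplantae": ["green plants"],
    "Zea mays": ["maize", "corn"],
    "Arabidopsis thaliana": ["thale cress"],
}


def expand_taxon(taxon):
    """Return lowercased names of the taxon, all its subtaxa, and their synonyms."""
    names = set()
    stack = [taxon]
    while stack:
        current = stack.pop()
        names.add(current.lower())
        names.update(s.lower() for s in SYNONYMS.get(current, []))
        stack.extend(CHILDREN.get(current, []))
    return names


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def contains_term(text, term):
    """Case-insensitive whole-word (or whole-phrase) match."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, text.lower()) is not None


def search(abstracts, query_terms, taxon):
    """Return (abstract_id, sentence) pairs where all query terms co-occur,
    restricted to abstracts that mention the taxon or any of its subtaxa."""
    taxon_names = expand_taxon(taxon)
    hits = []
    for abstract_id, text in abstracts.items():
        if not any(contains_term(text, name) for name in taxon_names):
            continue  # taxonomy filter: abstract mentions no taxon of interest
        for sentence in split_sentences(text):
            if all(contains_term(sentence, term) for term in query_terms):
                hits.append((abstract_id, sentence))
    return hits


# Made-up abstracts for illustration only.
abstracts = {
    "A1": "Embryo development in maize depends on several genes. The mutants were viable.",
    "A2": "Embryo development in mice was studied. Tissue samples were collected.",
    "A3": "Maize kernels were harvested. Development was slow.",
}
results = search(abstracts, ["embryo", "development"], "Viridiplantae")
print(results)
```

In this toy data, abstract A1 never contains the word "plant" but is still matched through the common name "maize", which mirrors the kind of abstract that a literal plant [text word] query would miss. Matching query terms against their synonyms, as described above, is omitted here to keep the sketch short.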