Beyond “Bag of Words”: Towards a Framework for Conceptual Retrieval Jimmy Lin College of Information Studies University of Maryland Thursday, October 4, 2007 IPAM Workshop, UCLA
Beyond “Bag of Words” • IR is fundamentally based on counting words • Different ways of “bookkeeping”: vector space, probabilistic, LM, DFR, etc. • So… • Words aren’t enough to capture meaning • Term statistics aren’t enough to capture meaning
Thus… • IR systems should go beyond term statistics: concepts, relations, etc. • Hypothesis: IR based on concepts, relations, etc. >> IR based on words • However… • A reasonable hypothesis? • Where’s the empirical support?
Outline • Previous attempts to go beyond BoW • Slightly different approach • Start with specialized applications • Generalize • Case study in the medical domain • A clinical question answering system in support of evidence-based medicine (EBM) • Broader applicability?
Previous Work • Beyond “bags” • Indexing phrases, e.g., (Fagan, 1987; Smeaton et al., 1994; etc.) • Modeling term dependencies, e.g., (Gao et al., 2004; Liu et al., 2004; Metzler and Croft, 2005; Cui et al., 2005; etc.) • Beyond “words” • Query expansion, e.g., (Voorhees, 1993; 1994) • Word Sense Disambiguation, e.g., (Sanderson, 1994; Mihalcea and Moldovan, 2000) • Results? Mixed
A Different Approach • Previous work focuses on the general domain • Broad but (relatively) shallow • Hampered by commonsense problem • Difficult to acquire large amounts of knowledge • Our approach: • Develop a general framework • Instantiate in domain-specific applications • Leverage lessons learned to refine the framework • Rinse, repeat
“Conceptual Retrieval” [Diagram: Questions and the Collection are each mapped to a Conceptual representation (the Collection via the Knowledge Extractor); the Semantic Matcher compares the two representations and produces Answers]
What type of knowledge? • Knowledge about the problem structure • What representations are useful for capturing the information need? • Knowledge about user tasks • Why is this information needed? • How will it be further used? • Knowledge about the domain • What background knowledge is needed to reason about the information need?
K1: Problem Structure • Knowledge representations are important! • Help experts reason about problems • Form the basis for tractable computational structures • GOFAI • Frames (Minsky) • Scripts (Schank) • Semantic networks (attribution less clear)
K2: User Tasks • The user is important! • Users are different • High school student vs. intelligence analyst • Different types of relevance • Topical, situational, etc.
K3: Domain • Why is the sky blue? • Users bring a tremendous amount of knowledge to bear when asking questions • Specialized, technical knowledge • Commonsense • “To really learn something, you basically have to already know it.”
K4 … Kn? • More types of knowledge needed? • Working hypothesis: • {K1, K2, K3} comprise a necessary set
Introductions Dr. , Ph.D. Dr. Dr. Dina Demner-Fushman, M.D., Ph.D.
Why the Medical Domain? • Evidence-Based Medicine • = A paradigm of medical practice that emphasizes decision-support from high-quality clinical research • Provides a basis for K1, K2, and K3 • Need for retrieval systems is well documented, e.g., (Gorman et al., 1994; Chambliss and Conley, 1996; Cogdill and Moore, 1997; Ely et al., 2005; Sutton et al., 2005) • Clinical QA: • “Ready-made” domain for exploring conceptual retrieval • Availability of corpora, resources, etc. • Important and potentially high-impact application
K1: Problem Structure • EBM identifies four components of a question (PICO: Problem/Population, Intervention, Comparison, Outcome) • Originally developed as a clinical tool • Can serve as a knowledge representation • Example: “In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever?” = PICO frame
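The idea of a PICO frame as a knowledge representation can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class name `PICOFrame`, its field names, and the `facets` helper are hypothetical, and the slot values are the ones the talk uses for the febrile-illness example question.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PICOFrame:
    """Hypothetical container for a clinical question in PICO form."""
    task: str                   # clinical task, e.g. "therapy"
    problem_population: str     # P
    intervention: str           # I
    comparison: Optional[str]   # C (may be absent for some questions)
    outcome: str                # O

    def facets(self) -> Dict[str, str]:
        """Return the non-empty frame elements keyed by their PICO letter."""
        slots = {
            "P": self.problem_population,
            "I": self.intervention,
            "C": self.comparison,
            "O": self.outcome,
        }
        return {letter: value for letter, value in slots.items() if value}


# Slot values taken from the example question in the talk.
fever_question = PICOFrame(
    task="therapy",
    problem_population="children/acute febrile illness",
    intervention="acetaminophen",
    comparison="ibuprofen",
    outcome="reducing fever",
)
```

Calling `fever_question.facets()` returns the four filled slots, which is the form a matcher or query builder would consume.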
K2: User Tasks • Clinical tasks • Considerations for strength of evidence • Strength of Recommendations Taxonomy (SORT): three evidence grades
K3: Domain • The Unified Medical Language System (UMLS) • 2004 version: 1+ million biomedical concepts, > 5 million concept names • Software for leveraging this resource: • MetaMap, SemRep for identifying concepts, relations • [Figure: fragment of a UMLS concept hierarchy with nodes Anti-infective agent, Antibacterial drugs, Antifungal, Disinfectants and cleansers, Mucous membrane antifungal agent, Quinolone, Borate product, ofloxacin, Ciclopirox, boric acid]
Re: Conceptual Retrieval • Question: In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever? • Question frame: Task: therapy; P: children/acute febrile illness; I: acetaminophen; C: ibuprofen; O: reducing fever • Answer: Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses. • Answer frame: P: children/acute febrile illness; I: acetaminophen; C: ibuprofen; O: reducing fever • Collection: MEDLINE, NLM’s authoritative repository of 17 million+ abstracts
System Architecture [Diagram: Question (query frame) → Query Formulator → search query → PubMed → abstracts → Knowledge Extractors → annotated abstracts → Semantic Matcher (which also receives the query frame) → scored citations → Answer Generator → Answers]
Test Collection • Manually gathered 50 clinical questions from FPIN and the Parkhurst Exchange • Reflects distribution of real-world questions • Divided into development and test collections
Gathering Judgments • Manually formulated PubMed queries • ~40 minutes per question; gathered top 50 hits • Manually evaluated all retrieved citations • ~2 hours per question • Question: What is the best treatment for analgesic rebound headaches? • PubMed Query: (((“analgesics”[TIAB] NOT Medline[SB]) OR “analgesics”[MeSH Terms] OR “analgesics”[Pharmacological Action] OR analgesic[TextWord]) AND ((“headache”[TIAB] NOT Medline[SB]) OR “headache”[MeSH Terms] OR headaches[TextWord]) AND (“adverse effects”[Subheading] OR side effects[Text Word])) AND hasabstract[text] AND English[Lang] AND “humans”[MeSH Terms]
Knowledge Extraction Example • Abstract: Antipyretic efficacy of ibuprofen vs acetaminophen. OBJECTIVE--To compare the antipyretic efficacy of ibuprofen, placebo, and acetaminophen. DESIGN--Double-dummy, double-blind, randomized, placebo-controlled trial. SETTING--Emergency department and inpatient units of a large, metropolitan, university-based, children's hospital in Michigan. PARTICIPANTS--37 otherwise healthy children aged 2 to 12 years with acute, intercurrent, febrile illness. INTERVENTIONS--Each child was randomly assigned to receive a single dose of acetaminophen (10 mg/kg), ibuprofen (7.5 or 10 mg/kg), or placebo. MEASUREMENTS/MAIN RESULTS--Oral temperature was measured before dosing, 30 minutes after dosing, and hourly thereafter for 8 hours after the dose. Patients were monitored for adverse effects during the study and 24 hours after administration of the assigned drug. All three active treatments produced significant antipyresis compared with placebo. Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses. No adverse effects were observed in any treatment group. CONCLUSION--Ibuprofen is a potent antipyretic agent and is a safe alternative for the selected febrile child who may benefit from antipyretic medication but who either cannot take or does not achieve satisfactory antipyresis with acetaminophen. Am J Dis Child. 1992 May; 146(5):622-5 • Elements highlighted in the abstract: Population, Problem, Interventions, Outcome
Knowledge Extractors • Population, Problem, Intervention: IE task • Exploited coverage of medical concepts in UMLS • Additional candidate ranking based on a few features • Outcome: sentence-level classification task • “Kitchen sink approach”, ensemble of classifiers • Features: • Manually-defined cue words • N-grams • Position in abstract • Presence of certain UMLS concepts • … • Semantics helps!
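The outcome extractor described above (sentence-level features fed to an ensemble of classifiers) can be sketched roughly as follows. Everything here is an assumption for illustration: the talk does not list the actual cue words, feature encodings, or classifiers, so `CUE_WORDS`, `sentence_features`, `ensemble_score`, and the two toy rules are hypothetical, and plain averaging is just one simple way to combine classifier outputs.

```python
from typing import Callable, Dict, List

# Hypothetical cue words; the actual manually-defined list is not given in the talk.
CUE_WORDS = {"significant", "significantly", "superior", "effective", "greater"}


def sentence_features(sentences: List[str], index: int) -> Dict[str, float]:
    """Compute a few illustrative features for one sentence of an abstract."""
    tokens = [t.strip(".,;:()").lower() for t in sentences[index].split()]
    tokens = [t for t in tokens if t]
    bigrams = list(zip(tokens, tokens[1:]))
    return {
        "cue_word_count": float(sum(t in CUE_WORDS for t in tokens)),
        # 0.0 for the first sentence, 1.0 for the last one
        "relative_position": index / max(len(sentences) - 1, 1),
        "num_bigrams": float(len(bigrams)),
    }


def ensemble_score(features: Dict[str, float],
                   classifiers: List[Callable[[Dict[str, float]], float]]) -> float:
    """Average the scores of several classifiers (an assumed combination rule)."""
    return sum(clf(features) for clf in classifiers) / len(classifiers)


# Two toy stand-ins for trained classifiers.
def cue_rule(f: Dict[str, float]) -> float:
    return 1.0 if f["cue_word_count"] > 0 else 0.0


def position_rule(f: Dict[str, float]) -> float:
    return f["relative_position"]


example_sentences = [
    "OBJECTIVE--To compare the antipyretic efficacy of ibuprofen, placebo, and acetaminophen.",
    "All three active treatments produced significant antipyresis compared with placebo.",
]
outcome_scores = [
    ensemble_score(sentence_features(example_sentences, i), [cue_rule, position_rule])
    for i in range(len(example_sentences))
]
```

In a real system the toy rules would be replaced by trained classifiers, and a feature for the presence of UMLS concepts would require a concept tagger, which is omitted here.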
Knowledge Extractors • [Figure: the example abstract “Antipyretic efficacy of ibuprofen vs acetaminophen” (Am J Dis Child. 1992 May; 146(5):622-5) annotated with Problem, Population, Intervention, and Outcome, alongside the percentages 95% / 0% / 5%, 80% / 0% / 20%, 90% / 5% / 5%, 80% / 13% / 7%; the row and column labels for the percentages are not preserved] Details: Dina Demner-Fushman and Jimmy Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33(1):63-103, 2007
Semantic Matching • Three score components: $S_{EBM} = S_{PICO} + S_{SoE} + S_{MeSH}$ • $S_{PICO}$: matching PICO frame elements (Problem Structure) • $S_{SoE}$: strength of evidence considerations (User Tasks) • $S_{MeSH}$: MeSH indicators for each clinical task (User Tasks) Details: Dina Demner-Fushman and Jimmy Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33(1):63-103, 2007
Semantic Matching: Evaluation • Research Questions • Does it work? • What are the relative contributions of each component? • What is the interaction between knowledge-based and statistical techniques? • Approach • Reranking experiments with test collection • Ablation studies
Evaluation: Abstract Reranking • Question: What is the best treatment for analgesic rebound headaches? • PubMed query: (((“analgesics”[TIAB] NOT Medline[SB]) OR “analgesics”[MeSH Terms] OR “analgesics”[Pharmacological Action] OR analgesic[TextWord]) AND ((“headache”[TIAB] NOT Medline[SB]) OR “headache”[MeSH Terms] OR headaches[TextWord]) AND (“adverse effects”[Subheading] OR side effects[Text Word])) AND hasabstract[text] AND English[Lang] AND “humans”[MeSH Terms] • [Diagram: abstracts retrieved from MEDLINE pass through the Knowledge Extractor and the Semantic Matcher, which uses the clinical task and PICO frame (P, I, C, O) to rerank them] • Compared vs. original PubMed ordering and vs. Indri baseline (state-of-the-art LM)
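A reranking experiment of this kind can be sketched in a few lines: take the citations in their original retrieved order, sort them by a matcher score, and compare the two orderings against relevance judgments. This is only an illustrative sketch. The function names, the toy scores, and the toy judgments are hypothetical, and precision at k is used here as an assumed example metric; the slides do not say which metrics were reported.

```python
from typing import Dict, List, Set


def rerank(citations: List[str], scores: Dict[str, float]) -> List[str]:
    """Sort citations by descending score; ties keep the original order."""
    return sorted(citations, key=lambda c: -scores.get(c, 0.0))


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k citations that were judged relevant."""
    return sum(c in relevant for c in ranking[:k]) / k


# Toy data: citation ids in original retrieved order, hypothetical matcher
# scores, and hypothetical relevance judgments.
original_order = ["a", "b", "c", "d"]
matcher_scores = {"a": 0.1, "b": 0.9, "c": 0.4, "d": 0.8}
judged_relevant = {"b", "d"}

reranked_order = rerank(original_order, matcher_scores)
baseline_p2 = precision_at_k(original_order, judged_relevant, 2)
reranked_p2 = precision_at_k(reranked_order, judged_relevant, 2)
```

Because Python's `sorted` is stable, citations with equal scores stay in their original relative order, so the baseline ordering acts as a tie-breaker.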
Results: Complete Model • Performance on held-out blind test set: Results are statistically significant Details: Jimmy Lin and Dina Demner-Fushman. The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine. SIGIR 2006.
Results: Parameter Settings • Tuning each component: $S_{EBM} = \lambda_1 S_{PICO} + \lambda_2 S_{SoE} + (1 - \lambda_1 - \lambda_2) S_{MeSH}$ • No statistically significant difference • Combining EBM + Indri: $S_{EBM+Indri} = \lambda S_{EBM} + (1 - \lambda) S_{Indri}$ • Better performance, but not statistically significant Details: Jimmy Lin and Dina Demner-Fushman. The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine. SIGIR 2006.
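The two weighted combinations on this slide are simple linear interpolations and can be written down directly. The sketch below assumes the component scores have already been computed and put on comparable scales (for example, normalized to [0, 1]); that normalization, the function names, and the default weights are assumptions, not details from the talk.

```python
def ebm_score(s_pico: float, s_soe: float, s_mesh: float,
              lam1: float = 1 / 3, lam2: float = 1 / 3) -> float:
    """Weighted EBM score: lam1*S_PICO + lam2*S_SoE + (1 - lam1 - lam2)*S_MeSH."""
    if lam1 < 0 or lam2 < 0 or lam1 + lam2 > 1:
        raise ValueError("weights must be non-negative and sum to at most 1")
    return lam1 * s_pico + lam2 * s_soe + (1 - lam1 - lam2) * s_mesh


def ebm_plus_indri_score(s_ebm: float, s_indri: float, lam: float = 0.5) -> float:
    """Interpolated score: lam*S_EBM + (1 - lam)*S_Indri."""
    if not 0 <= lam <= 1:
        raise ValueError("lam must lie in [0, 1]")
    return lam * s_ebm + (1 - lam) * s_indri


# Toy component scores for one citation (hypothetical values).
example_ebm = ebm_score(s_pico=0.2, s_soe=0.4, s_mesh=0.8, lam1=0.5, lam2=0.25)
example_combined = ebm_plus_indri_score(example_ebm, s_indri=0.8, lam=0.25)
```

With equal weights of 1/3 the weighted score is the unweighted sum divided by three, so it ranks citations the same way as the plain three-component sum.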
Results: Contributions • What’s the contribution of each EBM facet? • What types of knowledge are important? • Problem structure (K1) helps a lot • User tasks (K2) help, but not as much • [Table: results for the Problem Structure and User Tasks components; ** = sig. at 99%, * = sig. at 95%] Details: Jimmy Lin and Dina Demner-Fushman. The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine. SIGIR 2006.
Results: Partial Models • Can we use limited knowledge to improve term-based methods? • Any knowledge helps! • [Table: results for Term Statistics, + Problem Structure, + User Tasks; ** = sig. at 99%, * = sig. at 95%] Details: Jimmy Lin and Dina Demner-Fushman. The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine. SIGIR 2006.
Answer Generation • Physicians are most interested in outcomes • Approach: identify outcome sentences • Generate an answer from each citation: abstract title and three highest scoring outcome sentences • Question: Does combining aspirin and warfarin decrease the risk of stroke for patients with nonvalvular atrial fibrillation? • Answer: [abstract title] Prevention of thromboembolic events in atrial fibrillation: [outcome 1] The results from the SPAF III study demonstrated that a combination of mini-intensity warfarin plus aspirin was insufficient for stroke prevention in atrial fibrillation. [outcome 2] Other trials now indicate, that oral anticoagulation at INR-values below 2.0 is not effective for stroke prevention in these patients. [outcome 3] The present clinical challenge is to ensure effective and safe oral anticoagulation to patients with atrial fibrillation at high risk of stroke.
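The answer-generation step described here (abstract title plus the three highest-scoring outcome sentences) can be sketched as below. The function name and the input format are hypothetical, the outcome scores are assumed to come from an upstream classifier, and presenting the selected sentences in their original abstract order is an assumption made for readability rather than something the slide states.

```python
from typing import List, Tuple


def generate_answer(title: str,
                    scored_sentences: List[Tuple[str, float]],
                    n: int = 3) -> str:
    """Build an answer from a title and the n highest-scoring outcome sentences.

    scored_sentences holds (sentence, outcome_score) pairs in abstract order.
    """
    # Indices of the n best-scoring sentences (stable for ties).
    best = sorted(range(len(scored_sentences)),
                  key=lambda i: -scored_sentences[i][1])[:n]
    # Restore abstract order so the answer reads naturally (an assumption).
    chosen = [scored_sentences[i][0] for i in sorted(best)]
    return " ".join([title.rstrip(".:") + ":"] + chosen)


# Toy input with hypothetical scores.
toy_answer = generate_answer(
    "Title",
    [("A.", 0.1), ("B.", 0.9), ("C.", 0.5), ("D.", 0.7)],
)
```

If an abstract has fewer than `n` sentences, the slice simply returns all of them.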
Evidence Synthesis • Integrate findings from multiple citations • Question: What is the best treatment for chronic prostatitis? • ► anti-microbial • [temafloxacin] Treatment of chronic bacterial prostatitis with temafloxacin. Temafloxacin 400 mg b.i.d. administered orally for 28 days represents a safe and effective treatment for chronic bacterial prostatitis. • [ofloxacin] Ofloxacin in the management of complicated urinary tract infections, including prostatitis. In chronic bacterial prostatitis, results to date suggest that ofloxacin may be more effective clinically and as effective microbiologically as carbenicillin. • ... • ► Alpha-adrenergic blocking agent • [terazosin] Terazosin therapy for chronic prostatitis/chronic pelvic pain syndrome: a randomized, placebo controlled trial. CONCLUSIONS: Terazosin proved superior to placebo for patients with chronic prostatitis/chronic pelvic pain syndrome who had not received alpha-blockers previously. • ...
Semantic Clustering [Diagram: relevant citations are grouped into Cluster 1, Cluster 2, and Cluster 3; pipeline stages: Answer Extraction, Semantic Clustering, Interactive Presentation]
Evaluation: Evidence Synthesis • What is the best treatment of X? • Compare • Top three answers from PubMed • First answer in three largest semantic clusters • Evaluation by a physician: Details: Dina Demner-Fushman and Jimmy Lin. Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering. ACL 2006.
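The second condition in this comparison, taking the first answer from each of the three largest semantic clusters, can be sketched as follows. The sketch assumes each retrieved citation already carries a cluster label (for instance a drug class such as "anti-microbial"); how those labels are produced is not shown, and the function name and toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def first_answer_per_largest_cluster(
        ranked: List[Tuple[str, str]], n_clusters: int = 3) -> List[Tuple[str, str]]:
    """Pick the top-ranked citation from each of the n largest clusters.

    ranked holds (citation_id, cluster_label) pairs in retrieval order.
    Returns (cluster_label, citation_id) pairs, largest cluster first.
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    for citation_id, label in ranked:
        clusters[label].append(citation_id)
    # Largest clusters first; equal-sized clusters keep first-appearance order
    # because dicts preserve insertion order and sorted() is stable.
    largest = sorted(clusters, key=lambda label: -len(clusters[label]))[:n_clusters]
    return [(label, clusters[label][0]) for label in largest]


# Toy ranked list with hypothetical citation ids and cluster labels.
toy_ranked = [
    ("c1", "anti-microbial"), ("c2", "alpha-blocker"), ("c3", "anti-microbial"),
    ("c4", "other"), ("c5", "alpha-blocker"), ("c6", "anti-microbial"),
    ("c7", "fourth"),
]
toy_selection = first_answer_per_largest_cluster(toy_ranked)
```

Compared with simply taking the top three retrieved citations (`c1`, `c2`, `c3` in the toy data), this selection trades some rank position for coverage of distinct treatment groups.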
Findings • K1 + K2 + K3 → “conceptual retrieval” • Knowledge helps a lot! • But here’s the catch: • Limited domain: “narrow but deep” • Dependent on availability of existing resources • Beyond “bag of words”: • Develop a general framework • Instantiate in domain-specific applications • Leverage lessons learned to refine the framework • Rinse, repeat
Re: Re: Conceptual Retrieval • Question: In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever? • Question frame: Task: therapy; P: children/acute febrile illness; I: acetaminophen; C: ibuprofen; O: reducing fever • Answer: Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses. • Answer frame: P: children/acute febrile illness; I: acetaminophen; C: ibuprofen; O: reducing fever • Each frame element (Task, P, I, C, O) is a facet: the frame = faceted query! • Collection: MEDLINE, NLM’s authoritative repository of 17 million+ abstracts
Conceptual Retrieval • “Building blocks” strategy in library science • Decompose information need into conceptual facets • Identify terms that represent those facets • Instantiate in a structured query: ( A1 A2 … ) ( B1 B2 … ) ( C1 C2 … ) ( D1 D2 … ) …, with the blocks corresponding to P, I, C, O • EBM-based retrieval is a specific case of facet analysis and structured querying!
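The building-blocks strategy can be sketched as a small query builder. The sketch assumes the usual convention that terms within a facet are alternatives (joined with OR) and that the facets are all required (joined with AND); the slide shows only the grouped blocks, so the operators are an assumption. The function name and the example term lists are hypothetical, and the output is a generic Boolean string, not the syntax of any particular search engine.

```python
from typing import Dict, List


def building_blocks_query(facets: Dict[str, List[str]]) -> str:
    """Join terms within a facet with OR and the facets themselves with AND."""
    blocks = []
    for facet_terms in facets.values():
        kept = [t for t in facet_terms if t]
        if kept:
            # Quote multi-word terms so they stay together as phrases.
            quoted = [f'"{t}"' if " " in t else t for t in kept]
            blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


# Hypothetical term lists for the four PICO facets.
pico_facets = {
    "P": ["children", "febrile illness"],
    "I": ["acetaminophen"],
    "C": ["ibuprofen"],
    "O": ["fever"],
}
structured_query = building_blocks_query(pico_facets)
```

Empty facets are skipped, so a question without a comparison element still yields a well-formed query.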
A General Framework? • For a domain • Identify prototypical information needs • Develop a frame-based representation • Build extractor for frame elements • Instantiate semantic matcher • Watch performance go up! • The subject of ongoing work…
What comes next? • Retrieval in the biomedical domain • Information describing the role(s) of a [gene] involved in a [disease]. gene: Interferon-beta; disease: Multiple Sclerosis • Information describing the role of a [gene] in a specific [biological process]. gene: nucleoside diphosphate kinase (NM23); biological process: tumor progression • Complex question answering • What evidence is there for transport of [art looted by the Nazis in WWII] from [Germany] to [France]? • What [familial ties] exist between [Neanderthals] and [humans]? • What [common interests] exist between [Network Solutions] and [the Internet Corporation for Assigned Names and Numbers (ICANN)]?
Acknowledgments • Dina Demner-Fushman (Ph.D., 2006) • This work was funded in part by NLM
References • Dina Demner-Fushman and Jimmy Lin. Answering Clinical Questions with Knowledge-Based and Statistical Techniques. Computational Linguistics, 33(1):63-103, 2007. • Jimmy Lin and Dina Demner-Fushman. The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), 2006, pp. 99-106. • Dina Demner-Fushman and Jimmy Lin. Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), 2006, pp. 841-848.