This study explores the automated use of patient data to retrieve health knowledge. It discusses the challenges of translating coded and narrative patient data into the terminologies of knowledge resources, the experiments conducted, and the success rates of different search methods and terminologies. It concludes with suggested next steps for improving the retrieval and analysis of health information.
Using Patient Data to Retrieve Health Knowledge
James J. Cimino, Mark Meyer, Nam-Ju Lee, Suzanne Bakken
Columbia University
AMIA Fall Symposium, October 25, 2005
Automated Retrieval with Clinical Data
[Diagram: the infobutton cycle: (1) understand information needs, (2) get information from the EMR (e.g., MRSA), (3) resource selection, (4) automated translation into (5) the resource terminology, (6) querying, and (7) presentation.]
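The flow in this diagram can be read as a simple pipeline. Below is a minimal, hypothetical sketch in Python; the function names and lookup tables are illustrative stand-ins, not the actual CPMC infobutton implementation.

```python
# Hypothetical sketch of the seven-step infobutton flow above.
# All names and tables are illustrative, not the real CPMC system.

RESOURCES = {"microbiology result": "PubMed"}      # step 3: resource selection
TRANSLATIONS = {                                   # steps 4-5: source term -> resource terminology
    ("MRSA", "PubMed"): "Methicillin-Resistant Staphylococcus aureus",
}

def infobutton(context: dict) -> str:
    need = context["data_type"]                    # 1. understand the information need
    source_term = context["term"]                  # 2. get the datum from the EMR
    resource = RESOURCES[need]                     # 3. select a knowledge resource
    target = TRANSLATIONS[(source_term, resource)] # 4-5. translate into the resource terminology
    results = f"{resource} results for '{target}'" # 6. querying (stubbed)
    return results                                 # 7. presentation

print(infobutton({"data_type": "microbiology result", "term": "MRSA"}))
```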
What’s Hardest about Infobuttons? • It’s not knowing the questions • It’s not integrating clinical info systems • It’s not linking to resources • It’s translating source data to target terms
Types of Source Terminologies
• Uncoded (narrative):
  • Radiology reports (?): "…infiltrate is seen in the left upper lobe."
• Coded:
  • Lab tests (6,133): AMIKACIN, PEAK LEVEL
  • Sensitivity tests (476): AMI 6 MCG/ML
  • Microbiology results (2,173): ESCHERICHIA COLI
  • Medications (15,311): UD AMIKACIN 1 GM VIAL
Types of Target Terminologies
• Narrative search:
  • PubMed
  • RxList
  • UpToDate
  • Micromedex
  • Lab Tests Online
  • OneLook
  • National Guideline Clearinghouse
• Coded resource:
  • Lexicomp
  • CPMC Lab Manual
• Coded search:
  • PubMed
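The slides do not say how queries were issued; assuming PubMed's standard interfaces, the difference between a narrative and a coded search amounts to whether the term carries a MeSH field tag. A sketch using the NCBI E-utilities URL format (the endpoint and the [MeSH Terms] tag are standard PubMed syntax; the example terms are ours):

```python
from urllib.parse import quote_plus

# Sketch of the two PubMed search styles: plain narrative text versus
# a coded query with a MeSH field tag.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def narrative_query(text: str) -> str:
    return f"{ESEARCH}?db=pubmed&term={quote_plus(text)}"

def coded_query(mesh_term: str) -> str:
    return f"{ESEARCH}?db=pubmed&term={quote_plus(mesh_term + '[MeSH Terms]')}"

print(narrative_query("amikacin peak level"))
print(coded_query("Amikacin"))
```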
The Experiments
• Identify sources of patient data
• Get random sample of terms for each source
• Translate terms if needed (multiple methods)
• Perform automated retrieval with terms
Term Samples
• 100 terms from radiology reports using MedLEE
• 100 medication ingredients
• 100 lab test analytes
• 100 microbiology results
• 94 sensitivity test reagents
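Drawing the samples above is straightforward; here is a minimal sketch assuming a uniform random draw, with synthetic term lists sized like the source vocabularies on the earlier slide:

```python
import random

# Toy sketch of drawing the 100-term samples; the term lists are
# synthetic placeholders, not the actual CPMC vocabularies.
SOURCES = {
    "lab tests": [f"LAB_TEST_{i}" for i in range(6133)],
    "medications": [f"MEDICATION_{i}" for i in range(15311)],
}

random.seed(2005)  # make the draw reproducible
samples = {name: random.sample(terms, 100) for name, terms in SOURCES.items()}
print({name: sample[:3] for name, sample in samples.items()})
```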
Searches Performed
[Table: source terms (uncoded vs. coded) crossed with the target search types: narrative search, concept resource, and coded search.]
Mapping Methods
• Microbiology results to MeSH: semi-automated
• Lab tests to MeSH analytes: automated, using UMLS
• Medications to Lexicomp: natural language processing
• Lab tests to CPMC Lab Manual: manual matching
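For the automated UMLS route, the essential move is normalizing the source string and looking it up in a UMLS-derived synonym table to reach a MeSH term. A hypothetical sketch with a toy table (the real mapping would draw on UMLS files such as MRCONSO):

```python
# Hypothetical UMLS-style lookup; the table is a toy stand-in for a
# synonym index derived from UMLS and restricted to MeSH concepts.
UMLS_TO_MESH = {
    "amikacin": "Amikacin",
    "escherichia coli": "Escherichia coli",
}

def normalize(term: str) -> str:
    return " ".join(term.lower().replace(",", " ").split())

def map_to_mesh(source_term: str) -> str | None:
    return UMLS_TO_MESH.get(normalize(source_term))

print(map_to_mesh("ESCHERICHIA COLI"))      # -> Escherichia coli
print(map_to_mesh("AMIKACIN, PEAK LEVEL"))  # -> None; the modifier blocks the match
```

The failed second lookup illustrates why so many source terms resisted automatic translation.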
Results: Multiple Documents
Retrieval success is reported as the percentage of terms that successfully retrieved any results; numbers in parentheses give the average number of results (citations, documents, topics, definitions, etc., depending on the target resource) for those searches that retrieved at least one result.
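The two statistics in these results tables can be computed as below; `run_search` is a stub standing in for the real query mechanisms, and the term list is illustrative:

```python
# Sketch of the two reported statistics: hit rate across terms, and the
# mean result count among searches that returned at least one result.
def run_search(resource: str, term: str) -> int:
    """Stubbed result count so the sketch runs end to end."""
    return len(term) % 3

def evaluate(terms: list[str], resource: str) -> tuple[float, float]:
    counts = [run_search(resource, t) for t in terms]
    hits = [c for c in counts if c > 0]
    hit_rate = 100.0 * len(hits) / len(counts)
    mean_results = sum(hits) / len(hits) if hits else 0.0
    return hit_rate, mean_results

rate, mean = evaluate(["amikacin", "gentamicin", "vancomycin"], "PubMed")
print(f"{rate:.0f}% of terms retrieved something (avg {mean:.1f} results)")
```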
Uncoded versus Coded Searches
• 1,028/2,173 (47.3%) of microbiology test terms mapped to MeSH
• 940/1,041 (90.3%) of lab analytes mapped to LOINC
• 485/940 (51.6%) of LOINC analytes mapped to MeSH
Results: Single Document
Retrieval success is reported as in the previous table: the percentage of terms that retrieved any results, with the average number of results for successful searches shown in parentheses.
Results: Page of Links
Results for RxList and Micromedex are difficult to quantify because they provided heterogeneous lists of links; rather than link counts, we assessed the true positive and false negative rates, shown in brackets.
Micromedex versus RxList (194 terms)
• 158 terms found by both (RxList total: 163; Micromedex total: 180)
• 22 found by Micromedex but missed by RxList
• 5 found by RxList but missed by Micromedex
• 9 missed by both
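The overlap figures are internally consistent, as a quick check confirms:

```python
# Consistency check of the Micromedex/RxList overlap numbers above.
total, both, rx_only, mm_only, neither = 194, 158, 5, 22, 9
assert both + rx_only == 163                        # RxList total
assert both + mm_only == 180                        # Micromedex total
assert both + rx_only + mm_only + neither == total  # all 194 terms accounted for
print("overlap counts check out")
```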
See For Yourself! www.dbmi.columbia.edu/cimino/2005amia-data.html
Discussion
• 7 sources, 894 terms, 11 resources, 1,592 searches
• Automated retrieval is technically possible:
  • Found something 73-100% of the time
  • 12/16 experiments "succeeded" (94-100%)
• Translation often unsuccessful
• Automated indexing works
• Usefulness of translation to MeSH is marginal
• Good quality when retrieving pages of links (Micromedex and RxList)
• Good quality with concept-indexed resources
• Recall/precision of document retrievals unknown:
  • Need to define the question
  • Additional evaluation needed
Next Steps
• Creation of a terminology management and indexing suite
• Formal analysis of the quality of answers
Acknowledgments This work is supported in part by NLM grants R01LM07593 and R01LM07659 and NLM Training Grants LM07079-11 and P20NR007799.