This study explores the automated use of patient data to retrieve health knowledge. It discusses the challenges of translating coded and narrative patient data into the terminologies of knowledge resources, the experiments conducted, and the success rates of different search methods and terminologies. It concludes with suggested next steps for improving the retrieval and analysis of health information.
Using Patient Data to Retrieve Health Knowledge
James J. Cimino, Mark Meyer, Nam-Ju Lee, Suzanne Bakken
Columbia University
AMIA Fall Symposium, October 25, 2005
Automated Retrieval with Clinical Data
[Diagram: the infobutton cycle: (1) understand information needs, (2) get information from the EMR (e.g., MRSA), (3) resource selection, (4) automated translation into (5) the resource terminology, (6) querying, and (7) presentation.]
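The flow in this diagram can be read as a simple pipeline. Below is a minimal, hypothetical sketch in Python; the function names and lookup tables are illustrative stand-ins, not the actual CPMC infobutton implementation.

```python
# Hypothetical sketch of the seven-step infobutton flow above.
# All names and tables are illustrative, not the real CPMC system.

RESOURCES = {"microbiology result": "PubMed"}      # step 3: resource selection
TRANSLATIONS = {                                   # steps 4-5: source term -> resource terminology
    ("MRSA", "PubMed"): "Methicillin-Resistant Staphylococcus aureus",
}

def infobutton(context: dict) -> str:
    need = context["data_type"]                    # 1. understand the information need
    source_term = context["term"]                  # 2. get the datum from the EMR
    resource = RESOURCES[need]                     # 3. select a knowledge resource
    target = TRANSLATIONS[(source_term, resource)] # 4-5. translate into the resource terminology
    results = f"{resource} results for '{target}'" # 6. querying (stubbed)
    return results                                 # 7. presentation

print(infobutton({"data_type": "microbiology result", "term": "MRSA"}))
```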
What’s Hardest about Infobuttons? • It’s not knowing the questions • It’s not integrating clinical info systems • It’s not linking to resources • It’s translating source data to target terms
Types of Source Terminologies
• Uncoded (narrative):
  • Radiology reports (?): "…infiltrate is seen in the left upper lobe."
• Coded:
  • Lab tests (6,133): AMIKACIN, PEAK LEVEL
  • Sensitivity tests (476): AMI 6 MCG/ML
  • Microbiology results (2,173): ESCHERICHIA COLI
  • Medications (15,311): UD AMIKACIN 1 GM VIAL
Types of Target Terminologies
• Narrative search:
  • PubMed
  • RxList
  • UpToDate
  • Micromedex
  • Lab Tests Online
  • OneLook
  • National Guideline Clearinghouse
• Coded resource:
  • Lexicomp
  • CPMC Lab Manual
• Coded search:
  • PubMed
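The slides do not say how queries were issued; assuming PubMed's standard interfaces, the difference between a narrative and a coded search amounts to whether the term carries a MeSH field tag. A sketch using the NCBI E-utilities URL format (the endpoint and the [MeSH Terms] tag are standard PubMed syntax; the example terms are ours):

```python
from urllib.parse import quote_plus

# Sketch of the two PubMed search styles: plain narrative text versus
# a coded query with a MeSH field tag.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def narrative_query(text: str) -> str:
    return f"{ESEARCH}?db=pubmed&term={quote_plus(text)}"

def coded_query(mesh_term: str) -> str:
    return f"{ESEARCH}?db=pubmed&term={quote_plus(mesh_term + '[MeSH Terms]')}"

print(narrative_query("amikacin peak level"))
print(coded_query("Amikacin"))
```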
The Experiments
• Identify sources of patient data
• Get random sample of terms for each source
• Translate terms if needed (multiple methods)
• Perform automated retrieval with terms
Term Samples
• 100 terms from radiology reports using MedLEE
• 100 medication ingredients
• 100 lab test analytes
• 100 microbiology results
• 94 sensitivity test reagents
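Drawing the samples above is straightforward; here is a minimal sketch assuming a uniform random draw, with synthetic term lists sized like the source vocabularies on the earlier slide:

```python
import random

# Toy sketch of drawing the 100-term samples; the term lists are
# synthetic placeholders, not the actual CPMC vocabularies.
SOURCES = {
    "lab tests": [f"LAB_TEST_{i}" for i in range(6133)],
    "medications": [f"MEDICATION_{i}" for i in range(15311)],
}

random.seed(2005)  # make the draw reproducible
samples = {name: random.sample(terms, 100) for name, terms in SOURCES.items()}
print({name: sample[:3] for name, sample in samples.items()})
```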
Searches Performed
[Table: source terms (uncoded vs. coded) crossed with the target search types: narrative search, concept resource, and coded search.]
Mapping Methods
• Microbiology results to MeSH: semi-automated
• Lab tests to MeSH analytes: automated, using UMLS
• Medications to Lexicomp: natural language processing
• Lab tests to CPMC Lab Manual: manual matching
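For the automated UMLS route, the essential move is normalizing the source string and looking it up in a UMLS-derived synonym table to reach a MeSH term. A hypothetical sketch with a toy table (the real mapping would draw on UMLS files such as MRCONSO):

```python
# Hypothetical UMLS-style lookup; the table is a toy stand-in for a
# synonym index derived from UMLS and restricted to MeSH concepts.
UMLS_TO_MESH = {
    "amikacin": "Amikacin",
    "escherichia coli": "Escherichia coli",
}

def normalize(term: str) -> str:
    return " ".join(term.lower().replace(",", " ").split())

def map_to_mesh(source_term: str) -> str | None:
    return UMLS_TO_MESH.get(normalize(source_term))

print(map_to_mesh("ESCHERICHIA COLI"))      # -> Escherichia coli
print(map_to_mesh("AMIKACIN, PEAK LEVEL"))  # -> None; the modifier blocks the match
```

The failed second lookup illustrates why so many source terms resisted automatic translation.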
Results: Multiple Documents
Retrieval success is reported as the percentage of terms that successfully retrieved any results; numbers in parentheses give the average number of results (citations, documents, topics, definitions, etc., depending on the target resource) for those searches that retrieved at least one result.
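The two statistics in these results tables can be computed as below; `run_search` is a stub standing in for the real query mechanisms, and the term list is illustrative:

```python
# Sketch of the two reported statistics: hit rate across terms, and the
# mean result count among searches that returned at least one result.
def run_search(resource: str, term: str) -> int:
    """Stubbed result count so the sketch runs end to end."""
    return len(term) % 3

def evaluate(terms: list[str], resource: str) -> tuple[float, float]:
    counts = [run_search(resource, t) for t in terms]
    hits = [c for c in counts if c > 0]
    hit_rate = 100.0 * len(hits) / len(counts)
    mean_results = sum(hits) / len(hits) if hits else 0.0
    return hit_rate, mean_results

rate, mean = evaluate(["amikacin", "gentamicin", "vancomycin"], "PubMed")
print(f"{rate:.0f}% of terms retrieved something (avg {mean:.1f} results)")
```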
Uncoded versus Coded Searches
• 1,028/2,173 (47.3%) of microbiology test terms mapped to MeSH
• 940/1,041 (90.3%) of lab analytes mapped to LOINC
• 485/940 (51.6%) of LOINC analytes mapped to MeSH
Results: Single Document
Retrieval success is reported as in the previous table: the percentage of terms that retrieved any results, with the average number of results for successful searches shown in parentheses.
Results: Page of Links
Results for RxList and Micromedex are difficult to quantify because they provided heterogeneous lists of links; rather than link counts, we assessed the true positive and false negative rates, shown in brackets.
Micromedex versus RxList (194 terms)
• 158 terms found by both (RxList total: 163; Micromedex total: 180)
• 22 found by Micromedex but missed by RxList
• 5 found by RxList but missed by Micromedex
• 9 missed by both
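The overlap figures are internally consistent, as a quick check confirms:

```python
# Consistency check of the Micromedex/RxList overlap numbers above.
total, both, rx_only, mm_only, neither = 194, 158, 5, 22, 9
assert both + rx_only == 163                        # RxList total
assert both + mm_only == 180                        # Micromedex total
assert both + rx_only + mm_only + neither == total  # all 194 terms accounted for
print("overlap counts check out")
```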
See For Yourself! www.dbmi.columbia.edu/cimino/2005amia-data.html
Discussion
• 7 sources, 894 terms, 11 resources, 1,592 searches
• Automated retrieval is technically possible:
  • Found something 73-100% of the time
  • 12/16 experiments "succeeded" (94-100%)
• Translation often unsuccessful
• Automated indexing works
• Usefulness of translation to MeSH is marginal
• Good quality when retrieving pages of links (Micromedex and RxList)
• Good quality with concept-indexed resources
• Recall/precision of document retrievals unknown:
  • Need to define the question
  • Additional evaluation needed
Next Steps
• Creation of a terminology management and indexing suite
• Formal analysis of the quality of answers
Acknowledgments This work is supported in part by NLM grants R01LM07593 and R01LM07659 and NLM Training Grants LM07079-11 and P20NR007799.