Learn about SemanticFind, a tool that improves how physicians search patient records beyond simple term matches. Explore search technologies, match types, evaluation methods, and more. Discover the future of medical record search capabilities.
SemanticFind: Locating What You Want in a Patient Record, Not Just What You Ask For • John M. Prager, Jennifer J. Liang, Murthy V. Devarakonda • IBM T.J. Watson Research Center • AMIA Joint Summits, San Francisco CA, March 31 2017
Overview • What is SemanticFind? • What it is not • How is SemanticFind different from Ctrl/F? • 13 match types • 4 technologies • Prototype User Interface • Evaluation • Conclusions
What is SemanticFind? • An application that helps the physician search a patient’s medical record for matches to terms of interest • Extension of the familiar Ctrl/F “find” capability in document creation and reading applications • Not limited to matching solely the entered search term • Can find matching content along a variety of different dimensions
What it is Not • Not an application designed for finding matching records (patients) in a large collection, e.g. clinical trial matching • Can be done, just not optimized for this • Not a presentation of a new user interface • Prototype GUI is used in subsequent slides to demonstrate functionality • Not a Question-Answering System
So how is it different from Ctrl/F? • Search terms represent information needs, but • Information needs cannot usually be answered fully just by locating instances of them in text • Ambiguous intent behind the search term • E.g. if the search term is a disease, the user might be wondering • When/how it was first diagnosed • What the indicative labs for it were over time • Are there contraindications • Are there complications • What treatments were prescribed for it • Approach is to perform a variety of different searches simultaneously, and present the results organized by search type • EHR systems have both structured information (e.g. tables of orders, diagnoses, lab results) and unstructured information (e.g. progress and other notes, free text) • Both kinds are searched • Search terms are unrestricted: symptoms, diseases, medications, anatomical structures, any sequence of characters • Search is mediated by UMLS, so search terms that correspond to concepts in UMLS can be more fully explored
SemanticFind Search Types (1) • Traditional Search • Conceptual Search • Associative Search
SemanticFind Search Types (2) • Inferential Search
Technologies Used • Literal Match • As traditional search, but case-insensitive and disregards singular/plural • Conceptual Match • UMLS concepts and relations for Semantic, More General/Specific • Our own Medical Concept Annotator, conceptually similar to cTAKES and MetaMap, but with higher accuracy (paper in preparation) • Lab values and vitals mapped to indicated conditions • K 6.1 gets annotated as hyperkalemia • Parse- and Linguistic-principle-based transformations to catch semantically matching concepts/variants in UMLS • Pain in the abdomen not a variant of abdominal pain • NLP for Negation and Hypothetical • Patient denies discomfort with the rash • Ordered urine test to rule out arsenic poisoning • Associative Match • Uses Latent Semantic Analysis • Finds terms in the record that occur in the same contexts in the literature as the search terms • Useful for finding terms correlated with the search term but with no “named” relation, e.g. sob wheezing • Inferential Match • Finds terms in the record that are related to the search term through curated relation chains • Most useful for <treats>, <prevents>, <causes> relations, e.g. Infection <includes> Lower respiratory tract infection <treated by> Amoxicillin <is ingredient of> Augmentin 875 mg-125 mg tablet
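The Inferential Match example above can be read as a path search over curated relations. The following is only a minimal sketch of that idea, not the SemanticFind implementation: the `RELATIONS` table, the `find_chain` function, and the `max_hops` limit are hypothetical names, and the table holds just the three links from the slide.

```python
from collections import deque

# Hypothetical curated relations: term -> list of (relation, related term).
# Only the chain shown on the slide is included here.
RELATIONS = {
    "infection": [("includes", "lower respiratory tract infection")],
    "lower respiratory tract infection": [("treated by", "amoxicillin")],
    "amoxicillin": [("is ingredient of", "augmentin 875 mg-125 mg tablet")],
}


def find_chain(search_term, record_term, max_hops=4):
    """Breadth-first search for a relation chain from search_term to record_term.

    Returns the chain as a list alternating terms and "<relation>" labels,
    or None if no chain of at most max_hops relations exists.
    """
    start, goal = search_term.lower(), record_term.lower()
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        term, path = queue.popleft()
        if term == goal:
            return path
        hops_so_far = (len(path) - 1) // 2  # each hop adds a relation and a term
        if hops_so_far >= max_hops:
            continue
        for relation, target in RELATIONS.get(term, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [f"<{relation}>", target]))
    return None


chain = find_chain("Infection", "Augmentin 875 mg-125 mg tablet")
print(" ".join(chain))
```

A hop limit is one plausible way to keep long, weakly related chains out of the results; the slides do not say how chain length is bounded.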
Evaluation • 3 match types suitable for evaluation • Semantic Match • More Specific • Contradicted • 10 records selected at random • Average of 250 clinical notes per record • An MD developed a list of 13-32 search terms for each record • Total of 169 terms, 134 unique • 4th-year medical students used as assessors • Assessors generated a list of paraphrases for each search term • 0-13 per term, 652 in total • Based on medical knowledge and/or lookup, without seeing the medical records
Assessment task, per search term • SemanticFind used interactively to locate matches • Precision: • GUI enhanced with evaluation widgets for assessors to enter judgments of GOOD or BAD • #GOOD = True Positives (TP) • #BAD = False Positives (FP) • Precision = TP/(TP+FP) • Recall: • System automatically searched for user-generated paraphrases (via Literal Match), and counted how many of these did not correspond to GOOD in Precision task. • This count = False Negatives (FN) • Recall = TP/(TP+FN) • F-Measure • F = 2PR/(P+R)
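The three measures above can be computed directly from the judgment counts. This is a small sketch with a hypothetical function name, and the counts in the usage line are made up for illustration; they are not results from the study.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F) from true-positive, false-positive, and false-negative counts.

    A measure whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Illustrative counts only: 90 GOOD, 10 BAD, 30 paraphrase hits not judged GOOD.
p, r, f = precision_recall_f(tp=90, fp=10, fn=30)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")
```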
Results (1) • Precision • Error analysis shows most FPs due to • ambiguity of abbreviations • negation detection error • Recall: 2 modes evaluated • Unconstrained = all supplied paraphrases • Constrained = only those paraphrases that matched UMLS concepts • F-Measure Unconstrained/Constrained = 0.87/0.92
Results (2) • Progressive analysis of GOOD matches: • Relative to Literal Match as a baseline • Semantic Match • Corresponds (very roughly) to Ctrl/F + Synonym Expansion • Semantic Match + More Specific • Semantic Match, More Specific + Contradicted • [Chart label: 103% Dark Matter]
Interesting Negation Detection Error • Due to somewhat informal formatting/writing of clinical notes, e.g.: “… alcohol use : no smoking : yes …” • Implicit sentence end (after “no”) is clear to humans, but not to the computer, giving rise to recognition of “no smoking” • Fixing the problem reduced false positives by 30%
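The "alcohol use : no smoking : yes" error can be reproduced with a toy negation rule. The sketch below is an assumption for illustration only and is not the authors' negation detector or their fix: `naive_negated_terms` and `split_fields` are hypothetical names, and the field pattern only handles "label : yes/no" runs.

```python
import re

# Toy negation cue: the word that follows "no" is treated as negated.
NEGATION_CUE = re.compile(r"\bno\s+(\w+)", re.IGNORECASE)
# Assumed form layout: a label, a colon, then a yes/no value.
FIELD = re.compile(r"(.+?)\s*:\s*(yes|no)\b", re.IGNORECASE)


def naive_negated_terms(text):
    """Return words directly after 'no', ignoring any implicit field boundary."""
    return [m.group(1).lower() for m in NEGATION_CUE.finditer(text)]


def split_fields(text):
    """Split a 'label : yes/no' run into (label, value) pairs.

    Each pair ends where the value ends, which makes the implicit
    sentence end explicit before any negation logic is applied.
    """
    return [(label.strip().lower(), value.lower())
            for label, value in FIELD.findall(text)]


snippet = "alcohol use : no smoking : yes"
print(naive_negated_terms(snippet))  # "no" is wrongly attached to "smoking"
print(split_fields(snippet))         # "no" belongs to "alcohol use"
```

Segmenting the form-style fields first means the value "no" is never read as a cue for the following label.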
Conclusions • SemanticFind is an application to search within a patient record • 13 searches performed simultaneously • using a variety of NLP technologies • Organized in a tabbed interface • High accuracy • F = 0.87 or 0.92 • Est. 2 points higher when sentence-end problem fixed • “Dark Matter” calculation shows that Ctrl/F misses as many desirable matches as it finds