Improving Access to Clinical Data Locked in Narrative Reports: An Informatics Approach. Wendy W. Chapman, PhD. Division of Biomedical Informatics, University of California, San Diego.
Overview • The promise of natural language processing (NLP) • Challenges of developing NLP in the clinical domain • Challenges in applying NLP in the clinical domain • Improving access to text through NLP resources
The promise of NLP • Vast & growing amounts of clinical text • Rich in information • Patient care • Evaluation/QC • Comparative effectiveness research • Epidemiology • Locked in free text • Natural language processing can help unlock that information • Encouraging NLP success stories
The promise of NLP Murff (2011) JAMA Results: “... higher sensitivity and lower specificity compared with patient safety indicators based on discharge coding.” • NLP captures: • Renal failure • Pulmonary embolism • Deep vein thrombosis • Sepsis • Pneumonia • Myocardial infarction “The promise of natural language processing ... may be closer than ever.”
Other promising NLP accomplishments ... • Smoking status (Savova, Hazlehurst) • Peripheral arterial disease (Pathak) • Medication extraction (Uzuner) • Pneumonia (Chapman) • Colonoscopy quality metrics (Harkema) • Breast cancer recurrence (Carrell) • Colorectal cancer screening behavior (Denny) • Rheumatoid arthritis (Zeng)
Overview • The promise of natural language processing (NLP) • Challenges of developing NLP in the clinical domain • Challenges in applying NLP in the clinical domain • Improving access to text through NLP resources
NLP Success “IBM's computer could very well herald a whole new era in medicine.” ComputerWorld, February 17, 2011 Dr. Watson?? “Fresh off its butt-kicking performance on Jeopardy!, IBM’s supercomputer ‘Watson’ has enrolled in medical school at Columbia University,” New York Daily News, February 18, 2011
Clinical NLP Since the 1960s Why has clinical NLP had little impact on clinical care?
Barriers to Development • Sharing clinical data difficult • Have not had shared datasets for development and evaluation • Modules trained on general English not sufficient • Insufficient common conventions and standards for annotations • Data sets are unique to a lab • Not easily interchangeable
Limited collaboration • Clinical NLP applications are silos and black boxes • Have not had open source applications • Reproducibility is a formidable challenge • Open source release not always sufficient • Software engineering quality not always great • Mechanisms for reproducing results are sparse
Overview • The promise of natural language processing (NLP) • Challenges of developing NLP in the clinical domain • Challenges in applying NLP in the clinical domain • Improving access to text through NLP resources
Security & Privacy Concerns Institutions are reluctant to share data • Clinical texts have many patient identifiers • 18 HIPAA identifiers • Names • Addresses • Items not regulated by HIPAA • tight end for the Steelers • Unique cases • 50-year-old woman who is pregnant • Sensitive information • HIV status
Lack of user-centered development and scalability • Perceived cost of applying NLP outweighs the perceived benefit (Len D’Avolio)
Overview • The promise of natural language processing (NLP) • Challenges of developing NLP in the clinical domain • Challenges in applying NLP in the clinical domain • Improving access to text through NLP resources
Resources for NLP Developers Knowledge Bases Domain Schema Ontology Modifier Ontology Clinical Data Annotations • Modifiers of clinical elements • Linguistic representation of clinical elements Annotation Environment Disease: colon cancer Experiencer: family Negation: no Historical: yes “Patient denies a family history of colon cancer” Evaluation Melissa Tharp
Modifier Ontology Affirmation/negation Uncertainty Experiencer Historical/Recent Severity Allowable modifiers For each clinical element Modifiers are important for interpreting text • Chest radiograph confirms pneumonia • Family history of pneumonia • No evidence of pneumonia
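To make the three pneumonia examples above concrete, here is a minimal sketch of how modifier values might be assigned from trigger phrases that precede a clinical element. The trigger lists, the function name `assign_modifiers`, and the output keys are all hypothetical and chosen for illustration; this is a simplified stand-in, not the system described in the talk.

```python
import re

# Hypothetical trigger phrases for illustration; a real lexicon would be
# far larger and curated by domain experts.
NEGATION_TRIGGERS = [r"\bno evidence of\b", r"\bno\b"]
FAMILY_TRIGGERS = [r"\bfamily history of\b", r"\bfamily\b"]
HISTORY_TRIGGERS = [r"\bhistory of\b"]


def assign_modifiers(sentence: str, concept: str) -> dict:
    """Assign modifier values to `concept` using triggers found to its left."""
    lowered = sentence.lower()
    position = lowered.find(concept.lower())
    if position < 0:
        raise ValueError(f"{concept!r} not found in sentence")
    # Only text before the concept is inspected in this simplified sketch.
    left_context = lowered[:position]

    def has_trigger(patterns):
        return any(re.search(p, left_context) for p in patterns)

    return {
        "concept": concept,
        "negation": "negated" if has_trigger(NEGATION_TRIGGERS) else "affirmed",
        "experiencer": "family" if has_trigger(FAMILY_TRIGGERS) else "patient",
        "historical": has_trigger(HISTORY_TRIGGERS),
    }


if __name__ == "__main__":
    for text in (
        "Chest radiograph confirms pneumonia",
        "Family history of pneumonia",
        "No evidence of pneumonia",
    ):
        print(text, "->", assign_modifiers(text, "pneumonia"))
```

The same word, pneumonia, ends up with three different interpretations, which is the point the slide makes about why modifiers matter.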
Modifier Ontology Types of modifiers Linguistic expressions Actions Translations
Schema Ontology Imports Modifier Ontology Medications • Type • Dose • Frequency • Route Diagnosis • Negation • Uncertainty • Severity • History • Experiencer Consistent with other models: Clinical element models, cTAKES type system, Common model
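One way to picture the schema ontology is as a table of allowable modifiers per clinical element type. The sketch below encodes the two element types listed on the slide as a plain dictionary and checks an annotation against it. The names `SCHEMA` and `unexpected_modifiers` are hypothetical, and the actual ontology format used in the project is not shown in the slides.

```python
# Allowable modifiers per clinical element type, taken from the slide.
# The dictionary representation itself is an assumption for illustration.
SCHEMA = {
    "Medication": {"type", "dose", "frequency", "route"},
    "Diagnosis": {"negation", "uncertainty", "severity", "history", "experiencer"},
}


def unexpected_modifiers(element_type: str, modifiers: dict) -> list:
    """Return the modifier names that the schema does not allow for this type."""
    if element_type not in SCHEMA:
        raise KeyError(f"unknown element type: {element_type}")
    return sorted(set(modifiers) - SCHEMA[element_type])


if __name__ == "__main__":
    print(unexpected_modifiers("Diagnosis", {"negation": "no", "experiencer": "family"}))
    print(unexpected_modifiers("Medication", {"dose": "5 mg", "severity": "mild"}))
```

An empty list means every modifier on the annotation is permitted for that element type.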
Domain Ontology for NLP Instance of schema ontology Clinical elements from a particular domain
Synonyms Misspellings Regular expressions
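As a rough illustration of how synonyms, misspellings, and regular expressions could be combined for one clinical element, the sketch below compiles a single lexicon entry into one pattern and scans text with it. The entry for dyspnea, including its misspellings and regex, is invented for the example and does not come from the talk's domain ontology.

```python
import re

# Hypothetical lexicon entry; all terms are illustrative only.
CONCEPT_LEXICON = {
    "dyspnea": {
        "synonyms": ["dyspnea", "shortness of breath"],
        "misspellings": ["dypsnea", "dyspnia"],
        "regexes": [r"short\s+of\s+breath"],
    }
}


def compile_concept(entry: dict) -> "re.Pattern":
    """Combine literal terms (escaped) and raw regexes into one pattern."""
    literals = [re.escape(term) for term in entry["synonyms"] + entry["misspellings"]]
    alternatives = "|".join(literals + entry["regexes"])
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def find_concepts(text: str, lexicon: dict = CONCEPT_LEXICON) -> list:
    """Return (concept, matched_text, start_offset) for every match in text."""
    hits = []
    for concept, entry in lexicon.items():
        for match in compile_concept(entry).finditer(text):
            hits.append((concept, match.group(0), match.start()))
    return hits


if __name__ == "__main__":
    print(find_concepts("Pt reports dypsnea and is short  of breath."))
```

Keeping literal terms and regexes in separate lists lets non-programmers maintain the synonym and misspelling lists while regexes stay optional.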
Resources for NLP Experts Schemas Clinical Data Annotations Annotation Environment Evaluation Lack of shareable data is a barrier • University of Pittsburgh Repository • 111,045 reports of 9 types • 600 users • No longer available • MT Samples • 2,300 reports from MTSamples.com • De-identified
Resources for NLP Experts Schemas Clinical Data Annotations Annotation Environment Evaluation B South, D Mowery, S Velupillai, L Christensen, S Meystre AMIA NLP Working Group ShARe: Sharing Annotated Resources 5R01GM090187: Chapman, Savova, Elhadad • 600 clinical notes from MIMIC II repository • Annotate disorders and modifiers • Anatomic location • Map to SNOMED codes • CLEF Shared Task 2013 and 2014 • https://sites.google.com/site/shareclefehealth/
Resources for NLP Experts Schemas Annotator Registry Clinical Data eHOST Annotation Admin Annotations Annotation Environment Web application iDASH cloud Client app Evaluation VA, SHARP, and NIGMS : S Duvall, B South, B Adams, G Savova, N Elhadad, H Hochheiser Distributed annotation in secure environment
Annotator Registry Annotators • Enlist for annotation • Certify for annotation tasks • Personal health information • Part-of-speech tagging • UMLS mapping • Set pay rate NLP Admins • Search for annotators http://nlp-ecosystem.ucsd.edu/annotators
Resources for NLP Experts Schemas Annotator Registry Clinical Data eHOST Annotation Admin Annotations Annotation Environment Web application iDASH cloud Client app Evaluation Distributed annotation in secure environment
Resources for NLP Experts Schemas Clinical Data Annotations Annotation Environment Evaluation • Compare output of NLP annotators • NLP system vs human annotation • View annotations • Calculate outcome measures • Drill down to all levels of annotation • Perform error analysis
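The comparison of NLP system output against human annotation can be sketched as a set comparison that yields standard outcome measures plus the disagreements needed for error analysis. The tuple layout `(document_id, start, end, label)` and the function name `evaluate` are assumptions for this example, and exact-match scoring is only one of several possible matching criteria.

```python
def evaluate(system_annotations, reference_annotations) -> dict:
    """Compare system output to human reference annotations by exact match."""
    system = set(system_annotations)
    reference = set(reference_annotations)

    true_positives = system & reference
    false_positives = system - reference   # system said yes, human did not
    false_negatives = reference - system   # human said yes, system missed it

    tp, fp, fn = len(true_positives), len(false_positives), len(false_negatives)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Kept for drill-down and error analysis.
        "false_positives": sorted(false_positives),
        "false_negatives": sorted(false_negatives),
    }


if __name__ == "__main__":
    human = [("doc1", 0, 9, "pneumonia"), ("doc1", 20, 25, "cough"), ("doc2", 5, 10, "fever")]
    system = [("doc1", 0, 9, "pneumonia"), ("doc1", 20, 25, "cough"), ("doc2", 30, 38, "dyspnea")]
    print(evaluate(system, human))
```

Returning the false positives and false negatives alongside the scores supports the drill-down and error-analysis steps listed on the slide.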
Select Classifications to View Document & annotations Outcome Measures for Selected Annotations Report List Attributes for Selected Annotation Relationships for Selected Annotation VA and ONC SHARP: Christensen, Murphy, Frabetti, Rodriguez, Savova
Controlled Vocabs Dry cough Productive cough Cough Hacking cough Bloody cough User’s Concepts Cough Dyspnea Infiltrate on CXR Wheezing Fever Cervical Lymphadenopathy Which concepts?
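The "which concepts?" question can be illustrated with a toy is-a hierarchy: given a user's concept such as Cough, list every more specific vocabulary term that the user might want to include or exclude. The `PARENT_OF` mapping below is a hypothetical fragment built from the terms on the slide, not an extract from any real controlled vocabulary, and it is assumed to contain no cycles.

```python
# Hypothetical child -> parent links built from the terms on the slide.
PARENT_OF = {
    "dry cough": "cough",
    "productive cough": "cough",
    "hacking cough": "cough",
    "bloody cough": "cough",
}


def is_a(term: str, ancestor: str, parent_of: dict) -> bool:
    """Walk up the hierarchy from `term` and report whether `ancestor` is reached."""
    current = term
    while current is not None:
        if current == ancestor:
            return True
        current = parent_of.get(current)
    return False


def candidate_terms(user_concept: str, parent_of: dict = PARENT_OF) -> list:
    """List the concept itself plus all of its descendants, alphabetically."""
    target = user_concept.lower()
    all_terms = set(parent_of) | set(parent_of.values()) | {target}
    return sorted(term for term in all_terms if is_a(term, target, parent_of))


if __name__ == "__main__":
    print(candidate_terms("Cough"))
    print(candidate_terms("Dry cough"))
```

A user interface could present this list and let the user tick the variants that count for their study.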
Attribute-values Temp 38.0C Low-grade temperature User’s Concepts Cough Dyspnea Infiltrate on CXR Wheezing Fever Cervical Lymphadenopathy What values?
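The "what values?" question arises because a concept such as Fever may appear as a number rather than a word. The sketch below pulls a temperature out of text and compares it with a threshold that the user must supply. The pattern, the function names, and the choice to leave the threshold as a required argument are all assumptions made for illustration; no clinical cut-off is implied.

```python
import re
from typing import Optional

# Matches e.g. "Temp 38.0C" or "Temperature of 100.4 F" (illustrative pattern).
TEMP_PATTERN = re.compile(
    r"\btemp(?:erature)?\s*(?:of\s*)?(\d{2,3}(?:\.\d+)?)\s*°?\s*([CF])\b",
    re.IGNORECASE,
)


def extract_temperature_celsius(text: str) -> Optional[float]:
    """Return the first temperature mentioned in text, converted to Celsius."""
    match = TEMP_PATTERN.search(text)
    if match is None:
        return None
    value = float(match.group(1))
    unit = match.group(2).upper()
    return value if unit == "C" else (value - 32) * 5 / 9


def meets_fever_definition(text: str, threshold_celsius: float) -> Optional[bool]:
    """Apply a user-supplied threshold; None means no numeric value was found."""
    temperature = extract_temperature_celsius(text)
    if temperature is None:
        return None
    return temperature >= threshold_celsius


if __name__ == "__main__":
    print(meets_fever_definition("Temp 38.0C", threshold_celsius=38.0))
    print(meets_fever_definition("Low-grade temperature", threshold_celsius=38.0))
```

Qualitative phrases such as "low-grade temperature" yield `None` here, so whether they count toward the user's concept remains a separate decision for the user.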
Efficient Access to Information in the Patient Chart “Family history of colon cancer” Knowledge Author Schema Builder Chart Review Interface Disease: colon cancer Experiencer: family Negation: no Historical: yes NLP Schema Domain Ontology
Knowledge Author B Scuba, F Fana, Liqin Wang, Mingyuan Zhang, Y Liu, M Kong, F Drews • Front end interface for users • Back end • Schema ontology • Modifier ontology • Output • Domain ontology • Schema for NLP system
African American Adult Questions | Discussion wwchapman@ucsd.edu
No family history of colon cancer Linguistic modifiers
Access Information in Patient Chart Knowledge Author Chart Review Interfaces • Navigate patient data more efficiently • Point chart reviewer to ambiguous and contradictory information • Reduce bias
Access Information in Patient Chart Knowledge Author Viz NLP Subjects, Diagnoses Findings, Anatomical Locations EMR Chart Review Interfaces Feedback – improve models Population Patient Document Expression User Identifies Patients Meeting Criteria Interactive Search and Review of Clinical Records with Multi-layered Semantic Annotation NLM 1R01LM010964-01. Chapman, Wiebe, Hwa.