Text Mining of Medical Documents
Michael Elhadad - Raphael Cohen
Dept of Computer Science
Natural Language Processing
• Analyze free text to extract “information”
• Key challenges:
  • Ambiguity: heart, ברק (Hebrew for “lightning”, and also the name Barak)
  • Variability: diabetes, dm, diab.
• Applications:
  • Search
  • Text Mining: information extraction, relations
  • Summarization
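The variability challenge above (diabetes, dm, diab.) can be illustrated with a minimal sketch that maps surface variants to one canonical term. The `VARIANTS` table and the `normalize_terms` function are hypothetical names introduced here for illustration; they are not taken from the authors' work, and a real system would more likely rely on a curated terminology than on a hand-written dictionary.

```python
import re

# Hypothetical variant table (an assumption for illustration only):
# each surface form is mapped to one canonical concept name.
VARIANTS = {
    "diabetes": "diabetes",
    "dm": "diabetes",
    "diab": "diabetes",
}


def normalize_terms(text):
    """Return the canonical concepts found in text, in order of appearance."""
    # Lowercase and keep alphabetic tokens only, so "DM," and "diab." match.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [VARIANTS[t] for t in tokens if t in VARIANTS]


print(normalize_terms("Pt with DM, diab. controlled by diet"))
```

A lookup of this kind addresses variability only. It does not resolve ambiguity, since the same abbreviation can denote different concepts in different contexts, which is where context-aware models come in.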
NLP for Medical Domain
Opportunity
• Availability of online textual documents
  • EHR: mostly textual (discharge notes)
  • Scientific literature (PubMed)
Challenge
• Methods developed on “regular language” fail on “medical language”
Specific Interest
• EHR
  • Exploit rich textual data in EHR.
  • In Hebrew!
• Hebrew NLP
  • Complex morphology, no dictionaries, no UMLS
• Domain Adaptation
  • Machine learning methods to port NLP models from one domain to the medical domain.
Recent Work in Domain
• Raphael Cohen, Michael Elhadad and Ohad S Birk, Analysis of free online physician advice services, PLOS ONE, 2013.
• Raphael Cohen, Noemie Elhadad and Michael Elhadad, Redundancy in Electronic Health Record Corpora: Analysis, Impact on Text Mining Performance and Mitigation Strategies, BMC Bioinformatics, 2013.
• Raphael Cohen and Michael Elhadad, Syntactic Dependency Parsers for Biomedical-NLP, AMIA Proceedings 2012, pp. 121-128.
• Raphael Cohen, Yoav Goldberg and Michael Elhadad, Domain Adaptation of a Dependency Parser with a Class-Class Selectional Preference Model, ACL 2012, SRW.
• Raphael Cohen, Avitan Gefen, Michael Elhadad and Ohad S Birk, CSI-OMIM - Clinical Synopsis Search in OMIM, BMC Bioinformatics 2011, 12:65.