Development of ConText Tools in Python
Brian E. Chapman, PhD; Glenn Dayton; Wendy W. Chapman, PhD
Division of Biomedical Informatics
Caveats and Apologies • I’m not a linguist, computational or otherwise, nor am I a grammarian, or multilingual, or particularly well spoken in my native English • In fact I’m a medical physicist who has drifted into imaging informatics with an emphasis on the image part of imaging informatics • Which is a long way of saying that I got into this field because of a specific problem
Motivation • Received NIH funding for computer-aided detection of pulmonary embolism in CT pulmonary angiography (CTPA) • How to identify appropriate cases from clinical PACS?
Case Identification Approach #1 • Talk to an honest broker • Who was obviously overworked • Who used procedure codes from RIS to identify potential cases • Who then read the dictated report • Who then classified the case • Who nearly fainted when I told her I needed hundreds of positive cases • Who then quickly asked, “Do you have a lot of money?”
Case Identification Approach #2 • Honest broker’s task is perfect for NegEx • Use procedure codes to identify reports in MARS repository at University of Pittsburgh • Use NegEx to classify reports as +/- for PE • Within minutes find hundreds of cases • Very happy honest broker
What if you wanted to answer more questions? • Disease uncertainty • Disease temporality • Image quality • Can we a priori specify all of these?
peFinder • Application to characterize CTPA reports • Presence or absence of PE • Temporal state of positive PE • Uncertainty of disease state • Technical quality of the exam
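For Review: NegEx • Clinical condition: cough • Negation: negated • “Patient denies cough but complains of headache.” (“denies” is the trigger term; “but” is the termination term that closes the negation scope) • “No change in the patient’s chest pain.” (“No change” is a pseudo-trigger term, so the condition is not negated)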
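The review slide above can be made concrete with a minimal sketch of NegEx-style scoping. This is not the NegEx implementation: the three tiny term lists and the function name `is_negated` are assumptions chosen only to cover the two example sentences, and the real lexicon is far larger.

```python
import re

# Illustrative term lists only (assumptions); the real NegEx lexicon is much larger.
PSEUDO_TRIGGERS = ["no change"]   # look like negation but are not
TRIGGERS = ["denies", "no"]       # open a forward negation scope
TERMINATIONS = ["but"]            # close the scope


def _alternation(terms):
    """Build a whole-word regex that matches any of the given terms."""
    return r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"


def is_negated(sentence, condition):
    """Return True if `condition` lies inside a forward negation scope."""
    text = sentence.lower()
    # Mask pseudo-triggers first so the "no" in "no change" is never
    # mistaken for a real trigger. Masking keeps character offsets stable.
    text = re.sub(_alternation(PSEUDO_TRIGGERS),
                  lambda m: "_" * len(m.group(0)), text)
    target = re.search(r"\b" + re.escape(condition.lower()) + r"\b", text)
    if target is None:
        return False
    for trigger in re.finditer(_alternation(TRIGGERS), text):
        if trigger.end() > target.start():
            continue  # trigger is not to the left of the condition
        # Scope runs from the trigger to the first termination term,
        # or to the end of the sentence if there is none.
        stop = re.search(_alternation(TERMINATIONS), text[trigger.end():])
        scope_end = trigger.end() + stop.start() if stop else len(text)
        if target.start() < scope_end:
            return True
    return False
```

With these lists, "cough" in the first example sentence is reported as negated, while "headache" (after the termination term) and "chest pain" (after a pseudo-trigger) are not.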
Python Implementations • What Drove My Organic Design • What existed in NegEx • GUI program written in Tcl/Tk • Lots of enumerated trigger terms • What I wanted • I wanted a package that could be used to build a variety of accurate applications • I wanted it to be easy for others to use • I am an engineer and so lazy • Generalize relationships • Replace exhaustive enumeration of trigger terms with regular expressions
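To illustrate the last point, here is a small sketch contrasting an enumerated trigger list with a single regular expression. The specific phrases and the names `ENUMERATED_TRIGGERS`, `TRIGGER_PATTERN`, `find_enumerated`, and `find_pattern` are hypothetical examples, not terms taken from NegEx or pyConText.

```python
import re

# Enumeration style: every surface form is listed by hand (hypothetical list).
ENUMERATED_TRIGGERS = [
    "no evidence of", "no sign of", "no signs of",
    "without evidence of", "without sign of", "without signs of",
]

# Regex style: one pattern covers the same six phrases.
TRIGGER_PATTERN = re.compile(
    r"\b(?:no|without) (?:evidence|signs?) of\b", re.IGNORECASE
)


def find_enumerated(text):
    """Return the enumerated triggers that occur in the text."""
    lowered = text.lower()
    return [t for t in ENUMERATED_TRIGGERS if t in lowered]


def find_pattern(text):
    """Return the trigger phrases matched by the single regex."""
    return [m.group(0).lower() for m in TRIGGER_PATTERN.finditer(text)]
```

The trade-off is the one raised later in the talk: the pattern is shorter to maintain, but whoever extends it has to be comfortable writing regular expressions.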
pyConText: Basic Framework • Item Objects: 4-tuple containing Lexical and Domain Knowledge • Literal (label): “pulmonary embolism” • Category/Concept • Regular expression • r'''(pulmonary )(artery )?(embol[a-z]+)''' • Rule • Directional influence of item in sentence • Category interaction?
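A rough sketch of such a 4-tuple, using only the standard library. The field names, the category strings, the `"forward"` rule value, and the helper `find_spans` are assumptions made for illustration; they are not the pyConText API. Only the literal and the regular expression for pulmonary embolism come from the slide.

```python
import re
from collections import namedtuple

# Field names are assumptions that mirror the four slots listed on the slide.
Item = namedtuple("Item", ["literal", "category", "regex", "rule"])

# A target item: the literal and regex are the ones shown on the slide.
pe_item = Item(
    literal="pulmonary embolism",
    category="PULMONARY_EMBOLISM",   # hypothetical category name
    regex=r'''(pulmonary )(artery )?(embol[a-z]+)''',
    rule="",                         # no directional influence in this sketch
)

# A modifier item (hypothetical) that influences text to its right.
negation_item = Item(
    literal="no evidence of",
    category="DEFINITE_NEGATED_EXISTENCE",  # hypothetical category name
    regex=r"no (evidence|signs?) of",
    rule="forward",
)


def find_spans(item, sentence):
    """Return (start, end) spans where the item occurs in the sentence.

    Falls back to the escaped literal when the item carries no regex.
    """
    pattern = item.regex or re.escape(item.literal)
    return [m.span() for m in re.finditer(pattern, sentence, re.IGNORECASE)]
```

Because the regex is optional in this sketch, simple items can be defined by literal alone, and a pattern is only written when the surface forms vary.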
pyConText: Basic Framework • Item Objects parse sentence to create Tag Objects within sentences • Tag Objects interact/modify each other • Targets • Modifiers • Conjunctions • Prune to eliminate subset tag objects • Directional Graph represents relationships
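The pipeline on this slide (tag, prune, link) might be sketched as follows. Everything here is a simplified assumption rather than pyConText itself: the `Tag` tuple, the three category names, the `"forward"` and `"terminate"` rule values, and the plain dictionary standing in for the graph.

```python
import re
from collections import namedtuple

Item = namedtuple("Item", ["literal", "category", "regex", "rule"])
Tag = namedtuple("Tag", ["item", "start", "end", "text"])

# Hypothetical items: two overlapping targets, one modifier, one conjunction.
ITEMS = [
    Item("pulmonary embolism", "TARGET", r"(pulmonary )(artery )?(embol[a-z]+)", ""),
    Item("embolism", "TARGET", r"embol[a-z]+", ""),
    Item("no evidence of", "MODIFIER", r"no evidence of", "forward"),
    Item("but", "CONJUNCTION", r"\bbut\b", "terminate"),
]


def tag_sentence(sentence, items):
    """Apply every item to the sentence and return tags in reading order."""
    tags = []
    for item in items:
        for m in re.finditer(item.regex, sentence, re.IGNORECASE):
            tags.append(Tag(item, m.start(), m.end(), m.group(0)))
    return sorted(tags, key=lambda t: (t.start, t.end))


def prune(tags):
    """Drop any tag whose span is a proper subset of another tag's span."""
    def inside(a, b):
        return (b.start <= a.start and a.end <= b.end
                and (a.start, a.end) != (b.start, b.end))
    return [t for t in tags if not any(inside(t, other) for other in tags)]


def build_graph(tags):
    """Map each forward modifier to the targets it reaches.

    A modifier's reach ends at the next conjunction to its right.
    """
    conjunctions = [t.start for t in tags if t.item.rule == "terminate"]
    targets = [t for t in tags if t.item.category == "TARGET"]
    edges = {}
    for mod in tags:
        if mod.item.rule != "forward":
            continue
        limit = min((c for c in conjunctions if c >= mod.end),
                    default=float("inf"))
        edges[mod] = [t for t in targets if mod.end <= t.start < limit]
    return edges
```

For a sentence such as "No evidence of pulmonary embolism but chronic embolus persists.", pruning removes the shorter "embolism" tag nested inside "pulmonary embolism", and the conjunction keeps the negation from reaching "embolus".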
Did I Meet My Objectives? • Accurate • Yes: JBI 2011 • Modular • Yes: package on PyPI • Easy for others to use • Depends on your definition of others • Wilson et al., Journal of Pathology Informatics • Gentili and Chapman, RSNA
Did I Meet My Objectives? • Easy for others to use (continued) • Can any application that relies on the user to provide regular expressions be described as easy?
Current and Future Work • Web and GUI applications • Django • Django with Twisted for desktop port
Current and Future Work • Improved Knowledge Representation • Separating linguistic and domain knowledge • Integration with external knowledge bases • Use graphs to further reduce enumeration of items • No/definite/evidence of/pulmonary embolism
Thanks for the invitation • Looking forward to • Learning and • Working and • Skateboarding • For the next three weeks