This research focuses on the construction of graphs of consistent concepts in the field of medical data mining. The study includes hospital workflow, chest X-ray orders, medical record digitization, and the use of the Unified Medical Language System (UMLS) for semantic annotation and concept mapping.
Graphs of Consistent Concepts: Data mining in a medical domain
(Pawel Matykiewicz, Wlodzislaw Duch, John Pestian)
The story (1)
• Hospital workflow:
  • Chest X-Ray order (Electronic Medical Record)
  • Chest X-Ray (High Quality JPEG)
  • Dictation ("CLINICAL HISTORY: 9-month-29-day-old male with a history of cough. Rule out pneumonia. PROCEDURE COMMENTS: None. COMPARISON: XX/XX/XX. FINDINGS: There is mild hyperinflation of the lungs with increased peribronchial markings most consistent with viral versus reactive airway disease. Hazy increased density is seen in the right middle lobe, left lower lobe which could represent subsegmental atelectasis. Hazy increased density is also noted at the lingula with partial effacement of the left heart contour which could represent atelectasis versus early pneumonia. No pleural effusion is noted. The cardiothymic silhouette is within normal limits. Soft tissues and bony structures are unchanged. IMPRESSION: Findings most consistent with viral versus reactive airway disease. Patchy atelectasis is associated. Lingular early infiltrate cannot be excluded.")
  • Billing (ICD9CM 590.8 = INFECTIONS OF KIDNEY: OTHER PYELONEPHRITIS OR PYONEPHROSIS, NOT SPECIFIED AS ACUTE OR CHRONIC)
Creating a novel tool (2)
• Recognition memory
• Semantic memory
• Episodic memory
• Full text annotation
UMLS (3)
• UMLS = Unified Medical Language System
• UMLS contains:
  • 1,195,781 unique English concepts (CUI)
  • 2,873,310 unique English phrases (SUI)
  • 3,283,983 unique English, normalized words (WUI)
  • 88 different ontologies (e.g. ICD9CM = 15,871 CUIs)
  • 36,627,948 relations
  • 11,495,405 co-occurrence relations
Example (4)
• Concept description:
  • ENG|zygopleurage zygospora|C1473040|L5302079|S6018172|
  • C1533582|ENG|P|L5432111|PF|S6215413|Y|A7881881|2532798015|412807000||SNOMEDCT|PT|412807000|Serum inhibin measurement|4|N||
• Relation description:
  • C0000039|A6841046|CODE|RO|C0364349|A0683492|CODE|has_component|R39728053||LNC|LNC||Y|N||
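The pipe-delimited rows above can be read into named fields with a few lines of Python. This is a minimal sketch, not the authors' tooling. The slide does not name the source files; the column lists below are an assumption based on the field patterns, which match the UMLS Rich Release Format layouts for MRXNS_ENG (normalized strings), MRCONSO (concept names) and MRREL (relations). The helper name `parse_rrf_row` is hypothetical.

```python
# Column layouts are ASSUMED from the field patterns on the slide; they follow
# the UMLS Rich Release Format (RRF) files MRXNS_ENG, MRCONSO and MRREL.
MRXNS_COLUMNS = ["LAT", "NSTR", "CUI", "LUI", "SUI"]
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]
MRREL_COLUMNS = [
    "CUI1", "AUI1", "STYPE1", "REL", "CUI2", "AUI2", "STYPE2", "RELA",
    "RUI", "SRUI", "SAB", "SL", "RG", "DIR", "SUPPRESS", "CVF",
]

# The three rows shown on the slide, copied verbatim.
NORMALIZED_ROW = "ENG|zygopleurage zygospora|C1473040|L5302079|S6018172|"
CONCEPT_ROW = (
    "C1533582|ENG|P|L5432111|PF|S6215413|Y|A7881881|2532798015|412807000|"
    "|SNOMEDCT|PT|412807000|Serum inhibin measurement|4|N||"
)
RELATION_ROW = (
    "C0000039|A6841046|CODE|RO|C0364349|A0683492|CODE|has_component|"
    "R39728053||LNC|LNC||Y|N||"
)


def parse_rrf_row(line, columns):
    """Split one pipe-delimited row into a dict keyed by column name.

    parse_rrf_row is a hypothetical helper name, not part of any UMLS tool.
    """
    fields = line.rstrip("\n").split("|")
    # Every row ends with a trailing pipe, which yields one empty extra field.
    if fields and fields[-1] == "":
        fields = fields[:-1]
    if len(fields) != len(columns):
        raise ValueError(
            f"expected {len(columns)} fields, found {len(fields)}"
        )
    return dict(zip(columns, fields))


normalized = parse_rrf_row(NORMALIZED_ROW, MRXNS_COLUMNS)
concept = parse_rrf_row(CONCEPT_ROW, MRCONSO_COLUMNS)
relation = parse_rrf_row(RELATION_ROW, MRREL_COLUMNS)
```

Under these assumed layouts, `concept["STR"]` holds the phrase, `concept["SAB"]` the source vocabulary, and `relation["RELA"]` the relation label linking `relation["CUI1"]` and `relation["CUI2"]`.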
Sense Disambiguation (5)
• Word Sense Disambiguation:
  • "cold" (word):
    • "I am taking aspirin for my cold"
    • "Let's go inside, I'm cold"
• Phrase Sense Disambiguation:
  • "cold" (WUI):
    • cold temperature (CUI)
    • Common Cold (CUI)
    • Cold Therapy (CUI)
    • Chronic Obstructive Airway Disease (CUI)
    • Cold Sensation (CUI)
    • Cold brand of chlorpheniramine-phenylpropanolamine (CUI)
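To make the ambiguity concrete, the sketch below picks one of the candidate concepts for "cold" by counting context words shared with a small set of cue words per concept. The slides do not state which disambiguation method was used, so a simplified Lesk-style overlap is swapped in here purely for illustration. The cue words are invented for this example and do not come from UMLS, and the function names are hypothetical.

```python
import re

# Candidate concepts for the word "cold", named as on the slide.
# The cue words are INVENTED for illustration; they are not UMLS data.
CANDIDATE_SENSES = {
    "cold temperature": {"temperature", "weather", "outside", "inside", "freezing"},
    "Common Cold": {"aspirin", "cough", "virus", "infection", "fever"},
    "Cold Therapy": {"therapy", "ice", "pack", "treatment"},
    "Chronic Obstructive Airway Disease": {"chronic", "obstructive", "airway", "lung"},
    "Cold Sensation": {"sensation", "feeling", "chills"},
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(sentence, senses):
    """Return the sense whose cue words overlap most with the sentence.

    Returns None when no sense shares any word with the sentence.
    Ties are broken by the order of the senses dictionary.
    """
    context = tokenize(sentence)
    scores = {name: len(context & cues) for name, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


illness = disambiguate("I am taking aspirin for my cold", CANDIDATE_SENSES)
temperature = disambiguate("Let's go inside, I'm cold", CANDIDATE_SENSES)
unknown = disambiguate("The cold was noted", CANDIDATE_SENSES)
```

With these cue words, the first sentence resolves to "Common Cold" through "aspirin" and the second to "cold temperature" through "inside". A sentence with no cue word returns None, which is the case where surrounding concepts would have to decide.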
Concept Mapping (6)
• Tough way:
• Easy way:
Graphs of consistent concepts (7)
• JJ(X) => ( NN(Y) => C(XY) )
• X versus Y Z => ( C(YZ) => C(XZ) )
• X is associated => ( C(X) => P(X) = 1 )
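The three rules can be tried on the IMPRESSION text of the dictation. The sketch below rests on one reading of the notation, which the slide does not define: JJ and NN are taken as adjective and noun part-of-speech tags, C(...) as "is a concept", and P(X) = 1 as "X is asserted as present". The function names, the hand-assigned tags and the starting set of known concepts are all hypothetical, and this is not the authors' implementation.

```python
import re

IMPRESSION = (
    "Findings most consistent with viral versus reactive airway disease. "
    "Patchy atelectasis is associated. "
    "Lingular early infiltrate cannot be excluded."
)

# ASSUMED starting knowledge: phrases already mapped to concepts.
KNOWN_CONCEPTS = {"reactive airway disease", "atelectasis"}


def adjective_noun_rule(tagged_tokens):
    """JJ(X) => ( NN(Y) => C(XY) ): adjacent adjective + noun forms a concept."""
    derived = set()
    for (word, tag), (next_word, next_tag) in zip(tagged_tokens, tagged_tokens[1:]):
        if tag == "JJ" and next_tag == "NN":
            derived.add(f"{word} {next_word}")
    return derived


def versus_rule(text, known_concepts):
    """X versus Y Z => ( C(YZ) => C(XZ) ), with single-word X and Y."""
    derived = set()
    lowered = text.lower()
    for concept in known_concepts:
        head, _, rest = concept.partition(" ")  # head is Y, rest is Z
        if not rest:
            continue  # a one-word concept has no Z part to carry over
        pattern = r"\b(\w+) versus " + re.escape(concept) + r"\b"
        for match in re.finditer(pattern, lowered):
            derived.add(f"{match.group(1)} {rest}")
    return derived


def associated_rule(text, known_concepts):
    """X is associated => ( C(X) => P(X) = 1 )."""
    lowered = text.lower()
    return {
        concept: 1.0
        for concept in known_concepts
        if re.search(r"\b" + re.escape(concept) + r" is associated\b", lowered)
    }


# Tags assigned by hand for this example; a real tagger would supply them.
TAGGED = [("patchy", "JJ"), ("atelectasis", "NN"), ("is", "VBZ"), ("associated", "VBN")]

new_from_tags = adjective_noun_rule(TAGGED)
new_from_versus = versus_rule(IMPRESSION, KNOWN_CONCEPTS)
present = associated_rule(IMPRESSION, KNOWN_CONCEPTS)
```

Here the second rule derives "viral airway disease" from the known concept "reactive airway disease", the first proposes "patchy atelectasis", and the third marks "atelectasis" as present. Matching the concept phrase directly, rather than guessing where Z ends, keeps the sketch simple but only handles single-word X and Y.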
Summary (10)
• Data set:
  • 30 training documents (6 ICD9CM codes, 137 CUIs)
  • 30 testing documents (6 ICD9CM codes, 301 CUIs)
  • 30 training documents (6 ICD9CM codes, 301 CUIs)
  • 30 testing documents (6 ICD9CM codes, 137 CUIs)
• To do:
  • Construction finding
  • Concept discovery
  • State discovery