Cornerstone I: Representing Knowledge. From Data to Knowledge Through Concept-Oriented Terminologies. James J. Cimino
The first step on the path to knowledge is getting things by their right names. -Chinese saying
Overview • What is “data to knowledge”? • Knowledge representation choices • Knowledge-based terminology efforts • Medical Entities Dictionary • Proof of concepts
What is “data to knowledge”? • Start with patient data in the medical record • Enhance knowledge by: • gaining a better understanding of the patient • learning relevant knowledge • bringing smart systems to bear to apply knowledge • discovering new knowledge from health data
Knowledge Representation • Terminology for representing symbols • Format for arranging the symbols
Knowledge Representation Choices • Guideline implementation
Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline
National Cholesterol Education Panel Guideline (flowchart) • Start: Measure Cholesterol & Assess Risk Factors • Cholesterol branches: <200, 200 to 239, >239 • Risk branches: HDL <35 or 2 Risks; HDL >35, <2 Risks • Actions: Provide dietary information; Reevaluate in 2 years
Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline • Three representations: • PROLOG (first-order logic)
NCEP Guideline in PROLOG
% Rule J: HDL >= 35, fewer than 2 risk factors, cholesterol 200 to 239.
rule_j(PID) :-
    check_lab(PID, hdl, HDL, _), !,
    HDL >= 35,
    total_risk(PID, Risk), !,
    Risk < 2,
    check_lab(PID, cholesterol, C, _),
    C >= 200,
    C =< 239,
    print_rule_j.
Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline • Three representations: • PROLOG (first-order logic) • CLASSIC (frames)
NCEP Guideline in CLASSIC
(CL-DEFINE-CONCEPT 'C-PATIENT
  '(AND (ALL CHOL (AND INTEGER (MIN 200) (MAX 239)))))
(CL-DEFINE-CONCEPT 'G-PATIENT
  '(AND C-PATIENT LOW-RISK-PATIENT
        (ALL HDL (AND INTEGER (MIN 35)))))
Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline • Three representations: • PROLOG (first-order logic) • CLASSIC (frames) • CLIPS (production rules)
NCEP Guideline in CLIPS
(defrule C2G2J "Rules to reach box J"
   ; Match a patient in state c that has not yet been processed.
   ?f1 <- (calculated-patient (state c) (done no) (hdl ?hdl) (name ?name))
   (test (>= ?hdl 35))
   =>
   (printout t "Patient " ?name " needs treatment" crlf))
Guideline Implementation • Starren and Xie, SCAMC, 1994 • National Cholesterol Education Panel Guideline • Three representations: • PROLOG (first-order logic) • CLASSIC (frames) • CLIPS (production rules) • “All three representations proved adequate for encoding the guideline”
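What the three encodings share is a single condition on a patient's lab values and risk count. As a rough illustration only, the rule J condition from the PROLOG slide can be written as a plain predicate. This is a minimal Python sketch, not the authors' implementation; the `Patient` record, its field names, and the function name `reaches_box_j` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical patient record; the field names are assumptions."""
    name: str
    cholesterol: int   # total cholesterol, mg/dl
    hdl: int           # HDL cholesterol, mg/dl
    risk_factors: int  # number of risk factors


def reaches_box_j(p: Patient) -> bool:
    """Mirror the PROLOG rule_j: HDL >= 35, fewer than 2 risks,
    and total cholesterol between 200 and 239 inclusive."""
    return p.hdl >= 35 and p.risk_factors < 2 and 200 <= p.cholesterol <= 239


if __name__ == "__main__":
    print(reaches_box_j(Patient("A", cholesterol=220, hdl=40, risk_factors=1)))
```

The predicate only tests the condition; what each representation adds is the control strategy around it (backtracking, classification, or forward chaining).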
Knowledge Representation Choices • Guideline implementation • Terminologic knowledge
Terminology Representation Choices • Frame-based
Frame-Based Representation
Serum Glucose Test
    is-a: Lab Test
    Measures: Glucose
    Specimen: Serum
    Units: "mg/dl"
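A frame can be approximated as a dictionary of slots, with the is-a slot naming a parent frame from which missing slots are inherited. The following is a minimal Python sketch under that assumption; the `Kind` slot on Lab Test is hypothetical and is included only to show inheritance, and `get_slot` is an illustrative helper name.

```python
# Each frame is a dict of slots; "is-a" names the parent frame (or None).
FRAMES = {
    "Lab Test": {
        "is-a": None,
        "Kind": "procedure",  # hypothetical slot, not on the slide
    },
    "Serum Glucose Test": {
        "is-a": "Lab Test",
        "Measures": "Glucose",
        "Specimen": "Serum",
        "Units": "mg/dl",
    },
}


def get_slot(frame_name, slot):
    """Return a slot value, walking up is-a links if the frame lacks it."""
    while frame_name is not None:
        frame = FRAMES[frame_name]
        if slot in frame:
            return frame[slot]
        frame_name = frame.get("is-a")
    return None


if __name__ == "__main__":
    print(get_slot("Serum Glucose Test", "Units"))
    print(get_slot("Serum Glucose Test", "Kind"))
```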
Terminology Representation Choices • Frame-based • Semantic network
Semantic Network Representation (figure) • Serum Glucose Test is-a Lab Test • Serum Glucose Test measures Glucose • Serum Glucose Test specimen Serum • Glucose is-a Chemical • Serum is-a Body Substance
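One way to hold such a network in code is as subject-relation-object triples, so that the values of a frame's slots become nodes that can carry links of their own. The sketch below is illustrative Python; the edge list is a reading of the figure's labels (three is-a links plus measures and specimen), and the helper names `objects` and `measured_classes` are assumptions.

```python
# Edges as (subject, relation, object); an assumed reading of the figure.
TRIPLES = [
    ("Serum Glucose Test", "is-a", "Lab Test"),
    ("Glucose", "is-a", "Chemical"),
    ("Serum", "is-a", "Body Substance"),
    ("Serum Glucose Test", "measures", "Glucose"),
    ("Serum Glucose Test", "specimen", "Serum"),
]


def objects(subject, relation):
    """All nodes reached from subject over the given relation."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def measured_classes(test):
    """Classes of whatever the test measures, following one is-a step."""
    return [
        cls
        for substance in objects(test, "measures")
        for cls in objects(substance, "is-a")
    ]


if __name__ == "__main__":
    print(measured_classes("Serum Glucose Test"))
```

Unlike the frame, the network lets a query continue past a slot value, here from the test to the substance it measures and on to that substance's class.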
Terminology Representation Choices • Frame-based • Semantic network • Conceptual graphs
Conceptual Graph Representation
[Serum Glucose Test] -
    (is-a) -> [Lab Test]
    (measures) -> [Glucose]
    (specimen) -> [Serum]
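The bracket-and-arrow notation is a linear rendering of a head concept and its outgoing relations. A small Python sketch of that rendering is given here for illustration; the function name `linear_form` and the four-space indent are assumptions, and the sketch covers only a single head concept with one level of relations.

```python
def linear_form(head, relations):
    """Render a head concept and its (relation, concept) pairs in
    bracket-and-arrow notation, one relation per line."""
    lines = [f"[{head}] -"]
    for relation, concept in relations:
        lines.append(f"    ({relation}) -> [{concept}]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(linear_form(
        "Serum Glucose Test",
        [("is-a", "Lab Test"), ("measures", "Glucose"), ("specimen", "Serum")],
    ))
```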
Knowledge Representation Choices • Guideline implementation • Terminologic knowledge
Knowledge Representation • Terminology for representing symbols • Format for arranging the symbols • Terminology and format for representing terminologic knowledge
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991
Jochen Bernauer, SCAMC, 1991 • Conceptual graphs to model findings • Example graph (figure): concepts increased_uptake, femur, right, bone_phase; relations site, site_attr, during
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993
Rector, Nolan and Glowinski, SCAMC, 1993 • GALEN project • conditions grammatically haveLocation bodyparts • fractures sensibly haveLocation bones • femurs sensiblyAndNecessarily haveDivision neck
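The three statements act as sanctions: a composed expression is allowed only if some statement about the concepts involved, or about their ancestors, covers it. The Python sketch below is a simplified reading of that idea and not the GALEN implementation. Only the three sanction triples come from the slide; the is-a links (fractures under conditions, femurs under bones, bones under bodyparts) and the function names are assumptions.

```python
# Assumed is-a links (child -> parent); not stated on the slide.
IS_A = {"fractures": "conditions", "femurs": "bones", "bones": "bodyparts"}

# (subject, relation, object) -> sanction level, as listed on the slide.
SANCTIONS = {
    ("conditions", "haveLocation", "bodyparts"): "grammatically",
    ("fractures", "haveLocation", "bones"): "sensibly",
    ("femurs", "haveDivision", "neck"): "sensiblyAndNecessarily",
}


def ancestors(concept):
    """Yield the concept itself, then each successive parent."""
    while concept is not None:
        yield concept
        concept = IS_A.get(concept)


def sanction(subject, relation, obj):
    """Return the first sanction found while walking up from the subject
    and the object, or None if no sanction covers the statement."""
    for s in ancestors(subject):
        for o in ancestors(obj):
            level = SANCTIONS.get((s, relation, o))
            if level is not None:
                return level
    return None


if __name__ == "__main__":
    print(sanction("fractures", "haveLocation", "femurs"))
```

Under these assumptions, a fracture located in a femur is covered by the sensible sanction on bones, while a generic condition located in a femur is covered only at the grammatical level.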
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993
Campbell and Musen, SCAMC, 1993 • Conceptual graphs and SNOMED • Pain + Chest + Radiation to + Left + Arm
[Pain] -
    (located in) -> [Chest]
    (radiating to) -> [Arm] -> (with laterality) -> [Left]
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993
Lindberg, Humphreys, McCray, Methods 1993 • Unified Medical Language System • Figure: a Concept links to Lexical groups, and each Lexical group links to Strings
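The concept, lexical group, and string levels can be pictured as a grouping step: strings that differ only in case, punctuation, or word order fall into one lexical group, and several groups can name one concept. The Python sketch below illustrates this with a deliberately crude normalization; it is not the UMLS algorithm, and the sample strings and function names are illustrative assumptions.

```python
import re
from collections import defaultdict


def normalize(string):
    """Crude lexical normalization: lowercase, drop punctuation, sort words."""
    words = re.findall(r"[a-z0-9]+", string.lower())
    return " ".join(sorted(words))


def lexical_groups(strings):
    """Group strings whose normalized forms are identical."""
    groups = defaultdict(list)
    for s in strings:
        groups[normalize(s)].append(s)
    return dict(groups)


if __name__ == "__main__":
    # Illustrative strings assumed to name a single concept.
    strings = [
        "Atrial Fibrillation",
        "atrial fibrillation",
        "Fibrillation, Atrial",
        "Auricular Fibrillation",
    ]
    for key, members in lexical_groups(strings).items():
        print(key, "->", members)
```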
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994
Rocha, Huff, et al., CBM, 1994 • VOSER • A server architecture for managing terminologic knowledge
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996
Campbell, Cohn, Chute, et al., SCAMC 1996 • Convergent Medical Terminology • SNOMED/Kaiser/Mayo • Galapagos
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997
Brown, O’Neil and Price, Methods, 1997 • Read Codes • Representation with GALEN model
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997 • Spackman, Campbell, and Côte, SCAMC 1997
Spackman, Campbell, and Côte, SCAMC 1997 • SNOMED RT (Reference Terminology) • Convergent Medical Terminology • Description Logic Format
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997 • Spackman, Campbell, and Côte, SCAMC 1997 • Huff, Rocha, McDonald, et al., JAMIA 1998
Huff, Rocha, McDonald, et al., JAMIA 1998 • Logical Observation Identifiers Names and Codes (LOINC)
4764-5 | GLUCOSE^3H POST 100 G GLUCOSE PO | SCNC | PT | SER/PLAS | QN |
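A LOINC name is a code followed by six parts: component, property, time aspect, system, scale, and method. The pipe-delimited row on the slide can be split into those parts as in the Python sketch below; the dictionary keys and the function name `parse_loinc_row` are assumptions, and the empty final field is read as an unspecified method.

```python
# Code plus the six LOINC name parts, in order.
FIELDS = ("code", "component", "property", "timing", "system", "scale", "method")


def parse_loinc_row(row):
    """Split a pipe-delimited LOINC row into named fields."""
    parts = [p.strip() for p in row.split("|")]
    # Pad so that a missing trailing method field becomes an empty string.
    parts += [""] * (len(FIELDS) - len(parts))
    return dict(zip(FIELDS, parts))


if __name__ == "__main__":
    row = "4764-5 | GLUCOSE^3H POST 100 G GLUCOSE PO | SCNC | PT | SER/PLAS | QN|"
    for field, value in parse_loinc_row(row).items():
        print(f"{field}: {value!r}")
```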
Knowledge-Based Terminology Efforts • Jochen Bernauer, SCAMC, 1991 • Rector, Nolan and Glowinski, SCAMC, 1993 • Campbell and Musen, SCAMC, 1993 • Lindberg, Humphreys, McCray, Methods 1993 • Rocha, Huff, et al., CBM, 1994 • Campbell, Cohn, Chute, et al., SCAMC 1996 • Brown, O’Neil and Price, Methods, 1997 • Spackman, Campbell, and Côte, SCAMC 1997 • Huff, Rocha, McDonald, et al., JAMIA 1998 • Pharmacy system knowledge base vendors
Pharmacy System Knowledge Base Vendors • Figure: is-a hierarchy over the concepts Drug Class, Ingredient Class, Ingredient, Not-Fully-Specified Drug, Clinical Drug, Composite Clinical Drug, Trademark Drug, Composite Trademark Drug, Manufactured Components, Country-Specific Packaged Product, International Package Identifiers
Medical Entities Dictionary (MED) • New York Presbyterian Hospital • 60,000 concepts (procedures, results, drugs, problems) • 208,242 synonyms • 84,677 hierarchical links • 113,906 semantic links • 238,040 other attributes • 66,404 translations (ICD-9-CM, LOINC, MeSH, UMLS)
MED Data Structures • Semantic network
MED Semantic Network (figure) • Concepts: Medical Entity, Substance, Chemical, Carbohydrate, Glucose, Bioactive Substance, Anatomic Substance, Plasma, Laboratory Specimen, Plasma Specimen, Event, Diagnostic Procedure, Laboratory Procedure, CHEM-7, Laboratory Test, Plasma Glucose • Semantic links: Substance Sampled, Has Specimen, Part of, Substance Measured