An Approach to Using Controlled-Vocabularies in Clinical Information Systems Jeff Wilcke, DVM, MSc, DACVCP and Art Smith, M.S.
Lists of words… • Nomenclature • The system or set of names for things, etc., commonly employed by a person or community (Petchamp, SNVDO, SNOMED) • Vocabulary • A collection or list of words with explanations of their meanings (SNOMED) • Classification • The result of classifying; a systematic distribution, allocation, or arrangement, in a class or classes; esp. of things which form the subject-matter of a science or of a methodic inquiry. (SNOMED)
What do we need? • Nomenclature ONLY • Provides a simple list for data entry • Vocabulary / Classification • We can be CERTAIN that the “term” (description in SNOMED) means what we think it means. • We can develop rules that allow us to combine concepts to express ideas more complicated than those contained in the nomenclature. • We can use the knowledge base supported by the vocabulary/classification to search, retrieve and analyze our data.
Vocabulary Suitability • Adequate Content • Multiple granularities • Functional Subsetting • Rich Semantic Structure
Adequate Content • Lower boundary? • Values for all patient-care context(s) in the medical record system. • Values to allow for patient and patient-care specific specializations. • left, severe, chronic, etc. • Upper boundary? • Adequate content IS NOT the same thing as “any conceivable medical utterance”. • Some content belongs to specialized vocabularies • Pharmacy (e.g., specific brand name items)
Multiple granularities • Granularities appropriate for various patient care settings. • Problem list • Fractured femur • Surgery report summary • Closed spiral fracture of the midshaft of the femur
Functional Sub-setting • We only need PORTIONS of SNOMED for any one part of a Clinical Information System (CIS) • We need DIFFERENT portions of SNOMED for different parts of CIS. • We must be able to use ALL of SNOMED to search, retrieve, analyze data produced using sub-sets.
Medical Record Semantics • CIS Data Structure • Meaning of fields carried in the data dictionary for the CIS • Meta-semantics • Internal Vocabulary Semantics • Instance-semantics • Structurally identical to Meta-semantics • Some attributes from the Vocabulary • Some attributes used ONLY in instance semantics
CIS Data Structure • Simple: “Write everything about the patient in the box.” • Complex: “Each field must have at least one entry before you proceed to the next screen.” • Fields: Body System, Organ, Tissue, Path Process, Etiology, Vector, Episodic Nature, Duration, Severity
Find a Balance • Structured fields: Body System, Organ, Tissue, Path Process, Etiology, Vector, Episodic Nature, Duration, Severity • Free text: “Tell about Patient here.”
Middle Ground • CIS “Content”: Problem list, Rule out, Final diagnosis, Treatments, Surgical Procedures, Diagnostic Procedures • Vocabulary “Content”: Body system, Morphology, Etiology, Approach, Instrument, Generic drug name
Finding Middle Ground • Option A – Single “findings” field • “Problem diabetes mellitus” • “Final diagnosis diabetes mellitus” • “Tentative diagnosis diabetes mellitus” • “Rule-out diabetes mellitus” Option A puts all the complexity of nomenclature management at the interface level.
Finding Middle Ground • Option B – Multiple “findings” fields • Problem list field – value = “diabetes mellitus” • Final diagnosis field – value = “diabetes mellitus” • Rule-out field – value = “diabetes mellitus” Option B shifts complexity to the medical record structure. Messages re-introduce nomenclature complexity.
SNOMED RT Definition “Closed fracture of shaft of femur (disorder) ”
Meta-semantics • The built-in semantic structure of the nomenclature system itself. • Object – attribute – value triples • Defined and sanctioned attributes • Associated Morphology • Associated Topography • Associated Etiology • Allowed value sets
Defined Attribute(s) • “ASSOC-ETIOLOGY names the direct causative agent (organism, toxin, force) of a disease or disorder. It does not include vectors (such as the mosquito that transmits malaria). It also does not include method or mechanism by which the etiology is introduced to the body”.
Instance semantics • Instance-semantics are used to express a particular occurrence of a concept by allowing the addition of details. • Object – attribute – value triples • Instance attributes • Has severity • Has laterality • Has duration
Defined Attribute(s) • “HAS LATERALITY* names the specific organ when that organ exists as left and right pairs (such as left and right femur)” *NOT sanctioned at this time. Meta-semantics will require this attribute.
Medical Record “Instance” “Closed fracture of shaft of the LEFT femur” Object – Attribute – Value Closed fracture of shaft of femur (disorder) Has laterality – Left
Choosing the correct object Closed fracture of shaft of femur (disorder) Has laterality – Left It is NOT a left FRACTURE It is NOT a left SHAFT It IS a left femur
Choosing the correct object Refineability (SNOMED) • It is NOT a left fracture • Fracture of shaft of femur is not refineable by laterality, but has associated topography shaft of femur. • It is NOT a left shaft • Shaft of femur is not refineable by laterality, but “is a” femur structure. • It IS a left femur • Femur is refineable.
Refineability (SNOMED) • The instance semantics need not include the femur itself to establish laterality, but must be processed against the meta-semantics. • Rule (sic) – “When laterality is processed against a finding, assume that it is assigned to the topography. If the first occurrence of topography is not refineable by laterality, find a parent that is refineable.”
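The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three lookup tables are hypothetical stand-ins for the meta-semantics, filled with the femur example from the preceding slides, and the function name is invented for this sketch.

```python
# Hypothetical miniature of the meta-semantics (not real SNOMED tables).
ASSOCIATED_TOPOGRAPHY = {"Closed fracture of shaft of femur": "Shaft of femur"}
PARENT = {"Shaft of femur": "Femur"}      # shaft of femur "is a" femur structure
REFINEABLE_BY_LATERALITY = {"Femur"}      # femur is refineable; shaft is not


def laterality_target(finding):
    """Return the concept that laterality applies to, or None if none is found."""
    # Start from the finding's topography, as the rule says.
    concept = ASSOCIATED_TOPOGRAPHY.get(finding)
    # Walk up through parents until a concept refineable by laterality turns up.
    while concept is not None and concept not in REFINEABLE_BY_LATERALITY:
        concept = PARENT.get(concept)
    return concept


target = laterality_target("Closed fracture of shaft of femur")
```

With this toy data, "left" ends up attached to the femur rather than to the fracture or the shaft.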
SNOMED Structure • Concepts are linked to other concepts by specific named Relationships (which are also concepts in SNOMED). • The full linkage of associated concepts can be staggering -- not just a tree, but rather a complex network of relationships. • Concepts and Relationships are stored in separate relational tables.
SNOMED Structure [Figure: network of concepts surrounding “Closed Fracture of the Shaft of the Femur”] • Concepts shown: Thigh Bone; Traumatic Abnormality; Long Bone; Bone of extremity; Fracture; Bone of lower extremity; Fracture of bone; Injury of thigh; Fracture of lower limb; Femur; Fracture, Closed; Fracture of the Femur; Shaft of Femur; Fracture of the Shaft of Femur; Closed Fracture of the Femur; Closed Fracture of the Shaft of the Femur • Key: Findings concept, ISA connection; Anatomy concept, Associated topography connection; Morphology concept, Associated morphology connection; Part of connection • Complicated! Powerful!
SNOMED Tables (two of them, anyway) • Concepts: Concept ID, SNOMED ID, Status, Fully Specified Name • Relationships: Concept ID 1, Relationship ID, Concept ID 2, Refineability (SNOMED CT only)
SNOMED Tables
Concepts Table
Concept ID   SNOMED ID   Concept Name
71620000     DD-13100    Fracture of Femur (Disorder)
116676008    G-C504      Associated Morphology
72704001     M-12000     Fracture (morphologic abnormality)
Relationship Table
Concept ID   Relationship ID   Concept ID
71620000     116676008         72704001
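A minimal sketch of how the two tables fit together, using only the three concept rows and the single relationship row shown on the slide. The Python structures and the function names are assumptions made for illustration; real SNOMED tables carry more columns.

```python
# Concepts table: Concept ID -> (SNOMED ID, Concept Name), rows copied from the slide.
CONCEPTS = {
    71620000: ("DD-13100", "Fracture of Femur (Disorder)"),
    116676008: ("G-C504", "Associated Morphology"),
    72704001: ("M-12000", "Fracture (morphologic abnormality)"),
}

# Relationship table: (Concept ID 1, Relationship ID, Concept ID 2).
RELATIONSHIPS = [(71620000, 116676008, 72704001)]


def values_of(concept_id, relationship_id):
    """Concept IDs linked to concept_id through the given relationship."""
    return [c2 for c1, rel, c2 in RELATIONSHIPS
            if c1 == concept_id and rel == relationship_id]


def readable(relationship):
    """Render one relationship row as an object | attribute | value triple."""
    return " | ".join(CONCEPTS[concept_id][1] for concept_id in relationship)
```

Because the relationship itself is a concept, all three IDs in a row resolve through the same concepts table.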
Instance Semantics Structure • Must follow the same pattern as the SNOMED structure to allow seamless searching. • The table structure used to represent the nomenclature should also be used to represent controlled vocabulary entries. • There are some differences, though….
Differences in Semantics • Meta Semantics • Each concept appears just once (abstract concepts). • Linkage can be, and usually is, a complex network. • Abstract (defining) relationships (e.g., IS-A, Part-of, Associated Topography). • Instance Semantics • Concepts may occur more than once (concrete instances). • Linkage is a simple tree structure (modeling a noun phrase). • Concrete (qualifying) relationships (e.g., Has laterality, Has severity, Associated-topography).
Consolidation of caudal and middle lobes of the right lung (Tree Structure)
Instance Semantics
D2-50020 Consolidation of lung
  G-C505 Associated topography: T-28A20 Caudal lobe of lung
    G-C220 Has laterality: G-A100 Right
  G-C505 Associated topography: T-28A20 Middle lobe of lung
    G-C220 Has laterality: G-A100 Right
Meta Semantics
T-28000 Lung (Refineable – left, right, both)
T-28A20 Caudal lobe of lung (Not refineable)
T-28A20 Middle lobe of lung (Not refineable)
Three Sources of Information • The field in the system: • Data entered under the “Discharge diagnosis” field is semantically different than the same data in the “Rule-out list” field. • The data entered in that field: • Either a single SNOMED concept or a phrase constructed using explicit instance semantics. • The nomenclature system: • The related SNOMED concepts determined by the implicit meta-semantics.
Searching the Table Using Semantics • When searching the table, automatically expand all concepts to include their IS-A descendants (children, grandchildren, etc.). • A match on any of those is considered a match on the parent. • Consider an example: “Find all diagnoses of lung disease occurring in the caudal lobe”
“Find all diagnoses of lung diseases occurring in the caudal lobe.” • Search for a diagnosis field entry with both D2-50000 (Disease of Lung) AND T-28A20 (Caudal Lobe of Lung) in the Value column. • Expand with IS-A descendants • D2-50000 has 41 IS-A children including D2-61010 (Abscess of Lung). Many of these children have IS-A children which are also included. • T-28A20 has no IS-A children. • Search for: (D2-50000 or D2-61010 or…) and T-28A20
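The expansion step might be sketched as follows. The IS-A table here is a toy: only the D2-50000 to D2-61010 link comes from the slide (the other 40 children are omitted), and the function names and the representation of an entry as a set of Value-column codes are assumptions.

```python
from collections import deque

# Toy IS-A table (parent -> children); T-28A20 has no IS-A children.
IS_A_CHILDREN = {"D2-50000": ["D2-61010"]}


def descendants(code, children):
    """The code itself plus all of its IS-A descendants (breadth-first)."""
    seen = {code}
    queue = deque([code])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def matches(entry_values, criteria, children):
    """True when every criterion, or one of its descendants, is among the entry's values."""
    return all(entry_values & descendants(c, children) for c in criteria)


criteria = ["D2-50000", "T-28A20"]
abscess_in_caudal_lobe = {"D2-61010", "T-28A20", "G-A100"}
found = matches(abscess_in_caudal_lobe, criteria, IS_A_CHILDREN)
```

An entry for an abscess of the caudal lobe is found even though the query named only the general concept, Disease of Lung.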
“Tight” vs. “Loose” Searches • Tight Search: • Target known to match search criteria. Find all diagnoses of known lung disease that are known to occur in the caudal lobe of the lung. • Loose Search: • Target might match search criteria. Find all diagnoses that may be lung disease that may be located in the caudal lobe of the lung. • Generally we want a “tight” search. • Looking at our example search…
“Tight” vs. “Loose” Searches • Consider a diagnosis of simply D2-61010 Abscess of Lung • A “Tight” search would not find this diagnosis. • It does not match the criteria or the IS-A descendants of the criteria (i.e., not known to be caudal lobe). • A “Loose” search would find this diagnosis. • It could match the criteria or the IS-A descendants of the criteria (i.e., it might be in the caudal lobe). • “Tight” searches match criteria and their IS-A descendants. • “Loose” searches match criteria, their IS-A descendants AND their IS-A ancestors.
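The difference between the two modes can be sketched as a single flag. Two assumptions are baked into this toy and are not stated on the slides: that the bare diagnosis is expanded with a defining topography of T-28000 (Lung) taken from the meta-semantics, and that T-28000 counts as an ancestor of T-28A20 for search purposes. All names are hypothetical.

```python
# Toy hierarchy; the T-28000 / T-28A20 link is an assumption for illustration.
CHILDREN = {"D2-50000": ["D2-61010"], "T-28000": ["T-28A20"]}
PARENTS = {"D2-61010": ["D2-50000"], "T-28A20": ["T-28000"]}


def closure(code, links):
    """The code itself plus everything reachable from it by following links."""
    seen, stack = {code}, [code]
    while stack:
        for nxt in links.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def matches(values, criteria, loose=False):
    """Tight: criteria and descendants. Loose: also accept ancestors."""
    for criterion in criteria:
        accepted = closure(criterion, CHILDREN)
        if loose:
            accepted |= closure(criterion, PARENTS)
        if not (values & accepted):
            return False
    return True


criteria = ["D2-50000", "T-28A20"]
# "Abscess of Lung" with no lobe recorded; T-28000 is its assumed defining topography.
bare_abscess = {"D2-61010", "T-28000"}
tight_result = matches(bare_abscess, criteria)
loose_result = matches(bare_abscess, criteria, loose=True)
```

The tight search rejects the bare abscess because the lobe is not known; the loose search accepts it because the lung might include the caudal lobe.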
Pre-coordinated vs. Post-coordinated Concepts • Pre-coordinated concept: DD-13152 = Closed fracture of shaft of femur • Post-coordinated concept phrase: DD-13100 = Fracture of femur • Associated morphology: Fracture, Closed • Associated topography: Shaft of Femur
Pre-coordinated vs. Post-coordinated Concepts
Pre-coordinated (meta only): Closed fracture of shaft of femur
  Is a: Fracture of shaft of femur
  Is a: Closed fracture of femur
  Associated morphology: Fracture, closed
  Associated topography: Shaft of femur
Post-coordinated (meta + instance): Fracture of femur
  Is a: Fracture of lower limb
  Is a: Injury of thigh
  Associated morphology: Fracture
  Associated topography: Femur
  Associated morphology: Fracture, closed
  Associated topography: Shaft of femur
Are these computational equivalents?
Instance Template
Main Concept
  Attribute (1): Value (1)
    Attribute (1.1): Value (1.1)
    Attribute (1.n): Value (1.n)
  Attribute (2): Value (2)
    Attribute (2.1): Value (2.1)
  Attribute (n): Value (n)
Coding an Instance (why?) • We need a compact yet unambiguous format for representing an instance structure. • Transmission of records • Storage of record data for presentation • NOT for searches or statistical reports. • Two forms – verbose and terse • Terse: concepts, attributes & values are just codes. • T-12710 • Verbose: concepts, attributes & values contain English • T-12710[Femur]
Coding an Instance (how?) • Concept • Concept ( attribute : value ) • Concept ( attribute1 : value1 ; attribute2 : value2 ) • Concept ( attribute1 : value1 ( attribute1.1 : value1.1 ) ; attribute2 : value2 ) • Concept ( attribute1 : value1 ( attribute1.1 : value1.1 ; attribute1.2 : value1.2 ) )
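The patterns above are recursive, so an encoder can be quite short. The following is a sketch rather than the authors' implementation: the `Concept` class and `encode` function are hypothetical names, and the exact whitespace is a choice made here (the slides are not consistent about spacing).

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    code: str
    name: str = ""
    # Each modifier is an (attribute Concept, value Concept) pair.
    modifiers: list = field(default_factory=list)


def encode(concept, verbose=False):
    """Serialize an instance tree in the terse form, or verbose if requested."""
    def label(c):
        return f"{c.code}[{c.name}]" if verbose else c.code

    text = label(concept)
    if concept.modifiers:
        parts = [f"{label(attribute)}:{encode(value, verbose)}"
                 for attribute, value in concept.modifiers]
        text += " (" + ";".join(parts) + ")"
    return text


fracture = Concept("DD-13152", "Closed fracture of shaft of femur", [
    (Concept("G-C504", "ASSOCIATED-MORPHOLOGY"), Concept("M-12030", "Fracture, spiral")),
    (Concept("G-C220", "HAS-LATERALITY"), Concept("G-A101", "Left")),
])
terse = encode(fracture)
verbose = encode(fracture, verbose=True)
```

Values are encoded with the same function as the main concept, which is what allows modifiers to nest to any depth.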
Storing an instance (why?) • We need a way to store the data in a relational database that will facilitate searches and statistics. • Must allow for representation of structure. • Must fit the relational model (tables). • Must allow easy searching for concepts. • Must be efficient. • Simple concepts take little space. • More complicated instances take more space.
Storing an instance: Five columns in Table • Key – unique identifier for the row. • Entry – unique identifier for the instance. • Single concept instance is one row. • Complicated instances take multiple rows. • Parent – for attribute/value modifiers. • Key for concept that they modify. • Empty for main concept. • Attribute • Shows attribute (relationship type) for attribute/value modifiers. • Empty for main concept. • Value • Shows main concept or value of an attribute/value modifier. • Searchable field (indexed). Match pulls all rows with same Entry.
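One way to picture the five columns is to flatten an instance tree into rows. This is a sketch under assumptions: the nested-tuple tree format, the function name, and the sequential integer keys are all invented here; the codes come from the lung abscess example in this deck.

```python
from itertools import count


def to_rows(entry_id, tree, keys=None, parent=None, attribute=None):
    """Flatten (value, [(attribute, subtree), ...]) into
    (Key, Entry, Parent, Attribute, Value) rows."""
    keys = keys if keys is not None else count(1)
    value, modifiers = tree
    key = next(keys)
    # Parent and Attribute stay empty (None) for the main concept.
    rows = [(key, entry_id, parent, attribute, value)]
    for attr, subtree in modifiers:
        # Each modifier row points back at the key of the row it modifies.
        rows.extend(to_rows(entry_id, subtree, keys, key, attr))
    return rows


abscess = ("D2-61010", [
    ("G-C505", ("T-28A20", [("G-C220", ("G-A100", []))])),
    ("G-C503", ("L-22803", [])),
])
rows = to_rows(1, abscess)
```

A search on the Value column that hits any of these rows can then pull the whole instance back by its shared Entry value.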
Example 1 • Medical Record Statement = Closed spiral fracture of shaft of the left femur • Most specific SNOMED Leaf = Closed fracture of shaft of femur • Missing additional modifiers = Fracture, spiral; Left (femur)
Closed spiral fracture of shaft of the left femur (Tree Structure)
DD-13152 Closed fracture of shaft of femur
  G-C504 Associated morphology: M-12030 Fracture, spiral
  G-C220 Has laterality: G-A101 Left
Closed spiral fracture of shaft of the left femur (Coding) • Verbose: DD-13152[Closed fracture of shaft of femur] (G-C504[ASSOCIATED-MORPHOLOGY]: M-12030[Fracture, spiral]; G-C220[HAS-LATERALITY]: G-A101[Left]) • Terse: DD-13152 (G-C504:M-12030;G-C220:G-A101)
Closed spiral fracture of shaft of the left femur (Suggested Storage Form)
DD-13152 Closed fracture of shaft of femur
  G-C504 Associated morphology: M-12130 Fracture, closed, spiral
  G-C220 Has laterality: G-A101 Left
Closed spiral fracture of shaft of the left femur (Computed definition, partial) [Figure; legend: From “root” definition; Assignment computed]
Example 2 • Medical Record Statement = Abscess of caudal lobe of right lung due to Mannheimia haemolytica • Most specific SNOMED Leaf = Abscess of Lung • Missing additional modifiers = Caudal lobe of lung; Right; Mannheimia haemolytica
Abscess of caudal lobe of right lung due to Mannheimia haemolytica (Tree Structure)
D2-61010 Abscess of lung
  G-C505 Associated topography: T-28A20 Caudal lobe of lung
    G-C220 Has laterality: G-A100 Right
  G-C503 Associated etiology: L-22803 Mannheimia haemolytica
Abscess of caudal lobe of right lung due to Mannheimia haemolytica (Coding) • Verbose: D2-61010[Abscess of Lung] (G-C505[ASSOCIATED-TOPOGRAPHY]: T-28A20[Caudal Lobe of lung] (G-C220[HAS-LATERALITY]: G-A100[Right]); G-C503[ASSOCIATED-ETIOLOGY]: L-22803[Mannheimia haemolytica]) • Terse: D2-61010 (G-C505:T-28A20 (G-C220:G-A100); G-C503:L-22803)
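A receiving system would need to turn the terse string back into a tree. A small recursive-descent parser is one way to do that. This is a sketch only: `parse_terse` and the nested-tuple output are hypothetical, it handles the terse form but not the verbose form (whose bracketed names contain spaces and commas), and its error handling is minimal.

```python
import re

# A token is one of the punctuation marks ( ) ; : or a run of other non-space characters.
TOKEN = re.compile(r"\s*([();:]|[^\s();:]+)")


def parse_terse(text):
    """Parse a terse instance string into (code, [(attribute, subtree), ...])."""
    tokens = TOKEN.findall(text)
    pos = 0

    def take(expected=None):
        nonlocal pos
        if pos >= len(tokens) or (expected is not None and tokens[pos] != expected):
            raise ValueError(f"unexpected input at token {pos}")
        pos += 1
        return tokens[pos - 1]

    def concept():
        code = take()
        modifiers = []
        if pos < len(tokens) and tokens[pos] == "(":
            take("(")
            while True:
                attribute = take()
                take(":")
                # A value is itself a concept, so nesting falls out of the recursion.
                modifiers.append((attribute, concept()))
                if pos < len(tokens) and tokens[pos] == ";":
                    take(";")
                else:
                    take(")")
                    break
        return (code, modifiers)

    tree = concept()
    if pos != len(tokens):
        raise ValueError("trailing input after instance")
    return tree


tree = parse_terse("D2-61010 (G-C505:T-28A20 (G-C220:G-A100); G-C503:L-22803)")
```

The laterality modifier ends up nested under the caudal lobe, not under the abscess, mirroring the parentheses in the coded string.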
Abscess of caudal lobe of right lung due to Mannheimia haemolytica (Relational Table)
Attribute                       Value
(empty, main concept)           D2-61010 Abscess of lung
G-C505 Associated topography    T-28A20 Caudal lobe of lung
G-C220 Has laterality           G-A100 Right
G-C503 Associated etiology      L-22803 Mannheimia haemolytica