Language, medical terminologies and structured electronic patient records: how to escape the Bermuda triangle
Dr. W. Ceusters, Dir. R&D, Language & Computing nv
http://www.landc.be
The Medical Informatics dogma
To structure or NOT to be
• Fact: computers can only deal with a structured representation of reality:
  • structured data: relational databases, spreadsheets
  • structured information: XML simulates context
  • structured knowledge: rule-based knowledge systems
• Conclusion: a need for structured data entry
Structured data entry
• Current technical solutions:
  • rigid data entry forms
  • coding and classification systems
• But:
  • the description of biological variability requires the flexibility of natural language, and it is generally desirable not to interfere with the traditional manner of medical recording (Wiederhold, 1980)
  • initiatives to facilitate the entry of narrative data have focused on the control rather than the ease of data entry (Tanghe, 1997)
Drawbacks of structured data entry
• Loss of information
  • qualitatively
    • limited expressiveness of coding and classification systems, controlled vocabularies, and “traditional” medical terminologies
    • use of purpose-oriented systems
      • don’t use data for another purpose than originally foreseen (J VDL)
  • quantitatively
    • too time-consuming to code all information manually
• Speech recognition and structured data entry forms are not best friends
The three pillars of modern M.I.
• Clinical language
  • medical narrative
• Clinical terminologies
  • coding and classification systems
  • nomenclatures
  • formal ontologies
• Electronic Healthcare Record Systems
How to harmonise the pillars?
Domain of discourse: healthcare
Pillars: Language, Terminology, EHCRS
• Comparability of data; cross-border care; decision support; abstraction / grouping; ...
• Faithful data recording; sufficient level of detail; ...
• Individual patient care; seamless care; historical overview; ...
Six possible approaches
[Diagram: grid of dominating entity versus mediating entity, each one of Language, EHCRS, Terminology]
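One way to read the count of six: if each approach picks one dominating pillar and a different mediating pillar, the ordered pairs over three pillars number 3 × 2 = 6. The sketch below enumerates them under that assumption; the reading of the diagram and the name `PILLARS` are illustrative, not taken from the slides.

```python
from itertools import permutations

# The three pillars named in the deck.
PILLARS = ("Language", "Terminology", "EHCRS")

# Assumption: an approach is an ordered pair (dominating, mediating)
# of two different pillars; the third pillar is the remaining one.
approaches = list(permutations(PILLARS, 2))

for dominating, mediating in approaches:
    print(f"dominating: {dominating:<12} mediating: {mediating}")

print(len(approaches))  # number of ordered pairs
```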
What do the symbols stand for?
• Text-based EHCRS able to generate structured data
• An EHCR exclusively built around a collection of coded data generated out of free text
• A multimedia EHCRS with clinical narrative registration and structured data generation
• A multimedia EHCRS with structured data entry and text generation
• An EHCR exclusively built around texts generated out of controlled vocabularies
• An EHCR exclusively built around a collection of structured data able to generate text
Main harmonisation principle:
Accept what is offered and offer what you accepted...
Harmony = maximum total surface
[Diagram: grid of dominating entity versus mediating entity (Language, EHCRS, Terminology), with a total surface per approach: 100%, 73%, 43%, 59%, 41%, 83%]
Terminology often hurts!
• by poorly accepting the offerings of the dominant component when mediating
• by offering too little to the component over which it dominates
• The more dominant the position of terminology, the more is lost
Terminology centered approaches
• Controlled vocabulary based systems
  • lists too long or too short
  • difficult to use
• System with “ICPC-syndrome”
  • list too purpose-oriented
  • dramatically inadequate expressivity
• EHCR with predictive data entry
  • still expressivity problem but better integration
What about the UMLS approach?
• Simply using everything in the Metathesaurus does not make a good coding system [W. Hole, 2000]
• “The problems with the Metathesaurus as a single monolithic vocabulary are:
  • There is a wide range of granularity of terms in different vocabularies
  • The Metathesaurus itself has no unifying hierarchy
  • There may be other features of vocabularies that get lost in their "homogenisation" upon being entered into the Metathesaurus.” [W. Hersh, 2000]
Are formal ontologies better?
• “The current implementation of SNOMED-RT does not have the depth of semantics necessary to arrive at comparable data or to algorithmically map to classifications such as ICD-9-CM” [Peter Elkine, 1999]
• A serious limitation of the Galen approach is that specialisation is invariably linked to a conceptual relation [Udo Hahn, 1999]
What is the problem?
• ICD, ICPC, MedDRA, ...:
  • too purpose-oriented
• UMLS:
  • built without a formal ontology
• Galen, SNOMED-RT, ...:
  • built without language (as a medium of communication) in mind
The future is in linguistic ontologies
[Diagram with the following elements: Linguistic Ontology; Formal Domain Ontology; Language A (Lexicon, Grammar); Language B (Lexicon, Grammar); Proprietary Terminologies (ICPC, SNOMED, ICD-xxx, MedDRA, others ...)]
Linguistic ontologies in action
[Diagram: worked example for the expression “Warm water”, relating the words “warm”, “high” and “temperature” to entities such as Material Entity, Water, Warm water, Feature, Temperature, Temperature State and High Temperature State. Column labels: WE, WE-P-Type, WE-P-State, P-P-Type, P-P-State. Numbered links: 1: H-P-T, 2: H-WE-S, 3: H-WE-P-S, 4: H-P-P-S, 5: H-Expr-P-S, 6: H-Real-P-S]
The problem summarised
• natural language is the only medium that is able to communicate clinical information about individual patients without loss of necessary detail;
• structured data repositories are required to make subsequent analyses possible;
• any transformation from free language to coding and classification systems results in information loss that is unacceptable for individual patient care but, on the other hand, is a conditio sine qua non for population-based studies;
• today’s graphical user interfaces can deal reasonably well with pick lists built around controlled vocabularies that fulfil a bridging function from free language towards coding and classification systems, but these are incompatible with speech recognition technology.
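The information loss in the third point can be illustrated with a toy example. The sketch below rolls detailed fracture terms up to the nearest term that carries a code and then shows that several distinct terms collapse onto one code, so the original detail cannot be recovered from the code alone. The hierarchy is a simplified illustration, and the codes `C-RAD` and `C-FRX` and the helper names are hypothetical; none of this is taken from a real coding system.

```python
# Toy is-a hierarchy: term -> list of parent terms (illustrative only).
PARENTS = {
    "Colles' fracture": ["radius fracture"],
    "Smith's fracture": ["radius fracture"],
    "proximal radial shaft fracture": ["radius fracture", "shaft fracture"],
    "radius fracture": ["fracture of arm", "fracture of long bone"],
    "shaft fracture": ["fracture"],
    "fracture of arm": ["fracture"],
    "fracture of long bone": ["fracture"],
    "fracture": [],
}

# Hypothetical coarse classification: only two terms carry a code.
CODES = {"radius fracture": "C-RAD", "fracture": "C-FRX"}


def ancestors(term):
    """Return the term followed by its ancestors, nearest first (breadth-first)."""
    seen, queue = [], [term]
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(PARENTS[current])
    return seen


def encode(term):
    """Map a term to the code of its nearest coded ancestor (or itself)."""
    for candidate in ancestors(term):
        if candidate in CODES:
            return CODES[candidate]
    raise KeyError(term)


def decode(code):
    """Return every term that collapses onto the given code."""
    return sorted(t for t in PARENTS if encode(t) == code)


print(encode("Colles' fracture"))   # same code as Smith's fracture
print(encode("Smith's fracture"))
print(decode("C-RAD"))              # several candidate terms: detail is lost
```

Grouping of this kind is what population-based studies need, while the one-to-many `decode` step shows why the code alone is not enough for individual patient care.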
Towards an adequate solution
[Diagram. Components shown: GUI, SPEECH, NLU, SYSTEM I/O Interface, SYSTEM Knowledge base, SYSTEM Information store. Example terms shown: fracture, shaft fracture, fracture of arm, fracture of long bone, radius fracture, proximal radial shaft fracture, Pouteau fracture, Colles’ fracture, Smith’s fracture]
Questions?