Alan Rector & Luigi Iannone with thanks to Robert Stevens BioHealth Informatics Group School of Computer Science & Northwest Institute of BioHealth Informatics University of Manchester, Manchester M13 9PL rector@cs.manchester.ac.uk. http://dx.doi.org/10.1016/j.jbi.2011.10.002.
Lexically Suggest, Logically Define: QA of Qualifiers & Expected Results of Post-Coordination in SNOMED CT
Pre-coordination and post-coordination
• Pre-coordination
  • SNOMED authors define “Acute bronchitis”
  • Classifier creates the correct hierarchy
  • Clinical user enters “Acute bronchitis” (or its code)
• Post-coordination
  • Clinical user enters “Bronchitis” + “Acute”
  • Classifier finds any equivalent term or places the expression in the right place in the hierarchy
  • Concept does not need to exist beforehand, e.g. a user might define “Acute” + “Bronchitis” + “Right main stem bronchus”
  • The expression would still be in the correct place in the hierarchy even if no term exists
Can SNOMED post-coordination work? Do SNOMED authors pre-coordinate consistently?
• Two related questions
  • Are SNOMED qualified expressions expressed consistently?
  • If SNOMED authors don’t do it consistently, can anyone else?
• Proxies: in either case
  • The definitions should allow the description logic classifier to organize the hierarchies correctly
  • This includes determining equivalence between pre- and post-coordinated forms
  • Necessary but not sufficient for post-coordination to work
• For post-coordination, there must be well-defined, consistent patterns that users & software developers understand
First try
• Take a simple case: “acute” and “chronic”
• Look at the pattern SNOMED uses to define Acute disease and Chronic disease
• Follow Campbell, Tuttle, & Spackman and see how many diseases named “Acute …” or “Chronic …” are retrieved under the pattern
Definition of acute & chronic
• Chronic disease == Disease & (RoleGroup some (Clinical course some Chronic))
• Broaden to:
  • Chronic finding == Clinical finding & (RoleGroup some (Clinical course some Chronic))
• … similarly for Acute
  • Fully specified name: “Sudden onset AND/OR short duration”
Write a script to check for candidates in OPPL2
• Requires
  • Lexical match
  • Description logic/OWL semantics -- open world, negation as provably false (DL reasoner)
  • Query semantics -- closed world, negation as failure over concepts in corpus
  • Procedural semantics – add things to ontology
• The script, with the semantics each clause relies on:
    ?C:CLASS=MATCH("'Chronic.*")                                    [Lexical]
    SELECT ?C SubClassOf 'Clinical finding (finding)'                [DL semantics]
    WHERE FAIL ?C SubClassOf 'Chronic clinical finding (finding)'    [Query semantics]
    BEGIN ADD ?C SubClassOf Candidate                                [Procedural]
    END;
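The three non-procedural steps of the script can be mimicked outside OPPL2. The following is a minimal Python sketch, not the authors' tooling: it assumes the hierarchy has already been classified and is available as a child-to-parents map. The toy concepts in `PARENTS` and the helper names `ancestors` and `candidates` are hypothetical.

```python
import re

# Hypothetical, already-classified toy hierarchy: concept -> direct parents.
PARENTS = {
    "Clinical finding": set(),
    "Chronic clinical finding": {"Clinical finding"},
    "Chronic bronchitis": {"Chronic clinical finding"},
    "Chronic duodenal ulcer": {"Clinical finding"},
    "Acute bronchitis": {"Clinical finding"},
    "Chronic (qualifier value)": set(),
}

def ancestors(concept, parents=PARENTS):
    """Return all strict ancestors of a concept (the inferred SubClassOf closure)."""
    seen = set()
    stack = list(parents[concept])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents[parent])
    return seen

def candidates(pattern, root, target, parents=PARENTS):
    """Names matching `pattern` that are under `root` but not under `target`."""
    regex = re.compile(pattern)
    found = []
    for concept in parents:
        if not regex.match(concept):          # lexical match
            continue
        anc = ancestors(concept, parents)
        if root not in anc:                   # must be a kind of `root`
            continue
        if concept == target or target in anc:
            continue                          # closed world: absence of the link counts as failure
        found.append(concept)
    return sorted(found)

found = candidates(r"Chronic.*", "Clinical finding", "Chronic clinical finding")
print(found)  # ['Chronic duodenal ulcer']
```

The closed-world step is the important design choice: a DL reasoner alone cannot report "not a subclass" for a concept that is merely unclassified, so the check has to be a query over the concepts actually present.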
Next, classify candidates; only top-level ones need be examined
• If a concept’s definition is changed, the change will be inherited by all descendants
• What did we find?
  • 25%-30% of all lexical matches were “Candidate” errors, but there were cases where “Acute” and “Chronic” clearly can no longer be taken literally
    • Chronic and acute leukemias and myeloproliferative disorders
    • So exclude them from candidates
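Restricting review to top-level candidates amounts to dropping every candidate that has another candidate among its ancestors. A small sketch, assuming the classified hierarchy is a child-to-parents map; the concept names and the `top_level` helper are hypothetical.

```python
# Hypothetical classified hierarchy: concept -> direct parents.
HIERARCHY = {
    "Finding": set(),
    "Chronic gastric ulcer": {"Finding"},
    "Chronic gastric ulcer with perforation": {"Chronic gastric ulcer"},
    "Chronic sinusitis": {"Finding"},
}

def strict_ancestors(concept, parents):
    """All ancestors of a concept, excluding the concept itself."""
    seen = set()
    stack = list(parents[concept])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents[parent])
    return seen

def top_level(candidates, parents):
    """Keep only candidates that have no other candidate as an ancestor."""
    cand = set(candidates)
    return sorted(c for c in cand if not (strict_ancestors(c, parents) & cand))

all_candidates = [
    "Chronic gastric ulcer",
    "Chronic gastric ulcer with perforation",
    "Chronic sinusitis",
]
to_review = top_level(all_candidates, HIERARCHY)
print(to_review)  # ['Chronic gastric ulcer', 'Chronic sinusitis']
```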
Then the remaining candidates are not classified as Chronic findings:
• Why?
  • Systematic? …or…
  • Accidental?
Look at definitions
• Systematic
  • Chronic duodenal ulcer == Duodenal ulcer disease and (RoleGroup some ((Associated morphology some Chronic ulcer (morphologic abnormality)) and (Finding site some Duodenal structure)))
• Compare with
  • Chronic disease == Disease & (RoleGroup some (Clinical course some Chronic))
• Different qualifiers
  • Associated morphology
  • Clinical course
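Why the classifier does not place Chronic duodenal ulcer under Chronic disease can be seen with a toy structural subsumption check. This is a deliberately simplified sketch of EL-style reasoning over role groups, not SNOROCKET's algorithm; the `ISA` links (including Duodenal ulcer disease being a Disease) and all function names are assumptions made for illustration.

```python
# Hypothetical is-a links among base concepts and qualifier values.
ISA = {
    "Duodenal ulcer disease": {"Disease"},
    "Disease": {"Clinical finding"},
    "Clinical finding": set(),
    "Chronic ulcer": {"Chronic inflammation"},
    "Chronic inflammation": set(),
    "Chronic": set(),
    "Duodenal structure": set(),
}

def is_a(sub, sup):
    """True if `sub` equals `sup` or reaches it through ISA links."""
    if sub == sup:
        return True
    return any(is_a(parent, sup) for parent in ISA.get(sub, ()))

def group_subsumed(d_group, c_group):
    """Each attribute required by c_group appears in d_group with an equal or narrower value."""
    return all(
        attr in d_group and is_a(d_group[attr], value)
        for attr, value in c_group.items()
    )

def subsumed(d, c):
    """True if definition d is subsumed by definition c; each is (base concept, role groups)."""
    d_base, d_groups = d
    c_base, c_groups = c
    return is_a(d_base, c_base) and all(
        any(group_subsumed(dg, cg) for dg in d_groups) for cg in c_groups
    )

chronic_disease = ("Disease", [{"Clinical course": "Chronic"}])
chronic_duodenal_ulcer = (
    "Duodenal ulcer disease",
    [{"Associated morphology": "Chronic ulcer", "Finding site": "Duodenal structure"}],
)
post_coordinated = ("Duodenal ulcer disease", [{"Clinical course": "Chronic"}])

print(subsumed(chronic_duodenal_ulcer, chronic_disease))  # False: no Clinical course qualifier
print(subsumed(post_coordinated, chronic_disease))        # True
```

The morphology-based definition carries no Clinical course qualifier, so nothing in it can satisfy the role group that defines Chronic disease, however "chronic" the name sounds.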
Different qualifiers
• User guide says:
  • Acute & chronic may be morphological
  • Chronic inflammation means mononuclear cell infiltration
  • Acute inflammation means polymorphonuclear cell infiltration
• For ulcers…
  • Chronic ulcer (morphological abnormality) is a kind of Chronic inflammation (morphological abnormality)
• But users must understand
  • Acute and chronic ulcers are defined by Associated morphology,
  • Acute obstruction is defined by Clinical course,
  • Chronic cholecystitis by both!
• Are these the consequences we want?
  • Does this correspond to use in clinical care?
  • Do we have evidence?
  • Should pathology take precedence over clinical observation?
Late discovery: Chronic inflammatory disease is defined as having both qualifiers!
• Chronic inflammatory disease == Chronic disease & RoleGroup some (Associated morphology some Chronic inflammatory morphology) & RoleGroup some (Clinical course some Chronic)
• Means:
  • The classifier will infer Chronic inflammatory disease only if both qualifiers are present
  • Or if the author directly asserts that the concept is a descendant of Chronic inflammatory disease
• To get post-coordination to work you have to use both!
  • Will anyone remember to do so?
  • Obviously not all SNOMED authors,
… but even authors don’t, so many inflammations (…itis) are missed
• Authors have done some directly and not others
  • “Helter skelter” / “Mish mash” modelling
  • Systematic inconsistency
  • What using a description logic is meant to avoid
One solution
• Change the axioms so that any disease with chronic inflammatory morphology has a chronic course
  • Still within SNOMED’s DL EL++/OWL-EL
  • SNOROCKET still classifies it efficiently
• Or vice versa for all inflammatory diseases with chronic course
  • Chronic course & inflammatory morphology → Chronic inflammatory morphology
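One possible reading of the first proposed axiom, written as a general concept inclusion in description logic notation. The concept and role names follow the definitions on the earlier slides; the exact placement of role groups is an assumption, not the authors' stated formulation.

```latex
\[
\mathit{Disease} \sqcap \exists \mathit{RoleGroup}.\bigl(\exists \mathit{AssociatedMorphology}.\mathit{ChronicInflammatoryMorphology}\bigr)
\;\sqsubseteq\;
\exists \mathit{RoleGroup}.\bigl(\exists \mathit{ClinicalCourse}.\mathit{Chronic}\bigr)
\]
```

Both sides use only conjunction and existential restrictions, which is why an axiom of this shape stays within EL++.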
How should the decision be made? How monitored?
• New axiom may or may not be strictly “true”, but…
• What are the consequences?
  • For accuracy of authoring?
  • For accuracy of retrieval?
  • For consistency of setting value sets?
  • For post-coordination?
  • For meaningful use?
• Base decisions on evidence of consequences
  • Evidence-based terminologies / ontologies
• Whatever the decision, need a QA process to enforce and check it
How big is the problem?
• In a “module” based on the UMLS CORE Problem List Subset:
  • 368 total chronic; 450 total acute
  • 103 (28%) chronic / 92 (20%) acute were “candidates”; of these:
    • Due to use of morphology only: 85 (83%) chronic / 78 (85%) acute
    • Due to simple errors and omissions: 18 (17%) chronic / 14 (15%) acute
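The percentages on this slide can be rechecked from the raw counts. In the sketch below the totals, candidate counts and simple-error counts come from the slide; the morphology-only count is derived as candidates minus simple errors, which is an assumption that the two causes partition the candidates (giving 78 for acute).

```python
# Counts taken from the slide.
TOTALS = {"chronic": 368, "acute": 450}
CANDIDATES = {"chronic": 103, "acute": 92}
SIMPLE_ERRORS = {"chronic": 18, "acute": 14}

def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

def breakdown(kind):
    """Recompute the slide's figures for 'chronic' or 'acute'."""
    cand = CANDIDATES[kind]
    errors = SIMPLE_ERRORS[kind]
    morphology_only = cand - errors  # assumption: the two causes partition the candidates
    return {
        "candidate_pct": pct(cand, TOTALS[kind]),
        "morphology_only": morphology_only,
        "morphology_pct": pct(morphology_only, cand),
        "error_pct": pct(errors, cand),
    }

for kind in ("chronic", "acute"):
    print(kind, breakdown(kind))
```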
Other issues (see paper)
• Hierarchy of qualifiers
  • Should Intermittent (course) be a kind of Chronic (course)?
  • What about “intermittent acute pain”?
• Pressure ulcers and decubitus ulcers are all chronic by definition
  • Can there be an acute pressure ulcer?
• Odd anatomy
  • Lower back pain is a kind of Abdominal pain
  • Because the lower back is part of the abdominal wall, which is part of the abdomen
  • (Anatomy under review by SNOMED)
You have to use a classifier
• This work can only be done by using a classifier to find inferences
• Post-coordination depends on the classifier
• To work efficiently, the classifier must be fast
  • For iterative analysis, < 1 min
• SNOROCKET in Protege is very fast and reliable
  • But still works better on modules than on all of SNOMED
Use of “modules” makes this possible
• A “signature” is a subset of the entities in a description logic/OWL KB
• A “module” for a “signature” is a subset of the axioms & entities in the KB such that
  • All inferences amongst entities in the signature can be inferred from the module
• For the UMLS CORE Problem List Subset
  • SNOMED size ~300,000
    • Classification time 2-8 minutes
  • Signature (UMLS CORE Subset) ~8500
  • Module extracted ~35,000
    • Classification time 0.25 – 2 minutes
• There are also methods for extracting the changes and applying them to the whole
  • Re-apply final methods to the whole corpus if required
• Total effort for this study ≤ 2 person-weeks
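The idea of a module can be illustrated with a naive reachability-based extraction: start from the signature and keep pulling in the definition of every entity mentioned so far. This is only a sketch of the concept for a simple terminology where each concept has one definition; it is not the extraction method used in the study, and the `AXIOMS` data and `extract_module` helper are hypothetical.

```python
# Hypothetical axioms: defined concept -> entities mentioned in its definition.
AXIOMS = {
    "Chronic bronchitis": {"Bronchitis", "Clinical course", "Chronic"},
    "Bronchitis": {"Disease", "Finding site", "Bronchial structure"},
    "Disease": {"Clinical finding"},
    "Fracture of femur": {"Disease", "Finding site", "Femur"},
}

def extract_module(signature, axioms=AXIOMS):
    """Collect the definitions reachable from the signature, plus every entity they mention."""
    entities = set(signature)
    module = {}
    frontier = list(signature)
    while frontier:
        entity = frontier.pop()
        if entity in axioms and entity not in module:
            module[entity] = axioms[entity]
            for mentioned in axioms[entity]:
                if mentioned not in entities:
                    entities.add(mentioned)
                    frontier.append(mentioned)
    return module, entities

module, entities = extract_module({"Chronic bronchitis"})
print(sorted(module))  # ['Bronchitis', 'Chronic bronchitis', 'Disease']
```

The module is larger than the signature because it must carry every definition needed to reproduce the inferences, which matches the slide's figures of a ~8500-entity signature yielding a ~35,000-entity module.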
Summary http://dx.doi.org/10.1016/j.jbi.2011.10.002
• “Lexically suggest, semantically define” works to raise issues
• Post-coordination of acute and chronic is unlikely to work reliably, unless
  • SNOMED makes the pattern consistent
  • SNOMED bases decisions on consequences for use in patient care
    • Are patient care clinicians likely to align with pathology in the ED?
• Other findings
  • Working on modules makes analysis of SNOMED practical
  • There are problems in the anatomy and qualifier hierarchies
• Questions
  • How many other such problems are there?
  • How do they affect post-coordination?
  • How to establish QA procedures to find out and prevent recurrence?