This research paper discusses the past, present, and future of the Neighborhood Auditing Tool (NAT), a hybrid interface for auditing the UMLS. It highlights the features of NAT, its adaptation to SNOMED, and the potential of relationship-centric UMLS auditing. The goal of NAT is to provide relevant information to auditors and help them focus on areas likely to contain errors.
The Neighborhood Auditing Tool – Past, Present and Future. James Geller, Michael Halper, Yehoshua Perl, C. Paul Morrey, Chris Ochs. Structural Analysis of Biomedical Ontologies Center (SABOC), New Jersey Institute of Technology, Newark, NJ
Overview. The Past: Goals of an Auditor’s Tool for the UMLS; Principles of Auditing with Neighborhoods; The Idea of a Hybrid Display. The Present: Neighborhood Auditing Tool (NAT) Features for the UMLS; The New NAT Website. The Near Future: Adaptation of NAT to SNOMED; Tools for SNOMED Abstraction Display. The Farther Future: Relationship-Centric UMLS Auditing; Guiding the Auditor on What to Audit; Managing Auditors and Workflows.
Research Paper C.P. Morrey, J. Geller, M. Halper, Y. Perl. The Neighborhood Auditing Tool: A hybrid interface for auditing the UMLS. J Biomedical Informatics, 42(3):468-89, June 2009. (Part of a Special Issue edited by our group on Auditing).
Auditing the UMLS. About 156 source vocabularies; it is natural that inconsistencies will appear. Over 2.2 million concepts and 9.9 million terms.* Two-level structure consisting of the Semantic Network and the Metathesaurus. 133 Semantic Types in the Semantic Network, organized as two trees. *UMLS Metathesaurus version 2010AA
Some of our Work on Auditing H. Gu, Y. Perl, J. Geller, M. Halper, L. Liu, and J.J. Cimino. Representing the UMLS as an Object-oriented Database: Modeling Issues and Advantages. J Am Med Inform Assoc, 7(1):66-80, 2000. J. Geller, H. Gu, Y. Perl, and M. Halper. Semantic refinement and error correction in large terminological knowledge bases. Data & Knowledge Engineering, 45(1):1-32, 2003. J.J. Cimino, H. Min, and Y. Perl. Consistency across the hierarchies of the UMLS Semantic Network and Metathesaurus. J Biomed Inform, 36(6):450-461, 2003. H. Gu, Y. Perl, G. Elhanan, H. Min, L. Zhang, Y. Peng. Auditing concept categorizations in the UMLS. Artif Intell Med, 31(1):29-44, 2004. Y. Chen, Y. Perl, J. Geller, and J.J. Cimino. Analysis of a study of the users, uses, and future agenda of the UMLS. J Am Med Inform Assoc, 14(2):221-231, 2007. J. Geller, C. P. Morrey, J. Xu, M. Halper, G. Elhanan, Y. Perl, G. Hripcsak, (2009). Comparing Inconsistent Relationship Configurations Indicating UMLS Errors, In L. Ohno-Machado, V. L. Patel, D. Aronsky (Ed.), Proceedings of the American Medical Informatics Association, (pp. 193-197). San Francisco, CA. Omnipress.
Previous Work on Auditing (cont’d) H. Gu, G. Hripcsak, Y. Chen, C.P. Morrey, G. Elhanan, J.J. Cimino, J. Geller, and Y. Perl. Evaluation of a UMLS auditing process of semantic type assignments. In J.M. Teich, J. Suermondt, and G. Hripcsak, editors, Proc AMIA Symp, pages 294-298, Chicago IL, Nov. 2007. Y. Chen, H. Gu, Y. Perl, J. Geller, M. Halper. Structural group auditing of a UMLS semantic type's extent. J Biomed Inform. 2009 Feb;42(1):41-52. L. Chen, C.P. Morrey, H. Gu, M. Halper, Y. Perl. Modeling multi-typed structurally viewed chemicals with the UMLS Refined Semantic Network. J Am Med Inform Assoc, 16(1):116-31, 2009. Y. Chen, H. Gu, Y. Perl, J. Geller. Structural group-based auditing of missing hierarchical relationships in UMLS. J Biomed Inform. 2009 Jun;42(3):452-67. Y. Chen, H. Gu, Y. Perl, M. Halper, and J. Xu, Expanding the extent of a UMLS Semantic Type via Group Neighborhood Auditing. J Am Med Inform Assoc, Accepted for publication. K. C. Huang, J. Geller, G. Elhanan, Y. Perl and M. Halper, Auditing SNOMED Integration into the UMLS for Duplicate Concepts. Accepted to AMIA 2010.
Ancient Past – Before the NAT: Provide Info as Paper Form CPT: C1081844 Antonospora locustae SRC: NCBI STY: T004 T009 Fungus + Invertebrate DEF: SYN: Antonospora locustae | Nosema locustae PAR: Antonospora {STY: Invertebrate} CHD: 7 Data shown for this concept is from the UMLS Metathesaurus version 2006AC
Auditing Results Also in Paper Form (C1081844) Antonospora locustae STY: Fungus + Invertebrate No errors Semantic Type Error: Fungus Semantic Type Error: Invertebrate Add Semantic Type______________________ Ambiguity Other error_____________________________ Comments _____________________________ ______________________________________
Goals of an Auditor’s Tool for the UMLS. Display relevant information to the auditor. Do not overwhelm the auditor with too much information. Help the auditor focus on areas most likely to contain errors. Algorithms suggest likely erroneous concepts. Concepts are reviewed in a neighborhood display.
Or, as I Like to Say It • “Give them [UMLS Auditors] what they want.” • “Give them all that they want.” • “Give them only what they want.” • (At least this is what we want.) • But how? • As a diagram? • As indented text?
What Makes a Diagram Wonderful? You can follow parent/child paths with your eyes. You can get a feeling for everything a concept is connected to with one look. You can see multiple parents and multiple paths with one look. You can see global features (short and bushy versus tall and sparse, or (gasp!) tall and bushy). But is every diagram wonderful? Let us look at more from the ancient past.
What Makes Indented Text Wonderful? Think of something as simple as a Microsoft file list in an Explorer window. Indentation expresses parenthood compactly and elegantly. There are no lines crossing, no lines at all. You don’t need a layout algorithm. There is a linear order in which to study text. But … see under “What Makes a Diagram Wonderful?” All of that is missing.
We Have a Problem. Diagrams are wonderful, as long as they fit on one screen. Indented text is wonderful, as long as there are no or very few multiple parents. But the UMLS does not fit on one screen, and there are many cases of multiple parents.
The Idea of a Hybrid Display Keep the best features of text and the best features of diagrams. Auditing is organized around a “concept of interest,” the focus concept. Maintain relative positions between the focus concept and its children, parents, etc. Eliminate clutter of arrows.
Auditing with Neighborhoods of a Focus Concept. Several years of experience: Auditing is to a large degree a “local” activity. It happens (mostly) in the neighborhood of the focus concept an auditor is interested in. Concepts have two kinds of knowledge elements. Textual Knowledge Elements: preferred term, CUI, synonyms, LUI, definition, sources, semantic types. Contextual Knowledge Elements: neighbors.
Types of Neighborhoods. Focus concept: The concept presently being audited. Immediate Neighborhood: The set of concepts reachable from the focus concept by following one relationship (up, down, lateral, etc.). Extended Neighborhood: Includes parents of parents (grandparents), children of children (grandchildren) and siblings. No lateral chains. Up-Extended and Down-Extended Neighborhoods: add only grandparents or only grandchildren.
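The neighborhood definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NAT implementation: the `ConceptGraph` class, its method names, and the single-letter concept names are hypothetical, lateral relationships are treated as symmetric, and siblings are added only to the full extended neighborhood.

```python
from collections import defaultdict


class ConceptGraph:
    """Toy concept graph; class and method names are illustrative only."""

    def __init__(self):
        self.parents = defaultdict(set)
        self.children = defaultdict(set)
        self.lateral = defaultdict(set)

    def add_parent(self, child, parent):
        self.parents[child].add(parent)
        self.children[parent].add(child)

    def add_lateral(self, a, b):
        # Assumption: lateral links are treated as symmetric in this sketch.
        self.lateral[a].add(b)
        self.lateral[b].add(a)

    @staticmethod
    def _get(table, concept):
        # Read without creating empty entries in the defaultdict.
        return table.get(concept, set())

    def immediate(self, focus):
        """Concepts reachable from the focus by following one relationship."""
        hood = (self._get(self.parents, focus)
                | self._get(self.children, focus)
                | self._get(self.lateral, focus))
        return hood - {focus}

    def extended(self, focus, up=True, down=True):
        """Immediate neighborhood plus grandparents and/or grandchildren.

        Siblings are added only when both directions are requested.
        Lateral links are never chained.
        """
        hood = set(self.immediate(focus))
        siblings = set()
        for parent in self._get(self.parents, focus):
            if up:
                hood |= self._get(self.parents, parent)
            siblings |= self._get(self.children, parent)
        if down:
            for child in self._get(self.children, focus):
                hood |= self._get(self.children, child)
        if up and down:
            hood |= siblings
        hood.discard(focus)
        return hood


def build_demo():
    """Hypothetical graph: G > P > {F, S}, F > C > GC, F ~ L ~ LL."""
    g = ConceptGraph()
    g.add_parent("P", "G")
    g.add_parent("F", "P")
    g.add_parent("S", "P")
    g.add_parent("C", "F")
    g.add_parent("GC", "C")
    g.add_lateral("F", "L")
    g.add_lateral("L", "LL")
    return g
```

Calling `build_demo().extended("F", down=False)` gives the up-extended neighborhood and `extended("F", up=False)` the down-extended one. The concept `LL` never appears in any neighborhood of `F`, because it is reachable only through a chain of two lateral links.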
References about Neighborhoods M.S. Tuttle, D.D. Sherertz, N.E. Olson, M.S. Erlbaum, W.D. Sperzel, and L.F. Fuller, et al. Using META-1, the first version of the UMLS Metathesaurus. In Proc 14th Annu Symp Comput Appl Med Care, pages 131-135, Washington, D.C., 1990. S.J. Nelson, M.S. Tuttle, W.G. Cole, D.D. Sherertz, W. D. Sperzel, M.S. Erlbaum, L.L. Fuller, N.E. Olson, From meaning to term: semantic locality in the UMLS Metathesaurus. In Proc Annu Symp Comput Appl Med Care, pages 209-213, Washington, D.C., 1991.
A Hybrid Diagram/Form Display of a Neighborhood [Figure: screen regions labeled Parents, Synonyms, Relationships, Focus Concept, Children]
Desirable Information Beyond Neighborhoods. Concept definition for the focus concept. Sources for concepts and relationships. Assigned Semantic Types of concepts. Definitions of relevant Semantic Types. Global view of the Semantic Network: indented (better for wide branches) or graphical (better for almost everything else).
The Present NAT: Serving the Auditor. The Neighborhood Auditing Tool has been implemented to fully support the display of neighborhoods. It is Web-based. Navigation: Neighboring concepts are an easy (double-)click away. The additional features listed above have been implemented. YouTube training videos for beginners. Redesigned Home Page: http://nat.njit.edu
Demonstration of NAT Features: Neighborhood; Grandparents and grandchildren; Synonyms; Relationships: Concept, Sibling, Term; Focus concept definition; Sources: Concepts, Relationships; Display CUIs; Semantic Type display; Semantic Type definition; Semantic Network (indented); Semantic Network (diagram); Navigation; Search (full, partial); Viewing History; Choice of release; Choice of sources; Offline version
The Present and Future (in Acronyms). Present: Release of the NAT with Level 0 (and SNOMED); BLUESNO = Biomedical Layout Utility Engine for SNOmed; BLUESNO-3D. Future: SNET = SNomed Enhancement Tool; CRAM-NAT = C-NAT (Concept or Current NAT) + R-NAT (Relationship-Centric NAT) + Audit Set Builder + Management of Audits
Present: Release of the NAT with Level 0 (and SNOMED) • On the Web site there are three releases: • Public NAT with UMLS Level 0 (unrestricted) terminologies; for everybody • Public NAT + SNOMED for users with a SNOMED license. • NAT with complete UMLS (requires a password).
BLUESNO: Biomedical Layout Utility Engine for SNOmed [Abstractions]. Based on years of research on building abstraction networks of Areas and Partial Areas for SNOMED by hand. Definition: An area contains all concepts with the same set of relationships (attributes/roles). Definition: A root of an area is a concept that has no parents within that area. Definition: A partial area contains a root and all its descendants within one area. (This is a simplified case. It assumes no overlap.)
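The three definitions above translate fairly directly into code. The sketch below is a minimal illustration under stated assumptions, not BLUESNO itself: the function names, the relationship names `r1` and `r2`, and the concepts in `demo()` are all hypothetical, each concept's relationships are given as a plain set of names, and the hierarchy is assumed to be acyclic.

```python
from collections import defaultdict


def build_areas(relationships):
    """Group concepts by their exact set of relationship names (one area per set)."""
    areas = defaultdict(set)
    for concept, rels in relationships.items():
        areas[frozenset(rels)].add(concept)
    return dict(areas)


def area_roots(area, parents):
    """Roots are the concepts with no parent inside the same area."""
    return {c for c in area if not (parents.get(c, set()) & area)}


def partial_area(root, area, children):
    """A root plus all of its descendants that lie within the area."""
    members, stack = {root}, [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, set()):
            if child in area and child not in members:
                members.add(child)
                stack.append(child)
    return members


def summarize(relationships, parents):
    """Map each area (named by its relationship set) to {root: partial-area size}."""
    children = defaultdict(set)
    for child, ps in parents.items():
        for p in ps:
            children[p].add(child)
    summary = {}
    for name, area in build_areas(relationships).items():
        summary[name] = {
            root: len(partial_area(root, area, children))
            for root in area_roots(area, parents)
        }
    return summary


def demo():
    """Hypothetical data: one area {r1, r2} with two roots, A and B."""
    relationships = {
        "Top": set(),
        "X": {"r1"},
        "A": {"r1", "r2"}, "A1": {"r1", "r2"}, "A2": {"r1", "r2"},
        "B": {"r1", "r2"}, "B1": {"r1", "r2"},
    }
    parents = {
        "X": {"Top"}, "A": {"Top"}, "A1": {"A"}, "A2": {"A1"},
        "B": {"X"}, "B1": {"B"},
    }
    return summarize(relationships, parents)
```

In the demo data the area named `{r1, r2}` has two partial areas, rooted at `A` (3 concepts) and `B` (2 concepts). Because the simplified case assumes no overlap, the sketch makes no attempt to handle a concept that descends from two roots of the same area; such a concept would simply be counted in both partial areas.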
One Element of a BLUESNO Area/Partial Area Diagram [Figure: an Area box named by the relationships that are common to all concepts in this Area and its included Partial Areas (= Area name), here {Specimen Substance, Specimen Procedure} (9). Two Partial Areas are shown inside it, each named by its root concept (= Partial Area name): Drainage Fluid Sample (3) and Pus Swab (2). Callouts: Number of Partial Areas in Area; Number of concepts in Partial Area; Other concepts in Partial Area (usually not shown).]