Olivier Bodenreider, Lister Hill National Center for Biomedical Communications, Bethesda, Maryland - USA. Workshop on The Future of the UMLS Semantic Network, NLM, April 8, 2005. Summary: Issues and Suggestions.
UMLS Semantic Network • Necessary complement to the Metathesaurus • Provides direct categorization to concepts (some of which would be orphans otherwise) • Best used in conjunction with the Metathesaurus • Used for • Natural Language Processing • Information retrieval • Knowledge discovery • Essentially stable
Semantic types • Purposely limited to a small number of categories • Purposely emphasizes categories of major interest • e.g., Neoplastic Process • No attempt at anything JEPD (jointly exhaustive and pairwise disjoint) • No explicit classificatory principles or properties • Textual (not formal) definitions • Introduction points for semantic relationships
Semantic relations • Single-inheritance hierarchy • Class-class relations • Simply mirrored by inverses • Weakest reading possible: some-some • Sufficient for some applications (e.g., semantic interpretation, reporting and visualization of clinical information) • Too limited for reasoning
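The gap between the some-some reading and a stronger reading that would support reasoning can be shown on a toy example. This is a minimal sketch, not a UMLS implementation: the instance data and function names are hypothetical, and the stronger reading is assumed here to mean "every instance of the first class is related to at least one instance of the second".

```python
# Toy illustration of two readings of a class-class relation R(A, B).
# All instances and names below are hypothetical, for illustration only.

def holds_some_some(pairs, class_a, class_b):
    """Weakest reading: at least one A is related to at least one B."""
    return any(a in class_a and b in class_b for a, b in pairs)

def holds_for_every_a(pairs, class_a, class_b):
    """Assumed stronger reading: every A is related to at least one B."""
    return all(
        any(x == a and b in class_b for x, b in pairs)
        for a in class_a
    )

# Hypothetical instances of two classes and a 'treats' relation between them.
drugs = {"aspirin", "inert_pill"}
diseases = {"headache"}
treats = {("aspirin", "headache")}

some_some = holds_some_some(treats, drugs, diseases)
stronger = holds_for_every_a(treats, drugs, diseases)
```

Here the some-some reading holds because one drug treats one disease, while the stronger reading fails because `inert_pill` treats nothing. Only the stronger kind of statement licenses an inference about an arbitrary member of the class.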
Semantic groups • 15 collections of semantic types • Created for visualization purposes • Purposely non-ontological (not subtrees from the isa hierarchy of STs) • Based on common properties of (sometimes) otherwise heterogeneous semantic types
Semantic categorization • Generally corresponds to isa (rarely to is an instance of) • Convenient for extracting a class • Direct access: no traversal necessary • Bypasses hierarchies in vocabularies: not subject to questionable hierarchical relations
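"Direct access: no traversal necessary" can be pictured as a simple filter over concept-to-type assignments. The sketch below assumes the assignments are already loaded into a dictionary; the concept identifiers are made up and the function name is hypothetical.

```python
# Hypothetical concept -> semantic type assignments (identifiers are made up).
concept_types = {
    "concept_1": {"Neoplastic Process"},
    "concept_2": {"Pharmacologic Substance", "Organic Chemical"},
    "concept_3": {"Neoplastic Process"},
}

def concepts_with_type(assignments, semantic_type):
    """Return, sorted, the concepts directly assigned the given semantic type.

    No source-vocabulary hierarchy is walked: membership is read off
    the assignment itself.
    """
    return sorted(c for c, types in assignments.items() if semantic_type in types)

neoplasms = concepts_with_type(concept_types, "Neoplastic Process")
```

Because the lookup never touches parent-child links from the source vocabularies, questionable hierarchical relations in those vocabularies cannot affect the result.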
Semantic type assignment (1) • Essentially manual (default based on source information, reviewed by Metathesaurus editors) • Complex and labor intensive • Multiple ST assignment sometimes required • Structure + role (chemicals) • Systematic polysemy • Guidelines • Usage notes • Prior categorization of similar concepts
Semantic type assignment (2) • No constraints based on mandatory consistency between SN and Metathesaurus (e.g., ST of the child concept must be identical to or a descendant of ST of the parent concept) • No constraints based on ontological principles (e.g., disjunction between Entity and Event) • No constraints based on structural principles (e.g., allowable hybrid types)
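The first example constraint (the ST of a child concept must be identical to or a descendant of the ST of its parent concept) could be checked along the following lines. This is a sketch under stated assumptions: the hierarchy is a simplified toy fragment stored as a child-to-parent map, the function names are hypothetical, and the handling of concepts with several STs (each child ST must fit under some parent ST) is one possible interpretation, not a rule stated in the slides.

```python
# Simplified toy fragment of a single-inheritance ST hierarchy (child -> parent).
# None marks where this toy fragment stops, not necessarily a true root.
st_parent = {
    "Neoplastic Process": "Disease or Syndrome",
    "Disease or Syndrome": "Pathologic Function",
    "Pathologic Function": None,
    "Finding": None,
}

def is_same_or_descendant(st, ancestor, parent_of):
    """Walk up from st; True if ancestor is st itself or one of its ancestors."""
    current = st
    while current is not None:
        if current == ancestor:
            return True
        current = parent_of.get(current)
    return False

def is_consistent(child_sts, parent_sts, parent_of):
    """Assumed rule: every ST of the child concept fits under some ST of the parent."""
    return all(
        any(is_same_or_descendant(c, p, parent_of) for p in parent_sts)
        for c in child_sts
    )

ok = is_consistent({"Neoplastic Process"}, {"Disease or Syndrome"}, st_parent)
bad = is_consistent({"Finding"}, {"Disease or Syndrome"}, st_parent)
```

In this toy data `ok` is true and `bad` is false; a pair like the second one is the kind of inconsistency between the SN and Metathesaurus hierarchies that such a check would flag for review.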
Systematic polysemy (splitting vs. lumping) • Metathesaurus (RxNorm) distinguishes between • Clinical drug (e.g., Acetaminophen) • Branded drug (e.g., Tylenol) • Both contain acetaminophen as their active ingredient • But does not systematically distinguish between • Prostatic adenoma (the tumor responsible for compressing the urethra) • Prostatic adenoma (the disease of which urinary problems are one manifestation)
Finding • Role played by many different types • Necessarily some-some (rare exceptions) • Reified for convenience
Overall constraints for changes • Finite amount of resources • Driven by usefulness
SN and Metathesaurus • Issues in the SN cannot be dissociated from issues in the Metathesaurus • Inaccurate/inconsistent concept categorization • May be a bigger issue than issues identified in the SN • Relatively frequent • Impair semantic integration and semantic interpretation • Will not be solved solely by addressing issues in the SN
SN vs. Biomedical ontology • Having a good (high-level) ontology of biomedicine is certainly desirable… • But it will be of little use if it is not linked to Metathesaurus concepts • Some ontological features (e.g., some-all) require a much finer granularity than that of the current semantic types
Editing vs. Auditing • Auditing must be pursued, but… • Better editing environments are needed • Law: explicit classificatory principles and properties • Order: • Enforce SN/Meta consistency (use SN relations as a reference for Meta relations) • Restrict allowable combinations of STs • Quality assurance starts at the time of editing
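Restricting allowable combinations of STs at editing time could look like the check below. It is only a sketch: the whitelist content is hypothetical (the Steroid plus Hormone pairing echoes the "Steroid hormone" hybrid mentioned elsewhere in the slides), and the rule that any single ST is acceptable is an assumption.

```python
# Hypothetical whitelist of ST combinations an editor may assign together.
ALLOWED_COMBINATIONS = {
    frozenset({"Steroid", "Hormone"}),
}

def is_allowed_assignment(semantic_types):
    """Accept one ST on its own, or a combination found in the whitelist."""
    types = frozenset(semantic_types)
    if not types:
        return False  # a concept needs at least one ST
    if len(types) == 1:
        return True   # assumption: any single ST is acceptable
    return types in ALLOWED_COMBINATIONS

accepted = is_allowed_assignment(["Steroid", "Hormone"])
rejected = is_allowed_assignment(["Steroid", "Finding"])
```

Running such a check when the editor saves a concept, rather than in a later audit, is what "quality assurance starts at the time of editing" amounts to in practice.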
Source transparency vs. Anarchy (1) • All relations asserted by sources are recorded… (source transparency) • But need not necessarily be trusted • Similar to how synonymy is treated • Metathesaurus synonymy does not always follow source synonymy
Source transparency vs. Anarchy (2) • Similar to how names lacking face validity are treated • Fully specified Metathesaurus names are created • Invalid names are made suppressible • Similarly for relations • Metathesaurus hierarchical relations should ignore some obviously non-hierarchical relations used to create hierarchies in source vocabularies • Suppressibility or Content View Flag (CVF)
Semantic types • Rename some types (face validity) • Extract explicit classificatory principles • Rearrange hierarchy as needed (e.g., Alga) • Revisit roles • Place under sortals when unique (e.g., Enzyme) • Create allowable hybrids (e.g., Steroid hormone)
Semantic relations • Align with Metathesaurus relations (e.g., caused_by / due_to) • Multiple inheritance (?) • Two levels • Coarse class-class, some-some, with mirrored inverses to label the relation (and support semantic interpretation) • Finer non-symmetric class-class, some-all (?) to support reasoning
ST assignment • Facilitated by improved editing environment • Driven by explicit classificatory principles and properties • Simplified by allowable hybrids • Constrained by coherence with SN relations (requires aligned relations and labeled Metathesaurus relations)