Knowledge Representation and Indexing Using the Unified Medical Language System. Kenneth Baclawski*, Joseph “Jay” Cigna*, Mieczyslaw M. Kokar*, Peter Major†, Bipin Indurkhya‡. *Northeastern University, †Jarg Corporation, ‡Tokyo University of Agriculture and Technology.
Purpose • Biomedical Information Searches • Ontologies & the UMLS • Knowledge Representation: Input - Natural Language Processing; Retrieval - Ontologies & Semantic Frameworks; Information Visualization - Keynets • Results of Usability Studies
Introduction Problem: Low Quality Search • Keyword-matching searches often return a high volume of results with low precision. • Discrete keywords do not represent knowledge. • Results of a search are not arranged in a semantically relevant way. • Examining search results is often tedious. • Search results include only textual documents.
Introduction Solution: Ontologies • An ontology is a model for knowledge extraction and management: a domain-specific vocabulary together with theories that express the meaning of that vocabulary within the community that uses it.
Advantages of Ontologies • Allow semantically correct retrieval based on domain-specific criteria. • No limit to the depth of knowledge that can be represented, managed and retrieved. • Many kinds of information objects can be retrieved: images, video, sound, etc., as well as text. • Results of a search are grouped by how documents are relevant to the whole query. • The ontology can be updated as new terminology and relationships are introduced.
UMLS • Developed by the US National Library of Medicine since 1986 • Overcomes retrieval problems caused by: differences in terminology; distributed database sources • Develops machine-readable “knowledge sources” • Allows researchers and health professionals to retrieve and integrate electronically available biomedical information.
• Free • Iteratively refined and expanded based on feedback • Maps many different names for the same concept • Grateful Med and PubMed are applications of the UMLS
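The idea of mapping many different names to the same concept can be sketched as a normalized lookup table. This is a minimal illustration, not the UMLS API: the concept identifiers and synonym lists are invented for the example, and the normalization rule (lowercase, hyphens treated as spaces) is an assumption.

```python
from typing import Dict, List, Optional

# Hypothetical synonym table: identifiers and names are illustrative only,
# not real UMLS entries.
SYNONYMS: Dict[str, List[str]] = {
    "CONCEPT-001": ["NK cell", "NK cells", "natural killer cell", "natural killer cells"],
    "CONCEPT-002": ["Fc receptor", "Fc receptors"],
}


def normalize(name: str) -> str:
    """Lowercase, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(name.lower().replace("-", " ").split())


def build_index(synonyms: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every normalized name to its concept identifier."""
    index: Dict[str, str] = {}
    for concept_id, names in synonyms.items():
        for name in names:
            index[normalize(name)] = concept_id
    return index


def lookup(index: Dict[str, str], name: str) -> Optional[str]:
    """Return the concept identifier for a name, or None if it is unknown."""
    return index.get(normalize(name))


INDEX = build_index(SYNONYMS)
```

With this table, `lookup(INDEX, "Natural Killer Cells")` and `lookup(INDEX, "NK cell")` resolve to the same identifier, so a search can match on the concept rather than on the literal keyword.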
Semantic Categories: > 130 semantic categories • Semantic Relationships: e.g. “is a”, “part of”, “disrupts” • Semantic Concepts (Vocabulary): > 1,000,000 concepts map to categories
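The three layers above (categories, relationships between categories, and concepts mapped to categories) can be sketched as a small data structure. This is an assumption-laden toy, not the UMLS data model: the class name `SemanticNetwork`, its methods, and the category names used in the usage note are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticNetwork:
    """Toy model: categories, category-level relationships, concept mapping."""

    categories: set = field(default_factory=set)
    # (source category, relationship name, target category) triples
    relationships: set = field(default_factory=set)
    # concept name -> category name
    concepts: dict = field(default_factory=dict)

    def add_relationship(self, source: str, name: str, target: str) -> None:
        """Register both categories and the relationship between them."""
        self.categories.update({source, target})
        self.relationships.add((source, name, target))

    def add_concept(self, concept: str, category: str) -> None:
        """Map a concept to an existing category."""
        if category not in self.categories:
            raise ValueError(f"unknown category: {category}")
        self.concepts[concept] = category

    def allowed(self, source_concept: str, name: str, target_concept: str) -> bool:
        """True if the concepts' categories permit this relationship."""
        triple = (self.concepts[source_concept], name, self.concepts[target_concept])
        return triple in self.relationships
```

For example, after `net.add_relationship("Receptor", "part of", "Cell")` and mapping "Fc-receptors" to "Receptor" and "NK cells" to "Cell" (illustrative category names), `net.allowed("Fc-receptors", "part of", "NK cells")` holds while the reversed direction does not.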
Natural Language Processing using an Ontology • [Figure with parts labeled “syntactic” and “semantic”]
Keynets • A technique for representing information visually so that it can be manipulated into meaningful associations, refining the knowledge extracted. • Exploits the human-computer interactivity inherent in knowledge processing. • Based on the information visualization concept (Shneiderman, 1998)
Knowledge Representation using the UMLS and Keynets • A Keynet is a directed acyclic graph. • Provides a consistent categorization for all concepts. • Shows the important relationships between the concepts. • NLP using the UMLS produces Keynets, which support a new search strategy for knowledge processing of biomedical information. • Example: “Fc-receptors on NK cells”
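The description of a Keynet as a directed acyclic graph whose nodes carry a category and whose edges carry a relationship can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `Keynet`, its methods, and the category labels in the usage note are hypothetical.

```python
from collections import defaultdict


class Keynet:
    """Directed graph of categorized concepts; edges that close a cycle are rejected."""

    def __init__(self):
        self.category = {}               # concept -> category
        self.edges = defaultdict(list)   # concept -> [(relationship, concept)]

    def add_concept(self, concept, category):
        self.category[concept] = category

    def _reachable(self, start, goal):
        """Depth-first search: can `goal` be reached from `start`?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(target for _, target in self.edges[node])
        return False

    def add_edge(self, source, relationship, target):
        if source not in self.category or target not in self.category:
            raise KeyError("both concepts must be added before linking them")
        # source -> target closes a cycle exactly when source is reachable from target
        if self._reachable(target, source):
            raise ValueError("edge would create a cycle")
        self.edges[source].append((relationship, target))
```

The example phrase “Fc-receptors on NK cells” would then be two concepts joined by one labeled edge: `add_concept("Fc-receptors", "Receptor")`, `add_concept("NK cells", "Cell")`, `add_edge("Fc-receptors", "on", "NK cells")`. Rejecting cycles at insertion time keeps the graph acyclic without a separate validation pass.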
Usability Study • The purpose was to explore users' reactions to different representations of biomedical information. • Keywords: Fc-receptors, cells, NK cells • Keynet: [figure] • Sample: n = 11; MDs, PhDs, biomedical engineers and pharmacologists, i.e. individuals who would typically need to search for biomedical information
Survey Format: Three Sections • Demographics. • 9 focused semantic differential questions. • Open-ended questions to assess subjects' overall impressions of using Keynets and information visualization for knowledge representation.
Semantic Differential Question • Scale 1-9, 0 = N/A • e.g. confusing/clear: 1 is most like the first word (“confusing”), 9 is most like the last word (“clear”) • confusing 1 2 3 4 5 6 7 8 9 clear
Semantic Differential Question • 1a. How would you rate the Keynet version in its ability to represent the biomedical text given? confusing 1 2 3 4 5 6 7 8 9 clear • 1b. How would you rate the Keyword version in its ability to represent the biomedical text given? confusing 1 2 3 4 5 6 7 8 9 clear
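Scoring a semantic differential item on this 1-9 scale, with 0 meaning N/A, can be sketched as below. The function name `mean_score` and the sample responses are hypothetical and are not the study's data; averaging only the rated answers is an assumption about how N/A would be handled.

```python
from statistics import mean

NOT_APPLICABLE = 0  # the survey's code for N/A


def mean_score(responses):
    """Mean of the 1-9 ratings, ignoring 0 (N/A); None if nobody rated the item."""
    for r in responses:
        if r != NOT_APPLICABLE and not 1 <= r <= 9:
            raise ValueError(f"rating out of range: {r}")
    rated = [r for r in responses if r != NOT_APPLICABLE]
    return mean(rated) if rated else None


# Hypothetical responses for illustration only, NOT the study's data.
keynet_1a = [7, 8, 0, 6, 9]
keyword_1b = [6, 7, 5, 0, 8]
```

Excluding N/A answers rather than counting them as zero keeps an unanswered item from dragging the mean toward “confusing”.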
Survey Results • [Table: score by question, n = 11]
Results of Usability Study • Level of Understanding of Keynets • Remarkably high given the short time to complete the study, the diversity of the population and the different examples used. • Example: a missing relationship was detected by 7 of 11 subjects. • Limit Complexity • Representations should be concise, drilling down only at the user's request. • Keywords versus Keynets • No statistically significant difference: Keynets are at least as useful as Keywords for representing biomedical information in retrieval.
Summary • A new strategy is suggested for searching and retrieving biomedical information, using NLP, the UMLS and Keynet displays of the retrieved results. • Issues of semantic versus syntactic representation in biomedical information retrieval. • Issues relating to information visualization in biomedical information retrieval.
Conclusion • Consider the computer-human interactivity issues • A picture is worth a thousand keywords!!
Acknowledgement • This work was performed as part of the “Biomedical Science Information Retrieval and Management” project, supported by grant # 1 R43 LM06665-01 from the National Institutes of Health (NIH). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH. • A portion of this study was conducted at Jarg Corporation, 332b Second Ave., Waltham, MA 02451-1104 (www.jarg.com). • Travel expenses for this presentation were provided by a grant from the Dept. of Energy.
Addendum • Technical information related to Keynets: http://www.ccs.neu.edu/home/kenb/key/