Automatic Mapping of ICPC-2 PLUS Terms to the SNOMED CT Terminology
Jon Patrick, Yefeng Wang, School of IT, University of Sydney
Graeme Miller, Julie O’Halloran, Family Medicine Research Centre, Uni of Sydney
Introduction
• Mapping ICPC 2-PLUS to SNOMED CT
• Computerised Term Mapping Approach
• Searching & Mapping Approval & Validation
Mapping Methodologies
Semi-Automatic Mapping
• Mapping using UMLS
  • Common CUI (Concept Unique Identifier)
• Lexical Matching
  • String-Based Matching
  • Sense-Based Matching
• Post-Coordination
Mapping using UMLS
• 100+ different vocabularies, including ICPC2P (2000) and SNOMED CT (2002)
• Common CUI mapping
[Diagram: an ICPC2P Term (ICPC2P Vocabulary) and an SCT Term (SNOMED CT Vocabulary) linked through a shared UMLS CUI in the UMLS Metathesaurus]
The UMLS Vocabularies
• Concept Names and Sources
  C0000039|ENG|…|SNOMEDCT |FN|82991003|…
  C0000215|SPA|… |MTHSCTSPA |FN|86884000|…
  …
  C0000039|ENG|…|ICPC2P |FN|A01001|…
  …
• A01.001 mapped to SNOMED concept 82991003
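The common-CUI idea above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the tuple layout `(CUI, language, source, code)` is a simplification assumed here (the fields elided with "…" on the slide are left out), and the function name `map_by_common_cui` is hypothetical.

```python
from collections import defaultdict

# Simplified rows taken from the slide: (CUI, language, source vocabulary, code).
# Assumption: the fields elided on the slide are not needed for the join.
ROWS = [
    ("C0000039", "ENG", "SNOMEDCT", "82991003"),
    ("C0000215", "SPA", "MTHSCTSPA", "86884000"),
    ("C0000039", "ENG", "ICPC2P", "A01001"),
]

def map_by_common_cui(rows, source="ICPC2P", target="SNOMEDCT"):
    """Return {source_code: sorted target codes} for codes that share a CUI."""
    by_cui = defaultdict(lambda: {source: set(), target: set()})
    for cui, _lang, vocab, code in rows:
        if vocab in (source, target):
            by_cui[cui][vocab].add(code)

    mapping = defaultdict(set)
    for codes in by_cui.values():
        for src_code in codes[source]:
            mapping[src_code] |= codes[target]
    # Drop source codes whose CUI has no code in the target vocabulary.
    return {src: sorted(tgts) for src, tgts in mapping.items() if tgts}

common_cui_mapping = map_by_common_cui(ROWS)
```

With the three rows from the slide, the only pair sharing a CUI is ICPC2P code A01001 and SNOMED concept 82991003, which is the mapping the slide reports.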
Matching Results
• 3448 ICPC 2-PLUS terms mapped to 6557 SNOMED CT concepts
• 3326 (50.7%) of these mappings are best-fit mappings
String-Based Matching
• Normalized Term Matching
  • Remove attributes & punctuation
  • Remove stop words
  • Stemming, spelling variations
  • Ignore case, word order etc.
• Expanded Term Matching
  • IUCD → Intra-Uterine Contraceptive Device
  • musculo → musculoskeletal
• Substring Term Matching
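A minimal Python sketch of normalized and expanded term matching, under stated assumptions: the stop-word list, the abbreviation table and the suffix-stripping rule are placeholders chosen for illustration (the slides do not specify them), and attribute removal and spelling variation are not handled.

```python
import re

# Assumed resources; the slides do not list the actual stop words or expansions.
STOP_WORDS = {"of", "the", "and", "in", "with", "to"}
EXPANSIONS = {"iucd": "intra uterine contraceptive device"}

def crude_stem(word):
    """Strip a trailing plural 's'; a rough stand-in for a real stemmer."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def normalize(term):
    """Lower-case, drop punctuation and stop words, expand, stem, sort words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", term.lower())
    words = []
    for word in cleaned.split():
        # Expand known abbreviations into their full word sequence.
        words.extend(EXPANSIONS.get(word, word).split())
    kept = [crude_stem(w) for w in words if w not in STOP_WORDS]
    # Sorting makes the comparison independent of word order.
    return " ".join(sorted(kept))

def string_match(term_a, term_b):
    """Two terms match when their normalized forms are identical."""
    return normalize(term_a) == normalize(term_b)
```

For example, `string_match("Fracture of the Femur", "femur fractures")` and `string_match("IUCD", "Intra-Uterine Contraceptive Device")` both hold under these rules.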
Synonym Matching
• Use a thesaurus to explore the semantics.
• Replace each constituent word with its semantically equivalent words
  • Synonyms
    • fever → febricity, pyrexia
  • Derivationally related words
    • fever → feverish, feverous
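The word-replacement step can be illustrated as follows. This is a sketch only: the toy `THESAURUS` holds just the slide's "fever" example, and the names `variants` and `synonym_match` are hypothetical; the slides do not say which thesaurus was used.

```python
from itertools import product

# Toy thesaurus built from the slide's example (synonyms plus derivationally
# related words). A real system would load a full lexical resource instead.
THESAURUS = {
    "fever": {"febricity", "pyrexia", "feverish", "feverous"},
}

def variants(term, thesaurus=THESAURUS):
    """Generate every variant obtained by swapping words for equivalents."""
    options = [
        sorted({word} | thesaurus.get(word, set()))
        for word in term.lower().split()
    ]
    return {" ".join(combo) for combo in product(*options)}

def synonym_match(term, target_terms):
    """Return the target terms that equal some variant of the input term."""
    targets = {t.lower() for t in target_terms}
    return sorted(variants(term) & targets)
```

Because every word contributes all of its alternatives, the number of variants grows multiplicatively with term length, so a practical system would likely need to cap or rank them.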
Post-coordination Matching
• Break the pre-coordinated terms into atomic terms
• Map each atomic term to SCT terms
• Use top level categories to identify relationships
  • Qualification
  • Combination
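The three steps above might look like this in Python. Everything concrete here is assumed for illustration: the lookup table, the placeholder identifiers (not real SNOMED CT concept IDs) and the rule that a "Qualifier value" paired with another category signals qualification. The slides do not describe the actual decision rule.

```python
# Hypothetical lookup: atomic term -> (placeholder id, top-level category).
ATOMIC_CONCEPTS = {
    "pain": ("ID-PAIN", "Clinical finding"),
    "chronic": ("ID-CHRONIC", "Qualifier value"),
    "fever": ("ID-FEVER", "Clinical finding"),
}

def post_coordinate(term, concepts=ATOMIC_CONCEPTS):
    """Split a term into atomic words, map each, and label the relationship."""
    words = term.lower().split()
    # Give up if the term is empty or any atomic word has no mapping.
    if not words or any(word not in concepts for word in words):
        return None
    parts = [concepts[word] for word in words]
    categories = {category for _cid, category in parts}
    # Assumed rule: a qualifier alongside another category means qualification;
    # anything else is treated as a plain combination.
    if "Qualifier value" in categories and len(categories) > 1:
        relationship = "qualification"
    else:
        relationship = "combination"
    return {
        "concepts": [cid for cid, _category in parts],
        "relationship": relationship,
    }
```

Under these assumptions, "chronic pain" yields a qualification of two concepts, "fever pain" a combination, and a term containing an unmapped word returns `None`.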
Mapping Evaluation
• One-to-one matching, on the “best fit”
• Done by FMRC experts
  • UMLS Mapping
  • String-Based Mapping
• Not done for Post-coordination Mapping
• 96.49% of UMLS mapping candidates have at least one best-fit mapping.
• 94.25% of string-based mapping candidates have at least one best-fit mapping.
Conclusion & Future Work
Conclusion
• Mapped 80% of ICPC 2-PLUS terms to SNOMED CT.
• UMLS & lexical matching provide reliable mappings.
• Post-coordination provides a solution to content incompleteness issues.
Future Work
• Explore context semantics.
• Use structural information and relationships in SNOMED CT to refine mapping candidates.
• Evaluate the post-coordination.