An Introduction to SNOMED CT®: Understanding the Release Files. A Technical Overview. Presented by The UK Terminology Centre, information.standards@hscic.gov.uk
Welcome
• Audience: a technical overview for anyone who needs to use the SNOMED CT release files
• Walks through the release packs and folders
• Illustrates how the basics of the SNOMED CT data model are provided in the release files
• Does not require detailed knowledge of the previous release format (RF1), but assumes a basic understanding of SNOMED CT
Using this presentation
• The slides for this presentation are also provided in PowerPoint on www.infostandards.org
• You may find it useful to print them out – the entire script for each slide is provided in the Notes area
• Please note that some of these slides use animations and so are less readable when printed.
The SNOMED CT release
• The SNOMED CT terminology is distributed via a set of release files.
• Those release files are provided in two different formats:
  • Release Format 1 (RF1)
  • Release Format 2 (RF2)
Release Formats
Release Format 1 (RF1)
• Current primary data source for the UK Edition (RF2 is created from RF1)
Release Format 2 (RF2)
• Primary data source for the International Release: RF2 is converted to RF1 to provide the international RF1 release (the plan is to provide only RF2 at some time in the future)
• The UK currently provides both RF1 and RF2; notice will be given when the UK intends to move to RF2 as its primary source
These are distribution formats rather than implementation formats. Technical users of the terminology are expected to load the data into their own implementation schema.
Why two formats?
• The core terminology provided by these release formats is essentially the same; i.e. the fundamental clinical content and knowledge are the same.
• RF1 and RF2 provide the same concepts, descriptions and concept relationships as each other.
• There are, however, some differences in additional features: for example, timestamps and full history in RF2.
• RF2 was developed to accommodate new features in order to address some of the issues and lack of flexibility perceived within RF1.
This webex focuses on RF2. It will give an insight into the detail of the file content.
UK Extensions A quick overview
UK Edition of SNOMED CT
• Consists of:
  • International Release
  • UK Clinical Extension
  • UK Drug Extension (provides the same concepts as dm+d, with some additional items such as family name and relationships to the UK Clinical Edition)
• The International Release plus the UK Clinical Extension form the UK Clinical Edition; adding the UK Drug Extension gives the UK Edition. Ideally you need the Drug Extension for full UK descriptions.
• The files for each are provided separately to accommodate different supplier needs and different release cycle schedules.
• Files of the same type of components need appending to each other for the UK Edition.
• The International Release is provided along with the UK Clinical Extension.
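The appending step described on this slide can be sketched as follows. This is a minimal illustration, assuming the files being combined are tab-delimited text with an identical header row; the function name `append_release_files`, the shortened three-column layout and the sample rows are hypothetical and are not taken from a real release.

```python
import csv
import io


def append_release_files(*tsv_texts):
    """Append RF2-style tab-delimited tables that share one header row."""
    header, rows = None, []
    for text in tsv_texts:
        reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
        file_header = next(reader)
        if header is None:
            header = file_header
        elif file_header != header:
            raise ValueError("files are not of the same component type")
        rows.extend(row for row in reader if row)  # skip blank lines
    return header, rows


# Illustrative inputs with a shortened column set (not real release data).
international = "id\teffectiveTime\tactive\n56265001\t20020131\t1\n"
uk_clinical = "id\teffectiveTime\tactive\n999001681000000107\t20131001\t1\n"

header, rows = append_release_files(international, uk_clinical)
print(header)
print(len(rows), "rows in the combined table")
```

The header check is a cheap guard against accidentally appending, say, a descriptions file to a concepts file.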
UK Edition of SNOMED CT
• UK Clinical Extension: six-monthly releases, in April and October
• International Release: January and July, but not implemented in the UK until it is part of the UK Edition in April and October (respectively)
• The UK Release contains UK mappings and other resources
• UK Drug Extension: 4-weekly, based on dm+d
  • Every 6 months, an additional UK Drug Extension provides relationships to the latest International Release
• dm+d: weekly, XML
• RF2 clinical releases will be on TRUD in April and October from April 2014. RF2 UK Drug Extension releases will be within 5 days of the equivalent RF1 release.
UK Terminology Centre ‘Collection’
• TRUD main page: http://www.uktcregistration.nss.cfh.nhs.uk/trud3/user/guest/group/0/home
• Link on that page: ‘UK Terminology Centre Collection’
• SNOMED CT release files: http://www.uktcregistration.nss.cfh.nhs.uk/trud3/user/guest/group/2/pack/26
• Pack: ‘The UK Edition of SNOMED CT’ (Download)
Packs
Our aim was for users to have to download only one pack for clinical and one for drugs.
• UK Clinical Edition, RF1 (UK Clinical Extension & International Release)
• UK Drug Extension, RF1 (UK Drug Extension only)
• UK Clinical Edition, RF2: Full, Snapshot and Delta (UK Clinical Extension & International Release)
• UK Clinical Edition, RF2: Full only (UK Clinical Extension & International Release)
• UK Clinical Edition, RF2: Snapshot only (UK Clinical Extension & International Release)
• UK Clinical Edition, RF2: Delta only (UK Clinical Extension & International Release)
• UK Drug Extension, RF2: Full, Snapshot and Delta (UK Drug Extension only)
• UK Drug Extension, RF2: Full only (UK Drug Extension only)
• UK Drug Extension, RF2: Snapshot only (UK Drug Extension only)
• UK Drug Extension, RF2: Delta only (UK Drug Extension only)
RF2 file types
The SNOMED CT release has three different release file types:
• Full Release: contains the complete history of every component
• Snapshot Release: contains only the current state of every component
• Delta Release: contains only the additions and changes since the previous release
Full
• The filename contains _Full, e.g. SCT2_Description_Full..
• Contains every version of every component ever released
• Uses a ‘log style’ audit approach to capture change, so it has a row for each change to each component, with an effectiveTime timestamp
• Once issued, a row does not change
• To “inactivate” a component, a new row is created with a timestamp and inactive status
• To change a component, a full new row is added containing the updated fields
• The UK Full includes releases of the UK Extension from 1st April 2004
• The International Release includes all releases from 31st January 2002
Snapshot
• The filename contains _Snapshot, e.g. SCT2_Description_Snapshot..
• A "Snapshot" release contains only the most recent version of every component ever released (both active and inactive components)
• The Snapshot can be derived from the Full
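Deriving a Snapshot from a Full can be sketched as below: for each component id, keep the row with the latest effectiveTime. The same function, given a cut-off date, also recovers the state at an earlier release, which is what the Full's history makes possible. The function name `state_as_of` and the three sample rows are hypothetical; real rows carry more columns.

```python
def state_as_of(full_rows, as_of="99999999"):
    """Return the latest version of each component on or before `as_of`.

    effectiveTime is a YYYYMMDD string, so plain string comparison
    orders the versions chronologically.
    """
    latest = {}
    for row in full_rows:
        if row["effectiveTime"] > as_of:
            continue  # this version was released after the cut-off
        current = latest.get(row["id"])
        if current is None or row["effectiveTime"] > current["effectiveTime"]:
            latest[row["id"]] = row
    return latest


# Illustrative Full table: component A is later inactivated by a new row.
full = [
    {"id": "A", "effectiveTime": "20040131", "active": "1"},
    {"id": "A", "effectiveTime": "20131001", "active": "0"},
    {"id": "B", "effectiveTime": "20100401", "active": "1"},
]

snapshot = state_as_of(full)                # the current state
april_2013 = state_as_of(full, "20130401")  # the state at an earlier release
print(snapshot["A"]["active"], april_2013["A"]["active"])
```

Note that the Snapshot keeps the inactive row for A; filtering on active = 1 is a separate step.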
Delta
• The filename contains _Delta, e.g. SCT2_Description_Delta..
• A "Delta" release contains only component versions created since the last release.
• Each component version represents:
  • a new component, or
  • a change in an existing component, or
  • the inactivation of a component
• The Delta file needs combining with previous release files to give the full terminology
Full + Delta
Full UK Clinical April 2013 + Delta UK Clinical Oct 2013 = Full UK Clinical Oct 2013
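A minimal sketch of the Full + Delta combination follows. It assumes that a component version is uniquely identified by its id together with its effectiveTime, so that loading the same Delta twice does not duplicate rows. The function name `apply_delta` and the sample rows are hypothetical.

```python
def apply_delta(previous_full, delta):
    """Append Delta rows to a Full table, skipping versions already present."""
    seen = {(row["id"], row["effectiveTime"]) for row in previous_full}
    new_full = list(previous_full)
    for row in delta:
        key = (row["id"], row["effectiveTime"])
        if key not in seen:
            seen.add(key)
            new_full.append(row)
    return new_full


# Illustrative data only.
full_april = [
    {"id": "A", "effectiveTime": "20040131", "active": "1"},
]
delta_october = [
    {"id": "A", "effectiveTime": "20131001", "active": "0"},  # inactivation
    {"id": "B", "effectiveTime": "20131001", "active": "1"},  # new component
]

full_october = apply_delta(full_april, delta_october)
print(len(full_october), "rows in the new Full")
```

This sketch does not detect a missed Delta; a loader would also need to confirm that each Delta follows directly on from the release already held.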
Snapshot
April 2013 RF1 corresponds to Snapshot April 2013 RF2, when filtered on active components only
How do you decide which you need?
Depends on your requirements:
• Snapshot is, in effect, the ‘current’ state (similar to RF1)
• Full provides a full history and so is probably required for things like analytics
• Delta: you could take the Full and then add the Delta to your application at each release; our QA includes checks in this respect. You would need to be robust in ensuring you do not miss a Delta
• See section 7.2 of the IHTSDO Technical Implementation Guide (TIG) for more details
UK Baseline
• The UK Baseline is the first RF2 release.
• It consolidates all previous UK releases of SNOMED CT from 1st Jan 2004 in the Full release type.
• It has only the Full and Snapshot release types for the UK Extensions; it does not have a set of Delta files.
• The UK Baseline aligns with the April 2013 UK Release.
• The UK Candidate Baseline (beta version for review) issued previously has been replaced by the UK Baseline. If you have a copy of the Candidate Baseline, it should not be used.
• The UK Baseline will not now change; any issues identified will be dealt with through the RF2 change mechanism.
Main items in the release:
• Core components of SNOMED CT: concepts, descriptions, relationships
• Refsets
• Documentation
• Additional resources
Reference sets (refsets)
• Refsets are the RF2 mechanism for extending information related to core components, for example:
  • Subsets of SNOMED CT
  • Mapping tables, e.g. SNOMED CT to data dictionary codes, SNOMED CT to ICD-10
  • Historical relationships such as Had_VMP
Refsets
• The folder structure reflects the way we have released our RF1 subsets so that users can find the ones they require
• Within a folder, all refsets of the same pattern are released within a single file
• NB. The Resources folder has a file which, for each RF1 subset, identifies its RF2 refset equivalent
Filename formats
• All filenames conform to a standard specification, which makes it possible to produce a load script for each release.
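As a rough illustration of how a load script might use the filename convention, the sketch below splits a filename into its parts. The five-part, underscore-separated layout is an assumption inferred from a filename that appears later in this presentation (xder2_cRefset_NHSRealmDescriptionLanguageSnapshot_GB1000000_20131001.txt); the TIG holds the authoritative specification. The names `RF2_NAME`, `parse_release_filename` and the group names are hypothetical.

```python
import re

# Assumed layout: FileType_ContentType_ContentSubType_Namespace_VersionDate.ext
RF2_NAME = re.compile(
    r"^(?P<file_type>[^_]+)_(?P<content_type>[^_]+)_(?P<content_subtype>[^_]+)"
    r"_(?P<namespace>[^_]+)_(?P<version>\d{8})\.(?P<extension>\w+)$"
)


def parse_release_filename(name):
    """Split a release filename into named parts and detect the release type."""
    match = RF2_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised filename: {name}")
    parts = match.groupdict()
    # The release type is embedded in the content subtype element.
    parts["release_type"] = next(
        (t for t in ("Full", "Snapshot", "Delta") if t in parts["content_subtype"]),
        None,
    )
    return parts


parts = parse_release_filename(
    "xder2_cRefset_NHSRealmDescriptionLanguageSnapshot_GB1000000_20131001.txt"
)
print(parts["release_type"], parts["namespace"], parts["version"])
```

A load script could use the returned parts to route each file to the right table and to check that every file in a pack carries the same version date.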
Documentation
• UKTC aims to supplement, NOT duplicate, IHTSDO documentation. You therefore need the TIG (on the web)!
• Various UKTC materials: http://systems.hscic.gov.uk/data/uktc/snomed/training
• Index to documentation: SNOMED CT Documentation Catalog
Documentation Folder
(Anything that is doc1 is the same as in the RF1 release.)
• Inventory of Documentation
• ‘Current’: provides the nuances of the RF2 Release
• You should look at the release notes for both UK and International for a full overview (NB. the UK Baseline contains only the ‘Current’ document)
UK Resources
• Provides a table linking each subset in the UK Edition RF1 to its equivalent refset in the UK Edition RF2 release
• The names of the refsets are not the same as those of the subsets, because refsets now have components in the metadata hierarchy and so must conform to editorial naming principles
A Concept in SNOMED CT
Concept Id: 56265001
FSN (fully specified name): Heart disease (disorder)
Synonyms:
• Heart Disease
• Cardiopathy
• Disorder of Heart
• Morbus Cordis
• Cardiac Disorder
• Cardiac Diseases
• Heart Diseases
Relationships:
• Is a: cardiac finding
• Is a: disorder of mediastinum
• Is a: disorder of cardiovascular system
• Finding site: heart structure
• Severity
• Episodicity
• Courses
High level Schema for SNOMED CT core
Very simplified: see the TIG for full details
• Concepts
  • 1 concept has many descriptions (min 2): FSN and synonyms
  • 1 concept has many relationships
• Descriptions: FSN, Synonyms
• Relationships: of the form Source | type | destination, each of which is a Concept
What’s where?
• All concepts, with their ID, effectiveTime, status and definitionStatus (primitive or fully defined), are in the concepts file
• ALL descriptions (FSNs and synonyms) are in the descriptions file
• All relationships (IS_A, finding site etc.) are in the relationships file
• NB. All content is captured as a concept. For example, a definitionStatus of ‘primitive’ for a concept in the concepts table is recorded as 900000000000074008. Table 47 in the TIG provides a list of these and their respective conceptIds.
Example – Concepts File
id                  effectiveTime  active  moduleId            definitionStatusId
999001681000000107  20131001       1       999000011000000103  900000000000074008
999001691000000109  20131001       1       999000011000000103  900000000000074008
999001701000000109  20131001       1       999000011000000103  900000000000074008
999001711000000106  20131001       1       999000011000000103  900000000000074008
999001721000000100  20131001       1       999000011000000103  900000000000074008
999000011000000103  20040131       1       999000011000000103  900000000000074008
• id: the concept id
• moduleId: an SctId that indicates the module the concept version is in; 999000011000000103 is the SNOMED CT United Kingdom clinical extension module (core metadata concept)
• definitionStatusId: an SctId; 900000000000074008 means the concept is primitive
Section 5.5.3 gives the details of the file format for each of the core files: concepts, descriptions and relationships
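The concepts rows on this slide can be read with a few lines of code. This is a minimal sketch, assuming the file is tab-delimited with the header row shown; two of the slide's rows are embedded as a string so the example is self-contained, and the `is_primitive` key is a hypothetical convenience field.

```python
import csv
import io

PRIMITIVE = "900000000000074008"  # definitionStatusId for 'primitive' (from the slides)

# Two rows from the example, embedded here in place of a real file.
concepts_tsv = (
    "id\teffectiveTime\tactive\tmoduleId\tdefinitionStatusId\n"
    "999001681000000107\t20131001\t1\t999000011000000103\t900000000000074008\n"
    "999000011000000103\t20040131\t1\t999000011000000103\t900000000000074008\n"
)

reader = csv.DictReader(io.StringIO(concepts_tsv), delimiter="\t", quoting=csv.QUOTE_NONE)
concepts = list(reader)
for concept in concepts:
    concept["is_primitive"] = concept["definitionStatusId"] == PRIMITIVE

# The module is itself a concept: its id appears in the id column too.
concept_ids = {c["id"] for c in concepts}
module_ids = {c["moduleId"] for c in concepts}
print(module_ids <= concept_ids)
```

For a real file, `io.StringIO(concepts_tsv)` would be replaced by an open file handle.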
Example – descriptions file
id                effectiveTime  active  moduleId            conceptId        languageCode  typeId              term                                                    caseSignificanceId
1292371000000116  20100401       1       999000011000000103  582261000000107  en            900000000000003001  Operation on aneurysm of celiac artery NEC (procedure)  900000000000020002
1292381000000119  20100401       1       999000011000000103  582261000000107  en            900000000000013009  Operation on aneurysm of coeliac artery NEC             900000000000020002
1292391000000117  20100401       1       999000011000000103  582271000000100  en            900000000000013009  Other nonteratogenic anomaly NOS                        900000000000020002
1292401000000119  20100401       1       999000011000000103  582271000000100  en            900000000000003001  Other nonteratogenic anomaly NOS (disorder)             900000000000020002
129241000000119   20040131       1       999000011000000103  388681002        en            900000000000013009  Tomato RAST test                                        900000000000017005
1292411000000117  20100401       1       999000011000000103  582281000000103  en            900000000000003001  Other musculoskeletal deformity (disorder)              900000000000020002
1292421000000111  20100401       1       999000011000000103  582281000000103  en            900000000000013009  Other musculoskeletal deformity                         900000000000020002
900000000000003001 | fully specified name |
900000000000020002 | only initial character case insensitive |
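To show how the typeId column separates FSNs from synonyms, the sketch below groups description rows by concept. It uses three rows from the example with only the columns it needs; the function name `terms_by_concept` and the output shape are hypothetical.

```python
from collections import defaultdict

FSN = "900000000000003001"      # fully specified name
SYNONYM = "900000000000013009"  # synonym

# Rows taken from the example above, reduced to the columns used here.
descriptions = [
    {"conceptId": "582261000000107", "typeId": FSN, "active": "1",
     "term": "Operation on aneurysm of celiac artery NEC (procedure)"},
    {"conceptId": "582261000000107", "typeId": SYNONYM, "active": "1",
     "term": "Operation on aneurysm of coeliac artery NEC"},
    {"conceptId": "388681002", "typeId": SYNONYM, "active": "1",
     "term": "Tomato RAST test"},
]


def terms_by_concept(rows):
    """Group active description terms by concept, split by description type."""
    grouped = defaultdict(lambda: {"fsn": [], "synonyms": []})
    for row in rows:
        if row["active"] != "1":
            continue
        if row["typeId"] == FSN:
            grouped[row["conceptId"]]["fsn"].append(row["term"])
        elif row["typeId"] == SYNONYM:
            grouped[row["conceptId"]]["synonyms"].append(row["term"])
    return dict(grouped)


terms = terms_by_concept(descriptions)
print(terms["582261000000107"]["fsn"])
```

This grouping says nothing about which synonym is preferred; that comes from a language refset, not from the descriptions file.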
Preferred Descriptions
As each concept can have more than one description, the UK Edition provides mechanisms to identify UK recommended descriptions:
• The Realm Description Refset (RDR) contains the preferred UK description for clinical use and the UK FSN for each concept
• It also provides synonyms acceptable for UK use
• This enables the RDR to be placed in a look-up table to quickly identify either the FSN or the preferred UK term for a concept
• The UK currently also provides a language refset which holds all the current RF1 preferred terms from the UK RF1 Edition
• The intention is to evolve the RDR (also a subset in RF1) as the place to obtain the UK FSN and the UK recommended terms for both RF1 and RF2.
A quick peek into the data (1)
• All fields where data type = SCTID are primarily machine-readable
• For many of us, early experimentation and familiarisation will benefit from these fields being human-readable. So...
• This section walks through and applies a ‘simple look-up’ to find the human-readable FSN and/or PT for any conceptId...
A quick peek into the data (2)
• You will need:
  • Any ConceptId of interest
    • Could be just one Id
    • Could be from several tables
    • Could be all of them
  • The Descriptions table(s)
  • The NHS Realm Description RefSet table(s)
• Here using Snapshots
A quick peek into the data (3) Description tables: Provide a link from each conceptId to its associated descriptions:
A quick peek into the data (4)
• Realm Description Refset table: used to identify the appropriate description types
• There are two parts to the NHS RDR; together they provide information on FSN and term preferences for all UK content:
  • xder2_cRefset_NHSRealmDescriptionLanguageSnapshot_GB1000000_20131001.txt
  • xder2_cRefset_NHSRealmDescriptionLanguageSnapshot_GB1000001_20131001.txt
A quick peek into the data (5)
• To identify the preferred term:
• “For any conceptId, give me the active preferred term as specified by the NHS RDR”
SELECT Descriptions.conceptId, Descriptions.term
FROM Descriptions, NHSRDR
WHERE Descriptions.id = NHSRDR.referencedComponentId
  AND NHSRDR.acceptabilityId = '900000000000548007'  -- preferred
  AND Descriptions.typeId = '900000000000013009'     -- synonym
  AND Descriptions.conceptId = 'any ConceptId'       -- replace with the conceptId of interest
  AND Descriptions.active = 1
  AND NHSRDR.active = 1;
A quick peek into the data (6)
• To identify the FSN:
• “For any conceptId, give me the active FSN term as specified by the NHS RDR”
SELECT Descriptions.conceptId, Descriptions.term
FROM Descriptions, NHSRDR
WHERE Descriptions.id = NHSRDR.referencedComponentId
  AND NHSRDR.acceptabilityId = '900000000000548007'  -- preferred
  AND Descriptions.typeId = '900000000000003001'     -- fully specified name
  AND Descriptions.conceptId = 'any ConceptId'       -- replace with the conceptId of interest
  AND Descriptions.active = 1
  AND NHSRDR.active = 1;
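The two look-ups can be tried out without a full database load. The sketch below runs the same join in an in-memory SQLite database, assuming only the columns the queries use. The two description rows come from the descriptions example earlier in this presentation; the NHSRDR rows marking them as preferred are invented for illustration, and the names `LOOKUP` and `lookup` are hypothetical.

```python
import sqlite3

PREFERRED = "900000000000548007"
FSN = "900000000000003001"
SYNONYM = "900000000000013009"

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Descriptions (id TEXT, active INTEGER, conceptId TEXT, typeId TEXT, term TEXT);
    CREATE TABLE NHSRDR (referencedComponentId TEXT, active INTEGER, acceptabilityId TEXT);
    """
)
conn.executemany(
    "INSERT INTO Descriptions VALUES (?, ?, ?, ?, ?)",
    [
        ("1292371000000116", 1, "582261000000107", FSN,
         "Operation on aneurysm of celiac artery NEC (procedure)"),
        ("1292381000000119", 1, "582261000000107", SYNONYM,
         "Operation on aneurysm of coeliac artery NEC"),
    ],
)
# Assumed refset membership, for illustration only.
conn.executemany(
    "INSERT INTO NHSRDR VALUES (?, ?, ?)",
    [("1292371000000116", 1, PREFERRED), ("1292381000000119", 1, PREFERRED)],
)

LOOKUP = """
    SELECT d.conceptId, d.term
    FROM Descriptions d
    JOIN NHSRDR r ON d.id = r.referencedComponentId
    WHERE r.acceptabilityId = ?
      AND d.typeId = ?
      AND d.conceptId = ?
      AND d.active = 1
      AND r.active = 1
"""


def lookup(concept_id, type_id):
    """Return the active preferred terms of the given type for one concept."""
    rows = conn.execute(LOOKUP, (PREFERRED, type_id, concept_id)).fetchall()
    return [term for _concept_id, term in rows]


print(lookup("582261000000107", SYNONYM))
print(lookup("582261000000107", FSN))
```

Passing the conceptId and typeId as parameters, rather than editing the SQL text, lets one query serve both the preferred-term and the FSN look-up.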
A quick peek into the data (7)
• Application
• Using additional tables or views based on these joins for all concepts should make it easier to inspect the contents of any table.
• Create view, e.g.:
CREATE VIEW "PT" AS SELECT Concepts.id, Descriptions.term FROM Concepts, Descriptions, NHSRDR WHERE...
• As required, join ‘PT.id’ to each SCTID type field in:
  • RefSets
  • Relationships
  • etc.
A quick peek into the data (8)
• For Relationships:
SELECT r.sourceId, pt1.term, r.typeId, pt2.term, r.destinationId, pt3.term
FROM Relationships r, PT pt1, PT pt2, PT pt3
WHERE r.sourceId = pt1.id
  AND r.typeId = pt2.id
  AND r.destinationId = pt3.id
  AND r.active = 1;
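The relationships query can be exercised end to end on a toy database, as sketched below. Everything here is illustrative: the identifiers (c-heart, c-isa, c-cvs, d1 to d3) are hypothetical stand-ins for real SCTIDs, and the view's WHERE clause is one possible completion, assumed to mirror the preferred-term look-up, rather than the presenters' actual definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Concepts (id TEXT);
    CREATE TABLE Descriptions (id TEXT, active INTEGER, conceptId TEXT, typeId TEXT, term TEXT);
    CREATE TABLE NHSRDR (referencedComponentId TEXT, active INTEGER, acceptabilityId TEXT);
    CREATE TABLE Relationships (sourceId TEXT, typeId TEXT, destinationId TEXT, active INTEGER);

    -- Hypothetical identifiers standing in for real SCTIDs.
    INSERT INTO Concepts VALUES ('c-heart'), ('c-isa'), ('c-cvs');
    INSERT INTO Descriptions VALUES
        ('d1', 1, 'c-heart', '900000000000013009', 'Heart disease'),
        ('d2', 1, 'c-isa',   '900000000000013009', 'Is a'),
        ('d3', 1, 'c-cvs',   '900000000000013009', 'Disorder of cardiovascular system');
    INSERT INTO NHSRDR VALUES
        ('d1', 1, '900000000000548007'),
        ('d2', 1, '900000000000548007'),
        ('d3', 1, '900000000000548007');
    INSERT INTO Relationships VALUES ('c-heart', 'c-isa', 'c-cvs', 1);

    -- Assumed view definition: one active preferred synonym per concept.
    CREATE VIEW PT AS
    SELECT c.id AS id, d.term AS term
    FROM Concepts c
    JOIN Descriptions d ON d.conceptId = c.id
    JOIN NHSRDR n ON n.referencedComponentId = d.id
    WHERE n.acceptabilityId = '900000000000548007'
      AND d.typeId = '900000000000013009'
      AND d.active = 1 AND n.active = 1;
    """
)

readable = conn.execute(
    """
    SELECT r.sourceId, pt1.term, r.typeId, pt2.term, r.destinationId, pt3.term
    FROM Relationships r, PT pt1, PT pt2, PT pt3
    WHERE r.sourceId = pt1.id AND r.typeId = pt2.id
      AND r.destinationId = pt3.id AND r.active = 1
    """
).fetchall()

for row in readable:
    print(row)
```

Because the query uses inner joins, a relationship silently drops out of the result if any of its three concepts has no row in PT, which is worth checking when row counts look low.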
A quick peek into the data (9)
• For RefSets (e.g. the ‘Family history simple reference set’, refsetId = ‘999000771000000106’):
SELECT s.refsetId, s.referencedComponentId, PT.term
FROM simplerefset s, PT
WHERE s.referencedComponentId = PT.id
  AND s.refsetId = '999000771000000106'
  AND s.active = 1;
Metadata Refsets
• Module Dependency Refset
  • Provides data on which releases a particular release requires
• Refset Descriptor
  • Provides details on the pattern of a particular refset
• Refset Metadata language
  • Preferred terms for the metadata concepts
Some of the RF2 characteristics
• Concepts can now change who manages them without changing conceptId
  • Concept origin (namespace within the sctId)
  • Managing organisation (moduleId)
• Metadata: namespaces, statuses etc. are all concepts
• Refsets: a flexible way of providing additional characteristics on a concept
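To make "namespace within the sctId" concrete, the sketch below splits an identifier into its parts. It assumes the SCTID layout documented in the TIG, which these slides do not spell out: the last digit is a check digit, the two digits before it are a partition identifier, and when the partition starts with 1 the seven digits before that are the namespace. The function name `parse_sctid` is hypothetical, and the check digit is not validated.

```python
# Second partition digit -> component kind (assumed from the TIG layout).
COMPONENT_KINDS = {"0": "concept", "1": "description", "2": "relationship"}


def parse_sctid(sctid):
    """Split an SCTID string into item, namespace, partition and check digit."""
    check_digit = sctid[-1]
    partition = sctid[-3:-1]
    long_format = partition[0] == "1"  # long format carries a 7-digit namespace
    namespace = sctid[-10:-3] if long_format else None
    item = sctid[:-10] if long_format else sctid[:-3]
    return {
        "item": item,
        "namespace": namespace,
        "partition": partition,
        "kind": COMPONENT_KINDS.get(partition[1]),
        "check_digit": check_digit,
    }


uk_concept = parse_sctid("999001681000000107")  # from the concepts example
core_concept = parse_sctid("56265001")          # the heart disease concept
uk_description = parse_sctid("1292371000000116")
print(uk_concept["namespace"], core_concept["namespace"], uk_description["kind"])
```

The namespace stays fixed for the life of the identifier, while moduleId is an ordinary column that can change between versions; that is how a concept can move to a different managing organisation without a new conceptId.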
SNOMED CT Documentation
• UKTC Overview on RF2: http://www.infostandards.org/area-of-interest/clinical-terminologies/snomed-ct/uk-snomed-ct-in-release-format-2-an-overview/
• IHTSDO Technical Implementation Guide (TIG): http://ihtsdo.org/fileadmin/user_upload/doc/