SAS Global Forum 2009 • MedDRA data as SAS formats • Jim Groeneveld, OCS Biometric Support, Leiden, the Netherlands • SGF 170-2009
MedDRA data as SAS formats • AGENDA / CONTENTS • Purpose: Facilitate the use of MedDRA tables in SAS • (Dis)advantages of permanent formats • What the user has to know and do • How the SAS formats are built
MedDRA data as SAS formats • What is MedDRA? • The Medical Dictionary for Regulatory Activities (MedDRA) Terminology is the international medical terminology developed under the auspices of the International Conference on Harmonisation (ICH) of Technical Requirements for Registration of Pharmaceuticals for Human Use. • It is a hierarchical classification of terms at various levels, a coding system.
MedDRA data as SAS formats • What are MedDRA tables? • MedDRA tables consist of data files in which medical terminology is coded with 8-digit numbers at various hierarchical levels: LLT, PT, HLT, HLGT and SOC (PSOC). • Relations between terminology codes at adjacent levels are indicated in the tables. • The relations upwards can be ambiguous (multiple) or unambiguous (unique).
MedDRA data as SAS formats • PURPOSE: Facilitate the use of MedDRA tables in SAS {1} • Traditionally, MedDRA data tables are merged with user data by LLT code, which involves: • MedDRA ASCII data files; • MedDRA SAS datasets; • Many (different, dedicated) merges; • Quite a lot of merging program code; • Large resulting SAS datasets.
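For comparison, the traditional merge approach might look like the following minimal sketch. The dataset names (work.ae, work.llt), the variable names, and all codes and term names are placeholders assumed for illustration; they are not taken from the MedDRA distribution or from the paper.

```sas
/* Hypothetical user data: one record per adverse event, keyed by LLT code */
data work.ae;
  input subject llt_code;
  datalines;
1 10000001
2 10000002
;
run;

/* Hypothetical extract of the LLT table (placeholder codes and names) */
data work.llt;
  input llt_code pt_code llt_name :$40.;
  datalines;
10000001 10000011 TermA
10000002 10000012 TermB
;
run;

/* Both inputs must be sorted by the merge key */
proc sort data=work.ae;  by llt_code; run;
proc sort data=work.llt; by llt_code; run;

/* The merge adds llt_name and pt_code as stored variables */
data work.ae_coded;
  merge work.ae (in=in_ae) work.llt;
  by llt_code;
  if in_ae;  /* keep only records that occur in the user data */
run;
```

Each further level (PT, HLT, HLGT, SOC) would need a similar sort and merge, which is the overhead the format approach aims to avoid.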
MedDRA data as SAS formats • PURPOSE: Facilitate the use of MedDRA tables in SAS {2} • Instead of merging SAS datasets, create the MedDRA formats once and use them repeatedly, wherever and whenever needed: • To fetch the level names (descriptions) from the level codes; • To fetch the higher-level codes from the lower-level codes.
MedDRA data as SAS formats • Advantages of permanent formats • No merges needed: create or update the formats once and use them many times; • Smaller datasets: formatted values are not stored in the datasets; • Centrally maintained formats: recreate the format catalog once when MedDRA is updated; • Language support: create language-specific format catalogs as desired.
MedDRA data as SAS formats • Disadvantage of MedDRA SAS formats • Many lower-level codes (PT, HLT, HLGT) have multiple linked higher-level codes. • SAS formats yield only one formatted value for an unformatted value. • SAS supports the MULTILABEL format option, which can only be applied with PROCs MEANS and TABULATE. • Otherwise, conventional dataset merging has to be applied. • This will be discussed in detail later.
MedDRA data as SAS formats • What the user has to know • The formatting system discussed here, which creates the MedDRA format catalog, consists of several SAS programs that are run from MedDRA.sas. • MedDRA.sas defines a few macro variables containing the local filename and directory settings for these programs, for the MedDRA tables, and for the name and location of the MedDRA format catalog.
MedDRA data as SAS formats • What the user has to do • Given a copy of the MedDRA data tables: • Put the SAS programs that create the format catalog into one directory; • Adapt the settings in MedDRA.sas; • Run MedDRA.sas, which includes the other programs; • Move the generated format catalog to the FMTSEARCH directory and use it. (Formats list and examples later.)
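A minimal sketch of making a generated catalog available in a SAS session. The libref meddra, the directory path and the catalog name FORMATS are assumptions for illustration; the actual names are whatever has been set in MedDRA.sas.

```sas
/* Hypothetical directory that holds the generated MedDRA format catalog */
libname meddra 'C:\meddra\formats';

/* Add the (assumed) catalog MEDDRA.FORMATS to the format search path */
options fmtsearch=(meddra.formats);
```

Once the catalog is on the search path, the formats can be used in PUT functions and FORMAT statements like any other user-defined format.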
MedDRA data as SAS formats • SAS algorithms building the formats {1} • The formatting system discussed here starts from the MedDRA SAS datasets, whether already present or created from the MedDRA ASCII data using the SAS programs llt.sas, pt.sas and mdhier.sas. • The LLT data contain the LLT codes, their names/descriptions and their uniquely associated PT codes.
MedDRA data as SAS formats • SAS algorithms building the formats {2} • The PT data contain the PT codes, their names/descriptions and their uniquely associated PSOC (primary SOC) codes. • The MDHIER data contain the PT, HLT, HLGT and SOC codes, their names/descriptions and their multiple, non-uniquely associated higher-level codes. • The MDHIER data often, but not always, contain multiple records per PT code.
MedDRA data as SAS formats • SAS algorithms building the formats {3} • The SAS program llt_fmt.sas creates the formats for the LLT names (llt_fmt) and the unique LLT to PT association / translation (llt_pt) from the LLT data. • The SAS program pt_psoc.sas creates the format for the unique PT to PSOC association / translation (pt_psoc) from the PT data. • The MDHIER data has ‘buggy’ PSOC data.
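The general technique of building such formats from a dataset with PROC FORMAT and CNTLIN can be sketched as follows. This is not the code of llt_fmt.sas; the dataset work.llt, its variable names, and the codes and names in it are placeholders assumed for illustration.

```sas
/* Hypothetical extract of the LLT dataset (placeholder codes and names) */
data work.llt;
  input llt_code pt_code llt_name :$40.;
  datalines;
10000001 10000011 TermA
10000002 10000012 TermB
;
run;

/* Control dataset for the name format: LLT code -> LLT name */
data work.cntl_llt_fmt;
  set work.llt (rename=(llt_code=start llt_name=label));
  retain fmtname 'llt_fmt' type 'N';  /* numeric format */
  keep fmtname type start label;
run;

/* Control dataset for the translation format: LLT code -> PT code as text */
data work.cntl_llt_pt;
  set work.llt (rename=(llt_code=start));
  retain fmtname 'llt_pt' type 'N';
  length label $8;
  label = put(pt_code, 8.);
  keep fmtname type start label;
run;

/* Create both formats (here in the default catalog WORK.FORMATS) */
proc format cntlin=work.cntl_llt_fmt; run;
proc format cntlin=work.cntl_llt_pt;  run;

/* Use the formats: fetch the name and the PT code of an LLT code */
data _null_;
  llt_code = 10000001;
  llt_name = put(llt_code, llt_fmt.);
  pt_code  = input(put(llt_code, llt_pt.), 8.);
  put llt_name= pt_code=;
run;
```

Because the formatted value of a translation format is text, it has to be converted back to a number with INPUT before it can be fed to the next numeric format.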
MedDRA data as SAS formats • SAS algorithms building the formats {4} • SAS format types are (since version 8): • Standard formats (unique correspondence); • Multi-label formats (multiple routes to a higher level). • The MULTILABEL type is only supported by the PROCs MEANS and TABULATE. • In all other PROCs and in the DATA step only the first defined formatted value is returned from a multi-label format.
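A minimal sketch of a multi-label format and of the difference between a multilabel-aware procedure and the DATA step. The format is defined here with a VALUE statement rather than through CNTLIN, and all codes and dataset names are placeholders assumed for illustration.

```sas
/* Hypothetical multi-label translation format: the first PT code */
/* is linked to two SOC codes, the second PT code to one          */
proc format;
  value pt_soc (multilabel)
    10000011 = '10000021'
    10000011 = '10000022'
    10000012 = '10000022';
run;

/* Hypothetical user data with PT codes */
data work.ae;
  input subject pt_code;
  datalines;
1 10000011
2 10000012
;
run;

/* The MLF option lets PROC MEANS use all labels of the format */
proc means data=work.ae n;
  class pt_code / mlf;
  format pt_code pt_soc.;
  var subject;
run;

/* In the DATA step only a single label is returned per PT code */
data work.ae_soc;
  set work.ae;
  soc_code = input(put(pt_code, pt_soc.), 8.);
run;
```

In the PROC MEANS output the first PT code should be counted under both SOC labels, whereas the DATA step assignment yields one SOC code per record.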
MedDRA data as SAS formats • SAS algorithms building the formats {5} • The SAS program Low_High.sas creates all other (MULTILABEL) formats from the MDHIER data. • These are the name formats pt_fmt, hlt_fmt, hlgt_fmt and soc_fmt, • and the (MULTILABEL) translation formats pt_hlt, pt_hlgt, pt_soc, hlt_hlgt, hlt_soc and hlgt_soc.
MedDRA data as SAS formats • SAS algorithms building the formats {6} • It would be preferable if the single (first) formatted value that is usually returned were a value on the route from the PT to the PSOC (primary SOC). That would mean that at least the most relevant translation is returned. • However, it appears that the route to a PSOC is not always the first one specified per PT in the MDHIER data.
MedDRA data as SAS formats • SAS algorithms building the formats {7} • So the algorithm has been extended to force the "primary routes" to be the first-defined formatted values. • The word route refers to the associations between successive, adjacent level values that correspond to a defined relation between a PT and its primary SOC or a secondary SOC.
MedDRA data as SAS formats • SAS algorithms building the formats {8} • PSOC problems in the MDHIER data: • In many hundreds of cases the PSOC indicator was missing for a PT (no 'Y' at all); • In many hundreds of cases the PSOC value was missing or incomplete (digits missing); • In 4 cases an HLGT code has more than one PSOC, due to routes from different PTs.
MedDRA data as SAS formats • SAS algorithms building the formats {9} • Solution of the problems with PSOCs: • The "primary route" from a PT to a PSOC was taken from the PT data; • The correct PSOC value for a PT was uniquely taken from the PT data; • More than one PSOC per HLGT was regarded as valid; which of these routes comes first is arbitrary.
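The idea of taking the primary route from the PT data and putting it first can be sketched as follows. This is not the code of Low_High.sas; the dataset names, the variable names (including psoc_code) and the codes are placeholders assumed for illustration, and only the PT-to-SOC level is shown.

```sas
/* Hypothetical PT data: exactly one primary SOC per PT */
data work.pt;
  input pt_code psoc_code;
  datalines;
10000011 10000022
;
run;

/* Hypothetical MDHIER extract: the primary route is listed second here */
data work.mdhier;
  input pt_code soc_code;
  datalines;
10000011 10000021
10000011 10000022
;
run;

/* Flag the route that leads to the PSOC according to the PT data */
/* and order the routes so that the flagged one comes first per PT */
proc sql;
  create table work.pt_soc_routes as
  select h.pt_code,
         h.soc_code,
         (h.soc_code = p.psoc_code) as is_primary
  from work.mdhier as h
       left join work.pt as p
       on h.pt_code = p.pt_code
  order by h.pt_code, is_primary desc, h.soc_code;
quit;
```

The ordered rows could then be written to a control dataset for PROC FORMAT; the sketch stops before that step.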
MedDRA data as SAS formats • Summary of the SAS MedDRA formats • llt_fmt: llt_code and name (llt_name) • llt_pt: llt to pt translation (unique) • pt_fmt: pt_code and name (pt_name) • pt_hlt: pt to hlt translation (multilabel) • pt_hlgt: pt to hlgt translation (multilabel) • pt_soc: pt to soc translation (multilabel) • pt_psoc: pt to psoc translation (unique) • hlt_fmt: hlt_code and name (hlt_name) • hlt_hlgt: hlt to hlgt translation (multilabel) • hlt_soc: hlt to soc translation (multilabel) • hlgt_fmt: hlgt_code and name (hlgt_name) • hlgt_soc: hlgt to soc translation (multilabel) • soc_fmt: soc_code and name (soc_name)
MedDRA data as SAS formats • SAS variable type of MedDRA codes • MedDRA codes are not stored as strings of 8 digits in text fields, but as 8-digit (8-byte) numeric values (above 10,000,000), even though the numbers themselves carry no meaning. • For such meaningless, nominal identifiers I would have preferred text fields. • The codes are not meant to be used in arithmetic expressions. • But given the MedDRA system, all SAS formats are numeric formats with either names/descriptions or other (numeric) codes as formatted values.
MedDRA data as SAS formats • What the user does (not) need to know • No need to know the programming details of the format-generating code (advanced PROC FORMAT features like CNTLIN and MULTILABEL). The full code is presented in the paper and at home.hccnet.nl/jim.groeneveld/meddrafmt • Need to know how to apply formats in the DATA step and in procedures in order to be able to use the generated formats (PUT and FORMAT statements).
MedDRA data as SAS formats • Examples of use of MedDRA formats {1} • To fetch an HLGT translation of a PT code: • hlgt_pt = INPUT (PUT (pt_code, pt_hlgt.), 8.); • or: • hlgt_pt = %NumToNum (pt_code, pt_hlgt.); • where: • %MACRO NumToNum (Value, Format); INPUT (PUT (&Value, &Format), 8.) %MEND NumToNum;
MedDRA data as SAS formats • Examples of use of MedDRA formats {2} • To fetch a name of an HLT code: • hlt_name = PUT (hlt_code, hlt_fmt.); • To fetch a SOC translation of an LLT code: • soc_llt = %NumToNum ( %NumToNum ( llt_code, llt_pt.), pt_soc. ) ; • To fetch an HLGT name of a PT code: • hlgt_name = PUT (%NumToNum ( pt_code, pt_hlgt. ), hlgt_fmt.);
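The examples above can be combined into a small test program, sketched below. The three formats are defined here with placeholder codes and a placeholder name purely to make the sketch self-contained; in practice they would come from the generated MedDRA format catalog.

```sas
/* Hypothetical stand-ins for three of the generated formats */
proc format;
  value llt_pt  10000001 = '10000011';
  value pt_soc  10000011 = '10000022';
  value soc_fmt 10000022 = 'Placeholder SOC name';
run;

/* Macro from the slides: numeric code -> formatted text -> numeric code */
%macro NumToNum (Value, Format);
  input(put(&Value, &Format), 8.)
%mend NumToNum;

data work.demo;
  llt_code = 10000001;
  /* LLT -> PT -> SOC by chaining two translation formats */
  soc_llt  = %NumToNum(%NumToNum(llt_code, llt_pt.), pt_soc.);
  /* SOC code -> SOC name */
  soc_name = put(soc_llt, soc_fmt.);
run;
```

The macro deliberately ends without a semicolon, so that its call can be used inside an expression, as in the nested call above.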
MedDRA data as SAS formats • Supported SAS versions and OSs • SAS vs 8.x / MS Windows 2000: MedDRA vs 5 & 6 (a prototype system); • SAS vs 9.x / MS Windows XP: MedDRA vs 9, 10 & 11; • SAS vs X.y / any operating system: expected: any future MedDRA version. • Any current and future human language releases of the MedDRA tables, if the data structure is unchanged.
Q&A: MedDRA data as SAS formats • QUESTIONS & ANSWERS • SASquestions@ocs-consulting.com • Jim.Groeneveld@ocs-biometricsupport.com • home.hccnet.nl/jim.groeneveld/meddrafmt
Q&A: MedDRA data as SAS formats • Other classes of codes • Associations between the various levels of other (classes of) coding systems, e.g. ICD-9-CM or WHO-ART (included in the MedDRA distribution), could similarly be created as SAS formats. • This has not (yet) been done. • As with MedDRA, the SAS code would essentially need to be developed only once.