A SAS based Solution for define.xml Monika Kawohl Statistical Programming Accovion. Presentation Topic – define.xml . Excerpt from the CDISC Draft Metadata Submission Guidelines define.xml Sample. Presentation Outline . Electronic Submission Context & Purpose XML Basics
A SAS based Solution for define.xml • Monika Kawohl • Statistical Programming • Accovion
Presentation Topic – define.xml Excerpt from the CDISC Draft Metadata Submission Guidelines define.xml Sample
Presentation Outline • Electronic Submission Context & Purpose • XML Basics • define.xml Sections/Elements • define.xml Generation Process • Expected define.xml Enhancements • Summary & Conclusions
Electronic Submission Context & Purpose • Define Document Mandatory when Submitting Data to FDA • Purpose • Describe Structure and Contents of Data • Facilitate Review via Standardized Metadata Format • Aim: More Efficient Overall Review Process • define.xml Preferred Data Definition Format for SDTM • define.xml Human- and Machine-Readable • Benefit of define.xml not Restricted to Submissions
define.xml Documentation/Samples • CDISC Case Report Tabulation Data Definition Specification (define.xml), Version 1.0, February 9, 2005 • Sample define.xml Included • CDISC Metadata Submission Guidelines, Appendix to the SDTM IG V3.1.1, Draft Version 0.9, July 25, 2007 • Sample define.xml Included as Part of Sample Submission • CDISC SDTM/ADaM Pilot (Pilot 1), January 31, 2008 • Mock Submission Package Available for CDISC Members
XML Basics • Schema (Extension: .XSD) • Declaration of Elements and their Attributes • Prerequisite for Machine-Readability • XML File (Extension: .XML) • Data and Metadata in Machine-Readable Format • Usage of Elements and Attributes as Defined in Schema • Style Sheet (Extension: .XSL) • Definition of Layout in Browser Tool for Human-Readability • Usage of Elements and Attributes as Defined in Schema
Interaction of XML, XSL, XSD • Callout: Style Sheet Reference (the xml-stylesheet line)
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="define1-0-0.xsl"?>
<ODM ...
  <ItemGroupDef OID="DM" Name="DM" Repeating="No" IsReferenceData="No" Purpose="Tabulation"
      def:Label="Demographics"
      def:Structure="One record per subject"
      def:DomainKeys="STUDYID, USUBJID"
      def:Class="Special Purpose"
      def:ArchiveLocationID="Location.DM">
  ...
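The link between the XML file and its style sheet can be checked programmatically. The following is a minimal sketch in Python (chosen only for illustration; the presented solution is SAS based). The function name `stylesheet_href` is hypothetical, and the `<ODM/>` element is a stand-in for the ODM content elided on the slide.

```python
import re
from xml.dom import minidom

SAMPLE = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    b'<?xml-stylesheet type="text/xsl" href="define1-0-0.xsl"?>\n'
    b'<ODM/>'  # stand-in for the elided ODM content
)


def stylesheet_href(xml_bytes):
    """Return the href of the first xml-stylesheet instruction, or None."""
    doc = minidom.parseString(xml_bytes)
    # Processing instructions placed before the root element are
    # direct children of the document node.
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            match = re.search(r'href="([^"]*)"', node.data)
            if match:
                return match.group(1)
    return None
```

A browser performs the same lookup before applying the XSL file, so a missing or misspelled `href` results in the raw XML being displayed instead of the formatted view.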
define.xml Sections/Elements • Data Metadata (TOC) • Variable Metadata • Variable Value Level Metadata • Computational Algorithms • Controlled Terminology/Code Lists • Annotated CRF • Optional: Supplemental Data Definition Document • Navigation via Bookmarks and Hyperlinks
Data Metadata – XML Code
<ItemGroupDef OID="LB" Name="LB" Repeating="Yes" IsReferenceData="No" Purpose="Tabulation"
    def:Label="Laboratory Tests"
    def:Structure="One record per lab test per time point per visit per subject"
    def:DomainKeys="STUDYID,USUBJID,LBTESTCD,VISITNUM,LBTPTNUM"
    def:Class="Findings"
    def:ArchiveLocationID="Location.LB">
  ...
  <def:leaf ID="Location.LB" xlink:href="LB.xpt">
    <def:title>lb.xpt</def:title>
  </def:leaf>
</ItemGroupDef>
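A dataset-level entry like the one above could be generated from a handful of attributes. This is a hedged sketch in Python, not the SAS code behind the presentation. The helper `build_item_group` is hypothetical, the three namespace URIs are assumptions (the slides do not show them), and the transport file name is assumed to be the lower-cased domain name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; not shown on the slides.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.2"
DEF_NS = "http://www.cdisc.org/ns/def/v1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

ET.register_namespace("", ODM_NS)
ET.register_namespace("def", DEF_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_item_group(domain, label, structure, keys, dataset_class, repeating):
    """Return an ItemGroupDef element describing one dataset."""
    location = f"Location.{domain}"
    xpt_name = f"{domain.lower()}.xpt"  # assumed naming convention
    group = ET.Element(f"{{{ODM_NS}}}ItemGroupDef", {
        "OID": domain,
        "Name": domain,
        "Repeating": "Yes" if repeating else "No",
        "IsReferenceData": "No",
        "Purpose": "Tabulation",
        f"{{{DEF_NS}}}Label": label,
        f"{{{DEF_NS}}}Structure": structure,
        f"{{{DEF_NS}}}DomainKeys": keys,
        f"{{{DEF_NS}}}Class": dataset_class,
        f"{{{DEF_NS}}}ArchiveLocationID": location,
    })
    # The leaf ID must match def:ArchiveLocationID so the link resolves.
    leaf = ET.SubElement(group, f"{{{DEF_NS}}}leaf", {
        "ID": location,
        f"{{{XLINK_NS}}}href": xpt_name,
    })
    ET.SubElement(leaf, f"{{{DEF_NS}}}title").text = xpt_name
    return group
```

Calling `ET.tostring(build_item_group(...), encoding="unicode")` serializes the element with the `def:` and `xlink:` prefixes registered above.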
Variable Metadata – XML Code
<ItemGroupDef OID="LB" ...
  <ItemRef ItemOID="LB.LBTESTCD" OrderNumber="5" Mandatory="Yes" Role="Topic"/>
  ...
  <ItemRef ItemOID="LB.LBBLFL" OrderNumber="22" Mandatory="No" Role="Record Qualifier"/>
</ItemGroupDef>
...
<ItemDef OID="LB.LBTESTCD" Name="LBTESTCD" DataType="text" Length="8"
    Origin="CRF" Comment="CRF Pages 5, 10, 15, 20"
    def:Label="LAB Test or Examination Short Name">
  <def:ValueListRef ValueListOID="ValueList.LB.LBTESTCD"/>
</ItemDef>
...
<ItemDef OID="LB.LBBLFL" Name="LBBLFL" DataType="text" Length="1"
    Origin="Derived" def:Label="Baseline Flag"
    def:ComputationMethodOID="COMPMETHOD.LBBLFL">
  <CodeListRef CodeListOID="YF"/>
</ItemDef>
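Each ItemRef only points to an ItemDef via its ItemOID; the variable attributes live in the ItemDef. The sketch below (Python, for illustration only) resolves those references and lists the variables in OrderNumber order. The function name `variables_in_order`, the trimmed `SAMPLE` document, and the namespace URIs are assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; not shown on the slides.
ODM = "http://www.cdisc.org/ns/odm/v1.2"
DEF = "http://www.cdisc.org/ns/def/v1.0"
NS = {"odm": ODM, "def": DEF}

# Trimmed sample; the ItemRefs are deliberately listed out of order.
SAMPLE = f"""<MetaDataVersion xmlns="{ODM}" xmlns:def="{DEF}">
  <ItemGroupDef OID="LB" Name="LB">
    <ItemRef ItemOID="LB.LBBLFL" OrderNumber="22" Mandatory="No"/>
    <ItemRef ItemOID="LB.LBTESTCD" OrderNumber="5" Mandatory="Yes"/>
  </ItemGroupDef>
  <ItemDef OID="LB.LBTESTCD" Name="LBTESTCD" DataType="text" Length="8"
           def:Label="LAB Test or Examination Short Name"/>
  <ItemDef OID="LB.LBBLFL" Name="LBBLFL" DataType="text" Length="1"
           def:Label="Baseline Flag"/>
</MetaDataVersion>"""


def variables_in_order(root, domain):
    """Return (name, label, mandatory) per variable, sorted by OrderNumber."""
    item_defs = {d.get("OID"): d for d in root.iter(f"{{{ODM}}}ItemDef")}
    group = root.find(f".//odm:ItemGroupDef[@OID='{domain}']", NS)
    refs = sorted(group.findall("odm:ItemRef", NS),
                  key=lambda ref: int(ref.get("OrderNumber")))
    rows = []
    for ref in refs:
        item = item_defs[ref.get("ItemOID")]  # KeyError means a broken link
        rows.append((item.get("Name"),
                     item.get(f"{{{DEF}}}Label"),
                     ref.get("Mandatory")))
    return rows
```

Usage: `variables_in_order(ET.fromstring(SAMPLE), "LB")` yields one tuple per variable of the LB dataset.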
Variable Value Level Metadata – XML Code
<def:ValueListDef OID="ValueList.LB.LBTESTCD">
  <ItemRef ItemOID="LB.LBTESTCD.ALB" OrderNumber="1" Mandatory="No"/>
  ...
</def:ValueListDef>
...
<ItemDef OID="LB.LBTESTCD.ALB" Name="ALB" DataType="float" Length="8" SignificantDigits="1"
    Origin="CRF" Comment="CRF Pages 5, 15"
    def:Label="Albumin" def:DisplayFormat="5.1"/>
Computational Algorithms • Complex Derivations • Derivations Used More than Once
Computational Algorithms – XML Code
<def:ComputationMethod OID="COMPMETHOD.LBBLFL">
  Derive mean of pre-treatment measurements. Create new record with result and flag LBBLFL='Y'
</def:ComputationMethod>
...
<ItemDef OID="LB.LBBLFL" Name="LBBLFL" DataType="text" Length="1"
    Origin="Derived" def:Label="Baseline Flag"
    def:ComputationMethodOID="COMPMETHOD.LBBLFL">
  <CodeListRef CodeListOID="YF"/>
</ItemDef>
• Masking of Special Characters • Ampersand, Apostrophe, Quote, Less Than, Greater Than • e.g., &apos; for '
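The masking of the five special characters can be sketched as follows. This uses Python's standard library for illustration and is not the SAS logic from the presentation; the helper names `mask` and `unmask` are hypothetical.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; quotes are added explicitly.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}
QUOTE_CHARS = {entity: char for char, entity in QUOTE_ENTITIES.items()}


def mask(text):
    """Replace the five XML special characters with entity references."""
    return escape(text, QUOTE_ENTITIES)


def unmask(text):
    """Reverse mask(): turn entity references back into characters."""
    return unescape(text, QUOTE_CHARS)
```

The ampersand is replaced first, so entity references produced for the other characters are not masked a second time.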
Controlled Terminology/Code Lists • External Dictionary References incl. Versions • e.g., MEDDRA, WHODRUG
Controlled Terminology/Code Lists – XML Code
<ItemDef OID="LB.LBBLFL" Name="LBBLFL" DataType="text" Length="1"
    Origin="Derived" def:Label="Baseline Flag"
    def:ComputationMethodOID="COMPMETHOD.LBBLFL">
  <CodeListRef CodeListOID="YF"/>
</ItemDef>
...
<CodeList OID="YF" Name="YF" DataType="text">
  <CodeListItem CodedValue="Y">
    <Decode>
      <TranslatedText xml:lang="en">YES</TranslatedText>
    </Decode>
  </CodeListItem>
</CodeList>
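Reading a code list back into a lookup table is a simple way to verify it. The following Python sketch is illustrative only; the function name `decode_map`, the trimmed `SAMPLE` wrapper element, and the ODM namespace URI are assumptions.

```python
import xml.etree.ElementTree as ET

ODM = "http://www.cdisc.org/ns/odm/v1.2"  # assumed namespace URI
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
NS = {"odm": ODM}

SAMPLE = f"""<MetaDataVersion xmlns="{ODM}">
  <CodeList OID="YF" Name="YF" DataType="text">
    <CodeListItem CodedValue="Y">
      <Decode>
        <TranslatedText xml:lang="en">YES</TranslatedText>
      </Decode>
    </CodeListItem>
  </CodeList>
</MetaDataVersion>"""


def decode_map(root, codelist_oid, lang="en"):
    """Return {CodedValue: decoded text} for one code list and language."""
    codelist = root.find(f".//odm:CodeList[@OID='{codelist_oid}']", NS)
    if codelist is None:
        raise KeyError(codelist_oid)
    result = {}
    for item in codelist.findall("odm:CodeListItem", NS):
        for text in item.findall("odm:Decode/odm:TranslatedText", NS):
            if text.get(XML_LANG) == lang:
                result[item.get("CodedValue")] = text.text
    return result
```

Usage: `decode_map(ET.fromstring(SAMPLE), "YF")` returns the coded values of the YF list with their English decodes.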
Supplemental Data Definitions • Optional • PDF Document • Additional Information Useful for Data Review • General Assumptions • Flowcharts • Derivation Dependencies • Reviewers' Guide
define.xml - SAS Based Generation Process • Use All Metadata Already Available in SAS • Provide Additional Information Required • Set-up at Design and Specification Level • Format: Excel Spreadsheets • Contents: CDISC Terminology, Study Specific Metadata (CRF Pages) • Combine Metadata and Additional Information in SAS • Create XML File in SAS • Use Stylesheet Provided with CDISC Sample
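The combining step can be pictured as a keyed merge of two metadata sources followed by XML generation. The sketch below uses Python purely for illustration; the presented solution performs these steps in SAS. All names (`merge_metadata`, `item_def`, the dictionary keys) and the `def` namespace URI are assumptions, and the ODM namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

DEF = "http://www.cdisc.org/ns/def/v1.0"  # assumed namespace URI
ET.register_namespace("def", DEF)


def merge_metadata(sas_rows, extra_rows):
    """Join dataset-derived rows with spreadsheet rows on (dataset, name)."""
    extra_by_key = {(r["dataset"], r["name"]): r for r in extra_rows}
    return [
        {**row, **extra_by_key.get((row["dataset"], row["name"]), {})}
        for row in sas_rows
    ]


def item_def(row):
    """Build an ItemDef element from one merged metadata row."""
    attrs = {
        "OID": f'{row["dataset"]}.{row["name"]}',
        "Name": row["name"],
        "DataType": row["datatype"],
        "Length": str(row["length"]),
    }
    # Origin and Comment only exist in the additional information.
    for key, attr in (("origin", "Origin"), ("comment", "Comment")):
        if row.get(key):
            attrs[attr] = row[key]
    attrs[f"{{{DEF}}}Label"] = row["label"]
    return ET.Element("ItemDef", attrs)
```

Keeping the merge separate from the XML construction means the merged rows can also feed the consistency checks before any XML is written.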
Available Metadata vs. Additional Information Req. - Continued
Process Flow • [Diagram; recoverable labels: SAS Datasets, XPT Files, SAS Formats, EXCEL Dataset Metadata, Variable Metadata, EXCEL Draft Variable Value Level Metadata, EXCEL Edited Variable Value Level Metadata, Computational Algorithms, Annotated CRF, Supplemental Data Definitions, DEFINE.XML]
Automated Consistency Checks • SDTM Adherence Checks • Availability of Datasets and Variables • Order of Variables in Dataset • Labels and Data Type • Variables with Controlled Terminology (SAS Format Attached) • Consistency Checks for Well-formed XML Code • Intra Document Links, e.g. Computational Methods • Additional Manual Checks Required
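The intra-document link check can be sketched as a scan for references whose target OID is never defined. This Python example is an illustration under assumptions, not the checks implemented in the presented SAS solution; `dangling_references`, the `SAMPLE` document, and the namespace URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; not shown on the slides.
ODM = "http://www.cdisc.org/ns/odm/v1.2"
DEF = "http://www.cdisc.org/ns/def/v1.0"

# Sample with one broken link: code list "YF" is referenced but not defined.
SAMPLE = f"""<MetaDataVersion xmlns="{ODM}" xmlns:def="{DEF}">
  <ItemGroupDef OID="LB" Name="LB">
    <ItemRef ItemOID="LB.LBBLFL" OrderNumber="22" Mandatory="No"/>
  </ItemGroupDef>
  <ItemDef OID="LB.LBBLFL" Name="LBBLFL" DataType="text" Length="1"
           def:ComputationMethodOID="COMPMETHOD.LBBLFL">
    <CodeListRef CodeListOID="YF"/>
  </ItemDef>
  <def:ComputationMethod OID="COMPMETHOD.LBBLFL">Derive mean</def:ComputationMethod>
</MetaDataVersion>"""


def qname(namespace, tag):
    return f"{{{namespace}}}{tag}"


def dangling_references(root):
    """Return (referencing element, missing OID) pairs for broken links."""
    def oids(namespace, tag):
        return {e.get("OID") for e in root.iter(qname(namespace, tag))}

    # (referencing element, attribute holding the OID, defined OIDs)
    checks = [
        (qname(ODM, "ItemRef"), "ItemOID", oids(ODM, "ItemDef")),
        (qname(ODM, "CodeListRef"), "CodeListOID", oids(ODM, "CodeList")),
        (qname(DEF, "ValueListRef"), "ValueListOID", oids(DEF, "ValueListDef")),
        (qname(ODM, "ItemDef"), qname(DEF, "ComputationMethodOID"),
         oids(DEF, "ComputationMethod")),
    ]
    problems = []
    for tag, attr, defined in checks:
        for element in root.iter(tag):
            target = element.get(attr)
            if target is not None and target not in defined:
                problems.append((tag.split("}")[1], target))
    return problems
```

An empty result means every checked reference resolves; it does not replace the manual checks listed above.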
Expected Enhancements – ADaM Integration • Aspects of CDISC Pilot 1 => CDISC define.xml Standard • [Screenshot callouts: Link to Analysis Metadata (s.b.), Link to CSR Table, Link to Variable Metadata of ADSL, Link to SAP]
Expected Enhancements - Continued • Correction of Software Issues with 2007 define.xml Sample • Adaptation to Latest CDISC ODM Standard • V2.0 => V3.0 • Improved Printability • Stylesheet Enhancements • Alternative Options (define_xml_printable.pdf) • Extension for CDISC ADaM Specific Metadata • Executable Computational Algorithm?
Summary & Conclusions • Presented Solution Just One of Many Options • Driven by Available Skills • XML Code Easy to Implement according to CDISC Standards • Biggest Challenge: Process Set-up • Advantages • Early Integration (Design and Specification Level) • Increased Consistency, Lower Risk of Redundancy • SDTM Adherence Check • Built-in Consistency with SAS Datasets • CDISC define.xml Standard is Work in Progress