XML in a SAS World
Mike Molter
d-Wise Technologies
Agenda
• What is XML?
• Examples of industry XML standards (schemas)
• SAS tools for working with XML
What is XML?
• eXtensible Markup Language
• Used for structure, storage, and transport of data (w3schools.com)
• Like any other computer language…
    • textual gibberish
    • set of rules (structural, syntax)
    • vocabulary
        • elements
        • attributes
        • tags
        • schemas
    <nhl>
      <team name="Red Wings">
        <conference>Eastern</conference>
        <division>Atlantic</division>
        <location>Detroit</location>
      </team>
      <team name="Flames">
        <conference>Western</conference>
        <division>Pacific</division>
        <location>Calgary</location>
      </team>
      <team name="Devils">
        <conference>Eastern</conference>
        <division>Metropolitan</division>
        <location>New Jersey</location>
      </team>
    </nhl>
• An XML document is made of elements (nhl, team, conference)
• Elements are marked with a start tag and an end tag (<division>, </division>)
• Elements may be nested within other elements (location is nested within team)
• Elements may contain attributes (the team element contains the name attribute)
• An element's value is the text between its start and end tags that lies outside any nested element (Pacific is a value of the division element)
• Each XML document must contain a root element (nhl)
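The terms in the bullets above can be checked with any XML parser. A minimal sketch, outside SAS, using Python's standard xml.etree.ElementTree on the same NHL document; the variable names (NHL_XML, teams) are illustrative only:

```python
import xml.etree.ElementTree as ET

# The NHL document from the slide
NHL_XML = """<nhl>
  <team name="Red Wings">
    <conference>Eastern</conference>
    <division>Atlantic</division>
    <location>Detroit</location>
  </team>
  <team name="Flames">
    <conference>Western</conference>
    <division>Pacific</division>
    <location>Calgary</location>
  </team>
  <team name="Devils">
    <conference>Eastern</conference>
    <division>Metropolitan</division>
    <location>New Jersey</location>
  </team>
</nhl>"""

root = ET.fromstring(NHL_XML)
root_name = root.tag                      # the root element

teams = []
for team in root.findall("team"):         # team elements nested in the root
    teams.append({
        "name": team.get("name"),                     # attribute of team
        "conference": team.findtext("conference"),    # value of a nested element
        "division": team.findtext("division"),
        "location": team.findtext("location"),
    })

print(root_name)
for row in teams:
    print(row)
```

Each team element becomes one row, its name attribute and nested element values become the columns, which is the shape a SAS data set needs.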
What is XML?
• Like any other computer language…
    • textual gibberish
    • set of rules (structural, syntax)
    • vocabulary
        • elements
        • attributes
        • tags
        • schemas
• Unlike other computer languages…
    • no keywords
    • no processor
XML Schema (or standard)
• XML Schema (informal): a specific set of elements and attributes, along with a set of rules that govern their use, for the purpose of transferring data between systems and developing applications for processing such data.
• An XML schema can be a combination of new elements along with other XML schemas (extensible)
• XML schema file: a well-formed XML file used for enforcing the rules of an XML schema, or validating an XML document.
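Well-formed and valid are two different checks. A minimal sketch, outside SAS, of the first one only, using Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample strings are made up for illustration. The standard library does not validate against a schema file, so the second check would need a schema-aware tool.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text obeys XML syntax rules (says nothing about any schema)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Illustrative documents (not from the presentation)
good = "<nhl><team name='Flames'><division>Pacific</division></team></nhl>"
bad = "<nhl><team name='Flames'><division>Pacific</team></nhl>"   # division is never closed
off_schema = "<nhl><goalie/></nhl>"   # legal syntax, but goalie is not in the NHL vocabulary

print(is_well_formed(good), is_well_formed(bad), is_well_formed(off_schema))
```

The third document passes the syntax check even though it uses an element the NHL schema never defined; catching that is the job of validation against a schema file.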
XML Schema Examples
• NHL (OK, I made this one up)
• XSL (eXtensible Stylesheet Language, .xsl)
    • Transforms XML into something else
• XML schema files (.xsd)
    • Validate an XML document
• XML Spreadsheet 2003 (.xml)
    • Read and displayed by Excel
• ODM, Define, SDS
    • Clinical trials data, metadata
XML in Pharma
• Operational Data Model (ODM)
    • Collected clinical trial data, metadata, administrative data, reference data, audit information
• Define-XML
    • Metadata for submitted data in ODM structure
    • Value-level metadata is in the define extension
• SDS-XML
    • Submission data in ODM structure
XML in Pharma
[Figure: flow diagram with the labels Collected Data, Data Transformations, Data Submission, Metadata Submission, ODM.XML, SAS, SDS.XML, Define.XML]
ODM
[Figure callouts: Clinical Data; ItemGroup (dataset-level) Metadata]
ODM
[Figure callouts: Clinical Data; ItemGroup (dataset-level) Metadata; Item (variable-level) Metadata]
ODM
[Figure callouts: Item (variable-level) Metadata; Codelist Metadata (allowable values)]
Exporting XML
[Figure: Teams.sas7bdat]
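Exporting XML with the LIBNAME statement
    /* Assign a libref that writes through the XML LIBNAME engine */
    libname xmlout xml 'C:\teams_generic.xml' ;
    /* Writing a data set to the libref creates the XML file */
    data xmlout.xteams ;
      set teams ;
    run ;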
Exporting XML with the LIBNAME statement
    /* XMLTYPE= selects the XML format; here the Oracle layout */
    libname xmlout xml 'C:\teams_oracle.xml' xmltype=oracle ;
    data xmlout.xteams ;
      set teams ;
    run ;
Exporting XML with a DATA step
    filename xmlout4 'C:\teams_datastep.xml' ;
    data _null_ ;
      file xmlout4 ;
      set teams end=thatsit ;
      /* +(-1) backs up over the blank that list-style PUT writes after each value */
      if _n_ eq 1 then put '<nhl>' ;
      put '<team name="' name +(-1) '">' ;
      put '<conference>' conference +(-1) '</conference>' ;
      put '<division>' division +(-1) '</division>' ;
      put '<location>' location +(-1) '</location>' ;
      put '</team>' ;
      if thatsit then put '</nhl>' ;
    run ;
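One thing hand-written tags do not handle is special characters: a value containing & or < written as-is yields a file that is not well-formed. A minimal sketch, outside SAS, of the same nhl/team layout built with Python's standard xml.etree.ElementTree, which escapes such characters on output. The second row is hypothetical, made up only to show the escaping.

```python
import xml.etree.ElementTree as ET

rows = [
    {"name": "Red Wings", "conference": "Eastern",
     "division": "Atlantic", "location": "Detroit"},
    # Hypothetical row, not from the presentation: contains an ampersand
    {"name": "Hypothetical & Co", "conference": "Western",
     "division": "Pacific", "location": "Nowhere"},
]

root = ET.Element("nhl")                                # root element
for row in rows:
    team = ET.SubElement(root, "team", name=row["name"])   # name becomes an attribute
    for field in ("conference", "division", "location"):
        ET.SubElement(team, field).text = row[field]        # nested element value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Reading the text back recovers the original value
round_trip = ET.fromstring(xml_text).findall("team")[1].get("name")
```

In a DATA step the equivalent precaution would be to replace those characters with their entity references before the PUT statements.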
Exporting XML with the LIBNAME statement or ODS using tagsets
    libname xmlout xml 'C:\teams_tagset_libname.xml' tagset=<tagset-name> ;
    data xmlout.xteams ;
      set teams ;
    run ;

    ods markup tagset=<tagset-name> file='C:\teams_tagset_ods.xml' ;
    proc print noobs data=teams ;
    run ;
    ods markup close ;
Exporting XML with ODS using SAS's ExcelXP tagset
    ods markup tagset=excelxp file='C:\teams_excel.xml' ;
    proc print noobs data=teams ;
    run ;
    ods markup close ;
References
A SAS Programmer's Guide to Generating Define.xml, SAS Global Forum 2009
    ods markup tagset=mydefine file='define.xml' ;
    proc print noobs data=meta-dataset1 ; run ;
    proc print noobs data=meta-dataset2 ; run ;
    proc print noobs data=meta-dataset3 ; run ;
    /* etc. */
    ods markup close ;
References
Tips and Tricks for Creating Multi-Sheet Microsoft Excel Workbooks, Vince DelGobbo, SAS Global Forum 2009
ODS Markup: The SAS Reports You've Always Dreamed of, Eric Gebhart, SUGI 30
References
ExcelXP on Steroids: Adding Custom Options to the ExcelXP Tagset, SAS Global Forum 2011
    ods markup tagset=myexcel file='define.xml' options (tab_color='45') ;
    proc print noobs data=dataset1 ; run ;
    ods markup close ;
Importing XML
Export
    libname xmlout xml 'C:\teams_generic.xml' ;
    data xmlout.xteams ;
      set teams ;
    run ;
Import
    data sasteams ;
      set xmlout.xteams ;
    run ;
NHL.XML
    libname xmlin xml 'C:\teams_nhl.xml' ;
    data sasteam ;
      set xmlin.team ;
    run ;

    <nhl>
      <team name="Red Wings">
        <conference>Eastern</conference>
        <division>Atlantic</division>
        <location>Detroit</location>
      </team>
      <team name="Flames">
        <conference>Western</conference>
        <division>Pacific</division>
        <location>Calgary</location>
      </team>
      <team name="Devils">
        <conference>Eastern</conference>
        <division>Metropolitan</division>
        <location>New Jersey</location>
      </team>
    </nhl>
[Figure: SASTEAM.SAS7BDAT]
Importing XML with an XML map
• An XML map is an XML schema
• Provides instructions to the XML LIBNAME engine for reading XML
    • Name and label for the data set
    • Which XML elements define observations
    • How to define variables (attributes and values)
• Uses XPath syntax to navigate the XML document and identify its components
    filename mymap 'C:\mymap.map' ;
    libname xmlin xml 'C:\nhl.xml' xmlmap=mymap ;
    /* The member name after xmlin. must match a TABLE name defined in the map */
    data sasteams ;
      set xmlin.teams ;
    run ;
Importing XML with an XML map
    <?xml version="1.0" encoding="UTF-8"?>
    <SXLEMAP version="1.2">
      <!-- Name of data set to be created -->
      <TABLE name="SASTeams">
        <!-- Observation boundary -->
        <TABLE-PATH syntax="XPath">/nhl/team</TABLE-PATH>
        <!-- Variable definitions -->
        <COLUMN name="conference">
          <PATH syntax="XPath">/nhl/team/conference</PATH>
          <TYPE>character</TYPE>
          <DATATYPE>string</DATATYPE>
          <LENGTH>20</LENGTH>
        </COLUMN>
        <COLUMN name="name">
          <PATH syntax="XPath">/nhl/team/@name</PATH>
          <TYPE>character</TYPE>
          <DATATYPE>string</DATATYPE>
          <LENGTH>20</LENGTH>
        </COLUMN>
      </TABLE>
    </SXLEMAP>
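What the map asks for can be sketched outside SAS. The following Python example (standard xml.etree.ElementTree only) mimics the idea: one row per node matching the table path, one column per column path, with @name read as an attribute. It is an illustration of the mapping, not how the XML LIBNAME engine is implemented; the names TABLE_PATH, COLUMNS, and read_table are made up, and ElementTree supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

NHL_XML = """<nhl>
  <team name="Red Wings"><conference>Eastern</conference>
    <division>Atlantic</division><location>Detroit</location></team>
  <team name="Flames"><conference>Western</conference>
    <division>Pacific</division><location>Calgary</location></team>
  <team name="Devils"><conference>Eastern</conference>
    <division>Metropolitan</division><location>New Jersey</location></team>
</nhl>"""

# Paths copied from the map on the slide
TABLE_PATH = "/nhl/team"                       # observation boundary
COLUMNS = {
    "conference": "/nhl/team/conference",      # element value
    "name": "/nhl/team/@name",                 # attribute value
}

def read_table(xml_text, table_path, columns):
    """Return one dict per node matching table_path, one key per column."""
    root = ET.fromstring(xml_text)
    prefix = "/" + root.tag + "/"
    if not table_path.startswith(prefix):
        raise ValueError("table path must start at the root element")
    # ElementTree paths are relative to the root, so drop the "/nhl/" prefix
    row_path = table_path[len(prefix):]
    rows = []
    for node in root.findall(row_path):
        row = {}
        for col, path in columns.items():
            rel = path[len(table_path):].lstrip("/")   # "conference" or "@name"
            if rel.startswith("@"):
                row[col] = node.get(rel[1:])           # attribute on the row node
            else:
                row[col] = node.findtext(rel)          # text of a child element
        rows.append(row)
    return rows

table = read_table(NHL_XML, TABLE_PATH, COLUMNS)
for row in table:
    print(row)
```

Because the map lists only conference and name, division and location are left out of the result, just as unmapped elements do not become variables.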
Clinical Standards Toolkit (CST)
• A Base SAS framework for executing clinical data tasks such as verification of data compliance against standards and importing/exporting ODM and Define.xml.
• Contains all necessary files (SAS macros and driver programs, maps, XSL stylesheets)
• Learning curve
Clinical Standards Toolkit (CST)
…or PROC XSL
References
• Using the SAS Clinical Standards Toolkit 1.5 to Import CDISC ODM Files, Lex Jansen, PharmaSUG 2013
• Using the SAS Clinical Standards Toolkit for Define.xml Creation, Lex Jansen, PharmaSUG 2011
• Accessing the Metadata from the Define.xml Using XSLT Transformation, Lex Jansen, PhUSE 2010
In Summary…
• Options for Exporting XML
    • XML LIBNAME engine (XMLTYPE=, TAGSET= options)
    • ODS (SAS XML destinations or user-defined tagsets)
    • DATA step
    • XSL stylesheets
    • CST (clinical)
• Options for Importing XML
    • XML LIBNAME engine (XMLTYPE=, TAGSET= options)
    • XML maps
    • XSL stylesheets
    • CST (clinical)