260 likes | 282 Views
Discover efficient techniques for automating SAS coding tasks, including data extraction and DX code handling. From setting up SAS data sets to generating and running auto-coded SAS scripts, optimize your workflow with these advanced methods.
E N D
Let SAS Do the Coding for You! Robert Williams Business Info Analyst Sr. WellPoint Inc.
Overview • Coding Dilemma: Direct coding or “Robo-Coding”? • Step 1: Setting up SAS Data Set for “Robo-Coding” • Step 2a: Technique #1 – Macro variable • Generate SAS code using DATA _NULL_ and CALL SYMPUT to a macro • Step 2b: Technique #2 – External SAS code file • Generate SAS code using DATA _NULL_ and PUT statements to write to an external SAS program file • Step 3: Run the auto-generated SAS codes • Extra Example and Conclusion
Coding Dilemma: Claims Data Extract Request • Example data extraction asking for a subset of claims based on the following diagnosis and the corresponding diagnosis categories • Muscle problems that are of a disabling nature, limited to: 356.1, 359.0-359.2, 045.9, 728.11 • Hemophilia: 286.0-286.4 • Acquired or congenital heart disease: 745.0-747.9; 424.0-429.9
Coding Dilemma: 1-digit & 2-digit DX sub-codes • Diagnosis (DX) codes can be cumbersome to code into the WHERE statement due to having to account for each 1-digit and 2-digit sub-codes. • Example: To account for all medical DX codes from 359.0 to 359.2 (www.icd9data.com)
Coding Dilemma: Write the code the hard way Code all the LIKE statements with % Code all the WHEN statements with different sizes for SUBSTR function
Step 1: Setting up SAS Data File • Create a SAS data set with key data. It can be created in a DATA step or imported from Excel. Categories for the WHEN statements DX codes for WHERE statements and WHEN statements (without decimals) • Muscle problems:356.1, 359.0-359.2, 045.9, 728.11 • Hemophilia: 286.0-286.4 • Heart disease: 745.0-747.9; 424.0-429.9
Step 2a: Technique #1 – Macro Variable • Using the SAS data set from previous slide (Step 1), this technique uses DATA _NULL_ to generate a data string for the WHERE statement like this“WHERE IDCD_ID like '747%' or IDCD_ID like '0459%' or …” • Uses the RETAIN statement so the data string is retained for each DATA step iteration • Uses the END statement and CALL SYMPUT function to output the data string to a macro variable. The CALL SYMPUT is invoked at the last iteration of the DATA step
Step 2a: Technique #1 – The Robo-Coder • DATA _NULL_ step code to generate the data string At _n_=1, start the string. Then At _n_>1, “OR…” is added to the data string At _n_ = 1, where_string= IDCD_ID like "0459%" At the last iteration, the DATA _NULL_ outputs the where_string value as a SAS macro variable named: “IDCD_ID_WHERE_STATEMENT" At _n_ = 2, where_string= IDCD_ID like "0459%" OR IDCD_ID like "3561%"
Step 2a: Technique #1 – The Robo-Coder • DATA _NULL_ step code to generate the data string • Using the %PUT statement, the macro variable will look like this in the SAS log _n_ = 1 starts the string, after that, the “OR…” is added to the data string
Step 2b: Technique #2 – External Program File • Using the SAS data set from Step 1, this technique uses DATA _NULL_ to write the series of SELECT/WHEN statements to an external SAS program file using the PUT statements • Uses FILENAME statement to establish external SAS file reference i.e. “tmpwhen” to be used with FILE statement • Uses LENGTH function to determine the size of the DX_LIST field of the SAS data set (from step 1) i.e.“str_length = put(length(DX_LIST),1.0);” • Each DATA step iteration writes the each WHEN statement for the DX_ Category
Step 2b: Technique #2 – External Program File • Uses the END statement (inside the SET statement) to write the final the final two lines of the series of SELECT/WHEN statements i.e.“OTHERWISEDX_category="WHAT CATEGORY?!";END;” • When the DATA _NULL_ is completed, the external SAS program file is created with SELECT/WHEN statement block generated by all the PUT statements inside the DATA _NULL_
Step 2b: Technique #2 – The Robo-Coder • DATA _NULL_ step code to write to external program At _n_ = 2 and so on, it writes another line to the SAS file like this when (substr(IDCD_ID,1,4) = '3561') DX_category = 'Muscle problems'; Filenamestatement established an external SAS file with reference name ”tmpwhen”. At _n_ = 1, it writes to the SAS file these two lines select; when (substr(IDCD_ID,1,4) = '0459') DX_category = 'Muscle problems'; At the last iteration, it writes these two lines to the SAS file OTHERWISEDX_category = "WHAT CATEGORY?!"; END; LENGTH function to get the size of DX_LIST field. It changes for each DATA step iteration.
Step 2b: Technique #2 – The Robo-Coder • Opening up the SAS program name “tempwhenstatements.sas”, you will see this block of statements Notice how the length size changes for the SUBSTR function
Step 3: Run the auto-generated SAS codes • Invoke the macro variable inside the WHERE statement using the “&” sign or ampersand symbol • Insert external SAS program using the %INCLUDE statement. Note: this line, “optionssource2;” will allow the external SAS code to display in the log. Invoke macro here Insert external SAS code as referenced with FILENAME
Step 3: What the auto-generated code looks like It looks exactly the same as slide 6 if I hard coded it manually.
Step 3: Result of the claims data extraction • Here is the results of the claims data extraction as well as the DX categories.
Extra Example • Suppose, we get data with the patients’ diagnoses. We want to transpose the patients diagnosis to put the most severe diagnoses in the first column based on diag scores NOTE: Some patient has only one diag while others has four diags (or maybe more.) DX 055 has a higher diag score than DX 058. So, it will be first for the first patient.
Extra Example • SAS code to sort each patient ID with diag_score in descending sort before transposing the diags into columns. • /* Sort the diag data with the highest diag score on top for each patient */ • procsortdata=work.vasug_sample_diag; • bypatient_iddescendingdiag_score; • run; • /* Transpose the diag for each patient */ • proctransposedata=work.vasug_sample_diagout=work.trans_diags (drop= _name_ _label_) prefix=diag_; • var DIAG; • bypatient_id; • run;
Extra Example • SAS code to sort each patient ID with diag_score in descending sort before transposing the diags into columns. • /* Sort the diag data with the highest diag score on top for each patient */ • procsortdata=work.vasug_sample_diag; • bypatient_iddescendingdiag_score; • run; • /* Transpose the diag for each patient */ • proctransposedata=work.vasug_sample_diagout=work.trans_diags (drop= _name_ _label_) prefix=diag_; • var DIAG; • bypatient_id; • run;
Extra Example • It was nice to transpose the diags but the care managers don’t know who the patient is. We join with member demographics but we don’t know how many diag columns. So, use PROC CONTENTS (with OUT= and VARNUM options) to see all the diag columns from the PROC TRANSPOSE. Looks like at-most 5 diag columnsproccontentsdata=work.trans_diagsout=work.trans_diag_var_listvarnumnoprint; • run;
Extra Example • Take advantage of the output from PROC CONTENTS to create the variable list for PROC SQL when joining to member table. Use marco variable method (technique #1)data_null_; • formatstring_diag_var_list$500.; /* Set it large enough */ • setwork.trans_diag_var_listend=last; • retainstring_diag_var_list" "; • if _n_ = 2thenstring_diag_var_list = NAME; /* This will be the first one */ • elsestring_diag_var_list = trim(string_diag_var_list)||", "||trim(NAME); • if last thencallsymput("diag_var_list",trim(string_diag_var_list)); • run; • %put &DIAG_var_list; /* See what it looks like */
Extra Example • Take advantage of the output from PROC CONTENTS to create the variable list for PROC SQL when joining to member table. Use marco variable method (technique #1)data_null_; • formatstring_diag_var_list$500.; /* Set it large enough */ • setwork.trans_diag_var_listend=last; • retainstring_diag_var_list" "; • if _n_ = 2thenstring_diag_var_list = NAME; /* This will be the first one */ • elsestring_diag_var_list = trim(string_diag_var_list)||", "||trim(NAME); • if last thencallsymput("diag_var_list",trim(string_diag_var_list)); • run; • %put &DIAG_var_list; /* See what it looks like */
Extra Example • Put the macro variable into the PROC SQLprocsql; • createtablework.Final_member_ordered_DIAGS • asselectdistincta.MEMBER_ID, a.MEMBER_NAME, a.AGE, a.GENDER, • &DIAG_var_list • fromVASUG_sample_MEMBERS a leftjoinwork.trans_diags b • ona.member_id=b.patient_id; • quit; • PROC SQL after invoking the macro substitution • procsql; • createtablework.Final_member_ordered_DIAGS • asselectdistincta.MEMBER_ID, a.MEMBER_NAME, a.AGE, a.GENDER, • DIAG_1, DIAG_2, DIAG_3, DIAG_4, DIAG_5 • fromVASUG_sample_MEMBERS a leftjoinwork.trans_diags b • ona.member_id=b.patient_id; • quit;
Conclusion • These two techniques saved us a significant amount of manual coding labor. • This example may not seem like much coding but the idea will be useful if we have a large number of repetitive SAS statements to code in a dynamical process • This idea is not limited to just making WHERE statements or SELECT/WHEN statements. • The Robo-Coder can be expanded to include other SAS PROCs as well as dozen or hundreds of routine weekly/monthly reports (See my VASUG Spring 2011 presentation on the PROC REPORT example)
Questions? • Questions? • SAS code and sample SAS data set is available. Send me an email. • Contact: Robert Williams WellPoint Inc.4425 Corporation Ln. Virginia Beach, VA 23462 Robert.Williams@Amerigroup.com
References • SAS Certification Prep Guide: Base Programming for SAS 9 Third Edition Cary, NC: SAS Institute Inc. 2011 • SAS Certification Prep Guide: Advanced Programming for SAS 9 Third Edition Cary, NC: SAS Institute Inc. 2011 • The Web's Free 2013 Medical Coding Reference. (http://www.icd9data.com/) • Robo-SAS Coding? Yes! We can auto-generate SAS codes from a SAS data set. (http://vasug.org/about-2/past-presentations/)