440 likes | 645 Views
“Automagic” SAS Tricks. Bill McKirgan Database Admin/Develop II University of Iowa Department of Psychiatry bill-mckirgan@uiowa.edu. Before we start…. Beginning SAS programmers should practice the basics before trying to use this stuff.
E N D
“Automagic” SAS Tricks Bill McKirgan Database Admin/Develop II University of Iowa Department of Psychiatry bill-mckirgan@uiowa.edu
Before we start… Beginning SAS programmers should practice the basics before trying to use this stuff. One should have a grasp of SAS Base before learning how to automate processes or generate code.
Acknowlegement… There are many ways of doing the same thing in SAS and other languages. ’Automagic’ SAS Tricks is just one toolkit for generating code. In October 2006, Michael Brainard presented, How to use SAS to generate code at the IASUG meeting. His presentation provides valuable examples of how to use PROC SQL to generate SAS and other code. http://www.iowasasuser.org/CodeGenerationISUG200610.ppt (URL for Michael’s ISUG presentation)
Stuff we will discuss today What is “automagic”? Basic tools • Clued-in to %INCLUDE • Using ‘autoexe.sas’: what, how & why • The power of %LET • Fun with %SYSFUNC • Generating lists of files to import • Using file lists to automate import • Questions
What is “automagic”? • In simple terms, “Automagic” is a chopped combination of the words ‘automatic’ and ‘magic’. • Automating a process is generally good, and for folks who do not understand or need to know the details the end result is magical.
What is “automagic”? • Many programs are ‘static’ often only good for a single run. • The next time the program is needed certain details typically need to be changed: • Report dates • File names • Titles, subtitles, and footnotes
What is “automagic”? • SAS programmers have a vast set of tools for automating processes and making programs more dynamic: • %LET, system functions, and SAS macro (stuff we will do today) • SAS SCL and PROC SQL are other options, but are not used in this presentation
Basic tools Clued-in to %INCLUDE • As we expand our repertoire of SAS tricks we usually get to a point where we discover repetition and how to ‘copy/paste’ certain standard program elements. • As a beginning SAS programmer I wrote each program as a ‘one-off’ and would typically repeat libname statements, formats, and code for deriving and labeling variables in each new program. • Then I got “clued-in” to a way of storing these handy chunks of programs in separate files and calling upon them using %include. syntax: %INCLUDE “c:\temp\program_chunk.sas”;
Basic tools Clued-in to %INCLUDE • One of the first steps in using %include to automate SAS programs is to write the programs to be included in such a way that they can be executed independently. • Examples: %include ‘c:\temp\my_formats.sas’; should probably have no datastep code, titles, or other extraneous stuff. %include “s:\project_a\standard_libs_fmts.sas”; again, should have nothing extraneous. Craft the %include-able programs so they are generalized and can be called upon as needed in other programs.
Basic tools Clued-in to %INCLUDE NOTE: Not all programs need to be written this way, but the stuff you want to make generally useful from other SAS programs should be written with this in mind: • Take care not to over-write or “clobber” • Libnames • Macro variable names • Dataset names • Other stuff???
Basic tools Using ‘autoexe.sas’: what, how & why • First of all, WHAT is ‘autoexec.sas’? • It’s a special SAS program file name for the SAS user • The SAS user/administrator must create it • The SAS user can program it to do things at startup • HOW do I use it? • it will be executed every time SAS is started as long as it is placed in the proper directory: Example – C:\Program Files\SAS\SAS 9.1\autoexec.sas
Basic tools Using ‘autoexe.sas’: what, how & why • WHY would I need one? • If you have many custom SAS libraries • You can define them in your autoexec.sas instead of using the library/libname icon in the SAS graphic user interface • If you have datasets you want refreshed from other database applications or formats • SAS import and translation programs can be executed via. %include statement in your autoexec.sas • If you want to pre-load custom formats, or macro variables • Again, the autoexec.sas can store these statements or run them from other batch-capible programs via. %include
Basic tools The power of %LET • %LET is the most basic of all SAS macro constructs. It is used to define a macro variable name, and to store a string of characters in the macro variable defined. %let whatpath=c:\longpath\that\i\dont\want\to\type\again; • Here, above, we define WHATPATH and load it with a long directory path name that can be “called” later in the program by using the ampersand character: infile “&WHATPATH\mytextfile.txt”;
Basic tools The power of %LET • Storing a string of characters for later use is a powerful thing especially if it is a long string and used frequently later in the program.
Basic tools The power of %LET My Criteria for %LETting something • The string is so long retyping could result in errors • The string is repeated more than once or twice • A combination of 1 and 2
Basic tools Fun with %SYSFUNC • The macro function %SYSFUNC gives us access to the macro processor and many datastep and SCL functions. • %SYSFUNC has been around since SAS version 6.12.
Basic tools Fun with %SYSFUNC • Common uses: • changing the format of macro variables (covered in this presentation) • reading SAS datasets, catalogs, and external files • (www.cyassociates.com/sysfunc.html is a good source for %SYSFUNC examples).
Basic tools Fun with %SYSFUNC Changing macro variable formats • I used to archive dataset and report files by hand • that is I would: copy/paste and then rename files to include a “_YYYYMMDD” datestamp at the end • After discovering macro I tried to automate this by concatenating file names using “today()”, but the SASDATE number was confusing. • I wished I’d discovered %SYSFUNC sooner, because it does this and many other tasks to save the programmer from typing
Basic tools Fun with %SYSFUNC In this example we use %LET to create the macro variable DATESTAMP which is the YYYYMMDD-formatted version of the current SAS system date. %let datestamp = %sysfunc(today(),yymmddN8.); %put &datestamp; -- log output -- SYMBOLGEN: Macro variable DATESTAMP resolves to 20061211 20061211
Basic tools Fun with %SYSFUNC Over weeks and months of use, the automated DATESTAMP helps keep similar datasets neatly grouped in the directory where it is stored.
Generating lists of files SAS has several great ways of helping us avoid typing and looking up stuff we may need for our programs: • X is a command that passes text to the operating system • Example: X dir c:\temp • Will cause a “DOS” window to open and display the screen output • Example: X dir c:\temp > c:\temp\tempfiles.txt • Will cause a “DOS” window to open and direct the output to the file, ‘tempfiles.txt’ • In each case the DOS window will remain open unless you preceed the X command with the option: Options noxwait;
Generating lists of files • The following options are helpful: • symbolgen – causes the resolved macro variable to display in the log • noxwait – allows SAS to execute while a command window is open; otherwise, it will pause until the user closes the window. options nocenter symbolgen noxwait ; %let drv=c; /* Name the directory FILE SEARCH will branch-off from, below. */ %let progpath=&drv:\temp\IASUG Presentations\Automagic SAS Tricks; %let whatpath=&progpath\de_identified_data; %let therepth=&progpath\final_data; run;
Generating lists of files /* Make list of all files found under WHATPATH */ filename filelist pipe"dir ""&whatpath""\*IGT_* /S "; run; This piped command selects only the files that have names with ‘IGT_’ or ‘igt_’ in them. It took me much trial & error work to get &WHATPATH to resolve correctly.
Generating lists of files Here’s a look at some of the output that will become part of our list: Volume in drive C has no label. Volume Serial Number is 9052-5591 Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data 12/12/2006 11:03 AM 1,225 igt_list1.sas 1 File(s) 1,225 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_112233_ew 12/21/2004 12:39 PM 6,291 igt_112233.txt 1 File(s) 6,291 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_1122_by 01/30/2006 01:09 PM 6,291 igt_1122.txt 1 File(s) 6,291 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_223344_cjp 09/28/2006 04:47 PM 6,291 IGT_223344.txt 1 File(s) 6,291 bytes
Generating lists of files We will read the directory info from some lines Volume in drive C has no label. Volume Serial Number is 9052-5591 Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data 12/12/2006 11:03 AM 1,225 igt_list1.sas 1 File(s) 1,225 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_112233_ew 12/21/2004 12:39 PM 6,291 igt_112233.txt 1 File(s) 6,291 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_1122_by 01/30/2006 01:09 PM 6,291 igt_1122.txt 1 File(s) 6,291 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_223344_cjp 09/28/2006 04:47 PM 6,291 IGT_223344.txt 1 File(s) 6,291 bytes
Generating lists of files …and the file information from other lines Volume in drive C has no label. Volume Serial Number is 9052-5591 Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data 12/12/2006 11:03 AM 1,225 igt_list1.sas 1 File(s) 1,225 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_112233_ew 12/21/2004 12:39 PM 6,291 igt_112233.txt 1 File(s) 6,291 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_1122_by 01/30/2006 01:09 PM 6,291 igt_1122.txt 1 File(s) 6,291 bytes Directory of C:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_223344_cjp 09/28/2006 04:47 PM 6,291 IGT_223344.txt 1 File(s) 6,291 bytes
Generating lists of files data fileprep; retain IND_ID ; infile filelist truncover ; input filedate :mmddyy10. timestr $ 13-21 sizestr $ 22-38 filename $ 40-90 lines $ 1-100; format filedate mmddyy10.; filetime = input(compress(timestr," "),??time9.); format filetime timeampm8.; filesize = input(sizestr,??comma8.); if index(lines,"Directory of ")=1thendo; DIRECTORY = substr(lines,14,200); IND_ID=INPUT(SCAN(DIRECTORY,4,'_'),BEST8.); retain DIRECTORY IND_ID ; end; if filedate ^= .; /* Discard output not related to files */ LABEL FILEDATE = "Date file was created" FILETIME = "Time file was created" FILESIZE = "Size file" FILENAME = "Name of file" DIRECTORY= "Name of source directory" ; drop timestr sizestr LINES; /* Drop junk strings */ IF INDEX(UPCASE(FILENAME),".TXT"); RUN; The following datastep reads the output in a sloppy way: First by trying to read each line as if it was a line with file details. And later, it reads directory details only from the lines that contain the string, “Directory of “.
Generating lists of files A closer look data fileprep; retain IND_ID ; infile filelist truncover ; input filedate :mmddyy10. timestr $ 13-21 sizestr $ 22-38 filename $ 40-90 lines $ 1-100; format filedate mmddyy10.; filetime = input(compress(timestr," "),??time9.); format filetime timeampm8.; filesize = input(sizestr,??comma8.); if index(lines,"Directory of ")=1thendo; DIRECTORY = substr(lines,14,200); IND_ID=INPUT(SCAN(DIRECTORY,4,'_'),BEST8.); retain DIRECTORY IND_ID ; end;
Generating lists of files A closer look data fileprep; retain IND_ID ; infile filelist truncover ; input filedate :mmddyy10. timestr $ 13-21 sizestr $ 22-38 filename $ 40-90 lines $ 1-100; format filedate mmddyy10.; filetime = input(compress(timestr," "),??time9.); format filetime timeampm8.; filesize = input(sizestr,??comma8.); if index(lines,"Directory of ")=1thendo; DIRECTORY = substr(lines,14,200); IND_ID=INPUT(SCAN(DIRECTORY,4,'_'),BEST8.); retain DIRECTORY IND_ID ; end; example of Log NOTEs resulting from input of lines without file info NOTE: Invalid data for filedate in line 1 2-7. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8 1 Volume in drive C has no label. 32 IND_ID=. filedate=. timestr=rive C ha sizestr=s no label. filename= lines=Volume in drive C has no label. filetime=. filesize=. DIRECTORY= _ERROR_=1 _N_=1 NOTE: Invalid data for filedate in line 2 2-7. . . . ERROR: Limit set by ERRORS= option reached. Further errors of this type will not be printed
Generating lists of files A closer look data fileprep; retain IND_ID ; infile filelist truncover ; input filedate :mmddyy10. timestr $ 13-21 sizestr $ 22-38 filename $ 40-90 lines $ 1-100; format filedate mmddyy10.; filetime = input(compress(timestr," "),??time9.); format filetime timeampm8.; filesize = input(sizestr,??comma8.); if index(lines,"Directory of ")=1thendo; DIRECTORY = substr(lines,14,200); IND_ID=INPUT(SCAN(DIRECTORY,4,'_'),BEST8.); retain DIRECTORY IND_ID ; end;
Generating lists of files /* fileprep datastep continued… */ if filedate ^= .; /* Discard output not related to files */ LABEL FILEDATE = "Date file was created" FILETIME = "Time file was created" FILESIZE = "Size file" FILENAME = "Name of file" DIRECTORY= "Name of source directory" ; drop timestr sizestr LINES; /* Drop junk strings */ IF INDEX(UPCASE(FILENAME),".TXT"); RUN;
Generating lists of files And here is our viewtable of the file list Now we have what is needed to fish-out files to import from many different directories.
Using file lists to automate import • Importing files can be easy when they are all in one directory, but this is not always the case • Automating a routine import process is helpful • When files span different subdirectories • When files have different names • When filenames contain information needed in the data • We will examine one method for doing this
Using file lists to automate import • Macro is handy for doing the “list processing” we need for import • The classic macro loop defines the macro name and parameters that relate to a chunk of SAS code. The parameters are the elements that change with each macro call.
Using file lists to automate import • This is the classic macro loop: Macro definition (parameters) … template of SAS statements Macro end Call macro 1 Call macro 2
Using file lists to automate import • This is the classic macro loop: %macro runem(a, b, c); data this; set &a; id=&b; type=&c; %mend runem; %runem(fred, 99, 1); %runem(barney, 22, 7);
Using file lists to automate import First generate the macro call statements and write them to a file called “igtlist1.sas” for later %INCLUDE use. options ls=200; data makecode; set fileprep; lines='%runem( '||trim(left(ind_id))||' , '||trim(left(filedate ))||' , '||trim(filename) ||" , "||trim(directory)||' );' ; file"&whatpath\igtlist1.sas"; put lines; run; %runem( 1122 , 16831 , igt_1122.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_1122_by ); %runem( 112233 , 16426 , igt_112233.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_112233_ew ); %runem( 223344 , 17072 , IGT_223344.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_223344_cjp ); %runem( 334455 , 16531 , igt_334455.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_334455_eew ); %runem( 445566 , 16456 , igt_445566.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_445566_eew ); %runem( 556677 , 17013 , igt_556677.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_556677_by ); %runem( 667788 , 16476 , igt_667788.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_667788_ccp ); %runem( 778899 , 16668 , IGT_778899.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_778899_cjp ); %runem( 889900 , 17002 , igt_889900.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_889900_eew ); %runem( 990011 , 16510 , IGT_990011.txt.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_990011_cjp );
Using file lists to automate import Next generate a block of filenames that will be used for a SET statement to put all the data together. This list of file names is written to ‘igtlist2.sas’ and will later be %INCLUDEd. data makecode; set fileprep; lines='raw.igt_' ||trim(left(ind_id)) ; file"&whatpath\igtlist2.sas"; put lines; run; raw.igt_1122 raw.igt_112233 raw.igt_223344 raw.igt_334455 raw.igt_445566 raw.igt_556677 raw.igt_667788 raw.igt_778899 raw.igt_889900 raw.igt_990011
Using file lists to automate import LIBNAME RAW "&whatpath\rawsas"; run; %macro runem(whatid, whatdate, whatfile, whatpath); DATA raw.IGT_&whatid; IND_ID = &whatid; infile "&whatpath\&whatfile" firstobs=23; IGT_DATE=&whatdate; FORMAT IGT_DATE MMDDYY10.; input Trial Deck $ Win Lose Score Borrow Time_ms; run; %mend runem; %include"&whatpath\igtlist1.sas"; run; quit; define macro template of SAS statements Here we see how the macro call is executed with a %INCLUDE %runem( 1122 , 16831 , igt_1122.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_1122_by ); %runem( 112233 , 16426 , igt_112233.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_112233_ew ); %runem( 223344 , 17072 , IGT_223344.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_223344_cjp ); %runem( 334455 , 16531 , igt_334455.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_334455_eew ); %runem( 445566 , 16456 , igt_445566.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_445566_eew ); %runem( 556677 , 17013 , igt_556677.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_556677_by ); %runem( 667788 , 16476 , igt_667788.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_667788_ccp ); %runem( 778899 , 16668 , IGT_778899.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_778899_cjp ); %runem( 889900 , 17002 , igt_889900.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_889900_eew ); %runem( 990011 , 16510 , IGT_990011.txt.txt , c:\temp\IASUG Presentations\Automagic SAS Tricks\de_identified_data\n4_990011_cjp );
Using file lists to automate import Gambling Task V.2.0 c2002 By: Antoine Bechara Department of Neurology University of Iowa Based on the Gambling Task as published in: Bechara, A., Damasio, A. R., Damasio, H., & Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition, 50, 7-15. Bechara, A., Tranel, D., & Damasio, H. (2000). Characterization of the decision-making impairment of patients with bilateral lesions of the ventromedial prefrontal cortex. Brain, 123, 2189-2202. Trial Deck Win Lose Score Borrow Time(ms) 1 A' 100 0 2100 2000 159447 2 A' 120 0 2220 2000 1646 3 A' 80 -150 2150 2000 676 4 A' 90 0 2240 2000 1583 5 A' 110 -300 2050 2000 831 6 B' 100 0 2150 2000 3973 7 B' 80 0 2230 2000 836 8 B' 110 0 2340 2000 103 9 B' 120 0 2460 2000 178 10 B' 90 0 2550 2000 243 11 B' 100 0 2650 2000 2121 12 B' 90 0 2740 2000 1799 Now we can cycle through each file and import the raw data with the same input statement. LIBNAME RAW "&whatpath\rawsas"; run; %macro runem(whatid, whatdate, whatfile, whatpath); DATA raw.IGT_&whatid; IND_ID = &whatid; infile "&whatpath\&whatfile" firstobs=23; IGT_DATE=&whatdate; FORMAT IGT_DATE MMDDYY10.; input Trial Deck $ Win Lose Score Borrow Time_ms; run; %mend runem; %include"&whatpath\igtlist1.sas"; run; quit;
Using file lists to automate import When all raw data files are ‘SASified’ we can use the next ‘call-list’ file to complete the file names needed in the SET statement. This is followed by %including the program that summarizes the data. libname there "&therepth"; run; data there.igt_data; set %include"&whatpath\igtlist2.sas"; ; run; %include"&progpath\summarize_igt_example.sas"; run;
Using file lists to automate import In the end we have each line of raw data for each subject in one file, and a summary record for each subject in another file.
Review I hope this spoiled some of the “magic” of code generation / list processing using SAS functions and macro. There are many ways of doing the same thing in SAS. This is just one method based on my own experiences trying to manage data in a dynamic environment. Areas I intend to study and practice for similar tricks are: • SAS/SCL • PROC SQL