Explore effective strategies for debugging SAS Macros to identify and resolve errors efficiently for accurate statistical analysis. Discover useful techniques and system options for error tracking and debugging in SAS programming.
SAS Macro: Some Tips for Debugging
Stat Talk @ St. Paul’s Hospital, April 2, 2007

When is a SAS macro usually written?
• The same analyses are repeated for a number of variables
• The same logic search is to be performed on a number of variables
• Reports are to be generated on a regular basis
• Value(s) are passed from one DATA step to another
• …

How to find the error(s)?
• Log file
• Results obtained differ from those expected
• Via MPRINT (i.e. options mprint;)
• Any other ways?

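One further way, sketched here as an illustration rather than as part of the original talk, is to write intermediate values to the log yourself with %PUT. The macro name demo and its body are hypothetical; %PUT and the _LOCAL_ keyword are standard macro language features.

```sas
/* Hypothetical macro: print intermediate macro variable values to the log */
%macro demo(varlist);
    %local i var;
    %let i = 2;
    %let var = %scan(&varlist, &i, %str( ));

    /* A labelled message makes the line easy to find in a long log */
    %put NOTE: [demo] i=&i var=&var;

    /* List every macro variable local to this macro, with its value */
    %put _local_;
%mend demo;

%demo(age wt ht)
```

The first %PUT should show i=2 and var=wt. %PUT _USER_ and %PUT _ALL_ work the same way for wider scopes.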
Some extra SAS system options:
• SYMBOLGEN
  • shows the value that each macro variable resolves to
• MLOGIC
  • keeps track of the parameter values, the logic that drives %DO loops, and %IF logic checks
• MFILE
  • similar to MPRINT; this option writes the resolved macro code (proper SAS code) to a file

Example 1

    data stattalk;
      input id:$2. age wt ht;
      cards;
    1 60 75 180
    2 25 55 165
    3 45 80 170
    ;
    run;

This macro computes the mean of each continuous variable using PROC MEANS:
* varlist: list of continuous variables for computation
* nvar: total number of variables for computation

    %macro getave(varlist,nvar);
      %do i=1 %to &nvar;
        %let var=%scan(&varlist,&i,' ');
        proc means data=stattalk noprint;
          var &var;
          output out=&var.out mean=mean;
        run;
      %end;
    %mend getave;

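As a side note, the nvar parameter can be avoided by counting the words in the list. The sketch below is a variation and not the macro used in the talk: the name getave2 is hypothetical, and it assumes a SAS release that provides the COUNTW function (SAS 9.2 or later). It also declares its working variables %LOCAL and uses %STR( ) as the delimiter, so that quotation marks are not treated as delimiters.

```sas
/* Hypothetical variant of getave: derives the number of variables itself */
%macro getave2(varlist);
    %local i var nvar;

    /* Count the blank-delimited words in the list (COUNTW: SAS 9.2+) */
    %let nvar = %sysfunc(countw(&varlist, %str( )));

    %do i = 1 %to &nvar;
        /* Pick the i-th variable name, using a blank as the only delimiter */
        %let var = %scan(&varlist, &i, %str( ));

        proc means data=stattalk noprint;
            var &var;
            output out=&var.out mean=mean;
        run;
    %end;
%mend getave2;

%getave2(age wt ht)
```

Declaring i, var and nvar as %LOCAL keeps the macro from overwriting same-named macro variables that may already exist outside it.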
Log output from SYMBOLGEN

SAS Code:

    options symbolgen;
    %getave(age wt ht,3);

SAS Log File:

    :
    :
    SYMBOLGEN: Macro variable NVAR resolves to 3
    SYMBOLGEN: Macro variable VARLIST resolves to age wt ht
    SYMBOLGEN: Macro variable I resolves to 1
    SYMBOLGEN: Macro variable VAR resolves to age
    SYMBOLGEN: Macro variable VAR resolves to age
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.AGEOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.07 seconds
          cpu time 0.07 seconds
    SYMBOLGEN: Macro variable VARLIST resolves to age wt ht
    SYMBOLGEN: Macro variable I resolves to 2
    SYMBOLGEN: Macro variable VAR resolves to wt
    SYMBOLGEN: Macro variable VAR resolves to wt
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.WTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.07 seconds
          cpu time 0.06 seconds
    SYMBOLGEN: Macro variable VARLIST resolves to age wt ht
    SYMBOLGEN: Macro variable I resolves to 3
    SYMBOLGEN: Macro variable VAR resolves to ht
    SYMBOLGEN: Macro variable VAR resolves to ht
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.HTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.05 seconds

Log output from MLOGIC

SAS Code:

    options mlogic;
    %getave(age wt ht,3);

SAS Log File:

    MLOGIC(GETAVE): Beginning execution.
    MLOGIC(GETAVE): Parameter VARLIST has value age wt ht
    MLOGIC(GETAVE): Parameter NVAR has value 3
    MLOGIC(GETAVE): %DO loop beginning; index variable I; start value is 1; stop value is 3; by value is 1.
    MLOGIC(GETAVE): %LET (variable name is VAR)
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.AGEOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.06 seconds
    MLOGIC(GETAVE): %DO loop index variable I is now 2; loop will iterate again.
    MLOGIC(GETAVE): %LET (variable name is VAR)
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.WTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.06 seconds
    MLOGIC(GETAVE): %DO loop index variable I is now 3; loop will iterate again.
    MLOGIC(GETAVE): %LET (variable name is VAR)
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.HTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.05 seconds
    MLOGIC(GETAVE): %DO loop index variable I is now 4; loop will not iterate again.
    MLOGIC(GETAVE): Ending execution.

Log output from MPRINT

SAS Code:

    options mprint;
    %getave(age wt ht,3);

SAS Log File:

    MPRINT(GETAVE): proc means data=stattalk noprint;
    MPRINT(GETAVE): var age;
    MPRINT(GETAVE): output out=ageout mean=mean;
    MPRINT(GETAVE): run;
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.AGEOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.06 seconds
    MPRINT(GETAVE): proc means data=stattalk noprint;
    MPRINT(GETAVE): var wt;
    MPRINT(GETAVE): output out=wtout mean=mean;
    MPRINT(GETAVE): run;
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.WTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.05 seconds
    MPRINT(GETAVE): proc means data=stattalk noprint;
    MPRINT(GETAVE): var ht;
    MPRINT(GETAVE): output out=htout mean=mean;
    MPRINT(GETAVE): run;
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.HTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.06 seconds

What does MFILE look like?

SAS Code:

    filename mprint "~/mfileoutput.sas";
    options mprint mfile;
    %getave(age wt ht,3);

SAS Log: …same log as in MPRINT…

In addition, you will find a file "mfileoutput.sas" in your (main) directory, and it looks like this:

    proc means data=stattalk noprint;
    var age;
    output out=ageout mean=mean;
    run;
    proc means data=stattalk noprint;
    var wt;
    output out=wtout mean=mean;
    run;
    proc means data=stattalk noprint;
    var ht;
    output out=htout mean=mean;
    run;

Example 2 – MLOGIC output for %IF logic check

SAS Code:

    %macro sillyeg(catvar);
      %if &catvar=1 %then %do;
        %put This is a categorical variable;
      %end;
      %else %if &catvar=0 %then %do;
        %put This is not a categorical variable;
      %end;
    %mend sillyeg;
    %sillyeg(0);

SAS Log:

    %sillyeg(0);
    MLOGIC(SILLYEG): Beginning execution.
    MLOGIC(SILLYEG): Parameter CATVAR has value 0
    MLOGIC(SILLYEG): %IF condition &catvar=1 is FALSE
    MLOGIC(SILLYEG): %IF condition &catvar=0 is TRUE
    MLOGIC(SILLYEG): %PUT This is not a categorical variable
    This is not a categorical variable
    MLOGIC(SILLYEG): Ending execution.
    NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
    NOTE: The SAS System used:
          real time 0.35 seconds
          cpu time 0.26 seconds

Example 3 – combination of different options can be helpful!
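The body of this slide did not survive in the text, so the following is only a sketch of what combining the options could look like, and not a reconstruction of the original example. It reuses the %getave macro and the stattalk data set from Example 1.

```sas
/* Turn on all three tracing options for one macro call */
options symbolgen mlogic mprint;

%getave(age wt ht,3)

/* Switch them off again so the rest of the log stays readable */
options nosymbolgen nomlogic nomprint;
```

With all three options active, the log interleaves MLOGIC, SYMBOLGEN and MPRINT lines, which shows the resolved value, the logic decision, and the generated code for the same statement side by side.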
Common Mistakes
• IF or %IF; DO or %DO
• MISSING vs. NULL VALUE
• SCAN, %SCAN, %QSCAN
• SUBSTR, %SUBSTR, %QSUBSTR
• %STR, %NRSTR, %BQUOTE, %NRBQUOTE
• Doing math in the MACRO environment
• Range comparison

IF or %IF; DO or %DO
• %IF (and %DO) can only be used within a MACRO declaration, to control what code is written or how the logic is evaluated within the MACRO.
• IF (and DO) statements can be used in a MACRO, but they are executed as part of the DATA step code that the MACRO generates.

Example 4

Dataset:

    data stattalk;
      input id:$2. age wt ht;
      cards;
    1 60 75 180
    2 25 55 165
    3 45 80 170
    ;
    run;

SAS code:

    %macro whatif(condition=gt 50);
      data subset;
        set stattalk;
        %if age &condition %then output;;
      run;
      proc print data=subset;
      run;
    %mend whatif;
    %whatif;

SAS output:

    Obs  id  age  wt  ht
     1   1    60  75  180
     2   2    25  55  165
     3   3    45  80  170

• Why did we get such incorrect output??
• Macro code is ALWAYS executed before the DATA step is even compiled
• AGE in the %IF is not seen as a DATA step variable, but rather as the letters a-g-e
• The %IF therefore compares the text "age" with the text "50". In the ASCII collating sequence digits sort before letters, so the letter ‘a’ comes after ‘5’, the condition is always true, and an unconditional OUTPUT statement is generated for every observation.

So, an example where both IF and %IF are used in a MACRO…

SAS code:

    %macro ifagain(condition=gt 30, print=1);
      data subset;
        set stattalk;
        if age &condition then output;
      run;
      proc means data=subset %if &print^=1 %then noprint;;
        var age;
        output out=subset_out mean=mean std=sd;
      run;
      %if &print>=1 %then %do;
        proc print data=subset_out;
        run;
      %end;
    %mend ifagain;
    %ifagain;

Missing vs. NULL
• In the DATA step, there is no such thing as a truly NULL value.
• A character or numeric variable has a “value” for missing: a single blank space or a period, respectively.
• E.g.) if sex=' ' then delete; if age=. then delete;
• In the MACRO language, there are no characters used to represent a missing value. So when a MACRO variable is NULL, it truly has no value.
• E.g.) %if &age=. %then %do; – WRONG!!
        %if &gender=" " %then %do; – WRONG!!

Three ways to specify NULL in the logic check:

Method 1:

    %macro sillycheck(age=);
      %if &age= %then %do;
        %put It works;
      %end;
      %else %do;
        %put It did not work;
      %end;
    %mend sillycheck;

Method 2:

    %macro sillycheck(age=);
      %if "&age"="" %then %do;
        %put It works;
      %end;
      %else %do;
        %put It did not work;
      %end;
    %mend sillycheck;

Method 3:

    %macro sillycheck(age=);
      %if &age=%str() %then %do;
        %put It works;
      %end;
      %else %do;
        %put It did not work;
      %end;
    %mend sillycheck;

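A further option, not among the three methods on the slide, is to test the length of the value with %LENGTH, which returns 0 for a null macro variable. The macro name nullcheck below is hypothetical and serves only as a sketch.

```sas
/* Hypothetical macro: detect a null parameter by its length */
%macro nullcheck(age=);
    %if %length(&age) = 0 %then %put age is null;
    %else %put age=&age;
%mend nullcheck;

%nullcheck()          /* writes: age is null */
%nullcheck(age=50)    /* writes: age=50      */
```

Because the comparison is between two numbers, this form sidesteps the question of how an empty operand is interpreted in the %IF expression.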
A side remark: In MACRO language, everything is TEXT!

SAS code (quotes around F: no match, because the quotation marks are part of the text being compared):

    %macro sillyeg(age=50,sex=F);
      %if &age=50 %then %do;
        %put Patient is 50 years old;
      %end;
      %if &sex="F" %then %do;
        %put Female patient;
      %end;
    %mend sillyeg;
    %sillyeg;

SAS LOG:

    SYMBOLGEN: Macro variable AGE resolves to 50
    Patient is 50 years old
    SYMBOLGEN: Macro variable SEX resolves to F
    NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
    NOTE: The SAS System used:
          real time 0.52 seconds
          cpu time 0.41 seconds

SAS code (no quotes around F: match):

    %macro sillyeg(age=50,sex=F);
      %if &age=50 %then %do;
        %put Patient is 50 years old;
      %end;
      %if &sex=F %then %do;
        %put Female patient;
      %end;
    %mend sillyeg;
    %sillyeg;

SAS LOG:

    SYMBOLGEN: Macro variable AGE resolves to 50
    Patient is 50 years old
    SYMBOLGEN: Macro variable SEX resolves to F
    Female patient
    NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
    NOTE: The SAS System used:
          real time 0.52 seconds
          cpu time 0.41 seconds

SCAN, %SCAN, %QSCAN

In DATA step:

    data example;
      string="XYZ,A*BC&HOS";
      word1=scan(string,2);
      word2=scan(string,2,',');
    run;

SAS output:

    Obs  string        word1  word2
     1   XYZ,A*BC&HOS  A      A*BC&HOS

In MACRO:

    %let hos=SPH;
    %let string=%nrstr(XYZ,A*BC&HOS);
    %let word1=%scan(&string,2);
    %let word2=%scan(&string,2,%str(,));
    %let word3=%qscan(&string,2,%str(,));
    %put word1=&word1;
    %put word2=&word2;
    %put word3=&word3;

SAS Log:

    word1=A
    word2=A*BCSPH
    word3=A*BC&HOS

• %scan DOES NOT mask & (and %) as regular text, so &HOS in its result is resolved
• %qscan masks & (and %) as regular text

SUBSTR, %SUBSTR, %QSUBSTR

SAS Code:

    %let stuff = clinics;
    %let string=%nrstr(*&stuff*&dsn*&morestuff);
    %let word1=%substr(&string,2,7);
    %let word2=%qsubstr(&string,2,7);
    %put word1=&word1;
    %put word2=&word2;

SAS Log:

    word1=clinics*
    word2=&stuff*

• Syntax for %SUBSTR and %QSUBSTR is exactly the same as for SUBSTR in the DATA step
• The difference between %SUBSTR and %QSUBSTR:
  • %SUBSTR does not mask & (and %) as part of the text
  • %QSUBSTR treats & (and %) as part of the text

Macro Quoting Functions
• The macro language is a character-based language, and some special characters (e.g. % & ;) and mnemonics (e.g. GE AND LE OR) have a meaning in it
• Macro quoting functions tell the macro processor to interpret special characters/mnemonics simply as text
• The special characters/mnemonics that might require masking are: blank ; ^ ~ , ' " ) ( + - * / < > = | AND OR NOT EQ NE LE LT GE GT IN % & #
• The most commonly used macro quoting functions are: %STR, %NRSTR, %BQUOTE, %NRBQUOTE, %SUPERQ
• Two types of macro quoting functions:
  a) Compilation functions – the processor masks the special characters as text in open code or while compiling a macro. E.g. %STR, %NRSTR
  b) Execution functions – the processor first resolves a macro expression and then masks the special characters in the result as text. E.g. %QUOTE, %NRQUOTE, %BQUOTE, %NRBQUOTE

Example 5

    %macro fileit(infile);
      %if %bquote(&infile) NE %then %do;
        %let char1 = %bquote(%substr(&infile,1,1));
        %if %bquote(&char1) = %str(%') or %bquote(&char1) = %str(%") %then
          %let command=FILE &infile;
        %else %let command=FILE "&infile";
      %end;
      %put &command;
    %mend fileit;
    %fileit('stattalk.sas')

• %bquote is used to quote the resolved value of a macro variable or expression
• %str is used to quote a constant value (i.e. the right side of the logic check)
• An unmatched single or double quotation mark, or an unmatched parenthesis, must always be preceded by % in %str, but there is no need to add % in %bquote (B=by itself)

Example 5 (continued)

    data test;
      store="Susan's Office Supplies";
      call symput('s',store);
    run;

    %macro readit;
      %if %bquote(&s) ne %then %put *** valid ***;
      %else %put *** null value ***;
    %mend readit;
    %readit;

• If you change %BQUOTE to %STR, you will get an error message! Try it…

Example 6

SAS Code:

    options ps=36 ls=69 nocenter;
    data _null_;
      call symput('authors','Smith&Jones');
      call symput('macroname','%macro test;');
    run;
    %let aa=SPH;
    %let jones=%nrstr(&aa);
    title1 "Authors1: %SUPERQ(authors)";
    title2 "Authors2: %NRSTR(&authors)";
    title3 "Authors3: %NRBQUOTE(&authors)";
    title4 "Authors4: %UNQUOTE(%NRBQUOTE(&authors))";
    footnote1 "Name of Macro: %SUPERQ(macroname)";

SAS Output:

    Authors1: Smith&Jones
    Authors2: &authors
    Authors3: Smith&aa
    Authors4: SmithSPH
    Name of Macro: %macro test;

• %NRSTR – masks & as part of the text during compilation
• %NRBQUOTE – resolves the macro variable during execution; if the result contains &, it is treated as part of the text
• NR = Not Resolved

Doing Math in the Macro Language
• %EVAL and %SYSEVALF allow the language to handle arithmetic operations
• %EVAL: only for integer arithmetic
• %SYSEVALF: for non-integer arithmetic (e.g. 1.0, .3, 2.)
• Error message if %SYSEVALF should be used instead of %EVAL:

Example 7

    %let x=5;
    %let y=&x+1;
    %let z=%EVAL(&x+1);
    %let w=%SYSEVALF(&x+1.8);
    %put &x &y &z &w;

The %PUT writes the following to the LOG:

    5 5+1 6 6.8

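The difference between the two functions is easiest to see with division. The lines below are a small additional sketch, not taken from the talk; the values are arbitrary.

```sas
/* %EVAL does integer arithmetic: the fractional part is dropped */
%put %eval(7/2);              /* writes 3   */

/* %SYSEVALF does floating-point arithmetic */
%put %sysevalf(7/2);          /* writes 3.5 */

/* An optional second argument converts the result */
%put %sysevalf(7/2, ceil);    /* writes 4   */
%put %sysevalf(7/2, floor);   /* writes 3   */
```

Note that %IF conditions and %DO loop bounds are evaluated with %EVAL implicitly, so non-integer values there need an explicit %SYSEVALF.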
Range Comparisons ????

SAS Code (DATA step):

    data _null_;
      do val=-10,-2,2,10;
        if -5 le val le 0 then do;
          put val " is in the negative range (-5 to 0)";
        end;
        else if 1 le val le 5 then do;
          put val " is in the positive range (1 to 5)";
        end;
        else put val " is WAY out of range";
      end;
    run;

SAS Log:

    -10 is WAY out of range
    -2 is in the negative range (-5 to 0)
    2 is in the positive range (1 to 5)
    10 is WAY out of range
    NOTE: DATA statement used (Total process time):
          real time 0.00 seconds
          cpu time 0.00 seconds

SAS Code (MACRO):

    %macro checkit(val=);
      %if -5 le &val le 0 %then
        %put &val is in the negative range (-5 to 0);
      %else %if 1 le &val le 5 %then
        %put &val is in the positive range (1 to 5);
      %else %put &val is WAY out of range;
    %mend checkit;
    %checkit(val=-10);
    %checkit(val=-2);
    %checkit(val=2);
    %checkit(val=10);

SAS Log:

    182 %checkit(val=-10);
    -10 is in the negative range (-5 to 0)
    183 %checkit(val=-2);
    -2 is in the positive range (1 to 5)
    184 %checkit(val=2);
    2 is in the positive range (1 to 5)
    185 %checkit(val=10);
    10 is in the positive range (1 to 5)

• In the DATA step:

    if -5 le val le 0 then do;

  is interpreted as

    if -5 le val and val le 0 then do;

• In the Macro Language:

    %if -5 le &val le 0 %then %put &val is in the negative range (-5 to 0);

  is interpreted as

    %if (-5 le &val) le 0 %then %put &val is in the negative range (-5 to 0);

  So, if &val=-10, the %if becomes

    %if (-5 le -10) le 0 %then …

  The comparison first checks whether -5 is less than or equal to -10. Since this is FALSE, a zero is returned, and the expression becomes

    %if 0 le 0 %then …;

  This comparison is true, and hence “-10 is in the negative range (-5 to 0)” is printed in the LOG file.
• In summary, for range comparison in the Macro Language, always use a compound expression (e.g. -5 le &val AND &val le 0)

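Applying that advice to the earlier %checkit macro gives the sketch below. The name checkit2 is hypothetical; the parentheses are not strictly required but make the intended grouping explicit.

```sas
/* Hypothetical corrected version: each range test is a compound expression */
%macro checkit2(val=);
    %if (-5 le &val) and (&val le 0) %then
        %put &val is in the negative range (-5 to 0);
    %else %if (1 le &val) and (&val le 5) %then
        %put &val is in the positive range (1 to 5);
    %else %put &val is WAY out of range;
%mend checkit2;

%checkit2(val=-10)   /* -10 is WAY out of range               */
%checkit2(val=-2)    /* -2 is in the negative range (-5 to 0) */
%checkit2(val=2)     /* 2 is in the positive range (1 to 5)   */
%checkit2(val=10)    /* 10 is WAY out of range                */
```

The results now agree with the DATA step version. The comparisons are integer comparisons because %IF uses %EVAL; non-integer values of val would need %SYSEVALF around each condition.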