SAS Macro: Some Tips for Debugging. Stat Talk @ St. Paul’s Hospital, April 2, 2007.
When is a SAS macro usually written?
• the same analyses are repeated for a number of variables
• the same logic search is to be performed on a number of variables
• reports are to be generated on a regular basis
• value(s) are passed from one DATA step to another
• …
How to find the error(s)?
• Log file
• Results obtained differ from those expected
• Via "MPRINT" (i.e. options mprint;)
- Any other ways?
Some extra SAS system options:
• SYMBOLGEN
  • shows the value that each macro variable resolves to
• MLOGIC
  • keeps track of the parameter values and of the logic that drives %DO loops and %IF logic checks
• MFILE
  • similar to MPRINT; this option writes the resolved macro code (proper SAS code) to a file
Example 1

    data stattalk;
      input id:$2. age wt ht;
      cards;
    1 60 75 180
    2 25 55 165
    3 45 80 170
    ;
    run;

This macro computes the mean of each continuous variable with PROC MEANS:

    /* varlist: list of continuous variables for computation */
    /* nvar:    total number of variables for computation    */
    %macro getave(varlist,nvar);
      %do i=1 %to &nvar;
        /* pick the i-th name; %str( ) makes a blank the only delimiter */
        %let var=%scan(&varlist,&i,%str( ));
        proc means data=stattalk noprint;
          var &var;
          output out=&var.out mean=mean;
        run;
      %end;
    %mend getave;
Log output from SYMBOLGEN

SAS Code:

    options symbolgen;
    %getave(age wt ht,3);

SAS Log File:

    :
    :
    SYMBOLGEN: Macro variable NVAR resolves to 3
    SYMBOLGEN: Macro variable VARLIST resolves to age wt ht
    SYMBOLGEN: Macro variable I resolves to 1
    SYMBOLGEN: Macro variable VAR resolves to age
    SYMBOLGEN: Macro variable VAR resolves to age
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.AGEOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.07 seconds
          cpu time 0.07 seconds
    SYMBOLGEN: Macro variable VARLIST resolves to age wt ht
    SYMBOLGEN: Macro variable I resolves to 2
    SYMBOLGEN: Macro variable VAR resolves to wt
    SYMBOLGEN: Macro variable VAR resolves to wt
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.WTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.07 seconds
          cpu time 0.06 seconds
    SYMBOLGEN: Macro variable VARLIST resolves to age wt ht
    SYMBOLGEN: Macro variable I resolves to 3
    SYMBOLGEN: Macro variable VAR resolves to ht
    SYMBOLGEN: Macro variable VAR resolves to ht
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.HTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.05 seconds
Log output from MLOGIC

SAS Code:

    options mlogic;
    %getave(age wt ht,3);

SAS Log File:

    MLOGIC(GETAVE): Beginning execution.
    MLOGIC(GETAVE): Parameter VARLIST has value age wt ht
    MLOGIC(GETAVE): Parameter NVAR has value 3
    MLOGIC(GETAVE): %DO loop beginning; index variable I; start value is 1; stop value is 3; by value is 1.
    MLOGIC(GETAVE): %LET (variable name is VAR)
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.AGEOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.06 seconds
    MLOGIC(GETAVE): %DO loop index variable I is now 2; loop will iterate again.
    MLOGIC(GETAVE): %LET (variable name is VAR)
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.WTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.06 seconds
    MLOGIC(GETAVE): %DO loop index variable I is now 3; loop will iterate again.
    MLOGIC(GETAVE): %LET (variable name is VAR)
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.HTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.05 seconds
    MLOGIC(GETAVE): %DO loop index variable I is now 4; loop will not iterate again.
    MLOGIC(GETAVE): Ending execution.
Log output from MPRINT

SAS Code:

    options mprint;
    %getave(age wt ht,3);

SAS Log File:

    MPRINT(GETAVE): proc means data=stattalk noprint;
    MPRINT(GETAVE): var age;
    MPRINT(GETAVE): output out=ageout mean=mean;
    MPRINT(GETAVE): run;
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.AGEOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.06 seconds
    MPRINT(GETAVE): proc means data=stattalk noprint;
    MPRINT(GETAVE): var wt;
    MPRINT(GETAVE): output out=wtout mean=mean;
    MPRINT(GETAVE): run;
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.WTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.05 seconds
    MPRINT(GETAVE): proc means data=stattalk noprint;
    MPRINT(GETAVE): var ht;
    MPRINT(GETAVE): output out=htout mean=mean;
    MPRINT(GETAVE): run;
    NOTE: There were 3 observations read from the data set WORK.STATTALK.
    NOTE: The data set WORK.HTOUT has 1 observations and 3 variables.
    NOTE: PROCEDURE MEANS used (Total process time):
          real time 0.06 seconds
          cpu time 0.06 seconds
What does MFILE look like?

SAS Code:

    filename mprint "~/mfileoutput.sas";
    options mprint mfile;
    %getave(age wt ht,3);

SAS Log: …same log as in MPRINT…

But you will also find a file "mfileoutput.sas" in your (main) directory! It looks like this:

    proc means data=stattalk noprint;
    var age;
    output out=ageout mean=mean;
    run;
    proc means data=stattalk noprint;
    var wt;
    output out=wtout mean=mean;
    run;
    proc means data=stattalk noprint;
    var ht;
    output out=htout mean=mean;
    run;
Example 2 – MLOGIC output for %IF logic check

SAS Code:

    %macro sillyeg(catvar);
      %if &catvar=1 %then %do;
        %put This is a categorical variable;
      %end;
      %else %if &catvar=0 %then %do;
        %put This is not a categorical variable;
      %end;
    %mend sillyeg;
    %sillyeg(0);

SAS Log:

    %sillyeg(0);
    MLOGIC(SILLYEG): Beginning execution.
    MLOGIC(SILLYEG): Parameter CATVAR has value 0
    MLOGIC(SILLYEG): %IF condition &catvar=1 is FALSE
    MLOGIC(SILLYEG): %IF condition &catvar=0 is TRUE
    MLOGIC(SILLYEG): %PUT This is not a categorical variable
    This is not a categorical variable
    MLOGIC(SILLYEG): Ending execution.
    NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
    NOTE: The SAS System used:
          real time 0.35 seconds
          cpu time 0.26 seconds
Example 3 – combination of different options can be helpful!
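As a rough illustration of combining the options, the sketch below switches on MPRINT, MLOGIC and SYMBOLGEN for a single call and then switches them off again so the rest of the log stays readable. It assumes the stattalk data set and the %getave macro from Example 1 have already been submitted in the same session.

```sas
/* Trace one macro call with all three options, then turn tracing off. */
/* Assumes STATTALK and %getave from Example 1 already exist.          */
options mprint mlogic symbolgen;

%getave(age wt ht,3);

/* The NO-prefixed forms switch each option back off. */
options nomprint nomlogic nosymbolgen;
```

With all three on, the log interleaves the generated code (MPRINT), the loop and parameter trace (MLOGIC) and each macro variable resolution (SYMBOLGEN), which can get long for larger macros.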
Common Mistakes
• IF or %IF; DO or %DO
• MISSING vs. NULL VALUE
• SCAN, %SCAN, %QSCAN
• SUBSTR, %SUBSTR, %QSUBSTR
• %STR, %NRSTR, %BQUOTE, %NRBQUOTE
• Doing math in MACRO environment
• Range comparison
IF or %IF; DO or %DO
• %IF (and %DO) can only be used within a MACRO declaration, to control what code is written or how the logic is evaluated within the MACRO.
• IF (and DO) statements can be used in a MACRO, but they are executed as part of the DATA step code within the MACRO.
Example 4

Dataset:

    data stattalk;
      input id:$2. age wt ht;
      cards;
    1 60 75 180
    2 25 55 165
    3 45 80 170
    ;
    run;

SAS code:

    %macro whatif(condition=gt 50);
      data subset;
        set stattalk;
        /* first ; ends the %IF statement, second ; ends the generated OUTPUT statement */
        %if age &condition %then output;;
      run;
      proc print data=subset;
      run;
    %mend whatif;
    %whatif;

SAS output:

    Obs  id  age  wt   ht
      1  1    60  75  180
      2  2    25  55  165
      3  3    45  80  170

• Why did we get such incorrect output??
• Macro code is ALWAYS executed before the DATA step is even compiled.
• AGE in the %IF is not seen as a DATA step variable, but rather as the letters a-g-e.
• Because "age" is not a number, %IF compares the text "age" with the text "50". Digits sort before letters, so "age" comes after "50", the condition is always true, and a plain OUTPUT statement is generated for every observation.
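The text comparison can be checked directly with %EVAL, which applies the same rules as a %IF condition. This is only a small sketch; the sort order of digits and letters shown in the comments assumes an ASCII system (on EBCDIC hosts letters sort before digits, so the first result may differ).

```sas
/* "age" is text, so this is a character comparison, not a test on the */
/* DATA step variable AGE. On ASCII systems it writes 1 (true).        */
%put %eval(age gt 50);

/* Both operands are integers, so this is a numeric comparison: 1.     */
%put %eval(60 gt 50);

/* Numeric comparison that is false: 0.                                */
%put %eval(25 gt 50);
```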
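So, an example where both IF and %IF are used in a MACRO….

SAS code:

    %macro ifagain(condition=gt 30, print=1);
      data subset;
        set stattalk;
        /* DATA step IF: evaluated for each observation */
        if age &condition then output;
      run;
      /* macro %IF: decides whether NOPRINT is written into the PROC statement */
      proc means data=subset %if &print^=1 %then noprint;;
        var age;
        output out=subset_out mean=mean std=sd;
      run;
      /* macro %IF: decides whether the PROC PRINT step is generated at all */
      %if &print>=1 %then %do;
        proc print data=subset_out;
        run;
      %end;
    %mend ifagain;
    %ifagain;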
Missing vs. NULL
• In the DATA step, there is no such thing as a truly NULL value.
• A character or numeric variable has a "value" for missing: a single blank space or a period, respectively.
• E.g. if sex=' ' then delete; if age=. then delete;
• In the MACRO language, no characters are used to represent a missing value. So when a MACRO variable is NULL, it truly has no value.
• E.g. %if &age=. %then %do; (WRONG!!)
       %if &gender=" " %then %do; (WRONG!!)
3 ways to specify NULL in the logic check:

Method 1:

    %macro sillycheck(age=);
      %if &age= %then %do;
        %put It works;
      %end;
      %else %do;
        %put It did not work;
      %end;
    %mend sillycheck;

Method 2:

    %macro sillycheck(age=);
      %if "&age"="" %then %do;
        %put It works;
      %end;
      %else %do;
        %put It did not work;
      %end;
    %mend sillycheck;

Method 3:

    %macro sillycheck(age=);
      %if &age=%str() %then %do;
        %put It works;
      %end;
      %else %do;
        %put It did not work;
      %end;
    %mend sillycheck;
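A further variant, not taken from the slides, is to test the length of the value with %LENGTH, which returns 0 for a null macro variable. The macro name nullcheck below is hypothetical, and the sketch assumes the value contains no commas or other special characters that would need quoting.

```sas
/* Hypothetical helper: reports whether the AGE parameter is null. */
%macro nullcheck(age=);
  %if %length(&age)=0 %then %put age is null;
  %else %put age=&age;
%mend nullcheck;

%nullcheck()          /* writes: age is null */
%nullcheck(age=50)    /* writes: age=50      */
```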
A side remark: In MACRO language, everything is TEXT!

SAS code (with quotation marks around F):

    %macro sillyeg(age=50,sex=F);
      %if &age=50 %then %do;
        %put Patient is 50 years old;
      %end;
      %if &sex="F" %then %do;
        %put Female patient;
      %end;
    %mend sillyeg;
    %sillyeg;

SAS LOG:

    SYMBOLGEN: Macro variable AGE resolves to 50
    Patient is 50 years old
    SYMBOLGEN: Macro variable SEX resolves to F
    NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
    NOTE: The SAS System used:
          real time 0.52 seconds
          cpu time 0.41 seconds

The quotation marks are part of the text being compared, so F is not equal to "F" and "Female patient" is never printed.

SAS code (without quotation marks):

    %macro sillyeg(age=50,sex=F);
      %if &age=50 %then %do;
        %put Patient is 50 years old;
      %end;
      %if &sex=F %then %do;
        %put Female patient;
      %end;
    %mend sillyeg;
    %sillyeg;

SAS LOG:

    SYMBOLGEN: Macro variable AGE resolves to 50
    Patient is 50 years old
    SYMBOLGEN: Macro variable SEX resolves to F
    Female patient
    NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
    NOTE: The SAS System used:
          real time 0.52 seconds
          cpu time 0.41 seconds
SCAN, %SCAN, %QSCAN

In DATA step:

    data example;
      /* single quotes keep &HOS from being treated as a macro reference */
      string='XYZ,A*BC&HOS';
      word1=scan(string,2);
      word2=scan(string,2,',');
    run;

SAS output:

    Obs  string        word1  word2
      1  XYZ,A*BC&HOS  A      A*BC&HOS

In MACRO:

    %let hos=SPH;
    %let string=%nrstr(XYZ,A*BC&HOS);
    %let word1=%scan(&string,2);
    %let word2=%scan(&string,2,%str(,));
    %let word3=%qscan(&string,2,%str(,));
    %put word1=&word1;
    %put word2=&word2;
    %put word3=&word3;

SAS Log:

    word1=A
    word2=A*BCSPH
    word3=A*BC&HOS

• %scan DOES NOT mask & (and %) as regular text, so &HOS in its result is resolved
• %qscan masks & (and %) as regular text
SUBSTR, %SUBSTR, %QSUBSTR

SAS Code:

    %let stuff = clinics;
    %let string=%nrstr(*&stuff*&dsn*&morestuff);
    %let word1=%substr(&string,2,7);
    %let word2=%qsubstr(&string,2,7);
    %put word1=&word1;
    %put word2=&word2;

SAS Log:

    word1=clinics*
    word2=&stuff*

• The syntax for %SUBSTR and %QSUBSTR is exactly the same as for SUBSTR in the DATA step
• The difference between %SUBSTR and %QSUBSTR:
  • %SUBSTR does not mask & (and %) as part of the text
  • %QSUBSTR treats & (and %) as part of the text
Macro Quoting Functions
• The macro language is a character-based language, and its own syntax uses some special characters (e.g. % & ;) and mnemonics (e.g. GE AND LE OR)
• Macro quoting functions tell the macro processor to interpret special characters/mnemonics simply as text
• The special characters/mnemonics that might require masking are:
  blank ; ^ ~ , ' " ) ( + - * / < > = | AND OR NOT EQ NE LE LT GE GT IN % & #
• The most commonly used macro quoting functions are: %STR, %NRSTR, %BQUOTE, %NRBQUOTE, %SUPERQ
• Two types of macro quoting functions:
  a) Compilation functions: the processor masks the special characters as text in open code or while compiling a macro. E.g. %STR, %NRSTR
  b) Execution functions: the processor first resolves a macro expression and then masks the special characters in the result as text. E.g. %QUOTE, %NRQUOTE, %BQUOTE, %NRBQUOTE
Example 5

    %macro fileit(infile);
      %if %bquote(&infile) NE %then %do;
        %let char1 = %bquote(%substr(&infile,1,1));
        %if %bquote(&char1) = %str(%') or %bquote(&char1) = %str(%") %then
          %let command=FILE &infile;
        %else
          %let command=FILE "&infile";
      %end;
      %put &command;
    %mend fileit;
    %fileit('stattalk.sas')

• %bquote is used to quote the resolved value of a macro variable or expression
• %str is used to quote a constant value (i.e. the right side of the logic check)
• An unmatched single or double quotation mark, or an unmatched parenthesis, must always be preceded by % inside %str, but no % is needed inside %bquote (B = by itself)
Example 5 (continued)

    data test;
      store="Susan's Office Supplies";
      call symput('s',store);
    run;

    %macro readit;
      %if %bquote(&s) ne %then %put *** valid ***;
      %else %put *** null value ***;
    %mend readit;
    %readit;

- If you change %BQUOTE to %STR, you will get an error message, because the resolved value contains an unmatched apostrophe! Try it…
Example 6

SAS Code:

    options ps=36 ls=69 nocenter;
    data _null_;
      call symput('authors','Smith&Jones');
      call symput('macroname','%macro test;');
    run;
    %let aa=SPH;
    %let jones=%nrstr(&aa);
    title1 "Authors1: %SUPERQ(authors)";
    title2 "Authors2: %NRSTR(&authors)";
    title3 "Authors3: %NRBQUOTE(&authors)";
    title4 "Authors4: %UNQUOTE(%NRBQUOTE(&authors))";
    footnote1 "Name of Macro: %SUPERQ(macroname)";

SAS Output:

    Authors1: Smith&Jones
    Authors2: &authors
    Authors3: Smith&aa
    Authors4: SmithSPH
    Name of Macro: %macro test;

• %NRSTR: masks & as part of the text during compilation
• %NRBQUOTE: resolves the macro variable during execution; if the result contains &, it is treated as part of the text
• NR = Not Resolved
Doing Math in the Macro Language
• %EVAL and %SYSEVALF allow the macro language to handle arithmetic operations
• %EVAL: only for integer arithmetic
• %SYSEVALF: for non-integer arithmetic (e.g. 1.0, .3, 2.)
• Error message if %SYSEVALF should be used instead of %EVAL:
Example 7

    %let x=5;
    %let y=&x+1;
    %let z=%EVAL(&x+1);
    %let w=%SYSEVALF(&x+1.8);
    %put &x &y &z &w;

The %PUT writes the following to the LOG:

    5 5+1 6 6.8
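To make the integer versus non-integer distinction concrete, here is a small sketch comparing %EVAL and %SYSEVALF on a division. The failing %EVAL call is left commented out; when run, it typically produces an error stating that a character operand was found where a numeric operand is required, because %EVAL does not recognise 1.8 as a number.

```sas
%let x=5;

/* %EVAL does integer arithmetic only: the fractional part is dropped. */
%put %eval(&x/2);                 /* 2   */

/* %SYSEVALF does floating-point arithmetic. */
%put %sysevalf(&x/2);             /* 2.5 */

/* Optional second argument converts the result. */
%put %sysevalf(&x/2, ceil);       /* 3   */
%put %sysevalf(&x/2, integer);    /* 2   */

/* Uncomment to see the %EVAL error caused by a non-integer operand: */
/* %put %eval(&x+1.8); */
```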
Range Comparisons ????

SAS Code (DATA step):

    data _null_;
      do val=-10,-2,2,10;
        if -5 le val le 0 then do;
          put val " is in the negative range (-5 to 0)";
        end;
        else if 1 le val le 5 then do;
          put val " is in the positive range (1 to 5)";
        end;
        else put val " is WAY out of range";
      end;
    run;

SAS Log:

    -10 is WAY out of range
    -2 is in the negative range (-5 to 0)
    2 is in the positive range (1 to 5)
    10 is WAY out of range
    NOTE: DATA statement used (Total process time):
          real time 0.00 seconds
          cpu time 0.00 seconds

SAS Code (macro):

    %macro checkit(val=);
      %if -5 le &val le 0 %then
        %put &val is in the negative range (-5 to 0);
      %else %if 1 le &val le 5 %then
        %put &val is in the positive range (1 to 5);
      %else %put &val is WAY out of range;
    %mend checkit;
    %checkit(val=-10);
    %checkit(val=-2);
    %checkit(val=2);
    %checkit(val=10);

SAS Log:

    182 %checkit(val=-10);
    -10 is in the negative range (-5 to 0)
    183 %checkit(val=-2);
    -2 is in the positive range (1 to 5)
    184 %checkit(val=2);
    2 is in the positive range (1 to 5)
    185 %checkit(val=10);
    10 is in the positive range (1 to 5)
• In the DATA step:
    if -5 le val le 0 then do;
  is interpreted as
    if -5 le val and val le 0 then do;
• In the Macro Language:
    %if -5 le &val le 0 %then %put &val is in the negative range (-5 to 0);
  is interpreted as
    %if (-5 le &val) le 0 %then %put &val is in the negative range (-5 to 0);
  So, if &val=-10, the %if becomes
    %if (-5 le -10) le 0 %then …
  The comparison first checks whether -5 is less than or equal to -10. That is FALSE, so a zero is returned and the expression becomes
    %if 0 le 0 %then …;
  This comparison is true, and hence "-10 is in the negative range (-5 to 0)" is printed in the LOG file.
• In summary, for range comparison in the Macro Language, always use a compound expression (e.g. -5 le &val AND &val le 0)
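Applying the compound-expression advice to the range check gives something like the sketch below. The macro name checkit2 is hypothetical (chosen so it does not overwrite the original), and the sketch assumes &val is an integer; non-integer values would need %SYSEVALF in the conditions.

```sas
/* Hypothetical corrected version: each range is written as two       */
/* comparisons joined by AND, so no comparison result (0 or 1) is     */
/* ever compared against a range boundary.                            */
%macro checkit2(val=);
  %if -5 le &val and &val le 0 %then
    %put &val is in the negative range (-5 to 0);
  %else %if 1 le &val and &val le 5 %then
    %put &val is in the positive range (1 to 5);
  %else %put &val is WAY out of range;
%mend checkit2;

%checkit2(val=-10)   /* -10 is WAY out of range               */
%checkit2(val=-2)    /* -2 is in the negative range (-5 to 0) */
%checkit2(val=2)     /* 2 is in the positive range (1 to 5)   */
%checkit2(val=10)    /* 10 is WAY out of range                */
```

With this form the macro results agree with the DATA step log for the same four values.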