410 likes | 489 Views
Macrotize! A Beginner’s Guide to Designing and Writing Macros Stacey D. Phillips, i3 Statprobe, Madison, WI. Summary of Presentation. Introduction to Macro Design and Writing Description of Real World Problem using Quality of Life Data (QOL) Overview of 5 Steps to Designing and Writing Macros
E N D
Macrotize! A Beginner’s Guide to Designing and Writing MacrosStacey D. Phillips, i3 Statprobe, Madison, WI
Summary of Presentation • Introduction to Macro Design and Writing • Description of Real World Problem using Quality of Life Data (QOL) • Overview of 5 Steps to Designing and Writing Macros • 5 Steps In-Depth using QOL data • Additional example of the 5 Steps using Vital Signs data • Conclusion • Questions
Introduction to Designing and Writing Macros • Learning to write macro code is a valuable skill for a SAS programmer though it can be intimidating to beginning programmers. • This presentation and the corresponding paper focuses more on the concepts of macro design/writing rather than specific macro code details. • There are many ways to think about macro design. I will present the five steps I use when developing a new macro. • The paper demonstrates the five steps using a real-world example involving Quality of Life data. • This presentation will include an additional example that uses the five steps.
Why Use Macros? • The macro facility is a tool for extending and customizing SAS and for reducing the amount of text you must enter to do a particular task. • SAS macros are used to reduce the amount of code one has to write by automating similar operations that are likely to be repeated. • In general, it is best to use macro language when you find yourself writing the same or very similar code over and over. Macro code can be more difficult to debug in the short term but it may save you time in the long run.
Some reasons you might consider using macro code… • Macros allow you to make a change in one location of your program so that SAS can cascade the change throughout your program. • Macros allow you to write a section of code once and use it over and over again in the same program or even in different programs. • Macros allow you to make programs data driven letting SAS decide what to do based on actual data values.
Some examples of where macro code might be useful… • You find yourself doing the same statistical calculation in multiple programs so decide to create and store an external macro (macro contained in a completely separate program) in the macro library that can be called whenever these calculations are necessary. • Example: A macro named %count01 is created to count subjects, calculate percentages and format the data into character variables appropriate for display in any RTF table.
Some examples of where macro code might be useful… • You are producing the similar output (eg. a data set or table or graph, etc.) for different groups of a certain population. • Example: A macro named %by_state is created to generate an economic profile table for each of the 50 states. • You want to run multiple statistical analyses with only slight modifications to the inputs. • Example: A macro named %reg is created to run a series of regression models and includes parameters to specify both the independent and dependent variables for each individual model without retyping the whole PROC REG code.
A Real-world Example Using Quality of Life (QOL) Regression Data • A statistician on my project ran many regression models and output each individual statistic into 225 separate SAS datasets. • The datasets needed to be accessed, manipulated and reformatted based on the table shell provided (see Handout reference #1). • The original task required six tables be created though this presentation with demonstrate only three tables as the process is the same whether three or six tables are created. • In the interest of simplicity, I will include only partial sections of the code needed for the various steps. The complete code can be found in Step 5 of the paper and in Handout reference #5.
Some Things to Keep in Mind • Paper is designed to be concept-based rather than code-based. • Don’t get too bogged down in the specific code…but do ask questions if you have them. • Code examples are excerpts in the interest of simplicity. The complete code is in the paper. • The ultimate goal of this example is to write a dynamic macro that will be as data independent as possible. • Because the PROC REPORT portion of the generated table is not relevant to the included macro writing, I’ve left out the code.
Five Easy Steps to Designing and Writing Macros • Step 1 - Know where you are and where you’re going. • Step 2 - Pretend you’re not writing a macro. • Step 3 - Look for patterns. • Step 4 - Vary one parameter at a time. • Step 5 - Put it all together.
Step 1 - Know where you are and where you are going. • Do you even need to write a macro? Sometimes utilizing macro code is unnecessary and cumbersome. • Macro code is more difficult to debug than normal SAS code so macro language should only be used when it will be advantageous. • The benefits of using SAS macro language include its adaptability and portability which make it ideal for reducing the amount of code necessary when you find yourself repeating code over and over. • How do we know when using a macro is a good idea? I usually start to think about writing a macro when I what I’m working on involves repetition or pattern.
Step 1 - Know where you are and where you are going. (See Handout Reference #2)
Step 1 - Know where you are and where you are going. (See Handout Reference #2)
Step 2 - Pretend you’re not writing a macro. • Ignore the fact that you want to write a macro and write out your code in regular SAS code without any macro language. • Why would I suggest this? • It’s important to make sure your basic SAS code works properly before you “macrotize.” • While using macro code will make your work more efficient in the long-run, in the short-term it can make the task more complex if you don’t have your basic code hammered out. • Sometimes writing out the code in its basic form helps to identify additional patterns that will be useful for macro writing.
Step 2 - Pretend you’re not writing a macro. QOL example: I want to SET the required data for each scale together then MERGE the “N”, “mean” and “estimate/CI” data together. /*EXAMPLE: Setting the “N” datasets together to get the N statistics*/ libname c30auc "/project/data/stats/output/PRO/C30/AUC"; data N; set C30AUC.NAPPETITE C30AUC.NCONSTIPATION C30AUC.NDIARRHOEA C30AUC.NEMOTIONAL C30AUC.NFATIGUE C30AUC.NGLOBAL C30AUC.NINSOMNIA C30AUC.NNAUSEA C30AUC.NPAIN C30AUC.NPHYSICAL C30AUC.NROLE; run; proc sort; by scale cohort; run;
/*Merge the “N”, “mean” and “estimate/CI” data together to get final dataset ALL used in generating the table*/ data ALL; merge N /* “N” data */ M /* “mean” data*/ D; /* “estimate/confidence interval” data */ by scale cohort; run; • Using the ALL data I can now go ahead and use PROC REPORT to generate my table. • While this code is simple enough, I’m really interested in getting SAS to figure out which datasets I need based on what’s in the directory so the individual scale datasets don’t need to be specified.
Step 2 - Pretend you’re not writing a macro. • I am interested in generating the tables independent of the input data. In other words, I don’t want to specify the individual datasets for each scale (eg. NAPPETITE, DPAIN) but want SAS to figure it out for me. • The following code creates a variable that includes the list of relevant datasets from a specified directory using SASHELP tables. • SASHELP tables are outside the scope of this paper but if you’re interested I’ve included the reference to a paper I wrote several years ago regarding this topic. • Try not to get bogged down in the specific code but notice the patterns that I’ve highlighted which will help write our macro later.
Get “N” Population Data data N_data; length N_data $200; set vstable; by libname memname; retain N_data; if first.libname then do; N_data=trim(left(libname))||"."||trim(left(memname));; end; else do; N_data=trim(left(N_data))||" "||trim(left(libname))||"."||trim(left(memname)); end; call symput("N_data",N_data); if last.libname; run;
Step 2 - Pretend you’re not writing a macro. The PROC PRINT of the newly created N_DATA variable containing the dataset lists is below: N_data C30AUC.NAPPETITE C30AUC.NCONSTIPATION C30AUC.NDIARRHOEA C30AUC.NEMOTIONAL C30AUC.NFATIGUE C30AUC.NGLOBAL C30AUC.NINSOMNIA C30AUC.NNAUSEA C30AUC.NPAIN C30AUC.NPHYSICAL C30AUC.NROLE Note how this list matches the SET statement we previously wrote out manually earlier in Step 2: data N; set C30AUC.NAPPETITE C30AUC.NCONSTIPATION C30AUC.NDIARRHOEA C30AUC.NEMOTIONAL C30AUC.NFATIGUE C30AUC.NGLOBAL C30AUC.NINSOMNIA C30AUC.NNAUSEA C30AUC.NPAIN C30AUC.NPHYSICAL C30AUC.NROLE; run;
Step 3 - Look for patterns and variability. • Now that you know your code is working well, it is time to start thinking about introducing macro code. • First, let’s revisit the annotated shell to look for patterns and variability in dataset and variable naming conventions. • Next, revisit the code from Step 2 for the N data and compare it to the same code for the M data. • The locations of the “N” and “M” differences demonstrate a great place to add a macro parameter. • Further, by noting the similarities in variable naming conventions (eg. mean, estimate, etc.) we know the data can be set easily together without the need to rename variables.
Get “N” Population Data data N_data; length N_data $200; set vstable; by libname memname; retain N_data; if first.libname then do; N_data=trim(left(libname))||"."||trim(left(memname));; end; else do; N_data=trim(left(N_data))||" "||trim(left(libname))||"."||trim(left(memname)); end; call symput("N_data",N_data); if last.libname; run;
Get “M” Population Data data M_data; length M_data $200; set vstable; by libname memname; retain M_data; if first.libname then do; M_data=trim(left(libname))||"."||trim(left(memname));; end; else do; M_data=trim(left(M_data))||" "||trim(left(libname))||"."||trim(left(memname)); end; call symput(“M_data",M_data); if last.libname; run;
Step 4 - Vary one parameter at a time. • It’s best to vary one parameter at a time and check your work before continuing to add additional parameters. • By varying only one parameter at a time the code will be easier to debug should there be problems. • See Handout Reference #3 for the macro %AUC. • Similar in concept to varying one parameter at a time is to work from the “inside to the outside.” • When working with “inner” and “outer” macros (often referred to as “nested”) it is best to work on the inner macros and then the outer macros to reduce complexity in debugging.
Step 4 - Vary one parameter at a time. Remember that we’re generating not one but three similar tables. The titles are below: Table 1.1 Summary of Statistical Analysis of Time-Adjusted PRO AUC by ScaleEORTC QLQ - C30QOL Analysis Set Table 1.2 Summary of Statistical Analysis of Time-Adjusted PRO AUC by ScaleEORTC QLQ – LC13QOL Analysis Set Table 1.3 Summary of Statistical Analysis of Time-Adjusted PRO AUC by ScaleEORTC QLQ – TSQMQOL Analysis Set
Step 4 - Vary one parameter at a time. • Similar in concept to varying one parameter at a time is to work from the “inside to the outside.” • When working with “inner” and “outer” macros (often referred to as “nested”) it is best to work on the inner macros and then the outer macros to reduce complexity in debugging. • See Handout Reference #4 for example of nested macros. • Note that %AUC generates the interior of the table (eg. columns, rows, etc.) and is the inner macro. • Note that %PRO generates each individual table (eg. t_pro_auc_c30) and is the outer macro.
Step 5 - Put it all together. • Do a complete run of your program and fix any errors, warnings, etc. in the log file. • Review results and compare back to the original assignment. Have you included all of the necessary components? • Identify additional places of variability and if applicable, adjust the code and start step #5 again. • Once you’re convinced the output is what is desired, review the macro code and clean-up as necessary. Check for indentation for better readability and document with comments throughout the macro especially for large, nested macros. • See Handout Reference #5 for complete code.
Another Example Using the Five Steps: Vital Sign Data The Task: • Create two datasets containing vital sign data. • Blood Pressure Measurements (Systolic and Diastolic BP) • Non-Blood Pressure Measurements (Pulse, Respiration and Temperature). • Add unit labels to each vital sign parameters (eg. mmHG for BP measurements).
Five Easy Steps to Designing and Writing Macros • Step 1 - Know where you are and where you’re going. • Step 2 - Pretend you’re not writing a macro. • Step 3 - Look for patterns. • Step 4 - Vary one parameter at a time. • Step 5 - Put it all together.
Step 1 – Know where you are and where you’re going: Where are we? SUBJECT RESULT TEST 1 96.000 PULSE 1 18.000 RESP 1 36.800 TEMP 1 92.000 SYSBP 1 60.000 DIABP 2 96.000 PULSE 2 18.000 RESP 2 36.700 TEMP 2 129.000 SYSBP 2 84.333 DIABP
Step 1 – Know where you are and where you’re going: Where are we going? Data set with Blood Pressure Parameters: SUBJECT TEST RESULT UNIT 1 SYSBP 92.000 mmHg 2 SYSBP 129.000 mmHg 3 SYSBP 90.000 mmHg 4 SYSBP 120.667 mmHg 5 SYSBP 144.000 mmHg 1 DIABP 60.000 mmHg 2 DIABP 84.333 mmHg 3 DIABP 60.667 mmHg 4 DIABP 73.333 mmHg 5 DIABP 84.000 mmHg
Step 1 – Know where you are and where you’re going: Where are we going? Data set with non-Blood Pressure Parameters: SUBJECT TEST RESULT UNIT 1 PULSE 96.00 Beats per minute 2 PULSE 96.00 Beats per minute 3 PULSE 77.00 Beats per minute 4 PULSE 74.00 Beats per minute 5 PULSE 80.00 Beats per minute 1 RESP 18.00 Breaths per minute 2 RESP 18.00 Breaths per minute 3 RESP 18.00 Breaths per minute 4 RESP 16.00 Breaths per minute 5 RESP 18.00 Breaths per minute 1 TEMP 36.80 Degrees Celsius 2 TEMP 36.70 Degrees Celsius 3 TEMP 37.06 Degrees Celsius 4 TEMP 36.60 Degrees Celsius 5 TEMP 36.89 Degrees Celsius
Step 2 - Pretend you’re not writing a macro. data pulse; length UNIT $50; set vitals_v; if test='PULSE'; UNIT='Beats per minute'; run; data resp; length UNIT $50; set vitals_v; if test='RESP'; UNIT='Breaths per minute'; run;
Step 3 - Look for patterns. data pulse; length UNIT $50; set vitals_v; if test=“PULSE”; UNIT=“Beats per minute”; run; data resp; length UNIT $50; set vitals_v; if test=“RESP”; UNIT=“Breaths per minute”; run;
Step 4 - Vary one parameter at a time. %macro inner(test=); data &test; /*Vary the TEST parameter first*/ length UNIT $50; set vitals_v; if test=“&test”; UNIT=“Beats per minute”; /*Vary the UNIT parameter next*/ run; %mend inner; %inner(test=PULSE) %inner(test=RESP)
Step 5 - Put it all together. %macro outer(name=,data=,label=); %macro inner(test=,unit=); data &test; length UNIT $50; set vitals_v; if test="&test"; UNIT="&unit"; run; %mend inner; %inner(test=RESP,unit=Breaths per minute); %inner(test=PULSE,unit=Beats per minute); %inner(test=TEMP,unit=Degrees Celsius); %inner(test=SYSBP,unit=mmHg); %inner(test=DIABP,unit=mmHg); data &name; set &data; run; %mend outer; %outer(name=BP,data=sysbp diabp,label=Blood Pressure Parameters); %outer(name=NON_BP,data=pulse resp temp,label=non-Blood Pressure Parameters);
Step 5 - Put it all together. %macro outer(name=,data=,label=); %macro inner(test=,unit=); data &test; length UNIT $50; set vitals_v; if test="&test"; UNIT="&unit"; run; %mend inner; %inner(test=RESP,unit=Breaths per minute); %inner(test=PULSE,unit=Beats per minute); %inner(test=TEMP,unit=Degrees Celsius); %inner(test=SYSBP,unit=mmHg); %inner(test=DIABP,unit=mmHg); data &name; set &data; run; %mend outer; %outer(name=BP,data=sysbp diabp,label=Blood Pressure Parameters); %outer(name=NON_BP,data=pulse resp temp,label=non-Blood Pressure Parameters);
Step 5 – Put it all together. Data set with Blood Pressure Parameters: SUBJECT TEST RESULT UNIT 1 SYSBP 92.000 mmHg 2 SYSBP 129.000 mmHg 3 SYSBP 90.000 mmHg 4 SYSBP 120.667 mmHg 5 SYSBP 144.000 mmHg 1 DIABP 60.000 mmHg 2 DIABP 84.333 mmHg 3 DIABP 60.667 mmHg 4 DIABP 73.333 mmHg 5 DIABP 84.000 mmHg
Step 5 – Put it all together. Data set with non-Blood Pressure Parameters: SUBJECT TEST RESULT UNIT 1 PULSE 96.00 Beats per minute 2 PULSE 96.00 Beats per minute 3 PULSE 77.00 Beats per minute 4 PULSE 74.00 Beats per minute 5 PULSE 80.00 Beats per minute 1 RESP 18.00 Breaths per minute 2 RESP 18.00 Breaths per minute 3 RESP 18.00 Breaths per minute 4 RESP 16.00 Breaths per minute 5 RESP 18.00 Breaths per minute 1 TEMP 36.80 Degrees Celsius 2 TEMP 36.70 Degrees Celsius 3 TEMP 37.06 Degrees Celsius 4 TEMP 36.60 Degrees Celsius 5 TEMP 36.89 Degrees Celsius
Conclusion • Learning to write macro code is a useful skill for the SAS programmer. • Macros can make your programs more efficient and reduce the amount of code writing that needs to be done. • By learning a few basic concepts of macro writing, the beginning SAS programmer can easily get started in designing and writing macros.