180 likes | 487 Views
Using PROC REPORT to Cross Tabulate Multiple Response Items. Patrick Thornton SRI International. Outline. Example of a Multiple Response Item Variable coding and example data A Cross Tabulation using Proc REPORT Examine the Proc REPORT Syntax for: Grouping and counting observations
E N D
Using PROC REPORT to Cross Tabulate Multiple Response Items Patrick Thornton SRI International
Outline • Example of a Multiple Response Item • Variable coding and example data • A Cross Tabulation using Proc REPORT • Examine the Proc REPORT Syntax for: • Grouping and counting observations • Creating a summary row • Adding a ‘Total’ label to the summary row • Creating a count and percent • Summing columns
Example of a Multiple Response Item • Respondents may leave blank those service categories they did not receive • Respondents may not provide a response to any of the service categories
Multiple Response Item Cross Tabulation using Proc REPORT • Total: • Children receiving at least one service • Children and percent of children receiving each service • Number of total services • Referral One: • Number of children receiving at least one service • Number of children receiving each service (e.g. CHI) • Percent of children receiving each service (e.g. HV) • Number of total services
Grouping and Counting Observations proc report data=md nowd missing; col referral n; define referral /group center; define n /’Children’ center; where CHI ne . or HV ne . or TS ne . ; run; NOWD prevents the interactive window and MISSING includes OBS missing REFERRAL DEFINE with the /GROUP option is used to obtain the distinct values of REFERRAL COL determines the order of columns, REFERRAL and N where N is a key word to obtain the count of observations DEFINE N is used to label and center the column WHERE removes OBS missing on all service variables
Creating a Summary Row with RBREAK proc report data=work.multir missing nowd; col referral n; define referral /group; define n /’Children’ center; rbreak after / summarize; Run; Creates total number of children (OBS)
A Closer Look at RBREAK proc report data=work.multir missing nowd out=d; col referral n; define referral /group; define n /’Children’ center; rbreak after / summarize; run; Proc print data=d noobs; run;
Adding a ‘Total’ Label to the Summary Row col referral newcol n; define referral /group noprint; define n /’Children’ center; define newcol/computed ‘Referral’; compute newcol /char length=15; if _BREAK_="_RBREAK_" then newcol = 'Total'; else newcol = put(referral,myrows.); endcomp; rbreak after / summarize; Adds NEWCOL, a computed variable not in the data set Defines NEWCOL as COMPUTED Hides the REFERRAL variable Start a compute block, NEWCOL as a 15 length character variable Test _BREAK_ for summary row and set NEWCOL to “Total,” otherwise set to formatted value of REFERRAL.
Creating a Service Count and Percent of Children col referral newcol n CHI PCHI; define referral /group noprint; define n / ‘Children’ center; define CHI / analysis ‘CHI’ sum format=5.0 center; define PCHI /computed ‘%CHI’ format=percent8.1 center ; compute PCHI; PCHI = CHI.sum/n; endcomp; PCHI is a column and defined as COMPUTED CHI is a column and defined as a SUM PCHI is COMPUTED as CHI.SUM / n Repeat for each Service
Summing Columns to Create Count of Total Services col referral newcol n CHI PCHI HV PHV TS PTS total; define total/computed "Services" width=10 center; compute total; total = sum(chi.sum, hv.sum,ts.sum); endcomp; As shown on previous slide, service sums and computed percents. Total column is added. Total is sum of the sums of each service variable Total is defined as COMPUTED
Complete PROC REPORT Syntax • compute newcol /char length=15; • IF _BREAK_='_RBREAK_' THEN newcol = 'Total'; • else newcol = put(referral,myrows.); • endcomp; • compute p1; • p1 = CHI.sum /n; • endcomp; • compute p2; • p2 = HV.sum /n; • endcomp; • compute p3; • p3 = TS.sum /n; • endcomp; • compute total; • total = sum(CHI.sum , HV.sum , TS.sum); • endcomp; • rbreak after /summarize; • where CHI ne . or HV ne . or TS ne . ; • run; proc report data=work.multir missing nowd ; col referral newcol n CHI p1 HV p2 TS p3 total; define referral /group left width=15 noprint; define newcol /computed "Referral" ; define n/format=8.0 "Children" center; define CHI / analysis sum "CHI" format=5.0 center; define HV / analysis sum "HV" format=5.0 center; define TS / analysis sum "TS" format=5.0 center; define p1 /computed "%" format=percent8.1 center ; define p2 /computed "%" format=percent8.1 center ; define p3 /computed "%" format=percent8.1 center ; define total/computed "Services" width=10 center;
To Reuse this Syntax, What Needs to Change? • Libref and Data Set • Row variable, label and format • Observation Label • Total Label • Variables for the multiple response item • Labels for each variable proc report data=work.multir missing nowd; col referral newcol n CHI P1 total; define referral /group noprint; define n / ‘Children’ center; define newcol/computed ‘Referral’; define CHI / analysis ‘CHI’ sum; define P1 /computed ‘%’; compute newcol /char length=15; if _BREAK_=‘_RBREAK_’ then newcol= 'Total'; else newcol = put(referral,myrows.); endcomp; compute P1; P1 = CHI.sum /n; endcomp; define total/computed ‘Services’ width=10; compute total; total = sum(chi.sum); endcomp; rbreak after / summarize; where CHI ne .; run;
Macro Program : Six Elements to Build • %macro chooseall(lib,d,row,rowlabel,rowformat,obslabel,totallabel,series,labels); • proc report data=&lib..&d missing nowd; • col &row newcol n [1 SERIES AND PERCENT] total; [e.g. CHI p1 HV p2 TS p3] • define &row /group left width=15 noprint; • define newcol /computed “ &rowlabel “; • define n/format=8.0 “ &obslabel " center; • [2 DEFINE EACH SERIES VARIABLE ] [e.g. define CHI / analysis sum "CHI";] • [3 DEFINE PERCENT FOR EACH] [e.g. define p1 /computed "%“ ;] • define total/computed “ &totallabel " width=10 center; • compute newcol /char length=15; • IF _BREAK_='_RBREAK_' THEN newcol = 'Total'; • else newcol = put( &row ,” &rowformat.. “); • endcomp; • [4 COMPUTE EACH PERCENT] [e.g. compute p1; p1=CHI.sum /n; endcomp;] • compute total; • [5 COMPUTE TOTAL] [e.g. total = sum(CHI.sum , HV.sum , TS.sum);] • endcomp; • rbreak after /summarize; • [6 CREATE A WHERE STATEMENT] [e.g. where CHI ne . or HV ne . or TS ne . ;] • run; • %mend;
Macro Program: Building the Six Elements %macro chooseall(lib,d,row,rowlabel,rowformat,obslabel,totallabel,series,labels); %let vargroup =; %*[1 COLUMN SERIES AND PERCENTS] ; %let define =; %*[2 DEFINE EACH SERIES VARIABLE]; %let definep =; %*[3 DEFINE PERCENT FOR EACH]; %let computedp =; %*[4 COMPUTE EACH PERCENT] ; %let total=; %* [5 COMPUTE TOTAL]; %let where=; %* [6 CREATE A WHERE STATEMENT]; %*Iterate through each variable in series; %let i = 1; %do %until(%scan(&series,&i)= | &i > 50); %let var = %scan(&series,&i); %let label = %scan(&labels,&i,%str(^)); %let i = %eval(&i + 1); %end; %*Remove the comma and or; %let total = %substr(&total,1,%eval(%length(&total) - 1)); %let where = %substr(&where,1,%eval(%length(&where) - 2)); %*[PROC REPORT FROM PREVIOUS SLIDE]; %mend; %* [5 COMPUTE TOTAL]; %let total= &total &var..sum %str(,); %*[2 DEFINE EACH SERIES VARIABLE ]; %let define =&define define &var / analysis sum "&label" format=5.0 center%str(;); %*[3 DEFINE PERCENT FOR EACH]; %let definep = &definep define p&i /computed "%" format=percent8.1 center %str(;); %* [6 CREATE A WHERE STATEMENT]; %let where = &where &var ne %str(.) or; %*[1 COLUMN SERIES AND PERCENTS] ; %let vargroup = &vargroup &var p&i;
Conclusion • PROC REPORT can Generate Client-Quality Tabulations of Multiple Response Items (MRI) • A Relatively Concise Macro can be Used to Generate the Extended PROC REPORT Syntax • Limitations • The MRI variables must be coded 1=Yes, 0=No • Many MRI variables requires table breaks
Contact information Your comments and questions are valued and encouraged. Patrick Thornton Ph.D. Sr. Scientific Programmer/AnalystSRI International, Center for Education and Human ServicesPhone: 650 859-5583Fax: 650 859-2861Email: patrick.thornton@sri.com Web: www.sri.com