Lesson 5 - Topics
• Formatting Output
• Working with Dates
• Reading: LSB:3:8-9; 4:1,5-7; 5:1-4

Annotating SAS Output
• TITLE statements - label procedure output
• LABEL statements - label names of variables
• FORMAT statements - label values of variables
Standard Output

The FREQ Procedure

                                    Cumulative    Cumulative
clinic    Frequency     Percent     Frequency      Percent
-----------------------------------------------------------
A               18       18.00            18        18.00
B               29       29.00            47        47.00
C               36       36.00            83        83.00
D               17       17.00           100       100.00

Annotated Output

Number of Patients by Clinic

The FREQ Procedure

Clinical Center
                                         Cumulative    Cumulative
clinic         Frequency     Percent     Frequency      Percent
----------------------------------------------------------------
Birmingham           18       18.00            18        18.00
Chicago              29       29.00            47        47.00
Minneapolis          36       36.00            83        83.00
Pittsburgh           17       17.00           100       100.00

• TITLE: supplies the heading 'Number of Patients by Clinic'
• LABEL for clinic: supplies 'Clinical Center'
• FORMAT for clinic: replaces the codes A-D with the city names

Standard Output

The FREQ Procedure

                                    Cumulative    Cumulative
sebl_6    Frequency     Percent     Frequency      Percent
-----------------------------------------------------------
1               70       70.00            70        70.00
2               23       23.00            93        93.00
3                6        6.00            99        99.00
4                1        1.00           100       100.00

Annotated Output

The FREQ Procedure

Patient Report Headaches
                                      Cumulative    Cumulative
sebl_6      Frequency     Percent     Frequency      Percent
-------------------------------------------------------------
None              70       70.00            70        70.00
Mild              23       23.00            93        93.00
Moderate           6        6.00            99        99.00
Severe             1        1.00           100       100.00

• LABEL for sebl_6: supplies 'Patient Report Headaches'
• FORMAT for sebl_6: replaces the codes 1-4 with None, Mild, Moderate, Severe
TITLE Statements
PROC FREQ DATA=tdata;
  TABLES clinic group sex educ sebl_1 sebl_6;
  TITLE 'Distribution of Selected Variables';
  TITLE2 'on the TOMHS Dataset';
RUN;
• TITLE statements can go anywhere in the program; it is good practice to put them under the PROC they describe.
• Titles can be changed at any time.
• General syntax: TITLEn 'text'; where n is the number of the title line.
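A minimal sketch of how titles change between procedures. The demo dataset and its values are hypothetical and exist only so the example runs on its own; the title text is also invented for illustration.

```sas
* Hypothetical demo data so the example is self-contained;
DATA demo;
  INPUT clinic $ sex;
DATALINES;
A 1
B 2
B 1
;
RUN;

PROC FREQ DATA=demo;
  TABLES clinic;
  TITLE 'Distribution of Clinic';
  TITLE2 'Demo data';
RUN;

* A new TITLE replaces line 1 and cancels all higher-numbered title lines;
PROC FREQ DATA=demo;
  TABLES sex;
  TITLE 'Distribution of Sex';
RUN;

* A TITLE statement with no text clears every title line;
TITLE;
```

Titles stay in effect until they are replaced or cleared, which is why the second PROC needs its own TITLE statement to avoid inheriting the first one.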
LABEL Statements
LABEL clinic = 'Clinical Center';
LABEL group = 'Drug Treatment Group';
LABEL educ = 'Highest Education Attained';
LABEL sebl_1 = 'Patient Report Drowsiness';
LABEL sebl_6 = 'Patient Report Headaches';
LABEL statements can go anywhere in the DATA step or under a procedure (but not in between!).
FORMAT Statements
FORMAT brthdate mmddyy10. ;
FORMAT group groupF. ;
FORMAT fever headache seF. ;
FORMAT clinic $clinicF. ;
• A FORMAT statement tells SAS to display the values of the variable according to the format.
• FORMAT statements can go anywhere in the DATA step or under a procedure.
• There are built-in formats (e.g. for dates) and user-defined formats (which must first be defined with PROC FORMAT).
• A format can apply to more than one variable.
• Format names end with a period (.) when they are used.
• Character formats begin with a $.
How to Make User-Defined FORMATS
PROC FORMAT;
  VALUE groupF 1 = 'Beta Blocker' 2 = 'Calcium Channel Blocker'
               3 = 'Diuretic' 4 = 'Alpha Blocker'
               5 = 'ACE Inhibitor' 6 = 'Placebo';
  VALUE seF 1 = 'None' 2 = 'Mild'
            3 = 'Moderate' 4 = 'Severe';
Here groupF and seF are the names of the formats. The format name does NOT have to be the name of a variable on the dataset. It cannot end in a number.

PROC FORMAT;
  VALUE $clinicF 'A' = 'Birmingham' 'B' = 'Chicago'
                 'C' = 'Minneapolis' 'D' = 'Pittsburgh' ;
Don't confuse the format with the variable(s) to be formatted! From PROC FORMAT alone, SAS does not know which variables you plan to format with a given format. You need to apply the format to the variable using a FORMAT statement.
LOG FILE
     PROC FORMAT;
7      VALUE groupF 1 = 'Beta Blocker' 2 = 'Calcium Channel Blocker'
8                   3 = 'Diuretic' 4 = 'Alpha Blocker'
9                   5 = 'ACE Inhibitor' 6 = 'Placebo';
NOTE: Format GROUPF has been output.
10
11
18
19     VALUE se 1 = 'None' 2 = 'Mild' 3 = 'Moderate' 4 = 'Severe';
NOTE: Format SE has been output.
20
21
22
23     VALUE $clinicF 'A' = 'Birmingham' 'B' = 'Chicago'
24                    'C' = 'Minneapolis' 'D' = 'Pittsburgh' ;
NOTE: Format $CLINICF has been output.
25
26   run;
* Applying the formats ;
PROC FREQ;
  TABLES clinic sebl_6;
  FORMAT clinic $clinicF. sebl_6 seF. ;
RUN;
* Program 7;
PROC FORMAT; ...

DATA tdata ;
  INFILE 'C:\SAS_Files\tomhs.data' ;
  INPUT @  1 ptid    $10.
        @ 12 clinic  $1.
        @ 25 group    1.
        @ 30 sex      1.
        @ 49 educ     1.
        @ 51 eversmk  2.
        @230 alcbl    1.
        @236 sebl_1   1.
        @246 sebl_6   1. ;
  LABEL clinic  = 'Clinical Center';
  LABEL group   = 'Drug Treatment Group';
  LABEL educ    = 'Highest Education Attained';
  LABEL sebl_1  = 'Patient Report Drowsiness';
  LABEL sebl_6  = 'Patient Report Headaches';
  LABEL alcbl   = 'Alcoholic Drinks Per Week';
  LABEL eversmk = 'Ever Smoke Cigarettes';
RUN;

PROC FREQ DATA=tdata;
  TABLES clinic sebl_6;
  FORMAT clinic $clinicF. sebl_6 seF. ;
RUN;
Items to Remember
• Formats need to be defined before you use them (PROC FORMAT).
• Formats are applied by using the FORMAT statement.
• LABEL and FORMAT statements in the DATA step apply to all subsequent PROCs.
• LABEL and FORMAT statements under a PROC apply only to that PROC.
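The last two items can be seen side by side in a small sketch. The demo dataset, the sexF format, and the coding 1 = Male / 2 = Female are assumptions made up for this illustration; they are not taken from the TOMHS data.

```sas
PROC FORMAT;
  * Hypothetical coding, assumed only for this sketch;
  VALUE sexF 1 = 'Male' 2 = 'Female';
RUN;

DATA demo;
  INPUT sex sebl_6;
  * Label and format given in the DATA step are stored with the dataset;
  LABEL sex = 'Sex of Patient';
  FORMAT sex sexF.;
DATALINES;
1 1
2 3
1 2
;
RUN;

PROC FREQ DATA=demo;
  TABLES sex sebl_6;
  * This label is in effect for this PROC only;
  LABEL sebl_6 = 'Patient Report Headaches';
RUN;

PROC FREQ DATA=demo;
  * sex still shows its label and format here while sebl_6 shows neither;
  TABLES sex sebl_6;
RUN;
```

Putting labels and formats in the DATA step is convenient when every report should show them; putting them under a PROC is handy for a one-off display.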
Working With Dates: Dates Come in Many Forms
• 10/18/04
• 18/10/04
• 10/18/2004
• 18OCT2004
• 101804
• October 18, 2004
You need to know how to read dates in and then how to work with them.
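As a rough sketch, the first five styles above can each be read with a standard SAS date informat. The variable names d1-d5 and the column positions are made up for this example, and the two-digit years are assumed to resolve to 2004 under the session's YEARCUTOFF setting. The spelled-out form (October 18, 2004) is left out because it needs a different informat.

```sas
DATA datestyles;
  * Each informat matches the way the date is written in the raw data;
  INPUT @1  d1 mmddyy8.     /* 10/18/04   */
        @10 d2 ddmmyy8.     /* 18/10/04   */
        @19 d3 mmddyy10.    /* 10/18/2004 */
        @30 d4 date9.       /* 18OCT2004  */
        @40 d5 mmddyy6. ;   /* 101804     */
  FORMAT d1-d5 date9.;
DATALINES;
10/18/04 18/10/04 10/18/2004 18OCT2004 101804
;
RUN;

PROC PRINT DATA=datestyles;
RUN;
```

Under those assumptions, all five variables should hold the same date, since they are five spellings of October 18, 2004.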
What do you want to do with dates?
• Display them
• Compare two dates: find the number of days between two dates
ndays = date2 - date1;
Will this work? Problem: dates written as month/day/year do not subtract well. What if:
  date2 =  03/02/2003
  date1 =  08/02/2002
           ==========
          -05/00/0001
DATA dates;
  INFILE DATALINES;
  INPUT @1 brthdate mmddyy10.;   * Use informat;
DATALINES;
03/03/1971
02/14/1956
01/01/1960
;
PROC PRINT;
  VAR brthdate;
RUN;
PROC PRINT;
  VAR brthdate;
  FORMAT brthdate mmddyy10.;
RUN;
------------------------------------------------------
Without a format:
Obs    brthdate
 1        4079
 2       -1417
 3           0      (Jan 1, 1960)

With FORMAT brthdate mmddyy10.:
Obs      brthdate
 1     03/03/1971
 2     02/14/1956
 3     01/01/1960
When you read in a variable with a date informat
• SAS makes the variable numeric.
• SAS stores the value as the number of days since January 1, 1960 (dates before then are negative).
• This makes it easy to subtract two dates to get the number of days between them.
dayselapsed = date2 - date1;
FORMAT date1 date2 mmddyy10.;
• Note: Once the date is read in, SAS treats the variable as it does any other numeric variable.
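A small sketch of the subtraction, using the two dates from the earlier "dates do not subtract well" slide and assuming they are written month/day/year (so August 2, 2002 and March 2, 2003). Date literals are used here instead of reading from a file; that is a choice made for this example.

```sas
DATA _NULL_;
  date1 = '02AUG2002'd;          * SAS date literal, stored as days since 01JAN1960;
  date2 = '02MAR2003'd;
  dayselapsed = date2 - date1;   * plain numeric subtraction;
  PUT date1= mmddyy10. date2= mmddyy10. dayselapsed=;
RUN;
```

The log should show dayselapsed=212, the number of days from August 2, 2002 to March 2, 2003.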
* Program 8 ;
DATA age;
  INFILE 'C:\SAS_Files\tomhs.data' ;
  INPUT @14 randdate mmddyy10.
        @34 brthdate mmddyy10.
        @74 date12   mmddyy10. ;
  agedays = randdate - brthdate ;
  ageyrs  = (randdate - brthdate)/365.25;
  ageint  = INT( (randdate - brthdate)/365.25);
  * Can also use YRDIF function;
  ageyrsX = yrdif(brthdate,randdate,'Actual');
  yrrand  = YEAR(randdate);
RUN;

PROC PRINT DATA=age (obs=10);
  VAR brthdate randdate agedays ageyrs ageyrsX ageint ;
  TITLE 'Printing Dates Without a Date Format';
RUN;

PROC PRINT DATA=age (obs=10);
  VAR brthdate randdate agedays ageyrs ageyrsX ageint ;
  FORMAT brthdate mmddyy10. randdate mmddyy10.;
  TITLE 'Printing Dates With a Date Format';
RUN;
Printing Dates Without a Date Format

Obs    brthdate    randdate    agedays     ageyrs    ageyrsX    ageint
 1       -8589       10175      18764     51.3730    51.3739      51
 2       -6880       10239      17119     46.8693    46.8711      46
 3      -12572       10002      22574     61.8042    61.8055      61
 4       -9592       10175      19767     54.1191    54.1205      54
 5      -12996       10280      23276     63.7262    63.7268      63

The brthdate values are negative because all the birth dates fall before 1960.

Printing Dates With a Date Format

Obs      brthdate      randdate
 1     06/26/1936    11/10/1987
 2     03/01/1941    01/13/1988
 3     07/31/1925    05/21/1987
 4     09/27/1933    11/10/1987
 5     06/02/1924    02/23/1988
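The first observation can be checked by hand with a short sketch. The two dates come from the formatted listing; writing them as date literals, and the 8.4 display format for ageyrs, are choices made for this example.

```sas
DATA _NULL_;
  brthdate = '26JUN1936'd;
  randdate = '10NOV1987'd;
  agedays  = randdate - brthdate;   * days between the two dates;
  ageyrs   = agedays / 365.25;      * approximate age in years;
  ageint   = INT(ageyrs);           * age in completed years;
  yrrand   = YEAR(randdate);        * calendar year of randomization;
  PUT brthdate= randdate= agedays= ageyrs= 8.4 ageint= yrrand=;
RUN;
```

The values written to the log should agree with observation 1 in the unformatted listing (brthdate -8589, randdate 10175, agedays 18764, ageyrs 51.3730, ageint 51), and yrrand should be 1987.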
PROC FREQ DATA=age;
  TABLES yrrand ;
  TITLE 'Frequency Distribution of Year Randomized';
RUN;

Frequency Distribution of Year Randomized

The FREQ Procedure

                                    Cumulative    Cumulative
yrrand    Frequency     Percent     Frequency      Percent
-----------------------------------------------------------
1986             9        9.00             9         9.00
1987            65       65.00            74        74.00
1988            26       26.00           100       100.00