Lecture 5 - Topics
• Working with dates
• Functions in the DATA step
• Programs 8-9 in course notes
• LSB 3.7-3.8
Dates Come in Many Ways
• 10/18/04
• 18/10/04
• 10/18/2004
• 18OCT2004
• 101804
• October 18, 2004
We need to know how to read in dates and then work with them.
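As an illustration of reading the forms listed above, here is a minimal sketch that converts five of them with the INPUT function and a matching informat. It is not from the course notes: the variable names d1-d5 are made up, and the two-digit years assume the default YEARCUTOFF setting, under which 04 is read as 2004. The long form "October 18, 2004" is left out.

```sas
DATA _NULL_;
  /* Each text form needs the informat that matches its layout */
  d1 = INPUT('10/18/04',   mmddyy8.);   /* month/day/2-digit year */
  d2 = INPUT('18/10/04',   ddmmyy8.);   /* day/month/2-digit year */
  d3 = INPUT('10/18/2004', mmddyy10.);  /* month/day/4-digit year */
  d4 = INPUT('18OCT2004',  date9.);     /* day, month name, year  */
  d5 = INPUT('101804',     mmddyy6.);   /* no separators          */
  /* All five variables should hold the same number of days since Jan 1, 1960 */
  PUT d1= d2= d3= d4= d5=;
RUN;
```

All five values written to the log should be identical (16362 for October 18, 2004), which is what makes the stored dates comparable no matter how they were written in the raw data.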
What do you want to do with dates?
• Display them
• Compare two dates: which one is earlier?
• Find the number of days between two dates
    ndays = date2 - date1
Will this work?
Problem: dates written as month/day/year do not subtract well. What if:
    date2 =  03/02/2003
    date1 =  08/02/2002
    ===================
            -05/00/0001
DATA dates;
  INFILE DATALINES;
  INPUT @1 brthdate mmddyy10.;
DATALINES;
03/03/1971
02/14/1956
01/01/1960
;
PROC PRINT;
  VAR brthdate;
PROC PRINT;
  VAR brthdate;
  FORMAT brthdate mmddyy10.;
RUN;
------------------------------------------------------
Without a format:
Obs    brthdate
 1        4079
 2       -1417
 3           0    (Jan 1, 1960)

With the mmddyy10. format:
Obs      brthdate
 1     03/03/1971
 2     02/14/1956
 3     01/01/1960
When you read in a variable with a date informat:
• SAS makes the variable numeric
• SAS stores the value as the number of days relative to January 1, 1960
• This makes it easy to subtract two dates to get the number of days between them:
    dayselapsed = date2 - date1;
• Note: once read in, SAS treats the variable as it does any other numeric variable.
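To make the subtraction concrete, here is a small sketch using the two dates from the earlier "What if" slide (August 2, 2002 and March 2, 2003, assuming they were written month/day/year). The date literals ('02AUG2002'd) are a shortcut for this illustration, not the way the course programs read their data.

```sas
DATA _NULL_;
  date1 = '02AUG2002'd;            /* SAS date literal: stored as days since Jan 1, 1960 */
  date2 = '02MAR2003'd;
  dayselapsed = date2 - date1;     /* ordinary numeric subtraction */
  PUT dayselapsed=;                /* writes dayselapsed=212 to the log */
  PUT date1= mmddyy10. date2= mmddyy10.;   /* same values displayed as dates */
RUN;
```

Because both values are plain numbers, the difference is simply the number of days between the two dates; the format only changes how they are displayed.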
* Program 8 ;
DATA age;
  INFILE 'C:\SAS_Files\tomhs.data';
  INPUT @14 randdate mmddyy10.
        @34 brthdate mmddyy10.
        @74 date12   mmddyy10. ;

  * Age at randomization, computed several ways ;
  agedays  = randdate - brthdate ;
  ageyrs   = (randdate - brthdate)/365.25;
  ageint   = INT( (randdate - brthdate)/365.25);
  ageyrsX  = YRDIF(brthdate,randdate,'Actual');

  * Age today and age at a fixed date ;
  agetoday = (TODAY() - brthdate)/365.25 ;
  ageendst = (MDY(02,28,1992) - brthdate)/365.25;

  * Days from randomization to the first-year visit ;
  daysv12  = date12 - randdate;

  * window12: 1 = visit within 30 days of one year, 2 = outside that window ;
  if ABS(daysv12 - 365) = . then window12 = .;
  else if ABS(daysv12 - 365) < 31 then window12 = 1;
  else if ABS(daysv12 - 365) >= 31 then window12 = 2;

  yrrand = YEAR(randdate);
RUN;
PROC PRINT DATA=age (OBS=10);
  VAR brthdate randdate agedays ageyrs ageyrsX ageint agetoday;
  TITLE 'Printing Dates Without a Date Format';
RUN;
PROC PRINT DATA=age (OBS=10);
  VAR brthdate randdate agedays ageyrs ageyrsX ageint agetoday;
  FORMAT brthdate mmddyy10. randdate mmddyy10.;
  TITLE 'Printing Dates With a Date Format';
RUN;
Printing Dates Without a Date Format

Obs   brthdate   randdate   agedays    ageyrs   ageyrsX   ageint   agetoday
 1      -8589      10175     18764    51.3730   51.3739     51      69.0678
 2      -6880      10239     17119    46.8693   46.8711     46      64.3888
 3     -12572      10002     22574    61.8042   61.8055     61      79.9726
 4      -9592      10175     19767    54.1191   54.1205     54      71.8138
 5     -12996      10280     23276    63.7262   63.7268     63      81.1335

The brthdate values are negative because all of these birth dates are before 1960.
Printing Dates With a Date Format

Obs     brthdate     randdate
 1    06/26/1936   11/10/1987
 2    03/01/1941   01/13/1988
 3    07/31/1925   05/21/1987
 4    09/27/1933   11/10/1987
 5    06/02/1924   02/23/1988

Pages 124 and 128 of Cody & Smith list several date formats and informats.
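As a sketch of how the choice of format changes only the display, the step below writes one stored value (-8589, the first brthdate in the listing above) under several date formats. The formats other than mmddyy10. are standard SAS date formats chosen for illustration; they are not used in the course programs.

```sas
DATA _NULL_;
  brthdate = -8589;            /* days relative to Jan 1, 1960 */
  PUT brthdate mmddyy10.;      /* 06/26/1936 */
  PUT brthdate date9.;         /* 26JUN1936  */
  PUT brthdate worddate.;      /* month name, day, year */
  PUT brthdate weekdate.;      /* adds the day of the week */
  PUT brthdate;                /* no format: the raw number -8589 */
RUN;
```

The stored value never changes; a FORMAT statement in a PROC has the same effect on printed output as the formats used in these PUT statements.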
PROC PRINT DATA=age (OBS=20);
  VAR randdate date12 daysv12 window12;
  FORMAT randdate date12 mmddyy8.;
  TITLE 'Printing Days From Randomization to 1st Year Visit';
RUN;
PROC FREQ DATA=age;
  TABLES yrrand;
  TITLE 'Frequency Distribution of Year Randomized';
RUN;
Obs   randdate     date12   daysv12   window12
  1   11/10/87   11/25/88     381        1
  2   01/13/88   01/09/89     362        1
  3   05/21/87          .       .        .
  4   11/10/87   11/30/88     386        1
  5   02/23/88   02/13/89     356        1
  6   11/12/87   11/02/88     356        1
  7   12/05/86   12/03/87     363        1
  8   06/12/87   06/16/88     370        1
  9   01/21/88   01/09/89     354        1
 10   04/16/87   04/04/88     354        1
 11   08/12/87   08/10/88     364        1
 12   04/16/87   05/02/88     382        1
 13   02/02/88   02/08/89     372        1
 14   11/04/86   11/30/87     391        1
 15   05/27/87   06/08/88     378        1
 16   03/29/88   07/13/89     471        2
Frequency Distribution of Year Randomized

The FREQ Procedure

                                   Cumulative    Cumulative
yrrand    Frequency     Percent     Frequency      Percent
-----------------------------------------------------------
  1986           9        9.00             9         9.00
  1987          65       65.00            74        74.00
  1988          26       26.00           100       100.00
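* Program 9 ;
DATA example;
  INFILE 'C:\SAS_Files\tomhs.data';
  * An informat needs its period: without it, height 4 would mean column 4 ;
  INPUT @058 height 4.
        @085 weight 5.
        @172 ursod  3.
        @236 (se1-se10) (1.0 +1);

  * BMI from weight in pounds and height in inches ;
  bmi    = (weight*703.0768)/(height*height);
  rbmi1  = ROUND(bmi,1);
  rbmi2  = ROUND(bmi,.1);
  lursod = LOG(ursod);

  * Summaries across the 10 side effect scores ;
  seavg  = MEAN (OF se1-se10);
  semax  = MAX (OF se1-se10);
  semin  = MIN (OF se1-se10);
RUN;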
* Use of dash notation ;
seavg = MEAN (OF se1-se10);
This is the same as
seavg = MEAN (se1,se2,se3,se4,se5,se6,se7,se8,se9,se10);
The OF is very important. Otherwise SAS thinks you are subtracting se10 from se1.
To use this notation, the ROOT of the names must be the same.
* Two ways of computing average ;
seavg = MEAN (se1,se2,se3,se4,se5,se6,se7,se8,se9,se10);
versus
seavg = (se1+se2+se3+se4+se5+se6+se7+se8+se9+se10)/10;
• The MEAN function computes the average of the non-missing values. The result is missing only if all values are missing.
• The + formula requires all values to be non-missing; otherwise the result is missing.
if N(of se1-se10) > 5 then seavg = MEAN(of se1-se10);
What does this statement do?
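The difference between the two approaches can be seen in a small sketch with one missing score. The four values and the names avg_fn, avg_sum, and n_valid are hypothetical, chosen only to keep the example short.

```sas
DATA _NULL_;
  se1 = 1; se2 = 2; se3 = .; se4 = 3;      /* se3 is missing */
  avg_fn  = MEAN(OF se1-se4);              /* averages the 3 non-missing values: 2 */
  avg_sum = (se1 + se2 + se3 + se4)/4;     /* any missing term makes the result missing */
  n_valid = N(OF se1-se4);                 /* counts the non-missing values: 3 */
  PUT avg_fn= avg_sum= n_valid=;           /* avg_fn=2 avg_sum=. n_valid=3 */
RUN;
```

The N function is what lets a program decide whether enough scores are present before trusting the average.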
PROC PRINT DATA = example (OBS=15);
  VAR bmi rbmi1 rbmi2 seavg semin semax ;
  TITLE 'Listing of Selected Data for 15 Patients ';
RUN;
PROC FREQ DATA = example;
  TABLES semax;
  TITLE 'Distribution of Worse Side Effect Value';
  TITLE2 'Side Effect Scores Range from 1 to 4';
RUN;
ODS GRAPHICS ON;
PROC UNIVARIATE DATA = example PLOT;
  VAR ursod lursod;
  QQPLOT ursod lursod;
  TITLE 'Stem and Leaf Plots for Urine Sodium Data';
RUN;
Listing of Selected Data for 15 Patients

Obs       bmi   rbmi1   rbmi2   seavg   semin   semax
  1   28.2620     28     28.3     1.1      1       2
  2   35.9963     36     36.0     1.0      1       1
  3   27.0489     27     27.0     1.0      1       1
  4   28.2620     28     28.3     1.1      1       2
  5   33.2008     33     33.2     1.0      1       1
  6   27.7691     28     27.8     1.2      1       2
  7   32.6040     33     32.6     1.0      1       1
  8   22.4057     22     22.4     1.2      1       2
  9   37.2037     37     37.2     1.1      1       2
 10   33.1717     33     33.2     1.7      1       3
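The rbmi1 and rbmi2 columns differ only in the second argument to ROUND, which is the unit to round to. A minimal sketch using the first bmi value from the listing above (the variable truncated is added here only to contrast ROUND with INT, which Program 8 uses for ageint):

```sas
DATA _NULL_;
  bmi = 28.2620;
  rbmi1 = ROUND(bmi, 1);      /* nearest whole number: 28 */
  rbmi2 = ROUND(bmi, .1);     /* nearest tenth: 28.3 */
  truncated = INT(28.7);      /* INT drops the fraction instead of rounding: 28 */
  PUT rbmi1= rbmi2= truncated=;
RUN;
```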
Distribution of Worse Side Effect Value
Side Effect Scores Range from 1 to 4

The FREQ Procedure

                                  Cumulative    Cumulative
semax    Frequency     Percent     Frequency      Percent
----------------------------------------------------------
    1          33       33.00            33        33.00
    2          52       52.00            85        85.00
    3          13       13.00            98        98.00
    4           2        2.00           100       100.00

Two patients had at least one severe side effect (semax = 4).
Exercise 5
• Working with dates