380 likes | 581 Views
Tap Into the Power of Formats. Introduction Pre-defined formats Custom, User-defined formats More FORMAT Applications Grouping/Recoding Data Table lookups The CTNLIN statement Creating new variables MULTILABEL formats. Tap Into the Power of Formats. Permanent formats Conclusions.
E N D
Tap Into the Power of Formats • Introduction • Pre-defined formats • Custom, User-defined formats • More FORMAT Applications • Grouping/Recoding Data • Table lookups • The CTNLIN statement • Creating new variables • MULTILABEL formats
Tap Into the Power of Formats • Permanent formats • Conclusions
Introduction/Review • Formats are instructions that tell SAS how to display variables in output. • These instructions change the look of the variables by mapping one value into another • Example: • SAS date of 17318 will look like June 1, 2007 after a format is applied
Pre-Defined Formats • SAS provides pre-defined formats • SAS has already done the programming – you just apply them to your variables • Examples: • A currency format of: DOLLAR7.2will change: 1346678.32into: $1,346,678.32 • A date/time format of: DATE9. will change: 17457into: 18OCT2007
Pre-Defined Formats • Can be used • when reading in data (an informat – permanent) • during a data step (a format - permanent) • during a procedure (a format - temporary) • Formats accept character or numeric input and always produce character output • Informats always expects character input and can produce character or numeric output
Assigning Informats • Example • DATA MYDIR.NEWFILE; INFILE C011960.PEDS.FCLAIMS.DATA.MAY2203.G0001V00; INPUT @1 HRN $9. @11 CAREDATE YYMMDD10. @23 DEPTNUM 5. ; RUN;
Assigning Formats to Variables in a Data Step • Formats can be used for variables already in a SAS data set that are not in the format you desire • Example • DATA MYDIR.EMPLOYEES;SET OLD; FORMAT BIRTHDATE DATE9.; RUN;
Assigning Formats to Variables in a PROC • Formats can be used to change the look temporarily for a listing or report • Example • PROCPRINT DATA=MYDIR.EMPLOYEES; TITLE 'ACME WIDGET EMPLOYEE BIRTHDAY LIST'; VAR DOB; FORMAT Birthdate worddate18.; RUN;
Custom Formats • Used when pre-defined formats don’t fit your needs • Can be permanent or temporary • Used to: • Convert character strings into numbers • Convert numeric strings into character values • Convert character strings into other character strings
PROC FORMAT • PROC FORMAT Syntax: • PROCFORMAT <option(s)>; VALUE <$>name value_specifications; INVALUE <$>name invalue_specifications; EXCLUDE entry(s); PICTURE name picture_specifications; SELECT entry(s); RUN;
Format Types • Format Types • Character format designated with a “$” before format name, Numeric does not need “$” • Incoming values encased with quotes for character formats, incoming values do not need quotes for numeric data
FORMAT Requirements • Format names limited to 8 characters for SASv8 and earlier and 32 characters for SAS9 and later • Character formats must start with a “$” • FORMAT name cannot begin or end with a number • FORMAT names cannot be identical to pre-defined formats • All mapped values in the VALUE statement must be of the form: • Existing_value = Format_value
Group/Recode Data • Formats can be used to define groups of data for meaningful, project-specific reporting • Replaces IF/THEN/ELSE statements in a DATA step • FORMAT more efficient • Easier to update/maintain
Group/Recode Data • Example • PROC FORMAT ; VALUE PERCEPTION 1 , 2 , . = 'Not at all Serious/Not Very Serious' 3 , 4 , 5 = 'Somewhat Serious/Very Serious' ; VALUE $ REGION 'Franklin','ED','Skyline','Midtown‘ = 'Central ' 'Lakewood','Arapahoe' = 'West ' 'East','Long Term Care','PARC' = 'East ' 'Westminster','Wheatridge','Boulder‘= 'North ' 'Southwest','Smoky Hill','Englewood'= 'South ' other = 'Out Of Area' ; RUN; PROC FREQ DATA=NEWDATA; FORMAT LOC $region. RISK4 perception.; TABLES LOC RISK4; RUN;
Group/Recode Data • Output
Table Lookups • Linking data in one file to related data in another file • MERGE often used • Sort data • Merge datasets together • PROC FORMAT alternative • Less coding • Less CPU • Easier to update with new or different values
Table Lookups • The VALUE statement in PROC FORMAT • PROCFORMAT; VALUE LOCDESC 2 = 'Franklin' 3 = 'Lakewood' 4 = 'ED' 6 = 'East' 7 = 'Westminster' 8 = 'Arapahoe' 9 = 'Southwest' 10 = 'Wheatridge' 77 = 'PARC' ; RUN; PROCFREQ DATA=BYLOCATION; FORMAT LOCATION $LOCDESC.; TABLES LOCATION; RUN;
Table Lookups • The CTNLIN Statement • Used for when you have substantially longer lists of lookups (e.g., ICD-9 descriptions linked to the ICD-9 codes, N=8,000) • Lookup dataset containing ICD-9 and description • Study dataset containing all study data
Table Lookups • The CTNLIN Statement • DATA CONTROL (KEEP=FMTNAME START LABEL); SET MYDIR.DX_CD (KEEP=DX DX_DESC); RETAIN FMTNAME '$DIAGDESC'; RENAME DX=START DX_DESC=LABEL; RUN; PROCSORT DATA=CONTROL NODUPKEY; BY START; RUN; PROCFORMAT CNTLIN=CONTROL; RUN; PROCPRINT DATA=VISITS (OBS=10); VAR DX ; FORMAT DX $DIAGDESC.; TITLE1 'TABLE LOOKUPS USING THE CNTLIN OPTION'; RUN;
Table Lookups • Output using CTNLIN Statement
Creating New Variables • Alternative for transforming data • Can be created from both pre-defined formats or custom, user-defined formats
Creating New Variables • Example • Converting numeric dates into SAS dates using the PUT and INPUT functions • DATA NEWSTUDY;SET MAINFRAME; FORMAT NEWDATE DATE9.; NEWDATE = INPUT( PUT ( OLDDATE,8. ), YYMMDD8.); RUN; PROCPRINT DATA=NEWSTUDY; VAR OLDDATE NEWDATE; RUN;
Creating New Variables • Output
Specifying Ranges of Values • Single values • 6 = “East” • List of values separated by commas • ‘One’,’Two’,’Three’ = “Group 1” • 221, 335, 589 = “Numeric Group” • Inclusive ranges • Numeric or character • 1 – 18 = “Pediatrics”
Specifying Ranges of Values • Exclusive ranges • Numeric or character • 1 -- < 19 = “Pediatrics” • 65 < -- 100 = “Medicare” • Extreme values • Does not format missing values for numeric formats. Missing values included for character formats • Low --< 65 = “Non-Medicare” • 65 – high = “Medicare” • Not specified or out of range • Includes missing values unless accounted for already in the VALUE statement
Multilabel formats • Used when you want to assign a single value to more than one range of values or overlapping range values • Counts single observations in more than one place • In SASv8 and later • Only MEAN, SUMMARY and TABULATE procedures can utilize Multilabel formats
MULTILABEL Formats • Example • We want to group patients in the following age groups • Pediatrics • Adult • Medicare • 0 – 12 • 13 – 18 • 19 – 64 • 65 and higher
MULTILABEL Formats • Example code • PROCFORMAT; VALUE MLTI_FMT (MULTILABEL) 0 - 18 = 'PEDIATRICS' 18 <- <65 = 'ADULT' 65 - HIGH = 'MEDICARE' 0 - 12 = '0 - 12' 13 - 18 = '13 - 18' 19 -< 65 = '19 - 64' 65 - HIGH = '65 & OLDER' ; QUIT;RUN; PROCMEANS DATA=NPDW.JAN98 MEAN MEDIAN; TITLE 'UTILIZING MULTIFORMATS'; FORMAT SVC_AGE MLTI_FMT.; CLASS SVC_AGE/MLF; VAR SVC_AGE; RUN;
MULTILABEL Formats • Output
MULTILABEL Formats • Alternative Code for other sorting • PROCFORMAT; VALUE MLTI_FMT (MULTILABEL NOTSORTED) 0 - 12 = '0 - 12' 13 - 18 = '13 - 18' 19 -< 65 = '19 - 64' 65 - HIGH = '65 & OLDER' 0 - 18 = 'PEDIATRICS' 18 <- <65 = 'ADULT' 65 - HIGH = 'MEDICARE' ; RUN; PROCMEANS DATA=NPDW.JAN98 MEAN MEDIAN; TITLE 'UTILIZING MULTIFORMATS - NOTSORTED'; FORMAT SVC_AGE MLTI_FMT.; CLASS SVC_AGE/MLF PRELOADFMT ORDER=DATA; VAR SVC_AGE; RUN;
MULTILABEL Formats • Output
Saving Formats • Unless otherwise indicated, formats created in your program will be temporary and stored in the temporary Formats Catalog. • Temporary formats deleted when SAS is shut down • Permanent Formats Catalogs may be designated for permanent storage of formats • Maintaining and revising code done only once • Accessible without copying from another program • Accessible to multiple people when saved to a server
Saving Formats • Establish library first • Use LIBNAME statement or New Library window • Default for SAS is to search only the library of predefined SAS formats and in the temporary Formats Catalog in the WORK library • FMTSEARCH option will give SAS explicit search instructions
Saving Formats • Example • LIBNAME DSUG 'G:\ANALYST SAS CODE'; PROCFORMAT LIBRARY=DSUG; VALUE LOCDESC 2 = 'Franklin' 3 = 'Lakewood' 4 = 'ED' 6 = 'East' 7 = 'Westminster' 8 = 'Arapahoe' 9 = 'Southwest' 10 = 'Wheatridge' 77 = 'PARC' ; RUN; OPTIONS FMTSEARCH = (DSUG) NOFMTERR; PROCFREQ DATA=DSUG.LOCATIONS; FORMAT LOCATION $LOCDESC.; TABLES LOCATION; RUN;
Conclusion • FORMATS may be underutilized • FORMATS are powerful • FORMATS are easy to learn • FORMATS are straightforward to use