SAS Basics
Windows
• Program Editor: write and edit all your statements here.
• Log: watch this for any errors in the program as it runs.
• Output: automatically pops to the front when there is output. It does not need to occupy screen space during program editing.
File Organization
• Create subfolders in your Project folder for:
  • Data: contains SAS datasets, with a .sd2 extension.
  • Formats: the compiled version of formats, a file with a .sc2 extension. Used for building classes of variables for looking at frequencies.
  • Output: save output files here. These are text files with a .lst extension.
  • Programs: all programs are text files with a .sas extension.
Creating a dataset
• Internal data
    DATA datasetname;
      INPUT name $ sex $ age;
      CARDS;
    John M 23
    Betty F 33
    Joe M 50
    ;
    RUN;
Creating a dataset
• External data
    DATA datasetname;
      INFILE 'c:\folder\subfolder\file.txt';
      INPUT name $ sex $ age;
    RUN;
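The INFILE statement also accepts options for files that are not simply space-separated. The following is a minimal sketch, assuming a hypothetical comma-delimited file (file.csv, not part of the original slides) whose first line holds column names:

```sas
DATA datasetname;
  /* Hypothetical path. DLM=',' sets the delimiter, DSD handles quoted
     values and consecutive commas, FIRSTOBS=2 skips the header row. */
  INFILE 'c:\folder\subfolder\file.csv' DLM=',' DSD FIRSTOBS=2;
  INPUT name $ sex $ age;
RUN;
```

After the step runs, check the Log window to confirm that the number of records read matches the number of data lines in the file.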
Creating from an existing one
    /* Keep only age and income */
    DATA save.data2 (keep = age income);
      SET save.data1;
    RUN;

    /* Drop age and add a computed variable */
    DATA save.data2;
      SET save.data1;
      DROP age;
      TAX = income*0.28;
    RUN;
Permanent Data Sets
    LIBNAME save 'c:\project\data';
    DATA save.data1;
      X = 25;
      Y = X*2;
    RUN;
Note that save is merely a name you make up to point to the location where you wish to save the dataset called data1. (It will be saved as data1.sd2.)
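Because the libref is only a pointer, a later session can reach the same dataset through a different name. A small sketch, where proj is a made-up libref used purely for illustration:

```sas
/* In a later session: any libref name works, as long as it points
   to the same folder. "proj" is a hypothetical name. */
LIBNAME proj 'c:\project\data';

PROC PRINT data=proj.data1;
RUN;
```

Given the DATA step above, this should list the single observation with X=25 and Y=50.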
What's in my SAS dataset?
    PROC CONTENTS data=save.data1;
    RUN;

    PROC CONTENTS data=save.data1 POSITION;
    RUN;
With the POSITION option, PROC CONTENTS prints the variable list sorted alphabetically plus a second list sorted by position (the sequence in which the variables actually exist in the file).
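Viewing file contents
    PROC PRINT data=save.data1;
    RUN;

    /* First 5 observations, two variables */
    PROC PRINT data=save.data1 (obs=5);
      VAR name age;
    RUN;

    /* First 12 observations, all variables from age through income in file order */
    PROC PRINT data=save.data1 (obs=12);
      VAR age -- income;
    RUN;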
Frequencies/Crosstabs
    /* One-way frequency tables */
    PROC FREQ data=save.data1;
      TABLES age income trades;
    RUN;

    /* Two-way crosstab */
    PROC FREQ data=save.data1;
      TABLES age*sex;
    RUN;
Scatter Plot
    PROC PLOT data=save.data1;
      PLOT Y*X;
    RUN;
Creating a Format Library
    PROC FORMAT LIBRARY=LIBRARY;
      VALUE BG
         0 = 'BAD'
         1 = 'GOOD'
        -1 = 'MISSING'
      ;
      VALUE TWO
        -1 = 'MISSING'
        -2 = 'NO RECORD'
        -3 = 'INQS. ONLY'
        -4 = 'PR ONLY'
         0 = '0'
         1 = '1'
         1<-HIGH = '2+'
      ;
    RUN;
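For LIBRARY=LIBRARY to work, the libref LIBRARY has to be assigned before PROC FORMAT runs. The sketch below assumes the format catalog lives in a hypothetical c:\project\formats folder, and it borrows the variable good and the dataset save.treg2 from the regression slides as an assumed 0/1/-1 variable. It then uses the BG format for a single procedure without changing the stored dataset:

```sas
/* Assumed folder for the format catalog; adjust to your project. */
LIBNAME LIBRARY 'c:\project\formats';
LIBNAME save 'c:\project\data';

/* Apply BG for this procedure only. "good" is assumed to be
   coded 0, 1, or -1, matching the BG format values. */
PROC FREQ data=save.treg2;
  TABLES good;
  FORMAT good bg.;
RUN;
```

By default SAS searches the WORK and LIBRARY librefs for format catalogs, which is why the conventional name LIBRARY is used here.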
Applying a format to a variable
    PROC DATASETS library=save;
      MODIFY data1;
      FORMAT trades ten.;
    RUN;
    QUIT;
This applies the format called ten to the variable trades. A subsequent PROC FREQ statement for trades will show the format applied. Note that ten must already exist in the format library for this to work.
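One way to confirm the change is sketched here, assuming the ten format has already been created as the note requires. PROC CONTENTS should list the format next to trades, and PROC FREQ should report the formatted categories:

```sas
/* The Format column for trades should now show the ten. format */
PROC CONTENTS data=save.data1;
RUN;

/* Rows are grouped and labelled by the ten. format */
PROC FREQ data=save.data1;
  TABLES trades;
RUN;
```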
Applying a format: Method 2
    DATA save.data2;
      SET save.data1;
      FORMAT trades bktrds ten.
             totbal mileage.;
    RUN;
• This is another way to apply formats when creating a new dataset (data2) from a previous one (data1) that has unformatted variables.
Random Selection of Obs.
    DATA save.new;
      SET save.old;
      Random1 = RANUNI(254987)*100;
      IF Random1 > 50 THEN OUTPUT;
    RUN;
    QUIT;
The function RANUNI requires a seed number and then produces random values between 0 and 1. Multiplying by 100 gives values between 0 and 100, which are stored under the variable name Random1 (you can choose any name). The above program will create new.sd2, with about half the observations of old.sd2, randomly chosen.
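The cutoff controls the sampling fraction. As a variation (not from the original slides), the sketch below keeps roughly 10% of the observations and uses a subsetting IF so that no helper variable is stored in the new dataset. The dataset name save.sample10 is hypothetical:

```sas
DATA save.sample10;          /* hypothetical output dataset name */
  SET save.old;
  /* RANUNI returns a value between 0 and 1, so about 10% of the
     observations pass this test. The same seed gives the same sample. */
  IF RANUNI(254987) < 0.10;
RUN;
```

The sample size is approximate, not exact, because each observation is kept or dropped independently.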
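Sorting and Merging Datasets
    /* Sort in place */
    PROC SORT data=save.junk;
      BY Age Income;
    RUN;

    /* Write the sorted copy to a new dataset */
    PROC SORT data=save.junk OUT=save.neat;
      BY acctnum;
    RUN;

    /* Drop observations with duplicate BY values */
    PROC SORT data=save.junk NODUPKEY;
      BY something;
    RUN;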
Sorting and Merging Datasets
    /* Both datasets must be sorted by the merge key first */
    PROC SORT data=save.one;
      BY Acctnum;
    RUN;

    PROC SORT data=save.two;
      BY Acctnum;
    RUN;

    DATA save.three;
      MERGE save.one save.two;
      BY Acctnum;
    RUN;
Sorting and Merging Datasets
    DATA save.three;
      MERGE save.one (IN = a) save.two;
      BY Acctnum;
      IF a;   /* keep only accounts that appear in save.one */
    RUN;
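To see what the IN= flags do, here is a small self-contained sketch with made-up data. The temporary datasets one, two, and both and the variable balance are invented for illustration. With a flag on each input, the subsetting IF can keep only accounts present in both datasets:

```sas
DATA one;
  INPUT Acctnum balance;
  CARDS;
1 100
2 200
3 300
;
RUN;

DATA two;
  INPUT Acctnum trades;
  CARDS;
2 5
3 7
4 9
;
RUN;

/* Both inputs are already in Acctnum order, so no PROC SORT is needed */
DATA both;
  MERGE one (IN = a) two (IN = b);
  BY Acctnum;
  IF a AND b;   /* keep only accounts present in both datasets */
RUN;

PROC PRINT data=both;
RUN;
```

Here both should contain accounts 2 and 3 only. Changing the test to IF a; would keep accounts 1, 2, and 3, with trades missing for account 1.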
Using Arrays
    DATA save.new;
      SET save.old;
      ARRAY vitamin(6) a b c d e k;
      DO i = 1 TO 6;
        IF vitamin(i) = -5 THEN vitamin(i) = .;
      END;
    RUN;
This assumes you have 6 variables called a, b, c, d, e, and k in save.old. This program will modify all 6 such that any instance of a -5 value is converted to a missing value.
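When the recode should cover every numeric variable, the list does not have to be typed out. The sketch below is a variation on the program above, not the author's version: it uses the special name list _NUMERIC_ and the DIM function so the loop adapts to however many variables there are. The array name nums is arbitrary:

```sas
DATA save.new;
  SET save.old;
  /* All numeric variables defined so far, i.e. those read from save.old.
     The loop counter i is created afterwards, so it is not in the array. */
  ARRAY nums(*) _NUMERIC_;
  DO i = 1 TO DIM(nums);
    IF nums(i) = -5 THEN nums(i) = .;
  END;
  DROP i;   /* do not store the loop counter */
RUN;
```

This is broader than the six-variable version, so it only fits when -5 means missing in every numeric column.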
Simple Correlations
    /* Correlations among the VAR variables */
    PROC CORR data=save.relative;
      VAR tvhours study;
    RUN;

    /* Correlations of each VAR variable with Score */
    PROC CORR data=save.relative;
      VAR tvhours study;
      WITH Score;
    RUN;
Run Regression Analysis
• Runs the regression and stores the estimates in a file called estfile.
    PROC REG data=save.treg2 corr outest=estfile;
      bgscore: MODEL good = trades01 trades02 ageavg01 ageavg02 / selection=none;
    RUN;
    QUIT;
Score the data
• Score the data in treg1 and save the output in save.scrdata.
    PROC SCORE data=save.treg1 score=estfile out=save.scrdata type=parms;
      VAR trades01 trades02 ageavg01 ageavg02;
    RUN;
    QUIT;
Format bgscore
• Format the bgscore variable in the new save.scrdata file. Find or create a format from the format.sas file to apply to the bgscore variable.
    PROC DATASETS library=save;
      MODIFY scrdata;
      FORMAT bgscore insert_format_here.;
    RUN;
    QUIT;
Creating Dummy Variables
    %MACRO DUMMY(VAR, FIRST, LAST, TOT);
      IF (&FIRST <= &VAR <= &LAST) THEN &VAR.&TOT = 1;
      ELSE &VAR.&TOT = 0;
      LABEL &VAR.&TOT = "&VAR: &FIRST - &LAST ";
    %MEND DUMMY;

    DATA save.testreg2;
      SET save.testreg;
      %DUMMY(AGEOTD, 0, 78, 1);
      %DUMMY(AGEOTD, 96, 119, 2);
      %DUMMY(AGEOTD, 120, 143, 3);
      %DUMMY(AGEOTD, 144, 179, 4);
      %DUMMY(AGEOTD, 180, 99999999, 5);
    RUN;
    QUIT;
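It can help to see what one macro call turns into. The sketch below writes out by hand the statements that %DUMMY(AGEOTD, 0, 78, 1) should generate, placed in the same DATA step. The period in &VAR.&TOT only marks the end of the macro variable name, so the new variable is AGEOTD1:

```sas
/* Ask SAS to print macro-generated statements in the Log */
OPTIONS MPRINT;

DATA save.testreg2;
  SET save.testreg;
  /* Hand-written equivalent of %DUMMY(AGEOTD, 0, 78, 1) */
  IF (0 <= AGEOTD <= 78) THEN AGEOTD1 = 1;
  ELSE AGEOTD1 = 0;
  LABEL AGEOTD1 = "AGEOTD: 0 - 78 ";
RUN;
```

With MPRINT turned on, running the macro version should show matching statements in the Log window, which is a convenient way to check the other four calls.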