
SAS Review and Tips

SAS Review and Tips. PM 515 Lecture 2 January 21, 2011. Paper progress. Which dataset have you chosen? What hypotheses are you interested in? What variables will you use, and how are they measured?


Presentation Transcript


  1. SAS Review and Tips PM 515 Lecture 2 January 21, 2011

  2. Paper progress • Which dataset have you chosen? • What hypotheses are you interested in? • What variables will you use, and how are they measured? • Next step: Start working on reading the dataset into SAS and looking at the frequencies of the variables.

  3. SAS Review and Tips

  4. Overall goals • Read in your data • Get the variables coded the way you want them • Run the right statistics to test your hypotheses

  5. Data steps and proc steps • Data steps are for manipulating variables. Examples: • Code females as 1, males as 0 • Add up a series of depression questions to create a depression score • If “don’t know” was coded as 999, recode the 999s to missing
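The three examples on this slide might look like this in a single data step. This is only a sketch: the dataset RAW and the variables SEX, DEP1-DEP3, and INCOME are hypothetical names invented for illustration.

```sas
/* Hypothetical toy data: sex is 'F'/'M', dep1-dep3 are depression items,
   and income uses 999 for "don't know". */
data raw;
  input id sex $ dep1 dep2 dep3 income;
  datalines;
1 F 2 3 1 40
2 M 1 1 2 999
3 F 3 3 3 55
;
run;

data recoded;
  set raw;
  /* Code females as 1, males as 0 (female stays missing if sex is blank) */
  if sex = 'F' then female = 1;
  else if sex = 'M' then female = 0;
  /* Add up the depression items to create a depression score */
  depscore = dep1 + dep2 + dep3;
  /* Recode the 999s to missing */
  if income = 999 then income = .;
run;

proc print data=recoded;
run;
```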

  6. Proc steps are for running statistical analyses • Examples: • Proc ttest runs a ttest • Proc corr runs a correlation • Proc reg runs a regression
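A minimal sketch of the three procs named above, run on a made-up dataset. The dataset STUDY and the variables GROUP, SCORE, and AGE are assumptions for illustration, not part of the lecture.

```sas
/* Hypothetical toy data: two groups, an outcome, and a covariate */
data study;
  input group $ score age;
  datalines;
A 10 21
A 12 25
A 9 30
B 15 22
B 14 28
B 17 35
;
run;

/* t-test: compare mean score between the two groups */
proc ttest data=study;
  class group;
  var score;
run;

/* Correlation between score and age */
proc corr data=study;
  var score age;
run;

/* Regression of score on age */
proc reg data=study;
  model score = age;
run;
quit;  /* PROC REG stays active until QUIT or the next step */
```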

  7. General flow of SAS programming • Bring your dataset into SAS. • Do all your data transformations in a data step. • Do your analyses in proc steps. • Try to avoid alternating data step, proc step, data step, proc step, etc. • If you do all your data transformations in one data step at the beginning, you won’t rewrite over your earlier work.

  8. Step 1: Bring your data into SAS

  9. Libnames • A LIBNAME statement tells SAS which subdirectory (folder) your dataset is in. • LIBNAME MYFOLDER 'C:/JENNIFER/PM515'; • English translation: Hello, SAS. Let’s look at a dataset. I have it saved on my computer in the C:/JENNIFER/PM515 folder. But instead of that long name, today let’s just call the folder MYFOLDER.

  10. Getting your dataset into SAS • What format is your dataset in? • SAS dataset • ASCII file • Excel file • Another program (Access, SPSS, etc.)

  11. If it’s already a SAS dataset • Use the SET statement in a DATA step. • DATA A; SET MYFOLDER.MYDATASET; • English translation: SAS, look in that folder that I told you about. In it is a dataset called MYDATASET. I want to work with MYDATASET today. Just for today, let’s call it A. • You don’t need infile, input, etc.!

  12. If it’s already a SAS dataset • Can also just double-click on it from a folder in your computer. • SAS will recognize it and open it. • SAS will open it in a temp directory. If you plan to make changes, you should save it in a permanent libname directory.

  13. If it’s an ASCII file • Use the INFILE and INPUT statements. • DATA A; INFILE (filename); • INPUT VARIABLE1 VARIABLE2 VARIABLE3; • RUN;
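The INFILE/INPUT pattern above can be tried without any external file, as in the sketch below. It first writes a small text file to a temporary location and then reads it back. The fileref RAWFILE and the variables ID, AGE, and SCORE are invented for this example; with a real file you would put its path in quotes on the INFILE statement instead.

```sas
/* Create a small space-delimited text file so the example runs anywhere.
   RAWFILE is a temporary fileref invented for this sketch. */
filename rawfile temp;

data _null_;
  file rawfile;
  put '1 23 4';
  put '2 35 2';
  put '3 41 5';
run;

/* Read the file: list input expects values separated by blanks */
data a;
  infile rawfile;   /* with a real file: infile 'path-to-your-file'; */
  input id age score;
run;

proc print data=a;
run;
```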

  14. If it’s an Excel spreadsheet • Open SAS and import it with the import wizard. • Click on File, Import data, and follow the wizard. • Try to avoid Excel workbooks with multiple worksheets. Save each worksheet as a separate Excel file. • Variable names should be across the top (without spaces or funny characters). • The older .xls format allows only 256 columns (variables) per sheet. If you have more than 256 variables in that format, you’ll have to use multiple sheets, import them each separately, and merge them. Each sheet should contain an ID variable so you can merge.
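Once the sheets are imported as separate SAS datasets, the merge step could look like the following sketch. SHEET1 and SHEET2 stand in for two imported worksheets; their names and variables are made up here.

```sas
/* Toy stand-ins for two imported worksheets that share an ID variable */
data sheet1;
  input id var1 var2;
  datalines;
1 10 20
2 11 21
3 12 22
;
run;

data sheet2;
  input id var3;
  datalines;
3 32
1 30
2 31
;
run;

/* MERGE with BY requires both datasets to be sorted by the BY variable */
proc sort data=sheet1; by id; run;
proc sort data=sheet2; by id; run;

data combined;
  merge sheet1 sheet2;
  by id;
run;

proc print data=combined;
run;
```

Check the log after the merge: COMBINED should have the same number of observations as each sheet if every ID appears exactly once in both.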

  15. If it’s an SPSS dataset • In SPSS, save the dataset as a portable file. • File/Save as/Portable file (.por). • Read the file into SAS. • libname mylibrary spss 'c:\jennifer\spssdataset.por'; • data sasdataset; set mylibrary._first_; run; • SAS gives the SPSS dataset the name _first_ for some reason.

  16. If it’s in another program like Access or Stata • Use the other program’s functions to save it in an exportable format • Or write it out as an ASCII or CSV file and read back into SAS • Or use a conversion program like DBMS/COPY
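For the CSV route, one way to read the file back is a data step with the DSD option, sketched below. The example writes its own small CSV to a temporary fileref (CSVFILE, a made-up name) so it runs as is; the column names are also invented.

```sas
/* Write a small CSV with a header row; CSVFILE is a temporary fileref */
filename csvfile temp;

data _null_;
  file csvfile;
  put 'id,age,city';
  put '1,34,Los Angeles';
  put '2,,Pasadena';
run;

/* DSD: comma-delimited, and two commas in a row mean a missing value.
   FIRSTOBS=2 skips the header row. */
data a;
  infile csvfile dsd firstobs=2 truncover;
  input id age city :$20.;
run;

proc print data=a;
run;
```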

  17. How do you know it worked? • Look in the log. • It should say “NOTE: The data set WORK.A has X observations and Y variables.” • If there are 0 observations, something is wrong. Look for errors in the log. You probably misspelled the libname or dataset name, or you specified a subdirectory that doesn’t really contain the dataset. • Don’t bother doing any analyses until your dataset has the right number of observations! Find out what went wrong first.

  18. How to print out a list of the variables in the dataset • PROC CONTENTS; RUN; • By default, the variables are listed alphabetically. Use the POSITION option to order them in the order they appear in the dataset. • Use the SHORT option to show just the names, not all the other stuff.

  19. How to see what your dataset looks like • PROC PRINT; RUN; • Prints out one line for each observation (respondent) • Prints the values for all the variables, in the order that they appear in the dataset • Warning: only do this with small datasets! • Another way to do this is to open the dataset in the Viewtable. Viewtable shows your dataset as a spreadsheet.
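The inspection steps from the last two slides, written out with an explicit DATA= option so it is clear which dataset is being examined. The toy dataset is made up; the OBS= dataset option is one way to keep PROC PRINT short on a large file.

```sas
/* Toy dataset whose variables are deliberately not in alphabetical order */
data a;
  input id zebra apple;
  datalines;
1 2 3
2 5 6
3 8 9
;
run;

/* Variable list in dataset order as well as alphabetical order */
proc contents data=a position;
run;

/* Just the variable names */
proc contents data=a short;
run;

/* Print only the first 2 observations */
proc print data=a (obs=2);
run;
```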

  20. How to make a frequency table • PROC FREQ; TABLES VAR1 VAR2 VAR3; RUN; • A 2X2 table: • TABLES VAR1*VAR2; • Include the missing values: • TABLES VAR1*VAR2/MISSING; • Chi-square test: • TABLES VAR1*VAR2/CHISQ; • Warning—only do this when your variables are categorical!
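Put together, the TABLES statements above might be used as in this sketch. The dataset SURVEY and the variables SMOKER and SEX are hypothetical, and the sample is far too small for a meaningful chi-square test; it only shows the syntax.

```sas
/* Toy categorical data; "." is read as a missing value.
   The trailing @@ lets one data line hold several observations. */
data survey;
  input smoker $ sex $ @@;
  datalines;
Y F N F N M Y M N F . M N M Y F
;
run;

proc freq data=survey;
  /* One-way frequency tables */
  tables smoker sex;
  /* Two-way table with a chi-square test; missing values are excluded */
  tables smoker*sex / chisq;
  /* Same table, but treating missing as its own category */
  tables smoker*sex / missing;
run;
```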

  21. How to transform or recode variables • Example: • Missing is coded as 999. • DATA B; SET A; • IF VAR1=999 THEN VAR1=.; • RUN;

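  22. How to reverse-code a variable • V1: I love SAS. • 1 = Strongly agree • 2 = Agree • 3 = Disagree • 4 = Strongly disagree • DATA B; SET A; • LOVESAS=5-V1; • RUN; • After reversing, a higher LOVESAS score means stronger agreement (1 becomes 4, 4 becomes 1).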

  23. How to change a character variable to numeric • NEWVAR=0+OLDVAR;
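The 0+OLDVAR trick relies on SAS converting the character value automatically, which writes a conversion note to the log. An explicit alternative, not shown in the lecture, is the INPUT function. The sketch below uses made-up data, and the 8. informat is an assumption about how wide the values are.

```sas
/* Toy data: OLDVAR is character, and one value is not a number */
data a;
  input oldvar $;
  datalines;
12
7.5
abc
;
run;

data b;
  set a;
  /* Explicit character-to-numeric conversion.
     The ?? modifier suppresses the invalid-data messages for values
     such as 'abc', which simply become missing. */
  newvar = input(oldvar, ?? 8.);
run;

proc print data=b;
run;
```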

  24. How to recode a lot of variables at once • DATA B; SET A; • ARRAY J(10) VAR1-VAR10; • DO I=1 TO 10; • IF J(I)=999 THEN J(I)=.; • END; • RUN;

  25. How to create a scale • V1. I love SAS. • V2. I look forward to using SAS when I come to work. • V3. I want to use SAS all day. • V4. The best part of my job is using SAS. • DATA B; SET A; • LOVESAS=V1+V2+V3+V4; • or LOVESAS=MEAN(V1,V2,V3,V4); • RUN; • Check the scale for internal consistency reliability: • PROC CORR ALPHA; VAR V1 V2 V3 V4; RUN;
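The two ways of building the scale treat missing items differently, which the sketch below makes visible on made-up data. The variable names TOTAL_PLUS, TOTAL_SUM, and N_ANSWERED are invented for the comparison.

```sas
/* Toy data: the second respondent skipped item V2 */
data a;
  input v1 v2 v3 v4;
  datalines;
1 2 3 4
4 . 3 4
;
run;

data b;
  set a;
  total_plus = v1 + v2 + v3 + v4;     /* missing if any item is missing   */
  total_sum  = sum(v1, v2, v3, v4);   /* adds only the nonmissing items   */
  lovesas    = mean(v1, v2, v3, v4);  /* averages the nonmissing items    */
  n_answered = n(v1, v2, v3, v4);     /* how many items were answered     */
run;

proc print data=b;
run;
```

For the second respondent, TOTAL_PLUS is missing, while SUM and MEAN use the three answered items. Which behavior you want depends on how the scale's author says to handle skipped items.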

  26. Warning about creating scales • If the variables are unstandardized, the one with the largest variance will contribute the most to the variance of the whole scale. • If you want each variable to contribute equally, standardize the variables first. • Especially important if the variables are measured in different metrics! (e.g., one is on a scale from 1 to 4 and another is on a scale from 0 to 1000).

  27. How to standardize variables • Proc standard data=a out=b mean=0 std=1; • Var variable1 variable2 variable3; • Run; • Then do the rest of your analyses on dataset b. • Variable1, variable2, and variable3 now have a mean of 0 and a standard deviation of 1.
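A quick way to confirm that the standardization worked is to run PROC MEANS on the output dataset, as in this sketch with made-up values.

```sas
/* Toy data with variables on very different metrics */
data a;
  input variable1 variable2 variable3;
  datalines;
1 100 0.2
2 450 0.9
3 300 0.4
4 980 0.7
;
run;

proc standard data=a out=b mean=0 std=1;
  var variable1 variable2 variable3;
run;

/* Each mean should be 0 and each standard deviation 1 (up to rounding) */
proc means data=b mean std;
  var variable1 variable2 variable3;
run;
```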

  28. Another warning about creating scales • Make sure all the variables are coded in the same direction (e.g., high score=high level of X). • If your Cronbach’s alpha is negative or very low, check this again.

  29. Clean, efficient programming

  30. What not to do • Libname mylib 'c://datasets'; • Data a; set mylib.mydataset; • newvar1=var1+3; • newvar2=5-var2; • diffscore=newvar2-newvar1; • Proc print; run; • Data a; • newscale=var5+var6+var7; • Proc corr; • var diffscore newscale; • Run; • What will happen?
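The trouble in that program is the second "Data a;": it has no SET statement, so SAS builds a brand-new dataset A with a single observation in which NEWSCALE and VAR5-VAR7 are all missing, and that replaces the earlier A. DIFFSCORE no longer exists, so PROC CORR stops with a "variable not found" error. One possible cleaned-up version is sketched below; the toy dataset MYDATASET stands in for mylib.mydataset so the example runs on its own.

```sas
/* Toy stand-in for mylib.mydataset */
data mydataset;
  input var1 var2 var5 var6 var7;
  datalines;
1 2 3 4 5
2 1 4 4 3
3 4 1 2 2
4 3 5 5 4
;
run;

/* All transformations in one data step, written to a new dataset */
data b;
  set mydataset;
  newvar1   = var1 + 3;
  newvar2   = 5 - var2;
  diffscore = newvar2 - newvar1;
  newscale  = var5 + var6 + var7;
run;

/* Analyses come afterward, each naming the dataset it uses */
proc print data=b;
run;

proc corr data=b;
  var diffscore newscale;
run;
```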

  31. Useful tips for clean programming • Line up and indent
      Data b; set a;
         if id ne .;
         newvar1=var1+3;
      Run;
      Proc freq;
         tables newvar1;
      Run;
      Proc corr;
         var var5 var6;
      Run;

  32. Don’t overwrite your master dataset • Libname mylibrary 'C:/jennifer/pm515'; • Data a; set mylibrary.permanentdataset;
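A sketch of the full pattern: read the master dataset into a temporary copy, make changes there, and save the result under a different name. To keep the example runnable anywhere, MYLIBRARY is pointed at the WORK folder and a toy master dataset is created first; both are stand-ins, and the name CLEANED_V1 is invented.

```sas
/* Stand-in so the sketch runs anywhere: point MYLIBRARY at the WORK folder.
   In practice this would be your own folder, e.g. 'C:/jennifer/pm515'. */
libname mylibrary "%sysfunc(pathname(work))";

/* Toy master dataset, standing in for the real permanent file */
data mylibrary.permanentdataset;
  input id v1;
  datalines;
1 4
2 999
;
run;

/* Work on a temporary copy; the master is only read, never written */
data a;
  set mylibrary.permanentdataset;
  if v1 = 999 then v1 = .;
run;

/* Save the cleaned version under a new (hypothetical) name */
data mylibrary.cleaned_v1;
  set a;
run;
```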

  33. Use consistent recoding conventions • Examples: • RV1 is V1, reversed. • SV1 is V1, standardized. • Make up your own conventions and use them consistently.

  34. Use arrays and macros for repeated tasks • (when you feel comfortable)

  35. A simple macro to run a logistic regression with multiple outcomes • %macro m1 (dv); • proc logistic; • model &dv=iv1 iv2 iv3; • run; • %mend m1; • %m1(outcome1); • %m1(outcome2); • %m1(outcome3);
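A slightly extended variant of the macro is sketched below on simulated data. Two things are added that the slide does not show, so treat them as assumptions: a DATA= parameter, so the macro does not depend on whichever dataset was created last, and the EVENT='1' option, because by default PROC LOGISTIC models the probability of the lower-ordered outcome value. The dataset TRIAL and its variables are invented.

```sas
/* Simulated toy data: three predictors and two 0/1 outcomes */
data trial;
  call streaminit(515);
  do id = 1 to 200;
    iv1 = rand('normal');
    iv2 = rand('normal');
    iv3 = rand('bernoulli', 0.5);
    outcome1 = rand('bernoulli', 0.4);
    outcome2 = rand('bernoulli', 0.5);
    output;
  end;
run;

/* DV is positional; DATA= is a keyword parameter with a default */
%macro m1(dv, data=trial);
  proc logistic data=&data;
    model &dv(event='1') = iv1 iv2 iv3;
  run;
%mend m1;

%m1(outcome1);
%m1(outcome2);
```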

  36. What you might want to do with your dataset now • Make sure you can read it into SAS. • Run a PROC FREQ on the variables you plan to use. • Too many missing? • Not enough variance? • Skip patterns? • Start recoding variables to the way you like them (0/1 vs. 1/2, reversing). • Start constructing scale scores (check the dataset’s documentation to make sure you’re being consistent with the author of the scale!). • Run some exploratory correlations, t-tests, chi-squares, etc.
