
Ann Arbor ASA Up and Running Series: SAS



Presentation Transcript


  1. Ann Arbor ASA Up and Running Series: SAS. Sponsored by the Ann Arbor Chapter of the American Statistical Association and the Department of Statistics of the University of Michigan

  2. Contents
     • Starting SAS
     • User Interface
     • Libraries
     • Syntax
     • Getting Data into SAS
     • Examining Data
     • Manipulating Data
     • Descriptive Statistics
     • Graphing Data
     • Statistics in SAS

  3. Starting SAS
     • Start → SAS 9.3 (English)

  4. User Interface
     • Log: comments, warnings, etc.
     • Explorer/Results
     • Program Editor: write and submit commands
     • Output (not seen)

  5. Libraries
     • SAS stores data in Library folders, which must be created before data can be saved
     • Libraries are assigned with the LIBNAME statement
     • Four Libraries are defined by default at the start of SAS:
       • MAPS
       • SASHELP: holds help info and sample datasets
       • SASUSER: holds settings, etc.
       • WORK: default temporary Library for each session
         • All data stored in this folder will be deleted at the end of each SAS session
     • Creating permanent files/Libraries is recommended
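To make the difference between the default libraries concrete, here is a minimal sketch. It assumes a standard installation where the sample dataset SASHELP.CLASS is available; the dataset name class_temp is made up for illustration.

```sas
/* Write the list of currently assigned libraries (MAPS, SASHELP, SASUSER, WORK, ...) to the Log */
libname _all_ list;

/* SASHELP.CLASS is a sample dataset shipped with SAS; print its first 5 rows */
proc print data=sashelp.class (obs=5);
run;

/* A one-level name is stored in WORK by default, so this copy
   (hypothetical name class_temp) disappears when the session ends */
data class_temp;
  set sashelp.class;
run;
```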

  6. Libraries
     • Create a folder called ‘my_files’ on your desktop.
     • Run this command in SAS:
       LIBNAME a "C:\Users\uniquename\Desktop\my_files";
     • Refer to datasets in that folder with the prefix ‘a.’, as in ‘a.datasetname’.
     • TIP: Use memorable names for libraries, rather than ‘a’ (e.g., ‘raw’, ‘final’, ‘time1’, etc.)
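As a rough sketch of the tip above, the following assigns a more memorable libref, saves a permanent dataset in it, and lists what the library contains. The libref ‘raw’ and the dataset name class_copy are illustrative choices, and the path is the placeholder from the slide.

```sas
/* 'raw' is an illustrative libref; replace the path with your own folder */
libname raw "C:\Users\uniquename\Desktop\my_files";

/* Save a permanent copy of a SAS sample dataset in that folder
   (class_copy is a hypothetical name) */
data raw.class_copy;
  set sashelp.class;
run;

/* List the datasets in the library; NODS suppresses per-dataset details */
proc contents data=raw._all_ nods;
run;
```

Because the dataset has a two-level name, it remains in the folder after the session ends, unlike datasets in WORK.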

  7. Syntax
     • SAS divides commands into two groups
       • DATA step: create/alter datasets
       • PROC (Procedures): perform statistical analyses or generate reports
     • Some exceptions to the rule:
       • DATA step can be used to generate reports
       • PROC IMPORT creates a dataset
       • PROC SORT alters datasets (without telling you!)

  8. Getting data into SAS
     • PROC IMPORT
       • Reads standard file types
       • Reads plain text with user-specified delimiters (i.e., the characters which separate the data)
       • WARNING: SAS changed PROC IMPORT for Excel and Access files in 64-bit SAS
     • DATA step
       • Reads non-standard file types, complex file structures, and unusual delimiters
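A minimal sketch of PROC IMPORT for delimited text is shown below. The file names and output dataset names are hypothetical, and the libref ‘a’ is assumed to be assigned as on the Libraries slide.

```sas
/* Comma-separated file with variable names in the first row (hypothetical file) */
proc import datafile="C:\Users\uniquename\Desktop\my_files\class2.csv"
            out=a.class_csv      /* hypothetical output dataset */
            dbms=csv
            replace;             /* overwrite a.class_csv if it already exists */
  getnames=yes;
run;

/* Plain text with a user-specified delimiter, here a vertical bar */
proc import datafile="C:\Users\uniquename\Desktop\my_files\class2.txt"
            out=a.class_bar      /* hypothetical output dataset */
            dbms=dlm
            replace;
  delimiter='|';
  getnames=yes;
run;
```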

  9. DATA step
     • SAS syntax can be used to read in raw data files (.txt, .csv), specifying which variables to read in, which ones are text/numeric, how to combine multiple rows into one case, etc.
     • However, this is a more advanced topic.
     • Follow up with an Intro class from CSCAR, or by going through examples from the literature (e.g., ‘The Little SAS Book’).
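For orientation only, here is a small sketch of what such a DATA step can look like. The file name, the output dataset name, and the assumption that the file holds name, sex, age, height and weight separated by commas with a header row are all illustrative.

```sas
data a.class_txt;                            /* hypothetical output dataset */
  infile "C:\Users\uniquename\Desktop\my_files\class2.txt"
         dlm=','        /* comma is the delimiter */
         dsd            /* treat consecutive delimiters as missing values */
         firstobs=2     /* skip the header row */
         truncover;     /* do not jump to the next line if a row is short */
  /* $ marks character variables; :$20. reads name with a length of up to 20 */
  input name :$20. sex $ age height weight;
run;
```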

  10. Examining Data
      • VIEWTABLE Window
        • Select dataset icon in Explorer
      • PROC CONTENTS
        • Produces a listing of dataset information, including the variables and their properties
      • PROC PRINT
        • Prints a subset of variables or cases to the output window

  11. VIEWTABLE Window

  12. PROC CONTENTS
      • In the Editor window, type:
        PROC CONTENTS data=a.class2;
        run;
      • Highlight the syntax
      • Submit for processing, either way:
        • Click on the ‘running-man’ icon
        • Right click on selected syntax → Submit Selection

  13. PROC CONTENTS

  14. PROC PRINT
      • In the Editor window, type:
        PROC PRINT data=a.class2;
        run;
      • Submit for processing
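PROC PRINT can also be limited to a subset of variables or cases. The sketch below assumes a.class2 contains the variables name, age and height that appear elsewhere in this presentation; the age cutoff is arbitrary.

```sas
proc print data=a.class2 (obs=10) noobs;  /* stop after 10 rows; NOOBS hides the row-number column */
  var name age height;                    /* print only these variables */
  where age >= 13;                        /* print only cases meeting this condition */
run;
```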

  15. PROC PRINT

  16. Manipulating Data
      • Usually done within a DATA step
        • Match datasets using a shared key variable
        • Create new variables, or drop/rename existing variables
        • Take one or more subsets of the data
        • Sort the data by specific variable(s)
        • Overwrite existing or create new datasets
      • Topics covered next:
        • PROC SORT
        • Adding/Removing variables
        • Merging Datasets
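Subsetting and renaming are not demonstrated on the later slides, so here is a minimal sketch of both in one DATA step. The output dataset name class2teens, the new variable name gender, and the age cutoff are illustrative assumptions.

```sas
/* Create a new dataset rather than overwriting a.class2 */
data a.class2teens (rename=(sex=gender));  /* rename sex to gender in the output */
  set a.class2;
  where age >= 13;                         /* keep only cases aged 13 or older */
run;
```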

  17. PROC SORT
      • In the Editor window, type:
        PROC SORT data=a.class2 out=a.class2sorted;
          by age descending weight height;
        run;
      • Submit for processing
      • Note: DESCENDING applies only to the variable that follows it (here, weight); age and height are sorted in ascending order
      • WARNING: without OUT=, PROC SORT overwrites the input dataset with the sorted data
        • Store the sorted data in a new dataset with out=newdatasetname

  18. PROC SORT

  19. Adding/Removing variables
      • Create new dataset, compute new variables, remove unwanted variables:
        DATA a.class2metric (drop=weight height sex age);
          set a.class2;
          height_cm=height*2.54;
          weight_kg=weight/2.2;
          label height_cm='Height in CM'
                weight_kg='Weight in Kilograms';
        run;
        PROC PRINT data=a.class2metric;
        run;
      • Submit for processing

  20. Adding/Removing variables

  21. Merging Datasets
      • Datasets must be sorted by the same key variable(s):
        proc sort data=a.class2;
          by name;
        run;
        proc sort data=a.class2metric;
          by name;
        run;
        data a.classmerged;
          merge a.class2 a.class2metric;
          by name;
        run;
      • Submit for processing
      • The merged dataset is saved in library ‘a’ so that the later PROC REG example can refer to it as a.classmerged
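When the two datasets might not contain exactly the same cases, the IN= dataset option can flag where each row came from. The following is a sketch under the assumption that both datasets are already sorted by name as above; the flag names and the output dataset name matched are made up.

```sas
data matched;                           /* hypothetical dataset, stored in WORK */
  merge a.class2       (in=in_orig)     /* in_orig = 1 if the row has a match in a.class2 */
        a.class2metric (in=in_metric);  /* in_metric = 1 if it has a match in a.class2metric */
  by name;
  if in_orig and in_metric;             /* keep only names present in both datasets */
run;
```

Dropping the IF statement keeps all names from either dataset, with missing values where one side has no match.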

  22. Merging Datasets

  23. Merging Datasets

  24. Descriptive Statistics
      • PROC FREQ
        • Produces a table of counts and percentages
        • For cross-tabulations, statistical tests can also be performed; e.g., independence testing
      • PROC MEANS
        • Produces descriptive statistics such as mean, standard deviation, minimum, maximum

  25. PROC FREQ
      • In the Editor window, type:
        proc freq data=a.class2;
          tables age*sex;
        run;
      • Submit for processing
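The independence test mentioned on the Descriptive Statistics slide can be requested as an option on the TABLES statement. A minimal sketch, using the same a.class2 variables:

```sas
proc freq data=a.class2;
  tables age*sex / chisq;   /* adds chi-square tests of independence to the cross-tabulation */
run;
```

With a small dataset and many age levels, expected cell counts may be low, in which case the chi-square results should be read with caution.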

  26. PROC FREQ

  27. PROC MEANS
      • In the Editor window, type:
        proc means data=a.class2;
          var age weight height;
        run;
      • Submit for processing
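As a possible extension, PROC MEANS can report a chosen set of statistics separately for each level of a grouping variable. This sketch assumes a.class2 contains the variable sex, as used on the PROC FREQ slide.

```sas
proc means data=a.class2 n mean std min max maxdec=2;  /* choose statistics; show 2 decimal places */
  class sex;                  /* one set of statistics per level of sex */
  var age weight height;
run;
```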

  28. PROC MEANS

  29. Graphing Data: PROC GPLOT
      • Simple bivariate scatterplot
      • Separate lines
      • Multiple variables scatterplot
      • Options

  30. PROC GPLOT
      • Simple bivariate scatterplot:
        proc gplot data=a.class2;
          symbol1 value=dot interpol=rl;
          plot weight*height;
        run;
      • Submit for processing

  31. PROC GPLOT - Log

  32. PROC GPLOT

  33. PROC GPLOT
      • To graph separate lines for each level of a categorical variable, type:
        proc gplot data=a.class2;
          symbol1 value=dot interpol=rl;
          plot weight*height = sex;
        run;
      • Submit for processing

  34. PROC GPLOT

  35. PROC GPLOT
      • Multiple variables on the same graph:
        proc gplot data=a.class2;
          symbol1 value=dot interpol=rl color=blue;
          symbol2 value=dot interpol=rl color=red;
          plot weight * age;
          plot2 height * age;
        run; quit;
      • Submit for processing

  36. PROC GPLOT

  37. PROC GPLOT
      • value=___
        • Any character enclosed in single quotes
        • Special characters: dot, plus sign, star, square, ...and many others
      • interpol=___
        • RL / RQ / RC: linear / quadratic / cubic regression curves
        • JOIN: connects consecutive points (line graph)
        • BOX
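To illustrate how these options combine, here is a sketch of a line graph using interpol=join with a star plotting symbol. JOIN connects points in the order they appear in the data, so the example first sorts a copy by the x-axis variable; the dataset name class2_by_age is hypothetical.

```sas
/* Sort a copy by the x-axis variable so JOIN connects points left to right */
proc sort data=a.class2 out=class2_by_age;
  by age;
run;

goptions reset=symbol;   /* clear SYMBOL definitions left over from earlier plots */
symbol1 value=star interpol=join color=black;

proc gplot data=class2_by_age;
  plot height*age;
run;
quit;
```

SYMBOL statements are global, so settings from an earlier plot stay in effect until they are redefined or reset; this is why the sketch resets them first.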

  38. Statistics in SAS
      • PROC CORR
        • Correlational analyses
      • PROC REG
        • Statistical Regression
      • PROC UNIVARIATE
        • To assess normality of regression residuals

  39. PROC CORR
      • Compute bivariate correlation coefficients:
        proc corr data = a.class2;
          var age;
          with height weight;
        run;

  40. PROC CORR

  41. PROC REG
      • Run a regression on the merged ‘class’ dataset
      • Save residuals and predicted values in an output dataset
      • Request a residual plot
        proc reg data=a.classmerged;
          model height_cm=age weight / partial;
          output out=reg_data p=predict r=resid rstudent=rstudent;
          plot rstudent. * height_cm;
        run; quit;
      • Notes: the QUIT statement terminates the regression procedure, which otherwise keeps running. The output dataset is stored in the WORK library, since no library was specified.

  42. PROC REG

  43. PROC REG

  44. PROC REG

  45. PROC REG

  46. PROC UNIVARIATE
      • Assess normality of regression residuals stored in the output dataset from PROC REG:
        proc univariate data=reg_data;
          var rstudent;
          histogram;
          qqplot / normal (mu=est sigma=est);
        run; quit;
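Alongside the plots, formal normality tests can be requested with the NORMAL option on the PROC UNIVARIATE statement. A minimal sketch, assuming the reg_data dataset created by the PROC REG example exists in WORK:

```sas
proc univariate data=reg_data normal;  /* NORMAL adds a "Tests for Normality" table */
  var rstudent;
run;
```

The table includes the Shapiro-Wilk test among others. With a small sample these tests have limited power, so the plots remain worth checking.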

  47. PROC UNIVARIATE

  48. PROC UNIVARIATE

  49. PROC UNIVARIATE

  50. QUESTIONS
