
PubH 6420 Introduction to SAS Programming

PubH 6420 Introduction to SAS Programming. Instructor: Greg Grandits TA: Michael Petzold Textbook: The Little SAS Book, 5th Edition. Course Information. Prerequisite: Want to learn SAS Evaluation: 6 assignments and 2 exams Monitored computer lab hours (Mayo C381) Web site for class


Presentation Transcript


  1. PubH 6420 Introduction to SAS Programming Instructor: Greg Grandits TA: Michael Petzold Textbook: The Little SAS Book, 5th Edition

  2. Course Information • Prerequisite: Want to learn SAS • Evaluation: 6 assignments and 2 exams • Monitored computer lab hours (Mayo C381) • Web site for class http://www.biostat.umn.edu/~greg-g/ph6420.html Datasets, programs, lectures, help links

  3. Course Information • Access to SAS using personal copy of PC SAS or any computer with SAS available to you. • Version 9.2 or 9.3 or 9.4

  4. Course Resources • Dataset Documentation • Case Report Forms for TOMHS • Data dictionary for TOMHS dataset • Instructions on working with TOMHS dataset • Datasets • SAS programs for download • Help tutorials

  5. SAS OS/Environment • Windows PC • UNIX/Linux

  6. Lecture 1 Readings • LSB (Chapter 1)

  7. What is SAS? • SAS is a programming language that reads, processes, and performs statistical analyses of data. • A SAS program is made up of programming statements, which SAS interprets to carry out these functions. Note: Programming statements are sometimes referred to as “syntax” or programming “code”. A program is sometimes called a “syntax” file.

  8. SAS Usage • Started in the late 1970s • Used extensively in academic and business environments (medical device and pharmaceutical companies) • Many analyses published in medical journals use SAS

  9. Parts of a SAS Program • DATA step • Reads in and processes your raw data and makes a SAS dataset. • Procedures (PROCs) • Perform specific statistical analyses • Some procedures are utility procedures, such as PROC SORT, which sorts your data
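The two parts can be seen side by side in a minimal sketch. The dataset name visits, its variables, and its three records are made up for illustration and are not from the course data.

```sas
* DATA step - read three raw records into a SAS dataset (hypothetical data);
data visits;
  infile datalines;
  input id $ sbp;
datalines;
A003 128
A001 140
A002 135
;
run;

* Utility procedure - sort the dataset by id;
proc sort data=visits;
  by id;
run;

* Reporting procedure - list the sorted data;
proc print data=visits;
run;
```

The DATA step builds the dataset; each PROC then works on the finished dataset rather than on the raw records.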

  10. Raw Data → DATA step: Read in Data → Process Data (Create new variables) → Output Data (Create SAS Dataset) → PROCs: Analyze Data Using Statistical Procedures

  11. Structure of Data • Made up of rows and columns • Rows in SAS are called observations • Columns in SAS are called variables • Together they make up the dataset (table) An observation (row) is all the information for one entity (patient, patient visit, clinical center, county). The SAS DATA step processes data one observation at a time.
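One way to watch the DATA step handle one observation at a time is to write a line to the log on each pass. This is only a sketch with two made-up records; it relies on the automatic variable _N_, which counts DATA step iterations, and on data _null_, which runs the step without creating a dataset.

```sas
data _null_;
  infile datalines;
  input gender $ age;
  * _N_ is an automatic variable that counts iterations of the DATA step;
  put 'Processing observation ' _N_ ': ' gender= age=;
datalines;
F 23
M 20
;
run;
```

After submitting, the log should show one PUT line per record, in the order the records were read.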

  12. Example of Data 12 observations and 5 variables
      F 23 S 15 MN
      F 21 S 15 WI
      F 22 S 09 MN
      F 35 M 02 MN
      F 22 M 13 MN
      F 25 S 13 WI
      M 20 S 13 MN
      M 26 M 15 WI
      M 27 S 05 MN
      M 23 S 14 IA
      M 21 S 14 MN
      M 29 M 15 MN

  13. Types of Variables In SAS • Numeric (e.g. age, blood pressure) • 54, 140 • Character (patient ID, diagnosis) • A001, TIA, 0410 You need to tell SAS if the data is character. The default is numeric.
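A small sketch of declaring character variables on the INPUT statement. The dataset name patients, the variable names, and the second record are assumptions for illustration; the values A001, TIA, 0410, 54, and 140 come from the slide.

```sas
data patients;
  infile datalines;
  * The $ after ptid and dx marks them as character. age and sbp default to numeric;
  input ptid $ age dx $ sbp;
datalines;
0410 54 TIA 140
A001 61 MI 152
;
run;

* PROC CONTENTS lists each variable with its type (Char or Num);
proc contents data=patients;
run;
```

Reading an ID such as 0410 as character keeps the leading zero, which would be lost if it were read as a number.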

  14. Rules for SAS Statements
      • SAS statements end with a semicolon (;)
          data demo;
          infile datalines;
          input gender $ age;
      • SAS statements can be entered in lowercase or uppercase
          data demo;
          infile datalines;
          input gender $ age;
        IS SAME AS:
          DATA DEMO;
          INFILE DATALINES;
          INPUT GENDER $ AGE;

  15. Rules for SAS Statements
      • Multiple SAS statements can appear on one line
          data demo; infile datalines; input gender $ age;
          X1 = 0; X2 = 0; X3 = 0; X4 = 0;
      • A SAS statement can use multiple lines
          input gender $
                age
                marstat;

  16. Rules for SAS Variable Names • Variable names can be from 1 to 32 characters long and must begin with a letter (A-Z) or an underscore (_). No special characters except the underscore are allowed. • OK AS VARIABLE NAMES • dbp12 • DiastolicBloodPressure • _dbp12 • NOT OK AS VARIABLE NAMES • 12dbp • dbp 12 • dbp*12
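The naming rules can be tried out in a short sketch, assuming the usual SAS naming rules are in effect (VALIDVARNAME=V7). The dataset name names and the value 80 are placeholders.

```sas
data names;
  * Valid names - begin with a letter or an underscore;
  dbp12 = 80;
  DiastolicBloodPressure = 80;
  _dbp12 = 80;
  * The invalid names are left inside comments because they would cause syntax errors;
  * 12dbp = 80 (begins with a digit);
  * dbp*12 = 80 (contains a special character);
run;

proc print data=names;
run;
```

Removing the leading * from one of the invalid assignments and resubmitting is a quick way to see the resulting error in the log.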

  17. DATA STEP
      * This is a short example program to demonstrate
        what a SAS program looks like. This is a comment
        statement because it begins with a * and ends with
        a semi-colon ;
      data demo;
        infile datalines;
        input gender $ age marstat $ credits state $ ;
        if credits > 12 then fulltime = 1;
        else fulltime = 2;
        if state = 'MN' then resid = 1;
        else resid = 2;
      datalines;
      F 23 S 15 MN
      F 21 S 15 WI
      F 22 S 09 MN
      F 35 M 02 MN
      F 22 M 13 MN
      F 25 S 13 WI
      M 20 S 13 MN
      M 26 M 15 WI
      M 27 S 05 MN
      M 23 S 14 IA
      M 21 S 14 MN
      M 29 M 15 MN
      ;
      run;
      SAS PROCEDURE
      proc print data=demo;
        var gender age marstat credits fulltime state ;
      run;
      * More procedures;

  18. 1 data demo;   Create a SAS dataset called demo
      2 infile datalines;   Where is the data?
      3 input gender $ age marstat $ credits state $ ;   What are the variable names and types?
      4 if credits > 12 then fulltime = 1; else fulltime = 2;
      5 if state = 'MN' then resid = 1; else resid = 2;
      Statements 4 and 5 create 2 new variables

  19. 6 datalines;   Tells SAS the data is coming
      F 23 S 15 MN
      F 21 S 15 WI
      F 22 S 09 MN
      F 35 M 02 MN
      F 22 M 13 MN
      F 25 S 13 WI
      M 20 S 13 MN
      M 26 M 15 WI
      M 27 S 05 MN
      M 23 S 14 IA
      M 21 S 14 MN
      M 29 M 15 MN
      ;   Tells SAS the data has ended
      7 run;   Tells SAS to run the statements above

  20. Main SAS Windows (PC) • Editor Window – where you type your program • Log Window – lists the program statements processed, with notes, warnings, and errors. Always look at the log window! It tells you how SAS understood your program • Output Window/Results Viewer – shows the output generated by the PROCs • Results Window – index to all of your output Submit the program by clicking on the run icon

  21. Messages in the SAS Log • Errors: fatal, in that the program will abort • Warnings: messages that are usually important • Notes: messages that may or may not be important (notes and warnings will not abort your program)

  22. * This is a short example program to demonstrate
        what a SAS program looks like. This is a comment
        statement because it begins with a * and ends with
        a semi-colon ;
      data demo;
        infile datalines;
        input gender $ age marstat $ credits state $ ;
        if credits > 12 then fulltime = 1;
        else fulltime = 2;
        if state = 'MN' then resid = 1;
        else resid = 2;
      datalines;
      F 23 S 15 MN
      F 21 S 15 WI
      F 22 S 09 MN
      F 35 M 02 MN
      F 22 M 13 MN
      F 25 S 13 WI
      M 20 S 13 MN
      M 26 M 15 WI
      M 27 S 05 MN
      M 23 S 14 IA
      M 21 S 14 MN
      M 29 M 15 MN
      ;
      run;
      title 'Running the Example Program';
      proc print data=demo;
        var gender age marstat credits fulltime state ;
      run;

  23. OUTPUT (Results) WINDOW

      Running the Example Program

      Obs  gender  age  marstat  credits  fulltime  state
        1  F        23  S             15  Y         MN
        2  F        21  S             15  Y         WI
        3  F        22  S              9  N         MN
        4  F        35  M              2  N         MN
        5  F        22  M             13  Y         MN
        6  F        25  S             13  Y         WI
        7  M        20  S             13  Y         MN
        8  M        26  M             15  Y         WI
        9  M        27  S              5  N         MN
       10  M        23  S             14  Y         IA
       11  M        21  S             14  Y         MN
       12  M        29  M             15  Y         MN

      The MEANS Procedure

      Variable     N            Sum           Mean
      ----------------------------------------------
      age         12    294.0000000     24.5000000
      credits     12    143.0000000     11.9166667
      ----------------------------------------------

      The FREQ Procedure

                                      Cumulative    Cumulative
      gender   Frequency   Percent     Frequency       Percent
      -----------------------------------------------------------
      F                6     50.00             6         50.00
      M                6     50.00            12         100.0

      proc means data=demo; var age credits;
      proc freq data=demo; tables gender;

  24. Some common procedures
      PROC PRINT • lists out your data - always a good idea!!
      PROC MEANS • descriptive statistics for continuous data
      PROC FREQ • descriptive statistics for categorical data
      PROC UNIVARIATE • detailed descriptive statistics for continuous data
      PROC TTEST • performs t-tests (continuous data)
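PROC PRINT, PROC MEANS, and PROC FREQ appear in the example program; the remaining two procedures could be run on the same demo data roughly as follows. This is a sketch: the choice of age as the analysis variable and gender as the grouping variable is an assumption made for illustration.

```sas
* Recreate the demo dataset from the example program so this block runs on its own;
data demo;
  infile datalines;
  input gender $ age marstat $ credits state $;
datalines;
F 23 S 15 MN
F 21 S 15 WI
F 22 S 09 MN
F 35 M 02 MN
F 22 M 13 MN
F 25 S 13 WI
M 20 S 13 MN
M 26 M 15 WI
M 27 S 05 MN
M 23 S 14 IA
M 21 S 14 MN
M 29 M 15 MN
;
run;

* Detailed descriptive statistics for a continuous variable;
proc univariate data=demo;
  var age;
run;

* Two-sample t-test comparing mean age between the two gender groups;
proc ttest data=demo;
  class gender;
  var age;
run;
```

The CLASS statement in PROC TTEST names the grouping variable, which must have exactly two levels for a two-sample test.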
