Lesson 4 Overview
• Descriptive Procedures
• Procedures FREQ, CORR, REG, SGPLOT
• Comment and Option Statements
• Program 4 in course notes
• LSB: See syllabus
• LSB: Chapter 11 – Debugging Programs
Program 4
DATA weight;
  INFILE 'C:\SAS_Files\tomhs.dat';
  INPUT @1   ptid   $10.
        @12  clinic $1.
        @27  age    2.
        @30  sex    1.
        @58  height 4.
        @85  weight 5.
        @140 cholbl 3. ;
  bmi = (weight*703.0768)/(height*height);
RUN;
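The BMI line can be checked by hand on a single record. The sketch below uses made-up values, not data from tomhs.dat, and assumes that weight is in pounds and height is in inches, which the conversion factor 703.0768 suggests but the slide does not state.

```sas
* Hypothetical values for illustration only (not from tomhs.dat);
DATA _NULL_;
  weight = 150;   * assumed to be pounds;
  height = 65;    * assumed to be inches;
  bmi = (weight*703.0768)/(height*height);
  PUT bmi= 6.2;   * writes the computed value to the SAS log;
RUN;
```

DATA _NULL_ runs the statements without creating a data set, which is convenient for quick arithmetic checks like this one.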
PROC FREQ DATA=weight;
  TABLES clinic sex;
  TITLE 'Frequency Distribution of Clinical Center and Gender';
RUN;

Frequency Distribution of Clinical Center and Gender

The FREQ Procedure

                                  Cumulative    Cumulative
clinic    Frequency    Percent     Frequency       Percent
----------------------------------------------------------
A                18      18.00            18         18.00
B                29      29.00            47         47.00
C                36      36.00            83         83.00
D                17      17.00           100        100.00

                                  Cumulative    Cumulative
sex       Frequency    Percent     Frequency       Percent
----------------------------------------------------------
1                73      73.00            73         73.00
2                27      27.00           100        100.00
PROC FREQ DATA=weight;
  TABLES clinic / NOCUM;
  TITLE 'Frequency Distribution of Clinical Center';
  TITLE2 '(No Cumulative Percentages)';
RUN;

Frequency Distribution of Clinical Center
(No Cumulative Percentages)

The FREQ Procedure

clinic    Frequency    Percent
-------------------------------
A                18      18.00
B                29      29.00
C                36      36.00
D                17      17.00
* 2-Way Frequency Tables;
PROC FREQ DATA=weight;
  TABLES sex*clinic;
  TITLE 'Cross Tabulation of Clinical Center and Sex';
RUN;

* Adding a two-way plot;
PROC FREQ DATA=weight;
  TABLES sex*clinic / PLOTS=FREQPLOT(TWOWAY=GROUPHORIZONTAL);
RUN;
Cross Tabulation of Clinical Center and Sex

The FREQ Procedure

Table of sex by clinic

sex       clinic

Frequency|
Percent  |
Row Pct  |
Col Pct  |A       |B       |C       |D       |  Total
---------+--------+--------+--------+--------+
       1 |     12 |     20 |     30 |     11 |     73
         |  12.00 |  20.00 |  30.00 |  11.00 |  73.00
         |  16.44 |  27.40 |  41.10 |  15.07 |
         |  66.67 |  68.97 |  83.33 |  64.71 |
---------+--------+--------+--------+--------+
       2 |      6 |      9 |      6 |      6 |     27
         |   6.00 |   9.00 |   6.00 |   6.00 |  27.00
         |  22.22 |  33.33 |  22.22 |  22.22 |
         |  33.33 |  31.03 |  16.67 |  35.29 |
---------+--------+--------+--------+--------+
Total          18       29       36       17      100
            18.00    29.00    36.00    17.00   100.00

Percent men in clinic A: the Col Pct for sex = 1 in clinic A, 66.67 (12 of 18).
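Each cell of the cross tabulation stacks four numbers: Frequency, Percent, Row Pct, and Col Pct. As a rough sketch, the following DATA step reproduces the three percentages for the sex = 1, clinic A cell from the counts in the table (12 in the cell, 73 in the row, 18 in the column, 100 overall). The variable names are illustrative, not part of Program 4.

```sas
* Recompute the percentages for the sex=1, clinic=A cell;
DATA _NULL_;
  n = 12;             * cell frequency;
  row_total = 73;     * all participants with sex = 1;
  col_total = 18;     * all participants in clinic A;
  grand_total = 100;  * all participants;
  percent = 100*n/grand_total;  * Percent: share of the whole table;
  row_pct = 100*n/row_total;    * Row Pct: share of the sex = 1 row;
  col_pct = 100*n/col_total;    * Col Pct: share of the clinic A column;
  PUT percent= 6.2 row_pct= 6.2 col_pct= 6.2;
RUN;
```

The values written to the log should match the 12.00, 16.44, and 66.67 shown in the table.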
* Getting only the counts;
PROC FREQ DATA=weight;
  TABLES sex*clinic / NOPERCENT NOROW NOCOL;
RUN;

sex       clinic

Frequency|A       |B       |C       |D       |  Total
---------+--------+--------+--------+--------+
       1 |     12 |     20 |     30 |     11 |     73
---------+--------+--------+--------+--------+
       2 |      6 |      9 |      6 |      6 |     27
---------+--------+--------+--------+--------+
Total          18       29       36       17      100
OTHER USEFUL TABLE OPTIONS
• CHISQ – performs chi-square analyses for 2-way tables
• MISSING – includes missing data as a separate category
• LIST – makes a condensed table (useful when looking at 3-way or higher tables)
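A minimal sketch of how these options might be used with the weight data set from Program 4. The pairing of LIST with MISSING and the title text are illustrative choices, not taken from the course notes.

```sas
PROC FREQ DATA=weight;
  * Chi-square tests of association between sex and clinic;
  TABLES sex*clinic / CHISQ;
  * One row per sex-clinic combination, counting missing values as a category;
  TABLES sex*clinic / LIST MISSING;
  TITLE 'Table Options: CHISQ, LIST, and MISSING';
RUN;
```

Options go after the slash in each TABLES statement, so different requests in the same PROC FREQ step can use different options.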
* Using PROC SGPLOT for bar charts;
ODS GRAPHICS / WIDTH=300px;
PROC SGPLOT;
  VBAR clinic;
  TITLE "Vertical Bar Chart of Clinical Center";
  LABEL clinic = "Clinical Center";
RUN;

The plot can be embedded in an HTML document or kept as a separate file. The file can be inserted into Office documents.
* Same plot displayed horizontally;
PROC SGPLOT;
  HBAR clinic;
  TITLE "Horizontal Bar Chart of Clinical Center";
  LABEL clinic = "Clinical Center";
RUN;
* DATALABEL puts values on top of bar;
PROC SGPLOT;
  YAXIS LABEL = "Mean Cholesterol" VALUES = (0 to 300 by 50);
  VBAR clinic / RESPONSE=cholbl STAT=MEAN DATALABEL;
  TITLE 'Mean Cholesterol by Clinical Center';
  LABEL clinic = "Clinical Center";
RUN;
* Using SGPLOT to make regression plot;
PROC SGPLOT DATA=weight;
  YAXIS LABEL = "Body Mass Index (BMI)";
  XAXIS LABEL = "Age (y)";
  REG X=age Y=bmi / CLM;
  WHERE sex = 2;
  TITLE 'Plot of BMI and Age for Women';
RUN;
PROC CORR DATA=weight;
  VAR bmi age;
  WHERE sex = 2;
  TITLE 'Correlation of BMI and Age for Women';
RUN;

Pearson Correlation Coefficients, N = 27
Prob > |r| under H0: Rho=0

              bmi         age
bmi       1.00000    -0.44397
                       0.0203
age      -0.44397     1.00000
           0.0203

In each off-diagonal cell, the top number (-0.44397) is the correlation coefficient and the bottom number (0.0203) is the p-value testing whether the correlation is significantly different from zero.
ODS GRAPHICS ON;
PROC REG DATA=weight;
  MODEL bmi = age;
  WHERE sex = 2;
  TITLE 'Simple Linear Regression';
RUN;

Partial Output

Parameter Estimates

                   Parameter    Standard
Variable     DF     Estimate       Error    t Value    Pr > |t|
Intercept     1     43.61312     6.40001       6.81      <.0001
age           1     -0.28964     0.11710      -2.47      0.0205

Regression equation: bmi = 43.61 - 0.29*age

Note: PROC REG has many options for plotting. With ODS graphics on, it produces many plots by default.
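To illustrate how the fitted equation is used, the sketch below plugs a hypothetical age of 50 into the parameter estimates from the partial output. The age and the variable name bmi_hat are assumptions made for the example.

```sas
* Predicted BMI for a woman of a given age, using the estimates above;
DATA _NULL_;
  age = 50;                          * hypothetical age in years;
  bmi_hat = 43.61312 - 0.28964*age;  * intercept + slope*age;
  PUT bmi_hat= 6.2;                  * writes the prediction to the SAS log;
RUN;
```

For age 50 this works out to a predicted BMI of about 29.13. The negative slope means the predicted BMI falls by about 0.29 for each additional year of age.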
Using Comments in Program
• Two Purposes
  • Documenting your program
  • Temporarily deleting part of a program
• See page 3 LSB
Examples of Comment Code
* Run proc univariate for variable BMI;

*---------------------------------------------------------------------*
  High resolution graphs can also be produced. The following makes
  a plot of a histogram with the best fit normal curve and summary
  statistics.
*---------------------------------------------------------------------*;

PROC UNIVARIATE DATA = weight PLOT;
  * ID ptid ;
  VAR bmi;

PROC UNIVARIATE DATA = weight /*PLOT*/;
  VAR bmi;
Temporarily Removing Code: Do not want to produce the histogram now but may want to run it at another time
PROC UNIVARIATE DATA = weight;
  VAR bmi;
/*
  HISTOGRAM bmi / NORMAL MIDPOINTS=20 to 40 by 2;
  INSET N = 'N' (5.0) MEAN = 'Mean' (5.1) STD = 'Sdev' (5.1)
        MIN = 'Min' (5.1) MAX = 'Max' (5.1) /
        POS=lm HEADER='Summary Statistics';
*/
  LABEL bmi = 'Body Mass Index (kg/m2)';
  TITLE 'Histogram of BMI';
RUN;
What is wrong with this program?
* This is my first SAS program
DATA bp;
INFILE ...
(more lines)
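One way to read the problem: the comment has no closing semicolon, so SAS treats everything up to the next semicolon, including DATA bp, as part of the comment, and the DATA step never starts. A corrected sketch follows. Because the rest of the original program is not shown, the INPUT variables and the inline data are hypothetical stand-ins.

```sas
* This is my first SAS program;
DATA bp;
  * sbp and dbp are hypothetical variable names;
  INPUT sbp dbp;
  DATALINES;
120 80
135 90
;
RUN;
```

With the semicolon in place, the comment ends on its own line and DATA bp; is read as a statement.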
Option Statement
OPTIONS NOCENTER LINESIZE = 78;
OPTIONS NODATE NONUMBER;
• Many, many options (run PROC OPTIONS to list them)
• Usually put at top of program
• Can be put in autoexec.sas so they will always be in effect
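A small sketch of setting the options above in one statement and then checking one of them. Restricting PROC OPTIONS to a single option with OPTION= is an illustrative choice. Running PROC OPTIONS with no arguments lists every option.

```sas
* Set several system options at the top of the program;
OPTIONS NOCENTER LINESIZE=78 NODATE NONUMBER;

* Write the current LINESIZE setting to the SAS log;
PROC OPTIONS OPTION=LINESIZE;
RUN;
```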