Experimental Statistics - Week 4
Chapter 8: 1-Factor ANOVA Models Using SAS
EXAM SCHEDULE:
Exam I – Take-home exam (handed out Thursday, March 3, due 8:00 AM Tuesday, March 8)
Exam II – Take-home exam (handed out Thursday, April 14, due 8:00 AM Tuesday, April 19)
Final Exam – optional (scheduled for 8:00 AM – 11:00 AM Friday, May 6)
GRADE COMPUTATION:
Exam Grades (75%)
Daily Assignments (25%)
ANOVA Table Output - hostility data - calculations done in class

Source              SS        df    MS        F       p-value
Between samples     767.17     2    383.58    16.7    <.001
Within samples      205.74     9     22.86
Totals              972.91
ANOVA Models
Note:
Example: Population has mean μ = 5. Consider the random sample
General Form of Model:
Alternative form of the 1-Factor ANOVA Model (pages 394-395)
- random errors follow a Normal (N) distribution, are independently distributed (ID), and have zero mean and constant variance, i.e., variability does not change from group to group
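In standard textbook notation (an assumption here; the symbols on pages 394-395 may differ), the two forms of the model and the error assumptions can be written as:

```latex
\begin{align*}
% General (cell-means) form: observation j in group i
y_{ij} &= \mu_i + \varepsilon_{ij}, \qquad i = 1,\dots,t, \quad j = 1,\dots,n_i \\
% Alternative (effects) form, obtained by writing \mu_i = \mu + \alpha_i
y_{ij} &= \mu + \alpha_i + \varepsilon_{ij} \\
% Errors: normal, independently distributed, zero mean, constant variance
\varepsilon_{ij} &\sim \mathrm{NID}(0, \sigma^2)
\end{align*}
```

Here t is the number of groups and n_i the number of observations in group i; the single σ² shared by every group is the "constant variance" assumption.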
Analysis of Variance Table Recall: In our model:
Introduction to SAS Programming Language
Recall CAR DATA
For this analysis, 5 gasoline types (A - E) were to be tested. Twenty cars were selected for testing and were assigned randomly to the groups (i.e., the gasoline types). Thus, in the analysis, each gasoline type was tested on 4 cars. A performance-based octane reading was obtained for each car, and the question is whether the gasolines differ with respect to this octane reading.

A    91.7    91.2    90.9    90.6
B    91.7    91.9    90.9    90.9
C    92.4    91.2    91.6    91.0
D    91.8    92.2    92.0    91.4
E    93.1    92.9    92.4    92.4
The CAR data set as SAS needs to see it (one observation per line):

A 91.7
A 91.2
A 90.9
A 90.6
B 91.7
B 91.9
B 90.9
B 90.9
C 92.4
C 91.2
C 91.6
C 91.0
D 91.8
D 92.2
D 92.0
D 91.4
E 93.1
E 92.9
E 92.4
E 92.4
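As a side note, SAS can also read several observations from one line when the INPUT statement ends with a trailing double at-sign (@@), which holds the current line for further reads. A minimal sketch using the CAR values; the data set name car2 is hypothetical:

```sas
* Read several (gas, octane) pairs per line using the trailing @@ ;
DATA car2;
  INPUT gas $ octane @@;
  DATALINES;
A 91.7 A 91.2 A 90.9 A 90.6
B 91.7 B 91.9 B 90.9 B 90.9
C 92.4 C 91.2 C 91.6 C 91.0
D 91.8 D 92.2 D 92.0 D 91.4
E 93.1 E 92.9 E 92.4 E 92.4
;
RUN;

* Print the result to check that 20 observations were read ;
PROC PRINT DATA=car2;
RUN;
```

Without the @@, SAS would read only the first pair on each line and move on, giving 5 observations instead of 20.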
SAS file for CAR data
Case 1: Data within SAS FILE:

DATA one;
  INPUT gas $ octane;
  DATALINES;
A 91.7
A 91.2
. . .
E 92.4
E 92.4
;
PROC GLM;   /* or PROC ANOVA */
  CLASS gas;
  MODEL octane = gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / DUNCAN;
RUN;
PROC MEANS MEAN VAR;
RUN;
PROC MEANS MEAN VAR;
  CLASS gas;
RUN;
Brief Discussion of Components of the SAS File: DATA Step

DATA STATEMENT - the first DATA statement names the data set whose variables are defined in the INPUT statement -- in the above, we create data set 'one'

INPUT STATEMENT - 2 forms
1. Freefield - can be used when data values are separated by 1 or more blanks
   INPUT NAME $ AGE SEX $ SCORE;   ($ indicates character variable)
2. Formatted - data occur in fixed columns
   INPUT NAME $ 1-20 AGE 22-24 SEX $ 26 SCORE 28-30;

DATALINES STATEMENT - used to indicate that the next records in the file contain the actual data; the semicolon after the data indicates the end of the data itself
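The two INPUT forms can be compared side by side. The sketch below uses made-up records (the names, ages, and scores are hypothetical, as are the data set names free and fixed); the fixed-column lines are laid out so that the name occupies columns 1-20, age 22-24, sex 26, and score 28-30:

```sas
* Freefield input: values separated by one or more blanks ;
DATA free;
  INPUT name $ age sex $ score;
  DATALINES;
Smith 34 F 88
Doe 27 M 100
;
RUN;

* Fixed-column input: the name field may contain embedded blanks ;
DATA fixed;
  INPUT name $ 1-20 age 22-24 sex $ 26 score 28-30;
  DATALINES;
Mary Ann Smith        34 F  88
John Doe              27 M 100
;
RUN;

PROC PRINT DATA=free;
RUN;
PROC PRINT DATA=fixed;
RUN;
```

Freefield input would split "Mary Ann Smith" at the first blank, which is one reason to use fixed columns when values contain spaces.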
SPECIFYING THE ANALYSIS -- PROC STATEMENTS

GENERAL FORM
PROC xxxxx;   (implies procedure is to be run on most recently created data set)
PROC xxxxx DATA = data set name;
Note: I did not have to specify DATA=one in the above example

Example PROCs:
PROC REG - regression analysis
PROC ANOVA - analysis of variance
PROC GLM - general linear model
PROC MEANS - basic statistics, t-test for H0: μ = 0
PROC PLOT - plotting
PROC TTEST - t-tests
PROC UNIVARIATE - descriptive stats, box-plots, etc.
PROC BOXPLOT - boxplots
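The difference between the two general forms shows up as soon as more than one data set exists. A minimal sketch, assuming two small data sets built from the CAR readings for gasolines A and E (the names gas_a and gas_e are hypothetical):

```sas
* Two small data sets; the trailing @@ reads several values per line ;
DATA gas_a;
  INPUT octane @@;
  DATALINES;
91.7 91.2 90.9 90.6
;
RUN;

DATA gas_e;
  INPUT octane @@;
  DATALINES;
93.1 92.9 92.4 92.4
;
RUN;

* No DATA= option: runs on the most recently created data set (gas_e) ;
PROC MEANS MEAN VAR;
RUN;

* DATA= option: explicitly request the earlier data set ;
PROC MEANS DATA=gas_a MEAN VAR;
RUN;
```

Writing DATA= every time is a common habit because it keeps the analysis from silently switching data sets when a new DATA step is added.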
PROC GLM
• Proc GLM data = fn ;
• Class … ;
  • List all the factors.
• Model … / options;
  • e.g., model octane = gas;
• Means … / options;
• Run;
SAS Syntax
• Every command MUST end with a semicolon
• Commands can continue over two or more lines
• Variable names are 1-8 characters in older SAS releases (up to 32 from Version 7 on): letters, numerals, and underscores, beginning with a letter or underscore, with no blanks or special characters
• Note: values for character variables can exceed 8 characters
• Comments
  • Begin with *, end with ;
Titles and Labels
• TITLE '…' ;
  • Up to 10 title lines: TITLE 'include your title here';
  • Can be placed in Data Steps or Procs
• LABEL name = '…' ;
  • Can be in a DATA STEP or PROC PRINT
  • Include ALL labels, then a single ;
Note: For class assignments, place descriptive titles and labels on the output. Print the data to the output file.
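A minimal sketch of titles, labels, and comments working together, using only the gasoline A readings to keep it short. The label and title wording is hypothetical:

```sas
* Subset of the CAR data (gasoline A only) ;
DATA one;
  INPUT gas $ octane;
  * All labels first, then a single semicolon ;
  LABEL gas    = 'Gasoline type'
        octane = 'Performance-based octane reading';
  DATALINES;
A 91.7
A 91.2
A 90.9
A 90.6
;
RUN;

TITLE  'Gasoline Example';
TITLE2 'Listing of the raw data';
* The LABEL option asks PROC PRINT to use labels as column headings ;
PROC PRINT DATA=one LABEL;
RUN;
```

Additional title lines are written TITLE2, TITLE3, and so on; a title stays in effect for later procedures until it is replaced or cleared.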
Case 2: Data in an External File

FILENAME f1 'complete directory/file specification';   (general form)
FILENAME f1 'a:car.data';   (example)

DATA one;
  INFILE f1;
  INPUT gas $ octane;
PROC GLM;   /* or PROC ANOVA */
  CLASS gas;
  MODEL octane = gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
RUN;
PROC MEANS MEAN VAR;
RUN;
PROC MEANS MEAN VAR;
  CLASS gas;
RUN;
The SAS Output for CAR data:

Gasoline Example - Completely Randomized Design
General Linear Models Procedure

Dependent Variable: OCTANE
                             Sum of           Mean
Source             DF       Squares         Square    F Value    Pr > F
Model               4    6.10800000     1.52700000       6.80    0.0025
Error              15    3.37000000     0.22466667
Corrected Total    19    9.47800000

R-Square        C.V.       Root MSE    OCTANE Mean
0.644440    0.516836      0.4739902      91.710000

Source             DF     Type I SS    Mean Square    F Value    Pr > F
GAS                 4    6.10800000     1.52700000       6.80    0.0025

Source             DF   Type III SS    Mean Square    F Value    Pr > F
GAS                 4    6.10800000     1.52700000       6.80    0.0025
Text Format for ANOVA Table Output - car data

Source              SS       df    MS       F       p-value
Between samples     6.108     4    1.527    6.80    0.0025
Within samples      3.370    15    0.225
Totals              9.478    19
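The entries in the car data table are linked by simple arithmetic, which can be checked by hand. Each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares:

```latex
\begin{align*}
\mathrm{MS}_{\text{Between}} &= \frac{\mathrm{SS}_{\text{Between}}}{\mathrm{df}_{\text{Between}}}
  = \frac{6.108}{4} = 1.527 \\
\mathrm{MS}_{\text{Within}} &= \frac{\mathrm{SS}_{\text{Within}}}{\mathrm{df}_{\text{Within}}}
  = \frac{3.370}{15} \approx 0.2247 \\
F &= \frac{\mathrm{MS}_{\text{Between}}}{\mathrm{MS}_{\text{Within}}}
  = \frac{1.527}{0.2247} \approx 6.80 \\
\mathrm{SS}_{\text{Total}} &= 6.108 + 3.370 = 9.478, \qquad
\mathrm{df}_{\text{Total}} = 4 + 15 = 19
\end{align*}
```

The R-Square value in the SAS output is the between-samples share of the total: 6.108 / 9.478 ≈ 0.6444.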
PC SAS on Campus
Library
BIC
Student Center

SAS Learning Edition $125
http://support.sas.com/rnd/le/index.html
“Lab” Assignment
Using CAR Data, run the following in this order with one set of code:
1. Calculate the average, standard deviation, minimum, and maximum for the 20 octane readings. CS pp. 25 - 32
2. Graph a histogram of OCTANE. CS pp. 37
3. Calculate descriptive statistics in (1) above for OCTANE for each of the 5 gasolines. CS pp. 32-34
4. ... A and B. CS pp. 138-141
5. Plot side-by-side box plots for OCTANE for the 5 levels of the variable GAS
6. Compute a 1-factor ANOVA for the CAR data using only the first 3 GAS types. CS pp. 150-155
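One possible skeleton for items 1, 2, 3, 5, and 6 is sketched below. It is only a starting point: the procedures chosen here are assumptions and may differ from those on the referenced CS pages, and item 4 is left out. The trailing @@ on the INPUT statement lets several observations share a line.

```sas
DATA one;
  INPUT gas $ octane @@;
  DATALINES;
A 91.7 A 91.2 A 90.9 A 90.6
B 91.7 B 91.9 B 90.9 B 90.9
C 92.4 C 91.2 C 91.6 C 91.0
D 91.8 D 92.2 D 92.0 D 91.4
E 93.1 E 92.9 E 92.4 E 92.4
;
RUN;

* Item 1: overall mean, standard deviation, minimum, maximum ;
PROC MEANS DATA=one MEAN STD MIN MAX;
  VAR octane;
RUN;

* Item 2: histogram of OCTANE ;
PROC UNIVARIATE DATA=one;
  VAR octane;
  HISTOGRAM octane;
RUN;

* Item 3: the same statistics for each gasoline ;
PROC MEANS DATA=one MEAN STD MIN MAX;
  CLASS gas;
  VAR octane;
RUN;

* Item 5: side-by-side box plots (data must be sorted by the group variable) ;
PROC SORT DATA=one;
  BY gas;
RUN;
PROC BOXPLOT DATA=one;
  PLOT octane*gas;
RUN;

* Item 6: 1-factor ANOVA restricted to the first 3 gasoline types ;
PROC GLM DATA=one;
  WHERE gas IN ('A', 'B', 'C');
  CLASS gas;
  MODEL octane = gas;
RUN;
```

Remember to add descriptive TITLE and LABEL statements and a PROC PRINT of the data, as required for class assignments.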