Doing ANOVA and t-tests
LISA short course by Ciro Velasco-Cruz
ONE SAMPLE t TEST Example
In a study, 15 lobsters were randomly selected from recent catches along a certain region of the Maine shoreline. The lobsters were weighed to the nearest ounce, with results:
26 14 18 13 22 15 24 21 29 10 12 31 19 16 21
Suppose that, for research purposes, the mean lobster weight needs to equal 15 ounces. Lobster weight is known to be normally distributed, with both mean and standard deviation unknown.
SAS for coding
The data step:
data lobsters_w;
  input type weigth @@;
  datalines;
1 26 1 14 1 18 1 13 1 22 1 15 1 24 1 21
1 29 1 10 1 12 1 31 1 19 1 16 1 21
;
SAS for coding
Exploratory data analysis:
proc means data=lobsters_w mean std max min median;
  var weigth;
run;
proc boxplot data=lobsters_w;
  title 'BoxPlot for one sample t-test example';
  plot (weigth)*type / cframe = vligb cboxes = dagr cboxfill = ywh;
  inset mean max min / cfill = white header = "Summary" ctext = red;
run;
SAS coding
Data analysis:
proc ttest data=lobsters_w h0=15;
  title 'One sample t test example';
  var weigth;
run;
SAS OUTPUT
Conclusion: Since the p-value is less than 0.05, we reject the null hypothesis that the mean equals 15 at the 5% level of significance.
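The test statistic behind this conclusion can be reproduced by hand as a check on the PROC TTEST output. The sketch below is only an illustration: the data set names lobsters_check, stats and ttest_by_hand are hypothetical, and the weights are retyped from the example. It computes t = (ybar - 15) / (s / sqrt(n)) and a two-sided p-value from the t distribution with n - 1 degrees of freedom.

```sas
/* Hypothetical check data set: the same 15 weights as in the example */
data lobsters_check;
  input weigth @@;
  datalines;
26 14 18 13 22 15 24 21 29 10 12 31 19 16 21
;
run;

/* Sample size, mean and standard deviation */
proc means data=lobsters_check noprint;
  var weigth;
  output out=stats n=n mean=ybar std=s;
run;

/* t statistic for H0: mean = 15, with a two-sided p-value */
data ttest_by_hand;
  set stats;
  mu0 = 15;
  t   = (ybar - mu0) / (s / sqrt(n));
  df  = n - 1;
  p   = 2 * (1 - probt(abs(t), df));
run;

proc print data=ttest_by_hand noobs;
  var n ybar s t df p;
run;
```

For these data the sample mean is 19.4 and t is about 2.72 on 14 degrees of freedom, which should agree with the values PROC TTEST reports.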
Two Sample t-test example
An animal scientist is interested in comparing two different topical treatments (A, B) against osteoarthritis in the leg joints of horses. Seven horses with the illness are available at the animal clinic. For each horse it is randomly determined which of the front legs receives treatment A and which receives treatment B. After four weeks of treatment, the horses' mobility is measured. If we treat the two sets of measurements as independent samples, we can apply the two-sample t-test.
SAS data step
data horses;
  input trt horse mobility @@;
  cards;
1 1 48.2 1 2 44.6 1 3 49.7 1 4 40.5 1 5 54.6 1 6 47.1 1 7 46.8
2 1 41.5 2 2 40.1 2 3 44.0 2 4 41.2 2 5 49.8 2 6 41.7 2 7 51.4
;
SAS E.D.A.
proc means data=horses mean std max min median;
  class trt;
  var mobility;
run;
proc boxplot data=horses;
  title 'BoxPlot for two sample t-test example';
  plot (mobility)*trt / cframe = vligb cboxes = dagr cboxfill = ywh;
  insetgroup mean max min q1 q2 q3 / header = 'Summary by Treatment' ctext = red;
run;
SAS t test
proc ttest data=horses;
  title 'Two sample t test example';
  class trt;
  var mobility;
run;
Conclusion
• About variances: Since the p-value is larger than 5%, we do not reject the hypothesis of equal variances, so the pooled (equal-variance) t-test is reasonable.
• About means: Since the p-value for this test is also larger than 5%, we do not reject the hypothesis that the two treatment means are equal.
Paired t test example
• Consider the last example again. Since treatments A and B were both measured on the same horse, the measurements of mobility are not independent within horses. The right way to analyze the data is therefore a paired t test.
• Idea: we look at the difference between the responses under treatments A and B for each horse i: D_i = Y_iA - Y_iB
SAS paired test
proc ttest data=newhorses;
  paired MobilityA*MobilityB;
run;
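The slides do not show how the newhorses data set is built. One possible construction, given only as a sketch, is a data step with one row per horse. It assumes that trt code 1 in the horses data corresponds to treatment A and code 2 to treatment B, which the slides do not state; the variable D is added here for illustration.

```sas
/* Assumption: trt 1 is treatment A and trt 2 is treatment B */
data newhorses;
  input horse MobilityA MobilityB;
  D = MobilityA - MobilityB;   /* within-horse difference D_i */
  datalines;
1 48.2 41.5
2 44.6 40.1
3 49.7 44.0
4 40.5 41.2
5 54.6 49.8
6 47.1 41.7
7 46.8 51.4
;
run;

/* Equivalent to PAIRED MobilityA*MobilityB: a one-sample t test on D */
proc ttest data=newhorses h0=0;
  var D;
run;
```

The paired test and the one-sample test on D give the same t statistic, since the paired t test is by definition a one-sample test on the differences.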
One Way ANOVA
An experiment was conducted to study the growth of plant tissue in the presence of hormone solutions containing various growth-inhibiting substances. For each solution, 10 independent tissue cultures were prepared and the growth of the plant tissue was recorded in mm. This experiment has one factor with 5 levels, and each level has 10 replications.
SAS data step
data peasection;
  input trtmnt growth @@;
  /* A LABEL statement names the variable; the level codes are listed in the label text */
  label trtmnt = 'Treatment (1=Control, 2=Sol.1, 3=Sol.2, 4=Mixture, 5=Sol.3)';
  datalines;
1 7.84 1 8.69 1 8.11 1 8.35 1 7.74 1 7.69 1 7.98 1 7.64 1 8.57 1 8.32
2 6.78 2 6.69 2 6.95 2 6.64 2 6.41 2 6.69 2 6.72 2 6.57 2 6.67 2 7.07
3 6.79 3 6.79 3 6.79 3 6.61 3 6.43 3 6.69 3 6.57 3 6.49 3 7.05 3 6.72
4 6.64 4 6.57 4 6.78 4 6.48 4 6.54 4 6.36 4 6.67 4 6.26 4 6.67 4 6.68
5 7.31 5 7.65 5 7.26 5 7.39 5 6.98 5 7.46 5 7.32 5 7.13 5 7.07 5 7.25
;
SAS coding
proc boxplot data=peasection;
  title 'BoxPlot for one-way ANOVA example';
  plot growth*trtmnt / cframe = vligb cboxes = dagr cboxfill = ywh;
  insetgroup mean stddev q1 q2 q3 / header = 'Summary by Treatment' ctext = red;
run;
SAS GLM one way
proc glm data=peasection;
  class trtmnt;
  model growth = trtmnt;
  lsmeans trtmnt / pdiff adjust=tukey;
  /* Coefficients: -1 for levels 1 and 3, +2 for level 5, 0 for levels 2 and 4 */
  contrast 'our first contrast with contrast' trtmnt -1 0 -1 0 2;
  estimate 'our first contrast with estimate' trtmnt -1 0 -1 0 2;
  output out=residuals p=yhat r=res;
run;
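SAS output
Note that the estimate is the contrast coefficients (-1 0 -1 0 2) applied to the treatment means of levels 1, 3 and 5: -(8.093 + 6.693) + 2*7.282 = -0.222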
Remedies
• Transform the response, using the fitted relationship log(var(y)) = C0 + q*log(mean):
  • g(y) = y^(1 - q/2) if q is different from 2
  • g(y) = log(y) if q = 2 and y > 0
  • g(y) = log(y + shift) if q = 2 and some y <= 0
• Use an analysis for Gaussian data with unequal variances: Satterthwaite's approximation, or Welch's test (for one-way ANOVA)
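The second remedy can be requested directly in PROC GLM. The following is a minimal sketch, assuming the peasection data set from the one-way ANOVA example has already been created: the WELCH option on the MEANS statement asks for Welch's variance-weighted one-way ANOVA, and HOVTEST=LEVENE adds Levene's test for homogeneity of variance alongside it.

```sas
/* Assumes the peasection data set from the one-way ANOVA example exists */
proc glm data=peasection;
  class trtmnt;
  model growth = trtmnt;
  /* Levene's test for equal variances, plus Welch's ANOVA */
  means trtmnt / hovtest=levene welch;
run;
quit;
```

Welch's test does not pool the group variances, so it remains usable when the homogeneity test suggests that the variances differ across treatments.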
SAS E.D.A.
proc means data=peasection noprint;
  var growth;
  by trtmnt;
  output out=varmeans var=vargro mean=meangro;
run;
data varmeans;
  set varmeans;
  vargro = log(vargro);
  meangro = log(meangro);
run;
proc gplot data=varmeans;
  plot vargro*meangro;
run;
proc reg data=varmeans;
  model vargro = meangro;
run;
SAS transformation and analysis code
data trans;
  set peasection;
  /* Power transformation; the exponent is presumably 1 - q/2, with q the slope from the regression of log variance on log mean */
  yt = growth**-2.69881;
run;
proc glm data=trans;
  class trtmnt;
  model yt = trtmnt;
  means trtmnt / hovtest=levene(type=square);
  output out=resi r=res;
run;
proc boxplot data=resi;
  title 'BoxPlot for one-way ANOVA example';
  plot res*trtmnt / cframe = vligb cboxes = dagr cboxfill = ywh;
  insetgroup mean stddev q1 q2 q3 / header = 'Summary by Treatment' ctext = red;
run;
Two-way ANOVA fixed factors
An educational researcher was interested in the factors noise and solitude as they affect study conditions. Each subject in an experiment was asked to study an essay on American history for 15 minutes and was then tested on a 25-item quiz, the number of correct items being the score. The subjects differed, however, in the conditions under which they were allowed to study:
• Factor Solitude with 2 levels: alone and not alone (with a stooge)
• Factor Noise with 3 levels: no noise, soft background music, and loud rock and roll music
There are 3 replications of each treatment combination.
SAS data step
data QuizScores;
  input Solitude $ Noise $ Score @@;
  datalines;
Alone None 10 Alone None 6 Alone None 14
Alone Soft 21 Alone Soft 21 Alone Soft 16
Alone Loud 5 Alone Loud 15 Alone Loud 7
Stooge None 6 Stooge None 11 Stooge None 1
Stooge Soft 6 Stooge Soft 17 Stooge Soft 13
Stooge Loud 1 Stooge Loud 2 Stooge Loud 6
;
SAS E.D.A
proc boxplot data=quizscores;
  title 'BoxPlot for two-way ANOVA example';
  plot score*noise(solitude) / cframe = vligb cboxes = dagr cboxfill = ywh;
  *inset mean max min/pos=tm header='The overall summary';
  insetgroup mean stddev q1 q2 q3 / header = 'Summary by Treatment' ctext = red;
run;
/* CLASS with NWAY is used instead of BY because the data are not sorted by solitude and noise */
proc means data=quizscores noprint nway;
  class solitude noise;
  var score;
  output out=meanquizscore mean=meanquiz;
run;
symbol i=j;
symbol2 i=j;
proc gplot data=meanquizscore;
  plot meanquiz*Noise=solitude;
  plot meanquiz*solitude=noise;
run;
SAS output
proc glm data=quizscores;
  class solitude noise;
  model score = solitude|noise;
run;
Slices
• In this example the interaction was not significant. But what should we do if it were? One way to handle this situation is SLICES. Since main effects may or may not be significant in the presence of interaction, we need to test how the effect of one factor changes at a given level of the other. In SAS, we use the following statement to obtain the slices, where "interaction" stands for the interaction term and "treatment" for the factor whose levels define the slices:
lsmeans interaction / slice=treatment;
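Applied to the quiz score example, the slicing statement might look like the sketch below. It assumes the quizscores data set from the two-way ANOVA example exists; the choice to slice by both factors is only for illustration.

```sas
/* Assumes the quizscores data set from the two-way ANOVA example exists */
proc glm data=quizscores;
  class solitude noise;
  model score = solitude|noise;
  /* Effect of solitude tested separately at each noise level */
  lsmeans solitude*noise / slice=noise;
  /* Effect of noise tested separately at each solitude level */
  lsmeans solitude*noise / slice=solitude;
run;
quit;
```

Each SLICE= request produces one F test per level of the slicing factor, so slicing by noise gives three tests of the solitude effect and slicing by solitude gives two tests of the noise effect.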
SAS two way ANOVA random factor
An experiment was performed to examine the effect of aging time on the strength of cement. From a large number of mixes, three cement mixes were randomly selected and six specimens were produced from each mix. After two days, three randomly selected specimens from each mix were tested for strength with a load test, and the other three specimens were tested after seven days. This is a two-way classification with factors Cement Mix (3 levels) and Time (2 levels). The levels of factor Time were predetermined. The three levels of Cement Mix were randomly selected from a large number of mixes, so the Cement Mix factor is random.
SAS data input
data YieldLoads;
  input Aging $ Mix Load @@;
  datalines;
2-Days 1 574 2-Days 1 564 2-Days 1 550
2-Days 2 524 2-Days 2 573 2-Days 2 551
2-Days 3 576 2-Days 3 540 2-Days 3 592
7-Days 1 1092 7-Days 1 1086 7-Days 1 1065
7-Days 2 1028 7-Days 2 1073 7-Days 2 998
7-Days 3 1066 7-Days 3 1045 7-Days 3 1055
;
SAS code
proc glm data=yieldloads;
  class aging mix;
  model load = aging mix aging*mix;
  random mix aging*mix / test;
run;
OR USING:
proc mixed data=yieldloads;
  class aging mix;
  model load = aging;
  random mix mix*aging;
run;
Question… • Option 1. Go back and complete SLICE part or • Option 2. Go ahead to the MANOVA • ?
MANOVA example
A researcher randomly assigns 33 subjects to one of three groups:
• G1 receives technical dietary information interactively from an on-line website.
• G2 receives the same information from a nurse practitioner.
• G3 receives the information from a video tape made by the same nurse practitioner.
The researcher looks at three different ratings of the presentation (difficulty, usefulness and importance) to determine if there is a difference in the modes of presentation. In particular, the researcher is interested in whether the interactive website is superior, because that is the most cost-effective way of delivering the information.
SAS code
proc glm data=manovaex;
  class group;
  model useful difficulty importance = group;
  contrast '1 vs 2&3' group 2 -1 -1;
  contrast '2 vs 3' group 0 1 -1;
  manova h=_all_;
run;
Note: go to the manova.sas example