BIOL 582 Lecture Set 7
Tests for multiple groups: Part 3
One-Way ANOVA Assumptions, One-Way ANOVA Evaluation, Multiple Comparisons Tests, Data Transformations, Contingency Tables
BIOL 582 Assumptions in ANOVA
• A One-Way (Single-Factor) ANOVA is appropriate for
  • Samples collected from multiple populations (observational study)
  • A completely randomized design (experimental design): experimental subjects are randomly assigned to treatments
• Assumptions include
  • Normally distributed residuals (not data)
  • Homoscedasticity (equal group variances)
  • Independent observations (i.e., samples do not contain multiple measurements on the same subjects; different samples or treatments do not contain the same subjects)
• These are the assumptions of linear models!
• For visual assistance, the testing of assumptions with the pupfish.parasite.csv data is presented in the next few slides. (It is assumed that the data were loaded and a variable, “GROUP”, was created for all population-sex classifications; a sketch of that setup follows.)
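A minimal sketch of that assumed setup is below. The column names POPULATION and SEX are assumptions (only GRUBS and GROUP appear in the slides that follow); the group labels 1.F, 1.M, 2.F, and 2.M suggest a population number crossed with sex.

> # hypothetical set-up: read the data and build the GROUP factor
> pupfish <- read.csv("pupfish.parasite.csv")
> attach(pupfish)  # makes GRUBS, etc., available by name
> GROUP <- factor(paste(POPULATION, SEX, sep="."))  # levels: 1.F, 1.M, 2.F, 2.M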
BIOL 582 Assumptions in ANOVA: Testing normality
• Histograms, normal probability plots (fat pencil test), normality tests

> lm.group <- lm(GRUBS ~ GROUP)
> residuals <- resid(lm.group)
> hist(residuals, col="light blue", xlab="Residual")
> qqnorm(residuals)

Conclusion: residuals positively skewed
BIOL 582 Assumptions in ANOVA: Testing normality
• Histograms, normal probability plots (fat pencil test), normality tests

> a <- qqnorm(residuals)
> normal <- a$x  # standard normal quantiles
> observed <- (a$y - mean(a$y))/sd(a$y)  # standard deviates of observed values
> plot(normal, observed, xlab="Theoretical Quantiles", ylab="Observed Quantiles")

This is the same plot as before, but the observed values have been converted to standard normal deviates. If the distribution is symmetric, the scales on both axes should run roughly from -3 to 3; a “lopsided” scale is indicative of skewing.

Conclusion: residuals positively skewed
BIOL 582 Assumptions in ANOVA: Testing normality
• Histograms, normal probability plots (fat pencil test), normality tests

> ks.test(normal, observed)

        Two-sample Kolmogorov-Smirnov test

data:  normal and observed
D = 0.165, p-value = 0.1209
alternative hypothesis: two-sided

Warning message:
In ks.test(normal, observed) : cannot compute correct p-values with ties

> shapiro.test(observed)

        Shapiro-Wilk normality test

data:  observed
W = 0.8292, p-value = 1.472e-09

One has to be cautious with normality tests. Small or large samples can affect results: high and small P-values, respectively. A single outlier can have a huge impact. There are situations when one type of test will be better than another. It is probably easiest to use the Shapiro-Wilk test (specifically for normality), not worry about the formula, and evaluate its results cautiously. If a distribution is symmetric but not normal, it is usually OK.

Note! One does not have to find standard deviates:

> shapiro.test(residuals)

        Shapiro-Wilk normality test

data:  residuals
W = 0.8292, p-value = 1.472e-09

Conclusion: residuals not normal?
BIOL 582 Assumptions in ANOVA: Testing homoscedasticity
• Plot standardized residuals, ANOVA on residuals

> par(mfcol=c(1,2))
> plot(GROUP, observed, xlab="Population-SEX", ylab="Standardized Residuals")
> plot(predict(lm.group), observed, xlab="Predicted Values", ylab="Standardized Residuals")

Groups should have a mean of 0, plus symmetric and equally spread distributions.

Conclusion: unequal variances (heteroscedasticity)
BIOL 582 Assumptions in ANOVA: Testing homoscedasticity
• Plot standardized residuals, ANOVA on residuals

> lm.residuals <- lm(abs(residuals) ~ GROUP)  # use absolute values of residuals
> anova(lm.residuals)
Analysis of Variance Table

Response: abs(residuals)
          Df  Sum Sq Mean Sq F value    Pr(>F)
GROUP      3  230367   76789  6.6675 0.0003799 ***
Residuals 99 1140179   11517
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# If the means of the absolute values of residuals differ among groups,
# then at least one group has values more extreme than another

P-value is less than α = 0.05; reject the null hypothesis of equal variances.

Conclusion: unequal variances (heteroscedasticity)
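As an aside, an ANOVA on the absolute values of residuals is essentially Levene's test. Base R also offers Bartlett's test of homogeneity of variances as a packaged alternative (not part of the original slides; it is more powerful but more sensitive to non-normality):

> # Bartlett's test of the null hypothesis of equal group variances
> bartlett.test(GRUBS ~ GROUP)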
BIOL 582 Data Transformation
• Linear models and ANOVA are ignorant about the data, but the biologist might know better how they behave.
• The intent with data transformations is to find a new variable which is a function of the desired variable and which is suitable for linear models (i.e., meets assumptions).
• Let y be a variable and let z = f(y) be a function of y.
• If z is a variable that can be used with a linear model, then it is a suitable transformation of y if, and only if, y = f⁻¹(z) is solvable.
BIOL 582 Data Transformation
• For example, if y is a variable that is not suitable for ANOVA, but z = log(y) is a variable that is suitable, then z is a log-transformation of y.
• More importantly, one can perform ANOVA on z, because y = e^z is solvable.
BIOL 582 Data Transformation
• Here is a list of common data transformations used on continuous data, plus an explanation for their use (sketched in R below).
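A sketch of commonly used transformations in R; this particular set is an assumption based on standard practice (y and p are hypothetical variables) and may differ from the table on the original slide:

> # log: right-skewed data, variance increasing with the mean
> z <- log(y + 1)      # +1 guards against log(0)
> # square root: count data (Poisson-like)
> z <- sqrt(y + 0.5)
> # arcsine square root: proportions or percentages (0 to 1)
> z <- asin(sqrt(p))
> # reciprocal: rates or strongly right-skewed data
> z <- 1/y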
BIOL 582 Example log-transformation
• Using the pupfish data

> log.grubs <- log(GRUBS + 1)  # the +1 eliminates problems, in most cases, caused by 0s in the data, since log(0) is undefined
> lm.new <- lm(log.grubs ~ GROUP)
> residuals <- resid(lm.new)
> qqnorm(residuals)
> shapiro.test(residuals)

        Shapiro-Wilk normality test

data:  residuals
W = 0.9869, p-value = 0.4061
BIOL 582 Example log-transformation
• Using the pupfish data

> st.residuals <- (residuals - mean(residuals))/sd(residuals)
> par(mfcol=c(1,2))
> plot(GROUP, st.residuals, xlab="Population-SEX", ylab="Standardized Residuals")
> plot(predict(lm.new), st.residuals, xlab="Predicted Values", ylab="Standardized Residuals")
BIOL 582 Example log-transformation
• Using the pupfish data

> lm.residuals <- lm(abs(residuals) ~ GROUP)  # use absolute values of residuals
> anova(lm.residuals)
Analysis of Variance Table

Response: abs(residuals)
          Df Sum Sq Mean Sq F value    Pr(>F)
GROUP      3 16.500  5.5001  12.913 3.457e-07 ***
Residuals 99 42.167  0.4259
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Conclusion: The log-transformation “fixed” the normality issue, but did not fix the equal-variance issue (nor will it ever). However, standardized residuals were within -3 and +3 for all four groups. Although the assumption of homoscedasticity is violated, the violation is not egregious. It is probably OK to use F distributions (but one could use a randomization procedure instead).
BIOL 582 Evaluations in ANOVA
• One of the most important questions when using linear models and ANOVA is, “How much of the variation is explained by group differences?”
• This is easy. Since SST = SSB + SSE, the ratio of SSB to SST, called the coefficient of determination (R²), expresses how important group differences are. Or, one can calculate 1 − SSE/SST, which is a better definition for more complex models (next lecture). A sketch of both calculations follows.
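A quick sketch of both calculations, assuming lm.group is the fitted model for log.grubs ~ GROUP from the earlier slides:

> a <- anova(lm.group)          # ANOVA table for the group model
> SSB <- a$"Sum Sq"[1]          # between-group sum of squares
> SSE <- a$"Sum Sq"[2]          # residual (error) sum of squares
> SST <- SSB + SSE              # total sum of squares
> SSB/SST                       # R-squared as SSB/SST
> 1 - SSE/SST                   # equivalently, 1 - SSE/SST
> summary(lm.group)$r.squared   # the value reported by summary()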
BIOL 582 Evaluations in ANOVA
• One of the most important questions when using linear models and ANOVA is, “How much of the variation is explained by group differences?”
• Example
True: Group differences explained 12.9 percent of the overall variation in the log of grubs.
False: Group differences explained 12.9 percent of the overall variation in grubs.

> summary(lm.group)

Call:
lm(formula = log.grubs ~ GROUP)

Residuals:
    Min      1Q  Median      3Q     Max
-2.8590 -0.8546  0.1149  0.7944  2.8303

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   4.1649     0.2575  16.176  < 2e-16 ***
GROUP1.M      0.3154     0.3641   0.866  0.38852
GROUP2.F     -0.2073     0.3289  -0.630  0.52995
GROUP2.M      1.1630     0.3999   2.909  0.00448 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.261 on 99 degrees of freedom
Multiple R-squared: 0.129, Adjusted R-squared: 0.1026
F-statistic: 4.889 on 3 and 99 DF, p-value: 0.003273

> # Brute force
> lm.null <- lm(log.grubs ~ 1)
> lm.group <- lm(log.grubs ~ GROUP)
>
> SSE.null <- SSE(lm.null)
> SSE.group <- SSE(lm.group)
> SSB <- SSE.null - SSE.group
> SST <- SSB + SSE.group
> R2 <- 1 - SSE.group/SST
>
> R2
          [,1]
[1,] 0.1290355

Do not worry about this yet.
BIOL 582 Multiple Comparisons Test
• What if we want to know why SSB is significantly greater than 0? Can we do t-tests among groups?
• Yes, but they must be constrained to maintain an experiment-wise (family-wise) Type I error rate.
• Tukey’s range test (aka Tukey’s Honest Significant Difference Test)
  • Modified t-distribution
  • Finds confidence intervals for pairwise differences
  • If a CI contains 0, do not reject the null hypothesis that the means compared are equal
BIOL 582 Multiple Comparisons Test
• Example: Tukey’s range test

> aov.group <- aov(log.grubs ~ GROUP)
> TukeyHSD(aov.group)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = log.grubs ~ GROUP)

$GROUP
              diff        lwr       upr     p adj
1.M-1.F  0.3153772 -0.6361571 1.2669114 0.8222764
2.F-1.F -0.2072937 -1.0667296 0.6521422 0.9220590
2.M-1.F  1.1630006  0.1180954 2.2079059 0.0228731
2.F-1.M -0.5226708 -1.3821067 0.3367651 0.3894327
2.M-1.M  0.8476235 -0.1972817 1.8925287 0.1539578
2.M-2.F  1.3702943  0.4085045 2.3320841 0.0018319
BIOL 582 Multiple Comparisons Test
• Example: Tukey’s range test (same output as the previous slide)

Something important to realize: parametric multiple comparisons tests, like Tukey’s, are used after a parametric ANOVA is performed and a significant SSB has been found. One cannot use them with a randomization test. To perform multiple comparisons with a randomization test, one simply needs to compare individual group means in every random permutation. See next slide.
BIOL 582 Multiple Comparisons Test
• Example: Randomization approach

> lm.group <- lm(log.grubs ~ GROUP)
> ls <- data.frame(GROUP = levels(GROUP))  # a list of group means to obtain
> group.means <- predict(lm.group, ls)
> names(group.means) <- levels(GROUP)
>
> group.means
     1.F      1.M      2.F      2.M
4.164853 4.480231 3.957560 5.327854
>
> # create a pairwise mean-difference matrix (actually absolute values)
> diff <- dist(group.means)
> diff
          1.F       1.M       2.F
1.M 0.3153772
2.F 0.2072937 0.5226708
2.M 1.1630006 0.8476235 1.3702943
BIOL 582 Multiple Comparisons Test
• Example: Randomization approach

> diff
          1.F       1.M       2.F
1.M 0.3153772
2.F 0.2072937 0.5226708
2.M 1.1630006 0.8476235 1.3702943

> diff <- as.matrix(diff)
> P <- matrix(1, nrow(diff), ncol(diff))
>
> permute <- 999
>
> for(i in 1:permute){
+   y <- sample(log.grubs)
+   lm.r <- lm(y ~ GROUP)
+   m.r <- predict(lm.r, ls)
+   diff.r <- as.matrix(dist(m.r))
+   P <- ifelse(diff.r >= diff, P+1, P+0)
+ }
>
> P.values <- P/(permute+1)
> P.values <- as.dist(P.values)
> diff <- as.dist(diff)
>
> P.values
# P-values are in the same cells as mean differences
      1     2     3
2 0.406
3 0.549 0.121
4 0.013 0.047 0.001

# Notice that the P-values differ from Tukey's:
# they are not constrained, so one might want to do a
# Bonferroni correction of the P-values

A sequential Bonferroni (Holm) correction is to rank the P-values from smallest to largest and compare the i-th smallest to α/(m − i + 1), where m is the number of comparisons (m = 6 here). E.g.,
0.001 compared to 0.05/6 ≈ 0.0083: reject
0.013 compared to 0.05/5 = 0.0100: do not reject, and stop (no further comparisons are rejected)
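In R, such corrections are available through p.adjust; a minimal sketch using the six unadjusted P-values above:

> p <- c(0.406, 0.549, 0.121, 0.013, 0.047, 0.001)
> p.adjust(p, method="holm")        # sequential (Holm) Bonferroni
> p.adjust(p, method="bonferroni")  # classic Bonferroni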
BIOL 582 Going Forward
• Henceforth, models will become more complex, but the procedure will be the same:
  • Linear model design and set-up
  • Experimental designs and observational studies / use
  • Assumptions
  • Evaluation
  • Multiple comparisons
  • Example
BIOL 582 Categorical response data
• Until now, we have considered only continuous data for testing hypotheses that involve comparison of groups
  • I.e., continuous response ~ factor
• What about categorical response ~ factor?
• Contingency Table (aka Test of Independence) via example (data from Zar 1984)
H0: Human hair color is independent of sex
HA: Human hair color is not independent of sex
I.e., are the distributions of hair colors consistent between males and females?
BIOL 582 Categorical response data
• Contingency Table (aka Test of Independence) via example (data from Zar 1984)
• Create a table of expected frequencies if the null hypothesis were true
• The totals should still be the same. What values go into the cells?
• Eij = Ri × Cj / T, where Ri, Cj, and T represent the i-th row sum, the j-th column sum, and the grand total, respectively (a one-line R version follows)
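In R, the whole expected table can be computed in one line; obs is a hypothetical name for the observed matrix of counts:

> # expected count for cell (i,j) = (row i sum) * (column j sum) / grand total
> expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)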
BIOL 582 Categorical response data
• Contingency Table (aka Test of Independence) via example (data from Zar 1984)
• Observed and expected frequency tables (hair color by sex)
• Test statistic: χ² = Σ (Oij − Eij)² / Eij, summed over all r × c cells
• For this example, χ² = 8.987
Find the percentile of the test statistic in a χ² distribution with (r−1)(c−1) df to get a P-value.
r = number of rows = 2
c = number of columns = 4
Thus df = 3. The χ² test statistic of 8.987 yields a P-value of 0.029 in this case, which would reject the null hypothesis at α = 0.05.
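The P-value step can be done directly in R with the χ² distribution function:

> # upper-tail probability of 8.987 in a chi-square distribution with 3 df
> pchisq(8.987, df=3, lower.tail=FALSE)  # approximately 0.029, as reported above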
BIOL 582 Categorical response data
• Contingency Table Assumptions and Rules
  • The data are a random sample from a population or from subjects randomly assigned to experimental treatments
  • Sufficient sample size (usually 5 or more expected in each cell). The chi-square statistic is “shaky” when cell counts are low. However, 0 might be an appropriate response. If low cell counts are a problem, exact or randomization tests are probably advisable (e.g., Fisher’s Exact Test).
  • Independent observations (i.e., values within cells are not repeatedly measured; values between cells are not collected from the same subjects)
• Contingency Table analysis in R
  • See http://www.statmethods.net/stats/frequencies.html
  • Involves setting up the table and then analyzing accordingly (a sketch follows)
  • But easy to do “by hand”
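A sketch of the set-up-and-analyze workflow; the counts here are hypothetical, not the Zar (1984) data:

> # hypothetical 2 x 4 table of counts (sex by hair color)
> obs <- matrix(c(32, 55, 43, 16,
+                 29, 43, 63, 64),
+               nrow=2, byrow=TRUE,
+               dimnames=list(Sex=c("Male","Female"),
+                             Hair=c("Black","Brown","Blond","Red")))
> chisq.test(obs)   # Pearson chi-square test of independence
> fisher.test(obs)  # exact alternative when cell counts are small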