This example demonstrates model validation and prediction for the digestibility of fat for different proportions of stearic acid in the fat. Residuals, residual standard error, and residual analysis are used to validate the model. Confidence intervals and standard errors are also discussed.
Chapter 7 Model validation and prediction
Example: Stearic acid and digestibility Digestibility of fat for different proportions of stearic acid in the fat. The line is y = −0.93· x + 96.53.
Example: Stearic acid and digestibility Residuals for the dataset on digestibility and stearic acid. The vertical lines between the model (the straight line) and the observations are the residuals.
Residual standard error The sample standard deviation measures the average distance from the observations to their mean. In linear regression the residuals measure the distance from each observed value to the value predicted by the line. Thus, we can calculate the standard deviation of the residuals in the same way; it is called the residual standard error. It describes how well the line predicts: if the residual standard deviation is small, the observations generally lie close to the fitted line, and if it is large, they lie further away.
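As a minimal sketch of the calculation described above, the following fits a least squares line and computes the residual standard deviation s = sqrt(SSe / (n − 2)). The function names and the data are hypothetical illustrations; the numbers are not the digestibility measurements.

```python
import math

def fit_line(x, y):
    """Least squares estimates (intercept, slope) for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    intercept = y_bar - slope * x_bar
    return intercept, slope

def residual_sd(x, y):
    """Residual standard deviation: sqrt(SSe / (n - 2))."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_e = sum(r ** 2 for r in residuals)
    return math.sqrt(ss_e / (len(x) - 2))

# Hypothetical data for illustration only (NOT the digestibility data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
s = residual_sd(x, y)
print(round(s, 3))
```

A small s relative to the spread of y indicates that the points lie close to the fitted line.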
Residual analysis The residuals are standardized with their standard error: $\tilde{r}_i = r_i / \mathrm{SE}(r_i)$. If the model assumptions hold, the standardized residuals resemble a sample from the normal distribution with mean zero and standard deviation one. Models are usually validated with a residual plot.
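The standardization can be sketched as follows. The slide does not spell out the standard error of a residual; this sketch assumes simple linear regression and the usual expression SE(r_i) = s · sqrt(1 − h_i), with leverage h_i = 1/n + (x_i − x̄)²/SSx. The function name and the data are hypothetical.

```python
import math

def standardized_residuals(x, y):
    """Residuals divided by their standard errors, s * sqrt(1 - leverage)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    result = []
    for xi, ri in zip(x, residuals):
        # Assumed leverage formula for simple linear regression
        leverage = 1 / n + (xi - x_bar) ** 2 / ss_x
        result.append(ri / (s * math.sqrt(1 - leverage)))
    return result

# Hypothetical data for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
std_res = standardized_residuals(x, y)
```

Values far outside the range from about −2 to 2 would be candidates for a closer look in the residual plot.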
Example: Stearic acid and digestibility Residual analysis for the digestibility data: residual plot (left) and QQ-plot (right) of the standardized residuals. The straight line has intercept zero and slope one.
Model validation based on residuals Plot the standardized residuals against the predicted values. The points should be spread randomly in the vertical direction, without any systematic patterns. In particular,
• points should be roughly equally distributed between positive and negative values in all parts of the plot (from left to right);
• there should be roughly the same variation in the vertical direction in all parts of the plot (from left to right);
• there should be no overly extreme points.
Systematic deviations from these three requirements correspond to problems with the mean structure, the variance homogeneity, or the normal distribution, respectively.
Example: Stearic acid and digestibility There seem to be both positive and negative residuals in all parts of the plot (from left to right; for small, medium, as well as large predicted values). This indicates that the specification of the digestibility mean as a linear function of the stearic acid level is appropriate. There seems to be roughly the same vertical variation for small, medium, and large predicted values. This indicates that the standard deviation is the same for all observations (homoscedasticity, or homogeneity of variance). There are neither very small nor very large standardized residuals. This indicates that there are no outliers and that it is not unreasonable to use the normal distribution.
Example: Growth of duckweed Top panel shows the original duckweed data. Bottom left shows the data and fitted regression line after logarithmic transformation and bottom right shows the fitted line transformed back to the original scale.
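The transform, fit, back-transform sequence described in this caption can be sketched as below. The counts are hypothetical values that roughly double each day; they are not the duckweed data, and the names are illustrative.

```python
import math

def fit_line(x, y):
    """Least squares estimates (intercept, slope) for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    return y_bar - slope * x_bar, slope

# Hypothetical leaf counts (NOT the duckweed data)
days = [0, 1, 2, 3, 4, 5]
leaves = [10, 21, 39, 82, 158, 325]

# Fit the straight line on the logarithmic scale
log_leaves = [math.log(v) for v in leaves]
intercept, slope = fit_line(days, log_leaves)

def predict_leaves(day):
    """Back-transform the fitted line to the original scale."""
    return math.exp(intercept + slope * day)

print(round(slope, 3), round(predict_leaves(6), 1))
```

On the original scale the fitted curve is exponential: each extra day multiplies the predicted count by exp(slope).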
Example: Growth of duckweed Residual plots for the duckweed data. Left panel: linear regression with the leaf counts as response. Right panel: linear regression with the logarithmic leaf counts as response.
Example: Chlorophyll concentration Upper left panel: scatter plot of the data. Remaining panels: residual plots for the regression of nitrogen concentration (N) on chlorophyll content (C) in the plants (upper right), for the regression of log(N) on C (lower left), and for the regression of the square root of N on C (lower right).
Confidence interval Construction. In general we denote the confidence level by 1−α, so that 95% and 90% confidence intervals correspond to α = 0.05 and α = 0.10, respectively. The relevant t quantile is the 1−α/2 quantile, which leaves probability α/2 to the right. The (1−α) confidence interval for a parameter θ is then of the form $\hat{\theta} \pm t_{1-\alpha/2,\,r} \cdot \mathrm{SE}(\hat{\theta})$, where r is the degrees of freedom. Interpretation. The 1−α confidence interval includes the values of θ for which it is reasonable, at confidence level 1−α, to believe that they could have generated the data. If we repeated the experiment many times, then a fraction 1−α of the corresponding confidence intervals would include the true value of θ.
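Standard errors in linear regression Consider the linear regression model $y_i = \alpha + \beta x_i + e_i$, $i = 1, \ldots, n$. As already derived, the least squares estimates for slope and intercept are $\hat{\beta} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) / SS_x$ and $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$. Define SSx and s respectively by $SS_x = \sum_{i=1}^{n} (x_i - \bar{x})^2$ and $s = \sqrt{SS_e/(n-2)}$, where $SS_e = \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta}x_i)^2$ is the residual sum of squares. The standard errors for the estimates are $\mathrm{SE}(\hat{\beta}) = s/\sqrt{SS_x}$ and $\mathrm{SE}(\hat{\alpha}) = s\sqrt{1/n + \bar{x}^2/SS_x}$.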
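A minimal sketch of these quantities in code, assuming the standard textbook expressions SE(slope) = s / sqrt(SSx) and SE(intercept) = s · sqrt(1/n + x̄²/SSx). The function name, the dictionary keys, and the data are hypothetical.

```python
import math

def regression_summary(x, y):
    """Estimates, residual standard deviation and standard errors for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    intercept = y_bar - slope * x_bar
    ss_e = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_e / (n - 2))
    return {
        "intercept": intercept,
        "slope": slope,
        "s": s,
        "se_slope": s / math.sqrt(ss_x),
        "se_intercept": s * math.sqrt(1 / n + x_bar ** 2 / ss_x),
    }

# Hypothetical data for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = regression_summary(x, y)
```

The slope's standard error shrinks when the x values are more spread out (larger SSx), which is one reason to design experiments with a wide range of x.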
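Example: Stearic acid and digestibility Consider the linear regression model which describes the association between the level of stearic acid and digestibility. The residual sum of squares SSe = 61.7645 is used to calculate the residual standard deviation, $s = \sqrt{SS_e/(n-2)} = \sqrt{61.7645/7} \approx 2.970$, which in turn enters the standard errors of the estimates.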
Example: Stearic acid and digestibility There are n = 9 observations and p = 2 parameters. Thus, we need quantiles from the t distribution with 7 degrees of freedom. Since t0.95,7 = 1.895 and t0.975,7 = 2.365, we can compute the 90% and the 95% confidence intervals for the slope parameter β as $\hat{\beta} \pm t \cdot \mathrm{SE}(\hat{\beta})$ with the respective quantile. The 90% interval is (−1.11, −0.76). Hence, decrements between 0.76 and 1.11 percentage points of the digestibility per unit increment of stearic acid level are in agreement with the data at the 90% confidence level.
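The arithmetic behind such intervals can be sketched as follows. The standard error of the slope is not stated in this transcript, so the sketch backs it out of the reported 90% interval (decrements between 0.76 and 1.11, that is, slopes between −1.11 and −0.76); because the endpoints are rounded, the resulting numbers are approximate. The variable and function names are hypothetical.

```python
# Quantities reported in the text
t_90 = 1.895             # t quantile 0.95 with 7 degrees of freedom
t_95 = 2.365             # t quantile 0.975 with 7 degrees of freedom
ci_90 = (-1.11, -0.76)   # reported 90% interval for the slope (rounded)

# Back out the centre and the standard error from the rounded interval
estimate = (ci_90[0] + ci_90[1]) / 2
se = (ci_90[1] - ci_90[0]) / 2 / t_90

def confidence_interval(estimate, se, t_quantile):
    """Interval of the form estimate +/- t_quantile * se."""
    half_width = t_quantile * se
    return estimate - half_width, estimate + half_width

ci_95 = confidence_interval(estimate, se, t_95)
print(round(se, 4), tuple(round(v, 2) for v in ci_95))
```

The 95% interval is wider than the 90% interval because the same standard error is multiplied by a larger quantile.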
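Confidence interval for prediction The expected value at x0 is estimated by inserting the estimates of the intercept and the slope into the model: $\hat{y}_0 = \hat{\alpha} + \hat{\beta}x_0$. Its standard error, $\mathrm{SE}(\hat{y}_0) = s\sqrt{1/n + (x_0 - \bar{x})^2/SS_x}$, takes the estimation error into account and thus gives rise to the confidence interval $\hat{y}_0 \pm t_{1-\alpha/2,\,n-2} \cdot \mathrm{SE}(\hat{y}_0)$ for the expected value y0 = α + β x0.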
Example: Stearic acid and digestibility If we consider a stearic acid level of x0 = 20%, then we expect a digestibility percentage of about $\hat{\mu}_0 = 96.53 - 0.93 \cdot 20 \approx 77.9$, where μ0 = α + β x0 denotes the expected value at x0. The 95% confidence interval for μ0 is $\hat{\mu}_0 \pm t_{0.975,7} \cdot \mathrm{SE}(\hat{\mu}_0)$, which gives (75.2, 80.5). In conclusion, expected digestibility percentages between 75.2 and 80.5 at a stearic acid level of 20% are in accordance with the data at the 95% confidence level.
Example: Stearic acid and digestibility We can calculate the 95% confidence interval for the expected digestibility percentage for other values of the stearic acid level.
Prediction interval However, a new observation y0 is also subject to observation error. The observation error has standard deviation σ, and the prediction interval should take this source of variation into account, too. Intuitively, this corresponds to adding s² to the squared standard error of the expected value. Hence, the 95% prediction interval is computed as follows: $\hat{y}_0 \pm t_{0.975,\,n-2} \cdot s\sqrt{1 + 1/n + (x_0-\bar{x})^2/SS_x}$. The interpretation is that a (new) random observation with x = x0 will belong to this interval with probability 95%.
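To make the difference between the two intervals concrete, here is a minimal sketch that computes both at a given x0. It assumes the standard simple linear regression expressions, in which the prediction interval has an extra 1 under the square root. The data are hypothetical, with n = 9 chosen so that the quantile t0.975,7 = 2.365 quoted in the text applies; the function name and dictionary keys are illustrative.

```python
import math

def intervals_at(x, y, x0, t_quantile):
    """Fitted value, confidence interval and prediction interval at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    intercept = y_bar - slope * x_bar
    ss_e = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_e / (n - 2))

    fit = intercept + slope * x0
    shared = 1 / n + (x0 - x_bar) ** 2 / ss_x             # estimation error only
    ci_half = t_quantile * s * math.sqrt(shared)
    pi_half = t_quantile * s * math.sqrt(1 + shared)      # adds observation error
    return {
        "fit": fit,
        "ci": (fit - ci_half, fit + ci_half),
        "pi": (fit - pi_half, fit + pi_half),
    }

# Hypothetical data for illustration only (n = 9, so 7 degrees of freedom)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
y = [2.3, 3.8, 6.1, 8.2, 9.7, 12.4, 13.8, 16.1, 18.2]
at_mean = intervals_at(x, y, 5.0, 2.365)   # x0 equal to the mean of x
at_edge = intervals_at(x, y, 9.0, 2.365)   # x0 at the edge of the data
```

Both intervals are centred at the fitted value, and both are narrowest when x0 equals the mean of the x values.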
Confidence and prediction intervals
• Interpretation. The confidence interval includes the expected values that are in accordance with the data (with a certain degree of confidence), whereas a new observation will be within the prediction interval with a certain probability.
• Interval widths. The prediction interval is wider than the corresponding confidence interval.
• Dependence on sample size. The confidence interval can be made as narrow as we want by increasing the sample size. This is not the case for the prediction interval.
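The dependence on sample size can be read off the half-widths of the two intervals. The following is a sketch under the usual assumptions that SSx grows without bound as observations are added and that s approaches σ; the limit 1.96 is the 0.975 quantile of the standard normal distribution, which the t quantile approaches.

```latex
\begin{align*}
\text{confidence half-width:} \quad
  & t_{0.975,\,n-2}\; s\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{SS_x}}
    \;\longrightarrow\; 0
    && (n \to \infty), \\
\text{prediction half-width:} \quad
  & t_{0.975,\,n-2}\; s\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{SS_x}}
    \;\longrightarrow\; 1.96\,\sigma
    && (n \to \infty).
\end{align*}
```

More data removes the uncertainty about the line, but not the observation error of a single new measurement.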
Example: Stearic acid and digestibility Predicted values (solid line), pointwise 95% prediction intervals (dashed lines), and pointwise 95% confidence intervals (dotted lines) for the digestibility data. The prediction intervals are wider than the confidence intervals. Also notice that the confidence bands and the prediction bands are not straight lines: the closer x0 is to the mean value, the more precise the prediction—reflecting that there is more information close to the mean.
One-way ANOVA model In the one-way ANOVA setup with k groups, the group means α1,…,αk are parameters, and we write the one-way ANOVA model $y_i = \alpha_{g(i)} + e_i$, $i = 1, \ldots, n$, where g(i) = xi is the group that corresponds to yi. The remainder terms e1,…,en are independent and N(0,σ²) distributed. In other words, it is assumed that there is a normal distribution for each group, with means that are different from group to group and given by the α's but with the same standard deviation in all groups (namely, σ) representing the within-group variation. The parameters of the model are α1,…,αk and σ, where αj is the expected value (or the population average) in the jth group. In particular, we are often interested in the group differences αj−αl, since they provide the relevant information if we want to compare the jth and the lth group.
Standard error in one-way ANOVA Consider the one-way ANOVA model $y_i = \alpha_{g(i)} + e_i$, $i = 1, \ldots, n$, where g(i) denotes the group corresponding to the ith observation and e1,…,en are independent and N(0,σ²) distributed. Then the estimates for the group means α1,…,αk are simply the group averages, $\hat{\alpha}_j = \bar{y}_j$, and the corresponding standard errors are given by $\mathrm{SE}(\hat{\alpha}_j) = s/\sqrt{n_j}$, where nj is the number of observations in group j. This shows that mean parameters for groups with many observations (large nj) are estimated with greater precision than mean parameters for groups with few observations.
Standard error in one-way ANOVA In the ANOVA setup the residual variance s² is given by $s^2 = SS_e/(n-k) = \frac{1}{n-k}\sum_{i=1}^{n} (y_i - \hat{\alpha}_{g(i)})^2$, which we call the pooled variance estimate. In the one-way ANOVA case we are very often interested in the differences or contrasts between group levels rather than the levels themselves. Hence, we are interested in quantities αj−αl for two groups j and l. The estimate is simply the difference between the two estimates, $\hat{\alpha}_j - \hat{\alpha}_l$, and the corresponding standard error is given by $\mathrm{SE}(\hat{\alpha}_j - \hat{\alpha}_l) = s\sqrt{1/n_j + 1/n_l}$. The formulas above are particularly useful for two samples (k = 2).
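A minimal sketch of the pooled standard deviation and the two standard errors, assuming the standard expressions s² = SSe/(n − k), SE of a group mean s/sqrt(nj), and SE of a difference s · sqrt(1/nj + 1/nl). The function names, group labels, and data are hypothetical.

```python
import math

def pooled_sd(groups):
    """Pooled standard deviation: sqrt(SSe / (n - k)) over all groups."""
    n = sum(len(values) for values in groups.values())
    k = len(groups)
    ss_e = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_e += sum((v - mean) ** 2 for v in values)
    return math.sqrt(ss_e / (n - k))

def se_group_mean(groups, j):
    """Standard error of the estimated mean of group j."""
    return pooled_sd(groups) / math.sqrt(len(groups[j]))

def se_difference(groups, j, l):
    """Standard error of the estimated difference between groups j and l."""
    return pooled_sd(groups) * math.sqrt(1 / len(groups[j]) + 1 / len(groups[l]))

# Hypothetical data for illustration only
groups = {"A": [4.0, 5.0, 6.0], "B": [7.0, 9.0, 11.0, 13.0]}
s = pooled_sd(groups)
se_a = se_group_mean(groups, "A")
se_diff = se_difference(groups, "A", "B")
```

Note that every group contributes to s, so the standard error for one group also benefits from observations in the other groups.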
Example: Parasite counts for salmons An experiment with two different salmon stocks, from River Conon in Scotland and from River Ätran in Sweden, was carried out as follows. Thirteen fish from each stock were infected, and after four weeks the number of parasites of a certain type was counted. The statistical model for the salmon data is given by $y_i = \alpha_{g(i)} + e_i$, $i = 1, \ldots, 26$, where g(i) is either “Ätran” or “Conon” and e1,…,e26 are independent and N(0,σ²) distributed. In other words, Ätran observations are N(αÄtran,σ²) distributed, and Conon observations are N(αConon,σ²) distributed.
Example: Parasite counts for salmons We can compute the group means and the residual standard deviation. The difference in parasite counts is estimated by $\hat{\alpha}_{\text{Ätran}} - \hat{\alpha}_{\text{Conon}}$ with a standard error of $s\sqrt{1/13 + 1/13}$. The 95% confidence interval for the difference is given by the estimate $\pm\, t_{0.975,24}$ times this standard error, since there are n − k = 26 − 2 = 24 degrees of freedom. We see that the data are not in accordance with a difference of zero between the stock means. Thus, the data suggest that Ätran salmon are more susceptible than Conon salmon to parasites during an infection.
Example: Dung decomposition An experiment with dung from heifers was carried out in order to explore the influence of antibiotics on the decomposition of dung organic material. As part of the experiment, 36 heifers were divided into six groups. All heifers were fed a standard feed, and antibiotics of different types (alpha-Cypermethrin, Enrofloxacin, Fenbendazole, Ivermectin, Spiramycin) were added to the feed for heifers in five of the groups. No antibiotics were added for heifers in the remaining group (the control group). For each heifer, a bag of dung was dug into the soil, and after eight weeks the amount of organic material was measured for each bag.
Linear model Consider the comparison of groups through the model $y_i = \alpha_{g(i)} + e_i$, $i = 1, \ldots, n$, where g(i) is the group that observation i belongs to and e1,…,en are the remainder terms. As usual, k denotes the number of groups. A typical test in a linear model has a null hypothesis of the form αj = 0 for a single parameter. In this study, however, we are interested in whether there is any difference between the groups. Thus, the null hypothesis is given by $H_0: \alpha_1 = \cdots = \alpha_k$, and the alternative is the opposite; namely, that at least two α's are different.
Analysis of variance (ANOVA) Since only large values of F are critical, we have $p\text{-value} = P(F \geq F_{\text{obs}})$, where F follows the F(k−1,n−k) distribution. The hypothesis is rejected if the p-value is 0.05 or smaller (if 0.05 is the significance level). In particular, H0 is rejected at the 5% significance level if Fobs ≥ F0.95,k−1,n−k.
F-test If there is no difference between any of the groups (H0 is true), then the group averages will be of similar size and be similar to the total mean $\bar{y}$. Hence, MSgrp will be “small”. On the other hand, if groups 1 and 2, say, are different (H0 is false), then the group means will be somewhat different and cannot all be similar to $\bar{y}$; hence, MSgrp will be “large”. “Small” and “large” should be measured relative to the within-group variation, and MSgrp is thus standardized with MSe. We use $F_{\text{obs}} = MS_{\text{grp}}/MS_e$ as the test statistic and note that large values of Fobs are critical; that is, not in agreement with the hypothesis.
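The test statistic can be sketched in code as follows. The slides do not reproduce the formulas for the mean squares, so this sketch assumes the standard definitions MSgrp = SSgrp/(k − 1), with SSgrp the sum over groups of nj times the squared distance from the group average to the total mean, and MSe = SSe/(n − k). The function name and the data are hypothetical.

```python
def f_statistic(groups):
    """Return (F_obs, (k - 1, n - k)) for a list of groups of observations."""
    all_values = [v for group in groups for v in group]
    n = len(all_values)
    k = len(groups)
    total_mean = sum(all_values) / n

    ss_grp = 0.0   # between-group variation
    ss_e = 0.0     # within-group variation
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_grp += len(group) * (group_mean - total_mean) ** 2
        ss_e += sum((v - group_mean) ** 2 for v in group)

    ms_grp = ss_grp / (k - 1)
    ms_e = ss_e / (n - k)
    return ms_grp / ms_e, (k - 1, n - k)

# Hypothetical data for illustration only: three clearly separated groups
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]
f_obs, df = f_statistic(groups)
print(f_obs, df)
```

The p-value is the upper tail probability of the F distribution with these degrees of freedom at f_obs; a statistics library can supply it, for example the survival function of `scipy.stats.f`, if SciPy is available.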
F-test If the null hypothesis is true, then Fobs comes from a so-called F distribution with (k−1,n−k) degrees of freedom. Notice that there is a pair of degrees of freedom (not just a single value) and that the relevant degrees of freedom are the same as those used for computation of MSgrp and MSe. The density for the F distribution is shown for three different pairs of degrees of freedom in the left panel below.
Example: Antibiotics on decomposition The values are listed in an ANOVA table. The F value 7.97 is very extreme, corresponding to a very small p-value. Thus, we reject the hypothesis and conclude that there is strong evidence of group differences. Subsequently, we need to quantify the conclusion further. Which groups are different and how large are the differences?