A regression analysis of the relationship between the total cost of producing electricity and the number of units produced.
Example 15.5 Modeling Possibilities
POWER.XLS • The Public Service Electric Company produces different quantities of electricity each month, depending on the demand. • This file lists the number of units of electricity produced (Units) and the total cost of producing these (Cost) for a 36-month period. • The data set appears on the next slide. • How can regression be used to analyze the relationship between Cost and Units?
Solution • A good place to start is with a scatterplot of Cost versus Units.
Solution -- continued • The scatterplot indicates a definite positive relationship and one that is nearly linear. • However, there is also some evidence of curvature in the plot. The points increase slightly less rapidly as Units increase from left to right. • In economic terms, there may be economies of scale, where the marginal cost of electricity decreases as more units are produced. • Nevertheless, we first use regression to estimate a linear relationship between Cost and Units.
Solution -- continued • The resulting regression equation is Predicted Cost = 23,651 + 30.53 Units. • The corresponding R² and s_e are 73.6% and $2734. • We also requested a scatterplot of the residuals versus the fitted values, which is on the next slide. Obtaining this scatterplot is always a good idea if nonlinearity is suspected. • The sign of nonlinearity in this plot is that the residuals at the far left and the far right are all negative, whereas the majority of the residuals in the middle are positive.
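The residual check described above can be sketched in a few lines of Python. This is only an illustration: the Units and Cost values below are invented (the POWER.XLS data is not reproduced here), and fit_line is a hypothetical helper, not a StatPro or Excel function. The point is the pattern to look for: negative residuals at both ends and positive residuals in the middle.

```python
# Hypothetical illustration: these numbers are invented, NOT the POWER.XLS data.
units = [300, 400, 500, 600, 700, 800, 900]
cost = [27200, 33100, 39400, 43300, 47300, 49200, 51100]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = fit_line(units, cost)
fitted = [a + b * u for u in units]
residuals = [c - f for c, f in zip(cost, fitted)]

# Residuals versus fitted values, listed from left to right.
for f, r in zip(fitted, residuals):
    sign = "+" if r > 0 else "-"
    print(f"fitted={f:9.1f}  residual={r:9.1f}  {sign}")
```

With curved data like this, the printed signs run negative, then positive, then negative again, which is the same hint of nonlinearity the residual plot gives.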
Solution -- continued • Admittedly the pattern is far from perfect - there are a few negatives in the middle - but the plot does hint at nonlinear behavior. • The negative-positive-negative behavior of the residuals suggests a parabola; that is, a quadratic equation with the square of Units included in the equation. • We first create a new variable Sqr_Units in the data set. This can be done manually or using StatPro’s Transform Variables menu item.
Solution -- continued • Then we use multiple regression to estimate the equation for Cost with both explanatory variables, Units and Sqr_Units, included. • The resulting equation from the output on the next slide is Predicted Cost = 5793 + 98.3 Units - 0.0600 Sqr_Units. • Note that R² has increased to 82.2% and s_e has decreased to $2281.
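A minimal sketch of the same two steps (create Sqr_Units, then regress Cost on Units and Sqr_Units) using NumPy rather than StatPro. The data values are invented for illustration and are not the POWER.XLS data. The variable names, and the choice of numpy.linalg.lstsq as the solver, are assumptions of this sketch.

```python
import numpy as np

# Hypothetical illustration: these numbers are invented, NOT the POWER.XLS data.
units = np.array([300, 400, 500, 600, 700, 800, 900], dtype=float)
cost = np.array([27200, 33100, 39400, 43300, 47300, 49200, 51100], dtype=float)

sqr_units = units ** 2  # the new explanatory variable Sqr_Units

# Design matrix with columns: constant, Units, Sqr_Units.
X = np.column_stack([np.ones_like(units), units, sqr_units])
coef, *_ = np.linalg.lstsq(X, cost, rcond=None)
intercept, b_units, b_sqr = coef

fitted = X @ coef
residuals = cost - fitted

n = len(cost)
k = 2  # number of explanatory variables
sse = float(residuals @ residuals)
sst = float(((cost - cost.mean()) ** 2).sum())
r_squared = 1 - sse / sst
std_err = (sse / (n - k - 1)) ** 0.5  # standard error of estimate

print(f"Predicted Cost = {intercept:.1f} + {b_units:.3f} Units + {b_sqr:.5f} Sqr_Units")
print(f"R-squared = {r_squared:.4f}, s_e = {std_err:.1f}")
```

Comparing r_squared and std_err from this fit with the same two quantities from the linear fit is the comparison the slide makes between 73.6% / $2734 and 82.2% / $2281.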
Solution -- continued • One way to see how this regression equation fits the scatterplot of Cost versus Units is to use Excel’s trendline option. • To do so, activate the scatterplot, click on any point, use the Chart/Add Trendline menu item, click the Type tab, and select the Polynomial type of order 2, that is, a quadratic. • A graph of the equation is superimposed on the scatterplot on the following slide. It shows a reasonably good fit, plus an obvious curvature.
Solution -- continued • The main disadvantage of a quadratic regression equation is that there is no easy interpretation of the coefficients of Units and Sqr_Units. • All we can say is that the terms in the equation combine to explain the nonlinear relationship between units produced and total cost. • A final note about the equation concerns the coefficient of Sqr_Units. • First, the fact that it is negative makes the parabola bend downward. This produces the decreasing marginal cost behavior, where each extra unit of electricity adds less to total cost than the previous one.
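The decreasing marginal cost can be checked directly from the quadratic equation quoted earlier: differentiating 5793 + 98.3 Units - 0.0600 Units² with respect to Units gives 98.3 - 0.12 Units. The sketch below just evaluates that derivative; the function names are hypothetical, and the three Units values are arbitrary examples rather than values taken from the data set.

```python
# Coefficients of the fitted quadratic equation as quoted on the slides.
INTERCEPT = 5793
B_UNITS = 98.3
B_SQR = -0.0600


def predicted_cost(units):
    """Predicted Cost from the quadratic equation."""
    return INTERCEPT + B_UNITS * units + B_SQR * units ** 2


def marginal_cost(units):
    """Derivative of predicted cost with respect to Units."""
    return B_UNITS + 2 * B_SQR * units


# Arbitrary example production levels (not taken from POWER.XLS).
for u in (400, 600, 800):
    print(f"Units={u}: predicted cost={predicted_cost(u):9.1f}, "
          f"marginal cost={marginal_cost(u):6.2f}")
```

At 600 units the derivative is 98.3 - 0.12 × 600 = 26.3 dollars per extra unit, and it keeps falling as Units grows. It reaches zero near 819 units, so the fitted parabola should not be trusted far beyond the range of Units actually observed.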
Solution -- continued • Second, we shouldn’t be fooled by the small magnitude of this coefficient. Remember that it is the coefficient of Units squared, which is a large quantity. Therefore, the effect of the product -0.0600Sqr_Units is sizable. • One other possibility we might examine is a logarithmic fit. • In this case we create a new variable Log_Units, the natural logarithm of Units, and then regress Cost against the single variable Log_Units.
Solution -- continued • To create the new variable we can again use StatPro’s Transform Variables menu item. • We can then superimpose a logarithmic curve on the scatterplot of Cost versus Units by using the trendline feature. • This curve appears in the scatterplot on the next slide. • To the naked eye, it appears to be similar to the quadratic curve and about as good a fit.
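The logarithmic fit can be sketched the same way as a simple regression: transform Units to Log_Units with the natural logarithm, then regress Cost on that single variable. The data values below are invented for illustration (not the POWER.XLS data), and fit_line and predict are hypothetical helper names.

```python
import math

# Hypothetical illustration: these numbers are invented, NOT the POWER.XLS data.
units = [300, 400, 500, 600, 700, 800, 900]
cost = [27200, 33100, 39400, 43300, 47300, 49200, 51100]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# New explanatory variable: natural logarithm of Units.
log_units = [math.log(u) for u in units]
a, b = fit_line(log_units, cost)


def predict(u):
    """Predicted Cost from the logarithmic equation a + b * ln(Units)."""
    return a + b * math.log(u)


print(f"Predicted Cost = {a:.0f} + {b:.0f} Log_Units")
print(f"Predicted Cost at 600 units: {predict(600):.0f}")
```

Only the explanatory variable is transformed; Cost stays in dollars, which is what makes the coefficient of Log_Units easy to read as a dollar effect.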
Solution -- continued • The resulting regression equation is Predicted Cost = -63,993 + 16,654 Log_Units. • The values of R² and s_e are 79.8% and $2393. • These latter values indicate that the logarithmic fit is not quite as good as the quadratic fit. • However, the advantage of the logarithmic equation is that it is easier to interpret.
Solution -- continued • In this case, where the log of the explanatory variable is used, we can interpret its coefficient as follows. • Suppose Units increases by 1%, for example from 600 to 606. Then Log_Units increases by ln(1.01), which is approximately 0.01, so the equation implies that the expected Cost will increase by approximately 16,654 × 0.01 = $166.54. • In words, every 1% increase in Units is accompanied by an expected increase of about $166.54 in Cost. • Note that for larger values of Units, a 1% increase represents a larger absolute increase. But each such 1% increase entails the same increase in Cost. This is another way of describing the decreasing marginal cost property.
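The 1% interpretation can be verified numerically. The sketch below uses only the Log_Units coefficient quoted above (the intercept cancels when two predictions are subtracted). The function name cost_change is hypothetical, and the second starting level of 1200 units is an arbitrary example chosen to show that the dollar increase does not depend on where the 1% increase starts.

```python
import math

B_LOG = 16654  # coefficient of Log_Units in the fitted equation


def cost_change(units_from, units_to):
    """Change in predicted Cost between two production levels.

    The intercept cancels, leaving B_LOG * (ln(units_to) - ln(units_from)).
    """
    return B_LOG * (math.log(units_to) - math.log(units_from))


rule_of_thumb = B_LOG * 0.01  # the "coefficient divided by 100" approximation

# 600 comes from the slide's example; 1200 is an arbitrary second level.
for start in (600, 1200):
    end = start * 1.01
    print(f"{start} -> {end:.0f} units: "
          f"change in predicted cost = {cost_change(start, end):.2f}")

print(f"rule-of-thumb value = {rule_of_thumb:.2f}")
```

The exact change, 16,654 × ln(1.01), comes out slightly below the rule-of-thumb figure of $166.54 because ln(1.01) is a little less than 0.01. The two starting levels give the same dollar change even though the absolute increase in Units is twice as large at 1200.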