Business Forecasting Chapter 8 Forecasting with Multiple Regression
Chapter Topics • The Multiple Regression Model • Estimating the Multiple Regression Model—The Least Squares Method • The Standard Error of Estimate • Multiple Correlation Analysis • Partial Correlation • Partial Coefficient of Determination
Chapter Topics (continued) • Inferences Regarding Regression and Correlation Coefficients • The F-Test • The t-test • Confidence Interval • Validation of the Regression Model for Forecasting • Serial or Autocorrelation
Chapter Topics (continued) • Equal Variances or Homoscedasticity • Multicollinearity • Curvilinear Regression Analysis • The Polynomial Curve • Application to Management • Chapter Summary
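The Multiple Regression Model The relationship between one dependent variable and two or more independent variables is a linear function: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$ • $Y_i$: dependent (response) variable • $\beta_0$: population Y-intercept • $\beta_1, \dots, \beta_k$: population slopes • $X_{1i}, \dots, X_{ki}$: independent (explanatory) variables • $\varepsilon_i$: random error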
Interpretation of Estimated Coefficients • Slope (bi) • The average value of Y is estimated to change by bi for each 1-unit increase in Xi, holding all other variables constant (ceteris paribus). • Example: If b1 = −2, then fuel oil usage (Y) is expected to decrease by an estimated 2 gallons for each 1-degree increase in temperature (X1), holding the inches of insulation (X2) constant. • Y-Intercept (b0) • The estimated average value of Y when all Xi = 0.
Multiple Regression Model: Example Develop a model for estimating heating oil used for a single-family home in the month of January, based on average temperature (°F) and amount of insulation in inches.
Multiple Regression Equation: Example Excel Output For each degree increase in temperature, the estimated average amount of heating oil used is decreased by 4.86 gallons, holding insulation constant. For each increase in one inch of insulation, the estimated average use of heating oil is decreased by 15.07 gallons, holding temperature constant.
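As a rough illustration of the least squares method behind coefficients like these, the sketch below fits a two-variable regression by solving the normal equations in plain Python. The data are made up for illustration (they are not the heating oil sample), and the helper names `solve_linear_system` and `fit_ols` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot (assumes a is non-singular).
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ols(xs, y):
    """Return [b0, b1, ..., bk] that minimize the sum of squared residuals."""
    design = [[1.0] + list(row) for row in xs]  # prepend the intercept column
    p = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up observations: (temperature, insulation) pairs, not the textbook data.
xs = [(20, 3), (30, 6), (40, 10), (50, 3), (60, 6), (25, 10)]
# Responses generated without noise from Y = 500 - 5*X1 - 15*X2.
y = [500 - 5 * t - 15 * ins for t, ins in xs]

b0, b1, b2 = fit_ols(xs, y)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

Because the made-up responses lie exactly on a plane, the fit should recover that plane's intercept and slopes. With real data the same steps give the estimates that a regression tool reports.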
Multiple Regression Using Excel • Stat | Regression … • EXCEL spreadsheet for the heating oil example.
Simple and Multiple Regression Compared • The coefficient in a simple regression picks up the impact of that variable on the dependent variable, plus the impacts of any other variables that are correlated with it. • Coefficients in a multiple regression account for the impacts of the other variables in the equation.
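The difference can be seen in a small sketch. The data below are invented so that X2 is correlated with X1 and the true slope on X1 is 2. The two-predictor slope uses the standard closed-form expression in terms of centered sums of squares and cross-products; the helper name `centered_sum` is hypothetical.

```python
def centered_sum(a, b):
    """Sum of cross-products of deviations from the means."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


# Invented data: X2 tends to rise with X1, and Y = 10 + 2*X1 + 3*X2 exactly.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [10 + 2 * a + 3 * b for a, b in zip(x1, x2)]

s11 = centered_sum(x1, x1)
s22 = centered_sum(x2, x2)
s12 = centered_sum(x1, x2)
s1y = centered_sum(x1, y)
s2y = centered_sum(x2, y)

# Simple regression of Y on X1 alone: the slope also absorbs part of X2's effect.
simple_b1 = s1y / s11

# Multiple regression of Y on X1 and X2: slope on X1 holding X2 constant.
multiple_b1 = (s1y * s22 - s2y * s12) / (s11 * s22 - s12 ** 2)

print(simple_b1, multiple_b1)
```

Here the simple-regression slope comes out larger than 2 because it also carries the effect of the correlated X2, while the multiple-regression slope isolates the effect of X1.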
Simple and Multiple Regression Compared: Example • Two simple regressions: • Multiple Regression:
Standard Error of Estimate • Measures the standard deviation of the residuals about the regression plane, and thus specifies the amount of error incurred when the least squares regression equation is used to predict values of the dependent variable. • The standard error of estimate is computed by using the following equation: $s_e = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - k - 1}}$, where n is the sample size and k is the number of independent variables.
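A minimal sketch of this calculation, assuming the usual definition (square root of the error sum of squares divided by n − k − 1). The observed and fitted values are invented, and the function name is hypothetical.

```python
import math


def standard_error_of_estimate(y, y_hat, k):
    """Standard deviation of the residuals, with n - k - 1 degrees of freedom."""
    n = len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
    return math.sqrt(sse / (n - k - 1))


# Invented observed and fitted values for a model with k = 2 predictors.
y = [10, 12, 15, 11, 14]
y_hat = [11, 12, 14, 11, 15]

se = standard_error_of_estimate(y, y_hat, k=2)
print(se)
```

With these numbers the residuals are −1, 0, 1, 0, −1, so the error sum of squares is 3 and the divisor is 5 − 2 − 1 = 2.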
Coefficient of Multiple Determination • Proportion of total variation in Y explained by all X variables taken together. • Never decreases when a new X variable is added to the model. • Disadvantage when comparing models.
Adjusted Coefficient of Multiple Determination • Proportion of variation in Y explained by all X variables, adjusted for the number of X variables used and the sample size: $R^2_{adj} = 1 - (1 - R^2)\frac{n - 1}{n - k - 1}$ • Penalizes excessive use of independent variables. • Smaller than $R^2$. • Useful in comparing among models.
Coefficient of Multiple Determination • Adjusted $R^2$ • Reflects the number of explanatory variables and sample size • Is smaller than $R^2$
Interpretation of Coefficient of Multiple Determination • 96.32% of the total variation in heating oil can be explained by temperature and amount of insulation. • 95.71% of the total fluctuation in heating oil can be explained by temperature and amount of insulation after adjusting for the number of explanatory variables and sample size.
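The adjustment can be checked with a few lines of Python. This sketch assumes the usual adjusted R² formula and n = 15 observations, which is inferred from the 12 error degrees of freedom reported for this example with k = 2 explanatory variables. The function name is hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# R-squared as reported for the heating oil example; n = 15 is inferred from df = 12.
adj = adjusted_r2(0.9632, n=15, k=2)
print(round(adj, 4))
```

Rounded to four places, the result agrees with the 95.71% quoted in the interpretation.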
Using The Regression Equation to Make Predictions Predict the amount of heating oil used for a home if the average temperature is 30° and the insulation is 6 inches. The predicted heating oil used is 304.39 gallons.
Predictions Using Excel • Stat | Regression … • Check the “Confidence and Prediction Interval Estimate” box • EXCEL spreadsheet for the heating oil example.
Residual Plots • Residuals vs. $\hat{Y}$ • May need to transform the Y variable. • Residuals vs. $X_1$ • May need to transform the $X_1$ variable. • Residuals vs. $X_2$ • May need to transform the $X_2$ variable. • Residuals vs. Time • May have autocorrelation.
Residual Plots: Example [Two residual plots: one suggests there may be some non-linear relationship; the other shows no discernible pattern.]
Testing for Overall Significance • Shows if there is a linear relationship between all of the X variables together and Y. • Use F test statistic. • Hypotheses: • $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (No linear relationship) • $H_1$: At least one $\beta_i \neq 0$ (At least one independent variable affects Y.) • The Null Hypothesis is a very strong statement. • The Null Hypothesis is almost always rejected.
Testing for Overall Significance (continued) • Test Statistic: $F = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$ • where F has k numerator and (n − k − 1) denominator degrees of freedom.
Test for Overall Significance: Excel Output Example [Excel ANOVA output with annotations: the p value; regression df = k = 2, the number of explanatory variables; total df = n − 1.]
Test for Overall Significance: Example Solution • $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ • $H_1$: At least one $\beta_i \neq 0$ • $\alpha = 0.05$ • df = 2 and 12 • Critical Value: F = 3.89 • Test Statistic: F = 157.24 (Excel Output) • Decision: Reject $H_0$ at $\alpha = 0.05$. • Conclusion: There is evidence that at least one independent variable affects Y.
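The overall F statistic can also be written in terms of R², which gives a quick consistency check on the reported figures. This is a sketch using the rounded R² of 0.9632, k = 2, and n = 15 (inferred from the 12 denominator degrees of freedom). The function name is hypothetical, and the critical value 3.89 is taken from the example rather than computed.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic: (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


f_stat = f_from_r2(0.9632, n=15, k=2)
f_critical = 3.89  # from the example, for df = 2 and 12 at the 0.05 level

print(round(f_stat, 2), f_stat > f_critical)
```

The value comes out near 157.0 rather than exactly 157.24. The small gap is most likely due to R² being rounded to four digits, and the decision to reject the null hypothesis is unaffected.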
Test for Significance: Individual Variables • Shows if there is a linear relationship between the variable Xi and Y. • Use t Test Statistic. • Hypotheses: • $H_0: \beta_i = 0$ (No linear relationship.) • $H_1: \beta_i \neq 0$ (Linear relationship between Xi and Y.)
t Test Statistic: Excel Output Example [Excel coefficient table with annotations marking the t test statistic for X1 (Temperature) and the t test statistic for X2 (Insulation).]
t Test: Example Solution Does temperature have a significant effect on monthly consumption of heating oil? Test at $\alpha = 0.05$. • $H_0: \beta_1 = 0$ • $H_1: \beta_1 \neq 0$ • df = 12 • Critical Values: t = ±2.1788 (0.025 in each tail) • Test Statistic: t = −15.084 • Decision: Reject $H_0$ at $\alpha = 0.05$. • Conclusion: There is evidence of a significant effect of temperature on oil consumption.
Confidence Interval Estimate for the Slope Provide the 95% confidence interval for the population slope $\beta_1$ (the effect of temperature on oil consumption). $b_1 \pm t_{n-k-1} S_{b_1}$ gives $-5.561 \le \beta_1 \le -4.15$. The estimated average consumption of oil is reduced by between 4.15 gallons and 5.56 gallons for each increase of 1°F.
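The interval can be roughly reproduced from figures quoted elsewhere in the example. The sketch below assumes the usual form, estimate ± critical t × standard error, and recovers the standard error as b1 / t because it is not listed here. That recovery is an assumption, and the inputs are rounded, so the endpoints match only approximately.

```python
b1 = -4.86        # estimated temperature slope (rounded, from the regression equation)
t_stat = -15.084  # t test statistic for temperature
t_crit = 2.1788   # critical t for df = 12 and 0.025 in each tail

# Standard error of b1 recovered from t = b1 / se (assumed, not listed in the slides).
se_b1 = b1 / t_stat

lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1
print(round(lower, 3), round(upper, 3))
```

The endpoints land within about 0.01 of the reported −5.561 and −4.15. The difference comes from working with the rounded slope.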
Contribution of a Single Independent Variable • Let Xk be the independent variable of interest. • $SSR(X_k \mid \text{all others}) = SSR(\text{all}) - SSR(\text{all others except } X_k)$ • Measures the contribution of Xk in explaining the total variation in Y.
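Contribution of a Single Independent Variable $SSR(X_k \mid \text{all others}) = SSR(\text{all}) - SSR(\text{all others except } X_k)$ • $SSR(\text{all})$: from the ANOVA section of the regression that includes all variables. • $SSR(\text{all others except } X_k)$: from the ANOVA section of the regression that omits $X_k$. • Measures the contribution of $X_k$ in explaining Y.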
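Coefficient of Partial Determination of $X_k$ • $R^2_{Yk.\text{all others}} = \frac{SSR(X_k \mid \text{all others})}{SST - SSR(\text{all}) + SSR(X_k \mid \text{all others})}$ • Measures the proportion of variation in the dependent variable that is explained by Xk, while controlling for (holding constant) the other independent variables.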
Coefficient of Partial Determination for $X_k$ (continued) Example: Model with two independent variables $R^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1, X_2) + SSR(X_1 \mid X_2)}$
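For a model with two independent variables, the calculation can be sketched as follows. The sums of squares are hypothetical placeholders (they are not the heating oil ANOVA values), and the function name is an assumption.

```python
def partial_r2(sst, ssr_full, ssr_without):
    """Coefficient of partial determination for the variable left out of ssr_without.

    sst         -- total sum of squares
    ssr_full    -- regression sum of squares with all variables included
    ssr_without -- regression sum of squares with the variable of interest omitted
    """
    contribution = ssr_full - ssr_without  # SSR(Xk | all others)
    return contribution / (sst - ssr_full + contribution)


# Hypothetical values for a model with X1 and X2, measuring X1 given X2.
r2_y1_2 = partial_r2(sst=1020.0, ssr_full=900.0, ssr_without=600.0)
print(round(r2_y1_2, 4))
```

In this made-up case X1 contributes 300 to the regression sum of squares, out of the 420 left unexplained by X2 alone.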
Coefficient of Partial Determination in Excel • Stat | Regression… • Check the “Coefficient of partial determination” box. • EXCEL spreadsheet for the heating oil example.
Contribution of a Subset of Independent Variables • Let Xs be the subset of independent variables of interest. • $SSR(X_s \mid \text{all others}) = SSR(\text{all}) - SSR(\text{all others except } X_s)$ • Measures the contribution of the subset Xs in explaining SST.
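Contribution of a Subset of Independent Variables: Example Let Xs be X1 and X3. $SSR(X_1 \text{ and } X_3 \mid \text{all others}) = SSR(\text{all}) - SSR(\text{all others except } X_1 \text{ and } X_3)$ • $SSR(\text{all})$: from the ANOVA section of the regression that includes all variables. • $SSR(\text{all others except } X_1 \text{ and } X_3)$: from the ANOVA section of the regression that omits X1 and X3.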
Testing Portions of Model • Examines the contribution of a subset Xs of explanatory variables to the relationship with Y. • Null Hypothesis: • Variables in the subset do not significantly improve the model when all other variables are included. • Alternative Hypothesis: • At least one variable in the subset is significant.
Testing Portions of Model (continued) • One-tailed Rejection Region • Requires comparison of two regressions: • One regression includes everything. • Another regression includes everything except the portion to be tested.
Partial F Test for the Contribution of a Subset of X variables • Hypotheses: • H0: Variables Xs do not significantly improve the model, given all other variables included. • H1: Variables Xs significantly improve the model, given all others included. • Test Statistic: $F = \frac{SSR(X_s \mid \text{all others}) / m}{MSE(\text{all})}$ • with df = m and (n − k − 1) • m = number of variables in the subset Xs.
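A minimal sketch of the partial F computation from two regressions, the full model and the model without the subset. The sums of squares are hypothetical, and the function name is an assumption.

```python
def partial_f(ssr_full, ssr_reduced, sse_full, n, k, m):
    """Partial F statistic for a subset of m variables.

    ssr_full    -- regression sum of squares, all k variables included
    ssr_reduced -- regression sum of squares, subset of m variables omitted
    sse_full    -- error sum of squares of the full model
    """
    mse_full = sse_full / (n - k - 1)
    return ((ssr_full - ssr_reduced) / m) / mse_full


# Hypothetical sums of squares: n = 15 observations, k = 2 variables, testing m = 1.
f_partial = partial_f(ssr_full=900.0, ssr_reduced=600.0, sse_full=120.0, n=15, k=2, m=1)
print(f_partial)
```

The result is compared with the upper-tail critical F value for m and n − k − 1 degrees of freedom, since the rejection region is one-tailed.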
Partial F Test for the Contribution of a Single Variable $X_j$ • Hypotheses: • H0: Variable Xj does not significantly improve the model, given all others included. • H1: Variable Xj significantly improves the model, given all others included. • Test Statistic: $F = \frac{SSR(X_j \mid \text{all others})}{MSE(\text{all})}$ • with df = 1 and (n − k − 1) • m = 1 here
Testing Portions of Model: Example Test at the $\alpha = 0.05$ level to determine if the variable of average temperature significantly improves the model, given that insulation is included.
Testing Portions of Model: Example • H0: X1 (temperature) does not improve the model with X2 (insulation) included. • H1: X1 does improve the model. • $\alpha = 0.05$, df = 1 and 12 • Critical Value = 4.75 • [Two ANOVA tables: the regression for X2 alone and the regression for X1 and X2.] • Conclusion: Reject H0; X1 does improve the model.
Testing Portions of Model in Excel • Stat | Regression… • Calculations for this example are given in the spreadsheet. When using Minitab, simply check the box for “partial coefficient of determination.” • EXCEL spreadsheet for the heating oil example.
Do We Need to Do This for One Variable? • The F Test for the inclusion of a single variable after all other variables are included in the model is IDENTICAL to the t Test of the slope for that variable. • The only reason to do an F Test is to test several variables together.
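The equivalence shows up in the numbers quoted for the temperature variable: squaring the t critical value gives the F critical value, and squaring the t statistic gives the corresponding partial F statistic. A short check, using only figures reported in the example:

```python
t_stat = -15.084  # t test statistic for temperature
t_crit = 2.1788   # two-tailed critical t, df = 12, alpha = 0.05

f_stat_from_t = t_stat ** 2  # partial F statistic implied by the t test
f_crit_from_t = t_crit ** 2  # F critical value for df = 1 and 12

print(round(f_stat_from_t, 2), round(f_crit_from_t, 2))
```

The squared critical value rounds to 4.75, the critical F used in the partial F test for temperature, so both tests lead to the same decision.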
The Quadratic Regression Model • Relationship between the response variable and the explanatory variable is a quadratic polynomial function. • Useful when the scatter diagram indicates a non-linear relationship. • Quadratic Model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$ • The second explanatory variable is the square of the first variable.
Quadratic Regression Model (continued) Quadratic model may be considered when a scatter diagram takes on the following shapes: [Four scatter-diagram panels of Y against X1: two with $\beta_2 > 0$ and two with $\beta_2 < 0$.] $\beta_2$ = the coefficient of the quadratic term.
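Because the quadratic term is just a second explanatory variable, the model can be fitted with the same least squares machinery. The sketch below builds the columns 1, X and X², forms the normal equations, and solves the 3×3 system with Cramer's rule. The data are invented, and the helper names `det3` and `fit_quadratic` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(xs, ys):
    """Least squares estimates [b0, b1, b2] for Y = b0 + b1*X + b2*X^2."""
    rows = [[1.0, x, x * x] for x in xs]  # the second regressor is X squared
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    d = det3(xtx)  # assumes at least three distinct X values, so d is non-zero
    coefs = []
    for col in range(3):
        # Cramer's rule: replace one column of X'X with X'y.
        m = [row[:] for row in xtx]
        for r in range(3):
            m[r][col] = xty[r]
        coefs.append(det3(m) / d)
    return coefs


# Invented data lying exactly on Y = 3 + 2*X - 0.5*X^2 (a curve with b2 < 0).
xs = [0, 1, 2, 3, 4, 5]
ys = [3 + 2 * x - 0.5 * x * x for x in xs]

q0, q1, q2 = fit_quadratic(xs, ys)
print(q0, q1, q2)
```

A negative estimate for the quadratic coefficient corresponds to the downward-bending shapes, and a positive one to the upward-bending shapes.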
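Testing for Significance: Quadratic Model • Testing for Overall Relationship • Similar to test for linear model • F test statistic = $\frac{MSR}{MSE}$ • Testing the Quadratic Effect • Compare quadratic model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$ with the linear model: $Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$ • Hypotheses: • $H_0: \beta_2 = 0$ (No quadratic term.) • $H_1: \beta_2 \neq 0$ (Quadratic term is needed.)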
Heating Oil Example Determine if a quadratic model is needed for estimating heating oil used for a single-family home in the month of January, based on average temperature (°F) and amount of insulation in inches.
Heating Oil Example: Residual Analysis (continued) [Two residual plots: one suggests a possible non-linear relationship; the other shows no discernible pattern.]
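Heating Oil Example: t Test for Quadratic Model (continued) • Testing the Quadratic Effect • Model with quadratic insulation term: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{2i}^2 + \varepsilon_i$ • Model without quadratic insulation term: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$ • Hypotheses • $H_0: \beta_3 = 0$ (No quadratic term in insulation.) • $H_1: \beta_3 \neq 0$ (Quadratic term is needed in insulation.)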
Example Solution Is the quadratic term in insulation needed to explain monthly consumption of heating oil? Test at $\alpha = 0.05$. • $H_0: \beta_3 = 0$ • $H_1: \beta_3 \neq 0$ • df = 11 • Critical Values: t = ±2.2010 (0.025 in each tail) • Test Statistic: t = 0.2786 • Decision: Do not reject $H_0$ at $\alpha = 0.05$. • Conclusion: There is not sufficient evidence for the need to include the quadratic effect of insulation on oil consumption.
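The two-tailed decision rule used in the t tests can be written as a small helper. This is a sketch with a hypothetical function name. The critical values are taken from the examples rather than computed, and 0.2786 is read here as the test statistic for the quadratic insulation term.

```python
def two_tailed_decision(t_stat, t_crit):
    """Reject H0 when the t statistic falls in either tail beyond the critical value."""
    return "Reject H0" if abs(t_stat) > t_crit else "Do not reject H0"


# Quadratic insulation term: df = 11, critical values of +/-2.2010.
quadratic_decision = two_tailed_decision(0.2786, 2.2010)

# Temperature slope in the linear model: df = 12, critical values of +/-2.1788.
temperature_decision = two_tailed_decision(-15.084, 2.1788)

print(quadratic_decision, "|", temperature_decision)
```

The quadratic term falls well inside the non-rejection region, while the temperature slope falls far into the lower tail.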