This chapter introduces multiple linear regression models, including least squares estimation of the parameters, hypothesis testing, confidence intervals, prediction of new observations, and model adequacy checking. It also covers polynomial regression models, categorical regressors, and variable selection.
12 Multiple Linear Regression
CHAPTER OUTLINE
12-1 Multiple Linear Regression Model
  12-1.1 Introduction
  12-1.2 Least squares estimation of the parameters
  12-1.3 Matrix approach to multiple linear regression
  12-1.4 Properties of the least squares estimators
12-2 Hypothesis Tests in Multiple Linear Regression
  12-2.1 Test for significance of regression
  12-2.2 Tests on individual regression coefficients & subsets of coefficients
12-3 Confidence Intervals in Multiple Linear Regression
  12-3.1 Use of t-tests
  12-3.2 Confidence interval on the mean response
12-4 Prediction of New Observations
12-5 Model Adequacy Checking
  12-5.1 Residual analysis
  12-5.2 Influential observations
12-6 Aspects of Multiple Regression Modeling
  12-6.1 Polynomial regression models
  12-6.2 Categorical regressors & indicator variables
  12-6.3 Selection of variables & model building
  12-6.4 Multicollinearity
Learning Objectives for Chapter 12
After careful study of this chapter, you should be able to do the following:
• Use multiple regression techniques to build empirical models for engineering and scientific data.
• Understand how the method of least squares extends to fitting multiple regression models.
• Assess regression model adequacy.
• Test hypotheses and construct confidence intervals on the regression coefficients.
• Use the regression model to estimate the mean response and to make predictions, constructing both confidence intervals and prediction intervals.
• Build regression models with polynomial terms.
• Use indicator variables to model categorical regressors.
• Use stepwise regression and other model-building techniques to select the appropriate set of variables for a regression model.
12-1: Multiple Linear Regression Models
12-1.1 Introduction
• Many applications of regression analysis involve situations in which there is more than one regressor variable.
• A regression model that contains more than one regressor variable is called a multiple regression model.
12-1: Multiple Linear Regression Models
12-1.1 Introduction
• For example, suppose that the effective life of a cutting tool depends on the cutting speed and the tool angle. A possible multiple regression model is

Y = β0 + β1x1 + β2x2 + ε

where Y is the tool life, x1 is the cutting speed, and x2 is the tool angle.
12-1: Multiple Linear Regression Models
12-1.1 Introduction
Figure 12-1 (a) The regression plane for the model E(Y) = 50 + 10x1 + 7x2. (b) The contour plot.
12-1: Multiple Linear Regression Models
12-1.1 Introduction
• In general, the multiple linear regression model with k regressor variables is

Y = β0 + β1x1 + β2x2 + … + βkxk + ε

• The parameter βj measures the expected change in the response Y per unit change in xj when all the other regressors are held constant.
12-1: Multiple Linear Regression Models 12-1.1 Introduction Figure 12-2 (a) Three-dimensional plot of the regression model E(Y) = 50 + 10x1 + 7x2 + 5x1x2. (b) The contour plot
12-1: Multiple Linear Regression Models
12-1.1 Introduction
Figure 12-3 (a) Three-dimensional plot of the regression model E(Y) = 800 + 10x1 + 7x2 – 8.5x1² – 5x2² + 4x1x2. (b) The contour plot.
12-1: Multiple Linear Regression Models
12-1.2 Least Squares Estimation of the Parameters
• The method of least squares may be used to estimate the regression coefficients. Suppose that n > k observations are available, and let xij denote the ith observation or level of variable xj. Each observation satisfies

yi = β0 + β1xi1 + β2xi2 + … + βkxik + εi,  i = 1, 2, …, n
12-1: Multiple Linear Regression Models
12-1.2 Least Squares Estimation of the Parameters
• The least squares function is

L = Σi εi² = Σi (yi − β0 − Σj βj xij)²

summing over i = 1, …, n and j = 1, …, k.
• The least squares estimates must satisfy ∂L/∂β0 = 0 and ∂L/∂βj = 0 for j = 1, 2, …, k.
12-1: Multiple Linear Regression Models
12-1.2 Least Squares Estimation of the Parameters
• The least squares normal equations are

n β̂0 + β̂1 Σ xi1 + β̂2 Σ xi2 + … + β̂k Σ xik = Σ yi
β̂0 Σ xi1 + β̂1 Σ xi1² + β̂2 Σ xi1xi2 + … + β̂k Σ xi1xik = Σ xi1yi
  ⋮
β̂0 Σ xik + β̂1 Σ xikxi1 + β̂2 Σ xikxi2 + … + β̂k Σ xik² = Σ xikyi

• There are p = k + 1 normal equations, one for each unknown regression coefficient. Their solution gives the least squares estimators of the regression coefficients.
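As a concrete illustration (not from the text), the following Python sketch builds the design matrix for a two-regressor model and solves the normal equations with NumPy; the data values are hypothetical, chosen only for illustration.

```python
# Minimal sketch: solving the least squares normal equations (X'X) b = X'y
# for a model with two regressors; the data values are hypothetical.
import numpy as np

y  = np.array([9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60])
x1 = np.array([2.0, 8.0, 11.0, 10.0, 8.0, 4.0, 2.0, 2.0])
x2 = np.array([50., 110., 120., 550., 295., 200., 375., 52.])

# Design matrix with a leading column of 1s for the intercept beta0
X = np.column_stack([np.ones_like(x1), x1, x2])

# Solve the p = k + 1 normal equations for the coefficient estimates
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)   # [beta0_hat, beta1_hat, beta2_hat]
```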
12-1: Multiple Linear Regression Models
Example 12-1: Wire bond pull strength. A model relating the pull strength of a wire bond (y) to the wire length (x1) and die height (x2) is fit to the observations in Table 12-2.
12-1: Multiple Linear Regression Models Figure 12-4 Matrix of scatter plots (from Minitab) for the wire bond pull strength data in Table 12-2.
12-1: Multiple Linear Regression Models
Example 12-1 (continued): Solving the least squares normal equations gives the fitted regression model ŷ = 2.26379 + 2.74427x1 + 0.01253x2.
12-1: Multiple Linear Regression Models
12-1.3 Matrix Approach to Multiple Linear Regression
Suppose the model relating the regressors to the response is

yi = β0 + β1xi1 + β2xi2 + … + βkxik + εi,  i = 1, 2, …, n

In matrix notation this model can be written as

y = Xβ + ε
12-1: Multiple Linear Regression Models
12-1.3 Matrix Approach to Multiple Linear Regression
where y is an (n × 1) vector of observations, X is an (n × p) matrix of the levels of the regressor variables (with a leading column of 1s for the intercept), β is a (p × 1) vector of regression coefficients, and ε is an (n × 1) vector of random errors, with p = k + 1.
12-1: Multiple Linear Regression Models
12-1.3 Matrix Approach to Multiple Linear Regression
We wish to find the vector of least squares estimators β̂ that minimizes

L = Σi εi² = ε'ε = (y − Xβ)'(y − Xβ)

The resulting least squares estimate is

β̂ = (X'X)⁻¹X'y
12-1: Multiple Linear Regression Models
12-1.3 Matrix Approach to Multiple Linear Regression
Equivalently, β̂ solves the normal equations in matrix form, X'Xβ̂ = X'y. The fitted regression model is ŷ = Xβ̂, and the vector of residuals is e = y − ŷ.
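A minimal sketch of these matrix computations, again with hypothetical data. Forming (X'X)⁻¹ explicitly mirrors the textbook formula; in practice a least squares routine such as numpy.linalg.lstsq is numerically preferable, as the final lines check.

```python
# Sketch of the matrix approach (hypothetical data): the textbook formula
# beta_hat = (X'X)^(-1) X'y, fitted values y_hat = X beta_hat, residuals e.
import numpy as np

y  = np.array([9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60])
x1 = np.array([2.0, 8.0, 11.0, 10.0, 8.0, 4.0, 2.0, 2.0])
x2 = np.array([50., 110., 120., 550., 295., 200., 375., 52.])
X  = np.column_stack([np.ones_like(x1), x1, x2])

beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ y)  # (X'X)^(-1) X'y
y_hat = X @ beta_hat                           # fitted values
e = y - y_hat                                  # residuals

# Numerically, lstsq (QR/SVD based) is preferred to forming the inverse
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_hat, beta_lstsq)
```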
12-1: Multiple Linear Regression Models
Example 12-2: The matrix approach applied to the wire bond pull strength data of Example 12-1. Forming y and X from the observations in Table 12-2 and computing β̂ = (X'X)⁻¹X'y reproduces the fitted model ŷ = 2.26379 + 2.74427x1 + 0.01253x2.
12-1: Multiple Linear Regression Models
Estimating σ²
An unbiased estimator of σ² is

σ̂² = SSE / (n − p)

where SSE = Σi ei² = e'e is the error (residual) sum of squares and p is the number of parameters in the model.
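A short sketch of this estimator, using the same hypothetical data as the earlier examples:

```python
# Sketch: unbiased estimate of sigma^2 as SSE/(n - p); hypothetical data.
import numpy as np

y  = np.array([9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60])
x1 = np.array([2.0, 8.0, 11.0, 10.0, 8.0, 4.0, 2.0, 2.0])
x2 = np.array([50., 110., 120., 550., 295., 200., 375., 52.])
X  = np.column_stack([np.ones_like(x1), x1, x2])

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta_hat            # residuals
SSE = e @ e                     # error sum of squares, e'e
n, p = X.shape                  # p = k + 1 model parameters
sigma2_hat = SSE / (n - p)      # unbiased estimator of sigma^2
```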
12-1: Multiple Linear Regression Models
12-1.4 Properties of the Least Squares Estimators
Unbiased estimators:

E(β̂) = β

Covariance matrix:

Cov(β̂) = σ²(X'X)⁻¹
12-1: Multiple Linear Regression Models
12-1.4 Properties of the Least Squares Estimators
Individual variances and covariances:

V(β̂j) = σ²Cjj   and   cov(β̂i, β̂j) = σ²Cij,  i ≠ j

In general, C = (X'X)⁻¹ is a (p × p) symmetric matrix, and σ² is replaced by its estimate σ̂² in these expressions.
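The sketch below estimates the covariance matrix as σ̂²(X'X)⁻¹ and extracts the standard errors from its diagonal; the data are again hypothetical.

```python
# Sketch: estimated covariance matrix of beta_hat, sigma2_hat * (X'X)^(-1);
# its diagonal gives the variances of the estimators. Hypothetical data.
import numpy as np

y  = np.array([9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60])
x1 = np.array([2.0, 8.0, 11.0, 10.0, 8.0, 4.0, 2.0, 2.0])
x2 = np.array([50., 110., 120., 550., 295., 200., 375., 52.])
X  = np.column_stack([np.ones_like(x1), x1, x2])

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
n, p = X.shape
sigma2_hat = ((y - X @ beta_hat) ** 2).sum() / (n - p)

C = np.linalg.inv(X.T @ X)              # C = (X'X)^(-1)
cov_beta = sigma2_hat * C               # estimated Cov(beta_hat)
se_beta = np.sqrt(np.diag(cov_beta))    # se(beta_j) = sqrt(sigma2_hat * C_jj)
```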
12-2: Hypothesis Tests in Multiple Linear Regression
12-2.1 Test for Significance of Regression
The appropriate hypotheses are

H0: β1 = β2 = … = βk = 0
H1: βj ≠ 0 for at least one j

The test statistic is

F0 = (SSR/k) / (SSE/(n − p)) = MSR/MSE
12-2: Hypothesis Tests in Multiple Linear Regression
12-2.1 Test for Significance of Regression
The total sum of squares is partitioned as SST = SSR + SSE, and the computations are conveniently summarized in an analysis of variance table. Reject H0 if f0 > fα,k,n−p or, equivalently, if the P-value is smaller than α.
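One way the test could be computed, using scipy.stats for the F distribution and the same hypothetical data as before:

```python
# Sketch: test for significance of regression, F0 = (SSR/k)/(SSE/(n-p));
# hypothetical data.
import numpy as np
from scipy import stats

y  = np.array([9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60])
x1 = np.array([2.0, 8.0, 11.0, 10.0, 8.0, 4.0, 2.0, 2.0])
x2 = np.array([50., 110., 120., 550., 295., 200., 375., 52.])
X  = np.column_stack([np.ones_like(x1), x1, x2])

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
n, p = X.shape
k = p - 1
SSE = ((y - X @ beta_hat) ** 2).sum()
SST = ((y - y.mean()) ** 2).sum()
SSR = SST - SSE

F0 = (SSR / k) / (SSE / (n - p))
p_value = stats.f.sf(F0, k, n - p)   # reject H0 when the P-value is small
```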
12-2: Hypothesis Tests in Multiple Linear Regression
Example 12-3: Significance of regression for the wire bond pull strength data. With k = 2 regressors and n = 25 observations, SSR = 5990.7712 and SSE = 6105.9447 − 5990.7712 = 115.1735, so

f0 = (5990.7712/2) / (115.1735/22) = 572.17

Since f0 = 572.17 far exceeds f0.05,2,22 = 3.44, we reject H0: β1 = β2 = 0 and conclude that pull strength is linearly related to wire length and/or die height.
12-2: Hypothesis Tests in Multiple Linear Regression
R² and Adjusted R²
The coefficient of multiple determination is

R² = SSR/SST = 1 − SSE/SST

• For the wire bond pull strength data, we find that R² = SSR/SST = 5990.7712/6105.9447 = 0.9811.
• Thus, the model accounts for about 98% of the variability in the pull strength response.
12-2: Hypothesis Tests in Multiple Linear Regression
R² and Adjusted R²
The adjusted R² is

R²adj = 1 − [SSE/(n − p)] / [SST/(n − 1)]

• The adjusted R² statistic penalizes the analyst for adding terms to the model.
• It can help guard against overfitting (including regressors that are not really useful).
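Both statistics follow directly from the sums of squares; a sketch with hypothetical data:

```python
# Sketch: R^2 and adjusted R^2 for the fitted model; hypothetical data.
import numpy as np

y  = np.array([9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60])
x1 = np.array([2.0, 8.0, 11.0, 10.0, 8.0, 4.0, 2.0, 2.0])
x2 = np.array([50., 110., 120., 550., 295., 200., 375., 52.])
X  = np.column_stack([np.ones_like(x1), x1, x2])

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
n, p = X.shape
SSE = ((y - X @ beta_hat) ** 2).sum()
SST = ((y - y.mean()) ** 2).sum()

R2 = 1 - SSE / SST                                 # = SSR/SST
R2_adj = 1 - (SSE / (n - p)) / (SST / (n - 1))     # penalizes extra terms
```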
12-2: Hypothesis Tests in Multiple Linear Regression
12-2.2 Tests on Individual Regression Coefficients and Subsets of Coefficients
The hypotheses for testing the significance of any individual regression coefficient βj are

H0: βj = 0
H1: βj ≠ 0
12-2: Hypothesis Tests in Multiple Linear Regression
12-2.2 Tests on Individual Regression Coefficients and Subsets of Coefficients
The test statistic is

T0 = β̂j / √(σ̂²Cjj) = β̂j / se(β̂j)

where Cjj is the jjth element of (X'X)⁻¹.
• Reject H0 if |t0| > tα/2,n−p.
• This is called a partial or marginal test, because it assesses the contribution of xj given the other regressors in the model.
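A sketch of these marginal t-tests with hypothetical data; the critical value comes from scipy.stats:

```python
# Sketch: marginal t-tests of H0: beta_j = 0 for each coefficient;
# hypothetical data.
import numpy as np
from scipy import stats

y  = np.array([9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60])
x1 = np.array([2.0, 8.0, 11.0, 10.0, 8.0, 4.0, 2.0, 2.0])
x2 = np.array([50., 110., 120., 550., 295., 200., 375., 52.])
X  = np.column_stack([np.ones_like(x1), x1, x2])

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
n, p = X.shape
sigma2_hat = ((y - X @ beta_hat) ** 2).sum() / (n - p)

se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))
t0 = beta_hat / se                          # t-statistic for each beta_j
t_crit = stats.t.ppf(1 - 0.05 / 2, n - p)   # two-sided, alpha = 0.05
reject = np.abs(t0) > t_crit                # True where H0 is rejected
```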
12-2: Hypothesis Tests in Multiple Linear Regression
Example 12-4: Testing H0: β2 = 0 (die height) in the wire bond pull strength model. Since |t0| exceeds t0.025,22 = 2.074, H0 is rejected, and we conclude that die height contributes significantly to the model given that wire length is also in the model.
12-2: Hypothesis Tests in Multiple Linear Regression
The general regression significance test, or the extra sum of squares method: partition the coefficient vector as β = [β1, β2]', where β1 contains r of the coefficients and β2 contains the remaining p − r. We wish to test the hypotheses

H0: β1 = 0
H1: β1 ≠ 0
12-2: Hypothesis Tests in Multiple Linear Regression
A general form of the model can be written

y = Xβ + ε = X1β1 + X2β2 + ε

where X1 represents the columns of X associated with β1 and X2 represents the columns of X associated with β2.
12-2: Hypothesis Tests in Multiple Linear Regression
For the full model, the regression sum of squares is SSR(β) = β̂'X'y (p degrees of freedom) and MSE = SSE/(n − p). If H0 is true, the reduced model is

y = X2β2 + ε

with SSR(β2) = β̂2'X2'y (p − r degrees of freedom). The extra sum of squares due to β1, given that β2 is in the model, is SSR(β1|β2) = SSR(β) − SSR(β2), with r degrees of freedom.
12-2: Hypothesis Tests in Multiple Linear Regression
The test statistic is

F0 = [SSR(β1|β2)/r] / MSE

Reject H0 if f0 > fα,r,n−p. The test in Equation (12-32) is often referred to as a partial F-test.
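One way to carry out a partial F-test numerically is to fit the full and reduced models and difference their error sums of squares, since SSR(β1|β2) = SSE(reduced) − SSE(full); the sketch below does this for a hypothetical two-regressor data set, dropping x1.

```python
# Sketch: partial F-test via the extra sum of squares method; here the
# reduced model drops x1, so H0: beta_1 = 0 with r = 1. Hypothetical data.
import numpy as np
from scipy import stats

y  = np.array([9.95, 24.45, 31.75, 35.00, 25.02, 16.86, 14.38, 9.60])
x1 = np.array([2.0, 8.0, 11.0, 10.0, 8.0, 4.0, 2.0, 2.0])
x2 = np.array([50., 110., 120., 550., 295., 200., 375., 52.])

X_full = np.column_stack([np.ones_like(x1), x1, x2])
X_red  = np.column_stack([np.ones_like(x1), x2])      # x1 removed

def sse(Xm, y):
    """Error sum of squares for the least squares fit of Xm to y."""
    b, *_ = np.linalg.lstsq(Xm, y, rcond=None)
    e = y - Xm @ b
    return e @ e

n, p = X_full.shape
r = X_full.shape[1] - X_red.shape[1]                  # coefficients tested
extra_SS = sse(X_red, y) - sse(X_full, y)             # SSR(beta1 | beta2)
F0 = (extra_SS / r) / (sse(X_full, y) / (n - p))
p_value = stats.f.sf(F0, r, n - p)
```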
12-2: Hypothesis Tests in Multiple Linear Regression
Example 12-6: Applying the extra sum of squares method to the wire bond pull strength data to measure the contribution of an individual regressor. Because the subset tested contains a single coefficient (r = 1), the partial F-statistic satisfies f0 = t0², so this test agrees with the t-test of Example 12-4.