
Statistics for Business and Economics

Statistics for Business and Economics. Chapter 11 Multiple Regression and Model Building. Learning Objectives. Explain the Linear Multiple Regression Model Describe Inference About Individual Parameters Test Overall Significance Explain Estimation and Prediction

eden-dodson

Presentation Transcript


  1. Statistics for Business and Economics Chapter 11 Multiple Regression and Model Building

  2. Learning Objectives • Explain the Linear Multiple Regression Model • Describe Inference About Individual Parameters • Test Overall Significance • Explain Estimation and Prediction • Describe Various Types of Models • Describe Model Building • Explain Residual Analysis • Describe Regression Pitfalls

  3. Types of Regression Models [Diagram: regression models divide into simple (1 explanatory variable) and multiple (2+ explanatory variables); each branch is either linear or non-linear]

  4. Multiple Regression Model • General form: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$ • k independent variables • x1, x2, …, xk may be functions of variables • e.g. $x_2 = (x_1)^2$

  5. Regression Modeling Steps • Hypothesize deterministic component • Estimate unknown model parameters • Specify probability distribution of random error term • Estimate standard deviation of error • Evaluate model • Use model for prediction and estimation

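  6. First-Order Multiple Regression Model • Relationship between 1 dependent and 2 or more independent variables is a linear function: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$ • $y$: dependent (response) variable • $\beta_0$: population Y-intercept • $\beta_1, \ldots, \beta_k$: population slopes • $x_1, \ldots, x_k$: independent (explanatory) variables • $\varepsilon$: random error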

  7. First-Order Model With 2 Independent Variables • Relationship between 1 dependent and 2 independent variables is a linear function • Model: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ • Assumes no interaction between x1 and x2 • Effect of x1 on E(y) is the same regardless of x2 values

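  8. Population Multiple Regression Model [Figure: bivariate model. The response plane is drawn over the (x1, x2) plane with intercept $\beta_0$; the observed y at $(x_{1i}, x_{2i})$ lies a distance $\varepsilon_i$ from the plane]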

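  9. Sample Multiple Regression Model [Figure: bivariate model. The fitted response plane is drawn over the (x1, x2) plane with intercept $\hat{\beta}_0$; the observed y at $(x_{1i}, x_{2i})$ lies a distance $\hat{\varepsilon}_i$ from the plane]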

  10. Regression Modeling Steps • Hypothesize Deterministic Component • Estimate Unknown Model Parameters • Specify Probability Distribution of Random Error Term • Estimate Standard Deviation of Error • Evaluate Model • Use Model for Prediction & Estimation

  11. Multiple Linear Regression Equations Too complicated by hand! Ouch!

  12. 1st Order Model Example You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) and newspaper circulation (000) on the number of ad responses (00). Estimate the unknown parameters. You’ve collected the following data:
      Resp (y)   Size (x1)   Circ (x2)
         1           1           2
         4           8           8
         1           3           1
         3           5           7
         2           6           4
         4          10           6
      See ResponsesVsAdsizeAndCirculationData.jmp
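Although the least-squares equations are tedious by hand, the fit for this small data set can be sketched in a few lines. The sketch below is only an illustration, not the JMP procedure used in the slides: it solves the normal equations for the two slopes in centered (deviation-from-mean) form, and the helper names `mean` and `cross` are introduced here for convenience.

```python
# Ad data from the example: responses (00), ad size (sq. in.), circulation (000).
y  = [1, 4, 1, 3, 2, 4]
x1 = [1, 8, 3, 5, 6, 10]
x2 = [2, 8, 1, 7, 4, 6]

def mean(v):
    return sum(v) / len(v)

def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

# Normal equations for the two slopes, in centered form:
#   S11*b1 + S12*b2 = S1y
#   S12*b1 + S22*b2 = S2y
s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)

det = s11 * s22 - s12 ** 2               # determinant of the 2x2 system
b1 = (s1y * s22 - s12 * s2y) / det       # slope for ad size
b2 = (s11 * s2y - s12 * s1y) / det       # slope for circulation
b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)  # intercept

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

The two slopes agree with the values .2049 and .2805 interpreted on slide 14.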

  13. ^ 0 ^ ^ 1 2

  14. Interpretation of Coefficients Solution • Slope ($\hat{\beta}_1$): the number of responses to the ad is expected to increase by .2049 hundred (20.49 responses) for each 1 sq. in. increase in ad size, holding circulation constant • Slope ($\hat{\beta}_2$): the number of responses to the ad is expected to increase by .2805 hundred (28.05 responses) for each 1 unit (1,000) increase in circulation, holding ad size constant

  15. Regression Modeling Steps • Hypothesize Deterministic Component • Estimate Unknown Model Parameters • Specify Probability Distribution of Random Error Term • Estimate Standard Deviation of Error • Evaluate Model • Use Model for Prediction & Estimation

  16. Estimation of $\sigma^2$ For a model with k predictors (k+1 parameters): $s^2 = \dfrac{SSE}{n - (k + 1)}$ and $s = \sqrt{s^2}$

  17. More About JMP Output [JMP output annotated with SSE, $s^2$ (also called “mean squared error”), and $s$ (also called “standard error of the regression”)]
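As a rough check on these quantities, the sketch below recomputes SSE, s² and s for the ad-response data from the 1st order model example (n = 6, k = 2). It refits the model with the centered normal equations, which is one convenient route and not necessarily how JMP computes it; the helper names are introduced here for illustration.

```python
import math

# Ad data: responses (00), ad size (sq. in.), circulation (000).
y  = [1, 4, 1, 3, 2, 4]
x1 = [1, 8, 3, 5, 6, 10]
x2 = [2, 8, 1, 7, 4, 6]
n, k = len(y), 2  # 6 observations, 2 predictors

def mean(v):
    return sum(v) / len(v)

def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

# Least-squares estimates from the centered normal equations.
s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)

# Residuals, sum of squared errors, and the estimate of sigma^2.
residuals = [yi - (b0 + b1 * a + b2 * b) for yi, a, b in zip(y, x1, x2)]
sse = sum(e ** 2 for e in residuals)
s2 = sse / (n - (k + 1))   # mean squared error, df = n - (k + 1) = 3
s = math.sqrt(s2)          # standard error of the regression

print(f"SSE = {sse:.4f}, s^2 = {s2:.4f}, s = {s:.4f}")
```

The SSE and s² values match the Error row of the analysis-of-variance output on slide 24 (0.2503 and 0.0834).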

  18. Regression Modeling Steps • Hypothesize Deterministic Component • Estimate Unknown Model Parameters • Specify Probability Distribution of Random Error Term • Estimate Standard Deviation of Error • Evaluate Model • Use Model for Prediction & Estimation

  19. Evaluating Multiple Regression Model Steps • Examine variation measures • Test parameter significance • Individual coefficients • Overall model • Do residual analysis

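  20. Inference for an Individual β Parameter • Confidence Interval (rarely used in regression): $\hat{\beta}_i \pm t_{\alpha/2}\, s_{\hat{\beta}_i}$ • Hypothesis Test (used all the time!): $H_0: \beta_i = 0$ versus $H_a: \beta_i \neq 0$ (or < or >) • Test Statistic (how far is the sample slope from zero?): $t = \hat{\beta}_i / s_{\hat{\beta}_i}$ • df = n – (k + 1)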
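To make the test statistic concrete, here is a minimal sketch for the ad-response data from the 1st order model example. It assumes the usual result that, for a model with an intercept, the variances of the slope estimates are s² times the diagonal of the inverse of the centered cross-product matrix. The critical value 3.182 is the tabled t value for α/2 = .025 with 3 df; the variable names are illustrative.

```python
import math

# Ad data: responses (00), ad size (sq. in.), circulation (000).
y  = [1, 4, 1, 3, 2, 4]
x1 = [1, 8, 3, 5, 6, 10]
x2 = [2, 8, 1, 7, 4, 6]
n, k = len(y), 2

def mean(v):
    return sum(v) / len(v)

def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y, syy = cross(x1, y), cross(x2, y), cross(y, y)
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

# s^2 = SSE / (n - (k + 1)), using SSE = SS(Total) - SS(Model).
sse = syy - (b1 * s1y + b2 * s2y)
s2 = sse / (n - (k + 1))

# Standard errors of the slopes: s^2 times the diagonal of the inverse
# of the centered cross-product matrix [[s11, s12], [s12, s22]].
se_b1 = math.sqrt(s2 * s22 / det)
se_b2 = math.sqrt(s2 * s11 / det)

t1, t2 = b1 / se_b1, b2 / se_b2
T_CRIT = 3.182  # tabled t value, alpha/2 = .025, df = 3

for name, t in (("b1", t1), ("b2", t2)):
    verdict = "reject H0" if abs(t) > T_CRIT else "do not reject H0"
    print(f"{name}: t = {t:.2f} -> {verdict}")
```

Both statistics exceed the critical value, so each two-sided test rejects H0 at α = .05.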

  21. Easy way: Just examine p-values Both coefficients significant! Reject H0 for both tests

  22. Testing Overall Significance • Shows if there is a linear relationship between all x variables together and y • Hypotheses • $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$ • No linear relationship • $H_a$: At least one coefficient is not 0 • At least one x variable affects y

  23. Testing Overall Significance • Test Statistic: $F = \dfrac{MS(\text{Model})}{MS(\text{Error})}$ • Degrees of Freedom: $\nu_1 = k$, $\nu_2 = n - (k + 1)$ • k = Number of independent variables • n = Sample size

  24. Testing Overall Significance Computer Output
      Analysis of Variance
      Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
      Model       2       9.2497          4.6249      55.440   0.0043
      Error       3       0.2503          0.0834
      C Total     5       9.5000
      Annotations: Model DF = k; Error DF = n – (k + 1); F Value = MS(Model) / MS(Error); Prob>F is the P-value
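The analysis-of-variance entries can be reproduced from the ad-response data with a short sketch. One assumption to flag: the p-value line uses a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here; for other values of k a statistics library or an F table would be needed.

```python
# Ad data: responses (00), ad size (sq. in.), circulation (000).
y  = [1, 4, 1, 3, 2, 4]
x1 = [1, 8, 3, 5, 6, 10]
x2 = [2, 8, 1, 7, 4, 6]
n, k = len(y), 2

def mean(v):
    return sum(v) / len(v)

def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

ss_total = cross(y, y)              # C Total
ss_model = b1 * s1y + b2 * s2y      # Model sum of squares
ss_error = ss_total - ss_model      # Error sum of squares (SSE)

df_model, df_error = k, n - (k + 1)
ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_stat = ms_model / ms_error

# Upper-tail probability of F(2, df_error); this closed form is valid
# only because the numerator df is exactly 2.
p_value = (1 + 2 * f_stat / df_error) ** (-df_error / 2)

print(f"SS(Model) = {ss_model:.4f}, SS(Error) = {ss_error:.4f}")
print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")
```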

  25. Testing Overall Significance Computer Output [Computer output annotated with k, n – (k + 1), MS(Model), MS(Error), and the P-value]

  26. Types of Regression Models [Diagram: by explanatory variable. 1 quantitative variable: 1st-order, 2nd-order, and 3rd-order models; 2 or more quantitative variables: 1st-order, interaction, and 2nd-order models; 1 qualitative variable: dummy-variable model]

  27. Interaction Model With 2 Independent Variables • Hypothesizes interaction between pairs of x variables • Response to one x variable varies at different levels of another x variable • Contains two-way cross product terms: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$ • Can be combined with other models • Example: dummy-variable model

  28. Interaction Model Relationships • $E(y) = 1 + 2x_1 + 3x_2 + 4x_1 x_2$ • At $x_2 = 1$: $E(y) = 1 + 2x_1 + 3(1) + 4x_1(1) = 4 + 6x_1$ • At $x_2 = 0$: $E(y) = 1 + 2x_1 + 3(0) + 4x_1(0) = 1 + 2x_1$ • Effect (slope) of x1 on E(y) depends on x2 value [Figure: the two lines plotted as E(y) against x1]
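The dependence of the x1 slope on x2 can be checked numerically. The sketch below uses the slide's illustrative model E(y) = 1 + 2x1 + 3x2 + 4x1x2; the function names are made up for this example.

```python
def expected_y(x1, x2):
    """E(y) for the slide's illustrative interaction model."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2

def slope_of_x1(x2):
    """Change in E(y) for a one-unit increase in x1, holding x2 fixed."""
    return expected_y(1, x2) - expected_y(0, x2)

for x2 in (0, 1):
    print(f"x2 = {x2}: intercept = {expected_y(0, x2)}, slope of x1 = {slope_of_x1(x2)}")
```

With x2 = 0 the line is 1 + 2x1, and with x2 = 1 it is 4 + 6x1, so the slope changes by the interaction coefficient (4) for each unit of x2.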

  29. Interaction Example You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Conduct a test for interaction. Use α = .05.

  30. Adding Interactions in JMP is Easy (1) Analyze >> Fit Model (2) Click on the response variable and click the Y button (3) Highlight the two X variables and click on the Add button (4) While the two X variables are highlighted, click on the Cross button (5) Run Model • You can also combine steps 3 and 4 into one step: highlight the two X variables and, from the “Macros” pull-down menu, choose “Factorial to Degree.” The default for degree is 2, so you will get all two-factor interactions in the model.

  31. JMP Interaction Output Interaction not important: p-value > .05

  32. Types of Regression Models [Diagram: by explanatory variable. 1 quantitative variable: 1st-order, 2nd-order, and 3rd-order models; 2 or more quantitative variables: 1st-order, interaction, and 2nd-order models; 1 qualitative variable: dummy-variable model]

  33. Second-Order Model With 1 Independent Variable • Relationship between 1 dependent and 1 independent variable is a quadratic function • Useful 1st model if non-linear relationship suspected • Model: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2$, where $\beta_1 x_1$ is the linear effect and $\beta_2 x_1^2$ is the curvilinear effect

  34. Second-Order Model Relationships [Figure: four panels of y against x1, two labeled $\beta_2 > 0$ and two labeled $\beta_2 < 0$]

  35. Types of Regression Models • Linear (First order): $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ • Quadratic (Second order): $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2$ • Cubic (Third order): $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2 + \hat{\beta}_3 X_i^3$

  36. 2nd Order Model Example The data shows the number of weeks employed and the number of errors made per day for a sample of assembly line workers. Find a 2nd order model, conduct the global F–test, and test if β2 ≠ 0. Use α = .05 for all tests.

  37. Analyze >> Fit Y by X. From the hot spot menu choose: Fit Polynomial >> 2, quadratic. Could also use: Analyze >> Fit Model, select Y, then highlight X and, from the “Macros” pull-down menu, choose “Polynomial to Degree.” The default for degree is 2, so you will get the quadratic (2nd order) polynomial. But from Fit Model, you won’t get the cool fitted line plot.

  38. Types of Regression Models [Diagram: by explanatory variable. 1 quantitative variable: 1st-order, 2nd-order, and 3rd-order models; 2 or more quantitative variables: 1st-order, interaction, and 2nd-order models; 1 qualitative variable: dummy-variable model]

  39. Second-Order (Response Surface) Model With 2 Independent Variables • Relationship between 1 dependent and 2 independent variables is a quadratic function • Useful 1st model if non-linear relationship suspected • Model: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$

  40. 4 + 5 > 0 4 + 5 < 0 y y x2 x1 y 32 > 4 45 Second-Order Model Relationships x2 x1 x2 x1

  41. From JMP: To specify the model, all you need to do is: (1) Analyze >> Fit Model (2) Highlight the X variables (3) From the “Macros” pull-down menu, choose “Response Surface.” The default for degree is 2, so you will get the full second-order model having all squared terms and all cross products.

  42. Types of Regression Models [Diagram: by explanatory variable. 1 quantitative variable: 1st-order, 2nd-order, and 3rd-order models; 2 or more quantitative variables: 1st-order, interaction, and 2nd-order models; 1 qualitative variable: qualitative-variable model]

  43. Qualitative-Variable Model • Involves categorical x variable with 2 levels • e.g., male-female; college-no college • Variable levels coded 0 and 1 • Number of dummy variables is 1 less than number of levels of variable • May be combined with quantitative variable (1st order or 2nd order model)
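The 0/1 coding rule can be sketched as follows. The helper `dummy_code` is hypothetical, written for this illustration: it treats one level as the baseline (coded all zeros) and creates one 0/1 column for each remaining level, so a variable with m levels yields m - 1 dummy variables.

```python
def dummy_code(values, baseline):
    """Return one row of 0/1 dummy variables per observation.

    The baseline level gets all zeros; every other level gets its own column.
    """
    levels = sorted(set(values) - {baseline})  # non-baseline levels, fixed order
    return [[1 if v == level else 0 for level in levels] for v in values]

# Two levels -> one dummy variable (female = 1, male = 0).
print(dummy_code(["male", "female", "female"], baseline="male"))

# Three levels -> two dummy variables.
print(dummy_code(["low", "mid", "high", "mid"], baseline="low"))
```

Each dummy column can then enter the regression like any other x variable, alone or alongside a quantitative variable.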
