Multiple Regression

Presentation Transcript


  1. Multiple Regression

  2. Simple Regression in detail $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ Where • $Y$ => Dependent variable • $X$ => Independent variable • $\beta_0$ => Model parameter - Mean value of the dependent variable (Y) when the independent variable (X) is zero

  3. Simple Regression in detail • $\beta_1$ => Model parameter - Slope that measures the change in the mean value of the dependent variable associated with a one-unit increase in the independent variable • $\varepsilon_i$ => Error term that describes the effects on $Y_i$ of all factors other than the value of $X_i$

  4. Assumptions of the Regression Model • Error term is normally distributed (normality assumption) • Mean of error term is zero ($E\{\varepsilon_i\} = 0$) • Variance of error term is constant and independent of the values of X (constant variance assumption) • Error terms are independent of each other (independence assumption) • Values of the independent variable X are fixed • No error in X values

  5. Estimating the Model Parameters • Calculate point estimates $b_0$ and $b_1$ of the unknown parameters $\beta_0$ and $\beta_1$ • Obtain a random sample and use the information from the sample to estimate $\beta_0$ and $\beta_1$ • Obtain a line of best "fit" for the sample data points - the least squares line $\hat{Y}_i = b_0 + b_1 X_i$, where $\hat{Y}_i$ is the predicted value of Y

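  6. Values of Least Squares Estimates $b_0$ and $b_1$ • $b_1 = \frac{n \sum x_i y_i - (\sum x_i)(\sum y_i)}{n \sum x_i^2 - (\sum x_i)^2}$ • $b_0 = \bar{y} - b_1 \bar{x}$, where $\bar{y} = \frac{\sum y_i}{n}$ and $\bar{x} = \frac{\sum x_i}{n}$ • $b_0$ and $b_1$ vary from sample to sample. The variation is given by their standard errors $S_{b_0}$ and $S_{b_1}$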
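
The least squares formulas on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `least_squares` and the sample data are hypothetical and do not come from the presentation.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - (Sum(x))^2)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean(y) - b1 * mean(x)
    b0 = sum_y / n - b1 * (sum_x / n)
    return b0, b1


# Hypothetical data lying exactly on the line y = 1 + 2x
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
b0, b1 = least_squares(xs, ys)
print(b0, b1)
```

Because the illustrative points fall exactly on a line, the estimates recover that line's intercept and slope; with real data the fit would only approximate the points.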

  7. Example 1 • To see relationship between Advertising and Store Traffic • Store Traffic is the dependent variable and Advertising is the independent variable • We find using the formulae that bo=148.64 and b1 =1.54 • Are bo and b1 significant? • What is Store Traffic when Advertising is 600?
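
The last question on this slide can be answered by plugging the value into the fitted line. A minimal sketch, assuming Advertising is measured in the same units as the data used for the fit (the slide does not state the units); the function name `predict_traffic` is hypothetical.

```python
# Estimates reported on the slide
b0 = 148.64
b1 = 1.54


def predict_traffic(advertising):
    """Predicted Store Traffic from the fitted line b0 + b1 * Advertising."""
    return b0 + b1 * advertising


# 148.64 + 1.54 * 600 = 1072.64
print(round(predict_traffic(600), 2))
```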

  8. Example 2 • Consider the following data • Using formulae we find that b0 = -2.55 and b1 = 1.05

  9. Example 2 Therefore the regression model would be $\hat{Y}_i = -2.55 + 1.05 X_i$ • $r^2 = (0.74)^2 = 0.54$ (proportion of variance in sales (Y) explained by ad (X)) • Assume that $S_{b_0}$ (standard error of $b_0$) = 0.51 and $S_{b_1}$ = 0.26. At $\alpha = 0.05$, df = 4: Is $b_0$ significant? Is $b_1$ significant?
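
One way to work through the two significance questions is to divide each estimate by its standard error and compare the result with a tabulated critical value. The sketch below assumes a two-tailed test at a 0.05 significance level and takes the critical value 2.776 from a standard t table for 4 degrees of freedom; both are assumptions, not values stated on the slide.

```python
# Estimates and standard errors given on the slide
b0, s_b0 = -2.55, 0.51
b1, s_b1 = 1.05, 0.26

# Assumed: two-tailed critical t value for alpha = 0.05 and df = 4 (t table)
T_CRITICAL = 2.776

# Test statistic under H0 (parameter = 0): estimate / standard error
t_b0 = b0 / s_b0
t_b1 = b1 / s_b1

b0_significant = abs(t_b0) > T_CRITICAL
b1_significant = abs(t_b1) > T_CRITICAL

print(round(t_b0, 2), b0_significant)
print(round(t_b1, 2), b1_significant)
```

Under these assumptions both statistics exceed the critical value in absolute terms, so both estimates would be judged significant.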

  10. Idea behind Estimation: Residuals • Differences between the actual and predicted values are called residuals • They are estimates of the error in the population: $e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i)$ (quantities in hats are predicted quantities) • $b_0$ and $b_1$ minimize the residual or error sum of squares (SSE): $SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2 = \sum [y_i - (b_0 + b_1 x_i)]^2$

  11. Testing the Significance of the Independent Variables • Null Hypothesis • There is no linear relationship between the independent & dependent variables • Alternative Hypothesis • There is a linear relationship between the independent & dependent variables

  12. Testing the Significance of the Independent Variables • Test Statistic $t = \frac{b_1 - \beta_1}{s_{b_1}}$ • Degrees of Freedom $v = n - 2$ • Testing for a Type II Error $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$ • Decision Rule: Reject $H_0: \beta_1 = 0$ if $\alpha$ > p-value

  13. Significance Test for Store Traffic Example • Null hypothesis, $H_0: \beta_1 = 0$ • Alternative hypothesis, $H_A: \beta_1 \neq 0$ • The test statistic is $t = \frac{b_1 - 0}{s_{b_1}} = 7.33$ • With $\alpha = 0.05$ and degrees of freedom $v = n - 2 = 18$, the value of t from the table is 2.10 • Since 7.33 > 2.10, we reject the null hypothesis of no linear relationship. Therefore Advertising affects Store Traffic

  14. Predicting the Dependent Variable • How well does the model $\hat{y}_i = b_0 + b_1 x_i$ predict? • Error of prediction without indep var is $y_i - \bar{y}$ • Error of prediction with indep var is $y_i - \hat{y}_i$ • Thus, by using indep var the error in prediction reduces by $(y_i - \bar{y}) - (y_i - \hat{y}_i) = (\hat{y}_i - \bar{y})$ • It can be shown that $\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

  15. Predicting the Dependent Variable • Total variation (SST) = Explained variation (SSM) + Unexplained variation (SSE) • A measure of the model's ability to predict is the Coefficient of Determination ($r^2$): $r^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{SSM}{SST}$ • For our example, $r^2 = 0.74$, i.e., 74% of the variation in Y is accounted for by X • $r^2$ is the square of the correlation between X and Y
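
The split of total variation into explained and unexplained parts, and the link between the coefficient of determination and the correlation, can be checked numerically. The following is a sketch on a small hypothetical data set (not the presentation's data), fitting the line with the standard least squares formulas.

```python
# Hypothetical sample, chosen only to illustrate the decomposition
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares estimates via centered sums of squares and cross products
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Predicted values from the fitted line
y_hat = [b0 + b1 * x for x in xs]

sst = s_yy                                           # total variation
ssm = sum((yh - y_bar) ** 2 for yh in y_hat)         # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat)) # unexplained variation

r_squared = ssm / sst
correlation = s_xy / (s_xx * s_yy) ** 0.5

print(round(sst, 4), round(ssm + sse, 4))
print(round(r_squared, 4), round(correlation ** 2, 4))
```

Each printed pair should agree: the first confirms SST = SSM + SSE, the second that the coefficient of determination equals the squared correlation for a simple regression.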

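  16. Multiple Regression • Used when more than one indep variable affects the dependent variable • General model $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon$ Where $Y$: Dependent variable; $X_1, \dots, X_n$: Independent variables; $\beta_1, \dots, \beta_n$: Coefficients of the n indep variables; $\beta_0$: A constant (Intercept)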
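
Fitting a model with several independent variables can be sketched by solving the normal equations $(X'X)b = X'y$, which generalize the simple regression formulas. The code below is a plain-Python illustration under that assumption; the helper names and the data are hypothetical, and in practice a statistics package would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple_regression(rows, ys):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefs = fit_multiple_regression(rows, ys)
print([round(c, 6) for c in coefs])
```

Since the illustrative responses contain no error term, the fitted coefficients should match the generating values up to rounding.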

  17. Issues in Multiple Regression • Which variables to include? • Is the relationship between the dep variable and each of the indep variables linear? • Is the dep variable normally distributed for all values of the indep variables? • Is each of the indep variables normally distributed (without regard to dep var)? • Are there interaction variables? • Are the indep variables themselves highly correlated?

  18. Example 3 • Cataloger believes that age (AGE) and income (INCOME) can predict amount spent in last 6 months (DOLLSPENT) • The regression equation is DOLLSPENT = 351.29 - 0.65 INCOME + 0.86 AGE • What happens when income (or age) increases? • Are the coefficients significant?
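
The question about increasing income or age can be explored by evaluating the fitted equation at two nearby points. This is a sketch only: the slide does not give the units of INCOME or AGE, so the input values below are hypothetical, as is the function name `predict_dollspent`.

```python
def predict_dollspent(income, age):
    """Fitted equation from the slide: 351.29 - 0.65*INCOME + 0.86*AGE."""
    return 351.29 - 0.65 * income + 0.86 * age


# Hypothetical customer (units of INCOME and AGE are assumed, not given)
base = predict_dollspent(income=50, age=40)

# Change in prediction for a one-unit increase in one variable,
# holding the other variable fixed
income_effect = predict_dollspent(income=51, age=40) - base
age_effect = predict_dollspent(income=50, age=41) - base

print(round(base, 2), round(income_effect, 2), round(age_effect, 2))
```

Each difference equals the corresponding coefficient: with the other variable held fixed, predicted spending falls by 0.65 per unit of income and rises by 0.86 per unit of age.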

  19. Example 4 • Which customers are most likely to buy? • Cataloger believes that ratio of total orders to total pieces mailed is good measure of purchase likelihood • Call this ratio RESP • Indep variables are - TOTDOLL: total purchase dollars - AVGORDR: average dollar order - LASTBUY: # of months since last purchase

  20. Example 4 • Analysis of Variance table - How is total sum of squares split up? - How do you get the various Deg of Freedom? - How do you get/interpret R-square? - How do you interpret the F statistic? - What is the Adjusted R-square?
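
The questions on this slide map onto a handful of standard formulas for a regression with k independent variables and n observations. The sketch below uses hypothetical sums of squares and a hypothetical sample size, since the presentation's ANOVA table is not reproduced in the transcript; only k = 3 matches the three independent variables listed for this example.

```python
# Hypothetical inputs (the actual ANOVA table is not in the transcript)
ssm = 60.0   # model (explained) sum of squares
sse = 40.0   # error (unexplained) sum of squares
n = 25       # number of observations (assumed)
k = 3        # number of independent variables (TOTDOLL, AVGORDR, LASTBUY)

# Total sum of squares splits into model and error parts
sst = ssm + sse

# Degrees of freedom: model = k, error = n - k - 1, total = n - 1
df_model = k
df_error = n - k - 1
df_total = n - 1

# R-square: share of total variation explained by the model
r_square = ssm / sst

# F statistic: ratio of the model mean square to the error mean square
f_statistic = (ssm / df_model) / (sse / df_error)

# Adjusted R-square penalizes R-square for the number of variables used
adj_r_square = 1 - (sse / df_error) / (sst / df_total)

print(df_model, df_error, df_total)
print(round(r_square, 4), round(f_statistic, 4), round(adj_r_square, 4))
```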

  21. Example 4 • Parameter estimates table - What are the t-values corresp to the estimates? - What are the p-values corresp to the estimates? - Which variables are the most important? - What are standardized estimates? - What to do with non-significant variables?
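
Two of these quantities follow from simple formulas: a t-value is the estimate divided by its standard error, and a standardized estimate rescales a coefficient by the ratio of the sample standard deviations of its variable and of the dependent variable. The sketch below assumes that common definition of a standardized estimate and uses hypothetical numbers, since the parameter estimates table is not reproduced in the transcript.

```python
from statistics import stdev


def t_value(estimate, std_error):
    """t-value for H0: parameter = 0, i.e. estimate / standard error."""
    return estimate / std_error


def standardized_estimate(estimate, x_values, y_values):
    """Coefficient rescaled to standard-deviation units: b * s_x / s_y."""
    return estimate * stdev(x_values) / stdev(y_values)


# Hypothetical sample and its least squares slope (0.6 for these data)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope = 0.6

beta_std = standardized_estimate(slope, xs, ys)
print(round(beta_std, 4))

# Hypothetical estimate and standard error
print(round(t_value(1.05, 0.26), 2))
```

Standardized estimates are unit-free, which is why they are often used to compare the relative importance of variables measured on different scales.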
