Week 5 Regression through the origin & Multiple Regression (Partial and overall significance test)
Regression through the origin
The intercept term is absent or zero, i.e., the model is
  Yi = β2Xi + ui
SRF:  Ŷi = β̂2Xi   (the fitted line passes through the origin, β1 = 0)
Regression through the origin
The estimated model:  Ỹi = β̃2Xi   or   Yi = β̃2Xi + ũi
Applying the OLS method:
  β̃2 = ΣXiYi / ΣXi²,   Var(β̃2) = σ̃² / ΣXi²,   and   σ̃² = Σũi² / (n − 1)
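The through-the-origin estimator above takes only a few lines to compute. A minimal sketch (the data values below are made up purely for illustration):

```python
# OLS through the origin: beta2 = sum(X*Y) / sum(X^2), error variance with n-1 df.
# The X, Y values here are hypothetical illustrative data.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(Y)

# Slope estimator without an intercept (raw sums, not deviations from means)
beta2 = (X * Y).sum() / (X ** 2).sum()

# Residuals and error variance: only one parameter is estimated, so df = n - 1
u = Y - beta2 * X
sigma2 = (u ** 2).sum() / (n - 1)

# Var(beta2) = sigma^2 / sum(X^2)
var_beta2 = sigma2 / (X ** 2).sum()
```

Note that, unlike the intercept-present model, the residuals here need not sum to zero.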
Some features of the interceptless model
1. Σũi need not be zero.
2. R² can on occasion turn out to be negative, so it may not be an appropriate summary statistic for this model.
3. The degrees of freedom do not lose one for the constant term; with one slope estimated, df = n − 1.
In practice:
1. Use this model only with a very strong a priori or theoretical expectation; otherwise stick to the conventional intercept-present model.
2. If the intercept is included in the regression model but turns out to be statistically insignificant, we may drop the intercept and re-run the regression through the origin.
Regression through the origin vs. the conventional model
Through the origin: Y = β2X + u′           With intercept: Y = β1 + β2X + u
  β̃2 = ΣXY / ΣX²                            β̂2 = Σxy / Σx²
  Var(β̃2) = σ̃² / ΣX²                        Var(β̂2) = σ̂² / Σx²
  σ̃² = Σũ′² / (n − 1)                        σ̂² = Σû² / (n − 2)
  raw R² = (ΣXY)² / (ΣX² ΣY²)                R² = [Σ(X−X̄)(Y−Ȳ)]² / [Σ(X−X̄)² Σ(Y−Ȳ)²] = (Σxy)² / (Σx² Σy²)
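The contrast between the raw R² (for the origin model) and the conventional centered R² can be sketched directly; the data below are hypothetical:

```python
# "Raw" R^2 for regression through the origin vs. conventional centered R^2.
# Illustrative made-up data.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# raw R^2 = (sum XY)^2 / (sum X^2 * sum Y^2) -- built from raw sums
raw_r2 = (X * Y).sum() ** 2 / ((X ** 2).sum() * (Y ** 2).sum())

# conventional R^2 uses deviations from the sample means
x, y = X - X.mean(), Y - Y.mean()
centered_r2 = (x * y).sum() ** 2 / ((x ** 2).sum() * (y ** 2).sum())
```

The two measures are not directly comparable, which is why the raw R² may be a misleading summary when the interceptless model is inappropriate.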
[Figure: three scatter diagrams (axes Y vs. X, with reference points at X = 10, 50 and Y = 1, 5) comparing sample regression functions:]
(a) False SRF: Ŷ = β̂′2X (through the origin); true and best SRF: Ŷ = β̂1 + β̂2X.
(b) True and best SRF: Ŷ = β̂1 + β̂2X; false SRF: Ŷ = β̂2X (through the origin).
(c) False SRFs: Ŷ = β̂1 + β̂2X and Ŷ = β̂1 + β̂′2X; true and best SRF: Ŷ = β̂2X (through the origin).
Example 1: Capital Asset Pricing Model (CAPM)
Security i's expected risk premium = β2 × the expected market risk premium:
  (ERi − rf) = β2(ERm − rf)
where rf = risk-free rate of return, ERi = expected rate of return on security i, and ERm = expected rate of return on the market portfolio.
β2 serves as a measure of systematic risk:
  β2 > 1 implies a volatile or aggressive security.
  β2 < 1 implies a defensive security.
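Since the CAPM has no intercept, β2 is a natural case for regression through the origin. A minimal sketch, using invented excess-return series:

```python
# Estimating the CAPM beta by regression through the origin.
# The excess-return series below are hypothetical, for illustration only.
import numpy as np

excess_i = np.array([1.2, -0.5, 2.3, 0.8, -1.0, 1.7])  # ER_i - r_f
excess_m = np.array([1.0, -0.4, 2.0, 0.6, -0.9, 1.5])  # ER_m - r_f

# Through-the-origin estimator: beta = sum(x_m * x_i) / sum(x_m^2)
beta = (excess_m * excess_i).sum() / (excess_m ** 2).sum()

# beta > 1: volatile/aggressive security; beta < 1: defensive security
aggressive = beta > 1
```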
Example 1 (cont.)
[Figure: the security market line, plotting ERi − rf against ERm − rf, a line through the origin with slope β2 (unit-slope triangle marked).]
Example 2: Covered Interest Parity
International interest rate differentials equal the exchange-rate forward premium:
  (i − i*) = (F − e) / e
i.e., the regression through the origin:  (i − i*) = β2[(F − e)/e]
[Figure: the covered interest parity line, through the origin with slope β2 = 1.]
Example 2 (cont.)
In the regression with an intercept:
  (i − i*) = β1 + β2[(F − e)/e] + u
If covered interest parity holds, β1 is expected to be zero, i.e., E(β̂1) = 0.
Y: return on A Future Fund, %;  X: return on the Fisher Index, %
Formal report:  Ŷ = 1.0899X      R² = 0.714   N = 10   SEE = 19.54
                    (5.689)
H0: β1 = 0
  Ŷ = 1.2797 + 1.0691X      R² = 0.715   N = 10   SEE = 20.69
      (0.166)  (4.486)
  t = (1.2797 − 0) / 7.668 = 0.166
The t-value shows that β̂1 is statistically insignificant, i.e., not significantly different from zero.
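The mechanics of this intercept test can be sketched as follows. The data here are hypothetical (not the fund-return sample), so the numbers will differ; what matters is the sequence of steps:

```python
# Fit Y = b1 + b2*X by OLS, then test H0: beta1 = 0 with t = b1 / se(b1).
# Hypothetical illustrative data.
import numpy as np

X = np.array([3.0, 8.0, 12.0, 20.0, 30.0, 41.0, 55.0, 60.0, 67.0, 75.0])
Y = np.array([5.0, 9.0, 15.0, 22.0, 31.0, 40.0, 58.0, 63.0, 70.0, 80.0])
n = len(Y)

# OLS in deviation form
x = X - X.mean()
y = Y - Y.mean()
b2 = (x * y).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()

# Error variance with n-2 df; classical standard error of the intercept
u = Y - b1 - b2 * X
sigma2 = (u ** 2).sum() / (n - 2)
se_b1 = np.sqrt(sigma2 * (X ** 2).sum() / (n * (x ** 2).sum()))

# t statistic for H0: beta1 = 0; compare with the critical t_{0.025, n-2}
t_b1 = (b1 - 0.0) / se_b1
```

If |t_b1| falls below the critical value, the intercept is insignificant and one may consider re-running the regression through the origin.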
Multiple Regression
  Y = β1 + β2X2 + β3X3 + … + βkXk + u
Deriving the OLS estimators of multiple regression
  Y = β1 + β2X2 + β3X3 + u
  û = Y − β̂1 − β̂2X2 − β̂3X3
OLS minimizes the residual sum of squares (RSS = Σû²):
  min RSS = min Σû² = min Σ(Y − β̂1 − β̂2X2 − β̂3X3)²
First-order conditions:
  ∂RSS/∂β̂1 = 2Σ(Y − β̂1 − β̂2X2 − β̂3X3)(−1) = 0
  ∂RSS/∂β̂2 = 2Σ(Y − β̂1 − β̂2X2 − β̂3X3)(−X2) = 0
  ∂RSS/∂β̂3 = 2Σ(Y − β̂1 − β̂2X2 − β̂3X3)(−X3) = 0
Rearranging the three equations (the normal equations):
  nβ̂1     + β̂2ΣX2    + β̂3ΣX3    = ΣY
  β̂1ΣX2  + β̂2ΣX2²   + β̂3ΣX2X3  = ΣX2Y
  β̂1ΣX3  + β̂2ΣX2X3  + β̂3ΣX3²   = ΣX3Y
Rewriting in matrix form (the 3-variable case):
  | n     ΣX2     ΣX3   | | β̂1 |   | ΣY   |
  | ΣX2   ΣX2²    ΣX2X3 | | β̂2 | = | ΣX2Y |
  | ΣX3   ΣX2X3   ΣX3²  | | β̂3 |   | ΣX3Y |
Matrix notation:  (X′X)β̂ = X′Y
By Cramer's rule (lower-case letters denote deviations from sample means):
  β̂2 = [(Σyx2)(Σx3²) − (Σyx3)(Σx2x3)] / [(Σx2²)(Σx3²) − (Σx2x3)²]
  β̂3 = [(Σyx3)(Σx2²) − (Σyx2)(Σx2x3)] / [(Σx2²)(Σx3²) − (Σx2x3)²]
  β̂1 = Ȳ − β̂2X̄2 − β̂3X̄3
(X′X)β̂ = X′Y  ==>  β̂ = (X′X)⁻¹(X′Y)
                   3x1   3x3   3x1
Variance-covariance matrix:
               | Var(β̂1)       Cov(β̂1,β̂2)  Cov(β̂1,β̂3) |
  Var-cov(β̂) = | Cov(β̂2,β̂1)  Var(β̂2)       Cov(β̂2,β̂3) | = σ̂u²(X′X)⁻¹
               | Cov(β̂3,β̂1)  Cov(β̂3,β̂2)  Var(β̂3)      |
where σ̂u² = Σû² / (n − 3)
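The matrix formulas above translate almost line-for-line into code. A minimal sketch with hypothetical data for the three-variable model:

```python
# beta_hat = (X'X)^{-1} X'Y and Var-cov(beta_hat) = sigma^2 (X'X)^{-1}.
# The data below are hypothetical illustrative values.
import numpy as np

X2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X3 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
Y  = np.array([3.1, 4.2, 7.9, 8.1, 12.2, 12.8, 17.1, 16.9])
n = len(Y)

# Design matrix: a column of ones for the intercept, then X2 and X3
X = np.column_stack([np.ones(n), X2, X3])
k = X.shape[1]

# Solve the normal equations (X'X) beta = X'Y rather than inverting explicitly
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Residuals and error variance with n - k degrees of freedom
u = Y - X @ beta_hat
sigma2 = (u ** 2).sum() / (n - k)

# Variance-covariance matrix of the estimators; std. errors on the diagonal
varcov = sigma2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(varcov))
```

Using `np.linalg.solve` on the normal equations is numerically preferable to forming the inverse just to get β̂; the inverse is still needed for the variance-covariance matrix.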
                                  | n     ΣX2     ΣX3   |⁻¹
  Var-cov(β̂) = σ̂u²(X′X)⁻¹ = σ̂u² | ΣX2   ΣX2²    ΣX2X3 |
                                  | ΣX3   ΣX3X2   ΣX3²  |
and  σ̂u² = Σû² / (n − 3) = Σû² / (n − k),
where k = 3 is the number of parameters estimated (including the constant term).
The meaning of partial regression coefficients
  Y = β1 + β2X2 + β3X3 + u   (suppose this is the true model)
β2 = ∂Y/∂X2 : measures the change in the mean value of Y per unit change in X2, holding X3 constant — the 'direct' or 'net' effect of a unit change in X2 on the mean value of Y.
β3 = ∂Y/∂X3 : holding X2 constant, the direct effect of a unit change in X3 on the mean value of Y.
Holding constant: to assess the true contribution of X2 to the change in Y, we control for the influence of X3.
Properties of multiple OLS estimators
1. Regression through the mean: the regression line (surface) passes through the means Ȳ, X̄2, X̄3,
   i.e., β̂1 = Ȳ − β̂2X̄2 − β̂3X̄3, so that Ȳ = β̂1 + β̂2X̄2 + β̂3X̄3
   ==> Ŷ = Ȳ + β̂2x2 + β̂3x3, or in deviation form ŷ = β̂2x2 + β̂3x3
2. Unbiasedness: E(β̂i) = βi
3. Zero mean of the residuals: Σû = 0 and ΣûŶ = 0, with constant Var(ûi) = σ²
4. The residuals are uncorrelated with the regressors: ΣûX2 = ΣûX3 = 0 (ΣûXk = 0)
5. The model is linear in parameters and estimated from a random sample.
Properties of multiple OLS estimators (cont.)
6. As X2 and X3 become closely related (collinear), var(β̂2) and var(β̂3) become large, approaching infinity; the true values of β2 and β3 are then difficult to pin down.
7. The greater the variation in the sample values of X2 or X3, the smaller the variances of β̂2 and β̂3, and the more precise the estimates.
8. BLUE (Gauss-Markov theorem).
All the normality assumptions of the two-variable regression also apply to the multiple-variable regression, with one additional assumption: no exact linear relationship among the independent variables (no perfect collinearity, i.e., Xk ≠ λXj).
The adjusted R² (R̄²) as one indicator of overall fit
  R² = ESS/TSS = 1 − RSS/TSS = 1 − Σû²/Σy²
  R̄² = 1 − [Σû²/(n−k)] / [Σy²/(n−1)] = 1 − σ̂²/SY² = 1 − (1 − R²)(n−1)/(n−k)
where k = number of parameters including the constant term, n = number of observations.
While 0 ≤ R² ≤ 1, we have R̄² ≤ R², and the adjusted R² can be negative even when R² ≥ 0.
Note: don't misuse the adjusted R²; see Gujarati (2003), pp. 222.
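The adjustment formula is a one-liner; a small helper makes the k convention (count all parameters, including the intercept) explicit:

```python
# Adjusted R^2: R_bar^2 = 1 - (1 - R^2) * (n - 1) / (n - k),
# where k counts all estimated parameters including the intercept.
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k)

# With k = 1 (intercept only adjusted away), the formula returns R^2 itself;
# with small n and large k, the result can be negative.
```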
[Figure: the total variation of Y (TSS, with n−1 df) decomposed into the part explained by the regression and the residual û, for the model Y = β1 + β2X2 + β3X3 + u.]
[Figure: Venn (Ballantine) diagram of C, X2, X3, and X4 — suppose X4 is not a true explanatory variable but is included in the regression.]
Hypothesis testing in multiple regression:
1. Testing an individual partial coefficient
2. Testing the overall significance of all coefficients
3. Testing restrictions on variables (add or drop): βk = 0?
4. Testing partial coefficients under restrictions, such as β2 + β3 = 1, or β2 = β3 (or β2 + β3 = 0), etc.
5. Testing the functional form of the regression model
6. Testing the stability of the estimated regression model — over time, or across different cross-sections
1. Individual partial coefficient test
Holding X3 constant, does X2 have an effect on Y?  (∂Y/∂X2 = β2 = 0?)
H0: β2 = 0    H1: β2 ≠ 0
  t = (β̂2 − 0) / se(β̂2) = 0.726 / 0.048 = 14.906
Compare with the critical value tc(0.025, 12) = 2.179.
Since t > tc ==> reject H0.
Answer: Yes, β̂2 is statistically significant, i.e., significantly different from zero.
1. Individual partial coefficient test (cont.)
Holding X2 constant, does X3 have an effect on Y?  (∂Y/∂X3 = β3 = 0?)
H0: β3 = 0    H1: β3 ≠ 0
Critical value: tc(0.025, 12) = 2.179
  t = (β̂3 − 0) / se(β̂3) = (2.736 − 0) / 0.848 = 3.226
Since |t| > |tc| ==> reject H0.
Answer: Yes, β̂3 is statistically significant, i.e., significantly different from zero.
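The arithmetic of this test, using the estimate and standard error quoted in the slide, can be sketched as:

```python
# t test for H0: beta3 = 0, using the slide's quoted estimate and std. error
beta3_hat = 2.736
se_beta3 = 0.848
t = (beta3_hat - 0) / se_beta3   # t-ratio under H0

t_crit = 2.179                   # two-tailed critical value t_{0.025, 12}
reject_h0 = abs(t) > t_crit      # True => beta3 is statistically significant
```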
2. Testing the overall significance of the multiple regression
Three-variable case: Y = β1 + β2X2 + β3X3 + u
H0: β2 = 0 and β3 = 0  (all slope coefficients are jointly zero)
H1: β2 ≠ 0 or β3 ≠ 0   (at least one slope is nonzero)
1. Compute the F-statistic.
2. Look up the critical value Fc(α, k−1, n−k).
3. Compare F and Fc; if F > Fc ==> reject H0.
Analysis of variance:
Since y = ŷ + û  ==>  Σy² = Σŷ² + Σû²,  i.e.,  TSS = ESS + RSS
ANOVA TABLE
  Source of variation       Sum of squares (SS)   df     Mean sum of squares (MSS)
  Due to regression (ESS)   Σŷ²                   k−1    Σŷ²/(k−1)
  Due to residuals (RSS)    Σû²                   n−k    Σû²/(n−k) = σ̂u²
  Total variation (TSS)     Σy²                   n−1
Note: k is the total number of parameters including the intercept term.
  F = (MSS of ESS)/(MSS of RSS) = [ESS/(k−1)] / [RSS/(n−k)] = [Σŷ²/(k−1)] / [Σû²/(n−k)]
H0: β2 = … = βk = 0;  H1: at least one of β2, …, βk ≠ 0.
If F > Fc(k−1, n−k) ==> reject H0.
Three-variable case (in deviation form):
  ŷ = β̂2x2 + β̂3x3
  Σy² = β̂2Σx2y + β̂3Σx3y + Σû²,  i.e.,  TSS = ESS + RSS
ANOVA TABLE (k = 3)
  Source   SS                       df      MSS
  ESS      β̂2Σx2y + β̂3Σx3y        3−1     ESS/(3−1)
  RSS      Σû²                      n−3     RSS/(n−3)
  TSS      Σy²                      n−1
  F = [ESS/(k−1)] / [RSS/(n−k)] = [(β̂2Σx2y + β̂3Σx3y)/(3−1)] / [Σû²/(n−3)]
An important relationship between R² and F
  F = [ESS/(k−1)] / [RSS/(n−k)] = [ESS/(TSS − ESS)] × (n−k)/(k−1)
Dividing ESS and TSS − ESS through by TSS:
  F = [R²/(k−1)] / [(1−R²)/(n−k)]
For the three-variable case:  F = (R²/2) / [(1−R²)/(n−3)]
And in reverse:
  R² = (k−1)F / [(k−1)F + (n−k)]
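The two directions of this identity are easy to encode and check against each other:

```python
# R^2 <-> F conversions; k counts all parameters including the intercept.
def f_from_r2(r2, n, k):
    """Overall F statistic implied by R^2: F = [R^2/(k-1)] / [(1-R^2)/(n-k)]."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

def r2_from_f(f, n, k):
    """Inverse relationship: R^2 = (k-1)F / ((k-1)F + (n-k))."""
    return (k - 1) * f / ((k - 1) * f + (n - k))
```

Either function can be used to recover one statistic from a published value of the other when the raw data are unavailable.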
Overall significance test:
H0: β2 = β3 = β4 = 0
H1: at least one coefficient is nonzero (β2 ≠ 0, or β3 ≠ 0, or β4 ≠ 0)
  F* = [R²/(k−1)] / [(1−R²)/(n−k)] = (0.9710/3) / [(1−0.9710)/16] = 179.13
  Fc(0.05, 4−1, 20−4) = Fc(0.05, 3, 16) = 3.24
Since F* > Fc ==> reject H0.
Construct the ANOVA table (8.4) (information from EViews):
  Source of variation       SS                                             df        MSS
  Due to regression (ESS)   R²Σy² = (0.971088)(28.97771)²×19 = 15493.171   k−1 = 3   15493.171/3 = 5164.3903
  Due to residuals (RSS)    (1−R²)Σy² = (0.028912)(28.97771)²×19 = 461.2621  n−k = 16  461.2621/16 = 28.8288
  Total (TSS)               Σy² = (28.97771)²×19 = 15954.446               n−1 = 19
  F* = (MSS of regression)/(MSS of residuals) = 5164.3903/28.8288 = 179.1339
Since (σ̂y)² = Var(Y) = Σy²/(n−1)  ==>  Σy² = (n−1)(σ̂y)²
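The F* reported in this table can be reproduced directly from the quoted R², n, and k:

```python
# Reproduce the overall F statistic from the ANOVA numbers quoted above
r2, n, k = 0.971088, 20, 4

ess_ms = r2 / (k - 1)          # explained mean square, as a share of TSS
rss_ms = (1 - r2) / (n - k)    # residual mean square, as a share of TSS
f_star = ess_ms / rss_ms       # TSS cancels, so shares suffice
```

Note that using the rounded R² = 0.9710 instead of 0.971088 shifts the result slightly, which is why hand calculations from rounded figures can drift from software output.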
Example: Gujarati (2003), Table 6.4, p. 185
H0: β2 = β3 = 0
  F* = [ESS/(k−1)] / [RSS/(n−k)] = [R²/(k−1)] / [(1−R²)/(n−k)]
     = (0.707665/2) / [(1−0.707665)/61] = 73.832
  Fc(0.05, 3−1, 64−3) = Fc(0.05, 2, 61) = 3.15
Since F* > Fc ==> reject H0.
Construct the ANOVA table (information from EViews):
  Source of variation       SS                                              df        MSS
  Due to regression (ESS)   R²Σy² = (0.707665)(75.97807)²×64 = 261447.33    k−1 = 2   261447.33/2 = 130723.67
  Due to residuals (RSS)    (1−R²)Σy² = (0.292335)(75.97807)²×64 = 108003.37  n−k = 61  108003.37/61 = 1770.547
  Total (TSS)               Σy² = (75.97807)²×64 = 369450.7                 n−1 = 63
  F* = (MSS of regression)/(MSS of residuals) = 130723.67/1770.547 = 73.832
Since (σ̂y)² = Var(Y) = Σy²/(n−1)  ==>  Σy² = (n−1)(σ̂y)²
Y = β1 + β2X2 + β3X3 + u
H0: β2 = 0 and β3 = 0;   H1: β2 ≠ 0 or β3 ≠ 0
Compare F* with Fc from the F-table:
  Fc(0.01, 2, 61) = 4.98;   Fc(0.05, 2, 61) = 3.15
Decision rule: since F* = 73.832 > Fc = 4.98 (and 3.15) ==> reject H0.
Answer: The estimated slope coefficients are jointly statistically significantly different from zero.