Back to House Prices…
• Taken at face value, our failure to reject the null hypothesis suggests that the housing stock has no effect on prices
• Note the phrase “cannot reject”: it is not the same as accepting the null
• This is not very plausible. Even if it is true, maybe it is an effect of the bubble
• Prices became divorced from their usual determinants
• Re-estimate the model for the pre-bubble period and see if there is a difference
• There seems to be a difference after 1997
Structural Break
• This is known as a structural break or a regime shift
• It implies that the coefficients, not just the values of the variables, may differ between regimes
• So the conditional expectation function has a kink
• It can happen at a point in time or for a different group of observations
[Figure: A Regime Shift. The line E(Y|X) = b1 + b2 X plotted against X, with three data points (X1, Y1), (X2, Y2), (X3, Y3) and their residuals u1, u2, u3 shown for illustration.]
Estimating with Structural Break
• Stata command: regress … if condition

    regress price inc_pc hstock_pc if year<=1997

          Source |       SS       df       MS              Number of obs =      28
    -------------+------------------------------           F(  2,    25) =   88.31
           Model |  1.1008e+10     2  5.5042e+09           Prob > F      =  0.0000
        Residual |  1.5581e+09    25  62324995.9           R-squared     =  0.8760
    -------------+------------------------------           Adj R-squared =  0.8661
           Total |  1.2566e+10    27   465423464           Root MSE      =  7894.6

    ------------------------------------------------------------------------------
           price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          inc_pc |   10.39438   1.288239     8.07   0.000     7.741204    13.04756
       hstock_pc |  -637054.1   174578.5    -3.65   0.001    -996605.3     -277503
           _cons |   135276.6   35433.83     3.82   0.001     62299.24    208253.9
    ------------------------------------------------------------------------------
Interpreting Results
• We can reject the null that housing stock has no effect on price
• How? Do the details yourself
• It has the correct sign, i.e. the sign that theory would suggest
• We can use this “pre-bubble” model to predict what prices would have been
• Use the “predict” command
• Note that this command also predicts out of sample
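One way to check the details of the rejection is sketched below in Python. The coefficient and standard error are copied from the regression output; the two-sided 5% critical value of 2.060 for 25 degrees of freedom is taken from a standard t table and is an assumption about the significance level you would choose.

```python
# Values copied from the hstock_pc row of the regression output.
coef = -637054.1
std_err = 174578.5

# t statistic for the null hypothesis that the coefficient is zero.
t_stat = coef / std_err

# Two-sided 5% critical value for 25 degrees of freedom
# (28 observations minus 3 estimated coefficients), from a t table.
t_crit = 2.060

# Two-sided rule: reject when the t statistic is large in absolute value.
reject_null = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
```

The computed t statistic matches the -3.65 reported by Stata, and it lies well outside the critical value, so the null is rejected.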
Interpreting the Graph
• Historically, house prices have been determined by per capita income and the per capita housing stock
• We estimate the parameters of that relationship before 1997
• If the relationship had remained constant after 1997, prices would have followed the red line
• The difference between the two is huge (100% at peak)
Interpreting the Graph
• So prices rose rapidly even as the stock of houses rose
• Note we are not saying that prices should have remained constant
• They should have risen in line with income
• But they rose by more
• If the parameters had remained the same, then prices would have followed the red line
But the marginal effect of income and stock should have remained the same
• Income effect rose (more positive)
• Stock effect rose (less negative)
• The change in parameters is the structural break that constitutes the bubble
• Note: this definition of a bubble would not be universally accepted
How Low Will They Go?
• If we believe our model, then the pre-1997 parameters still hold
• This doesn’t mean that prices will return to 1997 levels
• It does mean that prices will return to the level determined by today’s income (and stock) using the pre-bubble coefficients
• The red line represents the future level of house prices
• This represents a decline of 53% from 2010 levels!
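The idea of applying the pre-bubble coefficients to current income and stock can be sketched as follows. The coefficients come from the pre-1997 regression output; the input values for income, housing stock, and the actual price are purely hypothetical placeholders, not figures from the data set.

```python
# Pre-bubble coefficients copied from the regression output.
B_CONS = 135276.6
B_INC = 10.39438
B_STOCK = -637054.1

def predicted_price(inc_pc, hstock_pc):
    """Price implied by the pre-bubble model for given income and stock per capita."""
    return B_CONS + B_INC * inc_pc + B_STOCK * hstock_pc

def implied_decline(actual_price, inc_pc, hstock_pc):
    """Fraction by which the actual price exceeds the model's prediction."""
    return (actual_price - predicted_price(inc_pc, hstock_pc)) / actual_price

# Hypothetical inputs, chosen only to illustrate the arithmetic.
example_inc_pc = 20000.0
example_hstock_pc = 0.35
example_actual_price = 200000.0

example_prediction = predicted_price(example_inc_pc, example_hstock_pc)
example_decline = implied_decline(example_actual_price, example_inc_pc, example_hstock_pc)
print(round(example_prediction, 2), round(example_decline, 3))
```

With the real income, stock, and price series in place of the placeholders, the same calculation would give the gap between actual prices and the red line.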
Quality of Prediction
• Before we get too confident in this (or any) model, it is worthwhile to assess whether it is a good predictor
• We measure this using the R² (“R-squared”) and the adjusted R²
• Both of these measure how much of the variation in the Y variable is explained by all the X variables collectively
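R²
• Let’s look at the model again
• Full model: $Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + u_i$
• Where $E(Y_i|X) = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki}$
• $u_i$ is the residual
• For ease of notation let $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki}$
• So the model can be re-written as $Y_i = \hat{Y}_i + u_i$
• Subtract the sample mean $\bar{Y}$ from both sides to get $Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + u_i$
• Note that the sample mean doesn’t have a subscript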
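Now square it and sum over observations to get TSS = ESS + RSS
• In full: $\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i u_i^2$ (the cross-product term sums to zero when the model is estimated by OLS with a constant)
• Total Sum of Squares (TSS) measures the total variation of the Y variable in the sample
• Note that the variance would be this divided by N (or by N - 1 for the unbiased sample variance)
• Explained Sum of Squares (ESS) is the variation that is explained by the model, i.e. by the “line”
• The Residual Sum of Squares (RSS) is the unexplained portion of the variation in Y
• Also known as SSR
• This is the quantity that OLS minimizes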
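The decomposition can be checked numerically. The sketch below fits a one-variable OLS line to a small made-up data set (the numbers are illustrative only, not the house price data) and confirms that the three sums of squares add up.

```python
# Small made-up data set, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept for a single regressor plus a constant.
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

tss = sum((yi - y_bar) ** 2 for yi in y)     # total variation in Y
ess = sum((fi - y_bar) ** 2 for fi in fitted)  # variation explained by the line
rss = sum(ui ** 2 for ui in resid)           # unexplained variation

print(tss, ess, rss, abs(tss - (ess + rss)) < 1e-9)
```

The identity holds here because the regression includes a constant; without one, the cross-product term need not vanish.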
R² = ESS/TSS
• It gives the proportion of the total variation in the Y variable that is explained by the model
• By all the X variables collectively
• It doesn’t say anything about which of the X variables are responsible for what portion of the variance
• Doesn’t say if the individual coefficients are statistically significant (multicollinearity)
• Doesn’t say if the individual coefficients make economic sense
• If we want to use the model to predict Y, then we want R² to be high
• If the model explained a lot of the historical variation, it should explain a high proportion of the future variation, provided the relationship stays stable
• i.e. make good predictions
• It doesn’t automatically follow that a high-R² model is better than a lower-R² model
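As a quick check, the R² in the pre-1997 regression output can be reproduced from the sums of squares in the same table. This is only a sketch of the arithmetic; the three numbers are copied from the "SS" column (Model, Residual, Total).

```python
# Sums of squares copied from the regression output.
ess = 1.1008e10   # Model (explained) sum of squares
rss = 1.5581e9    # Residual sum of squares
tss = 1.2566e10   # Total sum of squares

r2 = ess / tss            # proportion of variation explained
r2_alt = 1 - rss / tss    # equivalent form, since TSS = ESS + RSS

print(round(r2, 4), round(r2_alt, 4))
```

Both forms agree with the reported R-squared of 0.8760, up to the rounding in the printed sums of squares.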
Adjusted R²
• Amend the formula slightly to account for the number of variables
• Adding variables to a model never lowers R² and almost always raises it
• Some authors prefer adjusted R² (also called R-bar squared and denoted by $\bar{R}^2$)
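A common form of the adjustment is $\bar{R}^2 = 1 - (1 - R^2)\frac{N-1}{N-K}$, where N is the number of observations and K the number of estimated coefficients including the constant. The sketch below assumes this is the formula behind Stata's "Adj R-squared" and checks it against the pre-1997 output; the function name is illustrative.

```python
def adjusted_r2(r2, n_obs, n_coefs):
    """Adjusted R-squared; n_coefs counts the constant as a coefficient."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_coefs)

# Values from the pre-1997 regression: R-squared 0.8760, 28 observations,
# 3 coefficients (inc_pc, hstock_pc and the constant).
adj = adjusted_r2(0.8760, 28, 3)
print(round(adj, 4))
```

The result agrees with the 0.8661 reported in the output. Because (N-1)/(N-K) grows with K, an extra variable raises the adjusted measure only if it improves the fit enough to offset the penalty.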
Using either R²
• Compare two different models
• They must have the same Y variable
• They must have the same sample
• Be aware that more variables will necessarily lead to a higher R² but not necessarily a higher adjusted R²
• R² (either version) only tells you how good the model is at prediction
• It doesn’t say anything about whether the model is economically sensible
• For that we need to look at the coefficients
One tailed test
• The structure and interpretation of hypothesis tests is slightly different when the hypothesis involves an inequality
• Examples
• Gender: H0: b_sex >= 0, H1: b_sex < 0
• House: H0: b_H <= 0, H1: b_H > 0
• Nevertheless, sometimes these hypotheses arise naturally
Housing Example
We need to be careful about the interpretation of the null and alternative
• State the hypothesis we want to test: H0: b_H >= 0, H1: b_H < 0
• Calculate the test statistic assuming that b_H = 0 is true: t = -3.65 (this is the same as before)
• Reject the null if t < -critical value at the chosen significance level
• We can reject the null as -3.65 < -1.70
Q: where did -1.70 come from?
Trick to a one tailed test
• The test statistic is the same as for a two tailed test
• The critical value is different even if the significance level is the same
• The reason is that the rejection region is all in one tail
• Remember we reject the null only if there is overwhelming evidence
• What constitutes overwhelming evidence to reject a null which states that the coefficient is positive?
• Evidence that the coefficient is a strongly negative number
• i.e. the entire rejection region is in the left tail
• A large positive number would (obviously) not allow us to reject a null that states the coefficient is positive
• This affects the calculation of the critical value
[Figure: One Tailed example. A t distribution with the “Acceptance” Region marked and the rejection region in one tail.]
Critical Value
• Let’s choose a significance level of 5%
• Df = N - K = 38
• All of the significance level is in one tail
• From the table (using the df = 30 row): tc = 1.70
• Note that for a two sided test it would have been 2.04
• Since the rejection region is negative: tc = -1.70
• Using Stata:
• One sided: di invttail(38,0.05)
• Two sided: di invttail(38,0.025)
• Classic mistake: using +/-2.04 or +1.70
• Hint: use a diagram to remind yourself of the size and sign of the rejection region
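For readers without Stata or a t table to hand, the critical values can be approximated numerically. The sketch below is a plain-Python stand-in for the role that `invttail` plays above (it is not Stata's implementation): it integrates the t density with Simpson's rule and finds the cut-off by bisection. The function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_prob(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_upper_critical(df, p):
    """Value c with P(T > c) = p, for 0 < p < 0.5, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail_prob(mid, df) > p:
            lo = mid   # tail still too heavy: move the cut-off right
        else:
            hi = mid
    return (lo + hi) / 2

one_sided = t_upper_critical(38, 0.05)    # all 5% in one tail
two_sided = t_upper_critical(38, 0.025)   # 2.5% in each tail
print(round(one_sided, 3), round(two_sided, 3))
```

For a left-tailed test the sign is flipped, so the rejection region is everything below minus the one-sided value.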
[Figure: One Tailed example. A t distribution with the “Acceptance” Region marked and the rejection region in one tail.]
The Example from the other side
• State the hypothesis we want to test: H0: b_H <= 0, H1: b_H > 0
• Calculate the test statistic assuming that b_H = 0 is true: t = -3.65
• Reject the null if t > critical value at the chosen significance level
• A large positive number is evidence to reject a null that states the coefficient is negative
• We cannot reject the null as -3.65 < 1.70
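The two versions of the test differ only in which tail holds the rejection region. A minimal sketch of the decision rule, using the t statistic and the tabulated critical value of 1.70 from the slides (the function and argument names are illustrative):

```python
def one_tailed_reject(t_stat, t_crit, alternative):
    """Decide a one-tailed t test.

    t_crit is the positive one-sided critical value.
    alternative="less"    tests H0: b >= 0 against H1: b < 0 (left tail).
    alternative="greater" tests H0: b <= 0 against H1: b > 0 (right tail).
    """
    if alternative == "less":
        return t_stat < -t_crit
    if alternative == "greater":
        return t_stat > t_crit
    raise ValueError("alternative must be 'less' or 'greater'")

t_stat = -3.65
t_crit = 1.70

reject_left = one_tailed_reject(t_stat, t_crit, "less")      # first version
reject_right = one_tailed_reject(t_stat, t_crit, "greater")  # second version
print(reject_left, reject_right)
```

The same t statistic rejects the null in the first version and fails to reject it in the second, which is why the choice of null and alternative matters.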
The difference between the two
• The first: H0: b_H >= 0, H1: b_H < 0
• 5% chance of rejecting the null when it is correct (definition of the significance level)
• i.e. of stating b_H < 0 when in fact b_H >= 0
• In the gender example, this is the chance of stating there is discrimination when in fact there is none
• The second: H0: b_H <= 0, H1: b_H > 0
• 5% chance of rejecting the null when it is correct
• i.e. of stating b_H > 0 when in fact b_H <= 0
• In the gender example, this is the chance of stating there is no discrimination when in fact there is some
Which you use is up to you. But
• Beware of translating directly from English
• Be aware of the implications
• Rule of thumb:
• H1: “what you expect”, e.g. guilt, negative effect of housing stock
• H0: “what you fear”, e.g. innocence, positive effect of housing stock
• So the test procedure keeps the probability of rejecting what you fear when it is true small (at most the significance level)
• This notion works for a two sided test also