1. Chapter 6: Inference and Prediction
Marco Wijnakker
Sunny Cheng
2. Agenda Introduction + summary of foregoing chapters
Restrictions and nested models
2 approaches to testing hypotheses
Non-normal disturbances and large sample tests
Testing non-linear restrictions
Prediction
3. 1. Introduction
Classical Linear Regression Model (CLRM)
Estimation (chapters 3 to 5)
Hypothesis testing → chapter 6
Prediction → chapter 6
4. 1. Introduction Previous chapters:
Regression model
y = Xβ + ε
OLS estimator
b = (X'X)⁻¹X'y
If the assumptions of chapter 2 hold, b is
Unbiased
Minimum variance
The model described so far has no restrictions (unrestricted)
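As a quick illustration, a minimal numpy sketch of the OLS estimator on simulated data (the design, sample size, and coefficient values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 100, 3                                   # hypothetical sample size and number of regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta_true = np.array([1.0, 0.5, 0.0])           # assumed true coefficients (illustration only)
y = X @ beta_true + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)           # OLS: b = (X'X)^-1 X'y
e = y - X @ b                                   # residuals
s2 = e @ e / (n - K)                            # s^2, the usual estimate of sigma^2
```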
5. 2. Restrictions Test a hypothesis → formulate a statistical model that contains the hypothesis as a restriction on its parameters.
Example: y = β1 + β2x1 + β3x2
with parameters (β1, β2, β3), test whether β2 = 0
6. 2. Linear Restrictions y = Xβ + ε
s.t. Rβ = q → J linear restrictions (R is J × K, q is J × 1)
J < K
Each row of R is one restriction, and the rows must be linearly independent (see the sketch below)
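For instance, testing β2 = 0 from the example above together with a second, purely hypothetical restriction β2 + β3 = 1 could be encoded as:

```python
import numpy as np

# Model: y = b1 + b2*x1 + b3*x2, beta = (b1, b2, b3)'
# Restriction 1 (from the example above): b2 = 0      -> row [0, 1, 0], q1 = 0
# Restriction 2 (hypothetical extra one):  b2 + b3 = 1 -> row [0, 1, 1], q2 = 1
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])   # J = 2 rows, K = 3 columns, rows linearly independent
q = np.array([0.0, 1.0])
```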
7. 3. Testing Hypotheses Two approaches:
1. First estimate the unrestricted model with OLS,
then test the restrictions on the estimated coefficients
2. Impose the restrictions and compare how well the restricted and unrestricted models fit
8. 3.1 First approach Test the J linear restrictions stated in the null hypothesis H0: Rβ – q = 0
Alternative hypothesis H1: Rβ – q ≠ 0
9. 3.1 First approach OLS gives an estimate of β, denoted b.
Discrepancy vector → m = Rb – q
If H0 is true, the deviation of m from 0 is just sampling error; H0 is rejected if the deviation is significant.
m is normally distributed, since b is normally distributed
E(m|X) = 0
Var(m|X) = R { Var(b|X) } R'
= σ² R (X'X)⁻¹ R'
10. 3.1 First approach Test H0 with the Wald criterion
W = m' { Var(m|X) }⁻¹ m
= (Rb – q)' [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q) / σ²
~ χ²(J)
The larger m (i.e., the more it deviates from 0), the larger the chi-square statistic and the more likely H0 is rejected.
11. 3.1 First Approach The result on the last slide only holds if σ² is known.
In practice → σ² is unknown and estimated by s²
→ the chi-square statistic is not usable
To solve this problem we construct an F statistic:
F = { (Rb – q)' [ R s² (X'X)⁻¹ R' ]⁻¹ (Rb – q) } / J
12. 3.1 First Approach
The numerator is distributed as χ²(J)/J
The denominator is distributed as χ²(n – K)/(n – K)
Ratio of two chi-squares → F(J, n – K)
(for the case with estimated variance)
13. 3.1 First Approach To summarize:
In case of known variance σ²:
W = (Rb – q)' [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q) / σ²
~ χ²(J)
In case of unknown variance:
F = { (Rb – q)' [ R s² (X'X)⁻¹ R' ]⁻¹ (Rb – q) } / J ~ F(J, n – K)
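A minimal numpy/scipy sketch of the F test on simulated data (the design, the tested restriction β3 = 0, and all numbers are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, K, J = 100, 3, 1                             # hypothetical sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)           # unrestricted OLS
e = y - X @ b
s2 = e @ e / (n - K)                            # s^2 replaces the unknown sigma^2
XtX_inv = np.linalg.inv(X.T @ X)

R = np.array([[0.0, 0.0, 1.0]])                 # H0: beta_3 = 0 (one restriction)
q = np.array([0.0])
m = R @ b - q                                   # discrepancy vector

# F = (Rb - q)' [R s^2 (X'X)^-1 R']^-1 (Rb - q) / J  ~  F(J, n - K) under H0
# (with s^2 in place of sigma^2, the Wald form equals J * F)
F = m @ np.linalg.solve(R @ (s2 * XtX_inv) @ R.T, m) / J
p_value = stats.f.sf(F, J, n - K)
```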
14. 3.1 Only one restriction A single restriction (J = 1) → simpler computation
We can test with the usual t-test: t = (r'b – q) / se(r'b) ~ t(n – K), where r' is the single row of R
15. 3.2 Second Approach Recall: impose the restrictions and compare how well the restricted and unrestricted models fit
The unrestricted model is known:
y = Xβ + ε
with b as estimator for β
and e as the residual vector (estimator for ε)
16. 3.2 Second Approach Restricted model:
y = Xβ + ε
s.t. Rβ = q
We have to find new estimates b* and e*:
min (y – Xb0)' (y – Xb0)
s.t. Rb0 = q
17. 3.2 Second Approach Lagrangian: L*(b0, λ) = (y – Xb0)' (y – Xb0) + 2λ'(Rb0 – q)
∂L*/∂b0 = –2 X'(y – Xb*) + 2 R'λ* = 0
∂L*/∂λ = 2 (Rb* – q) = 0
If X'X is nonsingular then we find explicit solutions:
b* = b – (X'X)⁻¹ R' [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q)
λ* = [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q)
Var[b*|X] = Var[b|X] – (a nonnegative definite matrix)
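A short sketch of b* computed from this formula, on hypothetical simulated data with a single illustrative restriction:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)
R = np.array([[0.0, 0.0, 1.0]])                 # hypothetical restriction: beta_3 = 0
q = np.array([0.0])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                           # unrestricted OLS
A = R @ XtX_inv @ R.T                           # R (X'X)^-1 R'

# b* = b - (X'X)^-1 R' [R (X'X)^-1 R']^-1 (Rb - q)
b_star = b - XtX_inv @ R.T @ np.linalg.solve(A, R @ b - q)
lambda_star = np.linalg.solve(A, R @ b - q)     # Lagrange multipliers
```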
18. 3.2 Second Approach Loss of fit:
The restricted least squares coefficients cannot fit better than the unrestricted solution
Let e* = y – Xb*
e* = y – Xb – X(b* – b) = e – X(b* – b)
New sum of squared deviations →
e*'e* = e'e + (b* – b)' X'X (b* – b) ≥ e'e
19. 3.2 Second Approach Loss of fit →
e*'e* – e'e = (Rb – q)' [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q)
→
F[J, n – K] = [ (e*'e* – e'e) / J ] / [ e'e / (n – K) ]
Dividing numerator and denominator by Σᵢ (yᵢ – ȳ)² →
F[J, n – K] = [ (R² – R*²) / J ] / [ (1 – R²) / (n – K) ]
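Sketched below on hypothetical simulated data; the F value computed from the two residual sums of squares coincides with the first-approach F statistic:

```python
import numpy as np

rng = np.random.default_rng(3)
n, K, J = 100, 3, 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)
R, q = np.array([[0.0, 0.0, 1.0]]), np.array([0.0])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                                                        # unrestricted
b_star = b - XtX_inv @ R.T @ np.linalg.solve(R @ XtX_inv @ R.T, R @ b - q)   # restricted

ee = (y - X @ b) @ (y - X @ b)                          # e'e
ee_star = (y - X @ b_star) @ (y - X @ b_star)           # e*'e*  (>= e'e)

F = ((ee_star - ee) / J) / (ee / (n - K))               # same F as in the first approach
```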
20. 4. Nonnormal disturbances and large sample tests The first sections assumed that the disturbances are normally distributed
Without this assumption the distributions of the statistics presented are not exactly F and chi-square.
Large samples:
The CLT tells us that the Wald statistic converges in distribution to a chi-square with J degrees of freedom
21. 4. Nonnormal disturbances and large sample tests Thm 6.1: Limiting distribution of the Wald statistic →
If √n (b – β) →d N[0, σ²Q⁻¹] (with Q = plim X'X/n)
and if H0: Rβ – q = 0 is true,
then
W = (Rb – q)' [ R s² (X'X)⁻¹ R' ]⁻¹ (Rb – q) = JF
→d χ²(J)
22. Summary section 6.1 – 6.4 y = Xβ + ε regression model
s.t. Rβ = q restrictions
Test H0: Rβ – q = 0 against H1: Rβ – q ≠ 0
2 approaches:
1. First estimate the unrestricted model, then test the restrictions
2. Impose the restrictions, compare how well the restricted and unrestricted models fit
23. 6.5 Testing Nonlinear Restrictions H0: c(β) = q
c(β) can be 1/β, 10β, etc.
z = (c(b) – q) / estimated standard error
Delta method: c(b) ≈ c(β) + (∂c(β)/∂β)'(b – β)
Unbiasedness? Consistency?
In general, E[c(b)] ≠ c(E[b]) → c(b) is generally biased, but it is consistent if b is consistent
Var[c(b)] ≈ (∂c(β)/∂β)' Var[b] (∂c(β)/∂β), estimated by g(b)' Var[b] g(b) with g(b) = ∂c(b)/∂b
24. 6.5 Testing Nonlinear Restrictions Extension to the general case of J restrictions:
G = ∂c(b)/∂b' (J × K Jacobian)
Est.Asy.Var[c(b)] = G {Est.Asy.Var[b]} G'
W = (c(b) – q)' {Est.Asy.Var[c(b)]}⁻¹ (c(b) – q)
W ~ χ²(J) (asymptotically, under H0)
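A sketch of this Wald test for a single hypothetical nonlinear restriction, c(β) = 1/β2 with H0: c(β) = 2, using the delta-method variance (all data and numbers are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)
V_b = s2 * np.linalg.inv(X.T @ X)               # Est.Asy.Var[b]

# Hypothetical nonlinear restriction: c(beta) = 1/beta_2, H0: c(beta) = 2
c_b = 1.0 / b[1]
G = np.array([[0.0, -1.0 / b[1] ** 2, 0.0]])    # Jacobian dc/db' evaluated at b
V_c = G @ V_b @ G.T                             # delta-method variance of c(b)

W = (c_b - 2.0) ** 2 / V_c[0, 0]                # Wald statistic, ~ chi2(1) under H0
p_value = stats.chi2.sf(W, df=1)
```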
25. 6.6 Prediction y0 = x0'β + ε0
Gauss–Markov: LS (i.e. b) is BLUE
Point prediction: ŷ0 = x0'b (estimates E[y0|x0] = x0'β)
Forecast error
e0 = y0 – ŷ0 = (β – b)'x0 + ε0
Prediction variance
Var[e0|X, x0] = σ² + Var[(β – b)'x0 | X, x0] = σ² + x0' [σ²(X'X)⁻¹] x0
26. 6.6 Prediction Prediction interval
= ŷ0 ± t(α/2) se(e0)
= ŷ0 ± t(α/2) √( s² + x0' [ s²(X'X)⁻¹ ] x0 )
RMSE – Root Mean Squared Error
= √[ (1/n0) Σ (yᵢ – ŷᵢ)² ]
MAE – Mean Absolute Error
= (1/n0) Σ |yᵢ – ŷᵢ|
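A short sketch of the point prediction and a 95% prediction interval for a hypothetical new observation x0 (simulated data, illustrative numbers):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)
XtX_inv = np.linalg.inv(X.T @ X)

x0 = np.array([1.0, 0.2, -1.0])                 # hypothetical new regressor vector
y0_hat = x0 @ b                                 # point prediction x0'b
se_e0 = np.sqrt(s2 + x0 @ (s2 * XtX_inv) @ x0)  # prediction standard error
t_crit = stats.t.ppf(0.975, df=n - K)           # alpha = 0.05
lower, upper = y0_hat - t_crit * se_e0, y0_hat + t_crit * se_e0
```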
27. Questions?