1. Chapter 6: Inference and Prediction
Marco Wijnakker
Sunny Cheng
2. Agenda Introduction + summary of foregoing chapters
Restrictions and nested models
2 approaches to testing hypotheses
Non-normal disturbances and large sample tests
Testing non-linear restrictions
Prediction
3. 1. Introduction
Classical Linear Regression Model (CLRM)
Estimation (chapters 3 to 5)
Hypothesis testing → chapter 6
Prediction → chapter 6
4. 1. Introduction Previous chapters:
Regression model
y = Xβ + ε
OLS estimator
b = (X'X)⁻¹X'y
If the assumptions of chapter 2 hold, b is
Unbiased
Minimum variance
The model described so far has no restrictions (unrestricted)
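As a quick illustration, a minimal numpy sketch of the OLS estimator on simulated data (the design, sample size, and coefficient values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 100, 3                                   # hypothetical sample size and number of regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta_true = np.array([1.0, 0.5, 0.0])           # assumed true coefficients (illustration only)
y = X @ beta_true + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)           # OLS: b = (X'X)^-1 X'y
e = y - X @ b                                   # residuals
s2 = e @ e / (n - K)                            # s^2, the usual estimate of sigma^2
```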
5. 2. Restrictions Test a hypothesis → formulate a statistical model that contains the hypothesis as a restriction on its parameters.
Example: y = β1 + β2x1 + β3x2
with parameters (β1, β2, β3), test whether β2 = 0
6. 2. Linear Restrictions y = Xβ + ε
s.t. Rβ = q → J linear restrictions (R is J × K, q is J × 1)
J < K
Each row of R is one restriction, and the rows must be linearly independent (see the sketch below)
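For instance, testing β2 = 0 from the example above together with a second, purely hypothetical restriction β2 + β3 = 1 could be encoded as:

```python
import numpy as np

# Model: y = b1 + b2*x1 + b3*x2, beta = (b1, b2, b3)'
# Restriction 1 (from the example above): b2 = 0      -> row [0, 1, 0], q1 = 0
# Restriction 2 (hypothetical extra one):  b2 + b3 = 1 -> row [0, 1, 1], q2 = 1
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])   # J = 2 rows, K = 3 columns, rows linearly independent
q = np.array([0.0, 1.0])
```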
7. 3. Testing Hypotheses Two approaches:
1. First estimate the unrestricted model with OLS,
then test the restrictions on the estimated coefficients
2. Impose the restrictions and compare how well the restricted and unrestricted models fit
8. 3.1 First approach Test the J linear restrictions stated in the null hypothesis H0: Rβ – q = 0
Alternative hypothesis H1: Rβ – q ≠ 0
9. 3.1 First approach OLS gives an estimate of β, denoted b.
Discrepancy vector → m = Rb – q
If H0 is true, the deviation of m from 0 is just sampling error; H0 is rejected if the deviation is significant.
m is normally distributed, since b is normally distributed
E(m|X) = 0
Var(m|X) = R { Var(b|X) } R'
= σ² R (X'X)⁻¹ R'
10. 3.1 First approach Test H0 with the Wald criterion
W = m' { Var(m|X) }⁻¹ m
= (Rb – q)' [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q) / σ²
~ χ²(J)
The larger m (i.e., the more it deviates from 0), the larger the chi-square statistic and the more likely H0 is rejected.
11. 3.1 First Approach The result on the last slide only holds if σ² is known.
In practice → σ² is unknown and estimated by s²
→ the chi-square statistic is not usable
To solve this problem we construct an F statistic:
F = { (Rb – q)' [ R s² (X'X)⁻¹ R' ]⁻¹ (Rb – q) } / J
12. 3.1 First Approach
The numerator is distributed as χ²(J)/J
The denominator is distributed as χ²(n – K)/(n – K)
Ratio of two chi-squares → F(J, n – K)
(for the case with estimated variance)
13. 3.1 First Approach To summarize:
In case of known variance σ²:
W = (Rb – q)' [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q) / σ²
~ χ²(J)
In case of unknown variance:
F = { (Rb – q)' [ R s² (X'X)⁻¹ R' ]⁻¹ (Rb – q) } / J ~ F(J, n – K)
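A minimal numpy/scipy sketch of the F test on simulated data (the design, the tested restriction β3 = 0, and all numbers are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, K, J = 100, 3, 1                             # hypothetical sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)           # unrestricted OLS
e = y - X @ b
s2 = e @ e / (n - K)                            # s^2 replaces the unknown sigma^2
XtX_inv = np.linalg.inv(X.T @ X)

R = np.array([[0.0, 0.0, 1.0]])                 # H0: beta_3 = 0 (one restriction)
q = np.array([0.0])
m = R @ b - q                                   # discrepancy vector

# F = (Rb - q)' [R s^2 (X'X)^-1 R']^-1 (Rb - q) / J  ~  F(J, n - K) under H0
# (with s^2 in place of sigma^2, the Wald form equals J * F)
F = m @ np.linalg.solve(R @ (s2 * XtX_inv) @ R.T, m) / J
p_value = stats.f.sf(F, J, n - K)
```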
14. 3.1 Only one restriction A single restriction (J = 1) → simpler computation
We can test with the usual t-test: t = (r'b – q) / se(r'b) ~ t(n – K), where r' is the single row of R
15. 3.2 Second Approach Recall: impose the restrictions and compare how well the restricted and unrestricted models fit
The unrestricted model is known:
y = Xβ + ε
with b as estimator for β
and e as the residual vector (estimator for ε)
16. 3.2 Second Approach Restricted model:
y = Xβ + ε
s.t. Rβ = q
We have to find new estimates b* and e*:
min (y – Xb0)' (y – Xb0)
s.t. Rb0 = q
17. 3.2 Second Approach Lagrangian: L*(b0, λ) = (y – Xb0)' (y – Xb0) + 2λ'(Rb0 – q)
∂L*/∂b0 = –2 X'(y – Xb*) + 2 R'λ* = 0
∂L*/∂λ = 2 (Rb* – q) = 0
If X'X is nonsingular then we find explicit solutions:
b* = b – (X'X)⁻¹ R' [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q)
λ* = [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q)
Var[b*|X] = Var[b|X] – (a nonnegative definite matrix)
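A short sketch of b* computed from this formula, on hypothetical simulated data with a single illustrative restriction:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)
R = np.array([[0.0, 0.0, 1.0]])                 # hypothetical restriction: beta_3 = 0
q = np.array([0.0])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                           # unrestricted OLS
A = R @ XtX_inv @ R.T                           # R (X'X)^-1 R'

# b* = b - (X'X)^-1 R' [R (X'X)^-1 R']^-1 (Rb - q)
b_star = b - XtX_inv @ R.T @ np.linalg.solve(A, R @ b - q)
lambda_star = np.linalg.solve(A, R @ b - q)     # Lagrange multipliers
```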
18. 3.2 Second Approach Loss of fit:
The restricted least squares coefficients cannot fit better than the unrestricted solution
Let e* = y – Xb*
e* = y – Xb – X(b* – b) = e – X(b* – b)
New sum of squared deviations →
e*'e* = e'e + (b* – b)' X'X (b* – b) ≥ e'e
19. 3.2 Second Approach Loss of fit →
e*'e* – e'e = (Rb – q)' [ R (X'X)⁻¹ R' ]⁻¹ (Rb – q)
→
F[J, n – K] = [ (e*'e* – e'e) / J ] / [ e'e / (n – K) ]
Dividing numerator and denominator by Σᵢ (yᵢ – ȳ)² →
F[J, n – K] = [ (R² – R*²) / J ] / [ (1 – R²) / (n – K) ]
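Sketched below on hypothetical simulated data; the F value computed from the two residual sums of squares coincides with the first-approach F statistic:

```python
import numpy as np

rng = np.random.default_rng(3)
n, K, J = 100, 3, 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)
R, q = np.array([[0.0, 0.0, 1.0]]), np.array([0.0])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                                                        # unrestricted
b_star = b - XtX_inv @ R.T @ np.linalg.solve(R @ XtX_inv @ R.T, R @ b - q)   # restricted

ee = (y - X @ b) @ (y - X @ b)                          # e'e
ee_star = (y - X @ b_star) @ (y - X @ b_star)           # e*'e*  (>= e'e)

F = ((ee_star - ee) / J) / (ee / (n - K))               # same F as in the first approach
```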
20. 4. Nonnormal disturbances and large sample tests The first sections assumed that the disturbances are normally distributed
Without this assumption the distributions of the statistics presented are not exactly F and chi-square.
Large samples:
The CLT tells us that the Wald statistic converges in distribution to a chi-square with J degrees of freedom
21. 4. Nonnormal disturbances and large sample tests Thm 6.1: Limiting distribution of the Wald statistic →
If √n (b – β) →d N[0, σ²Q⁻¹] (with Q = plim X'X/n)
and if H0: Rβ – q = 0 is true,
then
W = (Rb – q)' [ R s² (X'X)⁻¹ R' ]⁻¹ (Rb – q) = JF
→d χ²(J)
22. Summary section 6.1 – 6.4 y = Xβ + ε regression model
s.t. Rβ = q restrictions
Test H0: Rβ – q = 0 against H1: Rβ – q ≠ 0
2 approaches:
1. First estimate the unrestricted model, then test the restrictions
2. Impose the restrictions, compare how well the restricted and unrestricted models fit
23. 6.5 Testing Nonlinear Restrictions H0: c(β) = q
c(β) can be 1/β, 10β, etc.
z = (c(b) – q) / estimated standard error
Delta method: c(b) ≈ c(β) + (∂c(β)/∂β)'(b – β)
Unbiasedness? Consistency?
In general, E[c(b)] ≠ c(E[b]) → c(b) is generally biased, but it is consistent if b is consistent
Var[c(b)] ≈ (∂c(β)/∂β)' Var[b] (∂c(β)/∂β), estimated by g(b)' Var[b] g(b) with g(b) = ∂c(b)/∂b
24. 6.5 Testing Nonlinear Restrictions Extension to the general case of J restrictions:
G = ∂c(b)/∂b' (J × K Jacobian)
Est.Asy.Var[c(b)] = G {Est.Asy.Var[b]} G'
W = (c(b) – q)' {Est.Asy.Var[c(b)]}⁻¹ (c(b) – q)
W ~ χ²(J) (asymptotically, under H0)
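A sketch of this Wald test for a single hypothetical nonlinear restriction, c(β) = 1/β2 with H0: c(β) = 2, using the delta-method variance (all data and numbers are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)
V_b = s2 * np.linalg.inv(X.T @ X)               # Est.Asy.Var[b]

# Hypothetical nonlinear restriction: c(beta) = 1/beta_2, H0: c(beta) = 2
c_b = 1.0 / b[1]
G = np.array([[0.0, -1.0 / b[1] ** 2, 0.0]])    # Jacobian dc/db' evaluated at b
V_c = G @ V_b @ G.T                             # delta-method variance of c(b)

W = (c_b - 2.0) ** 2 / V_c[0, 0]                # Wald statistic, ~ chi2(1) under H0
p_value = stats.chi2.sf(W, df=1)
```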
25. 6.6 Prediction y0 = x0'β + ε0
Gauss–Markov: LS (i.e. b) is BLUE
Point prediction: ŷ0 = x0'b (estimates E[y0|x0] = x0'β)
Forecast error
e0 = y0 – ŷ0 = (β – b)'x0 + ε0
Prediction variance
Var[e0|X, x0] = σ² + Var[(β – b)'x0 | X, x0] = σ² + x0' [σ²(X'X)⁻¹] x0
26. 6.6 Prediction Prediction interval
= ŷ0 ± t(α/2) se(e0)
= ŷ0 ± t(α/2) √( s² + x0' [ s²(X'X)⁻¹ ] x0 )
RMSE – Root Mean Squared Error
= √[ (1/n0) Σ (yᵢ – ŷᵢ)² ]
MAE – Mean Absolute Error
= (1/n0) Σ |yᵢ – ŷᵢ|
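A short sketch of the point prediction and a 95% prediction interval for a hypothetical new observation x0 (simulated data, illustrative numbers):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)
XtX_inv = np.linalg.inv(X.T @ X)

x0 = np.array([1.0, 0.2, -1.0])                 # hypothetical new regressor vector
y0_hat = x0 @ b                                 # point prediction x0'b
se_e0 = np.sqrt(s2 + x0 @ (s2 * XtX_inv) @ x0)  # prediction standard error
t_crit = stats.t.ppf(0.975, df=n - K)           # alpha = 0.05
lower, upper = y0_hat - t_crit * se_e0, y0_hat + t_crit * se_e0
```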
27. Questions?