1. Applied Econometrics William Greene
Department of Economics
Stern School of Business
2. Applied Econometrics. 6. Finite Sample Properties of the Least Squares Estimator
3. Terms of Art Estimates and estimators
Properties of an estimator - the sampling distribution
Finite sample properties as opposed to asymptotic or large sample properties
4. The Statistical Context of Least Squares Estimation The sample of data from the population
The stochastic specification of the regression model
Endowment of the stochastic properties of the model upon the least squares estimator
5. Least Squares: b = (X′X)⁻¹X′y = β + (X′X)⁻¹X′ε
6. Deriving the Properties So, b = a parameter vector + a linear combination of the disturbances, each times a vector.
Therefore, b is a vector of random variables. We analyze it as such.
The assumption of nonstochastic regressors. How it is used at this point.
We do the analysis conditional on an X, then show that the results do not depend on the particular X in hand, so the result must be general, i.e., independent of X.
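A minimal numpy sketch (simulated data; the design, coefficients, and seed are all hypothetical) verifying the decomposition b = β + (X′X)⁻¹X′ε numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, -2.0, 0.5])         # hypothetical true parameter vector
eps = rng.normal(scale=0.8, size=n)       # disturbances
y = X @ beta + eps

# Least squares: b = (X'X)^-1 X'y
b = np.linalg.solve(X.T @ X, X.T @ y)

# The decomposition b = beta + (X'X)^-1 X'eps holds exactly:
assert np.allclose(b, beta + np.linalg.solve(X.T @ X, X.T @ eps))
print(b)
```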
7. Properties of the LS Estimator Expected value and the property of unbiasedness. E[b|X] = β = E[b]. Prove this result.
A Crucial Result About Specification:
y = X1β1 + X2β2 + ε
Two sets of variables. What if the regression is computed without the second set of variables?
What is the expectation of the "short" regression estimator?
b1 = (X1′X1)⁻¹X1′y
8. The Left Out Variable Formula (This is a VVIR: a very, very important result!)
E[b1] = β1 + (X1′X1)⁻¹X1′X2β2
The (truly) short regression estimator is biased.
Application:
Quantity = β1Price + β2Income + ε
If you regress Quantity on Price and leave out Income, what do you get? (Application below.)
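A quick Monte Carlo sketch of this formula (all numbers hypothetical; the regressors are held fixed across replications to match the conditional-on-X analysis):

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 5000
beta1, beta2 = 1.5, -1.0                  # hypothetical true coefficients

# Correlated regressors, so (X1'X1)^-1 X1'X2 is nonzero.
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)

b1_short = np.empty(reps)
for r in range(reps):
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    b1_short[r] = (x1 @ y) / (x1 @ x1)    # regress y on x1 only

# Left-out-variable formula: E[b1] = beta1 + (x1'x1)^-1 x1'x2 beta2
predicted = beta1 + (x1 @ x2) / (x1 @ x1) * beta2
print(b1_short.mean(), predicted)         # the two agree closely
```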
9. The Extra Variable Formula A Second Crucial Result About Specification:
y = X1β1 + X2β2 + ε, but β2 really is 0.
Two sets of variables. One is superfluous. What if the regression is computed with it anyway?
The Extra Variable Formula: (This is a VIR: a very important result!)
E[b1.2 | β2 = 0] = β1
The long regression estimator in a short regression is unbiased.
Extra variables in a model do not induce biases. Why not just include them, then? We'll pursue this later.
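A companion sketch (again with invented numbers) showing that including a truly superfluous x2 leaves b1 unbiased:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 5000
beta1 = 1.5
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                   # truly superfluous: beta2 = 0
X = np.column_stack([x1, x2])

estimates = np.empty(reps)
for r in range(reps):
    y = beta1 * x1 + rng.normal(size=n)   # x2 plays no role in the truth
    b = np.linalg.solve(X.T @ X, X.T @ y) # ...but is included anyway
    estimates[r] = b[0]

print(estimates.mean())                   # ~1.5: the extra variable adds no bias
```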
10. Application: Left out Variable Leave out Income. What do you get?
E[b1] = β1 + (Cov[Price,Income]/Var[Price]) β2
In time series data, β1 < 0, β2 > 0 (usually)
Cov[Price,Income] > 0 in time series data.
So, the short regression will overestimate the price coefficient.
Simple Regression of G on a constant and PG
Price Coefficient should be negative.
11. Estimated Demand Equation: Shouldn't the Price Coefficient be Negative?
12. Multiple Regression of G on Y and PG. The Theory Works!
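A hypothetical simulation in the spirit of this example (invented data standing in for G, PG, and income) reproduces the sign reversal and its correction:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Trending income, and a price that moves with it: Cov[Price, Income] > 0.
income = np.cumsum(rng.normal(0.5, 1.0, size=n))
price = 0.3 * income + rng.normal(size=n)

beta_price, beta_income = -0.5, 1.0       # beta1 < 0, beta2 > 0, as in the text
quantity = beta_price * price + beta_income * income + rng.normal(size=n)

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

const = np.ones(n)
b_short = ols(np.column_stack([const, price]), quantity)
b_long = ols(np.column_stack([const, price, income]), quantity)

print("short regression price coefficient:", b_short[1])  # typically positive!
print("long regression price coefficient: ", b_long[1])   # close to -0.5
```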
13. Variance of the Least Squares Estimator: Var[b|X] = σ²(X′X)⁻¹
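A short Monte Carlo check (hypothetical design and σ) that the sampling covariance of b matches σ²(X′X)⁻¹:

```python
import numpy as np

rng = np.random.default_rng(4)
n, K, reps, sigma = 100, 3, 5000, 0.8
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, -2.0, 0.5])
XtX_inv = np.linalg.inv(X.T @ X)

draws = np.empty((reps, K))
for r in range(reps):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    draws[r] = XtX_inv @ X.T @ y

print(np.cov(draws, rowvar=False))        # empirical sampling covariance of b
print(sigma**2 * XtX_inv)                 # theoretical sigma^2 (X'X)^-1
```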
14. Gauss-Markov Theorem A theorem of Gauss and Markov: Least Squares is the MVLUE (minimum variance linear unbiased estimator)
1. Linear estimator
2. Unbiased: E[b|X] = β
Comparing positive definite matrices:
Var[c|X] - Var[b|X] is nonnegative definite for any other linear and unbiased estimator c. What are the implications?
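A numerical sketch of this comparison (the weighting matrix W is an arbitrary, invented choice; any such c = (X′WX)⁻¹X′Wy is linear and unbiased):

```python
import numpy as np

rng = np.random.default_rng(5)
n, K, sigma = 100, 3, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])

# OLS: b = (X'X)^-1 X'y, with Var[b|X] = sigma^2 (X'X)^-1.
V_b = sigma**2 * np.linalg.inv(X.T @ X)

# Another linear unbiased estimator: c = (X'WX)^-1 X'W y = A y.
# Unbiased for any positive weights W because A X = I.
W = np.diag(rng.uniform(0.5, 2.0, size=n))
A = np.linalg.inv(X.T @ W @ X) @ X.T @ W
V_c = sigma**2 * A @ A.T                  # Var[c|X] under homoskedasticity

# Gauss-Markov: Var[c|X] - Var[b|X] is nonnegative definite.
print(np.linalg.eigvalsh(V_c - V_b))      # eigenvalues >= 0 (up to rounding)
```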
15. Aspects of the Gauss-Markov Theorem Indirect proof: Any other linear unbiased estimator has a larger covariance matrix.
Direct proof: Find the minimum variance linear unbiased estimator
Other estimators
Biased estimation - a minimum mean squared error estimator. Is there a biased estimator with a smaller dispersion?
Normally distributed disturbances - the Rao-Blackwell result. (General observation: for normally distributed disturbances, "linear" is superfluous.)
Nonnormal disturbances - Least Absolute Deviations and other nonparametric approaches
16. Specification Errors-1 Omitting relevant variables: Suppose the correct model is
y = X1β1 + X2β2 + ε. I.e., two sets of variables.
Compute least squares omitting X2. Some easily proved results:
Var[b1] is smaller than Var[b1.2]. (The latter is the northwest submatrix of the full covariance matrix; the proof uses the residual maker, again.) I.e., you get a smaller variance when you omit X2. One interpretation: omitting X2 amounts to using extra information, namely β2 = 0. Even if the information is wrong (see the next result), it reduces the variance. This is an important result.
17. Omitted Variables (No free lunch) E[b1] = β1 + (X1′X1)⁻¹X1′X2β2 ≠ β1. So, b1 is biased(!). The bias can be huge; it can reverse the sign of a price coefficient in a demand equation.
b1 may be more precise.
Precision = Mean squared error
= variance + squared bias.
The short regression has smaller variance but positive bias; if the bias is small, mean squared error may still favor the short regression.
(Free lunch?) Suppose X1′X2 = 0. Then the bias goes away. Interpretation: the information (β2 = 0) is not right, but it is irrelevant; b1 is the same as b1.2.
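A sketch of the bias-variance tradeoff (β2 is deliberately chosen small; all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 50, 10000
beta1, beta2 = 1.0, 0.15                  # beta2 small: modest omitted-variable bias
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)        # correlated regressors
X = np.column_stack([x1, x2])

b_short = np.empty(reps)
b_long = np.empty(reps)
for r in range(reps):
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    b_short[r] = (x1 @ y) / (x1 @ x1)     # omit x2
    b_long[r] = np.linalg.solve(X.T @ X, X.T @ y)[0]

for name, b in (("short", b_short), ("long ", b_long)):
    print(name, "variance:", b.var(), " MSE:", np.mean((b - beta1) ** 2))
# The short regression always has the smaller variance; whether its MSE also
# wins depends on the size of the bias, which is driven by beta2.
```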
18. Specification Errors-2 Including superfluous variables: Just reverse the results.
Including superfluous variables increases variance. (The cost of not using information.)
Does not cause a bias, because if the variables in X2 are truly superfluous, then β2 = 0, so E[b1.2] = β1.
19. Linear Restrictions Context: How do linear restrictions affect the properties of the least squares estimator?
Model: y = Xβ + ε
Theory (information): Rβ - q = 0
Restricted least squares estimator:
b* = b - (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹(Rb - q)
Expected value: E[b*|X] = β - (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹(Rβ - q)
Variance:
σ²(X′X)⁻¹ - σ²(X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹R(X′X)⁻¹
Var[b*] = Var[b] - a nonnegative definite matrix, so Var[b*] ≤ Var[b]
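A numerical sketch of the restricted estimator (the restriction R and the data are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma = 100, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -2.0])         # satisfies the restriction below
R = np.array([[0.0, 1.0, 1.0]])           # restriction: the two slopes sum to 0
q = np.array([0.0])

y = X @ beta + rng.normal(scale=sigma, size=n)
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                     # unrestricted least squares

# b* = b - (X'X)^-1 R' [R (X'X)^-1 R']^-1 (R b - q)
M = XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T)
b_star = b - M @ (R @ b - q)
print(R @ b_star - q)                     # ~0: b* satisfies the restriction exactly

# Var[b*] = sigma^2 (X'X)^-1 - sigma^2 (X'X)^-1 R'[R(X'X)^-1 R']^-1 R(X'X)^-1
V_b = sigma**2 * XtX_inv
V_star = V_b - sigma**2 * M @ R @ XtX_inv
print(np.diag(V_b) - np.diag(V_star))     # per-coefficient variance reduction >= 0
```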
20. Interpretation Case 1: Theory is correct: Rβ - q = 0 (the restrictions do hold).
b* is unbiased
Var[b*] is smaller than Var[b]
How do we know this?
Case 2: Theory is incorrect: Rβ - q ≠ 0 (the restrictions do not hold).
b* is biased - what does this mean?
Var[b*] is still smaller than Var[b]
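A Monte Carlo sketch contrasting the two cases (restriction and coefficients invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(8)
n, reps, sigma = 100, 5000, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
R, q = np.array([[0.0, 1.0, 1.0]]), np.array([0.0])
XtX_inv = np.linalg.inv(X.T @ X)
M = XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T)

def simulate(beta):
    b = np.empty((reps, 3))
    b_star = np.empty((reps, 3))
    for r in range(reps):
        y = X @ beta + rng.normal(scale=sigma, size=n)
        b[r] = XtX_inv @ X.T @ y
        b_star[r] = b[r] - M @ (R @ b[r] - q)
    return b, b_star

for beta in (np.array([1.0, 2.0, -2.0]),  # Case 1: R beta - q = 0 holds
             np.array([1.0, 2.0, -1.0])): # Case 2: the restriction is false
    b, b_star = simulate(beta)
    print("true:", beta, " mean b*:", b_star.mean(axis=0))
    print("variance reduction:", b.var(axis=0) - b_star.var(axis=0))
```

In Case 1 the mean of b* sits on the true β; in Case 2 it does not, yet the variance reduction is positive in both cases.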
21. Restrictions and Information How do we interpret this important result?
The theory is "information"
Bad information leads us away from "the truth"
Any information, good or bad, makes us more certain of our answer. In this context, any information reduces variance.
What about ignoring the information?
Not using the correct information does not lead us away from "the truth"
Not using the information foregoes the variance reduction - i.e., does not use the ability to reduce "uncertainty."