Lecture 5: Topics in Applied Econometrics A
Asymptotics • Until now, we have covered finite sample, or small sample, properties of the OLS estimators • For example, we derived the unbiasedness of the OLS estimator under Assumptions 1–4 • It's a finite sample property because it holds for any n (n > k + 1) • We also added the assumption that u is normally distributed and independent of the explanatory variables • This allowed us to show that the OLS estimators have normal sampling distributions, which leads to the t and F distributions for the t and F statistics
Asymptotics • In addition to finite sample properties, it is important to also know the asymptotic or large sample properties of estimators and test statistics. • These are properties that are defined as the sample size grows without bound • One important finding is that even without the normality assumption, t and F statistics have approximately t and F distributions
Detour on asymptotics • Though not all useful estimators are unbiased, virtually all economists agree that a minimal requirement for an estimator is consistency • meaning that as n → ∞, the distribution of the estimator collapses to the parameter value • We generally have a fixed sample, so consistency involves a thought experiment about what would happen as the sample size gets large • If obtaining more data does not get us closer to the parameter value of interest, then we are using a poor estimation procedure.
Sampling Distributions as n → ∞ • [Figure: sampling distributions of the estimator for sample sizes n1 < n2 < n3; as n increases, the distribution becomes more tightly concentrated around β1]
Consistency of OLS • Under the Gauss-Markov assumptions, the OLS estimator is consistent (and unbiased) • Consistency can be proved for the simple regression case in a manner similar to the proof of unbiasedness • Will need to take probability limit (plim) to establish consistency
A few words on plims • Consistency: Let Wn be an estimator of θ based on a sample Y1, Y2, …, Yn of size n. Then Wn is a consistent estimator of θ if for every ε > 0, P(|Wn − θ| > ε) → 0 as n → ∞ • When Wn is consistent, we also say that θ is the probability limit of Wn, written as plim(Wn) = θ • In other words, the distribution of Wn becomes more and more concentrated about θ as n grows. • If an estimator is not consistent, then it does not help us to learn about θ, even with an unlimited amount of data.
Example: plim • Unbiased estimators are not necessarily consistent, but those whose variances shrink to zero as the sample size grows are consistent. • Formally: If Wn is an unbiased estimator of θ and Var(Wn) → 0 as n → ∞, then plim(Wn) = θ. • We know the sample average ȳ is unbiased for μ. We also know that Var(ȳ) = σ²/n. • As n → ∞, σ²/n → 0, so ȳ is a consistent estimator of μ (as well as being unbiased)
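A quick simulation, not part of the original slides, illustrates this: draw repeated samples of increasing size from a population with mean μ (the normal population and the values μ = 5, σ = 2 below are arbitrary choices) and watch the spread of ȳ shrink like σ/√n.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 5.0, 2.0            # illustrative population mean and standard deviation

for n in [10, 100, 1000, 10000]:
    # 5,000 samples of size n; compute the sample average of each
    ybar = rng.normal(mu, sigma, size=(5000, n)).mean(axis=1)
    # Var(ybar) = sigma^2 / n, so ybar concentrates around mu as n grows
    print(f"n = {n:6d}: sd(ybar) = {ybar.std():.4f}   (theory: {sigma / np.sqrt(n):.4f})")
```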
Consistency of OLS (cont) • In the simple regression case, the OLS slope can be written as β̂1 = β1 + [(1/n) Σ (xi − x̄) ui] / [(1/n) Σ (xi − x̄)²] • Here we applied the Law of Large Numbers: the numerator converges in probability to Cov(x, u) = 0 and the denominator to Var(x) > 0, so plim β̂1 = β1
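The same argument can be checked numerically. The sketch below is only an illustration under an assumed data-generating process (β0 = 1, β1 = 2, standard normal x and u); it computes the OLS slope as the sample covariance over the sample variance and shows it settling near β1 as n grows.

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1 = 1.0, 2.0          # assumed true parameters for the illustration

for n in [50, 500, 5000, 50000]:
    x = rng.normal(0, 1, n)
    u = rng.normal(0, 1, n)      # E(u) = 0 and u is independent of x
    y = beta0 + beta1 * x + u
    # OLS slope = sample covariance of (x, y) over sample variance of x
    b1_hat = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    print(f"n = {n:6d}: b1_hat = {b1_hat:.4f}")
```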
A Weaker Assumption • For unbiasedness, we assumed a zero conditional mean – E(u|x1, x2,…,xk) = 0 • For consistency, we can have the weaker assumption of zero mean and zero correlation – E(u) = 0 and Cov(xj,u) = 0, for j = 1, 2, …, k • Without this assumption, OLS will be biased and inconsistent!
Deriving the Inconsistency • Just as we could derive the omitted variable bias earlier, now we want to think about the inconsistency, or asymptotic bias, in this case • In the simple regression case, if Cov(x1, u) ≠ 0, then plim β̂1 = β1 + Cov(x1, u)/Var(x1), so the asymptotic bias is Cov(x1, u)/Var(x1)
Asymptotic Bias (cont) • So, thinking about the direction of the asymptotic bias is just like thinking about the direction of bias for an omitted variable • Main difference is that asymptotic bias uses the population variance and covariance, while bias uses the sample counterparts • Remember, inconsistency is a large sample problem – it doesn’t go away as we add data
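A small simulation, again an illustration rather than anything from the slides, makes the "large sample problem" concrete: when Cov(x, u) ≠ 0, the OLS slope settles at β1 + Cov(x, u)/Var(x) instead of β1, no matter how large n gets.

```python
import numpy as np

rng = np.random.default_rng(2)
beta1 = 2.0
rho = 0.5                        # builds Cov(x, u) = 0.5 into the errors (illustrative)

for n in [100, 1000, 10000, 100000]:
    x = rng.normal(0, 1, n)
    u = rho * x + rng.normal(0, 1, n)    # error correlated with the regressor
    y = beta1 * x + u
    b1_hat = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    # plim b1_hat = beta1 + Cov(x, u)/Var(x) = 2.5; adding data does not fix this
    print(f"n = {n:7d}: b1_hat = {b1_hat:.4f}   (plim = {beta1 + rho:.1f})")
```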
Large Sample Inference • Recall that under the CLM assumptions, the sampling distributions are normal, so we could derive t and F distributions for testing • This exact normality was due to assuming the population error distribution was normal • This assumption of normal errors implied that the distribution of y, given the x’s, was normal as well
Large Sample Inference (cont) • Easy to come up with examples for which this exact normality assumption will fail • Any clearly skewed variable, like wages, arrests, savings, etc. can’t be normal, because a normal distribution is symmetric • Normality assumption not needed to conclude OLS is BLUE, only for inference (t-stats, confidence intervals, etc)
Central Limit Theorem • Based on the central limit theorem, we can show that the OLS estimators are asymptotically normal • Asymptotic normality means that P(Z ≤ z) → Φ(z) as n → ∞, so that P(Z ≤ z) ≈ Φ(z) in large samples • The central limit theorem states that the standardized average of any population with mean μ and variance σ² is asymptotically standard normal: Zn = (Ȳn − μ)/(σ/√n) is asymptotically distributed as N(0, 1)
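As an illustration (not from the slides), the sketch below draws from a heavily skewed exponential population and checks that the standardized sample average nonetheless behaves like a standard normal once n is moderately large.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma = 1.0, 1.0             # exponential(1) population: skewed, mean 1, variance 1

for n in [5, 30, 200]:
    y = rng.exponential(1.0, size=(20000, n))
    z = (y.mean(axis=1) - mu) / (sigma / np.sqrt(n))   # standardized sample average
    # under the CLT, P(Z <= 1.96) should approach Phi(1.96) ≈ 0.975
    print(f"n = {n:4d}: P(Z <= 1.96) ≈ {(z <= 1.96).mean():.3f}")
```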
Asymptotic Normality (cont) • Because the t distribution approaches the normal distribution for large df, we can also say that (β̂j − βj)/se(β̂j) is approximately distributed as t with n − k − 1 df (equivalently, as standard normal) in large samples, so the usual t statistics remain valid • Note that while we no longer need to assume normality with a large sample, we do still need homoskedasticity
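To see the practical payoff, here is a rough sketch (an assumed setup, not an example from the slides): with skewed, non-normal errors, the t statistic for a true null hypothesis still rejects at roughly the nominal 5% rate in a large sample.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 500, 5000
beta1 = 0.0                      # H0: beta1 = 0 is true in this design

tstats = np.empty(reps)
for r in range(reps):
    x = rng.normal(0, 1, n)
    u = rng.exponential(1.0, n) - 1.0    # skewed error with mean zero, homoskedastic
    y = beta1 * x + u
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    sigma2 = resid @ resid / (n - 2)
    se_b1 = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    tstats[r] = b[1] / se_b1

# with a valid asymptotic approximation, about 5% of |t| values exceed 1.96
print(f"rejection rate at the 5% level: {(np.abs(tstats) > 1.96).mean():.3f}")
```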
Asymptotic Standard Errors • If u is not normally distributed, we sometimes refer to the standard error as an asymptotic standard error, since se(β̂j) ≈ cj/√n, where cj is a positive constant that does not depend on n • So, we can expect standard errors to shrink at a rate proportional to the inverse of √n
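A quick check of the √n rate (an arbitrary simulated design, not from the slides): quadrupling the sample size should roughly halve the reported standard error.

```python
import numpy as np

rng = np.random.default_rng(5)

def slope_se(n):
    # simulate one data set of size n and return the usual OLS slope standard error
    x = rng.normal(0, 1, n)
    y = 1.0 + 2.0 * x + rng.normal(0, 1, n)
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    sigma2 = resid @ resid / (n - 2)
    return np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

print(f"se at n = 1000: {slope_se(1000):.4f},  se at n = 4000: {slope_se(4000):.4f}")
```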
Lagrange Multiplier statistic • Once we are using large samples and relying on asymptotic normality for inference, we can use more than just t and F statistics • The Lagrange multiplier or LM statistic is an alternative for testing multiple exclusion restrictions • Because the LM statistic uses an auxiliary regression, it's sometimes called an nR² statistic • Definition: an auxiliary regression is a regression used to compute a test statistic (for example) where we don't really care about the estimated coefficients.
LM Statistic (cont) • Suppose we have a standard model, y = β0 + β1x1 + β2x2 + … + βkxk + u, and our null hypothesis is • H0: βk−q+1 = 0, … , βk = 0 • First, we run the restricted model (dropping the last q regressors) and save its residuals ũ • Then we regress ũ on all of the x's and form LM = nR²ũ, which is asymptotically distributed as χ²q under H0
LM Statistic (cont) • If our null hypothesis is true, the R² from a regression of ũ on the x's should be "close" to zero • With a large sample, the result from an F test and from an LM test should be similar • Unlike the F test and t test for one exclusion restriction, the LM test and F test will not be identical
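A minimal sketch of the nR² computation (the variable names and data-generating process below are made up for illustration): estimate the restricted model, regress its residuals on all of the regressors, and compare LM = nR² with a χ² critical value with q degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1000
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 0.5 * x1 + rng.normal(size=n)      # x2 and x3 truly have zero coefficients

def ols_resid(y, X):
    # residuals from an OLS fit (X already contains a constant)
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return y - X @ b

# 1. run the restricted model (H0: coefficients on x2 and x3 are zero), keep residuals
u_tilde = ols_resid(y, np.column_stack([np.ones(n), x1]))

# 2. auxiliary regression of the residuals on ALL of the regressors
X_full = np.column_stack([np.ones(n), x1, x2, x3])
resid_aux = ols_resid(u_tilde, X_full)
R2_aux = 1.0 - resid_aux.var() / u_tilde.var()

# 3. LM = n * R^2 is asymptotically chi-squared with q = 2 df under H0
LM = n * R2_aux
print(f"LM = {LM:.3f}   (5% critical value of chi2(2) ≈ 5.99)")
```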