Linear Statistical Models with General Covariance Matrix
Generalized Linear Models • Previous Model: • Least Squares rule used to estimate parameters • Properties of estimator discussed • Developed hypothesis tests • Assumption A.3: ei's are independently distributed (i.e., cov(ei,ej)=0, i≠j) • Assumption A.4: V(ei)=σ2 • In reality, the assumption that E(ee')=σ2IT may not hold
Generalized Linear Models • Heteroscedasticity • ei's have different variances • e.g., error terms may be larger for larger firms or households • E(ee')=Φ= a diagonal matrix with different values for different observations
Generalized Linear Models [Figure] An Example of Possible Heteroscedasticity: var(et)=σ2t (the error variance differs across observations)
Generalized Linear Models • Autocorrelation (Serial Correlation) • The impact of xt on yt may take a number of periods to be fully felt • Errors of one obs. are correlated with errors from another obs. • Present in time-series data • E(ee')=Φ= a full (T x T) symmetric matrix with constant variance on the diagonal and off-diagonal elements ≠ 0
Generalized Linear Models [Figure] An Example of Possible Autocorrelation: cov(et,et-1) ≠ 0, here cov(et,et-1) > 0
Generalized Linear Models [Figure] An Example of Negative Serial Correlation: cov(et,et-1) < 0 (residuals plotted against time)
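The two covariance structures above can be made concrete with a small numerical sketch. The code below is illustrative only, with made-up dimensions: it builds a diagonal Ψ for the heteroscedastic case and, as one common example of a full matrix with nonzero off-diagonal elements, an AR(1) pattern ρ^|t-s| for the autocorrelated case.

```python
import numpy as np

T, rho = 5, 0.6

# Heteroscedasticity: Psi (hence Phi) is diagonal with unequal entries
psi_het = np.diag([1.0, 1.5, 2.0, 2.5, 3.0])

# Autocorrelation: a full symmetric Psi; an AR(1) process gives
# element (t, s) proportional to rho**|t - s|
idx = np.arange(T)
psi_ar1 = rho ** np.abs(idx[:, None] - idx[None, :])

print(psi_het)   # nonconstant diagonal, zero off-diagonal
print(psi_ar1)   # constant diagonal, nonzero off-diagonal
```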
Generalized Linear Models • General Model • T observations: y1,…,yT • Y=Xβ+e with K rhs variables • E(Y)=E(Xβ+e)=Xβ with E(e)=0 • E[(Y-Xβ)(Y-Xβ)'] = E[ee'] = Φ = σ2Ψ, where Ψ is a (T x T) symmetric, positive definite matrix → Φ is (T x T) symmetric, positive definite since σ2 is a positive scalar • Previously Ψ was IT • What is the impact on the bias of the LS (CRM) estimator? • What is the impact on the variance of the LS estimator?
Generalized Linear Models • Remember βs=(X'X)-1X'Y • Does E(βs)=β even under the above general error structure? • E(βs)=E[(X'X)-1X'(Xβ+e)] = E(β+(X'X)-1X'e) = β+(X'X)-1X'E(e) = β • → the covariance assumption has no impact on E(βs); it is still unbiased • What are the implications of the above general error structure for Σβs? • Var(βs) ≡ Σβs = E[(βs-β)(βs-β)'] • Remember that βs=(X'X)-1X'Y • → βs=(X'X)-1X'(Xβ+e)
Generalized Linear Models • βs=(X'X)-1X'(Xβ+e) = β+(X'X)-1X'e → Var(βs)=E[(βs-β)(βs-β)'] Var(βs)= E[(β+(X'X)-1X'e-β)(β+(X'X)-1X'e-β)'] =E[((X'X)-1X'e)((X'X)-1X'e)'] =E[((X'X)-1X'e)(e'X(X'X)-1)] =E[(X'X)-1X'ee'X(X'X)-1] =(X'X)-1X'E(ee')X(X'X)-1 • But from above, E(ee')=σ2Ψ → Var(βs) = σ2(X'X)-1X'ΨX(X'X)-1 • If we use LS (CRM) and have a non-constant error covariance matrix (i.e., Ψ≠IT) → var(βs)≠σ2(X'X)-1
Generalized Linear Models • Bottom Line: If we use the CRM when in fact E[ee']=σ2Ψ, the CRM parameter estimates (βs) are unbiased but the traditional formulas for parameter variance are incorrect. • The parameter variances change, and computer programs would generate incorrect parameter standard errors because they would still use the formula Σβs=σ2(X'X)-1 when in fact Σβs=σ2(X'X)-1X'ΨX(X'X)-1 • Note what happens to Σβs if Ψ=IT: the expression collapses back to σ2(X'X)-1
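A small numerical sketch of this point, assuming a made-up design matrix and a diagonal Ψ: the sandwich formula differs from the naive formula in general, but collapses to σ2(X'X)-1 when Ψ=IT.

```python
import numpy as np

rng = np.random.default_rng(0)
T, sigma2 = 50, 2.0
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])

# A hypothetical diagonal Psi with unequal variances
psi = np.diag(np.linspace(0.5, 3.0, T))

XtX_inv = np.linalg.inv(X.T @ X)

# Naive (CRM) covariance: sigma^2 (X'X)^-1
cov_naive = sigma2 * XtX_inv

# Correct covariance: sigma^2 (X'X)^-1 X' Psi X (X'X)^-1
cov_correct = sigma2 * XtX_inv @ X.T @ psi @ X @ XtX_inv

print(np.diag(cov_naive))    # what standard software would report
print(np.diag(cov_correct))  # the true parameter variances

# With Psi = I_T the two formulas coincide
cov_identity = sigma2 * XtX_inv @ X.T @ np.eye(T) @ X @ XtX_inv
print(np.allclose(cov_identity, cov_naive))  # True
```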
Generalized Linear Models • Let's reformulate the model so that we obtain correct covariance matrices • Model (i) • Y=Xβ+e, Var(e)=E[ee'] = σ2Ψ, where Ψ is a (T x T) symmetric, positive definite matrix • Given the above characteristic of Ψ, there exists a matrix P that has the following characteristics (JHGLL, A.14.9; Greene, p. 207)
Generalized Linear Models • With Ψ being a symmetric (T x T) positive definite matrix, there always exists a nonsingular (T x T) matrix P such that: PΨP' = IT → PΨP'(P')-1 = IT(P')-1 → PΨ = (P')-1 → P-1PΨ = P-1(P')-1 → Ψ = P-1(P')-1 → Ψ-1 = P'P (using the rule of inverses (AB)-1=B-1A-1) • Apply the above to our error covariance matrix σ2Ψ in order to generate an alternative to Model (i) • → the CRM can be applied even with the above error structure
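One concrete way to construct such a P (a sketch, not the only valid choice) is the inverse of the Cholesky factor of Ψ: if Ψ = LL', then P = L-1 gives PΨP' = IT and P'P = Ψ-1.

```python
import numpy as np

T, rho = 4, 0.6
idx = np.arange(T)
psi = rho ** np.abs(idx[:, None] - idx[None, :])  # an AR(1)-style example

# If Psi = L L' (Cholesky factorization), then P = L^-1 works:
# P Psi P' = L^-1 (L L') (L^-1)' = I_T, and P'P = (L L')^-1 = Psi^-1
L = np.linalg.cholesky(psi)
P = np.linalg.inv(L)

print(np.allclose(P @ psi @ P.T, np.eye(T)))      # True
print(np.allclose(P.T @ P, np.linalg.inv(psi)))   # True
```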
Generalized Linear Models Model (i): Y=Xβ+e, E[ee'] = σ2Ψ • Model (ii) • Let's redefine our model: X*≡PX, Y*≡PY, e*≡Pe, where P'P=Ψ-1, P is (T x T), X and X* are (T x K), and Y, Y*, e, and e* are (T x 1) • Y*=X*β + e* • Models (i) and (ii) are informationally the same since we simply multiply through by P in Model (ii): Y*=X*β + e* → PY=PXβ + Pe
Generalized Linear Models • From the above assumptions • Model (i): V(e)=σ2Ψ, which violates A.3 and A.4 when Ψ≠IT • Model (ii): V(e*)=V(Pe) =PV(e)P' = Pσ2ΨP' = σ2PΨP' • Previously we showed that Ψ=P-1(P')-1 • →V(e*)=σ2P(P-1(P')-1)P'=σ2IT, the CRM assumption, which satisfies A.3 and A.4 • →V(e*) is homoscedastic and not autocorrelated
Generalized Linear Models • Under Model (ii), LS techniques can be used to estimate β; call this estimator βG • βG=(X*'X*)-1X*'Y* • Referred to as a Generalized Least Squares (GLS) estimator • ΣβG=σ2(X*'X*)-1=σ2((PX)'PX)-1= σ2(X'P'PX)-1= σ2(X'Ψ-1X)-1 • Under Model (ii): • SSE=e*'e*=(Pe)'Pe=e'P'Pe=e'Ψ-1e, a scalar → a weighted SSE where the weights involve Ψ-1
Generalized Least Squares • Generalized Least Squares (GLS) estimator of β: βG =(X*'X*)-1X*'Y* = (X'P'PX)-1X'P'PY =(X'Ψ-1X)-1X'Ψ-1Y • Characteristics of the GLS Estimator, βG • Is it biased? • Is it efficient? • E(βG)=E[(X'Ψ-1X)-1X'Ψ-1Y] =(X'Ψ-1X)-1X'Ψ-1E(Y) (remember Y=Xβ+e, E(e)=0) • E(βG)= (X'Ψ-1X)-1X'Ψ-1Xβ = β • →βG is unbiased • Is βG more efficient than βs? • Remember that both βs and βG are unbiased estimators of β given the error structure E(ee')=Φ=σ2Ψ
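A minimal simulation sketch of the GLS formula, using made-up data with heteroscedastic errors: both OLS and GLS are unbiased, but GLS weights observations by the inverse error variances.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 100
beta_true = np.array([1.0, 2.0])
X = np.column_stack([np.ones(T), rng.normal(size=T)])

# Hypothetical heteroscedastic errors: var(e_t) grows with t
variances = np.linspace(0.5, 4.0, T)
psi = np.diag(variances)
psi_inv = np.linalg.inv(psi)
Y = X @ beta_true + rng.normal(size=T) * np.sqrt(variances)

# OLS: beta_s = (X'X)^-1 X'Y
beta_s = np.linalg.solve(X.T @ X, X.T @ Y)

# GLS: beta_G = (X'Psi^-1 X)^-1 X'Psi^-1 Y
beta_g = np.linalg.solve(X.T @ psi_inv @ X, X.T @ psi_inv @ Y)

print(beta_s, beta_g)  # both unbiased; beta_g has the smaller variance
```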
Generalized Least Squares • Assume we have the following: Σβs = σ2(X'X)-1X'ΨX(X'X)-1 and ΣβG = σ2(X'Ψ-1X)-1 • Define A ≡ (X'X)-1X' - (X'Ψ-1X)-1X'Ψ-1 • Following the argument on page 331 of JHGLL, D ≡ Σβs - ΣβG = σ2AΨA', where D is positive semi-definite given that Ψ is positive definite
Generalized Least Squares • That is, Ψ, which is a positive definite matrix, is pre- and post-multiplied by the same matrix A. Thus D, which is equal to Σβs-ΣβG, is positive semi-definite, which implies that the GLS estimator is more efficient than the traditional least squares estimator.
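This efficiency claim can be checked numerically. The sketch below (made-up design, diagonal Ψ) computes D = Σβs - ΣβG and verifies that its eigenvalues are nonnegative.

```python
import numpy as np

rng = np.random.default_rng(2)
T, sigma2 = 60, 1.0
X = np.column_stack([np.ones(T), rng.normal(size=T)])
psi = np.diag(np.linspace(0.5, 3.0, T))   # a hypothetical diagonal Psi
psi_inv = np.linalg.inv(psi)

XtX_inv = np.linalg.inv(X.T @ X)
cov_ols = sigma2 * XtX_inv @ X.T @ psi @ X @ XtX_inv  # sandwich form
cov_gls = sigma2 * np.linalg.inv(X.T @ psi_inv @ X)

D = cov_ols - cov_gls
# D should be positive semi-definite: eigenvalues >= 0 (up to rounding)
print(np.linalg.eigvalsh(D))
```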
Generalized Least Squares • In summary, let's compare the GLS and OLS estimators of β → βs is an inefficient estimator relative to βG • βG satisfies all the classical assumptions of the linear model • βG is the ML estimator of β in Model (ii) (we will see this later)
Generalized Least Squares • To estimate ΣβG and var(e) we need an unbiased GLS estimator of σ2 • E(ee')= σ2Ψ • ΣβG = σ2(X'Ψ-1X)-1 • It can be shown that σ2s=es'es/T and σ2u=es'es/(T-K) are biased and inconsistent estimates of σ2 under the above error structure (JHGLL, p. 331) • Remember Model (ii) satisfies all of the classical regression assumptions • σ2G=[(Y*-X*βG)'(Y*-X*βG)]/T • σ2G is a biased but consistent estimator when βG=(X'Ψ-1X)-1X'Ψ-1Y
Generalized Least Squares • σ2G=[(Y*-X*βG)'(Y*-X*βG)]/T=e*'e*/T • Let's substitute in the definitions of X* and Y*: with P'P=Ψ-1, σ2G=(Y-XβG)'P'P(Y-XβG)/T=eG'Ψ-1eG/T, a weighted SSE • σ2G is a biased but consistent estimator of σ2 • Similar to the CRM, an unbiased estimator of σ2 is σ2Gu=eG'Ψ-1eG/(T-K)
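A short sketch of the two variance estimators, using hypothetical data: σ2G divides the weighted SSE by T, while σ2Gu applies the degrees-of-freedom correction T-K and then feeds the estimate of ΣβG.

```python
import numpy as np

rng = np.random.default_rng(3)
T, K = 80, 2
X = np.column_stack([np.ones(T), rng.normal(size=T)])
variances = np.linspace(0.5, 2.5, T)     # hypothetical heteroscedasticity
psi_inv = np.diag(1.0 / variances)
Y = X @ np.array([1.0, 2.0]) + rng.normal(size=T) * np.sqrt(variances)

beta_g = np.linalg.solve(X.T @ psi_inv @ X, X.T @ psi_inv @ Y)
e_g = Y - X @ beta_g
wsse = e_g @ psi_inv @ e_g               # weighted SSE: e_G' Psi^-1 e_G

sigma2_g = wsse / T          # biased but consistent
sigma2_gu = wsse / (T - K)   # unbiased (degrees-of-freedom correction)

cov_beta_g = sigma2_gu * np.linalg.inv(X.T @ psi_inv @ X)  # est. Sigma_beta_G
print(sigma2_g, sigma2_gu)
```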
Generalized Least Squares • Given the above least squares estimates, let's now take a look at Maximum Likelihood estimation when we have a general error variance • Comparison of Likelihood Functions under the CRM and GLS Frameworks • Let's find βG,l and σ2G,l, the Maximum Likelihood estimates of β and σ2 • Note the above assumes we take the structure of Ψ as given and know the values of the individual elements of this matrix • We will address estimating the values of Ψ later
Generalized Least Squares • Given the general likelihood function where e~N(0,σ2Ψ), we have the following log-likelihood: ln L(β,σ2) = -(T/2)ln(2π) - (T/2)ln(σ2) - (1/2)ln|Ψ| - (1/(2σ2))(Y-Xβ)'Ψ-1(Y-Xβ) • To determine the value of β that maximizes the above, set ∂lnL/∂β = (1/σ2G,l)(X'Ψ-1Y - X'Ψ-1XβG,l) = 0
Generalized Least Squares • Given e~N(0,Φ)=N(0,σ2Ψ), multiplying through by σ2G,l and solving for βG,l: • X'Ψ-1Y - X'Ψ-1XβG,l = 0 • -X'Ψ-1XβG,l = -X'Ψ-1Y • βG,l =(X'Ψ-1X)-1X'Ψ-1Y • → The same estimator as βG
Generalized Least Squares • With e~N(0,Φ)=N(0,σ2Ψ), this implies that to maximize L(β,σ2) with respect to σ2: ∂lnL/∂σ2 = -T/(2σ2G,l) + (1/(2(σ2G,l)2))eG,l'Ψ-1eG,l = 0, where eG,l=Y-XβG,l • Multiplying through by 2(σ2G,l)2 and solving: σ2G,l = eG,l'Ψ-1eG,l/T • The same estimator as σ2G
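The log-likelihood above can be written as a small function, shown below as an illustrative numpy sketch; the quick check confirms that the GLS/ML estimates attain a higher likelihood than a perturbed β.

```python
import numpy as np

def log_likelihood(beta, sigma2, Y, X, psi):
    """ln L for Y = X beta + e with e ~ N(0, sigma2 * Psi)."""
    T = Y.shape[0]
    resid = Y - X @ beta
    _, logdet_psi = np.linalg.slogdet(psi)
    quad = resid @ np.linalg.solve(psi, resid)  # resid' Psi^-1 resid
    return (-0.5 * T * np.log(2.0 * np.pi) - 0.5 * T * np.log(sigma2)
            - 0.5 * logdet_psi - quad / (2.0 * sigma2))

# Quick check on simulated data: the GLS/ML estimates attain the maximum
rng = np.random.default_rng(6)
T = 50
X = np.column_stack([np.ones(T), rng.normal(size=T)])
variances = np.linspace(0.5, 2.0, T)
psi = np.diag(variances)
psi_inv = np.linalg.inv(psi)
Y = X @ np.array([1.0, 2.0]) + rng.normal(size=T) * np.sqrt(variances)

beta_g = np.linalg.solve(X.T @ psi_inv @ X, X.T @ psi_inv @ Y)
e_g = Y - X @ beta_g
sigma2_ml = (e_g @ psi_inv @ e_g) / T  # ML version divides by T, not T - K

print(log_likelihood(beta_g, sigma2_ml, Y, X, psi))        # higher
print(log_likelihood(beta_g + 0.1, sigma2_ml, Y, X, psi))  # lower
```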
Generalized Least Squares • After we developed the CRM (OLS) estimator, we talked about prediction and developed a prediction confidence interval • Let's do the same, but this time assume a general variance structure • Note that, as always, the prediction variance under the homoscedastic specification is a special case of the general result • The general results are very broad • Simplifications arise when specific structures are assumed for Ψ and C (the covariance between the sample and forecast-period errors) • Estimated versions of the prediction variance are used, where Ψ and C depend on estimated parameters • Finite sample properties are unknown • Approximation to the true V(e0)
Generalized Least Squares • Model Summary: Y=Xβ+e, V(e)=E[ee'] = σ2Ψ, E(e)=0 • Using the CRM: βs=(X'X)-1X'Y, E(βs)=β, σ2U=es'es/(T-K), Σβs=σ2(X'X)-1X'ΨX(X'X)-1 ≠ σ2(X'X)-1 if Ψ≠IT • GLS: βG=(X'Ψ-1X)-1X'Ψ-1Y, E(βG)=β, ΣβG=σ2(X'Ψ-1X)-1, σ2G,U=eG'Ψ-1eG/(T-K), where eG=Y-XβG • βG is BLUE (has smaller variance than βs if Ψ≠IT, i.e., with heteroscedastic or autocorrelated errors)
Hypothesis Testing Under Generalized Least Squares • Y=Xβ+e where e~N(0,σ2Ψ) • H0: Rβ = r • R is a (J x K) restriction matrix • r is a (J x 1) known vector • Similar to the CRM, the Restricted GLS Estimator is βRG=βG+(X'Ψ-1X)-1R'[R(X'Ψ-1X)-1R']-1(r-RβG)
Hypothesis Testing Under Generalized Least Squares • From Model (ii), remember SSEG=e*'e*=eG'Ψ-1eG, a weighted SSE • F-Statistic if H0 is true and e~N(0,σ2Ψ): (ver. 1) LR* = [(SSEG,R - SSEG)/J]/[SSEG/(T-K)] ~ F(J, T-K), where SSEG,R and SSEG are the weighted SSE's from the restricted and unrestricted models
Hypothesis Testing Under Generalized Least Squares • Equivalent versions (ver. 2, ver. 3) express LR* as a quadratic form in (RβG-r), e.g., LR* = (RβG-r)'[R(X'Ψ-1X)-1R']-1(RβG-r)/(Jσ2Gu) • How would you test the null hypothesis that all slope coefficients are jointly zero? • How do you calculate SSEG,R?
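A worked sketch of the ver. 1 test, with hypothetical data where H0 (a zero slope) is true: it computes the restricted GLS estimator from the formula above, forms the weighted SSE's, and compares LR* to the F(J, T-K) distribution (scipy is used only for the p-value).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
T, K, J = 100, 2, 1
X = np.column_stack([np.ones(T), rng.normal(size=T)])
variances = np.linspace(0.5, 2.0, T)      # hypothetical heteroscedasticity
psi_inv = np.diag(1.0 / variances)
# H0 is true here: the slope coefficient equals zero
Y = X @ np.array([1.0, 0.0]) + rng.normal(size=T) * np.sqrt(variances)

# Unrestricted GLS
A = X.T @ psi_inv @ X
A_inv = np.linalg.inv(A)
beta_g = A_inv @ X.T @ psi_inv @ Y

# Restricted GLS for H0: R beta = r (slope = 0)
R = np.array([[0.0, 1.0]])
r = np.array([0.0])
beta_rg = beta_g + A_inv @ R.T @ np.linalg.solve(R @ A_inv @ R.T,
                                                 r - R @ beta_g)

# Weighted SSE's: e' Psi^-1 e
e_u = Y - X @ beta_g
e_r = Y - X @ beta_rg
sse_u = e_u @ psi_inv @ e_u
sse_r = e_r @ psi_inv @ e_r

# ver. 1: LR* = ((SSE_R - SSE_U)/J) / (SSE_U/(T-K)) ~ F(J, T-K) under H0
LR = ((sse_r - sse_u) / J) / (sse_u / (T - K))
print(LR, stats.f.sf(LR, J, T - K))  # statistic and p-value
```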
Hypothesis Testing Under Generalized Least Squares • Test Procedure Under the GLS Specification • Choose α = P(reject H0 | H0 true) = P(Type-I error) • Calculate the test statistic LR* based on sample information • Find the critical value LRcrit in an F-table such that α = P(F(J, T-K) ≥ LRcrit) • Reject H0 if LR* ≥ LRcrit • Don't reject H0 if LR* < LRcrit
Hypothesis Testing Under Generalized Least Squares • Test Procedure Under the GLS Specification with J=1 • Use (LR*)1/2, which is distributed t(T-K) if H0 is true • Choose α = P(reject H0 | H0 true) = P(Type-I error) • Find the critical value tcrit in a t-table such that α = P(|t(T-K)| ≥ tcrit) • Reject H0 if (LR*)1/2 ≥ tcrit
Feasible Generalized Least Squares • How do we estimate the structure of the error covariance matrix, Φ = σ2Ψ? • Under the CRM we are only concerned with estimating σ2 • How can we tell whether Ψ=IT or Ψ≠IT? • We test H0: Ψ=IT against an alternative where we assume a structure • Heteroscedasticity → assume Ψ is a diagonal matrix with some diagonal elements not equal to each other • Autocorrelation → Ψ is a full matrix • We can only estimate the elements of Ψ conditional on our hypothesized structure, so we need to be justified in the assumed structure.
Feasible Generalized Least Squares • Feasible Generalized Least Squares (FGLS) estimator • Replace Ψ with an estimate, Ψe • Properties of Ψe? • Proposed GLS estimation procedure: βFG=(X'(Ψe)-1X)-1X'(Ψe)-1Y, σ2FG,U=(eFG'(Ψe)-1eFG)/(T-K) • Difficulty arises because Ψe and e are correlated: the estimator of Ψ depends on the sample of observations and therefore on the disturbance vector, E[βFG]=β+E[(X'(Ψe)-1X)-1X'(Ψe)-1e] • It is not possible to treat Ψe as a fixed, nonstochastic matrix • βFG will no longer be a linear function of Y because (Ψe)-1 will depend on Y • May not be minimum variance because it may not be possible to derive its finite-sample covariance matrix
Feasible Generalized Least Squares • General Estimation Procedure (a worked sketch follows the next slide) • Obtain the CRM estimate of β, βs, as a consistent estimator • Generate a consistent estimate of the error terms, es=Y-Xβs • Use es to obtain a consistent estimator of Ψ, Ψe, and of σ2FG, conditional on your assumed error structure • Estimate βFG=(X'(Ψe)-1X)-1X'(Ψe)-1Y • βFG is consistent and asymptotically efficient • βFG≈N(β,σ2FG(X'(Ψe)-1X)-1) as T→∞ • Use the above for hypothesis testing
Feasible Generalized Least Squares • General Estimation Procedure • It follows that σ2FG,U(X'(Ψe)-1X)-1 is a consistent estimator of ΣβFG, which can be used to conduct asymptotic tests of β (e.g., using a Wald test)
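A sketch of this procedure under one assumed heteroscedastic structure, var(et) = σ2·xt^γ (a multiplicative form chosen purely for illustration); estimating γ via a log-squared-residual regression is one common choice, and all variable names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 200
x = rng.uniform(1.0, 5.0, size=T)
X = np.column_stack([np.ones(T), x])
# Assumed error structure (illustrative): var(e_t) = sigma^2 * x_t^2
Y = X @ np.array([1.0, 2.0]) + rng.normal(size=T) * x

# Step 1: CRM (OLS) gives a consistent beta_s and residuals e_s
beta_s = np.linalg.solve(X.T @ X, X.T @ Y)
e_s = Y - X @ beta_s

# Step 2: estimate Psi conditional on the assumed structure
# var(e_t) = sigma^2 * x_t^gamma, via a log-squared-residual regression
Z = np.column_stack([np.ones(T), np.log(x)])
g = np.linalg.solve(Z.T @ Z, Z.T @ np.log(e_s**2))
psi_inv_e = np.diag(x ** (-g[1]))   # sigma^2 absorbs the intercept term

# Step 3: FGLS using the estimated Psi
A = X.T @ psi_inv_e @ X
beta_fg = np.linalg.solve(A, X.T @ psi_inv_e @ Y)
e_fg = Y - X @ beta_fg
sigma2_fg = (e_fg @ psi_inv_e @ e_fg) / (T - 2)
cov_fg = sigma2_fg * np.linalg.inv(A)  # asymptotic covariance estimate
print(beta_fg, np.sqrt(np.diag(cov_fg)))
```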