Spatial Econometrics: Why it may be important and some helpful hints getting started
Philip Watson, Colorado State University, Dept. of Ag. and Resource Economics, 2.17.2006
Objectives
• Briefly describe the issue
  • Why does space matter?
• Discuss how to incorporate the spatial dimension
  • Different options for weighting matrices
• Present the two different spatial dependence models
  • Spatial error
  • Spatial lag
• Go over some diagnostics
  • How to test for the presence of spatial dependence and how to correct for it
Introduction
• Who cares?
  • First Law of Geography (Tobler): everything is related to everything else, but near things are more related than distant things
  • An assumption of OLS is that observations are independent of one another
  • Are Larimer County and Weld County completely independent observations?
  • …no, so an OLS assumption is violated
• So what?
  • Depending on the nature of the dependence, OLS will be either inefficient with incorrect standard errors, or biased and inconsistent
Weighting I
• Assumption: the structure of spatial dependence is known a priori, not estimated
• The specification of the weighting matrix “is a matter of considerable arbitrariness and a wide range of suggestions in the literature” – Anselin and Bera 1998
• The “connectivity matrix” specifies the degree of interdependence among observations
  • Based on contiguity, Euclidean distance, or even a non-geographical distance measure
• This might be considered a strong assumption, but it is not as strong as assuming the dependence is zero and all observations are spatially independent
Weighting II
• Typical types of weighting matrices
  • Contiguity
    • Rook (neighbors share an edge)
    • Queen (neighbors share an edge or a corner)
    • First vs. second vs. higher order
  • Distance
    • k nearest neighbors
    • Inverse distance
    • Distance decay function
  • Combination
    • $w_{ij} = p_{ij}^{b} / d_{ij}^{a}$
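The two distance-based options can be sketched in a few lines. This is a minimal illustration, assuming regions are represented by centroid coordinates; the function names, the `power` parameter, and the coordinates are hypothetical and not from the slides.

```python
import math


def inverse_distance_weights(coords, power=1.0):
    """w_ij = 1 / d_ij**power for i != j, with zeros on the diagonal.

    Assumes no two points coincide (d_ij > 0 for i != j).
    """
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(coords[i], coords[j])
                W[i][j] = 1.0 / d ** power
    return W


def knn_weights(coords, k):
    """w_ij = 1 if j is one of the k nearest neighbors of i, else 0."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        for j in others[:k]:
            W[i][j] = 1.0
    return W


# Made-up centroids for four regions (illustration only).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 0.0)]
W_inv = inverse_distance_weights(coords)
W_knn = knn_weights(coords, k=1)
```

Note that k-nearest-neighbor weights need not be symmetric: here region 3's nearest neighbor is region 1, but region 1's nearest neighbor is region 0.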
Weighting III
• Denote the weighting matrix W
• Contiguity matrix
  • N×N symmetric matrix where $w_{ij} = 1$ when i and j are neighbors and 0 when they are not
  • Makes for a fairly sparse matrix
• W is usually row-standardized so that each row sums to 1
  • $w^{s}_{ij} = w_{ij} / \sum_j w_{ij}$
  • Makes operations with W an average of neighboring values
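A small sketch of building and row-standardizing a contiguity matrix, assuming regions laid out on a regular grid (a simplification chosen here for illustration; real county data would come from a shapefile or a .GAL file). The function names are hypothetical.

```python
def rook_contiguity(n_rows, n_cols):
    """Binary rook contiguity for cells on an n_rows x n_cols grid.

    Cells are numbered row by row; two cells are neighbors when they
    share an edge (up, down, left, right).
    """
    n = n_rows * n_cols
    W = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    W[i][rr * n_cols + cc] = 1.0
    return W


def row_standardize(W):
    """Divide each row by its sum so every non-empty row sums to 1."""
    out = []
    for row in W:
        total = sum(row)
        out.append([w / total if total > 0 else 0.0 for w in row])
    return out


W = rook_contiguity(3, 3)
Ws = row_standardize(W)
```

The binary matrix is symmetric, but the row-standardized one generally is not: a corner cell with two neighbors gives each a weight of 1/2, while an edge cell with three neighbors gives each a weight of 1/3.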
Weighting IV
• The W matrix is used to generate the spatial lag operator, Wy
  • $(Wy)_i = \sum_j w_{ij} y_j$
  • A weighted average of the y values of i's neighbors
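As a worked example of the spatial lag, the sketch below applies a row-standardized W to a vector y. The three-region layout and the y values are made up for illustration.

```python
def spatial_lag(W, y):
    """Return Wy: for each i, the sum over j of w_ij * y_j."""
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) for row in W]


# Row-standardized weights for three regions on a line: 0 - 1 - 2.
W = [
    [0.0, 1.0, 0.0],  # region 0 borders only region 1
    [0.5, 0.0, 0.5],  # region 1 borders regions 0 and 2
    [0.0, 1.0, 0.0],  # region 2 borders only region 1
]
y = [10.0, 20.0, 60.0]  # hypothetical values
Wy = spatial_lag(W, y)
```

With these numbers Wy is [20.0, 35.0, 20.0]: each entry is the average of that region's neighbors, and a region's own value never enters its own lag because the diagonal of W is zero.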
Types of Models
• Spatial error
• Spatial lag
  • Also often referred to as the spatial autoregressive model
• Nonstandardized vocabulary
  • Anselin calls both “spatial autocorrelation”, with the first referred to as the spatial error model and the second as the spatial lag model
  • Others use different definitions, so watch out
Analogy to Time Series
• This issue can be thought of as analogous to time-series autocorrelation
  • Dependence moves both ways instead of just one
• The spatial error model is analogous to time-series serially correlated errors
• The spatial lag model corresponds to the time-series lagged dependent variable model
Types of Models: Spatial Error
• Spatially lagged error
• Observations are interdependent through unmeasured variables that are correlated across space, or through measurement error that is correlated across space
• A nuisance that arises because we cannot model all the facets of a geographical region that may influence nearby locations
• May also arise from boundaries that are not perfect measures
  • Counties are not labor markets, but we use them as proxies
• Theoretically possible to eliminate this type of spatial dependence with proper explanatory variables and correct boundaries of observations
Types of Models: Spatial Error
• Space matters only in the error process, not in the substantive portion of the model
• If we were able to add the right variables, moving the spatial structure out of the error and into the model, then space would not matter anymore
• Two counties affected by the same hurricane 3 years ago
• Natural amenity index based on county boundaries, but natural amenities don’t conform to the same boundaries
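Types of Models: Spatial Error
• Model
  • Start with the basic model: $y = X\beta + e$, $e \sim N(0, \sigma^2)$
  • Let the error be spatially dependent: $y = X\beta + u$, $u = \lambda W u + e$ (so $y = X\beta + e + \lambda W u$)
• If λ = 0, the model reduces to OLS; if λ ≠ 0, OLS is unbiased and consistent, but the standard errors will be wrong and the estimates of β will be inefficient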
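A short derivation may help show why OLS stays unbiased under spatial error dependence but reports the wrong standard errors. It assumes the error process is written as $u = \lambda W u + e$ with $e \sim N(0, \sigma^2 I)$, that $(I - \lambda W)$ is invertible, and that X is fixed; $\Omega$ is notation introduced here only for illustration.

```latex
\begin{aligned}
u &= \lambda W u + e \;\Rightarrow\; u = (I - \lambda W)^{-1} e \\
E[u \mid X] &= 0 \;\Rightarrow\; E[\hat\beta_{OLS}] = \beta \\
\Omega \equiv E[u u'] &= \sigma^2 \left[ (I - \lambda W)'(I - \lambda W) \right]^{-1} \\
\operatorname{Var}(\hat\beta_{OLS}) &= (X'X)^{-1} X' \Omega X (X'X)^{-1}
  \;\neq\; \sigma^2 (X'X)^{-1} \quad \text{in general when } \lambda \neq 0
\end{aligned}
```

The usual OLS variance formula $\sigma^2 (X'X)^{-1}$ is only recovered when $\Omega = \sigma^2 I$, that is, when λ = 0.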
Types of Models: Spatial Lag
• Spatial lag model
  • The dependent variable is affected by the values of the dependent variable in nearby places
  • Land value in a county is a function of land value in nearby counties, not just related to common unmeasured variables
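Types of Models: Spatial Lag
• Model
  • $y_i = x_i\beta + \varphi w_i y + e_i$, where $w_i$ is the i-th row of W
  • Can also include a $w_i X$ term (spatially lagged explanatory variables)
• OLS in this case is biased and inconsistent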
Types of Models: Spatial Lag
• Quick look at why the spatial lag leads to inconsistent estimates
  • If OLS omits $\varphi w_i y$, that term becomes part of the error
  • By construction, $w_i y$ is built from the neighboring y's, which themselves depend on $y_i$
  • Therefore the regressors and the error term are correlated unless φ = 0
• Unlike the time-series case (where GLS is appropriate), the correlation between observations moves both ways
  • The variance matrix is full, not upper triangular as in time series
  • Spatial GLS is also biased and inconsistent
• For a formal proof see Anselin and Bera 1998
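The simultaneity argument can be made explicit with the reduced form. This is a sketch in matrix notation, assuming $e \sim (0, \sigma^2 I)$, fixed X, a zero diagonal in W, and $|\varphi| < 1$ with row-standardized W so that the series expansion holds.

```latex
\begin{aligned}
y &= \varphi W y + X\beta + e \;\Rightarrow\; y = (I - \varphi W)^{-1}(X\beta + e) \\
E\left[(Wy)'e\right] &= \sigma^2 \operatorname{tr}\!\left( W (I - \varphi W)^{-1} \right) \\
(I - \varphi W)^{-1} &= I + \varphi W + \varphi^2 W^2 + \cdots \\
\operatorname{tr}\!\left( W (I - \varphi W)^{-1} \right)
  &= \operatorname{tr}(W) + \varphi \operatorname{tr}(W^2) + \cdots
   = \varphi \operatorname{tr}(W^2) + \cdots
\end{aligned}
```

Because $\operatorname{tr}(W) = 0$, the covariance between the spatial lag and the error vanishes when φ = 0. Since $\operatorname{tr}(W^2) > 0$ whenever any two regions are mutual neighbors, it is generally non-zero otherwise, so including Wy as a regressor in OLS does not solve the problem.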
Squiggles
• Spatial dependence must be accounted for in a maximum likelihood framework or by using a proper set of instrumental variables (Ord 1975)
• Spatial lag ML function [equation not preserved]
• Spatial error ML function [equation not preserved]
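For reference, the standard textbook log-likelihoods for the two models under normally distributed errors are sketched here. These are the usual forms from the spatial econometrics literature, written in the notation of the earlier slides (φ for the lag coefficient, λ for the error coefficient), and are not necessarily the exact expressions shown on the slide.

```latex
\begin{aligned}
\text{Spatial lag:}\quad
\ln L &= -\frac{n}{2}\ln(2\pi\sigma^2) + \ln\lvert I - \varphi W \rvert
  - \frac{1}{2\sigma^2}\,(y - \varphi W y - X\beta)'(y - \varphi W y - X\beta) \\
\text{Spatial error:}\quad
\ln L &= -\frac{n}{2}\ln(2\pi\sigma^2) + \ln\lvert I - \lambda W \rvert
  - \frac{1}{2\sigma^2}\,(y - X\beta)'(I - \lambda W)'(I - \lambda W)(y - X\beta)
\end{aligned}
```

The log-determinant term is what separates these from an ordinary least-squares criterion. It is the Jacobian of the transformation from e to y, and ignoring it is one way to see why least squares on the lag model is inconsistent.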
Diagnostics
• Moran's I
  • Indicates general spatial misspecification
  • $I = e'We / e'e$ (for row-standardized weights)
    • e = vector of OLS residuals
    • W = spatial weights matrix
  • Similar to the Durbin-Watson test
  • Does not indicate which alternative specification to use
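A minimal sketch of the statistic itself, assuming a row-standardized W and a residual vector e; the four-region layout and the residual patterns are made up for illustration. With weights that are not row-standardized, the ratio would also be scaled by n divided by the sum of all weights.

```python
def morans_i(e, W):
    """Moran's I = e'We / e'e for row-standardized weights W."""
    We = [sum(w * e_j for w, e_j in zip(row, e)) for row in W]
    numerator = sum(e_i * we_i for e_i, we_i in zip(e, We))
    denominator = sum(e_i * e_i for e_i in e)
    return numerator / denominator


# Row-standardized weights for four regions on a line: 0 - 1 - 2 - 3.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

clustered = [1.0, 1.0, -1.0, -1.0]    # similar residuals sit next to each other
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbors always have opposite signs

i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
```

Here the clustered residuals give I = 0.5 (positive spatial autocorrelation) and the alternating residuals give I = -1.0 (negative spatial autocorrelation). Judging significance requires the statistic's mean and variance under the null, which this sketch does not compute.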
Diagnostics
• Lagrange multiplier tests
  • Run a regression of the residuals on the original variables and the lagged residuals
  • Test for λ = 0 and φ = 0
  • See Anselin et al. 1996
• LM-Lag and Robust LM-Lag
  • Pertain to the spatial lag model as the alternative
  • Robust: tests for lag dependence in the presence of an omitted spatial error term
• LM-Error and Robust LM-Error
  • Pertain to the spatial error model as the alternative
  • Robust: tests for error dependence in the presence of an omitted spatial lag
• Problem: the tests for spatial lag and spatial error can contaminate each other
  • The LM test for λ = 0 responds to non-zero φ and vice versa
  • The robust versions take into account the possibility of a non-zero nuisance parameter
DIAGNOSTICS FOR SPATIAL DEPENDENCE - NFIASS
FOR WEIGHT MATRIX : Dissertation Weights New.GAL (row-standardized weights)
TEST                          MI/DF      VALUE        PROB
Moran's I (error)             0.096119   3.7097932    0.0002075
Lagrange Multiplier (lag)     1          13.2385278   0.0002743
Robust LM (lag)               1          3.5785788    0.0585292
Lagrange Multiplier (error)   1          9.6647450    0.0018784
Robust LM (error)             1          0.0047961    0.9447878

DIAGNOSTICS FOR SPATIAL DEPENDENCE - NFIA
FOR WEIGHT MATRIX : Dissertation Weights New.GAL (row-standardized weights)
TEST                          MI/DF      VALUE        PROB
Moran's I (error)             0.123882   4.6386775    0.0000035
Lagrange Multiplier (lag)     1          26.9069495   0.0000002
Robust LM (lag)               1          11.8469234   0.0005776
Lagrange Multiplier (error)   1          16.0540439   0.0000616
Robust LM (error)             1          0.9940178    0.3187624
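One common way to read these diagnostics is a simple decision heuristic: if only one LM test rejects, use that model; if both reject, prefer the alternative whose robust test is more significant. The sketch below encodes that heuristic as an assumption. The slides do not state a rule, and the function name and the 0.05 threshold are hypothetical choices.

```python
def choose_spatial_model(p_lm_lag, p_lm_error, p_rlm_lag, p_rlm_error, alpha=0.05):
    """Suggest a specification from LM test p-values (heuristic, not a proof)."""
    lag_significant = p_lm_lag < alpha
    error_significant = p_lm_error < alpha
    if not lag_significant and not error_significant:
        return "OLS"
    if lag_significant and not error_significant:
        return "spatial lag"
    if error_significant and not lag_significant:
        return "spatial error"
    # Both plain LM tests reject: fall back on the robust versions and
    # prefer the alternative with the smaller robust p-value.
    return "spatial lag" if p_rlm_lag <= p_rlm_error else "spatial error"


# p-values copied from the two diagnostic tables above.
nfiass_choice = choose_spatial_model(0.0002743, 0.0018784, 0.0585292, 0.9447878)
nfia_choice = choose_spatial_model(0.0000002, 0.0000616, 0.0005776, 0.3187624)
```

Under this heuristic both sets of diagnostics point toward the spatial lag specification, since in each case the robust LM (error) statistic is far from significant while the robust LM (lag) statistic is much stronger.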
REGRESSION
SUMMARY OF OUTPUT: SPATIAL LAG MODEL - MAXIMUM LIKELIHOOD ESTIMATION
Data set            : west
Spatial Weight      : Dissertation Weights New.GAL
Dependent Variable  : NFIASSIMP    Number of Observations: 413
Mean dependent var  : 0.0268696    Number of Variables   : 18
S.D. dependent var  : 0.0386071    Degrees of Freedom    : 395
Lag coeff. (Rho)    : 0.626462
R-squared           : 0.431525     Log likelihood        : 874.714
Sq. Correlation     : -            Akaike info criterion : -1713.43
Sigma-square        : 0.000847318  Schwarz criterion     : -1641.01
S.E of regression   : 0.0291087
-----------------------------------------------------------------------
Variable     Coefficient     Std.Error      z-value     Probability
-----------------------------------------------------------------------
W_NFIASSIMP  0.6264623       0.04290037     14.60273    0.0000000
CONSTANT     -0.01827905     0.0116231      -1.572648   0.1158003
ORCHIMP      0.0316304       0.01921175     1.646409    0.0996796
VEGIMP       0.1204401       0.02197299     5.48128     0.0000000
CORNIMP      0.1146744       0.0512294      2.238448    0.0251917
WHTIMP       0.06172138      0.01830314     3.372175    0.0007459
FRGIMP       0.0179874       0.01102344     1.631741    0.1027340
SILGIMP      0.04123105      0.06479901     0.6362914   0.5245864
HORTIMP      -0.02029143     0.009002836    -2.253893   0.0242028
DIRECTIMP    -0.2231155      0.1135618      -1.964706   0.0494481
STOCKTIMP    -0.006905066    0.006543202    -1.055304   0.2912865
OFFARMTIMP   0.01557627      0.01854374     0.8399748   0.4009224
TEMPIMP      0.0002773511    0.0002004012   1.383979    0.1663650
URBANIMP     0.001173342     0.0007315193   1.603979    0.1087187
SIZEIMP      -6.707924e-007  7.395939e-007  -0.9069739  0.3644205
COOPIMP      -0.5220158      0.6303666      -0.8281147  0.4076054
CUSTIMP      -0.1743578      0.2256416      -0.7727203  0.4396878
DAIRYIMP     0.05479179      0.01530465     3.580075    0.0003436
-----------------------------------------------------------------------
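As a quick arithmetic check on how to read this output, the sketch below recomputes the information criteria and one z-value from the reported numbers. It assumes the usual definitions AIC = -2 ln L + 2k and Schwarz = -2 ln L + k ln(n); the output itself does not state its formulas.

```python
import math

log_likelihood = 874.714  # "Log likelihood" from the output
k = 18                    # "Number of Variables" (includes W_NFIASSIMP and CONSTANT)
n = 413                   # "Number of Observations"

aic = -2.0 * log_likelihood + 2.0 * k
schwarz = -2.0 * log_likelihood + k * math.log(n)
degrees_of_freedom = n - k

# z-value for the spatial lag coefficient: estimate divided by its standard error.
z_lag = 0.6264623 / 0.04290037
```

These reproduce the reported values to the printed precision (AIC about -1713.43, Schwarz about -1641.01, 395 degrees of freedom, z about 14.60). Since lower values of both criteria indicate a better fit after penalizing parameters, the same arithmetic can be used when comparing models estimated under different weighting matrices.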
Final Thoughts
• OLS suffers from potentially severe omitted variable bias
  • Tends to inflate estimates of common-stimuli effects
• Spatially weighted GLS dramatically improves the estimates
  • Still has simultaneity bias
  • Appropriateness depends on the degree of spatial dependence
• Spatial maximum likelihood is the best option
• Spatial autocorrelation (lag) is probably a bigger issue than spatial error
• Choosing a weighting structure?
  • …try a few and compare the log likelihoods