Spatial Panel Data Forecasting over Different Horizons, Cross-Sectional and Temporal Dimensions
Matías Mayor (University of Oviedo)
Roberto Patuelli (University of Bologna & RCEA)
Introduction
• Rise of theoretical and empirical spatial econometrics literature, but temporal aspect has received less attention
• Spatial dimension of labour markets a relevant topic, spatial autocorrelation emerging in particular from labour mobility over different regions
• Importance of spatial aspect for forecasting pointed out by Giacomini and Granger (2004; ‘ignoring spatial autocorrelation, even when it is weak, leads to highly inaccurate forecasts’) and Hernández-Murillo and Owyang (2006)
• Different methods proposed:
  • Static panel models (Baltagi and Li 2004; Longhi and Nijkamp 2007; Fingleton 2009; Baltagi et al. 2012; Fingleton and Palombi 2013)
  • Dynamic panel models (Kholodilin et al. 2009; Baltagi et al. 2013) (improves forecast performance in particular when the forecasting horizon is longer)
  • VAR models (Schanne et al. 2010) (similar conclusion to Kholodilin et al.)
Introduction (2)
• We compare two methods to obtain unemployment forecasts in (small) administrative units, and compare their performance across different countries:
  • a spatial vector autoregressive (SVAR) model (Beenstock and Felsenstein 2007; Kuethe and Pede 2011)
  • a dynamic heterogeneous-coefficients panel data model based on an eigenvector-decomposition spatial filtering (SF) procedure (Griffith 2000, 2003)
• We exploit the strong heterogeneity in the size of NUTS regions to investigate the variation in the forecasting performance
• The two methods belong to two separate traditions: VAR models represent the mainstream (time-series) forecasting tradition, while the SF-enhanced dynamic panel model attempts to merge the panel data modelling tradition with the spatial statistics one, within a semi-parametric framework
Spatial VAR Models
• A VAR model (Sims 1980) can be written as a set of symmetric equations in which each (dependent) variable is described by a set of its own lags and the lags of other variables in the system
• VAR models assume the absence of spatial spillovers
• A few proposals to introduce spatial relationships in a VAR framework. The number of parameters to estimate then increases quadratically with the number of spatial units, so some of the existing proposals use spatial contiguity information to limit the number of parameters:
  • Pan and LeSage (1995) propose to use spatial contiguity information as an alternative prior in a Bayesian VAR model
  • Di Giacinto (2003) defines parameter constraints in a structural VAR model based on the neighbouring structure
  • Schanne et al. (2010), based on the Global VAR (GVAR) model of Pesaran et al. (2004), use geographical information to include spatial connections between regions
• Advantage of the GVAR model: inclusion of a temporal dimension within the spatial dependence process
• Some authors consider only contemporaneous spatial processes (Longhi and Nijkamp 2007; Kholodilin et al. 2008), whereas others specify only a temporally lagged type of spatial dependence (Hernández-Murillo and Owyang 2006)
Spatial VAR Models (2)
• We follow the approach by Beenstock and Felsenstein (2007), where traditional VAR methods and modern spatial panel data techniques are ‘mixed’
• Beenstock and Felsenstein allow for both contemporaneous and serially lagged spatially correlated variables
• The model is highly nonlinear, because of the contemporaneous spatial autoregressive process
• They restrict the coefficient of the endogenous contemporaneous spatial lag to zero, linearizing the model
• Novelty is the inclusion of the spatial cross-regressive lags
• A further advantage is the possibility of testing for the significance of regional spillovers by means of Granger causality tests
• Since Wy and the residuals are not independent, the model is estimated by SUR
• If only one lag is allowed for: [equation not preserved]
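A rough sketch of the restricted specification described on this slide (contemporaneous spatial lag set to zero, one temporal lag) may help fix ideas. It simulates a small panel and regresses, region by region, y_it on its own lag and on the spatially lagged lag (W y)_{i,t-1}. Everything in it is an assumption made for illustration: the ring-shaped weights matrix, the parameter values, the function names, and above all the use of equation-by-equation OLS in place of the SUR estimator mentioned in the slides.

```python
import numpy as np

def ring_weights(n):
    """Row-standardized contiguity matrix for n regions on a ring (hypothetical layout)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 1.0
        W[i, (i + 1) % n] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate(W, T, beta, lam, sigma, rng):
    """Simulate y_t = beta * y_{t-1} + lam * W y_{t-1} + u_t (no contemporaneous spatial lag)."""
    n = W.shape[0]
    y = np.zeros((T, n))
    for t in range(1, T):
        y[t] = beta * y[t - 1] + lam * (W @ y[t - 1]) + sigma * rng.standard_normal(n)
    return y

def fit_by_region(y, W):
    """Per-region OLS of y_it on [1, y_i,t-1, (W y)_i,t-1]; returns an (n, 3) array."""
    wy = y @ W.T                       # row t holds W y_t
    coefs = []
    for i in range(y.shape[1]):
        X = np.column_stack([np.ones(len(y) - 1), y[:-1, i], wy[:-1, i]])
        b, *_ = np.linalg.lstsq(X, y[1:, i], rcond=None)
        coefs.append(b)
    return np.array(coefs)

def forecast_one_step(y, W, coefs):
    """One-step-ahead forecast for every region from the last observed period."""
    last, wlast = y[-1], W @ y[-1]
    return coefs[:, 0] + coefs[:, 1] * last + coefs[:, 2] * wlast

rng = np.random.default_rng(0)
W = ring_weights(6)
y = simulate(W, T=400, beta=0.5, lam=0.3, sigma=1.0, rng=rng)
coefs = fit_by_region(y, W)
print(coefs.mean(axis=0))              # average intercept, own-lag and spatial-lag estimates
print(forecast_one_step(y, W, coefs))
```

A test of whether the spatial-lag coefficients are jointly zero would play the role of the Granger causality test for regional spillovers; it is left out here to keep the sketch short.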
Spatial Filtering
• Our alternative approach decomposes the autoregressive processes according to exogenous spatial patterns representative of accessibility/contiguity relations between the regions
• Benefits:
  • We obtain an explicit model of the spatial patterns without being over-restrictive by imposing (probably erroneous) regime-specific constraints
  • We are able to estimate the model more parsimoniously, while covering the most relevant spatial structures
Spatial Filtering (2)
• Griffith’s (2003) spatial filtering (SF) approach is based on the computational formula of the Moran’s I (MI, Moran 1948) statistic
• This eigenvector decomposition technique extracts n orthogonal numerical components from an n × n normalized spatial weights matrix C
• C can be used to obtain, given X, the numerator of MI, and its extreme eigenvalues are approximately the extreme values of MI (Griffith 2000)
• Because of this mathematical relation, the eigenvectors of C represent all mutually exclusive (orthogonal and independent) spatial patterns implied by W
• They are extracted in decreasing order of spatial autocorrelation (MI): E1 has the largest MI achievable, given W, and all subsequent eigenvectors maximize MI while being orthogonal to previously extracted eigenvectors
• The set of eigenvectors explaining spatial patterns in the variable of interest can be found by regressing it stepwise on the eigenvectors
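The eigenvector extraction can be sketched in a few lines. The example assumes the usual formulation in which the decomposed matrix is M C M, with M = I − 11'/n the centring projector and C a binary contiguity matrix; the 4 × 5 rook grid and all function names are hypothetical. For an eigenvector with non-zero eigenvalue λ, Moran's I equals (n / 1'C1) · λ, which the last line checks numerically. The stepwise selection of eigenvectors is not implemented here.

```python
import numpy as np

def rook_grid_weights(rows, cols):
    """Binary rook-contiguity matrix for a rows x cols grid (hypothetical layout)."""
    n = rows * cols
    C = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < rows:
                C[i, i + cols] = C[i + cols, i] = 1.0
    return C

def morans_i(x, C):
    """Moran's I of vector x under the weights matrix C."""
    z = x - x.mean()
    return (len(x) / C.sum()) * (z @ C @ z) / (z @ z)

def spatial_filter_eigenvectors(C):
    """Eigen-decompose M C M with M = I - 11'/n; return eigenvalues and
    eigenvectors sorted by decreasing eigenvalue (decreasing Moran's I)."""
    n = C.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(M @ C @ M)   # M C M is symmetric, so eigh applies
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

C = rook_grid_weights(4, 5)
n = C.shape[0]
vals, E = spatial_filter_eigenvectors(C)
# Moran's I of the first eigenvector next to its eigenvalue-based counterpart
print(morans_i(E[:, 0], C), n / C.sum() * vals[0])
```

Candidate eigenvectors for a spatial filter would then be the leading columns of `E`, typically those whose Moran's I exceeds some threshold chosen by the analyst.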
Spatial Filtering (3)
• Griffith (2008) showed that SF can also help explain spatial heterogeneity in regression coefficients. An equivalent to GWR can be computed by interacting the Xs with the eigenvectors
• Patuelli et al. (2012) used it in a dynamic panel to construct a spatial filter representation of the serial autoregressive coefficients, allowing for improved inference in unit root testing
The Data
• We test the forecasting performance of SVAR and SF on three data sets, for Spain, Switzerland and France. We use official regional unemployment rates at the NUTS-3 level
• All three data sets have satisfactory, though different, temporal (T) and spatial (n) dimensions, and the geographical size of the spatial (administrative) units differs widely
• Average area of Spanish provinces is about 10,499 km², while for the Swiss cantons it is 1,582 km². French regions are 7,030 km² on average
• Data for Spain: quarterly unemployment rates by province, for the period 1976–2008, 47 provinces
• Data for Switzerland: monthly unemployment rates, for the period 1975–2008, 26 cantons
• Data for France: quarterly unemployment rates, for the period 1982–2011, 96 departments
Forecasting Strategy
1) We evaluate the short-run predictive power of the two methods. To do so, we use a rolling window
• For each model and data set, estimates are obtained using a fixed-size window of observations
• The forecasting window rolls over two years, providing one-step-ahead forecasts over 8 quarters for Spain and 24 months for Switzerland
• Given cross-sectional dimensions, the overall number of forecasted values is (8 * 47 =) 376 for Spain and (24 * 26 =) 624 for Switzerland
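The rolling scheme can be illustrated with a small index generator. This is a sketch under assumptions: the function name, the 0-based indexing, and the series length of 132 quarters (33 full years, 1976–2008) are not taken from the slides; only the 8 forecast quarters and the 47 provinces are.

```python
def rolling_origins(T, window, n_forecasts):
    """Yield (train_start, train_end, target) triples for one-step-ahead
    forecasts from a fixed-size rolling window. Indices are 0-based and
    train_end is exclusive: fit on [train_start, train_end), forecast `target`."""
    first_target = T - n_forecasts
    if first_target - window < 0:
        raise ValueError("window too long for the requested number of forecasts")
    for target in range(first_target, T):
        yield target - window, target, target

# Assumed sizes: 132 quarterly observations, last 8 quarters held out.
T_spain, n_spain = 132, 47
splits = list(rolling_origins(T_spain, window=T_spain - 8, n_forecasts=8))
print(splits[0], splits[-1])      # (0, 124, 124) (7, 131, 131)
print(len(splits) * n_spain)      # 376
```

Each triple keeps the estimation sample at the same length while moving it forward one period at a time, which is what distinguishes a rolling window from an expanding one.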
Forecasting Strategy
2) We then evaluate the predictive power of the same methods over longer forecasting horizons, again using a (one-year) rolling window and a fixed-size window of observations
• We provide forecasts until two years ahead, i.e. over 8 quarters for Spain and France, and 24 months for Switzerland
• Given cross-sectional dimensions, the overall number of forecasted values is (4 * 8 * 47 =) 1504 for Spain, (12 * 24 * 26 =) 7488 for Switzerland, and (4 * 8 * 96 =) 3072 for France
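The forecast counts follow from multiplying rolling origins, horizons and regions. The short check below enumerates every (origin, horizon, region) triple; the function and dictionary names are hypothetical, while the three factors per country are the ones given on the slide.

```python
from itertools import product

def forecast_grid(n_origins, max_horizon, n_regions):
    """Enumerate every (origin, horizon, region) triple that receives a forecast."""
    return list(product(range(n_origins), range(1, max_horizon + 1), range(n_regions)))

# (rolling origins in one year, horizons up to two years ahead, regions)
designs = {
    "Spain": (4, 8, 47),
    "Switzerland": (12, 24, 26),
    "France": (4, 8, 96),
}
counts = {name: len(forecast_grid(*dims)) for name, dims in designs.items()}
print(counts)   # {'Spain': 1504, 'Switzerland': 7488, 'France': 3072}
```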
Evaluation of Forecasts
• Forecasting performance is summarized by means of statistical indicators:
  • mean square error (MSE)
  • mean absolute error (MAE)
  • mean absolute percentage error (MAPE) (to account for scale heterogeneity)
  • Moran’s I (MI)
• We use a nonparametric test to assess whether two models are equally accurate: the sign test (ST, Lehmann 1998)
  • Does not rely on the usual assumptions of most tests (e.g. Diebold-Mariano or Wilcoxon tests), as it requires neither normality nor symmetry between the two vectors
  • Based on the comparison of forecasting errors. If the methods tested present a similar forecasting performance, the share of SF (Model 2) forecasts with a greater error than the corresponding SVAR (Model 1) forecast may be expected to be 50%
  • Does not provide insights on the error distribution, but only on pairwise comparative forecasting performance. In practice, it tests the hypothesis of equality of the medians
  • In the test statistic S, C is the number of times that Model 2 shows a higher error than Model 1, and p is the number of forecasts. S follows a standard normal distribution N(0, 1)
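The indicators and the sign test are simple to compute. In the sketch below the statistic is taken to be S = (C − p/2) / (√p / 2), the standard large-sample form of the sign test; this is an assumption, since the formula itself does not appear in the text. Function names and the toy numbers are hypothetical, and ties are counted as "not higher".

```python
import math
import numpy as np

def accuracy(actual, forecast):
    """MSE, MAE and MAPE (in percent) of a vector of forecasts."""
    actual = np.asarray(actual, dtype=float)
    err = np.asarray(forecast, dtype=float) - actual
    return {
        "MSE": float(np.mean(err ** 2)),
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(100 * np.mean(np.abs(err / actual))),
    }

def sign_test(abs_err_1, abs_err_2):
    """Sign test on paired absolute errors of Model 1 and Model 2.
    Returns (C, p, S, two-sided p-value from the normal approximation)."""
    e1 = np.asarray(abs_err_1, dtype=float)
    e2 = np.asarray(abs_err_2, dtype=float)
    p = len(e1)
    C = int(np.sum(e2 > e1))                    # Model 2 worse than Model 1
    S = (C - 0.5 * p) / (0.5 * math.sqrt(p))    # assumed standardization
    p_value = math.erfc(abs(S) / math.sqrt(2))  # equals 2 * (1 - Phi(|S|))
    return C, p, S, p_value

print(accuracy([10, 20], [11, 18]))             # MSE 2.5, MAE 1.5, MAPE 10.0
print(sign_test([1, 1, 1, 1], [2, 2, 0.5, 2]))  # C = 3, p = 4, S = 1.0
```

A positive S indicates that Model 2 is worse more often than not; with so few pairs the normal approximation is of course only illustrative.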
1) Results for Switzerland
• SVAR shows better forecasting performance than SF, although differences are considerably reduced when MAPE is considered. In any case, numerical distance is rather small
• In all cases, SF presents a high level of variability in comparison to SVAR (see graphs)
• Finally, the sign test is performed along three dimensions:
  • all forecasting errors are pooled (for all cross-sectional units and all forecasting periods)
  • the average forecasting errors by canton are analysed
  • the average forecasting errors per period are compared
• The results show a statistically better performance of the SVAR model only when forecasting errors by region are analysed
1) Results for Switzerland (2)
• From spatial methods, we might expect forecasting errors with no spatial autocorrelation…
• For the majority of forecasting periods, there is no significant spatial autocorrelation, for both SVAR and SF models, but SVAR seems to produce less spatially autocorrelated forecasting errors
• Overall, our findings are not surprising, since T >> n clearly favours a time-series-related method like the SVAR
2) Results for Switzerland (3)
• Sign test (MeAPE):
• Test is not significant for the first two forecasting horizons, but then becomes significant in favour of SVAR up to the two-year horizon
1) Results for Spain
• Findings for Spain differ from the ones for Switzerland: the SF model has gained in competitiveness from the different data structure
• In particular, the SVAR model appears to be more competitive with regard to MSE and MAE (when the error is not standardized), while the SF model minimizes percentage error (MAPE), winning six out of eight comparisons
• It is now the SVAR model that presents a higher heterogeneity in forecasting errors (see graphs) (maybe due to the increase in cross-sectional dimension?)
• Also noteworthy: generalized increase in forecasting errors over time, in particular in the last two quarters, coinciding with the 2008 financial crisis, which had a strong impact on the Spanish labour market
• In all cases, sign tests are not significant, suggesting an overall equivalence between the SVAR and the SF models
2) Results for Spain (2)
• As for the case of Switzerland, both methods produce spatially uncorrelated forecasting errors in most cases, but the SVAR model appears to account better for the true spatial correlation in the dependent variable
• In any case, the levels of spatial autocorrelation of forecasting errors, when significant, are very low
2) Results for Spain (3)
• Sign test (MeAPE):
• Test is not significant for the first two forecasting horizons, then becomes significant in favour of SVAR
• BUT SF becomes competitive again at the two-year forecasting horizon
2) Results for France (3)
• Sign test (MeAPE):
• SVAR appears to be superior for short-term forecasts
• BUT SF becomes competitive (and now wins!) when approaching the two-year forecasting horizon
Rejoinder
• Differences in data structure (between the Swiss, Spanish and French data sets) appear to be a discriminating factor in terms of forecasting accuracy
• Short-run forecasting
  • SVAR seems preferable to the SF model when T >> n and the spatial units are smaller (i.e., the Swiss data)
  • When moderate n and T are used, we do not find stable significant differences between the two competing methods
  • SVAR appears to minimize errors on the scale of the unemployment rates (MSE and MAE), while the SF model is preferable when percentage error is considered (MAPE)
  • This finding is justified by methodological aspects, as the SF model computes a geographical approximation of both the autoregressive coefficients and the fixed/random effects. As such, it may be less efficient in estimating outliers (e.g., change in high unemployment areas), while it may be expected to provide smoother findings on the spatial patterning of coefficients
  • Finally, the SVAR model shows a smaller number of spatially autocorrelated errors for both the Swiss and the Spanish data sets, although most estimations produced uncorrelated errors for both methods
Rejoinder (2)
• Expanding forecasting horizon
  • Consistently with previous results, median and average errors appear to be tied to the data structure (n and T)
  • Both methods may deserve their own niche in regional forecasting
  • Sign tests on median equivalence tend to prefer SVAR, but for longer forecasting horizons SF becomes competitive (Spain/France)
  • Forecasting errors of SF show stronger residual spatial autocorrelation, while SVAR forecasting errors often end up having negative spatial autocorrelation
• More questions to be answered:
  • Are n and T influencing our results, or are the geographical characteristics of the regions or macro attributes?
  • What happens for small-n, small-T?
  • Not possible to test a data structure opposite to the one of Switzerland (e.g., German NUTS-3, for which n >> T), as SVAR cannot be estimated in such a case
  • If the SF model improves its performance for longer horizons, could the spatial autocorrelation of its forecasting errors follow the same pattern?
  • How can we improve forecasting performance by considering neighbouring regions across national borders? (NARSC 2013)
Thank you! Roberto Patuelli Department of Economics roberto.patuelli@unibo.it www.unibo.it Thanks for listening!