Using quantile regression to forecast poverty rate from current monthly income variable
Workshop on best practices for EU-SILC revision
Summary
• Introduction
• Data
• Method
• Results
• Discussion
• Conclusion
Introduction
• Aim: forecasting the at-risk-of-poverty rate (ARPR) from survey data in SILC
• Why?
  • At-risk-of-poverty rate calculation: based on administrative data
  • Survey data available much earlier than administrative data
    • Survey data available during year N
    • Administrative data available during April N+1
    • At-risk-of-poverty rate disseminated in September N+1
• Disappointing results
Data
• Two distinct statistical units: households and individuals
  • Households: cross-section units
  • Individuals: panel-data units
• Survey data vs administrative data
  • Survey data: situation in year N
  • Administrative data: income of year N-1
Method
• General outline
• Choice of model
• Practical aspects
General outline
1. Estimate a relationship between survey and administrative data in year N
2. Use this relationship to forecast the at-risk-of-poverty rate for year N+1 without using the administrative data of year N+1
Choice of model 1
• Standard linear models are perhaps not the best suited to the problem
• Firpo, Fortin, Lemieux 2009: recentered influence function (RIF)
  • RIF measures the change in a functional of a distribution when the distribution changes slightly
  • ARPR seen as a functional of the distribution of equivalised disposable income
  • Survey data seen as covariates
• RIF regression gives the change in ARPR given a change in the distribution of the covariates
Choice of model 2
• The RIF has a convenient property: E[RIF] = ARPR
• 1. In year N:
  • Compute the RIF of ARPR on equivalised disposable income
  • Estimate RIF_N = β X_N (X_N: survey data)
• 2. In year N+1:
  • Use the estimated β to predict RIF_{N+1}
  • Forecasted ARPR: average of predicted RIF_{N+1}
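The property E[RIF] = ARPR can be checked numerically. The sketch below is not taken from the slides: it assumes the standard EU definition of the threshold (ARPT = 60% of the median), uses the usual linearised form of the influence function of the ARPR, ignores survey weights, and estimates the density with a Gaussian kernel and a normal-reference bandwidth. The function names and the synthetic log-normal incomes are purely illustrative.

```python
import math
import random
import statistics


def gaussian_kde_at(x, sample, bandwidth):
    """Gaussian kernel density estimate of `sample`, evaluated at the point x."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - y) / bandwidth) ** 2) for y in sample) / norm


def arpr_rif(incomes, share=0.6):
    """Return (ARPR, per-unit RIF values) for an unweighted income sample.

    Assumption: ARPT = share * median, with share = 0.6 by default.
    """
    n = len(incomes)
    median = statistics.median(incomes)
    arpt = share * median
    # Normal-reference bandwidth: an illustrative choice, not the author's.
    bandwidth = 1.06 * statistics.stdev(incomes) * n ** (-0.2)
    ratio = (gaussian_kde_at(arpt, incomes, bandwidth)
             / gaussian_kde_at(median, incomes, bandwidth))
    arpr = sum(y <= arpt for y in incomes) / n
    # RIF(y) = 1(y <= ARPT) - share * f(ARPT)/f(median) * (1(y <= median) - 1/2)
    rif = [(y <= arpt) - share * ratio * ((y <= median) - 0.5) for y in incomes]
    return arpr, rif


random.seed(0)
incomes = [random.lognormvariate(7.5, 0.6) for _ in range(1000)]  # synthetic data
arpr, rif = arpr_rif(incomes)
mean_rif = sum(rif) / len(rif)
print(f"ARPR = {arpr:.4f}, mean RIF = {mean_rif:.4f}, distinct RIF values = {len(set(rif))}")
```

With an even number of distinct incomes, exactly half of the sample lies at or below the median, so the correction term averages to zero and the mean of the RIF coincides with the empirical ARPR.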
Practical aspects: choice of covariates 1
• X_N: variables in SILC survey data
• Among these variables: current monthly income (CMI)
• Dependency of RIF on equivalised disposable income (EDI): non-monotonic, discontinuous
• Equivalised disposable income (administrative data) and current monthly income (survey data) expected to have a strong correlation
• Raw current monthly income: probably not a good covariate choice
Practical aspects: choice of covariates 2
Practical aspects: choice of covariates 3
• Instead of raw current monthly income: RIF of ARPR computed on current monthly income
• Involves the estimation of a probability density function
  • Issue 1: current monthly income reported as multiples of 10, 100, 500…
  • Issue 2: current monthly income sometimes reported as income brackets
• Computation of a simulated current monthly income that has a smooth distribution and is consistent with the actual current monthly income information
  • Heitjan and Rubin, 1990
  • Assumptions on the way households report their income
  • Also solves item non-response issues
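To make the idea of a simulated income concrete, here is a deliberately crude sketch. It is not the Heitjan and Rubin procedure, which models the coarsening mechanism; it only assumes that a household reporting a multiple of 500, 100 or 10 rounded to the nearest such multiple, and it draws a value uniformly within the implied interval (or within the reported bracket). The rounding bases, function names and example reports are all hypothetical.

```python
import random

ROUNDING_BASES = (500, 100, 10)  # hypothetical heaping units, coarsest first


def implied_interval(reported):
    """Interval of true incomes compatible with a reported amount or bracket."""
    if isinstance(reported, tuple):  # income bracket given as (low, high)
        return reported
    for base in ROUNDING_BASES:
        if reported % base == 0:
            # Assumption: the household rounded to the nearest multiple of `base`.
            return (max(reported - base / 2, 0), reported + base / 2)
    return (reported, reported)  # not a round number: taken as exact


def simulate_income(reported, rng):
    """Draw one simulated income uniformly within the implied interval."""
    low, high = implied_interval(reported)
    return rng.uniform(low, high)


rng = random.Random(42)
reports = [1500, 1230, 1237, (1000, 1500)]  # made-up reports, last one is a bracket
simulated = [simulate_income(r, rng) for r in reports]
for reported, value in zip(reports, simulated):
    print(reported, "->", round(value, 2))
```

A uniform draw removes the spikes at round amounts but does not by itself guarantee a smooth density; a model-based approach would draw from a fitted income distribution truncated to each interval.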
Practical aspects: choice of regression model 1
• RIF violates standard linear model assumptions: non-centered residuals
• Equivalised disposable income based RIF can only take 3 different values
• So does current monthly income based RIF
• E[ε | RIF_CMI] ≠ 0
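The slides do not write out the RIF. Under the assumption that the threshold is ARPT = 0.6 m, with m the median income and f the income density, the usual linearisation of the ARPR gives the following form, which makes the three possible values explicit (the notation r is introduced here for convenience):

```latex
\[
\mathrm{RIF}(y) = \mathbf{1}(y \le \mathrm{ARPT})
  - 0.6\,\frac{f(\mathrm{ARPT})}{f(m)}\left[\mathbf{1}(y \le m) - \tfrac{1}{2}\right],
\qquad \mathrm{ARPT} = 0.6\,m
\]
\[
\mathrm{RIF}(y) =
\begin{cases}
1 - 0.3\,r & \text{if } y \le \mathrm{ARPT} \\
-0.3\,r    & \text{if } \mathrm{ARPT} < y \le m \\
+0.3\,r    & \text{if } y > m
\end{cases}
\qquad r = \frac{f(\mathrm{ARPT})}{f(m)}
\]
\[
\mathbb{E}\left[\mathrm{RIF}(Y)\right]
  = \mathrm{ARPR} - 0.6\,r\left(\tfrac{1}{2} - \tfrac{1}{2}\right)
  = \mathrm{ARPR}
\]
```

The function jumps down at the threshold and up again at the median, which is why it is neither monotonic nor continuous in income.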
Practical aspects: choice of regression model 2
Practical aspects: choice of regression model 3
• May lead to poor estimation with standard linear models
• Other possibility
  • RIF defines three distinct domains
  • Estimate the probability of falling in each of the domains given the CMI-based RIF
• Identification issue: cannot estimate both the probability of falling in each domain and the level of RIF for this domain
• One of the domains = being under the at-risk-of-poverty threshold (ARPT)
  • Probability of falling in this domain = ARPR
  • No interest in computing RIF in such a case
Practical aspects: choice of regression model 4
• No simple way to solve these problems
• Choice: estimate RIF_{EDI,N} = α RIF_{CMI,N} + β X_N as a GLM
• Comparison with dichotomous and multinomial models
  • 1(EDI_N ≤ ARPT) = α 1(CMI_N ≤ ARPT) + β X_N
    • Logit
    • Probit
  • Three-category multinomial logit
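The dichotomous benchmark can be illustrated in its simplest form. The sketch below drops the covariates X and keeps only the indicator 1(CMI ≤ ARPT) as regressor. In that saturated case the logit and probit fits both reproduce the observed cell frequencies, so no optimiser is needed. The function names and the toy data are made up for illustration and are not results from the study.

```python
def fit_cell_rates(edi_poor, cmi_poor):
    """Estimate P(EDI <= ARPT | CMI-based indicator) by cell frequencies.

    With a single binary regressor this equals the fitted probability of a
    logit or probit model. Both cells must be non-empty.
    """
    rates = {}
    for flag in (False, True):
        outcomes = [e for e, c in zip(edi_poor, cmi_poor) if c == flag]
        rates[flag] = sum(outcomes) / len(outcomes)
    return rates


def forecast_arpr(rates, cmi_poor_next):
    """Forecast ARPR as the average predicted probability in the next year."""
    return sum(rates[c] for c in cmi_poor_next) / len(cmi_poor_next)


# Toy year-N data: is the household below the threshold according to EDI / CMI?
edi_poor_n = [True, True, False, False, False, False, True, False]
cmi_poor_n = [True, True, True, False, False, False, False, False]
rates = fit_cell_rates(edi_poor_n, cmi_poor_n)

# Toy year-N+1 survey data: only the CMI-based indicator is observed.
cmi_poor_next = [True, False, False, False]
forecast = forecast_arpr(rates, cmi_poor_next)
in_sample = forecast_arpr(rates, cmi_poor_n)
print(rates, forecast, in_sample)
```

Applied back to the estimation sample, the average predicted probability equals the observed share of households below the threshold, which is the property that makes this kind of model a natural benchmark for the RIF-based forecast.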
Practical aspects: choice of year 1
• Standard matching of survey and administrative data in SILC:
  • Survey data in year N
  • Administrative data on income of year N-1
• Aim: forecasting ARPR as early as possible
• Alternative matching: survey data in year N and administrative data on income of year N
• EDI defined at the household level
• Households = cross-section units / Individuals = panel-data units
• Select households whose composition did not change between years N and N+1 for the estimation
Practical aspects: choice of year 2
• Forecasting of ARPR levels:
  1. Estimate RIF_EDI = α RIF_CMI + β X on years N and N-1 (pooled cross-section)
  2. Use the estimated α and β to forecast ARPR in year N+1 based on RIF_{CMI,N+1} and X_{N+1}
  • N = 2009
• Forecasting of ARPR changes:
  1. Estimate RIF_{EDI,N} = α RIF_{CMI,N} + β X_N on year N
  2. Use the estimated α and β to forecast ARPR in year N+1 (resp. N+2) based on RIF_{CMI,N+1} and X_{N+1} (resp. RIF_{CMI,N+2} and X_{N+2})
  • N = 2008
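The two steps can be sketched with a much simpler model than the one in the slides. The example below replaces the GLM by ordinary least squares with an intercept, keeps RIF_CMI as the only regressor (no X), and forecasts the ARPR as the average predicted RIF. All names and numbers are illustrative assumptions.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def forecast_arpr_from_rif(rif_edi_n, rif_cmi_n, rif_cmi_next):
    """Step 1: fit on year N. Step 2: average the predicted RIF for the next year."""
    intercept, slope = fit_line(rif_cmi_n, rif_edi_n)
    predicted = [intercept + slope * r for r in rif_cmi_next]
    return sum(predicted) / len(predicted)


# Toy year-N values: each RIF takes one of three levels per household.
rif_cmi_n = [0.8, -0.2, 0.2, 0.2, -0.2, 0.8]
rif_edi_n = [0.85, -0.15, 0.15, 0.15, 0.85, -0.15]

# Toy year-N+1 survey-based RIF values.
rif_cmi_next = [0.8, 0.2, 0.2, -0.2]
forecast = forecast_arpr_from_rif(rif_edi_n, rif_cmi_n, rif_cmi_next)
in_sample = forecast_arpr_from_rif(rif_edi_n, rif_cmi_n, rif_cmi_n)
print(forecast, in_sample)
```

Because the fit includes an intercept, the in-sample average prediction equals the average of RIF_EDI, that is, the ARPR of the estimation year. Any forecast error therefore comes from the relationship shifting between years or from the model predicting individual RIF levels poorly.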
Results
• Estimation results
• Forecast of ARPR levels
• Forecast of ARPR changes
Estimation results 1
• Pooled cross-section estimations on years N and N-1:
  • RIF non-monotonic and discontinuous → signs and values of the coefficients in the RIF-GLM regression are difficult to interpret
  • For most covariates: non-significant coefficients in the RIF-GLM regression
  • Dichotomous and multinomial models: highly significant coefficients for most covariates
• Covariates discriminate well between the different domains, but the definition of the RIF leads to poor estimation results
• Remains true for estimation on year N only
Estimation results 2
• Very disappointing results
• Use of a reduced RIF-GLM regression: keep only covariates that had a significant coefficient in the original RIF-GLM regression
Forecast of ARPR levels 1
• 5 estimated models → 5 forecasts of ARPR based on survey data in 2010
• Comparison with published ARPR in 2010
• No confidence interval computed
Forecast of ARPR levels 2
Forecast of ARPR changes 1
• 5 estimated models → 5 forecasts of ARPR changes based on survey data in 2009 and 2010
• Comparison with published ARPR in 2009 and 2010
• No confidence interval computed
Forecast of ARPR changes 2
Discussion 1
• RIF-GLM regressions: poor estimation results + poor forecasting results
  • Not only unsatisfactory in comparison with published ARPR…
  • But also in comparison with basic dichotomous and multinomial models that do not require so many steps
• Why are these results so disappointing?
  • Violated assumptions
Discussion 2
• Assumptions made:
  1. Assumptions on the way households report an approximate CMI
  2. Estimating the model only on households whose composition does not change between N and N+1 is acceptable
  3. Assumptions related to the estimation of the PDF
  4. Assumptions related to the estimation of the GLM
  5. The relationship between RIF_EDI, RIF_CMI and X remains constant over time
Discussion 3
• Poor estimation results:
  • Covariates discriminate well between households that are below or above the ARPT, below or above the median of EDI…
  • But fail to do so between the three levels of the RIF
• GLM estimation is not appropriate when working with RIF computations
Discussion 4
Discussion 5
Discussion 6
• For every household in SILC 2010:
  • Actual RIF 2009 computed on EDI 2009
  • Forecasted RIF 2010 based on survey data
• Assuming incomes do not change much between 2009 and 2010, a good forecast would give very few differences between the two variables
• Is this the case?
  • Plot of the joint PDF of the two variables
Discussion 7
Discussion 8
Discussion 9
• Poor quality of the RIF prediction…
• … especially for households that are below the ARPT
• Major problem, since those households weigh heavily in the forecast of ARPR
• Misspecification of the model → poor estimation → poor prediction of RIF → poor forecast of ARPR
Conclusion
• RIF-regression-based forecasting proves disappointing
• Misspecification issues of the model seem difficult to overcome
• Alternative options to be investigated
  • Non-parametric estimations?
Thank you for your attention
Contact: Pierre Pora
Tel.: +33141175463
Email: pierre.pora@insee.fr
Insee, 18 bd Adolphe-Pinard, 75675 Paris Cedex 14, www.insee.fr
Statistical information: www.insee.fr / Contact Insee: 09 72 72 4000 (cost of a local call), Monday to Friday, 9:00 to 17:00