Thomas Glaser
Statistics Austria, Directorate Social Statistics
European Conference on Quality in Official Statistics, 5th June 2014
Model based estimation of indicators of poverty and social exclusion
Overview
• Europe 2020 indicators: Break in time series
• Data basis for modelling: EU-SILC 2011
• Different modelling variants
• Application of modelled register data effect to EU-SILC 2008-2010
• Conclusions and outlook
Register based household income
• Usage of register data from EU-SILC 2012 onwards
• Results in a break in time series for income based Europe 2020 indicators
  • At-risk-of-poverty: AROP(REG)
  • At-risk-of-poverty or social exclusion: AROPE(REG)
• Data from income registers also available for EU-SILC 2011
• Revision of time series 2008-2012 desired
• No income register data available yet for 2008-2010
• Model based register household income as solution
Choice of models: Variant 1
• Direct estimation of indicators
  • Likelihood of AROP(REG) and AROPE(REG) estimated by logistic regressions
  • Estimate of indicators: mean value of estimated probabilities on personal level
• Advantage:
  • Direct estimation of indicators
• Disadvantages:
  • Possibility of inconsistent estimates of indicators
  • Only moderate model fit
  • Underestimation of AROP(REG) and AROPE(REG)
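The idea behind variant 1 can be illustrated with a minimal sketch. It assumes a single standardised predictor and synthetic data; the helper `fit_logistic` and all numbers are hypothetical and not taken from the presentation. It fits a logistic regression by Newton-Raphson and takes the mean of the fitted probabilities as the indicator estimate.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(x, y, steps=25):
    """Fit P(y=1|x) = sigmoid(a + b*x) by Newton-Raphson (hypothetical helper)."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        p = [sigmoid(a + b * xi) for xi in x]
        # Gradient of the log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Fisher information (negative Hessian)
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Synthetic persons: x stands in for a standardised interview-based predictor,
# y = 1 marks a person who is at risk of poverty according to register income.
random.seed(2011)
n = 2000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if random.random() < sigmoid(-1.5 - 1.2 * xi) else 0 for xi in x]

a, b = fit_logistic(x, y)
p_hat = [sigmoid(a + b * xi) for xi in x]

# Indicator estimate: mean of estimated probabilities on person level
arop_estimate = sum(p_hat) / n
observed_rate = sum(y) / n
print(f"estimated rate {arop_estimate:.4f}, observed rate {observed_rate:.4f}")
```

On the fitting sample the two rates coincide because the model contains an intercept. The under-estimation mentioned on the slide would only show up when such a model is applied to data it was not fitted on.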
Choice of models: Variant 2
• Estimation of register based household income HINC(REG)
  • Linear regression: Natural log of HINC(REG) as dependent variable
• Estimation of indicators
  • Equivalised income from estimated HINC(REG) combined with interview based data
  • Calculation of AROP(REG) and AROPE(REG)
• Advantages:
  • Consistent indicators
  • Estimates of household income on sample level
  • Very good model fit (R² = 89%)
• Disadvantages:
  • Loss of variance due to regression
  • Underestimation of AROP(REG) and AROPE(REG)
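A rough sketch of variant 2 under simplifying assumptions: one predictor (log interview income) instead of the full predictor set, synthetic data, and an unweighted rate over households rather than persons. The at-risk-of-poverty threshold is taken as 60% of the median equivalised income; the names `ols` and `arop_rate` are hypothetical.

```python
import math
import random
import statistics

def ols(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def arop_rate(income, eq_size):
    """Share of units with equivalised income below 60% of the median."""
    eq_income = [inc / s for inc, s in zip(income, eq_size)]
    threshold = 0.6 * statistics.median(eq_income)
    return sum(e < threshold for e in eq_income) / len(eq_income)

# Synthetic households (all values are assumptions for illustration)
random.seed(2011)
n = 3000
log_hinc = [random.gauss(10.2, 0.6) for _ in range(n)]            # interview income
log_reg = [0.3 + 0.97 * v + random.gauss(0.0, 0.2) for v in log_hinc]  # register income
eq_size = [random.choice([1.0, 1.5, 1.8, 2.1]) for _ in range(n)]  # equivalence scale

# Regression with the natural log of HINC(REG) as dependent variable
intercept, slope = ols(log_hinc, log_reg)
fitted = [intercept + slope * v for v in log_hinc]
r2 = statistics.pvariance(fitted) / statistics.pvariance(log_reg)

arop_actual = arop_rate([math.exp(v) for v in log_reg], eq_size)
arop_model = arop_rate([math.exp(v) for v in fitted], eq_size)
print(f"R2 = {r2:.3f}")
print(f"AROP from register income {arop_actual:.3f}, from fitted income {arop_model:.3f}")
```

The fitted values always have a smaller variance than the observed log incomes (by the factor R²), which is the "loss of variance due to regression" listed among the disadvantages.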
Choice of models: Variant 2a
• Addition of iid N(0,σ²) stochastic error terms to estimates resulting from linear regression in variant 2
• Advantages:
  • Compensation of tendency towards the mean
  • Estimated HINC(REG) distribution similar to HINC(REG)
  • Estimates close to actual AROP(REG) and AROPE(REG) (EU-SILC 2011)
• Disadvantage:
  • Additional variance contains no additional information on structure of household income
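The effect of the added error terms can be sketched as follows. This assumes that σ is estimated from the regression residuals, which the slides do not state explicitly, and again uses one predictor and synthetic data.

```python
import math
import random
import statistics

# Synthetic log incomes (assumed values, for illustration only)
random.seed(42)
n = 3000
x = [random.gauss(10.2, 0.6) for _ in range(n)]                  # log HINC
y = [0.3 + 0.97 * xi + random.gauss(0.0, 0.2) for xi in x]       # log HINC(REG)

# Linear regression as in variant 2
mx, my = statistics.fmean(x), statistics.fmean(y)
slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
intercept = my - slope * mx
fitted = [intercept + slope * xi for xi in x]

# Assumption: sigma is the residual standard error of the regression
residuals = [yi - fi for yi, fi in zip(y, fitted)]
sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))

# Variant 2a: add iid N(0, sigma^2) error terms to the fitted values
noisy = [fi + random.gauss(0.0, sigma) for fi in fitted]

var_y = statistics.pvariance(y)
var_fitted = statistics.pvariance(fitted)
var_noisy = statistics.pvariance(noisy)
print(f"variance: observed {var_y:.4f}, fitted {var_fitted:.4f}, fitted + noise {var_noisy:.4f}")
```

The noise restores the spread of the income distribution but is independent of the household, which matches the disadvantage named above: it carries no additional information on the structure of household income.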
Choice of models: Variant 3
• Estimation of difference HINC - HINC(REG)
• Two step modelling
  • 1) Classification of relevant difference (discriminant analysis)
  • 2) Linear regression similar to variant 2 for difference
• Estimated difference is added to HINC
• Advantages:
  • Explicit estimation of register data effect
• Disadvantages:
  • Overestimation of AROP(REG) and AROPE(REG)
  • Low model fit
  • Errors of two modelling steps
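A two-step procedure of this kind might look like the sketch below. Several things are assumptions and not taken from the slides: the cut-off `THRESHOLD` defining a "relevant" difference, the single predictor `z`, the use of a one-dimensional linear discriminant rule, and the sign convention d = HINC(REG) - HINC, chosen here so that adding the estimated difference to HINC yields the register based estimate.

```python
import math
import random
import statistics

random.seed(3)
n = 2000
THRESHOLD = 1000.0  # hypothetical cut-off for a "relevant" difference

# Synthetic households: z is a hypothetical predictor, diff = HINC(REG) - HINC
hinc, z, diff = [], [], []
for _ in range(n):
    if random.random() < 0.3:
        zi = random.gauss(1.5, 0.6)
        di = 3000.0 + 1000.0 * zi + random.gauss(0.0, 500.0)
    else:
        zi = random.gauss(0.0, 0.6)
        di = random.gauss(0.0, 100.0)
    hinc.append(random.gauss(30000.0, 8000.0))
    z.append(zi)
    diff.append(di)
relevant = [abs(d) > THRESHOLD for d in diff]

# Step 1: linear discriminant analysis in one dimension (pooled variance)
z1 = [zi for zi, r in zip(z, relevant) if r]
z0 = [zi for zi, r in zip(z, relevant) if not r]
m1, m0 = statistics.fmean(z1), statistics.fmean(z0)
s2 = (sum((v - m1) ** 2 for v in z1) + sum((v - m0) ** 2 for v in z0)) / (n - 2)
p1 = len(z1) / n

def score(zi, mean, prior):
    """Linear discriminant score of one class."""
    return zi * mean / s2 - mean * mean / (2.0 * s2) + math.log(prior)

flagged = [score(zi, m1, p1) > score(zi, m0, 1.0 - p1) for zi in z]

# Step 2: linear regression of the difference, fitted on relevant cases only
d1 = [d for d, r in zip(diff, relevant) if r]
md = statistics.fmean(d1)
slope = (sum((v - m1) * (d - md) for v, d in zip(z1, d1))
         / sum((v - m1) ** 2 for v in z1))
intercept = md - slope * m1

# Estimated difference is added to HINC where a relevant difference is predicted
est_reg = [h + (intercept + slope * zi if f else 0.0)
           for h, zi, f in zip(hinc, z, flagged)]
true_reg = [h + d for h, d in zip(hinc, diff)]

accuracy = sum(f == r for f, r in zip(flagged, relevant)) / n
mae_two_step = statistics.fmean(abs(e - t) for e, t in zip(est_reg, true_reg))
mae_plain = statistics.fmean(abs(h - t) for h, t in zip(hinc, true_reg))
print(f"classification accuracy {accuracy:.3f}")
print(f"mean absolute error: two-step {mae_two_step:.0f}, interview income only {mae_plain:.0f}")
```

Misclassified households receive either no correction or an unwarranted one, so errors from both steps accumulate, as noted under the disadvantages.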
Weighting
• Calibration incorporates register income data
• Additional modelling step for estimated weights in every variant would be necessary
• All models fitted without weights
  • Characteristics relevant for weighting are also predictors in the models
  • Marginal differences weighted – unweighted
  • OLS most efficient for linear regression
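The claim that weighted and unweighted fits differ only marginally can be checked on synthetic data, as in the sketch below. The weights are hypothetical and depend only on the predictor, which mirrors the situation where the weighting characteristics are themselves predictors in the model.

```python
import random

def wls(x, y, w):
    """Weighted least squares for one predictor; returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(7)
n = 3000
x = [random.gauss(10.2, 0.6) for _ in range(n)]
y = [0.3 + 0.97 * xi + random.gauss(0.0, 0.2) for xi in x]

# Hypothetical survey weights: low-income households are weighted up
w = [3.0 if xi < 10.0 else 1.0 for xi in x]

a_unweighted, b_unweighted = wls(x, y, [1.0] * n)   # equal weights = OLS
a_weighted, b_weighted = wls(x, y, w)
print(f"slope unweighted {b_unweighted:.4f}, weighted {b_weighted:.4f}")
```

When the regression model is correctly specified, both fits estimate the same coefficients, and the unweighted OLS fit has the smaller variance.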
Chosen model
• Variant 2a: Estimation of register based household income including iid N(0,σ²) stochastic error terms
  • More advantages than disadvantages
  • Easy application
• Coefficients from regression and stochastic error terms were applied to interview based data of EU-SILC 2008-2010
• Socio-economic structure reflected in predictor variables for each year can be incorporated in estimation
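Applying the fitted model to earlier survey years could be sketched as below. The coefficient values, the single predictor, and the function name `estimate_hinc_reg` are hypothetical placeholders; the actual model uses a set of socio-economic predictors fitted on EU-SILC 2011.

```python
import math
import random

# Hypothetical stand-ins for the regression coefficients and residual
# standard deviation estimated on EU-SILC 2011
INTERCEPT, SLOPE, SIGMA = 0.3, 0.97, 0.2

def estimate_hinc_reg(hinc, rng):
    """Model based register income: exp(linear prediction + N(0, sigma^2) error)."""
    return [math.exp(INTERCEPT + SLOPE * math.log(h) + rng.gauss(0.0, SIGMA))
            for h in hinc]

# Made-up interview based household incomes for the earlier survey years
interview_income = {
    2008: [18500.0, 24300.0, 41200.0],
    2009: [19100.0, 25000.0, 39800.0],
    2010: [19800.0, 26100.0, 43500.0],
}

rng = random.Random(2014)  # fixed seed so the error terms are reproducible
estimates = {year: estimate_hinc_reg(values, rng)
             for year, values in interview_income.items()}

for year, values in estimates.items():
    print(year, [round(v) for v in values])
```

Because the coefficients stay fixed while the predictor values come from each year's own data, changes in the socio-economic structure between years carry through to the estimates.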
Conclusions and outlook
• Unbroken time series of Europe 2020 indicators for EU-SILC 2008-2012 achieved
• Europe 2020 targets can be measured from 2008 onwards with register based indicators
• Next task: recalculation of register based household income for EU-SILC 2008-2010
• Revision of EU-SILC micro-data 2008-2011 by 09/2014
• Publication of revised time series 2008-2013 in autumn/winter 2014
Please address queries to:
Thomas Glaser
thomas.glaser@statistik.gv.at
Contact information: Guglgasse 13, 1110 Vienna
Thank you for your attention!