9th Workshop on LFS Methodology
The Italian Labour Force Survey consistency framework
Antonio R. Discenza: discenza@istat.it
Silvia Loriga: siloriga@istat.it
Alessandro Martini: alemartini@istat.it
ISTAT – Italian National Statistical Institute, Labour Force Survey Division
Rome, May 15-16, 2014
Overview of the Italian LFS
• It is designed as a quarterly survey: all information is obtained by interview, with no use of the “wave approach”.
• The sample is allocated over space and time so as to produce direct monthly estimates of the main figures.
Dissemination of results
• Monthly figures at national level (both SA and NSA series)
• Quarterly figures down to the 21 NUTS_2 “regions” (both SA and NSA series), and micro-data
• Yearly figures down to the 110 NUTS_3 “provinces” and 13 larger municipalities, as “direct estimates”, and micro-data
• Yearly figures of employment and unemployment for the 686 Local Labour Market Areas, as small-area estimates
• Yearly figures from the household perspective
• Quarter-on-quarter flow estimates and longitudinal micro-data
• Year-on-year flow estimates and longitudinal micro-data
IT-LFS assures full consistency between figures and micro-data using calibration estimators and other benchmarking techniques.
External information on reference population
For the consistency framework of IT-LFS and its timeliness, a fundamental role is played by the auxiliary information that the Demographic Division updates on a monthly basis for weighting purposes: the resident population in each municipality by sex, age and citizenship (Nationals/Non-Nationals).
• A monthly population is used for monthly estimates.
• A weighted average of the monthly populations is used for quarterly estimates, where the weight of each month is its number of weeks (4 or 5).
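The weighted average of the monthly populations can be written compactly. This is a sketch under assumptions: the symbols P_m, k_m and \bar{P}_q are introduced here for illustration, and the 13-week quarter is inferred from the 4/13, 4/13, 5/13 shares quoted for the calibration constraints.

```latex
\bar{P}_q \;=\; \frac{\sum_{m \in q} k_m \, P_m}{\sum_{m \in q} k_m},
\qquad k_m \in \{4, 5\}, \qquad \sum_{m \in q} k_m = 13
```

Here P_m is the reference population of month m and k_m is the number of weeks assigned to that month within quarter q.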
The quarterly weighting procedure
• A generalized calibration estimator has been adopted in order to improve the accuracy of the estimates.
• Final weights are obtained in three steps:
  • the base weights are obtained for all selected households as the inverse of the probability of inclusion in the sample;
  • the base weights are adjusted by a correction factor for total non-response, worked out as the reciprocal of the response ratio for sub-groups of households;
  • final weights are obtained by applying a calibration estimator, which assures that the sample reproduces the structure of the population with regard to several constraints.
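The three steps can be sketched as follows. This is a minimal illustration, not the Istat production procedure: it assumes a weighted response ratio in step 2 and uses linear (chi-square distance) calibration as one simple member of the generalized calibration family in step 3. All function names, the toy data and the totals are hypothetical.

```python
import numpy as np

def base_weights(inclusion_prob):
    """Step 1: base weight = inverse of the inclusion probability."""
    return 1.0 / np.asarray(inclusion_prob, dtype=float)

def nonresponse_adjust(d, responded, group):
    """Step 2: within each sub-group, divide respondents' weights by the
    (weighted) response ratio; non-respondents end up with weight 0."""
    d = np.asarray(d, dtype=float)
    responded = np.asarray(responded, dtype=bool)
    group = np.asarray(group)
    adjusted = np.zeros_like(d)
    for g in np.unique(group):
        in_g = group == g
        ratio = d[in_g & responded].sum() / d[in_g].sum()
        adjusted[in_g & responded] = d[in_g & responded] / ratio
    return adjusted

def linear_calibration(d, X, totals):
    """Step 3: find w = d * (1 + X @ lam) such that X.T @ w == totals."""
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    M = X.T @ (d[:, None] * X)
    lam = np.linalg.solve(M, totals - X.T @ d)
    return d * (1.0 + X @ lam)

# Toy data (hypothetical): six sampled units, one non-respondent.
pi = [0.01, 0.01, 0.02, 0.02, 0.01, 0.02]
responded = [True, True, False, True, True, True]
group = ["north", "north", "north", "south", "south", "south"]
sex = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [1, 0], [0, 1]])  # male, female

d = base_weights(pi)
a = nonresponse_adjust(d, responded, group)
keep = a > 0                      # calibrate respondents only
known_totals = [260.0, 240.0]     # hypothetical population by sex
w = linear_calibration(a[keep], sex[keep], known_totals)
```

After step 3 the weighted sample counts by sex match `known_totals` exactly, which is the sense in which the calibrated sample reproduces the population structure.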
Calibration to the reference population
Final weights are obtained using constraints from several external sources:
• population by sex and fourteen age groups (0-14, five-year groups from 15-19 to 70-74, 75 and more years) at NUTS_2 level;
• non-national population (males, females, EU, non-EU) at NUTS_2 level;
• population by sex and five age groups (0-14, 15-29, 30-49, 50-64, 65 and more years) at NUTS_3 level;
• population by sex and five age groups (0-14, 15-29, 30-49, 50-64, 65 and more years) for 13 large municipalities (> 250,000 inhabitants);
• number of households at NUTS_2 level for each rotation group;
• population by sex at NUTS_2 level for each of the three months of the quarter (representing 4/13, 4/13 and 5/13 of the whole quarter).
Monthly constraints and monthly weights
• The weighting procedure provides fully consistent monthly and quarterly weights.
• Monthly estimates could be obtained directly from the monthly sample and its monthly weights.
• Problems:
  • these estimates are only available at the end of the quarter, when all the interviews have been completed and the quarterly weights have been computed;
  • the time series showed very high variability.
• Monthly direct estimates were never published.
MONTHLY ESTIMATES
Provisional and final monthly estimates
For a few years Istat studied how to improve the timeliness and quality of the monthly estimates. A Regression Composite Estimator was found to suit the purpose: it is a design-based estimator, relies purely on LFS data, and exploits the longitudinal dimension of the sample to produce more robust estimates.
• After a long period of evaluating the results and tuning the model, monthly estimates were finally disseminated in 2009.
• The framework: monthly estimates are disseminated as Provisional (timely) and as Final at a later stage.
Provisional monthly estimates
• Data are first disseminated on a provisional basis, about 30 days after each reference month, computed over a partial sample (the fieldwork is not completed yet).
• The press release on monthly unemployment is issued on the same day as Eurostat's and focuses on Seasonally Adjusted (SA) data.
• Simultaneously, monthly data (both SA and not SA) are made available on the Istat data warehouse (I.Stat).
• The production process starts about 22 days after the end of the reference month.
Provisional monthly data production timetable (figure: the production of monthly estimates)
Estimation procedure in two steps
• Step 1: a calibration to apply the regression composite estimator.
• Step 2: the seasonal adjustment of the estimates:
  • first, a univariate seasonal adjustment;
  • then, a time series reconciliation procedure in two steps to ensure consistency between different aggregates and the total population, and between monthly and quarterly SA series.
• The procedure is based on a dual system of constraints:
  • contemporaneous constraints (monthly population by sex and age groups);
  • inter-temporal constraints (quarterly SA figures of employment, unemployment and inactivity; quarterly population by sex and age groups).
• The benchmarking approach is based on the “movement preservation principle”, in order to maintain the temporal profile of the original series.
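The movement preservation principle can be illustrated with a small benchmarking sketch. The slides do not name the exact method, so the following assumes an additive first-difference (Denton-type) objective and covers only the inter-temporal constraints; the function names, the fixed 4/13, 4/13, 5/13 month weights and the numbers are hypothetical.

```python
import numpy as np

def aggregation_matrix(n_quarters, month_weights):
    """One row per quarter: weighted average of its three months."""
    A = np.zeros((n_quarters, 3 * n_quarters))
    for q in range(n_quarters):
        A[q, 3 * q:3 * q + 3] = month_weights
    return A

def benchmark_additive(prelim, quarterly, month_weights=(4/13, 4/13, 5/13)):
    """Minimise the sum of squared changes in (x - prelim) between
    consecutive months, subject to A @ x == quarterly."""
    p = np.asarray(prelim, dtype=float)
    b = np.asarray(quarterly, dtype=float)
    if p.size != 3 * b.size:
        raise ValueError("need exactly three months per quarter")
    A = aggregation_matrix(b.size, month_weights)
    D = np.diff(np.eye(p.size), axis=0)   # first-difference operator
    H = D.T @ D
    # Karush-Kuhn-Tucker system of the equality-constrained least squares
    kkt = np.block([[H, A.T], [A, np.zeros((b.size, b.size))]])
    rhs = np.concatenate([H @ p, b])
    return np.linalg.solve(kkt, rhs)[:p.size]

# Hypothetical monthly SA series (thousands) and quarterly SA benchmarks.
monthly_sa = [22900.0, 22950.0, 22940.0, 23010.0, 23050.0, 23020.0]
quarterly_sa = [22960.0, 23040.0]
adjusted = benchmark_additive(monthly_sa, quarterly_sa)
A = aggregation_matrix(2, (4/13, 4/13, 5/13))
```

The adjusted series satisfies the quarterly constraints while its month-to-month corrections change as smoothly as possible, so the temporal profile of the preliminary series is largely kept.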
Employment figures with three different estimators (figure)
Another representation: irregular vs. seasonal (figure)
Source: Q2010 Conference, “Assessing quality by means of temporal disaggregation”, Riccardo Gatto, Silvia Loriga, Andrea Spizzichino and Alessio Guandalini
Framework for dissemination of monthly estimates
• Final monthly data are then produced, for each of the three months, when the corresponding quarterly data are available, that is about 60 days after the reference quarter.
• An additional step is added in the estimation procedure: a specific calibration step that assures that monthly data are consistent with the quarterly ones (for the main aggregates, the weighted average of the three monthly figures, with weights equal to 4/13 or 5/13, is equal to the corresponding quarterly figure). The constraints are related to:
  • single months (total population by sex and age groups at different levels of geographical detail);
  • the quarterly estimates of the main aggregates: employed, unemployed and inactive, by gender and three age groups (15-24, 25-64, 65+).
GROSS LABOUR MARKET FLOWS
• Quarterly and yearly net changes are the final result of a large number of gross flows of different nature and size.
Definition of a reference longitudinal population
• The longitudinal micro-data files are a “by-product” of the survey itself; the LFS is not a “real” panel survey (the longitudinal sample has no information on persons who move out of the selected households, or on households that move out of the municipality).
• Longitudinal estimates can refer only to a specific longitudinal reference population.
• Longitudinal weights should: reflect the longitudinal population; account for panel attrition (usually not at random); ensure consistency with the other quarterly estimates.
Reference longitudinal population and weights
• The longitudinal population in the IT-LFS is defined as the population that is resident in the same municipality for the entire 3- or 12-month period, excluding:
  • deaths;
  • those who have moved to other Italian municipalities (change of residence);
  • migrants to other countries.
• It is fully consistent with the quarterly reference populations through the general population equation: the longitudinal population equals the initial population minus deaths, moves to other Italian municipalities and emigration.
• A multi-step calibration procedure is used to compute longitudinal weights, which produce results that are also consistent with the quarterly cross-sectional populations and figures.
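The link between the longitudinal population and the two cross-sectional populations can be written as an accounting identity. The formulas on the original slide did not survive, so the notation below is an assumption introduced for illustration; the figures are those of the complete matrix for Quarter 1 2007 to Quarter 1 2008 (population aged 15+, thousands) shown at the end of the deck.

```latex
P^{L}_{t,t+h} \;=\; P_t - D - O \;=\; P_{t+h} - N - I
```

```latex
48{,}581 \;=\; 50{,}424 - 547 - 1{,}296 \;=\; 50{,}801 - 584 - 1{,}636
```

Here P_t and P_{t+h} are the populations at the start and end of the period, D is deaths, O is people leaving the municipality (to other municipalities or abroad), N is people entering the age range (children turning 15) and I is people entering the municipality.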
Longitudinal micro-data and transition matrices
• This approach allows us to produce several kinds of longitudinal estimates of gross flows and transition rates, assuring consistency of a large number of stock/flow results, by sex and age groups, and at NUTS2 and NUTS3 level.
• It is straightforward to calculate:
  • quarterly flows: from one quarter to the subsequent one (3 months, quarter-on-quarter overlap);
  • yearly flows: from one quarter to the same quarter of the subsequent year (12 months, year-on-year overlap);
  • average yearly flows: the average of the 4 yearly flows referring to the 4 quarters of the calendar year (12 months, year-on-year overlap).
• Average yearly flows are obtained by appending the yearly longitudinal datasets and dividing their weights by four:
  • flow estimates are consistent with yearly cross-sectional estimates (annual averages) for the 2 consecutive years;
  • more detailed analysis is possible at regional level and for subgroups.
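A minimal sketch of how a weighted transition matrix and the average yearly flows could be computed from longitudinal micro-data. The record layout (status at start, status at end, longitudinal weight), the function names and the toy records are hypothetical.

```python
import numpy as np

STATUSES = ("employed", "unemployed", "inactive")

def transition_matrix(start, end, weights):
    """Weighted flows: rows = status at start, columns = status at end."""
    index = {s: i for i, s in enumerate(STATUSES)}
    flows = np.zeros((len(STATUSES), len(STATUSES)))
    for s, e, w in zip(start, end, weights):
        flows[index[s], index[e]] += w
    return flows

def average_yearly_flows(datasets):
    """Append the yearly longitudinal datasets and divide their weights
    by the number of datasets (four, one per quarter of the year)."""
    start, end, weights = [], [], []
    for s, e, w in datasets:
        start.extend(s)
        end.extend(e)
        weights.extend(x / len(datasets) for x in w)
    return transition_matrix(start, end, weights)

# Toy yearly longitudinal datasets, one per quarter (hypothetical records).
q1 = (["employed", "unemployed"], ["employed", "employed"], [100.0, 20.0])
q2 = (["employed", "inactive"], ["inactive", "inactive"], [10.0, 200.0])
q3 = (["employed"], ["employed"], [120.0])
q4 = (["unemployed", "inactive"], ["unemployed", "employed"], [30.0, 8.0])
datasets = [q1, q2, q3, q4]

average = average_yearly_flows(datasets)
```

Dividing the weights by four before tabulating gives the same matrix as averaging the four yearly matrices, while keeping a single micro-data file that users can analyse for subgroups.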
Complete Matrix with net and gross flows. Quarter 2 2001 – Quarter 2 2002. (Thousands)
GESIS – Mannheim, 5-6 March 2009
• Net change due to longitudinal population: +96
• Net change due to demographic flows: -35
• Net change due to migratory flows: +323
• Net change in employment: +384
Transition Matrix for longitudinal population. Quarter 2 2001 – Quarter 2 2002. (Thousands)
• Leaving employment: 1,203
• Persistence in employment
• Entering employment: 1,298
• Net change: +96
• About 2,500 movements
Points of discussion about consistency between stocks and flows
• Is this consistency worth having?
• This methodological approach requires data on the longitudinal population that are of good quality and sufficiently detailed, which is the case for Istat. It would be interesting to study the possibility of using it in other countries, or at European level.
• What could be the limitations or the advantages of this method in countries with a different survey design, for example those that sample dwellings or use an area sample?
A brief exercise on WAVE APPROACH
A brief exercise on the wave approach
• IT-LFS has never used the wave approach. All the variables are collected, in all quarters, on the whole sample.
• We have the possibility to simulate a wave approach on past data and compare the results with the annual averages already disseminated.
• We assumed that some of the structural variables were observed only on the first waves of the 4 quarters.
• This exercise was conducted to evaluate the impact of introducing the wave approach:
  • on estimation procedures;
  • in terms of coherence/consistency between yearly estimates (from the sub-sample) and annual averages (from the full sample).
Rotational pattern, full and sub-samples (figure: sub-sample for the structural variables)
• The sub-sample has the same theoretical sample size as a quarterly sample.
• We have reweighted the sub-sample, benchmarking it to the averages of the 4 quarters (from the full samples), to get consistency with annual averages.
Commission Regulation (EC) No 377/2008 • sets “conditions for the use of a sub-sample for the collection of data on structural variables” • It states that: • “Consistency between annual sub-sample totals and full-sample annual averages shall be ensured for employment, unemployment and inactive population by sex and for the following age groups: 15 to 24, 25 to 34, 35 to 44, 45 to 54, 55 +” • “The sample used to collect information on ad hoc modules shall also provide information on structural variables”.
Conditions for weighting the sub-sample
• Considering that:
  • the sub-sample has to be used for the current ad hoc modules and the future Supplementary Annual Modules (we want the possibility to analyse regional differences);
  • it is important to also take into account the differences between the theoretical and the actual sample in terms of distribution over time and space (to compensate for possibly different total non-response in different quarters and regions);
  • the higher the total non-response and the bias in the different waves or quarters, the higher the risk of inconsistencies between the two kinds of annual averages.
Weighting the sub-sample
• Some yearly variables in the sub-sample could be strongly correlated with those collected quarterly, not only with ILO status.
• If the sub-sample is biased with respect to those quarterly variables, then the estimate of the yearly variable could be biased.
• For example, “income”, “second job” and “looking for another job” are probably correlated with STAPRO, FTPT, TEMP, NACE, ISCO.
• Under these conditions, is the minimum set of requirements in Regulation 377/2008 sufficient to achieve coherent results and to produce unbiased yearly estimates?
Different sets of constraints (SoC)
• The conditions in the regulation do not seem sufficient to us.
• Several sets of final weights have been obtained:
  • using calibration estimators,
  • starting from the quarterly weights,
  • with several different sets of constraints (SoC):
    • annual distribution of the reference population by sex, age, region and citizenship (similar to the quarterly weights);
    • annual averages of several main variables correlated with the structural variables.
• For each SoC, all constraints are applied simultaneously and defined at NUTS 2 level.
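To make the idea of calibrating the sub-sample to full-sample annual averages concrete, here is a small sketch. It uses raking (iterative proportional fitting), which is only one possible calibration estimator and is an assumption here; the margins, the starting weights and the function name are hypothetical.

```python
import numpy as np

def rake(weights, groups_a, groups_b, targets_a, targets_b, iterations=50):
    """Alternately rescale the weights so that the weighted totals match
    the targets of each margin (e.g. sex, then ILO status)."""
    w = np.asarray(weights, dtype=float).copy()
    groups_a = np.asarray(groups_a)
    groups_b = np.asarray(groups_b)
    for _ in range(iterations):
        for groups, targets in ((groups_a, targets_a), (groups_b, targets_b)):
            for level, target in targets.items():
                mask = groups == level
                w[mask] *= target / w[mask].sum()
    return w

# Hypothetical sub-sample records with their starting (quarterly) weights.
sex = ["M", "M", "M", "F", "F", "F"]
status = ["employed", "unemployed", "inactive"] * 2
start_weights = [100.0, 20.0, 150.0, 90.0, 30.0, 200.0]

# Hypothetical constraints: population by sex, and full-sample annual
# averages by ILO status (both margins sum to the same total).
population_by_sex = {"M": 480.0, "F": 520.0}
annual_average_by_status = {"employed": 450.0, "unemployed": 50.0, "inactive": 500.0}

final_weights = rake(start_weights, sex, status,
                     population_by_sex, annual_average_by_status)
```

Adding further margins (for example, professional status or economic sector) corresponds to moving to a richer set of constraints.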
Different sets of constraints (SoC)
Table 1 – INCDECIL: Annual averages obtained from the full sample and the sub-sample using different sets of constraints. Year 2012. (Percentages)
For INCDECIL, the sub-sample provides higher relative frequencies for lower monthly pay than the full sample, especially for the first decile. The opposite happens for higher monthly pay. The differences become bigger in SoC_7, where constraints are also put on the characteristics of the employment.
Different sets of constraints (SoC)
Table 2 – MAINSTAT: Annual averages obtained from the full sample and the sub-sample using different sets of constraints. Year 2012. (Absolute values, Percentages)
For MAINSTAT (see Table 2), the sub-sample provides a lower number of employed (about 100 thousand fewer) and a higher number of unemployed (about 100 thousand more) than the full sample. The greatest difference occurs with SoC_2, where no constraints are put on labour statuses. There is not much difference among the other SoCs.
Different sets of constraints (SoC)
Table 3 – EXIST2J-STAPRO2J-NACE2D2J-HWACTUA2: Annual averages obtained from the full sample and the sub-sample using different sets of constraints. Year 2012. (Absolute values, Percentages, Averages)
Table 3 shows the results for some of the variables related to the SECOND JOB. The sub-sample provides a much higher number of employed with a second job (+30%) and a much higher incidence (from 1.4% to 1.9%). As a consequence, the total number of hours worked is higher (by about 20%), while the number of hours worked per employed person is much smaller (from 23.5 to 18.6). The estimates are higher for both employees and self-employed, and in all the main NACE sectors. However, the sub-sample tends to reduce the incidence of employees and of the employed in the Service sector, and to increase the incidence of the self-employed and of the employed in Agriculture and Industry.
Points of discussion about the wave approach
• There is no doubt that panel attrition exists and that quarterly estimates could be biased. Thus their annual averages could also be biased, but they have higher precision.
• On the other hand, it also seems reasonable that estimates from the sub-sample should “in principle” be less biased than those from the full sample, but with lower precision.
• An important question arises: is it methodologically correct to benchmark the sub-sample estimates to the full-sample ones if we suspect that the latter are more biased than the former?
Points of discussion about the wave approach
• Are we sure that the benefits of a reduction in respondent burden are so high that they compensate for, or exceed, the much bigger effort needed for:
  • the continuous management of questionnaires and micro-data,
  • the implementation of a more complex methodology?
• Time series for the structural variables could have breaks when the wave approach is introduced. How should this be managed?
• What would be the dissemination strategy, given the new limitations due to the consistency problem?
• What kind of yearly indicators can be produced: levels or percentage distributions?
THANK YOU FOR YOUR ATTENTION!
... and VERY MUCH INDEED for your PATIENCE, TOLERANCE, TENACITY, mental alertness, physical resistance and great capacity to remain calm...
Complete Matrix with net and gross flows. Quarter 1 2007 – Quarter 1 2008. (Thousands)

| Labour status at 2007Q1 | Employed at 2008Q1 | Unemployed at 2008Q1 | Inactive at 2008Q1 | Total (longitudinal population) | Deaths | People leaving the municipalities | Population aged 15+ 2007Q1 |
|---|---|---|---|---|---|---|---|
| Employed | 20,346 | 353 | 1,281 | 21,980 | 49 | 817 | 22,846 |
| Unemployed | 489 | 449 | 514 | 1,452 | 2 | 102 | 1,556 |
| Inactive | 1,260 | 757 | 23,131 | 25,149 | 495 | 377 | 26,021 |
| Total (longitudinal population) | 22,095 | 1,559 | 24,926 | 48,581 | 547 | 1,296 | 50,424 |
| Children aged 15 | 0 | 0 | 584 | 584 | | | |
| People entering the municipalities | 1,075 | 202 | 359 | 1,636 | | | |
| Population aged 15+ 2008Q1 | 23,170 | 1,761 | 25,870 | 50,801 | | | |

• Net change due to longitudinal population flows: +115
• Net change due to demographic flows: -49
• Net change due to migratory flows: +258
• Net change in cross-sectional employment: +324
European Conference on Quality in Official Statistics – Q2010, 3-6 May 2010, Helsinki
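The decomposition of the net change in employment can be checked with a few lines of arithmetic. The figures below are read from the complete matrix for Quarter 1 2007 to Quarter 1 2008 (thousands); the variable names are introduced here for illustration.

```python
# Employment figures in thousands, read from the complete matrix.
employed_2007q1 = 22_846               # cross-sectional, population aged 15+
employed_2008q1 = 23_170
longitudinal_employed_2007q1 = 21_980  # longitudinal population, start
longitudinal_employed_2008q1 = 22_095  # longitudinal population, end
employed_deaths = 49                   # employed at 2007Q1 who died
employed_turning_15 = 0                # new 15-year-olds employed at 2008Q1
employed_leaving = 817                 # employed who left their municipality
employed_entering = 1_075              # entrants employed at 2008Q1

longitudinal_change = longitudinal_employed_2008q1 - longitudinal_employed_2007q1
demographic_change = employed_turning_15 - employed_deaths
migratory_change = employed_entering - employed_leaving
net_change = longitudinal_change + demographic_change + migratory_change

print(longitudinal_change, demographic_change, migratory_change, net_change)
# prints: 115 -49 258 324
```

The three components add up to the change between the two cross-sectional employment stocks, which is the stock/flow consistency the framework aims for.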
Transition Matrix for longitudinal population. Quarter 1 2007 – Quarter 1 2008. (Thousands)

| Labour status at 2007Q1 | Employed at 2008Q1 | Unemployed at 2008Q1 | Inactive at 2008Q1 | Total |
|---|---|---|---|---|
| Employed | 20,346 | 353 | 1,281 | 21,980 |
| Unemployed | 489 | 449 | 514 | 1,452 |
| Inactive | 1,260 | 757 | 23,131 | 25,149 |
| Total | 22,095 | 1,559 | 24,926 | 48,581 |

• Leaving employment: 1,634
• Persistence in employment: 20,346
• Entering employment: 1,749
• Net change: +115
• Almost 3,400 movements
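The gross flows behind the net change can be derived directly from the transition matrix. The sketch below uses the Quarter 1 2007 to Quarter 1 2008 matrix (thousands); the variable names and the row-percentage transition rates are additions for illustration.

```python
import numpy as np

# Rows: status at 2007Q1; columns: status at 2008Q1
# (order: employed, unemployed, inactive), in thousands.
flows = np.array([
    [20_346, 353, 1_281],
    [489, 449, 514],
    [1_260, 757, 23_131],
], dtype=float)

persistence = flows[0, 0]        # employed in both quarters
leaving = flows[0, 1:].sum()     # employed -> unemployed or inactive
entering = flows[1:, 0].sum()    # unemployed or inactive -> employed
net_change = entering - leaving
movements = entering + leaving   # gross flows in and out of employment

# Transition rates: each row divided by its total.
rates = flows / flows.sum(axis=1, keepdims=True)

print(leaving, entering, net_change, movements)
# prints: 1634.0 1749.0 115.0 3383.0
```

A net change of a little over one hundred thousand is thus the balance of roughly 3.4 million movements into and out of employment.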