Learn about the validation process of the French population census by INSEE, including data collection methods, estimation mechanisms, quality processes, and validation procedures to ensure accuracy and reliability of census data.
The validation of the French population census Olivier Lefebvre, Insee
The method of the new census • Two major principles • Data collection in rotation (five-year cycle) • Sample surveys for large communes (more than 10 000 inhabitants) • The sampling base: the located buildings register (répertoire d’immeubles localisés - RIL) • Special case of persons not living in ordinary dwellings
The estimation mechanism • Municipalities with 10 000 inhabitants or more: • A sample of 8% is surveyed every year, i.e. 40% over 5 years • Moving average over 5 years (number of persons per dwelling) • Applied to the number of dwellings of the median year (given by the RIL)
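The estimation for large municipalities can be sketched as follows. This is only an illustration under stated assumptions: the slide does not say whether the five-year average is a pooled ratio or a mean of yearly ratios, so the pooled ratio is assumed here, and the function name, argument names and figures are all hypothetical.

```python
def estimate_population(persons_by_year, dwellings_by_year, ril_dwellings_median_year):
    """Estimate a municipality's population for the median year of a 5-year cycle.

    persons_by_year / dwellings_by_year: persons counted and dwellings surveyed
    in each of the five annual samples (assumed inputs, not Insee's data model).
    ril_dwellings_median_year: number of dwellings in the RIL for the median year.
    """
    if len(persons_by_year) != 5 or len(dwellings_by_year) != 5:
        raise ValueError("exactly five annual surveys are expected")
    total_dwellings = sum(dwellings_by_year)
    if total_dwellings == 0:
        raise ValueError("no surveyed dwellings")
    # Assumption: pooled ratio over the five years (persons per dwelling).
    persons_per_dwelling = sum(persons_by_year) / total_dwellings
    # Apply the average household size to the register's dwelling count.
    return persons_per_dwelling * ril_dwellings_median_year


# Made-up figures: 10 000 persons in 4 000 surveyed dwellings over 5 years.
persons = [2000, 2100, 1900, 2050, 1950]
dwellings = [800, 820, 780, 810, 790]
estimate = estimate_population(persons, dwellings, ril_dwellings_median_year=10400)
print(estimate)  # 26000.0
```

With these numbers the average is 2.5 persons per dwelling, which is then scaled by the 10 400 dwellings listed in the register for the median year.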
The same year of reference • Municipalities with fewer than 10 000 inhabitants
The estimation mechanism • Municipalities with fewer than 10 000 inhabitants: • Exhaustive census every 5 years • Population updated by: • Extrapolation taking into account administrative sources (Housing Tax files) • Interpolation with the last census (1999)
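The two updating mechanisms for small municipalities could look like the sketch below. The slide gives no formulas, so this assumes a linear interpolation between two exhaustive counts and an extrapolation proportional to the growth of dwellings in the Housing Tax files. The function names, the 2005 census year and all figures are hypothetical.

```python
def interpolate_population(pop_start, year_start, pop_end, year_end, year):
    """Linear interpolation between two census counts (assumed method)."""
    if not year_start <= year <= year_end or year_start == year_end:
        raise ValueError("year must lie between two distinct census years")
    fraction = (year - year_start) / (year_end - year_start)
    return pop_start + fraction * (pop_end - pop_start)


def extrapolate_population(pop_census, tax_dwellings_census_year, tax_dwellings_target_year):
    """Carry a census count forward in proportion to the change in the number
    of dwellings recorded in the Housing Tax files (assumed method)."""
    if tax_dwellings_census_year <= 0:
        raise ValueError("dwelling count for the census year must be positive")
    return pop_census * tax_dwellings_target_year / tax_dwellings_census_year


# Hypothetical commune: 5 000 inhabitants in 1999, 5 600 at a 2005 census.
pop_2003 = interpolate_population(5000, 1999, 5600, 2005, 2003)

# Same commune after 2005: tax-file dwellings grow from 2 800 to 2 870.
pop_later = extrapolate_population(5600, 2800, 2870)
```

Interpolation applies to reference years before the commune's census year, extrapolation to reference years after it, so that all communes can be reported for the same year of reference.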
The population figures: a synthesis of … • Results of annual surveys • quality of data collection • quality of data capture • quality of data processing • Number of dwellings contained in the RIL • Evolution of dwellings in the local tax register • Population of institutional households • Each of these « sources » must be validated
The validation of the RIL • The « quality » survey of the RIL • An annual survey to assess • The surpluses and the deficits • The surplus-deficit balance • The proportion of communes for which the quality is deemed mediocre • The sample : 340 000 addresses and 900 000 dwellings • The results • 1% of dwellings in deficit, 1% in surplus • A level of quality higher than that of usual geographical registers • A regular improvement process
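The surplus and deficit measures can be illustrated with a short sketch. It assumes that a surplus is a dwelling listed in the register but not found in the field, that a deficit is a dwelling found in the field but missing from the register, and that both are expressed as a share of the dwellings found in the field. The function name and the choice of denominator are assumptions, not Insee's definitions.

```python
def ril_quality(register_ids, field_ids):
    """Compare the register with a field check of the same addresses."""
    register, field = set(register_ids), set(field_ids)
    if not field:
        raise ValueError("the field check found no dwellings")
    surplus = len(register - field)   # listed in the register, absent in the field
    deficit = len(field - register)   # present in the field, missing from the register
    n = len(field)
    return {
        "surplus_rate": surplus / n,
        "deficit_rate": deficit / n,
        "balance": (surplus - deficit) / n,
    }


# Made-up example: 100 dwellings each, one in surplus and one in deficit.
quality = ril_quality(register_ids=range(0, 100), field_ids=range(1, 101))
print(quality)  # {'surplus_rate': 0.01, 'deficit_rate': 0.01, 'balance': 0.0}
```

The example reproduces the order of magnitude quoted on the slide: surpluses and deficits of about 1% each can cancel out in the balance while still being measurable separately.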
Data collection • The control procedures during the data collection • The follow-up and control on the initiative of the communes • The controls at Insee after the data collection (reception control, desk controls, in-field controls)
The control procedure during data collection • A precise data collection protocol • Checked by the municipal officer • And by the supervisor
The controls at Insee after data collection • Successive « sorting » • The step of reception and registration of the questionnaires • Establishment of quality indicators • Communes with insufficient scores subject to in-depth control (approx. 15% of the municipalities representing 30% of the population) • Desk controls • Verification with the help of the Housing Tax files • On the exhaustiveness of the dwellings, the FLNE, the structure of main/not main residence • In-field controls (8% of municipalities, 1 dwelling out of 200)
The in-field controls • They are done by Insee agents • They relate to the communes whose quality is presumed insufficient • Most often, they confirm the data collection • The errors detected are corrected • N.B. these controls are not representative, therefore their results cannot be extrapolated
Data capture • Quality criteria specifically defined in the specifications • securing of transports and data • control of discrepancies with previous counts • maximum rate of error by category of variables • … and quality evaluated by an independent control process (sample that has been captured twice and arbitration)
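The independent control by double capture can be sketched as a field-by-field comparison of two captures of the same questionnaires. This is a minimal illustration: the record layout, variable names and category names are hypothetical, and a discrepancy only signals that arbitration is needed, since it does not say which of the two captures is wrong.

```python
from collections import defaultdict


def discrepancy_rates(first_capture, second_capture, categories):
    """Share of differing values per category of variables.

    first_capture / second_capture: lists of dicts, one per questionnaire,
    in the same order. categories: maps each variable to its category.
    """
    if len(first_capture) != len(second_capture):
        raise ValueError("both captures must cover the same questionnaires")
    compared = defaultdict(int)
    differing = defaultdict(int)
    for rec_a, rec_b in zip(first_capture, second_capture):
        for variable, category in categories.items():
            compared[category] += 1
            if rec_a.get(variable) != rec_b.get(variable):
                differing[category] += 1  # to be settled by arbitration
    return {cat: differing[cat] / compared[cat] for cat in compared}


first = [{"age": "34", "occupation": "teacher"}, {"age": "51", "occupation": "nurse"}]
second = [{"age": "34", "occupation": "teacher"}, {"age": "57", "occupation": "nurse"}]
rates = discrepancy_rates(first, second, {"age": "numeric", "occupation": "text"})
print(rates)  # {'numeric': 0.5, 'text': 0.0}
```

Each rate can then be compared with the maximum rate that the specifications allow for that category of variables.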
Data processing • Coding • Quality control workstation for: • Assessing the quality of the automatic coding and manual correction • Estimating a « threshold » for the percentage of « non-codable » forms • The contributions of this approach: • additions to the expert system files • the improvement of training and protocols (clarifications added to the coding rules) • A finer management of quality, without going in for « excess quality »
Data processing (2) • Data editing and imputation • Analyse the share of non-responses or inconsistent responses • Imputation of missing data or data presumed inconsistent (hot-deck procedure) • These procedures are fine-tuned year after year, in view of the quality of the variables produced
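A hot-deck procedure fills a missing value with a value observed on a similar respondent (the donor). The sketch below shows one common variant, a sequential hot deck within imputation classes. The slide does not say which variant Insee uses, and the variable and class names are hypothetical.

```python
def sequential_hot_deck(records, variable, class_key):
    """Replace missing values (None) with the last observed value from the
    same imputation class. Records are processed in file order, and only
    observed values (never imputed ones) serve as donors."""
    last_donor = {}
    completed = []
    for record in records:
        record = dict(record)  # leave the input untouched
        cls = record[class_key]
        if record.get(variable) is None:
            if cls in last_donor:
                record[variable] = last_donor[cls]
            # With no earlier donor in the class, the value stays missing.
        else:
            last_donor[cls] = record[variable]
        completed.append(record)
    return completed


records = [
    {"age_group": "30-39", "status": "employed"},
    {"age_group": "30-39", "status": None},
    {"age_group": "60+", "status": None},
    {"age_group": "60+", "status": "retired"},
]
imputed = sequential_hot_deck(records, variable="status", class_key="age_group")
```

Here the second record receives "employed" from its class donor, while the third stays missing because no donor in the "60+" class precedes it. A production procedure would need a fallback for such cases.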
Quality of extrapolations in municipalities with more than 10 000 inhabitants
The validation in « big » municipalities • An analysis of the evolutions
The validation in « big » municipalities (2) • The contributions to the evolutions