The Dutch Virtual Census of 2001 A New Approach by Combining Different Sources

Eric Schulte Nordholt, ECE Census meetings, Geneva, 22-26 November 2004

Contents • Introduction Census • Data sources • Combining data sources: micro-linkage • Combining sources: micro-integration • Social Statistical Database (SSD) • Census tables • History of the Dutch Census • Comparison with Censuses in other countries • Conclusions

Introduction Census • Why a Census? • Statistical information for research and policy purposes • What kind of information? • Size of (sub)population(s) • Demographic and socio-economic characteristics, at national and regional level • Gentlemen’s agreement • Eurostat: co-ordinator of EU, accession and EFTA countries in the 2001 Census Round • Census Table Programme, every 10 years

Data sources • Registers: • Population Register (PR), 16 million records; demographic variables: sex, age, household status etc. • Jobs file: employees, 6.5 million records, and self-employed persons, 790 thousand records; dates of job, branch of economic activity • Fiscal administration (FIBASE): jobs, 7.2 million records, and pensions and life insurance benefits, 2.7 million records • Social Security administrations, 2 million records; auxiliary information for the integration process • Surveys: • Survey on Employment and Earnings (SEE), 3 million records; working hours, place of work • Labour Force Survey (LFS), 2 years: 230,000 records; education, occupation, (economic) activity

Combining sources: micro-linkage • Linkage key: • Registers: Social security and Fiscal number (SoFi), unique • Surveys: sex, date of birth, address (postal code and house number) • Linkage key replaced by RIN-person • Linkage strategy: • Optimizing number of matches • Minimizing number of mismatches and missed matches
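The linkage step on this slide can be illustrated with a minimal sketch. It assumes exact matching on the SoFi number for register files and on sex, date of birth, postal code and house number for survey files. All records, field names and the `link` helper are hypothetical, and the actual linkage strategy used for the Census may differ (for example in how near-matches are handled).

```python
from itertools import count

# Hypothetical records; every value below is invented for illustration.
population_register = [
    {"sofi": "111", "sex": "F", "dob": "1970-03-01", "postcode": "1234AB", "house_no": 5},
    {"sofi": "222", "sex": "M", "dob": "1985-11-20", "postcode": "5678CD", "house_no": 12},
]
jobs_register = [
    {"sofi": "111", "branch": "education"},
    {"sofi": "999", "branch": "retail"},  # SoFi number not in the Population Register
]
survey = [
    {"sex": "M", "dob": "1985-11-20", "postcode": "5678CD", "house_no": 12, "education": "high"},
    {"sex": "F", "dob": "1990-01-01", "postcode": "0000ZZ", "house_no": 1, "education": "low"},
]

SURVEY_KEY_FIELDS = ("sex", "dob", "postcode", "house_no")


def survey_key(rec):
    """Composite linkage key used for survey records."""
    return tuple(rec[field] for field in SURVEY_KEY_FIELDS)


# Give every person in the Population Register a meaningless RIN-person number.
rin_counter = count(1)
rin_by_sofi = {}
rin_by_key = {}
for person in population_register:
    rin = next(rin_counter)
    rin_by_sofi[person["sofi"]] = rin
    key = survey_key(person)
    # A composite key shared by two persons is ambiguous, so it is disabled
    # (assumed policy: prefer a missed match over a mismatch).
    rin_by_key[key] = None if key in rin_by_key else rin


def link(records, lookup, key_fn, key_fields):
    """Replace the linkage key by RIN-person; return (linked, unmatched)."""
    linked, unmatched = [], []
    for rec in records:
        rin = lookup.get(key_fn(rec))
        if rin is None:
            unmatched.append(rec)
            continue
        stripped = {k: v for k, v in rec.items() if k not in key_fields}
        stripped["rin"] = rin
        linked.append(stripped)
    return linked, unmatched


jobs_linked, jobs_unmatched = link(jobs_register, rin_by_sofi, lambda r: r["sofi"], ("sofi",))
survey_linked, survey_unmatched = link(survey, rin_by_key, survey_key, SURVEY_KEY_FIELDS)
print(jobs_linked, survey_linked)
```

After linkage the files carry only RIN-person as the combining element, so later steps never need the SoFi number or the address.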

Combining sources: micro-integration • Collecting data from several sources → more comprehensive and coherent information on aspects of a person’s life • Compare sources: coverage, conflicting information (reliability of sources) • Integration rules: checks, adjustments, imputations • Optimal use of information → quality improves • Example: job period vs. benefit period
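The job period vs. benefit period example can be sketched as a check followed by an adjustment. The slide does not state the actual integration rule, so the rule below is an assumption made purely for illustration: the job source is treated as the more reliable one, and an overlapping benefit is made to start the day after the job ends. The function names and the status labels are hypothetical.

```python
from datetime import date, timedelta


def overlap_days(start_a, end_a, start_b, end_b):
    """Number of days shared by two closed date intervals (0 if none)."""
    start = max(start_a, start_b)
    end = min(end_a, end_b)
    return max(0, (end - start).days + 1)


def integrate(job, benefit):
    """Check a job/benefit pair and adjust the benefit period if they conflict.

    ASSUMED rule (not taken from the source): the job period wins, and the
    benefit start is moved to the day after the job ends.
    """
    days = overlap_days(job["start"], job["end"], benefit["start"], benefit["end"])
    if days == 0:
        return benefit, "ok"
    new_start = job["end"] + timedelta(days=1)
    # Trimming the start cannot repair these cases; leave them for inspection.
    if benefit["start"] < job["start"] or new_start > benefit["end"]:
        return benefit, "needs review"
    return {**benefit, "start": new_start}, "adjusted"


# Invented example: the benefit is recorded as starting before the job ends.
job = {"rin": 1, "start": date(2001, 1, 1), "end": date(2001, 6, 30)}
benefit = {"rin": 1, "start": date(2001, 6, 15), "end": date(2001, 12, 31)}

conflict = overlap_days(job["start"], job["end"], benefit["start"], benefit["end"])
adjusted, status = integrate(job, benefit)
print(conflict, adjusted["start"], status)
```

The same check-then-adjust pattern would apply to other pairs of sources; only the decision about which source to trust changes.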

Social Statistical Database (SSD) • Social Statistical Database (SSD): Set of integrated micro-data files with coherent and detailed demographic and socio-economic data on persons, households, jobs and benefits • No remaining internal conflicting information • SSD-set: • Population Register (backbone) • Integrated jobs file • Integrated file of (social and other) benefits • Surveys, e.g. LFS • Combining element: RIN-person

Census tables (1) • Preliminary work before tabulating • Census Programme definitions: not always clear and unambiguous, e.g. economic activity • Priority rules • (characteristics of) main job (highest wage) • employee or employer • job or (partially) unemployed • job or attending education • job or retired • engaged in family duties or retired • age restrictions • Tabulating register variables: straightforward counting from SSD register data
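Two of the points on this slide lend themselves to a short sketch: selecting the main job as the job with the highest wage, and tabulating a register variable by simple counting. The job records and field names below are invented, and ties in wage are resolved by keeping the first job encountered, which is an assumption rather than a documented rule.

```python
from collections import Counter

# Hypothetical integrated jobs file, one row per job, linked by RIN-person.
jobs = [
    {"rin": 1, "branch": "education", "wage": 30000},
    {"rin": 1, "branch": "retail", "wage": 8000},
    {"rin": 2, "branch": "retail", "wage": 21000},
    {"rin": 3, "branch": "health", "wage": 12000},
    {"rin": 3, "branch": "retail", "wage": 15000},
]


def main_jobs(job_rows):
    """Priority rule from the slide: the main job is the one with the highest wage."""
    best = {}
    for job in job_rows:
        current = best.get(job["rin"])
        if current is None or job["wage"] > current["wage"]:
            best[job["rin"]] = job
    return best


main = main_jobs(jobs)

# Register variables are tabulated by counting persons, one main job each.
persons_by_branch = Counter(job["branch"] for job in main.values())
print(persons_by_branch)
```

Each person contributes exactly one main job, so the counts add up to the number of employed persons rather than the number of jobs.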

Census tables (2) • Tabulating survey (and register) variables • Mass imputation? • Pros: reproducible results • Cons: danger of oddities in estimates (e.g. highly educated baby) • Traditional weighting? • Pros: simple, reproducible results (if same micro-data and weights) • Cons: no overall numerical consistency between survey and register estimates • Demand for overall numerical consistency • 1 figure for 1 phenomenon • all tables based on different sources (e.g. surveys) should be mutually consistent

Census tables (3), example • Ethnicity: register • Education: survey 1 and survey 2 • Employment status: survey 2 • Estimate: T1: educ x ethnic and T2: educ x employ • [Figure: data blocks for Register, Survey 1 and Survey 2 with the variables educ (Lo...Hi), employ (1...m) and ethnic (1...k); a register margin for ethnic with the categories not-NL, NL and Total, showing the values 30 and 70]

Census tables (4) • Repeated Weighting (RW): tool to achieve numerical consistency (VRD software) • Basic principles of RW: • estimate each table from the most reliable source (mostly the source with the most records, e.g. a register) • estimate tables by calibrating on the margins that the current table has in common with tables already estimated (auxiliary information) • repeated use of the regression estimator: - initial weights (e.g. survey weights) adjusted as little as possible - lower variances - no excessive increase of (non-response) bias (as long as cell size >> 0) • each table has its own set of weights

Census tables (5), example continued • Calibrate on ethnic, then on educ x ethnic • [Figure: the Register, Survey 1 and Survey 2 blocks from the previous example (variables educ Lo...Hi, employ 1...m, ethnic 1...k), with blocks numbered 1, 2 and 3]
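The first calibration step of the example ("calibrate on ethnic") can be sketched as follows. The sketch uses post-stratification, which is the simplest case of calibrating weights on one categorical margin. It is not the VRD software, and it does not implement the full regression estimator used in Repeated Weighting. The register counts (30 not-NL, 70 NL) are an assumed reading of the figure, and the survey records and initial weights are invented.

```python
from collections import defaultdict

# Assumed register margin for ethnicity (reading of the figure: 30 and 70).
register_ethnic = {"not-NL": 30.0, "NL": 70.0}

# Hypothetical survey records: (ethnic, educ, initial survey weight).
survey = [
    ("not-NL", "Lo", 5.0),
    ("not-NL", "Hi", 5.0),
    ("NL", "Lo", 10.0),
    ("NL", "Lo", 10.0),
    ("NL", "Hi", 10.0),
    ("NL", "Hi", 10.0),
]


def calibrate_on_margin(records, margin):
    """Rescale weights so weighted ethnic totals reproduce the register margin."""
    weighted = defaultdict(float)
    for ethnic, _educ, weight in records:
        weighted[ethnic] += weight
    return [
        (ethnic, educ, weight * margin[ethnic] / weighted[ethnic])
        for ethnic, educ, weight in records
    ]


def tabulate(records):
    """Weighted table T1: educ x ethnic."""
    table = defaultdict(float)
    for ethnic, educ, weight in records:
        table[(educ, ethnic)] += weight
    return dict(table)


calibrated = calibrate_on_margin(survey, register_ethnic)
t1 = tabulate(calibrated)

# The ethnic margin of T1 now agrees with the register: 1 figure for 1 phenomenon.
t1_ethnic_margin = defaultdict(float)
for (_educ, ethnic), value in t1.items():
    t1_ethnic_margin[ethnic] += value
print(t1, dict(t1_ethnic_margin))
```

In the next step the estimated T1 would itself serve as a margin when weighting the following table, and each table keeps its own set of weights.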

History of the Dutch Census • TRADITIONAL CENSUS • Ministry of Home Affairs: • 1829, 1839, 1849, 1859, 1869, 1879 and 1889 • Statistics Netherlands: • 1899, 1909, 1920, 1930, 1947, 1960 and 1971 • Unwillingness (non-response) and reduction of expenses → no more Traditional Censuses • ALTERNATIVE: VIRTUAL CENSUS • 1981 and 1991: Population Register and surveys • development in the 90’s: more registers → • 2001: integrated set of registers and surveys, SSD

Comparison with Censuses in other countries • Traditional Census (complete or partial enumeration): • Most countries (Estonia, Slovenia, Greece and the UK) • Mixture of traditional Census and Registers: • Some countries (Norway and Switzerland) • Entirely or largely register-based Census: • A few Nordic countries (Sweden and Finland) • Virtual Census: • The Netherlands • Tables: http://www.cbs.nl/en/publications/articles/general/census-2001/census-2001.htm • Book: http://www.cbs.nl/en/publications/recent/census-2001/b-57-2001.htm

Conclusions • The Dutch Virtual Census 2001 was successful with its innovative approach: • new source: SSD, integration of registers and surveys (micro-integration remains important) • new methodology for consistent estimation was implemented • Pros: relatively cheap (cost per inhabitant) and quick • Cons: publication of small subpopulations sometimes difficult or even impossible because of limited information • Solutions for cons: • small area estimation (synthetic estimators)
