EDITING THE INTEGRATED CENSUS IN ISRAEL
Prepared by Eva Rotenberg, Central Bureau of Statistics, Israel (1)
(1) I would like to thank Ari Paltiel, who edited this paper.
The paper describes the editing and imputation procedures which were used for the demographic variables in the integrated 2008 census in Israel.
Background
• The Israeli census is an "Integrated Census": it combines administrative data for 100% of the population with data obtained from a large sample survey (approximately 17% of the households in the country).
• The main administrative data source for the Improved Administrative File is the National Population Register (NPR).
• All register records are identified by a unique "personal identity number" (PIN), which can be used for matching records.
• The NPR contains personal records for all citizens and permanent residents of Israel and includes demographic and residential information.
Background
The area survey of the census serves two main purposes:
• The survey results provide the parameters used to calculate a weight that represents the probability that a person actually resides in the Statistical Area in which he or she is registered in the NPR. A Statistical Area is a group of consecutive buildings or blocks with an average of 5,000 inhabitants (http://www.cbs.gov.il/mifkad/integ_census.pdf).
• The survey collects socio-economic information such as labor force characteristics, household typology, education, housing, ownership of durable goods and disability.
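To make the idea of the weight concrete, here is a deliberately simplified Python sketch. It is not the estimator used by the Central Bureau of Statistics, whose details are not given in these slides. It assumes that each sampled person has been matched to the NPR by PIN, so that both the registered area and the area where the person was enumerated are known. The function name `residence_share` and the area codes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def residence_share(
    pairs: Iterable[Tuple[str, Optional[str]]]
) -> Dict[str, float]:
    """For each registered Statistical Area, return the share of sampled
    persons who were enumerated in that same area.

    Each pair is (registered_area, enumerated_area). This is an
    illustrative assumption, not the official weighting method.
    """
    registered = defaultdict(int)
    found_at_registered_address = defaultdict(int)
    for registered_area, enumerated_area in pairs:
        registered[registered_area] += 1
        if enumerated_area == registered_area:
            found_at_registered_address[registered_area] += 1
    return {
        area: found_at_registered_address[area] / registered[area]
        for area in registered
    }


# Made-up example: three persons registered in area "A", one in area "B".
sample = [("A", "A"), ("A", "B"), ("A", "A"), ("B", "B")]
shares = residence_share(sample)
```

In this toy sample, two of the three persons registered in area "A" were found there, so the share for "A" is 2/3.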
Patterns of Demographic Data in the NPR
• The demographic variables which were edited and imputed are: ‘year of birth’, ‘sex’, ‘marital status’, ‘year of immigration’, ‘country of birth’, and ‘parent’s country of birth’.
• The edit checks, which were implemented with Canceis software, include standard consistency checks between related items, such as the ages of parents and children, marital status and age, and year of immigration and year of birth.
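The kinds of edit checks listed above can be sketched as simple rules in Python. This is an illustration only and does not use the Canceis rule syntax. The thresholds `MIN_PARENT_AGE` and `MIN_MARRIAGE_AGE`, the field names and the marital status codes are assumptions made for the example, not values from the census.

```python
from dataclasses import dataclass
from typing import List, Optional

CENSUS_YEAR = 2008
MIN_PARENT_AGE = 13    # hypothetical threshold
MIN_MARRIAGE_AGE = 15  # hypothetical threshold
EVER_MARRIED = {"married", "divorced", "widowed"}


@dataclass
class Person:
    year_of_birth: Optional[int]
    marital_status: Optional[str]
    year_of_immigration: Optional[int]
    mother_year_of_birth: Optional[int] = None


def failed_edits(p: Person) -> List[str]:
    """Return the names of the edit rules that the record fails.

    A rule is skipped when one of the items it compares is missing.
    """
    failures = []
    # Year of immigration cannot precede year of birth.
    if p.year_of_birth is not None and p.year_of_immigration is not None:
        if p.year_of_immigration < p.year_of_birth:
            failures.append("immigration_before_birth")
    # Marital status must be plausible for the person's age.
    if p.year_of_birth is not None and p.marital_status in EVER_MARRIED:
        if CENSUS_YEAR - p.year_of_birth < MIN_MARRIAGE_AGE:
            failures.append("too_young_for_marital_status")
    # A parent must be sufficiently older than the child.
    if p.year_of_birth is not None and p.mother_year_of_birth is not None:
        if p.year_of_birth - p.mother_year_of_birth < MIN_PARENT_AGE:
            failures.append("parent_child_age_gap")
    return failures
```

A record with an empty list of failures passes these checks; any other record would be a candidate for imputation of one or more of the items involved.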
Patterns of Demographic Data in the NPR
• The missing values of ‘country of birth’ and ‘parent’s country of birth’ are concentrated in the NPR records of older persons.
• The missing values of ‘year of immigration’ were dispersed among younger persons born abroad.
• The choice of imputation methods is dictated by these special population patterns and by the relationships between variables such as country of birth, parent’s country of birth, year of immigration and year of birth.
Methodology of Editing and Imputation
• Cold-deck imputation
• Deterministic imputation
• Statistical imputation
• NIM (Nearest-neighbor Imputation Methodology) using Canceis software (Canadian Census Edit & Imputation System)
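As a rough illustration of the nearest-neighbor idea in the last item above, the following Python sketch copies a missing value from the most similar donor record. It is a simplification under stated assumptions and is not the Canceis implementation: the distance function, the scaling of numeric differences by 10 and the record layout are all hypothetical, and the sketch does not re-check the imputed record against the edit rules.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[object]]


def distance(a: Record, b: Record, match_vars: List[str]) -> float:
    """Sum of per-variable distances, each in the range 0 to 1."""
    total = 0.0
    for var in match_vars:
        x, y = a.get(var), b.get(var)
        if x is None or y is None:
            total += 1.0  # a missing item counts as maximally different
        elif isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += min(abs(x - y) / 10.0, 1.0)  # hypothetical scaling
        else:
            total += 0.0 if x == y else 1.0
    return total


def impute_from_nearest(
    recipient: Record, donors: List[Record], target: str, match_vars: List[str]
) -> Record:
    """Return a copy of the recipient with `target` taken from the closest
    donor that has a value for it; unchanged if no donor qualifies."""
    candidates = [d for d in donors if d.get(target) is not None]
    out = dict(recipient)
    if not candidates:
        return out
    best = min(candidates, key=lambda d: distance(recipient, d, match_vars))
    out[target] = best[target]
    return out


# Made-up records for illustration only.
donors = [
    {"year_of_birth": 1950, "sex": "F", "country_of_birth": "Poland"},
    {"year_of_birth": 1985, "sex": "F", "country_of_birth": "Israel"},
]
recipient = {"year_of_birth": 1983, "sex": "F", "country_of_birth": None}
imputed = impute_from_nearest(
    recipient, donors, "country_of_birth", ["year_of_birth", "sex"]
)
```

Here the second donor is closer in year of birth and has the same sex, so its country of birth is copied to the recipient.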
Methodology of Editing and Imputation
Cold-deck imputation:
• Imputation from external data sources: the census area sample survey and previous (traditional) censuses.
• The imputation process is triggered by failed edits in the administrative source.
• When valid records of the administrative source and valid records of the census area survey disagree on an edited item, we prefer the administrative source as the more reliable source of data for most variables.
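The cold-deck rule described above can be sketched in Python as follows. This is a minimal illustration, assuming that the administrative and survey records have already been matched by PIN and that the set of items which failed edits is known. The function name `cold_deck`, the flag value and the field names are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Record = Dict[str, Optional[object]]


def cold_deck(
    admin: Record, survey: Optional[Record], failed_fields: Set[str]
) -> Tuple[Record, Dict[str, str]]:
    """Fill items that failed edits in the administrative record with the
    value from the matched survey record, when the survey has one.

    Items that did not fail edits are left untouched, so the administrative
    value is kept even when the survey value differs.
    """
    out = dict(admin)
    flags: Dict[str, str] = {}
    if survey is None:
        return out, flags
    for field in failed_fields:
        value = survey.get(field)
        if value is not None:
            out[field] = value
            flags[field] = "cold_deck_survey"  # record the source of the value
    return out, flags


# Made-up matched pair for illustration only.
admin_record = {"pin": "001", "year_of_birth": 1950, "country_of_birth": None}
survey_record = {"pin": "001", "year_of_birth": 1951, "country_of_birth": "X"}
edited, flags = cold_deck(admin_record, survey_record, {"country_of_birth"})
```

In the example, only ‘country of birth’ is completed from the survey; the year of birth stays at the administrative value of 1950 although the survey reports 1951.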
Choice of methodology
• The imputation sequence progresses by degree of accuracy, from ‘strong’ to ‘weak’ imputation.
• Once the cold-deck imputation stage is exhausted, we weigh the possibilities of the different imputation methods.
• NIM does not apply to all cases for which imputation is needed, either because more certain possibilities are available or because the data source does not meet the preconditions for hot-deck imputation.
• For these cases we used other imputation methods: deterministic imputation and statistical imputation.
The Process of Editing and Imputation
• Strong deterministic imputation
• Completion from the Census sample survey
• Matching with previous censuses
• Weak deterministic imputation
• Statistical imputation
• Nearest-neighbor Imputation Methodology
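The ordering of the stages listed above can be sketched as a simple cascade in Python: each stage is tried in turn, and the first one that yields a value is used and recorded. The stage functions below are placeholders that stand in for the actual methods, which are not specified in these slides; the names `impute_in_sequence` and `survey_value` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, Optional[object]]
Stage = Tuple[str, Callable[[Record], Optional[object]]]


def impute_in_sequence(
    record: Record, target: str, stages: List[Stage]
) -> Tuple[Record, Optional[str]]:
    """Try the stages in order, from 'strong' to 'weak', and stop at the
    first stage that returns a value for the missing item.

    Returns the (possibly completed) record and the name of the stage
    that supplied the value, or None if nothing was imputed.
    """
    if record.get(target) is not None:
        return dict(record), None  # nothing to impute
    for name, method in stages:
        value = method(record)
        if value is not None:
            out = dict(record)
            out[target] = value
            return out, name
    return dict(record), None


# Placeholder stages, for illustration only.
stages: List[Stage] = [
    ("strong_deterministic", lambda r: None),
    ("census_sample_survey", lambda r: r.get("survey_value")),
    ("previous_censuses", lambda r: None),
    ("weak_deterministic", lambda r: None),
    ("statistical", lambda r: None),
    ("nim", lambda r: "value_from_donor"),
]

record = {"country_of_birth": None, "survey_value": "A"}
completed, stage_used = impute_in_sequence(record, "country_of_birth", stages)
```

Recording the stage name for each imputed item is what allows the share of imputations made at each stage to be tabulated afterwards.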
Results
• The relative proportions of imputations at each stage of the process were determined by the data patterns of different demographic variables in the NPR and the ‘tailoring’ of the combination of imputation methods.
Summary
• In this paper we have shown how the methodology of the Integrated Census in Israel, characterized by the combination of an administrative source and a field survey, dictated the choice of imputation methods.
• The imputation process as a whole, and the relative proportions of imputations at each stage of the process, were determined by the data patterns of different demographic variables in the NPR and the ‘tailoring’ of the combination of imputation methods.