Delve into the history, preparation, and anonymization of the 1971 GDR census microdata for the IECM project, exploring metadata availability and data insights. Learn about the sampling, recoding, and sorting techniques used for analysis.
Country report: Germany • Workshop “Integrating Global Census Microdata” • Lisbon, 22 August 2007 • Andrea Harausz
Content • Microdata for the project “Integrated European Census Microdata” (IECM) • Delivered metadata • 1971 census of the former German Democratic Republic • History • Characteristics and metadata • Preparation • Anonymisation • Outlook
Microdata for the IECM project • 9 anonymised microdata files • 2 censuses, Federal Republic of Germany: 1970, 1987 • 2 censuses, former German Democratic Republic: 1971, 1981 • 5 microcensuses, Federal Republic of Germany: 1973, 1982, 1987, 1991, 2001
Microdata for the IECM project • [Timeline figure: the nine files arranged by decade (1970s, 1980s, 1990s); years shown: 1970, 1971, 1973, 1981, 1982, 1987, 1987, 1991, 2001]
1971 census of the former GDR • History • Characteristics and metadata • Preparation of microdata • Anonymisation
1971 census of the former GDR • History • Central State Administration for Statistics of the GDR • Legal successor: Federal Statistical Office • Backup and documentation of the data • Partial adaptation to the system of the 1987 census of the FRG • Release to the Federal Archive • Federal Archive • Data protected by the data protection act • Archive law makes data available after 60 years • Special permission for the release of data for scientific purposes
1971 census of the former GDR • Characteristics and metadata • 2 data files • Person file (demography, income, education, employment etc.) • Dwelling and building file (state of repair, occupancy, equipment etc.) • 16.4 million persons, 6.2 million households, 6 million dwellings • Metadata • No codebooks at the Federal Statistical Office (FSO) or the Federal Archive • Source: archives of regional statistical offices in the former GDR states (field of study, occupation codes)
1971 census of the former GDR • Preparation of microdata: matching of data files • Matching by a unique combination of variables present in both files • Deletion of vacant dwellings and dwellings used for other purposes • Verification of the matching method by comparing certain variables present in both files • Number of principal residents in the first household • Number of children under age 17 in the first household
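The matching steps on this slide can be illustrated with a minimal Python sketch. The slides do not name the variables that form the unique key, so the key fields (`district`, `building_no`, `dwelling_no`), the `use` field, and the function names below are all hypothetical; this is an illustration of the general technique, not the procedure actually used by the Federal Statistical Office.

```python
from typing import Dict, List, Tuple

# Hypothetical key fields: the slides only say "unique combination of variables".
KEY_FIELDS = ("district", "building_no", "dwelling_no")


def key_of(record: Dict) -> Tuple:
    """Build the matching key from the (assumed) key fields."""
    return tuple(record[f] for f in KEY_FIELDS)


def match_files(persons: List[Dict], dwellings: List[Dict]):
    """Match person records to dwelling records on the key.

    Dwellings whose (hypothetical) `use` field is not "occupied" are dropped
    first, mirroring the deletion of vacant dwellings and dwellings used for
    other purposes.
    """
    index = {}
    for d in dwellings:
        if d["use"] != "occupied":
            continue
        k = key_of(d)
        if k in index:
            raise ValueError(f"key is not unique in dwelling file: {k}")
        index[k] = d

    matched, unmatched = [], []
    for p in persons:
        d = index.get(key_of(p))
        if d is None:
            unmatched.append(p)
        else:
            matched.append((p, d))
    return matched, unmatched


def agreement_rate(matched, field: str) -> float:
    """Share of matched pairs in which a variable held in both files agrees."""
    if not matched:
        return 0.0
    agree = sum(1 for p, d in matched if p[field] == d[field])
    return agree / len(matched)
```

A low agreement rate on a shared variable, such as the number of principal residents in the first household, would suggest that the chosen key links the wrong records.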
1971 census of the former GDR • Anonymisation • Time • Drawing of a subsample • Geographic detail • Recoding • Sorting of households • fully anonymised microdata = Public Use File
1971 census of the former GDR • Sampling • Size: 25% household sample • Method: Systematic Random Sampling • Sorting of households by geographic variables • First household selected randomly from the first 4 cases • Then selecting every 4th household
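The sampling rule on this slide (sort by geography, pick a random start among the first 4 households, then take every 4th) can be sketched as follows. The `geo` sort key and the function name are assumptions made for illustration; the slides do not say which geographic variables were used for sorting.

```python
import random
from typing import Dict, List, Optional


def systematic_sample(households: List[Dict], interval: int = 4,
                      rng: Optional[random.Random] = None) -> List[Dict]:
    """Systematic random sample with a fixed interval.

    Each household dict is assumed to carry a sortable `geo` value
    (hypothetical; for example a tuple of state and district codes).
    """
    rng = rng or random.Random()
    # Sorting by geography spreads the sample evenly across regions.
    ordered = sorted(households, key=lambda h: h["geo"])
    # Random start among the first `interval` cases, then every interval-th case.
    start = rng.randrange(interval)
    return ordered[start::interval]
```

With `interval=4` each household has a selection probability of 1/4, which yields the 25% household sample.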
1971 census of the former GDR • Geographic detail • State (NUTS 1) • Construction of a size-of-place variable • Categories: less than 2 000; 2 000 - 10 000; 10 000 - 50 000; 50 000 and more • Berlin: less than 100 000; 100 000 and more
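A size-of-place variable of this kind can be derived from a population count, as in the sketch below. The slide leaves two things open that are treated as assumptions here: whether the class boundaries belong to the lower or the upper class (lower bounds are taken as inclusive), and which population figure the Berlin categories refer to. The function name and labels are hypothetical.

```python
def size_of_place(population: int, is_berlin: bool = False) -> str:
    """Return the size-of-place category for a population count.

    Assumption: lower bounds are inclusive, upper bounds exclusive.
    """
    if population < 0:
        raise ValueError("population must be non-negative")
    if is_berlin:
        # Berlin has its own two-category scheme on the slide.
        return "less than 100 000" if population < 100_000 else "100 000 and more"
    if population < 2_000:
        return "less than 2 000"
    if population < 10_000:
        return "2 000 - 10 000"
    if population < 50_000:
        return "10 000 - 50 000"
    return "50 000 and more"
```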
1971 census of the former GDR • Recoding of variables • Principle: every value of a variable should have at least 3 observations in the original file • Recoded variables (top coding): • Age • Floor space of rooms in dwelling • Number of rooms in dwelling • Number of secondary residents in household
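Top coding under the "at least 3 observations" principle can be sketched as below. This is one plausible reading of the slide, not the documented procedure: the threshold is taken as the highest value for which at least `min_count` observations lie at or above it, and everything above is collapsed into that value. Rare values in the middle of the distribution are not handled by this sketch.

```python
from collections import Counter
from typing import List


def top_code_threshold(values: List[int], min_count: int = 3) -> int:
    """Highest value v such that at least `min_count` observations are >= v."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    tail = 0
    for v in sorted(counts, reverse=True):
        tail += counts[v]
        if tail >= min_count:
            return v
    # Fewer than `min_count` observations overall: collapse to the minimum.
    return min(counts)


def top_code(values: List[int], min_count: int = 3) -> List[int]:
    """Recode all values above the threshold to the threshold."""
    threshold = top_code_threshold(values, min_count)
    return [min(v, threshold) for v in values]
```

For example, with ages `[30, 30, 30, 40, 40, 40, 85, 90, 97]` the three oldest persons each have a unique age, so they are merged into a single top category "85 and over".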
1971 census of the former GDR • Sorting of households • Random sorting of • Building in state • Dwelling in building • Adding new running number for • Building in state • Dwelling in building • Household in dwelling
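The random sorting and renumbering step can be sketched as follows: buildings are shuffled within each state, dwellings within each building, and fresh running numbers replace the original identifiers so that the file order no longer reveals the original sequence. The field names (`state`, `building`, `dwelling`, `data`) and the function name are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional


def shuffle_and_renumber(households: List[Dict],
                         rng: Optional[random.Random] = None) -> List[Dict]:
    """Randomly reorder buildings and dwellings and assign new running numbers."""
    rng = rng or random.Random()
    # Group households as state -> building -> dwelling -> [households].
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for h in households:
        tree[h["state"]][h["building"]][h["dwelling"]].append(h)

    result = []
    for state in sorted(tree):
        buildings = list(tree[state])
        rng.shuffle(buildings)  # random order of buildings in the state
        for b_no, b in enumerate(buildings, start=1):
            dwellings = list(tree[state][b])
            rng.shuffle(dwellings)  # random order of dwellings in the building
            for d_no, d in enumerate(dwellings, start=1):
                for h_no, h in enumerate(tree[state][b][d], start=1):
                    result.append({
                        "state": state,
                        "building": b_no,   # new running number in state
                        "dwelling": d_no,   # new running number in building
                        "household": h_no,  # new running number in dwelling
                        "data": h["data"],
                    })
    return result
```

Households that shared a dwelling before the step still share one afterwards; only the identifiers and the order change.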
1971 census of the former GDR • Public Use File • 4.1 million persons • 50,000 in collective dwellings • 1.6 million households • 104 variables
Outlook • [Timeline figure: the nine files arranged by decade (1970s, 1980s, 1990s); years shown: 1970, 1971, 1973, 1981, 1982, 1987, 1987, 1991, 2001]
Thank you! andrea.harausz@destatis.de www.forschungsdatenzentrum.de