
DwB-Training Course on EU-SILC, February 13-15, 2013

Working with EU-SILC: data files, variables and data management. Practical computing session I – Part 1. Heike Wirth, GESIS – Leibniz Institut für Sozialwissenschaften. DwB-Training Course on EU-SILC, February 13-15, 2013, Romanian Social Data Archive at the Department of Sociology


Presentation Transcript


  1. Working with EU-SILC: data files, variables and data management. Practical computing session I – Part 1. Heike Wirth, GESIS – Leibniz Institut für Sozialwissenschaften. DwB-Training Course on EU-SILC, February 13-15, 2013. Romanian Social Data Archive at the Department of Sociology, University of Bucharest, Romania

  2. Overview • EU-SILC datasets • EU-SILC variables • Differences between data collected & anonymised User Database (UDB) • Hands on • Transform the CSV file into an SPSS/Stata system file • Count the number of households/persons in the file

  3. EU-SILC Data • Four separate files • Household (= 1 observation per household) • Register data (D) • Household data (H) • Individuals (= 1 observation per person) • Register data (R) • Personal data (P) • Cross-sectional & longitudinal data are provided separately => 8 files in total

  4. EU-SILC Data For example: • UDB_c10D_ver 2010-1 from 01-03-12.csv • UDB_c10H_ver 2010-1 from 01-03-12.csv • UDB_c10R_ver 2010-1 from 01-03-12.csv • UDB_c10P_ver 2010-1 from 01-03-12.csv • _c = cross-sectional; _l = longitudinal • 10 = year of the survey = 2010 • D = Household Register File • H = Household Data File • R = Personal Register File • P = Personal Data File • 2010-1 = version of the data (e.g. 1st version of the 2010 data) • csv = type of data (= comma separated values)
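The naming convention above can be unpacked mechanically. The following is a minimal Python sketch, assuming every file name follows the same layout as the four examples on the slide; the pattern is inferred from those examples rather than taken from an official specification, and the function name `parse_udb_filename` and the dictionary keys are hypothetical. The slide does not explain the "from 01-03-12" part, so it is kept as an uninterpreted string.

```python
import re

# Layout inferred from the slide's examples, e.g.
# "UDB_c10D_ver 2010-1 from 01-03-12.csv" (an assumption, not an official spec).
FILENAME_RE = re.compile(
    r"^UDB_(?P<kind>[cl])(?P<year>\d{2})(?P<file>[DHRP])"
    r"_ver (?P<version>\d{4}-\d+) from (?P<from_date>\d{2}-\d{2}-\d{2})"
    r"\.(?P<ext>\w+)$"
)

KINDS = {"c": "cross-sectional", "l": "longitudinal"}
FILES = {
    "D": "Household Register File",
    "H": "Household Data File",
    "R": "Personal Register File",
    "P": "Personal Data File",
}


def parse_udb_filename(name):
    """Split a UDB file name into its parts; raise ValueError if it does not match."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised UDB file name: {name!r}")
    parts = match.groupdict()
    return {
        "kind": KINDS[parts["kind"]],
        # Two-digit year; assumes survey years from 2000 onwards.
        "survey_year": 2000 + int(parts["year"]),
        "file": FILES[parts["file"]],
        "version": parts["version"],
        # Meaning of this field is not explained on the slide; kept verbatim.
        "from_date": parts["from_date"],
        "format": parts["ext"],
    }


info = parse_udb_filename("UDB_c10D_ver 2010-1 from 01-03-12.csv")
print(info["kind"], info["survey_year"], info["file"], info["version"])
```

A name that deviates from the layout raises `ValueError`, which makes it easy to spot files that were renamed after download.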

  5. EU-SILC Data • Household Register File (D) • one record for every household including information regarding sample units, household weights, etc. • e.g. UDB_c10D_ver 2010-2: N = 225 972 households • Household Data File (H) • one record for every household including household data • e.g. UDB_c10H_ver 2010-2: N = 225 972 households • Personal Register File (R) • one record for every person currently living in the household or temporarily absent • e.g. UDB_c10R_ver 2010-2: N = 576 531 persons • Personal Data File (P) • Reference population: members of the household aged 16 and over • e.g. UDB_c10P_ver 2010-2: N = 476 705 persons

  6. Domains & Areas - Households Source: Guidelines_Doc65_2010.pdf, p.73

  7. Domains & Areas - Persons Source: Guidelines_Doc65_2010.pdf, p.73

  8. EU-SILC Variables • Variable names in EU-SILC are composed of 3 parts: • 1st character refers to the dataset (D; H; R; P) • 2nd character refers to the domain • 3 digits represent a sequential number • e.g. PE040 = Highest ISCED Level attained • Most important piece of data documentation: • Guideline ‘Description of Target Variables’ • refers to variables delivered by the NSIs to EUROSTAT
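The three-part naming rule can be illustrated with a short Python sketch. It covers only the form described on this slide (file letter, domain letter, three digits); names that carry extra characters would need a looser pattern. The function name `parse_variable_name` and the returned keys are hypothetical, chosen for illustration.

```python
import re

# Strictly the three-part form from the slide: file letter, domain letter, 3 digits.
VARIABLE_RE = re.compile(r"^(?P<file>[DHRP])(?P<domain>[A-Z])(?P<number>\d{3})$")

FILES = {
    "D": "Household Register File",
    "H": "Household Data File",
    "R": "Personal Register File",
    "P": "Personal Data File",
}


def parse_variable_name(name):
    """Return the file, domain letter and sequential number encoded in a variable name."""
    match = VARIABLE_RE.match(name)
    if match is None:
        raise ValueError(f"not a three-part EU-SILC variable name: {name!r}")
    return {
        "file": FILES[match["file"]],
        "domain": match["domain"],
        "number": int(match["number"]),
    }


parsed = parse_variable_name("PE040")
print(parsed["file"], parsed["domain"], parsed["number"])
```

For `PE040` this identifies a variable from the Personal Data File, domain `E`, sequential number 40.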

  9. Guidelines – Target Variables (collected)

  10. Guidelines – Target Variables (collected)

  11. Guidelines – Target Variables (derived) (...)

  12. Different variable names but same labels

  13. Check HH020 & HH021 (using flag-variables)

  14. Additional important information • DIFFERENCES BETWEEN DATA COLLECTED (as described in the guidelines) AND THE ANONYMISED USER DATABASE • All income variables are in € (EURO) • Variables removed • Top/Bottom coding • Variables added • in addition: country specific rules

  15. Anonymised User Database – Variables added • Names of variables added • 1st character refers to the file (D; H; R; P) • 2nd character ‘X’ • 3 digits represent a sequential number • e.g. • HX040: Household size • HX060: Household type • HX080: Poverty Indicator • (...)

  16. Anonymised User Database – Variables added

  17. Hands on – Exercise 1 • Step 1: Open the 4 SPSS and/or Stata system files • Step 2: Check the data • How many households are included in the data (H- & D-File)? • total • by country • How many persons are included in the data (P- & R-File)? • total (any differences between the P- & R-File?) • by country • There are 15 countries in the training files. Fill in the table (next slide) • What are the main differences across countries? • Are there differences in the % of unemployed depending on whether you use RB210 or PL031? Why?
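The exercise is meant to be done in SPSS or Stata. For readers who want to cross-check the counts directly on the CSV files, a minimal Python sketch of the "total" and "by country" counts could look as follows. It assumes the country code sits in a column named `DB020` in the D-File; that column name is not given in these slides and should be checked against the guidelines. The three sample rows are made up for illustration and are not EU-SILC data.

```python
import csv
import io
from collections import Counter


def count_by_country(csv_file, country_column):
    """Return (total records, Counter of records per country) for an open CSV file."""
    reader = csv.DictReader(csv_file)
    per_country = Counter(row[country_column] for row in reader)
    return sum(per_country.values()), per_country


# Made-up stand-in for a D-File (one row per household); NOT real EU-SILC data.
# The column names DB020 (country) and DB030 (household ID) are assumptions.
sample_d_file = io.StringIO(
    "DB020,DB030\n"
    "AT,1\n"
    "AT,2\n"
    "RO,1\n"
)

total, per_country = count_by_country(sample_d_file, "DB020")
print("households in total:", total)
for country, n in sorted(per_country.items()):
    print(country, n)
```

For a real file, replace the `io.StringIO` object with `open(path, newline="")` and pass the country column of the file in question. Running the same function on the R- and P-Files gives the person counts to compare.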

  18. Exercise 1.3: Fill in the table Mean

  19. Exercise 1.3: Fill in the table
