
Outline

IPUMS-Europe, 2004-2008: Restricted-access, anonymized microdata for scientific and policy research. Robert McCaa, University of Minnesota Population Center; Nikolai Botev, UN-ECE Population Activities Unit (Geneva). www.hist.umn.edu/~rmccaa/ipums-europe


Presentation Transcript


  1. IPUMS-Europe, 2004-2008: Restricted-access, anonymized microdata for scientific and policy research
     Robert McCaa, University of Minnesota Population Center
     Nikolai Botev, UN-ECE Population Activities Unit (Geneva)
     www.hist.umn.edu/~rmccaa/ipums-europe

  2. Outline
     • PAU 1990s project
     • IPUMS-International means: restricted access, anonymized microdata
     • IPUMS-Europe: sister project (Latin America), connections with PAU
     • IPUMS-International partners
     • Principles: integration, dissemination

  3. Population Activities Unit, 1990 census round harmonization project: focused on Aging
     • Begun 1992: PAU/UNECE, UNFPA, US-NIA
     • Microdata acquired for 15 countries
     • Harmonized 26 core person variables plus 13 optional; 10 dwelling/household variables plus 18 optional
     • Extensive metadata: questionnaires, nomenclatures, classifications
     • Progressive over-sampling with age

  4. Population Activities Unit, 1990 census round harmonization project: focused on Aging

  5. Population Activities Unit, 1990 census round harmonization project: focused on Aging
     • General release: samples for 8 countries
     • Samples for the other 7 countries available under more restrictive conditions
     • Dissemination: CDs or other media; no online access
     • Sustainability: ICPSR (U. of Michigan)

  6. Problems with PAU effort:
     • Sample design too complex
     • Need for time series
     • Lacked legal authority
     • Inadequate funding
     • Insufficient computing infrastructure and human resources
     • Antiquated distribution system
     • Sustainability problematic

  7. Population Activities Unit: samples of older persons based on the 2000 round of censuses
     • Tightly integrated with IPUMS-Europe
     • Based on the same coding schemes, nomenclatures, and classifications
     • Utilize the same anonymization techniques and approaches; same data access modalities
     • Ensure sustainability through integration with IPUMS-Europe: ICPSR & European Data Centers

  8. Population Activities Unit: samples of older persons based on the 2000 round of censuses
     • Sample design:
       - sample of households not included in the core IPUMS-Europe sample, where at least one member is over age 60 (recommended sampling density: 5 percent)
       - geography to match that of the core samples
     • Advantages:
       - more straightforward than the design used for the 1990s
       - in line with the practice of national statistical offices (e.g. PUMS-A and PUMS-O of the US Census Bureau)
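
The sample design on slide 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the `households` mapping (household id to member ages) and the `core_ids` set are hypothetical, and simple random selection is assumed because the slide gives only the eligibility rule and the 5 percent density.

```python
import random

def supplementary_older_sample(households, core_ids, density=0.05,
                               min_age=60, seed=0):
    """Draw a supplementary sample of households with an older member.

    households: dict mapping household id -> list of member ages
    core_ids:   set of household ids already in the core sample
    """
    rng = random.Random(seed)
    # Eligible: not in the core sample, at least one member over min_age.
    eligible = sorted(
        hid for hid, ages in households.items()
        if hid not in core_ids and any(age > min_age for age in ages)
    )
    # Assumption: simple random selection at the recommended density.
    k = round(len(eligible) * density)
    return sorted(rng.sample(eligible, k))
```

Whole households are selected rather than individual older persons, so co-resident members stay together in the sample.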

  9. From IPUMS-USA (1989-) & PAU-Aging (1992-) to IPUMS-International (1999-), Latin America (2003-), Europe (2004?) and beyond
     Restricted access, anonymized microdata

  10. IPUMS-International means restricted access, anonymized microdata
      • Should be “IRAMS”, not IPUMS
      • Who are IPUMS-International users? Those who:
        - Have a demonstrated need for the data (project abstract)
        - Agree to abide by the restrictions of use
        - Place themselves under the jurisdiction of Institutional Review Boards

  11. IPUMSi ANONYMIZES
      Using the most demanding standards, legal & administrative as well as technical:
      » Suppress geographical detail (NUTS2/3?)
      » Corrupt the data! (just a little…)
      » Blur/aggregate sensitive codes
      » Convert dates to ages (blur key vars.)
      » Swap cases between districts! (just a few…)
      » Scramble order of unit records
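
Two of the technical measures on slide 11, swapping a few cases between districts and scrambling the order of unit records, can be illustrated as follows. This is a sketch rather than the project's procedure: the record layout (dicts with a `district` key), the function name and the `swap_fraction` parameter are assumptions, and swapping is modelled by exchanging the district codes of randomly paired records.

```python
import random

def swap_and_scramble(records, swap_fraction=0.01, seed=0):
    """Swap district codes for a small share of records, then shuffle.

    records: list of dicts, each with a 'district' key (assumed layout).
    Returns a new list; the input records are not modified.
    """
    rng = random.Random(seed)
    out = [dict(r) for r in records]          # work on copies
    n_pairs = int(len(out) * swap_fraction / 2)
    picked = rng.sample(range(len(out)), 2 * n_pairs)
    # Pair up the picked records and exchange their district codes.
    for i, j in zip(picked[::2], picked[1::2]):
        out[i]["district"], out[j]["district"] = (
            out[j]["district"], out[i]["district"])
    rng.shuffle(out)                          # scramble record order
    return out
```

District totals are unchanged by the swap, so tabulations by district keep their counts while individual records become harder to place.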

  12. Anonymization example: Italy, 1991. First assessment
      Note: population uniques are anonymized after integration
      • 1. Suppress geographical variables below commune
      • 2. Convert
        - Dates of birth, marriage, immigration to ages
        - Band small groups
      • 3. Suppress sensitive codes for small groups:
        - Citizenship
        - Year of immigration to Italy
        - Commune of work/study
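
Steps 2 and 3 of the Italian example can be sketched in Python. The function names, the `min_count` threshold and the `"suppressed"` placeholder are assumptions made for illustration; the slide does not say what counts as a small group.

```python
from collections import Counter
from datetime import date

def age_at_census(birth: date, census: date) -> int:
    """Completed years of age on census day (replaces the date of birth)."""
    before_birthday = (census.month, census.day) < (birth.month, birth.day)
    return census.year - birth.year - before_birthday

def suppress_small_groups(codes, min_count=3, suppressed="suppressed"):
    """Replace codes shared by fewer than min_count records."""
    counts = Counter(codes)
    return [c if counts[c] >= min_count else suppressed for c in codes]
```

For example, `age_at_census(date(1950, 12, 1), date(1991, 10, 20))` gives 40, where the census date is an illustrative value, and a citizenship code held by a single record would be replaced by the placeholder.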

  13. EUROSTAT statistical anonymity standards (Thorogood, 1999): all accepted by IPUMS-International
      • 1. small sample size
      • 2. limited geographical detail
      • 3. top and bottom coding of unique categories
      • 4. signed non-disclosure agreement
      • 5. prohibit redistribution of datasets to third parties
      • 6. prohibit attempts to identify individuals or the making of any claim to that effect
      • 7. require users to provide copies of publications

  14. EUROSTAT statistical anonymity standards (Thorogood, 1999): all accepted by IPUMSi, and more
      • 8. Age (constructed from birth date, where necessary)
      • 9. Never identify date of birth
      • 10. Never identify place of birth
      • 11. Migration: timing and place not identified in detail
      • 12. Place of residence identified by major civil division (pop > 60k, 120k, 250k, 1 million: national rule)
      • 13. Sensitivity analysis of variables by national experts
      • 14. Confidentiality assessment by national experts
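
Two of the technical standards listed on slides 13 and 14, top and bottom coding (item 3) and the population threshold for place of residence (item 12), are simple enough to sketch. The function names, the record layout and the use of `None` for a masked place are assumptions for illustration; the actual threshold is a national rule.

```python
def top_bottom_code(values, bottom, top):
    """Clamp values into [bottom, top] so extreme values are not unique."""
    return [min(max(v, bottom), top) for v in values]

def mask_small_places(records, population, threshold):
    """Keep a place code only if that place's population exceeds threshold.

    records:    list of dicts with a 'place' key (assumed layout)
    population: dict mapping place code -> population
    """
    return [
        {**r, "place": r["place"] if population[r["place"]] > threshold else None}
        for r in records
    ]
```

With ages top-coded at 90 and bottom-coded at 15, `top_bottom_code([0, 17, 45, 103], 15, 90)` returns `[15, 17, 45, 90]`.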

  15. Funded! Sister project: IPUMS-Latin America: 17 countries, ~500 million pop., 5 census rounds, 80+ samples, 100+ million person records
      • Scope: Latin American census microdata, 1960-present
      • Work Plan (funded by National Institutes of Health)
        - 2001: Sign licensing agreements with official agencies
        - 2002: Obtain funding from U.S. NIH
        - 2003: Develop/translate microdata & metadata
        - 2004: Country expert teams design national integrations
        - 2005: MPC/expert teams design regional integration
        - 2006: MPC anonymizes/integrates microdata and metadata
        - 2007: MPC disseminates to bona fide researchers who sign non-disclosure license. National census/data/research institutes may distribute national versions via CDs/web.

  16. IPUMS-Europe Partnership: More…
      • Censuses: 1960s-2000, where microdata exist
      • Countries: 16 inclined at present, >350 million population (* = signed): Austria, Bulgaria, Czech Republic*, France*, Germany, Greece, Ireland, Israel, Hungary*, Poland, Portugal, Romania, Slovenia*, Spain*, Switzerland, Turkey
      • Research: more knowledge, more users

  17. IPUMS-Europe Partnership: More uniformity…
      • Legal: signed memorandum of understanding
      • Administrative: restricted to approved users; strong enforcement procedures
      • Sample design: every nth household
      • Anonymization: includes corrupting data
      • Integration: more variables, composite coding
      • Dissemination: extract custom-tailored datasets, never entire samples
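
The "every nth household" design on slide 17 is a systematic sample. A minimal sketch follows; the random starting point is an assumption (a common convention, not stated on the slide), and the function name is hypothetical.

```python
import random

def systematic_household_sample(household_ids, n, seed=None):
    """Select every nth household after a random start in [0, n)."""
    start = random.Random(seed).randrange(n)
    return household_ids[start::n]
```

With `n=10` this yields roughly a 10 percent sample of households; all persons in a selected household would be kept together.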

  18. Advantages… proven record of accomplishments:
      • Uniform legal protocols
      • Substantial institutional infrastructure
      • Experienced census microdata integrators
      • Cost-effective academic environment
      • Sustained funding from National Science Foundation, National Institutes of Health
      • Successful web-based distribution system: users!

  19. Advantages of IPUMS-International
      • Comparability: data are rigorously integrated; documentation is extensive, both primary (from NSIs) and integrated (from MPC)
      • Accountability: reports on users, usage and publications; advisory board of statisticians and scientists
      • Sustainability: MPC, ICPSR

  20. IPUMS-Europe, 2004-2008: coverage of ~20 countries, representing ~400 million people
      • Scope: European census microdata, 1950-present
      • Work Plan (contingent upon funding)
        - 2003: Sign licensing agreements with census agencies; obtain funding from US NIH
        - 2004: Develop/translate microdata & metadata
        - 2005: Country expert teams design national integrations
        - 2006: MPC/expert teams design regional integration
        - 2007: MPC integrates microdata and metadata
        - 2008: MPC disseminates to bona fide researchers who sign non-disclosure license. National census/data/research institutes may distribute national versions via CDs/web.

  21. IPUMS-INTERNATIONAL
      Imagine a new statistical product: scientifically anonymized, integrated census microdata samples made up of unidentifiable individuals...
      » Easy-to-use web interface
      » Highest scientific standards
      » Proven, powerful integration
      » A quantum leap in usage
      Countries signed:
      » 1998: 1 country signed
      » 1999: 3 countries
      » 2000: 9
      » 2001: 15
      » 2002: 32; first release, 6 countries

  22. IPUMSi RESCUES
      UN Demographic Center for Latin America (CELADE, Santiago, Chile): ~3000 microdata tapes and metadata (documentation) recovered

  23. IPUMSi PAYS
      National experts in each country are contracted to assist with:
      • Assembling microdata and documentation
      • Developing samples
        - to minimize confidentiality risks
        - and to maximize robustness
      • Designing national integration plan
        - census-by-census
        - concept-by-concept
        - code-by-code
      • Writing integrated documentation

  24. IPUMSi PARTNERSHIP
      Census documentation compiled for Colombian microdata
      Standard: UN/Eurostat Principles & Recs...
      Photos from Colombia integration project, February-March 2000: 4 experts from DANE (census office) + 7 academics (3 universities)

  25. IPUMSi integration principles
      • 1. Respect absolute anonymity and confidentiality
      • 2. Preserve all original data, except adjustments to ensure privacy (top codes, blurrings, masking, re-ordering, etc.)
      • 3. Harmonize codes using international standards
        - occupation: ISCO-88 (detailed, general)
        - education: ISCED (detailed, general)
        - family: IPUMS, etc. (detailed, general)
      • 4. Enhance with constructed variables

  26. Composite coding scheme example: marital status

  27. Occupation: the ISCO standard
      Preliminary release: “1” digit; final: 2-3 or 4 digits, depending upon country
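
Because ISCO-88 codes are hierarchical (major group, sub-major group, minor group, unit group), a one-digit preliminary release can be derived from more detailed codes by keeping the leading digit. The helper below is a hypothetical sketch of that truncation, not the project's code; codes are assumed to be stored as digit strings.

```python
def isco_truncate(code: str, digits: int = 1) -> str:
    """Keep the leading `digits` digits of an ISCO-88 code."""
    if not code.isdigit() or not 1 <= digits <= 4:
        raise ValueError("expected a numeric ISCO-88 code and 1-4 digits")
    return code[:digits]
```

For instance, unit group `"2221"` (medical doctors) truncates to major group `"2"` (professionals). Keeping codes as strings preserves the leading zero of major group 0 (armed forces).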

  28. Variable availability, preliminary release

  29. IPUMSi DISSEMINATES
      Web-based extraction system
      Legally-binding license agreement
      • protects privacy and confidentiality
      • assures proper use
      • new sanction: loss of employment
      Researcher selects
      • countries
      • censuses
      • cases/sub-populations
      • variables
      • sample densities
      Facilitates comparative research

  30. Can we do it?? Yes we can!!!
      Additional information at: www.hist.umn.edu/~rmccaa/ipums-europe
      Contact: rmccaa@umn.edu
      Thank you

