
Semi-Permeable Boundaries Among Institutions: Non-Public Data and the Census RDC at Berkeley

This presentation explains what Census Research Data Centers (RDCs) are and why they exist: to give qualified researchers access to confidential, non-public Census Bureau data. It describes the tension between data collection and data distribution, and the approaches used to reconcile it: aggregate data, microdata with masked geography, synthetic data, and controlled access. It also covers the benefits of RDCs to the Census Bureau and the process researchers follow to gain access to the data.


Presentation Transcript


  1. Semi-Permeable Boundaries Among Institutions: Non-Public Data and the Census RDC at Berkeley
     IASSIST 2009 – Tampere, Finland
     Jon Stiles
     May 27, 2009

  2. Three Questions
     • What is an RDC? (and why does it exist?)
     • Partners and Constituents
     • Proposal process and environment

  3. CCRDC: California Census Research Data Center, Berkeley
     The CCRDC is a joint project of the U.S. Bureau of the Census, the University of California, Berkeley, and UCLA to enable qualified researchers with approved projects to access confidential, unpublished Census Bureau data.
     There are nine RDCs in the U.S.: Berkeley, UCLA, Boston, Baruch, Cornell, Ann Arbor, Duke, Chicago, Washington, DC (plus Minnesota!)
     CCRDC on the web: http://www.ccrdc.ucla.edu/

  4. Why do RDCs Exist?
     Tension between high-quality data collection and distribution/use.
     • The Census Bureau and other federal agencies collect a huge amount of data, with many items "sensitive"
     • To maintain high response rates, respondents are promised confidentiality
     • Data have diverse uses and users

  5. How to reconcile tensions?
     • Release of aggregate data (Summary data)

  6. How to reconcile tensions?
     • Release of aggregate data (Summary data)
     • Release of microdata with masked geography, selected items, top-coded categories (PUMS)

  7. How to reconcile tensions?
     • Release of aggregate data (Summary data)
     • Release of microdata with masked geography, selected items, top-coded categories (PUMS)
     • Creation of synthetic data (LEHD)

  8. How to reconcile tensions?
     • Release of aggregate data (Summary data)
     • Release of microdata with masked geography, selected items, top-coded categories (PUMS)
     • Creation of synthetic data (LEHD)
     • Controlled access with tight security and disclosure review (RDCs)
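
The first two approaches on this slide can be sketched in a few lines of Python. This is only an illustration of the general ideas (aggregation, geographic masking, top-coding) and not the Census Bureau's actual procedure: the records, the `TOP_CODE` cap, and the function names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical confidential microdata: (tract, county, income).
# Every value here is invented for illustration.
records = [
    ("T1", "Alameda", 42_000),
    ("T1", "Alameda", 980_000),
    ("T2", "Alameda", 61_000),
    ("T3", "Marin", 75_000),
]

TOP_CODE = 150_000  # assumed cap; real thresholds vary by survey and item


def top_code(value, cap=TOP_CODE):
    """Replace any value above the cap with the cap itself."""
    return min(value, cap)


def mask_record(record):
    """Drop fine geography (tract) and top-code income, PUMS-style."""
    _tract, county, income = record
    return (county, top_code(income))


def summarize(rows):
    """Aggregate release: counts and mean income per county, no individual rows."""
    by_county = defaultdict(list)
    for _tract, county, income in rows:
        by_county[county].append(income)
    return {
        county: {"n": len(incomes), "mean_income": mean(incomes)}
        for county, incomes in by_county.items()
    }


public_microdata = [mask_record(r) for r in records]
summary = summarize(records)
```

In the masked file the very high income is indistinguishable from any other value at or above the cap, and the tract is no longer visible. That loss of detail is what the controlled-access route avoids.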

  9. Purpose of Census Research Data Centers
     • Protected Access to non-public use data for Researchers
     • Secure facility
     • Presence of Census Bureau employee
     • Disclosure Review
     • Benefits to Census Bureau
     • Necessary for access to Title 13 and Title 26 data
     • Not required for NCHS, AHRQ data if not linked to Title 13 data
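
The slides do not describe what disclosure review checks. One common kind of check in statistical disclosure control is suppressing table cells built from very few respondents. The sketch below assumes that kind of rule. The threshold `MIN_CELL_SIZE`, the function `review_table`, and the example counts are hypothetical and do not represent the Census Bureau's actual criteria.

```python
MIN_CELL_SIZE = 10  # assumed threshold, for illustration only


def review_table(cell_counts, min_cell_size=MIN_CELL_SIZE):
    """Return a copy of the table with small cells suppressed (set to None)."""
    return {
        cell: (count if count >= min_cell_size else None)
        for cell, count in cell_counts.items()
    }


# Hypothetical output a researcher asks to take out of the secure facility.
requested = {"county A": 250, "county B": 4, "county C": 37}
released = review_table(requested)
```

Here the cell with only four observations would be withheld, while the larger cells pass through unchanged.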

  10. Data at the RDCs include
     • Demographic Surveys and Censuses
        • Decennial Census
        • American Community Survey
        • CPS, SIPP, AHS, NLS, and more…
     • Economic Surveys and Censuses
        • Longitudinal Business Database
        • Census of Manufactures, Services, Mining, Retail Trade, Wholesale Trade, Transportation, Communications and Utilities
        • Survey of Employers, Plant Capacity, Capital Expenditures, Pollution Abatement Costs, Energy Consumption, and more…

  11. Additional Data: National Center for Health Statistics
     • We are now hosting research using confidential NCHS and AHRQ data in the CCRDC
     • Rules for access and disclosure are the same as those in their enclaves
       http://www.cdc.gov/nchs/r&d/rdc.htm
       http://www.meps.ahrq.gov
     • No requirement to demonstrate Census benefit
     • Disclosure Avoidance review conducted by partner agencies
     • Long list of datasets, including NHIS, NHANES, NSFG, LSOA…

  12. Partners and Constituents
     • Census Bureau, UC Berkeley, Researchers …and oversight agencies
     • Joint Project Agreement identifies responsibilities: Financial, Security, Employees, Processes
     • Individual Agreements with Researchers, Special Sworn Status

  13. Partners and Constituents: Joint & Complementary Interests
     • Census Bureau – Benefits to the Bureau are an integral and overriding part of every project
     • Berkeley – Availability seen as a key component for research, faculty recruitment/retention
     • Researchers – Data allows:

  14. Why use data at an RDC?
     • Not available elsewhere
        • Establishment-level business data
        • Linked household-firm (LEHD) data
     • More detail than elsewhere
        • Detailed geo-spatial variables
        • Virtually no top or bottom coding
     • Possible to link to other non-Census data

  15. Proposal Process & Environment
     • Proposal
     • Create account at Census – Online submission
     • Contact with local RDC administrator for project development, scope, benefits to Bureau
     • Special Sworn Status
     • Fairly long lead time, internal/external reviews, Disclosure Risks vs. Benefits
     • Secure Data Center, thin client, Linux
     • GIS tools include SAS 9.2, R, GRASS
     • Also Stata, SUDAAN, Gauss, Matlab, etc.
     • Restricted Entry, Printing, Isolated from Internet, 24-hour surveillance
