Unlocking Research Data: NCHS Data Enclave Overview

Explore the NCHS Research Data Center, offering access to public and confidential health data tailored to your project needs. Learn about data types, access methods, and collaboration examples.

Presentation Transcript


  1. 2008 NCHS Data Users’ Conference, Omni Shoreham Hotel, Washington, DC, Wednesday, August 13, 2008

  2. Session 43: The Research Data Center: Data Enclave of the NCHS. Session Coordinator and Presenter: Deborah Rose, Ph.D.

  3. Speakers and topics (updated list) Introduction to the RDC: • Overview of the Research Data Center – What it is and what it does - Deborah Rose, Ph.D., M.P.H. • Types of data available - Stephanie Robinson, B.A.

  4. Speakers and topics (updated list, continued) Examples of RDC research collaborations: • Emergency medicine research and the RDC - Julius Cuong Pham, M.D., Ph.D. • Assessing health and health care in the District of Columbia - Carole Roan Gresenz, Ph.D. • Combining contextual variables with data from NHANES III - Chloe Bird, Ph.D.

  5. What is the RDC? • Location: A suite of offices in Hyattsville, Maryland • Staff: Project managers who are experienced analysts • Security: Keypad-access office suite, stand-alone computers • Data: Public and confidential health and other information, combined and customized for your project • Access: Onsite, remote, Census RDC

  6. Start of the RDC • Modeled after the Census Bureau Research Data Centers • Opened in 1998 • Policies were developed to assure access and confidentiality

  7. Two contradictory mandates • Wide dissemination of the data - The Public Health Service Act of 1956 requires the collection and wide dissemination of data. • Maintenance of confidentiality - the NCHS 308(d) Confidentiality Statute requires that the information collected may not be released if the establishment or person supplying the information is identifiable

  8. Resolving the contradiction • Summary tables of aggregate data are published (on paper or on the web) • Public use datasets are released with person or institution level records • Records do not include individual identifiers • Variables that might allow record identification are suppressed • Values based on small samples are suppressed
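The last bullet on this slide, suppressing values based on small samples, can be illustrated with a minimal Python sketch. The threshold `MIN_CELL_SIZE`, the function name, and the sample records are all hypothetical; the slides do not state the cutoff or the procedure NCHS actually uses.

```python
from collections import Counter

# Hypothetical minimum cell size; the slides do not give the actual NCHS threshold.
MIN_CELL_SIZE = 5

def tabulate_with_suppression(records, key, min_cell_size=MIN_CELL_SIZE):
    """Count records per category, replacing counts below the threshold with None."""
    counts = Counter(record[key] for record in records)
    return {
        category: (n if n >= min_cell_size else None)
        for category, n in counts.items()
    }

# Made-up records: 12 respondents in region "A" and 3 in region "B".
records = [{"region": "A"}] * 12 + [{"region": "B"}] * 3
table = tabulate_with_suppression(records, "region")
# Region "B" falls below the threshold, so its count is withheld (None).
```

Real disclosure review involves more than a single cell-size rule (for example, checking that a suppressed cell cannot be recovered from row or column totals), so this shows only the basic idea.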

  9. When do you need the RDC? Data needs and availability • You have a research project or policy objective best served by analyzing representative, federally collected health data • Public use data does not meet all your needs • NCHS has, or can get, the data of interest

  10. When do you need the RDC? (continued) Analytic skills and computer access • You or your staff have the skills to analyze individual level data using a standard statistical package • You can come to the NCHS RDC or a Census RDC, or you have a secure email account to access our remote system.

  11. When do you need confidential variables? • Confidential information is directly related to your main research question. • You need to link two or more datasets, using small area geographic identifiers (such as state, county or census tract) that are not publicly available. • You need to make a subset of the population using selection criteria from a confidential variable such as exact age, date of interview, or small race/ethnic group.
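Linking datasets on a small-area geographic identifier, as described in the second bullet above, might look like the following Python sketch. The field names (`county_fips`, `respondent_id`, `clinics_per_10k`) and all values are invented for illustration. In practice the slides indicate that RDC staff perform the merge, and researchers do not handle the confidential identifiers themselves.

```python
def link_by_county(survey_records, county_context):
    """Attach county-level context variables to each survey record.

    Records whose county has no contextual data are left out of the result.
    """
    linked = []
    for record in survey_records:
        context = county_context.get(record["county_fips"])
        if context is None:
            continue  # no contextual data for this county
        linked.append({**record, **context})
    return linked

# Made-up survey records carrying a (normally confidential) county code.
survey = [
    {"respondent_id": 1, "county_fips": "24033"},
    {"respondent_id": 2, "county_fips": "11001"},
    {"respondent_id": 3, "county_fips": "99999"},
]

# Made-up user-supplied contextual file keyed by the same county code.
context = {
    "24033": {"clinics_per_10k": 1.2},
    "11001": {"clinics_per_10k": 2.5},
}

linked = link_by_county(survey, context)
# Respondent 3 is dropped because county "99999" has no contextual record.
```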

  12. Types of Data • NCHS Supplied: Confidential variables from the vital statistics system, any of the NCHS data collection systems, or files linked between systems • User supplied: Public use or other data collected by other agencies, or compiled by the user • See next presentation for more detail

  13. Major Steps • See the website for the latest requirements • Develop and submit a proposal • We review it and accept, reject or ask for revisions • You sign the confidentiality agreements • You send us the public use files • We merge the public and confidential data • We send you an invoice for the setup and usage costs

  14. Major Steps (continued) • You run your analyses • We review the output for disclosure • You publish • Please send us a citation and copy of your published or reported work!

  15. Components of the proposal • Contact information • Key study questions/Public health benefits • Year, data system and dataset(s) • Lists of public use and confidential variables • Why publicly available data are insufficient • Analysis/statistical methods/software • Sample output table shells

  16. NCHS User Fees: File construction and setup • Mortality files = $250 per day • All other files = $500 per day

  17. NCHS User Fees (continued): Access and analysis. On site: • $200 per day (2-10 days). Remote: • NSFG-CDF = $500/year • All other files = $500/month • Each added survey cycle = $250/month
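As a rough worked example, the fee schedule on the two slides above can be turned into a small Python estimator. The rates are copied from the slides (2008 figures). The function names are hypothetical, and two readings are assumptions: that "(2-10 days)" is the range of visit lengths the onsite rate covers, and that each added survey cycle is billed for every month of remote access. The NSFG-CDF yearly rate is not modeled.

```python
# Rates copied from the fee slides (2008 figures, US dollars).
SETUP_RATE_PER_DAY = {"mortality": 250, "other": 500}
ONSITE_RATE_PER_DAY = 200
REMOTE_RATE_PER_MONTH = 500       # "all other files"; NSFG-CDF ($500/year) not modeled
ADDED_CYCLE_RATE_PER_MONTH = 250

def setup_cost(file_type, setup_days):
    """File construction and setup cost for 'mortality' or 'other' files."""
    return SETUP_RATE_PER_DAY[file_type] * setup_days

def onsite_cost(file_type, setup_days, onsite_days):
    """Setup plus onsite access."""
    # Assumption: "(2-10 days)" is the range of visits the daily rate covers.
    if not 2 <= onsite_days <= 10:
        raise ValueError("the slide quotes the onsite rate for 2-10 days only")
    return setup_cost(file_type, setup_days) + ONSITE_RATE_PER_DAY * onsite_days

def remote_cost(file_type, setup_days, months, added_cycles=0):
    """Setup plus remote access."""
    # Assumption: each added survey cycle is billed for every month of access.
    monthly = REMOTE_RATE_PER_MONTH + ADDED_CYCLE_RATE_PER_MONTH * added_cycles
    return setup_cost(file_type, setup_days) + monthly * months

# Two days of setup on non-mortality files, then a five-day onsite visit:
onsite_total = onsite_cost("other", setup_days=2, onsite_days=5)    # 2*500 + 5*200 = 2000

# One day of setup on mortality files, three months remote, one added cycle:
remote_total = remote_cost("mortality", setup_days=1, months=3, added_cycles=1)  # 250 + 3*750 = 2500
```

Actual invoices may differ, so check the RDC website for current rates.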

  18. ANDRE: ANalytical Data Research by Email • Completely automated system • Operates round the clock without any human intervention • Registered subscribers only • Proposals already reviewed and approved • Confidentiality agreements have been signed • Unlimited access during the subscription period

  19. How ANDRE Works • A registered subscriber sends an email to ANDRE with a SAS or SUDAAN program in an attachment • ANDRE’s lead server authenticates the user through password challenge and email • Researchers never see data but run their programs against a data set prepared to their specifications by RDC staff

  20. NCHS RDC Usage Statistics • Average no. of projects, 1998-2003 = 1.5/month • Average no. of projects, 2004-2006 = 2.5/month • Average no. of proposals, 2007 = 10/month • Current no. of active projects, 2007 = 146 • Average no. of daily remote users, 2007 = 18 • Average no. of proposals, 2008 = 7/month • Current no. of active projects, 2008 > 200 • Average no. of daily remote users, 2008 = 30

  21. Visit the NCHS RDC website at: http://www.cdc.gov/nchs/r&d/rdc.htm

  22. For more information, visit the NCHS RDC website at: www.cdc.gov/nchs/r&d/rdc.htm
