Explore the NCHS Research Data Center, offering access to public and confidential health data tailored to your project needs. Learn about data types, access methods, and collaboration examples.
2008 NCHS Data Users’ Conference Omni Shoreham Hotel Washington, DC Wednesday, August 13, 2008
Session 43: The Research Data Center: Data Enclave of the NCHS. Session Coordinator and Presenter: Deborah Rose, Ph.D.
Speakers and topics (updated list) Introduction to the RDC: • Overview of the Research Data Center – What it is and what it does - Deborah Rose, Ph.D., M.P.H. • Types of data available - Stephanie Robinson, B.A.
Speakers and topics (updated list, continued) Examples of RDC research collaborations: • Emergency medicine research and the RDC - Julius Cuong Pham, M.D., Ph.D. • Assessing health and health care in the District of Columbia - Carole Roan Gresenz, Ph.D. • Combining contextual variables with data from NHANES III - Chloe Bird, Ph.D.
What is the RDC? • Location: A suite of offices in Hyattsville, Maryland • Staff: Project managers who are experienced analysts • Security: Keypad-access office suite, stand-alone computers • Data: Public and confidential health and other information, combined and customized for your project • Access: Onsite, remote, Census RDC
Start of the RDC • Modeled after the Census Bureau Data Research Centers • Opened in 1998 • Policies were developed to assure access and confidentiality
Two contradictory mandates • Wide dissemination of the data - The Public Health Service Act of 1956 requires the collection and wide dissemination of data. • Maintenance of confidentiality - the NCHS 308(d) Confidentiality Statute requires that the information collected may not be released if the establishment or person supplying the information is identifiable
Resolving the contradiction • Summary tables of aggregate data are published (on paper or on the web) • Public use datasets are released with person or institution level records • Records do not include individual identifiers • Variables that might allow record identification are suppressed • Values based on small samples are suppressed
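The last bullet above (suppressing values based on small samples) can be illustrated with a minimal sketch. The threshold of 5 and the county names are assumptions made for illustration; the slides do not state the actual NCHS suppression rules.

```python
# Hypothetical minimum cell size; the slides do not give the real threshold.
MIN_CELL_SIZE = 5


def suppress_small_cells(table, min_size=MIN_CELL_SIZE):
    """Return a copy of a {category: count} table with small counts set to None."""
    return {
        category: (count if count >= min_size else None)
        for category, count in table.items()
    }


# Made-up counts for three made-up areas.
counts = {"County A": 120, "County B": 3, "County C": 47}
published = suppress_small_cells(counts)
print(published)
```

Here the count for "County B" falls below the assumed threshold, so it is withheld while the other two are published unchanged.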
When do you need the RDC? Data needs and availability • You have a research project or policy objective best served by analyzing representative, federally collected health data • Public use data does not meet all your needs • NCHS has, or can get, the data of interest
When do you need the RDC? (continued) Analytic skills and computer access • You or your staff have the skills to analyze individual level data using a standard statistical package • You can come to the NCHS RDC or a Census RDC, or you have a secure email account to access our remote system.
When do you need confidential variables? • Confidential information is directly related to your main research question. • You need to link two or more datasets, using small area geographic identifiers (such as state, county or census tract) that are not publicly available. • You need to make a subset of the population using selection criteria from a confidential variable such as exact age, date of interview, or small race/ethnic group.
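As a rough sketch of what linking on a small area geographic identifier looks like, the example below joins survey records to county-level contextual data. All field names, codes, and values are invented for illustration; in practice the RDC staff perform the merge, and the geographic identifier is not released to the researcher.

```python
# Illustrative records only; field names and values are made up.
survey_records = [
    {"respondent_id": 1, "county_fips": "24033", "bmi": 27.1},
    {"respondent_id": 2, "county_fips": "11001", "bmi": 23.4},
    {"respondent_id": 3, "county_fips": "99999", "bmi": 30.2},
]

# Contextual (user-supplied) data keyed by the same county identifier.
county_context = {
    "24033": {"median_income": 70000},
    "11001": {"median_income": 60000},
}


def link_by_county(records, context):
    """Attach county-level variables to each record that has a matching county."""
    linked = []
    for record in records:
        extra = context.get(record["county_fips"])
        if extra is None:
            continue  # no contextual data for this county; drop the record
        linked.append({**record, **extra})
    return linked


linked = link_by_county(survey_records, county_context)
print(len(linked), "of", len(survey_records), "records linked")
```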
Types of Data • NCHS Supplied: Confidential variables from the vital statistics system, any of the NCHS data collection systems, or files linked between systems • User supplied: Public use or other data collected by other agencies, or compiled by the user • See next presentation for more detail
Major Steps • See the website for the latest requirements • Develop and submit a proposal • We review it and accept, reject or ask for revisions • You sign the confidentiality agreements • You send us the public use files • We merge the public and confidential data • We send you an invoice for the setup and usage costs
Major Steps (continued) • You run your analyses • We review the output for disclosure • You publish • Please send us a citation and copy of your published or reported work!
Components of the proposal • Contact information • Key study questions/Public health benefits • Year, data system and dataset(s) • Lists of public use and confidential variables • Why publicly available data are insufficient • Analysis/statistical methods/software • Sample output table shells
NCHS User Fees File construction and setup • Mortality files = $250 per day • All other files = $500 per day
NCHS User Fees (continued) Access and Analysis • On site: $200 per day (2-10 days) • Remote: NSFG-CDF = $500/year • Remote: All other files = $500/month • Remote: Each added survey cycle = $250/month
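The fee schedule on these two slides can be turned into a rough cost estimate. The sketch below uses only the 2008 rates listed above and makes two assumptions the slides do not confirm: that "2-10 days" is the permitted length of an on-site visit, and that the added-cycle fee applies per cycle per month on top of the base remote rate. The NSFG-CDF annual rate is left out for simplicity.

```python
# Rates in US dollars, taken from the slides (2008).
SETUP_RATE_PER_DAY = {"mortality": 250, "other": 500}
ONSITE_RATE_PER_DAY = 200
REMOTE_RATE_PER_MONTH = 500
ADDED_CYCLE_RATE_PER_MONTH = 250


def setup_fee(file_type, days):
    """File construction and setup cost for the given number of staff days."""
    return SETUP_RATE_PER_DAY[file_type] * days


def onsite_fee(days):
    """On-site access cost; assumes 2-10 days is the allowed visit length."""
    if not 2 <= days <= 10:
        raise ValueError("on-site visits are assumed to last 2 to 10 days")
    return ONSITE_RATE_PER_DAY * days


def remote_fee(months, added_cycles=0):
    """Remote access cost; assumes each added cycle is billed per month."""
    monthly = REMOTE_RATE_PER_MONTH + ADDED_CYCLE_RATE_PER_MONTH * added_cycles
    return monthly * months


# Example: two setup days for a non-mortality file, then three months
# of remote access with one added survey cycle.
total = setup_fee("other", 2) + remote_fee(3, added_cycles=1)
print(total)
```

Under these assumptions the example comes to $1,000 for setup plus $2,250 for remote access.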
ANDRE: ANalytical Data Research by Email • Completely automated system • Operates round the clock without any human intervention • Registered subscribers only • Proposals already reviewed and approved • Confidentiality agreements have been signed • Unlimited access during the subscription period
How ANDRE Works • A registered subscriber sends an email to ANDRE with a SAS or SUDAAN program in an attachment • ANDRE’s lead server authenticates the user through password challenge and email • Researchers never see data but run their programs against a data set prepared to their specifications by RDC staff
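The first step above (an email carrying a SAS program as an attachment) might be composed as in the sketch below. The addresses, subject line, file name, and SAS code are all hypothetical; the slides do not give ANDRE's actual address, required message format, or authentication details. The message is only built here, not sent.

```python
from email.message import EmailMessage

# Hypothetical SAS program; the dataset name is a placeholder.
sas_program = "proc means data=mydata;\nrun;\n"

msg = EmailMessage()
msg["From"] = "researcher@example.org"    # hypothetical subscriber address
msg["To"] = "andre@example.invalid"       # hypothetical; not the real ANDRE address
msg["Subject"] = "Analysis request"       # hypothetical subject line
msg.set_content("Program attached.")

# Attach the program as a plain-text file.
msg.add_attachment(sas_program, filename="analysis.sas")

attachments = list(msg.iter_attachments())
print(attachments[0].get_filename())
```

Actually sending the message would require the subscriber's own mail server settings and the password challenge described on the slide, neither of which is sketched here.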
NCHS RDC Usage Statistics • Average no. of projects, 1998-2003 = 1.5/month • Average no. of projects, 2004-2006 = 2.5/month • Average no. of proposals, 2007 = 10/month • Current no. of active projects, 2007 = 146 • Average no. of daily remote users, 2007 = 18 • Average no. of proposals, 2008 = 7/month • Current no. of active projects, 2008 > 200 • Average no. of daily remote users, 2008 = 30
Visit the NCHS RDC website at: http://www.cdc.gov/nchs/r&d/rdc.htm
For more information, visit the NCHS RDC website at: www.cdc.gov/nchs/r&d/rdc.htm