
Presentation at the Research Conference on Research Integrity, Niagara Falls, NY, May 16, 2009

Research Data in The Social Sciences: How Much is Being Shared? Amy Pienta, Myron Gutmann, Jared Lyle. ICPSR, University of Michigan.


Presentation Transcript


  1. Research Data in The Social Sciences: How Much is Being Shared?
     Amy Pienta, Myron Gutmann, Jared Lyle
     ICPSR, University of Michigan
     Presentation at the Research Conference on Research Integrity, Niagara Falls, NY, May 16, 2009

  2. Types of Social Science Data
     MAJOR SOCIAL SCIENCE TOPICS
     • Social - class, crime, social movements, race relations, culture, folklore, family, aging
     • Economic - wealth, prosperity, labor, business
     • Psychological - cognition, attitudes, stereotypes
     • Politics - justice, democracy, public policy, public administration, international conflict
     TYPES OF DATA
     • Surveys, Opinion Polls, Structured Interviews, Experiments, GIS (map)
     • Administrative & Historical Records
     • Video, Audio, Transcripts, Text
     • Web sites, Email, Blogs

  3. How Can We Think About Data Sharing?
     • Making one’s research data available for others to analyze and/or reanalyze
     • Placing one’s data in the public domain
     • Depositing data in an archive that has an explicit mission to preserve and disseminate data to a wide audience

  4. Value of Data Sharing in the Social Sciences
     • Replication
     • Surveys often cover more than any one researcher needs or has time to analyze
     • Improves other data collections and measurement
     • Reduces costs by avoiding duplicate data collection efforts
     • Research training
     • Data ownership extends beyond the PI

  5. Many Avenues for Sharing Data in the Social Sciences
     • Broad-based social science data archives
     • National data archives (outside the US)
     • Thematic “boutique” archives
     • Institutional repositories
     • Journal-based archives
     • Individual/departmental websites

  6. Why are data not shared?
     • Preparing data and documentation can be enormously time-consuming
     • Need to protect the confidentiality of respondents
     • Fear of getting “scooped”
     • Lack of rewards for sharing
     • Limited resources for data preparation

  7. NSF Data Sharing Policy
     National Science Foundation Important Notice 106 (April 17, 1989) states: "[NSF] expects investigators to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections, and other supporting materials created or gathered in the course of the research. It also encourages awardees to share software and inventions or otherwise act to make such items or products derived from them widely useful and usable."

  8. NIH Data Sharing Policy
     The NIH expects and supports the timely release and sharing of final research data from NIH-supported studies for use by other researchers. Starting with the October 1, 2003 receipt date, investigators submitting an NIH application seeking $500,000 or more in direct costs in any single year are expected to include a plan for data sharing or state why data sharing is not possible.
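The NIH rule on this slide reduces to two conditions: a receipt date on or after October 1, 2003, and at least $500,000 in direct costs in any single year. The sketch below illustrates that reading only. The function name, parameter names, and the representation of a budget as a list of yearly amounts are assumptions made for illustration, not part of any NIH system.

```python
from datetime import date

# Both values come from the slide text.
POLICY_START = date(2003, 10, 1)
DIRECT_COST_THRESHOLD = 500_000  # US dollars, direct costs in a single year


def needs_data_sharing_plan(receipt_date, direct_costs_by_year):
    """Return True if an application is expected to include a data sharing
    plan (or a statement of why sharing is not possible).

    receipt_date: datetime.date on which the application was received.
    direct_costs_by_year: iterable of requested direct costs, one per year
    (hypothetical representation of the budget).
    """
    if receipt_date < POLICY_START:
        return False
    # The threshold applies to any single year, not to the project total.
    return any(cost >= DIRECT_COST_THRESHOLD for cost in direct_costs_by_year)
```

Note that a five-year request of $300,000 per year totals $1.5 million but would not trigger the expectation under this reading, because no single year reaches the threshold.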

  9. Goals
     • To identify the “universe” of social science data that have been collected
     • To know how much social science data is “at risk” of being lost or has been lost (versus data that are available and preserved)
     • To understand the value of sharing and/or data archiving

  10. LEADS Database at ICPSR
      • NICHD funding – PI Survey about Disclosure Risks
      • Library of Congress funding – Identification and Appraisal of “at risk” Social Science Data
      • ORI RRI funding (NLM) – Creating a research database

  11. What is LEADS?
      A database of records containing information about thousands of scientific studies that may have produced social science data.
      The database contains:
      • Descriptive information about scientific studies we identify
      • Information used to determine “fit” and “value” of a scientific study
      • Value-added information from bibliometric analysis, PI surveys, constructed variables

  12. Sources of Information
      • National Science Foundation
      • National Institutes of Health

  13. LEADS Screening Criteria
      • Social science and/or behavioral science
      • Original or primary data collection proposed, including assembling a database from existing (archival) sources
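The two screening criteria can be read as a simple filter over award records. The following is a minimal sketch under that assumption: it treats both criteria as required, which the slide implies but does not state outright, and the `Award` class and all of its field names are hypothetical rather than the actual LEADS schema.

```python
from dataclasses import dataclass


@dataclass
class Award:
    # Hypothetical fields; the real LEADS record layout is not given in the slides.
    award_id: str
    is_social_or_behavioral: bool
    collects_primary_data: bool
    assembles_from_archival_sources: bool = False


def passes_screening(award):
    """Apply the two criteria from the slide (assumed to be jointly required)."""
    # Assembling a database from existing sources counts as data collection.
    proposes_data = (
        award.collects_primary_data or award.assembles_from_archival_sources
    )
    return award.is_social_or_behavioral and proposes_data
```

For example, a behavioral science award that only compiles a database from archival records would pass, while a social science award that proposes no data collection of either kind would not.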

  14. NSF Grant Awards in LEADS
      • LEADS contains 17,194 awards made by NSF
      • LEADS spans 30 years of NSF awards - 1976 to 2005

  15. NIH Grant Awards in LEADS
      • NICHD, NIA, NIMH, NINR, AHRQ, NIAAA, NIDA, Clinical Center, NIDCD, FIC, NCI, NHLBI, NIDDK (1972+)
      • 172,196 total awards screened

  16. LEADS Database at ICPSR

  17. Results: Total and By Funding Agency

  18. Results: By Award Year

  19. Results: By Gender of PI

  20. LEADS: How Data Are Lost
      Data Intentionally Discarded
      • “I generally keep data for…10 years beyond the last time I do something with them.”
      • “The material…was considered sensitive data. Institutional review boards…required us to promise to destroy the data after a certain period of time...”
      • “As I retired…I simply didn’t have the room to store these data sets at my house.”

  21. LEADS: How Data Are Lost
      Unintentionally Lost
      • “Some data were collected, but the data file was lost in a technical malfunction.”
      • “The data from the studies were on punched cards that were destroyed in a flood in the department in the early 80s.”

  22. Conclusion & Limitations
      • Most NIH- and NSF-funded social science data are not publicly archived
      • Lower bound estimate: 3.8%
      • Upper bound estimate: 14.2%
      Limitations
      • Selectivity abounds (e.g. Harvard Dataverse Catalog; PI Pilot Survey)
      • Informal data sharing has not been taken into account
