The LEADS Database at ICPSR: Identifying Important Social Science Studies for Archiving
Presentation prepared for the 2006 Annual Meeting of IASSIST, Friday, May 26, 2006
LEADS at ICPSR
• We would like to know the "universe" of social science data that have been collected (Identification)
• Data-PASS and ICPSR would like to know how much social science data are "at risk" of being lost, or have already been lost (Appraisal)
• We would also like to know which "at risk" social science data are important enough to be archived
Or…. What have we failed to catch? How big is the “one” that got away?
What is LEADS?
• LEADS is a database of records containing information about scientific studies that may have produced social science data
• LEADS contains descriptive information about the scientific studies that have been identified
• LEADS also contains information that can be used to determine the "fit" and "value" of a scientific study
• LEADS keeps a record of all human (staff) decisions that have been made about the fit and value of a scientific study
(An illustrative sketch of such a record follows below.)
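As a rough illustration only, a LEADS-style record could be sketched as below. The class names, field names, and types are assumptions made for this sketch, not the actual LEADS schema; they simply reflect the three kinds of content the slide describes (descriptive information, fit/value information, and a log of staff decisions).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AppraisalDecision:
    """One staff decision about a study's fit and value (hypothetical structure)."""
    reviewer: str
    date: str              # e.g., "2006-05-26"
    fit: Optional[str]     # does the study fall within scope?
    value: Optional[str]   # is the study important enough to archive?
    notes: str = ""


@dataclass
class LeadsRecord:
    """Hypothetical shape of one LEADS record about a funded study."""
    source: str                       # e.g., "NSF", "NIH/CRISP", "nomination"
    award_id: str
    title: str
    pi_name: str
    abstract: str
    descriptive_fields: dict = field(default_factory=dict)           # topic, methodology, sampling, ...
    decisions: List[AppraisalDecision] = field(default_factory=list)  # audit trail of staff judgments
```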
Sources of Records in LEADS
• NSF research grant awards downloaded from nsf.gov
• NIH research grant awards downloaded from CRISP
• Prospective searches of topical areas/journals
• Researcher nominations (self or other)
NSF Grant Awards in LEADS (pre-screening)
• LEADS contains 17,194 awards made by NSF
• LEADS spans 30 years of awards, 1976 to 2005
• LEADS spans 53 NSF organizations that award grants
• Of the 53 organizations, the 4 with the most records screened (each contributing 1,000+ records) were:
  • SES: Social and Economic Sciences
  • BCS: Behavioral and Cognitive Sciences
  • DMS: Mathematical Sciences
  • IOB: Integrative and Organism Biology
Screening Criteria
• Social science and/or behavioral science
• Original or primary data collection proposed, including assembling a database from existing (archival) sources
(A sketch of applying these criteria to coded records follows below.)
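For illustration only, the two criteria could be applied to staff-coded records along the lines below. The function and flag names are hypothetical, not actual LEADS fields; in practice the screening judgments themselves are made by staff, and this sketch only shows how the resulting codes combine.

```python
def passes_screening(coded_flags: dict) -> bool:
    """Apply the two screening criteria to a staff-coded record (illustrative).

    `coded_flags` stands in for whatever codes staff assign during screening;
    the key names here are assumptions, not actual LEADS fields.
    """
    is_social_or_behavioral = coded_flags.get("social_or_behavioral", False)
    # "Primary data collection" includes assembling a database from existing (archival) sources.
    proposes_primary_data = coded_flags.get("primary_data_collection", False)
    return is_social_or_behavioral and proposes_primary_data


# Example: an award coded as behavioral science proposing a new survey
print(passes_screening({"social_or_behavioral": True, "primary_data_collection": True}))  # True
```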
Types of Research Activity NSF Has Awarded, by Year
[chart: award counts by type of research activity and year]
Note: abstracts become widely available from 1987 onward.
Most Prevalent Social Science Primary Data Collection Awards by NSF Organization
Other Fields Coded During Screening
• Topic/Discipline
• Data Collection Methodology
• Sampling Characteristics
Topic/Discipline in NSF Awards for Primary Social Science Data Collection
[chart: # of NSF Awards by topic/discipline]
An additional 1,594 records were coded "General Social Science."
Type of Data Collection Method/Design in NSF Awards for Primary Social Science Data Collection
[chart: # of NSF Awards by data collection method/design]
NSF Awards for Social Science Primary Data: Proposed Sampling Method
NSF Awards for Social Science Primary Data: Type of Sampling Frame Proposed
NSF Awards for Social Science Primary Data: Proposed Sample Size
NSF Awards for Social Science Primary Data: Race/Ethnic Distribution of Sample
Following-Up: Prospects for Data Archiving
• N=2,336 Primary Social Science Data Collection awards
• N=201 Combined Data Collection Activity and Secondary Data, Social Science Research awards
Steps:
• Select ~10-20 records per week
• Generate updated contact information for the PI
• Determine whether the data are "obviously" archived already (ICPSR, Roper, Odum, Murray, Sociometrics, Google)
• Review related citations
• Review other NSF awards made to the PI
• Contact the PI (Data produced? Data archived? Data still available?)
(A sketch of the weekly selection step follows below.)
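As a minimal sketch of the weekly selection step only, one might pull a batch of not-yet-reviewed records like this. It assumes the hypothetical `decisions` and `descriptive_fields` attributes sketched earlier; the batch size and the "already_archived" flag are illustrative assumptions, not the actual LEADS workflow code.

```python
import random


def weekly_followup_batch(records, batch_size=15, seed=None):
    """Select roughly 10-20 not-yet-reviewed records for this week's PI follow-up (illustrative)."""
    pending = [r for r in records
               if not r.decisions  # no staff follow-up decision recorded yet
               and not r.descriptive_fields.get("already_archived", False)]
    rng = random.Random(seed)
    return rng.sample(pending, k=min(batch_size, len(pending)))
```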
Other Qualitative Fields in LEADS
• Description of how the collection fits within the scope of important social science studies
• Description of the value of the study for archiving
• Priority ranking
• Citations
• PI communication
Problems Archiving Studies
• PI unsure where data are stored
• Data are in an old format that we may or may not be able to recover
• Physical condition (storage media or documentation) has deteriorated
• Paper-copy documentation only, or incomplete documentation
• No English-language documentation
NIH Records in LEADS
• We screened NIH awards for (1) social or behavioral science, (2) original data collection, and (3) quantitative data
• All NIH Institutes: 1990-2001
• NICHD, NIA, NIMH, NINR, AHRQ, NIAAA, NIDA, Clinical Center, NIDCD, FIC, NCI, NHLBI, NIDDK: all years
• 172,196 total awards screened
• 6,381 awards selected
Challenges & Limitations • Size and scope of this project • Need for PI cooperation • Screening error rate has not been quantified • Addressing the ambiguous records • Collaborative projects and continuation projects have not been eliminated
Conclusions
• NIH and NSF award databases are a valuable source of information about studies "at risk" of being lost
• PI grant abstracts vary widely in how much detail they provide about research aims and methodology
• Preliminary results suggest that few studies have been archived, although the rate is higher for NSF-funded studies
• The large number of unarchived studies requires us to use appraisal methods to determine a particular study's value for archiving
People Working on LEADS
NSF: Darrell Donakowski, Lisa Quist, Jared Lyle, Tannaz Sabet
NIH: Russ Hathaway, Felicia LeClere, Brian Madden, James McNally, JoAnne O'Rourke, Kelly Zidar