Learn how NIH matches its administrative database to external data to analyze its research training programs, including allocation across fields, diversity, completion rates, and post-PhD activities, and how the results inform policy and planning. The use of auxiliary mapping tables improves dataset quality.
Best Practices: Leveraging Existing Data Resources to Evaluate NIH Research Training Programs
American Evaluation Association
October 18, 2013
Jennifer Sutton, MS, Deepshikha RoyChowdhury, PhD, Cassandra Spears, Katrina Pearson, and Robin Wagner, PhD
Overview
• Background
• Research Questions
• Data and Methodology
• Results
• Conclusions
Background
• In evaluating its research training programs, NIH regularly uses information from existing national resources
• This presentation will cover:
  • The methods used to match NIH and external databases
  • The broader range of evaluation questions that can be answered as a result
  • How the evaluation results are being used to inform NIH policies and planning
Research Questions
• What is NIH’s role in doctoral education nationally?
• How are NIH investments in research training allocated across scientific fields? Has that distribution changed over time?
• How do NIH-supported predoctoral trainees and fellows compare to other PhD recipients in related fields in terms of time to degree and completion rates? In terms of diversity?
• What are the post-PhD plans and research activities of NIH-supported trainees and fellows?
Data Sources
IMPAC II: NIH’s administrative database, which includes information on research grants, contracts, and research training and fellowship awards
Doctorate Records File (DRF): The consolidated results of the Survey of Earned Doctorates, an annual census of all individuals receiving US research doctorates since 1957
• The survey is coordinated by the National Science Foundation (NSF) and co-sponsored by the NIH and other federal agencies
• NIH receives a copy of the DRF under a licensing agreement with the NSF
Methodology
• NIH receives an updated copy of the DRF every fall and matches it to its IMPAC II database
• The latest version of the DRF includes US PhD recipients through June 2011
• For individuals listed in NIH administrative files as predoctoral trainees or fellows, an appearance in the DRF:
  • Confirms whether they completed the PhD
  • Provides information on their doctoral training, such as field of study, time to degree, and post-PhD plans
Data Matching Process
• The DRF file is received and uploaded to the IMPAC II database. The validity of the data is confirmed by running manual queries. Existing auxiliary mapping tables are updated, if needed.
Step 1
• DRF and NIH data in IMPAC II are uploaded to Oracle staging tables and matched on the basis of the main matching attributes, such as the last four digits of the SSN, date of birth, and name.
• Various permutations of the matching attributes (SSN+DOB, SSN+last name, DOB+last name, etc.) are used to calculate weighted scores. Cases of flip-flopped names are accounted for as well.
• “Non-matches” are identified and filtered out.
Matching Algorithm Executed
• Matches with weights below the minimum threshold are removed, and the matched table is uploaded to IMPAC II.
Step 2
• The most common names in the DRF and NIH data sets are identified, and the weighted score for those names is reduced in the matching table.
• Individuals with matching fields such as state address information, expertise, PhD institution, and foreign citizenship are identified, and the weighted score is increased each time a match is found for a field.
• Weighted scores are rescaled.
• Previously identified “bad matches” are removed from the matching table.
Step 3
Inspection and Testing
• Manual inspection and testing of the matches are performed before the table is finalized.
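The weighted scoring described on this slide can be illustrated with a minimal Python sketch. This is not NIH's actual implementation (the slide describes Oracle staging tables). The `Person` record, the weights, the penalty and bonus sizes, the threshold, and the order in which the adjustments are applied are all hypothetical values chosen for illustration. Only PhD institution is used here as a supporting field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical record layout; the real IMPAC II and DRF schemas differ."""
    ssn4: str                  # last four digits of the SSN
    dob: str                   # date of birth, e.g. "1980-05-01"
    first: str
    last: str
    phd_institution: str = ""


# Assumed weights for each attribute pair (SSN+DOB, SSN+last name, DOB+last name).
PAIR_WEIGHTS = {("ssn4", "dob"): 5, ("ssn4", "last"): 3, ("dob", "last"): 2}
COMMON_NAME_PENALTY = 1        # assumed deduction for very common last names
INSTITUTION_BONUS = 1          # assumed increase when PhD institution agrees
MIN_SCORE = 0.5                # assumed minimum threshold on the rescaled score


def _norm(text: str) -> str:
    return text.strip().lower()


def agrees(a: Person, b: Person, field: str) -> bool:
    """True if the field agrees; last names also agree when flip-flopped."""
    if field == "last":
        straight = _norm(a.last) == _norm(b.last)
        flipped = (_norm(a.last) == _norm(b.first)
                   and _norm(a.first) == _norm(b.last))
        return straight or flipped
    return getattr(a, field) == getattr(b, field)


def match_score(nih: Person, drf: Person, common_last_names=frozenset()) -> float:
    """Return a score rescaled to [0, 1]; 0.0 marks a non-match."""
    score = sum(weight for (f1, f2), weight in PAIR_WEIGHTS.items()
                if agrees(nih, drf, f1) and agrees(nih, drf, f2))
    if score == 0:
        return 0.0             # no attribute pair agrees: filter out
    if _norm(nih.last) in common_last_names:
        score -= COMMON_NAME_PENALTY
    if nih.phd_institution and _norm(nih.phd_institution) == _norm(drf.phd_institution):
        score += INSTITUTION_BONUS
    max_score = sum(PAIR_WEIGHTS.values()) + INSTITUTION_BONUS
    return max(0, score) / max_score


def is_match(nih: Person, drf: Person, common_last_names=frozenset()) -> bool:
    return match_score(nih, drf, common_last_names) >= MIN_SCORE
```

In this sketch a pair is kept only if its rescaled score reaches `MIN_SCORE`. In practice the surviving pairs would still go through the manual inspection and testing described on the slide.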
Broad Fields of Study of NIH-Supported Trainees and Fellows, 2011
Doctoral Fields With the Highest Concentration of NIH-Supported Trainees and Fellows, 2011
Arrows indicate fields where the percentage of NIH-supported PhDs has increased >10% since 2006
Trends in PhD Fields of Study of NIH-Supported Trainees and Fellows
Trends in NIH-Supported Trainees and Fellows Receiving PhDs, by Sex
The percentage of NIH-supported trainees and fellows who were women increased from 39% in 1986 to 54% in 2011.
Trends in NIH-Supported Trainees and Fellows Receiving PhDs, by Race and Ethnicity
The percentage of NIH trainees and fellows from groups underrepresented in science (i.e., African Americans, Hispanics, Native Americans/Alaska Natives) increased from 5% in 1986 to 14% in 2011.
Post-PhD Plans of NIH-Supported Trainees and Fellows, 2011
[Chart: percentage of PhD recipients, by year of PhD]
Subsequent NIH Grant Activity Within 15 years of their degrees:
Conclusions
Matching NIH records with data from the DRF and other sources enhances NIH’s capacity to evaluate its research training and fellowship programs by providing:
• Access to information not available in NIH databases
• Context and comparison groups
• An understanding of how NIH programs contribute to US doctoral education overall
Conclusions, cont.
Our analyses indicate:
• NIH-supported trainees and fellows are more likely to complete the PhD than other graduate students and do so in less time
• The percentages of women and underrepresented minorities among the NIH-supported trainees and fellows receiving PhDs have steadily increased over time
• NIH-supported trainees and fellows are more likely to remain active in research careers and be successful in obtaining research funding from the NIH than other PhD recipients
• However, there are signs of uncertainty in the job market that bear watching
Contact Information
Office of Extramural Programs
Jennifer Sutton – suttonj@mail.nih.gov
Office of Planning, Analysis and Communications
Deepshikha RoyChowdhury – roychowdhuryd@mail.nih.gov
Cassandra Spears – spearsc@mail.nih.gov
Katrina Pearson – pearsonk@mail.nih.gov
Robin Wagner – wagnerr2@mail.nih.gov
Questions? Thank you!