
Metrics, research award grades, and the REF

Harvey Goldstein, University of Bristol. With support from Mary Day, Ian Diamond and Phil Sooben.


Presentation Transcript


  1. Metrics, research award grades, and the REF. Harvey Goldstein, University of Bristol. With support from Mary Day, Ian Diamond and Phil Sooben.

  2. The context
     • REF proposal to use metrics
     • Journal impact factors and citations
     • Research income
     • Research students
     • Research council grant application grades
     There has been little discussion so far of the technical measurement issues associated with Research Council awards.

  3. The database
     • All ESRC applications 2001-2007
     • Details of applicants, reviewer, assessor and board grades
     • Identification of departments and HEIs
     • Award amounts (not considered)
     • Final analysis of 2698 applications, 1698 departments

  4. A naïve analysis
     Consider the discipline of Education.
     • Note that we have not been able to assign departments to RAE disciplines, so the ‘principal discipline’ is used.
     • Similar results for other disciplines
     • Final award grade converted to a numeric score
     • All award types considered; similar results if fellowships are excluded
     • PI weighted more than co-applicants; the same award score is given to each applicant
     • Weighted analysis of these scores in a 3-level model: application within applicant within HEI

  5. Results of 3-level model (levels: HEI/department; applicant; application)

  6. Problems
     • The analysis is invalid because the scores are not independent.
     • Imagine N applications, each with a different pair of applicants drawn from two particular HEIs, A and B, where each applicant is given the application’s awarded score. A simple analysis would compare the mean score for HEI A with the mean score for HEI B, but these means are equal by definition. The analysis therefore contains no information about HEI differences, unlike the case where a score is derived separately for each applicant in a pair.
     • Applicants may also come from different departments not associated with the principal discipline.
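
The thought experiment on this slide can be checked with a small simulation. This is only a sketch: the scores are made up, and it assumes every application has exactly one applicant from HEI A and one from HEI B.

```python
# Hypothetical illustration: every application pairs one applicant from HEI A
# with one from HEI B, and both applicants receive the application's score.
import random

random.seed(0)
n_applications = 50
scores = [random.uniform(0, 10) for _ in range(n_applications)]  # made-up scores

# Each application contributes its score once to A and once to B.
records = []
for s in scores:
    records.append(("A", s))
    records.append(("B", s))

def hei_mean(rows, hei):
    """Mean of the scores attributed to applicants from one HEI."""
    vals = [s for h, s in rows if h == hei]
    return sum(vals) / len(vals)

mean_a = hei_mean(records, "A")
mean_b = hei_mean(records, "B")
print(mean_a, mean_b, mean_a == mean_b)
```

Whatever scores are drawn, the two means coincide, so comparing them says nothing about the difference between A and B.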

  7. A more valid analysis
     • We reconceptualise the data as follows:
     • We assume each applicant contributes a level of ‘quality’ to the application.
     • The application score is just the average of these (weighted according to whether PI or co-applicant).
     • Some applicants are on more than one application, associated with different combinations of other applicants, and this allows us, in principle, to assign (estimate) a score for each applicant.
     • This is known as a multiple membership (MM) model.
     • Formally, i indexes the application, j indexes the applicant, and the response is the application score.
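
The slide's own formula is not reproduced in this transcript. A common way of writing a multiple membership model that matches the description is sketched below. The notation is an assumption, not the slide's: y_i is the score of application i, A(i) is the set of applicants on it, w_ij are weights summing to one (larger for the PI), u_j is the quality effect of applicant j, and e_i is an application-level residual.

```latex
y_i = \beta_0 + \sum_{j \in A(i)} w_{ij}\, u_j + e_i,
\qquad \sum_{j \in A(i)} w_{ij} = 1,
\qquad u_j \sim N(0, \sigma_u^2), \quad e_i \sim N(0, \sigma_e^2)
```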

  8. Another serious problem
     • There are, for education, 454 applications and 989 applicants, and in general there are more applicants than applications.
     • This means that we cannot use the MM model to score applicants: non-identifiability.
     • However, there are only 98 HEIs, so we can fit a model that identifies the HEI only (aggregating all applicants for one HEI within an application; this will lead to some overestimation of the separation of HEIs).
     • This provides HEI/department scores.
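
One way to see the counting argument is to treat each applicant's quality as an unknown in a linear system with one equation per application. This simplifies the random-effects model into a sketch; the toy weights are hypothetical. The rank of the weight matrix cannot exceed the number of applications, so with more applicants than applications the individual qualities are not uniquely determined.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix (list of lists) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry here.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical toy data: 3 applications (rows), 5 applicants (columns).
# Entries are membership weights; each row sums to 1 (PI weighted more).
weights = [
    [Fraction(2, 3), Fraction(1, 3), 0, 0, 0],
    [0, Fraction(1, 2), Fraction(1, 2), 0, 0],
    [0, 0, 0, Fraction(2, 3), Fraction(1, 3)],
]
n_applications, n_applicants = len(weights), len(weights[0])
rank = matrix_rank(weights)
print(f"rank {rank} from {n_applications} applications, {n_applicants} unknowns")
```

Aggregating applicants to HEIs reduces the number of unknowns (98 HEIs against 454 applications for education), which is why the HEI-level model on this slide can be fitted.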

  9. Results
     Note that the HEI variance is now about half what we saw before.

  10. Caterpillar plot
      Note how all confidence intervals overlap zero, so no separation from the overall mean is possible.
      Also, of the four highest in the ‘naïve’ analysis, only one is in the four highest here.
      Similar result if fellowships are excluded.
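
The reading of the caterpillar plot can be expressed as a simple check: an HEI is separated from the overall mean only if its interval excludes zero. The sketch below assumes normal-theory 95% intervals (estimate ± 1.96 × standard error); the residuals and standard errors are invented for illustration and are not the slide's values.

```python
def ci95(estimate, se):
    """Normal-theory 95% interval: estimate +/- 1.96 standard errors."""
    half_width = 1.96 * se
    return estimate - half_width, estimate + half_width

def overlaps_zero(estimate, se):
    """True when the 95% interval contains zero (the overall mean)."""
    lower, upper = ci95(estimate, se)
    return lower <= 0.0 <= upper

# Hypothetical HEI residuals as (estimate, standard error) pairs.
hypothetical_residuals = {
    "HEI 1": (0.40, 0.30),
    "HEI 2": (-0.25, 0.28),
    "HEI 3": (0.10, 0.35),
}

separated = [name for name, (est, se) in hypothetical_residuals.items()
             if not overlaps_zero(est, se)]
print(separated)  # an empty list means no HEI is separated from the mean
```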

  11. It’s even more complicated
      • So far all applicants on an application have been assigned to the principal discipline.
      • We need to assign them to their actual discipline/department, and this implies we should carry out a joint analysis of all applications.
      • Again, there are 2698 applications but only 1698 departments.
      • So we have an MM model and we estimate scores for each department.

  12. Results
      The between-department variance is now larger (19%).
      Only 0.5% of departments have CIs overlapping the mean.
      Including the principal discipline in the model indicates (moderate) discipline differences in award grading (see below).

  13. One hundred lowest and highest ranked residuals for multiple membership model using all departments, with 95% confidence intervals.

  14. MM model with selected principal disciplines (>100 applications)

      Parameter                           Estimate   Standard error
      Intercept (Econ)                    7.17       0.11
      Management                          -0.69      0.17
      Social Policy                       -0.54      0.20
      Education                           -0.24      0.11
      Sociology                           0.06       0.15
      Human Geog                          0.10       0.18
      Psychology                          0.16       0.13
      Level 2 variance (HEI/department)   0.45       0.11
      Level 1 variance (Application)      2.37       0.09
      VPC                                 16.0%
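
The VPC on this slide follows from the two variance estimates, taking VPC as the level 2 variance divided by the total variance (the usual definition, assumed here). The sketch also computes estimate/standard-error ratios for the discipline terms, read as contrasts with Economics, the reference category; that reading is an assumption based on the "Intercept (Econ)" label.

```python
# Variance estimates copied from the slide.
level2_variance = 0.45  # HEI/department
level1_variance = 2.37  # application

# Variance partition coefficient: share of total variance at level 2.
vpc = level2_variance / (level2_variance + level1_variance)
print(f"VPC = {vpc:.1%}")  # prints "VPC = 16.0%"

# Discipline contrasts (estimate, standard error) copied from the slide.
effects = {
    "Management": (-0.69, 0.17),
    "Social Policy": (-0.54, 0.20),
    "Education": (-0.24, 0.11),
    "Sociology": (0.06, 0.15),
    "Human Geog": (0.10, 0.18),
    "Psychology": (0.16, 0.13),
}
z_ratios = {name: est / se for name, (est, se) in effects.items()}
for name, z in z_ratios.items():
    print(f"{name}: estimate/SE = {z:.2f}")
```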

  15. Using the results
      • Given the uncertainty, how useful are they?
      • Can they be combined (formally) with citations to provide greater precision?
      • The technical limitations of the analyses are likely to apply to citation analyses also.
      • E.g. analysis of the NAS 2001 database shows 2,600 papers with 13,000 unique authors (Borner et al., 2004).
      • What are the side effects? Perverse incentives.

  16. Perverse incentives
      • All high-stakes performance monitoring systems encourage ‘gaming’. Some possibilities:
      • Large numbers of co-applicants squeezed into applications
      • Discouraging of cross-disciplinary applications
      • HEI behaviour would change over time, with a destabilising and distorting effect.
      • Encouragement of many small, short-term grants rather than fewer large, long-term ones
      • Distortion of the behaviour of referees and board members (how?)

  17. Comparisons with RAE 2008 scores
      • Results for Economics and Education:
      • Simple (4,3,2,1,0) RAE scoring system
      • Results are insensitive to other scorings
      • Dept. results (residuals) from ESRC analysis (weighted) averaged to RAE HEI categories
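
One reading of the simple (4,3,2,1,0) scoring is sketched below. It assumes each RAE 2008 quality profile (percentages at 4*, 3*, 2*, 1* and unclassified) is collapsed to a weighted mean; the slides do not spell this out. The profile is made up and the function name `rae_score` is hypothetical.

```python
RAE_WEIGHTS = (4, 3, 2, 1, 0)  # weights for 4*, 3*, 2*, 1*, unclassified

def rae_score(profile_percent, weights=RAE_WEIGHTS):
    """Weighted mean grade of a quality profile given as percentages."""
    if len(profile_percent) != len(weights):
        raise ValueError("profile and weights must have the same length")
    total = sum(profile_percent)
    return sum(w * p for w, p in zip(weights, profile_percent)) / total

# Hypothetical profile: 20% at 4*, 40% at 3*, 30% at 2*, 10% at 1*, 0% unclassified.
hypothetical_profile = (20, 40, 30, 10, 0)
score = rae_score(hypothetical_profile)
print(score)
```

Passing a different `weights` tuple is a quick way to try alternative scorings of the same profiles.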

  18. Correlations between RAE and ESRC scores – selected disciplines

  19. Economics
      • 27 HEIs. Correlation = 0.50 (P < 0.01).
      • The highest 7 RAE scores (from the top) are: LSE, UCL, Warwick, Oxford, Essex, Nottingham, Bristol.

  20. Economics RAE ranks

  21. Education
      • 37 HEIs. Correlation = 0.30 (P = 0.07).
      • The top 7 are: IOE = Oxford, Cambridge = Kings, Bristol = Leeds, Exeter.
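
The reported P-values for Economics (r = 0.50, 27 HEIs) and Education (r = 0.30, 37 HEIs) can be sanity-checked. The sketch assumes a two-sided t-test on a Pearson correlation with n - 2 degrees of freedom; the slides do not say which test was used. The tail probability is obtained by numerically integrating the t density, so only the standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def correlation_test(r, n, steps=20000):
    """Return (t statistic, two-sided P) for a correlation r from n pairs."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the integral of the density over [0, t] (steps is even).
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3
    return t, 2 * (0.5 - central)

t_econ, p_econ = correlation_test(0.50, 27)  # Economics
t_educ, p_educ = correlation_test(0.30, 37)  # Education
print(f"Economics: t = {t_econ:.2f}, P = {p_econ:.3f}")
print(f"Education: t = {t_educ:.2f}, P = {p_educ:.3f}")
```

Under these assumptions the results agree with the slides: P falls below 0.01 for Economics and between 0.05 and 0.10 for Education.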

  22. Education

  23. What next?
      • Incorporation of other research councils in a combined analysis
      • Include citation data in a combined model:
      • In the REF it can be argued that an analysis at least as complex as the present one is unavoidable for validity.
      • Using citations encounters the same issue of more applicants than papers/books.
