The future of the British RAE: The REF (Research Excellence Framework)
Jonathan Adams
Research Assessment Exercise - timeline
• 1980s - policy on concentration and selectivity
• 1986 - 1st Research Selectivity Exercise
• 1989 - modified and formalised as the RAE
• 1992 - polytechnics access research funding, enter a streamlined RAE
• 1996 and 2001 - further cycles, higher quality thresholds for funding
• 2008 - new 'Roberts' profiling format
The shift to metrics
• Evolution
  • RAE = peer review of an evidence portfolio, including data on outputs, training and grant funding
  • RAE2008 profiling adds emphasis to the data
• Discontinuity
  • The Treasury's 2007 announcement was disruptive, from many perspectives
• Compromise
  • HEFCE consultations shifted emphasis away from gross simplification, and restored peer review
Research assessment must support the UK's enhanced international research status
• Is the assessment dividend beginning to plateau?
• Has the RAE delivered all it can?
If there is a shift to ‘metrics’, then disproportionate change should be avoided
Research performance - indicators, not metrics
[Diagram: the research 'black box'. What we want to know is research quality; what we have to use are inputs over time (funding, numbers) and outputs over time (publications)]
How can we judge possible 'metrics'?
• Relevant and appropriate
  • Are metrics correlated with other performance estimates?
  • Do metrics really distinguish 'excellence' as we see it?
  • Are these the metrics the researchers would use?
• Cost effective
  • Data accessibility, coverage, cost and validation
• Transparent, equitable and stable
  • Is it clear what the metrics do?
  • Are all institutions, staff and subjects treated equitably?
  • How do people respond, and can they manipulate metrics?
• "Once an indicator is made a target for policy, it starts to lose the information content that initially qualified it to play such a role"
Three proposed data components
• Research funding
• Research training
• Research output - the key quality measure
• All have multiple components
• PLUS peer review
HEFCE favours bibliometrics: impact (1996-2000) is related to RAE2001 grade (data for UoA14 Biology)
Impact index is coherent across UK grade levels (data for core science disciplines, grade at RAE1996)
HEFCE favours bibliometrics: impact (1996-2000) is related to RAE2001 grade (data for UoA14 Biology)
• But the residual variance is very great
What is the right impact score?
• Correct counts
  • 25% of cites are to non-SCI outputs
• Proliferating versions
  • How do you collate them?
• Collaboration vs fractional citations
  • Fractional citation counts would work against trends and policy
• Self-citation - does it matter?
  • It is part of the sociology of research
• Normalisation strategies
  • Clustering into subject groups
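To illustrate why the counting rule matters, here is a minimal sketch (Python; the papers and institution names are invented for illustration) comparing whole counting with fractional counting, where each co-authoring institution receives 1/n credit for an n-institution paper:

```python
from collections import defaultdict

def count_papers(papers, fractional=False):
    """Credit each institution for its papers.

    papers: list of lists of institution names (one list per paper).
    Whole counting gives every listed institution credit 1;
    fractional counting shares 1 credit equally among them.
    """
    credit = defaultdict(float)
    for institutions in papers:
        share = 1.0 / len(institutions) if fractional else 1.0
        for inst in set(institutions):
            credit[inst] += share
    return dict(credit)

# Invented example: three papers, two of them collaborative.
papers = [
    ["UCL"],
    ["UCL", "Oxford"],
    ["UCL", "Oxford", "Auckland"],
]
whole = count_papers(papers)             # UCL gets 3 full credits
fractional = count_papers(papers, True)  # UCL gets 1 + 1/2 + 1/3
```

Under fractional counting the total credit always sums to the number of papers, but collaborative work earns each partner less, which is why the slide notes it would work against collaboration trends and policy.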
TOTAL INSTITUTIONAL OUTPUT
[Diagram: total institutional output divides into unpublished or client-published reports etc., non-print outputs, and publications]
INSTITUTIONAL PUBLICATIONS
[Diagram: publications divide into books and chapters, conference proceedings, and journal articles; recent articles will be in WoS within 2-3 months]
INSTITUTIONAL PUBLICATIONS
[Diagram: journal articles split between journals covered by Thomson WoS and/or SCOPUS, and articles in journals not covered (or not covered at the time of publication)]
INSTITUTIONAL PUBLICATIONS - timeline
[Diagram, restricted to journals covered by Thomson WoS and/or SCOPUS, spanning the census period from 2001 to the census date in 2007:
• All papers with an institutional address published by all staff and students employed or in training during 2001-2007
• Papers with an institutional address published by staff who left or retired before the census date (leavers)
• Papers without that institutional address published by staff recruited during 2001-2007 (recruits)
Counting papers by address captures papers published during the census period by staff while at the institution; counting papers by author captures papers published during 2001-2007 by staff present at the census date]
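The address-based and author-based counting rules can be sketched as follows (Python; the record fields, staff names and institution codes are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class Paper:
    author: str
    institution: str  # the address printed on the paper
    year: int

# Invented records for one unit at institution "UoX".
papers = [
    Paper("Smith", "UoX", 2003),  # Smith left UoX before the census date
    Paper("Jones", "UoX", 2006),  # Jones still at UoX at the census date
    Paper("Patel", "UoY", 2004),  # Patel recruited to UoX during the period
]
staff_at_census = {"Jones", "Patel"}  # UoX staff on the 2007 census date

# Papers by address: published from UoX during the census period,
# regardless of whether the author stayed.
by_address = [p for p in papers
              if p.institution == "UoX" and 2001 <= p.year <= 2007]

# Papers by author: published during the period by staff present at
# the census date, regardless of the address on the paper.
by_author = [p for p in papers
             if p.author in staff_at_census and 2001 <= p.year <= 2007]
```

The two rules select overlapping but different sets of papers: leavers appear only in the address-based count, recruits only in the author-based count.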
Quality differentiation: do you assess total activity or selected papers? (data for UoA18 Chemistry)
The average does not describe the profile
[Chart: two units in the same field differ markedly in average normalised citation impact (2.39 vs 1.86) because of an exceptionally high outlier in one group, but the groups have similar profiles]
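The point can be made with a quick numerical sketch (Python; the impact scores are invented, not the units' actual data): one outlier separates two units' averages while leaving their profiles almost identical.

```python
from statistics import mean, median

# Invented normalised-impact scores for two units in the same field.
unit_a = [0.5, 0.8, 1.0, 1.2, 1.5, 9.3]  # one exceptional outlier
unit_b = [0.5, 0.8, 1.0, 1.2, 1.5, 2.2]

# The averages differ markedly because of the single outlier...
avg_a, avg_b = mean(unit_a), mean(unit_b)

# ...but a profile-based summary such as the median barely moves.
med_a, med_b = median(unit_a), median(unit_b)
```

Here the averages differ by more than one whole unit of normalised impact while the medians are identical, which is why the slides argue for reporting profiles rather than averages.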
Distribution of data values - income
[Chart: distribution of income values from minimum to maximum]
Distribution of data values - impact
• The variables for which we have data are skewed and therefore difficult to picture in a simple way
Simplifying the data picture
• Scale data relative to a benchmark, then categorise
• Could do this for any data set
• All journal articles
  • Uncited articles (take out the zeroes)
  • Cited articles
    • Cited less often than benchmark
    • Cited more often than benchmark
      • Cited more often, but less than twice as often
      • Cited more than twice as often
        • Cited less than four times as often
        • Cited more than four times as often
Categorising the impact data
• This grouping is the equivalent of a log2 transformation
• There is no place for zero values on a log scale
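The categorisation above can be sketched as follows (Python; the bin labels are my own shorthand for the slides' groups, and the benchmark value is invented - in practice it would be the field's expected citation rate):

```python
def impact_category(cites, benchmark):
    """Place a paper's citation count into log2 bins of the
    rebased impact cites/benchmark. Zero-cited papers get their
    own bin, since zero has no place on a log scale."""
    if cites == 0:
        return "uncited"
    rbi = cites / benchmark
    if rbi < 1:
        return "below benchmark"
    if rbi < 2:
        return "1-2x benchmark"
    if rbi < 4:
        return "2-4x benchmark"
    return ">4x benchmark"

# Invented example: field benchmark of 8 cites per paper.
cats = [impact_category(c, 8.0) for c in (0, 3, 9, 20, 40)]
```

The bin edges at 1x, 2x and 4x the benchmark are successive doublings, which is what makes the grouping equivalent to a log2 transformation of the rebased impact.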
UK ten-year profile - 680,000 papers
[Chart: distribution of rebased impact; average RBI = 1.24; the overall mode, the mode of cited papers, and the median are marked. Where is the threshold of excellence?]
Profiles are informative and work well across institutions and subjects
HEIs - 10-year totals, smoothed
• Absolute volume would add a further element for comparisons
Normalisation strategy will affect the outcome (data for UoA13 Psychology)
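To see why the choice of baseline matters, here is a minimal sketch (Python; the citation counts and baselines are invented for illustration) normalising the same paper against two different baselines - the average for its journal versus the average for its field:

```python
def rebased_impact(cites, baseline):
    """Rebased impact: citations divided by the expected
    citation rate of the chosen baseline."""
    return cites / baseline

# Invented example: a paper with 12 cites, published in a
# high-impact journal (journal average 15 cites per paper)
# within a field whose average is 6 cites per paper.
paper_cites = 12
journal_avg, field_avg = 15.0, 6.0

by_journal = rebased_impact(paper_cites, journal_avg)  # below par
by_field = rebased_impact(paper_cites, field_avg)      # well above par
```

The same paper looks below average when normalised against its journal but twice the average when normalised against its field, so the normalisation strategy directly shapes how a unit's profile appears.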
Subject clustering needs to fit UK research
[Dendrogram: similarity in the frequency with which journals were submitted to RAE1996, spanning all units of assessment, from clinical, laboratory and biomedical sciences through the physical sciences and engineering to the social sciences and the arts and humanities]
How should we map data to disciplines? i.e. what is Chemistry?
[Diagram: Thomson journal categories]
How well do metrics respond to variation?
• Subject differences
  • Can we accept differences in criteria and balance between clusters?
  • What about divergence within clusters?
  • How do metrics support the growth of interdisciplinarity?
  • How can emerging (marginal?) research groups be recognised?
• Differences in mode
  • Where is the balance between basic and applied research?
• Differences in people
  • Career breaks, career development
How well do metrics represent different HEIs?
Output coverage by articles on Thomson Reuters' databases
What will it cost?
• Data costs
  • Core data - how much, from whom?
  • Data cleaning and validation - pilot studies are elucidating this, and the task is big
• Requirements on institutions
  • Pilot studies will elucidate this
• System development
• System maintenance
  • Will it cover institutional quality assurance?
Other issues
• Census period
  • What about synchrony and sequence?
• Weighting indicators
  • ERA will weight research training at '0'
  • Need to weight within types as well as between them
• Interface between quantitative (indicators) and qualitative (peer review)
  • Role of panel members
  • Risk of mismatch
Do outputs hang together with income and training? We can tell you …
"You are the REF" - check it out now at RAE2008.com
How can we judge possible metrics?
• Relevant and appropriate - YES
  • Technical 'correctness' of metrics is not a problem, but there is a lot of work to do in refining and comparing options
• Cost - MAYBE
  • Data accessibility is not a problem
  • But we have yet to scope full system requirements
• So is there a problem?
  • Are all subjects, HEIs, staff and modes treated equitably?
  • What will 50,000 intelligent people start to do?
  • Goodhart's Law - for how long will the metrics track excellence?
  • Researchers must decide, not metricians (RMM, 1997)
  • The devil is in the detail: get involved
REF pilot projects
• 20+ institutions (July '08)
• Collect and collate databases, reconciling authors to staff (Oct '08)
• Compare Thomson and Scopus coverage
• Collate and normalise citation counts (Dec '08)
• Run evaluations of alternative methodologies
• Disseminate outcomes and consult (Mar '09)
Over 8,000 people participated in recent PBRF rounds (compared with 50,000 in the RAE). Thomson recorded fewer than 5,000 New Zealand articles per year recently (compared with 100,000 for the UK). That is less than one article per NZ researcher per year.
Implications for Aotearoa New Zealand
• Relative data coverage
• Balance of regional journals
• 'International' = trans-Atlantic
• The relevance of citations
• Scale factors and relative load
• Fixed costs
• Community size and anonymity
• Compatibility of stakeholder and researcher views on assessment outcomes