MEASURING COMMUNICATION IN SCIENCE: OPPORTUNITIES AND LIMITATIONS OF BIBLIOMETRIC METHODS

Wolfgang Glänzel and Koenraad Debackere, SooS, Leuven, Belgium

STRUCTURE OF THE PRESENTATION
• Introduction
• Structure of Bibliometrics
• Data sources of bibliometric research and technology
• Elements, units and measures of bibliometric research
• Citations and Self-citations
• “Against absolute methods”
• Journal Impact Factor
• Distorted behaviour based on policy use and misuse of bibliometric data
• Conclusions
Measuring communication in science

INTRODUCTION

Introduction
• What is bibliometrics?
• The terms bibliometrics and scientometrics were introduced almost simultaneously in 1969, by Pritchard and by Nalimov & Mulchenko respectively.
• According to Pritchard, bibliometrics is “the application of mathematical and statistical methods to books and other media of communication”.

Introduction
• Nalimov and Mulchenko defined scientometrics as “the application of those quantitative methods which are dealing with the analysis of science viewed as an information process”.
• The two terms have become almost synonymous. Nowadays, the field of informetrics (Gorkova, 1988) stands for a more general subfield of information science dealing with the mathematical-statistical analysis of communication processes in science.

Introduction
• Present-day use of bibliometrics
• Bibliometrics has evolved into a standard tool of science policy and research management.
• A vast array of indicators to measure and to map research activity and its progress is available.
• Science indicators relying on comprehensive publication and citation statistics, and other, more sophisticated bibliometric techniques, are used in science policy and research management.
• There is a growing, often controversial, policy interest in using bibliometric techniques to measure research productivity and efficiency.

Introduction
• Common misbeliefs about bibliometrics
• The main task of bibliometrics should be the expeditious issuing of “prompt” and “comprehensible” indicators for science policy and research management.
• Research on bibliometric methodology is unnecessary; instead, bibliometricians should elaborate guidelines explaining the use of their indicators.
• Bibliometrics might be reduced to simple counting activities in order to replace or supplement qualitative assessment by quantitative indicators and to set publication output off against funding.

Introduction
• Facts about bibliometrics
• Bibliometrics is a powerful, multifaceted endeavour encompassing subareas such as structural, dynamic and evaluative scientometrics.
• Structural scientometrics came up with results like the re-mapping of the epistemological structure of science.
• Dynamic scientometrics constructed sophisticated models of scientific growth, obsolescence, citation processes, etc.
• Evaluative scientometrics developed arrays of indicators to be used to characterise research performance at different levels of aggregation.

Introduction
• What does bibliometrics deal with, and what can it not be held responsible for?
• Bibliometrics can be used to develop and provide tools to be applied to research evaluation, but it is not designed to evaluate research results.
• Bibliometrics does not aim at replacing qualitative methods by quantitative approaches.
• Consequently, bibliometrics is not designed to correct, let alone replace, peer review or evaluation by experts; qualitative and quantitative methods in science studies should complement each other.

1. STRUCTURE OF BIBLIOMETRICS

1. Structure of Bibliometrics
• The three “components” of present-day bibliometrics according to its three main target groups
• Bibliometrics for bibliometricians (Methodology)
  • This is the domain of bibliometric “basic research”.
• Bibliometrics for scientific disciplines (Scientific information)
  • A large but also the most diverse interest group in bibliometrics. Because the scientists’ orientation is primarily scientific, their interests are strongly related to their speciality. This group also shares a borderland with quantitative aspects of information retrieval.
• Bibliometrics for science policy and management (Science policy)
  • At present the most important topic in the field. Here the national, regional and institutional structures of science and their comparative presentation are in the foreground.

1. Structure of Bibliometrics
Links of bibliometrics with related research fields and application services
[Figure: diagram placing scientometrics on an axis from basic to applied. Recoverable labels: services for science policy, scientific information, research management and librarianship; research in economics, sociology of science, history of science, library and information science, life sciences, informetrics, mathematics/physics and webometrics.]

2. DATA SOURCES OF BIBLIOMETRIC RESEARCH AND TECHNOLOGY

2. Sources of Bibliometrics
• Data sources of bibliometric research and technology
• Data sources of bibliometrics are bibliographies and bibliographic databases. Large-scale analyses can only be based on bibliographic databases.
• Prominent specialised databases are, e.g., Medline, Chemical Abstracts, INSPEC and Mathematical Reviews in the sciences and, e.g., Econlit, Sociological Abstracts and Humanities Abstracts in the social sciences and humanities.
  Disadvantage: lack of reference literature, incomplete address recording.
• The databases of the Institute for Scientific Information (Thomson ISI), above all the Science Citation Index (Expanded), have become the most generally accepted source of bibliometrics.

2. Sources of Bibliometrics
• Although there are several objections to the journal coverage and the data processing policy of the ISI in preparing the SCI, its unique features are basic requirements of bibliometric technology. These features include:
  • Multidisciplinarity
  • Selectiveness
  • Completeness of addresses
  • Full coverage
  • Bibliographical references
  Disadvantage: no individual subject classification for papers available.

2. Sources of Bibliometrics
Proceedings℠ is available in two editions (Science & Technology and Social Sciences & Humanities) covering about 2,000,000 papers from over 60,000 conferences since 1990.
Non-serial literature (except for proceedings) such as monographs and books is not indexed in these ISI databases.
Since non-serial literature is an important conveyor of information in the social sciences and humanities, journal-based data sources are accepted by scientists in these fields only with certain reservations.

3. ELEMENTS, UNITS AND MEASURES OF BIBLIOMETRIC RESEARCH

3. Elements of Bibliometrics
Elements, units and measures of bibliometric research
Basic units in bibliometrics are usually not further subdivided. These form the elements of bibliometric analyses. Elements are, e.g., publications, (co‑)authors, references and citations.
Publications can be assigned to the journals in which they appeared and, through the corporate addresses of their authors, to institutions or countries; references and citations can be assigned to subject categories; and so on.
Units are specific sets of elements, e.g., journals, subject categories, institutions, regions and countries, to which elements can be assigned, though not necessarily uniquely.
A clear definition of the assignment of elements to units (in mathematical parlance, of the mappings between them) allows the application of mathematical models.
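
The assignment of elements to units can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper names and country codes are hypothetical toy data, and the helper `assign_to_units` is not taken from the presentation. It shows why the mapping is not necessarily unique: a co-authored paper is assigned to every country appearing in its addresses.

```python
from collections import defaultdict

# Hypothetical toy data: each publication (element) lists the countries
# taken from the corporate addresses of its authors.
publications = {
    "paper_A": ["BE"],
    "paper_B": ["BE", "HU"],
    "paper_C": ["HU", "DE", "BE"],
}


def assign_to_units(pubs):
    """Map every unit (here: country) to the set of publications assigned to it."""
    units = defaultdict(set)
    for paper, countries in pubs.items():
        for country in set(countries):
            # A paper with several countries is assigned to each of them.
            units[country].add(paper)
    return dict(units)


units = assign_to_units(publications)
counts = {country: len(papers) for country, papers in units.items()}
```

Because the assignment is not unique, the unit counts here add up to six although only three papers exist, so sums over units cannot be read as totals of elements.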

3. Elements of Bibliometrics
Publication activity and authorship
Publication activity is influenced by several factors. At the micro level, we can distinguish the following four factors:
• the subject matter
• the author’s age
• the author’s social status
• the observation period
The publication activity in theoretical fields (e.g., mathematics) and in engineering is lower than in experimental fields or in the life sciences.
Cross-field comparison – without appropriate normalisation – would not be valid. This applies above all to comparative analyses at the meso level (universities and departments).

3. Elements of Bibliometrics
Can scientific collaboration be measured through co-authorship?
• Laudel (2001) (micro study): A large share of the persons involved in the preparation of a scientific paper does not appear either as co-author or as sub-author. Katz & Martin (1997) argue that co-authorship is no more than a partial indicator of collaboration.
• Intensifying collaboration, however, goes with growing co-authorship (Patel, 1973). There is a positive correlation between collaboration and co-authorship at the level of individual actors, too.
• The phenomenon described by Laudel and by Katz & Martin applies mainly to intramural collaboration. Extramural collaboration, above all international collaboration, is usually well acknowledged.

4. CITATIONS AND SELF-CITATIONS

4. Citations and Self-citations
• The notion of citations in information science and bibliometrics
• Citations have become a widely used measure of the impact of scientific publications.
• Cozzens: “Citation is only secondarily a reward system. Primarily, it is rhetorical – part of persuasively arguing for the knowledge claims of the citing document.”
• L. C. Smith: “citations are signposts left behind after information has been utilized”.
• Cronin: Citations are “frozen footprints in the landscape of scholarly achievement … which bear witness to the passage of ideas”.

4. Citations and Self-citations
Glänzel and Schoepflin: Citations are “one important form of use of scientific information within the framework of documented science communication”. Although citations cannot describe the totality of the reception process, they give “a formalised account of the information use and can be taken as a strong indicator of reception at this level.”
Westney: “Despite its flaws, citation analysis has demonstrated its reliability and usefulness as a tool for ranking and evaluating scholars and their publications. No other methodology permits such precise identification of the individuals who have influenced thought, theory, and practice in world science and technology.”
Garfield and Weinstock have listed 15 different reasons for giving citations to others’ work.

4. Citations and Self-citations
The process of re-interpreting the notion of citation and its consequences
[Diagram]
• Interpretation (bibliometrics/information science): citation as a signpost of information use
  • uncitedness: unused information
  • frequent citation: good reception
  • self-citation: part of scientific communication
• Re-interpretation (research evaluation/science policy): citation as a rewarding system and quality measure
  • uncitedness: low quality
  • frequent citation: high quality
  • self-citation: manipulation of impact
• Repercussion: possible distortion of citation behaviour

5. “AGAINST ABSOLUTE METHODS”

5. “Against absolute methods”
• Factors influencing citation impact
• Citation impact is mainly influenced by the following five factors which, analogously to the case of publication activity, are practically inseparable at higher levels of aggregation:
  • the subject matter and, within a subject, the “level of abstraction”
  • the paper’s age
  • the paper’s “social status” (through authors and journal)
  • the document type
  • the observation period (“citation window”)

5. “Against absolute methods”
Complexity of influences and biases in calculating citation impact measures
[Figure: Mean citation rate of two journals as a function of time (source year: 1980)]

5. “Against absolute methods”
Impact of different document types
[Figure: “3-year impact measure” for selected journals by document type (source: 1995/96)]

5. “Against absolute methods”
• Influence of subject characteristics
• Mean citation rate of subfields (source: 1996, citation window: 1996-1998)
    Mechanical, civil and other engineering   1.12
    Mathematics                               1.46
    Analytical chemistry                      3.00
    Solid state physics                       3.06
    Neurosciences                             4.54
• Citation measures are thus – without normalisation – not appropriate for cross-field comparisons.

5. “Against absolute methods”  The only possible way to compensate for the subject-specific characteristics is an appropriate normalisationand the application of exactly the same underlying publication period and citation windowto all units under study. Measuring communication in science
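
A small worked example may make the need for normalisation concrete. The sketch below uses the subfield mean citation rates quoted in this section (source year 1996, citation window 1996-1998); the two papers with three citations each are hypothetical, and dividing by the subfield mean is only the simplest form of normalisation, assumed here for illustration.

```python
# Mean citation rates of subfields as quoted in this section
# (source: 1996, citation window: 1996-1998).
SUBFIELD_MEAN = {
    "Mechanical, civil and other engineering": 1.12,
    "Mathematics": 1.46,
    "Analytical chemistry": 3.00,
    "Solid state physics": 3.06,
    "Neurosciences": 4.54,
}


def relative_to_subfield(citations, subfield):
    """Citations of one paper divided by the mean citation rate of its subfield."""
    return citations / SUBFIELD_MEAN[subfield]


# Two hypothetical papers, each cited three times within the same window.
maths = relative_to_subfield(3, "Mathematics")    # above the subfield mean
neuro = relative_to_subfield(3, "Neurosciences")  # below the subfield mean
```

The same absolute count of three citations is roughly twice the subfield mean in mathematics but only about two thirds of it in the neurosciences, which is why absolute counts mislead in cross-field comparisons.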

5. “Against absolute methods”
Citation indicators can be normalised using a reference standard based on the journals or subjects in which the papers under study have been published. Problem: subject assignment is not unique.
The Relative Citation Rate (RCR), introduced by Schubert et al. in 1983, gauges observed citation rates of the papers against the standards set by the specific journals. It has since been widely applied to comparative macro and meso studies. A version of this relative measure, namely CPP/JCSm, where JCSm denotes the mean Journal Citation Score, is used at CWTS.
The Normalised Mean Citation Rate (NMCR), introduced by Braun and Glänzel in 1993, normalises observed citation rates by the weighted average of the mean citation rates of subfields. A similar measure (CPP/FCSm, FCSm being the mean Field Citation Score) is used at CWTS.
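
The two relative indicators can be sketched as ratios of means. This is a minimal reading of the descriptions above, not the authors' implementation: the function names, the toy citation counts and the subfield labels "A" and "B" are hypothetical, each paper is assumed to belong to exactly one subfield (which sidesteps the non-uniqueness problem), and the published definitions may differ in detail.

```python
def relative_citation_rate(observed, journal_expected):
    """Mean observed citation rate divided by the mean journal-based expectation.

    observed[i] is the citation count of paper i; journal_expected[i] is the
    mean citation rate of the journal in which paper i appeared.
    """
    if not observed or len(observed) != len(journal_expected):
        raise ValueError("need at least one paper and one expected value per paper")
    mean_observed = sum(observed) / len(observed)
    mean_expected = sum(journal_expected) / len(journal_expected)
    return mean_observed / mean_expected


def normalised_mean_citation_rate(observed, subfields, subfield_mean):
    """Mean observed citation rate divided by the weighted average of subfield means.

    Each subfield mean is weighted by the number of papers assigned to it.
    """
    if not observed or len(observed) != len(subfields):
        raise ValueError("need at least one paper and one subfield per paper")
    mean_observed = sum(observed) / len(observed)
    weighted_expected = sum(subfield_mean[s] for s in subfields) / len(subfields)
    return mean_observed / weighted_expected


# Hypothetical unit with three papers.
observed = [6, 0, 3]
rcr = relative_citation_rate(observed, [2.0, 1.0, 3.0])
nmcr = normalised_mean_citation_rate(
    observed, ["A", "A", "B"], {"A": 2.0, "B": 5.0}
)
```

With these toy numbers the unit lies above its journal standard (RCR 1.5) but exactly on its subfield standard (NMCR 1.0), illustrating that the choice of reference standard changes the picture.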

6. JOURNAL IMPACT FACTOR

6. Journal Impact Factor
• On the role of the Impact Factor
• The Garfield ISI Impact Factor (IF) represents a paradigm in bibliometric/information science research.
• The IF is used frequently and has obtained a very strong ‘market’ position.
• From the mathematical viewpoint, the IF is a mean value: the arithmetic mean of the citations received in a particular (citing) year by the set of articles published in a particular journal one or two years earlier.
• The Impact Factor has become perhaps the most popular bibliometric product, used both within bibliometrics and outside the scientific community.
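
The arithmetic-mean definition given above can be written down directly. The following is a sketch under stated assumptions: the function name, the dictionary layout and the journal figures are hypothetical, and the sketch deliberately ignores the question of which document types are counted.

```python
def impact_factor(citing_year, citations_by_pub_year, items_by_pub_year):
    """Mean number of citations received in `citing_year` by items that the
    journal published one or two years earlier.

    citations_by_pub_year[y]: citations received in `citing_year` by items published in year y
    items_by_pub_year[y]:     number of items the journal published in year y
    """
    window = (citing_year - 1, citing_year - 2)
    citations = sum(citations_by_pub_year.get(year, 0) for year in window)
    items = sum(items_by_pub_year.get(year, 0) for year in window)
    if items == 0:
        raise ValueError("the journal published no items in the two preceding years")
    return citations / items


# Hypothetical journal: citations received in 2004, by publication year.
journal_if = impact_factor(
    2004,
    citations_by_pub_year={2003: 120, 2002: 180},
    items_by_pub_year={2003: 100, 2002: 100},
)
```

Here 300 citations to 200 items give a value of 1.5. Being a mean, the figure says nothing about how the citations are distributed over the individual articles.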

6. Journal Impact Factor
• Problems in using the ISI Impact Factors
• The strength of the Impact Factor lies first of all in its comprehensibility, stability and seeming reproducibility, but some flaws have provoked critical and controversial discussions about its correctness and use.
• The above-mentioned popularity also involves dangers. The use of impact factors ranges from well-documented and methodically sound applications to rather ‘grey’ applications as background information for scientific journalism or in the context of refereeing procedures. Impact factors are sometimes even used as substitutes for missing citation data.
• Although it is difficult to define the concept of (journal) impact theoretically, there is a widespread belief that the ISI Impact Factor is affected or ‘disturbed’ by factors that have nothing to do with (journal) impact.

6. Journal Impact Factor
• Being a statistical mean, the IF should be size-independent. Large journals might, however, often have a higher visibility.
• The robustness, comprehensibility and methodological reproducibility of the ISI journal Impact Factor contrast with its methodological shortcomings and its technical irreproducibility.
• It became quite tempting to apply the impact factor as a universal bibliometric measure. This is certainly one source of possible uninformed use.
• Methodological improvements in combination with complementary measures and appropriate documentation may help to overcome the limitations described above.
• The question of reproducibility can thus be solved, at least partially, for those who have access to the bibliographic databases and the technology to produce journal indicators.

6. Journal Impact Factor
Visibility vs. publication targeting vs. citation impact
Publication in a high-IF journal might guarantee excellent visibility, but does not automatically imply high citation rates.
In several fields, targeting, i.e., reaching the desired audience, is more important than publishing in high-impact journals (e.g., in clinical medicine, mathematics).
The latter observation substantiates that research may have other impact than citations. In order to gain new insight into the utility of biomedical research, Grant Lewison studied citations from clinical guidelines, textbooks, government policy documents, international or national regulations and newspaper articles.
Moreover, publications in technical sciences and clinical medicine might find practical application that cannot be measured through citations.
Citations and, above all, the IF do not measure all aspects of the impact that published research results might have.

6. Journal Impact Factor
The myth of delayed recognition
An often-heard argument on the limitations of citation-based indicators is that important publications are often not cited in the beginning, and only become recognised at a time beyond the standard citation windows used in most bibliometric studies.
Studies by Glänzel et al. and Glänzel & Garfield in 2004 have shown that the chance that a paper, uncited for three to five years after publication, will ever be cited is quite low, even in slowly ageing fields such as mathematics.
The citation impact of papers not cited initially usually remains low even 15 to 20 years later.
The potential number of delayed recognition papers is extremely small. A statistically marginal share of 1.3 per 10,000 papers published in 1980 were “neglected” at first, and then, belatedly, received relatively high citational recognition.

7. DISTORTED BEHAVIOUR BASED ON POLICY USE AND MISUSE OF BIBLIOMETRIC DATA

7. Distorted behaviour
• Distorted behaviour based on policy use and misuse of bibliometric data
• An additional issue concerns the changes, both positive and negative, that the consistent policy use of bibliometric indicators might induce in the publication, citation and collaboration behaviour of scientists.
• Studies on the problem choice behaviour of academic scientists have revealed that both cognitive and social influences determine the manner in which scientists go about choosing the problems they work on (Debackere and Rappa, 1994). Hence the question should be raised to what extent the policy use of bibliometrics might affect this behaviour.

7. Distorted behaviour
The problem of inappropriate use ranges from uninformed use, through the selection and collection of ‘most advantageous’ indicators, to the obvious and deliberate misuse of data.
Uninformed use and misuse are not always beyond the responsibility of bibliometricians. Unfortunately, bibliometricians do not always resist the temptation to follow popular, even populist, trends in order to meet the expectations of their customers.
Clearly, any kind of uninformed use or misuse of bibliometric results involves the danger of bringing bibliometric research itself into disrepute.

7. Distorted behaviour
• Uninformed use
  • incorrect presentation or interpretation of bibliometric indicators, or their use in an inappropriate context, caused by insufficient knowledge of methodology, background and data sources
  • generalisation (induction) of special cases or of results obtained at lower levels of aggregation
• Misuse
  • intentionally incorrect presentation or interpretation of bibliometric indicators, or their deliberate use in an inappropriate context
  • tendentious application of biases
  • tendentious choice of (incompatible) indicators

7. Distorted behaviour
But even correct use might have undesired consequences.
Example: Re-interpreting underlying contexts such as the notion of citation (cf. Section 4) shows author self-citations in an unfavourable light. Authors might thus be urged to avoid self-citations, which is a clear intervention into the mechanism of scientific communication.
Less obvious repercussions might be observed when bibliometric tools are used in decision-making in science policy and research management and the scientific community recognises the feedback in terms of its funding.
Butler (2004) has shown, using the example of Australia, what might happen when funding is linked to publication counts. She found that the publications component of the Composite Index has stimulated increased publication activity in the lower-impact journals.

7. Distorted behaviour
[Figure: Schematic visualisation of the feedback of policy use of bibliometrics on the scientific community]

7. Distorted behaviour
• Possible positive effects
  • Scientists might recognise that scientific collaboration and publishing in high-impact or even top journals pays. Their publication activity might also be stimulated.
• Possible negative effects
  • Exaggerated collaboration, even trends towards hyper-authorship; inflating publication output by splitting up publications into sequences; inflating citation impact through self-citations and the formation of citation cliques, etc.
  • A trend towards replacing quality and recognition by visibility at any price, or towards preferring journals as publication channels in the social sciences and humanities, might also be among these effects.

CONCLUSIONS

Conclusions
• The future will show to what extent these negative effects will become reality. Empirical monitoring and examination of hypothetical biases will be worthwhile.
• Similar trends could already be observed long before the time of bibliometrics: striving after visibility and reputation is part of human nature. Most negative effects will probably be hindered or prevented through natural competition and peer review among researchers.
• The only negative feedback from policy use and misuse of bibliometric data might, in the long run, result in general ‘inflationary values’ as described, e.g., by Cronin (2001) and Persson et al. (2003). Bibliometricians have the tools to normalise and standardise indicators under such conditions, and are thus able to cope with this problem, too.
