The UKSG Usage Factor Project A Progress Report Richard Gedye and John Cox UKSG Conference April 2010, Edinburgh
UKSG Usage Factor Project • Brief background • Issues addressed before data collection and analysis • Collecting and analysing the data • What data we have collected • Methodology • Issues and challenges • Some early recommendations • Anticipated issues that will need to be addressed • Next steps
The challenge……. • ISI's Impact Factor compensates for the fact that larger journals will tend to be cited more than smaller ones • Can we do something similar for usage? • In other words, should we seek to develop a “Usage Factor” as an additional measure of journal quality/value?
For example:
$$\text{Usage Factor} = \frac{\text{Total usage over period } x \text{ of articles published during period } y}{\text{Total articles published during period } y}$$
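As a minimal sketch of the ratio above, the Python below divides total usage by the number of articles. The function name and the figures are hypothetical illustrations, not data from the project.

```python
def usage_factor(total_usage: int, total_articles: int) -> float:
    """Usage over period x of articles published in period y,
    divided by the number of articles published in period y."""
    if total_articles <= 0:
        raise ValueError("no articles published in period y")
    return total_usage / total_articles

# Hypothetical figures: 120 articles published in period y,
# used 30,000 times in total during period x.
print(usage_factor(30_000, 120))  # 250.0
```

Because the result is usage per article, a journal publishing relatively few articles is not penalised for its size.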
Usage factor advantages • Especially helpful for journals and fields not covered by ISI • Especially helpful for journals with high undergraduate or practitioner use • Especially helpful for journals publishing relatively few articles • Data available potentially sooner than with Impact Factors
“Authors select journals that will give their articles prestige and reach. Impact Factor is a widely used surrogate for the former, while perceived circulation and readership reflect the latter. But usage is becoming more important as a measure of reach” Carol Tenopir
Real journal usage data is currently being analysed by John and Laura Cox • Participating publishers:- • American Chemical Society • Emerald • IOP • Nature Publishing • OUP • Sage • Springer
Key data issues we have addressed • Consistency – numerator/denominator • Defining article usage year • Defining article publication date • Different usage patterns by subject
Data issues we have addressed 1. Consistency • Items in numerator must be in denominator • Clear definition of qualifying “items” • Machine recognisable • Unambiguous Solution? All items with a DOI? • This will include items such as editorial board listings, calendars of events, sponsoring society announcements, etc.
Other Possible Solutions • Rejected • Item must have references • Item must not have an empty author field • Item must be more than one page in length • Possible • Cross mapping items against one of the large and inclusive A and I services or citation databases • Examining article DTD tags • Intelligent textmining
Longer-term Solutions • Encourage publishers to:- • Lodge more detailed article metadata with CrossRef • Adopt the NLM DTD, use its article categories element, and make the results harvestable
Data issues we have addressed 2. Article usage year • Inter-journal comparisons can be distorted by different patterns of article publication during the calendar year • Usage in the first calendar “year” could be as little as one month and as much as 12 months Solution • provide data about the first 12, 24, 36 months of usage of articles published in each chosen calendar year rather than calendar year usage
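The difference between calendar-year usage and usage counted from the publication date can be sketched as follows. This is an illustration only: the function names, the dates, and the day-of-month rule for counting whole months are assumptions, not the project's specification.

```python
from datetime import date
from typing import Iterable

def months_since(published: date, used: date) -> int:
    """Whole months elapsed from online publication to a usage event.

    Simplification: compares day-of-month, so month-end dates are approximate.
    """
    months = (used.year - published.year) * 12 + (used.month - published.month)
    if used.day < published.day:
        months -= 1
    return months

def usage_in_first_months(published: date, events: Iterable[date], months: int) -> int:
    """Count usage events in the first `months` months after publication."""
    return sum(1 for e in events if 0 <= months_since(published, e) < months)

# Hypothetical article published online in mid-November 2008.
published = date(2008, 11, 15)
events = [date(2008, 11, 20), date(2008, 12, 31), date(2009, 11, 14),
          date(2009, 11, 15), date(2010, 12, 1)]

for window in (12, 24, 36):
    print(window, usage_in_first_months(published, events, window))

# Calendar-year 2008 usage, for contrast: covers only about six weeks.
print(sum(1 for e in events if e.year == 2008))
```

With these invented dates the calendar-year count sees only the first six weeks of the article's life, while the 12-month window gives every article the same length of exposure regardless of when in the year it appeared.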
Data issues we have addressed 3. Article publication date • Early online version • Final online version • Printed issue publication date • Some early or even “final” versions of articles are published online many months (sometimes years) before the official publication date of the journal issue of which they are nominally a part. Solution • Supply usage data at the article version level, showing usage patterns of different versions separately
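Reporting at the article version level could look roughly like the sketch below, which keeps a separate usage total for each version of each article. The record layout and the version labels are hypothetical and are not taken from the project's report specification.

```python
from collections import defaultdict

# Hypothetical usage records: (article id, version label, usage count)
records = [
    ("a1", "early_online", 40),
    ("a1", "final_online", 100),
    ("a2", "final_online", 60),
    ("a1", "early_online", 10),
]

def usage_by_version(rows):
    """Total usage per article, split by version, so each version's
    usage pattern can be examined separately."""
    totals = defaultdict(lambda: defaultdict(int))
    for article, version, count in rows:
        totals[article][version] += count
    return {article: dict(versions) for article, versions in totals.items()}

print(usage_by_version(records))
```

Keeping versions apart at collection time leaves the choice of which versions to include in a Usage Factor open until the analysis stage.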
Data issues we have addressed 4. Potential differences by subject • Might usage patterns vary between subject areas? • To find out, we needed to identify a third party schema which had classified by subject all journals participating in our project Solution • Use the Dewey Decimal Codes (DDC) which the British Library have assigned to all the journals for which they hold records (>20,000)
With key data issues addressed, we developed a specification for a report via which participating publishers would deliver their usage data for analysis
JUF variables to be tested • All journal content (excluding standing matter) • Articles only: • Version of Record • All versions of the article on the publishers’ platform • Differing publication periods – 1 or 2 years (2006-2009) • Differing usage periods: • Single year of usage from the online publication date • Two years of usage from the online publication date • Single year of usage from a year after the online publication date • Two years of usage from a year after the online publication date • Samples of calendar year usage
JUF variables to be tested – continued… Subject comparisons • Broad subjects: • Physical Sciences • Medicine and Life Sciences • Social Sciences • Humanities • Engineering • Narrow subjects • Business and Management • Clinical Medicine
The calculation
$$\text{Journal Usage Factor} = \frac{\text{Total usage over period } x \text{ of items published online during period } y}{\text{Total items published online during period } y}$$
• ‘x’ is the usage period • ‘y’ is the publication period
Create comparative subject data • JUFs for each journal into seven spreadsheets (one per subject) • All content JUFs • Article only JUFs • VoR only JUFs • All version JUFs
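The JUF variants can be illustrated with a short Python sketch. The `Item` fields, the function signature, and the figures are assumptions made for illustration; the sketch also assumes that the VoR and all-version variants apply to articles only, which is one reading of the variable list.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class Item:
    is_article: bool
    vor_usage: int            # usage of the Version of Record
    other_version_usage: int  # usage of any other versions

def juf(items: Iterable[Item], articles_only: bool = False,
        vor_only: bool = False) -> Optional[float]:
    """Journal Usage Factor for one journal and one choice of x and y.

    The same pool of items feeds numerator and denominator, so every
    item counted for usage is also counted as a published item.
    """
    pool = [i for i in items if i.is_article or not articles_only]
    if not pool:
        return None
    usage = sum(i.vor_usage + (0 if vor_only else i.other_version_usage)
                for i in pool)
    return usage / len(pool)

# Hypothetical journal: two articles and one non-article item.
items = [Item(True, 200, 50), Item(True, 100, 0), Item(False, 10, 0)]

print(juf(items))                                     # all content: 120.0
print(juf(items, articles_only=True))                 # all versions: 175.0
print(juf(items, articles_only=True, vor_only=True))  # VoR only: 150.0
```

Running each variant over every journal in a subject group would yield the per-subject comparison tables described above.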
Determine the best definitions for the calculation • Whether to include non-article content • Whether to include versions of articles other than the VoR • Which definitions of ‘x’ and ‘y’ work best • Whether calendar-year usage produces equally meaningful data • Whether the differences between subjects are significant enough to need different definitions or calculations • What will be easiest for publishers
Issues to address in the next phase of the project • Detecting and deterring gaming • Differences between disciplines and journal types • What about print usage? • What about offline usage? • How to integrate usage data when journal content is hosted on multiple sites • Responding to technological innovations
Responding to technological innovations • Prefetching to local cache (e.g. PubGet, WebFeat) • Need to establish a list of user-agent names • Then ignore prefetch requests and count only those with a “304” response
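A rough sketch of such a filter is shown below. The agent name is a placeholder, since the real list of user-agent names has yet to be established. Counting both 200 and 304 responses for all other agents is an assumption here, not something stated in the slides.

```python
# Placeholder name only: the actual list of prefetching agents is still to be compiled.
PREFETCH_AGENTS = {"ExamplePrefetcher"}

def is_countable(user_agent: str, status: int) -> bool:
    """Decide whether a logged request should count as usage."""
    if any(name in user_agent for name in PREFETCH_AGENTS):
        # The prefetch itself is ignored; a later 304 (Not Modified)
        # suggests the cached copy was actually requested by a user.
        return status == 304
    # Assumption: ordinary requests count on 200 or 304.
    return status in (200, 304)

# Hypothetical log entries: (user agent, HTTP status)
log = [
    ("ExamplePrefetcher/1.0", 200),  # prefetch to cache: ignored
    ("ExamplePrefetcher/1.0", 304),  # later view of cached copy: counted
    ("Mozilla/5.0", 200),            # ordinary request: counted
    ("Mozilla/5.0", 404),            # failed request: ignored
]
print(sum(is_countable(agent, status) for agent, status in log))  # 2
```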
Responding to technological innovations • Bulk downloading to local hard disk (e.g. Quosa, PubGet bulk download plug-in) • If specifically requested (e.g. Quosa), these should ideally be counted but considered separately • We are still considering ways to address the automated downloading of articles to hard disk
Next steps • Submission to UKSG of final report from John and Laura Cox – end of July 2010 • This report will:- • Outline the various metrics assessed • Recommend which of them prove consistent and robust enough to be adopted for scaled up onward monitoring • Suggest any ways in which data providers might amend the way they capture, structure, label, and maintain their data which would make the measurement of Usage Factors:- • Easier • More reliable • Propose ways to audit Usage Factors for accuracy
UKSG Research Committee will consider the report and decide whether it justifies seeking funding for a further (third) phase for the Project
UKSG Usage Factor Phase 3 • Scaled up testing of candidate metric(s) recommended in Cox report • Address outstanding issues revealed during the course of the project so far • In collaboration with data suppliers, develop agreed standards and templates which, going forward, will streamline the process of data collection and analysis • More detailed practical recommendations for a cost-effective infrastructure to manage the Usage Factor process.
UKSG Usage Factor Project • Many thanks to the sponsors of this latest phase:- • GOLD • SILVER • ALPSP • American Chemical Society • STM • Nature Publishing Group • Springer
Thank you for your attention! http://www.uksg.org/usagefactors/