Interoperable Repository Statistics
Les Carr & Tim Brody
University of Southampton
http://irs.eprints.org/
Introduction
• Background to & history of our interest in statistics
• The purpose of the IRS project
• Progress so far
• Request for participation
Topics of Discussion
• Statistics - how to get them
• Statistics - what services we can build with them
• Statistics - what current users want from them now
• Statistics - new measures of impact
  • and significance? and quality?
• Statistics - validation for the academic community
Some History
• 1999 NSF / JISC International Digital Library Project "OpCit"
  • provide an external citation-linking service for LANL (as it then was)
  • ECS, Cornell, Los Alamos
• Supporting study on the use of citations
  • Analysed download patterns from the UK mirror with respect to citation patterns
  • Were downloads influenced by citations?
  • Could citation impact be predicted by downloads?
OpCit Outcome
• Citebase 'OAI' service
  • Used OAI to obtain article metadata (sketched below)
  • Used the local UK mirror's file system to obtain
    • article text (to extract citations)
    • article download stats from web logs
• No standard way to obtain the necessary data.
  • By now most repositories accept web spiders.
  • Still no agreement with arxiv.org to harvest central download data
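As a minimal sketch of the 'OAI' mechanism the slide refers to, an OAI-PMH ListRecords harvest looks roughly like this. Python is used purely for illustration (Citebase itself was not built this way), and the repository base URL is hypothetical:

```python
# Rough sketch of harvesting article metadata over OAI-PMH.
# The base URL below is hypothetical, not a real Citebase source.
import urllib.request
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def harvest(base_url):
    """Yield (OAI identifier, title) pairs from one ListRecords response."""
    url = base_url + "?verb=ListRecords&metadataPrefix=oai_dc"
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)
    for record in tree.iter(OAI + "record"):
        identifier = record.findtext(OAI + "header/" + OAI + "identifier")
        title = record.findtext(".//" + DC + "title")
        yield identifier, title

# Hypothetical endpoint, for illustration only:
for oai_id, title in harvest("http://repository.example.org/cgi/oai2"):
    print(oai_id, title)
```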
Stakeholders
• Authors
  • Encouragement to research
  • Encouragement to archive
• Researchers
  • New Discovery & Navigation methods
  • New Filter mechanisms
Stakeholders (II)
• Repository Administrators
  • Management & maintenance decisions
  • Marketing, feedback to research managers
• Fund holders
  • Assess impact, inform future funding decisions
Internal Deposit Statistics
• Who is depositing what, and how frequently
• Managers can collect information to feed back on effectiveness
• Archives.eprints.org provides simple growth data (a toy tally follows)
• OpenDOAR could collect information on best practice
• Evaluate policy outcomes
  • Are mandates necessary?
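A toy illustration of the kind of deposit-growth tally meant here; the records and names are invented, and a real repository would pull them from its own database:

```python
# Toy deposit-growth tally: who is depositing, and how many deposits
# per month. All data below is made up.
from collections import Counter

deposits = [
    ("smith",  "2006-01"),
    ("jones",  "2006-01"),
    ("smith",  "2006-02"),
    ("harris", "2006-02"),
]

per_month = Counter(month for _, month in deposits)  # growth over time
per_depositor = Counter(who for who, _ in deposits)  # who deposits most

print(per_month)      # Counter({'2006-01': 2, '2006-02': 2})
print(per_depositor)  # Counter({'smith': 2, 'jones': 1, 'harris': 1})
```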
Management Stats (the Easy Way)
• Google Analytics
  • Repository overview
  • Individual document downloads over time
  • JavaScript invokes an external service to gather the stats
Project Aims
• investigate the requirements of UK and international stakeholders
• design an API for gathering download data (an illustrative sketch follows)
• build software
  • distribution and collection software for repositories
  • generic analysis and reporting tools
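No API had been specified at this stage of the project. Purely as an illustrative sketch, one candidate shape for the per-article record such an API might exchange; every field name here is an assumption, not an IRS decision:

```python
# Purely illustrative: the IRS API had not been designed at this point.
# A candidate record a repository might expose for collection.
from dataclasses import dataclass

@dataclass
class DownloadReport:
    article_id: str       # identifier shared across copies of the article
    repository: str       # base URL of the reporting repository
    period: str           # e.g. the ISO 8601 day or month being reported
    fulltext_count: int   # full-text (PDF) requests, robots filtered out
    abstract_count: int   # abstract-page views, counted separately
```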
Scenario 1
• Forty physicists collaborate on a paper which is deposited into each of their institutional repositories plus arxiv.org
• Each repository reports to its author that it has received n downloads.
• How can these counts be aggregated? (a toy sketch follows)
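A toy sketch of the aggregation step, assuming the forty copies can already be linked by a shared identifier (the linking itself is the hard part); identifiers and counts are invented:

```python
# Toy aggregation of per-repository download reports for one paper.
from collections import defaultdict

reports = [
    # (shared article identifier, reporting repository, downloads)
    ("oai:arXiv.org:physics/0000001", "arxiv.org",           120),
    ("oai:arXiv.org:physics/0000001", "eprints.soton.ac.uk",  17),
    ("oai:arXiv.org:physics/0000001", "eprints.utas.edu.au",   9),
]

totals = defaultdict(int)
for article, repository, count in reports:
    totals[article] += count

print(dict(totals))  # {'oai:arXiv.org:physics/0000001': 146}
```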
Scenario 2
• Two repositories report that a paper has received 50 downloads.
• Have they both filtered out spiders in the same way? Self-downloads? Repeated downloads from the same IP?
• Are abstract and PDF downloads treated equivalently?
• How can the counts be compared? (a sketch of the filtering choices follows)
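A sketch of why the two counts may not be comparable: each filtering decision below changes the total, and repositories need not agree on any of them. The robot list, deduplication window, and log fields are illustrative assumptions, not any agreed standard:

```python
# Each "continue" below is a policy choice that changes the count.
ROBOT_AGENTS = ("googlebot", "slurp", "msnbot")  # toy robot list
DEDUP_WINDOW = 3600  # seconds: repeats from one IP within an hour count once

def count_downloads(hits, local_ips=frozenset()):
    """hits: iterable of (timestamp, ip, user_agent, is_fulltext) tuples."""
    last_seen = {}
    total = 0
    for ts, ip, agent, is_fulltext in sorted(hits):
        if not is_fulltext:
            continue  # abstract view: count as a download, or not?
        if any(robot in agent.lower() for robot in ROBOT_AGENTS):
            continue  # spider
        if ip in local_ips:
            continue  # possible self-download
        if ip in last_seen and ts - last_seen[ip] < DEDUP_WINDOW:
            continue  # repeated download from the same IP
        last_seen[ip] = ts
        total += 1
    return total
```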
Participants
• Project Partners
  • ECS (Leslie Carr, Stevan Harnad, Tim Brody), U Tasmania (Arthur Sale), COUNTER (David Goodman, Long Island U), Key Perspectives (Alma Swan)
• International Panel
  • Rob Tansley (DSpace), Herbert Van de Sompel (OAI), Alberto Pepe (CERN), Laurent Romary (CNRS), Bill Hubbard (SHERPA), Leo Waaijers (DARE), Sune Karlsson (LogEc), Andrew Bennett (APSR)
Current Progress
• Report on stakeholder requirements
  • 35 people representing 18 different institutions interviewed
  • Their priorities:
    • Origin of access (country, domain, institution)
    • Timing of access (date, time-series, cumulative)
• Comments requested
  • http://irs.eprints.org/report/
Comments Sought
• Are the preoccupations of new repository managers correct?
• How much effort should be devoted to new bibliometrics?
  • We need international feedback and support to go there!
• Please join the expert panel!
Impact for (e.g.) the UK
• Research Assessment Exercise
  • Performed in 2001, next due in 2008
  • IRRA project defines roles for the repository (collecting research evidence, providing it to panels)
• But to simplify:
  • RAE results correlate with citation impact
  • citation impact is correlated with downloads (a toy illustration follows)
• But the academic community is still very wary of 'web logs'
  • though citation used to be the only auditable use of an article
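To make the "downloads predict citation impact" step concrete, a toy correlation computation; the numbers are invented, and the actual OpCit/Citebase studies used large arXiv samples:

```python
# Toy illustration of correlating downloads with later citations.
# All numbers are made up. Requires Python 3.10+ for statistics.correlation.
from statistics import correlation

downloads = [120, 45, 300, 12, 88, 210]  # hypothetical early downloads
citations = [15, 4, 40, 1, 9, 22]        # hypothetical later citations

print(f"Pearson r = {correlation(downloads, citations):.2f}")
```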
Message to UK Fundholders
• Let many flowers bloom (Harnad)
  • i.e. freely provide download and citation statistics
• So each community can define its own statistical measures of quality, impact and success
  • Physicists can use journals
  • Computer Scientists can use conferences
  • All disciplines can see the impact of their own departments, individuals, projects etc.
Next Steps
• Looking for feedback
• Looking for agreement
• Looking for collaboration
• How can we join in?