
Interoperable Repository Statistics




Presentation Transcript


  1. Interoperable Repository Statistics Les Carr & Tim Brody University of Southampton http://irs.eprints.org/

  2. Introduction • Background to & history of our interest in statistics • The purpose of the IRS project • Progress so far • Request for participation

  3. Topics of Discussion • Statistics - how to get them • Statistics - what services we can build with them • Statistics - what current users want from them now • Statistics - new measures of impact • and significance? and quality? • Statistics - validation for the academic community

  4. Some History • 1999 NSF / JISC International Digital Library Project “OpCit” • provide external citation linking service for LANL (as was) • ECS, Cornell, Los Alamos • Supporting study on use of citations • Analysed download patterns from the UK mirror with respect to citation patterns • Were downloads influenced by citations? • Could citation impact be predicted from downloads?

  5. OpCit Outcome • Citebase ‘OAI’ service • Used OAI to obtain article metadata • Used local UK mirror’s file system to obtain • article text (to extract citations) • article download stats from web logs • No standard way to obtain necessary data. • By now most repositories accept web spiders. • Still no agreement with arxiv.org to harvest central download data
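The harvesting step Citebase performed can be sketched in a few lines: an OAI-PMH ListRecords request returns Dublin Core metadata that the standard library can parse. This is an illustrative sketch, not Citebase's actual code; the arXiv base URL is shown only as an example endpoint, and the XML below is a hypothetical minimal response.

```python
# Sketch of the metadata-harvesting step: build an OAI-PMH
# ListRecords request URL and parse the Dublin Core records
# in a (hypothetical, minimal) response.
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def listrecords_url(base_url, metadata_prefix="oai_dc"):
    """URL for harvesting all records as unqualified Dublin Core."""
    return base_url + "?" + urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix})

def parse_titles(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    out = []
    for rec in root.iter(OAI + "record"):
        ident = rec.findtext(OAI + "header/" + OAI + "identifier")
        title = rec.findtext(".//" + DC + "title")
        out.append((ident, title))
    return out

sample = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
 <ListRecords>
  <record>
   <header><identifier>oai:arXiv.org:hep-th/9901001</identifier></header>
   <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
     <dc:title>An Example Paper</dc:title>
    </oai_dc:dc>
   </metadata>
  </record>
 </ListRecords>
</OAI-PMH>"""

print(listrecords_url("http://export.arxiv.org/oai2"))
print(parse_titles(sample))
```

Note the contrast with the slide's point: OAI-PMH standardised the metadata half of this, but there was no comparable standard request for the download data.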

  6. Stakeholders • Authors • Encouragement to research • Encouragement to archive • Researchers • New discovery & navigation methods • New filter mechanisms

  7. Stakeholders (II) • Repository Administrators • Management & maintenance decisions • Marketing, feedback to research managers • Fund holders • Assess impact, inform future funding decisions

  8. Internal Deposit Statistics • Who is depositing what, and how frequently • Managers can collect information to give feedback on effectiveness • Archives.eprints.org provides simple growth data • OpenDOAR could collect information on best practice • Evaluate policy outcomes • Are mandates necessary?

  9. Management Stats (the Easy Way) • Google Analytics • Repository overview • Individual document downloads over time • JavaScript invokes external service to gather stats

  10. Project Aims • investigate the requirements of UK and international stakeholders • design an API for gathering download data • build software • distribution and collection software for repositories • generic analysis and reporting tools

  11. Scenario 1 • Forty physicists collaborate on a paper, which is deposited into each of their institutional repositories plus arxiv.org • Each repository reports to its author that it has received n downloads • How can the counts be aggregated?
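The mechanical part of Scenario 1 is trivial once the hard part is solved: the counts can only be summed if every repository reports them against a shared identifier for the paper. A minimal sketch, assuming each repository can report counts keyed by a common DOI (the DOI, repository names, and figures below are all hypothetical):

```python
from collections import Counter

def aggregate(reports):
    """Sum per-repository download counts keyed by a shared
    identifier (here a DOI). Without such a common key the
    counts cannot be merged reliably."""
    total = Counter()
    for repo_counts in reports:
        total.update(repo_counts)
    return total

# Hypothetical reports from three of the co-authors' repositories
# plus arxiv.org, all describing the same paper:
reports = [
    {"doi:10.1000/xyz123": 17},   # institutional repository A
    {"doi:10.1000/xyz123": 42},   # institutional repository B
    {"doi:10.1000/xyz123": 250},  # arxiv.org
]
print(aggregate(reports))  # Counter({'doi:10.1000/xyz123': 309})
```

The design question the project raises is precisely what goes in those report dictionaries: which identifier scheme, and counts computed under which policy (see Scenario 2).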

  12. Scenario 2 • Two repositories report that a paper has received 50 downloads. • Have they both filtered out spiders in the same way? Self-downloads? Repeated downloads from the same IP? • Are abstract and PDF downloads treated equivalently? • How can they be compared?
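Scenario 2 is about filtering policy: two raw counts of 50 are only comparable if they were computed the same way. A sketch of one explicit, hypothetical policy (the spider list, time window, and log format are all assumptions for illustration, not an agreed standard):

```python
KNOWN_SPIDERS = ("googlebot", "msnbot", "slurp")

def count_downloads(log, depositor_ip=None, window=3600,
                    full_text_only=True):
    """Count downloads from (timestamp, ip, user_agent, item_type)
    log entries under one explicit policy: drop known spiders,
    drop the depositor's own IP, collapse repeat requests from
    the same IP within `window` seconds, and optionally count
    only full-text (PDF) requests rather than abstract views."""
    last_seen = {}
    count = 0
    for ts, ip, agent, item_type in sorted(log):
        if any(bot in agent.lower() for bot in KNOWN_SPIDERS):
            continue                      # spider
        if ip == depositor_ip:
            continue                      # self-download
        if full_text_only and item_type != "pdf":
            continue                      # abstract view, not full text
        if ip in last_seen and ts - last_seen[ip] < window:
            continue                      # repeat from same IP
        last_seen[ip] = ts
        count += 1
    return count

log = [
    (0,    "1.2.3.4", "Mozilla/5.0",   "pdf"),       # counted
    (60,   "1.2.3.4", "Mozilla/5.0",   "pdf"),       # repeat within window
    (100,  "5.6.7.8", "Googlebot/2.1", "pdf"),       # spider
    (200,  "9.9.9.9", "Mozilla/5.0",   "abstract"),  # abstract view
    (5000, "1.2.3.4", "Mozilla/5.0",   "pdf"),       # outside window, counted
]
print(count_downloads(log))  # 2
```

Every parameter here (`window`, the spider list, `full_text_only`) is a policy choice; two repositories reporting "50 downloads" under different choices are reporting different things, which is exactly why a shared specification is needed.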

  13. Participants • Project Partners • ECS (Leslie Carr, Stevan Harnad, Tim Brody), U Tasmania (Arthur Sale), Counter (David Goodman, Long Island U), Key Perspectives (Alma Swan) • International Panel • Rob Tansley (DSpace), Herbert Van de Sompel (OAI), Alberto Pepe (CERN), Laurent Romary (CNRS), Bill Hubbard (SHERPA), Leo Waaijers (DARE), Sune Karlsson (LogEc), Andrew Bennett (APSR)

  14. Current Progress • Report on stakeholder requirements • 35 people representing 18 different institutions interviewed • Their priorities: • Origin of access (country, domain, institution) • Timing of access (date, time-series, cumulative) • Comments requested • http://irs.eprints.org/report/

  15. Comments Sought • Are the preoccupations of new repository managers correct? • How much effort should be devoted to new bibliometrics? • We need international feedback and support to go there! • Please join the expert panel!

  16. Impact for (e.g.) UK • Research Assessment Exercise • Performed in 2001, and next in 2008 • IRRA project defines roles for the repository (collecting research evidence, providing it to panels) • But to simplify: • RAE results correlate with citation impact • citation impact is correlated with downloads • But the academic community is still very wary of ‘web logs’ • yet citation used to be the only auditable use of an article

  17. Message to UK Fundholders • Let many flowers bloom (Harnad) • i.e. Freely provide download and citation statistics • So each community can define its own statistical measures of quality, impact and success • Physicists can use journals • Computer Scientists can use conferences • All disciplines can see the impact of their own departments, individuals, projects etc.

  18. Next Steps • Looking for feedback • Looking for agreement • Looking for collaboration • How can we join in?
