Mirror Mirror on the wall, does your repository reflect it all?
Peter West and Timothy Miles-Board
EPrints Services, University of Southampton, Southampton, UK
pjw@ecs.soton.ac.uk, tmb@ecs.soton.ac.uk
Introduction
How can we help repository administrators validate the completeness and accuracy of their repository holdings?
Enquiry, Development and Support.
• Understand the problems faced by repository owners.
Community Engagement
Concerns:
• Does our repository content accurately reflect the published output of our institution?
• Is our bibliographic metadata accurate and complete?
• Are our publications correctly and unambiguously associated with the right authors, editors, contributors?
Case Studies
• We have investigated solutions to specific instances of each of the three concerns.
Case Study 1 – Publication Matching
• Does our repository content accurately reflect the published output of our institution?
• The repository is used to drive an internal approval workflow.
• Approval is required before submission to publishers.
• Very early deposit.
• Problems:
  • Are published items approved?
  • Are approved items published?
Publication Matching
• Collate lists of known published work.
• Import the lists into the repository and run a publication match process.
• Generate a report for the administrator.
• Provide tools to act on the data generated.
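The match-and-report steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the EPrints implementation: the `Record` fields, the `normalise_title` helper and the report categories are all assumptions, chosen to mirror the two questions from Case Study 1 (are published items approved, and are approved items published).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical minimal bibliographic record (not an EPrints data type)."""
    title: str
    doi: str = ""


def normalise_title(title: str) -> str:
    """Lower-case and collapse punctuation/whitespace so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def same_publication(a: Record, b: Record) -> bool:
    """Assumed comparison: DOIs decide when both are present, otherwise normalised titles."""
    if a.doi and b.doi:
        return a.doi.lower() == b.doi.lower()
    return normalise_title(a.title) == normalise_title(b.title)


def match_report(published: list[Record], approved: list[Record]) -> dict[str, list[Record]]:
    """Split the two lists into matched and unmatched items for the administrator."""
    matched = [p for p in published if any(same_publication(p, a) for a in approved)]
    return {
        "matched": matched,
        "published_not_approved": [p for p in published if p not in matched],
        "approved_not_published": [
            a for a in approved if not any(same_publication(p, a) for p in published)
        ],
    }


# Example data (invented for illustration only).
published = [Record("Open Access: A Survey", "10.1000/abc"), Record("Unapproved Paper")]
approved = [Record("Open access - a survey", "10.1000/ABC"), Record("Draft Still In Review")]
report = match_report(published, approved)
```

The two unmatched lists correspond directly to the two problems on the Case Study 1 slide; a real deployment would likely need fuzzier title matching than the exact normalised comparison assumed here.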
Publication Matching
• Future work:
  • Generalise the framework to support the requirements of another organisation.
  • Merge with the concepts behind the metadata update script.
• End Goal:
  • A validation tool framework that allows a dataset to be matched using a comparison function.
  • Plugin support for custom lists (held in a reference manager database), existing services (using DOIs or PubMed IDs) and new emerging sources (ORCiD).
  • Integrate the reporting with IRStats2.
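One possible shape for the end-goal framework is sketched below, with list sources as plugins and the comparison function passed in by the caller. Everything here is hypothetical: the `Source` interface, the `StaticList` plugin, `by_field` and `validate` are illustrative names, not part of EPrints or IRStats2, and a plugin for an external service would need its own lookup logic.

```python
from typing import Callable, Iterable, Protocol

Compare = Callable[[dict, dict], bool]


class Source(Protocol):
    """Hypothetical plugin interface: anything that can list candidate records."""

    def records(self) -> Iterable[dict]: ...


class StaticList:
    """Stand-in plugin for a custom list, e.g. one exported from a reference manager."""

    def __init__(self, rows: Iterable[dict]) -> None:
        self._rows = list(rows)

    def records(self) -> Iterable[dict]:
        return iter(self._rows)


def by_field(field: str) -> Compare:
    """Build a comparison function that matches on one identifier field (e.g. 'doi')."""

    def compare(a: dict, b: dict) -> bool:
        x, y = a.get(field), b.get(field)
        # A missing or empty identifier never counts as a match.
        return bool(x) and bool(y) and str(x).strip().lower() == str(y).strip().lower()

    return compare


def validate(holdings: list[dict], source: Source, compare: Compare):
    """Return (matched pairs, source records with no counterpart in the holdings)."""
    matched, missing = [], []
    for candidate in source.records():
        hit = next((h for h in holdings if compare(h, candidate)), None)
        if hit is None:
            missing.append(candidate)
        else:
            matched.append((hit, candidate))
    return matched, missing


# Example data (invented for illustration only).
holdings = [{"doi": "10.1000/A1", "title": "First"}, {"pmid": "123", "title": "Second"}]
source = StaticList([{"doi": "10.1000/a1"}, {"doi": "10.1000/zz"}])
matched, missing = validate(holdings, source, by_field("doi"))
```

Keeping the comparison function separate from the source means the same `validate` loop could, in principle, be reused with `by_field("pmid")` or a custom title comparison without changing any plugin.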
Case Study 2 – Authority Lists
• Is our bibliographic metadata accurate and complete?
• The accuracy of journal and publisher information was affecting the efficiency of both the repository's editorial team and its submitters.
• Direct impact on funding allocation.
• The data collected by the editorial team could be used more effectively if it were integrated into the submission process.
Authority Lists
• A database of journal information (JDB) was developed.
• Retrieve journal and publisher data for an item via an interactive dialog.
• Users can search other external databases.
• In the worst case the user can manually enter the data.
Issues
• Data Integrity.
  • Duplicate entries
  • Broken links
• Search Performance.
  • Reduce similar/duplicate entries
  • Did you mean?
  • Order results based on popularity
• Future work
  • Multi-user support
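The three search improvements listed above could look something like the sketch below. It is only an illustration using the Python standard library, not the JDB code: the `normalise` rule, the usage counts and the function names are assumptions, and "popularity" is taken here to mean how often a journal has been selected before.

```python
import difflib
from collections import Counter


def normalise(name: str) -> str:
    """Assumed normalisation: case-fold, treat '&' as 'and', collapse whitespace."""
    return " ".join(name.lower().replace("&", "and").split())


def dedupe(names: list[str]) -> list[str]:
    """Keep the first spelling seen for each normalised journal name."""
    seen: dict[str, str] = {}
    for name in names:
        seen.setdefault(normalise(name), name)
    return list(seen.values())


def search(query: str, journals: list[str], usage: Counter, limit: int = 5) -> list[str]:
    """Substring search over de-duplicated names, most-used journals first."""
    q = normalise(query)
    hits = [j for j in dedupe(journals) if q in normalise(j)]
    return sorted(hits, key=lambda j: -usage[j])[:limit]


def did_you_mean(query: str, journals: list[str], cutoff: float = 0.6) -> list[str]:
    """Suggest close spellings for a query, using difflib similarity scores."""
    keys = {normalise(j): j for j in dedupe(journals)}
    close = difflib.get_close_matches(normalise(query), list(keys), n=3, cutoff=cutoff)
    return [keys[k] for k in close]


# Example data (invented for illustration only).
journals = ["Nature", "nature", "Nature Communications", "Science & Society", "Science and Society"]
usage = Counter({"Nature Communications": 10, "Nature": 3})
results = search("nature", journals, usage)
suggestions = did_you_mean("Natur", journals)
```

The same normalised key that removes duplicates from the result list could also be used when new entries are added, which would address the duplicate-entry side of the data integrity issue at source.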
[Diagram: Authority Lists. Labels: Journal Database; Publisher Data; Core Journal Data; Uni A Data; Uni B Data; Client Uni A; Client Uni B; Client Uni C.]
Case Study 3 – Author Disambiguation
• Are our publications correctly and unambiguously associated with the right authors, editors, contributors?
• Common problem.
• Leverage “single sign-on data”.
• Replace free-text input fields.
• Possibility of utilising contributor data in other ways:
  • Subjects
  • Affiliations
Case Study 3 – Author Disambiguation
• Temporal nature of roles.
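One reading of the temporal point above is that a contributor's role or affiliation changes over time, so a publication should be linked to the role held on its publication date rather than the current one. The sketch below illustrates that reading only; the `Role` fields, the staff identifier format and the `role_on` helper are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Role:
    """Hypothetical record of one period in a staff member's history."""
    staff_id: str
    affiliation: str
    start: date
    end: Optional[date] = None  # None means the role is still current


def role_on(roles: list[Role], staff_id: str, when: date) -> Optional[Role]:
    """Return the role a staff member held on a given date, if any."""
    for role in roles:
        if role.staff_id != staff_id:
            continue
        if role.start <= when and (role.end is None or when <= role.end):
            return role
    return None


# Example data (invented for illustration only).
roles = [
    Role("s001", "Chemistry", date(2009, 1, 1), date(2011, 12, 31)),
    Role("s001", "Electronics and Computer Science", date(2012, 1, 1)),
]
at_publication = role_on(roles, "s001", date(2011, 6, 1))
today = role_on(roles, "s001", date(2013, 6, 1))
```

Resolving the affiliation at publication time rather than at deposit time would keep older items attached to the department the author belonged to when the work appeared.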
Case Study 3 – Author Disambiguation
• Staff identifiers vs. ORCiD.
Road Map
• Our goal is to produce a set of tools and procedures for the repository community.
• Q3
• Publication Matching
• Author disambiguation
• Q4
• Authority Lists
• Consolidation
• Reflection
• Generalisation