
Integrate external services in DSpace submission process

Integrate external services in DSpace submission process. How to make self-deposit easy and improve metadata quality and presence of full-text. Andrea Bollini – Susanna Mornati. Topics. Some context: CINECA, a brief overview; DSpace as part of a CRIS solution.


Presentation Transcript


  1. Integrate external services in DSpace submission process: How to make self-deposit easy and improve metadata quality and presence of full-text. Andrea Bollini – Susanna Mornati

  2. Topics • Some context: • CINECA: a brief overview • DSpace as part of a CRIS solution • Integration of external services: • Bibliographic databases: Scopus, PubMed, CrossRef, arXiv, etc. • Publisher policies: Sherpa/Romeo • Make the repository an active actor: • Discovering missing content • Improving full-text presence www.cineca.it | Integrate external services in DSpace submission process | OR2013 | July 2013

  3. The Company as of last week! • Interuniversity Consortium • Non-profit • Founded in 1969 • Headquarters in Bologna • 57 Members • 54 Universities • 2 Research institutes • MIUR • Owned companies: Kion, SCS • Employees: 400 (+150 Kion) • Total turnover: 70M€

  4. The Merge • The “merging process” of the three Italian Consortia started in September 2012 • It was concluded on July 1st, 2013 (last week!) • 67 Members • More than 700 employees (+150 Kion) • The only Italian Interuniversity Consortium 2.0

  5. What CINECA does

  6. How we work with Universities • Cineca Board of Directors • University Customers • Focus Groups • University Customers • Cineca Technical Board • Diagram labels: Tech Road Map; Apps Road Map; U-GOV & SURplus Restricted Board; Requirements; Requirements; Product Managers Board; Customer Service Board; Technical & Delivery Board

  7. Authentication Solutions for HE = ERP = Best of Breed AU Gateway GW

  8. SURplus: CINECA's CRIS System • An interoperable infrastructure made of different components • Ingestion of data from any legacy systems adopted by an institution • Maintenance of specific functional requirements, data model and preferred technologies at the level of applications • Data warehouse and Business Intelligence tools to facilitate aggregations of data and the application of measurement parameters and algorithms

  9. SURplus: Dimension • Beginning of activities: 2004 • 9 institutions • 22 institutional repositories • Total modules: 77

  10. Topics • Some context: • CINECA: a brief overview • DSpace as part of a CRIS solution • Integration of external services: • Bibliographic databases: Scopus, PubMed, CrossRef, arXiv, etc. • Publisher policies: Sherpa/Romeo • Make the repository an active actor: • Discovering missing content • Improving full-text presence

  11. DSpace: SURplus' Open Archive Module • CINECA is a registered service provider at DuraSpace • Long-term collaboration with the DSpace community, since 2003 • The OA Module, developed on DSpace: • Manages collection and dissemination of research results • Simplifies data collection processes • Service integration • Upgrades are periodically released to the open source community

  12. DSpace-CRIS: SURplus' Expertise & Skills • DSpace-CRIS: designed together with the Hong Kong University & released as open source • “dissemination of entities' descriptions in the research environment which go beyond publications”

  13. IR as part of a CRIS system: what changes? • Professional support • HA infrastructure • Dedicated team • Benefits: • Strong deposit mandate • More funding advocacy • Issues to mitigate: • The IR becomes a critical application • Authors perceive deposit as a “requirement” • Wasted time • Late submission • The information already exists in other databases! Make the submission process easy

  14. Topics • Some context: • CINECA: a brief overview • DSpace as part of a CRIS solution • Integration of external services: • Bibliographic databases: Scopus, PubMed, CrossRef, arXiv, etc. • Publisher policies: Sherpa/Romeo • Make the repository an active actor: • Discovering missing content • Improving full-text presence

  15. New first submission step • Free search form • Available providers: each provider is a Spring service • Main metadata common to all publication types (article, book, etc.): title of the contribution, year, authors/editors

  16. New first submission step • Lookup by unique identifier • Each provider declares which identifiers it is able to manage
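
The idea that each provider declares the identifiers it can manage can be sketched as follows. This is a minimal illustration, not the DSpace `SubmissionLookupProvider` API: the nested interface, the registry class and the identifier types assigned to each provider are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ProviderRegistrySketch {

    /** Hypothetical contract: a provider names itself and lists the identifier types it handles. */
    interface IdentifierAwareProvider {
        String shortName();
        List<String> supportedIdentifiers();
    }

    private final List<IdentifierAwareProvider> providers;

    ProviderRegistrySketch(List<IdentifierAwareProvider> providers) {
        this.providers = providers;
    }

    /** Returns the names of the providers that declared support for the given identifier type. */
    public List<String> providersFor(String identifierType) {
        List<String> names = new ArrayList<>();
        for (IdentifierAwareProvider p : providers) {
            if (p.supportedIdentifiers().contains(identifierType)) {
                names.add(p.shortName());
            }
        }
        return names;
    }

    /** Small factory used only to keep the example short. */
    public static IdentifierAwareProvider provider(final String name, final String... ids) {
        return new IdentifierAwareProvider() {
            public String shortName() { return name; }
            public List<String> supportedIdentifiers() { return Arrays.asList(ids); }
        };
    }

    public static void main(String[] args) {
        // The identifier types per provider are illustrative assumptions.
        ProviderRegistrySketch registry = new ProviderRegistrySketch(Arrays.asList(
                provider("pubmed", "pmid", "doi"),
                provider("arxiv", "arxiv", "doi"),
                provider("crossref", "doi")));
        System.out.println(registry.providersFor("pmid"));
        System.out.println(registry.providersFor("doi"));
    }
}
```

With a registry like this, the lookup form only needs to ask which providers to query for the identifier the user typed in.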

  17. New first submission step • For each result, the providers that match the record are shown • Grouping is done via DOI

  18. Modal box: publication details • Records from different providers are merged to get richer metadata • The system guesses a collection for the submission, but the user can change it if required
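
Grouping results by DOI and merging the records of one group might look like the sketch below. The record shape, the field names and the "first non-empty value wins" merge rule are assumptions chosen for illustration; the slides do not say how conflicts between providers are resolved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class DoiMergeSketch {

    /** A provider result: provider name plus flat field/value metadata (hypothetical shape). */
    static final class ProviderRecord {
        final String provider;
        final Map<String, String> fields;

        ProviderRecord(String provider, Map<String, String> fields) {
            this.provider = provider;
            this.fields = fields;
        }
    }

    /** DOIs are compared case-insensitively, so normalize before grouping. */
    static String normalizeDoi(String doi) {
        return doi.trim().toLowerCase(Locale.ROOT);
    }

    /** Groups records sharing a DOI; a record without a DOI forms its own group. */
    static Map<String, List<ProviderRecord>> groupByDoi(List<ProviderRecord> records) {
        Map<String, List<ProviderRecord>> groups = new LinkedHashMap<>();
        int withoutDoi = 0;
        for (ProviderRecord r : records) {
            String doi = r.fields.get("doi");
            String key = (doi == null || doi.trim().isEmpty())
                    ? "no-doi-" + (withoutDoi++)
                    : normalizeDoi(doi);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(r);
        }
        return groups;
    }

    /** Merges one group: the first non-empty value seen for each field is kept. */
    static Map<String, String> merge(List<ProviderRecord> group) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (ProviderRecord r : group) {
            for (Map.Entry<String, String> e : r.fields.entrySet()) {
                String value = e.getValue();
                if (value != null && !value.isEmpty()) {
                    merged.putIfAbsent(e.getKey(), value);
                }
            }
        }
        return merged;
    }

    /** Builds a field map from alternating key/value arguments. */
    static Map<String, String> fields(String... keyValues) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            m.put(keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        List<ProviderRecord> records = new ArrayList<>();
        records.add(new ProviderRecord("pubmed", fields("doi", "10.1000/ABC", "title", "Sample title")));
        records.add(new ProviderRecord("crossref", fields("doi", "10.1000/abc", "issn", "0378-5955")));
        for (List<ProviderRecord> group : groupByDoi(records).values()) {
            System.out.println(merge(group));
        }
    }
}
```

Listing providers in order of trust before merging is one simple way to decide whose value is kept when two sources disagree.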

  19. Manual submission • When lookup fails, the user can always proceed manually

  20. Batch import from external source • Import data (identifiers or structured text) can be entered manually or uploaded as a file • Format/provider must be specified by the user

  21. Batch import from external source • Requests are processed: • Inline for specific providers and/or within configured data limits → Submitter can immediately complete the pre-filled submissions • In a background process • Submitter will receive a summary email with the import result • Pre-filled submissions are available as in-progress submissions in MyDSpace • The legacy batch import feature for JSPUI has already been shared as a pull request on GitHub, see DS-1252
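
The choice between inline and background processing can be sketched as a small decision function. The class name, the provider whitelist and the record limit are hypothetical, and the "and/or" on the slide is read here as "the provider is allowed inline and the request is within the limit", which is only one possible interpretation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class BatchImportPlannerSketch {

    enum Mode { INLINE, BACKGROUND }

    private final Set<String> inlineProviders;
    private final int inlineRecordLimit;

    BatchImportPlannerSketch(Set<String> inlineProviders, int inlineRecordLimit) {
        this.inlineProviders = inlineProviders;
        this.inlineRecordLimit = inlineRecordLimit;
    }

    /** Inline only when the provider is allowed inline AND the request is small enough. */
    public Mode plan(String provider, int recordCount) {
        boolean allowedInline = inlineProviders.contains(provider);
        boolean withinLimit = recordCount <= inlineRecordLimit;
        return (allowedInline && withinLimit) ? Mode.INLINE : Mode.BACKGROUND;
    }

    public static void main(String[] args) {
        // Both the provider set and the limit of 20 records are made-up configuration values.
        BatchImportPlannerSketch planner = new BatchImportPlannerSketch(
                new HashSet<>(Arrays.asList("pubmed", "crossref")), 20);
        System.out.println(planner.plan("pubmed", 5));
        System.out.println(planner.plan("pubmed", 500));
        System.out.println(planner.plan("scopus", 5));
    }
}
```

A request planned as `BACKGROUND` would then be queued, with the summary email sent once the import finishes.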

  22. Enhanced Describe step: showing metadata source

  23. Technical details
      • WGET http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=23297105&retmode=xml&rettype=full
      • Lookup providers: PubMed Lookup Provider, arXiv Lookup Provider, Scopus Lookup Provider; each has its source record (PubMed record, arXiv record, Scopus record), a Java bean and a mapping file
      • Translation logic: original → normalized (mapping file); normalized record
      • Enhancer plugins: split, aggregate fields; derive data (ISSN → journal title, …)
      • Translation logic: normalized → repository (mapping file); DSpace item
      • Java bean: public class PubmedItem { private String pubmedID; private String doi; private String issn; private String eissn; private String journalTitle; private String title; private String pubblicationModel; private String year; private String volume; private String issue; private String language; private List<String> type; private List<String> primaryKeywords; private List<String> secondaryKeywords; …
      • Spring configuration: <bean name="pubmedLookupProvider" class="...lookup.PubmedLookupProvider"> <property name="pubmedService" ref="pubmedService"/> </bean> <bean name="pubmedService" class="...service.PubmedService"/>
      • Classes: public class PubmedLookupProvider extends ConfigurableLookupProvider; public abstract class ConfigurableLookupProvider … implements SubmissionLookupProvider
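
The first stage of this pipeline (fetch the PubMed XML, fill a Java bean) can be sketched with the JDK's XML tools alone. The efetch URL pattern comes from the slide. The `Item` holder, the XPath expressions and the sample XML are assumptions: the sample is a hand-trimmed fragment shaped like a PubMed record, not an actual response, and the real `PubmedItem` population logic is not shown in the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class PubmedFetchSketch {

    static final String EFETCH = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi";

    /** Builds the efetch URL shown on the slide for a given PubMed ID. */
    static String efetchUrl(String pubmedId) {
        return EFETCH + "?db=pubmed&id=" + pubmedId + "&retmode=xml&rettype=full";
    }

    /** Cut-down stand-in for the PubmedItem bean (only four of its fields). */
    static final class Item {
        String pubmedID;
        String issn;
        String journalTitle;
        String title;
    }

    /** Reads a few fields out of a PubMed-style XML document. */
    static Item parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            Item item = new Item();
            item.pubmedID = xp.evaluate("//MedlineCitation/PMID", doc);
            item.issn = xp.evaluate("//Article/Journal/ISSN", doc);
            item.journalTitle = xp.evaluate("//Article/Journal/Title", doc);
            item.title = xp.evaluate("//Article/ArticleTitle", doc);
            return item;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse PubMed XML", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative sample only; the values are placeholders, not real record data.
        String sample = "<PubmedArticleSet><PubmedArticle><MedlineCitation>"
                + "<PMID>23297105</PMID><Article><Journal>"
                + "<ISSN IssnType=\"Print\">0378-5955</ISSN><Title>Sample Journal</Title>"
                + "</Journal><ArticleTitle>Sample article title</ArticleTitle>"
                + "</Article></MedlineCitation></PubmedArticle></PubmedArticleSet>";
        System.out.println(efetchUrl("23297105"));
        Item item = parse(sample);
        System.out.println(item.pubmedID + " | " + item.journalTitle + " | " + item.title);
    }
}
```

In the architecture on the slide, a bean filled this way would next go through the provider's mapping file to become a normalized record.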

  24. Topics • Some context: • CINECA: a brief overview • DSpace as part of a CRIS solution • Integration of external services: • Bibliographic databases: Scopus, PubMed, CrossRef, arXiv, etc. • Publisher policies: Sherpa/Romeo • Make the repository an active actor: • Discovering missing content • Improving full-text presence

  25. Enhanced upload step • Using the ISSN or EISSN provided in the describe step, the upload form is improved by showing on the right side the publisher policy from the Sherpa/Romeo database
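
Before querying a policy database with the ISSN or EISSN from the describe step, it helps to check that the value is well formed. The sketch below uses the standard ISSN check-digit rule (weights 8 down to 2, modulo 11, with `X` standing for 10). The class and method names are hypothetical, and preferring the print ISSN over the EISSN is an assumption, not something the slides specify.

```java
final class IssnSketch {

    /** Validates the NNNN-NNNC format and the ISSN check digit. */
    static boolean isValidIssn(String issn) {
        if (issn == null || !issn.matches("\\d{4}-\\d{3}[\\dXx]")) {
            return false;
        }
        String chars = issn.replace("-", "");
        int sum = 0;
        for (int i = 0; i < 7; i++) {
            sum += (chars.charAt(i) - '0') * (8 - i); // weights 8, 7, ..., 2
        }
        int check = (11 - sum % 11) % 11;
        char expected = (check == 10) ? 'X' : (char) ('0' + check);
        return Character.toUpperCase(chars.charAt(7)) == expected;
    }

    /** Picks the identifier to use for the policy lookup, or null if neither is valid. */
    static String policyLookupKey(String issn, String eissn) {
        if (isValidIssn(issn)) {
            return issn;
        }
        if (isValidIssn(eissn)) {
            return eissn;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(isValidIssn("0378-5955"));
        System.out.println(isValidIssn("0378-5954"));
        System.out.println(policyLookupKey("", "2049-3630"));
    }
}
```

Returning `null` lets the upload form simply skip the policy panel when no usable identifier was entered.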

  26. Enhanced upload step • Access policy for the bitstream: open access, embargo, intranet, etc. • Deposit of full text to the national database for individual CVs

  27. Topics • Some context: • CINECA: a brief overview • DSpace as part of a CRIS solution • Integration of external services: • Bibliographic databases: Scopus, PubMed, CrossRef, arXiv, etc. • Publisher policies: Sherpa/Romeo • Make the repository an active actor: • Discovering missing content • Improving full-text presence

  28. What is the problem? • (Very) late submissions produce some issues for the repository at both the technical and organizational level: • The system is subjected to periods of intense input activity. DSpace, like IR software in general, scales well for read operations and less well for write operations • IR staff involved in the workflow get a lot of tasks to perform in a short period • Make researchers aware • Remind researchers about the IR's presence • Intercept new content early

  29. How do we plan to mitigate the problem? • Citation databases provide APIs to perform searches (we already use them for the lookup), and in some cases they provide additional APIs or search filters/indexes to make more refined searches and allow scanning of the database. • The interesting filters/indexes are: • Time-based (much better if related to insertion in the citation database) • Author ID (better if related to a «standard/common» identifier such as ORCID) • Affiliation • Subject category

  30. Implementation idea • Allow the researcher to store personal preferences about scanning: • Enabled providers (e.g. disable arXiv if you are not a physicist) • Frequencies • Subject category filters • Author IDs will be stored in and retrieved from the researcher profile. • Subject categories could be proposed from previous items or the researcher profile.
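
The per-researcher scanning preferences could be modeled roughly as below. Everything here is hypothetical (class name, a frequency expressed in days, provider names as plain strings); the sketch covers only enabled providers and frequency and leaves out the subject category filters and author IDs.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ScanPreferencesSketch {

    private final Set<String> enabledProviders;
    private final int frequencyDays;

    ScanPreferencesSketch(Set<String> enabledProviders, int frequencyDays) {
        this.enabledProviders = enabledProviders;
        this.frequencyDays = frequencyDays;
    }

    /** A scan is due if none has run yet or the configured number of days has passed. */
    public boolean isDue(LocalDate lastScan, LocalDate today) {
        return lastScan == null || ChronoUnit.DAYS.between(lastScan, today) >= frequencyDays;
    }

    /** Providers to scan now: the available ones the researcher enabled, if a scan is due. */
    public List<String> providersToScan(List<String> available, LocalDate lastScan, LocalDate today) {
        List<String> toScan = new ArrayList<>();
        if (!isDue(lastScan, today)) {
            return toScan;
        }
        for (String provider : available) {
            if (enabledProviders.contains(provider)) {
                toScan.add(provider);
            }
        }
        return toScan;
    }

    public static void main(String[] args) {
        // A researcher who is not a physicist: arXiv is left out of the enabled set.
        ScanPreferencesSketch prefs = new ScanPreferencesSketch(
                new HashSet<>(Arrays.asList("scopus", "pubmed")), 7);
        List<String> available = Arrays.asList("scopus", "arxiv", "pubmed");
        System.out.println(prefs.providersToScan(available,
                LocalDate.of(2013, 7, 1), LocalDate.of(2013, 7, 8)));
    }
}
```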

  31. DSpace-CRIS: Researcher profile

  32. Who are the potential targets? • ORCID • Scopus • Web of Science • arXiv • PubMed Central • DBLP • RePEc • The repository itself!

  33. The repository as source of missing content? • The submitter has to match the authors of the publication with the University staff to highlight internal authors • Sometimes matches are missing • Other times matches are wrong (homonyms) • External authors could become «internal» at some point in the future

  34. The repository as source of missing content? • Send an email to internal «co-authors» when a submission is done → prevent wrong attribution (and reduce duplication) • Allow the researcher to unclaim publications from her profile → last chance to fix wrong attribution • Allow the researcher to claim publications → fix missing attribution and/or engage new researchers • The last two features are included in the DSpace-CRIS add-on

  35. Current implementation: claim/unclaim publications in the repository • You can claim it: • A → Active, simple claim • S → Make it a selected publication • H → Claim it but hide it from your public profile • This is the current status of the publication: U → Unlinked
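
The four status codes on this slide map naturally onto an enum. This is only a sketch: the enum, its flags and the lookup method are hypothetical and not taken from the DSpace-CRIS code, and the `publiclyVisible` flag reflects a reading of "hide it from your public profile" rather than documented behavior.

```java
final class ClaimStatusSketch {

    enum Status {
        ACTIVE('A', true, true),      // active, simple claim
        SELECTED('S', true, true),    // claimed and marked as a selected publication
        HIDDEN('H', true, false),     // claimed but hidden from the public profile
        UNLINKED('U', false, false);  // not linked to the researcher

        final char code;
        final boolean claimed;
        final boolean publiclyVisible;

        Status(char code, boolean claimed, boolean publiclyVisible) {
            this.code = code;
            this.claimed = claimed;
            this.publiclyVisible = publiclyVisible;
        }

        /** Resolves a one-letter code (case-insensitive) to its status. */
        static Status fromCode(char code) {
            char upper = Character.toUpperCase(code);
            for (Status s : values()) {
                if (s.code == upper) {
                    return s;
                }
            }
            throw new IllegalArgumentException("Unknown claim status code: " + code);
        }
    }

    public static void main(String[] args) {
        for (char c : new char[] {'A', 'S', 'H', 'U'}) {
            Status s = Status.fromCode(c);
            System.out.println(c + " -> " + s + " claimed=" + s.claimed
                    + " publiclyVisible=" + s.publiclyVisible);
        }
    }
}
```

Claiming and unclaiming then amount to moving a publication between `UNLINKED` and one of the three claimed states.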

  36. Current implementation: claim/unclaim publications in the repository • You can unclaim a publication: U → Unlink

  37. Current implementation: claim/unclaim publications in the repository

  38. Topics • Some context: • CINECA: a brief overview • DSpace as part of a CRIS solution • Integration of external services: • Bibliographic databases: Scopus, PubMed, CrossRef, arXiv, etc. • Publisher policies: Sherpa/Romeo • Make the repository an active actor: • Discovering missing content • Improving full-text presence

  39. Improve full-text presence • Use the Sherpa/Romeo policy database to analyze repository content • Use external database APIs to find an actual full text (arXiv, PubMed, ... why not the publisher version via library subscription?) • Send an email to the researcher to validate found PDFs or ask for an «author» version • Use statistics to encourage upload

  40. Sherpa/Romeo Statistics (Example) • 51% ISSN • 36% not in Sherpa • 24,000 items • 32% green • 21,000 items • 7.3% have a full text… • 5.3% open access

  41. SURplus: forecast for 2014 • 50+ institutional repositories (DSpace) • 10 research portals (DSpace-CRIS) www.cineca.it | Innovative Open Source Technologies for a CRIS: SURplus | euroCRIS | May 2013

  42. Thank you! Andrea Bollini a.bollini@cineca.it • SURplus - http://www.cineca.it/en/content/surplus • DSpace-CRIS - http://cilea.github.com/dspace-cris
