Technical Issues for Repository Software Theses Alive! Project @ Edinburgh University Library SHERPA Project @ Nottingham University Funded by JISC (Joint Information Systems Committee)
My Role Within These Projects • Evaluate, adapt and develop an open source package for use across the UK • Produce an OAI-compliant E-Thesis repository • Develop a pilot national service with the aim of supporting E-Theses creation and management for UK universities http://www.thesesalive.ac.uk/arch_project.shtml
This Presentation • What is an Institutional Repository? • Common Popular Open-Source Packages • Generic Software Issues • Specific Repository Software Issues • Final Remarks
What is an Institutional Repository? A set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. Clifford Lynch Executive Director, Coalition for Networked Information (CNI)
Common Popular Open-Source Packages • DSpace (http://www.dspace.org/) • MIT, HP, DSpace Federation • EPrints.org (http://www.eprints.org/) • University of Southampton • Fedora (http://www.fedora.info/) • University of Virginia, Cornell University • ETD-db (http://scholar.lib.vt.edu/ETD-db/) • Virginia Tech • Endorsed by NDLTD for E-Theses
Generic Software Issues (1) Support & Development • Support from authors • Documentation essential, mailing lists etc. • Continued development • Bug fixes, feature requests, minimal local development
Generic Software Issues (2) System Architecture • Modular architecture • Easy to upgrade, develop and customise • Appropriate programming languages • Stable and appropriate database system • Easy to integrate into current web services • Templates and styles, using language standards (e.g. HTML/CSS, XML/XSLT)
Generic Software Issues (3) System Security • Authentication methods • Authorisation methods • Authenticate-able content • Secure supporting systems • Well-known, open security systems and coherent standard architectures
Generic Software Issues (4) System Administration • Coherent user administration • Different types of user and user groups • Granular, distributable administration • Delegate areas of the system to different administrators • Access policies
Generic Software Issues (5) Additional Functionality • Public API (Application Programming Interface) • Providing additional services from the same code base • Coherent internal data structuring
Specific Repository Software Issues (1) System Architecture • Web services protocols for data retrieval • OAI-PMH, Z39.50, SRW/U, SOAP, OpenURL • Appropriate database system • PostgreSQL (open-source), Oracle (proprietary)
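As an illustration of the data-retrieval protocols listed above, here is a minimal sketch of how an OAI-PMH request is formed: a base URL plus a `verb` and its arguments in the query string. The base URL and the helper name `build_oai_request` are hypothetical; `ListRecords` and the `oai_dc` metadata prefix come from the OAI-PMH specification.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: each repository publishes its own OAI-PMH base URL.
BASE_URL = "http://repository.example.ac.uk/oai/request"


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask for all records in unqualified Dublin Core (metadata only).
list_records_url = build_oai_request(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc"
)
print(list_records_url)
```

The same helper covers the other verbs, for example `build_oai_request(BASE_URL, "Identify")`.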
Specific Repository Software Issues (2) System Security • Authentication methods • Most importantly: the one you use at your institution, with the option to insert your own • Authorisation methods • Integrate-able into current institutional information systems such as staff, student or course lists • Authenticate-able content • Provenance metadata, paper-trails, data checksums (e.g. MD5)
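The checksum idea mentioned above can be sketched as follows: record an MD5 digest when the file is deposited, then recompute and compare it later to detect changes. The function names are hypothetical and not taken from any of the packages discussed.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the MD5 digest of the data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def verify_fixity(data: bytes, recorded_checksum: str) -> bool:
    """Return True if the data still matches the checksum recorded at ingest."""
    return md5_checksum(data) == recorded_checksum.lower()


# At ingest: store the checksum alongside the provenance metadata.
deposited = b"hello world"
recorded = md5_checksum(deposited)

# Later: recompute and compare to check the file is unchanged.
print(verify_fixity(deposited, recorded))
print(verify_fixity(b"hello w0rld", recorded))
```

MD5 detects accidental corruption; it should not be relied on to detect deliberate tampering.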
Specific Repository Software Issues (3) System Administration • Coherent user administration • Granular administration system • Possible administrator types: Collection admin, User admin, User Group admin, Structure admin, Database Content admin, System Administrator • Licensing System • Related to access policies, with separate submitter, institution and user licences, ideally with a time-dependent facility • Access Policies • Possible requirements: domain restrictions, time-dependent restrictions, partial restrictions
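To make the access-policy requirements above concrete, here is a minimal sketch combining a domain restriction with a time-dependent restriction (an embargo date). The `AccessPolicy` class, its fields and the host names are assumptions for illustration only; partial restrictions are not modelled.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class AccessPolicy:
    allowed_domains: Tuple[str, ...] = ()  # empty tuple: no domain restriction
    embargo_until: Optional[date] = None   # None: no time-dependent restriction


def may_access(policy: AccessPolicy, client_host: str, today: date) -> bool:
    """Return True if a client from client_host may view the item today."""
    # Time-dependent restriction: nobody gets in before the embargo lifts.
    if policy.embargo_until is not None and today < policy.embargo_until:
        return False
    # Domain restriction: the host must equal or sit under an allowed domain.
    if policy.allowed_domains:
        return any(
            client_host == domain or client_host.endswith("." + domain)
            for domain in policy.allowed_domains
        )
    return True


policy = AccessPolicy(allowed_domains=("example.ac.uk",),
                      embargo_until=date(2005, 1, 1))
print(may_access(policy, "lib.example.ac.uk", date(2005, 6, 1)))
```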
Specific Repository Software Issues (4) Record Handling (1) • Metadata Capture • What metadata do you need? Flexible, appropriate schema (e.g. Qualified DC, ETD-MS (E-Theses), MARC21) • Customisable Submission System • Collects relevant metadata, and can be modified conditionally on the fly • Ingest Methods • Standard submission, batch import, harvesting (e.g. OAI-PMH (metadata only)), customised insert using native API
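As a sketch of the harvesting ingest route above, the following parses an OAI-PMH `ListRecords` response and pulls out the Dublin Core metadata. The sample response, its identifier and its field values are invented for illustration; the XML namespaces are those defined by OAI-PMH and Dublin Core.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.ac.uk:1234</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Thesis</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_records(xml_text):
    """Return one dict per harvested record: identifier, titles, creators."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("./oai:ListRecords/oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier",
                                          namespaces=NS),
            "title": [e.text for e in record.findall(".//dc:title", NS)],
            "creator": [e.text for e in record.findall(".//dc:creator", NS)],
        })
    return records


records = extract_records(SAMPLE_RESPONSE)
print(records)
```

Only metadata arrives this way; the full text has to be obtained by another route, such as batch import.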
Specific Repository Software Issues (5) Record Handling (2) • Extract Methods • Native viewing system, batch export, metadata cross-walk, harvest (e.g. OAI-PMH (metadata only)), customised extract using API • Item Wrappers • Multiple files, multiple metadata records/schemas, internal structure mapping (e.g. METS, DIDL)
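A metadata cross-walk, as mentioned above, can be sketched as a field-by-field mapping from one schema to another. The local field names on the left are hypothetical; the targets are Dublin Core element names. Fields without a mapping are reported rather than silently dropped.

```python
# Hypothetical local schema mapped onto Dublin Core element names.
CROSSWALK = {
    "thesis_title": "title",
    "author": "creator",
    "abstract": "description",
    "award_date": "date",
}


def crosswalk(record, mapping):
    """Map a record into the target schema; also return any unmapped fields."""
    converted = {}
    unmapped = []
    for field_name, value in record.items():
        target = mapping.get(field_name)
        if target is None:
            unmapped.append(field_name)
        else:
            # Target elements are repeatable, so collect values in a list.
            converted.setdefault(target, []).append(value)
    return converted, unmapped


local_record = {
    "thesis_title": "An Example Thesis",
    "author": "Example, A.",
    "supervisor": "Sample, B.",
}
dc_record, lost_fields = crosswalk(local_record, CROSSWALK)
print(dc_record)
print(lost_fields)
```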
Specific Repository Software Issues (6) Digital Preservation (1) • Persistent Identifiers • Some available systems: Handle, PURL, URN, DOI, ARK • Migration • On Ingest (migrate submission to open format), or on request (preserve migration tool) • Viewers • Tools to render the format are preserved • Emulation • The original viewer is emulated in the new system
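Persistent identifiers work by placing a resolver between the citation and the current location of the item. A minimal sketch for two of the systems listed above, using the public Handle and DOI proxy resolvers; the identifier `12345/678` is hypothetical, and PURL, URN and ARK are left out.

```python
from urllib.parse import quote

# Public proxy resolvers for two of the identifier systems listed above.
RESOLVERS = {
    "handle": "https://hdl.handle.net/",
    "doi": "https://doi.org/",
}


def resolver_url(scheme, identifier):
    """Return the URL that resolves the identifier to its current location."""
    try:
        base = RESOLVERS[scheme]
    except KeyError:
        raise ValueError("no resolver known for scheme: " + scheme)
    # Keep the prefix/suffix separator; escape anything unsafe in a URL.
    return base + quote(identifier, safe="/")


# Hypothetical handle in prefix/suffix form.
print(resolver_url("handle", "12345/678"))
```

The citation keeps the identifier, so the repository can move the item and only needs to update the resolver's record.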
Specific Repository Software Issues (7) Digital Preservation (2) • Universal Virtual Computer (UVC) • Representation Information • Metadata regarding the representation of the file format • Global Digital Format Registry (GDFR) • Typed Object Model (TOM) Wheatley, P. 2003 “A way forward for developments in the digital preservation functions of DSpace: options, issues and recommendations” (http://dspace.org/news/readings.html)
Specific Repository Software Issues (8) Additional Functionality • Coherent data structuring • An internal structure that can represent your institution in one or more overlaying schemas • Native Browse • Hierarchical browsing, filtering by structure and metadata; aids indexing by search engines • Native Search • Constrained search locations, using browse functionality to display results • Full Text Indexing • Public API (Application Programming Interface) • Creating Portal-like services within the institution
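The coherent data structuring and hierarchical browse described above can be sketched as a simple tree in which each node holds items and child nodes. The `Node` class and the institutional structure shown are hypothetical, not the data model of any particular package.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One level of the institutional structure, e.g. a school or collection."""
    name: str
    items: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def count_items(node: Node) -> int:
    """Count the items held at this node and everywhere below it."""
    return len(node.items) + sum(count_items(c) for c in node.children)


def browse(node: Node, depth: int = 0) -> List[str]:
    """Return an indented listing of the structure with item counts."""
    lines = ["  " * depth + f"{node.name} ({count_items(node)})"]
    for child in node.children:
        lines.extend(browse(child, depth + 1))
    return lines


# Hypothetical structure: institution > school > collection.
root = Node("University", children=[
    Node("School of Physics", children=[
        Node("PhD Theses", items=["thesis-1", "thesis-2"]),
    ]),
    Node("School of History", items=["thesis-3"]),
])
print("\n".join(browse(root)))
```

Rendering such a listing as plain linked pages is one way a native browse can aid indexing by search engines.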
Final Remarks • No systems yet deal with all issues • Some good development work ongoing with the various packages • Not all issues need to be solved: • To provide an Institutional Repository • For your institution • The Institutional Repository is still in its infancy, and may not mature for another 10 years • There are significant policy and community issues that also need to be addressed.
Thanks for Listening Richard Jones r.d.jones@ed.ac.uk http://www.thesesalive.ac.uk/ http://www.sherpa.ac.uk/ JISC: http://www.jisc.ac.uk/ This presentation: http://www.thesesalive.ac.uk/archive/ePrintsUKWorkshop.ppt