Creation and Management of ETD Archive at IISc Bangalore using Open Source Software. By Filbert Minj, National Centre for Science Information, Indian Institute of Science, Bangalore – 560 012 (E-mail: filbert@ncsi.iisc.ernet.in)
Outline of the presentation • Introduction to Indian Institute of Science (IISc) and etd@IISc • Software selection for etd@IISc • Repository implementation steps • Thesis submission and archival • Handle Server configuration / OAI compliance • Challenges encountered in implementation • Back up and restore mechanism • Access policies • Conclusion
About IISc • Academic and research institution • 2500+ active researchers including 500+ faculty • 2000+ research publications per year • 200+ M.Sc. (Engineering) and Ph.D. theses per year
About ePrints@IISc • Research publications repository of IISc (http://eprints.iisc.ernet.in) • Pioneering efforts towards the cause of Open Access Initiative in India • Started in 2002 and has 32878+ publications as of now and growing steadily • Accessed significantly from around the world
About etd@IISc • Digital repository of theses and dissertations of the Indian Institute of Science • Started in June 2004 as a student project and launched as a service towards the end of February 2005 • Accessible at http://etd.ncsi.iisc.ernet.in/ • The repository has 1567 theses as of now and is growing slowly • IISc library has taken the initiative to archive old theses of the Institute • With formulation of the Institute’s policy, submissions are expected to grow steadily
Need for etd@IISc • A centralized system for managing and presenting the research output of the Institute in an organized fashion • Facilitates easy, fast, and open access to the intellectual output of the Institute • Preservation of and long-term access to the scholars' research output • Through OAI harvesting and Google indexing, theses in etd@IISc can be found immediately in global indexing and search services
Why DSpace? • Largest community of users and developers worldwide • Free open source software • Completely customizable to fit your needs • Used by educational, government, private and commercial institutions • Can be installed out of the box • Can be tried quickly on your own computer using the DSpace Live CD (http://cadair.aber.ac.uk/dspace/handle/2160/565) • Can manage and preserve all types of digital content
What is DSpace? • DSpace is a platform that allows you to • capture items in any format – in text, video, audio, and data. • It distributes it over the web. • It indexes your work, so users can search and retrieve your items. • It preserves your digital work over the long term.
Repository structure Repository is organized as communities and collections
Prerequisite Software (latest) • Latest release of DSpace, version 1.8.2 • UNIX-like OS or Microsoft Windows • Linux (recommended) • Oracle Java JDK 6 (standard SDK is fine, you don't need J2EE) • Apache Maven 2.2.x or higher (Java build tool) • Apache Ant 1.8 or later (Java build tool) • Relational Database: (PostgreSQL or Oracle) • Servlet Engine: (Apache Tomcat 5.5 or 6, Jetty, Caucho Resin or equivalent)
Repository implementation steps • Prototype Repository • Formulate key requirements • Metadata addition for Compliance with ETD-MS • Customization to meet the requirements • Creation of community and collections • Thesis submission and Archival • Back up and restore mechanism
Prototype Repository • A prototype repository for ETD using DSpace 1.2 was set up to understand • The system • Workflow • Local requirements • Compliance with standards • Value addition to be done
Key requirements • The prototype setup helped us to arrive at the following key requirements • System should support only post-approval (accepted) online submission of theses • Reflection of IISc divisions and departments as communities and collections • Compliance with ETD-MS metadata standard • Validation of student registration using students’ record database • Automatic community and collection assignment to students upon registration
Key requirements cont.. • Automatic metadata assignment and validation during online submission • e.g. Author’s details (extracted from the students’ database) • Support for assigning subject categories • Metadata and full text quality assessment by library staff • E-mail notifications to concerned parties during submission, approval and archiving processes
etd@IISc workflow [flowchart] • Actors: Student, Reviewer (Library staff), Academic Section, Advisors (Thesis Guide), Admin • Registration: registration request (SRNo, Email) is checked against the local copy of the Students’ Database; Request valid? Yes: registration completed, No: reject • Submission: Login, Workspace, Submission; Submission OK? Yes: approve and archive, No: reject
Compliance with ETD-MS metadata • thesis.degree.name • Name of the degree (Ph.D, MSc Engg. etc.) • thesis.degree.level • Level of the degree (Master, Doctorate) • thesis.degree.discipline • Discipline of the degree (Science, Engineering) • thesis.degree.grantor • Institution granting the degree (IISc)
Customization • Look and feel • Registration process • Submission fields added, e.g. Advisor, provision for subject classification etc. • E-mail notification in all stages of the submission process • Displays the total number of theses in the repository
Customization: Automatic association to a collection upon registration • Normally the administrator associates a user (eperson) with a collection for submission • etd@IISc automatically assigns a user to a collection • e.g. Email to be registered • shwetha@mcbl.iisc.ernet.in • Identifies the department mcbl (collection) from the email address • shwetha@mcbl.iisc.ernet.in -> Microbiology and Cell Biology (mcbl)
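The email-based assignment above can be sketched as follows. This is a minimal illustration, not the etd@IISc code: the function name `collection_for_email` and the lookup table are hypothetical, and only the `mcbl` entry is taken from the slide.

```python
from typing import Optional

# Hypothetical lookup table from e-mail subdomain to collection name.
# Only the "mcbl" entry comes from the slide; a real table would list
# every department of the Institute.
DEPARTMENT_COLLECTIONS = {
    "mcbl": "Microbiology and Cell Biology",
}

INSTITUTE_DOMAIN = "iisc.ernet.in"


def collection_for_email(email: str) -> Optional[str]:
    """Return the collection for an Institute e-mail address, or None."""
    local, sep, domain = email.strip().lower().partition("@")
    if not sep or not local:
        return None  # not a well-formed address
    suffix = "." + INSTITUTE_DOMAIN
    if not domain.endswith(suffix):
        return None  # not a departmental Institute address
    department = domain[: -len(suffix)]  # e.g. "mcbl"
    return DEPARTMENT_COLLECTIONS.get(department)


if __name__ == "__main__":
    print(collection_for_email("shwetha@mcbl.iisc.ernet.in"))
```

Returning None for unknown or non-Institute addresses leaves the decision (reject, or hand over to the administrator) to the caller.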
Communities and Collections of etd@IISc • A division as community which has many departments • A department as collection • A thesis from a department goes to respective collection • No subcommunity
Workflow steps in etd@IISc • Accept/Reject/Edit Metadata step is performed by library staff
Handle Server Configuration / Open Archives Initiative (OAI) compliance • etd@IISc creates a persistent identifier for every submitted thesis • The handle prefix provided by CNRI is ‘2005’ • e.g. http://hdl.handle.net/2005/140 is the URL of a thesis abstract page • The repository is OAI-compliant; the base URL is http://etd.ncsi.iisc.ernet.in/oai/request
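As an illustration of how the handle prefix and the OAI base URL are used, the sketch below builds a handle URL and standard OAI-PMH request URLs. The helper names `handle_url` and `oai_request_url` are hypothetical; the verbs `Identify` and `ListRecords` and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

# Values taken from the slide above.
OAI_BASE_URL = "http://etd.ncsi.iisc.ernet.in/oai/request"
HANDLE_PREFIX = "2005"


def handle_url(suffix) -> str:
    """Persistent URL for the item with the given handle suffix."""
    return f"http://hdl.handle.net/{HANDLE_PREFIX}/{suffix}"


def oai_request_url(verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{OAI_BASE_URL}?{query}"


if __name__ == "__main__":
    print(handle_url(140))
    print(oai_request_url("Identify"))
    print(oai_request_url("ListRecords", metadataPrefix="oai_dc"))
```

A harvester would fetch the `ListRecords` URL and follow resumption tokens; that part is omitted here.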
Challenges encountered in implementation… • Communities and Collections Strengths • A patch was developed by us (NCSI) for displaying Communities and Collections Strengths • This feature was accepted and is now part of the DSpace code base (v1.2.2 onwards) • Displays the total number of theses in etd@IISc
Challenges encountered in implementation… • Browse views for subject fields and Thesis Guide • Code was developed at NCSI • Accepted and is part of the DSpace code base • This feature is now available through configuration
Challenges encountered in implementation… • Creating metadata submission fields • Earlier this required editing JSP files and changing Java servlet files • Pretty simple now (edit input-forms.xml)
Challenges encountered in implementation… • Subject classification • To enable the submitters to include their thesis under the most appropriate subject headings, etd@IISc provides a classification scheme based on Dissertation Abstracts International (DAI)
Challenges encountered in implementation… • Pre-filled submission text box • e.g • Author text box • Identifier (SRNo of a student) • Thesis degree name (MSc. Engg, Ph.D) • Thesis degree level (Master, Doctoral) • Rights • Thesis grantor (IISc)
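The pre-filling described above can be sketched as a function that turns a student record into submission metadata. This is an assumption-laden illustration: the record keys (`name`, `sr_no`, `degree`), the function name and the use of `dc.contributor.author` and `dc.identifier` for author and SRNo are hypothetical choices; the `thesis.degree.*` names follow ETD-MS. The Rights field is left out because the slides do not give its text.

```python
# Degree names and levels as listed on the slide.
DEGREE_LEVELS = {
    "Ph.D": "Doctoral",
    "MSc Engg": "Master",
}

GRANTOR = "Indian Institute of Science"


def prefill_metadata(student: dict) -> dict:
    """Build pre-filled submission metadata from a student record.

    `student` is a hypothetical record with keys "name", "sr_no"
    and "degree", standing in for a row of the students' database.
    """
    degree = student["degree"]
    if degree not in DEGREE_LEVELS:
        raise ValueError(f"unknown degree: {degree!r}")
    return {
        "dc.contributor.author": student["name"],  # assumed field for Author
        "dc.identifier": student["sr_no"],         # assumed field for SRNo
        "thesis.degree.name": degree,
        "thesis.degree.level": DEGREE_LEVELS[degree],
        "thesis.degree.grantor": GRANTOR,
    }


if __name__ == "__main__":
    record = {"name": "Example Student", "sr_no": "0000", "degree": "Ph.D"}
    for field, value in prefill_metadata(record).items():
        print(f"{field}: {value}")
```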
Challenges encountered in implementation… • Upgrading from lower versions to higher versions • Reason: a lot of customization has to be carried over • Version compatibility of the PostgreSQL database • Backup/restore of the database
Challenges Always • Self Archiving • Archiving back volumes of ETD • Proper metadata tagging
Back up and restore mechanism • What is backed up: database, assetstore, configuration files, Web pages (jspui, xmlui) • Scripts written for backup (rsync tool) • Scripts are scheduled (crontab utility) • In case of a system crash, the backup is restored to a mirror site and etd@IISc will be up again in seconds
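A backup script along these lines might look like the sketch below. It only builds and prints the commands (a dry run); all paths, the mirror host, the database name and user are hypothetical, and the actual etd@IISc scripts are not shown in the slides. `pg_dump -U/-f` and `rsync -a --delete` are standard options of those tools.

```python
import shlex

# Hypothetical locations; adjust to the actual installation.
DSPACE_DIR = "/dspace"
MIRROR = "backup@mirror.example.org:/backup/etd"
DB_NAME = "dspace"
DB_USER = "dspace"


def backup_commands(dump_file: str = "/dspace/backups/dspace-db.sql") -> list:
    """Return the commands for one backup run, in execution order."""
    return [
        # 1. Dump the PostgreSQL database to a plain SQL file.
        ["pg_dump", "-U", DB_USER, "-f", dump_file, DB_NAME],
        # 2. Mirror the assetstore (bitstreams); --delete keeps the
        #    mirror an exact copy.
        ["rsync", "-a", "--delete",
         f"{DSPACE_DIR}/assetstore/", f"{MIRROR}/assetstore/"],
        # 3. Copy configuration files and Web pages (jspui, xmlui).
        ["rsync", "-a", f"{DSPACE_DIR}/config/", f"{MIRROR}/config/"],
        ["rsync", "-a", f"{DSPACE_DIR}/webapps/", f"{MIRROR}/webapps/"],
        # 4. Copy the database dump itself.
        ["rsync", "-a", dump_file, f"{MIRROR}/db/"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands. To execute them, replace print with
    # subprocess.run(command, check=True).
    for command in backup_commands():
        print(shlex.join(command))
```

Scheduling is then a single crontab entry, for example `0 2 * * * python3 /dspace/bin/backup.py` for a nightly run at 02:00 (the script path is again hypothetical).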
Access Policies • Full-text access by registration • Abstracts available to everyone • Registration allowed only for Institute users
Conclusion • etd@IISc is a digital repository of theses and dissertations of IISc • Facilitates better means to capture, store, process, and disseminate the intellectual output of IISc • A prototype repository was used to formulate key requirements • DSpace was implemented and customized to meet our requirements • We are observing the various operational implications of the repository and are very keen to incorporate further improvements
Thanks for listening. Any questions?