Establishing National Digital Repository System employing Harvesting Model
Surinder Kumar, Technical Director, NIC, New Delhi
suri@nic.in, 011-24305503
IRs…contd
• At present, the University of Southampton's worldwide registry of OAI-compliant open access repositories lists more than 1000 repositories. India accounts for around 50 of these IRs. To present them as a single virtual archive and to provide seamless search, it is becoming essential to connect research repositories and resource discovery services into a national digital repository system. Examples are CARL, ARROW, DRIVER, etc.
National Digital Repository System
• To build an appropriate NDRS, the existing infrastructure needs to be analyzed.
• Technology components
• Requisite hardware
• OS
• IR software such as DSpace and EPrints
• Interoperability among IRs has been proven with the development of the OAI-PMH protocol by the OAI.
Technical Model of NDRS
• Alma Swan and Chris Awre describe three models in "Linking UK Repositories". These are:
• Centralized Model
• Distributed Model
• Harvesting Model
Centralized Model
• Metadata and content are submitted directly to a central server.
Advantages
• Complete control of the whole process, from article deposition through to the user interface
• Software selection
• Ability to manage preservation issues
Disadvantages
• It is an expensive option
• It may surpass the existing institutional repositories
Distributed Model
• All metadata and content remain in their source locations, and the metadata is searched on the fly.
Advantages
• Provides up-to-date metadata, since every search goes directly to the source locations
• Much less expensive than the centralized model
Disadvantages
• No enhancement of metadata
• Network dependent
• Not many IRs support Z39.50 or SRU/W
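The on-the-fly search of the distributed model can be sketched as a fan-out of one query to every repository's search endpoint. The example below only builds SRU searchRetrieve request URLs. The endpoint addresses and the `fan_out` helper are hypothetical, and a real service would also have to send the requests, handle timeouts, and merge the result sets.

```python
from urllib.parse import urlencode

# Hypothetical SRU endpoints, one per participating repository.
ENDPOINTS = [
    "https://ir-a.example.org/sru",
    "https://ir-b.example.org/sru",
]


def sru_search_url(base_url, cql_query, max_records=10):
    """Build an SRU 1.2 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "maximumRecords": max_records,
    }
    return f"{base_url}?{urlencode(params)}"


def fan_out(cql_query):
    """Return one request URL per repository for the same query."""
    return [sru_search_url(endpoint, cql_query) for endpoint in ENDPOINTS]


urls = fan_out('dc.title = "malaria"')
for url in urls:
    print(url)
```

Because every search touches every endpoint, one slow or unreachable repository delays or truncates the result, which is the network dependence listed among the disadvantages.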
Harvesting Model
• It is a hybrid model: metadata is harvested into a central searchable server, while the content (full text) remains distributed and is served by the individual repositories.
• Under this model, a service provider harvests metadata from existing institutional repositories using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
• The service provider can enhance the quality of the metadata and offer various services from its centralized server.
• The metadata can be further exposed via OAI-PMH, SRU/W, or RSS feeds for use by other service providers.
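A minimal sketch of one harvesting step, assuming a repository that exposes unqualified Dublin Core (`oai_dc`) over OAI-PMH. The base URL, the sample response, and the function names are illustrative assumptions. The example parses an embedded response instead of calling a live server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, resumption_token=None, from_date=None):
    """Build a ListRecords request; a resumptionToken replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
        if from_date:
            params["from"] = from_date  # selective (incremental) harvesting
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "deleted": header.get("status") == "deleted",
            "titles": [t.text for t in rec.iterfind(".//dc:title", NS)],
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Hand-written sample response (assumption, not real repository output).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:ir.example.org:123</identifier>
        <datestamp>2010-05-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("https://ir.example.org/oai", resumption_token=token)
```

A harvester would keep requesting the URL built from the latest token until the response carries no token, then store the records on the central server.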
Harvesting Model-advantages
• OAI-PMH is a standard protocol that is easy to implement
• Only unqualified Dublin Core is required for OAI compliance; more complex metadata schemas can also be employed
• The institutional archives already employ software that supports OAI-PMH
• Harvesting can be carried out by automatic scheduled tasks
Harvesting Model-disadvantages
• Only unqualified Dublin Core is mandated for harvesting, and it lacks the rich semantics of other metadata schemas
• The metadata exposed by the services may not always be the latest, and changes made to metadata at the source may not be reflected in the central server
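One way to limit stale metadata is to re-harvest incrementally and let the record datestamp decide which copy the central server keeps. The sketch below assumes records are plain dictionaries with `identifier`, `datestamp`, and an optional `deleted` flag. The index layout and function names are hypothetical.

```python
def apply_harvest(index, harvested):
    """Merge harvested records into the central index, keeping the newest copy.

    Datestamps are assumed to be UTC strings of one granularity
    (for example YYYY-MM-DD), so string comparison orders them correctly.
    """
    for rec in harvested:
        current = index.get(rec["identifier"])
        if current is not None and current["datestamp"] >= rec["datestamp"]:
            continue  # the central copy is already as new as the harvested one
        index[rec["identifier"]] = rec  # deleted records are kept as tombstones
    return index


def live_records(index):
    """Records that should be searchable, i.e. not marked as deleted."""
    return [rec for rec in index.values() if not rec.get("deleted")]


index = {}
apply_harvest(index, [
    {"identifier": "a", "datestamp": "2010-01-01", "title": "Old title"},
])
apply_harvest(index, [
    {"identifier": "a", "datestamp": "2010-03-01", "title": "New title"},
    {"identifier": "b", "datestamp": "2010-03-02", "deleted": True},
])
# A late-arriving older copy must not overwrite the newer one.
apply_harvest(index, [
    {"identifier": "a", "datestamp": "2009-12-01", "title": "Older title"},
])
```

Keeping tombstones for deleted records prevents an older harvest from silently re-adding a record that the source repository has withdrawn. This only helps when the data provider updates datestamps and reports deletions correctly.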
NDRS-Accepted Harvesting Model
• The OAI-PMH harvesting model has clear advantages over the other models
• It has gained worldwide acceptance
• It makes it easy to share information about scholarly resources and to offer enhanced resource discovery tools
• It has been adopted by thousands of institutions around the world
NDRS-benefits
The National Digital Repository System would offer a number of benefits to end users as well as to the various stakeholders of the institutions.
Benefits to IR Administrator
• The IR administrator would only maintain the content of the repository while offering metadata to the service provider.
• NDRS would be in a better position to provide long-term preservation through appropriate metadata provision and/or content packaging.
• It would offer enhanced metadata to end users.
NDRS-benefits…contd
End Users as readers and searchers
• NDRS would give end users access to a large number of repositories through one service, instead of requiring them to visit each repository individually.
• It would push content to end users through RSS/Atom feeds.
• It would provide document delivery services to end users.
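Pushing content through a feed can be sketched as serializing the most recently harvested records into RSS 2.0. The record fields (`title`, `url`, `identifier`) and the channel details below are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, records):
    """Serialize harvested records as a minimal RSS 2.0 feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Recently harvested records"
    for rec in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["title"]
        ET.SubElement(item, "link").text = rec["url"]
        # An OAI identifier is not a URL, so it is flagged as a non-permalink guid.
        ET.SubElement(item, "guid", isPermaLink="false").text = rec["identifier"]
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "NDRS new records",                 # hypothetical channel title
    "https://ndrs.example.org/",        # hypothetical service address
    [{
        "title": "Sample thesis",
        "url": "https://ir.example.org/handle/123",
        "identifier": "oai:ir.example.org:123",
    }],
)
print(feed)
```

Each item links back to the source repository, so the full text is still delivered by the individual IR, as in the harvesting model.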
NDRS-benefits…contd
End Users as content managers
• NDRS would provide a means to expose authors' work, making it widely available to their peers across the globe.
• It would be able to provide preservation and metadata enhancement capabilities to support long-term storage of and access to the content.
NDRS-benefits…contd
Content Aggregators
• NDRS would offer value-added services of its own to enhance the aggregated metadata and supply it back to the repository concerned.
• It would provide a single point of information for statistics about access to and downloads of data.
• It would offer a single point of access to multiple sources of research and other materials to aid discovery.
• It would be able to offer selected collections with value-added services built on top of them.
Impediments in implementing NDRS
• Technical issues at the data provider level, such as installation of IR software, server setup, server malfunction, data backup, and updating of IR software
• Technical issues at the service provider level: successful harvesting requires an error-free network, proper use of Dublin Core metadata fields and data sets, and correct use of date stamps
• Coordination among IR members
• Federated authentication and authorization
• Long-term preservation, formats, migration, and access
• Sustainability in providing long-term access to NDRS
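The date stamp problem can be illustrated with a small validator. OAI-PMH datestamps are UTC values in either day granularity (`YYYY-MM-DD`) or seconds granularity (`YYYY-MM-DDThh:mm:ssZ`). The function name and the test values below are illustrative assumptions.

```python
import re
from datetime import datetime

_DAY = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
_SECONDS = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")


def is_valid_datestamp(value):
    """Check that a datestamp uses one of the two OAI-PMH granularities."""
    if _DAY.fullmatch(value):
        fmt = "%Y-%m-%d"
    elif _SECONDS.fullmatch(value):
        fmt = "%Y-%m-%dT%H:%M:%SZ"
    else:
        return False
    try:
        datetime.strptime(value, fmt)  # rejects impossible dates such as 30 February
    except ValueError:
        return False
    return True


for stamp in ["2010-05-01", "2010-05-01T12:30:00Z", "01/05/2010",
              "2010-02-30", "2010-05-01T12:30:00+05:30"]:
    print(stamp, is_valid_datestamp(stamp))
```

A service provider could run such a check on harvested headers and report malformed values back to the data provider, since incremental harvesting with `from` and `until` depends on these values being correct.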
Current Scenario of Institutional Repositories in India
The Registry of Open Access Repositories (ROAR) lists 52 registered repositories from India; the actual number may be higher, as certain repositories have not yet been registered with ROAR.
Analysis of IRs in India
• Out of 52, 13 were not functional at the time of writing the paper
• A number of them are not being updated
• On closer inspection, they have not reached critical mass
Current Scenario..contd
• According to the 2010 Webometrics survey, which ranks the world's open access repositories by visibility, quality, and available items [18], seven repositories from India are listed; their details are given in the following table.
Current Scenario-service providers
• There are 9 service providers in the country harvesting data. The majority of them follow OAI-PMH, and the harvesting software used is PKP Harvester. Out of the 9, four are not functional, though they are highly cited in the literature.
Proposed NDRS
• To establish successful, well-populated national-level repositories, we need to look at the prevailing information systems in our country, for example ICMR, CSIR, ICAR, ENVIS, the Department of Atomic Energy, and ISRO.
• The onus should be on these national information systems to ensure that "publications arising out of public funded research" are made available free of cost to researchers.
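[Figure: Proposed NDRS architecture. Labels recovered from the diagram: institutional repositories (IR) connected via OAI-PMH to national-level nodes (IISc, Social Science, ICMR, CSIR, Inflibnet, Agriculture); these nodes connected via OAI-PMH to NDRS; NDRS services: metadata refinement, document delivery, alert service, RSS (for further processing).]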
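The two-tier arrangement in the figure can be sketched as a mapping from national-level nodes to the institutional repositories they harvest, with NDRS merging everything by record identifier. The repository names, the `fake_fetch` stand-in, and the record fields are hypothetical. The figure does not state which IR belongs to which node.

```python
# Hypothetical assignment of institutional repositories to national-level nodes.
TOPOLOGY = {
    "ICMR": ["ir-icmr-1", "ir-icmr-2"],
    "CSIR": ["ir-csir-1"],
}


def harvest_national(topology, fetch_records):
    """Merge records from every IR of every node, keeping the first copy seen."""
    central = {}
    for node, repositories in topology.items():
        for repo in repositories:
            for record in fetch_records(repo):
                central.setdefault(
                    record["identifier"],
                    {**record, "node": node, "source": repo},
                )
    return central


def fake_fetch(repo):
    """Stand-in for an OAI-PMH harvest of one repository (no network access)."""
    data = {
        "ir-icmr-1": [{"identifier": "oai:icmr1:1", "title": "Paper A"}],
        "ir-icmr-2": [{"identifier": "oai:icmr2:7", "title": "Paper B"}],
        "ir-csir-1": [{"identifier": "oai:icmr1:1", "title": "Paper A (copy)"}],
    }
    return data.get(repo, [])


central = harvest_national(TOPOLOGY, fake_fetch)
```

Recording the node and source repository with each record lets the central service point users back to the IR that holds the full text and lets duplicates harvested through two routes be detected.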
NDRS-Recommendations
• The country needs a national body, like JISC in the UK, that provides advisory as well as technical services to individual repositories
• National-level organizations should be given responsibility for setting up national resource centres that harvest data from their respective institutional repositories
• Develop strategies to make institutional repositories a permanent and sustainable part of the national and local research infrastructure
• Issue guidelines to institutional members on mediated or voluntary deposit and on the need for mandatory deposit of papers and dissertations
• Develop guidelines for metadata entry and the best practices to be followed
Conclusion
• The new challenge is to create an environment based on the OAI protocol so that public funded research is made available to the whole community
• A national-level body is needed so that the development of institutional repositories becomes more coherent; such a body could provide advisory services and promote the guidelines and best practices followed by national-level systems such as DRIVER, DAREnet, and HAL