OAIster

OAIster Kat Hagedorn University of Michigan Libraries September 12, 2007

Outline • Brief history/overview of OAI • Why OAIster was created • OAIster: digital union catalog • EPrints, open access and OAIster • Google (and others) and OAIster

What is OAI? • OAI stands for Open Archives Initiative “…develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.” • Probably should have been called SAI: Shared Archives Initiative • Includes a Protocol for Metadata Harvesting (PMH), i.e., what we use to fill OAIster • Consists of data providers and service providers

Metadata records • Data providers use protocol to share their metadata records • Service providers harvest the metadata so they can provide a service using them • Metadata needs to be • XML1.1 compliant • UTF-8 enabled • Sufficient for discovery

OAI: what it is not • OAI ≠ open access • “…defining and promoting machine interfaces that facilitate the availability of content from a variety of providers. Openness does not mean ‘free’ or ‘unlimited’ access to the information repositories that conform to the OAI-PMH.” • However, a large majority of OAIster records are available to all and sundry • Perfect opportunity-- freely sharing free stuff

Why OAIster? • Initially, wanted to build the Academic HotBot (now we would say the Academic Google) • Essentially, a union catalog of digital objects that often can’t be roboted or spidered • Currently, have more records that link to “objects” than there are records in our OPAC: 13+ million

What does OAIster contain? • Pre-prints, post-prints, published articles, grey literature, scanned images, archival videos… • Harvest everything available • except obvious test repositories • Keep nearly everything • must have a valid digital object link • must have decent metadata • must be scholarly or informational

http://memory.loc.gov/mbrs/varsmp/0526.mpg Library of Congress Digitized Historical Collections http://name.umdl.umich.edu/ADM0370.0002.001 University of Michigan Digital Collections

Why do (should) people use it? • It’s big-- will pass 14 million shortly • It’s varied-- besides articles, photos, and videos, it contains datasets, audio files, finding aids, manuscripts… • It keepsgrowing-- as long as they keep paying my salary

EPrints in OAIster • Kept pace with EPrints movement • Currently actively harvest 138 EPrints repositories (another 18 are inactive) • All told, 190K+ records • A word about repository discovery…

EPrints in OAIster • Not a drop in the bucket, when consider that EPrints is less than a decade old • Don’t forget that OAIster contains more than records pointing to full-text • And it also doesn’t point to only open access materials

Restricted materials • Effort needed to partition restricted from freely accessible • Often, full-text objects are embargoed, but not reflected in metadata • Also felt not providing restricted materials could be a disservice

Benefit of EPrints in OAIster • All valid records in one place • Actively checked and re-harvested • Inclusive search, so end-users retrieve associated and serendipitous materials

When use OAIster, when Google? • Google vs. everything else • Google can’t get at everything, until it starts using OAI itself (DSpace aside) • Google contains junk, OAIster rarely does (anymore) • Most importantly, we use metadata, Google doesn’t

Questions? • Kat Hagedorn • University of Michigan Libraries • Digital Library Production Service • www.oaister.org • khage@umich.edu

OAIster

OAIster

Presentation Transcript

University of Michigan’s OAIster Service Provider

OAIster != Google

Finding Pearls using OAIster

OAIster: What’s with the Weird Name?

University of Michigan’s OAIster Progress Report

OAIster, OAI, and Metadata