Open Archives Initiative Protocol for Metadata Harvesting
Collections in isolation • Some thoughts • A wonderful collection is of limited use if it is not well known. • Very redundant collections are often wasteful
Virtual collections • Some collections do not contain actual materials, only information about materials and links to the home site. • How do these virtual collections get the information about other collections? How do they stay up to date? • --> The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
OAI-PMH • Protocol -- an agreement to exchange messages and interpret them according to strict rules. • Metadata -- data about the data: information about the material in the collection • Harvesting -- gathering the desired part of a collection's metadata for further use
The protocol • See http://www.openarchives.org/OAI/openarchivesprotocol.html • Two sides - the repository and the harvester • The repository (data provider) • Prepares the required metadata • Responds to the harvester's queries • Acts like a server - responding to queries when they come • The harvester (data gatherer) • Gathers the metadata from the collections • Organizes the harvested metadata in a way that serves its purpose. • Acts like a client - requesting service when it needs it.
Resource, item, record • Resource: the actual content of the collection; the point of the digital library • Item: a part of the repository that generates the metadata. • Record: metadata in a specific format available for dissemination. • Encoded in XML • Unique identifier • Datestamp • setSpec • Optional status
Sets • Repositories may organize items into sets • Allows selective harvesting • Each node in a set organization has • setSpec (sets may be hierarchical; if so, the levels are separated by colons) • setName • setDescription
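The colon convention for hierarchical sets can be illustrated with a short sketch. The helper names `set_ancestors` and `in_set` and the sample setSpec values are made up for illustration; they are not part of the protocol.

```python
def set_ancestors(set_spec):
    """Return the setSpec of every level, from the top-level set down to set_spec."""
    parts = set_spec.split(":")  # hierarchy levels are separated by colons
    return [":".join(parts[: i + 1]) for i in range(len(parts))]


def in_set(record_set_specs, wanted):
    """True if any of a record's setSpecs is `wanted` or a descendant of it."""
    return any(wanted in set_ancestors(spec) for spec in record_set_specs)
```

For example, `set_ancestors("institution:florida:math")` yields the three levels `institution`, `institution:florida`, and `institution:florida:math`, so a record in the deepest set also counts as a member of the two sets above it.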
Requests • Each request is embedded in an HTTP request • Valid OAI-PMH requests (verbs): • GetRecord • Identify • ListIdentifiers • ListMetadataFormats • ListRecords • ListSets
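As a rough sketch of how a request is embedded in an HTTP request, the verb and its arguments can be sent as query parameters on the repository's base URL. The base URL `http://example.org/oai`, the sample identifier, and the function name `build_request_url` are assumptions for illustration only.

```python
from urllib.parse import urlencode

# The six request types listed above.
VERBS = {
    "GetRecord", "Identify", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}


def build_request_url(base_url, verb, **arguments):
    """Build the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    # urlencode percent-escapes reserved characters such as ':' in identifiers.
    return base_url + "?" + urlencode({"verb": verb, **arguments})


print(build_request_url("http://example.org/oai", "Identify"))
print(build_request_url(
    "http://example.org/oai", "GetRecord",
    identifier="oai:example.org:1", metadataPrefix="oai_dc",
))
```

The resulting URL could be fetched with any HTTP client, for instance `urllib.request.urlopen`.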
GetRecord • Required arguments • identifier = unique identifier of the item whose record is requested • metadataPrefix = prefix naming the metadata format in which the record should be returned • Example: oai_dc (the OAI version of the Dublin Core -- standard 15 elements, no extension.) • Errors: badArgument, cannotDisseminateFormat, idDoesNotExist
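A harvester has to tell a successful GetRecord response from an error response. The sketch below assumes the OAI-PMH 2.0 XML namespace and parses two hand-written sample responses; the class `OaiError`, the function `parse_get_record`, and the sample documents (abbreviated, with no metadata section) are illustrative only.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


class OaiError(Exception):
    """Raised when the repository answers with an <error> element."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def parse_get_record(xml_text):
    """Return the header fields of the record in a GetRecord response."""
    root = ET.fromstring(xml_text)
    error = root.find("oai:error", OAI_NS)
    if error is not None:
        raise OaiError(error.get("code"), (error.text or "").strip())
    header = root.find("oai:GetRecord/oai:record/oai:header", OAI_NS)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
        "setSpecs": [s.text for s in header.findall("oai:setSpec", OAI_NS)],
        "status": header.get("status"),  # None unless e.g. status="deleted"
    }


# Hand-written sample responses (abbreviated; not from a real repository).
SAMPLE_OK = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record><header>
    <identifier>oai:example.org:1</identifier>
    <datestamp>2004-05-01</datestamp>
    <setSpec>math</setSpec>
  </header></record></GetRecord>
</OAI-PMH>"""

SAMPLE_ERROR = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <error code="idDoesNotExist">No matching identifier</error>
</OAI-PMH>"""
```

Calling `parse_get_record(SAMPLE_OK)` returns the identifier, datestamp, and setSpecs from the header, while `parse_get_record(SAMPLE_ERROR)` raises `OaiError` with `code` set to `idDoesNotExist`.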
Identify • No arguments • Requests information about the repository. • Response includes • repositoryName • baseURL • protocolVersion • earliestDatestamp • deletedRecord (how the repository handles deletions -- no, transient, persistent) • granularity (how finely can the datestamp be specified?) • adminEmail • compression (what schemes are supported) • description (optional)
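The granularity reported by Identify determines how a harvester should write datestamps in later requests. OAI-PMH 2.0 defines two granularity values, `YYYY-MM-DD` and `YYYY-MM-DDThh:mm:ssZ`, both expressed in UTC. The helper name `format_datestamp` below is hypothetical, and the sketch assumes a timezone-aware `datetime`.

```python
from datetime import datetime, timezone

# Map each granularity value onto a strftime pattern.
FORMATS = {
    "YYYY-MM-DD": "%Y-%m-%d",
    "YYYY-MM-DDThh:mm:ssZ": "%Y-%m-%dT%H:%M:%SZ",
}


def format_datestamp(moment, granularity):
    """Format a timezone-aware datetime as a UTC datestamp at the given granularity."""
    if granularity not in FORMATS:
        raise ValueError(f"unknown granularity: {granularity}")
    return moment.astimezone(timezone.utc).strftime(FORMATS[granularity])


when = datetime(2004, 5, 1, 12, 30, 0, tzinfo=timezone.utc)
print(format_datestamp(when, "YYYY-MM-DD"))            # 2004-05-01
print(format_datestamp(when, "YYYY-MM-DDThh:mm:ssZ"))  # 2004-05-01T12:30:00Z
```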
ListIdentifiers • Required argument • metadataPrefix • Optional arguments • from • until • set • Exclusive argument • resumptionToken (flow control token for resuming a previous incomplete ListIdentifiers request) • Errors: badArgument, badResumptionToken, cannotDisseminateFormat, noRecordsMatch, noSetHierarchy
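The flow control described above can be sketched as a loop: the first request carries metadataPrefix and any optional arguments, and each follow-up request carries only the resumptionToken from the previous response, until the token is absent or empty. The name `harvest_identifiers` and the `fetch` callable (which takes a dict of request parameters and returns the response XML as text) are assumptions for illustration, and error responses are not handled here.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix, **selective):
    """Yield every identifier from a ListIdentifiers request, following resumptionTokens.

    `selective` may hold the optional arguments from, until, and set
    (pass `from` via dict unpacking, since it is a Python keyword).
    """
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix, **selective}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = next(root.iter(OAI + "resumptionToken"), None)
        if token is None or not (token.text or "").strip():
            return  # no token, or an empty token: the list is complete
        # resumptionToken is exclusive: send it alone, with only the verb.
        params = {"verb": "ListIdentifiers", "resumptionToken": token.text.strip()}
```

Passing `fetch` in as a parameter keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.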
ListMetadataFormats • Optional argument • identifier (if the metadata formats are needed only for one particular item) • Errors: badArgument, idDoesNotExist, noMetadataFormats • Response includes both the metadataPrefix and the associated schema
ListRecords • Required argument • metadataPrefix - only records available in the specified metadata format are returned • Optional arguments • from • until • set • Exclusive argument • resumptionToken • Errors: badArgument, badResumptionToken, cannotDisseminateFormat, noRecordsMatch, noSetHierarchy
ListSets • Exclusive argument • resumptionToken (used to continue a previous incomplete response to ListSets) • Errors: badArgument, badResumptionToken, noSetHierarchy
Resources • Compliance testing - www.dlib.vt.edu/projects/OAI/repexp/repexp.html • OAI-PMH - www.openarchives.org/OAI/openarchivesprotocol.html • Implementation Guidelines - www.openarchives.org/OAI/2.0/guidelines.htm