Hub and Spoke (H&S) Repository Interoperability Architecture with a forward-looking emphasis on preservation metadata and activities
The problem
• Plethora of repositories
• Not just across institutions, but even within a single institution
• Overabundance of data sources
• Web crawlers like Heritrix or OCLC's WAW, digitization and scanning services, individual authors, batch ingest from legacy systems
• Current integration solutions are local and ad hoc
The solution
• A common METS-based profile
• A standard programming API
• A series of scripts that use the API and METS profile for creating SIPs and DIPs which can be used across different repositories
METS profile
• DRAFT: http://dli.grainger.uiuc.edu/echodep/METS/DRAFTS_2006-06-29/METSProfile.xml
• Foci
• Repository interoperability
• minimally at the file and descriptive metadata level, probably not at the structural level
• Digital preservation
• Web captures
• Administrative metadata: technical and provenance
• Integrating the PREMIS data model into METS
• Priority in preserving the ‘representation’: descriptive metadata, content, and structure
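As a rough illustration of the sections the profile's foci touch, the sketch below builds a skeletal METS document with descriptive metadata, technical and provenance administrative metadata, a file section, and a structural map. It is a sketch under stated assumptions, not the draft profile itself: the function name `build_hub_mets`, the section IDs, the use of a Dublin Core title, and the `OTHERMDTYPE="PREMIS"` wrappers are hypothetical choices made for this example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)
ET.register_namespace("dc", DC_NS)


def m(tag):
    """Return a METS-qualified tag name."""
    return f"{{{METS_NS}}}{tag}"


def build_hub_mets(object_id, title, file_href, mimetype):
    """Build a skeletal METS tree (hypothetical IDs and layout)."""
    root = ET.Element(m("mets"), {"OBJID": object_id})

    # Descriptive metadata: a single Dublin Core title, for illustration only.
    dmd = ET.SubElement(root, m("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, m("mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, m("xmlData"))
    ET.SubElement(xml_data, f"{{{DC_NS}}}title").text = title

    # Administrative metadata: empty wrappers where PREMIS-based technical
    # and provenance metadata would go (assumed placement).
    amd = ET.SubElement(root, m("amdSec"), {"ID": "AMD1"})
    for tag, sec_id in (("techMD", "TECH1"), ("digiprovMD", "PROV1")):
        sec = ET.SubElement(amd, m(tag), {"ID": sec_id})
        ET.SubElement(sec, m("mdWrap"),
                      {"MDTYPE": "OTHER", "OTHERMDTYPE": "PREMIS"})

    # File section: one content file, linked rather than embedded.
    file_sec = ET.SubElement(root, m("fileSec"))
    grp = ET.SubElement(file_sec, m("fileGrp"))
    file_el = ET.SubElement(grp, m("file"),
                            {"ID": "FILE1", "MIMETYPE": mimetype,
                             "ADMID": "TECH1 PROV1"})
    ET.SubElement(file_el, m("FLocat"),
                  {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": file_href})

    # Structural map: a single division pointing at the file.
    struct = ET.SubElement(root, m("structMap"))
    div = ET.SubElement(struct, m("div"), {"DMDID": "DMD1"})
    ET.SubElement(div, m("fptr"), {"FILEID": "FILE1"})
    return root
```

Calling `ET.tostring(build_hub_mets("obj-1", "Sample item", "image.jpg", "image/jpeg"))` serializes the tree; a real profile-conformant document would carry far more metadata than this skeleton.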
Phased implementation
• Phase 1: Interoperability
• Phase 2: Persistent Storage Layer
• METS Profile as an AIP
• JSR-170 content repository standard
• End-user Access (search/browse/render) is low priority
Details of current implementation
• Based on ingest scripts that were developed to support the repository evaluation
• Java, except at the outermost layers where native API calls are utilized
• We consider this a proof-of-concept implementation; the goal is to demonstrate round-trip interoperability between three repositories: DSpace, Eprints, and Fedora
• Currently working with minimal administrative metadata
To-Hub Spoke [diagram: Data Store / DIPs, To-Hub Spoke, Hub]
• Extract format-specific technical metadata
• Generate/collect digital provenance metadata
• Embed links to digital items (image.jpg)
• Model structure of the item
• Embed native metadata
• Transform/enrich native metadata (metadata.xml)
• Hub: Generate/collect provenance metadata
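The to-hub steps listed above can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's Java code, and every name in it (`technical_metadata`, `provenance_event`, `to_hub`, the dictionary keys, and the use of SHA-256) is a hypothetical choice for this example. The transform/enrich step is left out, since it depends on each repository's native metadata format.

```python
import hashlib
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def technical_metadata(path: Path) -> dict:
    """Extract simple technical metadata for one file."""
    data = path.read_bytes()
    mime, _ = mimetypes.guess_type(path.name)
    return {
        "name": path.name,
        "size": len(data),
        "mimetype": mime or "application/octet-stream",
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def provenance_event(event_type: str, detail: str) -> dict:
    """Record a digital provenance event with a UTC timestamp."""
    return {
        "type": event_type,
        "detail": detail,
        "dateTime": datetime.now(timezone.utc).isoformat(),
    }


def to_hub(dip_dir: Path, native_metadata_name: str = "metadata.xml") -> dict:
    """Collect what a hub record would need from one DIP directory."""
    files = sorted(p for p in dip_dir.iterdir() if p.is_file())
    content = [p for p in files if p.name != native_metadata_name]
    native = dip_dir / native_metadata_name
    return {
        # Format-specific technical metadata, one entry per content file.
        "technical": [technical_metadata(p) for p in content],
        # Links to the digital items (relative names in this sketch).
        "links": [p.name for p in content],
        # A flat structure: one item containing its files.
        "structure": {"label": dip_dir.name,
                      "children": [p.name for p in content]},
        # Native metadata embedded as-is.
        "native_metadata": (native.read_text(encoding="utf-8")
                            if native.exists() else None),
        "provenance": [provenance_event(
            "to-hub export",
            f"read {len(content)} file(s) from {dip_dir.name}")],
    }
```

Given a directory holding `image.jpg` and `metadata.xml`, `to_hub(Path(...))` returns a plain dictionary that a later step could serialize into the hub METS document.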
From-Hub Spoke [diagram: Hub, From-Hub Spoke, SIPs]
• Hub: Generate provenance metadata
• Transform hub metadata to repository-compatible metadata (metadata.xml)
• Assemble into packages for repository ingest
• Add the METS file as an item in the submission package (hubMets.xml)
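The from-hub steps above might look roughly like the sketch below, which assembles a submission package that carries the transformed metadata, the content files, and the hub METS file itself. This is an illustration only: the zip layout, the flat `<field>` metadata format, and the names `to_repository_metadata` and `assemble_sip` are assumptions made for this example. DSpace, Eprints, and Fedora each define their own ingest formats, which the actual spokes would have to target.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path


def to_repository_metadata(hub_fields: dict) -> bytes:
    """Transform hub metadata into a flat, hypothetical repository format."""
    root = ET.Element("metadata")
    for name, value in sorted(hub_fields.items()):
        ET.SubElement(root, "field", {"name": name}).text = value
    return ET.tostring(root, encoding="utf-8")


def assemble_sip(sip_path: Path, hub_fields: dict,
                 hub_mets: bytes, content: dict) -> list:
    """Write a zip-based SIP and return the names of its members."""
    with zipfile.ZipFile(sip_path, "w", zipfile.ZIP_DEFLATED) as sip:
        # Repository-compatible metadata derived from the hub record.
        sip.writestr("metadata.xml", to_repository_metadata(hub_fields))
        # The hub METS file travels along as an ordinary item, so the
        # preservation metadata survives the round trip.
        sip.writestr("hubMets.xml", hub_mets)
        # Content files, keyed by member name.
        for name, data in content.items():
            sip.writestr(name, data)
        return sip.namelist()
```

Keeping `hubMets.xml` inside the package is what lets a later to-hub pass recover the full hub record even when the target repository stores only a subset of the metadata natively.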
Future Work
• True repository for Archival Information Packages (AIPs)
• Add a persistence layer possibly based on JSR-170
• Assign global, persistent identifiers
• Support basic functionality required for interoperability, such as the Pathways’ obtain, harvest, and put services
• Other preservation functions…