DRS 2: one in a series of periodic updates
Harvard University Library
Andrea Goethals
October 21, 2009
DRS = Digital Repository Service
Agenda
• DRS 2 context
• DRS vs DRS 2
• Current work: DRS 2.1
• Next set of work: DRS 2.2
• Questions & comments
HUL’s Digital Preservation Program
• A continuation of HUL’s mission to provide current and future access to research materials and resources, with recognition that preserving access to digital content requires different strategies, tools and skills
• Centerpiece of the preservation program: the DRS
Shapers of the DRS
• Digital Preservation Community
  • Best practices, standards, lessons learned, experiments
  • Collaborative projects, member organizations, interest groups, meetings, conferences, correspondence, conversations, shared tool development
• Harvard needs
  • Increasing amount of digital content
    • DRS growth has been fueled by large projects…
    • Require services to store, preserve, manage, make discoverable, etc.
  • New formats and genres, born-digital material
    • Bring new requirements
  • Support changing user expectations
    • Print on demand, e-readers
DRS growth
• As of October 1, 2009: 118 TB in the DRS (378 TB counting all backups)
DRS
• Set of professionally managed services
  • Preservation planning & activities, administration, management tools
  • Creation & format guidelines, training, ingest service
  • Delivery services, access restrictions, persistent names
  • Storage & monitoring service
• Diagram labels: creation/acquisition, use
DRS 2
• Same services, but much improved
  • Preservation planning & activities, administration, management tools
  • Creation & format guidelines, training, ingest service
  • Delivery services, access restrictions, persistent names
  • Storage & monitoring service
• Diagram labels: creation/acquisition, use
DRS 2
• Improvements
  • Revamped management tools, adding reporting, more preservation planning
  • Additional access restrictions, redundant delivery servers, additional delivery services
  • More guidelines, acceptance of more formats and metadata
  • Richer data model, more robust and scalable storage system, better monitoring and recovery processes
• Diagram labels: creation/acquisition, use
DRS 2.1 Scope
• Redesign of conceptual foundation
  • Modified data model
  • Content models
  • Object descriptors
  • New and different metadata schemas
• Release to a QA environment
  • New and enhanced tools for creation and deposit of objects for depositor testing
Modified Data Model
• Current DRS: file level
  • All metadata is associated at the file level
    • Even if the same metadata applies to a group of files
  • All management has to be done at the individual file level
    • Non-intuitive and unwieldy
• DRS 2: adding 2 more levels
  • objects
  • (files)
  • bitstreams
Objects?
• Aggregations of files that together represent a coherent unit of content
  • All the files that make up a single digital book
  • All the master and use copies representing a single photograph
• Useful for management, reporting and searching
  • “How many PDS document objects do I have in the DRS?”
• Hook for new metadata
  • Administrative categories (projects, exhibits, collections, etc.)
  • Descriptive metadata, catalog records
Bitstreams?
• A subset of a file
• Hooks for metadata that apply to part but not all of the file
  • To characterize the audio portion of a video file
  • To describe the contents of a ZIP file
• Allow fine-grained description and management
• May save storage space
  • Some types of content can remain compressed and still be described
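The three levels described above (objects, files, bitstreams) can be pictured as a small nested data structure. The following is only an illustrative sketch, not the DRS 2 implementation: the class names, field names and identifiers are hypothetical, and metadata is reduced to plain key/value pairs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bitstream:
    """A described subset of a file, e.g. the audio portion of a video file."""
    label: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class DrsFile:
    """A single stored file; it may expose zero or more bitstreams."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class DrsObject:
    """An aggregation of files that together form one coherent unit of content."""
    object_id: str
    content_model: str
    metadata: Dict[str, str] = field(default_factory=dict)
    files: List[DrsFile] = field(default_factory=list)


def count_objects(objects: List[DrsObject], content_model: str) -> int:
    """Answer questions like 'How many PDS document objects do I have?'."""
    return sum(1 for obj in objects if obj.content_model == content_model)


# Hypothetical sample data: metadata shared by all pages is stated once,
# at the object level, instead of being repeated on every file.
book = DrsObject(
    object_id="example-book-1",
    content_model="PDS document",
    metadata={"project": "Example project"},
    files=[DrsFile(name="page-0001.tif"), DrsFile(name="page-0002.tif")],
)
video = DrsObject(
    object_id="example-video-1",
    content_model="Opaque",
    files=[DrsFile(name="lecture.mov",
                   bitstreams=[Bitstream(label="audio track")])],
)

print(count_objects([book, video], "PDS document"))
```

The point of the sketch is the shape: object-level metadata removes per-file repetition, and a bitstream gives part of a file its own place to carry metadata.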
Content models
• Object types
• Define
  • valid file formats and relationships
  • known delivery and rendering applications
  • associated assessments and preservation plans
• Enforce conformity - we know what we have
• Tie directly to technology watches and preservation plans
DRS 2.1 content models – deposit & delivery
• Still image
  • Image objects, delivered by IDS
• PDS document
  • Page-turned documents, delivered by PDS
• Document
  • Initially just PDF files, delivered by FDS
• Opaque
  • Files in any format
• Text
  • Text, XML, etc., delivered by FDS
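One way to read "content models define valid file formats and known delivery applications" is as a lookup table that deposits are checked against. The sketch below is an assumption-laden illustration, not DRS code: the `ContentModel` type, the `is_valid_format` helper and the lowercase format labels are hypothetical. Only the three models whose formats are named on the slide are included, and the "etc." in the Text model is not expanded.

```python
from typing import Dict, FrozenSet, NamedTuple, Optional


class ContentModel(NamedTuple):
    # formats=None means "files in any format" (the Opaque model).
    formats: Optional[FrozenSet[str]]
    # delivery=None means the slide names no delivery service for the model.
    delivery: Optional[str]


# Hypothetical table built from the slide; format labels are illustrative.
CONTENT_MODELS: Dict[str, ContentModel] = {
    "Document": ContentModel(formats=frozenset({"pdf"}), delivery="FDS"),
    "Text": ContentModel(formats=frozenset({"text", "xml"}), delivery="FDS"),
    "Opaque": ContentModel(formats=None, delivery=None),
}


def is_valid_format(content_model: str, file_format: str) -> bool:
    """Return True if the format is acceptable for the given content model."""
    model = CONTENT_MODELS[content_model]
    return model.formats is None or file_format.lower() in model.formats


print(is_valid_format("Document", "PDF"))
print(is_valid_format("Document", "xml"))
print(is_valid_format("Opaque", "anything"))
```

Keeping the rules in data rather than in code is what lets a content model tie to a preservation plan: a change to one table entry changes what is accepted for that object type.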
Object descriptors
• A METS metadata file per object on the file system alongside content files
• Descriptive, administrative, preservation, technical and structural metadata
• Describes the object, all its files and bitstreams and related significant events
• Gives the metadata the same secure storage as the content files
• Self-contained, portable objects
Peering into a METS object descriptor
• For the object
  • MODS
  • PdsMD (for PDS document objects)
• For the object, each file and bitstream
  • PREMIS
  • HulAdminMD
• For each applicable file and bitstream
  • MIX
  • TextMD
  • DocumentMD
  • …
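To make the layout above more concrete, here is a rough sketch of how a skeleton METS descriptor could be assembled with Python's standard library. It is an assumption, not the Object Tool Set's actual output: the function name, the IDs and the placement of PREMIS under `techMD` are illustrative choices, the `mdWrap` elements are left as empty placeholders, and the result is not validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_descriptor(object_id: str, file_names: list) -> ET.Element:
    """Build a skeleton descriptor for one object and its files."""
    mets = ET.Element(q("mets"), {"OBJID": object_id})

    # Object-level descriptive metadata (MODS on the slide); empty placeholder.
    dmd = ET.SubElement(mets, q("dmdSec"), {"ID": "dmd-object"})
    ET.SubElement(dmd, q("mdWrap"), {"MDTYPE": "MODS"})

    # Object-level preservation metadata (PREMIS on the slide); placing it
    # under techMD is an assumption made for this sketch.
    amd = ET.SubElement(mets, q("amdSec"), {"ID": "amd-object"})
    tech = ET.SubElement(amd, q("techMD"), {"ID": "tech-object"})
    ET.SubElement(tech, q("mdWrap"), {"MDTYPE": "PREMIS"})

    # File inventory plus a structural map that points at each file.
    file_grp = ET.SubElement(ET.SubElement(mets, q("fileSec")), q("fileGrp"))
    struct_div = ET.SubElement(ET.SubElement(mets, q("structMap")), q("div"))
    for index, name in enumerate(file_names, start=1):
        file_id = f"file-{index}"
        file_el = ET.SubElement(file_grp, q("file"), {"ID": file_id})
        ET.SubElement(file_el, q("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": name})
        ET.SubElement(struct_div, q("fptr"), {"FILEID": file_id})
    return mets


descriptor = build_descriptor("example-object-1",
                              ["page-0001.tif", "page-0002.tif"])
print(ET.tostring(descriptor, encoding="unicode"))
```

Because the descriptor is an ordinary file stored next to the content files, copying the object's directory carries its metadata along, which is what makes the object self-contained and portable.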
Deposit tools
• Currently:
  • BatchBuilder
  • DRS Loader
• DRS 2.1:
  • Enhanced BatchBuilder
  • New! File Information Tool Set (FITS)
  • New! Object Tool Set (OTS)
  • Enhanced DRS Loader
  • New! DRS Services
Enhanced BatchBuilder
• Will build batches of objects rather than batches of files
• Will automatically determine all technical metadata (using FITS)
• Will automatically create all object descriptors (using OTS)
DRS Services
• New back-end service to centralize and control access to DRS objects
  • Simplifies front-end applications
  • Secures content and metadata
• DRS 2.1 services
  • Object ingest
  • File delivery
June 2010: QA release to depositors
• Depositors will be able to test new workflows in QA
  • New BatchBuilder and DRS Loader to create and deposit objects into the DRS
  • Enhanced IDS, FDS and PDS to view the deposited content
DRS 2.2 Scope
• DRS Web Admin
  • Easier discovery, batch updates, reporting, etc.
  • Repository administration and monitoring
• Additional content models
  • Audio, Web Harvest, Dark PDS Document, various Google, MOA2 document, Biomedical Image, Target Image and Email
• Improved audio support
  • MP3, MP4/AAC
  • BatchBuilder support
• Rights and access management metadata
  • Rights metadata stored in DRS with content
  • Analysis of need for more granular access restrictions
June 2011: Production release
• Creation, deposit and management of objects
• All delivery services integrated with the DRS Services
• All DRS files will have been migrated to objects
Many people in OIS working on DRS 2
• Digital Library Projects Group
• Systems Operations Group
• Systems Development Group
• Metadata Analyst
More information
• HUL’s Digital Preservation Program: http://hul.harvard.edu/ois/digpres/
• DRS 2 Enhancements: http://hul.harvard.edu/ois/systems/drs/enhancements.html
• andrea_goethals@harvard.edu