A Preservation Repository in Prose: Being a Story of the DRS Past, Present and Future. By Andrea Goethals, Wendy Gogel. In Cambridge, Massachusetts, 2009
Today’s Agenda DRS 1: Being a Story of the Past A Transition: Being a Story of the Present DRS 2 and You!: Being a Story of the Future Questions?
DRS 1: Being a Story of the Past 1997-2007
The Formative years - LDI • November 1997 Proposal for the Library Digital Initiative “…create the first-generation technical infrastructure to support storage of and access to digital library materials.” • In July 1998, LDI was approved and funded • In December 1998, planning for DRS began
October 2000 Launch Digital Repository Service (DRS) • provides a set of professionally managed services to ensure the usability of securely stored digital objects over time. • is both a preservation and an access repository • includes the bundled delivery services
LDI Grant projects 49 Grants were awarded 1999-2006 • Digitizing analog collections • Images • Text • Audio • Music scores • Born Digital • Biomedical images • Geospatial data • Web sites • Online cataloging projects
Digitizing Facilities June 1999 • Harvard College Library Imaging Services 2001 - 2002 • HCL Fine Arts Library Digital Imaging Lab (FAL DIL) • Harvard Art Museum Digital Imaging and Visual Resources (DIVR) • Harvard College Library Audio Preservation Services (HCL APS) • Peabody Museum of Archaeology and Ethnology
The first Deposit • and the first object was deposited on October 23, 2000…
w/ Metadata • Administrative • Stewardship, contacts (e.g., HCL Harvard-Yenching Library, Ray Lum, etc.) • Billing account (e.g., 33-digit account number) • Access flag (e.g., open to the public, restricted to the Harvard community, no access) • Technical • Physical characteristics (e.g., for images, x and y resolution, MD5 signature, pixel width and height, compression, bit sample rate, etc.) • Production methods (e.g., for images, Scitex; Leaf Volare; Leaf Colorshop 5.x)
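The MD5 signature listed among the technical metadata is a checksum of a file's bytes, which can be recomputed later to confirm the file has not changed. A minimal sketch using Python's standard `hashlib`; the function name `md5_signature` and the sample bytes are illustrative assumptions, not a description of how the DRS computes or stores the value.

```python
import hashlib


def md5_signature(data: bytes) -> str:
    """Return the MD5 digest of the given bytes as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


# Illustrative stand-in for the bytes of a deposited file (assumption).
deposited = b"example image bytes"

# Signature recorded as technical metadata at deposit time.
stored_signature = md5_signature(deposited)

# Later check: an unchanged file reproduces the stored signature,
# while any change to the bytes produces a different one.
unchanged_ok = md5_signature(deposited) == stored_signature
altered_ok = md5_signature(deposited + b".") == stored_signature
print(unchanged_ok, altered_ok)
```

For large files, the same digest can be built incrementally by calling `update()` on a `hashlib.md5()` object for each chunk read from disk.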
The first Audio was deposited on January 28, 2003 • Matins for Sunday after the Elevation of the Holy Cross • Laura Boulton (1899-1980) Collection of Byzantine and Orthodox Musics, Archive of World Music • One of a series of Byzantine hymns and liturgies recorded in a monastery on Patmos, 1960. • Logbook (Part I, p. 1-10)
The first georeferenced map was deposited on January 14, 2005 • Barnstable, Massachusetts 15 Minute Digital Raster Graphic • From an 1893 Historic USGS map reprinted in 1907
Systems and Services 1985 • HOLLIS – our OPAC 1998 - 1999 • VIA Visual Information Access – union catalog • OASIS Online Archival Search Information System – union catalog 1999-2000 • OLIVIA – image cataloging tool
Systems and Services 2000-2001 • DRS Digital Repository Service – preservation and access repository • NRS Name Resolution Service – to resolve persistent identifiers • AMS Access Management Service – to provide access controls • IDS Image Delivery Service • PDS Page Delivery Service • FTS Full-text Search Service • NRS Web Admin • Policy Web Admin
Systems and Services 2001-2002 • DRS Web Admin – staff interface to DRS • PDS Maint • Harvard Geospatial Library – union catalog 2002-2003 • TED TEmplated Database – collection building tool • SDS Streaming Delivery Service – for audio delivery • ADS Asynchronous Delivery – for large files • Cross-catalog search – for federated searching
Systems and Services 2003-2004 • Dynamic IDS – for zoom and pan features w/ JP2 • DMART - Audio deposit tool 2004-2005 • RList – Course reserves tool 2005-2006 • Virtual Collections 2006 - 2007 • Batch Builder 2008 - 2009 • Google data loading • WAX
2008: new DRS storage system • New servers, new storage arrays, new tape library, new storage software • Increased storage capacity • Less complex - DRS loader doesn’t need to know the details of storage system anymore • Higher availability for deliverable content • Copies stored in 3 different geographic locations • 3 “low use” copies, 4 “high use” copies
Annual file size per Harvard unit (GB) [chart; legend entries include Art Museums, HCL]
Cumulative non-Google file size per use (GB) • April 2009: 45,742 GB
Cumulative file size (GB) • April 2009: 105,652 GB
2008: new program, new position • HUL takes next step in its commitment to digital preservation and establishes: • Digital Preservation and Repository Manager Position • March 2008 • Andrea Goethals • Digital Preservation Program • June 2008 • Established within OIS
2008/9 priorities of new digital preservation program • Define additional infrastructure requirements to support digital preservation • DRS enhancements • Global digital format registry (GDFR) • Identify and analyze new formats for the DRS to support • PDF, email, audio, architectural drawings, etc. • Establish communication network with the 2 communities we inhabit • Broader digital preservation community • Harvard community
Avenues of communication • Broader digital preservation community • Conferences and meetings • Collaborative projects • Email conversations, blogs, newsgroups • Harvard community • Committees (ULC, CCCC, DMCC, DCSWG, etc.) • Digital project librarians • Ad-hoc focus groups, meetings and email with stakeholders (depositors, curators and collection managers) • Customer surveys
These communities inform our thinking about: • Concepts and terms • Metadata • Data models • Content • Recommended & supported formats • Best practices • Preservation planning and actions • Storage, management and monitoring • Certifications • Registries • Tools and services
DRS customer survey 2008 • August - September 2008 • Users of DRS tools or services • To evaluate the level of satisfaction with DRS tools, services, and websites • To understand any unmet needs
Survey findings • Question 1: What word or phrase best describes the DRS? • In general the DRS is valued for its preservation services and perceived as stable, secure and trusted.
Other key findings of survey DRS Customers want: • Support for more formats • Guidance on preservation formats and content creation • Better search and editing management tools • Delivery services that use common or popular third-party applications
Trends in DRS customer needs • Problem of abundance • Remote creators • Diversity of formats
1. Problem of abundance DRS owners and depositors: • Are increasingly overwhelmed by the amount of digital content to preserve • Can’t fully process the material they want to deposit into the DRS • Can’t go through a deposit process that is time-consuming
2. Remote creators • Increasingly, DRS owners and depositors are acquiring content they did not create • DRS staff cannot influence the formats or technical properties of this content during creation
3. Diversity of formats • DRS owners and depositors increasingly need to preserve formats and genres that aren’t currently supported by the DRS
Implications of these trends The DRS needs to: • accept and preserve minimally-processed content • provide a time-efficient deposit process • support a broad range of formats and genres And: • can’t rely on the content being in “preservable” formats prior to deposit into the DRS
DRS 2 changes Why? • To better support digital preservation • To better support needs of DRS depositors, curators and collection managers
DRS 2 changes • New conceptual foundation • Objects • Content models • User improvements • Support for opaque objects • Support for new file formats • Deposit, management & delivery tools • Guidance & user community • A new approach to metadata • Increased preservation planning and activities
Objects • Currently only a file level in the DRS • All management has to be done at the individual file level • Objects are aggregations of files • Page-turned object • Still image object • More intuitive unit for management, reporting and searching • Example: How many Page-turned objects do I have in the DRS?
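A minimal sketch of the object idea in Python. The class name, field names, and sample records are illustrative assumptions, not the DRS 2 data model. It only shows why an aggregation of files is a more convenient unit: a question such as "How many Page-turned objects do I have?" becomes a count over objects rather than over individual files.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical object: a typed aggregation of files."""
    object_id: str
    object_type: str                      # e.g. "page-turned", "still image"
    files: list = field(default_factory=list)


def count_objects_by_type(objects):
    """Count objects (not files) for each object type."""
    return Counter(obj.object_type for obj in objects)


# Illustrative holdings only; identifiers and file names are made up.
holdings = [
    DigitalObject("obj-1", "page-turned", ["p001.tif", "p002.tif", "p003.tif"]),
    DigitalObject("obj-2", "page-turned", ["p001.tif", "p002.tif"]),
    DigitalObject("obj-3", "still image", ["photo.jp2"]),
]

counts = count_objects_by_type(holdings)
total_files = sum(len(obj.files) for obj in holdings)
print(counts["page-turned"], "page-turned objects;", total_files, "files in total")
```

With file-level management only, the same question would require inferring from six separate file records which ones belong together.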
Content models • Types of objects • Example: audio content model