PeDALS Persistent Digital Archives & Library System. Richard Pearce-Moses Deputy Director for Technology & Information Resources Arizona State Library, Archives and Public Records. Curatorial Rationale. Transformation of traditional, paper-based practices into the digital arena
PeDALS: Persistent Digital Archives & Library System • Richard Pearce-Moses, Deputy Director for Technology & Information Resources, Arizona State Library, Archives and Public Records
Curatorial Rationale • Transformation of traditional, paper-based practices into the digital arena: Acquisition; Arrangement & description; Housing & storage; Reference and access; Preservation • Open Archival Information System (OAIS): Ingest; Storage; Data management; Preservation; Access
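Middleware: Microsoft BizTalk • Automated business rules • Transforming submission information packages (SIPs) into archival information packages (AIPs) and dissemination information packages (DIPs) • Mapping, generating metadata • Connecting multiple databases (“glue”) • Many offices of origin (OOOs) • One repository • Allows communication between systems • Validation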
1. OOO Recordkeeping System • For each series of records, the OOO and the repository negotiate: • Metadata the repository will receive • Format of the records (TIFF, PDF, XML) • Format of the submission information package • Frequency and manner of transfer • OOO develops procedures to create SIPs • Metadata, Record • Shipping manifest with hash and file names
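The shipping manifest "with hash and file names" can be sketched as follows. This is a minimal illustration, not the PeDALS implementation: the slides do not name a hash algorithm or a manifest layout, so SHA-256 and the `hash  file name` line format are assumptions, and `sha256_of`, `build_manifest`, and `write_manifest` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(sip_dir: Path) -> list[tuple[str, str]]:
    """List (relative file name, hash) pairs for every file in the SIP folder."""
    files = sorted(p for p in sip_dir.rglob("*") if p.is_file())
    return [(p.relative_to(sip_dir).as_posix(), sha256_of(p)) for p in files]


def write_manifest(sip_dir: Path, manifest_path: Path) -> None:
    """Write one 'hash  file name' line per file (assumed layout)."""
    lines = [f"{digest}  {name}" for name, digest in build_manifest(sip_dir)]
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

The manifest is best written outside the SIP folder, otherwise a second run would list the manifest file itself.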
Submission Information Packages • OOO Metadata • "Well number","Owner","Title","File name" • "56-000001","CITY OF TUCSON","2003 annual report","56 files\56-000001_0000.pdf" • "56-000001","CITY OF TUCSON","2004 annual report","56 files\56-000001_0000_E52B0.pdf" • "56-000001","CITY OF TUCSON","2005 annual report","56 files\56-000001_0000_E8578.pdf" • "56-000001","CITY OF TUCSON","2006 annual report","56 files\56-000001_0000_EC3F8.pdf" • Records • XML • PDF • Other formats
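As an illustration of how a repository might read OOO metadata like the sample above, the sketch below parses two of the rows with Python's standard `csv` module. It assumes the metadata arrives as a quoted, comma-separated file with a header row, which is what the sample suggests; `read_sip_metadata` is a hypothetical name.

```python
import csv
import io
from pathlib import PureWindowsPath

# Two rows copied from the sample; a raw string keeps the backslashes literal.
SAMPLE = r'''"Well number","Owner","Title","File name"
"56-000001","CITY OF TUCSON","2003 annual report","56 files\56-000001_0000.pdf"
"56-000001","CITY OF TUCSON","2004 annual report","56 files\56-000001_0000_E52B0.pdf"
'''


def read_sip_metadata(text: str) -> list[dict[str, str]]:
    """Return one dict per record, keyed by the header names."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


rows = read_sip_metadata(SAMPLE)
for row in rows:
    # The sample paths use Windows separators; PureWindowsPath splits them on any OS.
    print(row["Well number"], row["Title"], PureWindowsPath(row["File name"]).name)
```

Each row's `File name` value links the metadata to a record file in the same package, so the two can be checked against each other at ingest.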
2. Ingest: Transfer to Drop Box • Transfer to a drop box in DMZ • FTP • Tape • Disk • Isolated for virus scanning • Validation • Were all records received without corruption? • Were any false records received?
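The two validation questions above map naturally onto a comparison between the shipping manifest and the files that actually arrived. The following is a rough sketch under stated assumptions: the manifest has already been loaded as a mapping of file name to SHA-256 hex digest (the hash algorithm is not given in the slides), and `validate_transfer` is a hypothetical function name.

```python
import hashlib
from pathlib import Path


def validate_transfer(drop_box: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare received files with the manifest (file name -> SHA-256 hex digest)."""
    received = {
        p.relative_to(drop_box).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in drop_box.rglob("*")
        if p.is_file()
    }
    return {
        # Listed in the manifest but never arrived.
        "missing": sorted(set(manifest) - set(received)),
        # Arrived, but the content no longer matches the declared hash.
        "corrupted": sorted(
            name for name in manifest.keys() & received.keys()
            if manifest[name] != received[name]
        ),
        # Arrived without being declared: candidate "false records".
        "unexpected": sorted(set(received) - set(manifest)),
    }
```

An empty report on all three keys answers both questions: every record was received intact, and nothing undeclared was received.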
3. Data Management: Metadata • Generate core metadata • Administrative (6 elements) • Descriptive, or discovery (28 elements) • Preservation (12 elements) • Stored in “Accessions Register” • MS SQL Server
Administrative Metadata • Information created by repository to track records in the system • Accession Number • Transfer Authority • Acquisition Ingest Identifier • Acquisition Date • Unique Item Identifier • Item Location
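One way to picture the six administrative elements as a row in the accessions register is sketched below. The slides place the register in MS SQL Server; this example substitutes Python's built-in `sqlite3` so that it runs anywhere. The table name, column names, and column types are assumptions, and all sample values are made up for illustration.

```python
import sqlite3
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AdministrativeMetadata:
    """The six administrative elements, one field each (types are assumed)."""
    accession_number: str
    transfer_authority: str
    acquisition_ingest_identifier: str
    acquisition_date: date
    unique_item_identifier: str
    item_location: str


def create_register(conn: sqlite3.Connection) -> None:
    """Create a stand-in accessions register table (hypothetical schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS accessions_register (
               accession_number TEXT NOT NULL,
               transfer_authority TEXT NOT NULL,
               acquisition_ingest_identifier TEXT NOT NULL,
               acquisition_date TEXT NOT NULL,
               unique_item_identifier TEXT PRIMARY KEY,
               item_location TEXT NOT NULL)"""
    )


def register_item(conn: sqlite3.Connection, item: AdministrativeMetadata) -> None:
    """Insert one item; the date is stored as ISO 8601 text."""
    conn.execute(
        "INSERT INTO accessions_register VALUES (?, ?, ?, ?, ?, ?)",
        (
            item.accession_number,
            item.transfer_authority,
            item.acquisition_ingest_identifier,
            item.acquisition_date.isoformat(),
            item.unique_item_identifier,
            item.item_location,
        ),
    )
```

Making the unique item identifier the primary key means a second attempt to register the same item fails loudly instead of silently duplicating it.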
Discovery Metadata • Information created by OOO or Repository to help retrieve records for a variety of purposes • Office of Origin, Variant name • Source • Series Title, ID • Series Dates • Series Extent • Series Description • Arrangement • Restrictions • Series Subjects, Keywords • Activity • Item Title • Originator ID • Item Extent • Item Date • Item Description • First1024 • Party and Role, Subjects, Location • Item Keywords, Form/Genre • Related Item • Language • Open Date
Preservation Metadata • Information created by Repository to protect integrity and support readability over time • Access Facilitators • Operating System • Access Inhibitors • Hardware • Exceptions • Signature Information • File Description • Fixity • Functionality • Software • Structural Type • Technical Infrastructure
4. Storage • Create AIP • <AIP> <Hash> </Hash> <CoreMetadata> </CoreMetadata> <Metadata> </Metadata> <Record> </Record></AIP> • Deposit in Digital Stacks (LOCKSS) • Generate manifest list to expose to LOCKSS • LOCKSS harvests from manifest server
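The AIP skeleton above can be assembled with a standard XML library. The sketch below keeps the five element names from the slide (`AIP`, `Hash`, `CoreMetadata`, `Metadata`, `Record`); everything else is an assumption made for illustration: that the hash is a SHA-256 digest of the record bytes, that the record is embedded as base64 text, and that metadata values are stored in child elements named `Element`. `build_aip` is a hypothetical name.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def build_aip(record: bytes, core_metadata: dict[str, str],
              ooo_metadata: dict[str, str]) -> ET.Element:
    """Wrap a record and its metadata in the AIP element structure."""
    aip = ET.Element("AIP")
    # Assumed: the hash covers the record bytes only.
    ET.SubElement(aip, "Hash").text = hashlib.sha256(record).hexdigest()
    core = ET.SubElement(aip, "CoreMetadata")
    for name, value in core_metadata.items():
        ET.SubElement(core, "Element", name=name).text = value
    meta = ET.SubElement(aip, "Metadata")
    for name, value in ooo_metadata.items():
        ET.SubElement(meta, "Element", name=name).text = value
    # Assumed: binary records are carried as base64 text.
    record_el = ET.SubElement(aip, "Record", encoding="base64")
    record_el.text = base64.b64encode(record).decode("ascii")
    return aip


example = build_aip(
    b"%PDF-1.4 example bytes",
    {"Item Title": "2003 annual report"},
    {"Owner": "CITY OF TUCSON"},
)
print(ET.tostring(example, encoding="unicode"))
```

Keeping the hash inside the package lets any later audit recompute the digest from the `Record` element and compare it without consulting another system.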
Why LOCKSS? • Benefits • Automatic integrity checking • Automated error correction • Geographically dispersed copies • Bitstream preservation • Committed community of support • Hardened operating system • Concerns • Maximum number of objects in a Unix file system • Community of support is small
4. Access • DIPs for public access • No administrative, preservation metadata • Formats supported by common browsers • Website • Records not confidential (by law) • SQL query engine with discovery metadata • Limited access website • In repository, selected locations • Record series with personally identifying information
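The rule that DIPs carry no administrative or preservation metadata, and that series with personally identifying information go to the limited-access site, can be sketched as a small filter. The dictionary layout, the `contains_pii` flag, and the function names are hypothetical; the slides do not describe how the real system represents packages or restrictions.

```python
PUBLIC = "public"
LIMITED = "limited"


def build_dip(aip: dict) -> dict:
    """Copy only discovery metadata and the record reference into the DIP."""
    return {
        "discovery": dict(aip["metadata"]["discovery"]),
        "record_file": aip["record_file"],
    }


def access_tier(aip: dict) -> str:
    """Send series flagged as holding personally identifying information to
    the limited-access site; everything else goes to the public website."""
    return LIMITED if aip.get("contains_pii", False) else PUBLIC


# Illustrative package; identifiers and values are made up.
example_aip = {
    "metadata": {
        "administrative": {"Accession Number": "ACC-EXAMPLE-1"},
        "discovery": {"Item Title": "2003 annual report"},
        "preservation": {"Fixity": "example-digest"},
    },
    "record_file": "56-000001_0000.pdf",
    "contains_pii": False,
}

dip = build_dip(example_aip)
print(access_tier(example_aip), dip)
```

Building the DIP by copying the allowed parts, instead of deleting the disallowed ones, means a newly added metadata category stays out of public view by default.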
5. Preservation • Bitstream preservation • Developing audit procedures • Periodic validation of dark archives against accession register • For future development • Capturing minimum preservation metadata • On-the-fly rendering tools • Long-term format migration
Community of Shared Practice • Personal Relationships • Challenge of building relationships over the Internet • Lack of rich, immediate feedback in communication • Lack of spontaneity, serendipity, play • Inter-Agency Relationships • Different practices • Laws and regulations • Money
For more information • http://rpm.lib.az.us/PeDALS/ • Principal Investigator • Richard Pearce-Moses • Project Coordinator • Sara Muth • State Partner Leads • Florida: Mark Flynn • New York: Bonnie Weddle • South Carolina: Bill Henry • Wisconsin: Helmut Knies