Finding a New Way. Richard Pearce-Moses Deputy Director for Technology & Information Resources Arizona State Library, Archives and Public Records. Using Automated Business Rules to Process Electronic Records. PeDALS Persistent Digital Archives and Library System.
Finding a New Way Richard Pearce-Moses Deputy Director for Technology & Information Resources Arizona State Library, Archives and Public Records Using Automated Business Rules to Process Electronic Records
PeDALS Persistent Digital Archives and Library System • Arizona State Library, Archives and Public Records • Florida State Library and Archives • South Carolina Department of Archives & History • South Carolina State Library • New York State Library • New York State Archives • Wisconsin State Archives • Two more partners • Kudos to Washington State Archives
Curatorial Rationale • Question traditional, paper-based practices in order to transform them for the digital era • Appraisal • Acquisition • Arrangement and description • Housing and storage • Reference and access • Preservation • Preserving archival principles of provenance, context, collective control, and authenticity and integrity
Technical Goals • To demonstrate the use of middleware to implement business rules in software as an integrated workflow to process collections of records and publications • To build “digital stacks” using LOCKSS as the basis of an inexpensive storage network that can preserve the authenticity and integrity of the materials
Additional Goals • To build a community of shared practice that meets the needs of a wide range of repositories • For best practices ~ appropriate practices, what works, what’s practical • For resource sharing ~ avoid redundant work • To remove barriers to preservation by keeping costs as low as possible
Preliminary Results: A New Research Agenda • Vulcan mind meld: Immediate understanding, no confusion • Cloning: No time wasted in job search, known results • Time travel: All the time you need while meeting deadlines
PeDALS at 50,000 feet • Based on OAIS Reference Model • Metadata • Transforms and normalizes received metadata • Enhances received metadata • Archival Information Package • Creates and stores in LOCKSS • Dissemination Information Package • Creates and publishes to the web
Appropriate Record Sets • Ideal scenario • Created, stored in a recordkeeping system • Indexed • Likely to succeed • Certificates • Email • Indexed documents • Less likely to succeed • Hard drives with no index • Sufficient number and consistency to allow rules
Curatorial Rationale • Focus on why, not just how • Strategic shift in how we work • Not limited to doing things differently • Doing different things • Curators work with rules, not records • Describe business processes (rules) • Monitor the process for quality assurance
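One way to picture curators working with rules rather than records is a small set of rules that software applies to every incoming record, with failures set aside for quality assurance. The sketch below is a hypothetical illustration, not the PeDALS middleware implementation; the `Rule` class, the field names (`title`, `date`), and the two rules are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes


# Hypothetical rules a curator might describe for a series of certificates.
RULES = [
    Rule("has_title", lambda rec: bool(str(rec.get("title") or "").strip())),
    Rule("has_date", lambda rec: bool(str(rec.get("date") or "").strip())),
]


def apply_rules(records: List[dict], rules: List[Rule] = RULES) -> Tuple[list, list]:
    """Split records into those passing every rule and those flagged for review."""
    passed, flagged = [], []
    for rec in records:
        failures = [rule.name for rule in rules if not rule.check(rec)]
        if failures:
            # Keep the failed rule names so a curator can monitor quality.
            flagged.append((rec, failures))
        else:
            passed.append(rec)
    return passed, flagged
```

The curator's work in this picture is editing `RULES` and reviewing `flagged`, not handling each record individually.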
Metadata and Queries • Single schema • Administration, discovery, preservation • Elements common to all government records • Definition and cross-walks • Rationale • What is it • Who uses it • For what purpose
Example: Item Title • Definition: The word or phrase, taken from a prescribed source, by which a work is known • Rationale: Serves as a "handle" to represent the object at an abstract level in lists, such as search results. A supplied title should contain sufficient information to aid patrons in the selection of materials. Because date is preferred and included in search results by default, the title need not include date information.
Creation • Prepare Submission Information Package • Extract records for transmission • Extract metadata • Create shipping manifest • Negotiation • File formats for records, metadata, shipping manifest • Transfer methodology • Frequency of transfer
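As a rough illustration of the "create shipping manifest" step, the sketch below lists every file in a transfer directory with its size and a SHA-256 checksum so the receiving repository can confirm that nothing was lost or altered in transit. The manifest layout (`path`, `bytes`, `sha256`) and the function names are assumptions; the slides leave the actual format to negotiation between producer and archive.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Checksum a file in chunks so large records do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(transfer_dir: Path) -> List[Dict]:
    """Return one manifest entry per file found under transfer_dir."""
    entries = []
    for path in sorted(p for p in transfer_dir.rglob("*") if p.is_file()):
        entries.append({
            "path": path.relative_to(transfer_dir).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return entries
```

The resulting list could be serialized in whatever manifest format the two parties agree on.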
Description • Traditional archival description • Provenance • Series • Acquisition • Rules-based description • Metadata mapping • AIP schema • DIP schema
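Metadata mapping in rules-based description can be thought of as a crosswalk from the field names an agency supplies to the elements of the common schema. The sketch below is illustrative only: the received field names (`GroomName`, `CertDate`, and so on) and the target element names are invented for the example and are not the PeDALS core metadata.

```python
# Hypothetical crosswalk: received field name -> element in the common schema.
CROSSWALK = {
    "GroomName": "name",
    "BrideName": "name",
    "CertDate": "date",
    "County": "coverage",
}


def map_metadata(received: dict, crosswalk: dict = CROSSWALK) -> dict:
    """Map received fields to normalized elements; each element may repeat."""
    normalized = {}
    for field, value in received.items():
        target = crosswalk.get(field)
        if target is None:
            continue  # unmapped fields survive only in the received metadata
        normalized.setdefault(target, []).append(value.strip())
    return normalized
```

Adapting the process to a new record series would then mostly mean writing a new crosswalk rather than new code, which is the kind of rule reuse the project aims to demonstrate.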
Submission • Transfer • sFTP, disk, tape, sneakernet • Deposit on Point of Ingest server • Data wrangling • Virus scan • Normalize process during initial transfers • Run manual processes to prep data
Create AIP • Simple schema for single files • Normalized metadata • Received metadata • Record (typically Base64 encoding) • Compound schema for multiple files • Normalized metadata • Received metadata • Structural metadata • Files (typically Base64 encoding)
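To make the simple AIP concrete, the sketch below wraps normalized metadata, received metadata, and a Base64-encoded record in a single XML document. The element names (`aip`, `normalizedMetadata`, `receivedMetadata`, `record`) are placeholders chosen for illustration; the actual PeDALS schema is not shown in these slides.

```python
import base64
import xml.etree.ElementTree as ET


def build_simple_aip(normalized: dict, received: dict, record_bytes: bytes) -> bytes:
    """Package one record and its metadata as a single XML document."""
    aip = ET.Element("aip")

    norm_el = ET.SubElement(aip, "normalizedMetadata")
    for name, value in normalized.items():
        ET.SubElement(norm_el, "element", name=name).text = value

    # Received metadata is kept alongside the normalized form.
    recv_el = ET.SubElement(aip, "receivedMetadata")
    for name, value in received.items():
        ET.SubElement(recv_el, "field", name=name).text = value

    # Base64 lets arbitrary binary content travel safely inside XML text.
    record_el = ET.SubElement(aip, "record", encoding="base64")
    record_el.text = base64.b64encode(record_bytes).decode("ascii")

    return ET.tostring(aip, encoding="utf-8")
```

A compound AIP would add structural metadata and one encoded element per file; note that Base64 inflates the stored size by roughly a third.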
Ingest • Update administrative catalog • Encapsulate AIPs in Superpackage • Expose to LOCKSS • Automatic integrity checking • Automatic error correction • Distributed preservation model • Sustainable business model • Inexpensive • Testing a 16TB system
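The idea behind automatic integrity checking and error correction in a distributed model can be sketched as a majority vote over the copies held by different nodes. This is a deliberately simplified illustration; the real LOCKSS polling protocol is considerably more sophisticated, and the function and node names here are assumptions.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def audit_and_repair(copies: Dict[str, bytes]) -> List[str]:
    """Compare copies by checksum; overwrite minority copies with the majority one.

    copies maps a node name to that node's bytes and is repaired in place.
    Returns the names of the nodes that were repaired.
    """
    digests = {node: hashlib.sha256(data).hexdigest() for node, data in copies.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) / 2:
        raise ValueError("no majority among copies; manual review needed")

    good = next(data for node, data in copies.items() if digests[node] == winner)
    repaired = []
    for node in list(copies):
        if digests[node] != winner:
            copies[node] = good  # replace the damaged copy
            repaired.append(node)
    return repaired
```

With three or more partners each holding a copy, a single damaged copy is outvoted and replaced without human intervention.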
Dissemination • Create DIP • Browser friendly format • Update public catalog • Publish to website
Simplification • Community of shared practice • Many hands make light work • Resource sharing • Support network • Generic, modular processes • Code reuse • Standard schema • Catalog databases • Packages
Automated Processing • Open source v. proprietary software • Middleware • Microsoft BizTalk • Metadata tools • New Zealand Metadata Extractor • BagIt file transfer validation • Agile-Scrum project management methodology
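BagIt transfer validation rests on a payload manifest, a text file named `manifest-<algorithm>.txt` in which each line gives a checksum followed by a file path relative to the bag root. The sketch below checks only that manifest using the Python standard library; it is a minimal illustration, not a complete BagIt validator (it ignores `bagit.txt`, tag manifests, and files present but unlisted), and the function name is an assumption.

```python
import hashlib
from pathlib import Path
from typing import List


def validate_bag_manifest(bag_dir: Path, algorithm: str = "sha256") -> List[str]:
    """Return a list of problems found; an empty list means every file matched."""
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each line is "<checksum> <path>", with the path relative to the bag root.
        expected, rel_path = line.split(maxsplit=1)
        target = bag_dir / rel_path
        if not target.is_file():
            problems.append(f"missing: {rel_path}")
            continue
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected.lower():
            problems.append(f"checksum mismatch: {rel_path}")
    return problems
```

Run on the Point of Ingest server after a transfer, a check like this would catch truncated or corrupted files before they enter the workflow.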
Project Status – Completed • Technical infrastructure installed • Core metadata defined • Schema for a simple AIP defined • Administrative catalog developed • AZ marriage certificates ingested, their metadata transformed and created, packaged as AIPs, and deposited in LOCKSS • Demonstrated reuse of code by adapting rules for marriage certificates to SC Public Service Commission orders
Project Status – To Do • Complete Administrative Catalog Interface • Develop AIP for compound records • Develop DIP • Develop Public Catalog web interface • Write rules to ingest additional records and publications • Project to be completed by December 2010
State Initiatives Symposium • Results from NDIIPP State Initiatives Projects • Arizona • Minnesota • North Carolina • Washington • Possibly with Best Practices Exchange • Phoenix • Fall 2010
For more information • http://www.pedalspreservation.org/ • Principal Investigator: Richard Pearce-Moses, rpm@lib.az.us • Project Coordinator: Sara Muth, smuth@lib.az.us