Learn how the Producer-Archive Workflow Network (PAWN) provides support for customized archival practices, including extensibility, custom authorization, API for new ingestion interfaces, and flexible structure for publishing into repositories. Case studies highlight ways PAWN addresses bulk ingestion, modeling government interactions, and reliable data transfer.
Supporting Customized Archival Practices Using the Producer-Archive Workflow Network (PAWN) Mike Smorul, Mike McGann, Joseph JaJa
Overview • PAWN overview of extensibility • Custom authorization and role granting • APIs for building new ingestion interfaces • Flexible structure for publishing into repositories • Case Studies using PAWN • Bulk ingestion of at-risk collections • Modeling government interactions
Problems facing ingestion • Reliable data transfer from producer to archive. • Each producer-archive interaction is unique. • How the archive deals with each collection is unique as well.
Distributed Ingestion with PAWN • Multiple producing sites with different requirements. • Separation of administrative responsibility. • Customizable roles for various parties. • Scalable infrastructure.
Package Workflow Overview • Create Producer-Archive Agreement • Client package template. • Create package based on template • Once approved, packages can be archived • Rejected packages can be held until rectified or deleted for resubmission.
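The approve/reject/archive steps above can be read as a small state machine. The following is a minimal sketch under stated assumptions: the state names, the `WorkflowPackage` class, and the allowed transitions are hypothetical illustrations inferred from the slide, not PAWN's actual classes or API.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical package states; the names are assumptions, not PAWN identifiers.
enum PackageState {
    SUBMITTED, APPROVED, REJECTED, ARCHIVED, DELETED;

    // States reachable from this one, following the slide's description.
    Set<PackageState> next() {
        switch (this) {
            case SUBMITTED:
                return EnumSet.of(APPROVED, REJECTED);
            case APPROVED:
                return EnumSet.of(ARCHIVED);
            case REJECTED:
                // Held until rectified (resubmitted) or deleted for resubmission.
                return EnumSet.of(SUBMITTED, DELETED);
            default:
                return EnumSet.noneOf(PackageState.class);
        }
    }
}

// Hypothetical package that only allows the transitions listed above.
final class WorkflowPackage {
    private PackageState state = PackageState.SUBMITTED;

    PackageState state() {
        return state;
    }

    void moveTo(PackageState target) {
        if (!state.next().contains(target)) {
            throw new IllegalStateException(state + " -> " + target + " is not allowed");
        }
        state = target;
    }
}
```

In this sketch a package cannot be archived without first being approved, which mirrors the "once approved, packages can be archived" bullet.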
Custom Roles • Actions in PAWN can be grouped together to create roles. • Modify items in a package, create users, etc. • Default roles • Producer – Individual data supplier • Records Manager – Oversight of producers • Archive Manager – Final review and archive publishing • Global Administrator – Creates domain, sysadmin-like account
PAWN Actions • Domain creation, modification, deletion • Modification of the organizational structure of a domain • Account creation and modification • Role creation and modification • Record set creation and modification • Setting permissions on record sets • Record Schedule creation and modification • Add or delete whole packages • Modify items in a package • Limiting an account to working with its own packages, all packages, or all in a domain. • Approving, rejecting, and archiving items in a package • Lock or unlock entire packages to prevent modification • Configure publishing resources
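Grouping actions into roles can be sketched as a named set of permitted actions. This is an illustration only: the `Action` constants are a hypothetical subset of the actions listed above, and `Role` is an assumed class, not PAWN's own data model.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical subset of the PAWN actions listed above.
enum Action {
    CREATE_PACKAGE, MODIFY_PACKAGE_ITEMS, CREATE_ACCOUNT,
    APPROVE_ITEMS, ARCHIVE_ITEMS, LOCK_PACKAGE
}

// A role is a name plus the set of actions it grants (assumed design).
final class Role {
    private final String name;
    private final Set<Action> actions;

    Role(String name, Set<Action> actions) {
        this.name = name;
        // Defensive copy so later changes to the caller's set do not alter the role.
        EnumSet<Action> copy = EnumSet.noneOf(Action.class);
        copy.addAll(actions);
        this.actions = Collections.unmodifiableSet(copy);
    }

    String name() {
        return name;
    }

    boolean permits(Action action) {
        return actions.contains(action);
    }
}
```

For example, a producer-like role could be built with `new Role("Producer", EnumSet.of(Action.CREATE_PACKAGE, Action.MODIFY_PACKAGE_ITEMS))`, while an archive-manager-like role would add `APPROVE_ITEMS` and `ARCHIVE_ITEMS`.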
Custom Package Building • PAWN provides an API for developing custom package builders. • Custom package builders can be written in Java and implement a simple interface. • Builders interact with a hierarchically structured package. [Figure: package structure with Data (Type, Descriptive Name, Bits, Metadata …), Metadata (Type, Bits, Name), and Manifest (Namespace, Type, Descriptive Name, Manifest …) elements]
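The slide does not show the builder interface itself, so the following is a minimal sketch of what a builder working on a hierarchical package might look like. `PackageNode`, `PackageBuilder`, `BookListBuilder`, and the `dc:title` key are all hypothetical names chosen for illustration; they are not taken from the PAWN API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node in a hierarchical package: holds metadata and child nodes.
final class PackageNode {
    private final String name;
    private final Map<String, String> metadata = new LinkedHashMap<>();
    private final List<PackageNode> children = new ArrayList<>();

    PackageNode(String name) {
        this.name = name;
    }

    PackageNode addChild(String childName) {
        PackageNode child = new PackageNode(childName);
        children.add(child);
        return child;
    }

    void putMetadata(String key, String value) {
        metadata.put(key, value);
    }

    String name() {
        return name;
    }

    String metadata(String key) {
        return metadata.get(key);
    }

    List<PackageNode> children() {
        return Collections.unmodifiableList(children);
    }
}

// Assumed shape of a "simple interface" for builders: populate a package tree.
interface PackageBuilder {
    void build(PackageNode root);
}

// Example builder that adds one child per book and attaches a title to it.
final class BookListBuilder implements PackageBuilder {
    private final Map<String, String> titlesById;

    BookListBuilder(Map<String, String> titlesById) {
        this.titlesById = new LinkedHashMap<>(titlesById);
    }

    @Override
    public void build(PackageNode root) {
        for (Map.Entry<String, String> entry : titlesById.entrySet()) {
            PackageNode book = root.addChild(entry.getKey());
            book.putMetadata("dc:title", entry.getValue());
        }
    }
}
```

Keeping the interface to a single method means a new ingestion front end only has to decide how to populate the tree; everything else stays with the surrounding system.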
Package Builders • Default Builder • Create files and folders • Attach descriptive metadata to files or folders • ICDL Builder • Create ‘books’ with Dublin Core metadata • Uses ICDL database as source for book list and metadata
PAWN Archive Gateway • Pluggable component that provides an API for developing gateways into various services. • Each gateway may have multiple instances, each configured differently • PAWN handles managing and associating gateways with the appropriate data.
SRB Gateway workflow • Before any submission, the gateway is configured with basic SRB information and associated with a domain. • The client supplies final destination and additional settings • Driver returns handle to final destination for log files
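The gateway idea (a pluggable component, configured per instance, associated with a domain, returning a handle to the final destination) can be sketched as follows. All names here are assumptions for illustration: `ArchiveGateway`, `InMemoryGateway`, and `GatewayRegistry` are hypothetical, and the in-memory store stands in for a real back end such as SRB without calling any SRB API.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed gateway contract: store an item and return a handle to where it went.
interface ArchiveGateway {
    String publish(String destination, String itemName, byte[] bits);
}

// Stand-in gateway instance; baseLocation plays the role of per-instance configuration.
final class InMemoryGateway implements ArchiveGateway {
    private final String baseLocation;
    private final Map<String, byte[]> stored = new HashMap<>();

    InMemoryGateway(String baseLocation) {
        this.baseLocation = baseLocation;
    }

    @Override
    public String publish(String destination, String itemName, byte[] bits) {
        String handle = baseLocation + "/" + destination + "/" + itemName;
        stored.put(handle, bits.clone());
        return handle; // the caller can record this handle in its log
    }

    boolean contains(String handle) {
        return stored.containsKey(handle);
    }
}

// Associates each domain with the gateway instance configured for it.
final class GatewayRegistry {
    private final Map<String, ArchiveGateway> byDomain = new HashMap<>();

    void associate(String domain, ArchiveGateway gateway) {
        byDomain.put(domain, gateway);
    }

    ArchiveGateway forDomain(String domain) {
        ArchiveGateway gateway = byDomain.get(domain);
        if (gateway == null) {
            throw new IllegalArgumentException("no gateway configured for domain " + domain);
        }
        return gateway;
    }
}
```

In this sketch the gateway is configured and associated with a domain before any submission, and the client supplies only the destination at publish time, matching the order of steps on the slide.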
Case Study: 15,000 CD-ROMs • 15,000 CD-ROMs containing Landsat data. • CDs in control of the library; processing and data storage located across campus. • Moving the CD collection was not feasible. • Need for untrained (student) labor to ingest without supervision. • Final copy needed to be accessible by several parties.
Case Study: 15,000 CD-ROMs • Custom PAWN interface. • Two workstations, four CD drives apiece. • Generate thumbnails and barcode the CD-ROMs. • Use SRB as the final archive, with the pre-existing PAWN-SRB driver.
Case Study: SLAC Records • What parties are involved in transferring records from a government agency to NARA? • How can the Record Schedule view of required records be simplified and presented to a client?
Case Study: SLAC Records • Created specialized roles • Records Creator • Create new packages and modify own submissions. • Records Liaison Officer • View or modify any packages in their domain. • Create users. • Create record templates. • Records Manager • Sends packages on for more permanent storage. • Create domains and producer-archive agreements. • Used pre-existing SRB gateway
More information • Web site: • http://www.umiacs.umd.edu/research/adapt • Wiki link for technical details. • Or “I’m feeling lucky” Google keywords: • ADAPT UMIACS