PAWN: A Policy-Driven Software Environment for Implementing Producer-Archive Interactions in Support of Long Term Digital Preservation
Mike Smorul, Mike McGann, Joseph JaJa
Institute for Advanced Computer Science Studies, University of Maryland, College Park
Sponsored by National Archives and Records Administration, Library of Congress and NSF
Archiving 2007

Problems Facing Ingestion
• Ensure integrity of data ingestion
• Each producer-archive interaction is unique
• Final destination for items in an archive is unique
• Differing roles between producer and archive
• Hostile producers

What is PAWN?
• Software that provides an ingestion framework
• Distributed and secure ingestion of digital objects into an archive
• Handles the process, from package assembly to archival storage
• Simple, customizable interface for end-users
• Flexible interface for archive publication

Package Workflow
• Create Producer-Archive Agreement
• Client package template
• Create package based on template
• Once approved, packages can be archived
• Rejected packages can be held until rectified, or deleted for resubmission

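The approve/reject cycle above can be pictured as a small state machine. The following is only a sketch under stated assumptions: the state names and the `PackageWorkflow` class are hypothetical and are not taken from PAWN itself.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

class WorkflowDemo {
    public static void main(String[] args) {
        PackageWorkflow pkg = new PackageWorkflow();
        pkg.moveTo(PackageState.REJECTED);   // archive sends the package back
        pkg.moveTo(PackageState.SUBMITTED);  // producer rectifies and resubmits
        pkg.moveTo(PackageState.APPROVED);
        pkg.moveTo(PackageState.ARCHIVED);
        System.out.println(pkg.state());
    }
}

// Hypothetical package states; the names are illustrative, not PAWN's own.
enum PackageState { SUBMITTED, APPROVED, REJECTED, ARCHIVED, DELETED }

class PackageWorkflow {
    // Which states may follow each state.
    private static final Map<PackageState, Set<PackageState>> ALLOWED =
            new EnumMap<>(PackageState.class);
    static {
        ALLOWED.put(PackageState.SUBMITTED,
                EnumSet.of(PackageState.APPROVED, PackageState.REJECTED));
        // Only approved packages can be archived.
        ALLOWED.put(PackageState.APPROVED, EnumSet.of(PackageState.ARCHIVED));
        // A rejected package is held until it is rectified (resubmitted) or deleted.
        ALLOWED.put(PackageState.REJECTED,
                EnumSet.of(PackageState.SUBMITTED, PackageState.DELETED));
        ALLOWED.put(PackageState.ARCHIVED, EnumSet.noneOf(PackageState.class));
        ALLOWED.put(PackageState.DELETED, EnumSet.noneOf(PackageState.class));
    }

    private PackageState state = PackageState.SUBMITTED;

    PackageState state() { return state; }

    void moveTo(PackageState next) {
        if (!ALLOWED.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }
}
```

Keeping the transitions in one table makes it easy to see that a package cannot reach the archive without passing through approval.
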
Expanding a Simple Workflow
• Support for multiple workflows
• Grouped into logical domains
• Definable roles per workflow
• Pluggable components for assembly and archival publishing
• Distributed components
• Web-service based components

Domain Organization
• Producers are organized into domains; each domain contains a transfer agreement negotiated with the archive.
• Each domain contains a hierarchical organization of data grouped into record sets/templates (convenient groupings from the transfer agreement).
• Each domain contains its own users.
• An end-user operates within a set of record sets.

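One way to picture this organization is as a tree of record sets owned by a domain, with each user granted a set of record sets. The sketch below is illustrative only: the `Domain` and `RecordSet` classes, and the assumption that a grant on a record set also covers its descendants, are hypothetical and not taken from PAWN.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DomainDemo {
    public static void main(String[] args) {
        Domain domain = new Domain("example-agency", "agreement-001");
        RecordSet reports = domain.root.add("reports");
        RecordSet annual = reports.add("annual");
        RecordSet photos = domain.root.add("photos");
        domain.grant("alice", reports);
        System.out.println(domain.canAccess("alice", annual));
        System.out.println(domain.canAccess("alice", photos));
    }
}

// A node in the domain's hierarchy of record sets (hypothetical type).
class RecordSet {
    final String name;
    final List<RecordSet> children = new ArrayList<>();

    RecordSet(String name) { this.name = name; }

    RecordSet add(String childName) {
        RecordSet child = new RecordSet(childName);
        children.add(child);
        return child;
    }

    // True if 'other' is this record set or one of its descendants.
    boolean contains(RecordSet other) {
        if (this == other) return true;
        for (RecordSet child : children) {
            if (child.contains(other)) return true;
        }
        return false;
    }
}

// A domain holds one transfer agreement, a record-set tree, and its own users.
class Domain {
    final String name;
    final String transferAgreement;
    final RecordSet root;
    private final Map<String, Set<RecordSet>> userScopes = new HashMap<>();

    Domain(String name, String transferAgreement) {
        this.name = name;
        this.transferAgreement = transferAgreement;
        this.root = new RecordSet(name);
    }

    void grant(String user, RecordSet recordSet) {
        userScopes.computeIfAbsent(user, u -> new HashSet<>()).add(recordSet);
    }

    // Assumption: a grant on a record set also covers everything beneath it.
    boolean canAccess(String user, RecordSet target) {
        Set<RecordSet> scopes =
                userScopes.getOrDefault(user, Collections.<RecordSet>emptySet());
        for (RecordSet scope : scopes) {
            if (scope.contains(target)) return true;
        }
        return false;
    }
}
```
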
Domain Example

Custom Roles
• Actions in PAWN can be grouped together to create roles.
• There are no common roles between archives, so allow custom ones.
• Default roles
  • Producer – Individual data supplier
  • Records Manager – Oversight of producers
  • Archive Manager – Final review and archive publishing
  • Global Administrator – Creates domain, sysadmin-like account
• Sample Actions
  • Setting permissions on record sets
  • Record Schedule creation and modification
  • Add or delete whole packages
  • Modify items in a package
  • …

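Treating a role as a named set of actions might look like the following sketch. The `Action` constants paraphrase the sample actions above; the `Role` class and the particular actions assigned to each role are assumptions for illustration, not PAWN's actual defaults.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Set;

class RolesDemo {
    public static void main(String[] args) {
        // Hypothetical assignment of actions to a default role.
        Role producer = new Role("Producer",
                Action.ADD_PACKAGE, Action.MODIFY_PACKAGE_ITEMS);
        // A custom role defined by one archive for its own workflow.
        Role scheduleEditor = new Role("Schedule Editor",
                Action.EDIT_RECORD_SCHEDULE);
        System.out.println(producer.permits(Action.ADD_PACKAGE));
        System.out.println(scheduleEditor.permits(Action.DELETE_PACKAGE));
    }
}

// Actions paraphrased from the sample list on the slide.
enum Action {
    SET_RECORD_SET_PERMISSIONS,
    EDIT_RECORD_SCHEDULE,
    ADD_PACKAGE,
    DELETE_PACKAGE,
    MODIFY_PACKAGE_ITEMS
}

// A role is a name plus the set of actions it permits.
class Role {
    final String name;
    private final Set<Action> actions = EnumSet.noneOf(Action.class);

    Role(String name, Action... permitted) {
        this.name = name;
        this.actions.addAll(Arrays.asList(permitted));
    }

    boolean permits(Action action) {
        return actions.contains(action);
    }
}
```
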
Custom Package Building
• PAWN provides an API for developing custom package builders
• Custom package builders can be written in Java and implement a simple interface
• Builders interact with a hierarchically structured package
[Diagram: package structure. Manifest (Namespace, Type, Descriptive Name, Manifest …); Data (Type, Descriptive Name, Bits, Metadata …); Metadata (Type, Bits, Name)]

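The slide does not show the builder interface itself, so the following is only a guess at its general shape. `PackageBuilder`, `PackageNode` and `FlatFileBuilder` are hypothetical names, not PAWN's API; the node fields loosely follow the manifest/data/metadata structure in the diagram.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BuilderDemo {
    public static void main(String[] args) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("page-1.txt", "first page".getBytes(StandardCharsets.UTF_8));
        files.put("page-2.txt", "second page".getBytes(StandardCharsets.UTF_8));
        PackageNode pkg = new FlatFileBuilder().build("sample-book", files);
        System.out.println(pkg.countOfType("data") + " data items");
    }
}

// One node of a hierarchical package (hypothetical, modeled on the diagram).
class PackageNode {
    final String type;             // "manifest", "data" or "metadata"
    final String descriptiveName;
    final byte[] bits;             // null for pure container nodes
    final List<PackageNode> children = new ArrayList<>();

    PackageNode(String type, String descriptiveName, byte[] bits) {
        this.type = type;
        this.descriptiveName = descriptiveName;
        this.bits = bits;
    }

    // Counts this node and all descendants of the given type.
    int countOfType(String wanted) {
        int count = type.equals(wanted) ? 1 : 0;
        for (PackageNode child : children) {
            count += child.countOfType(wanted);
        }
        return count;
    }
}

// Assumed shape of the "simple interface" a custom builder implements.
interface PackageBuilder {
    PackageNode build(String packageName, Map<String, byte[]> files);
}

// Example builder: one manifest, one data node per file, one metadata node each.
class FlatFileBuilder implements PackageBuilder {
    @Override
    public PackageNode build(String packageName, Map<String, byte[]> files) {
        PackageNode manifest = new PackageNode("manifest", packageName, null);
        for (Map.Entry<String, byte[]> entry : files.entrySet()) {
            PackageNode data = new PackageNode("data", entry.getKey(), entry.getValue());
            byte[] size = String.valueOf(entry.getValue().length)
                    .getBytes(StandardCharsets.UTF_8);
            data.children.add(new PackageNode("metadata", "size-in-bytes", size));
            manifest.children.add(data);
        }
        return manifest;
    }
}
```

A builder for a more structured source, such as a logical book, would presumably nest manifests rather than produce a flat list.
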
PAWN Archive Gateway
• Pluggable component that provides an API for developing gateways into various services
• Each gateway may have multiple instances, each configured differently
• PAWN handles managing and associating gateways with the appropriate data

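As a rough illustration of the gateway idea, the sketch below routes packages to differently configured instances of the same gateway type. `ArchiveGateway`, `InMemoryGateway` and `GatewayRegistry` are hypothetical names, and keying the association on a record-set name is an assumption; the real gateway API may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

class GatewayDemo {
    public static void main(String[] args) {
        GatewayRegistry registry = new GatewayRegistry();
        // Two instances of the same gateway type, configured differently.
        registry.associate("reports", new InMemoryGateway("store-a"));
        registry.associate("photos", new InMemoryGateway("store-b"));
        System.out.println(registry.publish("reports", "pkg-1", new byte[] {1, 2}));
        System.out.println(registry.publish("photos", "pkg-2", new byte[] {3}));
    }
}

// Assumed gateway contract: store the bits and return where they ended up.
interface ArchiveGateway {
    String publish(String packageId, byte[] bits);
}

// Stand-in for a real storage service; keeps everything in memory.
class InMemoryGateway implements ArchiveGateway {
    private final String locationPrefix;
    final Map<String, byte[]> stored = new HashMap<>();

    InMemoryGateway(String locationPrefix) {
        this.locationPrefix = locationPrefix;
    }

    @Override
    public String publish(String packageId, byte[] bits) {
        String location = locationPrefix + "/" + packageId;
        stored.put(location, bits.clone());
        return location;
    }
}

// Associates each record set with the gateway instance that should receive its data.
class GatewayRegistry {
    private final Map<String, ArchiveGateway> byRecordSet = new HashMap<>();

    void associate(String recordSet, ArchiveGateway gateway) {
        byRecordSet.put(recordSet, gateway);
    }

    String publish(String recordSet, String packageId, byte[] bits) {
        ArchiveGateway gateway = byRecordSet.get(recordSet);
        if (gateway == null) {
            throw new IllegalArgumentException("no gateway configured for " + recordSet);
        }
        return gateway.publish(packageId, bits);
    }
}
```
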
PAWN Architecture
• Divided into producer and archive side components
  • Producer: data supplying and domain management
  • Archive: data storage, resource allocation and archival publishing
• Web-service based communication
• Trust relationship between producer and archive components
  • SAML and PKI

Components

Case Studies
• ICDL Book Builder
  • Custom package builder
  • Multiple data sources
  • Model logical books
• SLAC Record Ingestion
  • Sample NARA ingestion
  • Model government roles
  • DOE Record Schedule
• 10,000 CD-ROMs
  • Remote ingestion
  • Unskilled labor
  • Custom hardware

PAWN Summary
• Platform for ingestion
• Customizable components
  • Roles, ingest and publishing
• Distributed architecture

More information
• Web site: http://www.umiacs.umd.edu/research/adapt
• Wiki link for technical details
• Or “I’m feeling lucky” Google keywords: ADAPT UMIACS