Brief Overview of Major Enhancements to PAWN
Producer – Archive Workflow Network (PAWN)
• Distributed and secure ingestion of digital objects into the archive
• Use of platform-independent web/grid technologies
• Ease of integration with data grids or digital libraries
• XML representation of metadata and bitstreams
• Self-describing bitstream submissions
• Accountability of transfer and guarantee of data integrity
Ingestion Workflow (PAWN)
• Negotiate the Submission Agreement.
• Initialize the workflow and create Submission Information Packages (SIPs).
• Transfer SIPs to the receiving servers.
• Validate the SIP transfer.
• Organize data into collections and transfer it into the distributed archive.
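The ordering of these steps can be sketched as a small state machine. This is only an illustration: the `Stage` names and the `advance` helper are hypothetical and do not come from the PAWN code base.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names mirroring the five workflow steps."""
    AGREEMENT = 1      # submission agreement negotiated
    SIP_CREATION = 2   # workflow initialized, SIPs created
    TRANSFER = 3       # SIPs sent to receiving servers
    VALIDATION = 4     # SIP transfer validated
    ARCHIVED = 5       # data organized and moved into the archive


def advance(stage):
    """Return the stage that follows `stage`; ARCHIVED is terminal."""
    if stage is Stage.ARCHIVED:
        raise ValueError("submission is already archived")
    return Stage(stage.value + 1)
```

Modelling the steps as a strict sequence makes it explicit that, in this sketch, a submission cannot reach the archive without passing validation.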
Distributed Ingestion
• Each Producer registers and arranges files locally prior to transport.
• Multiple distributed archival receiving stations.
• X.509-based authentication between sites.
• Independent Certificate Authorities at each Producer.
• Persistent archive is geographically distributed and managed by a data grid.
Producer
• Provides data to an Archive based on a prior agreement.
• Consists of a management/metadata server and an ingestion client.
• Provides initial arrangement, context, and metadata.
Enhancements to the Producer
• Data submissions are organized through a logical hierarchy negotiated between the archive and the producer.
• Clients no longer see the entire hierarchy, only their attachment points.
• Better state tracking and oversight of submissions.
• METS documents are no longer merged together; they are kept separate to support larger submissions.
• A submission can be broken into multiple METS documents linked together through pointers.
• Producer-signed submissions to ensure integrity.
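One way to picture METS documents "linked together through pointers" is a parent document whose structural map points at each part. The sketch below assumes the standard METS `mptr` element is used for the links; the slides do not say which mechanism PAWN uses, and `build_parent_mets` and the example file names are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_parent_mets(child_urls):
    """Build a parent METS document that points at separate child documents.

    Assumption: links are expressed with <mets:mptr> inside the structMap.
    """
    root = ET.Element(f"{{{METS_NS}}}mets")
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap")
    top = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "submission"})
    for order, url in enumerate(child_urls, start=1):
        # One <div> per part of the submission, each holding a pointer.
        part = ET.SubElement(
            top, f"{{{METS_NS}}}div", {"TYPE": "part", "ORDER": str(order)}
        )
        ET.SubElement(
            part,
            f"{{{METS_NS}}}mptr",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url},
        )
    return root


parent = build_parent_mets(["part-001.mets.xml", "part-002.mets.xml"])
xml_text = ET.tostring(parent, encoding="unicode")
```

Keeping each part in its own document means no single METS file has to grow with the size of the whole submission.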
Different administrator and client views
• Manager / Record Manager
  • Administrator
  • Views the entire producer hierarchy
• Producer / Record Creator
  • View restricted to allowable submission points
New Interactions Between Client and Receiving Servers
• Client can reserve resources before starting to transfer data into the archive.
• Client creates a session with a receiving server and uploads metadata.
• Client uploads bitstreams, and the receiving server validates checksums during transfer.
• Client can resume or retransmit failed submissions.
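Checksum validation during transfer and resumption after a failure can be sketched as follows. This is a minimal in-memory illustration, not PAWN's protocol: the `Receiver` class, the `upload` function, the `fail_at` failure switch, and the choice of SHA-256 are all assumptions.

```python
import hashlib


class Receiver:
    """Hypothetical receiving side: hashes bytes as they arrive."""

    def __init__(self, expected_sha256):
        self.expected = expected_sha256
        self.digest = hashlib.sha256()
        self.offset = 0  # number of bytes accepted so far

    def write(self, chunk):
        self.digest.update(chunk)
        self.offset += len(chunk)

    def is_valid(self):
        # Compare the running checksum with the one declared by the client.
        return self.digest.hexdigest() == self.expected


def upload(data, receiver, chunk_size=4, fail_at=None):
    """Send `data` in chunks, starting where the receiver left off.

    `fail_at` simulates a network failure once that byte offset is reached.
    """
    pos = receiver.offset  # resume point after an earlier failure
    while pos < len(data):
        if fail_at is not None and pos >= fail_at:
            raise ConnectionError("simulated network failure")
        chunk = data[pos:pos + chunk_size]
        receiver.write(chunk)
        pos += len(chunk)


payload = b"example bitstream"
receiver = Receiver(hashlib.sha256(payload).hexdigest())
try:
    upload(payload, receiver, fail_at=8)  # first attempt breaks part-way
except ConnectionError:
    pass
upload(payload, receiver)  # second attempt resumes from receiver.offset
```

Because the receiver tracks both the offset and the running digest, a resumed upload only sends the missing tail and the final checksum still covers the whole bitstream.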
Archive - receiving
• Receives data from a Producer.
• Validates bitstreams and metadata, and sends acknowledgement to the Producer.
• Arranges data into collections and specifies preservation policy.
• Publishes bitstreams into a digital archive.
New Features for Receiver
• Validation Services
  • Designed a standard API and test suite for rapid development of validation services.
  • New classes of services can be easily developed.
• Receiving Server
  • Configurable endpoints into storage or metadata repositories
  • Better handling of multiple producers
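A common interface is what lets new classes of validation services be added easily. The sketch below shows the general shape such an API might take; `ValidationService`, `ValidationResult`, the two example validators, and `run_all` are hypothetical names, not PAWN's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ValidationResult:
    passed: bool
    message: str = ""


class ValidationService(ABC):
    """Hypothetical common interface every validation service implements."""

    @abstractmethod
    def validate(self, name, data):
        """Check one bitstream and return a ValidationResult."""


class NonEmptyValidator(ValidationService):
    def validate(self, name, data):
        if data:
            return ValidationResult(True)
        return ValidationResult(False, f"{name} is empty")


class PdfHeaderValidator(ValidationService):
    def validate(self, name, data):
        # PDF files begin with the marker "%PDF-".
        if data.startswith(b"%PDF-"):
            return ValidationResult(True)
        return ValidationResult(False, f"{name} lacks a PDF header")


def run_all(services, name, data):
    """Apply every configured service to one bitstream."""
    return [service.validate(name, data) for service in services]


results = run_all(
    [NonEmptyValidator(), PdfHeaderValidator()], "report.pdf", b"%PDF-1.4 ..."
)
```

Adding a new class of check then means writing one more subclass; the receiving server's loop does not change.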
Scheduler
• Allocates the processing of data streams from multiple clients to a cluster of receiving servers.
• Clients are required to request a resource reservation.
• The receiving server acknowledges or denies the reservation.
• The client is informed of the reservation and its assigned receiving server.
• Currently, the receiving server has hooks for the scheduler.
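The reservation exchange can be sketched with a toy allocator. The slides do not describe the allocation policy, so the "most free capacity" rule, the `Scheduler` class, and the server names below are all assumptions made for illustration.

```python
class Scheduler:
    """Toy scheduler: tracks free capacity on each receiving server."""

    def __init__(self, capacity):
        self.free = dict(capacity)   # server name -> free capacity units
        self.reservations = {}       # client -> (server, size)

    def reserve(self, client, size):
        """Return the assigned server, or None if the request is denied."""
        candidates = [s for s, free in self.free.items() if free >= size]
        if not candidates:
            return None  # no server can take the reservation
        # Assumed policy: pick the server with the most free capacity.
        server = max(candidates, key=lambda s: self.free[s])
        self.free[server] -= size
        self.reservations[client] = (server, size)
        return server


scheduler = Scheduler({"rs1": 100, "rs2": 50})
first = scheduler.reserve("client-a", 80)    # fits only on rs1
second = scheduler.reserve("client-b", 40)   # rs1 has 20 left, so rs2
third = scheduler.reserve("client-c", 30)    # nothing has 30 free: denied
```

A denied request returns `None` here, standing in for the receiving server refusing the reservation.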