PAWN Progress. July 06, 2006. Overview of changes. New flexible environment for setting up and managing interactions between producers and the archive Domains to organize accounts, record organization, and packages Definable roles that can be flexibly combined and assigned to accounts
PAWN Progress July 06, 2006
Overview of changes • New flexible environment for setting up and managing interactions between producers and the archive • Domains to organize accounts, record organization, and packages • Definable roles that can be flexibly combined and assigned to accounts • Interfaces for designing package builders and archival resource gateways
Components [Diagram: system components. Labels: Archive Managed; Producer Managed; Management Server; Validation Services; Authentication; Package Information; Distributed Archive; Ingestion Status; Schedule Request; Receiving Server; Producer data suppliers]
Overall Organization • Producers organized into domains, each domain containing a record schedule negotiated with the archive. • Each domain contains a hierarchy of the types of data and record sets (convenient groupings from the record schedule). • An end-user operates within a domain with record sets associated with the account.
Package Workflow • Client selects a record set to use as a package template. • A package is built locally and then transferred to a PAWN receiving server. • Optionally, the package is locked to signal that the submission is complete. • Items are reviewed and possibly rejected. • Items are transferred from PAWN into the final archive. • The package is removed from PAWN.
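The workflow above can be read as a small state machine. The following is a minimal sketch under that reading; the state names and the `advance` helper are hypothetical and are not taken from PAWN itself.

```python
from enum import Enum, auto


class PackageState(Enum):
    # Hypothetical state names, one per workflow step on the slide.
    BUILDING = auto()     # assembled locally from a record set template
    TRANSFERRED = auto()  # items are on a PAWN receiving server
    LOCKED = auto()       # producer signalled that the submission is complete
    REVIEWED = auto()     # items reviewed, some possibly rejected
    ARCHIVED = auto()     # items moved into the final archive
    REMOVED = auto()      # package removed from PAWN


# Locking is optional, so TRANSFERRED may go straight to REVIEWED.
TRANSITIONS = {
    PackageState.BUILDING: {PackageState.TRANSFERRED},
    PackageState.TRANSFERRED: {PackageState.LOCKED, PackageState.REVIEWED},
    PackageState.LOCKED: {PackageState.REVIEWED},
    PackageState.REVIEWED: {PackageState.ARCHIVED},
    PackageState.ARCHIVED: {PackageState.REMOVED},
    PackageState.REMOVED: set(),
}


def advance(current, target):
    """Return the new state, or raise if the step is out of order."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target
```

Encoding the optional lock as two outgoing edges from `TRANSFERRED` keeps the rest of the sequence strictly linear.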
Record Organization • Previous version had one hierarchy with attachment points for items as leaf nodes. • It did not allow related leaf nodes to be linked. • The hierarchy served two roles: record organization and administrative organization. • Current version is based on Record Sets. • Administrative structure and record structure are separate. • Record Sets are template packages.
Record Organization • Each domain contains a record schedule. • The record schedule is a hierarchy containing authorities as endpoints. • Domains also contain an organizational hierarchy (offices, projects, etc.). • Record Sets • Group authorities from the record schedule • Are attached to a point in the record hierarchy • Have access permissions • Are presented to producers as package templates
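One way to picture the relationship between a domain, its record schedule, and its record sets is the sketch below. All class and field names are assumptions made for illustration, and the nesting of the sample schedule is guessed; none of this is PAWN's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ScheduleNode:
    """A node in the record schedule; leaves play the role of authorities."""
    name: str
    children: List["ScheduleNode"] = field(default_factory=list)

    def authorities(self):
        # Yield the names of all leaf nodes below (or at) this node.
        if not self.children:
            yield self.name
        for child in self.children:
            yield from child.authorities()


@dataclass
class RecordSet:
    """A group of authorities, offered to producers as a package template."""
    name: str
    note: str
    authorities: List[str]
    attached_to: str  # point in the hierarchy the set is attached to
    allowed_accounts: Set[str] = field(default_factory=set)


@dataclass
class Domain:
    name: str
    schedule: ScheduleNode
    record_sets: List[RecordSet] = field(default_factory=list)

    def add_record_set(self, record_set):
        # A record set may only reference authorities from this schedule.
        known = set(self.schedule.authorities())
        unknown = [a for a in record_set.authorities if a not in known]
        if unknown:
            raise ValueError(f"not in record schedule: {unknown}")
        self.record_sets.append(record_set)

    def templates_for(self, account):
        # Names of the record sets this account may use as templates.
        return [rs.name for rs in self.record_sets
                if account in rs.allowed_accounts]
```

Validating authorities when a record set is added keeps the record structure and the access permissions independent of each other, which matches the separation the slides describe.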
Record Sets: Record Set Sample
Record Set • Name: Research Results • Note: Reports, presentations, and other published research results • Allowed Accounts • Record Schedule Mapping • Presentations • Technical Reports
Record Schedule • Administrative • Strategic and Performance Plans • Appointment and Promotion • Policies and Committees • Alumni Affairs • Financial • Contracts and Grants • Payroll • Donations • Publication Reports • Technical Reports • - Archiving Rules • Presentations • Posters
Domains • Offices of the President and Vice-Presidents • College of Sciences • College of Engineering • College of Medicine • College of Arts and Humanities • College of Behavioral and Social Sciences • …
College of Sciences Domain • Office of the Dean • Chemistry • Mathematics • Physics • Computer Science • Business Office • Research Groups • Labs • …
Flexible Account Roles • Previous version had three fixed account types: producer, manager, and administrator. • Current version allows actions in PAWN to be grouped into roles. • Each account is assigned a role. • Sample actions in PAWN • Record Set/Schedule management • Package creation/deletion/modification • Account management
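Grouping actions into roles can be sketched as sets of action names. The role names and action identifiers below are hypothetical; they only mirror the sample actions listed on the slide.

```python
# Hypothetical roles, each a named group of PAWN actions.
ROLES = {
    "producer": frozenset({"package.create", "package.modify", "package.delete"}),
    "schedule_manager": frozenset({"record_set.manage", "record_schedule.manage"}),
    "account_admin": frozenset({"account.manage"}),
}


def actions_for(role_names):
    """Union of the actions granted by the given roles."""
    actions = set()
    for name in role_names:
        actions |= ROLES[name]
    return actions


def is_allowed(role_names, action):
    """True if any of the account's roles grants the action."""
    return action in actions_for(role_names)
```

Because a role is just a set, combining roles for one account is a set union, which is one simple way to get the flexible combinations mentioned in the overview.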
SAML Usage • SAML Assertions are issued by managers • Contain manager namespace, domain, username • Contain list of allowed actions by the client • Contain client’s public key (holder-of-key) • Signed by manager • SAML Assertions authenticate and authorize a client for archive-side services.
[Diagram: Call Overlap between Producer and Archive. Labels: Administrative Metadata Calls; Package Management Calls; Archive Management Calls]
Sample SAML Assertion
<Assertion AssertionID="b5ad81157714985340250bc43d704c44" IssueInstant="2006-07-05T15:07:33.898Z" Issuer="http://umiacs.umd.edu" MajorVersion="1" MinorVersion="1">
  <Conditions NotBefore="2006-07-05T09:07:33.898Z" NotOnOrAfter="2006-07-05T15:07:33.898Z"></Conditions>
  <AttributeStatement>
    <Subject>
      <NameIdentifier NameQualifier="umiacs">umiacs:toaster</NameIdentifier>
      <SubjectConfirmation>
        <ConfirmationMethod>urn:oasis:names:tc:SAML:1.0:cm:holder-of-key</ConfirmationMethod>
        <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
          <ds:X509Data>
            <ds:X509Certificate>MIIDxjCCAy+gAwIBAgIDEAACMA0GCSqGSIb3DQEB....</ds:X509Certificate>
          </ds:X509Data>
        </ds:KeyInfo>
      </SubjectConfirmation>
    </Subject>
    <Attribute AttributeName="package_item" AttributeNamespace="http://umiacs.umd.edu/adapt/saml">
      <AttributeValue>view</AttributeValue>
      <AttributeValue>create</AttributeValue>
      <AttributeValue>modify</AttributeValue>
    </Attribute>
    ... ...
  </AttributeStatement>
SAML Assertion (cont)
  <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
    <ds:SignedInfo>
      <ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"></ds:CanonicalizationMethod>
      <ds:SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"></ds:SignatureMethod>
      <ds:Reference URI="#b5ad81157714985340250bc43d704c44">
        <ds:Transforms>
          <ds:Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"></ds:Transform>
          <ds:Transform Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#">
            <ec:InclusiveNamespaces xmlns:ec="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="code ds kind rw saml samlp typens #default"></ec:InclusiveNamespaces>
          </ds:Transform>
        </ds:Transforms>
        <ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"></ds:DigestMethod>
        <ds:DigestValue>r7C4oNmlf4h8cXi1dGU+MIGmGbM=</ds:DigestValue>
      </ds:Reference>
    </ds:SignedInfo>
    <ds:SignatureValue>Rstfd1HKTe68WLQrgAvmS5hDm7SVbXnEgMlotW3aiu....</ds:SignatureValue>
    <ds:KeyInfo>
      <ds:X509Data>
        <ds:X509Certificate>MIIDyjCCAzOgAwIBAgIDEAABMA0GCSqGSIb3DQ....</ds:X509Certificate>
      </ds:X509Data>
    </ds:KeyInfo>
  </ds:Signature>
</Assertion>
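To illustrate how an archive-side service might read such an assertion, here is a minimal sketch that extracts the allowed actions and checks the validity window. It uses a trimmed copy of the sample above, with no XML namespace on the SAML elements, as on the slide. The function names are hypothetical, and the sketch deliberately does not verify the XML signature or the holder-of-key confirmation, both of which a real service must do before trusting any of these values.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Trimmed copy of the slide's sample; subject and signature omitted.
SAMPLE = """<Assertion AssertionID="b5ad81157714985340250bc43d704c44"
    Issuer="http://umiacs.umd.edu" MajorVersion="1" MinorVersion="1">
  <Conditions NotBefore="2006-07-05T09:07:33.898Z"
      NotOnOrAfter="2006-07-05T15:07:33.898Z"></Conditions>
  <AttributeStatement>
    <Attribute AttributeName="package_item"
        AttributeNamespace="http://umiacs.umd.edu/adapt/saml">
      <AttributeValue>view</AttributeValue>
      <AttributeValue>create</AttributeValue>
      <AttributeValue>modify</AttributeValue>
    </Attribute>
  </AttributeStatement>
</Assertion>"""


def parse_instant(text):
    """Parse a UTC timestamp in the format used by the sample."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def allowed_actions(assertion):
    """Map each attribute name to the set of actions listed under it."""
    result = {}
    for attr in assertion.iter("Attribute"):
        values = {v.text for v in attr.findall("AttributeValue")}
        result.setdefault(attr.get("AttributeName"), set()).update(values)
    return result


def is_valid_at(assertion, when):
    """Check NotBefore <= when < NotOnOrAfter."""
    cond = assertion.find("Conditions")
    not_before = parse_instant(cond.get("NotBefore"))
    not_on_or_after = parse_instant(cond.get("NotOnOrAfter"))
    return not_before <= when < not_on_or_after


assertion = ET.fromstring(SAMPLE)
```

With the trimmed sample, `allowed_actions(assertion)` yields one entry, `package_item`, holding the three values from the slide.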
Package Creation • Packages are built using a Record Set as a template. • Each category in a Record Set has a hierarchy of manifests attached. • Manifests are an abstraction of underlying METS documents. • Custom package builders use the manifest interface.
[Diagram: manifest structure. Manifest: Namespace, Type, Descriptive Name, Manifest … Data: Type, Descriptive Name, Bits, Metadata … Metadata: Type, Bits, Name]
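The manifest diagram suggests a nested structure of manifests, data items, and metadata. The sketch below is one possible reading of it; the class layout, the field names, and the `walk` helper are assumptions made for illustration and do not describe PAWN's manifest interface or its METS mapping.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metadata:
    type: str
    name: str
    bits: bytes


@dataclass
class Data:
    type: str
    descriptive_name: str
    bits: bytes
    metadata: List[Metadata] = field(default_factory=list)


@dataclass
class Manifest:
    namespace: str
    type: str
    descriptive_name: str
    data: List[Data] = field(default_factory=list)
    children: List["Manifest"] = field(default_factory=list)

    def walk(self, prefix=""):
        """Yield (path, data item) pairs for this manifest and its children."""
        here = f"{prefix}/{self.descriptive_name}"
        for item in self.data:
            yield f"{here}/{item.descriptive_name}", item
        for child in self.children:
            yield from child.walk(here)
```

A package builder written against a structure like this only needs to create manifests and attach data, without touching the underlying document format.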
Package Builders • Default Builder • Create files and folders • Attach descriptive metadata to files or folders • ICDL Builder • Create ‘books’ with Dublin Core metadata • Uses ICDL database as source for book list and metadata
Package Scheduling and Submission • The scheduler decides which receiving server will store a package. • The Condor classad system is used. • Each receiving server periodically publishes its available resources. • The client requests space.
[Diagram: Client, Scheduler, Receiver, Receiver Classads. Steps: 1. Space Requirements; 2. Evaluate classad; 3. Create Reservation; 4. Allocated Server; 5. Package Transfer]
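The five steps in the diagram can be sketched as follows. This is only an illustration: a plain free-space comparison stands in for Condor classad evaluation, no Condor API is used, and every name here (`ReceiverAd`, `Scheduler`, `reserve`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReceiverAd:
    """What a receiving server periodically publishes about itself."""
    name: str
    free_bytes: int


class Scheduler:
    def __init__(self):
        self.ads = {}

    def publish(self, ad):
        # A newer ad from the same receiver replaces the older one.
        self.ads[ad.name] = ad

    def reserve(self, required_bytes):
        """Steps 1-4: match the request, reserve space, return the server."""
        candidates = [ad for ad in self.ads.values()
                      if ad.free_bytes >= required_bytes]
        if not candidates:
            return None
        # Assumed policy: prefer the receiver with the most free space.
        best = max(candidates, key=lambda ad: ad.free_bytes)
        best.free_bytes -= required_bytes  # record the reservation
        return best.name
```

The client would then carry out step 5 by transferring the package to the returned server. Decrementing the published free space at reservation time avoids handing the same space to two clients between two publications.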
Publishing into Archival Resources • PAWN provides an interface for registering gateways into archival resources • Gateways provide: • Configuration GUI • Client GUI • Mover to transfer data from PAWN to archive • PAWN provides: • Configuration storage • Access to all items in a package • Access to contextual information about a package • Infrastructure for storing and loading gateway drivers.
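The division of responsibilities above can be sketched as a small plug-in interface. This is an assumption-laden illustration, not PAWN's gateway API: the class names, the registry functions, and the in-memory example gateway are all hypothetical, and the two GUI components are left out.

```python
from abc import ABC, abstractmethod


class ArchiveGateway(ABC):
    """Hypothetical gateway driver; PAWN stores and supplies the settings."""

    def __init__(self, settings):
        self.settings = dict(settings)

    @abstractmethod
    def move(self, items, context):
        """Transfer items to the archive; return one identifier per item."""


class InMemoryGateway(ArchiveGateway):
    """Stand-in archive that keeps items in a dictionary."""

    def __init__(self, settings):
        super().__init__(settings)
        self.store = {}

    def move(self, items, context):
        ids = []
        for name, data in items.items():
            path = f"{self.settings['base_path']}/{context['package']}/{name}"
            self.store[path] = data
            ids.append(path)
        return ids


# Registry standing in for the infrastructure that stores and loads drivers.
GATEWAYS = {}


def register_gateway(name, gateway_cls):
    GATEWAYS[name] = gateway_cls


def load_gateway(name, settings):
    return GATEWAYS[name](settings)
```

For example, after `register_gateway("memory", InMemoryGateway)`, a call to `load_gateway("memory", {"base_path": "/archive"})` returns a gateway whose `move` method receives the package items and the contextual information.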
SRB Publishing [Diagram: PAWN Scheduler, SRB Gateway, PAWN Client, PAWN Package, Archival Context, SRB. Steps: 1. SRB Configuration; 2. SRB Path & item list; 3. Package Items; 4. Package Items; 5. GUID or Path]
Screenshots • Client Interface • Configuration Interface • Resulting Log Entry
XFDU publishing • Create an XFDU-compatible Information Package. • XFDU is similar to METS. • Separates data definitions from structural information • Similar file attributes (size, checksum, etc.) • PAWN mapping • InformationPackageMap contains ContentUnits to recreate the hierarchy of data in a PAWN package. • DataObjects register individual files. • XFDU manifest and data files are combined to form an Information Package.
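The mapping described above can be sketched by walking a package hierarchy and emitting one content unit per folder or file and one data object per file. This is a simplified illustration: the nested-dictionary input format and the `build_manifest` function are hypothetical, the element and attribute names follow the terms on the slide and my recollection of the XFDU vocabulary, and the output carries no namespace and has not been validated against the XFDU schema.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_manifest(package):
    """Build a simplified XFDU-like manifest from {name: dict-or-bytes}."""
    root = ET.Element("XFDU")
    package_map = ET.SubElement(root, "informationPackageMap")
    objects = ET.SubElement(root, "dataObjectSection")
    counter = 0

    def add(parent, name, node, path):
        nonlocal counter
        unit = ET.SubElement(parent, "contentUnit", {"textInfo": name})
        if isinstance(node, dict):
            # A folder: recurse so the content units mirror the hierarchy.
            for child_name, child in node.items():
                add(unit, child_name, child, f"{path}/{child_name}")
        else:
            # A file: register a data object and point to it from the unit.
            counter += 1
            obj_id = f"obj{counter}"
            ET.SubElement(unit, "dataObjectPointer", {"dataObjectID": obj_id})
            obj = ET.SubElement(objects, "dataObject", {"ID": obj_id})
            stream = ET.SubElement(obj, "byteStream", {"size": str(len(node))})
            ET.SubElement(stream, "fileLocation", {"href": path})
            checksum = ET.SubElement(stream, "checksum", {"checksumName": "MD5"})
            checksum.text = hashlib.md5(node).hexdigest()

    for name, node in package.items():
        add(package_map, name, node, name)
    return root
```

Calling `ET.tostring(build_manifest({"reports": {"a.txt": b"hello"}}))` serializes the result. Keeping the structural map and the data object section as separate subtrees reflects the separation of structure from data definitions noted on the slide.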