The eXtensible Access Method (XAM) Standard, by Steve Todd (EMC)
Presented by Paul McKeown (EMC) at the 4th Annual Digital Curation Conference, December 3, 2008, Edinburgh, Scotland.
Purpose
• This paper was written to introduce XAM technology to the Digital Curation community.
• Agenda
  • Why XAM?
  • XAM Coding Example
  • Standards-based Digital Archives
  • Vendor plug-fest: Sun, EMC, and HP
  • Opportunities for Digital Curation Research
Why XAM?
• SNIA (Storage Networking Industry Association)
  • 100 Year Archive Survey
• Key Problem: Logical and Physical Retention of Fixed Content
  • Logical: Data Formats
  • Physical: Migration to new Digital Archives
• XAM Solution
  • Strong binding between content and metadata
  • Vendor-neutral API for storing “fixed content objects”
  • Standards-based export-import between different vendors
The XAM Approach
• Location-independent object naming
  • Each ‘Data Object’ (XSet) stored in XAM is assigned a globally unique name (a XUID).
  • These location-independent names allow data to migrate within or between XAM storage systems without impacting applications accessing the data.
• Rich metadata
  • XAM allows applications to bundle MIME-typed contextual metadata together with application data, facilitating easier data interchange among applications and longer “shelf life” for application data.
  • XAM provides SQL-like query functionality.
• Pluggable architecture for storage system support
  • XAM Storage System vendors can plug their systems into the XAM API by creating a provider for the Vendor Interface Module (VIM) API.
  • XAM also provides a standardized set of management disciplines and semantics for fixed content, such as retention, query, and shredding.
XAM Software Architecture
(Layer diagram, top to bottom)
• Application Program (ISV / Custom)
• XAM Toolkit API
• XAM Toolkit Library (SNIA)
• XAM API
• XAM API Library (SNIA)
• VIM API
• VIMs: Vendor A VIM, Vendor B VIM, Reference VIM (SNIA)
XAM Object Model
• XAM defines 3 primary objects
  • XSet
    • Primary storage abstraction in XAM
    • Stores metadata and data in a collection of Fields
  • XSystem
    • Logical container of XSets
    • Each XSystem Instance represents a connection with a particular XAM Storage System
  • XAM Library
    • Responsible for discovering and managing all VIMs in the XAM application environment
    • Provides Fields to report on and control global XAM attributes (API Revision Level, API Logging Level, etc.)
    • Serves as a factory for XSystem Instances via the Connect method
XAM Primary Objects - XSystem
• XSystem is an abstraction in the XAM API representing a logical container of XSets.
  • This is distinct from, and possibly a subset or superset of, the physical XAM Storage System.
• An XSystem Instance combines an XSystem with an authenticated connection to one or more XAM Storage Systems.
  • An XSystem Instance is equivalent to a XAM Session.
  • In the XAM API, an XSystem Instance is created by calling XAMLibrary_Connect with a valid XRI (XSystem Resource Identifier).
• XSystem Instances are used to create, retrieve, and delete XSets.
• XSystem Instances also have fields describing the system’s supported capabilities and management policies.
XAM Primary Objects - XSet
• XSet – addressable “unit of storage” in the XAM model
• To store data in an XSystem an application must:
  • Create an XSet Instance via the XSystem
  • Create/populate XSet fields with data
  • Commit the new XSet to persistent storage, saving the resultant XUID
XSet Fields
• Properties
  • “Simple” types (Boolean, Int, Float, String, DateTime, XUID)
  • Type checked/enforced by storage system
  • Manipulated via “Property Get/Set” methods
• XStreams
  • Bytestreams, up to 2^64 bytes
  • Type assumed to be a valid MIME type, but not checked/enforced by storage system
  • Manipulated via POSIX-style I/O methods (e.g., open, read, write, close)
XSet Field ‘Binding’ Attribute
• When a field is marked ‘Binding’, the field’s value has a direct correspondence with the XSet’s XUID (the value is relevant to the XSet’s identity).
• ‘Non-Binding’ fields may be freely modified within the XSet, just as with traditional read/write storage.
• On an attempt to modify a ‘Binding’ field, however, the XSystem silently creates a completely new XSet as a copy of the original and assigns it a new XUID; the original XSet must be preserved under the original XUID.
• Applications are free to decide which XSet fields are ‘Binding’ at the time the fields are created.
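The identity rule above can be illustrated with a toy in-memory model. This is only a sketch and does not use the XAM API: `toy_store`, `toy_xset`, `toy_modify_field`, and `toy_commit` are hypothetical names, a plain counter stands in for XUID generation, and the sketch assumes the new identifier is handed out at commit time. It does not model keeping the original XSet retrievable under its old identifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: a counter stands in for XUID generation. */
typedef struct {
    unsigned long next_id;   /* next identifier the toy store hands out */
} toy_store;

typedef struct {
    unsigned long xuid;      /* 0 means "never committed" */
    bool binding_dirty;      /* a binding field changed since last commit */
} toy_xset;

/* Record a field modification; only binding fields affect identity. */
void toy_modify_field(toy_xset *x, bool is_binding)
{
    assert(x != NULL);
    if (is_binding)
        x->binding_dirty = true;
}

/* Commit: a first commit or a binding change yields a new identifier;
 * otherwise the existing identifier is kept. */
unsigned long toy_commit(toy_store *s, toy_xset *x)
{
    assert(s != NULL && x != NULL);
    if (x->xuid == 0 || x->binding_dirty)
        x->xuid = s->next_id++;
    x->binding_dirty = false;
    return x->xuid;
}
```

In this model, committing after a non-binding change returns the same identifier, while committing after a binding change returns a fresh one.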
XUID – XSet Unique Identifier
• XUID is the permanent name for an XSet
  • Assigned by XAM Storage System
  • XUIDs are globally unique
• XUID native format is a binary sequence (10 – 80 bytes)
  • Base64 (RFC 2045) recommended for printable interchange
• An XSet’s XUID has a strict relationship with the XSet’s ‘Binding’ fields
  • If a ‘Binding’ field is modified, a new XSet with a new XUID is created upon commit
  • The old XSet is preserved as-is
(Figure: XUID Format)
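As an illustration of the printable form, the sketch below Base64-encodes a binary XUID using the standard RFC 2045 alphabet. It is a generic encoder, not the XAM toolkit's own conversion routine; `xuid_to_base64` is a hypothetical name, and the sketch emits a single unbroken line rather than the 76-character lines RFC 2045 uses for mail bodies.

```c
#include <assert.h>
#include <stddef.h>

/* Encodes len bytes from src as Base64 with '=' padding and no line breaks.
 * out must hold 4 * ((len + 2) / 3) + 1 bytes. Returns the number of
 * characters written (excluding the terminator), or 0 if out is too small. */
size_t xuid_to_base64(const unsigned char *src, size_t len,
                      char *out, size_t out_size)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t needed = 4 * ((len + 2) / 3) + 1;
    size_t i = 0, o = 0;

    assert(src != NULL && out != NULL);
    if (out_size < needed)
        return 0;

    while (i + 2 < len) {                 /* full 3-byte groups */
        unsigned long v = ((unsigned long)src[i] << 16) |
                          ((unsigned long)src[i + 1] << 8) |
                          (unsigned long)src[i + 2];
        out[o++] = tbl[(v >> 18) & 0x3F];
        out[o++] = tbl[(v >> 12) & 0x3F];
        out[o++] = tbl[(v >> 6) & 0x3F];
        out[o++] = tbl[v & 0x3F];
        i += 3;
    }
    if (i < len) {                        /* 1 or 2 trailing bytes */
        int two = (i + 1 < len);
        unsigned long v = (unsigned long)src[i] << 16;
        if (two)
            v |= (unsigned long)src[i + 1] << 8;
        out[o++] = tbl[(v >> 18) & 0x3F];
        out[o++] = tbl[(v >> 12) & 0x3F];
        out[o++] = two ? tbl[(v >> 6) & 0x3F] : '=';
        out[o++] = '=';
    }
    out[o] = '\0';
    return o;
}
```

With this encoding, the 10 to 80 byte native range corresponds to 16 to 108 printable characters.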
XAM Write Example - Overview
Steps performed by a XAM 1.0 application program:
• Connect to XSystem
• Create New XSet
• Add XSet Metadata
• Write XSet Data
• Commit XSet
• Release Resources
(Diagram: Application Program → XAM API → XAM API Library (xam.dll) → VIM API → VIM (example_vim.dll) → XSS Protocol → XAM Storage System)
XAM Write Example – Connect to XSystem
    xam_string vXRI = "snia-xam://example_vim!10.1.1.1";
    vStatus = XAMLibrary_Connect(vXRI, &vXSystem);
(Diagram: Application → XAMLibrary_Connect → XAM API Library (xam.dll) → VIM (example_vim.dll) → XAM Storage System 10.1.1.1. The XRI snia-xam://example_vim!10.1.1.1 names both the VIM and the storage system address.)
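To make the structure of the example XRI concrete, here is a minimal sketch that splits it into a VIM name and a storage system address. It assumes only the shape shown on the slide, `snia-xam://<vim>!<address>`; the full XRI grammar may allow other forms that this sketch rejects. `split_xri` is a hypothetical helper, not a XAM API call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits "snia-xam://<vim>!<address>" into its two
 * parts. Returns 0 on success, -1 if the string does not have that shape
 * or an output buffer is too small. */
int split_xri(const char *xri, char *vim, size_t vim_size,
              char *addr, size_t addr_size)
{
    static const char scheme[] = "snia-xam://";
    const size_t scheme_len = sizeof scheme - 1;
    const char *rest, *bang;
    size_t vim_len, addr_len;

    assert(xri != NULL && vim != NULL && addr != NULL);

    if (strncmp(xri, scheme, scheme_len) != 0)
        return -1;                        /* wrong or missing scheme */

    rest = xri + scheme_len;
    bang = strchr(rest, '!');
    if (bang == NULL)
        return -1;                        /* sketch requires an explicit VIM */

    vim_len = (size_t)(bang - rest);
    addr_len = strlen(bang + 1);
    if (vim_len == 0 || addr_len == 0 ||
        vim_len >= vim_size || addr_len >= addr_size)
        return -1;                        /* empty part or buffer too small */

    memcpy(vim, rest, vim_len);
    vim[vim_len] = '\0';
    memcpy(addr, bang + 1, addr_len + 1); /* copies the terminator too */
    return 0;
}
```

For the slide's XRI this yields the VIM name `example_vim` and the address `10.1.1.1`.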
XAM Connection Authentication
• Once connected, an XSystem Instance must authenticate
• XAM uses the SASL authentication framework
  • Simple Authentication and Security Layer (RFC 4422)
  • PLAIN and ANONYMOUS methods always available
  • More advanced SASL methods (DIGEST-MD5, SECURID, KERBEROS-V5) advertised via XSystem property
    • .xsystem.auth.SASLmechanism.list.<mechanism>
• PLAIN Authentication
  • "PLAIN\0\0<name>\0<secret>\0"
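The PLAIN buffer layout above can be assembled as in the following sketch. `build_plain_auth_buffer` is a hypothetical stand-in for the `BuildAuthBuffer` helper that appears in the write example; its signature is an assumption, and only the byte layout is taken from the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "PLAIN\0\0<name>\0<secret>\0" into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t build_plain_auth_buffer(char *buf, size_t buf_size,
                               const char *name, const char *secret)
{
    static const char mech[] = "PLAIN";   /* sizeof includes the NUL */
    size_t name_len, secret_len, total, pos = 0;

    assert(buf != NULL && name != NULL && secret != NULL);

    name_len = strlen(name);
    secret_len = strlen(secret);
    /* "PLAIN" + NUL, one extra NUL, name + NUL, secret + NUL */
    total = sizeof mech + 1 + name_len + 1 + secret_len + 1;
    if (total > buf_size)
        return 0;

    memcpy(buf + pos, mech, sizeof mech);       /* "PLAIN\0" */
    pos += sizeof mech;
    buf[pos++] = '\0';                          /* second NUL in the layout */
    memcpy(buf + pos, name, name_len + 1);      /* "<name>\0" */
    pos += name_len + 1;
    memcpy(buf + pos, secret, secret_len + 1);  /* "<secret>\0" */
    pos += secret_len + 1;
    return pos;
}
```

Because the buffer contains embedded NUL bytes, its length has to be carried separately, which is why the write example passes a length alongside the buffer to XSystem_Authenticate.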
XAM Write Example – Authenticate
    authDataLength = BuildAuthBuffer(authBuffer, authName, authSecret);
    vStatus = XSystem_Authenticate(vXSystem, authBuffer, authDataLength, &authStream);
    XStream_Close(authStream);
(Diagram: Application → XSystem_Authenticate → XAM API Library (xam.dll) → VIM (example_vim.dll) → XAM Storage System 10.1.1.1, carrying "PLAIN\0\0<name>\0<secret>\0")
XAM Write Example – Create XSet
    xset_handle vXSet;
    vStatus = XSystem_CreateXSet(vXSystem, XSET_MODE_MODIFY, &vXSet);
(Diagram: Application → XSystem_CreateXSet → XAM API Library (xam.dll) → VIM (example_vim.dll) → XAM Storage System 10.1.1.1, carrying XSET_MODE_MODIFY)
XAM Write Example – Add XSet Metadata
    vStatus = XAM_CreateString(vXSet,
                               "com.example.archive.invoice_id", /* field name */
                               TRUE,                             /* binding */
                               vInvoiceID);                      /* value */
(Diagram: Application → XAM_CreateString → XAM API Library (xam.dll) → VIM (example_vim.dll) → XAM Storage System 10.1.1.1, carrying "com.example.archive.invoice_id")
XAM Write Example – Write Binary Data
• XAM stores binary data in special fields called Streams
• Streams present a POSIX-like write/read interface
    xstream_handle vXStream;
    vStatus = XAM_CreateXStream(vXSet,
                                "com.example.archive.invoice_image", /* field name */
                                TRUE,                                /* binding */
                                "image/tiff",                        /* MIME type */
                                &vXStream);
    /* … Transfer Image Data … */
    vStatus = XStream_Close(vXStream); /* completes write */
XAM Write Example – Writing XStream Data
• Transferring data from a file to an XStream
    do {
        /* Read data from file into buffer. */
        vReadLength = fread(vDataBuffer, sizeof(char), BUFFER_SIZE, vInputFile);
        if (vReadLength > 0) {
            vPosition = 0;
            while (vPosition < vReadLength) {
                /* Write data from buffer to XStream; a write may be partial. */
                vStatus = XStream_Write(vXStream, &vDataBuffer[vPosition],
                                        vReadLength - vPosition, &vWriteLength);
                /* Stop on failure or no progress so the loop cannot spin
                   forever (assumes a zero status means success). */
                if (vStatus != 0 || vWriteLength == 0)
                    break;
                vPosition += vWriteLength;
            }
            /* Update accounting info. */
            vTotalWrite += vPosition;
            vTotalRead += vReadLength;
        }
    } while (vReadLength > 0 && vStatus == 0);
XAM Write Content – Commit XSet
    vStatus = XSet_Commit(vXSet, &vXUID);
After Commit, the application stores the returned XUID permanently, for future XSet retrieval.
(Diagram: Application → XSet_Commit → XAM API Library (xam.dll) → VIM (example_vim.dll) → XAM Storage System 10.1.1.1, carrying the xset_handle)
XAM Write Content – Code Overview
    // Connect to XSystem
    vStatus = XAMLibrary_Connect(vXRI, &vXSystem);

    // XSystem Authentication
    vAuthDataLength = BuildAuthBuffer(vAuthBuffer, sAuthMethod, sAuthName, sAuthSecret);
    vStatus = XSystem_Authenticate(vXSystem, vAuthBuffer, vAuthDataLength, &vAuthStream);

    // Close auth stream
    vStatus = XStream_Close(vAuthStream);

    // Create XSet
    vStatus = XSystem_CreateXSet(vXSystem, XSET_MODE_MODIFY, &vXSet);

    // Add some identifying metadata
    vStatus = XAM_CreateInt(vXSet, PROPERTY(format_revision), true, 1);
    vStatus |= XAM_CreateString(vXSet, PROPERTY(invoice_id), true, vInvoiceID);
    vStatus |= XAM_CreateString(vXSet, PROPERTY(cust_id), true, vCustomerName);

    // Write invoice image data using multiple threads and streams…
    vStatus = writeFileData(vXSet, vFilename, &vTotalBytes, &vSegmentCount);

    // Additional metadata
    vStatus = XAM_CreateInt(vXSet, PROPERTY(image_count), true, vSegmentCount);

    // Commit XSet
    vStatus = XSet_Commit(vXSet, &vXUID);

    // Display resulting XUID
    XUIDToString(vXUID, &vXUIDString);
    printf("Successfully wrote XSet, Base64 XUID:\n %s\n", vXUIDString);
XAM Retention Model
• Retention is the ability of an XSystem to prevent XSet deletion for an application-specified time period.
• All XAM Retention policies have 3 primary aspects:
  • Enabled: Has the associated retention been turned on?
  • Duration: How long shall deletion be prevented (in milliseconds)?
  • Start Time: When does the retention clock start counting down?
• XAM Base Retention
  • Start Time == XSet XUID Time
• XAM Event Retention
  • Start Time is defined at some future date, triggered by business conditions
• XAM Hold
  • “Freezes” an XSet until all holds are released
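The three aspects of a retention policy, plus holds, can be sketched as a simple deletion check. This is an illustrative model, not XAM API code: the struct, its field names, and both functions are hypothetical. It also assumes that deletion becomes permitted at exactly start time plus duration, and it covers only a policy whose start time is already known (base retention, or event retention after its trigger).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model of one retention policy (names are hypothetical). */
typedef struct {
    bool    enabled;      /* has this retention been turned on? */
    int64_t start_ms;     /* when the retention clock starts */
    int64_t duration_ms;  /* how long deletion is prevented */
} retention_t;

/* True if the policy still prevents deletion at time now_ms. */
bool retention_active(const retention_t *r, int64_t now_ms)
{
    assert(r != NULL);
    return r->enabled && now_ms < r->start_ms + r->duration_ms;
}

/* Deletion is allowed only when no hold is in place and the
 * retention period has lapsed (or retention is not enabled). */
bool can_delete(const retention_t *r, unsigned hold_count, int64_t now_ms)
{
    if (hold_count > 0)
        return false;     /* any outstanding hold freezes the XSet */
    return !retention_active(r, now_ms);
}
```

For example, with a start time of 1000 ms and a duration of 5000 ms, this model refuses deletion at 5999 ms, permits it at 6000 ms, and refuses it at any time while a hold is outstanding.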
Opportunities for Research
• XAM provides a globally flat namespace.
  • Reduces management burden for configuring large digital archives (potentially no file systems, LUNs, or databases to configure)
• XAM provides a location-independent object model for data.
  • XUIDs are a form of persistent identifier; XAM places no limits on object counts.
• XAM defines a canonical format for data migration.
  • Standardized data exchange format allows customers to migrate data from one system to another without engaging the applications that reference the data.
• XAM provides a framework for coupling policy and other metadata together with data.
  • Archival Information Packages can be stored in one XSet and represented with one XUID.
For Additional Information
• SNIA XAM Initiative
  • XAMI home page - http://www.snia.org/forums/xam/
  • XAM API specs - http://www.snia.org/forums/xam/specs
  • XAM Demo - http://www.snia.org/forums/xam/flshdemo/1282_SNIA_XAM.htm
• SNIA Data Management Forum (DMF)
  • DMF home page - http://www.snia.org/forums/dmf/
  • DMF Long Term Archive - http://www.snia.org/forums/dmf/programs/ltacsi/
  • DMF 100 Year Survey - http://www.snia.org/forums/dmf/knowledge/100YrATF_Archive-Requirements-Survey_20070619.pdf
• Steve Todd’s weblog
  • http://stevetodd.typepad.com/my_weblog/digital_preservation/
  • http://stevetodd.typepad.com/my_weblog/centera/
• EMC Developer’s Network (Centera)
  • http://community.emc.com/community/edn/centera