Grid-Based File Access: The Legion I/O Model • Brian S. White, Andrew S. Grimshaw, Anh Nguyen-Tuong • Department of Computer Science, University of Virginia
Overview • Overview of Legion • I/O User Interface • Server Implementation • Performance
Design Principles of Legion • Object-based grid operating system • a single virtual system • a collection of heterogeneous resources • extensible and replaceable components • Specify the functionality (interface), not the implementation • reference implementations are provided for core objects only • users are encouraged to substitute their own implementations for specific requirements
Grid File Systems • Performance • Low response times • High throughput • Usability • Minimizing any change to legacy code • Interface which hides the grid environment • Rich interface to take full advantage of grid’s potential
Naming and Binding • Three-level naming hierarchy • Name space is hierarchical and rooted • Human-readable context names • Location-independent identifiers (LOIDs, Legion Object Identifiers), obtained via ContextObjects • Object Addresses • [Figure labels: Context Space, Context Name, LOID, Binding Process, Object Address]
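The three naming levels can be pictured as two successive lookups. The sketch below is only an illustration: the dictionaries, the LOID string, and the host/port pair are hypothetical stand-ins, since real Legion resolves names through ContextObjects and a binding process rather than local tables.

```python
# Hypothetical tables standing in for context space and the binding process.
CONTEXT_SPACE = {
    "/User1/ContextA/Foo": "loid-0001",
    "/User1/ContextB/FooAlias": "loid-0001",  # a second name for the same object
}
BINDINGS = {"loid-0001": ("host-a.example.org", 7001)}  # LOID -> Object Address


def resolve(context_name):
    """Map a context name to an Object Address in two steps."""
    loid = CONTEXT_SPACE[context_name]  # level 1 -> level 2: context name to LOID
    return BINDINGS[loid]               # level 2 -> level 3: LOID to Object Address


def migrate(loid, new_address):
    """Rebinding a LOID changes the address but leaves every name untouched."""
    BINDINGS[loid] = new_address
```

Because names refer to the LOID and not to the address, calling `migrate` changes what both context names resolve to without editing either entry in context space.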
I/O User Interface • Familiar Interfaces • Command Line Utilities • e.g. legion_ls, legion_cat, etc. • Remote I/O Interface • Basic I/O Interface • raw and buffered • Low impact buffered interface • Legion-aware nfsd • Parallel I/O Interface
Command Line Utilities • Use commands to navigate context space and manipulate its structure • UNIX-like commands: • legion_ls, legion_cat, etc. • Bridging the gap between context space and the traditional file system • legion_import_tree • copies a local directory tree, creating a Legion object for each subdirectory and file • legion_export_dir • makes a UNIX directory visible in context space, without creating stand-alone objects for each contained file and subdirectory
Basic I/O Interface • C and C++ I/O library • Calls follow the naming and argument conventions of their C counterparts • Encapsulates the lower-level BasicFileObject primitives • Raw I/O interface • fine-grained file accesses are relatively expensive • unoptimized Legion protocol stack • Buffered I/O interface • C, C++ and Fortran • reduces the frequency of remote procedure calls
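To see why buffering reduces the number of remote procedure calls, consider a minimal sketch in which many small reads are served from one locally held block. `RemoteFile` and `BufferedReader` are hypothetical names for illustration and are not the Legion library's API; the remote object is simulated by an in-memory byte string with a call counter.

```python
class RemoteFile:
    """Stand-in for a remote file object; counts simulated remote calls."""

    def __init__(self, data):
        self.data = data
        self.calls = 0

    def read(self, offset, size):
        self.calls += 1
        return self.data[offset:offset + size]


class BufferedReader:
    """Serves small reads from a locally held block of the remote file."""

    def __init__(self, remote, block_size=4096):
        self.remote = remote
        self.block_size = block_size
        self.block_start = 0
        self.block = b""

    def read(self, offset, size):
        out = bytearray()
        while size > 0:
            inside = self.block_start <= offset < self.block_start + len(self.block)
            if not inside:
                # Fetch the aligned block containing `offset` in one remote call.
                self.block_start = offset - offset % self.block_size
                self.block = self.remote.read(self.block_start, self.block_size)
                if offset >= self.block_start + len(self.block):
                    break  # offset lies at or past the end of the file
            lo = offset - self.block_start
            chunk = self.block[lo:lo + size]
            out += chunk
            offset += len(chunk)
            size -= len(chunk)
        return bytes(out)
```

Reading 8192 bytes in 16-byte pieces through `BufferedReader` with a 4096-byte block issues two simulated remote calls, where the same access pattern against `RemoteFile` directly would issue 512.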
Low Impact Buffered Interface • Gives users the option of making the smallest number of changes to their programs while still using Legion file objects • Transfers the contents of BasicFileObjects to and from the local file system • lio_legion_to_tempfile • lio_tempfile_to_legion • Similar to GASS staging
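The staging pattern behind the low impact interface can be sketched as copy-in, run unmodified local code, copy-out. The functions `stage_in` and `stage_out` below are hypothetical analogues written for illustration; they do not reproduce the signatures of lio_legion_to_tempfile or lio_tempfile_to_legion, and the remote side is simulated with a dictionary.

```python
import os
import tempfile

# Hypothetical grid-side store: context name -> file contents.
REMOTE_STORE = {"/User1/ContextA/Foo": b"alpha\n"}


def stage_in(context_name):
    """Copy a remote file into a local temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(REMOTE_STORE[context_name])
    return path


def stage_out(path, context_name):
    """Copy the temporary file back to the remote store, then remove it."""
    with open(path, "rb") as f:
        REMOTE_STORE[context_name] = f.read()
    os.remove(path)


def legacy_append(path, text):
    """Unmodified 'legacy' code that only knows about local files."""
    with open(path, "ab") as f:
        f.write(text.encode())
```

The legacy routine needs no changes; only the two staging calls around it are new. The cost is that the whole file is copied in each direction, even for a small modification.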
Legion-aware nfsd (lnfsd) • Acts as an ordinary NFS daemon between the kernel client and Legion • receives NFS requests from the kernel • translates them into Legion method invocations • returns results in NFS format to the client • Performance issues • uses a larger transfer granule to reduce overhead • read-ahead (asynchronous prefetch) • asynchronous write-behind
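Asynchronous read-ahead can be sketched as follows: whenever block i is requested, the fetch of block i+1 is started in the background so that a sequential reader finds it already in flight. This is a generic illustration using a thread pool, not lnfsd's implementation; `ReadAheadReader` and the `fetch_block` callable are hypothetical, and entries prefetched but never read are simply left pending in this simplified version.

```python
from concurrent.futures import ThreadPoolExecutor


class ReadAheadReader:
    """Fetches block i on demand and prefetches block i + 1 in the background."""

    def __init__(self, fetch_block, executor):
        self.fetch_block = fetch_block  # callable: block index -> bytes
        self.executor = executor
        self.pending = {}               # block index -> Future

    def read_block(self, index):
        # Reuse a prefetch if one is already in flight for this block.
        future = self.pending.pop(index, None)
        if future is None:
            future = self.executor.submit(self.fetch_block, index)
        # Start fetching the next block before the caller asks for it.
        if index + 1 not in self.pending:
            self.pending[index + 1] = self.executor.submit(
                self.fetch_block, index + 1
            )
        return future.result()
```

With sequential access each block is fetched once, and the wait for block i+1 overlaps with the caller's processing of block i.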
Legion-aware nfsd • Security • Legion users on a Legion host trust privileged processes on that host • User credentials are stored in the /tmp file system • lnfsd only accepts connections made from a reserved port • The NFS client and lnfsd are collocated on the same host
Parallel I/O Interface • Synchronous or asynchronous interface • Allow user-specified striping of data across BasicFileObjects • Allow multiple clients to access the data without contending with one another at a central server object • Individual client performance benefits because multiple BasicFileObjects may be accessed concurrently to deliver the desired data
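One way to picture striping across several file objects is a round-robin layout, in which consecutive stripes go to consecutive objects. The sketch below assumes that layout purely for illustration; the slide says the striping is user-specified, so the actual mapping may differ, and the function names are hypothetical.

```python
def locate(offset, stripe_size, num_objects):
    """Return (object index, offset within that object) for a file offset."""
    stripe = offset // stripe_size
    obj = stripe % num_objects
    local = (stripe // num_objects) * stripe_size + offset % stripe_size
    return obj, local


def split_request(offset, length, stripe_size, num_objects):
    """Break a byte range into per-object (index, local offset, length) pieces."""
    pieces = []
    while length > 0:
        obj, local = locate(offset, stripe_size, num_objects)
        # Never cross a stripe boundary within one piece.
        n = min(length, stripe_size - offset % stripe_size)
        pieces.append((obj, local, n))
        offset += n
        length -= n
    return pieces
```

A request that spans several stripes yields pieces addressed to different objects, which a client could issue concurrently. That is the source of both the reduced contention and the per-client speedup described above.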
Server Implementation • BasicFileObject • each object represents exactly one file • persistently stored in a LegionBuffer, a random-access array • LegionBuffer is stored in a VaultObject (storage server) • ProxyMultiObject • serves both ContextObject and BasicFileObject • changes to a BasicFileObject’s contents are immediately and automatically reflected to the user’s file
Performance • Compared with standard ftp transfer and Globus GASS • Environment • an SGI Origin 2000, 56 processors, Irix 6.5, at NCSA in Champaign, Illinois • a dual-processor 400 MHz Pentium II, Linux kernel 2.2.14, at the University of Virginia in Charlottesville, Virginia • OC-12 backbone and OC-3 intermediate connection • Experiment • Data is read from (and written to) an NFS mount on the SGI Origin array • Security option is disabled
Protocol Overhead • The fixed cost of connection setup and tear-down
Protocol Overhead • Significant overhead in Legion mechanism • location-independent naming scheme which requires several context name/LOID translations and LOID/OA bindings • an expensive remote call to BasicFileObject
Bandwidth Measurements • File transfers of various sizes • Result summary • Achieves 70-85% of ftp’s read bandwidth for mass transfers • Achieves 55-65% of ftp’s write bandwidth • Suffers owing to its protocol overhead for sizes less than 1 MB
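The effect of a fixed per-transfer cost on small files can be made concrete with a simple model: transfer time is a constant overhead plus size divided by peak bandwidth. This model and the numbers in the usage note are illustrative assumptions, not measurements from the experiments.

```python
def effective_bandwidth(size_bytes, fixed_overhead_s, peak_bytes_per_s):
    """Bytes per second achieved when every transfer pays a fixed startup cost."""
    transfer_time = fixed_overhead_s + size_bytes / peak_bytes_per_s
    return size_bytes / transfer_time
```

With a hypothetical 0.5 s overhead and a 10 MB/s peak, a 5 MB transfer takes 1 s and achieves half of the peak, while a 100 MB transfer achieves over 95% of it. Under this model the overhead dominates whenever the size is small relative to the product of overhead and peak bandwidth.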
Legion Read Bandwidth • the low impact interface performs similarly to the basic I/O interface • lnfs throughput is limited because • lnfs periodically queries the remote BasicFileObject to satisfy GETATTR requests • the maximum transfer size is restricted to a 4K page
Legion Write Bandwidth • the low impact interface is worse than the basic I/O interface and lnfs, since the file must first be copied in, then modified, and then written back
Read Bandwidth • ftp and GASS have similar performance • basic I/O and low impact I/O reach 70-85% of ftp bandwidth; the gap is due to the extensible protocol stack • lnfs lags significantly due to more RPC calls and sophisticated block cache
Write Bandwidth • Legion basic I/O outperforms GASS and low impact I/O because it does not need copy-in/copy-out • lnfs lags significantly due to sophisticated block cache
References • M. Lewis and A. Grimshaw. The Core Legion Object Model. Technical Report, August 1995 • A. S. Grimshaw, M. J. Lewis, A. J. Ferrari and J. F. Karpovich. Architectural Support for Extensibility and Autonomy in Wide-Area Distributed Object Systems. Technical Report, June 3, 1998 • J. Bester, I. Foster, C. Kesselman, J. Tedesco and S. Tuecke. GASS: A Data Movement and Access Service for Wide Area Computing Systems. Proceedings of the Sixth Workshop on Input/Output in Parallel and Distributed Systems, May 1999 • A. Ferrari, F. Knabe, M. Humphrey, S. Chapin and A. Grimshaw. A Flexible Security System for Metacomputing Environments. Technical Papers, December 1998 • The Legion Research Group. Legion 1.7 user manual • The Legion Research Group. Legion 1.7 development manual
Related Work • NFS • artificially restricted transfer sizes based on the virtual memory architecture • cache consistency mechanism limits throughput • AFS • transfers and copies entire files • unnecessary traffic when a dataset is partitioned between multiple distributed workers • cache consistency mechanism limits throughput
Related Work • Globus GASS • remote access through • x-gass • ftp • HTTP • whole-file caching • streaming append operation • specialized calls are cumbersome
Legion Interface • Method calls are non-blocking and may be accepted in any order by the called object • Legion class interface can be described in an Interface Description Language. CORBA IDL and MPL are initially supported.
Legion I/O Objects • HostObjects • computational resources running in a Legion system • VaultObjects • persistent storage for inactive Legion objects • BasicFileObjects • correspond to files in a conventional file system • ContextObjects • analogous to a distributed, rooted directory tree
Context Name • Context names • Use a UNIX-like structure • e.g. /User1/ContextA/Foo • A single object may have multiple context names
Duplicate Request Cache • A short-term memory mechanism in which the original completion status of a request is remembered and the operation is attempted only once • If a duplicate copy of the request is received, the original completion status is returned
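The mechanism described above can be sketched as a small bounded map from request identifier to completion status. The class name, the `request_id` parameter, and the fixed capacity used to model "short-term" memory are all assumptions made for this illustration.

```python
from collections import OrderedDict


class DuplicateRequestCache:
    """Remembers the completion status of recent requests by identifier."""

    def __init__(self, capacity=128):
        self.capacity = capacity      # bound that makes the memory short-term
        self.results = OrderedDict()  # request id -> completion status

    def execute(self, request_id, operation):
        # A retransmitted request gets the remembered status; the operation
        # itself is not run a second time.
        if request_id in self.results:
            return self.results[request_id]
        status = operation()
        self.results[request_id] = status
        if len(self.results) > self.capacity:
            self.results.popitem(last=False)  # forget the oldest entry
        return status
```

This matters for non-idempotent operations: if a reply is lost and the client retransmits, say, a remove request, replaying it would report a spurious failure, whereas the cache returns the original success.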
LegionBuffer • The fundamental data container in the Legion Library. LegionBuffer exports operations to read and write data from and to a logical buffer • Implementations exist for in-memory buffers, Unix file buffers and Legion file objects • can compress or encrypt data
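The idea of one read/write interface with interchangeable implementations, some of which transform the data, can be sketched as below. `MemoryBuffer` and `CompressedBuffer` are hypothetical classes written for illustration and do not mirror LegionBuffer's actual methods; the compressed variant recompresses its whole contents on every write, which is simple but inefficient.

```python
import zlib


class MemoryBuffer:
    """Random-access logical buffer held in memory."""

    def __init__(self):
        self.data = bytearray()

    def write(self, offset, payload):
        end = offset + len(payload)
        if end > len(self.data):
            # Grow with zero bytes so writes past the end leave a zeroed gap.
            self.data.extend(bytes(end - len(self.data)))
        self.data[offset:end] = payload

    def read(self, offset, size):
        return bytes(self.data[offset:offset + size])


class CompressedBuffer:
    """Same read/write operations, but contents are kept zlib-compressed."""

    def __init__(self):
        self.packed = zlib.compress(b"")

    def write(self, offset, payload):
        plain = MemoryBuffer()
        plain.write(0, zlib.decompress(self.packed))
        plain.write(offset, payload)
        self.packed = zlib.compress(bytes(plain.data))

    def read(self, offset, size):
        return zlib.decompress(self.packed)[offset:offset + size]
```

Code written against `read` and `write` works with either class, which is the property that lets a file object sit on top of whichever buffer implementation suits the storage at hand.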