SlashGrid (“/grid”)
a framework for Grid-aware filesystems
• Motivation: dynamic-accounts issues
• Local storage: implementation alternatives
• Generalisation: remote file access
• Implementation: Coda, ACLs, plugins
• Current status
• Future work
Andrew McNab - Manchester HEP - 29 January 2002
Motivation: dynamic accounts
• For TB1 we provided a patch for Globus gatekeeper, gsi_wuftpd etc to associate Unix UIDs from a pool with the Grid DN identities of incoming requests.
• This is ok when all that jobs do on the machine in question is computation.
• But (1) any files created by a pool UID need to be cleaned up before the account can be reallocated.
• But (2) no good for long term storage, since there is no promise to maintain the UID-DN association in the long term.
• But (3) what if a malicious user creates a cron entry, writes to some obscure writeable directory we didn’t think of, etc?
Solution: get away from UID filesystems
• All these problems arise fundamentally because files are owned according to UID, but we want UID to have no long term meaning.
• Obvious solution is to have a filesystem where file ownership depends on Grid DNs, not temporary UIDs.
• Can then ban user processes from writing anywhere else (straightforward to impose this with a modified ext2 device driver: eg no disk files can be created if UID > 99)
• UID becomes as transitory as Process Group ID.
• Problem now becomes: how to implement a DN/Grid aware filesystem?
Implementation alternatives
• 1) Fake a filesystem by making user processes use modified versions of the open(), read() etc system calls.
  • Can do this by relinking, or by an interposition / bypass library that is preloaded before the real, shared libc.
  • But this cannot enforce access restrictions on files accessible on local disk (since you can use a static binary and ignore permissions)
  • Need to put filesystem behind a server, accessed via TCP ports, named pipe, or shared memory (all the usual X tricks). This is going to be slow for streaming large files: the very thing we need to be fast.
• 2) Put filesystem into kernel
  • Lets kernel enforce access control. Potentially as fast as normal disk.
  • User space daemon useful to parse proxies, and do any remote IO.
Coda
• A suitable kernel module already exists for Linux: Coda
  • introduced into main kernel tree in 1997 (during 2.1) and present in all 2.2 and 2.4 stable kernels.
• This is part of the Coda project at CMU, an open source fork of AFS2.
• Very similar architecture to AFS
  • Kernel module and client side cache daemon (Venus)
  • Kerberos based
• Already used “parasitically” by other Linux projects
  • eg AVFS maps files to virtual filesystems (eg cd into a tar file…)
• Coda kernel module / Venus also available for *BSD and Windows 98/NT upwards.
Implementation with Coda
• Coda kernel module talks to client cache daemon by exchanging messages via /dev/cfs0
• Since we already have the kernel module, we just need to write a Venus-like daemon: SlashGrid (“/grid”)
• Coda implementation allows efficient streaming:
  • open(), close(), stat() handled by calls to Venus/SlashGrid daemon
  • coda_open call returns the inode of the cached copy to the kernel
  • subsequent read() and write() operations handled by kernel itself, without daemon being involved.
• So streaming a local copy is just as fast as reading/writing a normal disk file.
• Since SlashGrid is called for every open() etc, it can enforce DN based access control at that point.
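A minimal Python sketch of the open-time step described above, not the actual daemon code: access is checked once when the file is opened, and only the inode of the cached copy is handed back. The names `cache_path`, `handle_open` and the `may_read` callback are hypothetical, and the /grid to cache-directory mapping is assumed from the example configuration slide.

```python
import os

GRID_MOUNT = "/grid"  # mount point named on the slides


def cache_path(cache_root, grid_path):
    """Map /grid/<rest> to <cache_root>/<rest> (layout assumed, not confirmed)."""
    rel = os.path.relpath(grid_path, GRID_MOUNT)
    if rel.startswith(".."):
        raise ValueError("not under " + GRID_MOUNT)
    return os.path.join(cache_root, rel)


def handle_open(cache_root, grid_path, dn, may_read):
    """Hypothetical open handler: check the DN once, then return an inode.

    After this point the kernel would read and write the cached file
    directly, so the daemon is not involved in the data path.
    """
    if not may_read(dn, grid_path):
        raise PermissionError(grid_path)
    return os.stat(cache_path(cache_root, grid_path)).st_ino
```

The point of the design is that the access check costs one round trip per open(), not one per read() or write().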
Standard Unix system calls with SlashGrid
[Diagram: two user processes, one working in an ordinary directory and one in /grid/...; system calls shown: open(), read(), write(), stat(); components shown: SlashGrid daemon, /dev/cfs0, kernel, /var/spool/slashgrid/fcache, a real (ext2) disk]
Remote file access
• Another idea that has been around a while: AFS-like system using Grid protocols.
• All the usual advantages of a global filesystem
• Makes a lot of the tedious management of “parameter” files needed by jobs just another operating system service.
• Very useful for interactive users: they just see the Grid as one big file system.
• Makes all applications (even ls) Grid-enabled immediately.
• Already using URLs to refer to remote files, so easy to find an appropriate mapping into a filesystem space.
• So we want to design a system that can be generalised to remote file access too.
ACL format
• Need to specify permissions in some way.
• Commonly used compromise between granularity and simplicity is the per-directory ACL (cf AFS)
• We’ve used the same format as the GridSite website management system (used for WP6 and GridPP websites):
  • admin: can modify ACL
  • write: can write/create files
  • list: can get a directory listing
  • read: can read a named file
• ACL consists of lines: <level> <DN/group>
• Currently only <DN> is implemented, but in future will add VO groups, CAS authorisation symbols etc (when dust settles...)
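As an illustration of the <level> <DN> line format, here is a small Python sketch of an ACL parser and lookup. It is not the SlashGrid or GridSite code. The function names are hypothetical, the DNs in the usage note are made up, and since the slides do not say whether one level implies another, each level is treated as a separate grant.

```python
LEVELS = ("admin", "write", "list", "read")  # the four levels from the slide


def parse_acl(text):
    """Parse '<level> <DN>' lines into {DN: set of levels}.

    The DN is everything after the first run of whitespace, because DNs
    can themselves contain spaces. Malformed lines are skipped.
    """
    acl = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or parts[0] not in LEVELS:
            continue
        level, dn = parts
        acl.setdefault(dn, set()).add(level)
    return acl


def has_level(acl, dn, level):
    """True if this DN was granted exactly this level (no implied levels)."""
    return level in acl.get(dn, set())
```

For example, an ACL with the two lines `admin /O=Grid/CN=Jane Doe` and `read /O=Grid/CN=John Roe` lets the first DN modify the ACL and the second only read named files.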
ACL implementation
• Each directory has, or appears to have, a read-only file .grid-acl consisting of ACL lines in <level> <DN/group> format.
• Can easily be transferred via existing protocols
  • eg if cache daemon fetches a file from a remote gsi-ftp server, can fetch the .grid-acl from the same directory without modifying gsi_wuftpd or GridFTP protocol.
• Modification of ACL done by accessing “virtual files” - these operations are trapped by SlashGrid and ACL updated
  • cf. Coda’s .CONTROL mechanism
  • eg remove file .grid-acl-write-%%url-encoded-DN%% to change the DN’s permission level to write
• Provide command line tools to hide this from users
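A command line tool hiding the virtual-file mechanism would need to build and recognise names of this kind. The Python sketch below assumes that %%url-encoded-DN%% stands for the DN percent-encoded with every reserved character escaped; the slides do not spell out the exact encoding, and both function names are hypothetical.

```python
from urllib.parse import quote, unquote

PREFIX = ".grid-acl-"  # prefix taken from the slide's example name


def virtual_acl_name(level, dn):
    """Build the virtual file name for a level and DN (encoding assumed)."""
    return PREFIX + level + "-" + quote(dn, safe="")


def parse_virtual_acl_name(name):
    """Return (level, DN) for a virtual ACL file name, or None otherwise."""
    if not name.startswith(PREFIX):
        return None
    level, sep, encoded = name[len(PREFIX):].partition("-")
    if not sep or not level or not encoded:
        return None
    return level, unquote(encoded)
```

Encoding "/" matters here because the DN has to fit into a single path component.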
Plugin framework
• Avoid making a monolithic system since:
  • Lots of interesting filesystems possible: anon ftp, http, https, gsi-ftp, rfio, ldap, SQL databases (cf. Oracle 8i) …
  • Lots of uncertainty about which caching strategies to use.
  • Some people will want some but not all of this on their systems.
• Have /etc/slashgrid.conf that specifies mount points and then which loadable module handles which part of the file system (cf. /etc/fstab)
• At start time, load dynamic modules which all export a common API.
• SlashGrid daemon hands each request to the right plugin
  • user: stat() => coda_getattr => PluginStat() => plugin: stat()
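The dispatch step ("hands each request to the right plugin") can be sketched in Python as a longest-prefix match over the configured mount points. This is an illustration under assumptions, not the daemon's code: `EchoPlugin`, `plugin_stat` and `find_plugin` are hypothetical names, the real plugin API is not given on the slides, and paths are taken to be relative to the /grid mount.

```python
class EchoPlugin:
    """Stand-in for a loaded module exporting a common API (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def plugin_stat(self, rel_path):
        # A real plugin would stat its own backing store here.
        return (self.name, rel_path)


def find_plugin(mounts, path):
    """Return (plugin, path relative to its mount point), longest prefix wins."""
    best = None
    for prefix in mounts:
        base = prefix.rstrip("/")
        if path == base or path.startswith(base + "/"):
            if best is None or len(base) > len(best[1]):
                best = (prefix, base)
    if best is None:
        raise LookupError(path)
    prefix, base = best
    return mounts[prefix], path[len(base):] or "/"


def dispatch_stat(mounts, path):
    """Route a stat request to whichever plugin owns the path."""
    plugin, rel = find_plugin(mounts, path)
    return plugin.plugin_stat(rel)
```

Matching on whole path components keeps a name such as /gsiftpx from being routed to the /gsiftp plugin.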
Example configuration
/etc/slashgrid.conf:
    [/]
    plugin=certfs.so
    [/gsiftp]
    plugin=gsiftpfs.so
/grid - mount point for Coda kernel module fs
/var/spool/slashgrid/fcache/ => /grid/
/var/spool/slashgrid/fcache/tmp/ => /grid/tmp/
/var/spool/slashgrid/fcache/gsiftp/ => /grid/gsiftp/
/usr/lib/slashgrid/plugins/certfs.so, gsiftpfs.so ...
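The configuration shown on this slide looks like an INI file with one section per mount point. Assuming that layout (one section header and one plugin= line per mount point, which is a reading of the flattened slide rather than a documented format), Python's standard configparser can read it. The function name `load_mounts` is hypothetical.

```python
import configparser

EXAMPLE_CONF = """\
[/]
plugin=certfs.so
[/gsiftp]
plugin=gsiftpfs.so
"""


def load_mounts(text):
    """Return {mount point: plugin file name} from slashgrid.conf-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: parser[section]["plugin"] for section in parser.sections()}
```

With the example text this yields a mapping from "/" to certfs.so and from "/gsiftp" to gsiftpfs.so, which is the table a daemon would need before loading its modules.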
Remote file access strategies
• SlashGrid framework allows several options: none is “the best”
• simplest: make a local copy when the coda_open call is received, and return the copy’s inode when transfer finishes
  • ok for small files
  • awful for very big files: need lots of disk cache and have to wait
• pure streaming: plugin forks a process to stream the file from the remote server; makes a temporary named pipe and returns its inode to the kernel; writes the incoming file to the pipe; kernel (and therefore user) reads the file as it comes in; tidy up the pipe when coda_close is received.
  • good when we have a copy on a “close” file server (cf. NFS)
• both: stream file down a named pipe, but keep a copy too.
• Writing is even more complicated: when to transfer the local write-cache?
  • do we need consistency for different machines viewing the same server?
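The “pure streaming” and “both” strategies can be illustrated with a short Python sketch. It makes two simplifications that are not in the slides: a background thread stands in for the forked process, and an in-memory list of byte chunks stands in for the remote server. The function name and its `copy_path` parameter are hypothetical.

```python
import os
import threading


def stream_to_fifo(chunks, fifo_path, copy_path=None):
    """Feed byte chunks into a new named pipe from a background thread.

    If copy_path is given, each chunk is also written to that file,
    which corresponds to the "both" strategy. Returns the thread.
    """
    os.mkfifo(fifo_path)

    def writer():
        copy = open(copy_path, "wb") if copy_path else None
        try:
            # Opening a FIFO for writing blocks until a reader opens it.
            with open(fifo_path, "wb") as pipe:
                for chunk in chunks:
                    pipe.write(chunk)
                    if copy:
                        copy.write(chunk)
        finally:
            if copy:
                copy.close()

    thread = threading.Thread(target=writer)
    thread.start()
    return thread
```

A reader that opens the pipe sees data as soon as the first chunk arrives instead of waiting for the whole transfer. Removing the pipe afterwards corresponds to the tidy-up on coda_close.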
Current status
• Have implemented the SlashGrid daemon and one plugin to provide local file storage with ACLs (certfs.so)
• SlashGrid obtains the DN of a UID from /tmp/x509up_uUID
  • so you do grid-proxy-init to get started
• stat / read / creat / mkdir / write / remove / rename / chmod system calls working for files and directories
  • can already do normal shell commands (ls etc), edit files with emacs, even copy the SlashGrid and certfs sources into the filesystem and build them with make and gcc.
• some things not yet done
  • hard and soft links (means I can’t try building a Linux kernel yet…)
  • modifying ACLs: these still have to be set manually as root
Future work
• Finish certfs and ACL tools
• Implement an example remote IO plugin
  • probably anonymous ftp since simplest
• Document the plugin API
• Encourage other people to write plugins for things they need.
• Write plugins for the major protocols: gsi-ftp and https
• Investigate specialised filesystems for dynamic accounts, automated cleanup, extra logging / auditing, ...
• Look at porting to other OS’s:
  • Coda kernel module exists for *BSD and Windows already
  • The Linux Coda module was only 4000 lines of C...
Conclusion
• Have implemented a read/write filesystem for Linux, based on Grid DNs rather than Unix UIDs.
• Have done this in an extendable way using plugins for different filesystem types.
• Should be straightforward to write a plugin for your favourite remote file access protocol.
• System is efficient for streaming local copies of files
• But can still accommodate many different strategies for fetching, caching and streaming files from remote servers.
• (Thanks to Anders, Cal and Fabio of Integration Team for useful discussions about all these issues.)
More information...
• mcnab@hep.man.ac.uk (now)
• http://www.gridpp.ac.uk/slashgrid/ (later today)
• WP6 CVS repository (later this week)