Legion: The Grid OS Architecture and User View • Anand Natrajan (anand@virginia.edu) • Marty Humphrey (humphrey@cs.virginia.edu) • The Legion Project, University of Virginia (http://legion.virginia.edu)
Grid Environment • Disjoint file systems • Disjoint namespaces • Multiple administration domains • Unpredictable load, availability, failures • Security problems • Computers • Networks • People • Data • Devices
Grid OS Requirements • Wide-area • High Performance • Complexity Management • Extensibility • Security • Site Autonomy • Input / Output • Heterogeneity • Fault-tolerance • Scalability • Simplicity • Single Namespace • Resource Management • Platform Independence • Multi-language • Legacy Support
Tools • MPI / PVM • P-space studies - multi-run • Parallel C++ • Parallel object-based Fortran • CORBA binding • Object migration • Accounting • Remote builds and compilations • Fault-tolerant MPI libraries • Post-mortem debugger • Console objects • Parallel 2D file objects • Collections • Licence support
Commercial Support - Avaki Corp. • Venture funded • Headquartered in Boston • Growing number of employees • Multi-tiered support offering (slide labels: Mentat, Avaki, Legion, Web)
Protein Folding with CHARMM • Molecular Dynamics Simulations • 100-200 structures to sample (r, Rgyr) space (figure axes: r, Rgyr)
Resources Available • IBM SP3, UMich, 375 MHz Power3, 24/24 • HP V-class, CalTech, 440 MHz PA-8700, 128/128 • DEC Alpha, UVa, 533 MHz EV56, 32/128 • IBM Blue Horizon, SDSC, 375 MHz Power3, 512/1184 • Sun HPC 10000, SDSC, 400 MHz SMP, 32/64 • IBM Azure, UTexas, 160 MHz Power2, 32/64
Transparent Remote Execution • User initiates “run” • User/Legion selects site • Legion copies binaries • Legion copies input files • Legion starts job(s) • Legion monitors progress • Legion copies output files
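The steps on this slide can be walked through with a small in-memory simulation. This is only a sketch and not Legion code: the `Site` class, the `run_remote` function, the site names, and the "pick the least loaded site" policy are all assumptions made for illustration, and the job is an ordinary Python function standing in for a real binary.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical stand-in for a remote machine: name, load, scratch files."""
    name: str
    load: float
    files: dict = field(default_factory=dict)


def run_remote(sites, binary, inputs, job, site=None):
    """Walk the slide's steps for one job; return (site name, outputs, log)."""
    log = ["user initiates run"]
    # User/Legion selects site: honour an explicit choice, otherwise
    # fall back to the least loaded site (an assumed policy).
    target = site if site is not None else min(sites, key=lambda s: s.load)
    log.append(f"selected {target.name}")
    # Copy binaries and input files to the chosen site.
    target.files["binary"] = binary
    log.append("copied binaries")
    target.files.update(inputs)
    log.append("copied input files")
    # Start the job and wait for it; here the job is a plain function call.
    log.append("started job")
    outputs = job(dict(inputs))
    log.append("job finished")
    # Hand the output files back to the caller.
    log.append("copied output files")
    return target.name, outputs, log


sites = [Site("siteA", load=0.9), Site("siteB", load=0.2)]
name, outputs, log = run_remote(
    sites,
    binary=b"placeholder-binary",
    inputs={"in.dat": "1 2 3"},
    job=lambda files: {"out.dat": files["in.dat"][::-1]},
)
print(name, outputs)
```

The point of the sketch is that the caller supplies only the binary, the inputs, and optionally a site; staging and retrieval happen inside one call.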
Mechanics of CHARMM Runs • Register binaries • Create task directories & specification • Dispatch runs • Dispatch more runs (diagram label: Legion)
Types Of Applications • Legacy applications • Legion-aware applications • I/O library • 2D file object • Applications Using Stdgrid • Parameter Space Studies • Parallel Programs • MPI, PVM, MPL, Basic Fortran Support (BFS)
Grid Application Requirements • Security • Fault-tolerance • Heterogeneity • Collaboration • … • Legion supports these and other needs
Heterogeneous Runs BT-Med Ocean Model
Cross-Organisation Collaboration • Different companies • Proprietary simulations and data • Each needs the other • Form virtual partnership
Platforms • Windows NT, 2K, 98, 95 • Sun (Solaris) • SGI (Irix, Origin) • Intel (Linux, FreeBSD) • DEC (Unix, Linux) • Cray (T90, T3E) • IBM (AIX, SP-2) • HP (HP-UX) • Queuing systems: Codine, LoadLeveler, Maui, PBS, NQS, LSF
Applications • Biochemistry and Molecular Science • Information Retrieval • Materials Science • Climate Modelling • Neuroscience • Aerospace • Astronomy • Graphics • NPACI - SDSC, UCSD, Caltech, UTexas, UMich, UCB, UVa • DoD MSRCs - NAVO & ARL, NASA Ames
User View Command-Line Interface
Setup
• Set up shell environment variables
    . ~legion/setup.sh
  OR
    export LEGION=/home/legion/Legion
    export LEGION_OPR=/home/maya/OPR
    . $LEGION/bin/legion_env.sh
• Specifies where binaries and configuration files can be found
• Sets root context
Login
• Authentication to system
    legion_login /users/stephen
• Currently uses a password; other mechanisms, e.g., a Kerberos ticket, are possible
• Login object (a.k.a. Authentication object) - /users/stephen - is user’s proxy to world
• Login object generates certificate identifying user
Context Space • Unix-like: legion_ls, legion_pwd, legion_cd, legion_cat, ... (figure: context tree rooted at /; node labels: hosts, home, users, mach1, mach2, mydir, you, me, subdir, prog, file1, tty)
Context Space • Network-wide, transparent file system • Location-independent read/write of files • Convenient transfer of files between context space and local file system • I/O libraries for access • Unix-like utilities
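One way to picture context space is as a single tree of names whose leaves are object identifiers. The toy model below is a sketch under that assumption, not Legion's implementation: the `Context` class and its methods are invented for illustration, and the `loid-...` strings are placeholders that do not follow the real LOID format.

```python
class Context:
    """Toy context: maps names to sub-contexts or to object identifiers."""

    def __init__(self):
        self.entries = {}

    def bind(self, path, loid):
        """Bind an absolute path to an identifier, creating contexts as needed."""
        *dirs, leaf = [p for p in path.split("/") if p]
        ctx = self
        for d in dirs:
            ctx = ctx.entries.setdefault(d, Context())
        ctx.entries[leaf] = loid

    def resolve(self, path):
        """Return the context or identifier a path names (KeyError if unbound)."""
        node = self
        for part in [p for p in path.split("/") if p]:
            node = node.entries[part]
        return node

    def ls(self, path="/"):
        """List the names in a context, in the spirit of legion_ls."""
        return sorted(self.resolve(path).entries)


root = Context()
root.bind("/hosts/mach1", "loid-0001")   # placeholder identifiers
root.bind("/home/me/file1", "loid-0002")
root.bind("/users/you", "loid-0003")

print(root.ls("/"))                      # ['home', 'hosts', 'users']
print(root.resolve("/home/me/file1"))    # loid-0002
```

Because a path resolves to an identifier rather than to a machine and a directory, the same name works from any site, which is what location independence refers to on the slide.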
Context Example legion_ls /
Another Context legion_ls /hosts
Yet Another Context legion_ls /users
Other Context Commands
• Locate a LOID in context space
    legion_list_names
• Locate an object on a machine
    legion_whereis
• Find status of an object
    legion_object_info
• List metadata of an object
    legion_list_attributes
Status Of An Object legion_object_info -c work
Physical Location Of Object legion_whereis -c work
Context Space vs. Local Space • Local space = your machine’s directory structure • OS-specific, Machine-specific • Use cp, copy, etc. • e.g., C:\Program Files\, /usr/bin, /mnt/disk1 • Context space = Legion’s directory structure • OS-independent, Machine-independent • Use legion_cp, etc.
Context Space and Local Space
• Transfer one file from local space to context space
    legion_cp -localsrc <localfile> <contextfile>
• Transfer one file from context space to local space
    legion_cp -localdest <contextfile> <localfile>
Context Space and Local Space
• Copying local directory to context space
    legion_cp -r -localsrc <localdir> <contextdir>
  OR
    legion_import_tree <localdir> <contextdir>
• Copying context directory to local space
    legion_cp -r -localdest <contextdir> <localdir>
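The four legion_cp forms on these slides differ only in direction and in the -r flag, so a script that drives them can build the argument list from two choices. The helper below is a sketch that uses only the flags shown on the slides; the function name `legion_cp_args` and the example paths are hypothetical, and the helper only builds the command, it does not run it.

```python
def legion_cp_args(src, dest, direction, recursive=False):
    """Build a legion_cp argument list from the flags shown on the slides.

    direction: "to_context" (local -> context) or "to_local" (context -> local).
    """
    flags = {"to_context": "-localsrc", "to_local": "-localdest"}
    if direction not in flags:
        raise ValueError(f"unknown direction: {direction!r}")
    args = ["legion_cp"]
    if recursive:
        args.append("-r")  # directory copy, as on the second slide
    args.append(flags[direction])
    args += [src, dest]
    return args


# Hypothetical paths, for illustration only.
print(" ".join(legion_cp_args("run1/", "/home/me/run1", "to_context", recursive=True)))
print(" ".join(legion_cp_args("/home/me/out.dat", "out.dat", "to_local")))
```

On a machine with the Legion environment set up, the returned list could presumably be handed to a process launcher; that step is left out because it depends on the local installation.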
Context Space and Local Space
• Map (not copy!) local directory to context space temporarily
    legion_export_dir <localdir> <contextdir>
• Does NOT make copy of local directory
• Merely provides Legion-like access to local directory
• Use legion_cat on local files
Making Context Space… • Local sub-directory with Legion NFS daemon • Use cat on context files • FTP directory with FTP interface • Windows directory with Samba interface • URL tree with HTTP interface
I/O Performance • X-Axis = number of clients simultaneously performing 1MB reads on 10MB files • Y-Axis = total read bandwidth • Each point = average of multiple runs • Clients = 400MHz Intels, NFS Server = 800MHz Intel
Flexible Context Space (figure: disk directories and contexts linked through NFS, HTTP, Samba and FTP interfaces, and through legion_export_dir and legion_import_tree)
Access Control
• MayI for each object implements access control on a per-function basis
• Users named by login object
• Sets of users grouped by contexts
    legion_change_permissions [+-rwx] [-v] <group/user context> <target context>
  e.g.
    legion_change_permissions +r /users/fred /home/grimshaw/myfile
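Per-function access control can be pictured as a table, kept by each object, from function name to the set of callers allowed to invoke it. The sketch below illustrates that idea only: the `GuardedObject` class, the deny-by-default rule, the function names "read" and "write", and the `/groups/team` context are assumptions, not Legion's actual MayI interface.

```python
class GuardedObject:
    """Toy object whose may_i check gates each function separately."""

    def __init__(self):
        # function name -> set of login-object names allowed to call it
        self.allowed = {}

    def grant(self, function, users):
        """Allow every user in `users` to call `function`."""
        self.allowed.setdefault(function, set()).update(users)

    def revoke(self, function, users):
        """Withdraw permission to call `function` from every user in `users`."""
        self.allowed.get(function, set()).difference_update(users)

    def may_i(self, caller, function):
        """Return True if the caller may invoke the function; deny by default."""
        return caller in self.allowed.get(function, set())


# A group is modelled as a context holding login-object names (hypothetical).
groups = {"/groups/team": {"/users/fred", "/users/stephen"}}

myfile = GuardedObject()
myfile.grant("read", groups["/groups/team"])
myfile.grant("write", {"/users/stephen"})

print(myfile.may_i("/users/fred", "read"))   # True
print(myfile.may_i("/users/fred", "write"))  # False
```

Granting to a group here copies the group's current members; a model that looks the group up at check time would track later membership changes, and which behaviour Legion uses is not stated on the slide.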
Unified Console • Program produces stdout, stderr • User creates tty object • User shares tty LOID • User starts running program • Legion passes tty LOID to program (diagram labels: TTY, File, Prog.)
TTY Object
• Redirect run-time output to central (or multiple) consoles
• Connect and disconnect dynamically
• Debug quickly and simply
• Monitor status and errors easily
• Share console with others
    legion_tty <ttyobj>
User View Web Interface