GRAM: Globus Resource Allocation and Management
• GRAM is designed to provide a single common protocol and API for requesting and using remote system resources
  • a uniform, extensible interface to local job scheduling systems
• API for
  • submitting and canceling a job request
  • checking the status of a submitted job
• Job specifications are written by the user in the Resource Specification Language (RSL)
  • processed by GRAM as part of the job request
• By design, GRAM does not guarantee user environments on remote hosts.

GRAM: Terminology
• Resource
  • An entity capable of running one or more processes on behalf of a user.
• Client
  • The process that uses the resource allocation client-side API.
• Job
  • A process or set of processes resulting from a job request.
• Job Request
  • A request to the gatekeeper to create one or more job processes, expressed in RSL.
• Gatekeeper
  • A process, running as root, which begins the handling of allocation requests. It is already running on the remote computer before any request is submitted.
  • When the gatekeeper receives an allocation request from a client, it mutually authenticates with the client, maps the requestor to a local user, starts a job manager on the local host as that local user, and passes the allocation arguments to the newly created job manager.
• Job Manager
  • The gatekeeper creates one job manager for each request submitted to it.
  • The job manager starts the job on the local system and handles all further communication with the client.

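The gatekeeper steps above (map the requestor to a local user, then start one job manager per request) can be modeled with a small toy. This is only a sketch: it assumes authentication has already succeeded, and `JobManager`, `handle_request`, `user_map`, and the subject string are hypothetical names for illustration, not part of GRAM.

```python
from dataclasses import dataclass


@dataclass
class JobManager:
    """Stand-in for the per-request job manager process."""
    local_user: str
    rsl: str


def handle_request(subject: str, rsl: str, user_map: dict) -> JobManager:
    """Toy gatekeeper step: map an already-authenticated requestor to a
    local user, then create one job manager for this single request."""
    try:
        local_user = user_map[subject]
    except KeyError:
        raise PermissionError(f"no local user mapped for {subject!r}") from None
    # The allocation arguments (the RSL string) are handed to the job manager.
    return JobManager(local_user=local_user, rsl=rsl)


# Hypothetical subject name and mapping, for illustration only.
user_map = {"/O=Grid/CN=Dana Example": "dana"}
manager = handle_request("/O=Grid/CN=Dana Example", "&(executable=/bin/date)", user_map)
print(manager.local_user)
```

Calling `handle_request` twice returns two separate `JobManager` objects, mirroring the one-job-manager-per-request rule.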
RSL: Resource Specification Language
• RSL is a type of formal language
• Has its own syntax and parsing rules
• Works like Unix regular expressions and shell scripts
• We will cover it in more detail next week
• Ugly… users hate it…
• Example with variable substitution:

    & (rsl_substitution = (TOPDIR "/home/nobody")
                          (DATADIR $(TOPDIR)"/data")
                          (EXECDIR $(TOPDIR)/bin) )
      (executable = $(EXECDIR)/a.out
                    (* ^-- implicit concatenation *))
      (directory = $(TOPDIR) )
      (arguments = $(DATADIR)/file1
                   (* ^-- implicit concatenation *)
                   $(DATADIR) # /file2
                   (* ^-- explicit concatenation *)
                   '$(FOO)'
                   (* <-- a quoted literal *))
      (environment = (DATADIR $(DATADIR)))
      (count = 1)

• Performing all variable substitution and removing comments yields an equivalent RSL string:

    & (rsl_substitution = (TOPDIR "/home/nobody")
                          (DATADIR "/home/nobody/data")
                          (EXECDIR "/home/nobody/bin") )
      (executable = "/home/nobody/bin/a.out" )
      (directory = "/home/nobody" )
      (arguments = "/home/nobody/data/file1"
                   "/home/nobody/data/file2"
                   "$(FOO)" )
      (environment = (DATADIR "/home/nobody/data"))
      (count = 1)

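The substitution step in the RSL example can be mimicked in a few lines of Python. This is a rough sketch, not an RSL parser: it only expands `$(NAME)` references in plain strings, and it does not model quoting, comments, or the `#` concatenation operator. The helper names `expand` and `build_substitutions` are made up for this illustration.

```python
import re

# Matches a variable reference of the form $(NAME).
_VAR = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand(text, variables):
    """Replace each $(NAME) in text with variables[NAME].
    An undefined name raises KeyError."""
    return _VAR.sub(lambda match: variables[match.group(1)], text)


def build_substitutions(definitions):
    """Resolve definitions in order, so later entries may use earlier ones."""
    resolved = {}
    for name, raw in definitions:
        resolved[name] = expand(raw, resolved)
    return resolved


# Same variables as the slide, with concatenation already written out.
subs = build_substitutions([
    ("TOPDIR", "/home/nobody"),
    ("DATADIR", "$(TOPDIR)/data"),
    ("EXECDIR", "$(TOPDIR)/bin"),
])
executable = expand("$(EXECDIR)/a.out", subs)
arguments = [expand("$(DATADIR)/file1", subs), expand("$(DATADIR)/file2", subs)]
print(executable)
print(arguments)
```

The resulting values match the expanded RSL string on the slide: `/home/nobody/bin/a.out` for the executable and the two `/home/nobody/data/...` paths for the arguments.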
GRAM Job Execution Environment
• HOME
  • The user's home directory.
• LOGNAME
  • The user's login name.
• X509_USER_PROXY
  • The path to the job manager's delegated credential (GSI only).
• GLOBUS_GRAM_JOB_CONTACT
  • The job manager's contact string for this job.
• GLOBUS_GRAM_MYJOB_CONTACT
  • The GRAM MyJob contact string for intrajob communication.
• GLOBUS_LOCATION
  • The path to the Globus installation on the job manager host.
• X509_CERT_DIR*
  • The path to a trusted certificate directory. This variable is only set if the -x509-cert-dir argument is given to the job manager.
• GLOBUS_GASS_CACHE_DEFAULT*
  • The path to the job's GASS cache (if the gass_cache RSL attribute is present).
• GLOBUS_TCP_PORT_RANGE*
  • A system-specific range of TCP ports which may be used by the job. Globus I/O will automatically honor this range. Only present if the related configuration option is present in the job manager configuration file.
• GLOBUS_REMOTE_IO_URL*
  • The path to a file containing a URL string of a GASS server which the job may access (if the remote_io_url attribute is present).

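A job can inspect these variables at startup to see what the job manager provided. The sketch below is one possible way to do that in Python. It assumes the starred names on the slide are the conditional ones, and it also treats `X509_USER_PROXY` as optional because the slide marks it "GSI only". The function names are illustrative.

```python
import os

# Variables the slide lists without a condition.
EXPECTED = [
    "HOME",
    "LOGNAME",
    "GLOBUS_GRAM_JOB_CONTACT",
    "GLOBUS_GRAM_MYJOB_CONTACT",
    "GLOBUS_LOCATION",
]

# Variables that are only set in some configurations (assumption: the
# starred names on the slide, plus X509_USER_PROXY, which is GSI only).
OPTIONAL = [
    "X509_USER_PROXY",
    "X509_CERT_DIR",
    "GLOBUS_GASS_CACHE_DEFAULT",
    "GLOBUS_TCP_PORT_RANGE",
    "GLOBUS_REMOTE_IO_URL",
]


def gram_environment(env=None):
    """Return the GRAM-related variables that are actually set."""
    env = os.environ if env is None else env
    return {name: env[name] for name in EXPECTED + OPTIONAL if name in env}


def missing_expected(env=None):
    """Return the unconditional variables that are absent."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED if name not in env]


if __name__ == "__main__":
    for name, value in gram_environment().items():
        print(f"{name}={value}")
    print("missing:", missing_expected())
```

Since GRAM does not guarantee the user environment on the remote host, a check like `missing_expected()` at the top of a job script can make failures easier to diagnose.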
Globus client tools
• Use these to submit jobs to GRAM and get remote tasks done:
  • globusrun: the most basic way
  • globus-job-run: powerful
• Test authentication (effectively a 'ping'):
    $ globusrun -a -r cab047.info.uvt.ro
    GRAM Authentication test successful
• Execute simple remote Unix commands:
    $ globus-job-run blue.info.uvt.ro /bin/uname -a
    Linux blue 2.6.16-hardened-r10 #2 SMP Fri Sep 1 22:36:46 EEST 2006 i686 Intel(R) Xeon(TM) CPU 2.40GHz GenuineIntel GNU/Linux

Globus Jobs: complex but powerful
• Most of these require RSL input
• Typically used for batch job submission (e.g. jobs submitted to a queuing system on a cluster)
  • globus-job-get-output
  • globus-job-run
  • globus-job-status
  • globus-job-submit
  • globus-job-cancel
  • globus-job-clean

globus-job-run: used to run code remotely
• Run on a single node of a remote cluster:
    [dana@Hport ~]$ globus-job-run cab047 ./hello.sh
    Hello from cab047
    [dana@Hport ~]$
• Run on multiple CPUs:
    [dana@Hport ~/.globus]$ globus-job-run cab047 -np 4 /home/dana/hello.sh
    Hello from cab047
    Hello from cab047
    Hello from cab047
    Hello from cab047

globus-job-run: Examples
• Run multiple commands:
    globus-job-run cab047 /bin/sh -c "cd my_dir ; ls"
• Run several MPI jobs:
    globus-job-run \
      -: wn01 -np 64 -s my-aix-exec \
      -: nanosim1 -np 128 -s my-linux-exec
• For help:
    globus-job-run -help
Examples taken from NPACI training class (L. Brieger)

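When these commands are driven from a script, building the argument list explicitly avoids shell-quoting mistakes such as the `sh -c` string above. The following is a minimal sketch that uses only the `-np` option shown on the slides. The helper names `job_run_argv` and `run_remote` are hypothetical, and `run_remote` assumes `globus-job-run` is installed and a valid proxy credential exists.

```python
import shutil
import subprocess


def job_run_argv(contact, command, np=None):
    """Build the argument list for globus-job-run.
    contact: resource name as used on the slides, e.g. "cab047".
    command: executable followed by its arguments, as a list of strings.
    np:      optional process count, passed as -np."""
    argv = ["globus-job-run", contact]
    if np is not None:
        argv += ["-np", str(np)]
    return argv + list(command)


def run_remote(contact, command, np=None):
    """Run the command remotely and return its standard output."""
    if shutil.which("globus-job-run") is None:
        raise FileNotFoundError("globus-job-run not found on PATH")
    result = subprocess.run(
        job_run_argv(contact, command, np),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Only the argument lists are built here; nothing is executed.
print(job_run_argv("cab047", ["/home/dana/hello.sh"], np=4))
print(job_run_argv("cab047", ["/bin/sh", "-c", "cd my_dir ; ls"]))
```

Because the `sh -c` script is passed as a single list element, no extra quoting is needed on the local side.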
globus-job-submit: Remote batch jobs
• For help:
    globus-job-submit -help
• To submit jobs to the remote batch scheduler:
    % globus-job-submit \
        cab047/jobmanager-batch \
        -queue normal -np 4 /home/dana/mpi/little
    https://cab047.info.uvt.ro:44864/68982/1047069851/
  (the jobID is returned in response to the submission)

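A script that captures the output of `globus-job-submit` may want to check that it really received a jobID before storing it. The sketch below assumes the jobID always looks like the example on the slide (an `https` URL with an explicit port). That shape is inferred from one example only, and `parse_job_contact` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def parse_job_contact(text):
    """Validate a jobID string and return (contact, host, port).
    Assumes the https://host:port/... shape shown on the slide."""
    contact = text.strip()
    parts = urlsplit(contact)
    if parts.scheme != "https" or parts.hostname is None or parts.port is None:
        raise ValueError(f"not a job contact: {contact!r}")
    return contact, parts.hostname, parts.port


contact, host, port = parse_job_contact(
    "https://cab047.info.uvt.ro:44864/68982/1047069851/\n"
)
print(host, port)
```

The path components after the port are left uninterpreted, since the slides do not say what they mean.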
Job management
• Use the jobID to check on job status:
    globus-job-status https://cab047.info.uvt.ro:44864/68982/1047069851
    PENDING … ACTIVE … DONE
• Use the jobID to retrieve output or cancel the job:
    globus-job-get-output \
      https://cab047.info.uvt.ro:44864/68982/1047069851
    globus-job-cancel \
      https://cab047.info.uvt.ro:44864/68982/1047069851
• Use the jobID to clean up cached output from the job (on the remote machine):
    globus-job-clean https://cab047.info.uvt.ro:44864/68982/1047069851
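The status progression above lends itself to a simple polling loop. This sketch wraps `globus-job-status` with Python's `subprocess` module and waits until the job reaches a final state. Treating `FAILED` as a final state is an assumption (the slide lists only PENDING, ACTIVE, and DONE), and the function names and the 10-second default interval are illustrative.

```python
import subprocess
import time

# DONE comes from the slide; FAILED is an assumed additional final state.
TERMINAL_STATES = {"DONE", "FAILED"}


def query_status(job_contact):
    """Ask globus-job-status for the current state of one job."""
    result = subprocess.run(
        ["globus-job-status", job_contact],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_job(job_contact, get_status=query_status,
                 poll_seconds=10.0, sleep=time.sleep):
    """Poll until the job reaches a terminal state.
    Returns the distinct states observed, in order."""
    seen = []
    while True:
        state = get_status(job_contact)
        if not seen or seen[-1] != state:
            seen.append(state)
        if state in TERMINAL_STATES:
            return seen
        sleep(poll_seconds)


if __name__ == "__main__":
    # Requires the Globus client tools and a valid credential.
    print(wait_for_job("https://cab047.info.uvt.ro:44864/68982/1047069851"))
```

The `get_status` and `sleep` parameters exist so the loop can be exercised without a grid, for example by passing a function that replays a canned sequence of states.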