Condor and Globus
What is Condor?
“Condor is a software system that creates a High Throughput Computing (HTC) environment by effectively harnessing the power of a cluster of Unix or NT workstations on a network”
These resources can be dedicated or non-dedicated pre-existing resources, e.g. machines on people’s desks.
Some of Condor’s features:-
• Checkpointing and migration.
• A system of ClassAds for job-resource matching.
• Mixed pools (you can mix different flavours of Unix and NT in the same pool).
• Pools can be given permission to use each other’s resources … flocking.
• Resources can be used in a heterogeneous way … different machines and different users can have different setups and priorities.
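The idea behind job-resource matching can be sketched in a few lines of Python. This is only a toy model and not Condor's ClassAd language or any Condor API: ads are plain dictionaries, requirements and rank are Python functions, and the machine names, attribute values and the `matches` / `best_match` helpers are all made up for illustration. It shows the two-sided nature of matching (the job must accept the machine and the machine must accept the job) followed by ranking.

```python
from typing import Any, Dict, List, Optional

Ad = Dict[str, Any]  # toy stand-in for a ClassAd: attribute name -> value


def matches(job: Ad, machine: Ad) -> bool:
    """A match needs both sides to accept each other."""
    return bool(job["Requirements"](machine)) and bool(machine["Requirements"](job))


def best_match(job: Ad, machines: List[Ad]) -> Optional[Ad]:
    """Return the acceptable machine the job ranks highest, or None."""
    candidates = [m for m in machines if matches(job, m)]
    if not candidates:
        return None
    return max(candidates, key=job["Rank"])


# Hypothetical pool: names and numbers are invented for this example.
machines = [
    {"Name": "desk-nt", "OpSys": "WINNT40", "Memory": 64, "Mips": 300,
     # This machine only accepts vanilla universe jobs.
     "Requirements": lambda job: job["Universe"] == "vanilla"},
    {"Name": "small-linux", "OpSys": "LINUX", "Memory": 32, "Mips": 500,
     "Requirements": lambda job: True},
    {"Name": "big-linux", "OpSys": "LINUX", "Memory": 128, "Mips": 400,
     "Requirements": lambda job: True},
]

# Job wants more than 32 MB of memory and prefers faster machines.
job = {"Owner": "someuser", "Universe": "standard",
       "Requirements": lambda m: m["Memory"] > 32,
       "Rank": lambda m: m["Mips"]}

chosen = best_match(job, machines)
print(chosen["Name"] if chosen else "no match")
```

Here the fastest machine is rejected because it has too little memory, and the NT desktop rejects the job because of its own policy, so the job lands on the remaining Linux machine.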
Different Condor universes:-
• Standard: executables must be relinked with condor_compile, but you get all the desirable features (checkpointing, migration, etc.). The remote job sees the local environment through remote system calls.
• Vanilla: for when you cannot condor_compile (e.g. a commercial package like Mathematica).
• Globus: for Globus enhancement.
• PVM and MPI: parallel jobs … never looked at them myself.
Condor’s limitations:-
Mainly in the area of checkpointing and migration:
• Checkpointing doesn’t work for multi-process jobs: no fork(), exec(), etc.
• Checkpointing will be delayed if a socket is left open for a long time.
• There are others, such as no sleep().
Only the vanilla universe is available on NT as yet.
What we have done at IC:-
[Diagram: three Condor pools linked to each other by flocking]
• HEP: ~50 NT4, ~5 Linux
• DoC: ~140 Linux CPUs
• Biochemistry: ~20 DEC Alpha, ~10 Linux boxes, 40 NT5
Does it work?
Yes, we have run test jobs on all three pools from each of the others.
• From HEP we have run cmsim 120 on Linux boxes in DoC and BC (yesterday).
• SICb has been run on the NT machines.
• BC is about to run the Imagic protein reconstruction program on our NT system at night (once licensing has been sorted out).
Combining Globus and Condor:-
• 2 ways:
  • Use Globus to add nodes to your Condor pool.
    • This is called glidein. This (I believe) is how the Italians have been doing it.
  • Use Globus to submit jobs to your Condor pool … more traditional, and what we have been doing.
Thanks to University of Bristol
So we have something like this:
[Diagram: World Outside (actually gatekeeper.phy.bris.ac.uk) connected to IC HEP]
Problem with passing requirements etc. to the Condor job:
• This required a kludged version of the gatekeeper from the Condor people.
• It passes the Condor requirements via environment variables … but it works!
globusrun -s -r trpc02.hep.ph.ic.ac.uk -f test2.rsl

test2.rsl:
  &(environment=(myvar fred))
   (executable="/opt/globus-install/tools/i686-pc-linux-gnu/bin/globus-url-copy")
   (arguments=$(GLOBUSRUN_GASS_URL)/home/colling/condor_test/perf.out file:/tmp/perf.out)

globusrun -s -r trpc02.hep.ph.ic.ac.uk/jobmanager-condor -f test4.rsl

test4.rsl:
  &(executable="/tmp/perf.out")
   (jobtype="condor")
   (environment=(CONDOR_REQUIREMENTS "Memory>32")(CONDOR_RANK "Mips"))
####################
#
# description file for condor submision
#
####################
Universe = standard
Executable = /tmp/perf.out
Requirements = Memory>32
Rank = Mips
Environment = GLOBUS_GRAM_MYJOB_CONTACT=URLx-nexus://trpc02.hep.ph.ic.ac.uk:1187/;X509_CERT_DIR=/opt/globus/share/certificates;GLOBUS_GRAM_JOB_CONTACT=https://trpc02.hep.ph.ic.ac.uk:1186/3912/982745141/;GLOBUS_DEPLOY_PATH=/opt/globus;GLOBUS_INSTALL_PATH=/opt/globus-install;X509_USER_PROXY=/usr/users/collngdj/.globus/.gass_cache/globus_gass_cache_982745143;
Initialdir = /usr/users/collngdj
Input = /dev/null
Output = /usr/users/collngdj/.globus/.gass_cache/globus_gass_cache_982745144
Error = /usr/users/collngdj/.globus/.gass_cache/globus_gass_cache_982745145
queue 1
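The translation from environment variables to submit-file commands can be sketched roughly as follows. This is not the modified gatekeeper code (which the slides do not show); it is a minimal Python illustration that assumes the convention seen in the example: CONDOR_REQUIREMENTS and CONDOR_RANK are lifted out of the job environment and written as Requirements and Rank, and everything else is passed through in a semicolon-separated Environment line. The function name `build_submit_description` and the `SPECIAL` table are hypothetical.

```python
from typing import Dict

# Assumed convention, inferred from the example: these environment variables
# are not passed to the job but become submit-file commands instead.
SPECIAL = {"CONDOR_REQUIREMENTS": "Requirements", "CONDOR_RANK": "Rank"}


def build_submit_description(executable: str, env: Dict[str, str],
                             universe: str = "standard") -> str:
    """Build a minimal Condor submit description from an RSL-style environment."""
    lines = [f"Universe = {universe}", f"Executable = {executable}"]
    # Lift the special variables out as Requirements / Rank.
    for var, command in SPECIAL.items():
        if var in env:
            lines.append(f"{command} = {env[var]}")
    # Everything else stays in the job's environment.
    passthrough = [f"{k}={v}" for k, v in env.items() if k not in SPECIAL]
    if passthrough:
        lines.append("Environment = " + ";".join(passthrough))
    lines.append("queue 1")
    return "\n".join(lines) + "\n"


# Values taken from the example; only one pass-through variable is shown.
rsl_env = {
    "CONDOR_REQUIREMENTS": "Memory>32",
    "CONDOR_RANK": "Mips",
    "X509_CERT_DIR": "/opt/globus/share/certificates",
}
submit_text = build_submit_description("/tmp/perf.out", rsl_env)
print(submit_text)
```

The real file also carries Initialdir, Input, Output and Error entries pointing into the GASS cache; those are left out of the sketch to keep the mapping itself visible.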
The to-do list:-
• Try this sort of setup in a real production environment. The first candidate will be cmsim production.
• See where it falls over.
• Think about how to advertise the Condor resources in LDAP …
• All the other things that will come up.
Seminar
The Condor view of GRID computing
Miron Livny
Imperial College, 28th of Feb, 4pm, Lecture theatre 3
All welcome