Introduction to Condor
DMD/DFS, J. Knudstrup, December 2005
Motivation
• Need for a system to harvest unused CPU cycles and other resources in a network.
What is Condor?
• A full-featured batch queue system.
• Condor is a specialized workload management system for compute-intensive jobs.
• Condor provides a job queuing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management.
• Users submit their serial or parallel jobs to Condor; Condor places them into a queue, chooses when and where to run them based upon a policy, carefully monitors their progress, and ultimately informs the user upon completion.
• Can be used to manage a cluster of dedicated compute nodes.
• In addition, unique mechanisms enable Condor to effectively harvest wasted CPU power from otherwise idle desktop workstations.
• Condor does not require a shared file system across machines; if none is available, Condor can transfer the job's data files on behalf of the user.
• Condor can be used to seamlessly combine all of an organization's computational power into one resource.
History of Condor
• Hosted at the University of Wisconsin, USA.
• Directed by Professor M. Livny.
• A preliminary version of the Condor resource management system was implemented in 1986; the Condor project proper started in 1988.
• Originally focused on the problem of load balancing in a distributed system.
• Later shifted its attention to distributively owned computing environments, where owners have full control over the resources they own.
Status
• ~17 years of development.
• The Condor team consists of ~30 people.
• Available on many platforms.
• Basic installation and usage are very easy.
• Contracted and free support.
• Used in research environments and by industry.
• Sponsored by various major IT companies and organizations (IBM, Intel, Microsoft, NASA, …).
Architecture
• Coordinated by a Central Manager node.
• No central DBMS.
• Condor provides a set of daemons defining the role of each node in the pool (see the configuration sketch below):
  • condor_master: Basic coordination on each node.
  • condor_collector: Collects system information. Only on the Central Manager.
  • condor_negotiator: Assigns jobs to machines. Only on the Central Manager.
  • condor_startd: Executes jobs.
  • condor_schedd: Handles job submission.
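The role of a node follows from which daemons its condor_master starts. A minimal configuration sketch using the standard DAEMON_LIST parameter (the roles match the pool diagram below):

  # Central Manager (here also able to submit and execute jobs):
  DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD, STARTD
  # Regular node (submit and execute):
  DAEMON_LIST = MASTER, SCHEDD, STARTD
  # Execute-only node:
  DAEMON_LIST = MASTER, STARTD
  # Submit-only node:
  DAEMON_LIST = MASTER, SCHEDD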
Condor Pool - Example
(Diagram: a Central Manager running master, negotiator, collector, schedd and startd; two Regular Nodes each running master, schedd and startd; two Execute-Only nodes each running master and startd; and a Submit-Only node running master and schedd. Arrows mark spawned processes and ClassAd communication pathways.)
Personal Condor vs. Condor Pool
• Condor Pool: a collection of several nodes coordinated by one Central Manager.
• Personal Condor: Condor on a single workstation; no root access required, no system administrator intervention needed.
• Benefits of a 'pool' with only one node (same as for a full pool):
  • Schedule large batches of jobs and have them processed in the background.
  • Keep an eye on jobs and get progress updates.
  • Implement your own scheduling policies for the execution order of jobs.
  • Keep a log of the job activities.
  • Add fault tolerance to the job execution.
  • Implement policies for when jobs may run on a workstation.
Dedicated Nodes vs. Non-Dedicated Nodes
• Dedicated node: Condor has all CPUs at its disposal.
• Non-dedicated node: cannot always run Condor jobs. If a user is accessing the keyboard/mouse, or the CPU is used by other processes, the Condor jobs are preempted.
• The policies for when Condor jobs can be started and may be preempted are defined in the Condor configuration, as sketched below.
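A minimal sketch of such a policy, using the standard startd policy expressions (START, SUSPEND, CONTINUE) and the built-in KeyboardIdle and LoadAvg attributes; the thresholds are illustrative only:

  MINUTE = 60
  # Start jobs only after 15 minutes without keyboard activity and a mostly idle CPU:
  START    = KeyboardIdle > 15 * $(MINUTE) && LoadAvg < 0.3
  # Suspend running jobs as soon as the owner returns to the machine:
  SUSPEND  = KeyboardIdle < $(MINUTE)
  # Resume once the machine has been idle again for 5 minutes:
  CONTINUE = KeyboardIdle > 5 * $(MINUTE)
  # A dedicated node would instead simply use:
  # START = TRUE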
Shared File System vs. File Distribution
• Use a shared file system if available:
  • Administration and handling are easier.
  • Normally the case for a dedicated cluster.
• If no shared file system is available:
  • Condor can transfer files.
  • Can automatically send back changed files.
  • Atomic transfer of multiple files.
  • Data can be encrypted during transfer.
  • Usually the case for pools with non-dedicated nodes or in a GRID environment.
Personal Condor - Condor Pool / Condor Flocking
(Diagram: a Submission Node running a Personal Condor, flocking jobs to a Dedicated Pool and to a Common User Desktop Pool.)
Condor Configuration
• Simple format (based on ClassAd).
• Possible to use environment variables.
• A global configuration plus a local one specific to each node.
• Large set of configurable parameters.
• Example:

  CONDOR_HOST = dfo09.hq.eso.org
  RELEASE_DIR = /home/condor/INSTROOT/
  LOCAL_DIR = $(TILDE)
  LOCAL_CONFIG_FILE = $(LOCAL_DIR)/condor_config.local
  REQUIRE_LOCAL_CONFIG_FILE = TRUE
  CONDOR_ADMIN = jknudstr@hq.eso.org
  MAIL = /bin/mail
  UID_DOMAIN = hq.eso.org
  FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
  …
Command Line Tools
• Many command line tools are provided; some of these are:
  • condor_config_val: Get/set the value of configuration parameters (example below).
  • condor_history: Query the history of completed jobs.
  • condor_off: Stop Condor daemons.
  • condor_q: Check the job queue.
  • condor_reconfig: Force re-sourcing of the configuration.
  • condor_rm: Remove jobs from the queue.
  • condor_status: Status of the Condor pool.
  • condor_submit: Submit a job or a cluster of jobs.
  • condor_submit_dag: Submit a set of jobs with dependencies.
  • …
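For instance, condor_config_val can be used to check the effective value of a parameter; with the example configuration shown earlier, one would expect:

  $ condor_config_val CONDOR_HOST
  dfo09.hq.eso.org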
condor_history

  [condor@dfo09 condor]$ condor_history -l 4174
  (ClusterId == 4174)
  MyType = "Job"
  TargetType = "Machine"
  ClusterId = 4174
  QDate = 1130316429
  Owner = "sinfoni"
  LocalUserCpu = 0.000000
  LocalSysCpu = 0.000000
  RemoteUserCpu = 0.000000
  RemoteSysCpu = 0.000000
  ExitStatus = 0
  NumCkpts = 0
  NumRestarts = 0
  NumSystemHolds = 0
  CommittedTime = 0
  TotalSuspensions = 0
  LastSuspensionTime = 0
  CumulativeSuspensionTime = 0
  CondorVersion = "$CondorVersion: 6.6.8 Jan 27 2005 $"
  CondorPlatform = "$CondorPlatform: I386-LINUX_RH9 $"
  RootDir = "/"
  Iwd = "/home/condor/data/sinfoni/products/condor/dag/CALIB_2005-10-02-1130316408.83897996"
  JobUniverse = 5
  Cmd = "/home/sinfoni/bin/processAB"
  …
condor_status

  [Condor@ngasdev3 condor]$ condor_status
  Name          OpSys    Arch   State     Activity  LoadAv  Mem  ActvtyTime
  vm1@ngasdev3. LINUX    INTEL  Owner     Idle      0.120   252  0+00:00:04
  vm2@ngasdev3. LINUX    INTEL  Unclaimed Idle      0.000   252  0+00:20:05
  vm3@ngasdev3. LINUX    INTEL  Unclaimed Idle      0.000   252  0+00:20:06
  vm4@ngasdev3. LINUX    INTEL  Unclaimed Idle      0.000   252  0+00:20:07

               Machines  Owner  Claimed  Unclaimed  Matched  Preempting
  INTEL/LINUX         4      1        0          3        0           0
        Total         4      1        0          3        0           0
condor_q

  [condor@dfo09 condor]$ condor_q
  -- Submitter: dfo09.hq.eso.org : <134.171.16.145:58750> : dfo09.hq.eso.org
   ID      OWNER    SUBMITTED     RUN_TIME ST PRI SIZE CMD
  4157.0   sinfoni 10/26 10:46  0+00:34:25 R  0   3.2  condor_dagman -f -
  4177.0   sinfoni 10/26 10:47  0+00:01:47 R  0   0.0  processAB -a SINFO
  4178.0   sinfoni 10/26 10:47  0+00:01:26 R  0   0.0  processAB -a SINFO
  4179.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4180.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4181.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4182.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4183.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  …
  4201.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4202.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4203.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4204.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4205.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4206.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4207.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4208.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO
  4209.0   sinfoni 10/26 10:47  0+00:00:00 I  0   0.0  processAB -a SINFO

  34 jobs; 31 idle, 3 running, 0 held
Requirements for a Condor Job
• Must be able to run in the background: no interactive input, windows, GUI, etc.
• Can still use STDIN, STDOUT, and STDERR, but files are used for these instead of the actual devices.
• Organize the data files and make the data available for the jobs.
Job Scheduling
(Diagram: a job is submitted to the Schedd on the Submit Machine; the Central Manager's Collector and Negotiator match it to a Startd on an Execute Machine; the Schedd then spawns a Shadow and the Startd spawns a Starter to run the job.)
Job Universes
• A universe in Condor defines an execution environment.
• The universe to use is specified when the job is submitted.
• The following universes are provided by Condor:
  • Standard Universe: Close integration between the job and Condor. The application must be re-linked with condor_compile (see the sketch after this list).
  • Vanilla Universe: Jobs executed as shell commands. Condor collects output and exit status.
  • PVM Universe: Allows programs written for the Parallel Virtual Machine interface to be used within the Condor environment.
  • MPI Universe: Allows programs written to the MPICH interface to be used within the Condor environment.
  • Globus Universe: Provides the standard Condor interface to start Globus jobs from Condor.
  • Java Universe: Executes Java applications natively (in a JVM).
  • Scheduler Universe: The job does not wait to be matched with a machine; it executes right away, on the machine where it was submitted.
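Re-linking for the Standard Universe simply means prefixing the usual compile/link command with condor_compile; a sketch (program name hypothetical):

  $ condor_compile gcc -o myjob myjob.c

The resulting binary can then be submitted with universe = standard, which enables checkpointing and remote system calls.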
Example Simple Job Submission

  $ more ~/tmp/Job1.cmd
  universe = vanilla
  executable = /bin/sleep
  output = /home/condor/tmp/Job1.out
  error = /home/condor/tmp/Job1.err
  log = /home/condor/tmp/Job1.log
  arguments = 5
  #requirements = (use default requirements)
  should_transfer_files = NO
  notification = NEVER
  queue
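The file is handed to condor_submit, which acknowledges the submission roughly as follows (the cluster number is assigned by the schedd and will differ):

  $ condor_submit ~/tmp/Job1.cmd
  Submitting job(s).
  Logging submit event(s).
  1 job(s) submitted to cluster 11848.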
Job Monitoring
• While the job is running:

  $ condor_q
  -- Submitter: ngasdev3.hq.eso.org : <134.171.21.32:35346> : ngasdev3.hq.eso.org
   ID      OWNER   SUBMITTED     RUN_TIME ST PRI SIZE CMD
  11848.0  condor 11/17 09:17  0+00:00:03 R  0   0.0  sleep 5

  1 jobs; 0 idle, 1 running, 0 held

• After job completion (no other jobs running):

  $ condor_q
  -- Submitter: ngasdev3.hq.eso.org : <134.171.21.32:35346> : ngasdev3.hq.eso.org
   ID      OWNER   SUBMITTED     RUN_TIME ST PRI SIZE CMD

  0 jobs; 0 idle, 0 running, 0 held
Job History
• Query historical information about a job after it has terminated:

  $ condor_history -l 11848 | more
  (ClusterId == 11848)
  MyType = "Job"
  TargetType = "Machine"
  ClusterId = 11848
  QDate = 1132219077
  Owner = "condor"
  ExitStatus = 0
  NumRestarts = 0
  NumSystemHolds = 0
  CommittedTime = 0
  TotalSuspensions = 0
  CondorVersion = "$CondorVersion: 6.6.8 Jan 27 2005 $"
  CondorPlatform = "$CondorPlatform: I386-LINUX_RH9 $"
  RootDir = "/"
  Iwd = "/diska/home/condor/tmp"
  JobUniverse = 5
  …
Jobs and Resources
• A job requiring a certain amount of memory and disk space to run.
• The higher the Rank expression evaluates, the better the match.

  $ more ~/tmp/Job-reqs-ex.cmd
  universe = vanilla
  executable = /bin/sleep
  output = /home/condor/tmp/Job-reqs-ex.out
  error = /home/condor/tmp/Job-reqs-ex.err
  log = /home/condor/tmp/Job-reqs-ex.log
  arguments = 5
  Requirements = Memory >= 256 && Disk > 10000
  Rank = (KFLOPS*10000) + Memory
  should_transfer_files = NO
  notification = NEVER
  queue
Submitting Clusters of Jobs

  # Example condor_submit input file that defines
  # a cluster of 600 jobs with different directories
  Universe = vanilla
  Executable = my_job
  Log = my_job.log
  Arguments = -arg1 -arg2
  Input = my_job.stdin
  Output = my_job.stdout
  Error = my_job.stderr
  InitialDir = run_$(Process)
  Queue 600
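$(Process) expands to the job's index within the cluster (0 to 599 here), so every job gets its own working directory; those directories must exist before submission. A minimal sketch, with a hypothetical file name for the submit description above:

  $ for i in $(seq 0 599); do mkdir -p run_$i; done
  $ condor_submit my_job.cmd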
File Transfer

  universe = vanilla
  executable = /home/condor/bin/process-files.py
  output = /home/condor/data/out/transferdata1.out
  error = /home/condor/data/err/transferdata1.err
  log = /home/condor/data/log/transferdata1.log
  arguments = input1.in input2.in input3.in
  requirements =
  should_transfer_files = YES
  when_to_transfer_output = ON_EXIT
  transfer_input_files = input1.in,input2.in,input3.in
  transfer_output_files = input1.out,input2.out,input3.out
  notification = NEVER
  Queue
DAGs
• Directed Acyclic Graph (DAG).
• Represents a set of jobs with mutual dependencies.
• Corresponds to the "Cascade" in the 'DFS world'.
• A DAG submission file must be written, which references the individual job submission files.
• Submitted with condor_submit_dag.
• Controlled by the DAGMan utility, which itself runs as a normal Condor job.
• It is possible to make DAGs of DAGs.
Example simple DAG

  Job Job-1-1 /home/condor/tmp/dag-ex1/Job-1-1.cmd
  Job Job-2-1 /home/condor/tmp/dag-ex1/Job-2-1.cmd
  Job Job-2-2 /home/condor/tmp/dag-ex1/Job-2-2.cmd
  Job Job-3-1 /home/condor/tmp/dag-ex1/Job-3-1.cmd
  PARENT Job-1-1 CHILD Job-2-1
  PARENT Job-1-1 CHILD Job-2-2
  PARENT Job-2-1 Job-2-2 CHILD Job-3-1
  DOT dag-ex1.dot DONT-OVERWRITE UPDATE

(Diagram: Job-1-1 at the top fans out to Job-2-1 and Job-2-2, which both feed into Job-3-1.)
DAG Execution Status

  $ condor_submit_dag dag-ex1.dag
  -----------------------------------------------------------------------
  File for submitting this DAG to Condor        : dag-ex1.dag.condor.sub
  Log of DAGMan debugging messages              : dag-ex1.dag.dagman.out
  Log of Condor library debug messages          : dag-ex1.dag.lib.out
  Log of the life of condor_dagman itself       : dag-ex1.dag.dagman.log
  Condor Log file for all Condor jobs of this DAG: dag-ex1.dag.dummy_log
  Submitting job(s).
  Logging submit event(s).
  1 job(s) submitted to cluster 11849.
  -----------------------------------------------------------------------
DAG Visualization
• It is possible to visualize a DAG.
• The DAGMan process produces snapshot files showing the status of the DAG execution.
• These can be processed with the Graphviz package, as sketched below.
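For example, the dag-ex1.dot snapshot produced by the DOT directive in the DAG file above could be rendered with the standard Graphviz dot tool (output file name arbitrary):

  $ dot -Tps dag-ex1.dot -o dag-ex1.ps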
Scheduling Visualization
• Example of monitoring a running DFO/QC DAG/Cascade (screenshot).
Pipeline Cascade
• A science pipeline cascade may look like this: a Preproc step fans out into N parallel branches, each running a chain of recipes (Recipe A/i, Recipe B/i, Recipe C/i for i = 1..N), and all branches join again in a final Postproc step; see the DAG sketch below.
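A sketch of how such a cascade could be expressed as a Condor DAG, shown for N = 2 (all node and file names hypothetical):

  Job Preproc  preproc.cmd
  Job A1 recipe-a.cmd
  Job B1 recipe-b.cmd
  Job C1 recipe-c.cmd
  Job A2 recipe-a.cmd
  Job B2 recipe-b.cmd
  Job C2 recipe-c.cmd
  Job Postproc postproc.cmd
  PARENT Preproc CHILD A1 A2
  PARENT A1 CHILD B1
  PARENT B1 CHILD C1
  PARENT A2 CHILD B2
  PARENT B2 CHILD C2
  PARENT C1 C2 CHILD Postproc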
DFS Condor Integration Activities
• An environment for Condor/BQS is being defined for DFO/QC and Paranal.
• A few tools have been implemented to facilitate the interaction with Condor.
• Blade systems will be purchased for DFO/QC and for Paranal (plus file servers based on a Fibre Channel network).
• At Paranal, Condor might be controlled directly from the Data Organizer (new implementation).
• A shared file system will be used for all nodes in the cluster (RedHat Global File System).
• The blade systems at HQ will be closely integrated with the archive for fast file access (Fast Cache Archive).
DFO/QC Condor Pool(s)
(Diagram: several Submit Nodes, each running a Personal Condor, attached to a Dedicated Pool of blade systems sharing the RedHat Global File System, GFS: http://www.redhat.com/software/rha/gfs)
More Info
• Condor web site: http://www.cs.wisc.edu/condor

"... Since the early days of mankind the primary motivation for the establishment of communities has been the idea that by being part of an organized group the capabilities of an individual are improved. The great progress in the area of inter-computer communication led to the development of means by which stand-alone processing sub-systems can be integrated into multi-computer 'communities'. ..."
Miron Livny (creator of Condor)