BOSS: a tool for batch job monitoring and book-keeping
Claudio Grandi (INFN Bologna)
CHEP'03 Conference, San Diego

BOSS
• “Batch Object Submission System”
• A tool for job monitoring and book-keeping
• Lets the user handle job-specific information
• Not a job scheduler, but can be interfaced with most schedulers:
  - LSF (CERN, INFN)
  - PBS (Bristol, Caltech, UFL, Imperial College, INFN)
  - FBSNG (Fermilab)
  - Condor (INFN, U.Wisconsin)
• Designed to work on computing farms
• Compatible with use on a WAN, but not (yet) robust against network failures

Basic BOSS components
• boss executable: the BOSS interface to the user
• MySQL database: where BOSS stores job information
• jobExecutor executable: the BOSS wrapper around the user job
• dbUpdator executable: the process that writes to the database while the job is running
• The local scheduler may be a “Grid” scheduler

Basic flow
• BOSS accepts job submissions from users
• Stores information about the job in a DB
• Builds a wrapper around the job (jobExecutor)
• Sends the wrapper to the local scheduler
• The wrapper sends information about the job to the DB
[Diagram labels: boss submit, boss query, boss kill; BOSS; BOSS DB; Local Scheduler; Wrapper; farm nodes]

User-defined information
User registers a job type:
• Schema for the information to be monitored
  - A new table is created in the BOSS database with a defined structure
• Algorithms to retrieve the information from the job
  - The user programs (filters) are stored in the database as blobs
User submits jobs:
• One or more job types can be specified for the job
  - A new entry is created for the job in the database tables
  - The filters are extracted from the database and made available to the running job

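The registration and submission steps above can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite stands in for the MySQL database, and the `JOBTYPE` table, the function names and the placeholder filter text are hypothetical, not the actual BOSS schema or API.

```python
import sqlite3

def register_job_type(db, name, schema, filter_source):
    """Create one table for the job type and store its filter program as a blob."""
    # Identifiers cannot be bound as SQL parameters, so `name` and the column
    # names are interpolated; this sketch assumes they come from a trusted user.
    columns = ", ".join(f"{col} {sqltype}" for col, sqltype in schema.items())
    db.execute(f"CREATE TABLE {name} (JOBID INTEGER, {columns})")
    # Hypothetical bookkeeping table holding the filter of each job type.
    db.execute("CREATE TABLE IF NOT EXISTS JOBTYPE (NAME TEXT PRIMARY KEY, FILTER BLOB)")
    db.execute("INSERT INTO JOBTYPE VALUES (?, ?)", (name, filter_source.encode()))

def submit_job(db, job_id, job_types):
    """Add one row per job type and return the filters the running job will need."""
    filters = {}
    for name in job_types:
        db.execute(f"INSERT INTO {name} (JOBID) VALUES (?)", (job_id,))
        (blob,) = db.execute(
            "SELECT FILTER FROM JOBTYPE WHERE NAME = ?", (name,)
        ).fetchone()
        filters[name] = bytes(blob).decode()
    return filters

FILTER_TEXT = "#!/usr/bin/perl\n# filter program text goes here\n"  # placeholder

db = sqlite3.connect(":memory:")
register_job_type(db, "test", {"COUNTER": "INTEGER"}, FILTER_TEXT)
filters = submit_job(db, 1234, ["test"])
rows = db.execute("SELECT JOBID, COUNTER FROM test").fetchall()
print(rows)
```

The new row starts with an empty COUNTER; in this sketch it would be filled in later, once the filter reports a value for the running job.
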
The job interface to BOSS
• The job interfaces to BOSS are its standard input, output and error streams
• The user-defined algorithms are filters that read stdin/out/err and write key=value pairs
• The keys are the user-defined schema variables

User job:
    #!/usr/bin/perl
    $i = 0;
    while ($i < 3) {
        sleep(1);
        $i++;
        print "counter $i\n";
    }

User job stdout:
    counter 1
    counter 2
    counter 3

Filter:
    #!/usr/bin/perl
    while (<STDIN>) {
        if ($_ =~ /.*counter\s+(\d+).*/) {
            print "COUNTER=$1\n";
        }
    }

Filter output:
    COUNTER=1
    COUNTER=2
    COUNTER=3

Journal:
    1234 JOB T_START xxx
    1234 JOB …… ……
    1234 test counter 1
    1234 test counter 2
    1234 test counter 3
    1234 JOB …… ……
    1234 JOB T_STOP yyy

BOSS DB, table test:
    JOBID COUNTER
    12345 0

[Diagram labels: User job, stdout, BOSS jobExecutor, Filter, journal, BOSS dbUpdator, BOSS DB]

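To make the key=value convention concrete, here is a minimal Python sketch of the same counter filter, followed by a step that keeps the latest value seen for each key. The function names are hypothetical and the second step is an assumption about how a consumer of the filter output might treat it, not a description of dbUpdator.

```python
import re

# Same pattern as the Perl filter on the slide.
COUNTER_LINE = re.compile(r".*counter\s+(\d+).*")

def counter_filter(lines):
    """Yield 'COUNTER=<n>' for every job output line that matches the pattern."""
    for line in lines:
        match = COUNTER_LINE.match(line)
        if match:
            yield f"COUNTER={match.group(1)}"

def latest_values(pairs):
    """Keep the most recent value seen for each key."""
    latest = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        latest[key] = value
    return latest

job_stdout = ["counter 1", "some other output", "counter 2", "counter 3"]
pairs = list(counter_filter(job_stdout))
print(pairs)
print(latest_values(pairs))
```

Run as a real filter, `counter_filter` would iterate over `sys.stdin` instead of a list.
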
Runtime data flow
[Diagram. Components: jobExecutor; STDIN, STDOUT and STDERR of the user job; tee processes and pipes (OUT pipe, ERR pipe); USER LOG; three RunTime Filters, each with its own pipe; Journal; dbUpdator; BOSS DB.
Legend: user supplied or returned to the user; temporary processes and files; BOSS processes and files; standard input or output; standard error; other I/O streams]

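The diagram names tee processes between the job streams, the user log and the runtime filters. A minimal sketch of one such tee on standard output is given below. It assumes a single stream, an in-memory list in place of the log file and the journal, and a Python one-liner standing in for the user job; all names are hypothetical and this is not how jobExecutor is implemented.

```python
import subprocess
import sys

def run_with_tee(command, user_log, filter_line):
    """Run command, sending each stdout line both to the user log and to a filter."""
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            user_log.append(line)   # branch 1: output returned to the user
            filter_line(line)       # branch 2: runtime filter
    # Leaving the with-block waits for the child, so returncode is set here.
    return proc.returncode

journal = []

def counter_filter(line):
    """Append COUNTER=<n> to the journal for lines of the form 'counter <n>'."""
    words = line.split()
    if len(words) == 2 and words[0] == "counter":
        journal.append(f"COUNTER={words[1]}")

# Stand-in for the user job: prints 'counter 1' to 'counter 3'.
user_job = [sys.executable, "-c", "for i in range(1, 4): print('counter', i)"]
user_log = []
exit_code = run_with_tee(user_job, user_log, counter_filter)
print(exit_code, user_log, journal)
```
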
Queries
Standard queries:
• Get job status and user-defined quantities

    % boss q -all -specific -type test
    ID  S_USR   EXECUTABLE  ST  EXE_HOST    START TIME      STOP TIME       comment   counter
    1   grandi  test.pl 15  E   pccms10.bo  14:30:00 06/06  14:30:16 06/06  ...STOP   15
    2   grandi  test.pl 15  R   pccms10.bo  14:30:02 06/06  --------------  START...  13

Advanced queries:
• Use SQL to query job info (standard + user defined)
• Output suitable for parsing by a script:

    % boss SQL -query "select JOB.ID,EXEC,counter from JOB,test WHERE JOB.ID=test.JOBID"
    3,4,23,9
    ID  EXEC                   counter
    1   test.pl                15
    2   test.pl                13

  - Information line (3,4,23,9): number of fields, width of 1st field, …, width of nth field
  - Header line (ID EXEC counter): the field names

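As an illustration of why the information line helps a script, the sketch below parses output of the shape shown for the SQL query. It assumes that the fields are left-aligned, padded to the stated widths and not separated by any extra character; the exact spacing of the real output is not recoverable from the slide, and the function name is hypothetical.

```python
def parse_boss_sql_output(text):
    """Parse an information line, a header line and fixed-width data rows."""
    lines = text.splitlines()
    info = [int(item) for item in lines[0].split(",")]
    n_fields, widths = info[0], info[1:]
    if len(widths) != n_fields:
        raise ValueError("information line does not match its field count")

    def split_fixed(line):
        # Cut the line at the cumulative widths and strip the padding.
        fields, start = [], 0
        for width in widths:
            fields.append(line[start:start + width].strip())
            start += width
        return fields

    header = split_fixed(lines[1])
    return [dict(zip(header, split_fixed(line))) for line in lines[2:] if line.strip()]

def fixed_width_row(values, widths):
    """Build one sample row padded to the given widths (test data only)."""
    return "".join(value.ljust(width) for value, width in zip(values, widths))

WIDTHS = [4, 23, 9]
sample = "\n".join([
    "3,4,23,9",
    fixed_width_row(["ID", "EXEC", "counter"], WIDTHS),
    fixed_width_row(["1", "test.pl", "15"], WIDTHS),
    fixed_width_row(["2", "test.pl", "13"], WIDTHS),
])
rows = parse_boss_sql_output(sample)
print(rows)
```

Reading the widths from the information line means the script does not need to guess where a field ends, even if a value contains spaces.
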
Interface to the scheduler
User registers a scheduler:
• Scripts for job submission, deletion and query
  - The scripts are stored in the database as blobs
  - The fork scheduler is already registered
User submits/deletes/queries jobs:
• The scheduler can be specified for the submission
  - The boss executable fetches the scripts from the database and uses them as the interface to the scheduler
• Job submission via ClassAd file is supported
  - BOSS manages the keys it understands and passes the others to the submission script
  - User-defined keys are possible!

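The key handling described for ClassAd submission can be pictured with a small sketch. It assumes the ClassAd has already been parsed into a dictionary; the set of keys that BOSS understands is not listed on the slide, so `KNOWN_KEYS` and the sample keys below are hypothetical.

```python
# Hypothetical set of keys handled by the tool itself.
KNOWN_KEYS = {"executable", "stdin", "stdout", "stderr"}

def split_keys(classad, known_keys=KNOWN_KEYS):
    """Separate the keys handled directly from those passed to the submission script."""
    handled, passed_on = {}, {}
    for key, value in classad.items():
        target = handled if key in known_keys else passed_on
        target[key] = value
    return handled, passed_on

classad = {
    "executable": "test.pl",
    "stdout": "test.out",
    "queue": "short",     # hypothetical scheduler-specific key
    "my_tag": "run42",    # hypothetical user-defined key
}
handled, passed_on = split_keys(classad)
print(handled)
print(passed_on)
```

Passing unknown keys through untouched is what lets a site-specific submission script accept its own options without any change to the tool.
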
BOSS as a grid-tool
• Tested on the European DataGrid testbed
• Interface scripts included in the BOSS distribution
• See talk by P.Capiluppi
• dbUpdator uses native MySQL calls
• Proof of concept using R-GMA (from EDG-WP3) as BOSS transport layer (H.Nebrensky, Brunel Univ.)
[Diagram labels: boss submit, boss query, boss kill, boss registerScheduler; Local BOSS gateway; BOSS DB; GRID Scheduler; gatekeepers; farm nodes]

BOSS and R-GMA
[Diagram labels: User Interface; boss executable; BOSS DB; Input/Output Sandbox; EDG WP1 + GRAM; Computing Element; Worker Node; jobExecutor starts user job; User output; BOSS journal; R-GMA enabled dbUpdator; R-GMA Producer servlets; R-GMA Receiver servlets; R-GMA Registry (lookup, subscribe); Firewall]

Current use of BOSS
CMS 2002 productions:
• about 500,000 jobs running in about 20 regional centers
• complete book-keeping of every single job
CMS/EDG stress test (Nov.-Dec. 2002):
• about 10,000 jobs submitted by 4 user interfaces on the European DataGrid testbed
• allowed validation of jobs for which the output sandbox was lost due to EDG internals
R-GMA demo at EDG review (Feb. 2002):
• proof of concept

BOSS data analysis
boss2root (by D.Bonacorsi)
• Produces ROOT trees from BOSS MySQL tables
• Used to analyze the data of the CMS/EDG stress test
  - complete classification of problems
  - graphical representation of results

Summary
• BOSS is a tool that allows real-time monitoring and book-keeping of batch jobs
• User-defined information is archived for different job types
• It has been used by CMS for the 2002 official productions
• It has been used during the CMS/EDG stress test in a grid environment
• It is a general tool: nothing CMS or even HEP specific
Web site: http://www.bo.infn.it/cms/computing/BOSS/