CRAB Tutorial (CMS Remote Analysis Builder)
What is CRAB • A tool that lets you run CMSSW on Grid environments • Data discovery through the DBS/DLS catalogues • You don’t need to know the details of the Grid environment • you just have to know how to run CMSSW • CRAB is written in Python • CRAB has to be installed on a Grid User Interface (UI) • The latest released CRAB version is “1_4_2”
What you need (before you start) • A UI (User Interface) on which to develop your code • This means you also need CMSSW available there • A valid Grid certificate for your VO, that is CMS • if you don’t have one yet, see http://cmsdoc.cern.ch/cms/aprom/www/top/CMS_VO.html • CRAB itself • You can ask your site administrator to install it for you on the UI. Otherwise…
How to install CRAB • Get CRAB either from AFS or the Web • /afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_X_Y_Z.tgz • http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/CRAB_X_Y_Z.tgz • Untar it (tar zxvf CRAB_X_Y_Z.tgz) • cd CRAB_X_Y_Z • Run ./configure, which • creates the CRAB configuration files (crab.sh, crab.csh) • installs BOSS • installs the DBS/DLS API
How to install CRAB (II) • source crab.sh (or crab.csh) • Warning: the tcsh shell hits the “word too long” problem, so with version 1_4_2 please use the bash shell • The very first time you will also need to run ./configureBoss • Sourcing crab.sh (or crab.csh) will tell you if this is necessary • This script sets up the BOSS configuration files, which are written under ~/.bossrc/ • If you’re working on lxplus at CERN, CRAB is already installed for you. Just: • source /afs/cern.ch/cms/ccs/wm/scripts/Crab/crab.sh
Environment setup • CMSSW • eval `scramv1 runtime -sh` (or -csh for csh/tcsh) • Grid UI commands • if you are working on lxplus at CERN, just source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.csh
How to set up CRAB (crab.cfg) • Relevant key-value pairs • jobtype = cmssw (currently only cmssw is supported) • scheduler = [edg, glite, glitecoll, condor-g] • glitecoll is the glite scheduler with bulk submission • datasetpath = [string retrieved from your query to the DBS web page] • You can find the string (datasetpath) at http://cmsdbs.cern.ch/discovery/ (keyword search)
How to set up CRAB II • pset = [the name of the Pset that fits your code] • total_number_of_events = [number of events you want to access, OR -1 to access all data] • events_per_job = [number of events accessed by a single job] OR • number_of_jobs = [number of jobs you want to run] • CRAB’s splitting algorithm works out the number of events per job, or the number of jobs, from the values you provide • output_file = [any name you like for your output file] • WARNING: the name must match the output file name inside your Pset
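The relationship between total_number_of_events, events_per_job and number_of_jobs can be illustrated with a little arithmetic. The following is a minimal sketch only: it is not CRAB's actual splitting algorithm (which may also take the layout of the dataset into account), and the function names plan_jobs and job_ranges are hypothetical. It assumes a total of -1 has already been resolved to the real dataset size.

```python
import math


def plan_jobs(total_events, events_per_job=None, number_of_jobs=None):
    """Return (number_of_jobs, events_per_job) for a known, positive total.

    Exactly one of events_per_job / number_of_jobs must be given,
    mirroring the "OR" between the two crab.cfg keys.
    Illustrative only: this is not CRAB's own splitting code.
    """
    if total_events <= 0:
        raise ValueError("resolve -1 (all events) to the dataset size first")
    if (events_per_job is None) == (number_of_jobs is None):
        raise ValueError("give exactly one of events_per_job or number_of_jobs")
    given = events_per_job if events_per_job is not None else number_of_jobs
    if given <= 0:
        raise ValueError("events_per_job / number_of_jobs must be positive")

    if events_per_job is not None:
        # Fixed job size: round up so the last, shorter job is still counted.
        return math.ceil(total_events / events_per_job), events_per_job

    # Fixed job count: spread the events, rounding the job size up.
    per_job = math.ceil(total_events / number_of_jobs)
    # Rounding up may mean fewer jobs are needed than were requested.
    return math.ceil(total_events / per_job), per_job


def job_ranges(total_events, events_per_job):
    """Yield (first_event_index, n_events) per job; the last job may be shorter."""
    first = 0
    while first < total_events:
        n_events = min(events_per_job, total_events - first)
        yield first, n_events
        first += n_events
```

For example, plan_jobs(1000, number_of_jobs=40) gives 40 jobs of 25 events each, while plan_jobs(1000, events_per_job=300) gives 4 jobs, the last of which processes only 100 events.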
How to set up CRAB III • return_data, copy_data = [0|1] • To receive the CMSSW output (i.e. the ROOT file) in the CRAB directory on the UI at the end of the job, and/or to copy it to an SE • if you use copy_data = 1 • storage_element = [name of the storage element where you want to copy your output files] • storage_path = [path on the SE] • Warning: you can use your own directory on castor (srm.cern.ch), but you must first change its write permissions (rfchmod) • Before going through the list of all the other keys, let us try to submit some jobs • Warning: CRAB should be run in a directory containing crab.cfg and Pset.cfg
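The dependency between the output keys (copy_data = 1 needs both storage_element and storage_path) is easy to check before submitting. This is a minimal sketch, assuming the two flags are named return_data and copy_data and that the settings have already been read into a plain dictionary; check_output_settings is a hypothetical helper, not part of CRAB.

```python
def check_output_settings(user):
    """Return a list of problems found in the output-handling keys.

    `user` is a plain dict of crab.cfg key/value pairs (an assumption of
    this sketch; CRAB itself reads them from crab.cfg).
    """
    problems = []
    return_data = int(user.get("return_data", 0))
    copy_data = int(user.get("copy_data", 0))

    # Both flags are documented as [0|1].
    for name, value in (("return_data", return_data), ("copy_data", copy_data)):
        if value not in (0, 1):
            problems.append(f"{name} must be 0 or 1, got {value}")

    # With both flags off, no destination for the output has been chosen.
    if return_data == 0 and copy_data == 0:
        problems.append("no output destination selected (return_data and copy_data are both 0)")

    # Copying to a storage element needs to know where to copy.
    if copy_data == 1:
        for key in ("storage_element", "storage_path"):
            if not user.get(key):
                problems.append(f"copy_data = 1 requires {key}")
    return problems
```

An empty list means the combination is consistent; for instance, {"copy_data": 1, "storage_element": "srm.cern.ch"} reports the missing storage_path.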
Setup for this tutorial • We use CMSSW_1_2_0 • Specific code • AnalysisExamples/SimTrackerAnalysis/SimHitTrackerAnalyzer • crab.cfg values • jobtype = cmssw • scheduler = glitecoll • datasetpath = /mc-onsel-120_Incl_ttbar/FETV/CMSSW_1_2_0-FEVT-1166726158 • pset = runSimHitAnalyzer.cfg • total_number_of_events = -1 (= all) • number_of_jobs = 40 • output_file = Histos.root
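To see how the tutorial values fit together in one file, the sketch below renders them in INI style with Python's standard configparser and reads them back. The grouping into [CRAB] and [CMSSW] sections and the spelling output_file are assumptions of this sketch (the slides only name the EDG and USER sections explicitly); render_crab_cfg and load_crab_cfg are hypothetical helpers.

```python
import configparser
import io

# Values copied from the tutorial setup; the section layout is an assumption.
TUTORIAL_SETTINGS = {
    "CRAB": {
        "jobtype": "cmssw",
        "scheduler": "glitecoll",
    },
    "CMSSW": {
        "datasetpath": "/mc-onsel-120_Incl_ttbar/FETV/CMSSW_1_2_0-FEVT-1166726158",
        "pset": "runSimHitAnalyzer.cfg",
        "total_number_of_events": "-1",
        "number_of_jobs": "40",
        "output_file": "Histos.root",
    },
}


def render_crab_cfg(settings):
    """Render a nested dict as INI-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_dict(settings)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def load_crab_cfg(text):
    """Parse INI-style text back into a ConfigParser for inspection."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


cfg_text = render_crab_cfg(TUTORIAL_SETTINGS)
cfg = load_crab_cfg(cfg_text)
```

Writing cfg_text to a file named crab.cfg in the directory that also holds runSimHitAnalyzer.cfg would match the warning about where CRAB should be run.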
Useful setup keys for CRAB • EDG section • rb = [cern, cnaf] • If this key is commented out, you use the default RB that your UI points to. Otherwise CRAB downloads some configuration files in order to use either the CERN or the CNAF RB • white_list, black_list: used to select or avoid specific sites hosting the data
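The effect of white_list and black_list can be pictured as a filter over the sites that host the dataset. The sketch below is an illustration under assumptions: the case-insensitive substring matching, the precedence of the black list, and the site names are all hypothetical and are not taken from CRAB.

```python
def select_sites(hosting_sites, white_list=(), black_list=()):
    """Keep sites that match the white list (if one is given) and do not
    match the black list. Matching is a case-insensitive substring test,
    which is an assumption of this sketch."""

    def matches(site, patterns):
        return any(pattern.lower() in site.lower() for pattern in patterns)

    selected = []
    for site in hosting_sites:
        if white_list and not matches(site, white_list):
            continue  # a white list was given and this site is not on it
        if matches(site, black_list):
            continue  # explicitly excluded
        selected.append(site)
    return selected


# Hypothetical host names, for illustration only.
sites = ["ce.site-a.example", "ce.site-b.example", "ce.site-c.example"]
allowed = select_sites(sites, white_list=["site-a", "site-b"], black_list=["site-b"])
```

With these inputs only ce.site-a.example remains, because site-b is excluded by the black list even though the white list names it.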
Useful setup keys for CRAB • USER section • additional_input_files = [comma-separated list of additional files you need on the WN] • ui_working_dir = [name of the working directory, if you don’t like the standard naming convention] • use_central_bossDB = [0|1|2] • 0 means that you create one SQLite db per task • 1 means that you create a single SQLite db located in your home directory • 2 means that you provide the BOSS configuration files via a given directory
Basic CRAB commands • simplest command • crab -create -submit N • by default CRAB will create ALL jobs according to the information provided • status, output, kill… • crab -status • crab -getoutput [range] • crab -kill [range]
Basic CRAB commands • Resubmission • crab -resubmit [range] • What if something went wrong (aborted jobs)? • crab -postMortem • this command queries the Grid and returns a verbose report of what happened during the job lifetime, which is useful when you ask Grid experts for support
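When the commands above are driven from a script, it helps to build the argument lists in one place. This is a minimal sketch: crab_args is a hypothetical helper, the action names are the ones listed on the two command slides, and the range string "1-3" is only an assumed example of the [range] syntax. Nothing here actually runs crab.

```python
KNOWN_ACTIONS = (
    "create", "submit", "status", "getoutput",
    "kill", "resubmit", "postMortem",
)


def crab_args(*steps):
    """Build a crab argument list.

    Each step is either an action name ("status") or an
    (action, argument) pair such as ("submit", 5).
    """
    args = ["crab"]
    for step in steps:
        action, argument = (step, None) if isinstance(step, str) else step
        if action not in KNOWN_ACTIONS:
            raise ValueError(f"unknown CRAB action: {action}")
        args.append(f"-{action}")
        if argument is not None:
            args.append(str(argument))
    return args


# A typical sequence: create and submit, poll, fetch output, investigate failures.
workflow = [
    crab_args("create", ("submit", 5)),
    crab_args("status"),
    crab_args(("getoutput", "1-3")),  # "1-3" is an assumed range format
    crab_args("postMortem"),
]
```

On a UI where CRAB is set up, each list could be handed to subprocess.run(args, check=True) from the directory containing crab.cfg.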
Dashboard reports • CRAB jobs are also monitored by the Dashboard • http://arda-dashboard.cern.ch/cms/
Documentation and Feedback • CRAB homepage • http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/ • FAQ • https://twiki.cern.ch/twiki/bin/view/CMS/Crab.faq • To report a bug or suggest new features • hn-cms-crabFeedback@cern.ch (registration at https://hypernews.cern.ch/) • https://savannah.cern.ch/projects/crab/