CRAB Tutorial (CMS Remote Analysis Builder)

  1. CRAB Tutorial (CMS Remote Analysis Builder)

  2. What is CRAB
     • A tool that lets you run CMSSW on Grid environments
     • Data discovery through the DBS/DLS catalogues
     • You don't need to know the details of the Grid environment; you only have to know how to run CMSSW
     • CRAB is written in Python
     • CRAB has to be installed on a Grid User Interface (UI)
     • The latest released CRAB version is "1_4_2"

  3. What you need (before you start)
     • A UI (User Interface) on which to develop your code
       • This means you also need CMSSW available there
     • A valid Grid certificate provided by your VO, that is, CMS
       • If you don't have one yet, have a look at http://cmsdoc.cern.ch/cms/aprom/www/top/CMS_VO.html
     • CRAB itself
       • Your site administrator may be able to install it on the UI for you. Otherwise…

  4. How to install CRAB
     • Get CRAB either from AFS or from the Web
       • /afs/cern.ch/cms/ccs/wm/scripts/Crab/CRAB_X_Y_Z.tgz
       • http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/CRAB_X_Y_Z.tgz
     • Untar it (tar zxvf CRAB_X_Y_Z.tgz)
     • cd CRAB_X_Y_Z
     • Run ./configure, which
       • creates the CRAB configuration files (crab.sh and crab.csh)
       • installs BOSS
       • installs the DBS/DLS API

  5. How to install CRAB (II)
     • source crab.sh (or crab.csh)
       • Warning: the tcsh shell hits a "word too long" problem, so with 1_4_2 please use the bash shell
     • The very first time you will also need to run ./configureBoss
       • Sourcing crab.sh (or crab.csh) tells you if this is necessary
       • This script sets up the BOSS configuration files, which are written under ~/.bossrc/
     • If you are working on lxplus at CERN, CRAB is already installed for you. Just run:
       • source /afs/cern.ch/cms/ccs/wm/scripts/Crab/crab.sh

  6. Environment setup
     • CMSSW
       • eval `scramv1 runtime -sh` (or `scramv1 runtime -csh` for csh/tcsh)
     • Grid UI commands
       • If you are working on lxplus at CERN, just source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.csh

  7. How to set up CRAB (crab.cfg)
     • Relevant key-value pairs
       • jobtype = cmssw (currently only cmssw is supported)
       • scheduler = [edg, glite, glitecoll, condor-g]
         • glitecoll is the glite scheduler with bulk submission
       • datasetpath = [string retrieved from your query to the DBS web page]
         • Find the datasetpath string with the keyword search at http://cmsdbs.cern.ch/discovery/

  8. How to set up CRAB (II)
     • pset = [name of the pset file that fits your code]
     • total_number_of_events = [number of events you want to access, or -1 to access all data]
     • events_per_job = [number of events accessed by a single job]
       OR
     • number_of_jobs = [number of jobs you want to run]
       • CRAB's splitting algorithm works out the right number of events/jobs from the user's request
     • output_file = [any name you like for your output file]
       • WARNING: this name must be consistent with the output file name inside your pset
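The relation between total_number_of_events, events_per_job and number_of_jobs can be sketched as below. This is only an illustration of the arithmetic, not CRAB's actual splitting algorithm, which also has to take into account how the data are stored. The function name plan_jobs and the rounding rule are assumptions made for this example.

```python
import math


def plan_jobs(total_events, events_per_job=None, number_of_jobs=None):
    """Return one (first_event, n_events) tuple per job.

    Hypothetical helper: it shows the arithmetic only and is not taken
    from the CRAB source code.
    """
    # The slide offers events_per_job OR number_of_jobs: require exactly one.
    if (events_per_job is None) == (number_of_jobs is None):
        raise ValueError("set exactly one of events_per_job or number_of_jobs")
    # -1 ("all data") would need the dataset size from the catalogue,
    # which this sketch does not look up.
    if total_events <= 0:
        raise ValueError("this sketch needs an explicit, positive total_events")
    if number_of_jobs is not None:
        if number_of_jobs <= 0:
            raise ValueError("number_of_jobs must be positive")
        # Assumed rounding rule: round up so that no event is left out.
        events_per_job = math.ceil(total_events / number_of_jobs)
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")

    jobs = []
    first = 0
    while first < total_events:
        n = min(events_per_job, total_events - first)  # last job may be shorter
        jobs.append((first, n))
        first += n
    return jobs


if __name__ == "__main__":
    for first, n in plan_jobs(1000, number_of_jobs=3):
        print(first, n)
```

With this rounding rule the last job simply takes whatever is left, so asking for more jobs than there are events yields fewer jobs than requested.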

  9. How to set up CRAB (III)
     • return_data = [0|1], copy_data = [0|1]
       • return_data retrieves the CMSSW output (i.e. the ROOT file) into the CRAB directory on the UI at the end of the job; copy_data copies it to an SE (Storage Element). You can use either or both
     • If you use copy_data = 1
       • storage_element = [name of the storage element where you want to copy your output files]
       • storage_path = [path on the SE]
       • Warning: you can use your own directory on CASTOR (srm.cern.ch), but you first have to change its write permissions (rfchmod)
     • Before going through the list of all the other keys, we will try to submit some jobs
     • Warning: CRAB should be run in a directory containing crab.cfg and Pset.cfg
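The dependency between copy_data and the two storage keys can be checked before submitting. The sketch below is a hypothetical helper, not part of CRAB: the function name, the plain dict input and the assumption that a missing key counts as 0 are all choices made for this example.

```python
def check_output_settings(user):
    """Return a list of problems found in the output-related keys.

    `user` is a plain dict of key -> string value, as it might be read
    from crab.cfg. Hypothetical helper, not part of CRAB.
    """
    problems = []
    # Assumption: a key that is absent is treated as "0".
    return_data = user.get("return_data", "0") == "1"
    copy_data = user.get("copy_data", "0") == "1"

    # From the slide: copy_data = 1 needs a storage element and a path on it.
    if copy_data:
        for key in ("storage_element", "storage_path"):
            if not user.get(key):
                problems.append(f"copy_data = 1 but {key} is not set")

    # Assumption of this sketch: with both keys at 0 nothing brings the
    # output back, so flag it.
    if not (return_data or copy_data):
        problems.append("neither return_data nor copy_data is 1")
    return problems


if __name__ == "__main__":
    settings = {"copy_data": "1", "storage_element": "srm.cern.ch"}
    for problem in check_output_settings(settings):
        print(problem)
```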

  10. Setup for this tutorial
      • We use CMSSW_1_2_0
      • Specific code
        • AnalysisExamples/SimTrackerAnalysis/SimHitTrackerAnalyzer
      • crab.cfg values
        • jobtype = cmssw
        • scheduler = glitecoll
        • datasetpath = /mc-onsel-120_Incl_ttbar/FETV/CMSSW_1_2_0-FEVT-1166726158
        • pset = runSimHitAnalyzer.cfg
        • total_number_of_events = -1 (= all)
        • number_of_jobs = 40
        • output_file = Histos.root
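The tutorial values can be collected into a crab.cfg file with a few lines of Python. This is a minimal sketch: the slides name only the EDG and USER sections, so the section names [CRAB] and [CMSSW] used here are an assumption to be checked against the crab.cfg template shipped with your CRAB version. The output key is spelled output_file, as on the slide that introduces it.

```python
import configparser
import io


def build_crab_cfg():
    """Collect the tutorial values in a ConfigParser object."""
    # interpolation=None keeps values literal; optionxform=str keeps key case.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str

    # Assumed section name (not given on the slides).
    cfg["CRAB"] = {
        "jobtype": "cmssw",
        "scheduler": "glitecoll",
    }
    # Assumed section name (not given on the slides).
    cfg["CMSSW"] = {
        "datasetpath": "/mc-onsel-120_Incl_ttbar/FETV/CMSSW_1_2_0-FEVT-1166726158",
        "pset": "runSimHitAnalyzer.cfg",
        "total_number_of_events": "-1",  # -1 means all events
        "number_of_jobs": "40",
        "output_file": "Histos.root",  # must match the name inside the pset
    }
    return cfg


def render(cfg):
    """Return the configuration as the text that would go into crab.cfg."""
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render(build_crab_cfg()))
```

To create the file itself, write the rendered text to crab.cfg in the directory that also holds the pset file.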

  11. Useful setup keys for CRAB
      • EDG section
        • rb = [cern, cnaf]
          • If this key is commented out, you use the default RB to which your UI points. Otherwise CRAB downloads the configuration files needed to use either the CERN or the CNAF RB
        • white_list, black_list: used to select or avoid specific sites hosting the data

  12. Useful setup keys for CRAB
      • USER section
        • additional_input_files = [comma-separated list of additional files you need on the WN]
        • ui_working_dir = [name of the working directory, if you don't like the standard naming convention]
        • use_central_bossDB = [0|1|2]
          • 0 means that you create one SQLite database per task
          • 1 means that you create a single SQLite database, located in your home directory
          • 2 means that you provide the BOSS configuration files via a given directory

  13. Basic CRAB commands
      • Simplest command
        • crab -create -submit N
        • By default CRAB creates ALL jobs according to the information provided
      • Status, output, kill…
        • crab -status
        • crab -getoutput [range]
        • crab -kill [range]

  14. Basic CRAB commands
      • Resubmission
        • crab -resubmit [range]
      • What if something went wrong (aborted jobs)
        • crab -postMortem
        • This command queries the Grid and returns a verbose report of what happened during the job lifetime, which is useful when you ask Grid experts for support
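When the commands above are driven from a script, it helps to build the argument list in one place. The sketch below only assembles the command line and does not run anything. The helper crab_command is hypothetical, and the job range is passed through as an opaque string because its syntax is not described here.

```python
import shlex

# The options that appear on the two command slides.
ACTIONS = ("create", "submit", "status", "getoutput", "kill",
           "resubmit", "postMortem")


def crab_command(*steps):
    """Build the argv list for one crab invocation.

    Each step is an action name, or an (action, argument) tuple when the
    action takes a value such as a job count or a range.
    Hypothetical helper, not part of CRAB.
    """
    argv = ["crab"]
    for step in steps:
        action, arg = step if isinstance(step, tuple) else (step, None)
        if action not in ACTIONS:
            raise ValueError(f"unknown crab action: {action}")
        argv.append("-" + action)
        if arg is not None:
            argv.append(str(arg))  # passed through unchanged
    return argv


if __name__ == "__main__":
    # Create all jobs and submit the first 10 of them.
    print(shlex.join(crab_command("create", ("submit", 10))))
    print(shlex.join(crab_command("status")))
```

On a UI where the CRAB environment has been sourced, the returned list could be handed to subprocess.run; that step is left out so the sketch stays runnable anywhere.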

  15. Dashboard reports
      • CRAB jobs are also monitored by the Dashboard
        • http://arda-dashboard.cern.ch/cms/

  16. Documentation and Feedback
      • CRAB homepage
        • http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/
      • FAQ
        • https://twiki.cern.ch/twiki/bin/view/CMS/Crab.faq
      • To report a bug or suggest new features
        • hn-cms-crabFeedback@cern.ch (registration at https://hypernews.cern.ch/)
        • https://savannah.cern.ch/projects/crab/

