CMS T2_FR_CCIN2P3
Towards the Analysis Facility (AF)
JP CMS-France, May 27-28, 2009, Strasbourg

• Available Resources
• Data Transfers - PhEDEx
• User Tools - CRAB
• Users Jobs Monitoring
• Conclusions

Tibor Kurča, Institut de Physique Nucléaire de Lyon

T.Kurca JP CMS-France

CMS Distributed Analysis
• To be run at T2/T3 or locally: T2/T3 local resources needed
• CMS software (CMSSW) pre-installed on the sites
• Grid analysis is data driven: physics groups data allocation
• Data distribution via PhEDEx: specifics for sites hosting both T1 & T2
• User tools to run analysis jobs: CRAB
• Monitoring of job-related activities: tracked by Dashboard (central monitoring service)

T2/T3 2009 Pledged Resources

                        T2             T3
CPU                     845k SI2k      562k SI2k
                        (~500 jobs)    (~340 jobs)
Disk space (dCache)     171 TB         114 TB

Physics groups          4 x 30 TB (EWK 38 TB)
/sps                    25+8 TB (50% usage)
xrootd                  25 TB (24% usage)

CMS Data Access
[Diagram: data access paths at the site. Labels shown: T0; T1, T2, T3; HPSS; DMZ; IFZ; dCache pools (Transf In/Out, Prod pool, Data pool, Imp T0 pool, Analysis pool); Semipermanent /sps (gpfs); xrootd. Access protocols: rfcp, srmcp, dcap, gsidcap, xrootd, cp. Consumers: Prod-Merging Jobs, Production Jobs, Analysis Jobs. Capacity labels: 16 TB, 33 TB, 1 TB, 150 TB, 84 TB, 38 TB, 25 TB.]

CMSSW Installations
• Centralized, from T0
  - by high-priority grid jobs
  - release version published in the site information system
  - deprecated releases removed
• Locally: possibility of additional installations for the needs of local users
• Two partitions:
  1. /afs/in2p3.fr/grid/toolkit/cms2 = $VO_CMS_SW_DIR
       ccali38:tcsh[210] fs lq
       Volume Name      Quota       Used        %Used   Partition
       grid.kit.cms2    60000000    39153734    65%     60%
     - in the past: problems with space & removal of old releases
     - additional 30 GB + central regular removal
  2. /afs/in2p3.fr/grid/toolkit/cms
       ccali05:tcsh[214] fs lq
       Volume Name      Quota       Used        %Used   Partition
       grid.kit.cms     46000000    31918855    69%     49%
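
The %Used figures in the `fs lq` listings above follow directly from the Quota and Used columns. A minimal sketch of that arithmetic is given below, assuming the output is pasted in as plain text and that AFS reports both columns in kilobytes; the function name `parse_fs_lq` and the dictionary keys are illustrative choices, not part of any AFS tool.

```python
# Output of `fs lq` for the two CMS software volumes, as quoted on the slide.
SAMPLE = """
Volume Name Quota Used %Used Partition
grid.kit.cms2 60000000 39153734 65% 60%
Volume Name Quota Used %Used Partition
grid.kit.cms 46000000 31918855 69% 49%
"""


def parse_fs_lq(output):
    """Parse `fs lq` text into a list of dicts (sizes assumed to be in kilobytes)."""
    rows = []
    for line in output.strip().splitlines():
        fields = line.split()
        # Header lines have six words and no numeric second column: skip them.
        if len(fields) != 5 or not fields[1].isdigit():
            continue
        quota_kb, used_kb = int(fields[1]), int(fields[2])
        rows.append({
            "volume": fields[0],
            "quota_kb": quota_kb,
            "used_kb": used_kb,
            # Recompute the percentage instead of trusting the printed column.
            "used_pct": round(100 * used_kb / quota_kb),
            "free_kb": quota_kb - used_kb,
        })
    return rows


rows = parse_fs_lq(SAMPLE)
for row in rows:
    print(row["volume"], f'{row["used_pct"]}% used,', row["free_kb"], "kB free")
```

A check of this kind could flag a software volume before it fills up, which is the situation the slide describes for the past space problems.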

T2 & Physics Groups
• CCIN2P3: EWK, QCD, Tau/Pflow, Tracker
• IPHC: Top, b-tag
• GRIF: Higgs, Exotica, Egamma

CMS Tier 2 vs Tier 1
• T2_FR_CCIN2P3 is specific
• Usually different sites host different Tiers
  - exceptions: CERN (T0, T1, T2), FNAL (T1, T3) and CCIN2P3 (T1, T2)
  - CE …. ok
  - SE, PhEDEx node: some complications to be solved
• What can we learn from CERN/FNAL?
[Diagram: Tier 1 CC IN2P3, Tier 2 CC AF, Tier 2 GRIF, Tier 2 IPHC, Tier 3 IPNL]

CERN-FNAL Comparison

                 CERN                              FNAL
PhEDEx nodes     different                         different
SE               really different                  different (only alias)
                 srm-cms.cern.ch (T1)              cmssrm.fnal.gov (T1)
                 caf.cern.ch (T2)                  cmsdca2.fnal.gov (T3)
dCache           the same for T1 & T2              the same for T1 & T3
Disk pools       different                         the same
                 needed special download agents

Data Transfers
• CERN:
  - T2 subscription: if the data are already at T1, then no actual PhEDEx transfer again …. just staging to the right disk
  - developed dedicated T1→CAF local download agents to ensure replication to the correct service class and to register downloaded data in the local CAF DBS
  - using space tokens to separate T1→T1_CH_CERN from T1→T2 transfers
• FNAL:
  - a T1 subscription doesn't automatically mean the data are also at T3
  - T1 data are fully accessible via CRAB to T3 users (no blacklisting)
  - user data are subscribed to T3; track is kept by the T3 manager
  - as the dCache is the same for T1/T3, T3 data will be migrated to tape, but PhEDEx doesn't know about it
  - caveat: don't subscribe the same data to T1 & T3
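
The CERN rule above (no second PhEDEx transfer when the data already sit at the T1, only staging to the right disk) can be pictured as a small decision function. This is a minimal illustration, not PhEDEx code: the function name, the action labels and the dataset names are all hypothetical.

```python
def plan_t2_subscription(dataset, datasets_at_t1):
    """Decide how to satisfy a T2 subscription at a site that also hosts a T1.

    Hypothetical helper for illustration; PhEDEx exposes no such function.
    """
    if dataset in datasets_at_t1:
        # Data already on site: no new transfer, only staging to the T2 disk.
        return "stage_to_t2_disk"
    # Data not on site yet: a regular PhEDEx transfer is required.
    return "phedex_transfer"


# Hypothetical dataset names, for illustration only.
at_t1 = {"/ExampleA/Run1/RECO"}
actions = {
    name: plan_t2_subscription(name, at_t1)
    for name in ("/ExampleA/Run1/RECO", "/ExampleB/Run1/RECO")
}
print(actions)
```

The point of the sketch is only that the decision depends on what is already stored on site, which is why the download agents at such a site need site-specific configuration.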

T2_FR_CCIN2P3 Before
• Site configuration:
  - CE: different for T1 & T2
  - SE: one for both T1 & T2
  - PhEDEx: only T1 node
• Access to data in T1 for users of T2
  - data stored at T1 only
  - non-production jobs to be run at T2
• Jobs: temporary hack from CRAB_2_4_4 (Jan 23, 2009)
  - user jobs can access T1_CCIN2P3 data without the show-prod = 1 option
  - …. all T1s are masked in DLS by default except CCIN2P3
  - in the end transparent for the user

T2_FR_CCIN2P3 Now
• dCache: the same for T1 & T2
• Disk pools: only for T1? Create specific ones for T2? … for the moment one pool
• PhEDEx nodes: T1_FR_CCIN2P3_Buffer, T1_FR_CCIN2P3_MSS
  - created & installed T2 node T2_FR_CCIN2P3 as disk only
  - VOBox …. cclcgcms06
• SE: ccsrm.in2p3.fr (T1) - ccsrm.in2p3.fr (T2)
  - created T2-specific ccsrmt2.in2p3.fr …. alias
• Main goals:
  - avoid transferring the same data twice
  - avoid T1→T2 intra-CCIN2P3 transfers
  - avoid hacks on different levels
  These should be solved at the PhEDEx level with a different T2_FR_CCIN2P3 node & correct config

CRAB: CMS Remote Analysis Builder
• Transparent access to distributed data & computing resources
  https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrab
• Intended to simplify the creation & submission of CMS analysis jobs to the grid
• Implemented in Python as a batch-like command line application:
  crab -c crab.cfg -create (-submit, -status, -getoutput, -resubmit ….)
• CRAB standalone: direct submission from the UI via WMS
  - simple, but lacks some important features
  - suitable for small tasks (~100 jobs)
  - limits the size of the sandbox
• Client-server architecture: CRABServer
  - automates as much as possible of the whole analysis workflow: (re)submission, error handling, output retrieval
  - improves the scalability of the system
  - transparent to end users: interface, installation, configuration procedure and usage are the same as in standalone mode
  - possibility of submission to a local batch system! …. for BQS, a BossLite plugin needs to be written

CRAB Architecture
[Diagram, courtesy G. Codispoti]

CRAB Installations
• CRAB client 2_5_1
  https://twiki.cern.ch/twiki/bin/view/CMS/CrabClientRelNotes251
  - installed on afs: $VO_CMS_SW_DIR/CRAB
  - no need for private installations!
• CRAB Server 1_0_6
  https://twiki.cern.ch/twiki/bin/view/CMS/CrabServer#CRABSERV
  https://twiki.cern.ch/twiki/bin/view/CMS/CrabServer_RelNotes_106
  - installed from scratch on the new hardware node ccgridli03.in2p3.fr:
    double powering, Intel Xeon 2.50 GHz (E5420), 16 GB RAM, 250 GB disk, RAID (redundancy), SATA
  - monitoring: http://ccgridli03.in2p3.fr:8888/

CRAB Environment
Set up your environment:
1) Grid UI:
     lcg_env
2) CMSSW environment:
     cms_def    alias for: source $VO_CMS_SW_DIR/cmsset_default.(c)sh
     cms_sw     alias for: eval `scramv1 runtime -(c)sh`
3) CRAB environment:
     crabX      alias for: source $VO_CMS_SW_DIR/CRAB/crab.(c)sh
OR, if working in an existing directory, simply do:
     cms_env    alias for: source $VO_CMS_SW_DIR/cmsenv.(c)sh

CRAB Data Stageout
• CRAB Server usage: crab.cfg
    [CRAB]
    scheduler = glite
    jobtype = cmssw
    server_name = in2p3
• Without the CMS Storage Name Convention:
    [USER]
    copy_data = 1
    storage_element = ccsrmt2.in2p3.fr
    user_remote_dir = /test
    storage_path = /srm/managerv2?SFN=/pnfs/in2p3.fr/data/cms/data/store/user/kurca
• With the CMS Storage Name Convention:
    [USER]
    copy_data = 1
    storage_element = T2_FR_CCIN2P3
    user_remote_dir = /test
  Data will be written to /pnfs/in2p3.fr/data/cms/data/store/user/kurca/test
  …. the same as in the case without the convention!
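
For the case without the storage name convention, the destination directory quoted above can be reproduced from the two [USER] settings. The sketch below reads the configuration with Python's standard `configparser` and joins the part of `storage_path` after `SFN=` with `user_remote_dir`. It only mirrors what the slide states; it is not CRAB's own logic, and the function name `stageout_dir` is an illustrative choice.

```python
import configparser

# The crab.cfg fragment from the slide (variant without the naming convention).
CRAB_CFG = """
[CRAB]
scheduler = glite
jobtype = cmssw
server_name = in2p3

[USER]
copy_data = 1
storage_element = ccsrmt2.in2p3.fr
user_remote_dir = /test
storage_path = /srm/managerv2?SFN=/pnfs/in2p3.fr/data/cms/data/store/user/kurca
"""


def stageout_dir(cfg_text):
    """Return the directory on the storage element that the job output lands in."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    user = parser["USER"]
    # Everything after "SFN=" is the physical path on the storage element.
    base = user["storage_path"].split("SFN=", 1)[1]
    # Append the user's remote directory, avoiding a doubled slash.
    return base.rstrip("/") + "/" + user["user_remote_dir"].strip("/")


print(stageout_dir(CRAB_CFG))
```

With the storage name convention, the same directory is reached by giving only the site name T2_FR_CCIN2P3, so the explicit `storage_path` is no longer needed in crab.cfg.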

Jobs Monitoring
• CRAB Server: http://ccgridli03.in2p3.fr:8888/
    Service              Description
    Tasks                Tasks entities data in this CrabServer
    Jobs                 Jobs entities data in this CrabServer
    Component Monitor    Component and Service status
    User Monitoring      User task and job log information
• CMS Dashboard: http://arda-dashboard.cern.ch/cms/
  - link to job exit codes
  - task monitoring for the analysis users
  - site availability based on the SAM tests
  - site status board
• Comments:
  - crab status lags behind that of the Dashboard
  - inconsistencies possible
  - space for improvements

Conclusions
• T2_FR_CCIN2P3
  - operational for a long time, strong contribution to CMS computing
  - not fully separated from T1 (a few hacks needed)
  - separate PhEDEx node installed, in testing/debugging phase
  - "new" SE ccsrmt2.in2p3.fr declared & published (alias only)
• User tools available:
  - CRAB client 2_5_1 installed
  - CRAB server 1_0_6
  - monitoring via Dashboard & CRAB server
• 'Base de Connaissance' CC-IN2P3 (knowledge base)
  - a collection of local and CMS-related information
    http://cc.in2p3.fr/cc_accueil.php3?lang=fr
  - type your keyword, e.g. 'crab', into the empty 'Rechercher' (search) field
  - not complete yet; feedback and suggestions welcome
• Plans?
  - to have fully transparent tools for local (non-grid) and grid analysis
  - develop a BossLite plugin for CRAB enabling direct submission to BQS
  - the same jobs submitted locally, without an additional grid layer