This presentation covers the CMS Tier-2 site T2_FR_CCIN2P3: available resources and CMSSW software installations, data transfers with PhEDEx, the CRAB user tool for submitting analysis jobs, and monitoring services for those jobs, with a focus on CMS Tier 2 and Tier 3 requirements.
CMS T2_FR_CCIN2P3: Towards the Analysis Facility (AF)
JP CMS-France, May 27-28, 2009, Strasbourg
Tibor Kurča, Institut de Physique Nucléaire de Lyon
• Available Resources
• Data Transfers - PhEDEx
• User Tools - CRAB
• Users Jobs Monitoring
• Conclusions
T.Kurca JP CMS-France
CMS Distributed Analysis
• To be run at T2/T3 or locally: T2/T3 local resources needed
• CMS software (CMSSW) pre-installed on the sites
• Grid analysis is data driven: physics groups data allocation
• Data distribution via PhEDEx: specifics for sites with T1 & T2
• User tools to run analysis jobs: CRAB
• Monitoring of job-related activities: tracked by the Dashboard (central monitoring service)
T2/T3 2009 Pledged Resources
                       T2           T3
CPU                    845k SI2k    562k SI2k
                       ~500 jobs    ~340 jobs
Disk space (dCache)    171 TB       114 TB
• physics groups: 4 x 30 TB (EWK 38 TB)
• /sps: 25+8 TB (50% usage)
• xrootd: 25 TB (24% usage)
CMS Data Access
[Diagram: data arriving from T0 and from T1, T2, T3 is held in HPSS and in dCache (transfer in/out, production, data, T0 import and analysis pools), with a semipermanent /sps (GPFS) area and xrootd alongside. Production-merging, production and analysis jobs read the data via rfcp, srmcp, dcap, gsidcap, xrootd and cp.]
CMSSW Installations
• Centralized from T0
  - by high-priority grid jobs
  - release version published in the site information system
  - deprecated releases removed
• Locally: possibility of additional installations for the needs of local users
• Two partitions:
  1. /afs/in2p3.fr/grid/toolkit/cms2 = $VO_CMS_SW_DIR
       ccali38:tcsh[210] fs lq
       Volume Name        Quota      Used  %Used  Partition
       grid.kit.cms2   60000000  39153734    65%        60%
     - in the past, problems with space & removal of old releases
     - additional 30 GB + central regular removal
  2. /afs/in2p3.fr/grid/toolkit/cms
       ccali05:tcsh[214] fs lq
       Volume Name        Quota      Used  %Used  Partition
       grid.kit.cms    46000000  31918855    69%        49%
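Since space on these AFS volumes has been a problem in the past, the `fs lq` figures above can be checked with a few lines of Python. This is only a minimal sketch: the function names `parse_fs_lq` and `free_gb` are hypothetical, and it assumes that `fs lq` reports quota and usage in 1 KB blocks.

```python
def parse_fs_lq(line):
    """Parse one data line of `fs lq` output into a dict (hypothetical helper)."""
    name, quota, used, pct_used, pct_partition = line.split()
    return {
        "volume": name,
        "quota_kb": int(quota),          # assumption: values are in 1 KB blocks
        "used_kb": int(used),
        "pct_used": int(pct_used.rstrip("%")),
        "pct_partition": int(pct_partition.rstrip("%")),
    }


def free_gb(entry):
    """Remaining quota in decimal GB, under the 1 KB block assumption."""
    return (entry["quota_kb"] - entry["used_kb"]) / 1e6


# The two data lines shown on the slide.
lines = [
    "grid.kit.cms2 60000000 39153734 65% 60%",
    "grid.kit.cms 46000000 31918855 69% 49%",
]
volumes = [parse_fs_lq(line) for line in lines]

for v in volumes:
    # Recompute the usage percentage from the raw numbers as a cross-check.
    recomputed = round(100 * v["used_kb"] / v["quota_kb"])
    print(v["volume"], recomputed, "% used,", round(free_gb(v), 1), "GB free")
```

Under that assumption the two volumes have roughly 20.8 GB and 14.1 GB of quota left, and the recomputed percentages agree with the %Used column.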
T2 & Physics Groups
• CCIN2P3: EWK, QCD, Tau/Pflow, Tracker
• IPHC: Top, b-tag
• GRIF: Higgs, Exotica, Egamma
CMS Tier 2 vs Tier 1
• T2_FR_CCIN2P3 is specific
• Usually different sites for different Tiers
  - exceptions: CERN (T0, T1, T2), FNAL (T1, T3) and CCIN2P3 (T1, T2)
  - CE: ok
  - SE, PhEDEx node: some complications to be solved
• What can we learn from CERN/FNAL?
[Diagram: Tier 1 CC IN2P3, Tier 2 CC AF, Tier 2 GRIF, Tier 2 IPHC, Tier 3 IPNL]
CERN-FNAL Comparison
                CERN                            FNAL
PhEDEx nodes:   different                       different
SE:             really different                different (only alias)
                srm-cms.cern.ch (T1)            cmssrm.fnal.gov (T1)
                caf.cern.ch (T2)                cmsdca2.fnal.gov (T3)
dCache:         the same for T1 & T2            the same for T1 & T3
Disk pools:     different                       the same
                needed special download agents
Data Transfers
CERN:
- T2 subscription: if the data are already at T1, there is no actual PhEDEx transfer again, just staging to the right disk
- developed dedicated T1->CAF local download agents to ensure replication to the correct service class and to register downloaded data in the local CAF DBS
- using space tokens to separate T1->T1_CH_CERN from T1->T2 transfers
FNAL:
- a T1 subscription doesn't automatically mean the data are also at T3
- T1 data are fully accessible via CRAB to T3 users (no blacklisting)
- user data are subscribed to T3; track is kept by the T3 manager
- as the dCache is the same for T1/T3, T3 data will be migrated to tape, but PhEDEx doesn't know about it
- caveat: don't subscribe the same data to T1 & T3
T2_FR_CCIN2P3 Before
• Site configuration:
  - CE: different for T1 & T2
  - SE: one for both T1 & T2
  - PhEDEx: only a T1 node
• Access to data in T1 for users of T2
  - data stored at T1 only
  - non-production jobs to be run at T2
• Jobs: temporary hack from CRAB_2_4_4 (Jan 23, 2009)
  - user jobs can access T1_CCIN2P3 data without the show-prod = 1 option
  - all T1s are masked in DLS by default except CCIN2P3
  - in the end transparent for the user
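The masking rule described above can be illustrated with a small filter. This is a sketch of the idea only, not CRAB's actual code: the function `visible_sites`, its parameters and the example site names are assumptions made for illustration, with `show_prod` standing in for the show-prod = 1 option.

```python
def visible_sites(replica_sites, show_prod=False, unmasked_t1=("T1_FR_CCIN2P3",)):
    """Return the sites hosting a dataset that a user job may be sent to.

    Hypothetical illustration: every T1 is hidden by default, except the
    ones listed in `unmasked_t1`; `show_prod=True` disables the masking.
    """
    if show_prod:
        return list(replica_sites)
    return [
        site
        for site in replica_sites
        if not site.startswith("T1_") or site in unmasked_t1
    ]


# Example site list (assumed names, for illustration only).
sites = ["T1_FR_CCIN2P3", "T1_US_FNAL", "T2_FR_GRIF"]
print(visible_sites(sites))
print(visible_sites(sites, show_prod=True))
```

With the default arguments the CCIN2P3 T1 stays visible while the other T1 is dropped, which is what makes the hack transparent for the user.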
T2_FR_CCIN2P3 Now
• dCache: the same for T1 & T2
• Disk pools: only for T1? Create specific ones for T2? For the moment one pool
• PhEDEx nodes: T1_FR_CCIN2P3_Buffer, T1_FR_CCIN2P3_MSS
  - created & installed T2 node T2_FR_CCIN2P3 as disk only
  - VOBox: cclcgcms06
• SE: ccsrm.in2p3.fr (T1), ccsrm.in2p3.fr (T2)
  - created T2-specific ccsrmt2.in2p3.fr (alias)
• Main goals:
  - avoid transferring the same data twice
  - avoid T1->T2 intra-CCIN2P3 transfers
  - avoid hacks on different levels
  - should be solved at the PhEDEx level with a different T2_FR_CCIN2P3 node & correct config
CRAB: CMS Remote Analysis Builder
• transparent access to distributed data & computing resources
  https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrab
• intended to simplify the creation & submission of CMS analysis jobs to the grid
• implemented in Python as a batch-like command-line application
    crab -c crab.cfg -create (-submit, -status, -getoutput, -resubmit ...)
• CRAB standalone: direct submission from the UI via WMS
  - simple, but lacks some important features
  - suitable for small tasks (~100 jobs)
  - limits the size of the sandbox
• Client-server architecture: CRABServer
  - automates as much as possible of the whole analysis workflow: (re)submission, error handling, output retrieval
  - improves the scalability of the system
  - transparent to end users: interface, installation, configuration procedure and usage are the same as in standalone mode
  - possibility of submission to a local batch system! For BQS, a BossLite plugin needs to be written
CRAB Architecture
[Diagram: CRAB architecture, courtesy G. Codispoti]
CRAB Installations
• CRAB client 2_5_1
  https://twiki.cern.ch/twiki/bin/view/CMS/CrabClientRelNotes251
  - installed on AFS: $VO_CMS_SW_DIR/CRAB
  - no need for private installations!
• CRAB Server 1_0_6
  https://twiki.cern.ch/twiki/bin/view/CMS/CrabServer#CRABSERV
  https://twiki.cern.ch/twiki/bin/view/CMS/CrabServer_RelNotes_106
  - installed from scratch on the new hardware node ccgridli03.in2p3.fr:
    double powering, Intel Xeon 2.50 GHz (E5420), 16 GB RAM, 250 GB SATA disk, RAID (redundancy)
  - monitoring: http://ccgridli03.in2p3.fr:8888/
CRAB Environment
Set up your environment:
1) Grid UI: lcg_env
2) CMSSW environment:
     cms_def   alias for   source $VO_CMS_SW_DIR/cmsset_default.(c)sh
     cms_sw    alias for   eval `scramv1 runtime -(c)sh`
3) CRAB environment:
     crabX     alias for   source $VO_CMS_SW_DIR/CRAB/crab.(c)sh
OR, if working in the existing directory, simply do « cms_env », an alias for:
     source $VO_CMS_SW_DIR/cmsenv.(c)sh cms_env
CRAB Data Stageout
• CRAB Server usage: crab.cfg
    [CRAB]
    scheduler = glite
    jobtype = cmssw
    server_name = in2p3
• Without the CMS Storage Name Convention:
    [USER]
    copy_data = 1
    storage_element = ccsrmt2.in2p3.fr
    user_remote_dir = /test
    storage_path = /srm/managerv2?SFN=/pnfs/in2p3.fr/data/cms/data/store/user/kurca
• With the CMS Storage Name Convention:
    [USER]
    copy_data = 1
    storage_element = T2_FR_CCIN2P3
    user_remote_dir = /test
• Data will be written to /pnfs/in2p3.fr/data/cms/data/store/user/kurca/test, the same as in the case without the convention!
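The two stage-out variants differ only in the [USER] section, which a short Python sketch using the standard `configparser` module can make explicit. The helper names `build_crab_cfg`, `explicit_output_path` and `to_text` are hypothetical, and it is an assumption that CRAB accepts a file written this way exactly like a hand-edited crab.cfg. The path helper only reproduces the explicit (no-convention) case; how T2_FR_CCIN2P3 is resolved to a path is not modelled here.

```python
import configparser
import io


def build_crab_cfg(storage_element, storage_path=None, user_remote_dir="/test"):
    """Build the [CRAB] and [USER] sections shown on the slide (hypothetical helper)."""
    cfg = configparser.ConfigParser()
    cfg["CRAB"] = {"scheduler": "glite", "jobtype": "cmssw", "server_name": "in2p3"}
    cfg["USER"] = {
        "copy_data": "1",
        "storage_element": storage_element,
        "user_remote_dir": user_remote_dir,
    }
    if storage_path is not None:
        # Only needed when the CMS Storage Name Convention is not used.
        cfg["USER"]["storage_path"] = storage_path
    return cfg


def explicit_output_path(storage_path, user_remote_dir):
    """Output directory in the explicit case: the SFN part plus the remote dir."""
    return storage_path.split("SFN=", 1)[1] + user_remote_dir


def to_text(cfg):
    """Serialize the configuration as it would appear in crab.cfg."""
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


SRM_PATH = "/srm/managerv2?SFN=/pnfs/in2p3.fr/data/cms/data/store/user/kurca"

without_convention = build_crab_cfg("ccsrmt2.in2p3.fr", storage_path=SRM_PATH)
with_convention = build_crab_cfg("T2_FR_CCIN2P3")

print(to_text(with_convention))
print(explicit_output_path(SRM_PATH, "/test"))
```

The printed path is /pnfs/in2p3.fr/data/cms/data/store/user/kurca/test, the location quoted on the slide for both variants.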
Jobs Monitoring
• CRAB Server: http://ccgridli03.in2p3.fr:8888/
    Service              Description
    Tasks                Tasks entities data in this CrabServer
    Jobs                 Jobs entities data in this CrabServer
    Component Monitor    Component and Service status
    User Monitoring      User task and job log information
• CMS Dashboard: http://arda-dashboard.cern.ch/cms/
  - link to job exit codes
  - task monitoring for the analysis users
  - site availability based on the SAM tests
  - site status board
• Comments:
  - crab status lags behind that of the Dashboard
  - inconsistencies possible
  - space for improvements
Conclusions
• T2_FR_CCIN2P3
  - operational for a long time, strong contribution to CMS computing
  - not fully separated from T1 (a few hacks needed)
  - separate PhEDEx node installed, in testing/debugging phase
  - « new » SE ccsrmt2.in2p3.fr declared & published (alias only)
• User tools available:
  - CRAB client 2_5_1 installed
  - CRAB server 1_0_6
  - monitoring via Dashboard & CRAB server
• 'Base de Connaissance' CC-IN2P3
  - a collection of various local and CMS-related information
    http://cc.in2p3.fr/cc_accueil.php3?lang=fr
  - type your keyword, e.g. 'crab', into the empty 'Rechercher' field
  - not complete yet; feedback and suggestions welcome
• Plans?
  - to have fully transparent tools for local (non-grid) and grid analysis
  - develop a BossLite plugin for CRAB enabling direct submission to BQS
  - the same jobs submitted locally, without an additional grid layer