Setup of Swiss CMS Tier-3 Zhiling Chen (IPP-ETHZ) Doktorandenseminar, June 4th, 2009
Outline • Intro of CMS Computing Model • Setup of Swiss CMS Tier-3 at PSI • Working on the Swiss CMS Tier-3 • Operational Experience
LCG Tier Organization • T0 (CERN) • Filter farm • Raw data custodial • Prompt reconstruction • 7 T1s • Raw data custodial (shared) • Re-reconstruction • Skimming, calibration • ~40 T2s • Centrally scheduled MC production • Analysis and MC simulation for all CMS users • Many T3s at institutes • Serve local institutes' users • Final-stage analysis and MC simulation • Optimized for users' analysis needs. The Swiss Tier-2 (at CSCS) serves ATLAS, CMS, LHCb …; the Tier-3 serves the Swiss CMS community.
CMS Data Organization
[Figure: the physicist's view vs. the system view of CMS data; the CMS data management system maps between the physicist's view and the system view.]
CMS Analysis work flow
[Diagram: a CMS Tier-3 site (the analysis tool CRAB on the Tier-3 user interface, the Tier-3 local cluster, the Tier-3 storage element, and local PhEDEx data transfer agents) connected to the global CMS data management services: the global PhEDEx transfer agents and database, the File Transfer Service, the data bookkeeping DB (DBS), the data location DB (DLS), remote SEs, and the LHC Computing Grid.]
CRAB is a Python program that simplifies the creation and submission of CMS analysis jobs into a Grid environment.
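For concreteness, a CRAB job is driven by a small configuration file. The sketch below writes a minimal CRAB2-era crab.cfg from the shell; the dataset name, the CMSSW configuration file, and the splitting numbers are hypothetical placeholders, not values from this talk.

    # Write a minimal CRAB2-style configuration (sketch)
    cat > crab.cfg <<'EOF'
    [CRAB]
    jobtype   = cmssw
    # 'sge' targets the local batch cluster; 'glite' would target the Grid
    scheduler = sge

    [CMSSW]
    # hypothetical dataset and CMSSW config; replace with your own
    datasetpath = /Zmumu/Summer09-Example/RECO
    pset        = myAnalysis_cfg.py
    total_number_of_events = 10000
    events_per_job         = 1000

    [USER]
    # stage output out to the Tier-3 storage element instead of returning it
    return_data     = 0
    copy_data       = 1
    storage_element = t3se01.psi.ch
    EOF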
Overview of Swiss CMS Tier-3 • For CMS members of ETHZ, the University of Zurich and PSI • Located at PSI • Aims to adapt as well as possible to the users' analysis needs • Ran in test mode from October 2008; in production mode since November 2008 • 30 registered physicist users • Manager: Dr. Derek Feichtinger; Assistant: Zhiling Chen
Hardware of Swiss CMS Tier-3
[Tables: present computing power and present storage.]
Layout of Swiss CMS Tier-3 at PSI
• User Interface: user login; submits/retrieves LCG jobs to the Grid and batch jobs to the local cluster; accesses the home/software directories
• NFS Server: serves the home and shared software directories (CMSSW, CRAB, gLite) over NFS
• Computing Element [Sun Grid Engine]: dispatches/collects batch jobs on the worker nodes [Sun Grid Engine clients]
• Worker nodes: access the local SE (SRM, gridftp, dcap …) and remote SEs
• CMS VoBox (PhEDEx): accesses the PhEDEx central DB
• Monitoring [ganglia collector, ganglia web front end]
• Storage Element (t3se01.psi.ch) [dCache admin, dcap, SRM, gridftp, resource info provider]: also accessed by the LCG; backed by file servers [dCache pool cells, gridftp, dcap, gsidcap] and a DB server [postgres, pnfs, dCache pnfs cell]
• Network connectivity: PSI has a 1 Gb/s uplink to CSCS.
Setup of Swiss CMS Tier-3 • User Interface (8 cores): t3ui01.psi.ch, a fully operational LCG UI. It enables users to: • Log in from outside • Submit/manage local jobs on the Tier-3 local cluster • Interact with the LCG Grid: submit Grid jobs, access storage elements, etc. • Interact with AFS, CVS … • Test their jobs • Local batch cluster (8 worker nodes × 8 cores): • Batch system: Sun Grid Engine 6.1 (see the job sketch below)
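For purely local work, jobs can also go straight to the SGE batch system from the UI. A minimal sketch (script name, job name, and output files are arbitrary examples):

    # Create a trivial SGE job script
    cat > mytest.sh <<'EOF'
    #!/bin/bash
    #$ -N test_job
    #$ -cwd
    #$ -o test_job.out -e test_job.err
    echo "Running on $(hostname)"
    EOF

    qsub mytest.sh     # submit to the local SGE cluster
    qstat -u $USER     # monitor your jobs
    qdel JOBID         # remove a job (JOBID as reported by qstat)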
Setup of Swiss CMS Tier-3 (cont.) • Storage Element (SE): t3se01.psi.ch, a fully equipped LCG storage element running dCache. It allows users to: • Access files from local jobs (dcap, srmcp, gridftp, etc.) within the Tier-3 • Access files (srmcp, gridftp) from other sites • Get extra space in addition to their space at the CSCS Tier-2 • NFS Server (for small storage) • Hosts users' home directories: analysis code, job output • Shared software: CMSSW, CRAB, gLite … • Easy to access, but not meant for huge files. Note: if you need large storage space for a longer time, use the SE (see the access sketch below).
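As an illustration of SE access, the commands below copy a file to the SE via SRM and read it back via dcap. The SRM and dcap port numbers and the /pnfs path are common dCache defaults assumed for this sketch, not values confirmed by the talk.

    # Copy a local file to the Tier-3 SE via SRM (paths/ports assumed)
    srmcp "file:///$PWD/ntuple.root" \
      "srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/store/user/$USER/ntuple.root"

    # Read a file from the SE with the dCache copy client (dcap protocol)
    dccp "dcap://t3se01.psi.ch:22125/pnfs/psi.ch/cms/store/user/$USER/ntuple.root" /tmp/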
Setup of Swiss CMS Tier-3 (cont.) • CMS VoBox (PhEDEx): • Users can order datasets to the Tier-3 SE • Admins can manage datasets with PhEDEx • Monitoring: • Status of the batch system • Accounting • Worker node load • Free storage space • Network activities …
Working on Swiss CMS Tier-3: Before submitting jobs, order datasets • Check which datasets are currently stored at the Tier-3 via the DBS Data Discovery page • If a dataset is not yet stored at the Tier-3, order it to T3_CH_PSI via the central PhEDEx web page (see the query sketch below)
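The same check can be done from the command line with the DBS query CLI of that era; a sketch, where the dataset names are made-up examples:

    # Which datasets match a pattern?
    dbs search --query="find dataset where dataset like /Zmumu/*/RECO"

    # Which sites host a given dataset?
    dbs search --query="find site where dataset = /Zmumu/Summer09-Example/RECO"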
Working on Swiss CMS Tier-3: Submit and manage batch jobs • CRAB • CRAB module for SGE • Simplifies creation and submission of CMS analysis jobs • Consistent way to submit jobs to the Grid or to the Tier-3 local cluster (see the command sketch below) • Sun Grid Engine • More flexible • More powerful controls • Priorities • Job dependencies … • Command line and GUI • [Figure: work flow on the Tier-3]
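With a crab.cfg like the sketch shown earlier, the day-to-day CRAB command sequence is the same whether the scheduler points at the Grid or at the local SGE cluster:

    crab -create       # create jobs from crab.cfg
    crab -submit       # submit them
    crab -status       # monitor progress
    crab -getoutput    # retrieve the output of finished jobs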
Operational Experience • User acceptance of the T3 services seems to be quite good • Our CRAB SGE-scheduler module works well with the SGE batch system • SGE provides a flexible and versatile way to submit and manage jobs on the Tier-3 local cluster • Typical problems with "bad" jobs: • CMSSW jobs that produce huge output files with tons of debug messages fill up the home directories quickly and stall the cluster -> we set a quota for every user • Jobs that initiate too many parallel requests to the SE overload it and leave jobs waiting -> users should beware (see the defensive sketch below)
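One defensive pattern a user can adopt is sketched below: cap the size of any file the job may write and keep bulky logs on local scratch instead of the NFS home directory. The scratch path and the limit are illustrative assumptions, not site-mandated values.

    #!/bin/bash
    #$ -cwd
    # Cap the size of files this job can write (bash counts 1024-byte
    # blocks, so this is roughly 2 GB; units vary between shells)
    ulimit -f 2097152
    # $JOB_ID is set by SGE for each job; write logs to local scratch
    OUTDIR=/scratch/$USER/$JOB_ID
    mkdir -p "$OUTDIR"
    cmsRun myAnalysis_cfg.py > "$OUTDIR/job.log" 2>&1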
Upgrade Plan • Hardware upgrade • Software upgrade: • Regular upgrades • gLite • CMS software: CMSSW, CRAB • … • Upgrade under discussion: • Use a parallel file system instead of NFS • Better performance than NFS • Better suited to operating on large ROOT files
Documents and User Support • Request an account: send email to cms-tier3@lists.psi.ch • Users mailing list: cms-tier3-users@lists.psi.ch • Swiss CMS Tier-3 wiki page: https://twiki.cscs.ch/twiki/bin/view/CmsTier3/WebHome
CMS Event Data Flow
[Diagram: the online system delivers O(10) streams (RAW) to Tier-0; first-pass processing (skim reconstruction & reprocessing) yields O(50) primary datasets; RAW and RECO are written to tape and scheduled to the Tier-1s, which distribute RECO and AOD to the Tier-2s for analysis and MC simulation; Tier-3s likewise perform analysis and MC simulation.]
• Based on the hierarchy of computing tiers of the LHC Computing Grid