NL-T1 Report Ron Trompert
Contents • Infrastructure • Usage • CPU • Disk • Tape • I/O • Disk storage • Compute • Tape • LHCOPN dashboard • Issues • New procurements • dCache
CPU Usage • [charts: CPU usage at SARA and at NIKHEF]
ATLAS Disk Usage • [charts: SARA disk usage in GB; NIKHEF disk usage in GB]
ATLAS Tape Usage • [chart: SARA tape usage in TB]
I/O storage • [charts: disk storage traffic, in and out]
I/O tape • Read performance: about 400-500 MB/s from tape to dCache on average when there is no heavy tape writing going on • Some on-the-fly tuning was needed to get there • Adapted the dCache HSM copy script • CXFS/DMF client-node tuning (queue lengths) • Performance is acceptable given the circumstances, but still below the 1 GB/s we aim for • Should improve with new hardware and DMF5 • Replaced HPN-SSH with Globus GridFTP for copying files between the CXFS/DMF client nodes and dCache • Write performance is also about 400-500 MB/s • Adapted the HSM copy script to compute checksums and to flush and fsync writes to the CXFS/DMF clients to avoid data corruption (see the sketch below)
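The adapted HSM copy script itself is not shown in these slides; the following is a minimal Python sketch of the idea behind a dCache HSM "put" script that checksums the data while copying and flushes plus fsyncs before reporting success. The call convention, the TAPE_DIR mount point, and the URI format are assumptions for illustration, not SARA's actual script.

```python
#!/usr/bin/env python
"""Illustrative sketch of a dCache HSM 'put' copy script.

Assumed call convention: hsmcp.py put <pnfsId> <localFile> [options...]
TAPE_DIR stands in for the CXFS/DMF-managed filesystem; both it and the
returned URI format are hypothetical.
"""
import os
import sys
import zlib

TAPE_DIR = "/dmf/atlas"   # hypothetical CXFS/DMF mount point
CHUNK = 1024 * 1024       # copy in 1 MiB chunks

def copy_with_adler32(src, dst):
    """Copy src to dst, computing an Adler-32 checksum on the fly,
    then flush and fsync so no data is left only in the page cache."""
    checksum = 1          # Adler-32 starts at 1
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(CHUNK)
            if not chunk:
                break
            checksum = zlib.adler32(chunk, checksum)
            fout.write(chunk)
        fout.flush()                 # push user-space buffers to the kernel
        os.fsync(fout.fileno())      # force the kernel to commit to CXFS/DMF
    return checksum & 0xFFFFFFFF

def main():
    if len(sys.argv) < 4 or sys.argv[1] != "put":
        sys.exit("usage: hsmcp.py put <pnfsId> <localFile> [options]")
    pnfs_id, local_file = sys.argv[2], sys.argv[3]
    checksum = copy_with_adler32(local_file, "%s/%s" % (TAPE_DIR, pnfs_id))
    # dCache expects the HSM URI of the stored copy on stdout on success
    print("dmf://sara/?pnfsid=%s&adler32=%08x" % (pnfs_id, checksum))
    sys.exit(0)

if __name__ == "__main__":
    main()
```

The fsync is the important part: a copy that only lands in the page cache can be lost or truncated if the CXFS/DMF client node fails before write-back, which is exactly the corruption window the slide refers to.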
Issues • Part of the farm at NIKHEF has not been usable due to longstanding network issues with the built-in switches of the blade centers delivered in the autumn of 2009. The vendor has not been very active in resolving this, but we hope to have a solution soon. • Due to the issue above, ATLAS jobs run on only part of the farm, so they stay queued for a longer period of time. Pilot factories submit so many jobs that the batch system sometimes finds no runnable non-ATLAS jobs behind this huge queue, which leaves job slots unused. • According to the VO ID card, ATLAS jobs need 3072 MB of virtual memory. Nevertheless, our batch system limits this at 4096 MB, and that is still too small for some ATLAS jobs. How is ATLAS going to tackle this?
Issues • Is ATLAS able to use CVMFS with the mount point /cvmfs/atlas.cern.ch/? This would solve two problems: • A third of the content of the BDII • No quota on the experiment software disk • We have seen transfers from ATLASLOCALGROUPDISK to elsewhere. Isn't LOCAL supposed to be LOCAL? • FTS channels • Wouldn't it be good to let the site admins within the NL cloud be channel admins of their own channels? Then you can tune the channel any way you want, or turn it off when going into downtime.
New procurements • Compute: 50 kSI2006-rate in total, 27 kSI2006-rate at NIKHEF and 23 kSI2006-rate at SARA • Tape: 2 PB • Disk: 850 TiB at SARA, 280 TiB at NIKHEF • Pledges are still under discussion
New procurements: Mass storage • Scalable solution with DMF5 • Investigating faster SAN storage
dCache@SARA • The Golden Release 1.9.5-* has been a very reliable workhorse over the past year. But… • There will be a new Golden Release, 1.9.12, with some very nice features for admins as well as users, for example: • srmGetTurl no longer waits the standard 4 seconds • WebDAV (http/https): mount dCache on your laptop (a minimal sketch follows below) • So, we intend to upgrade
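As an illustration of the WebDAV access mentioned above, here is a minimal Python sketch that downloads a file and lists a directory from a dCache WebDAV door over plain HTTP(S). The door URL, paths, and certificate locations are hypothetical; a real door would use the site's own hostname, port, and authentication setup.

```python
"""Hedged sketch: talking to a dCache WebDAV door with plain HTTP(S).

All hostnames, paths, and credential locations below are placeholders.
"""
import requests

DOOR = "https://dcache.example.org:2880"               # hypothetical WebDAV door
FILE = "/pnfs/example.org/data/atlas/file.root"        # hypothetical file path
CERT = ("/tmp/usercert.pem", "/tmp/userkey.pem")       # hypothetical X.509 pair
CA = "/etc/grid-security/certificates"                 # CA directory (assumed)

# Download a file over WebDAV: this is just an ordinary HTTP GET
resp = requests.get(DOOR + FILE, cert=CERT, verify=CA, stream=True)
resp.raise_for_status()
with open("file.root", "wb") as out:
    for chunk in resp.iter_content(chunk_size=1 << 20):
        out.write(chunk)

# List a directory with PROPFIND (Depth: 1 = immediate children only)
listing = requests.request(
    "PROPFIND",
    DOOR + "/pnfs/example.org/data/atlas/",
    headers={"Depth": "1"},
    cert=CERT,
    verify=CA,
)
print(listing.text)  # raw WebDAV multi-status XML response
```

For the "mount dCache on your laptop" use case, any stock WebDAV client (for example davfs2 on Linux, or the built-in WebDAV support of OS X and Windows) can mount the same door URL directly, with no grid middleware on the client side.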