SC3 experiences
Ron Trompert
SARA
SC3 Infrastructure
• Starting point
• DMF-based HSM
• DMF has no SRM implementation
• DMF does not support functionality promised by the SRM standard, like file pinning.
SC3 Infrastructure
dCache
• dCache provides an SRM interface
• dCache provides flexibility with respect to HSM backends, in case we need to switch to another HSM setup for some reason
SC3 Throughput phase
• Disk2disk: 100-110 MB/s
• Problems with node stability: solved by limiting the number of I/O movers
• Disk2tape: 50 MB/s
• Not enough bandwidth: the SAN is not dedicated
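
To put these rates in perspective, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 1,000,000 MB) and a rate that is sustained without interruption, which the stability problems above suggest is optimistic. The 20 TB volume is borrowed from the ATLAS plan mentioned in the service phase observations; the function name `transfer_days` is purely illustrative.

```python
def transfer_days(volume_tb: float, rate_mb_s: float) -> float:
    """Days needed to move volume_tb terabytes at a sustained rate_mb_s MB/s.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    seconds = volume_tb * 1_000_000 / rate_mb_s
    return seconds / 86_400  # seconds per day


# Rates taken from the throughput phase figures above.
for label, rate in [("disk2disk, 100 MB/s", 100),
                    ("disk2disk, 110 MB/s", 110),
                    ("disk2tape, 50 MB/s", 50)]:
    print(f"{label}: {transfer_days(20, rate):.1f} days for 20 TB")
```

Under these assumptions, 20 TB would take a little over two days disk2disk and roughly twice that disk2tape.
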
SC3 service phase statistics
[Chart: percentage of computational resources used, October-December]
SC3 service phase statistics
• Setting up the infrastructure took longer than we had hoped, so unfortunately we missed ALICE.
• Sizes and number of files transferred to the SRM SE
SC3 service phase observations
• Networking problems
• Hardware problems
• The 10GE link to CERN was dedicated, but the 10G switch was not. Traffic switched back and forth between the dedicated 10GE link and Geant.
• Routing problems
• Considerably less data stored for ATLAS than expected.
• The plans on the Wiki listed 20 TB.
SC3 service phase observations
• Communication problems
• Network changes not reported
• We were not informed of changes in subnets.
• Problems are not always reported
• Failed transfers are not always reported
• Network outage between CERN and SARA between Christmas and New Year; nobody informed us
• Monitoring: the experiment monitoring websites are listed in the Wiki, but we also found other monitoring website URLs in emails.
• Not clear what the experiments' exact plans are
• When there are no transfers and no problems are reported, it is not clear whether something is wrong or things are going just as planned.
SC3 service phase observations
• Transfers failed when they attempted to overwrite existing files
• Overwriting is not allowed by PNFS
• At dCache sites running a gridftp door on their SRM node, files can be deleted immediately using edg-gridftp-rm or glite-gridftp-rm
• At dCache sites that don’t run a gridftp door on the SRM node, an advisory delete can be done, but then files are not deleted immediately.
SC3 service phase observations
• dCache security: (gsi)dcap
• Using dccp, anyone can fetch anything under /pnfs/grid.sara.nl/data/<vo>
• Unix permissions on directories are not honoured
• Files in a directory with -rwxr-x--- are world readable.
• File permissions are honoured, but when data is copied into /pnfs it gets -rw-r--r--.
• Using gsidcap you are authenticated, but the behaviour above stays the same.
• Write permissions are OK.
• Maybe this is OK for HEP VOs, but for some VOs this is too liberal.
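
The mode strings above can be decoded with Python's standard `stat` module. The sketch below only illustrates what the Unix mode bits themselves say; it does not model how dCache or PNFS evaluates them, and the helper name `world_readable` is hypothetical. It shows why the observed behaviour is surprising: a directory with mode rwxr-x--- (0750) grants nothing to "other", while a file created as rw-r--r-- (0644) is world readable by its own mode.

```python
import stat


def world_readable(mode: int) -> bool:
    """True if the 'other' read bit is set in a Unix mode."""
    return bool(mode & stat.S_IROTH)


# Modes corresponding to the permission strings quoted above.
dir_mode = stat.S_IFDIR | 0o750   # directory with rwxr-x---
file_mode = stat.S_IFREG | 0o644  # regular file with rw-r--r--

print(stat.filemode(dir_mode), "world readable:", world_readable(dir_mode))
print(stat.filemode(file_mode), "world readable:", world_readable(file_mode))
```

On a conventional Unix filesystem, the missing "other" bits on the directory would stop outsiders from reaching the files inside it, regardless of the files' own modes.
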
SC3 service phase observations
• Oracle database
• Every now and then it just hangs and needs to be restarted.
• Backups didn’t work but FTS and LFC did.
SC3 service phase observations
• A user wanted to run a job using ROOT I/O, which is rfio/dcap based.
• Rfio and dcap are unauthenticated data access protocols
• Rfio is installed automatically when installing a classic SE with yaim.
• We don’t really like it, but what do the other T1s think about this?
SC4 Outlook
• Current plans (being updated)
- Set up T2 tests
- Separate T1 tape storage from general storage
- Replace old SE by SRM SE
- Set up DB node for FTS/LFC