Learn about DQ2 usage, data distribution methods, and space management for Tier-3 sites, aimed at good performance without disrupting production systems. Covers using FTS with dq2-get to improve data transfer efficiency, plus SE and storage management techniques.
DDM for (US) Tier3
Hironori Ito
Brookhaven National Laboratory
US ATLAS Tier-2/Tier-3 Meeting, March 2010
Data Distribution to Tier-3
• T3gs
  • Must use the DQ2 Site Services (SS)
  • The T3 must satisfy all other requirements: space token, BDII, SAM tests, etc.
  • BNL has been running DQ2 SS for Tier-3s for several months
  • BNL has also been hosting an LFC for Tier-3s over the same period
  • Sites are listed in TiersOfATLAS and working normally (like T2s)
• T3g
  • If desired, a site can run SRM (and the other obligatory services) to be part of the DDM system
  • If not in DDM, sites/users are responsible for getting data themselves
  • SRM, a plain GridFTP server, or no grid storage at all
  • What is a good way to get data?
    • dq2-get?
    • Does it provide the performance?
    • Does it interfere with the production system?
Can T3s Destroy ATLAS Computing Operations at T0/T1s/T2s?

  dq2-get -L site -T=1,10000 DSN

  list = Array.new
  `dq2-list-files DSN`.split("\n").each do |line|
    line = line.split(" ")
    if line.size > 3
      list.push(Thread.new { `dq2-get -L site -f #{line[0]} DSN` })
    end
  end
  list.each { |t| t.join }

Easy! But, don't do this, please!
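If dq2-get must be used directly, the damage can at least be bounded by capping the number of concurrent transfers instead of spawning one thread per file. A minimal sketch of that pattern (the transfer itself is passed in as a block so the code stays self-contained; the `dq2-get` invocation appears only in a comment):

```ruby
# Run at most max_workers fetches at a time instead of one thread per file.
# The actual transfer (e.g. `dq2-get -L site -f <lfn> DSN`) is supplied as a
# block by the caller, so this sketch is independent of the grid tools.
def fetch_all(lfns, max_workers = 4)
  queue = Queue.new
  lfns.each { |lfn| queue << lfn }
  workers = Array.new(max_workers) do
    Thread.new do
      loop do
        lfn = begin
                queue.pop(true)   # non-blocking pop; raises ThreadError when empty
              rescue ThreadError
                break
              end
        yield lfn                 # e.g. shell out to dq2-get here
      end
    end
  end
  workers.each(&:join)
end
```

With max_workers at a modest value the SE sees a bounded number of simultaneous requests, rather than one per file in the dataset.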
Idea
• Let's discourage users from possibly disrupting production SEs by providing an alternative method that does not sacrifice performance or functionality
• The production DDM uses FTS, which is managed by the T0/T1s and configured not to overload SEs
• Why not use FTS?
dq2-get with FTS
• dq2-get has been modified to use FTS to transfer data between two remote SEs (SRM and GridFTP)
  • dq2-get --to-fts=FTS_SERVICE -p 'fts' --to-storage=Destination_SE_Path -L Source-site DSN
• Follows all standard DDM conventions
  • The convention enforces the structure of the destination directory path, which can then serve as the file catalog without an LFC
  • Does not allow a user to place a dataset in a production area
• Works with SRM as well as with a plain GridFTP server
• The destination site does not need to be listed in TiersOfATLAS, lowering the workload on T3 administrators
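To illustrate how an enforced directory structure can stand in for a catalog: a destination path can be derived deterministically from the dataset name alone, so finding a file never requires an LFC lookup. The field split and the <base>/<project>/<dataType>/<DSN> layout below are assumptions for the sketch, not necessarily the exact DDM convention:

```ruby
# Derive a destination directory from a dataset name (DSN). The DSN field
# order (project.runNumber.physicsShort.prodStep.dataType.version) and the
# <base>/<project>/<dataType>/<DSN> layout are illustrative assumptions.
def dest_path(base, dsn)
  fields    = dsn.split('.')
  project   = fields[0]
  data_type = fields.length >= 5 ? fields[4] : 'other'
  File.join(base, project, data_type, dsn)
end
```

Because the path is a pure function of the DSN, both the writer (dq2-get with FTS) and any later reader compute the same location independently, which is what makes the directory tree usable as a catalog.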
Real examples of dq2-get with FTS
• dq2-get --to-fts=https://fts02.usatlas.bnl.gov:8443/glite-data-transfer-fts/services/FileTransfer -p 'fts' --to-storage=srm://atlfs02.phy.duke.edu:8443/srm/v2/server?SFN=/srv/data/srm/testhiro -L BNL-OSG2_MCDISK groupmc08.105807.JF35_pythia_jet_filter.merge.AOD.e418_a84_t53_tid076929
• dq2-get --to-fts=https://fts02.usatlas.bnl.gov:8443/glite-data-transfer-fts/services/FileTransfer -p 'fts' --to-storage=gsiftp://atlfs02.phy.duke.edu:2811/srv/data/srm/testhiro -L BNL-OSG2_MCDISK mc08.005010.J1_pythia_jetjet.recon.AOD.e323_s400_d99_r474_tid023482
Data Management at T3g
• If sites use DDM, it is up to the Tier-3 administrators to maintain the SE themselves
• There is no consistency problem if the site does not use an LFC, because all files are "SE dark" files by definition
• If files are organized in the SE according to the standard DDM directory structure, they are fairly easy to maintain and delete
• Future modification to dq2-ls:
  • Instead of asking the LFC to find files, it can check the SE at the expected directory for a given path
  • If the files can be found (or not found) this way, all other functions of dq2-ls should work
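The proposed dq2-ls change amounts to probing an expected directory instead of querying the LFC. A small sketch of that check, with a local filesystem standing in for a real SE listing (names and layout are illustrative):

```ruby
# Check which of a dataset's files exist under the expected SE directory,
# instead of asking an LFC. A local directory stands in for the SE here;
# a real implementation would list the path via SRM or GridFTP.
def check_dataset(dir, lfns)
  found, missing = lfns.partition { |lfn| File.exist?(File.join(dir, lfn)) }
  { found: found, missing: missing }
end
```

Once presence/absence is known, completeness reporting and the other dq2-ls functions can proceed exactly as if the answer had come from a catalog.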
T3gs Data Management
• Sites are part of TiersOfATLAS
  • They behave like T2s
• A tool is needed to manage space
  • To manage space, it needs the list of files in both the LFC and the SE
• The LFC content is automatically dumped, packaged as a dataset, and delivered to the corresponding T3 site by DDM
  • The expectation is one dataset/LFC dump per site per week, but this can be adjusted as needed
  • The file is an sqlite3 database
  • It contains not only the LFC information, but also the dataset creation dates and the status of the files at BNL
• Example of an LFC dump dataset
  • DSN: user10.HironoriIto.T3Dump.ILLINOISHEP.20100301_13_16_32
  • LFN: user10.HironoriIto.T3Dump.ILLINOISHEP.20100301_13_16_32.db
T3gs Data Management
• A management tool: storageManager.py
  • It scans the SE and stores the information into the same sqlite3 file described above
  • Using the collected information, it can:
  • Find/delete dark files in the LFC, the SE, and dataset replicas
    • python storageManager.py --sqliteFile=/home/hiroito/user10.HironoriIto.T3Dump.ILLINOISHEP.20100301_13_16_32.db --sePath=/pnfs/hep.uiuc.edu/data4/atlas/proddisk --showSeDarkFiles
  • Clean PRODDISK files older than a given cut-off time
    • python storageManager.py --sqliteFile=/home/hiroito/user10.HironoriIto.T3Dump.ILLINOISHEP.20100301_13_16_32.db --sePath=/pnfs/hep.uiuc.edu/data4/atlas/proddisk --deleteProdDisk --cutOffDay=135 --site=ILLINOISHEP_PRODDISK
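At its core, the dark-file check reduces to set differences between the SE listing and the catalog dump. A simplified sketch of that logic (storageManager.py reads these lists from the sqlite3 dump and an SE scan; the function and key names here are illustrative, not its actual interface):

```ruby
require 'set'

# Files present on the SE with no catalog entry are "SE dark"; catalog
# entries with no corresponding file on the SE are orphaned ("LFC dark").
# In the real tool the two input lists would come from the sqlite3 LFC
# dump and a scan of the SE namespace.
def find_dark_files(se_files, lfc_files)
  se  = se_files.to_set
  lfc = lfc_files.to_set
  {
    se_dark:  (se - lfc).to_a.sort,   # candidates for deletion from the SE
    lfc_dark: (lfc - se).to_a.sort    # candidates for catalog cleanup
  }
end
```

The same comparison, restricted to a space token and filtered by file age, gives the PRODDISK cleanup shown above.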
T3 Performance and Monitoring for Data Distribution
• The throughput-test program has been extended to include Tier-3 sites
  • http://www.usatlas.bnl.gov/dq2/throughput
• Several T3 sites are already being tested daily, and the results appear in the monitor automatically
• Any site can be included without being added to TiersOfATLAS
• It can be viewed as an availability test for a site, particularly for a T3g, which does not have regular FT transfers via DDM
• A site administrator can easily look at the test results to see whether the SE (SRM/GridFTP) is working
• The monitor can send warning emails to site admin(s) if so desired