
Effective Data Management Strategies for ATLAS Tier-3 Sites

Covers DQ2 usage, data distribution methods, and space management for ATLAS Tier-3 sites, with the aim of good performance without disrupting production systems. Topics include using FTS with dq2-get for data transfers and tools for managing storage elements (SEs).

jedinger

Presentation Transcript


  1. DDM for (US) Tier3 • Hironori Ito, Brookhaven National Laboratory • US ATLAS Tier-2/Tier-3 Meeting, March 2010

  2. Data Distribution to Tier-3 • T3gs • Must use DQ2 SS • T3 must satisfy all other requirements: space token, BDII, SAM test, etc. • BNL has been running DQ2 SS for Tier3 for several months • BNL has also been hosting LFC for Tier3 for the same period. • Sites are listed in TiersOfATLAS and working normally (like T2s) • T3g • If desired, a site can run SRM (and other obligatory services) to be a part of the DDM system. • If a site is not part of DDM, the site/users will be responsible for getting data. • SRM, plain GridFTP server or no grid storage. • What is a good way? • dq2-get? • Does it provide sufficient performance? • Does it interfere with the production system?

  3. Can T3s Destroy ATLAS Computing Operations at T0/T1s/T2s?
     dq2-get -L site -T=1,10000 DSN
     list = Array.new()
     # One thread per file in the dataset, all hitting the source SE at once
     `dq2-list-files DSN`.split("\n").each do |line|
       line = line.split(" ")
       if line.size > 3
         list.push(Thread.new { `dq2-get -L site -f #{line[0]} DSN` })
       end
     end
     list.each { |t| t.join }
     Easy! But, don’t do this, please!

  4. Idea • Let’s try to discourage users from possibly disrupting production SE(s) by providing an alternative method that does not sacrifice performance or functionality • The production DDM uses FTS, which is managed by T0/T1s and configured not to overload SEs • Why not use FTS?
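
The contrast with the unbounded thread-per-file approach can be illustrated with a small sketch. This is not FTS and says nothing about how FTS is implemented; it only shows the general idea of capping the number of transfers in flight. The names `run_throttled` and `FakeTransfer` and the limit of 3 are hypothetical, and the "transfer" is a short sleep rather than a real copy.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_throttled(files, transfer, max_active=4):
    """Call transfer(name) for every file, with at most max_active in flight."""
    with ThreadPoolExecutor(max_workers=max_active) as pool:
        # list() waits for all work and re-raises any worker exception.
        return list(pool.map(transfer, files))


class FakeTransfer:
    """Stand-in for a real copy; records the peak number of concurrent calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __call__(self, name):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.01)  # pretend to move data
        with self._lock:
            self.active -= 1
        return name


if __name__ == "__main__":
    fake = FakeTransfer()
    done = run_throttled([f"file{i}" for i in range(20)], fake, max_active=3)
    print(len(done), "transfers finished; peak concurrency:", fake.peak)
```

A central service that owns the limit, as FTS does for production DDM, has the advantage that the cap applies across all users rather than per script.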

  5. dq2-get with FTS • dq2-get is modified to use FTS to transfer between two remote SEs (SRM and GridFTP) • dq2-get --to-fts=FTS_SERVICE -p 'fts' --to-storage=Destination_SE_Path -L Source-site DSN • Follows all standard DDM conventions • The convention enforces the structure of the destination directory path, which can be used as the file catalog without LFC • Does not allow a user to place a dataset in a production area • It works with SRM as well as a plain GridFTP server. • The destination site does not need to be listed in TiersOfATLAS, lowering the workload for T3 administrators
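
As a minimal sketch of the command form on this slide, the helper below assembles the argument list from its four variable parts. It uses only the options that appear on the slide (`--to-fts`, `-p fts`, `--to-storage`, `-L`); the function name `build_dq2_get_fts_command` and the check on the destination prefix are assumptions added for illustration, and the command is only built, never executed.

```python
import shlex


def build_dq2_get_fts_command(fts_service, destination, source_site, dataset):
    """Assemble a dq2-get argument list using only the options shown on the slide."""
    # Assumption: the destination is an SRM or plain GridFTP URL, as in the examples.
    if not destination.startswith(("srm://", "gsiftp://")):
        raise ValueError("destination must be an srm:// or gsiftp:// URL")
    return [
        "dq2-get",
        f"--to-fts={fts_service}",
        "-p", "fts",
        f"--to-storage={destination}",
        "-L", source_site,
        dataset,
    ]


if __name__ == "__main__":
    cmd = build_dq2_get_fts_command(
        "FTS_SERVICE",                      # placeholder, as on the slide
        "gsiftp://example.org:2811/data",   # hypothetical destination
        "Source-site",
        "DSN",
    )
    # shlex.join quotes arguments where the shell would need it.
    print(shlex.join(cmd))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems with the `?` and `=` characters that SRM URLs contain.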

  6. Real example of dq2-get with FTS • dq2-get --to-fts=https://fts02.usatlas.bnl.gov:8443/glite-data-transfer-fts/services/FileTransfer -p 'fts' --to-storage=srm://atlfs02.phy.duke.edu:8443/srm/v2/server?SFN=/srv/data/srm/testhiro -L BNL-OSG2_MCDISK groupmc08.105807.JF35_pythia_jet_filter.merge.AOD.e418_a84_t53_tid076929 • dq2-get --to-fts=https://fts02.usatlas.bnl.gov:8443/glite-data-transfer-fts/services/FileTransfer -p 'fts' --to-storage=gsiftp://atlfs02.phy.duke.edu:2811/srv/data/srm/testhiro -L BNL-OSG2_MCDISK mc08.005010.J1_pythia_jetjet.recon.AOD.e323_s400_d99_r474_tid023482
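
The two destinations above differ only in protocol: the SRM URL carries the storage path in its `SFN` query parameter, while the GridFTP URL carries it directly as the URL path. A short sketch using Python's standard `urllib.parse` makes the difference explicit; the function name `describe_destination` and the returned dictionary layout are illustrative assumptions, not part of dq2-get.

```python
from urllib.parse import parse_qs, urlsplit


def describe_destination(url):
    """Split an srm:// or gsiftp:// destination into protocol, host, port and path."""
    parts = urlsplit(url)
    if parts.scheme == "srm":
        # In the SRM example the storage path is the value of the SFN parameter.
        path = parse_qs(parts.query).get("SFN", [parts.path])[0]
    elif parts.scheme == "gsiftp":
        path = parts.path
    else:
        raise ValueError(f"unsupported protocol: {parts.scheme!r}")
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": path,
    }


if __name__ == "__main__":
    for url in (
        "srm://atlfs02.phy.duke.edu:8443/srm/v2/server?SFN=/srv/data/srm/testhiro",
        "gsiftp://atlfs02.phy.duke.edu:2811/srv/data/srm/testhiro",
    ):
        print(describe_destination(url))
```

Both example URLs resolve to the same host and the same storage path, reached through different services and ports.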

  7. Data Management at T3g • If sites use DDM, it is up to Tier-3 administrators to maintain the SE themselves. • There is no consistency problem if the site does not use LFC because all files are “SE Dark” files by definition. • If files are organized in the SE according to the DDM standard structure, they will be fairly easy to maintain/delete. • Future modification to dq2-ls: • Instead of asking LFC to find files, it can check the SE at the expected directory path to find them. • Whether or not the files are found, all other functions of dq2-ls should work.
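
The idea of using the directory structure as the catalog can be sketched as follows. The actual DDM path convention is not spelled out in these slides, so the layout used here (`<SE root>/<project>/<dataset name>`, with the project taken as the first dot-separated field of the dataset name) is purely a hypothetical stand-in. The sketch also assumes the SE is visible as a locally mounted filesystem, which will not hold for every site.

```python
from pathlib import Path


def expected_dataset_dir(se_root, dsn):
    """Return the directory where a dataset is expected (hypothetical layout)."""
    project = dsn.split(".", 1)[0]  # e.g. "mc08" for mc08.005010....
    return Path(se_root) / project / dsn


def list_dataset_files(se_root, dsn):
    """List file names found at the expected location; empty if the dataset is absent."""
    directory = expected_dataset_dir(se_root, dsn)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())


if __name__ == "__main__":
    dsn = "mc08.005010.J1_pythia_jetjet.recon.AOD.e323_s400_d99_r474_tid023482"
    print(expected_dataset_dir("/srv/data/srm/testhiro", dsn))
    print(list_dataset_files("/srv/data/srm/testhiro", dsn))
```

Because the path is computed from the dataset name alone, no catalog query is needed; an empty result simply means the dataset has not been copied to the site.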

  8. T3gs Data Management • Sites are part of TiersOfATLAS • Behave like T2s • Need a tool to manage space • To manage space, the tool needs the list of files in LFC and SE. • Automated creation and delivery of the content from LFC as a dataset to the corresponding T3 site by DDM • It is expected to send one dataset/LFC dump to a site per week, but this can be adjusted as needed. • The file is in sqlite3 format • It contains information not only about LFC, but also the dataset creation date as well as the status of files at BNL. • Example of LFC dump dataset • DSN: user10.HironoriIto.T3Dump.ILLINOISHEP.20100301_13_16_32 • LFN: user10.HironoriIto.T3Dump.ILLINOISHEP.20100301_13_16_32.db
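
Comparing the LFC list with the SE list comes down to two set differences, which sqlite3 can express directly. The sketch below is only an illustration of that comparison: the schema of the real dump file is not given in these slides, so the table names `lfc_files` and `se_files` and their single `path` column are hypothetical, and the data lives in an in-memory database.

```python
import sqlite3


def find_dark_files(conn):
    """Return (in SE but not in LFC, in LFC but not in SE) as two sorted lists."""
    se_dark = [row[0] for row in conn.execute(
        "SELECT path FROM se_files "
        "WHERE path NOT IN (SELECT path FROM lfc_files) ORDER BY path")]
    lfc_dark = [row[0] for row in conn.execute(
        "SELECT path FROM lfc_files "
        "WHERE path NOT IN (SELECT path FROM se_files) ORDER BY path")]
    return se_dark, lfc_dark


def make_demo_db(lfc_paths, se_paths):
    """Build an in-memory database with the hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lfc_files (path TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE se_files (path TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO lfc_files VALUES (?)", [(p,) for p in lfc_paths])
    conn.executemany("INSERT INTO se_files VALUES (?)", [(p,) for p in se_paths])
    return conn


if __name__ == "__main__":
    conn = make_demo_db(
        lfc_paths=["/atlas/a.root", "/atlas/b.root"],
        se_paths=["/atlas/b.root", "/atlas/c.root"],
    )
    se_dark, lfc_dark = find_dark_files(conn)
    print("on SE only:", se_dark)
    print("in LFC only:", lfc_dark)
```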

  9. T3gs Data Management • A management tool • storageManager.py • It scans the SE and stores the information in the same sqlite3 file as above. • Using the collected information, it can • Find/delete dark files in: LFC, SE and dataset replica • python storageManager.py --sqliteFile=/home/hiroito/user10.HironoriIto.T3Dump.ILLINOISHEP.20100301_13_16_32.db --sePath=/pnfs/hep.uiuc.edu/data4/atlas/proddisk --showSeDarkFiles • Clean PRODDISK files according to a given cut-off time • python storageManager.py --sqliteFile=/home/hiroito/user10.HironoriIto.T3Dump.ILLINOISHEP.20100301_13_16_32.db --sePath=/pnfs/hep.uiuc.edu/data4/atlas/proddisk --deleteProdDisk --cutOffDay=135 --site=ILLINOISHEP_PRODDISK
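
The cut-off cleaning step can be pictured as a date filter. The following is a sketch of that idea only and is not taken from storageManager.py: which timestamp the real tool compares against is not stated on the slide, so the example assumes a per-file creation time, and the function name `select_older_than` is hypothetical. It only selects candidates and deletes nothing.

```python
from datetime import datetime, timedelta


def select_older_than(files, cutoff_days, now):
    """Return sorted paths whose timestamp is more than cutoff_days before now.

    files maps path -> datetime (assumed here to be the file creation time).
    """
    cutoff = now - timedelta(days=cutoff_days)
    return sorted(path for path, created in files.items() if created < cutoff)


if __name__ == "__main__":
    files = {
        "/proddisk/old.root": datetime(2009, 10, 1),
        "/proddisk/new.root": datetime(2010, 2, 1),
    }
    # 135 days matches the --cutOffDay value in the example command above.
    for path in select_older_than(files, 135, now=datetime(2010, 3, 1)):
        print("candidate for deletion:", path)
```

Keeping selection separate from deletion makes it easy to review the candidate list before anything is removed.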

  10. T3 Performance and Monitor for Data Distribution • The throughput-test program was extended to include Tier-3 sites. • http://www.usatlas.bnl.gov/dq2/throughput • Several T3 sites are already tested daily, and the results are shown on the monitor automatically. • Any site can be added without being added to TiersOfATLAS. • It can be viewed as an availability test for a site, particularly for T3g, which does not have regular FT transfers via DDM. • A site administrator can easily look at the test results to see if the SE (SRM/GridFTP) is working. • The monitor can send warning emails to site admin(s) if so desired.
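
How such a monitor might decide which sites to warn can be sketched in a few lines. Nothing here reflects the actual throughput-test program: the result format, the site names, and the 1 MB/s threshold are all hypothetical, and the slide does not say what condition triggers a warning email. The sketch flags a site when its test failed or when its measured rate falls below a chosen minimum.

```python
def throughput_mb_per_s(bytes_moved, seconds):
    """Average transfer rate in MB/s (1 MB = 10**6 bytes)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_moved / seconds / 1e6


def sites_needing_warning(results, min_mb_per_s):
    """Return sorted site names whose test failed or ran below min_mb_per_s."""
    flagged = []
    for site, result in sorted(results.items()):
        if not result["ok"]:
            flagged.append(site)
        elif throughput_mb_per_s(result["bytes"], result["seconds"]) < min_mb_per_s:
            flagged.append(site)
    return flagged


if __name__ == "__main__":
    results = {  # hypothetical daily test results
        "T3_SITE_A": {"ok": True, "bytes": 2_000_000_000, "seconds": 100},
        "T3_SITE_B": {"ok": True, "bytes": 50_000_000, "seconds": 100},
        "T3_SITE_C": {"ok": False, "bytes": 0, "seconds": 0},
    }
    print(sites_needing_warning(results, min_mb_per_s=1.0))
```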

  11. Sample Throughput Test results at T3
