FAX FDR results 28th January 2013
For reference
• Indico page: https://indico.cern.ch/conferenceDisplay.py?confId=229966
• Twiki: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/JanuaryFDR
• ML based monitor at SLAC: http://atl-prod07.slac.stanford.edu:8080/display
• CERN dashboard: http://dashb-atlas-xrootd-transfers.cern.ch/ui/#
• WAN HC based tests: http://ivukotic.web.cern.ch/ivukotic/WAN/index.asp
• FDR dedicated test submission page: http://ivukotic.web.cern.ch/ivukotic/FDR/index.asp
Ilija Vukotic ivukotic@uchicago.edu
Endpoints
• From AGIS
• Two more sites added during the week: Lancaster and Liverpool
• RALPP and Cambridge joined, but we are still testing against them
• PIC should join shortly
• All the endpoints showed very high availability during the week.
Links
• WAN HC tests continuously test the full mesh of links.
• The results are used for the SSB cost matrix.
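As a rough illustration of how full-mesh link measurements could feed a cost matrix, the sketch below averages throughput samples per directed (source, destination) link and lists the links that have no measurement yet. The site names appear in these slides, but the sample values, the helper names `build_cost_matrix` and `missing_links`, and the use of mean MB/s as the "cost" are assumptions made for illustration; the actual SSB cost matrix definition is not given here.

```python
from collections import defaultdict
from statistics import mean


def build_cost_matrix(samples):
    """Average MB/s per directed link from (source, destination, rate) samples."""
    per_link = defaultdict(list)
    for source, destination, rate_mb_s in samples:
        per_link[(source, destination)].append(rate_mb_s)
    return {link: mean(rates) for link, rates in per_link.items()}


def missing_links(sites, matrix):
    """Directed links of the full mesh that have no measurement yet."""
    return [(s, d) for s in sites for d in sites if s != d and (s, d) not in matrix]


# Hypothetical measurements, not real FDR results.
samples = [
    ("MWT2", "BNL", 40.0),
    ("MWT2", "BNL", 60.0),
    ("BNL", "MWT2", 30.0),
    ("MWT2", "CERN", 10.0),
]
sites = ["MWT2", "BNL", "CERN"]

matrix = build_cost_matrix(samples)
untested = missing_links(sites, matrix)
```

Keeping the matrix directed (A to B separate from B to A) matters here, because a link can work in one direction and fail in the other, for example when authorization is configured on only one side.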
Links
• Not all of the links worked, due to an authorization issue.
• Plot shows the situation ten days ago.
Links
• During the week
• Not all servers shown
Input data
• The week started without any usable input data and with confusion concerning dataset and file naming.
• Federica kindly provided both a list of datasets to be used (SUSY and SMWZ) and a script to replicate them automatically.
• The dq2-put/dq2-get combination proved to be inadequate for the task at hand.
• We decided to start with just the first dataset of the SUSY sample: 19 files, 68.5 GB.
• Hiro made transfer requests to all of the US sites and then manually re-registered them.
• Simone did the equivalent for CERN and 3 Italian sites, then continued with the DE + RU + UK clouds.
• Wahid used the get/put method for all of the UK sites, but only for the first dataset.
Input data
• Current distribution status of the first SUSY dataset
Week prior to FDR
• Blue and light blue are MWT2, mostly Tier3 users using Tier2 data through FAX.
• Stream of ~100-150 MB/s of testing data, with peaks every 30 min.
• On 11th Jan a few more tests were sent.
During FDR week
• University of Chicago users were using FAX from both Tier3 and grid jobs at Tier2.
• This traffic will be removed from further plots.
During the week
• Jobs were submitted through the web site.
• All of the details, down to PandaIDs, can be found there.
• In short: HC jobs that were normally used for local site tests were changed so that they contact the Oracle DB at CERN and get from there: which FAX endpoint to use, which files to use, what to do with them (copy/read), how many jobs to run, etc. Jobs upload back time, MB/s and ev/s.
• The system is very easy to use, but there is still room for improvement:
  • It lets you try to use a link which (currently) does not work.
  • To get more than two simultaneous jobs you have to make a change in the HC test.
  • Even when you request 10 simultaneous jobs you can easily end up with far fewer, depending on how fast your jobs are and how long the client site's ANALY_* queue is.
• Testing pattern:
  • reading from the same site (10 files from 10 jobs)
  • from a site in the same cloud (10 files from 10 jobs)
  • from the main regional site (CERN, BNL) (10 files from 10 jobs)
  • from across the pond (10 jobs, 1 file)
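The testing pattern and the per-job metrics described above can be sketched as plain data plus one small function. This is only an illustration: the class `PatternStep`, the function `job_metrics`, the input numbers, and the convention 1 MB = 10^6 bytes are assumptions, not the actual HC job code or the DB schema.

```python
from dataclasses import dataclass


@dataclass
class PatternStep:
    source: str  # where the data is read from, relative to the client site
    jobs: int    # simultaneous jobs requested
    files: int   # distinct input files used by the step


# The four steps of the testing pattern listed above.
TESTING_PATTERN = [
    PatternStep("same site", jobs=10, files=10),
    PatternStep("site in same cloud", jobs=10, files=10),
    PatternStep("main regional site (CERN, BNL)", jobs=10, files=10),
    PatternStep("across the pond", jobs=10, files=1),
]


def job_metrics(bytes_read, events, seconds):
    """Return (MB/s, ev/s) for one job; 1 MB is taken as 10**6 bytes (assumption)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_read / 1e6 / seconds, events / seconds


total_jobs = sum(step.jobs for step in TESTING_PATTERN)

# Hypothetical job: 3.6 GB read, 50,000 events processed, 120 s wall time.
mb_s, ev_s = job_metrics(bytes_read=3_600_000_000, events=50_000, seconds=120.0)
```

Note that `total_jobs` is the number of jobs requested; as the slide points out, the number actually running simultaneously can be much lower.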
During the week
Bandwidth measure
• Started low: copy and read from Roma1
Real performance measure
• To Napoli: penalty 30%
• To CERN: penalty 50%
Worst case scenario
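To make the penalty figures concrete, here is a minimal sketch that assumes "penalty" means the relative drop in throughput compared with a baseline measurement. That definition, the function names, and the 20 MB/s baseline are assumptions for illustration; the slides give only the percentages.

```python
def penalty(baseline_mb_s, measured_mb_s):
    """Relative throughput loss versus the baseline (0.3 means 30% slower)."""
    if baseline_mb_s <= 0:
        raise ValueError("baseline must be positive")
    return (baseline_mb_s - measured_mb_s) / baseline_mb_s


def rate_after_penalty(baseline_mb_s, penalty_fraction):
    """Throughput that remains after applying a fractional penalty."""
    return baseline_mb_s * (1.0 - penalty_fraction)


baseline = 20.0  # hypothetical baseline rate in MB/s, not a measured value

to_napoli = rate_after_penalty(baseline, 0.30)  # 30% penalty quoted for Napoli
to_cern = rate_after_penalty(baseline, 0.50)    # 50% penalty quoted for CERN
```

Under these assumptions a 20 MB/s baseline would drop to about 14 MB/s towards Napoli and 10 MB/s towards CERN.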
During the week
• Started low: copy and read from Oxford
During the week
• QMUL, LRZ-LMU
Conclusion
• Huge progress in FAX readiness just before and during the week. THANK YOU ALL!
• Sites where testing was possible showed no problems in delivering data.
• Performance fluctuated a lot, depending on other transfers to/from the site. A cost matrix with fine spatial granularity will be essential for some applications.
• We gained a lot of experience, and the next FDR will surely be much better. It could be held as soon as:
  • the authentication issue is properly solved
  • both copy-to-scratch and direct-access modes are properly implemented
  • the test data is available from the first day