
FAX Performance

FAX Performance. TIM, Tokyo, May 2013. Metrics: data coverage, number of users, percentage of successful jobs, total amount of data delivered, bandwidth usage. Sources: Ganglia plots, MonaLisa, FAX Dashboard, HC tests, CostMatrix tests, special tests using dedicated resources.



Presentation Transcript


  1. FAX Performance. TIM, Tokyo, May 2013

  2. Performance
     • Metrics
       • Data coverage: better than 97%, more than 2 replicas
       • Number of users: mostly UofC and Prague users
       • Percentage of successful jobs: latest HC tests >99%
       • Total amount of data delivered: ~2 PB/week
       • Bandwidth usage
     • Source
       • Ganglia plots
       • MonaLisa
       • FAX Dashboard
       • HC tests
       • CostMatrix tests
       • Special tests using dedicated resources
     Ilija Vukotic ivukotic@uchicago.edu

  3. Cost matrix
     A place to get an idea of the rate a single job can expect to see. Are our pipes really this full? Let's look at other sources of information.

  4. Cost matrix vs. perfSONAR
     Comparison of just one link in one direction: source AGLT2, destination MWT2. perfSONAR information is at 4 h intervals. Could it be that the worker-node links are saturating?

  5. Clogging the pipes
     • HC-submitted jobs sent to 4 ANALY queues
       • AGLT2, BNL, MWT2, SLAC
     • Each site runs 300 jobs of two types, 50 in parallel
       • Copy: xrdcp of 3 files randomly chosen from the SMWZ datasets prepared for FDR at the other sites
       • Read: reads 10% of events from 3 files randomly chosen from the FDR SMWZ datasets at the other sites
     • Each job uploads its time to finish, events/s, MB/s, and pandaid so that jobs can be investigated
     • All jobs submitted through the FDR web interface http://ivukotic.web.cern.ch/ivukotic/FDR/index.asp
     • All in parallel with other HC stress tests
     [Diagram: sites CERN, AGLT2, BNL, SLAC, MWT2]
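The copy-type job described on this slide can be sketched roughly as follows. This is a minimal illustration, not the actual test code: the function names, the file list, and the per-file sizes are hypothetical, and the only external tool assumed is the standard `xrdcp` client writing to `/dev/null`. The copy step is injectable so the timing logic can be exercised without network access.

```python
import random
import subprocess
import time


def xrdcp_copy(url: str) -> None:
    """Copy one remote file to /dev/null with xrdcp (needs the XRootD client)."""
    subprocess.run(["xrdcp", "-f", url, "/dev/null"], check=True)


def run_copy_test(candidates, sizes_mb, copy=xrdcp_copy, n_files=3, rng=random):
    """Copy n_files randomly chosen files; report elapsed time and MB/s.

    candidates: iterable of file URLs (hypothetical list of remote replicas)
    sizes_mb:   mapping URL -> file size in MB (assumed known in advance)
    """
    chosen = rng.sample(list(candidates), n_files)
    start = time.perf_counter()
    for url in chosen:
        copy(url)
    elapsed = time.perf_counter() - start
    total_mb = sum(sizes_mb[url] for url in chosen)
    return {
        "files": chosen,
        "seconds": elapsed,
        "mb_per_s": total_mb / elapsed if elapsed > 0 else float("inf"),
    }
```

A real job would additionally report the result together with its pandaid; that upload step is omitted here because the slide does not describe its interface.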

  6. Tests
     0.17% failure rate!

  7. Copy
     • Clearly not limited by WN links
     • Assuming just 30 simultaneous jobs, worst-case delivery rates are:
       • BNL to CERN: 75 MB/s
       • CERN to AGLT2: 170 MB/s
       • MWT2 to AGLT2: 100 MB/s
       • AGLT2 to CERN: 90 MB/s
       • SLAC to BNL: 300 MB/s
     • Average WAN access ~300 MB/s
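To relate these figures to what a single job sees, one can divide each quoted rate by the number of simultaneous jobs. The sketch below assumes the quoted rates are aggregates over the 30 jobs; the slide does not state this explicitly, so the per-job numbers are only indicative.

```python
JOBS = 30  # simultaneous jobs assumed on the slide

# Worst-case delivery rates quoted on the slide, in MB/s.
worst_case_mb_s = {
    ("BNL", "CERN"): 75,
    ("CERN", "AGLT2"): 170,
    ("MWT2", "AGLT2"): 100,
    ("AGLT2", "CERN"): 90,
    ("SLAC", "BNL"): 300,
}


def per_job_rate(aggregate_mb_s, jobs=JOBS):
    """Per-job rate, assuming the aggregate is shared evenly among jobs."""
    return aggregate_mb_s / jobs


per_job = {link: per_job_rate(rate) for link, rate in worst_case_mb_s.items()}

for (src, dst), rate in per_job.items():
    print(f"{src} -> {dst}: {rate:.1f} MB/s per job")
```

Under that assumption the worst link (BNL to CERN) gives 2.5 MB/s per job and the best (SLAC to BNL) gives 10 MB/s per job.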

  8. Read
     • Rates are the same as for xrdcp, except for local access.
     • Over the WAN, one should expect at least 50% of the CPU efficiency of local access.
     • Fewer than 100 simultaneous standard analysis jobs will saturate a 10 Gb/s WAN link.
     • FAX needs to be used judiciously; it can easily overwhelm weaker links.
     • Jobs were reading 10% of events using a 30 MB TTreeCache (TTC).
     • 100% of the data are transferred and decompressed.
     • ROOT can decompress our D3PD at ~20 MB/s.
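The saturation claim can be checked with simple arithmetic. The sketch below assumes the link is 10 Gbit/s in decimal units (1250 MB/s) and that jobs share it evenly; the 15 MB/s figure in the second example is purely hypothetical and is not taken from the slides.

```python
def link_capacity_mb_s(gbit_per_s):
    """Convert a link speed in Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8


def jobs_to_saturate(gbit_per_s, per_job_mb_s):
    """Number of jobs at per_job_mb_s each that would fill the link."""
    return link_capacity_mb_s(gbit_per_s) / per_job_mb_s


capacity = link_capacity_mb_s(10)   # 1250.0 MB/s
threshold = capacity / 100          # 12.5 MB/s per job fills the link with 100 jobs

# Hypothetical per-job rate, for illustration only.
example_jobs = jobs_to_saturate(10, 15)
```

So "fewer than 100 jobs" corresponds to an average per-job transfer rate above 12.5 MB/s; at the hypothetical 15 MB/s, about 83 jobs would fill the link.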

  9. MonaLisa

  10. Ways ahead
     • Increase coverage, add redundancy, increase total bandwidth
       • Enlargement
     • Increase performance, reduce bandwidth needs
       • Caching
       • Cost matrix: smart FAX
       • Smart network: bandwidth requests, QoS assurance
     • Improve adoption rate
       • Presenting, teaching, preaching
       • New services
     • Improve satisfaction
       • FAX tuning
       • Application tuning
       • New services
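One way the "cost matrix: smart FAX" idea could work is to pick, for a given destination, the replica site with the highest measured rate. The sketch below is only an illustration of that idea under stated assumptions: the function name, the matrix layout (a mapping from (source, destination) to MB/s), and the example rates are all hypothetical and are not the actual FAX implementation or measured values.

```python
def best_source(cost_matrix, destination, replica_sites):
    """Return the replica site with the highest rate to destination, or None.

    cost_matrix:   mapping (source, destination) -> rate in MB/s (hypothetical layout)
    replica_sites: sites known to hold a copy of the requested file
    """
    candidates = {
        src: cost_matrix[(src, destination)]
        for src in replica_sites
        if (src, destination) in cost_matrix
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Made-up rates, for illustration only.
example_matrix = {
    ("SITE_A", "SITE_D"): 4.0,
    ("SITE_B", "SITE_D"): 9.5,
    ("SITE_C", "SITE_D"): 6.0,
}
choice = best_source(example_matrix, "SITE_D", ["SITE_A", "SITE_B", "SITE_C"])
```

With the made-up matrix above, `choice` is `"SITE_B"`. A production version would also need to handle stale measurements and load, which this sketch ignores.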
