This project aims to analyze the performance of ATLAS analysis jobs on the grid, understand the numbers obtained, and improve the software, the files, and the way resources are used. The goal is a simple, realistic, and versatile system running on most available resources, with a fast turnaround and a web interface for the most important indicators.
ATLAS analysis performance on the grid: monitoring and improving
Ilija Vukotic, Wahid, Paul, Doug, Jack, Johannes, Dan
We want to
• Know the performance of ATLAS jobs on the grid
  • We don't have one widely used framework that we could instrument, so we need to be open to any kind of job: ROOT analysis scripts, Athena jobs, D3PD maker
• Understand the numbers we get
• Improve
  • our software
  • our files
  • the way we use ROOT
  • middleware
  • sites
  • the way we test developments
• Have it as simple, realistic, accessible, and versatile as possible
  • running on most of the resources we have
  • fast turnaround
  • testing codes that are the "recommended way to do it"
  • web interface for the most important indicators
Why are analysis jobs important?
• The number of analysis jobs is increasing
• Production jobs are mostly CPU limited, well controlled, hopefully optimized, and can be monitored through other, already existing systems
• We know very little about analysis jobs, and they could potentially be inefficient and wreak havoc on storage elements and networks
How it's done
1. HammerCloud submits jobs
2. Jobs collect and send info to a DB (a minimal sketch of this step follows the list)
• Continuous
  • Job performance
    • generic ROOT I/O scripts
    • realistic analysis jobs
  • Site performance
  • Site optimization
• One-off
  • new releases (Athena, ROOT)
  • new features, fixes
• All T2D sites (currently 35 sites)
• Large number of monitored parameters
• Central database
• Wide range of visualization tools
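To make step 2 concrete, here is a minimal sketch of what the job-side reporting could look like: the wrapper times the setup and event-loop phases and posts the numbers to a central database over HTTP. The endpoint URL, field names, and payload layout are illustrative assumptions, not the actual HammerCloud implementation.

    import time
    import json
    import urllib.request

    # Hypothetical collector endpoint -- the real HammerCloud setup differs.
    COLLECTOR_URL = "https://example.cern.ch/hc/collect"  # assumption, not the real URL

    def report(payload):
        """POST one job's timing record to the central database (sketch)."""
        req = urllib.request.Request(
            COLLECTOR_URL,
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req, timeout=30)

    def run_monitored(setup_fn, event_loop_fn):
        """Run a job while recording wall-clock and CPU time per phase."""
        record = {}
        t0, c0 = time.time(), time.process_time()
        setup_fn()                                   # e.g. release setup, athena init
        record["setup_wall_s"] = time.time() - t0

        t1, c1 = time.time(), time.process_time()
        event_loop_fn()                              # the actual analysis loop
        record["loop_wall_s"] = time.time() - t1
        record["loop_cpu_s"] = time.process_time() - c1
        record["total_wall_s"] = time.time() - t0
        record["total_cpu_s"] = time.process_time() - c0
        report(record)
        return record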
Pilot numbers obtained from the PanDA DB
Message
• Everybody
  • Visit http://ivukotic.web.cern.ch/ivukotic/HC/index.asp
  • Give it a spin, give us feedback, and ask for features
• Site admins
  • We are trying to improve our performance and reduce stress on your systems, not to judge sites.
  • Compare your site to others, see what they do differently, and improve.
• ROOT / CMS / storage-testing people
  • Give us your code/data and we will do fast testing for you on all the different kinds of CPUs, storage backends, and protocols.
  • We'll learn something from your tests too.
Result – Efficiency
• Average results over all sites during the last month, using release 17.0.4 (ROOT 5.28)
• 77% event-loop CPU efficiency
• 41% total job CPU efficiency
• Realistic analysis jobs are comparable
(How these efficiencies are defined is sketched below.)
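For clarity, CPU efficiency here is the ratio of CPU time to wall-clock time over the interval of interest. The sketch below only illustrates the arithmetic; the timing values are made up and are not results from any real job.

    def cpu_efficiency(cpu_time_s, wall_time_s):
        """CPU efficiency = CPU time / wall-clock time for the same interval."""
        return cpu_time_s / wall_time_s if wall_time_s > 0 else 0.0

    # Illustrative numbers only (not from a real job):
    # an event loop using 770 s of CPU over 1000 s of wall time is 77% efficient,
    # while a whole job (setup + stage-in + loop + stage-out) using 820 s of CPU
    # over 2000 s of wall time is 41% efficient.
    print(cpu_efficiency(770.0, 1000.0))   # 0.77
    print(cpu_efficiency(820.0, 2000.0))   # 0.41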
Result – Efficiency of TTC (TTreeCache)
• For EOS it is indispensable
Result – Efficiency of TTC
• TTC effects will become more pronounced over the WAN
(A sketch of enabling the cache follows.)
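As a reminder of what "using TTC" means in practice, here is a minimal PyROOT sketch that enables TTreeCache on a tree read over the network. The file URL, tree name, and cache size are placeholders; the actual HammerCloud test scripts may configure it differently.

    import ROOT

    # Placeholder URL and tree name -- substitute a real EOS/xrootd path.
    f = ROOT.TFile.Open("root://eosatlas.cern.ch//eos/atlas/user/example/file.root")
    tree = f.Get("physics")

    cache_size = 30 * 1024 * 1024          # 30 MB TTreeCache
    tree.SetCacheSize(cache_size)
    tree.AddBranchToCache("*", True)        # cache all branches (or list only those used)
    ROOT.TTreeCache.SetLearnEntries(10)     # entries used to learn the access pattern

    for i in range(tree.GetEntries()):
        tree.GetEntry(i)                    # reads now go through the cache
        # ... analysis code ...

    f.Close()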
Result – setup time, part 1
• Even at under one minute, setup time is far too large an overhead for analysis jobs.
  • Analysis job duration is limited by the size of the temp disk (<10 GB); any reasonable analysis job should be shorter than 20 min.
• At some sites we occasionally noticed very large setup times.
  • They allow 24 jobs per machine, and these machines have 24 GB of RAM.
  • To avoid swapping problems they make an accepted job wait in setup until 2 GB of RAM are free.
  • Occasionally this leads to a job waiting an hour or two in setup, and even then the job often runs into swapping problems a few minutes later.
• At some CVMFS sites, setup times of thousands of seconds were traced to a bug in CVMFS that causes cache corruption.
• The biggest problem is times of 50-100 seconds.
  • Against all expectations, CVMFS sites are on average slower to set up: 40 vs 52 seconds.
  • Is the cache invalidated that often?
  • A very big and long-standing issue is CMT doing millions of stat calls.
  • Working on it with David Q. and Grigori R.
Result - overbooking
• There is often suboptimal overbooking of the nodes.
• Example
  • Intel(R) Xeon(R) CPU E5645 @ 2.40 GHz, 12-core machines.
  • While loads up to 14-15 are perhaps acceptable, loads of 16+ simply waste resources, as job execution times essentially double.
• There is nothing preventing any grid job from spawning 15 threads. This affects everybody. Can / should we do something about it? (A sketch of a simple check is given below.)
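As an illustration of how a job or a site probe could flag overbooking, here is a minimal sketch comparing the one-minute load average to the number of cores. The threshold factor is an assumption, chosen only to mirror the 14-15-on-12-cores observation above.

    import os
    import multiprocessing

    def overbooked(threshold_factor=1.3):
        """Return True if the 1-minute load average exceeds cores * threshold_factor.

        threshold_factor=1.3 is an illustrative choice (roughly 15-16 on a
        12-core node, where the slide above reports doubled execution times).
        """
        load_1min, _, _ = os.getloadavg()
        cores = multiprocessing.cpu_count()
        return load_1min > cores * threshold_factor

    if __name__ == "__main__":
        print("node looks overbooked" if overbooked() else "load looks OK")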
CPU normalization
• CPU HS06 is not a reliable indicator of how much CPU time our jobs will use
• Use our jobs to derive this information (a sketch follows)
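One way to derive such a factor, sketched below under the assumption that the same reference job is run on every node type, is to compare each node type's measured CPU time against a chosen reference machine. The node names, timings, and dictionary layout are purely illustrative.

    # Illustrative only: CPU times (seconds) of one identical reference job
    # measured on different worker-node types.  The values are made up.
    measured_cpu_s = {
        "Intel_E5645_2.40GHz": 1250.0,
        "Intel_X5650_2.67GHz": 1100.0,
        "AMD_Opteron_6128":    1600.0,
    }

    reference = "Intel_E5645_2.40GHz"   # arbitrary choice of reference node type

    # A factor > 1 means the node runs our jobs faster than the reference.
    normalization = {
        node: measured_cpu_s[reference] / t for node, t in measured_cpu_s.items()
    }

    for node, factor in sorted(normalization.items()):
        print(f"{node}: {factor:.2f}")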
US sites – Efficiency
US sites – Efficiency
• WN load is not strongly correlated with CPU efficiency, but site occupancy may be.
• (Figure annotation: empty queue)
US sites – Pilot timings
Curious
• BNL machines: Intel X5xxx
• US sites: Intel Exxxx
Questions to answer ASAP
• Optimize each site – for example: is it better to pre-stage input files?
• Performance of different storages/protocols
• What goes into stage-out time?
• Optimal autoflush / TTC settings? (a sketch of setting autoflush is below)
• Performance of all the ROOT versions
To come
• Stress tests
• WAN tests
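For reference, the autoflush setting mentioned above is chosen when the file is written. Here is a minimal PyROOT sketch; the file name, branch content, and the 30 MB value are picked only for illustration and are not a recommendation.

    import ROOT
    from array import array

    f = ROOT.TFile("example_output.root", "RECREATE")   # placeholder file name
    tree = ROOT.TTree("physics", "example tree")

    # Negative value = flush baskets roughly every 30 MB of data written;
    # a positive value would instead mean "flush every N entries".
    tree.SetAutoFlush(-30 * 1024 * 1024)

    pt = array("f", [0.0])
    tree.Branch("pt", pt, "pt/F")

    for i in range(100000):
        pt[0] = 0.1 * i
        tree.Fill()

    tree.Write()
    f.Close()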
Reserve
Result – Hardware issue
• In Glasgow we found a set of 6 X5650 nodes with longer CPU times than the rest and contacted the site with the node names.
• Explanation
  • The 2 sets of 3 nodes map to 2 "4-node" boxes.
  • Both of those boxes had a single failed PSU out of the redundant PSUs that power each box.
  • The nodes underclocked themselves to cope with the lower available power.
  • The PSUs in question have been fixed, and the nodes are now operating at their full clock speed.
Result – setup time, part 2
• Against all expectations, CVMFS sites are on average slower to set up: 40 vs 52 seconds – will follow up with Jakob.