Moving from CREAM CE to ARC CE: Migration at RAL

This presentation describes the migration from CREAM CE to ARC CE at RAL, including the reasons for choosing ARC CE and the steps involved in the migration. It also covers the modifications made to the information system and to job management that were needed to run ARC CE efficiently in production.


Presentation Transcript


  1. Moving from CREAM CE to ARC CE Andrew Lahiff andrew.lahiff@stfc.ac.uk

  2. The short version
     • Install ARC CE
     • Test ARC CE
     • Move ARC CE(s) into production
     • Drain CREAM CE(s)
     • Switch off CREAM CE(s)

  3. Migration at RAL
     • In 2013 we combined
       • migration from Torque to HTCondor
       • migration from CREAM CE to ARC CE
     • Initial reasons for choice of ARC CE
       • we didn’t like CREAM
       • HTCondor-CE was still very new, even in OSG
       • had heard good things about ARC
         • Glasgow & Imperial College in the UK had already tried it
       • looked much simpler than CREAM
         • YAIM not required
       • ATLAS use it a lot

  4. Migration at RAL
     • Initially had CREAM CEs + Torque
     [Diagram: CREAM CEs (Torque), Torque server / Maui, worker nodes (Torque), APEL, glite-CLUSTER]

  5. Migration at RAL
     • Added HTCondor pool + ARC & CREAM CEs
     [Diagram: CREAM CEs (Torque), Torque server, worker nodes (Torque), APEL, glite-CLUSTER; plus ARC CEs (condor_schedd), CREAM CEs (condor_schedd), HA HTCondor central managers, worker nodes (condor_startd)]

  6. Migration at RAL
     • Torque batch system decommissioned
     [Diagram: APEL, glite-CLUSTER, ARC CEs (condor_schedd), CREAM CEs (condor_schedd), HA HTCondor central managers, worker nodes (condor_startd)]

  7. Migration at RAL
     • CREAM CEs & APEL publisher decommissioned - once all LHC VOs & non-LHC VOs could submit to ARC
     [Diagram: glite-CLUSTER, ARC CEs (condor_schedd), HA HTCondor central managers, worker nodes (condor_startd)]

  8. Migration at RAL
     • glite-CLUSTER decommissioned
     [Diagram: ARC CEs (condor_schedd), HA HTCondor central managers, worker nodes (condor_startd)]

  9. ARC CEs at RAL
     • 4 ARC CEs – each is a VM with
       • 4 CPUs
       • 32 GB memory
         • most memory usage comes from the condor shadows
           • we use 32-bit HTCondor rpms; will move to static shadows soon
         • see slapd using up to ~1 GB
         • we wanted to have lots of headroom! (were new to both ARC and HTCondor)
     • Using multiple ARC CEs for redundancy & scalability

  10. ARC CEs at RAL • Example from today – 5.5K running jobs on a single CE

  11. Usage since Oct 2013
     • Generally have 2-3K running jobs per CE
     [Plot: running jobs per ARC CE, with an annotated monitoring glitch]

  12. Things you need to know

  13. glite-WMS support
     • Some non-LHC VOs still use glite-WMS
       • getting less & less important
     • In order for the WMS job wrapper to work with ARC CEs, need an empty file /usr/etc/globus-user-env.sh on all worker nodes (see the sketch below)
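     A minimal sketch of satisfying that requirement; in practice this would normally be handled by a configuration management system rather than run by hand:

         # Create the empty file the WMS job wrapper expects (on every worker node)
         mkdir -p /usr/etc
         touch /usr/etc/globus-user-env.sh
         chmod 644 /usr/etc/globus-user-env.sh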

  14. Software tags
     • Software tags (almost) no longer needed due to CVMFS
       • some non-LHC VOs may need them however
       • again, probably getting less & less important
     • ARC
       • runtime environments appear in the BDII in the same way as software tags
       • unless you have a shared filesystem (worker nodes, CEs), no way for VOs to update tags themselves
       • our configuration management system manages the runtime environments
         • mostly just empty files (see the sketch below)
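     As a rough illustration, publishing a tag can be as simple as dropping an empty file into the runtime environment directory on each CE. The directory path and tag name below are assumptions; the real location is whatever runtimedir points to in your arc.conf:

         # Sketch: publish a VO software tag as an (empty) ARC runtime environment
         RUNTIMEDIR=/etc/arc/runtime                          # assumption - check arc.conf
         mkdir -p "$RUNTIMEDIR"
         touch "$RUNTIMEDIR/VO-example.org-SOME-SOFTWARE-1.0"  # hypothetical tag name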

  15. Information system
     • Max CPU & wall time not published correctly
       • only a problem for the HTCondor backend
       • no way for ARC to determine this from HTCondor
         • could try to extract from SYSTEM_PERIODIC_REMOVE? (see the example below)
         • what if someone does this on the worker nodes, e.g. WANT_HOLD?
     • We modified /usr/share/arc/glue-generator.pl
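     For context, limits of this kind are often expressed on the schedd as a SYSTEM_PERIODIC_REMOVE expression; the values below are illustrative, not the RAL limits:

         # HTCondor config sketch (illustrative limits, not RAL's actual values):
         # remove jobs exceeding 72 hours of wall time or 60 hours of CPU time
         SYSTEM_PERIODIC_REMOVE = (RemoteWallClockTime > 72 * 3600) || \
                                  ((RemoteUserCpu + RemoteSysCpu) > 60 * 3600)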

  16. Information system - VO views
     • ARC reports the same number of running & idle jobs for all VOs
     • We modified /usr/share/arc/glue-generator.pl
       • cron running every 10 mins queries HTCondor & creates files listing numbers of jobs by VO (sketched below)
       • glue-generator.pl modified to read these files
     • Some VOs still need this information (incl LHC VOs)
       • hopefully the need for this will slowly go away
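     A rough sketch of what such a cron job could look like; the VO list, output directory and the use of the x509UserProxyVOName attribute are assumptions, not the actual RAL script:

         #!/bin/bash
         # Hypothetical per-VO job counter; writes "<running> <idle>" for each VO
         OUTDIR=/var/local/arc-vo-jobs            # assumed location
         mkdir -p "$OUTDIR"
         for vo in alice atlas cms lhcb; do
             running=$(condor_q -allusers -constraint "x509UserProxyVOName == \"$vo\" && JobStatus == 2" -af ClusterId | wc -l)
             idle=$(condor_q -allusers -constraint "x509UserProxyVOName == \"$vo\" && JobStatus == 1" -af ClusterId | wc -l)
             echo "$running $idle" > "$OUTDIR/$vo"
         done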

  17. Information system – VO shares
     • VO shares not published
     • Added some lines into /usr/share/arc/glue-generator.pl:
         GlueCECapability: Share=cms:20
         GlueCECapability: Share=lhcb:27
         GlueCECapability: Share=atlas:49
         GlueCECapability: Share=alice:2
         GlueCECapability: Share=other:2
     • Not sure why this information is needed anyway

  18. LHCb
     • DIRAC can’t specify runtime environments
       • we use an auth plugin to specify a default runtime environment
       • we put all essential things in here (grid-related env variables etc)
     • Default runtime environment needs to set
       • NORDUGRID_ARC_QUEUE=<queue name> (see the fragment below)
     https://github.com/alahiff/ral-arc-ce-rte/blob/master/GLITE
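     A minimal fragment illustrating that last point; the queue name is a placeholder and the real GLITE RTE linked above also exports a number of other grid-related variables:

         # Sketch of the key line a default runtime environment needs
         export NORDUGRID_ARC_QUEUE=grid      # placeholder queue name
         # ... plus other grid-related environment variables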

  19. Multi-core jobs
     • In order for stdout/err to be available to VO, need to set RUNTIME_ENABLE_MULTICORE_SCRATCH=1 in a runtime environment
     • In ours we have (amongst other things):
         if [ "x$1" = "x0" ]; then
             export RUNTIME_ENABLE_MULTICORE_SCRATCH=1
         fi
     https://github.com/alahiff/ral-arc-ce-rte/blob/master/GLITE

  20. Auth plugins
     • Can configure an external executable to run every time a job is about to switch to a different state (via authplugin in arc.conf - sketched below)
       • ACCEPTED, PREPARING, SUBMIT, FINISHING, FINISHED, DELETED
     • Very useful! Our standard uses:
       • Setting default runtime environment for all jobs
       • Scaling CPU & wall time for completed jobs
       • Occasionally for debugging
         • keep all stdout/err files for completed jobs for a particular VO
     https://github.com/alahiff/ral-arc-ce-plugins
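     A rough idea of how such a plugin is hooked in; the plugin path, timeout and argument substitutions below are illustrative assumptions, and the real plugins are in the repository linked above:

         # arc.conf sketch ([grid-manager] block, ARC 5 style)
         [grid-manager]
         ...
         authplugin="FINISHED timeout=60 /usr/local/bin/scale-times.sh %C %I"   # %C = control dir, %I = job id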

  21. User mapping
     • Argus for mapping to local pool accounts (via lcmaps)
     • In /etc/arc.conf:
         [gridftpd]
         ...
         unixmap="* lcmaps liblcmaps.so /usr/lib64 /usr/etc/lcmaps/lcmaps.db arc"
         unixmap="banned:banned all"
         ...
     • Set up Argus policies to allow all supported VOs to submit jobs (a sketch follows)
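     For illustration, an Argus policy permitting VOs to submit, written in the simplified policy language; the resource and action identifiers are placeholders and depend on your Argus/lcmaps setup:

         # Hypothetical Argus policy fragment (simplified policy language)
         resource "http://authz-interop.org/xacml/resource/resource-type/arc" {
             action "http://glite.org/xacml/action/execute" {
                 rule permit { vo = "atlas" }
                 rule permit { vo = "cms" }
             }
         }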

  22. Monitoring

  23. Monitoring - alerts
     • ARC Nagios tests
       • Check proc a-rex
       • Check proc gridftp
       • Check proc nordugrid-arc-bdii
       • Check proc nordugrid-arc-slapd
       • Check ARC APEL consistency
         • check that SSM message sent successfully to APEL < 24 hours ago
       • Check HTCondor-ARC consistency
         • check that HTCondor & ARC agree on number of running + idle jobs
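     The process checks can be as simple as the standard Nagios check_procs plugin; the process names below are assumptions based on a typical ARC install, not necessarily the exact RAL check definitions:

         # Hypothetical checks using the stock check_procs plugin
         check_procs -c 1: -C gridftpd    # gridftp server
         check_procs -c 1: -C slapd       # nordugrid-arc-slapd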

  24. Monitoring - alerts
     • HTCondor Nagios tests
       • Check HTCondor CE Schedd
         • check that the schedd ClassAd is available (see the sketch below)
         • we found that a check for condor_master is not enough, e.g. if you have a corrupt HTCondor config file
       • Check job submission HTCondor
         • check that Nagios can successfully submit job to HTCondor
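     A minimal sketch of the schedd ClassAd idea (not the RAL plugin itself): query the collector for this host's schedd ad and alert if nothing comes back.

         #!/bin/bash
         # Hypothetical Nagios-style check: is this CE's schedd ClassAd in the collector?
         if condor_status -schedd -constraint "Machine == \"$(hostname -f)\"" -af Name | grep -q .; then
             echo "OK: schedd ClassAd available"; exit 0
         else
             echo "CRITICAL: schedd ClassAd not found"; exit 2
         fi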

  25. Monitoring - Ganglia
     • Ganglia metrics
       • standard host metrics
     • Gangliarc: http://wiki.nordugrid.org/wiki/Gangliarc
       • ARC specific metrics
     • condor_gangliad
       • HTCondor specific metrics (enabled as sketched below)
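     For reference, condor_gangliad is typically enabled by adding the GANGLIAD daemon to the daemon list on one machine in the pool; further tuning knobs are omitted here:

         # HTCondor config sketch: run condor_gangliad on one host in the pool
         DAEMON_LIST = $(DAEMON_LIST) GANGLIAD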

  26. Monitoring - Ganglia
     [Screenshots: example Ganglia graphs, +more...]

  27. Monitoring - InfluxDB
     • 1-min time resolution
     • ARC CE metrics
       • job states, time since last arex heartbeat
     • HTCondor metrics include:
       • shadow exit codes
       • numbers of jobs run more than once (e.g. obtainable as sketched below)
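     As one way to obtain the last of these metrics (an assumption, not necessarily how RAL collects it), the NumJobStarts job attribute can be queried directly:

         # Sketch: count currently running jobs that have been started more than once
         condor_q -allusers -constraint 'JobStatus == 2 && NumJobStarts > 1' -af ClusterId ProcId | wc -l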

  28. Monitoring - InfluxDB
     [Screenshots of the InfluxDB-based dashboards]

  29. Problems we’ve had
     • APEL central message broker hardwired in config
       • when hostname of the message broker changed once, APEL publishing stopped
       • now have Nagios check for APEL publishing
     • ARC-HTCondor running+idle jobs consistency
       • before scan-condor-job was optimized, had ~2 incidents in the past couple of years where ARC lost track of jobs
       • best to use ARC version > 5.0.0
