Status of Task Forces Josep Flix (PIC/CIEMAT) On behalf of the WLCG Operations Coordination Team WLCG Operations Coordination F2F – CERN [11th February 2014]
Today’s detailed talks • I am providing here a brief summary for some other TFs
gLExec deployment • Multi-user pilot jobs should make use of gLExec to change user identity. The TF aims to coordinate the deployment of gLExec without interfering with current experiment workflows • Each site has its gLExec infrastructure regularly tested through SAM tests (which are to become critical at some point) • Closed tickets: 75 • Open tickets: 20 • A serious bug in gLExec has been discovered: https://twiki.cern.ch/twiki/bin/view/LCG/GlexecDeployment#Known_issues • CMS will make the gLExec SAM test critical soon https://twiki.cern.ch/twiki/bin/view/LCG/GlexecDeploymentTracking
perfSONAR deployment • Goal is to encourage all WLCG sites to deploy, configure and register perfSONAR-PS instances that gather network metrics on the network paths between all of the WLCG sites • A new release (3.3.2) is available: • sites should upgrade. The procedure is straightforward and requires no re-configuration • A campaign to get the remaining sites installed and out-of-date installations upgraded is still ongoing (tickets) • Several sites are lagging behind in this deployment • WLCG OpsCoord is raising the issue with the WLCG management and experiments [I. Bird @ LHCONE WS] https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment
Tracking Tools Evolution I • Developers, deployers, experts of GGUS, SNOW, Savannah, JIRA and the experiments discuss development options for each tool and the interfaces between them, when required • GGUS releases: • Last release done on 29th Jan. 2014 • Includes several minor bug fixes and a new WLCG Monitoring SU • Next release: 26th of February • Prototype of multiple site notification expected in a test instance before the end of the month (hopefully in production by March)
Tracking Tools Evolution II • Developers, deployers, experts of GGUS, SNOW, Savannah, JIRA and the experiments discuss development options for each tool and the interfaces between them, when required • Savannah to JIRA migration: • Very slow progress in this area • Main issue for the 'GGUS Shopping list' tracker (cross-references between tickets) still not solved after more than one year • Other trackers do not depend on this functionality, so it may be time to accept that these references will be lost during the migration https://twiki.cern.ch/twiki/bin/view/LCG/TrackingToolsEvolution
XrootD deployment • The aim of this task force is to support the deployment of the Xrootd federated data storage at the WLCG sites for the FAX (ATLAS) and AAA (CMS) projects • Campaign for publishing Xrootd endpoints in GOC/OIM is about to start (tickets!) • this will ease the operations and monitoring effort https://twiki.cern.ch/twiki/bin/view/LCG/XrootdDeployment
SHA-2 Migration • How services used by WLCG VOs (ALICE, ATLAS, CMS, DTEAM, LHCb, ops) can be tested for SHA-2 readiness • The EOS SRM for LHCb is not OK yet • patch needed to support the "root" protocol expected by LHCb jobs • voms-proxy-init on lxplus crashes when creating SHA-2 RFC proxies (discovered by CMS) • works OK with the Java-based version provided by voms-clients3 • VOMRS: • VOMS-Admin test cluster will soon be available • host certs of the future VOMS service will come from the new SHA-2 CERN CA • campaign to get the new servers recognized in LSC files across the Grid (such files will also be provided in rpms) https://twiki.cern.ch/twiki/bin/view/LCG/SHA2readinessTesting
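As a rough illustration of the LSC campaign mentioned above: an LSC file conventionally holds the subject DN of the VOMS server host certificate on one line and the DN of the issuing CA on the next, stored under a per-VO directory. The sketch below renders such a file. The helper names, the DNs and the hostname are all hypothetical placeholders, and the default directory is the commonly used location rather than anything stated in the slides.

```python
def render_lsc(subject_dn, issuer_dn):
    """Return LSC file content: server subject DN, then issuing CA DN."""
    for dn in (subject_dn, issuer_dn):
        # DNs in LSC files use the slash-separated form and fit on one line.
        if not dn.startswith("/") or "\n" in dn:
            raise ValueError(f"not a single-line slash-separated DN: {dn!r}")
    return f"{subject_dn}\n{issuer_dn}\n"


def lsc_path(vo, hostname, vomsdir="/etc/grid-security/vomsdir"):
    """Return the conventional location of the LSC file for a VO and server."""
    return f"{vomsdir}/{vo}/{hostname}.lsc"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    subject = "/DC=org/DC=example/OU=hosts/CN=voms.example.org"
    issuer = "/DC=org/DC=example/CN=Example SHA-2 CA"
    print(lsc_path("ops", "voms.example.org"))
    print(render_lsc(subject, issuer), end="")
```

Because the file pins DNs rather than the certificate itself, a site only needs a new LSC file when the server or CA name changes, which is why a move to a new CA triggers a campaign like this one.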
Machine/Job Features • Machine/Job features provide information from a resource provider (batch system, IaaS) to the payload: • static (e.g. power of the machine, number of cores, local scratch space) • dynamic (e.g. shutdown time of a VM) • Current prototype at CERN lxbatch (bare metal / vWNs): • received feedback from ALICE, who were testing mjf on the CERN batch nodes (waiting for feedback from ATLAS/CMS) • For cloud-like installations the TF has decided to look into alternative ways of communicating the features: • investigating NoSQL key/value stores as a viable alternative. A test instance has been set up and is being validated right now https://twiki.cern.ch/twiki/bin/view/LCG/MachineJobFeatures
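A minimal sketch of how a payload might consume such features, assuming a layout in which an environment variable (here `MACHINEFEATURES`) points to a directory containing one small file per key. The key names `shutdowntime` and `hs06` and both helper functions are illustrative assumptions, not taken from the slides.

```python
import os
from pathlib import Path


def read_features(env_var, environ=None):
    """Return {key: value} for every file in the directory named by env_var."""
    environ = os.environ if environ is None else environ
    location = environ.get(env_var)
    if not location:
        return {}  # the resource provider does not publish these features
    directory = Path(location)
    if not directory.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(directory.iterdir())
        if entry.is_file()
    }


def seconds_until_shutdown(features, now):
    """Dynamic feature: seconds left before the announced shutdown, or None."""
    value = features.get("shutdowntime")  # assumed to be a Unix timestamp
    return None if value is None else int(value) - int(now)


if __name__ == "__main__":
    import time

    machine = read_features("MACHINEFEATURES")
    print("static power figure:", machine.get("hs06", "not published"))
    print("seconds to shutdown:", seconds_until_shutdown(machine, time.time()))
```

Treating every value as an opaque string keeps the reader independent of the transport, so the same dictionary could in principle be filled from a key/value store in the cloud-like case.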
IPv6 validation and deployment • The imminent exhaustion of the IPv4 address space will eventually require migrating the WLCG services to an IPv6 infrastructure. The TF works in close relation with the HEPIX IPv6 Working Group • Agreed at the last F2F that it would be beneficial to progress with volunteering sites moving to dual stack • trying to understand how to make sure the instability this would cause does not have a negative impact on the site (?) • A document for the MB is being prepared, also covering the case of the MW readiness WG https://twiki.cern.ch/twiki/bin/view/LCG/WlcgIpv6
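One simple way to see whether a service endpoint is published as dual stack is to check which address families its name resolves to. The sketch below uses Python's standard `socket.getaddrinfo`. The function names and the example hostname are hypothetical, and the check only reflects what the local resolver returns, not whether the service actually answers over each protocol.

```python
import socket


def classify_stack(families):
    """Classify a set of address families as dual-stack, single-stack or unresolved."""
    has_v4 = socket.AF_INET in families
    has_v6 = socket.AF_INET6 in families
    if has_v4 and has_v6:
        return "dual-stack"
    if has_v6:
        return "ipv6-only"
    if has_v4:
        return "ipv4-only"
    return "unresolved"


def address_families(host, port=443):
    """Return the set of address families the resolver reports for host."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()  # name could not be resolved
    return {info[0] for info in infos}


if __name__ == "__main__":
    host = "service.example.org"  # hypothetical endpoint
    print(host, classify_stack(address_families(host)))
```

Keeping the classification separate from the lookup makes it easy to run the same logic over a list of results collected from several sites.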
Conclusions • All of the TFs are progressing well! • Sites and experiments are encouraged to actively participate in the discussions and the TFs! • WLCG Operations coordination twiki: https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsCoordination • Mailing list: wlcg-ops-coord@cern.ch