AMOD report
Alessandro Di Girolamo, Stephane Jezequel, Guido Negri
Tier0 – Central Services
• Castoratlas/t0atlas: a few write failures (Tue 17th, still not understood) and read failures (Wed 18th, caused by a single server with problems)
• 2 new servers added to the LFC frontend (7 in total, i.e. 90*7=630 possible connections, while the backend database was limited to 500 connections; the DB team raised the limit to 900); communication between GS and DB needs to improve
• Castoratlas/atlt3 being decommissioned (clean-up ongoing, TMPLOCALGROUPDISK removed from ToA)
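The connection arithmetic in the LFC bullet can be checked with a short sketch. The function name `lfc_connection_headroom` is hypothetical; only the figures (7 frontends, 90 connections each, backend limits of 500 and 900) come from the report.

```python
def lfc_connection_headroom(frontends, conns_per_frontend, backend_limit):
    """Return the backend limit minus the maximum number of frontend connections.

    A negative result means the frontends could together open more
    connections than the backend database accepts.
    """
    return backend_limit - frontends * conns_per_frontend


# Figures from the report: 7 frontends, 90 connections each.
print(lfc_connection_headroom(7, 90, 500))  # old backend limit: -130
print(lfc_connection_headroom(7, 90, 900))  # raised backend limit: 270
```

With the old limit the frontends could exceed the backend by 130 connections; the raised limit leaves 270 to spare.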
Tier1s/Tier2s
• PIC: many transfer failures on Monday (GGUS:84311). All dCache pools assigned to ATLAS filled up; new disk space has been assigned. PIC reported problems installing 1.7 PB of new hardware, which should be done now
• TRIUMF: GGUS:84327, bad ACLs on some directories in LFC; the site was asked to change them (Kors' certificate, used to create them, is no longer valid)
• INFN-NAPOLI: power cut during the weekend (14th Jul); took 3 days to fully recover (back in production on Tue 17th)
• DESY-ZN (20th-23rd): lost credentials during SE update
ATLAS internals
• US cloud was draining jobs over the weekend (21-22 Jul), most probably because the MaxTime attribute changed in SchedConfig (formerly a static attribute filled in by hand, now the one collected from the BDII via AGIS). Alden reverted to the old values and will now dump both sets of values to look for discrepancies and possible solutions
• Some GroupProduction tasks tried to run over ESD on tape (there is no longer a DISK copy at T1). This should not be done; they should run on RAW. Nurcan has been contacted
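The planned dump of MaxTime discrepancies could look roughly like the sketch below. It is only an illustration: the function name, queue names and values are hypothetical placeholders, not real SchedConfig, AGIS or BDII data, and no real API of those systems is used. It assumes each source has already been exported to a plain mapping of queue name to MaxTime.

```python
def maxtime_discrepancies(static_values, collected_values):
    """Map queue name -> (static, collected) wherever the two sources differ.

    A queue missing from one source is reported with None on that side.
    """
    diffs = {}
    for queue in sorted(set(static_values) | set(collected_values)):
        old = static_values.get(queue)
        new = collected_values.get(queue)
        if old != new:
            diffs[queue] = (old, new)
    return diffs


# Placeholder data, invented purely for illustration.
static_values = {"QUEUE_A": 86400, "QUEUE_B": 172800}
collected_values = {"QUEUE_A": 86400, "QUEUE_B": 0, "QUEUE_C": 259200}

for queue, (old, new) in maxtime_discrepancies(static_values, collected_values).items():
    print(f"{queue}: static={old} collected={new}")
```

Listing queues that exist in only one source, rather than skipping them, makes entries that were never filled in on one side visible as well.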