This summary provides an overview of support-related events in the past 6 weeks, including the number of real alarm tickets, their status, and the most challenging case encountered. It also mentions the GGUS monthly release and the analysis of test alarms.
Support-related events since last MB
There were 7 real ALARM tickets since the 2011/11/29 MB (6 weeks): 5 submitted by ATLAS, 1 by CMS, 1 by LHCb.
All ALARM tickets concerned CERN. All of them are in status ‘solved’; most are also ‘verified’.
The most difficult case was an LSF problem that occupied supporters for 3 days (including a long debugging session on a Saturday night); its root cause was never understood by CERN or Platform engineers.
The GGUS monthly release took place on 2011/12/07. 18 test ALARMs were issued and analysed in Savannah:124732.
Details follow…
WLCG MB Report WLCG Service Report
ATLAS ALARM->CERN LFC restart required GGUS:77049
ATLAS ALARM->CERN LSF slow response GGUS:77065
ATLAS ALARM->CERN LFC sessions locked for long GGUS:77069
CMS ALARM->CERN Oracle sessions time-out GGUS:77142
LHCb ALARM->CERN DIRAC host unreachable GGUS:77246
ATLAS ALARM->CERN passwd change breaks important T0 web service GGUS:77467
ATLAS ALARM->CERN LSF batch down GGUS:77547