Network/Controls issue 15 September 2011 1h00 – 9h30

Network/Controls issue15 September 20111h00 – 9h30 P.Charrue on behalf of the BE/CO group P.Charrue - 8h30 LHC meeting

Report from eLogbook & CO experts • 1h07: Task 'Make LHC.USER.INJ resident’ in sequencer cannot reach the end of its execution • Followed by several issues with controls software not running, DIAMON giving many reds alarms, and several servers are unreachable • 1h17: TI calls IT firstline who replied that it does not seem to be their fault • Around 2h00: several CO experts were called (Greg, CO piquet, Mark and Enzo) • Enzoarrived in CCC and started the investigations • Several servers could not connect to the NFS world • Some machines (cmw1, jas5 and diam1) where fully working while others only partially • All faulty servers were connected to the IT/CS IP57 switch: "P874-RV-IP57-SHP3L-4115" which appeared OK by IT/CS expert remote diagnostics • All CO efforts were concentrated to check the NFS exports, faulty servers and working servers to identify the cause of a possible network bandwidth overflow • No evidence were found that controls servers were contributing to the problem • 8h00: CO calls again IT/CS (E. Martelli) to ask for them to recheck their diagnostic • 9h30: decision was taken to reboot the switch: "P874-RV-IP57-SHP3L-4115" . • The reboot of the switch unblocked all servers and every system was able to reach normal operational state P.Charrue - 8h30 LHC meeting

The IP57 switch P.Charrue - 8h30 LHC meeting

Impacted servers • CS-CCR-CMW1, CS-CCR-QPS12, CSM-CCR-QPS23, CS-CCR-QPS23, CSM-CCR-QPS12, CSM-CCR-DIAM1, CS-CCR-DIAM1, CSM-CCR-DIAM2, CS-CCR-DIAM2, CS-CCR-Q4DS7, CSM-CCR-Q4DS7, CS-CCR-INCA2, CS-CCR-TIM10, CS-CCR-MONI, CS-CCR-PVSS4, CS-CCR-LSA1, CS-CCR-INCA4, CS-CCR-INCA3, CS-CCR-JAS5, CS-CCR-TECRG1, CS-CCR-RAMS2, CS-CCR-COCA P.Charrue - 8h30 LHC meeting

First analysis • The origin of the problem which put the switch in a bad state is not yet found • From our investigations (A. Bland) the problem is in no way related to ICMP, UDP or TCP protocol, it is only related to packet size or switch buffer availability • We are certain that the Procurve 3400CL uplink to the IP57 switch interface of the CCC/CCR router was working perfectly • According to IT/CS remote diagnostics the IP57 network switch seemed OK but the reboot of this switch solved instantly all problems • IT/CS is now closely collaborating with BE/CO studying all diagnostic traces to find the explanation P.Charrue - 8h30 LHC meeting

How to react next time • When a network issue is suspected, IT/CS has requested that their network piquet must be called in to the CCC by TI • IT/CS has now added extra diagnostics from their IP57 switch, to have more information in case it happens again P.Charrue - 8h30 LHC meeting

Network/Controls issue 15 September 2011 1h00 – 9h30

Network/Controls issue 15 September 2011 1h00 – 9h30

Presentation Transcript

Information Systems Control

Security+ Guide to Network Security Fundamentals, Third Edition

Export Controls: In the Trenches: Hands-On Approach to Export Compliance

INFS7004 Accounting Information Systems

Controls on the manufacture and distribution of veterinary medicinal products (VMPs) in the EU

Internal Controls in a computerized environment

St Andrew’s Parish Church 11 September 2011

Special Education Leadership Network November 10, 2011

Montréal Qu é bec 13 September 2011

Security in Computing Chapter 7, Security in Networks

Bypassing Client-Side Controls

Kaan Yücel M.D., Ph.D . 13. 15.September. 2011 Thursday

Kaan Yücel M.D., Ph.D . 13. 15.September. 2011 Thursday

September 14-16, 2011

September 2011 Intake Course Induction Teng Kwok Kheong

UK and Global Real Estate Trends – Opportunities for 2011/2012 20 September 2011, London

MODULE 4 Votre plan d'action et les outils

September, 2011 Concord, NH Lynn Fielding

Chapter 7 –Security in Networks

September 21, 2011

EXPORT CONTROLS AND EMBARGOES : The Challenge for U.S. Universities