Orbit Feedback Issues: Status and Plans
LBOC – 17/04/2012
JJ Gras, with input from Laurette, Rudiger, Mike, Joerg, Gianluigi, Maxim, Ralph and Lars
Issue #1: Wrong Orbit References
• Intermittent problems with orbit feedback:
  • H or V reference in the OFC zeroed for both beams. First observed on 21/3 at 00:59, Fill #2478, during the squeeze.
Issue #1: The Solution
• The problem has been identified: it is caused by concurrent threads running on the OFSU, which introduce wrong data while the reference orbits are being sent from the sequencer to the OFSU.
• This race condition was already present last year, but we were much less sensitive to it (use of base orbit + bumps / no feedback on IR BPMs).
• A new version of the server is ready for testing.
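The OFSU code itself is not shown in these slides, so the following is only a minimal Python sketch of the general kind of fix for such a race: the complete new reference is built first and then published in one step under a lock, so a concurrent reader never sees a half-written reference. `ReferenceStore`, `demo_concurrent_updates` and all parameters are hypothetical names chosen for illustration, not part of the OFSU or FESA.

```python
import threading


class ReferenceStore:
    """Hypothetical holder for one reference orbit (not the OFSU API)."""

    def __init__(self, n_bpms):
        self._lock = threading.Lock()
        self._reference = tuple(0.0 for _ in range(n_bpms))

    def set_reference(self, values):
        # Build the complete new reference first, then publish it in one
        # step while holding the lock, so no reader sees a partial update.
        new_reference = tuple(float(v) for v in values)
        with self._lock:
            if len(new_reference) != len(self._reference):
                raise ValueError("reference length does not match BPM count")
            self._reference = new_reference

    def get_reference(self):
        # Readers take the same lock and receive an immutable snapshot.
        with self._lock:
            return self._reference


def demo_concurrent_updates(n_bpms=8, n_updates=200):
    """Write references from one thread while the caller reads them.

    Returns the number of snapshots that mixed values from two updates.
    """
    store = ReferenceStore(n_bpms)
    store.set_reference([1.0] * n_bpms)
    torn = 0
    stop = threading.Event()

    def writer():
        for i in range(1, n_updates + 1):
            store.set_reference([float(i)] * n_bpms)
        stop.set()

    thread = threading.Thread(target=writer)
    thread.start()
    while not stop.is_set():
        snapshot = store.get_reference()
        # Every update writes the same value to all BPMs, so a consistent
        # snapshot contains exactly one distinct value.
        if len(set(snapshot)) != 1:
            torn += 1
    thread.join()
    return torn
```

Calling `demo_concurrent_updates()` should report no torn snapshots, because each reader sees either the old or the new reference as a whole.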
Issue #1: Mitigation Measures
• Several mitigation measures were introduced:
  • Tasks were added to the sequencer asking the EiCs to confirm that they have checked that the reference is correct before starting the ramp and before starting the squeeze.
  • As already done earlier, the SIS will detect if an active reference is zero and dump the beam (Joerg).
  • A task was added to the sequencer to check that the references are correct (Roman).
  • The EiCs and operators will observe the orbit during ramp and squeeze, and dump the beams in case of problems.
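The zero-reference check can be sketched as follows. This is an illustrative sketch only, not the SIS or sequencer implementation: the data layout (a dictionary keyed by beam and plane), the function names and the `tolerance` parameter are assumptions.

```python
def reference_is_zeroed(reference, tolerance=0.0):
    """Return True if every value in a non-empty reference is zero.

    `tolerance` is a hypothetical parameter for near-zero values.
    """
    return len(reference) > 0 and all(abs(v) <= tolerance for v in reference)


def check_references(references):
    """Return the (beam, plane) keys whose reference is missing or zeroed.

    `references` is assumed to map (beam, plane) to a list of reference
    positions, e.g. ("B1", "H") -> [0.12, -0.05, ...].
    """
    bad = []
    for key in [(beam, plane) for beam in ("B1", "B2") for plane in ("H", "V")]:
        ref = references.get(key)
        if ref is None or len(ref) == 0 or reference_is_zeroed(ref):
            bad.append(key)
    return bad
```

A non-empty result from `check_references` would correspond to the condition on which the operator is alerted or the beam is dumped.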
Issue #2: OFSU Server Crash
• Another failure mode: the OFSU crashing during the squeeze.
• This problem is unrelated to the previous one.
• The OFSU reconnected automatically, but not properly.
Issue #2: The Solution
• The problem has been identified and has two parts:
  • Crash: a memory leak that causes the server to crash after a couple of days has been identified. The new version of the server includes the fix.
  • Wrong settings: persistency issues on reference variables. There is no straightforward solution, as FESA cannot guarantee that the latest variables are recovered in all cases after a restart. The new version will significantly reduce the probability of this, but we should concentrate on avoiding crashes during beam time.
Issue #2: Mitigation Measures
• Several mitigation measures are being introduced:
  • A task will be added to the sequencer to restart the server before each fill and thus avoid crashes linked to the memory leak. In the meantime, the OP crew has been instructed to do it manually.
  • An SIS interlock checking that the OFSU server is alive (during Ramp or Squeeze mode) has been deployed and tested successfully. It will dump the beam if it has received nothing from the OFSU for more than 10 s.
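The alive check described above amounts to a heartbeat watchdog with a 10 s timeout that is only active in certain machine modes. The sketch below illustrates that logic under stated assumptions; it is not the SIS implementation, and the class name, the mode strings `"RAMP"` and `"SQUEEZE"`, and the injectable `clock` are hypothetical.

```python
import time


class HeartbeatWatchdog:
    """Hypothetical watchdog: request a dump if the server stays silent."""

    ACTIVE_MODES = ("RAMP", "SQUEEZE")  # assumed mode names

    def __init__(self, timeout_s=10.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested
        self._last_seen = clock()

    def heartbeat(self):
        # Called whenever any sign of life is received from the server.
        self._last_seen = self._clock()

    def should_dump(self, mode):
        # The interlock is only armed during ramp and squeeze.
        if mode not in self.ACTIVE_MODES:
            return False
        return (self._clock() - self._last_seen) > self._timeout_s
```

Using a monotonic clock avoids spurious triggers if the wall-clock time is adjusted while the watchdog is running.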
Considerations on Risks (Rudiger)
• ‘The probability of an asynchronous beam dump, e.g. by kicker firing, is low. The probability of an asynchronous beam dump when a dump is triggered by the BIC is also low (such an event has not yet been observed).
• As pointed out by Ralph, even if the orbit is not correct we do not expect catastrophic damage (LHC out for many months) in case of such a failure.’
Agreed Plan
• Since the mitigation measures were put in place (i.e. last Wednesday), no OFSU issues have been observed.
• So the plan is:
  • Finish implementing the proposed mitigation measures (missing: proactive restart of the server in the sequencer – [Laurette, in progress]).
  • Stress test the OFSU development version in the lab [Maxim, in progress].
  • Deploy the new server during the technical stop and continue stress tests on the machine via the sequencer.
  • Commission the new server with beam after the TS.
  • Then continue close monitoring, keeping the mitigation measures in place for the moment.