Explore the reliability challenges, maintenance strategies, and system architecture of cryogenic systems at CERN, based on extensive operational experience and insights from LEP2 and LHC. Discover key components, failure risks, and maintenance policies for optimal system performance.
ISSUES CONCERNING THE RELIABILITY OF THE CRYOGENIC SYSTEM
M. Sanmarti / AT-ACR
Outline
• From LEP2 to LHC
• LHC cryogenics system architecture (redundancy)
• Reliability of sub-systems and components
• Maintenance policy and shut-down strategy
• Conclusions
Previous considerations
• Most of the main cryogenic components have been extensively used at CERN
• Considerations are based on experience more than on detailed failure risk analysis
• LEP2 and first LHC commissioning experience
• Availability, failures and MTBFs related to beam (LEP2) and beam commissioning (LHC)
• Major or first-order failures: “something that breaks or something that does not work as expected”
LEP2 experience
[Figure: LEP2 operating phases: sub-system or hardware commissioning, beam commissioning…]
• More than 120,000 cumulated running hours
• Recovery after utility failures: downtime < 2%
• Cryo impact: < 1%
• De-icing: reduced cooling capacity, time used for MD
Learning from LEP2
• Components (impact on machine):
  • Main component failures during commissioning or restart after SD
  • Cryogenics downtime (not including utilities): < 1.0%
  • MTBF Cryo system 0.1 years, MTBF Cryoplant 0.4 years, MTTR 1-2 hours
  • Cold boxes (MTBF years): instrumentation and turbines very reliable
  • Compressor stations:
    • Mainly aging problems on instrumentation/piping (MTBF 0.5 years)
  • Controls: dedicated and robust control system was almost transparent
• Distribution & RF cavities:
  • Mainly beam-related issues (heat load) affecting cooling capacity
  • Access needed although no urgent intervention required (key components in RA)
• Impurities (de-icing):
  • Gaseous impurities at warm turbines level (120 K & 90 K)
  • Predictable: time used for MD or interventions
• Maintenance:
  • Extensive preventive maintenance campaign during SD periods
  • Corrective: MTTR < 1-2 hours but amplified impact on machine (x7)
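The MTBF, MTTR and x7 figures quoted for LEP2 can be related to a downtime fraction with the usual steady-state formula, unavailability = downtime / (MTBF + downtime). The sketch below is only an illustration: the function name, the 8760 h/year conversion (continuous running) and the choice to apply the x7 factor as a multiplier on MTTR are assumptions, not something stated on the slide.

```python
HOURS_PER_YEAR = 8760.0  # assumption: continuous running all year


def unavailability(mtbf_years, mttr_hours, amplification=1.0):
    """Steady-state downtime fraction for one repairable system.

    amplification scales the repair time into the machine downtime
    (assumed here to model the "x7" impact quoted for LEP2).
    """
    mtbf_hours = mtbf_years * HOURS_PER_YEAR
    downtime_hours = mttr_hours * amplification
    return downtime_hours / (mtbf_hours + downtime_hours)


# LEP2 cryo system figures: MTBF 0.1 years, MTTR 1-2 hours, impact x7
for mttr in (1.0, 2.0):
    raw = unavailability(0.1, mttr)
    amplified = unavailability(0.1, mttr, amplification=7.0)
    print(f"MTTR {mttr:.0f} h: {raw:.2%} raw, {amplified:.2%} with x7")
```

Under these assumptions, an MTTR of 1 hour with the x7 factor gives about 0.8%, the same order of magnitude as the quoted cryogenics downtime of < 1%.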
From LEP2 to LHC cryogenic system
[Figure: LEP2 and LHC ring layouts, points 1 to 8]
• LEP2: 4 x 12/18 kW @ 4.5 K; 288 SC RF cavities; 2 km @ 4.5 K; 75 tons @ 4.5 K
• LHC: 8 x 18 kW @ 4.5 K and 8 x 2.4 kW @ 1.9 K; 1,800 SC magnets; 24 km @ 1.9 K; 36,000 tons @ 1.9 K
• LEP: 1500 I/O channels, 8 compressors, 7 turbines per point (4 points)
• LHC: 9000 I/O channels, 16 compressors, 20 turbines per point (5 points)
The LHC cryogenic architecture per point
• Built-in redundancy
• Weak point: sector 2-3
Major (sub-systems) failures
[Figure: cryogenic architecture diagram with DFB labels]
Unlikely to occur during life-cycle, but possible!
• One 4.5 K refrigerator or one 1.8 K unit out of order: low intensity OK (beam commissioning OK), BUT transition ≈ 12 to 24 hours
• Common parts (QUI-QRL-DFB), loss of isolation vacuum: total stop of the machine
4.5 K & 1.8 K Refrigerators
• Instrumentation: high reliability and spares, MTTR ~1-2 h
• Warm compressors: no redundancy, but spare capacity or connection to adjacent refrigerators would allow degraded mode (low intensity):
  • Oil piping: if spares, MTTR 1-2 days
  • Motor/compressor replacement??
• Turbines: no spares at the moment; diagnosis + 5 h intervention delay if a spare is available, otherwise degraded mode allows continuation of tests
• Cold compressors: spares available, diagnosis + 5 hours delay, no degraded mode allowed
• Impurities:
  • Dryers (H2O), switchable adsorbers (air, 80 K), single adsorber (H2, 20 K)
• Vacuum (leaks): temporary solution until major intervention during SD
• Access constraints: underground and UX4, UX6, UX8 for QURC (1.8 K)
• From the cooling capacity point of view, such failures should not affect beam commissioning (spares, redundancy, adjacent refrigerator), but the operational constraints and the recovery time will increase
• Degraded modes could be a problem for the scrubbing run
QUI (Interconnecting Box)
[Figure: filters in CFB (Magnets Test Bench), Shut Down 2004-2005]
• Instrumentation:
  • Redundancy on control loop and QRL interface sensors
• Cryogenic valves (no redundancy): high MTBF
• Heater (warm-up): redundancy but degraded mode, longer warm-up
• Vacuum (leaks): temporary solution until major intervention during SD
• Impurities (solid): possible clogging of the QUI filter (line D), stopping the cooling flow
  • It would mainly happen during the cool-down and the first few quenches
  • It requires 1-2 days to replace the filter and reach nominal conditions again
• From the functionality point of view:
  • The QUI assures the redundancy of the refrigerators
  • Clogging of the line D filter is the most likely failure to occur
  • No redundancy for cryogenic valves of QRL interfaces
  • Any intervention needs underground access: UX4, UX6 & UX8
QRL and Ring equipment I
• QRL
  • Instrumentation:
    • Redundancy or degraded mode possible
  • Most cryogenic valves are redundant (degraded mode):
    • In situ exchange: up to 1 week intervention depending on valve position
  • Quench valves (cool-down/fill): no redundancy for filling (once/year), security redundancy
• Beam screen:
  • Clogging problems (small Ø pipe): beam screen temperature??
  • Loss of instrumentation (heater & temp., no redundancy):
    • No temperature control, higher helium flow
    • Problems during “scrubbing” run
• DFBs, standalone magnets & DSLs:
  • Instrumentation:
    • HTS valves: easily repaired
    • HTS temperature: redundancy or other control options (valve characteristics against current)
    • Level gauges: redundancy or easily repairable (except for D2, D3)
  • DFB: presentation by A. Perin at this Workshop
QRL and Ring equipment II
• Dipoles & Inner Triplets:
  • Temperature sensors: redundancy (needs electronics replacement)
  • Other control options (opening valve characteristics, copying valve position of adjacent cells)
  • Level gauge of bayonet heat exchanger: liquid in line B and possible magnet temperature perturbation (operational issue affecting temperature control, not pumping capacity)
  • Isolation vacuum: presentation by P. Cruikshank at this Workshop
• RF cavities:
  • Instrumentation, valves as above
  • Pressure stability and protection during quench/quench recovery (presentation by S. Claudet)
From the functionality point of view:
• Reliability of primary components is high
• Replacement (redundancy) possible or degraded modes: less control (temp.) and higher helium consumption
• Any intervention needs access to the tunnel: radiation issues for IT??? (OK for BC)
LHC experience
• New LHC 4.5 K cryoplant @ PM18 (2002-2004):
  • No major gaseous impurities problems (solid impurities filters in CFB, MTB)
  • Availability about 99% (50% utilities/cryo) for 20000 cumulated running hours with degraded modes or spare capacity
• String2 experience (2002-2003):
  • 98.5% availability over 4170 h (2002) & 98.6% availability over 1950 h (2003)
  • Prototype/commissioning: mainly tuning, quench recovery and controls
  • No major problems with instrumentation or beam screen circuit
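To give a feel for the String2 figures, the availabilities can be converted into hours of downtime and merged into a single weighted figure. This is a minimal sketch that assumes "availability" means the fraction of the quoted running hours during which cryogenics was available; the function and variable names are illustrative only.

```python
def downtime_hours(availability, running_hours):
    """Hours lost, assuming availability is a fraction of the running hours."""
    return (1.0 - availability) * running_hours


# String2 runs quoted above: (availability, running hours)
runs = {2002: (0.985, 4170.0), 2003: (0.986, 1950.0)}

total_hours = sum(hours for _, hours in runs.values())
total_down = sum(downtime_hours(a, h) for a, h in runs.values())
combined_availability = 1.0 - total_down / total_hours

for year, (a, h) in runs.items():
    print(f"{year}: {downtime_hours(a, h):.1f} h lost over {h:.0f} h")
print(f"combined: {combined_availability:.2%} over {total_hours:.0f} h")
```

With this reading, the two runs correspond to roughly 63 h and 27 h of downtime, and the hour-weighted availability stays close to 98.5%.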
Utilities Failure Recovery (L. Serio @ Chamonix 2003)
Cryogenics is a recovery time amplifier
LEP contractual recovery time < 5.5 hours + 7 x stop duration
LHC estimated recovery time < 6 hours + 3 x stop duration
• Controls:
  • Completely new control system (still design problems)
  • Ethernet dependent (control loops, PLC communication)
• Recovery performances:
  • Recovery predictions still have to be validated for the global system during hardware commissioning
  • Degraded modes will increase recovery time
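The two recovery bounds share the same form, a fixed overhead plus an amplification factor times the stop duration, so they can be compared directly. The sketch below simply evaluates both bounds for a few stop durations; the function and dictionary names are illustrative, and the stop durations are arbitrary example values.

```python
def recovery_bound_hours(stop_hours, fixed_hours, amplification):
    """Upper bound on recovery time: fixed overhead + amplified stop duration."""
    return fixed_hours + amplification * stop_hours


LEP = {"fixed_hours": 5.5, "amplification": 7.0}  # contractual bound quoted for LEP
LHC = {"fixed_hours": 6.0, "amplification": 3.0}  # estimated bound quoted for LHC

# Example stop durations in hours (arbitrary illustration values)
for stop in (0.125, 1.0, 4.0):
    lep = recovery_bound_hours(stop, **LEP)
    lhc = recovery_bound_hours(stop, **LHC)
    print(f"stop {stop:5.3f} h: LEP < {lep:5.2f} h, LHC < {lhc:5.2f} h")
```

The two bounds cross at a stop of 0.125 h (7.5 minutes); for any longer utility stop, the LHC estimate is the lower of the two, for example 9 h against 12.5 h after a 1-hour stop.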
Maintenance Policy
Presentation by T. Pettersson at this Workshop
• Existing maintenance plan to be upgraded (LEP), completed (new LHC installations) and everything to be implemented in CERN CAMMS
• Based on preventive maintenance campaign during SD
  • Baseline: 13 weeks for full maintenance campaign
  • Issues arising: safety valves (5000 u.) every 2 years, inducing corrective maintenance
• Spare parts: first batch after commissioning using industrial method for criticality analysis, ~2.2% of cryoplant cost (280 kCHF for 4.5 K refrigerator)
  • Assures MTTR of 1-2 hours
  • No spare for turbines, warm compressors/motors…
• Maintenance management:
  • No CERN resources for execution
  • Maintenance management needs to be reinforced and fully driven by CERN
  • Manpower management depending on SD scenarios
Shut Down Strategy (16 weeks from Chamonix’04)
• Scenario 1: full maintenance & floating temperature (T~200 K)
  • Keeps preventive/corrective ratio (LEP and present experience): same availability rates
  • Perturbations during cold check-out: corrective maintenance after SD
  • Requires ELQA if T > 80 K (+ 5 weeks during MCO)
  • Thermal cycling of components: helium leaks, welding stress…
• Scenario 2: maintenance on 1 cryoplant/point keeping sectors “cold”
  • Increases preventive/corrective ratio: reduced availability rates??
  • Lower risk of perturbations during cold check-out (after SD)
  • No need for additional 5 weeks for ELQA
  • No thermal cycling of components
  • Not possible in sector 2-3
• Utilities: driven by cooling water towers
  • 4 weeks per LHC point (2 points in parallel): to be reviewed for Scenario 2
• In any case, warm-up could be needed for replacement of ring components (magnet, etc.)
Conclusions I
• In principle, the cryogenic system should have a very low impact on beam commissioning (except in sector 2-3) because of:
  • Redundancy of systems
  • Available spare cooling capacity for low intensity beam
  • Reliability of components and instrumentation
• However, failure of sub-systems and components cannot be ruled out completely and could result in delays of a few days to switch to a redundant system or component and to adapt to the new configuration
• The worst failure would be the loss of insulation vacuum on the QUI or QRL, as well as the refrigerator in point 2 or the DFBs for magnet powering
• The most likely failure would be filter blockage on the QUI, due to accumulation of impurities, during or after the first cool-down and magnet quenches (consolidations under study)
• Recovery time after a major failure (utility or cryo) will be approximately 6 hours plus 3 times the stop length (15 times if bad vacuum/QRV leaks)
Conclusions II
• The cryogenic sub-systems will be individually tested, but the overall cryogenic system will certainly require complex and extensive commissioning prior to and during powering to validate the global and collective behavior and optimize operating modes
• The availability or quench recovery performance of the cryogenic system:
  • could be reduced by additional heat loads or non-conformities from commissioning
  • depends on correct maintenance management: it has already started and needs dedicated CERN resources!
Acknowledgements
Many thanks to:
• L. Serio
• S. Claudet
• G. Riddone
• R. Van Weelderen
• A. Perin
• P. Gomes
• Ph. Gayet
• N. Bangert
for their contribution to this presentation