Stay updated on the status of CMS ECAL installation and commissioning, highlighting key recommendations for improvement based on recent readiness reviews.
ECAL Readiness Reviews Roger Rusack
Some Context The last detector to be installed into CMS was the ECAL preshower (ES), earlier this year. We have had delays with the ES DCCs and only recently got a full set for commissioning. We have had significant delays with the delivery of the EE-TCCs (TCC-48), and were able to begin commissioning the EE+ this September and the EE- in October. The barrel detector has been commissioned and has been working smoothly with the trigger for most of this year. All trigger commissioning has been done at P5, some of it in local runs, but many tests needed the full RCT/GCT/GT.
Four Readiness Reviews • Trigger/DAQ – September 16th • P. Klabbers, P. Rumerio, C. Schwick • DCS and Safety Systems – October 14th • N. Bacheta, R. Loveless, F. Hartmann, J. Spalding • Calibration and Monitoring – October 16th • G. Cerminara, L. Malgeri, T. Cox • Databases – October 21st • G. Della-Rica, R. Egeland, A. Pace
Trigger/DAQ Readiness Review • Goal: to assess whether existing resources (manpower, time, etc.) and plans can solve the existing issues in ECAL DAQ and Trigger in time for the first LHC run. • Morning session on the Trigger/DAQ hardware; afternoon session on software and common operations. • The panel gave us their 30 recommendations within 24 hours of the review.
Recommendations (Highlights) • Critical: • EE- trigger commissioning, and availability of ECAL experts and central systems to do so. • Moving forward slowly: SLB errors, RCT masking, etc. • Trigger masking of single crystals, and L1T DQM plots to show which ones are masked. • Masking is in place but needs to be tested. • EB TCC synchronization loss: must be fast to detect. • DQM plots are in place. • ES/ECAL completely configured from the DB. • ECAL 99%, ES not yet. • More extensive on-site presence of experts despite the planning unknowns. • Additional competent manpower under Pascal Paganini on the TCC and Jose on the DCC. • TCC yes, DCC not yet.
Recommendations (Highlights) • High priority • Selective readout at high rate. • Done: 90 kHz with 3% back pressure. • Change SR thresholds to decrease the average event size in the FED. • DCU/L1A contention: • We are working on it. • Configuration DB editing tools • Unification of tools among ES and EB+EE • This is moving slowly due to a lack of available experts. • The need for a TIF-like facility at B904. • We completely agree! • Non-event data monitoring and alarming • Documentation • Teaching and improving the competence of the shift crew.
Selective Readout Probability to have a DCC event size (in kb) greater than a given size in ZS mode. Probability to have a DCC event size (in kb) greater than a given size in SR mode, when reading out a 3 × 3 ROI. Raise the selective readout thresholds from 375 MeV in a trigger tower (noise) to 2 GeV to read out 3x3 and 1 GeV to read out 1x1. Long term: update the output buffer of the DCC to handle the long tails in the event size distribution.
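The quantity plotted on this slide, the probability that a DCC event size exceeds a given size, is an empirical exceedance (survival) probability. A minimal Python sketch of how such a curve could be computed from a list of measured event sizes is given here; the function name `exceedance_probability` and the toy sizes are illustrative assumptions, not ECAL data or ECAL software.

```python
from bisect import bisect_right


def exceedance_probability(sizes, threshold):
    """Fraction of events whose size is strictly greater than `threshold`."""
    if not sizes:
        raise ValueError("need at least one event size")
    ordered = sorted(sizes)
    # bisect_right counts the events with size <= threshold.
    n_not_greater = bisect_right(ordered, threshold)
    return (len(ordered) - n_not_greater) / len(ordered)


# Hypothetical toy event sizes (arbitrary units), NOT measured values.
toy_sizes = [1.0, 1.5, 2.0, 2.0, 3.5, 6.0, 9.0, 12.0]

# One point of the curve per threshold of interest.
curve = [(t, exceedance_probability(toy_sizes, t)) for t in (0.0, 2.0, 5.0, 10.0)]
for threshold, prob in curve:
    print(f"P(size > {threshold}) = {prob:.3f}")
```

A long tail in the size distribution shows up as an exceedance probability that falls slowly at large sizes, which is the regime the output buffer has to absorb.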
DCS and Safety Systems: • Review to address the following points: • Is the ECAL safety system sufficient to allow safe running from LHC re-start? • Is the status of the ECAL DCS sufficiently evolved (easy to use, documented etc.) to allow central CMS supervision during normal operation? • Are the resources available to make necessary improvements before LHC restart? • Reviewers: • Dick Loveless, Frank Hartmann, Jeff Spalding, Nicola Bacchett
Control Systems and Safety Systems • Are the ECAL safety systems ready for 24/7 operation? • Yes, with some critical items (next slides) • Is the DCS for EB/EE and ES ready for 24/7 operation? • EB/EE/ES Yes • Is the documentation sufficient and complete? • A good start, but more is needed in one central place • What improvements need to be made before we start running? • See next slides • Is the DCS sufficiently stable for central supervision (with ECAL experts on call)? • EB/EE Yes; ES no (more experience needed & another review)
DCS and Safety Systems • Critical Items • EB/EE humidity limit is at 60% due to hardware limitations; this corresponds to a dewpoint close to the operating temperature. • The humidity monitoring system was not designed to be a safety system or a back-up leak detection system. We are investigating possible increases to the sensitivity, but they will not be implemented until next year. • EB leak detection not operational. • The inside of the detector is not accessible now. • Remove TWIDOS from the safety action matrices: the EB/EE/ES safety action matrices include many operations on the TWIDOS, but these will not be implemented for several months (if ever). Safety matrices must focus on what is currently implemented and on critical items that will be implemented in the coming weeks. • Agreed
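The relation between a relative-humidity limit and the dewpoint can be made concrete with the Magnus approximation. The sketch below is a generic calculation, not the ECAL DCS implementation: the function name `dew_point_c`, the Magnus constants (17.62 and 243.12 °C, one commonly used parameter set) and the air temperatures in the loop are all assumptions chosen for illustration, since the slides do not give the air temperature at the sensors.

```python
import math

# Magnus approximation constants (one common parameter set; an assumption here).
MAGNUS_B = 17.62
MAGNUS_C = 243.12  # degrees Celsius


def dew_point_c(air_temp_c, rel_humidity_pct):
    """Approximate dewpoint in deg C from air temperature and relative humidity."""
    if not 0.0 < rel_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = math.log(rel_humidity_pct / 100.0) + MAGNUS_B * air_temp_c / (MAGNUS_C + air_temp_c)
    return MAGNUS_C * gamma / (MAGNUS_B - gamma)


# Hypothetical air temperatures; the 60% limit is the only number taken from the slide.
for air_temp in (18.0, 22.0, 26.0):
    print(f"air {air_temp:.0f} C, RH 60% -> dewpoint {dew_point_c(air_temp, 60.0):.1f} C")
```

The warmer the air at a fixed relative humidity, the higher the dewpoint, so the margin to condensation on a cooled surface depends on the local air temperature as well as on the humidity limit.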
DCS and Safety Systems • Critical Items: • Need for clear instructions for shifters (e.g. in what circumstances should the red button be pressed?). • Yes, we are doing this. • Rack safety systems must be implemented CMS-wide. • We agree. • Intervention Procedures: • Interventions on PLCs now follow a strict procedure of notification, intervention, full testing – this is seen as excellent and mandatory • Similar procedures must be implemented for any DCS intervention that may affect safety • Similar procedures must be implemented for Central DCS interventions
Calibration & Monitoring • Charge: • Is the calibration and monitoring system ready for operation (HW & SW)? • Is the documentation both sufficient and complete? • What improvements need to be made before we start running? • Is the manpower in place?
Review Comments: • One of the reviewers went down with pneumonia afterwards. We received the full text of the comments on Monday and are still digesting them. • Readiness: • Precision of 0.5% does not seem to be a long-term problem. • Calibration and monitoring system IS ready for LHC, but it is fragile. • Documentation: • Increase the importance attached to this. • Things we could do better: • Understand better the interplay between the different calibration systems. • Several workflows are not yet fully commissioned (LEDs etc.).
Recommendations: • Need for a senior person to provide oversight of the whole monitoring and calibration system. • Response: This is already the responsibility of the DPG conveners. • Give institutional responsibility for tasks where possible: • Response: Yes. • Ensure DB aspects of the workflows. • See DB review.
Database Review • The charge for the review committee is to address the following questions: • Are the online and offline databases that we use for data collection ready for operations? • Is the documentation of both sufficient and complete? • How is the database group organized, and what is the distribution of responsibilities within the group? • What improvements need to be made before we start running? • What is the plan to prepare for operations, and what is the plan for when operations have started? • Are there sufficient resources, both computing and personnel, to carry out these tasks? • Review Committee: • G. Della-Rica (PH), A. Pace (IT) and R. Egeland (PH)
DB Review – Comments and Recommendations • Critical loss of Francesca Cavallari to the database operations. • Response: Organtini will continue as online DB coordinator and we are in discussions with another institution for the offline DB. • Need to scale test every application that touches the database. • Response: These tests have been done after the previous review with help from IT. • Develop a plan for non-destructive upgrades of each component during LHC running. • Response: We will use the test DB at P5.
DB Review – Comments and Recommendations • Escalate problems with PVSS database access to PVSS support at CERN and the EN/ICE group. Use the CERN openlab collaboration with Siemens, which has resources to resolve the issues we are facing. • Response: In discussion with F. Glege and M. Janulis, who are working on improvements to PVSS-DB access. • ECAL has very large data volumes in single tables in the DB (40M entries); we are partitioning them. • If that does not resolve the problems we will work with IT as suggested. • Develop contingency plans for doing less if manpower shortages are not resolved.
Trigger Thresholds • We have been routinely running at 90 kHz with 3% dead-time. • We have had a 375 MeV threshold on a trigger tower in the RP. This induces a readout of that tower plus the eight around it. • We are now moving to ‘collision settings’, which will allow us to lower this number: • If Et > 1 GeV, read the trigger tower • If Et > 2 GeV, read the 3x3 square of towers around the seed tower • GCT moved to EG2 as our lowest trigger, with EG1 pre-scaled by 100.
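The two-threshold rule on this slide can be sketched as a small selection function over a grid of trigger-tower Et values. This is an illustration under stated assumptions, not the actual selective readout firmware or software: the function name `towers_to_read`, the rectangular grid with hard edges (no wrap-around in phi) and the toy Et values are hypothetical. Only the thresholds (375 MeV, 1 GeV, 2 GeV) come from the slides.

```python
def towers_to_read(et_grid, single_threshold=1.0, neighbour_threshold=2.0):
    """Return the set of (row, col) towers selected for readout.

    Et > neighbour_threshold: select the tower and its 3x3 neighbourhood.
    Et > single_threshold (but not above the other): select the tower only.
    Thresholds are in GeV. Grid edges are clipped (an assumption).
    """
    n_rows = len(et_grid)
    n_cols = len(et_grid[0]) if n_rows else 0
    selected = set()
    for i, row in enumerate(et_grid):
        for j, et in enumerate(row):
            if et > neighbour_threshold:
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < n_rows and 0 <= nj < n_cols:
                            selected.add((ni, nj))
            elif et > single_threshold:
                selected.add((i, j))
    return selected


# Hypothetical 4x4 patch of trigger-tower Et values in GeV (not real data).
toy_et = [
    [0.1, 0.2, 0.1, 0.0],
    [0.3, 2.5, 0.2, 0.1],
    [0.1, 0.4, 0.1, 1.4],
    [0.0, 0.1, 0.2, 0.1],
]

collision = towers_to_read(toy_et)  # 1 GeV / 2 GeV settings
# Earlier setting: a single 375 MeV threshold that always pulls in the 3x3 block.
noise_level = towers_to_read(toy_et, single_threshold=0.375, neighbour_threshold=0.375)
print(len(collision), len(noise_level))
```

On this toy patch the higher thresholds select fewer towers than the single 375 MeV threshold, which is the mechanism by which the average event size is reduced.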
Commissioning • EE Trigger is not commissioned yet. • We were late with the EE-TCC boards and they still need to be integrated into the readout. Still many problems. Commissioning time is at a premium. • Still commissioning the EE LED system: • HW completed, debugging going on. • LV problem in ES and EE+ • These are two new problems, at 7 o’clock from the IP • ES status • All ES-DCCs now operate at 100 Hz. • Timing-in looks close to final.
Bad Channel List • EB: • 64 (0.1%) isolated dead channels or VFE LV problems • One trigger tower dead. • EE: • One without low voltage. • One super crystal with LV problem NEW • ES • One control ring with LV - Maybe recoverable. NEW • 5 Sensors not operational – three from installation. • No data – only trigger information. • 11 trigger towers in EB • 2 super crystals in EE
Where Are We Now? ECAL is ready for the beam splash events. The problems that we must resolve before collisions are the endcap trigger commissioning, SLB link RX errors and trigger timing. The result of the reviews is that we are ready for the long term, but we are ‘fragile’ in several places. We remain paranoid.