
Control Room and Shift Operations: CMS

Control Room and Shift Operations: CMS. Greg Rakness (CMS Deputy Run Coordinator), University of California, Los Angeles. ATLAS Post-LS1 Operations Workshop, CERN, 24 June 2013. https://indico.cern.ch/conferenceDisplay.py?confId=256916


Presentation Transcript


  1. Control Room and Shift Operations: CMS Greg Rakness (CMS Deputy Run Coordinator) University of California, Los Angeles ATLAS Post-LS1 Operations Workshop CERN 24 June 2013 https://indico.cern.ch/conferenceDisplay.py?confId=256916

  2. The Compact Muon Solenoid (CMS) is as “big” as ATLAS Not as big in the linear dimension, but certainly as big in data volume (and author list)… So, some aspects of our situation may feel familiar… G. Rakness (UCLA)

  3. Personnel G. Rakness (UCLA)

  4. CMS shift hours • 3 shifts per day (in sync with the LHC shifts) • 07:00-15:00 “day” • 15:00-23:00 “evening” • 23:00-07:00 “night” G. Rakness (UCLA)
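The shift boundaries above can be written down as a small lookup. This is only a sketch: the function name `shift_for` is hypothetical, and the only facts taken from the slide are the three time windows.

```python
from datetime import time

def shift_for(t: time) -> str:
    """Return the shift name covering clock time t (boundaries from the slide)."""
    if time(7, 0) <= t < time(15, 0):
        return "day"
    if time(15, 0) <= t < time(23, 0):
        return "evening"
    # 23:00-07:00 wraps past midnight, so it is the fall-through case
    return "night"

print(shift_for(time(9, 30)))   # the daily 9:30 meeting falls in the day shift
```

Treating the night shift as the fall-through case avoids having to handle the wrap past midnight explicitly.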

  5. CMS shift crew in the p5 control room • Beginning of 2010: 13 shifters during normal operations • 8 subsystem shifters • 5 central shifters (see below) • 13 shifters x 3 shifts/day = 39 collaborators per day... • By the end of 2010: reduce from 13 to 5 “central” shifters • Shift Leader • Data Quality Monitor • Trigger • Data Acquisition • Detector Control System (DCS) • When not running (e.g., overnight during Technical Stops, or now), minimum shift crew required when equipment is powered on in the experiment cavern… • DCS (to monitor detector conditions) • Shift Leader (at least 2 people needed for personnel safety) G. Rakness (UCLA)

  6. CMS crew “beyond” the p5 control room • Run Field Managers • Set the daily and weekly run plan and facilitate its execution • Provide continuity from one shift to the next, advise shift leaders • Lead the daily run meeting • Communicate with Run Coordinators when issues or questions arise • Another way to think of this role is “super shift-leader” • Two Run Field Managers on-duty at all times • Term = 3 weeks • Detector On-Call (DOC) • If shift crew has a problem with a subsystem, they call the DOC… • 15 DOCs on-duty at all times, one per subsystem • Term ~ one week G. Rakness (UCLA)

  7. Candidates and training • We specify a “preferred profile” for each central shifter (note: this is not strictly enforced) • Run Field Manager: invited personally by run coordinator • Shift Leader: certain level of seniority and experience • DAQ: motivated to gain insight into a modern DAQ system • Trigger: interest in trigger logic and web-based services • DCS: experience with detector development, integration, and/or slow control • DQM: experience with data analysis and/or detector performance assessment • Training done separately for each shift role • Classroom training, some include practical test • One block of training shifts G. Rakness (UCLA)

  8. Filling Shifts • In order to fill central shifts, we had to separate the service work performed by CMS institutes into two categories • Shifts • Other service work • In order to achieve a satisfactory level of shifter experience, we require N shifts before a person’s credits apply as service work • Normally N 20 • Conversion from shifts to credits depends on the type of shifts taken • E.g., weekend = 1.25 credits, night = 1.5 credits, weekend night = 2.0 credits… • Shift sign-up blocks constrained by CERN rules • No more than 5 night shifts in a row • At least one day of rest in any 7-day period • Any two shifts must be separated by at least 16 hours • These rules are a (minor) source of complaints, mainly by those who travel to CERN specifically to perform shifts G. Rakness (UCLA)
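The credit conversion and the sign-up constraints above lend themselves to a small checker. The sketch below is illustrative and is not the CMS shift tool. All names are hypothetical, and it makes several assumptions: shifts are 8 hours long, a weekday day or evening shift is worth 1.0 credit (the slide only quotes the other three values), a shift is classified by its start time and start date, and "separated by at least 16 hours" means 16 hours between the end of one shift and the start of the next.

```python
from datetime import datetime, timedelta
from typing import List

SHIFT_LENGTH = timedelta(hours=8)   # assumption: 8-hour shifts
MIN_REST = timedelta(hours=16)      # rule from the slide
NIGHT_START_HOUR = 23

def credit(start: datetime) -> float:
    """Credits for one shift, classified by its start (an assumption)."""
    is_night = start.hour == NIGHT_START_HOUR
    is_weekend = start.weekday() >= 5          # Saturday or Sunday
    if is_night and is_weekend:
        return 2.0
    if is_night:
        return 1.5
    if is_weekend:
        return 1.25
    return 1.0                                  # assumed baseline, not on the slide

def check_block(starts: List[datetime]) -> List[str]:
    """Return a list of rule violations for one person's shift start times."""
    starts = sorted(starts)
    problems = []
    # Rule: at least 16 hours between the end of one shift and the next start
    for prev, nxt in zip(starts, starts[1:]):
        if nxt - (prev + SHIFT_LENGTH) < MIN_REST:
            problems.append(f"less than 16 h rest before {nxt:%Y-%m-%d %H:%M}")
    # Rule: no more than 5 night shifts in a row (nights on consecutive days)
    run, prev_night = 0, None
    for s in starts:
        if s.hour == NIGHT_START_HOUR:
            consecutive = prev_night is not None and s - prev_night == timedelta(days=1)
            run = run + 1 if consecutive else 1
            prev_night = s
            if run == 6:
                problems.append("more than 5 night shifts in a row")
        else:
            run, prev_night = 0, None
    # Rule: at least one rest day in any 7-day period
    days = {s.date() for s in starts}
    for d in sorted(days):
        if all(d + timedelta(days=i) in days for i in range(7)):
            problems.append(f"no rest day in the 7 days starting {d}")
            break
    return problems

# Five week nights starting Monday 4 June 2012: allowed, worth 5 x 1.5 credits
nights = [datetime(2012, 6, 4, 23) + timedelta(days=i) for i in range(5)]
print(check_block(nights), sum(credit(s) for s in nights))
```

Under these assumptions, back-to-back night shifts sit exactly at the 16-hour limit, which is why a block of five nights passes while a sixth is rejected only by the night-shift rule.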

  9. Shift statistics in 2012-2013 [Figure: shift statistics charts; one panel annotated “Minimum quota 32 credits ~ 21 shifts”, three panels annotated “Minimum quota 21 shifts”]

  10. Subsystem personnel These roles are filled within each subsystem • Operations Manager • If Run Coordinators have a request or question about a subsystem, they call the Operations Manager… • Typical term >~ 1 year • Detector On-Call (DOC) • If shift crew has a problem with a subsystem, they call the DOC… • 15 DOCs on-duty at all times representing all critical systems • Rotate ~once per week • On-call experts • If the DOC or Operations Manager has a problem, she calls the subsystem expert… • Experts are “free” to act on the system remotely in case of problem (no strict access control to the CMS network) G. Rakness (UCLA)

  11. Transportation of crew • CMS is on the other side of the LHC • Rely on the CERN shuttle to transport shifters between Meyrin and p5 (45 minute ride) • http://cern.ch/ShuttleService [Map: LHC ring with ATLAS, LHCb, ALICE and CMS marked] G. Rakness (UCLA)

  12. CMS control room • Subsystem area • Since p5 is so far away, experts tend to stay in the control room longer when they are there • Central area • Focus of activity during standard operations [Floor plan labels: PIX, TRK, CSC, HCAL, Alignment, DT, ECAL, RPC, DCS, SL, DQM, Magnet, TRG, BRM, DAQ] G. Rakness (UCLA)

  13. Let’s compare the ATLAS and CMS control rooms… Apparently the “Compact” in CMS describes both the detector and the control room… [Images: ATLAS control room (http://acr.web.cern.ch/acr/ACR_Layout.htm) and CMS control room] G. Rakness (UCLA)

  14. CMS Centre Located at Bldg. 354 Meyrin • Computers, meeting rooms, tables, coffee nearby… • Location of offline Data Quality Monitoring shifts • Also used for some “analysis marathons” before major conferences… G. Rakness (UCLA)

  15. Meetings • Daily 9:30 meeting at point 5 • Focus on previous 24 hours and following 24 hours • LHC report from Run Coordinators, CMS overall report by Run Field Manager, round table report from each subsystem DOC • Meet 7 days per week during LHC running, even during Technical Stops (canceled on weekends/holidays if not needed) • If the 8:30 LHC meeting runs long, we have to rush to point 5 in order to make it to the 9:30 CMS meeting… • Weekly Run Meeting at Meyrin • Summary of the week, topical discussions, longer term planning • Normally attended by Operations Managers, but expect that any CMS collaborator might attend this meeting… • We use the same Vidyo booking for both meetings G. Rakness (UCLA)

  16. Operation G. Rakness (UCLA)

  17. CMS lifecycle defined by the LHC fill From https://edms.cern.ch/document/1070479... • The users of the LHC modes will include… • Experiments… • The modes are also used by the Detector Control System (DCS)… “The mode will be made available by a number of channels. These will include… DIP” G. Rakness (UCLA)

  18. Things not automated • Shift leader checklist (twiki) • A number of items must be done by the shift crew depending on the state of the machine • White board (20th-century technology) • We have found this is still the best way to… • communicate short-term instructions from shift-to-shift • remember the CCC phone number G. Rakness (UCLA)

  19. Monitoring and alarms • Over the years, system monitoring and alarms were often implemented in an ad-hoc way to expeditiously satisfy specific needs… • E.g., DCS alarms are different from DAQ alarms • Found that audio alarms are an effective way to alert the shift crew of a crucial problem (set threshold correctly) • Presently working to overhaul system… • … rationalize information into the database • … factorize source from display • ... more easily establish cause-effect • Timescale: 2015 G. Rakness (UCLA)
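One way to picture "factorize source from display" is a common alarm record that every source publishes and every display consumes. The sketch below is purely illustrative and does not describe the CMS implementation. The class names, the severity scale and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class Alarm:
    source: str     # e.g. "DCS" or "DAQ"
    severity: int   # hypothetical scale: larger means more serious
    message: str

class AlarmBus:
    """Sources publish alarms here; displays subscribe without knowing the source."""
    def __init__(self) -> None:
        self._displays: List[Callable[[Alarm], None]] = []

    def subscribe(self, display: Callable[[Alarm], None]) -> None:
        self._displays.append(display)

    def publish(self, alarm: Alarm) -> None:
        for display in self._displays:
            display(alarm)

def make_audio_display(threshold: int, sink: List[str]) -> Callable[[Alarm], None]:
    """Display that only 'sounds' for alarms at or above the threshold."""
    def display(alarm: Alarm) -> None:
        if alarm.severity >= threshold:
            sink.append(f"AUDIO: {alarm.source}: {alarm.message}")
    return display

bus = AlarmBus()
history: List[Alarm] = []      # stand-in for a database of all alarms
spoken: List[str] = []         # stand-in for the control room loudspeaker
bus.subscribe(history.append)
bus.subscribe(make_audio_display(threshold=3, sink=spoken))
bus.publish(Alarm("DCS", 1, "temperature drifting"))
bus.publish(Alarm("DAQ", 3, "run stopped"))
print(len(history), spoken)
```

With this split, DCS and DAQ alarms share one format, and the audio threshold is a property of the display rather than of each source.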

  20. Evolution to automation It’s true: it is inefficient when humans touch the system… • New in 2012: detector HV-state fully based on Machine/Accelerator mode • 2015 plan: Run Settings (clock, trigger, thresholds, …) to be fully based on Machine/Accelerator mode [Plot: time between “Stable Beams” and silicon tracker ON (min), annotated “HV turn on automated”] G. Rakness (UCLA)
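As a rough picture of an HV state that follows the machine mode, the sketch below maps each mode to a requested state and switches without operator action. It is an assumption-laden illustration. The only facts taken from the slides are that the HV state depends on the Machine/Accelerator mode and that the tracker goes ON after "Stable Beams". The state names, the class, and the other mode strings in the example are placeholders.

```python
STABLE_BEAMS = "STABLE BEAMS"

def requested_hv_state(machine_mode: str) -> str:
    """Hypothetical policy: HV is ON only while the machine declares stable beams."""
    return "ON" if machine_mode.strip().upper() == STABLE_BEAMS else "STANDBY"

class HvController:
    """Follows machine mode changes and records each automatic transition."""
    def __init__(self) -> None:
        self.state = "STANDBY"
        self.transitions = []   # (machine mode, old state, new state)

    def on_mode_change(self, machine_mode: str) -> str:
        target = requested_hv_state(machine_mode)
        if target != self.state:
            self.transitions.append((machine_mode, self.state, target))
            self.state = target
        return self.state

ctrl = HvController()
for mode in ["INJECTION", "RAMP", "Stable Beams", "BEAM DUMP"]:   # placeholder sequence
    print(mode, "->", ctrl.on_mode_change(mode))
```

In a real system the mode would arrive over a channel such as DIP, as quoted on slide 17. Here it is simply passed in as a string.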

  21. Automated soft error recovery • Radiation from proton collisions causes single event effects in detector electronics • Well-known phenomena accounted for in design of CMS • Impact of effects ranges from not noticeable to stopping the run • Started to become an issue with increasing luminosity in 2011… • 2012: full commissioning of automatic soft error recovery • Depending on the error and the system, this is done via hardware or software means • This will remain an issue for the rest of the lifetime of CMS • Systems continue to automate recovery from known problems G. Rakness (UCLA)

  22. What does a typical fill look like? Look at the last two fills before the LHCC in Dec. • Fill 3363: 163.8/pb recorded • 97.0% data recording efficiency • 2 stops of data taking (manual) • 3 software recoveries (automatic) • 578 hardware recoveries (automatic) • Fill 3370: 74.8/pb recorded • 97.6% data recording efficiency • 0 stops of data taking (manual) • 1 software recovery (automatic) • 281 hardware recoveries (automatic) In 2010, each error would have required manual intervention… G. Rakness (UCLA)
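The numbers for the two fills can be turned into delivered and lost luminosity with one division each. The sketch assumes that "data recording efficiency" means recorded luminosity divided by delivered luminosity. The slide does not spell out the definition, and the function names are hypothetical.

```python
# (fill number, recorded luminosity in /pb, data recording efficiency) from the slide
FILLS = [(3363, 163.8, 0.970), (3370, 74.8, 0.976)]

def delivered(recorded_pb: float, efficiency: float) -> float:
    """Delivered luminosity, assuming efficiency = recorded / delivered."""
    return recorded_pb / efficiency

def summarize(fills):
    """Return (total recorded, total delivered, combined efficiency)."""
    total_recorded = sum(rec for _, rec, _ in fills)
    total_delivered = sum(delivered(rec, eff) for _, rec, eff in fills)
    return total_recorded, total_delivered, total_recorded / total_delivered

for fill, rec, eff in FILLS:
    d = delivered(rec, eff)
    print(f"Fill {fill}: delivered ~{d:.1f}/pb, not recorded ~{d - rec:.1f}/pb")

rec, dlv, eff = summarize(FILLS)
print(f"Both fills: {rec:.1f}/pb of ~{dlv:.1f}/pb, efficiency ~{eff:.1%}")
```

Under that definition the two fills correspond to roughly 168.9/pb and 76.6/pb delivered, for a combined efficiency of about 97.2%.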

  23. Running efficiency per year CMS recorded 92.2% of 44/pb in 2010… G. Rakness (UCLA)

  24. Running efficiency per year CMS recorded 92.2% of 44/pb in 2010… … then 90.5% of 6/fb in 2011… G. Rakness (UCLA)

  25. Running efficiency per year CMS recorded 92.2% of 44/pb in 2010… … then 90.5% of 6/fb in 2011… … then 93.5% of 23/fb in 2012… G. Rakness (UCLA)

  26. Running efficiency per year CMS recorded 92.2% of 44/pb in 2010… … then 90.5% of 6/fb in 2011… … then 93.5% of 23/fb in 2012… This high number was the result of a lot of hard work by a lot of smart people! G. Rakness (UCLA)
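The yearly figures can be combined into recorded luminosities and a luminosity-weighted fraction for the three years. The sketch reads "recorded X% of Y" as Y being the delivered luminosity, which is an interpretation of the slide wording. It converts 44/pb to 0.044/fb so that all years share one unit, and the function names are hypothetical.

```python
# (year, delivered luminosity in /fb, fraction recorded) as quoted on the slides
YEARS = [(2010, 0.044, 0.922), (2011, 6.0, 0.905), (2012, 23.0, 0.935)]

def recorded(delivered_fb: float, fraction: float) -> float:
    """Recorded luminosity in /fb for one year."""
    return delivered_fb * fraction

def overall_fraction(years) -> float:
    """Luminosity-weighted recorded fraction over all listed years."""
    total_delivered = sum(d for _, d, _ in years)
    total_recorded = sum(recorded(d, f) for _, d, f in years)
    return total_recorded / total_delivered

for year, d, f in YEARS:
    print(f"{year}: recorded ~{recorded(d, f):.3f}/fb of {d}/fb")
print(f"2010-2012 combined: ~{overall_fraction(YEARS):.1%}")
```

Because 2012 dominates the delivered luminosity, the combined fraction (about 92.9%) sits close to the 2012 value.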
