Learn about online and offline DQM shifts at CMS, identifying detector problems, certifying data, and ensuring optimal operation efficiency. Follow detailed instructions for shifts and access requirements to contribute effectively.
Scope of Online DQM Shifts: Identify problems with detector performance or data integrity during the run • SPOT PROBLEMS QUICKLY FOR OPTIMAL OPERATION EFFICIENCY
Offline Data Processing and Offline DQM • Prompt Reconstruction at T0 and CAF is performed from within one hour up to 48 hours after data is transferred from P5 to T0 and CAF (CERN) • Subsequent iterations of re-reconstruction at the T1s periodically follow the Prompt Reco, with improved alignment and calibration constants and bug fixes • Offline DQM is part of the Offline data processing that, in addition to detector data analyses, includes higher-level reconstruction objects, aka Physics Objects (POGs) • Scope of Offline DQM Shifts: Produce the data certification for the various reconstruction iterations • USED FOR CMS OFFICIAL GOOD RUN LISTS!!! [Diagram labels: Online environment; T0, CAF; T1]
Central CMS Shifts https://twiki.cern.ch/twiki/bin/view/CMS/CentralShiftAllocation • Shifts and on-call for P5: • 2 Run Field Managers • Shiftleader • DAQ shifter and one on-call • Trigger Shifter and one on-call • HLT (on-call) • DCS (or technical shifter) and one on-call • Online DQM shift (and one on-call expert) • BRM on-call • Offline Run Manager (ORM) • Offline DQM (at CERN, FNAL, or DESY): • Offline DQM Shift • Computing: • Offline shifter and one on-call • DQM Shifts: Work in close contact with the shift leader (P5) and the offline run manager (Offline)
In preparation of DQM shifts • Safety requirements for P5: • Online CMS level 4C safety class must be passed • Each shifter should have appropriate access rights, through EDH: • Request access to « CMS CR » for P5 • Request access to « CMS CEN » for CMS Center at Meyrin • Mandatory for newcomers: • Schedule your first shift in the daytime so assistance will be readily available if needed • Attend the shift tutorial on Monday (possibly the latest one before your shift or the one before that) • Attend a trainee shift between the tutorial and your first shift
In preparation of DQM shifts • Read the DQM shift instructions before you go on shift, even if you have been on shift before: • https://twiki.cern.ch/twiki/bin/view/CMS/DQMShiftInstructions • Online: • https://twiki.cern.ch/twiki/bin/view/CMS/OnlineDQMShifts • https://twiki.cern.ch/twiki/bin/view/CMS/DQMOnlineShortTermInstr • Offline: • https://twiki.cern.ch/twiki/bin/view/CMS/OfflineDQMShifts • https://twiki.cern.ch/twiki/bin/view/CMS/DQMOfflineShortTermInstr • Check the DQM shift instructions while you are on shift • Complain about ambiguities and inconsistencies (ELOG: Problem Report)
DQM Shift - Schedules • Online Shifts at P5 (3/day for 24 hours coverage) • 23:00-7:00 | 7:00-15:00 | 15:00-23:00 • Offline Shifts run at Remote Control Rooms (4/day) • 1:00 – 7:00 at Fermilab • 7:00 – 13:00 at CERN-CMS Centre (Meyrin site) • 13:00 – 19:00 at DESY • 19:00 – 1:00 at Fermilab • Regular shuttle service runs 7 days per week. • https://twiki.cern.ch/twiki/bin/view/CMS/P5Shuttle • For latest run plan updates (e.g. shift cancellations) subscribe to hn-cms-commissioning@cern.ch
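The slot boundaries above can be turned into a small lookup, for instance to check which site covers the offline shift at a given hour. This is only a minimal sketch: the function names are hypothetical, and it assumes the caller passes a time in the same time zone the schedule is quoted in (the slides do not state which one).

```python
from datetime import time
from typing import Tuple

# (start hour, end hour, site) for the four offline shifts listed above.
OFFLINE_SLOTS = [
    (1, 7, "Fermilab"),
    (7, 13, "CERN-CMS Centre (Meyrin site)"),
    (13, 19, "DESY"),
    (19, 1, "Fermilab"),
]

# (start hour, end hour) for the three online shifts at P5.
ONLINE_SLOTS = [(23, 7), (7, 15), (15, 23)]


def _covers(start: int, end: int, hour: int) -> bool:
    """True if `hour` lies in [start, end), allowing a wrap past midnight."""
    if start < end:
        return start <= hour < end
    return hour >= start or hour < end


def offline_site(t: time) -> str:
    """Site running the offline DQM shift at time `t` (hypothetical helper)."""
    for start, end, site in OFFLINE_SLOTS:
        if _covers(start, end, t.hour):
            return site
    raise ValueError(f"no offline slot covers {t}")


def online_slot(t: time) -> Tuple[int, int]:
    """(start hour, end hour) of the online P5 shift covering time `t`."""
    for start, end in ONLINE_SLOTS:
        if _covers(start, end, t.hour):
            return (start, end)
    raise ValueError(f"no online slot covers {t}")
```

Slots are treated as half-open intervals, so 7:00 belongs to the shift that starts at 7:00, not to the one that ends then.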
DQM Shift Tools • DQM GUI: • Graphical User Interface, for histogram viewing • Receives histograms from • Live DQM applications (online), • Files uploaded from T0 (offline express / prompt), T1 (ReReco) • DQM Run Registry: • Web interface to the Database that holds run information • Register significant runs (Online shifts) and • Bookkeeping of quality information • Elog: for end of shifts and problem reports • TWiki pages: for shift instructions
DQM Shift Tasks • Task 0: • Make sure all DQM applications and webservers function properly • Task 1: • Inspect shift histograms using DQM GUI, following shift instructions • Raise alerts / contact experts in case of problems • Task 2: • Run-by-run bookkeeping using Run Registry • Task 3: • Shift summary and problem reports in ELOG
Task 0: Applications • Make sure DQM applications and GUI are running during data taking: • Online: check update of histograms during runs • Offline: check arrival of histograms from Tier-0/CAF processing • Check correct updating of Run Registry • In case of problems (persisting longer than 15 mins), call the DQM expert-on-call: 165579
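The 15-minute rule for Task 0 can be expressed as a simple check. The sketch below is illustrative only: `should_call_expert` is a hypothetical name, and it assumes the shifter has noted when the problem was first seen.

```python
from datetime import datetime, timedelta

# Threshold taken from the slide: problems persisting longer than 15 minutes.
ESCALATION_THRESHOLD = timedelta(minutes=15)


def should_call_expert(problem_since: datetime, now: datetime) -> bool:
    """True once a problem has persisted strictly longer than the threshold."""
    return now - problem_since > ESCALATION_THRESHOLD
```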
Task 1: Histogram Inspection • Follow online/offline shift instructions • Run-by-run procedure: • (1) Enter significant runs in the Run Registry • Online Shifter: decide if a run is significant (confirm with shift leader!), register in RR • Offline Shifter: analyze runs previously registered by the Online shifter • (2) Shift histogram Inspection • Look at the Summary, the Reports, and the Shift Workspace. • Make an effort to look at all the plots one by one. • If you spot a problem or have a question regarding a specific plot, please contact: • Shift leader / offline run coordinator • Make Elog entry (Type: "Problem Report") • (3) Stay in close contact with the shift leader / offline run coordinator
GUI – Summary Workspace https://cmsweb.cern.ch/dqm/online
GUI – Reports Workspace https://cmsweb.cern.ch/dqm/online
Task 2: Run Registry • Hover over the quality flag to see the comment
Run Registry • There are three table views in the Run Registry, browsable via the buttons at the top of the page: 1) RunInfo 2) Runs 3) LumiSec
Run Registry (online): RunInfo View • At the beginning of each run: • Check if the run is significant; if yes, create it (from the RunInfo Table View) • Click run number itself->Create Global • In the RunInfo Table View all runs are listed • Register only runs that are significant, i.e. that are collision runs or have more than ~10,000 events and/or have been running for more than 20 minutes. If in doubt, ask the shift leader. • Proceed to the Runs Table View and follow the shift instructions to inspect shift histograms
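The significance rule above can be sketched as a predicate. This is one possible reading, not an official definition: it treats "and/or" as a plain OR, takes the approximate "~10,000" as a hard cut, and uses hypothetical parameter names. The shift leader remains the authority in borderline cases.

```python
# Thresholds as quoted on the slide (the event count is approximate there).
MIN_EVENTS = 10_000
MIN_DURATION_MIN = 20


def is_significant(is_collisions: bool, n_events: int, duration_min: float) -> bool:
    """Assumed reading: a run is significant if ANY of the criteria holds."""
    return (
        is_collisions
        or n_events > MIN_EVENTS
        or duration_min > MIN_DURATION_MIN
    )
```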
Run Registry (online): RunInfo View • In the RunInfo Table View all runs should be listed; however, new runs sometimes do not appear in the table. • In such a case: • Click on “table”->“filter” and enter the run number in the run number column. • The run might then be visible. If Events > 0, the run can be made “Global” and certified as usual. But if events = rate = 0, the info is wrong. • In such a case, record all the relevant info about the run in an ELOG entry. • Run information might appear ~20 minutes to 1 hr after the run has started; the info can then be included in the Run Registry as usual, and there is no need to fill a specific ELOG entry for that run.
Run Registry (online): Edit Runs • During the run: • Based on the shift histogram instructions, set the online subsystem flags (GOOD/BAD) and enter comments • Click run number itself->View details->Edit • If a subsystem is BAD, inform the shift leader and the subsystem expert, and enter comment • After the run: • Enter the ‘stop reason’ (in the stop reason field, NOT under comments) • Confirm your results with the shift leader moving the run STATUS to « SIGNOFF » • Click dataset name ->Move to SIGNOFF ->Edit • Note: once the run is in SIGNOFF state, it cannot be modified by the Online shifter
Run Registry (online): Run "Group" • Assigning the correct Group name is of vital importance, as it will affect the Offline determination of runs to be used for different analyses • Runs are classified through the Group name in the Run Registry: 1. Select "Collisions10" if the run is taken for physics analysis purposes and contains at least one lumi section with two stable beams (colliding or non-colliding). 2. Select "PostCollisions10" for runs right after Collisions10 runs, when the tracker is off. 3. Select "Cosmics10" if the run is taken for analysis purposes and there is no beam activity throughout the run, i.e. stable "no beam" conditions. 4. Select "Commissioning10" for all other runs, i.e. those taken for tests or specific detector studies only and not meant for general offline physics analysis. • Ask the shift leader if in doubt! • Check the latest set of rules at the start of your shift in the Online DQM Shift Instructions!
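The four Group rules can be read as a decision ladder. The sketch below assumes they are checked in the order listed and that the four conditions are supplied as booleans by the shifter; the function and parameter names are hypothetical, and the rules in the Online DQM Shift Instructions take precedence over this illustration.

```python
def run_group(
    for_analysis: bool,
    has_two_stable_beams_ls: bool,
    right_after_collisions_tracker_off: bool,
    no_beam_activity: bool,
) -> str:
    """Pick a Group name by checking rules 1 to 4 in order (assumed precedence)."""
    if for_analysis and has_two_stable_beams_ls:
        return "Collisions10"
    if right_after_collisions_tracker_off:
        return "PostCollisions10"
    if for_analysis and no_beam_activity:
        return "Cosmics10"
    # Everything else: tests or specific detector studies.
    return "Commissioning10"
```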
Run Registry (online): Run "Group" • The shifter needs to make sure that the group name is correctly assigned. • Sometimes the shifter is asked to mark a run as Cosmics10 but the choice may not be available. • Selection of the Cosmics10 group name is only possible with a cosmics HLT key. So, if the choice is not available, this might mean that: • The trigger shifter chose an incorrect key: the trigger shifter should be made aware A.S.A.P. • It is a CASTOR run and not marked as a Cosmics10 run: double-check with the shift leader. • This was on purpose (cosmics run with CASTOR key): inform the DQM experts through ELOG so that they can change the group name to Cosmics.
Run Registry (offline): Runs View • Offline DQM Shifter: • Select which runs to analyze: oldest run where the “Global/Online/ALL” dataset is in SIGNOFF status • Add the Offline dataset to analyze (as per instructions) and proceed with subsystem evaluation based on shift histograms in the GUI • Move the dataset entry to SIGNOFF when all subsystems are analyzed
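The run-selection step can be illustrated on an in-memory list of dataset records. This sketch does not use any real Run Registry interface: the record layout (`run`, `dataset`, `status` keys) is hypothetical, and "oldest" is assumed to mean the lowest run number.

```python
from typing import Dict, List, Optional


def next_run_to_analyze(datasets: List[Dict]) -> Optional[int]:
    """Lowest run number whose Global/Online/ALL dataset is in SIGNOFF."""
    candidates = [
        d["run"]
        for d in datasets
        if d["dataset"] == "Global/Online/ALL" and d["status"] == "SIGNOFF"
    ]
    return min(candidates) if candidates else None
```

A real shift would also skip runs whose offline dataset has already been signed off; that bookkeeping is left out here.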
Run Registry (offline): Datasets [Screenshot labels: P5, Meyrin]
Run Registry: HV conditions • Check that the HV information on the DQM GUI summary ("Info" histogram) and the Run Registry ("LumiSec" table) are consistent • Sometimes "LS" in the Runs view table shows 0 • If this happens: • Make an ELOG entry "Problem report" indicating the run number • Ensure that the DQM expert on-call is aware of the problem (165579) • Put the information manually into the general comments section of the Run Registry in the following format: • Example: • LS 0 = CASTOR, Strips, Pixel, and DT with HV OFF. All others with HV ON.
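To keep manually entered HV comments consistent with the example format, a small formatter can help. This is a sketch only: `hv_comment` is a hypothetical helper, and it assumes the comment always lists the subsystems with HV OFF and describes the rest as ON, as in the example.

```python
from typing import Sequence


def _join_names(names: Sequence[str]) -> str:
    """Join names as 'A', 'A and B', or 'A, B, and C'."""
    names = list(names)
    if not names:
        raise ValueError("at least one subsystem with HV OFF is required")
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    return ", ".join(names[:-1]) + f", and {names[-1]}"


def hv_comment(lumi_section: int, hv_off: Sequence[str]) -> str:
    """Build the general-comment string in the format of the slide's example."""
    return (
        f"LS {lumi_section} = {_join_names(hv_off)} with HV OFF. "
        "All others with HV ON."
    )
```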
ELOG • http://cmsonline.cern.ch/portal/page/portal/CMS%20online%20system/Elog • Log in with your AFS account • Click on "Elog" and choose Subsystems "Event Display and DQM" • Problem Report (1 entry per problem) • For each problem arising during your shift make a "Problem Report" entry • Please use "Elog" to report problems (including problems understanding shift instructions)! • Shift Summary (1 entry per shift) • At the end of each shift, write a short "Shift Summary" • N.B. (!!!): • Use Types "Problem Report" or "Shift Summary" only, do NOT create new types • Do not enter run-by-run information in the Elog; make sure all of it is in the Run Registry instead
Shift Hand-over • Make sure to arrive 5-10 minutes early for shift hand-over • Upon your arrival in the control room, the previous shifter will be there • Get from her/him the information about the current status of the data taking and what happened during the previous shift • The previous shifter will show you where the tools you will be using are running (DQM GUI, CMS Online page, Run Registry) • If anything about your tasks is not clear to you, ask at that time • At the end of your shift, wait for the next shifter to arrive and provide the same support
Links • Shift instructions: https://twiki.cern.ch/twiki/bin/view/CMS/DQMShiftInstructions • https://twiki.cern.ch/twiki/bin/view/CMS/OnlineDQMShifts • https://twiki.cern.ch/twiki/bin/view/CMS/DQMOnlineShortTermInstr • https://twiki.cern.ch/twiki/bin/view/CMS/OfflineDQMShifts • https://twiki.cern.ch/twiki/bin/view/CMS/DQMOfflineShortTermInstr • DQM GUI: • https://cmsweb.cern.ch/dqm/online/ • https://cmsweb.cern.ch/dqm/offline • follow certificate instructions at https://twiki.cern.ch/twiki/bin/view/CMS/DQMGUIGridCertificate • Run Registry page (need tunnel from outside CERN): • http://pccmsdqm04.cern.ch/runregistry • Elog: • http://cmsonline.cern.ch/portal/page/portal/CMS%20online%20system
Summary • Before the first regular shift: • Pass/renew safety test • Request/check access rights • Organize transport to P5 (car or shuttle) • Carefully read the shift instructions • Do one training shift • Thanks for doing shifts, and enjoy!