Introduction to CMS Portal
Information for Shifters
UW-Madison
What the portal can do
• Shows which scripts are running (e.g. the monitoring script) and allows them to be stopped/restarted
• Gives log messages from scripts and databases
• Allows restarting of the FSM
• Shows which processes are running and the CPU and memory they use
• Shows the current configuration of the component
• Shows rack status and temperature
• Note: the portal is still in the commissioning stage and will sometimes be unavailable
Getting to CMS DCS Oracle Portal
• Portal is an additional tool for DCS status info
• For RMC info: http://cmsonline.cern.ch/
  • Link is also on the twiki: https://twiki.cern.ch/twiki/bin/view/CMS/RCTSlowControl#WHAT_TO_LOOK_FOR
• For rack info: http://cmsonline.cern.ch/portal/page/portal/Services/Racks/s2:status
  • Link is also on the twiki: https://twiki.cern.ch/twiki/bin/view/CMS/RCTSlowControl#ACCESS_TO_RACK_DCS_INFORMATION
CMS Online System
• Go to http://cmsonline.cern.ch
• Click on Pmon
Pmon – normal operation of managers
• FIRST: select our computer. Click on CMS-TRG-DCS-01, then on CMS-TRG-DCS-02
• CMSfsRCT_monitoring: always running
• -DIM_DNS_NODE: always running
  • Deals with data-basing log messages
• RDB Archive Manager: always running
• Some managers are usually stopped; they only run during installation
• If the always-running scripts are in manual, contact Kira/Monika
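The check a shifter performs by eye on the Pmon page can be written down as a small sketch. The three manager names come from the slide; the snapshot layout (manager name mapped to a state and a start mode) and the function name are hypothetical and only illustrate the rule "always running, never in manual". The portal itself is not queried here.

```python
# Managers that the slide lists as "always running".
EXPECTED_ALWAYS_RUNNING = (
    "CMSfsRCT_monitoring",
    "-DIM_DNS_NODE",
    "RDB Archive Manager",
)


def find_pmon_problems(snapshot):
    """Return a list of problem descriptions for the always-running managers.

    `snapshot` is a hypothetical dict: manager name -> (state, start_mode),
    typed in by hand from what the Pmon page shows.
    """
    problems = []
    for name in EXPECTED_ALWAYS_RUNNING:
        # A manager absent from the snapshot is treated as a problem too.
        state, mode = snapshot.get(name, ("missing", "unknown"))
        if state != "running":
            problems.append(f"{name}: state is {state}, expected running")
        if mode == "manual":
            problems.append(f"{name}: start mode is manual, contact Kira/Monika")
    return problems


if __name__ == "__main__":
    example = {
        "CMSfsRCT_monitoring": ("stopped", "always"),
        "-DIM_DNS_NODE": ("running", "always"),
        "RDB Archive Manager": ("running", "manual"),
    }
    for line in find_pmon_problems(example):
        print(line)
```

Managers that only run during installation are deliberately left out of the expected list, so their stopped state is not reported.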
Pmon – possible actions
• Manager Type and number correspond to those shown in the log messages
• Don’t change the always/manual setting
• Don’t use any of the *_ALL buttons
• If a manager needs stopping/starting: select the desired manager and use the buttons indicated in the screenshot
Pmon in error – Call Kira/Monika
• Possible error state:
  • Monitoring script stopped
  • -DIM_DNS_NODE stopped and blocked
  • Database Manager stopped and blocked
• The Blocked status will soon be used for an Offline state, when v24 can’t write
FSM – Stop/Restart
• If the FSM is in error: select TRIG_RCT, then click FSM_stop and then FSM_restart
Processes
• This page can indicate memory leaks
PVSS Log
• Actions are recorded (e.g. FSM_stop)
• ManagerName (#) corresponds to the Pmon type and number
• RCT shifter checks this log at least once a day
• Should have entries each day
  • If not, call Kira/Monika
• Log is stored in a database, so various errors are possible
  • Writing to DB, script not writing messages, DB problems, etc.
• No entries likely means the DIM_DNS_NODE is not working, which will show up in Pmon
• Report unusual entries to Kira/Monika
  • Especially those with Priority Severe (except “Lost connection to system 1”)
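The daily log review described on this slide boils down to two questions: are there any entries for the day, and are any of them Severe apart from the known "Lost connection to system 1" message? The following is a minimal sketch of that rule. The entry layout (dicts with `date`, `priority` and `message` keys) and the function name are assumptions for illustration; the real log is viewed in the portal, and this code does not read it.

```python
from datetime import date

# The one Severe message the slide says does not need reporting.
IGNORED_SEVERE = "Lost connection to system 1"


def review_log(entries, day):
    """Apply the shifter's daily rule to a list of log entries.

    `entries` is a hypothetical list of dicts with the keys
    'date' (datetime.date), 'priority' (str) and 'message' (str).
    Returns (has_entries, to_report).
    """
    todays = [e for e in entries if e["date"] == day]
    has_entries = bool(todays)  # False -> call Kira/Monika, check -DIM_DNS_NODE in Pmon
    to_report = [
        e
        for e in todays
        if e["priority"].lower() == "severe" and IGNORED_SEVERE not in e["message"]
    ]
    return has_entries, to_report


if __name__ == "__main__":
    today = date(2009, 1, 15)  # arbitrary example date
    log = [
        {"date": today, "priority": "Info", "message": "FSM_stop"},
        {"date": today, "priority": "Severe", "message": "Lost connection to system 1"},
        {"date": today, "priority": "Severe", "message": "Cannot write to DB"},
    ]
    has_entries, to_report = review_log(log, today)
    print(has_entries, [e["message"] for e in to_report])
```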
Configuration
• Management must be central
• Check that the CMSfwRCT settings are those in this screenshot
• Version will be updated in the future
Rack Status
• RCT racks should be ON
• http://cmsonline.cern.ch/portal/page/portal/Services/Racks/s2:status
Rack Temperatures
• Temperature changes are only recorded and time-stamped if the temperature has changed by a certain amount (a couple of degrees)
• Rack temperatures are usually ~24°
• If crate temperatures rise, check what is happening in RCT and other racks as an indication of a general cooling failure
• http://cmsonline.cern.ch/portal/page/portal/Services/Racks/s2:temperatures
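The "only recorded if changed by a certain amount" behaviour explains why the temperature page can show long gaps between time stamps. A minimal sketch of such a threshold (deadband) filter follows. The threshold of 2.0 degrees is an assumption standing in for "a couple of degrees", and the function name and reading layout are hypothetical; the actual archiving settings are not given on the slide.

```python
def record_changes(readings, threshold=2.0):
    """Keep only readings that differ enough from the last recorded value.

    `readings` is a list of (timestamp, temperature) pairs.
    `threshold` is an assumed value for "a couple of degrees".
    The first reading is always recorded.
    """
    recorded = []
    last = None
    for timestamp, temperature in readings:
        if last is None or abs(temperature - last) >= threshold:
            recorded.append((timestamp, temperature))
            last = temperature  # compare later readings against this value
    return recorded


if __name__ == "__main__":
    samples = [(0, 24.0), (1, 24.5), (2, 25.9), (3, 26.1), (4, 24.0)]
    print(record_changes(samples))
```

In this example the small drifts at timestamps 1 and 2 stay below the assumed threshold and leave no record, so a flat-looking history does not necessarily mean the sensor stopped reporting.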
Summary of Online Portal
• Check that the expected processes in Pmon are running
• Use the FSM tab to restart the FSM in case of problems
• Use the Processes tab in case of memory leaks
• Use the log to check for possible problems
  • Should update at least once a day
• Configuration should remain as indicated
  • Version, however, may be updated
• Check that racks are ON and at the usual temperature of ~24°
What to put in ELOG
• Status of crates
• Investigate unusual information
  • Do not just report it; try to find out the cause
  • Hypernews can give information about interventions at Point 5
  • Information on IT interventions with the database:
    • http://it-support-servicestatus.web.cern.ch/it-support-servicestatus/ScheduledInterventionsArchive/CMSdatabasesrollinginterventionsannouncemen.htm
• Order for communicating questions/problems
  • Shifter calls Kira, Kira calls Monika, Kira/Monika calls Central DCS
  • Bobby/Frank might have additional information
    • May be commissioning
    • Small things are not always announced