DCS Report on Warnings and Alarms During Cooling Failure on Feb 4 (11 Feb, 2009)

Presentation Transcript


  1. DCS Report on Warnings and Alarms During Cooling Failure on Feb 4 • 11 Feb, 2009 • K. Grogg

  2. Temperature Alerts
     • Types of temperature alerts
       • Software Warning
         • Appears in alarm panel as RMCX.crateY.tmpz HIGH
         • No action taken (could add email alert)
       • Hardware (RMC firmware) Warning
         • Appears in alarm panel as AlarmStatusCrates Warning
         • No action taken for Warning_trip_time
         • Don’t want to be too hasty shutting down
     • Trip Conditions
       • Warning Trip Time
         • Time before the AlarmStatusCrates Warning becomes a Fault
         • Do not want higher than normal temps for too long
         • 48 V shut off if in hardware warning for this length of time
       • Hardware (RMC firmware) Temperature Fault
         • Appears in alarm panel as alarm ALARM and AlarmStatusCrates Fault
         • Too hot, 48 V shut off immediately
       • Rack trip temp
         • Will shut off rack power if above threshold set by central DCS
         • We do not want this method of shutting down the crates
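The alert levels on this slide can be sketched as a small classification function. This is only an illustration, not the RMC firmware logic: the names `Thresholds` and `classify` are hypothetical, the threshold values in the usage note are made up, and it assumes the software warning threshold sits below the hardware warning threshold. The 600 s warning trip time is the value quoted later in the slides. The rack trip temperature is left out because it is handled by the rack, not the RMC.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    """Per-sensor limits in degrees. All values are supplied by the caller."""
    software_warning: float          # shown as "RMCX.crateY.tmpz HIGH"
    hardware_warning: float          # shown as "AlarmStatusCrates Warning"
    hardware_fault: float            # shown as "alarm ALARM" + "AlarmStatusCrates Fault"
    warning_trip_time: float = 600.0  # seconds in hardware warning before a Fault


def classify(temp: float, seconds_in_hw_warning: float, th: Thresholds) -> str:
    """Return the most severe alert state for one temperature reading."""
    if temp >= th.hardware_fault:
        return "FAULT"               # too hot: 48 V shut off immediately
    if temp >= th.hardware_warning:
        if seconds_in_hw_warning >= th.warning_trip_time:
            return "FAULT"           # in warning too long: 48 V shut off
        return "HW_WARNING"          # no action yet
    if temp >= th.software_warning:
        return "SW_WARNING"          # no action taken (an email could be sent)
    return "OK"
```

For example, with hypothetical limits `Thresholds(30, 35, 40)`, a reading of 36 deg that has been in hardware warning for 100 s is classified as `HW_WARNING`, and the same reading after 600 s becomes `FAULT`.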

  3. Temperature Threshold Settings
     • Current values (can be adjusted as needed)

  4. Optimizing Thresholds
     • Want to keep electronics in a healthy state
       • It is best for electronics to keep temperature in as narrow a window as possible
     • We do not want the racks to turn off the crate power
       • RMCs should control when crates turn off, which allows continued monitoring
       • Also allows for fault records to be kept (not enough time to write the information if the rack shuts off the 48V)
     • Need to set the software, firmware, and rack temperature and time thresholds with the following goals:
       • Minimize the amount of time spent at excessive temperatures
       • Avoid shutting down unnecessarily for smaller fluctuations
       • Ensure RMCs turn off crate power, not the racks
     • Tom set current thresholds based on measurements at UW lab
       • Default in firmware version 2.1
     • New suggestions:
       • Fault threshold ~8-10 deg above nominal
       • Warning threshold halfway between
       • 4-5 deg above “normal” likely too low because of normal fluctuations
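As a rough worked example of the new suggestions, the sketch below computes a fault threshold a fixed offset above nominal and a warning threshold halfway between nominal and fault. Reading "halfway between" as halfway between the nominal temperature and the fault threshold is an assumption, and the function name `suggested_thresholds` and the nominal value used in the usage note are hypothetical.

```python
def suggested_thresholds(nominal: float, fault_offset: float = 9.0) -> tuple[float, float]:
    """Return (warning, fault) thresholds in degrees.

    fault_offset is how far the fault threshold sits above nominal
    (the slide suggests roughly 8-10 deg). The warning threshold is
    placed halfway between nominal and fault, which is one possible
    reading of the slide.
    """
    fault = nominal + fault_offset
    warning = nominal + fault_offset / 2.0
    return warning, fault
```

With a hypothetical nominal temperature of 25 deg and a 10 deg fault offset, this gives a warning threshold of 30 deg and a fault threshold of 35 deg.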

  5. Screenshot of Alarm panel
     • HIGH – Software high temperature threshold reached by at least one temp (a, b, c, d)
     • AlarmStatusCrates WARNING – At least one temp reached 1st hardware threshold
     • alarm ALARM – At least one temp reached 2nd hardware threshold
     • CrateA.vmeX FAULT – Crates off (48V off)
     • AlarmStatusCrates FAULT – Occurs with ALARM when the hardware Fault threshold has been exceeded, and occurs again when at least one temperature has been above the warning threshold for more than the warning trip time (600s)
       • New entries with identical information are overwritten, so only the second occurrence of AlarmStatusCrates FAULT remains in the alarm panel

  6. Current System Configuration
     • Hardware thresholds
       • Same for all RMCs
     • Refresh button must be clicked to get up-to-date information

  7. Screenshot of Fault panel
     • Get temperatures at time of Fault
     • Get time between warning and fault
     • Get time Faults occurred
     • Open Fault panel (not complete information if rack turns off first)

  8. Effects of recent cooling failure
     • Need to make sure the thresholds are optimal
     • Cooling lost on 4 Feb, 2009, at about 14:40
     • Time between first HIGH (software warning) and AlarmStatusCrates WARNING:
       • 7:45 to 11:15 (min:s)
     • Time between AlarmStatusCrates WARNING and alarm ALARM:
       • 295–400 s
     • After 600s (time over t_warning setting), AlarmStatusCrates indicated FAULT (for the second time; the first time was with alarm ALARM)
       • Temperatures were still above the hardware warning setting
     • Typically one or two temps above hardware warning
     • Typically one temp (Crate A, temp B) above hardware fault
       • Temp B has the lowest threshold (and lowest baseline)
     • Specific times and temperatures for each RMC are given on next 9 slides
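The timing figures above can be checked with a little arithmetic. The sketch below converts the min:s intervals to seconds and determines which trip condition fires first after a hardware warning. The helper names `mmss_to_seconds` and `first_trip` are hypothetical; the 600 s warning trip time and the 295–400 s range are taken from the slides.

```python
WARNING_TRIP_TIME_S = 600  # warning trip time quoted in the slides


def mmss_to_seconds(text: str) -> int:
    """Convert a 'min:s' string such as '7:45' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def first_trip(warning_to_alarm_s: float,
               trip_time_s: float = WARNING_TRIP_TIME_S) -> tuple[str, float]:
    """Return which condition shuts off the 48 V first, and when.

    Times are measured in seconds from the AlarmStatusCrates WARNING.
    """
    if warning_to_alarm_s < trip_time_s:
        return "fault threshold", warning_to_alarm_s
    return "warning trip time", trip_time_s
```

The software warning therefore preceded the hardware warning by 465 to 675 s, and because the ALARM arrived 295–400 s after the hardware warning, the fault threshold tripped before the 600 s warning trip time expired in every case reported here.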

  9. RMC 1 Temps & Times • 4 Feb, 2009
     • All temperatures given for all RMCs are for Crate A, which is consistently higher
     • Values in red are above thresholds

  10. RMC 2 Temps & Times • 4 Feb, 2009
      • When the rack cuts power before the RMC shuts down the crates, there is only a partial fault record

  11. Conclusions
      • RMC hardware limits are set to trip the 48V when needed, rather than relying on the rack to turn off the power
        • Except for RMC 2, crate power turned off before the racks tripped (rack power stayed on)
      • Trip temperature limits should be lowered at least a little
        • But don’t want to trip if there is no problem
        • Slight temperature fluctuations possible – seasonal changes, cooling water fluctuations, precision of reading, additional non-RCT electronics in rack
        • Excursions up to +3 deg from baseline appear to happen during normal operation
      • Suggestions from Monika
        • Lower hardware Warning and Fault thresholds by 2-3 deg
        • Send email after software warning (allows some time for action to be taken)
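The tension between lowering the thresholds and avoiding trips during normal operation can be expressed as a simple margin check. This is a sketch only: the function names and the numeric values in the usage note are hypothetical, and the +3 deg excursion is the figure quoted in the conclusions.

```python
NORMAL_EXCURSION_DEG = 3.0  # excursions seen during normal operation (from the slides)


def lowered_threshold(current: float, lower_by: float) -> float:
    """Return a threshold reduced by lower_by degrees (the slides suggest 2-3)."""
    return current - lower_by


def trips_in_normal_operation(threshold: float, baseline: float,
                              excursion: float = NORMAL_EXCURSION_DEG) -> bool:
    """True if a normal excursion above baseline could reach the threshold."""
    return threshold <= baseline + excursion
```

For instance, with a hypothetical baseline of 25 deg, lowering a 34 deg threshold by 3 deg gives 31 deg, which still clears the 28 deg reached by a normal excursion, whereas a threshold of 28 deg would not.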

  12. Other Items
      • This presentation will be put on the RCTSlowControl twiki for reference
      • A remote UI has been set up on hpwiscms02 (laptop)
        • Allows monitoring without logging into terminal server
        • Documentation on setting up a remote UI has been added to twiki

  13. RMC 3 Temps & Times • 4 Feb, 2009

  14. RMC 4 Temps & Times • 4 Feb, 2009

  15. RMC 5 Temps & Times • 4 Feb, 2009

  16. RMC 6 Temps & Times • 4 Feb, 2009

  17. RMC 7 Temps & Times • 4 Feb, 2009

  18. RMC 8 Temps & Times • 4 Feb, 2009

  19. RMC 9 Temps & Times • 4 Feb, 2009
