Experience, Issues and Prospects for the TCS, TSS Systems. Andromachi Tsirou, Piero Giorgio Verdini, Lorenzo Masetti, Manuel Fahrer, Yousaf Shah, Frank Hartmann. For a full functional description, see: Tracker Days Nov 2006.
Tracker Control System & Tracker Safety System
June 2007 Tracker Days
Short history of the last months
• System operational @ TAC
• TCS communicates with all hardware systems:
  • CAEN, PLC, DCU
  • Thermal Screen, Monitoring system (Dry Air monitor)
  • TSS
• LARGE “FINAL” systems established
• TSS system acts/interlocks on high and low temperatures with majority voting
  • System formally tested and signed off
• The PLC Master system is much more powerful than initially designed (thanks, Machi)
  • Master PLC handles all requirements of the TIF: power cut, CP, DryAir, TS, etc.
  • We actually encountered essentially all problem cases and the corresponding interlocks in TIF/TAC
• TCS runs distributed on several PCs, with the same granularity as @P5
• Majority voting works & still confuses shifters and OpCos ;-)
• ALARM panel established: sound alarm annoys senior OpCos ;-)
• Cond. DB writing stable (now also under Linux)
  • Retrieval works via PVSS; retrieval by other means is still an issue
• We have Access Control
• A CRON tool checks the system continuously (sends info via mail)
• There is actually even documentation for the shifter ;-)
Network for the Tracker Control
[Diagram labels]
• Hardware: 6 PLC Systems; 4 SY 1527 Power Supply Controllers
• PCs: XDAQ PC; PLC PC; HW PC 1, HW PC 2, HW PC 3, HW PC 4; DCU PC; Supervisor PC (Cron Jobs)
• Networks: Experiment Private Network; General Purpose CERN network
• Display Terminal (Control Room)
• Note on diagram: "This part of the system is currently implemented in the Slice Test"
Distributed System (in PVSS)
• Supervisor
• CAEN
• PLC
• DCU
• Spy
• (development)
Green: all connections are fine
System Health: 18/220 is fine, because we are not running internal archiving
Managers running on THIS machine
The main Tracker panel Access Control Last Action Expert settings Majority Opens PLC panels Overview of subnodes FSM part Commands! Fast Software OFF Safety PSU ON counting PLC Panic PLOT DCU & CAEN DryAir Opens the ALARM Screen Thermal Screen „alive and goodness“ Message Screen (to be improved) Similar panel for all nodes below
Some details: Click on one!
Some other details: e.g. vMon of 2.5V line; e.g. Mean and Max of PLC per CL or Sector
Is one state enough? Majority Voting!
• Every node in the hierarchical control tree has a state
• But the information of the state is not enough
• We need to be able to see at first glance:
  • How many CTRL are on
  • How many LV are on
  • How many HV are on
  • How many channels in ERROR
• We want to be able to run and take data when
  • More than 95% of the tracker is ON
  • Less than 5% of the tracker is in ERROR
Solution: majority voting
• Mixed states: needed for security
  • We need to know if any channel is in that state
• Main issue: deal with inclusion and exclusion of sub trees
• Computation of the percentages allows computing of the state overriding the FSM logic
[State diagram labels: ON, HV_mixed, ON_LV, LV_mixed, ON_CTRL, CTRL mixed, OFF, ERROR; thresholds shown: 95 % (three times) and 5 %]
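The percentage-based state computation can be sketched as follows. This is only an illustration, not the PVSS/FSM implementation: the `Counts` structure, the function names, the use of one common channel total for CTRL, LV and HV, and the exact ordering of the checks are assumptions. Only the 95 % / 5 % thresholds, the state names and the inclusion/exclusion of sub trees come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Channel counts for one node of the control tree (hypothetical structure)."""
    total: int = 0
    ctrl_on: int = 0
    lv_on: int = 0
    hv_on: int = 0
    error: int = 0

    def __add__(self, other):
        return Counts(self.total + other.total,
                      self.ctrl_on + other.ctrl_on,
                      self.lv_on + other.lv_on,
                      self.hv_on + other.hv_on,
                      self.error + other.error)


def aggregate(children):
    """Sum the counts of the included children; excluded sub trees are skipped."""
    result = Counts()
    for counts, included in children:
        if included:
            result = result + counts
    return result


def majority_state(c, on_pct=95, err_pct=5):
    """Derive a node state from percentages (integer arithmetic, no rounding issues)."""
    if c.total == 0:
        return "OFF"
    # Assumption: the node goes to ERROR once 5 % or more of its channels are in error.
    if c.error * 100 >= err_pct * c.total:
        return "ERROR"
    # Check the highest level first; a "mixed" state means at least one channel is on.
    for n_on, full, mixed in ((c.hv_on, "ON", "HV_mixed"),
                              (c.lv_on, "ON_LV", "LV_mixed"),
                              (c.ctrl_on, "ON_CTRL", "CTRL_mixed")):
        if n_on * 100 > on_pct * c.total:
            return full
        if n_on > 0:
            return mixed
    return "OFF"
```

Excluding a sub tree changes the denominator, so a parent can be `ON` even while an excluded child is only partly powered. This is one way to read "deal with inclusion and exclusion of sub trees".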
Access Control
• Access control levels are defined
• @ P5, the user level will be defined via a NICE mechanism
• Currently, there are 4 levels (passwords will be distributed according to users' knowledge):
  • Guest (or NOT logged in): Navigation through FSM panels; NO commands
  • TK_Operator: Navigation plus Commands; NO settings
  • TK_Expert: Navigation plus Commands plus Settings (expert panels)
  • TK_PVSS_expert: Navigation, Commands, Settings plus saving of settings to DB & Force TAKE of FSM (not yet supported by JCOP)
• Groups and Levels applied to the TAC setup
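The four levels form a strict hierarchy, which a small permission table can capture. This is a sketch only: the action names (`navigate`, `command`, `settings`, `save_to_db`, `force_take`) and the `is_allowed` helper are hypothetical and do not represent the JCOP access control API.

```python
from enum import IntEnum


class Level(IntEnum):
    """Access levels as listed on the slide, ordered by privilege."""
    GUEST = 0
    TK_OPERATOR = 1
    TK_EXPERT = 2
    TK_PVSS_EXPERT = 3


# Minimum level required per action (action names are illustrative).
REQUIRED = {
    "navigate": Level.GUEST,
    "command": Level.TK_OPERATOR,
    "settings": Level.TK_EXPERT,
    "save_to_db": Level.TK_PVSS_EXPERT,
    "force_take": Level.TK_PVSS_EXPERT,
}


def is_allowed(level, action):
    """True when the given level is high enough for the action."""
    return level >= REQUIRED[action]
```

For example, `is_allowed(Level.TK_OPERATOR, "settings")` is false, matching "Navigation plus Commands; NO settings".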
ALARM SCREEN, a standard tool
Operational in the TAC, still under tuning!
How it should look: EMPTY! See today at cold and running
How it looks, with different limits: See last week
Double click on panel opens corresponding FSM panel!
Level logic
[Diagram: temperature scale from top to bottom; adjacent levels are separated by ΔT steps]
• Safety upper threshold: PLC hardware interlock
• PVSS automatic software shutdown
• PVSS warning: warning for temperatures out of the nominal range (ΔT to nominal, upper side)
• Upper nominal temperature / sub detector / sub structures
• Temperature range; cooling temperature
• Lower nominal temperature / sub detector / sub structures
• PVSS warning: warning for temperatures out of the nominal range (ΔT to nominal, lower side)
• PVSS automatic software shutdown
• Safety lower threshold: PLC hardware interlock
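The symmetric level logic can be illustrated with a small classifier. All numbers, the `Band` structure and the returned labels are assumptions made for this sketch. In the real system the outermost level is a hardwired PLC interlock that acts independently of any software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """Nominal range plus the distance (ΔT) of each level from it (hypothetical values)."""
    nominal_low: float
    nominal_high: float
    dt_warning: float
    dt_shutdown: float
    dt_interlock: float


def classify(t, band):
    """Return the level reached by temperature t, on either side of the nominal range."""
    # Distance outside the nominal range; negative while inside it.
    excess = max(t - band.nominal_high, band.nominal_low - t)
    if excess >= band.dt_interlock:
        return "PLC_HARDWARE_INTERLOCK"
    if excess >= band.dt_shutdown:
        return "PVSS_SOFTWARE_SHUTDOWN"
    if excess >= band.dt_warning:
        return "PVSS_WARNING"
    return "NOMINAL"


# Example band: nominal between -15 and -5 degrees, levels 2, 5 and 10 degrees outside.
EXAMPLE = Band(nominal_low=-15, nominal_high=-5,
               dt_warning=2, dt_shutdown=5, dt_interlock=10)
```

Because the software shutdown sits inside the hardware thresholds, a slow temperature drift should trigger the gentle shutdown before the interlock is reached.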
Granularity of SWITCH OFF
• Shutdown due to DCU analysis
• Shutdown due to PLC analysis
• Shutdown due to PLC interlock
Shutdown and mapping
[Diagram: three escalation stages; each stage "hopefully prevents" the next one]
• DCUs map on one PSU. PVSS analysis of DCU & PLC (and CAEN): Master DCU DP, Max_T or Max_I too high → gentle software shutdown of one PSU, e.g. one rod, 1/3 petal
• A bunch of DCUs and a bunch of hardwired T (T_PLC; e.g. 18 for TEC; 5 for TOB; 2 or 4 for TIB) map on one cooling loop/sector: Master PLC DP, Max_T too high → gentle software shutdown of one cooling loop/sector
• Majority of sensors (T_PLC) → interlocks several crates
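The first and last stages of this escalation can be sketched as below. The identifiers (`dcu_to_psu`, the PSU names, the limits) are invented for illustration, and the majority rule is written as a strict majority, which is an assumption about how the interlock counts sensors.

```python
from collections import defaultdict


def psus_to_shut_down(dcu_temps, dcu_to_psu, max_t):
    """Map each DCU reading onto its PSU and list PSUs whose hottest DCU exceeds max_t."""
    worst = defaultdict(lambda: float("-inf"))
    for dcu, temperature in dcu_temps.items():
        psu = dcu_to_psu[dcu]
        worst[psu] = max(worst[psu], temperature)
    return sorted(psu for psu, temperature in worst.items() if temperature > max_t)


def loop_interlocked(plc_temps, limit):
    """True when a strict majority of the hardwired sensors of one loop exceed the limit."""
    over = sum(1 for temperature in plc_temps if temperature > limit)
    return 2 * over > len(plc_temps)
```

A single hot DCU only removes its own PSU, while the interlock needs several hardwired sensors to agree. A single faulty probe therefore should not cut power to several crates.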
Online Data, DB & Trending
• We are archiving:
  • PLC values
  • CAEN values
  • DCU values (just restarting)
  • Heartbeats (currently one)
    • 6 for PLC systems
    • 4 for SY1527 systems
• ONLINE value vs. Archive value
  • Online value has a much finer granularity
  • Archiving works on deadbands
    • E.g. save for ΔT > 0.3 °C
    • E.g. save HV for Δ > 2 V
• Trending exists for
  • PLC temperature
  • RH
  • Dewpoint
  • CAEN currents
  • CAEN voltages
  • DCU (here some refinements needed)
    • Currents
    • Voltages
    • Temperatures
• MEAN & MAX
  • We do MEAN & MAX for several sensors
  • We do NOT MEAN & MAX over time
  • We do not histogram!
[Slide labels: OFFLINE, ONLINE]
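Deadband archiving means a value is written only when it has moved far enough from the last archived value. The sketch below shows that rule and the "MEAN & MAX over several sensors, not over time" idea. The function names are illustrative and the actual PVSS archive smoothing settings may behave differently in detail.

```python
from statistics import mean


def deadband_filter(samples, deadband):
    """Keep a sample only if it differs from the last archived one by more than deadband."""
    archived = []
    for value in samples:
        if not archived or abs(value - archived[-1]) > deadband:
            archived.append(value)
    return archived


def mean_and_max(sensor_values):
    """Mean and maximum over several sensors at one instant (not over time)."""
    return mean(sensor_values), max(sensor_values)
```

With a 0.3 °C deadband, a slowly drifting temperature produces only a few archive entries, which is why the online value has a much finer granularity than the archived one.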
FROM ONLINE DB – PVSS Archive
Thanks to the US guys: Zongru Wan, William Badget, Lenny Spiegel, Jie Chen, Alice Bean
Prospect: 3D view, will this ever come?
• Some hard months of work for Kansas (Jie):
  • The DB content is now human readable
  • Content has to be checked by F. Glege and inserted into the DB
    • Hopefully this week
  • OBJECT2DETID possible
• IF all goes well, we can test the interface just in time in TIF
Spy tool – CRON job
• One PC is dedicated to failure analysis: a dedicated PVSS program scans through the PVSS system at defined time intervals to detect anomalies
• For example:
  • LV channel ON, but current ZERO or current not equal to nominal (so far for TIB & TOB)
  • vMon not equal to v0set
  • Changing T (boards & PLC) or I trends (LV & HV)
  • Exchanged power supplies
  • Missing archive heartbeat
  • Connection of different subsystems (PCs)
• Message goes to
  • PVSS ALARMS & WARNING
  • SMS or MAIL on demand
• This tool is responsible for failure analysis and for detecting strange features or the onset of bad trends
  • Important parts that do not fit in the core program
• WISHES and COMMENTS are welcome!
  • Via mail please!
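A few of the listed checks are simple enough to sketch. The `Channel` fields mirror the names used on the slides (vMon, v0set), but the structure, the tolerances and the message texts are assumptions. The real tool is a PVSS program, not Python.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Snapshot of one power supply channel (hypothetical structure)."""
    name: str
    on: bool
    imon: float       # measured current
    vmon: float       # measured voltage
    v0set: float      # voltage set point
    i_nominal: float  # expected current when ON


def find_anomalies(ch, v_tol=0.1, i_tol=0.2):
    """Return a list of human-readable anomalies for one channel."""
    issues = []
    if not ch.on:
        return issues
    if ch.imon == 0:
        issues.append("ON but current is zero")
    elif abs(ch.imon - ch.i_nominal) > i_tol * ch.i_nominal:
        issues.append("current differs from nominal")
    if abs(ch.vmon - ch.v0set) > v_tol:
        issues.append("vMon differs from v0set")
    return issues


def heartbeat_missing(last_seen, now, max_age):
    """True when the last archive heartbeat is older than max_age (same time units)."""
    return now - last_seen > max_age
```

Keeping such checks in a separate periodic job, as the slide describes, means they can be extended without touching the core control program.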
DCU in TCS – PSX
• DCU reading, archiving and analysis in PVSS ready
  • Since MTCC & TIB+ outer slice test
  • Distributed on a different PC
• DCU to PSU mapper functional
  • Used for TIB, TOB and TEC
  • DETIDs are then also sent by XDAQ per DCU, defined by DCUID
• FE DCU runs as part of a Power Group (PSU)
  • There is also an automatic SWITCH OFF routine
• CCU DCU runs as part of a Control Group
• After some “convincing”, we now have the PSX interface running on PVSS 3.6 on Linux SLC4
• BOTTLENECK:
  • Switching „DCU sending“ from XDAQ side AS STANDARD
  • The current system is “large”
Experiences, Issues
• Experiences (trivialities)
  • Only a correct map is a good map!
  • The decision as to which parameter belongs to experts or shifters is not always obvious
  • The distinction between core package and CRON job was the right decision
• Issues
  • Data retrieval from DB
    • New update for PVSS pending, some DB issues
  • Humidities and dewpoints to be improved
    • Some help from sub detector people would be welcome
  • Manpower: team decreased during the last months!
  • We now have to shift our focus to P5
    • E.g.: we have to establish all configurations and maps in the DB
Summary
• We are running a system of comparable size on computers that are less powerful than the ones @P5
• TCS PCs are installed @P5
• The ingredients are there and working all together
• We will make use of the TIF/TAC system (some OpCos have fun closing/opening panels to find new features)
  • To tune, e.g. ALARMS
  • Improve Mean and Max info in all panels of values in nodes below
  • Improve panels
  • Distinguish better between shifter and expert info
  • To hopefully establish 3D
• NOW, we have to focus on P5!
  • Not all wishes in TIF/TAC will be fulfilled! (put the blame on me!)
PLAN: Tracker TCS Panel
[Panel layout sketch]
• Header: Parent; Tracker; Last Action / Time
• One row each for TIB, TOB, TEC+, TEC-, with columns: #CG, %LV_CTRL, %LV, %HV, %DCU good, <T FEDCU>, <I> DCU, <T> PLC
• MAO: %BR, %Candle, %48V
• Cooling: T CP1, T CP2
• TS; Dry Gas; BCM
• Further fields: <T LIQ>, <RH>, %PLC good, MAX RH, MIN T LIQ, <imon1>, <imon2>, <imon3>, <imon4>, <T_Si FEDCU>, <T_H FEDCU>, <I> DCU, %DCU good
• Trending: <T LIQ>; RH; <T_Si FEDCU>, <T_H FEDCU>; <I> FEDCU
Data retrieval from ORACLE
Alice Bean <abean@ku.edu>, Jie Chen <jie.chen@cern.ch>
• Includes:
  • CAEN Power Supply: T, V, I, Status
• Will include
  • DCU: Silicon Detector Temperature
  • PLC: Humidity Value, T
  • …
• Requests & wishes to Alice & Jie
ROOT analysis tool: https://twiki.cern.ch/twiki/bin/view/Main/TkDCSRootAnalysis
DB web browser: http://cmsdaq.cern.ch/cmsmon/cmsdb/servlet/DatabaseBrowser
DCU communication: General
• DCUs have two main functions:
  • Switch off the PG / CG in case of error
    • FE DCU switches OFF PG (PSU)
    • DCU on CCU switches OFF CG
  • Wait for good values before next step in switching on sequence
• These functions can be enabled and disabled
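The "wait for good values before the next step" behaviour can be sketched as a gated sequence. The step names, the polling count and the `dcu_ok` callback are hypothetical, and a real implementation would also wait between polls and switch the group off again on failure.

```python
def switch_on(steps, dcu_ok, max_polls=3, check_enabled=True):
    """Execute steps in order; after each one, wait until the DCU values are good.

    steps: list of (name, action) pairs, where action is a zero-argument callable.
    dcu_ok: zero-argument callable returning True when DCU readings look good.
    Returns the names of the steps that completed and were confirmed.
    """
    confirmed = []
    for name, action in steps:
        action()
        if check_enabled:
            # Poll up to max_polls times; stop at the first good reading.
            if not any(dcu_ok() for _ in range(max_polls)):
                break  # values never became good: do not proceed to the next step
        confirmed.append(name)
    return confirmed
```

Setting `check_enabled=False` corresponds to disabling the function, in which case the sequence simply runs through.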
Mapping @ P5
• TSS cable mapping done
  • Initial preparation will interlock on „crate block“ level!
• CAENBus (DCS cables) map done
  • Crate re-distribution agreed on, to have a balanced system (respecting TCS needs)
    • 18 crates were re-located
  • One BC per rack
  • SY1527 are „pure“ on the cooling loop or sector level
• DCS to Detector map understood
• Final mapping/crosscheck via DCU2PSU tool
What's TCS doing? Hardware view
[Diagram labels]
• Central: TCS (SCADA, PVSS); Conf. DB; Cond. DB; RCMS (States)
• Auxiliary systems: BCM (?); Cooling system; Thermal screen; Dry air system; PLC (S7)
• Links: SOAP (States); SOAP to XDAQ (all DCU values?); SOAP Data (Spy); S7 Commands (only in standalone); OPC; WIRES
• TSS: PLC Interlocks (T, H)
• LV & HV data & commands; Conditioning: high/low voltages, environmental
• FEC: DCU values; CCUM (T) (~10 KB/m); Hybrids (T, V, I) (~0.5 MB/m)
TCS & TSS
TCS: Tracker Control System
• Control, Monitoring, Analysis, Trending and Archiving of
  • Detector
  • Power Supplies
  • Environmental sensors
    • Temperature; humidity
  • Auxiliary systems:
    • Cooling Plant, BCM, Magnet, TS
  • DCU information from XDAQ
• Based on
  • LHCC JCOP framework, PVSS, proper LAN connection
• Obeys CMS DCS or TK_RCMS (local running)
• Scale: ~ 10 PCs; ~ 2000 PSU; ~ 300 Ctrl PS; ~ 1000 hardwired probes; ~ 16000 DCUs
TSS: Tracker Safety System
• Autarkic PLC system on UPS, interlocks power supplies on the basis of
  • Temperatures from hardwired sensors
  • Small number of RH probes on exhaust pipes
  • Auxiliary systems:
    • Cooling Plant, BCM, Magnet, TS
  • CMS DSS System
• Scale: 6 large PLC racks; 1 PLC Master System; ~ 1000 hardwired probes