The Virtual DAQ Expert (VDE), presented by Sergey Belikov, aims to improve DAQ efficiency by diagnosing data-collection problems in real time and, in the future, fixing them automatically. Learn how it works, its current status, and its future goals. Help improve the "WHAT TO DO" suggestions for resolving issues.
New Run Control’s beast: Virtual DAQ Expert
Whom to blame: Sergey Belikov
DAQ Fest
Goals:
• Current: to annoy Achim Franz.
• Run 7: to reduce the number of night calls (ALARA) and to increase the DAQ efficiency.
• Future: to auto-fix most problems of the data collection.
How does it work:
• VDE periodically checks the status of a running Run Control by analyzing the states of most of its components: triggers, EBC, ATPs, SEBs, DCMs, GTMs.
• If the rate of all live triggers has dropped to zero, or the data flow has been broken, VDE tries to find the source(s) of the problem and tells the DO what to do to fix it. (In the near(?) future VDE will be able to fix it automatically by itself(?).)
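The polling behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the actual VDE code: every name in it (`vde_cycle`, `run_vde`, `failing_components`, the check callables, `notify`) is hypothetical, and the real component checks are replaced by caller-supplied functions. The `period_s` argument plays the role of the `$VDE_PERIOD` sleep interval shown in the block diagram.

```python
import time
from typing import Callable, Dict, List, Optional

# A check is any zero-argument callable that returns True when the component is healthy.
# All names below are hypothetical; they only illustrate the loop described in the slide.
Check = Callable[[], bool]


def failing_components(checks: Dict[str, Check]) -> List[str]:
    """Return the names of the component checks that currently fail, in dict order."""
    return [name for name, is_ok in checks.items() if not is_ok()]


def vde_cycle(
    triggers_ok: Check,
    data_flow_ok: Check,
    component_checks: Dict[str, Check],
    notify: Callable[[List[str]], None],
) -> bool:
    """Run one pass; return True if triggers and data flow are both healthy."""
    if triggers_ok() and data_flow_ok():
        return True
    # Something is wrong: look for the source(s) and report them to the operator.
    notify(failing_components(component_checks))
    return False


def run_vde(
    triggers_ok: Check,
    data_flow_ok: Check,
    component_checks: Dict[str, Check],
    notify: Callable[[List[str]], None],
    period_s: float,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Repeat vde_cycle, sleeping period_s seconds between passes."""
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        vde_cycle(triggers_ok, data_flow_ok, component_checks, notify)
        sleep(period_s)
        cycle += 1
```

Passing `sleep` and `max_cycles` as parameters is a design choice made only so that the loop can be exercised without waiting; a production monitor would presumably run until stopped.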
Block diagram (very simplified) of the VDE operation
[Flowchart, layout not recoverable from the transcript. Decision nodes: G-links OK? Triggers OK? Data Flow OK? GTMs OK? SEBs OK? ATPs OK? EBCs OK? DCMs OK? GL1 RC OK? Actions: "Tell DO what to do" and "Sleep $VDE_PERIOD seconds". A phone number, 631-689-1991, also appears in the diagram.]
What all those “OK”s mean
• G-link OK: are all DCMs’ G-links locked?
• Trigger OK: is at least one Live Trigger rolling?
• Data Flow OK:
  • No EvB mode: are the DCMs’ event counters incremented?
  • EvB mode: EBC run started and EBC is running && the “received”, “assigned” and “completed” counters are incremented && at least 10% of all ATPs are running and DD Run Started.
• GTM OK: GTM is running and VME Busy is low.
• SEB OK: Running && Buffer not Full && Nevents < Max.
• ATP OK: Running && Run Started && Processing events.
• EBC OK: Running && Run Started.
• DCM OK: G-link OK && no FPGA errors && no compressor errors && no other errors && Busy is low.
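Several of these conditions are plain boolean conjunctions over a component's status, so they can be written as small predicates. The sketch below is illustrative only: the status classes, field names, and function names are assumptions and do not come from the VDE or Run Control code. The ATP helper covers only the "at least 10% of all ATPs are running" part of the EvB-mode condition.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SebStatus:  # hypothetical snapshot of one SEB
    running: bool
    buffer_full: bool
    n_events: int
    max_events: int


@dataclass
class DcmStatus:  # hypothetical snapshot of one DCM
    glink_locked: bool
    fpga_errors: int
    compressor_errors: int
    other_errors: int
    busy: bool


def seb_ok(s: SebStatus) -> bool:
    """SEB OK: Running && Buffer not Full && Nevents < Max."""
    return s.running and not s.buffer_full and s.n_events < s.max_events


def dcm_ok(d: DcmStatus) -> bool:
    """DCM OK: G-link OK && no FPGA/compressor/other errors && Busy is low."""
    no_errors = d.fpga_errors == 0 and d.compressor_errors == 0 and d.other_errors == 0
    return d.glink_locked and no_errors and not d.busy


def counters_incremented(previous: Sequence[int], current: Sequence[int]) -> bool:
    """True if every counter grew between two consecutive snapshots."""
    return len(previous) == len(current) and all(c > p for p, c in zip(previous, current))


def enough_atps_running(atp_running: List[bool], min_fraction: float = 0.10) -> bool:
    """True if at least min_fraction of the ATPs are running (False when there are none)."""
    if not atp_running:
        return False
    return sum(atp_running) / len(atp_running) >= min_fraction
```

For example, `counters_incremented` could be applied to the EBC's "received", "assigned" and "completed" counters taken on two successive checks, or to the DCM event counters when running without the event builder.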
Current status
• VDE is able to recognize most of the problems, and even more…
• Not obvious (to me) problems:
  • Even in the GL1 Run Control, the EBC sometimes receives events but does not assign them for more than 10 seconds. Why?
  • Starting from Run 5 (4?), the DCM event counters are not incremented during Stand Alone data taking until you hit the Stop button. Has a thread race condition changed?
• It would be great if all DAQ experts could improve the “WHAT TO DO” part of the VDE. Collective experience?
Pink future…
• Using the Run 7 experience, make the “WHAT TO DO” advice correct in > 90% of cases.
• Add an “Auto Fix” button to a VDE message for the cases where we are 100% sure what has to be done to fix the condition.
• Sound accompaniment?