The ALICE data quality monitoring Barthélémy von Haller CERN PH/AID For the ALICE Collaboration
Data Quality Monitoring
• Online feedback on the quality of data
• Avoid taking and recording low-quality data
• Identify and solve problems early
• Data Quality Monitoring (DQM) involves
  • Online gathering of data
  • Analysis by user-defined algorithms
  • Storage of monitoring data
  • Visualization
Barthélémy von Haller - CERN PH/AID
Data-Acquisition architecture
[Diagram: data flow from the trigger system (CTP, LTU, TTC; L0, L1a, L2 and BUSY signals) and the detector front-end read-out (FERO) over DDLs to the D-RORCs in the LDCs, with a branch via H-RORC and FEP to the HLT farm; then through the event building network (EDM, load balancing) to the GDCs, and over the storage network to TDS and PDS. Data units along the chain: event fragment, sub-event, full event, file. Counts shown: 360 DDLs, 120 DDLs, 10 DDLs, 430 D-RORC, 10 D-RORC, 125 detector LDC, 10 HLT LDC, 30 GDC, 10 TDSM, 20 DA/DQM, 18 DSS, 25 TDS.]
The AMORE framework
• AMORE: Automatic MOnitoRing Environment
• A DQM framework for the ALICE experiment
Design & Architecture
• Publisher-subscriber paradigm
• Database used for the data pool
• Notification with DIM (Distributed Information Management System)
[Diagram: several publishers publish data to the data pool; the subscriber accesses data from the pool; notifications go from publishers to the subscriber via DIM.]
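The publish/notify/access cycle described on this slide can be sketched in a few lines. This is a minimal illustration, not AMORE code: the pool is an in-memory dictionary rather than a database, plain callbacks stand in for DIM notifications, and the class and method names (`DataPool`, `Subscriber`, `on_update`) are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class DataPool:
    """In-memory stand-in for the data pool (the slides use a database)."""

    def __init__(self) -> None:
        self._objects: Dict[str, Any] = {}
        # Callbacks play the role of DIM notifications in this sketch.
        self._listeners: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def publish(self, name: str, value: Any) -> None:
        """Store the latest value, then notify the interested subscribers."""
        self._objects[name] = value
        for callback in self._listeners[name]:
            callback(name)

    def subscribe(self, name: str, callback: Callable[[str], None]) -> None:
        """Register a callback to be invoked when `name` is published."""
        self._listeners[name].append(callback)

    def get(self, name: str) -> Any:
        """Return the latest published value for `name`."""
        return self._objects[name]


class Subscriber:
    """Fetches an object from the pool each time it is notified."""

    def __init__(self, pool: DataPool) -> None:
        self.pool = pool
        self.seen: List[Any] = []

    def on_update(self, name: str) -> None:
        # The notification carries only the name; the data is read from the pool.
        self.seen.append(self.pool.get(name))


pool = DataPool()
gui = Subscriber(pool)
pool.subscribe("amoreAgentTST01/moFloat1", gui.on_update)
pool.publish("amoreAgentTST01/moFloat1", 3.14)
pool.publish("amoreAgentTST01/moFloat1", 2.72)
```

The notification carries only the object name, so publishers and subscribers stay decoupled: the subscriber decides when and whether to read the data from the pool.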
Design & Architecture
• Published objects are encapsulated in a "MonitorObject" structure
• Plugin architecture using ROOT reflection
• Modules are dynamic libraries loaded at runtime
[Diagram: as on the previous slide, with the published data shown as serialized MonitorObjects travelling between publishers, data pool and subscriber.]
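As a rough analogy for the two ideas on this slide, the sketch below wraps a payload in an envelope and resolves a module by name at runtime. It makes several assumptions: the field names of `MonitorObject` are hypothetical, JSON stands in for ROOT serialization, and Python's `importlib` stands in for loading a dynamic library through ROOT reflection.

```python
import importlib
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class MonitorObject:
    """Envelope around a published payload (field names are hypothetical)."""
    name: str
    payload: Any
    agent: str
    timestamp: float = field(default_factory=time.time)

    def to_bytes(self) -> bytes:
        # JSON stands in for ROOT serialization in this sketch.
        return json.dumps(asdict(self)).encode("utf-8")

    @staticmethod
    def from_bytes(blob: bytes) -> "MonitorObject":
        return MonitorObject(**json.loads(blob.decode("utf-8")))


def load_module(module_name: str, attribute: str) -> Any:
    """Resolve a module by name at runtime, as a plugin loader would."""
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# A standard-library function plays the role of a detector module here.
mean = load_module("statistics", "mean")
mo = MonitorObject(name="moFloat1", payload=mean([1.0, 2.0, 3.0]),
                   agent="amoreAgentTST01")
restored = MonitorObject.from_bytes(mo.to_bytes())
```

Because the framework only handles the envelope, it can store and transport any payload without knowing its type in advance.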
Design & Architecture (diagram slide; content not recoverable)
The Pool
• Current implementation based on a database
• MySQL: reliable, fast, open-source
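A database-backed pool that keeps the latest version of each object could look like the sketch below. The table layout and function names are assumptions made for illustration, and `sqlite3` is used only so the example runs without a server; the slides state that the actual pool uses MySQL.

```python
import sqlite3
import time

# sqlite3 is a stand-in here; the pool described in the slides uses MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE monitor_objects (
        agent   TEXT NOT NULL,
        name    TEXT NOT NULL,
        updated REAL NOT NULL,
        data    BLOB NOT NULL,
        PRIMARY KEY (agent, name)
    )
    """
)


def publish(agent: str, name: str, data: bytes) -> None:
    """Insert the object or overwrite its previous version (latest value wins)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO monitor_objects VALUES (?, ?, ?, ?)",
            (agent, name, time.time(), data),
        )


def fetch(agent: str, name: str) -> bytes:
    """Return the serialized object, or raise KeyError if it was never published."""
    row = conn.execute(
        "SELECT data FROM monitor_objects WHERE agent = ? AND name = ?",
        (agent, name),
    ).fetchone()
    if row is None:
        raise KeyError(f"{agent}/{name} is not in the pool")
    return row[0]


publish("amoreAgentTST01", "moFloat1", b"\x01\x02")
publish("amoreAgentTST01", "moFloat1", b"\x03\x04")
latest = fetch("amoreAgentTST01", "moFloat1")
row_count = conn.execute("SELECT COUNT(*) FROM monitor_objects").fetchone()[0]
```

Keying the table on agent and object name means a republished object replaces the old row instead of growing the table.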
Subscriber & User Interface
• Generic GUI
  • Displays any object of any running agent
  • The layout can be handled automatically
  • Layouts can be fairly complex and can be saved for future reuse
  • Covers the basic needs of users who want to check what the agents publish
• For more complex needs, users can develop their own GUI
The generic GUI
[Screenshot with callouts: Agents, Agent sub-directories, Monitor Objects.]
The generic GUI
[Screenshot: a layout can be saved to and loaded from an XML file. Excerpt as shown on the slide:]
<?xml version="1.0"?>
<layout>
  <tab name="Tab 1">
    <pad title="_1" name="_1" xlow="0.01" ylow="0.76" xup="0.24" yup="0.99">
      <tAmoreObject name="amoreAgentTST01/moFloat1"/>
    </pad>
    ...
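To show how such a saved layout could be read back, here is a small sketch using Python's standard XML parser. The slide's excerpt stops after the first pad, so the closing `</tab>` and `</layout>` tags are assumed here to make the document parse; the function name `read_layout` is hypothetical and this is not the GUI's own loader.

```python
import xml.etree.ElementTree as ET

# Excerpt from the slide; the two closing tags are an assumption added for parsing.
LAYOUT_XML = """<?xml version="1.0"?>
<layout>
  <tab name="Tab 1">
    <pad title="_1" name="_1" xlow="0.01" ylow="0.76" xup="0.24" yup="0.99">
      <tAmoreObject name="amoreAgentTST01/moFloat1"/>
    </pad>
  </tab>
</layout>
"""


def read_layout(xml_text):
    """Return (tab name, pad box, object names) for every pad in the layout."""
    root = ET.fromstring(xml_text)
    pads = []
    for tab in root.iter("tab"):
        for pad in tab.iter("pad"):
            # Pad corners in normalized canvas coordinates.
            box = tuple(float(pad.get(key)) for key in ("xlow", "ylow", "xup", "yup"))
            objects = [obj.get("name") for obj in pad.iter("tAmoreObject")]
            pads.append((tab.get("name"), box, objects))
    return pads


pads = read_layout(LAYOUT_XML)
```

Each pad records its position on the canvas and the "agent/object" names it displays, which is enough to rebuild a tab.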
Custom GUI (screenshot slide)
Packaging & test procedure
• Subversion repositories
• GNU Autotools
• Distributed as RPM
• Strict release procedure
  • Build and validate the module on a test machine in a clean and controlled environment
• Nightly build
  • Identify broken code (wrong results, compilation failures)
Performance & benchmark
• An online environment with heavy calculations requires us to ensure performance and scalability
• To identify and handle performance issues we need:
  • Metrics
  • Statistics
  • Reproducible tests
Performance & benchmark
• Same procedure and environment as for the validation of modules
• Benchmark: all existing agents are run on data files provided by the detectors
  • Estimation of the needs of each detector
  • Identification of variations over time
  • Comparisons of machines, compilers and architectures
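A reproducible benchmark of this kind boils down to running every agent on the same input several times and keeping summary statistics. The sketch below is a simplified illustration under stated assumptions: the "agents" are toy functions, the events are synthetic bytes rather than detector data files, and the `benchmark` function and the report keys are hypothetical.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(agents: Dict[str, Callable[[bytes], object]],
              events: List[bytes],
              repeats: int = 5) -> Dict[str, Dict[str, float]]:
    """Time each agent over the same events several times and summarize."""
    report = {}
    for name, process in agents.items():
        durations = []
        for _ in range(repeats):
            start = time.perf_counter()
            for event in events:
                process(event)
            durations.append(time.perf_counter() - start)
        best = max(min(durations), 1e-12)  # guard against a zero-length timing
        report[name] = {
            "mean_s": statistics.mean(durations),
            "stdev_s": statistics.stdev(durations) if repeats > 1 else 0.0,
            "events_per_s": len(events) / best,
        }
    return report


# Toy stand-ins for monitoring agents and for recorded detector data.
agents = {"checksum": sum, "sort": sorted}
events = [bytes(range(256))] * 1000
report = benchmark(agents, events)
```

Storing such reports per build makes variations over time visible, and running the same input on different machines or compilers gives the comparisons listed above.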
Performance & benchmark (plot slide; content not recoverable)
Performance & benchmark
• Current DQM nodes: Intel(R) Xeon(R) CPU 5130 @ 2.00GHz (number of cores: ?)
• Latest generation of Intel processors: Intel(R) Core(TM) i7 CPU 965 @ 3.20GHz
Status
• In production since last summer, used during commissioning and first beam
• New features are added regularly, usually at the users' request
• 18 modules under development
Status as of March 09
Plans
• Access from outside via the eLogBook
• Regular snapshots, archive data after end of run (EOR)
[Diagram: agents publish and the GUI accesses the latest value; a FIFO keeps the X most recent values; a temporary and permanent archive is linked to the eLogBook. Archive triggers: EOR, regular time interval, on shifter demand.]
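The planned storage scheme (latest value, a FIFO of the X most recent values, and an archive filled on a trigger) can be sketched as follows. This is an illustration of the idea only: the class name `ObjectHistory`, the trigger labels and the in-memory archive list are assumptions, not the planned implementation.

```python
from collections import deque
from typing import Any, Deque, Dict, List

# Trigger labels are hypothetical names for the three triggers on the slide.
TRIGGERS = {"EOR", "interval", "shifter"}


class ObjectHistory:
    """Keeps the X most recent values of one object and archives snapshots."""

    def __init__(self, depth: int) -> None:
        # A bounded deque drops the oldest value automatically (FIFO).
        self.recent: Deque[Any] = deque(maxlen=depth)
        self.archive: List[Dict[str, Any]] = []

    def publish(self, value: Any) -> None:
        self.recent.append(value)

    @property
    def latest(self) -> Any:
        return self.recent[-1]

    def snapshot(self, trigger: str) -> None:
        """Copy the current FIFO content into the archive."""
        if trigger not in TRIGGERS:
            raise ValueError(f"unknown archive trigger: {trigger}")
        self.archive.append({"trigger": trigger, "values": list(self.recent)})


history = ObjectHistory(depth=3)
for value in range(5):
    history.publish(value)
history.snapshot("EOR")
```

The bounded FIFO keeps memory use constant during a run, while the archive grows only when one of the triggers fires.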
Plans
• Archive data after end of run (EOR), regular snapshots
• Access from outside via the eLogBook
• Fully automate the process: comparison to reference data, identification of problems, notification, actions taken
• Parallelization of AMORE
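One simple way to automate a comparison to reference data is a bin-by-bin check of normalized histograms followed by a notification. The slides do not specify the comparison method, so everything below is an assumption: the tolerance, the normalization, and the function names `deviating_bins` and `check` are chosen only for illustration.

```python
from typing import Callable, List, Sequence


def normalized(bins: Sequence[float]) -> List[float]:
    """Scale a histogram to unit area so runs of different length compare."""
    total = sum(bins)
    if total <= 0:
        raise ValueError("histogram is empty")
    return [value / total for value in bins]


def deviating_bins(observed: Sequence[float], reference: Sequence[float],
                   tolerance: float = 0.05) -> List[int]:
    """Return the indices of bins whose normalized content differs too much."""
    if len(observed) != len(reference):
        raise ValueError("histograms have different binning")
    obs, ref = normalized(observed), normalized(reference)
    return [i for i, (o, r) in enumerate(zip(obs, ref)) if abs(o - r) > tolerance]


def check(name: str, observed: Sequence[float], reference: Sequence[float],
          notify: Callable[[str], None]) -> bool:
    """Compare to the reference and notify when a problem is identified."""
    bad = deviating_bins(observed, reference)
    if bad:
        notify(f"{name}: bins {bad} deviate from the reference")
    return not bad


reference = [10, 20, 40, 20, 10]
messages: List[str] = []
ok_good = check("moHisto", [11, 19, 41, 19, 10], reference, messages.append)
ok_bad = check("moHisto", [10, 20, 40, 20, 30], reference, messages.append)
```

Passing the notification as a callback keeps the check independent of how the shifter is alerted or which action follows.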
Conclusion
• AMORE has been in production for almost a year
• It proved very useful during the commissioning and first beam periods
• Large and increasing adoption among the detector teams