
Control in ATLAS TDAQ

This overview provides an in-depth look into the control subsystem of the ATLAS TDAQ system, including dataflow, synchronization, error handling, and high-level triggers. It explores the architecture, technology choices, and the use of expert systems for control and supervision. The implementation details, including the use of the CLIPS expert system framework and the integration with other components, are also discussed.


Presentation Transcript


  1. Control in ATLAS TDAQ Dietrich Liko on behalf of the ATLAS TDAQ Group

  2. Overview • The ATLAS TDAQ System • Dataflow & HLT • Control Subsystem of the Online Software • Architecture • TDAQ Wide Run Control Group • Technology Choice • CLIPS • Design & Implementation • Expert System Framework • Run Control, Supervision & Verification • Testing & Verification • Test beam • Scalability Tests Control of the ATLAS TDAQ system

  3. The ATLAS TDAQ System [Diagram labels: Dataflow, ROD, ROS, LVL1, HLT, LVL2, Event Filter, Online System, Operation, DCS (Detector control)] • Test beam: see [331] • Event Building Performance: see [217]

  4. Control Aspects • Dataflow • Fixed configuration • Synchronization, classical Run Control • Error handling • High Level Triggers • Flexible configuration • Synchronization • Error handling

  5. ATLAS Online Software • Component Architecture • Object Oriented, C++ and Java • Distributed system (CORBA) • XML for Configuration • Specialized services for a TDAQ system • Information sharing, Message Reporting, Configuration • Iterative Development Model • Prototype already in use • Laboratories, Test beam, Scalability tests • Evolution into the systems for the initial ATLAS system

  6. Online Software Architecture • In the context of the iterative development cycle and the Technical Design Review • Reevaluation of requirements and architecture • Several high level packages & corresponding subsystems • Control • Supervision, Verification • Databases: see [130] • Configuration, Conditions • Information Sharing: see [166] • Information Service, Message Service, Monitoring

  7. Control Subsystem • In the following, only the Supervision subsystem is discussed.

  8. Supervision • Initialization and Shutdown is responsible for: • initialization of TDAQ hardware and software components; • re-initialization of a part of the TDAQ partition when necessary; • shutting the TDAQ partition down gracefully; • TDAQ process supervision. • Run Control is responsible for: • controlling the run by accepting commands from the user and sending commands to TDAQ sub-systems; • analyzing the status of controlled sub-systems and presenting the status of the whole TDAQ to the operator. • Error Handling is concerned with: • analyzing run-time error messages coming from TDAQ sub-systems; • diagnosing problems, proposing recovery actions to the operator, or performing automatic recovery if requested.

  9. TDAQ Wide Run Control Group • Examines the requirements from the subsystem side • Dataflow, HLT • Hierarchical concept • Follows the overall organization of the TDAQ system • Controller as central element • All control functionality in a combined controller • State machine concept for synchronization • Flexibility in error handling • User customization
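
The hierarchical, state-machine-based controller concept on this slide can be illustrated with a minimal Python sketch. It is not the ATLAS implementation: the state names are taken from the proxy object example later in the talk (idle, configured, running), while the command names, the `Controller` class, and the tree layout are hypothetical and chosen only for illustration.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    CONFIGURED = "configured"
    RUNNING = "running"


# Hypothetical command set: command -> (required state, resulting state).
TRANSITIONS = {
    "configure": (State.IDLE, State.CONFIGURED),
    "start": (State.CONFIGURED, State.RUNNING),
    "stop": (State.RUNNING, State.CONFIGURED),
    "unconfigure": (State.CONFIGURED, State.IDLE),
}


class Controller:
    """One node of the control tree; forwards commands to its children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.state = State.IDLE

    def command(self, cmd):
        source, target = TRANSITIONS[cmd]
        if self.state is not source:
            raise RuntimeError(
                f"{self.name}: cannot {cmd} from {self.state.value}"
            )
        # Children complete the transition before the parent reports it.
        for child in self.children:
            child.command(cmd)
        self.state = target

    def count(self):
        """Number of controllers in this subtree, including this one."""
        return 1 + sum(child.count() for child in self.children)


# A small three-level tree: root -> 2 segments -> 2 leaves each.
leaves = [Controller(f"leaf-{i}") for i in range(4)]
segments = [
    Controller("segment-0", leaves[:2]),
    Controller("segment-1", leaves[2:]),
]
root = Controller("root", segments)

root.command("configure")
root.command("start")
```

In this sketch a parent changes state only after all of its children have done so, which is one simple way to obtain synchronization across the tree. Error handling is reduced to raising an exception; the flexibility the slide asks for is exactly what this fixed transition table lacks.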

  10. Initial Design & Technology Choice • The Run Control implementation is based on a State Machine model and uses the State Machine compiler CHSM as underlying technology. • P.J. Lucas, An Object-Oriented language system for implementing concurrent hierarchical, finite state machines, MS Thesis, University of Illinois (1993) • The Supervisor is mainly concerned with process management. It has been built using the Open Source expert system CLIPS. • CLIPS, A tool for building expert systems, http://www.ghg.net/clips/CLIPS.html • The Verification system (DVS) performs tests and provides diagnosis. It is also based on CLIPS.

  11. Experiences • PLUS • Scalability test in 2002 demonstrated that a system of the size of the ATLAS TDAQ system can be controlled • MINUS • Lack of flexibility (CHSM)

  12. Technologies • CLIPS • Production system, standard open source expert system • The so-called Rete algorithm drives the evaluation of rules on a set of facts • In-house experience • General purpose scripting language, OO features • C language bindings • Alternatives • Jess: Java-based, very similar to CLIPS • Eclipse: Commercial evolution of CLIPS • SMI++ • State Machine • No general purpose scripting language • Difficult to integrate in our environment • Python • Excellent scripting language • No expert system

  13. Design & Implementation • General Framework embedding CLIPS in a CORBA server • Periodic evaluation of knowledge base • Extension mechanism • Online Software Components embedded as plug-ins • Control functionality fully described by CLIPS rules

  14. Proxy Objects • Represent external entities • Other controllers, processes, etc. • Member attributes exposed to expert system as facts • Member functions implement functionality in terms of Online Software components • Example • Proxy objects represent child controllers • State of the object corresponds to state of the child (idle, configured, running) • Commands are forwarded to child controllers
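
As a rough illustration of how proxy attributes can serve as facts and how rules can drive commands to child controllers, the following Python sketch uses a naive forward-chaining loop. It is only an analogy: the framework described in the talk uses CLIPS rules and the Rete algorithm, whereas this sketch re-evaluates every rule against every proxy. The names `ChildProxy`, `Rule`, `evaluate`, the `goal` fact, and the command names are all hypothetical.

```python
from collections import namedtuple


class ChildProxy:
    """Stands in for a child controller; its attributes are read as facts."""

    def __init__(self, name):
        self.name = name
        self.state = "idle"   # idle, configured, running
        self.sent = []        # commands forwarded so far

    def forward(self, command, new_state):
        # A real proxy would call the remote controller; here the command
        # is only recorded and the mirrored state updated.
        self.sent.append(command)
        self.state = new_state


# A rule is a named (condition, action) pair over the facts and one proxy.
Rule = namedtuple("Rule", "name condition action")

RULES = [
    Rule(
        "configure-child",
        lambda facts, p: facts["goal"] in ("configured", "running")
        and p.state == "idle",
        lambda facts, p: p.forward("configure", "configured"),
    ),
    Rule(
        "start-child",
        lambda facts, p: facts["goal"] == "running"
        and p.state == "configured",
        lambda facts, p: p.forward("start", "running"),
    ),
]


def evaluate(facts, proxies, rules, max_cycles=10):
    """Fire matching rules repeatedly until no rule matches any more."""
    fired = []
    for _ in range(max_cycles):
        progress = False
        for rule in rules:
            for proxy in proxies:
                if rule.condition(facts, proxy):
                    rule.action(facts, proxy)
                    fired.append((rule.name, proxy.name))
                    progress = True
        if not progress:
            break
    return fired


proxies = [ChildProxy("child-a"), ChildProxy("child-b")]
fired = evaluate({"goal": "running"}, proxies, RULES)
```

Setting the `goal` fact to `"running"` is enough for the rules to walk each child from idle through configured to running. The control logic lives entirely in the rule list, which mirrors the idea that control functionality is described by rules rather than hard-coded in the controller.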

  15. Controller • Rules drive interactions between objects [Diagram labels: Proxy Objects, Other Controllers, External processes]

  16. Status • Supervisor • Uses Framework • Run Control • Uses Framework • Verification system • CLIPS based • Choice of a common technology drives the path to a unified control system based on Controllers

  17. Scalability Test 2004 • Test bed • Up to 330 PCs of the CERN IT LXSHARE • 600 to 800 MHz to 2.4 GHz Dual Pentium III • 256 to 512 MB • Linux RedHat 7.3 • Only control aspect verified • No Dataflow network • Various configurations • Servers on standard machines • Servers on dedicated high-end machines

  18. Supervisor – Process Management • One Supervisor • PMG Agents • Startup limited by initialization of processes • Enhanced recovery procedures [Diagram labels: Supervisor, P, P, P]

  19. Startup with 1000 Controllers & 3000 processes in 40 to 100 seconds • Several configurations: mon_standard has two additional processes per controller

  20. Run Control • Usual RC tree • Actually 10 controllers on the lowest level • Variation of the number of intermediate nodes • Some central infrastructure • Name Service (IPC) • Information Sharing
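
To get a feel for how the shape of the Run Control tree relates to the total number of controllers, the arithmetic can be sketched in Python. The slide does not give the exact tree layouts used in the test, so the fan-out values below are assumptions: they read "10 controllers on the lowest level" as ten leaf controllers per parent and vary the intermediate levels. The function name `count_controllers` is hypothetical.

```python
def count_controllers(fanouts):
    """Total controllers in a tree with one root and the given
    per-level fan-outs (children per node), listed from the top down."""
    total = 1   # the root controller
    level = 1   # number of controllers on the current level
    for fanout in fanouts:
        level *= fanout
        total += level
    return total


# Assumed layouts, each ending with 10 leaf controllers per parent.
deep_tree = count_controllers([10, 10, 10])   # 1 + 10 + 100 + 1000 = 1111
flat_tree = count_controllers([100, 10])      # 1 + 100 + 1000 = 1101
```

Both assumed layouts have 1000 leaf controllers but differ in the number of intermediate nodes, which is the parameter the slide says was varied.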

  21. Transitions • 7 internal phases • With 1000 controllers: 2 to 6 seconds • No “real life” actions • Again: more flexible error handling

  22. Combined Test Beam 2004 • Stable operation from the start – advantage of the component model

  23. Conclusions • New assessment of requirements • Overall Architecture • Controller studied in detail • CLIPS confirmed as technology choice • Design and implementation of a new framework • First test of new systems • Test beam • Scalability test • We can control a system of the size of the ATLAS TDAQ system • Much more flexible system • Common technology in various control components • Unified controllers in the future
