FairRoot: Status and Plans
Mohammad Al-Turany
M. Al-Turany, ALICE Offline Meeting
What is the FairRoot framework, and why is it needed?
• A simulation, reconstruction, and analysis framework
• Started in 2003 as a two-person project for the CBM experiment at FAIR
• Provides a long list of ready-to-use modules and base classes needed by particle physics experiments
http://fairroot.gsi.de
Current hot topics in FairRoot
• Database interface
  • Re-design of the database interface based on TSQLServer
• ZeroMQ integration
  • Use of ZeroMQ as a communication layer
• Building, testing and quality assurance systems
  • Coverage tests, quality tests and unit tests
• Online monitoring
  • For test beams and detector prototypes
• GPU support and integration
• Time-based simulation
FairRoot Developers
Core team:
• Mohammad Al-Turany (IT)
• Denis Bertini (IT)
• Florian Uhlig (CBM / IT)
• Radek Karabowicz (PANDA / IT)
• Dmytro Kresan (R3B / IT)
• Tobias Stockmanns (PANDA, FZJ)
Students:
• Dennis Klein (finished 02.2013)
• Alexey Rybalchenko (EE)
People who participated in major features:
• Ilse König (HADES)
• Volker Friese (CBM)
• Olaf Hartman (PANDA)
A long list of people have contributed pieces of code to FairRoot since the project started at the end of 2003.
FairRoot Group at GSI
• Mohammad Al-Turany (IT)
• Denis Bertini (IT)
• Radoslaw Karabowicz (IT/PANDA)
• Dmytro Kresan (IT/R3B)
• Anar Manafov (IT)
• Alexey Rybalchenko (Master student)
• Yago Gonzalez Rozas (Guest scientist)
• Florian Uhlig (IT/CBM)
• N.N. (Sep. 2013) (IT)
Design
[Layer diagram; figure credit: Florian Uhlig, ROOT Users Workshop, Saas Fee, 13.03.13]
• Experiment frameworks: CbmRoot, R3BRoot, SofiaRoot, MPDRoot, PandaRoot, AsyEosRoot, FopiRoot, EICRoot
• FairRoot: Run Manager, IO Manager, MC Application, Event Display, Runtime DB, DB Interface, Task, Magnetic Field, Module, Detector, Event Generator, …
• ROOT libraries: Cint, ROOT IO, TTree, TGeo, TVirtualMC, TEve, Proof, …
• Geant4, Geant4_VMC, Geant3, VGM, …
FairRoot: Timeline
[Timeline figure with marked years 2004, 2006, 2010, 2011, 2012, 2013 and the following milestones:]
• Start of testing the VMC concept for CBM
• First release of CbmRoot
• PANDA decided to join -> FairRoot: the same base package for different experiments
• MPD (NICA) also started using FairRoot
• ASYEOS joined (ASYEOSRoot)
• R3B joined
• GEM-TPC separated from the PANDA branch (FOPIRoot)
• SOFIA (Studies On Fission with Aladin)
• ENSAR-ROOT: a collection of modules used by nuclear structure physics experiments
• EIC (Electron Ion Collider, BNL): EICRoot
Database Re-Design
Database in FairRoot
The real database in FairRoot is completely hidden from the user and/or software developer.
• The runtime database is not a database in the classical sense, but a parameter manager.
• It knows the I/Os defined by the user and all parameter containers needed for the actual analysis and/or simulation.
• It manages the automatic initialization and saving of the parameter containers.
• After initialization, the complete list of runs and related parameter versions is saved either to a database (Oracle, MySQL, …) or to ROOT files.
FairRoot DB Design (Old)
[Diagram components: FairRoot Run Manager; RunTime Database; IO Manager; ASCII file (configuration parameters); ROOT file (configuration parameters); ROOT file (MC points, digits, etc.); Oracle]
FairRoot DB (Extended)
[Diagram components: FairRoot Run Manager; RunTime Database; DB Interface; IO Manager; TSQLServer; ASCII file (configuration parameters); ROOT file (configuration parameters); ROOT file (MC points, digits, etc.); Oracle; MySQL; PostgreSQL]
Re-design
The database interface is re-designed on top of the ROOT Database Connectivity (RDBC) API, which provides a uniform interface to Oracle, MySQL and PostgreSQL.
• Database interface in FairRoot using TSQLServer (MySQL, Oracle, PostgreSQL, ...)
• Allows multiple connections to databases at runtime
• Adds version management
  • Data type: real and/or MC
  • Detector type
  • Date and time range
• Reduces SQL coding
  • Simple predefined table
  • Only simple SQL used
• Ultimately a generic container
• Handles write/read access
Version Management
• It must be possible to get a consistent set of information for any date (e.g. the start time of a certain run).
• It must be possible to answer the question: 'Which parameters were used when analyzing this run X years ago?' (The calibration might have been optimized several times since then, and some bugs may have been detected and corrected in the meantime.)
[Figure: validity time ranges (UTC) of parameter sets (STS CAL, MVD CAL, MVD TEMP) per detector, plotted against time and RunID]
Version Management (D. Bertini)
The query process:
• The context (timestamp, detector, version) is the primary key.
• The context is converted to a unique SeqNo.
• The SeqNo is used as the key to access all rows in the main table.
• The system gives the user access to all such rows.
[Figure: a context is matched against a validity frame in the auxiliary validity table; the matching SeqNo (900001020 in the example) selects the rows of the main table]
Reference shown on the slide: Bigtable: A Distributed Storage System for Structured Data, Google Inc., OSDI 2006
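The query process can be illustrated with a small sketch using Python's built-in sqlite3 module. The table names (`validity`, `main`), the column names and the sample values are hypothetical and chosen only for illustration; this is not the FairRoot schema, only the context → SeqNo → rows lookup under those assumptions.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a validity table and a main table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE validity (
            seqno INTEGER PRIMARY KEY,
            detector TEXT,
            version INTEGER,
            time_start INTEGER,   -- start of validity range, UTC seconds
            time_end INTEGER      -- end of validity range (exclusive)
        );
        CREATE TABLE main (seqno INTEGER, row_id INTEGER, value REAL);
    """)
    db.executemany("INSERT INTO validity VALUES (?,?,?,?,?)", [
        (1, "MVD", 0, 0, 1000),
        (2, "MVD", 0, 1000, 2000),
        (3, "STS", 0, 0, 2000),
    ])
    db.executemany("INSERT INTO main VALUES (?,?,?)", [
        (1, 0, 0.10), (1, 1, 0.11),
        (2, 0, 0.20), (2, 1, 0.21),
        (3, 0, 0.30),
    ])
    return db

def query(db, timestamp, detector, version):
    """Resolve a context to its SeqNo, then fetch all main-table rows keyed by that SeqNo."""
    hit = db.execute(
        "SELECT seqno FROM validity WHERE detector = ? AND version = ? "
        "AND time_start <= ? AND ? < time_end",
        (detector, version, timestamp, timestamp),
    ).fetchone()
    if hit is None:
        return None, []          # no validity frame matches this context
    seqno = hit[0]
    rows = db.execute(
        "SELECT row_id, value FROM main WHERE seqno = ? ORDER BY row_id",
        (seqno,),
    ).fetchall()
    return seqno, rows

if __name__ == "__main__":
    print(query(build_demo_db(), 1500, "MVD", 0))
```

Because the validity ranges are stored next to the SeqNo, asking for an old timestamp returns the parameter rows that were valid at that time, which is the property the version management slides ask for.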
New Data Transfer Layer for FairRoot
Online Reconstruction and Analysis
[Figure labels, first panel: 300 GB/s, 20M Evt/s; > 60 000 CPU cores or equivalent GPU, FPGA, …; < 1 GB/s, 25K Evt/s. Second panel: 1 TB/s; > 60 000 CPU cores or equivalent GPU, FPGA, …; 1 GB/s]
We have the fastest algorithms, but:
• How to distribute the processes?
• How to manage the data flow?
• How to recover processes when they crash?
• How to monitor the whole system?
• …
Design constraints
• Highly flexible:
  • Different data paths should be modeled.
• Adaptive:
  • Sub-systems are continuously under development and improvement.
• Should work for simulated and real data:
  • Developing and debugging the algorithms
• It should support all possible hardware on which the algorithms could run (CPU, GPU, FPGA).
• It has to scale to any size, with minimal or ideally no effort.
Data transport
• How to handle dynamic components, i.e. pieces that go away temporarily?
• How to handle messages that we can't deliver immediately, particularly if we're waiting for a component to come back online?
• What if we need to use a different network transport, say multicast instead of TCP unicast, or IPv6? Do we need to rewrite the applications, or is the transport abstracted in some layer?
Before Re-inventing the Wheel
• What is available on the market and in the community?
• A very promising package: ZeroMQ has been available for two years.
• Do we intend to separate online and offline? No.
• Multi-threaded concept, or multiple processes based on message queues?
• Message-based systems allow us to decouple producers from consumers.
• We can spread the work to be done over several processes and machines.
• We can manage, upgrade and move around programs (processes) independently of each other.
ØMQ (ZeroMQ)
• A socket library that acts as a concurrency framework.
• Carries messages across inproc, IPC, TCP, and multicast.
• Connects N-to-N via fanout, pubsub, pipeline, request-reply.
• Asynchronous I/O for scalable multicore message-passing apps.
• 30+ languages including C, C++, Java, .NET, Python.
• Most OSes including Linux, Windows, OS X, PPC405/PPC440.
• Large and active open source community.
• LGPL free software with full commercial support from iMatix.
What does it deliver?
• It handles I/O asynchronously, in background threads.
• These communicate with application threads using lock-free data structures, so concurrent ØMQ applications need no locks, semaphores, or other wait states.
• Components can come and go dynamically and ØMQ will automatically reconnect.
• You can start components in any order.
• You can create "service-oriented architectures" (SOAs) where services can join and leave the network at any time.
• When a queue is full, ØMQ either
  • automatically blocks senders, or
  • throws away messages,
  depending on the kind of messaging you are doing (the so-called "pattern").
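The two full-queue behaviours can be illustrated with a bounded queue from Python's standard library. This is only an analogy: `queue.Queue` stands in for the bounded queue of a socket, and the function names `send_blocking` and `send_lossy` are made up for this sketch; it does not use the ZeroMQ API.

```python
import queue

def send_blocking(q, msg, timeout=None):
    """Back-pressure policy: wait until the queue has free space."""
    q.put(msg, block=True, timeout=timeout)

def send_lossy(q, msg):
    """Lossy policy: never block; drop the message if the queue is full."""
    try:
        q.put_nowait(msg)
        return True
    except queue.Full:
        return False

def demo():
    q = queue.Queue(maxsize=2)          # small bound so the full case is easy to reach
    delivered = [send_lossy(q, i) for i in range(4)]
    blocked = False
    try:
        # With nobody reading, a blocking sender waits; the timeout only ends the demo.
        send_blocking(q, "x", timeout=0.05)
    except queue.Full:
        blocked = True
    return delivered, blocked, q.qsize()

if __name__ == "__main__":
    print(demo())
```

Blocking suits data paths where every message matters and the producer can be slowed down; dropping suits monitoring or logging streams where a slow consumer must never stall the producer.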
What does it deliver? (continued)
• It does not impose any format on messages.
• Messages are blobs of any size, from zero bytes to gigabytes.
• You can use any other product (protocol) on top to represent your data (Google's Protocol Buffers, etc.).
• Applications talk to each other over arbitrary transports: TCP, multicast, in-process, inter-process.
• You don't need to change your code to use a different transport.
The built-in core ØMQ patterns are:
• Request-reply, which connects a set of clients to a set of services (remote procedure call and task distribution pattern).
• Publish-subscribe, which connects a set of publishers to a set of subscribers (data distribution pattern).
• Pipeline, which connects nodes in a fan-out / fan-in pattern that can have multiple steps and loops (parallel task distribution and collection pattern).
• Exclusive pair, which connects two sockets exclusively.
Current Status
• The framework delivers components that can be connected to each other in order to optimize the data flow topology.
• All components share a common base class called Device (ZeroMQ class).
• Devices are grouped into three categories:
  • Source: Sampler
  • Message-based processor: Sink, BalancedStandaloneSplitter, StandaloneMerger, Buffer
  • Content-based processor: Processor
[Diagram: the FairMQ package holds framework classes that can be used directly; the Panda example holds experiment/detector-specific code]
Example of a PANDA online reconstruction hierarchy (scenario)
[Diagram: detector simulation (MVD pixel data, MVD strip data), a computing unit with Clusterer and Tracker devices, a parameter database, and log writer / log aggregation devices, connected through REQ/REP, PUB/SUB and XPUB/XSUB sockets]
Correct semantics for logging
• Pub/Sub sockets
• Never block
• Lossy! (if needed)
• Buffer sizes / locations configurable
• Arbitrary message size
Results
• A throughput of 940 Mbit/s was measured, which is very close to the theoretical limit of TCP/IPv4 over Gigabit Ethernet.
• The throughput for the named-pipe transport between two devices on one node was measured at around 1.7 GB/s.
Each message consists of the digits of one PANDA event for one detector, with a size of a few kBytes.
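The theoretical limit mentioned above can be estimated from the per-frame overhead. The sketch below assumes a standard 1500-byte MTU, no VLAN tag, and full-sized frames; these assumptions are not stated on the slide. The function name `tcp_goodput_mbit` is made up for this example.

```python
def tcp_goodput_mbit(link_mbit=1000.0, mtu=1500, tcp_options=0):
    """Upper bound on TCP payload rate over Ethernet when every frame is full-sized."""
    ethernet_overhead = 8 + 14 + 4 + 12   # preamble+SFD, MAC header, FCS, inter-frame gap (bytes)
    ip_header = 20                        # IPv4 header without options
    tcp_header = 20                       # TCP header without options
    payload = mtu - ip_header - tcp_header - tcp_options
    on_wire = mtu + ethernet_overhead     # bytes occupied on the wire per frame
    return link_mbit * payload / on_wire

if __name__ == "__main__":
    print(round(tcp_goodput_mbit(), 1))                 # no TCP options
    print(round(tcp_goodput_mbit(tcp_options=12), 1))   # with the 12-byte timestamp option
```

Under these assumptions the bound is about 949 Mbit/s, or about 941 Mbit/s when the TCP timestamp option is enabled, so a measured 940 Mbit/s leaves very little headroom.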
[Plot: payload in MByte/s as a function of message size]
ZeroMQ works on InfiniBand, but only by using IP over IB.
Integrating the existing software
[Diagram: the inputs (ROOT files, Lmd files, remote event server, …) can be processed either by the ROOT event loop or through ZeroMQ]
• ROOT (event loop): FairRootManager, FairRunAna, FairTasks with Init(), Re-Init(), Exec(), Finish()
• ZeroMQ: FairMQProcessorTask with Init(), Re-Init(), Exec(), Finish()
FairBase/examples/Tutorial3
Next to implement
• Local and central log processors
• Command channels and objects (messages)
• Automatic monitoring and configuration (hopefully by the end of this year!)
Summary
• The ZeroMQ communication layer is integrated into our offline framework (FairRoot).
• In the short term we will keep both options: the ROOT-based event loop and concurrent processes communicating with each other via ZeroMQ.
• In the long term we are moving away from a single event loop to distributed processes.
Thank you!
Native InfiniBand/RDMA is faster than IP over IB. Implementing ZeroMQ over IB verbs will improve the performance.
Device
• Each processing stage of a pipeline is occupied by a process that executes an instance of the Device class.
Sampler
• Devices with no inputs are categorized as sources.
• A sampler loops (optionally infinitely) over the loaded events and sends them through the output socket.
• A variable event rate limiter has been implemented to control the sending speed.
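A minimal sketch of such a source with a rate limiter is shown below. The class and parameter names (`Sampler`, `rate_hz`, `send`) are assumptions made for this example, and the `send` callback stands in for the output socket; the actual FairMQ sampler may be structured differently.

```python
import time

class Sampler:
    """Source sketch: loops over loaded events and hands each one to a send callback."""

    def __init__(self, events, send, rate_hz=None, clock=time.monotonic, sleep=time.sleep):
        self.events = list(events)
        self.send = send            # stands in for the output socket
        self.rate_hz = rate_hz      # None disables the rate limiter
        self.clock = clock          # injectable for testing
        self.sleep = sleep

    def run(self, loops=1):
        """Send all events `loops` times; pass loops=None to loop forever."""
        interval = 1.0 / self.rate_hz if self.rate_hz else 0.0
        next_send = self.clock()
        done = 0
        while loops is None or done < loops:
            for event in self.events:
                delay = next_send - self.clock()
                if delay > 0:
                    self.sleep(delay)   # rate limiter: wait for the next send slot
                self.send(event)
                next_send += interval
            done += 1

if __name__ == "__main__":
    Sampler(["evt-1", "evt-2", "evt-3"], print, rate_hz=5).run(loops=2)
```

Scheduling against absolute send slots (`next_send`) rather than sleeping a fixed interval after each send keeps the average rate stable even when sending itself takes time.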
Message format (Protocol)
• Potentially any content-based processor or any source can change the application protocol. Therefore, the framework provides a generic message class (FairMQMessage) that works with any arbitrary, contiguous chunk of memory.
• One has to pass a pointer to the memory buffer and its size in bytes, and can optionally pass a function pointer to a destructor, which will be called once the message object is discarded.
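The idea of a message that wraps an existing buffer and calls a user-supplied clean-up function can be sketched as follows. This `Message` class is an illustration written for this text, not FairMQMessage; the parameter name `on_release` is an assumption. The construction is similar in spirit to libzmq's `zmq_msg_init_data`, which also takes a buffer, a size and a free function.

```python
class Message:
    """Wraps the first `size` bytes of an existing buffer without copying them."""

    def __init__(self, buffer, size, on_release=None):
        if size > len(buffer):
            raise ValueError("size exceeds buffer length")
        self._buffer = buffer
        self._view = memoryview(buffer)[:size]   # zero-copy view into the caller's memory
        self.size = size
        self._on_release = on_release

    def data(self):
        """Return the zero-copy view of the message content."""
        return self._view

    def release(self):
        """Discard the message; the clean-up callback runs exactly once."""
        if self._view is None:
            return
        self._view.release()
        self._view = None
        if self._on_release is not None:
            self._on_release(self._buffer)

if __name__ == "__main__":
    buf = bytearray(b"digits-of-one-event")
    msg = Message(buf, 6, on_release=lambda b: print("released", len(b), "bytes"))
    print(bytes(msg.data()))
    msg.release()
```

Because the message only references the buffer, the framework does not need to know the payload layout, and ownership of the memory stays explicit through the callback.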
New, simple classes without ROOT are used in the Sampler. This enables non-ROOT clients and reduces the message size.
Processor design
Content-based Processor
• The Processor device has at least one input and one output socket.
• A task is meant for accessing and potentially changing the message content.
Message-based Processor
• All message-based processors inherit from Device and operate on messages without interpreting their content.
• Four message-based processors have been implemented so far.
Example of fan-out/fan-in of the data path for load balancing
[Diagram components: two MVD data sources; FairMQBalancedStandaloneSplitter; four Clusterer instances; three Tracker instances and one MVD Tracker; FairMQStandaloneMerger]
Example of fan-out/fan-in of the data path for load balancing (variant)
[Diagram components: two MVD data sources; FairMQBalancedStandaloneSplitter; four Clusterer instances; FairMQStandaloneMerger; two MVD Tracker instances]
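The fan-out/fan-in topology can be sketched with threads and queues from Python's standard library. This is an analogy, not FairMQ code: round-robin distribution is an assumption about what "balanced" means here, the function name `run_fan_out_fan_in` is made up, and the index attached to each message exists only so the merged output can be checked.

```python
import queue
import threading

def run_fan_out_fan_in(messages, work, n_workers=4):
    """Distribute messages round-robin to workers and merge their outputs into one list."""
    inputs = [queue.Queue() for _ in range(n_workers)]
    merged = queue.Queue()
    stop = object()                       # sentinel telling a worker to exit

    def worker(inbox):
        while True:
            item = inbox.get()
            if item is stop:
                break
            index, payload = item
            merged.put((index, work(payload)))

    threads = [threading.Thread(target=worker, args=(q,)) for q in inputs]
    for t in threads:
        t.start()

    # Splitter: round-robin over the workers; the payload is never inspected.
    for index, payload in enumerate(messages):
        inputs[index % n_workers].put((index, payload))
    for q in inputs:
        q.put(stop)
    for t in threads:
        t.join()

    # Merger: drain the single merged stream (arrival order is not guaranteed).
    results = []
    while not merged.empty():
        results.append(merged.get())
    return results

if __name__ == "__main__":
    print(sorted(run_fan_out_fan_in(range(8), lambda x: x * x)))
```

The splitter and merger only move messages, matching the message-based processors described earlier, while `work` plays the role of a content-based stage such as a clusterer.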