The data model. A. Gheata (for CWG4), CWG13 meeting, 11 Apr 2014. CWG4 – The data model. Time frame-based data model to: make data types produced by both detector FEE and processing stages generic, by prepending a “Multiple Data Header”. Implementation to follow some principles
The data model
A. Gheata (for CWG4)
CWG13 meeting, 11 Apr 2014
CWG4 – The data model
• Time frame-based data model to:
  • Make data types produced by both detector FEE and processing stages generic
  • By prepending a “Multiple Data Header”
• Implementation to follow some principles
  • Self-containment – data can be processed on any type of node
  • Strict memory management, minimizing the need for copying data for processing purposes (data service instead of “copy around”)
  • Use an efficient data layout allowing for fast navigation among data types and sources and usage of data from vectorized algorithms
  • “Objectify” on top of C-like raw structures
  • ROOT-based persistency for fast-reco data on EPNs
• Ongoing investigation and prototyping of efficient AOD formats
  • Flat vs. hierarchical object structures and the impact on processing speed and data compression
  • Investigation on event size, compression and the output of synchronous reconstruction to be discussed with CWG7 (reconstruction)
• Future work: integration simulation and benchmark
  • Realistic raw time frame simulation (CWG8) + time frame aggregation (CWG4) + FLP to EPN flow (CWG3) + concurrency model and platforms (CWG5) down to EPN reconstruction -> To be done in CWG13
  • I/O benchmark depending on event format (e.g. flat vs. hierarchical types)
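The “flat vs. hierarchical” question above can be illustrated with a minimal C++ sketch. Everything in it is an assumption made for illustration: the `Cluster` fields, the `ClustersAoS`/`ClustersSoA` names and the charge sum are hypothetical and are not taken from the CWG4 prototypes. It contrasts one object per cluster (hierarchical) with one contiguous array per field (flat), the latter being the layout that vectorized loops usually prefer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cluster record; the fields are illustrative only.
struct Cluster {
  float x, y, z, charge;
};

// Hierarchical layout ("array of structures"): one object per cluster.
using ClustersAoS = std::vector<Cluster>;

// Flat layout ("structure of arrays"): one contiguous array per field.
struct ClustersSoA {
  std::vector<float> x, y, z, charge;

  void push_back(const Cluster& c) {
    x.push_back(c.x);
    y.push_back(c.y);
    z.push_back(c.z);
    charge.push_back(c.charge);
  }

  std::size_t size() const { return charge.size(); }

  // "Objectify" on demand: rebuild an object view of element i.
  Cluster at(std::size_t i) const {
    assert(i < size());
    return Cluster{x[i], y[i], z[i], charge[i]};
  }
};

// Sums the charge by striding over whole Cluster objects.
float totalChargeAoS(const ClustersAoS& clusters) {
  float sum = 0.0f;
  for (const Cluster& c : clusters) sum += c.charge;
  return sum;
}

// Sums the charge over a single contiguous float array.
float totalChargeSoA(const ClustersSoA& clusters) {
  float sum = 0.0f;
  for (float q : clusters.charge) sum += q;
  return sum;
}
```

Both functions return the same value; only the memory access pattern differs, which is the kind of effect the planned I/O and processing benchmarks would have to measure.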
Simple view
[Figure: a time frame drawn along a time axis, made of continuous readout blocks and triggered events interleaved with heartbeats.]
Multiple Data Header
• FLP would add a common header type for all data blocks (MDH)
• Common part
  • Unique HW ID (FLP/EPN) + version ID
  • Summary info for what follows (partly extracted from SDH)
    • Data type, number of blocks, block length, status, …
  • Used for navigation in the time frame
• Specific part
  • Relevant SDH info for fast navigation (error bits, fired trigger, ..)
  • Transient block address table (for DDL data coming in sync)
• Make data blocks look the same
CWG4 - Data model
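The header described above could look roughly like the following C++ sketch. It is only a sketch under stated assumptions: the field names, field widths and the `BlockType` values are hypothetical, since the slides name the header contents but not their layout. The common part is kept as a plain C-like structure, while the specific part carries the transient address table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical block types, named after the block kinds shown in the deck.
enum class BlockType : std::uint8_t { Trigger, Heartbeat, FeeBlock, Clusters };

// Common part: the same for every data block (assumed field widths).
struct MdhCommon {
  std::uint32_t hwId;         // unique HW ID of the FLP/EPN
  std::uint16_t version;      // header version ID
  BlockType type;             // data type of what follows
  std::uint8_t status;        // status bits
  std::uint32_t nBlocks;      // number of blocks described by this header
  std::uint32_t blockLength;  // length of one block, in bytes
};

// The common part stays a C-like raw structure that can be copied bytewise.
static_assert(std::is_trivially_copyable<MdhCommon>::value,
              "MdhCommon must remain a raw, copyable structure");
static_assert(std::is_standard_layout<MdhCommon>::value,
              "MdhCommon must keep a C-compatible layout");

// Specific part: SDH summary plus the transient block address table.
struct MdhSpecific {
  std::uint32_t errorBits;                  // error bits extracted from the SDH
  std::uint32_t firedTriggers;              // fired trigger mask
  std::vector<std::uint64_t> blockOffsets;  // transient: not meant to be shipped
};

struct Mdh {
  MdhCommon common;
  MdhSpecific specific;
};

// Total payload size announced by the common part.
std::uint64_t payloadBytes(const MdhCommon& h) {
  return static_cast<std::uint64_t>(h.nBlocks) * h.blockLength;
}

// Offset of block i, read from the transient address table.
std::uint64_t blockOffset(const Mdh& h, std::size_t i) {
  assert(i < h.specific.blockOffsets.size());
  return h.specific.blockOffsets[i];
}
```

Keeping the navigation fields in a fixed-size common part is what would let any node skip through a time frame without decoding the payload.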
The new generic data block
• All data blocks produced by either FEE cards or arbitrary processing tasks on FLP (e.g. cluster finding) are to be described as generic MDB blocks. An MDH is foreseen to point to several correlated “events” coming asynchronously on different links on the same FLP. Events will have a sub-frame structure (like today)
• Processing of MDB blocks is transparent to the node type (FLP, EPN)
• EPNs will process MDB blocks but are not required to produce MDB in turn; they produce the persistent storage format (PSF) instead
Data block types
• Type=Trigger
  • HW ID = CTP
  • Orbit/BX
  • Size
  • Nb. of blocks
  • Status bits
  • SDH + PAYLOAD
• Type=Heartbeat
  • HW ID = CTP
  • HB global counter
  • HB local counters
  • Orbit/BX
  • Nb. of blocks
  • Requested actions: start run, pause, resume, end
• Type=FEE block
  • HW ID = equipment
  • Orbit/BX
  • Size
  • Nb. of blocks
  • Status bits
  • SDH (CDH) + PAYLOAD
• Type=Clusters
  • SW ID = clusterizer version
  • Size
  • Nb. of blocks
  • Status bits
  • SDH + PAYLOAD
Data management - FLP
[Figure: per-link buffers, Link i at &buffer(link1) and Link i+1 at &buffer(link2), holding blocks BLi(t,t+dt) and BLi+1(t,t+dt) between heartbeats HBn and HBn+1, each located by its offset in the buffer. Labels: Local processing, Serialize to EPN.]
• MDH: Type HB, ID #12345, Nblocks 10, Link #1 addr1, Link #2 addr2, …
• MDH: Type RAW, ID #12346, Nblocks 10, Link #1 addr3, Link #2 addr4, …
• Minimize searches on EPN for synchronized blocks
• For continuous readout it can be just the same for data reads correlated in time
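A minimal C++ sketch of the address-table idea follows, assuming each table entry records a link index, an offset in that link's buffer and a length. The names `BlockRef`, `BlockView`, `view` and `serialize` are hypothetical, and the sketch ships only the referenced payload bytes, not the header itself. Local processing reads blocks in place through a view; a copy happens only when the blocks are serialized for the EPN.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One address-table entry: which link buffer, where in it, and how long.
struct BlockRef {
  std::size_t link;
  std::size_t offset;
  std::size_t length;
};

// Simplified header: an ID plus the table of correlated blocks.
struct Mdh {
  std::uint32_t id;
  std::vector<BlockRef> blocks;
};

// One byte buffer per input link (assumed representation).
using LinkBuffers = std::vector<std::vector<std::uint8_t>>;

// Non-owning view of a block, so local processing needs no copy.
struct BlockView {
  const std::uint8_t* data;
  std::size_t length;
};

// Resolves an address-table entry to a view into the link buffer.
BlockView view(const LinkBuffers& buffers, const BlockRef& ref) {
  assert(ref.link < buffers.size());
  assert(ref.offset + ref.length <= buffers[ref.link].size());
  return BlockView{buffers[ref.link].data() + ref.offset, ref.length};
}

// Gathers all blocks referenced by one header into a contiguous message.
std::vector<std::uint8_t> serialize(const LinkBuffers& buffers, const Mdh& h) {
  std::vector<std::uint8_t> out;
  for (const BlockRef& ref : h.blocks) {
    const BlockView v = view(buffers, ref);
    out.insert(out.end(), v.data, v.data + v.length);
  }
  return out;
}
```

Because the correlated blocks arrive at the EPN already grouped under one header, the EPN would not have to search the link streams for matching data.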
The time frame data
• The time frames will start and end with O2 “heartbeat” MDH (events) and embed all data blocks collected by a given FLP. The corresponding frames will have to be aggregated on an EPN node in a folder-like structure (time frame summary) that is easy to browse by reconstruction algorithms. The fast (synchronous) persistent reconstruction format will have to achieve the required overall compression.
• Note that the HBE summary may be attached to the “end HBE” to allow for asynchronous dispatching of blocks before the frame is fully aggregated by the FLP
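The folder-like aggregation on the EPN can be sketched in C++ as a two-level map. This is a hypothetical illustration, not the CWG4 design: it assumes that sub-frames from different FLPs are matched by their starting heartbeat ID and that the EPN knows how many FLPs contribute. The names `SubFrame` and `TimeFrameStore` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Data collected by one FLP between a start and an end heartbeat.
struct SubFrame {
  std::uint32_t flpId;
  std::uint64_t startHb;  // ID of the opening heartbeat (assumed frame key)
  std::uint64_t endHb;    // ID of the closing heartbeat
  std::vector<std::uint8_t> payload;
};

// Folder-like store: time frame (start heartbeat) -> FLP -> sub-frame.
class TimeFrameStore {
 public:
  void add(SubFrame sf) {
    assert(sf.startHb <= sf.endHb);
    const std::uint64_t frame = sf.startHb;
    const std::uint32_t flp = sf.flpId;
    frames_[frame][flp] = std::move(sf);
  }

  // A frame is complete once every expected FLP has contributed.
  bool complete(std::uint64_t startHb, std::size_t expectedFlps) const {
    const auto it = frames_.find(startHb);
    return it != frames_.end() && it->second.size() == expectedFlps;
  }

  // Browses to one FLP's data in a frame; returns nullptr when absent.
  const SubFrame* find(std::uint64_t startHb, std::uint32_t flpId) const {
    const auto frame = frames_.find(startHb);
    if (frame == frames_.end()) return nullptr;
    const auto sub = frame->second.find(flpId);
    return sub == frame->second.end() ? nullptr : &sub->second;
  }

 private:
  std::map<std::uint64_t, std::map<std::uint32_t, SubFrame>> frames_;
};
```

Sub-frames can be added in any order, which matches the asynchronous dispatching mentioned above: the frame is usable for browsing as soon as `complete` reports that all contributions have arrived.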
Documentation and links
• CWG4 report, trigger heartbeat internal note
  • https://git.cern.ch/reps/alice-o2-cwg4
• Contribution in the TDR:
  • Chapter 4: Data model
• CWG4 wiki:
  • https://twiki.cern.ch/twiki/bin/viewauth/ALICE/Cwg4