High Level Processing & Offline
event selection – event processing – offline
• Data volume and event rates
• Processing concepts
• Storage concepts
Dieter Roehrich, UiB
Data volume
• Event size into the High Level Processing System (HLPS)
  • Central Au+Au collision @ 25 AGeV: 335 kByte
  • Minimum bias collision: 84 kByte
  • Triggered collision: 168 kByte
• Relative sizes of data objects
  • RAW data (processed by the online event selection system) = 100%
  • Event Summary Data – ESD: global re-fitting and re-analysis of PID possible
    • Reconstructed event + compressed raw data (e.g. local track model + hit residuals) = 20%
    • Reconstructed event + compressed processed data (e.g. local track model + error matrix) = 10%
  • Physics Analysis Object Data – AOD: vertices, momenta, PID = 2%
  • Event tags for offline event selection – TAG = << 1%
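A minimal arithmetic sketch (not part of the original slides) that turns these relative sizes into absolute per-event sizes, using the 168 kByte triggered collision as the 100% RAW reference; the 0.1% used for TAG is an assumed stand-in for "<< 1%".

    # Illustrative only: per-event sizes derived from the relative fractions above.
    RAW_KBYTE = 168.0                      # triggered Au+Au collision (100% reference)

    fractions = {
        "RAW": 1.00,
        "ESD (track model + hit residuals)": 0.20,
        "ESD (track model + error matrix)": 0.10,
        "AOD (vertices, momenta, PID)": 0.02,
        "TAG": 0.001,                      # "<< 1%": assumed 0.1% for illustration
    }

    for name, frac in fractions.items():
        print(f"{name:36s} {frac * RAW_KBYTE:7.1f} kByte")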
Event rates
• J/ψ
  • Signal rate @ 10 MHz interaction rate = 0.3 Hz
  • Irreducible background rate = 50 Hz
• Open charm
  • Signal rate @ 10 MHz interaction rate = 0.3 Hz
  • Background rate into HLPS = 10 kHz
• Low-mass di-lepton pairs
  • Signal rate @ 10 MHz interaction rate = 0.5 Hz
  • No event selection scheme applicable – minimum bias event rate = 25 kHz
Data rates
• Data rates into HLPS
  • Open charm: 10 kHz * 168 kByte = 1.7 GByte/s
  • Low-mass di-lepton pairs: 25 kHz * 84 kByte = 2.1 GByte/s
• Data volume per year – no HLPS action: 10 PByte/year
  • For comparison, ALICE = 10 PByte/year: 25% raw, 25% reconstructed, 50% simulated
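A short sketch (not from the slides) that reproduces this rate arithmetic; the effective beam time per year used for the annual-volume estimate is an assumed illustrative number, not a value quoted in the presentation.

    # Minimal sketch: HLPS input data rates and an assumed yearly volume.
    KBYTE = 1e3          # kByte in bytes (decimal convention)
    GBYTE = 1e9

    def data_rate(event_rate_hz, event_size_kbyte):
        """Input data rate in GByte/s for a given event rate and event size."""
        return event_rate_hz * event_size_kbyte * KBYTE / GBYTE

    open_charm = data_rate(10e3, 168)   # 10 kHz triggered events of 168 kByte
    dileptons  = data_rate(25e3, 84)    # 25 kHz minimum bias events of 84 kByte

    print(f"Open charm stream:    {open_charm:.2f} GByte/s")   # ~1.7 GByte/s
    print(f"Low-mass di-leptons:  {dileptons:.2f} GByte/s")    # ~2.1 GByte/s

    T_EFF = 3e6  # s of effective beam time per year (assumed, for illustration only)
    total_pbyte = (open_charm + dileptons) * GBYTE * T_EFF / 1e15
    print(f"Untreated yearly volume: {total_pbyte:.0f} PByte (with the assumed T_EFF)")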
Processing concept
• HLPS’ tasks
  • Event reconstruction with offline quality
  • Sharpen open charm selection criteria – reduce the event rate further
  • Create compressed ESDs
  • Create AODs
• No offline re-processing
  • Unpacking and dissemination of the data needs about as much CPU time as the reconstruction itself
  • RAW → ESD: never
  • ESD → ESD': only exceptionally
Data Compression Scenarios
• Loss-less data compression
  • Run-length encoding (standard technique)
  • Entropy coder (Huffman)
  • Lempel-Ziv
• Lossy data compression
  • Compress 10-bit ADC values into 8 bits using a logarithmic transfer function (standard technique)
  • Vector quantization
  • Data modeling
• Perform all of the above wherever possible
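As a concrete illustration of the lossy 10-bit to 8-bit step, here is a minimal sketch with an assumed logarithmic transfer function; the actual front-end mapping is not specified in the slides.

    # Illustrative lossy compression: map 10-bit ADC values (0..1023) onto 8 bits
    # (0..255) logarithmically, so small amplitudes keep fine granularity while
    # large amplitudes are quantized more coarsely.
    import math

    ADC_MAX_IN, ADC_MAX_OUT = 1023, 255
    SCALE = ADC_MAX_OUT / math.log1p(ADC_MAX_IN)

    def compress(adc10):
        """Map a 10-bit ADC value onto 8 bits (lossy)."""
        return round(SCALE * math.log1p(adc10))

    def expand(adc8):
        """Approximate inverse mapping back to the 10-bit range."""
        return round(math.expm1(adc8 / SCALE))

    for v in (0, 5, 50, 500, 1023):
        c = compress(v)
        print(v, "->", c, "->", expand(c))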
Data compression: entropy coder
• Variable-length coding (e.g. Huffman coding): short codes for frequent values, long codes for infrequent values
• Result: compressed event size = 72%
[Figure: probability distribution of 8-bit NA49 TPC data]
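A toy Huffman-coding sketch (illustrative only, not the NA49/HLPS implementation): it builds a prefix code from the value frequencies of a small ADC sample stream and reports the compressed size relative to fixed 8-bit coding.

    import heapq
    from collections import Counter

    def huffman_code(samples):
        """Return {value: bitstring} built from the sample frequencies."""
        freq = Counter(samples)
        # heap entries: (frequency, tie-breaker, {value: code-so-far})
        heap = [(n, i, {v: ""}) for i, (v, n) in enumerate(freq.items())]
        heapq.heapify(heap)
        tie = len(heap)
        while len(heap) > 1:
            n1, _, c1 = heapq.heappop(heap)
            n2, _, c2 = heapq.heappop(heap)
            merged = {v: "0" + code for v, code in c1.items()}
            merged.update({v: "1" + code for v, code in c2.items()})
            heapq.heappush(heap, (n1 + n2, tie, merged))
            tie += 1
        return heap[0][2]

    samples = [0, 0, 0, 1, 0, 2, 0, 1, 5, 0, 1, 0]   # toy ADC values, mostly small
    code = huffman_code(samples)
    bits = sum(len(code[v]) for v in samples)
    print(f"compressed size = {100 * bits / (8 * len(samples)):.0f}% of 8-bit raw")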
Data compression: vector quantization
• Vector
  • Sequence of ADC values on a pad
  • Calorimeter tower
  • ...
• Vector quantization = transformation of vectors into codebook entries: each vector is compared against the code book, the nearest entry is stored, and the remaining distance is the quantization error
• Result (NA49 TPC data): compressed event size = 29%
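A minimal vector-quantization sketch; the codebook and pad vectors are made up for illustration and are not taken from NA49 data. Each pad's ADC sequence is replaced by the index of its nearest codebook entry, and the distance to that entry is the quantization error.

    import numpy as np

    rng = np.random.default_rng(0)
    codebook = rng.integers(0, 256, size=(64, 8)).astype(float)   # 64 entries, 8 samples each
    pads = rng.integers(0, 256, size=(1000, 8)).astype(float)     # toy pad sequences

    # Nearest-neighbour search: one codebook index per pad vector.
    dist = np.linalg.norm(pads[:, None, :] - codebook[None, :, :], axis=2)
    indices = dist.argmin(axis=1)                 # what gets stored (1 byte each)
    quant_error = dist[np.arange(len(pads)), indices].mean()

    raw_bytes = pads.size                         # 8 samples x 1 byte per pad
    compressed_bytes = len(pads)                  # 1 codebook index per pad
    print(f"compressed size = {100 * compressed_bytes / raw_bytes:.0f}% of raw")
    print(f"mean quantization error = {quant_error:.1f} ADC counts")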
Data Compression – data modeling (1)
• Standard loss(less) algorithms (entropy encoders, vector quantization, ...) achieve a compression factor of ~2 (J. Berger et al., Nucl. Instr. Meth. A489 (2002) 406)
• Data model adapted to TPC tracking: store (small) deviations from the model (A. Vestbø et al., to be published in Nucl. Instr. Meth.)
• Cluster model depends on track parameters
[Figures: tracking efficiency and relative pt-resolution [%] before and after compression, dNch/dη = 1000]
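A minimal sketch of the model-based idea (not the actual HLT code from the reference): instead of storing absolute cluster coordinates, store the fitted track-model parameters plus the small, coarsely quantized residuals of each cluster with respect to the model. All numbers and the quantization step are assumed, for illustration only.

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy "track": cluster centroids lie close to a straight line y = a*x + b (the model).
    x = np.arange(40, dtype=float)                            # pad-row positions
    a_true, b_true = 0.3, 12.0
    y = a_true * x + b_true + rng.normal(0.0, 0.05, x.size)   # measured centroids

    # Fit the model (here a simple straight line) to the measured clusters.
    a_fit, b_fit = np.polyfit(x, y, 1)
    residuals = y - (a_fit * x + b_fit)

    # Quantize the small residuals coarsely: a few bits per cluster instead of a
    # full floating-point coordinate.
    LSB = 0.01                                                # assumed quantization step
    quantized = np.round(residuals / LSB).astype(np.int8)

    stored_bits = 2 * 32 + quantized.size * 8                 # model parameters + residuals
    raw_bits = y.size * 32                                    # one float32 coordinate each
    print(f"compressed size = {100 * stored_bits / raw_bits:.0f}% of raw")
    print(f"max residual error after quantization = {LSB / 2:.3f} pad units")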
Data Compression – data modeling (2)
• Towards larger multiplicities
  • Cluster fitting and deconvolution: fitting of n two-dimensional response functions (e.g. Gauss distributions)
  • Analyzing the remnant and keeping "good" clusters
  • Arithmetic coding of pad and time information
[Figures: compressed tracks/clusters and leftovers; scale: 100 MeV/c]
Data Compression – data modeling (3)
• Achieved compression ratios and corresponding efficiencies
• Compression factor: 10
Storage concept
Main challenge of processing heavy-ion data: logistics
• No archival of raw data
• Storage of ESDs
  • Advanced compression techniques: 10-20%
  • Only one pass
• Multiple versions of AODs