CBM DAQ and Event Selection Walter F.J. Müller, GSI, Darmstadt for the CBM Collaboration Topical Workshop: Advanced Instrumentation for Future Accelerator Experiments, Bergen, Norway, 4-6 April 2005
Outline
• CBM (very briefly)
  • observables
  • setup
• FEE/DAQ/Trigger
  • requirements
  • challenges
  • strategies
CBM at FAIR
[Figure labels: SIS 100 Tm; SIS 300 Tm; U: 35 AGeV; p: 90 GeV; Compressed Baryonic Matter Experiment]
CBM Physics Topics and Observables
• In-medium modifications of hadrons: onset of chiral symmetry restoration at high ρB. Measure: ρ, ω, φ → e+e- (μ+μ-); open charm: D0, D±
• Strangeness in matter: enhanced strangeness production. Measure: K, Λ, Σ, Ξ, Ω
• Indications for deconfinement at high ρB: anomalous charmonium suppression? Measure: D0, D±; J/ψ → e+e- (μ+μ-)
• Critical point: event-by-event fluctuations. Measure: π, K
Consequences for the experiment:
• Good e/π separation
• Vertex detector
• Hadron identification
• Low cross sections → high interaction rates → selective triggers
CBM Setup
• Radiation hard silicon pixel/strip detectors in a magnetic dipole field
• Electron detectors: RICH & TRD & ECAL: pion suppression up to 10^5
• Hadron identification: RPC, RICH
• Measurement of photons, π0, η, and muons: ECAL
CBM and HADES
All you want to know about CBM: Technical Status Report (400 p), now available under http://www.gsi.de/documents/DOC-2005-Feb-447-1.pdf
Meson Production in central Au+Au
[Figure from W. Cassing, E. Bratkovskaya, A. Sibirtsev, Nucl. Phys. A 691 (2001) 745; label: SIS300]
10 MHz interaction rate needed for 10-15 A GeV
A Typical Au+Au Collision
Central Au+Au collision at 25 AGeV (URQMD + GEANT): 160 p, 170 n, 360 π-, 330 π+, 360 π0, 41 K+, 13 K-, 42 K0
10^7 Au+Au interactions/sec → 10^9 tracks/sec to reconstruct for first level event selection
CBM Trigger Requirements
Assume archive rate: few GB/sec, 20 kevents/sec
• In-medium modifications of hadrons: onset of chiral symmetry restoration at high ρB
  • measure: ρ, ω, φ → e+e-: offline
  • open charm (D0, D±): trigger on displaced vertex; drives FEE/DAQ architecture
• Strangeness in matter: enhanced strangeness production
  • measure: K, Λ, Σ, Ξ, Ω: offline
• Indications for deconfinement at high ρB: anomalous charmonium suppression?
  • measure: D0, D±: trigger
  • measure: J/ψ → e+e-: trigger on high pt e+e- pair
• Critical point: event-by-event fluctuations
  • measure: π, K: offline
Open Charm Detection
• Example: D0 → K-π+ (3.9%; cτ = 124.4 μm)
  • reconstruct tracks
  • find primary vertex
  • find displaced tracks
  • find secondary vertex
• High selectivity because combinatorics is reduced
[Figure labels: target; few 100 μm; 5 cm; first two planes of vertex detector]
CBM DAQ Requirements Profile
• D and J/Ψ signals drive the rate capability requirements
• D signal drives FEE and DAQ/Trigger requirements
• Problem similar to B detection, like in LHCb or BTeV (rip)
• Adopted approach: displaced vertex 'trigger' in first level, like in BTeV (rip)
• Additional problem: DC beam → interactions at random times → time stamps with ns precision needed → explicit event association needed
• Current design for FEE and DAQ/Trigger:
  • Self-triggered FEE
  • Data-push architecture
Conventional FEE-DAQ-Trigger Layout
[Diagram labels: Detector; especially instrumented detectors; FEE with buffer (limited capacity); trigger primitives over dedicated connections; L0 Trigger (fbunch); L1 Accept; limited L1 trigger latency; DAQ (modest bandwidth); L1 Trigger (specialized trigger hardware); L2 Trigger (standard hardware); Archive; Cave / Shack]
Limits of Conventional Architecture
• Decision time for first level trigger is limited (typ. max. latency 4 μs for LHC) → not suitable for complex global triggers like secondary vertex search
• Only especially instrumented detectors can contribute to first level trigger → limits future trigger development
• Large variety of very specific trigger hardware → high development cost
The way out ... use Data Push Architecture
[Diagram, first animation step: labels of the conventional layout (especially instrumented detectors, L0 Trigger, fbunch, trigger primitives, dedicated connections, buffer with limited capacity, L1 Accept, limited L1 trigger latency, modest bandwidth, specialized trigger hardware, standard hardware) together with the new labels fclock, time distribution, high bandwidth, special hardware]
The way out ... use Data Push Architecture
[Diagram labels: Detector; fclock; FEE; Cave / Shack; DAQ (high bandwidth); L1 Trigger (special hardware); L2 Trigger; Archive]
The way out ... use Data Push Architecture
[Diagram labels: Detector; fclock; FEE; Cave / Shack; DAQ (high bandwidth); L1 Select (special hardware); L2 Select; Archive]
• Self-triggered front-end: autonomous hit detection
• No dedicated trigger connectivity: all detectors can contribute to L1
• Large buffer depth available: system is throughput-limited and not latency-limited
• Modular design: few multi-purpose rather than many special-purpose modules
• Use term: Event Selection
Front-End for Data Push Architecture
• Each channel detects autonomously all hits
• An absolute time stamp, precise to a fraction of the sampling period, is associated with each hit
• All hits are shipped to the next layer (usually concentrators)
• Association of hits with events is done later using time correlation
• Typical parameters, with a few % occupancy and 10^7/sec interaction rate:
  • some 100 kHz channel hit rate
  • few MByte/sec per channel
  • whole CBM detector: 1 TByte/sec
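The typical parameters quoted on this slide follow from simple rate arithmetic. The following is a minimal sketch, assuming an occupancy of 3% and 8 bytes per hit; both numbers are illustrative choices, not values from the talk.

```python
# Rate arithmetic for one self-triggered channel (illustrative assumptions).
INTERACTION_RATE_HZ = 1e7   # from the slide: 10^7 interactions/sec
OCCUPANCY = 0.03            # assumed: "a few %" taken as 3%
BYTES_PER_HIT = 8           # assumed: time stamp + address + amplitude


def channel_hit_rate(interaction_rate_hz, occupancy):
    """Mean hit rate of a single channel in Hz."""
    return interaction_rate_hz * occupancy


def channel_data_rate(hit_rate_hz, bytes_per_hit):
    """Mean data rate of a single channel in bytes/sec."""
    return hit_rate_hz * bytes_per_hit


hit_rate = channel_hit_rate(INTERACTION_RATE_HZ, OCCUPANCY)
data_rate = channel_data_rate(hit_rate, BYTES_PER_HIT)
print(f"hit rate : {hit_rate / 1e3:.0f} kHz per channel")
print(f"data rate: {data_rate / 1e6:.1f} MByte/sec per channel")
```

With these assumed inputs the result lands in the "some 100 kHz" and "few MByte/sec" range given on the slide.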
Typical Self-Triggered Front-End
Use sampling ADC on each detector channel running with appropriate clock
• Average 10 MHz interaction rate
• Not periodic like in collider
• On average 100 ns event spacing
[Figure: sampled amplitude vs. time (samples 0 to 30) with a threshold line; two hits found, with a: 126, t: 5.6 and a: 114, t: 22.2]
Time is determined to a fraction of the sampling period
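A hit finder of this kind can be sketched in a few lines. The sketch below assumes that the hit time is taken from a linear interpolation of the leading-edge threshold crossing and that the amplitude is the largest sample of the pulse; the talk does not specify the estimator, so both choices and the function name `find_hits` are hypothetical.

```python
def find_hits(samples, threshold):
    """Return a list of (time, amplitude) for pulses crossing the threshold.

    time is in units of the sampling period and is interpolated linearly
    between the last sample below and the first sample at or above the
    threshold (an assumed estimator, not the one from the talk).
    """
    hits = []
    i = 1
    n = len(samples)
    while i < n:
        below, above = samples[i - 1], samples[i]
        if below < threshold <= above:
            # Sub-sample crossing time by linear interpolation.
            t = (i - 1) + (threshold - below) / (above - below)
            # Follow the pulse while it stays above threshold, track the peak.
            peak = above
            j = i
            while j < n and samples[j] >= threshold:
                peak = max(peak, samples[j])
                j += 1
            hits.append((t, peak))
            i = j
        else:
            i += 1
    return hits


waveform = [0, 0, 10, 60, 120, 80, 30, 0, 0, 20, 90, 40, 0]
for t, a in find_hits(waveform, threshold=50):
    print(f"a: {a}  t: {t:.1f}")
```

Each channel runs this independently of any trigger, which is what makes the front-end self-triggered.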
Toward Multi-Purpose FEE Chain
Chain: PreAmp → preFilter → ADC → digital Filter → Hit Finder → Backend & Driver
• Inputs: Pad, GEMs, PMT, APDs
• preFilter: anti-aliasing filter
• ADC: sample rate 10-100 MHz; dyn. range 8...12 bit
• digital Filter: 'shaping', 1/t tail cancellation, baseline restorer
• Hit Finder: hit parameter estimators (amplitude, time)
• Backend & Driver: clustering, buffering, link protocol
All potentially in one mixed-signal chip
See talks by V. Lindenstruth and L. Musa
CBM DAQ and Online Event Selection
• More than 50% of total data volume is relevant for first level event selection
• Aim for simplicity
• Ansatz: do (almost) all processing after the build stage
• Simple two layer approach: 1. event building 2. event processing
• Other scenarios are possible, putting more emphasis on:
  • do all processing as early as possible
  • transfer data only when necessary
STS, TRD, and ECAL data are used in first level event selection (figure annotations: needed for D; needed for J/ψ; useful for J/ψ)
Logical Data Flow
• Concentrators: multiplex channels to high-speed links
• Time distribution
• Buffers
• Build network
• Processing resources for first level event selection, structured in small farms
• Connection to 'high level' selection processing
Bandwidth Requirements
• Data flow: ~1 TB/sec (Gilder helps)
• 1st level selection: ~10^14 to 10^15 operations/sec, ~100 sub-farms (Moore helps)
• Data flow: few 10 GB/sec
• To archive: few 1 GB/sec
Focus on CNet
Self-Triggered FEE – Output Format I
Output of a FEE chip is a list of hits. Each hit has a timestamp plus other information.
Output of a single FEE chip (columns: time stamp, channel address, other values such as amplitudes, pulse shape):
17 15 ...
68 34 ...
134 18 ...
135 19 ...
1234 33 ...
!! Time stamp values can increase forever !!
? How to express absolute time efficiently ?
Handle the infinite Time Axis
1. Subdivide time in epochs
2. Express a time relative to an epoch (practical epoch length about 10 μs)
3. Introduce epoch markers
[Figure: time axis divided into Epoch 1 to Epoch 4 by epoch markers, with example hits at (2, 137 ns) and (3, 314 ns)]
Self-Triggered FEE – Output Format II
Output of a FEE chip is a list of hits and epoch markers. Each hit has a timestamp plus other information.
Each record starts with a record type: M = epoch marker, H = hit.
M 1
H 17 15 ...
H 68 34 ...
H 134 18 ...
H 135 19 ...
H 1234 33 ...
M 2
M 3
H 258 19 ...
The last record is a hit with effective timestamp (3, 258).
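The epoch scheme can be illustrated with a small encoder and decoder. This is a sketch under stated assumptions: records are modelled as Python tuples `("M", epoch)` and `("H", time, channel)`, times are in ns, and the epoch length is set to 10 μs as suggested on the previous slide. The tuple layout and the function names are hypothetical, not the actual FEE format.

```python
EPOCH_NS = 10_000  # assumed epoch length: 10 us expressed in ns


def encode(hits, first_epoch=0, epoch_ns=EPOCH_NS):
    """Turn time-sorted (absolute_time_ns, channel) pairs into M/H records.

    A marker is emitted for every epoch, including empty ones.
    """
    records = [("M", first_epoch)]
    current = first_epoch
    for t_abs, channel in hits:
        epoch, t_rel = divmod(t_abs, epoch_ns)
        while current < epoch:
            current += 1
            records.append(("M", current))
        records.append(("H", t_rel, channel))
    return records


def decode(records, epoch_ns=EPOCH_NS):
    """Recover (absolute_time_ns, channel) pairs from M/H records."""
    hits = []
    epoch = None
    for rec in records:
        if rec[0] == "M":
            epoch = rec[1]
        else:
            _, t_rel, channel = rec
            hits.append((epoch * epoch_ns + t_rel, channel))
    return hits


hits = [(10_017, 15), (10_068, 34), (30_258, 19)]
stream = encode(hits, first_epoch=1)
print(stream)
print(decode(stream))
```

The relative time stamp needs only enough bits to cover one epoch, while the markers carry the slowly changing upper part of the absolute time.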
Self-Triggered FEE – Concentrators
A concentrator merges the data streams and eliminates redundant epoch markers. It seems prudent to keep data always sorted in time.
Record columns: record type, time, address.
Input from first FEE: M 1 | H 17 15 ... | H 68 34 ... | H 134 18 ... | H 135 19 ... | H 1234 33 ... | M 2 | M 3 | H 258 19 ...
Input from second FEE: M 1 | H 18 2007 ... | M 2 | H 589 2134 ... | M 3 | H 258 2714 ...
Merged output: M 1 | H 17 15 ... | H 18 2007 ... | H 68 34 ... | H 134 18 ... | H 135 19 ... | H 1234 33 ... | M 2 | H 589 2134 ... | M 3 | H 258 19 ... | H 258 2714 ...
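The merge step can be sketched with a standard k-way merge. The sketch below assumes records modelled as tuples `("M", epoch)` and `("H", time, address)`, input streams that are already time sorted and start with an epoch marker, and ties at equal time broken by address. These conventions are illustrative assumptions, not the actual concentrator logic.

```python
import heapq


def _keyed(stream):
    """Attach a sort key (epoch, time, address) to every record.

    Markers get time -1 so they sort before all hits of their epoch.
    """
    epoch = None
    for rec in stream:
        if rec[0] == "M":
            epoch = rec[1]
            yield (epoch, -1, 0), rec
        else:
            yield (epoch, rec[1], rec[2]), rec


def merge_streams(*streams):
    """Merge time-sorted M/H streams and drop redundant epoch markers."""
    merged = []
    last_marker = None
    keyed = (_keyed(s) for s in streams)
    for (epoch, _, _), rec in heapq.merge(*keyed, key=lambda item: item[0]):
        if rec[0] == "M":
            if epoch == last_marker:
                continue  # same marker already emitted by another input
            last_marker = epoch
        merged.append(rec)
    return merged


fee_a = [("M", 1), ("H", 17, 15), ("H", 68, 34), ("H", 134, 18),
         ("H", 135, 19), ("H", 1234, 33), ("M", 2), ("M", 3), ("H", 258, 19)]
fee_b = [("M", 1), ("H", 18, 2007), ("M", 2), ("H", 589, 2134),
         ("M", 3), ("H", 258, 2714)]
for rec in merge_streams(fee_a, fee_b):
    print(*rec)
```

Because every input is already sorted, the merge needs only one pending record per input, which keeps the buffering in a concentrator small.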
FEE Data – Clusters I
• In many subsystems a particle causes correlated hits in physically neighboring detector cells (STS, TRD, ECAL)
• Depending on detector subsystem
  • the cluster pattern is 1d or 2d
  • contained in one FEE chip or not
• Examples in CBM:
  • STS-MAPS: 2d, contained
  • STS-Strip: 1d, mostly contained
  • TRD: 1d mostly contained to 2d often uncontained, depending on pad geometry (varies inside → outside)
  • RPC: t.b.d.
  • ECAL: 2d, many uncontained
Note for 2d: a 16 (64) channel chip has ¾ (½) of its channels on the perimeter!
FEE Data – Clusters II
• Usually one wants to read very low amplitude hits in the tail of a cluster
  • a low channel hit threshold might give too much noise
  • → read a low amplitude hit only if it is in the neighborhood of a big one
  • → how to handle clusters crossing a chip border?
• Use two thresholds
  • high threshold determines particle hit and region of interest (RoI)
  • RoI communicated to all relevant neighbors
  • low amplitude hits in RoI are validated and sent
  • → this implies cross communication on CNet between FEE chips...
If RoI are communicated, CNet becomes a real network!! (Better named FNet)
See talks by V. Lindenstruth and L. Musa
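The two-threshold idea can be sketched for a 1d strip of channels inside a single chip. The function name `select_hits`, the `reach` parameter (how many neighbors on each side belong to the region of interest) and the example amplitudes are assumptions for illustration; clusters crossing a chip border, which is the case that requires the cross communication mentioned above, are not handled here.

```python
def select_hits(amplitudes, high, low, reach=1):
    """Return the channel indices that are read out.

    A channel at or above `high` seeds a region of interest (RoI) covering
    `reach` neighbors on each side. Inside an RoI every channel at or above
    `low` is validated; channels outside any RoI are suppressed even if
    they exceed `low`.
    """
    n = len(amplitudes)
    roi = set()
    for seed, amplitude in enumerate(amplitudes):
        if amplitude >= high:
            for ch in range(seed - reach, seed + reach + 1):
                if 0 <= ch < n:
                    roi.add(ch)
    return sorted(ch for ch in roi if amplitudes[ch] >= low)


amps = [3, 12, 60, 9, 2, 11, 1, 4]
print(select_hits(amps, high=50, low=8))
```

In this example channel 5 exceeds the low threshold but is not read, because no channel near it passes the high threshold.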
Focus on BNet
Event Building – Alternatives
• Straight event-by-event approach:
  • data arrives on ~1000 links
  • ~100 byte per event and link
  • 10^10 packets/sec to handle...
• Handle time intervals or event intervals
  • 10 μs or 100 events seems reasonable
• Very regular and fully controlled traffic pattern:
  • data traffic can be scheduled to avoid network congestion
  • a large fraction of the switch bandwidth can be used
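The packet rates behind these alternatives can be checked with integer arithmetic. The sketch below uses the slide's round numbers (~1000 links, ~100 byte per event and link, 10^7 interactions/sec); the function names are hypothetical.

```python
LINKS = 1000                      # from the slide: ~1000 links
BYTES_PER_EVENT_AND_LINK = 100    # from the slide: ~100 byte
INTERACTION_RATE_HZ = 10_000_000  # 10^7 interactions/sec


def packet_rate(links, event_rate_hz, events_per_packet=1):
    """Packets/sec over all links when events are grouped into packets."""
    return links * event_rate_hz // events_per_packet


def packet_size(bytes_per_event, events_per_packet=1):
    """Mean packet payload in bytes."""
    return bytes_per_event * events_per_packet


def total_bandwidth(links, bytes_per_event, event_rate_hz):
    """Aggregate data flow in bytes/sec, independent of the grouping."""
    return links * bytes_per_event * event_rate_hz


for group in (1, 100):
    rate = packet_rate(LINKS, INTERACTION_RATE_HZ, group)
    size = packet_size(BYTES_PER_EVENT_AND_LINK, group)
    print(f"{group:3d} events/packet: {rate:.0e} packets/sec, {size} byte each")
print(f"total: {total_bandwidth(LINKS, BYTES_PER_EVENT_AND_LINK, INTERACTION_RATE_HZ):.0e} byte/sec")
```

Grouping 100 events (about 10 μs at 10 MHz) cuts the packet rate by a factor of 100 while the aggregate data flow stays at about 1 TByte/sec.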
Networking I
• High-speed networking
  • high density connectors
  • 2.5 Gbps SerDes now ~100 mW
  • 480 Gbps InfiniBand switch on one chip
• DDR and QDR link speeds will come
  • just wait and see
• Mellanox MTA47396 24 port InfiniBand switch
  • 4x ports, 1 GByte/sec per port
  • → 96 x 2.5 Gbps SerDes
  • 480 Gbps aggregate B/W
  • Single chip implementation
  • 961 ball BGA
  • 18 W power dissipation
  • Double data rate version (5 Gbps per link) in pipe....
Networking II
• TODAY:
  • Voltaire ISR 9288 switch
  • 288 4x ports; non-blocking
  • cost today ~120 kEUR (or ~400 EUR/port)
  • 288 GByte/sec switching bandwidth
• Likely in a few years:
  • 288 4x port QDR
  • likely same or lower cost
  • 1152 GByte/sec switching speeds
  • adequate for CBM...
• Conclusion:
  • BNet switch is not a major issue
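The switch figures on the two networking slides can be reproduced from the lane rates. The sketch below assumes 8b/10b line coding on each 2.5 Gbps (SDR) or 10 Gbps (QDR) lane, and assumes that the 480 Gbps "aggregate" figure of the 24 port chip counts both directions; the second point is an interpretation, not stated in the talk.

```python
def port_payload_gbyte_per_s(lanes, lane_gbps):
    """Payload bandwidth of one port in GByte/sec, assuming 8b/10b coding."""
    raw_gbps = lanes * lane_gbps
    payload_gbps = raw_gbps * 8 / 10   # 8 payload bits per 10 line bits
    return payload_gbps / 8            # bits to bytes


def switch_payload_gbyte_per_s(ports, lanes, lane_gbps):
    """Payload bandwidth summed over all ports of a switch."""
    return ports * port_payload_gbyte_per_s(lanes, lane_gbps)


# 24 port switch chip: SerDes count and raw aggregate (both directions assumed)
serdes = 24 * 4
aggregate_gbps = serdes * 2.5 * 2

sdr_288 = switch_payload_gbyte_per_s(288, 4, 2.5)    # today
qdr_288 = switch_payload_gbyte_per_s(288, 4, 10.0)   # QDR projection
eur_per_port = 120_000 / 288

print(serdes, aggregate_gbps)
print(sdr_288, qdr_288, round(eur_per_port))
```

The QDR projection exceeds the ~1 TByte/sec data flow quoted for CBM, which is the basis for the conclusion that the BNet switch is not a major issue.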
Focus on PNet
Network Characteristics
• Data push; datagrams; errors marked but not recovered
• Request/response and data push; transactions; errors recovered
L1 Event Selection Farm Layout
• Current working hypothesis: CPU + FPGA hybrid system (proviso follows)
• Use programmable logic for cores of algorithms
• Use CPU for the non-parallelizable parts
• Use serial connection fabric (links and switches)
• Modular design (only few board types)
Network Summary
5 different networks with very different characteristics:
• CNet: medium distance, short messages, special requirements; connects custom components (FEE ASICs)
  • FEE interfaces and CNet will be co-developed
  • Implementation: custom
• TNet: broadcast time (and tags), special requirements
  • Depends on how clock/time distribution is done; potentially built with CNet components
  • Implementation: custom
• BNet: naturally large messages, rack-to-rack
  • Probably uncritical
  • Implementation: Ethernet, InfiniBand, ...
• PNet: short distance, most efficient if already 'built-in'; connects standard components (FPGA, SoCs)
  • Look at emerging technologies; stay open for changes and surprises; cost efficiency is key here!!
  • Implementation: PCIe, ASI, ....
• HNet: general purpose, to rest of world
  • Whatever the implementation is, it will be called Ethernet...
  • Implementation: Ethernet
Algorithms
• Performance of L1 feature extraction algorithms is essential
  • critical in CBM: STS tracking + vertex reconstruction; TRD tracking and PID
• Look for algorithms which allow massively parallel implementation
  • Hough Transform tracker: needs lots of bit level operations, well suited for FPGA
  • Cellular Automaton tracker
  • Other approaches to be evaluated
• Co-develop tracking detectors and analysis algorithms
  • L1 tracking is necessarily speed optimized → more detector granularity and redundancy needed
• Aim for CBM: validate final hardware design with at least 2 trackers suitable for L1
Algorithms – an Example
• Hough Transform
  • assume track comes from (close to) primary vertex
  • map each measurement into 'Hough space'
  • a peak in Hough space indicates a real track
  • is a 'global' method
  • needs substantial amount of calculation to fill and analyze the histograms
• Many, but very simple operations
  • allows massively parallel implementation
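A toy version of the method shows the fill and peak search steps. This is only a sketch under strong assumptions: tracks start at the origin and follow x = θ·z + κ·z² in one bending plane (a small-angle stand-in for a track in a dipole field), and the binning, parameter names and test data are invented for illustration. It is not the CBM tracker.

```python
from collections import Counter


def hough_peak(hits, kappas, theta_min, theta_max, n_theta):
    """Fill a (theta, kappa) histogram and return its highest peak.

    hits: list of (x, z) measurements with z > 0.
    For every hit and every kappa bin the matching theta is computed,
    so each hit votes along a line in Hough space.
    Returns (theta_bin_center, kappa, votes).
    """
    width = (theta_max - theta_min) / n_theta
    votes = Counter()
    for x, z in hits:
        for k_idx, kappa in enumerate(kappas):
            theta = (x - kappa * z * z) / z
            t_idx = int((theta - theta_min) // width)
            if 0 <= t_idx < n_theta:
                votes[(t_idx, k_idx)] += 1
    (t_idx, k_idx), count = votes.most_common(1)[0]
    return theta_min + (t_idx + 0.5) * width, kappas[k_idx], count


# One simulated track (theta = 0.105, kappa = 0.002) plus three noise hits.
planes = [5, 10, 15, 20, 25, 30, 40]
track = [(0.105 * z + 0.002 * z * z, z) for z in planes]
noise = [(3.0, 10), (-2.0, 20), (1.0, 30)]
kappas = [-0.004, -0.002, 0.0, 0.002, 0.004]

theta, kappa, count = hough_peak(track + noise, kappas, -0.5, 0.5, 100)
print(theta, kappa, count)
```

Every vote is an independent multiply, subtract and increment, which is why the histogram filling maps well onto parallel hardware such as FPGAs.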
Hough-Transform – Implementation
• Very suitable for implementation in programmable logic (FPGAs)
• Other track finder approaches, like the cellular automaton tracker, are also under investigation
Interim Summary
• Event definition has changed:
  • now based on time stamps and time correlation
• Role of DAQ has changed:
  • DAQ is simply responsible for transporting data from producers to consumers
• Role of 'Trigger' has changed:
  • filter events delivered by DAQ
  • 'Online Event Selection' is the better term
• System aspects:
  • 'online' – 'offline' boundary blurs
  • more COTS (commercial off the shelf) components
  • much more modular system
  • much more adaptable system
• This is an emerging technology in HEP, though it is the baseline for ILC. However, it has been used for many years in nuclear structure.
Moore – quo vadis ?
• Will price/performance of computing continue to improve?
  • What are the limits of CMOS technology?
  • Where are the markets? What are the market forces?
• Technology
  • most of the gain comes from architecture anyway
  • conventional designs, especially x86, reach their limits
• Markets
  • end of the metal-box PC age → laptops + PDAs + all kinds of dedicated boxes (video, games)
  • end of the binary compatibility age → intermediate code + 'Just in Time' compilers (JIT)
There is life after Intel x86. A lot of architectural innovation ahead.
BlueGene vs Cell Processor
[Block diagrams with labels: CPU, Cache, Mem, IO for BlueGene; SPE, Mem, IO for Cell]
• BlueGene: 121 mm2; 130 nm; 2.8/5.6 DP GFlop
• STI Cell: 221 mm2; 90 nm; 256 SP GFlop; 30 DP GFlop; 25 GB/sec mem; 78 GB/sec IO
Finally presented at ISSCC 2005 (International Solid-State Circuits Conference)
SPE = Synergistic Processing Element
BlueGene vs Cell Processor
• BlueGene (BG/L): developed by IBM. Market: national security, science. Budget: ~100 M$
• Cell: developed by Sony, Toshiba and IBM. Market: VIDEO GAMES. Budget: 500 M$
High performance computing is now driven by embedded systems (games, video, ....) → Science is a spin-off, at best ...
STI Cell Processor
• 'normal' PowerPC CPU
• 8 Synergistic Processing Elements (SPE), each with
  • 256 kB memory
  • 128 x 128 bit registers
  • 4 SP floating point units
  • own instruction stream
• 32 multiply/add per clock cycle
• runs at > 4 GHz
221 mm2 die size in 90 nm
Game Processors as Supercomputers ?
[Slide from CHEP'04, Dave McQueeney, IBM CTO US Federal]
CPU and FPGA paradigms merge
[Diagram, three panels:
• Conventional CPU: register, control, one ALU
• SIMD (single instruction, multiple data) CPU: wide register, control, several ALUs
• Configurable Instruction Set CPU: wide register, control, arithmetic resources (array of ALUs) linked by a configurable connection fabric (PSMs)]
Configurable Instruction Set Processor
• Example: Stretch S5xxx
• Hybrid design:
  • conventional fixed instruction set part
  • plus configurable instruction set part
• C/C++ compiler analyses the kernel of algorithms
  • generates custom instruction set
  • generates code to use it
• The promise
  • ease of use of C/C++
  • performance of an FPGA
[Figure: Stretch S5 engine, from Stretch Inc. product brief; interconnected resources. Fabric is the keyword.]
CPU and FPGA paradigms merge
[Diagram: processor industry world view (CPU with configurable logic attached) vs. FPGA industry world view (configurable logic with embedded CPUs)]
• A lot of innovation in the years to come
• Essential will be the availability of efficient development tools
• Moore will go on!
  • There are the technologies
  • There are the markets
  • Architectural changes ahead