Summary of TDAQ upgrade open meetings
TDAQ upgrade open meetings • Two meetings up to now • December 2008 • http://indico.cern.ch/conferenceDisplay.py?confId=46823 • January 2009 • http://indico.cern.ch/conferenceDisplay.py?confId=47257 • Norman Gee is the convener • TDAQ sessions during the Upgrade week • End of February 2009
Items of past meetings • Introduction and time table (N. Gee) • Current dataflow limitations (D. della Volpe) • Insertable B-layer readout (K. Einsweiler) • Physics rates and trigger selection (S. Tapprogge) • Monte Carlo status (Z. Marshall) • SLAC ROD development (A. Haas) • Presented in the last TDAQ week • Tracker Level 1 trigger (Nikhef workshop) • Clustering on chip, Level 0 accept • Level 1 calorimeter trigger (Calo trigger workshop) • FTK
ATLAS Studies/Plans • For 2012/13: • New tracker Insertable B-layer (IBL) • Additional readout, probably with new-style RODs • For 2018 SLHC: • Replace Inner detector (ASICs will be frozen in 12-18 months) • Calorimetry – digital front ends, new forward calo (ASICs) • Muons – rates! • Approval Timetable • 2009/early 2010: IBL TDR • 2009: LoI - ATLAS changes for SLHC • 2010: Technical Proposal (+CMS, + SLHC) • 2011: Technical Design Report(s)
TDAQ • Need to develop upgrade plans • With physics & trigger community: understand requirements – trigger selections and rates, performance of algorithms, handling pileup, consequences of new detector components • With detectors and Electronics Coordinator: • Dialogue on key parameters (e.g. latency, trigger and data rates, …) • Encourage uniformity at detector/TDAQ interface – protocols, h/w, s/w • Calo/L1Calo interface; L1Muon; role of FTK; L1 or L1.5 track trigger… • Internal TDAQ: • Develop trigger architecture (L1; L1.5; merge L2 & EF?) • Identify & study options to handle higher data rates, complex events • Cost, effort, timetable; evolve into project • TDAQ must get up to speed, reach consensus, be ready to write Tech Proposal • Must do R&D while still supporting running experiment as first priority
Aims for the next few months • Submit Expression of Interest for TDAQ Upgrade R&D in next 1-2 months • Short document; full TDAQ consultation before submission • Distributed to ATLAS CB; informs all institutes that the programme is starting • Submit R&D Proposal for TDAQ Upgrade R&D in following 3 months • Details of programme and justification; institute participation by area; schedule; resources needed; impact on existing work • Approved by EB on recommendation of USG • Used by institutes in support of requests to funding agencies • Produce TDAQ Upgrade Baseline Designs (one for Phase I, one for Phase II) • First versions for Feb 2009 • Initially based on first ideas, rough estimates of behaviour, and evolving as studies are done; but also providing a focus, as time is short • (We have agreed to do a similar design for L1Calo/Calo interface) • Initiate needed studies • Will need wide participation
How? • Small TDAQ upgrade oversight group (Chris, David, Livio, Stefan, Norman) • With others as required • TDAQ open meetings • Next is 21 Jan '09; several meetings in upgrade week 23-27 Feb '09; Upgrade workshop in TDAQ week May '09 • Activities being started (list will grow) • Requirements capture • Current System Limits (TDAQ and detector front ends/readout) • Collection of detector information (future) • Data flow modelling • HLT processing times, CPU & memory resources • Level-1 …
Insertable B-layer (IBL) • Adds 14 M channels to 80 M • Front-end based on FE-I4 chips • Intermediate storage to manage higher occupancies • Clusters are formed within column pairs • Modest bandwidth reductions at high luminosity • Each FE chip in the IBL • Has dedicated TTC (down) and Data (up) links • Uses individual fibers to maximize compatibility with existing infrastructure • TTC link would run at either 40 MHz or possibly 80 MHz • The Data link would run at 160 Mbit/s • Encoding is under discussion • Estimated throughput per link @ 3×10^34 • Between 80 and 120 Mbit/s (~30% safety factor for IBL) • Present B-layer saturates at 2.5×10^34
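The link figures on this slide can be checked with a few lines of arithmetic. The sketch below takes the quoted 160 Mbit/s link speed and the 80 to 120 Mbit/s estimated load. The slide does not say how the "~30% safety factor" is defined, so both readings shown here (unused fraction of the link, and margin relative to the load) are assumptions.

```python
# Figures quoted on the slide.
LINK_MBIT_S = 160.0                    # Data link speed per FE chip
ESTIMATED_LOAD_MBIT_S = (80.0, 120.0)  # estimated throughput range per link


def unused_fraction(link_mbit_s: float, load_mbit_s: float) -> float:
    """Fraction of the link capacity left unused at the given load."""
    return 1.0 - load_mbit_s / link_mbit_s


def margin_over_load(link_mbit_s: float, load_mbit_s: float) -> float:
    """Extra capacity expressed relative to the load itself."""
    return link_mbit_s / load_mbit_s - 1.0


if __name__ == "__main__":
    for load in ESTIMATED_LOAD_MBIT_S:
        print(
            f"load {load:5.0f} Mbit/s: "
            f"unused {unused_fraction(LINK_MBIT_S, load):.0%}, "
            f"margin over load {margin_over_load(LINK_MBIT_S, load):.0%}"
        )
```

At the upper estimate of 120 Mbit/s the two readings bracket the quoted figure: a quarter of the link is unused, or the link carries a third more than the load.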
Operation above 10^34 for pixel • These results assume an L1 trigger rate of 100 kHz • Possible with the existing Silicon RODs • Estimates are very naïve • There is not a large amount of buffering in the readout system • Poor performance expected when the bandwidth limit is approached • Operation above design luminosity will require a reduction in trigger rate • No detailed simulation effort at the module level • A high intensity testbeam program suggests that the module operation will be (barely) OK • The readout of the present FE chip degrades at high occupancy • Critical occupancy is expected to be ~2-3 times design luminosity • Significant hit losses will occur in the FE chips, independent of L1 rate
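The trade-off between luminosity and trigger rate can be sketched with a deliberately naive model. It assumes that the data volume on a link grows linearly with both luminosity and L1 rate, and that the link saturates at 2.5×10^34 for a 100 kHz L1 rate (the figure quoted for the present B-layer). The linear scaling is an assumption made here, not a result from the talk, and the model ignores the FE hit losses, which do not depend on the L1 rate.

```python
REFERENCE_L1_RATE_KHZ = 100.0  # L1 rate assumed for the quoted estimates
SATURATION_LUMI = 2.5e34       # luminosity at which the present B-layer saturates


def max_l1_rate_khz(
    lumi: float,
    saturation_lumi: float = SATURATION_LUMI,
    reference_rate_khz: float = REFERENCE_L1_RATE_KHZ,
) -> float:
    """Largest L1 rate keeping (rate x luminosity) at or below saturation.

    Assumes link load is proportional to luminosity times L1 rate, and
    never returns more than the reference rate.
    """
    if lumi <= 0:
        raise ValueError("luminosity must be positive")
    return min(reference_rate_khz, reference_rate_khz * saturation_lumi / lumi)


if __name__ == "__main__":
    for lumi in (1.0e34, 2.5e34, 3.0e34, 5.0e34):
        print(f"L = {lumi:.1e}: max L1 rate ~ {max_l1_rate_khz(lumi):.0f} kHz")
```

Under these assumptions, doubling the luminosity beyond the saturation point halves the sustainable L1 rate.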
Readout configuration • Assume Silicon ROD (or similar) is used • Modularity driven by s-link throughput • 160 Mb/s implies 8 FE per ROD • Only small changes to the ROD • BOC should deal with • 160 Mbit/s single links (presently 80 Mbit/s) • Improved link protocol • 64 RODs (4 VME crates) needed • Currently 192 (9 VME crates) • 25-50% increase in the total ROD count
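The "8 FE per ROD" figure follows from matching the FE link speed to the S-link output of the ROD. The sketch below assumes an S-link throughput of 160 MB/s (the value quoted in the dataflow part of this talk) and the worst case in which every FE link runs at its full nominal speed.

```python
SLINK_MBYTE_S = 160.0   # S-link throughput, megabytes per second
FE_LINK_MBIT_S = 160.0  # proposed IBL data link speed, megabits per second


def fe_per_rod(slink_mbyte_s: float, fe_link_mbit_s: float) -> int:
    """Number of fully loaded FE links that one S-link can carry."""
    slink_mbit_s = slink_mbyte_s * 8  # convert bytes to bits
    return int(slink_mbit_s // fe_link_mbit_s)


if __name__ == "__main__":
    n_fe = fe_per_rod(SLINK_MBYTE_S, FE_LINK_MBIT_S)
    print(f"FE chips per ROD: {n_fe}")
    print(f"FE chips served by 64 RODs: {64 * n_fe}")
```

With 64 RODs this modularity serves 512 FE links.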
SLAC ROD upgrade From K. Einsweiler • There is some thinking in the upgrade community about a longer-term ROD improvement program (in fact it would replace the complete ROD/ROL/ROS system). This is very interesting, and worth following for the SLHC. • This approach is very much optimized towards high speed data movement, with embedded general purpose processing which could be very flexibly deployed to perform initial analysis on the data streams. • However, the present Silicon ROD has evolved to meet both standard DAQ requirements, and the more complex requirements of Calibration. The ROD supports both the DAQ functions of formatting and monitoring data transmitted from the detector, and the TTC functions of trigger and command transmission. • During normal data-taking, the TTC function is largely a transparent broadcast. However, during calibration, the TTC and DAQ functions are tightly coupled as specialized commands are sent to individual modules and specific analysis is performed on the returning data in a tight loop. On the ROD, this requires close coordination of the Master (fixed point) DSP that oversees control functions and command/trigger sequences and the four Slave (floating point) DSPs that analyze returning data using a total of 1 GB of local memory for histogramming purposes. • The parallelism provided by the ROD farm (528 FP DSPs and 132 GB of memory) is essential for rapid characterization of the 80M channel Pixel Detector. The present ROD is complex and imperfect, but after many years of work, it does what we need. The IBL does not seem to require major improvements in the ROD/ROS.
Physics rates and trigger • Detailed rates and menus for LHC • Available at 10^31 and 10^32 • How do they scale? • Understand requirements • Including pile-up and event size effects
Level 1 rates at 10^31 • Total estimate without overlap is ~12 kHz
Level 1 rates at 10^32 • Total estimate without overlap is ~46 kHz
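One way to summarise how the two Level 1 totals relate is an effective power-law exponent between the two luminosities. The sketch below uses only the two quoted totals (~12 kHz at 10^31 and ~46 kHz at 10^32). The menus at the two luminosities presumably use different thresholds and prescales, so the exponent describes how the menus were tuned; it is not a scaling law and should not be used to extrapolate.

```python
import math

# Quoted Level 1 totals (without overlap), keyed by luminosity.
L1_TOTAL_KHZ = {1e31: 12.0, 1e32: 46.0}


def effective_exponent(lumi_a: float, rate_a: float,
                       lumi_b: float, rate_b: float) -> float:
    """Exponent alpha such that rate_b / rate_a = (lumi_b / lumi_a) ** alpha."""
    return math.log(rate_b / rate_a) / math.log(lumi_b / lumi_a)


if __name__ == "__main__":
    (la, ra), (lb, rb) = sorted(L1_TOTAL_KHZ.items())
    alpha = effective_exponent(la, ra, lb, rb)
    print(f"rate ratio {rb / ra:.2f} for a luminosity ratio {lb / la:.0f}")
    print(f"effective exponent alpha = {alpha:.2f}")
```

An exponent well below 1 means the total rate grows much more slowly than the luminosity, which is what tighter selections at higher luminosity would produce.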
Trigger menus • Initial prototype for 10^31 • Description takes more than 40 pages • Includes algorithm sequences at HLT
Some trivial notes • SLHC physics requirements are a moving target • Need initial LHC results • Lack of quantitative results • Some level 1 estimates available • Trigger menu for 10^31 already too complex • Need to simplify, will grow with luminosity • More complex triggers probably needed at level 1 • Need for tools and MC simulations • To have quantitative assessments
Physics requirements to TDAQ • To exploit physics potential of SLHC • Triggers for discovery physics • (Very) high pT objects (thresholds increased wrt LHC) • Triggers for precision measurements • High pT objects (with similar thresholds as for LHC) • Use more exclusive / multi-object selection to control rate • Monitor and calibration triggers • Low to high pT thresholds (will be pre-scaled) • Conditions at 10^35 will impact trigger rates • Higher rate for fixed threshold and efficiency • Trivial increase by corresponding increase in luminosity • Further increase due to less effective isolation criteria, fake rate • Due to the 80 – 500 interactions per crossing
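The number of interactions per crossing can be estimated from the luminosity with the standard relation mu = L × sigma / (number of filled bunches × revolution frequency). The sketch below is only an order-of-magnitude check. The inelastic cross section (80 mb), the revolution frequency (11245 Hz) and the bunch counts (2808 for 25 ns spacing, 1404 for 50 ns spacing) are assumptions introduced here and are not taken from the talk.

```python
MB_TO_CM2 = 1e-27          # 1 millibarn in cm^2
SIGMA_INEL_MB = 80.0       # assumed inelastic pp cross section (not from the talk)
REVOLUTION_HZ = 11245.0    # assumed LHC revolution frequency
BUNCHES_25NS = 2808        # assumed filled bunches, 25 ns spacing
BUNCHES_50NS = 1404        # assumed filled bunches, 50 ns spacing


def mean_interactions(lumi_cm2_s: float, filled_bunches: int,
                      sigma_mb: float = SIGMA_INEL_MB) -> float:
    """Mean number of inelastic interactions per filled bunch crossing."""
    interactions_per_s = lumi_cm2_s * sigma_mb * MB_TO_CM2
    crossings_per_s = filled_bunches * REVOLUTION_HZ
    return interactions_per_s / crossings_per_s


if __name__ == "__main__":
    for label, bunches in (("25 ns", BUNCHES_25NS), ("50 ns", BUNCHES_50NS)):
        mu = mean_interactions(1e35, bunches)
        print(f"L = 1e35, {label} spacing: mu ~ {mu:.0f}")
```

With these inputs the estimate at 10^35 is roughly 250 interactions per crossing for 25 ns spacing and roughly 500 for 50 ns spacing, which sits inside and at the top of the quoted range.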
Calo trigger (extrapolation) • If we could maintain the same performance as at lower L: • L1 Trigger Thresholds for O(10 kHz) rate: • ~60 GeV inclusive isolated EM trigger • ~25 GeV isolated EM pair • L1 Trigger Thresholds for O(1 kHz) rate: • ~300 GeV inclusive jets • ~100 GeV ETmiss • Saturation rate (250 GeV/tower) > 100 Hz • Maybe OK for high ET physics. But they are optimistic: • These are trigger thresholds, not the ET at which the trigger is efficient • SLHC pileup will degrade performance • Can't simply raise all thresholds • Need EW scale triggers still
Muon trigger • Single muon trigger • Already critical at 10^34 • No rate control above ~20 GeV • Rely on combined trigger? • Are there possible improvements? • Track trigger? • Big impact on ID upgrade • Simulation studies needed
Monte Carlo status • Is current detector a good approximation? • No big problems with Muon and LArg changes • Most problems coming from ID • Upgrades imply many more channels • There are not enough free channel IDs • Implies ID remapping • Very difficult to change geometry • Geometry is not yet defined • Comparison of options needed • Private versions of simulation • Non-trivial problems to run it
Full simulation • G4 produces one event at a time • With current geometry • Digitization able to go up to 10^34 • Takes a few minutes per event • Limited by memory usage • Timing does not scale with luminosity
ATLFAST-II • Two modular components • FastCaloSim • Shower parametrization • Only calorimetry (no level 1) • No time information (offline event overlay not possible) • FATRAS (ATLFAST-IIF) • Readout geometry and custom physics modules • Muon system digitized • Time information available, trigger info • Inner detector still not digitized • No trigger, only in-time pileup • Rough timing • ATLFAST-II (400 MinBias events) ~3.5 h • ATLFAST-IIF (400 MinBias events) ~15 min
Time scales (unofficial) • Upgrade detector geometry • Needs decisions and manpower • 6+ months from integrated geometry descriptions • Digitization / full simulation • Can run 10^34 some of the time, 10^35 perhaps this year • Some tricks to get pileup exist, but would need validation • ATLFAST-II • Simulation of in-time pileup exists • Limited trigger modules • L1 calo ready for validation - month-or-so time scale? • Out of time pileup could be many months away
ROS PC and ROBin • The ROS PC is a critical point of the system • Hosts the ROBin • ROB memory limits the possible level-2 and EB latency • Data received through 3 s-links (160 MB/s each) • A Gigabit Ethernet interface is available • Event data read out from ROB on request • Limitation on the rate of level-2 requests • Data and delete requests via PCI • Alternative path is the ROBin Gigabit Ethernet interface
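The statement that ROB memory limits the level-2 and event-building latency can be made concrete: fragments stay in the buffer until a delete request arrives, so the buffer must hold everything that arrives during the decision time. In the sketch below the buffer size (64 MB per ROB) and the fragment size (1 kB) are hypothetical values chosen for illustration; only the 100 kHz L1 rate appears in this talk.

```python
ASSUMED_BUFFER_BYTES = 64e6    # hypothetical buffer memory per ROB
ASSUMED_FRAGMENT_BYTES = 1e3   # hypothetical average fragment size
L1_RATE_HZ = 100e3             # L1 rate used elsewhere in the talk


def max_buffer_time_s(buffer_bytes: float, l1_rate_hz: float,
                      fragment_bytes: float) -> float:
    """Time until the buffer fills if no fragment is deleted.

    This bounds the combined level-2 plus event-building latency,
    since a fragment cannot be dropped before its decision arrives.
    """
    fill_rate_bytes_s = l1_rate_hz * fragment_bytes
    if fill_rate_bytes_s <= 0:
        raise ValueError("fill rate must be positive")
    return buffer_bytes / fill_rate_bytes_s


if __name__ == "__main__":
    t = max_buffer_time_s(ASSUMED_BUFFER_BYTES, L1_RATE_HZ, ASSUMED_FRAGMENT_BYTES)
    print(f"buffer covers about {t:.2f} s of data")
```

Larger fragments or a higher L1 rate shrink this window proportionally, which is why event size growth at higher luminosity feeds directly into the latency budget.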
ROS performance • Limitations from: L2 requests, EB requests
General limitations of DF • ROB input limited by S-link • S-link throughput is 160 MB/s • Any link exceeding this limit generates backpressure • ROS is limited by output • L2 and EB • Data requests • Data throughput • Data size • TDAQ can handle larger data volumes • Actual limitation is Tier-0 (300 MB/s)
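Two of these limits translate directly into numbers. The sketch below derives the largest average fragment a single S-link can carry at a 100 kHz L1 rate, and the event recording rate allowed by the 300 MB/s Tier-0 limit. The 1.5 MB event size is an assumption introduced here for illustration; it is not quoted in the talk.

```python
SLINK_BYTES_S = 160e6        # S-link throughput
L1_RATE_HZ = 100e3           # L1 rate used elsewhere in the talk
TIER0_BYTES_S = 300e6        # Tier-0 input limit
ASSUMED_EVENT_BYTES = 1.5e6  # assumed full event size (not from the talk)


def max_fragment_bytes(link_bytes_s: float, l1_rate_hz: float) -> float:
    """Largest average fragment per link before backpressure sets in."""
    return link_bytes_s / l1_rate_hz


def max_recording_rate_hz(tier0_bytes_s: float, event_bytes: float) -> float:
    """Event rate to storage allowed by the Tier-0 bandwidth."""
    return tier0_bytes_s / event_bytes


if __name__ == "__main__":
    frag = max_fragment_bytes(SLINK_BYTES_S, L1_RATE_HZ)
    rate = max_recording_rate_hz(TIER0_BYTES_S, ASSUMED_EVENT_BYTES)
    print(f"max average fragment per link: {frag:.0f} bytes")
    print(f"max recording rate at assumed event size: {rate:.0f} Hz")
```

Both limits tighten as events grow: a larger fragment reaches the S-link ceiling at a lower L1 rate, and a larger event reduces the rate that Tier-0 can absorb.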
Possible improvements • New ROS motherboard (MB) and CPU • Improve the limits due to high data requests • Measured gain ~30% • Cost 1.5 kCHF per motherboard • PCI Express ROBin • Current bandwidth 256 MB/s • Requires PC upgrade • Gain >30% • Cost 2 kCHF (PC) + 4*3.0 kCHF/ROBin = 14 kCHF • Switch based scenario • Use the ROBin Gigabit Ethernet to handle requests • Tests on-going • Requires more ROBins?
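The two costed options can be compared per ROS PC using only the figures on the slide. The sketch below treats the PCI Express gain as exactly 30%, although the slide quotes it as a lower bound (">30%"), so the cost per percentage point for that option is an upper estimate. The switch-based scenario has no cost or gain quoted and is left out.

```python
# Per-ROS-PC figures from the slide; gains are fractions.
OPTIONS = {
    "new motherboard and CPU": {"cost_kchf": 1.5, "gain": 0.30},
    # One PC upgrade plus four ROBins; gain is quoted as ">30%".
    "PCI Express ROBin": {"cost_kchf": 2.0 + 4 * 3.0, "gain": 0.30},
}


def cost_per_gain_point(cost_kchf: float, gain: float) -> float:
    """Cost in kCHF per percentage point of performance gain."""
    return cost_kchf / (gain * 100.0)


if __name__ == "__main__":
    for name, opt in OPTIONS.items():
        ratio = cost_per_gain_point(opt["cost_kchf"], opt["gain"])
        print(f"{name}: {opt['cost_kchf']:.1f} kCHF, {ratio:.3f} kCHF per % gain")
```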
The SLAC ROD (1) • An alternative approach • Motivations: • Future running will require more bandwidth • Larger events (more detectors, more luminosity) • Larger data fraction to L2 for efficient triggering • ROD maintenance is an issue • Hard to find ROD components
The SLAC ROD (2) • Proposed improvements • Unique ROD • Operating system (not just DSP!) • Calibration data at high readout rate • Higher bandwidth per “read-out link” • Increased ability to handle hot “read-out links” • Full EB at 100 kHz • To get better HLT selection • Merge L2 and EF • Address ROD and ROS upgrades together
Components (1) • Reconfigurable Cluster Element (RCE)
Components (2) • Cluster Interconnect (CI) board
Components (3) • ATCA Read Out Crate
Proposed upgrade • Read-out module (ROM) • Contains 8 (16?) RCEs • Can read-out and pre-process data from front-ends • Could be used as upgraded ROD • Plus L1.5 triggering? • Can buffer the data and present it to L2/EF over Ethernet • Could be used as upgraded ROS • Proposed for B-layer…
Brainstorming on DAQ/HLT • Focused on architecture and infrastructure • Mainly for HLT • Called by Andre Anjos • Nov 19th 2008 • Appendix to the DAQ week • Jan 22nd 2009
Current TDAQ architecture • 3 data networks • Simplify the design • Keep flexibility • Merge L2 and EF on the same PU • 2 types of processing units • 7 types of applications • Queues and timeouts
Key points • Simplify HLT • Automatic load balance • Testing and deployment in a single application • Simplify networking • Merging L2 and EF networks • Makes possible incremental event building • Latency and ROB size? • Simplify configuration • Fewer application types • Fewer queues • Fewer timeouts
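The idea of merging L2 and EF with incremental event building can be illustrated with a toy model. The sketch below is hypothetical throughout: `ProcessingUnit`, `fetch`, and the ROB names are invented for illustration and do not correspond to any existing TDAQ interface. A single unit first pulls only the fragments needed for an L2-like decision, and pulls the remaining fragments only if the event survives.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    event_id: int
    fragments: Dict[str, bytes] = field(default_factory=dict)  # ROB name -> data


class ProcessingUnit:
    """Hypothetical merged L2+EF unit: one application, one data network."""

    def __init__(self, fetch: Callable[[int, str], bytes], all_robs: List[str]):
        self.fetch = fetch        # function returning one ROB fragment
        self.all_robs = all_robs  # every ROB needed for a full event
        self.requests = 0         # number of ROB requests issued so far

    def _pull(self, event: Event, robs: List[str]) -> None:
        """Request only the fragments that are not yet in the event."""
        for rob in robs:
            if rob not in event.fragments:
                event.fragments[rob] = self.fetch(event.event_id, rob)
                self.requests += 1

    def process(self, event_id: int, roi_robs: List[str],
                l2_step: Callable[[Event], bool],
                ef_step: Callable[[Event], bool]) -> bool:
        event = Event(event_id)
        self._pull(event, roi_robs)       # L2-like step: partial data only
        if not l2_step(event):
            return False                  # rejected early, no full build
        self._pull(event, self.all_robs)  # add the missing fragments
        return ef_step(event)             # EF-like step on the full event


if __name__ == "__main__":
    robs = ["rob0", "rob1", "rob2", "rob3"]
    pu = ProcessingUnit(lambda eid, rob: bytes([eid % 256]), robs)
    print(pu.process(1, ["rob0"], lambda e: False, lambda e: True), pu.requests)
    print(pu.process(2, ["rob0"], lambda e: True, lambda e: True), pu.requests)
```

In this toy model a rejected event costs one ROB request, while an accepted one costs a request per ROB. Fragments of accepted events stay in the ROBs for the whole processing time, which is the "Latency and ROB size?" question on the slide.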
Some other aspects • Timing • Phase-I – 2012-2013? • Adiabatic approach • First prototype by Mario Bondioli (Irvine)
Personal notes • Requirements: • Dynamic system • Online latency measurements needed • Dynamic thresholds and timeouts • Process control • HLT must be a state machine • Allocated resources must be de-allocated • Machine resources • Multithreading to • Share resources • Improve scalability with new machines • Current “offline approach” is not correct • Lack of control • Bad use of resources • Dependencies are complex and in the wrong direction • Start from an online framework !
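The requirements "HLT must be a state machine" and "allocated resources must be de-allocated" can be sketched together. The states, commands, and resource names below are hypothetical and only illustrate the pattern: every command is checked against an explicit transition table, and whatever `configure` allocates is released by `unconfigure`.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    CONFIGURED = auto()
    RUNNING = auto()


# Allowed transitions: (current state, command) -> next state.
TRANSITIONS = {
    (State.IDLE, "configure"): State.CONFIGURED,
    (State.CONFIGURED, "start"): State.RUNNING,
    (State.RUNNING, "stop"): State.CONFIGURED,
    (State.CONFIGURED, "unconfigure"): State.IDLE,
}


class HltNode:
    """Hypothetical HLT node driven only through explicit transitions."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.resources = []  # names of resources currently held

    def handle(self, command: str) -> State:
        key = (self.state, command)
        if key not in TRANSITIONS:
            raise RuntimeError(f"'{command}' not allowed in state {self.state.name}")
        if command == "configure":
            self.resources.append("algorithm-buffers")  # allocate
        elif command == "unconfigure":
            self.resources.clear()                      # de-allocate everything
        self.state = TRANSITIONS[key]
        return self.state


if __name__ == "__main__":
    node = HltNode()
    for cmd in ("configure", "start", "stop", "unconfigure"):
        print(cmd, "->", node.handle(cmd).name)
```

Keeping the transition table as data makes illegal command sequences fail loudly instead of leaving the node in an undefined state.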
Conclusions • What could the involvement of the Italian groups be? • Current interests • Timescales • Synergies to be exploited to the fullest • HW and SW aspects to be considered jointly • Interactions between the parts to be assessed carefully • Let us try, starting now, to consolidate our efforts as much as possible