SuperB FCTS/DAQ Protocol Proposal Tradeoffs

Gregory Dubois-Felsmann & Steffen Luitz
SuperB workshop, Elba
2 June 2008

Assumptions, options, and tradeoffs
There has been discussion provoked by our proposal: good!
• We considered a choice between two main options
• Fixed-latency triggering and variable-latency triggering w/ queueing
• There are other significant choices to make
• Overlapping-event handling is an important one
• Let’s look again at the tradeoffs...
SuperB FCTS/DAQ protocol proposal

Where things stand
• Last detector workshop:
• We agreed that a reasonable way to move forward was for us to propose a sketch of the protocol for event data movement between the front-end electronics and the DAQ system, assuming a fully triggered readout.
• This is primarily intended to provoke concrete reactions and discussion from those who have to design and build the electronics. They will very likely know better than we do what the constraints are, and we expect that this will lead to changes in the protocol.
• Original goal:
• Circulate a written draft in time for feedback and rethinking by Elba.
• Reality:
• Proposal discussed at a detector telecon; writeup is in progress.

Reminder about basic assumptions
• Previously established
• Physics goals require an open trigger, e.g., similar to BaBar’s.
• Level 1 Accept rate of ~100kTps (triggers per second)
• Unless a highly effective Level 1 Bhabha veto is developed
• Expecting 50kTps Bhabhas...
• Stretch goal: be able to handle 150kTps
• Level 3 Accept rate of ~20kTps
• Expect “Level 4” offline filter beyond that

Unstated assumptions & issues
• We didn’t discuss these...
• Level 1 trigger provided with maximum latency similar to BaBar’s: ~12 microseconds
• There has been no study at all of whether the latency could be reduced significantly (i.e., by at least a factor of two). It’s plausible that this could be done by using more capable components and parallelizing more, but this has not been investigated by experts.
• Other experiments have been able to do better, but SuperB does not have quite the same prompt and spectacular signals that some do.
• DCH and EMC must provide “side channels” of data to the trigger; perhaps SVT will also contribute
• Their ability to deliver data quickly is essential to limiting trigger latency
• Trigger may need higher-resolution data than in BaBar (e.g., to allow for more precise 3D tracking)
• This might push latency longer (more complex computations needed) but lower the Level 1 rate in compensation.

Start from BaBar
• BaBar-Note-281 (v1.1) described the protocol for communication between the ROMs and the FEEs
• The Conceptual Design document for the FCTS describes the extension of this protocol through the FCTS system.
• More detail is available in the “FCTS Architecture” note
• For now we are considering only the event data protocol, corresponding to the “run-time commands” in the BaBar protocol (viz. Section 3.1 of BaBar-Note-281)

BaBar event data protocol
• In normal triggered data acquisition running:
• Signal from Level 1 trigger (GLT lines) goes to FCGM
• If system is not “busy” (within 2.x us of previous command) or “full” (no FE buffers available) or “inhibited” (external signal), send L1Accept command through FCTS to DAQ crates & ROMs
• ROM forwards L1Accept command to FEEs
• FEEs capture a previously specified window of data into a buffer (in triggered, i.e., non-EMC, systems)
• Some variable time later, when resources are available, ROM sends ReadEvent command to FEEs, which send back the earliest available buffer (not addressable, state modeled by ROM)
• This relies on a trigger delivery latency that is confined to a fixed jitter interval, with readout windows large enough to cover the uncertainty. (This is about 1us in BaBar, though most trigger lines have much better resolution, ~200-300ns.)
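The gating rule on this slide can be sketched as a small state model. This is a minimal illustration under stated assumptions, not the BaBar implementation: the names `FcgmState`, `try_accept`, and `read_event` are hypothetical, the order in which the three conditions are checked is arbitrary, and the minimum command spacing is a placeholder because the slide gives it only as "2.x us".

```python
from dataclasses import dataclass

# Placeholder only: the slide gives the BaBar minimum command spacing as "2.x us".
MIN_COMMAND_SPACING_US = 2.0


@dataclass
class FcgmState:
    """Hypothetical model of what is consulted before an L1Accept is sent."""
    last_command_time_us: float = float("-inf")
    free_fe_buffers: int = 4   # four post-L1Accept slots, as in the BaBar design
    inhibited: bool = False    # external inhibit signal


def try_accept(state: FcgmState, now_us: float) -> str:
    """Return 'sent', or the reason the Level 1 trigger was not forwarded."""
    if state.inhibited:
        return "inhibited"
    if now_us - state.last_command_time_us < MIN_COMMAND_SPACING_US:
        return "busy"
    if state.free_fe_buffers == 0:
        return "full"
    state.last_command_time_us = now_us
    state.free_fe_buffers -= 1   # the FEEs capture a window into one buffer
    return "sent"


def read_event(state: FcgmState) -> None:
    """ROM issues ReadEvent: the earliest buffer is read out and freed."""
    state.free_fe_buffers += 1
```

In this model "busy" is the intrinsic per-command deadtime and "full" is the buffer-occupancy deadtime; the rest of the proposal aims to remove the former and keep the latter small.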

Two choices
• Basic requirement is no intrinsic per-L1Accept deadtime, and deadtime due to “full” designed to be at most ~1% at the nominal 100 kTps trigger rate. This can be achieved in two ways…

Alternatives
• BaBar-like fixed-latency model
• Works (roughly) if it is possible to deliver L1Accepts at a minimum spacing equal to the shortest time interval by which the (assumed fully pipelined) trigger can distinguish consecutive events.
• Essentially this means that there can’t be a meaningful limit on the minimum command spacing (100 ns ~ 1% deadtime). In practice, in almost any scenario this requires being able to handle overlapping readout windows.
• You don’t have to be able to do this for an unlimited number of events, of course, just for bursts long enough that statistically you don’t get significant deadtime due to “full”.
• Places very stringent requirements on FCTS-DAQ link
• Variable latency with addressing by time, and queueing of triggers
• In general requires additional pre-L1Accept “ring buffer”-type space, as closely-spaced triggers will effectively be delayed in transit.
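The "100 ns ~ 1% deadtime" figure can be checked with a short calculation. The sketch below assumes Poisson-distributed triggers and a fixed, non-extending dead interval after each accepted command; the function name is illustrative only.

```python
def deadtime_fraction(rate_hz: float, min_spacing_s: float) -> float:
    """Fraction of Poisson triggers lost to a fixed dead interval per accepted command.

    Uses the standard non-paralyzable dead-time result: lost = x / (1 + x),
    with x = rate * spacing. For small x this is approximately x itself.
    """
    x = rate_hz * min_spacing_s
    return x / (1.0 + x)


# Nominal 100 kTps with a 100 ns minimum command spacing: about 0.0099, i.e. ~1%.
nominal = deadtime_fraction(100e3, 100e-9)
# Stretch goal of 150 kTps with the same spacing.
stretch = deadtime_fraction(150e3, 100e-9)
```

The same function shows how quickly the loss grows if the minimum spacing is allowed to approach a microsecond.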

Buffering
• We assume two levels of buffering in the FEEs, in both models:
• A continuously running ring buffer upstream of the L1Accepts, long enough for the maximum trigger latency, in either model.
• In Model 1, its length can be essentially equal to the trigger latency (plus some constant offset)
• In Model 2, it needs to be longer than this by enough to handle 99% of anticipated trigger bursts. We need to do modeling to be quantitative about this, but we anticipate that the answer will be O(10us) of additional capacity, i.e., roughly a doubling (a guess so far).
• A post-L1Accept buffer. This would likely be constructed as a number of fixed-size slots as in BaBar’s design. The number of L1Accept slots required needs to be determined by modeling, but is very likely to be substantially more than the four in the BaBar design.
• The driving parameters are that events must be able to be acquired about ten times faster than in BaBar, but the actual readout probably cannot be comparably faster (link speeds are only 2x faster, and we probably cannot afford to have significantly more ROMs).
• The amount of such buffering needed would be somewhat larger in Model 1.
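The modeling mentioned for the Model 2 ring buffer could start from something like the sketch below. It is a toy model under strong assumptions, not the planned study: triggers are Poisson, every command occupies the link for the same fixed time, and the only extra delay considered is waiting behind earlier commands. The function name, the default sample size, and the seed are illustrative choices.

```python
import random


def queue_delay_percentile(rate_hz: float, command_time_s: float,
                           n_triggers: int = 200_000,
                           percentile: float = 0.99,
                           seed: int = 1) -> float:
    """Estimate a percentile of the time a trigger waits behind earlier commands.

    Single-server queue with Poisson arrivals and fixed service time, stepped
    with the Lindley recursion: wait_next = max(0, wait + service - gap).
    """
    rng = random.Random(seed)
    wait = 0.0
    waits = []
    for _ in range(n_triggers):
        waits.append(wait)
        gap = rng.expovariate(rate_hz)   # time until the next trigger
        wait = max(0.0, wait + command_time_s - gap)
    waits.sort()
    return waits[int(percentile * (len(waits) - 1))]


# Example: 100 kTps with an assumed 1 us command transmission time.
extra_depth_s = queue_delay_percentile(100e3, 1e-6)
```

The returned delay is only the queueing contribution to the extra ring buffer depth; going back in time for Bhabhas and any readout-side constraints would add to it.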

Buffering - tradeoffs
• Larger buffers add to channel costs, provide more targets for radiation upsets and transient data loss
• Variable latency: pre-L1Accept buffers are larger
• Fixed latency: post-L1Accept buffers are larger (probably less so)
• Buffer addressability adds some complexity to front ends
• Minimal digital delay line approach is not applicable
• Variable-length readout adds (marginally?) more complexity (see next slides)

Overlaps - additional explication
• Overlapping readout windows need to be supported
More details:
• This is very difficult to avoid
• At 100kTps, 1us readout windows will overlap O(10%) of the time
• Several cases:
[Figure: timelines of physics signals and their readout windows; labels include “Bhabha (not triggered)”, “readout??”, and “extended readout”]
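The O(10%) overlap figure follows from Poisson statistics: it is the probability that the next trigger arrives before the current readout window has closed. A minimal sketch, assuming uncorrelated triggers and a fixed window width (the function name is illustrative):

```python
import math


def window_overlap_probability(rate_hz: float, window_s: float) -> float:
    """Probability that the next Poisson trigger falls inside the current window."""
    return 1.0 - math.exp(-rate_hz * window_s)


# 100 kTps with 1 us windows: about 0.095, i.e. O(10%).
p_nominal = window_overlap_probability(100e3, 1e-6)
# Stretch goal of 150 kTps with the same window.
p_stretch = window_overlap_probability(150e3, 1e-6)
```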

Overlaps - some consequences
• For a given distribution of window overlap probability...
• A function of the trigger rate and readout window width
• With the window width a function of intrinsic signal width and trigger jitter
• ...the conditional probability that the signals overlap is a function of the ratio of the signal width to the trigger jitter.
• I.e., narrow signals could still overlap because of unlucky trigger jitter outcomes, but less often
• If only the windows overlap and not the signals, essentially the only issue is occupancy on the uplinks
• Bandwidth that you spend on re-reading data isn’t available to handle new data
• A fixed-latency design still works, but redundant (or useless) readout can’t be avoided without additional complexity in the front ends
• If the signals overlap, there are additional issues (L3, reco)!

Overlaps - Bhabhas
• If a physics event occurs shortly after a vetoed Bhabha, and if removing the Bhabha’s effect from the event requires acquiring most or all of the signals from the Bhabha itself... then you have to go “back in time” to get the Bhabha.
• The trigger has to have remembered that it just saw and vetoed a Bhabha
• The data still have to be available in the front ends, perhaps ~1us longer than the nominal trigger latency.
• Both models can handle “going back in time”
• Fixed latency: make the fixed latency longer
• Variable latency: almost for free - it’s just like a queued trigger
• Readout issues are otherwise exactly as for other trigger overlaps

Buffering II
• Overlapping readout windows need to be supported
• This has implications for the copy of event data from the ring buffer to the post-L1Accept buffer.
• We propose that the protocol support copy by reference when windows overlap. This reduces the internal bandwidth required in the FEEs.
• This requires the system to model an event as composed potentially of one or more by-reference segments followed by a by-value segment. At a minimum, the by-value segment of an event must be retained somewhere in the system until enough time has passed that a future event cannot need any part of it. This could be done either in the FEE or in the ROM, trading off complexity in the FEE against complexity in the FCTS protocol and the ROM.
• We expect to propose one possible implementation as a reference point but describe alternatives as well.
• We are proposing that the L1Accept command include a

Buffering II (updated in Elba)
• Overlapping readout windows need to be supported
• This has implications for the copy of event data from the ring buffer to the post-L1Accept buffer.
• We propose that the protocol support copy by reference when windows overlap. This reduces the internal bandwidth required in the FEEs.
• This requires the system to model an event as composed potentially of one or more by-reference segments followed by a by-value segment. At a minimum, the by-value segment of an event must be retained somewhere in the system until enough time has passed that a future event cannot need any part of it. This could be done either in the FEE or in the ROM, trading off complexity in the FEE against complexity in the FCTS protocol and the ROM.
• We prefer putting the greater complexity in the FCTS and ROM, not in the FEE
• We propose that the L1Accept command include a “length” field in either model, in addition to the time address in Model 2
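The event model described on this slide, zero or more by-reference segments followed by a by-value segment, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function name, the tuple layout, the use of integer ring buffer time units, and the requirement that triggers arrive sorted by window start. Retention and release of old by-value segments are deliberately left out.

```python
from typing import List, Tuple

# (kind, start, end, owner): kind is "ref" or "value"; owner is the index of
# the event whose by-value segment holds the data for [start, end).
Segment = Tuple[str, int, int, int]


def split_events(windows: List[Tuple[int, int]]) -> List[List[Segment]]:
    """Split readout windows, given as (start, length) and sorted by start."""
    stored: List[Tuple[int, int, int]] = []   # by-value segments kept so far
    events: List[List[Segment]] = []
    for idx, (start, length) in enumerate(windows):
        end = start + length
        segs: List[Segment] = []
        # Any part of the window already held by an earlier event is referenced.
        for a, b, owner in stored:
            if b > start and a < end:
                segs.append(("ref", max(a, start), min(b, end), owner))
        # Whatever extends past the data stored so far is copied by value.
        covered_to = max([start] + [b for _, b, _ in stored])
        if end > covered_to:
            segs.append(("value", covered_to, end, idx))
            stored.append((covered_to, end, idx))
        events.append(segs)
    return events


# Two overlapping windows followed by an isolated one.
example = split_events([(0, 10), (5, 10), (30, 10)])
```

In the example, the second event references the last five units of the first event's data and copies only the five new units, which is the internal bandwidth saving the slide refers to.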

Command protocol
• BaBar
• ROM-to-FEE commands are 12 bits: a 0, a 1, a 5-bit command code, and a 5-bit trigger tag (sequence number). At 60 MHz, this takes 192 ns to transmit.
• FCTS-to-ROM commands are 104 bits: the full 56-bit 60MHz clock counter (the unique event key), the post-FCTS state of the 32 trigger bits, the 12-bit ROM-to-FEE command, and four flag bits. These take ~1.75 us to transmit, a significant fraction of the minimum command spacing.
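The 12-bit ROM-to-FEE word and the quoted transmission times can be made concrete with a short sketch. The field order within the word (leading 0, then 1, then command code, then trigger tag, most significant bit first) is an assumption based on the order in which the slide lists the fields, and the function names are illustrative. The timing helper assumes one bit per clock cycle at exactly 60 MHz, which gives about 1.73 us for the 104-bit word and 200 ns for the 12-bit word; the 192 ns quoted on the slide would correspond to 16 ns per bit.

```python
def pack_rom_to_fee(command: int, tag: int) -> int:
    """Pack a 12-bit word: 0, 1, 5-bit command code, 5-bit trigger tag (assumed order)."""
    if not (0 <= command < 32 and 0 <= tag < 32):
        raise ValueError("command and tag must each fit in 5 bits")
    return (0b01 << 10) | (command << 5) | tag


def unpack_rom_to_fee(word: int) -> tuple:
    """Return (command, tag), checking the leading 0, 1 bit pair."""
    if word >> 10 != 0b01:
        raise ValueError("bad leading bits")
    return (word >> 5) & 0x1F, word & 0x1F


def transmit_time_ns(n_bits: int, clock_hz: float = 60e6) -> float:
    """Serial transmission time, assuming one bit per clock cycle."""
    return n_bits / clock_hz * 1e9


# FCTS-to-ROM word: 56-bit counter + 32 trigger bits + 12-bit command + 4 flags.
FCTS_TO_ROM_BITS = 56 + 32 + 12 + 4
fcts_to_rom_ns = transmit_time_ns(FCTS_TO_ROM_BITS)
```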

Command protocols for SuperB Model 1…
• The BaBar ROM-to-FEE command content may be OK.
• Overall performance would be improved by including a length field.
• The BaBar ROM-to-FEE command timing is barely compatible with Model 1 at the same clock speeds.
• Ideally, for 1% deadtime a 100ns command interval would be needed. Somewhat longer intervals could be acceptable in several scenarios:
• If the trigger cannot generate separate trigger decisions that close together, then they don’t need to be processed, but the longer this interval becomes, the more necessary it becomes for the trigger itself to be able to handle overlapping events and make appropriate decisions (e.g., not vetoing a time interval that contains both a Bhabha and a physics event).
• If triggers must be delayed in transit because of a somewhat longer command interval, this can be OK as long as the accumulated delay in the maximum burst that needs to be supported (from modeling) is compatible with the trigger jitter specification.

Command protocols for SuperB Model 1…
• The BaBar FCTS-to-ROM command timing is completely unacceptable.
• It would lead to intolerable trigger delivery delays.
• The command word length could be somewhat shortened. Two possibilities:
• The post-FCTS-decision trigger bits could be treated as event data, with the FCTS read out as an additional detector system. (This would preclude the ROMs’ making FEX or other decisions based on trigger content.)
• The timestamp could probably be shortened from 56 bits to 40-45 bits by treating it as relative within each data run. This would require a run identifier to become part of the unique event key.
• These measures are probably insufficient by themselves. The command delivery link would still need to be made several times faster.
• It looks like we need speeds greater than 1Gbps
• This in turn might preclude combining commands with clock distribution in the FCTS.
• If these paths are separated, the ROMs would likely have to be able to resync commands with the clock in order for a BaBar-like event build and out-of-sync detection scheme to work.
• We are still thinking this through; it seems to add significant complexity to the FCTS protocol and ROM implementation, but apparently not to the FEEs.

Command protocols for SuperB Model 2…
• The ROM-to-FEE command word needs to be extended to include a ring buffer address field.
• A length field may also be needed if the resolution of overlapping events is made a responsibility of the FCTS and the ROMs.
• The time resolution of the addressing will need to be somewhere between the system clock period (e.g., 16 ns) and the minimum useful time resolution of the trigger (perhaps 125 ns).
• Finer resolution allows some reduction in bandwidth, but...
• After discussions at Elba, it seems that 125-250 ns is the right range.
• The ring buffer in Model 2 needs to be somewhat longer than the intrinsic trigger latency, to accommodate queued triggers. Pending detailed modeling, we are guessing that a buffer depth of several tens of us should be adequate.
• It is not necessary to be able to address the range before the shortest possible trigger latency.
• This means that the address field should be in the range of 6-9 bits.
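The 6-9 bit range for the address field can be reproduced by counting addressable time slots. The sketch below is illustrative only: the function name is hypothetical, and the depths used in the examples (16, 32, and 64 us of addressable range) are assumed values chosen to be consistent with "several tens of us", not numbers from the proposal.

```python
def address_bits(addressable_depth_ns: int, resolution_ns: int) -> int:
    """Bits needed to address every time slot in the addressable buffer range."""
    n_slots = -(-addressable_depth_ns // resolution_ns)   # integer ceiling
    return max(1, (n_slots - 1).bit_length())


# Coarse addressing over a shorter range: 16 us at 250 ns gives 64 slots, 6 bits.
low = address_bits(16_000, 250)
# Finer addressing over a longer range: 64 us at 125 ns gives 512 slots, 9 bits.
high = address_bits(64_000, 125)
```

Because the range before the shortest possible trigger latency need not be addressable, the depth that enters this count can be smaller than the physical ring buffer depth.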

Command protocols for SuperB Model 2
ROM-to-FEE command word, continued…
• The length field needs to be able to select fractions of the normal readout window. “50%” of the possible optimization is obtained merely by allowing reading a half-size window.
• For Bhabha-followed-by-physics readout, a larger-than-normal window is needed.
• This can alternatively be done in Model 2 by issuing two triggers, if the system can guarantee that there will be no gap between their readout windows.
• Either way the length field should not need to be more than 2-4 bits. (The number of 4-8 in Saturday’s talk was too large.)
• In Model 2, the resulting increase in command transmission time from adding these fields is almost irrelevant: it just adds somewhat to the required ring buffer depth. (The additional delay should be no more than ~200ns.)

Command protocols for SuperB Model 2…
• The FCTS-to-ROM command word needs to be extended by at least the address and length fields.
• If the resolution of overlapping events is made a responsibility of the FCTS and the ROMs, the command word would need to be further extended to include a set of descriptors for the multiple segments of an event with overlap.
• Each descriptor would probably be both an address and a length.
• Here again slower link speeds just translate into additional ring buffer space required.
• Even 2-3 us command transmission time (e.g., if the command word is much more complex than for BaBar) could be accommodated.

Conclusion
• Model 2 appears to us to be significantly more attractive, and provides additional flexibility in DAQ that we think is advisable.
• It appears to significantly loosen the requirements on the performance of the FCTS and ROM-to-FEE command links, and it provides a natural and uniform way to solve the overlapping-event problem.
• The ability to deliver triggers and read out events as soon as possible should increase the overall performance of the system.
• The corresponding cost is the increased complexity and size of the FEE needed to support an addressable - and significantly larger - ring buffer.
• Additional complexity and size also raise radiation damage concerns.
• Ultimately overall system cost needs to be optimized.
• This turns out to be a quantitative question.

Proposal and next steps
• We are proposing this model - that is, a model with time-addressable pre-L1Accept buffering in the FE electronics.
Next steps for us (core DAQ people - please consider joining us!):
• Define the time addressing protocol and buffer depth.
• Specify a detailed model for overlapping-event readout.
• Write this up.
• Do modeling to estimate the ring buffer depth needed in Model 2.
Next steps for you:
• Consider the consequences for FEE design and estimate the marginal cost of the additional requirements of variable latency (Model 2).
• If those costs look substantial, someone then needs to evaluate the additional complexity and cost of the multiple-path command-and-clock distribution and resynchronization scheme needed in the FCTS in Model 1.

Requests to subsystems
• Responses from two subsystem groups before Elba.
• Thank you!
• Additional question (or clarification):
• Please include in your presentation of channel requirements (counts, bit depths, digitization rates, etc.) the readout window that is required by the system’s intrinsic signal time distribution, ignoring trigger jitter.
• Please include the effects of physics variance in particle and detector signal propagation time (e.g., include DCH maximum drift time)
• Equivalently, just say what window you would need if the trigger jitter were exactly zero.
