Data Transport in Particle Physics Experiments

Tony Gillman, Particle Physics Department, Rutherford Appleton Laboratory

Data transport – scope
• A very generic title…
• “Transport” here means “movement of signals and data”
  • How are data transferred all the way from detectors to computers?
  • What happens to the signals during this journey (transmission media, formats, …)?
• I will aim to cover a broad range of topics
  • Analogue signal handling, and some of the pitfalls…
  • Analogue-to-digital conversion techniques: the good and the bad…
  • Data serialisation and deserialisation: why bother…
  • Digital data transport media: copper vs silica
• Purpose: give some idea of the problems of getting data from experiments
• The first-level trigger of the ATLAS detector at the CERN LHC neatly illustrates many of these techniques, so it will be used as a general case study

The data challenge
• The current generation of experiments will generate prodigious data volumes
  • ATLAS will produce ~1 Petabyte (10^15 bytes) per second
• In addition, the instantaneous data rates can be extremely high
  • The LHC collision rate is 40 MHz → bursts of new data arrive every 25 nsec
• How do we transfer these data from the detectors into the data acquisition electronics? → massive communication problem
• Triggering removes the need to transport all of these data
  • Store data for ~2 µsec in pipeline memories
  • First-level trigger decides which events to accept; detector data are transported only for those events
  • 40 MHz → 75 kHz max (higher-level triggers reduce this much further)
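The numbers on this slide can be tied together with a few lines of arithmetic. The sketch below is illustrative only: it assumes a pipeline storage time of about 2 µs (the L1 latency scale) and uses hypothetical function names.

```python
# Illustrative arithmetic for the trigger rates quoted on the slide.
# Assumption: the pipeline storage time is about 2 microseconds.
BUNCH_CROSSING_RATE_HZ = 40e6   # LHC collision rate
L1_ACCEPT_RATE_HZ = 75e3        # maximum first-level accept rate
PIPELINE_TIME_S = 2e-6          # assumed storage time in pipeline memories


def bunch_spacing_ns(rate_hz):
    """Time between successive bunch crossings, in nanoseconds."""
    return 1e9 / rate_hz


def pipeline_depth(storage_time_s, rate_hz):
    """Number of bunch crossings that must be buffered while waiting for the trigger."""
    return round(storage_time_s * rate_hz)


def rejection_factor(input_rate_hz, output_rate_hz):
    """Factor by which the trigger reduces the event rate."""
    return input_rate_hz / output_rate_hz


if __name__ == "__main__":
    print("bunch spacing (ns):", bunch_spacing_ns(BUNCH_CROSSING_RATE_HZ))
    print("pipeline depth (crossings):", pipeline_depth(PIPELINE_TIME_S, BUNCH_CROSSING_RATE_HZ))
    print("L1 rejection factor:", rejection_factor(BUNCH_CROSSING_RATE_HZ, L1_ACCEPT_RATE_HZ))
```

Under these assumptions each detector channel needs a pipeline roughly 80 crossings deep, and the first-level trigger keeps about one crossing in 530.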

ATLAS trigger system
• Still a challenge even to get trigger data into the trigger electronics
• Transport of data must be almost error-free, or the trigger rate can become unacceptably high
• Latency (time delay between collisions and trigger decision) must be short, to minimise data storage requirements (remember 10^6 Gigabytes per sec!)
• Every part of the signal chain must therefore be as fast as possible

ATLAS level-1 calorimeter trigger
[Block diagram. Labels: analogue tower sums (~7200); Analogue Receivers; Pre-Processor (PPr); e/γ, τ/hadron Clusters (CP); Jet / ET (JEP); 400 Mbit/s and 1 Gbit/s links; outputs to CTP; DAQ and DAQ/RoI readout via Readout Driver (ROD) to ROS. Legend: real-time signal (data) path; readout data path.]

ATLAS level-1 calorimeter trigger
• Data are transported from the detectors (calorimeters) to the trigger processing electronics to generate ACCEPT signals to feed the Central Trigger Processor
• Signals undergo transformations at several stages in their journey…

Analogue signal transmission
• Calorimeter signals are of two types:
  • Liquid Argon calorimeter: bipolar, 75 nsec FWHM; differential, ±2 V max
  • Tile calorimeter: unipolar, 50 nsec FWHM; differential, ±2 V max

TileCal analogue trigger cable
Transport medium: 16 shielded twisted-pair channels + global shield
Characteristic impedance: 88 Ω ± 10 %
Cable delay: ≤4.76 nsec/m
Inter-pair delay skew: <2.5 nsec (70 m cable)
Attenuation: -0.06 dB/m
Crosstalk: <0.2 % (70 m cable)
Bandwidth: 13 MHz at -6 dB
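As a rough illustration of what these figures imply for a 70 m run, the sketch below scales the per-metre delay and attenuation linearly with length. It assumes the attenuation is a voltage ratio in dB (20 log10) and ignores any frequency dependence; the function name is hypothetical.

```python
def cable_budget(length_m, delay_ns_per_m=4.76, atten_db_per_m=-0.06):
    """Scale the per-metre cable figures to a full cable length.

    Returns (total delay in ns, total attenuation in dB, output/input amplitude ratio).
    Assumption: attenuation is expressed as a voltage ratio, so ratio = 10**(dB / 20).
    """
    total_delay_ns = length_m * delay_ns_per_m
    total_atten_db = length_m * atten_db_per_m
    amplitude_ratio = 10 ** (total_atten_db / 20)
    return total_delay_ns, total_atten_db, amplitude_ratio


if __name__ == "__main__":
    delay, atten, ratio = cable_budget(70.0)
    print(f"delay <= {delay:.1f} ns, attenuation {atten:.1f} dB, amplitude ratio {ratio:.2f}")
```

For 70 m this gives a delay of up to about 333 ns and roughly 4.2 dB of attenuation, i.e. about 62 % of the input amplitude arriving at the receiver.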

Imperfections in transmission lines
• Pre-installation measurements on TileCal analogue cables showed bad timing skew
  • Inter-pair skew (tpd_n ≠ tpd_m): excessive, up to 20 nsec, but could be calibrated out
  • Intra-pair skew (tpd_n+ ≠ tpd_n-): excessive, up to 28 nsec
• This effect is totally unacceptable: it changes the shape and amplitude of the resultant differential signal, because of varying levels of dispersion
[Oscilloscope traces: good pair (tower 4, PMT 19) and bad pair (tower 4, PMT 19), each showing the positive signal, the negative (inverted) signal and the resultant signal]
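A toy model shows why intra-pair skew matters. The sketch below assumes an idealised Gaussian pulse of 50 ns FWHM (the Tile calorimeter figure) and treats the skew as a pure time offset between the two wires of a pair; it does not model dispersion, so it understates the real effect. All names are hypothetical.

```python
import math


def gaussian_pulse(t_ns, fwhm_ns=50.0, centre_ns=150.0):
    """Unit-amplitude Gaussian pulse, an idealised stand-in for a calorimeter signal."""
    sigma = fwhm_ns / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((t_ns - centre_ns) / sigma) ** 2)


def differential_peak(skew_ns, step_ns=0.5, span_ns=400.0):
    """Peak of (V+ - V-) when the negative wire lags the positive wire by skew_ns."""
    peak = 0.0
    n_steps = int(span_ns / step_ns)
    for i in range(n_steps + 1):
        t = i * step_ns
        v_pos = 0.5 * gaussian_pulse(t)              # positive half of the differential signal
        v_neg = -0.5 * gaussian_pulse(t - skew_ns)   # inverted half, delayed by the skew
        peak = max(peak, v_pos - v_neg)
    return peak


if __name__ == "__main__":
    for skew in (0.0, 10.0, 28.0):
        print(f"skew {skew:5.1f} ns -> differential peak {differential_peak(skew):.3f}")
```

Even in this simplified picture a 28 ns skew lowers the peak of the received differential pulse by about a fifth and broadens it, which would directly distort the energy measurement.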

S-parameter measurements
• S-parameters characterise transmission-line performance in the frequency domain
• For the suspect cable, S-parameters were measured for the 4 propagation modes:
  1. differential-mode → differential-mode (signal attenuation)
  2. common-mode → common-mode (signal attenuation)
  3. differential-mode → common-mode (mode conversion)
     • Common-mode signal will radiate and couple to adjacent signal pairs
  4. common-mode → differential-mode (mode conversion)
     • Cable is susceptible to radiation, and the resultant differential-mode signal will degrade the S/N ratio
• Mode conversion is caused by asymmetries in differential transmission lines

S-parameter measurements
• First step was to measure the characteristic impedance Z0 of the cables in two modes, common-mode and differential-mode, and to terminate the cables under test in both ways using a single network
• Measure the transfer function of the cables over the frequency range up to 50 MHz in each of the four modes, using sine waves
[Diagrams: test setup for common → common mode and common → differential mode measurements; test setup for differential → differential mode and differential → common mode measurements]

S-parameter measurements
[Plots: “Bad” pair and “Good” pair]
• “Bad” pair exhibits severe attenuation at high frequencies → signal dispersion
• Common → differential conversion is extremely large above 15 MHz (compare with differential → differential mode!)
• Conclusion: the entire batch of cables from this manufacturer was rejected

ATLAS analogue trigger cabling

Analogue → Digital conversion
• Digital signals have many advantages over analogue signals (noise immunity, crosstalk, processing capability, …), so it is preferable to digitise detector signals as early as possible in the signal chain
• Analogue-to-digital converters (ADCs) are mixed-signal devices
• Digital output = (AIN / VREF) × 2^N
  • AIN = analogue input voltage
  • VREF = Vmax - Vmin (reference voltage)
  • N = number of output bits (resolution)
• Analogue signal resolution = VREF / 2^N
[Diagram: flash ADC with clock input]
• This is the fastest type of converter, also known as a Flash ADC (FADC)
• The delay between the clock and the digital output data appearing is the latency
• Low latency is essential in many applications (e.g. ATLAS level-1 trigger)
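The transfer function above can be written as a short sketch of an ideal converter. The offset by Vmin and the clamping to the valid code range are assumptions added for illustration, as is the 10-bit resolution in the usage example; the function names are hypothetical.

```python
def ideal_adc_code(a_in, v_min, v_max, n_bits):
    """Output code of an ideal ADC: (A_IN / V_REF) * 2**N, truncated and clamped.

    Assumptions: the input is measured relative to v_min, and out-of-range inputs
    saturate at code 0 or at the full-scale code 2**N - 1.
    """
    v_ref = v_max - v_min
    full_scale = (1 << n_bits) - 1
    code = int((a_in - v_min) / v_ref * (1 << n_bits))
    return max(0, min(full_scale, code))


def lsb_size(v_min, v_max, n_bits):
    """Analogue resolution: the voltage step corresponding to one output code."""
    return (v_max - v_min) / (1 << n_bits)


if __name__ == "__main__":
    # Illustrative values only: a 10-bit converter with a 0 V to 1 V input range.
    print(ideal_adc_code(0.5, 0.0, 1.0, 10))
    print(lsb_size(0.0, 1.0, 10))
```

With these illustrative values a mid-scale input maps to code 512 and one LSB corresponds to just under 1 mV.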

ADC performance – some notes
For an n-bit converter…
• Dynamic range in dB = 20 log10(2^n - 1)
• Signal-to-Noise Ratio (SNR) = rms signal / rms noise (integrated over 1/2 clock period)
• Several sources of noise:
  • Quantisation noise
  • Clock jitter
  • Electronic circuit noise
• Fundamental limit on ADC performance is quantisation noise = LSB / √12
• SNR for ideal ADC = (6.02n + 1.76) dB
• Nyquist limit: highest frequency component permitted ≤ ½ sampling frequency
  • If f(AIN) > ½ fs, aliasing will occur → increased noise
  • Avoid aliasing by passing the signal through a low-pass filter before the ADC comparators
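These formulas are easy to evaluate for a given resolution. The sketch below is a minimal illustration; the 10-bit example is an assumption chosen for the printout, and the function names are hypothetical.

```python
import math


def dynamic_range_db(n_bits):
    """Dynamic range of an n-bit converter: 20 log10(2**n - 1)."""
    return 20.0 * math.log10(2 ** n_bits - 1)


def ideal_snr_db(n_bits):
    """SNR of an ideal n-bit ADC limited only by quantisation noise."""
    return 6.02 * n_bits + 1.76


def quantisation_noise_rms(lsb):
    """RMS quantisation noise for a given LSB size: LSB / sqrt(12)."""
    return lsb / math.sqrt(12.0)


if __name__ == "__main__":
    n = 10  # illustrative resolution
    print(f"dynamic range: {dynamic_range_db(n):.1f} dB")
    print(f"ideal SNR:     {ideal_snr_db(n):.2f} dB")
    print(f"quantisation noise: {quantisation_noise_rms(1.0):.3f} LSB rms")
```

For 10 bits this gives about 60 dB of dynamic range, an ideal SNR of about 62 dB, and a noise floor of about 0.29 LSB rms.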

ADC performance – timing jitter
• Clock jitter leads to aperture uncertainty
• For a sine wave signal (V = A sin ωt) → δVmax = 2π A f δt
• Aperture uncertainty therefore translates to a noise source, degrading the ADC resolution for high-frequency signals
  • Magnitude scales with the input signal frequency
• The effect only becomes significant if δt > (2^n π f)^-1
• The demands on clock jitter are very severe…
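The jitter limit follows from setting the worst-case voltage error 2π A f δt equal to one LSB of a full-scale sine wave (2A / 2^n), which gives δt = 1 / (2^n π f). The sketch below evaluates that limit; the 10-bit, 20 MHz example is illustrative and the function name is hypothetical.

```python
import math


def max_jitter_s(n_bits, f_hz):
    """Aperture uncertainty at which the sampling error reaches one LSB.

    Derived from 2*pi*A*f*dt = 2*A / 2**n for a full-scale sine wave of
    amplitude A, so dt = 1 / (2**n * pi * f).
    """
    return 1.0 / (2 ** n_bits * math.pi * f_hz)


if __name__ == "__main__":
    # Illustrative case: 10-bit converter, 20 MHz input frequency.
    print(f"{max_jitter_s(10, 20e6) * 1e12:.1f} ps")
```

For these illustrative values the limit is about 15.5 ps, and it halves for every extra bit of resolution or every doubling of the signal frequency.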

ADC performance – ENOB
• Overall effect of aperture uncertainty is to reduce the Effective Number Of Bits (ENOB) of the ADC at high frequencies
• N.B. An n-bit ADC will not resolve to n bits at its full analogue bandwidth unless clock jitter is kept below these limits
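One common way to put numbers on this is the textbook relation for jitter-limited SNR of a full-scale sine wave, SNR = -20 log10(2π f σ_t), combined with the inverse of the ideal-ADC formula, ENOB = (SNR - 1.76) / 6.02. Neither expression appears on the slide; the sketch below assumes them, considers jitter as the only noise source, and uses hypothetical names and example values.

```python
import math


def jitter_limited_snr_db(f_hz, jitter_rms_s):
    """SNR of a full-scale sine wave when rms clock jitter is the only noise source."""
    return -20.0 * math.log10(2.0 * math.pi * f_hz * jitter_rms_s)


def enob(snr_db):
    """Effective number of bits, inverting SNR = 6.02 n + 1.76 dB."""
    return (snr_db - 1.76) / 6.02


if __name__ == "__main__":
    # Illustrative case: 20 MHz input, 10 ps rms jitter.
    snr = jitter_limited_snr_db(20e6, 10e-12)
    print(f"SNR {snr:.1f} dB -> ENOB {enob(snr):.2f} bits")
```

In this example 10 ps of rms jitter at 20 MHz already limits the converter to roughly 9.3 effective bits, whatever its nominal resolution.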

Digital signal transmission
• To transfer parallel data between sub-systems, convert to serial bitstreams to reduce the number of data paths and connector pins: this increases reliability (but also latency!)
• Serialising-deserialising (SerDes) chipsets can drive serial bitstreams at ~Gbit/s rates
• A very common technology for serial links is Low-Voltage Differential Signaling (LVDS)
  • Cable chosen for trigger: shielded Twin-ax (2 parallel cores, Z0 = 100 Ω)
• Many advantages:
  • Low-voltage power supplies
  • Good noise immunity
  • Low power dissipation
  • Small signal swing → high data rates
  • “Gigabits at Milliwatts”
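The idea of serialising and deserialising can be sketched in software. The example below is a bit-level illustration only: it sends each word MSB first with no framing or encoding, which real SerDes chipsets add, and the 10-bit word width and function names are assumptions.

```python
def serialise(words, width):
    """Flatten parallel words into a list of bits, most significant bit first."""
    bits = []
    for word in words:
        for i in reversed(range(width)):
            bits.append((word >> i) & 1)
    return bits


def deserialise(bits, width):
    """Rebuild parallel words from a bitstream produced by serialise()."""
    words = []
    for start in range(0, len(bits), width):
        word = 0
        for bit in bits[start:start + width]:
            word = (word << 1) | bit
        words.append(word)
    return words


def serialisation_time_ns(width, bit_rate_hz):
    """Time needed to shift one word out on a single serial line."""
    return width / bit_rate_hz * 1e9


if __name__ == "__main__":
    data = [0x3FF, 0x155, 0x001]          # three illustrative 10-bit words
    stream = serialise(data, 10)
    print(len(stream), deserialise(stream, 10) == data)
    print(serialisation_time_ns(10, 400e6), "ns per word at 400 Mbit/s")
```

The last line shows the latency cost: a 10-bit word takes 25 ns to shift out at 400 Mbit/s, whereas a parallel bus would present it in a single clock edge.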

Eye patterns – digital data
[Eye-pattern plots: at source and at destination]

Pre-compensation techniques
• Adding a passive pre-compensation network (high-pass filter: CR or LR) to the LVDS driver outputs boosts the high-frequency components of the signal to compensate for the cable dispersion
[Eye patterns: no pre-compensation; LR pre-compensation (N.B. overshoot)]

ATLAS PreProcessor Module
[Annotated module photograph. Labels: analogue and digital regions; analogue signal inputs; Flash ADCs; Processor ASIC; MCMs; LVDS serialisers; digital data outputs; signal-flow arrows]

Beware!!!
• Installing Cu signal cabling can produce unexpected effects: Cable Discharge Event (CDE)
• Static electricity on the jacket material of the cable induces a charge in the cable wires
• Mechanisms:
  • Tribocharging (friction), produced as cables are pulled across surfaces
  • Electromagnetic fields can induce charge build-up on cables, e.g. from electronic light ballasts
• This may have been an issue for our 8000 LVDS cables installed under-floor between racks
• As a precaution, we “discharged” cables after installation but before connecting any modules
• N.B. This is another reason why fibre-optic cabling has advantages

Optical fibres
• Cylindrical dielectric waveguide transmitting light along its axis by total internal reflection, consisting of a core covered by a sheath of cladding (n_core > n_cladding)
• As an alternative to Cu cabling for digital data transmission, it has many benefits:
  • Huge bandwidth
  • Immunity from EMI, ground-loops and crosstalk
  • Small volume for cable plant
• Two types available, Multi-mode and Single-mode (usual material is silica):
  • Multi-mode fibres: large core diameter (few tens of µm) allows multiple path lengths → intermodal dispersion limits Bandwidth × Distance product
    • Reduce intermodal dispersion by using graded-index silica: transit time variations → zero
  • Single-mode fibres: small core diameter (few µm) forces lowest-order (axial) mode, low dispersion → high Bandwidth × Distance product
• Propagation delay ~ n_core / c (~5 nsec/m, similar to Cu cable)
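The propagation delay and the condition for total internal reflection can both be estimated from the refractive indices. The sketch below uses illustrative index values (1.5 for the delay estimate, 1.48 / 1.46 for core and cladding) that are assumptions, not figures from the talk; the function names are hypothetical.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def delay_ns_per_m(n_core):
    """Propagation delay per metre for light travelling along the fibre axis."""
    return n_core / SPEED_OF_LIGHT_M_PER_S * 1e9


def critical_angle_deg(n_core, n_clad):
    """Smallest angle of incidence (from the normal) giving total internal reflection."""
    return math.degrees(math.asin(n_clad / n_core))


def numerical_aperture(n_core, n_clad):
    """Numerical aperture of a step-index fibre: sqrt(n_core**2 - n_clad**2)."""
    return math.sqrt(n_core ** 2 - n_clad ** 2)


if __name__ == "__main__":
    print(f"delay: {delay_ns_per_m(1.5):.2f} ns/m")
    print(f"critical angle: {critical_angle_deg(1.48, 1.46):.1f} degrees")
    print(f"numerical aperture: {numerical_aperture(1.48, 1.46):.3f}")
```

An index of about 1.5 reproduces the ~5 nsec/m quoted above; the small index step between core and cladding means only rays within a narrow cone around the axis are guided.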

Optical fibres – some available types
• Step-index Multi-Mode fibres:
  • Cheap
  • Large core diameter → easy to couple light in/out
  • High intermodal dispersion → low bandwidth
  • Suitable for short links and low data rates
• Graded-index Multi-Mode fibres:
  • Large core diameter → easy to couple light in/out
  • Reduced intermodal dispersion → increased bandwidth
  • Suitable for medium-range links/low data rates or short links/medium data rates
• Step-index Single-Mode fibres:
  • Small core diameter → harder to couple light in/out
  • Wide bandwidth
  • Suitable for long-range links and high data rates

Optical fibres – ATLAS level-1 trigger
• Data transmitted to level-2 trigger and DAQ via Readout Driver modules (RODs): distance ~10 m, total bandwidth >250 Gbyte/s
• We chose Multi-Mode fibres driven by laser diode transmitters (Infineon) operating at 850 nm, mounted on trigger modules
• Total number of fibres feeding the RODs: ~320
• Transmitters are driven from Agilent G-link transmitters at 960 Mbaud (800 Mbit/s)
• Receivers are dual Stratos devices mounted on 20 RODs
[Photograph with a 56 mm dimension marked]
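The link figures on this slide can be cross-checked with simple arithmetic. The sketch below assumes that all ~320 fibres carry the full 800 Mbit/s payload simultaneously, which is an idealisation; the function names are hypothetical.

```python
def payload_fraction(payload_mbit_s, line_rate_mbaud):
    """Fraction of the line rate that carries user data (the rest is encoding overhead)."""
    return payload_mbit_s / line_rate_mbaud


def aggregate_gbit_s(n_links, payload_mbit_s):
    """Total payload bandwidth if every link runs at its full payload rate."""
    return n_links * payload_mbit_s / 1000.0


if __name__ == "__main__":
    print(f"payload fraction: {payload_fraction(800, 960):.3f}")
    total = aggregate_gbit_s(320, 800)
    print(f"aggregate: {total:.0f} Gbit/s = {total / 8:.0f} Gbyte/s")
```

The 960 Mbaud line rate carries 5/6 of its bits as payload, and 320 such links add up to about 256 Gbit/s (32 Gbyte/s). This is offered only as a cross-check on the total bandwidth quoted above.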

ILC Vertex Detector
• The International Linear Collider will be an accelerator ~35 km long, colliding bunches of e- and e+ at energies of 500 GeV: physics to complement that from the LHC
• The vertex detector (VXD) will be based on Si detectors, e.g. CCDs, forming ladders

ILC Vertex Detector
• 5 concentric barrels of ladders, on radii ranging from 15 mm to 60 mm
• Thickness <0.1 % X0 per barrel (target)
• ~10^9 pixels, each 20 µm × 20 µm
• ILC will generate many spurious hits from beamstrahlung during bunch crossings
• To minimise these background hits, CCDs must be read out quickly:
  • Readout time of 50 µs for inner barrel (highest background hit density)
  • Readout time of 250 µs for each outer barrel (lower background hit density)

Background hit rates
• Accelerator beam parameters:
  • ~1 msec bunch-train
  • 337 nsec inter-bunch gap
  • 5 Hz repetition rate (200 msec dead-time)

Readout data volumes
• So how much data will the VXD generate?
  • Total number of pixels clocked out during each bunch train ~4 × 10^9
  • To read out every pixel (assuming ≤1 byte/pixel): raw data volume ~20 Gbyte/s
• This is unnecessary, since most pixels are empty: only ~0.5 % occupancy
• Sparsify data in real-time in Readout chips
  • Digitise signals in on-chip ADCs (5 bits OK)
  • Look for 2×2 pixel clusters with signal > cluster threshold → 6 bytes per cluster:
    • 26 bits (η-φ addressing)
    • 20 bits for 4 × 5-bit ADC values
    • 2 spare bits (parity, etc.)
• 20 Gbyte/s → 40 Mbyte/s
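The 6-byte cluster format (26 address bits, four 5-bit ADC values, 2 spare bits) can be illustrated with a short packing routine. The field order and byte order below are assumptions made for the example, not the actual chip format, and the function names are hypothetical.

```python
def pack_cluster(address, adc_values, spare=0):
    """Pack one 2x2 cluster into 6 bytes: 26-bit address, 4 x 5-bit ADC, 2 spare bits.

    Assumed layout (most significant first): address | adc0 | adc1 | adc2 | adc3 | spare.
    """
    if not 0 <= address < (1 << 26):
        raise ValueError("address must fit in 26 bits")
    if len(adc_values) != 4 or any(not 0 <= v < 32 for v in adc_values):
        raise ValueError("need exactly four 5-bit ADC values")
    word = address
    for value in adc_values:
        word = (word << 5) | value
    word = (word << 2) | (spare & 0b11)
    return word.to_bytes(6, "big")


def unpack_cluster(data):
    """Inverse of pack_cluster(): returns (address, [adc0..adc3], spare)."""
    word = int.from_bytes(data, "big")
    spare = word & 0b11
    word >>= 2
    adc_values = []
    for _ in range(4):
        adc_values.append(word & 0x1F)
        word >>= 5
    adc_values.reverse()
    return word, adc_values, spare


if __name__ == "__main__":
    packed = pack_cluster(0x2ABCDEF, [3, 17, 31, 0], spare=1)
    print(len(packed), unpack_cluster(packed))
```

The three fields add up to exactly 48 bits, so every cluster occupies a fixed 6 bytes regardless of its position or pulse heights.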

Data acquisition task
• Total sparsified data volume per bunch train ~8 Mbytes (~40 Mbyte/s)
• To read this out in real-time requires peak data transfer rate >30 Gbyte/sec
• Readout chips require de-randomising FIFOs → average data transfer rate ~600 Mbyte/sec
• Provide each Readout chip (CPR) with primary memory to store sparsified data (+ address tags)
  • ~1 Mbyte/CPR (Barrel 1) → ~10 Kbyte/CPR (Barrel 5)
• Read data out to DAQ during 200 msec dead-time after each bunch train
• Total sparsified data rate from VXD ~40 Mbyte/s (split between ±η)
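The average rates quoted for the vertex detector follow from the per-train volumes and the 5 Hz repetition rate. The sketch below reproduces that arithmetic using only numbers from the slides; the function name is hypothetical.

```python
TRAIN_RATE_HZ = 5.0            # bunch-train repetition rate
PIXELS_PER_TRAIN = 4e9         # pixels clocked out during each bunch train
BYTES_PER_PIXEL = 1.0          # upper bound assumed on the slide
SPARSIFIED_BYTES_PER_TRAIN = 8e6


def average_rate_bytes_per_s(bytes_per_train, train_rate_hz):
    """Average data rate when a fixed volume is produced once per bunch train."""
    return bytes_per_train * train_rate_hz


if __name__ == "__main__":
    raw = average_rate_bytes_per_s(PIXELS_PER_TRAIN * BYTES_PER_PIXEL, TRAIN_RATE_HZ)
    sparse = average_rate_bytes_per_s(SPARSIFIED_BYTES_PER_TRAIN, TRAIN_RATE_HZ)
    print(f"raw: {raw / 1e9:.0f} Gbyte/s, sparsified: {sparse / 1e6:.0f} Mbyte/s, "
          f"reduction: {raw / sparse:.0f}x")
```

Reading out every pixel would average 20 Gbyte/s, while 8 Mbytes per train averages 40 Mbyte/s, a reduction by a factor of about 500.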

Data collection
• Many ways to collect the data from all CPRs: this is only one possibility
• Empty CPR primary memories sequentially at 50 MHz onto byte-wide ring-buses at the ends of each barrel
• Serialise the data from each ring-bus at 400 Mbit/s and drive differential LVDS signals (or optical links) into 2 DAQ cards (±η)
• DAQ cards de-serialise the LVDS data, combine the 5 data streams, re-format, assemble and store the data for the entire bunch crossing (taking ~80 msec)
• 2 optical fibres per DAQ card export data to main DAQ and import readout control signals
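The bus and link rates in this scheme are matched: a byte-wide bus clocked at 50 MHz carries exactly 400 Mbit/s. The sketch below checks that and estimates a transfer time. The 4 Mbyte volume used in the example (half of the ~8 Mbytes per train, all passing through one stream) is purely an assumption to show the scale; the function names are hypothetical.

```python
def bus_rate_mbit_s(bus_width_bits, clock_hz):
    """Throughput of a parallel bus transferring one word per clock cycle."""
    return bus_width_bits * clock_hz / 1e6


def transfer_time_ms(n_bytes, link_rate_mbit_s):
    """Time to move n_bytes over a serial link of the given payload rate."""
    return n_bytes * 8 / (link_rate_mbit_s * 1e6) * 1e3


if __name__ == "__main__":
    rate = bus_rate_mbit_s(8, 50e6)
    print(f"ring-bus rate: {rate:.0f} Mbit/s")
    # Assumed volume: 4 Mbytes through a single 400 Mbit/s stream.
    print(f"transfer time: {transfer_time_ms(4e6, rate):.0f} ms")
```

Under that assumption the transfer takes about 80 ms, which fits comfortably within the 200 msec gap between bunch trains.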

“Galvanic” links need space…
[Photographs: small part of ATLAS L1Calo data link system installed underground; ~10 % of digital data links of ATLAS L1Calo trigger in a Birmingham test-rig]
• Data from ILC Vertex Detector could be transported on a single fibre!
• Upgraded L1Calo for Super-LHC will probably use fibres for all data transport
