Start-up & synchronization sequence for Front-End. LHCb Electronics Upgrade Meeting 13 February 2014. F. Alessio, CERN. Scope & Outline. Aim at describing the way in which the system would globally: Synchronize the readout of events at the beginning of a run
Start-up & synchronization sequence for Front-End LHCb Electronics Upgrade Meeting 13 February 2014 F. Alessio, CERN
Scope & Outline • Aim at describing the way in which the system would globally: • Synchronize the readout of events at the beginning of a run • («Start-of-run fast sequence») • to ensure TELL40 code is able to align to incoming data stream • to ensure FE is aligned to correct BXID • Resynchronization mechanisms • After a desynchronization • Preventive resynch to avoid loss of data • Usage of FE Reset and Synch command 2
Synch and alignment of TFC links 1. Synch and align the TFC links: send Synch pattern in 8b/10b. Firmware will check synchronization of buffer and alignment of content of frame. When done, set synch bit via ECS. 3
Synch and alignment of TFC links 2. Synch and align TFC-TELL40 links: send Synch pattern in 8b/10b. Firmware will check synchronization of buffer and alignment of content of frame. When done, set synch bit via ECS. 4
Synch and alignment of TFC links 3. Synch and align TFC-FE links: send TFC commands in GBT according to each FE’s mapping. SOL40 must be configured before FE and clock transmission must be up (minimal SODIN configuration). GBT does the job. When synch’d, can start configuring FE. 5
Synch and alignment of TFC links 3bis. Special mode: relay TFC commands back to SOL40 to measure latency • SOL40 will measure transmission delay by comparing BXID • Interested only in BXID latency, not fine phase • However, if at the limit, might see some desynch and adjust... 6
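A minimal sketch of the BXID comparison described on this slide. It assumes the standard LHC orbit of 3564 bunch crossings, so that the BXID counter wraps at that value; the function name `bxid_latency` is hypothetical and is not taken from the SOL40 firmware.

```python
ORBIT_LENGTH = 3564  # bunch crossings per LHC orbit; the BXID wraps at this value


def bxid_latency(sent_bxid: int, local_bxid: int) -> int:
    """Latency, in bunch crossings, between the BXID carried by a relayed
    TFC word and the local BXID counter at the moment the word comes back.

    The modulo handles a relay that straddles the orbit boundary.
    """
    return (local_bxid - sent_bxid) % ORBIT_LENGTH


# A word tagged with BXID 3560 that returns when the local counter reads 4
# has taken 8 crossings, not -3556.
round_trip = bxid_latency(3560, 4)
```

Only the coarse latency in units of 25 ns is recovered this way; the fine phase is outside the scope of such a comparison, as the slide notes.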
Synch and alignment of TFC links 4. Synch and align FE-TELL40 links: send Synch Pattern in GBT. TELL40 LLI (GBT decoding + GX buffer) will synchronize the links. Special Synch Pattern is used to align the data stream to TELL40 processing logic. 7
Synch and alignment of TFC links 4bis. Special mode: relay TFC commands back to TELL40 to measure latency • TELL40 will measure transmission delay by comparing BXID received by SOL40 and FE • Interested only in BXID latency, not fine phase • However, if at the limit, might see some desynch and adjust... 8
ECS start-of-run sequence Just like now, current LHCb experiment 9
TFC start-of-run sequence • FE Reset is the first thing issued via control system: asynchronous! • Issued when TFC receives command «GO» • Ensure that FE got the right BXID value before synching to TELL40 • This sequence is the same every time a FE Reset is issued. • If FE Reset is issued, change Run #. For «in-Run» resynch, use Synch command. 11
What to do on SYNCH? Send something like this! • Synch command is meant to be sure that (whole) system is synchronized… in a synchronous way! • FEs send Synch Pattern for the same BXIDs everywhere • FE frees its memory: deletes its content, read and write pointers back to empty • FE sends Synch Pattern for as many crossings as Synch commands are set by TFC • Next event starts at bit 0 (LSB), might be header only • TELL40 will align to corresponding frame and BXID • TELL40 closes all events before and sends them out truncated • TELL40 does not know what BXID to expect. Synch pattern says it. • Note: Synch command can be used in lab tests without TFC. For few links, system can align independently of TFC (just program your FE to send it and TELL40 to receive it). TELL40 would be in pass-through (i.e. accept all). • As soon as you need synchronicity across all links, then you need TFC. TFC will ensure that Synch command is sent out for same BXID everywhere, automatically aligning TELL40 readout. 12
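The FE behaviour listed above can be sketched as a toy model. Everything here is an assumption made for illustration: the class name `FrontEndBuffer`, the string used as Synch Pattern, the IDLE marker, and the one-word-per-clock output link. The real pattern and buffer layout are defined in the specification documents, not here.

```python
from collections import deque

SYNCH_PATTERN = "SYNCH"  # placeholder; the real pattern is defined in the specs
IDLE = "IDLE"            # placeholder for "nothing to send"


class FrontEndBuffer:
    """Toy FE derandomizing buffer that sends one word per 40 MHz clock."""

    def __init__(self):
        self.fifo = deque()

    def clock(self, bxid, words, synch):
        """Advance one bunch crossing and return what goes on the link.

        bxid  : BXID of this crossing
        words : data words produced by this crossing (may be empty)
        synch : True if the TFC Synch command is set for this crossing
        """
        if synch:
            # Free the memory: content dropped, pointers back to empty,
            # and the Synch Pattern is sent together with the BXID.
            self.fifo.clear()
            return (SYNCH_PATTERN, bxid)
        self.fifo.extend((bxid, w) for w in words)
        return self.fifo.popleft() if self.fifo else (IDLE, bxid)


fe = FrontEndBuffer()
out = [
    fe.clock(0, ["a", "b", "c"], synch=False),  # sends word a, keeps b and c
    fe.clock(1, ["d"], synch=False),            # sends word b, keeps c and d
    fe.clock(2, [], synch=True),                # buffer cleared, pattern sent
    fe.clock(3, ["e"], synch=False),            # first word after the synch
]
```

After the Synch crossing the stale words `c` and `d` are gone, so the first word on the link belongs to the first crossing after the command, which is what lets the TELL40 realign.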
What to do on SYNCH? Send something like this! • Double usage (in AND or in OR): • Periodically: i.e., SYNCH command sent at a rate of n Hz • à la FE Reset, but without resetting • a bit inefficient to clear the FE buffer (could be like this only at the beginning and when very few bunches, i.e. lots of empty-empty crossings) • Asynchronously: i.e. when a desynch is detected, like TELL40 detects wrong frames, wrong packing • needs fast diagnostics in TELL40 code • makes sense to clear the FE buffer in this case • could be sent only for a local sub-detector from SOL40 • (slow) triggered by ECS (TELL40 or FE set a desynch bit, ECS sends command to SOL40) • or by TELL40 via SOL40 transmitting an info field regarding this w/ throttle protocol (we have plenty of BW) • Resynchronization sequence 13
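One possible form of the "fast diagnostics" mentioned above is a check that the BXIDs decoded from consecutive headers increase by one. This is only a sketch under two assumptions: that every crossing produces a header (so BXIDs are consecutive), and that the orbit has 3564 crossings. The function name `find_desynch` is hypothetical; the actual TELL40 checks are firmware-specific.

```python
ORBIT_LENGTH = 3564  # bunch crossings per LHC orbit


def find_desynch(bxids):
    """Return the index of the first header whose BXID does not follow the
    previous one (modulo the orbit length), or None if the stream is in step.
    """
    for i in range(1, len(bxids)):
        if bxids[i] != (bxids[i - 1] + 1) % ORBIT_LENGTH:
            return i
    return None


in_step = find_desynch([3562, 3563, 0, 1])  # wraps cleanly at the orbit end
slipped = find_desynch([10, 11, 13, 14])    # BXID 12 is missing
```

A non-`None` result would be the point at which a desynch bit is set or a Synch request is forwarded, by whichever of the two paths listed on the slide is chosen.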
Resynchronization sequence • Synch command was requested • From ECS or fast via TELL40 • Note: Synch command is entirely programmable in frequency, length and location around the orbit. Header Only is entirely programmable in length. 14
Conclusion TFC documents updated, please review and comment. LHCb-PUB-2012-001 (TFC specs) LHCb-PUB-2012-017 (FE and BE specs) 15
Backup 16
System and functional requirements • Bidirectional communication network • Clock jitter, and phase and latency control • At the FE, but also at TELL40 and between S-TFC boards • Partitioning to allow running with any ensemble and parallel partitions • LHC interfaces • Events rate control • Low-Level-Trigger input • Support for old TTC-based distribution system • Destination control for the event packets • Sub-detectors calibration triggers • S-ODIN data bank • Information about transmitted events • Test-bench support 17
The S-TFC system at a glance • S-ODIN responsible for controlling upgraded readout system • Distributing timing and synchronous commands • Manages the dispatching of events to the EFF • Rate regulates the system • Support old TTC system: hybrid system! • SOL40 responsible for interfacing FE+TELL40 slice to S-ODIN • Fan-out TFC information to TELL40 • Fan-in THROTTLE information from TELL40 • Distributes TFC information to FE • Distributes ECS configuration data to FE • Receives ECS monitoring data from FE 18
The upgraded physical readout slice - ATCA • Common electronics board for upgraded readout system: AMC40 cards fitted in an ATCA motherboard • S-ODIN & SOL40 AMC cards • LLT & TRIG40 AMC cards • TELL40s AMC card • LHC Interfaces specific AMC card 20
The upgraded physical readout slice – PCIe • Common electronics board for upgraded readout system: PCIe-Gen3 card fitted on a host PC • SODIN & SOL40 PCIe card • LLT PCIe card • TELL40 PCIe card • LHC Interfaces specific PCIe card 21
TFC protocol to TELL40 • «Extended» TFC word to TELL40 via SOL40: • 64 bits sent at 40 MHz = 2.56 Gb/s • packed with 8b/10b protocol (i.e. total of 80 bits) • no dedicated GBT buffer, use ALTERA GX simple 8b/10b encoder/decoder • MEP accept command when MEP ready: • Take MEP address and pack to FARM • No need for special address, dynamic • Constant latency after BXID • THROTTLE information from each TELL40 to SOL40: • 1 bit for each AMC board + BXID for which the throttle was set • merged and aligned in SOL40 • same GX buffer as before (same bidirectional transceiver) 22
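The link-rate figures on this slide follow from simple arithmetic, reproduced here as a check. The sketch assumes a nominal 40 MHz clock, as the slide does; the helper name `line_rate_gbps` is hypothetical.

```python
CLOCK_HZ = 40e6  # nominal bunch-crossing clock used on the slide


def line_rate_gbps(bits_per_cycle, clock_hz=CLOCK_HZ):
    """Rate in Gb/s of a link carrying `bits_per_cycle` bits per clock."""
    return bits_per_cycle * clock_hz / 1e9


tfc_word_bits = 64
encoded_bits = tfc_word_bits * 10 // 8   # 8b/10b: 10 line bits per 8 data bits

payload_gbps = line_rate_gbps(tfc_word_bits)  # 2.56 Gb/s of TFC payload
encoded_gbps = line_rate_gbps(encoded_bits)   # 3.2 Gb/s on the serial line
```

The 80 encoded bits per crossing mean the transceiver itself runs at 3.2 Gb/s even though the useful payload is 2.56 Gb/s.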
TFC protocol to FE • TFC word on downlink to FE via SOL40 embedded in GBT word: • 24 bits in each GBT frame at 40 MHz = 0.96 Gb/s • all commands associated to BXID in TFC word • Put local configurable delays for each TFC command • GBT does not support individual delays for each line • Need for «local» pipelining: detector delays + cables + operational logic (i.e. laser pulse?) • DATA SHOULD BE TAGGED WITH THE CROSSING TO WHICH IT BELONGS! • TFC word will arrive before the actual event takes place • To allow use of commands/resets for particular BXID • Accounting of delays in SODIN: for now, 16 clock cycles earlier + time to receive • Aligned to the furthest FE (simulation, then in situ calibration!) • TFC protocol to FE has implications on GBT configuration and ECS to/from FE • see specs document! 23
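The "local configurable delay" asked of each FE can be pictured as a shift register per TFC command line. The sketch below is a behavioural model only; the class name `CommandDelay` and the delay value are hypothetical, and a real FE would implement this in its own logic.

```python
from collections import deque


class CommandDelay:
    """Delay one TFC command bit by a configurable number of clock cycles."""

    def __init__(self, delay_cycles):
        # Pre-filled with zeros: nothing is asserted until the first
        # command has travelled through the pipeline.
        self.pipe = deque([0] * delay_cycles)

    def clock(self, bit):
        """Push this cycle's command bit in, pop the delayed one out."""
        self.pipe.append(bit)
        return self.pipe.popleft()


# A command arriving early is held for 3 cycles so that it is applied on
# the crossing it refers to.
delay = CommandDelay(3)
delayed = [delay.clock(b) for b in [1, 0, 0, 0, 0]]
```

Because the TFC word is sent ahead of the event, each FE only ever needs to add delay, never remove it, which is why aligning to the furthest FE works.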
Timing distribution • From TFC point of view, ensured constant: • LATENCY: Alignment with BXID • FINE PHASE: Alignment with best sampling point • Some resynchronization mechanisms envisaged: • Within TFC boards • With GBT • No impact on FE itself • Loopback mechanism: • re-transmit TFC word back • allows for latency measurement + monitoring of TFC commands and synchronization 25
How to decode TFC in FE chips? FE electronics block • Usage of TFC+ECS GBTs in FE is 100% common to everybody!! • dashed lines indicate the detector specific interface parts • please pay particular care to the clock transmission: the TFC clock must be used by FE to transmit data, i.e. low jitter! • Kapton cable, crate, copper between FE ASICs and GBTX 26
The TFC+ECS Master GBT • [Figure: GBTX block diagram with external clock reference, phase-shifter outputs Clock[7:0], e-ports to the FE modules (80, 160 and 320 Mb/s ports), one 80 Mb/s port to the GBT-SCA, and JTAG/I2C control] • Clock[7:0]: these clocks should be the main clocks for the FE • 8 programmable phases • 4 programmable frequencies (40, 80, 160, 320 MHz) • Used to: • sample TFC bits • drive Data GBTs • drive FE processes 27
The TFC+ECS GBT protocol to FE • TFC protocol has direct implications on the way in which GBT should be used everywhere • 24 e-links @ 80 Mb/s dedicated to TFC word as a baseline (effectively 48 bits) • use 80 MHz phase shifter clock to sample TFC parallel word • TFC bits are packed in GBT frame so that they all come out on the same clock edge • Modifications are possible in order to satisfy sub-detector’s FE requirements • Leftover e-links dedicated to GBT-SCAs for ECS configuring and monitoring (see later) 28
Words come out from GBT at 80 Mb/s • In simple words: • Odd bits of GBT protocol on rising edge of 40 MHz clock (first, msb), • Even bits of GBT protocol on falling edge of 40 MHz clock (second, lsb) 29
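The odd/even rule can be illustrated by splitting a frame field into the two bits each 80 Mb/s e-link delivers per 40 MHz period. The mapping used here (e-link k carries field bits 2k+1 and 2k) is an assumption for illustration, as is the function name `elink_bits`; the authoritative bit-to-pin assignment is in the GBT and TFC specifications.

```python
def elink_bits(field, n_elinks=24):
    """Split a GBT frame field into per-e-link bit pairs.

    Returns a list of (rising_edge_bit, falling_edge_bit) tuples, one per
    e-link. Assumed mapping: e-link k outputs the odd bit 2k+1 first
    (rising edge, msb) and the even bit 2k second (falling edge, lsb).
    """
    return [
        ((field >> (2 * k + 1)) & 1, (field >> (2 * k)) & 1)
        for k in range(n_elinks)
    ]


# Field 0b0110 on two e-links: e-link 0 gets bits (1, 0), e-link 1 gets (0, 1).
pairs = elink_bits(0b0110, n_elinks=2)
```

With 24 e-links this accounts for the "effectively 48 bits" quoted for the baseline: two bits per e-link per bunch crossing.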
TFC decoding at FE after GBT • This is crucial!! • we can already specify where each TFC bit will come out on the GBT chip • this is the only way in which FE designers still have minimal freedom with GBT chip • if TFC info was packed to come out on only 12 e-links (first odd then even), then decoding in FE ASIC would be mandatory! • which would mean that the GBT bus would have to go to each FE ASIC for decoding of TFC command • there is also the idea to repeat the TFC bits on even and odd bits in TFC protocol • would that help? • FE could tie logical blocks directly on GBT pins… • Or to select a minimal set of TFC commands and repeat them to profit from fan-out possibilities 30
Now, what about the ECS part? • Each pair of bits from ECS field inside GBT can go to a GBT-SCA • One GBT-SCA is needed to configure the Data GBTs (EC one for example?) • The rest can go to either FE ASICs or DCS objects (temperature, pressure) via other GBT-SCAs • GBT-SCA chip has already everything for us: interfaces, e-link ports... • No reason to go for something different! • However, «silicon for SCA will come later than silicon for GBTX»… • We need something while we wait for it! FPGA emulator (working on it) 31
SOL40 firmware • Protocol drivers build GBT-SCA packets with addressing scheme and bus type for associated GBT-SCA user busses to selected FE chip • Basically each block will build one of the GBT-SCA supported protocols • Memory Map with internal addressing scheme for GBT-SCA chips + FE chips addressing, e-link addressing and bus type: content of memory loaded from ECS 32
Usual considerations … • TFC+ECS Interface has the ECS load of an entire FE cluster for configuring and monitoring • 34 bits @ 40 MHz = 1.36 Gb/s on single GBT link • ~180 Gb/s for full SOL40 (132 links) • Single CCPC might become bottleneck… • Clara & us, December 2011 • How long to configure FE cluster? • how many bits / FE? • how many FEs / GBT link? • how many FEs / TFC+ECS Interface? • Numbers to be pinned down soon + GBT-SCA interfaces and protocols. 33
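The bandwidth figures above, and the shape of the open "how long to configure" question, can be written down as a short calculation. The per-FE configuration size and the number of FEs per link passed to `config_time_s` are purely hypothetical placeholders, since the slide states that these numbers are still to be pinned down.

```python
ECS_BITS_PER_FRAME = 34   # ECS bits per GBT frame, from the slide
CLOCK_HZ = 40e6           # nominal 40 MHz frame rate
LINKS_PER_SOL40 = 132     # GBT links on a full SOL40, from the slide

link_rate_bps = ECS_BITS_PER_FRAME * CLOCK_HZ    # 1.36 Gb/s per link
sol40_rate_bps = link_rate_bps * LINKS_PER_SOL40  # about 180 Gb/s in total


def config_time_s(bits_per_fe, fes_per_link):
    """Lower bound on the time to push the configuration of all FEs on one
    link, ignoring protocol overhead and any bottleneck upstream."""
    return bits_per_fe * fes_per_link / link_rate_bps


# Hypothetical example: 10 FEs per link, 1.36 Mbit of configuration each.
example_s = config_time_s(bits_per_fe=1_360_000, fes_per_link=10)
```

The link itself is unlikely to be the limit in such an estimate; the slide's concern is rather whether a single CCPC can feed roughly 180 Gb/s worth of links.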
Old TTC system support and running two systems in parallel • We already suggested the idea of a hybrid system: • reminder: L0 electronics relying on TTC protocol • part of the system runs with old TTC system • part of the system runs with the new architecture • How? • Need connection between S-ODIN and ODIN (bidirectional) • use dedicated RTM board on S-ODIN ATCA card • In an early commissioning phase ODIN is the master, S-ODIN is the slave • S-ODIN task would be to distribute new commands to new FE, to new TELL40s, and run processes in parallel to ODIN • ODIN tasks are the ones today + S-ODIN controls the upgraded part • In this configuration, upgraded slice will run at 40 MHz, but positive triggers will come only at maximum 1.1 MHz… • Great testbench for development + tests + apprenticeship… • By-product: improve LHCb physics programme in 2015-2018… • 3. In the final system, S-ODIN is the master, ODIN is the slave • ODIN task is only to interface the L0 electronics path to S-ODIN and to provide clock resets on old TTC protocol 34
Reminder: your (generic) FE • NO TRIGGER to FE! Only commands, clock and slow control • For details, see LHCb-INT-2011-011 • Compress (zero-suppress) data already at the FE • reduce # of links from ~80000 to ~12500 (~20 MCHF to ~3.1 MCHF) • data driven readout (asynchronous) + variable latencies! • Efficiently use data link bandwidth • pack data on data link continuously with elastic buffer • extensive use of GBT (robust FEC vs WideBus mode) • evaluate choices based on complexity vs robustness 37
Fast & Slow Control to FE • [Figure: TFC, ECS and Data paths between on-detector and off-detector electronics over 4.8 Gb/s links] • Separate links between controls and data • A lot of data to collect • Controls can be fanned-out (especially fast control) • Compact links merging Timing, Fast and Clock (TFC) and Slow Control (ECS). • Extensive use of GBT as Master GBT to drive Data GBT (especially for clock) • Extensive use of GBT-SCA for FE configuration and monitoring 38
Reminder: generic FE data flow scheme • Modify data according to TFC commands + BufferFull, then pack continuously onto GBT • Tag data with TFC commands and pipe them across compression/suppression logic block • FE buffer for data • Data available needed only if compression/suppression is dynamic • Applies changes to data • Compression/suppression logic can have dynamic or static latency 39
The code: GBT dynamic packing • Very important to analyze simulation output bit-by-bit and clock-by-clock! 42
Reminder: dynamic packing algorithm • Average event size = link bandwidth • [Figure: events BX0 to BX4 packed continuously across frames 0 to 4, showing link bandwidth, average event size and buffer depth] • Header is the unique identifier for each event in frame • Compulsory (tag for each crossing), partly programmable (must contain length of frame + BXID) • Difficult buffer management, but almost no truncation. • Flexible against occupancy problem (what if your estimate is wrong?). • Maximum exploitation of bandwidth. • Readout Board uses Header to decode and separate frames: lots of resources. 43
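The idea of dynamic packing can be sketched in a few lines: events of variable length are laid back to back and the resulting bit stream is cut into fixed-size link frames, so an event may straddle a frame boundary. This is a behavioural toy, not the simulated firmware code referred to on the previous slide; the function name `pack_dynamic`, the use of bit strings, and the 8-bit frame size in the example are all assumptions chosen for readability.

```python
def pack_dynamic(events, frame_bits):
    """Pack variable-length events continuously into fixed-size frames.

    events     : list of bit strings, each one event with its header included
    frame_bits : payload size of one link frame
    The last frame is zero-padded if the stream does not fill it.
    """
    stream = "".join(events)
    frames = [stream[i:i + frame_bits] for i in range(0, len(stream), frame_bits)]
    if frames and len(frames[-1]) < frame_bits:
        frames[-1] = frames[-1].ljust(frame_bits, "0")
    return frames


# Three events of 4, 2 and 6 bits fill one and a half 8-bit frames; the
# third event starts in the first frame and finishes in the second.
frames = pack_dynamic(["1111", "00", "101010"], frame_bits=8)
```

No bits are dropped, which is the "almost no truncation" property, but the receiver has to use the length carried in each header to find the event boundaries again, hence the resource cost in the readout board.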
Reminder: fixed packing algorithm • Average event size ≠ link bandwidth: truncation! • [Figure: events BX0 to BX4 each placed in its own frame 0 to 4, showing link bandwidth, average event size and buffer depth] • This is different: one clock cycle, one event, one GBT frame • Header more flexible: you can add addresses, hitmaps… • Very simple buffer management, but truncation has to happen eventually. • Not flexible against occupancy problem (again, what if your estimate is wrong?). • Loses a bit of bandwidth as empty spaces must be padded to be sent out. • Readout Board uses Header to decode and separate frames: much fewer resources 44
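For comparison, a toy version of fixed packing: each event gets exactly one frame, so short events are padded and long ones are truncated. As before, the function name `pack_fixed`, the bit-string representation and the 8-bit frame size are illustrative assumptions only.

```python
def pack_fixed(events, frame_bits):
    """Place one event per frame.

    Returns (frames, n_truncated): events longer than the frame lose their
    tail, shorter ones are zero-padded to the frame size.
    """
    frames = []
    n_truncated = 0
    for event in events:
        if len(event) > frame_bits:
            n_truncated += 1
        frames.append(event[:frame_bits].ljust(frame_bits, "0"))
    return frames, n_truncated


# The 10-bit event does not fit in an 8-bit frame and loses its last 2 bits;
# the two short events waste 4 and 6 bits of padding respectively.
fixed_frames, lost = pack_fixed(["1111", "00", "1010101010"], frame_bits=8)
```

The frame boundaries are trivially known to the receiver, which is where the saving in readout-board resources comes from, at the price of padding and eventual truncation.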
Reminder: fixed vs variable length header in dynamic packing • Dynamic packing with fixed length header. • Dynamic packing with dynamic length header (fully flexible!) 45
FE flow control scheme • Few comments to start with: • BX Veto and HeaderOnly commands are identical from FE point of view: ORed • TFC commands are synchronous wrt BXID Reset • once we align BXID Reset with beam, TFC commands come ALWAYS at the same latency (wrt BXID Reset, hence BXID)! • Compression/suppression logic should act according to TFC command (why would you want to compress/suppress if that crossing is rejected a priori? Especially if your pre-processing is dynamic…) • Data is filtered according to TFC commands and the FE buffer status • Data is packed onto the GBT link in a continuous fashion 46
Data Valid • GBT can accept DATA or IDLE frame: • Send IDLE frame whenever a GBT frame is not ready to be sent! • IDLE frame can contain whatever your sub-detector wants to send. See TELL40 fw specs, coming soon… • Data Valid signal to distinguish between DATA and IDLE frame 47
Data Valid • Be careful to raise the Data Valid signal in sync with the right rising edge when using the 80 MHz clock (or 160 or 320…) • If it is not, GBTX would split the frame!! • Synchronize your DV signal to the beginning of the GBT frame! 48
Data Valid • Some sub-detectors will connect more FEs to the same GBT transmitter: • Each FE with its own memory • It can happen that one can send DATA, the other cannot! (IDLE vs DATA in the same packet) • Keep DV always high! • You HAVE to indicate whether the packet was DATA or IDLE, by sacrificing one bit of your DATA/IDLE frame 49
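When several FEs share one GBT transmitter and DV is kept high, each FE's slice of the frame needs its own DATA/IDLE marker. The sketch below shows one way to spend that bit; the convention (flag first, 1 = DATA, 0 = IDLE), the function name `build_shared_frame` and the 4-bit payload are assumptions for illustration, not the TELL40 firmware format.

```python
def build_shared_frame(fe_words, payload_bits):
    """Build one frame shared by several FEs, with DV assumed always high.

    fe_words     : one entry per FE, an int payload or None if that FE
                   has nothing to send this cycle (IDLE)
    payload_bits : payload width per FE, excluding the flag bit
    Each FE slice is a 1-bit DATA/IDLE flag followed by its payload.
    """
    frame = ""
    for word in fe_words:
        if word is None:
            frame += "0" + "0" * payload_bits          # IDLE slice
        else:
            if word < 0 or word >= (1 << payload_bits):
                raise ValueError("payload does not fit in the slice")
            frame += "1" + format(word, f"0{payload_bits}b")  # DATA slice
    return frame


# FE 0 sends the value 5, FE 1 is idle this cycle.
shared = build_shared_frame([5, None], payload_bits=4)
```

The receiver can then treat each slice independently, which is exactly the case a single frame-wide DV signal cannot express.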