CPM FDR, Architecture and Challenges

CPM architecture and challenges
• CP system requirements
• Architecture
    • Modularity
    • Data Formats
    • Data Flow
• Challenges
    • High-speed data paths
    • Latency

CP system requirements
• Cluster Algorithms (see the sketch after this list)
    • 4 x 4 x 2 cell environment
    • Sliding window
• Process −2.5 < η < 2.5 region
    • 50 x 64 trigger towers per layer
    • Two layers
    • 8 bit data (0–255 GeV)
• Relatively complex algorithm
• Output data to CTP
    • 16 x 3 bit hit counts
    • Each hit condition is a combination of four thresholds
• Output data to RODs
    • Intermediate results
    • RoI data for RoIB
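
To make the windowing concrete, here is a minimal Python sketch of a sliding-window hit count. It is illustrative only: the array shapes and names are assumptions, a plain 4 x 4 x 2 sum stands in for the real cluster/isolation sums, and a single-threshold compare stands in for the real four-threshold hit conditions.

```python
def saturating_sum(values):
    """Sum 8-bit tower energies, saturating at 255 (8-bit result)."""
    return min(sum(values), 255)

def cluster_hits(em, had, thresholds, n_eta, n_phi):
    """Slide a 4 x 4 x 2 window over the two tower layers and return
    one 3-bit saturating hit count per threshold condition."""
    counts = [0] * len(thresholds)
    for ieta in range(n_eta - 3):          # 4 x 4 window positions in eta
        for iphi in range(n_phi - 3):      # ... and in phi
            window = [em[ieta + i][iphi + j] + had[ieta + i][iphi + j]
                      for i in range(4) for j in range(4)]
            energy = saturating_sum(window)
            for k, thr in enumerate(thresholds):
                if energy > thr:
                    counts[k] = min(counts[k] + 1, 7)  # saturate at 3 bits
    return counts
```

The real algorithm also requires each window to be a local maximum before counting, to avoid double-counting overlapping windows; the sketch skips that step.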

System design considerations
• Several major challenges to overcome
    • Large processing capacity
    • Data I/O, largely at input
    • Latency requirements
• Processing must be split over several modules working in parallel
    • But the overlapping nature of the algorithms implies fan-out is needed
    • Modularity is a compromise between competing requirements
• High-connectivity back-plane required for data sharing
• Data must be 'compressed' as much as possible
    • Use data reduction wherever possible
    • Data serialisation at various speeds used to reduce I/O pin counts

System modularity (worked check after this list)
• Full system
    • 50 x 64 x 2 trigger towers
• Four crates, each processing one quadrant in phi
    • 50 x 16 x 2 core towers per crate
• Eta range split over 14 CPMs
    • 4 x 16 x 2 core towers per CPM
• Module contains 8 CP FPGAs
    • 4 x 2 x 2 core towers per FPGA
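
A worked check of the partitioning (the 56-vs-50 eta margin is my reading of the numbers; edge CPMs presumably cover fewer than 4 eta bins):

```latex
\begin{aligned}
\text{full system:} \quad & 50_\eta \times 64_\phi \times 2\ \text{layers} = 6400\ \text{towers} \\
\text{per crate:}   \quad & 64_\phi / 4\ \text{crates} = 16_\phi \;\Rightarrow\; 50 \times 16 \times 2 \\
\text{per CPM:}     \quad & 14\ \text{CPMs} \times 4_\eta = 56 \ge 50 \;\Rightarrow\; 4 \times 16 \times 2 \\
\text{per CP FPGA:} \quad & 8\ \text{FPGAs} \times 2_\phi = 16_\phi \;\Rightarrow\; 4_\eta \times 2_\phi \times 2
\end{aligned}
```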

Board-level fan-out, input signals and back-plane
• CPM has 64 core algorithm cells
    • 16 x 4 reference towers
    • Obtained from direct PPM connections (2 PPMs per CPM)
• Algorithm requires extra surrounding cells for 'environment'
    • One extra below, two above
    • 19 x 4 x 2 towers in all (arithmetic below)
• Fan-out in phi achieved via multiple copies of PPM output data
• Fan-out in eta achieved via back-plane
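
In numbers (my arithmetic from the bullets above):

```latex
\underbrace{16}_{\text{core}} + \underbrace{1}_{\text{below}} + \underbrace{2}_{\text{above}} = 19_\phi
\quad\Rightarrow\quad 19 \times 4_\eta \times 2\ \text{layers} = 152\ \text{towers used by the algorithm}
```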

Internal fan-out and the Cluster Processing FPGA environment
• CP FPGA processes 2 x 4 reference cells
• Algorithm requires 4 x 4 x 2 cells around each reference cell
• Convolving these gives a 5 x 7 x 2 FPGA environment (see below)
• Data received from 18 different serialiser FPGAs
    • 6 on-board
    • 12 through back-plane
[Diagram: CP FPGA environment — 'core' cells surrounded by data arriving from left (on-board), from right, from above and from below]
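
A worked check of the environment size, assuming each 4 x 4 window extends at most 4 − 1 = 3 cells beyond its reference cell along each axis:

```latex
(2 + 3) \times (4 + 3) \times 2 \;=\; 5 \times 7 \times 2
```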

CPM data formats – tower data
• 8 bit tower data
• PPM peak-finding algorithm guarantees any non-zero data is surrounded by zeroes
    • Allows data encoding/compression
• Two 8 bit towers converted to one 9 bit 'BC-muxed' data word
    • Add odd-parity bit for error detection
• 160 input towers encoded in 80 x 10 bit data streams
• Same format used for:
    • input to CPM
    • between serialiser FPGA and CP FPGA
[Diagram: two towers x 8 bits → 'BC-muxed' 10 bit data = 8 bit data + bcmux bit + parity bit]
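
A minimal sketch of the 10-bit word packing in Python. The exact bit layout (data in bits 0–7, BC-mux flag in bit 8, odd parity in bit 9) is my assumption. Because each tower's non-zero sample is surrounded by zeros in time, two towers can share one link: a simultaneous hit on the second tower can be deferred to the following, guaranteed-empty, bunch crossing. The sketch shows only the single-crossing packing, not the deferral logic.

```python
def odd_parity(word, nbits):
    """Return the bit that makes the total number of set bits odd."""
    ones = bin(word & ((1 << nbits) - 1)).count("1")
    return 1 - (ones & 1)

def bcmux_pack(tower_a, tower_b):
    """Pack a tower pair into one 10-bit word (single-crossing case).

    Assumes at most one tower of the pair fires this crossing; the
    real scheme defers a simultaneous second hit to the next crossing.
    """
    mux = 1 if tower_b else 0                 # flag which tower the data belongs to
    data = (tower_b if mux else tower_a) & 0xFF
    word = (mux << 8) | data                  # 9-bit 'BC-muxed' word
    return (odd_parity(word, 9) << 9) | word  # bit 9: odd parity -> 10 bits
```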

CPM data formats – hits and readout
• CPM hit results:
    • 16 x 3 bit saturating sums
    • 8 sent to left CMM, 8 sent to right
    • 8 x 3 = 24 result bits plus 1 odd-parity bit added
• DAQ readout
    • Per L1A, 84 x 20 bits of data (accounting below)
    • Bulk of data is BC-demuxed input data
        • 10 bits per tower: 8 bits data, 1 parity-error bit, 1 link-error bit
        • 160 direct inputs x 10 bit data = 80 x 20 bits
    • 48 bits hit data, 12 bits Bcnum, 20 odd-parity check bits
• RoI readout
    • Per L1A, 22 x 20 bits of data
    • Bulk of data is individual CP FPGA hits and region locations
        • 16 hit bits + 2 bits location + 1 bit saturation + 1 bit parity error
        • 8 FPGAs each have 2 RoI locations = 8 x 2 x 20 bits
    • Rest is 12 bits Bcnum and odd-parity check bits
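
The DAQ word count is self-consistent (my grouping of the trailing bits into four 20-bit words):

```latex
\underbrace{160 \times 10}_{\text{towers}} = 80 \times 20\ \text{bits}, \qquad
\underbrace{48}_{\text{hits}} + \underbrace{12}_{\text{Bcnum}} + \underbrace{20}_{\text{parity}} = 4 \times 20\ \text{bits}
\quad\Rightarrow\quad 84 \times 20\ \text{bits per L1A}
```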

CPM data flow: signal speeds
• Multiple protocols and data speeds used throughout board
• Care needed to synchronize data at each stage
• This has proved to be the biggest challenge on the CPM
[Diagram: data flow — 400 Mbit/s serial data (480 Mbit/s with protocol) → LVDS deserialisers → 40 MHz parallel data → Serialiser FPGAs → 160 MHz serial data → CP FPGAs → 40 MHz parallel data → Hit Merger and Readout Controllers → 640 Mbit/s serial data (800 Mbit/s with protocol)]
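
All the serial rates are integer multiples of the 40 MHz bunch-crossing clock; the framing-bit split is my inference from the quoted payload vs line rates:

```latex
\begin{aligned}
\text{LVDS input:}  \quad & 10 \times 40\ \text{MHz} = 400\ \text{Mbit/s}, \qquad 12 \times 40 = 480\ \text{Mbit/s with protocol} \\
\text{back-plane:}  \quad & 160\ \text{MHz} = 4\ \text{bits per 25 ns tick per stream} \\
\text{Glink output:}\quad & 16 \times 40 = 640\ \text{Mbit/s}, \qquad 20 \times 40 = 800\ \text{Mbit/s with protocol}
\end{aligned}
```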

CPM challenges: high-speed data paths
• 400 (480) Mbit/s input data
    • Needed to reduce input connectivity
        • 80 differential inputs plus grounds = 200 pins/CPM (arithmetic below)
    • Previous studies of the LVDS chipset established viability
        • Works very reliably with test modules (DSS/LSM)
        • Still some questions over pre-compensation and PPM inputs
• 160 MHz CP FPGA input data
    • Needed to reduce back-plane connectivity
        • 160 fan-in and 160 fan-out pins per CPM
    • Needed to reduce CP FPGA input pin count
        • 108 input streams needed per chip
    • This has been the subject of the most study in prototype testing
• 640 (800) Mbit/s Glink output data
    • Glink chipset successfully used in demonstrators
    • Needed some work to understand interaction with RODs
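
For scale (the ground-pin count is my inference from 200 − 160):

```latex
80\ \text{pairs} \times 2 + 40\ \text{grounds} = 200\ \text{pins/CPM}, \qquad
160_{\text{fan-in}} + 160_{\text{fan-out}} = 320\ \text{back-plane data pins}
```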

CPM challenges: latency
• CP system latency budget: ~14 ticks
    • This is a very difficult target
• Note, the CPM is only the first stage of the CP system
    • CMM needs about 5 ticks
• CPM latency, irreducible:
    • Input cables: > 2 ticks
    • LVDS deserialisers: ~2 ticks
    • Mux/demux to 160 MHz: ~1 tick
    • BC-demuxing algorithm: 1 tick
• Remaining budget: 14 − 5 − 2 − 2 − 1 − 1 = 3 ticks!

Conclusions
• The CPM is a very complex module
• Difficulties include:
    • High connectivity
    • Multiple time-domains
    • Tight constraints on latency
    • Large overall system size
• Extensive testing has shown that the current prototype CPM meets these demands