200 likes | 311 Views
FPGA Co-processor for the ALICE High Level Trigger. Gaute Grastveit University of Bergen Norway. H.Helstrup 1 , J.Lien 1 , V.Lindenstruth 2 , C.Loizides 5 , D.Roehrich 3 , B.Skaali 4 , T.Steinbeck 2 , K.Ullaland 3 , A.Vestbo 3 , T. Vik 4 , A. Wiebalck 2 for the ALICE Collaboration
E N D
FPGA Co-processor for the ALICE High Level Trigger Gaute Grastveit University of Bergen Norway H.Helstrup1, J.Lien1, V.Lindenstruth2, C.Loizides5, D.Roehrich3, B.Skaali4, T.Steinbeck2, K.Ullaland3, A.Vestbo3, T. Vik4, A. Wiebalck2 for the ALICE Collaboration 1Bergen College, Norway 2Kirchhoff Institute for Physics, University of Heidelberg, Germany 3Departement of Physics, University of Bergen, Norway 4Departement of Physics, University of Oslo, Norway 5Institute of Nuclear Physics, University of Frankfurt, Germany
ALICE– A Large Ion Collider Experiment TPC - Time Projection Chamber
Very High Data Rate Pb-Pb central collisions Event rate: 200Hz Event size: ~75Mb => 15 Gbyte/s Max data-rate to tape is 1.25 Gbyte/s Compression/selection is needed Conventional, lossless methods: factor 2
HLT functionality • Compress • Reduce the amount of data required to encode the event as far as possible without loosing physics information • Trigger • Accept/reject events on the basis of physics application • Select • Select regions of interest within an event • remove pile-up in p-p • ... Task: reconstruct the tracks of 20.000 charged particles (each producing 150 clusters) in the TPC Timebudget: 5 ms
The HLT setup Data are received in parallel RCU – Readout Controller Unit DDL – Data Detector Link RORC – ReadOut Reciver Card HLT farm • PCI kernel in the FPGA • FPGA will also be utilised for pattern recognition • Reduces number of CPU’s needed
The HLT FPGA co-processor • FPGA: APEX 20K400 • Next prototype: Altera Stratix FPGA • Large internal memory • DSP cores
Two Schemes for Finding Tracks • Low occupancy (p-p, Pb-Pb outer padrows) • Conventional approach with (2d) cluster finder and track follower • High occupancy (overlapping clusters): • Hough transform on raw data • Cluster analysis for deconvolution • (Kalman filter) High multiplicity picture
time The numbers represent Charge (ADC values) A vertical uninterrupted stack of numbers is called a sequence. The square shows the geometric centre of the sequence. Neighbouring sequences belong to the same Cluster. Final mean value: (Weighted mean) Pad
FPGA implementation of a cluster finder - the algorithm • Calculate the mean for every sequence • Adjacent pads with similar means are merged • Two lists of sequences are used: one for clusters on the previous pad one for clusters on the current pad • Clusters are removed from the searchrange when a match is found or we know it is finished • Clusters are inserted in the inputrange after merging or when we start a new cluster Memory of clusters begin Searchrange / Previous pad end Inputrange / Current pad insert
Block Diagram, Verification T Testbench Top structure RAM (lpm) Decoder FIFO (lpm) Merger cluster seq seq File: charges File: VHDL clusters File: C++ clusters C++ model C++ program compares the results
Relative Scales As before the mean is calculated by: smaller + Smaller numbers, only multiplies by <11 - Multiplication can’t be done until merging takes place Alternative, (absolute): Pre_Calc (2 mult, 1 add) Decoder FIFO (lpm) Merger
Deconvolution Simplified implementation, almost for free – splits at minima in both directions (time and pad) off on
Merger Goals • spend few clock cycles per sequence • use few logic elements • high clockspeed
Cluster Finder Performance • Syntesized on Altera APEX • Uses 1800 Logic Elements (11%) • Memory usage 16*80 + 64*112= 8448 bits (4%) • Circuit runs at 33Mhz
Outlook Implementation of Hough transformation
Conclusion We have demonstrated the feasibility of a real time cluster finder implemented in an FPGA Firmware implementation of a Hough transform looks promising
TPC- Time Projection Chamber 18 sectors on each side, each sector is readout in 6 subsectors Total is ca. 570.000 pads