350 likes | 433 Views
Tiny Triplet Finder. Jinyuan Wu, Z. Shi Dec. 2003. Hit data come out of the detector planes in random order. Hit data from 3 planes generated by same particle tracks are organized together to form triplets. Hits, Hit Data & Triplets.
E N D
Tiny Triplet Finder Jinyuan Wu, Z. Shi Dec. 2003
Hit data come out of the detector planes in random order. • Hit data from 3 planes generated by same particle tracks are organized together to form triplets. Hits, Hit Data & Triplets
Three data items must satisfy the condition: xA+ xC = 2 xB. • A total of N3 combinations must be checked (e.g. 5x5x5=125). • Three layers of loops if the process is implemented in software. • Large silicon resource may be needed without careful planning. Triplet Finding Plane A Plane B Plane C
Tiny Triplet Finder (Animation) x3 (1) Fill in hits to x1 and x3 planes. Implement x1 and x3, the plane x2 is for eye-guide only. (2) Cycle through x2 hits. x2 Line matched. x1
Tiny Triplet Finder Bit-wise Logic Block Bit Array/Shifter A Bit Array/Shifter C Hash Sorter A Hash Sorter C
Bit Array/Shifters TTF OperationsPhase I: Filling Bit Arrays Note: Flipped Bit Order • xA+ xC = 2 xB • xA= - xC + constant Physical Planes Fill a corresponding logic cell. For any hit…
Bit Array/Shifters TTF Operations Phase II: Making Match Triplet is found. Logically shift the bit array. Perform bit-wise AND in this range. Physical Planes For any center plane hit…
Bit Array/Shifters TTF Operations Phase II: Making Match (Next Hit) Triplet is found through bit-wise AND. Logically shift the bit array. Fake triplet may exist. It will be cut out in the later stages. Physical Planes Loop to next center plane hit…
Bit Array/Shifters TTF OperationsPhase II: Making Match (More…) Logically shift the bit array. Triplet is found through bit-wise AND. Physical Planes Loop to next center plane hit…
Bit Array/Shifters TTF Operations Phase II: Making Match (and More…) Logically shift the bit array. Triplet is found through bit-wise AND. Physical Planes Loop to next center plane hit…
Bit Array/Shifters TTF Operations Phase II: Making Match (Last Hit) Logically shift the bit array. Physical Planes Triplet is found through bit-wise AND. Loop to next center plane hit…
Multiple Hits & TripletsKeep Them All • Each bin may be filled with more than one hits. • Hits data are kept in hash sorters – allowing multiple hits per bin. • There are may be more than one match in each bit-wise AND operation. • They are all sent out, one-by-one, to the later stages for fine cut and arbitration processes.
Boundary IssuesBeyond Just Bit-wise AND • When the track hits near the boundary of a bin, simple bit-wise AND may miss the triplet. • The bit-wise OR-AND logic will cover the boundary. • The logic cells in most of today’s FPGA’s have 4 inputs. So the OR-AND bit-wise logic doesn’t increase any resource usage.
FIFO Interface Hit Feeder C FIFO Interface Hit Feeder A Hash2blk Hash2blk D D QC QA DI DI QQ QQ K K KA KC Push Push RDY RDY EN EN Push Push Pop Pop WT WT RePop RePop MD MD SR SR Bit64A Bit64C KS KS WR WR Q Q EN EN SR SR FIFO Interface Hit Feeder B D SA Cut and Arbitration DA QB DB QT EN XB DC RDY SC EN Chk Shift Reg. 6 steps D Q4 EN Tiny Triplet Finder: Block Diagram 77 LUT 66 FF DA 172 LUT FF 192 LUT 111 FF 92 LUT FF A BitLogic KA XB C KC EN Pop 77 LUT 66 FF Halt
Simulation (1) Filling hits on Plane A and C. (2) Looping over hits on Plane B. The extra combination halts the pipeline for proper operation. This combination is a fake triplet. Pipelined internal operations… Triplets are grouped together.
Logic Cell Usage • Both 64- and 128-bit TTF designs fit $100 FPGA comfortably. • A simple 64-bit Hough transform design is shown for scale. • A $1200 FPGA is shown for scale.
Use 4 bit arrays. • There are 3 constraints total. • More constraints help to eliminating fake tracks. • It is possible to use bit-wise majority logic (such as 3-out-of-4) to accommodate detector inefficiency issues. Pentlet FindingBeyond Just Bit-wise AND Plane A Plane B Plane C Plane D Plane E
Comparison of Baseline and Tiny Triplet Finder:Data Flow Station N-1 bend Station N-1 bend Station N bend FIFO FIFO AM: Long Doublet Station N bend Station N+1 bend Station N+1 bend FIFO FIFO FIFO FIFO FIFO AM: Triplet FTF: Triplet Hash Sorter Hash Sorter Station N-1 Non-bend Station N-1 Non-bend FIFO FIFO FIFO FIFO AM: Short Doublet Hash Sorter: Short Doublet Station N Non-bend Station N Non-bend FIFO FIFO FIFO FIFO AM: Short Doublet Hash Sorter: Short Doublet Station N+1 Non-bend Station N+1 Non-bend FIFO FIFO FIFO FIFO AM: Short Doublet Hash Sorter: Short Doublet
The End Thanks
Triplet Feeder FIFO Interface Hit Feeder A’ Hash2blk D D SA QA DI QQ K QT KA Push RDY EN EN Push Push Pop WT RePop MD SR Cut, Arbitration and Combine DA DT QT DC RDY EN Shift Reg. 2 steps D QQ EN Short Doublet Stage: 1 of 3 Identical Ones y3y (8+3bits) x3y (7bits) y2y (8+3bits) x2y (7bits) 77 LUT 66 FF y1y (8+3bits) x1y (7bits) HitMap E P StnID H Triplets In Station N-1 Non-bend FIFO FIFO Hash Sorter: Short Doublet Triplets Out y3y (8+3bits) x3y (7bits) y2y (8+3bits) x2y (7bits) y1y (8+3bits) Dy1(400u) x1x (8+3bits) Dx1(400u) HitMap E P StnID H
CB10E RAM 512x18 RAM 256x36 Q D D O O CE D D O O WE WE EN EN A A A3 A2 COMP A A1 A1 EQ A0 B S0,S1 A0 S0 CB8RE Q CE FD FD FD FD FD FD D D Q Q D Q D Q D D Q Q SR EN EN EN Hash Sorter: Vertex-II Implementation POP POPQ RDY REPOP WT EVeq MD EV(9:0) IdK(7:0) IdKQ(7:0) EV(8:0) K(8:0) DOB(27:0) DI(27:0) DIQ(27:0) IdN(7:0) PUSH PUSHQ DUMPD IdP(7:0) IdQ(7:0) SR Vertex II Implementation CLK
CB9E RAM 256x16 RAM 128x36 Q D D O O CE D D O O WE WE EN EN A A A3 A2 COMP A A1 A1 EQ A0 B S0,S1 A0 S0 CB7RE Q CE FD FD FD FD FD FD D D Q Q D Q D Q D D Q Q SR EN EN EN Hash Sorter: Cyclone Implementation POP POPQ RDY REPOP WT EVeq MD EV(8:0) IdK(6:0) IdKQ(6:0) EV(7:0) K(7:0) DOB(28:0) DI(28:0) DIQ(28:0) IdN(6:0) PUSH PUSHQ DUMPD IdP(6:0) IdQ(6:0) SR Cyclone Implementation CLK
D D D D D D D D Q Q Q Q Q Q Q Q WE WE WE WE WE WE WE WE A(3:0) A(3:0) A(3:0) A(3:0) A(3:0) A(3:0) A(3:0) A(3:0) Bit Array/shifterXILINX Implementation • Each Bit Array/Shifter has 64 bins. • Each bin is a 16-bit RAM (look-up table). • The RAM operates like a D Flip-flop. • Each xc2v1000 FPGA has 10,240 such RAM’s.
Bit Array/shifterFilling the Hits • Each hit is written into 8 to 16 different RAM’s with different addresses. • One clock cycle (not 8 to 16) is used to fill a hit.
Bit Array/shifterShift Reading • With given addresses of the RAM’s, hit patterns appear at the outputs of the arrays. • Changing addresses shifts the hit patterns. • One array shifts in unit of 1 bin, the other in unit of 8 bins. • The relative shift of the two patterns covers 128 bins. • One clock cycle is used for a reading, regardless the distance of the relative shifting.
D D D D D D D D Q Q Q Q Q Q Q Q WE WE WE WE WE WE WE WE A(3:0) A(3:0) A(3:0) A(3:0) A(3:0) A(3:0) A(3:0) A(3:0) Bit Array/shifterComments: Xilinx or Not • Writing/reading operations need only single clock cycle. • Storage and shifter are combined — big resource saving. • Maximum 16 clock cycles are needed for resetting. • Non-xilinx implementation: Additional 64x8=512 LE’s may be needed. But resetting needs only one clock cycle.
Challenge on Triplet Finding (1)The Facts in “Road” Language • A triplet has two parameters. • With two hits in two planes, two parameters can be determined. The 3rd hit in the 3rd plane will provide the first constraint. • Assume each parameter is sliced into 64 bins. (That’s not too many.) • There are 64 x 64 = 4096 “roads”, i.e., possible track configurations. • Any hit can belong to 64 possible roads. Two hits in two planes have one common road. Three hits must have one common road to be in a triplet.
CAM CAM CAM D( ) D( ) D( ) Match Match Match WE WE WE A A A Challenge on Triplet Finding (2)A Possible Implementation Using CAM • It is possible to use (static) contents addressable memory (CAM) to implement triplet finding. • Hit patterns are pre-stored in CAM. • When a hit pattern match the pre-stored one, the address represent the road. • Each of 3 planes has 64 possible hit locations. (64x64=4096 roads) • Assume one road can generate one hit pattern, a CAM array with 64x3=192 input bits, 4096 words is needed. • It can be done with 128 small CAM’s (32-bit 32-words). EPC20K1000E has 160. (80%) • Oops: boundary issues have not be considered. (More patterns ) Plane A Plane B Plane C Road
Challenge on Triplet Finding (3)A Possible Implementation Using Hough Transformation • Book a 2-D (64x64) histogram. Each bin of the histogram represent a road. • Each hit from plane A, C or B increments the counts of the 64 bins in a row, column or diagonal. • The bins with count 3 are valid triplet roads. • Each histogram bin is a counter plus selection logic. • Assume each bin can be implemented with 4 Altera logic elements (LE’s). • 4096 bins need 16,384 LE’s, EPC20K1000E has 38,400. (16,384/38,400 = 43%) • However: How to find the bins with count 3 and encode them is not trivial.
Silicon Usage (Estimate) • The Tiny Triplet Finder (TTF) silicon usage is acceptable. • The hash sorter silicon usage is acceptable. • The Baseline, if not implemented automatically, could reduce the size a lot.
RAM D CC8RLE Q D O L WE EN CE SR A CB8SE S1 SS RAM CE D O Q FD FD D D Q Q D O CE CE WE EN A A0 A1 A2 A3 S NBin DOB DI IdNext
3 3 3 7 7 15 2 2 2 14 6 6 13 1 1 1 12 5 5 11 0 0 0 10 4 4 9 0 0 0 8 3 3 7 1 1 1 6 2 2 5 2 2 2 4 1 1 3 3 3 3 2 0 0 1 0
4 4 4 4 4 0 4 1 4 2 4 5 5 5 5 5 5 5 5 0 1 6 6 6 6 6 6 6 6 0 7 7 7 7 7 7 7 7 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 0 1 2 3 4 5 6 1 1 1 1 1 1 1 1 0 1 2 3 4 5 2 2 2 2 2 2 2 2 0 1 2 3 4 3 3 3 3 3 3 3 3 0 1 2 3 4’ 4’ 4’ 4’ 4’ 4’ 4’ 4’
3 4 4 4 5 4 6 4 7 4 4 4 4 5 5 5 5 5 5 5 5 2 3 4 5 6 7 6 6 6 6 6 6 6 6 1 2 3 4 5 6 7 7 7 7 7 7 7 7 7 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 7 1 1 1 1 1 1 1 1 6 7 2 2 2 2 2 2 2 2 5 6 7 3 3 3 3 3 3 3 3 4 5 6 7 4’ 4’ 4’ 4’ 4’ 4’ 4’ 4’