Uneven Bin Width Digitization and a Timing Calibration Method Using Cascaded PLL. Wu, Jinyuan Fermilab May. 2014. Introduction. Digitization with uneven bins is needed in FPGA based TDC . The differential nonlinearity is acceptable in many cases.
Uneven Bin Width Digitization and a Timing Calibration Method Using Cascaded PLL Wu, Jinyuan Fermilab May. 2014
Introduction • Digitization with uneven bins is needed in FPGA based TDC. • The differential nonlinearity is acceptable in many cases. • A value called equivalent bin width is defined. • A scheme of generating calibration pulses with cascaded PLL circuits is presented. • The same scheme can be used for clock phase measurement. Uneven Bin Width Digitization & Cascade PLL
TDC Using FPGA Logic Chain Delay
• Convenient.
• Low cost.
• But the bin widths are not uniform.
[Figure labels: IN, CLK]
Delay Line Speed vs. Core Voltage
[Figure: 16 patterns @ 400 MHz, shown for VCCINT = 1.20 V and VCCINT = 1.18 V]
Adjusting Bin Widths?
• Compensation: adjusting the bin width to a certain value.
  • Slowing down the delay chain?
• Linearization: fine-tuning the width of each bin.
  • Cost?
Nonlinearity = Something Bad?
• Nonlinear scales are commonly used.
• Sometimes the markers can be in arbitrary (but known) positions, such as in the solar spectrum.
[Figures: solar spectrum images. Credit: Association of Universities for Research in Astronomy Inc. (AURA); image file: solar-spectrum-from-www-mao-kiev-ua--sol_ukr--terskol--bmv_m]
The Equivalent Width
[Figure labels: bins 0, 1, 2, 3, 4, 5, 6, ..., n-1, n, n+1; bin widths w0, w1, w2, w3, w4; W; weq]
• Digitizers with non-uniform bin widths can make precise measurements as long as they are calibrated appropriately.
• An equivalent bin width can be defined as in the figure.
• The calibration can be done offline and/or online.
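The defining figure is not reproduced in this transcript. The following is a minimal sketch, assuming the definition in which a hit lands in bin i with probability w_i/W and carries a quantization variance of w_i^2/12, so that weq = sqrt(sum(w_i^3)/W). The function name and the sample widths are illustrative only.

```python
import math

def equivalent_bin_width(widths_ps):
    """Equivalent bin width under the ASSUMED definition
    w_eq = sqrt(sum(w_i**3) / W), with W = sum(w_i)."""
    total = sum(widths_ps)
    return math.sqrt(sum(w ** 3 for w in widths_ps) / total)

# Uniform bins: the equivalent width equals the common bin width.
uniform = equivalent_bin_width([50, 50, 50, 50])

# Uneven bins with the same average width (100 ps): wide bins dominate.
uneven = equivalent_bin_width([160, 40])

print(f"uniform: {uniform:.1f} ps, uneven: {uneven:.1f} ps")
```

Under this assumption a single ultra-wide bin raises the equivalent width well above the average bin width, which is why wide bins matter more than their number suggests.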
Auto Calibration: Histogram Booking
• In the auto calibration process, a bin width histogram (DNL histogram) is booked first.
• More counts are accumulated in wider bins.
[Figure labels: 16-32K events, DNL histogram, Σ, LUT, In (bin), Out (ps)]
Auto Calibration: Summing Lookup Table
• Bin widths are summed up into the calibration lookup table.
• Note that the values represent the times of the bin centers.
[Figure labels: DNL histogram, Σ, LUT, In (bin), Out (ps)]
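The two auto calibration steps (histogram booking, then summing into a lookup table) can be sketched as follows. This is a minimal illustration, not the firmware implementation: it assumes the calibration hits are spread evenly over one clock period, so that each bin's count is proportional to its width. The function names and the tiny 3-bin example are hypothetical.

```python
def book_dnl_histogram(bin_hits, n_bins):
    """Count how many calibration hits landed in each TDC bin."""
    counts = [0] * n_bins
    for b in bin_hits:
        counts[b] += 1
    return counts

def build_lut(counts, clock_period_ps):
    """Sum bin widths into a lookup table of bin-center times (ps)."""
    total = sum(counts)
    lut = []
    edge = 0.0  # time of the lower edge of the current bin
    for c in counts:
        width = c * clock_period_ps / total  # wider bins collect more counts
        lut.append(edge + width / 2)         # store the center of the bin
        edge += width
    return lut

# Hypothetical example: 3 bins, 4 hits, one 4000 ps clock period (250 MHz).
counts = book_dnl_histogram([0, 1, 1, 2], n_bins=3)
lut = build_lut(counts, clock_period_ps=4000.0)
print(counts, lut)
```

In use, a raw bin number from the encoder indexes `lut` to obtain the calibrated fine time in picoseconds.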
Calibration Pulse Generation: Random != Uniform
• When the number of events is finite, random hits have large fluctuations.
• Pulses with evenly spread timing relative to the TDC clock are desirable.
[Figure label: 16384 events]
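A small simulation illustrates the point. It is a sketch under stated assumptions: the uneven bin widths are hypothetical, the random hits come from a seeded pseudo-random generator, and the evenly spread hits step through the 4000 ps period in 4096 equal steps, repeated 4 times to reach 16384 events.

```python
import bisect
import random

PERIOD_PS = 4000.0
N_EVENTS = 16384
# Hypothetical uneven bin widths (ps); they sum to one 4000 ps clock period.
WIDTHS = [160.0, 40.0, 60.0, 20.0, 120.0] * 10

# Upper edge of each bin.
EDGES = []
acc = 0.0
for w in WIDTHS:
    acc += w
    EDGES.append(acc)

def max_deviation(hit_times):
    """Largest |observed - expected| bin count over all bins."""
    counts = [0] * len(WIDTHS)
    for t in hit_times:
        i = min(bisect.bisect_right(EDGES, t), len(WIDTHS) - 1)
        counts[i] += 1
    return max(abs(c - N_EVENTS * w / PERIOD_PS)
               for c, w in zip(counts, WIDTHS))

rng = random.Random(2014)  # fixed seed so the run is repeatable
random_hits = [rng.uniform(0.0, PERIOD_PS) for _ in range(N_EVENTS)]
even_hits = [(k % 4096) * PERIOD_PS / 4096 for k in range(N_EVENTS)]

dev_random = max_deviation(random_hits)
dev_even = max_deviation(even_hits)
print(f"random: {dev_random:.1f} counts, even: {dev_even:.1f} counts")
```

With evenly spread hits each bin count stays within a few counts of the value expected from its width, while random hits fluctuate statistically.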
Cascaded PLL Circuits
• Two stages of PLL circuits are cascaded together.
• f(CK250a) = 250 MHz
• f(CK251c) = 250.06 MHz
• f(CK251c) = (4096/4095)*f(CK250a)
• T(CK250a) - T(CK251c) = 4000 ps / 4096 ≈ 0.977 ps.
[Figure labels: CLOCK_50, CK250a, CK251c]
Phase Differences
• The relative timing differences between CK250a and CK251c cover the entire 4000 ps range in 4096 cycles.
• The 2^N number 4096 is chosen for easy implementation of the calibration sequencing functional block.
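The arithmetic behind the cascaded PLL scan can be checked with exact fractions. This is a sketch, assuming ideal clocks whose edges coincide at cycle 0; the constant and function names are illustrative.

```python
from fractions import Fraction

T_CK250A = Fraction(4000)                    # ps, period of the 250 MHz clock
T_CK251C = T_CK250A * Fraction(4095, 4096)   # period when f = (4096/4095) * 250 MHz
STEP = T_CK250A - T_CK251C                   # phase slip per cycle, in ps

def phase_of_edge(k):
    """Phase (ps) of the k-th CK251c edge relative to the preceding CK250a edge."""
    return (k * T_CK251C) % T_CK250A

# One full scan: 4096 consecutive CK251c edges.
phases = sorted(phase_of_edge(k) for k in range(4096))

print(float(STEP))        # phase step in ps
print(len(set(phases)))   # number of distinct phases visited
```

The 4096 edges visit 4096 distinct phases spaced one step apart, and the pattern repeats after 4096 cycles, which is what makes a power-of-two sequencer convenient.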
Test Result in an Oscilloscope Screen Capture
• A total of 16384 calibration edges are collected.
• The entire 4000 ps range is scanned 4 times (4*4096 = 16384).
• The histogram (with 50 ps/bin) serves as a demonstration of the calibration lookup table.
[Figure labels: Trigger Edges by CK250a; Calibration Edges by CK251c; Calibration Lookup Table]
Clock Phase Measurement, Another Application
• Two clocks from the same source but with different phases are multiplied in PLLs.
• The CK251c clock scans the entire 4000 ps range, and the correctness of the captures by the DFFs driven by the two clocks is checked.
[Figure labels: Cascaded PLL Circuits (CK50a, CK250a, CK251c); Cascaded PLL Circuits (CK50b, CK250b); two D-Q flip-flop pairs (CK251c with CK250a, CK251c with CK250b), each followed by "Correctly Captured?"]
Oscilloscope Screen Capture
• The phase difference between CK250a and CK250b can be measured after CK251c scans through.
• The 0-1 and 1-0 transitions have different setup times.
[Figure labels: Captured Correctly by CK250a; Captured Correctly by CK250b; Captured 0-1 Trans. by CK250b; Captured 1-0 Trans. by CK250b; 4 ns/step => 0.97 ps]
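One way to turn the scan into a phase number is sketched below. It assumes that each scan step advances CK251c by 4000/4096 ps and that the step index at which the captured value flips has already been found for each clock. The function name, the sign convention and the example indices are hypothetical.

```python
SCAN_STEPS = 4096
PERIOD_PS = 4000.0
STEP_PS = PERIOD_PS / SCAN_STEPS   # about 0.977 ps per scan step

def phase_difference_ps(flip_step_a, flip_step_b):
    """Phase of clock B relative to clock A, from the scan steps at which
    the capture result flips. Compare like transitions only (0-1 with 0-1),
    since the two transition types have different setup times."""
    return ((flip_step_b - flip_step_a) % SCAN_STEPS) * STEP_PS

# Hypothetical flip positions found in one scan.
print(phase_difference_ps(100, 1124))   # 1024 steps apart
print(phase_difference_ps(4000, 10))    # wraps around the end of the scan
```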
The End Thanks
Good, However
• Auto calibration solved some problems.
• However, it won’t eliminate the ultra-wide bins.
Concern: Dead Time?
Wave Union Launcher A
• A regular TDC records only one transition.
• A Wave Union TDC records multiple transitions.
[Figure labels: Wave Union Launcher A; In; CLK; 0: Hold; 1: Unleash]
Wave Union Launcher A: 2 Measurements/hit
[Figure label: 1: Unleash]
Sub-dividing Ultra-wide Bins
Device: EP2C8T144C6
• Plain TDC:
  • Max. bin width: 160 ps.
  • Equivalent bin width: 60 ps.
• Wave Union TDC A:
  • Max. bin width: 65 ps.
  • Equivalent bin width: 30 ps.
[Figure labels: 1: Unleash; 1, 2]
Time Measurement Errors Due to Power Supply Noise
• Typical RMS resolution is 25-30 ps.
• Measurements with cleaner power (diamonds) are better than those with noisy power (squares).
[Figure labels: Switching Power Supply; Linear Power Supply]
Pipeline Structure of TDC Time Sensing Block
• The front end of the TDC is designed with a pipeline structure.
• There is nearly no dead time in this section.
• A hit can be digitized every clock cycle (@250 MHz).
• However, we introduce some dead time by using a slower clock to save power.
[Figure labels: In; CLK; ENA; Hit Detect Logic; Data Ready; Coarse Time Counter; Coarse Time; Fine Time Encoder; Fine Time]
Concern: Low-power?
Low-Power Design Practice: Clock Speed
• The Sampling Register Arrays are clocked at 250 MHz.
• All other stages are clocked at 62.5 MHz.
• When a valid hit is sampled, the Sampling Register Array is disabled so that the registered pattern is stable for 64 ns.
• The Data Load/Transfer Registers are enabled to load input every 64 ns, so that a valid hit is guaranteed to be loaded once and only once.
[Figure labels: IN0; Delay Line & Sampling Register Array; Data Load/Transfer Register; Encoder; Buffer w/ Zero Suppression; CK250; CK62; Load; Clock Disable; Sequencer; 250 MHz; 62.5 MHz]
Low-Power Design Practice: Resource Sharing
• The Data Load/Transfer Registers are enabled to load input every 64 ns (i.e., 4 clock cycles at 62.5 MHz).
• The Data Load/Transfer Registers transfer data from other channels when they are not enabled to load.
• Four channels share an Encoder and a Buffer with Zero Suppression.
[Figure labels: IN0, IN1, IN2, IN3; Delay Line & Sampling Register Array; Data Load/Transfer Register; Data Merging Register; Encoder; Buffer w/ Zero Suppression; CK250; CK62; Load; Clock Disable; Sequencer; 250 MHz; 62.5 MHz]
Low-Power Design Practice: Wave Union
• Intrinsically, the Wave Union TDC is a low-power scheme.
• Multiple measurements are made with one set of delay line, register, encoder, etc., yielding a finer resolution that would otherwise need several regular TDC blocks to achieve.
Concern: Data Packing?
Data Packaging: Block Diagram
• For each straw, 2 TDCs and 1 ADC are implemented.
• Time and charge data are grouped and sent out together.
[Figure labels: 1 Straw: 2 TDC + 1 ADC (three instances); Carry Chain, Reg. Array, Encoder (two chains); ADC; Buffer & Data Packing; Data Output Buffer; Parallel-to-Serial Converter; Data]
Data Packing: A Real Design for a Similar Project
• TDC and ADC data packaging for OpenPET of LBL.
Data Bit Layout
[Figure: 32-bit words, bits 31 down to 0]
Header & Count word: Hit Count | Header
Hit (layout shown twice in the figure):
  0 0 0 1 | CH: 0-15 | TDC Coarse Time: LSB = 4 ns | TDC Fine Time: LSB = 15.625 ps
  0 0 0 1 | TDC Coarse Time: LSB = 4 ns | TDC Fine Time: LSB = 15.625 ps
  ADC 1 | ADC 0
  ADC 3 | ADC 2
  ADC 5 | ADC 4
  ADC 7 | ADC 6
  ADC 9 | ADC 8
  ADC 11 | ADC 10
• Data layout for full ADC resolution.
• This scheme uses 256 bits/hit.
• There could be other layouts with 128 bits/hit.
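The two LSB values in the layout fix how a coarse/fine pair maps to a time. The sketch below assumes the fine time is simply added to the coarse time; the actual sign convention and field widths are not stated in the slide, and the function name is illustrative.

```python
COARSE_LSB_PS = 4000.0   # TDC coarse time LSB = 4 ns
FINE_LSB_PS = 15.625     # TDC fine time LSB = 15.625 ps

# Number of fine steps spanning one coarse step: 4 ns / 15.625 ps.
FINE_STEPS_PER_COARSE = int(COARSE_LSB_PS / FINE_LSB_PS)

def hit_time_ps(coarse, fine):
    """Hit time in ps from coarse and fine counts (ASSUMES fine adds to coarse)."""
    return coarse * COARSE_LSB_PS + fine * FINE_LSB_PS

print(FINE_STEPS_PER_COARSE)   # fine steps per 4 ns coarse step
print(hit_time_ps(2, 64))      # 2 coarse counts plus 64 fine counts, in ps
```

Since 4 ns / 15.625 ps = 256, an 8-bit fine time field is enough to span one coarse step.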
Connection Between Digitizer and ROC
• Clock and frame signals are provided along with the data links.
• Data links run at 200 Mb/s.
[Figure labels: Digitizer; Clock Generator; Frame Generator; TX; ROC; RX; Clock; Frame; Data]
Data Rate: Is 200 Mb/s Enough?
• Assumptions:
  • 1695 ns micro-bunch length.
  • 900 ns data taking window.
  • 1 LVDS data output pair for every 4 straws.
• The 300 kHz hit rate in the TDR is likely an overestimate.
• As long as the actual hit rate is < 200 kHz, a data link of 200 Mb/s per LVDS pair should be sufficient.
Test Results
The Test Hardware www.altera.com
Test Setup TDC Module NIM to LVDS Converter
Output Raw Data and Typical Delta T Histogram Between Two Channels
Raw data: 00003C C064A6 F064B8 C07CA4 F07CB4 C094A0 F094B0 C0AC9C F0ACAC C0C497 F0C4A8 C0DC91 F0DCA2
• The RMS of this histogram is 25 ps.
Delta T Between NIM Inputs
• TDC channels internally ganged together have the smallest standard deviation of time differences.
• Typical channel pairs sharing the same fan-out unit have 30 ps RMS.
• Timing jitter of the fan-out units adds to the measurement errors.
[Figure labels: FPGA; Pulse Gen.; LeCroy 429A NIM Fan-out (two units); NIM to LVDS (two units); TDC channels; A, B, C]
Measurement Precisions
Analyzed by Woon-Seng Choong, LBNL
Performance Degrading in CPU/GPU, ASIC & FPGA
• Imperfect designs considerably degrade the performance of ICs, including CPUs/GPUs.
• ASIC devices are built using older technology and suffer similar design degrading.
• The FPGA internal structure causes extra performance degrading in addition to design degrading.
• Design modification in an FPGA is easier, so design degrading can be minimized.
• A carefully designed FPGA may have better performance than a typical ASIC.
[Figure labels: Performance; CPU/GPU; ASIC; FPGA; Theoretical limit of current technology; Theoretical limit of older technology; Degrading Due to Design; Degrading Due to Structure]
Specifications
Test Result: NIM Inputs
• BNC adapters to add delays @ 140 ps step.
[Figure labels: RMS 10 ps; 140 ps; 0, 1, 2; LeCroy 429A NIM Fan-out; NIM/LVDS; Wave Union TDC B; +, -]
Other Applications: Single Slope ADC
[Figure labels: FPGA; TDC; VIN1+, VIN1-; VIN2+, VIN2-; VREF+, VREF-; R; R1; 4xR2; C]
If You Want to Try
[Figures: DK-START-3C25N Cyclone III FPGA Starter Kit, $211 (www.altera.com); THDB-H2G (HSMC to GPIO Daughter Board), $50 (www.altera.com)]
• The FPGA on the Starter Kit is fairly powerful.
• More than 16 pairs of LVDS I/O can be accessed via the daughter card.
• The FPGA can fit 32 channels, but implementing 16 channels is more practical given the I/O pairs.
• TDC data are stored in the RAM on the board and can be read out via USB.
• A good solution for small experiment systems as well as student labs.
Timing Uncertainty Confinement
Historical Implementation in ASIC TDC
HIT is used as the CK input, which creates unnecessary challenges.
Unnecessary Challenges = Extra Efforts + Reduced Performance
• Dead time is unavoidable.
• Coarse time recording needs special care.
• Two array + encoder sets are needed for the rising edge and the falling edge.
• The register array must be reset for the next event.
• The encoder must be re-synchronized with the system clock in order to interface with the readout stage.
[Figure labels: DLL; Clock Chain; HIT; Encoder; Coarse Time Counter; c0, c1; Coarse Time Register; Coarse Time Selection Logic]
Unnecessary Challenges Unnecessary for FPGA TDC
• Historically, Gray code counters, double counters, and dual registers + MUX are found in ASIC TDC coarse time counter schemes.
• These are unnecessary if the TDC is designed appropriately.
• In an FPGA, a plain binary counter is sufficient.
[Figure labels: Gray Code Counter (000 001 011 010 110 111 101 100); Coarse Time Counter (three instances)]
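For readers unfamiliar with the Gray code sequence shown in the figure, the sketch below generates it from a plain binary count using the standard binary-to-Gray conversion. It only illustrates what the ASIC schemes used; the slide's point is that an appropriately designed FPGA TDC does not need it. The function name is illustrative.

```python
def binary_to_gray(n):
    """Standard binary-to-Gray conversion: adjacent values differ in one bit."""
    return n ^ (n >> 1)

# 3-bit sequence, matching the one listed in the figure.
codes = [format(binary_to_gray(n), "03b") for n in range(8)]
print(" ".join(codes))
```

Because only one bit changes per count, sampling a Gray code counter with an asynchronous signal can be off by at most one count, which is why it appears in designs where HIT clocks the coarse time register.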
A Better Implementation
HIT is used as the D input.
• Deadtimeless operation is possible.
• No special care is needed for coarse time.
• Both rising and falling edges are digitized with a single array + encoder set.
• No resetting is needed for the register array.
• The output is synchronized with the system clock and is ready to interface with the readout stage.
[Figure labels: DLL; Clock Chain; HIT; Multi-Sampling Register Array; Clock Domain Transfer; 16-bit Encoder with Registered Outputs (two instances); Coarse Time Counter; OR + Register; DV; EG; T4..T0; TC]
• The timing uncertainty between HIT and CLK is confined to the sampling register array.
• All the remaining logic is driven by the CLK signal.
• No special care, such as a Gray code counter, is needed for the coarse time counter.
[Figure labels: HIT; CLK; ENA; Hit Detect Logic; Data Valid; Coarse Time Counter; Coarse Time; Fine Time Encoder; Fine Time]
Comparison