Scalable Packetized Correlators
Jason Manley, Aaron Parsons, Don Backer, Henry Chen, Terry Filiba, David MacMahon, Peter McMahon, Arash Parsa, Andrew Siemion, Dan Werthimer, Mel Wright
Outline
• What is a correlator?
• Scalable packetized correlators:
  • The architecture
  • The hardware
  • The software
  • The cost
• Closing thoughts
• Walk through actual design
• Questions and comments
[Slide: instruments named: MWA, XNTD, PAPER, FAST, PAST, LAR, LWA]
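Basic idea
[Figure: block diagram of a correlator for antenna signals Vi, Vj and Vk, built from delays (Z^-n), 90° phase shifts and accumulators (∑), producing the auto-correlations Vii, Vjj, Vkk and the cross-correlations Vij, Vik, Vjk; inset plots show amplitude against time.]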
“Actual” FX Correlator
[Figure: block diagram with a delay (Z^-n) and an FFT on each input, followed by accumulators (∑).]
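The FX idea can be sketched in a few lines: transform each antenna stream to the frequency domain (the F step), then multiply every pair of spectra, one of them conjugated, and accumulate per channel (the X step). This is a minimal illustration, not the CASPER design: the function names are hypothetical, a naive DFT stands in for the FFT so the sketch needs only the standard library, and the delay stage is left out.

```python
import cmath
from itertools import combinations_with_replacement


def dft(block):
    """Naive DFT of one block of samples (stand-in for the FFT stage)."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(block))
        for k in range(n)
    ]


def fx_correlate(streams, n_chan):
    """Correlate equal-length antenna streams using n_chan samples per spectrum.

    Returns {(i, j): [accumulated V_ij per channel]} for every pair i <= j,
    so auto-correlations (i == j) are included.
    """
    n_blocks = len(streams[0]) // n_chan
    pairs = list(combinations_with_replacement(range(len(streams)), 2))
    acc = {pair: [0j] * n_chan for pair in pairs}
    for b in range(n_blocks):
        # F stage: one spectrum per antenna for this block of samples.
        spectra = [dft(s[b * n_chan:(b + 1) * n_chan]) for s in streams]
        # X stage: conjugate-multiply every pair and accumulate per channel.
        for i, j in pairs:
            for k in range(n_chan):
                acc[(i, j)][k] += spectra[i][k] * spectra[j][k].conjugate()
    return acc
```

If the second stream is a copy of the first delayed by one sample, the phase of the accumulated cross-product in the tone's channel encodes that delay, while the auto-correlations stay real.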
Design Philosophy
• Standardized processing hardware
• Commercial interconnect
• Asynchronous compute engines
• Synchronization using common 1PPS
• UDP output delivery over Ethernet network
• Correlator scales with your array
F Engine Operations
• Two F engines per iBOB
• Dual polarization design
• Currently uses ASTRO library
• Currently processes data at native clock rate (< 200 MHz on iBOB or < 400 MHz on ROACH)
[Figure: F engine signal chain with blocks ADC, DDC, Channelize, Quantize, Reformat]
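The quantize and reformat steps at the end of the F engine can be illustrated as follows. This is only a sketch under assumptions the slides do not state: 4-bit signed samples, symmetric saturation, and real and imaginary parts packed into one byte. The names quantize, pack_complex and unpack_complex are hypothetical.

```python
def quantize(value, n_bits=4, scale=1.0):
    """Round value/scale to a signed n_bits integer with symmetric saturation.

    Uses Python's round(), which rounds exact halves to the nearest even integer.
    """
    limit = 2 ** (n_bits - 1) - 1  # 7 for the assumed 4-bit case
    return max(-limit, min(limit, round(value / scale)))


def pack_complex(sample, n_bits=4, scale=1.0):
    """Pack the quantized real part (high bits) and imaginary part (low bits)."""
    mask = (1 << n_bits) - 1
    re = quantize(sample.real, n_bits, scale)
    im = quantize(sample.imag, n_bits, scale)
    # Masking keeps the two's-complement bit pattern of negative values.
    return ((re & mask) << n_bits) | (im & mask)


def unpack_complex(word, n_bits=4):
    """Inverse of pack_complex: recover the two signed integers."""
    mask = (1 << n_bits) - 1

    def signed(v):
        # Values with the top bit set represent negative numbers.
        return v - (1 << n_bits) if v & (1 << (n_bits - 1)) else v

    return complex(signed((word >> n_bits) & mask), signed(word & mask))
```

Quantizing before transmission trades dynamic range for link bandwidth, so the scale factor has to be chosen so that typical channel values sit well inside the saturation limits.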
Setup and Control
• Clocks:
  • X engines each run off an independent clock
  • Sampling is synchronized at the F engines, but the clock is not distributed to the X engines
  • Synchronized using a global 1PPS signal at the ADCs
  • Propagated to the X engines using out-of-band signaling on XAUI links
  • Headers label the 10GbE packet data
• System control: separate 100 Mbps Ethernet network on BEE2
  • F engines configured from BEEs through XAUI links
  • Control packets: CASPER UDP framework on BEE2 control FPGA
  • Execute Python scripts for configuration, control and debugging
F engine development
• 2008:
  • Coarse delays (cable length compensation)
  • Fringe-stopping & fine delays
  • Walsh code generation and phase switching
  • Real sampling (low bandwidth)
  • Parallel streams (high bandwidth)
• Future:
  • Ability to output subset of band
  • Spectral zoom modes
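Walsh code generation and phase switching can be illustrated with the Sylvester construction of a Hadamard matrix, whose rows are mutually orthogonal sequences of +1 and -1. The sketch below assumes that construction and uses hypothetical function names; the slides do not say how the F engine generates its codes.

```python
def walsh_codes(order):
    """Return 2**order mutually orthogonal +/-1 codes, each 2**order long.

    Sylvester construction: rows come out in Hadamard order, not sequency order.
    """
    codes = [[1]]
    for _ in range(order):
        codes = ([row + row for row in codes]
                 + [row + [-x for x in row] for row in codes])
    return codes


def demodulate(samples, code):
    """Undo phase switching and average over one full code period."""
    return sum(s * c for s, c in zip(samples, code)) / len(code)


codes = walsh_codes(3)
# A constant signal of 2.0, phase-switched with code 3.
switched = [2.0 * c for c in codes[3]]
wanted = demodulate(switched, codes[3])   # demodulated with the matching code
leakage = demodulate(switched, codes[5])  # demodulated with a different code
```

Demodulating with the matching code recovers the signal, while any other code averages it to zero over a full period. That orthogonality is what lets phase switching suppress crosstalk between signal paths.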
X Engine Operations
• Using CASPER library
• Scales with 2^N antennas
• Fit as many X engines on an FPGA as possible (2 × 16-antenna X engines on a BEE2 user FPGA)
[Figure: X engine data path with blocks 10GbE, Buffer, X Eng, Accum]
Backend Software
• UDP packets received
• Currently received, parsed and saved in MIRIAD file format by a single computer
• Computing requirements depend on the experiment:
  • Usually a single computer is OK: 128 antennas, 1 s integrations, 2k channels = 512 MB/s
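The output data rate quoted above can be approximated with a short calculation. The slide does not say how the number was derived, so the parameters below are assumptions chosen for illustration: four polarization products per baseline (dual polarization), 8 bytes per complex visibility, and baselines counted with auto-correlations included.

```python
def n_baselines(n_ant):
    """Number of antenna pairs, counting each antenna paired with itself."""
    return n_ant * (n_ant + 1) // 2


def output_rate_bytes(n_ant, n_chan, t_int_s,
                      n_pol_products=4, bytes_per_vis=8):
    """Bytes per second of visibilities leaving the correlator.

    n_pol_products=4 and bytes_per_vis=8 are assumptions, not slide values.
    """
    per_integration = n_baselines(n_ant) * n_pol_products * n_chan * bytes_per_vis
    return per_integration / t_int_s


rate = output_rate_bytes(n_ant=128, n_chan=2048, t_int_s=1.0)
print(f"{rate / 2**20:.0f} MiB/s")
```

Under these assumptions the sketch gives 516 MiB/s. Approximating the baseline count as N²/2 instead gives exactly 512 MiB/s, which agrees with the figure on the slide.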
Pending systems
• Bench sys: 8 ant, DP, 200 MHz, 2k ch
• PAPER: 128 ant, DP, 100 MHz, 2k ch
• KAT-7: 8 ant, DP, 256 MHz, 2k ch
• meerKAT: 80 ant, DP, 1 GHz, 16k ch
• Bologna: 32 ant, SP, 32 MHz, 1k ch
• GMRT: 32 ant, DP, 400 MHz, 4k-8k ch
FPGA Roadmap
• Processing power doubling every two years
• V4 (Virtex-4) = ½ the power requirements of V2Pro (Virtex-II Pro)*
* Manufacturer's claim (Xilinx Inc.)
Coming soon…
• 10 Gbps output optionally gives integrations of ~10 ms
• More efficient use of hardware DSP slices
• High speed, scalable, distributed data capture software
• Walsh codes and phase switching
• Phase rotation
• 64 antenna design
• Upgrade to 4096 channels
• ROACH hardware:
  • < 400 MHz bandwidth
  • 16 384 channels
  • 128 antennas
  • No architectural changes
Questions and Comments
• Visit the CASPER correlator page: http://casper.berkeley.edu/wiki/index.php?title=Correlator
• Add your own requirements: http://casper.berkeley.edu/wiki/index.php?title=International_Correlator_Collaboration
• Email me: jason_manley@hotmail.com
Current uses: Pocket Spectrometer
• Using ATMEL ADCs at 2 Gsamples/sec
• Performing 4 real FFTs in 1 (complex) biplex pipelined FFT module
• 2048 channels
• Uses just 1 ADC, 1 iBOB, and your laptop
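One standard way to get more than one real transform out of a complex FFT is to load two real sequences into the real and imaginary parts of a single complex input and separate the results using conjugate symmetry. The sketch below shows that trick for two sequences. Applying it to each of the two streams of a biplex FFT would give four real transforms, but the slides do not state that the pocket spectrometer works exactly this way. A naive DFT stands in for the FFT, and the function names are hypothetical.

```python
import cmath


def dft(block):
    """Naive DFT (stand-in for the pipelined FFT)."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(block))
        for k in range(n)
    ]


def two_real_dfts(a, b):
    """DFTs of two equal-length real sequences using one complex transform."""
    n = len(a)
    z = dft([complex(p, q) for p, q in zip(a, b)])
    spec_a, spec_b = [], []
    for k in range(n):
        zk = z[k]
        zc = z[-k % n].conjugate()  # conj(Z[N-k]), with index 0 mapping to itself
        # For real inputs, A[k] = conj(A[N-k]) and B[k] = conj(B[N-k]).
        spec_a.append((zk + zc) / 2)
        spec_b.append((zk - zc) / 2j)
    return spec_a, spec_b
```

The separation costs only a few additions per channel, so the transform hardware does nearly twice the useful work for real-valued inputs.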