ATCA GPU Correlator

Explore GPU replacement in ATCA correlator for improved flexibility, reliability, and support. Utilize ASKAP components for filterbank and cross-correlation. Upgrade samplers and switch to enhance data processing efficiency.


Presentation Transcript


  1. ATCA GPU Correlator: Strawman Design
     Chris Phillips | LBA Lead Scientist
     17 November 2015
     Astronomy and space science

  2. ATCA
     • Current CABB backend
     • 6 antennas
     • 8 GHz analog IF, dual pol
     • 2x2 GHz IF, dual pol
     • 11 bit samplers, FPGA backend
     • 2048 channel/1 MHz continuum mode
     • Various spectral line modes with “Zoom Bands”
     • E.g. 4 MHz/2048 channels, 64 MHz/2048 channels
     • Many designed modes never implemented
     • Hardware unreliable and difficult to change modes
     • No full bandwidth tied array mode
     DIFX 2015 – Chris Phillips

  3. GPU replacement?
     • Could we improve flexibility, reliability and long-term support with a GPU backend?
     • New samplers, which talk directly to optical fibre
     • Use ASKAP “redback” boards for coarse filterbank (128 MHz) and Ethernet packetisation (10 GbE)
     • Fine filterbank and cross correlation in GPU; Ethernet cross connect

  4. DIFX 2015 – Chris Phillips

  5. Samplers
     • 12 bit Texas Instruments ADC12J4000
     • Transport: JESD204B
     • No FPGA required at sampler, optical interface
     • 8 lanes @ 8 Gb/s
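The lane figures on the sampler slide can be sanity-checked with a rough link budget. This is only a sketch: the 4 GS/s sample rate (Nyquist sampling of a 2 GHz band) and the 8b/10b line-coding efficiency are assumptions, not numbers given on the slides, and framing overhead is ignored.

```python
# Rough sampler link budget (a sketch under stated assumptions).
SAMPLE_RATE_HZ = 4_000_000_000   # ASSUMED: Nyquist sampling of a 2 GHz band
BITS_PER_SAMPLE = 12             # from the slide
LANES = 8                        # from the slide
LANE_RATE_BPS = 8_000_000_000    # from the slide

# Raw sample stream out of the ADC.
raw_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE

# ASSUMED 8b/10b line coding: 8 payload bits per 10 line bits.
payload_bps = LANES * LANE_RATE_BPS * 8 // 10

print(f"raw sample stream : {raw_bps / 1e9:.1f} Gb/s")
print(f"link payload      : {payload_bps / 1e9:.1f} Gb/s")
print("fits" if payload_bps >= raw_bps else "does not fit")
```

Under these assumptions the eight lanes carry slightly more payload than the raw 12 bit stream needs.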

  6. Redback
     • ASKAP Beamformer/correlator board
     • 6 Xilinx Kintex-420 Series-7 FPGAs
     • 1U chassis
     • 36x10 Gbps Ethernet output
     • 2x2 GHz IF per redback
     • 2 redback/telescope
     • Coarse filterbank – 128 MHz
     • Need to divide data 16 ways
     • Re-quantize to 8 bits to reduce i/o load
     • 16 bits would be better, but doubles backend cost
     • 12 bit may have minimal cost overhead

  7. Switch
     • Commodity 40 Gbps Ethernet switch
     • 64 port 40 GbE ~$30K
     • Can run 4x10 GbE per 40 GbE port
     • 8 bit system requires 56x40 GbE ports
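One way to arrive at a 56-port figure is sketched below. The slides do not show the working, so the counting scheme is an assumption: complex, critically sampled 128 MHz channels with 8 bits per component, no packet overhead, as many channels packed per 10 GbE link as will fit, four 10 GbE links per 40 GbE port, and one 40 GbE port per GPU (16 GPUs per IF, as stated on the GPU backend slide).

```python
import math

ANTENNAS = 6
IFS = 2
POLS = 2
COARSE_CHANNELS_PER_IF = 16
CHANNEL_BW_HZ = 128_000_000
BITS_PER_COMPONENT = 8            # ASSUMED: 8 bit real + 8 bit imaginary
LINK_10G_BPS = 10_000_000_000

# One coarse channel, one antenna, both polarisations (ASSUMED critically sampled).
channel_bps = CHANNEL_BW_HZ * 2 * BITS_PER_COMPONENT * POLS

# Whole channels that fit on one 10 GbE link (packet overhead ignored).
channels_per_link = LINK_10G_BPS // channel_bps

# Redback side: 10 GbE links into the switch, grouped four per 40 GbE port.
links_10g = ANTENNAS * IFS * math.ceil(COARSE_CHANNELS_PER_IF / channels_per_link)
input_ports_40g = math.ceil(links_10g / 4)

# GPU side: one dedicated 40 GbE port per GPU.
gpu_ports_40g = IFS * COARSE_CHANNELS_PER_IF

total_ports_40g = input_ports_40g + gpu_ports_40g
print(input_ports_40g, gpu_ports_40g, total_ports_40g)
```

With these assumptions the count is 24 ports on the redback side plus 32 on the GPU side, which matches the slide's total but does not confirm that it was derived this way.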

  8. GPU backend
     • Need to frequency slice data at least 12 ways to avoid bottleneck on ingest (6 antennas, 2 pols)
     • Use factor of 16
     • 16 GPUs per IF, dedicated 40 GbE Ethernet per GPU
     • 2 GPUs/host plus 2 40 GbE NICs
     • Don’t implement zoom bands – filter data to highest required spectral resolution then frequency average as appropriate
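The "filter to the highest resolution, then frequency average" approach from the last bullet can be illustrated in a few lines. This is a minimal sketch in plain Python rather than GPU code; the function name `average_channels` and the toy spectrum are hypothetical and not from the talk.

```python
def average_channels(spectrum, factor):
    """Average adjacent fine channels in groups of `factor`.

    `spectrum` is a list of complex visibilities at the finest resolution;
    the result has len(spectrum) // factor coarser channels.
    """
    if factor <= 0 or len(spectrum) % factor != 0:
        raise ValueError("factor must be positive and divide the channel count")
    return [
        sum(spectrum[i:i + factor]) / factor
        for i in range(0, len(spectrum), factor)
    ]


# Toy example: 8 fine channels averaged down to 2 coarse channels.
fine = [complex(k, -k) for k in range(8)]
coarse = average_channels(fine, 4)
print(coarse)
```

A single fine channelisation can serve several science cases this way, since each output product only differs in the averaging factor applied afterwards.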

  9. Costs: Assumptions ($A)

  10. Costs: Total

  11. Supported Modes
     • Assuming GPUs have enough computational power, basic interferometry modes are relatively easy to implement
     • High spectral resolution, short integration times
     • Tied array – multiple simultaneous beams
     • i/o issues (but new GPUs have 2 copy engines and NIC Tx is relatively empty)
     • Pulsar binning
     • Need to also extract noisecal

  12. Exotic Modes
     • Pulsar coherent de-dispersion or binning
     • Just requires extra compute
     • Fast radio bursts – serendipitous
     • Requires ~500 kHz spectral resolution, 64 µs time resolution
     • 246 Gbps visibility output (8 Gbps/node)
     • Not really viable, need to detect on 128 MHz data
     • Longer integration, lower spectral resolution?
     • Transient buffer mode, external triggers
     • 3 Gbytes/sec incoming rate
     • Only buffer a few seconds – not viable without major cost (RAM)
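The data rates quoted on this slide can be reproduced with simple arithmetic. The sketch below is one set of assumptions that gives matching numbers, not necessarily the author's working: 32 coarse channels of 128 MHz, cross-correlations only (15 baselines), 4 polarisation products, 32 bits per visibility, and 8 bit complex voltage samples. The 16 GB buffer size is purely hypothetical.

```python
ANTENNAS = 6
POLS = 2
GPUS = 32                          # 16 per IF, 2 IFs (from the slides)
COARSE_BW_HZ = 128_000_000
FINE_RES_HZ = 500_000              # from the slide
DUMP_TIME_US = 64                  # from the slide
POL_PRODUCTS = 4                   # ASSUMED: full polarisation products
BITS_PER_VISIBILITY = 32           # ASSUMED output word size

# Fast radio burst mode: visibility output rate.
channels = GPUS * COARSE_BW_HZ // FINE_RES_HZ
baselines = ANTENNAS * (ANTENNAS - 1) // 2      # ASSUMED: no autocorrelations
dumps_per_s = 1_000_000 // DUMP_TIME_US
vis_bps = channels * baselines * POL_PRODUCTS * BITS_PER_VISIBILITY * dumps_per_s
vis_bps_per_gpu = vis_bps // GPUS

# Transient buffer mode: voltage ingest per GPU.
BYTES_PER_SAMPLE = 2               # ASSUMED: 8 bit real + 8 bit imaginary
ingest_bytes_per_s = COARSE_BW_HZ * BYTES_PER_SAMPLE * ANTENNAS * POLS

def buffer_seconds(ram_bytes):
    """Seconds of voltage data that fit in `ram_bytes` of buffer on one GPU's host."""
    return ram_bytes / ingest_bytes_per_s

print(f"visibilities: {vis_bps / 1e9:.2f} Gb/s total, {vis_bps_per_gpu / 1e9:.2f} Gb/s per GPU")
print(f"ingest: {ingest_bytes_per_s / 1e9:.3f} GB/s per GPU")
print(f"16 GB (hypothetical) buffers {buffer_seconds(16e9):.1f} s")
```

Under these assumptions the totals come out close to the slide's 246 Gbps, roughly 8 Gbps per node, and about 3 Gbytes/sec of ingest per GPU, which is why only a few seconds can be buffered in a realistic amount of RAM.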

  13. Exotic Modes (cont)
     • Nanosecond pulse detection (LUNASKA)
     • 2 GHz bandwidth gives 0.25 ns sampling time
     • But need full bandwidth in one location
     • Could receive 128 MHz channels on servers then recirculate in round-robin fashion
     • Not enough bandwidth to receive second copy of data without second NIC and more switching capability (dedicated InfiniBand?)
     • Change redback mode to buffer data, then round-robin full 2 GHz data to GPUs – need special redback FPGA firmware
     • De-disperse antenna data, form tied array beams and look for pulses
     • Can dump voltage data if candidate found
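Two ideas on this slide lend themselves to a short illustration: the sampling interval implied by the bandwidth, and round-robin distribution of full-bandwidth time blocks to GPUs. The sketch assumes real-valued Nyquist sampling; the function names and the block-to-GPU mapping are hypothetical, since the slides do not specify a scheduling scheme.

```python
def nyquist_interval_s(bandwidth_hz):
    """Sampling interval for real-valued Nyquist sampling of `bandwidth_hz`."""
    return 1.0 / (2.0 * bandwidth_hz)


def round_robin_gpu(block_index, n_gpus):
    """GPU that receives full-bandwidth time block `block_index` (hypothetical scheme)."""
    return block_index % n_gpus


interval = nyquist_interval_s(2e9)
print(f"sampling interval: {interval * 1e9:.2f} ns")

# First few time blocks spread over 16 GPUs, wrapping back to GPU 0 at block 16.
schedule = [round_robin_gpu(block, 16) for block in range(18)]
print(schedule)
```

Each GPU then sees every sixteenth block at the full time resolution, at the cost of the redback having to buffer data between blocks.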

  14. Thank you
     Astronomy and Space Science
     Dr Chris Phillips, LBA Lead Scientist
     t +61 2 9372 4608
     e Chris.Phillips@csiro.au
     w www.atnf.csiro.au
