Learn about the DiFX system, a high-performance correlator developed for eVLBI projects, its history, capabilities, supported formats, architecture, and benchmarking techniques. Explore benchmarks, contact information, and collaboration status.
DiFX Performance Testing Chris Phillips eVLBI Project Scientist 25 June 2009
DiFX history • Developed by Adam Deller at Swinburne University of Technology (now NRAO) to replace the LBA S2 correlator and allow disk-based correlation • Production correlator of the LBA (Australia) since 2007 • Verified against LBA, VLBA and Bonn hardware correlators
DiFX overview • FX-style correlator implemented in C++ • 95% optimised C vector function calls (heavy reliance on Intel IPP libraries) • Non-clocked system, unlike hardware correlators (HWCs) • Maximum performance without compromising generality or ease of maintenance • Modular design to support generality and enable “3rd party” contributors and local system optimisation
Capabilities • Near-arbitrary time and frequency resolution • Advanced pulsar gating • eVLBI (LBA has done 1 Gbps eVLBI) • Correlate anything it can unpack (1/2/4/X Gbps) • Most new formats easy to implement
Supported formats • Input • LBA • Mk5A (Mk4/VLBA) • K5 (via translation) • Mk5B • VDIF (end 2009) • Output • RPFITS, FITS-IDI
Current users • Long Baseline Array (Australia) • VLBA (USA) • MPIfR (Bonn, Germany) • AuScope geodetic array (Australia/NZ, 2009) • E-LOFAR (EU)
Future/Imminent Capabilities • Single pass, multiple phase centres • Improved (faster) fringe rotation • Band matching • e.g. 2x64 MHz with 1x128 MHz • Baseband pulsar "folder" • Native geodetic output format • Phase cal extraction • Frequency division multiplexing of VDIF • Polyphase filterbank
DiFX architecture • [Diagram; labels: Source data; DataStream 1 … N; Baseband data; Core 1 … M, each with a processing buffer; Visibilities; Master Node with visibility buffers; Timerange, destination] • Large, segmented ring buffer • Up to 100s of MB / a few seconds or more • MPI is used for inter-process communications • Each data transfer is double buffered
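The "segmented ring buffer" idea on the architecture slide can be illustrated with a short Python sketch. This is not DiFX code (DiFX is C++ and uses MPI); the class name `SegmentedRingBuffer` and its methods are hypothetical, and the sketch is single-threaded for clarity. It only shows the core behaviour: a fixed number of segments, a writer that wraps around, and a reader that frees segments for reuse.

```python
class SegmentedRingBuffer:
    """Fixed number of segments reused in a circle (illustrative only)."""

    def __init__(self, n_segments):
        self.segments = [None] * n_segments
        self.head = 0   # index of the next segment to write
        self.tail = 0   # index of the next segment to read
        self.count = 0  # number of segments currently holding data

    def put(self, data):
        """Store one segment; return False if the buffer is full."""
        if self.count == len(self.segments):
            return False
        self.segments[self.head] = data
        self.head = (self.head + 1) % len(self.segments)
        self.count += 1
        return True

    def get(self):
        """Return the oldest segment, or None if the buffer is empty."""
        if self.count == 0:
            return None
        data = self.segments[self.tail]
        self.segments[self.tail] = None  # free the segment for reuse
        self.tail = (self.tail + 1) % len(self.segments)
        self.count -= 1
        return data
```

With several segments in flight, the producer can keep filling one segment while a consumer drains another, which is the same motivation as the double buffering noted on the slide.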
Computational Distribution • Currently: only time division multiplexing • VDIF will allow frequency division multiplexing: implementation style? • As currently implemented all baselines must still be correlated on one Core
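As a rough illustration of time division multiplexing, the Python sketch below hands out time slices to cores and counts the baselines each core must then correlate for its slice. The round-robin policy and the names `assign_time_slices` and `baselines` are assumptions made for this example; the slides do not say how the Master Node actually chooses a destination for each time range.

```python
from itertools import combinations


def assign_time_slices(n_slices, n_cores):
    """Round-robin time slices over cores (assumed policy, for illustration)."""
    schedule = {core: [] for core in range(n_cores)}
    for time_slice in range(n_slices):
        schedule[time_slice % n_cores].append(time_slice)
    return schedule


def baselines(n_stations):
    """All station pairs; under time division every core handles all of them."""
    return list(combinations(range(n_stations), 2))


schedule = assign_time_slices(n_slices=10, n_cores=3)
pairs = baselines(n_stations=6)
print(schedule)
print(len(pairs), "baselines per time slice")
```

Because the work is split only in time, adding cores increases throughput, but the per-slice cost on each core still grows with the number of baselines.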
Benchmarking • Need to eliminate disk i/o to get a clear indication of the potential speed of a specific setup • eVLBI! • Live eVLBI not suitable, as it runs at a fixed data rate • VLBIFAKE program generates eVLBI data stream • LBADR, Mark5B and VDIF • TCP and UDP • Only TCP usable for benchmarking • Shell script to run correlator and save logs • Rate determined by median transfer from VLBIFAKE CSIRO. eVLBI-Aus
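The "median transfer" figure can be computed along the following lines. This is a minimal Python sketch: the function name `median_rate_mbps` and the input (bytes sent per fixed reporting interval) are assumptions, since the slides do not show the VLBIFAKE log format. A median is a reasonable choice here because it is not skewed by a slow start-up interval or a brief stall.

```python
from statistics import median


def median_rate_mbps(bytes_per_interval, interval_s):
    """Median transfer rate in Mbps from per-interval byte counts."""
    rates = [8 * b / interval_s / 1e6 for b in bytes_per_interval]
    return median(rates)


# Hypothetical byte counts for three one-second reporting intervals.
samples = [125_000_000, 100_000_000, 150_000_000]
print(median_rate_mbps(samples, interval_s=1.0))
```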
Cuppa • 20 nodes, dual CPU Quad core • 6 stations • Up to 12 processing nodes • Testing number of threads and processing cores
APSR • 18 compute nodes, dual CPU Quad core • 5 i/o nodes dual CPU dual core • 4 stations • Up to 18 processing nodes
Code collaboration status • Entire codebase has been organised on SVN (hosted by ATNF) • DiFX wiki (hosted by Curtin): http://cira.ivec.org/dokuwiki/doku.php/difx/index • Mailing list: difx-users@googlegroups.com • To get on the difx-users list, search for difx-users on Google Groups and request access, or email me
Contact Us Phone: 1300 363 400 or +61 3 9545 2176 Email: enquiries@csiro.au Web: www.csiro.au Thank you ATNF Chris Phillips eVLBI Project Scientist Phone: +61 2 93724608 Email: Chris.Phillips@csiro.au Web: www.atnf.csiro.au/vlbi
Benchmarks • Non-clocked system, unlike HWCs • Indicative number of CPU cores required to correlate at real time: • LBA @ 1 Gbps (256 MHz agg. b/w, 2 bit): 100 • VLBA @ 4 Gbps (1 GHz agg. b/w, 2 bit): 800 • Weak dependencies on e.g. num. channels • 160 CPU core system (exceeding VLBA HWC capacity) costs <$100k inc. networking, annual electricity ~$10k
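The indicative figures above can be turned into per-unit numbers with a few lines of Python. The values are taken directly from the slide; the variable names are made up for this sketch, and the cost per core is only an upper bound because the slide gives the system cost as "<$100k".

```python
# Indicative figures quoted on the Benchmarks slide.
benchmarks = {
    "LBA": {"gbps": 1, "cores": 100},
    "VLBA": {"gbps": 4, "cores": 800},
}

# CPU cores needed per Gbps of data rate for real-time correlation.
cores_per_gbps = {
    name: b["cores"] / b["gbps"] for name, b in benchmarks.items()
}

# Upper bound on hardware cost per core for the quoted 160-core system.
max_cost_per_core = 100_000 / 160

print(cores_per_gbps)
print(max_cost_per_core)
```

On these numbers the VLBA case needs twice as many cores per Gbps as the LBA case; the slide does not break down where that difference comes from.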