New Features in DiFX2.0 Adam Deller NRAO 3rd DiFX workshop, Curtin University, Perth
Outline
• What is DiFX2.0?
• New features:
  • Spectral channel selection (“zoom bands”)
  • In-correlator averaging
  • Multiple phase centres (MPCs)
  • Local oscillator (LO) offsets
  • Skipping over useless data rather than reading
• Under the hood:
  • Station-based improvements in Mode
  • Baseline-based improvements in Core
What is DiFX2.0?
• DiFX2.0 is an evolution of the DiFX code base
• It adds new features and changes the way some internal information is maintained (e.g. time is now managed by scan)
• It required a big break with the existing code because the file formats changed, which was necessary to provide control information for the new features
New features: channel select
• Define a new “Frequency” band which encompasses a subset of an existing band
• This “zoom” frequency can be selected as the one to correlate on a baseline, in place of the full bandwidth
• Applications:
  • Wide recorded band, narrow maser emission (throw away the useless channels and save network capacity)
  • Correlating mismatched bands, e.g. 1x16 MHz with 2x8 MHz
New features: channel select [Figure: Datastream 1 band x Datastream 2 band, giving the Baseline product]
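A minimal sketch of how a zoom band maps onto channels of its parent band. It is not DiFX code: the function name, the arguments and the example frequencies are all hypothetical, and it assumes the zoom band edges fall exactly on parent channel edges.

```python
def zoom_channel_range(parent_low_mhz, parent_bw_mhz, parent_nchan,
                       zoom_low_mhz, zoom_bw_mhz):
    """Return (first, last) parent channel indices covered by a zoom band.

    `last` is exclusive. All names here are illustrative, not DiFX identifiers.
    """
    chan_width = parent_bw_mhz / parent_nchan
    first_exact = (zoom_low_mhz - parent_low_mhz) / chan_width
    count_exact = zoom_bw_mhz / chan_width
    first, count = round(first_exact), round(count_exact)
    # Assumption: zoom band edges must line up with parent channel edges.
    if abs(first_exact - first) > 1e-9 or abs(count_exact - count) > 1e-9:
        raise ValueError("zoom band is not aligned to parent channels")
    if first < 0 or first + count > parent_nchan:
        raise ValueError("zoom band must lie inside the parent band")
    return first, first + count


# Hypothetical example: a 4 MHz zoom band inside a 16 MHz, 128-channel band.
first, last = zoom_channel_range(1600.0, 16.0, 128, 1604.0, 4.0)
```

Only channels `first` to `last - 1` would then need to be sent on for cross-multiplication, which is where the network saving comes from.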
New features: averaging
• Narrow-field VLBI only requires coarse spectral resolution, e.g. 0.5 MHz
• But taking e.g. a 16-point FFT is not efficient!
• The minimum desirable FFT size is about 128
• For coarser spectral resolution, visibilities used to be averaged in difx2fits
• That was wasteful of intermediate disk space
• Averaging is now done in the correlator, which saves network capacity (enabling MPCs) and disk space
New features: averaging [Figure: Datastream 1 band x Datastream 2 band, followed by averaging (avg); labels: Thread visibility, Core visibility]
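A small sketch of the averaging step, assuming plain unweighted averaging of adjacent channels (the weighting DiFX actually applies is not described here). The function name and the toy data are hypothetical. With a 16 MHz band and 128 channels (0.125 MHz each), averaging by 4 gives the 0.5 MHz resolution mentioned above.

```python
def average_channels(vis, factor):
    """Average each group of `factor` adjacent complex channels."""
    if len(vis) % factor != 0:
        raise ValueError("channel count must be a multiple of the factor")
    return [sum(vis[i:i + factor]) / factor
            for i in range(0, len(vis), factor)]


# Toy spectrum: 128 fine channels, reduced to 32 coarse channels.
fine = [complex(c, -c) for c in range(128)]
coarse = average_channels(fine, 4)
```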
New features: Multiple PCs
• At any given instant, the phase centre of correlation can be changed by rotating the visibilities by a phase equal to the LO frequency x delay (the delay difference between the desired phase centre and the current phase centre)
• This is a station-based effect, but if it is done after some accumulation it must be applied separately to each baseline
[Figure labels: primary beam; uv-shifted “pencil” fields]
New features: Multiple PCs
• Multiple phase centres were the main driver behind the compatibility-breaking upgrades for DiFX2.0
• A separate geometric model must be provided for each phase centre (calcif2, vex2difx)
• The initial correlation is directed at the pointing centre, with high spectral resolution; typically once per subint (it can be more frequent) the shift is applied and the channels are averaged
New features: Multiple PCs [Figure: subint visibility; thread amplitude and thread phase pass through “Rotate phase” and “Average” to give core amplitude and core phase; repeat for each phase centre]
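The rotate-then-average step can be sketched as below. This is an illustration only: the function name is hypothetical, the sign convention is an assumption, and the phase is evaluated at each channel's frequency rather than at a single LO frequency. The toy example shows why the shift must be applied at high spectral resolution before averaging: a source away from the pointing centre decorrelates if the channels are averaged unshifted.

```python
import cmath


def shift_phase_centre(vis, chan_freqs_hz, delay_diff_s):
    """Rotate each channel by -2*pi*freq*delay (sign convention assumed)."""
    return [v * cmath.exp(-2j * cmath.pi * f * delay_diff_s)
            for v, f in zip(vis, chan_freqs_hz)]


# Hypothetical setup: 128 channels across 16 MHz, source offset by 50 ns.
freqs = [1.6e9 + (c + 0.5) * 125e3 for c in range(128)]
delay_diff = 50e-9
source_vis = [cmath.exp(2j * cmath.pi * f * delay_diff) for f in freqs]

shifted = shift_phase_centre(source_vis, freqs, delay_diff)
unshifted_avg = sum(source_vis) / len(source_vis)  # decorrelated
shifted_avg = sum(shifted) / len(shifted)          # coherent
```

In a multi-field run this would be repeated once per phase centre, each with its own delay difference taken from that phase centre's geometric model.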
New features: LO offsets
• An improperly set LO at a station yields wrapping phase
• This can now be corrected for in DiFX
• The correction is implemented post-FFT, so it is limited to maximum offset rates of a few Hz to a few kHz, depending on FFT size
• It could be done pre-FFT if people thought it was really needed (discussion?)
• It also required a new entry in the input file
New features: LO offsets [Figure: phase (-180° to 180°) versus time, with the duration of one FFT marked]
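A rough sketch of why the post-FFT correction limits the usable LO offset. A post-FFT correction applies one phase per FFT, so the phase must not drift much within a single FFT. The sketch assumes real, Nyquist-sampled data (sample rate = 2 x bandwidth) and an arbitrary tolerance of 0.01 turns of drift per FFT; both the tolerance and the function names are assumptions made for illustration, not DiFX values.

```python
def fft_duration_s(fft_points, bandwidth_hz):
    """Time spanned by one FFT of real, Nyquist-sampled data."""
    return fft_points / (2.0 * bandwidth_hz)


def max_lo_offset_hz(fft_points, bandwidth_hz, max_turns_per_fft=0.01):
    """Largest offset keeping phase drift within one FFT under the tolerance."""
    return max_turns_per_fft / fft_duration_s(fft_points, bandwidth_hz)


# Short FFT on a wide band versus long FFT on a narrow band.
short_fft_limit = max_lo_offset_hz(256, 16e6)   # about 1.25 kHz
long_fft_limit = max_lo_offset_hz(4096, 1e6)    # about 4.9 Hz
```

Under these assumptions the limit spans a few Hz to a few kHz, in line with the range quoted above.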
New feature: data skipping
• In file-based mode in DiFX1.5, a Datastream reads all of every file you give it
• This is annoying if you want to correlate a subset of an experiment: the file list must be cropped, and because files are sometimes big, just reading from the start of one can take ages
• In DiFX2.0, the read thread checks the time of the last send request and attempts to reposition the file pointer appropriately
New feature: data skipping [Figure: File 1 and File 2 on a time axis, with the latest FxManager request marked. Read thread opens file; attempts to skip past EOF; read thread opens next file; skips to preceding integer second]
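The skipping logic can be sketched as follows. This is a simplified illustration, not the DiFX read thread: it assumes a constant data rate and ignores frame headers and format details, and every name in it is hypothetical. It returns the byte offset of the integer second preceding the requested time, or `None` when that point lies past the end of the file and the next file should be opened instead.

```python
import math


def skip_offset_bytes(file_start_s, file_size_bytes, bytes_per_second,
                      requested_time_s):
    """Byte offset to seek to, or None if the request is past end of file."""
    target_s = math.floor(requested_time_s)  # preceding integer second
    if target_s <= file_start_s:
        return 0  # request is at or before the start: read from the top
    offset = int((target_s - file_start_s) * bytes_per_second)
    if offset >= file_size_bytes:
        return None  # would skip past EOF: move on to the next file
    return offset


# Hypothetical file: starts at t = 100 s, holds 60 s of data at 1000 B/s.
inside = skip_offset_bytes(100, 60000, 1000, 130.7)
past_eof = skip_offset_bytes(100, 60000, 1000, 175.2)
```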
Station-based efficiency gains
• The majority of the station-based cost was not in the FFT, but in the sin/cos used to calculate the phase of the fringe rotation (pre-F)
• When the phase change is linear from channel to channel (always true), sin/cos need only be calculated for the first N channels and then for every Nth channel (M values); complex multiplies give the full NxM channel result
• Saves about 20% of the overall execution time for 10 stations (~25% of the station-based cost)
Station-based efficiency gains [Figure: phase (-180° to 180°) across one FFT of data. Previously, sin/cos for every sample]
Station-based efficiency gains [Figure: phase (-180° to 180°) across one FFT of data. Now, sin/cos for the first M samples, and for every M’th sample after that]
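The trick relies on the identity exp(i(a + b)) = exp(i a) x exp(i b). A minimal sketch, with hypothetical function and parameter names, that builds `stride * num_strides` linearly spaced phasors from only `stride + num_strides` sin/cos evaluations and compares them against the direct calculation:

```python
import cmath


def linear_phasors_strided(phase0, dphase, stride, num_strides):
    """Phasors exp(i*(phase0 + n*dphase)) for n = 0 .. stride*num_strides-1."""
    # sin/cos for the offsets within one stride (phase relative to its start)
    sub = [cmath.exp(1j * (k * dphase)) for k in range(stride)]
    # sin/cos for the first sample of each stride
    step = [cmath.exp(1j * (phase0 + m * stride * dphase))
            for m in range(num_strides)]
    # everything else comes from complex multiplies
    return [step[m] * sub[k]
            for m in range(num_strides) for k in range(stride)]


def linear_phasors_direct(phase0, dphase, count):
    """Reference version: one sin/cos evaluation per sample."""
    return [cmath.exp(1j * (phase0 + n * dphase)) for n in range(count)]


fast = linear_phasors_strided(0.3, 0.01, 16, 16)
slow = linear_phasors_direct(0.3, 0.01, 256)
max_error = max(abs(a - b) for a, b in zip(fast, slow))
```

For 256 samples this replaces 256 sin/cos evaluations with 32, plus 256 complex multiplies, which are much cheaper.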
Baseline-based efficiency gain
• For many baselines or large numbers of channels, the entire output accumulator no longer fits in CPU cache, causing a massive slowdown
• Looping over baseline/freq/polarisation once per FFT is inefficient in this situation
• Solution: calculate more than one FFT for each datastream, then XMAC the same baseline/freq for more than one FFT
• This reduces the overhead of going from 128 to 2048 channels/band from ~5x to ~2x
Baseline-based efficiency gain [Figure: Before. Mode 1, Mode 2, Mode 3, … Mode N feed a visibility buffer (too big for cache)]
Baseline-based efficiency gain [Figure: After. Mode 1, Mode 2, Mode 3, … Mode N feed a visibility buffer (too big for cache), but one slot fits in cache!]
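The change is a loop reordering, sketched below with hypothetical names and toy data. Pure Python will not show the cache effect, so the sketch only demonstrates that accumulating a block of FFTs into one baseline's slot before moving on gives the same visibilities as sweeping the whole buffer once per FFT.

```python
def xmac_fft_outer(streams, baselines, nchan):
    """Before: for every FFT, sweep the whole visibility buffer."""
    nfft = len(streams[0])
    vis = [[0j] * nchan for _ in baselines]
    for f in range(nfft):
        for b, (i, j) in enumerate(baselines):
            for c in range(nchan):
                vis[b][c] += streams[i][f][c] * streams[j][f][c].conjugate()
    return vis


def xmac_blocked(streams, baselines, nchan, ffts_per_block):
    """After: accumulate several FFTs into one baseline slot at a time."""
    nfft = len(streams[0])
    vis = [[0j] * nchan for _ in baselines]
    for start in range(0, nfft, ffts_per_block):
        block = range(start, min(start + ffts_per_block, nfft))
        for b, (i, j) in enumerate(baselines):
            slot = vis[b]  # the part of the buffer meant to stay in cache
            for f in block:
                for c in range(nchan):
                    slot[c] += streams[i][f][c] * streams[j][f][c].conjugate()
    return vis


# Toy data laid out as streams[station][fft][channel].
nchan, nfft = 8, 10
streams = [[[complex(s + f, c - s) for c in range(nchan)]
            for f in range(nfft)] for s in range(3)]
baselines = [(0, 1), (0, 2), (1, 2)]
before = xmac_fft_outer(streams, baselines, nchan)
after = xmac_blocked(streams, baselines, nchan, ffts_per_block=4)
```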
Summary of input file changes
• .input file moved:
  • Num channels, oversample, decimation (from config entries to frequency entries)
• .input file changed:
  • Post-F fringe rotation, quadratic interpolation -> fringe rotation order
  • Blocks per send/guard blocks -> subintNS/guardNS
  • Delay/uvw files -> im file
• .input file new:
  • LO offset, zoom band entries (datastream)
Summary of other file changes
• Calc file changes:
  • Added a source table, which is referenced in the scan table
  • Scans now have a pointing centre and one or more phase centres
• IM file changes:
  • Extra entries for the phase centres, as well as the pointing centre
Some quick benchmark results
• For station-based-dominated correlation (fewer than ~10 stations), DiFX2.0 should be ~20% faster
• To add phase centres out to the edge of the primary beam, one must go to 2k or 4k channels, which is 2-3x slower than continuum with 128 channels (it was more like 4-5x slower before the changes)
• But then adding phase centres is basically free: doing 100 phase centres is only about 1.2x slower than doing 1 phase centre