
Low-Frequency Pulsar Surveys and Supercomputing



Presentation Transcript


  1. Low-Frequency Pulsar Surveys and Supercomputing • Matthew Bailes

  2. Outline: • Baseband Instrumentation • MultiBOB • MWA survey vs PKSMB survey • Data rates • CPU times • Low-Frequency Pulsar Monitoring • The Future Supercomputers

  3. Pulsar “Dedispersion” • Incoherent
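
The slide names incoherent dedispersion without spelling it out. As a minimal sketch of the usual approach (shift each detected filterbank channel by its cold-plasma dispersion delay, then sum over channels), the following may help. The function names, the list-of-lists filterbank layout and the whole-sample rounding are assumptions made for illustration and are not taken from the talk; the dispersion constant is the standard value.

```python
# Standard dispersion constant, in MHz^2 pc^-1 cm^3 s.
K_DM = 4.148808e3


def dispersion_delay(dm, freq_mhz, ref_freq_mhz):
    """Delay in seconds of freq_mhz relative to ref_freq_mhz for a DM in pc/cc."""
    return K_DM * dm * (freq_mhz ** -2 - ref_freq_mhz ** -2)


def dedisperse_incoherent(filterbank, freqs_mhz, dm, tsamp):
    """Shift every channel by its whole-sample delay and sum over channels.

    filterbank: list of channels, each a list of detected power samples
                (hypothetical layout chosen for this sketch).
    freqs_mhz:  centre frequency of each channel in MHz.
    tsamp:      sampling time in seconds.
    Returns the dedispersed time series, shortened by the largest shift.
    """
    ref = max(freqs_mhz)  # delays are measured relative to the top channel
    shifts = [round(dispersion_delay(dm, f, ref) / tsamp) for f in freqs_mhz]
    nout = len(filterbank[0]) - max(shifts)
    return [
        sum(chan[t + s] for chan, s in zip(filterbank, shifts))
        for t in range(nout)
    ]
```

Because each channel is only shifted as a whole, the smearing inside a channel remains; that residual is what coherent dedispersion removes by working on the voltages rather than on detected power.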

  4. Coherent Dedispersion • Unresolved on μs timescales • From young or millisecond pulsars • Power-law distribution of energies (figure: PSR J0218+4232)

  5. 1022+1001 Pulsar Timing (Kramer et al.)

  6. CPSR2 Timing (Hotan, Bailes & Ord)

  7. Swinburne Baseband Recorders etc • 1998: Canadian S2 to computer (16 MHz x 2); 100K system + video tapes • 2000: CPSR; 20 MHz x 2 + DLT7000 drives x 4 • 2002: CPSR2; 128 MHz x 2 + real-time supercomputer (60 cores) • 2006: DiFX (Deller, Tingay, Bailes & West); software correlator (ATNF adopted) • 2007: APSR; 1024 MHz x 2 + real-time supercomputer (160 cores) • 2008: MultiBOB; 13 x 1024 ch x 64 μs + fibre + 1600-core supercomputer

  8. dspsr software • Mature • Delivers < 100 ns timing on selected pulsars • Total power estimation every 8 μs with RFI excision • Write a “loader” • Can do: giant pulse work; pulsar searching (coherent filterbanks); pulsar timing/polarimetry; interferometry with pulsar gating

  9. PSRDADA (van Straten) • psrdada.sourceforge.net • Generic UDP data capture system (APSR/MultiBOB) • Ring buffer(s) • Can attach threads to fold/dedisperse etc • Hierarchical buffers • Shares available CPU resources/disk • Web-based control/monitoring • Free! + hooks to dspsr & psrchive.

  10. APSR • Takes 8 Gb/s voltages • Forms: 16 x 128 channels (with coherent dedispersion); 4 Stokes, umpteen pulsars • Real-time fold to DM=250 pc/cc • O(100) ops/sample • Sustaining >>100 Gflops • ~100K computers • June 2008: 192 MHz working @ 4 bits; 768 MHz working @ 2 bits
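
The APSR figures can be cross-checked with a little arithmetic. This is a rough sketch, assuming Nyquist-sampled real voltages (two samples per second per Hz of bandwidth), two polarisations, and 2-bit samples at the full 1024 MHz; those assumptions and the function names are mine, not stated on the slide.

```python
def voltage_rate_gbps(bandwidth_mhz, nbits, npol=2):
    """Raw voltage data rate in Gb/s, assuming Nyquist-sampled real voltages."""
    samples_per_s = 2 * bandwidth_mhz * 1e6 * npol  # 2 real samples per Hz per second
    return samples_per_s * nbits / 1e9


def sustained_gflops(bandwidth_mhz, ops_per_sample=100, npol=2):
    """Compute load in Gflops for a given number of operations per sample."""
    samples_per_s = 2 * bandwidth_mhz * 1e6 * npol
    return samples_per_s * ops_per_sample / 1e9


full_band = voltage_rate_gbps(1024, nbits=2)  # about 8.2 Gb/s, cf. "8 Gb/s voltages"
mode_768 = voltage_rate_gbps(768, nbits=2)    # about 6.1 Gb/s
mode_192 = voltage_rate_gbps(192, nbits=4)    # about 3.1 Gb/s
load = sustained_gflops(1024)                 # about 410 Gflops at 100 ops/sample
```

Under these assumptions the full band at O(100) ops/sample lands at a few hundred Gflops, which is in line with the slide's ">>100 Gflops".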

  11. Coherent Dedispersion [Figure: coherently dedispersed bandwidth (BW) against year, 1998-2008; plotted points at 16, 20, 128 (300K) and 1024 (100K)]

  12. Coherent Dedispersion • Now “trivial” • FFT ease ~ B^(-2/3)

  13. MultiBOB • High Resolution Universe Survey (PALFA of the South) • Werthimer’s iBOB boards • 1024 channels, down to 10 μs sampling • Two pols • FPGA coding hard… • Use software gain equalizer/summer • ~5 MB/s beam • 1 Gb/s fibre to Swinburne (>1000 km fibre) • Real-time searching!

  14. New PKS MB Survey: • Kramer: 13 beams; 70 minutes/pointing; 1024 channels; 300 MHz BW; 64 μs sampling; +/- 3.5 deg • Bailes: 13 beams; 9 minutes/pointing; 1024 channels; 300 MHz BW; 64 μs sampling; +/- 15 deg • Johnston: 13 beams; 4.5 minutes/pointing; 1024 channels; 300 MHz BW; 32 μs sampling; the rest

  15. MWA • Samples: takes (24 x 1.3 MHz = 32 MHz) x 2 x 512; “just” 32 GB/s (64 Gsamples/s) • FFTs it (5 N log2 N ops per N-point FFT = 2.2 Tflops) • Cross-multiplies & adds: 512 x 256 x B x 4 = 16 TMACs
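
The MWA sampling and correlation numbers on this slide follow from a few multiplications. The sketch below reproduces them under stated assumptions: real (Nyquist) sampling, 4-bit samples (the value that turns 64 Gsamples/s into 32 GB/s), and the factor of 4 read as the number of polarisation products. Variable names are illustrative only.

```python
N_TILES = 512
N_POL = 2
BANDWIDTH_HZ = 32e6   # 24 coarse channels of ~1.3 MHz, rounded to 32 MHz on the slide
BITS_PER_SAMPLE = 4   # assumption: implied by 64 Gsamples/s = 32 GB/s
POL_PRODUCTS = 4      # assumption: the slide's trailing "x 4"

# Nyquist sampling of real voltages: 2 samples per second per Hz of bandwidth.
real_samples_per_s = 2 * BANDWIDTH_HZ * N_POL * N_TILES  # ~65.5e9 ("64 Gsamples/s")
bytes_per_s = real_samples_per_s * BITS_PER_SAMPLE / 8   # ~32.8e9 ("32 GB/s")

# Number of tile pairs, approximated as 512 x 256 as on the slide.
baselines = N_TILES * (N_TILES // 2)

# One complex multiply-accumulate per baseline, per Hz, per polarisation product.
macs_per_s = baselines * BANDWIDTH_HZ * POL_PRODUCTS     # ~1.68e13 ("16 TMACs")
```

The cross-multiply stage grows with the square of the number of tiles, which is why it dominates the budget for a 512-tile array.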

  16. Sensitivity (folded factor), MWA vs PKS: • Bandwidth: 32 vs 288 MHz • Gain: ~3-5x PKS • Field of view: 700 vs 0.6 deg^2 • Tsys: 350 vs 25 K

  17. ~ Parity PKS vs MWA • G ~ 3-5 x better • Tsys ~ 14 x worse ? • B^(1/2) ~ 3 x worse • Flux ~ 25 x better (1400 vs 200 MHz) • t^(1/2) ~ 32 x better • Single pulse work ~ comparable • Coherent search ~ 32x improvement! • But: there is a limit to the time you can observe a pulsar! 4m vs 144m -> 5x deeper.
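
One way to read the "parity" claim is to multiply the quoted factors together, radiometer-equation style. The sketch below does only that; combining the factors as a simple product is my interpretation of the slide, and the function and parameter names are hypothetical.

```python
def relative_sensitivity(gain, tsys_penalty=14.0, bw_penalty=3.0,
                         flux_gain=25.0, time_gain=1.0):
    """MWA sensitivity relative to PKS, as a product of the slide's factors.

    gain:         collecting-area advantage (the slide quotes 3-5).
    tsys_penalty: system-temperature disadvantage (350 K vs 25 K).
    bw_penalty:   sqrt(288 MHz / 32 MHz) = 3.
    flux_gain:    pulsars being brighter at 200 MHz than at 1400 MHz.
    time_gain:    sqrt of the extra integration time, 1 for single pulses.
    """
    return gain * flux_gain * time_gain / (tsys_penalty * bw_penalty)


single_pulse_low = relative_sensitivity(3.0)    # roughly 1.8
single_pulse_high = relative_sensitivity(5.0)   # roughly 3.0
coherent_search = relative_sensitivity(4.0, time_gain=32.0)
```

On this reading the single-pulse ratio comes out at a factor of 2-3, i.e. "comparable", and the square root of the longer dwell time supplies the extra factor of about 32 quoted for a coherent search.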

  18. Scattering b=0 • 1, 10, 100, 1000 ms

  19. Scattering b=5 deg • 1, 10, 50, 100 ms

  20. b=30 • 0.5, 1 ms

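  21. Search instrumentation? [Diagram: data flow from voltages (32 MHz) through F (spectra), X (visibilities), gridding (uv) and inverse 2D FFT to filterbanks, followed by dedispersion, fold/FFT of spectra, and pulsars; the chain is split between “Correlator” and “Us”. Multiplicities labelled along the chain: x 512, x 512 x 256, x 192^2, x 512. Data rates labelled: 36 GB/s, 1024 GB/s, 600 GB/s, 30 GB/s and 200 GB/s, with sample widths of 32 bits and 5 bits, ending at <1 bit/s of pulsars.]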

  22. Search Timings • 36,000 “coherent beams”: (768 m / 4 m = 192)^2 • 36 gigapixels/s • Dedisperse/CPU core: gigapixel/120 s • 36 x 120 = 4320 cores = 500 machines = 250 kW • NFFT = 36,000 * 1024 (DMs)/8192 = 4608 FFTs/sec • Seek (3 s / 8192 x 1024 pt FFT) • 14,000 cores ~ 1800 machines = MW (M$/yr)
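
The core counts on this slide are simple products, and the same recipe is reused for the 32x32-tile case on slide 25. A small sketch, with a hypothetical function name and with the slide's benchmark numbers (120 s per gigapixel to dedisperse, 3 s per 8192 x 1024 point FFT search, 1024 DM trials, 8192 s per FFT) as default parameters:

```python
def search_budget(n_beams, gigapixels_per_s, n_dms=1024, fft_len_s=8192,
                  dedisp_s_per_gigapixel=120, seek_s_per_fft=3):
    """Cores needed to keep up in real time.

    Returns (dedispersion cores, FFTs per second, FFT-search cores).
    """
    # A core that needs 120 s per gigapixel can sustain 1/120 gigapixel/s.
    dedisp_cores = gigapixels_per_s * dedisp_s_per_gigapixel
    # Each beam and DM trial produces one long FFT every fft_len_s seconds.
    ffts_per_s = n_beams * n_dms / fft_len_s
    # Each FFT search occupies a core for seek_s_per_fft seconds.
    seek_cores = ffts_per_s * seek_s_per_fft
    return dedisp_cores, ffts_per_s, seek_cores


full_array = search_budget(n_beams=192 ** 2, gigapixels_per_s=36)
compact_array = search_budget(n_beams=1024, gigapixels_per_s=1)
```

For the full array this gives 4320 dedispersion cores, 4608 FFTs per second and 13,824 search cores, matching the slide's "14,000 cores". For 1024 beams the same arithmetic gives 120 cores, 128 FFTs per second and 384 search cores, close to the 378 quoted on slide 25.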

  23. Supercomputing @ Swinburne: The Green Machine • installed May/June 2007 • 185 Dell PowerEdge 1950 nodes • 2 quad-core processors (Clovertown: Intel Xeon 64-bit 2.33 GHz) • 16 GB RAM • 1 TB disk -> 300 TB total • 1640 cores/14 Tflops • dual channel gigabit ethernet • CentOS Linux OS • job queue submission • 20 Gb InfiniBand (Q1 2008) • 83 kW vs. 130 kW cooling • Machines: ~1.2M • Fuel: ~100K/yr

  24. Search Times: • Depend only upon: Npixels x Nchans x Tsamp^-1 • Requires: no acceleration trials • PSR J0437-4715: in 8192 s, small width from acceleration

  25. Search Timings (32x32 tiles) • 36000->1024 “coherent beams” • 36->1 gigapixels/s • Dedisperse/core • Gigapixel/120s • 120 = 120 cores = 15 machines = 7 kW • NFFT = 1024 * 1024 (DMs)/8192(s/FFT) = 128 FFTs/sec • Seek (3s / (8192 x 1024) pt FFT) • 378 cores ~ 50 machines = 25 kW.

  26. RRATs • Log N - Log S (helps with long pointings…) • 1000 x integration time. • Maybe good RRAT finder.

  27. Monitoring: Monitoring?

  28. Monitoring:

  29. Build Your Own Telescope? • May be cheaper to build dedicated PSR telescope than attempt to process everything from existing telescopes! • 32x32 tile: (2D FFT - 1D FFT - dedisperse - FFT) • ~2M telescopes • ~2M “beamformer/receivers” • ~1M correlator • ~1M Supercomputer • ~1M construction • ~7-8M

  30. Next-Gen Supercomputers (IO or Tflops?) • Infiniband 20 Gb (40Gb) • 288 port switch • ~10 Tb/s IO Capacity (1-2K/node) • Teraflop CPU capacities/node (140 Gflops now) • Teraflop Server or Tflop GPU? • 10 GB/s vs 76 GB/s • Power (0.1W/$) • 2M = 200 kW

  31. Architecture (2011??): [Diagram: two stages labelled F and X, each annotated with 288 ports, 40 Gb/s, 144 Tflops, 300K and ~1M]

  32. Summary: • Strong motivation for multiple (~100) tied array beams • PSRs/deg^2 • Surveys only possible with compact configurations • At present • Future Supercomputers may allow search even with MWA-like telescopes
