Massive Radio Astronomy Surveys at Arecibo Using ALFA: Data Mining and Management • The ALFA system: • Large-scale surveys at Arecibo • Consortia for pulsar, Galactic, and extragalactic science. • Data management • Analysis and archival of massive data sets • Mining of intermediate data products • Implementation at the Cornell Theory Center • Eventual linkage to National Virtual Observatory • Student involvement in end-to-end aspects of the project Jim Cordes URSI 2006
• Constructed at ATNF, installed 2004 Apr • 7 beams x 2 polarizations (linear) • 1225-1525 MHz • SEFD = 2.4 to 3 Jy • Current spectrometers: • 100 MHz fast-dump 3-level correlators (WAPPs) • GALFA polyphase filter bank (UCB) • Next generation (2006): 300 MHz polyphase filter banks (J. Mock, UCB) • 3.4 arcmin beams on an 11 arcmin x 13 arcmin ellipse
ALFA Science Goals: Massive Surveys • Extragalactic atomic hydrogen (H I) surveys: (E-ALFA) • Zone of avoidance galaxies • All (Arecibo) sky search for low-mass H I clouds and galaxies • High-velocity clouds • Galactic science: (G-ALFA) • Low- and high-latitude H I surveys • Interstellar turbulence • Disk-halo connection (chimneys & fountains) • Galactic structure and dynamics • Continuum surveys • Galactic synchrotron radiation (magnetoionic medium) • Pulsar science: (P-ALFA) • Deep surveys + follow-up pulse timing studies • SETI: to be done simultaneously with pulsar surveys http://alfa.naic.edu/ (specifications, memos, consortia)
ALFA Scientific Consortia • P-ALFA, G-ALFA, E-ALFA (ALFALFA) • 35 to 50 members/consortium • Open participation, self-organized • Astrophysics goals are consortium driven • Fraction of telescope time for ALFA surveys: up to 50% in any LST range • Baseline processing using Consortium software • Data mining advances by individual research groups; enabled by new algorithms used with data storage and processing capability • Surveys will take > 5 years to complete + years needed for follow-up • Surveys will be long-term NAIC legacies for astrophysics → appropriate infrastructure; e.g. radio-GLAST (γ-ray) synergy • Challenges: RFI excision, data rates and volumes, distinguishing astrophysical signals from noise and RFI.
http://egg.astro.cornell.edu/alfalfa • Survey of 7000 square degrees of high Galactic latitude sky • Neutral atomic hydrogen (HI) in the Milky Way and the nearby universe; also OH megamasers at 0.16 < z < 0.25 • Covers the HI line with Vhelio from -2000 to +18000 km/s (100 MHz centered at 1385 MHz) with 5 km/s resolution • Will take ~6-7 years to complete; started Feb 4, 2005 • 7-pixel Arecibo L-band (1.4 GHz) Feed Array (ALFA) camera • Minimum-intrusion, 2-pass, drift-scan technique • Predicted to detect HI in 20,000 galaxies • P.I.: Riccardo Giovanelli (Cornell) • 42 team members from 28 institutions in 11 countries
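The quoted band can be checked against the quoted velocity and redshift ranges with a few lines of arithmetic. The sketch below is not from the talk: it assumes the optical convention cz = v for the heliocentric velocity and standard rest frequencies for the HI 21 cm line and the 1667 MHz OH line; the function names are illustrative only.

```python
C_KM_S = 299_792.458      # speed of light, km/s
F_HI_MHZ = 1420.405752    # HI 21 cm rest frequency, MHz
F_OH_MHZ = 1667.359       # OH main-line rest frequency, MHz (assumed megamaser line)

def z_from_velocity(v_km_s):
    """Redshift from heliocentric velocity, optical convention cz = v (assumption)."""
    return v_km_s / C_KM_S

def observed_freq_mhz(rest_mhz, z):
    """Observed frequency of a line emitted at redshift z."""
    return rest_mhz / (1.0 + z)

# HI band edges for Vhelio = -2000 and +18000 km/s
hi_top = observed_freq_mhz(F_HI_MHZ, z_from_velocity(-2000.0))
hi_bottom = observed_freq_mhz(F_HI_MHZ, z_from_velocity(18000.0))

# OH megamaser band edges for 0.16 < z < 0.25
oh_top = observed_freq_mhz(F_OH_MHZ, 0.16)
oh_bottom = observed_freq_mhz(F_OH_MHZ, 0.25)

# Channel width equivalent to 5 km/s at the HI rest frequency
chan_khz = F_HI_MHZ * 5.0 / C_KM_S * 1e3

print(f"HI: {hi_bottom:.1f} to {hi_top:.1f} MHz, center {(hi_bottom + hi_top) / 2:.1f} MHz")
print(f"OH: {oh_bottom:.1f} to {oh_top:.1f} MHz")
print(f"5 km/s corresponds to about {chan_khz:.1f} kHz")
```

Under these assumptions the HI velocity range spans roughly 1340 to 1430 MHz, centered near 1385 MHz, and the OH redshift range falls in nearly the same band, which is why one 100 MHz setup serves both searches.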
ALFALFA Data Products Precursor data available SQL database PHP interface Download catalog in XML/VOTable format Spectra Cross reference with DSS, 2MASS and SDSS images NVO work led by Brian Kent
Pulsar Science • Extreme matter physics • 10x nuclear density: teaspoon ~ 10^8 tons • High-temperature superfluid & superconductor • B ~ B_q = 4.4 x 10^13 Gauss • Voltage drops ~ 10^12 volts • F_EM = 10^9 F_g = 10^9 x 10^11 F_g,Earth • Relativistic plasma physics • Magnetospheres • Radiation mechanisms • Tests of theories of gravity • Gravitational wave detectors • Probes of turbulent and magnetized ISM (& IGM) • End states of stellar evolution • Massive stars → neutron stars or black holes
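The quantum critical field quoted on this slide can be reproduced from fundamental constants. The sketch below assumes the standard definition B_q = m_e^2 c^3 / (e ħ) in Gaussian (CGS) units; the definition and the constant values are supplied here, not taken from the talk.

```python
# Physical constants in CGS (Gaussian) units
M_E = 9.1093837e-28      # electron mass, g
C = 2.99792458e10        # speed of light, cm/s
E_ESU = 4.80320471e-10   # elementary charge, esu
HBAR = 1.054571817e-27   # reduced Planck constant, erg s

def quantum_critical_field_gauss():
    """Field at which the electron cyclotron energy equals its rest-mass energy."""
    return M_E**2 * C**3 / (E_ESU * HBAR)

b_q = quantum_critical_field_gauss()
print(f"B_q = {b_q:.2e} Gauss")
```

This gives a value close to the 4.4 x 10^13 Gauss quoted on the slide.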
PALFA Survey Goals • 10^3 new pulsars • Galactic plane survey: |b| < 5°, 32° < l < 77°, 400 s dwell times • Intermediate latitude survey: 5° < |b| < 15° for MSPs, NS-NS • Reach edge of Galactic population for much of luminosity function • High sensitivity to millisecond pulsars (dedispersion) • Dmax = 2 to 3 times greater than for Parkes MB • Sensitivity to transient sources (algorithms) Data management: • Keep all raw data (~ 1 Petabyte after 5 years) at the Cornell Theory Center (CISE grant: $1.8M) • Database of raw data, data products, end products • Web-based tools for Linux-Windows interface (MySQL, SQL Server) • VO linkage (in future)
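To put the archive size in perspective, the average sustained data rate implied by about 1 Petabyte in 5 years can be estimated as below. This is a rough sketch assuming 1 PB = 10^15 bytes and averaging over wall-clock time; the instantaneous rate while observing would be higher, and the function name is illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 86400.0

def mean_rate_bytes_per_s(total_bytes, years):
    """Average rate needed to accumulate total_bytes over the given number of years."""
    return total_bytes / (years * SECONDS_PER_YEAR)

# ~1 PB accumulated over 5 years (1 PB taken as 1e15 bytes, an assumption)
rate = mean_rate_bytes_per_s(1e15, 5.0)
print(f"{rate / 1e6:.1f} MB/s averaged over 5 years")
```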
Surveys with Parkes, Arecibo & GBT: simulated & actual. Yield ~ 1000 pulsars.
First Results 11 new pulsars: 1 binary, 1 w/ gamma-ray counterpart, 1 transient source ApJ 20 Jan 2006 ApJ Mar 2006 Exploited database of Parkes Multibeam survey + multiobservatory followup
A pulsar found through its single-pulse emission, not its periodicity (cf. Crab giant pulses). Algorithm: matched filtering in the DM-t plane. ALFA’s 7 beams provide powerful discrimination between celestial and RFI transients.
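A single-pulse search of this kind can be sketched for one DM trial as follows. This is a minimal illustration, not the PALFA pipeline: it approximates the matched filter with boxcars of several widths and reports the best signal-to-noise ratio. The function names, the width set, and the synthetic test pulse are all assumptions.

```python
import random
import statistics

def boxcar_snr(series, width):
    """S/N of a running boxcar sum of `width` samples, one value per start index."""
    mean = statistics.fmean(series)
    sigma = statistics.pstdev(series)
    norm = sigma * width ** 0.5          # noise in a sum of `width` samples
    csum = [0.0]                         # cumulative sum of mean-subtracted data
    for x in series:
        csum.append(csum[-1] + (x - mean))
    return [(csum[i + width] - csum[i]) / norm
            for i in range(len(series) - width + 1)]

def best_single_pulse(series, widths=(1, 2, 4, 8, 16)):
    """Return (snr, start_index, width) of the strongest boxcar detection."""
    best = (0.0, None, None)
    for w in widths:
        snrs = boxcar_snr(series, w)
        i = max(range(len(snrs)), key=snrs.__getitem__)
        if snrs[i] > best[0]:
            best = (snrs[i], i, w)
    return best

# Synthetic dedispersed time series: unit Gaussian noise plus one 8-sample pulse
random.seed(1)
series = [random.gauss(0.0, 1.0) for _ in range(2000)]
for k in range(1000, 1008):
    series[k] += 5.0

snr, start, width = best_single_pulse(series)
print(f"S/N = {snr:.1f} at sample {start}, boxcar width {width}")
```

Repeating this over all DM trials gives candidate events in the DM-t plane; requiring that an event appear in only one or two of the 7 beams is one way to reject RFI, which tends to appear in all beams at once.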
Nature, in press • 11 sources found in reanalysis of Parkes MB survey • missed in periodicity search • Pulse rates ~ 0.3 to 20 pulses hr^-1 • Extreme cases of pulse nulling? • Implied Galactic population > “normal” pulsar population (i.e. ~ 2 x 10^5 objects)
Discovery: Fast Astrophysical Signals • Require source to be very compact • size < light travel time → emission regions in stellar atmospheres or very compact stars (neutron star or black hole) or coherent emitters from ETI? • Useful as probes of intervening media • A full inventory of the radio sky is yet to be done • Parameter space coverage relies on advances in computational hardware and software tools • Surprises are the order of the day
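The compactness argument is simple arithmetic: a signal that varies on a timescale dt cannot come from a region much larger than c x dt. The sketch below evaluates that bound for a few timescales; it ignores relativistic beaming, which can relax the limit, and the function name is illustrative.

```python
C_M_S = 299_792_458.0  # speed of light, m/s

def max_source_size_m(timescale_s):
    """Light-travel-time upper bound on emitting-region size: size < c * dt."""
    return C_M_S * timescale_s

for label, dt in [("1 ns", 1e-9), ("1 us", 1e-6), ("1 ms", 1e-3)]:
    print(f"{label}: size < {max_source_size_m(dt):.3g} m")
```

A nanosecond-scale feature implies a region smaller than about 0.3 m, and a millisecond pulse a region smaller than about 300 km.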
Data Flow and Processing [flow diagram; surviving labels: Reobserve list, Web access tools]
Processing Basic Data Units • Traditional: • a narrow search for periodic, dispersed signals • blunt instrument approach to RFI: discard data • Our approach: • seek a wide range of signal types • classify all non-noise events and signals, whether celestial or terrestrial, whether natural or artificial • Enabling infrastructure: • access to large volumes of raw data • high throughput processing • global analyses of data products through custom database
[Figure: frequency vs. time plot; surviving labels: P, RFI, Interstellar Scintillation] Astrophysical effects are typically buried in noise and RFI
Student Involvement (in addition to traditional data analysis, interpretation, and paper writing) End-to-end participation: • Survey observations • On site and remotely • Quicklook pulsar analysis • Linux cluster + search algorithms (degraded resolution) • Data integrity + immediate gratification (21 discoveries) • Database development • MySQL db at Arecibo for survey logistics • SQL Server db at CTC for data products/mining • Data processing and algorithm development at off-island processing sites • RFI mitigation (excision in frequency-time plane) • Acceleration searches for binary pulsars • Follow-up studies (timing, imaging, multi-observatory) • Web services, PHP interfaces to NVO • Participation of EALFA, PALFA students in NVO summer schools, posters at AAS next week
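One simple form of RFI excision in the frequency-time plane is to flag frequency channels whose average power is a strong outlier. The sketch below uses a median and median-absolute-deviation test. It is an illustrative baseline rather than the consortium's method; the function name, threshold, and synthetic data are assumptions.

```python
import statistics

def flag_rfi_channels(dynspec, threshold=5.0):
    """Return indices of channels whose mean power is a robust outlier.

    dynspec[channel][time_sample] holds detected power.
    """
    chan_power = [statistics.fmean(row) for row in dynspec]
    med = statistics.median(chan_power)
    mad = statistics.median(abs(p - med) for p in chan_power)
    scale = 1.4826 * mad or 1e-12   # MAD scaled to a Gaussian sigma; guard against zero
    return [i for i, p in enumerate(chan_power)
            if abs(p - med) / scale > threshold]

# Synthetic dynamic spectrum: 16 channels x 64 samples with small deterministic
# ripple standing in for noise, plus narrowband interference in channel 5.
dynspec = [[1.0 + 0.01 * ((7 * c + 3 * t) % 5) for t in range(64)]
           for c in range(16)]
dynspec[5] = [v + 10.0 for v in dynspec[5]]

flagged = flag_rfi_channels(dynspec)
print(flagged)  # → [5]
```

The same test applied along the time axis would catch broadband impulsive interference; flagged cells can then be replaced by the median rather than discarding the whole data set.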
Summary Points • ALFA: a 7-beam system that enables large-scale surveys at Arecibo • Legacy data products for Galactic and extragalactic science • Survey planning, remote observing, data analysis, algorithm development, database development, and discovery engage students in radio astronomy in new ways • Computational and storage facilities allow • More comprehensive and open-ended analyses • More ambitious data archival (1 Petabyte of raw data) • Better tools for data mining (short term, long term) • Data management/VO: an area of interest for Commission J!
Extra Slides
Data Mining the Astrophysical Sky • Observation Space: • Data model: stochastic signals time scales: ns to years noise with known statistics
Why do more pulsar surveys? • Astrophysics of neutron stars • A recent important discovery (a NS-NS binary) • What can Arecibo/ALFA do better? • International research groups • Student involvement • Undergraduate students at Cornell • Graduate students at Cornell • A620: Large-scale Surveys in Radio Astronomy (Spring 2004) • http://astro.cornell.edu/academic/courses/astro620/ • Multi-wavelength astrophysical community
Binary Pulsars [Diagram; surviving labels: to Earth, pulsar, companion star]
Pulsar Periodicity Search: FFT each DM’s time series [Figure; panels: frequency vs. time, DM vs. time, and |FFT(f)| with harmonics at 1/P, 2/P, 3/P]
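The periodicity step can be sketched for a single DM trial as below: Fourier transform the time series, then sum the amplitudes at a trial fundamental and its harmonics (1/P, 2/P, 3/P) and keep the strongest trial. This is a simplified stdlib-only illustration with a small recursive FFT. The function names and synthetic pulse train are assumptions, and production searches use more careful harmonic summing and noise normalization.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def harmonic_sum_search(series, n_harm=3):
    """Return the fundamental Fourier bin with the largest summed harmonic amplitude."""
    n = len(series)
    mean = sum(series) / n
    amp = [abs(c) for c in fft([v - mean for v in series])]
    best_bin, best_val = 0, 0.0
    for k in range(1, (n // 2) // n_harm):
        total = sum(amp[h * k] for h in range(1, n_harm + 1))
        if total > best_val:
            best_bin, best_val = k, total
    return best_bin

# Synthetic dedispersed series: 4-sample-wide pulses every 32 samples
n_samples = 1024
series = [1.0 if i % 32 < 4 else 0.0 for i in range(n_samples)]

best_bin = harmonic_sum_search(series)
period_samples = n_samples / best_bin
print(best_bin, period_samples)  # → 32 32.0
```

Summing harmonics helps because a narrow pulse spreads its power over many harmonics, so the fundamental alone understates the significance of the detection.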
The First ALFA Pulsar
Data Management • Sampling the radio sky • Real-time and average data rates • Survey algorithms • Approximations to matched filtering • Dedispersion + FFT + harmonic summing + threshold tests • Single-pulse searching • Binary orbital effects • Processing times • Intermediate data products
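The dedispersion step listed above can be sketched as a shift-and-sum over frequency channels. The example assumes the standard cold-plasma delay, about 4.15 ms x DM x (f^-2 - f_ref^-2) with f in GHz, and uses the 1225-1525 MHz ALFA band from earlier in the talk. The channel count, sampling time, DM value, and function names are illustrative assumptions.

```python
K_DM_MS = 4.148808  # dispersion constant, ms GHz^2 cm^3 / pc

def dispersion_delay_ms(dm, freq_ghz, ref_ghz):
    """Arrival delay at freq_ghz relative to ref_ghz for dispersion measure dm."""
    return K_DM_MS * dm * (freq_ghz ** -2 - ref_ghz ** -2)

def dedisperse(dynspec, freqs_ghz, dm, tsamp_ms):
    """Shift each channel by its delay relative to the top frequency, then sum."""
    ref = max(freqs_ghz)
    n_t = len(dynspec[0])
    out = [0.0] * n_t
    for row, f in zip(dynspec, freqs_ghz):
        shift = round(dispersion_delay_ms(dm, f, ref) / tsamp_ms)
        for t in range(n_t - shift):
            out[t] += row[t + shift]
    return out

# Synthetic data: 8 channels across 1.225-1.525 GHz, 1 ms sampling,
# one pulse at sample 50 in the top channel, dispersed with DM = 100 pc cm^-3.
freqs = [1.225 + 0.3 * i / 7 for i in range(8)]
dm_true, tsamp, t0, n_t = 100.0, 1.0, 50, 256
dynspec = []
for f in freqs:
    row = [0.0] * n_t
    row[t0 + round(dispersion_delay_ms(dm_true, f, max(freqs)) / tsamp)] = 1.0
    dynspec.append(row)

band_delay = dispersion_delay_ms(dm_true, 1.225, 1.525)
aligned = dedisperse(dynspec, freqs, dm_true, tsamp)
smeared = dedisperse(dynspec, freqs, 0.0, tsamp)
print(f"delay across band: {band_delay:.1f} ms")
print(aligned.index(max(aligned)), max(aligned), max(smeared))
```

At the correct DM all eight channels add at the same sample, while at DM = 0 the pulse stays smeared over about 98 ms; a survey repeats this over a grid of trial DMs and passes each resulting time series to the periodicity and single-pulse searches.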
ALFA Survey Projects • Extragalactic • ALFALFA drift scans of entire Arecibo sky • Deep Survey • AGES • ZOA zone of avoidance galaxies • Galactic • HI Survey • GALFACTS (G-ALFA Continuum Transit Survey) • Pulsar • Galactic plane survey
ALFALFA Science Goals * A Legacy Survey: HI in the Nearby Universe * The HI Mass Function and the "Missing Satellite Problem" * Galaxy Evolution and Dynamics within Local Large Scale Structure * The Extent and Origin of HI Disks * The Nature of High Velocity Clouds * A Blind Survey for 21 cm Absorbers at z < 0.06 * A Blind Survey for OH Megamasers at 0.16 < z < 0.25 * Comparison with other surveys