Gamma-ray Large Area Space Telescope

Gamma-ray Large Area Space Telescope Application of a multidimensional wavelet denoising algorithm for the detection and characterization of astrophysical sources of g-rays S. W. Digel Stanford Linear Accelerator Center B. Zhang Quantitative Image Analysis Unit, Institut Pasteur J. Chiang Joint Center for Astrophysics/Physics Department, University of Maryland, BaltimoreCounty J. M. Fadili GREYC CNRS UMR 6072, Image Processing Group J.-L. Starck Service d'Astrophysique CEA/Saclay

Outline • What is the application? • GLAST Large Area Telescope observations of the high-energy gamma-ray sky • Characteristics of the sources and of the data • Denoising via wavelet domain hypothesis testing

e– e+ GLAST Large Area Telescope (LAT) • Within its first few weeks, the LAT will double the number of celestial gamma rays ever detected • 5-year design life, goal of 10 years Spectrum Astro 1.8 m Tracker ACD 3000 kg Calorimeter

Some relevant details • Response functions • Angular resolution and collecting area are great for a g-ray telescope, but poor for an astronomical instrument • Observing strategy • FOV is huge (>2 sr), and the earth is a bright background • For various reasons GLAST will be in low earth orbit – so scan and rock while pointing away from the earth • Astronomical analyses are underpinned by event reconstruction and classification, which we will take for granted here as being done optimally well

Gamma-ray sky • High-energy gamma rays are produced by interactions of high-energy particles (surprise), accelerated in a remarkable variety of ways • Blazars, Pulsars, Plerions, SNR, OB star associations, microquasars, Gamma-Ray Bursts are known or plausible classes of sources • The universe is basically transparent to gamma rays in the range up to ~10s of GeV • EGRET detected gamma rays from sources as close as the moon and as distant as redshifts of a few • Unlike cosmic rays, gamma rays point back to their sources M87 jet (STScI) Crab pulsar & nebula (CXC)

Gamma-ray sky (2) • Cosmic-ray interactions with the interstellar medium and interstellar radiation field make the Milky Way itself a bright, structured celestial foreground • The gamma-ray sky is far from quiescent; blazars are intrinsically remarkably variable • Many EGRET blazars were detected only once, while flaring EGRET (>100 MeV)

More on gamma-ray sky • Celestial fluxes are low & constraints (financial and physical) severely limit the collecting area and angular resolution • Gamma-ray astronomy is quite expensive – GLAST should be a relative bargain in this field at ~50 cents per celestial gamma ray • Source detection and characterization will require careful analysis >1 GeV in Virgo cluster EGRET data LAT simulation

The Problem • We’d like to detect all of the celestial sources in the data, and to recognize when a blazar [or member of another other class of source] are flaring (and quite likely being detected for the first time) • Some of this is real time searching (keeping up with ~6 downlinks and 8 sky coverages per day) – although the very brightest flares worth alerting other astronomers about are easiest to detect • Some is wanting to be careful not to miss a blazar that turns on for a week or two • We do not know in advance where the sources are that we will detect • For blazars projections are that we will detect several thousand, i.e., a few times more than are known at other wavelengths

More on the problem • In high-energy gamma-ray astronomy, the standard analysis technique has been model fitting, likelihood analysis to detect and characterize sources • Searching for new sources with likelihood analysis is slow, brute force • Also does not extrapolate well to finding extended sources • The utility of a good, ideally nonparametric method for initially detecting and characterizing gamma-ray sources in LAT data is clear • Within the LAT collaboration, several algorithms have been investigated at some level • Optimal filtering, aperture photometry, and wavelet-based (followed by source detection step)

E X Y Wavelet domain hypothesis testing • From Zhang, Fadili, & Starck et al. (2005) • For this analysis, the data are treated as 2D+1D, maps of photon counts in planes with different energy ranges • Photon counts v(x, y, E), and typically 0-few counts per pixel/bin • Indexing the pixels/bins (I the index set) • vi’s are iid Poisson is the underlying image (intensity × exposure).

Requirements of the method • Get an estimate of the intensity using statistical estimation/decision theory in the wavelet domain • Must be efficient enough in very low-count setting • Regularity preservation (e.g., isotropic structures) • Flux preservation

Wavelet Poisson intensity estimation: Overview • Transformation-based methods • Variance stabilizing transforms (e.g. Anscombe) [Donoho 93] • Gaussianizing transforms (Fisz) [Frylewizc 04][Fadili 04] • Classical direct methods • Wavelet histogram auto-convolution [Bijaoui et al. 01] • l1 penalized estimator [Sardy et al. 03] • Adaptive Wiener filter [Nowak et al. 99] • Modulation estimators [Antoniadis et al. 01] • Bayesian direct methods • Bayesian mutliscale models [Kolaczyk 99,04] • Mixture-based prior [Timmerman 99],[Lu 04] • Hypothesis testing methods [Kolaczyk 99][Bo et al. 04]

Formulation of the method • The background intensity, if not known a priori, is assumed to be locally constant (in the wavelet support). • Can be estimated from the approximation coefficients at a coarser scale. • Define the binary hypothesis testing for each wavelet coefficient w: • H0 : w is consistent with the background; • H1 : w is inconsistent with the background. • H0 coefficients (consistency with the background with statistical fluctuations) are zeroed • H1 coefficients (significant change from the background) are retained • This is a controlled (via the p-value) hard thresholding scheme.

Hypothesis testing of individual coefficients • The coefficients are tested separately without correction for multiple comparisons. The probability of false detection is upper bounded by p: • If the pdf under the null is symmetric around the (unique) mode at 0, the double-sided critical threshold at significance level p is • For computational tractability, we use the Haar wavelet, thanks to the availability of the close form for its pdf • Zhang et al. also show that the same thresholds can be applied to the coefficients of biorthogonal Haar wavelets, giving an excellent regularity preservation.

Deriving the thresholds • The pdf of Haar coefficients follows a Non-Central Chi-Square distribution • Using the Fisher normal approximation, threshold can be derived for every Gaussian significance level • It has a higher level of detection power • This decision threshold is found to be sensitive to low-count situations • We have extended it to non-constant, or unknown a priori, background intensity

Extension to the energy dimension • A 2D bi-orthogonal haar (BH) transform is applied spatially to each image at energy band i • To each BH spatial coefficient, a 1D BH transform is applied along the energy axis • Apply the hypothesis testing estimator to the 2D+1D BH coeffs • Apply inverse 1D BH transform along energy axis • Apply inverse 2D BH spatially to get the final estimated multi-spectral data • Simulations show that the method offers good flux preservation invarious count settings

Application to simulated LAT data • Power-law spectra with range of fluxes (factors of sqrt(2) from 1 x 10-8 cm-2 s-1 ) and spectral indexes (1.6, 1.8, 2, 2.2, 2.4), one month scanning sky survey, counts summed over energy Point sources + background Point sources only After denoising

More examples • Fluxes as before, hard spectral breaks at 500, 1000, 2000, 4000, 8000 MeV Point sources + background After denoising

More examples (2) • One-month observation at high Galactic latitudes, source flux 5 x 10-8 cm-2 s-1

Profiles in latitude More examples (3) • Detecting 5 x 10-8 cm-2 s-1 sources against the Milky Way Point sources + Galactic + extragalactic diffuse After denoising, with model for diffuse emission supplied After denoising

More examples (4) • Recovery of spectra looks possible • A source from the previous simulation Expected E-1 slope in counts per bin

Summary • Motivations for a reliable nonparametric source detection algorithm to apply to GLAST LAT data are clear • Especially for the relatively short time ranges over which we will want to study sources, the data will be squarely in the low counts regime with widely varying response functions and significant celestial foregrounds • The hypothesis testing wavelet denoising algorithm of Zhang et al. has firm statistical underpinnings, is demonstrably sensitive relative to other wavelet approaches, and allows spectra to be recovered

Gamma-ray Large Area Space Telescope