Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST. PHYSTAT2003 S. Digel (Stanford Univ./HEPL). 10 September 2003. Outline. Introduction Gamma-ray astro- & astroparticle physics Important points about gamma-ray astronomy GLAST mission
Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST PHYSTAT2003 S. Digel (Stanford Univ./HEPL) 10 September 2003
Outline • Introduction • Gamma-ray astro- & astroparticle physics • Important points about gamma-ray astronomy • GLAST mission • LAT instrument design and nature of the data • LAT in perspective • Analysis needs from low to high level • Statistical issues • Some current approaches
Motivation: Wealth of astro- and astroparticle physics • Extragalactic • Blazars – most of their luminosity is in gamma rays • Other active galaxies – Centaurus A • Galaxy clusters? • Isotropic emission? • Gamma-ray bursts • In the Milky Way • Pulsars, binary pulsars, millisecond pulsars, plerions • Microquasars, microblazars • Supernova remnants, OB/WR associations, black holes? • Diffuse – cosmic rays interacting with interstellar gas and photons • WIMP annihilation? • Solar flares • Common theme (except for WIMPs): nonthermal emission, particle acceleration in jets and shocks
[Figures: M87 jet (STScI); Crab pulsar & nebula (CXC)]
Some important points about gamma-ray astronomy • In the range up to ~50 GeV, the detector must be in space • In terms of the particle background, mass & power limitations, cost, review committees, etc., space is the last place you want to put it • Among other compromises, the collecting area and data rate are limited
[Figure caption: You need one of these]
Important points (2) • The angular response is really bad (for physics reasons) • On the other hand the field of view is truly enormous (the detector is not really a telescope) • Celestial fluxes are low (except for impulsive GRBs) • Photon number fluxes typically ~E^-2 • The Milky Way is a bright, pervasive foreground • ~10% of flux at low latitudes is from point sources
[Figure labels: PSF; FOV; γ-ray rates in LAT]
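To make the ~E^-2 number spectrum concrete, the sketch below samples a truncated power law by inverse-CDF and compares the sampled fraction of photons above 1 GeV with the analytic value. The energy band (100 MeV to 300 GeV), the exact index of 2, and the function names are illustrative assumptions, not LAT specifications.

```python
import random

def sample_power_law(e_min, e_max, index, rng):
    """Draw one energy from dN/dE proportional to E**(-index) on [e_min, e_max] (index != 1)."""
    a = 1.0 - index
    u = rng.random()
    # Inverse-CDF sampling of a truncated power law.
    return (e_min**a + u * (e_max**a - e_min**a)) ** (1.0 / a)

def fraction_above(e_cut, e_min, e_max, index):
    """Analytic fraction of photons in [e_min, e_max] with energy above e_cut."""
    a = 1.0 - index
    return (e_max**a - e_cut**a) / (e_max**a - e_min**a)

# Assumed, illustrative band in MeV: 100 MeV to 300 GeV, index 2.
rng = random.Random(42)
energies = [sample_power_law(100.0, 3.0e5, 2.0, rng) for _ in range(20000)]
sampled_fraction = sum(e > 1000.0 for e in energies) / len(energies)
analytic_fraction = fraction_above(1000.0, 100.0, 3.0e5, 2.0)
```

Under these assumptions only about one photon in ten above 100 MeV also lies above 1 GeV, which is one reason the high-energy sky is so sparsely sampled.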
Large Area Telescope on GLAST • 20 MeV to >300 GeV • Launch in late 2006 • 5-year design life (10-year goal)
[Figure credit: Spectrum Astro]
Design of the LAT for gamma-ray detection • Tracker: 18 XY tracking planes with interleaved W conversion foils. Single-sided silicon strip detectors (228 μm pitch). Measure the photon direction; gamma ID. • Calorimeter: 1536 CsI(Tl) crystals in 8 layers; PIN photodiode readouts. Image the shower to measure the photon energy. • Anticoincidence Detector (ACD): 89 plastic scintillator tiles. Reject background of charged cosmic rays; segmentation limits self-veto at high energy. • Electronics System: Includes flexible, robust hardware trigger and software filters.
[Figure labels: e–, e+; Tracker; ACD; Calorimeter]
LAT in perspective • Within its first few weeks, the LAT will double the number of celestial gamma rays ever detected
The Gamma-Ray Sky
[Figure panels: EGRET (>100 MeV); Simulated LAT (>100 MeV, 1 yr); Simulated LAT (>1 GeV, 1 yr)]
Nature of the LAT Data • Events are readouts of TKR hits, TOT, ACD tiles, and CAL crystal energy depositions, along with time, position, and orientation of the LAT • Intense charged particle background & limited bandwidth for telemetry → data are extremely filtered • ~3 kHz trigger rate → 30 Hz filtered event rate, ~3 Gbyte/day raw data, ~2 × 10^5 gamma rays/day
[Figure credit: T. Usher]
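The rates quoted above can be cross-checked with simple arithmetic. This is only a back-of-the-envelope sketch: it assumes 1 Gbyte = 10^9 bytes and that the raw data volume is dominated by the filtered events, neither of which is stated on the slide.

```python
SECONDS_PER_DAY = 86_400

trigger_rate_hz = 3_000.0   # ~3 kHz hardware trigger rate (from the slide)
filtered_rate_hz = 30.0     # ~30 Hz after onboard filtering (from the slide)
raw_bytes_per_day = 3e9     # ~3 Gbyte/day, taken as 3e9 bytes (assumption)
gammas_per_day = 2e5        # ~2 x 10^5 gamma rays/day (from the slide)

# Events surviving the onboard filter in one day.
filtered_events_per_day = filtered_rate_hz * SECONDS_PER_DAY
# Factor by which the filter reduces the trigger rate.
rejection_factor = trigger_rate_hz / filtered_rate_hz
# Average raw size per downlinked event, if filtered events dominate the volume.
bytes_per_event = raw_bytes_per_day / filtered_events_per_day
# Share of downlinked events that are celestial gamma rays.
gamma_fraction = gammas_per_day / filtered_events_per_day
```

With these inputs the onboard filter rejects about 99% of triggers, and fewer than one in ten downlinked events is a gamma ray, so most of the background rejection still has to happen on the ground.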
Analysis needs • Reconstruction and classification of events • Charged particles vs. gamma rays • Quality of reconstruction of energy, direction • Detection and characterization of celestial sources of gamma rays • Locations, spectra, variability & transient alerts, angular extents • Identification of sources & population studies • Counterparts and correlations
[Arrow label: Increasing level]
Reconstruction of events • Pattern recognition • Starting with clusters of hits in TKR, find straightest, longest e± tracks using a combinatorial (brute force) approach • Track fitting via Kalman filtering • Multiple scattering is not Gaussian • Iterative with energy reconstruction from CAL • Vertexing • Find the conversion point (for gamma rays) and energy/direction • Issues: I’d guess they are in hand. Much experience with track finding algorithms in the collaboration. Would like better energy estimates from scattering.
[Figure credit: Jones & Tompkins (1998)]
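As a rough illustration of track fitting via Kalman filtering, the sketch below filters a straight track in one projection, with state (position, slope). It is not the LAT reconstruction code: the unit plane spacing, the resolution values, the broad prior on the slope, and the Gaussian treatment of multiple scattering (which the slide notes is not strictly valid) are all simplifying assumptions.

```python
def kalman_track_fit(z, y_meas, sigma_meas, sigma_scatter):
    """Filter a straight track in one projection; returns (position, slope) at the last plane."""
    y, slope = y_meas[0], 0.0
    # Covariance entries: var(y), cov(y, slope), var(slope).
    p_yy, p_ys, p_ss = sigma_meas**2, 0.0, 1.0  # broad, assumed prior on the slope
    for k in range(1, len(z)):
        dz = z[k] - z[k - 1]
        # Predict: straight-line transport to the next plane.
        y += slope * dz
        p_yy = p_yy + 2.0 * dz * p_ys + dz * dz * p_ss
        p_ys = p_ys + dz * p_ss
        # Scattering (Gaussian approximation) inflates the slope variance.
        p_ss = p_ss + sigma_scatter**2
        # Update with the position measured at this plane.
        residual = y_meas[k] - y
        s_var = p_yy + sigma_meas**2
        gain_y, gain_s = p_yy / s_var, p_ys / s_var
        y += gain_y * residual
        slope += gain_s * residual
        p_ss = p_ss - gain_s * p_ys
        p_ys = (1.0 - gain_y) * p_ys
        p_yy = (1.0 - gain_y) * p_yy
    return y, slope

# Hypothetical noise-free track through 18 planes: y = 1.0 + 0.5 * z.
z_planes = list(range(18))
hits = [1.0 + 0.5 * zk for zk in z_planes]
fit_y, fit_slope = kalman_track_fit(z_planes, hits, 0.01, 0.001)
```

Because the scattering variance scales with energy, a real fit has to iterate with the calorimeter energy estimate, as the slide notes.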
Classification of events • Classification trees for PSF & energy ‘pruning’ and charged particle rejection (W. Atwood) • Trained with Monte Carlo data • Must provide useful inputs; can’t make the tree do all the work • In LAT case, this has meant ‘flattening’ inputs to factor out general trends with energy and inclination angle. • Outputs are probabilities, e.g., of good energy measurements • Issues (general): Exploring relevant inputs, optimizing classification without tuning to the training data sets
[Figure: A ‘tree’ represented in Insightful Miner (W. Atwood)]
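The basic step a classification tree repeats at every node can be sketched as a search for the threshold that best separates two labelled classes. The version below uses Gini impurity on a single input; the splitting criterion, the function names, and the toy data are assumptions for illustration and do not describe the trees actually used for the LAT.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for a pure or empty node)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single cut on one input."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical values
        threshold = 0.5 * (pairs[i][0] + pairs[i - 1][0])
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Toy Monte Carlo sample: a hypothetical 'flattened' input and truth labels
# (1 = well-reconstructed gamma ray, 0 = background).
split_threshold, split_impurity = best_split([1, 2, 3, 4], [0, 0, 1, 1])
```

A full tree applies this search recursively to each child node; the class fractions in the final leaves give the output probabilities.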
Higher-level analysis: source detection and characterization • Low fluxes, pervasive celestial diffuse emission, and limited angular resolution drive the analysis to model fitting • The detector is characterized by its response functions • PSF, energy resolution, and effective collecting area • They depend on incident direction, energy, plane of conversion, etc. • Derived from beam tests and the detailed instrument simulation
Detection & characterization (2) • The Milky Way is the strongest ‘source’. • Many point sources are transient and detected over a few weeks only
[Figure: EGRET (>100 MeV); 3EG catalog (Hartman et al. 1999)]
Detection and characterization (3) • Models are straightforward to define – radiative transfer is simple • Data-space version not as simple, but manageable • Likelihood analysis is widely used in γ-ray astronomy & we plan to use it as the standard high-level analysis tool for LAT data • Introduced by Pollock et al. (1981) for analysis of COS-B data, also used extensively for EGRET analysis.
…Specific issues for the LAT analysis • Computation of the likelihood function • Sensible level of detail in the high-level response functions – not too much and not too little • Binned vs. unbinned analysis, multidimensional normalization (aka exposure) • Practical optimization of multiparameter models & likelihood analysis tool for general use • Scanning observations – smearing of sources • Albedo cuts, residual charged-particle background • Systematic errors?
[Figure (credit: Carnahan): Earth is not small and we can’t see through it]
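For reference, the binned and unbinned forms of the Poisson log-likelihood can be written as follows. The notation is an assumption introduced here: n_j and m_j are observed and predicted counts in bin j, M(x; θ) is the predicted event density in the space of measured quantities x (the source model folded through the response functions), and N_pred is its integral, which is where the multidimensional normalization (exposure) enters.

```latex
% Binned: n_j observed counts, m_j(\theta) predicted counts in bin j
\ln L_{\mathrm{binned}}(\theta)
  = \sum_j \left[ n_j \ln m_j(\theta) - m_j(\theta) - \ln n_j! \right]

% Unbinned: limit of bins so small that each holds 0 or 1 events
\ln L_{\mathrm{unbinned}}(\theta)
  = \sum_{i=1}^{N} \ln M(x_i;\theta) - N_{\mathrm{pred}}(\theta),
\qquad
N_{\mathrm{pred}}(\theta) = \int M(x;\theta)\,\mathrm{d}x
```

The unbinned form avoids losing information to bin edges, but it requires evaluating the response at every event and integrating the model over the full data space.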
…Specific issues for the LAT analysis (2) • Interpretation of the unbinned likelihood function in likelihood ratio tests • Protassov et al. (2002) reminder about LRTs not being valid for determining the number of components in a finite mixture model, i.e., evaluating whether a source is present • Can we cover ourselves using simulations?
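One way to "cover ourselves using simulations" is to build the null distribution of the test statistic by Monte Carlo instead of relying on an asymptotic chi-square form. The sketch below does this for a single counting bin with known background; the single-bin setup, the Knuth Poisson sampler, and all names are simplifying assumptions.

```python
import math
import random

def poisson_draw(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def ts_single_bin(n, b):
    """Likelihood-ratio TS for a source with amplitude >= 0 on known background b."""
    if n <= b:
        return 0.0  # best-fit amplitude is pinned at the boundary (zero)
    return 2.0 * (n * math.log(n / b) - (n - b))

def null_ts_distribution(b, n_sims, seed=0):
    """Simulate background-only data sets and return their sorted TS values."""
    rng = random.Random(seed)
    return sorted(ts_single_bin(poisson_draw(b, rng), b) for _ in range(n_sims))

def empirical_p_value(ts_obs, null_ts):
    """Fraction of null simulations at least as extreme, with the +1 correction."""
    exceed = sum(1 for t in null_ts if t >= ts_obs)
    return (exceed + 1) / (len(null_ts) + 1)

null_ts = null_ts_distribution(10.0, 2000, seed=1)
zero_fraction = sum(1 for t in null_ts if t == 0.0) / len(null_ts)
```

A large share of the simulated background-only data sets give TS = 0 because the source amplitude cannot go negative. That pile-up at the boundary is the kind of departure from a plain chi-square distribution that the Protassov et al. caution concerns.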
Example of why it matters: Galactic Center (3EG J1746-2851) • Recent re-analysis of EGRET data • Unbinned to use detailed response functions • If the new analysis is actually better, the source is now not coincident with the Galactic center itself • Many plausible candidates exist, even if dark matter annihilation may no longer be one of them
[Figure labels: >5 GeV γ-rays; 3EG source confidence region; Arches cluster (~150 O stars); Sgr A*; Sgr A East; 20 cm radio continuum; CS (2-1) line. Credits: Hooper & Dingus (2002); Yusef-Zadeh (2002)]
Higher-level characterization: Variability • A common characteristic, especially in extragalactic sources, but has been hard to study • LAT will at least have much better sampling in time, and in-flight monitoring of calibration should be much easier
[Figure: McLaughlin et al. (1996), EGRET >100 MeV fluxes; panels: Pulsar; Blazar; Unident. (variable); Unident. (steady)]
Variability (2) • At least 3 statistics have been published; interpretations are not always consistent • Differences in how to incorporate upper limits • see Nolan et al. (2003) • Issues: Variability index useful for classification, a useful ‘trigger’ for issuing alerts
[Figure: Variability measures for unid. sources compared, Reimer (2001)]
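A generic starting point for a variability measure is the chi-square of the measured fluxes against their weighted mean. The sketch below is not a reproduction of any of the published statistics referred to above, and it deliberately ignores upper limits, which is where those statistics differ; the function name is hypothetical.

```python
def chi2_constant_flux(fluxes, errors):
    """Chi-square of flux measurements against the best-fit constant flux.

    Returns (weighted mean flux, chi-square, degrees of freedom).
    Upper limits are not handled in this sketch.
    """
    weights = [1.0 / e**2 for e in errors]
    mean = sum(w * f for w, f in zip(weights, fluxes)) / sum(weights)
    chi2 = sum(w * (f - mean) ** 2 for w, f in zip(weights, fluxes))
    return mean, chi2, len(fluxes) - 1

# Hypothetical fluxes from two viewing periods, in arbitrary units.
mean_flux, chi2, dof = chi2_constant_flux([1.0, 3.0], [1.0, 1.0])
```

A chi-square that is large for its number of degrees of freedom points to variability, but only to the extent that the quoted errors are Gaussian and complete.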
Variability (3) • Issues for Gamma-Ray Bursts • Analysis issues – extensively explored – are for time series, e.g., pulse decomposition • LAT analysis will not be BATSE-like involving count rates and background subtraction • Deadtime will be an issue for the most interesting bursts
[Figure: Distribution of times between gamma rays for the 20th brightest GRB per year; label: LAT limit; credit: Norris & Bonnell]
Variability: Periodic sources • Rotation-powered pulsars • Established methods exist to find upper limits on pulsation (with and without ephemerides) • Some implicitly assume a profile shape • Blind searches (some pulsars are radio quiet) • Problem: no template for pulse profile • Various statistical methods have been developed: • Epoch folding, FFT, Gregory & Loredo (1992-96) Bayesian • Also need to search position, period, period derivative and hope for no glitches • Issues: Probably in good hands
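Epoch folding, the first of the methods listed, can be sketched in a few lines: fold the photon arrival times at a trial period, histogram the phases, and compute Pearson's chi-square against a flat profile. The bin count and the toy arrival times are assumptions, and a real search would also scan period derivative and position, as noted above.

```python
def epoch_folding_chi2(times, period, n_bins=10):
    """Pearson chi-square of the folded phase histogram against a flat profile."""
    counts = [0] * n_bins
    for t in times:
        phase = (t / period) % 1.0
        counts[min(int(phase * n_bins), n_bins - 1)] += 1
    expected = len(times) / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)

# Toy data: 100 photons all arriving at the same phase of a 1 s period,
# and 100 photons spread evenly in phase.
pulsed_times = [k + 0.05 for k in range(100)]
flat_times = [i * 0.1 + 0.05 for i in range(100)]
chi2_pulsed = epoch_folding_chi2(pulsed_times, 1.0)
chi2_flat = epoch_folding_chi2(flat_times, 1.0)
```

In a blind search the chi-square is evaluated over a grid of trial periods, so the significance of the largest value must account for the number of trials.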
Beyond model fitting: Non-parametric analysis • Likelihood will answer only the questions that you ask • An ideal nonparametric analysis method would • Characterize extended sources & obviate need for a detailed model of the Milky Way • Do it quickly • Many methods are in use in astronomy • Wavelet (platelet, wedgelet) approaches for image analysis or time series (‘denoising’, source detection – including extended sources) • Multiscale analyses (wavelet transform or platelet image decomposition), with a prescription for deciding what terms are worth keeping (e.g., Willett & Nowak 2002 define the ‘penalized likelihood function’); ICA? • Issues: Interpretation of results (statistical significances); incorporating the detailed response functions
[Figure: Prototype CWT Analysis, EGRET (>100 MeV), Terrier (2002)]
Construction of the LAT source catalog • Issues: Criteria for inclusion, spurious sources • For the EGRET catalog, criteria were conservative to cover estimated systematic uncertainties (>5σ for |b| < 10°) • Spurious source rate • Mattox et al. (1996) – simulation of distribution of likelihood test statistic – effective beam size. • ‘Trials factor’
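The effect of a trials factor can be illustrated with the standard conversion from a local to a global p-value. The sketch assumes the trials are statistically independent, which is only an approximation for overlapping positions on the sky, and the number of trials used in the example is hypothetical.

```python
import math

def global_p_value(p_local, n_trials):
    """Probability of at least one false detection in n_trials independent trials.

    Computes 1 - (1 - p_local)**n_trials with log1p/expm1 for accuracy at small p.
    """
    return -math.expm1(n_trials * math.log1p(-p_local))

# Hypothetical example: a local p-value of 1e-6 tested at 1000 independent positions.
p_global = global_p_value(1e-6, 1000)
```

For small p the result is close to n_trials times p_local, so a detection threshold has to tighten roughly in proportion to the number of effective beams searched.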
Source identification • Positional coincidence is not nearly good enough • Source localization is poor (~1° for EGRET, ~several arcmin for LAT) • Counterpart densities are high • Ideally, for an established population of sources, other information can be used (e.g., spectral hardness or correlated variability of the potential counterparts) • Issue: Quantitative assignments of confidence levels of association of sources, how to establish a new source class
[Figure credits: Hartman et al. (1999); Sowards-Emmerd et al. (2003)]
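Why positional coincidence alone is weak evidence can be seen from the chance-coincidence probability for counterparts scattered at random on the sky. The sketch assumes a uniform Poisson distribution of candidates and a circular error region; the counterpart density and the 5 arcmin radius standing in for "several arcmin" are hypothetical.

```python
import math

def chance_coincidence_probability(radius_deg, density_per_sq_deg):
    """Probability that at least one unrelated candidate falls inside the error circle."""
    expected = math.pi * radius_deg**2 * density_per_sq_deg
    return -math.expm1(-expected)  # 1 - exp(-expected)

# Hypothetical counterpart density of 0.5 candidates per square degree.
p_egret = chance_coincidence_probability(1.0, 0.5)        # ~1 degree error radius
p_lat = chance_coincidence_probability(5.0 / 60.0, 0.5)   # assumed 5 arcmin radius
```

Under these assumptions a chance match is more likely than not at EGRET-like localization and drops to the percent level at arcminute scales, though it never reaches zero for dense counterpart populations.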
Population correlations • Weaker than finding counterparts • Correlations of gamma-ray sources with SN/OB associations were already noted in the COS-B era (e.g., Montmerle 1979) • Recent work on correlations of unidentified EGRET sources • Supernova remnants, OB associations, WR stars, pulsars (e.g., Romero et al. 1999) • Galaxy clusters (e.g., Colafrancesco 2002; Kawasaki & Totani 2002; Scharf & Mukherjee 2002 correlate clusters with EGRET data directly; Reimer et al. 2003 ‘stack’ the observations) • Issues: Characterization of populations to enable useful correlations, validation via simulation
[Figure credit: Reimer et al. (2003)]
Conclusions • Great advances in gamma-ray astronomy can be expected with GLAST • Maximizing the scientific return will require addressing the statistical issues at every level in the data analysis
[Figure panels: EGRET Phases 1-5 >100 MeV; LAT Simulation]