230 likes | 401 Views
Jeffrey D. Scargle Space Science and Astrobiology Division NASA Ames Research Center Fermi Gamma Ray Space Telescope. Special thanks: Jim Chiang, Jay Norris, and Greg Madejski, … Applied Information Systems Research Program (NASA)
E N D
Jeffrey D. Scargle Space Science and Astrobiology Division NASA Ames Research Center Fermi Gamma Ray Space Telescope Special thanks: Jim Chiang, Jay Norris, and Greg Madejski, … Applied Information Systems Research Program (NASA) Center for Applied Mathematics, Computation and Statistics (SJSU) Institute for Pure and Applied Mathematics (UCLA)
Bin-free Algorithms for Estimation of … • Light Curve Analysis (Bayesian Blocks) • Auto- and Cross- • Correlation Functions • Fourier Power Spectra (amplitude and phase) • Wavelet Power • Structure Functions • Energy-Dependent Time Lags (An Algorithm for Detecting Quantum Gravity Photon Dispersion in Gamma-Ray Bursts : DisCan. 2008 ApJ 673 972-980) … from Energy- and Time-Tagged Photon Data … with Variable Exposure and Gaps
All of this will be in the Handbook of Statistical Analysis of Event Data … funded by the NASA AISR Program MatLab Code Documentation Examples Tutorial Contributions welcome!
Variable Source Propagation To Observer Photon Detection • Luminosity: random • or deterministic • Photon Emission • Independent Random • Process (Poisson) • Random • Scintillation, • Dispersion, etc.? • Random Detection • of Photons (Poisson) Correlations in source luminosity do not imply correlations in time series data!
The Wold - von Neumann Decomposition Theorem X = C * R + D Moving Average Process Any stationary process X can be represented as the convolution of a constant pulse shape C and a (white) random process R plus a linearly deterministic process D.
Time Series Data Time-Tagged Events Binned Event Times Time-To-Spill Point Measurements Mixed Modes Binning Any Standard Variability Analysis Tool: Bayesian blocks, correlation, power spectra, structure Fixed Equi-Variance
Area = 1 / dt n / dt E / dt dt
Area = 1 / dt’ n / dt’ E / dt’ dt’ = dt × exposure
Bayesian Blocks Piecewise-constant Model of Time Series Data Optimum Partition of Interval, Maximizing Fitness Of Step Function Model Segmentation of Interval into Blocks, Representing Data as Constant In the Blocks -- within Statistical Fluctuations Histogram in Unequal Bins -- not Fixed A Priori but determined by Data Studies in Astronomical Time Series Analysis. V. Bayesian Blocks, a New Method to Analyze Structure in Photon Counting Data, Ap. J. 504 (1998) 405. An Algorithm for the Optimal Partitioning of Data on an Interval," IEEE Signal Processing Letters, 12 (2005) 105-108.
The optimizer is based on a dynamic programming concept of Richard Bellman best = [ ]; last = [ ]; for R = 1: num_cells [ best(R), last(R) ] = max( [0 best] + ... reverse( log_post( cumsum( data_cells(R:-1:1, :) ), prior, type ) ) ); if first > 0 & last(R) > first % Option: trigger on first significant block changepoints = last(R); return end end % Now locate all the changepoints index = last( num_cells ); changepoints = []; while index > 1 changepoints = [ index changepoints ]; index = last( index - 1 ); end Global optimum of exponentially large search space in O(N2)!
Cross- and Auto- Correlation Functions for unevenly spaced data Edelson and Krolik: The Discrete Correlation Function: a New Method for Analyzing Unevenly Sampled Variability Data Ap. J. 333 (1988) 646
for id_2 = 1: num_xx_2 xx_2_this = xx_2( id_2 ); tt_2_this = tt_2( id_2 ); tt_lag = tt_2_this - tt_1 - tau_min; % time lags relative to this point index_tau = ceil( ( tt_lag / tau_bin_size ) + eps ); % The index of this array refers to the inputs tt and xx; % the values of the array are indices for the output variables % sf cd nv that are a function of tau. % Eliminate values of index_tau outside the chosen tau range: ii_tau_good = find( index_tau > 0 & index_tau <= tau_num ); index_tau_use = index_tau( ii_tau_good ); if ~isempty( index_tau_use ) % There are almost always duplicate values of index_tau; % mark and count the sets of unique index values ("clusters") ii_jump = find( diff( index_tau_use ) < 0 ); % cluster edges num_clust = length( ii_jump ) + 1; % number of clusters for id_clust = 1: num_clust % get index range for each cluster if id_clust == 1 ii_1 = 1; else ii_1 = ii_jump( id_clust - 1 ) + 1; end if id_clust == num_clust ii_2 = length( index_tau_use ); else ii_2 = ii_jump( id_clust ); end ii_lag = index_tau_use( ii_1 ); % first of duplicates values is ok xx_arg = xx_1( ii_tau_good( ii_1 ): ii_tau_good( ii_2 ) ); sum_xx_arg = xx_2_this .* sum( xx_arg ); vec = ones( size( xx_arg ) ); cf( ii_lag ) = cf( ii_lag ) + sum_xx_arg; % correlation and structure fcn sf( ii_lag ) = sf( ii_lag ) + sum( ( xx_2_this * vec - xx_arg ) .^ 2 ); nv( ii_lag ) = nv( ii_lag ) + ii_2 - ii_1 + 1; err_1( ii_lag ) = err_1( ii_lag ) + sum_xx_arg .^ 2; err_2( ii_lag ) = err_2( ii_lag ) + std( xx_2_this * xx_arg ); end % for id_clust end end % for id_2
Summary: A variety of new and standard time series analysis tools can be implemented for time- and/or energy tagged data. Future: Many applications to TeV and other photon data. Handbook of Statistical Analysis of Event Data Contributions welcome! Automatic variability analysis tools for High Energy Pipelines: