Time Series Analysis and Variability Detection in Astrophysics

Learn about time series analysis in astrophysics, including the use of light curves and time-tagged event data. Understand the differences between frequentist and Bayesian approaches, and explore techniques such as the discrete Fourier transform and periodograms. Discover how to detect variability in different types of signals, from periodic to burst-like phenomena.

Presentation Transcript


  1. Lecture 15 • Time series (aka Light Curves) • Time Tagged Event (TTE) data • Frequentist vs Bayesian variability analysis. • The discrete Fourier transform (DFT) • Periodograms • 2D Fourier transforms • The Hankel transform

  2. Time series (aka light curves) • Binned or unbinned. (Nearly everything is binned at some level, so quasi unbinned is perhaps more appropriate.) • Some varieties encountered: • Periodic (more usually, quasi-periodic) signals • Mira or Cepheid variables • Pulsars • Bursts or flares • Fundamental question: is the source variable or not. • This is just another signal-detection problem. • We can use the Null Hypothesis approach.

  3. (Quasi) unbinned data: Time Tagged Events. • Often encountered in high-energy astronomy, in which individual photons are counted. • Basic information is the arrival time of the photon. • Also recorded may be its energy & direction. • In HE astronomy one often speaks, somewhat cautiously, of events rather than photons • because not all detections may be caused by a photon. • Basic quantity: the count rate (events s⁻¹). • The arrival times are a Poisson process (ie independent, uncorrelated).

  4. Variability detection example • Let us use a Cash likelihood ratio statistic in an attempt to test the null hypothesis against a simple model, for TTEs in an interval T. • The null hypothesis says that the expectation value for the count rate is constant – ie that there is no variability. This model can be written $m_1(t) = A$. • For the non-null model, let us choose a 2-block model: $m_2(t) = fAT/T_c$ for $t < T_c$, and $m_2(t) = (1-f)AT/(T-T_c)$ for $t \ge T_c$.

  5. In pictures:

  6. 2-block comparison continued • There are 3 model parameters: • the average rate A; • the ‘change point’ Tc; • the fraction f of the total counts in block 1. • Note that the NH model, m1, can be expressed by a particular combination of parameters of m2. (Ie when f=Tc/T.) As mentioned when the Cash statistic was first described, this is necessary for the theory to be valid.

  7. Generic likelihood for TTE data? • A cunning trick: we introduce a ‘virtual bin’ of arbitrary size Δt. • The hope is that this Δt will cancel from the final expression. • An integer number of events will fall within the bin. • Hence the probability P(n,t) for a given number n in a bin centred on t will be Poisson: $P(n,t) = \Lambda^n e^{-\Lambda}/n!$, where $\Lambda = \int_{t-\Delta t/2}^{t+\Delta t/2} m(t')\,dt'$.

  8. In the limit Δt → 0: • Λ → m(t) Δt • P(n,t) → 0 for n>1. • In other words, if we make the bin small enough, the only possibilities are that it contains either 0 or 1 event. • P(0,t) = exp[-m(t) Δt] • P(1,t) = m(t) Δt exp[-m(t) Δt]

  9. The total likelihood: • If there are M virtual bins (ie M = T/Δt) then the total likelihood is the product of the probabilities in each bin: $L = \prod_{i=1}^{M} P(n_i, t_i)$. • It’s fairly easy to show that this gives $L = \Delta t^N \exp[-\int_0^T m(t)\,dt] \prod_{j=1}^{N} m(t_j)$, where N is the total number of events (ie, photons) detected in the time T and the $t_j$ are their arrival times. • This expression still depends on the arbitrary parameter Δt. But with a likelihood ratio, we expect this to cancel out.
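
As a rough illustration of the unbinned likelihood above, the sketch below evaluates ln L = Σ ln m(t_j) − ∫ m(t) dt for a list of arrival times, leaving out the N ln Δt term because it is the same for every model. The function name `tte_log_likelihood`, the midpoint-rule integration, the event times and the choice Tc = 6 s, f = 1/6 are all assumptions made for the example, not part of the lecture.

```python
import math

def tte_log_likelihood(times, rate, T, n_steps=10000):
    """ln L for event times in [0, T] under the rate model rate(t),
    omitting the model-independent N*ln(dt) term."""
    h = T / n_steps
    # Midpoint-rule estimate of the expected number of events, int_0^T m(t) dt.
    expected = sum(rate((i + 0.5) * h) for i in range(n_steps)) * h
    return sum(math.log(rate(t)) for t in times) - expected

# Hypothetical data: 6 events in T = 10 s, crowded towards the end.
times = [1.0, 6.2, 7.1, 8.0, 8.7, 9.5]
T = 10.0
A = len(times) / T          # constant rate N/T

def constant(t):
    return A

def two_block(t, Tc=6.0, f=1/6):
    # Fraction f of the expected counts before Tc, the rest after it.
    return f * A * T / Tc if t < Tc else (1 - f) * A * T / (T - Tc)

lnL1 = tte_log_likelihood(times, constant, T)
lnL2 = tte_log_likelihood(times, two_block, T)
print(lnL2 - lnL1)          # log of the likelihood ratio for these parameter values
```

Both models predict the same total number of events, AT, so the difference of the two log-likelihoods depends only on the rates evaluated at the event times.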

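  10. The Cash likelihood ratio: $L_2/L_1 = \prod_{j=1}^{N} m_2(t_j)/m_1(t_j)$ • Note the Δts cancel as desired; so do the exponential terms, because both models integrate to the same expected total, AT. • With the 2-block model this becomes: $L_2/L_1 = (f/\tau)^{N_1} [(1-f)/(1-\tau)]^{N-N_1}$, where N1 is the number of events in block 1 and τ=Tc/T.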
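
A minimal sketch of how the 2-block ratio might be used in practice: for each trial change point τ the best-fit fraction is f = N1/N, so the log ratio N1 ln(f/τ) + (N − N1) ln[(1−f)/(1−τ)] can be scanned over a grid of τ. The grid, the event times and the helper names are assumptions for illustration; twice the maximised log ratio is taken here as the Cash statistic.

```python
import math

def xlogy(x, y):
    """x*ln(y) with the convention 0*ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def log_ratio(times, T, tau):
    """ln of the 2-block/1-block likelihood ratio for a change point at tau*T,
    with f set to its maximum-likelihood value N1/N."""
    N = len(times)
    N1 = sum(1 for t in times if t < tau * T)
    f = N1 / N
    return xlogy(N1, f / tau) + xlogy(N - N1, (1 - f) / (1 - tau))

times = [1.0, 6.2, 7.1, 8.0, 8.7, 9.5]    # hypothetical arrival times (s)
T = 10.0
grid = [k / 100 for k in range(1, 100)]   # trial values of tau
best_tau = max(grid, key=lambda tau: log_ratio(times, T, tau))
cash = 2 * log_ratio(times, T, best_tau)
print(best_tau, cash)
```

Because f = τ reproduces the constant-rate model, the maximised log ratio can never be negative; how large it must be before the null hypothesis is rejected is the question the Cash theory addresses.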

  11. Pros and cons of this ‘frequentist’ approach. • With this we can do the following: • Find the best-fit values of A, τ and f. • Use the Cash theory to calculate a value for the probability of the Null Hypothesis. • What we can’t do with it is • Obtain full, joint probability distributions for A, τ and f; • Make use of prior knowledge of same; • Obtain the relative probabilities for the 1- vs 2-block models as a whole. • For this we need a Bayesian approach.

  12. A Bayesian approach • Bayesian statistics is not just some different way to approach probabilistic problems, it is the formally correct way to approach them. • The difficulties seem to be • conceptual (the human brain is very poor at formally correct statistical reasoning!) • practical evaluation of the integrals. • Gregory and Loredo ApJ 398, 146 (1992) • Read sections 1 and 2 carefully, about 5 times, and you will begin to understand Bayesian parameter estimation. • My approach follows theirs closely.

  13. Approach to Bayes’ theorem. • Remember joint and conditional probability densities in slide 18 of lecture 3. • This works also for probabilities for discrete propositions A and B: $P(A,B) = P(A|B)\,P(B) = P(B|A)\,P(A)$. • Suitable propositions for our situation: • A ≡ D: “Such-and-such a pattern of events was detected.” • B ≡ M_i: “The data are explained by model i.”

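  14. Approach to Bayes’ theorem. • If we include an explicit dependence on environmental information I, this gives: $P(D,M_i|I) = P(M_i|I)\,P(D|M_i,I) = P(D|I)\,P(M_i|D,I)$. • Rearrangement gives Bayes’ theorem: $P(M_i|D,I) = P(M_i|I)\,P(D|M_i,I) / P(D|I)$. • In words: posterior = prior × likelihood / global likelihood.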

  15. Global likelihood for all models • This, the denominator of Bayes’ theorem, can be thought of as a normalizing constant. Ie: $P(D|I) = \sum_i P(M_i|I)\,P(D|M_i,I)$ = sum over (prior × likelihood) for all models in the class. • Ensures that a sum over all posteriors (the LH side of Bayes’ theorem) equals 1.

  16. Odds ratio • It can be difficult to calculate the denominator, the global likelihood for the entire class of models, from first principles. • Better to form a ratio of the posterior probabilities for 2 models. P(D|I) will then cancel out. • This is called a Bayesian Odds Ratio: $O_{ij} = P(M_i|D,I)/P(M_j|D,I) = [P(M_i|I)/P(M_j|I)] \times [P(D|M_i,I)/P(D|M_j,I)]$.
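
A toy numerical illustration of Bayes' theorem for a discrete class of models, showing that the normalising denominator drops out of the odds ratio. The likelihood values are made up for the example, and the function name `posteriors` is hypothetical.

```python
def posteriors(priors, likelihoods):
    """Bayes' theorem for a discrete class of models:
    posterior_i = prior_i * likelihood_i / sum_k(prior_k * likelihood_k)."""
    global_likelihood = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / global_likelihood for p, l in zip(priors, likelihoods)]

priors = [0.5, 0.5]               # no prior preference between the two models
likelihoods = [2.0e-4, 6.0e-4]    # made-up values of P(D|M_i, I)
post = posteriors(priors, likelihoods)

# The odds ratio can be formed without ever computing the denominator.
odds_21 = (priors[1] * likelihoods[1]) / (priors[0] * likelihoods[0])
print(post, odds_21)
```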

  17. How to calculate the likelihood for model Mi. • If we have no prior reason to favour one model over another, we can set the ratio of priors to 1. • That leaves the problem of calculating the global likelihood P(D|M,I) for each model. • This itself plays the role of denominator, or normalizing constant, in a ‘deeper’ version of Bayes’ theorem, one which deals with the parameter values:

  18. Bayes’ theorem applied to the probability density of the parameters. $p(\Theta|D,M) = p(\Theta|M)\,p(D|\Theta,M) / P(D|M)$ • Note: • I have suppressed the I’s for compactness. • I’ve used small p for continuous functions of the parameters (probability densities), but large P for single numbers (probabilities). • We want the LH side, when integrated over Θ, to equal 1. Hence…

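  19. Global likelihood for model M $P(D|M) = \int d\Theta\; p(\Theta|M)\,p(D|\Theta,M)$ • In words: the global likelihood is the integral, over the whole parameter space, of (prior × likelihood).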

  20. A Bayesian procedure: • Deduce the likelihood function for the model parameters. • Decide on priors for each parameter. • Integrate the product of (prior × likelihood) to get the global likelihood (GL) for the model. • Decide on global priors for the models. • Together with the GL for the competing model, form an Odds Ratio to compare the two models. • Apply Bayes’ Theorem to obtain the posterior joint probability density for the parameters.

  21. Applied to the present problem: • Likelihood function: use the ones we already worked out for the 1- and 2-block models: $p(D|A,M_1) = \Delta t^N A^N e^{-AT}$ and $p(D|A,f,\tau,M_2) = \Delta t^N A^N e^{-AT} (f/\tau)^{N_1} [(1-f)/(1-\tau)]^{N-N_1}$. • Prior distributions: • Choose p(A|M)=1/Amax between 0 and Amax. • Amax is arbitrary, but it will cancel. • Choose p(f|M)=1 and p(τ|M)=1 between 0 and 1.

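  22. Expressions for global likelihood: $P(D|M_1) = (\Delta t^N/A_{max}) \int_0^{A_{max}} A^N e^{-AT}\,dA$ and $P(D|M_2) = (\Delta t^N/A_{max}) \int_0^{A_{max}} A^N e^{-AT}\,dA \int_0^1 d\tau \int_0^1 df\, (f/\tau)^{N_1} [(1-f)/(1-\tau)]^{N-N_1}$, where N1 depends on τ. • Priors for the models: choose them equal. • Odds ratio therefore is $O_{21} = \int_0^1 d\tau \int_0^1 df\, (f/\tau)^{N_1} [(1-f)/(1-\tau)]^{N-N_1}$.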

  23. Points to note: • The odds ratio doesn’t depend on Δt, T or Amax • Even a very simple model + simple priors can generate a complicated integral. • This particular one can be evaluated – it is just a long, fiddly business. • From Bayes’ theorem, the posterior probability density evaluates to • This is proportional to the Cash formula, but now we know it is correct.
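
One way the odds ratio for the 1- vs 2-block problem might be evaluated numerically is sketched below. It relies on my own working, under the flat priors of slide 21: the integral over A is common to both global likelihoods and cancels, and the integral over f is a Beta function, B(N1+1, N−N1+1), leaving a one-dimensional integral over τ of B(N1+1, N−N1+1) τ^(−N1) (1−τ)^(−(N−N1)), with N1 the number of events before τT. The function name `odds_ratio`, the midpoint-rule integration and the test data are assumptions, not the lecturer's method.

```python
import math

def odds_ratio(times, T, n_steps=20000):
    """Odds ratio O21 (2-block over 1-block), assuming equal model priors and
    flat priors on A, f and tau; the A integral cancels in the ratio."""
    N = len(times)
    ts = sorted(times)
    total, h, N1 = 0.0, 1.0 / n_steps, 0
    for i in range(n_steps):
        tau = (i + 0.5) * h
        while N1 < N and ts[N1] < tau * T:   # count events before the change point
            N1 += 1
        N2 = N - N1
        # ln of the f integral, the Beta function B(N1+1, N2+1).
        ln_beta = math.lgamma(N1 + 1) + math.lgamma(N2 + 1) - math.lgamma(N + 2)
        total += math.exp(ln_beta - N1 * math.log(tau) - N2 * math.log(1 - tau)) * h
    return total

T = 10.0
uniform = [0.25 + 0.5 * k for k in range(20)]     # evenly spread events
clustered = [8.0 + 0.1 * k for k in range(20)]    # all events in the last 2 s
print(odds_ratio(uniform, T), odds_ratio(clustered, T))
```

With no events at all the integrand is 1 everywhere and the odds ratio is exactly 1, as it should be when the data carry no information. Evenly spread events give odds below 1, reflecting the penalty the extra parameters of the 2-block model incur.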

  24. Example

  25. The DFT • Related to Fourier expansion. • Cyclic: $F_{k+mN} = F_k$ for any integer m. • Use with caution! It will not approximate a FT without special treatment: • ‘Zero padding’ • Vignetting. • High N. • A fast algorithm exists – the Fast Fourier Transform (FFT). • For this, N should be a product of low primes.
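
The cyclic property can be checked directly from the defining sum. The sketch below uses the sign and normalisation convention F_k = Σ_j f_j exp(−2πi jk/N), which is an assumption (the slides do not state one), and an arbitrary test sequence.

```python
import cmath

def dft(f, k):
    """k-th DFT coefficient of the sequence f, by direct summation
    (convention assumed: F_k = sum_j f_j * exp(-2*pi*i*j*k/N))."""
    N = len(f)
    return sum(f[j] * cmath.exp(-2j * cmath.pi * j * k / N) for j in range(N))

samples = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0]   # arbitrary test values
N = len(samples)
F3 = dft(samples, 3)
F3_shifted = dft(samples, 3 + 2 * N)        # index shifted by m*N with m = 2
print(abs(F3 - F3_shifted))                 # zero up to rounding error
```

Evaluating all N coefficients this way costs of order N² operations, which is the cost the FFT avoids.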

  26. Zero-padding example Fourier transform

  27. Zero-padding example Discrete Fourier transform (DFT)

  28. Zero-padding example DFT with zero padding
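
A small sketch of what zero padding does, assuming the same DFT convention F_k = Σ_j f_j exp(−2πi jk/N): appending N zeros to an N-point sequence leaves the underlying transform unchanged but samples it twice as finely, so every second coefficient of the padded DFT reproduces a coefficient of the original. The top-hat test signal is made up for the example.

```python
import cmath

def dft_all(f):
    """All DFT coefficients of f by direct summation (O(N^2), fine for a demo)."""
    N = len(f)
    return [sum(f[j] * cmath.exp(-2j * cmath.pi * j * k / N) for j in range(N))
            for k in range(N)]

signal = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # a short top-hat
padded = signal + [0.0] * len(signal)               # pad to twice the length

F = dft_all(signal)
F_padded = dft_all(padded)

# Even-indexed coefficients of the padded transform match the original ones;
# the odd-indexed ones are new samples in between.
max_diff = max(abs(F_padded[2 * k] - F[k]) for k in range(len(F)))
print(max_diff)
```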

  29. Problems with uneven sampling Back to the full FT

  30. Problems with uneven sampling FT of unevenly-sampled data

  31. Problems with uneven sampling • Lomb-Scargle periodogram. • Much literature on: • finding peaks • calculating uncertainties. • See eg Press et al, chapter 13.8
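
For unevenly sampled data the classical Lomb-Scargle periodogram can be written in a few lines. The sketch below follows the variance-normalised form given in Press et al.; the simulated light curve (80 random sample times over 50 time units, a sinusoid of period 2.5 plus Gaussian noise) and the trial-frequency grid are assumptions made for the example.

```python
import math
import random

def lomb_scargle(t, y, omega):
    """Classical Lomb-Scargle power at angular frequency omega,
    normalised by the sample variance."""
    n = len(t)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    # Time offset tau that makes the sine and cosine terms orthogonal.
    tau = math.atan2(sum(math.sin(2 * omega * ti) for ti in t),
                     sum(math.cos(2 * omega * ti) for ti in t)) / (2 * omega)
    c = [math.cos(omega * (ti - tau)) for ti in t]
    s = [math.sin(omega * (ti - tau)) for ti in t]
    yc = sum((v - mean) * ci for v, ci in zip(y, c))
    ys = sum((v - mean) * si for v, si in zip(y, s))
    return (yc ** 2 / sum(ci ** 2 for ci in c)
            + ys ** 2 / sum(si ** 2 for si in s)) / (2 * var)

random.seed(1)
t = sorted(random.uniform(0, 50) for _ in range(80))   # uneven sample times
true_omega = 2 * math.pi / 2.5
y = [math.sin(true_omega * ti) + random.gauss(0, 0.3) for ti in t]

omegas = [0.05 * k for k in range(1, 200)]             # trial angular frequencies
powers = [lomb_scargle(t, y, w) for w in omegas]
best = omegas[powers.index(max(powers))]
print(best, true_omega)
```

The grid spacing should be comfortably finer than 2π divided by the total time span, otherwise a narrow peak can fall between trial frequencies.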

  32. 2D Fourier pairs fringes ↔ point (delta function).

  33. 2D Fourier pairs higher spatial frequency ↔ further from the origin.

  34. 2D Fourier pairs multiplication ↔ convolution.

  35. 2D Fourier pairs gaussian ↔ gaussian.

  36. ‘Natural’ vs ‘DFT’ origins

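  37. The Hankel transform • If a function has radial symmetry, eg $f(x,y) = f(r)$ with $r = \sqrt{x^2+y^2}$, then so does its 2D Fourier transform: $F(u,v) = F(q)$ with $q = \sqrt{u^2+v^2}$. • In this case, a short cut to the 2D FT is the Hankel transform: $F(q) = 2\pi \int_0^\infty f(r)\,J_0(2\pi q r)\,r\,dr$ (for the FT convention with kernel $\exp[-2\pi i(ux+vy)]$). • J0 is the Bessel function of order zero.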
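
A quick numerical check of the Hankel short cut, assuming the Fourier convention with kernel exp[−2πi(ux+vy)], for which the transform is F(q) = 2π ∫ f(r) J0(2πqr) r dr and the Gaussian exp(−πr²) is its own transform. To stay within the standard library, J0 is computed here from its integral representation; the function names, the cut-off `r_max` and the step counts are assumptions for the sketch.

```python
import math

def bessel_j0(x, n=200):
    """J0(x) from the integral representation
    (1/pi) * int_0^pi cos(x*sin(theta)) dtheta, by the midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(n)) / n

def hankel(f, q, r_max=6.0, n=600):
    """Zeroth-order Hankel transform 2*pi * int_0^r_max f(r) J0(2*pi*q*r) r dr,
    by the midpoint rule; r_max must be large enough that f has died away."""
    h = r_max / n
    return 2 * math.pi * h * sum(
        f(r) * bessel_j0(2 * math.pi * q * r) * r
        for r in ((i + 0.5) * h for i in range(n)))

def gaussian(r):
    return math.exp(-math.pi * r * r)

q = 0.5
print(hankel(gaussian, q), gaussian(q))   # the two values should agree closely
```

A single 1D integral per output radius replaces a full 2D transform, which is the saving the slide refers to.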

  38. Hankel transform example
