Lecture 8

Lecture 8 • Source detection

Different sorts of model All models Background + signal Background + many similar signals

b + svsb + Σsi • s may often be assumed to be: • slowly varying with r; • with compact support

Source detection • The basic idea is related to Null Hypothesis testing… • But if the sources can be assumed to be localized, we can cut the data up and test each source-sized bit at a time. • sliding window. • Some missed jargon: • the probability at the intercept • is called the P-value (you can • google it) Survival function

Testing the NH: • Not all tests are equally good at finding signals! • Eg Cash statistic is better than χ2 (in circumstances where the Cash test is appropriate – eg bkg is a subset of the signal model). • Cash stat makes use of knowledge about the signal shape – in general any stat which does similar (eg a matched filter) will also perform well. • There is an infinite variety of ‘statistics’ to choose from.

Source detection. • If the SF probability in each patch (the P-value) is smaller than a previously chosen cutoff, we can call this a positive detection. • BUT! Note that there is no certainty. • Sometimes the null model will by chance give a large χ2 => ‘false positives.’ For given data, background and cutoff, there will be a fixed number of false positives expected in the source list. • => ‘reliability’. More on this later. • Sometimes a real source will give a small null-hypothesis χ2 => ‘false negatives’, real sources which are missed. • => ‘completeness’. More on this later.

Problems with the NH approach: • We don’t have exact knowledge of the background. • Have to estimate it either from • separate data – in which case we need the separate data! (Don’t always have the luxury.) • or from the same data… but this may be dominated by the source... • Or our background model may be wrong. • Same issues as other model fitting. In particular: • χ2 has to be used with care when the noise is Poisson.

But where are the sources? • Applying some sort of NH test in a sliding window will return a new random signal – now correlated.. • Finding the sources consists rather of looking for peaks in this random signal. • The simplest example is when the noise is uncorrelated and the source peaks have width=0.

Looking for sources 1 channel at a time: • In each channel, we test the NH with N=1. • Since there are no fitted parameters, υ=1 also. • If the source occupies a single channel, this procedure is optimal. • If, however, the source is spread over several channels (as is usual), this procedure is not efficient. • We want a statistic which uses the maximum amount of information about the source shape.

A generic source-detection algorithm • We shall assume that: • The data is ‘binned’ (eg CCD data). • We have a good independent estimate of the background. • The sources are sparsely distributed – such that we can deal with them one at a time. • The shape of the source profile is known. • The source position is unknown. • The source amplitude is unknown (but >0).

Generic source-detection algorithm: The algorithm has 3 steps: 1: Calculate a sliding-window map. 2: Find the peaks in this map. Choose a Pcutoff 3: For each peak, calculate the probability that it could arise by chance from the background (the null hypothesis P-value). P < Pcutoff? No Yes Sources Rejects

1: The sliding window. y U y U y U

1: The sliding window. • For each position of the sliding window, a single number U is calculated from the values falling within the window. • The output is a map of the U values. • The intent is to: • Raise the signal-to-noise • Improve sensitivity • Amplify the sources at the expense of the noise. • Sliding-window processing only has value when the source has a width > 1 pixel. • Edges need special treatment. Same thing.

1: Window functions • A weighted sum (= a convolution). • Simplest with all weights = 1: “sliding box”. • Optimum weights – a “matched filter”: • For uniform Gaussian noise, wopt = s. • Trickier to optimize for Poisson noise. • Per-window null-hypothesis χ2. • With either an independent value of bkg (in which case degrees of freedom = number of pixels Nw in the window), or… • …one fitted from the data (deg free = Nw-1). • Likelihood (same bkg provisions as χ2).

1: Window functions Parent function Data

1: Window functions Parent function Chi squared, size=100 Matched filter, size=10 Log-likelihood, size=100

2: Peak finding Gaussian noise, convolved with a gaussian filter. …don’t get the gaussians mixed up!

2: Peak finding • How best to do it? • There’s no single neat prescription. • Naive prescription: • Pixel i is a peak pixel if yi > any other y within a patch of pixels from i-j to i+j. • This probably looks familiar to you. • But what value to choose for j? • Things to avoid are: • j too small – results in more than 1 peak per source; • j too large – misses a close adjacent source.

2: Peak finding Box too small: Box too large:

3: Decision time – is it a source or not? • To calculate a P-value we need the probability distribution of peaks in the post-window map of U values (given the null hypothesis). • This is not the same as the probability distribution of the original data values… • …nor is it even the same as the probability distribution of U values. • In fact, little work seems to have been done on ppeaks. (Though there is quite a lot on the distribution of extrema – not quite the same thing.)

3: The decision ‘Map’ vs ‘peak’ distributions for Gaussian noise. Black: all pixels Red: peaks

3: Cash to the rescue • A practical recipe for applying Cash to source detection goes as follows: • Choose a window area surrounding each peak. • Within this window, calculate Lnull with model mi = bi (the background map values). • Calculate Lbest by fitting a model • Degrees of freedom ν = 1 (the amplitude) + d (the dimensions of the spatial fit). • The Cash statistic 2(Lbest-Lnull) behaves like χ2 with 1+d deg. free. mi = bi + θ1s(ri – θr)

3: Cash to the rescue • The only difficult point (which is a problem for every method) is to calculate the fraction of pixels which are peaks. • Monte Carlo • Possibly a Fourier technique? • Also, don’t want to use the fit for final parameter values. A Mighell fit is better. From my 2009 Cash paper.

What is the best detection method? From my 2009 Cash paper.

Lecture 8

Lecture 8

Presentation Transcript

Lecture 8

Lecture 8

Lecture 8

Lecture 8

Lecture #8

Lecture 8

Lecture 8

Lecture 8

Lecture 8

Lecture 8

Lecture 8

LECTURE № 8

LECTURE 8

Lecture 8

Lecture 8

Lecture 8

Lecture 8

Lecture 8

Lecture 8