Methods of Experimental Particle Physics Alexei Safonov Lecture #22
Maximum Likelihood • Likelihood for N measurements xi: L(θ) = ∏i f(xi; θ) • It is essentially the joint p.d.f. seen as a function of the parameters θ • While the xi could be any measurements, it's easy to visualize them as a histogram of x with N bins; in each bin you calculate how probable it is to see xi for a particular θ (or set of θ's) • In ML, the estimators are those values of θ that maximize the likelihood function • One could use MINUIT to find the position of the minimum of -log(L)
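A minimal sketch (not the lecture's example) of what this looks like in practice: an unbinned ML fit of an exponential lifetime tau to made-up measurements, minimizing -log(L) with scipy's minimizer as a stand-in for MINUIT. All names and numbers below are illustrative.

```python
# Maximum-likelihood sketch: fit the lifetime tau of f(x; tau) = exp(-x/tau)/tau
# by minimizing -log L over made-up data, using scipy instead of MINUIT.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
x = rng.exponential(scale=2.0, size=1000)    # pretend these are the N measured values x_i

def neg_log_likelihood(params):
    tau = params[0]
    if tau <= 0:
        return np.inf                         # keep tau in its physical range
    # f(x; tau) = exp(-x/tau)/tau  =>  -log L = sum_i ( x_i/tau + log(tau) )
    return np.sum(x / tau + np.log(tau))

result = minimize(neg_log_likelihood, x0=[1.0], method="Nelder-Mead")
print("ML estimate of tau:", result.x[0])     # should come out close to 2.0
```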
Example: Maximum Likelihood • If you look at the distribution of the number of entries in each bin, the data (in each bin) follows a Poisson distribution: P(n; λ) = λ^n e^(−λ) / n!, with n = 0, 1, … and λ > 0 • Define the likelihood as the product over bins: L(θ) = ∏i P(ni; λi), where λi = f(xi; θ) is the expected content of bin i • Best parameters are where L is at its maximum • f is the Gaussian signal + polynomial background • Well defined for any size of signal and for small Ni • How do you know if your results make sense? • For any data, this method will find SOME minimum for any function. It gives you "best" results, not "sensible" ones. • Does the magnitude of L tell you how likely it is that things are ok?
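A minimal sketch of this binned Poisson likelihood, with invented bin contents, a Gaussian signal and a linear polynomial background; the parameter names, starting values and the 80-bin choice (matching the estimate on the next slide) are illustrative only.

```python
# Binned Poisson likelihood sketch: lambda_i = f(x_i; theta) = Gaussian signal + linear background.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

bin_centers = np.linspace(0.0, 80.0, 80)                         # 80 bins, as in the next slide

def expected_counts(theta, x):
    n_sig, mean, sigma, b0, b1 = theta
    signal = n_sig * np.exp(-0.5 * ((x - mean) / sigma) ** 2)     # Gaussian signal
    background = b0 + b1 * x                                      # (linear) polynomial background
    return np.clip(signal + background, 1e-9, None)               # lambda_i must stay positive

def neg_log_likelihood(theta, n_obs):
    lam = expected_counts(theta, bin_centers)
    return -np.sum(poisson.logpmf(n_obs, lam))                    # -log prod_i Pois(n_i; lambda_i)

# Toy "data": generate counts from made-up true parameters, then fit them back
rng = np.random.default_rng(1)
true_theta = (20.0, 40.0, 3.0, 10.0, 0.1)
n_obs = rng.poisson(expected_counts(true_theta, bin_centers))

fit = minimize(neg_log_likelihood, x0=[10.0, 35.0, 5.0, 8.0, 0.0], args=(n_obs,),
               method="Nelder-Mead", options={"maxiter": 5000})
print("best-fit parameters:", fit.x)
```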
Likelihood Function Magnitude • Let's guesstimate (actually overestimate) Lmax for our example: • 80 bins, average probability for the outcome in each bin ~50% • Lmax ≈ (1/2)^80, while the data and the fit would be clearly compatible • Well, we didn't ask the right question • Lmax gives the probability of this EXACT outcome, which is of course very unlikely • We wanted to know "can something like this happen?"
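To put a number on this overestimate (simple arithmetic, not from the slide):

$$ L_{\max} \lesssim \left(\tfrac{1}{2}\right)^{80} = 2^{-80} \approx 8\times 10^{-25}, $$

a vanishingly small number even for a perfectly good fit, which is why the magnitude of L alone is not the right thing to look at.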
Hypothesis Testing • We are trying to figure out whether the minimum we got makes sense or not • Is it artificial, or do we have a reasonable degree of belief that the numbers at the minimum make sense? • Ask a different (kind of opposite) question: • If the function with the parameters you picked as best is true, how probable is it to see the data you actually saw? • Studying how the data should look seems easy: • Take the function and generate fake "pseudo-data" using a Monte Carlo generator following the function (see the sketch below) • Roughly speaking, if the real data resembles the function about as well as most of the "fake data" does, you are presumably in good shape
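A minimal sketch of what "generating pseudo-data following the function" means in the binned case: take the expected bin contents given by the function and let each bin fluctuate statistically (Poisson). The numbers here are invented.

```python
# Pseudo-data sketch: Poisson-fluctuate the expected bin contents given by the function.
import numpy as np

rng = np.random.default_rng(7)
expected = np.array([12.0, 15.5, 30.2, 55.0, 31.1, 14.8, 12.3])   # f(x_i; theta) in each bin (made up)
pseudo_data = rng.poisson(expected)                               # one fake dataset
print(pseudo_data)
```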
Hypothesis Testing • But how do you quantify whether a particular set of fake data looks more function-like than the true data or not? • You can use the likelihood value you calculated: L = ∏i f(xi; θ), where all parameters are taken at the position of the maximum of L • For any function and data it's a single number, so it is easy to compare • You could then build a scheme that looks like this (a sketch follows below): • Calculate L using real data at the minimum, call it L0 • Generate fake data according to the function, calculate L and check if L > L0 • Repeat the above step a million times • If 90% of the time the fake data gives L > L0, that's good • If 99% of the time the fake data gives you L > L0, that's not good (the real data looks less function-like than almost all of the fake data)
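A self-contained sketch of this scheme, with all numbers invented and the "function" reduced to a fixed set of expected bin contents for brevity: generate many fake datasets from the expectation, compute the same likelihood for each, and see how often it comes out above the value L0 obtained from the real data.

```python
# Pseudo-experiment scheme sketch: compare L0 (likelihood of the real data under the
# best-fit expectation) with the likelihoods of many fake datasets from that expectation.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
expected = np.array([10.0, 14.0, 25.0, 60.0, 27.0, 15.0, 11.0])   # best-fit expectation per bin (made up)
observed = np.array([8, 17, 22, 71, 30, 12, 9])                   # the "real" data (made up)

def log_L(counts, lam):
    # log of the joint Poisson probability: log prod_i Pois(n_i; lambda_i)
    return poisson.logpmf(counts, lam).sum(axis=-1)

logL0 = log_L(observed, expected)

n_toys = 100_000
toys = rng.poisson(expected, size=(n_toys, len(expected)))        # fake datasets from the function
logL_toys = log_L(toys, expected)

frac_above = np.mean(logL_toys > logL0)   # how often fake data looks MORE function-like than real data
p_value = np.mean(logL_toys < logL0)      # fraction of toys even less likely than the real data
print(f"fraction of toys with L > L0: {frac_above:.3f}   p-value: {p_value:.3f}")
```

In the logic of the scheme above, a result like "90% of the toys have L > L0" (p-value around 10%) is fine, while "99% of the toys have L > L0" (p-value around 1%) signals trouble.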
The p-value • What we just calculated is called the p-value • The probability, assuming the function is correct, for the data to look even less likely than what we actually observed • Calculate the p-value: • Pseudoexperiment: take your function (with "best" parameters) and "simulate" "data" according to the function (allowing statistical fluctuations); every time calculate and record the likelihood L • Do 1,000,000 pseudoexperiments and check how often you get L lower than the Lmax you actually observed • A small p-value (say less than 1%) tells you that your data does not follow the function you are trying to describe it with • When searching for small signals, one can use the p-value calculated with pseudoexperiments following the "background only" model: a small p-value tells you that the data does not like the "background only" model • Caveat: as a single number, such a p-value does not tell you whether adding a signal helps
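In formula form (standard notation, not copied from the slides), the p-value estimated this way is the fraction of pseudoexperiments whose likelihood comes out below the observed one:

$$ p \;=\; P\big(L < L_{\mathrm{obs}} \mid \text{model}\big) \;\approx\; \frac{N_{\text{pseudo}}(L < L_{\mathrm{obs}})}{N_{\text{pseudo}}} . $$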
Hypothesis Testing • What we have described works even better if you have a fixed "hypothesis" (no free parameters) and want to check whether the data is consistent with that hypothesis • In the initial fit we found the "best" parameters and took the corresponding likelihood as L0; in the pseudo-experiments we never had this additional flexibility to pick something that is "best" • But for a fixed hypothesis there is never such a problem • Example: • You stare at the data and you have a well defined background prediction (say from MC simulation) • You suspect there may be a bump in the data, but you are not sure • A good question to ask is "does the data look like the prediction?" – in other words, what is the p-value?
Signal or No Signal? • When you start looking for the Higgs, you don't know its mass or cross-section • Comparing the data and the background prediction is great, but it only tells you whether the data and the background expectation look alike or not • If something fishy is going on, you will see a low p-value, which tells you that the data does not look like the background prediction • But it does not tell you whether it looks like you may have found a Higgs (maybe it's something else in there that makes the data and prediction disagree) • Need to answer the question "does it look more like X or more like Y?" • If both X and Y are fixed (X is background only, Y is background plus a signal with known mass and cross-section), one could probably calculate two p-values and then make a judgment • The caveat is that in all real searches you almost never know the mass – what do you do then?
Hypothesis Testing – Unknown Mass • Say I don't know the mass, but I think I know the cross-section for each mass • As if I believe the SM predictions and look for the SM Higgs • How do you account for the unknown mass? • Proposal #1: • Calculate p-values for Background and Background+Signal for every possible value of the Higgs mass (say you scan over 100 values) • One p-value for Background only, telling you that the data does not look like background (say p=0.001%), and 100 p-values, one per mass, each telling you something different (most are small, like p=0.001-1%, but the one at 140 is p=40%) • One issue is having two p-values, but there is another one too
P-value for Comparing Hypotheses • So far you were using L to say whether something is function-like or not • Strictly speaking, you could have picked a different metric; L is not the only possible choice • When comparing hypotheses, a good choice is the ratio of the joint p.d.f.'s • It tells you for each dataset whether it is more H0-like or more H1-like • When you do the pseudo-experiments for calculating the p-value, you still generate data according to the background model (if you are determining whether something is background-like); you just use this statistic as the metric for deciding whether the pseudo-data is more "alike" than the true data or not • One can still define a p-value: p=1% tells you that in only 1% of the cases would the data, if it truly follows H0, look like yours • If 1% was your threshold, you reject the hypothesis H0 in favor of H1 • Btw, what if both are wrong?
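Written out (a common convention; the lecture's exact definition may differ in details such as signs or factors of 2), for hypotheses H0 (e.g. background only) and H1 (e.g. background plus signal) the test statistic is the likelihood ratio

$$ Q(\text{data}) \;=\; \frac{L(\text{data}\mid H_1)}{L(\text{data}\mid H_0)}\,, \qquad \text{often used in the form } -2\ln Q . $$

The p-value is then the fraction of H0 pseudo-experiments whose Q is at least as H1-like as the one observed in the real data.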
Unknown Mass • Say we did define this relative p-value and calculated it for each mass • Note: the plot below is something slightly different in that it also allows for an unknown cross-section of the signal • There is a clear bump somewhere near 125 with a p-value of ~1% • Does it mean that there is only a 1% chance that this is background and a 99% chance that this is the Higgs?
A Caveat: Combinatorial Trial Factor • Also called the "Look Elsewhere Effect" or LEE: • The local p-value tells us how significant the deviation is at this specific point • This would be a correct estimate of the signal significance if we had known the Higgs mass ahead of time • But we didn't • It is like looking at 1,000 data plots: even if all of them truly came from their expected distributions, on average one of them will appear only 0.1% probable • [Figure: "From the X-Files of HEP experimentalists: a bump that turned out not to be real"]
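The 1,000-plots statement is simple arithmetic (assuming the plots are independent, which is an idealization):

$$ \langle N_{\text{plots with } p \le 0.001} \rangle = 1000 \times 0.001 = 1, \qquad P(\text{at least one}) = 1 - (1 - 0.001)^{1000} \approx 63\% . $$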
Combinatorial Trial Factors • Our p-value ignores the fact that every time something jumps up, it looks more like a Higgs and less like background • Need to account for that in your pseudoexperiments • Hence the word "local" in the bottom plot • When you do the pseudo-experiments, you should also try all sorts of masses, just like in data, to see how badly the data can deviate from the prescribed expectation even when it truly follows that expectation (a sketch follows below) • This means more Monte Carlo pseudo-experiments when calculating the p-value
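A deliberately simplified, self-contained sketch of this idea, with all numbers invented (a real analysis would refit signal plus background at every mass rather than look at single bins): in each pseudo-experiment generated from the background-only expectation, scan all mass points, keep the most signal-like local p-value, and compare its distribution with the smallest local p-value seen in the data.

```python
# Look-elsewhere correction sketch: the "global" p-value is the fraction of background-only
# toys whose BEST local p-value (over all mass points) is at least as small as the observed one.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(3)
background = np.full(100, 20.0)                    # flat background expectation in 100 "mass" points
observed = rng.poisson(background)
observed[37] += 18                                 # put an artificial bump into the "data"

def local_p_values(counts, bkg):
    # per-point probability to see at least this many events from background alone
    return poisson.sf(counts - 1, bkg)

p_local_obs = local_p_values(observed, background).min()

n_toys = 20_000
toys = rng.poisson(background, size=(n_toys, len(background)))
p_local_toys = local_p_values(toys, background).min(axis=1)   # best local p-value in each toy

p_global = np.mean(p_local_toys <= p_local_obs)               # trials-corrected ("global") p-value
print(f"smallest local p-value: {p_local_obs:.2e}   global p-value: {p_global:.3f}")
```

The global p-value comes out much larger than the smallest local one; the difference is exactly the trials (look-elsewhere) penalty the slide is warning about.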