Learn how to estimate probabilities when the base probability is unknown in research experiments. Explore methods for comparing probabilities between groups to verify hypotheses effectively.
What is the Probability? What Happens When We Don’t Know p?
• We are typically comparing a hypothesis against a null hypothesis
• In the examples we have given, the probability p of a “hit” under the null hypothesis is assumed to be known
What do we do if we don’t know what p is?
• You might think you could simply look up the probability p for any given ailment, phenomenon, etc. There are several problems with this:
  • The base probability may be poorly researched or understood
  • The population from which you are picking your experimental group may be uncharacteristic
    • Age, race, geography, etc.
  • The base probability may even change with time
The Solution is Two (Or More) Groups
• Instead of assuming you know the base probability, simply measure it in your experiment
• Assign everyone randomly to the experimental group or the control group
• Use the control group to determine the base probability p
• Then compare with the experimental group
• If the difference is significant, then you have verified your hypothesis
Sometimes the relevant question isn’t how well something works, but whether it works better than something else
• I don’t want to know if a new medical treatment has any effect; I want to know if it is better than the standard treatment
• This time, you have two experimental groups (or two plus a control group)
• You can then compare the probabilities for the two groups
Typical Results from Experiments Where We Don’t Know the Probability
• Comparing against the null hypothesis:
• Comparing one property vs. another:
• Comparing two properties to each other and to the null hypothesis:
• How do we analyze these?
Estimating the Probabilities
• Suppose a single group has N samples in it, and some property happens x times
• What is the probability p that the property occurs?
• You might think that the probability p is simply p = x/N
• However, we know this isn’t necessarily exactly right, due to randomness
• But it is not a bad guess of the probability
• If we knew the probability p, then we would have x̄ = Np
• Where we know σ_x = √(Np(1 − p))
• Turn the formula x̄ = Np around to solve for p: divide by N to get p = x̄/N
• Then use σ_x = √(Np(1 − p)) to find σ_p = σ_x/N = √(p(1 − p)/N)
Probability and Its Uncertainty
• Note that the uncertainty in x grows only as √N, while x̄ itself grows as N
• Therefore, as N becomes large, the uncertainty in p gets small
• And hence p̂ = x/N will be a good estimate of the probability
• We will treat this as the mean of the probability
• We would also like to estimate the uncertainty in the probability
• We don’t know the probability, but we know a good estimate of it
• So we have σ_p = √(p(1 − p)/N) ≈ √(p̂(1 − p̂)/N)
• So in summary, p̂ = x/N and σ_p = √(p̂(1 − p̂)/N)
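The estimate p̂ = x/N and its uncertainty √(p̂(1 − p̂)/N) can be sketched in a few lines of Python (the function name and the example numbers are illustrative, not from the slides):

```python
import math

def estimate_probability(x, n):
    """Estimate the probability p and its uncertainty from x hits in n trials."""
    p_hat = x / n                                    # best estimate of p
    sigma_p = math.sqrt(p_hat * (1 - p_hat) / n)     # uncertainty in that estimate
    return p_hat, sigma_p

# e.g. 40 hits out of 100 trials
p, dp = estimate_probability(40, 100)
print(f"p = {p:.3f} +/- {dp:.3f}")
```

Note that quadrupling n with the same hit fraction halves the uncertainty, reflecting the 1/√N behavior described above.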
Warning: Ignore This Slide
The Problem with What We Have Been Doing
• The formulas we have been using implicitly assume N is large
• To get these formulas right for finite N, you have to make additional assumptions
• What is the prior probability of different probabilities, i.e., are all probabilities equally likely, or are some more likely than others?
• The results depend on the assumptions
• For example, if you assume that all probabilities p are equally likely, then you would find p̂ = (x + 1)/(N + 2), with σ_p² = (x + 1)(N − x + 1)/[(N + 2)²(N + 3)]
• These formulas are equivalent to the ones above if N is large
• We will use the formulas above and ignore this problem
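For a uniform prior on p, the finite-N estimate is the classic Laplace rule of succession, p̂ = (x + 1)/(N + 2). A small sketch (my own numbers) showing that it converges to the simple x/N estimate as N grows:

```python
def laplace_estimate(x, n):
    """Mean of p under a uniform prior (Laplace's rule of succession)."""
    return (x + 1) / (n + 2)

def simple_estimate(x, n):
    """Large-N estimate used in the main formulas."""
    return x / n

# For small N the two differ noticeably; for large N they agree.
print(laplace_estimate(4, 10), simple_estimate(4, 10))        # ~0.417 vs 0.4
print(laplace_estimate(400, 1000), simple_estimate(400, 1000))  # ~0.400 vs 0.4
```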
Sample Problem
In the table at right is the total number of winter Olympic medals won by the U.S. in each of the past three winter Olympics, and the total number of winter Olympic medals awarded. Determine the probability, and its uncertainty, in each case, of the U.S. winning an Olympic medal that year.
• We start with the formula for the most likely probability, p̂ = x/N
• Now calculate the uncertainties from σ_p = √(p̂(1 − p̂)/N)
Comparing Two Probabilities
How Do We Make a Comparison?
• We typically have two (or more) estimates of the probability, one for each group
• One or more experimental groups, often plus a control group
• We can see that the estimates of the probabilities are not identical
• Is the difference in probabilities due to chance, or is there something causing it?
• We need to compare two groups, each with an estimated probability and an uncertainty in that estimate
• We want to know if p1 > p2
• This is the same as asking if p1 − p2 > 0
• We already have rules for subtracting distributions: the difference has mean p̂1 − p̂2 and uncertainty √(σ1² + σ2²)
z-Values for Comparing Two Probabilities
• We want to know if the difference p̂1 − p̂2 is consistent with zero (the null hypothesis – nothing going on here)
• More precisely, we want to know how many standard deviations this difference is from zero
• The answer is the central value divided by the uncertainty, so z = (p̂1 − p̂2)/√(σ1² + σ2²)
• You then interpret this z-value the same way you would have before:
  • A z-value below 2 is pretty much statistically meaningless
  • A z-value larger than 5 is almost certainly statistically meaningful
  • A z-value between 2 and 5 can be cited as somewhere in between
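The z-value for two measured probabilities can be sketched as follows (function name and counts are illustrative):

```python
import math

def z_two_probabilities(x1, n1, x2, n2):
    """z-value for the difference between two measured probabilities."""
    p1, p2 = x1 / n1, x2 / n2
    s1 = math.sqrt(p1 * (1 - p1) / n1)   # uncertainty in p1
    s2 = math.sqrt(p2 * (1 - p2) / n2)   # uncertainty in p2
    return (p1 - p2) / math.sqrt(s1**2 + s2**2)

# e.g. 60/100 hits in the experimental group vs 40/100 in the control group
print(round(z_two_probabilities(60, 100, 40, 100), 2))
```

A result near 2.9 here would fall in the “somewhere in between” band described above: suggestive, but short of the 5-sigma standard.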
Converting z-Values into Confidence Limits
• Sometimes we want to say how confident we are that the two probabilities are different
• In the limit of large N, the distribution of probabilities will be Gaussian
• So we can use the same tables we had before
• We might postulate that one probability is bigger than, smaller than, or simply different from the other
• In the third case (different = bigger or smaller), we must double the probabilities, just like before (two-sided distributions)
Sample Problem
In the table at right is the total number of winter Olympic medals won by the U.S. in each of the past three winter Olympics, and the total number of winter Olympic medals awarded. Is the success of the USA changing over time?
• Before looking at the data, I have no idea whether to expect the USA to be improving or getting worse in the Olympics
• Therefore, we will use a two-sided distribution, allowing for both possibilities
• First we calculate the probabilities and uncertainties as shown in the table above
• We can now compare any two years, or even combine years
• The obvious choice to search for a trend is to compare 2010 and 2018
• We find the z-value using our formula, z = (p̂_2018 − p̂_2010)/√(σ_2018² + σ_2010²)
Sample Problem (2)
Is the success of the USA changing over time?
• We now look up the probability in a table
• Since we only postulated that it was changing (not specifically up or down), we have to double the probabilities to give two-sided probabilities
• Interpolating, we estimate the probability at z = 2.55 as a 1.08% likelihood of occurring by chance
• We can therefore rule out the null hypothesis at the 98.9% confidence level
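The table lookup can be reproduced with the standard normal CDF, using only the standard library via math.erf. With the slide’s value z = 2.55, this recovers the quoted 1.08% two-sided probability:

```python
import math

def two_sided_p(z):
    """Probability of being at least |z| standard deviations from zero
    by chance, under a standard normal (two-sided)."""
    phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # one-sided CDF at z
    return 2 * (1 - phi)                          # double for two-sided

p = two_sided_p(2.55)
print(f"{100 * p:.2f}%")   # about 1.08%, i.e. ~98.9% confidence
```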
Statistics for Rare Incidents
Some Simplifications Occur
• Suppose you don’t know what the probability is, but you know it’s rare
  • Probability of getting hit by a meteorite
  • Probability of winning > $10,000 in the lottery
  • Probability of snow falling in Winston-Salem on any given day in March
• Then we can assume p is small, so to a good approximation, 1 − p ≈ 1
• Look at the formulas for the mean and standard deviation: x̄ = Np and σ = √(Np(1 − p))
• Therefore, we can approximate σ ≈ √(Np) = √x̄
• If we somehow know x̄, we can get σ without knowing N or p
• Even when p isn’t tiny (say, p < 10%), this will only give small errors (< 5%)
Getting x̄ and σ from x
For Rare Events
• Turn the equation σ ≈ √x̄ around
• First, note that for large x̄, the error √x̄ is much smaller than x̄ itself
• Hence, to a crude approximation, x̄ = x
• And the error on x̄ is about σ ≈ √x
• Summarizing: x̄ ≈ x and σ ≈ √x
Warning: Ignore Rest of This Slide
• The formulas we have been using implicitly assume N is large
• To get these formulas right for finite N, you have to make additional assumptions
• For example, if you assume that all means are equally likely, these formulas get changed to x̄ = x + 1 and σ = √(x + 1)
• We will use the formulas above and ignore this problem
Comparing Two Values for Significance
• Suppose you have two samples that differ in some way, one with x1 incidences and the other with x2 incidences of some metric we are interested in
• We want to know if the difference between x1 and x2 is significant
• We first calculate each of the means and the uncertainties in the means: x̄_i ≈ x_i and σ_i ≈ √x_i
• We want to know if they are different, i.e., is the difference between them significant?
• How many standard deviations is this difference away from zero? z = (x1 − x2)/√(x1 + x2)
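A minimal sketch of this rare-event comparison (the counts used in the example are made up for illustration):

```python
import math

def z_rare_counts(x1, x2):
    """z-value for the difference between two rare-event counts.
    Uses sigma ~ sqrt(x) for each count, valid when p is small."""
    return (x1 - x2) / math.sqrt(x1 + x2)

# Hypothetical: 150 incidents in one sample vs 110 in the other
print(round(z_rare_counts(150, 110), 2))
```

A value near 2.5 here would sit in the inconclusive 2-to-5 band; as the cautionary note later in this section explains, even a large z can simply reflect unequal population sizes.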
How Significant Is This Difference?
• Depending on the nature of our hypothesis, we can calculate the probability that this occurred by chance
• We might be using one- or two-sided distributions
• We then cite a confidence level for our conclusions
Warning: Ignore Rest of This Slide
• For small numbers, these formulas really aren’t right
• The result depends on assumptions
• We will ignore this and use the formulas above
Sample Problem
Has the number of murders in NYC changed substantially over the last decade?
• We have a lot of data to compare
• We could compare any one year to any other year
• To get more statistical power, it is a good idea to bin the data and combine several years
• A logical way to do this is to combine them into two equal-sized bins
• We now compare these numbers using z = (x1 − x2)/√(x1 + x2)
• This is a very significant change (better than 99.9999% confidence)
• The murder rate is dropping in New York City
Cautionary Note on Low-Probability z-Values
• Unlike the other two cases (known probability, or comparing two probabilities), these z-values do not take into account the fact that the populations of the two distributions may be different
• Hence, a large z-value may mean something, but that something could just be that one of the groups is larger
Sample Problem
In 2017, there were 116 homicides in Washington DC and 290 in NYC. Is the difference significant?
• z = (290 − 116)/√(290 + 116) ≈ 8.6, which is very significant
• But it just represents the fact that NYC is far bigger than DC
Summary of Statistics
The Formulas
Case 1: Probability p is known for the null hypothesis
• The mean and standard deviation are given by x̄ = Np and σ = √(Np(1 − p))
• And the z-value is given by z = (x − Np)/√(Np(1 − p))
Case 2: Probabilities for two groups are being measured
• The probabilities’ means and standard deviations are given by p̂_i = x_i/N_i and σ_i = √(p̂_i(1 − p̂_i)/N_i)
• And the z-value for the difference between two groups is z = (p̂1 − p̂2)/√(σ1² + σ2²)
Case 3: Comparing two groups with low probabilities
• The mean and standard deviation for each group are about x̄_i ≈ x_i and σ_i ≈ √x_i
• And the z-value to compare the two groups is z = (x1 − x2)/√(x1 + x2)
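Case 1, the known-p null hypothesis, can be sketched as follows (the trial counts in the example are illustrative):

```python
import math

def z_known_p(x, n, p):
    """z-value of x observed hits in n trials against a known null-hypothesis p."""
    mean = n * p                       # expected number of hits
    sigma = math.sqrt(n * p * (1 - p)) # expected spread in the number of hits
    return (x - mean) / sigma

# e.g. 60 hits in 100 trials against a null hypothesis of p = 0.5
print(round(z_known_p(60, 100, 0.5), 1))  # 2.0
```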
Size of Studies and z-Values
How does the statistical significance change as we increase the size of the study?
• It is probably easiest to look at the second case, when estimating probabilities
• For non-random effects, if you increase N, the probabilities should stay the same
• But the errors on the probabilities will fall in proportion to 1/√N
• Now look at the formula for the z-value, z = (p̂1 − p̂2)/√(σ1² + σ2²)
• The numerator stays the same, but the denominator falls as 1/√N
• This means that the z-value rises as √N
• Hence, the larger the study, the more significant your results will be
• Rule of thumb: if you want to see changes in probability of size Δp and make them significant (z of at least 2), you will probably need a sample size of at least roughly N ~ 8p(1 − p)/(Δp)² per group
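The √N growth of the z-value can be checked numerically. Holding the measured probabilities fixed at 0.55 vs 0.50 (my own illustrative numbers) and quadrupling N should double z each time:

```python
import math

def z_two_probabilities(x1, n1, x2, n2):
    """z-value for the difference between two measured probabilities."""
    p1, p2 = x1 / n1, x2 / n2
    var1 = p1 * (1 - p1) / n1
    var2 = p2 * (1 - p2) / n2
    return (p1 - p2) / math.sqrt(var1 + var2)

# Same probabilities, increasing group size N
for n in (100, 400, 1600):
    z = z_two_probabilities(int(0.55 * n), n, int(0.50 * n), n)
    print(n, round(z, 2))   # z doubles each time N quadruples
```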
Low z-Values
• So we’ve done an experiment, we’ve measured the probabilities, and we get a z-score
• What should we conclude if, say, z = 1.6?
• We certainly would be wrong to conclude that our hypothesis is correct
• But we might also be wrong to conclude our hypothesis is excluded
• It could be that z = 1.6 is due entirely to chance
  • In which case, doing a larger study won’t make it any bigger
  • It could get smaller, or disappear entirely
• It could be that z = 1.6 is due entirely to a real effect that is simply smaller than we had hoped
  • If so, since z grows as √N, we could make our study four times as big, and we’d expect z ~ 3.2
  • Or we could make our study ten times as big, and we’d expect z ~ 5.1
Low z-Values and False Negatives
• When z-values are too low, we should say that the hypothesis is not supported
• This is not the same thing as wrong
• The most we can conclude is that the effect of the independent variable is small
• Example: if our horoscope test comes out negative, we can conclude that these horoscopes are not reliable indicators of personality
  • It is still possible that the horoscopes are weak indicators of personality
• If the hypothesis is well supported otherwise (and well funded, and important), then it might be worth investigating in a larger study
• If the effects are sufficiently small, we may no longer care
High z-Values and False Positives
• So we’ve done an experiment, we’ve measured the probabilities, and we get a z-score
• What should we conclude if, say, z = 6.0?
• It is very unlikely that this is due to random fluctuations
• It is definitely fine to say that your hypothesis is supported
• It is wrong to say that your hypothesis is proven
• There are lots of ways that errors can creep in to make a fake effect look real
• These generically go under the name of “systematic errors”
• They will be discussed in the next section on the scientific method
Dr. Carlson’s Description of the Scientific Method
7. Check for errors and repeat steps 4–6 as needed