Measuring Evidence with p-values

Section 4.2 Measuring Evidence with p-values

Question of the Day Does drinking tea boost your immune system?

Tea and Immune Response • Participants were randomized to drink five or six cups of either black tea or coffee every day for two weeks (both drinks have caffeine, but only tea has L-theanine) • After two weeks, blood samples were exposed to an antigen, and production of interferon gamma (immune system response) was measured • Explanatory variable: tea or coffee • Response variable: immune system response • Does drinking tea actually boost your immunity? Antigens in tea-Beverage Prime Human Vγ2Vδ2 T Cells in vitro and in vivo for Memory and Non-memory Antibacterial Cytokine Responses, Kamath et al., Proceedings of the National Academy of Sciences, May 13, 2003.

Tea and the Immune System If the tea drinkers have sufficiently higher levels of immune system response, can we conclude that drinking tea rather than coffee caused an increase in this aspect of the immune response? • Yes • No A randomized experiment allows conclusions about causality.

Review µT = mean immune system response after drinking tea; µC = mean immune system response after drinking coffee. Does drinking tea boost immunity? The relevant hypotheses are: • H0: µT > µC, Ha: µT = µC • H0: µT < µC, Ha: µT = µC • H0: µT = µC, Ha: µT > µC • H0: µT = µC, Ha: µT < µC • H0: µT = µC, Ha: µT ≠ µC

Tea and Immune System The explanatory variable is tea or coffee, and the response variable is immune system response measured in amount of interferon gamma produced. How could we visualize this data? • Bar chart • Histogram • Side-by-side boxplots • Scatterplot One categorical and one quantitative

Tea and Immune System

Two Plausible Explanations • Why might the tea drinkers have higher levels of immune system response? • Two plausible explanations: • Alternative true: Tea drinkers have higher immune system responses than coffee drinkers • Null true, random chance: the people who got randomly assigned to the tea group have better immune systems than those who got randomly assigned to the coffee group

The Plausibility of the Null • The goal is to determine whether the null hypothesis and random chance are a plausible explanation, given the observed data • What kinds of statistics might we get, just by random chance, if the null hypothesis were true?

Actual Experiment 1. Randomize units to treatment groups 2. Conduct experiment 3. Measure response variable 4. Calculate statistic [Figure: units (R) randomized to Tea and Coffee, shown with their measured responses. Tea: 5, 11, 13, 18, 20, 47, 48, 52, 55, 56, 58. Coffee: 0, 0, 3, 11, 15, 16, 21, 21, 38, 52.]

Actual Experiment • Two plausible explanations: • Tea boosts immunity • Random chance What might happen just by random chance?

Simulation Repeat many times: 1. Re-randomize units to treatment groups 2. Calculate statistic [Figure: the 21 measured responses pooled and re-randomized into new Tea and Coffee groups.]

Distribution of Statistic Under H0 How extreme is the observed statistic??? Is the null hypothesis a plausible explanation? (Note: you shouldn’t be able to answer this question quite yet, but should be thinking about why this would or wouldn’t convince you to reject the null as a plausible explanation)

Randomization Distribution A randomization distribution is a collection of statistics from samples simulated assuming the null hypothesis is true • The randomization distribution shows what types of statistics would be observed, just by random chance, if the null hypothesis were true
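The re-randomization idea can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the slides: the responses are the values shown on the "Actual Experiment" slides, while the function names, the number of repetitions, and the seed are assumptions made for the example.

```python
import random

# Interferon gamma responses as shown on the "Actual Experiment" slides.
tea = [5, 11, 13, 18, 20, 47, 48, 52, 55, 56, 58]
coffee = [0, 0, 3, 11, 15, 16, 21, 21, 38, 52]


def mean(values):
    return sum(values) / len(values)


def diff_in_means(group_a, group_b):
    """Statistic: mean of group_a minus mean of group_b."""
    return mean(group_a) - mean(group_b)


def randomization_distribution(group_a, group_b, reps=10_000, seed=0):
    """Re-randomize the pooled responses into two groups of the original
    sizes and record the statistic each time. Shuffling the labels is what
    makes the simulation consistent with H0 (no difference between groups).
    reps and seed are arbitrary choices for this sketch."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    stats = []
    for _ in range(reps):
        rng.shuffle(pooled)
        stats.append(diff_in_means(pooled[:n_a], pooled[n_a:]))
    return stats


observed = diff_in_means(tea, coffee)
rand_stats = randomization_distribution(tea, coffee)
print(f"observed difference in means: {observed:.2f}")
print(f"center of randomization distribution: {mean(rand_stats):.2f}")
```

The simulated statistics should cluster near 0, the difference specified by the null hypothesis, while the observed difference sits well out in the upper tail.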

Green Tea and Prostate Cancer • A study was conducted on 60 men with PIN lesions, some of which turn into prostate cancer • Half of these men were randomized to take 600 mg of green tea extract daily, while the other half were given a placebo pill • The study was double-blind: neither the participants nor the doctors knew who was actually receiving green tea • After one year, only 1 person taking green tea had gotten cancer, while 9 taking the placebo had gotten cancer

Green Tea and Prostate Cancer The explanatory variable is green tea extract or placebo; the response variable is whether or not the person developed prostate cancer. What statistic and parameter are most relevant? • Mean • Proportion • Difference in means • Difference in proportions • Correlation Two categorical variables

Green Tea and Prostate Cancer p1 = proportion of green tea consumers to get prostate cancer; p2 = proportion of placebo consumers to get prostate cancer. State the null hypothesis. • H0: p1 = p2 • H0: p1 < p2 • H0: p1 > p2 • H0: p1 ≠ p2 The null hypothesis always includes an equals sign.

Green Tea and Prostate Cancer p1 = proportion of green tea consumers to get prostate cancer; p2 = proportion of placebo consumers to get prostate cancer. State the alternative hypothesis. • Ha: p1 = p2 • Ha: p1 < p2 • Ha: p1 > p2 • Ha: p1 ≠ p2 The alternative hypothesis is the claim the researchers are seeking evidence for.

Randomization Test • State hypotheses • Collect data • Calculate statistic: • Simulate statistics that could be observed, just by random chance, if the null hypothesis were true (create a randomization distribution) • How extreme is the observed statistic? Is the null hypothesis (random chance) a plausible explanation?
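For the green tea study, the "calculate statistic" and "simulate" steps above might look like the following sketch. The counts (1 of 30 and 9 of 30) come from the study description; the 0/1 coding, function names, repetition count, and seed are illustrative assumptions.

```python
import random

# Outcomes from the study description: 1 = developed cancer, 0 = did not.
green_tea = [1] * 1 + [0] * 29   # 1 of 30 men taking green tea extract
placebo = [1] * 9 + [0] * 21     # 9 of 30 men taking the placebo


def diff_in_proportions(group_1, group_2):
    """Statistic: sample proportion in group_1 minus that in group_2."""
    return sum(group_1) / len(group_1) - sum(group_2) / len(group_2)


def simulate_null(group_1, group_2, reps=5_000, seed=1):
    """Reassign the 60 outcomes to two groups of 30 at random, as if the
    treatment had no effect, and record the statistic each time.
    reps and seed are arbitrary choices for this sketch."""
    rng = random.Random(seed)
    pooled = list(group_1) + list(group_2)
    n_1 = len(group_1)
    stats = []
    for _ in range(reps):
        rng.shuffle(pooled)
        stats.append(diff_in_proportions(pooled[:n_1], pooled[n_1:]))
    return stats


observed = diff_in_proportions(green_tea, placebo)
null_stats = simulate_null(green_tea, placebo)
center = sum(null_stats) / len(null_stats)
print(f"observed difference in proportions: {observed:.3f}")
print(f"center of randomization distribution: {center:.3f}")
```

The observed statistic is 1/30 − 9/30, and the simulated statistics should center near 0, the value of p1 − p2 under the null hypothesis.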

Randomization Distribution Based on the randomization distribution, would the observed statistic of -0.267 be extreme if the null hypothesis were true? • Yes • No

Randomization Distribution Do you think the null hypothesis is a plausible explanation for these results? • Yes • No

Randomization Distribution In a hypothesis test for H0: µ = 12 vs Ha: µ < 12, we have a sample with n = 45 and x̄ = 10.2. What do we require about the method to produce randomization samples? • µ = 12 • µ < 12 We need to generate randomization samples assuming the null hypothesis is true.

Randomization Distribution In a hypothesis test for H0: µ = 12 vs Ha: µ < 12, we have a sample with n = 45 and x̄ = 10.2. Where will the randomization distribution be centered? • 10.2 • 12 • 45 • 1.8 Randomization distributions are always centered around the null hypothesized value.

Randomization Distribution Center A randomization distribution is centered at the value of the parameter given in the null hypothesis. • A randomization distribution simulates samples assuming the null hypothesis is true, so the simulated statistics center on the null value
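One way to see why the center lands on the null value is to simulate a single-mean test. The slides do not say how the randomization samples are produced, so the sketch below assumes one common approach: shift the sample so its mean equals the null value, then resample with replacement. The data are hypothetical (the actual 45 values are not given), as are the seed and repetition count.

```python
import random

NULL_MEAN = 12  # value of mu under H0: mu = 12

# Hypothetical sample of n = 45 values; the real data are not in the slides,
# so this sample mean will only be roughly 10.2.
rng = random.Random(2)
sample = [rng.gauss(10.2, 4.0) for _ in range(45)]
sample_mean = sum(sample) / len(sample)

# Shift every value so the sample mean equals the null value. Resampling
# from the shifted data then mimics sampling from a population where H0 holds.
shifted = [x + (NULL_MEAN - sample_mean) for x in sample]

rand_means = []
for _ in range(5_000):
    resample = rng.choices(shifted, k=len(shifted))  # with replacement
    rand_means.append(sum(resample) / len(resample))

center = sum(rand_means) / len(rand_means)
print(f"sample mean: {sample_mean:.2f}")
print(f"center of randomization distribution: {center:.2f}")
```

Whatever the original sample mean is, the simulated means center near 12, because the simulation was built to satisfy the null hypothesis.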

Randomization Distribution In a hypothesis test for H0: µ = 12 vs Ha: µ < 12, we have a sample with n = 45 and x̄ = 10.2. What will we look for on the randomization distribution? • How extreme 10.2 is • How extreme 12 is • How extreme 45 is • What the standard error is • How many randomization samples we collected We want to see how extreme the observed statistic is.

Randomization Distribution In a hypothesis test for H0: µ1 = µ2 vs Ha: µ1 > µ2, we have a sample with x̄1 = 26, x̄2 = 21. What do we require about the method to produce randomization samples? • µ1 = µ2 • µ1 > µ2 • x̄1 = 26, x̄2 = 21 We need to generate randomization samples assuming the null hypothesis is true.

Randomization Distribution In a hypothesis test for H0: µ1 = µ2 vs Ha: µ1 > µ2, we have a sample with x̄1 = 26, x̄2 = 21. Where will the randomization distribution be centered? • 0 • 1 • 21 • 26 • 5 The randomization distribution is centered around the null hypothesized value, µ1 − µ2 = 0.

Randomization Distribution In a hypothesis test for H0: µ1 = µ2 vs Ha: µ1 > µ2, we have a sample with x̄1 = 26, x̄2 = 21. What do we look for on the randomization distribution? • The standard error • The center point • How extreme 26 is • How extreme 21 is • How extreme 5 is We want to see how extreme the observed difference in means is.

Back to Tea vs Coffee… • What do we do when the “extremity” of the observed statistic isn’t obvious? • We need a formal way of measuring how extreme a statistic would be, if H0 were true…

p-value The p-value is the proportion of samples, when the null hypothesis is true, that would give a statistic as extreme as (or more extreme than) the observed sample.

Tea vs Coffee [Figure: distribution of the statistic if H0 is true, with the observed statistic marked and the proportion as extreme as the observed statistic shaded as the p-value.] If there is no difference between tea and coffee regarding immunity, we would see results this extreme only 26 out of 1000 times (p-value = 0.026).

Calculating a p-value • What kinds of statistics would we get, just by random chance, if the null hypothesis were true? (randomization distribution) • What proportion of these statistics are as extreme as our original sample statistic? • (p-value)
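The two questions above translate directly into code: build the randomization distribution, then count the proportion of simulated statistics at least as extreme as the observed one. The sketch below does this for the tea and coffee responses shown on the earlier slides; the function names, repetition count, and seed are assumptions, and because the result is simulated it will vary slightly from run to run and from the figure quoted in the slides.

```python
import random

# Interferon gamma responses as shown on the "Actual Experiment" slides.
tea = [5, 11, 13, 18, 20, 47, 48, 52, 55, 56, 58]
coffee = [0, 0, 3, 11, 15, 16, 21, 21, 38, 52]


def diff_in_means(a, b):
    return sum(a) / len(a) - sum(b) / len(b)


def upper_tail_p_value(a, b, reps=10_000, seed=3):
    """Proportion of re-randomized statistics that are at least as large as
    the observed statistic (appropriate for Ha: mu_a > mu_b).
    reps and seed are arbitrary choices for this sketch."""
    rng = random.Random(seed)
    observed = diff_in_means(a, b)
    pooled = list(a) + list(b)
    n_a = len(a)
    as_extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        if diff_in_means(pooled[:n_a], pooled[n_a:]) >= observed:
            as_extreme += 1
    return as_extreme / reps


p = upper_tail_p_value(tea, coffee)
print(f"simulated p-value: {p:.3f}")
```

A small proportion here means that a difference as large as the observed one would rarely arise from random assignment alone.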

Green Tea Supplements [Figure: distribution of the statistic if H0 is true, with the observed statistic marked and the p-value shaded.] If green tea supplements do not help prevent cancer, the chance of seeing results this extreme is only 0.0005 (or 1 out of 2000 samples).

p-value Use the randomization distribution below to test H0: ρ = 0 vs Ha: ρ > 0. Match the sample statistics r = 0.1, r = 0.3, and r = 0.5 with the p-values 0.005, 0.15, and 0.35. Which sample statistic goes with which p-value?

Alternative Hypothesis • Tea versus coffee: Ha: µT > µC (upper tail) • Green tea: Ha: p1 < p2 (lower tail)

Alternative Hypothesis • A one-sided alternative contains either > or < • A two-sided alternative contains ≠ • For a one-sided alternative, the p-value is the proportion in the tail in the direction specified by Ha • For a two-sided alternative, the p-value is twice the proportion in the smaller tail

Tea versus Coffee In the tea versus coffee example, suppose instead of asking whether tea boosts immunity, the study was designed to investigate whether tea or coffee is better for the immune system. State the alternative hypothesis: • Ha: µT = µC • Ha: µT < µC • Ha: µT > µC • Ha: µT ≠ µC No specific direction is specified in the question of interest.

Tea versus Coffee • When Ha contains ≠, the p-value is twice the proportion in the smaller tail (in StatKey, you can equivalently click Two-Tail and add the two tails) • p-value = 2 × 0.026 = 0.052

p-value and Ha • H0: parameter = 0, Ha: parameter > 0: upper-tail (right tail) • H0: parameter = 0, Ha: parameter < 0: lower-tail (left tail) • H0: parameter = 0, Ha: parameter ≠ 0: two-tailed
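The three cases above can be collected into one helper. This is a sketch under stated assumptions: the function name, the alternative labels ("greater", "less", "two-sided"), and the toy list of randomization statistics are all made up for illustration, and the cap at 1 for the two-sided case is a convenience added here, not something stated in the slides.

```python
def p_value(rand_stats, observed, alternative):
    """p-value from a randomization distribution.

    alternative: "greater" (upper tail), "less" (lower tail),
    or "two-sided" (twice the smaller tail).
    """
    n = len(rand_stats)
    upper = sum(1 for s in rand_stats if s >= observed) / n
    lower = sum(1 for s in rand_stats if s <= observed) / n
    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    if alternative == "two-sided":
        # Doubling can exceed 1 when the statistic is near the center,
        # so the result is capped at 1 (an assumption of this sketch).
        return min(1.0, 2 * min(upper, lower))
    raise ValueError(f"unknown alternative: {alternative!r}")


# Hypothetical randomization statistics, centered at the null value 0.
toy_stats = [-4, -3, -3, -2, -2, -1, -1, -1, 0, 0,
             0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
toy_observed = 3

for alt in ("greater", "less", "two-sided"):
    print(alt, p_value(toy_stats, toy_observed, alt))
```

With these toy values, 3 of the 20 statistics are at least 3, so the upper-tail proportion is 3/20 and the two-sided p-value is twice that.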

Warning: Check Order of Groups! • The p-value can be calculated based on the direction of the alternative hypothesis, as long as the order in Ha matches the order when the statistic is calculated! • As a check, remember that if the data support the alternative hypothesis, the p-value of a one-sided test should not be more than 0.5!
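The check above is easy to demonstrate. In the sketch below, the randomization statistics and the observed value are hypothetical, and the names are made up for illustration. Reversing the subtraction order flips the sign of every statistic, so an upper-tail p-value computed with the wrong order lands above 0.5.

```python
def upper_tail(rand_stats, observed):
    """Proportion of randomization statistics >= observed."""
    return sum(1 for s in rand_stats if s >= observed) / len(rand_stats)


# Hypothetical statistics computed as (group 1 - group 2),
# matching Ha: parameter 1 > parameter 2.
stats_1_minus_2 = [-4, -3, -3, -2, -2, -1, -1, -1, 0, 0,
                   0, 0, 1, 1, 1, 2, 2, 3, 3, 4]
observed_1_minus_2 = 3

# Same data with the groups subtracted in the opposite order:
# every statistic changes sign.
stats_2_minus_1 = [-s for s in stats_1_minus_2]
observed_2_minus_1 = -observed_1_minus_2

p_matching = upper_tail(stats_1_minus_2, observed_1_minus_2)
p_reversed = upper_tail(stats_2_minus_1, observed_2_minus_1)

for label, p in [("order matches Ha", p_matching),
                 ("order reversed", p_reversed)]:
    note = "check group order" if p > 0.5 else "ok"
    print(f"{label}: p-value = {p:.2f} ({note})")
```

When the data support Ha but the one-sided p-value exceeds 0.5, the usual cause is that the statistic was computed in the opposite order from the one written in Ha.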

Summary • The randomization distribution shows what types of statistics would be observed, just by random chance, if the null hypothesis were true • A p-value is the chance of getting a statistic as extreme as that observed, if H0 is true • A p-value can be calculated as the proportion of statistics in the randomization distribution as extreme as (or more extreme than) the observed sample statistic
