
Statistical Hypothesis Testing: Uncovering Patterns in Data

Dive into statistics, the mathematics branch exploring probability and uncertainty. Learn how to test hypotheses, support or reject patterns using experiments and data analysis. Explore real-life scenarios like village height comparison, light-seeking fleas, and genetic ratios to understand the significance of results. Discover the Mann-Whitney Test to determine if observed differences are statistically significant or mere chance. Master the art of statistical hypothesis testing through practical examples and applications.


Presentation Transcript


  1. Statistics is . . . . . The branch of mathematics dealing with . . . • Uncertainty, or • Probability

  2. Checking for Patterns: the Villagers • Two groups of men from separate villages. • One group looks taller. • Is there a real difference? • Could these just be two samples from the same group, with sampling error? • What is the chance that both samples could have come from the same population?

  3. Hypotheses: support or reject In science, we can’t prove that there is a difference between the two villages. We make predictions (hypotheses), carry out experiments and examine the data. We can then say one of two things: • The hypothesis is supported by the data, • The hypothesis is not supported and can therefore be rejected. When using statistics, we make two hypotheses: • The Null Hypothesis, H0, proposes that there is no pattern at all, e.g. the observed difference in heights in the two villages is a result of sampling error. Both villages are part of the same population. • The alternative hypothesis, H1, proposes that there is a pattern, e.g. the observed difference in heights in the two villages cannot be accounted for by pure chance.

  4. Checking for Patterns: the light-seeking fleas • fleas are tested at various times after hatching • we record how many move towards the light (positive phototaxis) • there seems to be a clear positive correlation • what is the probability that this apparent pattern is a result of pure chance?

  5. Hypotheses for the flea experiment • The Null Hypothesis, H0, proposes that there is no pattern at all: the apparent correlation between time since hatching and percentage of positively phototaxic fleas is the result of pure chance. • The alternative hypothesis, H1, proposes that there is a pattern: the apparent correlation between time since hatching and percentage of positively phototaxic fleas cannot be accounted for by pure chance.

  6. Checking for patterns: genetic ratios • Gregor Mendel, the founder of genetics, crossed two pea plants with green pods. Both were heterozygous for the recessive characteristic yellow pods. In the next generation 428 plants had green pods and 152 yellow pods. • According to the theory, the expected ratio is 3:1. The actual ratio is 2.82:1. • Could this deviation from the ideal ratio be produced by pure chance or is there a significant deviation, so that there is not adequate support for the hypothesis?

  7. Hypotheses for the genetics cross • The Null Hypothesis, H0, proposes that the deviation from the expected 3:1 ratio could have been produced by pure chance. • The alternative hypothesis, H1, proposes that the deviation from the expected 3:1 ratio could not have been produced by pure chance – in other words, it is not a 3:1 ratio. • These hypotheses may seem rather strange, as the expected ratio is in the null hypothesis. This time we want support for the null hypothesis, as this means that the interpretation of a 3:1 ratio is correct. • The golden rule is not broken: null hypotheses always propose that variations are produced by chance.

  8. The Villagers We ask the men to stand in order of height. What do you notice about the ranked men?

  9. The villagers (2) 1) We give each man a number for his rank. 2) With equal ranks, we give the average of the sequence of equals (e.g. two men sharing ranks 1 and 2 each get 1.5). 3) We collect the ranks for the two villages into two groups: BLUE VILLAGE: 1.5, 1.5, 3.5, 3.5, 5.5, 7, 10, 12, 14, 17; ORANGE VILLAGE: 5.5, 8, 10, 10, 14, 14, 17, 17, 19, 20. The average ranks are quite different (BLUE 7.55; ORANGE 13.45), but could this happen by chance? [Figure: the 20 men in order of height, labelled first with raw ranks and then with tied ranks averaged]
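The ranking rule in steps 1) and 2) can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` and the heights are made up for the example (the slides give the ranks, not the heights), while the two rank lists are copied from the slide.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

# Hypothetical heights in cm (NOT from the slides), just to show tie handling.
heights = [160, 160, 165, 170, 170, 170, 172]
print(average_ranks(heights))  # [1.5, 1.5, 3.0, 5.0, 5.0, 5.0, 7.0]

# Ranks as listed on the slide; the mean ranks match the quoted 7.55 and 13.45.
blue_ranks = [1.5, 1.5, 3.5, 3.5, 5.5, 7, 10, 12, 14, 17]
orange_ranks = [5.5, 8, 10, 10, 14, 14, 17, 17, 19, 20]
print(sum(blue_ranks) / len(blue_ranks), sum(orange_ranks) / len(orange_ranks))
```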

  10. The Mann-Whitney Test (1) Average rank for BLUE is 7.55; average rank for ORANGE is 13.45. This looks pretty convincing, but to be sure, we have to allow for the size of the samples, as a big difference in ranks gives more certainty when the sample is big. First we calculate the total of the ranks in each group: R1 = Σ(ranks for blue village) = 75.5; R2 = Σ(ranks for orange village) = 134.5. Σ, the Greek letter sigma, means “the sum of”. Now, we calculate two values for Mann-Whitney’s U, as follows: U1 = n1·n2 + 0.5·n1·(n1 + 1) − R1; U2 = n1·n2 + 0.5·n2·(n2 + 1) − R2 (n1 and n2 are the numbers in the samples, in this case both 10).

  11. The Mann-Whitney Test (2) R1 = Σ(ranks for blue village) = 75.5; R2 = Σ(ranks for orange village) = 134.5. Now, we calculate two values for Mann-Whitney’s U, as follows: U1 = n1·n2 + 0.5·n1·(n1 + 1) − R1 • U1 = (10 × 10) + 0.5 × 10 × (10 + 1) − 75.5 = 100 + 55 − 75.5 = 79.5 U2 = n1·n2 + 0.5·n2·(n2 + 1) − R2 • U2 = (10 × 10) + 0.5 × 10 × (10 + 1) − 134.5 = 100 + 55 − 134.5 = 20.5 As a check, U1 + U2 should equal n1·n2 (here 79.5 + 20.5 = 100). The lower of the two values of U (20.5) counts for the next stage: significance testing.
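The two U formulas from slide 10 can be checked with a short Python sketch. The rank lists are the ones given for the villagers; the function name `mann_whitney_u` is just a label chosen for this example. A handy sanity check is that the two U values always add up to n1·n2.

```python
# Ranks for each village, as listed on the slide.
blue_ranks = [1.5, 1.5, 3.5, 3.5, 5.5, 7, 10, 12, 14, 17]
orange_ranks = [5.5, 8, 10, 10, 14, 14, 17, 17, 19, 20]

def mann_whitney_u(ranks_1, ranks_2):
    """Return (U1, U2) using U = n1*n2 + 0.5*n*(n + 1) - R for each group."""
    n1, n2 = len(ranks_1), len(ranks_2)
    r1, r2 = sum(ranks_1), sum(ranks_2)  # R1 and R2, the rank totals
    u1 = n1 * n2 + 0.5 * n1 * (n1 + 1) - r1
    u2 = n1 * n2 + 0.5 * n2 * (n2 + 1) - r2
    return u1, u2

u1, u2 = mann_whitney_u(blue_ranks, orange_ranks)
print(u1, u2)       # 79.5 20.5
print(min(u1, u2))  # 20.5, the value taken forward to the significance table
```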

  12. The Mann-Whitney Test (3) Firstly let us return to the hypotheses for this problem: • The Null Hypothesis H0: the observed difference in heights in the two villages is a result of sampling error. Both villages are part of the same population. • The alternative hypothesis H1: the observed difference in heights in the two villages cannot be accounted for by pure chance. We look up the smaller of the two values of U in a significance table. This gives us the probability for the null hypothesis.

  13. The Mann-Whitney Test (3) We look up the smaller of the two values of U in a significance table. This gives us the probability for the null hypothesis: that the observed difference in heights in the two villages is a result of sampling error and both villages are part of the same population. • The heading says that this table is for a probability of 5%, or p = 0.05. • We look up the number for the correct values of n1 and n2. • If our calculated value for U is less than or equal to the critical value, then the probability of the null hypothesis is less than 5% (p < 0.05).

  14. The Mann-Whitney Test (4) The probability (p) that the observed difference in heights in the two villages is a result of sampling error and that both villages are part of the same population is given by: p < 0.05. We therefore reject the null hypothesis. We can say that there is a significant difference between the heights of men in the two villages, and that the difference is significant at the 5% level. (To find out whether it is significant at a stricter level, we would need a different table for that level of significance.)
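The table lookup can be cross-checked by brute force. The sketch below is an exact permutation test, swapped in for the printed table rather than taken from the slides: it deals the 20 pooled ranks into every possible pair of groups of 10 and counts how often the split is at least as lopsided as the one observed. The function name and the two-tailed choice are assumptions made for this example.

```python
from itertools import combinations

# Ranks for each village, as listed on the slide.
blue_ranks = [1.5, 1.5, 3.5, 3.5, 5.5, 7, 10, 12, 14, 17]
orange_ranks = [5.5, 8, 10, 10, 14, 14, 17, 17, 19, 20]

def exact_two_tailed_p(ranks_1, ranks_2):
    """Fraction of all splits whose rank total is at least as far from the
    chance expectation as the observed total (two-tailed)."""
    pooled = ranks_1 + ranks_2
    n1 = len(ranks_1)
    expected = n1 * (len(pooled) + 1) / 2  # mean rank total if H0 is true
    observed_gap = abs(sum(ranks_1) - expected)
    extreme = total = 0
    for group in combinations(pooled, n1):  # 184,756 splits for 10 + 10
        total += 1
        if abs(sum(group) - expected) >= observed_gap:
            extreme += 1
    return extreme / total

p = exact_two_tailed_p(blue_ranks, orange_ranks)
print(p < 0.05)  # True would agree with the table-based conclusion
```

For samples this small the enumeration takes well under a second; for much larger samples the number of splits grows too fast and the printed tables (or an approximation) are the practical route.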

  15. The advantage of replicates Note how the critical values increase with bigger samples. As a significant difference requires a low number, doing many replicates makes it easier to demonstrate a significant difference

  16. Tailed-ness in the U test One and two tailed tests • In a two tailed test, your alternative hypothesis simply proposes that there is a difference between the two groups compared. • It proposes that the groups are different but not which group has the highest values. • In a one tailed test, your alternative hypothesis proposes that there is a difference between the two groups compared, with a definite direction (either the first or the second group has the highest values). • In a good experiment, the investigator should be able to make a prediction with direction, and one-tailed tests are the rule.

  17. Testing for correlation (1) As with the villagers, the statistical test used for the flea example uses ranking, but in a different way. Check out this graph showing “perfect” correlation. • Give each data point a rank on the x axis . . . • And then on the y axis. • Now make a table of pairs of rankings . . . There is a perfect match. [Figure: scatter plot of six points, ranked 1 to 6 on both axes]

  18. Testing for correlation (2) Now, check out this graph showing a less than perfect correlation. • Again, give each data point a rank on the x axis . . . • And then on the y axis. • Now make a table of pairs of rankings . . . The rankings no longer match perfectly. [Figure: scatter plot of six ranked points; tied points share a rank, marked “5=”]

  19. Testing for correlation (3) Now, check out this graph showing no apparent correlation. • Again, give each data point a rank on the x axis . . . • And then on the y axis. • Now make a table of pairs of rankings . . . There are a lot of mismatches in the rankings. [Figure: scatter plot of six ranked points; tied points share a rank, marked “5=”]

  20. Spearman’s rank correlation test (1) • Spearman’s test starts with a table comparing rankings on the x and y axes. • It gives a single number, called the correlation coefficient. • The symbol for this is r, and r is in the range +1.0 to -1.0. • When r = +1.0, this means a perfect positive correlation, with the line of best fit going up from bottom left to top right, and the rankings the same on both the x and the y axis. • When r = -1.0, this means a perfect negative correlation, with the line of best fit going down from top left to bottom right, and the rankings exactly the opposite on the x and the y axis. • When r = 0, there is no correlation. • When r is between 0 and either –1 or +1, there is a weaker correlation. [Figure: testing for correlation (4); example scatter plots for r = +1.0, r = +0.8, r = 0.0, r = -0.6 and r = -1.0]

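  21. Spearman’s rank correlation test (2) Calculating the correlation coefficient, r • Tabulate the data. • Give each data point its rank for x and its rank for y. • Calculate the difference between the ranks (d). • Square this (d²). • Add up the squares to give the sum of squared deviations (Σd²). • n = number of pairs; in this case n = 10.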
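The tabulation steps can be sketched in Python as follows. The flea data are not reproduced in the transcript, so the numbers below are hypothetical and the function names are chosen only for this example; the sketch stops at Σd², the quantity that goes into the equation for r.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def squared_rank_differences(xs, ys):
    """Rank x and y separately, then return d squared for each pair."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    return [(rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y)]

# Hypothetical data (NOT the flea results from the slides).
hours_since_hatching = [1, 2, 3, 4, 5, 6]
percent_to_light = [10, 30, 20, 50, 60, 60]

d_squared = squared_rank_differences(hours_since_hatching, percent_to_light)
print(d_squared)       # [0.0, 1.0, 1.0, 0.0, 0.25, 0.25]
print(sum(d_squared))  # 2.5
```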

  22. Spearman’s rank correlation test (3) Calculating the correlation coefficient, r (cont.) r is calculated according to this equation:
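The equation itself was an image on the slide and does not survive in the transcript. Assuming the slide used the usual textbook form of Spearman’s coefficient, it can be written with the symbols defined on the previous slide (Σd² for the sum of squared rank differences, n for the number of pairs) as:

```latex
r = 1 - \frac{6 \sum d^2}{n\,(n^2 - 1)}
```

As a quick check on this form: when every pair of ranks matches, Σd² = 0 and r = +1.0, and when the rankings are exactly opposite, Σd² = n(n² − 1)/3 and r = −1.0. With tied ranks this form is only approximate.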

  23. Spearman’s rank correlation test (3) Significance of the correlation coefficient, r Consider this graph: Add a best fit line: • The points are “all” on the best fit line, • The ranks are the same for “all” points • So the correlation coefficient is: (?) • r = + 1.0 • But what does this mean? • Actually, nothing because you can always draw a straight line through two points • If the next point comes here (blue dot) . . . • then the correlation looks “safer”, but if it comes here (green dot) . . • then the correlation looks highly unlikely

  24. Significance of the correlation test (2) We look up the value for r (+0.817) in a significance table. This gives us the probability for the null hypothesis: that the apparent correlation on the graph could have been obtained by pure chance. • We find the correct line in the table: ν (the Greek letter nu) is the number of data points, in this case 10. • We see where our calculated value for r fits on the line. In this case it is to the right of the biggest number. • The value of r is greater than the critical value for p = 0.01 (2-tailed test) or p = 0.005 (1-tailed test). So the null hypothesis is very unlikely and we have excellent support for the alternative hypothesis.

  25. Significance of the correlation test (3) One and two tailed tests • In a two tailed test, your alternative hypothesis simply proposes that there is some sort of a correlation between the variables x and y . . . • It proposes that the variables are linked but not whether it is a positive or a negative correlation. • In a one tailed test, your alternative hypothesis proposes that there is a correlation between the variables x and y with a definite direction (either positive or negative), • In a good experiment, the investigator should be able to make a prediction with direction, and one-tailed tests are the rule.

  26. Significance of the correlation test (4) Final conclusion on the flea experiment • We return to the original null hypothesis and give its probability . . . The probability (p) that the observed positive correlation between the age of the fleas (in hours since hatching) and % positive phototaxis could have been produced by purely random variation is given by p < 0.005 (or p < 0.5%) . . . • We then give the “other side of the coin”: the support for the alternative hypothesis . . . so that the hypothesis that phototaxic behaviour of fleas is related to the age of the fleas is supported at the 0.5% level.

  27. The statistical advantage of thoroughness Look down the column for p = 0.01 (two-tailed). With bigger samples, the critical value becomes smaller. This means it is easier to show that there is a correlation. Unfortunately, this means spending more time collecting data . . . but it is worth it, to get a conclusive result.

  28. Problems with correlation: 1 non-linearity The data in the table relate the number of reptile and amphibian species to the size of an oceanic island. The rankings are very similar and clearly this will be a significant correlation: r = 0.964, with ν = 10, p < 0.001. But look at the graph! This is not a linear relationship; the graph only looks right when both scales are logarithmic.

  29. Problems with correlation: 2 “rogue points” Spreadsheets use statistical techniques to calculate the equation for the “best fit line”. But a single “rogue point” (ringed) can distort the line considerably. The true line is more like the one shown in red. It is often better to draw the line yourself. You need to decide which points to ignore, and whether the relationship is linear.

  30. Problems with correlation: 3 IS CORRELATION THE SAME AS CAUSATION? Consider this graph: Nobody would suggest that an increase in the number of churches, mosques, synagogues and other places of worship causes an increase in the number of public houses. They are both related to a third variable . . . the size of the community.

  31. Problems with correlation: 4 IS CORRELATION THE SAME AS CAUSATION? • SMOKING AND LUNG CANCER: Even the tobacco companies cannot deny the correlation between cigarette consumption and risk of lung cancer. But they have brought the idea of causation into question, suggesting a third variable which has nothing to do with smoking, e.g. a certain gene which has two effects: one to increase the risk of cancer and the other to make a person more likely to take up cigarette smoking. • HIV AND AIDS: A very controversial hypothesis suggested that the presence of particles of the human immunodeficiency virus (HIV) was not the cause of AIDS but just another symptom. The true cause was suggested to be the reckless and irresponsible life-style of the patient.

  32. Genetic Ratios: are deviations significant? • Mendel crossed two pea plants with green pods. Both were heterozygous for the recessive characteristic yellow pods. • In the offspring, 428 plants had green pods and 152 had yellow pods. • The expected ratio is 3:1. The actual ratio is 2.82:1. • Could this deviation from the ideal ratio be produced by pure chance or is there a significant deviation? • This is a job for the χ² test (chi squared). • This test compares actual numerical patterns with expected patterns and gives the probability that chance could have caused the deviations.

  33. Checking Genetic Ratios with 2 Enter the observed values into the first column of a table (O = observed), Calculate the values expected for a “perfect ratio”: total offspring = 580; ¾ of this is 435 and ¼ is 145, Enter these values in the second column (E = expected), In the third column, calculate deviations from the expected values (O – E), Square this value in the third column, and in the final column divide by the expected. 2 is the sum of the final column 435 145 0.451 2 =
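The table on this slide can be reproduced in a few lines of Python. The observed counts and the 3:1 ratio come from the slides; the variable names are chosen only for this sketch.

```python
observed = [428, 152]  # green pods, yellow pods (O)
ratio = [3, 1]         # expected 3:1 ratio

total = sum(observed)  # 580 offspring
expected = [total * part / sum(ratio) for part in ratio]  # E column

# One row per class: O - E, then (O - E)^2, then (O - E)^2 / E
rows = []
for o, e in zip(observed, expected):
    deviation = o - e
    rows.append((o, e, deviation, deviation ** 2, deviation ** 2 / e))

chi_squared = sum(row[-1] for row in rows)  # sum of the final column

print(expected)               # [435.0, 145.0]
print(round(chi_squared, 3))  # 0.451
```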

  34. Checking Genetic Ratios with 2: 2 THE SIGNIFICANCE TEST The value of 0.451 for 2 does not mean anything yet. First, we must look up the value in significance tables, As with the Spearman’s rank table, there are lines in the 2 table for different values of  the number of degrees of freedom, For 2, this is the number of data items minus one, so in this case  = 1, We see where our calculated value fits on this line, It is well below the critical value for p = 0.05, so we give the probability of the null hypothesis as: p > 0.05
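The lookup can be cross-checked numerically. The sketch below is not the table method from the slides: it assumes the standard critical value of 3.841 for ν = 1 at p = 0.05, and uses the fact that with one degree of freedom χ² is the square of a standard normal variable, so the tail probability is erfc(√(χ²/2)). The function and constant names are invented for the example.

```python
import math

# Standard tabulated critical value for 1 degree of freedom at p = 0.05
# (an assumption here; the slide's own table is not in the transcript).
CRITICAL_5_PERCENT_DF1 = 3.841

def chi2_p_value_df1(chi_squared):
    """Upper-tail probability for chi squared with 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi_squared / 2))

chi_squared = 0.451  # value from the slide
p = chi2_p_value_df1(chi_squared)

print(chi_squared < CRITICAL_5_PERCENT_DF1)  # True: below the critical value
print(p > 0.05)                              # True: cannot reject H0
```

The identity used here holds only for ν = 1; for more degrees of freedom a table or a statistics library is needed.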

  35. Checking Genetic Ratios with 2: 3 THE SIGNIFICANCE TEST: 2 What does this probability mean? Let us return to the null hypothesis: • The Null Hypothesis, H0proposes that the deviation from the expected 3:1 ration could have been produced by pure chance. As the probability is greater than 5%, then we cannot reject the null hypothesis! At first, this looks like a failure, until we realize that this is just what we want: There is no significant deviation from a 3:1 ratio, so we can accept the alternative hypothesis that this is a “good” 3:1 ratio.

  36. NO tailed-ness in the 2 test One and two tailed tests As hypotheses for tests predict “fit” or “no-fit” and have no direction, there are no one-tailed or two-tailed tests.

  37. Statistics: which test? • To compare two groups, e.g. heights of trees from different woods, or speed of breakdown of protein by two different enzymes: THE MANN-WHITNEY U TEST. Minimum data: at least 6 in each group (in an experiment, 6 replicates!). • To check for correlation between two variables, e.g. effect of temperature on metabolic rate: SPEARMAN’S RANK CORRELATION TEST. Minimum data: different sources give a minimum of between 8 and 15 data points. • To check for goodness of fit to a numerical pattern, e.g. are woodlice randomly distributed in a choice chamber?: THE χ² TEST. Minimum data: 2 numbers.
