
The Role of Probability in Statistics: Statistical Significance


uriel

Presentation Transcript


  1. The Role of Probability in Statistics: Statistical Significance Introduction to Probability and Statistics Ms. Young

  2. Objective Understand the concept of statistical significance and the essential role that probability plays in defining it.

  3. Statistical Significance • A set of measurements or observations is considered to be statistically significant if it probably DID NOT occur by chance • Ex. ~ Tossing a coin 100 times and getting 80 heads and 20 tails would be statistically significant because it probably did not occur by chance • Example 1: • Determine whether each scenario is statistically significant or not • A detective in Detroit finds that 25 of the 62 guns used in crimes during the past week were sold by the same gun shop. • This finding is statistically significant. Because there are many gun shops in the Detroit area, having 25 out of 62 guns come from the same shop seems unlikely to have occurred by chance.

  4. Example 1 Cont’d… • In terms of the global average temperature, five of the years between 1990 and 1999 were the five hottest years in the 20th century. • Having the five hottest years in 1990–1999 is statistically significant • By chance alone, any particular year in a century would have a 5 in 100, or 1 in 20, chance of being one of the five hottest years. Having five of those years come in the same decade is very unlikely to have occurred by chance alone • This statistical significance suggests that the world may be warming up • The team with the worst win-loss record in basketball wins one game against the defending league champions. • This one win is not statistically significant because although we expect a team with a poor win-loss record to lose most of its games, we also expect it to win occasionally, even against the defending league champions
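The "very unlikely" in the temperature example can be given a rough number. The sketch below is only an illustration under a simplifying assumption that is not stated on the slide: with no warming trend, every set of five years in the century is equally likely to be the set of the five hottest years. The variable names are made up for this example.

```python
from math import comb

# Assumption (not from the slide): with no trend, each set of 5 years out of
# the 100 years in the century is equally likely to be the five hottest.
total_sets = comb(100, 5)        # ways to choose the five hottest years
same_decade_sets = comb(10, 5)   # ways in which all five fall in 1990-1999

p_all_in_1990s = same_decade_sets / total_sets
print(f"P(all five hottest years fall in 1990-1999) = {p_all_in_1990s:.2e}")
```

Under that assumption the probability is a few in a million, which is why the slide calls the result statistically significant.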

  5. Example 2 • A researcher conducts a double-blind experiment that tests whether a new herbal formula is effective in preventing colds. During a three-month period, the 100 randomly selected people in a treatment group take the herbal formula while the 100 randomly selected people in a control group take a placebo. The results show that 30 people in the treatment group get colds, compared to 32 people in the control group. Can we conclude that the new herbal formula is effective in preventing colds? • Whether a person gets a cold during any three-month period depends on many unpredictable factors. Therefore, we should not expect the number of people with colds in any two groups of 100 people to be exactly the same. • In this case, the difference between 30 people getting colds in the treatment group and 32 people getting colds in the control group seems small enough to be explainable by chance. • So the difference is not statistically significant, and we should not conclude that the treatment is effective.

  6. Quantifying Statistical Significance • Determining if something is statistically significant can be obvious in some cases (e.g., 80 heads vs. 20 tails), but how do you decide if something is statistically significant if the numbers are closer (e.g., 55 heads vs. 45 tails)? • Probability is used to quantify statistical significance by determining the likelihood that a result at least as extreme as the observed one would occur by chance alone • .05 level of significance: if the probability of such a result occurring by chance is less than or equal to .05, or 5%, then it is statistically significant at the .05 level • In other words, if chance alone would produce a result this extreme no more than 5% of the time, then chance is an unlikely explanation, which means the result is statistically significant because it probably did not occur by chance • .01 level of significance: if the probability of such a result occurring by chance is less than or equal to .01, or 1%, then it is statistically significant at the .01 level • In other words, if chance alone would produce a result this extreme no more than 1% of the time, then chance is a very unlikely explanation, which means the result is statistically significant because it probably did not occur by chance • Something that is significant at the .01 level is also significant at the .05 level (since 1% is less than 5%), but something significant at the .05 level is not necessarily significant at the .01 level (its probability can be under 5% without being as low as 1%)
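To see how probability separates the obvious case from the close one, the sketch below computes the chance of getting at least 55 heads, and at least 80 heads, in 100 tosses of a fair coin. It uses the exact binomial formula, which is an assumption about method (the slides do not say how such probabilities are found), and `prob_at_least` is a hypothetical helper name.

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more heads in n tosses."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

for heads in (55, 80):
    prob = prob_at_least(heads)
    if prob <= 0.01:
        verdict = "significant at the .01 level"
    elif prob <= 0.05:
        verdict = "significant at the .05 level"
    else:
        verdict = "not statistically significant"
    print(f"P(at least {heads} heads) = {prob:.3g} -> {verdict}")
```

With a fair coin, 55 or more heads happens often enough that it is not significant at either level, while 80 or more heads is far rarer than 1%.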

  7. Example 3 • In the test of the Salk polio vaccine, 33 of the 200,000 children in the treatment group got paralytic polio, while 115 of the 200,000 in the control group got paralytic polio. Calculations show that the probability of this difference between the groups occurring by chance is less than 0.01. Describe the implications of this result. • The results are significant at the .01 level. This means that a difference this large would occur by chance alone 1% of the time or less, so the results probably did not occur by chance, which gives good reason to believe that the treatment works.
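The slide says only that "calculations show" the probability is below 0.01. One common way to get such a number is a pooled two-proportion z-test; this is an assumed method for illustration, not necessarily the calculation behind the slide, and `two_proportion_z` is a hypothetical helper. The same function is applied to the herbal formula data from Example 2 for contrast.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test (assumed method, left-tailed).

    Returns the z-score for (rate in group 1 - rate in group 2) and the
    probability of a z-score at least that low if both groups share one rate.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                     # common rate under chance
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, NormalDist().cdf(z)

# Treatment group first, control group second.
z_salk, p_salk = two_proportion_z(33, 200_000, 115, 200_000)
z_herbal, p_herbal = two_proportion_z(30, 100, 32, 100)

print(f"Salk vaccine:   z = {z_salk:.2f}, probability = {p_salk:.2g}")
print(f"Herbal formula: z = {z_herbal:.2f}, probability = {p_herbal:.2g}")
```

Under these assumptions the Salk result falls well below the .01 level, while the herbal formula result is nowhere near the .05 level, matching the conclusions on the slides.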

  8. Fundamentals of Hypothesis Testing Introduction to Probability and Statistics Ms. Young

  9. Objective • After this section you will understand the goal of hypothesis testing and the basic structure of a hypothesis test, including how to set up the null and alternative hypotheses, how to determine the possible outcomes of a hypothesis test, and how to decide between these possible outcomes.

  10. Statistical Claims • “Of our 350 million users, more than 50% log on to Facebook everyday” • “Using Gender Choice could increase a woman’s chance of giving birth to a baby girl up to 80%” • “According to the U.S. Census Bureau, Current Population Surveys, March 1998, 1999, and 2000, the average salary of someone with a high school diploma is $30,400 while the average salary of someone with a Bachelor's Degree is $52,200.” • How could we determine whether these claims are true or not? • Hypothesis Testing

  11. Formulating the Hypothesis • A hypothesis is a claim about a population parameter • Could either be a claim about a population mean, μ, or a population proportion, p • All of the claims on the previous slide would be considered hypotheses • A hypothesis test is a standard procedure for testing a claim about a population parameter • There are always at least two hypotheses in any hypothesis test; the null & alternative hypotheses

  12. Null Hypothesis • The null hypothesis, represented as $H_0$ (read as “H-naught”), is the starting assumption for a hypothesis test • The null hypothesis always claims a specific value for a population parameter and therefore takes the form of an equality • Take the claim, “using Gender Choice could increase a woman’s chance of giving birth to a baby girl up to 80%” for example. If the product did not work, it would be expected that there would be an approximately equally likely chance of having either a boy or a girl. Therefore, the null hypothesis (the claim not working) would be: $H_0: p = 0.5$

  13. Alternative Hypothesis • The alternative hypothesis, represented as $H_a$, is a claim that the population parameter has a value that differs from the value claimed in the null hypothesis, or in other words, that the claim being tested does hold true • The alternative hypothesis can take one of the following forms: • left tailed • Ex. ~ A manufacturing company claims that their new hybrid model gets 62 mpg. A consumer group claims that the mean fuel consumption of this vehicle is less than 62 mpg: $H_a: \mu < 62$ mpg • This alternative hypothesis would be considered left-tailed since the claimed value is smaller than (or to the left of) the null value • right tailed • Ex. ~ The claim that Gender Choice increases a woman’s chance of having a baby girl up to 80% would be testing values above the null value of .5, and would therefore be right-tailed: $H_a: p > 0.5$

  14. Alternative Hypothesis Cont’d… • two tailed • Ex. ~ A wildlife biologist working in the African savanna claims that the actual proportion of female zebras in the region is different from the accepted proportion of 50%: $H_a: p \neq 0.5$ • Since the claim does not specify whether the true proportion is above 50% or below 50%, the test would be considered two-tailed, in which case values both above and below 50% would be tested

  15. Possible Outcomes of a Hypothesis Test • There are two possible outcomes to a hypothesis test: • Reject the null hypothesis in which case we have evidence in support of the alternative hypothesis • Do Not reject the null hypothesis in which case we do not have enough evidence to support the alternative hypothesis • NOTE – Accepting the null hypothesis is not a possible outcome since it is the starting assumption. • The test may provide evidence to NOT REJECT the null hypothesis, but that does not mean that the null hypothesis is true • Be sure to formulate the null and alternative hypotheses prior to choosing a sample to avoid bias

  16. Example 1 • For the following case, describe the possible outcomes of a hypothesis test and how we would interpret these outcomes • The manufacturer of a new model of hybrid car advertises that the mean fuel consumption is equal to 62 mpg on the highway (μ = 62 mpg). A consumer group claims that the mean is less than 62 mpg (μ < 62 mpg). • Possible outcomes: • Reject the null hypothesis of μ = 62 mpg in which case we have evidence in support of the consumer group’s claim that the mean mpg of the new hybrid is less than 62 • Do not reject the null hypothesis, in which case we lack evidence to support the consumer group’s claim • Note – this does not necessarily imply that the manufacturer’s claim is true though

  17. Drawing a Conclusion from a Hypothesis Test • Using the claim that Gender Choice could increase a woman’s chance of giving birth to a baby girl up to 80%, suppose that a sample produces a sample proportion, $\hat{p}$, that is greater than 0.5. • Although this supports the alternative hypothesis of $H_a: p > 0.5$, is it enough evidence to reject the null hypothesis? • This is where statistical significance comes into play (introduced earlier) • Recall that something is considered to be statistically significant if it most likely DID NOT occur by chance • There are two commonly used levels of statistical significance • The 0.05 level ~ which means that if the probability of a particular result occurring by chance is less than or equal to 0.05, or 5%, then it is considered to be statistically significant at the 0.05 level • The 0.01 level ~ which means that if the probability of a particular result occurring by chance is less than or equal to 0.01, or 1%, then it is considered to be statistically significant at the 0.01 level • The 0.01 level would represent a stronger significance than the 0.05 level

  18. Hypothesis Test Decisions Based on Levels of Statistical Significance • We decide the outcome of a hypothesis test by comparing the actual sample result (mean or proportion) to the result expected if the null hypothesis is true (using z-scores). We must choose a significance level for the decision. • If the probability of a sample result at least this extreme occurring by chance is less than 0.01, then the test is statistically significant at the 0.01 level and offers STRONG evidence for rejecting the null hypothesis. • If that probability is less than 0.05 (but not less than 0.01), then the test is statistically significant at the 0.05 level and offers MODERATE evidence for rejecting the null hypothesis. • If that probability is greater than the chosen level of significance (0.01 or 0.05), then we DO NOT reject the null hypothesis.
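The decision rule on this slide can be written as a small function. This is a minimal sketch: `decide` is a hypothetical name, and it treats a probability exactly equal to the level as significant, following the "less than or equal to" wording used when the levels were first introduced.

```python
def decide(p_value, alpha=0.05):
    """Map a probability (P-value) to the decision wording used on the slide.

    alpha is the chosen level of significance (0.05 or 0.01).
    """
    if p_value > alpha:
        return "do not reject the null hypothesis"
    if p_value <= 0.01:
        return "reject the null hypothesis (STRONG evidence, .01 level)"
    return "reject the null hypothesis (MODERATE evidence, .05 level)"

for p in (0.0006, 0.0228, 0.20):
    print(f"probability {p}: {decide(p)}")
```

Choosing `alpha=0.01` makes the test stricter: a probability of 0.0228 then leads to not rejecting the null hypothesis.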

  19. P-Values • A P-Value, or probability value, is the probability of selecting a sample at least as extreme as the observed sample, assuming that the null hypothesis is true • In other words, it is the value that allows us to determine if something is statistically significant or not • NOTE ~ notice that the P-Value is represented using a capital P, whereas the population proportion is represented using a lowercase p. • We will learn how to actually calculate the P-Value in the following sections • A small P-value indicates that the observed result is unlikely under the null hypothesis (therefore statistically significant) and provides evidence to reject the null hypothesis • A large P-value indicates that the sample result is not unusual and could easily occur by chance (therefore not statistically significant), which tells us to NOT reject the null hypothesis

  20. Example 2 • You suspect that a coin may have a bias toward landing tails more often than heads, and decide to test this suspicion by tossing the coin 100 times. The result is that you get 40 heads (and 60 tails). A calculation (not shown here) indicates that the probability of getting 40 or fewer heads in 100 tosses with a fair coin is 0.0228. Find the P-value and level of statistical significance for your result. Should you conclude that the coin is biased against heads? • The P-Value is 0.0228 • This value is smaller than 5% (.05), but not smaller than 1% (.01), so it is statistically significant at the 0.05 level which gives us moderate reason to reject the null hypothesis and conclude that the coin is biased against heads
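The slide does not show where 0.0228 comes from. The sketch below reproduces it with a normal approximation (mean 50, standard deviation 5, so 40 heads is z = -2), which is an assumption about the omitted calculation. For comparison it also sums the exact binomial probabilities, which gives a slightly different number but the same conclusion.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p0, heads = 100, 0.5, 40

# Normal approximation (assumed to be the calculation behind the slide).
mean = n * p0                      # expected heads for a fair coin: 50
sd = sqrt(n * p0 * (1 - p0))       # standard deviation of the head count: 5
z = (heads - mean) / sd            # z-score for 40 heads: -2
p_normal = NormalDist().cdf(z)     # area to the left of z

# Exact binomial probability of 40 or fewer heads with a fair coin.
p_exact = sum(comb(n, k) for k in range(heads + 1)) / 2**n

print(f"z = {z:.2f}")
print(f"normal approximation: {p_normal:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
```

Both values fall between .01 and .05, so either way the result is significant at the .05 level but not at the .01 level.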

  21. Putting It All Together Step 1. Formulate the null and alternative hypotheses, each of which must make a claim about a population parameter, such as a population mean (μ) or a population proportion (p); be sure this is done before drawing a sample or collecting data. Based on the form of the alternative hypothesis, decide whether you will need a left-, right-, or two-tailed hypothesis test. Step 2. Draw a sample from the population and measure the sample statistics, including the sample size (n) and the relevant sample statistic, such as the sample mean ($\bar{x}$) or sample proportion ($\hat{p}$). Step 3. Determine the likelihood of observing a sample statistic (mean or proportion) at least as extreme as the one you found under the assumption that the null hypothesis is true. The precise probability of such an observation is the P-value (probability value) for your sample result. Step 4. Decide whether to reject or not reject the null hypothesis, based on your chosen level of significance (usually 0.05 or 0.01, but other significance levels are sometimes used).

  22. Hypothesis Tests for Population Means Introduction to Probability and Statistics Ms. Young

  23. Objective • After this section you will be able to understand and interpret one- and two-tailed hypothesis tests for claims made about population means.

  24. Background Info • Recall that there are two possible outcomes of a hypothesis test: to either reject or not reject the null hypothesis • To determine whether to reject or not, a P-value needs to be calculated and then compared to the desired level of significance (usually .05 or .01). • To calculate a P-value, you must first understand the concepts of a normal distribution (introduced in ch.5): • Recall that if a distribution is normal, you can use z-scores along with a z-score table to find probabilities of certain values occurring • Also recall that the distribution of sample means is approximately normal when the sample size is at least 30, and becomes closer to normal as the sample size increases (Central Limit Theorem) • In essence, a P-value (probability value) is the probability that is found using z-scores and the z-score table • Be sure that you are using the standard deviation of the sample means, $\sigma/\sqrt{n}$, when calculating the z-score since you are comparing a sample result (group mean or group proportion) to the entire population

  25. One-Tailed Hypothesis Tests • As mentioned earlier, hypothesis tests can either be one-tailed (left or right) or two-tailed • The process for conducting a left-tailed test is the same as the process for conducting a right-tailed test, but a two-tailed test varies slightly • Example 1 ~ Left-Tailed Hypothesis Test: • Columbia College advertises that the mean starting salary of its graduates is $39,000. The Committee for Truth in Advertising suspects that this claim is exaggerated and that the mean starting salary for graduates is actually lower. They decide to conduct a hypothesis test to seek evidence to support this suspicion. Suppose that the committee gathered a sample of 100 graduates and found that the sample mean is $\bar{x} = \$37{,}000$ (the value consistent with the z-score of -3.25 found in Step 3) and the standard deviation for that sample is $s = \$6{,}150$ • Step 1: State the null and alternative hypotheses: $H_0: \mu = \$39{,}000$ and $H_a: \mu < \$39{,}000$ • Step 2: Draw a sample and come up with a sample statistic and the standard deviation of that sample: $n = 100$, $\bar{x} = \$37{,}000$, $s = \$6{,}150$

  26. Example 1 Cont’d… • Step 3: Calculate the P-value (using the normal distribution and z-scores) and determine the level of significance • In order to calculate the P-value, we need to find the z-score using the Central Limit Theorem since we are dealing with the mean of a group. Since we do not know the population standard deviation, we will use the standard deviation found for the sample as an estimate: $z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{37{,}000 - 39{,}000}{6{,}150/\sqrt{100}} \approx -3.25$ • Using the z-score table we find that a z-score of -3.25 corresponds to a probability of .0006, or .06%. This is the P-value. • Since this value is less than .05 it is significant at the .05 level, but even better, this value is less than .01 which means that it is significant at the .01 level • Step 4: Decide if you should reject or not reject the null hypothesis • Since the P-value is significant at both levels (.05 and .01), we should reject the null hypothesis of $39,000 • What this means is that we have strong evidence to believe that Columbia College exaggerated the mean starting salary of its graduates and that the true mean is most likely lower than $39,000.
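The left-tailed calculation can be checked in a few lines. This is a sketch, not the method from the slides, which use a printed z-score table: Python's `statistics.NormalDist` stands in for the table, and the sample mean of $37,000 is an assumption, chosen because it is the value consistent with the stated z-score of -3.25.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 39_000     # mean starting salary claimed by the college (null value)
x_bar = 37_000   # ASSUMED sample mean, consistent with z = -3.25
s = 6_150        # sample standard deviation
n = 100          # sample size

standard_error = s / sqrt(n)          # standard deviation of sample means
z = (x_bar - mu0) / standard_error    # how far the sample mean is below mu0
p_value = NormalDist().cdf(z)         # left tail: area below z

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The P-value is the area to the left of z because the alternative hypothesis is that the mean is lower than the claimed value.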

  27. One-Tailed Hypothesis Tests Example 2 ~ Right-Tailed Hypothesis Test • In the United States, the average car is driven about 12,000 miles each year. The owner of a large rental car company suspects that for his fleet, the mean distance is greater than 12,000 miles each year. He selects a random sample of n = 225 cars from his fleet and finds that the mean annual mileage for this sample is $\bar{x} = 12{,}375$ miles. Suppose that the standard deviation for that sample is 2,415 miles. Evaluate this claim by conducting a hypothesis test. • Step 1: State the null and alternative hypotheses: $H_0: \mu = 12{,}000$ miles and $H_a: \mu > 12{,}000$ miles • Step 2: Draw a sample and come up with a sample statistic and the standard deviation of that sample • This information was already given: • The sample mean is $\bar{x} = 12{,}375$ miles • The standard deviation for that sample is 2,415 miles • Step 3: Calculate the P-value and determine the level of significance: • The z-score is: $z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{12{,}375 - 12{,}000}{2{,}415/\sqrt{225}} \approx 2.33$

  28. One-Tailed Hypothesis Tests Example 2 Cont’d… • Step 3 cont’d… • The z-score was found to be 2.33, which corresponds to a probability of .9901 on the z-score table, but that represents the area below 12,375, and we are interested in the probability of a sample mean greater than that value, so we subtract .9901 from 1 (1 - .9901) and get a probability of .0099 • The P-value is .0099 which is less than .01, meaning that it is significant at the .01 level • Step 4: Decide if you should reject or not reject the null hypothesis • Since the P-value is significant at both levels (.05 and .01), we should reject the null hypothesis of 12,000 miles • What this means is that we have strong evidence to believe that the mean distance traveled for the rental car fleet is greater than 12,000 miles
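The right-tailed calculation for the rental car fleet can be sketched the same way. `statistics.NormalDist` stands in for the z-score table (an assumption about tooling, since the slides use a printed table), and the variable names are illustrative. The key difference from a left-tailed test is the subtraction from 1.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 12_000     # national mean annual mileage (null value)
x_bar = 12_375   # sample mean for the fleet
s = 2_415        # sample standard deviation
n = 225          # sample size

standard_error = s / sqrt(n)              # standard deviation of sample means
z = (x_bar - mu0) / standard_error
area_below = NormalDist().cdf(z)          # what the z-score table reports
p_value = 1 - area_below                  # right tail: area above z

print(f"z = {z:.2f}, area below = {area_below:.4f}, P-value = {p_value:.4f}")
```

Because the code does not round z to 2.33 before looking up the area, its P-value can differ from the table value in the last digit, but it is still below .01.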

  29. Critical Values for Statistical Significance • Since we can decide to reject the null hypothesis if the P-value is .05 or lower (or .01 or lower), we can use critical values as a quick guideline to decide if we should reject the null hypothesis or not • Critical values for the .05 significance level: • For a left-tailed test, the z-score that corresponds to a probability of .05 is -1.645, so any z-score that is less than or equal to -1.645 will be statistically significant at the .05 level • For a right-tailed test, the z-score that corresponds to a probability of .05 (found by looking up .95 in the table) is 1.645, so any z-score greater than or equal to 1.645 will be statistically significant at the .05 level • Critical values for the .01 significance level: • For a left-tailed test, the z-score that corresponds to a probability of .01 is -2.33, so any z-score that is less than or equal to -2.33 will be statistically significant at the .01 level • For a right-tailed test, the z-score that corresponds to a probability of .01 (found by looking up .99 in the table) is 2.33, so any z-score greater than or equal to 2.33 will be statistically significant at the .01 level
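The one-tailed critical values can be recovered from the standard normal distribution instead of a table. The sketch below uses `NormalDist.inv_cdf` from Python's standard library (available in Python 3.8 and later); the dictionary name `critical` is just for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

critical = {}
for alpha in (0.05, 0.01):
    left = std_normal.inv_cdf(alpha)        # z with area alpha to its left
    right = std_normal.inv_cdf(1 - alpha)   # z with area alpha to its right
    critical[alpha] = (left, right)
    print(f"alpha = {alpha}: left-tailed z <= {left:.3f}, "
          f"right-tailed z >= {right:.3f}")
```

The left and right values are mirror images of each other because the normal distribution is symmetric about zero.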

  30. Critical Values for Statistical Significance

  31. Two-Tailed Hypothesis Tests • The process for conducting a two-tailed hypothesis test is very similar to the one-tailed tests, except the critical values are slightly different • Since a two tailed test tests both above and below the claimed value, a .05 significance level would have to be split between the two extremes thus looking for a z-score that corresponds to a probability of .025 • The z-scores that correspond to a probability of .025 are -1.96 and 1.96, so for a two-tailed test, it is significant at the .05 level if the z-score is less than or equal to -1.96 or greater than or equal to 1.96

  32. Two-Tailed Hypothesis Tests • For a two-tailed test, a .01 significance level would mean that the z-score needs to correspond to a probability of .005 (.01 split in half) • The z-scores that correspond to a probability of .005 are -2.575 and 2.575, so if the z-score is less than or equal to -2.575 or greater than or equal to 2.575, then it is statistically significant at the .01 level • Summary of critical values for two-tailed tests: • .05 significance level: $z \le -1.96$ or $z \ge 1.96$ • .01 significance level: $z \le -2.575$ or $z \ge 2.575$

  33. Two-Tailed Hypothesis Tests Example 3 ~ Two-Tailed Hypothesis Test: • Consider the study in which University of Maryland researchers measured body temperatures in a sample of n = 106 healthy adults, finding a sample mean body temperature of $\bar{x} = 98.20$°F with a sample standard deviation of 0.62°F. We will assume that the population standard deviation is the same as the standard deviation found from the sample. Determine whether this sample provides evidence for rejecting the common belief that the mean human body temperature is 98.6°F • Step 1: State the null and alternative hypotheses: $H_0: \mu = 98.6$°F and $H_a: \mu \neq 98.6$°F • Step 2: Draw a sample and come up with a sample statistic and the standard deviation of that sample • This information was already given: • The sample mean is $\bar{x} = 98.20$°F • The standard deviation for that sample is 0.62°F

  34. Two-Tailed Hypothesis Tests Example 3 Cont’d… • Step 3: Calculate the P-value and determine the level of significance • To calculate the P-value for a two-tailed test, you must find the z-score like you would with a one-tailed test, but the probability that corresponds to it must then be multiplied by 2 • The z-score is: $z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{98.20 - 98.6}{0.62/\sqrt{106}} \approx -6.64$ • The P-value is less than .0002 (.0001 * 2), and since the z-score of -6.64 is far below both -1.96 and -2.575, this would be statistically significant at both levels • Step 4: Decide if you should reject or not reject the null hypothesis • The null hypothesis should be rejected, which provides strong evidence that the mean human body temperature is not 98.6°F. Because the test is two-tailed, the conclusion itself does not say whether the true mean is higher or lower.
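A sketch of the two-tailed calculation follows. The sample mean of 98.20°F is an assumption, chosen because it is the value consistent with the stated z-score of -6.64, and `statistics.NormalDist` stands in for the z-score table. The doubling step is what distinguishes this from the one-tailed examples.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 98.6      # commonly believed mean body temperature (null value), in F
x_bar = 98.20   # ASSUMED sample mean, consistent with z = -6.64
s = 0.62        # sample standard deviation, in F
n = 106         # sample size

standard_error = s / sqrt(n)
z = (x_bar - mu0) / standard_error

# Two-tailed: take the area in one tail beyond |z|, then double it so that
# results equally extreme in the other direction are also counted.
one_tail = NormalDist().cdf(-abs(z))
p_value = 2 * one_tail

print(f"z = {z:.2f}, P-value = {p_value:.2g}")
print("significant at the .05 level:", abs(z) >= 1.96)
print("significant at the .01 level:", abs(z) >= 2.575)
```

Using `-abs(z)` means the same code works whether the sample mean lands above or below the null value.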
