Example 10.1 Experimenting with a New Pizza Style at the Pepperoni Pizza Restaurant. Concepts in Hypothesis Testing. Background Information. The manager of Pepperoni Pizza Restaurant has recently begun experimenting with a new method of baking its pepperoni pizzas.
Example 10.1 Experimenting with a New Pizza Style at the Pepperoni Pizza Restaurant • Concepts in Hypothesis Testing
Background Information • The manager of Pepperoni Pizza Restaurant has recently begun experimenting with a new method of baking its pepperoni pizzas.
Background Information – cont’d • He believes that the new method produces a better-tasting pizza, but he would like to base the decision of whether to switch from the old method to the new method on customer reactions. • Therefore he performs an experiment.
The Experiment • For 100 randomly selected customers who order a pepperoni pizza for home delivery, he includes both an old style and a free new style pizza in the order.
The Experiment – cont’d • All he asks is that these customers rate the difference between pizzas on a -10 to +10 scale, where -10 means they strongly favor the old style, +10 means they strongly favor the new style, and 0 means they are indifferent between the two styles. • Once he gets the ratings from the customers, how should he proceed?
Hypothesis Testing • This example’s goal is to explain hypothesis testing concepts. We are not implying that the manager would, or should, use a hypothesis testing procedure to decide whether to switch methods.
Hypothesis Testing – cont’d • First, hypothesis testing does not take costs into account. In this example, if the new method is more costly, that extra cost would be ignored by hypothesis testing. • Second, even if the costs of the two pizza-making methods are equivalent, the manager might base his decision on a simple point estimate and possibly a confidence interval.
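The point estimate and confidence interval mentioned above can be sketched as follows. This is a minimal illustration, not the manager's actual analysis: the ratings are randomly generated placeholders, the function name `mean_confidence_interval` is hypothetical, and the interval relies on the normal approximation, which is assumed to be adequate for a sample of 100.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(ratings, confidence=0.95):
    """Return (point_estimate, lower, upper) for the mean rating.

    Uses the normal approximation to the sampling distribution of the
    sample mean (an assumption; reasonable for a large sample).
    """
    n = len(ratings)
    xbar = mean(ratings)
    std_err = stdev(ratings) / math.sqrt(n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return xbar, xbar - z * std_err, xbar + z * std_err

# Hypothetical ratings on the -10 to +10 scale (NOT the manager's data).
random.seed(1)
ratings = [random.randint(-10, 10) for _ in range(100)]
estimate, lower, upper = mean_confidence_interval(ratings)
print(f"mean rating = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that lies entirely above 0 would suggest that customers favor the new style on average; one that straddles 0 would leave the question open.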
Null and Alternative Hypotheses • Usually, the null hypothesis is labeled H0 and the alternative hypothesis is labeled Ha. • The null and alternative hypotheses divide all possibilities into two nonoverlapping sets, exactly one of which must be true.
Null and Alternative Hypotheses – cont’d • Traditionally, hypothesis testing has been phrased as a decision-making problem, where an analyst decides either to accept the null hypothesis or reject it, based on the sample evidence.
One-Tailed Versus Two-Tailed Tests • The form of the alternative hypothesis can be either one-tailed or two-tailed, depending on what the analyst is trying to prove. • A one-tailed test is one where the only sample results that can lead to rejection of the null hypothesis are those in a particular direction; in this example, those where the sample mean rating is positive.
One-Tailed Versus Two-Tailed Tests – cont’d • A two-tailed test is one where results in either of two directions can lead to rejection of the null hypothesis. • Once the hypotheses are set up, it is easy to detect whether the test is one-tailed or two-tailed.
One-Tailed Versus Two-Tailed Tests – cont’d • One-tailed alternatives are phrased in terms of “>” or “<”, whereas two-tailed alternatives are phrased in terms of “≠”. • The real question is whether to set up hypotheses for a particular problem as one-tailed or two-tailed. • There is no statistical answer to this question. It depends entirely on what we are trying to prove.
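As an illustration, the two forms could be written for the pizza example as below. The symbol μ is an assumption introduced here (the slides do not name one) and stands for the mean rating over all customers. The one-tailed pair matches the earlier statement that only a positive sample mean rating can lead to rejection; the two-tailed pair is shown only for comparison.

```latex
\begin{align*}
\text{One-tailed:} \quad & H_0\colon \mu \le 0 \quad \text{versus} \quad H_a\colon \mu > 0 \\
\text{Two-tailed:} \quad & H_0\colon \mu = 0 \quad \text{versus} \quad H_a\colon \mu \ne 0
\end{align*}
```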
Types of Errors • Whether one decides to accept or reject the null hypothesis, it might be the wrong decision. • One might incorrectly reject the null hypothesis when it is true or incorrectly accept the null hypothesis when it is false. • These errors are called type I and type II errors, respectively.
Types of Errors – cont’d • In general, we commit a type I error when we incorrectly reject a null hypothesis that is true. We commit a type II error when we incorrectly accept a null hypothesis that is false. • These ideas appear graphically below.
Types of Errors – cont’d • While these errors seem to be equally serious, type I errors have traditionally been regarded as the more serious of the two. • Therefore, the hypothesis-testing procedure favors caution in terms of rejecting the null hypothesis.
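The two error types can be made concrete with a small simulation. This is a sketch under loudly stated assumptions, none of which come from the slides: ratings are modeled as normally distributed with standard deviation 5 (ignoring the -10 to +10 bounds), the test is a one-tailed test of "mean rating at most 0" based on the normal approximation, and the cutoff is chosen so that a true null hypothesis is rejected about 5% of the time. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def rejects_null(sample, alpha=0.05):
    """One-tailed test of H0: mu <= 0 using the normal approximation."""
    z = mean(sample) / (stdev(sample) / math.sqrt(len(sample)))
    return z > NormalDist().inv_cdf(1 - alpha)

def rejection_rate(true_mean, trials=2000, n=100, sd=5.0, seed=0):
    """Fraction of simulated samples in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = sum(
        rejects_null([rng.gauss(true_mean, sd) for _ in range(n)])
        for _ in range(trials)
    )
    return rejections / trials

# H0 true (mean rating 0): every rejection is a type I error.
type_1_rate = rejection_rate(true_mean=0.0)
# H0 false (mean rating 1): every failure to reject is a type II error.
type_2_rate = 1 - rejection_rate(true_mean=1.0)
print(f"type I error rate  ~ {type_1_rate:.3f}")
print(f"type II error rate ~ {type_2_rate:.3f}")
```

In this setup the type I error rate stays near the chosen 5%, while the type II error rate depends on how far the true mean is from 0, which is one way to see why the procedure is described as cautious about rejecting.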
Significance Level and Rejection Region • The real question is how strong the evidence in favor of the alternative hypothesis must be to reject the null hypothesis. • The analyst determines the probability of a type I error that he is willing to tolerate. The value is denoted by α and is most commonly equal to 0.05, although α = 0.01 and α = 0.10 are also frequently used.
Significance Level and Rejection Region – cont’d • The value of α is called the significance level of the test. • Then, given the value of α, we use statistical theory to determine the rejection region.
Significance Level and Rejection Region – cont’d • If the sample evidence falls into this region we reject the null hypothesis; otherwise, we accept it. • Sample evidence that falls into the rejection region is called statistically significant at the α level.
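One way the rejection region might be computed for a one-tailed test is sketched below. The sample summary (100 ratings with standard deviation 5 and mean 1.2) is hypothetical, the function names are illustrative, and the cutoff uses the normal approximation to the sampling distribution of the sample mean, which is an assumption rather than something the slides specify.

```python
import math
from statistics import NormalDist

def one_tailed_rejection_cutoff(alpha, sample_sd, n, null_mean=0.0):
    """Smallest sample mean that leads to rejecting H0: mu <= null_mean.

    Based on the normal approximation (an assumption; reasonable for
    large n). The rejection region is every sample mean above the cutoff.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return null_mean + z_crit * sample_sd / math.sqrt(n)

def decide(sample_mean, cutoff):
    """Reject H0 when the sample mean falls in the rejection region."""
    return "reject H0" if sample_mean > cutoff else "accept H0"

# Hypothetical sample summary: n = 100 ratings with standard deviation 5.
cutoff = one_tailed_rejection_cutoff(alpha=0.05, sample_sd=5.0, n=100)
print(f"rejection region: sample mean > {cutoff:.3f}")
print(decide(sample_mean=1.2, cutoff=cutoff))
```

A smaller significance level pushes the cutoff further from 0, so stronger sample evidence is needed before the null hypothesis is rejected.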
Significance from p-values • This approach is currently more popular than the significance level and rejection region approach. • This approach is to avoid the use of the α level and instead simply report “how significant” the sample evidence is.
Significance from p-values – cont’d • We do this by means of the p-value. The p-value is the probability of seeing a random sample at least as extreme as the observed sample, given that the null hypothesis is true. • Here “extreme” is relative to the null hypothesis.
Significance from p-values – cont’d • In general smaller p-values indicate more evidence in support of the alternative hypothesis. If a p-value is sufficiently small, almost any decision maker will conclude that rejecting the null hypothesis is the more “reasonable” decision.
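A p-value for a one-tailed test could be computed along the following lines. This is a sketch only: the sample summary is hypothetical, the function name is illustrative, and the calculation uses the normal approximation with "extreme" taken to mean a sample mean at least as large as the one observed.

```python
import math
from statistics import NormalDist

def one_tailed_p_value(sample_mean, sample_sd, n, null_mean=0.0):
    """P(sample mean at least this large) when mu equals null_mean.

    Uses the normal approximation (an assumption; reasonable for large n).
    """
    std_err = sample_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / std_err
    return 1 - NormalDist().cdf(z)

# Hypothetical sample summary: 100 ratings, mean 1.2, standard deviation 5.
p = one_tailed_p_value(sample_mean=1.2, sample_sd=5.0, n=100)
print(f"p-value = {p:.4f}")
```

Unlike the rejection-region approach, no significance level appears in the calculation; the reader is left to judge how small the reported value is.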
Significance from p-values – cont’d • How small is a “small” p-value? This is largely a matter of semantics, but if the • p-value is less than 0.01, it provides “convincing” evidence that the alternative hypothesis is true; • p-value is between 0.01 and 0.05, there is “strong” evidence in favor of the alternative hypothesis;
Significance from p-values – cont’d • p-value is between 0.05 and 0.10, it is in a “gray area”; • p-value is greater than 0.10, there is weak or no evidence in support of the alternative hypothesis.
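The rough guidelines above can be written as a small lookup. The function name `describe_p_value` is hypothetical, and the handling of values that fall exactly on a boundary (0.01, 0.05, 0.10) is an assumption, since the slides do not say which category they belong to.

```python
def describe_p_value(p):
    """Map a p-value to the informal evidence categories from the slides.

    Boundary values are assigned to the weaker category (an assumption).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "convincing evidence for the alternative"
    if p < 0.05:
        return "strong evidence for the alternative"
    if p < 0.10:
        return "gray area"
    return "weak or no evidence for the alternative"

for p in (0.004, 0.03, 0.07, 0.25):
    print(f"p = {p}: {describe_p_value(p)}")
```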