1 / 35

Business Statistics : Communicating with Numbers By Sanjiv Jaggia and Alison Kelly

Business Statistics : Communicating with Numbers By Sanjiv Jaggia and Alison Kelly. Chapter 13 Learning Objectives (LOs). LO 13.1: Provide a conceptual overview of ANOVA. LO 13.2: Conduct and evaluate hypothesis tests based on one-way ANOVA.

daxia
Download Presentation

Business Statistics : Communicating with Numbers By Sanjiv Jaggia and Alison Kelly

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Business Statistics: Communicating with Numbers By Sanjiv Jaggia and Alison Kelly

  2. Chapter 13 Learning Objectives (LOs) LO 13.1:Provide a conceptual overview of ANOVA. LO 13.2:Conduct and evaluate hypothesis tests based on one-way ANOVA. LO 13.3:Use Tukey’sHSD method in order to determine which means differ. LO 13.4:Conduct and evaluate hypothesis tests based on two-way ANOVA with no interaction. LO 13.5:Conduct and evaluate hypothesis tests based on two-way ANOVA with interaction.

  3. Chapter Case - How Much Does Using Public Transportation Save? • Environmental and economic concerns have led to an upswing in the use of public transportation. • Research analyst Sean Cox looked at study results from a Boston Globe article that claimed commuters there topped the nation in cost savings from public transportation. • He wants to know if the average savings significantly differ among these cities.

  4. How Much Does Using Public Transportation Save?

  5. 13.1 One-Way ANOVA LO 13.1 Provide a conceptual overview of ANOVA. • Analysis of Variance (ANOVA) is used to determine if there are differences among three or more populations means (µ). • One-way ANOVA compares population means based on one categorical variable (City in the chapter case) called treatment. • We utilize a completely randomized design, comparing sample means computed for each treatment to test whether the population means differ.

  6. ANOVA Assumptions LO 13.1 The assumptions are extensions of those we used when comparing just two populations: • The populations are normally distributed. • The population standard deviations are unknown but assumed equal. • Samples are selected independently from each population. Here we compare a total of c (more than 2) populations, rather than just two.

  7. 4 Steps of HT State hypotheses (H0 and HA) Find test statistic (Fin this case) Find critical or p-value (We get them from the Excel output) Conclusion and interpretation

  8. The Hypothesis Test LO 13.2 Conduct and evaluate hypothesis tests based on one-way ANOVA. • Step 1: The competing hypotheses for the one-way ANOVA: H0: µ1 = µ2 = µ3 = µ4 (different treatments have no different effect on population means) HA: Not all population means are equal (different treatments have different effect on population means) We have only this pair of hypotheses.

  9. The ANOVA Concept LO 13.2 • The competing hypotheses are displayed graphically below: • The left graph depicts the null hypothesis, where all sample means are drawn from the same distribution. • On the right, the distributions, and population means, differ. Sample means are close enough to one another  population means are same.

  10. The ANOVA Concept LO 13.2 The rationale is If sample means are close enough to one another, we guess that population means are same. Then how close the sample means should be? They should be close enough so that the variance of the sample means is small enough. How small the variance should be? It is determined by F critical value in the F distribution. So we convert the variance of sample means into F test statistic. F value the ratio of once variance to another variance. So we find another variance and divide the variance of sample means by another variance. This is the F test statistic.

  11. The ANOVA Concept LO 13.2 The rationale is 9. The F test statistic should be small enough (smaller than the critical value) for us to conclude that population means are same. If F test statistic < F critical value, then we do not reject H0. If F test statistic > F critical value, then we reject H0. 10. Small enough F test statistic also means p –value (P(F >F test statistic) is greater than level of significance (α) If p-value > α, then we do not reject H0. If p-value < α, then we reject H0.

  12. Chapter Case • Steps 2 and 3: Generate Excel table This ANOVA table shows F test statistic and critical value, but we will use the p-value, which is 7.96/100000000000000000000. It is practically 0. Please see Public Transportation in Excel Demonstration in D2L to see how to generate this table. The data is posted in Excel Data so you can practice it.

  13. Chapter Case • Step 4: F test statistic (610.5694) > F critical value (3.0984), so we reject H0. Or p-value (0) < α (0.05). So we reject H0. It means that there is significance difference among the average cost savings in 4 cities.

  14. 13.2 Multiple Comparison Methods LO 13.3 Use confidence intervals and Tukey’s HSD method in order to determine which means differ. • When the one-way ANOVA finds significant differences between the population means (as in the chapter case), it is natural to ask which means differ. • In this section we show two techniques for performing this follow-up analysis: • Fisher’s Least Difference Method (Not very efficient) • Tukey’s Honestly Significant Differences Method

  15. The Tukey HSD Procedure LO 13.3 • Tukey’s method generates confidence intervals (CI) of the form: • If CI does not include 0, there is difference between 2 population means (positive or negative) • The q-statistic comes from the studentized range distribution with c and (nT – c) degrees of freedom.

  16. The Tukey HSD Procedure LO 13.3 • Is there difference between Boston and Chicago? • Boston sample mean = 12622 • Chicago sample mean = 10730 • With α = 0.05, c = 4, nT – c = 24 – 4 = 20, q = 3.96 • MSE = 7209 • nBoston = 5 nChicago = 5

  17. The Tukey HSD Procedure LO 13.3 • confidence interval (CI) for the difference between Boston and Chicago = (12622-10730) ± 3.96 7209/2 (1/5+1/5) = 1892 ± 150.3653 = (1741.635, 2042.365) CI does not include 0, so there is difference between Boston and Chicago (Boston > Chicago). You can repeat this procedure for all pairs of cities.

  18. Excel Output of the Chapter Case c-1 MSE nT- c

  19. Finding q

  20. Homework • Problem 8 on page 394. To answer a, Download “Detergent” data from s:\jslee\QM3620\Data and run Excel to generate the ANOVA table. • Problem 14 on page 400. Answers are on pp. 677-678 in the appendix.

  21. 13.3 Two-Way ANOVA with No Interaction - Example 13.4 • She interviews four workers in three fields and determines their salaries. • This is one-way ANOVA • Data are not reliable

  22. Two-Way ANOVA Example • If the education level of the 12 workers is considered, a different story emerges. The data is posted as “Two_Factor_Income” in Excel Data. It is clear that education also impacts wage. Se we collect more reliable data this way

  23. The Randomized Block Design • This type of two-way ANOVA is called a randomized block design. • The term “block” refers to a matched set of observations across the treatments. • In the salary example, the treatments are the three fields of employment. • The blocks are the education levels. Until we account for them, we cannot capture the employment field effects.

  24. ANOVA for the Salary Example LO 13.4 H0: µ1 = µ2 = µ3 (Average salaries in 3 fields are same) H1: Not all the average salaries are same

  25. ANOVA for the Salary Example LO 13.4 See “Two Factor Income” In Excel Demonstration in D2L. You can also find the date in Excel Data.

  26. Interpreting the Results LO 13.4 Steps 2, 3 & 4 • F test statistic of column (56.58) > f critical value (4.76). Also p-value of column is about 0.03 (< 0.05). So, we reject H0. After controlling for educational level, the average salaries differ. • Salary also differs by educational level, as indicated by the very small p-value.

  27. ANOVA for the Salary Example LO 13.4 Hypothesis are always: H0: Same means (There is no effect of the factor) HA: Different means (There is effect of the factor)

  28. Homework Problem 30 on p. 409. Do parts a, b and c. The data, “House” is posted on S:\jslee\QM3620\Data. You should download this file and run Excel to generate ANOVA table. Answers are on p. 679 in the Appendix.

  29. 13.4: Two-Way ANOVA with Interaction LO 13.5 Conduct and evaluate hypothesis tests based on two-way ANOVA with interaction. • Now we will look at data categorized by two factors, but with two or more values observed in each “cell.” • We want to know three effects: Effect of column factor Effect of row factor Interaction effect with three p-values (You can use F test statistics and F critical values, but using p-values is simpler)

  30. What is Interaction? LO 13.5 • Interaction means that the effect of one factor depends on the level of the other factor. • For example, perhaps education impacts salaries in the financial sector, but not in professional sports. The two categories, employment sector and education, interact differently depending on the sector.

  31. Another Salary Example LO 13.5 • Each interior cell entry represents the average salary of 3 employees who fit the categories. • By choosing Data > Data Analysis > ANOVA: Two Factor With Replication, we are able to obtain Excel output.

  32. Two-way ANOVA with Interaction LO 13.5 Data and demonstration are in Excel Data and Excel Demonstration. The are called “Two_Factor_Income_with_Interaction.”

  33. Interpreting the Output LO 13.5 • We could do all the steps of HT in a simple way. If p-value is smaller than level of significance (α), then we conclude the relevant factor has effect.

  34. Interpreting the Output LO 13.5 • P-value for row factor (Education level) = 0 < α =0.5, reject H0, meaning there is effect. • P-value for column factor (Field) = 0 < α =0.5, reject H0, meaning there is effect. • P-value for Interaction (Education level * Field) = 0.0017 < α =0.5, reject H0, meaning there is effect.

  35. Homework Problem 38 on p. 415. Data called “Job Satisfaction” is posted on s:\jslee\qm3620\Data. Answers are on p. 679.

More Related