
Business Statistics: Communicating with Numbers By Sanjiv Jaggia and Alison Kelly



Presentation Transcript


  1. Business Statistics: Communicating with Numbers By Sanjiv Jaggia and Alison Kelly

  2. Chapter 17 Learning Objectives (LOs) LO 17.1: Use dummy variables to capture a shift of the intercept. LO 17.2: Test for differences between the categories of a qualitative variable. LO 17.3: Use dummy variables to capture a shift of the intercept and/or slope.

  3. Is There Evidence of Wage Discrimination? • Three Seton Hall professors recently learned in a court decision that they could pursue their lawsuit alleging the University paid higher salaries to younger instructors and male professors. • Mary Schweitzer works in human resources at another college and has been asked by the college to test for age and gender discrimination in salaries. • She gathers data on 42 professors, including the salary, experience, gender, and age of each.

  4. Is There Evidence of Wage Discrimination? • Using this data set, Mary hopes to: • Test whether salary differs by a fixed amount between males and females. • Determine whether there is evidence of age discrimination in salaries. • Determine if the salary difference between males and females increases with experience.

  5. 17.1 Dummy Variables LO 17.1 Use dummy variables to capture a shift of the intercept. • In previous chapters, all the variables used in regression applications have been quantitative. • In empirical work, it is common to have some variables that are qualitative: the values represent categories that may have no implied ordering. • We can include these factors in a regression through the use of dummy variables. • A dummy variable for a qualitative variable with two categories assigns a value of 1 for one of the categories and a value of 0 for the other.

  6. Variables with Two Categories LO 17.1 • For example, suppose we are interested in determining the impact of gender on salary. We might first define a dummy variable d (a more descriptive name, e.g., Dgender, is better in practice) with the following structure: Let d = 1 if gender = “female” and d = 0 if gender = “male.” • This allows us to include a measure for gender in a regression model and quantify the impact of gender on salary.
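The coding rule on this slide can be sketched in a few lines of Python. The list `genders` and the helper `gender_dummy` are hypothetical names used only for illustration; the values are not data from the presentation.

```python
# Hypothetical sample of gender labels; not data from the presentation.
genders = ["female", "male", "male", "female"]


def gender_dummy(gender):
    """Return 1 for "female" and 0 for "male", the coding used on the slide."""
    if gender not in ("female", "male"):
        raise ValueError(f"unexpected category: {gender!r}")
    return 1 if gender == "female" else 0


# One 0/1 value per observation, ready to be used as a regressor.
d = [gender_dummy(g) for g in genders]
print(d)  # [1, 0, 0, 1]
```

Rejecting unexpected labels, rather than silently coding them as 0, is a design choice made here so that typos in the data do not end up in the reference category.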

  7. Regression with a Dummy Variable LO 17.1

  8. Regression with a Dummy Variable LO 17.1

  9. Regression with a Dummy Variable LO 17.1 Graphically, we can see how the dummy variable shifts the intercept of the regression line.
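The equations for slides 7 to 9 are not reproduced in this transcript. A common way to write a model with one quantitative variable x and one dummy variable d is sketched below; the symbols are assumed for illustration and may differ from the notation on the original slides.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x + \beta_2 d + \varepsilon \\
\hat{y} &= b_0 + b_1 x             && \text{when } d = 0 \\
\hat{y} &= (b_0 + b_2) + b_1 x     && \text{when } d = 1
\end{align*}
```

Both lines share the slope b_1, so the two fitted lines are parallel, and the dummy coefficient b_2 is the vertical distance between them.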

  10. Salaries, Gender, and Age LO 17.1 d1 = 1 for male and 0 for female d2 = 0 for young and 1 for old

  11. Estimation Results LO 17.1 a. The estimated model is ŷ = 40.61 + 1.13x + 13.92d1 + 4.34d2. b. The predicted salary of a 50-year-old male professor (d1 = 1 and d2 = 0) with 10 years of experience (x = 10) is ŷ = 40.61 + 1.13(10) + 13.92(1) + 4.34(0) = 65.83, or $65,830. The corresponding salary of a 50-year-old female (d1 = 0 and d2 = 0) is ŷ = 40.61 + 1.13(10) + 13.92(0) + 4.34(0) = 51.91, or $51,910. The predicted difference in salary between a male and a female professor with 10 years of experience is $13,920 (65,830 − 51,910). This difference can also be inferred from the estimated coefficient 13.92 of the gender dummy variable d1. Note that the salary difference does not change with experience. For instance, the predicted salary of a 50-year-old male with 20 years of experience is $77,130. The corresponding salary of a 50-year-old female is $63,210, for the same difference of $13,920.

  12. Estimation Results LO 17.1 c. For a 65-year-old female professor with 10 years of experience, the predicted salary is ŷ = 40.61 + 1.13(10) + 13.92(0) + 4.34(1) = 56.25, or $56,250. Prior to any statistical testing, it appears that an older female professor earns, on average, $4,340 (56,250 − 51,910) more than a younger female professor with the same experience.
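The predictions in the Estimation Results slides can be reproduced with a short Python sketch. The coefficients are the ones quoted in the estimated model (salary in $1,000s); the function name `predict_salary` and its argument names are hypothetical.

```python
# Coefficients from the estimated model on the slides (salary in $1,000s).
B0, B_EXPER, B_MALE, B_OLD = 40.61, 1.13, 13.92, 4.34


def predict_salary(experience, male, old):
    """Predicted salary in $1,000s; male (d1) and old (d2) are 0/1 dummies."""
    return B0 + B_EXPER * experience + B_MALE * male + B_OLD * old


young_male = predict_salary(10, male=1, old=0)
young_female = predict_salary(10, male=0, old=0)
old_female = predict_salary(10, male=0, old=1)

print(round(young_male, 2))    # 65.83
print(round(young_female, 2))  # 51.91
print(round(old_female, 2))    # 56.25

# The gender gap equals the d1 coefficient at any experience level.
gap_10 = round(young_male - young_female, 2)
gap_20 = round(predict_salary(20, 1, 0) - predict_salary(20, 0, 0), 2)
print(gap_10, gap_20)  # 13.92 13.92
```

Because the dummies enter the model additively, changing `experience` moves both predictions by the same amount and leaves the gap unchanged.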

  13. Testing the Significance of Dummy Variables LO 17.2 Test for differences between the categories of a qualitative variable. • The statistical tests discussed in Chapter 15 remain valid for dummy variables as well. • We can perform a t-test (using p-value) for individual significance, form a confidence interval using the parameter estimate and its standard error, and conduct a partial F test for joint significance.
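As a reminder of the quantities involved, the individual test and confidence interval for the coefficient on a dummy variable can be written as follows. The notation (b_j for the estimate, se(b_j) for its standard error, n observations, k explanatory variables) is assumed here and may differ from the textbook's.

```latex
\begin{align*}
H_0 &: \beta_j = 0 \qquad H_A : \beta_j \neq 0 \\
t_{df} &= \frac{b_j}{se(b_j)}, \qquad df = n - k - 1 \\
\text{CI} &: \; b_j \pm t_{\alpha/2,\,df}\; se(b_j)
\end{align*}
```

Rejecting H_0 for a dummy coefficient indicates a statistically significant difference between that category and the reference category, holding the other variables fixed.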

  14. Example 17.2 LO 17.2

  15. Multiple Categories LO 17.2

  16. Multiple Categories LO 17.2

  17. Avoiding the Dummy Variable Trap LO 17.2 • Given the intercept term, we exclude one of the dummy variables from the regression. • If we included as many dummy variables as categories, this would create perfect multicollinearity in the data, and such a model cannot be estimated. • So, we include one less dummy variable than the number of categories of the qualitative variable.
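The trap can be illustrated with a small Python sketch. The region labels, the sample, and the helper `make_dummies` are hypothetical and are not taken from the presentation.

```python
# Hypothetical qualitative variable with four categories.
levels = ["North", "South", "East", "West"]
sample = ["East", "North", "West", "South", "East"]


def make_dummies(values, keep_levels):
    """Return one 0/1 column per level in keep_levels, keyed by level name."""
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in keep_levels}


def row_sums(columns, n_rows):
    """Sum the dummy columns row by row."""
    return [sum(col[i] for col in columns.values()) for i in range(n_rows)]


# All four dummies: every row sums to 1, which duplicates the intercept column
# (a column of ones) and creates perfect multicollinearity.
full = make_dummies(sample, levels)
full_sums = row_sums(full, len(sample))
print(full_sums)  # [1, 1, 1, 1, 1]

# Dropping one level ("West" becomes the reference) removes the exact dependence.
reduced = make_dummies(sample, levels[:-1])
reduced_sums = row_sums(reduced, len(sample))
print(reduced_sums)  # [1, 1, 0, 1, 1]
```

Which category is dropped is a modeling choice; the coefficients on the remaining dummies are then interpreted relative to that reference category.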

  18. Homework Problem 8 on p. 524. The data file (SATdummy) is posted on the S: drive. The answers are in the appendix.

  19. Example 17.3 • A recent article suggests that Asian-Americans face serious discrimination in the college admissions process (The Boston Globe, February 8, 2010). Specifically, Asian applicants typically need an extra 140 points on the SAT to compete with white students. Another report suggests that colleges are eager to recruit Hispanic students who are generally underrepresented in applicant pools (USA Today, February 8, 2010). In an attempt to corroborate these claims, a sociologist first wants to determine if SAT scores differ by ethnic background. She collects data on 200 individuals from her city with their recent SAT scores and ethnic background.

  20. Example 17.3 With four ethnic categories, we use three, not four, dummy variables (DV), as follows:

  21. Example 17.3 b. For an Asian individual, we set d1 = 0, d2 = 0, d3 = 1 and calculate ŷ = 1388.89 + 264.86 = 1653.75. Thus, the predicted SAT score for an Asian individual is approximately 1654. The predicted SAT score for a Hispanic individual (d1 = d2 = d3 = 0) is ŷ = 1388.89, or approximately 1389. c. Since the p-values corresponding to d1 and d3 are approximately zero, we conclude at the 5% level that the SAT scores of White and Asian students are different from those of Hispanic students. However, with a p-value of 0.16, we cannot conclude that the SAT scores of Black and Hispanic students are statistically different.
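The arithmetic in parts b and c can be checked with a short Python sketch. Only the values quoted on this slide are used; the coefficients on d1 and d2 are not given in the transcript, so the hypothetical helper `predict_sat` assumes d1 = d2 = 0. Reading d2 as the Black dummy is inferred from part c.

```python
# Values quoted on the slide; Hispanic is the reference category.
INTERCEPT = 1388.89   # predicted score when d1 = d2 = d3 = 0
COEF_ASIAN = 264.86   # coefficient on d3


def predict_sat(d3_asian):
    """Predicted SAT score with d1 = d2 = 0 (assumed) and d3 set to 0 or 1."""
    return INTERCEPT + COEF_ASIAN * d3_asian


asian_score = round(predict_sat(1), 2)
hispanic_score = round(predict_sat(0), 2)
print(asian_score)     # 1653.75
print(hispanic_score)  # 1388.89

# Decision rule at the 5% level for d2, using the p-value reported on the slide.
ALPHA = 0.05
p_value_d2 = 0.16
reject_d2 = p_value_d2 < ALPHA
print(reject_d2)  # False: no statistically significant difference detected
```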

  22. Homework Problem 11 on p. 524. The data file (Retail Sales) is posted on the S: drive. Do not do part d.
