Multinomial Logistic Regression

yair

Presentation Transcript


  1. Multinomial Logistic Regression • “Inanimate objects can be classified scientifically into three major categories; those that don't work, those that break down and those that get lost” (Russell Baker)

  2. Multinomial Logistic Regression • Also known as “polytomous” or “nominal logistic” or “logit regression” or the “discrete choice model” • Generalization of binary logistic regression to a polytomous DV • When applied to a dichotomous DV identical to binary logistic regression

  3. Polytomous Variables • Three or more unordered categories • Categories mutually exclusive and exhaustive • Sometimes called “multicategorical” or sometimes “multinomial” variables

  4. Polytomous DVs • Reason for leaving welfare: • marriage, stable employment, move to another state, incarceration, or death • Status of foster home application: • licensed to foster, discontinued application process prior to licensure, or rejected for licensure • Changes in living arrangements of the elderly: • newly co-residing with their children, no longer co-residing, or residing in institutions

  5. Single (Dichotomous) IV Example • DV = interview tracking effort • easy-to-interview and track mothers (Easy); • difficult-to-track mothers who required more telephone calls (MoreCalls); • difficult-to-track mothers who required more unscheduled home visits (MoreVisits) • IV = race, 0 = European-American, 1 = African-American • N = 246 mothers • What is the relationship between race and interview tracking effort?

  6. Crosstabulation • Table 3.1 • Relationship between race and tracking effort is statistically significant [χ²(2, N = 246) = 8.69, p = .013]
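
The reported statistic can be checked from the cell counts. Table 3.1 is not reproduced in this transcript, so this sketch assumes the counts that appear in the slide 8 calculations (96, 30, 17 for European-American mothers and 53, 24, 26 for African-American mothers). The variable names are illustrative only.

```python
import math

# Cell counts assumed from the slide 8 calculations (Table 3.1 is not shown).
# Column order: Easy, MoreCalls, MoreVisits.
counts = {
    "European-American": [96, 30, 17],
    "African-American": [53, 24, 26],
}

rows = list(counts.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected.
chi2 = 0.0
for row, row_total in zip(rows, row_totals):
    for observed, col_total in zip(row, col_totals):
        expected = row_total * col_total / n
        chi2 += (observed - expected) ** 2 / expected

df = (len(rows) - 1) * (len(col_totals) - 1)

# With exactly 2 degrees of freedom the chi-square upper-tail probability
# has the closed form exp(-x / 2). This shortcut is valid only for df == 2.
p_value = math.exp(-chi2 / 2)

print(f"chi2({df}, N = {n}) = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumed counts the result agrees with the reported statistic to rounding.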

  7. Reference Category • In binary logistic regression the category of the DV coded 0 implicitly serves as the reference category • Also known as the “baseline,” “base,” or “comparison” category • In multinomial logistic regression it is necessary to select the reference category explicitly • “Easy” selected here

  8. Probabilities • Table 3.1 • More Calls (vs. Easy) • European-American: .24 = 30 / (30 + 96) • African-American: .31 = 24 / (24 + 53) • More Visits (vs. Easy) • European-American: .15 = 17 / (17 + 96) • African-American: .33 = 26 / (26 + 53)

  9. Odds & Odds Ratio • More Calls (vs. Easy) • European-American: .3125 = .2098 / .6713 • African-American: .4528 = .2330 / .5146 • Odds Ratio = 1.45 = .4528 / .3125 • 45% increase in the odds • More Visits (vs. Easy) • European-American: .1771 = .1189 / .6713 • African-American: .4905 = .2524 / .5146 • Odds Ratio = 2.77 = .4905 / .1771 • 177% increase in the odds
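
The odds and odds ratios above can be recomputed from the counts used in the slide 8 calculations. Within a group, the ratio of two category counts equals the ratio of the two category proportions, because the group total cancels. This is a minimal sketch; the dictionary layout and function names are hypothetical.

```python
# Cell counts assumed from the slide 8 calculations.
counts = {
    "European-American": {"Easy": 96, "MoreCalls": 30, "MoreVisits": 17},
    "African-American": {"Easy": 53, "MoreCalls": 24, "MoreVisits": 26},
}
REFERENCE = "Easy"


def odds_vs_reference(group_counts, category, reference=REFERENCE):
    """Odds of `category` relative to `reference` within one group."""
    return group_counts[category] / group_counts[reference]


def odds_ratio(category):
    """Odds for African-American mothers divided by odds for European-American mothers."""
    return (odds_vs_reference(counts["African-American"], category)
            / odds_vs_reference(counts["European-American"], category))


for category in ("MoreCalls", "MoreVisits"):
    ratio = odds_ratio(category)
    pct_change = 100 * (ratio - 1)  # percent change in the odds
    print(f"{category} vs. {REFERENCE}: OR = {ratio:.2f} ({pct_change:+.0f}%)")
```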

  10. Question & Answer • What is the relationship between race and interview tracking effort? • The odds of requiring more calls, compared to being easy-to-track, are higher for African-Americans by a factor of 1.45 (45%). The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).

  11. Multinomial Logistic Regression • Set of binary logistic regression models estimated simultaneously • Number of non-redundant binary logistic regression equations equals the number of categories of the DV minus one

  12. Statistical Significance • Table 3.2 • H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0 • Reject • Table 3.3 • H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0 • Reject • Table 3.4 • H0: β(Race, More Calls vs. Easy) = 0 • Don’t reject • H0: β(Race, More Visits vs. Easy) = 0 • Reject
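
The per-contrast decisions can be illustrated with Wald tests. With a single dichotomous IV the model is saturated, so each race slope equals the log odds ratio of the corresponding 2 x 2 subtable, and its large-sample standard error is the square root of the summed reciprocal cell counts. The tables themselves are not shown here, so this is a sketch under the counts assumed from slide 8 and an assumed .05 significance level, not a reproduction of the SPSS output.

```python
import math

# Cell counts assumed from the slide 8 calculations (Tables 3.2 to 3.4 are not shown).
# EA = European-American, AA = African-American.
easy = {"EA": 96, "AA": 53}
more_calls = {"EA": 30, "AA": 24}
more_visits = {"EA": 17, "AA": 26}

ALPHA = 0.05  # assumed significance level


def wald_test(category, reference=easy):
    """Return (B, SE, z, two-sided p) for the race slope of one contrast."""
    cells = [category["AA"], reference["AA"], category["EA"], reference["EA"]]
    # Slope = log odds ratio (African-American vs. European-American).
    b = math.log((cells[0] / cells[1]) / (cells[2] / cells[3]))
    # Large-sample standard error: square root of the summed reciprocal counts.
    se = math.sqrt(sum(1 / c for c in cells))
    z = b / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return b, se, z, p


for label, category in (("More Calls", more_calls), ("More Visits", more_visits)):
    b, se, z, p = wald_test(category)
    decision = "reject" if p < ALPHA else "don't reject"
    print(f"{label} vs. Easy: B = {b:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.3f} ({decision})")
```

Under these assumptions the decisions match the slide: the More Calls contrast is not rejected and the More Visits contrast is.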

  13. Odds Ratios • OR(More Calls vs. Easy) = 1.45 • The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans. • OR(More Visits vs. Easy) = 2.77 • The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).

  14. Estimated Logits (L) • Table 3.4 • L(More Calls vs. Easy) = a + B_Race·X_Race • L(More Calls vs. Easy) = -1.163 + (.371)(X_Race) • L(More Visits vs. Easy) = a + B_Race·X_Race • L(More Visits vs. Easy) = -1.731 + (1.019)(X_Race)

  15. Logits to Odds • African-Americans (X = 1) • L(More Calls vs. Easy) = -.792 = -1.163 + (.371)(1) • Odds = e^(-.792) = .45 • L(More Visits vs. Easy) = -.712 = -1.731 + (1.019)(1) • Odds = e^(-.712) = .49

  16. Logits to Probabilities • African-Americans, L(More Calls vs. Easy) = -.792 • African-Americans, L(More Visits vs. Easy) = -.712
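
The conversion formulas on this slide did not survive in the transcript. One standard conversion for a multinomial logit model divides each category's odds by one plus the sum of all the odds, where the 1 stands for the reference category. The sketch below applies it to the slide 14 coefficients; whether the original slide used exactly this form is an assumption.

```python
import math

# Intercepts (a) and race slopes (B) as given on slide 14 (Table 3.4).
coefficients = {
    "MoreCalls": {"a": -1.163, "B_race": 0.371},
    "MoreVisits": {"a": -1.731, "B_race": 1.019},
}


def logits(x_race):
    """Estimated logit of each non-reference category versus Easy."""
    return {cat: c["a"] + c["B_race"] * x_race for cat, c in coefficients.items()}


def probabilities(x_race):
    """Convert logits to probabilities; the reference category has odds of 1."""
    odds = {cat: math.exp(value) for cat, value in logits(x_race).items()}
    denominator = 1 + sum(odds.values())
    probs = {cat: o / denominator for cat, o in odds.items()}
    probs["Easy"] = 1 / denominator
    return probs


for x_race, label in ((0, "European-American"), (1, "African-American")):
    formatted = ", ".join(f"{cat} = {p:.2f}" for cat, p in probabilities(x_race).items())
    print(f"{label}: {formatted}")
```

These are probabilities across all three categories, so they differ from the pairwise values on slide 8, which condition on being in one of only two categories.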

  17. Question & Answer • What is the relationship between race and interview tracking effort? • The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans. • The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).

  18. Single (Quantitative) IV Example • DV = interview tracking effort • easy-to-interview and track mothers (Easy); • difficult-to-track mothers who required more telephone calls (MoreCalls); • difficult-to-track mothers who required more unscheduled home visits (MoreVisits) • IV = years of education • N = 246 mothers • What is the relationship between education and interview tracking effort?

  19. Statistical Significance • Table 3.6 • H0: β(Education, More Calls vs. Easy) = β(Education, More Visits vs. Easy) = 0 • Reject • Table 3.7 • H0: β(Education, More Calls vs. Easy) = 0 • Don’t reject • H0: β(Education, More Visits vs. Easy) = 0 • Reject

  20. Odds Ratios • OR(More Calls vs. Easy) = .88 • The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education. • OR(More Visits vs. Easy) = .76 • For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .76 (i.e., -24.1%).
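
The link between a slope, its odds ratio, and the percent change in the odds can be made explicit: OR = e^B and percent change = 100(OR - 1). This sketch uses the education slope for More Visits vs. Easy given on slide 22 (-.276). The function name and the four-year comparison are illustrative additions.

```python
import math


def percent_change_in_odds(b, delta=1.0):
    """Percent change in the odds for a `delta`-unit increase in the IV."""
    return 100 * (math.exp(b * delta) - 1)


# Education slope for More Visits vs. Easy as given on slide 22 (Table 3.7).
b_visits = -0.276

print(f"OR per year of education = {math.exp(b_visits):.2f}")
print(f"Change in odds per year = {percent_change_in_odds(b_visits):.1f}%")
# Effects compound multiplicatively: a 4-year difference multiplies the odds
# by OR**4, which is not the same as 4 times the one-year percent change.
print(f"Change in odds per 4 years = {percent_change_in_odds(b_visits, delta=4):.1f}%")
```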

  21. Figures • Education.xls

  22. Estimated Logits (L) • Table 3.7 • X = 12 (high school education) • L(More Calls vs. Easy) = -.977 = .583 + (-.130)(12) • L(More Visits vs. Easy) = -1.235 = 2.077 + (-.276)(12)

  23. Effect of Education on Tracking Effort (Logits)

  24. Logits to Odds • X = 12 (high school education) • Odds(More Calls vs. Easy) = e^(-.977) = .38 • Odds(More Visits vs. Easy) = e^(-1.235) = .29

  25. Effect of Education on Tracking Effort (Odds)

  26. Logits to Probabilities X = 12 (high school education)

  27. Effect of Education on Tracking Effort (Probabilities)
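
The figures for slides 23 to 27 are not included in the transcript. The values they plot can be regenerated from the slide 22 coefficients, as in this sketch. The range of education values and the function name are assumptions made for illustration.

```python
import math

# (intercept, education slope) as given on slide 22 (Table 3.7).
coefficients = {
    "MoreCalls": (0.583, -0.130),
    "MoreVisits": (2.077, -0.276),
}


def predict(years_of_education):
    """Return (logits, odds, probabilities) at a given number of years of education."""
    logits = {cat: a + b * years_of_education for cat, (a, b) in coefficients.items()}
    odds = {cat: math.exp(value) for cat, value in logits.items()}
    denominator = 1 + sum(odds.values())  # the 1 is the odds of Easy versus itself
    probs = {cat: o / denominator for cat, o in odds.items()}
    probs["Easy"] = 1 / denominator
    return logits, odds, probs


# Illustrative range of education values (not taken from the source).
for years in (8, 10, 12, 14, 16):
    logits, odds, probs = predict(years)
    print(
        f"X = {years:2d}: "
        f"odds(Calls) = {odds['MoreCalls']:.2f}, odds(Visits) = {odds['MoreVisits']:.2f}, "
        f"P(Easy) = {probs['Easy']:.2f}, P(Calls) = {probs['MoreCalls']:.2f}, "
        f"P(Visits) = {probs['MoreVisits']:.2f}"
    )
```

At X = 12 the logits and odds reproduce the values on slides 22 and 24.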

  28. Question & Answer • What is the relationship between education and interview tracking effort? • The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education. For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .76 (i.e., -24.1%).

  29. Multiple IV Example • DV = interview tracking effort • easy-to-interview and track mothers (Easy); • difficult-to-track mothers who required more telephone calls (MoreCalls); • difficult-to-track mothers who required more unscheduled home visits (MoreVisits) • IV = race, 0 = European-American, 1 = African-American • IV = years of education • N = 246 mothers

  30. Multiple IV Example (cont’d) • What is the relationship between race and interview tracking effort, when controlling for education?

  31. Statistical Significance • Table 3.8 • H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = β(Ed, More Calls vs. Easy) = β(Ed, More Visits vs. Easy) = 0 • Reject • Table 3.9 • H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0 • Reject • H0: β(Ed, More Calls vs. Easy) = β(Ed, More Visits vs. Easy) = 0 • Reject

  32. Statistical Significance (cont’d) • Table 3.10 • H0: β(Race, More Calls vs. Easy) = 0 • Don’t reject • H0: β(Race, More Visits vs. Easy) = 0 • Reject • H0: β(Ed, More Calls vs. Easy) = 0 • Don’t reject • H0: β(Ed, More Visits vs. Easy) = 0 • Reject

  33. Odds Ratios: Race • OR(More Calls vs. Easy) = 1.36 • The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans. • OR(More Visits vs. Easy) = 2.48 • The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.48 (148%).

  34. Odds Ratios: Education • OR(More Calls vs. Easy) = .89 • The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education. • OR(More Visits vs. Easy) = .77 • For every additional year of education the odds of needing more visits, compared to being easy-to-track, decrease by a factor of .77 (i.e., -23%), when controlling for race.

  35. Figures • Race & Education.xls

  36. Effect of Education on Tracking Effort for African-Americans (Odds)

  37. Effect of Education on Tracking Effort for African-Americans (Probabilities)

  38. Question & Answer • What is the relationship between race and interview tracking effort, when controlling for education? • The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans, when controlling for education. The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.48 (148%), when controlling for education.

  39. Assumptions Necessary for Testing Hypotheses • Assumptions discussed in GZLM lecture • Independence of irrelevant alternatives (IIA) • Odds of one outcome (e.g., More Calls) relative to another (e.g., Easy) are not influenced by other alternatives (e.g., More Visits)
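
The IIA property is built into the form of the model, which a short numerical check makes visible: changing the logit of a third alternative changes the probabilities but leaves the odds between the other two untouched. This sketch uses the slide 15 logits for African-American mothers. The altered More Visits logit of -5.0 is a hypothetical value chosen for illustration. The check shows what the model implies; it is not a test of whether the assumption holds in the data.

```python
import math


def probabilities(logit_calls, logit_visits):
    """Multinomial logit probabilities with Easy as the reference category."""
    odds_calls, odds_visits = math.exp(logit_calls), math.exp(logit_visits)
    denominator = 1 + odds_calls + odds_visits
    return {
        "Easy": 1 / denominator,
        "MoreCalls": odds_calls / denominator,
        "MoreVisits": odds_visits / denominator,
    }


# Logits for African-American mothers as given on slide 15.
base = probabilities(-0.792, -0.712)
# Hypothetical scenario: the More Visits alternative becomes far less likely.
altered = probabilities(-0.792, -5.0)

odds_base = base["MoreCalls"] / base["Easy"]
odds_altered = altered["MoreCalls"] / altered["Easy"]

print(f"Odds of More Calls vs. Easy: {odds_base:.4f} (base), {odds_altered:.4f} (altered)")
print(f"P(More Calls): {base['MoreCalls']:.3f} (base), {altered['MoreCalls']:.3f} (altered)")
```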

  40. Model Evaluation • Create a set of binary DVs from the polytomous DV: • recode TrackCat (1=0) (2=1) (3=sysmis) into MoreCalls. • recode TrackCat (1=0) (2=sysmis) (3=1) into MoreVisits. • Run separate binary logistic regressions • Use binary logistic regression methods to detect outliers and influential observations
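
For readers working outside SPSS, the same recoding can be sketched in Python. The coding 1 = Easy, 2 = MoreCalls, 3 = MoreVisits is inferred from the recode commands rather than stated in the source, `None` stands in for SPSS system-missing, and the sample values are hypothetical.

```python
def recode(track_cat, mapping):
    """Map a TrackCat code to 0/1, or None where SPSS would assign system-missing."""
    return mapping.get(track_cat)


# Mirrors the two recode commands; the category coding is inferred from them.
MORE_CALLS_MAP = {1: 0, 2: 1, 3: None}   # Easy = 0, MoreCalls = 1, MoreVisits dropped
MORE_VISITS_MAP = {1: 0, 2: None, 3: 1}  # Easy = 0, MoreVisits = 1, MoreCalls dropped

# Hypothetical TrackCat values for six mothers (not data from the study).
track_cat = [1, 2, 3, 1, 3, 2]

more_calls = [recode(value, MORE_CALLS_MAP) for value in track_cat]
more_visits = [recode(value, MORE_VISITS_MAP) for value in track_cat]

print("MoreCalls: ", more_calls)
print("MoreVisits:", more_visits)
```

Each binary DV keeps the reference category as 0 and excludes the other non-reference category, so each separate binary logistic regression compares one category with Easy only.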

  41. Model Evaluation (cont’d) • Index plots • Leverage values • Standardized or unstandardized deviance residuals • Cook’s D • Graph and compare observed and estimated counts

  42. Analogs of R² • None in standard use and each may give different results • Typically much smaller than R² values in linear regression • Difficult to interpret
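
The slide does not name specific analogs. As an illustration, the sketch below computes three commonly reported ones (McFadden, Cox and Snell, Nagelkerke) for the race-only model, using the counts assumed from slide 8. Because that model is saturated, its log-likelihood can be computed directly from the observed proportions without fitting software. The choice of these three measures is an assumption, not something the source specifies.

```python
import math

# Cell counts assumed from the slide 8 calculations.
# Rows: European-American, African-American; columns: Easy, MoreCalls, MoreVisits.
rows = [[96, 30, 17], [53, 24, 26]]


def multinomial_ll(category_counts):
    """Log-likelihood of category counts under their own observed proportions."""
    total = sum(category_counts)
    return sum(c * math.log(c / total) for c in category_counts if c > 0)


n = sum(sum(row) for row in rows)
# Intercept-only model: one set of proportions for everyone.
ll_null = multinomial_ll([sum(col) for col in zip(*rows)])
# Race model (saturated here): separate proportions for each group.
ll_model = sum(multinomial_ll(row) for row in rows)

mcfadden = 1 - ll_model / ll_null
cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
nagelkerke = cox_snell / (1 - math.exp(2 * ll_null / n))

print(f"McFadden = {mcfadden:.3f}, Cox & Snell = {cox_snell:.3f}, Nagelkerke = {nagelkerke:.3f}")
```

The three values differ from one another and are all small, which is consistent with the cautions on the slide.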

  43. Multicollinearity • SPSS multinomial logistic regression doesn’t compute multicollinearity statistics • Use SPSS linear regression • Problematic levels • Tolerance < .10 or • VIF > 10
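
Tolerance and VIF are simple to compute by hand when there are only two IVs, as in the race and education example: tolerance is 1 - R² from regressing one IV on the other, which with two IVs equals 1 - r², and VIF is 1 / tolerance. The data below are hypothetical values for eight mothers, not data from the study, and the function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tolerance_and_vif(iv, other_iv):
    """Tolerance and VIF for one IV when the model has exactly two IVs."""
    tolerance = 1 - pearson_r(iv, other_iv) ** 2
    return tolerance, 1 / tolerance


# Hypothetical values for eight mothers (not data from the study).
race = [0, 0, 0, 0, 1, 1, 1, 1]
education = [12, 14, 16, 13, 11, 12, 10, 12]

tolerance, vif = tolerance_and_vif(race, education)
# Thresholds from the slide: tolerance < .10 or VIF > 10.
problematic = tolerance < 0.10 or vif > 10

print(f"Tolerance = {tolerance:.3f}, VIF = {vif:.3f}, problematic = {problematic}")
```

With more than two IVs, R² must come from a regression of each IV on all the others, which is what the SPSS linear regression procedure reports.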

  44. Additional Topics • Polytomous IVs • Curvilinear relationships • Interactions

  45. Additional Regression Models for Polytomous DVs • Multinomial probit regression • Substantive results essentially indistinguishable from multinomial logistic regression • Choice between this and multinomial logistic regression largely one of convenience and discipline-specific convention • Many researchers prefer multinomial logistic regression because it provides odds ratios whereas probit regression does not, and because logistic regression comes with a wider variety of fit statistics

  46. Additional Regression Models for Polytomous DVs (cont’d) • Discriminant analysis • Limited to continuous IVs
