Multinomial Logistic Regression “Inanimate objects can be classified scientifically into three major categories; those that don't work, those that break down and those that get lost” (Russell Baker)
Multinomial Logistic Regression • Also known as “polytomous” or “nominal logistic” or “logit regression” or the “discrete choice model” • Generalization of binary logistic regression to a polytomous DV • When applied to a dichotomous DV identical to binary logistic regression
Polytomous Variables • Three or more unordered categories • Categories mutually exclusive and exhaustive • Sometimes called “multicategorical” or sometimes “multinomial” variables
Polytomous DVs • Reason for leaving welfare: • marriage, stable employment, move to another state, incarceration, or death • Status of foster home application: • licensed to foster, discontinued application process prior to licensure, or rejected for licensure • Changes in living arrangements of the elderly: • newly co-residing with their children, no longer co-residing, or residing in institutions
Single (Dichotomous) IV Example • DV = interview tracking effort • easy-to-interview and track mothers (Easy); • difficult-to-track mothers who required more telephone calls (MoreCalls); • difficult-to-track mothers who required more unscheduled home visits (MoreVisits) • IV = race, 0 = European-American, 1 = African-American • N = 246 mothers • What is the relationship between race and interview tracking effort?
Crosstabulation • Table 3.1 • Relationship between race and tracking effort is statistically significant [χ²(2, N = 246) = 8.69, p = .013]
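As a rough check on the reported test, the Pearson chi-square can be recomputed by hand. The sketch below assumes the cell counts of Table 3.1 are the ones that appear in the probability calculations in this deck (96, 30, 17 for European-American mothers and 53, 24, 26 for African-American mothers); the table itself is not reproduced here, so treat the counts as an assumption.

```python
import math

# Assumed Table 3.1 counts. Rows: European-American, African-American.
# Columns: Easy, More Calls, More Visits.
observed = [
    [96, 30, 17],
    [53, 24, 26],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 2 the chi-square upper-tail probability has the closed form exp(-x / 2).
# This shortcut is valid only for df = 2.
p_value = math.exp(-chi_square / 2)

print(f"chi-square({df}, N = {n}) = {chi_square:.2f}, p = {p_value:.3f}")
```

Under these assumed counts the result should agree with the reported statistic up to rounding.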
Reference Category • In binary logistic regression, the category of the DV coded 0 implicitly serves as the reference category • Also known as the “baseline,” “base,” or “comparison” category • In multinomial logistic regression it is necessary to explicitly select the reference category • “Easy” selected in this example
Probabilities • Table 3.1 • More Calls (vs. Easy) • European-American: .24 = 30 / (30 + 96) • African-American: .31 = 24 / (24 + 53) • More Visits (vs. Easy) • European-American: .15 = 17 / (17 + 96) • African-American: .33 = 26 / (26 + 53)
Odds & Odds Ratio • More Calls (vs. Easy) • European-American: .3125 (.2098 / .6713) • African-American: .4528 (.2330 / .5146) • Odds Ratio = 1.45 (.4528 / .3125) • 45% increase in the odds • More Visits (vs. Easy) • European-American: .1771 (.1189 / .6713) • African-American: .4905 (.2524 / .5146) • Odds Ratio = 2.77 (.4905 / .1771) • 177% increase in the odds
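The odds and odds ratios above can be reproduced directly from cell counts. This is a minimal sketch; the counts are assumed to be those of Table 3.1 as they appear in the probability calculations, and the dictionary keys and function names are illustrative only.

```python
# Assumed Table 3.1 counts, keyed by race and tracking category.
counts = {
    "European-American": {"Easy": 96, "MoreCalls": 30, "MoreVisits": 17},
    "African-American": {"Easy": 53, "MoreCalls": 24, "MoreVisits": 26},
}


def odds_vs_reference(group, category, reference="Easy"):
    """Odds of `category` relative to the reference category for one group."""
    return counts[group][category] / counts[group][reference]


def odds_ratio(category):
    """Odds for African-Americans divided by odds for European-Americans."""
    return (odds_vs_reference("African-American", category)
            / odds_vs_reference("European-American", category))


for category in ("MoreCalls", "MoreVisits"):
    ratio = odds_ratio(category)
    # Percent change in the odds is 100 * (OR - 1).
    print(f"{category}: OR = {ratio:.2f}, change in odds = {100 * (ratio - 1):.0f}%")
```

Dividing counts gives the same odds as dividing proportions, because the group total cancels out of the ratio.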
Question & Answer • What is the relationship between race and interview tracking effort? • The odds of requiring more calls, compared to being easy-to-track, are higher for African-Americans by a factor of 1.45 (45%). The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).
Multinomial Logistic Regression • Set of binary logistic regression models estimated simultaneously • Number of non-redundant binary logistic regression equations equals the number of categories of the DV minus one
Statistical Significance • Table 3.2 • H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0 • Reject • Table 3.3 • H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0 • Reject • Table 3.4 • H0: β(Race, More Calls vs. Easy) = 0 • Don’t reject • H0: β(Race, More Visits vs. Easy) = 0 • Reject
Odds Ratios • OR(More Calls vs. Easy) = 1.45 • The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans. • OR(More Visits vs. Easy) = 2.77 • The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).
Estimated Logits (L) • Table 3.4 • L(More Calls vs. Easy) = a + (B_Race)(X_Race) • L(More Calls vs. Easy) = -1.163 + (.371)(X_Race) • L(More Visits vs. Easy) = a + (B_Race)(X_Race) • L(More Visits vs. Easy) = -1.731 + (1.019)(X_Race)
Logits to Odds • African-Americans (X = 1) • L(More Calls vs. Easy) = -.792 = -1.163 + (.371)(1) • Odds = e^(-.792) = .45 • L(More Visits vs. Easy) = -.712 = -1.731 + (1.019)(1) • Odds = e^(-.712) = .49
Logits to Probabilities • African-Americans, L(More Calls vs. Easy) = -.792 • African-Americans, L(More Visits vs. Easy) = -.712
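As a worked sketch of this conversion, the standard multinomial logit formula can be applied to the two logits above, with Easy as the reference category. The 1 in each denominator is the odds of the reference category relative to itself.

```latex
\begin{aligned}
P(\text{More Calls})  &= \frac{e^{-.792}}{1 + e^{-.792} + e^{-.712}} = \frac{.4529}{1.9436} \approx .233 \\
P(\text{More Visits}) &= \frac{e^{-.712}}{1 + e^{-.792} + e^{-.712}} = \frac{.4907}{1.9436} \approx .252 \\
P(\text{Easy})        &= \frac{1}{1 + e^{-.792} + e^{-.712}} = \frac{1}{1.9436} \approx .515
\end{aligned}
```

The three probabilities sum to 1 and match the proportions for African-American mothers used in the odds calculations (.2330, .2524, .5146), which is expected for a model with a single dichotomous IV.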
Question & Answer • What is the relationship between race and interview tracking effort? • The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans. • The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.77 (177%).
Single (Quantitative) IV Example • DV = interview tracking effort • easy-to-interview and track mothers (Easy); • difficult-to-track mothers who required more telephone calls (MoreCalls); • difficult-to-track mothers who required more unscheduled home visits (MoreVisits) • IV = years of education • N = 246 mothers • What is the relationship between education and interview tracking effort?
Statistical Significance • Table 3.6 • H0: β(Education, More Calls vs. Easy) = β(Education, More Visits vs. Easy) = 0 • Reject • Table 3.7 • H0: β(Education, More Calls vs. Easy) = 0 • Don’t reject • H0: β(Education, More Visits vs. Easy) = 0 • Reject
Odds Ratios • OR(More Calls vs. Easy) = .88 • The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education. • OR(More Visits vs. Easy) = .76 • For every additional year of education the odds of needing more visits, compared to being easy-to-track, are multiplied by .76 (i.e., a 24.1% decrease).
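The link between a slope, its odds ratio, and the percent change in the odds can be sketched as follows. The slopes are the education coefficients quoted in this deck's estimated-logit equations (-.130 and -.276); the function name and the four-year example are illustrative assumptions.

```python
import math


def percent_change_in_odds(b, units=1):
    """Percent change in the odds for a `units` increase in the IV, given slope b."""
    return 100 * (math.exp(b * units) - 1)


# Education slopes as quoted from Table 3.7.
b_more_calls = -0.130
b_more_visits = -0.276

print(f"OR, More Calls vs. Easy:  {math.exp(b_more_calls):.2f}")
print(f"OR, More Visits vs. Easy: {math.exp(b_more_visits):.2f}")
print(f"One more year:   {percent_change_in_odds(b_more_visits):.1f}%")
# Effects multiply across units: four more years is OR ** 4, not 4 * (OR - 1).
print(f"Four more years: {percent_change_in_odds(b_more_visits, units=4):.1f}%")
```

Because the odds ratio is multiplicative, the percent change for several years of education is not the one-year percent change times the number of years.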
Figures • Education.xls
Estimated Logits (L) • Table 3.7 • X = 12 (high school education) • L(More Calls vs. Easy) = -.977 = .583 + (-.130)(12) • L(More Visits vs. Easy) = -1.235 = 2.077 + (-.276)(12)
Logits to Odds • X = 12 (high school education) • Odds(More Calls vs. Easy) = e^(-.977) = .38 • Odds(More Visits vs. Easy) = e^(-1.235) = .29
Logits to Probabilities X = 12 (high school education)
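A minimal sketch of the logit-to-probability conversion for the education model, using the intercepts and slopes quoted from Table 3.7. The education levels other than 12 years are illustrative choices, not values taken from the source.

```python
import math

# (intercept, slope) for each non-reference category, as quoted from Table 3.7.
COEFFICIENTS = {
    "MoreCalls": (0.583, -0.130),
    "MoreVisits": (2.077, -0.276),
}


def predicted_probabilities(years_of_education):
    """Predicted probability of each tracking category at a given education level."""
    odds = {
        category: math.exp(a + b * years_of_education)
        for category, (a, b) in COEFFICIENTS.items()
    }
    denominator = 1.0 + sum(odds.values())  # the 1 is the reference category (Easy)
    probabilities = {"Easy": 1.0 / denominator}
    probabilities.update({c: o / denominator for c, o in odds.items()})
    return probabilities


for years in (8, 12, 16):  # 8 and 16 are hypothetical comparison points
    p = predicted_probabilities(years)
    print(years, {category: round(value, 2) for category, value in p.items()})
```

At 12 years of education the odds are about .38 and .29, so the denominator is about 1.67 and the probability of being easy-to-track is about .60.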
Question & Answer • What is the relationship between education and interview tracking effort? • The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education. • For every additional year of education the odds of needing more visits, compared to being easy-to-track, are multiplied by .76 (i.e., a 24.1% decrease).
Multiple IV Example • DV = interview tracking effort • easy-to-interview and track mothers (Easy); • difficult-to-track mothers who required more telephone calls (MoreCalls); • difficult-to-track mothers who required more unscheduled home visits (MoreVisits) • IV = race, 0 = European-American, 1 = African-American • IV = years of education • N = 246 mothers
Multiple IV Example (cont’d) • What is the relationship between race and interview tracking effort, when controlling for education?
Statistical Significance • Table 3.8 • H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = β(Ed, More Calls vs. Easy) = β(Ed, More Visits vs. Easy) = 0 • Reject • Table 3.9 • H0: β(Race, More Calls vs. Easy) = β(Race, More Visits vs. Easy) = 0 • Reject • H0: β(Ed, More Calls vs. Easy) = β(Ed, More Visits vs. Easy) = 0 • Reject
Statistical Significance (cont’d) • Table 3.10 • H0: β(Race, More Calls vs. Easy) = 0 • Don’t reject • H0: β(Race, More Visits vs. Easy) = 0 • Reject • H0: β(Ed, More Calls vs. Easy) = 0 • Don’t reject • H0: β(Ed, More Visits vs. Easy) = 0 • Reject
Odds Ratios: Race • OR(More Calls vs. Easy) = 1.36 • The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans. • OR(More Visits vs. Easy) = 2.48 • The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.48 (148%).
Odds Ratios: Education • OR(More Calls vs. Easy) = .89 • The odds of requiring more calls, compared to being easy-to-track, are not significantly associated with education, when controlling for race. • OR(More Visits vs. Easy) = .77 • For every additional year of education the odds of needing more visits, compared to being easy-to-track, are multiplied by .77 (i.e., a 23% decrease), when controlling for race.
Figures • Race & Education.xls
Effect of Education on Tracking Effort for African-Americans (Odds)
Effect of Education on Tracking Effort for African-Americans (Probabilities)
Question & Answer • What is the relationship between race and interview tracking effort, when controlling for education? • The odds of requiring more calls, compared to being easy-to-track, are not significantly different for European- and African-Americans, when controlling for education. The odds of requiring more visits, compared to being easy-to-track, are higher for African-Americans by a factor of 2.48 (148%), when controlling for education.
Assumptions Necessary for Testing Hypotheses • Assumptions discussed in GZLM lecture • Independence of irrelevant alternatives (IIA) • Odds of one outcome (e.g., More Calls) relative to another (e.g., Easy) are not influenced by other alternatives (e.g., More Visits)
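The IIA property is built into the multinomial logit formula, which the sketch below illustrates. It reuses the logits for African-American mothers from the race example; the shifted More Visits logit of 1.5 is a hypothetical value chosen only to make that alternative much more likely. This shows what the model assumes, not a test of whether the assumption holds in the data.

```python
import math


def probabilities(logit_calls, logit_visits):
    """Multinomial logit probabilities with Easy as the reference category."""
    odds_calls = math.exp(logit_calls)
    odds_visits = math.exp(logit_visits)
    denominator = 1.0 + odds_calls + odds_visits
    return {
        "Easy": 1.0 / denominator,
        "MoreCalls": odds_calls / denominator,
        "MoreVisits": odds_visits / denominator,
    }


baseline = probabilities(-0.792, -0.712)
shifted = probabilities(-0.792, 1.5)  # hypothetical change to the third alternative

for label, p in (("baseline", baseline), ("shifted", shifted)):
    # The probabilities change, but the More Calls vs. Easy odds do not.
    print(label, round(p["MoreCalls"], 3), round(p["MoreCalls"] / p["Easy"], 4))
```

Both rows give the same More Calls vs. Easy odds even though the individual probabilities differ, because the shared denominator cancels in the ratio.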
Model Evaluation • Create a set of binary DVs from the polytomous DV
recode TrackCat (1=0) (2=1) (3=sysmis) into MoreCalls.
recode TrackCat (1=0) (2=sysmis) (3=1) into MoreVisits.
• Run separate binary logistic regressions • Use binary logistic regression methods to detect outliers and influential observations
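For readers working outside SPSS, the same recoding can be sketched in Python. The coding 1 = Easy, 2 = MoreCalls, 3 = MoreVisits is inferred from the recode statements, the sample values are hypothetical, and `None` stands in for SPSS system-missing.

```python
def recode(track_cat, target):
    """Return 0 for Easy (code 1), 1 for the target category, None otherwise."""
    if track_cat == 1:
        return 0
    if track_cat == target:
        return 1
    return None  # excluded from this binary comparison (sysmis in SPSS)


track_cat = [1, 2, 3, 1, 3, 2]  # hypothetical TrackCat values for illustration

more_calls = [recode(value, target=2) for value in track_cat]
more_visits = [recode(value, target=3) for value in track_cat]

print(more_calls)   # [0, 1, None, 0, None, 1]
print(more_visits)  # [0, None, 1, 0, 1, None]
```

Each binary DV keeps only the cases in the reference category and one comparison category, so each separate binary logistic regression is fitted on a subset of the sample.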
Model Evaluation (cont’d) • Index plots • Leverage values • Standardized or unstandardized deviance residuals • Cook’s D • Graph and compare observed and estimated counts
Analogs of R² • None in standard use and each may give different results • Typically much smaller than R² values in linear regression • Difficult to interpret
Multicollinearity • SPSS multinomial logistic regression doesn’t compute multicollinearity statistics • Use SPSS linear regression • Problematic levels • Tolerance < .10 or • VIF > 10
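Tolerance and VIF are two forms of the same quantity: tolerance is 1 - R² from regressing one IV on the others, and VIF is 1 / tolerance, so the two cutoffs above coincide. A minimal sketch for the two-IV case, where that R² is simply the squared correlation between the IVs; the data values are hypothetical and are not taken from the study.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def tolerance_and_vif(x, y):
    """Tolerance and VIF when the model has exactly two IVs."""
    tolerance = 1 - pearson_r(x, y) ** 2
    return tolerance, 1 / tolerance


# Hypothetical values for two IVs (race coded 0/1, years of education).
race = [0, 0, 0, 0, 1, 1, 1, 1]
education = [12, 14, 16, 12, 10, 12, 11, 13]

tolerance, vif = tolerance_and_vif(race, education)
problematic = tolerance < 0.10 or vif > 10
print(f"tolerance = {tolerance:.3f}, VIF = {vif:.2f}, problematic = {problematic}")
```

With more than two IVs the R² must come from a regression of each IV on all the others, which is what the linear regression procedure reports.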
Additional Topics • Polytomous IVs • Curvilinear relationships • Interactions
Additional Regression Models for Polytomous DVs • Multinomial probit regression • Substantive results essentially indistinguishable from multinomial logistic regression • Choice between this and multinomial logistic regression largely one of convenience and discipline-specific convention • Many researchers prefer multinomial logistic regression because it provides odds ratios whereas probit regression does not, and because it comes with a wider variety of fit statistics
Additional Regression Models for Polytomous DVs (cont’d) • Discriminant analysis • Limited to continuous IVs