Ordinal Logistic Regression • “Good, better, best; never let it rest till your good is better and your better is best” (Anonymous)
Ordinal Logistic Regression • Also known as the “ordinal logit,” “ordered polytomous logit,” “constrained cumulative logit,” “proportional odds,” “parallel regression,” or “grouped continuous model” • Generalization of binary logistic regression to an ordinal DV • When applied to a dichotomous DV, it is identical to binary logistic regression
Ordinal Variables • Three or more ordered categories • Sometimes called “ordered categorical” or “ordered polytomous” variables
Ordinal DVs • Job satisfaction: • very dissatisfied, somewhat dissatisfied, neutral, somewhat satisfied, or very satisfied • Severity of child abuse injury: • none, mild, moderate, or severe • Willingness to foster children with emotional or behavioral problems: • least acceptable, willing to discuss, or most acceptable
Single (Dichotomous) IV Example • DV = satisfaction with foster care agencies • (1) dissatisfied; (2) neither satisfied nor dissatisfied; (3) satisfied • IV = agencies provided sufficient information about the role of foster care workers • 0 (no) or 1 (yes) • N = 300 foster mothers
Single (Dichotomous) IV Example (cont’d) • Are foster mothers who report that they were provided sufficient information about the role of foster care workers more satisfied with their foster care agencies?
Crosstabulation • Table 4.1 • Relationship between information and satisfaction is statistically significant [χ²(2, N = 300) = 23.52, p < .001]
Cumulative Probability • Ordinal logistic regression focuses on cumulative probabilities of the DV and odds and ORs based on cumulative probabilities. • By cumulative probability we mean the probability that the DV is less than or equal to a particular value (e.g., 1, 2, or 3 in our example).
Cumulative Probabilities • Dissatisfied • Insufficient Info: .2857 • Sufficient Info: .1151 • Dissatisfied or neutral • Insufficient Info: .5590 (.2857 + .2733) • Sufficient Info: .2878 (.1151 + .1727) • Dissatisfied, neutral, or satisfied • Insufficient Info: 1.00 (.2857 + .2733 + .4410) • Sufficient Info: 1.00 (.1151 + .1727 + .7121)
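The running sums on this slide can be reproduced with a short Python sketch. The proportions are the ones quoted above; the function and variable names are illustrative only and do not come from the original materials.

```python
from itertools import accumulate

# Category proportions from the slide, in order:
# dissatisfied, neutral, satisfied.
insufficient = [0.2857, 0.2733, 0.4410]
sufficient = [0.1151, 0.1727, 0.7121]

def cumulative(probs):
    """Return P(DV <= k) for each category k, rounded to 4 decimals."""
    return [round(p, 4) for p in accumulate(probs)]

cum_insufficient = cumulative(insufficient)
cum_sufficient = cumulative(sufficient)

print("Insufficient info:", cum_insufficient)
print("Sufficient info:  ", cum_sufficient)
```

The last cumulative value for the sufficient-information group comes out slightly under 1 because the slide's proportions are rounded to four decimals.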
Cumulative Odds • Probability that the DV is less than or equal to a particular value is compared to (divided by) the probability that it is greater than that value • Reverse of what you do in binary and multinomial logistic regression • Probability that the DV is 1 (dissatisfied) vs. the probability that it is either 2 or 3 (neutral or satisfied); probability that the DV is 1 or 2 (dissatisfied or neutral) vs. the probability that it is 3 (satisfied)
Cumulative Odds & Odds Ratios • Odds of being dissatisfied (vs. neutral or satisfied) • Insufficient Info: .4000 (.2857 / [1 - .2857]) • Sufficient Info: .1301 (.1151 / [1 - .1151]) • OR = .33 (.1301 / .4000) (-67%) • Odds of being dissatisfied or neutral (vs. satisfied) • Insufficient Info: 1.2676 (.5590 / [1 - .5590]) • Sufficient Info: .4041 (.2878 / [1 - .2878]) • OR = .32 (.4041 / 1.2676) (-68%)
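As a rough check on the arithmetic above, the following Python sketch converts the cumulative probabilities to cumulative odds and then to odds ratios. The helper name `odds` and the list layout are assumptions made for illustration.

```python
def odds(p):
    """Convert a probability to odds: p / (1 - p)."""
    return p / (1 - p)

# Cumulative probabilities P(DV <= 1) and P(DV <= 2) from the slides.
cum_insufficient = [0.2857, 0.5590]
cum_sufficient = [0.1151, 0.2878]

odds_insufficient = [odds(p) for p in cum_insufficient]
odds_sufficient = [odds(p) for p in cum_sufficient]

# OR = cumulative odds with sufficient info / cumulative odds without it.
odds_ratios = [s / i for s, i in zip(odds_sufficient, odds_insufficient)]

labels = ["Dissatisfied vs. neutral/satisfied",
          "Dissatisfied/neutral vs. satisfied"]
for label, o_i, o_s, ratio in zip(labels, odds_insufficient,
                                  odds_sufficient, odds_ratios):
    print(f"{label}: odds {o_i:.4f} (insufficient), "
          f"{o_s:.4f} (sufficient), OR = {ratio:.2f}")
```

Both odds ratios are close to each other (about .33 and .32), which is the pattern the single common slope in ordinal logistic regression is meant to summarize.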
Question & Answer • Are foster mothers who report that they were provided sufficient information about the role of foster care workers more satisfied with their foster care agencies? • The odds of being dissatisfied (vs. being neutral or satisfied) are 67% smaller (OR = .33) for mothers who received sufficient information. The odds of being dissatisfied or neutral (vs. being satisfied) are 68% smaller (OR = .32) for mothers who received sufficient information.
Ordinal Logistic Regression • Set of binary logistic regression models estimated simultaneously (like multinomial logistic regression) • Number of non-redundant binary logistic regression equations equals the number of categories of the DV minus one • Focus on cumulative probabilities and odds, and ORs are computed from cumulative odds (unlike multinomial logistic regression)
Threshold • Suppose our three-point variable is a rough measure of an underlying continuous satisfaction variable. A population threshold (symbolized by τ, the Greek letter tau) is the point on this continuous variable at which a person’s level of satisfaction goes from one value to the next on the ordinal measure of satisfaction. • e.g., the first threshold (τ1) would be the point at which the level of satisfaction goes from dissatisfied to neutral (i.e., 1 to 2), and the second threshold (τ2) would be the point at which the level of satisfaction goes from neutral to satisfied (i.e., 2 to 3).
Threshold (cont’d) • The number of thresholds is always one fewer than the number of values of the DV. • Usually thresholds are of little interest except in the calculation of estimated values. • Thresholds typically are used in place of the intercept to express the ordinal logistic regression model
Estimated Cumulative Logits • General form: • L(Dissatisfied vs. Neutral/Satisfied) = τ1 - BX • L(Dissatisfied/Neutral vs. Satisfied) = τ2 - BX • Estimates from Table 4.2: • L(Dissatisfied vs. Neutral/Satisfied) = -.912 - 1.139X • L(Dissatisfied/Neutral vs. Satisfied) = .235 - 1.139X
Estimated Cumulative Logits (cont’d) • Each equation has a different threshold (e.g., τ1 and τ2) • One common slope (B). • It is assumed that the effect of the IVs is the same for different values of the DV (“parallel regression” assumption) • Slope is multiplied by a value of the IV and subtracted from, not added to, the threshold.
Statistical Significance • Table 4.2 • H0: β(Info) = 0 • Reject H0
Estimated Cumulative Logits (X = 1) • L(Dissatisfied vs. Neutral/Satisfied) = -.912 - (1.139)(1) = -2.051 • L(Dissatisfied/Neutral vs. Satisfied) = .235 - (1.139)(1) = -.904
Cumulative Logits to Cumulative Odds (X = 1) • Odds(Dissatisfied vs. Neutral/Satisfied) = e^(-2.051) = .129 • Odds(Dissatisfied/Neutral vs. Satisfied) = e^(-.904) = .405
Cumulative Logits to Cumulative Probabilities (X = 1) (cont’d)
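The chain from cumulative logits to cumulative odds to cumulative probabilities can be sketched in Python as follows. The thresholds and slope are the Table 4.2 estimates quoted on the earlier slides; the conversion p = odds / (1 + odds) is the standard one, and the function names are hypothetical.

```python
import math

TAU = (-0.912, 0.235)  # thresholds tau1, tau2 quoted from Table 4.2
B = 1.139              # common slope for Info quoted from Table 4.2

def cumulative_logits(x):
    """Cumulative logits: threshold minus slope times X."""
    return [tau - B * x for tau in TAU]

def logit_to_odds(logit):
    """Exponentiate a logit to get odds."""
    return math.exp(logit)

def odds_to_prob(o):
    """Convert odds to a probability."""
    return o / (1 + o)

results = {}
for x in (0, 1):  # 0 = insufficient information, 1 = sufficient
    logits = cumulative_logits(x)
    odds = [logit_to_odds(l) for l in logits]
    probs = [odds_to_prob(o) for o in odds]
    results[x] = (logits, odds, probs)
    print(f"X = {x}: logits {[round(l, 3) for l in logits]}, "
          f"odds {[round(o, 3) for o in odds]}, "
          f"probabilities {[round(p, 3) for p in probs]}")
```

The estimated cumulative probabilities land close to the observed ones from the crosstabulation (for example, roughly .29 vs. .2857 for being dissatisfied with insufficient information).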
Effect of Information on Satisfaction (Cumulative Probabilities)
Odds Ratio • Reverse the sign of the slope and exponentiate it. • e.g., OR equals .32, calculated as e^(-1.139) • In binary logistic regression, odds are calculated as a ratio of probabilities for higher to lower values of the DV (odds of 1 vs. 0); in ordinal logistic regression it is the reverse (lower vs. higher values of the DV)
Odds Ratio (cont’d) • SPSS reports the exponentiated slope without reversing its sign first (e^(1.139) = 3.123) • Reversing the sign before exponentiating gives the OR for the cumulative odds (e^(-1.139) = .320)
Question & Answer • Are foster mothers who report that they were provided sufficient information about the role of foster care workers more satisfied with their foster care agencies? • The odds of being dissatisfied (vs. neutral or satisfied) are 68% smaller (OR = .32) for mothers who received sufficient information. Similarly, the odds of being dissatisfied or neutral (vs. satisfied) are 68% smaller (OR = .32) for mothers who received sufficient information.
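A small Python sketch may help show how the two exponentiated values relate. It uses the rounded slope of 1.139 quoted on the slides, so the last decimal can differ slightly from software output computed from the unrounded estimate; the variable names are illustrative.

```python
import math

b = 1.139  # slope for Info as quoted on the slides (rounded)

# OR for the cumulative odds (lower vs. higher categories):
# reverse the sign of the slope, then exponentiate.
or_cumulative = math.exp(-b)

# Exponentiated slope without the sign reversal.
or_unreversed = math.exp(b)

print(f"exp(-b) = {or_cumulative:.3f}")
print(f"exp(b)  = {or_unreversed:.3f}")
print(f"product = {or_cumulative * or_unreversed:.3f}")
```

The two values are reciprocals, so either one carries the same information; what changes is which direction of the DV the odds describe.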
Single (Quantitative) IV Example • DV = satisfaction with foster care agencies • (1) dissatisfied; (2) neither satisfied nor dissatisfied; (3) satisfied • IV = available time to foster (Available Time Scale); higher scores indicate more time to foster • Converted to z-scores • N = 300 foster mothers
Single (Quantitative) IV Example (cont’d) • Are foster mothers with more time to foster more satisfied with their foster care agencies?
Statistical Significance • Table 4.3 • H0: β(zTime) = 0 • Reject H0
Odds Ratio • OR equals .76, calculated as e^(-.281) • For a one standard-deviation increase in available time, the odds of being dissatisfied (vs. neutral or satisfied) are multiplied by .76 (a 24% decrease). Similarly, for a one standard-deviation increase in available time, the odds of being dissatisfied or neutral (vs. satisfied) are multiplied by .76 (a 24% decrease).
Figures • zATS.xls
Estimated Cumulative Logits • General form: • L(Dissatisfied vs. Neutral/Satisfied) = τ1 - BX • L(Dissatisfied/Neutral vs. Satisfied) = τ2 - BX • Estimates from Table 4.3: • L(Dissatisfied vs. Neutral/Satisfied) = -1.365 - .281X • L(Dissatisfied/Neutral vs. Satisfied) = -.269 - .281X
Question & Answer • Are foster mothers with more time to foster more satisfied with their foster care agencies? • For a one standard-deviation increase in available time, the odds of being dissatisfied (vs. neutral or satisfied) are multiplied by .76 (a 24% decrease). Similarly, for a one standard-deviation increase in available time, the odds of being dissatisfied or neutral (vs. satisfied) are multiplied by .76 (a 24% decrease).
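To see what these equations imply across the range of available time, the sketch below evaluates the estimated cumulative probabilities at a few z-scores. The thresholds and slope are the Table 4.3 estimates quoted above; the choice of z-values and the function name are assumptions made for illustration.

```python
import math

TAU = (-1.365, -0.269)  # thresholds tau1, tau2 quoted from Table 4.3
B = 0.281               # slope for zTime quoted from Table 4.3

def cumulative_probs(z):
    """Return [P(DV <= 1), P(DV <= 2)] at a given z-score of time."""
    return [1 / (1 + math.exp(-(tau - B * z))) for tau in TAU]

# Illustrative z-scores from two SDs below to two SDs above the mean.
table = {z: cumulative_probs(z) for z in (-2, -1, 0, 1, 2)}

for z, (p1, p2) in table.items():
    print(f"z = {z:+d}: P(dissatisfied) = {p1:.3f}, "
          f"P(dissatisfied or neutral) = {p2:.3f}")
```

Both cumulative probabilities fall as available time rises, which is the same direction as the OR below 1.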
Multiple IV Example • DV = satisfaction with foster care agencies • (1) dissatisfied; (2) neither satisfied nor dissatisfied; (3) satisfied • IV = available time to foster (Available Time Scale); higher scores indicate more time to foster • Converted to z-scores • IV = agencies provided sufficient information about the role of foster care workers • 0 (no) or 1 (yes) • N = 300 foster mothers
Multiple IV Example (cont’d) • Are foster mothers who receive sufficient information about the role of foster care workers more satisfied with their foster care agencies, controlling for available time to foster?
Statistical Significance • Table 4.4 • H0: β(Info) = β(zTime) = 0 • Reject H0 • Table 4.5 • H0: β(Info) = 0 • Reject H0 • H0: β(zTime) = 0 • Reject H0 • Table 4.6 • H0: β(Info) = 0 • Reject H0 • H0: β(zTime) = 0 • Reject H0
Odds Ratio: Information • OR equals .33, calculated as e^(-1.116) • The odds of being dissatisfied (vs. neutral or satisfied) are 67% smaller (OR = .33) for mothers who received sufficient information, when controlling for available time to foster. Similarly, the odds of being dissatisfied or neutral (vs. satisfied) are 67% smaller (OR = .33) for mothers who received sufficient information, when controlling for time.
Odds Ratio: Time • OR equals .77, calculated as e^(-.260) • For a one standard-deviation increase in available time, the odds of being dissatisfied (vs. neutral or satisfied) are multiplied by .77 (a 23% decrease), when controlling for information. Similarly, for a one standard-deviation increase in available time, the odds of being dissatisfied or neutral (vs. satisfied) are multiplied by .77 (a 23% decrease), when controlling for information.
Estimated Cumulative Logits • Table 4.6 • L(Dissatisfied vs. Neutral/Satisfied) = -.941 - [(1.116)(X_Info) + (.260)(X_zTime)] • L(Dissatisfied/Neutral vs. Satisfied) = .222 - [(1.116)(X_Info) + (.260)(X_zTime)]
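Estimated odds and probabilities for any combination of information and available time follow directly from these two equations. The Python sketch below uses the Table 4.6 estimates quoted above; the function names and the example inputs are hypothetical, and the printed values are not claimed to reproduce any specific table entries.

```python
import math

TAU = (-0.941, 0.222)          # thresholds quoted from Table 4.6
B_INFO, B_ZTIME = 1.116, 0.260  # slopes quoted from Table 4.6

def cumulative_odds(info, ztime):
    """Cumulative odds for both splits of the DV.

    info is 0 (no) or 1 (yes); ztime is available time as a z-score.
    """
    linear = B_INFO * info + B_ZTIME * ztime
    return [math.exp(tau - linear) for tau in TAU]

def cumulative_probs(info, ztime):
    """Cumulative probabilities P(DV <= 1) and P(DV <= 2)."""
    return [o / (1 + o) for o in cumulative_odds(info, ztime)]

for info in (0, 1):
    for ztime in (-1, 0, 1):
        odds = cumulative_odds(info, ztime)
        probs = cumulative_probs(info, ztime)
        print(f"Info = {info}, zTime = {ztime:+d}: "
              f"odds {[round(o, 3) for o in odds]}, "
              f"probabilities {[round(p, 3) for p in probs]}")
```

Because the slopes are shared by both equations, the ratio of odds for Info = 1 vs. Info = 0 is the same for both splits and at every value of zTime.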
Estimated Odds as a Function of Available Time and Information • See Table 4.7
Estimated Probabilities as a Function of Available Time and Information • See Table 4.9
Question & Answer • Are foster mothers who receive sufficient information about the role of foster care workers more satisfied with their foster care agencies, controlling for available time to foster? • The odds of being dissatisfied (vs. neutral or satisfied) are 67% smaller (OR = .33) for mothers who received sufficient information, when controlling for available time to foster. Similarly, the odds of being dissatisfied or neutral (vs. satisfied) are 67% smaller (OR = .33) for mothers who received sufficient information, when controlling for time.
Assumptions Necessary for Testing Hypotheses • Assumptions discussed in GZLM lecture • Effect of the IVs is the same for all values of the DV (“parallel lines assumption”) • L(Dissatisfied vs. Neutral/Satisfied) = τ1 - (B_Info X_Info + B_zTime X_zTime) • L(Dissatisfied/Neutral vs. Satisfied) = τ2 - (B_Info X_Info + B_zTime X_zTime) • Ordinal logistic regression assumes that B_Info is the same for both equations, and B_zTime is the same for both equations • See Table 4.10
Model Evaluation • Create a set of binary DVs from the polytomous DV • recode Satisfaction (1=1) (2=0) (3=0) into SatisfactionLessThan2. • recode Satisfaction (1=1) (2=1) (3=0) into SatisfactionLessThan3. • Run separate binary logistic regressions • Use binary logistic regression methods to detect outliers and influential observations
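The same dichotomizing step can be sketched outside SPSS. The Python example below mirrors the two recodes above; the `satisfaction` values are made-up illustrative data, not the study's, and the variable names are hypothetical.

```python
# Hypothetical satisfaction scores coded 1 = dissatisfied,
# 2 = neutral, 3 = satisfied (illustrative values only).
satisfaction = [1, 2, 3, 3, 1, 2]

# 1 if dissatisfied, 0 if neutral or satisfied.
satisfaction_less_than_2 = [1 if s < 2 else 0 for s in satisfaction]

# 1 if dissatisfied or neutral, 0 if satisfied.
satisfaction_less_than_3 = [1 if s < 3 else 0 for s in satisfaction]

print(satisfaction_less_than_2)
print(satisfaction_less_than_3)
```

Each binary variable would then serve as the DV in its own binary logistic regression, with the usual outlier and influence diagnostics applied to each model.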
Model Evaluation (cont’d) • Index plots • Leverage values • Standardized or unstandardized deviance residuals • Cook’s D • Graph and compare observed and estimated counts
Analogs of R2 • None in standard use and each may give different results • Typically much smaller than R2 values in linear regression • Difficult to interpret