Binary Logistic Regression “To be or not to be, that is the question..”(William Shakespeare, “Hamlet”). Binary Logistic Regression. Also known as “logistic” or sometimes “logit” regression Foundation from which more complex models derived
Binary Logistic Regression • “To be or not to be, that is the question..” (William Shakespeare, “Hamlet”)
Binary Logistic Regression • Also known as “logistic” or sometimes “logit” regression • Foundation from which more complex models derived • e.g., multinomial regression and ordinal logistic regression
Dichotomous Variables • Two categories indicating whether an event has occurred or some characteristic is present • Sometimes called “binary” or “binomial” variables
Dichotomous DVs • Placed in foster care or not • Diagnosed with a disease or not • Abused or not • Pregnant or not • Service provided or not
Single (Dichotomous) IV Example • DV = continue fostering, 0 = no, 1 = yes • Customary to code category of interest 1 and the other category 0 • IV = married, 0 = not married, 1 = married • N = 131 foster families • Are two-parent families more likely to continue fostering than one-parent families?
Crosstabulation • Table 2.1 • Relationship between marital status and continuation is statistically significant [χ²(1, N = 131) = 5.65, p = .017] • A higher percentage of two-parent families (62.20%) than single-parent families (40.82%) planned to continue fostering
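The chi-square test on this slide can be checked by hand with a minimal sketch. Table 2.1 is not reproduced here, so the cell counts below are an assumption: they are inferred from the reported percentages and N (20 of 49 one-parent families and 51 of 82 two-parent families continuing). The function names are hypothetical, and the statistic is the Pearson chi-square without a continuity correction.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# ASSUMED counts, inferred from 40.82%, 62.20%, and N = 131 (not taken from Table 2.1).
one_parent_continue, one_parent_stop = 20, 29
two_parent_continue, two_parent_stop = 51, 31

chi2 = chi_square_2x2(one_parent_continue, one_parent_stop,
                      two_parent_continue, two_parent_stop)
# Both values should land close to the reported 5.65 and .017.
print(f"chi-square = {chi2:.2f}, p = {p_value_df1(chi2):.3f}")
```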
Strength & Direction of Relationships • Different ways to quantify the relationship between IV(s) and DV • Probabilities • Odds • Odds Ratio (OR) • Also abbreviated as e^B, Exp(B) (on SPSS output), or exp(B) • % change
Probabilities • Percentages in Table 2.1 as probabilities (e.g., 62.20% as .6220) • p • Probability that event will occur (continue) • e.g., probability that one-parent families plan to continue is .4082 • 1 – p • Probability that event will not occur (not continue) • e.g., probability that one-parent families do not plan to continue is .5918 (1 - .4082)
Odds • Ratio of probability that event will occur to probability that it will not • e.g., odds of continuation for one-parent families are .69 (.4082 / .5918) • Can range from 0 to positive infinity
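The conversion from a probability to odds can be sketched in a few lines. The function name is hypothetical; the input value is the one-parent probability from this slide.

```python
def probability_to_odds(p: float) -> float:
    """Return the odds p / (1 - p) for a probability with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


# One-parent families: probability of continuing is .4082.
print(round(probability_to_odds(0.4082), 4))  # 0.6898
```

A probability of .50 gives odds of exactly 1, and the odds grow without bound as p approaches 1.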
Probabilities and Odds • Table 2.2 • Odds = 1 • Both outcomes equally likely • Odds > 1 • Probability that event will occur greater than probability that it will not • Odds < 1 • Probability that event will occur less than probability that it will not
Odds Ratio (OR) • Odds of the event for one value of the IV (two-parent families) divided by the odds for a different value of the IV, usually a value one unit lower (one-parent families) • e.g., odds of continuing for two-parent families more than double the odds for one-parent families • OR = 1.6455 / .6898 = 2.39
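As a rough sketch, the OR on this slide can be reproduced directly from the two group probabilities. The helper names are hypothetical.

```python
def odds(p: float) -> float:
    """Odds of an event with probability p (0 <= p < 1)."""
    return p / (1.0 - p)


def odds_ratio(p_group: float, p_reference: float) -> float:
    """Odds for one group divided by the odds for the reference group."""
    return odds(p_group) / odds(p_reference)


# Two-parent families (.6220) compared with one-parent families (.4082).
print(round(odds_ratio(0.6220, 0.4082), 2))  # 2.39
```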
OR (cont’d) • Plays a central role in quantifying the strength and direction of relationships between IVs and DVs in binary, multinomial, and ordinal logistic regression • OR < 1 indicates a negative relationship • OR > 1 indicates a positive relationship • OR = 1 indicates no linear relationship
ORs > 1 • e.g., OR of 2.39 • A one-unit increase in the independent variable increases the odds of continuing by a factor of 2.39 • The odds of continuing are 2.39 times higher for two-parent compared to one-parent families
ORs < 1 • e.g., OR = .50 • A one-unit increase in the independent variable decreases the odds of continuing by a factor of .50 • The odds that two-parent families will continue are .50 (or one-half) of the odds that one-parent families will continue
ORs < 1 (cont’d) • Compute reciprocal (i.e., 1 / .50 = 2.00) • Express relationship as opposite event of interest (e.g., discontinuing) • A one-unit increase in the independent variable increases the odds of discontinuing by a factor of 2.00 • The odds that two-parent families will discontinue are 2.00 times (or twice) the odds of one-parent families
OR to Percentage Change • % change = 100(OR – 1) • Alternative way to express OR • e.g., A one-unit increase in the independent variable increases the odds of continuing by 139.00% • 100(2.39 – 1) = 139.00 • e.g., A one-unit increase in the independent variable decreases the odds of continuing by 50.00% • 100(.50 – 1) = -50.00
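The percentage-change formula is simple enough to check in code. A minimal sketch, with a hypothetical function name:

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percentage change in the odds for a one-unit increase in the IV: 100(OR - 1)."""
    return 100.0 * (odds_ratio - 1.0)


print(round(percent_change_in_odds(2.39), 2))  # 139.0
print(round(percent_change_in_odds(0.50), 2))  # -50.0
```

A negative result means the odds decrease; an OR of 1 gives 0% change.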
Comparing OR > 1 and OR < 1 • Compute reciprocal of one of the ORs • e.g., OR of 2.00 and an OR of .50 • Reciprocal of .50 is 2.00 (1 / .50 = 2.00) • ORs are equal in size (but not in direction of the relationship)
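One way to put ORs above and below 1 on the same footing is to take the reciprocal of any OR below 1 before comparing sizes. The sketch below assumes that convention; the function name is hypothetical.

```python
def or_strength(odds_ratio: float) -> float:
    """Size of an OR regardless of direction: ORs below 1 are replaced by their reciprocal."""
    if odds_ratio <= 0.0:
        raise ValueError("an odds ratio must be positive")
    return odds_ratio if odds_ratio >= 1.0 else 1.0 / odds_ratio


# An OR of 2.00 and an OR of .50 are equal in size but opposite in direction.
print(or_strength(2.00), or_strength(0.50))  # 2.0 2.0
```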
Qualitative Descriptors for OR • Table 2.3 • Use cautiously with IVs that aren’t dichotomous
Question & Answer • Are two-parent families more likely to continue fostering than one-parent families? • Yes. The odds of continuing are 2.39 times (139%) higher for two-parent compared to one-parent families. The probability of continuing is .41 for one-parent families and .62 for two-parent families.
Binary Logistic Regression Example • DV = continue fostering, 0 = no, 1 = yes • Customary to code category of interest 1 and the other category 0 • IV = married, 0 = not married, 1 = married • N = 131 foster families • Are two-parent families more likely to continue fostering than one-parent families?
Statistical Significance • Table 2.4 • Relationship between marital status and continuation is statistically significant (Wald χ² = 5.544, p = .019)
Direction of Relationship • B = slope • Positive slope, positive relationship • OR > 1 • Negative slope, negative relationship • OR < 1 • 0 slope, no linear relationship • OR = 1
Direction/Strength of Relationship • Positive relationship between marital status and continuation • Two-parent families more likely to continue • B = .869 • Exp(B) = OR = 2.385 • % change = 100(2.385 - 1) = 139% • The odds of continuing are 2.39 times (139%) higher for two-parent compared to one-parent families
Binary Logistic Regression Model • ln(π / (1 - π)) = α + β₁X₁ + β₂X₂ + … + βₖXₖ, or • ln(π / (1 - π)) = η • π is the probability of the event • η (eta) is the abbreviation for the linear predictor (right hand side of this equation) • k = number of independent variables
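Solving the model for π shows how the linear predictor maps back to a probability. This is a short worked derivation using the same symbols as the slide (π, α, β, η); it is standard algebra rather than anything specific to this example.

```latex
\begin{aligned}
\ln\!\left(\frac{\pi}{1-\pi}\right) &= \eta,
  \qquad \eta = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k \\
\frac{\pi}{1-\pi} &= e^{\eta} \\
\pi &= \frac{e^{\eta}}{1 + e^{\eta}} = \frac{1}{1 + e^{-\eta}}
\end{aligned}
```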
Logit Link • ln(π / (1 - π)) • Log of the odds that the DV equals 1 (event occurs) • Connects (i.e., links) DV to linear combination of IVs
Estimated Logits (L) • ln(p / (1 - p)) = a + B₁X₁ + B₂X₂ + … + BₖXₖ • ln(p / (1 - p)) • Log of the odds that the DV equals 1 (event occurs) • Estimated logit, L • Does not have intuitive or substantive meaning • Useful for examining curvilinear relationships and interaction effects • Primarily useful for estimating probabilities, odds, and ORs
Estimated Logits (L) • L(Continue) = a + B_Married X_Married • L(Continue) = -.372 + (.869)(X_Married) • a = intercept • B = slope
Logit to Odds • If L = 0: • Odds = e^L = e^0 = 1.00 • If L = .50: • Odds = e^L = e^(.50) = 1.65 • If L = 1.00: • Odds = e^L = e^(1.00) = 2.72
Logits to Odds (cont’d) • Table 2.4 • One-parent families • L(Continue) = -.372 = -.372 + (.869)(0) • Odds of continuing = e^(-.372) = .69 • Two-parent families • L(Continue) = .497 = -.372 + (.869)(1) • Odds of continuing = e^(.497) = 1.65
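The two steps on this slide (estimated logit, then odds) can be sketched as follows. The constant and function names are hypothetical. The intercept and slope are the rounded values shown on the slides, so results can differ from the reported figures in the last digit.

```python
import math

# Rounded estimates taken from the slides (intercept a and slope B for "married").
A = -0.372
B_MARRIED = 0.869


def estimated_logit(married: int) -> float:
    """Estimated logit L(Continue) for married = 0 (one-parent) or 1 (two-parent)."""
    return A + B_MARRIED * married


def logit_to_odds(logit: float) -> float:
    """Odds are the exponential of the logit: odds = e^L."""
    return math.exp(logit)


for married, label in ((0, "one-parent"), (1, "two-parent")):
    logit = estimated_logit(married)
    print(f"{label}: L = {logit:.3f}, odds = {logit_to_odds(logit):.3f}")
```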
Odds to OR • OR = 1.65 / .69 = 2.39, or • e^(.869) = 2.39, labeled Exp(B) • Table 2.4
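The two routes to the OR agree because the intercept cancels when the odds for X = 1 are divided by the odds for X = 0. A short worked derivation, using the slide's symbols a and B:

```latex
\mathrm{OR}
  = \frac{\text{odds}(X = 1)}{\text{odds}(X = 0)}
  = \frac{e^{\,a + B(1)}}{e^{\,a + B(0)}}
  = \frac{e^{a}\, e^{B}}{e^{a}}
  = e^{B},
\qquad
e^{.869} \approx 2.39
```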
OR to Percentage Change • % change = 100(OR – 1) • e.g., A one-unit increase in the independent variable increases the odds of continuing by 139.00% • 100(2.39 – 1) = 139.00 • e.g., A one-unit increase in the independent variable decreases the odds of continuing by 50.00% • 100(.50 – 1) = -50.00
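Logits to Probabilities • p = e^L / (1 + e^L) • One-parent families, L(Continue) = -.372 • p = e^(-.372) / (1 + e^(-.372)) = .41 • Two-parent families, L(Continue) = .497 • p = e^(.497) / (1 + e^(.497)) = .62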
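Converting an estimated logit to a probability can be sketched with the inverse logit, p = 1 / (1 + e^(-L)), which is algebraically the same as e^L / (1 + e^L). The function name is hypothetical; the logits are the values from this slide.

```python
import math


def logit_to_probability(logit: float) -> float:
    """Inverse logit: p = 1 / (1 + e^(-L))."""
    return 1.0 / (1.0 + math.exp(-logit))


print(round(logit_to_probability(-0.372), 2))  # 0.41 (one-parent families)
print(round(logit_to_probability(0.497), 2))   # 0.62 (two-parent families)
```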
Question & Answer • Are two-parent families more likely to continue fostering than one-parent families? • Yes. The odds of continuing are 2.39 times (139%) higher for two-parent compared to one-parent families. The probability of continuing is .41 for one-parent families and .62 for two-parent families.
Single (Quantitative) IV Example • DV = continue fostering, 0 = no, 1 = yes • Customary to code category of interest 1 and other category 0 • IV = number of resources • N = 131 foster families • Are foster families with more resources more likely to continue fostering?
Statistical Significance • Table 2.5 • Relationship between resources and continuation is statistically significant (Wald χ² = 4.924, p = .026) • H0: β = 0, β ≥ 0, β ≤ 0, same as • H0: OR = 1, OR ≥ 1, OR ≤ 1 • Likelihood ratio χ² better than Wald
Direction/Strength of Relationship • Positive relationship between resources and continuation • Families with more resources are more likely to continue • B = .212 • Exp(B) = OR = 1.237 • % change = 100(1.237 – 1) = 24% • The odds of continuing are 1.24 times (24%) higher for each additional resource
Estimated Logits L(Continue) = -1.227 + (.212)(X)
Figures • Resources.xls
Question & Answer • Are foster families with more resources more likely to continue fostering? • Yes. The odds of continuing are 1.24 times (24%) higher for each additional resource. The probability of continuing is .31 for families with 2 resources, .51 for families with 6 resources, and .71 for families with 10 resources.
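The probabilities quoted in this answer follow from the estimated logit L(Continue) = -1.227 + (.212)(X). A minimal sketch, with hypothetical names and the rounded coefficients from the slides:

```python
import math

# Rounded estimates from the slides for the resources model.
A = -1.227
B_RESOURCES = 0.212


def probability_of_continuing(resources: float) -> float:
    """Predicted probability of continuing for a given number of resources."""
    logit = A + B_RESOURCES * resources
    return 1.0 / (1.0 + math.exp(-logit))


for x in (2, 6, 10):
    print(x, round(probability_of_continuing(x), 2))
# 2 0.31
# 6 0.51
# 10 0.71
```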
Relationship of Linear Predictor to Logits, Odds & p • Relationship between linear predictor and logits is linear • Relationship between linear predictor and odds is non-linear • Relationship between linear predictor and p is non-linear • Challenge is to summarize changes in odds and probabilities associated with changes in IVs in the most meaningful and parsimonious way
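A small numeric sketch can make the contrast concrete: equal steps in the linear predictor give equal steps in the logit, a constant multiplicative change in the odds, and unequal changes in p. The values of the linear predictor below are arbitrary illustrations, not estimates from the foster-family data.

```python
import math


def describe(eta: float) -> tuple[float, float, float]:
    """Return (logit, odds, probability) for a value of the linear predictor."""
    odds = math.exp(eta)
    p = odds / (1.0 + odds)
    return eta, odds, p


# Arbitrary, evenly spaced values of the linear predictor.
rows = [describe(eta) for eta in (-2.0, -1.0, 0.0, 1.0, 2.0)]
for eta, odds, p in rows:
    print(f"logit = {eta:+.1f}   odds = {odds:.3f}   p = {p:.3f}")
```

Each one-unit step multiplies the odds by the same factor (e), while the change in p is largest near p = .50 and shrinks toward 0 and 1.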
IVs to z-scores • z-scores (standard scores) • Only the IV is standardized (not the DV), which yields semi-standardized slopes • One-unit increase in the IV refers to a one-standard-deviation increase • OR interpreted as expected change in the odds associated with a one-standard-deviation increase in the IV • Conversion to z-scores changes intercept, slope, and OR, but not associated test statistics • Table 2.6 (compare to Table 2.5)
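The effect of converting an IV to z-scores can be sketched algebraically: substituting X = mean + SD·z into a + B·X gives a new intercept a + B·mean and a new slope B·SD. The mean and SD of the resources variable are not given on these slides, so the values below are purely hypothetical placeholders, as are the names.

```python
import math


def semi_standardize(a: float, b: float, mean_x: float, sd_x: float) -> tuple[float, float, float]:
    """Return (intercept, slope, OR) after replacing X with its z-score."""
    a_z = a + b * mean_x   # logit at the mean of X (z = 0)
    b_z = b * sd_x         # change in the logit per one-SD increase in X
    return a_z, b_z, math.exp(b_z)


# HYPOTHETICAL mean and SD, for illustration only (not taken from Table 2.6).
HYPOTHETICAL_MEAN = 6.0
HYPOTHETICAL_SD = 2.0

a_z, b_z, or_z = semi_standardize(-1.227, 0.212, HYPOTHETICAL_MEAN, HYPOTHETICAL_SD)
print(f"intercept = {a_z:.3f}, slope = {b_z:.3f}, OR per SD = {or_z:.3f}")
```

Because z is a linear rescaling of X, predicted probabilities and test statistics are unchanged; only the units in which the slope and OR are expressed differ.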
Figures • zResources.xls