1. An Introduction to Regression with Binary Dependent Variables
Brian Goff
Department of Economics
Western Kentucky University
2. Introduction and Description Examples of binary regression
Features of linear probability models
Why use logistic regression?
Interpreting coefficients
Evaluating the performance of the model
3. Binary Dependent Variables In many regression settings, the Y variable is (0,1)
A Few Examples:
Consumer chooses brand (1) or not (0);
A quality defect occurs (1) or not (0);
A person is hired (1) or not (0);
Evacuate home during hurricane (1) or not (0);
Other Examples?
4. Scatterplot with Y = (0,1): Y = Hired / Not Hired; X = Experience
5. The Linear Probability Model (LPM) If we estimate the slope using OLS regression:
Hired = a + β*Income + e ;
The result is called a “Linear Probability Model”
The predicted values are probabilities that Y equals 1;
The equation is linear – the slope is constant
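The slides work through the example in SPSS; as a rough stand-in, here is a minimal Python sketch of an LPM using statsmodels and a small simulated hiring dataset (all variable names and numbers are illustrative, not the slides' data):

```python
# Linear Probability Model: OLS with a 0/1 dependent variable.
# Simulated hiring data -- purely illustrative.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
experience = rng.uniform(0, 20, size=200)                            # X: years of experience
hired = (experience + rng.normal(0, 4, size=200) > 10).astype(int)   # Y: hired (1) or not (0)

X = sm.add_constant(experience)        # add the intercept column
lpm = sm.OLS(hired, X).fit()

print(lpm.params)                      # a (intercept) and b (slope)
p_hat = lpm.predict(X)                 # fitted values = estimated probabilities
print(p_hat.min(), p_hat.max())        # note: these can fall below 0 or above 1
```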
6. Picture of LPM
7. An Example: Loan Approvals
8. Scatterplot (Loaned – NITA)
9. LPM Results
10. LPM Weaknesses The predicted probabilities can be greater than 1 or less than 0
Probabilities, by definition, have max =1; min = 0;
This is not a big issue if they are very close to 0 and 1
The error terms vary based on size of X-variable (“heteroskedastic”) –
There may be models that have lower variance – more “efficient”
The errors are not normally distributed because Y takes on only two values
Creates problems for standard hypothesis testing
More of an issue for statistical theorists
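The first two weaknesses are easy to see numerically. A short self-contained check, again on simulated data rather than the loan file (the data-generating values are made up):

```python
# Illustrative check of two LPM weaknesses: out-of-range predictions and
# heteroskedastic errors. Simulated data, not the slides' loan dataset.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(-20, 20, size=500)
y = (1 / (1 + np.exp(-0.3 * x)) > rng.uniform(size=500)).astype(int)

lpm = sm.OLS(y, sm.add_constant(x)).fit()
p_hat = lpm.predict(sm.add_constant(x))

print("share of predictions outside [0, 1]:", np.mean((p_hat < 0) | (p_hat > 1)))

# The error variance in an LPM depends on p(1-p), so it changes with X:
resid = y - p_hat
print("residual variance, extreme X:", resid[x < -10].var())
print("residual variance, middle X :", resid[np.abs(x) < 5].var())
```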
11. Predicted Probabilities in LPM Loans Model
12. (Binary) Logistic Regression or “Logit” Selects regression coefficients that force the predicted values for Y to lie between 0 and 1
Produces S-shaped regression predictions rather than a straight line
Selects these coefficients through the “Maximum Likelihood” estimation technique
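A comparable sketch of fitting a logit by maximum likelihood with statsmodels, again on illustrative simulated data:

```python
# Binary logistic regression: coefficients chosen by maximum likelihood,
# predictions forced into (0, 1). Simulated data for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(-20, 20, size=500)
y = (1 / (1 + np.exp(-0.3 * x)) > rng.uniform(size=500)).astype(int)

logit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
print(logit.params)                        # intercept and slope (log-odds scale)

p_hat = logit.predict(sm.add_constant(x))  # S-shaped predicted probabilities
print(p_hat.min(), p_hat.max())            # always strictly between 0 and 1
```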
13. Picture of Logistic Regression
14. LPM & Logit Regressions LPM & Logit regressions in some cases provide similar answers
If there are few “outlying” X-values at the upper or lower ends, then the LPM often produces predicted values within the (0,1) band
In such cases, the non-linear sections of the Logit regression are not needed
In such cases, simplicity of LPM may be reason for use
See following slide for an illustration
15. Example where LPM & Logit Results Similar
16. LPM & Logit: Loan Case In the loan example the results are similar:
R-square = 98% for a regression of the LPM-predicted probabilities on the Logit-predicted probabilities
Descriptive statistics for both probabilities appear below:
The main difference is that the LPM’s maximum and minimum predicted probabilities lie closer to 1 and 0
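A rough way to reproduce this kind of comparison, using simulated data as a stand-in for the loan file (the 0.11 slope is borrowed from the NITA example; everything else is made up):

```python
# Compare LPM and logit predicted probabilities on the same data.
# Simulated data standing in for the loan-approval file.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(-10, 25, size=300)                  # stand-in for NITA
y = (1 / (1 + np.exp(-(-0.5 + 0.11 * x))) > rng.uniform(size=300)).astype(int)

X = sm.add_constant(x)
p_lpm = sm.OLS(y, X).fit().predict(X)
p_logit = sm.Logit(y, X).fit(disp=0).predict(X)

r = np.corrcoef(p_lpm, p_logit)[0, 1]
print("R-square between the two sets of predictions:", r ** 2)
print("LPM   min/max:", p_lpm.min(), p_lpm.max())
print("Logit min/max:", p_logit.min(), p_logit.max())
```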
17. SPSS Logistic Regression Output for Loan Approval:
18. Interpreting Logistic Regression (Logit) Coefficients The slope coefficient from a logistic regression
(β) = the rate of change in the "log odds" of the event under study as X changes by one unit
What in the world does that mean?
We want to know the change in the probability of the event as X changes
In Logistic Regression, this value changes as X-changes (S-shape instead of linear)
19. Loan Example: Effect of NITA on Probability of Loan
NITA coefficient (B) = 0.11
20. Meaning? At moderate probabilities (around 0.5) of getting a loan (corresponds to average NITA of about 5), the likelihood of getting a loan increases by 2.75% for each 1% increase in NITA
This estimate is very close to the LPM estimate of 2.2%
At the lower and upper extremes (NITA values in the negative or positive teens), the probability changes by only about 0.9% for a 1-unit increase in NITA
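These figures follow from the marginal-effect formula in the appendix, β*P*(1-P). A quick arithmetic check (the ~0.9% quoted above corresponds to probabilities a bit beyond 0.9 or below 0.1):

```python
# Quick check of the slide's figures using beta * P * (1 - P) with beta = 0.11.
beta = 0.11
print(beta * 0.5 * (1 - 0.5))   # 0.0275 -> about a 2.75-point change near P = 0.5
print(beta * 0.9 * (1 - 0.9))   # 0.0099 -> about a 1-point change near P = 0.9 (or 0.1)
```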
21. Alternative Methods of Evaluating Logit Regressions Statistics for comparing alternative logit models:
Model Chi-Square
Percent Correct Predictions
Pseudo-R2
22. Chi-Square Test for Fit The Chi-Square statistic and associated p-value (Sig.) tests whether the model coefficients as a group equal zero
Larger Chi-squares and smaller p-values indicate greater confidence in rejecting the null hypothesis of no effect (all slope coefficients equal to zero)
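In statsmodels terms this model chi-square corresponds to the likelihood-ratio statistic; a sketch on simulated data (not the SPSS output shown in the slides):

```python
# Model chi-square: likelihood-ratio test that all slope coefficients are zero.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(-10, 25, size=300)
y = (1 / (1 + np.exp(-(-0.5 + 0.11 * x))) > rng.uniform(size=300)).astype(int)

result = sm.Logit(y, sm.add_constant(x)).fit(disp=0)

print("model chi-square :", result.llr)          # 2 * (logL_full - logL_null)
print("p-value (Sig.)   :", result.llr_pvalue)
print("by hand          :", 2 * (result.llf - result.llnull))
```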
23. Percent Correct Predictions The "Percent Correct Predictions" statistic assumes that if the estimated p is greater than or equal to .5 then the event is expected to occur and not occur otherwise.
By assigning these probabilities 0s and 1s and comparing these to the actual 0s and 1s, the % correct Yes, % correct No, and overall % correct scores are calculated.
Note: the % correctly predicted within subgroups is also important, especially if most of the data are 0s or 1s
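A small sketch of the calculation, using hypothetical outcome and predicted-probability vectors:

```python
# Percent correct predictions with a 0.5 cutoff, overall and by subgroup.
import numpy as np

actual = np.array([1, 1, 0, 0, 1, 0, 1, 0, 0, 1])   # hypothetical 0/1 outcomes
p_hat = np.array([0.8, 0.6, 0.3, 0.1, 0.4, 0.2, 0.9, 0.6, 0.1, 0.7])

predicted = (p_hat >= 0.5).astype(int)     # classify as 1 if estimated p >= 0.5

overall = np.mean(predicted == actual)
pct_yes = np.mean(predicted[actual == 1] == 1)   # % of actual 1s predicted correctly
pct_no = np.mean(predicted[actual == 0] == 0)    # % of actual 0s predicted correctly

print(f"overall correct: {overall:.0%}, correct Yes: {pct_yes:.0%}, correct No: {pct_no:.0%}")
```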
24. Percent Correct Results
25. R2 Problems
26. Pseudo-R2 Values There are pseudo-R2 statistics that adjust for the (0,1) nature of the actual data: two are listed above
Their computation is somewhat involved, but they yield measures that vary between 0 and (somewhat close to) 1, much like the R2 in an LPM.
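One common version is McFadden's pseudo-R2, defined as 1 - logL_full/logL_null; it may differ from the two statistics listed on the slide (SPSS typically reports Cox & Snell and Nagelkerke). A sketch with statsmodels, which reports McFadden's value directly:

```python
# McFadden's pseudo-R2 for a fitted logit (simulated data for illustration).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(-10, 25, size=300)
y = (1 / (1 + np.exp(-(-0.5 + 0.11 * x))) > rng.uniform(size=300)).astype(int)

result = sm.Logit(y, sm.add_constant(x)).fit(disp=0)

print("McFadden pseudo-R2:", result.prsquared)             # reported by statsmodels
print("by hand           :", 1 - result.llf / result.llnull)
```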
27. Appendix: Calculating Effect of X-variable on Probability of Y Effect on the probability of Y from a 1-unit change in X
= (β)*(Probability)*(1 - Probability)
Probability changes as the value of X changes
To calculate (1 - P) for given values of the X-variables:
(1 - P) = 1 / (1 + exp[a + β1*X1 + β2*X2 + …])
With multiple X-variables it is common to focus on one at a time and use average values for all but one
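A small helper implementing this formula for a single X-variable. The intercept below is hypothetical (chosen so that NITA ≈ 5 gives P ≈ 0.5, matching slide 20); 0.11 is the slide's NITA coefficient:

```python
# Effect of a 1-unit change in X on P(Y = 1) at a given X: beta * P * (1 - P).
import numpy as np

def marginal_effect(a, beta, x):
    """Slope of the logistic probability curve at x, for a single X-variable."""
    p = 1 / (1 + np.exp(-(a + beta * x)))   # P(Y = 1 | x)
    return beta * p * (1 - p)

# Hypothetical loan-style values: a = -0.55 (illustrative), beta = 0.11 (NITA coefficient).
for x in (-15, 0, 5, 20):
    print(x, round(marginal_effect(-0.55, 0.11, x), 4))
```

With several X-variables, the same formula applies after plugging average values into the other variables, as the last bullet above describes.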