Binary regression: Logit and Probit Models
Modified Version by Woraphon Yamaka
Why use Binary regression? • There are many important research topics for which the dependent variable is "limited." • For example: voting, morbidity or mortality, and participation data are not continuous or normally distributed. • Binary choice regression is a type of regression analysis where the dependent variable is a dummy variable: coded 0 (did not vote) or 1 (did vote).
Binary Regression Model • Comparing Linear Regression and Binary Regression Models [figure: a linear regression line fitted to a binary outcome coded 0 and 1]
Logit and Probit models for Binary choice • There are two types of binary choice models: 1) Logit regression, 2) Probit regression • Both are nonlinear models for a binary response: the response probability is a nonlinear function of the explanatory variables • The response probability P(y = 1 | x) = G(xβ) is the probability of a "success" given the explanatory variables x • Shorthand vector notation: the vector of explanatory variables x also contains the constant of the model • G(·) is a cumulative distribution function, so the response probability is a function of the explanatory variables x
Logit and Probit models for binary response • Choices for the link function G: Probit uses the standard normal distribution, G(z) = Φ(z); Logit uses the logistic function, G(z) = Λ(z) = exp(z) / (1 + exp(z)) • Latent variable formulation of the Logit and Probit models: y* = xβ + e, and y = 1 if y* > 0, y = 0 otherwise • If the latent variable y* is larger than zero, y takes on the value 1; if it is less than or equal to zero, y takes on 0 (y* can thus be interpreted as the propensity to have y = 1)
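A minimal numerical sketch of the two link functions described above, using Python with NumPy and SciPy (the function names are illustrative, not from the original slides):

```python
import numpy as np
from scipy.stats import norm

def probit_prob(xb):
    """Probit response probability: standard normal CDF of the index xb."""
    return norm.cdf(xb)

def logit_prob(xb):
    """Logit response probability: logistic CDF of the index xb."""
    return 1.0 / (1.0 + np.exp(-xb))

# Both links map any index value into (0, 1), and both give 0.5 at an index of 0
print(probit_prob(0.0), logit_prob(0.0))   # 0.5 0.5
```

Both functions are monotone in the index, which is why the fitted probabilities are always bounded between 0 and 1, unlike the linear probability model.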
More: • The logistic and normal distributions constrain the estimated probabilities to lie between 0 and 1. • If the index xβ = 0, then p = .50 • As xβ gets really big, p approaches 1 • As xβ gets really small, p approaches 0
Estimation • Because OLS is inappropriate for a binary dependent variable, Maximum Likelihood Estimation (MLE) is used instead. • MLE is a statistical method for estimating the coefficients of a model. • The likelihood function (L) measures the probability of observing the particular set of dependent variable values that occur in the sample: L = p1 · p2 · ... · pn, where pi is the probability of individual i's observed outcome. • The higher L is, the higher the probability of observing the sample.
Logit and Probit models for binary response • Maximum likelihood estimation of Logit and Probit models • The probability that individual i's outcome is yi = 1 given that his/her characteristics are xi: P(yi = 1 | xi) = G(xi β) • The probability that individual i's outcome is yi = 0 given that his/her characteristics are xi: P(yi = 0 | xi) = 1 − G(xi β) • Under random sampling, the log-likelihood is ℓ(β) = Σi [ yi log G(xi β) + (1 − yi) log(1 − G(xi β)) ] • Maximum likelihood estimates are the values of β that maximize this log-likelihood
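The maximum likelihood estimation above can be sketched numerically. This is an illustrative implementation (simulated data, SciPy's general-purpose optimizer), not the original slides' code:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_loglik(beta, X, y, cdf):
    """Negative log-likelihood for a binary choice model with link `cdf`."""
    p = cdf(X @ beta)
    p = np.clip(p, 1e-10, 1 - 1e-10)   # guard against log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Simulate a probit data-generating process with a constant and one regressor
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.5, 1.0])
y = (X @ true_beta + rng.normal(size=n) > 0).astype(float)

# Maximize the likelihood by minimizing its negative
res = minimize(neg_loglik, x0=np.zeros(2), args=(X, y, norm.cdf), method="BFGS")
print(res.x)   # estimates should lie near the true values (0.5, 1.0)
```

Passing a different CDF (e.g. the logistic function) to `neg_loglik` turns the same routine into a logit estimator.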
Interpretation • In linear regression, if the coefficient on x is β, then a 1-unit increase in x increases Y by β. • But in probit or logit models this reading no longer applies directly. • Example: the coefficient on BVAP is 0.0923 and is significant at the 1% level. Read as in a linear model, a 1% increase in BVAP would raise Pr(Y = 1) by 0.0923, so raising x = BVAP would have a constant effect on Y.
Problem of the traditional interpretation • A constant coefficient does not translate into a constant effect on Pr(Y = 1); the effect depends on your starting point. • For instance, raising BVAP from .2 to .3 has little appreciable impact on Pr(Black Elected), but increasing BVAP from .5 to .6 does have a big impact on the probability.
How we solve this problem: Marginal Effects in Probit • In the probit model, the marginal effect of xi is ∂Pr(Y = 1)/∂xi = φ(xβ)·βi • This expression depends not just on βi, but on the value of xi and on all other variables in the equation • So even to calculate the impact of xi on Y you have to choose values for all other variables xj; typical options are to set all variables to their means or their medians • Another approach is to fix the xj and let xi vary from its minimum to its maximum value; you can then plot how the marginal effect of xi changes across its observed range of values
Logit and Probit models for binary response • Reporting partial effects of explanatory variables • The difficulty is that partial effects are not constant but depend on x through the density g(xβ) • Partial effects at the average: g(x̄β)·βj — the partial effect of explanatory variable xj is considered for an "average" individual (this is problematic in the case of explanatory variables such as gender) • Average partial effects: (1/n) Σi g(xi β)·βj — the partial effect of explanatory variable xj is computed for each individual in the sample and then averaged across all sample members (makes more sense) • Analogous formulas hold for discrete explanatory variables
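The two reporting conventions above differ only in where the density is evaluated. A minimal sketch for the probit case, with assumed coefficients and simulated regressors:

```python
import numpy as np
from scipy.stats import norm

beta = np.array([0.5, 1.0])             # assumed probit coefficients
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])

# Partial effect at the average: evaluate the normal density at the
# average individual's index, then scale by beta_j
pea = norm.pdf(X.mean(axis=0) @ beta) * beta[1]

# Average partial effect: evaluate the density at each individual's
# index, average, then scale by beta_j
ape = np.mean(norm.pdf(X @ beta)) * beta[1]

print(pea, ape)
```

Both numbers are bounded by φ(0)·βj ≈ 0.4·βj, since the standard normal density never exceeds about 0.4.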
Logit and Probit models for binary response • Hypothesis testing after maximum likelihood estimation • The usual t-tests and confidence intervals can be used • There are three alternatives to test multiple hypotheses: Lagrange multiplier or score test (not discussed here); Wald test (requires only estimation of the unrestricted model); Likelihood ratio test (restricted and unrestricted models needed), whose statistic follows a chi-square distribution with q degrees of freedom • The null hypothesis that the q restrictions hold is rejected if the growth in the maximized likelihood is too large when going from the restricted to the unrestricted model
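The likelihood ratio test described above compares the two maximized log-likelihoods. A short sketch with illustrative (made-up) log-likelihood values:

```python
from scipy.stats import chi2

# LR = 2 * (loglik_unrestricted - loglik_restricted) ~ chi2(q) under H0
ll_ur, ll_r, q = -201.3, -205.8, 2   # illustrative values, q restrictions
LR = 2 * (ll_ur - ll_r)
p_value = chi2.sf(LR, df=q)          # upper-tail probability of chi2(q)
print(LR, p_value)                   # LR = 9.0; small p-value rejects H0
```

Because the unrestricted model always fits at least as well, LR is nonnegative, and a large value signals that the q restrictions cost too much likelihood.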
Logit and Probit models for binary response • Goodness-of-fit measures for Logit and Probit models • Percent correctly predicted: individual i's outcome is predicted as one if the fitted probability for this event is larger than .5; the percentage of correctly predicted y = 1 and y = 0 outcomes is then counted • Pseudo R-squared: compare the maximized log-likelihood of the model with that of a model that only contains a constant (and no explanatory variables) • Correlation-based measures: look at the correlation (or squared correlation) between the predicted probabilities and the true values
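The first two measures above can be computed in a few lines. The outcome vector, fitted probabilities, and log-likelihood values below are made up for illustration:

```python
import numpy as np

y = np.array([1, 0, 1, 1, 0, 0, 1, 0])                     # observed outcomes
p_hat = np.array([0.8, 0.3, 0.6, 0.4, 0.2, 0.7, 0.9, 0.1])  # fitted probabilities

# Percent correctly predicted: classify as 1 when the fitted probability > .5
y_pred = (p_hat > 0.5).astype(int)
pcp = np.mean(y_pred == y)

# McFadden pseudo R-squared: 1 - loglik(model) / loglik(constant-only model)
ll_model, ll_null = -3.1, -5.5   # illustrative log-likelihood values
pseudo_r2 = 1 - ll_model / ll_null

print(pcp, pseudo_r2)
```

Here two of the eight observations are misclassified, so the percent correctly predicted is 0.75; the pseudo R-squared rises toward 1 as the explanatory variables add likelihood over the constant-only model.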
Logit and Probit models for binary response • Example: married women's labor force participation (LPM = linear probability model, i.e. the linear regression model) • The coefficients are not comparable across models; often, Logit coefficient estimates are roughly 1.6 times the Probit estimates, because the logistic distribution has a larger standard deviation (π/√3 ≈ 1.81) than the standard normal • The biggest difference between the LPM and Logit/Probit is that partial effects are nonconstant in Logit/Probit (e.g., a larger decrease in participation probability for the first child)
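The rough 1.6 scaling between logit and probit coefficients can be checked empirically by fitting both models to the same simulated data; this is a sketch using SciPy's optimizer, not the original example's dataset:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit(cdf, X, y):
    """Maximum likelihood fit of a binary choice model with link `cdf`."""
    def nll(b):
        p = np.clip(cdf(X @ b), 1e-10, 1 - 1e-10)
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    return minimize(nll, np.zeros(X.shape[1]), method="BFGS").x

# Simulated data: constant plus one regressor, probit-style error
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(1000), rng.normal(size=1000)])
y = (X @ np.array([0.2, 0.8]) + rng.normal(size=1000) > 0).astype(float)

b_probit = fit(norm.cdf, X, y)
b_logit = fit(lambda z: 1.0 / (1.0 + np.exp(-z)), X, y)
print(b_logit[1] / b_probit[1])   # ratio typically falls in the 1.6-1.8 range
```

Despite the different coefficient scales, the two models' fitted probabilities and partial effects are usually very close.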