Limited Dependent Variables

Limited Dependent Variables • Often there are occasions where we are interested in explaining a dependent variable that has only limited measurement • Frequently it is even dichotomous.

Examples • War(1) vs. no War(0) • Vote vs. no vote • Regime change vs. no change

These are often Probability Models • E.g., power disparity leads to war: • $Y_t = \alpha + \beta X_t + e_t$ • Where Yt is the occurrence (or not) of war, and Xt is a measure of power disparity • We call this a Linear Probability Model

Problems with LPM Regression • OLS applied to a dichotomous dependent variable is called the Linear Probability Model • Running this regression produces some problems • Errors are not normally distributed • Errors are heteroskedastic • Predicted Ys can fall outside the 0.0-1.0 bounds required for a probability
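The out-of-bounds problem can be seen in a minimal sketch. The data below are hypothetical toy values (not the course data), and the bivariate OLS formulas are written out by hand so that nothing beyond the Python standard library is assumed.

```python
# Hypothetical toy data: x is a power-disparity score, y is war (1) or no war (0).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Bivariate OLS: slope = S_xy / S_xx, intercept from the sample means.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def lpm_predict(value):
    """Fitted 'probability' from the linear probability model."""
    return intercept + slope * value


# Predictions are unbounded: low x gives a value below 0, high x a value above 1.
for value in (-2.0, 3.5, 10.0):
    print(value, round(lpm_predict(value), 3))
```

Nothing in the linear form keeps the fitted values inside the unit interval, which is the motivation for the logit and probit models that follow.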

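Logistic Model • We need a model that produces true probabilities • The logit, or cumulative logistic distribution, offers one approach: • $P_i = \frac{1}{1 + e^{-(\alpha + \beta X_i)}}$ • This produces a sigmoid curve. • Look at the equation under 2 conditions (taking $\beta > 0$): • Xi = +∞, where $P_i$ approaches 1 • Xi = -∞, where $P_i$ approaches 0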
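A short sketch of the cumulative logistic function and its behavior at the two extremes. The intercept `alpha` and slope `beta` are hypothetical values chosen only for illustration.

```python
import math


def logistic(z):
    """Cumulative logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients, for illustration only.
alpha, beta = -0.5, 1.2

# Very large negative and positive X stand in for the two limiting conditions.
for x_value in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
    print(x_value, logistic(alpha + beta * x_value))
```

Every output lies between 0 and 1, and the values trace the S-shaped (sigmoid) curve as X increases.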

Sigmoid curve

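Probability Ratio • Note that, with $P_i$ the probability that Yi = 1, $1 - P_i = \frac{1}{1 + e^{Z_i}}$, so that $\frac{P_i}{1 - P_i} = e^{Z_i}$ • Where $Z_i = \alpha + \beta X_i$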

Log Odds Ratio • The logit is the log of the odds ratio, and is given by: • $L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = \alpha + \beta X_i$ • This model gives us a coefficient that may be interpreted as the change in the log odds of the dependent variable for a one-unit change in Xi
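The interpretation of the coefficient can be checked numerically. In this sketch the coefficients `alpha` and `beta` are hypothetical; the point is only that a one-unit increase in X adds `beta` to the log odds and multiplies the odds by `exp(beta)`.

```python
import math


def inv_logit(z):
    """Probability implied by a log-odds value z."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds in favor of the event: P / (1 - P)."""
    return p / (1.0 - p)


def logit(p):
    """Log odds of probability p."""
    return math.log(odds(p))


# Hypothetical coefficients, for illustration only.
alpha, beta = -1.0, 0.8

p_at_1 = inv_logit(alpha + beta * 1.0)
p_at_2 = inv_logit(alpha + beta * 2.0)

print("change in log odds:", logit(p_at_2) - logit(p_at_1))
print("ratio of odds:     ", odds(p_at_2) / odds(p_at_1))
print("exp(beta):         ", math.exp(beta))
```

The change in probability itself is not constant: it depends on where along the sigmoid curve the one-unit change occurs.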

Estimation of Model • We estimate this with maximum likelihood • The significance tests are z statistics • We can generate a Pseudo R2, which is an attempt to measure the percent of variation of the underlying logit function explained by the independent variables • We test the full model with the Likelihood Ratio test (LR), which has a χ2 distribution with k degrees of freedom, where k is the number of independent variables
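The estimation steps can be sketched for a one-variable logit. Several assumptions are made here: the data are hypothetical toy values, the likelihood is maximized with hand-written Newton-Raphson steps (statistical packages use more careful routines), the Pseudo R2 is McFadden's version, and the p-value shortcut is valid only because k = 1.

```python
import math

# Hypothetical toy data: x is power disparity, y is war (1) or no war (0).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(a, b):
    """Bernoulli log-likelihood of the model P(Y=1) = inv_logit(a + b*x)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = inv_logit(a + b * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logit(iterations=25):
    """Maximize the log-likelihood by Newton-Raphson, starting from (0, 0)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = inv_logit(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


a_hat, b_hat = fit_logit()
ll_full = log_likelihood(a_hat, b_hat)

# Intercept-only (null) model: the fitted probability is the sample mean of y.
p_bar = sum(y) / len(y)
ll_null = log_likelihood(math.log(p_bar / (1.0 - p_bar)), 0.0)

lr = 2.0 * (ll_full - ll_null)              # LR statistic, chi-square with k = 1
pseudo_r2 = 1.0 - ll_full / ll_null         # McFadden's Pseudo R2
p_value = math.erfc(math.sqrt(lr / 2.0))    # upper tail of chi-square, 1 df only

print("alpha, beta:", a_hat, b_hat)
print("LR:", lr, "p-value:", p_value, "Pseudo R2:", pseudo_r2)
```

The LR statistic compares the fitted model against the intercept-only model, so it tests whether the independent variables jointly improve the fit.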

Neural Networks • The alternate formulation is representative of a single-layer perceptron in an artificial neural network.

Probit • If we can assume that the dependent variable is actually the result of an underlying (and immeasurable) propensity or utility, we can use the cumulative normal probability function to estimate a Probit model • Also, more appropriate if the categories (or their propensities) are likely to be normally distributed • It looks just like a logit model in practice

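The Cumulative Normal Distribution Function • The normal density is given by: • $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$ • The cumulative normal distribution function is: • $F(x_0) = \int_{-\infty}^{x_0} f(x)\,dx$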

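The Standard Normal CDF • We assume that there is an underlying threshold value $I_i^*$; a case whose index $I_i = \alpha + \beta X_i$ exceeds it will be a 1, and 0 otherwise. • Assuming the threshold is normally distributed, we can standardize and estimate this as • $P_i = P(Y_i = 1) = P(I_i^* \le I_i) = \Phi(\alpha + \beta X_i)$, where $\Phi$ is the standard normal CDF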
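A minimal sketch of how a probit probability is computed, using the standard normal CDF from Python's `statistics.NormalDist`. The coefficients `alpha` and `beta` are hypothetical and the function names are chosen only for this illustration; a logit curve with the same coefficients is printed alongside for comparison.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit_probability(x_value, alpha, beta):
    """P(Y = 1) as the standard normal CDF evaluated at alpha + beta * x."""
    return STANDARD_NORMAL.cdf(alpha + beta * x_value)


def logit_probability(x_value, alpha, beta):
    """P(Y = 1) under the logistic model, for comparison."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x_value)))


# Hypothetical coefficients, for illustration only.
alpha, beta = -1.0, 0.5

for x_value in (-4.0, 0.0, 2.0, 4.0, 8.0):
    print(
        x_value,
        round(probit_probability(x_value, alpha, beta), 4),
        round(logit_probability(x_value, alpha, beta), 4),
    )
```

Both functions are sigmoid and cross 0.5 at the same X, but the normal CDF approaches 0 and 1 faster, so estimated coefficients from the two models are on different scales and should not be compared directly.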

Probit estimates • Again, maximum likelihood estimation • Again, a Pseudo R2 • Again, an LR test with k degrees of freedom

Assumptions of Models • All Yi are in the set {0, 1} • They are statistically independent • No multicollinearity among the independent variables • P(Yi=1) follows the cumulative normal distribution for probit, and the logistic function for logit

Ordered Probit • If the dependent variable can take on ordinal levels, we can extend the dichotomous Probit model to an n-chotomous, or ordered, Probit model • It simply has several threshold values estimated • Ordered logit works much the same way
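The role of the several thresholds can be sketched as follows. The index value `xb` and the cut points are hypothetical; in practice both the coefficients and the thresholds are estimated by maximum likelihood.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def ordered_probit_probs(xb, cutpoints):
    """Return P(Y = j) for j = 0..len(cutpoints), given the linear index xb.

    cutpoints must be in increasing order; category j is observed when the
    latent propensity falls between cut point j-1 and cut point j.
    """
    cumulative = [0.0] + [PHI(c - xb) for c in cutpoints] + [1.0]
    return [cumulative[j + 1] - cumulative[j] for j in range(len(cumulative) - 1)]


# Hypothetical thresholds separating three ordered categories (low, middle, high).
cutpoints = [-1.0, 1.0]

for xb in (-2.0, 0.0, 2.0):
    print(xb, [round(p, 4) for p in ordered_probit_probs(xb, cutpoints)])
```

As the index rises, probability mass shifts from the lowest category toward the highest; replacing the normal CDF with the logistic function gives the ordered logit counterpart.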

Multinomial Logit • If our dependent variable takes on more than two values, but they are nominal (unordered), we use a multinomial logit model
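A sketch of how multinomial logit turns one set of coefficients per category into probabilities. The coefficient values are hypothetical, and the first category is treated as the base category with its coefficients fixed at zero, which is the usual identifying restriction.

```python
import math


def multinomial_logit_probs(x_value, coefficients):
    """Return category probabilities for a single independent variable.

    coefficients holds one (alpha_j, beta_j) pair per non-base category;
    the base category is listed first and has both coefficients set to 0.
    """
    scores = [0.0] + [a + b * x_value for a, b in coefficients]
    largest = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - largest) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients for two non-base categories.
coefficients = [(0.5, -0.3), (-1.0, 0.8)]

for x_value in (0.0, 2.0, 5.0):
    print(x_value, [round(p, 4) for p in multinomial_logit_probs(x_value, coefficients)])
```

Each coefficient is interpreted relative to the base category, so changing the base changes the reported coefficients but not the predicted probabilities.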

Some additional info • The modal category is a good benchmark • Present the % correctly predicted • This can be calculated and presented. • This, when compared to the share of cases in the modal category, gives us a good indication of fit.
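The comparison can be sketched in a few lines. The outcomes and predicted probabilities below are hypothetical, and the 0.5 classification cutoff is an assumption (a common default, not a requirement).

```python
from collections import Counter

# Hypothetical observed outcomes and model-predicted probabilities of Y = 1.
observed = [0, 0, 0, 1, 0, 1, 1, 1, 0, 0]
predicted_prob = [0.10, 0.20, 0.30, 0.60, 0.55, 0.70, 0.80, 0.40, 0.20, 0.10]

# Classify each case as 1 when the predicted probability reaches the cutoff.
cutoff = 0.5
predicted = [1 if p >= cutoff else 0 for p in predicted_prob]

n_cases = len(observed)
n_correct = sum(1 for obs, pred in zip(observed, predicted) if obs == pred)
pct_correct = n_correct / n_cases

# Benchmark: always guessing the modal (most frequent) category.
modal_category, modal_count = Counter(observed).most_common(1)[0]
modal_share = modal_count / n_cases

print("correctly predicted:", pct_correct)
print("modal category:", modal_category, "share:", modal_share)
print("improvement over modal guess:", pct_correct - modal_share)
```

A model that does no better than the modal share adds little, however significant its individual coefficients are.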

Stata • Use Leadership Change data • (1992 cross section) 1992-Stata

Test different models • Dependent variable Leadership change • Examine distribution tables ledchan1 • Independent variables • Try different • Try corr and then (pwcorr)

Try the following
regress ledchan1 grwthgdp hlthexp illit_f polity2
logit ledchan1 grwthgdp hlthexp illit_f polity2
logistic ledchan1 grwthgdp hlthexp illit_f polity2
probit ledchan1 grwthgdp hlthexp illit_f polity2
ologit ledchan1 grwthgdp hlthexp illit_f polity2
oprobit ledchan1 grwthgdp hlthexp illit_f polity2
mlogit ledchan1 grwthgdp hlthexp illit_f polity2
tobit ledchan1 grwthgdp hlthexp illit_f polity2, ul ll

Tobit • Assumes a censored dependent variable: a 0 value for many cases, and then a continuous scale for the rest • E.g., the decision to incarcerate • 0 or 1 (imprison or not) • If imprisoned, then for how many years?
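One way to sketch the Tobit idea is through a latent variable censored at 0: the observed sentence is max(0, y*), where y* is a normally distributed latent propensity. The mean `mu` and standard deviation `sigma` values below are hypothetical, and the function names are chosen only for this illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def prob_positive(mu, sigma):
    """P(y* > 0): the probability that the case is not censored at 0."""
    return STANDARD_NORMAL.cdf(mu / sigma)


def tobit_expected_value(mu, sigma):
    """E[max(0, y*)] when the latent y* is Normal(mu, sigma^2)."""
    z = mu / sigma
    return mu * STANDARD_NORMAL.cdf(z) + sigma * STANDARD_NORMAL.pdf(z)


# Hypothetical latent means (for example, alpha + beta * x for different cases).
sigma = 2.0
for mu in (-3.0, 0.0, 3.0, 10.0):
    print(
        mu,
        round(prob_positive(mu, sigma), 4),
        round(tobit_expected_value(mu, sigma), 4),
    )
```

A single set of coefficients drives both the chance of a nonzero outcome and its expected size, which is what distinguishes Tobit from running a separate probit and OLS.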

Other models • This leads to many other models • Count models & Poisson regression • Duration/Survival/hazard models • Censoring and truncation models • Selection bias models
