Quantitative Methods Analyzing event counts
Event Count Analysis Event counts involve a non-negative, integer-valued random variable. Examples are the number of bills introduced by a legislator, the number of car accidents, etc. Trivia: one of the earliest recorded uses of the Poisson distribution was an 1898 analysis of the number of Prussian soldiers who were kicked to death by horses. OLS generally cannot be used for event count analysis because it will produce biased and inconsistent estimates. (The dependent variable is not really interval / continuous: it is discrete and bounded below at zero, and the data are heteroskedastic.)
Event Count Analysis Poisson models
Poisson Models • The Poisson distribution function: $\Pr(Y = y) = \frac{e^{-\lambda}\lambda^{y}}{y!}$ for $y = 0, 1, 2, \ldots$ • A Poisson distribution has a mean and a variance that are both equal to λ. As λ increases, the distribution becomes approximately normal.
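The two properties on this slide (mean = variance = λ, and the approach to normality as λ grows) can be checked numerically. The sketch below is only an illustration using the Python standard library; the function names and the cutoff `y_max` are assumptions made for the example, not part of the slides.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson variable with rate lam, computed in log space
    so that large y does not overflow."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def poisson_mean_var(lam, y_max=500):
    """Mean and variance obtained by summing the pmf over 0..y_max
    (y_max is an arbitrary cutoff; the tail beyond it is negligible
    for the small rates used here)."""
    probs = [poisson_pmf(y, lam) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

mean, var = poisson_mean_var(4.0)
print("lambda = 4: mean", round(mean, 6), "variance", round(var, 6))

# For a large rate, the pmf is close to a normal density with mean = var = lambda.
print("lambda = 100, y = 100:", poisson_pmf(100, 100.0), normal_pdf(100, 100.0, 100.0))
```

The log-space form is used only to keep the factorial from overflowing; it is algebraically the same formula.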
Poisson Models • The predicted counts (or “incidence rates”) can be calculated from the results as follows: $\hat{\mu} = \exp(\hat{\beta}_0 + \hat{\beta}_1 x_1 + \dots + \hat{\beta}_k x_k)$
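As a rough illustration, a predicted count under the usual log link is the exponential of the linear predictor. The intercept, coefficients, and covariate values below are hypothetical numbers chosen for the example; they are not estimates from the slides.

```python
import math

def predicted_count(intercept, coefs, x):
    """Expected count exp(b0 + b1*x1 + ... + bk*xk) for one observation."""
    linear_predictor = intercept + sum(b * v for b, v in zip(coefs, x))
    return math.exp(linear_predictor)

# Hypothetical model: intercept 1.0 and two covariates.
hypothetical_intercept = 1.0
hypothetical_coefs = [0.5, -0.2]

print(predicted_count(hypothetical_intercept, hypothetical_coefs, [2.0, 1.0]))
```

Because the linear predictor is exponentiated, the predicted count is always positive, which is one reason this form suits count outcomes.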
Poisson Models One can compare incidence rates using “incidence rate ratios”. The incidence rate ratio for a one-unit change in $x_i$, with all other variables in the model held constant, is $e^{\beta_i}$.
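A short derivation shows why the ratio depends only on the coefficient. It assumes the standard log-link form of the Poisson model, with $\mu(x)$ denoting the expected count at covariate values $x$; this notation is introduced here for the sketch.

```latex
\[
\frac{\mu(x_1, \dots, x_i + 1, \dots, x_k)}{\mu(x_1, \dots, x_i, \dots, x_k)}
= \frac{\exp\bigl(\beta_0 + \dots + \beta_i (x_i + 1) + \dots + \beta_k x_k\bigr)}
       {\exp\bigl(\beta_0 + \dots + \beta_i x_i + \dots + \beta_k x_k\bigr)}
= e^{\beta_i}
\]
```

All terms other than $\beta_i$ cancel, so the ratio is the same whatever values the other variables take.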
Poisson Models: an example
------------------------------------------------------------------
 daysabs |        b        z    P>|z|     e^b   e^bStdX     SDofX
---------+--------------------------------------------------------
  gender | -0.40935   -8.489    0.000  0.6641    0.8147    0.5006
  angnce | -0.01467  -11.342    0.000  0.9854    0.7686   17.9392
------------------------------------------------------------------
Poisson Models: an example Being male multiplies the expected # of days absent by a factor of 0.66. That is a change of 100 × (0.6641 − 1)% ≈ −33.6%, a decrease of about 34%. For each one-point increase in the language score, the expected # of days absent is multiplied by a factor of 0.985, a change of 100 × (0.9854 − 1)% ≈ −1.5%.
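The arithmetic behind these interpretations can be reproduced from the coefficients alone. The sketch below copies the b and SDofX values from the example table (variable names as printed there); the helper function names are illustrative.

```python
import math

# (b, SDofX) pairs copied from the example table.
coefs = {
    "gender": (-0.40935, 0.5006),
    "angnce": (-0.01467, 17.9392),
}

def incidence_rate_ratio(b, delta=1.0):
    """Factor change in the expected count for a change of delta in x."""
    return math.exp(b * delta)

def percent_change(b, delta=1.0):
    """Percent change in the expected count for a change of delta in x."""
    return 100.0 * (math.exp(b * delta) - 1.0)

for name, (b, sd_x) in coefs.items():
    print(
        name,
        "e^b =", round(incidence_rate_ratio(b), 4),
        "e^bStdX =", round(incidence_rate_ratio(b, sd_x), 4),
        "% change per unit =", round(percent_change(b), 1),
    )
```

The e^bStdX column is the same calculation with a one-standard-deviation change in x in place of a one-unit change.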
Negative Binomial Regression Often there is overdispersion, where the variance exceeds the mean. In practice, this usually means one of two things. First, it is possible that some unobserved variable makes some observations have higher counts than others (e.g., the number of publications of professors, or the # of RBIs on a sports team: one cannot assume the mean is the same across observations). This is common with pooled data and unobserved variables, and it will look like heteroskedasticity. (Example: the school from which one graduates.)
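A small simulation can illustrate how an unobserved variable produces overdispersion. In this sketch each observation's Poisson rate is multiplied by an unobserved gamma-distributed factor with mean 1. The gamma choice, the parameter values, and the simple Knuth-style Poisson sampler are assumptions made for the illustration.

```python
import math
import random
import statistics

def draw_poisson(lam, rng):
    """Draw one Poisson variate by multiplying uniforms until the product
    falls below exp(-lam). Adequate for the small rates used here."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate(n=20000, mean=3.0, shape=2.0, seed=0):
    """Return (homogeneous, heterogeneous) samples of n counts each."""
    rng = random.Random(seed)
    # Every observation shares the same rate.
    homogeneous = [draw_poisson(mean, rng) for _ in range(n)]
    # Each observation gets its own rate: mean times a gamma factor
    # with shape `shape` and scale 1/shape, so the factor has mean 1.
    heterogeneous = [
        draw_poisson(mean * rng.gammavariate(shape, 1.0 / shape), rng)
        for _ in range(n)
    ]
    return homogeneous, heterogeneous

homogeneous, heterogeneous = simulate()
for label, sample in (("same rate", homogeneous), ("varying rate", heterogeneous)):
    print(label, "mean", round(statistics.mean(sample), 2),
          "variance", round(statistics.variance(sample), 2))
```

Both samples have roughly the same mean, but only the sample with varying rates shows a variance well above the mean.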
Negative Binomial Regression The second possibility is that having one event increases or decreases the probability of having others (e.g., bill sponsorship counts).
Negative Binomial Regression A negative binomial regression (NBR) analysis is appropriate in these cases (and if there is no overdispersion, an NBR will collapse down to a Poisson). (Note: there are also alternatives, such as zero-inflated (many, many zeros) and zero-truncated (no zeros) NBR.)
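The collapse to a Poisson can be seen directly from the probability function. The sketch below assumes the common parameterization in which the negative binomial has mean μ and variance μ + αμ², with α as the dispersion parameter. The slides do not state which parameterization they use, so treat this as one possible form.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu (log-space form)."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def negbin_pmf(y, mu, alpha):
    """Negative binomial probability of count y with mean mu and
    dispersion alpha > 0 (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    log_p = (
        math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)

def negbin_mean_var(mu, alpha, y_max=500):
    """Mean and variance by direct summation over 0..y_max."""
    probs = [negbin_pmf(y, mu, alpha) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# With sizeable dispersion the variance exceeds the mean.
print(negbin_mean_var(3.0, 0.5))

# As alpha shrinks toward zero the probabilities approach the Poisson ones.
for alpha in (1.0, 0.1, 1e-6):
    print(alpha, negbin_pmf(2, 3.0, alpha), poisson_pmf(2, 3.0))
```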
Negative Binomial Regression Zero-inflated models are based on the assumption that there is an “always zero” category of cases and a “sometimes zero” category of cases. An example for zero-truncated models would be an online survey of web usage: every respondent necessarily has some web use, so a count of zero cannot be observed.
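The two variants can be written as simple adjustments to any base count distribution. For brevity this sketch uses a Poisson as the base rather than a negative binomial. The mixing probability `pi_zero`, the rate, and the function names are all hypothetical.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu (log-space form)."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def zero_inflated_pmf(y, pi_zero, base_pmf):
    """Mixture of an "always zero" group (probability pi_zero) and a
    "sometimes zero" group that follows base_pmf."""
    p = (1.0 - pi_zero) * base_pmf(y)
    return pi_zero + p if y == 0 else p

def zero_truncated_pmf(y, base_pmf):
    """base_pmf conditioned on the count being at least 1."""
    if y == 0:
        return 0.0
    return base_pmf(y) / (1.0 - base_pmf(0))

def base(y):
    """Base count distribution for the example: Poisson with mean 2."""
    return poisson_pmf(y, 2.0)

print("P(0) base:", base(0))
print("P(0) zero-inflated:", zero_inflated_pmf(0, 0.3, base))
print("P(1) zero-truncated:", zero_truncated_pmf(1, base))
```

Passing the base distribution in as a function means the same two wrappers would work with a negative binomial probability function.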