Predicting Count Data Poisson Regression
Review: Confusing Statistical Terms • General Linear Model (GLM) -Anything that can be written like this: $Y = B_0 + B_1 X_1 + \dots + B_k X_k + e$ -Solved using ordinary least squares -Assumptions revolve around the normal distribution • Generalized Linear Model -Anything that can be written like this: $g(\mu) = B_0 + B_1 X_1 + \dots + B_k X_k$, where $\mu$ is the mean of Y and $g$ is a link function -Solved using maximum likelihood -Assumptions use many different distributions
Remember: Why These Models? • Linear Regression: Assumes normal errors around the predicted score • When we violate this assumption, our estimates of the distributions of the B’s are incorrect • Also, in some cases our estimates of the effect size are inaccurate (usually too small)
Linear Regression • Linear regression is really a predictive model before anything else. (The statistical aspect is extra.) [Figure: regression line, labeled with B0 (intercept) and B1 (slope)]
Examples • (Criminal Justice) Number of offenses per year • (Domestic Violence) Number of DV events per person • (Epidemiology) Number of seizures per week
Count Data • This type of data can only take discrete values that are greater than or equal to zero (0, 1, 2, ...). • In many situations, this data follows the Poisson distribution
Poisson Distribution • The Poisson random variable is defined by one parameter: the mean (μ) • It has the strong assumption that the mean is equal to the variance: μ = σ²
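The mean-equals-variance property can be checked numerically. The sketch below is only an illustration: the function names and the choice of μ = 3 are assumptions made for this example, and the sum is truncated at a large count rather than taken to infinity.

```python
import math

def poisson_pmf(k: int, mu: float) -> float:
    """P(Y = k) for a Poisson random variable with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def poisson_mean_and_variance(mu: float, k_max: int = 100):
    """Approximate the mean and variance by summing the PMF over 0..k_max."""
    probs = [poisson_pmf(k, mu) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, variance

# mu = 3.0 is an arbitrary value chosen for illustration
mean, variance = poisson_mean_and_variance(mu=3.0)
print(mean, variance)  # both should be very close to mu
```

Whatever value of μ is chosen, the two numbers should agree, which is what makes the single-parameter assumption so strong.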
Poisson Regression • In this model, instead of predicting the mean of a normal distribution, you are predicting the mean of a Poisson distribution (given some predictors)
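Fundamental Equation • In linear regression: $\hat{Y} = B_0 + B_1 X$ • In Poisson regression: $\ln(\hat{\mu}) = B_0 + B_1 X$, or equivalently $\hat{\mu} = e^{B_0 + B_1 X}$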
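To make the Poisson equation concrete, here is a minimal sketch of fitting a one-predictor Poisson regression by maximum likelihood using Newton-Raphson. It assumes the usual log link, ln(μ) = B0 + B1·X. The function name, the data, and the iteration settings are all made up for illustration; in practice a statistics package would do this step.

```python
import math

def fit_poisson_regression(x, y, n_iter=50, tol=1e-10):
    """Fit ln(mu) = b0 + b1*x by Newton-Raphson on the Poisson log-likelihood."""
    b0 = math.log(sum(y) / len(y))  # start at the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score: gradient of the log-likelihood with respect to (b0, b1)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) < tol and abs(step1) < tol:
            break
    return b0, b1

# Made-up example data: x is a predictor, y is a count outcome
x_demo = [0, 1, 2, 3, 4, 5]
y_demo = [1, 2, 2, 4, 6, 9]
b0, b1 = fit_poisson_regression(x_demo, y_demo)
print(b0, b1, math.exp(b1))  # exp(b1) is the EXP(B) value for the predictor
```

The predicted mean for any X is then `math.exp(b0 + b1 * X)`, which is always positive, unlike a linear-regression prediction.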
Assumptions • In your outcome variable (Y), the mean equals the variance. (There is a test for this) • For violations you can use Negative Binomial regression, which is just a Poisson model where the variance is estimated separately from the mean. • Observations are independent (as with most analyses) • And, basically, that the predictive model makes sense ($\ln(\mu) = B_0 + B_1 X$)
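A quick informal look at the mean-equals-variance assumption is to compare the sample variance of the counts with their sample mean. This is only a rough sketch, not the formal test mentioned above; the function name and the counts are invented for illustration.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 under a Poisson model."""
    return variance(counts) / mean(counts)

# Made-up counts with one large value that inflates the variance
counts_demo = [0, 1, 1, 2, 2, 2, 3, 3, 4, 12]
ratio = dispersion_ratio(counts_demo)
print(ratio)  # values well above 1 suggest overdispersion
```

A ratio far above 1 is the situation where the Negative Binomial model is the usual alternative.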
Interpreting Parameters • Like logistic, we have to interpret the EXP(B) • (This is the notation for $e^{B}$) • Instead of an odds ratio, this is a rate ratio (relative risk): it is the factor by which the expected count is multiplied for a one unit increase in X • EXP(B) = 1 is the null hypothesis (no change in the rate) • 1.2 would be a 20% increase in the rate for a one unit increase
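Because EXP(B) acts multiplicatively, a short calculation shows how it scales with the size of the change in X. The coefficient below is hypothetical, chosen so that EXP(B) = 1.2 to match the example above, and the function name is an assumption of this sketch.

```python
import math

def rate_ratio(b: float, delta_x: float = 1.0) -> float:
    """Multiplicative change in the expected count for a delta_x increase in X."""
    return math.exp(b * delta_x)

b_demo = math.log(1.2)  # hypothetical coefficient chosen so that EXP(B) = 1.2
one_unit = rate_ratio(b_demo)      # one unit increase in X
two_units = rate_ratio(b_demo, 2)  # two unit increase in X
print(one_unit, two_units)
```

A two unit increase multiplies the rate by 1.2 twice (1.2 × 1.2), not by 1.4, which is the main difference from reading a linear-regression slope.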
Really, why the trouble? • It turns out that not using Poisson isn’t the worst thing ever. • You actually get alpha deflation (the tests become conservative) • BUT: Many journals that are used to this kind of data will reject articles that do not use the proper technique