Are you wondering when you should use Poisson regression? Then you are in the right place! In this article, we cover what you need to know to decide whether Poisson regression is the right choice. We start by explaining the main advantages and disadvantages of Poisson regression to give the discussion some context. After that, we describe the situations in which you should and should not use Poisson regression.
What types of outcomes can Poisson regression handle?
Before we get into the main advantages and disadvantages of Poisson regression, we will first talk about the types of outcomes that Poisson regression can be used for. Poisson regression is generally used when your outcome variable is a count variable. That means that the quantity you are trying to predict should be a count of something, such as the number of events that occur in a fixed period of time.
Poisson regression might also work in cases where you have non-negative numeric outcomes that are distributed similarly to count data, but the Poisson regression model was originally intended for the count-based scenario.
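As a quick sanity check before modeling, you can confirm that your outcome really looks like count data. The snippet below is a minimal sketch that uses only the Python standard library. The function name `looks_like_counts` and the example values are made up for illustration.

```python
def looks_like_counts(values):
    """Return True if every value is a non-negative whole number."""
    return all(
        isinstance(v, (int, float))
        and not isinstance(v, bool)
        and v >= 0
        and float(v).is_integer()
        for v in values
    )


# Hypothetical outcomes, for illustration only.
visits_per_day = [0, 3, 1, 7, 2]      # whole-number counts
session_minutes = [0.5, 12.25, 3.0]   # continuous measurements

print(looks_like_counts(visits_per_day))
print(looks_like_counts(session_minutes))
```

An outcome that fails this check is not necessarily off limits, but it is a sign that you are outside the scenario the model was designed for.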
Advantages and disadvantages of Poisson regression
Now we will talk about some of the main advantages and disadvantages of Poisson regression. This will provide some useful context that will help you understand why we recommend using Poisson regression in some situations rather than others.
Advantages of Poisson regression
- Simple model. The main advantage of using simple Poisson regression rather than another regression model for count data, such as a zero-inflated Poisson, negative binomial, or zero-inflated negative binomial model, is the simplicity of the model itself. Poisson regression has fewer parameters to estimate than these alternatives (negative binomial regression, for example, adds a dispersion parameter), so it is a good choice in cases where estimating parameters may be difficult (e.g., a small sample size).
Disadvantages of Poisson regression
- Mean equals variance. One of the main disadvantages of the Poisson regression model compared to other count-based regression models is that it assumes the variance of your outcome is equal to its mean, given the values of your predictors. This assumption holds in some cases where you are dealing with count data, but it is often violated in practice. If this assumption is not met, the standard errors of your Poisson regression model will be incorrect. When the variance is larger than the mean, they will typically be too small, which makes your estimates look more precise than they really are.
- No natural zero inflation. When we talk about Poisson regression, we are specifically referring to a simple Poisson regression with no adjustments made for zero-inflated distributions. That means that Poisson regression models do not perform well when the outcome has many more zero counts than a Poisson distribution would predict.
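Both disadvantages can be given a rough first look with a few summary statistics. The sketch below compares the sample variance to the sample mean, and the observed share of zeros to the share a single Poisson distribution with the same mean would produce, which is exp(-mean). The function name `raw_count_checks` and the data are hypothetical. Keep in mind that the model's assumptions apply after accounting for your predictors, so these raw checks are only a starting point.

```python
import math
from statistics import mean, variance


def raw_count_checks(counts):
    """Summarize how far raw counts are from a single Poisson distribution."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; the ratios are undefined")
    v = variance(counts)  # sample variance (n - 1 in the denominator)
    observed_zero_share = sum(1 for c in counts if c == 0) / len(counts)
    poisson_zero_share = math.exp(-m)  # P(Y = 0) for a Poisson with mean m
    return {
        "mean": m,
        "variance": v,
        "variance_to_mean": v / m,
        "observed_zero_share": observed_zero_share,
        "poisson_zero_share": poisson_zero_share,
    }


# Hypothetical outcome with many zeros and a few large counts.
counts = [0, 0, 0, 0, 0, 0, 1, 2, 9, 14]
for name, value in raw_count_checks(counts).items():
    print(f"{name}: {value:.3f}")
```

A variance-to-mean ratio far above 1, or far more zeros than the Poisson share, suggests that one of the alternative count models may be worth a look.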
When to use Poisson regression
- Simple baseline. The Poisson regression model is a great model to reach for anytime you need a simple baseline model for count data. The Poisson regression model is simpler than other count-based regression models like zero-inflated Poisson, negative binomial, and zero-inflated negative binomial, and it has the fewest parameters to fit. That means it is a great baseline to compare against to ensure that any added complexity introduced in later models actually provides performance benefits.
- Small sample size. Since the Poisson regression model is simple and has fewer parameters to estimate, it is also a great option to turn to when you are working with a relatively small sample size. In general, the more parameters you need to estimate, the more data you will need to do so.
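To make the "few parameters" point concrete, here is a minimal sketch of a Poisson regression with a single predictor, fitted with Newton's method using only the Python standard library. The model has just two parameters, an intercept `b0` and a slope `b1`, with log(mean) = b0 + b1 * x. The function name `fit_poisson_regression` and the data are made up for illustration, and in practice you would normally use a statistics library rather than hand-rolled code like this.

```python
import math


def fit_poisson_regression(x, y, n_iter=25, tol=1e-10):
    """Fit log(mean) = b0 + b1 * x by Newton's method; return (b0, b1)."""
    # Start from the intercept-only fit (assumes the mean of y is positive).
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian (a 2x2 matrix), inverted in closed form below.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Hypothetical small sample: eight observations of one predictor.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 1, 2, 2, 4, 5, 9, 11]

b0, b1 = fit_poisson_regression(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

The fitted means from a baseline like this give you a reference point to compare against when you try a more complex count model.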
When not to use Poisson regression
- Zero inflation. If the distribution of your outcome variable is zero-inflated, meaning that you see an excessive number of zeros when you look at the distribution of your outcome variable, then you should consider using a zero-inflated Poisson model.
- Overdispersion. If you have reason to believe that your data are overdispersed, then you may be better off using a negative binomial model than a Poisson model. Overdispersion simply means that the variance of your outcome is greater than its mean. Poisson regression uses a single parameter to describe both the mean and the variance of the distribution, whereas negative binomial regression allows for additional flexibility by including an extra dispersion parameter that lets the variance exceed the mean.
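One common way to look for overdispersion after fitting a Poisson model is the Pearson dispersion statistic: the sum of squared Pearson residuals divided by the residual degrees of freedom. The sketch below assumes you already have observed counts and fitted means from a model. The function name `pearson_dispersion` and all of the numbers are hypothetical.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / (len(observed) - n_params)


# Hypothetical observed counts and fitted means from a two-parameter model.
observed = [0, 0, 5, 1, 12, 0, 9, 2]
fitted = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

dispersion = pearson_dispersion(observed, fitted, n_params=2)
print(f"dispersion={dispersion:.2f}")
```

Values close to 1 are consistent with the Poisson assumption, while values well above 1 point toward overdispersion. There is no hard cutoff, so treat this as one piece of evidence rather than a definitive test.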
Related articles
- When to use Bayesian regression
- When to use logistic regression
- When to use ordinal logistic regression
- When to use multinomial regression
- When to use linear regression
- When to use ridge regression
- When to use LASSO
- When to use mixed models
- When to use generalized additive models