Are you wondering when you should use Bayesian regression over standard frequentist regression? Or maybe you are trying to decide whether you should use Bayesian regression or another machine learning model. Either way, you are in the right place!
In this article, we tell you everything you need to know in order to decide whether a Bayesian regression model is the right option for you. We start out by talking about what type of outcomes Bayesian regression models can handle. After that, we go over some of the main pros and cons of Bayesian regression models. Finally, we discuss specific scenarios where you should and should not use Bayesian regression models.
Outcomes that Bayesian regression can handle
When we talk about Bayesian regression models in this article, we are not strictly talking about one specific Bayesian model, such as a Bayesian linear regression model. Instead, we are talking about a family of Bayesian regression models that can service a range of different types of outcome variables.
For example, there are Bayesian equivalents of linear regression for numeric outcomes, logistic regression for binary outcomes, and Poisson regression for count outcomes. All of these models are considered in this article.
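To make the idea of a "family" of models concrete, here is a minimal sketch in plain Python. It shows how the three models can share the same prior and differ only in the likelihood that matches the outcome type. The function name `log_posterior`, the single-coefficient setup, the fixed noise level, and the Normal(0, 10) prior are all illustrative assumptions, not the API of any particular library.

```python
import math

def log_posterior(beta, xs, ys, family, prior_sd=10.0):
    """Unnormalized log-posterior for a one-coefficient model with
    linear predictor beta * x. Hypothetical helper for illustration."""
    # Normal(0, prior_sd) prior on the coefficient (an assumed default).
    log_prior = -0.5 * (beta / prior_sd) ** 2

    log_lik = 0.0
    for x, y in zip(xs, ys):
        eta = beta * x  # linear predictor
        if family == "gaussian":     # numeric outcome, noise sd fixed at 1
            log_lik += -0.5 * (y - eta) ** 2
        elif family == "bernoulli":  # binary outcome, logit link
            log_lik += y * eta - math.log1p(math.exp(eta))
        elif family == "poisson":    # count outcome, log link
            log_lik += y * eta - math.exp(eta) - math.lgamma(y + 1)
        else:
            raise ValueError(f"unknown family: {family}")
    return log_prior + log_lik

# Made-up data, one small dataset per outcome type.
xs = [0.5, 1.0, 1.5, 2.0]
print(log_posterior(0.3, xs, [0.1, 0.4, 0.4, 0.7], "gaussian"))
print(log_posterior(0.3, xs, [0, 0, 1, 1], "bernoulli"))
print(log_posterior(0.3, xs, [1, 0, 2, 3], "poisson"))
```

Only the likelihood branch changes between the three cases. The prior, and everything a Bayesian library would do with the resulting posterior, stays the same.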
Advantages and disadvantages of Bayesian regression
Here are some of the main advantages and disadvantages of Bayesian regression models.
Advantages of Bayesian regression
- Allow you to incorporate prior knowledge. Bayesian regression models make use of prior distributions that allow you to incorporate external knowledge into your model. There is no straightforward way to do this in a standard frequentist regression model. This is a big plus in situations where you have strong prior knowledge about your domain.
- Perform well on small sample sizes. Bayesian regression models tend to perform better than standard frequentist regression models when you are working with a small sample size. This is especially true if you have external information that you can incorporate into your model through the prior.
- Intervals have a straightforward interpretation. Credible intervals, which are the Bayesian version of confidence intervals, have a much more straightforward interpretation than the confidence intervals that you get from standard frequentist regression models. A 95% credible interval is a range that contains the parameter with 95% probability, given the model, the prior, and the data.
- Can incorporate automatic variable selection. Another benefit of Bayesian regression models is that if you use a sparsity-inducing prior, such as a spike-and-slab or horseshoe prior, you can get automatic variable selection in your model. There are frequentist regression models, such as the LASSO model, that have similar properties. However, in these frequentist models, the variable selection often comes at the cost of interpretability, because it is harder to get valid standard errors and confidence intervals for the selected coefficients.
- More flexible. Another benefit of Bayesian models is that they are generally more flexible than frequentist regression models. If you find yourself in a tricky situation where you need to incorporate some non-standard logic into your model, you might want to look at Bayesian regression models.
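The first three advantages can be illustrated with a small worked example. The sketch below assumes the simplest possible setting: a single slope, a Normal prior on that slope, and a known noise level, which gives a closed-form (conjugate) posterior. The function name `slope_posterior`, the data, and the prior values are made up for illustration.

```python
from statistics import NormalDist

def slope_posterior(xs, ys, prior_mean, prior_sd, noise_sd):
    """Posterior for beta in y = beta * x + Normal(0, noise_sd) noise.

    Assumes a Normal(prior_mean, prior_sd) prior and a known noise_sd,
    which makes the posterior Normal with a closed form.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = sum(x * x for x in xs) / noise_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (
        prior_mean * prior_precision
        + sum(x * y for x, y in zip(xs, ys)) / noise_sd ** 2
    ) / post_precision
    return NormalDist(post_mean, post_precision ** -0.5)

# Made-up data: only four observations.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.5, 3.9, 6.8, 8.1]

# Least-squares slope through the origin, for comparison.
ols_slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Assumed domain knowledge: the slope is probably close to 1.5.
posterior = slope_posterior(xs, ys, prior_mean=1.5, prior_sd=0.25, noise_sd=1.0)

# 95% credible interval: given the model and prior, the slope lies in
# this range with 95% posterior probability.
lower, upper = posterior.inv_cdf(0.025), posterior.inv_cdf(0.975)
print(f"OLS slope: {ols_slope:.3f}")
print(f"Posterior mean: {posterior.mean:.3f}, 95% CrI: ({lower:.3f}, {upper:.3f})")
```

With so few observations, the posterior mean lands between the prior mean and the least-squares estimate. In realistic models the posterior usually has no closed form, and a Bayesian library approximates it instead.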
Disadvantages of Bayesian regression
- Takes longer to train. One of the main disadvantages of Bayesian regression is that Bayesian regression models tend to take longer to train than standard frequentist regression models. They are usually fit with sampling algorithms, such as Markov chain Monte Carlo, that need many passes over the data. This can cause issues if you have a large set of training data that you want to take into consideration.
- Takes longer for inference. In addition to taking longer to train, Bayesian models also generally take longer for inference. In some cases this might not be a big deal, but in other cases this will be. If you need a model that makes fast predictions, you may be better off using a standard regression model or another machine learning model.
- Not available in many common libraries. Another disadvantage of Bayesian regression models is that you often need to use special libraries that are specifically tailored for Bayesian applications to train them. This may be problematic if you code in an environment with strict dependency control where it is difficult to install new packages.
- Common regression pitfalls. Bayesian regression models are still affected by many of the common pitfalls that affect standard frequentist regression models. For example, Bayesian regression models still assume a linear relationship between the features and the outcome variable, or a transformation of it, as in logistic and Poisson regression. They also require you to specify any interactions you want to be considered ahead of time.
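To see where the extra training and inference time comes from, here is a minimal sketch of a random-walk Metropolis sampler for a one-coefficient Bayesian logistic regression. The data, the Normal(0, 2) prior, the step size, and the helper names are assumptions made for illustration. Dedicated Bayesian libraries use more efficient samplers, but the overall pattern is similar.

```python
import math
import random

def log_post(beta, xs, ys, prior_sd=2.0):
    """Unnormalized log-posterior: logistic likelihood, Normal(0, prior_sd) prior."""
    log_lik = sum(
        y * beta * x - math.log1p(math.exp(beta * x)) for x, y in zip(xs, ys)
    )
    return log_lik - 0.5 * (beta / prior_sd) ** 2

def metropolis(xs, ys, n_steps=5000, step_sd=0.5, seed=0):
    """Random-walk Metropolis sampler for a single coefficient."""
    rng = random.Random(seed)
    beta = 0.0
    current = log_post(beta, xs, ys)
    samples = []
    for _ in range(n_steps):
        proposal = beta + rng.gauss(0.0, step_sd)
        candidate = log_post(proposal, xs, ys)  # one full pass over the data
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
        samples.append(beta)
    return samples

def predict_proba(x_new, samples):
    """Posterior predictive probability: average the prediction over all draws."""
    return sum(1.0 / (1.0 + math.exp(-b * x_new)) for b in samples) / len(samples)

# Made-up binary data.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

samples = metropolis(xs, ys)
kept = samples[1000:]  # discard early draws as burn-in
print(f"Posterior mean of beta: {sum(kept) / len(kept):.2f}")
print(f"P(y = 1 | x = 1): {predict_proba(1.0, kept):.2f}")
```

Training here means thousands of passes over the data rather than a single optimization, and each prediction averages over thousands of posterior draws rather than evaluating one fitted coefficient.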
When to use Bayesian regression
When should you use Bayesian regression models over standard frequentist regression models or other machine learning models? Here are some examples of cases where Bayesian regression is a good bet.
- Small sample size. Bayesian inference tends to be particularly useful in cases where you have a small sample size. If you want to build a model that is relatively complex, but you do not have a lot of data available to you, then Bayesian regression is a great option. In fact, it is likely one of the best options you have! Many other machine learning models require large sample sizes to function well.
- Strong prior knowledge. If you have strong external knowledge that you want to incorporate into your model, using a Bayesian model is the most straightforward way to do so. The smaller the size of the dataset you use, the more pronounced the effect of the prior information will be.
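The link between sample size and the influence of the prior can be made concrete. The sketch below assumes the simplest case, estimating a mean (an intercept-only regression) with a Normal prior and a known noise level, where the posterior mean is a weighted average of the prior mean and the sample mean. The function name `prior_weight` and the default values are illustrative assumptions.

```python
def prior_weight(n, prior_sd=1.0, noise_sd=1.0):
    """Share of the posterior mean that comes from the prior mean.

    In this conjugate setting:
    posterior_mean = w * prior_mean + (1 - w) * sample_mean
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    return prior_precision / (prior_precision + data_precision)

for n in (5, 50, 5000):
    print(f"n = {n:>4}: prior weight = {prior_weight(n):.4f}")
```

Under these assumptions the prior contributes roughly 17% of the estimate with 5 observations, about 2% with 50, and about 0.02% with 5,000. This is why prior knowledge matters most on small datasets.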
When not to use Bayesian regression
When should you avoid using Bayesian regression? Here are some examples of use cases where you should avoid using Bayesian regression.
- Large datasets. Bayesian regression models generally get more difficult to work with as the size of your dataset increases. This is because Bayesian models tend to be computationally intensive. Additionally, as the size of your dataset grows, the information incorporated in your prior distribution has less influence on the results. That means that some of the benefits of using Bayesian regression fade away. You may be better off using a standard linear regression, logistic regression, or Poisson regression model.
- Real-time predictions. If you want to use your model to make real-time predictions on a website or web app, you may be better off using a different model. Bayesian models generally require you to use specific Bayesian packages that may be more difficult to serve in production. Additionally, if the speed of inference is important to you, Bayesian models tend to have slower inference than frequentist models.
Are you trying to figure out which machine learning model is best for your next data science project? Check out our comprehensive guide on how to choose the right machine learning model.