I’m not saying that you should use Bayesian inference for all your problems. I’m just giving seven different reasons to use Bayesian inference–that is, seven different scenarios where Bayesian inference is useful:
1. Decision analysis. You can pipe your posterior uncertainty into a decision analysis and work out the expected utility of different decision options. This is O.G. Bayesian decision making, and we have some examples in Chapter 9 of BDA3. Indeed, this formulation can handle uncertainties in the utilities as well as in the probabilities that come up in the decision problem. (There’s a little code sketch of this right after the list.)
2. Propagation of uncertainty. You can use posterior simulations to get uncertainties for any function of parameters, latent data, and predictive data. For a simple example, suppose you fit a line, y = a + bx + error, and you want inference for the x-intercept, the value of x where E(y|x) = 0. This is just the solution to the equation 0 = a + bx, that is, x = -a/b. The point is, if you fit the regression and get posterior simulations for (a, b), you can directly get uncertainty for -a/b, a problem that can be challenging analytically, especially as models get more complicated. I actually don’t recommend you look at such ratios; here I’m just giving this as a simple example. (There’s a sketch of this one after the list too.)
3. Prior information. OK, this one should be pretty clear! For a simple example, see Section 9.4 of Regression and Other Stories. Or this paper with van Zwet. The point is that there are lots of real problems where the prior is about as strong as the data, or stronger, and it’s good to have a method that works in such problems. (There’s a toy numerical version after the list.)
4. Regularization. Set up a big model with sparse data and your parameter estimates are gonna be noisy. Bayesian inference with informative priors is one way to “regularize,” that is, to get more stable estimates. This is different from item #3 above in that the prior used for regularization doesn’t need to represent subject-matter knowledge; it can just be set up with the goal of producing regularized estimates that have good statistical properties. The regularization prior is still “information” in the mathematical sense, but it’s coming from a different place and it can map to the external world in a different way. (Again, there’s a sketch after the list.)
5. Combining multiple sources of information. Just think about multilevel modeling. Here’s an example from pharmacology where we use soft constraints (that is, informative priors) to combine data from trials of two different drugs. Or multilevel regression and poststratification, where even if we’re using data from just one survey, we’re getting inference for multiple states, and you can think of this as combining the survey responses from different states. Also you’re combining information from a survey and a census.
6. Latent data and parameters. When a model is full of parameters–perhaps even more parameters than data points–you can’t estimate them all using classical point estimation. Instead, we can consider the parameters as “latent data” and give them a joint probability model, which we can then estimate using Bayesian inference. Examples include latent continuous preferences underlying choice data, unobserved internal concentrations in pharmacology models, and intermediate states in all sorts of process models.
7. Enabling you to go further. Lots of the above ideas revolve around Bayesian inference being useful for models that are too large or complicated to estimate using maximum likelihood or other traditional approaches to point estimation. In practice, this implies that Bayesian inference doesn’t just allow us to fit models we otherwise would’ve had difficulty fitting; it also enables us to push out the frontier. Now that we can fit more complicated models, we’re more likely to do so. For example, we’ll set up latent-variable measurement error models where before we would’ve just ignored the errors or applied some analytical correction.
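To make item 1 concrete, here’s a minimal sketch of decision analysis from posterior draws: for each decision option, average a utility function over the draws and pick the option with the highest expected utility. The draws, the options, and the utility function here are all made up for illustration; in a real problem the draws would come from your fitted model and the utilities from the applied context.

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend these are 4000 posterior draws of a treatment effect
# (in a real analysis they would come from the fitted model).
theta = rng.normal(loc=0.3, scale=0.2, size=4000)

def utility(decision, theta):
    # Hypothetical utility: treating costs 0.1 and pays off the effect theta.
    if decision == "treat":
        return theta - 0.1
    return np.zeros_like(theta)  # "do nothing": no cost, no benefit

options = ["treat", "do nothing"]
expected_utility = {d: utility(d, theta).mean() for d in options}
best = max(expected_utility, key=expected_utility.get)
print(expected_utility, "-> choose:", best)
```

If the utilities themselves are uncertain, make the cost a set of draws too and average over everything; that’s the point about handling uncertainty in the utilities as well as in the probabilities.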
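Here’s the x-intercept example from item 2. The “posterior draws” of (a, b) below are simulated from made-up normals just to show the mechanics; with real posterior simulations you’d do exactly the same thing, keeping in mind that a ratio like -a/b blows up when b has posterior mass near zero, which is one reason not to get too attached to such summaries.

```python
import numpy as np

rng = np.random.default_rng(2)

# Pretend posterior draws of (a, b) from fitting y = a + b*x + error.
a = rng.normal(loc=2.0, scale=0.3, size=4000)
b = rng.normal(loc=-0.5, scale=0.1, size=4000)

x_intercept = -a / b                            # apply the function to each draw
print(np.mean(x_intercept))                     # posterior mean of -a/b
print(np.percentile(x_intercept, [2.5, 97.5]))  # 95% posterior interval
```

The same few lines work for any function of the parameters, latent data, or predictive data.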
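For item 3, the simplest possible numerical illustration: a normal prior combined with a normal likelihood with known standard errors, with the numbers invented so that the prior carries about as much information as the data.

```python
# Hypothetical prior from earlier studies: effect ~ normal(0.2, 0.1).
prior_mean, prior_sd = 0.2, 0.1
# Hypothetical new data: estimate 0.5 with standard error 0.1,
# so the prior is about as strong as the data.
data_est, data_se = 0.5, 0.1

w = (1 / prior_sd**2) / (1 / prior_sd**2 + 1 / data_se**2)  # precision weight on the prior
post_mean = w * prior_mean + (1 - w) * data_est
post_sd = (1 / (1 / prior_sd**2 + 1 / data_se**2)) ** 0.5
print(post_mean, post_sd)  # about 0.35 +/- 0.07: pulled halfway toward the prior
```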
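And a sketch of the regularization idea in item 4, in the simplest setting where the shrinkage can be written in closed form: normal data, a normal(0, tau) prior on each group effect, and the posterior mean pulling each noisy raw estimate toward zero. All the numbers are invented; the point is just that the regularized estimates end up closer to the truth than the raw ones.

```python
import numpy as np

rng = np.random.default_rng(3)

n_groups, n_per_group, sigma, tau = 50, 5, 10.0, 2.0
true_effects = rng.normal(0.0, tau, size=n_groups)
data = rng.normal(true_effects[:, None], sigma, size=(n_groups, n_per_group))

raw = data.mean(axis=1)              # noisy, unregularized estimates
se2 = sigma**2 / n_per_group         # sampling variance of each raw mean
shrinkage = tau**2 / (tau**2 + se2)  # weight on the data under a normal(0, tau) prior
shrunken = shrinkage * raw           # partial pooling toward the prior mean of 0

print("rmse, raw estimates:     ", np.sqrt(np.mean((raw - true_effects) ** 2)))
print("rmse, shrunken estimates:", np.sqrt(np.mean((shrunken - true_effects) ** 2)))
```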
That said, I recognize that Bayesian inference can require some effort and that any specific benefit of Bayesian inference can be attained using other methods: just use regularization to estimate parameters, treat uncertain quantities as latent data, and summarize uncertainty using joint predictive simulations. No reason to call it “Bayesian,” and it can sometimes be ok–even useful–to use incoherent specifications for different parts of your problem. Bayesian inference works for the problems I’ve worked on, by way of the tools that I’ve learned and developed to build, fit, check, compare, expand, visualize, and use models. Other tools will work for other people in other settings.
The reason for this post is to lay out several different motivations for Bayesian inference. It’s easy to just pick one or two (for example, prior information and regularization, or decision analysis and propagation of uncertainty) without seeing the big picture. So I thought it would be helpful to put all these different things in one place.