9.8 Conjugacy

In the example with the Beta-Binomial model, we found that using a Beta distribution for the prior led to a posterior distribution that is also a Beta distribution. This is not a coincidence. Often, a particular distributional family is chosen for the prior so that the resulting posterior distribution belongs to the same family. Such a prior is called a conjugate prior. Below are the conjugate priors for some common likelihood models.

| Likelihood  | Conjugate Prior                |
|-------------|--------------------------------|
| Bernoulli   | Beta                           |
| Binomial    | Beta                           |
| Poisson     | Gamma                          |
| Geometric   | Beta                           |
| Normal      | Normal, Gamma and a few others |
| Exponential | Gamma                          |
| Gamma       | Gamma                          |
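To see conjugacy in action, the Beta-Binomial pairing from the table can be checked numerically: the product of a Beta prior and a Binomial likelihood should be proportional to a Beta density with updated parameters. The sketch below uses made-up values for the hyperparameters and data, and verifies that the ratio of the unnormalised posterior to the updated Beta kernel is constant in \(\theta\).

```python
from math import comb

# Illustrative values: Beta(a, b) prior, Binomial(n, theta) likelihood
a, b = 2.0, 3.0        # prior hyperparameters (made-up)
n, k = 10, 7           # 7 successes in 10 trials (made-up data)

# Conjugacy: the posterior is Beta(a + k, b + n - k)
post_a, post_b = a + k, b + n - k

def unnorm_posterior(theta):
    """Prior kernel times Binomial likelihood (unnormalised)."""
    prior = theta ** (a - 1) * (1 - theta) ** (b - 1)
    lik = comb(n, k) * theta ** k * (1 - theta) ** (n - k)
    return prior * lik

def beta_kernel(theta):
    """Kernel of the claimed Beta posterior."""
    return theta ** (post_a - 1) * (1 - theta) ** (post_b - 1)

# If the conjugacy claim holds, this ratio does not depend on theta
ratios = [unnorm_posterior(t) / beta_kernel(t)
          for t in (0.1, 0.3, 0.5, 0.7, 0.9)]
print(all(abs(r / ratios[0] - 1) < 1e-9 for r in ratios))  # → True
```

The ratio equals the constant \(\binom{n}{k}\) for every \(\theta\), confirming that prior and posterior lie in the same Beta family.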

9.8.1 Exercise

Suppose that there is an experiment where \(n\) patients are asked to try different treatments each time they get a headache. We are interested in the number of treatments a patient tries before finding one that is successful. For patient \(i\), for \(1 \leq i \leq n\), we denote by \(y_i\) the number of treatments tried up to and including the first success. Note that \(\left\{ y_1, y_2, ..., y_n \right\}\) is a sample from a Geometric distribution: \(y_i \sim Geom(\theta)\). The probability mass function of a Geometric distribution is:

\[p(y \mid \theta) = \theta (1-\theta)^{y-1}\]
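As a quick sanity check on this parameterisation, the probabilities \(\theta (1-\theta)^{y-1}\) should sum to one over \(y = 1, 2, 3, \dots\). A minimal sketch with a made-up value of \(\theta\):

```python
# Illustrative check: the Geometric pmf theta * (1 - theta)**(y - 1)
# sums to 1 over y = 1, 2, 3, ... (truncated here at a large cutoff)
theta = 0.3  # made-up success probability
total = sum(theta * (1 - theta) ** (y - 1) for y in range(1, 10_000))
print(round(total, 6))  # → 1.0
```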

Suppose we wish to make inference on \(\theta\). By specifying a Beta prior for \(\theta\): \(\theta \sim Beta(a, b)\), derive the posterior distribution of \(\theta\).

Try the exercise before reading the solution below.

Solution:

\[\begin{split} \begin{align*} p(\theta \mid y_1, ..., y_n) &\propto p(\theta) \prod_{i=1}^n p(y_i \mid \theta)\\ &\propto \frac{\Gamma(a+ b)}{ \Gamma(a)\Gamma(b) } \theta^{a-1} (1-\theta)^{b-1} \prod_{i=1}^n \theta (1-\theta)^{y_i-1}\\ &\propto \frac{\Gamma(a+ b)}{ \Gamma(a)\Gamma(b) } \theta^{a-1} (1-\theta)^{b-1} \theta^n (1-\theta)^{\sum_{i=1}^n y_i-n}\\ &\propto \theta^{a+n-1} (1-\theta)^{\sum_{i=1}^n y_i -n +b-1} \end{align*} \end{split}\]

This is the kernel of a Beta distribution, so the posterior is \(\theta \mid y_1, ..., y_n \sim Beta\left(a+n,\; b + \sum_{i=1}^n y_i - n\right)\).
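The derived update can be checked numerically in the same way as before. The sketch below simulates Geometric data with a made-up true success probability, forms the claimed posterior parameters \(a+n\) and \(b + \sum_i y_i - n\), and verifies that the unnormalised posterior is proportional to the corresponding Beta kernel.

```python
import random

random.seed(1)

def draw_geometric(theta):
    """Number of trials up to and including the first success."""
    y = 1
    while random.random() >= theta:
        y += 1
    return y

# Made-up experiment: n patients, true success probability theta_true
theta_true, n = 0.4, 50
ys = [draw_geometric(theta_true) for _ in range(n)]

# Conjugate update derived above: Beta(a + n, b + sum(ys) - n)
a, b = 1.0, 1.0  # uniform Beta(1, 1) prior (illustrative choice)
post_a, post_b = a + n, b + sum(ys) - n

def unnorm_posterior(theta):
    """Beta prior kernel times the Geometric likelihood."""
    p = theta ** (a - 1) * (1 - theta) ** (b - 1)
    for y in ys:
        p *= theta * (1 - theta) ** (y - 1)
    return p

def beta_kernel(theta):
    return theta ** (post_a - 1) * (1 - theta) ** (post_b - 1)

# Constant ratio across theta values confirms the Beta(post_a, post_b) claim
ratios = [unnorm_posterior(t) / beta_kernel(t) for t in (0.2, 0.4, 0.6)]
print(all(abs(r / ratios[0] - 1) < 1e-9 for r in ratios))  # → True
```

With a flat \(Beta(1, 1)\) prior, the posterior mean \((a+n)/(a+b+\sum_i y_i)\) lands close to the true success probability used in the simulation.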