9.7 Prior Predictions

Before observing a quantity \(y\), we can provide its predictive distribution by integrating out the unknown parameter,

\[ p(y) = \int p(y|\theta) p(\theta) d\theta. \]

Predictions are useful in many settings, for example forecasting, cost-effectiveness models and design of studies. In the trial described earlier in this section, we had 10 patients. Suppose we are interested in predicting the number of patients who will have a positive response. Recall that the Beta distribution is a suitable prior distribution for \(\theta\), the proportion of positive responses. We have:

\[\begin{split} \begin{align*} \theta &\sim \hbox{Beta}(a,b) \\ y &\sim \hbox{Binomial}(\theta,n) \end{align*} \end{split}\]
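The integral defining \(p(y)\) can also be approximated by simulation: draw \(\theta\) from the prior, then draw \(y\) given that \(\theta\), and repeat. The following is a minimal sketch using only the Python standard library. The function name `simulate_prior_predictive`, the number of draws and the seed are illustrative assumptions, and the shape values \(a=2\), \(b=9\) are the ones used in the worked example of this section.

```python
import random


def simulate_prior_predictive(a, b, n, draws, seed=0):
    """Draw samples of y by first sampling theta from the Beta(a, b) prior."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    samples = []
    for _ in range(draws):
        theta = rng.betavariate(a, b)  # theta ~ Beta(a, b)
        # y | theta ~ Binomial with n trials: sum of n Bernoulli(theta) draws
        y = sum(rng.random() < theta for _ in range(n))
        samples.append(y)
    return samples


samples = simulate_prior_predictive(a=2, b=9, n=10, draws=20_000)

# Relative frequencies approximate p(y) for y = 0, ..., n
freq = [samples.count(y) / len(samples) for y in range(11)]
mc_mean = sum(samples) / len(samples)
print("approximate p(y):", [round(f, 3) for f in freq])
print(f"Monte Carlo estimate of E(y): {mc_mean:.2f}")
```

The relative frequencies are only an approximation of \(p(y)\); more draws reduce the simulation error.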

The exact predictive distribution \(p(y)\) can be computed analytically and is known as the Beta-Binomial distribution. It has three parameters: the number of trials \(n\) and the shape parameters \(a\) and \(b\). Its probability mass function and mean are:

\[\begin{split} \begin{align*} p(y) &= \frac{ \Gamma (a+ b)}{ \Gamma (a) \Gamma (b) } {n \choose y} \frac{\Gamma (a+ y) \Gamma (b+n-y)}{\Gamma (a+b+n)} \\ E(y) &= n \frac{a}{a+b} \end{align*} \end{split}\]
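As a sketch of where this form comes from, assuming the Binomial likelihood and Beta prior above: the product of the two is an unnormalised \(\hbox{Beta}(a+y, b+n-y)\) density in \(\theta\), so the integral equals that density's normalising constant.

\[\begin{split} \begin{align*} p(y) &= \int_0^1 {n \choose y} \theta^{y} (1-\theta)^{n-y} \, \frac{\Gamma (a+b)}{\Gamma (a) \Gamma (b)} \theta^{a-1} (1-\theta)^{b-1} \, d\theta \\ &= \frac{\Gamma (a+b)}{\Gamma (a) \Gamma (b)} {n \choose y} \int_0^1 \theta^{a+y-1} (1-\theta)^{b+n-y-1} \, d\theta \\ &= \frac{\Gamma (a+b)}{\Gamma (a) \Gamma (b)} {n \choose y} \frac{\Gamma (a+y) \Gamma (b+n-y)}{\Gamma (a+b+n)} \end{align*} \end{split}\]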

With the asymmetrical \(\hbox{Beta}(2, 9)\) prior and \(n = 10\) patients, the predictive distribution is:

\[ p(y) = \frac{ \Gamma (11)}{ \Gamma (2) \Gamma (9) } {10 \choose y} \frac{\Gamma (2+ y) \Gamma (19-y)}{\Gamma (21)}, \]

with \(E(y) = 10 \, \frac{2}{11} \approx 1.82\). So, before observing any data, we would predict around 2 patients out of 10 to have a positive response.
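The prior predictive probabilities for this example can be evaluated numerically. The sketch below codes the Beta-Binomial formula directly with the standard library, working on the log scale with `math.lgamma` to avoid large Gamma values. The helper name `beta_binomial_pmf` is an illustrative assumption, not part of any library.

```python
from math import comb, exp, lgamma


def beta_binomial_pmf(y, n, a, b):
    """Beta-Binomial probability of y successes in n trials, shapes a and b."""
    # Log of Gamma(a+b) / (Gamma(a) Gamma(b)) * Gamma(a+y) Gamma(b+n-y) / Gamma(a+b+n)
    log_ratio = (
        lgamma(a + b) - lgamma(a) - lgamma(b)
        + lgamma(a + y) + lgamma(b + n - y) - lgamma(a + b + n)
    )
    return comb(n, y) * exp(log_ratio)


n, a, b = 10, 2, 9  # trial size and Beta(2, 9) prior from the example
prior_pred = [beta_binomial_pmf(y, n, a, b) for y in range(n + 1)]
prior_mean = sum(y * p for y, p in enumerate(prior_pred))

for y, p in enumerate(prior_pred):
    print(f"p(y={y:2d}) = {p:.4f}")
print(f"E(y) = {prior_mean:.4f}")
```

The probabilities should sum to 1 and the mean should agree with \(n \frac{a}{a+b}\), which is a quick check that the formula has been coded correctly.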

9.7.1 Posterior Prediction

Suppose that we have observed \(y\) and want to predict future observations \(z\), assuming that \(z\) and \(y\) are independent, conditional on \(\theta\). The posterior predictive distribution for \(z\) is given by

\[\begin{split} \begin{align*} p(z|y) &= \int p(z, \theta | y) d \theta \\ &= \int p(z |y, \theta) p(\theta |y ) d \theta \\ &= \int p(z | \theta) p(\theta |y ) d \theta \end{align*} \end{split}\]

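The last step uses the conditional independence of \(z\) and \(y\) given \(\theta\). The sampling distribution of \(z\) is now weighted by our posterior belief about \(\theta\) after having observed \(y\), rather than by the prior.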

For our example, we found that the posterior distribution \(p(\theta |y )\) is a Beta(\(a+y, b+n-y\)) distribution. Thus, if \(z\) is the number of successes in \(n_p\) future trials, our posterior predictive distribution is a Beta-Binomial distribution with \(n_p\) trials and shape parameters \(a+y\) and \(b+n-y\).

Now suppose we use the asymmetrical \(\hbox{Beta}(2, 9)\) prior, observe that \(y=4\) patients out of \(n=10\) had a successful result, and wish to predict how many successes \(z\) to expect out of \(n_p=20\) future patients. Our posterior predictive distribution is a Beta-Binomial with \(n_p = 20\) trials and shape parameters \(2+4 = 6\) and \(9+10-4 = 15\). Its expectation is \(E(z|y) = 20 \, \frac{6}{21} \approx 5.71\), so we would predict around 6 successes.
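The same calculation can be sketched in code: update the shape parameters with the observed data, then evaluate the Beta-Binomial formula for the \(n_p = 20\) future patients. As before, `beta_binomial_pmf` and the variable names are illustrative assumptions, and only the Python standard library is used.

```python
from math import comb, exp, lgamma


def beta_binomial_pmf(k, n, a, b):
    """Beta-Binomial probability of k successes in n trials, shapes a and b."""
    log_ratio = (
        lgamma(a + b) - lgamma(a) - lgamma(b)
        + lgamma(a + k) + lgamma(b + n - k) - lgamma(a + b + n)
    )
    return comb(n, k) * exp(log_ratio)


a, b = 2, 9    # Beta(2, 9) prior
y, n = 4, 10   # observed successes and trial size
n_p = 20       # number of future patients

# Conjugate update: the posterior is Beta(a + y, b + n - y)
a_post, b_post = a + y, b + n - y

# Posterior predictive probabilities for z = 0, ..., n_p
post_pred = [beta_binomial_pmf(z, n_p, a_post, b_post) for z in range(n_p + 1)]
post_mean = sum(z * p for z, p in enumerate(post_pred))

print(f"posterior shape parameters: {a_post}, {b_post}")
print(f"E(z | y) = {post_mean:.2f}")
```

Having the full list `post_pred` rather than only the mean makes it straightforward to compute other summaries, such as the probability of at least a given number of successes.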