15.4 Estimating the parameters
Having specified our model, we now want to use a sample of data to obtain estimates of the model parameters.
15.4.1 Statistical model and observed data
Data: Suppose we have a sample of \(n\) people. Person \(i\) has an observed \(X\) value of \(x_i\) and an observed outcome \(y_i\). Therefore, our sample of data consists of \(\{ (x_i, y_i); i = 1, 2, \ldots, n\}\).
Statistical model: Our statistical model assumes that these observations are independent (between people) and are drawn from the distribution:

\[ Y_i | X_i = x_i \sim \text{Bernoulli}(\pi_i) \]

where

\[ \pi_i = Pr(Y_i = 1 | X_i = x_i). \]

We further have a model relating the outcome to the independent variable:

\[ \log \left( \frac{\pi_i}{1 - \pi_i} \right) = \beta_0 + \beta_1 x_i, \qquad \text{equivalently} \qquad \pi_i = \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}}. \]

We can put these aspects together to write our statistical model concisely as:

\[ Y_i | X_i = x_i \sim \text{Bernoulli} \left( \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}} \right), \quad \text{independently for } i = 1, \ldots, n. \]
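To make the model concrete, here is a minimal Python sketch that simulates a sample of \(n\) people from this model. The parameter values, sample size, and distribution of \(X\) are purely illustrative assumptions, not part of the model itself.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical parameter values and sample size, chosen only for illustration
beta0, beta1 = -1.0, 0.5
n = 200

x = rng.normal(size=n)                # each person's observed X value
eta = beta0 + beta1 * x               # linear predictor: beta0 + beta1 * x_i
pi = np.exp(eta) / (1 + np.exp(eta))  # pi_i = e^eta_i / (1 + e^eta_i)
y = rng.binomial(1, pi)               # Y_i | X_i = x_i ~ Bernoulli(pi_i)
```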
Note
In the previous section, we used the notation \(\pi_x\) to emphasise that this probability is conditional on the value of \(X\). Now that we are applying the distribution to a sample of people, we have changed to \(\pi_i\) to emphasise that the probability is conditional on whatever value \(X\) takes for person \(i\).
15.4.2 Maximum likelihood estimation
We first need to derive the likelihood of the model. We assume that observations (people) are independent of each other, thus:

\[ L(\beta_0, \beta_1) = \prod_{i=1}^{n} Pr(Y_i = y_i | X_i = x_i). \]

We can write this as:

\[ L(\beta_0, \beta_1) = \prod_{i=1}^{n} Pr(Y_i = 1 | X_i = x_i)^{y_i} \left\{ 1 - Pr(Y_i = 1 | X_i = x_i) \right\}^{1 - y_i}. \]
This works because, when the observed outcome is \(y_i = 1\), the second term above equals 1 (recall that \(x^0 = 1\) for any \(x\)), so we are left with just \(Pr(Y_i = 1|X_i=x_i)\), which is equal to \(Pr(Y_i = y_i|X_i=x_i)\) when \(y_i = 1\). Conversely, when \(y_i = 0\), the first term becomes 1 and we are left with just the second term.
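A quick numerical check of this device (the value of \(p\) below is arbitrary, chosen only for illustration):

```python
p = 0.7  # an arbitrary value of Pr(Y_i = 1 | X_i = x_i)

for y_i in (1, 0):
    print(y_i, p**y_i * (1 - p)**(1 - y_i))
# y_i = 1 selects the first term, giving p
# y_i = 0 selects the second term, giving 1 - p
```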
Now \(Pr(Y_i = 1 | X_i = x_i)\) is just the fraction within the Bernoulli distribution above, so we can substitute this in to get:

\[ L(\beta_0, \beta_1) = \prod_{i=1}^{n} \left( \frac{e^{\beta_0 + \beta_1 x_i}}{1 + e^{\beta_0 + \beta_1 x_i}} \right)^{y_i} \left( \frac{1}{1 + e^{\beta_0 + \beta_1 x_i}} \right)^{1 - y_i}. \]
Taking the log of the above likelihood, we derive the following log-likelihood:

\[ \ell(\beta_0, \beta_1) = \sum_{i=1}^{n} \left\{ y_i \log \pi_i + (1 - y_i) \log(1 - \pi_i) \right\} = \sum_{i=1}^{n} \left\{ y_i (\beta_0 + \beta_1 x_i) - \log \left( 1 + e^{\beta_0 + \beta_1 x_i} \right) \right\}, \]

where the second equality follows because \(\log \pi_i = \beta_0 + \beta_1 x_i - \log(1 + e^{\beta_0 + \beta_1 x_i})\) and \(\log(1 - \pi_i) = -\log(1 + e^{\beta_0 + \beta_1 x_i})\).
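As a sketch of how this quantity might be computed, here is the log-likelihood written as a Python function. The function name is our own, and `np.logaddexp` is used purely as a numerically stable way to evaluate \(\log(1 + e^{\eta})\):

```python
import numpy as np

def log_likelihood(beta0, beta1, x, y):
    """Log-likelihood of the logistic model evaluated at (beta0, beta1)."""
    eta = beta0 + beta1 * x
    # sum_i [ y_i * eta_i - log(1 + e^{eta_i}) ];
    # np.logaddexp(0, eta) computes log(e^0 + e^eta) = log(1 + e^eta) stably
    return np.sum(y * eta - np.logaddexp(0, eta))
```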
By maximising this log-likelihood over the parameters \((\beta_0, \beta_1)\), we can obtain the maximum likelihood estimates of the parameters: \((\hat{\beta}_0, \hat{\beta}_1)\). There is no closed-form solution to this optimisation problem, so the maximisation over the parameters is done numerically.
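To illustrate this numerical step, here is a minimal sketch (not the specific algorithm used by any particular statistics package) that maximises the log-likelihood by minimising its negative with `scipy.optimize.minimize`, reusing the `log_likelihood` function and the simulated `x` and `y` from the sketches above:

```python
from scipy.optimize import minimize

def negative_log_likelihood(params, x, y):
    beta0, beta1 = params
    return -log_likelihood(beta0, beta1, x, y)

# Numerically maximise the log-likelihood by minimising its negative,
# starting from the arbitrary initial guess (0, 0)
result = minimize(negative_log_likelihood, x0=[0.0, 0.0],
                  args=(x, y), method="BFGS")

beta0_hat, beta1_hat = result.x
print(beta0_hat, beta1_hat)  # the maximum likelihood estimates
```

With the illustrative data above, the resulting estimates should lie close to the values \(\beta_0 = -1\) and \(\beta_1 = 0.5\) used in the simulation. In practice, statistical software such as R's `glm` or Python's `statsmodels` performs an equivalent numerical maximisation internally.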