14.7 Proofs

This section contains two important proofs. These are not examinable.

14.7.1 Proof for the ordinary least squares estimates in simple linear regression

Recall that the ordinary least squares (OLS) estimates of the intercept (\(\beta_0\)) and slope (\(\beta_1\)) in simple linear regression are:

\[\begin{split} \begin{align} \hat{\beta}_0 &= \bar{y} - \hat{\beta}_1\bar{x} \\ \hat{\beta}_1 &= \frac{\sum_{i=1}^n (x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^n (x_i-\bar{x})^2} \end{align} \end{split}\]
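As a quick numerical sanity check, the sketch below computes these closed-form estimates directly and compares them with a library fit. The data, and the use of Python/NumPy, are purely illustrative assumptions and are not part of this chapter.

```python
import numpy as np

# Hypothetical data, purely for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# Closed-form OLS estimates from the formulas above
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

# np.polyfit with degree 1 returns [slope, intercept]; the two fits should agree
slope, intercept = np.polyfit(x, y, 1)
print(beta0_hat, beta1_hat)   # 0.14, 1.96 for this data
print(intercept, slope)       # the same values, up to numerical precision
```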

Proof:

To solve for the value of \(\beta_0\) that minimises the residual sum of squares, \(SS_{RES}=\sum_{i=1}^n (y_i-\hat{\beta}_0-\hat{\beta}_1x_i)^2\), we differentiate \(SS_{RES}\) with respect to \(\hat{\beta}_0\) and set the derivative to zero:

\[ \frac{d(SS_{RES})}{d(\hat{\beta}_0)} = \sum_{i=1}^n -2(y_i-\hat{\beta}_0-\hat{\beta}_1x_i)=0 \]

Dividing through by \(-2\) and using \(\sum_{i=1}^n y_i=n\bar{y}\) and \(\sum_{i=1}^n x_i=n\bar{x}\), this simplifies to:

\[ -n\bar{y}+n\hat{\beta}_0+n\hat{\beta}_1\bar{x}=0 \]

Rearranging the above and dividing by \(n\) gives:

\[ \hat{\beta}_0=\bar{y}-\hat{\beta}_1\bar{x}. \]

To solve for the value of \(\beta_1\) that minimises \(SS_{RES}\), we have to differentiate with respect to \(\hat{\beta}_1\). First, we substitute in our solution for \(\hat{\beta}_0\) as follows:

\[ SS_{RES}=\sum_{i=1}^n(y_i-(\bar{y}-\hat{\beta}_1\bar{x})-\hat{\beta}_1x_i)^2=\sum_{i=1}^n ((y_i-\bar{y})-\hat{\beta}_1(x_i-\bar{x}))^2 \]

Now differentiating the above with respect to \(\hat{\beta}_1\) and setting the derivative to zero gives:

\[ \frac{d(SS_{RES})}{d(\hat{\beta}_1)} = \sum_{i=1}^n \left[-2(x_i-\bar{x})(y_i-\bar{y})+2\hat{\beta}_1(x_i-\bar{x})^2\right]=0 \]

Rearranging gives:

\[ \hat{\beta}_1\sum_{i=1}^n (x_i-\bar{x})^2 = \sum_{i=1}^n (x_i-\bar{x})(y_i-\bar{y}) \]
\[ \hat{\beta}_1 = \frac{\sum_{i=1}^n (x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^n(x_i-\bar{x})^2} \]
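This stationary point is a minimum because \(SS_{RES}\) is a quadratic in \(\hat{\beta}_1\) whose second derivative, \(2\sum_{i=1}^n(x_i-\bar{x})^2\), is positive. Continuing the hypothetical NumPy example above, a quick numerical check is to evaluate \(SS_{RES}\) at the closed-form estimate and at small perturbations either side:

```python
# SS_RES as a function of beta1 alone, with beta0 substituted out
# (this is the expression that was differentiated above)
def ss_res(beta1):
    return np.sum(((y - y_bar) - beta1 * (x - x_bar)) ** 2)

print(ss_res(beta1_hat))          # smallest of the three values
print(ss_res(beta1_hat - 0.1))    # larger
print(ss_res(beta1_hat + 0.1))    # larger
```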

14.7.2 Proof that the OLS estimates are also the maximum likelihood estimates

If \(Y_i \overset{iid}{\sim} N(\mu, \sigma^2)\), the log likelihood function for \(\mu\) is, up to an additive constant that does not depend on \(\mu\):

\[ l(\mu | y_1,...,y_n) = -\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \mu)^2 \]

So, for the simple linear regression model, where the mean of \(Y_i\) is \(\mu_i=\beta_0+\beta_1x_i\):

\[ l(\beta_0, \beta_1 | y_1,...,y_n) = -\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \beta_1x_i-\beta_0)^2. \]

Therefore, for any fixed positive value of \(\sigma^2\), maximising the log likelihood function is equivalent to minimising \(SS_{RES}\), and so the OLS estimates are also the maximum likelihood estimates of \(\beta_0\) and \(\beta_1\).
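To illustrate this equivalence numerically, the sketch below (again using the hypothetical data and NumPy objects from above, together with SciPy, none of which are part of this chapter) minimises the negative log likelihood for a fixed \(\sigma^2\) and recovers the same values as the closed-form OLS estimates:

```python
from scipy.optimize import minimize

def neg_log_lik(params, sigma2=1.0):
    beta0, beta1 = params
    # Negative of the log likelihood above for a fixed sigma^2; additive
    # constants are omitted since they do not affect where the maximum lies
    return np.sum((y - beta0 - beta1 * x) ** 2) / (2 * sigma2)

result = minimize(neg_log_lik, x0=[0.0, 0.0])
print(result.x)               # approximately [0.14, 1.96]
print(beta0_hat, beta1_hat)   # the closed-form OLS estimates from above
```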