16.2 Generalised Linear Model Components
A generalised linear model consists of three components:
A random component
This refers to the probability distribution of the outcome variable \(Y_{i}\) (for the \(i\)th of \(n\) independently sampled observations). It specifies the conditional distribution of the outcome given the values of the predictors (covariates) in the model. \(Y_{i}\) is generally assumed to follow a distribution from the exponential family. Subsequent work has extended GLMs to multivariate exponential families, to certain non-exponential families, and to situations where the distribution of \(Y_{i}\) is not completely specified. Within this chapter we will only explore applications to distributions from the exponential family (e.g. Normal, Gamma, Poisson, Bernoulli).
A systematic component (the linear predictor)
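This is the linear function of the predictors (covariates) in the model:
\[\eta_{i} = \beta_{0} + \beta_{1} X_{i1} + \beta_{2} X_{i2} + \dots + \beta_{p} X_{ip}\]
where \(p\) is the number of predictors and \(\beta_{0}, \dots, \beta_{p}\) are the regression coefficients.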
Just as in linear and logistic regression models, the predictors (covariates) \(X_{ij}\) may be continuous and/or categorical.
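As a small illustration of the systematic component, the sketch below evaluates a linear predictor for one continuous and one categorical (0/1 coded) covariate. The coefficient values, the variable names `age` and `treated`, and the helper `linear_predictor` are all hypothetical, chosen only to make the arithmetic concrete.

```python
# Hypothetical coefficients: intercept, slope for age, effect of treated = 1.
beta = [0.5, 0.2, -1.0]

# Each row holds (age in years [continuous], treated [categorical, coded 0/1]).
# These data are made up for illustration.
rows = [(30.0, 0), (45.0, 1), (60.0, 1)]


def linear_predictor(beta, x):
    """Return eta_i = beta_0 + sum_j beta_j * x_ij for one observation."""
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))


# One value of the linear predictor per observation.
eta = [linear_predictor(beta, x) for x in rows]
print(eta)
```

A categorical covariate with more than two levels would enter in the same way, through one 0/1 indicator per non-reference level.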
A link function
This function transforms the expected value of the outcome variable so that it is a linear function of the predictors (covariates).
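Suppose we let \(\mu_{i} = E[Y_{i}]\). Then the link function is a function \(g(.)\) with
\[g(\mu_{i}) = \eta_{i}\]
where \(\eta_{i}\) is the linear predictor. Or in other words,
\[g(\mu_{i}) = \beta_{0} + \beta_{1} X_{i1} + \beta_{2} X_{i2} + \dots + \beta_{p} X_{ip}\]
This can be inverted so
\[\mu_{i} = g^{-1}(\eta_{i}) = g^{-1}(\beta_{0} + \beta_{1} X_{i1} + \beta_{2} X_{i2} + \dots + \beta_{p} X_{ip})\]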
The inverse link \(g^{-1}(.)\) is often called the mean function as it gives the expected value of the outcome.
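The relationship between a link and its inverse can be checked numerically. The sketch below defines three links that are commonly paired with the Normal (identity), Poisson (log) and Bernoulli (logit) distributions, and confirms that applying \(g\) and then \(g^{-1}\) returns the original mean. The dictionary `LINKS` and the function names are illustrative choices, not part of any particular library.

```python
import math


def logit(mu):
    """Logit link: maps a mean in (0, 1) to the whole real line."""
    return math.log(mu / (1.0 - mu))


def inv_logit(eta):
    """Inverse logit (mean function): maps a real number back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Each entry holds (link g, inverse link g^{-1}).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (logit, inv_logit),
}

mu = 0.3  # an arbitrary mean that is valid for all three links
for name, (g, g_inv) in LINKS.items():
    eta = g(mu)
    # g_inv(eta) should recover mu up to floating-point rounding.
    print(name, eta, g_inv(eta))
```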
16.2.1 An example - logistic regression
Suppose we wish to fit a logistic regression model (which we will see is one particular type of GLM) for a binary outcome \(Y\) and a single covariate \(X\). Normally we write \(\pi_i\) for the expected outcome, but here we will use \(\mu_i\) instead, so you can see how the model we had previously connects with the more general notation above. Thus, we let \(\mu_i = E[Y_i]\) (written \(\pi_i\) in previous sessions). For the random component, we have:
\[Y_i \sim \text{Bernoulli}(\mu_i)\]
The linear predictor is given by:
\[\eta_i = \beta_0 + \beta_1 X_i\]
The link function can be defined generically. In the equation below, \(z\) has no intrinsic meaning; it is just used here to enable us to define a function. The link function for logistic regression is the logit function:
\[g(z) = \log\left(\frac{z}{1 - z}\right)\]
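Applying the link function to \(\mu_i\) and setting this equal to \(\eta_i\), as per the definition above, we get:
\[\log\left(\frac{\mu_i}{1 - \mu_i}\right) = \beta_0 + \beta_1 X_i\]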
which is the logistic regression model we met previously.
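To see the three components working together, the sketch below simulates binary outcomes from a logistic model with a single covariate and recovers the coefficients by maximum likelihood using Newton-Raphson. This is a minimal illustration under stated assumptions, not the routine used by any statistical package: the true values \(\beta_0 = -0.5\) and \(\beta_1 = 1.2\), the sample size, the seed, and the function name `fit_logistic` are all arbitrary choices.

```python
import math
import random


def inv_logit(eta):
    """Mean function for the logit link: mu = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(x, y, n_iter=25):
    """Estimate (beta_0, beta_1) by Newton-Raphson on the Bernoulli log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s0 = s1 = 0.0           # score vector (gradient of the log-likelihood)
        i00 = i01 = i11 = 0.0   # entries of the 2x2 information matrix
        for xi, yi in zip(x, y):
            mu = inv_logit(b0 + b1 * xi)   # systematic component + inverse link
            w = mu * (1.0 - mu)            # Bernoulli variance at this mean
            s0 += yi - mu
            s1 += (yi - mu) * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + I^{-1} s, with the 2x2 inverse written out.
        b0 += (i11 * s0 - i01 * s1) / det
        b1 += (i00 * s1 - i01 * s0) / det
    return b0, b1


# Simulated data (assumed values, for illustration only).
random.seed(1)
true_b0, true_b1 = -0.5, 1.2
x = [random.uniform(-2.0, 2.0) for _ in range(500)]
# Random component: Y_i ~ Bernoulli(mu_i) with logit(mu_i) = beta_0 + beta_1 * X_i.
y = [1 if random.random() < inv_logit(true_b0 + true_b1 * xi) else 0 for xi in x]

b0_hat, b1_hat = fit_logistic(x, y)
print(b0_hat, b1_hat)
```

The estimates should lie close to the values used in the simulation, with the usual sampling variability. Changing the distribution and the link, while keeping the same linear predictor, is what produces the other members of the GLM family.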