16.3 GLM Assumptions

To successfully apply a GLM, a number of assumptions about the data must be met.

The observations must be independent of one another.
The outcome variable \(Y_{i}\) does not have to be normally distributed, but it should typically follow a distribution from the exponential family (for example binomial, Poisson, gamma, or normal).
GLMs assume a linear relationship between the expected outcome, transformed by the link function, and the predictor (covariate) variables. The untransformed outcome does not need to be linear in the predictors.
Homogeneity of variance does not need to be satisfied. Given the model structure it often cannot hold, because the variance is usually a function of the mean, and overdispersion (when the observed variance is larger than what the model assumes) can be present.
Errors need to be independent but do not need to be normally distributed.
GLMs use maximum likelihood estimation (MLE) rather than ordinary least squares (OLS) to estimate the parameters, and thus rely on large-sample approximations.
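To make the points about the link function, maximum likelihood, and overdispersion concrete, the following is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not a reference implementation: the data are simulated, the model is a Poisson GLM with a log link, and the function names `fit_poisson_irls` and `pearson_dispersion` are hypothetical helpers written for this example. The fit uses iteratively reweighted least squares (IRLS), a common way to compute the MLE for a GLM, and the Pearson dispersion statistic is used as a rough check for overdispersion (values near 1 are consistent with the Poisson variance assumption, values well above 1 suggest overdispersion).

```python
import numpy as np


def fit_poisson_irls(X, y, n_iter=25, tol=1e-10):
    """Fit a log-link Poisson GLM by IRLS (hypothetical helper for this sketch).

    X must already contain an intercept column as its first column.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start at the intercept-only fit for stability
    for _ in range(n_iter):
        eta = X @ beta               # linear predictor: g(E[Y]) = X beta
        mu = np.exp(eta)             # inverse of the log link
        z = eta + (y - mu) / mu      # working response
        XtW = X.T * mu               # IRLS weights equal mu for Poisson/log link
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


def pearson_dispersion(X, y, beta):
    """Pearson chi-square statistic divided by the residual degrees of freedom."""
    mu = np.exp(X @ beta)
    return np.sum((y - mu) ** 2 / mu) / (len(y) - X.shape[1])


# Simulated data (an assumption of this sketch, not data from the text).
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 2, size=n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([0.5, 0.8])
mu = np.exp(X @ true_beta)

# Case 1: counts that really are Poisson, so variance equals the mean.
y_pois = rng.poisson(mu)

# Case 2: negative binomial counts with the same mean but variance mu + mu^2 / k.
k = 2.0
y_over = rng.negative_binomial(k, k / (k + mu))

beta_pois = fit_poisson_irls(X, y_pois)
beta_over = fit_poisson_irls(X, y_over)
phi_pois = pearson_dispersion(X, y_pois, beta_pois)
phi_over = pearson_dispersion(X, y_over, beta_over)

print("Poisson data:       beta =", beta_pois, "dispersion =", phi_pois)
print("Overdispersed data: beta =", beta_over, "dispersion =", phi_over)
```

In this sketch the coefficient estimates are obtained the same way for both data sets. Only the dispersion statistic separates them, which is why overdispersion is usually checked after fitting rather than assumed away.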

Statistics for Health Data Science