15.6 Inference

We have fitted the following logistic regression model:

$\mathrm{logit}(\pi_i) = \beta_0 + \beta_1 x_i$

Having estimated the parameters of the logistic regression model by maximum likelihood, we would like to obtain 95% confidence intervals for the parameters and test hypotheses about them. This section describes the main options for doing so.

A sketch of the relevant statistical theory is provided in the optional reading in the appendix to this session.

15.6.1 Confidence intervals

A number of approximate confidence intervals can be obtained. Two commonly used confidence intervals are the Wald-type confidence intervals and profile confidence intervals.

Wald-type confidence interval: This confidence interval takes a familiar form. For the slope parameter $\beta_1$, an approximate 95% confidence interval is given by

$\hat{\beta}_1 \pm 1.96 \hat{SE}(\hat{\beta}_1)$

where $\hat{\beta}_1$ is the maximum likelihood estimate for $\beta_1$ and $\hat{SE}(\hat{\beta}_1)$ is its standard error.
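As a minimal sketch of this calculation, the code below fits a logistic regression to simulated data and builds the Wald-type interval by hand. The data and the object names (`fit`, `beta1_hat`, `se_beta1`, `wald_ci`) are hypothetical and used only for illustration. It uses `qnorm(0.975)`, which is approximately 1.96.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(1)
n <- 500
x <- rbinom(n, size = 1, prob = 0.5)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.5 * x))

fit <- glm(y ~ x, family = binomial(link = "logit"))

# Maximum likelihood estimate and standard error of the slope
beta1_hat <- coef(fit)["x"]
se_beta1  <- sqrt(diag(vcov(fit)))["x"]

# Approximate 95% Wald-type interval: estimate +/- 1.96 * SE
wald_ci <- beta1_hat + c(-1, 1) * qnorm(0.975) * se_beta1
wald_ci

# confint.default() should return the same interval
confint.default(fit)["x", ]
```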

Profile likelihood confidence intervals: These intervals are based on the log-likelihood-ratio. For each parameter of interest, a profile likelihood is constructed. This treats all other parameters as nuisance parameters and removes them from the likelihood by setting them, for each value of the parameter of interest, to the values that maximise the likelihood. Confidence intervals are then constructed from the profile likelihood. The Wald-type confidence intervals provide an approximation to this process. In R, profile likelihood confidence intervals for a fitted GLM are obtained using the command confint.
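As a sketch in symbols (the notation $\ell$ and $\ell_p$ is introduced here for illustration and may differ from the appendix), write $\ell(\beta_0, \beta_1)$ for the log-likelihood of the model above. The profile log-likelihood for $\beta_1$ is

$\ell_p(\beta_1) = \max_{\beta_0} \ell(\beta_0, \beta_1)$

and an approximate 95% profile likelihood confidence interval consists of the values of $\beta_1$ satisfying

$2\{\ell_p(\hat{\beta}_1) - \ell_p(\beta_1)\} \leq 3.84$

where 3.84 is the 95th percentile of the $\chi^2_1$ distribution, which is approximately $1.96^2$.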

15.6.2 Hypothesis tests

Often, the hypothesis we are interested in testing is that the independent variable $X$ is not associated with the outcome. Therefore, the null and alternative hypotheses are:

$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$

This is the null hypothesis tested, by default, in regression output provided in R.

There are three important types of test available. These are all approximate tests and are asymptotically equivalent to one another, so in large samples we would expect to see similar p-values from each test.

Likelihood ratio test: This test is based directly on the approximate distribution of the log-likelihood-ratio.

Wald test: This test is based on a quadratic approximation to the log-likelihood-ratio. As such, it can be less accurate than the likelihood ratio test, particularly if the null value is a long way from the maximum likelihood estimate. However, in this case all tests are likely to provide small p-values and similar qualitative conclusions.

The Wald test is used to obtain the p-values automatically displayed in regression output for GLMs in R and many other software platforms. This is because Wald tests are computationally less intensive than likelihood ratio tests.

Score test: This test is based on a slightly different quadratic approximation to the log-likelihood-ratio. It is used much less often than the other two types, so we do not pursue it further here. Early tests used in epidemiology tended to be score tests, since they are less computationally intensive than the other approaches.
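The three tests can be compared side by side in R. The sketch below uses simulated data, and the object names (`fit0`, `fit1`, `p_lrt`, `p_wald`, `p_score`) are hypothetical. It assumes the likelihood ratio and score tests are obtained by comparing the null and fitted models with `anova()`, using `test = "LRT"` and `test = "Rao"` respectively. The Wald p-value is read from the coefficient table.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(2)
n <- 1000
x <- rbinom(n, size = 1, prob = 0.5)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.4 * x))

fit0 <- glm(y ~ 1, family = binomial(link = "logit"))  # model under H0: beta_1 = 0
fit1 <- glm(y ~ x, family = binomial(link = "logit"))  # model under H1

# Likelihood ratio test: compares the maximised log-likelihoods of the two models
p_lrt <- anova(fit0, fit1, test = "LRT")[2, "Pr(>Chi)"]

# Wald test: z = estimate / standard error, as shown by summary()
p_wald <- coef(summary(fit1))["x", "Pr(>|z|)"]

# Score test (called the Rao test in anova.glm)
p_score <- anova(fit0, fit1, test = "Rao")[2, "Pr(>Chi)"]

c(LRT = p_lrt, Wald = p_wald, Score = p_score)
```

With a sample of this size the three p-values would be expected to be close to one another, although they will not be identical.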

15.6.3 Example

We return to our model exploring the association between sex and diagnosis of dementia. We first perform a hypothesis test investigating the null hypothesis that sex is not associated with dementia diagnosis. Then we obtain 95% confidence intervals for our two parameters of interest.

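# Read in the data and fit a logistic regression of dementia diagnosis on sex
dementia <- read.csv("Practicals/Datasets/Dementia/dementia2.csv")
dementia1 <- glm(dementia ~ sex, data = dementia, family = binomial(link = "logit"))
# Coefficient table, including Wald z statistics and p-values
summary(dementia1)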

Call:
glm(formula = dementia ~ sex, family = binomial(link = "logit"), 
    data = dementia)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.2211  -0.2211  -0.1771  -0.1771   2.8855  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -4.14722    0.02439 -170.01   <2e-16 ***
sex          0.44771    0.03264   13.72   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 38333  on 199999  degrees of freedom
Residual deviance: 38143  on 199998  degrees of freedom
AIC: 38147

Number of Fisher Scoring iterations: 7

The Wald test p-value for sex is $p<0.001$, providing strong evidence against the null hypothesis that sex is not associated with the odds of being diagnosed with dementia.
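A likelihood ratio test of the same hypothesis can be recovered from the deviances printed in the output, since the difference between the null and residual deviances is the likelihood ratio statistic comparing the intercept-only model with the fitted model. The sketch below copies the rounded values from the output above, so the result is approximate. The variable names are hypothetical.

```r
# Values copied from the summary() output above (rounded as printed)
null_deviance     <- 38333
residual_deviance <- 38143

# Likelihood ratio statistic on 199999 - 199998 = 1 degree of freedom
lr_stat <- null_deviance - residual_deviance
p_lrt   <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Squared Wald z statistic for sex, for comparison on the same scale
wald_z     <- 13.72
wald_chisq <- wald_z^2

c(lr_stat = lr_stat, wald_chisq = wald_chisq, p_lrt = p_lrt)
```

The likelihood ratio statistic (190) and the squared Wald statistic (about 188.2) are close, and both give $p<0.001$.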

Now we will obtain the profile confidence intervals for the two estimated regression coefficients:

confint(dementia1)

Waiting for profiling to be done...

	2.5 %	97.5 %
(Intercept)	-4.1954026	-4.0997726
sex	0.3838153	0.5117587

In fact, these are more easily interpreted on the exponentiated scale, as below.

cbind(exp(coefficients(dementia1)), exp(confint(dementia1)))

Waiting for profiling to be done...

		2.5 %	97.5 %
(Intercept)	0.01580834	0.01506468	0.01657644
sex	1.56472021	1.46787427	1.66822252

The estimated odds of dementia diagnosis in males is 0.0158 (95% CI 0.0151, 0.0166). We are 95% confident that the odds of dementia diagnosis among males lies within this range.
The estimated odds ratio for females, compared with males, is 1.56 (95% CI 1.47, 1.67). We estimate that the odds of dementia diagnosis among females is 1.56 times the odds among males. The data are consistent with this value being as low as 1.47 or as high as 1.67.

Below is the code to obtain Wald-type confidence intervals. Comparing these with the profile likelihood confidence intervals on the log-odds (unexponentiated) scale above, we see that they are very similar, as we would expect in a sample of this size.

confint.default(dementia1)

	2.5 %	97.5 %
(Intercept)	-4.1950299	-4.0994058
sex	0.3837405	0.5116735
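As a rough check, the Wald-type interval for sex can be reproduced from the estimate and standard error printed in the summary() output, and then exponentiated to give an interval for the odds ratio. This sketch uses the rounded printed values, so it agrees with confint.default only to about three decimal places. The variable names are hypothetical.

```r
# Estimate and standard error for sex, copied from the summary() output above
est <- 0.44771
se  <- 0.03264

# Approximate 95% Wald-type interval on the log-odds scale
wald_ci <- est + c(-1, 1) * qnorm(0.975) * se
round(wald_ci, 3)

# The same interval on the odds ratio scale
round(exp(wald_ci), 2)
```

On the odds ratio scale this gives approximately (1.47, 1.67), matching the profile likelihood interval to two decimal places.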

Statistics for Health Data Science
