8.3 Connection between p-values and confidence intervals

Recall that we previously used the following fact:

For a normal distribution, approximately 95% of observations are contained within 1.96 standard deviations of the mean.

Applied to sampling distributions, this tells us that:

For a normally distributed sampling distribution that is centred around the true population value, approximately 95% of the estimates obtained under repeated sampling would be contained within 1.96 standard errors of the true population value.
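Both statements can be checked numerically. The sketch below is illustrative only: the population mean, standard deviation, sample size and seed are made-up assumptions, and the sample mean stands in for a generic estimator.

```r
# Normal-distribution fact: share of observations within 1.96 SDs of the mean
within_196 <- pnorm(1.96) - pnorm(-1.96)
print(within_196)

# Repeated sampling with hypothetical population values (assumptions)
set.seed(1)                 # arbitrary seed, for reproducibility only
true_mean <- 10             # hypothetical true population value
true_sd   <- 3              # hypothetical population standard deviation
n         <- 50             # hypothetical sample size
se        <- true_sd / sqrt(n)   # standard error of the sample mean

# Draw 10,000 samples and record the sample mean of each
estimates <- replicate(10000, mean(rnorm(n, mean = true_mean, sd = true_sd)))

# Proportion of estimates within 1.96 standard errors of the true value
coverage <- mean(abs(estimates - true_mean) <= 1.96 * se)
print(coverage)
```

Both printed values should be close to 0.95; the simulated proportion will vary slightly with the seed and the number of replications.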

Applying this result to the estimator $\hat{\delta}$ leads to a 95% confidence interval of

$$\hat{\delta} \pm 1.96 \times SE(\delta)$$
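As a minimal sketch of this formula, the interval can be computed directly in R. The estimate and standard error below are hypothetical numbers chosen only for illustration.

```r
delta_hat <- 2.5   # hypothetical estimate
se_delta  <- 1.0   # hypothetical standard error

# Lower and upper limits: estimate minus / plus 1.96 standard errors
ci <- delta_hat + c(-1, 1) * 1.96 * se_delta
print(ci)
```

With these assumed inputs the interval runs from 0.54 to 4.46, so it lies entirely above zero.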

The graph below shows some possible values of $\hat{\delta}$, along with their 95% confidence intervals. We see that:

  • if $\hat{\delta}$ is exactly equal to $1.96 \times SE(\delta)$ then the 95% confidence interval just touches zero.

  • if $\hat{\delta} > 1.96 \times SE(\delta)$ then the 95% confidence interval does not include zero: the whole interval lies above zero.

  • if $0 < \hat{\delta} < 1.96 \times SE(\delta)$ then the 95% confidence interval does include zero.

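So what p-values would these values of $\hat{\delta}$ result in?

  • if $\hat{\delta} = 1.96 \times SE(\delta)$ then 2.5% of the estimates expected under the null hypothesis lie above that point and, by symmetry, another 2.5% lie below $-1.96 \times SE(\delta)$, so the two-sided p-value is $p = 0.05$.

  • if $\hat{\delta} > 1.96 \times SE(\delta)$ then fewer than 2.5% of estimates lie above $\hat{\delta}$, so $p < 0.05$.

  • if $0 < \hat{\delta} < 1.96 \times SE(\delta)$ then more than 2.5% of estimates lie above $\hat{\delta}$, so $p > 0.05$.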
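The three cases can be sketched numerically. The example below assumes a null value of zero, a standard error of 1 (so that estimates are expressed in standard-error units), and uses the three estimates drawn in this section's graph; the variable names are illustrative.

```r
se        <- 1                      # assumption: work in standard-error units
delta_hat <- c(0.2, 1.96, 2.15)     # the three estimates shown in the graph

# Two-sided p-value: probability, under the null, of an estimate at least
# as far from zero as the one observed (both tails)
p_value <- 2 * pnorm(-abs(delta_hat / se))

# Matching 95% confidence limits
lower <- delta_hat - 1.96 * se
upper <- delta_hat + 1.96 * se

print(data.frame(delta_hat, lower, upper, p_value = round(p_value, 3)))
```

The first estimate gives a p-value well above 0.05 and an interval spanning zero, the second gives a p-value of about 0.05 and an interval whose lower limit is zero, and the third gives a p-value below 0.05 and an interval entirely above zero.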

This leads us to the connection between 95% confidence intervals and p-values. When a 95% confidence interval and p-value are obtained from the same sampling distribution (which is typically the case when both are presented), the two are related as follows:

P-value     95% confidence interval
< 0.05      Excludes the null value
> 0.05      Contains the null value
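A rough way to convince yourself of this correspondence is to check it over a grid of possible estimates. This sketch assumes a null value of zero and a standard error of 1, and it uses the exact multiplier `qnorm(0.975)` rather than the rounded 1.96 so that the p-value and the interval are built from precisely the same cut-off.

```r
se        <- 1                           # assumption: standard-error units
delta_hat <- seq(-4, 4, by = 0.01)       # grid of hypothetical estimates
z_crit    <- qnorm(0.975)                # exact multiplier (about 1.96)

# Two-sided p-value for each estimate against a null value of zero
p_value <- 2 * pnorm(-abs(delta_hat / se))

# Does the 95% confidence interval lie entirely above or below zero?
excludes_null <- (delta_hat - z_crit * se > 0) | (delta_hat + z_crit * se < 0)

# TRUE if "p < 0.05" and "interval excludes the null" agree everywhere
agree <- all((p_value < 0.05) == excludes_null)
print(agree)
```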

# Labels for graph
lab1 <- expression(- 2*SE)
lab2 <- expression(- 1*SE)
lab3 <- expression(1*SE)
lab4 <- expression(2*SE)

# Draw sampling distribution
options(repr.plot.width=6, repr.plot.height=5)
x <- seq(-4, 4, by = 0.05)  # x axis in standard-error units
plot(x, dnorm(x, mean = 0, sd = 1), type = "l", col = "blue",
     xaxt = "n", xlab = " ", ylab = "Density")
axis(1, seq(-2, 2, by=1), labels=c(lab1, lab2, 0, lab3, lab4))

# True population value
abline(v=0, col="red")
# 1.96 SE from population value
abline(v=c(-1.96, 1.96), col="green", lty=2)

# Some 95% confidence intervals
points(c(0.2, 1.96, 2.15), c(0.13, 0.03, 0.18), col = "orange")

lines(c(-1.76, 2.16), c(0.13, 0.13), col="orange")
lines(c(0, 3.92), c(0.03, 0.03), col ="orange")
lines(c(0.19, 4.11), c(0.18, 0.18), col ="orange")

text(2.75, 0.08, expression(hat(delta)==1.96*SE))
text(-2.6, 0.25, expression(hat(delta)<1.96*SE))
text(2.95, 0.23, expression(hat(delta)>1.96*SE))

lines(c(2.25, 3), c(0.185, 0.215), col="black")
lines(c(2.05, 2.8), c(0.035, 0.065), col="black")
lines(c(-2.4, 0.2), c(0.23, 0.14), col="black")
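[Figure: sampling distribution (blue) centred on the true population value (red), with dashed green lines at 1.96 standard errors either side and three example estimates with their 95% confidence intervals (orange).]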