13.4 Centering

In many analyses, interpreting the intercept is less important than interpreting the estimated regression coefficients, so it does not matter if the intercept takes an absurd value (as in the example above). However, if we do want an interpretable intercept, we can center the independent variables.

Centering a variable means subtracting a constant from every value of the variable. This shifts the scale of the predictor (the centered variable equals 0 where the original variable equals the chosen constant), but does not affect the units of the variable. Consequently, the intercept is now interpreted as the estimated mean of \(Y\) when the independent variable is equal to the constant (or, with several centered predictors, when each one is equal to its own constant). The estimated regression coefficient of the independent variable is not affected.
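To see why only the intercept changes, it can help to rewrite the model for a single predictor. The symbols \(X\) (the predictor), \(c\) (the centering constant) and \(\varepsilon\) (the error term) are introduced here for illustration only:

\[
Y = \beta_0 + \beta_1 X + \varepsilon = (\beta_0 + \beta_1 c) + \beta_1 (X - c) + \varepsilon
\]

The slope \(\beta_1\) multiplies the centered variable \(X - c\) exactly as it multiplied \(X\), while the intercept becomes \(\beta_0 + \beta_1 c\), which is the mean of \(Y\) at \(X = c\).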

As an example, we will repeat the analysis above, but center each of the covariates on its mean value.

# What are the mean gestational days and mothers' height in our data?
data <- read.csv('https://www.inferentialthinking.com/data/baby.csv')
summary(data$Gestational.Days)
summary(data$Maternal.Height)

# Create new (centered) variables in our data
data$Gestational.Days.Centered <- data$Gestational.Days - mean(data$Gestational.Days)
data$Maternal.Height.Centered <- data$Maternal.Height - mean(data$Maternal.Height)

# Redefine Model 4 using the centered variables
model4 <- lm(Birth.Weight ~ Gestational.Days.Centered + Maternal.Height.Centered, data = data)
summary(model4)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  148.0   272.0   280.0   279.1   288.0   353.0 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  53.00   62.00   64.00   64.05   66.00   72.00 
Call:
lm(formula = Birth.Weight ~ Gestational.Days.Centered + Maternal.Height.Centered, 
    data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-53.829 -10.589   0.246  10.254  54.403 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)               119.46252    0.47980 248.983  < 2e-16 ***
Gestational.Days.Centered   0.45237    0.03006  15.051  < 2e-16 ***
Maternal.Height.Centered    1.27598    0.19049   6.698 3.27e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 16.44 on 1171 degrees of freedom
Multiple R-squared:  0.1969,	Adjusted R-squared:  0.1955 
F-statistic: 143.5 on 2 and 1171 DF,  p-value: < 2.2e-16

Now the intercept (\(\hat{\beta}_0\)) is equal to 119.46. This is interpreted as: the estimated mean birthweight for a child who was born after 279.1 gestational days and whose mother’s height is 64.05 inches is 119.46 ounces. Additionally, notice that the estimated regression coefficients for gestational days and mother’s height, and their associated standard errors, are the same as in the uncentered model above.
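These properties can be checked numerically. The following is a minimal sketch on simulated data, not the baby data set: the data frame `sim`, its columns `y`, `x1`, `x2`, and all simulated values are hypothetical and chosen only for illustration. It compares an uncentered and a mean-centered fit of the same model.

```r
# Simulated data (hypothetical values, NOT the baby data set)
set.seed(1)
n  <- 200
x1 <- rnorm(n, mean = 280, sd = 16)
x2 <- rnorm(n, mean = 64, sd = 2.5)
y  <- -100 + 0.45 * x1 + 1.3 * x2 + rnorm(n, sd = 16)
sim <- data.frame(y, x1, x2)

# Center each predictor on its sample mean
sim$x1_c <- sim$x1 - mean(sim$x1)
sim$x2_c <- sim$x2 - mean(sim$x2)

# Fit the same model with the raw and with the centered predictors
fit_raw      <- lm(y ~ x1 + x2, data = sim)
fit_centered <- lm(y ~ x1_c + x2_c, data = sim)

# Slopes and their standard errors should agree between the two fits
slopes_equal <- all.equal(unname(coef(fit_raw)[-1]),
                          unname(coef(fit_centered)[-1]))
se_raw      <- summary(fit_raw)$coefficients[-1, "Std. Error"]
se_centered <- summary(fit_centered)$coefficients[-1, "Std. Error"]
ses_equal   <- all.equal(unname(se_raw), unname(se_centered))

# The centered intercept should equal the raw model's prediction
# at the predictor means
pred_at_means <- unname(predict(
  fit_raw,
  newdata = data.frame(x1 = mean(sim$x1), x2 = mean(sim$x2))
))
intercept_centered <- unname(coef(fit_centered)[1])

print(slopes_equal)
print(ses_equal)
print(all.equal(intercept_centered, pred_at_means))
```

Because the predictors are centered on their sample means, the centered intercept in a least-squares fit also equals the sample mean of the response (`mean(sim$y)` here). Centering on any other constant would still leave the slopes unchanged, but the intercept would no longer equal the sample mean of the response.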