Print crude and adjusted

Max Gordon

2018-03-20

How to output crude and adjusted models

Background

It is a common practice in epidemiological papers to present regression estimates in both adjusted and unadjusted format. The unadjusted format is just that particular variable without any of the covariates and is often referred to as the crude estimate. The advantage with the two is that you provide the reader with a better understanding of how much the variables affect each other and also a sense of how much confounding is present in the model.

The printCrudeAndAdjusted function

The printCrudeAndAdjusted was designed for this purpose. It provides a table with the coef and confint estimates for the full model, then reruns for each variable through an update call on the model providing outcome ~ variable as input. Below is a basic example:

library(datasets)
data(mtcars)
mtcars$am <- factor(mtcars$am, labels = c("Automatic", "Manual"))
fit <- lm(mpg ~ cyl + disp + hp + am, data = mtcars)
library(Greg)
printCrudeAndAdjustedModel(fit)
Crude   Adjusted
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.09 17.92 to 22.26   30.48 24.60 to 36.36
cyl -2.88 -3.53 to -2.22   -0.83 -2.39 to 0.72
disp -0.04 -0.05 to -0.03   -0.01 -0.03 to 0.01
hp -0.07 -0.09 to -0.05   -0.03 -0.06 to 0.00
Manual 7.24 3.64 to 10.85   3.45 0.46 to 6.43

As the variable names often aren’t that pretty you can use the rowname.fn in order to get the row names desired. Here’s a simple substitution example together with some limits to the digits, and a reference group:

printCrudeAndAdjustedModel(fit, 
                           digits = 1, 
                           add_references = TRUE,
                           rowname.fn = function(n){
  if (n == "disp")
    return("Displacement (cu.in.)")
  if (n == "hp")
    return("Gross horsepower")
  if (n == "cyl")
    return("No. cylinders")
  if (n == "am")
    return("Transmission")
  return(n)
})
Crude   Adjusted
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7
Displacement (cu.in.) 0.0 -0.1 to 0.0   0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0   0.0 -0.1 to 0.0
Transmission
  Automatic 0.0 ref   0.0 ref
  Manual 7.2 3.6 to 10.8   3.4 0.5 to 6.4

You can achieve the same effect with the label function in the Hmisc package:

library(Hmisc)
label(mtcars$disp) <- "Displacement (cu.in)"
label(mtcars$cyl) <- "No. cylinders"
label(mtcars$hp) <- "Gross horsepower"
label(mtcars$am) <- "Transmission"

printCrudeAndAdjustedModel(fit, 
                           digits = 1, 
                           add_references = TRUE)
Crude   Adjusted
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7
Displacement (cu.in) 0.0 -0.1 to 0.0   0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0   0.0 -0.1 to 0.0
Transmission
  Automatic 0.0 ref   0.0 ref
  Manual 7.2 3.6 to 10.8   3.4 0.5 to 6.4

Binding columns/rows

The returned variable from printCrudeAndAdjustedModel works just like any matrix, i.e. you can bind it both on rows and on columns:

fit_mpg <- lm(mpg ~ cyl + disp + hp + am, data = mtcars)
fit_weight <- lm(wt ~ cyl + disp + hp + am, data = mtcars)
p_mpg <- printCrudeAndAdjustedModel(fit_mpg, digits = 1, add_references = TRUE)
p_weight <- printCrudeAndAdjustedModel(fit_weight, digits = 1, add_references = TRUE)
rbind("Miles per gallon" = p_mpg, 
      "Weight (1000 lbs)" = p_weight)
Crude   Adjusted
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
Miles per gallon
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7
Displacement (cu.in) 0.0 -0.1 to 0.0   0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0   0.0 -0.1 to 0.0
Transmission
  Automatic 0.0 ref   0.0 ref
  Manual 7.2 3.6 to 10.8   3.4 0.5 to 6.4
Weight (1000 lbs)
(Intercept) 3.2 2.9 to 3.6   2.3 1.5 to 3.2
No. cylinders 0.4 0.3 to 0.6   -0.1 -0.3 to 0.2
Displacement (cu.in) 0.0 0.0 to 0.0   0.0 0.0 to 0.0
Gross horsepower 0.0 0.0 to 0.0   0.0 0.0 to 0.0
Transmission
  Automatic 0.0 ref   0.0 ref
  Manual -1.4 -1.9 to -0.8   -0.6 -1.0 to -0.1
cbind("Miles per gallon" = p_mpg, 
      "Weight (1000 lbs)" = p_weight)
Variable Crude 2.5 % to 97.5 % Adjusted 2.5 % to 97.5 % Crude 2.5 % to 97.5 % Adjusted 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3 30.5 24.6 to 36.4 3.2 2.9 to 3.6 2.3 1.5 to 3.2
No. cylinders -2.9 -3.5 to -2.2 -0.8 -2.4 to 0.7 0.4 0.3 to 0.6 -0.1 -0.3 to 0.2
Displacement (cu.in) 0.0 -0.1 to 0.0 0.0 0.0 to 0.0 0.0 0.0 to 0.0 0.0 0.0 to 0.0
Gross horsepower -0.1 -0.1 to 0.0 0.0 -0.1 to 0.0 0.0 0.0 to 0.0 0.0 0.0 to 0.0
Transmission
  Automatic 0.0 ref 0.0 ref 0.0 ref 0.0 ref
  Manual 7.2 3.6 to 10.8 3.4 0.5 to 6.4 -1.4 -1.9 to -0.8 -0.6 -1.0 to -0.1

Selecting column/rows using [

It is also possible to select individual rows/columns if so desired using the [ operator:

p_mpg[,1:2]
Variable Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3
No. cylinders -2.9 -3.5 to -2.2
Displacement (cu.in) 0.0 -0.1 to 0.0
Gross horsepower -0.1 -0.1 to 0.0
Transmission
  Automatic 0.0 ref
  Manual 7.2 3.6 to 10.8
p_mpg[1:2,]
Crude   Adjusted
Variable Coef 2.5 % to 97.5 %   Coef 2.5 % to 97.5 %
(Intercept) 20.1 17.9 to 22.3   30.5 24.6 to 36.4
No. cylinders -2.9 -3.5 to -2.2   -0.8 -2.4 to 0.7

More advanced models

The function is prepared for use with splines and strata but is currently not tested for mixed models. As the spline estimates are best interpreted in a graph they are omitted from the output.

library("survival")

set.seed(10)
n <- 500
ds <- data.frame(
  ftime = rexp(n),
  fstatus = sample(0:1, size = n, replace = TRUE),
  y = rnorm(n = n),
  x1 = factor(sample(LETTERS[1:4], size = n, replace = TRUE)),
  x2 = rnorm(n, mean = 3, 2),
  x3 = rnorm(n, mean = 3, 2),
  x4 = factor(sample(letters[1:3], size = n, replace = TRUE)))

library(survival)
library(splines)
fit <- coxph(Surv(ds$ftime, ds$fstatus == 1) ~ x1 + ns(x2, 4) + x3 + strata(x4), data=ds)

printCrudeAndAdjustedModel(fit, add_references = TRUE)
Crude   Adjusted
Variable HR 2.5 % to 97.5 %   HR 2.5 % to 97.5 %
x1
  A 1.00 ref   1.00 ref
  B 1.02 0.69 to 1.48   1.03 0.70 to 1.51
  C 1.25 0.87 to 1.78   1.30 0.91 to 1.87
  D 0.93 0.64 to 1.34   0.97 0.67 to 1.40
x3 1.05 0.98 to 1.12   1.05 0.99 to 1.13
# Note that the crude is with the strata
a <- getCrudeAndAdjustedModelData(fit)
a["x3", "Crude"] == 
  exp(coef(coxph(Surv(ds$ftime, ds$fstatus == 1) ~ x3 + 
                   strata(x4), data=ds)))
##   x3 
## TRUE