Lifecycle: Maturing License: MIT CRAN status Downloads 2022-10-08 GitHub issues Travis

glmmSeq

glmmSeq logo

The aim of this package is to model gene expression with a general linear mixed model (GLMM) as described in the R4RA study [1]. The most widely used mainstream differential gene expression analysis tools (e.g Limma, DESeq2, edgeR) are all unable to fit mixed effects linear models. This package however fits negative binomial mixed effects models at individual gene level using the negative.binomial function from MASS and the glmer function in lme4 which enables random effect, as as well as mixed effects, to be modelled.

Installing from CRAN

install.packages("glmmSeq")

Installing from Github

devtools::install_github("myles-lewis/glmmSeq")

Installing Locally

Or you can source the functions individually:

functions = list.files("./R", full.names = TRUE)
invisible(lapply(functions, source))

But you will need to load in the additional libraries:

# Install CRAN packages
invisible(lapply(c("MASS", "car", "ggplot2", "ggpubr", "lme4","lmerTest",
                   "methods", "parallel", "plotly", "pbapply", "pbmcapply"),
                 function(p){
                   if(! p %in% rownames(installed.packages())) {
                     install.packages(p)
                   }
                   library(p, character.only=TRUE)
                 }))

# Install BioConductor packages
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
invisible(lapply(c("qvalue"), function(p){
  if(! p %in% rownames(installed.packages())) BiocManager::install(p)
  library(p, character.only=TRUE)
}))

Overview

To get started, first we load in the package:

library(glmmSeq)
set.seed(1234)

This vignette will demonstrate the power of this package using a minimal example from the PEAC data set. Here we focus on the synovial biopsy RNA-Seq data from this cohort of patients with early rheumatoid arthritis.

data(PEAC_minimal_load)

This data contains:

  • metadata: which describes each sample. Including patient ID, sample time-point, and six-month EULAR response. Where EULAR is a rheumatoid arthritis response metric based on composite DAS28 scores.
  • tpm: the transcript per million RNA-seq count data

These are outlined in the following subsections.

Metadata

metadata$EULAR_binary  = NA
metadata$EULAR_binary[metadata$EULAR_6m %in%
                        c("Good", "Moderate" )] = "responder"
metadata$EULAR_binary[metadata$EULAR_6m %in% c("Non-response")] = "non_responder"

kable(head(metadata), row.names = F) %>% kable_styling()
PATID Timepoint EULAR_6m EULAR_binary
PAT300 6 Good responder
PAT209 6 Good responder
PAT219 6 Moderate responder
PAT211 6 Good responder
PAT216 6 Good responder
PAT212 6 Good responder

Count data

kable(head(tpm)) %>% kable_styling() %>%
  scroll_box(width = "100%")
S9001 S9002 S9003 S9004 S9006 S9007 S9008 S9009 S9010 S9011 S9012 S9013 S9014 S9016 S9017 S9018 S9019 S9020 S9021 S9023 S9025 S9029 S9034 S9035 S9036 S9038 S9039 S9040 S9042 S9043 S9044 S9045 S9047 S9048 S9049 S9052 S9053 S9054 S9056 S9059 S9060 S9063 S9065 S9066 S9067 S9068 S9069 S9070 S9072 S9073 S9074 S9075 S9076 S9077 S9078 S9079 S9080 S9081 S9083 S9084 S9085 S9086 S9087 S9088 S9089 S9090 S9091 S9092 S9093 S9094 S9095 S9096 S9097 S9098 S9099 S9100 S9101 S9102 S9103 S9104 S9105 S9106 S9107 S9108 S9109 S9110 S9111 S9112 S9113 S9114 S9115 S9116 S9117 S9119 S9120 S9121 S9122 S9123 S9124 S9125 S9127 S9128 S9129 S9130 S9131 S9132 S9133 S9134 S9135 S9136 S9137 S9138 S9139 S9140 S9141 S9142 S9143 S9144 S9145 S9146 S9147 S9148 S9149
MS4A1 1.622818 2.024451 0.6995153 15.742876 2.123731 57.280078 2.011160 0.580000 1.006783 0.391420 3.886618 2.746742 31.647637 20.16757 0.925393 3.4271907 3.130648 1.422480 63.14643 0.295537 12.7877279 4.817494 15.874491 0.569317 4.429667 1.060991 2.1850700 5.321541 3.542383 1.676829 5.471954 29.942195 4.5519540 7.252108 1.9540320 12.947362 73.430132 2.364819 5.2372870 12.281007 17.637846 72.557856 2.379427 0.444729 89.042394 2.4190970 226.612930 8.01254 5.328680 0.7414830 4.669566 5.218332 0.0676825 26.0579080 77.100135 23.2918620 50.306168 5.24953 1.342645 0.511810 0.2428008 7.756002 1.020052 0.602854 0.2444728 2.047680 2.110914 0.683498 24.664095 1.9789030 0.1702800 3.910493 6.5513870 0.8342158 31.7618280 1.136854 17.98745 3.369558 8.577294 1.453945 41.031738 42.991580 27.58865 14.876902 21.6405734 42.049748 75.6557000 2.501497 3.560322 32.1091249 7.008288 0.746264 10.0969886 92.3355460 0.5157370 0.8011538 5.075003 1.979931 0.2068106 2.717007 0.984857 0.7771422 0.000000 8.2769093 1.261793 2.361150 2.372850 0.6968002 43.9459380 5.0257346 0.281387 0.4190982 1.716238 16.25508 1.9852750 26.591585 27.994183 1.325606 1.323197 12.47092 2.3049434 1.4901770 0.229867
ADAM12 2.581805 1.110548 0.9573992 5.466742 2.645153 0.972897 0.658545 1.159264 0.462860 3.074193 10.947068 9.678026 1.205472 11.63376 1.772810 0.3506414 3.528288 17.416140 6.36844 4.064450 0.2666618 2.113812 0.473972 1.222569 30.012798 0.181003 2.4554526 4.136035 13.961315 4.195791 3.629029 8.362213 0.6999296 7.894137 0.3268513 2.847538 7.015717 0.717947 0.4396555 1.141057 1.069326 1.895644 0.486478 14.504778 1.897848 0.9943284 4.140489 41.30644 8.812227 0.3317777 8.699836 8.323936 0.0292791 0.9615935 2.512841 0.8960788 3.766372 7.13890 1.917924 8.893462 0.0966324 1.550900 4.760995 1.769894 4.6901815 1.200418 1.392691 0.439077 0.658254 0.3610814 0.2501964 0.610416 0.6842673 2.8749874 0.6717734 16.374299 1.36295 8.725470 0.258386 0.214470 3.573125 2.086027 13.44347 4.007794 0.0338254 1.595552 0.9645215 1.913151 5.780085 0.7098401 5.242844 1.115879 0.6499524 0.5397263 0.2299409 0.7178614 0.535170 0.282374 0.2003997 2.202211 0.310417 2.0991998 0.215329 0.1962535 25.010318 2.970828 1.461817 2.6506093 0.6501276 0.9633052 8.715446 32.4636380 4.281851 0.83041 0.9840191 2.363019 0.314493 1.179129 0.656344 1.57062 0.4597666 0.4707508 4.727155
IGHV7-4-1 0.992885 0.000000 2.2060400 26.381700 0.000000 2158.300000 0.644889 0.000000 0.000000 1.500990 4.945980 38.268000 55.570100 107.91600 0.000000 0.0000000 0.000000 1.080940 27.51650 0.000000 21.4136000 6.121120 15.190300 0.000000 0.597722 0.000000 0.0844527 1228.620000 51.561100 1.149330 1.434460 3728.700000 6.3708500 1.553570 0.0000000 165.928000 30.714000 0.000000 5.5235500 0.770965 1603.290000 53.605000 16.063900 6.209400 1021.480000 0.2965720 85.466200 9.18953 41.191800 5.0514200 47.140500 42.541900 0.4660800 835.2970000 1929.760000 57.2972000 39.990700 0.00000 0.000000 0.411695 0.0000000 26.176100 1.033220 2.273400 0.0000000 11.443300 0.272802 0.000000 81.135100 0.7083360 0.0000000 1.298760 0.5145140 0.1260250 13.9310000 2.345910 77.63540 2.181890 63.096300 0.000000 6.473680 20.211900 347.99200 39.273300 2.3200600 584.705000 2147.2900000 242.387000 14.263100 52.4861000 0.399860 0.000000 0.7025670 11.2577000 0.6126850 4.5281300 155.926000 320.086000 1.3791200 1.587010 0.167711 0.0000000 0.000000 5.6540200 0.000000 0.000000 0.560530 1.2747600 50.6122000 5.3320100 0.000000 0.0000000 0.757454 132.98900 3.4184700 18.130100 3.085550 1.183940 0.277095 135.99100 0.7799880 0.6956290 0.000000
IGHV3-49 0.000000 0.196831 8.7892000 345.493000 2.093480 858.980000 3.201300 0.000000 0.000000 0.000000 114.334000 17.150200 136.605000 1080.73000 0.196674 0.0000000 0.000000 0.931275 140.22000 0.000000 202.4950000 60.114600 145.421000 0.145099 32.808500 1.207290 3.5942600 467.534000 55.301200 0.869369 14.462700 928.660000 64.4575000 0.430536 0.0000000 80.024800 53.191800 0.000000 35.4998000 0.303541 1002.870000 127.201000 634.856000 22.119900 975.855000 0.0000000 623.330000 129.50800 180.819000 29.8933000 183.605000 170.369000 0.2201470 403.4460000 384.597000 24.9283000 426.619000 1.08364 0.591472 4768.210000 0.0000000 93.669200 9.291720 0.607172 0.0000000 9.736200 1.132510 0.000000 175.472000 1.0305600 0.0885169 0.972741 12.5531000 0.8804760 163.1520000 1.764370 37.22090 15.345300 401.861000 0.575801 11.926900 45.198600 317.65400 523.093000 171.4520000 174.163000 5414.9000000 5.045300 386.829000 1190.5200000 4.854510 0.169181 15.7114000 415.2760000 0.1668720 0.1950730 34.623200 26.949400 2.3880600 2.021500 0.000000 0.0000000 0.000000 1.5027300 0.000000 1.783170 49.278100 1.1861700 99.7745000 10.0077000 0.000000 0.1908660 3.835610 22.31070 2.4220200 22.133600 440.657000 0.000000 0.000000 22.10260 5.3278400 0.5799090 0.210089
IGHV3-23 0.715133 3.999940 87.6508000 2349.850000 5.962460 5180.460000 17.278900 0.000000 0.000000 1.437710 985.721000 123.982000 1374.860000 3568.60000 9.857240 0.5625450 2.900430 2.172310 5859.05000 0.000000 502.8400000 366.704000 1793.510000 2.294430 97.123900 11.607200 8.1870600 1080.330000 411.058000 9.718060 114.608000 2306.720000 317.0130000 7.954070 2.7908400 865.088000 3563.990000 0.123914 23.4808000 4.896920 1681.420000 1929.860000 2241.850000 26.191600 5228.450000 4.1899600 2807.770000 1213.59000 387.839000 203.5700000 374.506000 374.388000 0.3392100 1359.9900000 1535.600000 118.7600000 2708.140000 4.59234 0.607162 93.584600 0.3873680 1273.530000 1.666060 5.899820 1.8143100 18.921500 17.654300 0.567110 1593.270000 0.7579240 0.2794190 6.994180 216.3070000 4.8487200 1286.5300000 12.741600 167.65000 115.720000 2142.700000 3.020540 79.259200 6471.240000 2508.41000 4077.530000 802.0710000 1884.330000 5188.7400000 296.555000 1696.260000 3027.5700000 215.969000 0.249956 79.7680000 338.8720000 2.2993700 5.4541000 214.318000 152.085000 9.7046700 11.809000 0.300002 0.0000000 0.000000 15.5774000 0.217214 4.113890 229.482000 14.7361000 1145.9400000 35.9383000 0.613331 5.6297400 4.861760 188.38500 94.7553000 297.408000 3505.310000 19.762300 0.860007 208.00400 54.3620000 52.5728000 0.785236
ADAM12.1 2.581805 1.110548 0.9573992 5.466742 2.645153 0.972897 0.658545 1.159264 0.462860 3.074193 10.947068 9.678026 1.205472 11.63376 1.772810 0.3506414 3.528288 17.416140 6.36844 4.064450 0.2666618 2.113812 0.473972 1.222569 30.012798 0.181003 2.4554526 4.136035 13.961315 4.195791 3.629029 8.362213 0.6999296 7.894137 0.3268513 2.847538 7.015717 0.717947 0.4396555 1.141057 1.069326 1.895644 0.486478 14.504778 1.897848 0.9943284 4.140489 41.30644 8.812227 0.3317777 8.699836 8.323936 0.0292791 0.9615935 2.512841 0.8960788 3.766372 7.13890 1.917924 8.893462 0.0966324 1.550900 4.760995 1.769894 4.6901815 1.200418 1.392691 0.439077 0.658254 0.3610814 0.2501964 0.610416 0.6842673 2.8749874 0.6717734 16.374299 1.36295 8.725470 0.258386 0.214470 3.573125 2.086027 13.44347 4.007794 0.0338254 1.595552 0.9645215 1.913151 5.780085 0.7098401 5.242844 1.115879 0.6499524 0.5397263 0.2299409 0.7178614 0.535170 0.282374 0.2003997 2.202211 0.310417 2.0991998 0.215329 0.1962535 25.010318 2.970828 1.461817 2.6506093 0.6501276 0.9633052 8.715446 32.4636380 4.281851 0.83041 0.9840191 2.363019 0.314493 1.179129 0.656344 1.57062 0.4597666 0.4707508 4.727155

Dispersion

Using negative binomial models requires gene dispersion estimates to be made. This can be achieved in a number of ways. A common way to calculate this for gene i is to use the equation:

Dispersioni = (variancei - meani)/meani2

This can be calculated using:

disp <- apply(tpm, 1, function(x){
  (var(x, na.rm=TRUE)-mean(x, na.rm=TRUE))/(mean(x, na.rm=TRUE)**2)
  })

head(disp)
##     MS4A1    ADAM12 IGHV7-4-1  IGHV3-49  IGHV3-23  ADAM12.1 
##  3.789428  2.391912 11.686420 10.863156  3.262557  2.391912

Alternatively, we recommend using edgeR to estimate of the common, trended and tagwise dispersions across all tags:

disp  <- setNames(edgeR::estimateDisp(tpm)$tagwise.dispersion, rownames(tpm))

head(disp)

or with DESeq2 using the raw counts:

dds <- DESeqDataSetFromTximport(txi = txi, colData = metadata, design = ~ 1)
dds <- DESeq(dds)
dispersions <- setNames(dispersions(dds), rownames(txi$counts))

Size Factors

There is also an option to include size factors for each gene. Again this can be estimated using:

sizeFactors <- colSums(tpm)  
sizeFactors <- sizeFactors / mean(sizeFactors)  # normalise to mean = 1

head(sizeFactors)
##      S9001      S9002      S9003      S9004      S9006      S9007 
## 0.20325747 0.09330435 0.09737100 3.13456671 0.24515015 4.19419297

Or using edgeR these can be calculated from the raw read counts:

sizeFactors <- calcNormFactors(counts, method="TMM")

Similarly, with DESeq2:

sizeFactors <- estimateSizeFactorsForMatrix(counts)

Note the sizeFactors vector needs to be centred around 1, since it used directly as an offset of form log(sizeFactors) in the GLMM model.

Fitting Models

To fit a model for one gene over time we use a formula such as:

gene expression ~ fixed effects + random effects

In R the formula is defined by both the fixed-effects and random-effects part of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. Random-effects terms are distinguished by vertical bars (“|”) separating expressions for design matrices from grouping factors. For more information see the ?lme4::glmer.

In this case study we want to use time and response as fixed effects and the patients as random effects:

gene expression ~ time + response + (1 | patient)

To fit this model for all genes we can use the glmmSeq function. Note that this analysis can take some time, with 4 cores:

  • 1000 genes takes about 10 seconds
  • 20000 genes takes about 4 mins
results <- glmmSeq(~ Timepoint * EULAR_6m + (1 | PATID),
                   countdata = tpm,
                   metadata = metadata,
                   dispersion = disp,
                   progress = TRUE)
## 
## n = 123 samples, 82 individuals
## Time difference of 3.152688 secs
## Errors in 4 gene(s): IL12A, FGF12, VIL1, IL26

or alternatively using two-factor classification with EULAR_binary:

results2 <- glmmSeq(~ Timepoint * EULAR_binary + (1 | PATID),
                    countdata = tpm,
                    metadata = metadata,
                    dispersion = disp)
## 
## n = 123 samples, 82 individuals
## Time difference of 3.016724 secs
## Errors in 4 gene(s): IL12A, FGF12, VIL1, IL26

Outputs

This creates a GlmmSeq object which contains the following slots:

names(attributes(results))
##  [1] "info"      "formula"   "stats"     "predict"   "reduced"   "countdata" "metadata" 
##  [8] "modelData" "optInfo"   "errors"    "vars"      "class"

The variables used by the model are in the @modeldata:

kable(results@modelData) %>% kable_styling()
Timepoint EULAR_6m
0 Good
6 Good
0 Moderate
6 Moderate
0 Non-response
6 Non-response

The model fit statistics can be viewed in the @stats slot which is a list of items including fitted model coefficients, their standard errors and the results of statistical tests on terms within the model using Wald type 2 Chi square. To see the most significant interactions we can order pvals by Timepoint:EULAR_6m:

stats <- summary(results)

kable(stats[order(stats[, 'P_Timepoint:EULAR_6m']), ]) %>%
  kable_styling() %>%
  scroll_box(width = "100%", height = "400px")
Dispersion AIC logLik meanExp (Intercept) Timepoint EULAR_6mModerate EULAR_6mNon-response Timepoint:EULAR_6mModerate Timepoint:EULAR_6mNon-response se_(Intercept) se_Timepoint se_EULAR_6mModerate se_EULAR_6mNon-response se_Timepoint:EULAR_6mModerate se_Timepoint:EULAR_6mNon-response Chisq_Timepoint Chisq_EULAR_6m Chisq_Timepoint:EULAR_6m Df_Timepoint Df_EULAR_6m Df_Timepoint:EULAR_6m P_Timepoint P_EULAR_6m P_Timepoint:EULAR_6m
IGHV3-23 3.2625571 1587.9550 -785.9775 5.9461793 6.8466473 -0.0213958 0.1369681 -0.8391523 -0.3010405 -0.4047369 0.4783325 0.0906407 0.5134806 0.6320099 0.1277461 0.1624190 7.6175338 14.2631824 8.8418654 1 2 2 0.0057803 0.0007994 0.0120230
CXCL13 4.4416343 953.3999 -468.7000 2.8488723 3.5376946 -0.0405699 0.4921167 -1.4651369 -0.3181048 0.0958172 0.3701902 0.0971787 0.5619716 0.7302593 0.1464648 0.1781329 4.2900227 4.3648596 6.7344833 1 2 2 0.0383367 0.1127672 0.0344846
FGF14 0.2915518 375.0390 -179.5195 1.0784074 -0.0846561 0.1148791 -0.0436742 0.9233238 -0.0035230 -0.1731267 0.2014575 0.0467465 0.3090738 0.3176555 0.0711392 0.0783922 5.5445597 5.0815197 5.6637042 1 2 2 0.0185382 0.0788065 0.0589037
IL24 0.4177215 376.8418 -180.4209 1.0076111 0.3478977 -0.0321854 0.2217562 -0.7615200 -0.1166353 0.1298070 0.1978093 0.0500220 0.2721124 0.4427622 0.0793379 0.1003295 1.9439310 1.0333073 5.5708454 1 2 2 0.1632423 0.5965133 0.0617030
HHIP 1.0353883 537.1812 -260.5906 1.4454128 0.7998605 -0.0399783 -0.1533630 0.5663556 0.1881176 0.0332220 0.2089737 0.0569350 0.3219344 0.3888552 0.0826903 0.0975405 1.0348345 4.7912317 5.5680271 1 2 2 0.3090259 0.0911165 0.0617900
MS4A1 3.7894278 898.6829 -441.3414 2.5917399 2.7363666 0.0490839 0.1933006 -1.7237593 -0.2107974 0.1746154 0.3366868 0.0894115 0.5110345 0.6778010 0.1347788 0.1655778 0.0112466 5.6828754 5.4378747 1 2 2 0.9155427 0.0583417 0.0659448
ADAM12 2.3919119 635.7162 -309.8581 1.6675082 1.6569079 -0.1236820 -0.2683597 -0.5047247 -0.0347932 0.2699943 0.2756082 0.0750725 0.4213674 0.5491941 0.1146527 0.1351883 2.5418570 2.3034252 5.2104307 1 2 2 0.1108643 0.3160950 0.0738872
ADAM12.1 2.3919119 635.7162 -309.8581 1.6675082 1.6569079 -0.1236820 -0.2683597 -0.5047247 -0.0347932 0.2699943 0.2756082 0.0750725 0.4213674 0.5491941 0.1146527 0.1351883 2.5418570 2.3034252 5.2104307 1 2 2 0.1108643 0.3160950 0.0738872
IL2RG 0.8332758 1077.5836 -530.7918 4.3589131 3.4913720 -0.0544559 0.2592216 -0.6099765 -0.1005384 0.0741224 0.1768209 0.0432425 0.2498525 0.3251603 0.0650989 0.0795735 6.8219047 3.0208070 4.9773656 1 2 2 0.0090046 0.2208209 0.0830192
IL16 0.2419416 974.1912 -479.0956 4.5153510 3.2821832 -0.0163058 -0.1351425 -0.3377962 0.0421528 0.0983179 0.0936429 0.0242321 0.1389855 0.1814160 0.0364868 0.0447041 1.0986627 0.2976437 4.9709614 1 2 2 0.2945598 0.8617226 0.0832855
EMILIN3 2.8558506 522.1530 -253.0765 1.2381694 1.0154229 0.1211976 -2.0741731 -0.8821688 0.2905788 0.0936821 0.3076528 0.0803498 0.5615745 0.6368489 0.1317965 0.1513721 15.7455688 9.3862145 4.8709355 1 2 2 0.0000725 0.0091582 0.0875568
BLK 1.0360018 532.3068 -258.1534 1.4610649 0.9405211 0.0152304 0.2889931 -0.6186534 -0.1477598 0.0407172 0.2366609 0.0545010 0.3164350 0.4438540 0.0838780 0.1061424 0.6098286 2.2049350 4.1711539 1 2 2 0.4348524 0.3320507 0.1242354
IGHV1-69 6.1631087 1150.4929 -567.2465 3.6446738 4.5052838 0.0254377 1.0678755 0.3278992 -0.1788186 -0.3840284 0.4260744 0.1132977 0.6470204 0.8341665 0.1701410 0.2070231 2.1903776 4.5563275 3.6032048 1 2 2 0.1388753 0.1024722 0.1650342
PADI4 2.0736630 472.8051 -228.4025 1.0912063 0.2320863 0.1335275 0.3673813 0.1562910 -0.2084779 -0.1075547 0.2903619 0.0735494 0.4303320 0.5600252 0.1128728 0.1359462 0.6800223 0.2066687 3.4338579 1 2 2 0.4095790 0.9018254 0.1796169
LILRA5 0.8432028 787.5680 -385.7840 2.8996321 2.1304753 -0.0337206 0.3077720 -0.0538474 -0.0772630 0.0741083 0.1682070 0.0451020