
```
library("lme4")
```

In general `lme4`’s algorithms scale reasonably well with the number of observations and the number of random-effect levels. The biggest bottleneck is the number of *top-level parameters*, i.e. covariance parameters for `lmer` fits or `glmer` fits with `nAGQ=0` [`length(getME(model, "theta"))`], or covariance plus fixed-effect parameters for `glmer` fits with `nAGQ>0`. `lme4` does a derivative-free (by default) nonlinear optimization step over the top-level parameters.

For this reason, “maximal” models involving interactions of factors with several levels each (e.g. `(stimulus*primer | subject)`) will be slow, as well as hard to estimate: if the two factors have `f1` and `f2` levels respectively, the corresponding `lmer` fit will need to estimate `(f1*f2)*(f1*f2+1)/2` top-level parameters.
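As a quick illustration of how fast this count grows, here is a small helper (not part of `lme4`) that computes it from the factor sizes:

```r
## illustrative helper (not an lme4 function): number of top-level
## covariance parameters when the random-effects term has k = f1*f2 columns
n_top_params <- function(f1, f2) {
  k <- f1 * f2
  k * (k + 1) / 2  ## free elements of a k x k covariance matrix
}
n_top_params(2, 3)  ## 21
n_top_params(4, 5)  ## 210
```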

`lme4` automatically constructs the random-effects model matrix (\(Z\)) as a sparse matrix. At present it does *not* allow an option for a sparse fixed-effects model matrix (\(X\)), which would be useful when the fixed-effect model includes factors with many levels. Treating such a factor as a random effect instead, and using the modular framework (`?modular`) to fix the variance of this random effect at a large value, allows it to be modeled using a sparse matrix. (The estimates converge to the fixed-effect case in the limit as the variance goes to infinity.)
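The modular pipeline itself looks like the sketch below (using the built-in `sleepstudy` data; these are the steps documented in `?modular`). To fix a variance component at a large value, you would replace the `optimizeLmer()` step by evaluating the deviance function at a fixed `theta` of your choosing.

```r
library(lme4)
## the modular steps from ?modular:
lmod   <- lFormula(Reaction ~ Days + (Days | Subject), sleepstudy)
devfun <- do.call(mkLmerDevfun, lmod)   ## deviance as a function of theta
opt    <- optimizeLmer(devfun)          ## the nonlinear optimization step
fit    <- mkMerMod(environment(devfun), opt, lmod$reTrms, fr = lmod$fr)
```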

## `calc.derivs = FALSE`

After finding the best-fit model parameters (in most cases using *derivative-free* algorithms such as Powell’s BOBYQA or Nelder-Mead), `[g]lmer` does a series of finite-difference calculations to estimate the gradient and Hessian at the MLE. These are used to try to establish whether the model has converged reliably, and (for `glmer`) to estimate the standard deviations of the fixed-effect parameters (a less accurate approximation is used if the Hessian estimate is not available). As currently implemented, this computation takes `2*n^2 - n + 1` additional evaluations of the deviance, where `n` is the total number of top-level parameters. Using `control = [g]lmerControl(calc.derivs = FALSE)` to turn off this calculation can speed up the fit, e.g.

```
m0 <- lmer(y ~ service * dept + (1|s) + (1|d), InstEval,
           control = lmerControl(calc.derivs = FALSE))
```

Benchmark results for this fit with and without derivatives show an approximately 20% speedup (from 54 to 43 seconds on a Linux machine with AMD Ryzen 9 2.2 GHz processors). This is a case with only 2 top-level parameters, but the fit took only 31 deviance function evaluations (see `m0@optinfo$feval`) to converge, so the effect of the additional 7 (\(2n^2 - n + 1\)) function evaluations is noticeable.
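For the record, the arithmetic for this example:

```r
n <- 2           ## top-level parameters in the InstEval fit above
2 * n^2 - n + 1  ## 7 extra deviance evaluations for the derivative step
```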

`lmer` uses the “nloptwrap” optimizer by default; `glmer` uses a combination of bobyqa (for the `nAGQ=0` stage) and Nelder_Mead. These are reasonably good choices, although switching `glmer` fits to `nloptwrap` for both stages may be worth a try.
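For example, a sketch using the built-in `cbpp` data (whether this actually helps is problem-dependent):

```r
library(lme4)
## use nloptwrap for both stages of the glmer fit
gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial,
            control = glmerControl(optimizer = "nloptwrap"))
```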

`allFit()` gives an easy way to check the timings of a large range of optimizers:

| optimizer | elapsed |
|---|---|
| bobyqa | 51.466 |
| nloptwrap.NLOPT_LN_BOBYQA | 53.432 |
| nlminbwrap | 66.236 |
| nloptwrap.NLOPT_LN_NELDERMEAD | 90.780 |
| nmkbw | 94.727 |
| Nelder_Mead | 99.828 |
| optimx.L-BFGS-B | 117.965 |

As expected, the bobyqa implementations, both the one in the `minqa` package [`[g]lmerControl(optimizer="bobyqa")`] and the one in `nloptwrap` [`optimizer="nloptwrap"` or `optimizer="nloptwrap", optCtrl = list(algorithm = "NLOPT_LN_BOBYQA")`], are fastest.
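A minimal `allFit()` sketch on the built-in `sleepstudy` data (some optimizers require the `optimx` and `dfoptim` packages; unavailable ones are skipped):

```r
library(lme4)
m  <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
aa <- allFit(m)   ## refit with all available optimizers
ss <- summary(aa)
ss$times          ## per-optimizer timings
```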

Occasionally, the default optimizer stopping tolerances are unnecessarily strict. These tolerances are specific to each optimizer, and can be set via the `optCtrl` argument in `[g]lmerControl`. To see the defaults for `nloptwrap`:

```
environment(nloptwrap)$defaultControl
```

```
## $algorithm
## [1] "NLOPT_LN_BOBYQA"
##
## $xtol_abs
## [1] 1e-08
##
## $ftol_abs
## [1] 1e-08
##
## $maxeval
## [1] 1e+05
```

In the particular case of the `InstEval` example, this doesn’t help much: loosening the tolerances to `ftol_abs=1e-4`, `xtol_abs=1e-4` saves only 2 function evaluations and a few seconds, while loosening the tolerances still further gives convergence warnings.
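Concretely, the tolerances can be loosened like this (same `InstEval` model as above):

```r
library(lme4)
m1 <- lmer(y ~ service * dept + (1 | s) + (1 | d), InstEval,
           control = lmerControl(calc.derivs = FALSE,
                                 optCtrl = list(ftol_abs = 1e-4,
                                                xtol_abs = 1e-4)))
```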

There are not many options for parallelizing `lme4`. Optimized BLAS does not seem to help much.

- `glmmTMB` may be faster than `lme4` for GLMMs with large numbers of top-level parameters, especially for negative binomial models (i.e. compared to `glmer.nb`).
- the `MixedModels.jl` package in Julia may be *much* faster for some problems. You do need to install Julia.
  - see this short tutorial or this example (Jupyter notebook)
  - the `JellyMe4` and `jglmm` packages provide R interfaces