Parameterization of Response Distributions in brms

Paul Bürkner

2017-02-17

The purpose of this vignette is to discuss the parameterizations of the families (i.e., response distributions) used in brms. For a more general overview of the package see vignette("brms_overview").

Notation

Throughout this vignette, we use \(y\) to refer to the response variable and \(\eta\) to refer to the (non-)linear predictor (see help(brmsformula) for details on supported predictor terms). We write \(y_i\) and \(\eta_i\) for the response and linear predictor of observation \(i\). Furthermore, we use \(f\) for a density function and \(g\) for an inverse link function.

Location shift models

The density of the gaussian family is given by \[f(y_i) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{1}{2}\left(\frac{y_i - g(\eta_i)}{\sigma}\right)^2\right)\] where \(\sigma\) is the residual standard deviation. The density of the student family is given by \[f(y_i) = \frac{\Gamma((\nu + 1)/2)}{\Gamma(\nu/2)} \frac{1}{\sqrt{\nu\pi}\sigma} \left(1 + \frac{1}{\nu} \left(\frac{y_i - g(\eta_i)}{\sigma}\right)^2\right)^{-(\nu+1)/2}\] \(\Gamma\) denotes the gamma function and \(\nu > 1\) are the degrees of freedom. As \(\nu \rightarrow \infty\), the student distribution becomes the gaussian distribution. For location shift models, \(y_i\) can be any real value.

Binary and count data models

The density of the binomial family is given by \[f(y_i) = {N_i \choose y_i} g(\eta_i)^{y_i} (1-g(\eta_i))^{N_i - y_i}\] where \(N_i\) is the number of trials and \(y_i \in \{0, ... , N_i\}\). When all \(N_i\) are \(1\) (i.e., \(y_i \in \{0,1\}\)), the bernoulli distribution for binary data arises. binomial and bernoulli families are distinguished in brms as the bernoulli distribution has its own implementation in Stan that is computationlly more efficient.

For \(y_i \in \mathbb{N}_0\), the density of the poisson family is given by \[f(y_i) = \frac{g(\eta_i)^{y_i}}{y_i!} exp(-g(\eta_i))\] The density of the negative binomial family is \[f(y_i) = {y_i + \phi - 1 \choose y_i} \left(\frac{g(\eta_i)}{g(\eta_i) + \phi}\right)^{y_i} \left(\frac{\phi}{g(\eta_i) + \phi}\right)^\phi\] where \(\phi\) is a positive precision parameter. For \(\phi \rightarrow \infty\), the negative binomial distribution becomes the poisson distribution. The density of the geometric family arises if \(\phi\) is set to \(1\).

Survival models

With survival models we mean all models that are defined on the positive reals only, that is \(y_i \in \mathbb{R}^+\). The density of the lognormal family is given by \[f(y_i) = \frac{1}{\sqrt{2\pi}\sigma x} \exp\left(-\frac{1}{2}\left(\frac{\log(y_i) - g(\eta_i)}{\sigma}\right)^2\right)\] where \(\sigma\) is the residual standard deviation on the log-scale. The density of the Gamma family is given by \[f(y_i) = \frac{(\alpha / g(\eta_i))^\alpha}{\Gamma(\alpha)} y_i^{\alpha-1} \exp\left(-\frac{\alpha y_i}{g(\eta_i)}\right)\] where \(\alpha\) is a positive shape parameter. The density of the weibull family is given by \[f(y_i) = \frac{\alpha}{g(\eta_i / \alpha)} \left(\frac{y_i}{g(\eta_i / \alpha)}\right)^{\alpha-1} \exp\left(-\left(\frac{y_i}{g(\eta_i / \alpha)}\right)^\alpha\right)\] where \(\alpha\) is again a positive shape parameter. The exponential family arises if \(\alpha\) is set to \(1\) for either the gamma or Weibull distribution.

Extreme value models

Modeling extremes requires special distributions. One may use the weibull distribution (see above) or the frechet distribution with density \[ f(y_i) = \frac{\nu}{s_i} \left(\frac{y_i}{s_i}\right)^{-1-\nu} \exp\left(-\left(\frac{y_i}{s_i}\right)^{-\nu}\right) \] where \(s_i = g(\eta_i) / \Gamma(1 - 1 / \nu)\) is a positive scale parameter and \(\nu > 1\) is a shape parameter so that \(g(\eta_i)\) predicts the mean of the Frechet distribution. A generalization of both distributions is the generalized extreme value distribution (family gen_extreme_value) with density \[ f(y_i) = \frac{1}{\sigma} t(y_i)^{-1 - 1 / \xi} \exp(-t(y_i)) \] where \[ t(y_i) = \left(1 + \xi \left(\frac{y_i - g(\eta_i)}{\sigma} \right)\right)^{-1 / \xi} \] with positive scale parameter \(\sigma\) and shape parameter \(\xi\).

Response time models

One family that is especially suited to model reaction times is the exgaussian (‘exponentially modified Gaussian’) family. Its density is given by

\[f(y_i) = \frac{1}{2 \beta} \exp\left(\frac{1}{2 \beta} \left(2g(\eta_i) + \sigma^2 / \beta - 2 y_i \right) \right) \text{erfc}\left(\frac{g(\eta_i) + \sigma^2 / \beta - y_i}{\sqrt{2} \sigma} \right)\]

where \(\beta\) is the scale (inverse rate) of the exponential component, \(g(\eta_i) = \mu\) is the mean of the Gaussian componenent, \(\sigma\) is the standard deviation of the Gaussian component, and \(\text{erfc}\) is the complementary error function.

A family concerned with the combined modelling of reaction times and corresponding binary responses is the wiener diffusion model. It has four model parameters each with a natural interpreation. The parameter \(\alpha > 0\) describes the separation between two boundaries of the diffusion process, \(\tau > 0\) describes the non-decision time (e.g., due to image or motor processing), \(\beta \in [0, 1]\) describes the initial bias in favor of the upper alternative, and \(\delta \in \mathbb{R}\) describes the drift rate to the boundaries (a positive value indicates a drift towards to upper boundary). The density for the reaction time at the upper boundary is given by

\[f(y_i) = \frac{\alpha}{(y_i-\tau)^3/2} \exp \! \left(- \delta \alpha \beta - \frac{\delta^2(y_i-\tau)}{2}\right) \sum_{k = - \infty}^{\infty} (2k + \beta) \phi \! \left(\frac{2k + \alpha \beta}{\sqrt{y_i - \tau}}\right)\]

where \(\phi(x)\) denotes the standard normal density function. The density at the lower boundary can be obtained by substituting \(1 - \beta\) for \(\beta\) and \(-\delta\) for \(\delta\) in the above equation. In brms the parameters \(\alpha\), \(\tau\), and \(\beta\) are modeled as auxiliary parameters named bs (‘boundary separation’), ndt (‘non-decision time’), and bias respectively, whereas the drift rate \(\delta\) is modeled via the ordinary model formula that is as \(\delta = g(\eta_i)\).

Quantile regression

Quantile regression is implemented via family asym_laplace (asymmetric Laplace distribution) with density

\[f(y_i) = \frac{p (1 - p)}{\sigma} \exp\left(-\rho_p\left(\frac{y_i - g(\eta_i)}{\sigma}\right)\right)\] where \(\rho_p\) is given by \(\rho_p(x) = x (p - I_{x < 0})\) and \(I_A\) is the indicator function of set \(A\). The parameter \(\sigma\) is a positive scale parameter and \(p\) is the quantile parameter taking on values in \((0, 1)\). For this distribution, we have \(P(Y < g(\eta)) = p\). Thus, quantile regression can be performed by fixing \(p\) to the quantile to interest.

Beta models

The density of the Beta family for \(y_i \in (0,1)\) is given by \[f(y_i) = \frac{y_i^{g(\eta_i) \phi - 1} (1-y_i)^{(1-g(\eta_i)) \phi)-1}}{B(g(\eta_i) \phi, (1-g(\eta_i)) \phi)}\] where \(B\) is the beta function and \(\phi\) is a positive precision parameter.

Circular models

The density of the von_mises family for \(y_i \in (-\pi,\pi)\) is given by \[f(y_i) = \frac{\exp(\kappa \cos(y_i - g(\eta_i)))}{2\pi I_0(\kappa)}\] where \(I_0\) is the modified Bessel function of order 0 and \(\kappa\) is a positive precision parameter.

Ordinal and categorical models

For ordinal and categorical models, \(y_i\) is one of the categories \(1, ..., K\). The intercepts of ordinal models are called thresholds and are denoted as \(\tau_k\), with \(k \in \{1, ..., K-1\}\), whereas \(\eta\) does not contain a fixed effects intercept. Note that the applied link functions \(h\) are technically distribution functions \(\mathbb{R} \rightarrow [0,1]\). The density of the cumulative family (implementing the most basic ordinal model) is given by \[f(y_i) = g(\tau_{y_i + 1} - \eta_i) - g(\tau_{y_i} - \eta_i)\] The densities of the sratio (stopping ratio) and cratio (continuation ratio) families are given by \[f(y_i) = g(\tau_{y_i + 1} - \eta_i) \prod_{k = 1}^{y_i} (1 - g(\tau_{k} - \eta_i))\] and \[f(y_i) = (1 - g(\eta_i - \tau_{y_i + 1})) \prod_{k = 1}^{y_i} g(\eta_i - \tau_{k})\] respectively. Note that both families are equivalent for symmetric link functions such as logit or probit. The density of the acat (adjacent category) family is given by \[f(y_i) = \frac{\prod_{k=1}^{y_i} g(\eta_i - \tau_{k}) \prod_{k=y_i+1}^K(1-g(\eta_i - \tau_{k}))}{\sum_{k=0}^K\prod_{j=1}^k g(\eta_i-\tau_{j}) \prod_{j=k+1}^K(1-g(\eta_i - \tau_{j}))}\] For the logit link, this can be simplified to \[f(y_i) = \frac{\exp \left(\sum_{k=1}^{y_i} (\eta_i - \tau_{k}) \right)} {\sum_{k=0}^K \exp\left(\sum_{j=1}^k (\eta_i - \tau_{j}) \right)}\] The linear predictor \(\eta\) can be generalized to also depend on the category \(k\) for a subset of predictors. This leads to so called category specific effects (for details on how to specify them see help(brm)). Note that cumulative and sratio models use \(\tau - \eta\), whereas cratio and acat use \(\eta - \tau\). This is done to ensure that larger values of \(\eta\) increase the probability of higher reponse categories.

The categorical family is currently only implemented with the logit link function and has density \[f(y_i) = \frac{\exp(\eta_{iy_i})}{\sum_{k = 1}^{K} \exp(\eta_{ik})}\] Note that \(\eta\) does also depend on the category \(k\). For reasons of identifiability, \(\eta_{i1}\) is set to \(0\).

Zero-inflated and hurdle models

Zero-inflated and hurdle families extend existing families by adding special processes for responses that are zero. The densitiy of a zero-inflated family is given by \[f_z(y_i) = g_z(\eta_{zi}) + (1 - g_z(\eta_{zi})) f(y_i) \quad \text{if } y_i = 0 \\ f_z(y_i) = (1 - g_z(\eta_{zi}) f(y_i) \quad \text{if } y_i > 0\] The zero-inflation part has its own linear predictor \(\eta_{zi}\) combined with the logit link \(g_z\). Currently implemented families are zero_inflated_poisson, zero_inflated_binomial, zero_inflated_negbinomial, and zero_inflated_beta. The density of a hurdle family is given by \[f_z(y_i) = g_z(\eta_{zi}) \quad \text{if } y_i = 0 \\ f_z(y_i) = (1 - g_z(\eta_{zi})) f(y_i) / (1 - f(0)) \quad \text{if } y_i > 0\] Currently implemented families are hurdle_poisson, hurdle_negbinomial, hurdle_gamma, and hurdle_lognormal. Both linear predictors (i.e., \(\eta\) and \(\eta_z\)) can be specified within the formula argument. See help(brmsformula) for instructions on how to do that.