This vignettes demontrates those functions of the *sjstats*-package that deal especially with Bayesian models. *sjstats* provides following functions:

`hdi()`

`rope()`

`mcse()`

`n_eff()`

`tidy_stan()`

`equi_test()`

`mediation()`

`icc()`

`r2()`

Befor we start, we fit some models, including a mediation-object from the *mediation*-package, which we use for comparison with *brms*. The functions work with `brmsfit`

, `stanreg`

and `stanfit`

-objects.

```
library(sjstats)
library(sjmisc)
library(mediation)
library(brms)
# load sample data
data(jobs)
data(efc)
data(fish)
efc <- to_factor(efc, e42dep, c172code, c161sex, e15relat)
set.seed(123)
# linear models, for mediation analysis
b1 <- lm(job_seek ~ treat + econ_hard + sex + age, data = jobs)
b2 <- lm(depress2 ~ treat + job_seek + econ_hard + sex + age, data = jobs)
# mediation analysis, for comparison with brms
m1 <- mediate(b1, b2, sims = 1000, treat = "treat", mediator = "job_seek")
# fit Bayesian models
f1 <- bf(job_seek ~ treat + econ_hard + sex + age)
f2 <- bf(depress2 ~ treat + job_seek + econ_hard + sex + age)
m2 <- brm(f1 + f2 + set_rescor(FALSE), data = jobs, cores = 4)
m3 <- brm(
bf(count ~ child + camper + (1 | persons),
zi ~ child + camper + (1 | persons)),
data = fish,
family = zero_inflated_poisson(),
cores = 4
)
m4 <- brm(
mpg ~ wt + hp + (1 | cyl) + (1 + wt | gear),
data = mtcars,
cores = 4
)
m5 <- brm(
neg_c_7 ~ e42dep + c12hour + c172code + (1 | e15relat),
data = efc,
cores = 4
)
```

`hdi()`

computes the highest density interval for posterior samples. Unlike equal-tailed intervals that exclude 2.5% from each tail of the distribution, the HDI is *not* equal-tailed and therefor always includes the mode(s) of posterior distributions.

By default, `hdi()`

prints the 90% intervals, however, the `prob`

-argument can be used to calculate different or even multiple intervals.

```
hdi(m2)
#>
#> # Highest Density Interval
#>
#> HDI(90%)
#> b_jobseek_Intercept [ 3.47 3.88]
#> b_depress2_Intercept [ 1.97 2.46]
#> b_jobseek_treat [-0.02 0.15]
#> b_jobseek_econ_hard [ 0.01 0.09]
#> b_jobseek_sex [-0.09 0.07]
#> b_jobseek_age [ 0.00 0.01]
#> b_depress2_treat [-0.11 0.03]
#> b_depress2_job_seek [-0.29 -0.19]
#> b_depress2_econ_hard [ 0.12 0.18]
#> b_depress2_sex [ 0.04 0.17]
#> b_depress2_age [-0.00 0.00]
#> sigma_jobseek [ 0.70 0.76]
#> sigma_depress2 [ 0.59 0.64]
hdi(m2, prob = c(.5, .89))
#>
#> # Highest Density Interval
#>
#> HDI(50%) HDI(89%)
#> b_jobseek_Intercept [ 3.59 3.76] [ 3.47 3.86]
#> b_depress2_Intercept [ 2.13 2.32] [ 1.97 2.44]
#> b_jobseek_treat [ 0.03 0.10] [-0.02 0.15]
#> b_jobseek_econ_hard [ 0.04 0.07] [ 0.02 0.09]
#> b_jobseek_sex [-0.04 0.03] [-0.09 0.07]
#> b_jobseek_age [ 0.00 0.01] [ 0.00 0.01]
#> b_depress2_treat [-0.08 -0.02] [-0.11 0.03]
#> b_depress2_job_seek [-0.26 -0.22] [-0.29 -0.19]
#> b_depress2_econ_hard [ 0.13 0.16] [ 0.11 0.18]
#> b_depress2_sex [ 0.08 0.13] [ 0.04 0.17]
#> b_depress2_age [-0.00 0.00] [-0.00 0.00]
#> sigma_jobseek [ 0.71 0.74] [ 0.70 0.75]
#> sigma_depress2 [ 0.60 0.62] [ 0.59 0.64]
```

For multilevel models, the `type`

-argument defines whether the HDI of fixed, random or all effects are shown.

```
hdi(m5, type = "random")
#>
#> # Highest Density Interval
#>
#> HDI(90%)
#> r_e15relat.1.Intercept. [-0.12 1.31]
#> r_e15relat.2.Intercept. [-0.16 1.00]
#> r_e15relat.3.Intercept. [-0.84 0.77]
#> r_e15relat.4.Intercept. [-0.57 0.78]
#> r_e15relat.5.Intercept. [-0.76 0.87]
#> r_e15relat.6.Intercept. [-1.61 0.33]
#> r_e15relat.7.Intercept. [-1.28 0.76]
#> r_e15relat.8.Intercept. [-0.84 0.49]
```

The computation for the HDI is based on the code from *Kruschke 2015, pp. 727f*. For default sampling in Stan (4000 samples), the 90% intervals for HDI are more stable than, for instance, 95% intervals. An effective sample size of at least 10.000 is recommended if 95% intervals should be computed (see *Kruschke 2015, p. 183ff*).

Unlike a frequentist approach, Bayesian inference is not based on stastical significance, where effects need to be different from “zero”. Rather, the magnitude of a model’s parameter value and its uncertainty should not be ignored, and hence, an effect is not present when it simply differs from zero, but if it’s outside a specific range that can be considered as “practically no effect”. This range is called the *region of practical equivalence* (ROPE).

`rope()`

requires the `rope`

-argument, which defined this region, and then gives a summary about the parameters and their proportion that lies inside and outside this ROPE.

```
rope(m5, rope = c(-1, 1))
#>
#> # Proportions of samples inside and outside the ROPE
#>
#> inside outside
#> b_Intercept 0.0% 100.0%
#> b_e42dep2 43.0% 57.0%
#> b_e42dep3 0.6% 99.5%
#> b_e42dep4 0.0% 100.0%
#> b_c12hour 100.0% 0.0%
#> b_c172code2 99.7% 0.3%
#> b_c172code3 79.0% 21.0%
#> sigma 0.0% 100.0%
```

`rope()`

does not suggest limits for the region of practical equivalence and does not tell you how big is practically equivalent to the null value. However, there are suggestions how to choose reasonable limits (see *Kruschke 2018*), which are implemented in the `equi_test()`

functions.

`equi_test()`

combines the two functions `hdi()`

and `rope()`

and performs a “HDI+ROPE decision rule” (Test for Practical Equivalence) (*Kruschke 2018*) to check whether parameter values should be accepted or rejected against the background of a formulated null hypothesis.

`equi_test()`

computes the 95%-HDI and checks if a model predictor’s HDI lies completely outside, completely inside or partially inside the ROPE. If the HDI is completely outside the ROPE, the “null hypothesis” for this parameter is “rejected”. If the ROPE completely covers the HDI, i.e. all most credible values of a parameter are inside the region of practical equivalence, the null hypothesis is accepted. Else, it’s undecided whether to accept or reject the null hypothesis. In short, desirable results are low proportions inside the ROPE (the closer to zero the better) and the H0 should be rejected.

If neither the `rope`

nor `eff_size`

argument are specified, the effect size will be set to 0.1 (half of a small effect according to Cohen) and the ROPE is then `0 +/- .1 * sd(y)`

for linear models. This is the suggested way to specify the ROPE limits according to *Kruschke* (2018).

```
equi_test(m5)
#>
#> # Test for Practical Equivalence of Model Predictors
#>
#> Effect Size: 0.10
#> ROPE: [-0.39 0.39]
#> Samples: 4000
#>
#> H0 %inROPE HDI(95%)
#> b_Intercept (*) reject 0.00 [ 7.58 9.93]
#> b_e42dep2 (*) undecided 7.47 [-0.01 2.05]
#> b_e42dep3 (*) reject 0.00 [ 1.30 3.25]
#> b_e42dep4 (*) reject 0.00 [ 2.77 4.91]
#> b_c12hour accept 100.00 [ 0.00 0.01]
#> b_c172code2 (*) undecided 75.93 [-0.44 0.77]
#> b_c172code3 (*) undecided 20.50 [-0.07 1.44]
#> sigma reject 0.00 [ 3.42 3.76]
#>
#> (*) the number of effective samples may be insufficient for some parameters
```

For models with binary outcome, there is no concrete way to derive the effect size that defines the ROPE limits. Two examples from Kruschke suggest that a negligible change is about .05 on the logit-scale. In these cases, it is recommended to specify the `rope`

argument, however, if not specified, the ROPE limits are caluclated as suggested by Kruschke: The effect size is the probability of “success” for the outcome, divided by `pi`

. For all other models, `0 +/- .1 * sd(intercept)`

is used to determine the ROPE limits.

Beside a numerical output, the results can also be printed as HTML-table or plotted, using the `out`

-argument. For plots, the 95% distributions of the posterior samles are shown, the ROPE is a light-blue shaded region in the plot, and the distributions are colored depending on whether the parameter values are accepted, rejected or undecided.

`tidy_stan()`

is no substitute, but rather a convenient alternative to `summary()`

. The major differences are: `tidy_stan()`

…

- focusses on the parameter values (estimates) and gives no information on samples, data, or formula
- calculates the HDI rather than equi-tailed intervals
- separates different model parts, e.g. random from fixed effects, or conditional from zero-inflated models
- and prints everything nicely

```
tidy_stan(m3)
#>
#> # Summary Statistics of Stan-Model
#>
#> ## Conditional Model:
#>
#> estimate std.error HDI(89%) ratio rhat mcse
#> Intercept 1.27 0.69 [-0.24 2.65] 0.24 1 0.03
#> child -1.15 0.09 [-1.29 -0.99] 0.65 1 0.00
#> camper 0.73 0.10 [ 0.59 0.88] 1.00 1 0.00
#>
#> ## Zero-Inflated Model:
#>
#> estimate std.error HDI(89%) ratio rhat mcse
#> Intercept -0.68 0.67 [-1.85 0.50] 0.41 1 0.02
#> child 1.90 0.33 [ 1.40 2.46] 0.80 1 0.01
#> camper -0.85 0.38 [-1.38 -0.25] 0.32 1 0.01
```

Additional statistics in the output are:

- standard errors (which are actually median absolute deviations)
- ratio of effective numbers of samples,
*neff_ratio*, (i.e. effective number of samples divided by total number of samples); this ratio ranges from 0 to 1, and should be close to 1; the closer this ratio comes to zero means that the chains may be inefficient, but possibly still okay - Rhat statistics; when Rhat is above 1, it usually indicates that the chain has not yet converged, indicating that the drawn samples might not be trustworthy; drawing more iteration may solve this issue
- Monte Carlo Standard Error (
*mcse*);

By default, the “estimate” is the median of the posterior distribution, but this can be changed with the `typical`

-argument.

```
tidy_stan(m3, typical = "mean")
#>
#> # Summary Statistics of Stan-Model
#>
#> ## Conditional Model:
#>
#> estimate std.error HDI(89%) ratio rhat mcse
#> Intercept 1.28 0.69 [-0.24 2.65] 0.24 1 0.03
#> child -1.15 0.09 [-1.29 -0.99] 0.65 1 0.00
#> camper 0.73 0.10 [ 0.59 0.88] 1.00 1 0.00
#>
#> ## Zero-Inflated Model:
#>
#> estimate std.error HDI(89%) ratio rhat mcse
#> Intercept -0.67 0.67 [-1.85 0.50] 0.41 1 0.02
#> child 1.89 0.33 [ 1.40 2.46] 0.80 1 0.01
#> camper -0.85 0.38 [-1.38 -0.25] 0.32 1 0.01
```

To also show random effects of multilevel models, use the `type`

-argument.

```
# printing fixed and random effects of multilevel model
tidy_stan(m3, type = "all")
#>
#> # Summary Statistics of Stan-Model
#>
#> ## Conditional Model: Fixed effects
#>
#> estimate std.error HDI(89%) ratio rhat mcse
#> Intercept 1.27 0.69 [-0.24 2.65] 0.24 1 0.03
#> child -1.15 0.09 [-1.29 -0.99] 0.65 1 0.00
#> camper 0.73 0.10 [ 0.59 0.88] 1.00 1 0.00
#>
#> ## Conditional Model: Random effect (Intercept: persons)
#>
#> estimate std.error HDI(89%) ratio rhat mcse
#> persons.1 -1.33 0.73 [-2.75 0.19] 0.24 1 0.03
#> persons.2 -0.36 0.71 [-1.67 1.23] 0.24 1 0.03
#> persons.3 0.35 0.71 [-1.11 1.79] 0.24 1 0.03
#> persons.4 1.25 0.70 [-0.19 2.69] 0.23 1 0.03
#>
#> ## Zero-Inflated Model: Fixed effects
#>
#> estimate std.error HDI(89%) ratio rhat mcse
#> Intercept -0.68 0.67 [-1.85 0.50] 0.41 1 0.02
#> child 1.90 0.33 [ 1.40 2.46] 0.80 1 0.01
#> camper -0.85 0.38 [-1.38 -0.25] 0.32 1 0.01
#>
#> ## Zero-Inflated Model: Random effect (Intercept: persons)
#>
#> estimate std.error HDI(89%) ratio rhat mcse
#> persons.1 1.26 0.72 [ 0.01 2.61] 0.36 1 0.02
#> persons.2 0.34 0.66 [-0.85 1.50] 0.36 1 0.02
#> persons.3 -0.15 0.67 [-1.37 1.02] 0.34 1 0.02
#> persons.4 -1.33 0.75 [-2.55 -0.01] 0.35 1 0.02
```

By default, 89%-HDI are computed (a convention following *McElreath 2015*), but other or even multiple HDI can be computed using the `prob`

argument.

```
# two different HDI for multivariate response model
tidy_stan(m2, prob = c(.5, .95))
#>
#> # Summary Statistics of Stan-Model
#>
#> ## Response: jobseek
#>
#> estimate std.error HDI(50%) HDI(95%) ratio rhat mcse
#> Intercept 3.67 0.13 [ 3.59 3.76] [ 3.41 3.91] 1.73 1 0
#> treat 0.06 0.05 [ 0.03 0.10] [-0.04 0.16] 1.98 1 0
#> econ_hard 0.05 0.02 [ 0.04 0.07] [ 0.00 0.10] 1.65 1 0
#> sex -0.01 0.05 [-0.04 0.03] [-0.10 0.09] 1.82 1 0
#> age 0.00 0.00 [ 0.00 0.01] [-0.00 0.01] 1.84 1 0
#>
#> ## Response: depress2
#>
#> estimate std.error HDI(50%) HDI(95%) ratio rhat mcse
#> Intercept 2.21 0.15 [ 2.13 2.32] [ 1.92 2.50] 1.72 1 0
#> treat -0.04 0.04 [-0.08 -0.02] [-0.13 0.04] 1.66 1 0
#> job_seek -0.24 0.03 [-0.26 -0.22] [-0.29 -0.18] 1.71 1 0
#> econ_hard 0.15 0.02 [ 0.13 0.16] [ 0.11 0.19] 1.97 1 0
#> sex 0.11 0.04 [ 0.08 0.13] [ 0.02 0.18] 1.58 1 0
#> age 0.00 0.00 [-0.00 0.00] [-0.00 0.00] 1.33 1 0
```

`mediation()`

is another summary function, especially for mediation analysis, i.e. for multivariate response models with casual mediation effects.

Let us recall the models:

```
f1 <- bf(job_seek ~ treat + econ_hard + sex + age)
f2 <- bf(depress2 ~ treat + job_seek + econ_hard + sex + age)
m2 <- brm(f1 + f2 + set_rescor(FALSE), data = jobs, cores = 4)
```

Here, *treat* is the treatment effect, *job_seek* is the mediator effect, *f1* describes the mediator model and *f2* describes the outcome model.

`mediation()`

returns a data frame with information on the *direct effect* (median value of posterior samples from treatment of the outcome model), *mediator effect* (median value of posterior samples from mediator of the outcome model), *indirect effect* (median value of the multiplication of the posterior samples from mediator of the outcome model and the posterior samples from treatment of the mediation model) and the *total effect* (median value of sums of posterior samples used for the direct and indirect effect). The *proportion mediated* is the indirect effect divided by the total effect.

The simplest call just needs the model-object.

```
mediation(m2)
#>
#> # Causal Mediation Analysis for Stan Model
#>
#> Treatment: treat
#> Mediator: job_seek
#> Response: depress2
#>
#> Estimate HDI (90%)
#> Direct effect: -0.04 [-0.11 0.03]
#> Indirect effect: -0.02 [-0.04 0.01]
#> Total effect: -0.06 [-0.13 0.02]
#>
#> Proportion mediated: 27.68% [-92.66% 148.02%]
```

Typically, `mediation()`

finds the treatment and mediator variables automatically. If this does not work, use the `treatment`

and `mediator`

arguments to specify the related variable names. For all values, the 90% HDIs are calculated by default. Use `prob`

to calculate a different interval.

Here is a comparison with the *mediation* package. Note that the `summary()`

-output of the *mediation* package shows the indirect effect first, followed by the direct effect.

```
summary(m1)
#>
#> Causal Mediation Analysis
#>
#> Quasi-Bayesian Confidence Intervals
#>
#> Estimate 95% CI Lower 95% CI Upper p-value
#> ACME -0.0157 -0.0387 0.01 0.19
#> ADE -0.0438 -0.1315 0.04 0.35
#> Total Effect -0.0595 -0.1530 0.02 0.21
#> Prop. Mediated 0.2137 -2.0277 2.70 0.32
#>
#> Sample Size Used: 899
#>
#>
#> Simulations: 1000
mediation(m2, prob = .95)
#>
#> # Causal Mediation Analysis for Stan Model
#>
#> Treatment: treat
#> Mediator: job_seek
#> Response: depress2
#>
#> Estimate HDI (95%)
#> Direct effect: -0.04 [-0.13 0.04]
#> Indirect effect: -0.02 [-0.04 0.01]
#> Total effect: -0.06 [-0.15 0.03]
#>
#> Proportion mediated: 27.68% [-225.15% 280.5%]
```

If you want to calculate mean instead of median values from the posterior samples, use the `typical`

-argument. Furthermore, there is a `print()`

-method, which allows to print more digits.

```
mediation(m2, typical = "mean", prob = .95) %>% print(digits = 4)
#>
#> # Causal Mediation Analysis for Stan Model
#>
#> Treatment: treat
#> Mediator: job_seek
#> Response: depress2
#>
#> Estimate HDI (95%)
#> Direct effect: -0.0397 [-0.1284 0.0438]
#> Indirect effect: -0.0155 [-0.0392 0.0101]
#> Total effect: -0.0552 [-0.1502 0.0271]
#>
#> Proportion mediated: 28.0921% [-224.7357% 280.92%]
```

As you can see, the results are similar to what the *mediation* package produces for non-Bayesian models.

Similar to frequentist multilevel models, `icc()`

computes the intraclass correlation coefficient for Bayesian multilevel models. One advantage of Bayesian regression models is that you can compute the ICC for each sample of the posterior distribution, which allows you to easily calculate uncertainty intervals.

```
icc(m4)
#>
#> # Random Effect Variances and ICC
#>
#> Family: gaussian (identity)
#>
#> ## cyl
#> ICC: 0.26 HDI 89%: [0.00 0.71]
#> Between-group: 24.16 HDI 89%: [0.00 43.56]
#>
#> ## gear
#> ICC: 0.51 HDI 89%: [0.00 0.92]
#> Between-group: 45.96 HDI 89%: [0.00 104.20]
#>
#> ## Residuals
#> Within-group: 6.44 HDI 89%: [3.27 8.95]
#>
#> ## Random-slope-variance
#> gear: 7.71 HDI 89%: [0.00 16.89]
icc(m5)
#>
#> # Random Effect Variances and ICC
#>
#> Family: gaussian (identity)
#>
#> ## e15relat
#> ICC: 0.04 HDI 89%: [0.00 0.09]
#> Between-group: 0.55 HDI 89%: [0.00 1.24]
#>
#> ## Residuals
#> Within-group: 12.81 HDI 89%: [11.80 13.82]
```

For non-Gaussian models, there is no clean variance decomposition and hence the ICC can’t be calculated exactly. The general Bayesian way to analyse the random-effect variances is then to draw samples from the posterior predictive distribution, calculate the variances and compare how the variance across models changes when group-specific term are included or dropped.

You can achieve this with the `ppd`

-argument. In this case, draws from the posterior predictive distribution *not conditioned* on group-level terms (using `posterior_predict(..., re.form = NA)`

) as well as draws from this distribution *conditioned* on *all random effects* (by default, unless specified else in the `re.form`

-argument) are taken. Then, the variances for each of these draws are calculated. The “ICC” is then the ratio between these two variances.

```
icc(m4, ppd = TRUE, re.form = ~(1 | cyl), prob = .5)
#>
#> # Random Effect Variances and ICC
#>
#> Family: gaussian (identity)
#> Conditioned on: ~(1 | cyl)
#>
#> ## Variance Ratio (comparable to ICC)
#> Ratio: 0.09 HDI 50%: [-0.05 0.32]
#>
#> ## Variances of Posterior Predicted Distribution
#> Conditioned on fixed effects: 34.73 HDI 50%: [18.43 36.55]
#> Conditioned on rand. effects: 39.09 HDI 50%: [28.04 44.56]
#>
#> ## Difference in Variances
#> Difference: 4.36 HDI 50%: [-2.99 10.87]
```

Sometimes, when the variance of the posterior predictive distribution is very large, the variance ratio in the output makes no sense, e.g. because it is negative. In such cases, it might help to use a more robust measure to calculate the central tendency of the variances. This can be done with the `typical`

-argument.

```
# the "classical" ICC, not recommended for non-Gaussian
icc(m3)
#>
#> # Random Effect Variances and ICC
#>
#> Family: zero_inflated_poisson (log)
#>
#> ## persons
#> ICC: 0.68 HDI 89%: [0.43 0.98]
#> Between-group: 4.88 HDI 89%: [0.23 10.34]
#>
#> ## Residuals
#> Within-group: 1.00 HDI 89%: [1.00 1.00]
# variance ratio based on posterior predictive distributions,
# which is negative and hence obviously nonsense
icc(m3, ppd = TRUE)
#>
#> # Random Effect Variances and ICC
#>
#> Family: zero_inflated_poisson (log)
#> Conditioned on: all random effects
#>
#> ## Variance Ratio (comparable to ICC)
#> Ratio: -8.38 HDI 89%: [-0.92 1.00]
#>
#> ## Variances of Posterior Predicted Distribution
#> Conditioned on fixed effects: 357.23 HDI 89%: [ 0.03 75.20]
#> Conditioned on rand. effects: 39.53 HDI 89%: [31.12 48.15]
#>
#> ## Difference in Variances
#> Difference: -317.70 HDI 89%: [-39.18 48.93]
# use median instead of mean
icc(m3, ppd = TRUE, typical = "median")
#>
#> # Random Effect Variances and ICC
#>
#> Family: zero_inflated_poisson (log)
#> Conditioned on: all random effects
#>
#> ## Variance Ratio (comparable to ICC)
#> Ratio: 0.70 HDI 89%: [-0.88 1.00]
#>
#> ## Variances of Posterior Predicted Distribution
#> Conditioned on fixed effects: 11.71 HDI 89%: [ 0.04 74.93]
#> Conditioned on rand. effects: 39.33 HDI 89%: [30.75 48.09]
#>
#> ## Difference in Variances
#> Difference: 26.50 HDI 89%: [-35.34 51.70]
```

`r2()`

computes either the Bayes r-squared value, or - if `loo = TRUE`

- a LOO-adjusted r-squared value (which comes conceptionally closer to an adjusted r-squared measure).

For the Bayes r-squared, the standard error is also reported. Note that `r2()`

uses the median as measure of central tendency and the median absolute deviation as measure for variability.

Kruschke JK. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2nd edition. Academic Press, 2015

Kruschke JK. Rejecting or Accepting Parameter Values in Bayesian Estimation. Advances in Methods and Practices in Psychological Science. 2018; doi: 10.1177/2515245918771304

McElreath R. Statistical Rethinking. A Bayesian Course with Examples in R and Stan. Chapman and Hall, 2015

Norman GR, Sloan JA, Wyrwich KW. Interpretation of Changes in Health-related Quality of Life: The Remarkable Universality of Half a Standard Deviation. Medical Care. 2003;41: 582–592. doi: 10.1097/01.MLR.0000062554.74615.4C