BayesianPower

Fayette Klaassen

Introduction

BayesianPower can be used for sample size determination (using bayes_sampsize) and power calculation (using bayes_power) when Bayes factors are used to compare an inequality constrained hypothesis \(H_i\) to its complement \(H_c\), to another inequality constrained hypothesis \(H_j\), or to the unconstrained hypothesis \(H_u\). Power is defined as a combination of controlled error probabilities; either the unconditional or the conditional error probabilities can be controlled. The package offers four approaches to controlling these probabilities. Users are advised to read this vignette together with the paper available at https://doi.org/10.17605/OSF.IO/D9EAJ, where the four approaches are presented in detail (Klaassen, Hoijtink & Gu, unpublished).

Power calculation with bayes_power()

bayes_power(n, h1, h2, m1, m2, ngroup = NULL, comp = NULL, bound1 = 1, bound2 = 1/bound1, datasets = 1000, nsamp = 1000, seed = NULL)

Arguments

n A number. The sample size for which the error probabilities must be computed.

h1 A constraint matrix defining H1, see below for more details.

h2 A constraint matrix defining H2, or a character 'u' or 'c' for the unconstrained or complement hypothesis.

m1 A vector of expected population means under H1 (standardized), see below for more details.

m2 A vector of expected population means under H2 (standardized). m2 must be of the same length as m1.

ngroup A number or NULL. The number of groups. If NULL, the number of groups is determined from the length of m1.

comp A vector or NULL. The complexity of H1 and H2. If NULL, the complexity is estimated. See below for more details.

bound1 A number. The boundary above which BF12 favors H1, see below for more details.

bound2 A number. The boundary below which BF12 favors H2.

datasets A number. The number of datasets to simulate to compute the error probabilities.

nsamp A number. The number of prior or posterior samples to determine the complexity or fit.

seed A number. The random seed to be set.

Details

Specifying hypotheses

Hypotheses are defined by a constraint matrix \(R\) that specifies the order constraints between the means \(\boldsymbol\mu\), such that \(R \boldsymbol{\mu} > \mathbf{0}\). Here \(R\) is a matrix with \(K\) rows and \(J\) columns, where \(J\) is the number of group means and \(K\) is the number of constraints between the means; \(\boldsymbol\mu\) is a vector of \(J\) means and \(\mathbf{0}\) is a vector of \(K\) zeros. Each row of \(R\) encodes one linear inequality constraint.

Consider the following constraint matrix \(R\), a vector of means \(\boldsymbol\mu\), their product \(R \boldsymbol{\mu}\), and the check \(R \boldsymbol{\mu} > \mathbf{0}\):

##      [,1] [,2] [,3]
## [1,]    1   -1    0
## [2,]    0    1   -1
## [1] 0.4 0.2 0.0
##      [,1]
## [1,]  0.2
## [2,]  0.2
##      [,1]
## [1,] TRUE
## [2,] TRUE

The matrix \(R\) specifies that the sum of \(1 \times \mu_1\), \(-1 \times \mu_2\) and \(0 \times \mu_3\) is larger than \(0\), and that the sum of \(0 \times \mu_1\), \(1 \times \mu_2\) and \(-1 \times \mu_3\) is larger than \(0\). This can also be written as \(\mu_1 > \mu_2 > \mu_3\). For more information about the specification of constraint matrices, see for example Hoijtink (2012).
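The printed output above is consistent with the following base R steps. This is a reconstruction rather than the vignette's original chunk; the object names `R` and `mu` are assumptions.

```r
# Constraint matrix for mu1 > mu2 > mu3: one row per constraint
R <- matrix(c(1, -1,  0,
              0,  1, -1), nrow = 2, byrow = TRUE)

# Standardized group means that satisfy the ordering
mu <- c(0.4, 0.2, 0.0)

R                 # the constraint matrix
mu                # the means
R %*% mu          # left-hand side of R mu > 0
R %*% mu > 0      # TRUE in every row means all constraints hold
```

A set of means agrees with the hypothesis only if every row of `R %*% mu > 0` is `TRUE`.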

The argument h1 has to be a constraint matrix as specified above. The argument h2 can be either a constraint matrix, or the character 'u' or 'c' if the goal is to compare \(H_1\) with \(H_u\), the unconstrained hypothesis, or \(H_c\) the complement hypothesis.

Specifying population means

Hypothesized population means have to be defined under both \(H_1\) and \(H_2\), even when \(H_u\) or \(H_c\) is used as \(H_2\). The population means have to be standardized.

Computing complexity

If the complexity of a hypothesis is known it can be specified under comp to reduce computational time. If comp = NULL the complexity is sampled using \(\mu_{\cdot} \sim \mathcal{N}(0,1000)\) as a prior distribution for each mean, that is, a normal distribution with mean \(0\) and standard deviation \(1000\).
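A minimal sketch of how a complexity could be estimated by sampling, assuming that complexity is the proportion of prior draws that satisfy the constraints (the usual definition for inequality constrained hypotheses). The package internals may differ, and all object names here are illustrative.

```r
set.seed(1)

# Constraint matrix for H1: mu1 > mu2 > mu3
R <- matrix(c(1, -1,  0,
              0,  1, -1), nrow = 2, byrow = TRUE)

nsamp <- 10000   # number of prior draws (assumed value for this sketch)

# Draw each mean independently from a normal prior with mean 0 and sd 1000
prior <- matrix(rnorm(nsamp * ncol(R), mean = 0, sd = 1000), ncol = ncol(R))

# A draw agrees with H1 when every constraint row is positive
agrees <- rowSums(prior %*% t(R) > 0) == nrow(R)

# Estimated complexity: proportion of prior draws in agreement with H1
complexity <- mean(agrees)
complexity
```

For a full ordering of three means the estimate should be close to 1/6, because all 3! orderings are equally likely under an exchangeable prior. A known value like this can be passed via comp to skip the sampling step.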

Setting bounds

bound1 and bound2 describe the boundaries used for interpreting a Bayes factor. If bound1 = 1, all \(BF_{12} > 1\) are considered to express evidence in favor of \(H_1\); if bound1 = 3, only \(BF_{12} > 3\) are considered to express evidence in favor of \(H_1\). Similarly, bound2 is the boundary below which \(BF_{12}\) is considered to express evidence in favor of \(H_2\). By default, bound2 = 1/bound1.
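The decision rule implied by the two bounds can be sketched in base R. The helper `classify_bf()` is hypothetical and not part of the package; it only illustrates how a Bayes factor between the bounds favors neither hypothesis.

```r
# Hypothetical helper: label each BF12 according to bound1 and bound2
classify_bf <- function(bf12, bound1 = 1, bound2 = 1 / bound1) {
  ifelse(bf12 > bound1, "H1",
         ifelse(bf12 < bound2, "H2", "undecided"))
}

# With bound1 = 3 (and so bound2 = 1/3), values between 1/3 and 3 are undecided
labels <- classify_bf(c(5, 2, 0.5, 0.1), bound1 = 3)
labels
```

With the default bound1 = 1 the region between the bounds collapses, so every Bayes factor other than exactly 1 favors one of the two hypotheses.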

Examples

Example 1. \(H_1\) vs \(H_c\)

An example where three group means are ordered in \(H_1: \mu_1 > \mu_2 > \mu_3\), which is compared to its complement. The power is determined for \(n = 40\).
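A possible setup for this example is sketched below. The population means `m1` and `m2` are assumed values chosen for illustration, and `datasets` and `nsamp` are reduced from their defaults only to keep the run short. The call uses named arguments from the signature shown earlier and runs only if the package is installed.

```r
# H1: mu1 > mu2 > mu3
h1 <- matrix(c(1, -1,  0,
               0,  1, -1), nrow = 2, byrow = TRUE)

m1 <- c(0.4, 0.2, 0.0)  # assumed means under H1 (satisfy the ordering)
m2 <- c(0.2, 0.0, 0.1)  # assumed means under the complement (violate it)

if (requireNamespace("BayesianPower", quietly = TRUE)) {
  res <- BayesianPower::bayes_power(n = 40, h1 = h1, h2 = "c",
                                    m1 = m1, m2 = m2,
                                    datasets = 100, nsamp = 100, seed = 31)
  print(res)
}
```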

Example 2. \(H_1\) vs \(H_2\)

An example where four group means are ordered in \(H_1: \mu_1 > \mu_2 > \mu_3 > \mu_4\) and in \(H_2: \mu_3 > \mu_2 > \mu_4 > \mu_1\). Only Bayes factors larger than \(3\) are considered evidence in favor of \(H_1\) and only Bayes factors smaller than \(1/3\) are considered evidence in favor of \(H_2\).
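The two hypotheses could be encoded as constraint matrices as sketched here. The means `m1` and `m2` and the sample size `n = 40` are assumptions made for illustration, since the values used in the original example are not shown. The call is guarded so that it runs only when the package is installed.

```r
# H1: mu1 > mu2 > mu3 > mu4
h1 <- matrix(c( 1, -1,  0,  0,
                0,  1, -1,  0,
                0,  0,  1, -1), nrow = 3, byrow = TRUE)

# H2: mu3 > mu2 > mu4 > mu1
h2 <- matrix(c( 0, -1,  1,  0,    # mu3 > mu2
                0,  1,  0, -1,    # mu2 > mu4
               -1,  0,  0,  1),   # mu4 > mu1
             nrow = 3, byrow = TRUE)

m1 <- c(0.6, 0.4, 0.2, 0.0)  # assumed means in agreement with H1
m2 <- c(0.0, 0.4, 0.6, 0.2)  # assumed means in agreement with H2

if (requireNamespace("BayesianPower", quietly = TRUE)) {
  res <- BayesianPower::bayes_power(n = 40, h1 = h1, h2 = h2,
                                    m1 = m1, m2 = m2,
                                    bound1 = 3, bound2 = 1 / 3,
                                    datasets = 100, nsamp = 100, seed = 31)
  print(res)
}
```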

Sample size determination with bayes_sampsize()

bayes_sampsize(m1, m2, h1, h2, type = 1, cutoff, bound1 = 1, bound2 = 1 / bound1, datasets = 1000, nsamp = 1000, minss = 2, maxss = 1000, seed = 31)

Arguments

The arguments are the same as for bayes_power() with the addition of:

type A character. The type of error to be controlled. The options are: "1", "2", "de", "aoi", "med.1", "med.2". See below for more details.

cutoff A number. The cutoff criterion for type. If type is "1", "2", "de", "aoi", cutoff must be between \(0\) and \(1\). If type is "med.1" or "med.2", cutoff must be larger than \(1\). See below for more details.

minss A number. The minimum sample size.

maxss A number. The maximum sample size.

Details

bayes_sampsize() iteratively uses bayes_power() to determine the error probabilities for a sample size, evaluates whether the chosen error is below the cutoff, and adjusts the sample size.
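The loop described above can be sketched with a bisection search. This is an illustration only: how the package actually adjusts the sample size is not stated here, so bisection is an assumption, and `toy_error()` is a made-up, monotonically decreasing stand-in for the error probability that bayes_power() would estimate by simulation.

```r
# Hypothetical search: smallest n in [minss, maxss] with error below cutoff.
# Assumes the error decreases as n grows.
find_sample_size <- function(error_fun, cutoff, minss = 2, maxss = 1000) {
  if (error_fun(maxss) >= cutoff) return(NA_integer_)  # cutoff not reachable
  lo <- minss
  hi <- maxss
  while (lo < hi) {
    mid <- (lo + hi) %/% 2
    if (error_fun(mid) < cutoff) {
      hi <- mid          # mid is large enough, try smaller sample sizes
    } else {
      lo <- mid + 1      # mid is too small, move up
    }
  }
  lo
}

# Toy stand-in for a simulated error probability at sample size n
toy_error <- function(n) pnorm(-0.2 * sqrt(n))

n_needed <- find_sample_size(toy_error, cutoff = 0.05)
n_needed
```

With real simulated error probabilities the estimates are noisy, so a fixed seed and a sufficiently large number of datasets help keep the search stable.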

type

Klaassen, Hoijtink and Gu (unpublished) describe in detail the different ways of controlling error probabilities that can be considered. Specifying "1" or "2" indicates that the Type 1 or Type 2 error probability has to be controlled. The Type 1 error probability is the probability of concluding that \(H_2\) is the best hypothesis when \(H_1\) is true; the Type 2 error probability is the probability of concluding that \(H_1\) is the best hypothesis when \(H_2\) is true. Note that whether \(H_1\) or \(H_2\) is considered the best hypothesis depends on the values chosen for bound1 and bound2. Specifying "de" or "aoi" indicates that the Decision error probability (the average of Type 1 and Type 2) or the probability of Indecision has to be controlled. Finally, specifying "med.1" or "med.2" indicates that cutoff is the minimum desired median \(BF_{12}\) when \(H_1\) is true, or the minimum desired median \(BF_{21}\) when \(H_2\) is true.
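To make these definitions concrete, the sketch below computes several of the quantities from two small, made-up vectors of Bayes factors. The values stand in for Bayes factors from datasets simulated under \(H_1\) and under \(H_2\); they are not package output. The probability of Indecision is left out because its exact definition is not given here.

```r
bound1 <- 3
bound2 <- 1 / bound1

# Made-up BF12 values for datasets simulated under H1 and under H2
bf12_h1_true <- c(8, 4, 2, 0.5, 0.2, 6, 10, 3.5, 1.5, 0.25)
bf12_h2_true <- c(0.1, 0.2, 0.5, 4, 0.05, 0.3, 2, 0.15, 0.25, 0.02)

# Type 1: H2 is preferred although H1 is true
type1 <- mean(bf12_h1_true < bound2)

# Type 2: H1 is preferred although H2 is true
type2 <- mean(bf12_h2_true > bound1)

# Decision error: average of Type 1 and Type 2
decision_error <- (type1 + type2) / 2

# Median BF12 when H1 is true, and median BF21 = 1 / BF12 when H2 is true
med1 <- median(bf12_h1_true)
med2 <- median(1 / bf12_h2_true)

c(type1 = type1, type2 = type2, de = decision_error,
  med.1 = med1, med.2 = med2)
```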

Examples
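A call could look like the sketch below. The hypotheses, means, type and cutoff are assumed values chosen for illustration; `datasets`, `nsamp` and `maxss` are reduced from their defaults only to shorten the run. Named arguments are used throughout, and the call runs only if the package is installed.

```r
# H1: mu1 > mu2 > mu3, compared with its complement
h1 <- matrix(c(1, -1,  0,
               0,  1, -1), nrow = 2, byrow = TRUE)

m1 <- c(0.4, 0.2, 0.0)  # assumed means under H1
m2 <- c(0.2, 0.0, 0.1)  # assumed means under the complement

if (requireNamespace("BayesianPower", quietly = TRUE)) {
  # Look for a sample size at which the decision error is below 0.125
  res <- BayesianPower::bayes_sampsize(m1 = m1, m2 = m2, h1 = h1, h2 = "c",
                                       type = "de", cutoff = 0.125,
                                       datasets = 100, nsamp = 100,
                                       minss = 2, maxss = 200, seed = 31)
  print(res)
}
```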

References

Hoijtink, H. (2012). Informative hypotheses: Theory and practice for behavioral and social scientists. Boca Raton: Chapman & Hall/CRC.

Klaassen, F., Hoijtink, H., & Gu, X. (unpublished). The power of informative hypotheses. Pre-print available at https://doi.org/10.17605/OSF.IO/D9EAJ