ph2rand provides functions to assist with the design of randomized comparative phase II oncology trials that assume their primary outcome variable is Bernoulli distributed. Specifically, support is provided to (a) perform a sample size calculation when using one of several published designs, (b) evaluate the operating characteristics of a given design (both analytically and via simulation), and (c) produce informative plots.

This vignette will proceed by detailing the type of trial designs ph2rand currently supports. Each of the available functions will then be described, before several examples and useful further information will be provided.

# 1. Introduction

Whilst almost all phase II oncology trials once utilised single-arm non-randomised designs, a large proportion now use randomised comparative designs. The reasons for this change in design are numerous, and are described at length in Grayling et al (2019). For the purposes of this vignette, the important consideration is that with the increased use of randomised comparative designs in phase II, software for determining such designs increases in value. It is for this reason that this package, ph2rand, has been developed. Its goal is to eventually support design determination when using the vast majority of published methods for randomised comparative oncology trials. At present, the focus is on the most commonly used class of design: those for two-arm trials with a single primary outcome variable that is assumed to be Bernoulli distributed.

# 2. Randomized comparative phase II oncology trials

## 2.1. Design setting

At present, all of the methods supported by ph2rand can be described within a single design framework. Each assumes that a single experimental treatment regimen will be compared to a single concurrent control arm (which would typically be expected to be the current standard-of-care treatment option), through a trial with at most $$J\in\mathbb{N}^+$$ stages (i.e., in what follows we will simultaneously treat one-stage designs and designs that include interim analyses).

Furthermore, each supposes that the treatment’s anti-tumour activity will be formally compared through a single Bernoulli distributed primary outcome variable (e.g., tumour response or the disease control rate). Precisely, they assume that outcomes $$y_{ijk}$$ are accrued from patients $$i\in\{1,\dots,n_{jk}\}$$, in stage $$j\in\{1,\dots,J\}$$, in arm $$k\in\{C,E\}$$, with $$Y_{ijk} \sim Bern(\pi_k)$$ (i.e., we use $$Y_{ijk}$$ to indicate the random value of $$y_{ijk}$$). Here, arm $$k=C$$ corresponds to the control arm and arm $$k=E$$ to the experimental arm, and together the indices $$i$$ and $$j$$ define a particular patient. Thus, $$\pi_k\in[0,1]$$ is the probability of success (or response, if you prefer) for patient $$ij$$ when they are assigned to arm $$k$$. For brevity in what follows, we set $$\boldsymbol{\pi}=(\pi_C,\pi_E)^\top\in[0,1]^2$$, and from here we will refer to the $$\pi_k$$ as response rates. Similarly, we set $$\boldsymbol{n}_k=(n_{1k},\dots,n_{Jk})$$ for $$k\in\{C,E\}$$. Note that at this stage we explicitly allow for the range of $$i$$ to depend on both $$j$$ and $$k$$, which will be discussed again in Section 3.

With the above, the designs supported by ph2rand test the following null hypothesis

$H_0 : \pi_C \le \pi_E.$ Then, given a trial design of any kind returned by ph2rand, which we denote in generality at this point by $$\mathcal{D}$$, we will signify the power function as follows

$P(\boldsymbol{\pi}) = \mathbb{P}(\text{Reject }H_0 \mid \boldsymbol{\pi},\mathcal{D}).$ Using the above definition, each of the design functions in ph2rand aims to return an optimised design (as defined below), which ensures that the type-I error-rate is controlled to a specified level $$\alpha \in (0,1)$$ over a given set $$\Pi_0\subseteq [0,1]$$. Formally, they look to identify a design with

$\max_{\pi\in\Pi_0} P\{(\pi,\pi)^\top\} \le \alpha.$ Thus, ph2rand implicitly assumes a monotonicity property on the power function such that the type-I error-rate is maximised along the boundary of the null hypothesis where $$\pi_C=\pi_E$$. Note that $$\Pi_0$$ can be anything from a single point (e.g., $$0.1$$), which would logically correspond to the anticipated response rate in the control arm, through to the entire line $$[0,1]$$. The implications of different choices for $$\Pi_0$$ are discussed in Section 3.

Furthermore, ph2rand supports powering trials under the following flexible framework. First, a clinically relevant difference in response rates (or treatment effect) $$\delta \in (0,1]$$ is specified. Then, a set $$\Pi_1 \subseteq [0,1-\delta]$$ is nominated, such that design determination should ensure that

$\min_{\pi\in\Pi_1} P\{(\pi,\pi+\delta)^\top\} \ge 1 - \beta,$ where $$\beta\in(0,1)$$ can be viewed as the type-II error-rate, and we refer to this scenario later as the alternative hypothesis.

The above outlines the hypothesis and error-rates each of the designs supported by ph2rand aims to control. To appreciate the differences between the way the designs work later, it is useful now to consider a general framework for decision making.

First, each design framework specifies particular test statistics $$\boldsymbol{t}_j$$ that would be computed after stage $$j$$ to base its decisions on. We denote that the random values of these test statistics by $$\boldsymbol{T}_j$$. In all instances $$\boldsymbol{t}_j$$ will be dependent on (at most) the chosen design (e.g., factors such as the sample size in each arm in each stage) and the number of responses seen in each arm in each stage. For computational purposes, it will be useful to categorise what we denote by $$\mathscr{T}_j$$, the support of $$\boldsymbol{T}_j$$ (i.e., the space of possible values such that $$\boldsymbol{t}_j\in\mathscr{T}_j$$). Note that it will always be the case that more positive values for the elements of $$\boldsymbol{t}_j$$ indicate increased patient benefit on the experimental treatment.

In addition, each of the design frameworks specifies particular rejection, non-rejection, and continuation regions. Each of these can be viewed as a sub-space such that if the chosen test statistics belongs to that space, then a given associated decision will be made. As above, these sets will only ever be dependent on the chosen design and the number of responses seen in each arm in each stage. We define them as $$\mathscr{R}_j$$, $$\mathscr{N}_j$$, and $$\mathscr{C}_j$$. Here, $$\mathscr{R}$$, $$\mathscr{N}$$, and $$\mathscr{C}$$ are chosen to indicate rejection, non-rejection, and continuation respectively. As will be seen later, this framework can readily handle any value of $$J$$, as we can always ensure a trial terminates after at most $$J$$ stages by ensuring $$\mathscr{R}_J\cup\mathscr{N}_J = \mathscr{T}_J$$. More generally, we can ensure a decision on how to proceed is always clear by making sure $$\mathscr{C}_j = \mathscr{T}_j\backslash\{\mathscr{R}_j \cup \mathscr{N}_j\}$$, thus in what follows we do not explicitly state the forms for $$\mathscr{C}_j$$. All designs supported by ph2rand ensure that these conditions are met.

Ultimately, with the above, the following algorithm describes how each of the designs work

1. Set $$j=1$$.
2. Recruit $$n_{jk}$$ patients to arms $$k\in\{C,E\}$$ and gather the associated outcomes $$y_{ijk}$$ for $$i\in\{1,\dots,n_{jk}\}$$.
3. Compute $$\boldsymbol{t}_j$$ and use the following decision rules
• If $$\boldsymbol{t}_j \in \mathscr{C}_j$$, set $$j=j+1$$ and return to step 2.
• If $$\boldsymbol{t}_j \in \mathscr{R}_j$$, terminate the trial and reject $$H_0$$.
• If $$\boldsymbol{t}_j \in \mathscr{N}_j$$, terminate the trial and do not reject $$H_0$$.

The above completes the categorisation of the core principles of the designs supported by ph2rand. In practice, a large number of potential designs will meet the type-I and type-II error-rate criteria, and thus a condition is needed for choosing the best (i.e., optimal) design amongst these. Before we outline the optimality criteria supported by ph2rand, though, we describe the various statistical operating characteristics that it can return.

## 2.2. Operating characteristics

For one-stage designs, $$P(\boldsymbol{\pi})$$ will likely be the only random quantity of interest. For designs with more than one stage, though, there are several additional random variables that should be considered. Firstly, it will generally be of both interest and great use to evaluate the probability that the trial stops at the end of each permitted stage, with sub-categorisation according to the decision on whether or not $$H_0$$ should be rejected. We define these quantities for $$j\in\{1,\dots,J\}$$ as follows

\begin{align} E_j(\boldsymbol{\pi}) &= \mathbb{P}(\text{Stop after stage }j\text{ and reject }H_0 \mid \boldsymbol{\pi},\mathscr{D}),\\ F_j(\boldsymbol{\pi}) &= \mathbb{P}(\text{Stop after stage }j\text{ and do not reject }H_0 \mid \boldsymbol{\pi},\mathscr{D}),\\ S_j(\boldsymbol{\pi}) &= E_j(\boldsymbol{\pi}) + F_j(\boldsymbol{\pi}), \end{align}

where we use the letters $$E$$, $$F$$, and $$S$$ to signify efficacy (i.e., activity), futility (i.e., lack of activity), and stopping respectively. We can also think about the above via the following equations

\begin{align} E_1(\boldsymbol{\pi}) &= \mathbb{P}(\boldsymbol{T}_1 \in \mathscr{R}_1 \mid \boldsymbol{\pi},\mathcal{D}),\\ F_1(\boldsymbol{\pi}) &= \mathbb{P}(\boldsymbol{T}_1 \in \mathscr{N}_1 \mid \boldsymbol{\pi},\mathcal{D}),\\ S_1(\boldsymbol{\pi}) &= \mathbb{P}(\boldsymbol{T}_1 \notin \mathscr{C}_1 \mid \boldsymbol{\pi},\mathcal{D}),\\ \end{align}

and for $$j\in\{2,\dots,J\}$$

\begin{align} E_j(\boldsymbol{\pi}) &= \mathbb{P}(\boldsymbol{T}_1 \in \mathscr{C}_1,\dots,\boldsymbol{T}_{j-1}\in \mathscr{C}_{j-1},\boldsymbol{T}_j \in \mathscr{R}_j \mid \boldsymbol{\pi},\mathcal{D}),\\ F_j(\boldsymbol{\pi}) &= \mathbb{P}(\boldsymbol{T}_1 \in \mathscr{C}_1,\dots,\boldsymbol{T}_{j-1}\in \mathscr{C}_{j-1},\boldsymbol{T}_j \in \mathscr{N}_j \mid \boldsymbol{\pi},\mathcal{D}),\\ S_j(\boldsymbol{\pi}) &= \mathbb{P}(\boldsymbol{T}_1 \in \mathscr{C}_1,\dots,\boldsymbol{T}_{j-1}\in \mathscr{C}_{j-1},\boldsymbol{T}_j \notin \mathscr{C}_j \mid \boldsymbol{\pi},\mathcal{D}). \end{align}

Then, also of much interest will be the trials expected sample size under given response rates. To this end, denote by $$N$$ the random variable giving the total sample size required by a trial, and set

$\tilde{n}_{jk} = \sum_{l=1}^j n_{lk},\qquad k\in\{C,E\}.$ Then, we simultaneously define and compute the expected sample size as follows

$ESS(\boldsymbol{\pi}) = \mathbb{E}(N \mid \boldsymbol{\pi},\mathcal{D}) = \sum_{j=1}^J S_j(\boldsymbol{\pi})\left(\tilde{n}_{jC} + \tilde{n}_{jE}\right).$ We will also consider the standard deviation of the required sample size

\begin{align} SDSS(\boldsymbol{\pi}) &= \mathbb{E}(N^2 \mid \boldsymbol{\pi},\mathcal{D}) - \mathbb{E}(N \mid \boldsymbol{\pi},\mathcal{D})^2,\\ &= \sum_{j=1}^J S_j(\boldsymbol{\pi})\left(\tilde{n}_{jC} + \tilde{n}_{jE}\right)^2 - ESS(\boldsymbol{\pi})^2. \end{align}

Next, setting $$\tilde{S}_j(\boldsymbol{\pi}) = S_1(\boldsymbol{\pi}) + \cdots + S_j(\boldsymbol{\pi})$$ and $$l = \text{argmin}_{j=1,\dots,J}\{\tilde{S}_j(\boldsymbol{\pi}) \ge 0.5\}$$, we signify the median required sample size by

$MSS(\boldsymbol{\pi}) = \begin{cases} \tilde{n}_{lC} + \tilde{n}_{lE} + 0.5(n_{l+1C} + n_{l+1E}) \ &: \ \tilde{S}_l(\boldsymbol{\pi}) = 0.5, \\ \tilde{n}_{lC} + \tilde{n}_{lE} \ &: \ \tilde{S}_l(\boldsymbol{\pi}) \neq 0.5. \end{cases}$ Finally, we will denote the maximum possible value of $$N$$ by $$\max N = \tilde{n}_{JC} + \tilde{n}_{JE}$$.

## 2.3. Optimality criteria

Designs that meet the desired type-I and type-II error-rate criteria are known as feasible. Within ph2rand, feasible designs are identified by searching over a discrete set of possible trial designs, as is described further in Section 4.1.

Then, for a one-stage trial design, ph2rand defines the optimal design as the feasible design that has the largest value of $$\min_{\pi\in\Pi_1} P\{(\pi,\pi+\delta)^\top\}$$, amongst the feasible designs with the smallest value of $$\max N$$. For each of the designs currently supported this guarantees a unique optimal design.

For designs with $$J>1$$, a more flexible optimality criteria is supported that builds upon that used, for example, in Jung et al (2004), Mander et al (2012), Wason et al (2012), and Wason (2015). Precisely, a set of weights, $$w_1,\dots,w_5\in[0,\infty)$$, are selected. As is a value $$\pi_\text{O}\in[0,1-\delta]$$. Then, the optimal design is that, amongst the feasible designs, which minimises the following criteria

$\begin{multline} w_1ESS\{(\pi_\text{O},\pi_\text{O})^\top\} + w_2ESS\{(\pi_\text{O},\pi_\text{O} + \delta)^\top\} + w_3\max_{\pi\in[0,1]}ESS\{(\pi,\pi)^\top\}\\ + w_4\max_{\boldsymbol{\pi}\in[0,1]^2}ESS(\boldsymbol{\pi}) + w_5\max N. \end{multline}$

In general, $$\pi_\text{O}$$ should likely be chosen as the anticipated response rate in the control arm. In addition, it should typically be ensured that $$w_1+\cdots+w_4>0$$, as many designs will likely share the same minimal maximal sample size. Note that there is no available result that guarantees the above criteria and the given restrictions on the weights will lead to a unique optimal design. However, in practice, this is almost certain to be the case. Additionally, note that in general setting $$w_3>0$$ or $$w_4>0$$ may be illogical unless both efficacy and futility stopping is permitted (see later), as their associated $$\max ESS$$ criteria may be equal to $$\max N$$ otherwise.

## 2.4. Supported designs

Currently, ph2rand supports the determination of four different types of randomised comparative design, all in one-stage and two-stage forms (i.e., they each support $$J\in\{1,2\}$$). Each is described in its own subsection below.

Note that in all instances the values $$n_{jk}$$ are determined by searching over possible values for the $$n_{jC}$$, assuming that $$n_{jE}=rn_{jC}$$ for a specified allocation ratio $$r$$.

Furthermore, in what follows the following notation will be helpful

\begin{align} x_{jk} &= \sum_{i=1}^{n_{jk}} y_{ijk},\qquad k\in\{C,E\},\\ z_j &= x_{jC} + x_{jE},\\ \tilde{x}_{jk} &= x_{1k} + \cdots + x_{jk},\\ \tilde{z}_j &= z_1 + \cdots + z_j. \end{align}

### 2.4.1. Jung (2008)

Jung (2008), in combination with further details provided in Jung (2013), outlines a design framework based on exact binomial tests. Specifically, the following test statistic is used at the end of stage $$j\in\{1,\dots,J\}$$ $\boldsymbol{t}_j = t_{Dj} = \tilde{x}_{jE} - \tilde{x}_{lC}.$ That is, the difference between the number of responses observed on the experimental and control arms is used. Note that $$\mathscr{t}_j \subseteq \{-n_{C1}-\cdots-n_{Cj},\dots,n_{E1}+\cdots+n_{Ej}\}$$; as noted earlier this is an important observation for efficient design determination.

This design then works by specifying the following rejection and non-rejection regions \begin{align} \mathscr{R}_j &= (e_j,\infty),\\ \mathscr{N}_j &= (-\infty,f_j]. \end{align}

Thus, for a $$J$$ stage design, parameters $$e_1,\dots,e_J$$ and $$f_1,\dots,f_J$$ are needed. For brevity, set $$\boldsymbol{e}=(e_1,\dots,e_J)$$ and $$\boldsymbol{f}=(f_1,\dots,f_J)$$. With this, we have $$\mathcal{D} = \{\boldsymbol{n}_C,\boldsymbol{n}_E,\boldsymbol{e},\boldsymbol{f}\}$$.

Note that with the above, if we wish to prevent early stopping for futility we can simply specify that $$f_j = -\infty$$ for $$j\in\{1,\dots,J-1\}$$. Similarly, to prevent early stopping for efficacy, we can set $$e_1=\dots=e_{J-1}=\infty$$.

Furthermore, note that in all cases ph2rand ensures that $$e_J = f_J$$. This is enough to ensure that $$\mathscr{C}_J=\emptyset$$ and thus that a decision is made on $$H_0$$ by the trial’s completion and the study ends after at most $$J$$ stages. When $$J>1$$, it also ensures that $$e_j > f_j$$ for $$j\in\{1,\dots,J-1\}$$, in such a way that $$\mathscr{C}_j \neq \emptyset$$ for $$j\in\{1,\dots,J-1\}$$.

Finally, note that because $$t_{Dj}\in\mathbb{Z}$$, nothing is lost by making the assumption that $$e_j,f_j\in\mathbb{Z}$$: ph2rand exploits this fact to search over possible designs as efficiently as possible. Specifically, as is discussed further in Section 3, the user specifies a range of allowed sample sizes. Then, ph2rand will search exhaustively over the designs that should be considered to evaluate all potential different options for the given sample sizes.

### 2.4.2. Shan et al (2013)

Shan et al (2013) propose a framework similar to Jung (2008), but based on the following Barnard style test-statistics

$\boldsymbol{t}_j = t_{Bj} = \frac{\frac{\tilde{x}_{jE}}{\tilde{n}_{jE}} - \frac{\tilde{x}_{jC}}{\tilde{n}_{jC}}}{\frac{\tilde{z}_j}{\tilde{n}_{jC} + \tilde{n}_{jE}}\left(1 - \frac{\tilde{z}_j}{\tilde{n}_{jC} + \tilde{n}_{jE}} \right)\left( \frac{1}{\tilde{n}_{jC}} + \frac{1}{\tilde{n}_{jE}} \right)}.$ The rejection, non-rejection, and continuation regions are as in Section 2.4.1. So to does the design search procedure remain qualitatively the same (i.e., exhaustive evaluation of potential designs is still carried out). However, whilst the $$t_{Bj}$$ will still take a discrete set of possible values that can be used to limit the number of designs that should be considered, in this case $$t_{Bj}\in\mathbb{R}$$, and thus we allow $$e_j,f_j\in\mathbb{R}$$ also.

### 2.4.3. Jung and Sargent (2014)

Jung and Sargent (2014) propose a somewhat different design framework to those given above. Specifically, whilst the test statistic used in Jung (2008) is retained, the following rejection and non-rejection regions are nominated \begin{align} \mathscr{R}_j &= (e_{jz_1\dots z_j},\infty),\\ \mathscr{N}_j &= (-\infty,f_{jz_1\dots z_j}]. \end{align} Thus, $$\mathscr{R}_j$$, $$\mathscr{N}_j$$, and $$\mathscr{C}_j$$ are allowed to vary based upon the the number of responses seen in the two arms in each stage.

Here, for brevity, we denote the $$(\tilde{n}_{1C}+\tilde{n}_{1E}) \times \cdots \times (\tilde{n}_{jC}+\tilde{n}_{jE})$$ arrays of efficacy and futility stopping boundaries for use after stage $$j$$ by $$\boldsymbol{e}_j$$ and $$\boldsymbol{f}_j$$ respectively. Then, $$\mathcal{D} = \{\boldsymbol{n}_C,\boldsymbol{n}_E,\boldsymbol{e}_1,\dots,\boldsymbol{e}_J,\boldsymbol{f}_1,\dots,\boldsymbol{f}_J\}$$.

Similarly to before, we can now prevent early stopping for efficacy or futility by prescribing that $$e_{jz_1\dots z_j}=\infty$$ and $$f_{jz_1\dots z_j}=-\infty$$ for $$j\in\{1,\dots,J\}$$ and $$(z_1,\dots,z_j) \in \{0,\dots,\tilde{n}_{1C}+\tilde{n}_{1E}\}\times\cdots\times\{0,\dots,\tilde{n}_{jC}+\tilde{n}_{jE}\}$$.

Unlike before, though, the boundaries are now not identified using an exhaustive search. Firstly, this is because this search procedure would be too computationally intensive to be useful in practice. However, the principal reason they are not chosen via an exhaustive assessment is because the motivation for this design framework comes from Fisher’s exact test and its approach to guaranteeing error control. See Jung and Sargent (2014) for further details.

### 2.4.4. Litwin et al (2017)

The design from Litwin et al (2017) takes a different approach to those above and uses two test statistics to make a decision. Specifically, it combines testing rules typically associated with single-arm and two-arm trial designs, setting

$\boldsymbol{t}_j=(t_{Sj},t_{Dj})^\top = (\tilde{x}_{jE},\tilde{x}_{jE} - \tilde{x}_{jC}).$ Then, the rejection and non-rejection regions are \begin{align} \mathscr{R}_j &= (e_{Sj},\infty)\times(e_{Tj},\infty),\\ \mathscr{N}_j &= \{(-\infty,f_{sj}]\times\mathbb{R}\}\cup\{\mathbb{R}\times(-\infty,f_{Tj}]\}. \end{align} Thus, for a $$J$$ stage design, parameters $$e_{S1},\dots,e_{SJ}$$, $$e_{T1},\dots,e_{TJ}$$, $$f_{S1},\dots,f_{SJ}$$, and $$f_{T1},\dots,f_{TJ}$$ are needed. For brevity, set $$\boldsymbol{e}_S=(e_{S1},\dots,e_{SJ})$$, $$\boldsymbol{e}_T=(e_{T1},\dots,e_{TJ})$$, $$\boldsymbol{f}_S=(f_{S1},\dots,f_{SJ})$$, and $$\boldsymbol{f}_T=(f_{T1},\dots,f_{TJ})$$. With this, we have $$\mathcal{D} = \{\boldsymbol{n}_C,\boldsymbol{n}_E,\boldsymbol{e}_S,\boldsymbol{e}_T,\boldsymbol{f}_S,\boldsymbol{f}_T\}$$.

Furthermore, in this case, early stopping for futility can be prevented by setting $$f_{S1}=\cdots=f_{SJ-1}=f_{T1}=\cdots=f_{TJ-1}=-\infty$$. Similarly, early stopping for efficacy can be prevented by setting $$e_{S1}=\cdots=e_{SJ-1}=e_{T1}=\cdots=e_{TJ-1}=\infty$$.

As for the designs from Jung (2008) and Shan et al (2013) described above, optimal designs are again identified via an exhaustive search procedure. Here, we make use of the fact that nothing is lost from assuming $$\boldsymbol{e}_S,\boldsymbol{e}_T,\boldsymbol{f}_S,\boldsymbol{f}_T\in\mathbb{Z}^J$$. Furthermore, ph2rand enforces $$e_{SJ} = f_{SJ}$$ and $$e_{TJ} = f_{TJ}$$, which ensures $$\mathscr{C}_J=\emptyset$$. When $$J>1$$, it also makes particular restrictions such that $$e_{Sj} > f_{Sj}$$ and $$e_{Tj} > f_{Tj}$$ which ensure $$\mathscr{C}_j \neq \emptyset$$ for $$j\in\{1,\dots,J-1\}$$.

# 3. Available functions

At present, ph2rand exports ten functions for user use, each of which is described in detail below.

Note that the computations in all of the functions except sim() are performed using exact binomial probabilities, without recourse to simulation or numerical approximations. Consequently, for all of these we should always have that $$S_1(\boldsymbol{\pi}) + \cdots + S_J(\boldsymbol{\pi}) = 1$$, and $$P(\boldsymbol{\pi}) = E_1(\boldsymbol{\pi}) + \cdots + E_J(\boldsymbol{\pi})$$.

In addition, where helpful Rcpp is utilised in order to speed up the evaluations. Nonetheless, notes are given below on options that may substantially increase run time.

## 3.1. des_one_stage()

des_one_stage() determines one-stage two-arm randomised clinical trial designs within the framework described above (i.e., in particular assuming the primary outcome variable is Bernoulli distributed). In all instances, des_one_stage() computes the optimal required sample size in each arm, the associated optimal stopping boundaries, and returns information on key operating characteristics.

It allows the following inputs to be specified

• type: A character string indicating the chosen design framework/test statistic to assume. Must be one of "barnard" (see Section 2.4.2), "binomial" (Section 2.4.1), "fisher" (Section 2.4.3), or "sat" (Section 2.4.4). Defaults to "binomial".
• alpha: A numeric indicating the chosen value for $$\alpha$$, the significance level (i.e., the type-I error-rate). Defaults to 0.1.
• beta: A numeric indicating the chosen value for $$\beta$$, used in the definition of the desired power (i.e., the type-II error-rate). Defaults to 0.2.
• delta: A numeric indicating the chosen value for $$\delta$$, the treatment effect assumed in the power calculation. Defaults to 0.2.
• ratio: A numeric indicating the chosen value for $$r$$, the allocation ratio to the experimental arm, relative to the control arm. Defaults to 1.
• Pi0: A numeric vector indicating the chosen value for $$\Pi_0$$, the control arm response rates to control the type-I error-rate to level $$\alpha$$ for. Must either be of length one, indicating a single point, or of length two. In this case, the elements indicate the range of possible response rates to allow for. Defaults to 0.1.
• Pi1: A numeric vector indicating the chosen value for $$\Pi_1$$s, the control arm response rates to allow for in the power calculations. Must either be of length one, indicating a single point, or of length two. In this case, the elements indicate the range of possible response rates to allow for. Defaults to Pi0[1].
• nCmax: A numeric indicating the maximum value of the sample size in the control arm, $$\tilde{n}_{JC}$$, to consider in the search procedure. Defaults to 50L.
• summary: A logical variable indicating whether a summary of the function’s progress should be printed to the console. Defaults to FALSE.

It returns a list with additional class "ph2rand_des", containing each of the input parameters along with several additional variables, including

• A list in the slot $boundaries giving the rejection boundary/boundaries of the optimal design. The names of these elements depends on the value of type. • A tibble in the slot $feasible summarising the operating characteristics of the feasible designs.
• A numeric in the slot $nC giving the sample size in the control arm, $$n_{1C}$$, for the optimal design. • A numeric in the slot $nE giving the sample size in the experimental arm, $$n_{1E}$$, for the optimal design.
• A tibble in the slot $opchar summarising the operating characteristics of the optimal design. ## 3.2. des_two_stage() ### 3.2.1. Inputs and outputs des_two_stage() determines two-stage two-arm randomised clinical trial designs within the framework described above (i.e., in particular assuming the primary outcome variable is Bernoulli distributed). In all instances, des_two_stage() computes the optimal required sample size in each arm in each stage, the associated optimal stopping boundaries, and returns information on key operating characteristics. It allows the following inputs to be specified • type: A character string indicating the chosen design framework/test statistic to assume. Must be one of "barnard" (see Section 2.4.2), "binomial" (Section 2.4.1), "fisher" (Section 2.4.3), or "sat" (Section 2.4.4). Defaults to "binomial". • alpha: A numeric indicating the chosen value for $$\alpha$$, the significance level (i.e., the type-I error-rate). Defaults to 0.1. • beta: A numeric indicating the chosen value for $$\beta$$, used in the definition of the desired power (i.e., the type-II error-rate). Defaults to 0.2. • delta: A numeric indicating the chosen value for $$\delta$$, the treatment effect assumed in the power calculation. Defaults to 0.2. • ratio: A numeric indicating the chosen value for $$r$$, the allocation ratio to the experimental arm, relative to the control arm. Defaults to 1. • Pi0: A numeric vector indicating the chosen value for $$\Pi_0$$, the control arm response rates to control the type-I error-rate to level $$\alpha$$ for. Must either be of length one, indicating a single point, or of length two. In this case, the elements indicate the range of possible response rates to allow for. Defaults to 0.1. • Pi1: A numeric vector indicating the chosen value for $$\Pi_1$$s, the control arm response rates to allow for in the power calculations. Must either be of length one, indicating a single point, or of length two. In this case, the elements indicate the range of possible response rates to allow for. Defaults to Pi0[1]. • nCmax: A numeric indicating the maximum value of the sample size in the control arm (across both stage), $$\tilde{n}_{JC}$$, to consider in the search procedure. Defaults to 50L. • equal: A logical variable indicating whether the sample size of the two stages should be equal (i.e., whether we should enforce $$n_{1C}=n_{2C}$$ and $$n_{1E}=n_{2E}$$). Defaults to TRUE. • w: A numeric vector indicating the weights to use in the optimality criteria. Must be of length five, with all elements greater than or equal to zero, and at least one of the first four elements strictly positive. Defaults to c(1, 0, 0, 0, 0). • piO: A numeric indicating the chosen value of $$\pi_O$$, the control arm response rate to assume in the optimality criteria. Defaults to Pi0[1]. • efficacy: Only used if type is one of "barnard", "binomial", or "sat". Then, it is a logical variable indicating whether to include early stopping for efficacy in the design. Defaults to FALSE. • futility: Only used if type is one of "barnard", "binomial", or "sat". Then, it is a logical variable indicating whether to include early stopping for futility in the design. Defaults to TRUE. • efficacy_type: Only used if type is "fisher". Then, it is a numeric indicating whether, and which type of, early stopping for efficacy to include in the design. See below for details. Defaults to 0L. • efficacy_param: Only used if type is "fisher" and efficacy_type is not equal to 0L. Then, it is a numeric that influences the precise way in which an efficacy boundary is specified. See below for details. Defaults to NULL. • futility_type: Only used if type is "fisher". Then, it is a numeric indicating whether, and which type of, early stopping for futility to include in the design. See below for details. Defaults to 1L. • futility_param: Only used if type is "fisher" and futility_type is not equal to 0L. Then, it is a numeric that influences the precise way in which a futility boundary is specified. See below for details. Defaults to 1L. • summary: A logical variable indicating whether a summary of the function’s progress should be printed to the console. Defaults to FALSE. It returns a list with additional class "ph2rand_des", containing each of the input parameters along with several additional variables, including • A list in the slot $boundaries giving the rejection boundary/boundaries of the optimal design. The names of these elements depends on the value of type.
• A tibble in the slot $feasible summarising the operating characteristics of the feasible designs. • A numeric in the slot $nC giving the sample sizes in the control arm in each stage, $$\boldsymbol{n}_C=(n_{1C},\dots,n_{JC})^\top$$, for the optimal design.
• A numeric in the slot $nE giving the sample sizes in the experimental arm in each stage, $$\boldsymbol{n}_E=(n_{1E},\dots,n_{JE})^\top$$, for the optimal design. • A tibble in the slot $opchar summarising the operating characteristics of the optimal design.

### 3.2.2. Stopping boundaries when type = "fisher"

When type = "fisher", the input parameters efficacy_type, efficacy_param, futility_type, and futility_param allow for highly flexible choices for how to specify $$\boldsymbol{e}_1$$ and $$\boldsymbol{f}_1$$ {$$\boldsymbol{e}_2$$ always being computed as in Jung and Sargent (2014)}. Specifically, efficacy_type must be equal to 0, 1, or 2. Then - A value of 0 indicates that early stopping for efficacy should be prevented. - For a value of 1, efficacy_param can then be set to NULL or any single numeric whole number. When it is NULL, ph2rand sets $e_{1z_1} = [0.5(n_{1C} + n_{1E})\delta]_* + 1,$ where $$[x]_*$$ indicates the nearest whole number to $$x$$, as originally proposed by Jung and Sargent (2014). When it is any given whole number, $$e$$ say, it sets $$e_{1z_1}=e$$ for all $$z_1$$. - For a value of 2, efficacy_param must be a single numeric in the range $$(0,1)$$. If we refer to this as $$\alpha_1$$, say, then ph2rand chooses $$e_{1z_1}$$ as $e_{1z_1} = \text{argmin}_{e\in\{-n_{C1},\dots,n_{E1}\}} [\text{max}_{\pi\in\Pi_0} E_1\{(\pi,\pi)^\top\} \le \alpha_1].$ Similarly, futility_type must also be equal to 0, 1, or 2. In this case - A value of 0 indicates that early stopping for futility should be prevented. - For a value of 1, futility_param must be a single numeric whole number. If this is $$f$$, say, ph2rand sets $f_{1z_1} = f$ for all $$z_1$$. - For a value of 2, futility_param must be a single numeric in the range $$(0,1)$$. If we refer to this as $$\beta_1$$, say, then ph2rand chooses $$f_{1z_1}$$ as $f_{1z_1} = \text{argmax}_{f\in\{-n_{C1},\dots,n_{E1}\}} [\text{max}_{\pi\in\Pi_1} F_1\{(\pi,\pi+\delta)^\top\} \le \beta_1].$

## 3.3. terminal()

terminal() determines the terminal points of a design returned by des_one_stage() or des_two_stage(), along with their associated decisions. That is, it returns all possible values of the number of responses that could be observed in the arms (stratified by stage where relevant), gives details on what the associated test statistic(s) would be, and provides information on what the decision would be in regards to $$H_0$$. For two-stage designs it also provides details on the scenarios that would lead to continuation to stage two at the interim analysis.

It allows the following inputs to be specified

• des: An object of class "ph2rand_des", as returned by des_one_stage() or des_two_stage(). Defaults to ph2rand::des_one_stage().
• k: A numeric vector indicating which stages to consider when determining the terminal points. Defaults to 1:des$J (i.e., to all stages of the given design). • summary A logical variable indicating whether a summary of the function’s progress should be printed to the console. Defaults to FALSE. It returns a list with additional class "ph2rand_terminal", containing each of the input parameters along with a tibble in the slot $terminal, which gives the determined terminal points.

## 3.4. pmf()

pmf() analytically determines probability mass functions of a design returned by des_one_stage() or des_two_stage(), under given response rate scenarios (see pi).

It allows the following inputs to be specified

• des: An object of class "ph2rand_des", as returned by des_one_stage() or des_two_stage(). Defaults to ph2rand::des_one_stage().
• pi: A numeric vector with two elements, or numeric matrix or data.frame with two columns, giving the response rate scenarios to consider. The first element/column should correspond to the control arm and the second element/column to the experimental arm. Defaults to des$opchar[, 1:2]. • k: A numeric vector indicating which stages to consider in determining the probability mass functions. That is, it will condition the calculations on the trial ending in the stages given in k. Defaults to 1:des$J (i.e., to all stages of the given design).
• summary: A logical variable indicating whether a summary of the function’s progress should be printed to the console. Defaults to FALSE.

It returns a list with additional class "ph2rand_pmf", containing each of the input parameters along with a tibble in the slot $pmf, which gives the determined probability mass functions. ## 3.5. opchar() opchar() analytically determines the operating characteristics of a design returned by des_one_stage() or des_two_stage(), under given response rate scenarios (see pi). It allows the following inputs to be specified • des: An object of class "ph2rand_des", as returned by des_one_stage() or des_two_stage(). Defaults to ph2rand::des_one_stage(). • pi: A numeric vector with two elements, or numeric matrix or data.frame with two columns, giving the response rate scenarios to consider. The first element/column should correspond to the control arm and the second element/column to the experimental arm. Defaults to des$opchar[, 1:2].
• k: A numeric vector indicating which stages to consider in determining the probability mass functions. That is, it will condition the calculations on the trial ending in the stages given in k. Defaults to 1:des$J (i.e., to all stages of the given design). • summary: A logical variable indicating whether a summary of the function’s progress should be printed to the console. Defaults to FALSE. It returns a list with additional class "ph2rand_opchar", containing each of the input parameters along with a tibble in the slot $opchar, which gives the determined operating characteristics.

## 3.6 sim()

sim() determines the operating characteristics of a design returned by des_one_stage() or des_two_stage(), under given response rate scenarios (see pi), via simulation.

It allows the following inputs to be specified

• des: An object of class "ph2rand_des", as returned by des_one_stage() or des_two_stage(). Defaults to ph2rand::des_one_stage().
• pi: A numeric vector with two elements, or numeric matrix or data.frame with two columns, giving the response rate scenarios to consider. The first element/column should correspond to the control arm and the second element/column to the experimental arm. Defaults to des$opchar[, 1:2]. • k: A numeric vector indicating which stages to consider in determining the probability mass functions. That is, it will condition the calculations on the trial ending in the stages given in k. Defaults to 1:des$J (i.e., to all stages of the given design).
• replicates: A numeric indicating the number of replicate simulations to use in estimating the operating characteristics. Defaults to 1e4.
• summary: A logical variable indicating whether a summary of the function’s progress should be printed to the console. Defaults to FALSE.

It returns a list with additional class "ph2rand_sim", containing each of the input parameters along with a tibble in the slot $sim, which gives the estimated operating characteristics. See the other package vignette, accessible via vignette("validation"), for further discussion on how sim() can be used to confirm results determined by des_one_stage(), des_two_stage(), and opchar(). ## 3.7. plot.ph2rand_des() plot.ph2rand_des() plots the operating characteristics of a design returned by des_one_stage() or des_two_stage(), under a range of key response rate scenarios. For convenience, it also calls plot.ph2rand_terminal() to plot the terminal points of the design. It allows the following inputs to be specified • x: An object of class "ph2rand_des", as returned by des_one_stage() or des_two_stage(). • k: A numeric vector indicating which stages to consider in determining the probability mass function. That is, it will condition the calculations on the trial ending in the stages given in k. Defaults to 1:des$J (i.e., to all stages of the given design).
• output: A logical variable indicating whether available outputs should be returned by the function.

## 3.10. plot.ph2rand_pmf()

plot.ph2rand_pmf() plots the probability mass functions of a design, as returned by pmf().

It allows the following inputs to be specified

• x: An object of class "ph2rand_pmf", as returned by pmf().
• output: A logical variable indicating whether outputs should be returned by the function. Defaults to FALSE.

If output = TRUE, it will return a list containing each of the input parameters along with a list in the slot $plots, giving the produced plots of the probability mass functions. # 4. Examples ## 4.1. One-stage design using Jung (2008) First, lets consider a simple toy scenario where we specify that $$\Pi_0=\Pi_1=0.1$$, $$\delta=0.4$$, $$\alpha=\beta=0.1$$, $$r=1$$, and that a one-stage design of the type from Jung (2008) will be used. We can find this design as follows des <- des_one_stage(type = "binomial", alpha = 0.1, beta = 0.1, delta = 0.4, ratio = 1, Pi0 = 0.1, Pi1 = 0.1, nCmax = 20L) A simple summary can then be acquired with summary(des) #> ------------------------------------------------- #> A single-stage trial based on an exact binomial test #> ------------------------------------------------- #> #> --------------- #> Hypothesis test #> --------------- #> You have chosen to test the following hypothesis #> H0 : piE <= piC #> with the following type-I error constraint #> P(0.1,0.1) <= alpha = 0.1 #> and the following type-II error constraint #> P(0.1,0.5) >= 1 - beta = 0.9 #> #> ----------------- #> Design parameters #> ----------------- #> The design has: #> - n1C = 14 #> - n1E = 14 #> - e1 = 3 #> #> ------------------------- #> Operating Characteristics #> ------------------------- #> Key operating characteristics include #> # A tibble: 2 x 3 #> piC piE P(pi) #> <dbl> <dbl> <dbl> #> 1 0.1 0.1 0.0545 #> 2 0.1 0.5 0.921 We can plot the terminal points of this design with term_des <- terminal(des) plot(term_des) #> NULL We can also find and plot the probability mass function for $$\boldsymbol{\pi}=(0.1,0.1)^\top$$ and $$\boldsymbol{\pi}=(0.1,0.5)^\top$$ via pmf_des <- pmf(des, pi = rbind(c(0.1, 0.1), c(0.1, 0.5))) plot(pmf_des) The operating characteristics (i.e., power) under the same scenarios can be found and inspected with opchar_des <- opchar(des, pi = rbind(c(0.1, 0.1), c(0.1, 0.5))) opchar_des$opchar
#> # A tibble: 2 x 3
#>     piC   piE P(pi)
#>   <dbl> <dbl>   <dbl>
#> 1   0.1   0.1  0.0545
#> 2   0.1   0.5  0.921

Finally, we can plot the power for a wider range of scenarios with

plot(des)