The gestate package is designed to assist accurate planning and event prediction for time-to-event clinical trials. In this vignette, we introduce the gestate package, and demonstrate how it may be used in trial planning. A separate vignette (“event_prediction”) covers event prediction of ongoing trials. The first section will show how distributions are easily described and stored as Curve and RCurve objects. The second section will demonstrate analytic methods for trial planning, and the third section will show how simulate may be alternatively used.

The least possible information required to calculate a time-to-event trial’s predicted properties are distributions of events in each arm and an expected distribution of patient recruitment, including the number per arm. However, typically event distributions are over-simplified by implicitly assuming exponential distributions and then defining them by a single number (often the median). This shortcut is not available in gestate since events in many clinical trials do not follow exponential distributions and assuming this without evidence can lead to inaccurate trial planning. Even where no historical information is available and an exponential distribution can be considered to be the best guess, it is important to be clear that a choice has been made in setting a distribution to exponential. Gestate therefore requires explicit selection of distribution types, as well as their parameters, and does not default to exponential.

Aside from simplicity, one of the historical reasons for use of the exponential distribution was a requirement for proportional hazards in many analysis methods, although this should not have prevented use of Weibull distributions. With the rising importance of non-proportional hazards in clinical trials, it is important that trial planning methods can handle both the expected patterns of events and also analytical approaches that are more suited to this situation. To date, this has mainly been done by simulation, with the standard drawbacks of having to account for Monte Carlo error and being relatively slow. Gestate’s numerical integration methods and Curve architecture for handling distributions are designed to handle unusual event distributions as well as complex censoring patterns, and to produce calculations for a range of analytical methods. These approaches are fast in comparison to simulation, and generally quite accurate. Simulation tools have also been provided to assist in validating output, but it is intended that the user arrives at their preferred sample size and then validates it by simulation - this workflow is designed to produce a considerable time-saving.

The gestate package is designed around calculating power from a given sample size rather than sample size from power. While this is clearly necessary for simulation, the analytic approach also does this for the following reasons.

First, it allows power of a trial to be assessed over time. This is important as there are usually questions around the trade-off between trial duration and sample size. Even where it is reasonable to reversibly relate event numbers to power, the most pertinent questions then relate to how many patients are required to produce those events and how long it will take.

Secondly, moderate-to-large changes of sample size usually require changing the length or shape of recruitment, rather than simply scaling the rate. However, the length and shape of recruitment are also important for calculating the number of events at any given time. Methods that calculate sample size from power generally assume that the rate of recruitment can be scaled freely; the alternative of specifying an algorithm to change other recruitment parameters as patient numbers increase cannot be done generally as trial logistics vary widely. To speed sample size identification up, gestate provides an option to estimate a sample size that would produce a specified power at each time point, using the simplified rate scaling assumption. This provides an indication of patient numbers for the next scenario to calculate power for, but allows the user opportunity to change other recruitment parameters appropriately.

A final complicating factor for direct sample size calculation is that the correspondence between event numbers and power is not absolute, even for log-rank tests. Particularly when randomisation ratios are not 1:1, it can be shown that the ‘maturity’ of the data (and in particular the ratio of events between arms) affects the number of events required for a given power. This is highlighted by the inclusion of the Frontier Power method for power calculation that is included in the package,which is a generalisation of the Schoenfeld formula to handle unequal randomisation ratios appropriately. Since data maturity affects power, even simply scaling the sample size to achieve the desired number of events may not actually give the required power^{1}.

The gestate package contains a built-in Shiny App to implement most of the functionality covered in this vignette.

It may be run by the command ‘run_gestate()’.

The app contains real-time interactive plots for the currently specified Curves and RCurve. Updating them will also update the plots. This can be useful for checking parameterisation. Most functionality is covered, and both analytic and simulation approaches are included. Most parameters and arguments are shared between the two approaches, so once one is run, it is trivial to also run the other. All plots and tables created are downloadable.

In general, it is recommended to use the app for interactive, low-throughput sessions, perhaps when there is uncertainty about parameters. It also makes a good starting point for new users of gestate. Use of gestate directly in R is recommended for higher throughput workflows, systematic searching of parameters, or where outputs are fed into other programs.

The gestate package is built upon Curve objects, named because they describe a Survival curve. These contain all necessary information about probability distributions that may be useful for planning time-to-event trials, including both the parameters and the names of functions governing its behaviour.

Curve objects serve three main purposes; they allow the gestate code to be independent of the distributions it runs, allow the free interchange and combination of different distributions, and provide a convenient format for holding all relevant information about a distribution.

So long as a distribution is supported, a user may create a Curve object by using a (constructor) function corresponding to the distribution of interest, providing its parameters as arguments.

gestate V1.4.x comes with the following distributions supported as Curve objects:

Distribution | Function | Parameter Arguments | S(t) |
---|---|---|---|

Exponential | Exponential | lambda | \(e^{- \lambda t}\) |

Weibull | Weibull | alpha, beta | \(e^{-(t/\alpha)^\beta)}\) |

Log-Normal | Lognormal | mu, sigma | \(0.5(1+erf\left(\frac{\log(t)-\mu }{\sigma\sqrt2}\right))\) |

Log-Logistic | LogLogistic | theta, eta | \(\frac{t^{\eta}}{\theta^{\eta}+t^{\eta}}\) |

Gompertz | Gompertz | theta, eta | \(e^{\eta - \eta e^{\theta t}}\) |

Generalised Gamma | GGamma | theta, eta, rho | \(\frac{\Gamma(\eta,(t/\theta)^\rho)}{\Gamma(\eta)}\) |

Piecewise-Exponential | PieceExponential | start, lambda | \(\prod_{x = 1}^{len(\mathbf{\lambda})} e^{- \lambda_x t_x}\) where \(t_x = \min{(start_{x+1},(\max{(0,t-start_x)}))}\) \(start_{x+1} = Inf\) if otherwise undefined. |

Mixture-Exponential | MixExp | props, lambdas | \(p_{1}e^{- \lambda_{1} t} + p_{2}e^{- \lambda_{2} t} + ...\) |

Mixture-Weibull | MixWei | props, alphas, betas | \(p_{1}e^{-(t/\alpha_{1})^{\beta_{1})}} + p_{1}e^{-(t/\alpha_{2})^{\beta_{2})}}+...\) |

Null | Blank | \(1\) |

Event distributions and dropout-censoring distributions must be provided to gestate in Curve format. An example of defining a Curve is as follows:

```
library(gestate)
# Define a Weibull curve with shape 0.8 and scale 200:
curve1 <- Weibull(alpha=200,beta=0.8)
# Show the specified curve
curve1
#> Curve Object
#> Type: Weibull
#> Distribution Parameters:
#> scale : 200
#> shape : 0.8
```

Where you want to make an event / censoring impossible, you can use define a null Curve by the function Blank(). Typically, this is the starting point for calculations (the ‘practical default’), and is useful where no dropout censoring is required. The null curve is also the default within gestate for the dropout censoring distribution.

Recruitment-associated distributions use a special form of Curve object; the RCurve. This differs in several ways. Firstly, it also includes details of patient numbers for two arms; an active arm and a control arm. Secondly, it stores the distribution of patient recruitment, rather than a censoring distribution. This is because the ‘censoring distribution’ can only be defined in conjunction with an assessment time.

gestate V1.4.x comes with the following recruitment distributions supported as RCurve objects:

Distribution | Function | Parameter Arguments | Notes |
---|---|---|---|

Linear | LinearR | rlength, Nactive, Ncontrol | Standard linear recruitment |

Instant | InstantR | Nactive, Ncontrol | All patients in trial for same duration |

Piecewise-linear | PieceR | recruitment, ratio | Recruitment is two-column matrix of durations and rates |

Piecewise-linear, fixed follow-up | PieceRMaxF | recruitment, ratio, maxF | Piecewise recruitment with patients followed for fixed duration |

For most event-driven TTE trials, LinearR or PieceR distributions are appropriate. For trials where patients are instead followed for a fixed duration (e.g. 12 months per patient), the InstantR distribution is typically sufficient, since the trial typically finishes once all patients have been followed for the required time. However, in cases where the end of trial is both event driven and has fixed duration patient follow-up, the PieceRMaxF distribution accounts for this additional complexity.

```
# Define a linear recruitment over 12 months with 100 patients in each arm:
rcurve1 <- LinearR(rlength=12,Nactive=100,Ncontrol=100)
rcurve1
#> RCurve Object
#> Type: LinearR
#> N Total: 200
#> N Active: 100
#> N Control: 100
#> Recruitment Ratio: 1
#> Distribution Parameters:
#> rlength : 12
# Define a more complex recruitment with increasing rates
pieces <- matrix(c(1,2,3,4,5,10,15,32.5),ncol=2)
#Matrix to create a piecewise recruitment distribution. First column should be duration, second column should be rate.
pieces
#> [,1] [,2]
#> [1,] 1 5.0
#> [2,] 2 10.0
#> [3,] 3 15.0
#> [4,] 4 32.5
rcurve2 <- PieceR(recruitment=pieces,ratio=1)
# Show the specified piecewise curve
rcurve2
#> RCurve Object
#> Type: PieceR
#> N Total: 200
#> N Active: 100
#> N Control: 100
#> Recruitment Ratio: 1
#> Distribution Parameters:
#> lengths : 1 2 3 4
#> rates : 5 10 15 32.5
```

Common questions in planning time to event trials include what the expected number of events, survival function, power or width of hazard ratio (HR) confidence interval will be at any given time. In non-proportion hazards situations, the expected HR, difference in Survival functions (referred to as landmark analysis) or difference in restricted mean survival time (RMST) at a given assessment time (including power to test them) may also be of interest. nph_traj is designed to analytically answer these questions and to do so in real-time.

nph_traj uses numerical integration techniques to calculate expected event numbers in each arm at each time point and from these derives predicted properties for the log-rank test and Cox regression using an approach based upon the Pike approximation for the HR^{2}. Separate numerical integration approaches are also used to predict standard errors for RMST and landmark analysis in the presence of censoring and hence calculate power for these analyses.

Due to the interchangeability of Curve objects and the flexibility afforded by numerical integration, the nph_traj function can therefore be used to analytically calculate expected properties over time of a time-to-event trial, even under non-proportional hazards or with complex censoring distributions and hence also derive power trajectories for multiple common analysis methods.

nph_traj makes no explicit assumptions about the unit of time, only that all inputs have the same unit. In practice however, it is usually most convenient for clinical trial planning to use units of months. nph_traj creates trajectories with one entry at each integer time, so using units of years will typically lead to overly-granular trajectories. Conversely, using e.g. units of days or weeks is typically impractical (as recruitment data is typically summarised by month), slow (due to the increased number of calculations), and over-precise (time lags and uncertainties typically ensure that timings to the nearest month are of most relevance).

At a minimum, nph_traj requires a Curve object to describe the distribution of each of the active and control arms (‘active_ecurve’ and ‘control_ecurve’ arguments respectively), as well as an RCurve object (‘rcurve’ argument) to describe the recruitment distribution. For more information on Curve and RCurve objects, see section 2. The distributions of the active and control arms should typically be estimated based upon parameters or curve-fitting from previous trials (using e.g. fit_KM or fit_tte_data from the gestate package; see the event_prediction vignette).

Example call:

```
# Define exponential distributions for active and control arms, with HR of 0.7
controlCurve <- Exponential(lambda=0.1)
activeCurve <- Exponential(lambda=0.1*0.7)
# Define a linear recruitment of 400 patients
rcurve3 <- LinearR(rlength=12,Nactive=200,Ncontrol=200)
# Run nph_traj with default settings to calculate expected properties up to 10 months
output <- nph_traj(active_ecurve=activeCurve,control_ecurve=controlCurve,rcurve=rcurve3,max_assessment=10)
```

```
# Show output from previous section
output
#> $active_ecurve
#> Curve Object
#> Type: Exponential
#> Distribution Parameters:
#> rate : 0.07
#>
#> $control_ecurve
#> Curve Object
#> Type: Exponential
#> Distribution Parameters:
#> rate : 0.1
#>
#> $active_dcurve
#> Curve Object
#> Type: Blank
#> Distribution Parameters:
#> Zero : 0
#>
#> $control_dcurve
#> Curve Object
#> Type: Blank
#> Distribution Parameters:
#> Zero : 0
#>
#> $rcurve
#> RCurve Object
#> Type: LinearR
#> N Total: 400
#> N Active: 200
#> N Control: 200
#> Recruitment Ratio: 1
#> Distribution Parameters:
#> rlength : 12
#>
#> $HRbound
#> [1] 1
#>
#> $Summary
#> Time Patients Events_Active Events_Control Events_Total HR LogHR LogHR_SE
#> 1 1 33.33 0.570 0.806 1.376 0.7 -0.3567 1.72143
#> 2 2 66.67 2.228 3.122 5.350 0.7 -0.3567 0.87239
#> 3 3 100.00 4.901 6.803 11.704 0.7 -0.3567 0.58938
#> 4 4 133.33 8.520 11.720 20.240 0.7 -0.3567 0.44788
#> 5 5 166.67 13.021 17.755 30.776 0.7 -0.3567 0.36298
#> 6 6 200.00 18.344 24.802 43.146 0.7 -0.3567 0.30639
#> 7 7 233.33 24.435 32.764 57.199 0.7 -0.3567 0.26596
#> 8 8 266.67 31.240 41.555 72.795 0.7 -0.3567 0.23564
#> 9 9 300.00 38.712 51.095 89.807 0.7 -0.3567 0.21205
#> 10 10 333.33 46.806 61.313 108.119 0.7 -0.3567 0.19318
#> Schoenfeld_Power Frontier_Power
#> 1 0.0400 0.0323
#> 2 0.0609 0.0434
#> 3 0.0885 0.0555
#> 4 0.1235 0.1085
#> 5 0.1659 0.1531
#> 6 0.2152 0.2037
#> 7 0.2705 0.2600
#> 8 0.3306 0.3210
#> 9 0.3936 0.3850
#> 10 0.4579 0.4504
```

The output from nph_traj is a list, with the first few entries corresponding to the input Curve and RCurve objects to provide context to the results. The results themselves are found under $Summary and are in the form of a trajectory, providing information about the trial at each integer time point up to the maximum assessment time (‘max_assessment’ argument).

Each row corresponds to a separate time point, given by the first column;‘Time’. The second column, ‘Patients’ gives the expected number of patients entered into the trial by that time. The three ‘Events’ columns give the expected numbers of events in each arm and overall at each time.The next three columns provide the Hazard Ratio, Log-Hazard Ratio and SE of the Log-Hazard Ratio. Finally, the power for the log-rank test / Cox regression are given based upon the Schoenfeld^{3,4} and Frontier^{1} methods.

Administrative censoring is handled implicitly based upon the specified RCurve object. However, dropout censoring is also common in time-to-event trials. This can be specified through Curve objects, and separate censoring distributions are supported for each arm. By default, gestate uses no censoring via the Blank() Curve.

It should be noted that the required censoring distributions must represent that of the censoring that would occur in the absence of events or administrative censoring. The observed numbers of censored patients in each arm will therefore also depend partially on the event distributions. The suggested way to use existing data to parameterise the censoring distributions is to create ‘reverse Kaplan Meier’ plots of existing data, whereby dropout censorings are ‘events’, and both events and administrative censorings are ‘censorings’. Curve-fitting techniques (e.g. fit_KM or fit_tte_data from gestate, see event_prediction vignette) may then be used to estimate the parametric distributions.

To control the one-sided type I error for power calculations, the ‘alpha1’ argument may be used. This is 0.025 by default.

An additional feature of nph_traj is it can estimate the required number of patients to achieve a given power at a given time point (using Schoenfeld), assuming all parameters remain constant (only the rate of recruitment changes to allow patient number to change in the existing specified time frame). This can be requested by setting the ‘required_power’ argument to the required decimal, e.g. 0.9 for 90% power. For sample size calculations it is strongly recommended to only use this as a guide of the next set of parameters to try, and to rerun the calculation. This is because unless the number is similar to that in the existing RCurve, it will typically require changing the recruitment length to reach the new number, and that will result in changes to the calculation.

It is also possible to get more detailed outputs by specifying ‘detailed_output=TRUE’. This includes several additional quantities required by the calculations or derived from them. Full details may be found in the documentation of the nph_traj function (via e.g. ?nph_traj).

Version 1.4.1 onwards supports non-inferiority trials for power calculations based on hazard ratio (only). The optional ‘HRbound’ argument can be used to specify the HR to be used as the non-inferiority bound. By default this is 1, i.e. a superiority trial. Values greater than 1 should be specified for non-inferiority settings. Values less than 1 could be used for trials looking to show evidence that they are at least a certain amount better than the comparator. Note that any requested power calculations based on a log rank Z test, or for RMST/landmark analyses will still be on the basis of superiority.

```
# Define exponential distributions for active and control arms, with HR of 0.7
censorCurveA <- Exponential(lambda=0.001)
censorCurveC <- Exponential(lambda=0.002)
# Run nph_traj with default settings to calculate expected properties up to 10 months in a NI setting against a HR bound of 1.3.
output1 <- nph_traj(active_ecurve=activeCurve,control_ecurve=controlCurve,active_dcurve=censorCurveA,control_dcurve=censorCurveC,rcurve=rcurve3,max_assessment=10,HRbound=1.3,detailed_output = TRUE,required_power = 0.9)
# Display output
output1
#> $active_ecurve
#> Curve Object
#> Type: Exponential
#> Distribution Parameters:
#> rate : 0.07
#>
#> $control_ecurve
#> Curve Object
#> Type: Exponential
#> Distribution Parameters:
#> rate : 0.1
#>
#> $active_dcurve
#> Curve Object
#> Type: Exponential
#> Distribution Parameters:
#> rate : 0.001
#>
#> $control_dcurve
#> Curve Object
#> Type: Exponential
#> Distribution Parameters:
#> rate : 0.002
#>
#> $rcurve
#> RCurve Object
#> Type: LinearR
#> N Total: 400
#> N Active: 200
#> N Control: 200
#> Recruitment Ratio: 1
#> Distribution Parameters:
#> rlength : 12
#>
#> $HRbound
#> [1] 1.3
#>
#> $Summary
#> Time Patients Events_Active Events_Control E_Events_Active E_Events_Control
#> 1 1 33.33 0.570 0.806 0.691 0.684
#> 2 2 66.67 2.227 3.118 2.699 2.645
#> 3 3 100.00 4.896 6.790 5.930 5.757
#> 4 4 133.33 8.509 11.691 10.297 9.903
#> 5 5 166.67 13.001 17.701 15.719 14.983
#> 6 6 200.00 18.310 24.712 22.121 20.902
#> 7 7 233.33 24.382 32.629 29.434 27.578
#> 8 8 266.67 31.165 41.362 37.592 34.934
#> 9 9 300.00 38.608 50.833 46.536 42.904
#> 10 10 333.33 46.668 60.969 56.210 51.427
#> Events_Total HR LogHR LogHR_SE HR_CI_Upper HR_CI_Lower Peto_LogHR
#> 1 1.375 0.7 -0.3567 1.72184 20.4516 0.0240 -0.3533
#> 2 5.344 0.7 -0.3567 0.87279 3.8728 0.1265 -0.3536
#> 3 11.686 0.7 -0.3567 0.58979 2.2239 0.2203 -0.3539
#> 4 20.200 0.7 -0.3567 0.44829 1.6853 0.2907 -0.3542
#> 5 30.701 0.7 -0.3567 0.36339 1.4270 0.3434 -0.3544
#> 6 43.023 0.7 -0.3567 0.30679 1.2771 0.3837 -0.3547
#> 7 57.011 0.7 -0.3567 0.26636 1.1798 0.4153 -0.3550
#> 8 72.526 0.7 -0.3567 0.23604 1.1118 0.4407 -0.3552
#> 9 89.441 0.7 -0.3567 0.21246 1.0616 0.4616 -0.3555
#> 10 107.637 0.7 -0.3567 0.19359 1.0230 0.4790 -0.3557
#> Expected_Z Expected_P Log_Rank_Stat Variance V_Pike_Peto Event_Ratio
#> 1 -0.2071 0.4179 -0.1215 0.344 0.344 0.7072
#> 2 -0.4087 0.3414 -0.4723 1.336 1.336 0.7142
#> 3 -0.6048 0.2727 -1.0335 2.921 2.921 0.7211
#> 4 -0.7956 0.2131 -1.7875 5.047 5.048 0.7278
#> 5 -0.9815 0.1632 -2.7181 7.669 7.671 0.7345
#> 6 -1.1626 0.1225 -3.8105 10.742 10.747 0.7409
#> 7 -1.3391 0.0903 -5.0511 14.229 14.238 0.7473
#> 8 -1.5111 0.0654 -6.4275 18.093 18.107 0.7535
#> 9 -1.6788 0.0466 -7.9282 22.302 22.323 0.7595
#> 10 -1.8425 0.0327 -9.5426 26.825 26.856 0.7654
#> Schoenfeld_Power Event_Prop_Power Z_Power Frontier_Power Estimated_SS
#> 1 0.0551 0.0545 0.0398 0.0396 31896
#> 2 0.1067 0.1048 0.0604 0.0870 8209
#> 3 0.1836 0.1799 0.0877 0.1695 3755
#> 4 0.2847 0.2789 0.1221 0.2724 2172
#> 5 0.4032 0.3954 0.1639 0.3927 1429
#> 6 0.5280 0.5190 0.2126 0.5194 1020
#> 7 0.6469 0.6378 0.2673 0.6404 770
#> 8 0.7505 0.7421 0.3268 0.7458 605
#> 9 0.8333 0.8263 0.3893 0.8302 491
#> 10 0.8946 0.8893 0.4532 0.8926 408
```

nph_traj can be provided essentially any valid Curve object for each event distribution, and fully supports non-proportional hazards. To take full advantage of this, it can calculate expectations for landmark and RMST quantities, along with the estimated standard errors and power calculations.

To request RMST calculations, include the argument ‘RMST = x’, where x is the restriction time of interest. To request landmark calculations, include the argument ‘landmark = y’, where y is the landmark time of interest. Here is an example of a non-proportional hazards calculation where estimates of power for both RMST and landmark calculations are requested:

```
# Define exponential distributions for active and control arms, with HR of 0.7
activeCurveNPH <- Weibull(alpha=100,beta=0.8)
controlCurveNPH <- Weibull(alpha=50,beta=1)
# Run nph_traj with default settings to calculate expected properties up to 30 months
output2 <- nph_traj(active_ecurve=activeCurveNPH,control_ecurve=controlCurveNPH,rcurve=rcurve3,max_assessment=30,RMST=20,landmark=20)
# Display output
output2
#> $active_ecurve
#> Curve Object
#> Type: Weibull
#> Distribution Parameters:
#> scale : 100
#> shape : 0.8
#>
#> $control_ecurve
#> Curve Object
#> Type: Weibull
#> Distribution Parameters:
#> scale : 50
#> shape : 1
#>
#> $active_dcurve
#> Curve Object
#> Type: Blank
#> Distribution Parameters:
#> Zero : 0
#>
#> $control_dcurve
#> Curve Object
#> Type: Blank
#> Distribution Parameters:
#> Zero : 0
#>
#> $rcurve
#> RCurve Object
#> Type: LinearR
#> N Total: 400
#> N Active: 200
#> N Control: 200
#> Recruitment Ratio: 1
#> Distribution Parameters:
#> rlength : 12
#>
#> $HRbound
#> [1] 1
#>
#> $Summary
#> Time Patients Events_Active Events_Control Events_Total HR LogHR
#> 1 1 33.33 0.231 0.166 0.396 1.3970 0.3343
#> 2 2 66.67 0.798 0.658 1.456 1.2173 0.1967
#> 3 3 100.00 1.646 1.470 3.116 1.1235 0.1165
#> 4 4 133.33 2.747 2.597 5.344 1.0616 0.0598
#> 5 5 166.67 4.085 4.031 8.116 1.0162 0.0160
#> 6 6 200.00 5.644 5.767 11.411 0.9806 -0.0196
#> 7 7 233.33 7.413 7.799 15.212 0.9516 -0.0496
#> 8 8 266.67 9.385 10.120 19.505 0.9273 -0.0755
#> 9 9 300.00 11.550 12.725 24.275 0.9064 -0.0983
#> 10 10 333.33 13.901 15.609 29.510 0.8882 -0.1185
#> 11 11 366.67 16.433 18.766 35.199 0.8721 -0.1368
#> 12 12 400.00 19.140 22.190 41.330 0.8577 -0.1535
#> 13 13 400.00 21.786 25.711 47.497 0.8412 -0.1729
#> 14 14 400.00 24.260 29.162 53.422 0.8243 -0.1932
#> 15 15 400.00 26.614 32.545 59.159 0.8087 -0.2124
#> 16 16 400.00 28.871 35.861 64.732 0.7943 -0.2303
#> 17 17 400.00 31.045 39.111 70.156 0.7813 -0.2468
#> 18 18 400.00 33.146 42.297 75.443 0.7694 -0.2622
#> 19 19 400.00 35.183 45.419 80.602 0.7585 -0.2764
#> 20 20 400.00 37.160 48.480 85.640 0.7485 -0.2897
#> 21 21 400.00 39.084 51.481 90.564 0.7393 -0.3021
#> 22 22 400.00 40.957 54.421 95.379 0.7308 -0.3137
#> 23 23 400.00 42.784 57.304 100.088 0.7228 -0.3246
#> 24 24 400.00 44.568 60.130 104.698 0.7154 -0.3349
#> 25 25 400.00 46.311 62.899 109.210 0.7085 -0.3446
#> 26 26 400.00 48.016 65.575 113.590 0.7020 -0.3539
#> 27 27 400.00 49.684 68.275 117.959 0.6959 -0.3625
#> 28 28 400.00 51.318 70.883 122.201 0.6902 -0.3708
#> 29 29 400.00 52.919 73.440 126.359 0.6847 -0.3787
#> 30 30 400.00 54.488 75.946 130.434 0.6796 -0.3863
#> LogHR_SE Schoenfeld_Power Frontier_Power RMST_Restrict RMST_Active
#> 1 3.20639 0.0195 0.0250 20 NA
#> 2 1.66272 0.0188 0.0211 20 NA
#> 3 1.13414 0.0196 0.0216 20 NA
#> 4 0.86535 0.0212 0.0227 20 NA
#> 5 0.70206 0.0237 0.0242 20 NA
#> 6 0.59210 0.0270 0.0262 20 NA
#> 7 0.51290 0.0312 0.0286 20 NA
#> 8 0.45307 0.0365 0.0346 20 NA
#> 9 0.40624 0.0429 0.0407 20 NA
#> 10 0.36856 0.0507 0.0482 20 NA
#> 11 0.33758 0.0601 0.0573 20 NA
#> 12 0.31164 0.0712 0.0680 20 NA
#> 13 0.29083 0.0863 0.0825 20 NA
#> 14 0.27436 0.1049 0.1004 20 NA
#> 15 0.26084 0.1265 0.1212 20 NA
#> 16 0.24948 0.1506 0.1447 20 NA
#> 17 0.23975 0.1771 0.1704 20 NA
#> 18 0.23129 0.2057 0.1983 20 NA
#> 19 0.22386 0.2360 0.2280 20 NA
#> 20 0.21725 0.2677 0.2591 20 NA
#> 21 0.21133 0.3006 0.2915 20 17.2073
#> 22 0.20599 0.3342 0.3248 20 17.2073
#> 23 0.20113 0.3683 0.3586 20 17.2073
#> 24 0.19670 0.4025 0.3926 20 17.2073
#> 25 0.19263 0.4366 0.4266 20 17.2073
#> 26 0.18894 0.4705 0.4604 20 17.2073
#> 27 0.18535 0.5034 0.4935 20 17.2073
#> 28 0.18219 0.5357 0.5259 20 17.2073
#> 29 0.17918 0.5670 0.5573 20 17.2073
#> 30 0.17638 0.5971 0.5877 20 17.2073
#> RMST_Control RMST_Delta RMST_SE RMST_Z RMST_Power RMST_Failure LM_Time
#> 1 NA NA NA NA NA 1 20
#> 2 NA NA NA NA NA 1 20
#> 3 NA NA NA NA NA 1 20
#> 4 NA NA NA NA NA 1 20
#> 5 NA NA NA NA NA 1 20
#> 6 NA NA NA NA NA 1 20
#> 7 NA NA NA NA NA 1 20
#> 8 NA NA NA NA NA 1 20
#> 9 NA NA NA NA NA 1 20
#> 10 NA NA NA NA NA 1 20
#> 11 NA NA NA NA NA 1 20
#> 12 NA NA NA NA NA 1 20
#> 13 NA NA NA NA NA 1 20
#> 14 NA NA NA NA NA 1 20
#> 15 NA NA NA NA NA 1 20
#> 16 NA NA NA NA NA 1 20
#> 17 NA NA NA NA NA 1 20
#> 18 NA NA NA NA NA 1 20
#> 19 NA NA NA NA NA 1 20
#> 20 NA NA NA NA NA 1 20
#> 21 16.484 0.7233 0.6160 1.1742 0.2160 0 20
#> 22 16.484 0.7233 0.6084 1.1888 0.2203 0 20
#> 23 16.484 0.7233 0.6025 1.2006 0.2238 0 20
#> 24 16.484 0.7233 0.5978 1.2099 0.2266 0 20
#> 25 16.484 0.7233 0.5942 1.2172 0.2288 0 20
#> 26 16.484 0.7233 0.5916 1.2226 0.2305 0 20
#> 27 16.484 0.7233 0.5897 1.2265 0.2316 0 20
#> 28 16.484 0.7233 0.5885 1.2291 0.2324 0 20
#> 29 16.484 0.7233 0.5878 1.2306 0.2329 0 20
#> 30 16.484 0.7233 0.5874 1.2314 0.2331 0 20
#> LM_Active LM_Control LM_Delta LM_A_SE LM_C_SE LM_D_SE LM_Z LM_Power
#> 1 NA NA NA NA NA NA NA NA
#> 2 NA NA NA NA NA NA NA NA
#> 3 NA NA NA NA NA NA NA NA
#> 4 NA NA NA NA NA NA NA NA
#> 5 NA NA NA NA NA NA NA NA
#> 6 NA NA NA NA NA NA NA NA
#> 7 NA NA NA NA NA NA NA NA
#> 8 NA NA NA NA NA NA NA NA
#> 9 NA NA NA NA NA NA NA NA
#> 10 NA NA NA NA NA NA NA NA
#> 11 NA NA NA NA NA NA NA NA
#> 12 NA NA NA NA NA NA NA NA
#> 13 NA NA NA NA NA NA NA NA
#> 14 NA NA NA NA NA NA NA NA
#> 15 NA NA NA NA NA NA NA NA
#> 16 NA NA NA NA NA NA NA NA
#> 17 NA NA NA NA NA NA NA NA
#> 18 NA NA NA NA NA NA NA NA
#> 19 NA NA NA NA NA NA NA NA
#> 20 NA NA NA NA NA NA NA NA
#> 21 0.7589 0.6703 0.0885 0.0413 0.0481 0.0634 1.3971 0.2868
#> 22 0.7589 0.6703 0.0885 0.0374 0.0429 0.0569 1.5562 0.3432
#> 23 0.7589 0.6703 0.0885 0.0351 0.0399 0.0532 1.6647 0.3839
#> 24 0.7589 0.6703 0.0885 0.0336 0.0379 0.0507 1.7465 0.4155
#> 25 0.7589 0.6703 0.0885 0.0326 0.0365 0.0489 1.8104 0.4405
#> 26 0.7589 0.6703 0.0885 0.0318 0.0354 0.0476 1.8595 0.4600
#> 27 0.7589 0.6703 0.0885 0.0312 0.0347 0.0467 1.8976 0.4751
#> 28 0.7589 0.6703 0.0885 0.0308 0.0341 0.0460 1.9262 0.4865
#> 29 0.7589 0.6703 0.0885 0.0306 0.0337 0.0455 1.9467 0.4947
#> 30 0.7589 0.6703 0.0885 0.0304 0.0334 0.0452 1.9601 0.5001
```

All of the RMST output columns are prefixed RMST_ . The first column, RMST_Restrict reports the restriction time, then RMST_Active, RMST_Control and RMST_Delta provide the RMST estimates for each arm and the difference between them. Following them, RMST_SE provides the SE for the difference, RMST_Z provides the expected Z score, and RMST_Power the power.Finally, RMST_Failure provides the probability that the RMST calculation will fail due to one or both of the arms having an incalculable RMST (caused by the Survival curve being undefined at the time of restriction).

All of the landmark output columns are prefixed LM_. The first column, LM_Time reports the time of the landmark analysis. LM_Active and LM_Control then give the estimates of S(t) for each arm, and LM_Delta that for the difference. LM_A_SE, LM_C_SE and LM_D_SE then give the corresponding standard errors. Finally LM_Z and LM_Power provide the expected Z score and power.

A plotting function, plot_npht is available to provide some commonly-useful trajectory plots for nph_traj output.

By default, this produces plots of the Survival functions for the event distributions (a KM plot), the CDFs of the censoring distributions, recruited patients over time, expected total events over time, expected HR over time (with predicted CI), and a plot of power over time for all available methods in the data.

The first three of these are useful for checking assumptions, to e.g. check that the distributions are reasonable and what was wanted. The expected event plot is helpful for setting expectations for trial conduct of event numbers. The logHR plot is useful in non-proportional hazards trials for looking at how the expected HR will change over time; the predicted CI for it also allows a straightforward observation of the time at which the HR will become significant. Finally, the power plot allows the evolution of the power over time to be observed, and comparisons to be made between the different analysis methods.

All plots are produced by default, but each individual plot may be disabled via arguments, and likewise arguments may be used to prevent the plotting of individual trajectories from the power plot. Standard plotting arguments may also be passed to it, e.g. regarding text size.

Simulating trials is an alternative to direct property calculation. As it is much slower, introduces Monte Carlo error and can only simulate one assessment time per run, it is recommended to use it primarily to check analytic results from nph_traj or as a starting point for simulating more complex cases that are beyond the scope of nph_traj.

Syntax for simulate_trials is very similar to that of nph_traj, and the same Curves and RCurve can be passed to it via the same argument names (e.g. ‘active_ecurve’; see section 3.1 for more details). As with nph_traj, it is minimally required to specify distribution Curve objects for the active and control arms, as well as an RCurve object describing the recruitment. For further details on Curve/RCurve objects, please see section 2.

In addition to these similar inputs, two simulation-specific arguments are also required; the number of simulation iterations to run (‘iterations’) and the random seed to use (‘seed’). Finally, it is also necessary to specify either a time (‘assess’), or a fixed number of events (‘fix_events’) to analyse at. If both are specified, the fixed event number will be used.

Four columns are produced in the output matrix; ‘Time’, which gives the time, ‘Censored’, which gives the censoring indicator (where 1 is patient was censored), ‘Trt’, which gives the treatment number, and ‘Iter’, which is the iteration number.

A simple example corresponding to the first example from the nph_traj section is given here:

```
# Simple simulation corresponding to first example of nph_traj
# Number of iterations kept unrealistically low to reduce processing time
simulation1 <- simulate_trials(active_ecurve=activeCurve,control_ecurve=controlCurve,rcurve=rcurve3,assess=10,iterations=100,seed=1234)
#> Note: assessment time is shorter than the length of the recruitment period:12. Any attempt to increase the assessment time at a later date will result in missing patients!
```

Using the same syntax as nph_traj, you can also specify censoring distributions that control dropout by providing Curve objects to the two dcurve arguments.

Also similar to nph_traj, more detailed output can be requested (‘detailed_output=TRUE’), which includes the times for events and censorings that were not observed because something else happened first. For most purposes this extra detail is not required, but it can be useful for checking distributions and it is required if you later want to change the time of assessment (or fixed event number). Functions to change assessment times will be covered in section 4.2.

Two formats of simulated data are supported by gestate; matrix and list format. In the default matrix format, all simulated data is present in a single table. In list format, each iteration is separated into its own table within a list and the Iter column is omitted. The ‘output_type’ argument controls which is produced. Both formats are automatically supported by analyse_sim.

When performing large simulations in R, all data is simultaneously held within RAM. Efforts has been taken to reduce the RAM usage of gestate’s simulations, but large simulations (>10,000) will still require a large quantity of available system memory; exceeding this leads to error messages about the inability to allocate sufficient RAM, excessive slowdown and/or system crashes. Considerable RAM savings can be made by shrinking the width of the simulation data sets by using the list format and not using detailed output, so these settings are recommended for large ‘production’ simulations; together these options save about 2/3rds of the RAM footprint compared to detailed output in matrix format.

A more complicated example showing the additional options and based on the second and third examples from the nph_traj section is presented here:

```
# Simple simulation corresponding to first example of nph_traj
# Number of iterations kept unrealistically low to reduce processing time
simulation2 <- simulate_trials(active_ecurve=activeCurveNPH,control_ecurve=controlCurveNPH,active_dcurve=censorCurveA,control_dcurve=censorCurveC,rcurve=rcurve3,fix_events=100,iterations=100,seed=12345,detailed_output = TRUE,output_type = "list")
```

Sometimes it is believed that there are important covariates in a trial or that there is heterogeneity of treatment effect between different patient strata. simulate_trials_strata is available to simulate this.

simulate_trials_strata acts as a wrapper for simulate_trials; consequently syntax is almost identical, with two main exceptions:

Firstly there are two additional arguments; ‘stratum_probs’ and ‘stratum_name’. The first is a required vector summing to 1 that contains the probability of a patient belonging to each stratum, with entries corresponding to each stratum required. If ‘stratum_probs’ does not sum to 1, an error will be produced. The second is optional and gives the name of the column with the stratum variable in.

The other main syntax difference is that each of the four arguments requiring Curve objects instead now instead takes a list of Curve objects. If a single Curve is supplied to a given argument, it will be shared across all strata (this is commonly done for the censoring arguments). Otherwise, the list should be of the same length as the ‘stratum_probs’. In this case, the Curve objects provide the distributions for events or censorings within their arm for their respective strata only.

A simple example is as follows:

```
# Define a Curve list for the active arm but keep just a single value for the control arm, to represent a predictive covariate.
activeCurveStrata <- c(activeCurve, Weibull(alpha=100,beta=0.8))
simulation3 <- simulate_trials_strata(stratum_probs=c(0.5,0.5),active_ecurve=activeCurveStrata,control_ecurve=controlCurve,rcurve=rcurve3, assess=10,iterations=100,seed=1234)
#> Note: assessment time is shorter than the length of the recruitment period:12. Any attempt to increase the assessment time at a later date will result in missing patients!
#>
#> Note: assessment time is shorter than the length of the recruitment period:12. Any attempt to increase the assessment time at a later date will result in missing patients!
```

Once simulations are complete, they may be analysed straight away. However sometimes it is desirable to first modify the data. gestate has two functions available to do this; ‘set_event_number’ and ‘set_assess_time’. These can be used to modify the assessment times to respectively either a fixed event number or a given assessment time. Both are usable on simulations previously using either approach, so long as detailed output is available. By keeping the data’s existing assessment criteria, it is also possible to just use these functions to change the format of the data from matrix to list, or vice versa.

There is one caveat; Patients that have not yet entered the trial at the time of assessment are always excluded from the data set. If the current assessment time for an iteration is before the maximum recruitment time, then any increases in assessment time will lead to errors caused by these missing patients. This is straightforward to check if the original data has a fixed assessment time, but can be awkward if a fixed event number was specified as the assessment time will vary between iterations. It is recommended therefore to only apply it to fixed event simulations if either the pre-existing number of events is sufficiently large that no iteration is still in the recruitment period, or if the number of events is being reduced (rather than a fixed time being set).

Using either function is straightforward; provide the name of the simulated data set as an argument to the relevant function, then the new time / event number. By default the output format will be the same as that entered, and detailed output will remain. Both of these may be changed by the same arguments discussed in the previous section.

Two examples:

```
# Increase the number of events for simulation 2
simulation2a <- set_event_number(data=simulation2,events=120)
#Rerun simulation 1 with detailed output specified
simulation1a <- simulate_trials(active_ecurve=activeCurve,control_ecurve=controlCurve,rcurve=rcurve3,assess=10,iterations=100,seed=1234,detailed_output=TRUE)
#> Note: assessment time is shorter than the length of the recruitment period:12. Any attempt to increase the assessment time at a later date will result in missing patients!
# Retain the assessment time for simulation 1 at 10, but convert to a list format with simple output
simulation1b <- set_assess_time(data=simulation1a,time=10,output_type="list",detailed_output = FALSE)
```

analyse_sim can be used to analyse data simulated by gestate. It is designed to automatically detect the input formats and provide minimal-fuss analysis. To perform a log-rank test and Cox regression is thus simply:

```
# Perform log-rank test and Cox regression on simulated data
analysis1 <- analyse_sim(simulation1)
head(analysis1)
#> HR LogHR LogHR_SE HR_Z HR_P LR_Z
#> result.1 0.8860301 -0.1210044 0.2123402 -0.5698611 0.284385967 -0.5702067
#> result.2 0.6544765 -0.4239196 0.2004611 -2.1147227 0.017226794 -2.1303437
#> result.3 0.8233719 -0.1943473 0.1977909 -0.9825896 0.162904737 -0.9841202
#> result.4 0.5907935 -0.5262887 0.1988439 -2.6467425 0.004063560 -2.6771595
#> result.5 0.5676261 -0.5662923 0.1848233 -3.0639662 0.001092118 -3.1042636
#> result.6 0.6570256 -0.4200323 0.1994501 -2.1059516 0.017604278 -2.1213151
#> LR_P Events_Active Events_Control
#> result.1 0.2842687721 43 46
#> result.2 0.0165716240 44 58
#> result.3 0.1625282124 46 58
#> result.4 0.0037124640 42 64
#> result.5 0.0009537665 52 68
#> result.6 0.0169476461 43 61
```

analyse_sim produces a matrix with each row containing the results of one iteration. For Cox regression, output is produced for the HR and log-HR as well as its standard error, Z score and p-value. For the log-rank test the Z score and p-value are returned. The number of events in each arm is also produced,

Stratified analysis for a single variable (covariate adjustment for Cox) is also supported by using the ‘stratum’ argument to provide the column name of the stratification variable:

```
# Perform stratified log-rank test and covariate-adjusted Cox regression on simulated data
analysis3 <- analyse_sim(data=simulation3, stratum="Stratum")
head(analysis3)
#> HR LogHR LogHR_SE HR_Z HR_P LR_Z
#> result.1 0.4983595 -0.6964335 0.2161342 -3.222228 0.0006359899 -3.190611
#> result.2 0.5574817 -0.5843256 0.2163308 -2.701074 0.0034557975 -2.618753
#> result.3 0.5163259 -0.6610170 0.2027848 -3.259698 0.0005576546 -3.253905
#> result.4 0.4923199 -0.7086266 0.2256932 -3.139778 0.0008453787 -3.244950
#> result.5 0.5134030 -0.6666942 0.2197812 -3.033446 0.0012088916 -3.126863
#> result.6 0.5207889 -0.6524105 0.2104464 -3.100127 0.0009671883 -3.060615
#> LR_P Events_Active Events_Control
#> result.1 0.0007098606 34 61
#> result.2 0.0044125890 36 55
#> result.3 0.0005691522 38 68
#> result.4 0.0005873565 30 57
#> result.5 0.0008834111 31 64
#> result.6 0.0011044150 37 59
```

Landmark and RMST analysis may also be requested by specifying the landmark and restriction times respectively to the ‘landmark’ and ‘RMST’ arguments, (the same syntax as nph_traj). RMST analysis follows the methods from Uno et al and Tian et al^{5,6,7}, while landmark analysis uses the Greenwood standard error and normal approximation approach.

For both landmark and RMST analyses, estimates are returned for each arm and the difference between them. Standard errors for all three quantities are produced, along with the Z score and p-value.

To speed up analysis, gestate supports parallel processing through the doParallel package. If this is installed then the ‘parallel_cores’ argument may be set to the required number of cores. Analysis code has also been optimised in V1.4.x to increase speed, with 4-fold improvement overall and at least 50-fold increase for covariate-adjusted RMST in comparison to v1.3.2.

A final analysis example is:

```
# Perform log-rank test, Cox regression, landmark analysis and RMST analysis on simulated data using parallel processing.
analysis2 <- analyse_sim(data=simulation2, RMST=10, landmark=10, parallel_cores=1)
head(analysis2)
#> HR LogHR LogHR_SE HR_Z HR_P LR_Z
#> result.1 0.9741865 -0.02615254 0.2000819 -0.1307092 0.4480026661 -0.1307129
#> result.2 0.8666650 -0.14310281 0.2003264 -0.7143482 0.2375059393 -0.7149549
#> result.3 0.6853549 -0.37781849 0.2024118 -1.8665836 0.0309798814 -1.8775705
#> result.4 0.7639749 -0.26922032 0.2015036 -1.3360573 0.0907652693 -1.3400901
#> result.5 0.5216262 -0.65080409 0.2072831 -3.1396877 0.0008456401 -3.1951239
#> result.6 0.7821452 -0.24571494 0.2010825 -1.2219610 0.1108611870 -1.2250305
#> LR_P Events_Active Events_Control RMST_Restrict RMST_Active
#> result.1 0.4480011946 49 51 10 8.894469
#> result.2 0.2373184580 48 52 10 9.115089
#> result.3 0.0302199738 43 57 10 9.467461
#> result.4 0.0901080329 44 56 10 9.024831
#> result.5 0.0006988541 37 63 10 9.478867
#> result.6 0.1102818733 45 55 10 9.157045
#> RMST_A_SE RMST_Control RMST_C_SE RMST_Delta RMST_D_SE RMST_Z
#> result.1 0.1908676 9.200848 0.1544769 -0.30637988 0.2455475 -1.2477419
#> result.2 0.1659930 9.048694 0.1737710 0.06639462 0.2403124 0.2762846
#> result.3 0.1291302 9.223193 0.1568467 0.24426754 0.2031637 1.2023187
#> result.4 0.1765084 8.969905 0.1658831 0.05492567 0.2422239 0.2267558
#> result.5 0.1213474 8.969737 0.1815976 0.50912949 0.2184099 2.3310738
#> result.6 0.1733281 8.924620 0.1764223 0.23242511 0.2473205 0.9397728
#> RMST_P LM_Time LM_Active LM_A_SE LM_Control LM_C_SE
#> result.1 0.893937212 10 0.8288031 0.02673240 0.8385181 0.02615064
#> result.2 0.391164727 10 0.8386908 0.02612384 0.8331164 0.02652844
#> result.3 0.114620035 10 0.8895153 0.02221686 0.8593153 0.02465128
#> result.4 0.410306813 10 0.8196045 0.02722308 0.7917703 0.02896078
#> result.5 0.009874735 10 0.8945876 0.02175786 0.8124251 0.02780837
#> result.6 0.173667053 10 0.8650000 0.02416351 0.8098204 0.02776559
#> LM_Delta LM_D_SE LM_Z LM_P
#> result.1 -0.009714975 0.03739622 -0.2597850 0.602485182
#> result.2 0.005574430 0.03723188 0.1497220 0.440491988
#> result.3 0.030200029 0.03318546 0.9100380 0.181401230
#> result.4 0.027834189 0.03974698 0.7002843 0.241874875
#> result.5 0.082162579 0.03530878 2.3269728 0.009983358
#> result.6 0.055179641 0.03680765 1.4991353 0.066919273
```

summarise_analysis is a function to summarise the results from multiple simulation iterations. It reads in the output from analyse_sim and automatically detects which analyses have been performed, adjusting its output as necessary. Apart from the name of the input analysis data, the only parameter is ‘alpha1’, which corresponds to the one-sided alpha level for power calculations. By default this is 0.025 (i.e. 2.5% 1-sided alpha)

An example call is:

```
# Summarise analysis from analysis2 output
summarise_analysis(analysis2)
#> Simulations HR LogHR LogHR_SE HR_Z HR_P HR_Power HR_Failed LR_Z
#> 1 100 0.7011 -0.3551 0.20356 -1.7302 0.0418 0.41 0 -1.7488
#> LR_P LR_Power LR_Failed Events_Active Events_Control Events_Total
#> 1 0.0402 0.41 0 42.3 57.7 100
#> RMST_Restrict RMST_Active RMST_A_SE RMST_Control RMST_C_SE RMST_Delta
#> 1 10 9.1772 0.1624 9.0525 0.1664 0.1247
#> RMST_D_SE RMST_Power RMST_Failed LM_Time LM_Active LM_A_SE LM_Control LM_C_SE
#> 1 0.2325 0.04 0 10 0.8554 0.0249 0.8167 0.0275
#> LM_Delta LM_D_SE LM_Power LM_Failed
#> 1 0.0387 0.0371 0.17 0
```

The output table contains a summary across all iterations of the simulation, with overall values and power. For Cox regression, the HR and logHR and SE of the logHR are produced, along with the average Z score, p-value and power. For the log-rank test, the average Z score, p value and power are returned. Failure rates (probability of analysis failure) for each are also given. If RMST and/or landmark analysis were previously requested, average values and SEs are given for each arm and the difference between arms. The powers and probabilities of analysis failure are also given.

Simulation may be used to validate analytic results; A simple workflow to compare outputs from nph_traj and summarise_analysis might look like the following:

```
# Run analytic planning and select a suitable time
outputX <- nph_traj(active_ecurve=activeCurveNPH, control_ecurve=controlCurveNPH, rcurve=rcurve3, max_assessment=50, RMST=40, landmark=40)
outputX$Summary[41:50,]
#> Time Patients Events_Active Events_Control Events_Total HR LogHR
#> 41 41 400 70.009 100.444 170.453 0.6366 -0.4517
#> 42 42 400 71.284 102.416 173.700 0.6336 -0.4564
#> 43 43 400 72.539 104.348 176.887 0.6307 -0.4610
#> 44 44 400 73.776 106.242 180.018 0.6279 -0.4654
#> 45 45 400 74.994 108.099 183.093 0.6252 -0.4697
#> 46 46 400 76.194 109.919 186.113 0.6226 -0.4739
#> 47 47 400 77.377 111.702 189.079 0.6201 -0.4779
#> 48 48 400 78.543 113.451 191.994 0.6176 -0.4818
#> 49 49 400 79.692 115.164 194.856 0.6153 -0.4856
#> 50 50 400 80.825 116.844 197.669 0.6130 -0.4893
#> LogHR_SE Schoenfeld_Power Frontier_Power RMST_Restrict RMST_Active
#> 41 0.15424 0.8386 0.8333 40 30.9011
#> 42 0.15277 0.8526 0.8477 40 30.9011
#> 43 0.15137 0.8655 0.8610 40 30.9011
#> 44 0.15003 0.8774 0.8733 40 30.9011
#> 45 0.14874 0.8884 0.8845 40 30.9011
#> 46 0.14751 0.8984 0.8948 40 30.9011
#> 47 0.14632 0.9076 0.9043 40 30.9011
#> 48 0.14518 0.9159 0.9129 40 30.9011
#> 49 0.14408 0.9236 0.9208 40 30.9011
#> 50 0.14303 0.9306 0.9280 40 30.9011
#> RMST_Control RMST_Delta RMST_SE RMST_Z RMST_Power RMST_Failure LM_Time
#> 41 27.5336 3.3675 1.3992 2.4067 0.6725 5e-04 40
#> 42 27.5336 3.3675 1.3959 2.4124 0.6745 0e+00 40
#> 43 27.5336 3.3675 1.3933 2.4170 0.6762 0e+00 40
#> 44 27.5336 3.3675 1.3912 2.4206 0.6775 0e+00 40
#> 45 27.5336 3.3675 1.3896 2.4234 0.6785 0e+00 40
#> 46 27.5336 3.3675 1.3884 2.4255 0.6792 0e+00 40
#> 47 27.5336 3.3675 1.3875 2.4270 0.6798 0e+00 40
#> 48 27.5336 3.3675 1.3869 2.4280 0.6801 0e+00 40
#> 49 27.5336 3.3675 1.3866 2.4286 0.6803 0e+00 40
#> 50 27.5336 3.3675 1.3864 2.4289 0.6805 0e+00 40
#> LM_Active LM_Control LM_Delta LM_A_SE LM_C_SE LM_D_SE LM_Z LM_Power
#> 41 0.6185 0.4493 0.1692 0.0416 0.0452 0.0615 2.7516 0.7857
#> 42 0.6185 0.4493 0.1692 0.0390 0.0416 0.0570 2.9679 0.8432
#> 43 0.6185 0.4493 0.1692 0.0375 0.0396 0.0545 3.1047 0.8738
#> 44 0.6185 0.4493 0.1692 0.0365 0.0382 0.0528 3.2014 0.8928
#> 45 0.6185 0.4493 0.1692 0.0358 0.0373 0.0517 3.2731 0.9054
#> 46 0.6185 0.4493 0.1692 0.0353 0.0366 0.0509 3.3269 0.9142
#> 47 0.6185 0.4493 0.1692 0.0350 0.0361 0.0502 3.3671 0.9203
#> 48 0.6185 0.4493 0.1692 0.0347 0.0357 0.0498 3.3968 0.9246
#> 49 0.6185 0.4493 0.1692 0.0345 0.0355 0.0495 3.4177 0.9275
#> 50 0.6185 0.4493 0.1692 0.0344 0.0353 0.0493 3.4312 0.9294
```

We select 47 months as it gives 90% power for LR/Cox and specify 2000 simulations (this is due to technical limitations in vignette creation; typically 10,000+ are recommended):

```
# Run simulation, analysis, summary at chosen time
simulationX <- simulate_trials(active_ecurve=activeCurveNPH, control_ecurve=controlCurveNPH, rcurve=rcurve3, assess=47, seed=123456, iterations = 2000, output_type="list")
analysisX <- analyse_sim(simulationX,RMST=40,landmark=40)
summaryX <- summarise_analysis(analysisX)
```

We can now pair results up and directly compare them between methods:

```
# Run simulation, analysis, summary at chosen time
A <- outputX$Summary[47,]
B <- summaryX
Events_Active <- c(A$Events_Active,B$Events_Active)
Events_Control <- c(A$Events_Control,B$Events_Control)
Events_Total <- c(A$Events_Total,B$Events_Total)
HR <- c(A$HR,B$HR)
LogHR <- c(A$LogHR,B$LogHR)
LogHR_SE <- c(A$LogHR_SE,B$LogHR_SE)
LR_Power <- c(A$Schoenfeld_Power,B$Power_LR)
RMST <-c(A$RMST_Delta,B$RMST_Delta)
RMST_SE <-c(A$RMST_SE,B$RMST_D_SE)
RMST_Power <- c(A$RMST_Power,B$RMST_Power)
Landmark <- c(A$LM_Delta,B$LM_Delta)
Landmark_SE <- c(A$LM_D_SE,B$LM_D_SE)
Landmark_Power <- c(A$LM_Power,B$LM_Power)
collated <- rbind(Events_Active,Events_Control,Events_Total,HR,LogHR,LogHR_SE,LR_Power,RMST, RMST_SE, RMST_Power, Landmark,Landmark_SE,Landmark_Power)
colnames(collated) <- c("Analytic","Simulated")
# Display comparison table
collated
#> Analytic Simulated
#> Events_Active 77.37700 77.54500
#> Events_Control 111.70200 111.66300
#> Events_Total 189.07900 189.20800
#> HR 0.62010 0.62060
#> LogHR -0.47790 -0.47710
#> LogHR_SE 0.14632 0.14854
#> LR_Power 0.90760 0.90760
#> RMST 3.36750 3.31760
#> RMST_SE 1.38750 1.38310
#> RMST_Power 0.67980 0.66900
#> Landmark 0.16920 0.16840
#> Landmark_SE 0.05020 0.05010
#> Landmark_Power 0.92030 0.91900
```

As expected, all properties align closely. For testing whether the two values differ by more than would be expected by random, Monte Carlo errors should be calculated for each simulated quantity; in general this is also good practice for analysis and reporting of simulation results.

1 Bell J (2019) Power Calculations for Time-to-Event Trials Using Predicted Event Proportions. Manuscript submitted for publication.

2 Pike MC (1972). Contribution to the Discussion on the Paper “Asymptotically efficient rank invariant test procedures” by R. Peto and J. Peto. Journal of the Royal Statistical Society Series A; 135(2): 201-203.

3 Schoenfeld D (1981) The asymptotic properties of nonparametric tests for comparing survival distributions. Biometrika; 68: 316-319.

4 Schoenfeld DA (1983) Sample-size formula for the proportional-hazards regression model. Biometrics; 39(2): 499-503.

5 Uno H, Tian L, Cronin A, Battioui C, Horiguchi M. (2017) survRM2: Comparing Restricted Mean Survival Time. R package version 1.0-2. https://CRAN.R-project.org/package=survRM2

6 Uno H, Claggett B, Tian L, Inoue E, Gallo P, Miyata T, Schrag D, Takeuchi M, Uyama Y, Zhao L, Skali H, Solomon S, Jacobus S, Hughes M, Packer M, Wei LJ. (2014) Moving beyond the hazard ratio in quantifying the between-group difference in survival analysis. Journal of clinical Oncology; 32: 2380-2385.

7 Tian L, Zhao L, Wei LJ. (2014) Predicting the restricted mean event time with the subject’s baseline covariates in survival analysis. Biostatistics; 15: 222-233.