Here a Bayesian design of the ViP trial with the inclusion of informative priors on the baseline hazard function is considered.
The target population for the trial is the same as for the three trials that are illustrated in Figure 6.1, with the GemCap trial in particular being administered in the same trials unit. ViP, in a similar fashion to the previous trials, also includes a control arm which is Gemcitabine alone. Furthermore, the data shown in Figure 6.1 were also used in the trial design, informing the sample size calculation.
To illustrate the effect of the informative baseline hazards, two approaches are taken. Firstly, a fully Bayesian sample size technique is followed, based on the sampling methodology of Wang and Gelfand [178] and De Santis [211]. Secondly, an approach is taken whereby the main efficacy parameter of interest is assumed fixed at pre-determined values. The purpose of this second approach is to obtain quantities similar to the frequentist Type I and Type II error rates for comparison with the initial trial design.
7.6.1 Bayesian sample size for ViP
The Average Length Criterion (ALC) is chosen as the utility function on which to base sample size calculations, and it is defined a priori that a posterior interval of length 0.6 with a coverage of 90% is of interest. Prior point estimates are obtained from the estimates given in Table 7.1 along with the survival estimates obtained from the GemCap trial.
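The criterion described above can be written compactly as follows. The notation is introduced here only for illustration and is not taken from the original text: x_n denotes the data from a trial of total size n, m(x_n) the marginal distribution of the data under the design priors, and L_{0.9}(x_n) the length of the 90% posterior credible interval for the parameter of interest.

```latex
% Illustrative notation (assumed, not from the source):
%   x_n        data from a trial of total sample size n
%   m(x_n)     marginal distribution of the data under the design priors
%   L_{0.9}    length of the 90% posterior credible interval
\[
  \mathrm{ALC}(n) \;=\; \int L_{0.9}(x_n)\, m(x_n)\, \mathrm{d}x_n ,
  \qquad
  n^{*} \;=\; \min\bigl\{\, n : \mathrm{ALC}(n) \le 0.6 \,\bigr\}.
\]
```

In practice the integral is approximated by averaging the interval length over datasets simulated from m(x_n).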
Data are sampled using the marginal distribution of the Bayesian PEM as shown in Section 6.2. Prior variability is defined using the effective number of events approach, with the effective number of prior events set to 10, 20, 30 and 50. Design priors are set from Normal distributions, using the most informative log baseline hazard priors (50 effective prior events) along with a prior distribution for the log hazard ratio of N(log(0.6), 0.5).
This is chosen to replicate the initial design parameters of the ViP trial. For each sampled dataset, administrative censoring is applied to any survival time greater than 24 months. Patterns of censoring are obtained using the same methods as Section 3.4 and Section 6.4.
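The data-generation step can be sketched as below. This is a minimal illustration rather than the procedure used for the trial design: the cut points, baseline hazards, alternating 1:1 allocation and the function names are all hypothetical, and only the administrative censoring at 24 months is taken from the text. The censoring patterns of Sections 3.4 and 6.4 are not reproduced.

```python
import math
import random


def sample_pem_time(cuts, hazards, rng):
    """Draw one survival time from a piecewise exponential model (PEM).

    cuts    -- interval start times, increasing, beginning at 0.0
    hazards -- one positive hazard per interval; the last interval is open-ended
    """
    remaining = rng.expovariate(1.0)  # cumulative hazard at which the event occurs
    for j, hazard in enumerate(hazards):
        start = cuts[j]
        end = cuts[j + 1] if j + 1 < len(cuts) else math.inf
        available = hazard * (end - start)  # cumulative hazard spanned by interval j
        if remaining <= available:
            return start + remaining / hazard
        remaining -= available
    raise ValueError("hazards must be positive and match cuts in length")


def simulate_trial(n_total, cuts, base_hazards, log_hr, censor_at=24.0, seed=None):
    """Simulate (time, event, arm) records under proportional hazards."""
    rng = random.Random(seed)
    hr = math.exp(log_hr)
    records = []
    for i in range(n_total):
        arm = i % 2  # assumption: alternating 1:1 allocation, 0 = control
        hazards = [h * hr for h in base_hazards] if arm else list(base_hazards)
        t = sample_pem_time(cuts, hazards, rng)
        event = t <= censor_at  # times beyond 24 months are administratively censored
        records.append((min(t, censor_at), event, arm))
    return records


# Hypothetical inputs, chosen only to make the sketch runnable.
CUTS = [0.0, 6.0, 12.0]            # months
BASE_HAZARDS = [0.08, 0.10, 0.12]  # control-arm hazard per month in each interval
trial = simulate_trial(120, CUTS, BASE_HAZARDS, math.log(0.6), seed=1)
```

Each record holds the observed time, an event indicator and the arm, which is the form needed to compute events and exposure per interval when fitting a PEM.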
Data are simulated for total sample sizes of 60 to 150 in increments of ten. The resulting ALCs from each set of simulations, for each model, are shown in Figure 7.8.
These show the resulting ALC estimates obtained from varying sample sizes for uninformative priors and for informative priors based on the four effective event scenarios described above, and illustrate how the behaviour of the ALC alters depending on the prior distributions set. Including prior information improves the behaviour in all cases, although the improvement is negligible for the less informative of the locally flat priors. Normal priors consistently outperform the locally flat priors, passing the 0.6 threshold at smaller sample sizes.
[Figure: three panels titled "ALC; Normal Priors", "ALC; Step Priors" and "ALC; Trapezium Priors". Each panel plots interval length (y-axis, 0.4 to 0.8) against total sample size N (x-axis, 60 to 140).]
Figure 7.8: Performance of the ALC for Normal, Step and Trapezium prior distributions.
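The way an ALC value is estimated for one sample size and one level of prior information can be sketched as follows. This is a deliberately simplified stand-in and not the model used above: a single constant hazard per arm replaces the PEM, and a conjugate gamma prior whose shape equals the effective number of prior events replaces the Normal, Step and Trapezium priors. The prior hazard of 0.10 per month, the weak shape of 0.5 and the reading of 0.5 as a standard deviation in the design prior are all assumptions made for illustration.

```python
import math
import random

CENSOR_AT = 24.0     # administrative censoring time in months (from the text)
PRIOR_HAZARD = 0.10  # hypothetical prior point estimate of the control hazard
WEAK = 0.5           # hypothetical weak gamma shape for the analysis priors


def simulate_arm(n, hazard, rng):
    """Return (events, total exposure) for n censored exponential times."""
    events, exposure = 0, 0.0
    for _ in range(n):
        t = rng.expovariate(hazard)
        events += t <= CENSOR_AT
        exposure += min(t, CENSOR_AT)
    return events, exposure


def interval_length(draws, coverage=0.90):
    """Length of the equal-tailed credible interval estimated from draws."""
    s = sorted(draws)
    lo = s[int((1 - coverage) / 2 * (len(s) - 1))]
    hi = s[int((1 + coverage) / 2 * (len(s) - 1))]
    return hi - lo


def alc(n_total, prior_events, n_sims=200, n_draws=1000, seed=0):
    """Average 90% posterior interval length for the log hazard ratio."""
    rng = random.Random(seed)
    n_arm = n_total // 2
    total = 0.0
    for _ in range(n_sims):
        # Design priors: log HR ~ Normal(log 0.6, 0.5), with 0.5 read as a
        # standard deviation (assumption); control hazard from a 50-event prior.
        log_hr = rng.gauss(math.log(0.6), 0.5)
        h0 = rng.gammavariate(50, PRIOR_HAZARD / 50)
        d0, t0 = simulate_arm(n_arm, h0, rng)
        d1, t1 = simulate_arm(n_arm, h0 * math.exp(log_hr), rng)
        # Conjugate gamma posteriors; prior events enter the control arm only.
        a0, b0 = WEAK + prior_events + d0, prior_events / PRIOR_HAZARD + t0
        a1, b1 = WEAK + d1, t1
        draws = [
            math.log(rng.gammavariate(a1, 1 / b1))
            - math.log(rng.gammavariate(a0, 1 / b0))
            for _ in range(n_draws)
        ]
        total += interval_length(draws)
    return total / n_sims
```

Evaluating `alc(n, e)` for n from 60 to 150 in steps of ten and for each number of effective prior events e traces out curves of the kind plotted in Figure 7.8, although the numerical values from this simplified model should not be expected to match the figure.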
The results are given in Table 7.4. For the reference model, where the priors remain uninformative, 98 patients are required on average to ensure that an interval of length 0.6 will contain 90% of the posterior distribution. This is smaller than the 120 patients required for the frequentist design. The difference is in part due to the differing aims of the two methodologies: the Bayesian approach attempts only to control one aspect of the posterior distribution, whereas the frequentist approach attempts to control two types of error. Some disparity is also expected from the parameters chosen as the basis of the Bayesian design.
Sample size estimates show that as more information enters the model through the priors, smaller numbers of patients are required to control the width of the posterior distribution. In the most extreme case, 74 patients are required to obtain an ALC of 0.6. Again, this effect is accentuated for normal priors compared to the locally flat alternatives. Considering the locally flat priors, the Step prior has a larger effect than the Trapezium prior.
             Effective number of prior events
Prior        10     20     30     50
Normal       92     86     78     74
Step         94     88     79     74
Trapezium    97     92     84     76
Table 7.4: Sample size estimates for the ViP trial under various differing priors on the baseline hazard function under the ALC.
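The simulations are run on a grid of sample sizes in steps of ten, whereas the estimates in Table 7.4 fall between grid points. The text does not state how the grid was converted to a single sample size; one plausible approach, shown here purely as an assumption, is linear interpolation between the two grid points that bracket the 0.6 threshold. The function name and the example values are hypothetical.

```python
import math


def smallest_n(sizes, alcs, target=0.6):
    """Smallest sample size at which the interpolated ALC curve reaches target.

    sizes -- increasing grid of total sample sizes
    alcs  -- estimated ALC at each grid point
    Returns None if the target is not reached anywhere on the grid.
    """
    for i, (n, a) in enumerate(zip(sizes, alcs)):
        if a <= target:
            if i == 0:
                return n  # threshold already met at the smallest grid point
            n_prev, a_prev = sizes[i - 1], alcs[i - 1]
            # Linear interpolation between the bracketing grid points.
            frac = (a_prev - target) / (a_prev - a)
            return math.ceil(n_prev + frac * (n - n_prev))
    return None


# Hypothetical ALC estimates, not values from the thesis.
example = smallest_n([80, 90, 100, 110], [0.70, 0.64, 0.57, 0.52])
```

Rounding up ensures the reported sample size does not fall short of the threshold on the interpolated curve.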
Both Figure 7.8 and Table 7.4 show that, in terms of the ALC, smaller sample sizes are obtained for the Normal priors than for the locally flat priors. The locally flat priors may still be preferred in practice, as they may be easier to derive and can inform a trial design and analysis without setting a point estimate for the most likely solution a priori.
7.6.2 Bayesian type I and type II error rates
To evaluate Bayesian Type I and Type II error rates, a Successful Trial Criterion (STC) is utilised. In the context of the ViP trial, according to the initial design parameters, the trial is a success only if at least 90% of the posterior distribution of the efficacy parameter is less than zero. Two special cases of the STC are considered, in which the design prior for the efficacy parameter is fixed at 0 and at a pre-determined non-zero value respectively.
Specifically, fixing the efficacy parameter at 0 and calculating the STC gives the Type I error rate, and fixing it at the pre-determined non-zero value equivalently gives the Type II error rate. Whilst sampling from a distribution in which the key parameter of interest is considered fixed is inappropriate from a Bayesian perspective, this method allows estimation of quantities analogous to the frequentist Type I and Type II error rates.
To ensure that reliable estimates of Type I and Type II errors are obtained, 2000 datasets are simulated following the same procedure as in Section 5.1, but with a fixed sample size of 120 patients to replicate the initial ViP design. The aim here is to show that the ‘error rates’ can be improved over the proposed design, as opposed to searching for a sample size based on controlling Type I or Type II error rates.
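The calculation described above can be sketched as follows, again under simplifying assumptions that are not those of the thesis: a constant hazard per arm with conjugate gamma priors stands in for the PEM and its priors, the control hazard is fixed at a hypothetical 0.10 per month, and the alternative is taken to be log(0.6), the mean of the design prior quoted earlier. The success rule (at least 90% posterior probability that the log hazard ratio is below zero), the 120 patients and the 2000 datasets are taken from the text.

```python
import math
import random

CENSOR_AT = 24.0     # administrative censoring time in months (from the text)
PRIOR_HAZARD = 0.10  # hypothetical control hazard per month
WEAK = 0.5           # hypothetical weak gamma shape for the analysis priors


def simulate_arm(n, hazard, rng):
    """Return (events, total exposure) for n censored exponential times."""
    events, exposure = 0, 0.0
    for _ in range(n):
        t = rng.expovariate(hazard)
        events += t <= CENSOR_AT
        exposure += min(t, CENSOR_AT)
    return events, exposure


def prob_benefit(d0, t0, d1, t1, prior_events, rng, n_draws):
    """Monte Carlo estimate of P(log hazard ratio < 0 | data)."""
    a0, b0 = WEAK + prior_events + d0, prior_events / PRIOR_HAZARD + t0
    a1, b1 = WEAK + d1, t1
    below = sum(
        rng.gammavariate(a1, 1 / b1) < rng.gammavariate(a0, 1 / b0)
        for _ in range(n_draws)
    )
    return below / n_draws


def success_rate(log_hr, prior_events=0, n_total=120, n_sims=2000,
                 n_draws=1000, seed=0):
    """Proportion of simulated trials meeting the success criterion."""
    rng = random.Random(seed)
    n_arm = n_total // 2
    successes = 0
    for _ in range(n_sims):
        d0, t0 = simulate_arm(n_arm, PRIOR_HAZARD, rng)
        d1, t1 = simulate_arm(n_arm, PRIOR_HAZARD * math.exp(log_hr), rng)
        p = prob_benefit(d0, t0, d1, t1, prior_events, rng, n_draws)
        successes += p >= 0.90
    return successes / n_sims


def error_rates(alternative_log_hr, **kwargs):
    """Type I: success rate at log HR = 0. Type II: failure rate at the alternative."""
    type_1 = success_rate(0.0, **kwargs)
    type_2 = 1.0 - success_rate(alternative_log_hr, **kwargs)
    return type_1, type_2
```

Calling `error_rates(math.log(0.6), prior_events=e)` for each number of effective prior events e gives a pair of rates analogous to one column of Table 7.5, although this simplified model should not be expected to reproduce the tabulated values.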
The results are given in Table 7.5 and show that for the reference model, design parameters similar to those for the ViP trial are obtained. As with sample size calculations based on the ALC, as more information enters into the design through informative priors, the Type I and Type II error rates improve. Considering prior distributions based on normal distributions, for the most informative priors Type I and Type II error rates of 0.07 and 0.08 respectively are obtained. Again, the effect is lessened for locally flat priors, with only the most informative priors having any noticeable effect on the Type II error rates.
It is also of interest to note that there is a plateau in the effect of increasingly informative priors. As more information enters the prior distributions through effective events, the further information obtained from events in the control arm during the course of a trial is reduced. Design parameters then become more dependent on what is observed in the experimental arm, as the data from the control arm contribute less towards the estimate of the log hazard ratio.
There is little effect on the Type I error rates, showing that when data are simulated with the efficacy parameter fixed at zero, the rate of falsely concluding that a new therapy is superior does not change.
                        Effective Prior Events
Priors       Error      10      20      30      50
Reference    Type I     0.12    0.11    0.12    0.11
             Type II    0.10    0.10    0.11    0.12
Normal       Type I     0.11    0.09    0.10    0.09
             Type II    0.08    0.07    0.07    0.07
Step         Type I     0.11    0.10    0.10    0.10
             Type II    0.09    0.07    0.07    0.08
Trapezium    Type I     0.11    0.10    0.10    0.10
             Type II    0.10    0.08    0.08    0.08
Table 7.5: The effect of different priors and effective prior events on Bayesian Type I and Type II error rates