A novel adaptive-block method of randomization to maximize the efficiency of overall treatment group balance, while maintaining balance at investigational centers in smaller sized studies, is proposed.
In large multi-center clinical trials, stratifying the randomization by center leads to overall treatment group balance with high efficiency. However, rare diseases can have a prevalence lower than one in 2,000 making large trials impossible to enroll. There are 6,000 to 8,000 rare diseases that affect 30 million people in the EU.11 Studies in rare diseases require a smaller sample size and many investigational sites to recruit enough patients to meet recruitment goals. Thus, studies will be necessarily small, perhaps in the range of 50 to 100 patients. Stratifying the randomization by site in small to medium studies significantly increases the probability of treatment group imbalances. In this article, a novel adaptive-block method of randomization to maximize the efficiency of overall treatment group balance while maintaining balance at investigational centers in small- to medium-sized studies is proposed.
The most common method for proactively balancing the treatment groups in a clinical trial among the known and unknown prognostic factors of the primary outcome is to stratify the randomization.10 In Hamilton’s experience with the randomization of approximately 2,500 clinical trials, stratification was used almost 100% of the time. Furthermore, the method most often used to control the balance among the unknown prognostic factors was to stratify by site (also known as investigator or center). This practice is supported by the regulatory guidance document ICH E9, which reads, “It is advisable to have a separate random scheme for each center, i.e. to stratify by center or to allocate several whole blocks to each center.”6 European regulatory guidance documents mention that “most multicenter trials are stratified by center (or investigator) for practical reasons or because center (or investigator) is expected to be confounded with other known or unknown prognostic factors.”5 The author has also been involved in many studies where the FDA explicitly requested stratification by investigator to balance the known and unknown prognostic factors among the treatment groups. For these reasons, the lion’s share of clinical studies are stratified by site/investigator. Indeed, a survey of medical studies in leading journals found that over 70% were stratified by site/investigator.1
Since most randomized studies utilize list-based randomization, permuted block randomization is by far the most common method to implement site stratification. With permuted block randomization, stratifying by investigator works well in large multicenter trials, i.e. balance among treatment groups is achieved within site and in the overall total. However, it can lead to over-stratification in small- to medium-sized studies. That is, the chance of overall study imbalance can be greater with the stratification than it would be with no stratification. Therneau showed that when the number of distinct factor level combinations exceeds n/k, where k indicates the number of treatment groups, the probability of overall imbalance is greater than that with no stratification. A real-life example is a study where two treatments are compared across 80 subjects enrolled at 20 centers. If permuted block randomization with a block size of four and stratified by site is used, our simulations show that there is a 0.46 probability of a significant imbalance exceeding 10%. That is essentially a coin flip on whether the overall treatment group imbalance will be worse than 36 in one group and 44 in the other. Thus, stratifying by site in small- to medium-sized studies can cause the overall treatment group totals to end up out of balance, especially in rare diseases when many sites are needed to meet recruitment goals.
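The kind of simulation described in this example can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulation code: the function names are hypothetical, subjects are assumed to arrive at sites uniformly at random, and the count difference that corresponds to a "10%" imbalance is left as a parameter (`min_diff`) because it depends on how the imbalance measure is defined. The estimate it prints may therefore differ from the 0.46 reported above.

```python
import random

def permuted_block(block_size=4, arms=("A", "B")):
    """Return one shuffled block holding an equal number of each arm."""
    block = list(arms) * (block_size // len(arms))
    random.shuffle(block)
    return block

def simulate_trial(n_subjects=80, n_sites=20, block_size=4):
    """Randomize one trial with site-stratified permuted blocks.

    Returns the overall number of subjects assigned to each arm.
    """
    totals = {"A": 0, "B": 0}
    # Unused slots of the block currently open at each site.
    pending = {site: [] for site in range(n_sites)}
    for _ in range(n_subjects):
        # Assumption: each subject enrolls at a uniformly random site.
        site = random.randrange(n_sites)
        if not pending[site]:
            pending[site] = permuted_block(block_size)
        totals[pending[site].pop()] += 1
    return totals

def prob_imbalance(n_trials=10_000, min_diff=8, **trial_kwargs):
    """Estimate the probability that |A - B| is at least min_diff."""
    hits = 0
    for _ in range(n_trials):
        totals = simulate_trial(**trial_kwargs)
        if abs(totals["A"] - totals["B"]) >= min_diff:
            hits += 1
    return hits / n_trials

if __name__ == "__main__":
    random.seed(2024)
    # min_diff=8 corresponds to 36 vs. 44 or worse; min_diff=4 to 38 vs. 42 or worse.
    for d in (4, 8):
        print(d, prob_imbalance(min_diff=d))
```

Because only complete blocks are guaranteed to be balanced, every site that stops partway through a block can contribute a small imbalance, and with 20 sites averaging four subjects each these contributions accumulate in the overall total.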
Given the need to meet recruitment timelines and the large number of studies in rare diseases, multi-center trials will be utilized for small to medium sized studies, and statisticians will design randomization plans with site stratification.10 Therefore, we need a stratification strategy that will accommodate a large number of sites without overly elevating the risk of overall treatment group imbalance. Certainly, dynamic randomization is a recommended method to effectively balance treatment groups among many prognostic factors and levels.11 However, experience indicates most sponsors are reluctant to utilize dynamic randomization for fear of complexity and perceived resistance from regulatory authorities. Other IRT providers have encountered similar reluctance.4 Given its wide acceptance, it would be optimal to adapt permuted block randomization with a constraint that would accommodate a large number of prognostic factors/levels and maintain the overall balance.
The ideas presented grow out of a collaboration between Hamilton and Alster. We worked on the randomization plan for a study in which 90 subjects were to enroll at 18 sites and be randomized to one of two treatments. The risk of overall treatment group imbalance was fairly high, especially at the interim analysis scheduled for enrollment of 60 subjects. The method described herein is a proposed solution. The research in this paper is intended to help statisticians and programmers who design randomizations and are faced with a highly stratified randomization. To that end, we performed an experiment to provide results that show the properties of this adaptive-block method compared to standard permuted block randomization. Hamilton designed the experiment that is presented in the next section, and Alster programmed the simulations in Python version 3.7.
The essence of the proposed method is that the blocks are not constructed in advance. They are built as the randomization progresses, with the assignment of each subject to treatment. Let’s look again at our example above, where there are two treatments (A, B) randomized in a 2:2 ratio. There is no pre-defined randomization list. Therefore, a centralized IRT, such as a web-based randomization system, would be employed. When study site personnel access the IRT to randomize a subject, the system recognizes the site and registers the overall treatment group balance (e.g. Treatment group A total vs. Treatment group B total). If there are more subjects in one group, the system looks at the current block at the randomizing site to see whether the other group can be assigned within the constraints of the block position. For example, if there are more subjects in group A than B in the overall total and the subject will be randomized in the first position of the current block, the system would randomize the subject to group B. However, in our example the constraint of the block position is that standard blocks of size four with a 2:2 ratio have to be created. Thus, if the block position was 3 or 4 and two of the previous slots in the block had been filled by group B, the system would randomize the subject to A to maintain the correct blocked structure of the developing randomization list.
Algorithmically, the blocks are constructed in real-time as the subjects randomize in the following manner. Note that each block referenced in the algorithm is within a stratum (site).
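The assignment logic described in the preceding paragraphs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' specification: the class name `AdaptiveBlockRandomizer`, its parameters, and its attributes are hypothetical, and the rule used when the eligible arms have equal overall totals (a random draw weighted by the slots each arm has left in the block, as in an ordinary permuted block) is an assumption, since the text above does not spell out that case.

```python
import random

class AdaptiveBlockRandomizer:
    """Builds each site's blocks one assignment at a time (illustrative sketch)."""

    def __init__(self, arms=("A", "B"), per_block=2, rng=None):
        self.arms = tuple(arms)
        self.per_block = per_block            # slots per arm in each block
        self.rng = rng or random.Random()
        self.totals = {arm: 0 for arm in self.arms}  # overall study totals
        self.blocks = {}                      # site -> assignments in the open block

    def assign(self, site):
        block = self.blocks.setdefault(site, [])
        if len(block) == self.per_block * len(self.arms):
            block.clear()                     # previous block is complete; start a new one

        # Block constraint: an arm is eligible only while it has slots left.
        eligible = [a for a in self.arms if block.count(a) < self.per_block]

        # Overall balance: prefer the eligible arm(s) with the smallest study total.
        fewest = min(self.totals[a] for a in eligible)
        candidates = [a for a in eligible if self.totals[a] == fewest]

        # Assumption: ties are broken at random, weighted by remaining slots.
        weights = [self.per_block - block.count(a) for a in candidates]
        arm = self.rng.choices(candidates, weights=weights)[0]

        block.append(arm)
        self.totals[arm] += 1
        return arm

if __name__ == "__main__":
    irt = AdaptiveBlockRandomizer(rng=random.Random(1))
    irt.totals = {"A": 5, "B": 3}             # pretend A is ahead overall
    # B is chosen twice (A is ahead, then tied is avoided), then A is forced twice.
    print([irt.assign("site-01") for _ in range(4)])
```

In the usage example, group A starts ahead in the overall total, so the first slot goes to B. Once both B slots of the block are filled, the block constraint forces A regardless of the overall totals, which mirrors the two cases described in the text.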
To assess the effectiveness of this adaptive-block randomization method, Lance and I executed a simulation study using the example of 80 subjects with a randomization plan that stratifies by 20 sites. Based on our experience with randomization plans from approximately 2,500 studies, I chose to investigate the treatment group/randomization ratios encountered in over 85% of the studies where Signant Health has managed the randomization. They are listed in Table 1.
Two important properties of randomization are predictability and efficiency. The term “predictability”, sometimes called selection bias, refers to the probability that study personnel could predict the next assigned treatment. For a more complete description of predictability, please see Shum, Hamilton, and Lo12 or Blackwell and Hodges.2 Since the proposed method is an enhancement to regular permuted block randomization, the predictability remains the same. Thus, we focused our experiment on efficiency of treatment group balance. I have already described the adaptive-block specifications for scenario one in the shaded box above. The specifications for scenarios two to five are detailed in the Appendix. To examine the efficiency of our proposed method, Alster executed 100,000 Monte Carlo samples, each randomized through two methods: 1) standard site-stratified permuted block randomization and 2) the proposed adaptive-block randomization. Each sample consisted of 78 or 80 subjects with random allocation to 20 sites. The measure of imbalance consisted of the largest treatment group ratio among all the treatment groups, normalized by the randomization ratio and converted to a percentage. With two treatment groups, the imbalance measure is simply the normalized ratio. With three treatment groups, we choose the largest of the three ratios. We present results for the overall balance and average balance within sites. The software used to program the simulations was Python version 3.7.
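The imbalance measure is described only in words, so the sketch below is one possible reading rather than the authors' formula. It assumes that each group's count is divided by its share of the randomization ratio, that the "ratio" is the largest such normalized count divided by the smallest (which equals the largest of the pairwise group ratios), and that the percentage is the amount by which this ratio exceeds 1. The function name `imbalance_pct` is hypothetical. Other readings, such as the between-group difference as a fraction of the total sample, would give different percentages for the same counts.

```python
def imbalance_pct(counts, ratio):
    """Largest pairwise group ratio, normalized by the randomization ratio.

    counts: observed number of subjects in each treatment group
    ratio:  planned randomization ratio, e.g. (1, 1) or (2, 1)
    Returns the percentage by which the largest normalized ratio exceeds 1.
    """
    if len(counts) != len(ratio):
        raise ValueError("counts and ratio must have the same length")
    # Put every group on a common scale before comparing them.
    normalized = [c / r for c, r in zip(counts, ratio)]
    if min(normalized) == 0:
        return float("inf")  # an empty group is treated as unbounded imbalance
    return (max(normalized) / min(normalized) - 1.0) * 100.0

if __name__ == "__main__":
    print(imbalance_pct([40, 40], [1, 1]))      # perfectly balanced 1:1 trial
    print(imbalance_pct([26, 26, 28], [1, 1, 1]))
    print(imbalance_pct([52, 28], [2, 1]))      # 2:1 design, slightly off target
```

Normalizing by the randomization ratio lets the same function handle unequal allocations: a 2:1 trial with 54 and 27 subjects scores zero, just as a 1:1 trial with 40 and 40 does.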
Recall that standard permuted block randomization stratified on site will directly balance treatment groups within sites, and that overall balance is expected as a result of the site balancing. That is, when treatment groups are balanced at all the sites there should be overall balance. However, as shown above, the overall balance can break down in small- to medium-sized studies. Thus, we expected the adaptive-block method to perform better at maintaining overall balance. The important question was whether the adaptive-block method would maintain balance within sites as well. The results in Table 2 report the percent of trials that had perfect overall balance, imbalance by one to three subjects, and imbalance of 10% or greater. Most statisticians would agree that a 10% or greater imbalance would be a failure of the randomization. We also report average imbalance within the sites. Table 2 displays these results for each of the five scenarios studied. The results show that the adaptive-block method is far more efficient and was always much better than standard permuted block randomization at maintaining overall balance. For instance, the adaptive-block method maintained perfect balance in 55% to 92% of the trials vs. 3% to 23% with standard permuted block randomization. Perhaps more importantly, the adaptive-block method essentially guaranteed no imbalances of 10% or greater (only the 1:1:1 scenario showed greater than 10% imbalance, 4% of the time), whereas standard permuted block randomization produced significant imbalances more than 50% of the time in four of the five scenarios studied (range of 8% to 64%). Reassuringly, the adaptive-block method maintained balance within sites as well, and sometimes more efficiently than standard permuted block randomization.
Additionally, we examined the impact of the adaptive-block method on the permutation distribution. As discussed in Hamilton and Shum, permutations that have strings of repeated treatments produce more treatment group imbalances. Thus, we would expect the adaptive-block method to create fewer of these permutations. Indeed, the results showed that in the adaptive-block randomizations the blocks with strings of repeated treatments were created approximately one-third less often than under the uniform distribution produced by standard permuted block randomization. We explored the impact on overall balance of using permuted block randomization while reducing the blocks with strings of repeated treatments by a third; however, the balancing efficiency did not increase. As expected, the position of the permutations in relation to the enrollment distribution was the key to the efficiency of the adaptive-block method.
In this paper, we propose a novel adaptive-block randomization method that greatly increases the efficiency of overall treatment group balance without sacrificing efficiency within site. Our method is most useful in small to medium sized studies and in rare diseases where many sites are needed to meet recruitment goals. Blazynski3 reported that approximately 1,350 industry sponsored phase II trials were completed in 2018. Since most Phase II studies enroll 100 or fewer subjects from several investigational sites, this method has the potential for high impact in phase II studies as well. This paper shows that for some very common designs standard permuted block randomization can result in 10% or greater imbalances more than 50% of the time. This is clearly unacceptable and must be remedied. Fortunately, we have shown that the fairly simple adaptive-block method will maintain excellent overall and within site balance in these small to medium size studies.
Stratifying by site is the most common cause for a large number of stratification factor levels. However, even with a reasonable number of sites when additional prognostic factors are added to the stratification, the number of levels gets large. Fortunately, the proposed method is agnostic to what the stratification factors represent. It will be highly efficient at balancing treatment groups with a large number of stratification levels regardless of what they are.
Scott A Hamilton, PhD, is the Principal Biostatistician for Signant Health. Lance Alster is the Manager of SAS programming for Fibrogen.
Adaptive Block Treatment Assignment Specifications
3 treatments (A, B, C), ratio 1:1:1, and a block size of 3.