`
Optimal Two-Stage Designs for Phase II Clinical Trials
`
`Richard Simon, PhD
`
`Biometric Research Branch, National Cancer Institute, Bethesda, Maryland
`
`
`ABSTRACT: The primary objective of a phase II clinical trial of a new drug or regimen is to
`determine whether it has sufficient biological activity against the disease under study
`to warrant more extensive development. Such trials are often conducted in a multi-
`institution setting where designs of more than two stages are difficult to manage. This
`paper presents two-stage designs that are optimal in the sense that the expected sample
`size is minimized if the regimen has low activity subject to constraints upon the size
of the type 1 and type 2 errors. Two-stage designs which minimize the maximum
sample size are also determined. Optimum and "minimax" designs for a range of
`design parameters are tabulated. These designs can also be used for pilot studies of
`new regimens where toxicity is the endpoint of interest.
`
`KEY WORDS: clinical trials, phase II trials, optimization
`
`INTRODUCTION
`
`A phase II study of a cancer treatment is an uncontrolled trial for obtaining
`an initial estimate of the degree of antitumor effect of the treatment. Phase I
`trials provide information about the maximum tolerated dose(s) of the treat-
`ment, which is important because most cancer treatments must be delivered
`at maximum dose for maximum effect. Phase I trials generally treat only three
`to six patients per dose level, however, and the patients are diverse with
`regard to their cancer diagnosis [1]. Consequently such trials provide little or
`no information about antitumor activity. The proportion of patients whose
`tumors shrink by at least 50% is the primary endpoint of most phase II trials
`although the durability of such responses is also of interest. Such trials are
not controlled and do not determine the "effectiveness" of the treatment or
the role of the drug in the treatment of the disease. The purpose of a phase
`II trial of a new anticancer drug is to determine whether the drug has sufficient
`activity against a specified type of tumor to warrant its further development.
`Further development may mean combining the drug with other drugs, eval-
`uation in patients with less advanced disease, or initiation of phase III studies
`in which survival results are compared to those for a standard treatment.
`Phase II trials of combination regimens are also conducted to determine whether
`
`Address reprint requests to: Richard Simon, PhD, Biometric Research Branch, National Cancer Institute,
`EPN 739, Bethesda, MD 20892
Received January 6, 1988; revised July 28, 1988.

Controlled Clinical Trials 10:1-10 (1989)
© Elsevier Science Publishing Co., Inc., 1989
655 Avenue of the Americas, New York, New York 10010

0197-2456/1989/$3.50
`
`Genentech 2108
`Hospira v. Genentech
`IPR2017-00737
`
`the treatment is sufficiently promising to warrant a major controlled clinical
`evaluation against the standard therapy.
The designs developed here are based on testing a null hypothesis H0: p ≤
p0 that the true response probability is less than some uninteresting level p0.
If the null hypothesis is true, then we require that the probability should be
less than α of concluding that the drug is sufficiently promising that it should
be accepted for further study in other clinical trials. We also require that if a
specified alternative hypothesis H1: p ≥ p1 that the true response probability
is at least some desirable target level p1 is true, then the probability of rejecting
the drug for further study should be less than β. In addition to these constraints,
we wish to minimize the number of patients treated with a drug of
`low activity. We shall restrict our attention to two-stage designs because of
`practical considerations in the management of multi-institution clinical trials.
`The main practical consideration is that evaluation of a patient’s response is
`not instantaneous and may require observation for weeks or months. Con-
`sequently patient accrual at the end of a stage may have to be suspended
`until it is determined whether the criteria for continuing are satisfied. Such
`suspension of accrual is awkward for physicians who are entering patients
`on the study. More than one such disruption during the trial would, in many
cases, not be acceptable. Although more stages are desirable from the standpoint
of efficiency, two-stage designs often achieve a substantial portion of
the savings of fully sequential designs [2-4].
`
`OPTIMAL TWO-STAGE DESIGNS
`
If the numbers of patients studied in the first and second stage are denoted
by n1 and n2 respectively, then the expected sample size is EN = n1 + (1 -
PET)n2, where PET represents the probability of early termination after the
first stage. The decision of whether or not to terminate after the first stage
will be based on the number of responses observed for those n1 patients. The
`expected sample size EN and the probability of early termination depend on
`the true probability of response p. We will terminate the experiment at the
`end of the first stage and reject the drug if r1 or fewer responses are observed.
This occurs with probability PET = B(r1; p, n1), where B denotes the cumulative
binomial distribution. We will reject the drug at the end of the second stage
if r or fewer responses are observed. Hence the probability of rejecting a drug
with success probability p is

\[
B(r_1; p, n_1) + \sum_{x = r_1 + 1}^{\min[n_1, r]} b(x; p, n_1)\, B(r - x; p, n_2), \tag{1}
\]

where b denotes the binomial probability mass function.
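The operating characteristics defined above can be evaluated directly. The following is a minimal sketch, not the author's APL program: the function names are illustrative assumptions, and only the Python standard library is used. It computes PET, the rejection probability of Equation (1), and the expected sample size EN.

```python
from math import comb


def binom_pmf(x, p, n):
    """b(x; p, n): probability of exactly x responses among n patients."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def binom_cdf(x, p, n):
    """B(x; p, n): probability of x or fewer responses among n patients."""
    return sum(binom_pmf(k, p, n) for k in range(0, min(x, n) + 1))


def prob_early_termination(p, r1, n1):
    """PET = B(r1; p, n1): stop after stage 1 with r1 or fewer responses."""
    return binom_cdf(r1, p, n1)


def prob_reject(p, r1, n1, r, n2):
    """Equation (1): probability of rejecting a drug with response probability p."""
    total = binom_cdf(r1, p, n1)
    # Continue to stage 2 with x > r1 responses, then finish with r or fewer in total.
    for x in range(r1 + 1, min(n1, r) + 1):
        total += binom_pmf(x, p, n1) * binom_cdf(r - x, p, n2)
    return total


def expected_sample_size(p, r1, n1, n2):
    """EN = n1 + (1 - PET) * n2."""
    return n1 + (1.0 - prob_early_termination(p, r1, n1)) * n2


if __name__ == "__main__":
    # Design with r1 = 0, n1 = 9, r = 2, n = 24 (so n2 = 15), p0 = 0.05, p1 = 0.25.
    print(prob_early_termination(0.05, 0, 9))
    print(expected_sample_size(0.05, 0, 9, 15))
    print(1.0 - prob_reject(0.05, 0, 9, 2, 15))  # type 1 error
    print(prob_reject(0.25, 0, 9, 2, 15))        # type 2 error
```

The type 1 error is one minus the rejection probability evaluated at p0, and the type 2 error is the rejection probability evaluated at p1.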
The design approach considered here is to specify the parameters p0, p1, α,
and β and then determine the two-stage design that satisfies the error probability
constraints and minimizes the expected sample size when the response
probability is p0. The optimization is taken over all values of n1 and n2 as well
as r1 and r. Early acceptance of the drug is not permitted here. The ethical
`imperative for early termination occurs when the drug has low activity. When
the drug has substantial activity (p ≥ p1) there is often interest in studying
`additional patients in order to estimate the proportion, extent, and durability
`of response. We have optimized the designs to minimize expected sample
`size when the response probability is p0. This is because of ethical consider-
`ations. It would be possible to reduce expected sample sizes for the designs
considered here by terminating early if the number of responses in the first
`stage exceeds the final decision criterion r. This would have a negligible effect
`on performance under H0. If early rejection of H0 is really of interest, however,
`a less conservative early rejection rule should be used.
For specified values of p0, p1, α, and β we have determined optimal designs
by enumeration using exact binomial probabilities. For each value of total
sample size n and each value of n1 in the range (1, n - 1) we determined the
integer values of r1 and r, which satisfied the two constraints and minimized
the expected sample size when p = p0. This was found by searching over the
range r1 ∈ (0, n1). For each value of r1 we determined the maximum value of
r that satisfied the type 2 error constraint. We then examined whether that set
of parameters (n, n1, r1, r) satisfied the type 1 error constraint. If it did, then
we compared the expected sample size to the minimum achieved by previous
feasible designs and continued the search over r1. Keeping n fixed we searched
over the range of n1 to find the optimal two-stage design for that maximum
sample size n. The search over n ranged from a lower value of about

\[
\bar{p}\,(1 - \bar{p}) \left[ \frac{z_{1-\alpha} + z_{1-\beta}}{p_1 - p_0} \right]^2,
\]

where p̄ = (p0 + p1)/2. We checked below this starting point to ensure that
`we had determined the smallest maximum sample size n for which there was
`a nontrivial (n1,n2 > 0) two-stage design that satisfied the error probability
`constraints. The enumeration procedure searched upwards from this mini-
`mum value of n until it was clear that the optimum had been determined.
The minimum expected sample size for fixed n is not a unimodal function of
`n because of discreteness of the underlying binomial distributions. Never-
`theless, eventually as n increased the value of the local minima increased and
`it was clear that a global minimum had been found. Calculations were carried
`out in APL on a Microvax II computer. The computer program is available
`on request.
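The enumeration described above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the original APL program: the function names are hypothetical, the search starts at n = 2 rather than at the approximate lower value, the upper limit `n_max` must be supplied by the user (the text stops when the optimum is clear), and the error constraints are applied as "less than or equal to".

```python
from math import comb


def pmf_table(p, n):
    """Return [b(0; p, n), ..., b(n; p, n)]."""
    return [comb(n, x) * p**x * (1.0 - p) ** (n - x) for x in range(n + 1)]


def cdf_table(pmf):
    """Running sums of a pmf table, i.e. [B(0), ..., B(n)]."""
    out, run = [], 0.0
    for v in pmf:
        run += v
        out.append(run)
    return out


def reject_prob(r1, r, pmf1, cdf1, cdf2):
    """Equation (1) from precomputed stage-1 pmf/cdf and stage-2 cdf tables."""
    n1, n2 = len(pmf1) - 1, len(cdf2) - 1
    total = cdf1[r1]
    for x in range(r1 + 1, min(n1, r) + 1):
        total += pmf1[x] * cdf2[min(r - x, n2)]
    return total


def best_design_for_n(n, p0, p1, alpha, beta):
    """Feasible design with minimum EN(p0) for a fixed maximum sample size n."""
    best = None
    for n1 in range(1, n):  # nontrivial designs only: n1 > 0 and n2 > 0
        n2 = n - n1
        pmf1_0, pmf1_1 = pmf_table(p0, n1), pmf_table(p1, n1)
        cdf1_0, cdf1_1 = cdf_table(pmf1_0), cdf_table(pmf1_1)
        cdf2_0 = cdf_table(pmf_table(p0, n2))
        cdf2_1 = cdf_table(pmf_table(p1, n2))
        for r1 in range(0, n1):
            if cdf1_1[r1] > beta:
                break  # type 2 error already too large; it only grows with r1
            # Largest r that still satisfies the type 2 error constraint.
            r = r1
            while r + 1 < n and reject_prob(r1, r + 1, pmf1_1, cdf1_1, cdf2_1) <= beta:
                r += 1
            # Check the type 1 error constraint for that (n, n1, r1, r).
            if 1.0 - reject_prob(r1, r, pmf1_0, cdf1_0, cdf2_0) > alpha:
                continue
            pet = cdf1_0[r1]
            en = n1 + (1.0 - pet) * n2
            if best is None or en < best["EN"]:
                best = {"r1": r1, "n1": n1, "r": r, "n": n, "EN": en, "PET": pet}
    return best


def two_stage_designs(p0, p1, alpha, beta, n_max):
    """Return (optimal, minimax) designs found for n = 2, ..., n_max."""
    optimal = minimax = None
    for n in range(2, n_max + 1):
        design = best_design_for_n(n, p0, p1, alpha, beta)
        if design is None:
            continue
        if minimax is None:
            minimax = design  # smallest n admitting a feasible design
        if optimal is None or design["EN"] < optimal["EN"]:
            optimal = design
    return optimal, minimax


if __name__ == "__main__":
    opt, mm = two_stage_designs(0.05, 0.25, 0.10, 0.10, n_max=35)
    print("optimal:", opt)
    print("minimax:", mm)
```

Because the expected sample size under p0 depends only on (n, n1, r1), taking the largest admissible r for each r1 gives the smallest type 1 error for that choice and so loses no feasible designs. The value of `n_max` should be large enough that the local minima of EN(p0) are clearly increasing.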
`Tables 1 and 2 show optimal designs for a variety of design parameters.
Table 1 applies to trials with p1 - p0 = 0.20 and Table 2 is for trials with p1
- p0 = 0.15. The optimal designs are shown on the left half of the tables.
For each (p0, p1), the three rows correspond to optimal designs for (α, β) =
(0.10, 0.10), (0.05, 0.20), and (0.05, 0.10), respectively. The tabulated results
include the optimal size of the first stage (n1), the maximum sample size (n),
the upper limits on observed response rate that result in rejection of the drug
at the end of the first stage (r1/n1) and at the end of the trial (r/n), the expected
sample size [EN(p0)], and probability of terminating the trial at the end of the
first stage [PET(p0)] if the response probability is p0. For example, the first
line in Table 1 corresponds to a design with p0 = 0.05 and p1 = 0.25. The
first stage consists of nine patients. If no responses are seen then the trial is
terminated. Otherwise accrual continues to a total of 24 patients. The average
sample size is 14.5 and the probability of early termination is 0.63 for a drug
with a response probability of 0.05. All calculations are based on exact binomial
probabilities.

Table 1 Designs for p1 - p0 = 0.20 (a)

                    Optimal Design                    Minimax Design
            Reject drug if                    Reject drug if
            response rate                     response rate
p0    p1    ≤r1/n1  ≤r/n    EN(p0)  PET(p0)   ≤r1/n1  ≤r/n    EN(p0)  PET(p0)
0.05  0.25  0/9     2/24    14.5    0.63      0/13    2/20    16.4    0.51
            0/9     2/17    12.0    0.63      0/12    2/16    13.8    0.54
            0/9     3/30    16.8    0.63      0/15    3/25    20.4    0.46
0.10  0.30  1/12    5/35    19.8    0.65      1/16    4/25    20.4    0.51
            1/10    5/29    15.0    0.74      1/15    5/25    19.5    0.55
            2/18    6/35    22.5    0.71      2/22    6/33    26.2    0.62
0.20  0.40  3/17    10/37   26.0    0.55      3/19    10/36   28.3    0.46
            3/13    12/43   20.6    0.75      4/18    10/33   22.3    0.50
            4/19    15/54   30.4    0.67      5/24    13/45   31.2    0.66
0.30  0.50  7/22    17/46   29.9    0.67      7/28    15/39   35.0    0.36
            5/15    18/46   23.6    0.72      6/19    16/39   25.7    0.48
            8/24    24/63   34.7    0.73      7/24    21/53   36.6    0.56
0.40  0.60  7/18    22/46   30.2    0.56      11/28   20/41   33.8    0.55
            7/16    23/46   24.5    0.72      17/34   20/39   34.4    0.91
            11/25   32/66   36.0    0.73      12/29   27/54   38.1    0.64
0.50  0.70  11/21   26/45   29.0    0.67      11/23   23/39   31.0    0.50
            8/15    26/43   23.5    0.70      12/23   23/37   27.7    0.66
            13/24   36/61   34.0    0.73      14/27   32/53   36.1    0.65
0.60  0.80  6/11    26/38   25.4    0.47      18/27   24/35   28.5    0.82
            7/11    30/43   20.5    0.70      8/13    25/35   20.8    0.65
            12/19   37/53   29.5    0.69      15/26   32/45   35.9    0.48
0.70  0.90  6/9     22/28   17.8    0.54      11/16   20/25   20.1    0.55
            4/6     22/27   14.8    0.58      19/23   21/26   23.2    0.95
            11/15   29/36   21.2    0.70      13/18   26/32   22.7    0.67

(a) For each value of (p0, p1), designs are given for three sets of error probabilities (α, β). The first,
second, and third rows correspond to error probability limits (0.10, 0.10), (0.05, 0.20), and (0.05,
0.10) respectively. For each design, EN(p0) and PET(p0) denote the expected sample size and the
probability of early termination when the true response probability is p0.
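As a check on the first line of Table 1 (r1 = 0, n1 = 9, n = 24, p0 = 0.05), the two tabulated summaries follow from the definitions of PET and EN given earlier; the arithmetic below is a worked illustration rounded to the precision of the table.

```latex
\begin{align*}
\mathrm{PET}(p_0) &= B(0;\, 0.05,\, 9) = (1 - 0.05)^{9} \approx 0.630, \\
EN(p_0) &= n_1 + \bigl(1 - \mathrm{PET}(p_0)\bigr)\, n_2
         \approx 9 + (1 - 0.630)(24 - 9) \approx 14.5 .
\end{align*}
```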
As pointed out above, the optimal two-stage design does not necessarily
minimize the maximum sample size n subject to the error probability constraints.
For example, consider the case of (p0, p1) = (0.30, 0.50) and (α, β) =
(0.10, 0.10). The optimal design, as seen in Table 1, has a maximum sample
size of 46 patients. There is a two-stage design based on a maximum of 39
patients that also satisfies the error constraints. That design has n1 = 28, r1
= 7, r = 15. The expected sample size for that design is 35.0, a 17% increase
over the expected size of 29.9 for the design with the minimum expected
sample size. For each set of design parameters in Tables 1 and 2, the left side
shows the optimal design and the right side shows the two-stage design
having the smallest maximum sample size n that satisfies the design constraints.

Table 2 Designs for p1 - p0 = 0.15 (a)

                    Optimal Design                    Minimax Design
            Reject drug if                    Reject drug if
            response rate                     response rate
p0    p1    ≤r1/n1  ≤r/n    EN(p0)  PET(p0)   ≤r1/n1  ≤r/n    EN(p0)  PET(p0)
0.05  0.20  0/12    3/37    23.5    0.54      0/18    3/32    26.4    0.40
            0/10    3/29    17.6    0.60      0/13    3/27    19.8    0.51
            1/21    4/41    26.7    0.72      1/29    4/38    32.9    0.57
0.10  0.25  2/21    7/50    31.2    0.65      2/27    6/40    33.7    0.48
            2/18    7/43    24.7    0.73      2/22    7/40    28.8    0.62
            2/21    10/66   36.8    0.65      3/31    9/55    40.0    0.62
0.20  0.35  5/27    16/63   43.6    0.54      6/33    15/58   45.5    0.50
            5/22    19/72   35.4    0.73      6/31    15/53   40.4    0.57
            8/37    22/83   51.4    0.69      8/42    21/77   58.4    0.53
0.30  0.45  9/30    29/82   51.4    0.59      16/50   25/69   56.0    0.68
            9/27    30/81   41.7    0.73      16/46   25/65   49.6    0.81
            13/40   40/110  60.8    0.70      27/77   33/88   78.5    0.86
0.40  0.55  16/38   40/88   54.5    0.67      18/45   34/73   57.2    0.56
            11/26   40/84   44.9    0.67      28/59   34/70   60.1    0.90
            19/45   49/104  64.0    0.68      24/62   45/94   78.9    0.47
0.50  0.65  18/35   47/84   53.0    0.63      19/40   41/72   58.0    0.44
            15/28   48/83   43.7    0.71      39/66   40/68   66.1    0.95
            22/42   60/105  62.3    0.68      28/57   54/93   75.0    0.50
0.60  0.75  21/34   47/71   47.1    0.65      25/43   43/64   54.4    0.46
            17/27   46/67   39.4    0.69      18/30   43/62   43.8    0.57
            21/34   64/95   55.6    0.65      48/72   57/84   73.2    0.90
0.70  0.85  14/20   45/59   36.2    0.58      15/22   40/52   36.8    0.51
            14/19   46/59   30.3    0.72      16/23   39/49   34.4    0.56
            18/25   61/79   43.4    0.66      33/44   53/68   48.5    0.81
0.80  0.95  5/7     27/31   20.8    0.42      5/7     27/31   20.8    0.42
            7/9     26/29   17.7    0.56      7/9     26/29   17.7    0.56
            16/19   37/42   24.4    0.76      31/35   35/40   35.3    0.94

(a) For each value of (p0, p1), designs are given for three sets of error probabilities (α, β). The first,
second, and third rows correspond to error probability limits (0.10, 0.10), (0.05, 0.20), and (0.05,
0.10) respectively. For each design, EN(p0) and PET(p0) denote the expected sample size and the
probability of early termination when the true response probability is p0.
`
In some cases, the "minimax" design may be more attractive than that with
the minimum expected sample size. This will be the case when the difference
in expected sample sizes is small and the patient accrual rate is low. Consider,
for example, the case of distinguishing p0 = 0.10 from p1 = 0.30 with α = β
= 0.10. The optimal design in Table 1 has an expected sample size under H0
`of 19.8 and a maximum sample size of 35. The minimax two-stage design has
`an expected sample size of 20.4 and a maximum sample size of 25. If the
`accrual rate is only ten patients per year, it could take 1 year longer to complete
`the optimal design than the minimax design. This may be more important
`than the slight reduction in expected sample size. Also, the optimal designs
achieve reductions in EN(p0) by having smaller first stages than the minimax
`designs. The small first stage exposes few patients to an inactive treatment.
`In cases where the patient population is very heterogeneous, however, a very
`small first stage may not be desirable because patients entered early in the
`study may be unrepresentative of the eligible population. Hence there are
`circumstances where the minimax designs are preferable.
Usually the minimax two-stage design has the same maximum sample size
n as the smallest single-stage design that satisfies the error probabilities. The
minimax two-stage design has a smaller expected sample size under H0, however.
In determining the minimax designs, we have limited attention to nontrivial
two-stage designs: those with n1 and n2 > 0. In some cases, the minimax
two-stage design has a smaller maximum sample size than the optimal single-stage
design. For example, the optimal single-stage design for distinguishing
p0 = 0.60 from p1 = 0.80 with α = β = 0.10 has a sample size of 36 and
rejects H1 if 25 or fewer responses are observed. As shown in Table 1, the
minimax two-stage design for this case has a maximum sample size of 35.
This is due to the discreteness of the binomial distribution and the fact that
although the one- and two-stage designs both satisfy the error constraints,
they do not have the same error probabilities.
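For comparison with the two-stage designs, the smallest single-stage design can be found by the same kind of exact enumeration. The sketch below is an illustration under assumptions: the function names and the search limit `n_max` are hypothetical, and the constraints are applied as "less than or equal to". Its result for p0 = 0.60, p1 = 0.80, and α = β = 0.10 can be compared with the single-stage design quoted in the text.

```python
from math import comb


def binom_cdf(x, p, n):
    """B(x; p, n): probability of x or fewer responses among n patients."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k)
               for k in range(0, min(x, n) + 1))


def smallest_single_stage(p0, p1, alpha, beta, n_max=200):
    """Smallest n (and cutoff r) such that rejecting the drug when r or fewer
    responses are seen meets both error constraints; None if not found."""
    for n in range(1, n_max + 1):
        for r in range(0, n):
            type1 = 1.0 - binom_cdf(r, p0, n)  # accept the drug although p = p0
            type2 = binom_cdf(r, p1, n)        # reject the drug although p = p1
            if type1 <= alpha and type2 <= beta:
                return n, r, type1, type2
    return None


if __name__ == "__main__":
    print(smallest_single_stage(0.60, 0.80, 0.10, 0.10))
```

The achieved error probabilities are returned alongside the design because, as noted above, designs that satisfy the same constraints generally do not attain the same error probabilities.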
`
`DISCUSSION
`
`A number of statistical designs have been previously proposed for phase
II clinical trials. The first and most commonly used design was developed
`by Gehan [5]. It is a two-stage design for estimating the response rate. It is
`most commonly employed with a first stage of 14 patients. If no responses
`are observed in the first stage, then the trial is terminated because this event
`has probability less than 0.05 if the true response probability is greater than
`or equal to 0.20. If at least one response is observed in the first 14 patients,
`then a second stage of accrual is carried out in order to obtain an estimate of
`the response probability having a specified standard error. The number of
`patients accrued in the second stage depends on the number of responses
`observed in the first stage and the desired standard error. Gehan’s design is
`often used with a second stage of 11 patients. This provides for estimation
`with approximately a 10% standard error, although this may provide very
`broad confidence limits. Requiring that the standard error be 5% instead of
`10% provides estimates with more satisfactory precision, but requires much
`larger sample sizes. If the second stage is to be much larger, then it is not
`clear that the first stage should include only 14 patients. This is because even
`for a poor drug with a true response probability of 5%, there is a 51% chance
`of obtaining at least one response in the first 14 patients.
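The probabilities quoted for Gehan's design follow from the binomial distribution and can be checked in a few lines. This is a minimal sketch with hypothetical function names; the standard error line assumes the worst case p = 0.5 with 14 + 11 = 25 patients, which is an assumption made here for illustration rather than a statement from the text.

```python
from math import sqrt


def prob_no_response(p, n):
    """Probability of observing zero responses among n patients."""
    return (1.0 - p) ** n


def binomial_standard_error(p, n):
    """Standard error of an observed response proportion based on n patients."""
    return sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # Active drug (p = 0.20): chance of no responses in the first 14 patients.
    print(prob_no_response(0.20, 14))        # about 0.044, below 0.05
    # Poor drug (p = 0.05): chance of at least one response in 14 patients.
    print(1.0 - prob_no_response(0.05, 14))  # about 0.51
    # Assumed worst case p = 0.5 with 25 patients in total.
    print(binomial_standard_error(0.5, 25))  # about 0.10
```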
`Although one can make a cogent case that the main objective of phase III
`clinical trials should be estimation rather than hypothesis testing, in planning
`phase II trials it often seems more useful to identify levels of activity to
`distinguish than to select a precision for estimation. The two design principles
`are mathematically similar, but the hypothesis testing formulation encourages
`phase II trial designers to think carefully about the objectives of the experiment
`and to define how decisions will be influenced by results. It is for this reason
`that we have adopted the hypothesis testing framework employed by most
`authors. Jennison and Turnbull [6] and Chang and O’Brien [7] have described
`methods for calculating confidence intervals following sequential sampling
`procedures of the type proposed here.
`Schultz et al. [8] developed recursive formulae for calculating the operating
`characteristics of general k-stage designs with the possibility of acceptance
`and rejection at each stage. Early acceptance is appropriate for situations
`where patients are very limited or the drug is very expensive. Fleming [9]
`also studied k-stage designs with acceptance or rejection possible at each stage.
`Fleming’s design is based on an approach developed for phase III trials [10]
`in which early rejection of a hypothesis occurs only when interim results are
`quite extreme. This conservatism permits final analysis to be unaffected by
`interim monitoring if early termination does not occur but is not always
desirable for phase II trials of agents that are likely to be inactive. Lee et al.
[11] considered two-stage designs that permit the possibility of recommending
`additional phase II trials. Herson [12] described a multistage Bayesian ap-
`proach in which the trial is terminated early if the predictive probability of
`rejecting the null hypothesis at the maximum sample size falls below a spec-
`ified level. The predictive probability is calculated with regard to the posterior
`distribution of p given the prior distribution and the data. None of the above
`authors attempted to optimize their designs.
`Sylvester and Staquet [13] have provided an interesting decision theory
`approach to this problem, although the complexities of real-world decision-
`making are difficult to capture with simple models. Colton and McPherson
`[2] considered the design of two-stage clinical trials with binary outcome.
`They restricted attention to the case where the sample sizes of the two stages
`are equal and where the null hypothesis is p = 0.05. They determined a total
`sample size and rejection regions r1 and r to minimize the expected sample
`size under the alternative hypothesis.
Chang et al. [14] have recently also considered the problem of optimizing
`the design of phase II trials. They described an algorithm that, although not
`guaranteed to find the optimum design, seemed to work well. For their de-
`signs early acceptance of the drug is permitted and the expected sample size,
`averaged over the null and alternative hypotheses, is minimized. In their
`published tables they have not optimized with regard to the maximum sample
size.
`
`Comparison of the designs developed here to those published by others
for distinguishing a null hypothesis p ≤ p0 from an alternative p ≥ p1 is made
difficult by the fact that two designs may not be equivalent with regard to
the error probabilities α and β. Table 3 compares the optimum designs developed
here with two-stage designs tabulated by Fleming [9] and by Chang
et al. [14] for cases where the error probabilities are not too dissimilar. For
`most cases shown in Table 3, the new optimized designs offer a meaningful
`reduction in expected sample size when the null hypothesis is true. It must
`be recognized, however, that Fleming did not optimize with regard to the
`sample size within stages subject to constraints on the error probabilities.
Chang et al. [14] presented optimized results only for three-stage designs.
Also, the latter two designs provide for early rejection of the null hypothesis,
a feature that may be useful in some clinical trials.

Table 3 Comparison of Two-Stage Designs (a)

p0    p1    Type     r1/n1   a1/n1   r/n     α      1 - β  EN(p0)
0.05  0.20  Fleming  0/20    4/20    4/40    0.052  0.922  32.5
            Chang    0/20    5/20    4/40    0.047  0.920  32.8
            Optimal  1/21    -       4/41    0.046  0.902  26.7
0.10  0.30  Fleming  1/15    5/15    5/25    0.036  0.807  19.4
            Optimal  1/10    -       5/29    0.047  0.805  15.0
0.20  0.40  Fleming  4/20    9/20    11/35   0.037  0.801  25.4
            Chang    7/25    9/25    16/50   0.050  0.814  26.6
            Optimal  3/13    -       12/43   0.049  0.800  20.6
0.20  0.40  Fleming  4/25    11/25   15/50   0.032  0.904  39.3
            Chang    5/25    10/25   15/50   0.039  0.901  34.2
            Optimal  4/19    -       15/54   0.048  0.904  30.4
0.30  0.50  Fleming  8/25    14/25   19/45   0.029  0.807  31.3
            Chang    9/25    13/25   21/50   0.032  0.801  29.3
            Optimal  5/15    -       18/46   0.049  0.803  23.6
0.30  0.50  Fleming  7/25    14/25   20/50   0.048  0.894  37.1
            Chang    6/25    14/25   20/50   0.049  0.899  41.3
            Optimal  8/24    -       24/63   0.049  0.903  34.7

(a) The hypothesis p ≥ p1 is rejected if the number of responses is ≤ r1 after n1 patients
or ≤ r after n patients. The hypothesis p ≤ p0 is rejected if the number of responses is
≥ a1 after n1 patients or > r after n patients.

We have tabulated optimal phase II designs for (α, β) = (0.10, 0.10), (0.05, 0.20),
and (0.05, 0.10). In phase II trials, both kinds of error are important. β represents
the probability of rejecting a treatment with response rate ≥ p1. α
represents the probability of failing to reject a treatment with response probability
≤ p0. This is a less serious error from a drug discovery viewpoint, but
it is serious from a cost perspective since it leads to unnecessary follow-up
trials. The tabulated designs should be appropriate for most situations. It is
unusual to have β < α and designs based on α = β = 0.05 require too large
a sample size for practical use in most phase II trials. Tabulation of the minimax
designs should also be useful for those who prefer such designs or who wish
to know the smallest value of maximum sample size to use for selecting
designs of other types.
For phase II trials of new drugs against solid tumors, designs with (p0, p1)
equal to (0.05, 0.20), (0.05, 0.25), or (0.10, 0.25) will often be appropriate. This
is because many new drugs are almost totally inactive against the common
solid tumors. Also, new drugs that provide relatively modest (20%-25%)
response rates against these refractory diseases are of interest for further
development. In some cases effective treatments can be obtained by combining
such drugs with other drugs, by optimizing their schedules or routes of
administration, or by using them for patients with less advanced forms of
the same histopathological type of cancer. Other tabulated designs will be
appropriate for phase II trials of combination regimens. It is for this reason
that the full range of (p0, p1) was produced. For pilot studies of combinations,
the level p1 - p0 = 0.20 is commonly the degree of difference targeted.
Designing a trial to distinguish only larger differences is often unrealistic and
uninformative. A p1 - p0 of 0.15 is probably the smallest difference that one
`would consider for a phase II study because the sample sizes become pro-
`hibitively large for smaller differences and because the lack of controls limits
`the interpretability of trials based on distinguishing smaller differences. The
`designs presented here could also be utilized for pilot studies of intensive
`regimens with toxicity as the endpoint. In this case the hypotheses should
`be specified in terms of the probability of no toxic event.
`The optimization criterion chosen here is not unique. One could minimize
`the expected sample size averaged with regard to a prior distribution for the
`true response probability p. Historically, however, most new regimens are
`not successful and, more importantly, optimizing the design for performance
`under the null hypothesis seems ethically appropriate.
`As pointed out by DeMets [15], the decision to terminate a controlled
`clinical trial early is often complex and sequential boundaries are generally
`to be regarded as guidelines rather than rigid decision rules. This also applies
`to phase II clinical trials because there are secondary endpoints and sometimes
`patient subsets of interest. A decision to terminate a phase II trial for a treat-
`ment having poor activity for a well defined and fairly homogeneous set of
patients, however, is generally less complex than a decision for early termination
of a large controlled study.
`
`REFERENCES
`
1. Simon R: Design, Analysis and Reporting of Cancer Clinical Trials. In: Biopharmaceutical
Statistics for Drug Development (Peace, KE, Ed.) New York: Marcel
Dekker, 1987

2. Colton T, McPherson K: Two-stage plans compared with fixed-sample-size and
Wald SPRT plans. J Am Stat Assoc 71:80-86, 1976

3. McPherson K: On choosing the number of interim analyses in clinical trials. Stat
Med 1:25-36, 1982

4. Jennison C: Efficient group sequential tests with unpredictable group sizes. Biometrika
74:155-166, 1987

5. Gehan EA: The determination of the number of patients required in a follow-up
trial of a new chemotherapeutic agent. J Chron Dis 13:346-353, 1961

6. Jennison C, Turnbull BW: Confidence intervals for a binomial parameter following
a multistage test with application to MIL-STD 105D and medical trials. Technometrics
25:49-58, 1983

7. Chang MN, O'Brien PC: Confidence intervals following group sequential tests.
Controlled Clin Trials 7:18-26, 1986

8. Schultz JR, Nichol FR, Elfring GL, Weed SD: Multi-stage procedures for drug
screening. Biometrics 29:293-300, 1973

9. Fleming TR: One sample multiple testing procedure for phase II clinical trials.
Biometrics 38:143-151, 1982

10. O'Brien PC, Fleming TR: A multiple testing procedure for clinical trials. Biometrics
35:549-555, 1979

11. Lee YJ, Staquet M, Simon R, Cantane R, Muggia F: Two stage plans for patient
accrual in phase II cancer clinical trials. Cancer Treat Rep 63:1721-1726, 1979

12. Herson J: Predictive probability early termination plans for phase II clinical trials.
Biometrics 35:775-783, 1979

13. Sylvester RJ, Staquet MJ: Design of phase II trials in clinical trials in cancer using
decision theory. Cancer Treat Rep 64:519-524, 1977

14. Chang MN, Therneau TM, Wieand HS, Cha SS: Designs for group sequential
phase II clinical trials. Biometrics 43:865-874, 1987

15. DeMets DL: Stopping guidelines versus stopping rules: A practitioner's point of
view. Communications Stat Theory Meth 13:2395-2417, 1984
`
`