`doi:10.1093/biostatistics/kxx069
`
`Estimation of clinical trial success rates and related
`parameters
`
`CHI HEEM WONG, KIEN WEI SIAH
`MIT Computer Science and Artificial Intelligence Laboratory & Department of Electrical Engineering
`and Computer Science, Cambridge, MA 02139, USA and MIT Sloan School of Management and
Laboratory for Financial Engineering, Cambridge, MA 02142, USA

ANDREW W. LO∗
`MIT Computer Science and Artificial Intelligence Laboratory & Department of Electrical Engineering
`and Computer Science, Cambridge, MA 02139, USA, MIT Sloan School of Management and Laboratory
`for Financial Engineering, Cambridge, MA 02142, USA, and AlphaSimplex Group, LLC,
`Cambridge, MA 02142, USA
`alo-admin@mit.edu
`
`SUMMARY
`Previous estimates of drug development success rates rely on relatively small samples from databases
`curated by the pharmaceutical industry and are subject to potential selection biases. Using a sample of
`406 038 entries of clinical trial data for over 21 143 compounds from January 1, 2000 to October 31,
`2015, we estimate aggregate clinical trial success rates and durations. We also compute disaggregated
`estimates across several trial features including disease type, clinical phase, industry or academic sponsor,
`biomarker presence, lead indication status, and time. In several cases, our results differ significantly in
`detail from widely cited statistics. For example, oncology has a 3.4% success rate in our sample vs. 5.1%
`in prior studies. However, after declining to 1.7% in 2012, this rate has improved to 2.5% and 8.3% in
2014 and 2015, respectively. In addition, trials that use biomarkers in patient selection have higher overall
`success probabilities than trials without biomarkers.
`
`Keywords: Clinical phase transition probabilities; Clinical trial statistics; Probabilities of success.
`
`1. INTRODUCTION
`
`The probability of success (POS) of a clinical trial is critical for clinical researchers and biopharma
`investors to evaluate when making scientific and economic decisions. Prudent resource allocation relies
`on the accurate and timely assessment of risk. Without up-to-date estimates of the POS, however, investors
`may misjudge the risk and value of drug development, leading to lost opportunities for both investors and
patients.

∗To whom correspondence should be addressed.

© The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Downloaded from https://academic.oup.com/biostatistics/advance-article-abstract/doi/10.1093/biostatistics/kxx069/4817524
by Alabama State University user
on 09 February 2018

Abraxis EX2074
Apotex Inc. and Apotex Corp. v. Abraxis Bioscience, LLC
IPR2018-00151; IPR2018-00152; IPR2018-00153

One of the biggest challenges in estimating the success rate of clinical trials is access to accurate
information on trial characteristics and outcomes. Gathering such data is expensive, time-consuming, and
susceptible to error. Previous studies of success rates have been constrained by the data in several respects.
`Abrantes-Metz and others (2005) surveyed 2328 drugs using 3136 phase transitions (e.g., from Phase 1
`to Phase 2 in the approval process), while DiMasi and others (2010) studied 1316 drugs from just 50
`companies. In the landmark study of this area, Hay and others (2014) analyzed 7372 development paths
`of 4451 drugs using 5820 phase transitions. In two recent papers, Smietana and others (2016) computed
`statistics using 17 358 phase transitions for 9200 compounds, while Thomas and others (2016) used 9985
`phase transitions for 7455 clinical drug development programs. In contrast, ClinicalTrials.gov, the clinical
`trial repository maintained by the National Institutes of Health (NIH), contains over 217 000 clinical trial
`entries submitted by various organizations as of July 1, 2016 (see www.clinicaltrials.gov). It is estimated
`that trained analysts would require tens of thousands of hours of labor to incorporate its full information
`manually to produce POS estimates.
`In this article, we construct estimates of the POS and other related risk characteristics of clinical
`trials using 406 038 entries of industry- and non-industry-sponsored trials, corresponding to 185 994
`unique trials over 21 143 compounds from Informa Pharma Intelligence’s Trialtrove and Pharmaprojects
`databases from January 1, 2000 to October 31, 2015. This is the largest investigation thus far into clinical
`trial success rates and related parameters. To process this large amount of data, we develop an automated
`algorithm that traces the path of drug development, infers the phase transitions, and computes the POS
`statistics in hours. In this article, we introduce the “path-by-path” approach that traces the proportion of
`development paths that make it from one phase to the next. In contrast, extant literature uses what we
`call the “phase-by-phase” approach, which estimates the POS from a random sample of observed phase
`transitions. Apart from the gains in efficiency, our algorithmic approach allows us to perform previously
`infeasible computations, such as generating time-series estimates of POS and related parameters.
`We estimate aggregate success rates, completion rates (CRs), phase-transition probabilities, and trial
`durations, as well as more disaggregated measures across various dimensions such as clinical phase,
`disease, type of organization, and whether biomarkers are used. Before presenting these and other results,
`we begin by discussing our methodology and describing some features of our data set.
`
`2. DATA
`
`We use Citeline data provided by Informa Pharma Intelligence, a superset of the most commonly used data
`sources that combines individual clinical trial information from Trialtrove and drug approval data from
`Pharmaprojects. In addition to incorporating multiple data streams, including nightly feeds from official
`sources such as ClinicalTrials.gov, Citeline contains data from primary sources such as institutional press
`releases, financial reports, study reports, and drug marketing label applications, and secondary sources
`such as analyst reports by consulting companies. Secondary sources are particularly important for reducing
`potential biases that may arise from the tendency of organizations to report only successful trials, especially
`those prior to the FDA Amendments Act of 2007, which requires all clinical trials to be registered and
`tracked via ClinicalTrials.gov. Our database contains information from both US and non-US sources.
`The database encodes each unique quartet of trial identification number, drug, indication, and sponsor
`as a data point. As such, a single trial can be repeated as multiple data points. The trials range from
`January 1, 2000, to October 31, 2015, the latter being the date that we received the data set. After deleting
`46 524 entries with missing dates and unidentified sponsors, and 1818 entries that ended before January 1,
`2000, 406 038 data points remain. Of these, 34.7% (141 086) are industry sponsored and 65.3% (264 952)
`are non-industry sponsored. In our industry-sponsored analysis, we counted 41 040 development paths
`or 67 752 phase transitions after the imputation process. Figure S1 in Section A1 of the supplementary
`material available at Biostatistics online contains an illustrative sample of the data set and some basic
`summary information.
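The quartet encoding described above can be pictured with a minimal sketch. The class name `DataPoint`, the helper `expand_trial`, and the assumption that one trial expands into every combination of its drugs, indications, and sponsors are all hypothetical illustrations; the Citeline schema itself is not specified here.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DataPoint:
    """One database entry: a unique (trial, drug, indication, sponsor) quartet."""
    trial_id: str
    drug: str
    indication: str
    sponsor: str


def expand_trial(trial_id, drugs, indications, sponsors):
    """Hypothetical expansion of a single trial into one data point per quartet."""
    return [DataPoint(trial_id, d, i, s)
            for d, i, s in product(drugs, indications, sponsors)]


# A made-up trial testing two drugs for one indication with two sponsors.
points = expand_trial("TRIAL-001", ["drug A", "drug B"],
                      ["indication X"], ["sponsor 1", "sponsor 2"])
print(len(points))  # number of data points generated by this one trial
```

Under this assumed expansion, the number of data points (406 038) can exceed the number of unique trials (185 994), as reported in the Introduction.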
`
`
`Some trials are missing end-dates due to the failure of their sponsors to report this information. Since
`these dates are required by our algorithm, we estimate them by assuming that trials lasted the median
`duration of all other trials with similar features. Only 14.6% (59 208) of the data points required the
`estimation of end-dates.
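As a rough illustration of this imputation step, the sketch below fills a missing end date with the start date plus the median duration of trials in the same feature group. The record layout, the `group` key standing in for "similar features", and the function name `impute_end_dates` are assumptions made for the example, not the authors' implementation.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median


def impute_end_dates(trials):
    """Fill missing end dates using the median duration of trials in the same group."""
    durations = defaultdict(list)
    for t in trials:
        if t["end"] is not None:
            durations[t["group"]].append((t["end"] - t["start"]).days)
    filled = []
    for t in trials:
        if t["end"] is None and t["group"] in durations:
            days = median(durations[t["group"]])
            # Copy the record so the input list is left untouched.
            t = dict(t, end=t["start"] + timedelta(days=days), imputed=True)
        filled.append(t)
    return filled


group = ("Phase 2", "oncology")  # hypothetical definition of "similar features"
start = date(2010, 1, 1)
trials = [
    {"group": group, "start": start, "end": start + timedelta(days=360)},
    {"group": group, "start": start, "end": start + timedelta(days=540)},
    {"group": group, "start": start, "end": None},  # end date not reported
]
filled = impute_end_dates(trials)
print((filled[2]["end"] - filled[2]["start"]).days)  # imputed duration in days
```

Trials whose group has no observed durations are left unchanged in this sketch; how such cases are handled in the actual data set is not stated.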
`
`3. MODELING THE DRUG DEVELOPMENT PROCESS
`
`To avoid confusion and facilitate the comparison of our results with those in the extant literature, we begin
`by defining several key terms. A drug development program is the investigation of a particular drug for
`a single indication (see top diagram of Figure S2 of the supplementary material available at Biostatistics
`online). A drug development program is said to be in Phase i if it has at least one Phase i clinical trial.
`If a Phase i clinical trial concludes and its objectives are met, this trial is said to be completed. If it is
`terminated prematurely for any reason, except in the case that it has positive results, the trial is categorized
`as failed. Conditioned on one or more trial(s) being completed, the sponsor can choose to either pursue
`Phase i + 1 trials, or simply terminate development. If the company chooses the former option, the drug
`development program is categorized as a success in Phase i, otherwise, it will be categorized as terminated
`in Phase i. See Figure S2 (bottom) of the supplementary material available at Biostatistics online for an
`illustration. The POS for a given Phase i, denoted by POSi,i+1, is defined as the probability that the drug
`development program advances to the next phase. The probability of getting a drug development program
in Phase i through to approval is denoted by POSi,APP. Hence the overall probability of success (moving
a drug from Phase 1 to approval, which Hay and others (2014) call the likelihood of approval, or LOA) is
POS1,APP.
The proper interpretation of drug development programs from clinical trial data requires some understanding
of the drug development process, especially in cases of missing data. This is particularly important
`for estimating a drug candidate’s POS1,APP, which is typically estimated by multiplying the empirical POS
`of Phase 1 (safety), 2 (efficacy for a given indication), and 3 (efficacy for larger populations and against
`alternatives) trials. If, for example, Phase 2 data are missing for certain approved drugs, the estimated
`POS1,APP would be biased downward. Here, we take a different approach to estimating POSs.
`Consider an idealized process in which every drug development program passes through Phase 1, 2,
`and 3 trials, in this order. This is plausible, since each of these stages involves distinct predefined tests, all
`of which are required by regulators in any new drug application (NDA). If we observe data for Phases 1
`and 3 but not Phase 2 for a given drug-indication pair, our idealized process implies that there was at least
`one Phase 2 trial that occurred, but is missing from our data set. Accordingly, we impute the successful
`completion of Phase 2 in these cases. There exist some cases where Phase 2 trials are skipped, as with the
`recent example of Aducanumab (BIIB037), Biogen’s Alzheimer’s candidate, as reported by Root (2014).
`Since skipping Phase 2 trials is motivated by compelling Phase 1 data, imputing the successful completion
`of Phase 2 trials in these cases to trace drug development paths may not be a bad approximation. In addition,
`we make the standard assumption that Phase 1/2 and Phase 2/3 trials are to be considered as Phase 2 and
`Phase 3, respectively.
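The idealized process above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm in Figure S5: the phase labels, the mapping table `PHASE_MAP`, and the function `reconstruct_path` are hypothetical names, and every phase below the highest observed phase is treated as imputed if it is not observed.

```python
# Combined-phase trials are counted as the later phase, as described above.
PHASE_MAP = {"1": 1, "1/2": 2, "2": 2, "2/3": 3, "3": 3}


def reconstruct_path(trial_phases):
    """Return (observed, imputed) phase sets for one drug-indication pair."""
    observed = {PHASE_MAP[p] for p in trial_phases}
    highest = max(observed)
    # Any lower phase without an observed trial is inferred to be missing.
    imputed = set(range(1, highest)) - observed
    return observed, imputed


# Phase 1 and Phase 2/3 trials are observed, so Phase 2 is imputed.
observed, imputed = reconstruct_path(["1", "2/3"])
print(sorted(observed), sorted(imputed))
```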
These assumptions allow us to more accurately reconstruct ‘drug development paths’ for individual
drug-indication pairs, which in turn yield more accurate POS estimates. Let $n^{j}$ be the number of drug
development paths with observed Phase $j$ trials, and $n^{j}_{s}$ be the number of drug development paths where
we observe phase transitions of state $s$ of Phase $j$, where

\[
s =
\begin{cases}
ip, & \text{if all the trials are in progress} \\
t, & \text{if the program failed to proceed to Phase } j + 1 \text{ (i.e., terminated)} \\
m, & \text{if the phase transition can be inferred to be missing}
\end{cases}
\]
`
Equation 3.1 is the conservation law for drug development paths in Phase j + 1.

\[
n^{j+1} = n^{j} + n^{j}_{m} - n^{j}_{ip} - n^{j}_{t} \qquad \forall j = 1, 2, 3 \tag{3.1}
\]
`
`The POS from any one state to the next, POSj,j+1, is thus the ratio of the number of drug development
`projects in Phase j + 1, both observed and non-observed, to the number of drug development projects in
`Phase j, both observed and non-observed:
`
\[
\mathrm{POS}_{j,j+1}(\text{Path-by-Path}) = \frac{n^{j+1}}{n^{j} + n^{j}_{m} - n^{j}_{ip}} \tag{3.2}
\]
`
`Given our model, we can now compute POS1,APP by finding the proportion of development paths that
`made it from Phase 1 to Approval:
`
\[
\mathrm{POS}_{1,\mathrm{APP}}(\text{Path-by-Path}) = \frac{n^{\mathrm{Approval}}}{n^{1} + n^{1}_{m} - n^{1}_{ip} - n^{2}_{ip} - n^{3}_{ip}} \tag{3.3}
\]
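To make Equations 3.1 to 3.3 concrete, the following sketch evaluates them on made-up counts. The function and argument names are illustrative only, and the numbers are hypothetical rather than taken from the data set.

```python
def next_phase_count(n, n_missing, n_in_progress, n_terminated):
    """Conservation law (3.1): paths entering the next phase."""
    return n + n_missing - n_in_progress - n_terminated


def pos_path_by_path(n_next, n, n_missing, n_in_progress):
    """Phase POS (3.2): paths in the next phase over resolved paths in this phase."""
    return n_next / (n + n_missing - n_in_progress)


def pos_overall_path_by_path(n_approval, n1, n1_missing, in_progress_by_phase):
    """Overall POS (3.3): approvals over Phase 1 paths net of in-progress paths."""
    return n_approval / (n1 + n1_missing - sum(in_progress_by_phase))


# Hypothetical Phase 1 counts: 100 observed, 10 imputed, 20 in progress, 30 terminated.
n2 = next_phase_count(n=100, n_missing=10, n_in_progress=20, n_terminated=30)
pos_12 = pos_path_by_path(n_next=n2, n=100, n_missing=10, n_in_progress=20)

# Hypothetical overall counts: 9 approvals; 20, 10, and 5 paths in progress in Phases 1 to 3.
pos_1app = pos_overall_path_by_path(n_approval=9, n1=100, n1_missing=10,
                                    in_progress_by_phase=[20, 10, 5])
print(n2, round(pos_12, 3), round(pos_1app, 3))
```

With these counts, 60 paths enter Phase 2 out of 90 resolved Phase 1 paths, and 9 approvals are set against 75 resolved development paths.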
`
We term this the ‘path-by-path’ approach. In contrast, extant papers define the phase transition probability
as the ratio of observed phase transitions to the number of observed drug development programs in Phase
i and multiply the individual phase probabilities to estimate the overall POS. We term this the ‘phase-by-phase’
approach, which we shall differentiate from the path-by-path computation by a superscript
p as follows:

\[
\mathrm{POS}^{p}_{j,j+1} = \frac{n^{j+1} - n^{j}_{m}}{n^{j} - n^{j}_{ip}} \tag{3.4}
\]

\[
\mathrm{POS}^{p}_{1,\mathrm{APP}} = \prod_{j \in \{1,2,3\}} \mathrm{POS}^{p}_{j,j+1} \tag{3.5}
\]
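The phase-by-phase estimator in Equations 3.4 and 3.5 can be sketched in the same style. The counts below are hypothetical, and the function names are chosen for the example only.

```python
from math import prod


def pos_phase_by_phase(n_next, n, n_missing, n_in_progress):
    """Phase POS (3.4): observed transitions over observed, resolved programs."""
    return (n_next - n_missing) / (n - n_in_progress)


def pos_overall_phase_by_phase(counts_by_phase):
    """Overall POS (3.5): product of the three phase-level estimates."""
    return prod(pos_phase_by_phase(**c) for c in counts_by_phase)


# Hypothetical counts for Phases 1, 2, and 3.
counts = [
    {"n_next": 60, "n": 100, "n_missing": 10, "n_in_progress": 20},  # 50 / 80
    {"n_next": 30, "n": 60, "n_missing": 5, "n_in_progress": 10},    # 25 / 50
    {"n_next": 12, "n": 30, "n_missing": 2, "n_in_progress": 5},     # 10 / 25
]
pos_1app_p = pos_overall_phase_by_phase(counts)
print(round(pos_1app_p, 3))
```

Unlike the path-by-path ratio, each factor here uses only programs observed in that phase, so imputed transitions are subtracted from the numerator and never enter the denominator.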
`
`Implicit in the path-by-path computation method is the assumption that we have relatively complete
`information about the trials involved in drug development programs. This is true of our data set, as we
`are analyzing relatively recent years where trial pre-registration is a prerequisite for publication in major
`medical journals and use of the studies as supporting evidence for drug applications.
`However, this assumption breaks down when we look at short windows of duration, for example,
`in a rolling window analysis to estimate the change in the POS over time. In such cases, we default
`back to the ‘phase-by-phase’ estimation to get an insight into the trend. This is done by considering
`only those drug development programs with phases that ended between t1 and t2 in the computation of
`the POS.
`
\[
\mathrm{POS}^{p}_{j,j+1}(t_1, t_2) = \frac{n^{j+1}(t_1, t_2) - n^{j}_{m}(t_1, t_2)}{n^{j}(t_1, t_2) - n^{j}_{ip}(t_1, t_2)} \tag{3.6}
\]

\[
\mathrm{POS}^{p}_{1,\mathrm{APP}}(t_1, t_2) = \prod_{j \in \{1,2,3\}} \mathrm{POS}^{p}_{j,j+1}(t_1, t_2) \tag{3.7}
\]
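A windowed estimate in the spirit of Equation 3.6 might be computed as in the sketch below. The record fields, the state labels, and the inclusive treatment of the window endpoints are assumptions for illustration. The numerator uses the identity implied by Equation 3.1, namely that the observed advances equal the observed programs minus those in progress and those terminated.

```python
from datetime import date


def windowed_pos(programs, phase, t1, t2):
    """Phase-by-phase POS for one phase, using programs whose phase ended in [t1, t2]."""
    in_window = [p for p in programs
                 if p["phase"] == phase and t1 <= p["phase_end"] <= t2]
    n = len(in_window)                                  # observed programs in the window
    n_ip = sum(p["state"] == "ip" for p in in_window)   # still in progress
    n_t = sum(p["state"] == "t" for p in in_window)     # terminated
    if n == n_ip:
        return None  # no resolved programs in this window
    return (n - n_ip - n_t) / (n - n_ip)


# Hypothetical programs; "advanced" marks an observed transition to the next phase.
programs = [
    {"phase": 2, "phase_end": date(2010, 3, 1), "state": "advanced"},
    {"phase": 2, "phase_end": date(2010, 9, 1), "state": "t"},
    {"phase": 2, "phase_end": date(2011, 6, 1), "state": "t"},
    {"phase": 2, "phase_end": date(2012, 2, 1), "state": "ip"},
    {"phase": 2, "phase_end": date(2014, 5, 1), "state": "advanced"},  # outside the window
    {"phase": 3, "phase_end": date(2011, 1, 1), "state": "advanced"},  # different phase
]
pos_23 = windowed_pos(programs, phase=2, t1=date(2010, 1, 1), t2=date(2012, 12, 31))
print(round(pos_23, 3))
```

Sliding the pair (t1, t2) forward and repeating the calculation gives a time series of phase-level estimates, whose product over the three phases corresponds to Equation 3.7.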
`
`We further note that if no phase transitions are missing, the path-by-path and phase-by-phase methods
`should produce the same results, but the former will be more representative of actual approval rates if
`phase transitions are missing. We elaborate on this in Section A2 of the supplementary material available
`at Biostatistics online.
`Given our development-path framework, we can compute the POS using an algorithm that recursively
`considers all possible drug-indication pairs and determines the maximum observed phase. Reaching Phase
`i would imply that all lower phases were completed. To determine if a drug development program has
been terminated in the last observed phase or is still ongoing, we use a simple heuristic: if the time elapsed
between the end date of the most recent Phase i trial and the end of our sample exceeds a certain threshold
ti, we conclude that the program has been terminated. Based on practical considerations, we set ti to be 360, 540,
and 900 days for Phases 1, 2, and 3, respectively. For example, we assume that it takes approximately 6
months to prepare documents for an NDA filing after a Phase 3 trial has been completed. Since the FDA
has a 6-month period to decide if it wishes to follow up on a filing, and an additional 18 months to deliver
a verdict, this puts the overall time between Phase 3 and Approval at about 30 months (6 + 6 + 18); hence we set
t3 = 900 days. Pseudo-code for the algorithm is given in Figure S5 in Section A3 of the supplementary
material available at Biostatistics online.
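The termination heuristic can be expressed compactly. The sketch below uses the thresholds and sample end date stated in the text; the function name, the state labels `"t"` and `"ip"`, and the use of a strict inequality are assumptions for illustration and are not taken from Figure S5.

```python
from datetime import date

THRESHOLD_DAYS = {1: 360, 2: 540, 3: 900}  # t_i for Phases 1, 2, and 3
SAMPLE_END = date(2015, 10, 31)            # end of the sample


def last_phase_state(phase, last_end_date, sample_end=SAMPLE_END):
    """Classify a program in its last observed phase as terminated or in progress."""
    elapsed = (sample_end - last_end_date).days
    return "t" if elapsed > THRESHOLD_DAYS[phase] else "ip"


# The same elapsed time can lead to different conclusions in different phases.
print(last_phase_state(3, date(2013, 1, 1)))  # Phase 3 ended long before the sample end
print(last_phase_state(3, date(2014, 6, 1)))  # Phase 3 ended within the 900-day threshold
print(last_phase_state(1, date(2014, 6, 1)))  # same date, but the Phase 1 threshold is 360 days
```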
`In summary, our algorithm allows us to impute missing trial data, and by counting the number of phase
`transitions, we can estimate the phase and overall POS.
`
`4. RESULTS
`
`4.1. POS for all drugs and indications
`
`Table 1 contains our estimates of the aggregate POS for each clinical phase across all indications.
Corresponding estimates from the prior literature are also included for comparison. We find that 13.8%
of all drug development programs eventually lead to approval, which is higher than the 10.4% reported
by Hay and others (2014) and the 9.6% reported by Thomas and others (2016). The overall POS estimates
presented in this study, Hay and others (2014), and Thomas and others (2016) are much higher than the
1% to 3% figure that is colloquially cited because they are conditioned on the drug development program
entering Phase 1. Our phase-specific POS estimates are higher than those of both studies in all phases.
The largest increase is seen in POS2,3, where we obtained a value of 58.3% compared to 32.4% in Hay
and others (2014) and 30.7% in Thomas and others (2016). These differences may be due to our method
of imputing missing clinical trials.
`Table 2 contains phase and overall POS estimates by therapeutic group. The overall POS (POS1,APP)
`ranges from a minimum of 3.4% for oncology to a maximum of 33.4% for vaccines (infectious disease).
`The overall POS for oncology drug development programs is about two-thirds the previously reported
`estimates of 5.1% in Thomas and others (2016) and 6.7% in Hay and others (2014).
`A significantly different pattern emerges when we consider the phase POS for lead indications. The
`overall POS (POS1,APP) increases when considering only lead indications, which is in line with the findings
`by Hay and others (2014). However, while we find an increase in the POS for Phase 1 (POS1,2) and Phase 3
`(POS3,APP), we find a decrease in the POS for Phase 2 (POS2,3) when looking only at lead indications. The
`POS for lead indications may be lower than the POS for all indications if a company initiates clinical trials
`for many indications, and most of them move on to the next phase. Conversely, the POS for lead indications
`may be higher if many of the initiated clinical trials for the same drug fail. The practice of initiating clinical
`trials for multiple indications using the same drug is prevalent in the industry, as documented in Table S2
`in Section A5 of the supplementary material available at Biostatistics online. The relative performance of
`the various therapeutic groups remains the same when considering only lead indications, with oncology
`remaining the lowest performing group at 11.4% for POS1,APP. Finally, the overall POS for individual
`therapeutic groups when considering only lead indications shows mixed directions in comparison to the
`respective overall POS specific to the indication.
`
`
`
`
Table 1. Comparison of the results of our article with previous publications using data from January 1, 2000, to October 31, 2015. We computed
this using the algorithm shown in Fig. S5 in the Supplementary Material, which traces drug development programs and calculates the proportion of
programs that advance from one phase to another

| Study | Method | Phase 1 to 2: POSi,i+1 | Phase 1 to 2: POSi,APP | Phase 2 to 3: POSi,i+1 | Phase 2 to 3: POSi,APP | Phase 3 to APP: POSi,i+1 | Phase 3 to APP: POSi,APP | Phase 1 to APP | Number of drugs | Years of source data (time-span) | Number of companies |
|---|---|---|---|---|---|---|---|---|---|---|---|
| This study, all indications (industry) | Path-by-Path | 66.4% | 13.8% | 58.3% | 35.1% | 59.0% | 59.0% | 13.8% | 15 102 | 2000–2015 (16 years) | 5764 |
| This study, all indications (industry) | Phase-by-Phase | 38.8% | 6.9% | 38.2% | 11.2% | 59.0% | 59.0% | 6.9% | 15 102 | 2000–2015 (16 years) | 5764 |
| This study, lead indications (industry) | Path-by-Path | 75.8% | 21.6% | 55.9% | 26.4% | 70.0% | 70.0% | 21.6% | 15 102 | 2000–2015 (16 years) | 5764 |
| Thomas and others (2016), all indications | Phase-by-Phase | 63.2% | 9.6% | 30.7% | 15.2% | 49.6% | 49.6% | 9.6% | Unknown | 2006–2015 (10 years) | 1103 |
| Hay and others (2014), all indications | Phase-by-Phase | 64.5% | 10.4% | 32.4% | 16.2% | 50.0% | 50.0% | 10.4% | 5820 | 2003–2011 (9 years) | 835 |
| Hay and others (2014), lead indications | Phase-by-Phase | 66.5% | 15.3% | 39.5% | 23.1% | 58.4% | 58.4% | 15.3% | 4736 | 2003–2011 (9 years) | 835 |
| DiMasi and others (2010), lead indications | Phase-by-Phase | 71.0% | 19.0% | 45.0% | 26.8% | 60.0% | 59.5% | 19.0% | 1316 | 1993–2009 (17 years) | 50 |
`
`
Table 2. The POS by therapeutic group, using data from January 1, 2000, to October 31, 2015. We
computed this using the path-by-path method. SE denotes the standard error

All indications (industry)

| Therapeutic group | Phase 1 to Phase 2: Total paths | POS1,2, % (SE, %) | Phase 2 to Phase 3: Total paths | POS2,3, % (SE, %) | POS2,APP, % (SE, %) | Phase 3 to Approval: Total paths | POS3,APP, % (SE, %) | Overall: POS, % (SE, %) |
|---|---|---|---|---|---|---|---|---|
| Oncology | 17 368 | 57.6 (0.4) | 6533 | 32.7 (0.6) | 6.7 (0.3) | 1236 | 35.5 (1.4) | 3.4 (0.2) |
| Metabolic/Endocrinology | 3589 | 76.2 (0.7) | 2357 | 59.7 (1.0) | 24.1 (0.9) | 1101 | 51.6 (1.5) | 19.6 (0.7) |
| Cardiovascular | 2810 | 73.3 (0.8) | 1858 | 65.7 (1.1) | 32.3 (1.1) | 964 | 62.2 (1.6) | 25.5 (0.9) |
| CNS | 4924 | 73.2 (0.6) | 3037 | 51.9 (0.9) | 19.5 (0.7) | 1156 | 51.1 (1.5) | 15.0 (0.6) |
| Autoimmune/Inflammation | 5086 | 69.8 (0.6) | 2910 | 45.7 (0.9) | 21.2 (0.8) | 969 | 63.7 (1.5) | 15.1 (0.6) |
| Genitourinary | 757 | 68.7 (1.7) | 475 | 57.1 (2.3) | 29.7 (2.1) | 212 | 66.5 (3.2) | 21.6 (1.6) |
| Infectious disease | 3963 | 70.1 (0.7) | 2314 | 58.3 (1.0) | 35.1 (1.0) | 1078 | 75.3 (1.3) | 25.2 (0.8) |
| Ophthalmology | 674 | 87.1 (1.3) | 461 | 60.7 (2.3) | 33.6 (2.2) | 207 | 74.9 (3.0) | 32.6 (2.2) |
| Vaccines (Infectious Disease) | 1869 | 76.8 (1.0) | 1235 | 58.2 (1.4) | 42.1 (1.4) | 609 | 85.4 (1.4) | 33.4 (1.2) |
| Overall | 41 040 | 66.4 (0.2) | 21 180 | 58.3 (2.3) | 35.1 (2.2) | 7532 | 59.0 (0.6) | 13.8 (0.2) |
| All without oncology | 23 672 | 73.0 (0.3) | 14 647 | 27.3 (0.4) | 27.3 (0.4) | 6296 | 63.6 (0.6) | 20.9 (0.3) |

Lead indications (Industry)

| Therapeutic group | Phase 1 to Phase 2: Total paths | POS1,2, % (SE, %) | Phase 2 to Phase 3: Total paths | POS2,3, % (SE, %) | POS2,APP, % (SE, %) | Phase 3 to Approval: Total paths | POS3,APP, % (SE, %) | Overall: POS, % (SE, %) |
|---|---|---|---|---|---|---|---|---|
| Oncology | 3107 | 78.7 (0.7) | 1601 | 53.9 (1.2) | 13.1 (0.8) | 431 | 48.5 (2.4) | 11.4 (0.7) |
| Metabolic/Endocrinology | 2012 | 75.2 (1.0) | 1273 | 57.0 (1.4) | 26.4 (1.2) | 535 | 62.8 (2.1) | 21.3 (1.0) |
| Cardiovascular | 1599 | 71.1 (1.1) | 1002 | 64.9 (1.5) | 34.1 (1.5) | 473 | 72.3 (2.1) | 26.6 (1.2) |
| CNS | 2777 | 75.0 (0.8) | 1695 | 54.5 (1.2) | 24.1 (1.0) | 648 | 63.0 (1.9) | 19.3 (0.9) |
| Autoimmune/Inflammation | 2900 | 78.9 (0.8) | 1862 | 48.7 (1.2) | 24.3 (1.0) | 659 | 68.6 (1.8) | 20.3 (0.9) |
| Genitourinary | 568 | 73.4 (1.9) | 382 | 59.2 (2.5) | 31.9 (2.4) | 176 | 69.3 (3.5) | 25.3 (2.0) |
| Infectious Disease | 2186 | 74.6 (0.9) | 1326 | 58.0 (1.4) | 34.3 (1.3) | 594 | 76.6 (1.7) | 26.7 (1.1) |
| Ophthalmology | 437 | 89.0 (1.5) | 302 | 57.6 (2.8) | 30.5 (2.6) | 124 | 74.2 (3.9) | 30.7 (2.7) |
| Vaccines (Infectious Disease) | 881 | 75.8 (1.4) | 567 | 57.1 (2.1) | 40.4 (2.1) | 269 | 85.1 (2.2) | 31.6 (1.7) |
| Overall | 16 467 | 75.8 (0.3) | 10 010 | 55.9 (0.5) | 26.4 (0.4) | 3909 | 67.7 (0.7) | 21.6 (0.4) |
| All without oncology | 13 360 | 75.8 (0.4) | 8409 | 29.0 (0.5) | 29.0 (0.5) | 3478 | 70.0 (0.8) | 23.4 (0.4) |
`
`
`4.2. POS of trials with biomarkers
`
`As the use of biomarkers to select patients, enhance safety, and serve as surrogate clinical endpoints has
`become more common, it has been hypothesized that trials using biomarkers are more likely to succeed.
`We test this hypothesis by comparing the POS of drugs with and without biomarkers.
`We perform two separate analyses. In the first, we investigate the use of biomarkers only for patient
`selection, as did Thomas and others (2016). In the second, we expand the definition of a biomarker trial
`to include those trials with the objective of evaluating or identifying the use of any novel biomarkers as
`indicators of therapeutic efficacy or toxicity, in addition to those that use biomarkers for patient selection.
`In our database, only 7.1% of all drug development paths that use biomarkers use them in all stages of
`development. As such, we adopt the phase-by-phase approach instead of using the path-by-path approach.
`This is done by modifying Algorithm 1 (see Figure S5 of supplementary material available at Biostatistics
`online) to increment counts only if there exists a biomarker trial in that phase. Furthermore, as 92.3%
`of the trials using biomarkers in our database are observed only on or after January 1, 2005, we do not
`include trials before this date to ensure a fair comparison of the POS between trials that do and do not use
`biomarkers.
`Table 3 shows only trials that use biomarkers to stratify patients. As can be seen, there is substantial
variation in the use of biomarkers across therapeutic areas. Biomarkers are seldom used outside of oncology.
Trials using biomarkers exhibit almost twice the overall POS (POS1,APP) compared to trials without
`biomarkers (10.3% vs. 5.5%). While the use of biomarkers in the stratification of patients improves the
`POS in all phases, it is most significant in Phases 1 and 2. (We caution against over-interpreting the results
`for therapeutic areas outside oncology due to their small sample size.) These findings are similar in spirit
`to the analysis by Thomas and others (2016), which also found substantial improvement in the overall
`POS when biomarkers were used.
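As a worked check, the overall figures quoted above can be recovered from the phase-level estimates in the Overall rows of Table 3 by applying the phase-by-phase product of Equation 3.5. The inputs are the rounded table values, so the products agree with the reported figures only to rounding.

```latex
\begin{align*}
\mathrm{POS}^{p}_{1,\mathrm{APP}}(\text{with biomarker})
  &= 0.445 \times 0.386 \times 0.602 \approx 0.103, \\
\mathrm{POS}^{p}_{1,\mathrm{APP}}(\text{no biomarker})
  &= 0.347 \times 0.268 \times 0.590 \approx 0.055, \\
\frac{0.103}{0.055} &\approx 1.9 .
\end{align*}
```

The ratio of roughly 1.9 corresponds to the "almost twice" comparison of 10.3% against 5.5%.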
`However, when we expanded the definition of a biomarker trial to include trials with the objective of
`evaluating or identifying the use of any novel biomarker as an indicator of therapeutic efficacy or toxicity,
`in addition to the selection of patients, we obtained significantly different results (see Table S3 in Section
`A6 of the supplementary material available at Biostatistics online). Instead of finding a huge increase
`in the overall POS, we find no significant difference. It may be that trials that attempt to evaluate the
`effectiveness of biomarkers are more likely to fail, leading to a lower overall POS compared to trials that
`only use biomarkers in patient stratification. Comparison of the two tables shows that new biomarkers are
`being evaluated in all therapeutic areas.
`We provide a more detailed analysis of the differences between our analysis and Thomas and others
`(2016) in Section A7 of the supplementary material available at Biostatistics online.
`
4.3. POS of orphan drug trials
`
`Table 4 contains POS estimates for drugs that treat rare diseases, also known as ‘orphan drugs’. The
`classifications for rare diseases are obtained from both EU and US rare disease resources: OrphaNet and
`NIH GARD. Rare diseases may belong to any therapeutic group, and the computation of the statistics for
`orphan drugs is identical to that used for the trials in Table 2.
`Broadly speaking, orphan drug development has significantly lower success rates, with only 6.2%
`of drug development projects reaching the market. Comparing these results against those for all drug
`development, we see that, while the Phase 1 POS increases from 66.4% to 75.9%, the Phase 2 and Phase 3
`success rates fall from 58.3% to 48.8% and from 59.0% to 46.7%, respectively, leading to a decline in the
`overall POS.
`Our data reveal that most orphan drug trials are in oncology. Our overall POS of 6.2% is much lower than
`the 25.3% reported in Thomas and others (2016). This discrepancy can be attributed to their identification of
`only non-oncology indications as ‘rare diseases’ and their use of the phase-by-phase method of computing
`
`
`
`
Table 3. POS of drug development programs with and without biomarkers, using data from January 1, 2005, to October 31, 2015, computed
using the phase-by-phase method. These results consider only trials that use biomarkers in patient stratification. Since for the majority of trials
using biomarkers (92.3%) their status is observed only on or after January 1, 2005, the choice of the time period is to ensure a fair comparison
between trials using and not using biomarkers. SE denotes standard error

| Therapeutic group | Biomarkers | Phase 1 to Phase 2: Total phase transitions | POS1,2, % (SE, %) | Phase 2 to Phase 3: Total phase transitions | POS2,3, % (SE, %) | Phase 3 to approval: Total phase transitions | POS3,APP, % (SE, %) | Overall: POS, % (SE, %) |
|---|---|---|---|---|---|---|---|---|
| Oncology | No biomarker | 9349 | 28.0 (0.5) | 4773 | 17.4 (0.5) | 1159 | 33.6 (1.4) | 1.6 (0.2) |
| Oncology | With biomarker | 1136 | 43.5 (1.5) | 742 | 38.8 (1.8) | 77 | 63.6 (5.5) | 10.7 (1.9) |
| Oncology | All | 10 485 | 29.7 (0.4) | 5515 | 20.3 (0.5) | 1236 | 35.5 (1.4) | 2.1 (0.2) |
| Metabolic/endocrinology | No biomarker | 1532 | 44.5 (1.3) | 1438 | 33.9 (1.2) | 1086 | 52.0 (1.5) | 7.9 (0.8) |
| Metabolic/endocrinology | With biomarker | 7 | 57.1 (18.7) | 2 | 50.0 (35.4) | 15 | 20.0 (10.3) | 5.7 (13.9) |
| Metabolic/endocrinology | All | 1539 | 44.6 (1.3) | 1440 | 34.0 (1.2) | 1101 | 51.6 (1.5) | 7.8 (0.8) |
| Cardiovascular | No biomarker | 1241 | 39.6 (1.4) | 1027 | 37.9 (1.5) | 962 | 62.2 (1.6) | 9.3 (1.0) |
| Cardiovascular | With biomarker | 7 | 85.7 (13.2) | 5 | 100.0 (0.0) | 2 | 100.0 (0.0) | 85.7 (13.2) |
| Cardiovascular | All | 1248 | 39.9 (1.4) | 1032 | 38.2 (1.5) | 964 | 62.2 (1.6) | 9.5 (1.0) |
| CNS | No biomarker | 2181 | 40.4 (1.1) | 2050 | 30.2 (1.0) | 1141 | 51.1 (1.5) | 6.2 (0.6) |
| CNS | With biomarker | 42 | 54.8 (7.7) | 42 | 28.6 (7.0) | 15 | 53.3 (12.9) | 8.3 (6.4) |
| CNS | All | 2223 | 40.7 (1.0) | 2092 | 30.2 (1.0) | 1156 | 51.1 (1.5) | 6.3 (0.6) |
| Autoimmune/inflammation | No biomarker | 2506 | 38.9 (1.0) | 2106 | 25.4 (0.9) | 964 | 63.7 (1.5) | 6.3 (0.6) |
| Autoimmune/inflammation | With biomarker | 9 | 55.6 (16.6) | 14 | 35.7 (12.8) | 5 | 60.0 (21.9) | 11.9 (16.8) |
| Autoimmune/inflammation | All | 2515 | 39.0 (1.0) | 2120 | 25.5 (0.9) | 969 | 63.7 (1.5) | 6.3 (0.6) |
| Genitourinary | No biomarker | 359 | 34.3 (2.5) | 287 | 28.9 (2.7) | 212 | 66.5 (3.2) | 6.6 (1.5) |
| Genitourinary | With biomarker | 5 | 80.0 (17.9) | 0 | N.A. | 0 | N.A. | N.A. |
| Genitourinary | All | 364 | 34.9 (2.5) | 287 | 28.9 (2.7) | 212 | 66.5 (3.2) | 6.7 (1.5) |
| Infectious disease | No biomarker | 1961 | 39.7 (1.1) | 1453 | 34.7 (1.2) | 1069 | 75.1 (1.3) | 10.4 (0.9) |
| Infectious disease | With biomarker | 6 | 66.7 (19.2) | 27 | 44.4 (9.6) | 9 | 100.0 (0.0) | 29.6 (16.8) |
| Infectious disease | All | 1967 | 39.8 (1.1) | 1480 | 34.9 (1.2) | 1078 | 75.3 (1.3) | 10.5 (0.9) |
| Ophthalmology | No biomarker | 180 | 52.2 (3.7) | 274 | 34.7 (2.9) | 207 | 74.9 (3.0) | 13.6 (2.8) |
| Ophthalmology | With biomarker | 1 | 0.0 (0.0) | 3 | 33.3 (27.2) | 0 | N.A. | N.A. |
| Ophthalmology | All | 181 | 51.9 (3.7) | 277 | 34.7 (2.9) | 207 | 74.9 (3.0) | 13.5 (2.8) |
| Vaccines (infectious disease) | No biomarker | 733 | 40.8 (1.8) | 761 | 32.9 (1.7) | 609 | 85.4 (1.4) | 11.4 (1.3) |
| Vaccines (infectious disease) | With biomarker | 0 | N.A. | 5 | 0.0 (0.0) | 0 | N.A. | N.A. |
| Vaccines (infectious disease) | All | 733 | 40.8 (1.8) | 766 | 32.6 (1.7) | 609 | 85.4 (1.4) | 11.4 (1.3) |
| Overall | No biomarker | 20 042 | 34.7 (0.3) | 14 169 | 26.8 (0.4) | 7409 | 59.0 (0.6) | 5.5 (0.2) |
| Overall | With biomarker | 1213 | 44.5 (1.4) | 840 | 38.6 (1.7) | 123 | 60.2 (4.4) | 10.3 (1.6) |
| Overall | All | 21 255 | 35.2 (0.3) | 15 009 | 27.4 (0.4) | 7532 | 59.0 (0.6) | 5.7 (0.2) |
`
`
`Table 4. The POS of orphan drug development programs. We computed the results using the path-by-path
`method. While we used the entire data set from January 1, 2000, to October 31, 2015, it has to be noted
`that there are only 3548 data points relating to orphan drugs, with the majority (95.3%) of the trials’
`statuses observed on or after January 1, 2005. SE denotes standard error
`
`Orphan drugs (industry, all indications)
`
`Phase 1 to Phase 2
`
`Phase 2 to Phase 3
`
`Phase 3 to Approval
`
`Overall
`
| Therapeutic group | Phase 1 to Phase 2: Total paths | POS1,2, % (SE, %) | Phase 2 to Phase 3: Total paths |
|---|---|---|---|
| Oncology | 1245 | 72.0 (1.3) | 535 |
| Metabolic/endocrinology | 89 | 84.3 (3.9) | 45 |
| Cardiovascular | 115 | 69.6 (4.3) | 58 |
| CNS | 160 | 85.0 (2.8) | 96 |
| Autoimmune/inflammation | 228 | 76.3 (2.8) | 114 |
| Genitourinary | 14 | 100.0 (0.00) | 13 |
| Infectious disease | 157 | 89.2 (2.5) | 104 |
| Ophthalmology | 19 | 73.7 (10.1) | 7 |
| Vaccines (infectious disease) | 57 | 89.5 (4.06) | 43 |
| Overall | 2084 | 75.9 (0.9) | 1015 |
| All except oncology | 839 | 81.5 (1.3) | 480 |
`
`POS2,3