`
`RESEARCH
`
APPLICATION NUMBER:

21-272

STATISTICAL REVIEW(S)
`
`
`
`UniprostTM (treprostinol sodium)— NDA 21-272
`
`Page 1 of 20
`
`STATISTICAL REVIEW AND EVALUATION
`
`
NDA #: 21-272
Applicant: United Therapeutics Corporation
Name of Drug: Uniprost™ (treprostinol sodium)
Indication: Treatment for pulmonary arterial hypertension
Document reviewed: Volumes 2.1, 2.24, and 2.27-2.50
Date of submission: October 16, 2000
Statistical Reviewer: John Lawrence, Ph.D. (HFD-710)
Medical Reviewer: Abraham Karkowsky, M.D. (HFD-110)

1. Introduction
`
Uniprost™, or UT-15, is a structural analog of epoprostenol (Flolan®) with a
similar pharmacological profile. Flolan has been approved for the chronic treatment of
patients with primary pulmonary hypertension and has been used to treat patients with
pulmonary hypertension associated with other conditions. Unlike Flolan, Uniprost is
chemically stable at room temperature and it has a longer half-life than Flolan. For these
reasons, the sponsor believes that Uniprost would reduce the risks associated with treatment
and should be considered as an alternative therapy for pulmonary arterial hypertension
(PAH). The sponsor conducted two Phase III studies to support the safety
and efficacy of the treatment: Studies P01:04 and P01:05.
`
`2. Study Design
`
The designs of Studies P01:04 and P01:05 were identical. Each study was a
multicenter, double-blind, parallel-group study. Patients between the ages of 8 and 75
were eligible for each study if they had a current documented diagnosis of PAH. On Day
1 of the Screening Period, routine baseline assessments were performed. On Day 2, the
baseline Six-Minute Walk Test was administered. Patients whose baseline exercise
capacity was less than 50 m or greater than 450 m were excluded from entering the
Treatment Phase. Patients were randomized within strata determined by dichotomous
levels of etiology of the disease (primary PH/ secondary PH) and baseline exercise
capacity (low = 50-150 m/ high = 151-450 m). Randomization among patients with
secondary PH was further stratified by use of vasodilators. The 12-Week Treatment
Phase began immediately after baseline assessments and randomization on Day 2.
Six-Minute Walk Tests were scheduled at Day 9, Day 44, and Day 87.
`
`In order to select the sample size, an estimate of the expected treatment effect was
`made using data from a study using the active treatment Flolan. The treatment effect in
the Flolan study was an improvement of 45 m in change from baseline compared to
placebo. Assuming a treatment effect for Uniprost of 55 m over placebo, it was expected
that a sample size of 210 in a single study would provide a 95% chance of rejecting the
null hypothesis at α=0.05. So, the actual sample sizes of 224 in Study P01:04 and 246 in
P01:05 should have been adequate if the estimate of the treatment effect was reasonable.
`
Of the 470 patients randomized in both studies, 233 were assigned to receive the
active treatment and 237 received the placebo. One patient assigned to the placebo group
never received treatment. The remaining 469 patients constitute the modified
Intent-To-Treat population (mITT). In the mITT population, the average age was 44.5, there were
382 females and 87 males, 396 Caucasians, 21 Blacks, 13 Asians, 33 Hispanics, 2 Native
Americans, and 4 from a race other than those listed.
`
`Patients received an initial dose of Uniprost or placebo of 1.25 ng/kg/min. This
`was the maximum allowable dose at the end of Week 1, but could be decreased to a
`tolerated dose. Following Week 1, patients were contacted weekly to assess whether
`changes in dosage were warranted. The dose was increased if symptoms did not improve
and was reduced at the onset of any adverse experience that was judged to be related to
study drug, or if there were changes in hemodynamics, vital signs, or clinical signs or
symptoms that warranted reductions.
`
3. Primary Efficacy Variable
`
`The primary endpoint of the two studies was change in exercise capacity at Week
`12 as measured by distance walked in six minutes.
`
4. Secondary Efficacy Variables
`
`Three principal reinforcing endpoints were prospectively identified: signs and
`symptoms of PAH, Dyspnea-Fatigue Rating, and an assessment of the occurrence of
`death, transplantation, or discontinuation from study drug due to clinical deterioration.
`Hemodynamics and Borg Dyspnea Score were defined as secondary endpoints.
`
`5. Protocol Specified Planned Statistical Analysis
`
The primary analysis was a nonparametric analysis of covariance using the mITT
population and the pooled data from the two studies. There is no provision for analyzing
patients in the mITT population with no post-baseline walking distances. First, separate
least squares regression models were fit to the Week 1, Week 6, and Week 12 distance
walked as a function of baseline distance walked, center, etiology of PH (primary or
secondary), and vasodilator use at baseline. On p. 30 of the Final Analysis Plan [Vol.
2.33] an additional covariate for use of steroids to treat PHT at baseline is included.
However, this covariate is not listed on p. 90 of the Study Report [Vol. 2.27].
Standardized mid-ranks (also known as modified ridit scores), defined as
rank/(# observations + 1), were determined from the residuals from the ordinary least
squares regression. Missing values were imputed by carrying forward the standardized
mid-rank from the last valid observation. The lowest standardized rank (0) was assigned
to deaths, transplants, or clinical deterioration. Standardized mid-ranks were then
recalculated and compared between treatment groups using the Cochran-Mantel-Haenszel
procedure mean score statistic with table scores stratified by the stratification factors used
during randomization [Source: Vol. 2.27 pages 88-92].
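
The ranking steps described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the sponsor's or the reviewer's code: the function names and the toy residuals are hypothetical, the regression that produces the residuals is omitted, and every patient is assumed to have at least one post-baseline rank to carry forward.

```python
def standardized_midranks(values):
    """Return rank / (n + 1) for each value; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    return [r / (n + 1) for r in ranks]


def carry_forward(scores_by_visit):
    """Return the last non-missing standardized rank (None if all are missing)."""
    last = None
    for score in scores_by_visit:
        if score is not None:
            last = score
    return last


def week12_scores(carried, worst_outcome):
    """Assign 0 to deaths, transplants, or clinical deterioration, then re-rank."""
    adjusted = [0.0 if bad else s for s, bad in zip(carried, worst_outcome)]
    return standardized_midranks(adjusted)


# Hypothetical residuals from one visit's regression (not study data).
residuals = [3.0, -1.0, 2.0, 2.0]
scores = standardized_midranks(residuals)
```

In this toy example the two tied residuals share the mid-rank 2.5, so each receives a standardized score of 2.5/5.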
`
`According to a letter from the sponsor dated March 23, 2000, the analysis plan
`was modified slightly: if an exercise test is missing because “patient was too critically
`ill”, the lowest standardized rank will be used for the nonparametric analysis.
`
The null hypothesis of no treatment difference was to be rejected if the two-sided
p-value from the pooled analysis was less than 0.049 and both of the p-values from the
individual studies were less than 0.049. This is the traditional standard for two
confirmatory studies with an adjustment because the sponsor wanted to test the null
hypothesis within the subgroup of PPH patients at α=0.001. If the global null hypothesis
was not rejected, then the protocol states the null hypothesis would be rejected if the
p-value from the pooled analysis was less than 0.01 and at least one of the analyses from a
single study had a p-value less than 0.049. This gives the sponsor a second chance to
reject the null hypothesis. This issue is discussed more thoroughly in Section 7.
`
`6. Characteristics of Patients at Baseline and Dropouts
`
`The baseline characteristics of the patients in the two treatment arms for the two
`studies are in Table 6.1. There was no significant difference between the two treatment
`arms with respect to any of these characteristics.
`
Table 6.1 Characteristics of the patients in the two groups at baseline. For continuous
variables, this table shows the group mean ± standard error of mean. [Source: Vol. 2.27,
Tables 11.2.1, 11.2.2.1, and 11.2.2.4]
`
[Table 6.1 did not survive extraction intact. Columns: Characteristic, Uniprost Group, Placebo Group. The legible entries are grouped below; values cannot be matched to rows except where shown.]

Legible row labels: N; Age (years); Limited Scleroderma %; Mixed Connective Tissue Disease %; Systemic Lupus Erythematosus %; Overlap Syndrome %; congenital systemic-to-pulmonary shunts %; Distance walked at baseline (m)

Legible values not matched to a row: 15.5; 21.6; 85; 84; 4.3 ± 0.5; 3.3 ± 0.4; 82; 5; 3; 0.4; 25; 12; 2; 22

Distance walked at baseline (m): 326 ± 5.5; 327 ± 5.7

In the Uniprost group, 200 patients completed the 12 weeks of treatment, 6
patients discontinued due to clinical deterioration, 18 withdrew for adverse experiences, 7
died on study drug, and 2 withdrew consent. In addition to the 7 patients who died on
study drug, 2 more patients died within 12 weeks of randomization after they had
withdrawn from the study. A total of 13 patients withdrew for death, transplantation, or
clinical deterioration [Source: Vol. 2.27 Tables 10.1A, 11.4.1.2.3 and 12.5.5.].
`
In the placebo group, 221 patients completed the 12 weeks of treatment, 6 patients
deteriorated, 1 withdrew for adverse experiences, 7 died on study drug, 1 patient had a
transplant, and 1 withdrew consent. In addition to the 7 patients who died on study drug,
3 more patients died within 12 weeks of randomization after they had withdrawn
from the study. A total of 16 patients withdrew for death, transplantation, or clinical
deterioration [Source: Vol. 2.27 Tables 10.1.4, 11.4.1.2.3 and 12.5.5.].
`
`In the mITT population, one patient did not have any exercise tolerance
`measurements post baseline, 455 patients had a Six-Minute Walk Test at Week 1, 468
`patients had a Six-Minute Walk Test at Week 6, and 419 patients had a Six-Minute Walk
`Test at Week 12 [Source: Vol. 2.27 Tables 11.4.1.1.23, 11.4.1.1.4G, and 11.4.1.1.4H].
`
`7. Statistical Comments About the Analysis Plan
`
`The decision to impute a worst possible score for those patients who died or
`discontinued for transplantation or clinical deterioration is reasonable. A nonparametric
`analysis is suitable because we can then assign a worst score, or a rank of 0, for these
`patients. It might be more appropriate to rank all the patients who died below those who
`discontinued for clinical deterioration and those patients, in turn, below all those who
`completed the study. The relative ranks among those patients who died and among those
`patients who discontinued for clinical deterioration can be determined by length of time
`in the study. However, there were roughly the same number of patients in each arm who
`died and discontinued for clinical deterioration, so this will not likely have an impact
`here.
`
`However, there was a substantial imbalance in the number of patients who
`discontinued the study due to serious adverse experiences (18 versus 1). These patients
`all had their last rank carried forward in the analysis, rather than a worst rank assigned.
When it is not entirely clear whether serious adverse experiences can also be associated
with clinical deterioration or vice versa, assigning these patients a worst rank may be
needed. As a supportive analysis, it may be illustrative to see the impact of carrying the
last rank forward for these patients by repeating the analysis with a rank of 0 assigned to them as well.
`
`A more important issue is the overall Type I error rate for the proposed analysis in
`this submission. First, consider the traditional standard for approval at the FDA based on
`two confirmatory trials. Even if the efficacy of a treatment is shown convincingly in one
`study, the agency likes to see replication in a second study because we will then be in a
`better position to infer that the results generalize to the entire population of patients with
`the disease. The overall Type I error rate (or false positive rate) is the chance that both
`studies will have a p-value less than 0.05 and the results of both studies are in the same
`direction. If the treatment effects in the two studies are identically 0, then the chance that
`both p-values will be less than 0.05 and both treatment effects are in the same direction is
0.00125¹. For this reason, the Division of Cardio-Renal Drugs has often advised
sponsors that one study with a p-value less than 0.00125 may be sufficient for approval.
`When there is no between trial variability in the treatment effect, these two standards are
`indeed equivalent.
`
`Now, consider the approach that is used in this submission. We will reject the
`null hypothesis of no treatment effect under either of these two circumstances:
`
1) both studies have p-values <0.049 and the pooled data has a p-value <0.049
2) either study has a p-value <0.049 and the pooled data has a p-value <0.01
`
`Furthermore, if neither 1) nor 2) occurs, we will reject the null hypothesis of no treatment
`effect in the subgroup of PPH patients under the following condition:
`
`3) the data on PPH patients pooled from both studies has a p-value <0.001.
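
The three conditions can be written as a small decision function. This is a sketch under stated assumptions rather than anything taken from the protocol: the function name and argument names are hypothetical, all p-values are taken to be two-sided, and the direction of the effect is not checked.

```python
def rejects_null(p_study1, p_study2, p_pooled, p_pph_pooled):
    """Return the number of the first condition that is met, or None if none is met."""
    # Condition 1: both studies and the pooled data below 0.049.
    if p_study1 < 0.049 and p_study2 < 0.049 and p_pooled < 0.049:
        return 1
    # Condition 2: either study below 0.049 and the pooled data below 0.01.
    if min(p_study1, p_study2) < 0.049 and p_pooled < 0.01:
        return 2
    # Condition 3: pooled PPH subgroup below 0.001 (a claim for PPH patients only).
    if p_pph_pooled < 0.001:
        return 3
    return None


# Hypothetical p-values, chosen only to exercise each branch.
example_cond2 = rejects_null(0.03, 0.20, 0.004, 0.50)
example_none = rejects_null(0.009, 0.10, 0.015, 0.50)
```

The second hypothetical call returns None because the pooled p-value of 0.015 misses the 0.01 threshold even though one study is well below 0.049.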
`
¹ P[first p-value < 0.05 and second p-value < 0.05 and direction is the same]
= P[first p-value < 0.05] × P[second p-value < 0.05] × P[direction is the same] = 0.05 × 0.05 / 2 = 0.00125
According to this reviewer's simulation, if 40% of the patients have PPH then the overall
`Type I error rate for the criteria used in this submission is 0.01. However, it is widely
`recognized that even when the designs are identical, the treatment effect may vary from
`study to study. If there is any between trial variability in the treatment effect, the chance
`that any of the three conditions will hold is inflated. The appendix of this review
`illustrates this in more detail.
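
A simulation of this kind can be sketched as follows. It is not the reviewer's program, and its setup is an assumption made for illustration: two equally sized studies, independent standard normal test statistics for the PPH and non-PPH subgroups within each study, no between-trial variability, two-sided p-values, and a same-direction requirement for Condition 1. All function and variable names are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_type1_error(n_sims=200_000, pph_fraction=0.40, seed=1):
    """Estimate P(Condition 1, 2, or 3 is met) when the true treatment effect is zero."""
    rng = random.Random(seed)
    norm = NormalDist()
    z049 = norm.inv_cdf(1 - 0.049 / 2)  # two-sided p < 0.049
    z010 = norm.inv_cdf(1 - 0.010 / 2)  # two-sided p < 0.01
    z001 = norm.inv_cdf(1 - 0.001 / 2)  # two-sided p < 0.001
    w_pph = pph_fraction ** 0.5
    w_other = (1 - pph_fraction) ** 0.5
    hits = 0
    for _ in range(n_sims):
        # Independent subgroup z-statistics within each of the two studies.
        pph = [rng.gauss(0, 1), rng.gauss(0, 1)]
        other = [rng.gauss(0, 1), rng.gauss(0, 1)]
        study = [w_pph * pph[i] + w_other * other[i] for i in range(2)]
        pooled = (study[0] + study[1]) / 2 ** 0.5
        pph_pooled = (pph[0] + pph[1]) / 2 ** 0.5
        same_sign = study[0] * study[1] > 0
        cond1 = (same_sign
                 and min(abs(study[0]), abs(study[1])) > z049
                 and abs(pooled) > z049)
        cond2 = abs(pooled) > z010 and max(abs(study[0]), abs(study[1])) > z049
        cond3 = abs(pph_pooled) > z001
        hits += cond1 or cond2 or cond3
    return hits / n_sims


rate = simulate_type1_error()
```

Under these assumptions the estimate should land near 0.01, with Condition 2 accounting for most of the rejections.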
`
`An overall Type I error rate of 0.01 is already more liberal than the error rate of
`0.00125 for the traditional FDA approach. Now, if we include other conditions that were
`not pre-specified under which the sponsor can claim that efficacy was demonstrated, the
`Type I error rate will be inflated even further. For instance, suppose one p-value from an
`individual study had been 0.009 and the second had been 0.10 and the p-value from the
`pooled data was 0.015. Someone might look at this and argue that the drug should be
`approved because Condition 2 was almost satisfied since the p-value from one study was
significantly less than 0.049 and the second was in the right direction and the p-value
`from the pooled data was really close to 0.01. However, if we allow this to happen, then
`it is possible that our minds cannot stretch wide enough to imagine all of the possible
`scenarios that are "close enough" and therefore, we have no hope of calculating, much
`less controlling, the real Type I error rate.
`
There are many possible ways to calculate an overall p-value from this experiment
and therefore, there is no correct way to do this. In order to make things simple, assume
that the statistic is univariate and has a standard normal distribution under the null
hypothesis. We create a test by prospectively specifying a critical region, which defines
the set of values for the statistic for which the null hypothesis will be rejected. If the
significance level is 0.05, then the probability of observing a value in the critical region is
0.05 if the null hypothesis is true. Now, suppose we prospectively define the critical
region to be all numbers greater than 1.96 in absolute value, but when we actually do the
experiment, we observe a value of 1.7. The p-value is the probability of observing
something as extreme or more extreme than 1.7. In this case, nobody would argue that
any value greater than 1.7 in absolute value is more extreme, so the p-value is 2Φ(-1.7) =
0.089.
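
The univariate calculation is easy to reproduce. The helper below is a hypothetical illustration that uses only the Python standard library.

```python
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2 * NormalDist().cdf(-abs(z))


p = two_sided_p(1.7)  # approximately 0.089
```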
`
The situation here is more complex because the outcome is not univariate. There
are outcomes from two studies and the outcome of the data pooled together and the
outcome from the analysis of the PPH subgroup. When the outcome is not univariate, it
is harder to see what is more extreme than what was actually observed. Clearly, if the
observed value is not in the critical region, then anything in the critical region would have
to be considered more extreme. The approach that would give the smallest p-value is to
assume that only the exact outcome that was observed or anything in the critical region is
counted in computing the p-value. Figure 7.1 illustrates in two dimensions several
possible regions that could be used to calculate the p-value. In these figures, the gray area
represents the region that is as extreme or more extreme in calculating the p-value. Figure
7.1A corresponds to the region where only the critical region and the actual observed
value are considered to be as extreme or more extreme. Figure 7.1B corresponds to the
region where only the critical region and a very small set of values that connect the
critical region to the actual observed value are considered as extreme or more extreme.
The other two figures allow more scenarios that were not actually observed to be
considered as extreme or more extreme than what was actually observed. Nobody knows
the right way to calculate the p-value and that is why we have to prospectively specify
what outcomes we might observe in this experiment that would convince us that the null
hypothesis does not adequately explain the data.
`
`The goal of the agency is not only controlling the Type I error rate, i.e. making
`sure that ineffective drugs are not approved. It is also important to make sure that
`effective drugs do get approved. Is the bar set too high in the protocol? Assume that the
`real average treatment effect across studies is 45 m. This represents a 14% increase from
`baseline assuming that the placebo group is unchanged and is equal to the observed effect
`in the Flolan study and is a smaller effect than the sponsor expected for this drug. The
`probability that Conditions 1, 2, or 3 would be satisfied is 0.999. Using the FDA
`traditional standard (similar to Condition 1 alone), the probability of two positive trials is
`96%. So, the bar is not set too high by either the traditional FDA criteria or the actual
criteria stated in the protocol. To put it simply, a drug that allows patients in this
population to improve walking distance by an average of 45 m more than placebo should
have no trouble demonstrating this in these two studies. The reader is again referred to
the appendix for an illustration of the power when there is between study variability in the
treatment effect.

Figure 7.1 Different regions that could define values as extreme or more extreme than
the observed value. [Figure not reproduced in this text; the only legible label is "Observed".]
`
8. Primary Analysis

Using the pre-specified analysis, the study report indicates that the p-values from
the primary analysis for the pooled studies, Study P01:04 alone, and Study P01:05 alone
were 0.0064, 0.0607, and 0.0550 respectively. The median change from baseline in the
treatment group using the pooled data was 10 m and in the individual studies, the median
changes were 3 m and 16 m. The median change from baseline in the placebo group using
the pooled data was 0 m and in the individual studies the median changes were 1 m and
-3 m [Source: Vol. 2.27 Table 11.4.1.1.1A]. The results of the sponsor's analysis are
summarized in Table 8.1.
`
Table 8.1. Results from sponsor's primary analysis. Baseline and Week 12 walking
distance and change from baseline are summarized by median and the first and third
quartiles. [Source: Vol. 2.27 Tables 11.2.2.4 and 11.4.1.1.1A except where noted].

[The table layout did not survive extraction. Columns: Study, Group, Baseline, Week 12*, Change, P-value. Legible entries are grouped by type below; apart from the entries shown with a label, they cannot be matched to rows.]

Studies: Pooled; P01:04; P01:05
Group labels: Treatment; Placebo
Group sizes: n=119; n=236; n=232; n=111; n=113; n=125
Following the label Placebo: Baseline 349 m; Week 12 346 m; Change 1.0 m
Other medians (Baseline or Week 12): 341 m; 338 m; 340 m; 348 m
Quartile pairs (Baseline or Week 12): 268, 396; 304, 404; 272, 396; 277, 400; 264, 395; 304, 402; 272, 407; 264, 390; 275, 400; 306, 400; 272, 377; 293, 400
Other median changes: 3.0 m; -3.0 m; 16.0 m
Quartile pairs for change: -22.0, 50.0; 44.5, 32.5; -24.5, 47.5; -53.0, 30.8; -27.4, 36.6; -37.0, 35.0
P-values: 0.0064; 0.0607; 0.0550

* This column was produced by the FDA reviewer from all the observed data at Week 12 for completeness
of the table (no imputation was done for missing values). The reviewer could not find this information in
the sponsor's report.

The FDA's interpretation of the primary analysis differs from the sponsor's in a
few minor ways. These differences arise from issues that were not prospectively defined
in the protocol.

Patient number 7004: This patient was assigned to treatment and had a baseline walking
distance of 345 m. This patient had a Week 1 walking distance of 393 m and a Week 12
walking distance of 398 m. No Week 6 walking distance was measured because the
patient was too critically ill. The sponsor uses the Week 12 walking distance to calculate
a score for this patient while the FDA analysis imputes a worst score for this patient. The
letter dated March 23, 2000 states: In addition to the descriptions of the handling of
missing data in Table 8.3.1 on page 14 of the final analysis plan, if an exercise test is
missing because "patient was too critically ill", the lowest standardized rank will be
used for the nonparametric analysis and a distance of 0 meters will be used for the
parametric analysis. Data missing for any other reason will have last standardized ranks
carried forward for the nonparametric analyses and last observations carried forward
for the parametric analyses. The literal interpretation of this is that if any ETT is
missing, the patient gets a worst score, not only if the Week 12 ETT is missing. This is
not just a technical semantic argument: it is difficult to understand why patients who were
too ill to walk at Week 12 should be analyzed differently than those who were too ill to
walk at Week 6 because there was already a method defined prospectively for imputing a
score for patients with no walking distance measured at Week 12.
`
Patient number 10507: This patient was assigned to the active treatment arm and had a
baseline walking distance of 183 m but no subsequent walking distances were measured.
The patient withdrew on day 9 for an adverse event. The last day of follow-up on the
patient was 39 days after randomization. There are several ways to handle this patient,
including: a) analyze the data without this patient; b) fit a regression of baseline vs. the
remaining covariates and carry forward the standardized rank for this patient; c) carry
forward a worst rank. The sponsor uses the first approach. Since this patient is included
in the mITT population, it does not seem reasonable to ignore this patient. There is a
strong argument for imputing a worst possible score because of the circumstances.
Approach b) is in the same spirit as the planned analysis: values for patients who do not have
complete follow-up are imputed by carrying forward the last value after adjusting for
several covariates. This approach is not perfect because patients with lower baseline
distances tended to show greater improvement. Therefore, this approach will tend to carry forward
a smaller rank than that which would be used if post-baseline walking distances were
observed. In this case, approach b) would carry forward a standardized rank of 0.138 for
this patient. This is the approach used in the FDA analysis.
`
Patient number 52006: This patient was assigned to placebo and had only the first
walking distance measured post-baseline. The patient died within 100 days of
randomization. Since the assessment window for all measurements at week 12 extends to
Study Day 100, this patient is assigned a worst possible score in the FDA analysis. The
last observed standardized rank at Week 1 is used by the sponsor.
`
`Patient number 61008: This patient was assigned to placebo and had a baseline walking
`distance of 357 m. This patient had a Week 1 walking distance of 338 m and a Week 12
walking distance of 256 m. No Week 6 walking distance was measured because the
`patient was too critically ill. The sponsor uses the Week 12 walking distance to calculate
`a score for this patient while the FDA analysis imputes a worst score for this patient.
`
`Patient number 18501: This patient was assigned to the placebo group and had a baseline
`walking distance of 362 m. Subsequent walking distances were measured 35, 55, and 71
`days after randomization. The first two of these fell within the window that would be
`counted in the Week 6 visit, but the last did not fall within the Week 6 or the Week 12
`window. The idea of the imputation used in the primary analysis is to compare
`measurements between individuals at the same time in the study (using residuals from the
`linear regression) and to carry the ranks forward. There was no other patient that had a
`measurement between the windows for Week 6 and Week 12. Hence, it is not possible to
`calculate a rank for the measurement on day 71 for this patient. So, two alternatives are i)
`carry the actual observation at day 71 to Week 12 and do the entire analysis as if it were a
`Week 12 observation or ii) find the rank of the residual for the day 55 observation and
`carry this rank forward (in other words, ignore the unscheduled measurement at day 71
`entirely). The sponsor uses alternative i) and the FDA uses alternative ii).
`
Patient 60005: This patient was assigned to active treatment and withdrew informed consent after 46 days.
The patient was followed up after withdrawal and had a Week 12 walking distance
measured. The sponsor's analysis uses the measurement at Week 12 while the FDA
carries the standardized rank from Week 6 (the last observation before the patient withdrew).
`
`Patients 2004, 52003 and 52004: All were assigned to placebo and correctly received
`placebo treatment for the first 6 weeks on study. However, they were inadvertently
`switched to active treatment for the last 6 weeks of the study. The sponsor carries
`forward the standardized rank from week 6 for these patients, while the FDA uses the
`week 12 walking distance.
`
`Both the FDA and the sponsor's analysis begin by finding the standardized ranks
`of the residuals from linear regression models at Weeks 1, 6, and 12. These regression
`models included main effects for etiology, baseline distance walked, vasodilator use, and
`center. The residuals from these linear regression models were ranked and the last
`observed rank was carried forward to Week 12 but a value of 0 (worst case) was assigned
`for patients who died or discontinued for clinical deterioration or were too ill to take the
`ETT. The pre-specified analysis is the CMH (mean score) statistic adjusted for the
`stratification variables used at randomization. The Final Study Report indicates that
`because of the low number of patients with low baseline walking distance (defined as less
`than 150 m), the primary analysis was modified to not include baseline as a covariate.
`The FDA analysis uses baseline distance as a covariate and finds the significance of the
`mean score statistic from the asymptotic chi-square approximation except in the case of
the P01:05 study where the permutation distribution was used. The reason for the use of
`the permutation distribution to find the p-value is that in one stratum, there was only one
`patient and this causes one term in the asymptotic formula to have a zero denominator.
`The p-value from the FDA analysis for the data from both studies pooled together is
`0.0153 and the p-values from the individual studies are 0.104 and 0.081.
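
The zero denominator mentioned above is easy to see in a sketch of a stratified mean score statistic for two groups. The code below is an illustration, not the program used in either analysis: the function name and the toy scores are hypothetical, it uses the standard form in which each stratum's variance term is divided by (n - 1), and it simply skips one-patient strata, which is an assumption of this sketch (the FDA analysis used the permutation distribution instead).

```python
from math import erfc, sqrt


def cmh_mean_score(strata):
    """strata: list of (treatment_scores, placebo_scores) pairs, one pair per stratum.

    Returns (Q, p), where Q is referred to a chi-square distribution with 1 df.
    """
    diff_sum = 0.0
    var_sum = 0.0
    for treated, placebo in strata:
        scores = list(treated) + list(placebo)
        n, n1 = len(scores), len(treated)
        if n < 2:
            # The variance term divides by n - 1, so a one-patient stratum
            # gives 0/0. It is skipped here (an assumption of this sketch).
            continue
        mean = sum(scores) / n
        # Observed minus expected sum of scores in the treatment group.
        diff_sum += sum(treated) - n1 * mean
        ss = sum((s - mean) ** 2 for s in scores)
        var_sum += n1 * (n - n1) * ss / (n * (n - 1))
    q = diff_sum ** 2 / var_sum
    p = erfc(sqrt(q / 2))  # upper tail of a chi-square with 1 df
    return q, p


# Hypothetical standardized ranks: one ordinary stratum and one single-patient stratum.
q, p = cmh_mean_score([([0.8, 0.6], [0.2, 0.4]), ([0.5], [])])
```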
`
The analysis that uses data only from those patients with PPH did not
`convincingly show a benefit in this subgroup (p=0.0433 for both studies pooled together
`[Source: Study Report Table 11.4.1.1.5, not verified by the FDA]).
`
Whether one uses the sponsor's or the FDA's primary analysis, it is clear that the
pre-specified criteria were technically not met, but there appears to be some evidence of
efficacy in these two studies. In Sections 9 and 10, some supportive analyses are
presented that may be helpful in making a decision about approval.
`
9. Sponsor's Supportive Analysis of Primary Efficacy Variable
`
`The report contains several planned and unplanned supportive analyses of the
`primary endpoint. This review will discuss two of these supportive analyses. For the first
`supportive analysis, the primary analysis was repeated using the per-protocol population.
All patients who did not follow the protocol, using pre-specified criteria, were removed
in this analysis. The p-values from the individual studies are 0.103 and 0.086 and the
p-value for the pooled data is 0.015 [Source: Vol. 2.27 Table 11.4.1.1.23].
`
For the second supportive analysis, the mITT population was used but the method
of imputing missing values was modified. Recall that for the primary analysis, worst
possible ranks were imputed for discontinuations due to death, transplants, or clinical
deterioration while the last rank was carried forward for discontinuations due to other
reasons. In this supportive analysis, the last rank was carried forward for all patients
without a measurement at Week 12, regardless of the reason. Using this approach, the
p-values for the individual studies were 0.083 and 0.075 and the p-value from the pooled
data is 0.011 [Source: Vol. 2.27 Table 11.4.1.1.4B].
`
In summary, both of these supportive analyses tend to show the same thing as the
primary analysis by the sponsor. That is, both studies taken individually show that the
drug was numerically, but not significantly, better than placebo. Since the results of the
two studies are consistent, when the data from both studies are combined, the p-value
from the pooled analysis is smaller than either p-value from the individual studies.
`
10. FDA's Supportive Analysis of Primary Efficacy Variable
`
`The primary analysis is a nonparametric analysis. One of the main arguments for
`a nonparametric analysis is that a patient who dies or discontinues for clinical
`deterioration should be counted as having a worse outcome than any patient who
`completed the study. If we do not use ranks, then we would have to answer the question
`of what walking distance at Week 12 should we assign to these patients. The use of ranks
`takes some of the subjectivity out of the process. One of the drawbacks of this
`nonparametric analysis is that it does not yield an easily interpretable estimate of the
`treatment effect.
`
A linear mixed effect model can be used here as an exploratory analysis in order
to see the treatment effect over time. The model that we will use makes the assumption
that those patients who discontinue early, regardless of the reason, would have walking
distances similar to those patients who completed the study. In other words, if a patient
in the placebo group had a Week 6 walking distance but no Week 12 measurement, then
`the model can be used to predict a Week 12 observation for this patient by using the data
`from the other patients that have similar characteristics to this one. Since each patient
`would theoretically have three measurements post-baseline, the change from baseline was
`modeled as a quadratic function of time. The specific linear model that was used includes
`fixed effects for treatment group, baseline distance walked, etiology, vasodilator use
`among secondary PH patients, and time as a quadratic function. In addition, all two-way
`interactions between treatment group and the other variables as well as the two-way
`interactions between stratification (etiology/ vasodilator use) and time were included in
`the model. There were random effects for the intercept, slope, and the quadratic term for
`time. The strategy was to specify a complex model and let the data decide which terms
were important. The curves for each stratification level at the average baseline walking
distance are shown in Figure 10.1.
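
One way to write down a model of this form is given below. The notation is introduced here for illustration and does not appear in the review: y_ij is the change from baseline for patient i at visit j, t_ij is days from randomization, x_ij collects the fixed-effect terms described above (treatment group, baseline distance, etiology, vasodilator use, the linear and quadratic time terms, and the listed two-way interactions), and the normal distributions for the random effects and the residual error are the usual assumption rather than something the review states.

```latex
\begin{aligned}
y_{ij} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
          + b_{0i} + b_{1i}\,t_{ij} + b_{2i}\,t_{ij}^{2} + \varepsilon_{ij},\\
(b_{0i},\, b_{1i},\, b_{2i})^{\top} &\sim N(\mathbf{0}, \mathbf{G}), \qquad
\varepsilon_{ij} \sim N(0, \sigma^{2}).
\end{aligned}
```

The three random effects correspond to the patient-specific intercept, slope, and quadratic term for time.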
`
`Figure 10.1 Fitted curves from linear mixed effects model at the average baseline value.
`USPHV=Uniprost, secondary PH, vasodilator use; PboPPH=Placebo, PPH, etc.
`
[Figure not reproduced in this text. Vertical axis: Change from Baseline. Horizontal axis: Days from Randomization.]
`
`From Figure 10.1, it appears that at Week 1, patients in all strata in the placebo
`group improved walking distance by an average of about 10 m, but over the course of the
`trial, the improvement from baseline decreased slightly. In the Uniprost group, the
`change at Week 1 was about 30 m in the SPH vasodilator subgroup and about 20 m in the
`other two subgroups, but over the course of the trial, the improvement was maintained or
`increased slightly.
`
`Although the large change from baseline in all subgroups at day 7 appears to be
`unusual because of the low starting dose and the short amount of time involved, this is
`not just