Table of Contents
A. GENERAL
STATISTICAL
STATISTICAL MODEL
STATISTICAL APPROACHES FOR BIOEQUIVALENCE
A. AVERAGE BIOEQUIVALENCE
POPULATION BIOEQUIVALENCE
INDIVIDUAL BIOEQUIVALENCE
STUDY DESIGN
EXPERIMENTAL DESIGN
SAMPLE SIZE AND DROPOUTS
STATISTICAL ANALYSIS
LOGARITHMIC TRANSFORMATION
DATA ANALYSIS
MISCELLANEOUS ISSUES
STUDIES IN MULTIPLE GROUPS
CARRYOVER EFFECTS
OUTLIER CONSIDERATIONS
REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D
APPENDIX E
APPENDIX F
APPENDIX G
APPENDIX H


`This guidance represents the Food and Drug Administration's current thinking on this topic. It
`does not create or confer any rights for or on any person and does not operate to bind FDA or the
`public. An alternative approach may be used if such approach satisfies the requirements of the
`applicable statutes and regulations.
`This guidance provides recommendations to sponsors and applicants who intend, either before or after
`approval, to use equivalence criteria in analyzing in vivo or in vitro bioequivalence (BE) studies for
`investigational new drug applications (INDs), new drug applications (NDAs), abbreviated new drug
`applications (ANDAs) and supplements to these applications. This guidance discusses three
`approaches for BE comparisons: average, population, and individual. The guidance focuses on how to
`use each approach once a specific approach has been chosen. This guidance replaces a prior FDA
`guidance entitled Statistical Procedures for Bioequivalence Studies Using a Standard Two-
`Treatment Crossover Design, which was issued in July 1992.
`Requirements for submitting bioavailability (BA) and BE data in NDAs, ANDAs, and
`supplements, the definitions of BA and BE, and the types of in vivo studies that are appropriate
`to measure BA and establish BE are set forth in 21 CFR part 320. This guidance provides
`recommendations on how to meet provisions of part 320 for all drug products.
Defined as relative BA, BE involves comparison between a test (T) and reference (R) drug
product, where T and R can vary, depending on the comparison to be performed (e.g.,
to-be-marketed dosage form versus clinical trial material, generic drug versus reference listed drug,
drug product changed after approval versus drug product before the change). Although BA
and BE are closely related, BE comparisons normally rely on (1) a criterion, (2) a confidence
interval for the criterion, and (3) a predetermined BE limit. BE comparisons could also be used
in certain pharmaceutical product line extensions, such as additional strengths, new dosage
forms (e.g., changes from immediate release to extended release), and new routes of
administration. In these settings, the approaches described in this guidance can be used to
determine BE. The general approaches discussed in this guidance may also be useful when
assessing pharmaceutical equivalence or performing equivalence comparisons in clinical
pharmacology studies and other areas.
1 This guidance has been prepared by the Population and Individual Bioequivalence Working Group of the
Biopharmaceutics Coordinating Committee in the Office of Pharmaceutical Science, Center for Drug Evaluation and
Research (CDER) at the Food and Drug Administration (FDA).
`In the July 1992 guidance on Statistical Procedures for Bioequivalence Studies Using a
`Standard Two-Treatment Crossover Design (the 1992 guidance), CDER recommended that
`a standard in vivo BE study design be based on the administration of either single or multiple
`doses of the T and R products to healthy subjects on separate occasions, with random
`assignment to the two possible sequences of drug product administration. The 1992 guidance
`further recommended that statistical analysis for pharmacokinetic measures, such as area under
`the curve (AUC) and peak concentration (Cmax), be based on the two one-sided tests
`procedure to determine whether the average values for the pharmacokinetic measures
`determined after administration of the T and R products were comparable. This approach is
`termed average bioequivalence and involves the calculation of a 90% confidence interval for
`the ratio of the averages (population geometric means) of the measures for the T and R
`products. To establish BE, the calculated confidence interval should fall within a BE limit,
`usually 80-125% for the ratio of the product averages.2 In addition to this general approach,
`the 1992 guidance provided specific recommendations for (1) logarithmic transformation of
`pharmacokinetic data, (2) methods to evaluate sequence effects, and (3) methods to evaluate
`outlier data.
Although average BE is recommended for a comparison of BA measures in most BE studies,
this guidance describes two new approaches, termed population and individual
bioequivalence. These new approaches may be useful, in some instances, for analyzing
in vitro and in vivo BE studies.3 The average BE approach focuses only on the comparison of
population averages of a BE measure of interest and not on the variances of the measure for the
T and R products. The average BE method does not assess a subject-by-formulation
interaction variance, that is, the variation in the average T and R difference among individuals.
In contrast, population and individual BE approaches include comparisons of both averages and
variances of the measure. The population BE approach assesses total variability of the measure
in the population. The individual BE approach assesses within-subject variability for the T and
R products, as well as the subject-by-formulation interaction.
2 For a broad range of drugs, a BE limit of 80 to 125% for the ratio of the product averages has been adopted
for use of an average BE criterion. Generally, the BE limit of 80 to 125% is based on a clinical judgment that a test
product with BA measures outside this range should be denied market access.
3 For additional recommendations on in vivo studies, see the FDA guidance for industry on Bioavailability
and Bioequivalence Studies for Orally Administered Drug Products - General Considerations. Additional
recommendations on in vitro studies will be provided in an FDA guidance for industry on Bioavailability and
Bioequivalence Studies for Nasal Aerosols and Nasal Sprays for Local Action, when finalized.
Statistical analyses of BE data are typically based on a statistical model for the logarithm of the BA
measures (e.g., AUC and Cmax). The model is a mixed-effects or two-stage linear model. Each
subject, j, theoretically provides a mean for the log-transformed BA measure for each formulation,
$\mu_{Tj}$ and $\mu_{Rj}$ for the T and R formulations, respectively. The model assumes that these
subject-specific means come from a distribution with population means $\mu_T$ and $\mu_R$, and
between-subject variances $\sigma_{BT}^2$ and $\sigma_{BR}^2$, respectively. The model allows for a
correlation, $\rho$, between $\mu_{Tj}$ and $\mu_{Rj}$. The subject-by-formulation interaction variance
component (Schall and Luus 1993), $\sigma_D^2$, is related to these parameters as follows:
$$\sigma_D^2 = \text{variance of } (\mu_{Tj} - \mu_{Rj}) = (\sigma_{BT} - \sigma_{BR})^2 + 2(1 - \rho)\,\sigma_{BT}\,\sigma_{BR}$$
Equation 1
For a given subject, the observed data for the log-transformed BA measure are assumed to be
independent observations from distributions with means $\mu_{Tj}$ and $\mu_{Rj}$, and within-subject
variances $\sigma_{WT}^2$ and $\sigma_{WR}^2$. The total variances for each formulation are defined as the
sum of the within- and between-subject components (i.e., $\sigma_{TT}^2 = \sigma_{WT}^2 + \sigma_{BT}^2$ and
$\sigma_{TR}^2 = \sigma_{WR}^2 + \sigma_{BR}^2$). For analysis of crossover
studies, the means are given additional structure by the inclusion of period and sequence effect terms.
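The relationships in Equation 1 and the total-variance definitions can be checked numerically. The following is a minimal sketch in Python; the function names and the parameter values are hypothetical and serve only to exercise the formulas.

```python
import math


def interaction_variance(sigma_bt, sigma_br, rho):
    """Subject-by-formulation interaction variance, as in Equation 1."""
    return (sigma_bt - sigma_br) ** 2 + 2.0 * (1.0 - rho) * sigma_bt * sigma_br


def total_variance(var_within, var_between):
    """Total variance = within-subject variance + between-subject variance."""
    return var_within + var_between


# Hypothetical between-subject SDs and correlation (not taken from the guidance).
sigma_bt, sigma_br, rho = 0.30, 0.25, 0.90
sigma_d_sq = interaction_variance(sigma_bt, sigma_br, rho)

# Cross-check against the direct expansion Var(X - Y) = Var X + Var Y - 2 Cov(X, Y).
direct = sigma_bt ** 2 + sigma_br ** 2 - 2.0 * rho * sigma_bt * sigma_br
assert math.isclose(sigma_d_sq, direct)
```

When the between-subject standard deviations are equal and the correlation is 1, the interaction variance is zero, which corresponds to every subject having the same T-R difference in subject-specific means.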
The general structure of a BE criterion is that a function ($\Theta$) of population measures should be
demonstrated to be no greater than a specified value ($\theta$). Using the terminology of statistical hypothesis
testing, this is accomplished by testing the hypothesis $H_0: \Theta > \theta$ versus $H_A: \Theta \le \theta$ at a
desired level of significance, often 5%. Rejection of the null hypothesis $H_0$ (i.e., demonstrating that the
estimate of $\Theta$ is statistically significantly less than $\theta$) results in a conclusion of BE. The choice of
$\Theta$ and $\theta$ differs in average, population, and individual BE approaches.
A general objective in assessing BE is to compare the log-transformed BA measure after administration
of the T and R products. As detailed in Appendix A, population and individual approaches are based
on the comparison of an expected squared distance between the T and R formulations to the expected
squared distance between two administrations of the R formulation. An acceptable T formulation is one
where the T-R distance is not substantially greater than the R-R distance. In both population and
individual BE approaches, this comparison appears as a comparison to the reference variance, which is
referred to as scaling to the reference variability.
Population and individual BE approaches, but not the average BE approach, allow two types of scaling:
reference-scaling and constant-scaling. Reference-scaling means that the criterion used is scaled to the
`variability of the R product, which effectively widens the BE limit for more variable reference products.
`Although generally sufficient, use of reference-scaling alone could unnecessarily narrow the BE limit for
`drugs and/or drug products that have low variability but a wide therapeutic range. This guidance,
`therefore, recommends mixed-scaling for the population and individual BE approaches (section IV.B
`and C). With mixed scaling, the reference-scaled form of the criterion should be used if the reference
`product is highly variable; otherwise, the constant-scaled form should be used.
Average Bioequivalence
The following criterion is recommended for average BE:
$$(\mu_T - \mu_R)^2 \le \theta_A^2$$
Equation 2
where:
 $\mu_T$ = population average response of the log-transformed measure for the T formulation
 $\mu_R$ = population average response of the log-transformed measure for the R formulation
as defined in section III above.
This criterion is equivalent to:
$$-\theta_A \le (\mu_T - \mu_R) \le \theta_A$$
Equation 3
and, usually, $\theta_A = \ln(1.25)$.
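To illustrate how the log-scale limit in Equation 3 relates to the familiar 80-125% limit on the ratio scale, the following short Python sketch may be useful. The function names are hypothetical. The check applies to population parameters as written in Equation 3; in practice, the decision is based on a confidence interval rather than on a plug-in comparison.

```python
import math

THETA_A = math.log(1.25)  # usual average BE limit on the natural-log scale


def meets_average_be_criterion(mu_t, mu_r, theta_a=THETA_A):
    """Equation 3: -theta_A <= mu_T - mu_R <= theta_A (log-scale means)."""
    return -theta_a <= (mu_t - mu_r) <= theta_a


def ratio_limits(theta_a=THETA_A):
    """Back-transform the log-scale limits to limits on the T/R ratio."""
    return math.exp(-theta_a), math.exp(theta_a)


lower, upper = ratio_limits()
print(f"{lower:.2f} to {upper:.2f}")  # prints "0.80 to 1.25"
```

Because ln(0.80) = -ln(1.25), the limits are symmetric on the log scale even though they are asymmetric on the ratio scale.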
Population Bioequivalence
The following mixed-scaling approach is recommended for population BE (i.e., use the
reference-scaled method if the estimate of $\sigma_{TR} > \sigma_{T0}$ and the constant-scaled method if the
estimate of $\sigma_{TR} \le \sigma_{T0}$).
The recommended criteria are:
Reference-scaled:
$$\frac{(\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2)}{\sigma_{TR}^2} \le \theta_P$$
Equation 4
Constant-scaled:
$$\frac{(\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2)}{\sigma_{T0}^2} \le \theta_P$$
Equation 5
where:
 $\mu_T$ = population average response of the log-transformed measure for the T formulation
 $\mu_R$ = population average response of the log-transformed measure for the R formulation
 $\sigma_{TT}^2$ = total variance (i.e., sum of within- and between-subject variances) of the T formulation
 $\sigma_{TR}^2$ = total variance (i.e., sum of within- and between-subject variances) of the R formulation
 $\sigma_{T0}^2$ = specified constant total variance
 $\theta_P$ = BE limit
Equations 4 and 5 represent an aggregate approach where a single criterion on the left-hand
side of the equation encompasses two major components: (1) the difference between the T and
R population averages ($\mu_T - \mu_R$), and (2) the difference between the T and R total variances
($\sigma_{TT}^2 - \sigma_{TR}^2$). This aggregate measure is scaled to the total variance of the R product or to a
constant value ($\sigma_{T0}^2$, a standard that relates to a limit for the total variance), whichever is
greater.
The specification of both $\sigma_{T0}$ and $\theta_P$ relies on the establishment of standards. The generation of
these standards is discussed in Appendix A. When the population BE approach is used, in
addition to meeting the BE limit based on confidence bounds, the point estimate of the
geometric test/reference mean ratio should fall within 80-125%.
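The mixed-scaling rule for population BE can be sketched as a single function that selects the denominator. This is an illustrative Python sketch only: the function name is hypothetical, and the value used for the constant variance in the example is a placeholder, not a standard taken from this guidance (see Appendix A for the standards). The function evaluates the left-hand side of the criterion from supplied parameter values; the BE decision itself rests on a 95% upper confidence bound, which this sketch does not compute.

```python
def population_be_criterion(mu_t, mu_r, var_tt, var_tr, var_t0):
    """Left-hand side of Equation 4 or 5 under mixed scaling.

    var_tt, var_tr: total variances of the T and R formulations (log scale).
    var_t0: specified constant total variance (sigma_T0 squared).
    """
    numerator = (mu_t - mu_r) ** 2 + (var_tt - var_tr)
    # Reference-scaled if sigma_TR > sigma_T0, otherwise constant-scaled.
    denominator = var_tr if var_tr > var_t0 else var_t0
    return numerator / denominator


# Placeholder constant variance, for illustration only.
VAR_T0_EXAMPLE = 0.04
value = population_be_criterion(0.05, 0.0, 0.10, 0.09, VAR_T0_EXAMPLE)
```

Choosing the larger of the two variances as the denominator is the same as scaling to "whichever is greater," so the two criteria never disagree at the switch point.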
Individual Bioequivalence
The following mixed-scaling approach is one approach for individual BE (i.e., use the
reference-scaled method if the estimate of $\sigma_{WR} > \sigma_{W0}$, and the constant-scaled method if the
estimate of $\sigma_{WR} \le \sigma_{W0}$). Also see section VII.D, Discontinuity, for further discussion.
The recommended criteria are:
Reference-scaled:
$$\frac{(\mu_T - \mu_R)^2 + \sigma_D^2 + (\sigma_{WT}^2 - \sigma_{WR}^2)}{\sigma_{WR}^2} \le \theta_I$$
Equation 6
Constant-scaled:
$$\frac{(\mu_T - \mu_R)^2 + \sigma_D^2 + (\sigma_{WT}^2 - \sigma_{WR}^2)}{\sigma_{W0}^2} \le \theta_I$$
Equation 7
where:
 $\mu_T$ = population average response of the log-transformed measure for the T formulation
 $\mu_R$ = population average response of the log-transformed measure for the R formulation
 $\sigma_D^2$ = subject-by-formulation interaction variance component
 $\sigma_{WT}^2$ = within-subject variance of the T formulation
 $\sigma_{WR}^2$ = within-subject variance of the R formulation
 $\sigma_{W0}^2$ = specified constant within-subject variance
 $\theta_I$ = BE limit
Equations 6 and 7 represent an aggregate approach where a single criterion on the left-hand
side of the equation encompasses three major components: (1) the difference between the T
and R population averages ($\mu_T - \mu_R$), (2) subject-by-formulation interaction ($\sigma_D^2$), and (3) the
difference between the T and R within-subject variances ($\sigma_{WT}^2 - \sigma_{WR}^2$). This aggregate
measure is scaled to the within-subject variance of the R product or to a constant value ($\sigma_{W0}^2$, a
standard that relates to a limit for the within-subject variance), whichever is greater.
The specification of both $\sigma_{W0}$ and $\theta_I$ relies on the establishment of standards. The generation of
these standards is discussed in Appendix A. When the individual BE approach is used, in
addition to meeting the BE limit based on confidence bounds, the point estimate of the
geometric test/reference mean ratio should fall within 80-125%.
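As an illustration of how the three components combine, the sketch below evaluates the left-hand side of the individual BE criterion and reports which scaling was applied. The function name and all numeric inputs are hypothetical; in particular, the constant within-subject variance passed in is a placeholder and not a standard from this guidance. As with any plug-in evaluation, this is not a substitute for the 95% upper confidence bound on which the BE conclusion is based.

```python
def individual_be_criterion(mu_t, mu_r, var_d, var_wt, var_wr, var_w0):
    """Return (criterion value, scaling used) for Equation 6 or 7.

    var_d: subject-by-formulation interaction variance (sigma_D squared).
    var_wt, var_wr: within-subject variances of T and R (log scale).
    var_w0: specified constant within-subject variance (sigma_W0 squared).
    """
    numerator = (mu_t - mu_r) ** 2 + var_d + (var_wt - var_wr)
    if var_wr > var_w0:
        return numerator / var_wr, "reference-scaled"
    return numerator / var_w0, "constant-scaled"


# Hypothetical inputs: a more variable reference triggers reference scaling.
value, scaling = individual_be_criterion(0.05, 0.0, 0.01, 0.05, 0.06, 0.04)
```

Note that the variance-difference term can be negative when the T product is less variable than the R product, which lowers the aggregate value.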
`Experimental Design
`Nonreplicated Designs
`A conventional nonreplicated design, such as the standard two-formulation, two-period,
`two-sequence crossover design, can be used to generate data where an average or
`population approach is chosen for BE comparisons. Under certain circumstances,
`parallel designs can also be used.
`Replicated Crossover Designs
`Replicated crossover designs can be used irrespective of which approach is selected to
`establish BE, although they are not necessary when an average or population approach
`is used. Replicated crossover designs are critical when an individual BE approach is
`used to allow estimation of within-subject variances for the T and R measures and the
`subject-by-formulation interaction variance component. The following four-period,
`two-sequence, two-formulation design is recommended for replicated BE studies (see
`Appendix B for further discussion of replicated crossover designs).


For this design, the same lots of the T and R formulations should be used for the
replicated administration. Each period should be separated by an adequate washout period.
Other replicated crossover designs are possible. For example, a three-period design,
as shown below, could be used.
[Table: three-period design, sequence by period]
`A greater number of subjects would be encouraged for the three-period design
`compared to the recommended four-period design to achieve the same statistical power
`to conclude BE (see Appendix C).
`Sample Size and Dropouts
`A minimum number of 12 evaluable subjects should be included in any BE study. When an
`average BE approach is selected using either nonreplicated or replicated designs, methods
`appropriate to the study design should be used to estimate sample sizes. The number of
`subjects for BE studies based on either the population or individual BE approach can be
`estimated by simulation if analytical approaches for estimation are not available. Further
`information on sample size is provided in Appendix C.
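Estimating subject numbers by simulation can be sketched as follows. For simplicity, this Python example simulates the average BE decision in a balanced 2 x 2 crossover rather than a population or individual BE criterion; it assumes equal within-subject variances for T and R, no missing data, and a caller-supplied t critical value (for example, from a standard t table). All names and numeric settings are hypothetical.

```python
import math
import random
import statistics


def simulate_power_2x2(n_per_seq, sigma_w, log_ratio, t_crit,
                       n_sims=2000, seed=1):
    """Estimate the probability of concluding average BE by simulation.

    n_per_seq: subjects per sequence (balanced design assumed)
    sigma_w:   within-subject SD on the log scale (same for T and R, assumed)
    log_ratio: true mu_T - mu_R on the log scale
    t_crit:    95th percentile of Student's t with 2 * n_per_seq - 2 df
    """
    rng = random.Random(seed)
    theta_a = math.log(1.25)
    # Half of the period-1 minus period-2 difference has SD sigma_w / sqrt(2).
    sd_half_diff = sigma_w / math.sqrt(2.0)
    passes = 0
    for _ in range(n_sims):
        # Sequence TR has expectation +log_ratio/2; sequence RT has -log_ratio/2.
        tr = [rng.gauss(log_ratio / 2.0, sd_half_diff) for _ in range(n_per_seq)]
        rt = [rng.gauss(-log_ratio / 2.0, sd_half_diff) for _ in range(n_per_seq)]
        estimate = statistics.fmean(tr) - statistics.fmean(rt)
        pooled_var = (statistics.variance(tr) + statistics.variance(rt)) / 2.0
        se = math.sqrt(pooled_var * 2.0 / n_per_seq)
        lower, upper = estimate - t_crit * se, estimate + t_crit * se
        if -theta_a <= lower and upper <= theta_a:
            passes += 1
    return passes / n_sims


# Hypothetical scenario: 12 subjects per sequence, within-subject CV of 20%.
sigma_w = math.sqrt(math.log(1.0 + 0.20 ** 2))
power = simulate_power_2x2(12, sigma_w, log_ratio=0.0, t_crit=1.7171)
```

Repeating the call over a grid of subject numbers and assumed true ratios gives a rough picture of how many evaluable subjects a given power target would call for. The same loop structure could wrap a population or individual BE decision rule in place of the confidence interval check.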
`Sponsors should enter a sufficient number of subjects in the study to allow for dropouts.
`Because replacement of subjects during the study could complicate the statistical model and
`analysis, dropouts generally should not be replaced. Sponsors who wish to replace dropouts
during the study should indicate this intention in the protocol. The protocol should also state
whether samples from replacement subjects, if not used, will be assayed. If the dropout rate is
`high and sponsors wish to add more subjects, a modification of the statistical analysis may be
`recommended. Additional subjects should not be included after data analysis unless the trial
`was designed from the beginning as a sequential or group sequential design.
`The following sections provide recommendations on statistical methodology for assessment of average,
`population, and individual BE.
`Logarithmic Transformation
`General Procedures
`This guidance recommends that BE measures (e.g., AUC and Cmax) be log-
`transformed using either common logarithms to the base 10 or natural logarithms (see
`Appendix D). The choice of common or natural logs should be consistent and should
`be stated in the study report. The limited sample size in a typical BE study precludes a
`reliable determination of the distribution of the data set. Sponsors and/or applicants are
`not encouraged to test for normality of error distribution after log-transformation, nor
`should they use normality of error distribution as a reason for carrying out the statistical
`analysis on the original scale. Justification should be provided if sponsors or applicants
`believe that their BE study data should be statistically analyzed on the original rather
`than on the log scale.
`Presentation of Data
`The drug concentration in biological fluid determined at each sampling time point should
`be furnished on the original scale for each subject participating in the study. The
`pharmacokinetic measures of systemic exposure should also be furnished on the original
`scale. The mean, standard deviation, and coefficient of variation for each variable
`should be computed and tabulated in the final report.
`In addition to the arithmetic mean and associated standard deviation (or coefficient of
`variation) for the T and R products, geometric means (antilog of the means of the logs)
`should be calculated for selected BE measures. To facilitate BE comparisons, the
`measures for each individual should be displayed in parallel for the formulations tested.
`In particular, for each BE measure the ratio of the individual geometric mean of the T
`product to the individual geometric mean of the R product should be tabulated side by
side for each subject. The summary tables should indicate in which sequence each
subject received the product.
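The geometric mean described above (the antilog of the mean of the logs) and the per-subject T/R ratio can be computed as in the following Python sketch. The function name and the AUC values are hypothetical; the example assumes a replicated design in which a subject has two observations per product. In a nonreplicated design, each individual geometric mean is simply the single observed value.

```python
import math


def geometric_mean(values):
    """Antilog of the mean of the natural logs of the values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical AUC values for one subject (two administrations per product).
auc_t = [100.0, 121.0]
auc_r = [110.0, 110.0]
individual_ratio = geometric_mean(auc_t) / geometric_mean(auc_r)
```

The same result is obtained with common (base 10) logarithms, provided the antilog uses the same base.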
`Data Analysis
`Average Bioequivalence
Parametric (normal-theory) methods are recommended for the analysis of log-
transformed BE measures. For average BE using the criterion stated in
equations 2 or 3 (section IV.A), the general approach is to construct a 90%
confidence interval for the quantity $\mu_T - \mu_R$ and to reach a conclusion of average
BE if this confidence interval is contained in the interval $[-\theta_A, \theta_A]$. Due to the
nature of normal-theory confidence intervals, this is equivalent to carrying out
two one-sided tests of hypothesis at the 5% level of significance (Schuirmann).
The 90% confidence interval for the difference in the means of the log-
transformed data should be calculated using methods appropriate to the
experimental design. The antilogs of the confidence limits obtained constitute
the 90% confidence interval for the ratio of the geometric means between the T
and R products.
`Nonreplicated Crossover Designs
`For nonreplicated crossover designs, this guidance recommends parametric
`(normal-theory) procedures to analyze log-transformed BA measures. General
`linear model procedures available in PROC GLM in SAS or equivalent
`software are preferred, although linear mixed-effects model procedures can also
`be indicated for analysis of nonreplicated crossover studies.
`For example, for a conventional two-treatment, two-period, two-sequence (2 x
`2) randomized crossover design, the statistical model typically includes factors
`accounting for the following sources of variation: sequence, subjects nested in
`sequences, period, and treatment. The Estimate statement in SAS PROC
`GLM, or equivalent statement in other software, should be used to obtain
`estimates for the adjusted differences between treatment means and the
`standard error associated with these differences.
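For readers working outside SAS, the calculation for a 2 x 2 crossover with complete data can be sketched with within-subject period differences, which for complete data gives the same treatment estimate and standard error as the model with sequence, subject-within-sequence, period, and treatment terms. The Python sketch below is illustrative only: the function name and data layout are hypothetical, and the t critical value is supplied by the caller rather than computed.

```python
import math
import statistics


def average_be_ci_2x2(seq_tr, seq_rt, t_crit):
    """Point estimate and 90% CI for the T/R geometric mean ratio.

    seq_tr: (period 1, period 2) log-scale values for subjects given T then R
    seq_rt: (period 1, period 2) log-scale values for subjects given R then T
    t_crit: 95th percentile of Student's t with n_tr + n_rt - 2 df
    """
    # Half of the period difference removes the subject effect.
    d_tr = [(p1 - p2) / 2.0 for p1, p2 in seq_tr]
    d_rt = [(p1 - p2) / 2.0 for p1, p2 in seq_rt]
    n1, n2 = len(d_tr), len(d_rt)
    # Differencing the sequence means removes the period effect.
    diff = statistics.fmean(d_tr) - statistics.fmean(d_rt)
    pooled = ((n1 - 1) * statistics.variance(d_tr)
              + (n2 - 1) * statistics.variance(d_rt)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    lower, upper = diff - t_crit * se, diff + t_crit * se
    # Antilogs give the ratio of geometric means and its confidence limits.
    return math.exp(diff), math.exp(lower), math.exp(upper)
```

Average BE would be concluded for the measure if both returned limits fall within 0.80 and 1.25. Subjects with missing periods are not handled by this sketch.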
`Replicated Crossover Designs


Linear mixed-effects model procedures, available in PROC MIXED in SAS or
equivalent software, should be used for the analysis of replicated crossover
studies for average BE. Appendix E includes an example SAS program.
Parallel Designs
For parallel designs, the confidence interval for the difference of means in the
log scale can be computed using the total between-subject variance. As in the
analysis for replicated designs (section VI.B.1.b), equal variances should not
be assumed.
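One common way to avoid assuming equal variances in a two-group comparison is the Welch (Satterthwaite) approximation. This guidance does not name a specific method, so the following Python sketch should be read as one possible approach rather than the recommended one; the function name is hypothetical. It returns the log-scale difference, its standard error, and the approximate degrees of freedom, leaving the t quantile lookup to the analyst.

```python
import math
import statistics


def welch_difference(log_t, log_r):
    """Difference of log-scale means with unequal-variance SE and df."""
    n_t, n_r = len(log_t), len(log_r)
    # Variance of each group mean, estimated separately per group.
    v_t = statistics.variance(log_t) / n_t
    v_r = statistics.variance(log_r) / n_r
    diff = statistics.fmean(log_t) - statistics.fmean(log_r)
    se = math.sqrt(v_t + v_r)
    # Satterthwaite approximation to the degrees of freedom.
    df = (v_t + v_r) ** 2 / (v_t ** 2 / (n_t - 1) + v_r ** 2 / (n_r - 1))
    return diff, se, df
```

The 90% confidence limits on the log scale would then be the difference plus or minus the 95th percentile of Student's t (at the returned degrees of freedom) times the standard error, with antilogs taken to obtain limits for the ratio of geometric means.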
`Population Bioequivalence
`Analysis of BE data using the population approach (section IV.B) should focus
`first on estimation of the mean difference between the T and R for the log-
`transformed BA measure and estimation of the total variance for each of the
`two formulations. This can be done using relatively simple unbiased estimators
`such as the method of moments (MM) (Chinchilli 1996, and Chinchilli and
`Esinhart 1996). After the estimation of the mean difference and the variances
`has been completed, a 95% upper confidence bound for the population BE
`criterion can be obtained, or equivalently a 95% upper confidence bound for a
`linearized form of the population BE criterion can be obtained. Population BE
should be considered to be established for a particular log-transformed BA
measure if the 95% upper confidence bound for the criterion is less than or
equal to the BE limit, $\theta_P$, or equivalently if the 95% upper confidence bound for
the linearized criterion is less than or equal to 0.
`To obtain the 95% upper confidence bound of the criterion, intervals based on
`validated approaches can be used. Validation approaches should be reviewed
`with appropriate staff in CDER. Appendix F includes an example of upper
`confidence bound determination using a population BE approach.
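The equivalence between the population BE criterion and its linearized form follows from multiplying both sides by the (positive) scaling variance. A sketch for the reference-scaled case is given below; the symbol $\eta_P$ is a label introduced here for illustration and is not notation taken from this section.

$$\frac{(\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2)}{\sigma_{TR}^2} \le \theta_P \iff \eta_P = (\mu_T - \mu_R)^2 + (\sigma_{TT}^2 - \sigma_{TR}^2) - \theta_P\,\sigma_{TR}^2 \le 0$$

For the constant-scaled case, the same step applies with $\sigma_{T0}^2$ in place of $\sigma_{TR}^2$ in the final term. The linearized form is a sum of terms in the means and variances, with no ratio of estimates involved.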
`Nonreplicated Crossover Designs
`For nonreplicated crossover studies, any available method (e.g., SAS PROC
`GLM or equivalent software) can be used to obtain an unbiased estimate of the
`mean difference in log-transformed BA measures between the T and R
products. The total variance for each formulation should be estimated by the
usual sample variance, computed separately in each sequence and then pooled
across sequences.
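The pooling step can be sketched as follows in Python. The guidance does not spell out the pooling weights; this sketch assumes the conventional weighting by degrees of freedom, and the function name is hypothetical.

```python
import statistics


def pooled_total_variance(values_by_sequence):
    """Pool per-sequence sample variances, weighted by degrees of freedom.

    values_by_sequence: one list of log-scale values per sequence, all for
    the same formulation.
    """
    weighted_sum = sum((len(v) - 1) * statistics.variance(v)
                       for v in values_by_sequence)
    total_df = sum(len(v) - 1 for v in values_by_sequence)
    return weighted_sum / total_df
```

The function would be called once with the T observations grouped by sequence and once with the R observations, giving the two total variance estimates used in the population BE criterion.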
`Replicated Crossover Designs
`For replicated crossover studies, the approach should be the same as for
`nonreplicated crossover designs, but care should be taken to obtain proper
`estimates of the total variances. One approach is to estimate the within- and
`between-subject components separately, as for individual BE (see section
`VI.B.3), and then sum them to obtain the total variance. The method for the
`upper confidence bound should be consistent with the method used for
`estimating the variances.
`Parallel Designs
`The estimate of the means and variances from parallel designs should be the
`same as for nonreplicated crossover designs. The method for the upper
`confidence bound should be modified to reflect independent rather than paired
`samples and to allow for unequal variances.
`Individual Bioequivalence
`Analysis of BE data using an individual BE approach (section IV.C) should focus on
`estimation of the mean difference between T and R for the log-transformed BA
`measure, the subject-by-formulation interaction variance, and the within-subject
variance for each of the two formulations. For this purpose, we recommend the MM method.
`To obtain the 95% upper confidence bound of a linearized form of the individual BE
`criterion, intervals based on validated approaches can be used. An example is
`described in Appendix G. After the estimation of the mean difference and the variances
`has been completed, a 95% upper confidence bound for the individual BE criterion can
`be obtained, or equivalently a 95% upper confidence bound for a linearized form of the
individual BE criterion can be obtained. Individual BE should be considered to be
established for a particular log-transformed BA measure if the 95% upper confidence
bound for the criterion is less than or equal to the BE limit, $\theta_I$, or equivalently if the 95%
upper confidence bound for the linearized criterion is less than or equal to 0.
`The restricted maximum likelihood (REML) method may be useful to estimate mean
`differences and variances when subjects with some missing data are included in the
statistical analysis. A key distinction between the REML and MM methods relates to
differences in estimating variance terms and is further discussed in Appendix H.
`Sponsors considering alternative methods to REML or MM are encouraged to discuss
`their approaches with appropriate CDER review staff prior to submitting their
`Studies in Multiple Groups
`If a crossover study is carried out in two or more groups of subjects (e.g., if for logistical
`reasons only a limited number of subjects can be studied at one time), the statistical model
`should be modified to reflect the multigroup nature of the study. In particular, the model should
`reflect the fact that the periods for the first group are different from the periods for the second
`group. This applies to all of the approaches (average, population, and individual BE) described
`in this guidance.
`If the study is carried out in two or more groups and those groups are studied at different clinical
`sites, or at the same site but greatly separated in time (months apart, for example), questions
`may arise as to whether the results from the several groups should be combined in a single
`analysis. Such cases should be discussed with the appropriate CDER review division.
`A sequential design, in which the decision to study a second group of subjects is based on the
`results from the first group, calls for different statistical methods and is outside the scope of this
`guidance. Those wishing to use a sequential design should consult the appropriate CDER
`review division.
`Carryover Effects
`Use of crossover designs for BE studies allows each subject to serve as his or her own control
`to improve the precision of the comparison. One of the assumptions underlying this principle is
`that carryover effects (also called residual effects) are either absent (t

