`
`Prediction of intestinal absorption:
`comparative assessment of
GASTROPLUS™ and IDEA™
`
`Neil Parrott
`
`European Journal of Pharmaceutical Sciences
`
`Apotex v. Cellgene - IPR2023-00512
`Petitioner Apotex Exhibit 1047-0001
`
`
`
`European Journal of Pharmaceutical Sciences 17 (2002) 51–61
`
www.elsevier.nl/locate/ejps
`
Prediction of intestinal absorption: comparative assessment of GASTROPLUS™
and IDEA™

Neil Parrott*, Thierry Lavé
`F. Hoffmann-La Roche AG, Pharmaceuticals Division, CH-4070 Basel, Switzerland
`
`Received 17 April 2002; received in revised form 9 July 2002; accepted 17 July 2002
`
`Abstract
`
We have assessed two commercial software tools employing physiologically based models for prediction of intestinal absorption in
human. IDEA™ 2.0 and GASTROPLUS™ 3.1.0 were compared both in their ability to predict fraction absorbed for a set of 28 drugs and in
terms of the functionality offered. The emphasis was placed on the practical usefulness to pharmaceutical drug discovery. Predictions
were assessed for three levels of input data (i) pure in silico input, (ii) thermodynamic solubility and in silico permeability, (iii)
thermodynamic solubility and human colon carcinoma cell line (CACO-2) permeability. We found the pure in silico prediction ability of
the tools to be comparable with 70% correct classification rate. With measured input data the IDEA™ prediction rate improved to 79% while
GASTROPLUS™ stayed at 70%. In terms of functionality GASTROPLUS™ is a powerful system for the trained user. Open access to model
parameters, diagnostic tools and the ability to integrate data make it particularly suitable for the later stages of discovery and development.
IDEA™ is web based and presents a simple interface suitable for widespread use with minimal training. However the limited functionality
and inconvenient handling of multiple compound batches currently restrict the usefulness of version 2.0 for drug discovery.
© 2002 Elsevier Science B.V. All rights reserved.
`
`Keywords: ADME; Oral absorption; Physiologically based pharmacokinetics; Simulation; Modeling
`
1. Introduction
`
`The ability to be administered by the oral route is a
`highly desirable property for new pharmaceutical drugs
`because it is the safest, most convenient and economical
`method (Goodman et al., 1999). For this reason good oral
`availability is a required property for drug candidate
`molecules in a large percentage of pharmaceutical discov-
`ery projects. It has been observed that drug development
`often failed for reasons of poor pharmacokinetics (Prentis
`et al., 1988). To avoid the high costs associated with such
`failures the current practice is to consider metabolism and
pharmacokinetic properties in parallel with pharmacological
tests during the discovery phase. High throughput in
vitro technology now allows properties of importance for
oral absorption like solubility, permeability, lipophilicity
and pKa to be measured early and for many compounds.
Consequently the discovery scientist is presented with
large volumes of multivariate data and is faced with
considerable data integration and information generation
`
*Corresponding author. Tel.: +41-61-688-0813.
E-mail address: neil_john.parrott@roche.com (N. Parrott).
`
`challenges. In particular a need for reliable models of oral
`absorption exists.
`The prediction of in vivo absorption is complex and the
`number of factors to be considered large. Absorption of
`drugs from the gastrointestinal tract can be influenced by
`physicochemical, physiological and formulation factors.
The physicochemical factors include pKa, solubility,
stability, lipophilicity, and salt forms. The physiological
`factors include gastrointestinal pH, gastric emptying, small
`and large bowel transit times, active transport and efflux,
`and gut wall metabolism. The formulation factors are
`related to drug particle size, crystal form and dosage forms
`such as solution, tablet, capsule or suspension.
`Considering the complexity of absorption and the num-
`ber of processes involved, an integrated approach taking
`most of the data into account is highly desirable. Physio-
`logically based models provide a rational basis for integra-
`tion of data and can predict both extent and rate of
`absorption.
`A further step forward is the pure in silico approach
`where measured in vitro data is replaced with properties
`predicted from chemical structure alone. Input of in silico
`predictions into validated predictive models then gives the
`
0928-0987/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0928-0987(02)00132-X
`
`
`possibility for absorption potential to be assessed before
`compounds are synthesized. With access to such tools the
`medicinal chemist is in the position to synthesize only
`compound collections with a good chance of being well
`absorbed.
Two physiologically based models to predict oral absorption
recently became commercially available. These
models are IDEA™ (in vitro determination for the estimation
of ADME) from Lion Bioscience (LION Bioscience) and
the ACAT (advanced compartmental absorption and transit)
model available in GASTROPLUS™ from SIMULATIONS PLUS
(Simulations Plus¹).
`In this study we describe a comparative evaluation
`aimed at assessing the predictiveness of these tools when
`used with the typical data available to drug discovery
`scientists. Our evaluation also discusses usability and
`functionality.
`
2. Methods
`
Brief overviews of GASTROPLUS™ and IDEA™ are presented
here. For more detailed information we refer the
`reader to the web sites of the respective companies.
`
2.1. GASTROPLUS™ 3.1.0
`
GASTROPLUS™ simulates gastrointestinal absorption and
pharmacokinetics for drugs dosed orally or intravenously
in humans and animals. The simulation model underlying
GASTROPLUS™ is known as the advanced compartmental
absorption and transit model (ACAT) (Agoram et al.,
2001). The ACAT model of SIMULATIONS PLUS is based
upon an original CAT model described by Yu et al. (1996)
and is semiphysiological with nine compartments corresponding
to different segments of the digestive tract
(stomach, seven small intestinal compartments and colon).
In addition to human physiology, models for other species
(rat, dog, rabbit or cat) are provided. The model accounts
for controlled-release profiles, pH dependence of dissolution
and permeability, transport of drug material through
the gastrointestinal tract and absorption of drug material
through the intestinal wall into the portal vein.
GASTROPLUS™ provides the ability for customization by
allowing absorption constants to be set individually for
each intestinal compartment. In addition an optimization
module permits fitting of model parameters such as
regional permeabilities, physiological variables and formulation
factors to observed data. For early drug discovery a
generic log D model for regional permeability is based
upon the premise that as the ionized fraction of drug
increases permeability decreases. Therefore permeability in
each compartment is scaled according to the pH of that
`
¹ Simulations Plus, Inc., 1220 W. Avenue J, Lancaster, California 93534-2902, USA. http://www.simulations-plus.com/.
`
compartment, the log P and the pKa values of the drug. The
exact function for adjusting compartmental permeabilities
was optimized by SIMULATIONS PLUS to explain the observed
rate and extent of absorption for a proprietary training set
of drugs. The log D model and other default settings are
recommended when sufficient data needed for construction
of drug-specific models are not available (sufficient data
might constitute measured regional permeabilities or in
vivo concentration versus time data from multiple doses).
For pure in silico predictions it is necessary to use a
companion product, QMPRPLUS™, that takes an input file of
multiple structures and computes properties including
solubility, permeability and log P. Models based upon 2D
or 3D structures are available (in this evaluation we used
only 2D). The model for human permeability is based on a
combination of in vivo human values and in situ rat wall
permeabilities converted to human values based upon a
correlation. The model was constructed using partial least
square regression on a training set of 47 drugs (N = 47,
R² = 0.76, RMSE = 0.29). Several solubility models are
available. We used the model where the only input is 2D
chemical structure (an artificial neural network trained on
1204 examples (N = 1204, R² = 0.943, RMSE = 0.47 log
units)).
GASTROPLUS™ 3.1.0 and QMPRPLUS™ 2.2 were the versions
evaluated. The latest versions GASTROPLUS™ 3.1.1a and
QMPRPLUS™ 3.0 were released in February and April 2002
and provide enhanced functionality but without major
changes to the models.
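
The premise behind the generic log D model described above (permeability falls as the ionized fraction rises) can be illustrated with the textbook Henderson-Hasselbalch relation. The sketch below is not the proprietary scaling function optimized by SIMULATIONS PLUS, which is not disclosed; it assumes a monoprotic compound and that only the neutral species partitions into the lipid phase. The function names and the example compound (log P 3.0, acidic pKa 4.5) are hypothetical.

```python
import math


def neutral_fraction(pka: float, ph: float, is_acid: bool) -> float:
    """Un-ionized fraction of a monoprotic acid or base at a given pH
    (Henderson-Hasselbalch)."""
    exponent = (ph - pka) if is_acid else (pka - ph)
    return 1.0 / (1.0 + 10.0 ** exponent)


def log_d(log_p: float, pka: float, ph: float, is_acid: bool) -> float:
    """log D under the assumption that only the neutral species partitions."""
    return log_p + math.log10(neutral_fraction(pka, ph, is_acid))


# Hypothetical weak acid evaluated at a few pH values within the 1.5-7.5
# range used for the solubility inputs; the pH values are illustrative only.
for ph in (1.5, 5.0, 6.5, 7.5):
    print(ph, round(neutral_fraction(4.5, ph, True), 4),
          round(log_d(3.0, 4.5, ph, True), 2))
```

For this hypothetical acid the neutral fraction, and with it log D, drops steadily as pH rises above the pKa, which is the qualitative behaviour the regional permeability scaling relies on.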
`
2.2. IDEA™ 2.0
`
IDEA™ simulates human physiology and accounts for
regional variations in intestinal permeability, solubility,
surface area and fluid movement. The IDEA™ absorption
module runs as a web application on the corporate Intranet
and is aimed specifically at facilitating lead selection early
in drug discovery.
The simulation model underlying IDEA™ is based upon
published work by Grass (1997). Subsequently work was
done within Lion to develop and train the model with input
from a consortium of pharmaceutical companies. Consortium
members supplied oral and intravenous clinical data
while Lion generated a database of in vitro data for a
training set of around 70 nonmetabolized drugs (Norris et
al., 2000). Many model details are kept proprietary and
only briefly described in the IDEA™ reference manual.
In addition to the physiologically based absorption
model IDEA™ 2.0 includes a structure-based model for in
silico prediction of absorption class. This model uses
statistical pattern recognition techniques trained on in vitro
and clinical pharmacokinetic data for 121 drug compounds.
A separate model for prediction of Caco-2 permeability
was trained on Lion data for 250 drugs.
Detailed assay protocols for generation of in vitro data
for use with the absorption model are described in the
`
IDEA™ reference manual. In addition, a set of data for ten
example drugs is provided so that results of company
assays may be compared to the data of Lion.
In a previous evaluation of IDEA™ (Leesman et al., 2000)
Roche provided eight drug samples as an external validation
set for a blinded study and the predictions based upon
the CACO-2 permeability model were within previously
agreed acceptance criteria for seven of the eight compounds.
Version 2.0 of IDEA™ was evaluated. Version 2.2 is due
for release in Q4 of 2002 [Lion, personal communication].
`
2.3. Input data used
`
Three different combinations of input data were assessed:

(i) pure in silico input for solubility and permeability (i.e.
chemical structure alone as input);
(ii) thermodynamic solubility and in silico permeability; and
(iii) thermodynamic solubility and measured CACO-2 permeability.
`
For a set of 28 drugs, data as above was generated in
`the standard screening assays used in our drug discovery
`projects. Additional common input was the clinical dose
`levels corresponding to the fraction absorbed reported in
`the literature (Table 1).
`The drugs were selected for diversity in physicochemi-
`cal properties and to cover the full range of fraction
`absorbed in man (Fig. 1). Practical considerations related
`to the availability of compound samples necessitated a bias
towards well-absorbed drugs. For the 28 drugs the solubility
at pH 6.5 covered the range from 0.005 to >1000 mg/ml.
The CACO-2 permeability covered a range
from 0.2 × 10⁻⁶ to >60 × 10⁻⁶ cm/s.
`At this point some explanation of the input requirements
of the two software tools as well as the differences in the modes
`of operation is necessary. Table 2 compares the input data
`requirements while Figs. 2 and 3 illustrate the operation
`modes for pure in silico prediction and prediction where
`measured data for permeability and solubility are used.
Regarding the pure in silico predictions, one required
input where estimates are not currently available from
QMPRPLUS™ is pKa. Therefore to complete the in silico data
set for GASTROPLUS™ we used a separate tool pKalc
(CompuDrug²). For IDEA™ the pure in silico capability,
consisting of a classification into low, medium or high
(≤33%, 33–66%, >66%), requires only chemical structure
as input.
`To facilitate a direct comparison it was necessary to
ensure identical input to both tools. Since IDEA™ requires
`
² CompuDrug International, Inc., P.O.B. 160, Budapest, 1255, Hungary.
`
Table 1
Clinical fraction absorbed data

Drug                   Dose (mg)   Fraction absorbed (%)
Aciclovir              350         23
Amiloride              10          50
Antipyrine             600         97
Atenolol               50          50
Carbamazepine          200         70
Chloramphenicol        250         90
Desipramine            150         100
Diazepam               5           100
Diltiazem              90          90
Etoposide              350         50
Furosemide             80          61
Ganciclovir            75          3
Hydrochlorothiazide    50          69
Ketoprofen             75          92
Metoprolol             100         95
Naproxen               500         99
Penicillin V           200         38
Pirenzepine            50          27
Piroxicam              20          100
Progesterone           2.5         100
Propranolol            240         99
Ranitidine             60          63
Saquinavir             600         30
Sulpiride              200         44
Terbutaline            10          62
Theophylline           200         100
Verapamil              120         100
Warfarin               5           98
`
Where not otherwise indicated fraction absorbed is from Zhao et al.
(2001). Diltiazem and etoposide are from Dollery (1998), carbamazepine is
from Goodman et al. (1999), pirenzepine is from Vergin (1986),
saquinavir is from Hoffmann-La Roche (data on file).
`
the input of solubility values measured at several pH
values, the ability of GASTROPLUS™ to generate solubility
versus pH profiles from a single measured value and a
table of pKa values was not used. Rather solubility values at 5 pH
values ranging from 1.5 to 7.5 were used for both
tools.
Concerning dose, GASTROPLUS™ takes a number of inputs
describing dosage form and formulation whereas IDEA™
accepts just a single value (Table 2). In this evaluation we
took the GASTROPLUS™ defaults, namely:

1. Dosage form: immediate release tablet
2. Dose volume: 250 ml
3. Drug particle density: 1.2 g/ml
4. Effective particle radius: 25 µm
`
For predictions based upon measured permeability, the
GASTROPLUS™ model has been trained to accept values for
human jejunal permeability as input. Thus a preliminary
step when using CACO-2 data is transformation based
upon a correlation. In this study we built a correlation of
log human permeability against log CACO-2 permeability
that included 20 drugs. The correlation coefficient was
`
`
`Fig. 2. Operation modes for pure in silico prediction.
`
Fig. 1. Distribution of fraction absorbed in human, calculated log P and
measured pKa for the 28 drugs studied.

Fig. 3. Operation modes for prediction when measured data for permeability
and solubility are used.
`
Table 2
Input data for GASTROPLUS™ and IDEA™

Chemical structure
  GASTROPLUS™: ISIS structure (or SMILES)
  IDEA™: SMILES structure (or ChemDraw structure)

Dose and formulation
  GASTROPLUS™: Initial dose (mg); subsequent doses (mg); dosing interval (h);
    dose volume (ml); drug particle density (g/ml); effective particle
    radius (microns); dosage form (selection from a list of options)
  IDEA™: Dose (mg)

Solubility
  GASTROPLUS™: Solubility at different pH values in the range 1.5–7.5,
    or solubility at one known pH plus a table of pKa values
  IDEA™: Solubility at different pH values in range 1.5–7.5

Permeability
  GASTROPLUS™: Permeability measure that is transformed based on a
    correlation to human permeability, or in silico estimate of human Peff
  IDEA™: Caco-2 (may need to be transformed using a correlation to Lion
    data) or rabbit intestinal tissue (four segments), plus permeability
    efflux ratio (optional)

pKa
  GASTROPLUS™: Table of pKa values (used in the log D model of regional
    permeability)
  IDEA™: –

Lipophilicity
  GASTROPLUS™: log D at known pH or log P
  IDEA™: (Lipophilicity is predicted internally; no input is possible)
`
`
`of these in the test set can lead to a misleadingly low
`RMSE.
`Therefore all three measures taken together were used to
`characterize the predictions.
`
2.5. Variability
`
`In considering the possible relevance of differences in
`predictiveness it is obviously important to note the vari-
`ability in the input data used as well as that of the clinical
`fraction absorbed data that we are attempting to predict. In
`the paper where we obtained the majority of our fraction
`absorbed values, Zhao et al. (2001) estimate that their
`RMSE of prediction of 14% is close to the variability in
`the observed clinical fraction absorbed. Regarding the
`calculated properties, SIMULATIONS PLUS gives the RMSE of
`predicted human jejunal permeability as 0.3 log units and
`of predicted solubility as 0.5 log units. For measured
`permeability and solubility the experimental error in our
`in-house assays is of the order of 0.2–0.3 log units. The
`models have not been trained with our data and error is
`obviously introduced when we transform input data via
`correlation. Our evaluation also ignores the effects of other
`processes that might be involved for certain drugs such as
`active transport, efflux, gut metabolism and formulation
`effects. For the purposes of comparative assessment we
`have assumed that neglecting such factors will affect the
`predictions of both models similarly.
`We have found that generation of surface plots can be a
`useful aid to understanding the sensitivity of prediction to
`input data. Such plots may be generated by systematically
`varying the input to the predictive absorption model. In
other words we simulate for a large number of virtual drugs
each with slightly different properties. Fig. 4 shows one
example of such a plot generated with GASTROPLUS™. In
`this case we held the dose and log P fixed at 100 mg and
`0, respectively and varied permeability and solubility. Onto
`the plot we have added an indication of the range covered
`when the two key input parameters vary by 0.5 log
`units—a variability that we believe is a reasonable esti-
`mate. As can be seen, given such variability in input, the
`error associated with fraction absorbed is of the order of
`20%.
`To gain an impression of the significance of differences
`obtained in the RMSE for our test set we carried out some
`simulations. A normal distribution with standard deviation
`of 14% (an estimate of the variability in observed absorp-
`tion) was applied to the observed fraction absorbed values
`for our set of 28 drugs (with truncation at 0% and 100%).
`We then ran a series of trials where we took random
`samples from these distributions and computed the RMSE
`for each trial. Based upon 300 trials the RMSE showed a
`normal distribution with mean of 7% and standard devia-
`tion of 0.75%. Bearing in mind this variability of RMSE
`that arises from the variability in the observed data alone
`
0.88 and the standard error of estimate was 0.28 log units.
The CACO-2 assay and the correlation will be published
separately (Development of a new 7-day, 96-well Caco-2
permeability assay combined with HT compound analysis;
J. Alsenz et al., in preparation).
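
A correlation of this kind (log human permeability against log CACO-2 permeability) amounts to an ordinary least-squares line fitted in log-log space. The sketch below shows one way such a fit and the subsequent conversion could be written; it is not the in-house correlation, whose coefficients are not given here. The function names are hypothetical, and the five data pairs are placeholder values invented purely for illustration, not measurements from this study.

```python
import math


def fit_log_log(caco2, human):
    """Least-squares fit of log10(human) = intercept + slope * log10(caco2).

    Returns slope, intercept, correlation coefficient and the standard
    error of estimate (in log units).
    """
    xs = [math.log10(v) for v in caco2]
    ys = [math.log10(v) for v in human]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    see = math.sqrt(residual_ss / (n - 2))
    return slope, intercept, r, see


def caco2_to_human(papp, slope, intercept):
    """Convert a CACO-2 permeability to an estimated human permeability."""
    return 10.0 ** (intercept + slope * math.log10(papp))


# Placeholder values (cm/s), invented for illustration only.
caco2_papp = [0.5e-6, 2e-6, 8e-6, 20e-6, 50e-6]
human_peff = [0.2e-4, 0.6e-4, 1.5e-4, 3.0e-4, 6.0e-4]

slope, intercept, r, see = fit_log_log(caco2_papp, human_peff)
print(slope, intercept, r, see)
print(caco2_to_human(10e-6, slope, intercept))
```

Because the conversion is applied before the absorption simulation, the standard error of estimate of the fit adds directly to the uncertainty of the permeability input.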
The IDEA™ model has been trained with CACO-2 values
generated in Lion's assay. It is well known that CACO-2
data generated in different laboratories can vary considerably
(Artursson et al., 1996) and the IDEA™ manual
therefore recommends comparison to data from the Lion
assay for a set of example drugs. In our case, our data were
not within 2-fold of Lion data for three out of twelve
compounds. As this seemed to be a borderline situation we
investigated the results using both unchanged CACO-2
data as well as data transformed via the correlation (R² =
0.77, SE = 0.5 log units, N = 12). In fact there was no
significant difference in the results obtained and so we
report here the predictions where our CACO-2 data was
input unchanged.
`
2.4. Assessment of the predictions
`
`In addition to graphical plots of observed versus pre-
`dicted fraction absorbed the following numerical measures
`have been applied to facilitate assessment of the predic-
`tions.
The ability to correctly categorize compounds.
Categories used are:

• High: Fraction absorbed >66%
• Medium: 33% < Fraction absorbed < 66%
• Low: Fraction absorbed ≤33%
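
The three categories above translate into a simple threshold function. The following is a minimal sketch; the function name is hypothetical, and assigning a value of exactly 66% to the medium class is an assumption, since the boundary is not stated unambiguously.

```python
def absorption_class(fraction_absorbed_pct: float) -> str:
    """Map a fraction absorbed (%) onto the Low / Medium / High categories."""
    if fraction_absorbed_pct <= 33:
        return "Low"
    if fraction_absorbed_pct > 66:
        return "High"
    # Assumption: a value of exactly 66% is treated as Medium.
    return "Medium"


# A 5-point difference can still straddle a class boundary.
print(absorption_class(30), absorption_class(35))
```

A prediction of 35% against an observation of 30% therefore counts as a misclassification even though the numerical error is small, which is why classification is not used as the only measure.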
`
The root mean squared error (RMSE) for predictions of
fraction absorbed.

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (\mathrm{observed}_i - \mathrm{predicted}_i)^2}{N}}$$
`Each of these measures has weaknesses when used
`alone. Graphical plots provide an overall impression of
`prediction quality but do not allow quantitative com-
`parisons. Categorization reflects well the likely practical
`application of the predictions in drug discovery projects
`where one tends to rank and classify compounds. How-
`ever, classification can be misleading when the predictions
`fall close to the boundary regions e.g. 35% predicted vs.
`30% observed would be in the wrong class but is actually a
`good prediction.
`RMSE is a commonly used measure for such predictions
`and thus allows comparison with previous published
`results but this too can be misleading when the test sets are
`different. For example 100% absorbed compounds are not
`the most challenging to predict and a very large proportion
`
`
Table 3
Prediction of absorption class with pure in silico input

Drug                   Observed            IDEA™              GASTROPLUS™
                       absorption class    predicted class    predicted class
Aciclovir              Low                 Med                Med
Amiloride              Med                 Med                High
Antipyrine             High                High               High
Atenolol               Med                 Med                Med
Carbamazepine          High                High               High
Chloramphenicol        High                High               High
Desipramine            High                Med                High
Diazepam               High                High               High
Diltiazem              High                High               High
Etoposide              Med                 High               Med
Furosemide             Med                 Med                Med
Ganciclovir            Low                 Low                Low
Hydrochlorothiazide    High                High               Med
Ketoprofen             High                High               High
Metoprolol             High                High               High
Naproxen               High                High               High
Penicillin V           Med                 High               Med
Pirenzepine            Low                 High               High
Piroxicam              High                High               High
Progesterone           High                Med                Med
Propranolol            High                High               High
Ranitidine             Med                 High               High
Saquinavir             Low                 Low                High
Sulpiride              Med                 Med                High
Terbutaline            Med                 High               High
Theophylline           High                High               High
Verapamil              High                High               High
Warfarin               High                High               High
Correct predictions
for 28 drugs                               20                 19

Fraction absorbed classes are defined as: low ≤33%, medium 33–66%,
high ≥66%.
`
`to conclude that our results must be considered as optimis-
`tic for the predictions likely to be achieved for novel
`structures.
Overall the pure in silico performance of the two tools is
similar at around 70% correct classification and it is
interesting to compare this with other in silico approaches.
For example the ‘Pfizer rules’ or ‘rule of 5’ (Lipinski et
al., 1997) is often used as an alert for compounds likely to
be poorly absorbed. In our implementation, the ‘Pfizer
rules’ correctly provide an alert for saquinavir but not for
the three other low absorption drugs and a false alert is
triggered for etoposide, a medium absorption drug. Using a
more sophisticated approach, Wessel et al. (1998) trained a
six-descriptor neural network model with a data set of 76
compounds to predict for a test set of 10 drugs and
achieved an RMSE of 16%. Zhao et al. (2001) used
multiple regression based upon Abraham structural descriptors
and achieved an RMSE of 14% when predicting
for a test set of 131 drugs. The RMSE obtained from
GASTROPLUS™ was 24%; however, direct comparison of
RMSE is not valid because the data sets are different.
Table 4 enables comparison of the drugs from our data set
`
Fig. 4. Surface plot generated with GASTROPLUS™ showing the dependency
`of fraction absorbed on solubility and permeability for a log P of 0 and a
`dose of 100 mg. The ellipse superimposed onto the plot shows how a
`variability of 0.5 log units in the two key input parameters translates to
`approximately 20% error in predicted fraction absorbed.
`
we should certainly not assign significance to differences
of <1.5% in RMSE of the predictions.
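
The resampling exercise described in this section (normally distributed noise with a standard deviation of 14% applied to the 28 observed values, truncation at 0% and 100%, 300 trials) can be sketched as follows. Several details are assumptions rather than the procedure actually used: truncation is implemented by clipping, each trial's RMSE is computed between the resampled and the nominal observed values, and the random seed is arbitrary. The resulting mean and spread therefore need not match the figures quoted in the text. The observed values are those of Table 1; the function names are hypothetical.

```python
import math
import random
import statistics

# Observed fraction absorbed (%) for the 28 drugs, in the order of Table 1.
OBSERVED = [23, 50, 97, 50, 70, 90, 100, 100, 90, 50, 61, 3, 69, 92,
            95, 99, 38, 27, 100, 100, 99, 63, 30, 44, 62, 100, 100, 98]


def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def simulate_rmse(observed, sd=14.0, trials=300, seed=0):
    """RMSE of noisy resamples against the nominal observed values.

    Assumptions: Gaussian noise, clipping to the 0-100% range, and
    comparison of each resample with the nominal values.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        resampled = [min(100.0, max(0.0, rng.gauss(o, sd))) for o in observed]
        results.append(rmse(observed, resampled))
    return results


trial_rmse = simulate_rmse(OBSERVED)
print(statistics.mean(trial_rmse), statistics.stdev(trial_rmse))
```

The spread of the per-trial RMSE gives a rough yardstick for how large a difference between two tools must be before it exceeds what noise in the clinical data alone would produce.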
`
3. Results and discussion

3.1. Prediction based upon chemical structure alone
`
Table 3 compares the results after conversion of the
GASTROPLUS™ prediction of fraction absorbed to a classification
identical to that used by IDEA™.
For IDEA™ 20 out of 28 drugs (71%) are assigned to the
correct absorption class. Two of the Low class compounds
are correctly predicted (ganciclovir, saquinavir) while two
are missed (aciclovir, pirenzepine) and in the case of
pirenzepine by more than one class.
For GASTROPLUS™ 19 out of 28 drugs (68%) are assigned
to the correct absorption class. In addition to the two low
category drugs missed by IDEA™, saquinavir (fabs% = 30) is
also misclassified by GASTROPLUS™.
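
The classification rates quoted above can be re-derived by tallying the class columns of Table 3. The sketch below encodes the three columns in table order (L = Low, M = Med, H = High); the variable and function names are hypothetical.

```python
# Class columns of Table 3, one letter per drug in table order.
OBSERVED = "L M H M H H H H H M M L H H H H M L H H H M L M M H H H".split()
IDEA = "M M H M H H M H H H M L H H H H H H H M H H L M H H H H".split()
GASTROPLUS = "M H H M H H H H H M M L M H H H M H H M H H H H H H H H".split()


def correct_count(observed, predicted):
    """Number of drugs whose predicted class equals the observed class."""
    return sum(o == p for o, p in zip(observed, predicted))


for name, predicted in (("IDEA", IDEA), ("GASTROPLUS", GASTROPLUS)):
    n_correct = correct_count(OBSERVED, predicted)
    print(name, n_correct, round(100 * n_correct / len(OBSERVED)))
```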
`In considering the performance of the pure in silico
`approach it must be noted that the set of 28 drugs used are
`well studied and might belong to the training sets that have
`been used for the development of the models. Neither
`company discloses their internal training sets and so we
`can only speculate on this. However, we feel that it is fair
`
`
Table 4
Summary of pure in silico predictions achieved by Zhao and Wessel

Drug                   Obs. Abs.   Zhao Pred.   Correct class?
Aciclovir              23          63           N
Amiloride              50          54           Y
Antipyrine             97          95           Y
Atenolol               50          79           N
Carbamazepine          70          –            –
Chloramphenicol        90          79           Y
Desipramine            100         106          Y
Diazepam               100         104          Y
Diltiazem              90          –            –
Etoposide              50          68           N
Furosemide             61          69           N
Ganciclovir            3           55           N
Hydrochlorothiazide    69          55           Y
Ketoprofen             92          95           Y
Metoprolol             95          94           Y
Naproxen               99          94           Y
Penicillin V           59          76           N
Pirenzepine            27          –            –
Piroxicam              100         78           Y
Progesterone           100         111          Y
Propranolol            99          98           Y
Ranitidine             64          78           N
Saquinavir             30          –            –
Sulpiride              44          72           N
Terbutaline            62          58           Y
Theophylline           100         76           Y
Verapamil              100         108          Y
Warfarin               98          94           Y
Classification rate                16 / 24 (66%)
RMSE                               19%

Wessel Pred.:         92, 80, –, 96, –, 52, 89, 6, 62, 100, 91, 100, –, 94, 96, 77, –, 100
Correct class?:       Y, N, –, Y, –, Y, N, Y, Y, Y, Y, Y, –, Y, Y, N, –, Y
Classification rate:  11 / 14 (78%)
RMSE:                 12%
`
`in common with the Zhao and Wessel data sets in terms of
`the rate of correct classification and the RMSE.
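
As a consistency check on Table 4, the RMSE for the Zhao predictions can be recomputed from the 24 drugs that have both an observed and a predicted value. This is only an illustrative recalculation from the tabulated numbers; the variable names are hypothetical.

```python
import math

# (observed, Zhao predicted) fraction absorbed (%) for the 24 drugs in
# Table 4 that have a Zhao prediction, in table order.
ZHAO_PAIRS = [
    (23, 63), (50, 54), (97, 95), (50, 79), (90, 79), (100, 106),
    (100, 104), (50, 68), (61, 69), (3, 55), (69, 55), (92, 95),
    (95, 94), (99, 94), (59, 76), (100, 78), (100, 111), (99, 98),
    (64, 78), (44, 72), (62, 58), (100, 76), (100, 108), (98, 94),
]

zhao_rmse = math.sqrt(
    sum((obs - pred) ** 2 for obs, pred in ZHAO_PAIRS) / len(ZHAO_PAIRS)
)
print(round(zhao_rmse, 1))
```

The recomputed value rounds to the 19% given in the summary row of the table.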
`
3.2. Prediction based upon measured solubility and
predicted permeability
`
Fig. 5 shows the results obtained with IDEA™. There is
an apparent tendency for underprediction of the high
absorption compounds while for medium category drugs
the fraction absorbed was overpredicted.
Fig. 6 shows the results obtained with GASTROPLUS™. The
predictions show a different pattern to those from IDEA™
but overall the accuracy of the predictions as measured by
the rate of correct categorization or the RMSE is the same
for IDEA™ and GASTROPLUS™ being 19 out of 28 and 22% in
both cases.
`
3.3. Prediction based upon measured solubility and
measured CACO-2 permeability
`
Fig. 7 shows the results obtained with IDEA™ and Fig. 8
those with GASTROPLUS™.
Table 5 summarizes all predictions. Of the four poorly
absorbed drugs in our data set, ganciclovir is the only drug
that is consistently well predicted with aciclovir, pirenzepine
and saquinavir being over predicted by both IDEA™
and GASTROPLUS™.
Based on pure in silico as well as on measured solubilities
there was no significant difference in performance
of GASTROPLUS™ and IDEA™. However, IDEA™ did show an
improvement in predictive accuracy when measured
CACO-2 permeability was used as input. This improvement
was apparent when measured by either the correct
classification rate or the RMSE. But even here only one of
the four poorly absorbed compounds was correctly identified
and pirenzepine, a low absorption compound, would
have been misclassified as high absorption. Interestingly,
GASTROPLUS™ did not show a comparable improvement in
predictiveness. Thus independently of the permeability
used (in silico or experimental), the number of correct
classifications was 19. Although two of the poorly absorbed
compounds were correctly recognized there was a
tendency to underpredict absorption for medium range
compounds (e.g. hydrochlorothiazide, theophylline). For
early drug discovery such a tendency for false negatives
would be more problematic than overpredictions (false
positives) because it could lead to rejection of potentially
interesting compounds before in vivo tests are performed.
`
`
Fig. 5. IDEA™ predicted fraction absorbed based on measured solubility
`and predicted CACO-2 permeability. The line of unity is shown on the
`graph. Categorization of fraction absorbed into high, medium or low
`absorption is correct for 19 out of 28 drugs. The root mean squared error
`of prediction is 22%.
`
`On the other hand, false positives, although clearly unde-
`sirable, would probably be found out at the stage of the in
`vivo assay. Part of the reason for the poorer performance
of GASTROPLUS™ might be related to the need to transform
in vitro permeability data to in vivo human permeability.
The human jejunal permeability data is associated with
large variability and this certainly introduces considerable
additional error into the predictions. Conversely, IDEA™ is
calibrated directly for CACO-2 data and so the transformation
step is avoided.
`Independently of the approach and the software used,
`several drugs were consistently badly predicted with an
`error of more than 20%. Some comments and possible
`explanations for these incorrect predictions are considered
`here.
A number of these compounds like pirenzepine,
ranitidine and sulpiride are absorbed passively through the
paracellular route. Pirenzepine is overpredicted by both
GASTROPLUS™ and IDEA™ based upon both in silico and in
vitro permeability. Furosemide, ranitidine and sulpiride are
drugs with low CACO-2 permeability which are better
predicted by IDEA™ than GASTROPLUS™.
`
Fig. 6. GASTROPLUS™ predicted fraction absorbed based on measured
`solubility and predicted human jejunal permeability. The line of unity is
`shown on the graph. Categorization of fraction absorbed into high,
`medium or low absorption is correct for 19 out of 28 drugs. The root
`mean squared error of prediction is 22%.
`
For hydrochlorothiazide the fraction absorbed predicted
with GASTROPLUS™ based on CACO-2 data and measured
solubility (24%) was significantly lower than the 69%
observed in vivo. Hydrochlorothiazide is consistently
under predicted by GASTROPLUS™ based upon both in silico
and in vitro permeability. In fact we found that even when
using the measured human jejunal permeability this compound
is considerably under predicted. It seems that
hydrochlorothiazide could not be well modeled by SIMULATIONS
PLUS and both hydrochlorothiazide and furosemide
were left out of the QMPRPLUS™ models of human permeability
most likely because the low in vivo permeability
seems inconsistent with the observed intestinal absorption
and bioavailability in man. Amiloride is also under predicted
by GASTROPLUS™ when using in vitro permeability
but is better predicted by IDEA™. In common with hydrochlorothiazide
this drug lies in the region of lower permeability
where the IDEA™ model seems to be better
predictive.
Saquinavir is overpredicted by GASTROPLUS™ based upon
both in silico and in vitro permeability. Interestingly,
IDEA™ predicts the fraction absorbed for this drug better
`
Fig. 7. IDEA™ predicted fraction absorbed for measured solubility and
`measured CACO-2 permeability. The line of unity is shown on the graph.
`Categorization of fraction absorbed into high, medium or low absorption
`is correct for 22 out of 28 drugs. The r