Received 9 August 1996; accepted 14 August 1996

Abstract

Experimental and computational approaches to estimate solubility and permeability in discovery and development settings are described. In the discovery setting ‘the rule of 5’ predicts that poor absorption or permeation is more likely when there are more than 5 H-bond donors, 10 H-bond acceptors, the molecular weight (MWT) is greater than 500 and the calculated Log P (CLogP) is greater than 5 (or MLogP > 4.15). Computational methodology for the rule-based Moriguchi Log P (MLogP) calculation is described. Turbidimetric solubility measurement is described and applied to known drugs. High throughput screening (HTS) leads tend to have higher MWT and Log P and lower turbidimetric solubility than leads in the pre-HTS era. In the development setting, solubility calculations focus on exact value prediction and are difficult because of polymorphism. Recent work on linear free energy relationships and Log P approaches is critically reviewed. Useful predictions are possible in closely related analog series when coupled with experimental thermodynamic solubility measurements.
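The rule stated in the abstract can be sketched in a few lines of code. The sketch below is illustrative only and is not the authors' registration-system program: the `Properties` container and the function names are hypothetical, the descriptor values are assumed to have been computed elsewhere, and the alert threshold of two out-of-range parameters follows the description given later in Section 2.6, read here as "two or more".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Properties:
    """Precomputed descriptors for one compound (hypothetical container)."""
    h_bond_donors: int     # sum of OH and NH groups
    h_bond_acceptors: int  # sum of N and O atoms
    mol_weight: float      # MWT (formula weight in the case of a salt)
    clogp: float           # calculated Log P on the CLogP scale


def rule_of_5_violations(p: Properties) -> List[str]:
    """Return the names of the parameters that exceed the rule's cutoffs."""
    checks = {
        "H-bond donors > 5": p.h_bond_donors > 5,
        "H-bond acceptors > 10": p.h_bond_acceptors > 10,
        "MWT > 500": p.mol_weight > 500,
        "CLogP > 5": p.clogp > 5,
    }
    return [name for name, out_of_range in checks.items() if out_of_range]


def absorption_alert(p: Properties, threshold: int = 2) -> bool:
    """Assumption: alert when at least `threshold` parameters are out of range."""
    return len(rule_of_5_violations(p)) >= threshold
```

A compound with 6 donors, 12 acceptors and a MWT of 650 would trigger the alert, whereas a compound sitting exactly on the cutoffs (5 donors, 10 acceptors, MWT 500, CLogP 5) would not, because every comparison is strict. If MLogP were used instead of CLogP, the corresponding cutoff would be 4.15.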
Petitioner Torrent Pharmaceuticals Limited - Exhibit 1019 - Page 1

1. Introduction

This review presents distinctly different but complementary experimental and computational approaches to estimate solubility and permeability in drug discovery and drug development settings. In the discovery setting, we describe an experimental approach to turbidimetric solubility measurement as well as computational approaches to absorption and permeability. The absence of discovery experimental approaches to permeation measurements reflects the authors’ experience at Pfizer Central Research. Accordingly, the balance of poor solubility and poor permeation as a cause of absorption problems may be significantly different at other drug discovery locations, especially if chemistry focuses on peptidic-like compounds. This review deals only with solubility and permeability as barriers to absorption. Intestinal wall active transporters and intestinal wall metabolic events that influence the measurement of drug bioavailability are beyond the scope of this review. We hope to spark lively debate with our hypothesis that changes in recent years in medicinal chemistry physical property profiles may be the result of leads generated through high throughput screening. In the development setting, computational approaches to estimate solubility are critically reviewed based on current computational solubility research and experimental solubility measurements.

2. The drug discovery setting

2.1. Changes in drug leads and physico-chemical properties

In recent years, the sources of drug leads in the pharmaceutical industry have changed significantly. From about 1970 on, what were considered at that time to be large empirically-based screening programs became less and less important in the drug industry as the knowledge base grew for rational drug design [1]. Leads in this era were discovered using both in vitro and primary in vivo screening assays and came from sources other than massive primary in vitro screens. Lead sources were varied, coming from natural products; clinical observations of drug side effects [1]; published unexamined patents; presentations and posters at scientific meetings; published reports in scientific journals and collaborations with academic investigators. Most of these lead sources had the common theme that the ‘chemical lead’ already had undergone considerable scientific investigation prior to being identified as a drug lead. From a physical property viewpoint, the most poorly behaved compounds in an analogue series were eliminated and most often the starting lead was in a range of physical properties consistent with the previous historical record of discovering orally active compounds.

C.A. Lipinski et al. / Advanced Drug Delivery Reviews 23 (1997) 3-25
This situation changed dramatically about 1989-1991. Prior to 1989, it was technically unfeasible to screen for in vitro activity across hundreds of thousands of compounds, the volume of random screening required to efficiently discover new leads. With the advent of high throughput screening in the 1989-1991 time period, it became technically feasible to screen hundreds of thousands of compounds across in vitro assays [2-4]. Combinatorial chemistry soon began¹ and allowed automated synthesis of massive numbers of compounds for screening in the new HTS screens. The process was accelerated by the rapid progress in molecular genetics, which made possible the expression of animal and human receptor subtypes in cells lacking receptors that might interfere with an assay, and by the construction of receptor constructs to facilitate signal detection.

¹ A search through SciSearch and Chemical Abstracts for references to combinatorial chemistry in titles or descriptors using the truncated terms COMBIN? and CHEMISTR? gave the following number of references respectively: 1990, 0 and 0; 1991, 2 and 1; 1993, 8 and 8; 1994, 12 and 11; 1995, 46 and 45.

The screening of very large numbers of compounds necessitated a radical departure from the traditional method of drug solubilization. Compounds were no longer solubilized in aqueous media under thermodynamic equilibrating conditions. Rather, compounds were dissolved in dimethyl sulfoxide (DMSO) as stock solutions, typically at about 20-30 mmol, and then were serially diluted into 96-well plates for assays (perhaps with some non-ionic surfactant to improve solubility). In this paradigm, even very insoluble drugs could be tested because the kinetics of compound crystallization determined the apparent ‘solubility’ level. Moreover, compounds could partition into assay components such as membrane particulate material or cells or could bind to protein attached to the walls of the wells in the assay plate. The net effect was a screening technology for compounds in the µM concentration range that was largely divorced from the compound’s true aqueous thermodynamic solubility. The apparent ‘solubility’ in the HTS screen is always higher, sometimes dramatically so, than the true thermodynamic solubility achieved by equilibration of a well characterized solid with aqueous media. The in vitro HTS testing process is quite reproducible and potential problems related to poor compound solubility are often compensated for by the follow-up to the primary screen. This is typically a more careful, more labor-intensive process of in vitro retesting to determine IC50s from dose response curves with more attention paid to solubilization. The net result of all these testing changes is that in vitro activity is reliably detected in compounds with very poor thermodynamic solubility properties. A corollary result is that the measurement of the true thermodynamic aqueous solubility is not very relevant to the screening manner in which leads are detected.

2.2. Factors affecting physico-chemical lead profiles

The physico-chemical profile of current leads, i.e. the ‘hits’ in HTS screens, no longer depends on compound solubility sufficient for in vivo activity but depends on: (1) the medicinal chemistry principles relating structure to in vitro activity; (2) the nature of the HTS screen; (3) the physico-chemical profile of the compound set being screened and (4) human decision making, both overt and hidden, as to the acceptability of compounds as starting points for medicinal chemistry structure activity relationship (SAR) studies.

One of the most reliable methods in medicinal chemistry to improve in vitro activity is to incorporate properly positioned lipophilic groups. For example, addition of a single methyl group that can occupy a receptor ‘pocket’ improves binding by about 0.7 kcal/mol [6]. By way of contrast, it is generally difficult to improve in vitro potency by manipulation of the polar groups that are involved in ionic receptor interactions. The interaction of a polar group in a drug with solvent versus interaction with the target receptor is a ‘wash’ unless positioning of the polar group in the drug is precise. The traditional lore is that the lead has the polar groups in the correct (or almost correct) position and that in vitro potency is improved by correctly positioned lipophilic groups that occupy receptor pockets. Polar groups in the drug that are not required for binding can be tolerated if they occupy solvent space but they do not add to receptor binding. The net effect of these simple medicinal chemistry principles is that, other factors being equal, compounds with correctly positioned polar functionality will be more readily

detectable in HTS screens if they are larger and more lipophilic.

The nature of the screen determines the physico-chemical profile of the resultant ‘hits’. The larger the number of hits that are detected, the more the physico-chemical profile of the ‘hits’ resembles the overall compound set being screened. Technical factors such as the design of the screen and human cultural factors such as the stringency of the evaluation as to what is a suitable lead are major determinants of the physico-chemical profiles of the eventual leads. Screens designed with very high specificity, for example many receptor-based assays, generate small numbers of hits in the µM range. In these types of screens the signal is easy to detect against background noise, the hits are few or can be made few by altering potency criteria and the physico-chemical profiles tend towards more lipophilic, larger, less soluble compounds. Tight control of the criteria for activity detection in the initial HTS screen minimizes labor-intensive secondary evaluation and minimizes the effect of human biases. The downside is that lower potency hits with more favorable physico-chemical property profiles may be discarded.

Cell-based assays, by their very nature, tend to produce more ‘hits’ than receptor-based screens. These types of assays monitor a functional event, for example a change in the level of a signaling intermediate or the expression level of mRNA or protein. Multiple mechanisms may lead to the measured end point and only a few of these mechanisms may be desirable. This leads to a larger number of hits and therefore their physico-chemical profile will more closely resemble that of the compound set being screened. Perhaps equally importantly, a larger volume of secondary evaluation allows for a greater expression of human bias. Bias is especially difficult to quantify in the chemist’s perception of a desirable lead structure.

The physico-chemical profile of the compound set being screened is the first filter in the physico-chemical profile of an HTS ‘hit’. Obviously high molecular weight, high lipophilicity compounds will not be detected by a screen if they are not present in the library. In the real world, trade-offs occur in the choice of profiles for compound sets. An exclusively low molecular weight, low lipophilicity library likely increases the difficulty of detecting ‘hits’ but simplifies the process of discovering an orally active drug once the lead is identified. The converse is true of a high molecular weight, high lipophilicity library. In our experience, commercially available (non-combinatorial) compounds like those available from chemical supply houses tend towards lower molecular weights and lipophilicities.

Human decision making, both overt and hidden, can play a large part in the profile of HTS ‘hits’. For example, a requirement that ‘hits’ possess an acceptable range of measured or calculated physico-chemical properties will obviously affect the starting compound profiles for medicinal chemistry SAR. Less obvious are hidden biases. Are the criteria for a ‘hit’ changing to higher potency (lower IC50) as the HTS screen runs? Labor-intensive secondary follow-up is decreased but less potent, perhaps physico-chemically more attractive leads, may be eliminated. How do chemists react to potential lead structures? In an interesting experiment, we presented a panel of our most experienced medicinal chemists with a group of theoretical lead structures, all containing literature ‘toxic’ moieties. Our chemists split into two very divergent groups: those who saw the toxic moieties as a bar to lead pursuit and those who recognized the toxic moiety but thought they might be able to replace the offending moiety. An easy way to illustrate the complexity of the chemist’s perception of lead attractiveness is to examine the remarkably diverse structures of the new chemical entities (NCEs) introduced to market that appear at the back of recent volumes of Annual Reports in Medicinal Chemistry. No single pharmaceutical company can conduct research in all therapeutic areas and so some of these compounds, which are all marketed drugs, will inevitably be less familiar and potentially less desirable to the medicinal chemist at one research location, but may be familiar and desirable to a chemist at another research site.

The idea in selecting a library with good absorption properties is to use the clinical Phase II selection process as a filter. Drug development is expensive and the most poorly behaved compounds are weeded out early. Our hypothesis was that poorer physico-chemical properties would predominate in the many

compounds that enter into and fail to survive pre-clinical stages and Phase I safety evaluation. We expected that the most insoluble and poorly permeable compounds would have been eliminated in those compounds that survived to enter Phase II efficacy studies. We could use the presence of United States Adopted Name (USAN) or International Non-proprietary Name (INN) names to identify compounds entering Phase II since most drug companies (including Pfizer) apply for these names at entry to Phase II.

The World Drug Index (WDI) is a very large computerized database of about 50 000 drugs from the Derwent Co. The process used to select a subset of 2245 compounds from this database that are likely to have superior physico-chemical properties is as follows. From the 50 427 compounds in the WDI file, 7894 with a data field for a USAN name were selected, as were 6320 with a data field for an INN. From the two lists, 8548 compounds had one or both USAN or INN names. These were searched for a data field ‘indications and usage’ suggesting clinical exposure, resulting in 3704 entries. From the 3704, using a substructure data field, we eliminated 1176 compounds with the text string ‘POLY’, 87 with the text string ‘PEPTIDE’ and 101 with the text string ‘QUAT’. Also eliminated were 53 compounds containing the fragment O=P-O. We coined the term ‘USAN’ library for this collection of drugs.

2.4. The target audience - medicinal chemists

Having identified a library of drugs selected by the economics of entry to the Phase II process, we sought to identify calculable parameters for that library that were likely related to absorption or permeability. Our approach and choice of parameters was dictated by very pragmatic considerations. We wanted to set up an absorption-permeability alert procedure to guide our medicinal chemists.
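The selection steps described above for the ‘USAN’ library amount to a sequence of record filters, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the query actually run against the WDI: the record layout and the field names (`usan_name`, `inn_name`, `indications_and_usage`, `substructure`, `has_o_p_o_fragment`) are hypothetical stand-ins for the database fields named in the text.

```python
# Text strings whose presence in the substructure field excludes a compound.
EXCLUDED_TEXT = ("POLY", "PEPTIDE", "QUAT")


def keep_for_usan_library(record: dict) -> bool:
    """Apply the selection steps to one record (field names are hypothetical)."""
    # Step 1: the compound carries a USAN name, an INN name, or both.
    has_name = bool(record.get("usan_name")) or bool(record.get("inn_name"))
    # Step 2: an 'indications and usage' entry suggests clinical exposure.
    has_indication = bool(record.get("indications_and_usage"))
    # Step 3: drop entries flagged as polymers, peptides or quaternary salts.
    substructure = str(record.get("substructure", "")).upper()
    has_excluded_text = any(term in substructure for term in EXCLUDED_TEXT)
    # Step 4: drop compounds containing the O=P-O fragment (assumed precomputed flag).
    has_o_p_o = bool(record.get("has_o_p_o_fragment", False))
    return has_name and has_indication and not has_excluded_text and not has_o_p_o


def select_usan_library(records: list) -> list:
    """Return the records that survive every filter."""
    return [r for r in records if keep_for_usan_library(r)]
```

The counts quoted for the first step are mutually consistent: if 7894 compounds have a USAN name, 6320 have an INN, and 8548 have one or both, then 7894 + 6320 - 8548 = 5666 compounds must carry both names.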
Keeping in mind our target audience of organic chemists, we wanted to focus on the chemists’ very strong pattern recognition and chemical structure recognition skills. If our target audience had been pharmaceutical scientists we would not have deliberately excluded equations or regression coefficients. Experience had taught us that a focus on the chemists’ very strong skills in pattern recognition and their outstanding chemistry structural recognition skills was likely to enhance information transfer. In effect, we deliberately emphasized enhanced educational effectiveness towards a well defined target audience at the expense of a loss of detail. Tailoring the message to the audience is a basic communications principle. One has only to look at the popular chemistry abstracting booklets with their page after page of chemistry structures and minimal text to appreciate the chemists’ structural recognition skills. We believe that our chemists have accepted our calculations at least in part because the calculated parameters are very readily visualized structurally and are presented in a pattern recognition format.

2.5. Calculated properties of the ‘USAN’ library

Molecular weight (formula weight in the case of a salt) is an obvious choice because of the literature relating poorer intestinal and blood brain barrier permeability to increasing molecular weight [7,8] and the more rapid decline in permeation time as a function of molecular weight in lipid bi-layers as opposed to aqueous media [9]. The molecular weights of compounds in the 2245 USANs were lower than those in the whole 50 427 WDI data set. In the USAN set 11% had MWTs > 500 compared to 22% in the entire data set. Compounds with MWT > 600 were present at 8% in the USAN set compared to 14% in the entire data set. This difference is not explainable by the elimination of the very high MWTs in the USAN selection process. Rather it reflects the fact that higher MWT compounds are in general less likely to be orally active than lower MWTs.

Lipophilicity expressed as a ratio of octanol solubility to aqueous solubility appears in some form in almost every analysis of physico-chemical properties related to absorption [10]. The computational problem is that an operationally useful computational alert to possible absorption-permeability problems must have a no-fail log P calculation. In our experience, the widely used and accurate Pomona College Medicinal Chemistry program applied to our compound file failed to provide a calculated log P (CLogP) value because of missing fragments for at least 25% of compounds. The problem is not an inordinate number of ‘strange fragments’ in our chemistry libraries but rather lies in the direction of the trade-off between accuracy and ability to calculate all compounds adopted by the Pomona College

team. The CLogP calculation emphasizes high accuracy over breadth of calculation coverage. The fragmental CLogP value is defined with reference to five types of intervening isolating carbons between the polar fragments. As common a polar fragment as a sulfide (S) linkage generates missing fragments when flanked by rare combinations of the isolating carbon types. Polar fragments as defined by the CLogP calculation can be very large and are not calculated as the sum of smaller, more common, polar fragments. This approach enhances accuracy but increases the number of missing fragments.

We implemented the log P calculation (MLogP) as described by Moriguchi et al. [11] within the Molecular Design Limited MACCS and ISIS base programs to avoid the missing fragment problem. As a rule-based system, the Moriguchi calculation always gives an answer. The pros and cons of the Moriguchi algorithm have been debated in the literature [12,13]. We recommend that, within analog series, our medicinal chemists use the more accurate Pomona CLogP calculation if possible. For calculation or tracking of library properties the less accurate MLogP program is used.

Only about 10% of USAN compounds have a CLogP over 5. The CLogP value of 5 calculated on the USAN data set corresponds to an MLogP of 4.15. The slope of CLogP (X axis) versus MLogP (Y axis) is less than unity. At the high log P end, the Moriguchi MLogP is somewhat lower than the MedChem CLogP. In the middle log P range at about 2, the two scales are similar. Experimentally there is almost certainly a lower (hydrophilic) log P limit to absorption and permeation. Operationally, we have ignored a lower limit because of the errors in the MLogP calculation and because excessively hydrophilic compounds are not a problem in compounds originating in our medicinal chemistry laboratories.

An excessive number of hydrogen bond donor groups impairs permeability across a membrane bi-layer [14,15]. Hydrogen donor ability can be measured indirectly by the partition coefficient between strongly hydrogen bonding solvents like water or ethylene glycol and a non hydrogen bond accepting solvent like a hydrocarbon [15] or as the log of the ratio of octanol to hydrocarbon partitioning. In vitro systems for studying intestinal drug absorption have been recently reviewed [16]. Computationally, hydrogen donor ability differences can be expressed by the solvatochromic α parameter of a donor group with perhaps a steric modifier to allow for the interactions between donor and acceptor moieties. Experimental α values for hydrogen bond donors and β values for acceptor groups [17] have been compiled by Professor Abraham in the UK and by the Raevsky group in Russia [18,19]. Both research groups currently express the hydrogen bond donor and acceptor properties of a moiety on a thermodynamic free energy scale. In the Raevsky C scale, donors range from about -4.0 for a very strong donor to -0.5 for a very weak donor. Acceptor values in the Raevsky C scale are all positive and range from about 4.0 for a strong acceptor to about 0.5 for a weak acceptor. In the Abraham scale both donors and acceptors have positive values that are about one-quarter of the absolute C values in the Raevsky scale.

We found that simply adding the number of NH and OH bonds does remarkably well as an index of H bond donor character. Importantly, this parameter has direct structural relevance to the chemist. When one looks at the USAN library there is a sharp cutoff in the number of compounds containing more than 5 OHs and NHs. Only 8% have more than 5. So 92% of compounds have five or fewer H bond donors and it is the smaller number of donors that the literature links with better permeability.

Too many hydrogen bond acceptor groups also hinder permeability across a membrane bi-layer. The sum of Ns and Os is a rough measure of H bond accepting ability. This very simple calculation is not nearly as good as the OH and NH count (as a model for donor ability) because there is far more variation in hydrogen bond acceptor than donor ability across atom types. For example, a pyrrole and pyridine nitrogen count equally as acceptors in the simple N O sum calculation even though a pyridine nitrogen is a very good acceptor (2.72 on the C scale) and the pyrrole nitrogen is a far poorer acceptor (1.33 on the C scale). The more accurate solvatochromic β parameter, which measures acceptor ability, varies far more on a per nitrogen or oxygen atom basis than the corresponding α parameter. When we examined the USAN library we found a fairly sharp cutoff in profiles with only about 12% of compounds having more than 10 Ns and Os.

2.6. The ‘rule of 5’ and its implementation

At this point we had four parameters that we

thought should be globally associated with solubility and permeability; namely molecular weight; Log P; the number of H-bond donors and the number of H-bond acceptors. In a manner similar to setting the confidence level of an assay at 90 or 95% we asked how these four parameters needed to be set so that about 90% of the USAN compounds had parameters in a calculated range associated with better solubility or permeability. This analysis led to a simple mnemonic which we called the ‘rule of 5’ [20] because the cutoffs for each of the four parameters were all close to 5 or a multiple of 5. In the USAN set we found that the sum of Ns and Os in the molecular formula was greater than 10 in 12% of the compounds. Eleven percent of compounds had a MWT of over 500. Ten percent of compounds had a CLogP larger than 5 (or an MLogP larger than 4.15) and in 8% of compounds the sum of OHs and NHs in the chemical structure was larger than 5.

The ‘rule of 5’ states that poor absorption or permeation is more likely when:

  there are more than 5 H-bond donors (expressed as the sum of OHs and NHs);
  the MWT is over 500;
  the Log P is over 5 (or MLogP is over 4.15);
  there are more than 10 H-bond acceptors (expressed as the sum of Ns and Os).

Compound classes that are substrates for biological transporters are exceptions to the rule.

When we examined combinations of any two of the four parameters in the USAN data set, we found that combinations of two parameters outside the desirable range did not exceed 10%. The exact values from the USAN set are: sum of N and O + sum of NH and OH, 10%; sum of N and O + MWT, 7%; sum of NH and OH + MWT, 4%; and sum of MWT + Log P, 1%. The rarity (1%) among USAN drugs of the combination of high MWT and high log P was striking because this particular combination of physico-chemical properties in the USAN list is enhanced in the leads resulting from high throughput screening.

The rule of 5 is now implemented in our registration system for new compounds synthesized in our medicinal chemistry laboratories and the calculation program runs automatically as the chemist registers a new compound. If two parameters are out of range, a ‘poor absorption or permeability is possible’ alert appears on the registration screen. All new compounds are registered and so the alert is a very visible educational tool for the chemist and serves as a tracking tool for the research organization. No chemist is prevented from registering a compound because of the alert calculation.

2.7. Orally active drugs outside the ‘rule of 5’ mnemonic and biologic transporters

The ‘rule of 5’ is based on a distribution of calculated properties among several thousand drugs. Therefore, by definition, some drugs will lie outside the parameter cutoffs in the rule. Interestingly, only a small number of therapeutic categories account for most of the USAN drugs with properties falling outside our parameter cutoffs. These orally active therapeutic classes outside the ‘rule of 5’ are: antibiotics, antifungals, vitamins and cardiac glycosides. We suggest that these few therapeutic classes contain orally active drugs that violate the ‘rule of 5’ because members of these classes have structural features that allow the drugs to act as substrates for naturally occurring transporters. When

