`ESCOM
`
`593
`
`J-CAMD 183
`
`LUDI: rule-based automatic design of new substituents for
`enzyme inhibitor leads
`
Hans-Joachim Böhm
`BASF AG, Central Research, 6700 Ludwigshafen, Germany
`
`Received 10 July 1992
`Accepted 17 August 1992
`
`Key words: Enzymes; Enzyme inhibitors; Molecular modeling; Drug design; De novo design
`
`SUMMARY
`
Recent advances in a new method for the de novo design of enzyme inhibitors are reported. A new set
of rules to define the possible nonbonded contacts between protein and ligand is presented. The rules were
derived from published statistical analyses of nonbonded contacts in crystal packings of organic molecules
and have been implemented in the recently described computer program LUDI. Moreover, LUDI can now
append a new substituent onto an already existing ligand. Applications are reported for the design of
inhibitors of HIV protease and dihydrofolate reductase. The results demonstrate that LUDI is capable of
designing new ligands with improved binding when compared to the reference compound.
`
1. INTRODUCTION
`
The de novo design of protein ligands has recently gained increased attention [1-9]. Most effort
so far has focused on the calculation of favorable binding sites [1-3] and on the docking of given
ligands into the binding pocket of a protein [4,5]. A few groups have also reported on the
automatic design of novel ligands [6-9].
Recently, I reported a new method for the de novo design of enzyme inhibitors, called LUDI
[9]. This method is based on a statistical analysis of nonbonded contacts found in the Cambridge
Structural Database (CSD) [10]. The first version of the program made direct use of the contact
patterns retrieved from the CSD and utilized them to position small molecules or fragments in a
cleft in a protein structure (e.g. an active site) in such a way that hydrogen bonds are formed with
the protein and hydrophobic pockets are filled with suitable side chains of the ligand. In the first
paper on LUDI [9] I presented a very simple set of rules to generate the positions of atoms on the
basis of fragments found suitable to form favorable interactions with the protein. However, this
first set of rules turned out to be too simplistic because it took into account only the most heavily
populated hydrogen-bond geometries. The direct use of contact geometries from the CSD carries
the danger that some potentially important contact patterns are not included because they have
`
`0920-654X/$ 5.00 © 1992 ESCOM Science Publishers B.V.
`
`
`
`
not yet shown up in the crystal structures of small molecules. One should keep in mind that
despite the rather large number of structures (90 000) currently contained in the CSD (1991 version),
the number of certain nonbonded contacts relevant for ligand-protein interactions may be very
small.
I therefore decided to develop a new set of rules for nonbonded contacts on the basis of the
experimentally observed range of nonbonded contact geometries revealed by statistical analysis of
the CSD [11-18]. This new set of rules is thought to have the advantage of covering the complete
space of energetically favorable arrangements for hydrogen bonds and hydrophobic contacts. The
analysis of the CSD is used to define the range of allowed angles and dihedrals (see Fig. 1 for
definition of the angles and dihedrals) describing the nonbonded contact geometry. This space is then
populated by discrete points (or vectors) that are equally spaced. The point density can be
controlled by the user. Note that the data from the statistical analysis of the CSD are used merely to
derive the allowed range of contact geometries. The rules derived from the CSD do not take into
account the experimentally observed different populations of different contact geometries.
In addition, some other improvements to LUDI are reported concerning the positioning of
fragments, the evaluation of positioned fragments and the possible prioritization of the structures
found to fit the binding site of a protein. Another new functionality that has been added to LUDI
is the ability to link a new fragment to an already existing ligand while forming hydrogen bonds
with the protein and filling a hydrophobic pocket. This feature offers the important possibility to
design new substituents for a given lead compound.
`
Fig. 1. Definition of the geometric parameters R, α and ω used in the rules for the allowed nonbonded contacts.
A: definition for terminal groups; B: definition for -O-; C: definition for -N=. For -N= groups, α denotes the
angle between the bisector of the angle C=N-R and the vector N..H.
`
`
`
Finally, LUDI was used to design new inhibitors of the aspartic protease of the human
immunodeficiency virus (HIV) and dihydrofolate reductase (DHFR).
`
`2. METHODOLOGY
`
`2.1. A new set of rules to generate the potential interaction sites
Interactions between a protein and its ligand are usually formed through favorable nonbonded
contacts such as hydrogen bonds or hydrophobic interactions. These contacts may be divided into
individual interactions between single atoms or functional groups of the protein and the ligand.
Thus, for every atom or functional group of the protein that is involved in binding with the ligand,
there exists a counterpart on the ligand. This counterpart is again an atom or a functional group.
For example, the counterpart for a carbonyl group C=O of the protein may be an amino group
N-H of the ligand. A suitable position for such a functional group or atom of the ligand is referred
to as its 'interaction site'. A statistical analysis of hydrogen-bond geometries in crystal packings
of small molecules [11-18] reveals that there is a rather broad distribution of hydrogen-bond
patterns. Therefore, for every functional group of the protein there exists not only a single position
but also a region in space suitable for favorable interactions with the protein. In LUDI, this
distribution of possible contact patterns is taken into account by using an ensemble of interaction sites
distributed over the whole region of possible contact patterns. This approach has the advantage
that it is purely geometrical and therefore avoids costly calculations of potential functions.
`The definition of an interaction site has been given previously [9]. LUDI distinguishes between
`four different types of interaction sites:
`1. hydrogen-donor,
`2. hydrogen-acceptor,
`3. lipophilic-aliphatic,
`4. lipophilic-aromatic.
In LUDI, the hydrogen-donor and hydrogen-acceptor interaction sites are described by vectors
(atom pairs) to account for the strong directionality of hydrogen bonds. Hydrogen-donor sites are
represented by D-X vectors (R(D-X) = 1 Å) and hydrogen-acceptor sites are represented by A-Y
vectors (R(A-Y) = 1.23 Å). The particular lengths for the vectors were chosen to correspond roughly to
the N-H/O-H and C=O bond lengths, respectively. A suitable type of interaction site is selected
for each functional group or atom of the enzyme. Then a user-defined number of interaction sites
is positioned. This positioning is guided by the rules.
The rules used to generate the hydrogen-donor and hydrogen-acceptor interaction sites will
now be described. For the hydrophobic contacts the same rules are used as given in my previous
paper [9]. The position of an interaction site is described by the distance R, angle α and dihedral
ω as defined in Fig. 1. The available experimental data on nonbonded contact geometries in
crystal packings of small organic molecules are used to define the allowed values for R, α and ω. The
region in space defined by the values is then populated by discrete interaction sites. The distance
between the interaction sites is typically 0.2-0.3 Å. The rules are summarized in Table 1.
The hydrogen-bond geometry of carbonyl groups in the solid state has been investigated
extensively [11,12,15]. The available data show a distribution of α from 110° to 180° with a preference
for the lone-pair direction (α = 120°, ω = 0°, 180°). However, as this preference is not particularly
pronounced and the other regions are also significantly populated, an even distribution of
interaction sites was used, with R(O..D) = 1.9 Å, α = 110-180° and ω = 0-360°. The optimal O..D-X
hydrogen bond is assumed to be linear (∠O..D-X = 180°). This distribution is applied for the
backbone carbonyl groups and those in the side chains of the amino acids Asn and Gln.
The distribution of hydrogen-acceptor atoms around an N-H group falls into a smaller region in
space than that around a carbonyl group. The statistical analyses that have been published
[12,14,15] all show a strong preference for a linear hydrogen bond with ∠N-H..O/N = 150-180°. A
very similar distribution has also been found around the N-H group in aromatic rings [13,15]. The
available data indicate similar distributions for N-H and O-H. Therefore, identical rules for both
groups were used to generate interaction sites with R(H..A) = 1.9 Å, α = 150-180° and ω = 0-360°.
This distribution was used for the backbone N-H groups and for the hydrogen-donor groups in
the side chains of the amino acids His, Gln, Asn, Ser, Thr and Tyr. For charged amino groups, a
slightly shorter hydrogen-bond length of R(H..A) = 1.8 Å was used. This shorter hydrogen-bond
length for charged groups has also been observed experimentally [14].
A problem arises with the generation of the position of the second atom, Y, adjacent to the
hydrogen-acceptor position A. The optimal position of this second atom is difficult to obtain from
available experimental data. The position of the site Y was thus generated assuming a dihedral
N-H..A-Y = 0°, an angle ∠H..A-Y = 110-180° and R(A-Y) = 1.23 Å, although the particular choice of
the dihedral is admittedly somewhat arbitrary.
`
TABLE 1
GEOMETRIC PARAMETERS DESCRIBING THE ALLOWED RANGE OF NONBONDED CONTACT GEOMETRIES
USED IN LUDI

Enzyme functional   Interaction   Geometric parameters                                          Reference
group               site
C=O                 D-X           R(O..D) = 1.9 Å, α = 110-180°, ω = 0-360°                     11,12,15
N-H, O-H            A-Y           R(H..A) = 1.9 Å, α = 150-180°, ω = 0-360°                     12,14,15
N-H (charged)       A-Y           R(H..A) = 1.8 Å, α = 150-180°, ω = 0-360°                     12,14,15
COO-                D-X           R(O..D) = 1.8 Å, α = 100-140°, ω = -50 to 50°, 130 to 230°    16
=N-                 D-X           R(N..D) = 1.9 Å, α = 150-180°, ω = 0-360°                     13,15
R-O-R (sp2)         D-X           R(O..D) = 1.9 Å, α = 100-140°, ω = -60 to 60°                 13,15
R-O-R (sp3)         D-X           R(O..D) = 1.9 Å, α = 90-130°, ω = -70 to 70°                  12,15,18
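
For readers who want to experiment with the rules, the entries of Table 1 can be held in a small lookup structure. The following Python sketch is one hypothetical encoding (the names `RULES` and `is_allowed` are not part of LUDI); the numerical values are copied from Table 1, and ω is compared modulo 360° so that a range such as 130 to 230° also accepts -170°.

```python
# Values transcribed from Table 1; distances in angstrom, angles in degrees.
RULES = {
    "C=O":          {"site": "D-X", "r": 1.9, "alpha": (110, 180), "omega": [(0, 360)]},
    "N-H,O-H":      {"site": "A-Y", "r": 1.9, "alpha": (150, 180), "omega": [(0, 360)]},
    "N-H(charged)": {"site": "A-Y", "r": 1.8, "alpha": (150, 180), "omega": [(0, 360)]},
    "COO-":         {"site": "D-X", "r": 1.8, "alpha": (100, 140),
                     "omega": [(-50, 50), (130, 230)]},
    "=N-":          {"site": "D-X", "r": 1.9, "alpha": (150, 180), "omega": [(0, 360)]},
    "R-O-R(sp2)":   {"site": "D-X", "r": 1.9, "alpha": (100, 140), "omega": [(-60, 60)]},
    "R-O-R(sp3)":   {"site": "D-X", "r": 1.9, "alpha": (90, 130), "omega": [(-70, 70)]},
}

def is_allowed(group, alpha, omega):
    """Check whether a contact geometry (alpha, omega) lies inside the rule."""
    rule = RULES[group]
    lo, hi = rule["alpha"]
    if not lo <= alpha <= hi:
        return False
    for w_lo, w_hi in rule["omega"]:
        # Try the dihedral and its 360-degree images against each interval.
        if any(w_lo <= omega + shift <= w_hi for shift in (-360, 0, 360)):
            return True
    return False
```

A call such as `is_allowed("COO-", 120, -170)` would then test a contact on the far side of a carboxylate against the second ω interval.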
`
`
`
`
The hydrogen-bond contact patterns around carboxylic acids have been studied by Görbitz and
Etter [16]. The data indicate a preference for ∠C=O..H = 120° and a dihedral O-C-O..H = 0°, 180°. These
authors found no indication that syn hydrogen bonds are inherently more favorable than anti
hydrogen bonds. Their data were translated into the following rules to generate the interaction sites
around a carboxylic acid: R(O..D) = 1.8 Å, α = 100-140°, ω = -50 to 50° and 130 to 230°.
The distribution of hydrogen donors around an unprotonated nitrogen in aromatic rings has
been investigated by Vedani and Dunitz [13]. The distribution of hydrogen donors is narrower
than that around a carbonyl group. The following rule (which applies to the unprotonated
nitrogen in the side chain of His) is derived from the results of Vedani and Dunitz: R(N..D) = 1.9 Å,
α = 150-180°, ω = 0-360°.
Hydroxyl groups can act both as hydrogen donors and as hydrogen acceptors. Although a
detailed analysis of high-resolution protein structures [17] shows that hydroxyl groups act more
often as donors than as acceptors, the possibility that hydroxyl groups act as acceptors has to be
taken into account. For sp3-oxygen, the data of Kroon et al. [18] indicate a preference for the
donor group to lie in the plane of the lone pairs (∠C-O..H = 109 ± 20°). However, no evidence has
been obtained for any preference of the lone-pair direction within this plane. This contrasts with
data obtained by Vedani and Dunitz [13] and by Klebe [15], who report a preferred orientation of
hydrogen-donor groups in the direction of the lone pairs. Since the experimental data are used
merely to establish the allowed hydrogen-bond patterns, hydrogen bonds not pointing in the
direction of the lone pair were also allowed for: R(O..D) = 1.9 Å, α = 90-130°, ω = -70 to 70°. For
sp2-oxygen, as found in the side chain of Tyr, there is a clear preference for the hydrogen-donor
groups to lie in the plane of the aromatic ring. The data of Vedani and Dunitz [13], Klebe [15] and
Baker and Hubbard [17] were used to derive the following rule: R(O..D) = 1.9 Å, α = 100-140° and
ω = -50 to 50°.
`As most publications on statistical analyses do not present a quantitative analysis of the data,
`there is a certain amount of ambiguity involved in the choice of the rules given above. A very re(cid:173)
`stricted definition of the allowed hydrogen-bond geometries would strongly reduce the number of
`hits obtained in the subsequent fragment fitting, and carries the risk of eventually missing some of
`the promising hits. On the other hand, a very broad definition would result in a very large number
`of hits, with the difficulty of selecting the most interesting ones. Thus, the present choice of rules
`represents a compromise.
`The generated interaction sites were finally checked for van der Waals overlap with the protein.
`
`2.2. Fragment linking
`In my previous paper I described the 'bridge' mode which allows one to connect positioned
`fragments by suitable spacers. This concept has now been generalized. LUDI is now able to fit
`fragments onto the interaction sites and simultaneously link them to an already existing ligand or
`part of a ligand. For this purpose, 'link sites', which are X-H atom pairs suitable for appending
`a substituent to the ligand, can be specified by the user. Alternatively, the program assumes that
`all hydrogen atoms of the positioned ligand within a given cut-off radius, together with the heavy
`atoms they are bound to, are link sites.
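
The automatic choice of link sites can be sketched as follows. This is a minimal Python illustration, not the LUDI implementation: the text does not say from which point the cut-off radius is measured, so a user-supplied `center` is assumed here, and the `Atom` record and the function `find_link_sites` are hypothetical names.

```python
import math
from typing import NamedTuple

class Atom(NamedTuple):
    index: int
    element: str
    xyz: tuple

def find_link_sites(atoms, bonds, center, cutoff):
    """Return (heavy_atom_index, hydrogen_index) pairs usable as link sites.

    atoms:  list of Atom records for the positioned ligand
    bonds:  list of (index, index) pairs
    center: reference point for the cut-off (an assumption of this sketch)
    cutoff: radius in angstrom
    """
    by_index = {a.index: a for a in atoms}
    sites = []
    for i, j in bonds:
        a, b = by_index[i], by_index[j]
        # Orient the bonded pair as (heavy atom X, hydrogen H).
        if a.element == "H" and b.element != "H":
            heavy, hyd = b, a
        elif b.element == "H" and a.element != "H":
            heavy, hyd = a, b
        else:
            continue  # not an X-H bond
        if math.dist(hyd.xyz, center) <= cutoff:
            sites.append((heavy.index, hyd.index))
    return sites
```

Each returned X-H pair marks a position where the hydrogen could be replaced by a bond to a newly fitted fragment.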
LUDI can perform a single link, generating a single bond between the newly fitted fragment
and the already existing ligand. Additionally, it is also possible to do a multiple link. The double
link will generate two bonds between the newly fitted fragment and the existing ligand. For example,
it is possible to fuse a second phenyl ring onto an existing one to form a naphthyl group. This
double link also includes the 'bridge' mode as described previously [9]. The options are shown in
Fig. 2.
In order to carry out the calculations in the link mode, a second library was built specifically for
this purpose. The link sites (the atoms which form a bond with the already existing ligand) are
explicitly defined for each entry in this library. Some examples are shown in Fig. 3. This library
currently consists of 1100 entries. This number is larger than the number of entries in the standard
library because, for many of the structures, there are several possible ways to form the link.
`The link mode of LUDI is similar to the approach implemented in the computer program
`GROW by Moon and Howe [7]. The purpose of GROW is to construct peptides by linking amino
`acids, whereas LUDI attempts to construct arbitrary organic molecules. GROW is based on
`force-field calculations and will therefore be considerably slower than LUDI, because LUDI is
`completely based on geometric operations.
`
`2.3. Prioritization of the fitted fragments
An important problem of every method based on searching through large numbers of
structures is the prioritization of the hits. This problem is approached as follows:
`
Fig. 2. Examples for a single, double and triple link as performed by LUDI in the link mode.

Fig. 3. Examples from the standard library and the link library of LUDI. Each possible link that will be
considered by LUDI has to be specified explicitly.

Only those fragments with a root-mean-square (rms) deviation of the fit of the fragment onto
the interaction sites below a certain threshold (typically 0.3-0.5 Å) are accepted. A further
requirement for a successfully positioned fragment is that it does not overlap with the protein. LUDI
also checks for electrostatic repulsion between protein and ligand: if a polar atom is closer to a
protein atom of the same polarity than a threshold distance (typically 3.5 Å for O..O contacts),
then the fit of the fragment is rejected. In the electrostatic repulsion check, only those protein
atoms are taken into account that do not hydrogen bond with the ligand.
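
The acceptance criteria just described can be summarized in a short Python sketch. It is a hedged illustration rather than the original code: the function names, the 'donor'/'acceptor' labels and the single `min_dist` threshold are assumptions, and the steric overlap test is left to the caller because the text does not specify how it is evaluated.

```python
import math

def rms_deviation(fragment_atoms, matched_sites):
    """Rms distance between fitted fragment atoms and their interaction sites."""
    sq = [math.dist(a, s) ** 2 for a, s in zip(fragment_atoms, matched_sites)]
    return math.sqrt(sum(sq) / len(sq))

def has_polar_repulsion(ligand_polar, protein_polar, hbonded_protein_ids,
                        min_dist=3.5):
    """True if a ligand polar atom sits too close to a like-polarity protein atom.

    ligand_polar / protein_polar: lists of (atom_id, polarity, xyz), where
    polarity is 'donor' or 'acceptor' (labels assumed for this sketch).
    Protein atoms that hydrogen bond with the ligand are skipped.
    """
    for _, lig_pol, lig_xyz in ligand_polar:
        for prot_id, prot_pol, prot_xyz in protein_polar:
            if prot_id in hbonded_protein_ids or prot_pol != lig_pol:
                continue
            if math.dist(lig_xyz, prot_xyz) < min_dist:
                return True
    return False

def accept_fit(rms, overlaps_protein, repulsion, rms_max=0.5):
    """Combine the three filters; rms_max of 0.3-0.5 A is typical per the text."""
    return rms <= rms_max and not overlaps_protein and not repulsion
```

A fragment that passes `accept_fit` would then go on to the scoring step described next.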
The number and quality of the hydrogen bonds between protein and ligand and the
hydrophobic protein-ligand contact surface were then used to calculate a score. The relative weight
of a hydrogen bond with respect to the hydrophobic interaction was derived from a value of 1.5
kcal/mol for the contribution of a hydrogen bond to the binding energy [19] and 25 cal/(mol Å²)
for the hydrophobic interaction [20]. Therefore, in the scoring function it is assumed that an
unperturbed hydrogen bond has the same contribution to ligand binding as 60 Å² of hydrophobic
contact surface. The following preliminary scoring function was used:

\[
\mathrm{Score} = \sum_{\mathrm{hbonds}} 100 \, f(\Delta R) \, f(\Delta\alpha) + \frac{5}{3} \, \mathrm{NCONTACT}
\]

\[
f(\Delta R) =
\begin{cases}
1, & \Delta R \le 0.2\ \text{\AA} \\
1 - (\Delta R - 0.2)/0.4, & 0.2\ \text{\AA} < \Delta R \le 0.6\ \text{\AA} \\
0, & \Delta R > 0.6\ \text{\AA}
\end{cases}
\]

\[
f(\Delta\alpha) =
\begin{cases}
1, & \Delta\alpha \le 30^\circ \\
1 - (\Delta\alpha - 30)/50, & 30^\circ < \Delta\alpha \le 80^\circ \\
0, & \Delta\alpha > 80^\circ
\end{cases}
\]

ΔR is the deviation of the H..O/N hydrogen-bond length from the ideal value of 1.9 Å. Δα is the
deviation of the hydrogen-bond angle ∠N/O-H..O/N from its ideal value of 180°. NCONTACT
represents the lipophilic contact area between protein and ligand in Å².
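
A direct transcription of this scoring function into Python may help to make the weighting concrete. The sketch assumes that ΔR and Δα are taken as absolute deviations, and the function names are illustrative only. Note that 1.5 kcal/mol divided by 25 cal/(mol Å²) gives 60 Å², so that 100 points per ideal hydrogen bond correspond to 100/60 = 5/3 points per Å² of contact area.

```python
def f_dist(delta_r):
    """Distance penalty f(dR); delta_r in angstrom (absolute value assumed)."""
    delta_r = abs(delta_r)
    if delta_r <= 0.2:
        return 1.0
    if delta_r <= 0.6:
        return 1.0 - (delta_r - 0.2) / 0.4
    return 0.0

def f_angle(delta_alpha):
    """Angle penalty f(d_alpha); delta_alpha in degrees (absolute value assumed)."""
    delta_alpha = abs(delta_alpha)
    if delta_alpha <= 30.0:
        return 1.0
    if delta_alpha <= 80.0:
        return 1.0 - (delta_alpha - 30.0) / 50.0
    return 0.0

def ludi_score(hbonds, contact_area, ideal_r=1.9, ideal_angle=180.0):
    """Preliminary score from hydrogen bonds and lipophilic contact area.

    hbonds:       iterable of (H..O/N length in angstrom, N/O-H..O/N angle in degrees)
    contact_area: lipophilic contact area NCONTACT in square angstrom
    """
    hb_term = sum(100.0 * f_dist(r - ideal_r) * f_angle(ideal_angle - a)
                  for r, a in hbonds)
    return hb_term + 5.0 / 3.0 * contact_area
```

For example, one ideal hydrogen bond together with 60 Å² of lipophilic contact would be evaluated as `ludi_score([(1.9, 180.0)], 60.0)`, with both terms contributing equally.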
The scoring function was tested on the fit of fragments into the specificity pocket of trypsin and
into the pteridine-binding site of dihydrofolate reductase. The fragments were taken from the
standard LUDI library, which currently consists of 800 fragments. For trypsin, the fragment with the
highest score was benzamidine. In the case of DHFR, the highest score was found for the
fragment 2,4-diamino-pteridine.
`
`3. APPLICATIONS
`
`3.1. Inhibitors of the HIV protease
As a first example, I report the application of LUDI to the design of inhibitors of the HIV
protease [21]. The 3D structure of the HIV-1 protease complexed with a peptidic inhibitor was
recently solved by Wlodawer and coworkers [22] (entry 4HVP in the Brookhaven Protein Data Bank
[23]). I used a recent publication by DeSolms et al. [24] on C-terminal variations of the HIV
protease inhibitor L-682,679 (see 1 in Fig. 4) as a starting point for my calculations with LUDI.
DeSolms et al. report binding data for 12 substituents at the P2' position and for 18 substituents at
the P3' position. The 3D structure of the L-682,679-HIV protease complex is not available. For the
calculations, I assumed that the Merck compound L-682,679 [24] binds to the HIV protease in the same
manner as the compound MVT-101 that was used in the X-ray diffraction experiment by
Wlodawer and coworkers. The validity of this assumption is supported by the further structural
analysis of an HIV protease-inhibitor complex by Erickson et al. [25], showing a binding mode very
similar to that of MVT-101 [22]. The geometry of L-682,679 in the complex with the protease was
generated as follows. First, the positions of the backbone atoms of the inhibitor were taken
directly from the X-ray structure whenever possible and the side chains were added in a reasonable
geometry. Hydrogen atoms were added using standard geometries with the molecular graphics
program INSIGHT [26]. This structure was then optimized, including a critical buried water
molecule in the active site of the HIV protease, using the force field CVFF [27]. The protein was
kept fixed during the energy minimization. The amino acids Asp, Glu, Lys and Arg of the
protease were assumed to be charged. A hundred steps of conjugate-gradient energy minimization were
carried out to remove unfavorable steric contacts between protein and ligand. The energy
minimization caused a shift of the C-terminal nitrogen in compound 2 by 0.23 Å. The corresponding
movement of the Cα atom at position P2' was 0.43 Å. Therefore, with respect to the present
calculation, the model structure of compound 2 with the protease is very close to the structure of the
MVT-101 compound.

Fig. 4. Chemical structure of the HIV-protease inhibitor L-682,679 [24] (1) and the reference compound 2
used in the present calculation. LUDI was used to search for suitable substituents R1 and R2.

The purpose of the present calculations was to assess the ability of LUDI to design
automatically analogs of L-682,679 with a modified C-terminus by comparing the results from LUDI with
the data of DeSolms et al. [24]. Structure 2 (see Fig. 4) was used as a lead and calculations were
performed with LUDI in the link mode. In this mode, LUDI attempts to append fragments to the
already positioned inhibitor 2. The results are summarized and compared with the data of
DeSolms et al. [24] in Table 2.
In a first calculation, LUDI was used to search for substituents R1 at the P2' position. LUDI
predicts two substituents: CH2CH(CH3)2 and CH(CH3)2. Both were also synthesized by DeSolms
and indeed show improved binding by factors of 55 and 500, respectively. However, LUDI failed
to retrieve the phenyl group (which shows the best binding of the compounds described by
DeSolms, with improved binding by a factor of 600) as the substituent at R1. The phenyl group was
rejected by LUDI due to overlap with the protein structure. This calculation took only 45 s on a
Silicon Graphics 4D35 workstation.
LUDI was then used to design ligands for R2. The calculation took 105 s and yielded 10
possible substituents. Binding data from the paper of DeSolms et al. [24] are available for 3 of them:
CH2CH2OH (Fig. 6), CH2-2-pyridyl and CH2-3-pyridyl (Fig. 5). In all cases, slightly improved
binding, by factors of 5, 1.5 and 1.5, respectively, was observed experimentally. It is noteworthy
that for several other suggestions of LUDI, experimental data of closely related compounds are
given by DeSolms et al. [24]. LUDI predicts both CH2CH2OH and CH2CH2CH2OH as
substituents. Experimentally, the dihydroxy substituent CH2CH(OH)CH2OH improved binding by a
factor of 18-22. LUDI predicted p-hydroxy-phenyl as a substituent. Experimental information is
available for p-amino-phenyl with a binding improvement of 11-17.
LUDI did not, however, find the methyl-benzimidazole compound, which shows the strongest
binding in the study of DeSolms et al. [24] (IC50 = 0.06 nM), although the appropriate fragment
is contained in the fragment library. It is tempting to speculate about the reason for this failure.
The benzimidazole moiety can form two hydrogen bonds with the protein. The most likely
partners in the protein to form these hydrogen bonds are the side-chain oxygen of Asp B29 and
the backbone nitrogen of Gly B48. The distance between these atoms in the crystal structure
4HVP is 9.22 Å. The sum of two hydrogen-bond lengths (2 × 2.9 Å) and the intramolecular N-N
distance (2.4 Å) in the benzimidazole moiety is, however, only 8.2 Å. This is 1 Å shorter than the
distance in the X-ray structure. Therefore, it is likely that the conformation of the side chain of
Asp B29 will change upon ligand binding to allow for two hydrogen bonds to be formed. In fact,
when the side-chain conformation of Asp B29 was changed so that the distance Asp B29 OD-Gly
B48 N was reduced to 8.0 Å, methyl-benzimidazole was retrieved by LUDI as a possible
substituent at R2.

TABLE 2
COMPARISON OF THE SUGGESTED SUBSTITUENTS R1 AND R2 FOR THE HIV-PROTEASE INHIBITOR 2
WITH THE DATA OF DESOLMS ET AL. [24]

R1             R2                               Experimentally observed binding improvement
CH(CH3)2       H                                500 (a)
CH2CH(CH3)2    H                                55 (a)
CH(CH3)2       CH2CH2OH                         5 (b)
CH(CH3)2       CH2-3-pyridyl                    1.5 (b)
CH(CH3)2       CH2-2-pyridyl                    1.5 (b)
CH(CH3)2       CH2CH2CH2OH                      na
CH(CH3)2       CH2COOC6H5                       na
CH(CH3)2       CH2C6H4-4'-OH                    na
CH(CH3)2       CH2-1-imidazolyl                 na
CH(CH3)2       CH2-2-thiazolyl                  na
CH(CH3)2       CH2-2-furanyl                    na
CH(CH3)2       CH2-1-tetrahydroisoquinoline     na

(a) As compared to R1 = H (IC50 = 500 nM).
(b) As compared to R2 = H (IC50 = 1 nM).
na = not available.
`
`3.2. Inhibitors of DHFR
The second example given is the design of new inhibitors of dihydrofolate reductase (DHFR).
The 3D structure of DHFR complexed with the anticancer drug methotrexate 3 (MTX) was
solved by Bolin et al. [28] (entry 4DFR in the Brookhaven Protein Data Bank [23]) (Fig. 7). This
structure, without water molecules, was used in the present calculations. The purpose of the
calculations was to use LUDI to design new substituents for the 2,4-diamino-pteridine moiety at
position 6 on the ring system. Therefore, only the pteridine portion 4 of MTX was used from the
X-ray structure and the substituent at position 6 was removed. Again, the hydrogen atoms were
added using the program INSIGHT [26].
The results from LUDI on the design of substituents for 2,4-diamino-pteridine in position 6,
once again run in the link mode, are summarized in Table 3 and are compared with data from a
compilation of experimental data prepared by Blaney et al. [29]. LUDI retrieved seven structures
as possible substituents R. Experimental data are available for two of them. The isobutyl
substituent leads to a strong improvement in binding. The phenylethyl group leaves the binding
unchanged. LUDI does not retrieve the substituent CH2N(CH3)C6H4-4'-CO-Glu (yielding MTX)
because the link library does not contain such complex moieties.

Fig. 5. Conformation of the CH2-3-pyridyl substituent as R2 of compound 2. The substituent is shown with
shaded atoms. LUDI suggests that the pyridine nitrogen forms a hydrogen bond with the backbone nitrogen
of Gly B48 from the protein.
`
`4. DISCUSSION AND CONCLUSIONS
`
This paper describes recent advances in a new approach to the de novo design of protein ligands
as implemented in the computer program LUDI [9]. A new set of rules to generate the interaction
sites is described. LUDI is now capable of designing new substituents for a given enzyme inhibitor
lead. A scoring function for the fitted fragments was implemented that is based on the number
and quality of the hydrogen bonds and the hydrophobic contact surface.
LUDI was successfully applied to the design of inhibitors for the enzymes HIV protease and
dihydrofolate reductase. The first application of LUDI given in the present paper is the design of a
new C-terminal substituent for an inhibitor of HIV protease. In this case, LUDI predicted two
fragments as substituents for the P2' site; both were found experimentally to yield substantially
improved binding [24]. For the P3' site, LUDI retrieved ten candidate structures. The available
experimental data show improved binding for three of them. For DHFR, LUDI predicted seven
fragments as possible substituents for the 2,4-diamino-pteridine moiety at position 6. For one of
them, the available experimental data indeed showed improved binding as compared to the
unsubstituted lead compound. These results demonstrate that LUDI is indeed able to suggest active
compounds.

Fig. 6. Conformation of the CH2CH2OH substituent at R2 of compound 2. The substituent is shown with
shaded atoms. LUDI suggests that the hydroxyl group forms a hydrogen bond with the side chain of Asp B29
from the protein.

Fig. 7. Chemical structure of methotrexate 3 and the reference compound 4 used in the present calculation.
LUDI was used to search for suitable substituents R at position 6 of the pteridine ring.

The positioning of fragments by LUDI is done by a fit onto the interaction sites. This approach
offers the advantage that only purely geometrical calculations are required, thereby avoiding the
very CPU-intensive evaluation of energy functions and their derivatives. In comparing the present
approach with the well-established method of positioning a putative ligand by force-field
calculations, one should bear in mind that the traditional force-field approach will also encounter the
multiple-minima problem. Therefore, a considerable number of force-field calculations are
required before the optimal position of the ligand can be specified unambiguously. Methods based
on force-field calculations will therefore be much slower than LUDI.

TABLE 3
COMPARISON OF SUGGESTED SUBSTITUENTS R AT POSITION 6 OF 2,4-DIAMINO-PTERIDINE WITH
EXPERIMENTAL DATA FROM THE SURVEY OF BLANEY ET AL. [29]

R                           Experimentally observed binding improvement
CH2CH(CH3)2                 > 100
CH2-1-naphthyl              na
CH2CH2C6H5                  unchanged
CH(CH3)2                    na
CH2C6H3-3',5'-(CH3)2        na
CH2C6H4-4'-CN               na
CH2C6H4-4'-OH               na

na = not available.

When comparing the accuracy of LUDI to the traditional force-field calculations, one must
consider that the error introduced by using discrete positions or vectors is roughly of the order of
the distance between the interaction sites. In the examples described in the present paper, the
point density corresponds to distances between neighboring interaction sites of about 0.3 Å. This
is roughly equal to the accuracy of the atomic positions in a high-resolution protein structure and
well within the tolerance of most of the results of today's best force-field calculations. I therefore
conclude that the use of discrete points does not introduce significant errors into the calculation.
LUDI does not distinguish between interaction sites at the optimal positions (e.g., for
carboxylic groups, those with a dihedral O-C-O..H = 0°) and those slightly shifted off these positions.
However, the position of a protein ligand is usually determined by several interactions that all occur
simultaneously. This means that in most cases the hydrogen-bond geometries will not adopt their optimal
values. The geometrical constraints involved in maximizing the number of hydrogen bonds will be
more important than the electrostatic effects in determining the hydrogen-bond geometry [30].
Therefore, the present approach appears to be justified, as only geometries are considered. A
detailed evaluation of a protein-ligand complex generated by LUDI can be made afterwards using
a force-field calculation.
A very important advantage of the geometry-based approach adopted by LUDI is the
possibility to combine the search for favorable nonbonded interactions with the search for a suitable bond
for the fragment with an already existing ligand. This offers the possibility to design new protein
ligands in a stepwise manner.
In conclusion, I have further developed a new algorithm for the de novo design of protein
ligands. The method has been applied successfully to predict improved inhibitors for two enzymes
(DHFR and HIV protease). The present results indicate that the approach may be useful for the
rational design of drugs when the 3D structure of the target protein is known.
`
`ACKNOWLEDGEMENT
`
`I would like to thank my colleague Gerhard Klebe for many helpful discussions.
`
`REFERENCES
`
1 Goodford, P.J., J. Med. Chem., 28 (1985) 849.
2 Boobbyer, D.N.A., Goodford, P.J., McWhinnie, P.M. and Wade, R.C., J. Med. Chem., 32 (1989) 1083.
3 Miranker, A. and Karplus, M., Proteins, 11 (1991) 29.
4 DesJarlais, R.L., Sheridan, R.P., Seibel, G.L., Dixon, J.S., Kuntz, I.D. and Venkataraghavan, R., J. Med. Chem., 31
(1988) 722.
5 Lawrence, M.C. and Davis, P.C., Proteins, 12 (1992) 31.
`6 Bartlett, P.A., Shea, G.T., Telfer S.J. and Waterman, S. In: Roberts, S.M. (Ed.), Molecular Recognition: Chemical
`and Biological Problems, Royal