J-CAMD 183
`
`LUDI: rule-based automatic design of new substituents for
`enzyme inhibitor leads
`
Hans-Joachim Böhm
BASF AG, Central Research, 6700 Ludwigshafen, Germany

Received 10 July 1992
Accepted 17 August 1992

Keywords: Enzymes; Enzyme inhibitors; Molecular modeling; Drug design; De novo design
`
`SUMMARY
`
Recent advances in a new method for the de novo design of enzyme inhibitors are reported. A new set
of rules to define the possible nonbonded contacts between protein and ligand is presented. This method was
derived from published statistical analyses of nonbonded contacts in crystal packings of organic molecules
and has been implemented in the recently described computer program LUDI. Moreover, LUDI can now
append a new substituent onto an already existing ligand. Applications are reported for the design of
inhibitors of HIV protease and dihydrofolate reductase. The results demonstrate that LUDI is indeed capable of
designing new ligands with improved binding when compared to the reference compound.
`
1. INTRODUCTION
`
The de novo design of protein ligands has recently gained increased attention [1-9]. Most effort
so far has focused on the calculation of favorable binding sites [1-3] and on the docking of given
ligands into the binding pocket of a protein [4,5]. A few groups have also reported on the
automatic design of novel ligands [6-9].
Recently, I reported a new method for the de novo design of enzyme inhibitors, called LUDI
[9]. This method is based on a statistical analysis of nonbonded contacts found in the Cambridge
Structural Database (CSD) [10]. The first version of the program made direct use of the contact
patterns retrieved from the CSD and utilized them to position small molecules or fragments in a
cleft in a protein structure (e.g. an active site) in such a way that hydrogen bonds are formed with
the protein and hydrophobic pockets are filled with suitable side chains of the ligand. In the first
paper on LUDI [9] I presented a very simple set of rules to generate the positions of atoms on the
basis of fragments found suitable to form favorable interactions with the protein. However, this
first set of rules turned out to be too simplistic because it took into account only the most heavily
populated hydrogen-bond geometries. The direct use of contact geometries from the CSD carries
the danger that some potentially important contact patterns are not included because they have
0920-654X/$ 5.00 © 1992 ESCOM Science Publishers B.V.
`
`Breckenridge Exhibit 1014
`Breckenridge v. Novartis AG
`
`
`
`
not yet shown up in the crystal structures of small molecules. One should keep in mind that
despite the rather large number of structures (90 000) currently contained in the CSD (1991 version),
the number of certain nonbonded contacts relevant for ligand-protein interactions may be very
small.
I therefore decided to develop a new set of rules for nonbonded contacts on the basis of the
experimentally observed range of nonbonded contact geometries revealed by statistical analysis of
the CSD [11-18]. This new set of rules is thought to have the advantage of covering the complete
space of energetically favorable arrangements for hydrogen bonds and hydrophobic contacts. The
analysis of the CSD is used to define the range of allowed angles and dihedrals (see Fig. 1 for
definition of the angles and dihedrals) describing the nonbonded contact geometry. This space is then
populated by discrete points (or vectors) that are equally spaced. The point density can be
controlled by the user. Note that the data from the statistical analysis of the CSD are used merely to
derive the allowed range of contact geometries. The rules derived from the CSD do not take into
account the experimentally observed different populations of different contact geometries.
In addition, some other improvements to LUDI are reported concerning the positioning of
fragments, the evaluation of positioned fragments and the possible prioritization of the structures
found to fit the binding site of a protein. Another new functionality that has been added to LUDI
is the ability to link a new fragment to an already existing ligand while forming hydrogen bonds
with the protein and filling a hydrophobic pocket. This feature offers the important possibility to
design new substituents for a given lead compound.
`
Fig. 1. Definition of the geometric parameters R, α and ω used in the rules for the allowed nonbonded contacts. A:
definition for terminal groups; B: definition for -O-; C: definition for -N=. For -N= groups, α denotes the angle between the
bisector of the angle C=N-R and the vector N..H.
`
`
`
`
Finally, LUDI was used to design new inhibitors of the aspartic protease of the human
immunodeficiency virus (HIV) and dihydrofolate reductase (DHFR).
`
`2. METHODOLOGY
`
`2.1. A new set of rules to generate the potential interaction sites
Interactions between a protein and its ligand are usually formed through favorable nonbonded
contacts such as hydrogen bonds or hydrophobic interactions. These contacts may be divided into
individual interactions between single atoms or functional groups of the protein and the ligand.
Thus, for every atom or functional group of the protein that is involved in binding with the ligand,
there exists a counterpart on the ligand. This counterpart is again an atom or a functional group.
For example, the counterpart for a carbonyl group C=O of the protein may be an amino group
N-H of the ligand. A suitable position for such a functional group or atom of the ligand is referred
to as its 'interaction site'. A statistical analysis of hydrogen-bond geometries in crystal packings
of small molecules [11-18] reveals that there is a rather broad distribution of hydrogen-bond
patterns. Therefore, for every functional group of the protein there exists not only a single position
but also a region in space suitable for favorable interactions with the protein. In LUDI, this
distribution of possible contact patterns is taken into account by using an ensemble of interaction sites
distributed over the whole region of possible contact patterns. This approach has the advantage
that it is purely geometrical and therefore avoids costly calculations of potential functions.
The definition of an interaction site has been given previously [9]. LUDI distinguishes between
four different types of interaction sites:
1. hydrogen-donor,
2. hydrogen-acceptor,
3. lipophilic-aliphatic,
4. lipophilic-aromatic.
In LUDI, the hydrogen-donor and hydrogen-acceptor interaction sites are described by vectors
(atom pairs) to account for the strong directionality of hydrogen bonds. Hydrogen-donor sites are
represented by D-X vectors (R(D-X) = 1 Å) and hydrogen-acceptor sites are represented by A-Y
vectors (R(A-Y) = 1.23 Å). The particular lengths for the vectors were chosen to correspond roughly to
the N-H/O-H and C=O bond lengths, respectively. A suitable type of interaction site is selected
for each functional group or atom of the enzyme. Then a user-defined number of interaction sites
is positioned. This positioning is guided by the rules.
The rules used to generate the hydrogen-donor and hydrogen-acceptor interaction sites will
now be described. For the hydrophobic contacts the same rules are used as given in my previous
paper [9]. The position of an interaction site is described by the distance R, angle α and dihedral
ω as defined in Fig. 1. The available experimental data on nonbonded contact geometries in
crystal packings of small organic molecules are used to define the allowed values for R, α, and ω. The
region in space defined by the values is then populated by discrete interaction sites. The distance
between the interaction sites is typically 0.2-0.3 Å. The rules are summarized in Table 1.
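
The population of an allowed region with discrete sites can be illustrated with a short Python sketch. It is not the LUDI implementation. It assumes that α is the angle C=O..site at the oxygen, that ω is the dihedral ref-C=O..site about the C=O bond (ref being a hypothetical neighbor atom bonded to C), and that the region is sampled on a regular angular grid. The function names and the step sizes are illustrative choices, and a regular angular grid is only approximately equally spaced in Cartesian space.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    norm = math.sqrt(sum(x * x for x in a))
    return tuple(x / norm for x in a)

def place_site(ref, c, o, r, alpha_deg, omega_deg):
    """Point P with |O-P| = r, angle C-O-P = alpha and dihedral ref-C-O-P = omega."""
    alpha, omega = math.radians(alpha_deg), math.radians(omega_deg)
    bc = _unit(_sub(o, c))                 # unit vector along C -> O
    n = _unit(_cross(_sub(c, ref), bc))    # normal of the ref-C-O plane
    m = _cross(n, bc)                      # in-plane axis perpendicular to C -> O
    local = (-r * math.cos(alpha),
             r * math.sin(alpha) * math.cos(omega),
             r * math.sin(alpha) * math.sin(omega))
    return tuple(o[i] + local[0] * bc[i] + local[1] * m[i] + local[2] * n[i]
                 for i in range(3))

def generate_sites(ref, c, o, r, alpha_range, omega_range,
                   alpha_step=10.0, omega_step=30.0):
    """Sample the allowed (alpha, omega) region on a regular grid (assumed scheme)."""
    n_alpha = int(round((alpha_range[1] - alpha_range[0]) / alpha_step))
    n_omega = int(round((omega_range[1] - omega_range[0]) / omega_step))
    sites = {}
    for i in range(n_alpha + 1):
        alpha = alpha_range[0] + i * alpha_step
        for j in range(n_omega + 1):
            omega = omega_range[0] + j * omega_step
            p = place_site(ref, c, o, r, alpha, omega)
            # Rounded coordinates serve as keys to drop coincident points
            # (omega = 0 and 360 degrees, or any omega at alpha = 180 degrees).
            sites[tuple(round(x, 3) for x in p)] = p
    return list(sites.values())
```

For a carbonyl group the call would use r = 1.9, alpha_range = (110, 180) and omega_range = (0, 360); finer steps give a higher point density.
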
The hydrogen-bond geometry of carbonyl groups in the solid state has been investigated
extensively [11,12,15]. The available data show a distribution of α from 110° to 180° with a preference
for the lone-pair direction (α = 120°, ω = 0°, 180°). However, as this preference is not particularly
pronounced and the other regions are also significantly populated, an even distribution of
interaction sites was used, with R(O..H) = 1.9 Å, α = 110-180° and ω = 0-360°. The optimal O..D-X
hydrogen bond is assumed to be linear (∠(O..D-X) = 180°). This distribution is applied for the
backbone carbonyl groups and those in the side chains of the amino acids Asn and Gln.
The distribution of hydrogen-acceptor atoms around an N-H group falls into a smaller region in
space than that around a carbonyl group. The statistical analyses that have been published
[12,14,15] all show a strong preference for a linear hydrogen bond with ∠(N-H..O/N) = 150-180°. A
very similar distribution has also been found around the N-H group in aromatic rings [13,15]. The
available data indicate similar distributions for N-H and O-H. Therefore, identical rules for both
groups were used to generate interaction sites with R(H..A) = 1.9 Å, α = 150-180° and ω = 0-360°.
This distribution was used for the backbone N-H groups and for the hydrogen-donor groups in
the side chains of the amino acids His, Gln, Asn, Ser, Thr and Tyr. For charged amino groups, a
slightly shorter hydrogen-bond length of R(H..A) = 1.8 Å was used. This shorter hydrogen-bond
length for charged groups has also been observed experimentally [14].
A problem arises with the generation of the position of the second atom, Y, adjacent to the
hydrogen-acceptor position A. The optimal position of this second atom is difficult to obtain from
available experimental data. The position of the site Y was thus generated assuming ∠(N-H..A-Y)
= 0°, ∠(H..A-Y) = 110-180° and R(A-Y) = 1.23 Å, although the particular choice of the dihedral is
admittedly somewhat arbitrary.
`
TABLE 1
GEOMETRIC PARAMETERS DESCRIBING THE ALLOWED RANGE OF NONBONDED CONTACT
GEOMETRIES USED IN LUDI

Enzyme functional group   Interaction site   Geometric parameters                                      Reference
C=O                       D-X                R(O..H) = 1.9 Å, α = 110-180°, ω = 0-360°                 11,12,15
N-H, O-H                  A-Y                R(H..A) = 1.9 Å, α = 150-180°, ω = 0-360°                 12,14,15
N-H (charged)             A-Y                R(H..A) = 1.8 Å, α = 150-180°, ω = 0-360°                 12,14,15
COO-                      D-X                R(O..H) = 1.8 Å, α = 100-140°, ω = -50 to 50°, 130-230°   16
=N-                       D-X                R(N..H) = 1.9 Å, α = 150-180°, ω = 0-360°                 13,15
R-O-R (sp2)               D-X                R(O..H) = 1.9 Å, α = 100-140°, ω = -60 to 60°             11,15
R-O-R (sp3)               D-X                R(O..H) = 1.9 Å, α = 90-130°, ω = -70 to 70°              12,15,18
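
As an illustration, the rules of Table 1 can be held in a small lookup structure and used to test whether a given contact geometry falls inside the allowed range. The sketch below is an assumption about how such a table might be encoded, not the data structure used in LUDI. The names ContactRule, RULES and is_allowed are hypothetical, and the numerical values are transcribed from Table 1 as far as it can be read.

```python
from typing import List, NamedTuple, Tuple

class ContactRule(NamedTuple):
    site: str                           # interaction-site type, "D-X" or "A-Y"
    distance: float                     # hydrogen-bond length R in Å
    alpha: Tuple[float, float]          # allowed range of the angle α in degrees
    omega: List[Tuple[float, float]]    # allowed range(s) of the dihedral ω in degrees

# Values transcribed from Table 1 (keys are the enzyme functional groups).
RULES = {
    "C=O":           ContactRule("D-X", 1.9, (110.0, 180.0), [(0.0, 360.0)]),
    "N-H,O-H":       ContactRule("A-Y", 1.9, (150.0, 180.0), [(0.0, 360.0)]),
    "N-H (charged)": ContactRule("A-Y", 1.8, (150.0, 180.0), [(0.0, 360.0)]),
    "COO-":          ContactRule("D-X", 1.8, (100.0, 140.0), [(-50.0, 50.0), (130.0, 230.0)]),
    "=N-":           ContactRule("D-X", 1.9, (150.0, 180.0), [(0.0, 360.0)]),
    "R-O-R (sp2)":   ContactRule("D-X", 1.9, (100.0, 140.0), [(-60.0, 60.0)]),
    "R-O-R (sp3)":   ContactRule("D-X", 1.9, (90.0, 130.0), [(-70.0, 70.0)]),
}

def is_allowed(group: str, alpha: float, omega: float) -> bool:
    """True if (alpha, omega) lies inside the allowed region for this group."""
    rule = RULES[group]
    if not rule.alpha[0] <= alpha <= rule.alpha[1]:
        return False
    # Dihedrals are periodic, so the value is also tried shifted by one full turn.
    return any(lo <= omega + shift <= hi
               for lo, hi in rule.omega
               for shift in (-360.0, 0.0, 360.0))
```

For example, is_allowed("COO-", 120.0, 180.0) holds, whereas a dihedral of 90° falls between the two allowed ω ranges of the carboxylate rule.
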
`
`
`
`
The hydrogen-bond contact patterns around carboxylic acids have been studied by Görbitz and
Etter [16]. The data indicate a preference for ∠(C=O..H) = 120° and ∠(O-C-O..H) = 0°, 180°. These
authors found no indication that syn hydrogen bonds are inherently more favorable than anti
hydrogen bonds. Their data were translated into the following rules to generate the interaction sites
around a carboxylic acid: R(O..H) = 1.8 Å, α = 100-140°, ω = -50 to 50°, 130-230°.
The distribution of hydrogen donors around an unprotonated nitrogen in aromatic rings has
been investigated by Vedani and Dunitz [13]. The distribution of hydrogen donors is narrower
than that around a carbonyl group. The following rule (which applies to the unprotonated
nitrogen in the side chain of His) is derived from the results of Vedani and Dunitz: R(N..H) = 1.9 Å,
α = 150-180°, ω = 0-360°.
Hydroxyl groups can act both as hydrogen donors and as hydrogen acceptors. Although a
detailed analysis of high-resolution protein structures [17] shows that hydroxyl groups act more
often as donors than as acceptors, the possibility that hydroxyl groups act as acceptors has to be
taken into account. For sp3-oxygen, the data of Kroon et al. [18] indicate a preference for the
donor group to lie in the plane of the lone pairs (∠(C-O..H) = 109 ± 20°). However, no evidence has
been obtained for any preference of the lone-pair direction within this plane. This contrasts with
data obtained by Vedani and Dunitz [13] and by Klebe [15], who report a preferred orientation of
hydrogen-donor groups in the direction of the lone pairs. Since the experimental data are used
merely to establish the allowed hydrogen-bond patterns, hydrogen bonds not pointing in the
direction of the lone pair were also allowed for: R(O..H) = 1.9 Å, α = 90-130°, ω = -70 to 70°. For
sp2-oxygen, as found in the side chain of Tyr, there is a clear preference for the hydrogen-donor
groups to lie in the plane of the aromatic ring. The data of Vedani and Dunitz [13], Klebe [15] and
Baker and Hubbard [17] were used to derive the following rule: R(O..H) = 1.9 Å, α = 100-140° and
ω = -50 to 50°.
As most publications on statistical analyses do not present a quantitative analysis of the data,
there is a certain amount of ambiguity involved in the choice of the rules given above. A very
restricted definition of the allowed hydrogen-bond geometries would strongly reduce the number of
hits obtained in the subsequent fragment fitting, and carries the risk of eventually missing some of
the promising hits. On the other hand, a very broad definition would result in a very large number
of hits, with the difficulty of selecting the most interesting ones. Thus, the present choice of rules
represents a compromise.
The generated interaction sites were finally checked for van der Waals overlap with the protein.
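
A final overlap check of this kind could look like the following Python sketch. The text does not state the distance criterion, so the single cutoff min_distance = 3.0 Å is purely an assumed placeholder, and the function name and the partner_index argument (the protein atom that generated the site, which is necessarily closer than the cutoff) are hypothetical.

```python
import math

def passes_overlap_check(site, protein_atoms, partner_index=None, min_distance=3.0):
    """Return False if the site is closer than min_distance (Å, assumed value)
    to any protein atom other than its own hydrogen-bond partner."""
    for i, atom in enumerate(protein_atoms):
        if i == partner_index:
            continue  # the generating atom is expected to be at hydrogen-bond distance
        if math.dist(site, atom) < min_distance:
            return False
    return True
```

A production version would more plausibly use element-specific van der Waals radii instead of one global cutoff.
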
`
2.2. Fragment linking
In my previous paper I described the 'bridge' mode which allows one to connect positioned
fragments by suitable spacers. This concept has now been generalized. LUDI is now able to fit
fragments onto the interaction sites and simultaneously link them to an already existing ligand or
part of a ligand. For this purpose, 'link sites', which are X-H atom pairs suitable for appending
a substituent to the ligand, can be specified by the user. Alternatively, the program assumes that
all hydrogen atoms of the positioned ligand within a given cut-off radius, together with the heavy
atoms they are bound to, are link sites.
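
The default selection of link sites can be sketched as follows. This is an illustration only: the text does not say what the cut-off radius is measured from, so the sketch assumes a user-supplied center point, and the atom and bond representations as well as the function name are hypothetical.

```python
import math

def default_link_sites(atoms, bonds, center, cutoff):
    """Return (heavy_atom_index, hydrogen_index) pairs for every hydrogen that
    lies within `cutoff` Å of `center` (assumed reference point).

    atoms: list of (element, (x, y, z)); bonds: list of (i, j) index pairs.
    """
    sites = []
    for i, j in bonds:
        # Look at the bond in both directions so the atom order does not matter.
        for h, x in ((i, j), (j, i)):
            if (atoms[h][0] == "H" and atoms[x][0] != "H"
                    and math.dist(atoms[h][1], center) <= cutoff):
                sites.append((x, h))
    return sites
```

Each returned X-H pair marks a position where the hydrogen could be replaced by a bond to a newly fitted fragment.
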
LUDI can perform a single link, generating a single bond between the newly fitted fragment
and the already existing ligand. Additionally, it is also possible to do a multiple link. The double
link will generate two bonds between the newly fitted fragment and the existing ligand. For
example, it is possible to fuse a second phenyl ring onto an existing one to form a naphthyl group. This
double link also includes the 'bridge mode' as described previously [9]. The options are shown in
Fig. 2.
In order to carry out the calculations in the link mode, a second library was built specifically for
this purpose. The link sites (the atoms which form a bond with the already existing ligand) are
explicitly defined for each entry in this library. Some examples are shown in Fig. 3. This library
currently consists of 1100 entries. This number is larger than the number of entries in the standard
library because, for many of the structures, there are several possible ways to form the link.
The link mode of LUDI is similar to the approach implemented in the computer program
GROW by Moon and Howe [7]. The purpose of GROW is to construct peptides by linking amino
acids, whereas LUDI attempts to construct arbitrary organic molecules. GROW is based on
force-field calculations and will therefore be considerably slower than LUDI, because LUDI is
completely based on geometric operations.
`
2.3. Prioritization of the fitted fragments
An important problem of every method based on searching through large numbers of
structures is the prioritization of the hits. This problem is approached as follows:
`
Fig. 2. Examples of a single, double and triple link as performed by LUDI in the link mode.
`
`
`
Only those fragments with a root-mean-square (rms) deviation of the fit of the fragment onto
the interaction sites below a certain threshold (typically 0.3-0.5 Å) are accepted. A further
requirement for a successfully positioned fragment is that it does not overlap with the protein. LUDI
also checks for electrostatic repulsion between protein and ligand: if a polar atom is closer to a
protein atom of the same polarity than a threshold distance (typically 3.5 Å for O..O contacts),
`
Fig. 3. Examples from the link library of LUDI. Each possible link that will be considered by LUDI has to be specified
explicitly. [Panels: standard library; link library.]
`
`
`
`
then the fit of the fragment is rejected. In the electrostatic repulsion check, only those protein
atoms are taken into account that do not hydrogen bond with the ligand.
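
The acceptance criteria described above can be summarized in a short Python sketch. It is a simplified illustration under stated assumptions rather than the LUDI code: the fragment is taken to be already superimposed on the interaction sites, polarity is reduced to a label such as "acceptor" or "donor", and all function and parameter names are hypothetical. Only the thresholds (0.3-0.5 Å for the rms deviation, 3.5 Å for like-polarity contacts) come from the text; the van der Waals overlap check is omitted here.

```python
import math

def rms_deviation(fragment_atoms, sites):
    """Root-mean-square distance between paired fragment atoms and interaction sites."""
    squared = [math.dist(a, s) ** 2 for a, s in zip(fragment_atoms, sites)]
    return math.sqrt(sum(squared) / len(squared))

def has_polar_repulsion(ligand_polar, protein_polar, hbonded, threshold=3.5):
    """True if a ligand polar atom lies within `threshold` Å of a protein atom of the
    same polarity. Protein atoms whose index is in `hbonded` (those that hydrogen
    bond with the ligand) are skipped, as described in the text."""
    for kind, pos in ligand_polar:
        for idx, (protein_kind, protein_pos) in enumerate(protein_polar):
            if idx in hbonded or protein_kind != kind:
                continue
            if math.dist(pos, protein_pos) < threshold:
                return True
    return False

def accept_fit(fragment_atoms, sites, ligand_polar, protein_polar, hbonded,
               rms_max=0.5, threshold=3.5):
    """Combine the rms criterion with the electrostatic repulsion check."""
    if rms_deviation(fragment_atoms, sites) > rms_max:
        return False
    return not has_polar_repulsion(ligand_polar, protein_polar, hbonded, threshold)
```
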
The number and quality of the hydrogen bonds between protein and ligand and the
hydrophobic protein-ligand contact surface were then used to calculate a score. The relative weight
of a hydrogen bond with respect to the hydrophobic interaction was derived from a value of 1.5
kcal/mol for the contribution of a hydrogen bond to the binding energy [19] and 25 cal/(mol Å²)
for the hydrophobic interaction [20]. Therefore, in the scoring function it is assumed that an
unperturbed hydrogen bond has the same contribution to ligand binding as 60 Å² of hydrophobic
contact surface. The following preliminary scoring function was used:
`
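$$\mathrm{Score} = \sum_{\text{h-bonds}} 100 \cdot f(\Delta R) \cdot f(\Delta\alpha) + \frac{5}{3} \cdot N_{\mathrm{CONTACT}}$$

$$f(\Delta R) = \begin{cases} 1, & \Delta R \le 0.2\ \text{\AA} \\ 1 - (\Delta R - 0.2)/0.4, & 0.2\ \text{\AA} < \Delta R \le 0.6\ \text{\AA} \\ 0, & \Delta R > 0.6\ \text{\AA} \end{cases}$$

$$f(\Delta\alpha) = \begin{cases} 1, & \Delta\alpha \le 30^\circ \\ 1 - (\Delta\alpha - 30)/50, & 30^\circ < \Delta\alpha \le 80^\circ \\ 0, & \Delta\alpha > 80^\circ \end{cases}$$

$\Delta R$ is the deviation of the H..O/N hydrogen-bond length from the ideal value 1.9 Å. $\Delta\alpha$ is the
deviation of the hydrogen-bond angle ∠(N/O-H..O/N) from its ideal value 180°. $N_{\mathrm{CONTACT}}$
represents the lipophilic contact area between protein and ligand in Å².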
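
The weights in the scoring function follow from the two energy values quoted above: 1500 cal/mol divided by 25 cal/(mol Å²) gives 60 Å² per hydrogen bond, and an ideal hydrogen bond scored as 100 therefore corresponds to 100/60 = 5/3 points per Å² of contact surface. A minimal Python sketch of the scoring function is given below. The function names and the input format are hypothetical, and taking the absolute value of each deviation is an assumption, since the text speaks only of 'the deviation'.

```python
def f_delta_r(delta_r):
    """Distance term: 1 up to 0.2 Å deviation, falling linearly to 0 at 0.6 Å."""
    if delta_r <= 0.2:
        return 1.0
    if delta_r <= 0.6:
        return 1.0 - (delta_r - 0.2) / 0.4
    return 0.0

def f_delta_alpha(delta_alpha):
    """Angle term: 1 up to 30 degrees deviation, falling linearly to 0 at 80 degrees."""
    if delta_alpha <= 30.0:
        return 1.0
    if delta_alpha <= 80.0:
        return 1.0 - (delta_alpha - 30.0) / 50.0
    return 0.0

def ludi_score(hbonds, contact_area, ideal_length=1.9, ideal_angle=180.0):
    """Score from a list of hydrogen bonds and the lipophilic contact area in Å².

    hbonds: iterable of (H..acceptor distance in Å, donor-H..acceptor angle in degrees).
    """
    hbond_term = sum(
        100.0 * f_delta_r(abs(length - ideal_length)) * f_delta_alpha(abs(ideal_angle - angle))
        for length, angle in hbonds
    )
    return hbond_term + 5.0 / 3.0 * contact_area
```

With these definitions, one ideal hydrogen bond (1.9 Å, 180°) and 60 Å² of lipophilic contact surface each contribute 100 to the score.
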
The scoring function was tested on the fit of fragments into the specificity pocket of trypsin and
into the pteridine-binding site of dihydrofolate reductase. The fragments were taken from the
standard LUDI library consisting of currently 800 fragments. For trypsin, the fragment with the
highest score was benzamidine. In the case of DHFR, the highest score was found for the
fragment 2,4-diamino-pteridine.
`
`3. APPLICATIONS
`
`3.1. Inhibitors of the HIV protease
As a first example, I report the application of LUDI to the design of inhibitors of the HIV
protease [21]. The 3D structure of the HIV-1 protease complexed with a peptidic inhibitor was
recently solved by Wlodawer and coworkers [22] (entry 4HVP in the Brookhaven protein databank
[23]). I used a recent publication by DeSolms et al. [24] on C-terminal variations of the HIV
protease inhibitor L-682,679 (see 1 in Fig. 4) as a starting point for my calculations with LUDI.
DeSolms et al. report binding data for 12 substituents at the P2' position and for 18 substituents at
the P3' position. The 3D structure of the L-682,679-HIV protease complex is not available. For the
calculations, I assumed that the Merck compound L-682,679 [24] binds to the HIV protease in the same
manner as the compound MVT-101 that was used in the X-ray diffraction experiment by
Wlodawer and coworkers. The validity of this assumption is supported by the further structural
analysis of an HIV protease-inhibitor complex by Erickson et al. [25], showing a binding mode very
similar to that of MVT-101 [22]. The geometry of L-682,679 in the complex with the protease was
generated as follows. First, the positions of the backbone atoms of the inhibitor were taken
directly from the X-ray structure whenever possible and the side chains were added in a reasonable
geometry. Hydrogen atoms were added using standard geometries with the molecular graphics
program INSIGHT [26]. This structure was then optimized, including a critical buried water
`
`
`
`
Fig. 4. Chemical structure of the HIV-protease inhibitor L-682,679 [24] 1 and the reference compound 2 used in the
present calculation. LUDI was used to search for suitable substituents R1 and R2.
`
molecule in the active site of the HIV protease, using the force-field CVFF [27]. The protein was
kept fixed during the energy minimization. The amino acids Asp, Glu, Lys and Arg of the
protease were assumed to be charged. A hundred steps of conjugate gradients energy minimization were
carried out to remove unfavorable steric contacts between protein and ligand. The energy
minimization caused a shift of the C-terminal nitrogen in compound 2 by 0.23 Å. The corresponding
movement of the Cα atom at position P2' was 0.43 Å. Therefore, with respect to the present
calculation, the model structure of compound 2 with the protease is very close to the structure of the
MVT-101 compound.
The purpose of the present calculations was to assess the ability of LUDI to design
automatically analogs of L-682,679 with a modified C-terminus by comparing the results from LUDI with
the data of DeSolms et al. [24]. Structure 2 (see Fig. 4) was used as a lead and calculations were
performed with LUDI in the link mode. In this mode, LUDI attempts to append fragments to the
already positioned inhibitor 2. The results are summarized and compared with the data of
DeSolms et al. [24] in Table 2.
In a first calculation, LUDI was used to search for substituents R1 at the P2' position. LUDI
predicts two substituents: CH2CH(CH3)2 and CH(CH3)2. Both were also synthesized by DeSolms
et al. and indeed show improved binding by factors of 55 and 500, respectively. However, LUDI failed
to retrieve the phenyl group [which shows the best binding of the compounds described by
DeSolms et al. (improved binding by a factor of 600)] as the substituent at R1. The phenyl group was
rejected by LUDI due to overlap with the protein structure. This calculation took only 45 s on a
Silicon Graphics 4D35 workstation.
LUDI was then used to design ligands for R2. The calculation took 105 s and yielded 10
possible substituents. Binding data from the paper of DeSolms et al. [24] are available for 3 of them:
CH2CH2OH (Fig. 6), CH2-2-pyridyl and CH2-3-pyridyl (Fig. 5). In all cases, slightly improved
binding, by factors of 5, 1.5 and 1.5, respectively, was observed experimentally. It is noteworthy
that for several other suggestions of LUDI, experimental data of closely related compounds are
given by DeSolms et al. [24]. LUDI predicts both CH2CH2OH and CH2CH2CH2OH as
substituents. Experimentally, the dihydroxy substituent CH2CH(OH)CH2OH improved binding by a
factor of 18-22. LUDI predicted p-hydroxy-phenyl as a substituent. Experimental information is
available for p-amino-phenyl with a binding improvement of 11-17.
LUDI did not, however, find the methyl-benzimidazole compound, which shows the strongest
`
`
`
TABLE 2
COMPARISON OF THE SUGGESTED SUBSTITUENTS R1 AND R2 FOR THE HIV-PROTEASE INHIBITOR 2
WITH THE DATA OF DESOLMS ET AL. [24]

R1             R2                              Experimentally observed binding improvement
CH(CH3)2       H                               500 (a)
CH2CH(CH3)2    H                               55 (a)
CH(CH3)2       CH2CH2OH                        5 (b)
CH(CH3)2       CH2-3-pyridyl                   1.5 (b)
CH(CH3)2       CH2-2-pyridyl                   1.5 (b)
CH(CH3)2       CH2CH2CH2OH                     na
CH(CH3)2       CH2COOC6H5                      na
CH(CH3)2       CH2C6H4-4'-OH                   na
CH(CH3)2       CH2-1-imidazolyl                na
CH(CH3)2       CH2-2-thiazolyl                 na
CH(CH3)2       CH2-2-furanyl                   na
CH(CH3)2       CH2-1-tetrahydroisochinolin     na

(a) As compared to R1 = H (IC50 = 500 nM).
(b) As compared to R2 = H (IC50 = 1 nM).
na = not available.
`
binding in the study of DeSolms et al. [24] (IC50 = 0.06 nM), although the appropriate fragment
is contained in the fragment library. It is tempting to speculate about the reason for this failure.
The benzimidazole moiety can form two hydrogen bonds with the protein. The most likely
partners in the protein to form these hydrogen bonds are the side-chain oxygen of Asp B29 and
the backbone nitrogen of Gly B48. The distance between these atoms in the crystal structure
4HVP is 9.22 Å. The sum of two hydrogen-bond lengths (2 × 2.9 Å) and the intramolecular N-N
distance (2.4 Å) in the benzimidazole moiety is, however, only 8.2 Å. This is 1 Å shorter than the
distance in the X-ray structure. Therefore, it is likely that the conformation of the side chain of
Asp B29 will change upon ligand binding to allow for two hydrogen bonds to be formed. In fact,
when the side-chain conformation of Asp B29 was changed so that the distance Asp B29 OD-Gly
B48 N was reduced to 8.0 Å, methyl-benzimidazole was retrieved by LUDI as a possible
substituent at R2.
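
The distance argument above amounts to a few lines of arithmetic, reproduced here as a Python sketch. The distances are those quoted in the text. The helper can_bridge and its tolerance of 0.5 Å are hypothetical and serve only to illustrate the comparison; they are not LUDI's acceptance criterion.

```python
HBOND_LENGTH = 2.9        # Å, heavy-atom hydrogen-bond length used in the text
N_N_DISTANCE = 2.4        # Å, intramolecular N-N distance in benzimidazole
CRYSTAL_DISTANCE = 9.22   # Å, Asp B29 OD to Gly B48 N in entry 4HVP
MODIFIED_DISTANCE = 8.0   # Å, after changing the Asp B29 side-chain conformation

# Largest separation of two protein atoms that benzimidazole can bridge.
span = 2 * HBOND_LENGTH + N_N_DISTANCE          # 8.2 Å
shortfall = CRYSTAL_DISTANCE - span             # about 1 Å

def can_bridge(partner_distance, tolerance=0.5):
    """Illustrative test: is the partner distance within `tolerance` Å of the span?"""
    return abs(partner_distance - span) <= tolerance
```

Under this illustrative tolerance, the crystal-structure distance of 9.22 Å cannot be bridged, whereas the modified distance of 8.0 Å can.
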
`
`3.2. Inhibitors of DHFR
The second example given is the design of new inhibitors of dihydrofolate reductase (DHFR).
The 3D structure of DHFR complexed with the anticancer drug methotrexate 3 (MTX) was
solved by Bolin et al. [28] (entry 4DFR in the Brookhaven protein databank [23]) (Fig. 7). This
structure, without water molecules, was used in the present calculations. The purpose of the
calculations was to use LUDI to design new substituents for the 2,4-diamino-pteridine moiety at
position 6 on the ring system. Therefore, only the pteridine portion 4 of MTX was used from the
X-ray structure and the substituent at position 6 was removed. Again, the hydrogen atoms were
added using the program INSIGHT [26].
The results from LUDI on the design of substituents for 2,4-diamino-pteridine in position 6,
`
`
`
Fig. 5. Conformation of the CH2-3-pyridyl substituent at R2 of compound 2. The substituent is shown with shaded atoms.
LUDI suggests that the pyridine nitrogen forms a hydrogen bond with the backbone nitrogen of Gly B48 from the protein.
`
once again run in the link mode, are summarized in Table 3 and are compared with data from a
compilation of experimental data prepared by Blaney et al. [29]. LUDI retrieved seven structures
as possible substituents of R. Experimental data are available for two of them. The isobutyl
substituent leads to a strong improvement in binding. The phenylethyl group leaves the binding
unchanged. LUDI does not retrieve the substituent CH2N(CH3)C6H4-4'-CO-Glu (yielding MTX)
because the link library does not contain such complex moieties.
`
`4. DISCUSSION AND CONCLUSIONS
`
This paper describes recent advances in a new approach to the de novo design of protein ligands
as implemented in the computer program LUDI [9]. A new set of rules to generate the interaction
Fig. 6. Conformation of the CH2CH2OH substituent at R2 of compound 2. The substituent is shown with shaded atoms.
LUDI suggests that the hydroxyl group forms a hydrogen bond with the side chain of Asp B29 from the protein.
`
`
`
`
Fig. 7. Chemical structure of methotrexate 3 and the reference compound 4 used in the present calculation. LUDI was
used to search for suitable substituents R at position 6 of the pteridine ring.
`
sites is described. LUDI is now capable of designing new substituents for a given enzyme inhibitor
lead. A scoring function for the fitted fragments was implemented that is based on the number
and quality of the hydrogen bonds and the hydrophobic contact surface.
LUDI was successfully applied to the design of inhibitors for the enzymes HIV protease and
dihydrofolate reductase. The first application of LUDI given in the present paper is the design of a
new C-terminal substituent for an inhibitor of HIV protease. In this case, LUDI predicted two
fragments as substituents for the P2' site; both were found experimentally to yield substantially
improved binding [24]. For the P3' site, LUDI retrieved ten candidate structures. The available
experimental data show improved binding for three of them. For DHFR, LUDI predicted seven
fragments as possible substituents for the 2,4-diamino-pteridine moiety at position 6. For one of
them, the available experimental data indeed showed improved binding as compared to the
unsubstituted lead compound. These results demonstrate that LUDI is indeed able to suggest active
compounds.
The positioning of fragments by LUDI is done by a fit onto the interaction sites. This approach
offers the advantage that only purely geometrical calculations are required, thereby avoiding the
very CPU-intensive evaluation of energy functions and their derivatives. In comparing the present
TABLE 3
COMPARISON OF SUGGESTED SUBSTITUENTS R AT POSITION 6 OF 2,4-DIAMINO-PTERIDINE WITH
EXPERIMENTAL DATA FROM THE SURVEY OF BLANEY ET AL. [29]

R                            Experimentally observed binding improvement
CH2CH(CH3)2                  > 100
CH2-1-naphthyl               na
CH2CH2C6H5                   1 (binding unchanged)
CH(CH3)2                     na
CH2C6H3-3',5'-(CH3)2         na
CH2C6H4-4'-CN                na
CH2C6H4-4'-OH                na

na = not available.
`
`
approach with the well-established method of positioning a putative ligand by force-field
calculations, one should bear in mind that the traditional force-field approach will also encounter the
multiple minima problem. Therefore, a considerable number of force-field calculations are
required before the optimal position of the ligand can be specified unambiguously. Methods based
on force-field calculations will therefore be much slower than LUDI.
When comparing the accuracy of LUDI to the traditional force-field calculations, one must
consider that the error introduced by using discrete positions or vectors is roughly of the order of
the distance between the interaction sites. In the examples described in the present paper, the
point density corresponds to distances between neighboring interaction sites of about 0.3 Å. This
is roughly equal to the accuracy of the atomic positions in a high-resolution protein structure and
well within the tolerance of most of the results of today's best force-field calculations. I therefore
conclude that the use of discrete points does not introduce significant errors into the calculation.
LUDI does not distinguish between interaction sites at the optimal positions, e.g., for
carboxylic groups those with ∠(O-C-O..H) = 0°, and those slightly shifted off these positions. However, the
position of a protein ligand is usually determined by several interactions that all occur
simultaneously. This means that in most cases the hydrogen-bond geometries will not adopt their optimal
values. The geometrical constraints involved in maximizing the number of hydrogen bonds will be
more important than the electrostatic effects in determining the hydrogen-bond geometry [30].
Therefore, the present approach appears to be justified, as only geometries are considered. A
detailed evaluation of a protein-ligand complex generated by LUDI can be made afterwards using
a force-field calculation.
A very important advantage of the geometry-based approach adopted by LUDI is the
possibility to combine the search for favorable nonbonded interactions with the search for a suitable bond
for the fragment with an already existing ligand. This offers the possibility to design new protein
ligands in a stepwise manner.
In conclusion, I have further developed a new algorithm for the de novo design of protein
ligands. The method has been applied successfully to predict improved inhibitors for two enzymes
(DHFR and HIV protease). The present results indicate that the approach may be useful for the
rational design of drugs when the 3D structure of the target protein is known.
`
`ACKNOWLEDGEMENT
`
I would like to thank my colleague Gerhard Klebe for many helpful discussions.
`
`REFERENCES
`
1 Goodford, P.J., J. Med. Chem.,