`Biological
`Crystallography
`
`ISSN 0907-4449
`
`Bulk-solvent correction in large macromolecular
`structures
`
`short communications
`
`Bernard Rees,* Lasse Jenner and
`Marat Yusupov
`
Laboratoire de Biologie et Génomique
`Structurales, IGBMC, 1 Rue Laurent Fries,
`BP 10142, 67400 Illkirch CEDEX, France
`
`The estimation of the bulk-solvent contribution to the diffraction of a macromolecular crystal
`makes use of a solvent mask which delimits the bulk-solvent regions in the crystal. It is shown that
`the way this mask is usually defined in CNS contains a bias which can lead to absurd results in the
`case of very large structures, where the calculations can only be made on relatively coarse grids. A
modified procedure is described and applied to 70S ribosome data at 5.5 Å resolution. The B factor
`affecting the bulk solvent is also discussed. Even in this case of very high and widely variable atomic
`B factors, it seems sufficient to consider a constant and isotropic B factor for the bulk solvent. This
`is initially set to the average value of the atomic B factor, but can be refined.
`
`Correspondence e-mail: rees@igbmc.u-strasbg.fr
`
Received 8 April 2005
Accepted 21 June 2005

© 2005 International Union of Crystallography
Printed in Denmark – all rights reserved

Acta Cryst. (2005). D61, 1299–1301

1. Method
`
It is well known that sensible modelling of the bulk solvent is
very important for the refinement of a macromolecular structure
when low-resolution terms are included. The structure factor is
usually written

$$F = k\{F_{\mathrm{calc}} \exp[-B(\sin\theta/\lambda)^2] + d_{\mathrm{solv}} F_{\mathrm{solv}} \exp[-B_{\mathrm{solv}}(\sin\theta/\lambda)^2]\} \qquad (1)$$
`
`where k is the scale factor, Fcalc the structure factor calculated from
`the current atomic model, dsolv the bulk-solvent electron density
(0.34 e Å⁻³ for pure water) and Fsolv the scattering of the solvent
`mask. This mask is a step function with value 1 in the solvent regions
`of the unit cell and 0 in the molecular regions. It is sometimes more
`convenient to consider a molecular mask, which is the complement of
`the solvent mask (value 1 in the molecular regions, 0 outside). By
`Babinet’s theorem, except for H = 0, Fsolv is the opposite of the
`diffraction amplitude of the molecular mask.
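As a minimal sketch of how equation (1) combines the two contributions for a single reflection, the function below evaluates it directly. The function name, argument names and numerical values are illustrative assumptions and are not taken from CNS or from the programs described in this paper; s stands for sin θ/λ.

```python
import math

def total_structure_factor(f_calc, f_solv, s, k=1.0, b=0.0, d_solv=0.34, b_solv=0.0):
    """Evaluate equation (1) for a single reflection.

    f_calc, f_solv : complex structure factors of the atomic model and of the solvent mask
    s              : sin(theta)/lambda in 1/Å (equal to 1/(2d) for a d-spacing d)
    k, b           : overall scale factor and B factor of equation (1)
    d_solv, b_solv : bulk-solvent density (e/Å^3) and solvent B factor (Å^2)
    """
    s2 = s * s
    model_term = f_calc * math.exp(-b * s2)
    solvent_term = d_solv * f_solv * math.exp(-b_solv * s2)
    return k * (model_term + solvent_term)

# Made-up values: the mask term opposes the model term (phase difference of pi).
f_low = total_structure_factor(100 + 0j, -250 + 0j, s=0.005, b_solv=50.0)   # d = 100 Å
f_high = total_structure_factor(100 + 0j, -250 + 0j, s=0.25, b_solv=50.0)   # d = 2 Å
```

With these made-up inputs the solvent term cancels most of the model term at low resolution and is strongly damped by Bsolv at high resolution, which is the qualitative behaviour that equation (1) is meant to capture.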
At very low resolution, the macromolecule diffracts essentially as
its molecular mask weighted by the average macromolecular density.
Since this density is usually not very different from dsolv, dsolvFsolv and Fcalc
are of the same order of magnitude, with a phase difference of π. The
solvent contribution decreases faster than Fcalc with increasing resolution,
but cannot be neglected at resolutions lower than 4–5 Å.
Clearly, the correction for bulk solvent becomes particularly
important when the resolution limit of the experimental data is low,
as is usually the case for very large macromolecules or macromolecular
complexes. However, when we tried to apply the standard
CNS bulk-solvent correction (Brünger et al., 1998), with a reasonable
dsolv, to the Thermus thermophilus 70S ribosome, with a resolution
limit of 5.5 Å (Jenner et al., 2005), we noticed a large increase in the R
factor, especially in the lowest resolution bins where it should have
improved most. Attempts to optimize the parameters of (1) resulted
in an unrealistically small value of dsolv (see below).
An essential part of the solvent modelling is the definition of the
solvent mask. In CNS, this depends on two parameters: rprobe and
rshrink. The first step of the procedure consists of centring spheres of
radius rvdW + rprobe on all atomic positions of the macromolecular
model (rvdW is the van der Waals radius of the atom). This delimits the
region that the centres of spherical solvent molecules of radius rprobe
can occupy. All grid points within the spheres are labelled 0 and the
grid points outside are labelled 1 (Fig. 1). The second step defines the
regions effectively occupied by the solvent by removing a shell of
thickness rshrink from the molecular surface defined in the first step. In the
standard CNS procedure, this is performed by relabelling as 1 all the
points with initial label 0 closer than rshrink from any point with initial
`
doi:10.1107/S0907444905019591
`
`Merck Exhibit 2243, Page 1
`Mylan v. Merck, IPR2020-00040
`
`
`
`Figure 1
`Definition of the solvent mask. The figure shows the result of the first step of the
`mask definition, where all grid points within the spheres of radius rvdW + rprobe
`centred on the atoms A, B, C, . . . of the current model are labelled 0, while the
`points outside are labelled 1. In the second step, in the standard CNS procedure,
`any point 0 closer than rshrink to a point 1 has its label changed to 1. In the procedure
`proposed here, all points P at the intersection of the surface defined in the first step
`with a grid line (in direction x, y or z) are considered. All grid points within a sphere
`of radius rshrink centred on a point P are relabelled 1. rvdW is the van der Waals
`radius of the atom type and rprobe and rshrink are related to the radius of the solvent
molecules, but are considered as empirical parameters (rprobe = 1 Å, rshrink = 1.2 Å
`in this work).
`
Figure 2
R factor at low resolution for the T. thermophilus 70S ribosome complexed with
tRNAs and thrS mRNA (Jenner et al., 2005). All reflections with I > 2σ(I) in the
300–5.5 Å resolution range were used in the structure-factor calculations. Except in
case (c), the grid-step size was 1/5 of the high-resolution limit. (a) No solvent
correction (open circles). (b) Standard CNS correction, with solvent density refined
to 0.053 e Å⁻³ (black triangles). (c) Standard CNS correction, with grid-step size
1/10 of the resolution; the solvent density refined to 0.287 e Å⁻³ (open triangles). (d)
Modified procedure; the refined solvent density is 0.336 e Å⁻³ (black diamonds).
`
In equation (2), Bav,j is defined for each grid point j as the average of the B factors of
the nearest atoms.
`
`
label 1 (Jiang & Brünger, 1994). Ideally, rprobe and rshrink should be the
average radius of the solvent molecules, but in practice empirical
values are used. Jiang & Brünger (1994) proposed rprobe = 1.0 Å and
rshrink = 1.1 Å.
While this procedure has the advantage of simplicity, it contains an
inherent bias which becomes serious whenever the grid-step size is
not much smaller than rshrink. The reason for this is that the points
labelled 1 considered in the second step, even those closest to the
molecular surface, are not in general on the surface but already in the
solvent region. Obviously, if the grid step is larger than rshrink, no
shrinking at all is performed. For a resolution limit of 6 Å, this
happens even when a grid step of 1/5 of the resolution limit is chosen
(6 Å/5 = 1.2 Å, which exceeds rshrink = 1.1 Å),
while the usual practice is to choose 1/3 or 1/4.
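The bias can be illustrated with a one-dimensional analogue of the standard two-step labelling. This is a toy sketch under stated assumptions, not the CNS implementation: the "molecule" is the interval |x| < half_width, the grid is a line of points with step h, and the function name and the half-width of 10.3 Å are hypothetical.

```python
def standard_shrink_counts(half_width, h, r_shrink, n=100):
    """1D analogue of the standard two-step mask on a grid of step h.

    Returns (points labelled 0 after step 1, points still labelled 0 after step 2).
    """
    xs = [i * h for i in range(-n, n + 1)]
    # Step 1: label 0 inside the "molecule" |x| < half_width, 1 outside.
    labels = [0 if abs(x) < half_width else 1 for x in xs]
    solvent = [x for x, lab in zip(xs, labels) if lab == 1]
    # Step 2 (standard): relabel as 1 any 0 point closer than r_shrink to a 1 point.
    shrunk = [
        1 if lab == 1 or any(abs(x - y) < r_shrink for y in solvent) else 0
        for x, lab in zip(xs, labels)
    ]
    return labels.count(0), shrunk.count(0)

coarse = standard_shrink_counts(half_width=10.3, h=1.2, r_shrink=1.1)  # 6 Å data, step 1/5
fine = standard_shrink_counts(half_width=10.3, h=0.3, r_shrink=1.1)
```

On the coarse grid the two counts are identical, so no shrinking takes place, whereas on the fine grid several points on each side are relabelled. The distance is measured from the nearest solvent grid point rather than from the true surface, which is where the grid dependence comes from.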
We modified the second step of the procedure in the following way.
Instead of the grid points, all intersection points of the sphere of
radius rvdW,i + rprobe of every atom i of the model with the grid lines
are considered (point P in Fig. 1). The condition for any such point
to be on the molecular surface and not inside the molecule is that its
distance to every other atomic centre j is never less than rvdW,j + rprobe.
All grid points inside the sphere of radius rshrink centred on such a
point P are labelled 1.
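The modified second step can be sketched as follows. This is an illustrative reimplementation written for this text, not the SHRINKMASK program: it assumes an orthogonal, non-periodic grid with equal step h along x, y and z, takes each atom as a tuple (x, y, z, r) with r = rvdW + rprobe, and all function names are hypothetical.

```python
import math
from itertools import product

def first_step_mask(atoms, n, h):
    """Step 1: label grid points 0 inside any sphere of radius r, 1 outside.

    Grid points sit at (i*h, j*h, k*h) for 0 <= i, j, k < n.
    """
    mask = {}
    for idx in product(range(n), repeat=3):
        point = [i * h for i in idx]
        inside = any(math.dist(point, a[:3]) < a[3] for a in atoms)
        mask[idx] = 0 if inside else 1
    return mask

def surface_points(atoms, n, h, tol=1e-9):
    """Intersections P of every atomic sphere with the grid lines along x, y and z,
    keeping only those that do not lie inside the sphere of another atom."""
    points = []
    for i, atom in enumerate(atoms):
        centre, r = atom[:3], atom[3]
        for axis in range(3):
            u, v = [a for a in range(3) if a != axis]
            for iu, iv in product(range(n), repeat=2):
                d2 = r * r - (iu * h - centre[u]) ** 2 - (iv * h - centre[v]) ** 2
                if d2 < 0.0:
                    continue  # this grid line misses the sphere
                for sign in (-1.0, 1.0):
                    p = [0.0, 0.0, 0.0]
                    p[u], p[v] = iu * h, iv * h
                    p[axis] = centre[axis] + sign * math.sqrt(d2)
                    on_surface = all(
                        math.dist(p, other[:3]) >= other[3] - tol
                        for j, other in enumerate(atoms) if j != i
                    )
                    if on_surface:
                        points.append(tuple(p))
    return points

def shrink_mask(mask, points, h, r_shrink):
    """Step 2 (modified): relabel as 1 every grid point within r_shrink of a point P."""
    shrunk = dict(mask)
    reach = math.ceil(r_shrink / h) + 1  # index range that safely covers the sphere
    for p in points:
        base = [round(c / h) for c in p]
        for offset in product(range(-reach, reach + 1), repeat=3):
            idx = tuple(b + o for b, o in zip(base, offset))
            if idx in shrunk and math.dist([i * h for i in idx], p) < r_shrink:
                shrunk[idx] = 1
    return shrunk

# Single hypothetical atom on a grid coarser than r_shrink (h = 1.5 Å, r_shrink = 1.2 Å).
atoms = [(6.0, 6.0, 6.0, 3.3)]
mask = first_step_mask(atoms, n=9, h=1.5)
shrunk = shrink_mask(mask, surface_points(atoms, n=9, h=1.5), h=1.5, r_shrink=1.2)
```

Because the distance is measured from points P that lie exactly on the surface, grid points just inside the sphere are relabelled even though the grid step exceeds rshrink, a case in which the standard relabelling would leave the mask unchanged.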
rprobe and rshrink were optimized in a way similar to that of Jiang &
Brünger (1994) and essentially the same values were obtained:
rprobe = 1.0 Å and rshrink = 1.2 Å.
The complete procedure is as follows.
(i) Calculate a solvent mask with rshrink = 0. This can be performed
using CNS.
(ii) Apply the shrinking procedure described above (program
SHRINKMASK).
(iii) Calculate and extract Fcalc and Fsolv with the mask determined
in step (ii) (CNS).
(iv) Determine by least squares the best values of k, B, dsolv and
Bsolv of (1) (program SOLVPAR). If |B| is large, B is added
algebraically to the B factors of all atoms of the model. dsolv and Bsolv
will be used in conjunction with the mask in further CNS refinements.¹
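Step (iv) can be illustrated with a deliberately simplified fit. The sketch below is not the SOLVPAR program: as an assumption made here for brevity, B and Bsolv are held fixed, dsolv is scanned over a grid of trial values, and for each trial value the scale k is obtained in closed form by linear least squares against the observed amplitudes. All names and the synthetic reflections are hypothetical.

```python
import math

def model_amplitudes(reflections, d_solv, b, b_solv):
    """|F_calc exp(-B s^2) + d_solv F_solv exp(-B_solv s^2)| for each (s, F_calc, F_solv)."""
    return [
        abs(fc * math.exp(-b * s * s) + d_solv * fs * math.exp(-b_solv * s * s))
        for s, fc, fs in reflections
    ]

def fit_scale_and_density(reflections, f_obs, b, b_solv, d_grid):
    """Scan d_solv over d_grid; return the (k, d_solv) pair with the smallest residual."""
    best = None
    for d_solv in d_grid:
        m = model_amplitudes(reflections, d_solv, b, b_solv)
        # Least-squares scale for fixed d_solv: k = sum(Fobs * m) / sum(m^2).
        k = sum(fo * mi for fo, mi in zip(f_obs, m)) / sum(mi * mi for mi in m)
        residual = sum((fo - k * mi) ** 2 for fo, mi in zip(f_obs, m))
        if best is None or residual < best[0]:
            best = (residual, k, d_solv)
    return best[1], best[2]

# Synthetic low-resolution reflections: (sin(theta)/lambda, F_calc, F_solv).
reflections = [
    (0.01, 100 + 20j, -250 - 40j),
    (0.03, -60 + 35j, 120 - 80j),
    (0.05, 40 - 10j, -50 + 30j),
    (0.08, 25 + 25j, -10 - 12j),
]
# "Observed" amplitudes generated with k = 2 and d_solv = 0.34, so the fit should recover them.
f_obs = [2.0 * a for a in model_amplitudes(reflections, 0.34, 0.0, 100.0)]
d_grid = [i / 100 for i in range(0, 61)]
k_fit, d_fit = fit_scale_and_density(reflections, f_obs, 0.0, 100.0, d_grid)
```

A full refinement of all four parameters would need a nonlinear least-squares routine. The grid scan is used here only because it keeps the example short and makes the dependence of the residual on dsolv easy to inspect.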
`
`2. Thermal motion
`
As mentioned above, F in (1) can be considered as the diffraction of
the macromolecules minus the diffraction of the molecular masks.
Equation (1) implicitly assumes a rigid-body and isotropic thermal motion. The
diffraction amplitude at a given time t is the sum of the amplitudes
from the macromolecules of all the unit cells of the crystal, each one
in its instantaneous position at t, displaced from the equilibrium
position owing to thermal motion or local disorder. The contribution
of the molecular mask at the same position is subtracted from the
amplitude scattered by each macromolecule. As a result, Bsolv in (1)
should normally be equal to the B factor of the macromolecule. By
the same argument, if different parts of the macromolecule have
different B factors, the corresponding parts of the molecular mask
should be affected by the same B factors. Fsolv in (1) should therefore
be replaced by
`
$$F_{\mathrm{solv},B}(\mathbf{H}) = -\sum_j \exp[-B_{\mathrm{av},j}(\sin\theta/\lambda)^2]\,\exp(2\pi i\,\mathbf{H}\cdot\mathbf{r}_j)\,V \qquad (2)$$
`
`where the sum is over all grid points j within the molecular mask. rj is
`the position of a grid point and V is the associated volume element.
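A direct-summation sketch of equation (2) is given below for illustration. It assumes fractional grid coordinates, so that H·rj is dimensionless, and takes sin θ/λ for the reflection as an input. The leading minus sign follows the statement above that Fsolv is the opposite of the diffraction amplitude of the molecular mask. The function and argument names are hypothetical, and a practical calculation would use a fast Fourier transform rather than an explicit loop.

```python
import cmath
import math

def f_solv_variable_b(h_index, grid_points, b_av, volume_element, s):
    """Mask structure factor with a per-grid-point B factor, as in equation (2).

    h_index        : Miller indices (h, k, l)
    grid_points    : fractional coordinates r_j of the grid points inside the molecular mask
    b_av           : B_av,j for each grid point (Å^2)
    volume_element : volume V associated with one grid point (Å^3)
    s              : sin(theta)/lambda of the reflection (1/Å)
    """
    total = 0j
    for r, b in zip(grid_points, b_av):
        phase = 2.0 * math.pi * sum(hc * rc for hc, rc in zip(h_index, r))
        total += math.exp(-b * s * s) * cmath.exp(1j * phase)
    # Babinet: the solvent term is minus the transform of the molecular mask.
    return -total * volume_element

points = [(0.0, 0.0, 0.0), (0.25, 0.0, 0.0), (0.25, 0.25, 0.0)]
constant_b = f_solv_variable_b((1, 2, 0), points, [80.0, 80.0, 80.0], 2.0, 0.05)
zero_b = f_solv_variable_b((1, 2, 0), points, [0.0, 0.0, 0.0], 2.0, 0.05)
```

When every Bav,j takes the same value, the sum reduces to the constant-B mask term multiplied by a single exponential factor, which is the form used in equation (1).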
`
¹ It is, in principle, possible to determine dsolv and Bsolv using CNS. However,
we have been unable to use this option successfully with our ribosome data.
`
`3. Results
`
The bulk-solvent contribution has been calculated for the 70S ribosome
from T. thermophilus complexed with tRNAs and the thrS
mRNA operator. The experimental data were collected at the SLS
synchrotron source in Villigen, Switzerland (Jenner et al., 2005). All
165 895 independent reflections with I > 2σ(I) in the 300–5.5 Å
resolution range were used in the structure-factor calculations.
The grid-step size was set to 1/5 of the resolution. The R factors in
the low-resolution range are shown in Fig. 2. With the standard CNS
procedure, when dsolv is set to 0.34 e Å⁻³ and Bsolv to the average
atomic B, the agreement between the calculated structure factors and
the experimental values is much worse than without solvent correction,
with an R factor as high as 62% in the lowest-resolution shell
(300–39 Å) (not shown). When dsolv is refined as in step (iv) of the
procedure above, R values similar to those without correction are
calculated, but the solvent density is now 0.053 e Å⁻³, an unrealistically
low value, which shows that the calculation tends to eliminate
the solvent contribution.
With the modified procedure and the same grid step, the R values
at low resolution are much smaller. Furthermore, dsolv refines to
0.336 e Å⁻³, a value very close to the electron density of water.
To obtain more reasonable results with the standard procedure of
CNS, a very fine grid has to be used. Fig. 2 shows the low-resolution R
factors for a grid step of 1/10 of the resolution. However, even with
such a fine grid, the refined value of dsolv, 0.287 e Å⁻³, is still somewhat
too small and the R-factor values are consistently higher than
those of the modified procedure for resolutions lower than 15 Å. The
results with the modified procedure are much less sensitive to the grid
size.
We also calculated the solvent contribution to the diffraction
assuming a variable solvent B factor, as in (2). After the calculation of
Fsolv,B, and to be consistent with the empirical approach followed
above, the same four quantities of (1) were refined. Bsolv is now an
additional positive or negative isotropic solvent B factor. Even
though the atomic B factors of the ribosome model are very high
(average B = 280 Å²) and show large variations (r.m.s. variation
75 Å²), the results are practically identical to those obtained with a
constant isotropic B. The reason is that the difference between the
solvent contribution in the two calculations becomes significant only
at relatively high resolution, where the solvent contribution to the
diffraction is smallest. The conclusion is that a solvent correction with
a constant isotropic B is probably sufficient in most cases.
`
`4. Availability
`
The Fortran programs SHRINKMASK and SOLVPAR and a UNIX
shell script running the mask-defining procedure are available from
rees@igbmc.u-strasbg.fr.
`
`We are grateful to Clemens Schulze-Briese for his help in collecting
`the ribosome low-resolution diffraction data at the Swiss Light
`Source.
`
`References
`
Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P.,
Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read,
R. J., Rice, L. M., Simonson, T. & Warren, G. L. (1998). Acta Cryst. D54,
905–921.
Jenner, L., Romby, P., Rees, B., Schulze-Briese, C., Springer, M., Ehresmann,
C., Ehresmann, B., Moras, D., Yusupova, G. & Yusupov, M. (2005). Science,
308, 120–123.
Jiang, J.-S. & Brünger, A. T. (1994). J. Mol. Biol. 243, 100–115.
`
`
`