(b) Human carbonmonoxy haemoglobin data at 2.7 Å resolution
Crystals of human carbonmonoxy haemoglobin were grown by N. L. Anderson (unpublished work), using a method based on that described by Perutz (1968). Each crystallisation tube was bubbled full of carbon monoxide before sealing. Glass tubes with greased, ground-glass stoppers were used to prevent the escape of carbon monoxide or the introduction of air. Large crystals grew in 4 to 6 weeks. They were re-equilibrated with carbon monoxide in 3.5 M-phosphate buffer at pH 6.8 for 18 h before being mounted in quartz capillaries between plugs of cotton. These plugs were necessary to prevent crystal slippage, but as they touched only the sharp ends of the tetragonal-bipyramidal crystals they did not interfere with the diffracted radiation. Each capillary was flushed with carbon monoxide just before being sealed with wax.
The X-ray photographs were taken by Anderson using an Arndt-Wonacott rotation camera. The use of such a camera and the methods of measuring and processing data recorded on rotation photographs are described fully by various authors in the book edited by Arndt & Wonacott (1977). A monochromated X-ray beam was used and, because the separation of the diffraction spots along the c*-axis is small, a collimator of 0.2 mm diam. was needed. A crystal-to-film distance of 90 mm allowed 2.7 Å resolution data to be collected. The c-axis of the crystal was parallel to the rotation axis of the camera, and a total rotation of 45° allowed measurement of all the independent reflections, except for those in the cusp along the c-axis. Fifteen exposures, each for a 3° rotation, were taken, with 3 films in each pack. The exposure time for each rotation step was 6 h and all the data finally used were collected from one crystal. The cusp data were not collected, so about 2% of the total data to 2.7 Å resolution (Arndt & Wonacott, 1977) were not included in the data set.
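As a rough check on the geometry quoted above, the following sketch applies Bragg's law to estimate how far from the direct-beam position a 2.7 Å reflection falls on a film 90 mm from the crystal, and counts the exposures needed to cover 45° in 3° steps. The wavelength (Cu Kα, 1.5418 Å) and the flat-film geometry are assumptions made for illustration; neither is stated in the text, and the function names are hypothetical.

```python
import math

def spot_radius_mm(d_spacing, distance_mm, wavelength=1.5418):
    """Distance of a reflection from the direct-beam position on a flat film.

    wavelength defaults to Cu K-alpha in Angstroms (assumed, not from the text).
    """
    theta = math.asin(wavelength / (2.0 * d_spacing))  # Bragg angle in radians
    return distance_mm * math.tan(2.0 * theta)         # flat film normal to beam

def exposure_plan(total_deg, step_deg, hours_per_step):
    """Number of rotation exposures and the total exposure time in hours."""
    n_steps = round(total_deg / step_deg)
    return n_steps, n_steps * hours_per_step

radius = spot_radius_mm(2.7, 90.0)
n_exposures, total_hours = exposure_plan(45.0, 3.0, 6.0)
print(f"2.7 A spots lie about {radius:.1f} mm from the beam centre")
print(f"{n_exposures} exposures, {total_hours:.0f} h in total")
```

Under these assumptions the outermost reflections lie a little under 60 mm from the beam centre, and the fifteen 6 h exposures amount to 90 h on the one crystal.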
The intensities of the fully recorded and the partially recorded spots were measured on a flat-bed scanner (Mallett et al., 1977). The intensities from each film were corrected for Lorentz, polarisation and absorption factors and then scaled together by means of symmetry-related reflections recorded on different films. The symmetry R-factor computed for the data set is given with other statistics in Table 1. Reflections that were split between 2 contiguous rotation photographs were not used in the scaling but were included in the R-factor calculation. Data were obtained for 8020 reflections with spacings out to 2.7 Å.
TABLE 1
Statistics on data processing of 2.7 Å data

Total number of reflections measured
Number of reflections fully recorded
Number of reflections recorded in parts
Number of independent reflections

$$R\text{-factor} = \frac{\sum_i \left| F_i - \bar{F} \right|}{\sum_i \bar{F}}$$

where $F_i$ = amplitude of the ith reflection and $\bar{F}$ = average amplitude for all reflections symmetry-related to the ith reflection.
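The symmetry R-factor defined in Table 1 can be illustrated with a minimal sketch. It assumes that the measurements have already been grouped into sets of symmetry-related amplitudes, and that the sum runs over every measurement in every set; the function name and the sample amplitudes are hypothetical and are not taken from the data set described here.

```python
def symmetry_r_factor(groups):
    """Agreement between symmetry-related amplitudes.

    groups: list of lists, each holding the measured amplitudes of one set
    of symmetry-related reflections.
    """
    numerator = 0.0
    denominator = 0.0
    for amplitudes in groups:
        if len(amplitudes) < 2:
            continue  # a single measurement says nothing about agreement
        mean = sum(amplitudes) / len(amplitudes)
        numerator += sum(abs(f - mean) for f in amplitudes)
        denominator += mean * len(amplitudes)  # sum of the mean over the set
    return numerator / denominator if denominator else 0.0

# Hypothetical amplitudes for two sets of symmetry-related reflections.
example = [[100.0, 110.0], [50.0, 50.0, 56.0]]
r_sym = symmetry_r_factor(example)
print(f"symmetry R-factor = {100.0 * r_sym:.1f}%")
```

Because the mean summed over a set equals the sum of the amplitudes in that set, the denominator is the same whether it is written in terms of the mean or of the individual amplitudes.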
`3. Determination of the Structure
This analysis used the fact that the shape of the haemoglobin molecule is expected to be as similar in horse and human liganded haemoglobins as it has been shown to be in the two deoxy forms (Bolton & Perutz, 1970; Fermi, 1975). This enables one to use the structure of horse haemoglobin, which is known (Cullis et al., 1961, 1962; Ladner et al., 1977), to find the location of the human haemoglobin molecule in its unit cell, and then to use it as an initial model for further refinement.
`(a) Determination of the space group and the location of the
`molecule in the unit cell
(i) Analysis of 5.5 Å human methaemoglobin data
The space group of human liganded haemoglobin could be either P4₁2₁2 or P4₃2₁2. In either case the symmetry and the number of molecules per unit cell limit the number of parameters describing the position and orientation of the tetrameric molecule to two. Figure 1(a) and (b) illustrates the two parameters (q and θ) that must be determined. The origin of the molecular co-ordinate system is taken at the centre of mass of the four iron atoms, and this lies on the molecular dyad axis (Y).
FIG. 1. (a) Schematic diagram of the haemoglobin tetramer showing the molecular axes. Y is the dyad axis relating the α1β1 dimer to α2β2. X (perpendicular to the paper) and Z are the pseudo-dyad axes relating α and β subunits.
(b) The unit cell of human liganded haemoglobin. The centres of the molecules lie on the diagonal dyad axes. There are 4 molecules per unit cell. This Figure shows the cell dimensions of human carbonmonoxy haemoglobin and shows space group P4₁2₁2, in which the molecular centres lie at positions q, q, 0; −q + 1/2, q + 1/2, 1/4; −q, −q, 1/2; q + 1/2, −q + 1/2, 3/4. (Space group P4₃2₁2 would have molecular centres at q, q, 0; q + 1/2, −q + 1/2, 1/4; −q, −q, 1/2; −q + 1/2, q + 1/2, 3/4.) θ is the angle between the molecular Z-axis and the c-axis of the unit cell. The positive direction of θ is chosen in the sense that rotates X into Z.
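The positions listed in the legend to Figure 1(b) can be generated for any value of q with a short sketch such as the one below. The function name is hypothetical, and wrapping each fractional co-ordinate into the range 0 to 1 is a presentational choice, not something specified in the text.

```python
def molecular_centres(q, space_group="P41212"):
    """Fractional co-ordinates of the 4 molecular centres on the diagonal dyads."""
    if space_group == "P41212":
        sites = [(q, q, 0.0),
                 (-q + 0.5, q + 0.5, 0.25),
                 (-q, -q, 0.5),
                 (q + 0.5, -q + 0.5, 0.75)]
    elif space_group == "P43212":
        sites = [(q, q, 0.0),
                 (q + 0.5, -q + 0.5, 0.25),
                 (-q, -q, 0.5),
                 (-q + 0.5, q + 0.5, 0.75)]
    else:
        raise ValueError("space group must be 'P41212' or 'P43212'")
    # Wrap every co-ordinate into the unit cell and tidy rounding noise.
    return [tuple(round(c % 1.0, 6) for c in site) for site in sites]

# q = 0.528 is the value found in the low resolution search (Fig. 2).
centres = molecular_centres(0.528)
for x, y, z in centres:
    print(f"{x:.3f} {y:.3f} {z:.3f}")
```

The two space groups differ only in which of the two off-origin dyad positions sits at z = 1/4 and which at z = 3/4, which is why the search had to be run separately for each.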


FIG. 2. Results of the low resolution search, using human methaemoglobin data, for the correct values of q and θ in space group P4₃2₁2 (a) and in P4₁2₁2 (b). Shaded areas are the regions of parameter space where no significant overlap would occur between the 4 molecules in the cell. The numbers are the values of the agreement factor A expressed as a percentage of mean Fo. There is a minimum in A, at q = 0.528 (fractional co-ordinate) and θ = 2.126 rad, for space group P4₁2₁2 in (b), whereas there is no minimum in (a). The sharpness of the minimum, for 2 ranges of reflection spacings (continuous line, 11 Å to 7 Å; broken line, 9 Å to 5.5 Å), is shown in (c): top, the dependence of A on q for θ = 2.126 rad; below, the dependence of A on θ for q = 0.528.

The agreement factor for these parameters is 37% of the mean Fo for the 400 reflections with spacings between 11 Å and 7 Å, and 45% for the 600 reflections with spacings between 9 Å and 5.5 Å. The weaker minimum in the upper region of Figure 2(b) corresponds to the case where the tetramer is rotated by 180° about a pseudo-dyad axis so that the α and β subunits are interchanged. The position and orientation specified by (q, θ) is related by the pseudo-dyad to that specified by (q, π − θ) with the direction of Y reversed. The latter parameters are equivalent to (−q, π − θ) with the direction of Y restored.
`The region of parameter space determined from the low resolution data of human
`methaemoglobin was taken as the starting point for further searches using the high
`resolution data for carbonmonoxy haemoglobin.
(ii) Analysis of 2.7 Å human carbonmonoxy haemoglobin data
The unit cell parameters of human carbonmonoxy haemoglobin are a = b = 53.7 Å, c = 193.0 Å (see Table 2).

TABLE 2
Unit cell dimensions of human liganded haemoglobin and parameters specifying location of molecules in the unit cell

Human methaemoglobin
Human carbonmonoxy haemoglobin

Space group P4₁2₁2; 4 molecules per unit cell.
† q and θ are defined in Fig. 1(b). Cell dimensions a, b and c are in Å; q is expressed as a fraction of the cell edge; θ is in radians. The cell dimensions for human methaemoglobin are those given by Perutz (1968).
‡ In the search for q and θ with the 5.5 Å methaemoglobin data the unit cell dimensions used were those previously quoted by Muirhead (1963) (a = b = 54.3 Å, c = 196.4 Å), so exact agreement between the values of q and θ obtained with the met- and carbonmonoxy haemoglobin data is not expected.

In the search calculations using high resolution data, the density map used was one that was constructed from the atomic co-ordinates of the horse methaemoglobin structure (Ladner et al., 1977), using a computer program written by A. D. McLachlan. Atoms are replaced by Gaussian distributions of electron density, each type of atom being given a specific weight and radius. The structure amplitudes and phases were calculated from this map using a fast-Fourier-transform program written by L. F. TenEyck. In this case the agreement factor examined was the R-factor (RF),
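The expression for RF does not survive in this copy of the text. The sketch below therefore assumes the conventional crystallographic definition, the sum of |Fo − Fc| over all reflections divided by the sum of Fo, with Fc first placed on the scale of Fo by the ratio of the sums. Both the definition and the scaling rule are assumptions, and the function name and amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc, scale=None):
    """Conventional R-factor between observed and calculated amplitudes.

    If no scale is given, F_calc is scaled so that sum(F_calc) == sum(F_obs)
    (an assumed scaling rule, not taken from the text).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    if scale is None:
        scale = sum(f_obs) / sum(f_calc)
    disagreement = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return disagreement / sum(f_obs)

# Hypothetical amplitudes, for illustration only.
f_obs = [100.0, 50.0, 25.0]
f_calc = [45.0, 30.0, 12.5]
rf = r_factor(f_obs, f_calc)
print(f"RF = {100.0 * rf:.1f}%")
```

In a search of the kind described here, a function of this sort would be evaluated at each trial (q, θ) and the minimum taken as the starting position for refinement.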
Figure 3 illustrates the variation of the R-factor, calculated on 8020 reflections
with spacings between 20.0 Å and 2.7 Å, in the region of parameter space near the


`(i) Cyclic real space refinement
The initial atomic positions were then refined using the cyclic procedure of constrained crystallographic refinement that has now been used successfully for many structure determinations (see Steigemann et al., 1976). The method involves cyclic real-space refinement (Diamond, 1971, 1974) into (2Fo − Fc) electron density maps computed using amplitudes and phases calculated from the preceding model together with the observed structure amplitudes. The standard procedure is shown in Figure 4, as an aid to following the discussion below. The left-hand and right-hand paths were followed at various stages during the refinement. The conditions of the real-space refinement are given in Table 3. The progress of the refinement is shown in Table 4 in terms of R-factor improvement, average shifts in atomic positions and average changes in phase angle.

Current set of atomic co-ordinates
  → Construction of map from co-ordinates (1) (either with standard atomic weights and radii or with refined values)
  → Calculation of Fc and αc by Fourier transformation of map (2)
  → Calculation of RF and Fourier coefficients (3)
One path: Calculation of "best" map (4) (amplitudes 2Fo − Fc, phases αc)
  → Real-space refinement (5), possibly more than 1 cycle into same map
  → Current set of atomic co-ordinates
Other path: Calculation of "difference" map (amplitudes Fo − Fc, phases αc)
  → Visual inspection of "difference" map
  → Corrections with model-building program (6) → Current set of atomic co-ordinates
  → or: No further corrections indicated

FIG. 4. Flow diagram of cyclic refinement procedure. (1) The "map" is constructed by replacing each atom by a Gaussian distribution of electron density using a program written by A. D. McLachlan. (2) Fourier transform program by L. F. TenEyck. (3) R-factor program by J. E. and R. C. Ladner. (4) The "best" map at any stage has amplitudes (2Fo − Fc) and phases αc. This corresponds to the model map + twice the difference map. (5) Real-space refinement program by Diamond (1971, 1974); parameters as given in Table 3. (6) Model-building program by Diamond
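The statement in the legend to Figure 4 that the "best" map corresponds to the model map plus twice the difference map follows from the identity 2Fo − Fc = Fc + 2(Fo − Fc). A minimal sketch for a single reflection is given below; the function name and the numerical values are hypothetical.

```python
import cmath

def map_coefficients(f_obs, f_calc, alpha_calc):
    """Complex Fourier coefficients of the "best" and "difference" maps.

    Both maps take their phases from the current model (alpha_calc, radians).
    """
    phase = cmath.exp(1j * alpha_calc)
    best = (2.0 * f_obs - f_calc) * phase   # amplitude 2Fo - Fc
    difference = (f_obs - f_calc) * phase   # amplitude Fo - Fc
    return best, difference

# Hypothetical amplitudes and phase for one reflection.
f_obs, f_calc, alpha = 120.0, 100.0, 0.75
best, difference = map_coefficients(f_obs, f_calc, alpha)
model = f_calc * cmath.exp(1j * alpha)      # coefficient of the model map

# "Best" map = model map + twice the difference map.
print(abs(best - (model + 2.0 * difference)) < 1e-9)
```

Because Fourier synthesis is linear, the same relation holds for the maps themselves, so features present in the data but absent from the model appear in the "best" map at twice their difference-map weight.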

TABLE 3
Conditions of real-space refinement†

Zone length, 5; margin width, 6
Radii of atoms (a): C, O, N = 1.7 Å; S = 2.4 Å; Fe = 1.9 Å
Occupancies of atoms (Z): C = 6, O = 7, N = 8, S = 16, Fe = 26‡
Relative softness of angular parameters:
  φ, ψ, χ
  τ (N-Cα-C)
  χ5 (arginine)
  Proline angle
Filter levels:
  Scale factors, background
  Translational refinement
  Rotational refinement
Electron density map on grid with dimensions 0.75 Å × 0.75 Å × 0.67 Å

The globin and the haem of each subunit were refined separately as described by Fermi (1975). (In the globin refinement the atoms Fe-C-O were present but their final positions were taken from the haem refinement; in the haem refinement, atoms of His(F8) were present but their final positions were taken from the globin refinement. Fe was therefore not connected directly to Nε(F8).) The haem group had a flat porphyrin ring with co-ordinates based on the structure of chlorohemin (Koenig, 1965); only rotations about side chain bonds were allowed. Fe-C-O was not connected to the pyrrole nitrogens of the haem; Fe was free to move normal to the haem plane and C-O was free to move off the normal.
† See Diamond (1971, 1974) for definitions.
‡ Z and a were not refined. All cycles allowed atomic positions, background density and local scale factors for each residue (K) to refine.
`No further improvement in R-factor was obtained after it had decreased to 31%
with standard atomic weights and radii or 29% with refined values for the scale
`factors K of each residue (see Diamond, 1971,1974). These scale factors account to
`some extent for the group temperature factors for each residue. By this stage no
`further features in the difference Fourier map could be interpreted.
`(ii) Energy refinement
The limited resolution of the data does not enable the real-space refinement method to produce co-ordinates without some bad non-bonded interactions. Bad contacts can be relieved by a few cycles of energy refinement (Levitt, 1974). A total of 60 cycles of energy refinement were carried out on the co-ordinates using the program of Levitt together with the latest values of the energy parameters recommended by D. Hall (personal communication). The parameters concerned with the structure of the haem groups had been chosen with reference to the accurate co-ordinates of synthetic models of haem proteins (Collman, 1977). The conditions used in the energy refinement are given in Table 5. The R-factor of the energy-refined co-ordinates is not significantly worse than that for the real-space-refined co-ordinates (see Table 4). The r.m.s.† shift in atomic co-ordinates between the real-space-refined and the energy-refined co-ordinates was 0.28 Å and the average magnitude of the shift was 0.22 Å.
† Abbreviation used: r.m.s., root-mean-squared.
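The two shift statistics quoted above (r.m.s. shift and average magnitude of the shift) can be computed from two matched co-ordinate sets as in the sketch below. The function name and the co-ordinates are hypothetical; the sketch assumes that the atoms are listed in the same order in both sets.

```python
import math

def shift_statistics(coords_before, coords_after):
    """Return (r.m.s. shift, mean shift) between two matched co-ordinate sets."""
    shifts = [math.dist(a, b) for a, b in zip(coords_before, coords_after)]
    rms = math.sqrt(sum(s * s for s in shifts) / len(shifts))
    mean = sum(shifts) / len(shifts)
    return rms, mean

# Two hypothetical atoms shifted by 0.3 A and 0.4 A respectively.
before = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
after = [(0.3, 0.0, 0.0), (1.0, 0.4, 0.0)]
rms_shift, mean_shift = shift_statistics(before, after)
print(f"r.m.s. shift {rms_shift:.3f} A, mean shift {mean_shift:.3f} A")
```

The r.m.s. value is never smaller than the mean magnitude, and the gap between them widens when a few atoms move much further than the rest, which is consistent with the 0.28 Å and 0.22 Å reported here.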



`(iii) Accuracy of final co-ordinates
An estimate of the accuracy of the co-ordinates can be obtained by a method originating with Luzzati (1952, 1953) that was discussed and used by Fermi (1975). This method estimates the r.m.s. error in the co-ordinates from the dependence of the R-factor (RF) on the inverse resolution of the data. Figure 5 shows a plot of the value of RF versus d* for both the real-space-refined and energy-refined co-ordinates, alongside theoretical curves for r.m.s. errors between 0.4 Å and 0.6 Å. The data indicate an r.m.s. error of about 0.55 Å for both sets of co-ordinates. This r.m.s. value gives the average error throughout the co-ordinate set, and the atomic positions will be more accurate in the well-determined regions of the structure (e.g. in helical regions) and less accurate in poorly determined regions (e.g. in the corner regions between helices and at the chain termini). Fermi (1975) was able to estimate the relative errors in various regions of the structure of human deoxyhaemoglobin from a comparison of co-ordinates between the two halves of the molecule that are related by a molecular, but non-crystallographic, 2-fold axis. The relative r.m.s. errors that he found are given in Table 6. Assuming that the distribution of errors is similar for the carbonmonoxy haemoglobin structure, the r.m.s. errors in various parts of this new structure are estimated as given in Table 6. The r.m.s. error in the co-ordinates of the corner regions between helices, and of the chain termini, is greater than 0.55 Å.

FIG. 5. The dependence of the R-factor (RF) on the inverse resolution of the data, d*. (○) RF calculated from real-space-refined co-ordinates; (●) RF calculated from energy-refined co-ordinates; continuous lines are theoretical curves for r.m.s. errors between 0.4 Å and 0.6 Å, derived as described by Fermi (1975).
`(iv) Changes from initial co-ordinates
The accumulated r.m.s. shifts between corresponding atomic positions are given
in Table 4 for each stage of the refinement. The r.m.s. shift between the initial and
final co-ordinates of all atoms was 1.35 Å. When main chain and β-carbon atoms only





Distances and angles in the haems and their surroundings

Distance or angle | Energy-refined | Real-space-refined

Fe to mean haem plane (Å)
Fe to plane of Ns (Å)
Angles between pyrrole planes 1-4 and mean haem plane
Average distance of Fe from pyrrole Ns (Å)
Nε of His(F8) to mean haem plane
Nε of His(F8) to plane of pyrrole Ns (Å)
Nε-Fe bond length (Å)
Angle between Nε-Fe bond and normal to haem plane (deg.)
Angle between Nε-Fe bond and plane of His(F8) (deg.)
Angle between projection of Nε-Fe bond on haem plane and line from centre of haem to N1 (+ toward N2) (deg.)
Angles Fe-Nε-Cδ (deg.)
       Fe-Nε-Cε (deg.)
Angle between plane of His(F8) and mean haem plane (deg.)
Angle between projection of His(F8) on haem plane and line from centre of haem to N1 (+ toward N2) (deg.)
Distance Cε(F8) to N1 (Å)
         Cδ(F8) to N3 (Å)
Angle between Fe-C-O and haem normal (deg.)
Angle between projection of Fe-C-O line on haem plane and line from centre of haem to N1 (+ towards N2) (deg.)
Distance from O of CO to
  Nε of HisE7 (Å)
  Cε of HisE7 (Å)
  Cγ2 of ValE11 (Å)
Distance from "Q" to §
  Nε of HisE7 (Å)
  Cε of HisE7 (Å)
  Cγ2 of ValE11 (Å)

2.07 ± 0.26
2.07 ± 0.26
2.06 ± 0.25
2.2, 1.1
3.2, 1.5
0.22 ± 0.15   −0.02
4.9, 3.1
2.2, 3.3
1.99 ± 0.26
1.81 ± 0.25
13 ± 10
12 ± 11
142 ± 17
106 ± 17
13 ± 11
122 ± 17
126 ± 17
14 ± 7
1 1
13 ± 7
2.7 (3.1)   2.9   3.3 (2.5)   3.1
† The estimates of these angles and distances made for horse carbonmonoxy haemoglobin by Heidner et al. (1976) are given in parentheses for comparison.
‡ The errors in these distances are ±0.26 Å.
`§ “Q” is where the O of CO would be if the CO did lie on the normal to the haem plane.
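Two of the quantities tabulated above, the angle between Fe-C-O and the haem normal and the position of the point "Q", can be illustrated with a small vector calculation. The sketch below is only indicative: it takes the Fe to O direction as the Fe-C-O line and places "Q" on the normal at the same distance from Fe as the observed O, both of which are assumptions; the co-ordinates are hypothetical and are not taken from the refined structure.

```python
import math

def angle_to_normal_deg(vector, normal):
    """Angle in degrees between a vector and a plane normal."""
    dot = sum(v * n for v, n in zip(vector, normal))
    lengths = math.hypot(*vector) * math.hypot(*normal)
    cosine = max(-1.0, min(1.0, dot / lengths))  # guard against rounding
    return math.degrees(math.acos(cosine))

def point_on_normal(origin, normal, distance):
    """Point at the given distance from origin along the plane normal."""
    length = math.hypot(*normal)
    return tuple(o + distance * n / length for o, n in zip(origin, normal))

# Hypothetical geometry: haem in the xy-plane with Fe at the origin.
haem_normal = (0.0, 0.0, 1.0)
fe = (0.0, 0.0, 0.0)
oxygen = (0.7, 0.0, 2.9)

fe_to_o = tuple(o - f for o, f in zip(oxygen, fe))
tilt = angle_to_normal_deg(fe_to_o, haem_normal)
q_point = point_on_normal(fe, haem_normal, math.hypot(*fe_to_o))
print(f"tilt from haem normal: {tilt:.1f} deg; Q at {q_point}")
```

Distances from "Q" to the neighbouring side-chain atoms can then be compared with those from the observed O position to judge how far the ligand is displaced from the normal.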
basis of accurately known structures of haem complexes (see Table 5). In the energy-refined co-ordinate set, the iron atom in each subunit is essentially in the mean plane of the haem. As a result of the energy refinement the iron atom moved by 0.04 Å in the α haem and by 0.20 Å in the β haem, whilst the pyrrole nitrogens (N1 to N4)


`FIG. 7. The haem group in carbonmonoxy haemoglobin, showing the proximal histidine (F8)
`bonded to the iron atom and some of the side chains that are in contact with the haem on the side
`where the ligand binds. The view has been chosen perpendicular to the line joining the N atoms of
`pyrrole rings 1 and 3 to emphasize the position of the CO molecule off the normal to the haem



Cysβ93(F9) differed in met- and carbonmonoxy haemoglobin. They found that in methaemoglobin the side chain of Cysβ93 was in equilibrium between two conformations, one with the side chain pointing into a pocket enclosed by parts of helices F, G and H, and the other with the side chain on the outside of the subunit. Their difference Fourier map showed that in carbonmonoxy haemoglobin the side chain always has the first of these conformations, accounting for the decreased reactivity of this sulphydryl group in carbonmonoxy haemoglobin compared to that in methaemoglobin. In deoxyhaemoglobin the side chain always has the second conformation, whilst the pocket between helices F, G and H is occupied by the side chain of Tyrβ145(HC2), whose OH group is hydrogen-bonded to the carbonyl group of Valβ98(FG5). The reactivity of the sulphydryl group is low in deoxyhaemoglobin for a different reason, as access to it is hindered by the carboxy-terminal residue Hisβ146(HC3), which is held firmly in position by salt bridges in deoxyhaemoglobin but is probably freely moving in carbonmonoxy haemoglobin. The result concerning the conformation of β93 in carbonmonoxy haemoglobin is confirmed by the present study of the human form. The initial co-ordinates compiled from those of horse methaemoglobin had the sulphydryl group in the surface position, and the first difference Fourier map calculated showed this to be wrong. The position was corrected by rotating the side chain about the Cα-Cβ bond so that the side chain was brought into the pocket, where it remained during subsequent refinement. The initial and final conformation angles of Cysβ93(F9) are given in Table 10, along with the shifts that took place in the atomic positions by the end of the refinement. In carbonmonoxy haemoglobin the carboxy-terminal residues Tyrβ145(HC2) and Hisβ146(HC3) are only partially localised near the surface of the subunit. The Tyr side chain appears in the final map with low occupancy in a position in contact with Cysβ93, hydrogen-bonded to the carbonyl group of Valβ98(FG5).

TABLE 10
Change in conformation of Cysβ93(F9)

Conformation angle (deg.)
Atomic shifts from initial co-ordinates (Å)

In the final energy-refined structure Sγ is 3.2 Å from the carbonyl O of Serβ89(F5), with which it probably forms a hydrogen bond.
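A conformation angle of the kind listed in Table 10 is the dihedral angle defined by four consecutive atoms, for example N, Cα, Cβ and Sγ for rotation about the Cα-Cβ bond of a cysteine. The sketch below computes such an angle from Cartesian co-ordinates; the choice of those four atoms is an assumption about how the tabulated angle is defined, and the function name and test co-ordinates are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle p0-p1-p2-p3 in degrees, in the range -180 to +180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    axis = tuple(c / length for c in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))

# Hypothetical atoms: central bond along z, outer bonds at right angles.
angle = dihedral_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                     (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(f"conformation angle = {angle:.1f} deg")
```

Applied to the initial and final co-ordinates of the same residue, the difference between the two angles gives the rotation of the side chain about the central bond.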

`5. Conclusions
A structure has been solved using X-ray data for the native protein only, by refinement from a trial model based on a closely related, known structure. The parameters specifying the position and orientation of the molecules in the new unit cell were determined by searching for the best initial R-factor. The refined co-ordinates gave an R-factor to 2.7 Å resolution as good as that obtained by standard
`The final co-ordinates of human carbonmonoxy haemoglobin have been deposited
`with the Protein Data Bank. Copies can be obtained from the Crystallographic Data
`Centre, University Chemical Laboratory, Cambridge, England or from the Brook-
`haven National Laboratory, Upton, Long Island, NY 11973, USA.
`I am grateful to Dr N. L. Anderson for providing the high-resolution X-ray data, and
`to Drs J. Champness, P. R. Evans, G. Fermi, D. Hall, R. C. Ladner, S. E. V. Phillips,
`T. Takano and A. Wonacott for helpful discussions and the use of their computing pro-
`cedures. I thank Drs R. Henderson and M. F. Perutz for continuing encouragement.
Arndt, U. W. & Wonacott, A. J. (1977). Editors of The Rotation Method in Crystallography, North-Holland, Amsterdam.
`Baldwin, J. M. (1975). Progr. Biophys. Mol. Biol. 29, 225-320.
`Baldwin, J. & Chothia, C. (1979). J. Mol. Biol. 129, 175-220.
Bode, W. & Schwager, P. (1975). J. Mol. Biol. 98, 693-717.
`Bolton, W. & Perutz, M. F. (1970). Nature (London), 228, 551-552.
`Collman, J. P. (1977). Acc. Chem. Res. 10, 265-272.
`Cullis, A. F., Muirhead, H., Perutz, M. F., Rossmann, M. G. & North, A. C. T. (1961).
`Proc. Roy. Soc. ser. A, 265, 15-38.
Cullis, A. F., Muirhead, H., Perutz, M. F., Rossmann, M. G. & North, A. C. T. (1962).
`Proc. Roy. Soc. ser. A, 265, 161-187.
Dayhoff, M. O. (1972). Atlas of Protein Sequence and Structure, vol. 5, suppl. 3, National
`Biomedical Research Foundation, Washington DC.
`Diamond, R. (1966). Acta Crystallogr. 21, 253-266.
`Diamond, R. (1971). Acta Crystallogr. sect. A, 27, 436-452.
`Diamond, R. (1974). J. Mol. Biol. 82, 371-391.
`Fehlhammer, H. & Bode, W. (1975). J. Mol. Biol. 98, 683-692.
`Fermi, G. (1975). J. Mol. Biol. 97, 237-256.
`Greer, J. (1971). J. Mol. Biol. 62, 241-249.
`Heidner, E. J., Ladner, R. C. & Perutz, M. F. (1976). J. Mol. Biol. 104, 707-722.
Jensen, L. H. (1976). In Crystallographic Computing Techniques (Ahmed, F. R., ed.), Munksgaard, Copenhagen.
`Koenig, D. F. (1965). Acta Crystallogr. 18, 663-673.
Ladner, R. C., Heidner, E. J. & Perutz, M. F. (1977). J. Mol. Biol. 114, 385-414.
`Levitt, M. (1974). J. Mol. Biol. 82, 393-420.
`Luzzati, V. (1952). Acta Crystallogr. 5, 802.
`Luzzati, V. (1953). Acta Crystallogr. 6, 142-152.
`Mallett, J. F. W., Champness, J. N., Faruqi, A. R. & Gossling, T. H. (1977). J. Phys. E.
`Sci. Instrum. 10, 351-358.
`Muirhead, H. (1963). Ph.D. thesis, University of Cambridge.
`Perutz, M. F. (1968). J. Cryst. Growth, 2, 54-56.
`Perutz, M. F. (1976). Brit. Med. Bull. 32, 195-208.
Perutz, M. F. (1979). Annu. Rev. Biochem. 48, 327-386.

