`
`Three-Dimensional Coordinates from Stereodiagrams of Molecular Structures
`
`BY MICHAEL G. ROSSMANN AND PATRICK ARGOS
`Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA
`
`(Received 5 July 1979; accepted 15 October 1979)
`
`
`Abstract
`
A method is described for deriving three-dimensional
coordinates from stereodiagrams of molecular architecture.
The accuracy of the method is tested for cytochrome b5
(86 Cα atoms) and tomato bushy stunt virus (311 Cα atoms).
The coordinates were reconstructed to within 1.9 Å and 2.6 Å
r.m.s. deviation of their original values, respectively. The
ethics of the procedure are discussed.
`
`Introduction
`
`The publication and illustration of molecular detail
`often takes the form of ball-and-stick representations in
`stereodiagrams. This practice has been greatly aided by
Johnson's (1970) ORTEP program. These stereodiagrams
are frequently published by protein and
`nucleic acid crystallographers long before the actual
`coordinates are publicly distributed. This practice has
`created an unusual situation where scientists publicly
`present their results, but do not attempt to provide
`sufficient information for others to use quantitatively
`their 'published data'.
`In this paper a technique is described with which the
`three-dimensional coordinates can be determined from
`stereodiagrams. This is followed by a discussion of the
`ethics of non-publication of relevant data and the use of
the present technique to extract the missing
information.
`
`The technique
`
A stereodrawing consists of two projections of a
three-dimensional object on to a plane. The object is
viewed from a given distance but is rotated by a small
angle $+\varphi$ and $-\varphi$ to create the left and right
projections. The viewing distance, $v$, is normally about
20 in (~0.5 m) or at infinity. The total angular
separation $2\varphi$ is usually about 5°, which is the average
angle subtended by the eyes at the normal focal plane.
Let an atom of the object be at position $x, y, z$ relative
to a Cartesian axial frame.
`
Let the object be rotated by $\pm\varphi$ about the $y$ axis and
viewed along $z$ (Fig. 1).
Let the coordinates of the projected atom be at $x_L, y_L$
and $x_R, y_R$ in the left and right stereodiagrams,
respectively.
If the viewing distance is at infinity, then

$$x_L = x\cos\varphi + z\sin\varphi, \qquad y_L = y,$$

and

$$x_R = x\cos\varphi - z\sin\varphi, \qquad y_R = y.$$

It follows that

$$x = \frac{x_L + x_R}{2\cos\varphi}, \qquad y = \frac{y_L + y_R}{2}, \qquad z = \frac{x_L - x_R}{2\sin\varphi}. \tag{1}$$
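As an illustration of equations (1), the following minimal sketch projects a point into a left and right view for a viewing distance at infinity and then inverts the projections. The function names `project` and `reconstruct_at_infinity` and the sample point are assumptions made for this example only; they do not come from the paper.

```python
import math

def project(x, y, z, phi_deg):
    """Left and right projections of (x, y, z) for a viewer at infinity."""
    phi = math.radians(phi_deg)
    x_left = x * math.cos(phi) + z * math.sin(phi)
    x_right = x * math.cos(phi) - z * math.sin(phi)
    return (x_left, y), (x_right, y)

def reconstruct_at_infinity(left, right, phi_deg):
    """Invert the two projections with equations (1)."""
    phi = math.radians(phi_deg)
    (x_left, y_left), (x_right, y_right) = left, right
    x = (x_left + x_right) / (2.0 * math.cos(phi))
    y = (y_left + y_right) / 2.0
    z = (x_left - x_right) / (2.0 * math.sin(phi))
    return x, y, z

# Round trip for a hypothetical atom with phi = 2.5 degrees (2*phi = 5 degrees).
left, right = project(4.0, -2.0, 7.5, phi_deg=2.5)
print(reconstruct_at_infinity(left, right, phi_deg=2.5))
```

Because $z$ is obtained by dividing the small difference $x_L - x_R$ by $2\sin\varphi$, any measurement error in the $x$ readings is magnified by roughly $1/(2\sin\varphi)$, which is about 11 for $\varphi = 2.5°$.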
If, however, the viewing distance is not at infinity, an
object which has a length $d$ and is in front of the
projection plane will appear to have a larger length $D$
(Fig. 2) within the plane, where

$$D = \frac{d}{1 - \dfrac{z}{v}}. \tag{2}$$
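Expression (2) can be checked with similar triangles. The short derivation below assumes that the eye sits at distance $v$ from the projection plane and that the segment of length $d$ lies parallel to the plane at height $z$ in front of it, so that it is $v - z$ from the eye; this geometry is read off Fig. 2 and is not spelled out in the text.

```latex
% Similar triangles with apex at the eye:
\frac{D}{v} = \frac{d}{v - z}
\quad\Longrightarrow\quad
D = \frac{d\,v}{v - z} = \frac{d}{1 - z/v},
\qquad
d = D\left(1 - \frac{z}{v}\right).
% Numerical illustration (hypothetical values): v = 20 in, z = 1 in
% gives D = d / 0.95, i.e. about a 5% enlargement.
```

The second form, $d = D(1 - z/v)$, is the correction that has to be applied to lengths measured on the stereodiagram.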
`
Fig. 1. Definition of coordinates in viewing a stereodiagram (left and
right projections). The $z$ axis is perpendicular to the page.
`© 1980 International Union of Crystallography
`
`
`
`
Fig. 2. An object of length $d$ at distance $z$ in front of the projection
plane when viewed at a distance $v$ from this plane will appear to
have a projected length $D$, where $D = d/[1 - (z/v)]$.
`
Since $x_L$, $y_L$, $x_R$ and $y_R$ are measured projected lengths
on the stereodiagram, their values must be corrected
with the expression (2). Hence, combining (1) and (2),
it is clear that

$$x = \frac{x_L + x_R}{2\cos\varphi}\left(1 - \frac{z}{v}\right), \qquad
y = \frac{y_L + y_R}{2}\left(1 - \frac{z}{v}\right), \qquad
z = \frac{x_L - x_R}{2\sin\varphi}\left(1 - \frac{z}{v}\right).$$

By solving these equations it will be found that

$$x = \frac{x_L + x_R}{2\cos\varphi}\, q, \qquad
y = \frac{y_L + y_R}{2}\, q, \qquad
z = \frac{x_L - x_R}{2\sin\varphi}\, q, \tag{3}$$

where

$$q = \frac{1}{1 + \dfrac{x_L - x_R}{2v\sin\varphi}}.$$

As $v$ tends to infinity, $q$ will approach unity and
equations (3) and (1) become equivalent. Furthermore,
as $v$ approaches $-[(x_L - x_R)/2\sin\varphi]$ an instability is
reached since the atom will then project at infinity on
the viewing plane.
The expressions (3) can, however, be used only with
a knowledge of $\varphi$ and $v$, parameters rarely supplied
with stereodiagrams. Accordingly some criterion is
necessary to determine these parameters; that is, the
depth measurements (along $z$) must be correlated to
their horizontal (along $x$ and $y$) counterparts. A
reasonable criterion used for analyzing stereodiagrams
of polypeptide backbones is that all Cα-Cα distances
are of equal length. Additional criteria could use well
known constants of such objects as the α-helix or
β-pleated sheet; for example, the distance between
every fourth, eighth, etc. Cα atom within an α-helix.
The longer the depth measurements, the greater will be
the accuracy of the determination of $\varphi$ and $v$, although
the Cα-Cα criterion has been found to be sufficient.
The criterion can be stated as requiring the minimum
value of

$$E = \sum_{i=1}^{N} (kR_i - S_i)^2, \tag{4}$$

where $S_i$ is the anticipated distance between two atoms
(say Cα-Cα = 3.84 Å), $R_i$ is the distance derived from
the stereodiagram (equations 3) in terms of the
measurement units, and $k$ is a scale factor which relates
the units used in $R$ to those used for $S$. The length of
each $R_i$ will depend on the selected values of $\varphi$ and $v$.
Thus, a two-dimensional search must be made for the
minimum of $E$ in terms of these variables. Likely search
limits are $1.5° \le \varphi \le 6°$ and $10'' \le v \le 50''$, with
suitable steps of $\Delta\varphi = 0.25°$ and $\Delta v = 5''$.
The best value of $k$ for a given $\varphi$ and $v$ is given when
$\partial E/\partial k = 0$, that is,

$$k = \frac{\sum_N R S}{\sum_N R^2},$$

which can be used to evaluate $E$.
Once $\varphi$ and $v$ have been found, the atomic
coordinates can be calculated from expressions (3).
However, the values of $z$ will be subject to a good deal
of error, as they depend on the differences $(x_L - x_R)$. In
contrast, $x$ and $y$ are essentially averages of the left and
right measurements. The accuracy in $z$ can be regained
to some extent by invoking the same criteria as used in
the determination of $\varphi$ and $v$. An iterative least-squares
procedure can be set up to adjust the coordinates so as
to equalize all Cα-Cα distances.
There may be a residual systematic error if the stereo
angle $\varphi$ has been incorrectly estimated. If $\varphi_C$ and $\varphi_E$
are the correct and estimated angles, respectively, then
it can be shown that the $z$ parameters will be in
systematic error according to the ratio of $(\sin\varphi_C/\sin\varphi_E)$
or approximately $\varphi_C/\varphi_E$. For instance, if $\varphi_C = 2.5°$
and $\varphi_E = 2.75°$, then the molecule will be
compressed in the ratio of 0.91 along $z$. Hence, even a
small error in the determination of $\varphi$ will produce a
significant systematic error in $z$. This is a consequence
of the lack of physical information inherently residing
in the small rotation angle between the left and right
images. Other systematic errors might arise in the
photographic reduction due to lens aberrations.
The steps in the procedure can be summarized as
follows.
(i) Determine $x_L, y_L$ and $x_R, y_R$ from the stereodiagram.
[One of these sets must now be rotated and
translated in order to minimize $\sum(y_L - y_R)$ to
compensate for any fortuitous rotation and translation
of the diagrams with respect to each other.]
(ii) Find the center of gravity of the left and the right
coordinate sets and refer $x_L, y_L$ and $x_R, y_R$ to these
origins. (One of the referees points out that some
improvement might be obtained by referring both
diagrams to the same origin and by considering the
translation factor as a further variable along with $v$ and
$\varphi$.)
(iii) Search for the minimum in $E$ (expression 4) to
obtain the angular separation $\varphi$ and viewing distance $v$.
(iv) Compute $x, y, z$ from expressions (3).
(v) Refine $x$, $y$ and $z$ given $\varphi$ and $v$ from (4).
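Steps (iii) and (iv) can be sketched as follows. This is a minimal illustration and not the authors' program: the function names, the restriction to consecutive Cα-Cα pairs, and the conversion of 254 raster steps per inch (which holds only for an unreduced photograph scanned on a 100 μm raster) are assumptions made here. The viewing distance `v` must be expressed in the same units as the measured coordinates.

```python
import math

CA_CA = 3.84  # anticipated C-alpha to C-alpha distance S_i, in angstroms

def coords(left, right, phi_deg, v):
    """Expressions (3): (x, y, z) per atom; v may be math.inf."""
    phi = math.radians(phi_deg)
    out = []
    for (xl, yl), (xr, yr) in zip(left, right):
        a = (xl - xr) / (2.0 * math.sin(phi))
        # q is undefined when v = -a (the instability noted in the text);
        # Python raises ZeroDivisionError in that case.
        q = 1.0 / (1.0 + a / v)
        out.append(((xl + xr) / (2.0 * math.cos(phi)) * q,
                    (yl + yr) / 2.0 * q,
                    a * q))
    return out

def residual(left, right, phi_deg, v, target=CA_CA):
    """E of expression (4), using the best scale k = sum(RS) / sum(R^2)."""
    xyz = coords(left, right, phi_deg, v)
    r = [math.dist(p, q) for p, q in zip(xyz, xyz[1:])]
    k = sum(ri * target for ri in r) / sum(ri * ri for ri in r)
    e = sum((k * ri - target) ** 2 for ri in r)
    return e, k

def search(left, right, phi_values, v_values):
    """Grid search over (phi, v); returns (E, phi, v, k) at the minimum."""
    best = None
    for phi_deg in phi_values:
        for v in v_values:
            e, k = residual(left, right, phi_deg, v)
            if best is None or e < best[0]:
                best = (e, phi_deg, v, k)
    return best

# Search limits quoted in the text: 1.5 to 6 degrees in 0.25 degree steps,
# 10 to 50 inches in 5 inch steps, plus a viewer at infinity.
PHI_GRID = [1.5 + 0.25 * i for i in range(19)]
RASTER_STEPS_PER_INCH = 254  # assumption: 100 micrometre raster, no reduction
V_GRID = [inch * RASTER_STEPS_PER_INCH for inch in range(10, 55, 5)] + [math.inf]
```

With `left` and `right` holding the centred $(x, y)$ readings of consecutive Cα atoms, `search(left, right, PHI_GRID, V_GRID)` returns the grid point of lowest $E$, and `coords` evaluated at that point gives the starting coordinates for the refinement in step (v).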
In practice it has been found that some atomic
coordinates are particularly poor due to overlapping in
one or other of the projections. Such inaccurate coordinates
can interfere in the search procedure for $\varphi$
and $v$. However, these can readily be eliminated from
the search procedure by using the Cα-Cα criterion in
conjunction with reasonable test values of $\varphi$ and $v$ (e.g.
$\varphi = 3°$, $v = \infty$). The test must be applied immediately
preceding step (iii).
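One way such a screening test might look is sketched below. The text does not say how large a deviation counts as poor, nor how the scale is fixed before the search, so two choices here are assumptions: the tolerance `tol` of 1.5 Å, and the use of the median bond length to set the scale (a robust stand-in for the least-squares $k$, which a few bad atoms would distort).

```python
import math
import statistics

def screen_bonds(left, right, phi_deg=3.0, target=3.84, tol=1.5):
    """Indices i of consecutive pairs (i, i+1) whose scaled length is off by > tol.

    Uses the test values phi = 3 degrees and v = infinity, i.e. equations (1).
    """
    phi = math.radians(phi_deg)
    xyz = [((xl + xr) / (2.0 * math.cos(phi)),
            (yl + yr) / 2.0,
            (xl - xr) / (2.0 * math.sin(phi)))
           for (xl, yl), (xr, yr) in zip(left, right)]
    r = [math.dist(p, q) for p, q in zip(xyz, xyz[1:])]
    k = target / statistics.median(r)  # assumed robust scale, not the paper's k
    return [i for i, ri in enumerate(r) if abs(k * ri - target) > tol]
```

An atom that appears in two flagged pairs is the likely culprit and can be left out of the sums in expression (4) during the search, then reinstated for the final coordinate calculation.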
`
`Results
`
`The procedure was tested by comparing 'calculated'
`coordinates determined from published stereodiagrams
`against the corresponding 'observed' sets obtained from
`the original investigators. Two examples were chosen:
`one easy, where each half-diagram could be readily
`followed, and one difficult, where each monoprojection
`contained many overlapping atoms and bonds. A
`stereodiagram of cytochrome b5 (Mathews, Levine &
`Argos, 1972) represented the easy example while
`tomato bushy stunt virus (TBSV) protein subunit
`(Harrison, Olson, Schutt, Winkler & Bricogne, 1978)
`provided the difficult case. Original coordinates were
`kindly supplied by Drs Scott Mathews and Steve
`Harrison, respectively.
`The stereodiagrams were photographed without
`change of size. The resultant transparencies were
digitized on an Optronics film scanner with a 100 μm
`raster. The optical densities were then listed on a line
`printer where each density was represented by a single
`character, but those below a given threshold were
`shown simply as an asterisk. Thus, the output was
`essentially binary where the bond lines of the original
`stereodiagrams were easily recognizable on a much
`enlarged scale. The molecular line drawing could then
`be followed in most places. Consultation of the original
`stereopair was able to resolve the remaining ambiguities.
The $x$ and $y$ coordinates of all atoms could then be read
`in terms of raster steps.
[Table 1. Results of the analysis of the cytochrome b5 and TBSV stereodiagrams. The table body is not recoverable from this copy.]

Table 2. Two-dimensional exploration of angular separation ($\varphi$) and viewing distance ($v$) for coordinates taken from a cytochrome b5 stereopair

                                         φ (°)
v (in)*   2.00   2.25   2.50   2.75   3.00   3.25   3.50   3.75   4.00   4.25
10        0.914  0.842  0.798  0.774  0.766  0.767  0.775  0.786  0.799  0.813
20        0.862  0.804  0.770  0.754  0.751  0.756  0.766  0.779  0.794  0.809
30        0.856  0.800  0.768  0.753  0.750  0.756  0.766  0.779  0.794  0.809
40        0.856  0.800  0.768  0.753  0.751  0.756  0.766  0.780  0.794  0.810
∞         0.862  0.806  0.774  0.758  0.755  0.760  0.770  0.782  0.797  0.811

Note: Numbers represent r.m.s. deviations in Å for calculated Cα-Cα distances from 3.84 Å.
* 1 in = 25.4 mm.

In Table 1 are listed the results for the analysis of the
cytochrome b5 and TBSV stereodiagrams. Shown are
the r.m.s. deviation between the measured $x_L$ and $x_R$
and the measured $y_L$ and $y_R$ coordinates. The latter pair
should be the same and thus give an estimate of the
error in the experimental determination of coordinates.
The former must be significantly different since
variation in the $x$ coordinates gives the determination of
the unknown $z$ parameters. Thus
$p = [\sum (x_L - x_R)^2 / \sum (y_L - y_R)^2]^{1/2}$
is a measure of the power of the
technique when applied to a given diagram. It will be
observed that the determinations of $\varphi$ and $v$ (Table 2)
give reasonable results when based only on the Cα-Cα
distances. Inclusion of α-helical parameters gave
essentially the same results. The r.m.s. deviation of all
Cα-Cα distances from 3.84 Å was improved from
0.78 to 0.29 Å for cytochrome b5 and from 1.16 to
0.53 Å for TBSV with the refinement of the $x$, $y$ and $z$
parameters.
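The two r.m.s. deviations and the power measure $p$ described in this paragraph are simple to compute from the digitized readings. The sketch below is illustrative only; the function names and the sample numbers are assumptions, and the inputs are taken to be lists of $(x, y)$ pairs for the same atoms in the left and right diagrams.

```python
import math

def rms_difference(left, right, axis):
    """R.m.s. difference between left and right readings along one axis (0 = x, 1 = y)."""
    n = len(left)
    return math.sqrt(sum((l[axis] - r[axis]) ** 2 for l, r in zip(left, right)) / n)

def power_ratio(left, right):
    """p = [sum (xL - xR)^2 / sum (yL - yR)^2]^(1/2)."""
    num = sum((l[0] - r[0]) ** 2 for l, r in zip(left, right))
    den = sum((l[1] - r[1]) ** 2 for l, r in zip(left, right))
    return math.sqrt(num / den)

# Hypothetical readings for two atoms, in raster steps.
left = [(1.0, 0.0), (4.0, 2.0)]
right = [(0.0, 0.1), (2.0, 1.9)]
print(rms_difference(left, right, axis=0), rms_difference(left, right, axis=1))
print(power_ratio(left, right))
```

Since both r.m.s. values are taken over the same atoms, $p$ is simply their ratio: the depth signal in $x$ divided by the measurement noise estimated from $y$.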
Comparison of the observed and calculated coordinates
was performed by a least-squares procedure (Rao
& Rossmann, 1973; Rossmann & Argos, 1975) which
obtains the best fit of two molecules in space. In this
case, the two molecules were the 'observed' (coordinates
from original investigator) and 'calculated' (coordinates
from stereodiagram) structures. While the $z$
parameters (depth) do have a systematic error due to a
small inaccuracy of estimating $\varphi$, no substantial error
was found (Table 1). The larger error in $z$ for TBSV
reflects the larger molecular thickness along $z$ so that
the systematic error will be greater at the extremities of
the molecule. A comparison of the original diagram of
cytochrome b5 (Mathews et al., 1972) and one drawn
from the calculated coordinates is shown in Fig. 3.
The TBSV calculated coordinates were tested for
their usefulness in showing structural equivalence
which involves the topological superposition of two or
more protein domains. Argos, Tsukihara & Rossmann
(1980) have recently suggested structural analogy
among the β-barrels comprising the two TBSV
domains and concanavalin A. The necessary methodology
has been reviewed by Rossmann & Argos
(1978). The intent here is to demonstrate the utility of
stereodiagram coordinates, even in a case as complex
as TBSV, but not to discuss the functional and
evolutionary implications of these comparisons. Table
3 shows the number of topologically equivalenced Cα
atoms determined by using the observed and calculated
coordinates. While the calculated coordinates
gave somewhat fewer equivalences with a slightly higher
r.m.s. deviation, the same homologous secondary
structural spans were easily recognized.

Fig. 3. Comparison of the original stereodiagram (top) as published
by Mathews et al. (1972) with a stereodiagram drawn
from the calculated coordinates.
`
`
`
`
Table 3. Analysis of the utility of the coordinates derived from a stereodiagram of a TBSV subunit

                           'Observed' coordinates          'Calculated' coordinates
                           Number of      r.m.s.           Number of      r.m.s.
                           equivalences   deviation (Å)    equivalences   deviation (Å)
TBSV(S)-TBSV(P)            69             3.8              63             3.6
TBSV(P)-concanavalin A     68             3.4              58             3.6
TBSV(S)-concanavalin A     82             3.2              77             3.6

Notes:
(1) 'Observed' refers to coordinates from original investigator. 'Calculated' refers to
coordinates derived from the published stereodiagrams.
(2) Coordinates of concanavalin A were obtained from the GAPSOM Atlas
(Feldmann, 1975) and refer to the work of Hardman & Ainsworth (1972, 1973).
(3) (S) and (P) refer to the surface and protruding domains of the TBSV subunit
(Harrison et al., 1978).
`
`The ethics question
`
`The tradition of science is to gather and publish facts.
`Others may wish either to verify the facts by repeating
`the observations or to use these results to obtain a
`fundamental understanding of Nature in terms of a
`unifying concept or correlation. The accepted practice
is to extract information from the literature, acknowledge
its source, and to build upon it. The trend to
withhold coordinates appears to be at odds with this
long-standing tradition of scientific endeavor and
`exchange. Furthermore, coordinates are sometimes
`given only to close associates thus stifling a healthy
`public debate. Nevertheless, the present authors foresee
`that the technique published here may be considered a
`'sharp' practice by some, although it is only extracting
`information from publications. This is evidenced by
`resistance to suggestions that coordinates be deposited
`with the Brookhaven Data Bank upon publication of
`high-resolution structures (cf. Instructions to Authors
of the Journal of Biological Chemistry, 1979; Crystallography
of Molecular Biology, 1976).
`This situation appears to have arisen as many years
`of intensive effort are required by groups of scientists to
`determine tertiary structures. The researchers wish to
`announce their basic findings and yet withhold their
`quantitative results for some time in order to either (i)
`digest and utilize the coordinates for further scientific
`interpretations and thus reap the benefits of many years
`of effort or (ii) perhaps refine their atomic positions
`before release for the public domain to avoid erroneous
`deductions. Whichever is the case, it is clear that early
publications announcing crystallographically determined
structures often omit the detailed results. Hence,
if the deduction of coordinates from published stereodiagrams
represents a feat not intended by authors,
`controversy will obviously result.
`The question of accuracy may not generally be
`sufficient justification for withholding the quantitative
`results. Even quite inaccurate coordinates can give
`information on such topics as polypeptide topology and
`possible gene duplication. Authors need simply state
`
`what they have done and how they have arrived at their
`results. As far as possible, they should give limits of
`error. It is those who use the coordinates outside the
`limits of accuracy who are to be held culpable.
`However, this does not mitigate the first author's
`responsibility in only publishing stereodiagrams whose
`features can be regarded with reasonable confidence. It
`can hardly be Tycho Brahe's fault if others arrive at
`unacceptable concepts of the solar system.
`A fair and just solution to the problems raised here is
`imperative. Clearly the original author should always
be approached before resorting to the extraction
`technique given here. Furthermore, the source of the
`coordinates, whether obtained directly or from a
`stereodiagram, must always be stated as recognition of
`the degree of error. In any event the continued absence
`of coordinates and now perhaps even stereodiagrams
`can only result in retardation of scientific advancement.
`
`We would like to thank Sharon Wilder for help in the
preparation of the manuscript. The work was supported
by grants from the National Science Foundation
(No. PCM78-16584) and the National Institutes
`of Health (Nos. AI 11219 and GM 10704) to MGR
`and by a grant from the National Science Foundation
`(No. PCM77-20287) and a Faculty Research Award
`from the American Cancer Society (No. FRA 173) to
`PA.
`
References
ARGOS, P., TSUKIHARA, T. & ROSSMANN, M. G. (1980).
Submitted for publication.
Crystallography of Molecular Biology (1976). 'Ettore Majorana'
Centre for Scientific Culture, International School
of Crystallography, Course III, Erice, Trapani, p. 9.
FELDMANN, R. J. (1975). GAPSOM - Global Atlas of
Protein Structure on Microfiche. Division of Computer
Research and Technology, National Institutes of Health,
Bethesda, Maryland 20014.
HARDMAN, K. D. & AINSWORTH, C. F. (1972).
Biochemistry, 11, 4910-4919.
HARDMAN, K. D. & AINSWORTH, C. F. (1973).
Biochemistry, 12, 4442-4448.
HARRISON, S. C., OLSON, A. J., SCHUTT, C. E., WINKLER, F.
K. & BRICOGNE, G. (1978). Nature (London), 276,
368-373.
JOHNSON, C. K. (1970). ORTEP. Report ORNL-3794,
second revision. Oak Ridge National Laboratory,
Tennessee 37830.
Journal of Biological Chemistry (1979). 254, 1-11
(Instructions to Authors).
MATHEWS, F. S., LEVINE, M. & ARGOS, P. (1972). J. Mol.
Biol. 64, 449-464.
RAO, S. T. & ROSSMANN, M. G. (1973). J. Mol. Biol. 76,
241-256.
ROSSMANN, M. G. & ARGOS, P. (1975). J. Biol. Chem. 250,
7525-7532.
ROSSMANN, M. G. & ARGOS, P. (1978). Mol. Cell Biochem.
21, 161-182.