`
`Three-Dimensional Coordinates from Stereodiagrams of Molecular Structures
`
BY MICHAEL G. ROSSMANN AND PATRICK ARGOS

Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA

(Received 5 July 1979; accepted 15 October 1979)
`
`Abstract
`
A method is described for deriving three-dimensional
coordinates from stereodiagrams of molecular architecture.
The accuracy of the method is tested for cytochrome b5
(86 Cα atoms) and tomato bushy stunt virus (311 Cα atoms).
The coordinates were reconstructed to 1.9 Å and 2.6 Å
r.m.s. deviation of their original values, respectively.
The ethics of the procedure are discussed.
`
`Introduction
`
The publication and illustration of molecular detail
often takes the form of ball-and-stick representations in
stereodiagrams. This practice has been greatly aided by
Johnson's (1970) ORTEP program. These stereodiagrams
are frequently published by protein and
nucleic acid crystallographers long before the actual
coordinates are publicly distributed. This practice has
created an unusual situation where scientists publicly
present their results, but do not attempt to provide
sufficient information for others to use quantitatively
their 'published data'.
In this paper a technique is described with which the
three-dimensional coordinates can be determined from
stereodiagrams. This is followed by a discussion of the
ethics of non-publication of relevant data and the use of
the present technique to extract the missing information.
`
`The technique
`
A stereodrawing consists of two projections of a
three-dimensional object on to a plane. The object is
viewed from a given distance but is rotated by a small
angle +φ and −φ to create the left and right
projections. The viewing distance, v, is normally about
20 in (~0.5 m) or at infinity. The total angular
separation 2φ is usually about 5°, which is the average
angle subtended by the eyes at the normal focal plane.
Let an atom of the object be at position x, y, z relative
to a Cartesian axial frame.
`
Let the object be rotated by ±φ about the y axis and
viewed along z (Fig. 1).
Let the coordinates of the projected atom be at x_L, y_L
and x_R, y_R in the left and right stereodiagrams,
respectively.
If the viewing distance is at infinity, then

$$x_L = x\cos\varphi + z\sin\varphi, \qquad y_L = y,$$

and

$$x_R = x\cos\varphi - z\sin\varphi, \qquad y_R = y.$$

It follows that

$$x = \frac{x_L + x_R}{2\cos\varphi}, \qquad y = \frac{y_L + y_R}{2}, \qquad z = \frac{x_L - x_R}{2\sin\varphi}. \tag{1}$$
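The inversion for infinite viewing distance can be sketched in a few lines of Python. This is an illustration only, not the authors' program: the function names `project_infinite` and `reconstruct_infinite` are hypothetical, and the half-angle φ is assumed to be known and given in degrees.

```python
import math

def project_infinite(x, y, z, phi_deg):
    """Forward model: left and right projections for a viewer at infinity."""
    phi = math.radians(phi_deg)
    xl = x * math.cos(phi) + z * math.sin(phi)
    xr = x * math.cos(phi) - z * math.sin(phi)
    return (xl, y), (xr, y)

def reconstruct_infinite(xl, yl, xr, yr, phi_deg):
    """Invert the projections as in equations (1)."""
    phi = math.radians(phi_deg)
    x = (xl + xr) / (2.0 * math.cos(phi))
    y = (yl + yr) / 2.0
    z = (xl - xr) / (2.0 * math.sin(phi))  # depth comes from the small x difference
    return x, y, z
```

Because z is obtained by dividing a small difference by 2 sin φ (about 0.09 for φ = 2.5°), any measurement error in x_L − x_R is amplified roughly tenfold in depth.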
`
If, however, the viewing distance is not at infinity, an
object which has a length d and is in front of the
projection plane will appear to have a larger length D
(Fig. 2) within the plane, where

$$D = \frac{d}{1 - \dfrac{z}{v}}. \tag{2}$$
`
Fig. 1. Definition of coordinates in viewing a stereodiagram (left and right
views). The z axis is perpendicular to the page.
`© 1980 International Union of Crystallography
`
`
Fig. 2. An object of length d at distance z in front of the projection
plane, when viewed at a distance v from this plane, will appear to
have a projected length D, where D = d/[1 − (z/v)].
`
Since x_L, y_L, x_R and y_R are measured projected lengths
on the stereodiagram, their values must be corrected
with the expression (2). Hence, combining (1) and (2),
it is clear that

$$x = \frac{x_L + x_R}{2\cos\varphi}\left(1 - \frac{z}{v}\right), \qquad y = \frac{y_L + y_R}{2}\left(1 - \frac{z}{v}\right), \qquad z = \frac{x_L - x_R}{2\sin\varphi}\left(1 - \frac{z}{v}\right).$$

By solving these equations it will be found that

$$x = \frac{x_L + x_R}{2\cos\varphi}\, q, \qquad y = \frac{y_L + y_R}{2}\, q, \qquad z = \frac{x_L - x_R}{2\sin\varphi}\, q, \tag{3}$$

where

$$q = \frac{1}{1 + \dfrac{x_L - x_R}{2v\sin\varphi}}.$$
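A minimal Python sketch of expressions (3) follows, under stated assumptions: the function name `reconstruct` is hypothetical, φ is given in degrees, and v is expressed in the same measurement units as the projected coordinates. The tolerance used to detect the instability is an arbitrary illustrative choice.

```python
import math

def reconstruct(xl, yl, xr, yr, phi_deg, v=math.inf):
    """Recover (x, y, z) for one atom from its left/right projections.

    With v = math.inf the factor q is exactly 1 and the result
    reduces to equations (1).
    """
    phi = math.radians(phi_deg)
    depth = (xl - xr) / (2.0 * math.sin(phi))   # z for a viewer at infinity
    denom = 1.0 + depth / v
    if abs(denom) < 1e-12:
        # v is close to -(xl - xr) / (2 sin phi): the atom projects at infinity
        raise ValueError("unstable: viewing distance cancels the apparent depth")
    q = 1.0 / denom
    x = (xl + xr) / (2.0 * math.cos(phi)) * q
    y = (yl + yr) / 2.0 * q
    z = depth * q
    return x, y, z
```

Since q = 1 − z/v at the solution, every measured length is simply scaled back by the perspective factor of expression (2).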
`
As v tends to infinity, q will approach unity and
equations (3) and (1) become equivalent. Furthermore,
as v approaches −[(x_L − x_R)/2 sin φ] an instability is
reached since the atom will then project at infinity on
the viewing plane.
The expressions (3) can, however, be used only with
a knowledge of φ and v, parameters rarely supplied
with stereodiagrams. Accordingly some criterion is
necessary to determine these parameters; that is, the
depth measurements (along z) must be correlated to
their horizontal (along x and y) counterparts. A
reasonable criterion used for analyzing stereodiagrams
of polypeptide backbones is that all Cα-Cα distances
are of equal length. Additional criteria could use well
known constants of such objects as the α-helix or
β-pleated sheet; for example, the distance between
every fourth, eighth, etc. Cα atom within an α-helix.
The longer the depth measurements, the greater will be
the accuracy of the determination of φ and v, although
the Cα-Cα criterion has been found to be sufficient.
The criterion can be stated as requiring the minimum
value of

$$E = \sum_{i=1}^{N} (kR_i - S_i)^2, \tag{4}$$

where S_i is the anticipated distance between two atoms
(say Cα-Cα = 3.84 Å), R_i is the distance derived from
the stereodiagram (equations 3) in terms of the
measurement units, and k is a scale factor which relates
the units used in R to those used for S. The length of
each R_i will depend on the selected values of φ and v.
Thus, a two-dimensional search must be made for the
minimum of E in terms of these variables. Likely search
limits are 1.5° ≤ φ ≤ 6° and 10" ≤ v ≤ 50", with
suitable steps of Δφ = 0.25° and Δv = 5".
The best value of k for a given φ and v is given when
∂E/∂k = 0, that is,

$$k = \frac{\sum_{i=1}^{N} R_i S_i}{\sum_{i=1}^{N} R_i^2},$$

which can be used to evaluate E.
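The two-dimensional search can be sketched as an exhaustive grid scan. The following is an illustrative sketch rather than the authors' code. It assumes that the atoms are listed in chain order so that consecutive entries are bonded Cα atoms, that v is supplied in the same units as the measured coordinates, and that the names `reconstruct`, `residual` and `search` are hypothetical.

```python
import math

def reconstruct(xl, yl, xr, yr, phi, v):
    """Expressions (3) for one atom; phi in radians."""
    depth = (xl - xr) / (2.0 * math.sin(phi))
    q = 1.0 / (1.0 + depth / v)          # q -> 1 as v -> infinity
    return ((xl + xr) / (2.0 * math.cos(phi)) * q,
            (yl + yr) / 2.0 * q,
            depth * q)

def residual(left, right, phi, v, target=3.84):
    """Return (E, k) of expression (4) for consecutive-atom distances."""
    atoms = [reconstruct(xl, yl, xr, yr, phi, v)
             for (xl, yl), (xr, yr) in zip(left, right)]
    R = [math.dist(a, b) for a, b in zip(atoms, atoms[1:])]
    k = sum(r * target for r in R) / sum(r * r for r in R)   # from dE/dk = 0
    E = sum((k * r - target) ** 2 for r in R)
    return E, k

def search(left, right, phis_deg, vs, target=3.84):
    """Scan every (phi, v) pair; return (E, phi_deg, v, k) at the minimum."""
    best = None
    for phi_deg in phis_deg:
        for v in vs:
            E, k = residual(left, right, math.radians(phi_deg), v, target)
            if best is None or E < best[0]:
                best = (E, phi_deg, v, k)
    return best
```

A grid matching the limits quoted above would be `phis_deg = [1.5 + 0.25 * i for i in range(19)]`, with `vs` holding the viewing distances converted from inches to raster steps and, optionally, `math.inf`.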
Once φ and v have been found, the atomic
coordinates can be calculated from expressions (3).
However, the values of z will be subject to a good deal
of error, as they depend on the differences (x_L − x_R). In
contrast, x and y are essentially averages of the left and
right measurements. The accuracy in z can be regained
to some extent by invoking the same criteria as used in
the determination of φ and v. An iterative least-squares
procedure can be set up to adjust the coordinates so as
to equalize all Cα-Cα distances.
There may be a residual systematic error if the stereo
angle φ has been incorrectly estimated. If φ_C and φ_E
are the correct and estimated angles, respectively, then
it can be shown that the z parameters will be in
systematic error according to the ratio of (sin φ_C/sin φ_E)
or approximately φ_C/φ_E. For instance, if φ_C = 2.5° and
φ_E = 2.75°, then the molecule will be
compressed in the ratio of 0.91 along z. Hence, even a
small error in the determination of φ will produce a
significant systematic error in z. This is a consequence
of the lack of physical information inherently residing
in the small rotation angle between the left and right
images. Other systematic errors might arise in the
photographic reduction due to lens aberrations.
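The compression ratio quoted above follows directly from equations (1) for a viewer at infinity. In the short worked step below, the symbols z_C and z_E for the true and recovered depths are introduced here for illustration and are not the authors' notation. The measured difference is fixed by the true geometry, while the estimated angle is used to invert it:

$$x_L - x_R = 2 z_C \sin\varphi_C, \qquad z_E = \frac{x_L - x_R}{2\sin\varphi_E} = z_C\,\frac{\sin\varphi_C}{\sin\varphi_E}.$$

With the angles of the example,

$$\frac{\sin 2.5^\circ}{\sin 2.75^\circ} \approx \frac{0.04362}{0.04798} \approx 0.909 \approx \frac{2.5}{2.75},$$

which is the ratio of 0.91 stated in the text. The x and y coordinates are almost unaffected, since they depend on cos φ, which changes by less than 0.1% between the two angles.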
The steps in the procedure can be summarized as
follows.
(i) Determine x_L, y_L and x_R, y_R from the stereodiagram.
[One of these sets must now be rotated and
translated in order to minimize Σ(y_L − y_R)² to
compensate for any fortuitous rotation and translation
of the diagrams with respect to each other.]
(ii) Find the center of gravity of the left and the right
coordinate sets and refer x_L, y_L and x_R, y_R to these
origins. (One of the referees points out that some
improvement might be obtained by referring both
diagrams to the same origin and by considering the
translation factor as a further variable along with v and
φ.)
(iii) Search for the minimum in E (expression 4) to
obtain the angular separation φ and viewing distance v.
(iv) Compute x, y, z from expressions (3).
(v) Refine x, y and z given φ and v from (4).
In practice it has been found that some atomic
coordinates are particularly poor due to overlapping in
one or other of the projections. Such inaccurate
coordinates can interfere in the search procedure for φ
and v. However, these can readily be eliminated from
the search procedure by using the Cα-Cα criterion in
conjunction with reasonable test values of φ and v (e.g.
φ = 3°, v = ∞). The test must be applied immediately
preceding step (iii).
`
`Results
`
The procedure was tested by comparing 'calculated'
coordinates determined from published stereodiagrams
against the corresponding 'observed' sets obtained from
the original investigators. Two examples were chosen:
one easy, where each half-diagram could be readily
followed, and one difficult, where each monoprojection
contained many overlapping atoms and bonds. A
stereodiagram of cytochrome b5 (Mathews, Levine &
Argos, 1972) represented the easy example while
tomato bushy stunt virus (TBSV) protein subunit
(Harrison, Olson, Schutt, Winkler & Bricogne, 1978)
provided the difficult case. Original coordinates were
kindly supplied by Drs Scott Mathews and Steve
Harrison, respectively.
The stereodiagrams were photographed without
change of size. The resultant transparencies were
digitized on an Optronics film scanner with a 100 µm
raster. The optical densities were then listed on a line
printer where each density was represented by a single
character, but those below a given threshold were
shown simply as an asterisk. Thus, the output was
essentially binary where the bond lines of the original
stereodiagrams were easily recognizable on a much
enlarged scale. The molecular line drawing could then
be followed in most places. Consultation of the original
stereopair was able to resolve the remaining ambiguities.
The x and y coordinates of all atoms could then be read
in terms of raster steps.
In Table 1 are listed the results for the analysis of the
cytochrome b5 and TBSV stereodiagrams. Shown are
`
`
Table 2. Two-dimensional exploration of angular separation (φ) and viewing distance (v) for coordinates taken
from a cytochrome b5 stereo pair

                                        φ (°)
v (in)*   2.00   2.25   2.50   2.75   3.00   3.25   3.50   3.75   4.00   4.25
10        0.914  0.842  0.798  0.774  0.766  0.767  0.775  0.786  0.799  0.813
20        0.862  0.804  0.770  0.754  0.751  0.756  0.766  0.779  0.794  0.809
30        0.856  0.800  0.768  0.753  0.750  0.756  0.766  0.779  0.794  0.809
40        0.856  0.800  0.768  0.753  0.751  0.756  0.766  0.780  0.794  0.810
∞         0.862  0.806  0.774  0.758  0.755  0.760  0.770  0.782  0.797  0.811

Note: Numbers represent r.m.s. deviations in Å for calculated Cα-Cα distances from 3.84 Å.

* 1 in = 25.4 mm.
`
the r.m.s. deviation between the measured x_L and x_R
and the measured y_L and y_R coordinates. The latter pair
should be the same and thus give an estimate of the
error in the experimental determination of coordinates.
The former must be significantly different since
variation in the x coordinates gives the determination of
the unknown z parameters. Thus
p = [Σ(x_L − x_R)²/Σ(y_L − y_R)²]^{1/2} is a measure of the power of the
technique when applied to a given diagram. It will be
observed that the determinations of φ and v (Table 2)
give reasonable results when based only on the Cα-Cα
distances. Inclusion of α-helical parameters gave
essentially the same results. The r.m.s. deviation of all
Cα-Cα distances from 3.84 Å was improved from
0.78 to 0.29 Å for cytochrome b5 and from 1.16 to
0.53 Å for TBSV with the refinement of the x, y and z
parameters.
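The power measure p defined above is straightforward to evaluate from the two digitized coordinate sets. The helper below is a hypothetical illustration: the name `stereo_power` and the convention of returning infinity when the y measurements agree exactly are assumptions made here.

```python
import math

def stereo_power(left, right):
    """p = sqrt( sum (xL - xR)^2 / sum (yL - yR)^2 ) over matched atoms.

    left and right are equal-length lists of (x, y) measurements.
    """
    sx = sum((xl - xr) ** 2 for (xl, _), (xr, _) in zip(left, right))
    sy = sum((yl - yr) ** 2 for (_, yl), (_, yr) in zip(left, right))
    if sy == 0.0:
        return math.inf   # no measurable y disagreement at all
    return math.sqrt(sx / sy)
```

A large p means that the depth signal carried by the x differences stands well clear of the measurement noise estimated from the y differences.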
Comparison of the observed and calculated coordinates
was performed by a least-squares procedure (Rao
& Rossmann, 1973; Rossmann & Argos, 1975) which
obtains the best fit of two molecules in space. In this
case, the two molecules were the 'observed' (coordinates
from original investigator) and 'calculated' (coordinates
from stereodiagram) structures. While the z
parameters (depth) do have a systematic error due to a
small inaccuracy of estimating φ, no substantial error
was found (Table 1). The larger error in z for TBSV
reflects the larger molecular thickness along z so that
the systematic error will be greater at the extremities of
the molecule. A comparison of the original diagram of
cytochrome b5 (Mathews et al., 1972) and one drawn
from the calculated coordinates is shown in Fig. 3.
The TBSV calculated coordinates were tested for
their usefulness in showing structural equivalence
which involves the topological superposition of two or
more protein domains. Argos, Tsukihara & Rossmann
(1980) have recently suggested structural analogy
among the β-barrels comprising the two TBSV
domains and concanavalin A. The necessary methodology
has been reviewed by Rossmann & Argos
(1978). The intent here is to demonstrate the utility of
stereodiagram coordinates, even in a case as complex
Fig. 3. Comparison of the original stereodiagram (top) as published
by Mathews et al. (1972) with a stereodiagram drawn
from the calculated coordinates.
`
as TBSV, but not to discuss the functional and
evolutionary implications of these comparisons. Table
3 shows the number of topologically equivalenced Cα
atoms determined by using the observed and calculated
coordinates. While the calculated coordinates
gave somewhat fewer equivalences with a slightly higher
r.m.s. deviation, the same homologous secondary
structural spans were easily recognized.
`
`
`
`
Table 3. Analysis of the utility of the coordinates
derived from a stereodiagram of a TBSV subunit
`
`what they have done and how they have arrived at their
`results. As far as possible, they should give limits of
`error. It is those who use the coordinates outside the
`limits of accuracy who are to be held culpable.
However, this does not mitigate the first author's
`responsibility in only publishing stereodiagrams whose
`features can be regarded with reasonable confidence. It
`can hardly be Tycho Brahe's fault if others arrive at
`unacceptable concepts of the solar system.
`A fair and just solution to the problems raised here is
`imperative. Clearly the original author should always
be approached before resorting to the extraction
technique given here. Furthermore, the source of the
coordinates, whether obtained directly or from a
stereodiagram, must always be stated as recognition of
`the degree of error. In any event the continued absence
`of coordinates and now perhaps even stereodiagrams
`can only result in retardation of scientific advancement.
`
`We would like to thank Sharon Wilder for help in the
preparation of the manuscript. The work was supported
by grants from the National Science Foundation
(No. PCM78-16584) and the National Institutes
`of Health (Nos. AI 11219 and GM 10704) to MGR
`and by a grant from the National Science Foundation
`(No. PCM77-20287) and a Faculty Research Award
`from the American Cancer Society (No. FRA 173) to
`PA.
`
`References
`
ARGOS, P., TSUKIHARA, T. & ROSSMANN, M. G. (1980).
Submitted for publication.
Crystallography of Molecular Biology (1976). 'Ettore Majorana'
Centre for Scientific Culture, International School
of Crystallography, Course III, Erice, Trapani, p. 9.
FELDMANN, R. J. (1975). GAPSOM - Global Atlas of
Protein Structure on Microfiche. Division of Computer
Research and Technology, National Institutes of Health,
Bethesda, Maryland 20014.
HARDMAN, K. D. & AINSWORTH, C. F. (1972).
Biochemistry, 11, 4910-4919.
HARDMAN, K. D. & AINSWORTH, C. F. (1973).
Biochemistry, 12, 4442-4448.
HARRISON, S. C., OLSON, A. J., SCHUTT, C. E., WINKLER, F.
K. & BRICOGNE, G. (1978). Nature (London), 276,
368-373.
JOHNSON, C. K. (1970). ORTEP. Report ORNL-3794,
second revision. Oak Ridge National Laboratory,
Tennessee 37830.
Journal of Biological Chemistry (1979). 254, 1-11
(Instructions to Authors).
MATHEWS, F. S., LEVINE, M. & ARGOS, P. (1972). J. Mol.
Biol. 64, 449-464.
RAO, S. T. & ROSSMANN, M. G. (1973). J. Mol. Biol. 76,
241-256.
ROSSMANN, M. G. & ARGOS, P. (1975). J. Biol. Chem. 250,
7525-7532.
ROSSMANN, M. G. & ARGOS, P. (1978). Mol. Cell Biochem.
21, 161-182.
`
TBSV(S)-TBSV(P)
TBSV(P)-concanavalin A
TBSV(S)-concanavalin A
`
`68
`
`"
`"
`
`Number of
`equivalences
`
'Observed' coordinates
r.m.s.
deviation
(Å)
`J.8
`H
`J.2
`
'Calculated' coordinates
Number of equivalences
r.m.s. deviation (Å)
`63
`J.6
`J.6
`SS
`17
`J.6
`
Notes:
(1) 'Observed' refers to coordinates from original investigator. 'Calculated' refers to
coordinates derived from the published stereodiagrams.
(2) Coordinates of concanavalin A were obtained from the GAPSOM Atlas
(Feldmann, 1975) and refer to the work of Hardman & Ainsworth (1972, 1973).
(3) (S) and (P) refer to the surface and protruding domains of the TBSV subunit
(Harrison et al., 1978).
`
`The ethics question
`
`The tradition of science is to gather and publish facts.
`Others may wish either to verify the facts by repeating
`the observations or to use these results to obtain a
`fundamental understanding of Nature in terms of a
`unifying concept or correlation. The accepted practice
is to extract information from the literature, acknowledge
its source, and to build upon it. The trend to
withhold coordinates appears to be at odds with this
long-standing tradition of scientific endeavor and
exchange. Furthermore, coordinates are sometimes
given only to close associates thus stifling a healthy
public debate. Nevertheless, the present authors foresee
that the technique published here may be considered a
'sharp' practice by some, although it is only extracting
information from publications. This is evidenced by
resistance to suggestions that coordinates be deposited
with the Brookhaven Data Bank upon publication of
high-resolution structures (cf. Instructions to Authors
of the Journal of Biological Chemistry, 1979;
Crystallography of Molecular Biology, 1976).
`This situation appears to have arisen as many years
`of intensive effort are required by groups of scientists to
determine tertiary structures. The researchers wish to
`announce their basic findings and yet withhold their
`quantitative results for some time in order to either (i)
`digest and utilize the coordinates for further scientific
`interpretations and thus reap the benefits of many years
`of effort or (ii) perhaps refine their atomic positions
`before release for the public domain to avoid erroneous
`deductions. Whichever is the case, it is clear that early
publications announcing crystallographically determined
structures often omit the detailed results. Hence,
if the deduction of coordinates from published
stereodiagrams represents a feat not intended by authors,
`controversy will obviously result.
`The question of accuracy may not generally be
`sufficient justification for withholding the quantitative
`results. Even quite inaccurate coordinates can give
`information on such topics as polypeptide topology and
`possible gene duplication. Authors need simply state