`
The Protein Data Bank: A Computer-based Archival File for
Macromolecular Structures

The Protein Data Bank is a computer-based archival file for macromolecular
structures. The Bank stores in a uniform format atomic co-ordinates and partial
bond connectivities, as derived from crystallographic studies. Text included in
each data entry gives pertinent information for the structure at hand (e.g.
species from which the molecule has been obtained, resolution of diffraction
data, literature citations and specifications of secondary structure). In addition
to atomic co-ordinates and connectivities, the Protein Data Bank stores structure
factors and phases, although these latter data are not placed in any uniform
format. Input of data to the Bank and general maintenance functions are carried
out at Brookhaven National Laboratory. All data stored in the Bank are available
on magnetic tape for public distribution, from Brookhaven (to laboratories in
the Americas), Tokyo (Japan), and Cambridge (Europe and worldwide). A
master file is maintained at Brookhaven and duplicate copies are stored in
Cambridge and Tokyo. In the future, it is hoped to expand the scope of the
Protein Data Bank to make available co-ordinates for standard structural
types (e.g. α-helix, RNA double-stranded helix) and representative computer
programs of utility in the study and interpretation of macromolecular structures.
`
The Protein Data Bank† (1971,1973) was established in 1971 as a computer-based
archival file for macromolecular structures. The purpose of the Bank is to collect,
standardize, and distribute atomic co-ordinates and other data from crystallographic
studies. As the number of solved protein and nucleic acid structures has grown to
the point where some 10^7 characters are necessary to represent the co-ordinate
information currently held, the need for such a computer-readable file has become
very clear, and demands for the Bank's services have increased accordingly. The
Protein Data Bank is one of several data base activities in the field of crystallography,
e.g. the Bibliographic (Kennard et al., 1972) and Structural (Allen et al., 1973)
Data Files for organic and organometallic compounds, the Atlas of Macromolecular
Structure on Microfiche (AMSOM) (Feldmann, 1977), the Bond Index to the
Determination of Inorganic Crystal Structures (BIDICS)‡ and the Powder Diffraction File.§
`
`(a) Scope
The Protein Data Bank covers atomic co-ordinates, structure factors and phases
from diffraction studies of macromolecules. Since most of this information is not
generally published in the primary literature, the Bank depends for comprehensiveness
on data supplied directly by the investigators. It is essentially a depository of
data, held in computer-readable form, in contrast to other data banks that are based
`
† Protein Data Bank is a misnomer of historical origin, since the file now contains
entries for a nucleic acid.
‡ I. D. Brown, Bond Index to the Determination of Inorganic Crystal Structures,
McMaster University, Hamilton, Ontario, Canada, L8S 4M1.
§ American Society for Testing Materials, 1916 Race St., Philadelphia, PA. 19103, U.S.A.
`
`
`BI Exhibit 1080
`
`
`
`
`F. C. BERNSTEIN ET AL.
TABLE 1
Protein Data Bank holdings

MOLECULE                                    DEPOSITOR                   STATUS  IDENT
                                                                        CODE    CODE

ADENYLATE KINASE                            G. SCHULZ                           1ADK
ALCOHOL DEHYDROGENASE (ADP-RIB)             C.-I. BRANDEN                       1ADH
ALCOHOL DEHYDROGENASE (ORTHOPHEN)           C.-I. BRANDEN                       2ADH
ALPHA-CHYMOTRYPSIN (TOSYL)                  D. BLOW                             2CHA
ALPHA-CHYMOTRYPSIN                          A. TULINSKY                         3CHA
ANTIGEN BINDING FRAGMENT (NEW)              R. POLJAK                           1FAB
BENCE-JONES IMMUNOGLOBULIN REI              O. EPP, R. HUBER                    1REI
CALCIUM-BINDING PARVALBUMIN SET 6A          R. KRETSINGER                       1CPV
CALCIUM-BINDING PARVALBUMIN SET 6H          R. KRETSINGER                       2CPV
CALCIUM-BINDING PARVALBUMIN SET 6I          R. KRETSINGER                       3CPV
CARBONIC ANHYDRASE B                        K. KANNAN                           1CAB
CARBONIC ANHYDRASE C                        K. KANNAN                           1CAC
CARBOXYPEPTIDASE A                          W. LIPSCOMB                         1CPA
CHYMOTRYPSINOGEN                            J. KRAUT                            1CHG
CONCANAVALIN A                              G. REEKE, G. EDELMAN                2CNA
CONCANAVALIN A                              K. HARDMAN                          3CNA
CYTOCHROME B5                               F. S. MATHEWS                       1B5C
CYTOCHROME C (ALBACORE, OXIDIZED)           R. DICKERSON                        1CYT
CYTOCHROME C (ALBACORE, REDUCED)            R. DICKERSON                        2CYT
CYTOCHROME C (BONITO, HEART)                M. KAKUDO                           1CYC
CYTOCHROME C2                               J. KRAUT                            1C2C
CYTOCHROME C550                             R. TIMKOVICH                        155C
ELASTASE                                    H. WATSON                           1EST
FERREDOXIN                                  L. JENSEN                           1FDX
FLAVODOXIN (CLOSTRIDIUM MP)                 M. LUDWIG                           1FXN
GAMMA-CHYMOTRYPSIN                          COHEN, DAVIES, SILVERTON            1GCH
GLYCERALDEHYDE-3-P-DEHYDROGENASE (LOBSTR)   M. ROSSMANN                         1GPD
HEMOGLOBIN (HORSE, AQUO MET)                LADNER, HEIDNER, PERUTZ     RP      2MHB
HEMOGLOBIN (HORSE, DEOXY)                   M. PERUTZ, G. FERMI                 1DHB
HEMOGLOBIN (HUMAN, DEOXY)                   M. PERUTZ, G. FERMI                 1HHB
HEMOGLOBIN (HUMAN, FETAL, DEOXY)            J. FRIER                            1FDH
HEMOGLOBIN (LAMPREY)                        W. HENDRICKSON                      1LHB
HEXOKINASE (YEAST) B III                    T. STEITZ                           1YHX
HIGH POTENTIAL IRON PROTEIN                 J. KRAUT                            1HIP
LACTATE DEHYDROGENASE                       M. ROSSMANN                         2LDH
LACTATE DEHYDROGENASE/NAD/PYRUVATE          M. ROSSMANN                         3LDH
LYSOZYME (HEN EGG-WHITE, SET W2)            R. DIAMOND                          1LYZ
LYSOZYME (HEN EGG-WHITE, SET RS5D)          R. DIAMOND                          2LYZ
LYSOZYME (HEN EGG-WHITE, SET RS6A)          R. DIAMOND                          3LYZ
LYSOZYME (HEN EGG-WHITE, SET RS9A)          R. DIAMOND                          4LYZ
LYSOZYME (HEN EGG-WHITE, SET RS12A)         R. DIAMOND                          5LYZ
LYSOZYME (HEN EGG-WHITE, SET RS16)          R. DIAMOND                          6LYZ
MALATE DEHYDROGENASE                        L. BANASZAK                         1MDH
MYOGLOBIN (SPERM WHALE)                     H. WATSON                           1MBN
MYOGLOBIN (SPERM WHALE, MET)                T. TAKANO                           2MBN
MYOGLOBIN (SPERM WHALE, DEOXY)              T. TAKANO                           3MBN
PANCREATIC TRYPSIN INHIBITOR                R. HUBER                            3PTI
PAPAIN, NATIVE                              J. DRENTH                           8PAP
PAPAIN (ACE-ALA-ALA-PHE-ALA, CYS-25)        J. DRENTH                           2PAP
PAPAIN (CYS DERIV OF CYS-25)                J. DRENTH                           3PAP
PAPAIN (OXIDIZED CYS-25)                    J. DRENTH                           4PAP
PAPAIN (TOS-LYS, CYS-25)                    J. DRENTH                           5PAP
PAPAIN (BZOXY-GLY-PHE-GLY, CYS-25)          J. DRENTH                           6PAP
PAPAIN (BZOXY-PHE-ALA, CYS-25)              J. DRENTH                           7PAP
PHOSPHOGLYCERATE KINASE (YEAST)             H. WATSON                           1PGK
PHOSPHOGLYCERATE KINASE (HORSE)             P. EVANS, D. PHILLIPS               2PGK
PREALBUMIN (HUMAN, PLASMA)                  S. OATLEY, D. PHILLIPS              1PAB
RIBONUCLEASE S                              H. WYCKOFF                          1RNS
RUBREDOXIN                                  L. JENSEN                           2RXN
STAPHYLOCOCCAL NUCLEASE                     F. A. COTTON, E. HAZEN              1SNS
STREPTOMYCES GRISEUS PROTEINASE B           M. JAMES                            1SGB
SUBTILISIN BPN'                             J. KRAUT                            1SBT
SUBTILISIN NOVO                             J. DRENTH                           2SBT
SUPEROXIDE DISMUTASE                        J. AND D. RICHARDSON                1SOD
THERMOLYSIN (UNREFINED)                     B. MATTHEWS                         1TLN
THERMOLYSIN (REFINED)                       B. MATTHEWS                         2TLN
THIOREDOXIN                                 B.-O. SODERBERG                     1SRX
TRANSFER RNA (YEAST, PHE)                   J. SUSSMAN, S.-H. KIM               1TNA
TRANSFER RNA (YEAST, PHE)                   M. SUNDARALINGAM                    2TNA
TRANSFER RNA (YEAST, PHE)                   JACK, LADNER, KLUG                  3TNA
TRIOSE PHOSPHATE ISOMERASE                  I. WILSON, D. PHILLIPS              1TIM
TRYPSIN (NATIVE, PH8)                       FEHLHAMMER, BODE, SCHWAGER  N       1PTN
TRYPSIN (BENZAMIDINE INHIBITED, PH7)        FEHLHAMMER, BODE, SCHWAGER  RN      2PTB
TRYPSIN/TRYPSIN INHIBITOR COMPLEX           BODE ET AL.                         1PTC

STATUS CODE entries whose row alignment is lost in this copy, in printed order:
A, R, N, R, B, PD, PD, P, P, P, P, P, P, A, R, R, P, N, N, A, B, ND, A, A, A, N, P, P

STATUS CODES

BLANK   STANDARD ENTRY AVAILABLE FOR DISTRIBUTION
A       ALPHA CARBON ATOMS ONLY
B       BACKBONE ONLY
D       NEW DATA HAS BEEN PROMISED
N       NEW ENTRY WITH DEPOSITOR FOR APPROVAL
P       IN PREPARATION
R       REPLACES AN OUT OF DATE PARAMETER SET
`
`LETTERS TO THE EDITOR
`
`
on data abstracted from scientific publications. The Bank contains 77 atomic
co-ordinate entries for 47 macromolecules (Table 1),† and 13 sets of structure factors and
phases. The atomic co-ordinate entries, which include descriptive text and partial
bond connectivities, conform to a uniform format (see below), but the structure
factors and phases are stored in the format received from depositors. All co-ordinate
entries are referred to depositors for verification, before being made available publicly
through the Bank.
`
(b) Record structure of atomic co-ordinate entries
Atomic co-ordinate entries consist of records each of 80 characters.‡ Using the
punched card analogy, columns 1 to 6 contain a record type identifier, and columns
7 to 70 contain data.§ Columns 71 to 80 are normally blank, but may contain sequence
information which is added by the library-file management program UPDATE¶
used to maintain the file on the Brookhaven CDC CYBER 70/76 computing system.
In order to facilitate retrieval of data from the file, the first four characters of each
record define the unique record type, and the syntax of each record is independent of
the order of records within any entry for a particular macromolecule. (In the master
file, this order is always fixed.) Atomic co-ordinate data contributed by depositors
are processed into the standard format with program MACMOL,‖ which also subjects
the data to certain nomenclature and connectivity checking procedures.
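
The column layout described above can be illustrated with a short Python sketch. It is not part of the Bank's software: the names `Record`, `split_record` and `group_by_type` are hypothetical, and the sample records are only modelled on the ribonuclease S entry. The sketch assumes nothing beyond the three column ranges and the four-character type key stated in the text.

```python
from typing import Dict, Iterable, List, NamedTuple


class Record(NamedTuple):
    record_type: str  # columns 1-6, record type identifier
    data: str         # columns 7-70, record data
    sequence: str     # columns 71-80, normally blank


def split_record(line: str) -> Record:
    """Split one record into the three column ranges of the 80-character format."""
    padded = line.rstrip("\n").ljust(80)[:80]  # pad short lines, drop any overflow
    return Record(padded[0:6].rstrip(), padded[6:70].rstrip(), padded[70:80].strip())


def group_by_type(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect data fields keyed on the first four characters of each record.

    Because the key is carried by every record, the grouping does not
    depend on the order in which the records arrive.
    """
    groups: Dict[str, List[str]] = {}
    for line in lines:
        rec = split_record(line)
        groups.setdefault(rec.record_type[:4], []).append(rec.data)
    return groups


# Hypothetical sample records, for illustration only.
sample = [
    "ORIGX2      0.000000  1.000000  0.000000        0.000000",
    "COMPND    RIBONUCLEASE-S (E.C. 3.1.4.22)",
    "ORIGX1      1.000000  0.000000  0.000000        0.000000",
    "HEADER".ljust(70) + "1RNS   1",
]
groups = group_by_type(sample)
```

Keying on four characters groups ORIGX1 and ORIGX2 under `ORIG`; whether a retrieval program wants that coarser key or the full six-column identifier is a design choice left open here.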
A sample partial entry for the protein ribonuclease S is shown in Table 2.†† The
unique code 1RNS identifying this entry is given in the HEADER record, along with
the date these data were entered into the Bank, and a provisional classification based
on function, intended for future use in indexing and subdividing the file. Text giving
the name of molecule, species from which it has been obtained, authors, literature
citations, and other general description are presented in records COMPND through
REMARK. SEQRES gives the amino acid sequence, and FTNOTE records are
footnotes keyed to particular residues or atoms. Records HELIX through TURN
describe the secondary structure as stated or approved by the depositor. Record
CRYST1 defines the unit cell, while ORIGX and SCALE respectively give
transformations relating the orthogonal Ångström co-ordinates stored in the file to those
originally supplied by the depositor (these frequently are referred to an oblique or
non-isometric system) and to standard crystallographic fractional co-ordinates.
ATOM records give the IUPAC-IUB (1969) standard atom names (IUPAC-IUB,
1970), and residue abbreviations (IUPAC-IUB, 1971), along with sequence identifiers
(cf. SEQRES, above), co-ordinates in Ångström units, and occupancies and thermal
† In addition to current co-ordinate entries shown in Table 1, the Bank contains obsolete
entries (for adenylate kinase, tosyl α-chymotrypsin, concanavalin A, lactate dehydrogenase, horse
methemoglobin, papain, rubredoxin, benzamidine-inhibited trypsin and pancreatic trypsin
inhibitor), which have been superseded by later, more accurate data. These obsolete data are
available on special request.
‡ Originally, the Bank used a 140-character format, similar to that employed in the protein
refinement programs of Diamond (1966,1971). The 140-character format has been superseded by
the 80-character format.
§ A detailed description of the file formats is available from Brookhaven on request.
¶ Control Data Corporation, UPDATE Reference Manual, Publication No. 60342500, Control
Data Corporation, Arden Hills, Minnesota, 1974.
‖ G. J. B. Williams, unpublished. For the 140-character data, program PROIN by E. F. Meyer
was utilized.
†† The file is organized in a similar way for proteins and nucleic acids, although certain differences
exist, e.g. with regard to details of atom and residue names.
`
`
TABLE 2
Abbreviated sample atomic co-ordinate entry (ribonuclease S)

HEADER    HYDROLASE (PHOSPHORIC DIESTER, RNA)                         1RNS
COMPND    RIBONUCLEASE-S (E.C. 3.1.4.22)
SOURCE    BOVINE (BOS TAURUS) PANCREAS
AUTHOR    F. M. RICHARDS AND H. W. WYCKOFF
JRNL      R. J. FLETTERICK AND H. W. WYCKOFF, PRELIMINARY REFINEMENT
JRNL      OF PROTEIN COORDINATES IN REAL SPACE, ACTA CRYST., VOL. A31,
JRNL      P698 (1975).
REMARK   1
REMARK   1 REFERENCE 1. F. M. RICHARDS AND H. W. WYCKOFF, ATLAS OF
REMARK   1 STRUCTURES FOR MOLECULAR BIOLOGY, VOL. 1. RIBONUCLEASE-S,
REMARK   1 CLARENDON PRESS (1973).
REMARK   1 REFERENCE 2. F. M. RICHARDS AND H. W. WYCKOFF, BOVINE
REMARK   1 PANCREATIC RIBONUCLEASE, THE ENZYMES, EDITED BY P. D.
REMARK   1 BOYER, VOL. IV, THIRD EDITION, P647, ACADEMIC PRESS (1971).
REMARK   1 REFERENCE 3. F. M. RICHARDS, H. W. WYCKOFF, W. D. CARLSON,
REMARK   1 N. M. ALLEWELL, B. LEE AND Y. MITSUI, PROTEIN STRUCTURE,
REMARK   1 RIBONUCLEASE-S AND NUCLEOTIDE INTERACTIONS, COLD SPRING
REMARK   1 HARBOR SYMPOSIA ON QUANTITATIVE BIOLOGY, VOL. XXXVI, P35
REMARK   1 (1971).
REMARK   1 REFERENCE 4. N. M. ALLEWELL AND H. W. WYCKOFF,
REMARK   1 CRYSTALLOGRAPHIC ANALYSIS OF THE INTERACTION OF CUPRIC
REMARK   1 ION WITH RIBONUCLEASE S, J. BIOL. CHEM., VOL. 246, P4657
REMARK   1 (1971).
REMARK   1 REFERENCE 5. H. W. WYCKOFF, D. TSERNOGLOU, A. W. HANSON,
REMARK   1 J. R. KNOX, B. LEE AND F. M. RICHARDS, THE THREE
REMARK   1 DIMENSIONAL STRUCTURE OF RIBONUCLEASE-S. INTERPRETATION
REMARK   1 OF AN ELECTRON DENSITY MAP AT A NOMINAL RESOLUTION OF 2
REMARK   1 ANGSTROMS, J. BIOL. CHEM., VOL. 245, P305 (1970).
REMARK   1 REFERENCE 6. H. W. WYCKOFF, K. D. HARDMAN, N. M. ALLEWELL,
REMARK   1 T. INAGAMI, D. TSERNOGLOU, L. N. JOHNSON AND F. M.
REMARK   1 RICHARDS, THE STRUCTURE OF RIBONUCLEASE-S AT 6 ANGSTROM
REMARK   1 RESOLUTION, J. BIOL. CHEM., VOL. 242, P3749 (1967).
REMARK   2
REMARK   2 RESOLUTION. 2.0 ANGSTROMS.
REMARK   3
REMARK   3 REFINEMENT. BY A STEEPEST-DESCENTS PROCEDURE. REFER TO THE
REMARK   3 JRNL CITATION ABOVE.
REMARK   4
REMARK   4 THIS COORDINATE SET IS DESIGNATED ?0 BY THE DEPOSITOR.
REMARK   5
REMARK   5 THE *S-PEPTIDE* (RESIDUES 1-20) WHICH FORMS A SEPARATE
REMARK   5 CHAIN FROM THE REMAINDER OF THE MOLECULE IS GIVEN THE
REMARK   5 CHAIN IDENTIFIER S.
SEQRES   1 S   20  LYS GLU THR ALA ALA ALA LYS PHE GLU ARG GLN HIS MET
SEQRES   2 S   20  ASP SER SER THR SER ALA ALA
SEQRES   1    104  SER SER SER ASN TYR CYS ASN GLN MET MET LYS SER ARG
SEQRES   2    104  ASN LEU THR LYS ASP ARG CYS LYS PRO VAL ASN THR PHE
SEQRES   3    104  VAL HIS GLU SER LEU ALA ASP VAL GLN ALA VAL CYS SER
SEQRES   4    104  GLN LYS ASN VAL ALA CYS LYS ASN GLY GLN THR ASN CYS
SEQRES   5    104  TYR GLN SER TYR SER THR MET SER ILE THR ASP CYS ARG
SEQRES   6    104  GLU THR GLY SER SER LYS TYR PRO ASN CYS ALA TYR LYS
SEQRES   7    104  THR THR GLN ALA ASN LYS HIS ILE ILE VAL ALA CYS GLU
SEQRES   8    104  GLY ASN PRO TYR VAL PRO VAL HIS PHE ASP ALA SER VAL
FTNOTE   1
FTNOTE   1 THE MAIN CHAIN AND MOST OF THE ASSOCIATED SIDE CHAINS ARE
FTNOTE   1 NOT WELL-DEFINED IN THE REGIONS OF RESIDUES 2, 65-72 AND
FTNOTE   1 119-123.
FTNOTE   2
FTNOTE   2 THE MAIN CHAIN IS VERY POORLY DEFINED OR NOT VISIBLE AT ALL
FTNOTE   2 IN THE ELECTRON DENSITY MAP IN THE REGIONS OF RESIDUES 1,
FTNOTE   2 18-20, 21-23 AND 124.
HELIX    1 H1 THR S   3  MET S  13  1
HELIX    2 H2 ASN    24  ASN    ??  1
HELIX    3 H3 SER    50  ALA    56  1
SHEET    1 S1 3 LYS    41  HIS    48  0
SHEET    2 S1 3 MET    79  THR    87 -1  N  ASN    44  O  CYS    84
SHEET    3 S1 3 ALA    96  LYS   104 -1  N  ASP    83  O  THR   100
SHEET    1 S2 4 LYS    61  ALA    64  0
SHEET    2 S2 4 ASN    71  SER    75 -1  N  VAL    63  O  CYS    72
SHEET    3 S2 4 HIS   105  GLU   111 -1  N  TYR    73  O  VAL   108
SHEET    4 S2 4 VAL   116  VAL   124 -1  O  ALA   109  N  VAL   118
TURN     1 T1 VAL    54  VAL    57     PSEUDO 3/10 HELIX
TURN     2 T2 ALA    56  SER    59     PSEUDO 3/10 HELIX
TURN     3 T3 CYS    65  GLY    68     BETW STRANDS 1,2 OF SHEET S2
TURN     4 T4 THR    87  SER    90     END OF STRAND 2 OF SHEET S1
CRYST1   44.650   44.650   97.150  90.00  90.00 120.00 P 31 2 1
ORIGX1      1.000000  0.000000  0.000000        0.000000
ORIGX2      0.000000  1.000000  0.000000        0.000000
ORIGX3      0.000000  0.000000  1.000000        0.000000
SCALE1       .022396   .012931  0.000000        0.000000
SCALE2      0.000000   .025861  0.000000        0.000000
SCALE3      0.000000  0.000000   .010293        0.000000
ATOM      1  N   LYS S   1     -15.394   7.914  20.202  1.00  0.00   2
ATOM      2  CA  LYS S   1     -15.145   7.636  18.730  1.00  0.00   2
ATOM      3  C   LYS S   1     -14.982   6.107  18.763  1.00  0.00   2
ATOM      4  O   LYS S   1     -15.145   5.351  19.732  1.00  0.00   2
ATOM      5  CB  LYS S   1     -13.872   9.24?  18.185  1.00  0.00   2
ATOM      6  CG  LYS S   1     -12.693   7.65?  18.794  1.00  0.00   2
.
.
.
ATOM    927  N   ASP   121      -6.795  -9.247   7.034  1.00  0.00   1
ATOM    928  CA  ASP   121      -5.813  -9.425   5.935  1.00  0.00   1
ATOM    929  C   ASP   121      -6.217 -10.156   4.789  1.00  0.00   1
ATOM    930  O   ASP   121      -5.828  -9.850   3.652  1.00  0.00   1
ATOM    931  CB  ASP   121      -?.529 -10.01?   6.6?8  1.00  0.00   1
ATOM    932  CG  ASP   121      -3.471  -?.503   5.687  1.00  0.00   1
ATOM    933  OD1 ASP   121      -3.320  -8.082   5.636  1.00  0.00   1
ATOM    934  OD2 ASP   121      -2.718 -10.333   4.799  1.00  0.00   1
ATOM    935  N   ALA   122      -7.049 -11.201   5.013  1.00  0.00   1
ATOM    936  CA  ALA   122      -7.865 -12.086   4.084  1.00  0.00   1
ATOM    937  C   ALA   122      -8.554 -13.331   4.724  1.00  0.00   1
ATOM    938  O   ALA   122      -8.495 -13.636   5.925  1.00  0.00   1
ATOM    939  CB  ALA   122      -6.991 -12.510   2.881  1.00  0.00   1
ATOM    940  N   SER   123      -8.885 -13.915   3.717  1.00  0.00   1
ATOM    941  CA  SER   123      -9.758 -15.155   3.627  1.00  0.00   1
ATOM    942  C   SER   123      -8.915 -16.127   2.880  1.00  0.00   1
ATOM    943  O   SER   123      -8.372 -1?.?12   1.?10  1.00  0.00   1
ATOM    944  CB  SER   123     -10.877 -14.659   2.597  1.00  0.00   1
ATOM    945  OG  SER   123     -10.157 -14.035   1.530  1.00  0.00   1
ATOM    946  N   VAL   124      -8.845 -11.415   3.439  1.00  0.00   2
ATOM    947  CA  VAL   124      -8.591 -18.490   2.596  1.00  0.00   2
ATOM    948  C   VAL   124      -9.235 -18.381   1.209  1.00  0.00   2
ATOM    949  O   VAL   124      -?.580 -11.135    .377  1.00  0.00   2
ATOM    950  CB  VAL   124      -8.937 -19.929   3.162  1.00  0.00   2
ATOM    951  CG1 VAL   124      -9.135 -20.905   2.012  1.00  0.00   2
ATOM    952  CG2 VAL   124      -7.784 -20.573   4.226  1.00  0.00   2
ATOM    953  OXT VAL   124     -10.419 -19.165   1.046  1.00  0.00   2
TER     954      VAL   124
CONECT  196  195  644
CONECT  312  311  729
CONECT  448  447  844
CONECT  498  497  549
CONECT  549  498  548
CONECT  644  196  643
CONECT  729  312  728
CONECT  844  448  843
MASTER       36   1?    0    3    7    4    0    6  952    2    8   10
END
motion factors, if these latter data are provided. Within each residue, atoms are
ordered in a standard manner, starting with the backbone (N-Cα-C-O) and
proceeding in increasing remoteness from the alpha carbon atom along the side-chain.
Cα has been encoded CA, Cβ as CB, etc. Where the sequence is known, but atoms have
not been located in the structure analysis, gaps have been left in the atom serial
numbers, to allow for future insertion. A TER record denotes an explicit chain
terminating residue. CONECT records give bond connectivity, for all atoms where
the covalent connectivity is not specified completely by the atom name and order of
serial numbers within the entry (i.e. the primary structure of standard residues).
CONECT records may also be used to denote hydrogen bond and salt bridge
interactions. Each entry is terminated with a MASTER record, which gives checksums of
the number of records, broken down by record type, and an END of data record.
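
The relation between the CRYST1 cell and the SCALE transformation mentioned above can be checked numerically. The Python sketch below builds the matrix that takes orthogonal Ångström co-ordinates to fractional co-ordinates from the six cell parameters. The axis convention (x along a, y in the a-b plane) is an assumption not stated in the text, and the function names are hypothetical. For the ribonuclease S cell (a = b = 44.650 Å, c = 97.150 Å, γ = 120°) the result agrees with the legible SCALE values of the sample entry to the six decimals printed.

```python
import math


def orthogonal_to_fractional(a, b, c, alpha, beta, gamma):
    """Return the 3x3 matrix mapping orthogonal Angstrom co-ordinates to fractional.

    Assumed convention (not given in the text): x along a, y in the a-b plane.
    Cell edges are in Angstroms, cell angles in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Unit-cell volume divided by a*b*c.
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [1.0 / a, -cg / (a * sg), (ca * cg - cb) / (a * v * sg)],
        [0.0, 1.0 / (b * sg), (cb * cg - ca) / (b * v * sg)],
        [0.0, 0.0, sg / (c * v)],
    ]


def transform(matrix, xyz):
    """Apply a 3x3 matrix to one co-ordinate triple."""
    return tuple(sum(m * x for m, x in zip(row, xyz)) for row in matrix)


# Cell of the ribonuclease S sample entry (space group P 31 2 1).
scale = orthogonal_to_fractional(44.650, 44.650, 97.150, 90.0, 90.0, 120.0)

# First ATOM record of the sample entry, converted to fractional co-ordinates.
frac = transform(scale, (-15.394, 7.914, 20.202))
```

Recomputing the matrix in this way gives a simple consistency check between the CRYST1 and SCALE records of an entry; it is offered as an illustration, not as a description of the checks performed by MACMOL.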
`
(c) Services
The activities of the Protein Data Bank are to collect and standardize data from
laboratories engaged in the analysis of macromolecular structures, and to distribute
these data within the scientific community. As a service to depositors, data are
checked for errors of a clerical nature but no exhaustive verification is attempted.
The Data Bank is located at Brookhaven National Laboratory and input of data to
the Bank and general maintenance functions are carried out at Brookhaven.†
`
TABLE 3
Protein Data Bank activities

        Co-ordinate     Co-ordinate entries distributed        Laboratories receiving data
        entries held
Year    at end of year  Brookhaven  Cambridge  Tokyo  Total    Brookhaven  Cambridge  Tokyo  Total

1973         16             106         30              136        14          2               16
1974         17              99        102              201        14          6               20
1975         40             513        241              754        31          6               37
1976         69            1920        600      246    2766        47         14         9     70
`
Duplicate copies of the Brookhaven master file are maintained at Cambridge and
Tokyo. Data are available on magnetic tape for public distribution, from Brookhaven‡
(to laboratories in the Americas), Tokyo (Japan), and Cambridge (Europe and
worldwide). The data are also available in a limited way within the United States over the
Crystallographic Computing Network (Koetzle et al., 1975). Retrieval programs to
access data in the file have been described (Meyer, 1974; M. Tasumi, personal
communication). These efforts represent modest initial attempts at interactive access:
up to the present the Protein Data Bank has served almost exclusively the demands of
off-line users. A statistical summary of the Bank's activities is given in Table 3.
These statistics show rapid growth, both in number of holdings and in requests filled.
`
† Details of operations are announced periodically in a Newsletter (current issue is number 3,
November 1976). Copies may be obtained from Brookhaven.
‡ Requests should be accompanied with a new 2400 ft reel of magnetic tape, and a check or
purchase order for U.S. $34.30 made to the order of Brookhaven National Laboratory, to cover
postage and handling. This charge is subject to change in the future.
`
`
(d) Future developments
In the future, it is hoped to expand the scope of the Protein Data Bank to make
available co-ordinates for standard structural types (e.g. α-helix, RNA
double-stranded helix) and representative computer programs of utility in the study of
macromolecular structures. A small number of programs will be written to calculate
useful quantities derived from the co-ordinates (namely torsion angles, wire-model
bender angles, full covalent connectivities, etc.). In addition to this software, the
Bank will undertake to distribute contributed programs, provided that documentation
is deposited in machine-readable form with source code.† In order to facilitate use of
atomic co-ordinate data to assemble models from standardized parts, it is intended
to offer a "model-builder's kit", to consist of header information, a compact list of
co-ordinates to 0.1 Å precision, and torsion angles. This kit would be distributed
principally in the form of printed listings, but also would be available on magnetic tape.
As the size of the Bank grows, possibilities for wide interactive access to the atomic
co-ordinate data via a network will become more attractive. This type of access would
increase the file's utility, particularly to those outside the fields of crystallography
and analysis of protein and nucleic acid conformations, by removing the necessity
for repeated development of retrieval programs. For example, graphics terminals
could be used to draw pictures of macromolecules for research or instructional purposes.
Network access will be a topic of continued exploration in the next few years.
Suggestions regarding possible future improvements in Protein Data Bank services
will be appreciated by the authors.
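
As an indication of the kind of derived quantity mentioned above, a torsion angle can be computed from four sets of orthogonal co-ordinates in a few lines. The Python sketch below is an illustration only and is not one of the programs planned by the Bank; the function name `torsion_angle` and its helpers are hypothetical. The sign is chosen so that a clockwise rotation of the front bond onto the rear bond, viewed along the central bond, is positive, which is intended to agree with the IUPAC-IUB convention.

```python
import math


def _sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond for atoms p0-p1-p2-p3."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane of p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

For a protein backbone, passing the co-ordinates of C(i-1), N(i), CA(i) and C(i) would give the angle φ of residue i, assuming all four atoms are present in the entry.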
`
The late Walter C. Hamilton was one of the founders of this project. Helen M. Berman
participated in the initial organization of the Data Bank and also was active in preparation
of some of the first data entries as were Betty R. Davis and Daniel D. Jones. Numerous
individuals have offered helpful suggestions and criticisms: Herbert J. Bernstein, David
M. Blow, John S. Coggins, Robert Diamond, Anthony C. T. North, Michael G. Rossmann,
and David G. Watson. Richard J. Feldmann generously has transferred numerous atomic
co-ordinate entries originally collected for inclusion in AMSOM. The members of the
Protein Data Bank Advisory Group, David R. Davies, Kenneth Neet and Frederic M.
Richards have overseen our operations and engaged in many useful discussions.
This work was performed under the auspices of the U.S. Energy Research and
Development Administration and supported by the U.S. National Science Foundation under grants
AG-370, GJ 33248X, DCR-75-07702, and PCM75-18956.
`
Chemistry Department
Brookhaven National Laboratory
Upton, N.Y. 11973, U.S.A.

University Chemical Laboratory
Lensfield Road, Cambridge CB2 1EW, England

University of Tokyo
Hongo, Tokyo, Japan

Received 21 February 1977

FRANCES C. BERNSTEIN
THOMAS F. KOETZLE‡
GRAEME J. B. WILLIAMS
EDGAR F. MEYER, JR§
MICHAEL D. BRICE
JOHN R. RODGERS
OLGA KENNARD¶
TAKEHIKO SHIMANOUCHI
MITSUO TASUMI
`
† The Bank will assume no responsibility for checkout or correction of errors in such deposited
programs.
‡ To whom correspondence should be addressed.
§ Permanent address: Department of Biochemistry and Biophysics, Texas A & M University,
College Station, Texas 77843. Research collaborator at Brookhaven National Laboratory.
¶ External staff, Medical Research Council.
`
`
REFERENCES
Allen, F. H., Kennard, O., Motherwell, W. D. S., Town, W. G. & Watson, D. G. (1973).
J. Chem. Doc. 13, 119-123.
Diamond, R. (1966). Acta Crystallogr. 21, 253-266.
Diamond, R. (1971). Acta Crystallogr. sect. A, 27, 436-452.
Feldmann, R. J. (1977). Atlas of Macromolecular Structure on Microfiche, Tracor Jitco
Inc., Rockville.
IUPAC-IUB Commission on Biochemical Nomenclature (1970). J. Biol. Chem. 245,
6489-6497.
IUPAC-IUB Commission on Biochemical Nomenclature (1971). J. Biol. Chem. 247,
977-983.
Kennard, O., Watson, D. G. & Town, W. G. (1972). J. Chem. Doc. 12, 14-19.
Koetzle, T. F., Andrews, L. C., Bernstein, F. C. & Bernstein, H. J. (1975). In Computer
Networking and Chemistry (Lykos, P., ed.), ACS Symposium Series, vol. 19, p. 1,
American Chemical Society, Washington.
Meyer, E. F. (1974). Biopolymers, 13, 419-422.
Protein Data Bank (1971). Nature New Biol. 233, 223.
Protein Data Bank (1973). Acta Crystallogr. sect. B, 29, 1746.
`
`
`
`