Journal of
MOLECULAR
BIOLOGY

Editors in Chief:
J. C. KENDREW  S. BRENNER

Volume 196

Number 4

20 August 1987

London  Austin  Orlando  Boston  New York  San Diego  Sydney  Tokyo  Toronto

JMOBAK 196 (4) 743-966
ISSN 0022-2836
`BIOEPIS EX. 1062
`Page 1
`
`
`
`
J. Mol. Biol. (1987) 196, 901-917

Canonical Structures for the Hypervariable Regions
of Immunoglobulins

Cyrus Chothia¹,² and Arthur M. Lesk¹,³†

¹MRC Laboratory of Molecular Biology
Hills Road, Cambridge CB2 2QH
England

²Christopher Ingold Laboratory
University College London
20 Gordon Street
London WC1H 0AJ, England

³EMBL Biocomputing Programme
Meyerhofstr. 1, Postfach 1022.09
D-6900 Heidelberg
Federal Republic of Germany

(Received 13 November 1986, and in revised form 23 April 1987)

We have analysed the atomic structures of Fab and VL fragments of immunoglobulins to determine the relationship between their amino acid sequences and the three-dimensional structures of their antigen binding sites. We identify the relatively few residues that, through their packing, hydrogen bonding or the ability to assume unusual φ, ψ or ω conformations, are primarily responsible for the main-chain conformations of the hypervariable regions. These residues are found to occur at sites within the hypervariable regions and in the conserved β-sheet framework.
Examination of the sequences of immunoglobulins of unknown structure shows that many have hypervariable regions that are similar in size to one of the known structures and contain identical residues at the sites responsible for the observed conformation. This implies that these hypervariable regions have conformations close to those in the known structures. For five of the hypervariable regions, the repertoire of conformations appears to be limited to a relatively small number of discrete structural classes. We call the commonly occurring main-chain conformations of the hypervariable regions "canonical structures".
The accuracy of the analysis is being tested and refined by the prediction of immunoglobulin structures prior to their experimental determination.
`
1. Introduction
The specificity of immunoglobulins is determined by the sequence and size of the hypervariable regions in the variable domains. These regions produce a surface complementary to that of the antigen. The subject of this paper is the relation between the amino acid sequences of antibodies and the structure of their binding sites. The results we report are related to two previous sets of observations.
`
† Also associated with Fairleigh Dickinson University, Teaneck-Hackensack Campus, Teaneck, NJ 07666, U.S.A.

0022-2836/87/160901-17 $03.00/0
© 1987 Academic Press Limited

The first set concerns the sequences of the hypervariable regions. Kabat and his colleagues (Kabat et al., 1977; Kabat, 1978) compared the sequences of the hypervariable regions then known and found that, at 13 sites in the light chains and at seven positions in the heavy chains, the residues are conserved. They argued that the residues at these sites are involved in the structure, rather than the specificity, of the hypervariable regions. They suggested that these residues have a fixed position in antibodies and that this could be used in the model building of combining sites to limit the conformations and positions of the sites whose residues varied. Padlan (1979) also examined the sequences of the hypervariable region of light chains. He found that residues that are part of the hypervariable regions, and that are buried within the domains in the known structures, are conserved. The residues he found conserved in Vλ sequences were different to those conserved in Vκ sequences.
The second set of observations concerns the conformation of the hypervariable regions. The results of the structure analysis of Fab and Bence-Jones proteins (Saul et al., 1978; Segal et al., 1974; Marquart et al., 1980; Suh et al., 1986; Schiffer et al., 1973; Epp et al., 1975; Fehlhammer et al., 1975; Colman et al., 1977; Furey et al., 1983) show that in several cases hypervariable regions of the same size, but with different sequences, have the same main-chain conformation (Padlan & Davies, 1975; Fehlhammer et al., 1975; Padlan et al., 1977; Padlan, 1977b; Colman et al., 1977; de la Paz et al., 1986). Details of these observations are given below.
In this paper, from an analysis of the immunoglobulins of known atomic structure we determine the limits of the β-sheet framework common to the known structures (see section 3 below). We then identify the relatively few residues that, through packing, hydrogen bonding or the ability to assume unusual φ, ψ or ω conformations, are primarily responsible for the main-chain conformations observed in the hypervariable regions (see sections 4 to 9, below). These residues are found to occur at sites within the hypervariable regions and in the conserved β-sheet framework. Some correspond to residues identified by Kabat et al. (1977) and by Padlan (Padlan et al., 1977; Padlan, 1979) as being important for determining the conformation of hypervariable regions.
Examination of the sequences of immunoglobulins of unknown structure shows that in many cases the set of residues responsible for one of the observed hypervariable conformations is present. This suggests that most of the hypervariable regions in immunoglobulins have one of a small discrete set of main-chain conformations that we call "canonical structures". Sequence variations at the sites not responsible for the conformation of a particular canonical structure will modulate the surface that it presents to an antigen.
Prior to this analysis, attempts to model the combining sites of antibodies of unknown structure have been based on the assumption that hypervariable regions of the same size have similar backbone structures (see section 12, below). As we show below, and as has been realized in part before, this is true only in certain instances. Modelling based on the sets of residues identified here as responsible for the observed conformations of hypervariable regions would be expected to give more accurate results.
`
2. Immunoglobulin Sequences and Structures
Kabat et al. (1983) have published a collection of the known immunoglobulin sequences. For the variable domain of the light chain (VL)† they list some 200 complete and 400 partial sequences; for the variable domain of the heavy chain (VH) they list about 130 complete and 200 partial sequences. In this paper we use the residue numbering of Kabat et al. (1983), except in the few instances where the structural superposition of certain hypervariable regions gives an alignment different from that suggested by the sequence comparisons.
In Table 1 we list the immunoglobulins of known structure for which atomic co-ordinates are available from the Protein Data Bank (Bernstein et al., 1977), and give the references to the crystallographic analyses. Amzel & Poljak (1979), Marquart & Deisenhofer (1982) and Davies & Metzger (1983) have written reviews of the molecular structure of immunoglobulins.
The VL and VH domains have homologous structures (for references, see Table 1). Each contains two large β-pleated sheets that pack face to face with their main chains about 10 Å apart (1 Å = 0·1 nm) and inclined at an angle of -30° (Fig. 1). The β-sheets of each domain are linked by a conserved disulphide bridge. The antibody binding site is formed by the six hypervariable regions; three in VL and three in VH. These regions link strands of the β-sheets. Two link strands that are in different β-sheets. The other four are hair-pin turns: peptides that link two adjacent strands in the same β-sheet (Fig. 2). Sibanda & Thornton (1985) and Efimov (1986) have described how the conformations of small and medium-sized hair-pin turns depend primarily on the length and sequence of the turn. Thornton et al. (1985) pointed out that the sequence-conformation rules for hair-pin turns can be used for modelling antibody combining sites. The results of these authors and our own unpublished work on the conformations of hair-pin turns are summarized in Table 2.
`
† Abbreviations used: VL and VH, variable regions of the immunoglobulin light and heavy chains, respectively; r.m.s., root-mean-square; CDR, complementarity-determining region.

3. The Conserved β-Sheet Framework

Comparisons of the first immunoglobulin structures determined showed that the framework regions of different molecules are very similar (Padlan & Davies, 1975). The structural similarities of the frameworks of the variable domains were [...] as arising from the tendency of residues that form the interiors of the domains to be conserved, and from the conservation of the total volume of the interior residues (Padlan, 1977a, 1979). In addition, the residues that form the central region of the interface between VL and VH domains were [...] to be strongly conserved (Poljak et al., 1975; Padlan, 1977b) and to pack with very similar [...] (Chothia et al., 1985).
In this section we define and describe the exact extent of the structurally similar framework regions in the known Fab and VL structures. This was determined by optimally superposing the main-chain atoms of the known structures (Table 1) and [...] the differences in position of atoms in [...] residues†.

† For these and other calculations we used a program written by one of us (see Lesk, 1986).

In Figure 3(a) we give a plan of the β-sheet framework that, on the basis of the superpositions, is common to all six VL structures. It contains 69 residues. The r.m.s. difference in the position of the main-chain atoms of these residues is small for all pairs of VL domains; the values vary between 0·50 and 1·61 Å (Table 3A). The four VH domains share a common β-sheet framework of 79 residues (Fig. 3(b)). For different pairs of VH domains the r.m.s. difference in the position of the main-chain atoms is between 0·64 and 1·42 Å.
The combined β-sheet framework consists of VL residues 4 to 6, 9 to 13, 19 to 25, 33 to 49, 53 to 55, 61 to 76, 84 to 90, 97 to 107 and VH residues 3 to 12, 17 to 25, 33 to 52, 56 to 60, 68 to 82, 88 to 95 and 102 to 112. A fit of the main-chain atoms of these 156 residues in the four known Fab structures gives r.m.s. differences in atomic positions of main-chain atoms of:

            NEWM      McPC603   J539
KOL         1·39 Å    1·15 Å    1·14 Å
NEWM                  1·47 Å    1·37 Å
McPC603                         1·03 Å

The major determinants of the tertiary structure of the framework are the residues buried within and between the domains. We calculated the accessible surface area (Lee & Richards, 1971) of each residue in the Fab and VL structures. In Table 4 we list the residues commonly buried within the VL and VH domains and in the interface between them. These are essentially the same as those identified by Padlan (1977a) as buried within the then known structures and conserved in the then known sequences. Examination of the 200 to 700 VL sequences and 130 to 300 VH sequences in the Tables of Kabat et al. (1983) shows that in nearly all the sequences listed there the residues at these positions are identical with, or very similar to, those in the known structures.

Table 1
Immunoglobulin variable domains of known atomic structure

Protein        Chain type     Reference
               L      H
Fab' NEWM      λI     II      Saul et al. (1978)
Fab MCPC603    κ      I       Segal et al. (1974)
Fab KOL        λI     III     Marquart et al. (1980)
Fab J539       κ      III     Suh et al. (1986)
VL REI         κ              Epp et al. (1975)
VL RHE         λI             Furey et al. (1983)

Figure 1. The structure of an immunoglobulin V domain. The drawing is of KOL VL. Strands of β-sheet are represented by ribbons. The three hypervariable regions are labelled L1, L2 and L3. L2 and L3 are hairpin loops that link adjacent β-sheet strands. L1 links two strands that are part of different β-sheets. The VH domains and their hypervariable regions, H1, H2 and H3, have homologous structures. The domain is viewed from the β-sheet that forms the VL-VH interface. The arrangement of the 6 hypervariable regions that form the antibody binding site is shown in Figure 2.

Figure 2. A drawing of the arrangement of the hypervariable regions in immunoglobulin binding sites. The squares indicate the position of residues at the ends of the β-sheet strands in the framework regions.

There are two positions in the VL sequences at which the nature of the conserved residues depends on the chain class. In Vλ sequences, the residues at positions 71 and 90 are usually Ala and Ser/Ala, respectively; in Vκ sequences the corresponding residues are usually Tyr/Phe and Gln/Asn. These residues make contact with the hypervariable loops and play a role in determining the conformation of
`
`
`Structure
`
`Sequence'
`
`I 2 3 4
`X- G- G- X
`
`Table 2
`Conformation of hair-pin turns
`
`Conformationb
`n
`r/>2 ,
`r/>2 ,
`o/12
`+55 +35 +85
`or
`+65 -125 -105 +10'
`
`o/13
`-5d
`
`+70 - 115
`
`-90
`
`0'
`
`+50 +45 +85 -20°
`
`2- -3
`
`I
`
`I X-G- X - X
`
`1= == 4 X-X-G-X
`
`Frequency'
`
`6/6
`
`6/7
`
`7/8
`
`4/4
`
`4/4
`
`3f3
`
`X- X-X - X
`
`X-X-X-G
`
`+60 +20 +85 +25r
`r/>2
`o/12
`o/11
`r/>1
`r/>3
`o/13
`r/>4
`o/14
`- 95 - 10 + 145 +155
`- 135 + 175 -50
`-35
`
`I 2 3 4 5g
`3
`2/ "'-.4 X X X X G
`I
`I
`1===5 X X X X X
`
`/3 "'
`2
`4
`
`I 2 3 4 5h
`G
`X - X -X-N-X
`D
`
`I // I
`~ ~-- 5
`I 2 3 4 5 6;
`3- -4
`I
`I
`G
`5 X - X -X-X- N- X
`2
`I
`I
`X
`1===6
`
`r/>2
`-75
`
`o/12
`- 10
`
`r/>4
`o/13
`r/>3
`-95 -50 - 105
`
`o/14
`0
`
`r/>5
`o/15
`+85 - 160
`
`+50 +55 +65 -50 -130 -5
`
`-90 + 130
`
`l fl (3/3)
`
`r/>2
`
`o/12
`
`r/>3
`
`o/13
`
`r/>4
`
`o/14
`
`-60 -25 -90
`
`0 +85 +10
`
`13/ 15
`
`r/>2
`
`o/12
`
`r/>3
`
`o/13
`
`r/>4
`
`o/14
`
`r/>5
`
`o/15
`
`-65 -30 -65
`
`- 45
`
`- 95 -5 +70 +35
`
`3/3
`2/2
`l / l
`
The data in this Table are from an unpublished analysis of proteins whose atomic structure has been determined at a resolution of 2 Å or higher. The conformations described here for the 2-residue X-X-X-G turn and the 3-residue turns are new. The other conformations have been described by Sibanda & Thornton (1985) and by Efimov (1986). We list only conformations found more than once.
a X indicates no residue restriction except that certain sites cannot have Pro, as this residue requires a φ value of ~ -60° and cannot form a hydrogen bond to its main-chain nitrogen.
b Residues whose φ,ψ values are not given have a β conformation.
c Frequencies are given as n1/n2, where n2 is the number of cases where we found the structure in column 1 with the sequence in column 2 and n1 the number of these cases that have the conformation in column 3. Except for the frequencies in brackets, data is given only for non-homologous proteins.
d,e,f These are type I', II' and III' turns.
g Different conformations are found for the single cases of X-D-G-X-X and X-G-X-G-X.
h Different conformations are found for the single cases of X-N-N-X-X, X-G-G-X-X and X-G-X-X-G. The 2 cases of X-X-X-X-X have different conformations.
i Different conformations are found for the 2 cases of X-G-X-X-X-X.
`
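The footnotes to Table 2 identify some of the two-residue turns as type I' and II' turns. As a rough illustration of how a set of (φ2, ψ2, φ3, ψ3) values might be assigned to one of these types, the sketch below compares them with idealised turn angles. The idealised values are commonly quoted textbook figures and are an assumption of this example, not data from Table 2; the function names and the 45° tolerance are likewise hypothetical.

```python
# Idealised (phi2, psi2, phi3, psi3) values in degrees for two turn types.
# These are commonly quoted textbook values, assumed for this sketch.
IDEAL_TURNS = {
    "I'": (60.0, 30.0, 90.0, 0.0),
    "II'": (60.0, -120.0, -80.0, 0.0),
}


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def classify_turn(angles, tolerance=45.0):
    """Return the closest idealised turn type, or None.

    A type is rejected if any single angle deviates from the ideal by
    more than `tolerance` degrees (an arbitrary, illustrative cutoff).
    """
    best_name, best_score = None, None
    for name, ideal in IDEAL_TURNS.items():
        diffs = [angle_diff(a, i) for a, i in zip(angles, ideal)]
        if max(diffs) > tolerance:
            continue
        score = sum(diffs)
        if best_score is None or score < best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    # Two of the angle sets listed for the X-G-G-X turn in Table 2.
    print(classify_turn((55, 35, 85, -5)))
    print(classify_turn((65, -125, -105, 10)))
```

Type III' is left out deliberately: its idealised angles lie close to those of type I', so a nearest-value rule of this kind would not separate the two reliably.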
these loops. This is discussed in sections 5 and 7, below.
The conservation of the framework structure extends to the residues immediately adjacent to the hypervariable regions. If the conserved frameworks of a pair of molecules are superposed, the differences in the positions of these residues are in most cases less than 1 Å and in all but one case less than 1·8 Å (Table 5). In contrast, residues in the hypervariable region adjacent to the conserved framework can differ in position by 3 Å or more.
The six loops, whose main-chain conformations vary and which are part of the antibody combining site, are formed by residues 26 to 32, 50 to 52 and 91 to 96 in VL domains, and 26 to 32, 53 to 55 and 96 to 101 in the VH domains (L1, L2, L3, H1, H2 and H3, respectively). Their limits are somewhat different from those of the complementarity-determining regions defined by Kabat et al. (1983) on the basis of sequence variability: residues 24 to 34, 50 to 56 and 89 to 97 in VL and 31 to 35, 50 to 65 and 95 to 102 in VH. This point is discussed in section 11, below.
`
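The difference between the structurally defined loops and the complementarity-determining regions of Kabat et al. (1983) can be made explicit by comparing the two sets of residue ranges quoted above. The following is a minimal sketch: the dictionary and function names are hypothetical, and positions that carry insertion letters in the Kabat numbering are ignored, so each range is treated as a run of plain integers.

```python
# Residue ranges quoted in the text (inclusive).
STRUCTURAL_LOOPS = {
    "L1": (26, 32), "L2": (50, 52), "L3": (91, 96),
    "H1": (26, 32), "H2": (53, 55), "H3": (96, 101),
}
KABAT_CDRS = {
    "L1": (24, 34), "L2": (50, 56), "L3": (89, 97),
    "H1": (31, 35), "H2": (50, 65), "H3": (95, 102),
}


def positions(bounds):
    """Expand an inclusive (start, end) pair into a set of positions."""
    start, end = bounds
    return set(range(start, end + 1))


def compare(region):
    """Split the positions of one region into shared and unshared sets."""
    loop = positions(STRUCTURAL_LOOPS[region])
    cdr = positions(KABAT_CDRS[region])
    return {
        "shared": sorted(loop & cdr),
        "loop_only": sorted(loop - cdr),
        "cdr_only": sorted(cdr - loop),
    }


if __name__ == "__main__":
    for name in STRUCTURAL_LOOPS:
        print(name, compare(name))
```

For H1, for example, only positions 31 and 32 are common to both definitions, whereas the L2 loop lies entirely within the corresponding complementarity-determining region.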
Figure 3. Plan of the β-sheet framework that is conserved in the VL and VH domains of the immunoglobulins of known atomic structure.

4. Conformation of the L1 Hypervariable Regions

In the known VL structures, the conformations of the L1 regions, residues 26 to 32, are characteristic of the class of the light chain. In Vλ domains their conformation is helical and in the Vκ domains it is extended (Padlan et al., 1977; Padlan, 1977b; de la Paz et al., 1986). These conformational differences are the result of sequence differences in both the L1 region and the framework (Lesk & Chothia, 1982).

(a) Vλ domains
Figure 4 shows the conformation of the L1 regions of the Vλ domains. The L1 regions in RHE and KOL contain nine residues designated 26 to 30, 30a, 30b, 31 to 32; NEWM has one additional residue. The L1 regions in RHE and KOL have the same conformation: their main-chain atoms have an r.m.s. difference in position of 0·28 Å. Superposition of the L1 region of NEWM with those of KOL and RHE shows that the additional residue is inserted between residues 30b and 31 and has little effect on the conformation of the rest of the region: superpositions of the main-chain atoms of 26 to 30b and 31 to 32 in NEWM to 26 to 32 in KOL and RHE give r.m.s. differences in position of 0·96 Å and 1·25 Å. Thus, the sequence alignment for the Vλ L1 regions of KOL, RHE and NEWM implied by the structural superposition is:

Position  26   27   28   29   30   30a  30b  30c  31   32
RHE       Ser  Ala  Thr  Asp  Ile  Gly  Ser  -    Asn  Ser
KOL       Thr  Ser  Ser  Asn  Ile  Gly  Ser  -    Ile  Thr
NEWM      Ser  Ser  Ser  Asn  Ile  Gly  Ala  Gly  Asn  His
`
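The r.m.s. differences in position quoted in this paper are computed over paired main-chain atoms after the two structures have been optimally superposed. The sketch below illustrates only the final step, the r.m.s. calculation itself, for coordinates assumed to be superposed already. It is not the program used by the authors (see Lesk, 1986), and the function name is hypothetical.

```python
import math


def rms_difference(coords_a, coords_b):
    """r.m.s. difference in position between paired atoms.

    Both arguments are equal-length sequences of (x, y, z) tuples in
    angstroms, listed in the same atom order and assumed to have been
    superposed beforehand. The superposition step is not shown here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates: every atom is displaced by 1 angstrom along z.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rms_difference(a, b))
```

Because the squared displacements are averaged before the square root is taken, a few badly fitting atoms raise the value more than many small deviations do.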
In all three structures, residues 26 to 29 form a type I turn with a hydrogen bond between the carbonyl of 26 and the amide of 29. Residues 27 to 30b form an irregular helix (Fig. 4). This helix sits across the top of the β-sheet core. The side-chain of residue 30 penetrates deep into the core occupying a cavity between residues 25, 33 and 71. The major determinant of the conformation of L1 in the observed structures is the packing of residues 25, 30, 33 and 71. Vλ RHE, KOL and NEWM have the same residues at these sites: Gly25, Ile30, Val33 and Ala71. (Another L1 residue, Asp29 or Asn29, is buried by the contacts it makes with L3.)

Table 3
Differences in immunoglobulin framework structures (Å)

For pairs of V domains we give the r.m.s. difference in the atomic positions of framework main-chain atoms after optimal superposition.

A. VL domains
Framework residues are 4 to 6, 9 to 13, 19 to 25, 33 to 49, 53 to 55, 61 to 76, 84 to 90 and 97 to 107.

           KOL    NEWM   REI    MCPC603  J539
RHE        0·74   1·47   1·46   1·61     1·41
KOL               1·13   1·23   1·36     1·15
NEWM                     1·24   1·28     1·53
REI                             0·50     0·77
MCPC603                                  0·76

B. VH domains
Framework residues are 3 to 12, 17 to 25, 33 to 52, 56 to 60, 68 to 82, 88 to 95 and 102 to 112.

           NEWM   MCPC603  J539
KOL        1·42   0·64     0·89
NEWM              1·27     1·29
MCPC603                    0·89
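As a quick consistency check, the VL framework ranges given under Table 3A can be expanded and counted. The sketch below is illustrative only: the names VL_FRAMEWORK, expand and in_framework are assumptions of this example, and positions that carry insertion letters in the Kabat numbering are ignored, so each range is treated as a run of plain integers.

```python
# VL framework residue ranges (inclusive), as listed under Table 3A.
VL_FRAMEWORK = [
    (4, 6), (9, 13), (19, 25), (33, 49),
    (53, 55), (61, 76), (84, 90), (97, 107),
]


def expand(ranges):
    """Expand inclusive (start, end) pairs into a flat list of positions."""
    return [n for start, end in ranges for n in range(start, end + 1)]


def in_framework(position, ranges=VL_FRAMEWORK):
    """True if an integer residue position falls inside any framework range."""
    return any(start <= position <= end for start, end in ranges)


if __name__ == "__main__":
    print(len(expand(VL_FRAMEWORK)))
    print(in_framework(30), in_framework(71))
```

Counted this way the eight ranges cover 69 positions, the number quoted in section 3 for the framework common to the six VL structures. Position 30, which lies in the L1 region, falls outside the framework, while position 71 falls inside it.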
Table 4
Residues commonly buried within VL and VH domains

VL domains

Position   Residues in known structures   A.S.A.a (Å²)
4          L,M                            6
6          Q                              12
19         V                              11
21         I,M                            1
23         C                              0
25         G,A,S                          13
33         V,L                            3
35         W                              0
37         Q                              30
47         L,I,W                          8
48         I                              24
62         F                              11
64         G,A                            13
71         A,F,Y                          2
73         L,F                            0
75         I,V                            0
82         D                              4
84         A,S                            11
86         Y                              0
88         C                              0
90         A,S,Q,N                        7
97         V,T,G                          18
99         G                              3
101        G                              11
102        T                              1
104        L,V                            2

VH domains

Position   Residues in known structures   A.S.A.a (Å²)
4          L                              14
6          Q,E                            16
18         L                              21
20         L                              0
22         C                              0
24         S,V,T,A                        8
34         M,Y                            4
36         W                              0
38         R                              13
48         I,V                            1
49         A,G                            0
51         I,V,S                          4
69         I,V,M                          13
78         L,F                            0
80         L                              0
82         M,L                            0
86         D                              2
88         A,G                            3
90         Y                              0
92         C                              0
104        G                              11
106        G                              19
107        T,S                            17
109        V                              2

a Mean accessible surface area (A.S.A.) of the residues in the Fab structures NEWM, MCPC603, KOL and J539 and in the VL structures REI and RHE.

Kabat et al. (1983) listed 33 human Vλ domains for which the sequences of the L1 regions are known. The 21 sequences in subgroups I, II, V and VI have L1 regions that are the same length as those found in RHE, KOL or NEWM. Of these, 18 conserve the residues responsible for the observed conformations:

Residue position   Residue in KOL/RHE/NEWM   Residues in 18 Vλ sequences
25                 Gly                       18 Gly
30                 Ile                       17 Val, 1 Ile
33                 Val                       17 Val, 1 Ile
71                 Ala                       18 Ala
29                 Asp/Asn                   11 Asp, 6 Asn, 1 Ser

The conservation of these residues implies that these 18 L1 regions have a conformation that is the same as that in RHE, KOL or NEWM.
Subgroups III and IV have 13 sequences for which the L1 regions are known (Kabat et al., 1983). These regions are shorter than those in RHE and KOL and in the other Vλ subgroups. They also have a quite different pattern of conserved residues.
Kabat et al. (1983) listed 29 mouse Vλ domains for which the sequence of the L1 region is known. These L1 regions are the same size as that in NEWM. They also have a pattern of residue conservation similar to, but not identical with, that in KOL/NEWM: Ser at position 25, Val at 30, Ala at 33 and Ala at 71. This suggests that the fold of the mouse Vλ L1 regions is a distorted version of that found in the known human structures.
`
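The reasoning used above for the Vλ L1 regions, in which a sequence is checked for the residues found at the sites that determine the observed conformation, can be sketched as a simple lookup. This is a hedged illustration, not the authors' procedure: the allowed residues are those reported in the text for positions 25, 30, 33 and 71, position 29 (buried by contacts with L3) is left out, and the names VLAMBDA_L1_KEY_SITES and matches_key_sites are hypothetical.

```python
# Residues reported in the text at the sites that pack against the
# V-lambda L1 region (known structures plus the 18 human sequences).
VLAMBDA_L1_KEY_SITES = {
    25: {"Gly"},
    30: {"Ile", "Val"},
    33: {"Val", "Ile"},
    71: {"Ala"},
}


def matches_key_sites(residues, key_sites=VLAMBDA_L1_KEY_SITES):
    """Check a {position: three-letter residue} mapping against the key sites.

    Returns (ok, mismatches), where mismatches maps each failing position
    to the residue found there (None if the position is missing).
    """
    mismatches = {
        pos: residues.get(pos)
        for pos, allowed in key_sites.items()
        if residues.get(pos) not in allowed
    }
    return (not mismatches, mismatches)


if __name__ == "__main__":
    kol = {25: "Gly", 30: "Ile", 33: "Val", 71: "Ala"}
    mouse = {25: "Ser", 30: "Val", 33: "Ala", 71: "Ala"}
    print(matches_key_sites(kol))
    print(matches_key_sites(mouse))
```

With the mouse Vλ pattern quoted above, positions 25 and 33 are reported as mismatches, in line with the description of a pattern that is similar to, but not identical with, the human one. A check of this kind says nothing about loop length, which has to agree as well before a known conformation can be assumed.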
(b) Vκ domains

In Figure 5 we illustrate the conformation of the L1 regions in the three known Vκ structures: J539, REI and MCPC603. In J539 L1 has six residues, in REI it has seven and in MCPC603 13. The L1 region of J539 has an extended conformation. In REI, residues 26 to 28 have an extended conformation and 29 to 32 form a distorted type II turn. The six additional residues in MCPC603 all occur in the region of this turn (Fig. 5). In the three structures the main chain of residues 26 to 29 and 32 have the same conformation. A fit of the main-chain atoms of these residues in J539, REI and MCPC603 gives r.m.s. differences in position of 0·47 to 1·03 Å. The sequence alignment implied by the structural superposition is:

Residue   26   27   28   29   30   31   31a  31b  31c  31d  31e  31f  32
J539      Ser  Ser  Ser  Val  Ser  -    -    -    -    -    -    -    Ser
REI       Ser  Glu  Asp  Ile  Ile  Lys  -    -    -    -    -    -    Tyr
MCPC603   Ser  Glu  Ser  Leu  Leu  Asn  Ser  Gly  Asn  Glu  Lys  Asn  Phe

In J539, REI and MCPC603, residues 26 to 29 extend across the top of the β-sheet framework with one, 29, buried within it. The main contacts of 29 are with residues 2, 25, 33 and 71. The penetration of residue 29 into the interior of the framework is not as great as that of residue 30 in the Vλ domains, and the deep cavity that exists in Vλ domains is filled in Vκ domains by the large side-chain of the residue at position 71. In J539, REI and MCPC603, the residues involved in the packing of L1 (2, 25, 29, 33 and 71) are very similar: Ile, Ala/Ser, Val/Ile/Leu, Leu and Tyr/Phe, respectively.
The six residues 31a to 31f in MCPC603 form a hair-pin loop that extends away from the domain (Fig. 5) and does not have a well-ordered conformation (Segal et al., 1974).
Kabat et al. (1977) noted that residues at certain positions in the L1 regions of the Vκ sequences then known were conserved, and suggested that they have a structural role. The structural role of residues at positions 25, 29 and 33 is confirmed by the above analysis of the Vκ structures and the pattern of residue conservation in the much larger number of sequences known now. Kabat et al. (1983) listed 65 human and 164 mouse Vκ sequences for which the residues between positions 2 and 33 are known. For about half of these, the residue at position 71 is also known. These data show that there are 59 human and 148 mouse sequences that have residues very similar to those in the known structures at the sites involved in the packing of L1:

Position   J539/REI/MCPC603   Human Vκ                 Mouse Vκ
2          Ile                57 Ile, 1 Met, 1 Val     134 Ile, 14 Val
25         Ala, Ser           52 Ala, 7 Ser            104 Ala, 4 Ser
29         Val, Ile, Leu      30 Ile, 21 Val, 8 Leu    59 Leu, 51 Val, 38 Ile
33         Leu                57 Leu, 2 Val            94 Leu, 44 Met, 7 Val, 3 Ile
71         Tyr, Phe           28 Phe, 1 Tyr            54 Phe, 26 Tyr

The number of residues in the L1 region in these sequences varies:

Residue size of L1     6    7    8    9    10   11   12   13
Number of human Vκ     -    38   14   -    -    1    4    2
Number of mouse Vκ     17   40   -    -    -    32   35   30

The conservation of residues at the positions buried between L1 and the framework implies that in the large majority of Vκ domains residues 26 to 29 have a conformation close to that found in the known structures and that the remaining residues, if small in number, form a turn or, if large, a hair-pin loop.

5. Conformation of the L2 Hypervariable Regions

The L2 regions have the same conformation in the known structures (Padlan et al., 1977; Padlan, 1977b; de la Paz et al., 1986) except for NEWM, where it is deleted. We find that the similarities in the L2 structures arise from the conformational requirements of a three-residue turn and the conservation of the framework residues against which L2 packs.
In the known structures L2 consists of three residues, 50 to 52:

Residue   RHE   KOL   REI   MCPC603   J539
50        Tyr   Arg   Glu   Gly       Glu
51        Asn   Asp   Ala   Ala       Ile
52        Asp   Ala   Ser   Ser       Ser

These three residues link two adjacent strands in the framework β-sheet. Residues 49 and 53 are hydrogen bonded to each other so that the L2 region is a three-residue hair-pin turn (Fig. 6).

      51
     /  \
   50    52
   |      |
   49====53

The conformations of L2 in the five structures are very similar: r.m.s. differences in position of their main-chain atoms are between 0·1 and 0·97 Å. The only difference among the conformations is in the orientation of the peptide between residues 50 and 51. In MCPC603 this difference is associated with the Gly residue at position 50. The side-chains of L2 all point towards the surface. The main-chain packs
`