Reprinted from J. Mol. Biol. (1987) 196, 901-917
`Canonical Structures for the Hypervariable Regions
`of Immunoglobulins
Cyrus Chothia¹,² and Arthur M. Lesk¹,³†
¹MRC Laboratory of Molecular Biology
Hills Road, Cambridge CB2 2QH
²Christopher Ingold Laboratory
University College London
20 Gordon Street
London WC1H 0AJ, England
³EMBL Biocomputing Programme
Meyerhofstr. 1, Postfach 1022.09
D-6900 Heidelberg
Federal Republic of Germany
(Received 13 November 1986, and in revised form 23 April 1987)
We have analysed the atomic structures of Fab and VL fragments of immunoglobulins to determine the relationship between their amino acid sequences and the three-dimensional structures of their antigen binding sites. We identify the relatively few residues that, through their packing, hydrogen bonding or the ability to assume unusual φ, ψ or ω conformations, are primarily responsible for the main-chain conformations of the hypervariable regions. These residues are found to occur at sites within the hypervariable regions and in the conserved β-sheet framework.
Examination of the sequences of immunoglobulins of unknown structure shows that many have hypervariable regions that are similar in size to one of the known structures and contain identical residues at the sites responsible for the observed conformation. This implies that these hypervariable regions have conformations close to those in the known structures. For five of the hypervariable regions, the repertoire of conformations appears to be limited to a relatively small number of discrete structural classes. We call the commonly occurring main-chain conformations of the hypervariable regions "canonical structures".
The accuracy of the analysis is being tested and refined by the prediction of immunoglobulin structures prior to their experimental determination.
1. Introduction
The specificity of immunoglobulins is determined by the sequence and size of the hypervariable regions in the variable domains. These regions produce a surface complementary to that of the antigen. The subject of this paper is the relation between the amino acid sequences of antibodies and the structure of their binding sites. The results we report are related to two previous sets of observations.
The first set concerns the sequences of the hypervariable regions. Kabat and his colleagues (Kabat et al., 1977; Kabat, 1978) compared the sequences of the hypervariable regions then known and found that, at 13 sites in the light chains and at seven positions in the heavy chains, the residues are conserved. They argued that the residues at these sites are involved in the structure, rather than the specificity, of the hypervariable regions. They suggested that these residues have a fixed position in antibodies and that this could be used in the model building of combining sites to limit the conformations and positions of the sites whose residues varied. Padlan (1979) also examined the sequences of the hypervariable region of light chains. He found that residues that are part of the hypervariable regions, and that are buried within the domains in the known structures, are conserved. The residues he found conserved in Vλ sequences were different to those conserved in Vκ sequences.
† Also associated with Fairleigh Dickinson University, Teaneck-Hackensack Campus, Teaneck, NJ 07666, U.S.A.
The second set of observations concerns the conformation of the hypervariable regions. The results of the structure analysis of Fab and Bence-Jones proteins (Saul et al., 1978; Segal et al., 1974; Marquart et al., 1980; Suh et al., 1986; Schiffer et al., 1973; Epp et al., 1975; Fehlhammer et al., 1975; Colman et al., 1977; Furey et al., 1983) show that in several cases hypervariable regions of the same size, but with different sequences, have the same main-chain conformation (Padlan & Davies, 1975; Fehlhammer et al., 1975; Padlan et al., 1977; Padlan, 1977b; Colman et al., 1977; de la Paz et al., 1986). Details of these observations are given below.
In this paper, from an analysis of the immunoglobulins of known atomic structure we determine the limits of the β-sheet framework common to the known structures (see section 3 below). We then identify the relatively few residues that, through packing, hydrogen bonding or the ability to assume unusual φ, ψ or ω conformations, are primarily responsible for the main-chain conformations observed in the hypervariable regions (see sections 4 to 9, below). These residues are found to occur at sites within the hypervariable regions and in the conserved β-sheet framework. Some correspond to residues identified by Kabat et al. (1977) and by Padlan (Padlan et al., 1977; Padlan, 1979) as being important for determining the conformation of hypervariable regions.
Examination of the sequences of immunoglobulins of unknown structure shows that in many cases the set of residues responsible for one of the observed hypervariable conformations is present. This suggests that most of the hypervariable regions in immunoglobulins have one of a small discrete set of main-chain conformations that we call "canonical structures". Sequence variations at the sites not responsible for the conformation of a particular canonical structure will modulate the surface that it presents to an antigen.
Prior to this analysis, attempts to model the combining sites of antibodies of unknown structure have been based on the assumption that hypervariable regions of the same size have similar backbone structures (see section 12, below). As we show below, and as has been realized in part before, this is true only in certain instances. Modelling based on the sets of residues identified here as responsible for the observed conformations of hypervariable regions would be expected to give more accurate results.
`2. Immunoglobulin Sequences and Structures
Kabat et al. (1983) have published a collection of the known immunoglobulin sequences. For the variable domain of the light chain (VL)† they list some 200 complete and 400 partial sequences; for the variable domain of the heavy chain (VH) they list about 130 complete and 200 partial sequences. In this paper we use the residue numbering of Kabat et al. (1983), except in the few instances where the structural superposition of certain hypervariable regions gives an alignment different from that suggested by the sequence comparisons.
In Table 1 we list the immunoglobulins of known structure for which atomic co-ordinates are available from the Protein Data Bank (Bernstein et al., 1977), and give the references to the crystallographic analyses. Amzel & Poljak (1979), Marquart & Deisenhofer (1982) and Davies & Metzger (1983) have written reviews of the molecular structure of immunoglobulins.
The VL and VH domains have homologous structures (for references, see Table 1). Each contains two large β-pleated sheets that pack face to face with their main chains about 10 Å apart (1 Å = 0·1 nm) and inclined at an angle of -30° (Fig. 1). The β-sheets of each domain are linked by a conserved disulphide bridge. The antibody binding site is formed by the six hypervariable regions; three in VL and three in VH. These regions link strands of the β-sheets. Two link strands that are in different β-sheets. The other four are hair-pin turns: peptides that link two adjacent strands in the same β-sheet (Fig. 2). Sibanda & Thornton (1985) and Efimov (1986) have described how the conformations of small and medium-sized hair-pin turns depend primarily on the length and sequence of the turn. Thornton et al. (1985) pointed out that the sequence-conformation rules for hair-pin turns can be used for modelling antibody combining sites. The results of these authors, and our own unpublished work on the conformations of hair-pin turns, are summarized in Table 2.
3. The Conserved β-Sheet Framework
Comparisons of the structures determined showed that the framework regions of different molecules are very similar
† Abbreviations used: VL and VH, variable regions of the immunoglobulin light and heavy chains, respectively; r.m.s., root-mean-square; CDR, complementarity-determining region.
Table 1
Immunoglobulin variable domains of known atomic structure
Protein         Reference
Fab NEWM        Saul et al. (1978)
Fab MCPC603     Segal et al. (1974)
Fab KOL         Marquart et al. (1980)
Fab J539        Suh et al. (1986)
VL REI          Epp et al. (1975)
VL RHE          Furey et al. (1983)
`The Structure of Hypervariable Regions
Figure 2. A drawing of the arrangement of the hypervariable regions in immunoglobulin binding sites. The squares indicate the position of residues at the ends of the β-sheet strands in the framework regions.
framework of 79 residues
(Fig. 3(b)). For different pairs of VH domains the
r.m.s. difference in the position of the main-chain
`atoms is between 0·64 and l ·-1.:l A.
The combined β-sheet framework consists of VL residues 4 to 6, 9 to 13, 19 to 25, 33 to 49, 53 to 55, 61 to 76, 84 to 90, 97 to 107 and VH residues 3 to 12, 17 to 25, 33 to 52, 56 to 60, 68 to 82, 88 to 95 and 102 to 112. A fit of the main-chain atoms of these 156 residues in the four known Fab structures gives r.m.s. differences in atomic positions of main-chain atoms of:
NEWM
1·15 Å
1·47 Å
1·14 Å
1·37 Å
1·03 Å
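The residue ranges that define the framework can be checked by simple counting. The sketch below is illustrative only and is not part of the program system used for this work; the names `VL_FRAMEWORK`, `expand` and `count_positions` are assumptions introduced here. It treats each range as a run of integer positions, so any letter-suffixed insertion positions of the Kabat numbering are not handled.

```python
# Illustrative sketch: count the positions covered by the VL framework ranges
# quoted in the text. Only integer residue numbers are handled; letter-suffixed
# insertion positions (e.g. 30a) are outside the scope of this example.

VL_FRAMEWORK = [(4, 6), (9, 13), (19, 25), (33, 49),
                (53, 55), (61, 76), (84, 90), (97, 107)]


def expand(ranges):
    """Return the sorted list of integer positions covered by inclusive ranges."""
    positions = []
    for first, last in ranges:
        positions.extend(range(first, last + 1))
    return sorted(positions)


def count_positions(ranges):
    """Number of integer positions covered by the inclusive ranges."""
    return len(expand(ranges))


if __name__ == "__main__":
    print(count_positions(VL_FRAMEWORK))
```

For the VL ranges this count is 69, which agrees with the size of the VL framework given in this section.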
The major determinants of the tertiary structure of the framework are the residues buried within and between the domains. We calculated the accessible surface area (Lee & Richards, 1971) of each residue in the Fab and VL structures. In Table 4 we list the residues commonly buried within the VL and VH domains and in the interface between them. These are essentially the same as those identified by Padlan (1977a) as buried within the then known structures and conserved in the then known sequences. Examination of the 200 to 700 VL sequences and 130 to 300 VH sequences in the Tables of Kabat et al. (1983) shows that in nearly all the sequences listed there the residues at these positions are identical with, or very similar to, those in the known structures.
There are two positions in the VL sequences at which the nature of the conserved residues depends on the chain class. In Vλ sequences, the residues at positions 71 and 90 are usually Ala and Ser/Ala, respectively; in Vκ sequences the corresponding residues are usually Tyr/Phe and Gln/Asn. These residues make contact with the hypervariable loops and play a role in determining the conformation of
Figure 1. The structure of an immunoglobulin V domain. The drawing is of KOL VL. Strands of β-sheet are represented by ribbons. The three hypervariable regions are labelled L1, L2 and L3. L2 and L3 are hairpin loops that link adjacent β-sheet strands. L1 links two strands that are part of different β-sheets. The VH domains and their hypervariable regions, H1, H2 and H3, have homologous structures. The domain is viewed from the β-sheet that forms the VL-VH interface. The arrangement of the 6 hypervariable regions that form the antibody binding site is shown in Figure 2.
(Padlan & Davies, 1975). The structural similarities of the frameworks of the variable domains were seen as arising from the tendency of residues that form the interiors of the domains to be conserved, and from the conservation of the total volume of the interior residues (Padlan, 1977a, 1979). In addition, the residues that form the central region of the interface between VL and VH domains were observed to be strongly conserved (Poljak et al., 1975; Padlan, 1977b) and to pack with very similar geometries (Chothia et al., 1985).
In this section we define and describe the exact extent of the structurally similar framework regions in the known Fab and VL structures. This was determined by optimally superposing the main-chain atoms of the known structures (Table 1) and calculating the differences in position of atoms in homologous residues†.
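The quantity reported throughout this paper is the root-mean-square (r.m.s.) difference in position of corresponding atoms. The following is a minimal sketch of that final step only, under the assumption that the two sets of main-chain atoms have already been optimally superposed and are given in the same order; the superposition itself is not shown. The function name `rms_difference` and the coordinate representation are assumptions introduced here and are not taken from the program system cited in the footnote.

```python
import math


def rms_difference(coords_a, coords_b):
    """r.m.s. difference in position between two equal-length lists of
    (x, y, z) coordinates in angstroms.

    The two sets are assumed to be already superposed and listed in the
    same atom order; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between the pair of corresponding atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates: every atom displaced by 1 angstrom along z.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rms_difference(a, b))
```

Because the squared displacements are averaged before the square root is taken, a few atoms with large displacements raise the value more than many atoms with small ones.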
In Figure 3(a) we give a plan of the β-sheet framework that, on the basis of the superpositions, is common to all six VL structures. It contains 69 residues. The r.m.s. difference in the position of the main-chain atoms of these residues is small for all pairs of VL domains; the values vary between 0·50 and 1·61 Å (Table 3A). The four VH domains share a
† For these and other calculations we used a program system written by one of us (see Lesk, 1986).
`Table 2
Conformation of hair-pin turns
`+55 +35 +85 _5d
`+65 -12.'i -105 +10'
`+711 -1 15 -90
`+50 + 45 +85 -20•
`-1 35 + 175
`+60 +20 +85 +:?;;'
`- 50 -35 -95 -1 0 +145 + l51i
`- 75
`- 10
`- 95 -50 -105
`0 +85 - 160
`+;;o +55 +65 -50 -130 -~
`-90 + 130
`-60 -:!.';
`- 90
`0 +85 +JO
`- 65
`- 30
`- 65
`- 45 -95 -5 +70 +35
`:! 3
`:-. c:. <: - x
`I x. (:. x. x
`1: : :-1 X- X- (:. X
`x. x. x. x
`X- X·X · G
`.j 5•
`I 2 3
`:?/ "-.i x x x x (;
`__ _ .,
`1-- - - x x x x x
`2 3 4 5•
`x X· X X· X
`l /
`·~ --5
`:! 3
`:! -
`.j 5 6'
`5 x x x. x x. x
The data in this Table are from an unpublished analysis of proteins whose atomic structure has been determined at a resolution of 2 Å or higher. The conformations described here for the 2-residue X-X-X-G turn and the 3-residue turns are new. The other conformations have been described by Sibanda & Thornton (1985) and by Efimov (1986). We list only conformations found more than once.
a X indicates no residue restriction except that certain sites cannot have Pro, as this residue requires a φ value of about -60° and cannot form a hydrogen bond to its main-chain nitrogen.
b Residues whose φ,ψ values are not given have a β conformation.
c Frequencies are given as n1/n2, where n2 is the number of cases where we found the structure in column 1 with the sequence in column 2 and n1 the number of these cases that have the conformation in column 3. Except for the frequencies in brackets, data is given only for non-homologous proteins.
d,e,f These are type I', II' and III' turns.
g Different conformations are found for the single cases of X-D-G-X-X and X-G-X-G-X.
h Different conformations are found for the single cases of X-X-X-X-X, X-G-G-X-X and X-G-X-X-G. The 2 cases of X-X-X-[...] have different conformations.
i Different conformations are found for the 2 cases of X-G-X-X-X-X.
these loops. This is discussed in sections 5 and 7, below.
The conservation of the framework structure extends to the residues immediately adjacent to the hypervariable regions. If the conserved frameworks of a pair of molecules are superposed, the differences in the positions of these residues is in most cases less than 1 Å and in all but one case less than 1·8 Å (Table 5). In contrast, residues in the hypervariable region adjacent to the conserved framework can differ in position by 3 Å or more.
The six loops, whose main-chain conformations vary and which are part of the antibody combining site, are formed by residues 26 to 32, 50 to 52 and 91 to 96 in VL domains, and 26 to 32, 53 to 55 and 96 to 101 in the VH domains (L1, L2, L3, H1, H2 and H3, respectively). Their limits differ from those of the complementarity-determining regions defined by Kabat et al. (1983) on the basis of sequence variability: residues 24 to 34, 50 to 56 and 89 to 97 in VL and 31 to 35, 50 to 65 and 95 to 102 in VH. This point is discussed in section 11, below.
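The two sets of limits quoted above can be compared position by position. The sketch below is illustrative only; the dictionary and function names are assumptions introduced here. It records the structural loop limits and the complementarity-determining region limits as inclusive ranges and reports, for each region, whether the structurally defined loop lies wholly inside the corresponding sequence-defined region.

```python
# Inclusive residue limits as quoted in the text.
STRUCTURAL_LOOPS = {
    "L1": (26, 32), "L2": (50, 52), "L3": (91, 96),
    "H1": (26, 32), "H2": (53, 55), "H3": (96, 101),
}
KABAT_CDRS = {
    "L1": (24, 34), "L2": (50, 56), "L3": (89, 97),
    "H1": (31, 35), "H2": (50, 65), "H3": (95, 102),
}


def lies_within(inner, outer):
    """True if the inclusive range `inner` is contained in `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def compare_limits(loops, cdrs):
    """Map each region name to whether its loop lies within its CDR."""
    return {name: lies_within(loops[name], cdrs[name]) for name in loops}


if __name__ == "__main__":
    for name, inside in compare_limits(STRUCTURAL_LOOPS, KABAT_CDRS).items():
        print(name, "within CDR" if inside else "extends outside CDR")
```

With these limits, five of the six loops fall inside the corresponding complementarity-determining region; the exception is H1, whose structural limits (26 to 32) begin before the sequence-defined region (31 to 35).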
`4. Conformation of the L1
`Hypervariable Regions
In the known VL structures the conformations of the L1 regions, residues 26 to 32, are characteristic of the class of the light chain. In Vλ domains their conformation is helical and in the Vκ domains it is extended (Padlan et al., 1977; Padlan, 1977b; de la Paz et al., 1986). These conformational differences are the result of sequence differences in both the L1 region and the framework (Lesk & Chothia, 1982).
(a) Vλ domains
Figure 4 shows the conformation of the L1 regions of the Vλ domains. The L1 regions in RHE
Figure 3. Plan of the β-sheet framework that is conserved in the VL and VH domains of the immunoglobulins of known atomic structure.
and KOL contain nine residues designated 26 to 30, 30a, 30b, 31 to 32; NEWM has one additional residue. The L1 regions in RHE and KOL have the same conformation: their main-chain atoms have a r.m.s. difference in position of 0·28 Å. Superposition of the L1 region of NEWM with those of KOL and RHE shows that the additional residue is inserted between residues 30b and 31 and has little effect on the conformation of the rest of the region: superpositions of the main-chain atoms of 26 to 30b and 31 to 32 in NEWM to 26 to 32 in KOL and RHE give r.m.s. differences in position of 0·96 Å and 1·~5 Å. Thus, the sequence alignment for the Vλ L1 regions of KOL, RHE and NEWM implied by the structural superposition is:
        26   27   28   29   30   30a  30b  30c  31   32
RHE     Ser  Ala  Thr  Asp  Ile  Gly  Ser   -   Asn  Ser
KOL     Thr  Ser  Ser  Asn  Ile  Gly  Ser   -   Ile  Thr
NEWM    Ser  Ser  Ser  Asn  Ile  Gly  Ala  Gly  Asn  His
In all three structures, residues 26 to 29 form a type I turn with a hydrogen bond between the carbonyl of 26 and the amide of 29. Residues 27 to 30b form an irregular helix (Fig. 4). This helix lies across the top of the β-sheet core. The side-chain of residue 30 penetrates deep into the core, occupying a cavity between residues 25, 33 and 71. The major determinant of the conformation of L1 in the observed structures is the packing of residues 25, 30, 33 and 71. Vλ RHE, KOL and NEWM have the
`Celltrion, Inc., Exhibit 1062


Table 3
Differences in immunoglobulin framework structures (Å)
For pairs of V domains we give the r.m.s. difference in the atomic positions of framework main-chain atoms after optimal superposition.
A. VL domains
Framework residues are 4 to 6, 9 to 13, 19 to 25, 33 to 49, 53 to 55, 61 to 76, 84 to 90 and 97 to 107.
NEWM REI
1·13
1·23
1·24
MCPC603
B. VH domains
Framework residues are 3 to 12, 17 to 25, 33 to 52, 56 to 60, 68 to 82, 88 to 95 and 102 to 112.
NEWM MCPC603
MCPC603
same residues at these sites: Gly25, Ile30, Val33 and Ala71. (Another L1 residue, Asp29 or Asn29, is buried by the contacts it makes with L3.)
Kabat et al. (1983) listed 33 human Vλ domains for which the sequences of the L1 regions are known. The 21 sequences in subgroups I, II, V and VI have L1 regions that are the same length as those found in RHE, KOL or NEWM. Of these, 18 conserve the residues responsible for the observed conformation:
`Residue in
`18 \', sequences
`7 1
`l it
`I>!( ;r,.
`17 V~l. l llt
`17 \"al. I lie
`11 Asp. 6 Asn, I Ser
The conservation of these residues implies that these 18 L1 regions have a conformation that is the same as that in RHE, KOL or NEWM.
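The comparison made here, between the residues of a sequence of unknown structure and the residues found at the packing sites in the known structures, can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the names `LAMBDA_L1_SIGNATURE`, `mismatches` and `has_signature` are introduced here, the example sequence is hypothetical, and the test is for exact identity, which is stricter than the "identical or very similar" criterion applied in the text.

```python
# Residues at the L1 packing sites in V-lambda RHE, KOL and NEWM, as stated
# in the text (position -> three-letter residue name).
LAMBDA_L1_SIGNATURE = {25: "Gly", 30: "Ile", 33: "Val", 71: "Ala"}


def mismatches(residues, signature):
    """Return {position: (expected, found)} for every signature site at which
    `residues` differs. Sites absent from `residues` are reported with
    found=None."""
    differing = {}
    for position, expected in signature.items():
        found = residues.get(position)
        if found != expected:
            differing[position] = (expected, found)
    return differing


def has_signature(residues, signature):
    """True if every signature site carries exactly the expected residue."""
    return not mismatches(residues, signature)


if __name__ == "__main__":
    # Hypothetical sequence fragment, given as position -> residue.
    candidate = {25: "Gly", 29: "Asn", 30: "Val", 33: "Val", 71: "Ala"}
    print(has_signature(candidate, LAMBDA_L1_SIGNATURE))
    print(mismatches(candidate, LAMBDA_L1_SIGNATURE))
```

Reporting the differing sites, rather than only a yes or no answer, makes it possible to judge by inspection whether a substitution is conservative.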
Subgroups III and IV have 13 sequences for which the L1 regions are known (Kabat et al., 1983). These regions are shorter than those in RHE and KOL and in the other Vλ subgroups. They also have a quite different pattern of conserved residues.
Kabat et al. (1983) listed 29 mouse Vλ domains for which the sequence of the L1 region is known. These L1 regions are the same size as that in NEWM. They also have a pattern of residue conservation similar to, but not identical with, that in KOL/NEWM: Ser at position 25, Val at 30, Ala at 33 and Ala at 71. This suggests that the fold of
Table 4
Residues commonly buried within VL and VH domains
`r., domains
`\"L domains
`Residues in
`A.k . ..\.0
`Posi tion
`Residue~ in
`1:. A.X
`L.l. \\"
`t; A
`.\ .F.Y
`( '
`A S .(J.:'\
`\ ".T .1:
`L.\ ·
`5 1
`I ,\ ·
`T .S
`I I
a Mean accessible surface area (A.S.A.) of the residues in the Fab structures NEWM, MCPC603, KOL and J539 and in the VL structures REI and RHE.
the mouse Vλ L1 regions is a distorted version of that found in the known human structures.
The number of residues in the L1 region in these sequences varies:
(b) Vκ domains
In Figure 5 we illustrate the conformation of the L1 regions in the three known Vκ structures: J539, REI and MCPC603. In J539 L1 has six residues, in REI it has seven and in MCPC603 13. The L1 region of J539 has an extended conformation. In REI, residues 26 to 28 have an extended conformation and 29 to 32 form a distorted type II turn. The six additional residues in MCPC603 all occur in the region of this turn (Fig. 5). In the three structures the main chain of residues 26 to 29 and 32 have the same conformation. A fit of the main-chain atoms of these residues in J539, REI and MCPC603 gives r.m.s. differences in position of 0·47 to 1·03 Å. The sequence alignment implied by the structural superposition is:
Residue size of L1
Number of human Vκ
Number of mouse Vκ
`I I
The conservation of residues at the positions buried between L1 and the framework implies that in the large majority of Vκ domains residues 26 to 29 have a conformation close to that found in the known structures and that the remaining residues, if small in number, form a turn or, if large, a hair-pin loop.
`5. Conformation of the L2
`Hypervariable Regions
The L2 regions have the same conformation in
the known structures (Padlan et al., 1977; Padlan,
`Set Olu Asp
`Ser Ulu
`Leu Leu Asn
`In J539, REI and MCP(.'603. residues 26 to :rn
extend across the top of the β-sheet framework with one, 29, buried within it. The main contacts of 29 are with residues 2, 25, 33 and 71. The penetration of residue 29 into the interior of the framework is not as great as that of residue 30 in the Vλ domains, and the deep cavity that exists in Vλ domains is filled in Vκ domains by the large side-chain of the residue at position 71. In J539, REI and MCPC603, the residues involved in the packing of L1 (2, 25, 29, 33 and 71) are very similar: Ile, Ala/Ser, Val/Ile/Leu, Leu and Tyr/Phe, respectively.
The six residues 30 to 30f in MCPC603 form a hair-pin loop that extends away from the domain (Fig. 5) and does not have a well-ordered conformation (Segal et al., 1974).
Kabat et al. (1977) noted that residues at certain positions in the L1 regions of the Vκ sequences then known were conserved, and suggested that they have a structural role. The structural role of residues at positions 25, 29 and 33 is confirmed by the above analysis of the Vκ structures and the pattern of residue conservation in the much larger number of sequences known now. Kabat et al.
`(1983) listed 65 human and llH- mou~e \'" sequenc·e><
for which the residues between positions 2 and 33 are known. For about half of these, the residue at position 71 is also known. These data show that there are 59 human and 148 mouse sequences that have residues very similar to those in the known structures at the sites involved in the packing of
`.. ~\:-.:11 Glu Lys A sn Phe
1977b; de la Paz et al., 1986) except for NEWM, where it is deleted. We find that the similarities in the L2 structures arise from the conformational requirements of a three-residue turn and the conservation of the framework residues against which L2 packs.
In the known structures L2 consists of three residues, 50 to 52:
`~H 'PC603
`.\ :<II
`~ ... r
These three residues link two adjacent strands in the framework β-sheet. Residues 49 and 53 are hydrogen bonded to each other so that the L2 region is a three-residue hair-pin turn (Fig. 6).
`/ ""'
`49= = =53
The conformations of L2 in the five structures are very similar: r.m.s. differences in position of their main-chain atoms are between 0·1 and 0·97 Å. The only difference among the conformations is in the orientation of the peptide between residues 50 and 51. In MCPC603 this difference is associated with the Gly residue at position 50. The side-chains of L2 all point towards the surface. The main-chain packs
`. J.~39/REl/~lf 'l'f ·1;u:i
`Human\" •
`Ala Ser
`Val Ile Leu
`Tyr Phe
`;,7 lie. I Met. 1 \'al
`!'d Ala, 7 Ser
`ao Ile. ti \'al. x Leu
`;;; Leu. t \"al
`:.'X Phe. I Tyr
`1:34 lie. 14 \"al
`1(14 .\la. 4 ~r
`Ml Leu . . ii \"al. :38 llt-
`$14 Lo·u. 44 ~It-I , i \"al. 3 llt(cid:173)
`!H Plw. ~Ii 1\r
Table 5
Differences in the positions of the framework residues adjacent to the hypervariable regions in immunoglobulin structures
Adjacent framework
Differences in position (Å)
`fl·a 1·2
`I~~ J.4
`0.3-1 ·~
`II·~ J.i
Figure 4. The conformation of the L1 region of Vλ KOL. The side-chain of Ile30 is buried within the framework structure: see section 4.
against the conserved framework residues Ile47 and Gly64/Ala64 (Fig. 6).
Kabat et al. (1983) give the sequences of the L2 regions of 174 VL domains. In all cases they are three residues in length. Of the 174, 122 do not contain Gly and 49 have, like MCPC603, a Gly residue at position 50. The residues at position 48 and 64 are almost absolutely conserved as Ile and Gly. These size and sequence identities imply that almost all L2 regions have a conformation close to that found in the known structures.
`6. Conformation of the L3
`Hypervariable Regions
The L3 region, residues 91 to 96, forms the link between two adjacent strands of β-sheet. Our analysis of the structures and sequences known for this region suggests that the large majority of κ chains have a common conformation that is quite different from the conformations found in λ chains.
(a) Vλ domains
The L3 region of Vλ NEWM has six residues and those of KOL and RHE have eight. Superposition of the three regions gives the following alignment:
`Tvr Asp Arg
`T;.p Asn
`Trp Asn
`Ser Asp
`Ser Leu
`Ser Leu A'l?
`. .\~n $er Trr
`..\>p C:lu Pio
Figure 5. The conformation of the L1 regions of Vκ MCPC603, Vκ REI and Vκ J539. Residues 26 to 29 and 32 have the same conformation in the 3 structures. The side-chain of residue 29 is buried within the framework structure; see section 4.
Figure 6. The conformation of the L2 region of Vλ KOL. This region packs against framework residues Ile47 and Gly64.
Figure 7. The conformation of the L3 region of Vκ REI. The conformation is stabilized by the hydrogen bonds made by the framework residue Gln90 and by the cis conformation of the peptide of Pro95.

In all three Vλ structures, residues 91 to 92 and 95 to 96 form an extension of the β-sheet framework with main-chain hydrogen bonds between residues 92 and 95:
`_ 94
`93 _
`90= = =97
`93A.- -93h
`92= = =95
`90= = =97
Residues 93 and 94 in NEWM form a two-residue type II' turn (see Table 2). Residues 93, 93a, 93b and 94 in RHE and KOL form a four-residue turn with the same conformation: the r.m.s. difference in the position of their main-chain atoms is 0·19 Å. This conformation is found in almost all four-residue turns that, like KOL and RHE, have Gly or Asn in the fourth position of the turn, position 94 here (Sibanda & Thornton, 1985; Efimov, 1986; and see Table 2).
Kabat et al. (1983) listed 27 human and 25 mouse Vλ domains for which the sequence of the whole of the third hypervariable region is known. The distribution of sizes of the L3 region in these sequences is:
Residue size
Number of human Vλ
Number of mouse Vλ
In the L3 regions with six residues we would expect, as in NEWM, 91 to 92 and 95 to 96 to continue the β-sheet of the framework and 93 to 94 to form a two-residue hair-pin turn. Rules relating the sequence and conformation of two-residue turns

