`Canonical Structures for the Hypervariable Regions
`of Immunoglobulins
`
`Cyrus Chothia and Arthur M. Lesk
`
`PETITIONER'S EXHIBITS
`
`Exhibit 1062 Page 1 of 18
`
`
`
J. Mol. Biol. (1987) 196, 901-917
`
`
Cyrus Chothia¹,² and Arthur M. Lesk¹,³†

¹MRC Laboratory of Molecular Biology
Hills Road, Cambridge CB2 2QH
England

²Christopher Ingold Laboratory
University College London
20 Gordon Street
London WC1H 0AJ, England

³EMBL Biocomputing Programme
Postfach 1022.09, Meyerhofstr. 1
D-6900 Heidelberg
Federal Republic of Germany

(Received 13 November 1986, and in revised form 23 April 1987)
`
We have analysed the atomic structures of Fab and VL fragments of immunoglobulins to determine the relationship between their amino acid sequences and the three-dimensional structures of their antigen binding sites. We identify the relatively few residues that, through their packing, hydrogen bonding or the ability to assume unusual φ, ψ or ω conformations, are primarily responsible for the main-chain conformations of the hypervariable regions. These residues are found to occur at sites within the hypervariable regions and in the conserved β-sheet framework.

Examination of the sequences of immunoglobulins of unknown structure shows that many have hypervariable regions that are similar in size to one of the known structures and contain identical residues at the sites responsible for the observed conformation. This implies that these hypervariable regions have conformations close to those in the known structures. For five of the hypervariable regions, the repertoire of conformations appears to be limited to a relatively small number of discrete structural classes. We call the commonly occurring main-chain conformations of the hypervariable regions "canonical structures". The accuracy of the analysis is being tested and refined by the prediction of immunoglobulin structures prior to their experimental determination.
`
† Also associated with Fairleigh Dickinson University, Teaneck-Hackensack Campus, Teaneck, NJ 07666.

© 1987 Academic Press Limited

1. Introduction

The specificity of immunoglobulins is determined by the sequence and size of the hypervariable regions in the variable domains. These regions produce a surface complementary to that of the antigen. The subject of this paper is the relation between the amino acid sequences of antibodies and the structure of their binding sites. The results we report are related to two previous sets of observations.

The first set concerns the sequences of the hypervariable regions. Kabat and his colleagues (Kabat et al., 1977; Kabat, 1978) compared the sequences of the hypervariable regions then known and found that, at 13 sites in the light chains and at seven positions in the heavy chains, the residues are conserved. They argued that the residues at these sites are involved in the structure, rather than the specificity, of the hypervariable regions. They suggested that these residues have a fixed position in antibodies and that this could be used in the model building of combining sites to limit the conformations and positions of the sites whose residues varied. Padlan (1979) also examined the sequences of the hypervariable region of light
`
chains. He found that residues that are part of the hypervariable regions, and that are buried within the domains in the known structures, are conserved. The residues he found conserved in Vλ sequences were different to those conserved in Vκ sequences.

The second set of observations concerns the conformation of the hypervariable regions. The results of the structure analysis of Fab and Bence-Jones proteins (Saul et al., 1978; Segal et al., 1974; Marquart et al., 1980; Suh et al., 1986; Schiffer et al., 1973; Epp et al., 1975; Fehlhammer et al., 1975; Colman et al., 1977; Furey et al., 1983) show that in several cases hypervariable regions of the same size, but with different sequences, have the same main-chain conformation (Padlan & Davies, 1975; Fehlhammer et al., 1975; Padlan et al., 1977; Padlan, 1977b; Colman et al., 1977; de la Paz et al., 1986). Details of these observations are given below.
In this paper, from an analysis of the immunoglobulins of known atomic structure we determine the limits of the β-sheet framework common to the known structures (see section 3 below). We then identify the relatively few residues that, through packing, hydrogen bonding or the ability to assume unusual φ, ψ or ω conformations, are primarily responsible for the main-chain conformations observed in the hypervariable regions (see sections 4 to 9, below). These residues are found to occur at sites within the hypervariable regions and in the conserved β-sheet framework. Some correspond to residues identified by Kabat et al. (1977) and by Padlan (Padlan et al., 1977; Padlan, 1979) as being important for determining the conformation of hypervariable regions.
Examination of the sequences of immunoglobulins of unknown structure shows that in many cases the set of residues responsible for one of the observed hypervariable conformations is present. This suggests that most of the hypervariable regions in immunoglobulins have one of a small discrete set of main-chain conformations that we call "canonical structures". Sequence variations at the sites not responsible for the conformation of a particular canonical structure will modulate the surface that it presents to an antigen.
Prior to this analysis, attempts to model the combining sites of antibodies of unknown structure have been based on the assumption that hypervariable regions of the same size have similar backbone structures (see section 12, below). As we show below, and as has been realized in part before, this is true only in certain instances. Modelling based on the sets of residues identified here as responsible for the observed conformations of hypervariable regions would be expected to give more accurate results.
`
`2. Immunoglobulin Sequences and Structures
`
Kabat et al. (1983) have published a collection of the known immunoglobulin sequences. For the variable domain of the light chain (VL)† they list some 200 complete and 400 partial sequences; for the variable domain of the heavy chain (VH) they list about 130 complete and 200 partial sequences. In this paper we use the residue numbering of Kabat et al. (1983), except in the few instances where the structural superposition of certain hypervariable regions gives an alignment different from that suggested by the sequence comparisons.

In Table 1 we list the immunoglobulins of known structure for which atomic co-ordinates are available from the Protein Data Bank (Bernstein et al., 1977), and give the references to the crystallographic analyses. Amzel & Poljak (1979), Marquart & Deisenhofer (1982) and Davies & Metzger (1983) have written reviews of the molecular structure of immunoglobulins.
The VL and VH domains have homologous structures (for references, see Table 1). Each contains two large β-pleated sheets that pack face to face with their main chains about 10 Å apart (1 Å = 0.1 nm) and inclined at an angle of -30° (Fig. 1). The β-sheets of each domain are linked by a conserved disulphide bridge. The antibody binding site is formed by the six hypervariable regions; three in VL and three in VH. These regions link strands of the β-sheets. Two link strands that are in different β-sheets. The other four are hair-pin turns: peptides that link two adjacent strands in the same β-sheet (Fig. 2). Sibanda & Thornton (1985) and Efimov (1986) have described how the conformations of small and medium-sized hair-pin turns depend primarily on the length and sequence of the turn. Thornton et al. (1985) pointed out that the sequence-conformation rules for hair-pin turns can be used for modelling antibody combining sites. The results of these authors, and our own unpublished work on the conformations of hair-pin turns, are summarized in Table 2.
`
† Abbreviations used: VL and VH, variable regions of the immunoglobulin light and heavy chains, respectively; r.m.s., root-mean-square; CDR, complementarity-determining region.

Table 1
Immunoglobulin variable domains of known atomic structure

Protein        Chain type        Reference
               L       H
Fab NEWM       λI      II        Saul et al. (1978)
Fab MCPC603    κ       I         Segal et al. (1974)
Fab KOL        λI      III       Marquart et al. (1980)
Fab J539       κ       III       Suh et al. (1986)
VL REI         κ                 Epp et al. (1975)
VL RHE         λI                Furey et al. (1983)

Figure 1. The structure of an immunoglobulin V domain. The drawing is of KOL VL. Strands of β-sheet are represented by ribbons. The three hypervariable regions are labelled L1, L2 and L3. L2 and L3 are hairpin loops that link adjacent β-sheet strands. L1 links two strands that are part of different β-sheets. The VH domains and their hypervariable regions, H1, H2 and H3, have homologous structures. The domain is viewed from the β-sheet that forms the VL-VH interface. The arrangement of the 6 hypervariable regions that form the antibody binding site is shown in Figure 2.

Figure 2. A drawing of the arrangement of the hypervariable regions in immunoglobulin binding sites. The squares indicate the position of residues at the ends of the β-sheet strands in the framework regions.

3. The Conserved β-Sheet Framework

Comparisons of the first immunoglobulin structures determined showed that the framework regions of different molecules are very similar (Padlan & Davies, 1975). The structural similarities of the frameworks of the variable domains were seen as arising from the tendency of residues that form the interiors of the domains to be conserved, and from the conservation of the total volume of the interior residues (Padlan, 1977a, 1979). In addition, the residues that form the central region of the interface between VL and VH domains were observed to be strongly conserved (Poljak et al., 1975; Padlan, 1977b) and to pack with very similar geometries (Chothia et al., 1985).

In this section we define and describe the exact extent of the structurally similar framework regions in the known Fab and VL structures. This was determined by optimally superposing the main-chain atoms of the known structures (Table 1) and calculating the differences in position of atoms in homologous residues‡.

‡ For these and other calculations we used a program system written by one of us (see Lesk, 1986).

In Figure 3(a) we give a plan of the β-sheet framework that, on the basis of the superpositions, is common to all six VL structures. It contains 69 residues. The r.m.s. difference in the position of the main-chain atoms of these residues is small for all pairs of VL domains; the values vary between 0.50 and 1.61 Å (Table 3A). The four VH domains share a common β-sheet framework of 79 residues (Fig. 3(b)). For different pairs of VH domains the r.m.s. difference in the position of the main-chain atoms is between 0.64 and 1.42 Å.

The combined β-sheet framework consists of VL residues 4 to 6, 9 to 13, 19 to 25, 33 to 49, 53 to 55, 61 to 76, 84 to 90, 97 to 107 and VH residues 3 to 12, 17 to 25, 33 to 52, 56 to 60, 68 to 82, 88 to 95 and 102 to 112. A fit of the main-chain atoms of these 156 residues in the four known Fab structures gives r.m.s. differences in atomic positions of main-chain atoms of:

             NEWM      MCPC603   J539
KOL          1.39 Å    1.15 Å    1.1? Å
NEWM                   1.47 Å    1.37 Å
MCPC603                          1.03 Å

The major determinants of the tertiary structure of the framework are the residues buried within and between the domains. We calculated the accessible surface area (Lee & Richards, 1971) of each residue in the Fab and VL structures. In Table 4 we list the residues commonly buried within the VL and VH domains and in the interface between them. These are essentially the same as those identified by Padlan (1977a) as buried within the then known structures and conserved in the then known sequences. Examination of the 200 to 700 VL sequences and 130 to 300 VH sequences in the Tables of Kabat et al. (1983) shows that in nearly all the sequences listed there the residues at these positions are identical with, or very similar to, those in the known structures.

There are two positions in the VL sequences at which the nature of the conserved residues depends on the chain class. In Vλ sequences, the residues at positions 71 and 90 are usually Ala and Ser/Ala, respectively; in Vκ sequences the corresponding residues are usually Tyr/Phe and Gln/Asn. These residues make contact with the hypervariable loops and play a role in determining the conformation of
`
`
Table 2
Conformation of hair-pin turns

[Table body not recoverable from the scanned source. Columns: Structure; Sequence (a); Conformation (b) (φ, ψ in °); Frequency (c).]

The data in this Table are from an unpublished analysis of proteins whose atomic structure has been determined at a resolution of 2 Å or higher. The conformations described here for the 2-residue X-X-X-G turn and the 3-residue turns are new. The other conformations have been described by Sibanda & Thornton (1985) and by Efimov (1986). We list only conformations found more than once.
(a) X indicates no residue restriction except that certain sites cannot have Pro, as this residue requires a φ value of -60° and cannot form a hydrogen bond to its main-chain nitrogen.
(b) Residues whose φ,ψ values are not given have a β conformation.
(c) Frequencies are given as n2/n1, where n1 is the number of cases where we found the structure in column 1 with the sequence in column 2 and n2 the number of these cases that have the conformation in column 3. Except for the frequencies in brackets, data is given only for non-homologous proteins.
(d, e, f) These are type I', II' and III' turns.
(g) Different conformations are found for the single cases of X-D-G-X-X and X-G-X-G-X.
(h) Different conformations are found for the single cases of X-X-N-X-X, X-G-G-X-X and X-G-X-X-G. The 2 cases of X-X-X-X-X have different conformations.
(i) Different conformations are found for the 2 cases of X-G-X-X-X-X.
`
these loops. This is discussed in sections 5 and 7, below.

The conservation of the framework structure extends to the residues immediately adjacent to the hypervariable regions. If the conserved frameworks of a pair of molecules are superposed, the differences in the positions of these residues are in most cases less than 1 Å and in all but one case less than 1.8 Å (Table 5). In contrast, residues in the hypervariable region adjacent to the conserved framework can differ in position by 3 Å or more.

The six loops, whose main-chain conformations vary and which are part of the antibody combining site, are formed by residues 26 to 32, 50 to 52 and 91 to 96 in VL domains, and 26 to 32, 53 to 55 and 96 to 101 in the VH domains: L1, L2, L3, H1, H2 and H3, respectively. Their limits are somewhat different from those of the complementarity-determining regions defined by Kabat et al. (1983) on the basis of sequence variability: residues 24 to 34, 50 to 56 and 89 to 97 in VL and 31 to 35, 50 to 65 and 95 to 102 in VH. This point is discussed in section 11, below.
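The loop and framework limits quoted above lend themselves to a simple lookup. The following Python sketch is illustrative only: the names `VL_FRAMEWORK`, `VL_LOOPS`, `classify_vl` and `framework_size` are assumptions introduced for this example, only the VL domain is covered, and insertion letters (such as 30a) are ignored for simplicity.

```python
# Structural framework and loop limits for VL domains, taken from the
# residue ranges quoted in the text (numbering of Kabat et al., 1983).
VL_FRAMEWORK = [(4, 6), (9, 13), (19, 25), (33, 49), (53, 55),
                (61, 76), (84, 90), (97, 107)]
VL_LOOPS = {"L1": (26, 32), "L2": (50, 52), "L3": (91, 96)}


def classify_vl(position: int) -> str:
    """Return 'L1', 'L2', 'L3', 'framework' or 'other' for a VL residue number."""
    for name, (start, end) in VL_LOOPS.items():
        if start <= position <= end:
            return name
    for start, end in VL_FRAMEWORK:
        if start <= position <= end:
            return "framework"
    # Residues in neither set are outside both the conserved framework
    # and the structurally defined loops.
    return "other"


def framework_size(ranges) -> int:
    """Count the residues covered by a list of inclusive ranges."""
    return sum(end - start + 1 for start, end in ranges)
```

With these ranges, `framework_size(VL_FRAMEWORK)` evaluates to 69, which agrees with the number of VL framework residues stated in section 3.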
`
4. Conformation of the L1 Hypervariable Regions

In the known VL structures, the conformations of the L1 regions, residues 26 to 32, are characteristic of the class of the light chain. In Vλ domains their conformation is helical and in the Vκ domains it is extended (Padlan et al., 1977; Padlan, 1977b; de la Paz et al., 1986). These conformational differences are the result of sequence differences in both the L1 region and the framework (Lesk & Chothia, 1982).
`
Figure 3. Plan of the β-sheet framework that is conserved in the VL and VH domains of the immunoglobulins of known atomic structure.

(a) Vλ domains

Figure 4 shows the conformation of the L1 regions of the Vλ domains. The L1 regions in RHE and KOL contain nine residues designated 26 to 30, 30a, 30b, 31 to 32; NEWM has one additional residue. The L1 regions in RHE and KOL have the same conformation: their main-chain atoms have a r.m.s. difference in position of 0.28 Å. Superposition of the L1 region of NEWM with those of KOL and RHE shows that the additional residue is inserted between residues 30b and 31 and has little effect on the conformation of the rest of the region: superpositions of the main-chain atoms of 26 to 30b and 31 to 32 in NEWM to 26 to 32 in KOL and RHE give r.m.s. differences in position of 0.96 Å and 1.25 Å. Thus, the sequence alignment for the Vλ L1 regions of KOL, RHE and NEWM implied by the structural superposition is:

Position   26   27   28   29   30   30a  30b  30c  31   32
RHE        Ser  Ala  Thr  Asp  Ile  Gly  Ser  -    Asn  Ser
KOL        Thr  Ser  Ser  Asn  Ile  Gly  Ser  -    Ile  Thr
NEWM       Ser  Ser  Ser  Asn  Ile  Gly  Ala  Gly  Asn  His
`
In all three structures, residues 26 to 29 form a type I turn with a hydrogen bond between the carbonyl of 26 and the amide of 29. Residues 27 to 30b form an irregular helix (Fig. 4). This helix sits across the top of the β-sheet core. The side-chain of residue 30 penetrates deep into the core occupying a cavity between residues 25, 33 and 71. The major determinant of the conformation of L1 in the observed structures is the packing of residues 25, 30, 33 and 71. Vλ RHE, KOL and NEWM have the same residues at these sites: Gly25, Ile30, Val33 and Ala71. (Another L1 residue, Asp29 or Asn29, is buried by the contacts it makes with L3.)

Table 3
Differences in immunoglobulin framework structures (Å)

For pairs of V domains we give the r.m.s. difference in the atomic positions of framework main-chain atoms after optimal superposition.

A. VL domains
Framework residues are 4 to 6, 9 to 13, 19 to 25, 33 to 49, 53 to 55, 61 to 76, 84 to 90 and 97 to 107.

           KOL     NEWM    REI     MCPC603   J539
RHE        0.74    1.47    1.46    1.61      1.41
KOL                1.13    1.23    1.36      1.15
NEWM                       1.24    1.28      1.53
REI                                0.50      0.77
MCPC603                                      0.76

B. VH domains
Framework residues are 3 to 12, 17 to 25, 33 to 52, 56 to 60, 68 to 82, 88 to 95 and 102 to 112.

           NEWM    MCPC603   J539
KOL        1.42    0.64      0.89
NEWM               1.27      1.29
MCPC603                      0.89
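The r.m.s. differences reported in Table 3 are computed over equivalent main-chain atoms after optimal superposition. As a minimal sketch, the Python function below covers only the final r.m.s. step: it assumes that the two coordinate sets have already been superposed and list equivalent atoms in the same order. The function name `rms_difference` is an assumption of this example, and the code is not the program system of Lesk (1986).

```python
import math


def rms_difference(coords_a, coords_b):
    """Root-mean-square distance between paired atoms (same units as input).

    Both arguments are sequences of (x, y, z) tuples for equivalent atoms,
    assumed to be already optimally superposed.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between one pair of equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

A full comparison would first find the rotation and translation that minimise this quantity; that step is deliberately left out here.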
Kabat et al. (1983) listed 33 human Vλ domains for which the sequences of the L1 regions are known. The 21 sequences in subgroups I, II, V and VI have L1 regions that are the same length as those found in RHE, KOL or NEWM. Of these, 18 conserve the residues responsible for the observed conformations:

Residue     Residue in       Residues in
position    KOL/RHE/NEWM     18 Vλ sequences
25          Gly              18 Gly
30          Ile              17 Val, 1 Ile
33          Val              17 Val, 1 Ile
71          Ala              18 Ala
29          Asp/Asn          11 Asp, 6 Asn, 1 Ser

The conservation of these residues implies that these 18 L1 regions have a conformation that is the same as that in RHE, KOL or NEWM.
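The comparison above amounts to checking a sequence for the right loop length and for the key residues at the packing sites. A hedged Python sketch follows; the dictionary `LAMBDA_L1_SITES`, the allowed substitutions (taken from the variants listed for the 18 sequences tabulated above) and the function name `matches_lambda_l1` are assumptions of this illustration rather than part of the original analysis.

```python
# Residues allowed at the sites that determine the V-lambda L1 conformation.
# Each set holds the residue found in KOL/RHE/NEWM plus the variants seen in
# the 18 conserving sequences (treating those variants as acceptable is an
# assumption of this sketch).
LAMBDA_L1_SITES = {
    25: {"Gly"},
    30: {"Ile", "Val"},
    33: {"Val", "Ile"},
    71: {"Ala"},
}
# RHE and KOL have 9 residues in L1; NEWM has one additional residue.
LAMBDA_L1_LENGTHS = {9, 10}


def matches_lambda_l1(residues: dict, l1_length: int) -> bool:
    """True if loop length and all key-site residues fit the known structures.

    `residues` maps a residue position (int) to a three-letter residue name.
    """
    if l1_length not in LAMBDA_L1_LENGTHS:
        return False
    return all(residues.get(pos) in allowed
               for pos, allowed in LAMBDA_L1_SITES.items())
```

For example, the KOL residues (Gly25, Ile30, Val33, Ala71) with a nine-residue loop pass the check, whereas a sequence with Ser25 and Ala33 does not.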
Subgroups III and IV have 13 sequences for which the L1 regions are known (Kabat et al., 1983). These regions are shorter than those in RHE and KOL and in the other Vλ subgroups. They also have a quite different pattern of conserved residues.

Kabat et al. (1983) listed 29 mouse Vλ domains for which the sequence of the L1 region is known. These L1 regions are the same size as that in NEWM. They also have a pattern of residue conservation similar to, but not identical with, that in KOL/NEWM: Ser at position 25, Val at 30, Ala at 33 and Ala at 71. This suggests that the fold of the mouse Vλ L1 regions is a distorted version of that found in the known human structures.

Table 4
Residues commonly buried within VL and VH domains

VL domains                                VH domains
Position   Residues in   A.S.A.(a)        Position   Residues in   A.S.A.(a)
           known         (Å²)                        known         (Å²)
           structures                                structures
4          L,M           6                4          L             14
6          Q             12               6          Q,E           16
19         V             11               18         L             21
21         I,M           1                20         L             0
23         C             0                22         C             0
25         G,A,S         13               24         S,V,T,A       8
33         V,L           3                34         M,Y           4
35         W             0                36         W             0
37         Q             30               38         R             13
47         L,I,W         S                48         I,V           1
48         I             24               49         A,G           0
62         F             11               51         I,V,S         4
64         G,A           13               69         I,V,M         13
71         A,F,Y         2                78         L,F           0
73         L,F           0                80         L             0
75         I,V           0                82         M,L           0
82         D             4                86         D             2
84         A,S           11               88         A,G           3
86         Y             0                90         Y             0
88         C             0                92         C             0
90         A,S,Q,N       7                104        G             11
97         V,T,G         18               106        G             19
99         G             3                107        T,S           17
101        G             11               109        V             •7
102        T             1
104        L,V           2

(a) Mean accessible surface area (A.S.A.) of the residues in the Fab structures NEWM, MCPC603, KOL and J539 and in the VL structures REI and RHE.
`
Figure 4. The conformation of the L1 region of Vλ KOL. The side-chain of Ile30 is buried within the framework structure; see section 4.

(b) Vκ domains

In Figure 5 we illustrate the conformation of the L1 regions in the three known Vκ structures: J539, REI and MCPC603. In J539 L1 has six residues, in REI it has seven and in MCPC603 13. The L1 region of J539 has an extended conformation. In REI, residues 26 to 28 have an extended conformation and 29 to 32 form a distorted type II turn. The six additional residues in MCPC603 all occur in the region of this turn (Fig. 5). In the three structures the main chain of residues 26 to 29 and 32 have the same conformation. A fit of the main-chain atoms of these residues in J539, REI and MCPC603 gives r.m.s. differences in position of 0.47 to 1.03 Å. The sequence alignment implied by the structural superposition is:

Residue    26   27   28   29   30   31   31a  31b  31c  31d  31e  31f  32
J539       Ser  Ser  Ser  Val  Ser  -    -    -    -    -    -    -    Ser
REI        Ser  Glu  Asp  Ile  Ile  Lys  -    -    -    -    -    -    Tyr
MCPC603    Ser  Glu  Ser  Leu  Leu  Asn  Ser  Gly  Asn  Glu  Lys  Asn  Phe

In J539, REI and MCPC603, residues 26 to 29 extend across the top of the β-sheet framework with one, 29, buried within it. The main contacts of 29 are with residues 2, 25, 33 and 71. The penetration of residue 29 into the interior of the framework is not as great as that of residue 30 in the Vλ domains, and the deep cavity that exists in Vλ domains is filled in Vκ domains by the large side-chain of the residue at position 71. In J539, REI and MCPC603, the residues involved in the packing of L1 (2, 25, 29, 33 and 71) are very similar: Ile, Ala/Ser, Val/Ile/Leu, Leu and Tyr/Phe, respectively.

The six residues 30 to 30f in MCPC603 form a hair-pin loop that extends away from the domain (Fig. 5) and does not have a well-ordered conformation (Segal et al., 1974).

Kabat et al. (1977) noted that residues at certain positions in the L1 regions of the Vκ sequences then known were conserved, and suggested that they have a structural role. The structural role of residues at positions 25, 29 and 33 is confirmed by the above analysis of the Vκ structures and the pattern of residue conservation in the much larger number of sequences known now. Kabat et al. (1983) listed 65 human and 164 mouse Vκ sequences for which the residues between positions 2 and 33 are known. For about half of these, the residue at position 71 is also known. These data show that there are 59 human and 148 mouse sequences that have residues very similar to those in the known structures at the sites involved in the packing of L1:

Position   J539/REI/MCPC603   Human Vκ                    Mouse Vκ
2          Ile                57 Ile, 1 Met, 1 Val        134 Ile, 14 Val
25         Ala Ser            52 Ala, 7 Ser               104 Ala, 4 Ser
29         Val Ile Leu        30 Ile, 21 Val, 8 Leu       59 Leu, 51 Val, 38 Ile
33         Leu                57 Leu, 2 Val               94 Leu, 44 Met, 7 Val, 3 Ile
71         Tyr Phe            28 Phe, 1 Tyr               54 Phe, 26 Tyr

The number of residues in the L1 region in these sequences varies:

Residue size of L1     6    7    8    9    10   11   12   13
Number of human Vκ     -    38   14   -    -    1    4    -
Number of mouse Vκ     17   40   -    -    -    32   35   30

The conservation of residues at the positions buried between L1 and the framework implies that in the large majority of Vκ domains residues 26 to 29 have a conformation close to that found in the known structures and that the remaining residues, if small in number, form a turn or, if large, a hair-pin loop.

5. Conformation of the L2 Hypervariable Regions

The L2 regions have the same conformation in the known structures (Padlan et al., 1977; Padlan, 1977b; de la Paz et al., 1986) except for NEWM, where it is deleted. We find that the similarities in the L2 structures arise from the conformational requirements of a three-residue turn and the conservation of the framework residues against which L2 packs.

In the known structures L2 consists of three residues, 50 to 52:

Residue   RHE   KOL   REI   MCPC603   J539
50        Tyr   Arg   Glu   Gly       Glu
51        Asn   Asp   Ala   Ala       Ile
52        Asp   Ala   Ser   Ser       Ser

These three residues link two adjacent strands in the framework β-sheet. Residues 49 and 53 are hydrogen bonded to each other so that the L2 region is a three-residue hair-pin turn (Fig. 6).

      51
  50      52
  |        |
  49 - - - 53

The conformations of L2 in the five structures are very similar: r.m.s. differences in position of their main-chain atoms are between 0.1 and 0.97 Å. The only difference among the conformations is in the orientation of the peptide between residues 50 and 51. In MCPC603 this difference is associated with the Gly residue at position 50. The side-chains of L2 all point towards the surface. The main-chain packs against the conserved framework residues Ile47 and Gly64/Ala64 (Fig. 6).

Kabat et al. (1983) give the sequences of the L2 regions of 174 VL domains. In all cases they are three residues in length. Of the 174, 122 do not contain Gly and 49 have, like MCPC603, a Gly residue at position 50. The residues at position 48 and 64 are almost absolutely conserved as Ile and Gly. These size and sequence identities imply that almost all L2 regions have a conformation close to that found in the known structures.
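The argument for L2 can be written as a small rule. The Python sketch below is illustrative only: the function name `describe_l2`, its return strings, the acceptance of Ala as well as Gly at position 64 (as in Gly64/Ala64 above) and the use of Gly50 as a flag for the peptide orientation seen in MCPC603 are all assumptions made here to mirror the text.

```python
def describe_l2(l2_residues, res48, res64):
    """Summarise whether an L2 region is expected to adopt the known fold.

    `l2_residues` lists the residues at positions 50 to 52 (three-letter
    names); `res48` and `res64` are the framework residues at 48 and 64.
    """
    if len(l2_residues) != 3:
        return "different size: no prediction"
    if res48 != "Ile" or res64 not in {"Gly", "Ala"}:
        return "framework differs: no prediction"
    if l2_residues[0] == "Gly":
        # Gly at position 50, as in MCPC603, is associated with a different
        # orientation of the peptide between residues 50 and 51.
        return "known L2 fold, MCPC603-like 50-51 peptide"
    return "known L2 fold"
```

Called with the REI residues (`["Glu", "Ala", "Ser"]`, `"Ile"`, `"Gly"`), the function reports the known L2 fold; with Gly at position 50 it adds the note about the 50-51 peptide.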
`
Table 5
Differences in the positions of the framework residues adjacent to the hypervariable regions in immunoglobulin structures

Hypervariable   Adjacent framework   Differences in
region          residues             position (Å)
L1              25      33           0.2-1.1    0.?-0.8
L2              49      53           0.3-0.5    0.?-1.4
L3              90      97           0.8-1.0    0.8-1.2
H1              25      33           0.5-1.2    0.3-1.2
H2              52      56           0.8-2.1    1.2-1.7
H3              95      102          0.5-1.2    0.4-1.7
`
6. Conformation of the L3 Hypervariable Regions

The L3 region, residues 91 to 96, forms the link between two adjacent strands of β-sheet. Our analysis of the structures and sequences known for this region suggests that the large majority of κ chains have a common conformation that is quite different from the conformations found in λ chains.

(a) Vλ domains

The L3 region of Vλ NEWM has six residues and those of KOL and RHE have eight. Superposition of the three regions gives the following alignment:

           91   92   93   93a  93b  94   95   96
NEWM       Tyr  Asp  Arg  -    -    Ser  Leu  Arg
KOL        Trp  Asn  Ser  Ser  Asp  Asn  Ser  Tyr
RHE        Trp  Asn  Asp  Ser  Leu  Asp  Glu  Pro
`
Figure 5. The conformation of the L1 regions of Vκ MCPC603, Vκ REI and Vκ J539. Residues 26 to 29 and 32 have the same conformation in the 3 structures. The side-chain of residue 29 is buried within the framework structure; see section 4.

Figure 6. The conformation of the L2 region of Vλ KOL. This region packs against framework residues Ile47 and Gly64.

Figure 7. The conformation of the L3 region of Vκ REI. The conformation is stabilized by the hydrogen bonds made by the framework residue Gln90 and by the cis conformation of the peptide of Pro95.
`
In all three Vλ structures, residues 91 to 92 and 95 to 96 form an extension of the β-sheet framework with main-chain hydrogen bonds between residues 92 and 95:

                      93a -- 93b
                      |       |
  93 --- 94           93      94
  |       |           |       |
  92 - - - 95         92 - - - 95
  |       |           |       |
  91      96          91      96
  |       |           |       |
  90 - - - 97         90 - - - 97

Residues 93 and 94 in NEWM form a two-residue type II' turn (see Table 2). Residues 93, 93a, 93b and 94 in RHE and KOL form a four-residue turn with the same conformation: the r.m.s. difference in the position of their main-chain atoms is 0.19 Å. This conformation is found in almost all four-residue turns that, like KOL and RHE, have Gly or Asn in the fourth position of the turn, position 94 here (Sibanda & Thornton, 1985; Efimov, 1986; and see Table 2).

Kabat et al. (1983) listed 27 human and 25 mouse Vλ domains for which the sequence of the whole of the third hypervariable region is known. The distribution of sizes of the L3 region in these sequences is:

Residue size
Number of human Vλ
Number of mouse Vλ
[Column values not fully recoverable from the scanned source; the legible entries are 6, 7 and 25.]

In the L3 regions with six residues we would expect, as in NEWM, 91 to 92 and 95 to 96 to continue the β-sheet of the framework and 93 to 94 to form a two-residue hair-pin turn. Rules relating the sequence and conformation of two-residue turns (Sibanda & Thornton, 1985; Efimov, 1986) are given in Table 2. Similarly, in L3 regions with eight residues we would expect 91 to 92 and 95 to 96 to continue the β-sheet framework and 93, 93a, 93b and 94 to form a four-residue turn.
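For Vλ L3 regions the expectation stated above depends only on loop length. A minimal Python sketch, in which the function name `lambda_l3_expectation` and the layout of the returned dictionary are assumptions of this example:

```python
def lambda_l3_expectation(n_residues: int):
    """Expected layout of a V-lambda L3 region (residues 91 to 96) by size.

    Returns a dict naming the residues expected to continue the beta-sheet
    and those expected to form the hair-pin turn, or None for loop sizes
    not covered by the known structures.
    """
    sheet = ["91", "92", "95", "96"]
    if n_residues == 6:
        # Six residues, as in NEWM: two-residue turn.
        return {"sheet": sheet, "turn": ["93", "94"]}
    if n_residues == 8:
        # Eight residues, as in KOL and RHE: four-residue turn.
        return {"sheet": sheet, "turn": ["93", "93a", "93b", "94"]}
    return None
```

Which turn conformation is adopted then depends on the turn sequence, following the rules summarized in Table 2; the sketch does not attempt that step.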
`
(b) Vκ domains

The L3 regions in REI, MCPC603 and J539 are the same size:

           91   92   93   94   95   96
REI        Tyr  Gln  Ser  Leu  Pro  Tyr
MCPC603    Asp  His  Ser  Tyr  Pro  Leu
J539       Trp  Thr  Tyr  Pro  Leu  Ile

In REI and MCPC603, the L3 regions have the same conformation: the r.m.s. difference in the positions of the main-chain atoms of residues 91 to 96 is 0.43 Å. L3 in J539 has a conformation different from that in REI and MCPC603.

Normally, for six-residue loops, we might expect the main-chain atoms of residues 92 and 95 to form hydrogen bonds, and residues 93 and 94 to form a turn (see the discussion of L3 in the Vλ chains, section 6(a), above). This conformation is prevented in the two Vκ structures REI and MCPC603 by a Pro residue at position 95. In these two Vκ structures, residue 92 has an αL conformation and Pro95 has a cis peptide. This puts residues 93 to 96 in an extended conformation (Fig. 7). Important determinants of this particular L3 conformation are the hydrogen bonds formed to its main-chain atoms by the side-chain of framework residue 90. Though the side-chains at position 90 are not identical (REI
`PETITIONER'S EXHIBITS
`
`Exhibit 1062 Page 10 of 18
`
`
`
`•910
`
`C.