`Vol. 71, No. 3, pp. 845-848, March 1974
`
Variable Region Sequences of Five Human Immunoglobulin Heavy Chains of the VHIII Subgroup: Definitive Identification of Four Heavy Chain Hypervariable Regions
`
`
`
`(myeloma proteins/amino acid sequences/antibody combining site)
`
`J. DONALD CAPRA AND J. MICHAEL KEHOE
`
`Department of Microbiology, Mount Sinai School of Medicine of the City University of New York, 10 East 102 Street, New York, N.Y. 10029
`
Communicated by Henry G. Kunkel, November 5, 1973
`
ABSTRACT The variable regions of five human immunoglobulin heavy chains of the VHIII subgroup have been totally sequenced. Three of the heavy chains belonged to the IgG class and two to the IgA class. Examination of these sequences, and comparison with additional published heavy chain sequences, showed that a total of four hypervariable regions is characteristic of human heavy chain variable regions.

The relatively conserved character of large segments of the heavy chain variable region was very evident in these studies. The conserved segments, which are those sections located outside the hypervariable regions, comprise approximately 65% of the total heavy chain variable region. The following general structural pattern for antibody molecules emerges from this and related studies: an overall combining region superstructure is provided by the more conserved segments while the refinements of the active site specificity are a function of hypervariable regions.

The antibody combining site is now believed to reside exclusively in the variable regions of the heavy and light polypeptide chains of the immunoglobulin molecule. Evidence is accumulating from several laboratories which indicates that hypervariable regions within the variable region are directly involved in the antibody combining site as well as being responsible, at least in part, for the idiotypic determinants of myeloma proteins and specific antibodies (1-5).

The existence of three hypervariable regions in the variable region of human immunoglobulin heavy chains has been established by previous studies from this laboratory. Residues 31-37 were described as the first hypervariable region of the heavy chain (6), and, after fragmentation of IgG heavy chains with cyanogen bromide, two additional hypervariable regions were localized between residues 86-91 and 101-110 (7).

We have now completed the amino acid sequence from residues 41 to 84 of the three VHIII proteins originally reported (6, 7) as well as the complete V region sequence of two IgA proteins with VHIII variable regions. The data make apparent an additional area of sequence hypervariability between residues 51 and 68, thus supporting the observations of Cebra et al. made on pooled guinea-pig immunoglobulins (5). When these data on VHIII proteins are included with that available for proteins of the VHI and VHII subgroups and analyzed by the method of Wu and Kabat (3), four distinct areas of sequence hypervariability are observed.

[Figure 1: ion exchange chromatogram; horizontal axis, tube number]
FIG. 1. Representative ion exchange chromatogram of tryptic hydrolysate of the amino terminal (1-85) cyanogen bromide fragment of α chain from protein Tur. Peptides were isolated from a Dowex 50X4 column and characterized and analyzed as described in the text.

MATERIALS AND METHODS

Myeloma Proteins. Tei (IgG1 kappa, Gm az), Was (IgG1 kappa, Gm az), Jon (IgG3 lambda, Gm g), Zap (IgA1 kappa), and Tur (IgA1 kappa) were isolated from serum or plasma by zone electrophoresis on polyvinyl copolymer ("Pevicon") (8). After further purification by gel filtration chromatography, they were reduced with 0.1 M β-mercaptoethanol and alkylated with iodoacetamide. The heavy and light chains were separated by gel filtration in propionic acid (9, 10).

Fragment Preparation. Heavy chains were treated with cyanogen bromide (11) and the resulting individual fragments purified by gel filtration chromatography in 5 M guanidine·HCl. Three proteins (Tei, Zap, and Tur) yielded a large N-terminal fragment comprising residues 1-85. Proteins Was and Jon, which contain a methionine residue at position 34, gave fragments comprising residues 1-34 and 35-85. Since all human IgG myelomas have a methionine at position 252, Tei and Was yielded a large fragment comprising residues 86-252. In protein Jon, however, an additional methionine was present at position 111. Consequently two distinct fragments comprising residues 86-111 and 112-252 were obtained from this protein. IgA1 proteins contain a methionine at residue 426 (12) so proteins Zap and Tur both yielded a very large fragment composed of residues 86-426.

Sequencing Procedure. Positions 1-85: On both the intact heavy chain as well as on the 1-85 fragment, proteins Tei, Zap, and Tur were sequenced 60 steps on the automated sequencer (13, 14). Tryptic peptides were prepared and separated on Dowex 50X4 with a pyridine-formate buffer system. In proteins Tei and Zap two invariant peptides were aligned by homology alone (70-74 and 75-78) while in protein Tur,
`
`
`BI Exhibit 1092
`
`
`
Immunology: Capra and Kehoe
`
`Proc. Nat. Acad. Sci. USA 71 (1974)
`
[Figure 2: aligned sequences of proteins Tei, Was, Jon, Zap, and Tur, numbered at every tenth residue from 10 to 120]
FIG. 2. The amino-acid sequence of the variable regions of five human immunoglobulin heavy chains.
`
the isolation of chymotryptic peptides established the sequence unambiguously. In all cases, tryptic peptides were sequenced in the automated sequencer, often using 4-sulfophenylisothiocyanate (Pierce Chemical) on the lysine peptides (15). In proteins Was and Jon, which contained cyanogen bromide fragments 1-34 and 35-85, the first 60 residues were established by automated sequencing of the intact heavy chain. Thus, in these two proteins, sequencing cyanogen bromide fragment 1-34 was superfluous since its composition agreed with the previously determined sequence. Fragment 35-85 of proteins Was and Jon was sequenced 35 and 40 residues respectively; this, together with the C-terminal tryptic peptides mentioned above, gave the complete sequence for this section.

Residues 86-121: In proteins Tei, Was, Zap, and Tur the sequence was established by a continuous automated run of 45 steps from residue 86 into the CH1 domain. In both Zap and Tur, tryptic digestion and isolation of the resulting peptides confirmed a few questionable positions. In protein Jon, residues 86-111 were obtained disulfide linked to residues 1-34 after cyanogen bromide digestion. This sequence was obtained by difference since residues 1-34 were known from the initial study of the intact heavy chain. Jon fragment 112-252 was subjected to a long sequencer run which definitely established the sequence of residues 112-121 as well as providing sequence data into the CH1 domain.
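The cyanogen bromide fragments named in the text follow directly from the methionine positions, since cleavage occurs on the carboxyl side of each methionine. The sketch below is only an illustration of that bookkeeping, not part of the original procedure. The methionine at position 85 is inferred here from the reported 1-85 and 86-252 fragment boundaries, and the names `METHIONINES` and `cnbr_fragments` are hypothetical. Only the portion of each chain up to its last listed methionine is modeled.

```python
# Methionine positions per protein, taken from the text.
# Position 85 is an inference from the reported fragment boundaries.
METHIONINES = {
    "Tei": [85, 252],
    "Was": [34, 85, 252],
    "Jon": [34, 85, 111, 252],
    "Zap": [85, 426],
    "Tur": [85, 426],
}


def cnbr_fragments(met_positions):
    """Return (first, last) residue ranges, one fragment ending at each methionine."""
    fragments = []
    first = 1
    for met in sorted(met_positions):
        fragments.append((first, met))  # cleavage leaves the methionine at the C-terminus
        first = met + 1                 # the next fragment starts at the following residue
    return fragments


for protein, mets in METHIONINES.items():
    ranges = ", ".join(f"{a}-{b}" for a, b in cnbr_fragments(mets))
    print(f"{protein}: {ranges}")
```

For protein Jon this yields 1-34, 35-85, 86-111, and 112-252, the four fragments described above.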
`
Ion Exchange Chromatography. An example of a Dowex 50X4 chromatogram is shown in Fig. 1 for a tryptic digest of the Tur 1-85 fragment; 6.5-ml fractions were collected and 0.1 ml of each fraction analyzed by the fluorescamine procedure initially described by Udenfriend et al. (16). Ninhydrin analysis was also performed after alkaline digestion of 0.5-ml aliquots. In most analyses, only the fluorescamine procedure was employed since it was much more sensitive. As shown in Fig. 1, 10 fractions were pooled. Each was subjected to amino acid analysis and several useful peptides were isolated and sequenced: T-1 (Asn Thr Leu Tyr Leu Gln Hsr) (79-85), T-3 (Asn Asp Ser Lys) (75-78), T-7 (Gly Leu Gly Trp Val Ser Gly Arg) (46-53), and T-10 (Phe Thr Ile Ser Arg) (70-74).
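As a consistency check on the peptide assignments above, the number of residues in each peptide should equal the length of its stated position range, and the C-terminal residue should be Lys or Arg for a tryptic cleavage, or homoserine (Hsr) where the peptide ends at a cyanogen bromide cleavage site. The following sketch assumes only the four peptides quoted in the text; the names `PEPTIDES` and `check_peptide` are illustrative.

```python
# Peptides from the Tur 1-85 fragment as quoted in the text:
# name -> (three-letter sequence, first residue, last residue)
PEPTIDES = {
    "T-1": ("Asn Thr Leu Tyr Leu Gln Hsr", 79, 85),
    "T-3": ("Asn Asp Ser Lys", 75, 78),
    "T-7": ("Gly Leu Gly Trp Val Ser Gly Arg", 46, 53),
    "T-10": ("Phe Thr Ile Ser Arg", 70, 74),
}


def check_peptide(sequence, first, last):
    """Return (length matches the position range, C-terminal residue)."""
    residues = sequence.split()
    length_ok = len(residues) == last - first + 1
    return length_ok, residues[-1]


for name, (sequence, first, last) in PEPTIDES.items():
    length_ok, c_terminal = check_peptide(sequence, first, last)
    print(f"{name} ({first}-{last}): length consistent = {length_ok}, C-terminus = {c_terminal}")
```

All four peptides have lengths that match their ranges. T-3, T-7, and T-10 end in Lys or Arg, while T-1 ends in Hsr at position 85, consistent with its being the C-terminal peptide of the 1-85 cyanogen bromide fragment.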
`
`RESULTS AND DISCUSSION
The amino-acid sequences of the variable regions of the five human myeloma proteins are displayed in Fig. 2. The variability-factor values determined by the method of Wu and Kabat (3) for these as well as all the other human V region sequences available are shown in Fig. 3. These calculations were based on 25 sequences from residues 1 to 34, 11 sequences from residues 35 to 85, and 14 sequences from residues 86 to 122. Previous to this study there were only six published complete V region sequences, all but one (Nie) of the VHI (Eu) or VHII (Daw, Cor, He, Ou) subgroup (for references see legend to Fig. 3). With five additional VHIII sequences the variability within and between subgroups can now be compared more meaningfully. In addition, with the availability of 11 complete sequences and several fragments, the Wu-Kabat plot becomes more statistically significant.
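For readers unfamiliar with the variability factor, Wu and Kabat (3) define it at each position as the number of different amino acids observed there divided by the frequency of the most common amino acid at that position. A minimal sketch of the calculation is given below. The function name `variability` and the two example columns are made up for illustration and are not taken from the sequences in Fig. 2.

```python
from collections import Counter


def variability(residues):
    """Wu-Kabat variability: number of distinct residues / frequency of the commonest one."""
    counts = Counter(residues)
    distinct = len(counts)
    most_common_count = max(counts.values())
    # Dividing by the frequency (count / total) is the same as multiplying by total / count.
    return distinct * len(residues) / most_common_count


# Hypothetical alignment columns, for illustration only.
invariant_column = ["Trp"] * 11
variable_column = ["Ser", "Ser", "Ser", "Ser", "Ala", "Gly", "Thr", "Ser"]

print(variability(invariant_column))  # 1 distinct residue at frequency 1.0
print(variability(variable_column))   # 4 distinct residues, commonest at frequency 5/8
```

An invariant position scores 1, the minimum, and the score rises both with the number of different residues and with the lack of a dominant one. Because the number of sequences available differs between segments (25, 11, and 14 in this study), each position is scored against its own sequence count.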
A discussion of the sequences can be conveniently divided into those sections of the V region which are relatively constant (1-30, 38-50, 69-83, 92-100, and 111-121), and the hypervariable regions (31-37, 51-68, 84-91, and 101-110). About 65% of the variable region of the heavy chain shows limited variation. In fact, there are 17 positions (14%) which have been absolutely invariant in all human heavy chains regardless of their V region subgroup assignment. Certain positions are subgroup specific since at these positions all, or nearly all, of the members of one subgroup have a particular amino acid, while members of the other subgroup contain a different amino acid. Utilizing the four available VHII proteins, positions 3, 9, 16, 17, 19, 21, 23, 28, 29, 39, 42, 46, 50, 80, 81 and 82 appear to be subgroup specific. As noted previously, no subgroup specific residues are identifiable in the C terminal portion of the V region (7). There are thus 33 positions (27%) in the V region which are either invariant or subgroup specific. A comparison with the published sequences of myeloma proteins (17, 18), pooled immunoglobulins (5, 19, 20), and specifically purified antibodies (5, 21-23) from lower species indicates that the particular amino acids found at these positions are characteristic of a wide variety of mammals and have been faithfully conserved during evolution. Such residues may have extremely important attributes for variable region function such as, for example, the provision of a distinct backbone structure which is crucial to antibody function.

[Figure 3: plot of variability (vertical axis) against residue position (horizontal axis)]
FIG. 3. Variability-factor values for the sequences shown in Fig. 2 as well as several other published sequences (36) determined according to the method of Wu and Kabat (3).
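The percentages quoted for the constant and hypervariable segments can be checked from the segment boundaries given in the text. The short calculation below is a sketch that assumes a V region of 121 residues, which is what those boundaries add up to; the variable names are illustrative.

```python
# Segment boundaries (first, last residue) as listed in the text.
CONSTANT = [(1, 30), (38, 50), (69, 83), (92, 100), (111, 121)]
HYPERVARIABLE = [(31, 37), (51, 68), (84, 91), (101, 110)]

INVARIANT_POSITIONS = 17           # stated in the text
SUBGROUP_SPECIFIC_POSITIONS = 16   # the sixteen positions listed in the text


def total_length(segments):
    """Sum the number of residues over inclusive (first, last) ranges."""
    return sum(last - first + 1 for first, last in segments)


constant = total_length(CONSTANT)            # 78 residues
hypervariable = total_length(HYPERVARIABLE)  # 43 residues
v_region = constant + hypervariable          # 121 residues


def percent(count):
    return round(100 * count / v_region, 1)


print(constant, hypervariable, v_region)
print(percent(constant), percent(hypervariable))
print(percent(INVARIANT_POSITIONS),
      percent(INVARIANT_POSITIONS + SUBGROUP_SPECIFIC_POSITIONS))
```

The constant segments account for 78 of 121 residues (64.5%, the "about 65%" of the text) and the hypervariable regions for 43 (35.5%, about a third). The 17 invariant positions are 14.0% of the V region, and the 33 invariant or subgroup specific positions are 27.3%.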
As can be seen on inspection of Figs. 2 and 3, about a third of the heavy chain variable region can be considered "hypervariable." These regions deserve special consideration because of their specific implications for the formation of the antibody combining site, the nature of idiotypic determinants, and various theoretical conceptions of the origin of antibody diversity.
In light chains, affinity labels have been localized near or within hypervariable regions (23-25), thus providing direct support for the general concept that hypervariable regions participate directly in the antibody-combining site. For the heavy chain, recent work has also been consistent with this idea. For example, Ray and Cebra localized affinity labels to the first (31-37) and the fourth (101-110) heavy chain hypervariable regions (26), Haimovich et al. (27) localized an affinity label to residue 54 of the mouse myeloma protein 315 (which has anti-dinitrophenol activity), and Press and coworkers have localized affinity labels at or near the fourth hypervariable region in rabbit antibodies (28). Therefore, although the primary structure and affinity labeling studies of these proteins were carried out independently, and even in different laboratories in many instances, there is a general implication from the experimental observations that the same regions of the molecule which show the highest degree of sequence variation are near or part of those particular regions of the heavy chain where affinity labels have been localized.
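The correspondence between a labeled residue and a hypervariable region is a simple range lookup, sketched below. The sketch assumes that the residue number reported for the mouse protein can be read directly against the human heavy chain numbering used in this paper, which is an assumption of the illustration rather than something established here. The name `region_of` is hypothetical.

```python
# Heavy chain hypervariable regions as defined in this paper: number -> (first, last).
HYPERVARIABLE = {1: (31, 37), 2: (51, 68), 3: (84, 91), 4: (101, 110)}


def region_of(position):
    """Return the hypervariable region number containing a position, or None."""
    for number, (first, last) in HYPERVARIABLE.items():
        if first <= position <= last:
            return number
    return None


# Residue 54 is the affinity-labeled position reported for mouse protein 315.
print(region_of(54))  # falls within 51-68, the second region
print(region_of(45))  # lies in a conserved segment, so None
```

On this reading, the label at residue 54 lies in the second hypervariable region (51-68), the region newly defined in this study.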
A second piece of evidence linking the antibody combining site to the hypervariable regions has come from comparisons of sequences obtained from pooled immunoglobulin heavy chains with those of specifically purified antibody heavy chains. Sequence analyses of rabbit (29), guinea pig (5), and other mammalian heavy chain pools (19) indicate that a definitive sequence cannot be obtained within those regions which have been identified as hypervariable on the basis of studies with myeloma proteins. However, when specifically purified antibodies are studied, a single major sequence can be determined, as has been shown most definitively by Cebra and his coworkers (5).
Additional support for the functional significance of hypervariable regions has been provided by current notions concerning the tertiary structure of the immunoglobulin molecule. Crystallographic analysis of human immunoglobulins has now advanced to the point where it has been possible to assign the residues which may line a "pocket" within the immunoglobulin molecule which presumably represents the combining site itself (30, 31). In each instance, the major residues which line the pocket are associable with hypervariable regions. In addition, the conformational models generated by the nearest neighbor calculations of Kabat and Wu (32) place hypervariable regions in close association with the putative combining site.

There is also growing evidence that at least some of the hypervariable regions are involved in the idiotypic determinants of myeloma proteins and antibodies. Cross idiotypic specificity among the cold agglutinins (33) and the anti-gamma globulins (34) is believed to be related to the combining site. In at least two distinct anti-gamma globulin molecules, the hypervariable regions show striking sequence similarities (7, 35).

The genetic origin of hypervariable regions remains unclear. The variability within heavy chain hypervariable regions seems more marked than that of light chain hypervariable regions. Of the 11 proteins which have now had their V regions completely sequenced, if one considers the 43 hypervariable positions of the heavy chain, there are no two proteins which have more than 10 residues in common. It seems likely that hundreds, or even thousands, of proteins would have to be sequenced in order to find two which are identical if no preselection bias (such as selection by idiotypic antisera or for combining specificity) is involved. This implies either that there are a very large number of germ line genes or that somatic processes are necessary to explain the diversity in the heavy chain hypervariable regions.

Regardless of their origin, the hypervariable regions clearly play a crucial role in the antigen binding function of immunoglobulin molecules.

We thank Dr. Henry Kunkel for the subclass and genetic typing of the myeloma proteins. Bonnie Gerber, Ellen Bogner and Donna Atherton rendered invaluable technical assistance. This work was aided by grants from the National Science Foundation (GB 17046) and the U.S. Public Health Service (AI 09810) and a Grant-in-Aid from the New York Heart Association. J.D.C. is the recipient of National Institutes of Health Career Development Award 6-K4-GM-35, and J.M.K. is an Established Investigator of the American Heart Association.

1. Milstein, C. (1967) Nature 216, 330-332.
2. Franek, F. (1969) Symposium on Developmental Aspects of Antibody Formation and Structure, Prague.
3. Wu, T. T. & Kabat, E. A. (1970) J. Exp. Med. 132, 221-250.
4. Capra, J. D., Kehoe, J. M., Winchester, R. & Kunkel, H. G. (1971) Ann. N.Y. Acad. Sci. 190, 371-381.
5. Cebra, J. J., Ray, A., Benjamin, D. & Birshtein, B. (1971) Progr. Immunol. (First International Congress of Immunology), 269-284.
6. Capra, J. D. (1971) Nature New Biol. 230, 61-63.
7. Kehoe, J. M. & Capra, J. D. (1971) Proc. Nat. Acad. Sci. USA 68, 2019-2021.
8. Kunkel, H. G. (1954) Methods Biochem. Anal. 1, 141-155.
9. Fleischman, J. B., Porter, R. R. & Press, E. M. (1963) Biochem. J. 88, 220-228.
10. Capra, J. D. & Kunkel, H. G. (1970) J. Clin. Invest. 49, 610-621.
11. Gross, E. & Witkop, B. (1962) J. Biol. Chem. 237, 1856-1863.
12. Chuang, C. Y., Capra, J. D. & Kehoe, J. M. (1973) Nature 244, 158-160.
13. Edman, P. & Begg, G. (1967) Eur. J. Biochem. 1, 80-91.
14. Capra, J. D. & Kunkel, H. G. (1970) Proc. Nat. Acad. Sci. USA 67, 87-92.
15. Inman, J. K., Hannon, J. E. & Appella, E. (1972) Biochem. Biophys. Res. Commun. 46, 2075-2081.
16. Udenfriend, S., Stein, S., Bohlen, P., Dairman, W., Leimgruber, W. & Wiegele, M. (1972) Science 178, 881-882.
17. Kehoe, J. M. & Capra, J. D. (1972) Proc. Nat. Acad. Sci. USA 69, 2052-2055.
18. Bourgois, A., Fougereau, M. & de Preval, C. (1972) Eur. J. Biochem. 24, 446-455.
19. Capra, J. D., Wasserman, R. W. & Kehoe, J. M. (1973) J. Exp. Med. 138, 410-427.
20. Mole, L. E., Jackson, S. A., Porter, R. R. & Wilkinson, J. M. (1971) Biochem. J. 124, 301-318.
21. Fleischman, J. B. (1973) Immunochemistry 10, 401-407.
22. Strosberg, A. D., Jaton, J. C., Capra, J. D. & Haber, E. (1972) Fed. Proc. 31, 771.
23. Goetzl, E. J. & Metzger, H. (1970) Biochemistry 9, 1267-1278.
24. Franek, F. (1971) Eur. J. Biochem. 19, 176-183.
25. Chesebro, B. & Metzger, H. (1972) Biochemistry 11, 766-771.
26. Ray, A. & Cebra, J. J. (1972) Biochemistry 11, 3647-3657.
27. Haimovich, J., Eisen, H. N., Hurwitz, E. & Givol, D. (1972) Biochemistry 11, 2389-2397.
28. Press, E. M., Fleet, G. W. J. & Fisher, C. E. (1971) in Progress in Immunology, ed. Amos, B. (Academic Press, New York), p. 233.
29. Cebra, J. J., Givol, D. & Porter, R. R. (1968) Biochem. J. 107, 69-70.
30. Schiffer, M., Girling, R. L., Ely, K. R. & Edmundson, A. B., personal communication.
31. Poljak, R. J. (1973) Abstracts of Ninth International Congress of Biochemistry, Stockholm, p. 31.
32. Kabat, E. A. & Wu, T. T. (1972) Proc. Nat. Acad. Sci. USA 69, 960-964.
33. Williams, R. C., Kunkel, H. G. & Capra, J. D. (1968) Science 161, 379-381.
34. Kunkel, H. G., Agnello, V., Joslin, F. G., Winchester, R. J. & Capra, J. D. (1973) J. Exp. Med. 137, 331-342.
35. Capra, J. D. & Kehoe, J. M., unpublished observations.
36. (a) Edelman, G. M., Cunningham, B. A., Gall, W. E., Gottlieb, P. D., Rutishauser, U. & Waxdal, M. J. (1969) Proc. Nat. Acad. Sci. USA 63, 78-85; (b) Press, E. M. & Hogg, N. M. (1970) Biochem. J. 117, 641-660; (c) Cunningham, B. A., Pflumm, M. N., Rutishauser, U. & Edelman, G. M. (1969) Proc. Nat. Acad. Sci. USA 64, 997-1003; (d) Wikler, M., Kohler, H., Shinoda, T. & Putnam, F. W. (1969) Science 163, 75-78; (e) Ponstingl, H., Schwarz, J., Reichel, W. & Hilschmann, N. (1970) Hoppe-Seyler's Z. Physiol. Chem. 351, 1591.
`
`
`