Volume 6 Number 10 1979

Nucleic Acids Research

Sequence analysis of cloned cDNA encoding part of an immunoglobulin heavy chain

John Rogers, Patrick Clarke and Winston Salser

Molecular Biology Institute and Department of Biology, University of California, Los Angeles, CA 90024, USA

Received 8 March 1979

ABSTRACT

The recombinant plasmid pH21-1 consists of mouse-derived complementary DNA (cDNA) in the E. coli plasmid pMB9. The mouse insertion has been completely sequenced, and encodes the CH3 domain and half the CH2 domain of the immunoglobulin γ1 heavy chain. The predicted amino acid sequence differs at several positions from that previously published for this protein. The pattern of codon usage resembles that in some other eukaryotic messenger RNAs. A computer program has been used to predict the optimum secondary structure for the mRNA encoding the CH3 domain and the inter-domain junction.

INTRODUCTION

With the development of elegant techniques (1-4) for cloning DNA complementary to eukaryotic messengers (cDNA), it has become possible to prepare large quantities of many such cDNAs for use both in sequence analysis, and in locating the same sequences in cellular DNA and RNA. The immunoglobulin heavy chain genes are of particular interest, since not only do they undergo the generation of diversity and joining of variable and constant regions which so far appear to be peculiar to immunoglobulins, but also they are members of a developmentally regulated multigene family, and the ancestral heavy chain gene apparently arose by tandem duplication of a still smaller genetic unit, the immunoglobulin domain.

A recombinant DNA plasmid, pH21-1, has been constructed containing sequences from the heavy chain messenger of the IgG1-producing mouse myeloma MOPC21 (R. Wall, K. Toth, G. Paddock, R. Higuchi, and W. Salser, unpublished). Here we report the complete restriction map and DNA sequence of the mouse-derived insert cloned in this plasmid. It contains 459 nucleotides encoding the C-terminal 1½ domains of the γ1 constant region. Some characteristics of the coding sequence are discussed.

© Information Retrieval Limited, 1 Falconberg Court, London W1V 5FG, England

3305

Genzyme Ex. 1033, pg 871

MATERIALS AND METHODS

Construction of pH21-1

Construction of a series of cDNA clones containing κ light chain mRNA sequences has been reported (5). The mRNA was from solid tumors of the IgG1-producing mouse myeloma MOPC21 (6). In these same experiments, cDNA clones containing heavy chain mRNA sequences were constructed by the same methods using the 16-17S fraction of the mRNA, in which the principal species is the heavy chain messenger (7). The cDNA was inserted into the EcoRI site of plasmid pMB9, by means of poly-dA and poly-dT "tails" on the respective 3' ends of insert and plasmid, and the recombinant plasmids were cloned in E. coli (R. Wall, K. Toth, G. Paddock, R. Higuchi, and W. Salser, unpublished). One clone gave a distinct peak of hybridization with 16-17 S MOPC21 mRNA (R. Wall and D. DeBorde, personal communication) and was designated pH21-1.

Restriction Analysis

Plasmid DNA was prepared as in refs. (8) and (9). EndoR.TaqI was prepared by S. Hendrich using an unpublished technique of M. Komaromy. It was used at 65°C in 10 mM HEPES pH 8.4, 6 mM MgCl2, 6 mM β-mercaptoethanol, 25 mM (NH4)2SO4, 100 μg ml-1 gelatin. Enzymes HaeIII and HincII were purchased from New England Biolabs, and AluI, HhaI and HinfI from Bethesda Research Laboratories; they were used as suggested by the suppliers.

Polyacrylamide gel electrophoresis of restriction fragments for preparative or analytical purposes was carried out in 20 x 40 cm gels made with 6% acrylamide, 0.2% methylene bisacrylamide, 12% glycerol, in running buffer. Running buffer was 50 mM Tris borate, 1 mM EDTA (TBE). For strand separation the gel consisted of 4% acrylamide plus 0.14% methylene bisacrylamide in the running buffer, which was 36 mM Tris base, 30 mM NaH2PO4, 1 mM EDTA. Samples for 6% gels were loaded in the restriction buffer, diluted if necessary, plus ¼ volume of dye solution (0.03% bromphenol blue plus xylene cyanol in 20% glycerol). Samples for strand separation were prepared in 90 μl of the same dye solution made up to 300 μl 0.3 M NaOH, and heated at 37°C for 3 minutes immediately before loading.

After electrophoresis, DNA fragments were visualized by ethidium bromide staining and UV fluorescence, or, if end-labelled, by autoradiography. Elution of DNA fragments was carried out as in ref. (10), except that incubation of the crushed gel in elution solution was for 2 days at 42°C.

DNA Sequence Analysis

Restriction fragments purified from digests of 150-300 μg of the 6-kb plasmid pH21-1 were treated with bacterial alkaline phosphatase (Worthington Biochemical, grade f) in 10 mM Tris-HCl pH 8.0, for 30 minutes at 37°C. This mixture was phenol-extracted thrice, ether-extracted twice, ethanol-precipitated, redissolved in 5 mM Tris pH 9.5, 0.1 mM spermidine, and 0.01 mM EDTA.Na3, and then denatured by boiling. 5' end labelling with 32P was carried out as in ref. (10), using Tris-HCl rather than sodium glycine as buffer. T4 polynucleotide kinase was purchased from PL Biochemicals, and γ-32P-ATP was made from 32Pi (ICN Pharmaceuticals) (10). After ethanol precipitation, the labelled ends of the DNA were separated by strand separation, or by another restriction cleavage and electrophoretic separation of the fragments.

For sequencing we used four of the base-specific cleavage reactions of Maxam and Gilbert (10), entitled G>A, A>C, T+C, and C. A fifth reaction cleaving at A+G was performed as follows (A. Maxam, personal communication). End-labelled DNA and 1 μg carrier DNA were made up to 30 μl in 17 mM sodium citrate pH 4.0 and heated for 10 minutes at 90°C. 2 μl of 1 M NaOH was added and the mixture sealed in a capillary and heated for a further 30 minutes at 90°C. 20 μl of urea-dye mixture (10) was then added and the sample was ready for loading on a ladder gel.

Ladder gels (20% acrylamide, 0.7% methylene bisacrylamide, 7 M urea in TBE) were made, loaded and run as in ref. (10). In our later runs we used thin gels (11), of thickness 0.32 mm instead of the regular 1.6 mm, and found considerably improved resolution of bands. Ladder gels were autoradiographed at -70°C on Cronex 4 X-ray film with Dupont Hi-plus intensification screens.

Secondary Structure Prediction

The most stable secondary structure for the RNA represented by the sequence was predicted using the computer program of Studnicka et al. (12). This program will examine a large number of possible regions of base pairing to find that combination of regions which forms the most stable structure. The program begins by cataloguing all possible regions of 2 or more consecutive base pairs. There were 5231 regions in this "primary region catalogue" for the sequence considered here. It would be prohibitively expensive to consider all of these regions in a single computation cycle.
Therefore we rank the regions and carry out the computation in several cycles. The ranking uses a weighting function which is the sum of the energy of the region itself, divided by the square root of its length, and the energy of the best "local structure" which can be obtained by combining the region with all neighboring regions which are separated from it by less than 10 nucleotides on either strand (W. Salser and L. Nagy, in preparation). Where two primary regions would overlap, a "branch migration" procedure is used to determine the most stable non-overlapping combination of parts of the two primary regions.

The 150 regions with the most favorable weighting factors were chosen for the first cycle and the 100 most stable combinations of these regions were computed, the energies being calculated using the rules given in ref. 13. All these structures shared certain features which permitted us to break up the computation into three smaller jobs for the second cycle. In this second cycle all regions down to a weighting factor of 39 were considered (the equivalent of the top 900 of the original 5231 regions). The alternative structures for the second cycle were in turn examined for common features; these allowed us to subdivide the sequence into eight jobs for the final cycle, in which all regions of two or more base pairs were considered.

In theory it would be possible to improve the structure slightly by considering single G-C pairs. For example, according to our base-pairing rules (13), the structure

5' CUUC-GU
3' GA-GUCA

is more stable than the computed structure

5' CUUCGU
3' GAGUCA

by 2 kcal. This additional refinement of the structure was not performed.
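
The cataloguing and ranking steps described in this section can be illustrated with a minimal sketch. It is not the program of Studnicka et al. (12): the pairing set (Watson-Crick plus G-U), the `min_loop` parameter, the function names and the toy sequence are all assumptions made for illustration, and `region_weight` follows only one reading of the weighting function, in which the region energy alone is divided by the square root of the region length before the local-structure energy is added.

```python
import math
from typing import List, Tuple

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
# The paper's own rules are those of its ref. 13 and may differ.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def catalogue_regions(rna: str, min_pairs: int = 2,
                      min_loop: int = 3) -> List[Tuple[int, int, int]]:
    """List maximal runs of consecutive base pairs as (i, j, n_pairs).

    Base i+k pairs with base j-k for k = 0 .. n_pairs-1 (0-based indices).
    min_loop is a hypothetical minimum number of enclosed unpaired bases.
    """
    n = len(rna)
    regions = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (rna[i], rna[j]) not in PAIRS:
                continue
            # Skip pairs that only extend a run starting one step further out.
            if i > 0 and j + 1 < n and (rna[i - 1], rna[j + 1]) in PAIRS:
                continue
            k = 0
            while (j - k) - (i + k) > min_loop and (rna[i + k], rna[j - k]) in PAIRS:
                k += 1
            if k >= min_pairs:
                regions.append((i, j, k))
    return regions


def region_weight(region_energy: float, n_pairs: int, local_energy: float) -> float:
    """One reading of the weighting function: E_region / sqrt(length) + E_local."""
    return region_energy / math.sqrt(n_pairs) + local_energy


toy = "GGGAAAACCC"  # hypothetical hairpin, not taken from pH21-1
print(catalogue_regions(toy))
print(region_weight(8.0, 4, 3.0))
```

Ranking a catalogue then amounts to sorting the regions by `region_weight` and keeping the top entries for each cycle, as the text describes for the 150-region first cycle.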

Biosafety Precautions

P3 physical containment was used throughout for growth of transformed bacteria. The initial isolation of pH21-1 had been carried out in E. coli χ1849, an EK1 host, in compliance with the Asilomar Guidelines in effect at that time. When the NIH Guidelines (14) were issued, pH21-1 was transferred to E. coli χ1776, an EK2 host, and all subsequent experiments were conducted in accordance with those Guidelines.

RESULTS

Restriction Analysis

Various restriction enzymes were tested in parallel digests of pH21-1 and pMB9 DNA. Since the mouse sequence was inserted at the single EcoRI site in pMB9, it was expected that comparison of the digests would show one band unique to pMB9 which was replaced by one or more bands unique to pH21-1. This was found to be the case, and all the enzymes tested indicated an inserted segment of about 560 bp in length.

Mapping was helped by the observation that each pH21-1 digest exhibited one pair of submolar bands not shown by pMB9. Consistent values for the size of the insert could only be deduced if the doublet was considered to represent a single restriction fragment for each enzyme. It probably resulted from an approx. 44-bp deletion at a single site in a minor population of the plasmid DNA, since on preparative gels the larger band was homogeneous, while the smaller band was seen to be only one of up to seven equally spaced bands, the others being faint (Figure 1). Since no such heterogeneity is seen in pMB9, deletions of various sizes probably occurred in the mouse insert or the A.T joints. There is no repetitive sequence in the mouse insert which could account for this (see below), and the restriction fragment affected always contained the left-hand A.T joint, which therefore could be the site of the deletions.

It was possible to locate some of the restriction sites in the insert without further digests using the positions given by Maniatis et al. (2) and I. Cummings (unpublished) for the pMB9 restriction sites nearest the EcoRI site. Thus of the four pH21-1-specific fragments seen in MboII digests, the 1100-bp fragment must cover the right-hand insert-pMB9 junction, the 460-bp fragment with its 420-bp minor band must cover the left-hand junction, and the 116-bp and 75-bp fragments must be internal. Knowing the positions of some sites and the general location of the deletion-prone region, it was then possible to locate most of the TaqI, HaeIII and AluI sites from the single-enzyme digests. The map was refined by isolating the two pH21-1-specific TaqI fragments and digesting them with HaeIII, AluI, and HaeIII plus MboII. The resulting map was confirmed by subsequent sequencing except for the discovery of an extra MboII site near the middle. The final map is given in Figure 2.

[Figure 1: gel photograph not reproduced. Marker sizes printed beside the side lane: 625, 520, 434, 398, 260.]

Figure 1. Preparative gel of TaqI-digested pH21-1 (300 μg), showing heterogeneity in one band. The uppermost and brightest member of the set is approximately 372 bp in length. The prominent doublets above and below the set are fragments from the pMB9 parts of the plasmid. The side lane contains HaeIII-digested pMB9, with fragment sizes marked in basepairs.

[Figure 2: restriction map drawing not reproduced. Legible labels: "585-nucleotide insertion", flanked by two "AT joint" markers.]

Figure 2. Restriction map of the insert and surrounding region in pH21-1.
Top line: Restriction map of the pMB9 sequences around the EcoRI site, as deduced from analysis of the insert-containing HhaI fragment of pH21-1. All sites for the enzymes indicated are shown. Distances between the sites are in nucleotides.
Middle line: Restriction map of the sequences inserted into the EcoRI site of pMB9. All sites for the enzymes indicated are shown. Distances between sites are in nucleotides. The numbers below the line denote the codons in or after which the cuts are made. Arrows indicate where sequences were obtained. We did not sequence continuously across all restriction sites, but the continuity of the coding sequence implies that no nucleotides were omitted.
Bottom line: Additional sites inferred from the nucleotide sequence, with the numbers of the codons at which the cuts are expected to be made. MnlI and HphI, like MboII, make cuts offset from the recognition sequences by 5-10 nucleotides. There are no sites for EcoRI, BamHI, PstI, HindIII, HhaI, HinfI, HpaII, or HaeII in the insert.

While preparing the HhaI fragment containing the insert for sequencing, we did further restriction analysis in its outer parts, so as to extend the known restriction map for pMB9 (Figure 2). Since parallel digests of whole pH21-1 and pMB9 never showed any differences in bands not covering the EcoRI site, this map is believed to represent the "wild-type" pMB9 structure.

DNA Sequence Analysis

The restriction sites used for Maxam-Gilbert sequence analysis (10) are indicated in Figure 2. Representative 'ladders' are shown in Figure 3, and the complete DNA sequence is in Figure 4. It covers 459 nucleotides between the A.T joints, and encodes amino acids 287 to 439 in the sequence of Adetugbo (15,16). This includes half of the CH2 domain (amino acids 228-334) and all of the CH3 domain (amino acids 335-440) except the C-terminal glycine. The first 30 nucleotides of the sequence, and particularly the first ten, are not entirely certain; they were far from any useful restriction sites and could only be read knowing the amino acid sequence.

The individual nucleotides in the sequences of the A.T joints were only partially resolved, but measurements of the complete poly-A blocks on ladders indicated lengths of 76 ±15 bp on the left and 50 ±15 bp on the right. The left-hand poly-A tail contains at least two T residues. Similar A -> T transversions in A.T joints have been observed previously (2).

[Figure 3: autoradiographs not reproduced. Lane labels: G, GA, A, TC, C.]

Figure 3. "Ladders" showing sequences from pH21-1. Sequences are read from bottom to top. Codons are numbered. Asterisks indicate codons differing from the reported amino acid sequence (16). (a) Complementary strand with 5' label at HaeIII site at codon 353, 3' cut with TaqI; thin gel. (b) Coding strand covering the same region, with 5' label at TaqI site at codon 326, 3' cut with MboII; regular gel. (c) Coding strand with 5' label at MboII site at codon 373, 3' cut with AluI; thin gel.

5'-(T)n-
287  Glu Glu Gln Phe Asn Ser Thr Phe Arg Ser
     GAG GAG CAG TTC AAC AGC ACT TTC CGC TCA    30
297  Val Ser Glu Leu Pro Ile Met His Gln Asp
     GTC AGT GAA CTT CCC ATC ATG CAC CAA GAC    60
307  Trp Leu Asn Gly Lys Glu Phe Lys Cys Arg
     TGG CTC AAT GGC AAG GAG TTC AAA TGC AGG    90
317  Val Asn Ser Ala Ala Phe Pro Ala Pro Ile
     GTC AAC AGT GCA GCT TTC CCT GCC CCC ATC   120
327  Glu Lys Thr Ile Ser Lys Thr Lys Gly Arg
     GAG AAA ACC ATC TCC AAA ACC AAA GGC AGA   150
337  Pro Lys Ala Pro Gln Val Tyr Thr Ile Pro
     CCG AAG GCT CCA CAG GTG TAC ACC ATT CCA   180
347  Pro Pro Lys Glu Gln Met Ala Lys Asp Lys
     CCT CCC AAG GAG CAG ATG GCC AAG GAT AAA   210
357  Val Ser Leu Thr Cys Met Ile Thr Asp Phe
     GTC AGT CTG ACC TGC ATG ATA ACA GAC TTC   240
367  Phe Pro Glu Asp Ile Thr Val Glu Trp Gln
     TTC CCT GAA GAC ATT ACT GTG GAG TGG CAG   270
377  Trp Asn Gly Gln Pro Ala Glu Asn Tyr Lys
     TGG AAT GGG CAG CCA GCG GAG AAC TAC AAG   300
387  Asn Thr Gln Pro Ile Met Asp Thr Asp Gly
     AAC ACT CAG CCC ATC ATG GAC ACA GAT GGC   330
397  Ser Tyr Phe Val Tyr Ser Lys Leu Asn Val
     TCT TAC TTC GTC TAC AGC AAG CTC AAT GTG   360
407  Gln Lys Ser Asn Trp Glu Ala Gly Asn Thr
     CAG AAG AGC AAC TGG GAG GCA GGA AAT ACT   390
417  Phe Thr Cys Ser Val Leu His Glu Gly Leu
     TTC ACC TGC TCT GTG TTA CAT GAG GGC CTG   420
427  His Asn His His Thr Glu Lys Ser Leu Ser
     CAC AAC CAC CAT ACT GAG AAG AGC CTC TCC   450
437  His Ser Pro
     CAC TCT CCT                               459
-(A)n-3'

Asterisked codons: 299, 336, 338, 377, 378, 381, 382.
Restriction sites (recognition sequence, codons spanned): HincII GTCAAC (317-318); AluI AGCT (320-321, 403-404); TaqI TCGA (326-327); HaeIII GGCC (352-353, 425-426); MboII GAAGA or its complement TCTTC (366-367, 369-370, 407-409, 432-434).

Figure 4. Nucleotide sequence of the cDNA insert in pH21-1. Only the coding strand is shown, with the encoded amino acid sequence above. Positions where this differs from the reported amino acid sequence (16) are marked by asterisks. Restriction sites are underlined. Additional underlining indicates less certain parts of the sequence.
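
The stated properties of the insert can be checked mechanically. The sketch below transcribes the coding strand of Figure 4 into a string, translates it with the standard genetic code, and counts recognition sites for the enzymes named in Figures 2 and 4. The helper names (`translate`, `count_sites`) are illustrative, and the recognition sequences are the standard ones for these enzymes rather than values quoted in the text; codon 390 is entered as CCC, the proline codon consistent with Table II.

```python
import re
from itertools import product

# Coding strand of the pH21-1 insert, codons 287-439, transcribed from Figure 4.
ROWS = [
    "GAGGAGCAGTTCAACAGCACTTTCCGCTCA", "GTCAGTGAACTTCCCATCATGCACCAAGAC",
    "TGGCTCAATGGCAAGGAGTTCAAATGCAGG", "GTCAACAGTGCAGCTTTCCCTGCCCCCATC",
    "GAGAAAACCATCTCCAAAACCAAAGGCAGA", "CCGAAGGCTCCACAGGTGTACACCATTCCA",
    "CCTCCCAAGGAGCAGATGGCCAAGGATAAA", "GTCAGTCTGACCTGCATGATAACAGACTTC",
    "TTCCCTGAAGACATTACTGTGGAGTGGCAG", "TGGAATGGGCAGCCAGCGGAGAACTACAAG",
    "AACACTCAGCCCATCATGGACACAGATGGC", "TCTTACTTCGTCTACAGCAAGCTCAATGTG",
    "CAGAAGAGCAACTGGGAGGCAGGAAATACT", "TTCACCTGCTCTGTGTTACATGAGGGCCTG",
    "CACAACCACCATACTGAGAAGAGCCTCTCC", "CACTCTCCT",
]
SEQ = "".join(ROWS)

# Standard genetic code in TCAG order ("*" marks stop codons).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), AMINO)}

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def translate(dna: str) -> str:
    """Translate a coding strand read in frame from its first base."""
    return "".join(CODE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_sites(dna: str, site: str) -> int:
    """Count occurrences of a recognition site on either strand."""
    forms = {site, site.translate(COMPLEMENT)[::-1]}  # one form if palindromic
    total = 0
    for form in forms:
        pattern = "(?=" + "".join(IUPAC[b] for b in form) + ")"
        total += len(re.findall(pattern, dna))
    return total


SITES = {"TaqI": "TCGA", "HaeIII": "GGCC", "AluI": "AGCT", "HincII": "GTYRAC",
         "MboII": "GAAGA", "EcoRI": "GAATTC", "BamHI": "GGATCC",
         "PstI": "CTGCAG", "HindIII": "AAGCTT", "HhaI": "GCGC",
         "HinfI": "GANTC", "HpaII": "CCGG", "HaeII": "RGCGCY"}

print(len(SEQ), translate(SEQ))
for enzyme, site in SITES.items():
    print(enzyme, count_sites(SEQ, site))
```

Run on this transcription, the counts agree with the figure legends: the enzymes listed as absent from the insert give zero, and MboII gives four sites.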

DISCUSSION

Amino Acid Substitutions

The DNA sequence implies several substitutions in the published amino acid sequence (15,16), as listed in Table I. In all but the first, the DNA sequence appears certain. There are three possible sources for these discrepancies.

(1) In vivo variation. It is conceivable that since the separation of the MOPC21 tumor line used to make the cDNA from the MOPC21 (P3K) line used for protein sequencing in Cambridge, one or both lines had either accumulated 'somatic' mutations or switched to expression of a different γ chain gene. Somatic mutations have been documented in subclones of the Cambridge line (17-20), and switches in γ2 gene expression have been induced in another myeloma line in vitro (21); serological evidence has been presented for a second γ1 gene in mice (22). However, these events are detected only in single cloned cells, not in the cell population as a whole. Moreover, the decreased homology of pH21-1 with other γ chains (see below) makes such an explanation unlikely.

(2) DNA cloning errors. This is suggested by the fact that 5 of the 8 substitutions would abolish amino acid identities with other γ chains (Table I). In particular, valine-296 is conserved in every γ, α, and ε chain known, although not in a μ chain (15, 23-28). Our DNA sequence is uncertain at that point but appears not to encode valine.

(3) Protein sequencing errors. 4 of the 8 substitutions involve the interchange of pairs of nearby amino acids; one involves an acid-to-amide change; and one (codon 377) would replace a serine which was not definitely present in the peptide sequenced (15) with a tryptophan which could have been present in greater molarity than estimated (15).

Since the pH21-1 sequence was completed, the corresponding chromosomal γ1 gene has been cloned and partially sequenced (29). The exchange between positions 336 and 338 is confirmed, and additional exchanges are found in the CH1 domain (29). Both the nature of the "substitutions" in pH21-1, and the presence of the same and similar substitutions in the chromosomal gene, imply that most of them represent errors in protein sequencing. In the analyses that follow, the DNA sequence is taken to be correct unless otherwise stated.

TABLE I. Substitutions in γ1 amino acid sequence

Position   DNA sequence:     AA sequence:      Other AA sequences:
           Codon    AA       AA   Codon        Mouse  Human  Rabbit  G-pig
                                               γ2a    γ1     γ       γ2
296        *(TCA)   (S)      V    GTN          V      V      V       V
299        GAA      E        A    GCN          A      V      T       V
336        AGA      R        K    AAR          S      Q      E       A
338        AAG      K        R    AGR          R      R      L       R
377        TGG      W        S    TCN/AGY      N      S      K       S
378        AAT      N        D    GAY          N      N      D       N
381        CCA      P        A    GCN          T      E      A       P
382        GCG      A        P    CCN          E      P      E       V+-S+-E

Numbering and mouse γ1 amino acid sequence are from ref. (16). Other amino acid sequences are from refs. (15, 25). AA = amino acid; + = insertion; * = DNA sequence uncertain.

Comparison with Previous Sequence Data

Some sequence data on the MOPC21 γ1 mRNA has already been published in the form of a ribonuclease T1 oligonucleotide catalogue (30). In that work, the T1 oligonucleotides were not sequenced, but secondary digestion products were aligned so that where possible they matched the amino acid sequence. Of the ten oligonucleotides listed in the region we have sequenced, all but three are confirmed. Two of the three (h6 and h29) were located in the right places but the true nucleotide sequence is a permutation of the suggested one; our sequence is equally consistent with the secondary digestion products tabulated by Cowan et al. (30). Our data show that the third oligonucleotide, h18, does not derive from the place that was suggested by Cowan et al.

Adetugbo and Milstein (20) have also inferred the mRNA sequence from codon 341 through 355 from the amino acid sequence of the MOPC21 frameshift mutant IF3. Their predicted sequence is confirmed by our analysis. They also suggested (17-19) that the premature termination mutant IF1 contained a nonsense mutation at serine-358. Since we find this codon to be AGU, however, the mutation cannot be a single base substitution. It could perhaps be an insertion of U before codon 358, or a 4-base deletion including it, either of which would create a termination codon in the appropriate position.
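
The argument about the IF1 mutant can be followed step by step in a short sketch. It uses the coding-strand DNA of codons 355-361 as read from Figure 4 (T in place of U), and the names `CONTEXT`, `codon_at` and `single_base_variants` are illustrative only. It checks that no single-base change of codon 358 gives a stop codon, whereas a one-base insertion before it, or a 4-base deletion that includes it, puts a stop codon at position 358.

```python
STOPS = {"TAA", "TAG", "TGA"}

# Codons 355-361 of the coding strand (GAT AAA GTC AGT CTG ACC TGC), Figure 4.
CONTEXT = "GATAAAGTCAGTCTGACCTGC"
FIRST_CODON = 355
TARGET = 358


def codon_at(dna: str, codon_number: int) -> str:
    """Return the triplet occupying a codon position, counted from codon 355."""
    start = (codon_number - FIRST_CODON) * 3
    return dna[start:start + 3]


def single_base_variants(codon: str) -> list:
    """All nine codons that differ from `codon` at exactly one base."""
    return [codon[:i] + b + codon[i + 1:]
            for i in range(3) for b in "ACGT" if b != codon[i]]


target_start = (TARGET - FIRST_CODON) * 3

# 1. Single base substitutions of AGT (AGU) never give a stop codon.
substitution_stops = [v for v in single_base_variants(codon_at(CONTEXT, TARGET))
                      if v in STOPS]

# 2. Inserting T (U) immediately before codon 358 shifts the frame to TAG.
inserted = CONTEXT[:target_start] + "T" + CONTEXT[target_start:]

# 3. Either 4-base deletion that removes all of codon 358 shifts the frame to TGA.
deleted = [CONTEXT[:s] + CONTEXT[s + 4:] for s in (target_start - 1, target_start)]

print(substitution_stops)
print(codon_at(inserted, TARGET))
print([codon_at(d, TARGET) for d in deleted])
```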

Codon Usage

As in other eukaryotic messengers, the pattern of codon usage is clearly nonrandom, as shown in Tables II and III. C is preferred in redundant positions, and G is the next most abundant. Codons for glutamine and glutamic acid show particular preferences for G over A.

This distribution can be compared with that in the mouse immunoglobulin Cκ domain (31) (Table III), which also shows a preference for C although the other bases are used equally. The coding sequences for hemoglobin α and β in the rabbit, which like immunoglobulin Cγ and Cκ are believed to have diverged near to the time of origin of the vertebrates (41,42), also share some but not all anomalies in individual codon usage (Table III and (13)), suggesting that such anomalies may be generally conserved over long spans of evolutionary time.

Table III also shows that the most general features of codon usage, preferences for or against given bases in the third position of codons, are shared by large groups of animal genes. The genes for immunoglobulin C regions, hemoglobins, peptide hormones, and histones all fall into Group I, which has high frequency of C, moderate to high G, moderate to low U, and low A. Genes for an immunoglobulin V-region and for ovalbumin fall into Group II, with uniform usage except for a mild deficiency of G. The genes of SV40 are the only known representatives of Group III; they all have high U and low C. The significance of these patterns is unknown, although analysis of hemoglobin β usage (43) suggested a correlation with the relative abundances of tRNAs.

TABLE II. Codon usage.

Phe  UUU   0    Ser  UCU   3    Tyr   UAU   0    Cys   UGU   0
     UUC   8         UCC   2          UAC   4          UGC   3
Leu  UUA   1         UCA   1    STOP  UAA   0    STOP  UGA   0
     UUG   0         UCG   0          UAG   0    Trp   UGG   4
Leu  CUU   1    Pro  CCU   4    His   CAU   2    Arg   CGU   0
     CUC   3         CCC   4          CAC   4          CGC   1
     CUA   0         CCA   3    Gln   CAA   1          CGA   0
     CUG   2         CCG   1          CAG   7          CGG   0
Ile  AUU   2    Thr  ACU   5    Asn   AAU   4    Ser   AGU   3
     AUC   4         ACC   5          AAC   6          AGC   4
     AUA   1         ACA   2    Lys   AAA   5    Arg   AGA   1
Met  AUG   4         ACG   0          AAG   8          AGG   1
Val  GUU   0    Ala  GCU   2    Asp   GAU   2    Gly   GGU   0
     GUC   4         GCC   2          GAC   4          GGC   4
     GUA   0         GCA   2    Glu   GAA   2          GGA   1
     GUG   4         GCG   1          GAG  10          GGG   1

TABLE III. Frequencies of bases at third-base positions.

(a) Mouse immunoglobulin γ1 (pH21-1)

                             U       C       A       G       total
Observed                     28      62      20      43      153
Expected for uniform usage   38.09   38.09   35.59   41.26   153.03
Codon usage index            0.74    1.63    0.56    1.04

(b) Other animal genes

                        U      C      A      G      total   ref.
Group I
  Mouse Ig Cγ1 (P)      0.74   1.63   0.56   1.04   153     -
  Mouse Ig Cκ           0.81   1.44   0.82   0.86   107     (31)
  Rabbit Hb α           0.37   1.99   0.16   1.41   141     (32)
  Rabbit Hb β           1.12   1.12   0.25   1.46   148     (33)
  Rat insulin           0.91   1.43   0.39   1.30   98      (34)
  Rat GH                0.77   1.68   0.34   1.21   216     (35)
  Human CS              0.39   1.70   0.45   1.41   168     (36)
  Sea urchin histones   0.84   1.67   0.65   0.96   534     (37)
Group II
  Mouse Ig Vλ           1.23   1.11   1.10   0.58   128     (38)
  Chicken Ov            1.09   1.06   1.10   0.78   386     (39)
Group III
  SV40 all genes        1.52   0.54   1.18   0.78   1514    (40)

[Five further (P)/(S) flags, printed in the order (P), (S), (P), (P), (S), follow the gene names in the source; their row assignments cannot be recovered here.]

The codon usage index is the frequency at which a base appears at the third position in codons, divided by the frequency at which it would appear if all the possible codons for each amino acid were used uniformly. Stop codons are not included. (P) = partial sequence, (S) = signal ("pre") sequence included. Ig = immunoglobulin, Hb = hemoglobin, GH = growth hormone, CS = chorionic somatomammotropin, Ov = ovalbumin.
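
The codon usage index defined in the legend can be recomputed from the counts in Table II. The sketch below is a minimal illustration, assuming the standard genetic code and using DNA letters (T for U); the names `USAGE` and `third_base_index` are introduced here and are not from the paper. For each amino acid, the expected count for a third base is the number of occurrences of that amino acid multiplied by the fraction of its synonymous codons ending in that base.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order ("*" marks stop codons).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
CODE = dict(zip(CODONS, AMINO))

# Codon counts from Table II, listed in the same TCAG order as CODONS.
COUNTS = [0, 8, 1, 0,  3, 2, 1, 0,  0, 4, 0, 0,  0, 3, 0, 4,
          1, 3, 0, 2,  4, 4, 3, 1,  2, 4, 1, 7,  0, 1, 0, 0,
          2, 4, 1, 4,  5, 5, 2, 0,  4, 6, 5, 8,  3, 4, 1, 1,
          0, 4, 0, 4,  2, 2, 2, 1,  2, 4, 2, 10, 0, 4, 1, 1]
USAGE = dict(zip(CODONS, COUNTS))


def third_base_index(usage: dict) -> dict:
    """Return {base: (observed, expected, index)} for third codon positions."""
    families = defaultdict(list)
    for codon, aa in CODE.items():
        if aa != "*":                      # stop codons are not included
            families[aa].append(codon)
    observed = defaultdict(float)
    expected = defaultdict(float)
    for codons in families.values():
        n_aa = sum(usage[c] for c in codons)
        for c in codons:
            observed[c[2]] += usage[c]
            expected[c[2]] += n_aa / len(codons)   # uniform synonymous usage
    return {b: (observed[b], expected[b], observed[b] / expected[b])
            for b in BASES}


for base, (obs, exp, idx) in third_base_index(USAGE).items():
    print(base, int(obs), round(exp, 2), round(idx, 2))
```

The observed counts and indices reproduce part (a) of Table III; the unrounded expected values for U and C come out at 38.08 rather than the printed 38.09, presumably a rounding difference in the original tabulation.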

Base Composition and Dinucleotide Frequency

The base composition and dinucleotide frequencies in the coding strand are shown in Table IV. There is a severe underabundance of the dinucleotide CG, comparable to that in total eukaryotic DNAs (44,45) and hemoglobin β genes (33,43,46). Demonstration that CG is not deficient in some other coding sequences, such as hemoglobin α genes (32,45), has ruled out earlier models in which it was proposed that ribosomes are unable to translate CG-rich sequences effectively. Subsequently, it has been suggested that CG in eukaryotes is a hotspot for mutation (13). The mechanism may be methylation of the C followed by deamination to yield T (32,45). Coulondre et al. (47) have shown that methylated Cs in E. coli are indeed hotspots for mutation, and suggested the same deamination mechanism.

Therefore, and since no amino acid is required to have CG in its codon, we asked whether the remaining CGs might be maintained by selective pressure on the nucleotide sequence. The amino acids encoded by these five CGs (arg-295, ile-326, pro-337, ala-382, phe-399) are conserved slightly more than average in other γ chains (15,25). One possible pressure for conserving these five CGs might be selection for an RNA secondary structure; however, CGs do not fall preferentially into regions of strong base pairing in our secondary structure prediction (below).

TABLE IV. Frequencies of nucleotides and dinucleotides.

A       134 (29%)
T        88 (19%)
G       105 (23%)
C       132 (29%)
total   459 (100%)

AA  35    AT  20    AC  44    AG  35
TA  10    TT  16    TC  32    TG  29
GA  32    GT  17    GC  30    GG  26
CA  57    CT  35    CC  35    CG   5
average 28.625
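
The size of the CG deficit can be made concrete with a short calculation from the Table IV totals. The sketch below compares the observed CG count with the count expected if neighbouring bases were independent; this independence baseline is an assumption introduced for illustration and is not a calculation reported in the paper.

```python
# Nucleotide counts in the coding strand, from Table IV.
counts = {"A": 134, "T": 88, "G": 105, "C": 132}
observed_cg = 5                      # CG dinucleotides, from Table IV

n = sum(counts.values())             # 459 nucleotides
n_dinucleotides = n - 1              # 458 overlapping dinucleotides

# Expected CG count if each base were drawn independently of its neighbour.
expected_cg = n_dinucleotides * (counts["C"] / n) * (counts["G"] / n)
ratio = observed_cg / expected_cg

# Mean count per dinucleotide type, matching the "average" row of Table IV.
mean_per_type = n_dinucleotides / 16

print(round(expected_cg, 2), round(ratio, 3), mean_per_type)
```

On this baseline CG occurs at roughly one sixth of the expected frequency, whereas the reverse dinucleotide GC (30 occurrences) is close to expectation.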

Homologies between CH2 and CH3 domains

The DNA sequence covers homologous parts of two immunoglobulin domains, CH2 and CH3. This is the first nucleotide sequence to cover regions which are believed to have evolved by tandem duplication of an ancestral gene, so we have examined the sequence for possible nucleotide homology between the domains. Comparison of all known heavy chain sequences (15, 23-28) indicates that the most probable alignment is between codons 287-290 and 392-395, codons 292-324 and 396-428, and codons 325-334 and 430-439, although the positions of the two deletions cannot be defined unequivocally. Within these regions, excluding codons opposite deletions and codons which differ from the sequence of Adetugbo (16), there are 45 pairs of codons of which 9 are identical in the two domains. With such low amino acid homology, reflecting the very ancient divergence of the CH2 and CH3 domains, little nucleotide homology would be expected, particularly since the conserved amino acids are probably selected for function. (This is implied by the fact that 15 of the 18 amino acids in conserved positions are also conserved in at least 3 of the 4 heavy chain classes now sequenced (γ, α, ε, μ), compared to only 24% of the other amino acids in the regions compared. In addition, most of those conserved in both CH2 and CH3 are also conserved in the CH1 domain.)

Indeed, little nucleotide sequence homology can be found between CH2 and CH3. In the 36 pairs of codons for nonidentical residues, 36/108 nucleotides are identical; the value expected by chance is 27/108. In the 9 pairs of codons for identical residues, omitting nucleotides uniquely specified by the coding requirement, 6/11 nucleotides are identical; the value expected by chance is 5/11. The slight excess over chance is attributable to selection, for conservation of chemical (and thus coding) similarities in amino acid replacements, and for C in third-base positions.
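
The chance value of 27/108 quoted above corresponds to a one-in-four probability of identity at each of the 108 compared nucleotide positions. The sketch below reproduces that figure and, as an added illustration that is not part of the original analysis, evaluates how often 36 or more matches would arise under a simple binomial model with equal base frequencies; the function name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))


positions = 36 * 3                  # 36 codon pairs for nonidentical residues
p_match = 0.25                      # assumes four equally likely bases
expected_matches = positions * p_match
tail = binomial_tail(positions, 36, p_match)

print(expected_matches, round(tail, 3))
```

Under these simplifying assumptions the observed excess is modest, in line with the description of it as slight; unequal base frequencies, such as the preference for C in third positions noted in the text, would raise the chance expectation further.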
`
`Secondary structure
`
There is usually little point in attempting to predict the secondary
structure of a fragment of an RNA, because long-range interactions may make
the most stable folding of the complete RNA very different from that of the
fragment (13). However, it seemed of interest to predict the secondary
structure of the pH21-1 sequence, in order to find out whether the domain
structure of the protein might be reflected in the structure of the RNA. One
might anticipate either that the CH3 domain would fold up into a structure
independent of CH2, or that there might be prominent local structure around
the junction between them.
The computed secondary structure is presented in Figure 5. The overall
stability is 0.345 kcal/nucleotide. This is more than the 0.331
kcal/nucleotide computed for the complete rabbit β globin mRNA using a less
powerful computer program (13) and less than the 0.407 kcal/nucleotide found
for the rabbit α globin mRNA using some of the same computer programs used
here (32).
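The per-nucleotide stability can be cross-checked against the Figure 5 caption, which gives a total energy of -158.5 kcal for the fragment spanning codons 287-439. The short sketch below is only an arithmetic check; it assumes the fragment contains exactly three nucleotides for each of those codons, and the function names are hypothetical.

```python
FIRST_CODON, LAST_CODON = 287, 439   # fragment limits from the Figure 5 caption
TOTAL_ENERGY_KCAL = -158.5           # total energy from the Figure 5 caption

# Per-nucleotide stabilities quoted in the text for the rabbit globin mRNAs.
BETA_GLOBIN = 0.331
ALPHA_GLOBIN = 0.407


def fragment_length_nt(first_codon, last_codon):
    """Nucleotides in an inclusive codon range, assuming 3 per codon."""
    return (last_codon - first_codon + 1) * 3


def stability_per_nt(total_energy_kcal, n_nt):
    """Magnitude of the folding energy per nucleotide."""
    return abs(total_energy_kcal) / n_nt


n_nt = fragment_length_nt(FIRST_CODON, LAST_CODON)    # 459
per_nt = stability_per_nt(TOTAL_ENERGY_KCAL, n_nt)    # about 0.345

print(n_nt, round(per_nt, 3))
print(BETA_GLOBIN < per_nt < ALPHA_GLOBIN)            # True
```

With 153 codons (459 nucleotides), 158.5 kcal corresponds to 0.345 kcal/nucleotide, in agreement with the value stated in the text and lying between the two globin figures.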
Examination of each part of the structure separately did not turn up any
regions of local low stability of the sort which might indicate that the
local sequence paired with other portions of this mRNA. The most striking
feature of the structure is the prominent stem formed from nucleotides 46-65
and 210-230. The CH2-CH3 junction is approximately in the center of the loop
formed by this stem. The junction itself has now been defined by an RNA
splice point in codon 335 (29) as shown in Figure 5. If this portion of the
structure is correct, the fact that the splice point is in a region of little
secondary structure bounded by the very strong stem would limit the possible
roles of secondary structure in directing splicing. That secondary structure
does have an important role is suggested by the fact that the loosely
conserved primary sequence common to all splice points (48, 49) is too small
to give the specificity needed.
One can imagine that secondary structure sequesters some potential splicing
sequences so that they are unavailable for splicing, and brings others into
reasonably close proximity in a way that facilitates the correct splicing.
Since we have not examined the intervening sequence itself, it also remains
possible that the secondary structure of the actual precursor molecule
contributes in a more definite way to the splicing specificity. Clearly more
work will be required to determine the exact role of secondary structure in
RNA splicing.

Figure 5. Computed secondary structure of the γ1 messenger RNA fragment,
codons 287-439. The inset shows an equally stable structure for the beginning
and end of the sequence. The arrow marks the position of the RNA splice
between the CH2 and CH3 domains (29). The total energy of the structure (13)
is -158.5 kcal.
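The statement that the CH2-CH3 junction lies near the center of the loop closed by the 46-65/210-230 stem can be checked roughly from the coordinates given in the text. The sketch below is a back-of-the-envelope calculation only. It assumes that nucleotide 1 of the folded fragment is the first base of codon 287; the source does not state the numbering origin, so this is an assumption, and the helper names are hypothetical.

```python
FIRST_CODON = 287        # assumed to start at nucleotide 1 of the fragment
STEM_5P = (46, 65)       # 5' arm of the stem, nucleotide positions from the text
STEM_3P = (210, 230)     # 3' arm of the stem, nucleotide positions from the text
SPLICE_CODON = 335       # codon containing the RNA splice point (29)


def codon_to_nt_range(codon, first_codon=FIRST_CODON):
    """Nucleotide positions (first, last) of a codon under the assumed numbering."""
    start = (codon - first_codon) * 3 + 1
    return start, start + 2


# Region enclosed by the two stem arms.
loop_start = STEM_5P[1] + 1                    # 66
loop_end = STEM_3P[0] - 1                      # 209
loop_length = loop_end - loop_start + 1        # 144
loop_mid = (loop_start + loop_end) / 2         # 137.5

splice_nt = codon_to_nt_range(SPLICE_CODON)    # (145, 147)
offset = splice_nt[0] - loop_mid               # 7.5

print(loop_length, loop_mid, splice_nt, offset)
```

Under this numbering, codon 335 falls at nucleotides 145-147, within about 8 nucleotides of the midpoint of the 144-nucleotide enclosed region, which agrees with the description of the junction as approximately central.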
`
ACKNOWLEDGEMENTS
`
We are grateful to R. Wall for communicating data on the identification of
pH21-1 by hybridization. D.B. DeBorde kindly gave us pH21-1 DNA, and R.
Hammen prepared additional supplies. We particularly thank L. Nagy for
running the secondary structure cycles on the computer. J.R. acknowledges a
Regents' Fellowship of the University of California and an assistantship at
the Molecular Biology Institute. The project was supported by the American
