`CONTROL NOS. 90/007,542 AND 90/007,859
`
DOCKET NOS. 22338-10230 AND -10231
`
Expression of eukaryotic genes in
E. coli
`
`T. J. R. HARRIS
`
Celltech Ltd, 250 Bath Road, Slough SL1 4DY, Berks, UK
`
I Introduction
II Gene expression in E. coli
  A Transcription
  B Translation
  C Post-translational modification
III Problems encountered in the expression of eukaryotic DNA in E. coli
IV Expression of DNA from lower eukaryotes
V The lac promoter
  A The somatostatin experiment
  B Expression of insulin in E. coli
  C Synthesis of other hormones as β-galactosidase fusions
  D Expression of ovalbumin
  E Expression of native proteins
  F Expression of human growth hormone
VI The phage λ PL promoter
  A Expression of eukaryotic genes from PL plasmids
VII The trp promoter
  A Construction of vectors
  B Expression of fusion proteins
  C Expression of interferon
VIII The β-lactamase promoter
  A Synthesis of fusion proteins
  B Secretion of native proteins using β-lactamase fusions
  C Synthesis of other native proteins
IX Conclusions and future prospects
  A Alternative promoters and constructions
  B mRNA structure and stability
  C Nature of the proteins produced
  D Other host-vector systems
`
GENETIC ENGINEERING 4
`IS liN 0·12·27U:IU4 ·0
`
Copyright © 1983 by Academic Press London.
All rights of reproduction in any form reserved.
`
`EVIDENCE APPENDIX
`
`PAGE 8467
`
`Sanofi/Regeneron Ex. 1 027, pg 835
`
`Merck Ex. 1027, pg 861
`
`
`
X Acknowledgments
XI References
`
I Introduction
`
In recent years the techniques of in vitro DNA recombination
followed by transfection of suitable host cells with recombinant
vectors (gene cloning) have led to a great increase in our understanding
of the structure and function of the genomes of many organisms. In
`the early stages of this work it became clear that genes which were
`cloned in this way could be expressed in the new host if the genetic
`elements controlling expression were suitably arranged. The results
`of these efforts will find application in two spheres. In the first, new
approaches to fundamental studies on the relationship of protein
structure to function will be possible. Already, molecules have been
produced which are hybrids of the appropriate regions of different
interferon molecules and their functions are being examined. This is
possible not only because the genes for the proteins can be recombined
but because they can then be expressed in E. coli in quantities
sufficient for purification and biological study (Streuli et al., 1981;
Weck et al., 1981). Further extensions of this kind of work can be
foreseen where one or a few selected amino acids (e.g. near the active
site of an enzyme) are altered by in vitro mutagenesis (Shortle et al.,
1981; Lathe et al., 1983) and the effect on enzymatic function
assayed. Secondly, such is the power of these gene cloning and
`expression techniques that they are already having a profound
`impact on the practice of biotechnology and it seems that few areas
`of this technology will remain unaffected by them. Indeed, the first
proteins made by recombinant DNA techniques are now being produced
in sufficient quantity for extensive safety and efficacy testing.
Insulin and growth hormone, both conventionally isolated from
human endocrine tissue, have now been made in E. coli and the
proteins purified (Goeddel et al., 1979a, 1979b). Considerable effort
has been expended on the isolation and expression of both leukocyte
(Le or α) and fibroblast (F or β) interferon genes so that the potential
of these antiviral compounds can be evaluated properly (see Scott
and Tyrrell, 1980). There is also the possibility of producing proteins
`for use as vaccines against a variety of infectious agents by cloning
`and expressing the genes coding for the relevant surface immunogens.
Notable progress has been made towards a vaccine for foot and
mouth disease virus (FMDV) using this approach, where one of the
capsid proteins (VP1) produced in E. coli has been shown to elicit
`neutralizing antibody (Kleid et al., 1981). Genetically engineered
`vaccines for other viruses such as hepatitis B and rabies virus are also
`being considered.
Although none of these initial examples of the expression of
proteins from recombinant organisms is as yet established as a
biotechnological process, the way in which the expression of the
recombinant DNA was achieved forms a general paradigm for all future
`studies. However, at the same time, it is clear that not all the rules
`governing the expression of cloned genes have been elaborated and
`those rules that do exist are still largely empirical. In this article
`the ways in which expression has been achieved are reviewed, some
`of the problems discussed and some of the probable future systems
`considered.
`
II Gene expression in E. coli
`
E. coli has been used as the host cell for expression of foreign genes
mainly because more is known about the control of gene expression
in this organism than in any other. It is well established, for example,
that the genes involved in a particular metabolic activity tend to be
clustered in transcriptional units (operons) with the major control
regions (the operator and promoter) located at the beginning of the
cluster (for a detailed description of bacterial gene expression, see
Miller and Reznikoff, 1980). The operon is transcribed into a
polycistronic mRNA from which the polypeptides are then translated.
Transcriptional control is exerted over the expression of an operon
and varies depending on the function of the genes in the operon
(see Miller and Reznikoff, 1980). Since relatively few promoter
systems are currently being utilized to express cloned genes, the
essential elements of their control mechanisms will be dealt with
when considering each system. Expression of a cloned gene requires
`efficient and specific transcription of the DNA, translation of the
`mRNA and in some cases post-translational modification of the
`resulting protein.
`
`A Transcription
`
The first step in the initiation of transcription in E. coli is the binding
of RNA polymerase to a promoter sequence in the DNA. Analysis of
the DNA sequence of many promoters in E. coli has revealed two
regions of homology located about 35 base pairs (bp) upstream
from the transcription initiation site (the -35 region) and about
10 bp upstream (the -10 region or Pribnow-Schaller box). The
`conserved sequences in the -35 and -10 regions (TTGACA and
`TATAAT respectively, Rosenberg and Court, 1979; Siebenlist et al.,
`1980) probably represent those bases most intimately involved in
`polymerase binding and orientation via sigma factor, so that RNA
`chain initiation can take place just downstream.
`Transcription termination is also controlled by signals in the DNA
`sequence, characteristically a GC rich region having a two-fold
`symmetry before the termination site, followed by an AT rich
`sequence at the site of termination (Rosenberg and Court, 1979).
Several protein factors are also involved in the control of
termination, most notably the rho factor. Anti-termination proteins such
as the N gene product of phage λ can also be involved in specialised
systems (Greenblatt et al., 1981).
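
The promoter description above can be illustrated with a minimal Python sketch that scans a DNA strand for sequences resembling the -35 (TTGACA) and -10 (TATAAT) consensus hexamers. The allowed spacer of 16 to 18 bp between the two hexamers, the mismatch tolerance and the test sequence are assumptions made for this illustration; they are not taken from the text, and a real promoter search would need a more careful scoring scheme.

```python
# Consensus hexamers quoted in the text for E. coli promoters.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(dna, max_mismatch=1, spacer=(16, 18)):
    """Return (pos35, pos10, box35, box10) for each candidate promoter.

    The spacer range between the hexamers is an assumption of this
    sketch; the text only gives approximate positions for -35 and -10.
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 5):
        box35 = dna[i:i + 6]
        if mismatches(box35, MINUS_35) > max_mismatch:
            continue
        for gap in range(spacer[0], spacer[1] + 1):
            j = i + 6 + gap
            box10 = dna[j:j + 6]
            if len(box10) == 6 and mismatches(box10, MINUS_10) <= max_mismatch:
                hits.append((i, j, box35, box10))
    return hits


# Made-up test sequence: both consensus boxes separated by 17 bp.
example = "GGC" + "TTGACA" + "A" * 17 + "TATAAT" + "GCGCGCA"
print(find_promoter_candidates(example))
```

Allowing one mismatch per hexamer reflects the point made in the text that the consensus is a homology, not an invariant sequence.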
`
`B Translation
Efficient translation of mRNA in prokaryotic cells requires the
presence of a ribosome binding site (rbs). For most E. coli mRNAs
the rbs consists of two components, the initiation codon AUG and,
lying 3-12 bases upstream, a sequence of 3-9 bases called the
Shine-Dalgarno (SD) sequence complementary to the 3' end of the
16S rRNA (Shine and Dalgarno, 1975). It is believed that
hybridization to this region is involved in the attachment of the ribosomal
30S subunit to the mRNA (Steitz, 1979). The SD sequence is not
`identical in all mRNAs but a semi-conserved consensus sequence has
`been identified just as for promoter sequences. It is possible that
`differences in SD sequences form part of a translational control
`system. In addition, ribosome binding is probably modulated by the
`secondary structure at the 5' end of the RNA since more efficient
`translation occurs if the AUG and SD sequence are freely accessible
`to 30S ribosomal subunits (Iserentant and Fiers, 1980). Termination
`of translation usually occurs whenever one of the three stop codons
`is encountered in the mRNA by a ribosome complex, provided that
an aminoacylated suppressor tRNA is not present.
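
As a rough illustration of the two-component rbs described above, the following Python sketch looks upstream of each AUG for a stretch of 3-9 bases complementary to the 3' end of 16S rRNA, lying 3-12 bases before the initiation codon. The 16S rRNA 3'-terminal sequence used here (ACCUCCUUA) is supplied from general knowledge and is an assumption as far as this text is concerned; the example mRNA is invented. Secondary structure, which the text notes also modulates ribosome binding, is ignored.

```python
# 3' end of E. coli 16S rRNA written 5'->3' (assumed, not quoted in the text).
ANTI_SD = "ACCUCCUUA"


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[b] for b in reversed(seq))


def find_rbs(mrna, min_len=3, max_len=9, min_gap=3, max_gap=12):
    """Return (aug_position, sd_sequence, gap) for each AUG with an SD-like
    stretch upstream; the longest complementary stretch is reported."""
    mrna = mrna.upper().replace("T", "U")
    sd_full = reverse_complement_rna(ANTI_SD)  # fully complementary SD
    results = []
    start = mrna.find("AUG")
    while start != -1:
        best = None
        for gap in range(min_gap, max_gap + 1):
            end = start - gap  # SD candidate ends `gap` bases before the AUG
            for length in range(max_len, min_len - 1, -1):
                begin = end - length
                if begin < 0:
                    continue
                window = mrna[begin:end]
                if window in sd_full:
                    if best is None or length > len(best[0]):
                        best = (window, gap)
                    break  # shorter windows at this gap cannot be better
        if best is not None:
            results.append((start, best[0], best[1]))
        start = mrna.find("AUG", start + 1)
    return results


# Invented mRNA: a full-length SD sequence 5 bases upstream of the AUG.
example_mrna = "GGC" + "UAAGGAGGU" + "AAACC" + "AUG" + "GCUGGUUGU"
print(find_rbs(example_mrna))
```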
`
`C Post-translational modification
There are a variety of modifications that bacterial proteins can
undergo following translation. The formyl group on the NH2-terminal
methionine is hydrolysed and one or more NH2-terminal residues
may be removed. Many secreted proteins are synthesized as large
precursors with additional hydrophobic NH2-terminal signal
sequences that are cleaved off by a membrane bound enzyme (for
review, see Davis and Tai, 1980). However, glycosylation and
phosphorylation, which are common modifications of proteins in
eukaryotic cells, do not occur to any great extent in E. coli.
`
III Problems encountered in the expression of
eukaryotic DNA in E. coli
`
Successful expression of a eukaryotic gene in E. coli requires that the
cellular machinery is organised so that the level of expression of
the cloned gene is as good as or better than that of the resident genes. Probably
`the most important difference between eukaryotic genes (at least
`from higher organisms) and prokaryotic genes is the presence of
`intervening sequences (introns) which interrupt the coding sequences.
`Normally these sequences are spliced out of the initial RNA
`transcript, producing cytoplasmic mRNA suitable for translation.
`There are no introns in prokaryotic genes and consequently no
`splicing enzymes present, so in general genomic DNA cannot be used
as a source of genes for expression in bacterial cells. A second
problem is that transcriptional signals in eukaryotes are different
from those in prokaryotes (Corden et al., 1980; Breathnach and
Chambon, 1981) and are not usually recognised by bacterial RNA
polymerase. This difference again emphasizes the fact that eukaryotic
genomic DNA is not a suitable gene source for construction of
expression vectors. Thirdly, the structure of eukaryotic mRNA is
different to bacterial mRNA. Eukaryotic mRNA is polyadenylated
at the 3' end and normally capped at the 5' end, features which
may affect mRNA stability and ribosome binding (Breathnach and
Chambon, 1981). Furthermore eukaryotic mRNA does not seem to
have an equivalent of the SD sequence present in prokaryotic mRNA
(Kozak, 1981).
An additional problem is that of codon usage. The codons used
in mRNA coding for highly expressed prokaryotic genes are not
random; there is a marked preference for particular codons for
some amino acids (Grantham et al., 1981; see Grosjean and Fiers,
1982). This preference appears to correlate with the abundance of
different tRNA species (Ikemura, 1981). As codon selection
preferences are different for eukaryotic genes it is possible that the
levels of certain tRNAs will affect translational efficiency of these
genes in a prokaryotic system. Finally, it is known that many
eukaryotic proteins are subject to a number of post-translational
modifications which may affect either activity or stability. Most of
these modifications do not occur in prokaryotes.
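
The codon usage problem can be made concrete with a small Python sketch that tabulates the codons of a coding sequence and reports what fraction belongs to a set of "preferred" codons. The preferred set and the test sequence below are purely illustrative assumptions; in practice the set would have to be derived from a compilation of highly expressed genes of the host.

```python
from collections import Counter


def codon_counts(cds):
    """Count the codons of an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def fraction_preferred(cds, preferred):
    """Fraction of codons in `cds` that fall in the `preferred` set."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    return sum(n for codon, n in counts.items() if codon in preferred) / total


# Hypothetical preferred set and made-up sequence, for illustration only.
preferred_codons = {"CTG", "CGT"}
test_cds = "CTGCTGCTACGT"  # codons: CTG CTG CTA CGT
print(codon_counts(test_cds))
print(fraction_preferred(test_cds, preferred_codons))
```

A low fraction for a eukaryotic cDNA would flag the kind of mismatch with host tRNA abundance discussed above, although, as the text stresses, it is not established that codon bias limits translation of cloned genes.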
A number of strategies have been developed to try to overcome
these difficulties (Table 1). Once the amino acid sequence of a
`
Table 1 General strategies for the expression of cloned genes in E. coli.

Control level: Gene
Strategy: Synthesise DNA in vitro by chemical methods, with optimised
codon assignments, or obtain cDNA clone to specific mRNA. Chemical DNA
synthesis probably required for tailoring genes into expression vector.

Control level: Transcription (Initiation and termination)
Strategy: Clone gene adjacent to strong E. coli promoter which is
controllable so that transcription can be induced (derepressed) when
required. Use a multicopy plasmid to increase gene dosage. Include
termination signal after gene to prevent transcriptional read-through.

Control level: Translation (Initiation)
Strategy: Fuse gene in correct translational reading frame to an
E. coli gene already in the vector, so that normal rbs is maintained.
Possible to use both long and short NH2-terminal fusions.
Alternatively, place new gene with its own AUG adjacent to an rbs. The
sequence of the SD sequence and distance from the initiating AUG may
modulate translation. Accessibility (secondary structure) around
SD-AUG may be important. Codon usage can be overcome by using
chemically synthesized genes. Not clear if codon bias actually affects
the translation of cloned genes. Include stop codon(s) in chemically
synthesised genes.

Control level: Protein (Secretion and stability)
Strategy: Use signal sequences to control secretion? Synthesis of
precursor proteins followed by their processing ensures removal of
NH2-terminal initiating methionine. Factors affecting folding of
foreign proteins and their degradation in E. coli are not well defined.
Synthesis of long fusion proteins may result in increased stability.
`
protein is known it is now a relatively straightforward task to design
and synthesize chemically a DNA sequence that will code for the
protein without the problem of intervening sequences and with
optimized codon assignments. A gene of 514 bp coding for leukocyte
Le (α) interferon, a protein of 166 amino acids, is the longest DNA
sequence that has been synthesized so far (Edge et al., 1981).
Although there is no theoretical limit to the size of gene that can be
synthesized, practical problems arise for much larger proteins. If the
gene is too big for a chemical synthesis, then double stranded DNA
copies of mRNA populations can be generated, cloned into a plasmid
vector and the clone containing the sequence coding for the protein
of interest selected from the clone bank by hybridization techniques.
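
The design of a synthetic gene from a known amino acid sequence can be sketched as a simple back-translation. The Python example below picks one codon per amino acid, adds an initiating ATG and appends two stop codons. The codon table is an illustrative assumption (one plausible choice per amino acid), not a table taken from this article, and the function name and stop codon choice are likewise hypothetical. The test peptide is the 14-residue somatostatin sequence shown in Fig. 1.

```python
# One codon per amino acid; an illustrative, assumed choice of codons.
PREFERRED_CODON = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def design_gene(protein, add_start=True, stops=("TGA", "TAG")):
    """Back-translate a protein (one-letter code) into a DNA coding strand."""
    codons = [PREFERRED_CODON[aa] for aa in protein.upper()]
    if add_start:
        codons.insert(0, "ATG")  # initiation codon
    codons.extend(stops)         # explicit termination codon(s)
    return "".join(codons)


somatostatin = "AGCKNFFWKTFTSC"
gene = design_gene(somatostatin)
print(gene, len(gene))
```

A practical design would also add cohesive ends for cloning and check the result for unwanted restriction sites, steps omitted here.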
Transcription of these genes is controlled by inserting the DNA
adjacent to a strong prokaryotic promoter in a cloning vector. Four
promoters have been used most widely for this purpose: the lac
promoter from the E. coli lac operon; the trp promoter from the
E. coli trp operon; the strong leftward promoter of phage λ (PL) and
the constitutive and weaker β-lactamase promoter present in the
plasmid vector pBR322. The expression vectors themselves are
usually derived from high copy number plasmids so that there is
increased expression owing to gene dosage (Gelfand et al., 1978;
O'Farrell et al., 1978). Termination of transcription can be ensured
by placing a termination site after the cloned gene (e.g. Nakamura
and Inouye, 1982) although whether this is necessary for the
maintenance of high levels of transcription is not yet clear. The
consequences of uninterrupted transcription around a small circular
plasmid DNA molecule are unknown. It is presumably detrimental
since most expression vectors have other genes present (e.g. an
antibiotic resistance gene) which are transcribed in the opposite
direction from a different promoter and it is known that the
transcription of genes in λ phage carrying the trp promoter is adversely
affected if the trp promoter is in an orientation where transcription
occurs towards transcripts arising from the PL promoter (Hopkins
et al., 1976).
Translational barriers have been overcome to some extent by two
procedures. The foreign gene is either fused (in the correct
translational reading frame) to a prokaryotic gene so that the existing rbs
is used to initiate translation, or the new gene, with its own initiation
codon, is placed adjacent to a naturally occurring E. coli rbs (Backman
et al., 1980) or a synthetic one (Jay et al., 1981). Since all structural
genes, whether eukaryotic or prokaryotic, end with one or more of
the three termination codons it is not usually necessary to make
special arrangements for translational termination when using a
cloned cDNA sequence. However, a termination codon must be
included when synthetic DNA is used.
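
The requirement that a fusion preserve the translational reading frame can be checked mechanically. The Python sketch below builds the standard genetic code, translates a fused sequence from its first base and shows how a single extra base at the junction changes everything downstream. The short "prokaryotic" prefix and the insert are invented sequences used only for illustration.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def in_frame(prefix):
    """A fusion keeps the insert in frame if the prefix is whole codons."""
    return len(prefix) % 3 == 0


insert = "GCTGGTTGCTGA"          # made-up insert: Ala-Gly-Cys-stop
good_prefix = "ATGACCATG"        # hypothetical 3-codon host gene fragment
bad_prefix = "ATGACCATGA"        # same fragment with one extra base

print(in_frame(good_prefix), translate(good_prefix + insert))
print(in_frame(bad_prefix), translate(bad_prefix + insert))
```

In the out-of-frame fusion the stop codon of the insert is no longer read as such, which also illustrates why termination codons need attention when synthetic DNA is used.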
Protein modification and stability are much less easy to control,
largely because the structural features governing protein stability in
E. coli are not well understood. It has been shown that eukaryotic
signal sequences are recognised by E. coli and that NH2-terminal
fusions of eukaryotic polypeptides to E. coli signal sequences result
in secretion of the protein to the periplasmic space, with concomitant
cleavage of the signal sequence (Talmadge et al., 1980; 1981). There
is also some evidence that short "foreign" polypeptides are unstable
in E. coli (Itakura et al., 1977; Goeddel et al., 1979a). This has been
`overcome by fusing the peptide to a larger E. coli protein from which
`the peptide is then cleaved.
`
`IV Expression of DNA from lower eukaryotes
`
Following the observation that DNA from S. aureus could be
expressed in E. coli (Chang and Cohen, 1974) it was shown that
eukaryotic DNA could also be transcribed (Morrow et al., 1974;
Chang et al., 1975; Kedes et al., 1975). It was not clear from these
experiments, however, whether the normal transcriptional start and
stop signals were being recognised. The fundamental question of
whether a fungal gene could be transcribed and translated to produce
a functional protein in E. coli was answered to some extent by the
finding that fragments of yeast DNA cloned into phage λ, or the
plasmid vector Col E1, could complement auxotrophic mutants of
E. coli (e.g. His B and Leu B) (Struhl et al., 1976; Ratzkin and
Carbon, 1977; Struhl and Davis, 1977). Similarly segments of
Neurospora crassa DNA containing the gene for dehydroquinase were
successfully expressed in E. coli in a pBR322 replicon (Vapnek et al.,
1977). Several other yeast genes have now been expressed in this way
(e.g. Trp 1, Trp 5 and Arg 4). The functional expression of yeast
DNA in E. coli not only demonstrated that eukaryotic DNA could be
transcribed and translated, paving the way for the experiments
described below, but also provided a powerful method for isolating
yeast genes. Some of these genes have subsequently been used to
provide selection markers in yeast-E. coli shuttle vectors (Beggs,
1982; Hinnen and Meyhack, 1982).
`
V The lac promoter
`
`The lac operon is subject to two types of control. In the absence of
`lactose (or other inducer) the operon is kept switched off by lac
`repressor (the lac i gene product) binding to the operator. Positive
`regulation is also exerted through the catabolite gene activator
`protein (CAP). In the absence of glucose, CAP forms a complex with
`cyclic AMP and this complex stimulates transcription by binding
`next to the promoter. The operon is derepressed by the presence of
lactose, or by the addition of the non-metabolizable inducer IPTG
`(isopropylthiogalactoside) which binds to the repressor and removes
`it from the operator.
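
The dual control of the lac operon described above reduces to a small truth table, sketched here in Python. The three output levels ("off", "low", "high") are a qualitative simplification assumed for this illustration; the text gives no quantitative levels, and the function name is hypothetical.

```python
def lac_transcription(inducer_present, glucose_present):
    """Qualitative state of lac transcription under the two controls.

    Negative control: without inducer (lactose or IPTG) the repressor
    stays bound to the operator.
    Positive control: without glucose, the CAP-cyclic AMP complex binds
    next to the promoter and stimulates transcription.
    """
    repressor_bound = not inducer_present
    cap_camp_bound = not glucose_present
    if repressor_bound:
        return "off"
    return "high" if cap_camp_bound else "low"


for inducer in (False, True):
    for glucose in (False, True):
        print(inducer, glucose, lac_transcription(inducer, glucose))
```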
`
Plasmid vectors containing parts of the lac operon have been
constructed by several workers. Polisky et al. (1976) cloned an
EcoRI fragment from λ plac5 DNA (a transducing phage containing
part of the lac operon) into a Col E1-derived plasmid to obtain
a vector with the lac promoter and operator and most of the
β-galactosidase gene. Plasmids containing a small "portable" lac
promoter fragment have also been made. In these constructions a
203 bp Hae III fragment of lac transducing phage DNA, containing the
lac promoter and operator and first eight codons of β-galactosidase,
was blunt end ligated into EcoRI-cut and "filled in" pBR322 DNA.
The portability derives from the fact that EcoRI sites are reformed
at the junctions allowing the promoter fragment to be removed by
EcoRI digestion (Backman and Ptashne, 1976; Itakura et al., 1977).
Colonies harbouring plasmids which carried the lac promoter-operator
were identified by their constitutive synthesis of β-galactosidase,
rendering them blue on agar plates containing X-gal
(5-bromo-4-chloro-3-indolyl-β-D-galactoside). This is because multiple copies of
the operator titrate out all the lac repressor resulting in derepression
of the chromosomal β-galactosidase gene. Both λ plac5 DNA and
λ h80 lac UV5 cI857 DNA, which contains the CAP site mutation
L8 and the up promoter mutation UV5 (making the promoter
insensitive to catabolite repression), have been used as a source of lac
DNA for these constructions (Backman et al., 1976; Itakura et al.,
1977; see also Fuller, 1982). Further derivative plasmids containing a
95 bp AluI fragment of lac DNA, including the UV5 promoter (minus
the CAP binding site), the repressor binding site and most of the rbs,
just excluding the ATG of β-galactosidase, have also been
constructed for the expression of non-fusion proteins (Fuller, 1982).
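
Why EcoRI sites are reformed when a Hae III fragment is ligated into filled-in EcoRI ends can be followed on the top strand alone. EcoRI cuts G^AATTC and Hae III cuts GG^CC; filling in the EcoRI overhang leaves blunt ends reading ...GAATT and AATTC..., and a Hae III fragment begins with CC and ends with GG. The Python sketch below reproduces this with invented flanking and insert sequences (the real pBR322 and lac sequences are not used) and treats the plasmid as linear for simplicity.

```python
ECORI = "GAATTC"  # EcoRI recognition site, cut as G^AATTC


def fill_in_ecori_ends(vector):
    """Cut a vector at its single EcoRI site and fill in the overhangs.

    Returns the two blunt ends as top-strand sequences: the left arm
    ends in GAATT and the right arm starts with AATTC.
    """
    pos = vector.index(ECORI)
    left = vector[:pos + 5]    # ...G plus filled-in AATT
    right = vector[pos + 1:]   # AATTC...
    return left, right


def ecori_cut(seq):
    """Cut the top strand after the G of every GAATTC."""
    pieces, start = [], 0
    pos = seq.find(ECORI)
    while pos != -1:
        pieces.append(seq[start:pos + 1])
        start = pos + 1
        pos = seq.find(ECORI, pos + 1)
    pieces.append(seq[start:])
    return pieces


# Invented sequences: a vector with one EcoRI site and a Hae III fragment
# (begins with CC, ends with GG after cutting at GG^CC).
vector = "CCGTAT" + ECORI + "TCATGT"
hae3_fragment = "CC" + "ATGACCATGATTACG" + "GG"

left, right = fill_in_ecori_ends(vector)
recombinant = left + hae3_fragment + right

print(recombinant.count(ECORI))   # EcoRI sites at both junctions
print(ecori_cut(recombinant)[1])  # fragment released by EcoRI digestion
```

The released fragment carries the whole insert with EcoRI cohesive ends, which is what makes the promoter fragment "portable".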
`
`A The somatostatin experiment
The first report of the designed expression of a eukaryotic gene in
E. coli was the production of the small peptide hormone somatostatin
(Itakura et al., 1977). Somatostatin was used as a model
system because the hormone was a small polypeptide of known
amino acid sequence for which sensitive radioimmune and biological
assays existed. The experiments illustrate a number of features of
methods which are now used to obtain expression of cloned genes.
They also demonstrated, although not for the first time, that
chemically synthesized DNA was functional in a biological system.
In addition, the production of the protein as a fusion polypeptide
and its subsequent cleavage into the native hormone at methionine
residues by cyanogen bromide (CNBr), has been used quite extensively
for other proteins. This overall strategy is depicted in Fig. 1.
`
[Figure 1: schematic. A somatostatin gene designed from the genetic code
and made by chemical DNA synthesis is joined to E. coli lac operon DNA
(lac promoter, operator and β-galactosidase sequence) in pBR322 plasmid
DNA. In vivo, a β-galactosidase-somatostatin hybrid protein is made
(NH2-β-Gal-Met-Ala-Gly-Cys-Lys-Asn-Phe-Phe-Trp-Lys-Thr-Phe-Thr-Ser-Cys-OH).
In vitro cyanogen bromide cleavage yields β-galactosidase fragments and
active somatostatin.]

Figure 1 Strategy for the expression of the chemically synthesized
somatostatin gene as a β-galactosidase fusion from the lac promoter.
The active hormone can be cleaved from the hybrid protein by CNBr
treatment. (Reproduced from Itakura et al., 1977, copyright by the
American Association for the Advancement of Science, with permission.)
In the first set of experiments the chemically synthesized somatostatin
gene with synthetic EcoRI and Bam HI cohesive ends was
cloned into a vector containing the wild type Hae III lac promoter
fragment. The DNA sequence indicated that the plasmid should have
produced a polypeptide containing the first seven amino acids of
β-galactosidase fused to somatostatin. However, no somatostatin was
detected in bacterial extracts by radioimmunoassay. As it was found
that somatostatin was not stable when added to E. coli extracts, the
failure to find somatostatin was thought to be due to proteolytic
digestion (Itakura et al., 1977). The approach adopted to try to
stabilise the somatostatin was to produce it as part of a longer
polypeptide from which it could be cleaved by CNBr. This was done
by linking the somatostatin gene to the EcoRI fragment of λ plac5
DNA which carries the lac promoter and a large proportion of the
β-galactosidase gene (Polisky et al., 1976). The translation reading
frame of β-galactosidase was maintained in somatostatin after fusion
at the EcoRI junction. In these constructions only one orientation
of the EcoRI lac fragment maintained the correct reading frame in
somatostatin and when several independent clones were examined,
about half produced detectable somatostatin after CNBr cleavage.
No immunoreactive protein was detected before cleavage since the
antiserum used in the assay required a free NH2-terminal alanine
residue (Itakura et al., 1977).
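
The release of somatostatin from the hybrid protein can be modelled as cleavage on the COOH-terminal side of every methionine. In the Python sketch below the somatostatin sequence is that of Fig. 1, whereas the carrier sequence is a made-up placeholder standing in for the β-galactosidase portion. The chemistry is simplified: CNBr in fact converts the methionine to a homoserine residue, which the sketch leaves written as M.

```python
SOMATOSTATIN = "AGCKNFFWKTFTSC"  # one-letter code, from Fig. 1


def cnbr_cleave(protein):
    """Split a protein after every methionine (M)."""
    fragments, current = [], ""
    for aa in protein:
        current += aa
        if aa == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


def survives_cnbr(peptide):
    """A peptide is released intact only if it has no internal Met."""
    return "M" not in peptide


carrier = "MSTPLKEDAG"            # hypothetical stand-in for the carrier
fusion = carrier + "M" + SOMATOSTATIN

fragments = cnbr_cleave(fusion)
print(fragments)
print(survives_cnbr(SOMATOSTATIN))
```

The last fragment begins with alanine, consistent with the observation that the antiserum detected the hormone only after cleavage had exposed a free NH2-terminal alanine.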
`
`B Expression of insulin in E. coli
The somatostatin work established the feasibility of the synthetic
gene fusion approach for the expression of small polypeptides in
E. coli. It was possible to follow an almost identical strategy to
obtain expression of human insulin, as neither the 21 amino acid
A chain nor the 30 amino acid B chain of insulin contained
methionine and methods were available for the in vitro joining of the
two chains. Thus, an A chain gene and a B chain gene were
chemically synthesized each with Bam HI and EcoRI cohesive ends (Crea
et al., 1978) and cloned separately into pBR322. The B chain gene
was synthesized with a HindIII site in the middle so that the two
halves could be cloned separately and the sequence verified (Goeddel
et al., 1979a). Expression was achieved by transcription from the
same lac promoter as used for the successful somatostatin
constructions and insulin A or B β-galactosidase fusion proteins were
produced (Goeddel et al., 1979a). The hybrid proteins represented
about 20% of total cell protein, which was about ten-fold higher than
the level of expression obtained with somatostatin. The hybrid
proteins were insoluble and were found in the first low speed pellet
after breaking the cells with a French press.
To obtain A and B peptides suitable for reconstitution into
native insulin, the hybrid proteins had to be solubilised, the
β-galactosidase portion removed and the peptides S-sulphonated. This
was achieved by dissolving the hybrid proteins in 6 M guanidinium
chloride followed by dialysis. The precipitate was dissolved in 70%
formic acid, the protein cleaved with CNBr and S-sulphonated
derivatives of the peptide mixture obtained, using sodium dithionate
and sodium sulphite at pH 9. Insulin activity was readily detected
by radioimmunoassay after reconstitution. Further studies on the
peptides (e.g. chromatographic behaviour) and amino acid
compositions established, without doubt, that the bacteria were
producing authentic insulin A and B chains (Goeddel et al., 1979a).
Insulin, prepared from bacteria containing these constructions by a
scaled up and modified process, has now been shown to be active
`when injected into human volunteers (Clark et al., 1982) and to
`interact with insulin receptors in the same way as native human
`insulin (Keefer et al., 1981).
An alternative approach involves the synthesis of a gene coding for
proinsulin, the natural precursor to insulin. Proinsulin is synthesized
initially as a preproinsulin molecule consisting of an NH2-terminal
signal sequence, followed by the B chain, a linking C peptide and the
COOH-terminal A chain. Enzymatic removal of the signal peptide
during secretion generates proinsulin and processing at two trypsin
sensitive sites (Arg-Arg, Lys-Arg) allows the removal of the C peptide
and the generation of active insulin. The three dimensional structure
of insulin indicates that a peptide much shorter than the 35 amino
acid connecting C peptide should be sufficient to connect the B and
A chains and still allow proper folding of the modified proinsulin.
Genes coding for human proinsulin and "mini C" derivatives of
proinsulin, where the C peptide is replaced by a six amino acid linker
retaining the proteolytic cleavage sites, have been constructed by
chemical synthesis (Sung et al., 1979; Wetzel et al., 1981a; Brousseau
et al., 1982).
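
The layout of proinsulin and its processing at the two dibasic sites can be sketched in Python. The human B chain, C peptide and A chain sequences below are supplied from general knowledge and should be treated as assumptions, since the text does not quote them; the function name is hypothetical. The sketch excises the connecting segment between the Arg-Arg and Lys-Arg sites and confirms that neither insulin chain contains methionine, which is the property that makes CNBr release from a fusion protein possible.

```python
# Assumed human sequences (one-letter code); not quoted in the text.
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"
C_PEPTIDE = "EAEDLQVGQVELGGGPGAGSLQPLALEGSLQ"
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"

# B chain, Arg-Arg, C peptide, Lys-Arg, A chain.
PROINSULIN = B_CHAIN + "RR" + C_PEPTIDE + "KR" + A_CHAIN


def process_proinsulin(pro):
    """Split proinsulin at the Arg-Arg and Lys-Arg processing sites.

    Returns (B chain, connecting segment including the basic residues,
    A chain).
    """
    rr = pro.index("RR")
    kr = pro.index("KR", rr + 2)
    return pro[:rr], pro[rr:kr + 2], pro[kr + 2:]


b, connecting, a = process_proinsulin(PROINSULIN)
print(len(b), len(connecting), len(a))
print("M" in b + a)
```

With these sequences the connecting segment, including its flanking basic residues, comes to 35 residues, matching the figure given in the text for the connecting C peptide.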
The mini C construction was cloned for expression as a
β-galactosidase fusion protein (Wetzel et al., 1981a) and a product with
a proinsulin-like structure (as determined by radioimmunoassay and
HPLC) was detected after CNBr cleavage and S-sulphonation. The
usefulness of this route to insulin production is still not clear
however, as there are no data on the behaviour of mini C derivatives
in enzymatic proinsulin processing systems and there are already
preproinsulin expression constructions available derived from cDNA
(see β-lactamase section). However, the modular approach to the
`chemical synthesis of proinsulin adopted by Brousseau et al. (1982)
`does have the advantage that the shortening and changing of parts
`of the C peptide or alteration of the codons can be approached
`rationally by the incorporation