CONTROL NOS. 90/007,542 AND 90/007,859

DOCKET NOS. 22338-10230 AND -10231

Expression of eukaryotic genes in E. coli

T. J. R. HARRIS

Celltech Ltd, 250 Bath Road, Slough SL1 4DY, Berks, UK
`
I Introduction
II Gene expression in E. coli
   A Transcription
   B Translation
   C Post-translational modification
III Problems encountered in the expression of eukaryotic DNA in E. coli
IV Expression of DNA from lower eukaryotes
V The lac promoter
   A The somatostatin experiment
   B Expression of insulin in E. coli
   C Synthesis of other hormones as β-galactosidase fusions
   D Expression of ovalbumin
   E Expression of native proteins
   F Expression of human growth hormone
VI The phage λ PL promoter
   A Expression of eukaryotic genes from PL plasmids
VII The trp promoter
   A Construction of vectors
   B Expression of fusion proteins
   C Expression of interferon
VIII The β-lactamase promoter
   A Synthesis of fusion proteins
   B Secretion of native proteins using β-lactamase fusions
   C Synthesis of other native proteins
IX Conclusions and future prospects
   A Alternative promoters and constructions
   B mRNA structure and stability
   C Nature of the proteins produced
   D Other host-vector systems
`
`MERCK v. GENENTECH
`IPR2016-01373
`
`30
`GENENTECH 2010
`
`
`
X Acknowledgments
XI References
`
I Introduction
`
In recent years the techniques of in vitro DNA recombination
followed by transfection of suitable host cells with recombinant
vectors (gene cloning) have led to a great increase in our understanding
of the structure and function of the genomes of many organisms. In
the early stages of this work it became clear that genes which were
cloned in this way could be expressed in the new host if the genetic
elements controlling expression were suitably arranged. The results
of these efforts will find application in two spheres. In the first, new
approaches to fundamental studies on the relationship of protein
structure to function will be possible. Already, molecules have been
produced which are hybrids of the appropriate regions of different
interferon molecules and their functions are being examined. This is
possible not only because the genes for the proteins can be
recombined but because they can then be expressed in E. coli in quantities
sufficient for purification and biological study (Streuli et al., 1981;
Weck et al., 1981). Further extensions of this kind of work can be
foreseen where one or a few selected amino acids (e.g. near the active
site of an enzyme) are altered by in vitro mutagenesis (Shortle et al.,
1981; Lathe et al., 1983) and the effect on enzymatic function
assayed. Secondly, such is the power of these gene cloning and
expression techniques that they are already having a profound
impact on the practice of biotechnology and it seems that few areas
of this technology will remain unaffected by them. Indeed, the first
proteins made by recombinant DNA techniques are now being
produced in sufficient quantity for extensive safety and efficacy testing.
Insulin and growth hormone, both conventionally isolated from
human endocrine tissue, have now been made in E. coli and the
proteins purified (Goeddel et al., 1979a, 1979b). Considerable effort
has been expended on the isolation and expression of both leukocyte
(Le or α) and fibroblast (F or β) interferon genes so that the potential
of these antiviral compounds can be evaluated properly (see Scott
and Tyrrell, 1980). There is also the possibility of producing proteins
for use as vaccines against a variety of infectious agents by cloning
and expressing the genes coding for the relevant surface immunogens.
Notable progress has been made towards a vaccine for foot and
mouth disease virus (FMDV) using this approach, where one of the
capsid proteins (VP1) produced in E. coli has been shown to elicit
neutralizing antibody (Kleid et al., 1981). Genetically engineered
vaccines for other viruses such as hepatitis B and rabies virus are also
being considered.
Although none of these initial examples of the expression of
proteins from recombinant organisms is as yet established as a
biotechnological process, the way in which the expression of the
recombinant DNA was achieved forms a general paradigm for all future
studies. However, at the same time, it is clear that not all the rules
governing the expression of cloned genes have been elaborated and
those rules that do exist are still largely empirical. In this article
the ways in which expression has been achieved are reviewed, some
of the problems discussed and some of the probable future systems
considered.
`
II Gene expression in E. coli

E. coli has been used as the host cell for expression of foreign genes
mainly because more is known about the control of gene expression
in this organism than in any other. It is well established, for example,
that the genes involved in a particular metabolic activity tend to be
clustered in transcriptional units (operons) with the major control
regions (the operator and promoter) located at the beginning of the
cluster (for a detailed description of bacterial gene expression, see
Miller and Reznikoff, 1980). The operon is transcribed into a
polycistronic mRNA from which the polypeptides are then translated.
Transcriptional control is exerted over the expression of an operon
and varies depending on the function of the genes in the operon
(see Miller and Reznikoff, 1980). Since relatively few promoter
systems are currently being utilized to express cloned genes, the
essential elements of their control mechanisms will be dealt with
when considering each system. Expression of a cloned gene requires
efficient and specific transcription of the DNA, translation of the
mRNA and in some cases post-translational modification of the
resulting protein.
A Transcription
`
The first step in the initiation of transcription in E. coli is the binding
of RNA polymerase to a promoter sequence in the DNA. Analysis of
the DNA sequence of many promoters in E. coli has revealed two
regions of homology located about 35 base pairs (bp) upstream
from the transcription initiation site (the -35 region) and about
10 bp upstream (the -10 region or Pribnow-Schaller box). The
conserved sequences in the -35 and -10 regions (TTGACA and
TATAAT respectively; Rosenberg and Court, 1979; Siebenlist et al.,
1980) probably represent those bases most intimately involved in
polymerase binding and orientation via sigma factor, so that RNA
chain initiation can take place just downstream.
Transcription termination is also controlled by signals in the DNA
sequence, characteristically a GC-rich region having a two-fold
symmetry before the termination site, followed by an AT-rich
sequence at the site of termination (Rosenberg and Court, 1979).
Several protein factors are also involved in the control of
termination, most notably the rho factor. Anti-termination proteins such
as the N gene product of phage λ can also be involved in specialised
systems (Greenblatt et al., 1981).
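
The promoter architecture described above can be illustrated with a
short sketch that scans a DNA strand for a -35 hexamer followed by a
-10 hexamer. This is only an illustration under stated assumptions:
the consensus hexamers come from the text, but the function name
find_promoters, the allowed number of mismatches and the 15-19 bp
spacer window between the two hexamers are assumptions made here and
are not taken from the article.

```python
# Illustrative sketch only: the mismatch tolerance and spacer window
# are assumptions, not values given in the article.
MINUS_35 = "TTGACA"   # consensus -35 hexamer quoted in the text
MINUS_10 = "TATAAT"   # consensus -10 (Pribnow-Schaller) hexamer


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=2, spacer_range=(15, 19)):
    """Return (pos_-35, pos_-10, total_mismatches) for candidate promoters.

    Positions are 0-based indices of the first base of each hexamer.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = hamming(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            if j + 6 > len(seq):
                break
            m10 = hamming(seq[j:j + 6], MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, j, m35 + m10))
    return hits


# A made-up test sequence with perfect hexamers 17 bp apart.
example = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GGGGGG"
print(find_promoters(example, max_mismatch=0))
```

A scan of this kind only flags sequence similarity; whether a
candidate actually functions as a promoter has to be established
experimentally.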
`
B Translation

Efficient translation of mRNA in prokaryotic cells requires the
presence of a ribosome binding site (rbs). For most E. coli mRNAs
the rbs consists of two components, the initiation codon AUG and,
lying 3-12 bases upstream, a sequence of 3-9 bases called the
Shine-Dalgarno (SD) sequence, complementary to the 3' end of the
16S rRNA (Shine and Dalgarno, 1975). It is believed that
hybridization to this region is involved in the attachment of the ribosomal
30S subunit to the mRNA (Steitz, 1979). The SD sequence is not
identical in all mRNAs but a semi-conserved consensus sequence has
been identified, just as for promoter sequences. It is possible that
differences in SD sequences form part of a translational control
system. In addition, ribosome binding is probably modulated by the
secondary structure at the 5' end of the RNA, since more efficient
translation occurs if the AUG and SD sequence are freely accessible
to 30S ribosomal subunits (Iserentant and Fiers, 1980). Termination
of translation usually occurs whenever one of the three stop codons
is encountered in the mRNA by a ribosome complex, provided that
an aminoacylated suppressor tRNA is not present.
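
The two-component rbs described above (an SD-like stretch of 3-9
bases lying 3-12 bases upstream of an AUG) lends itself to a small
search routine. The sketch below is a rough illustration, not a
validated predictor. It assumes that the full complement of the 16S
rRNA 3' end is UAAGGAGGU, that the "gap" is counted as the number of
bases between the end of the SD match and the A of the AUG, and that
a longer match is preferred; the function names are hypothetical.

```python
# Assumption: 9-base sequence complementary to the 3' end of 16S rRNA.
SD_FULL = "UAAGGAGGU"


def sd_substrings(min_len=3):
    """All substrings of SD_FULL of at least min_len bases, longest first."""
    subs = {SD_FULL[i:j]
            for i in range(len(SD_FULL))
            for j in range(i + min_len, len(SD_FULL) + 1)}
    return sorted(subs, key=len, reverse=True)


def find_rbs(mrna, min_gap=3, max_gap=12):
    """Return (aug_position, sd_match, gap) for each AUG with an SD upstream."""
    mrna = mrna.upper().replace("T", "U")
    subs = sd_substrings()
    sites = []
    start = mrna.find("AUG")
    while start != -1:
        best = None
        for sd in subs:                      # try longest matches first
            for gap in range(min_gap, max_gap + 1):
                end = start - gap            # index just past the SD match
                begin = end - len(sd)
                if begin >= 0 and mrna[begin:end] == sd:
                    best = (sd, gap)
                    break
            if best:
                break
        if best:
            sites.append((start, best[0], best[1]))
        start = mrna.find("AUG", start + 1)
    return sites


# Made-up mRNA fragment: AGGAGG sits 6 bases upstream of the AUG.
print(find_rbs("CCCAGGAGGCCCCCCAUGGCU"))
```

The routine ignores secondary structure, which the text identifies
as a further influence on ribosome binding.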
`
`C Post-translational modification
`
There are a variety of modifications that bacterial proteins can
undergo following translation. The formyl group on the NH2-terminal
methionine is hydrolysed and one or more NH2-terminal residues
may be removed. Many secreted proteins are synthesized as large
precursors with additional hydrophobic NH2-terminal signal
sequences that are cleaved off by a membrane bound enzyme (for
review, see Davis and Tai, 1980). However, glycosylation and
phosphorylation, which are common modifications of proteins in
eukaryotic cells, do not occur to any great extent in E. coli.
`
`
III Problems encountered in the expression of
eukaryotic DNA in E. coli
`
Successful expression of a eukaryotic gene in E. coli requires that the
cellular machinery is organised so that the level of expression of
the cloned gene is as good as or better than that of the resident genes.
Probably the most important difference between eukaryotic genes (at least
from higher organisms) and prokaryotic genes is the presence of
intervening sequences (introns) which interrupt the coding sequences.
Normally these sequences are spliced out of the RNA
transcript, producing cytoplasmic mRNA suitable for translation.
There are no introns in prokaryotic genes and consequently no
splicing enzymes present, so in general genomic DNA cannot be used
as a source of genes for expression in bacterial cells. A second
problem is that transcriptional signals in eukaryotes are different
from those in prokaryotes (Corden et al., 1980; Breathnach and
Chambon, 1981) and are not usually recognised by bacterial RNA
polymerase. This difference again emphasizes the fact that eukaryotic
genomic DNA is not a suitable gene source for construction of
expression vectors. Thirdly, the structure of eukaryotic mRNA is
different from that of bacterial mRNA. Eukaryotic mRNA is polyadenylated
at the 3' end and normally capped at the 5' end, features which
may affect mRNA stability and ribosome binding (Breathnach and
Chambon, 1981). Furthermore, eukaryotic mRNA does not seem to
have an equivalent of the SD sequence present in prokaryotic mRNA
(Kozak, 1981).
An additional problem is that of codon usage. The codons used
in mRNA coding for highly expressed prokaryotic genes are not
random; there is a marked preference for particular codons for
some amino acids (Grantham et al., 1981; see Grosjean and Fiers,
1982). This preference appears to correlate with the abundance of
different tRNA species (Ikemura, 1981). As codon selection
preferences are different for eukaryotic genes, it is possible that the
levels of certain tRNAs will affect translational efficiency of these
genes in a prokaryotic system. Finally, it is known that many
eukaryotic proteins are subject to a number of post-translational
modifications which may affect either activity or stability. Most of
these modifications do not occur in prokaryotes.
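
The codon preference discussed above can be made concrete by
tabulating, for each amino acid, the fraction of times each
synonymous codon is used in a coding sequence. The following is a
minimal sketch assuming the standard genetic code and an in-frame
coding sequence; the function name codon_usage and the example
sequence are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(cds):
    """Relative usage of synonymous codons in an in-frame coding sequence.

    Returns {amino_acid: {codon: fraction}}. Trailing bases that do not
    complete a codon are ignored; only A, C, G, T (or U) are expected.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    by_aa = defaultdict(dict)
    for codon, n in counts.items():
        by_aa[CODON_TABLE[codon]][codon] = n
    return {aa: {codon: n / sum(d.values()) for codon, n in d.items()}
            for aa, d in by_aa.items()}


# Invented example: three CTG leucine codons and one TTA leucine codon.
print(codon_usage("CTGCTGCTGTTA"))
```

Comparing such tables for a eukaryotic cDNA and for highly expressed
E. coli genes is one way to see where the two preferences diverge.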
A number of strategies have been developed to try to overcome
these difficulties (Table 1). Once the amino acid sequence of a
`
`
Table 1 General strategies for the expression of cloned genes in E. coli.

Control level | Strategy

Gene | Synthesise DNA in vitro by chemical methods, with optimised codon assignments, or obtain cDNA clone to specific mRNA. Chemical DNA synthesis probably required for tailoring genes into expression vector.

Transcription (initiation and termination) | Clone gene adjacent to strong E. coli promoter which is controllable so that transcription can be induced (derepressed) when required. Use a multicopy plasmid to increase gene dosage. Include termination signal after gene to prevent transcriptional read-through.

Translation (initiation) | Fuse gene in correct translational reading frame to an E. coli gene already in the vector, so that normal rbs is maintained. Possible to use both long and short NH2-terminal fusions. Alternatively, place new gene with its own AUG adjacent to an rbs. The sequence of the SD sequence and distance from the initiating AUG may modulate translation. Accessibility (secondary structure) around SD-AUG may be important. Codon usage problems can be overcome by using chemically synthesized genes. Not clear if codon bias actually affects the translation of cloned genes. Include stop codon(s) in chemically synthesised genes.

Protein (secretion and stability) | Use signal sequences to control secretion? Synthesis of precursor proteins followed by their processing ensures removal of NH2-terminal initiating methionine. Factors affecting folding of foreign proteins and their degradation in E. coli are not well defined. Synthesis of long fusion proteins may result in increased stability.
`
protein is known it is now a relatively straightforward task to design
and synthesize chemically a DNA sequence that will code for the
protein without the problem of intervening sequences and with
optimized codon assignments. A gene of 514 bp coding for leukocyte
Le (α) interferon, a protein of 166 amino acids, is the longest DNA
sequence that has been synthesized so far (Edge et al., 1981).
Although there is no theoretical limit to the size of gene that can be
synthesized, practical problems arise for much larger proteins. If the
gene is too big for a chemical synthesis, then double stranded DNA
copies of mRNA populations can be generated, cloned into a plasmid
vector and the clone containing the sequence coding for the protein
of interest selected from the clone bank by hybridization techniques.
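
Designing a synthetic gene from a known amino acid sequence, as
described above, amounts to back-translation with one chosen codon
per amino acid plus start and stop signals. The sketch below shows
the idea under explicit assumptions: the PREFERRED table is an
illustrative choice of codons made here (it is not a measured usage
table and is not taken from the article), the function name
design_gene is hypothetical, and the 14-residue somatostatin sequence
is used only as a convenient short example. As a check on scale, a
166-residue protein needs 166 x 3 = 498 bp of coding sequence, so a
514 bp gene would leave 16 bp for start, stop and flanking sequence.

```python
# Illustrative one-codon-per-residue table (an assumption, not data
# from the article). Each codon does encode the stated amino acid.
PREFERRED = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def design_gene(peptide, add_start=True, stops=("TGA", "TAG")):
    """Back-translate a peptide (one-letter code) into a DNA sequence.

    A start codon is prepended and one or more stop codons appended,
    since a synthetic gene carries no natural termination signal.
    """
    codons = ["ATG"] if add_start else []
    codons += [PREFERRED[aa] for aa in peptide.upper()]
    codons += list(stops)
    return "".join(codons)


somatostatin = "AGCKNFFWKTFTSC"          # 14 residues
gene = design_gene(somatostatin)
print(len(gene), gene)
```

Restriction-site cohesive ends and any vector-specific linkers would
have to be added separately; they are outside the scope of this
sketch.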
Transcription of these genes is controlled by inserting the DNA
adjacent to a strong prokaryotic promoter in a cloning vector. Four
promoters have been used most widely for this purpose: the lac
promoter from the E. coli lac operon; the trp promoter from the
E. coli trp operon; the strong leftward promoter of phage λ (PL); and
the constitutive and weaker β-lactamase promoter present in the
plasmid vector pBR322. The expression vectors themselves are
usually derived from high copy number plasmids so that there is
increased expression owing to gene dosage (Gelfand et al., 1978;
O'Farrell et al., 1978). Termination of transcription can be ensured
by placing a termination site after the cloned gene (e.g. Nakamura
and Inouye, 1982), although whether this is necessary for the
maintenance of high levels of transcription is not yet clear. The
consequences of uninterrupted transcription around a small circular
plasmid DNA molecule are unknown. It is presumably detrimental,
since most expression vectors have other genes present (e.g. an
antibiotic resistance gene) which are transcribed in the opposite
direction from a different promoter, and it is known that the
transcription of genes in λ phage carrying the trp promoter is adversely
affected if the trp promoter is in an orientation where transcription
occurs towards transcripts arising from the PL promoter (Hopkins
et al., 1976).
Translational barriers have been overcome to some extent by two
procedures. The foreign gene is either fused (in the correct
translational reading frame) to a prokaryotic gene so that the existing rbs
is used to initiate translation, or the new gene, with its own initiation
codon, is placed adjacent to a naturally occurring E. coli rbs (Backman
et al., 1980) or a synthetic one (Jay et al., 1981). Since all structural
genes, whether eukaryotic or prokaryotic, end with one or more of
the three termination codons, it is not usually necessary to make
special arrangements for translational termination when using a
cloned cDNA sequence. However, a termination codon must be
included when synthetic DNA is used.
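
The two bookkeeping checks implied by this paragraph, that a fused
gene stays in the reading frame of the upstream E. coli gene and that
a synthetic coding sequence ends in a stop codon, are simple
arithmetic. A minimal sketch follows; the helper names and the
example lengths are hypothetical and chosen only for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}   # the three termination codons


def padding_needed(upstream_bases):
    """Bases to add at a fusion junction so the next codon starts in frame.

    upstream_bases is the number of coding bases (counted from the A of
    the upstream ATG) that precede the first base of the inserted gene.
    """
    return (-upstream_bases) % 3


def ends_with_stop(cds):
    """True if an in-frame coding sequence finishes with a stop codon."""
    cds = cds.upper().replace("U", "T")
    return len(cds) % 3 == 0 and cds[-3:] in STOP_CODONS


# Hypothetical junctions: 24 upstream bases are already in frame,
# whereas 25 upstream bases would need 2 more to restore the frame.
print(padding_needed(24), padding_needed(25))
print(ends_with_stop("ATGGCTTGA"), ends_with_stop("ATGGCT"))
```

In practice the frame is adjusted by the choice of restriction site
or linker at the junction, which is why only one orientation or
linker length gives a productive fusion.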
Protein modification and stability are much less easy to control,
largely because the structural features governing protein stability in
E. coli are not well understood. It has been shown that eukaryotic
signal sequences are recognised by E. coli and that NH2-terminal
fusions of eukaryotic polypeptides to E. coli signal sequences result
in secretion of the protein to the periplasmic space, with concomitant
cleavage of the signal sequence (Talmadge et al., 1980; 1981). There
is also some evidence that short "foreign" polypeptides are unstable
in E. coli (Itakura et al., 1977; Goeddel et al., 1979a). This has been
overcome by fusing the peptide to a larger E. coli protein from which
the peptide is then cleaved.
`
IV Expression of DNA from lower eukaryotes

Following the observation that DNA from S. aureus could be
expressed in E. coli (Chang and Cohen, 1974) it was shown that
eukaryotic DNA could also be transcribed (Morrow et al., 1974;
Chang et al., 1975; Kedes et al., 1975). It was not clear from these
experiments, however, whether the normal transcriptional start and
stop signals were being recognised. The fundamental question of
whether a fungal gene could be transcribed and translated to produce
a functional protein in E. coli was answered to some extent by the
finding that fragments of yeast DNA cloned into phage λ, or the
plasmid vector Col E1, could complement auxotrophic mutants of
E. coli (e.g. His B and Leu B) (Struhl et al., 1976; Ratzkin and
Carbon, 1977; Struhl and Davis, 1977). Similarly, segments of
Neurospora crassa DNA containing the gene for dehydroquinase were
successfully expressed in E. coli in a pBR322 replicon (Vapnek et al.,
1977). Several other yeast genes have now been expressed in this way
(e.g. Trp 1, Trp 5 and Arg 4). The functional expression of yeast
DNA in E. coli not only demonstrated that eukaryotic DNA could be
transcribed and translated, paving the way for the experiments
described below, but also provided a powerful method for isolating
yeast genes. Some of these genes have subsequently been used to
provide selection markers in yeast-E. coli shuttle vectors (Beggs,
1982; Hinnen and Meyhack, 1982).
`
V The lac promoter

The lac operon is subject to two types of control. In the absence of
lactose (or other inducer) the operon is kept switched off by lac
repressor (the lac i gene product) binding to the operator. Positive
regulation is also exerted through the catabolite gene activator
protein (CAP). In the absence of glucose, CAP forms a complex with
cyclic AMP and this complex stimulates transcription by binding
next to the promoter. The operon is derepressed by the presence of
lactose, or by the addition of the non-metabolizable inducer IPTG
(isopropylthiogalactoside) which binds to the repressor and removes
it from the operator.
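
The two layers of lac control described above can be summarised as a
small truth table. The sketch below is a deliberately coarse,
qualitative model: the function name and the three output labels
("off", "low", "high") are assumptions introduced here for
illustration, and real expression levels vary continuously.

```python
def lac_transcription(inducer_present, glucose_present):
    """Qualitative state of lac operon transcription.

    inducer_present: lactose or IPTG is available to bind the repressor.
    glucose_present: glucose is available, so the CAP-cyclic AMP
                     complex is not formed.
    """
    repressor_on_operator = not inducer_present
    cap_camp_bound = not glucose_present
    if repressor_on_operator:
        return "off"          # negative control dominates
    if cap_camp_bound:
        return "high"         # derepressed and stimulated by CAP-cAMP
    return "low"              # derepressed but not stimulated


for inducer in (False, True):
    for glucose in (False, True):
        print(inducer, glucose, lac_transcription(inducer, glucose))
```

The model treats a single chromosomal operon; it does not capture
effects that arise when the operator is present in many copies on a
plasmid.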
`
Plasmid vectors containing parts of the lac operon have been
constructed by several workers. Polisky et al. (1976) cloned an
EcoRI fragment from λ plac5 DNA (a transducing phage containing
part of the lac operon) into a Col E1-derived plasmid to obtain
a vector with the lac promoter and operator and most of the
β-galactosidase gene. Plasmids containing a small "portable" lac
promoter fragment have also been made. In these constructions a
203 bp HaeIII fragment of lac transducing phage DNA, containing the
lac promoter and operator and first eight codons of β-galactosidase,
was blunt end ligated into EcoRI-cut and "filled in" pBR322 DNA.
The portability derives from the fact that EcoRI sites are reformed
at the junctions, allowing the promoter fragment to be removed by
EcoRI digestion (Backman and Ptashne, 1976; Itakura et al., 1977).
Colonies harbouring plasmids which carried the lac promoter-operator
were identified by their constitutive synthesis of β-galactosidase,
rendering them blue on agar plates containing X-gal
(5-bromo-4-chloro-3-indolyl-β-D-galactoside). This is because multiple
copies of the operator titrate out all the lac repressor, resulting in
derepression of the chromosomal β-galactosidase gene. Both λ plac5 DNA and
λ h80 lac UV5 cI857 DNA, which contains the CAP site mutation
L8 and the "up" promoter mutation UV5 (making the promoter
insensitive to catabolite repression), have been used as a source of lac
DNA for these constructions (Backman et al., 1976; Itakura et al.,
1977; see also Fuller, 1982). Further derivative plasmids containing a
95 bp AluI fragment of lac DNA, including the UV5 promoter (minus
the CAP binding site), the repressor binding site and most of the rbs,
just excluding the ATG of β-galactosidase, have also been
constructed for the expression of non-fusion proteins (Fuller, 1982).
`
`A The somatostatin experiment
`
The first report of the designed expression of a eukaryotic gene in
E. coli was the production of the small peptide hormone
somatostatin (Itakura et al., 1977). Somatostatin was used as a model
system because the hormone was a small polypeptide of known
amino acid sequence for which sensitive radioimmune and biological
assays existed. The experiments illustrate a number of features of
methods which are now used to obtain expression of cloned genes.
They also demonstrated, although not for the first time, that
chemically synthesized DNA was functional in a biological system.
In addition, the production of the protein as a fusion polypeptide
and its subsequent cleavage into the native hormone at methionine
residues by cyanogen bromide (CNBr) has been used quite extensively
for other proteins. This overall strategy is depicted in Fig. 1.
`
[Figure 1 panels: E. coli lac operon DNA; genetic code and chemical DNA synthesis giving the somatostatin gene; pBR322 plasmid DNA; in vivo synthesis of the β-gal-somatostatin fusion; in vitro cyanogen bromide cleavage giving β-gal fragments plus active somatostatin.]

Figure 1 Strategy for the expression of the chemically synthesized
somatostatin gene as a β-galactosidase fusion from the lac promoter.
The active hormone can be cleaved from the hybrid protein by CNBr
treatment. (Reproduced from Itakura et al., 1977, copyright by the
American Association for the Advancement of Science, with permission.)
`
In the first set of experiments the chemically synthesized
somatostatin gene with synthetic EcoRI and BamHI cohesive ends was
cloned into a vector containing the wild type HaeIII lac promoter
fragment. The DNA sequence indicated that the plasmid should have
produced a polypeptide containing the first seven amino acids of
β-galactosidase fused to somatostatin. However, no somatostatin was
detected in bacterial extracts by radioimmunoassay. As it was found
that somatostatin was not stable when added to E. coli extracts, the
failure to find somatostatin was thought to be due to proteolytic
digestion (Itakura et al., 1977). The approach adopted to try to
stabilise the somatostatin was to produce it as part of a longer
polypeptide from which it could be cleaved by CNBr. This was done
by linking the somatostatin gene to the EcoRI fragment of λ plac5
DNA which carries the lac promoter and a large proportion of the
β-galactosidase gene (Polisky et al., 1976). The translation reading
frame of β-galactosidase was maintained in somatostatin after fusion
at the EcoRI junction. In these constructions only one orientation
of the EcoRI lac fragment maintained the correct reading frame in
somatostatin and when several independent clones were examined,
about half produced detectable somatostatin after CNBr cleavage.
No immunoreactive protein was detected before cleavage since the
antiserum used in the assay required a free NH2-terminal alanine
residue (Itakura et al., 1977).
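
The release of somatostatin from the fusion protein relies on CNBr
cutting the polypeptide chain on the carboxyl side of each
methionine. The sketch below mimics that cleavage on one-letter
protein strings. It is an illustration only: the carrier sequence
used in the example is a short hypothetical stand-in for the
β-galactosidase portion, the function names are invented here, and
the chemical conversion of the methionine residue during cleavage is
ignored.

```python
def cnbr_fragments(protein):
    """Split a protein (one-letter code) after every methionine (M)."""
    fragments, current = [], ""
    for residue in protein.upper():
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


def releasable_by_cnbr(peptide):
    """A peptide survives CNBr release intact only if it has no internal Met."""
    return "M" not in peptide.upper()


somatostatin = "AGCKNFFWKTFTSC"
carrier = "MTMITDSL"                 # hypothetical stand-in for the carrier
fusion = carrier + "M" + somatostatin  # Met placed directly before the hormone

print(cnbr_fragments(fusion))
print(releasable_by_cnbr(somatostatin))
```

The last fragment begins with alanine, which is consistent with the
observation above that the antiserum detected the hormone only after
cleavage had exposed a free NH2-terminal alanine.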
`
`
B Expression of insulin in E. coli

The somatostatin work established the feasibility of the synthetic
gene fusion approach for the expression of small polypeptides in
E. coli. It was possible to follow an almost identical strategy to
obtain expression of human insulin, as neither the 20 amino acid
A chain nor the 30 amino acid B chain of insulin contained
methionine and methods were available for the in vitro joining of the
two chains. Thus, an A chain gene and a B chain gene were
chemically synthesized, each with BamHI and EcoRI cohesive ends (Crea
et al., 1978), and cloned separately into pBR322. The B chain gene
was synthesized with a HindIII site in the middle so that the two
halves could be cloned separately and the sequence verified (Goeddel
et al., 1979a). Expression was achieved by transcription from the
same lac promoter as used for the successful somatostatin
constructions and insulin A- or B-chain β-galactosidase fusion proteins were
produced (Goeddel et al., 1979a). The hybrid proteins represented
about 20% of total cell protein, which was about ten-fold higher than
the level of expression obtained with somatostatin. The hybrid
proteins were insoluble and were found in the first low speed pellet
after breaking the cells with a French press.
To obtain A and B peptides suitable for reconstitution into
native insulin, the hybrid proteins had to be solubilised, the
β-galactosidase portion removed and the peptides S-sulphonated. This
was achieved by dissolving the hybrid proteins in 6M guanidinium
chloride followed by dialysis. The precipitate was dissolved in 70%
formic acid, the protein cleaved with CNBr and S-sulphonated
derivatives of the peptide mixture obtained, using sodium dithionate
and sodium sulphite at pH 9. Insulin activity was readily detected
by radioimmunoassay after reconstitution. Further studies on the
peptides (e.g. chromatographic behaviour) and amino acid
compositions established, without doubt, that the bacteria were
producing authentic insulin A and B chains (Goeddel et al., 1979a).
Insulin, prepared from bacteria containing these constructions by a
scaled up and modified process, has now been shown to be active
when injected into human volunteers (Clark et al., 1982) and to
interact with insulin receptors in the same way as native human
insulin (Keefer et al., 1981).
An alternative approach involves the synthesis of a gene coding for
proinsulin, the natural precursor to insulin. Proinsulin is synthesized
initially as a preproinsulin molecule consisting of an NH2-terminal
signal sequence, followed by the B chain, a linking C peptide and the
COOH-terminal A chain. Enzymatic removal of the signal peptide
during secretion generates proinsulin, and processing at two trypsin
sensitive sites (Arg-Arg, Lys-Arg) allows the removal of the C peptide
and the generation of active insulin. The three dimensional structure
of insulin indicates that a peptide much shorter than the 35 amino
acid connecting C peptide should be sufficient to connect the B and
A chains and still allow proper folding of the modified proinsulin.
Genes coding for human proinsulin and "mini C" derivatives of
proinsulin, where the C peptide is replaced by a six amino acid linker
retaining the proteolytic cleavage sites, have been constructed by
chemical synthesis (Sung et al., 1979; Wetzel et al., 1981a; Brousseau
et al., 1982).
The mini C construction was cloned for expression as a
β-galactosidase fusion protein (Wetzel et al., 1981a) and a product with
a proinsulin-like structure (as determined by radioimmunoassay and
HPLC) was detected after CNBr cleavage and S-sulphonation. The
usefulness of this route to insulin production is still not clear,
however, as there are no data on the behaviour of mini C derivatives
in enzymatic proinsulin processing systems and there are already
preproinsulin expression constructions available derived from cDNA
(see β-lactamase section). However, the modular approach to the
chemical synthesis of proinsulin adopted by Brousseau et al. (1982)
does have the advantage that the shortening and changing of parts
of the C peptide or alteration of the codons can be approached
rationally by the incorporation of different oligonucleotide blocks
during synthesis, obviating the need to synthesise an entire coding
sequence each time a specific modification is made.
`
C Synthesis of other hormones as β-galactosidase fusions

The strategy of using the lac promoter/operator and β-galactosidase
NH2-terminal fusions has been adopted for several other proteins
including other hormones (see Table 2). For example, the
neuropeptide β-endorphin, a 30 amino acid endogenous opioid, has been
expressed in this way (Shine et al., 1980). In these experiments a
cDNA clone to the precursor peptide of mouse corticotropin (ACTH)
and β-lipotropin (LPH) was used as a source of cDNA coding for
`
[Figure: nucleotide and amino acid sequence diagram labelled β-endorphin.]