`
`Synthetic DNA and Medicine
`
`ARTHUR D. RIGGS1 AND KEIICHI ITAKURA
`
Synthetic DNA chemistry is no longer an esoteric discipline without obvious practical
applications. On the contrary, the combination of synthetic DNA chemistry, recombinant
DNA techniques, and molecular cloning has already resulted in useful products,
somatostatin [1] and insulin [2, 3], and promises much more. In this review, we will
first discuss the results and methods of our recent work on insulin [3] and mutation
correction [4] and then follow with speculation on additional potential applications.
`
`THE INSULIN PROJECT
`Bacterial production of human insulin
Figure 1 illustrates the overall scheme that we used [3]. The insulin chains (21 amino
acid A chain and 30 amino acid B chain) are made in separate bacterial strains as tails
on a rather large precursor protein, the enzyme β-galactosidase. The insulin chains are
efficiently clipped from the precursor protein by treatment with cyanogen bromide, a
methionine-specific cleavage reagent. Because synthetic DNA was used to make the
insulin genes, we were able to arrange that the insulin tails are attached to
β-galactosidase by a methionine linkage (see next section).
The yields of the separate chains are extremely good; about 20% of the total bacterial
protein is made as the insulin-β-galactosidase precursor protein [3], and even higher
yields should be obtained soon. Almost the entire protein-synthesizing machinery of
the bacterial cell can be turned to the production of the desired peptide product.
Milligrams of the insulin chains are made per liter of bacterial culture. The individual
chains can be joined in good yields (up to 80%) by air oxidation [5] and active insulin
obtained, as ascertained by chemical, radioimmune, and biological activity [3, and
unpublished data]. With our present results, the commercial production of human
insulin by bacteria seems practical, and two firms, Genentech, Inc. (South San Francisco,
Calif.), and Eli Lilly, Inc. (Indianapolis, Ind.), are trying to develop procedures
for large-scale production.
`The techniques we used are quite general; thus we are confident that bacteria can be
`engineered to produce any peptide hormone that does not contain methionine. By using
`
`Received April 25, 1979.
1 Both authors: Division of Biology, City of Hope National Medical Center, Duarte, CA 91010.
`© 1979 by the American Society of Human Genetics. 0002-9297/79/3105-0014$01.00
`
`
`Sanofi/Regeneron Ex. 1 003, pg 52
`
`Merck Ex. 1003, pg 52
`
`
`
`
[Figure 1 diagram: two E. coli strains, each carrying plasmid DNA with the insulin A chain or B chain gene joined to the β-galactosidase (β-Gal) gene. Processing steps: 1. partial purification; 2. cleavage with CNBr; 3. purification of insulin chains. The insulin A chain and insulin B chain are then combined by air oxidation to give active insulin with S-S bridges.]
FIG. 1. Schematic overview of strains and procedures for production of human insulin by bacteria. Two
E. coli strains were constructed having chemically synthesized insulin A or B chain genes inserted into the
β-galactosidase gene of a plasmid cloning vector. In vivo, a fused protein is made, mostly β-galactosidase
but with an insulin tail joined by a methionine. In vitro, the insulin peptide chain is clipped off by treatment with
cyanogen bromide. After separate purification, insulin A and B chains are joined by air oxidation.
`
`other cleavage tricks, or accepting lower yields, even peptides that contain methionine
`can probably be made.
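
The cleavage logic described above can be sketched in a few lines of Python. This is an illustration only, not the procedure of reference [3]: the function names `cnbr_fragments` and `releasable_by_cnbr` are hypothetical, the short precursor stub is invented, and the chemistry is reduced to "cut after every methionine" (the chemical modification of the methionine residue itself is ignored). The human insulin A and B chains are written in one-letter amino acid code.

```python
# Simplified model of cyanogen bromide (CNBr) cleavage: the peptide bond on the
# carboxyl side of every methionine (M) is cut. Side reactions are ignored.

INSULIN_A = "GIVEQCCTSICSLYQLENYCN"            # human insulin A chain, 21 residues
INSULIN_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"   # human insulin B chain, 30 residues


def cnbr_fragments(protein: str) -> list[str]:
    """Split a one-letter protein sequence after each methionine."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


def releasable_by_cnbr(peptide: str) -> bool:
    """A tail peptide is released intact only if it contains no methionine."""
    return "M" not in peptide


# Invented stand-in for the large precursor protein, ending in the Met linker.
precursor_stub = "STMKDLEFM"
fusion = precursor_stub + INSULIN_A

fragments = cnbr_fragments(fusion)
print(fragments[-1] == INSULIN_A)   # the last fragment is the insulin tail
print(releasable_by_cnbr(INSULIN_A), releasable_by_cnbr(INSULIN_B))
```

In this toy model the precursor is chopped at each of its own methionines, while the methionine-free tail comes off in one piece, which is why the approach is limited to peptides lacking methionine unless other cleavage methods are used.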
`With chemical DNA synthesis, it is not necessary to copy the natural nucleotide
`sequence of a gene, because, given the sequence of amino acids in the desired peptide
`product, one can use the genetic code to design an "artificial" gene carrying the
necessary information. This is the approach that we used first for somatostatin [1] and
`
`
`then for insulin [2]. Techniques have developed rapidly, so that the genes necessary for
`altering the bacteria can be made and inserted with relatively modest expenditures of
`time and money. In the next section, we will describe in more detail how genes can be
`made and inserted into bacteria.
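
As a rough illustration of designing an "artificial" gene from an amino acid sequence, the sketch below back-translates the insulin A chain using one codon per amino acid. The codon choices in `PREFERRED_CODON`, the leading ATG (standing in for the methionine linkage), the TAA stop codon, and the function names are all assumptions made for this example; they are not the sequences actually synthesized in references [2, 3].

```python
# One arbitrarily chosen codon per amino acid (assumption for this sketch; any
# synonymous codon from the genetic code would carry the same information).
PREFERRED_CODON = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}
STOP_CODONS = ("TAA", "TAG", "TGA")


def design_gene(peptide: str, stop: str = "TAA") -> str:
    """Return a coding sequence: Met codon, one codon per residue, stop codon."""
    return "ATG" + "".join(PREFERRED_CODON[aa] for aa in peptide) + stop


def translate(dna: str) -> str:
    """Translate a sequence built from PREFERRED_CODON back to protein ('*' = stop)."""
    codon_to_aa = {codon: aa for aa, codon in PREFERRED_CODON.items()}
    codon_to_aa.update({stop: "*" for stop in STOP_CODONS})
    return "".join(codon_to_aa[dna[i:i + 3]] for i in range(0, len(dna), 3))


insulin_a = "GIVEQCCTSICSLYQLENYCN"   # human insulin A chain, 21 residues
gene = design_gene(insulin_a)
print(len(gene))                      # 3 * (1 + 21 + 1) nucleotides
print(translate(gene))
```

Because the code is degenerate, the designer is free to pick among synonymous codons, for example to avoid unwanted restriction sites or to reuse trimer blocks.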
`
`Chemical DNA synthesis
`Although there are alternative methods [6], the fastest way to make DNA is by the
`phosphotriester method [2, 7] illustrated in figure 2. We will not go into details beyond
`those given in the legend of figure 2, but recent improvements in the method (such as
`the rapid synthesis of trimers, together with the extensive use of high performance
`liquid chromatography for analysis and purification of the oligodeoxyribonucleotides)
`have dramatically reduced the time necessary for the construction of DNA fragments
`[2]. A library of trimers has been established, and longer oligonucleotides can be
`assembled quickly from the trimer units (which correspond to amino acid codons).
`To make the insulin A and B chain genes, it was necessary to make 29
`oligonucleotides which were assembled and joined by ligation to make a total of 181
`base pairs of duplex DNA. Starting from the trimer library, the DNA fragments were
`made in about 3 months. The next stages, cloning and expression, took somewhat
`longer. DNA synthesis no longer is the rate limiting step.
`
`Molecular cloning and expression
Figure 3 illustrates how the insulin A chain gene was assembled, cloned, and
positioned at the end of β-galactosidase. Step 1 (fig. 3) was joining the small (13 base
average) oligonucleotides. Because they were designed to have complementary
overlaps, they assembled themselves and were joined to give duplex DNA by the action
of the T4 DNA ligase. The gene was designed to have restriction enzyme sites at each
end (Eco RI on the left and Bam HI on the right). Step 2 was preparation of the
plasmid DNA cloning vector pBR322. Preparation included treatment with Eco RI and
Bam HI restriction enzymes, which cut out a small piece of the plasmid and provide
a site for insertion of the synthetic A gene. In step 3, the prepared plasmid and
synthetic DNA were mixed and joined by T4 DNA ligase, followed by transformation of
E. coli and molecular cloning. A clone was obtained that contained a correct insulin A
gene, as verified by direct DNA sequencing. Next, a DNA fragment containing most of
the E. coli lac operon, including the lac promoter, operator, and the first 1006 amino
acid codons of β-galactosidase, was inserted (steps 4, 5, and 6; z symbolizes the
β-galactosidase gene). This led to a clone making insulin-β-galactosidase fused
protein.
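
The self-assembly of step 1 depends on each oligonucleotide of one strand bridging the junction between two oligonucleotides of the other strand. The sketch below illustrates that idea with an invented 36-nucleotide sequence flanked by the Eco RI (GAATTC) and Bam HI (GGATCC) recognition sequences. The sequence, the 12-nucleotide block size, the 6-nucleotide stagger, and the function names are assumptions for illustration; the real oligonucleotides are described in references [2, 3].

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def staggered_oligos(top_strand: str, size: int = 12, offset: int = 6):
    """Cut the top strand into blocks, and the bottom strand into blocks shifted
    by `offset`, so that bottom oligos span the junctions between top oligos."""
    top = [top_strand[i:i + size] for i in range(0, len(top_strand), size)]
    bottom = [reverse_complement(top_strand[:offset])]
    bottom += [reverse_complement(top_strand[i:i + size])
               for i in range(offset, len(top_strand), size)]
    return top, bottom


def bridges_junction(top_left: str, top_right: str, bottom: str,
                     min_overlap: int = 4) -> bool:
    """True if `bottom` base-pairs with the end of top_left and the start of top_right."""
    target = reverse_complement(bottom)      # top-strand stretch it pairs with
    pos = (top_left + top_right).find(target)
    return (pos != -1
            and pos <= len(top_left) - min_overlap
            and pos + len(target) >= len(top_left) + min_overlap)


# Invented example: Eco RI site, a short coding stretch, Bam HI site.
top_strand = "GAATTC" + "ATGGGTATCGTTGAACAGTGCTGA" + "GGATCC"
top, bottom = staggered_oligos(top_strand)
for i in range(len(top) - 1):
    print(top[i], top[i + 1], bottom[i + 1],
          bridges_junction(top[i], top[i + 1], bottom[i + 1]))
```

Once every junction is held together by an overlapping partner in this way, ligase only has to seal the nicks to give continuous duplex DNA.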
`
`MUTATION CORRECTION
`
`DNA changes directed by synthetic DNA
`Several techniques have been used recently for site-localized mutagenesis of DNA in
`vitro [4, 8-10]. However, we think that the use of synthetic DNA provides the most
`specific and general approach to making directed changes in DNA [4, 10]. It should be
`possible to repair or create mutations, convert a gene of one species to the same gene of
`
`
[Figure 2 diagram: reaction scheme for the triester synthesis. Fully protected trimers (type 1 with 5'-DMT and 3'-phosphate-OCE groups; type 2 with 5'-DMT and 3'-OAn groups) are selectively deprotected and condensed in cycles of 1) TPSTe coupling, 2) silica gel purification, and 3) Et3N or H+ treatment, giving protected hexamers and then dodecamers. Final treatment with NH4OH and AcOH gives the unprotected DNA fragment with free 5'-OH and 3'-OH ends.
Abbreviations in the scheme: R = -C6H4-Cl (chlorophenyl); B = protected and nonprotected bases; An = -CO-C6H4-OMe; DMT = dimethoxytrityl; TPSTe = coupling reagent (structural formula not recoverable from the scan); CE = β-cyanoethyl.]
`
FIG. 2. Chemical synthesis of oligodeoxyribonucleotides by improved triester method [2, 7]. Starting
with nucleosides, a library of fully protected triester trimers, such as 1 and 2, is made. The type 2 trimer,
with the anisole 3'-protecting group, will become the 3' end of the oligonucleotide. The type 1 trimer is
bifunctional and, depending on treatment (either mild acid or base), will be either the 5'-end component or an
internal sequence component. Because of the chlorophenyl protecting groups attached by ester linkage to the
phosphate groups (forming phosphotriesters), the trimers and intermediate oligonucleotides (e.g., 3, 4, 5, 6,
7) are not water soluble. Therefore, all condensations and purifications are done in nonaqueous solvents such
as chloroform. Trimers can be condensed to yield hexamers (e.g., 3 + 4 yields 6), and hexamers can be
condensed to yield dodecamers (e.g., 6 + 7 yields 8, still in fully protected triester form). The next-to-last
step in a typical synthesis is the removal of all protecting groups by treatment with acetic acid and NH4OH,
generating the desired water-soluble single-stranded DNA fragment. The last step is a careful purification of
the DNA fragment by high performance liquid chromatography.
`
`
`
`
[Figure 3 diagram, steps 1 to 6: oligonucleotides are joined by ligase to give the A gene flanked by RI and Bam sites; the plasmid is cut with RI and Bam; the A gene is ligated in and the product is used for transformation (Trans.), giving an AmpR, TetS strain; an RI fragment carrying Lac PO and z from lambda plac is inserted at the RI site, followed by a second transformation, giving a strain that is AmpR, TetS, and blue on XG.]
FIG. 3. The construction of a plasmid DNA containing a synthetic insulin A chain gene inserted at the
end of a β-galactosidase gene. The procedures are described in the text, and details are given in Goeddel et
al. [3], so only an explanation of the symbols is given here. The symbol A represents the synthetic A chain
gene. pBR322 is a well-characterized plasmid cloning vector containing two antibiotic resistance genes,
ampicillin (Amp) and tetracycline (Tet), and several convenient restriction endonuclease sites including Eco
RI (RI) and Bam HI (Bam). Lambda plac is a lambda transducing phage carrying the entire E. coli lac operon,
which includes the lac promoter (P), the lac operator (O), and the entire β-galactosidase structural gene (Z).
There is an Eco RI endonuclease site to the left of the operon and also one near the end of the β-galactosidase
gene; thus, the lac operon DNA fragment can be readily obtained. The phenotypes of the bacterial strains
successfully transformed with the desired plasmid are shown. For example, the A chain producing strains would
be ampicillin resistant (AmpR) and tetracycline sensitive (TetS), and the colonies would be blue on a special
indicator agar called XG.
`
`another species, make genes for peptide analogues, create restriction sites, etc.
`Although practical applications have not yet been made, the feasibility of the approach
`has been demonstrated [4, 10].
Figure 4 illustrates how we have efficiently corrected a mutation in φX174
bacteriophage DNA, a favorable test system because the mature viral DNA is a single
stranded circle, and the complete DNA sequence is known [11]. We made a synthetic
primer with wild type sequence. This synthetic DNA hybridized with a one-base pair
mismatch to mutant viral DNA and served as a primer for in vitro DNA replication by
E. coli DNA polymerase. Complete heteroduplex circles were made by in vitro DNA
replication, and these were used to infect E. coli. In vivo replication led to homoduplex
progeny, a high percentage of which were converted to wild type [4]. The efficiency of
the directed change is high enough that even sequence changes for which there is no
method of selection can be made. Direct DNA sequencing techniques can be used to
identify the converted molecule.
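
The hybridization step can be pictured with a small sketch: a primer carrying the wild-type sequence anneals to the mutant strand with a single mismatch, and the strand copied from the primer carries the wild-type base. The sequences below are invented (they are not φX174 sequences), and the function names are hypothetical; the model ignores hybridization energetics and simply counts mismatched positions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def best_site(template: str, primer: str) -> tuple[int, int]:
    """Return (mismatch count, position) of the best annealing site.

    The primer pairs antiparallel with the template, so its reverse
    complement is compared with each window of the template strand."""
    target = reverse_complement(primer)
    n = len(target)
    return min((mismatches(template[i:i + n], target), i)
               for i in range(len(template) - n + 1))


def corrected_sequence(template: str, primer: str) -> str:
    """Template-sense sequence specified by the primer-derived strand."""
    _, pos = best_site(template, primer)
    target = reverse_complement(primer)
    return template[:pos] + target + template[pos + len(target):]


# Invented sequences: the mutant differs from wild type at a single base.
wild_type = "CTTGACGGATACCCTGCA"
mutant = "CTTGACGAATACCCTGCA"
primer = reverse_complement(wild_type[2:14])   # 12-nucleotide wild-type primer

print(best_site(mutant, primer))               # one mismatch at the primer site
print(corrected_sequence(mutant, primer) == wild_type)
```

Progeny derived from the primer-containing strand carry the corrected base, while progeny derived from the original viral strand remain mutant.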
`
`
[Figure 4 diagram: a synthetic primer with wild-type (w.t.) sequence is annealed to φX174 DNA with a single mismatch; in vitro DNA synthesis completes the circle; E. coli is infected and DNA replication follows, yielding wild-type progeny (a value of about 50% is marked in the diagram).]
`
FIG. 4. Correction of a mutation in bacteriophage φX174. A chemically synthesized single stranded
DNA fragment with a wild type sequence is specifically hybridized to a φX174 mutant called amber 3 (am
3). The synthetic oligonucleotide serves as a primer for in vitro DNA synthesis. After infection and
replication, a high percentage of the progeny phage are wild type, indicating efficient correction of the
mutation by using synthetic DNA.
`
`
`ISOLATION OF NATURAL GENES
`
`Genes for large proteins and enzymes could be made by chemical DNA synthesis
`(synthesis rate about one nucleotide/day per person). However, in most cases, it will
`probably be preferable to isolate the natural DNA sequence by cloning a reverse
`transcript of the appropriate mRNA. This approach to gene isolation works well if a
`method is available for detecting the desired gene sequence among the ''shotgun''
`library of bacterial or viral clones. Detection of the desired clone usually is difficult,
`except for the most abundant protein (and thus mRNA) species. With present
`methodology, it is very difficult to isolate a specific gene whose mRNA transcript does
`not comprise more than 1% of the total mRNA.
`Synthetic DNA, however, may allow the isolation of the natural gene for any protein
`for which at least partial amino acid sequence is known. Using the genetic code, a short
`(10-20 nucleotide) synthetic DNA fragment can be made that is complementary to the
`mRNA of the desired gene. Degeneracy of the genetic code will require that a mixture
`of synthetic probes be made for most proteins, but this seems feasible. Synthetic probes
`have been used to identify the cytochrome C gene of yeast [12]. We are confident that
with synthetic DNA used as a specific primer for cDNA synthesis and as a specific
`hybridization probe, almost any gene with a known peptide product can be isolated.
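
To give a feel for how large a mixture of probes the degeneracy of the genetic code requires, the sketch below counts the DNA sequences that can encode a short stretch of amino acids and picks the least degenerate window of a protein. The codon counts are those of the standard genetic code; the example peptide and the function names are invented for illustration.

```python
from math import prod

# Number of codons for each amino acid in the standard genetic code.
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_mixture_size(peptide: str) -> int:
    """Number of distinct coding sequences for the given amino acid stretch."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


def least_degenerate_window(protein: str, length: int) -> tuple[int, int, str]:
    """Return (mixture size, start index, window) for the best probe region."""
    windows = [(probe_mixture_size(protein[i:i + length]), i, protein[i:i + length])
               for i in range(len(protein) - length + 1)]
    return min(windows)


# Invented partial amino acid sequence; a 4-residue window gives a 12-nucleotide probe.
partial_sequence = "LSRMWKFAL"
print(least_degenerate_window(partial_sequence, 4))
```

Regions rich in methionine and tryptophan, which have a single codon each, need the smallest mixtures, whereas stretches containing leucine, serine, or arginine are the most costly.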
`
`EXPRESSION OF NATURAL MAMMALIAN GENES IN BACTERIA
`
`There is now clear evidence that mammalian genes do function in bacteria. Bacterial
`clones producing rat insulin, chicken ovalbumin, rat growth hormone, and mouse
`dihydrofolate reductase have been obtained [13-17]. Clearly, there is no fundamental
`barrier to prevent transcription and translation of eukaryotic genes in prokaryotes. The
`key to successful, efficient expression seems to be the presence of appropriate control
`signals and correct positioning of the foreign gene sequence. The new sequence must
`be in phase, near a ribosome-binding site (if the desired product is to be made without a
`precursor peptide), downstream from a good promoter, and contain translation start
`and stop sites. Synthetic DNA will not only aid isolation of the natural gene, but will
`also aid in obtaining expression. The most important function for synthetic DNA may
`be in trimming, lengthening, or changing natural sequences to obtain efficient
`expression of the desired peptide product.
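
The requirement that the new sequence be "in phase" and carry start and stop signals can be checked mechanically. The following sketch is a simplified illustration with invented sequences and a hypothetical function name; it looks only at the reading frame and at start and stop codons, and does not model promoters or ribosome-binding sites.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, discarding any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fusion_problems(upstream: str, insert: str) -> list[str]:
    """List reading-frame problems for an insert fused after an upstream coding region.

    `upstream` is assumed to begin at the start codon of the bacterial gene."""
    problems = []
    if not upstream.startswith("ATG"):
        problems.append("no start codon at the beginning of the upstream gene")
    if len(upstream) % 3 != 0:
        problems.append("insert is out of phase with the upstream reading frame")
    fused = codons(upstream + insert)
    stops = [i for i, codon in enumerate(fused) if codon in STOP_CODONS]
    if not stops:
        problems.append("no in-frame stop codon")
    elif stops[0] != len(fused) - 1:
        problems.append(f"premature stop codon at codon {stops[0] + 1}")
    return problems


# Invented sequences for illustration.
insert = "ATGGGTATCTAA"
print(fusion_problems("ATGACCATGATT", insert))   # in phase
print(fusion_problems("ATGACCATGAT", insert))    # one base short: out of phase
```

Trimming or lengthening a junction by one or two nucleotides with synthetic DNA is one way to restore the correct phase.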
`
`SUMMARY OF POTENTIAL PRACTICAL APPLICATIONS
`
`In vitro recombinant DNA techniques, molecular cloning, and synthetic DNA
chemistry have interacted synergistically to create a revolution that is still in its early
stages. Nevertheless, even now we see the following on the immediate horizon:
Hormones. Numerous mammalian hormones and hormone analogues will be made
`(e.g., somatostatin, insulin, proinsulin, growth hormone, etc.).
`Interferon. Bacterial production may be achieved within 2 years, or perhaps much
`sooner (possibly even before this article appears).
`Vaccines. The coat proteins of viruses will be made in E. coli, perhaps fused to an
`immunogenic precursor protein, and used to make vaccines (e.g., against hepatitis).
`Antibodies. Hybridomas will provide a source of mRNA for specific antibodies.
`Bacteria may then be used for the production of the antibody peptide chains, which
`
`
`could be assembled in vitro and used for passive immunization.
`Enzymes. Many mammalian enzymes will be made in E. coli (e.g., urokinase).
Prenatal diagnosis. The work of Kan and Dozy [18] in 1978 clearly shows that with
`the appropriate hybridization probes, Southern blot restriction analysis of fetal DNA
`can be used for prenatal diagnosis. Chemically synthesized DNA may directly provide
`useful probes. Certainly, chemically synthesized DNA will facilitate cloning and thus
`the acquisition of the appropriate cloned DNA to be used as a hybridization probe.
`
`REFERENCES
1. ITAKURA K, HIROSE T, CREA R, et al.: Expression in E. coli of a chemically synthesized
gene for the hormone somatostatin. Science 198:1056-1063, 1977
2. CREA R, HIROSE T, KRASZEWSKI A, ITAKURA K: Chemical synthesis of genes for human
insulin. Proc Natl Acad Sci USA 75:5765-5769, 1978
3. GOEDDEL DV, KLEID DG, BOLIVAR F, et al.: Expression in E. coli of chemically
synthesized genes for human insulin. Proc Natl Acad Sci USA 76:106-110, 1979
4. RAZIN A, HIROSE T, ITAKURA K, RIGGS AD: Efficient correction of a mutation by use of a
chemically synthesized DNA. Proc Natl Acad Sci USA 75:4268-4270, 1978
5. KATSOYANNIS PG, TRAKATELLIS AC, JOHNSON S, ZALUT C, SCHWARTZ G: Studies on the
synthesis of insulin from natural and synthetic A and B chains. I. Splitting of insulin and
isolation of the S-sulfonated derivatives of the A and B chains. Biochemistry 6:2642-2655,
1967
6. KHORANA HG: Total synthesis of a gene. Science 203:614-625, 1979
7. ITAKURA K, KATAGIRI N, NARANG SA, BAHL CP, MARIANS KJ, WU R: Chemical
synthesis and sequence studies of deoxyribooligonucleotides which constitute the duplex
sequence of the lactose operator of E. coli. J Biol Chem 250:4592-4600, 1975
8. HEFFRON F, SO M, MCCARTHY BJ: In vitro mutagenesis of a circular DNA molecule by
using synthetic restriction sites. Proc Natl Acad Sci USA 75:6012-6016, 1978
9. SHORTLE D, NATHANS D: Local mutagenesis: a method for generating viral mutants with
base substitutions in preselected regions of the viral genome. Proc Natl Acad Sci USA
75:2170-2174, 1978
10. HUTCHISON CA, PHILLIPS S, EDGELL MH, GILLAM S, JAHNKE P, SMITH M: Mutagenesis
at a specific position in a DNA sequence. J Biol Chem 253:6551-6560, 1978
11. SANGER F, AIR GM, BARRELL BG, et al.: Nucleotide sequence of bacteriophage φX174
DNA. Nature 265:687-695, 1977
12. MONTGOMERY DL, HALL BD, GILLAM S, SMITH M: Identification and isolation of the yeast
cytochrome C gene. Cell 14:673-680, 1978
13. VILLA-KOMAROFF L, EFSTRATIADIS A, BROOME S, et al.: A bacterial clone synthesizing
proinsulin. Proc Natl Acad Sci USA 75:3727-3731, 1978
14. MERCEREAU-PUIJALON O, ROYAL A, CAMI B, et al.: Synthesis of an ovalbumin-like
protein by E. coli K12 harboring a recombinant plasmid. Nature 275:505-510, 1978
15. FRASER TH, BRUCE BJ: Chicken ovalbumin is synthesized and secreted by E. coli. Proc
Natl Acad Sci USA 75:5936-5940, 1978
16. CHANG ACY, NUNBERG JH, KAUFMAN RJ, ERLICH HA, SCHIMKE RT, COHEN SN:
Phenotypic expression in E. coli of DNA sequence coding for mouse dihydrofolate
reductase. Nature 275:617-624, 1978
17. SEEBURG PH, SHINE J, MARTIAL JA, et al.: Synthesis of growth hormone by bacteria.
Nature 276:795-798, 1978
18. KAN YW, DOZY AM: Polymorphism of DNA sequence adjacent to human β-globin
structural gene: relationship to sickle mutation. Proc Natl Acad Sci USA 75:5631-5635,
1978
`