`
`Synthetic DNA and Medicine
`
ARTHUR D. RIGGS¹ AND KEIICHI ITAKURA
`
Synthetic DNA chemistry is no longer an esoteric discipline without obvious practical
applications. On the contrary, the combination of synthetic DNA chemistry, recombinant
DNA techniques, and molecular cloning has already resulted in useful products
(somatostatin [1] and insulin [2, 3]) and promises much more. In this review, we
first discuss the results and methods of our recent work on insulin [3] and mutation
correction [4] and then speculate on additional potential applications.
`
`THE INSULIN PROJECT
`Bacterial production of human insulin
Figure 1 illustrates the overall scheme that we used [3]. The insulin chains (the 21 amino
acid A chain and the 30 amino acid B chain) are made in separate bacterial strains as tails
on a rather large precursor protein, the enzyme β-galactosidase. The insulin chains are
efficiently clipped from the precursor protein by treatment with cyanogen bromide, a
methionine-specific cleavage reagent. Because synthetic DNA was used to make the
insulin genes, we were able to arrange that the insulin tails are attached to
β-galactosidase by a methionine linkage (see next section).
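
The cleavage step can be illustrated with a short Python sketch. It assumes the textbook rule that cyanogen bromide cuts a peptide chain on the C-terminal side of every methionine (M); the chemical conversion of that methionine is ignored. The function name `cnbr_fragments` and the short precursor stub are hypothetical, and the A chain sequence is the commonly cited one for human insulin rather than data quoted from this article.

```python
def cnbr_fragments(protein):
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":          # cleavage on the C-terminal side of Met
            fragments.append(current)
            current = ""
    if current:                     # whatever follows the last Met
        fragments.append(current)
    return fragments


# Hypothetical short stand-in for the large beta-galactosidase portion.
PRECURSOR_STUB = "MKTAYMAKQRL"
# Commonly cited human insulin A chain (21 residues); an assumption here.
INSULIN_A = "GIVEQCCTSICSLYQLENYCN"

# Fusion protein: precursor, one linking methionine, then the insulin tail.
fusion = PRECURSOR_STUB + "M" + INSULIN_A
fragments = cnbr_fragments(fusion)
released_tail = fragments[-1]
print(fragments)
```

Because the tail itself contains no methionine, it comes off as a single intact fragment, whereas the precursor is cut wherever it contains methionine.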
The yields of the separate chains are extremely good; about 20% of the total bacterial
protein is made as the insulin-β-galactosidase precursor protein [3], and even higher
yields should be obtained soon. Almost the entire protein-synthesizing machinery of
the bacterial cell can be turned to the production of the desired peptide product.
Milligrams of the insulin chains are made per liter of bacterial culture. The individual
chains can be joined in good yields (up to 80%) by air oxidation [5] and active insulin
obtained, as ascertained by chemical, radioimmune, and biological activity [3, and
unpublished data]. With our present results, the commercial production of human
insulin by bacteria seems practical, and two firms, Genentech, Inc. (South San Francisco,
Calif.), and Eli Lilly, Inc. (Indianapolis, Ind.), are trying to develop procedures
for large-scale production.
The techniques we used are quite general; thus we are confident that bacteria can be
engineered to produce any peptide hormone that does not contain methionine. By using
`
`Received April 25, 1979.
`1 Both authors: Division of Biology, City of Hope National Medical Center, Duarte, CA 91010.
`© 1979 by the American Society of Human Genetics. 0002-9297/79/3105-0014$01.00
`
`Genzyme Ex. 1003, pg 52
`
[Figure 1. Flow diagram (not reproduced). Two E. coli strains carry plasmid DNA in which the β-galactosidase (β-Gal) gene is joined to the insulin A chain gene or the insulin B chain gene. Each strain makes a β-galactosidase-insulin chain fusion protein. Steps shown: (1) partial purification, (2) cleavage with CNBr, (3) purification of insulin chains. The A and B chains are then joined by air oxidation to give active insulin.]
FIG. 1.-Schematic overview of strains and procedures for production of human insulin by bacteria. Two
E. coli strains were constructed having chemically synthesized insulin A or B chain genes inserted into the
β-galactosidase gene of a plasmid cloning vector. In vivo, a fused protein is made, mostly β-galactosidase
but with an insulin tail joined by a methionine. In vitro, the insulin peptide chain is clipped off by treatment with
cyanogen bromide. After separate purification, insulin A and B chains are joined by air oxidation.
`
`other cleavage tricks, or accepting lower yields, even peptides that contain methionine
`can probably be made.
`With chemical DNA synthesis, it is not necessary to copy the natural nucleotide
`sequence of a gene, because, given the sequence of amino acids in the desired peptide
`product, one can use the genetic code to design an "artificial" gene carrying the
`necessary information. This is the approach that we used first for somatostatin [1] and
`then for insulin [2]. Techniques have developed rapidly, so that the genes necessary for
`altering the bacteria can be made and inserted with relatively modest expenditures of
`time and money. In the next section, we will describe in more detail how genes can be
`made and inserted into bacteria.
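
The idea of designing an "artificial" gene from an amino acid sequence can be sketched in Python. This is only an illustration under stated assumptions: it uses the standard genetic code and always picks the first codon in table order for each amino acid, which is not necessarily the choice made for the somatostatin or insulin genes. The helper names `design_gene` and `translate` are hypothetical, and the somatostatin sequence is the commonly cited 14-residue one rather than data quoted from this article. An ATG (methionine) codon is placed in front of the coding sequence, in keeping with the methionine linkage described earlier, and a stop codon is added at the end.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Arbitrary illustrative choice: the first codon listed for each amino acid.
FIRST_CODON = {}
for codon, aa in CODON_TABLE.items():
    FIRST_CODON.setdefault(aa, codon)


def design_gene(peptide):
    """Return a coding strand: ATG, one codon per residue, then a stop codon."""
    return "ATG" + "".join(FIRST_CODON[aa] for aa in peptide) + FIRST_CODON["*"]


def translate(dna):
    """Translate a coding strand in frame 1 ('*' marks a stop codon)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


SOMATOSTATIN = "AGCKNFFWKTFTSC"   # commonly cited sequence; an assumption here
gene = design_gene(SOMATOSTATIN)
print(gene)
print(translate(gene))
```

Translating the designed sequence back recovers the peptide, preceded by methionine and followed by the stop signal, which is the check one would want before committing to synthesis.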
`
`Chemical DNA synthesis
`Although there are alternative methods [6], the fastest way to make DNA is by the
`phosphotriester method [2, 7] illustrated in figure 2. We will not go into details beyond
`those given in the legend of figure 2, but recent improvements in the method (such as
`the rapid synthesis of trimers, together with the extensive use of high performance
`liquid chromatography for analysis and purification of the oligodeoxyribonucleotides)
`have dramatically reduced the time necessary for the construction of DNA fragments
`[2]. A library of trimers has been established, and longer oligonucleotides can be
`assembled quickly from the trimer units (which correspond to amino acid codons).
`To make the insulin A and B chain genes, it was necessary to make 29
`oligonucleotides which were assembled and joined by ligation to make a total of 181
`base pairs of duplex DNA. Starting from the trimer library, the DNA fragments were
made in about 3 months. The next stages, cloning and expression, took somewhat
longer. DNA synthesis is no longer the rate-limiting step.
`
`Molecular cloning and expression
Figure 3 illustrates how the insulin A chain gene was assembled, cloned, and
positioned at the end of β-galactosidase. Step 1 (fig. 3) was joining the small (13 base
average) oligonucleotides. Because they were designed to have complementary
overlaps, they assembled themselves and were joined to give duplex DNA by the action
of T4 DNA ligase. The gene was designed to have restriction enzyme sites at each
end (Eco RI on the left and Bam HI on the right). Step 2 was preparation of the
plasmid DNA cloning vector pBR322. Preparation included treatment with Eco RI and
Bam HI restriction enzymes, which cut out a small piece of the plasmid and provide
a site for insertion of the synthetic A gene. In step 3, the prepared plasmid and
synthetic DNA were mixed and joined by T4 DNA ligase, followed by transformation of
E. coli and molecular cloning. A clone was obtained that contained a correct insulin A
gene, as verified by direct DNA sequencing. Next, a DNA fragment containing most of
the E. coli lac operon, including the lac promoter, operator, and the first 1006 amino
acid codons of β-galactosidase, was inserted (steps 4, 5, and 6; Z symbolizes the
β-galactosidase gene). This led to a clone making insulin-β-galactosidase fused
protein.
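
The self-assembly of overlapping oligonucleotides in step 1 can be sketched as follows. This is a simplified model under stated assumptions: the duplex is treated as blunt-ended (the real gene had cohesive Eco RI and Bam HI ends), both strands are cut into pieces of about 13 bases, and the lower strand is cut out of register so that each of its pieces spans a break in the upper strand. The example sequence and the names `staggered_oligos` and `junctions_bridged` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def staggered_oligos(top, size=13):
    """Cut both strands of a duplex into oligos with staggered break points.

    Returns two lists of (start, sequence); start is measured on the top
    strand, and every sequence is written 5' to 3'.
    """
    top_oligos = [(i, top[i:i + size]) for i in range(0, len(top), size)]
    offset = size // 2                       # shift the bottom-strand breaks
    cuts = [0] + list(range(offset, len(top), size)) + [len(top)]
    bottom_oligos = [
        (a, revcomp(top[a:b])) for a, b in zip(cuts, cuts[1:]) if b > a
    ]
    return top_oligos, bottom_oligos


def junctions_bridged(top_oligos, bottom_oligos):
    """True if every break in the top strand lies inside some bottom oligo."""
    junctions = [start for start, _ in top_oligos[1:]]
    spans = [(a, a + len(seq)) for a, seq in bottom_oligos]
    return all(any(a < j < b for a, b in spans) for j in junctions)


# Hypothetical insert flanked by Eco RI (GAATTC) and Bam HI (GGATCC) sites.
EXAMPLE = "GAATTC" + "ATGGGCATTGTGGAACAGTGCTGCACC" + "GGATCC"
top_oligos, bottom_oligos = staggered_oligos(EXAMPLE)
print(len(top_oligos), len(bottom_oligos), junctions_bridged(top_oligos, bottom_oligos))
```

When every break in one strand is bridged by a piece of the other, base pairing alone holds the fragments in the correct order, and ligase then only has to seal the nicks.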
`
`MUTATION CORRECTION
`
`DNA changes directed by synthetic DNA
`Several techniques have been used recently for site-localized mutagenesis of DNA in
`vitro [4, 8-10]. However, we think that the use of synthetic DNA provides the most
`specific and general approach to making directed changes in DNA [4, 10]. It should be
`possible to repair or create mutations, convert a gene of one species to the same gene of
`
`
[Figure 2. Reaction scheme (chemical structures not reproduced). Fully protected trimers of type 1 (DMT at the 5' end, CE on the 3' phosphate) and type 2 (OAn at the 3' end) are selectively deprotected with Et3N or H+ and condensed with TPSTe, with silica gel purification, to give intermediates 3 through 7 and the fully protected dodecamer 8. Final treatment with NH4OH and AcOH removes the protecting groups.
Key: R = chlorophenyl; B = protected and nonprotected bases; DMT = dimethoxytrityl; TPSTe = condensing reagent (structure drawn in the original); An = 3'-protecting group (structure drawn in the original); CE = cyanoethyl.]
`
FIG. 2.-Chemical synthesis of oligodeoxyribonucleotides by improved triester method [2, 7]. Starting
with nucleosides, a library of fully protected triester trimers, such as 1 and 2, is made. The type 2 trimer,
with the anisole 3'-protecting group, will become the 3' end of the oligonucleotide. The type 1 trimer is
bifunctional and, depending on treatment (either mild acid or base), will be either the 5'-end component or an
internal sequence component. Because of the chlorophenyl protecting groups attached by ester linkage to the
phosphate groups (forming phosphotriesters), the trimers and intermediate oligonucleotides (e.g., 3, 4, 5, 6,
7) are not water soluble. Therefore, all condensations and purifications are done in nonaqueous solvents such
as chloroform. Trimers can be condensed to yield hexamers (e.g., 3 + 4 yields 6) and hexamers can be
condensed to yield dodecamers (e.g., 6 + 7 yields 8, still in fully protected triester form). The next-to-last
step in a typical synthesis is the removal of all protecting groups by treatment with acetic acid and NH4OH,
generating the desired water soluble single stranded DNA fragment. The last step is a careful purification of
the DNA fragment by high performance liquid chromatography.
`
[Figure 3. Construction diagram (not reproduced). Synthetic oligonucleotides are joined by ligase into the A chain gene (A) with Eco RI (RI) and Bam HI (Bam) ends. The gene is ligated into pBR322 cut with Eco RI and Bam HI and used for transformation, giving a clone that is AmpR and TetS. An Eco RI fragment from lambda plac carrying lac P, O, and Z' is then inserted by ligation and transformation, giving a clone that is AmpR, TetS, and blue on XG.]
FIG. 3.-The construction of a plasmid DNA containing a synthetic insulin A chain gene inserted at the
end of a β-galactosidase gene. The procedures are described in the text, and details are given in Goeddel et
al. [3], so only an explanation of the symbols is given here. The symbol A represents the synthetic A chain
gene. pBR322 is a well-characterized plasmid cloning vector containing two antibiotic resistance genes,
ampicillin (Amp) and tetracycline (Tet), and several convenient restriction endonuclease sites including Eco
RI (RI) and Bam HI (Bam). Lambda plac is a lambda transducing phage carrying the entire E. coli lac operon,
which includes the lac promoter (P), the lac operator (O), and the entire β-galactosidase structural gene (Z).
There is an Eco RI endonuclease site to the left of the operon and also one near the end of the β-galactosidase
gene; thus, the lac operon DNA fragment can be readily obtained. The phenotypes of the bacterial strains
successfully infected with the desired plasmid are shown. For example, the A chain producing strains would
be ampicillin resistant (AmpR), tetracycline sensitive (TetS), and the colonies would be blue on a special
indicator agar called XG.
`
`another species, make genes for peptide analogues, create restriction sites, etc.
`Although practical applications have not yet been made, the feasibility of the approach
`has been demonstrated [4, 10].
Figure 4 illustrates how we have efficiently corrected a mutation in φX174
`bacteriophage DNA, a favorable test system because the mature viral DNA is a single
`stranded circle, and the complete DNA sequence is known [11]. We made a synthetic
`primer with wild type sequence. This synthetic DNA hybridized with a one-base pair
`mismatch to mutant viral DNA and served as a primer for in vitro DNA replication by
`E. coli DNA polymerase. Complete heteroduplex circles were made by in vitro DNA
`replication, and these were used to infect E. coli. In vivo replication led to homoduplex
`progeny, a high percentage of which were converted to wild type [4]. The efficiency of
`the directed change is high enough that even sequence changes for which there is no
`method of selection can be made. Direct DNA sequencing techniques can be used to
`identify the converted molecule.
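
The logic of primer-directed correction can be sketched in Python. The sequences below are short hypothetical stand-ins, not the φX174 am3 region, and the names `anneal_sites` and `corrected_template` are illustrative. The sketch assumes that the primer pairs with the single site on the mutant template that it matches with at most one mismatch, and that the corrected progeny carry the primer's information at that site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal_sites(template, primer, max_mismatches=1):
    """List (position, mismatches) where the primer can pair with the template."""
    target = revcomp(primer)        # template sequence a perfect match would have
    hits = []
    for i in range(len(template) - len(target) + 1):
        window = template[i:i + len(target)]
        mismatches = sum(a != b for a, b in zip(window, target))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


def corrected_template(template, primer, position):
    """Template-sense sequence of progeny that inherit the primer's information."""
    target = revcomp(primer)
    return template[:position] + target + template[position + len(target):]


# Hypothetical wild type strand and a mutant differing at one base (index 10).
WILD_TYPE = "ATGCGTACCGGATTCAAGCT"
MUTANT = "ATGCGTACCGAATTCAAGCT"
primer = revcomp(WILD_TYPE[4:16])   # 12-base primer with wild type sequence

hits = anneal_sites(MUTANT, primer)
position, mismatches = hits[0]
print(hits)
print(corrected_template(MUTANT, primer, position) == WILD_TYPE)
```

The same search with `max_mismatches=0` finds no site on the mutant, which is the point of the experiment: the primer must tolerate exactly the mismatch it is meant to correct.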
`
[Figure 4. Diagram (not reproduced). A synthetic primer with wild type (wt) sequence is annealed, with a single mismatch at the am3 site, to single stranded circular φX174 am3 DNA. In vitro DNA synthesis completes the heteroduplex circle, which is used to infect E. coli, yielding wild type progeny.]
FIG. 4.-Correction of a mutation in bacteriophage φX174. A chemically synthesized single stranded
DNA fragment with a wild type sequence is specifically hybridized to a φX174 mutant called amber 3 (am
3). The synthetic oligonucleotide serves as a primer for in vitro DNA synthesis. After infection and
replication, a high percentage of the progeny phage are wild type, indicating efficient correction of the
mutation by using synthetic DNA.
`
`ISOLATION OF NATURAL GENES
`Genes for large proteins and enzymes could be made by chemical DNA synthesis
`(synthesis rate about one nucleotide/day per person). However, in most cases, it will
`probably be preferable to isolate the natural DNA sequence by cloning a reverse
`transcript of the appropriate mRNA. This approach to gene isolation works well if a
`method is available for detecting the desired gene sequence among the "shotgun"
`library of bacterial or viral clones. Detection of the desired clone usually is difficult,
`except for the most abundant protein (and thus mRNA) species. With present
`methodology, it is very difficult to isolate a specific gene whose mRNA transcript does
`not comprise more than 1% of the total mRNA.
`Synthetic DNA, however, may allow the isolation of the natural gene for any protein
`for which at least partial amino acid sequence is known. Using the genetic code, a short
`(10-20 nucleotide) synthetic DNA fragment can be made that is complementary to the
`mRNA of the desired gene. Degeneracy of the genetic code will require that a mixture
`of synthetic probes be made for most proteins, but this seems feasible. Synthetic probes
`have been used to identify the cytochrome C gene of yeast [12]. We are confident that
`with synthetic DNA used as a specific primer for cDNA synthesis and as a specific
`hybridization probe, almost any gene with a known peptide product can be isolated.
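
The effect of code degeneracy on probe design can be made concrete with a small Python sketch. It assumes the standard genetic code; the protein fragment and the names `degeneracy`, `coding_sequences`, and `least_degenerate_window` are hypothetical. The sketch looks for the stretch of five amino acids that can be encoded by the fewest DNA sequences and lists the mixture of 15-base probes, complementary to the mRNA, that would cover every possibility.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    count = 1
    for aa in peptide:
        count *= len(CODONS_FOR[aa])
    return count


def coding_sequences(peptide):
    """Every coding-strand sequence that could encode the peptide."""
    return ["".join(c) for c in product(*(CODONS_FOR[aa] for aa in peptide))]


def least_degenerate_window(protein, length=5):
    """The stretch of `length` residues with the fewest possible codings."""
    windows = [protein[i:i + length] for i in range(len(protein) - length + 1)]
    return min(windows, key=degeneracy)


PARTIAL_SEQUENCE = "LSRMWKDEFLS"      # hypothetical partial amino acid sequence
best = least_degenerate_window(PARTIAL_SEQUENCE)
probes = [revcomp(seq) for seq in coding_sequences(best)]
print(best, degeneracy(best), len(probes))
```

Regions rich in methionine and tryptophan, which have a single codon each, keep the mixture small, whereas regions rich in leucine, serine, or arginine (six codons each) quickly make it unwieldy.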
`
`EXPRESSION OF NATURAL MAMMALIAN GENES IN BACTERIA
`There is now clear evidence that mammalian genes do function in bacteria. Bacterial
`clones producing rat insulin, chicken ovalbumin, rat growth hormone, and mouse
`dihydrofolate reductase have been obtained [13-17]. Clearly, there is no fundamental
`barrier to prevent transcription and translation of eukaryotic genes in prokaryotes. The
`key to successful, efficient expression seems to be the presence of appropriate control
`signals and correct positioning of the foreign gene sequence. The new sequence must
`be in phase, near a ribosome-binding site (if the desired product is to be made without a
`precursor peptide), downstream from a good promoter, and contain translation start
`and stop sites. Synthetic DNA will not only aid isolation of the natural gene, but will
`also aid in obtaining expression. The most important function for synthetic DNA may
`be in trimming, lengthening, or changing natural sequences to obtain efficient
`expression of the desired peptide product.
`
`SUMMARY OF POTENTIAL PRACTICAL APPLICATIONS
`In vitro recombinant DNA techniques, molecular cloning, and synthetic DNA
`chemistry have interacted synergistically to create a revolution that is still in its early
`stages. Nevertheless, even now we see the following on the immediate horizon:
`Hormones. Numerous mammalian hormones and hormone analogues will be made
`(e.g., somatostatin, insulin, proinsulin, growth hormone, etc.).
`Interferon. Bacterial production may be achieved within 2 years, or perhaps much
`sooner (possibly even before this article appears).
`Vaccines. The coat proteins of viruses will be made in E. coli, perhaps fused to an
`immunogenic precursor protein, and used to make vaccines (e.g., against hepatitis).
`Antibodies. Hybridomas will provide a source of mRNA for specific antibodies.
`Bacteria may then be used for the production of the antibody peptide chains, which
`could be assembled in vitro and used for passive immunization.
`Enzymes. Many mammalian enzymes will be made in E. coli (e.g., urokinase).
`Prenatal diagnosis. The work of Kan and Dozy [18] in 1978 clearly shows that with
`the appropriate hybridization probes, Southern blot restriction analysis of fetal DNA
`can be used for prenatal diagnosis. Chemically synthesized DNA may directly provide
`useful probes. Certainly, chemically synthesized DNA will facilitate cloning and thus
`the acquisition of the appropriate cloned DNA to be used as a hybridization probe.
`
`REFERENCES
1. ITAKURA K, HIROSE T, CREA R, et al.: Expression in E. coli of a chemically synthesized
gene for the hormone somatostatin. Science 198:1056-1063, 1977
2. CREA R, HIROSE T, KRASZEWSKI A, ITAKURA K: Chemical synthesis of genes for human
insulin. Proc Natl Acad Sci USA 75:5765-5769, 1978
`3. GOEDDEL DV, KLEID DG, BOLIVAR F, et al.: Expression in E. coli of chemically
`synthesized genes for human insulin. Proc Natl Acad Sci USA 76:106-110, 1979
`4. RAZIN A, HIROSE T, ITAKURA K, RIGGS AD: Efficient correction of a mutation by use of a
`chemically synthesized DNA. Proc Natl Acad Sci USA 75:4268-4270, 1978
`5. KATSOYANNIS PG, TRAKATELLIS AC, JOHNSON S, ZALUT C, SCHWARTZ G: Studies on the
`synthesis of insulin from natural and synthetic A and B chains. I. Splitting of insulin and
`isolation of the S-sulfonated derivatives of the A and B chains. Biochemistry 6:2642-2655,
`1967
`6. KHORANA HG: Total synthesis of a gene. Science 203:614-625, 1979
7. ITAKURA K, KATAGIRI N, NARANG SA, BAHL CP, MARIANS KJ, WU R: Chemical
synthesis and sequence studies of deoxyribooligonucleotides which constitute the duplex
sequence of the lactose operator of E. coli. J Biol Chem 250:4592-4600, 1975
`8. HEFFRON F, SO M, MCCARTHY BJ: In vitro mutagenesis of a circular DNA molecule by
`using synthetic restriction sites. Proc Natl Acad Sci USA 75:6012-6016, 1978
`9. SHORTLE D, NATHANS D: Local mutagenesis: a method for generating viral mutants with
`base substitutions in preselected regions of the viral genome. Proc Natl Acad Sci USA
`75:2170-2174, 1978
10. HUTCHISON CA, PHILLIPS S, EDGELL MH, GILLAM S, JAHNKE P, SMITH M: Mutagenesis
at a specific position in a DNA sequence. J Biol Chem 253:6551-6560, 1978
11. SANGER F, AIR GM, BARRELL BG, et al.: Nucleotide sequence of bacteriophage φX174
DNA. Nature 265:687-695, 1977
`12. MONTGOMERY DL, HALL BD, GILLAM S, SMITH M: Identification and isolation of the yeast
`cytochrome C gene. Cell 14:673-680, 1978
13. VILLA-KOMAROFF L, EFSTRATIADIS A, BROOME S, et al.: A bacterial clone synthesizing
proinsulin. Proc Natl Acad Sci USA 75:3727-3731, 1978
14. MERCEREAU-PUIJALON O, ROYAL A, CAMI B, et al.: Synthesis of an ovalbumin-like
protein by E. coli K12 harboring a recombinant plasmid. Nature 275:505-510, 1978
15. FRASER TH, BRUCE BJ: Chicken ovalbumin is synthesized and secreted by E. coli. Proc
Natl Acad Sci USA 75:5936-5940, 1978
`16. CHANG ACY, NUNBERG JH, KAUFMAN RJ, ERLICH HA, SCHIMKE RT, COHEN SN:
`Phenotypic expression in E. coli of DNA sequence coding for mouse dihydrofolate
`reductase. Nature 275:617-624, 1978
`17. SEEBURG PH, SHINE J, MARTIAL JA, et al.: Synthesis of growth hormone by bacteria.
`Nature 276:795-798, 1978
18. KAN YW, DOZY AM: Polymorphism of DNA sequence adjacent to human β-globin
structural gene; relationship to sickle mutation. Proc Natl Acad Sci USA 75:5631-5635,
1978
`