`Sambrook Fritsch Maniatis
`
`BUTAMAX 1025
`
`
`
`Molecular
`Cloning
`A LABORATORY MANUAL
`SECOND EDITION
`
`All rights reserved
`© 1989 by Cold Spring Harbor Laboratory Press
`Printed in the United States of America
`
`98765432
`
`Book and cover design by Emily Harste
`
`Cover: The electron micrograph of bacteriophage λ particles
`stained with uranyl acetate was digitized and assigned false color
`by computer. (Thomas R. Broker, Louise T. Chow, and James I.
`Garrels)
`
`Cataloging in Publications data
`
`Sambrook, Joseph
`Molecular cloning: a laboratory manual / E.F.
`Fritsch, T. Maniatis--2nd ed.
`p. cm.
`Bibliography: p.
`Includes index.
`ISBN 0-87969-309-6
`1. Molecular cloning--Laboratory manuals. 2. Eukaryotic cells--
`Laboratory manuals. I. Fritsch, Edward F. II. Maniatis, Thomas
`III. Title.
`QH442.2.M26 1987
`574.87’3224--dc19
`
`87-35464
`
`Researchers using the procedures of this manual do so at their own risk. Cold Spring Harbor
`Laboratory makes no representations or warranties with respect to the material set forth in
`this manual and has no liability in connection with the use of these materials.
`
`Authorization to photocopy items for internal or personal use, or the internal or personal use of
`specific clients, is granted by Cold Spring Harbor Laboratory Press for libraries and other
`users registered with the Copyright Clearance Center (CCC) Transactional Reporting Service,
`provided that the base fee of $0.10 per page is paid directly to CCC, 21 Congress St., Salem MA
`01970. [0-87969-309-6/89 $00 + $0.10] This consent does not extend to other kinds of copying,
`such as copying for general distribution, for advertising or promotional purposes, for creating
`new collective works, or for resale.
`
`All Cold Spring Harbor Laboratory Press publications may be ordered directly from Cold
`Spring Harbor Laboratory, Box 100, Cold Spring Harbor, New York 11724. Phone: 1-800-843-
`4388. In New York (516)367-8423.
`
`TATA box and the upstream promoter elements. The TATA box, located
`25-30 bp upstream of the transcription initiation site, is thought to be in-
`volved in directing RNA polymerase II to begin RNA synthesis at the correct
`site. In contrast, the upstream promoter elements determine the rate at
`which transcription is initiated. These elements can act regardless of their
`orientation, but they must be located within 100 to 200 bp upstream of the
`TATA box. Enhancer elements can stimulate transcription up to 1000-fold
`from linked homologous or heterologous promoters. However, unlike up-
`stream promoter elements, enhancers are active when placed downstream
`from the transcription initiation site or at considerable distances from the
`promoter. Many enhancers of cellular genes work exclusively in a particular
`tissue or cell type (for review, see Voss et al. 1986; Maniatis et al. 1987). In
`addition, some enhancers become active only under specific conditions that
`are generated by the presence of an inducer, such as a hormone or metal ion
`(for review, see Sassone-Corsi and Borrelli 1986; Maniatis et al. 1987). Be-
`cause of these differences in the specificities of cellular enhancers, the choice
`of promoter and enhancer elements to be incorporated into a eukaryotic
`expression vector will be determined by the cell type(s) in which the recombi-
`nant gene is to be expressed. Conversely, the use of a prefabricated vector
`containing a specific promoter and cellular enhancer may severely limit the
`cell types in which expression can be obtained.
`Many enhancer elements derived from viruses have a broader host range
`and are active in a variety of tissues, although significant quantitative
`differences are observed among different cell types. For example, the SV40
`early gene enhancer is promiscuously active in many cell types derived from
`a variety of mammalian species, and vectors incorporating this enhancer
`have consequently been widely used (Dijkema et al. 1985). Two other
`enhancer/promoter combinations that are active in a broad range of cells are
`derived from the long terminal repeat (LTR) of the Rous sarcoma virus
`genome (Gorman et al. 1982b) and from human cytomegalovirus (Boshart et
`al. 1985).
`
`TERMINATION AND POLYADENYLATION SIGNALS
`
`During the expression of eukaryotic genes, RNA polymerase II transcribes
`through the site where polyadenylation will occur. Consequently, the 3’
`terminus of the mature mRNA is formed by site-specific posttranscriptional
`cleavage and polyadenylation (for review, see Birnstiel et al. 1985; Proudfoot
`and Whitelaw 1988; Proudfoot 1989). Although discrete sites for the termi-
`nation of the primary transcript have not yet been identified, general regions
`of DNA a few hundred nucleotides in length and downstream from the poly-
`adenylation site have been identified where transcription randomly termi-
`nates.
`Two distinct sequence elements are required for accurate and efficient
`polyadenylation: (1) GU- or U-rich sequences located downstream from the
`polyadenylation site and (2) a highly conserved sequence of six nucleotides,
`AAUAAA, located 11-30 nucleotides upstream, which is necessary but not
`sufficient for posttranscriptional cleavage and polyadenylation (for review,
`see Mason et al. 1986; Proudfoot and Whitelaw 1988). The practical implica-
`tion of these observations is that sequences downstream from the polyadenylation
`
`16.6 Expression of Cloned Genes in Cultured Mammalian Cells
`
`site must be included in eukaryotic expression vectors to ensure
`efficient polyadenylation of the mRNA of interest. Although a full-length
`cDNA clone may encode the conserved AAUAAA sequence and a tract of
`poly(A), these endogenous elements are not by themselves sufficient to
`guarantee polyadenylation. The downstream GU- or U-rich sequences neces-
`sary for cleavage and polyadenylation must therefore be incorporated into the
`vector. The most frequently utilized signals are those derived from SV40; a
`237-bp BamHI-BclI restriction fragment contains the cleavage/polyadenyla-
`tion signals from both the early and the late transcription units. These
`signals are positioned in opposite orientations, one on each DNA strand, and
`both sets of signals have been shown to be extremely efficient for the
`processing of hybrid mRNAs. Less frequently, polyadenylation signals have
`been provided by fusing a full-length cloned cDNA onto a partial genomic
`copy of a gene already resident in an expression vector (O’Hare et al. 1981;
`Kaufman et al. 1986b).
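The practical rule described above can be illustrated with a short Python sketch. It scans a cloned insert for the AATAAA hexamer (the DNA form of AAUAAA) and reports whether a GT- or T-rich stretch follows it; if none is found, the vector would have to supply the downstream element. The function names, the window sizes, and the threshold of 8 G/T residues in 10 are illustrative assumptions, not criteria taken from this manual. The 11-nucleotide offset reflects only the minimum hexamer-to-site spacing quoted in the text.

```python
import re
from typing import List

HEXAMER = "AATAAA"  # DNA form of the conserved AAUAAA signal


def find_polya_signals(seq: str) -> List[int]:
    """Return 0-based start positions of every AATAAA hexamer in seq."""
    seq = seq.upper().replace("U", "T")
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() for m in re.finditer("(?=" + HEXAMER + ")", seq)]


def has_gt_rich_stretch(seq: str, start: int, span: int = 60,
                        width: int = 10, min_gt: int = 8) -> bool:
    """True if any `width`-nt window in seq[start:start+span] has >= min_gt G/T.

    span, width, and min_gt are assumed values chosen for illustration.
    """
    region = seq.upper().replace("U", "T")[start:start + span]
    for i in range(len(region) - width + 1):
        window = region[i:i + width]
        if sum(base in "GT" for base in window) >= min_gt:
            return True
    return False


def vector_must_supply_downstream_element(insert: str) -> bool:
    """True if no hexamer in the insert is followed by a GT/T-rich stretch."""
    for pos in find_polya_signals(insert):
        # Skip the hexamer plus the minimum 11-nt spacing to the poly(A) site.
        downstream_start = pos + len(HEXAMER) + 11
        if has_gt_rich_stretch(insert, downstream_start):
            return False
    return True


if __name__ == "__main__":
    # A made-up cDNA ending in a poly(A) tract: hexamer present, no GT-rich element.
    cdna = "ATGGCC" + "AATAAA" + "C" * 15 + "A" * 20
    print(find_polya_signals(cdna))
    print(vector_must_supply_downstream_element(cdna))
```

A cDNA that ends in its poly(A) tract will typically fail this check even though it carries the hexamer, which is the situation the text warns about.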
`Sequences within the 3’ noncoding regions of eukaryotic genes may play a
`role in mRNA stability. For example, the presence of an AU-rich sequence,
`derived originally from the 3’ noncoding region of granulocyte-macrophage
`colony-stimulating factor (GM-CSF), has been shown to destabilize mRNAs
`transcribed from mammalian expression vectors (Shaw and Kamen 1986).
`Although similar motifs have been found in analogous locations within
`mRNAs encoding a variety of growth factors and oncogenes, relatively little is
`known about the way they function. To obtain maximal expression of a
`cloned gene, it may therefore be necessary to remove the nucleotide se-
`quences 3’ of the termination codon.
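Removing the sequences 3’ of the termination codon can be sketched in a few lines of Python. The example assumes that the open reading frame begins at the first ATG in the cDNA and is uninterrupted; that assumption, the function name, and the test sequence are all hypothetical and serve only to illustrate the idea.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def trim_3prime_noncoding(cdna: str) -> str:
    """Return the cDNA truncated immediately after the first in-frame stop codon.

    Assumption: translation starts at the first ATG found in the sequence.
    """
    seq = cdna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        raise ValueError("no ATG start codon found")
    # Walk the reading frame codon by codon until a stop codon is reached.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[:i + 3]
    raise ValueError("no in-frame stop codon found")


if __name__ == "__main__":
    # Made-up cDNA: short leader, four-codon ORF, AT-rich 3' noncoding region.
    cdna = "GGCACC" + "ATGGCTTGGTAA" + "ATTTATTTATTTA"
    print(trim_3prime_noncoding(cdna))
```

In practice the trimmed insert would still rely on the vector for its cleavage and polyadenylation signals.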
`
`SPLICING SIGNALS
`
`The DNA sequences coding for a eukaryotic protein are rarely contiguous;
`usually, they are separated in the genome by intervening noncoding se-
`quences that may vary in size from tens to many thousands of nucleotides.
`Following polyadenylation of the primary transcript, the introns are removed
`by splicing to generate the mature mRNA, which is then transported from the
`nucleus to the cytoplasm (for review, see Nevins 1983; Green 1986; Padgett
`et al. 1986; Kramer and Maniatis 1988).
`The minimal sequences required for splicing of mRNA are located at the 5’
`and 3’ boundaries of the intron. Comparison of a large number of these
`sequences has led to the identification of consensus sequences in which the
`first two and the last two nucleotides of the intron are essentially invariant:
`AG:GU(A)AGU ... intron ... (U/C)₁₁NCAG:G
`5’ splice site             3’ splice site
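The invariant positions of the consensus can be checked with a minimal Python sketch. It tests only the first two (GU) and last two (AG) nucleotides of a candidate intron, which the text describes as essentially invariant; passing the check is a necessary condition, not evidence that a site is actually used. The function names and the example sequence are hypothetical.

```python
def has_invariant_dinucleotides(intron: str) -> bool:
    """True if the intron starts with GU and ends with AG (RNA alphabet)."""
    rna = intron.upper().replace("T", "U")
    return len(rna) >= 4 and rna.startswith("GU") and rna.endswith("AG")


def remove_intron(pre_mrna: str, intron_start: int, intron_end: int) -> str:
    """Join the exons flanking pre_mrna[intron_start:intron_end].

    Raises ValueError if the candidate intron lacks the GU...AG boundaries.
    """
    intron = pre_mrna[intron_start:intron_end]
    if not has_invariant_dinucleotides(intron):
        raise ValueError("candidate intron lacks the GU...AG boundaries")
    return pre_mrna[:intron_start] + pre_mrna[intron_end:]


if __name__ == "__main__":
    # Made-up pre-mRNA: exon 1, a short intron, exon 2.
    exon1, intron, exon2 = "CAG", "GUAAGUCCUUUCUUUCUCAG", "GAU"
    pre_mrna = exon1 + intron + exon2
    print(remove_intron(pre_mrna, len(exon1), len(exon1) + len(intron)))
```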
`
`The development of in vitro splicing systems has led to the elucidation of
`much of the biochemistry of the splicing reaction, but the processes that
`guarantee correct matching of 5’ and 3’ splice sites are not yet understood.
`The fact that hybrid pre-mRNAs containing 5’ and 3’ splice sites derived
`from different introns can be accurately spliced (Chu and Sharp 1981)
`indicates the importance of the conserved consensus sequences in this pro-
`cess. However, these sequences cannot be the sole determinants of splice-site
`selection, since identical, but ordinarily inactive, consensus sequences can be
`
`
`developed that express the Tn5 neoʳ gene under the control of SV40
`regulatory elements (Chia et al. 1982; Southern and Berg 1982; Okayama
`and Berg 1983; Van Doren et al. 1984). Vectors such as pSV2-neo (Southern
`and Berg 1982) and pRSVneo (Figure 16.1C), which have been widely used
`in cotransformation experiments, contain a version of the Tn5 neoʳ gene
`that retains prokaryotic promoter sequences between the eukaryotic pro-
`moter and the APH coding sequences. This configuration yields a vector
`that can confer antibiotic resistance upon both prokaryotic and eukaryotic
`cells. However, perhaps because the bacterial promoter contributes several
`upstream AUG codons, the efficiency of translation of APH mRNAs synthe-
`sized from these vectors is comparatively low in mammalian cells (Chen
`and Okayama 1987). Vectors such as pko-neo (Figure 16.1D) (Van Doren et
`al. 1984) and pcDneo (Okayama and Berg 1983; Chen and Okayama 1987),
`which lack prokaryotic promoter sequences, are therefore preferred.
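The concern about upstream AUG codons can be made concrete with a small Python sketch that counts AUG triplets in the leader preceding the intended start codon. The function name and the example transcript are hypothetical, and the count alone says nothing about how strongly translation is actually reduced.

```python
def count_upstream_augs(mrna: str, start_codon_index: int) -> int:
    """Count AUG triplets lying entirely upstream of the intended start codon.

    start_codon_index is the 0-based position of the A of the authentic AUG.
    """
    leader = mrna[:start_codon_index].upper().replace("T", "U")
    return sum(1 for i in range(len(leader) - 2) if leader[i:i + 3] == "AUG")


if __name__ == "__main__":
    # Made-up transcript: a leader carrying two AUGs, then the coding region.
    leader = "GGAUGCCAUGAAACC"
    transcript = leader + "AUGGCU"
    print(count_upstream_augs(transcript, len(leader)))
```

Comparing the count for a leader that retains the prokaryotic promoter with one that lacks it gives a quick indication of which construct is closer to the preferred configuration.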
`
`Hygromycin B phosphotransferase. The E. coli gene encoding hygromycin B
`phosphotransferase (Gritz and Davies 1983) can be used as a dominant
`selectable marker in much the same way as the APH gene. When the
`hygromycin B phosphotransferase gene (hyg) is introduced into mammalian
`cells on an appropriate expression vector (e.g., pHyg, Figure 16.1E) (Sugden
`et al. 1985), the transfected cells become resistant to the antibiotic hy-
`gromycin. Resistance to neomycin and to hygromycin can be selected for
`independently and simultaneously in cell lines that have been transfected
`with both genes. Thus, two different vectors can be introduced into one cell
`line, either simultaneously or sequentially.
`
`Xanthine-guanine phosphoribosyl transferase. The gpt gene of E. coli en-
`codes the enzyme xanthine-guanine phosphoribosyl transferase (XGPRT),
`which is the bacterial analog of the mammalian enzyme hypoxanthine-
`guanine phosphoribosyl transferase (HGPRT). Whereas only hypoxanthine
`and guanine are substrates for HGPRT, XGPRT will also efficiently convert
`xanthine into XMP, which is a precursor of GMP. The bacterial gpt gene
`has been cloned and expressed in mammalian cells under the control of an
`SV40 promoter (Mulligan and Berg 1980, 1981a,b) (see, e.g., Figure 16.1F).
`Vectors expressing XGPRT restore the ability of mammalian cells lacking
`HGPRT activity to grow in HAT medium (Szybalska and Szybalski 1962;
`Littlefield 1964, 1966).
`Of much greater general use is the application of the gpt gene as a
`dominant selection system, which can be applied to any type of cell
`(Mulligan and Berg 1981a,b). Vectors expressing XGPRT confer upon
`wild-type mammalian cells the ability to grow in medium containing
`adenine, xanthine, and the inhibitor mycophenolic acid. Mycophenolic acid
`blocks the conversion of IMP into XMP and inhibits the de novo synthesis of
`GMP. The selection can be made more efficient by the addition of aminopterin,
`which blocks the endogenous pathway of purine biosynthesis.
`
`CAD. A single protein, CAD, possesses the first three enzymatic activities
`of de novo uridine biosynthesis (carbamyl phosphate synthetase, aspartate
`transcarbamylase, and dihydroorotase). Transfection of vectors expressing
`the CAD protein from Syrian hamsters into CAD-deficient (UrdA) mutants
`of CHO cells allows selection of CAD transfectants that are able to grow in
`the absence of uridine (Robert de Saint Vincent et al. 1981).
`
`N-Phosphonacetyl-L-aspartate (PALA) is a specific inhibitor of the aspartate
`transcarbamylase activity of CAD. Growth of wild-type or transfected
`mammalian cells in the presence of increasing concentrations of PALA
`leads to the amplification of the CAD gene and DNA sequences linked to it
`(Kempe et al. 1976; Robert de Saint Vincent et al. 1981; Wahl et al. 1984).
`The E. coli gene encoding aspartate transcarbamylase (pyrB), when expressed
`in CHO cells deficient in aspartate transcarbamylase, is also
`amplified by PALA selection (Ruiz and Wahl 1986).
`
`Adenosine deaminase. Adenosine deaminase (ADA) is present in virtually
`all animal cells, but it is normally synthesized in minute quantities and is
`not essential for cell growth. However, because ADA catalyzes the irrevers-
`ible conversion of cytotoxic adenine nucleosides to their respective nontoxic
`inosine analogs, cells propagated in the presence of toxic concentrations of
`adenosine or its analog 9-β-D-xylofuranosyl adenine (Xyl-A) require ADA
`for survival (for references and review, see Kaufman 197). Under condi-
`tions where ADA is required for cell growth, amplification of the gene can
`be achieved in the presence of increasing concentrations of 2’-deoxycoformycin
`(dCF), a transition-state analog of adenine nucleotides that strongly
`inhibits the enzyme. In cells selected for their ability to resist high
`concentrations of 2’-deoxycoformycin, it has been shown that ADA was
`overproduced 11,400-fold and represented 75% of the soluble protein syn-
`thesized by the cells (Ingolia et al. 1985).
`
`Asparagine synthetase. The E. coli gene coding for asparagine synthetase
`(AS) is a potentially useful, dominant, amplifiable marker for mammalian
`cells. Because the bacterial enzyme uses ammonia as an amide donor, in
`contrast to the mammalian enzyme, which uses glutamine, cells that
`express the bacterial AS gene will grow in asparagine-free medium containing
`the glutamine analog albizziin. Subsequently, the transfected AS gene
`can be amplified by selection in medium containing increasing concentrations
`of β-aspartyl hydroxamate, an analog of aspartic acid.
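The selectable markers described above can be summarized as a small lookup table. The Python sketch below records, for each marker, the selective condition and the amplifying agent where the text names one; the class name, field names, and dictionary keys are illustrative choices, and only markers whose selection conditions are stated in this section are included.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Marker:
    source: str                           # origin of the gene
    selection: str                        # condition that selects transfectants
    amplified_by: Optional[str] = None    # agent used for gene amplification


MARKERS: Dict[str, Marker] = {
    "hyg": Marker("E. coli hygromycin B phosphotransferase", "hygromycin"),
    "gpt": Marker("E. coli XGPRT",
                  "adenine + xanthine + mycophenolic acid (aminopterin optional)"),
    "CAD": Marker("Syrian hamster CAD",
                  "uridine-free medium, CAD-deficient CHO cells", "PALA"),
    "ADA": Marker("adenosine deaminase",
                  "toxic concentrations of adenosine or Xyl-A",
                  "2'-deoxycoformycin"),
    "AS": Marker("E. coli asparagine synthetase",
                 "asparagine-free medium + albizziin",
                 "beta-aspartyl hydroxamate"),
}


def amplifiable(markers: Dict[str, Marker]) -> List[str]:
    """Return the names of markers for which an amplifying agent is listed."""
    return sorted(name for name, m in markers.items() if m.amplified_by)


if __name__ == "__main__":
    for name in amplifiable(MARKERS):
        print(name, "->", MARKERS[name].amplified_by)
```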
`
`Foreign DNA Sequences
`
`DNAs encoding the foreign protein of interest are usually cloned as cDNAs
`that lack all of the controlling elements required for expression in mam-
`malian cells but may contain ancillary sequences introduced during the
`construction of the cDNA library (e.g., homopolymeric stretches of guanine or
`cytosine residues, synthetic linkers, etc.). No consensus exists as to whether
`or not these ancillary sequences need to be removed before the cDNA can be
`expressed in mammalian cells. However, since such sequences never en-
`hance, and in some circumstances may suppress, the level of expression of
`foreign DNAs in mammalian cells (Simonsen et al. 1982), most workers
`prefer to remove as many extraneous sequences as is conveniently possible.
`Less frequently, DNAs encoding the foreign protein of interest are obtained
`as a genomic copy in which the coding sequences may be interrupted by one
`or more introns. A complete genomic copy will have all the controlling
`sequences necessary for the expression of the protein in some, but not
`necessarily all, cell types. Because the specificity of these sequences de-
`termines the range of cell types in which the gene will be active, replacement
`