`
` Published online July 6, 2004
`
`Nucleic Acids Research, 2004, Vol. 32, No. 12
`e96
`doi:10.1093/nar/gnh082
`
`Increasing the efficiency of SAGE adaptor ligation
`by directed ligation chemistry
`Austin P. So1, Robin F. B. Turner1 and Charles A. Haynes1,2,*
`
`1Biotechnology Laboratory and 2Department of Chemical and Biological Engineering, University of British Columbia,
`Vancouver BC Canada V6T 1Z3
`
`Received February 11, 2004; Revised April 8, 2004; Accepted May 17, 2004
`
`ABSTRACT
`
The ability of Serial Analysis of Gene Expression
(SAGE) to provide a quantitative picture of global
gene expression relies not only on the depth and accuracy
of sequencing into the SAGE library, but also on
the efficiency of each step required to generate the
SAGE library from the starting mRNA material. The
first critical step is the ligation of adaptors containing
a Type IIS recognition sequence to the anchored 3′
end cDNA population that permits the release of
short sequence tags (SSTs) from defined sites within
the 3′ end of each transcript. Using an in vitro transcript
as a template, we observed that only a small
fraction of anchored 3′ end cDNA is successfully
ligated with added SAGE adaptors under typical reaction
conditions currently used in the SAGE protocol.
Although the introduction of 500-fold molar excess
of adaptor or the inclusion of 15% (w/v) PEG-8000
increased the yield of the adaptor-modified product,
complete conversion to the desired adaptor:cDNA
hetero-ligation product is not achieved. An alternative
method of ligation, termed directed ligation, is
described which exploits a favourable mass-action
condition created by the presence of NlaIII during
ligation in combination with a novel SAGE adaptor
containing a methylated base within the ligation
site. Using this strategy, we were able to achieve
near complete conversion of the anchored 3′ end
cDNA into the desired adaptor-modified product.
This new protocol therefore greatly increases the
probability that a SST will be generated from every
transcript, greatly enhancing the fidelity of SAGE.
Directed ligation also provides a powerful means
to achieve near-complete ligation of any appropriately
designed adaptor to its respective target.
`
`INTRODUCTION
`
`The development of technologies aimed towards monitoring
`gene expression on a global scale has revolutionized the
`study of biology from a systems perspective (1). This per-
`spective embraces the idea that the functional significance of
`
`gene products is not only related to their quantity in the cell,
`but also to how they interact and are strung together to form
`genetic and biochemical networks. Numerous technologies
`have been developed over the past decade, with the greatest
`attention being given to approaches based on either high-
`throughput sequencing or massively parallel analysis of
`the transcriptome (i.e. the set of all expressed genes weighted
`by transcript abundance) using array hybridization techno-
`logy. The sequencing approach to monitoring gene expression
`on a global scale typically involves the creation of short
`representations of each transcript, such as expressed sequence
`tags (ESTs) or short sequence tags (SSTs) generated using
`Serial Analysis of Gene Expression (SAGE) technology
`(2,3). DNA microarray technology attempts to resolve the
`transcriptome by selectively binding and quantifying each
`transcript at one or more complementary registers of a
`high-density array (4–6). These technologies are now rou-
`tinely used to identify families of genes—in many cases
`incompletely characterized or with previously unidentified
`functionality—which act in concert to define a given cell
`fate or outcome (7), and have been used to identify upstream
`sequence elements involved in directing the expression of
`these gene families. Although microarray technology offers
`an increasingly reliable and sensitive analysis of gene expres-
`sion, its use is dependent on an a priori knowledge of genes,
`which are expressed under a given cell state, currently
`restricting application of the technology to the identification
`and quantification of these subsets of genes.
`SAGE technology (2), in contrast, directly samples the
`entire transcriptome of an organism under a given cellular
`state through the generation of SSTs of 9–22 bp in length.
Because a 9–10mer oligonucleotide can theoretically identify
4^9 (262 144) or 4^10 (1 048 576) unique sequences, the entire
transcript population of any organism can potentially be represented
(2,3). First, a cDNA copy of the mRNA population is
digested with a restriction endonuclease (RE; e.g. NlaIII) and
the most 3′ end restriction fragments of the digested population
are purified. A short oligonucleotide adaptor that contains
`a unique primer sequence and a recognition sequence for a
`Type IIS RE is then ligated to the anchored cDNA. Because
`Type IIS REs are capable of cleaving DNA outside their
`recognition sequence (8), subsequent cleavage with a Type
`IIS RE (e.g. BsmFI) releases SSTs of equal length (2). A
`library of these SSTs is created through subsequent dimeriza-
`tion, amplification via the PCR, concatemerization and inser-
`tion into an appropriate vector. Finally, a representative
`
`*To whom correspondence should be addressed. Tel: +1 604 822 5136; Fax: +1 604 822 2114; Email: israels@chml.ubc.ca
`
Nucleic Acids Research, Vol. 32 No. 12 © Oxford University Press 2004; all rights reserved
`
`
`Table 1. Outline of the enzymatic, purification and isolation steps involved in the SAGE and microSAGE protocols (http://www.sagenet.org/protocol/index.htm)
`
`Enzymatic steps
`
`Purification and isolation steps
`MicroSAGE (SADE)
`
`SAGE
`
`(1) mRNA preparation
`(2) cDNA synthesis
`(3) cleavage with anchoring enzyme (digest with NlaIII)
(4) 3′ end cDNA isolation
(5) Ligating adaptors to bound 3′ end cDNA
`(6) Release of cDNA tags (digest with BsmFI)
`(7) Blunt-ending of cDNA tags
`(8) Ligating tags to form ditags
`(9) PCR amplification of ditags
`
`(10) Adaptor removal (NlaIII digestion) and purification
`of ditags
`
`(11) Ligation of ditags to form concatamers
`(12) Insertion into vector
`
`Affinity purification
`–
`–
`–
`–
`
`Precipitation, selection with biotinylated oligo(dT)
`Phenol extraction, precipitation
`Phenol extraction, precipitation
`Affinity purification
`
`Phenol extraction, precipitation
`Phenol extraction, precipitation
`
`–
`
`Phenol extraction, precipitation
`PAGE purification, gel extraction, precipitation
`Phenol extraction, precipitation
`
`PAGE purification, gel extraction, precipitation
`PAGE purification, size selection, gel extraction, precipitation
`Phenol extraction, selection by host
`
`population of clones is serially sequenced to identify and tally
`each SST. Since each SST is derived from a defined position
`within a particular cDNA, a given tag can be cross-referenced
`through organism- and/or tissue-specific genome databases to
`a particular gene to give a profile of global gene expression. An
`important advantage compared with microarray technology is
`that unreferenced SSTs that arise out of the SAGE analysis can
`be used to identify previously unknown genes and aid in the
`completion of genome annotations for the organism under
`study (2,3,9–17).
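As a quick check of the tag-complexity figures quoted above, the number of distinct sequences of length n over the four-letter DNA alphabet is 4^n. The short Python sketch below reproduces the values for 9- and 10-mers; the function name is ours and purely illustrative.

```python
def distinct_tags(length: int) -> int:
    """Number of distinct DNA sequences of the given length (alphabet A, C, G, T)."""
    return 4 ** length


# Tag lengths discussed in the text for the original SAGE design.
for n in (9, 10):
    print(f"{n}-mer: {distinct_tags(n)} possible sequences")
```

This counts sequence space only; it says nothing about how many of those sequences actually occur next to an anchoring site in a given transcriptome.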
`The ability of SAGE to provide an accurate measure of gene
`expression profiles is dependent upon the extent to which the
`distribution of transcript abundances inferred through the
`sequenced set of amplified SSTs fully reflects the real distri-
`bution of the abundances of associated transcripts in the
`original mRNA population. This fidelity depends upon
`the accuracy of the sequencing method used to identify the
`SSTs (18) and on the depth of sequencing applied to the SAGE
`library (19,20). Less appreciated, however, is the extent to
`which losses and processing artefacts in each of the 12 enzy-
`matic and 10 purification steps—or 7 in the microSAGE pro-
`tocol—used to convert the starting mRNA sample into a
`SAGE library (Table 1) can skew the sequencing results
away from the real distribution. To illustrate, if 5 µg of
mRNA (5 × 10^12 molecules of average length 2 kb) are
used as starting material for the SAGE protocol, a 50% average
yield in each processing step would result in an overall
yield of 0.000024% (i.e. 0.5^22), such that the final sample
(1.2 × 10^6 molecules) would represent a minute fraction of
`the original. Such an overall yield would result in a form of
`sampling bias in SAGE analysis equivalent to the bias intro-
`duced by an insufficient depth of sequencing (19,20).
`Although inclusion of PCR steps in the SAGE protocol is
`intended to recover these losses, amplification after processing
`can only recover those ditags derived from targets that have
`survived the numerous enzymatic and purification steps.
`Clearly then, efforts to maximize yields and minimize arte-
`facts introduced in each processing step are required to ensure
`the fidelity of SAGE.
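The cumulative-yield illustration above can be reproduced with a few lines of Python. This is a minimal sketch under the assumptions stated in the text (12 enzymatic plus 10 purification steps, a uniform 50% yield per step, and 5 × 10^12 starting molecules); the variable and function names are ours.

```python
ENZYMATIC_STEPS = 12       # from Table 1
PURIFICATION_STEPS = 10    # classic SAGE protocol
STEP_YIELD = 0.5           # illustrative uniform yield per step
START_MOLECULES = 5e12     # roughly 5 µg of mRNA at 2 kb average length


def overall_yield(step_yield: float, n_steps: int) -> float:
    """Overall fractional yield when every step has the same yield."""
    return step_yield ** n_steps


n_steps = ENZYMATIC_STEPS + PURIFICATION_STEPS
fraction = overall_yield(STEP_YIELD, n_steps)
surviving = START_MOLECULES * fraction

print(f"steps: {n_steps}")
print(f"overall yield: {fraction * 100:.6f}%")
print(f"surviving molecules: {surviving:.2e}")
```

Because the per-step yield is raised to the number of steps, even modest improvements in a single step (such as adaptor ligation) compound through the rest of the protocol.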
Although a number of recent studies have resulted in the
improvement of some of the purification steps in the SAGE
protocol (13,15,21–28), little attention has been given to
addressing the efficiencies of the enzymatic steps of the protocol.
Given that the ability to generate a SAGE tag from a
transcript is determined by the successful ligation of the SAGE
adaptor to the anchored 3′ end cDNA population, the yield in
this step is likely to contribute significantly to the overall
fidelity of the SAGE protocol. Here we demonstrate, using
adaptors 1A/B of the current SAGE protocol (version 1e;
http://www.sagenet.org/protocol/index.htm), that the yield of
this ligation step is generally low due to a strong propensity
of the anchored 3′ end cDNA target to self-ligate. We then
show that the addition of PEG-8000, traditionally used to
favour the formation of linear ligation products (29–31),
increases the yield of the desired adaptor-target heterodimer,
but is unable to fully eliminate the formation of unwanted
homodimer. Finally, we show that by using an alternative
method of ligation, which we call ‘directed ligation’, a significant
improvement in the SAGE protocol is achieved,
increasing the efficiency of adaptor ligation and eliminating
the irreversible formation of unwanted ligation products.
`
`MATERIALS AND METHODS
`
`Enzymes and constructs
A 956 bp clone homologous to rat liver a transcription factor
(GenBank ID: X65948) from rat brain with a polyadenylated 3′
end (58 bp), kindly provided by Dr Terry Snutch (Biotechnology
Laboratory) in pBluescript SK (Stratagene), was propagated in
Escherichia coli DH5α (Invitrogen). Plasmids were isolated
using the boiling miniprep method (32) from 3 ml Terrific
broth (Sigma Aldrich) cultures in the presence of 100 µg/ml
ampicillin (Sigma Aldrich) when required. Plasmids (20 µg
each) were linearized with EcoRV and further purified using the
Qiagen Qiaquick purification kit according to the manufacturer’s
protocol (Qiagen). Orientation and identification of
the insert were verified by sequencing of 100 ng of the purified
plasmid at the Nucleic Acids and Peptide Synthesis Unit,
University of British Columbia. In vitro RNA transcripts in
the sense orientation were generated from 1 µg of linearized
plasmid using the T3 MEGAscript kit (Ambion) following the
manufacturer’s protocol and stored at −70°C in diethyl-pyrocarbonate
(DEPC) treated H2O (Ambion). All reactions in this
`
`
Table 2. List of oligonucleotides used in this study to form various SAGE adaptors

Oligo ID     Sequence (5′→3′)                                  MW (g/mol)
1A           TTTGGATTTGCTGGTGCAGTACAACTAGGCTTAATAGGGACATG      13657.06
1Am6A (a)    TTTGGATTTGCTGGTGCAGTACAACTAGGCTTAATAGGGACAm6TG    13670.95
1Am5C (b)    TTTGGATTTGCTGGTGCAGTACAACTAGGCTTAATAGGGACm5ATG    13670.95
1Bphos       pTCCCTATTAAGCCTAGTTGTACTGCACCAGCAAATCC-NH2 (c)    11517.57
2A           TTTCTGCTCGAATTCAAGCTTCTAACGATGTACGGGGACATG        12919.55
2Am6A (a)    TTTCTGCTCGAATTCAAGCTTCTAACGATGTACGGGGACAm6TG      12933.58
2Am5C (b)    TTTCTGCTCGAATTCAAGCTTCTAACGATGTACGGGGACm5ATG      12933.58
2Bphos       pTCCCCGTACATCGTTAGAAGCTTGAATTCGAGCAG-NH2 (c)      11020.24

Oligonucleotides were obtained gel-purified and verified by mass spectrometry.
(a) Am6, N6-methyl-deoxyadenosine.
(b) Cm5, 5-methyl-deoxycytosine.
(c) NH2, 3′ C7 amino spacer.
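The geometry of the annealed adaptors can be checked directly from the sequences in Table 2. The sketch below is illustrative only: it ignores the 5′ phosphate, the 3′ amino spacer and the methylated bases, and it assumes the standard recognition sequences CATG (NlaIII, leaving a 4-nt 3′ overhang) and GGGAC (BsmFI), which are taken from general enzyme references rather than from this section. The function names are ours.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Unmodified top (A) and bottom (Bphos) strands from Table 2, written 5'->3'.
ADAPTORS = {
    "adaptor 1": ("TTTGGATTTGCTGGTGCAGTACAACTAGGCTTAATAGGGACATG",
                  "TCCCTATTAAGCCTAGTTGTACTGCACCAGCAAATCC"),
    "adaptor 2": ("TTTCTGCTCGAATTCAAGCTTCTAACGATGTACGGGGACATG",
                  "TCCCCGTACATCGTTAGAAGCTTGAATTCGAGCAG"),
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unmodified DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def describe(top: str, bottom: str) -> dict:
    """Locate the duplex region and report the single-stranded ends of the top strand."""
    duplex = reverse_complement(bottom)
    start = top.find(duplex)
    if start < 0:
        raise ValueError("bottom strand does not pair with top strand")
    return {
        "five_prime_tail": top[:start],
        "three_prime_overhang": top[start + len(duplex):],
        "has_bsmfi_site": "GGGAC" in top,  # assumed BsmFI recognition sequence
    }


for name, (top, bottom) in ADAPTORS.items():
    print(name, describe(top, bottom))
```

For both adaptors the top strand extends past the duplex by CATG at its 3′ end, which is the cohesive end expected to pair with an NlaIII-cut cDNA, and the Type IIS site sits immediately upstream of it.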
`
study were incubated using an Eppendorf Mixmaster programmed
to mix for 3 s at 1400 rpm every 15 min.

Preparation of 3′ end anchored cDNA
An aliquot of 5 µg (16 pmol) or 0.1 µg (0.3 pmol) of in
vitro transcribed RNA was processed according to the regular
SAGE protocol or the microSAGE protocol version 1e. Alternatively,
in vitro transcribed RNA (0.6 µg or 1.9 pmol) was
annealed to 3.0 mg oligo(dT)25 dynabeads (Dynal Biotech) in
the presence of 600 U of SUPERase In (Ambion). Annealed
RNA was then processed according to the microSAGE protocol
version 1e using components from a cDNA synthesis kit
(Invitrogen) and scaled accordingly to a final volume of 600 µl
with the following exception: after first strand synthesis, the
reaction was cooled on ice, magnetized and 520 µl of the first
strand reaction was replaced with 520 µl of a pre-chilled
mixture of second strand synthesis reaction components and
incubated for 16 h at 16°C. Anchored second strand products
were then blunt-ended, washed and digested with NlaIII (New
England Biolabs) as described. The resulting anchored 3′ end
cDNAs (0.6 pmol/mg dynabeads) were stored at −20°C
until ready for use.
`
`Adaptors
`
Oligonucleotides corresponding to the adaptors and primers
used in the SAGE and microSAGE protocols version 1e were
obtained gel- or HPLC-purified (Qiagen) and are shown in
Table 2. Stock concentrations (5 µM) of the following adaptors
were prepared in 1× NEB4 buffer (New England Biolabs) by
mass dilutions: adaptor 1 (1A/1Bphos), adaptor 1m5C
(1Am5C/1Bphos), adaptor 1m6A (1Am6A/1Bphos), adaptor
2 (2A/2Bphos), adaptor 2m5C (2Am5C/2Bphos) and adaptor
2m6A (2Am6A/2Bphos). Adaptors were annealed according
to the annealing schedule described in the current SAGE
protocols.
`
`Standard ligation protocol used in SAGE
`
Ligation reactions using adaptor 1 at a final concentration of
80 nM were performed according to microSAGE protocol
version 1e. Additional ligation reactions, scaled to a final
volume of 10 µl (0.075 pmol cDNA per 125 µg dynabeads)
and containing varying amounts of adaptor 1 (0.038–38 pmol),
or supplemented with PEG-8000 [15% (w/v) final] using a
final adaptor concentration of 1 µM were also performed.
All reaction samples were incubated for 2 h at 16°C or 25°C.
`
`Directed ligation
`
Titration of T4 DNA ligase activity with NlaIII. Stock ligase
mixtures containing T4 DNA ligase (5 Weiss U/µl; Fermentas)
were prepared with various amounts of NlaIII (120 U/µl; New
England Biolabs) in a final buffer composition of 15 mM Tris–
HCl (pH 7.5), 0.1 mM EDTA, 1 mM DTT, 200 mM KCl,
0.5 mg/ml BSA and 50% glycerol, and stored at −70°C.
Oligo(dT)25 dynabeads (125 µg) with anchored 3′ end
cDNA (0.075 pmol) were pre-incubated with adaptor 1,
adaptor 1m5C or adaptor 1m6A (1 µM final) for 5 min at
37°C in 1× NEB4 buffer supplemented with 1 mM ATP
and 100 ng/µl BSA in a volume of 9 µl. The reactions
were initiated by adding 1 µl from one of the stock enzyme
mixtures described above, overlaid with mineral oil, and
incubated for 2 h at 37°C.
Directed ligation protocol for SAGE. A stock enzyme mixture
containing NlaIII (25 U/µl final) and T4 DNA ligase (2.5 Weiss
U/µl final) was prepared as described above. Oligo(dT)25
dynabeads (125 µg) with anchored 3′ end cDNA (0.075
pmol) were pre-incubated with 2.5 pmol of adaptor 1m6A
for 5 min at 37°C in 1× NEB4 buffer supplemented with
100 ng/µl BSA and 1 mM ATP. After initiation with 1 µl
of the stock enzyme mixture, reactions were spiked every
15 min with 2.5 pmol of adaptor 1m6A for a total incubation
time of 1 h and a total addition of 10 pmol adaptor.
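The adaptor dosing in the directed ligation protocol can be tabulated as follows. This is a minimal sketch that assumes the initial 2.5 pmol counts as the dose at t = 0 and that spikes fall at 15, 30 and 45 min, which is the reading consistent with the stated total of 10 pmol in 1 h; the function name is ours.

```python
TARGET_PMOL = 0.075  # anchored 3' end cDNA per reaction


def spike_schedule(dose_pmol: float, interval_min: int, total_min: int):
    """Return (time_min, cumulative_pmol) for doses at t = 0 and every interval before total_min."""
    schedule = []
    cumulative = 0.0
    t = 0
    while t < total_min:
        cumulative += dose_pmol
        schedule.append((t, cumulative))
        t += interval_min
    return schedule


schedule = spike_schedule(dose_pmol=2.5, interval_min=15, total_min=60)
for time_min, cumulative in schedule:
    ratio = cumulative / TARGET_PMOL
    print(f"t = {time_min:2d} min: {cumulative:4.1f} pmol adaptor added (adaptor:target ~{ratio:.0f}:1)")
```

Under this reading, the cumulative adaptor:target ratio rises from roughly 33:1 at the start to roughly 133:1 after the final spike.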
`
`Analysis of anchored ligation products
The reactions were heat-inactivated for 20 min at 65°C in
200 µl of 1× NEB4 supplemented with 100 ng/µl BSA, followed
by two washes with the same buffer. Anchored ligation
products were then cleaved off the dynabead support with 10 U
DraI (New England Biolabs) in 30 µl of 1× NEB4 supplemented
with BSA. After incubation for 1 h at 37°C, products were
resolved via 6% PAGE (Owl Scientific) for 3 h at 12.5
V/cm. SYBR-Gold (Molecular Probes) stained gels were
`visualized using a CCD-based gel documentation system
`(Alpha Innotech) using a SYBR-green filter set (Molecular
`Probes) at a sub-saturating aperture setting and recorded as
`TIFF files. When required, densitometric analysis was per-
`formed using publicly available software (tnimage-3.3.7a;
`http://brneurosci.org/tnimage.html).
`
`
`Preparation and PCR amplification of ditags
`
Adaptors 1 and 2 or adaptors 1m6A and 2m6A were ligated to
anchored 3′ end cDNA derived from 100 ng of in vitro transcripts
as described above using the standard microSAGE
protocol version 1e or the directed ligation protocol. After
ligation, the anchored products were processed according to
microSAGE protocol version 1e to form ditags. Ditag ligation
mixtures (3 µl) were brought up to a final volume of 20 µl with
LoTE buffer (2 mM Tris–HCl, 0.2 mM EDTA, pH 8.0). One
microlitre aliquots of 1:20 and 1:200 dilutions of the ligation
mixture in LoTE were then used as a template for PCR amplification
with Platinum Pfx thermophilic DNA polymerase
(Invitrogen) supplemented with 0.5× PCRX enhancer solution
and 0.1 mM MgSO4 according to the manufacturer’s protocol
in a final volume of 50 µl. PCR amplification was performed in
the presence or absence of template on an Eppendorf Mastercycler
(Eppendorf) using primer 1 and primer 2 as described in
the microSAGE protocol. After activation for 1 min at 95°C,
26 cycles were performed according to the following schedule:
95°C, 30 s; 55°C, 1 min and 72°C, 1 min. Upon completion,
a 10 µl aliquot was then resolved via 6% PAGE for 1 h at
12.5 V/cm and visualized as described above.
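The cycling schedule above can be written down as a small data structure, which also gives the programmed hold time. This is an illustrative sketch only; the class and function names are ours, and instrument ramp times are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time


ACTIVATION = Step(95, 60)
CYCLE = (Step(95, 30), Step(55, 60), Step(72, 60))
N_CYCLES = 26


def programmed_minutes(activation: Step, cycle: tuple, n_cycles: int) -> float:
    """Total programmed hold time in minutes, excluding ramping between temperatures."""
    total_seconds = activation.seconds + n_cycles * sum(step.seconds for step in cycle)
    return total_seconds / 60


print(f"programmed hold time: {programmed_minutes(ACTIVATION, CYCLE, N_CYCLES):.0f} min")
```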
`
`RESULTS AND DISCUSSION
`
`The ability of SAGE to provide a truly quantitative picture of
`gene expression relies on the efficiency of each step required
`to generate the library of SSTs from the harvested mRNA
`starting material. Currently, two general approaches to gen-
`erate SAGE libraries are utilized (Table 1), each customized
`towards the amount of starting material available to the
researcher. The original SAGE protocol described by Velculescu
et al. (2) uses 5 µg of mRNA (7.8 pmol mRNA of
average length 2 kb) as starting material. After conversion into
biotinylated cDNA, half of this sample is digested with the
RE NlaIII, and the 3′ end fragments are affinity purified via
streptavidin-linked dynabeads (2 mg) to generate anchored
3′ end cDNA (3.9 pmol/mg dynabeads). In contrast, the microSAGE
protocol, a modification of the SADE (SAGE Analysis
for Down-sized Extracts) protocol of Virlon et al. (33) and
of the commercially available I-SAGE™ kit from Invitrogen, is
designed to process the RNA from 5 × 10^4 to 2 × 10^6 cells
or up to 100 ng (0.16 pmol mRNA of average length 2 kb)
of starting mRNA. Oligo(dT)25 dynabeads (0.5 mg) are used as
an affinity support to directly harvest polyadenylated RNA
from the sample. The anchored oligo(dT)25 on the support is
used to prime cDNA synthesis, and the resulting cDNA is then
digested with NlaIII to generate anchored 3′ end cDNA
(0.31 pmol/mg dynabeads).
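The molar amounts quoted for the two protocols follow from a simple mass-to-mole conversion. The sketch below assumes an average mass of about 320.5 g/mol per ribonucleotide, a commonly used approximation that is our assumption rather than a value given in the text; the function name is ours.

```python
AVG_NT_MASS = 320.5  # g/mol per nucleotide of single-stranded RNA (assumed average)


def rna_pmol(mass_ng: float, length_nt: int, nt_mass: float = AVG_NT_MASS) -> float:
    """Convert a mass of RNA (ng) of a given average length (nt) into picomoles."""
    # ng -> g is 1e-9 and mol -> pmol is 1e12, so the net factor is 1e3.
    return mass_ng * 1e3 / (length_nt * nt_mass)


print(f"SAGE, 5 µg of 2 kb mRNA:        {rna_pmol(5000, 2000):.2f} pmol")
print(f"microSAGE, 100 ng of 2 kb mRNA: {rna_pmol(100, 2000):.2f} pmol")
```

With this assumed nucleotide mass the conversion returns values close to the 7.8 pmol and 0.16 pmol stated above; a different average mass would shift both figures proportionally.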
`When our in vitro RNA material was used as the starting
material, we found that the amount of anchored 3′ end cDNA
recovered using the original SAGE protocol was similar to that
obtained through the microSAGE protocol despite using 25-fold
more starting material (data not shown). This observation
is consistent with work by Virlon et al. (33) where 200-fold
less anchored 3′ end cDNA was recovered from microdissected
renal tubules using the SAGE protocol compared to
`those recovered from their SADE protocol, which used half the
`amount of starting material and employed Sau3A I as the
`anchoring enzyme. Although this material loss is largely
`
`due to the presence of four additional extraction and precipita-
`tion steps in the original SAGE protocol prior to adaptor
`ligation (Table 1), additional losses may arise from the pre-
`sence of excess biotinylated oligo(dT)20 primer used to prime
`first strand synthesis. Any such primer that survives the extrac-
`tion and precipitation steps will compete with binding to the
`streptavidin support. This primer contamination is most prob-
`ably small, however, as batch purification of biotinylated
`cDNAs using Qiaex II silica beads did not improve yields
`significantly.
After synthesis of the anchored 3′ end cDNA library
on either streptavidin-linked Dynabeads (i.e. SAGE) or
oligo(dT)25 Dynabeads (i.e. microSAGE), further processing
towards generation of the SAGE library is essentially the same
under the two protocols (Table 1).

Self-ligation of the anchored 3′ end cDNA competes
with ligation of the adaptor
`
`Under standard microSAGE reaction conditions, we observe
`that the ligation of SAGE adaptors to the cohesive end of the
anchored 3′ end cDNA consistently produces two products. In
the presence of T4 DNA ligase and the standard 80 nM concentration
of adaptor 1, a relatively small fraction (<5%) of
the anchored 3′ end cDNA was found to ligate to adaptor 1 to
`form the desired adaptor-target cDNA hetero-ligation product
`(Figure 1). The bulk of the anchored cDNA underwent an
`undesired reaction to form a high molecular weight product
`(lane 3). Comparisons with the control reaction in which no T4
`
Figure 1. Ligation of SAGE adaptor 1A to anchored 3′ end cDNA. An aliquot
`of 100 ng of in vitro transcribed polyadenylated product was processed under
`the microSAGE protocol and split into half. Lane 2 shows a control reaction in
`which T4 DNA ligase was not added to the ligation mixture. Lane 3 shows the
`formation of a small amount of the hetero-ligation product indicated by the
`arrow as well as a high molecular weight band corresponding to twice the
`molecular weight of the unligated cDNA. Ligations were performed as
`described in Materials and Methods.
`
`
`DNA ligase was added (lane 2), and with a ligation reaction
`performed in the presence of NlaIII indicated that this high
molecular weight product is a homodimer of the anchored 3′
end cDNA. Identical experiments were also carried out on
streptavidin anchored 3′ end cDNA samples prepared by
the original SAGE protocol and gave essentially the same
results. Lower loading densities of in vitro RNA onto oligo(dT)20
dynabeads or biotinylated cDNA onto streptavidin-linked
dynabeads only marginally inhibited formation of the
homodimer, suggesting that homodimer formation depends on
both the distance of separation between anchored 3′ end cDNA
molecules on the surface of a given dynabead (intermolecular)
as well as between those anchored on adjacent dynabeads
(intramolecular). Formation of the homodimer was also
observed when other in vitro RNA transcripts were utilized
to generate anchored 3′ end cDNA targets ranging from 132 to
355 bp in length. Thus, under the ligation conditions
`described, most of the desired hetero-ligation product is lost
`in favor of self-ligation of two anchored cDNA fragments.
`The yield of the desired hetero-ligation product was found
`to depend on the amount of SAGE adaptor introduced into the
`ligation mixture, and increased with increasing adaptor con-
`centration (Figure 2). However, even at very high concentra-
`tions of added adaptor (500:1, lane 10), formation of the
`unwanted cDNA self-ligation product remained significant,
`resulting in a loss of approximately half of the starting
`cDNA material. Under homogeneous reaction conditions
`(i.e. all reactants present in the solution phase), mass-action
`should favour the formation of two products, the desired
`adaptor–cDNA heterodimer and the adaptor–adaptor homodi-
`mer at these high concentrations of added adaptor. However,
`tethering of the target cDNA to the polystyrene surface of
`dynabeads creates a heterogeneous reaction environment.
`The distribution of ligation products may therefore be con-
`trolled by mass transfer effects that limit the concentration of
`
Figure 2. Influence of increasing adaptor:target molar ratios on the formation of
adaptor–target heterodimer versus target homodimer. Increasing amounts of
adaptor 1 (0–3.8 µM final) were introduced into standard ligation reactions
containing 0.075 pmol anchored target in a final volume of 10 µl as described in
Materials and Methods. In microSAGE, adaptors are introduced to a reaction
mixture containing 0.08 pmol anchored target at a final concentration of
0.08 µM in a total volume of 20 µl, corresponding to an adaptor:target ratio of
approximately 20:1. The classic SAGE protocol introduces a final adaptor
concentration of 0.8 µM to the ligation mixture containing 1.95 pmol
anchored target in a total volume of 40 µl, corresponding to an adaptor:
target ratio of 16:1.
`
`adaptor in the solid–liquid interfacial region where the target
`cDNA is anchored and the reaction must take place. Conse-
`quently, adaptor–adaptor and cDNA–cDNA homodimers are
`produced preferentially, even in the presence of a large excess
`of the added adaptor.
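The adaptor:target ratios quoted in the Figure 2 legend can be recomputed from concentration and volume. The sketch below reads the legend's concentrations as µM and its volumes as µl (the µ symbol is easily lost in extracted text), which is the only reading that reproduces the stated 20:1 and 16:1 ratios; the function names are ours.

```python
def adaptor_pmol(conc_uM: float, volume_ul: float) -> float:
    """Amount of adaptor in pmol (µM multiplied by µl gives pmol)."""
    return conc_uM * volume_ul


def adaptor_to_target(conc_uM: float, volume_ul: float, target_pmol: float) -> float:
    """Molar ratio of adaptor to anchored target in one ligation reaction."""
    return adaptor_pmol(conc_uM, volume_ul) / target_pmol


micro = adaptor_to_target(0.08, 20, 0.08)     # microSAGE conditions
classic = adaptor_to_target(0.8, 40, 1.95)    # classic SAGE conditions
highest = adaptor_to_target(3.8, 10, 0.075)   # top of the titration range

print(f"microSAGE:        {micro:.0f}:1")
print(f"classic SAGE:     {classic:.0f}:1")
print(f"highest titrated: {highest:.0f}:1")
```

The last value corresponds to the roughly 500-fold molar excess discussed in the text.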
`Improving the yield of adaptor–cDNA heterodimer by
`increasing the adaptor concentration in the reaction mixture
`is impractical for large-scale SAGE projects. In addition to
`the high associated costs of preparing the adaptor, excess
`adaptor may have deleterious effects on subsequent steps
`in SAGE. High concentrations of adaptor promote the
`formation of a large number of adaptor dimers, which can
`interfere with subsequent PCR amplification steps or neces-
`sitate excessive washing of the anchored ligation product
`to remove unreacted adaptor and adaptor dimers. For this
`reason, some groups (33,34) have attempted to limit adaptor–
`dimer contamination of the ditag PCR mixture by reducing
`the concentration of adaptor used in the adaptor ligation step.
`However, our results show that lowering the added SAGE
`adaptor concentration below the standard concentration of
`80 nM (i.e. lanes 4 and 5 of Figure 2) results in a significant
`reduction in the already low yield of the desired adaptor–
`cDNA hetero-ligation product. As the overall fidelity of
`SAGE to provide an accurate read of the distribution of
`transcript abundances will be affected by this sampling
`loss, there exists a need to develop cheaper and more effec-
`tive methods to increase the yield of the desired hetero-
`ligation product by reducing or, better yet, eliminating the
`formation of self-ligation products.
`
Addition of macromolecular crowding agents increases
the yield of adaptor modified anchored 3′ end cDNA
`Other changes in reaction conditions that alter the distribution
`of ligation products were therefore explored to improve the
`yield of the desired hetero-ligation product. For example, low-
`ering the reaction temperature can be used to slow the ligation
`reaction to a point where the rate of mass transfer of the
`adaptor to the solid–liquid interface no longer limits the forma-
`tion of the hetero-ligation product. In this case, however, a
`significantly increased incubation time is required, extending
`the already lengthy process involved in producing a SAGE
`library. Varying the rate of mixing during the reaction to
`decrease the hydrodynamic boundary layer and increase
`the surface concentration of the free adaptor was explored,
`but led to only a marginal improvement in the yield of the
`hetero-ligation product.
`Adding co-solutes that act as macromolecular crowding
`agents (i.e. compaction agents) has been shown to dramatic-
`ally affect the thermodynamics of reaction mixtures, gener-
`ally favouring the formation of products with compact
`conformations and for some proteins, linear rod-like aggre-
`gates (35,36). For ligation reactions, addition of 15% (w/v) of
`the neutral polymer polyethylene glycol (PEG) has been
`shown to enhance by up to 100-fold the formation of inter-
`molecular ligation products (i.e. linear concatamers) during
`the ligation of cohesive or blunt-ended DNA fragments in the
`solution phase (30,31,37). The influence of increased concen-
`trations of PEG-8000 on the formation of the desired hetero-
`ligation product was therefore examined (Figure 3). At the
standard reaction temperature of 16°C and a fixed adaptor
`
`
`Product distribution can be directed through the
`introduction of a restriction enzyme into the ligation
`reaction—directed ligation chemistry
`
`The inability to adjust ligation conditions such that the hetero-
`ligation product becomes the only significant reaction product
`suggests that surface-anchoring of the target cDNA presents
`kinetic or mass-transfer barriers that cannot be overcome by
`simple adjustments to the reaction conditions. As the primary
`problem lay in the inability to offset self-ligation of the target
`cDNA molecules on the surface, we sought a novel method to
`limit or prevent formation of this undesired ligation product.
Although removal of the 5′-phosphate on the recessed 5′ ends
of the anchored 3′ end cDNAs using an appropriate alkaline
phosphatase could potentially provide a means to eliminate
self-ligation of the anchored 3′ end cDNAs, the efficiency of
dephosphorylation by such phosphatases is often much lower
for 5′-phosphates on these sites. This, combined with the background
nuclease activity of the enzyme that can catalyse digestion
of 5′ overhangs, would significantly reduce the overall
`yield of defined ligation products. In addition, the ligation of
`SAGE adaptors to such modified targets would lead to the
`formation of a nicked adaptor–cDNA hetero-ligation product
`that is inappropriate for further SAGE processing without the
`introduction of an additional enzymatic step prior to PCR
`amplification. Another approach would be to use an adaptor
with an unphosphorylated 5′ end that would prevent adaptor
`dimer formation and thereby enhance reactivity by maintain-
`ing a large excess of adaptor relative to target cDNA. How-
`ever, this approach would also result in a nicked strand that
`requires additional enzymatic steps for processing in SAGE.
`As an alternative approach, we considered the effect of
`adding different amounts of NlaIII to the reaction mixture,
`with the aim of establishing a more favourable product dis-
`tribution based on the relative rates of ligation-product forma-
`tion catalysed by T4 DNA ligase and ligation product cleavage
`by NlaIII. In the presence of both enzymes, ligation would
`proceed until a steady-state product profile is reached in the
`presence of ATP. We observed that the addition of various
`amounts of NlaIII to a standard ligation reaction containing the
`SAGE adaptor clearly influenced the product distribution of
`hetero- versus homo-ligation product (Figure 4A). Titration of
a standard ligation reaction containing 0.25 U/µl T4 DNA
ligase and 1 µM of adaptor 1 with increasing amounts of NlaIII
`in the absence of PEG-8000 resulted in a gradual decrease in
`the amount of the high molecular weight homodimer as well as
`the desired heterodimer, and a concomitant increase in the
`amount of unmodified target DNA.
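The competition between ligation and cleavage can be pictured with a deliberately simple toy model. The sketch below is not the authors' analysis: it treats both ligation reactions as first-order in free target ends, uses arbitrary hypothetical rate constants, and ignores mass transfer at the bead surface. It only illustrates why a steady-state product mixture is expected when NlaIII can cut both products, and what would change in the limiting case where the adaptor–cDNA junction is protected from cleavage (the idea behind the methylated adaptor described in the Abstract).

```python
def simulate(k_hetero, k_homo, cut_hetero, cut_homo, t_end=50.0, dt=0.001):
    """Forward-Euler integration of a linear toy model.

    State: fractions of anchored cDNA ends that are free, adaptor-ligated
    (hetero) or self-ligated (homo). All rate constants are hypothetical
    and in arbitrary units.
    """
    free, hetero, homo = 1.0, 0.0, 0.0
    for _ in range(round(t_end / dt)):
        d_hetero = k_hetero * free - cut_hetero * hetero   # ligation minus cleavage
        d_homo = k_homo * free - cut_homo * homo
        free -= (d_hetero + d_homo) * dt
        hetero += d_hetero * dt
        homo += d_homo * dt
    return free, hetero, homo


# Both products cleavable: increasing the cleavage rate lowers both products.
low_cut = simulate(k_hetero=1.0, k_homo=4.0, cut_hetero=2.0, cut_homo=2.0)
high_cut = simulate(k_hetero=1.0, k_homo=4.0, cut_hetero=8.0, cut_homo=8.0)

# Hypothetical limiting case: the heterodimer is fully resistant to cleavage.
protected = simulate(k_hetero=1.0, k_homo=4.0, cut_hetero=0.0, cut_homo=2.0)

for label, state in (("low cut", low_cut), ("high cut", high_cut), ("protected", protected)):
    print(label, "free=%.3f hetero=%.3f homo=%.3f" % state)
```

In this toy picture, raising the cleavage rate shifts material back to unmodified target, in qualitative agreement with the titration described above, whereas a cleavage-resistant heterodimer acts as a sink that eventually collects essentially all of the target.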
`Although we were unable to selectively enhance the forma-
`tion of the hetero-ligation product relative to the undesired
`cDNA–cDNA homodimer, the results suggested that the com-
`petitive actions of NlaIII and T4 DNA ligase could provide an
efficient route to complete conversion of the anchored 3′ end
`cDNA fragments into the desired hetero-ligation product if
`RE-catalysed digestion of the desired adaptor–cDNA hetero-
`dimer could be specifically inhibited. As NlaIII is one of a
`number of REs sensitive to the presence of a methylated base
`within its recognition sequence (Table 3), the introduction of a
`methylated base within the ligation site of the SAGE adaptor
`could potentially enable the selective inhibition of digestion of
`the desired ligation product. Through subsequent formation of
`
Figure 3. Influence of supplemental PEG-8000 and incubation temperature on
the formation of adaptor–target heterodimer versus target homodimer. The
standard ligation reaction in the microSAGE protocol is performed in the
presence of 5% PEG-8000 (w/v) at 16°C for 2 h using a final adaptor
concentration of 0.08 µM in a final volume of 20 µl. Ligation reactions
shown were performed in a final volume of 10 µl as described in Materials
and Methods using an adaptor concentration of 1 µM final in the presence or
absence of PEG-8000 supplemented to a final concentration of 15% (w/v). The
reactions were carried out for 2 h under the conditions indicated.
`
concentration of 1 µM (i.e. >10-fold than typically u