PROTOCOL
`
`Parallel tagged sequencing on the 454 platform
`
`Matthias Meyer, Udo Stenzel & Michael Hofreiter
`
`Max Planck Institute for Evolutionary Anthropology, Department of Evolutionary Genetics, Deutscher Platz 6, D-04103 Leipzig, Germany. Correspondence should be
`addressed to M.M. (mmeyer@eva.mpg.de).
`
`Published online 31 January 2008; doi:10.1038/nprot.2007.520
`
`Parallel tagged sequencing (PTS) is a molecular barcoding method designed to adapt the recently developed high-throughput 454
`parallel sequencing technology for use with multiple samples. Unlike other barcoding methods, PTS can be applied to any type of
`double-stranded DNA (dsDNA) sample, including shotgun DNA libraries and pools of PCR products, and requires no amplification or
`gel purification steps. The method relies on attaching sample-specific barcoding adapters, which include sequence tags and a
`restriction site, to blunt-end repaired DNA samples by ligation and strand-displacement. After pooling multiple barcoded samples,
`molecules without sequence tags are effectively excluded from sequencing by dephosphorylation and restriction digestion, and using
`the tag sequences, the source of each DNA sequence can be traced. This protocol allows for sequencing 300 or more complete
`mitochondrial genomes on a single 454 GS FLX run, or twenty-five 6-kb plasmid sequences on only one 16th plate region. Most of the
`reactions can be performed in a multichannel setup on 96-well reaction plates, allowing for processing up to several hundreds of
`samples in a few days.
`
`INTRODUCTION
`Rationale
`Over the last three decades, Sanger sequencing1 has been the
`dominant DNA sequencing technology in all areas of life sciences,
`used to retrieve individual sequences or to decipher entire genomes.
`Although the throughput of this technology has gradually increased
`over time, it has now been exceeded by recently developed next-
`generation sequencing technologies2, such as 454 (Roche)3, Solexa
`(Illumina)4 and SOLiD (ABI). These technologies have increased
`the number of sequences obtained in a single run of a machine by
`several orders of magnitude, from mere hundreds to hundreds of
`thousands or even millions. Their superior efficiency in terms of
`both cost and time per sequenced nucleotide has not only spawned
`exploration in new sequencing fields, for example, ultra deep
`amplicon sequencing5 or paleogenomics6, but has also replaced
`Sanger sequencing in some of its ancestral domains, such as
`genome sequencing3,7 and serial analysis of gene expression8.
`Among the next-generation sequencing technologies, 454
currently offers by far the highest read length, which is ~250 bp
`on the GS FLX platform, not far from the 700 bp achieved through
`routine Sanger sequencing. However, despite its comparatively low
`throughput, Sanger sequencing is still used for many everyday
`applications, for example, amplicon sequencing and the sequencing
`of DNA fragments, a few kilobases long by primer walking. One
`important reason for this lies in a conceptual difference between
`Sanger and 454 sequencing, which affects the number of samples
`that can be processed in parallel. Whereas in Sanger sequencing
`each sequence read is derived from a separate sequencing reaction,
`454 uses emulsion PCR9 to amplify a pool of templates in a single
`reaction vessel before sequencing. Within one emulsion PCR, no
`information is retained about a sequence’s sample origin. Thus, to
`process several samples in parallel, these must be kept in separate
`pools, physically subdivided from each other not only during
`library preparation, but also during bead-emulsion-amplification
`and sequencing. However, the 454 sequencing plate can only be
`divided into a maximum of 16 regions, each of which yields on
`average 3 Mb of sequence. In many cases, this amount of sequence
`data produces an unnecessarily high redundancy in coverage. For
`
`example, if a 6-kb plasmid is shotgun sequenced on one 16th GS
`FLX plate region, it will be covered 500-fold on average. If it were
`possible to retain information about the sample origin of the
`obtained sequence reads, the same capacity could be used to
`sequence 25 such plasmids to 20-fold coverage.
`The method described here, called parallel tagged sequencing
`(PTS), allows for parallel sequencing large numbers of double-
`stranded DNA (dsDNA) samples on the 454 platform10. This is
`achieved by barcoding each sample with a specific sequence tag.
`After pooling the tagged DNA samples, library preparation and
`sequencing, the tag sequences are used to identify each sequence’s
`sample origin. The protocol (illustrated in Fig. 1) begins by blunt-
`end repairing each sample in separate reactions. Subsequently,
`barcoding adapters are ligated to both ends of the molecules.
`These adapters comprise single self-hybridized oligos containing
`a sequence tag and an Srf I restriction site. After ligation, the
`resulting single-strand nicks are removed by fill-in using a strand
`displacing polymerase. The barcoded samples are then quantified
`and pooled in ratios reflecting the desired relative sequence
`representation. After dephosphorylation, half of the adapter is cut
`off using Srf I11, a rare cutting restriction enzyme with restriction
`sites approximately every 150 kb in the human genome. Srf I leaves
5′ phosphates for the ligation of universal 454 adapters during
`sequencing library preparation. The dephosphorylation step
`excludes unreacted molecule ends from sequencing.
`PTS offers several important features providing both efficient use
`of sequencing resources and high data reliability. First, all reactions
are completed with ~100% efficiency, ensuring highly homogeneous
sequence representation among samples. Second,
`background sequences without a sequence tag are efficiently
`excluded from sequencing by dephosphorylation in conjunction
`with the use of a restriction enzyme. The Srf I restriction site
`produces a run of Gs before the tag and immediately adjacent to
`the key sequence used by the 454 system for quality controls. As the
last nucleotide of this key is also a G, all nucleotides remaining
`from the Srf I restriction site are inserted within a single flow cycle
`without significantly reducing the read length of the following
`
`NATURE PROTOCOLS | VOL.3 NO.2 | 2008 | 267
`
© 2008 Nature Publishing Group http://www.nature.com/natureprotocols

`

[Figure 1: schematic of the tagging workflow. Panel a, steps I–VI: (I) blunt-end repair; (II) adapter ligation; (III) adapter fill-in; (IV) pooling, dephosphorylation, restriction digestion; (V) 454 sequencing library preparation (universal adapters A and B′); (VI) 454 sequencing. Panel b, adapter elements: SrfI site, specific tag, target sequence, complementary tag/SrfI site.]
`
`sequencing. We are currently using PTS particularly for two
`sequencing applications, which are discussed below.
`
`Shotgun sequencing contiguous DNA segments. Owing to its
`high throughput and the absence of microbial subcloning, the 454
`technology enables faster and cheaper shotgun sequencing com-
`pared to the Sanger methodology. Using PTS, this power can be
`fully exploited for parallel sequencing contiguous DNA segments
`
`
`Figure 1 | Overview of the tagging protocol. (a) Each DNA sample is blunt-
`end repaired (I, Steps 2–8), before sample-specific barcoding adapters are
`ligated to both ends of the molecules (II, Steps 9–13). Nicks resulting from
`the ligation are removed by strand displacement with Bst polymerase
`(III, Steps 14–16). The barcoded samples are pooled in equimolar ratios and
`unligated molecule ends are excluded from sequencing through
`dephosphorylation and restriction digestion (IV, Steps 18–25). A single-
`stranded 454 sequencing library is prepared from the sample pool; this
`includes the blunt-end ligation of universal 454 adapters to the template
`molecules and isolation of correctly ligated molecules as single strands
`(V, Step 26). After sequencing (VI, Steps 27 and 28) the sequence reads are
`sorted according to their tag sequences (Step 29). Before downstream data
processing, the sequence tags are removed from the 5′ ends and, if applicable,
the 3′ ends of the reads. (b) Barcoding adapters comprise single self-hybridized
palindromic oligonucleotides, carrying an SrfI restriction site in the
middle (GCCCGGGC), a sequence tag at the 3′ end and the reverse
complementary tag sequence at the 5′ end. Each sequence tag may start with
`either an A or T, followed by several freely chosen nt, and ends in C or G. No
`homopolymers are allowed within the tag sequence.
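
The adapter layout described in panel b can be sketched in a few lines of Python. This is only an illustration built from the caption (reverse-complemented tag at the 5′ end, the SrfI site GCCCGGGC in the middle, the tag at the 3′ end); the function names and the example tag are hypothetical, and the actual oligo sequences are those listed in Supplementary Table 1.

```python
SRF_I_SITE = "GCCCGGGC"  # SrfI recognition sequence, as given in the Figure 1 caption

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def barcoding_adapter(tag):
    """Assemble a barcoding oligo as 5'-revcomp(tag)-SrfI site-tag-3'.

    Hypothetical helper: it follows the layout in the caption only and
    does not validate the tag against the tag design rules.
    """
    return reverse_complement(tag) + SRF_I_SITE + tag


# "ACTGTC" is an arbitrary example tag, not one taken from the protocol.
oligo = barcoding_adapter("ACTGTC")
print(oligo)
# Because the SrfI site is its own reverse complement, the whole oligo is
# palindromic and can therefore hybridize to a second copy of itself.
print(reverse_complement(oligo) == oligo)
```

The palindrome check mirrors the statement that each adapter consists of a single self-hybridized oligonucleotide.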
`
`sequence. Third, the tag design is particularly robust to sequencing
`errors in and around homopolymers, which are known to be the
most common errors in 454 sequencing. In our experience, ~97%
`of the sequences can be assigned to their sample origin with an
`extremely low false-assignment rate. Finally, the protocol is opti-
`mized for reaction setup using multichannel pipettes and a 96-well
`plate format, minimizing the time required for setup.
`
`Applications of PTS
`In existing 454 applications, physical subdivision of the 454
`sequencing plate is frequently used to process up to 16 different
`samples in one run. As this requires covering the sequencing plate
`with a gasket, the overall number of sequences retrieved from one
`run is reduced by half (Table 1). Using PTS instead of physical
`subdivision for such applications immediately doubles the sequen-
`cing throughput. Moreover, as theoretically an unlimited number
`of tags can be produced, PTS overcomes any limitation on the
`number of samples that can be processed in parallel. In principle,
`PTS can be applied to all types of double-stranded nucleic
`acid samples, allowing an efficient switch from Sanger to 454
`
`
`TABLE 1 | Sequencing throughput of the GS 20 and GS FLX platforms. 454 Sequencing plates can physically be subdivided into a minimum of two
`and a maximum of 16 regions. When using 16 plate regions, roughly half of the output is lost, as parts of the plate are covered with a gasket.
`In contrast, PTS allows for sequencing hundreds of samples in parallel without requiring physical subdivision, thereby retaining the maximum
`throughput.
`
Sequencing platform   Average read length   Plate region   Reads per region   Base pairs per region   Base pairs per plate   17-kb segments   6-kb segments   PCR products
GS 20                 ~100 bp               1/16th         6,300              630 kb                  10 Mb                  2                5               158
GS 20                 ~100 bp               1/4th          33,000             3.3 Mb                  13.2 Mb                10               28              825
GS 20                 ~100 bp               1/2            100,000            10 Mb                   20 Mb                  29               83              2,500
GS FLX                ~250 bp               1/16th         12,000             2.88 Mb                 46 Mb                  8                24              300
GS FLX                ~250 bp               1/4th          70,000             16.8 Mb                 67 Mb                  49               140             1,750
GS FLX                ~250 bp               1/2            210,000            50.4 Mb                 101 Mb                 148              420             5,250

The last three columns give the number of samples per plate region that can be processed in parallel using PTS:
17-kb segments, for example, mtDNA genomes (shotgun sequenced, average 20-fold coverage)
6-kb segments, for example, plasmids (shotgun sequenced, average 20-fold coverage)
PCR products (<100/250 bp length, average 40-fold coverage)
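
The sample numbers in Table 1 follow from simple arithmetic: base pairs (or reads) per region divided by the amount needed per sample. The Python sketch below reproduces that arithmetic. The function names are hypothetical, and rounding to the nearest integer is an assumption chosen because it matches the tabulated values; it is not stated in the protocol.

```python
def round_half_up(numerator, denominator):
    """Integer division rounded to the nearest whole number (halves round up)."""
    return (2 * numerator + denominator) // (2 * denominator)


def samples_by_bases(region_bp, segment_bp, coverage):
    """Shotgun samples per region: region yield / (segment length x coverage)."""
    return round_half_up(region_bp, segment_bp * coverage)


def samples_by_reads(reads_per_region, coverage):
    """Short PCR products per region: one read per molecule, so reads / coverage."""
    return round_half_up(reads_per_region, coverage)


# GS FLX, 1/16th plate region (2.88 Mb, 12,000 reads), values from Table 1
print(samples_by_bases(2_880_000, 17_000, 20))  # 17-kb segments at 20-fold
print(samples_by_bases(2_880_000, 6_000, 20))   # 6-kb segments at 20-fold
print(samples_by_reads(12_000, 40))             # PCR products at 40-fold
```

Because sequence representation varies among samples, these figures are best read as planning estimates rather than guaranteed per-sample coverage.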
`
`
`

`

[Figure 2, panel a: sequencing results per sample]

Sample    Number of sequences   % of total   Average coverage   Fully assembled
Total     14,780
Tag 1     683                   4.6          9.6                Yes
Tag 3     938                   6.3          13.0               Yes
Tag 5     1,058                 7.2          14.6               Yes
Tag 7     1,091                 7.4          15.0               Yes
Tag 9     1,019                 6.9          14.1               Yes
Tag 11    1,107                 7.5          15.5               Yes
Tag 13    776                   5.3          12.6               No
Tag 15    901                   6.1          12.5               Yes
Tag 17    987                   6.7          13.8               Yes
Tag 19    888                   6.0          12.5               Yes
Tag 21    1,027                 6.9          14.1               Yes
Tag 23    933                   6.3          12.9               Yes
Tag 25    1,525                 10.3         20.3               Yes
Tag 27    1,284                 8.7          17.6               Yes
No tag    563                   3.8

[Figure 2, panel b: bar chart of the number of sequences (0–2,000) per tag, from lowest to highest: No tag, Tag 1, Tag 13, Tag 19, Tag 15, Tag 23, Tag 3, Tag 17, Tag 9, Tag 21, Tag 5, Tag 7, Tag 11, Tag 27, Tag 25.]

[Figure 2, panel c: coverage plot of sequence reads (0–30) against nucleotide position (0–18,000).]
`
Figure 2 | PTS of 14 complete human mtDNA genomes (~16.5 kb) on a small GS FLX plate region to 14-fold coverage on average. The mtDNA genomes were
`amplified in two overlapping long-range PCRs as described previously10. The long-range PCR products were then quantified, pooled in equimolar ratios and
`nebulized. Between 100 and 200 ng of each sample were used as templates for the tagging reactions using barcoding adapters with 7-bp sequence tags differing
`by at least three substitutions. From 40 barcoding adapters that had been synthesized and diluted in a single batch in order from 1 to 40, no immediate
`neighbors were used to obtain full power for detecting cross-contamination among barcoding adapters. After barcoding, the samples were pooled in equal mass
`ratios. (a) Table of sequencing results showing the number of sequences and coverage obtained for each sample. As a result of inaccuracies in quantification or
`pipetting when pooling the long-range PCR products, one sample exhibited uncovered positions. These could be filled-in by deeper sequencing or single Sanger
`reads. No sequences with sequence tags from unused barcoding adapters were observed, indicating no detectable presence of cross-contamination among the
`barcoding adapters and the absence of sequencing errors potentially leading to false assignment of sequences to their sample origin. Thus, the best estimate of
`the false-assignment frequency in this experiment is zero. (b) Bar chart visualizing the sequence representation among samples, ordered from lowest to highest.
`(c) Exemplary coverage plot for one of the mitochondrial genomes (tag 23).
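
As a rough check on how evenly the reads in panel a are spread across samples, the per-tag counts can be compared with their mean. The Python sketch below uses the read counts reported in Figure 2a; the variable names are illustrative only, and the "No tag" reads are deliberately excluded from the mean.

```python
# Reads per tag, taken from Figure 2a (the 563 untagged reads are excluded)
reads_per_tag = {
    "Tag 1": 683, "Tag 3": 938, "Tag 5": 1058, "Tag 7": 1091,
    "Tag 9": 1019, "Tag 11": 1107, "Tag 13": 776, "Tag 15": 901,
    "Tag 17": 987, "Tag 19": 888, "Tag 21": 1027, "Tag 23": 933,
    "Tag 25": 1525, "Tag 27": 1284,
}

total_tagged = sum(reads_per_tag.values())
mean_reads = total_tagged / len(reads_per_tag)

# Relative deviation of each sample from the mean read count
deviation = {tag: (n - mean_reads) / mean_reads for tag, n in reads_per_tag.items()}

most_over = max(deviation, key=deviation.get)
most_under = min(deviation, key=deviation.get)
print(f"mean reads per tag: {mean_reads:.1f}")
print(f"{most_over}: {deviation[most_over]:+.1%}")
print(f"{most_under}: {deviation[most_under]:+.1%}")
```

In this data set the most over-represented sample (tag 25) lies about 50% above the mean and the least represented (tag 1) about 33% below it.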
`
`from numerous samples, such as plasmids or target regions pre-
`amplified by long-range PCR (see ref. 10, Fig. 2). For example,
`when sequencing is performed on the new 454 GS FLX platform,
`up to 300 complete mitochondrial genomes or a comparable
number of nuclear DNA fragments of similar length (~17 kb)
`can theoretically be sequenced to 20-fold coverage in parallel in a
`single run (see Table 1). In this way, population data produced by
`re-sequencing can be obtained with unprecedented speed. In
`contrast to Sanger-based primer walking approaches, shotgun
`sequencing long-range PCR products does not require a priori
`sequence information for designing sequencing primers and
saves time and costs for setting up individual PCRs and
sequencing reactions.
`
`Sequencing pooled amplicons. This application is useful when
`short sequences within the 454 read length limit are desired, as in
`population studies using ancient DNA or DNA from museum
`specimens. As 454 sequence reads stem from single template
`molecules, miscoding base damage and contamination can be
`readily identified without microbial subcloning of the PCR pro-
`ducts. This allows for cost- and time-efficient sequencing of pooled
`amplicons from multiple samples, while retaining the highest
`standards in ancient DNA and museum research12,13. Phylogenetic
`and population genetic studies are also increasingly performed
`using multiple short nuclear sequences, totaling from a few to 30 kb
`of sequence14–16. In such applications, complete data sets for whole
`species groups as well as large population samples could be
`obtained in either a single or partial 454 run. In addition to low
`coverage sequencing of many pooled amplicons, PTS can be used
`for deep sequencing fewer amplicons in parallel.
`
Comparison to other barcoding methods
Although other methods for barcoding and sample multiplexing on
the 454 platform have been previously introduced, these methods
are limited to the parallel sequencing of PCR products. In contrast,
PTS is the first barcoding method to allow for parallel sequencing
all types of dsDNA samples. It is also the only currently available
method for barcoding shotgun DNA libraries and pooled PCR
products.
The previously reported barcoding methods used 5′-tagged PCR
primers to distinguish PCR products derived from different
sources17–19. The universal 454 adapter sequences were either
included as additional 5′-tails or added in the regular 454 library
preparation process. This approach is simple, quick and efficient
for sequencing short (<250 bp) homologous PCR products from
different samples, because combinations of tagged forward and
reverse primers can be used to barcode a large number of samples.
However, when dealing with many different targets, this approach
becomes cost-prohibitive and prone to confusion, because sets of
primers must be synthesized for each target under study, and the
primers must be added separately both to each PCR and the
corresponding control reaction. PTS is preferable in this case, as
it is suitable for simultaneously barcoding a pool of PCR products.
It does not require changes to the experimental design of existing
PCR applications and provides the flexibility of choosing the
sequencing strategy after amplification. Furthermore, because
they consist only of single short self-hybridized oligos, barcoding
adapters for PTS are cheap to synthesize and can be reused in
subsequent experiments.
Another method has recently been introduced for parallel
sequencing small RNAs20. It involves stepwise single-stranded
ligation of universal adapters to both ends of the RNA molecules.
Barcoding is then achieved by re-amplification with tagged PCR
primers. Although this method may be suitable for barcoding small
RNAs, the protocol is very complex and has yet to be applied to
dsDNA samples.

`Limitations of the method
`One limitation of the method arises through the use of a restriction
`enzyme. With its GC-rich 8 bp recognition sequence, Srf I is a rare
`cutter in mammalian genomes, with restriction sites approximately
`only every 150 kb in the human genome. However, it may cut more
`
`be measured and adjusted before barcoding (after Step 1).
`Although requiring very little material, the latter strategy yields
`less homogeneous sequence representation, as handling variation
`may cause different recoveries during the purification steps.
`For shotgun sequencing long-range PCR products, we generally
`recommend using long-range PCR kits from Roche (Expand Long-
`Range dNTPack, Expand 20kbPlus PCR Systems), which in our
`hands yield superior results as compared with other suppliers.
`
`Choosing a tag length. To avoid falsely assigning sequences to their
`respective samples, not all possible tags of a certain length should be
`used, as single substitutions attributable to sequencing errors could
`convert one tag into another. A tag length of 6 nt produces only 72
`different tags that are at least two substitutions apart, and 21 different
`tags that are at least three substitutions apart. These numbers are 173
`and 52 for 7-nt tags, and 475 and 130 for 8-nt tags, respectively. With
`a minimal distance of two substitutions between 6-nt tags, we
previously estimated a false-assignment rate of ~0.35% on the GS
20 platform10. Using the new GS FLX platform and 7-nt tags, this
number drops to 0.03% at a minimal distance of two, and <0.01%
at a minimal distance of three substitutions (M.M.,
unpublished data). However, these numbers should be considered
rough estimates only, as they may vary among runs and with the
purity of oligos. We recommend independently estimating the
false-assignment rate in each experiment (see Box 1 and QUALITY
CONTROL, below). Since on the GS FLX platform the read length
has increased to ~250 bp compared with 100 bp on the previous GS
20 system, tag lengths of 7 or 8 nt do not significantly reduce the
`amount of usable sequence data obtained by PTS, but provide the
`opportunity to sequence hundreds of samples in parallel at extremely
`low false-assignment rates.
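
The idea of choosing tags that are at least a given number of substitutions apart can be illustrated with a short Python sketch. It is not the procedure used to derive the tag counts quoted above: it applies the tag rules from Figure 1b (first base A or T, last base C or G, and, as an assumed reading of "no homopolymers", no two identical adjacent bases) and then picks tags greedily by Hamming distance. A greedy pass does not guarantee the largest possible set, so its counts need not match the numbers given in the text.

```python
from itertools import product


def is_valid_tag(tag):
    """Tag rules from Figure 1b; 'no homopolymers' is read here as
    'no two identical adjacent bases' (an assumption)."""
    return (
        tag[0] in "AT"
        and tag[-1] in "CG"
        and all(a != b for a, b in zip(tag, tag[1:]))
    )


def hamming(a, b):
    """Number of substitutions separating two tags of equal length."""
    return sum(x != y for x, y in zip(a, b))


def greedy_tag_set(length, min_dist):
    """Greedily collect valid tags that are pairwise >= min_dist apart."""
    chosen = []
    for bases in product("ACGT", repeat=length):
        tag = "".join(bases)
        if is_valid_tag(tag) and all(hamming(tag, t) >= min_dist for t in chosen):
            chosen.append(tag)
    return chosen


tags = greedy_tag_set(length=6, min_dist=3)
print(len(tags), tags[:5])
```

With a minimum distance of three, any single substitution error leaves a read closer to its true tag than to any other tag in the set, which is why larger distances lower the false-assignment rate.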
`
`Coverage requirements and sequencing strategy. An issue that
`requires careful evaluation before PTS is begun is the amount of
`sequence coverage required. For shotgun sequencing, in our experi-
`ence 10- to 20-fold average coverage is sufficient for re-sequencing
`mono-allelic sequences, such as mitochondrial genomes. Indel
`sequencing errors around homopolymers can be eliminated by
`comparison to a reference sequence. If no closely related sequences
are available for comparison, higher sequence coverage (~30-fold)
`is preferred for obtaining sequences with a low indel rate. If nuclear
`sequences with two potential alleles are sequenced, higher coverage
`is necessary to ensure that both alleles are detected in heterozygous
`samples. In general, the coverage requirements must be chosen
`according to the specific needs of a study.
`While estimating the coverage requirements for a study, it is
`important to understand that sequence representation is approxi-
`mately normally distributed among samples. Whereas most sam-
`ples will be covered by the desired number of sequence reads, a few
`samples will be covered higher or lower. For samples quantified
`after barcoding (see SAMPLE REQUIREMENTS, above), we
usually observed at maximum ~50% deviation from the mean
`coverage (Fig. 2 and M.M., unpublished data). This can be
`compensated for with either higher coverage sequencing a priori,
`or by subsequently filling in sequences from under-represented
`samples in an additional run on a small plate region. For large-scale
`projects, where one or several full runs will be completed, the
`sequencing resources can be optimally exploited by initially
sequencing part of the sample pool on only a small plate
`
`
`often in GC-rich bacterial genomes. As the 454 universal adapters
`are added after the Srf I restriction step, only sequence coverage
`immediately around an Srf I site is lost, and gaps can be filled with
`single Sanger sequence reads. If sequencing templates are known to
`contain Srf I sites, it is possible to enzymatically methylate all Srf I
`sites before adapter ligation using CpG methyltransferase according
`to the supplier’s protocol (http://www.neb.com). Owing to the
`inability of Srf I to cut CpG methylated restriction sites, this should
`effectively mask all restriction sites, thereby eliminating restriction
`occurring within template molecules.
`Another important issue is the quality of the resulting sequences.
`With regard to substitutional errors, well above 99.99% accuracy
`was consistently reported for shotgun consensus sequences on the
`GS 20 platform3,10,21,22, and 99.92% were estimated for single reads
`in a recent study23. However, single base pair insertions and
`deletions (indels) occur with considerable frequency both within
`and around homopolymer regions, often persisting even at high
`coverage. In shotgun consensus sequences of human mitochondrial
`genomes10, we recently observed indel errors at a frequency of
`0.27%, although previous estimates from shotgun consensus
`sequences were about ten times lower3,21,22. However, all current
`estimates should be considered with caution, as the error rate varies
`among different versions of the 454 assembly programs newbler
`and runMapper (see ref. 21, and M.M., unpublished data), and may
`also differ among runs. As single base pair indels can be identified
`as frame-shift mutations in coding sequences or by comparison to
`closely related sequences, they usually have no practical impact on
`sequence usability.
`
`Experimental design
`Several points should be considered before large-scale sequencing
`projects are performed using PTS.
`
Sample requirements. In principle, every double-stranded
nucleic acid sample with natural 5′-ends (hydroxyl or phosphate)
`is a suitable template for parallel sequencing using PTS. However,
`there are upper and lower limits on template size. The upper limit is
`defined by the 454 process, as the maximum read lengths obtained
on the GS 20 and GS FLX platforms are ~100 and 250 bp,
`respectively. In addition, fragments of 800 bp or more amplify
`poorly in emulsion PCR. Adequate fragment length distributions,
`for example, for shotgun sequencing, can be achieved by DNA
`shearing. The lower size limit is introduced through the SPRI bead
`purification steps in the PTS protocol. SPRI bead purification24 is
quick and efficient, but does not recover molecules <80–100 bp. If
`shorter molecules need to be sequenced (50–100 bp), all SPRI
`purification steps can be replaced by MinElute Spin column
`purification (Qiagen), using the same elution volumes and buffers
`without additional changes.
`The minimal material requirements for PTS are very low. As 454
`sequencing is possible from picogram amounts of DNA25, and
there are no significant losses in the tagging protocol, <1 ng of
`initial material per sample is theoretically sufficient. However, the
`sequence representation of each barcoded sample depends on its
`relative concentration in the sample pool, and is thus affected by the
`accuracy of DNA quantification. For optimal results, we recom-
`mend measuring DNA concentration after barcoding (Step 18).
`This strategy requires at least 100 ng of initial material for Pico-
`Green quantification. If less material is available, DNA amounts can
`
`
`BOX 1 | ESTIMATING THE FALSE-ASSIGNMENT FREQUENCY
`
`The reliability of PTS should be independently evaluated in each experiment. This can be achieved by estimating the false-assignment
`frequency, that is, the frequency at which false assignment of sequences to their sample origin is expected, based on the occurrence of sequence
`reads that carry tag sequences from unused barcoding adapters. The ability to detect false assignment improves as the number of barcoding
`adapters that remain unused in an experiment increases.
$$\text{False-assignment frequency} = \frac{F}{T} \times \frac{N}{A - N}$$
`F, Number of sequences carrying tags from unused barcoding adapters
`T, Total number of sequence reads obtained in the experiment
`N, Total number of barcoded samples that were sequenced in parallel
`A, Total number of barcoding adapters within the chosen category that have actually been synthesized (e.g., 52 if all 7-bp tags differing by at
`least three substitutions were synthesized)
`The formula is based on the assumption that all tags can be converted into one another with the same probability. It is a composite estimate
`of false assignment that occurs due to sequencing errors and cross-contamination of barcoding adapters during synthesis and dilution. It does
`not consider the possibility that cross-contamination is introduced while preparing samples for PTS or setting up the blunt-end repair and
`ligation reactions (Steps 1–10) and is unlikely to detect punctual contamination. Thus, careful pipetting is strongly advised.
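
The Box 1 estimate can be expressed as a small Python function. This sketch reads the formula as (F/T) × N/(A − N), that is, the observed rate of reads carrying unused tags, scaled from the A − N unused adapters to the N adapters in use. The function and argument names are hypothetical, and the second example uses made-up numbers purely for illustration.

```python
def false_assignment_frequency(unused_tag_reads, total_reads, n_samples, n_adapters):
    """Estimate the false-assignment frequency as (F / T) * N / (A - N).

    unused_tag_reads -- F, reads carrying tags from unused adapters
    total_reads      -- T, all reads obtained in the experiment
    n_samples        -- N, barcoded samples sequenced in parallel
    n_adapters       -- A, adapters synthesized within the chosen category
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if n_adapters <= n_samples:
        raise ValueError("at least one adapter must remain unused")
    return (unused_tag_reads / total_reads) * (n_samples / (n_adapters - n_samples))


# Figure 2 experiment: no reads with unused tags among 14,780 reads,
# 14 samples, 40 adapters synthesized
print(false_assignment_frequency(0, 14_780, 14, 40))

# Invented numbers for illustration: 3 unused-tag reads in 100,000 reads,
# 40 of 52 adapters in use
print(false_assignment_frequency(3, 100_000, 40, 52))
```

The guard against A ≤ N reflects the point made in the box: the estimate is only possible when some barcoding adapters are left unused.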
`
`region. Subsequently, samples can be re-pooled according to the
`observed sequence representation, thereby guaranteeing optimal
`sequence representation among samples during the large-scale
`sequencing phase.
`
`Quality control. Finally, as the assignment of tags to the correct
`sample source is critical, some quality control should be performed.
`The two major factors leading to false assignment of sequences are
`cross-contamination of barcoding adapters and sequencing errors.
`By testing a subset of barcoding adapters in a small-scale experiment
`before large-scale adoption of PTS, it is possible to monitor whether
`cross-contamination of adapters occurred during synthesis or
`dilution. If cross-contamination goes undetected, misassignment
`of sequences will occur. Moreover, in such a small-scale preliminary
`experiment, the occurrence of tag sequences from unused adapters
`can be monitored and used to estimate the false-assignment rate due
`to sequencing errors and/or errors occurring during oligo synthesis.
`By randomly omitting a small subset of barcoding adapters used in a
`
`study, the same quality control is advised for every experiment using
`PTS (Box 1). As all molecules carry sequence tags on both ends,
`another independent, albeit less stringent, quality check can be
performed by comparing the 5′ and 3′ tag sequences in reads where
the ends of molecules have been reached. The repeated occurrence
of identical false tag pairs indicates that cross-contamination
persists among barcoding adapters or was introduced during
pipetting. It should be noted, however, that the identification of
3′ tag sequences is less reliable due to the higher sequencing error
rate near the ends of reads and possible misidentification of 3′
adapter sequence starting points. In addition, for many samples
with relatively long mean fragment sizes, such as shotgun libraries,
the majority of sequences will terminate before the end of the
molecule and the 3′ adapter are reached.
`When combining PTS with physical separation of the sequencing
`plate, avoid sequencing different libraries containing the same
`sequence tags on neighboring regions, as occasionally leakage of
`sequencing beads occurs.
`
`MATERIALS
`REAGENTS
`. T4 DNA polymerase (Fermentas, cat. no. EP0062)
`. T4 polynucleotide kinase (Fermentas, cat. no. EK0032)
. T4 ligase (Fermentas, cat. no. EL0331), including 50%
PEG-4000 solution and 10× ligation buffer
. Bst DNA polymerase, large fragment (NEB, cat. no. M0275S), including
10× ThermoPol buffer
. Calf-intestine phosphatase (NEB, cat. no. M0290S), including
10× NEBuffer 3
. Srf I restriction enzyme (Stratagene, cat. no. 501064), including
10× universal buffer
. 10× Buffer Tango (Fermentas, cat. no. BY5)
`. ATP (Fermentas, cat. no. R0441), 100 mM stock solution
`. dNTPs (GE Healthcare, cat. no. US77119-500UL), 25 mM each
. BSA (Sigma-Aldrich, cat. no. B4287), powder for preparation of a 10 mg
ml⁻¹ stock solution in water
`. Water, HPLC-grade (Sigma, cat. no. 270733)
`. Absolute ethanol (Merck, cat. no. 1.00983.2500)
`. TE buffer (many suppliers or self-made); 10 mM Tris–HCl, 0.1 mM EDTA,
`pH 8.0
`. EB buffer (supplied with MinElute PCR Purification kit); 10 mM Tris–HCl,
`pH 8.5
`. DNA-loading dye (Fermentas, cat. no. R0611)
`
`. Ethidium bromide (Sigma, cat. no. 46067), 1% solution ! CAUTION
`Mutagen and potential carcinogen.
. TBE electrophoresis buffer (Sigma, cat. no. 51309), 10× concentrate
`. MinElute PCR purification kit (Qiagen, cat. no. 28006)
`. PicoGreen dsDNA quantitation reagent (Invitrogen, cat. no. P11495)
. Oligonucleotides (Metabion), desalted, lyophilized. Sequences for sets of
barcoding oligos with 6–8-nt tags are available in Supplementary Table 1
online (see also REAGENT SETUP) ▲ CRITICAL Basic post-synthesis
purification (desalting) is sufficient. Additional purifications, such as HPLC
or PAGE, increase the risk of cross-contaminating oligonucleotides. The
oligos should be synthesized at a scale large enough to suffice for several
rounds of PTS.
`. AMPure SPRI PCR purification kit (Agencourt, cat. no. 000130)
. GS DNA Library Preparation Kit (Roche, cat. no. 04852265001), including
nebulizers and nebulization buffer ▲ CRITICAL Only 10 nebulizers and
20 ml nebulization buffer are supplied. Additional nebulizers can be
obtained from Graham-Field (cat. no. BF61402). Nebulization buffer consists
of 53.1% glycerol, 37 mM Tris–HCl, 5.5 mM EDTA, pH 7.5 (ref. 3).
`EQUIPMENT
`. 96-Well PCR plates
`. Multichannel reagent basin (e.g., Thermo Scientific, cat. no. 9510027)
`. Filter tips
`. SPRIPlate 96R—Ring Magnet Plate (Agencourt, cat. no. 000219)
`
`
`. Agarose gel electrophoresis unit
`. NanoDrop spectrophotometer (NanoDrop Technologies)
`. Stratagene MX 3005P QPCR System or any other qPCR system or
`microplate reader suitable for PicoGreen fluorescence measurements
`. GS 20/GS FLX Genome Sequencer and associated equipment
`. Software for data analysis (available with accompanying usage instructions
`at http://bioinf.eva.mpg.de/pts/)
`REAGENT SETUP
Preparing barcoding adapters Barcoding adapters consist of single self-hybridized
oligos. Dissolve the oligos in TE to obtain 500 µM stock solutions.
Create barcoding adapters by setting up separate reactions in PCR tubes
containing, in final concentrations, 400 µM of one oligo and 1× T4 ligase buffer.
Incubate in a thermal cycler with a temperature profile of 95 °C for 10 s and a
ramp to 25 °C at a rate of 0.1 °C s⁻¹. Immediately freeze the barcoding adapters
and store them at −20 °C until further use ▲ CRITICAL The barcoding adapters may be
prepared and stored in PCR strip tubes or plates to allow for subsequent
handling with multichannel pipettes. However, be extremely careful
not to cross-contaminate the adapters during preparation or later handling.
Preparation of aliquots may be useful for minimizing the number
of handling cycles.
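
The adapter preparation above comes down to a dilution (stock to final oligo concentration in a 5:4 ratio, ligase buffer from its 10-fold concentrate to 1-fold) and a temperature ramp. The Python sketch below works through that arithmetic. The 10 µl reaction volume, the function names and the generic "diluent" used to top up the reaction are assumptions for illustration and are not specified in the protocol.

```python
def adapter_mix_volumes(total_ul, oligo_stock, oligo_final,
                        buffer_stock_x=10.0, buffer_final_x=1.0):
    """Volumes (in µl) for one adapter annealing reaction.

    oligo_stock and oligo_final must be in the same concentration unit.
    The remaining volume is reported as 'diluent'; the protocol text does
    not say what it should be, so that part is an assumption.
    """
    oligo_ul = total_ul * oligo_final / oligo_stock
    buffer_ul = total_ul * buffer_final_x / buffer_stock_x
    diluent_ul = total_ul - oligo_ul - buffer_ul
    if diluent_ul < 0:
        raise ValueError("requested concentrations do not fit in the total volume")
    return {"oligo_stock": oligo_ul, "ligase_buffer": buffer_ul, "diluent": diluent_ul}


def ramp_duration_s(start_c, end_c, rate_c_per_s):
    """Time in seconds needed to ramp between two temperatures at a fixed rate."""
    return abs(start_c - end_c) / rate_c_per_s


# Hypothetical 10 µl reaction: stock 500, final 400 (same unit), 10x buffer
print(adapter_mix_volumes(10.0, 500, 400))

# Ramp from 95 to 25 degrees C at 0.1 degrees C per second
print(ramp_duration_s(95, 25, 0.1))
```

Scaling `total_ul` changes all three volumes proportionally, which is convenient when preparing adapters in strip tubes or plates.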
`
`Generating a positive control template To control the efficiency of the tagging
`reactions, we strongly recommend carrying a PCR product as a control template
`alongside the samples for sequencing. Any PCR product producing a single
`distinct band of a size between 100 and 200 bp is suitable, as long as it is
`generated with Taq polymerase and unmodified PCR primers. Purify the PCR
product using a MinElute spin column and adjust it to a concentration of 25 ng
µl⁻¹. As 24 µl is needed for one control experiment, several PCRs
