`http://genomebiology.com/2011/12/1/R1
`
`Open Access
`
METHOD
`A scalable, fully automated process for
`construction of sequence-ready human exome
`targeted capture libraries
`Sheila Fisher1, Andrew Barry1, Justin Abreu1, Brian Minie1, Jillian Nolan1, Toni M Delorey1, Geneva Young1,
`Timothy J Fennell1, Alexander Allen1, Lauren Ambrogio1, Aaron M Berlin2, Brendan Blumenstiel3, Kristian Cibulskis3,
`Dennis Friedrich1, Ryan Johnson1, Frank Juhn4, Brian Reilly1, Ramy Shammas1, John Stalker1, Sean M Sykes2,
`Jon Thompson1, John Walsh1, Andrew Zimmer1, Zac Zwirko1,4, Stacey Gabriel2, Robert Nicol1, Chad Nusbaum2*
`
`Abstract
`Genome targeting methods enable cost-effective capture of specific subsets of the genome for sequencing. We
`present here an automated, highly scalable method for carrying out the Solution Hybrid Selection capture
`approach that provides a dramatic increase in scale and throughput of sequence-ready libraries produced.
`Significant process improvements and a series of in-process quality control checkpoints are also added. These
`process improvements can also be used in a manual version of the protocol.
`
`Background
`The cost of DNA sequencing continues to fall, driven by
`ongoing innovation in sequencing technology [1-4]. As a
`result, it has recently become feasible to sequence non-
`trivial numbers of whole human genomes [3,5-10].
`Many more such projects are planned and commercial
`genome sequencing services are now becoming available
`[11,12]. At the same time, there is growing interest in
`sequencing specific portions of genomes, and several
`affordable methods for sample preparation of targeted
`regions have been recently published [13-17]. Key appli-
`cations for targeted approaches include sequencing of
`exons or sets of protein-coding genes implicated in spe-
`cific diseases [18-21], whole human exome sequencing
`(for example, in cancer or disease cohorts) [22-24]
`(reviewed in [25]), and resequencing of specific regions
`as a follow-up to genome-wide association studies [26].
`The economics of whole exome sequencing have made
`targeted enrichment approaches an attractive option for
`discovery of rare mutations in a variety of diseases as
`the price tag is substantially lower than for sequencing
an entire human genome. For example, using list prices
and including the targeted capture step, the all-in cost
of sequencing a whole exome (roughly 30 Mb) is
13-fold less than for the whole genome (Table S1a in
Additional file 1). This translates directly into a budget
that can include more than ten times as many samples,
greatly increasing the statistical power of the data to be
generated. The effect is even greater for smaller sequencing
targets, which further scale down the required
sequencing, although costs of targeting scale down more
slowly. Ultimately, as long as the expense of the
required sample preparation does not dominate, targeting
will continue to be a cost-effective approach. To
date, however, no targeting method has been described
that can handle the many thousands of samples that are
becoming available. To fill this need, we set out to
develop such a method.

* Correspondence: chad@broadinstitute.org
2Genome Sequencing and Analysis Program, Broad Institute of MIT and
Harvard, 320 Charles Street, Cambridge, MA 02141, USA
Full list of author information is available at the end of the article

`Solution hybrid selection (SHS), developed by Gnirke
`et al. [14], was created as a tool to cheaply and effec-
`tively target multiple regions in the genome in a way
`that is compatible with next generation sequencing tech-
`nologies (Figure 1). The published protocol performs
`well in terms of efficiency of enrichment (selectivity),
`reproducibility, evenness of coverage, and sensitivity to
`detect single-base changes [14]. Using this method, a
`single technician can process six samples simultaneously
`from genomic DNA to sequence-ready library in
approximately one week. This process was designed
purely as a series of liquid handling steps and incubations,
with the specific intention of making it amenable
to scale up and automation. Given the demonstrated
success of this and other methods, demand for targeted
sequencing has increased sharply. To accommodate the
increased demand, keep costs down, and limit the
requirements for human labor, we have adapted SHS to
an automated high-throughput process. This SHS
method includes improvements designed to increase the
efficiency of the target selection process through optimization
of reactions and automation of the library and
capture procedures using liquid handling robots. Several
aspects of this method, in particular the ‘with-bead’
sample preparation method, are amenable to sample
preparation steps for a range of next generation sequencing
applications, including alternative in-solution and
solid-phase capture strategies.

© 2011 Fisher et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons
Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in
any medium, provided the original work is properly cited.

Fisher et al. Genome Biology 2011, 12:R1

[Figure 1 image not reproduced. Panel (a), Generation of RNA bait capture probes: elute, amplify, add T7 promoter; biotin-UTP transcription; biotinylated bait probes. Panel (b), Solution hybrid selection: adaptored pond of targets; solution hybridization; bead capture; amplify by off-bead PCR; sequence. Key: universal amplification sequences; RNA bait probes; T7 RNA polymerase promoter; Illumina sequencing adaptors; biotin-labeled UTP; streptavidin bead.]

Figure 1 Overview of the hybrid selection method. Two specific sequencing targets and their respective capture baits are indicated in blue
and red. (a) Generation of RNA bait capture probes. 150mer oligos are synthesized on array in batches of 55,000 and cleaved off. They are
made double stranded by PCR amplification and tailed with a T7 RNA polymerase promoter, and RNA capture baits are made by transcription in
the presence of biotinylated UTP. (b) Solution hybrid selection. RNA baits (from the top line) are mixed with a size selected pond library of
fragments modified with sequencing adaptors. Hybridized fragments are then captured to streptavidin beads and eluted by the with-bead
protocol for sequencing. See text for details.

`To support high-throughput SHS for targeted sequen-
`cing, we set out to devise a laboratory process that
`would handle very large numbers of samples in parallel
`for targeting and preparation of sequence-ready libraries
`at a low cost per sample. This process was designed to
`carry out whole exome targeting but also yields good
`results in targeting subsets of genes or regions for rese-
`quencing. Results described here come from whole
`exome targeting using the Agilent SureSelect Human
`All Exon v2 kit, which is a commercially available imple-
`mentation of the optimized capture reagent we have
`described previously [14].
`A number of challenges were overcome in developing
`a robust, automated, and highly scalable process for
`selection of exomes and other targets. Beyond the need
`for processing large numbers of samples, modifications
`of the protocol were made to achieve or maintain the
`following: elimination of manual, agarose gel-based size
`selection, which has now been replaced by fully auto-
`mated, bead-based steps; high selectivity, with a high
`number of sequenced bases on or near the target region
`of interest; evenness of sequence coverage among cap-
`tured targets, avoiding highly overrepresented targets
`and dropouts; high library complexity, or low molecular
`duplication, so that libraries contain large numbers of
`unique genome fragments; reproducibility, so that per-
`formance of the process is highly predictable; low cost
`of the targeting process relative to sequencing; detailed
`process tracking to reduce errors and provide sample
`history; quality control checkpoints built into the
`process to identify poor performers prior to sequencing;
`and limited human labor.
`We present here a scalable, automated SHS method
`that operates at a throughput far higher than achieved
`by other methods. The process can also be carried out
`by hand using a multichannel pipetter. This method has
`not only been scaled but also optimized to improve
`selectivity and evenness of target coverage and to mini-
`mize artifactual duplication to consistently deliver
`greater than 94% of the alignable exome (Additional file
`2). The automated protocol has a capacity to process
`over 1,200 SHS samples in less than a week with four
`technicians (one technician can generate 1,200 pond
`libraries per week, and three technicians can each gener-
`ate 384 SHS captures per week). For ease of explanation,
`we employ a fishing-based terminology in SHS, where
`the biotinylated RNA capture reagent is referred to as
`the ‘bait’, the genomic DNA library from which targets
`are captured as the ‘pond’ in which we are ‘fishing’, and
`the DNA targets from the pond that are captured by the
`bait are referred to as the ‘catch’.
`
`Results
`Building a high-throughput solution hybrid selection
`process
`SHS is a method used to selectively enrich for regions
`of interest within the human genome [14] (Figure 1).
`Briefly, a library (or ‘pond’) of adapter-ligated fragments
`of randomly sheared DNA is hybridized to biotinylated
`RNA (or ‘baits’) that are complementary to the target
`sequences. Hybridized molecules (the ‘catch’) are then
`captured using streptavidin-coated beads. Once the cap-
`tured DNA fragments are PCR amplified off the capture
`reagent, they are available to be sequenced using next
`generation sequencing technologies. The standard SHS
`protocol was redesigned from a manual, bench scale
`process to an automated process, in much the same
`way as our recent work to scale library construction for
`454 sequencing [27], and is capable of far greater
`throughput than demonstrated for other methods
`(Additional file 2). A series of process innovations were
`required to facilitate reimplementation of this process at
`large scale. In particular, all manual pipetting steps were
`converted to automation-amenable liquid handling
`steps, and these liquid handling steps were extensively
`optimized to maximize yield efficiency. As part of this,
`the electrophoretic size selection step has been replaced
`by fully automated bead-based sizing. Other optimiza-
`tions are described below. Table 1 shows a comparison
`of the original published method and the new protocol
`with a description of each step and the improvements
`in the new method. Table 2 describes a set of key
`sequencing metrics by which we measure SHS process
`performance.
`
`The automated SHS process is implemented on the
`Bravo liquid handling workstation (Agilent Automation
`Solutions), a commercially available small-footprint,
`liquid handling platform, but can be implemented on
`many commercially available liquid handlers. The pro-
`cess can also be carried out manually using a multichan-
`nel pipette. An overview map of the process can be
`found in Additional file 3 and the manual protocol ver-
`sion can be found in Additional file 4.
`
`Optimization of acoustic shearing
`The process begins with fragmentation of genomic DNA
`using the Covaris E210 adaptive focused acoustics
`instrument. Maximizing the yield of DNA fragments in
`the desired size range is a key step in minimizing overall
`sample loss. The Covaris E210 instrument focuses
`acoustic energy into a small, localized zone to create
`cavitation, thereby producing breaks in double-stranded
`DNA. A number of variables control mean fragment
`length and distribution, including duty cycle, cycles per
`burst, and time. The Covaris adaptive focused acoustics
`system has several advantages over other methods such
`as nebulization or hydrodynamic force. First, DNA is
`sheared in a small closed environment and is not
`handled in large volume vessels or in tubing, greatly
`reducing sample loss. Second, the closed, independent
`vessels greatly reduce sample cross-contamination.
`Third, the Covaris machine can operate automatically
`on up to 96 samples per run, eliminating significant
`sample handling labor and eliminating shearing as a
process bottleneck. Fourth, improvements to the shearing
protocol, in combination with removal of small fragments
in subsequent bead-based cleanup steps (see
below), eliminate the need to size select and extract
samples from agarose gels, a critical bottleneck in the
overall process.
`Shearing performance was extensively optimized for
`increased sample yield, narrower insert size distribution,
`and robust and reproducible handling of large numbers
`of samples in parallel. Optimizations focused on the fol-
`lowing factors: shearing volume, tube type, elimination
`of tube breakage, shearing pulse time, water degassing,
`and positioning of tubes in the water bath (see Materials
`and methods for details). In order to accommodate
`automated handling of the samples, volumes were
`reduced from 100 μl to 50 μl without any effect on
`shearing profiles or sample loss (Additional file 5).
`Importantly, proper fit of the shearing rack (Covaris,
`catalogue number 500111) into custom adapters (see
`Additional file 6 for CAD drawing) prevents movement,
`allowing transfers to occur via automated liquid hand-
`ling. In addition, specific tubes available from Covaris
`(Covaris, catalogue number 500114) virtually eliminated
`the problem of tube breakage. Only a single sample in
the most recent 5,000 processed suffered a broken tube.
Through a systematically designed and controlled set of
experiments, optimal pulse time parameters were chosen
to provide a mean fragment length of 150 bp with a
range of 75 to 300 bp (Materials and methods). Additional
file 5 shows the contrast between unoptimized
and optimized size profiles of sheared DNA. In addition
to regular maintenance, careful degassing of the water
bath and proper water levels are critical for reproducible
results. In a nondegassed water bath dissolved oxygen
reduces cavitation and disperses energy, reducing shearing
efficiency.

Table 1 Comparison of standard versus improved solution hybrid selection methods
Process step | Standard method (manual standard SHS protocol) | Drawbacks | Improved method (automated improved SHS protocol) | Advantages
Shearing of genomic DNA | Covaris S2 | Single sample | Optimized Covaris E210 | Multi-sample, improved yield, tight size range
Enzymatic cleanups | Individual spin columns | Low throughput, 50 to 60% recovery, manual | ‘With-bead’ SPRI | High throughput, 80 to 90% recovery, automated
Solution hybrid selection capture | Manual, column-based | Labor intensive (6 samples/FTE/week) | Fully automated | Walkaway, high throughput (1,200 samples/4 FTE/week)
Final PCR enrichment | Denature, followed by PCR | Sample loss through transfers | Direct ‘off-bead’ PCR | Improved final yield
In process quality control checkpoints | Agilent Bioanalyzer | Limited visibility until sequence results | Many | In process results: key predictors of sample, library and sequencing quality

FTE, full time employee; SHS, solution hybrid selection; SPRI, solid phase reversible immobilization.
`
`Modified bead-based cleanups enable scale-up to 96 wells
`A key requirement in scaling SHS was to implement
`processing of samples in a standard 96-well microtiter
`plate. This was facilitated by development of a novel
`modification to solid-phase reversible immobilization
`(SPRI) magnetic bead reaction cleanup methodology
`[27,28] we have termed ‘with-bead’ SPRI (Figure 2),
which is highly scalable due to its amenability to liquid
handling automation. Implementation of with-bead SPRI
in SHS offers significant advantages. First, it replaces
single tube spin-column-based cleanups with liquid
handling-compatible magnetic bead-based cleanups; second,
it enables selection of molecular weight ranges,
eliminating the need for agarose gel-based sizing; third,
it simplifies the process by allowing elimination or combining
of several steps, which results in a higher overall
DNA yield.

Table 2 Automated solution hybrid selection performance
Performance factor | 3 μg input average (n = 1,117 exomes)
Median target coverage | 131.0×
Percentage bases > 2× | 96.0%
Percentage bases > 10× | 91.9%
Percentage bases > 20× | 87.6%
Percentage selected bases (on target) | 83.7%
Percentage duplicated reads | 4.4%
Fold 80 penalty (a) | 3.17
Estimated library size of captured fragments | 278 million

See Additional file 12 for metric definitions. (a) Fold 80 penalty is a measure of
the non-uniformity of sequence coverage, defined as the amount of
additional coverage (in fold coverage of the genome) required so that 80% of
the target bases will be covered at the current mean coverage (see Additional
file 12 for details).

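The coverage metrics reported in Table 2 can be computed from per-base depth over the target. The sketch below is illustrative only: the function names and the toy depth list are assumptions, and the fold 80 penalty is computed as mean depth divided by the 20th-percentile depth, which is one common way to operationalize the definition in the table footnote and not necessarily the exact calculation given in Additional file 12.

```python
from statistics import mean


def pct_bases_above(depths, threshold):
    """Percentage of target bases with depth strictly greater than threshold."""
    return 100.0 * sum(d > threshold for d in depths) / len(depths)


def fold_80_penalty(depths):
    """Mean depth divided by the depth that at least 80% of bases reach."""
    ordered = sorted(depths)
    # Element at index n // 5 is a simple 20th-percentile depth:
    # at least 80% of the bases have a depth equal to or above it.
    p20 = ordered[len(ordered) // 5]
    if p20 == 0:
        return float("inf")
    return mean(ordered) / p20


# Hypothetical per-base depths for a ten-base target.
depths = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(pct_bases_above(depths, 20))
print(round(fold_80_penalty(depths), 2))
```

A perfectly uniform target gives a fold 80 penalty of 1.0; larger values indicate that more sequencing is needed to bring the poorly covered bases up to the mean.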
`The innovation of the with-bead SPRI method is as
`follows. Rather than employing a series of discrete
`cleanup steps in the library construction process, the
`cleanups are effectively integrated. The SPRI beads are
`added to the sample after the shearing step, and remain
`in the reaction vessel throughout the sample preparation
`protocol. By allowing each cleanup step to employ the
`same beads, the with-bead method greatly reduces the
`number of liquid transfer steps required. The ‘cleaned
`up’ DNA is then eluted at the conclusion of the process.
`This methodology increases the overall DNA yield
`(Figure 3), primarily because it allowed us to eliminate
`six of the ten sample transfer steps, avoiding the loss of
`DNA sticking to the sides of the vessel or loss of
`volume in pipetting. Briefly, following each process step,
`DNA is selectively bound to the iron beads, already pre-
`sent, through the addition of a 20% polyethylene glycol
`(PEG), 2.5 M NaCl buffer. The mixture is placed on a
`magnet, which pulls the beads and bound DNA to the
`sides of the well so that the reagents, washes and/or
`unwanted fragments can be removed with the superna-
`tant. Molecular weight exclusion, which is essentially a
`size selection, of unwanted lower molecular weight
`DNA fragments can be controlled through the volume
`of the PEG NaCl buffer that is added to the reaction,
`changing the final concentration of PEG in the resulting
`mixture and altering the size range of fragments bound
`to the beads [27,28]. DNA fragments that have been
`cleaned or size selected are eluted from the beads, ready
for the next step; however, the eluate is not transferred
into a new reaction vessel. Rather, the reagents for the
next step are added directly to the reaction vessel containing
samples and beads. The presence of beads does
not interfere with any of the steps in the process
(Table 3). This with-bead protocol has greatly increased
the number of unique fragments entering the pond PCR
step, increasing the complexity of libraries made by
roughly 12-fold (Table 3).

[Figure 2 image not reproduced. Labels: sheared DNA; add SPRI beads; place on magnet; remove supernatant; remove from magnet; elute DNA from beads; reaction cycle of 1. End repair, 2. A-base addition, 3. Adaptor ligation, each followed by add PEG buffer; 4. PCR enrichment; hybridization reaction.]

Figure 2 With-bead SPRI method for pond library construction. SPRI magnetic beads are added to the sheared DNA sample. DNA is
selectively bound to SPRI beads, which are immobilized when the sample plate is placed on a magnet, leaving other molecules in the liquid
phase. The liquid phase is removed and discarded. The sample plate is then removed from the magnet and DNA is eluted from the beads.
Library construction master mixes are then added to eluant/bead solution. The DNA and SPRI beads then pass through three cycles of reaction,
binding to beads (in the presence of polyethylene glycol (PEG)/NaCl solution) and cleanup/washing. The cycles carry out end repair, A-base
addition and adaptor ligation, respectively. A final elution is then followed by PCR amplification.
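To illustrate how the volume of added buffer sets the final PEG concentration in the with-bead cleanups, the sketch below applies simple dilution arithmetic to the 20% PEG, 2.5 M NaCl buffer described above. The 50 μl sample volume and the buffer volumes are hypothetical inputs, and the code assumes the sample contributes no PEG or NaCl of its own; carry-over from earlier additions in the same well would shift these values. It does not attempt to predict which fragment sizes bind at a given concentration.

```python
def final_concentrations(sample_ul, buffer_ul, peg_stock_pct=20.0, nacl_stock_m=2.5):
    """Return (PEG % w/v, NaCl M) after mixing buffer into the sample."""
    if sample_ul < 0 or buffer_ul <= 0:
        raise ValueError("volumes must be positive")
    fraction = buffer_ul / (sample_ul + buffer_ul)  # share of the mix that is buffer
    return peg_stock_pct * fraction, nacl_stock_m * fraction


# Hypothetical buffer volumes added to a 50 ul reaction.
for buffer_ul in (40, 50, 75, 100):
    peg, nacl = final_concentrations(50, buffer_ul)
    print(f"{buffer_ul:>3} ul buffer: {peg:.1f}% PEG, {nacl:.2f} M NaCl")
```

Adding more buffer raises the final PEG concentration, which is the lever the text describes for shifting the size range of fragments retained on the beads.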
`
`This increase in yield with the with-bead SPRI proto-
`col has the added benefits of reducing both the input
`DNA requirement to the process and the number of
`PCR cycles required. Efficient with-bead targeted cap-
`tures can be achieved with pond libraries made with as
`little as 100 ng of input DNA and six to eight cycles of
`PCR, a major improvement over the commercialized
`SHS method, which requires 3 μg of starting genomic
`DNA and 14 cycles (Table 3). We note here that PCR
cannot be completely eliminated because the efficiency
of adaptor ligation varies between samples, probably
because of variation in input DNA quality. PCR cycle
number was optimized to maximize the number of
unique fragments in the library while minimizing the
duplication rate (Additional file 7). This resulted in a
modest number of cycles that enriches fragments containing
an adapter at each end but not fragments with
either no adapters or an adapter at one end only. These
incomplete constructs compete with two-adapter fragments
in the hybridization reaction but cannot be
sequenced.

[Figure 3 image not reproduced. Axes: Output of pond library construction process (μg), 0 to 1.6; Recovery (%), 0 to 50. Categories: column cleanups; automated standard bead-based cleanups; automated with-bead SPRI cleanups. Recovery data labels as extracted: 47, 28, 23.]

Figure 3 Yield output from pond library construction methods. Data are shown left to right, for pond libraries constructed with three
methods: the widely used standard column-based cleanups [14], an automated implementation of standard bead cleanups and our
implementation of with-bead SPRI cleanups. Each library was constructed with 3 μg input of NA12878 genomic DNA, in triplicate. Bars: total
DNA output from pond library construction before PCR amplification. Blue diamonds: percentage recovery of input DNA for duplicates of 3 μg
of the same input DNA. With-bead-based cleanups increased the amount of DNA retained throughout library construction compared to the
standard column or SPRI cleanup methods.
`
`Pre-mixed reagents for automated library construction
`Currently available commercial library reagent kits are
`packaged for bench-level processing of eight to ten
samples. In order to accommodate the increase in scale
and automated processing of samples, large-scale
reagent kits were developed and optimized for the
high-throughput SHS pond construction process. All buffers
`and non-enzyme components are premixed and ali-
`quoted at volumes appropriate for 96 samples, including
`necessary dead volume. Prior to use, the premixed
`reagents only need to be thawed and placed on the deck
`where enzymes are added immediately before dispense
`into reaction plates. To accomplish this, we developed a
`custom reservoir in combination with optimized aspira-
`tion and dispense protocols. The custom reservoir is
`designed to limit dead volume, thereby minimizing the
`reagent volume required, thus reducing reagent waste.
`Details, including the dimensions of the reservoir, can
`be found in Additional file 8.
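The scaling of a per-reaction recipe to a 96-sample premix with dead volume can be sketched as follows. The component names, per-reaction volumes, and the 80 μl dead volume are hypothetical placeholders, not values from the protocol; the actual reservoir dimensions are in Additional file 8.

```python
def plate_premix(recipe_ul, n_samples=96, dead_volume_ul=0.0):
    """Scale per-reaction component volumes to one plate plus dead volume.

    The dead volume is spread across components in proportion to the recipe,
    so the premix keeps the same composition as a single reaction.
    """
    per_reaction = sum(recipe_ul.values())
    if per_reaction <= 0 or n_samples <= 0 or dead_volume_ul < 0:
        raise ValueError("volumes and sample count must be positive")
    scale = (per_reaction * n_samples + dead_volume_ul) / per_reaction
    return {name: volume * scale for name, volume in recipe_ul.items()}


# Hypothetical non-enzyme components of one reaction, in microliters.
recipe = {"buffer": 10.0, "dNTPs": 4.0, "water": 6.0}
for name, volume in plate_premix(recipe, dead_volume_ul=80.0).items():
    print(f"{name}: {volume:.0f} ul")
```

Keeping the dead volume as an explicit parameter makes the saving from a low-dead-volume reservoir easy to quantify per plate.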
`
Table 3 Performance comparison of manual versus automated solution hybrid selection
Factor | Column based | Automated (with-bead SPRI) | Automated (with-bead SPRI) low input
Input DNA | 3 μg | 3 μg | 0.1 μg
Samples/FTE/week | 6-12 | 384 | 384
Number of sample transfer steps | 10 | 4 | 4
Output DNA prior to PCR | 720 ng | 1,330 ng | Below limit of detection
Number of pond PCR cycles | 12-16 | 6 | 6
Percentage duplicated reads | 19.8 | 2.2 | 10
Percentage selected bases | 84.7 | 88.6 | 83.76
Estimated library size | 43 million | 516 million | 223 million

FTE, full time employee; SHS, solution hybrid selection; SPRI, solid-phase reversible immobilization.
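Table 3 reports estimated library size alongside the duplication rate. The metric definitions are deferred to Additional file 12, so the sketch below should not be read as the authors' stated calculation. It shows one widely used estimator based on the Lander-Waterman relationship C = X(1 - exp(-N/X)), where N is the number of read pairs examined, C the number of unique pairs observed, and X the unknown number of distinct molecules in the library. The function name and read counts are made up for illustration.

```python
import math


def estimate_library_size(read_pairs, unique_pairs):
    """Solve C = X * (1 - exp(-N / X)) for X by bisection."""
    if not 0 < unique_pairs <= read_pairs:
        raise ValueError("need 0 < unique_pairs <= read_pairs")
    if unique_pairs == read_pairs:
        return math.inf  # no duplicates observed, so the size is unbounded
    n, c = float(read_pairs), float(unique_pairs)

    def f(x):
        # Equals c/x - (1 - exp(-n/x)); positive below the solution, negative above.
        return c / x + math.expm1(-n / x)

    lo = hi = c  # the library holds at least as many molecules as were seen
    while f(hi) > 0:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical counts: 500,000 read pairs of which 393,469 are unique.
size = estimate_library_size(500_000, 393_469)
print(f"duplication: {100 * (1 - 393_469 / 500_000):.1f}%")
print(f"estimated library size: {size:,.0f}")
```

Under this model, a lower duplication rate at the same sequencing depth implies a larger library, which matches the direction of the differences between the columns of Table 3.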
`
`Automation of capture protocol to process 96 samples
`simultaneously
`The most labor-intensive step in the manual selection
`process is the ‘capture’ protocol (Table 1), where hybri-
`dized DNA-RNA bait duplexes are separated from
`unbound fragments. The separation is performed using
`streptavidin beads that bind to the biotin molecules that
`are covalently linked to the RNA bait. Fragments that
`are not hybridized to the biotinylated RNA baits are
`removed through a series of washes.
`Wash conditions were redesigned for compatibility
`with automated liquid handling and optimized for maxi-
`mal yield (Additional file 9). Since microtiter wells are
`of much smaller volume than the standard microtubes
`used in the manual process, the number of wash cycles
`was increased as the volume of each wash had to be
`decreased to fit the wells while maintaining the proper
`level of stringency. Wash buffers are precisely controlled
`for temperature by storing the buffer-containing vessels
`in 65°C temperature-calibrated heating blocks (V&P
`scientific, VP-741BW MICA) integrated onto the deck
`of the liquid handler robot. This automation provides a
`hands-off capture protocol capable of consistently set-
`ting up capture reactions for 96 samples in 4 hours; in
`comparison, the manual (and somewhat variable) pro-
`cess handled 6 samples in 4 hours. Additionally, the
`automated process delivers output of a more consistent
`quality, and eliminates manual tracking and pipetting
`errors (Additional file 10).
`
`Off-bead PCR to increase yield of captured product
`In the manual protocol [14], the elution of desired DNA
fragments from the RNA bait-streptavidin bead complex
is accomplished by denaturation using 0.1 N
`sodium hydroxide followed by a cleanup step prior to
`PCR amplification. This series of steps requires large
`volumes and is therefore difficult to scale in a microtiter
`plate format. In addition, variability at this step can
`result in loss of captured DNA. We have replaced elu-
`tion through denaturation by amplifying the captured
`sequences directly by PCR, by a process we term ‘off-
`bead’ PCR, as the target is PCR amplified off the bead
`directly in the capture plate. This allows scaling in a
`microtiter plate format, simplifies the process by remov-
`ing a pipetting step, eliminates process variability and
`improves the yield of captured product roughly three-
`fold (Additional file 11). Briefly, PCR enzyme, PCR
primers, and dNTPs are added directly to the bead-bait-DNA
complex, and the mix is amplified via thermal cycling
(see Materials and methods for details). Bait
`RNAs, which lack Illumina adapter sequences, and pond
`fragments with fewer than two adapters are not ampli-
`fied. The amplified fragments are then separated from
`the beads through a modified SPRI bead cleanup
`
`(Materials and methods). This off-bead PCR protocol, in
`combination with improvements described above, signif-
`icantly improves yield at this step in the process
`(Table 3). This simple, automation-friendly, cost-effec-
`tive protocol can be used to process up to 1,200 samples
`per week in batches of 96 (Table 2).
`
`Development and automation of in-process quality
`control checkpoints
`As the process increases in scale, readouts of sample
`quality and process success become increasingly impor-
`tant as indicators of the likelihood of producing high
`quality sequencing results. To this end we have imple-
`mented a series of in-line quality control checkpoints.
`This enables granular reporting of metrics during the
`SHS process and, importantly, allows poorly performing
`samples to be quickly identified and removed, avoiding
`the associated costs of downstream processing and
`sequencing (Figure 4). Central to this is the development
`of critical quality control assays, both in terms of their
`sensitivity to the samples at the point at which they are
`assayed, as well as their utility as a predictor of sequen-
`cing quality. The eight key quality control checkpoints
`that add immediate value to the process are outlined
`below (see Materials and methods for details on each).
`Volume check
`Volumes are checked for every sample by visual inspec-
`tion to ensure predictable performance in shearing
`(Figure 4a). If volumes are outside of specification (50 μl
`± 20%), samples are either concentrated or diluted to
`reach the appropriate range. Low volumes cause inaccu-
`rate automated transfer of sample into shearing vessels.
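The volume specification of 50 μl ± 20% corresponds to an accepted window of 40 to 60 μl. A minimal sketch of the check follows. The function name is hypothetical, and the mapping of low volumes to dilution and high volumes to concentration is an assumption drawn from the sentence above; the source performs this check by visual inspection.

```python
def volume_action(volume_ul, target_ul=50.0, tolerance=0.20):
    """Classify a sample volume against the target window."""
    low = target_ul * (1 - tolerance)
    high = target_ul * (1 + tolerance)
    if volume_ul < low:
        return "dilute"  # bring the volume up into range
    if volume_ul > high:
        return "concentrate"  # bring the volume down into range
    return "ok"


# Hypothetical sample volumes in microliters.
for volume in (35.0, 50.0, 72.0):
    print(volume, volume_action(volume))
```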
`Sample concentration check by PicoGreen
`Concentrations for all samples are measured via an
`automated PicoGreen assay (see Materials and methods)
`and are specified to be within 2.0 to 60 ng/μl (Figure
`4b). Samples above this range are normalized and re-ali-
`quoted to appropriate volumes since excess input DNA
`can actually inhibit the enzymatic pond reactions (data
not shown). Samples at or above the 2.0 ng/μl threshold are
considered to pass. Those below it can be run
at risk.
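The concentration rule can be sketched as a three-way classification plus the dilution arithmetic needed to normalize a sample that is too concentrated. The function names and the 50 ng/μl normalization target are hypothetical; the source specifies only the 2.0 to 60 ng/μl window.

```python
def concentration_status(ng_per_ul, low=2.0, high=60.0):
    """Classify a PicoGreen concentration against the accepted window."""
    if ng_per_ul < low:
        return "run at risk"
    if ng_per_ul > high:
        return "normalize"
    return "pass"


def diluent_to_add_ul(volume_ul, ng_per_ul, target_ng_per_ul):
    """Diluent volume that brings a sample down to the target concentration."""
    if target_ng_per_ul <= 0 or ng_per_ul < target_ng_per_ul:
        raise ValueError("target must be positive and not above the current value")
    # Mass is conserved: volume * conc = (volume + diluent) * target.
    return volume_ul * ng_per_ul / target_ng_per_ul - volume_ul


# Hypothetical sample: 20 ul at 100 ng/ul, normalized to 50 ng/ul.
print(concentration_status(100.0))
print(diluent_to_add_ul(20.0, 100.0, 50.0))
```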
`Size quality control of sheared DNA
`Sheared samples are assayed on an automated microflui-
`dic electrophoresis instrument, the Caliper GX system,
`using the 1K DNA Chip to evaluate the size distribution
`produced by the Covaris instrument (Figure 4c). Frag-
`ment sizes should be between 75 and 300 bp with the
distribution centered on 150 bp. Samples sheared
above this range can decrease the specificity and efficiency
of the selections. Samples sheared to less than a
mean of 110 bp will suffer losses during the various
with-bead cleanups, greatly reducing the complexity of
the library before selection.
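The size criteria above can be expressed as a check on an electropherogram-style profile. In this sketch the profile is a list of (size in bp, signal) pairs, the mean is signal-weighted, and the function name, the input format, and the use of the 300 bp upper bound as the cutoff for an over-long mean are assumptions; the instrument software may summarize the trace differently.

```python
def shear_qc(profile, low_bp=75, high_bp=300, min_mean_bp=110):
    """Return (mean size, fraction of signal within range, verdict)."""
    total = sum(signal for _, signal in profile)
    if total <= 0:
        raise ValueError("profile has no signal")
    mean_bp = sum(size * signal for size, signal in profile) / total
    in_range = sum(s for size, s in profile if low_bp <= size <= high_bp) / total
    if mean_bp < min_mean_bp:
        verdict = "too short"  # expected losses in with-bead cleanups
    elif mean_bp > high_bp:
        verdict = "too long"  # expected loss of selection specificity
    else:
        verdict = "ok"
    return mean_bp, in_range, verdict


# Hypothetical traces: one centered on 150 bp, one over-sheared.
print(shear_qc([(100, 1.0), (150, 2.0), (200, 1.0)]))
print(shear_qc([(60, 3.0), (100, 1.0)]))
```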
`
[Figure 4 image not reproduced. Panels: (a) Input DNA quantification by pico; (b) Pre-flight pico, axis: Total DNA (μg); (c) Shearing profile, Fluorescence versus Size (bp); (d) Automation performance, per-well values across plate columns 1 to 12 and rows A to H; (e) Deck layout: SPRI XP beads, filter tips, shearing holder with product in AFA tubes, 70% ethanol, elution buffer, sample binding plate, magnet; (f) Pond quantification, Enriched library (ng/μl) by LC set #; (g) Catch quantification by pico; (h) Catch quantification by qPCR, Fluorescence (dRn) versus Cycles. Other labels: Sample mean (B/B0), ng/μl, Linear regression, Concentrations, Distributions. Process steps shown alongside the panels: Covaris shearing; End repair; A-base addition; Adaptor ligation; Pond PCR; Hybridization; Capture on bead; Off-bead catch PCR; Illumina sequencing.]
`
`Figure 4 Quality control checkpoints. (a-h) Eight different quality control checkpoints for the scaled SHS process are schematized. Quality is
`assayed at key steps to quickly identify failed samples and also to provide ability to troubleshoot process failures. See text for details. AFA,
`adaptive focused acoustics.
`
`Performance quality control of automation
`The Bravo automated liquid handling platform is assayed
`daily for dispense accuracy and precision using a quantita-
`tive fluorescent dye assay (Figure 4d). Standard liquid
`handling sequences are run using sulforhodamine dye, and
`relative fluorescent units of the dispensed dye are assayed
on a Perkin Elmer Victor3 plate reader. Coefficients of variation
(%CV) are calculated between wells and must be
within three standard deviations of the mean. If the robot
is out of specification, maintenance is performed on the
system followed by repeat of the quality control until the
coefficients of variation are back within acceptable ranges.
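A minimal sketch of the dispense check follows, assuming the plate reader output is available as a list of relative fluorescence units, one per well. It computes the between-well %CV and lists wells lying more than three standard deviations from the plate mean. The 5% pass threshold and the function name are hypothetical; the source does not state a numeric acceptance limit.

```python
from statistics import mean, stdev


def dispense_qc(rfu, max_cv_pct=5.0):
    """Return (%CV, indices of outlier wells, pass flag) for one dye plate."""
    if len(rfu) < 2:
        raise ValueError("need at least two wells")
    m = mean(rfu)
    if m <= 0:
        raise ValueError("mean signal must be positive")
    sd = stdev(rfu)
    cv_pct = 100.0 * sd / m
    # Wells whose signal is more than three standard deviations from the mean.
    outliers = [i for i, value in enumerate(rfu) if abs(value - m) > 3 * sd]
    return cv_pct, outliers, cv_pct <= max_cv_pct and not outliers


# Hypothetical readings from five wells.
cv, outliers, passed = dispense_qc([100, 102, 98, 101, 99])
print(f"%CV = {cv:.2f}, outliers = {outliers}, pass = {passed}")
```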
`
`Confirmation of deck configuration
`To confirm proper set up of the Bravo platform before
`each step in the protocol, the software requests the
operator to confirm the proper deck layout by comparing
the deck positions to a picture shown on
screen (Figure 4e). This prevents users from starting
`programs without the proper materials in place or from
`running the wrong combination of program and deck
`configuration.
`Quantification of pond libraries and catch libraries
`Prior to selection, pond libraries are assayed for concen-
`tration by an automated PicoGreen assay (Materials and
`methods) and are specified to be within a range of 25 to
`60 ng/μl in a volume of 40 μl (Figure 4f