`
`Integrated Systems and Technologies
`
`Cancer
`Research
`
`Ultrasensitive Measurement of Hotspot Mutations in Tumor
`DNA in Blood Using Error-Suppressed Multiplexed Deep
`Sequencing
`
`Azeet Narayan1, Nicholas J. Carriero2, Scott N. Gettinger3, Jeannie Kluytenaar1, Kevin R. Kozak4,
`Torunn I. Yock5, Nicole E. Muscato6, Pedro Ugarelli7, Roy H. Decker1, and Abhijit A. Patel1
`
`Abstract
`
`Detection of cell-free tumor DNA in the blood has offered promise as a cancer biomarker, but practical clinical
`implementations have been impeded by the lack of a sensitive and accurate method for quantitation that is also
`simple, inexpensive, and readily scalable. Here we present an approach that uses next-generation sequencing to
`quantify the small fraction of DNA molecules that contain tumor-specific mutations within a background of
`normal DNA in plasma. Using layers of sequence redundancy designed to distinguish true mutations from
`sequencer misreads and PCR misincorporations, we achieved a detection sensitivity of approximately 1 variant in
`5,000 molecules. In addition, the attachment of modular barcode tags to the DNA fragments to be sequenced
`facilitated the simultaneous analysis of more than 100 patient samples. As proof-of-principle, we showed the
`successful use of this method to follow treatment-associated changes in circulating tumor DNA levels in patients
`with non–small cell lung cancer. Our findings suggest that the deep sequencing approach described here may
`be applied to the development of a practical diagnostic test that measures tumor-derived DNA levels in blood.
`Cancer Res; 72(14); 3492–8. Ó2012 AACR.
`
`Introduction
`The release of cell-free DNA into the bloodstream from dying
`tumor cells has been well documented in patients with various
`types of cancer (1–4). There has been growing interest in trying
`to use such circulating tumor DNA (ctDNA) as a noninvasive
`biomarker to detect the presence of malignancy, follow treat-
`ment response, gauge prognosis, or monitor for recurrence (5,
`6). Because unique somatic mutations can be used as telltale
`marks to distinguish tumor-derived DNA in plasma, a new
`class of highly specific DNA-based cancer biomarkers can be
`envisioned with clinical applications that may complement
`those of conventional serum protein markers.
`To more formally explore the clinical utility of ctDNA, it
`would be imperative to be able to sensitively and accurately
`
`Authors' Affiliations: Departments of 1Therapeutic Radiology, 2Computer
`Science and 3Internal Medicine, Yale University, New Haven, Connecticut;
`4Department of Human Oncology, University of Wisconsin, Madison,
`Wisconsin; 5Department of Radiation Oncology, Massachusetts General
`Hospital, Boston, Massachusetts; and Departments of 6Pathology and
`7Hematology, Lawrence & Memorial Hospital, New London, Connecticut
`
`Note: Supplementary data for this article are available at Cancer Research
`Online (http://cancerres.aacrjournals.org/).
`
`A. Narayan and N.J. Carriero contributed equally to this work.
`
`Corresponding Author: Abhijit A. Patel, Department of Therapeutic Radi-
`ology, Yale University School of Medicine, P.O. Box 208040, New Haven,
`CT 06520. Phone: 203-785-2971; Fax: 203-785-7482; E-mail:
`abhijit.patel@yale.edu
`
`doi: 10.1158/0008-5472.CAN-11-4037
`
`©2012 American Association for Cancer Research.
`
`measure its levels in blood. However, because mutation-har-
`boring ctDNA can be obscured by a relative excess of back-
`ground wild-type DNA, quantitation has proven to be chal-
`lenging. Several innovative approaches have been developed to
`detect the presence or absence of low-level mutant DNA in
`clinical samples (7–12), but few methods are able to sensitively
`quantify ctDNA (13, 14).
`The recent advent of next-generation, high-throughput
`DNA sequencing technologies presents an attractive and
`seemingly obvious solution to this problem. By oversampling
`multiple DNA molecules from a particular genomic region
`using an approach known as ultradeep sequencing, it is
`possible to identify and enumerate rare sequence variants.
`But the sensitivity of this method is limited by the inherent
`error rate of the sequencer, as incorrectly read bases might
`be mistaken for true mutant copies. Indeed, mutant ctDNA
`has been previously reported to comprise an average of 0.2%
`of total plasma DNA (14)—a range in which sequencer
`misreads can be problematic.
`Here we describe a modified deep sequencing method that
`demands redundancy within each clonal sequence to produce
`extremely high quality base calls in short, mutation-prone
`regions of plasma DNA. Amplification of both mutated and
`wild-type sequences is carried out by unbiased PCR in a single
`tube, ensuring highly accurate and reproducible quantitation.
`The scheme is designed to have the flexibility to simultaneous-
`ly analyze mutations in several genes from multiple patient
`samples, making it practically feasible to screen plasma sam-
`ples for mutant ctDNA without previous knowledge of the
`tumor's mutation profile.
`
`3492
`
`Cancer Res; 72(14) July 15, 2012
`
`EX1082
`
`00001
`
`
`
`Downloaded from http://aacrjournals.org/cancerres/article-pdf/72/14/3492/2671690/3492.pdf by guest on 08 March 2022
`
`Deep Sequencing for Measurement of Circulating Tumor DNA
`
`lines were used only for analysis of short regions of genomic
`DNA, authentication of lines by our laboratory was limited to
`sequencing of those regions. To test the performance of the deep
`sequencing method for a particular gene, DNA derived from
`cells known to be either mutant or wild type with respect to that
`gene was mixed in various ratios between 10,000:1 and 1:10,000.
`Cell line DNA samples were then amplified and sequenced
`according to the same methods that were used for plasma DNA.
`
`Ultradeep sequencing
`Barcoded PCR products from all samples were mixed to
`create 3 separate pools, each corresponding to one set of
`replicate reactions. IlluminaTruSeq sequencing adapters with
`different indices were ligated to each of the 3 amplicon pools
`and were gel-purified as recommended by the manufacturer.
`All samples were loaded onto a single lane of an Illumina
`HiSeq2000 flow cell and were subjected to paired-end sequenc-
`ing (75 base pairs per read). Sequencing details and modifica-
`tions to standard Illumina protocols are described in Supple-
`mentary Materials and Methods.
`
`Data analysis
`All sequences were initially sorted into 3 replicate groups
`based on the adapter index sequence using Illumina's demul-
`tiplexing algorithm. A computer script was written to filter,
`assort, align, and count millions of paired-end sequences
`within each group. First, a read pair was assigned to a sam-
`ple-specific data bin based on the barcode of each read in the
`pair. Then, based on PCR primer sequences, the pair was
`assigned to one of the reference genes. Next, the longest stretch
`of perfect sequence agreement between each pair of reads was
`determined, and this was used to align the reads to the
`reference sequence for the gene. A read pair was discarded
`if either member did not pass Illumina filtering or a nucleotide
`was reported to be "."; if there was an inconsistency in barcodes,
`strands, or PCR tags; or if their region of perfect sequence
`agreement was less than 36 nucleotides in length. Finally,
`variant sequences confirmed by reads from both strands were
`identified and counted within each data bin based on com-
`parison with the reference sequence. The mean counts of
`variant sequences corresponding to known hotspot mutations
`were calculated from the 3 replicate data bins for each sample.
`The number of copies of mutant DNA fragments in a plasma
`sample was determined by using real-time PCR to measure the
`total concentration of both mutant and wild-type DNA frag-
`ments and then multiplying this value by the fraction of mutant
`molecules determined by deep sequencing. A mutation was
`considered to be undetectable if the number of mutant copies
`in the plasma sample was calculated to be less than one.
`Further details are available in Supplementary Materials and
`Methods. The full computer code, which we call OPAL for
`Overlapped Paired-end ALignment, is available upon request.
`
`Confirmation of mutations in tumor tissue
`Genomic DNA was isolated from paraffin-embedded tumor
`tissue samples using the QuickExtract FFPE DNA Extraction
`Kit (Epicentre). Mutation hotspot regions of KRAS, BRAF, and
`EGFR were amplified using the same PCR primers that were
`
`Materials and Methods
`Patient plasma and tumor samples
`Under the approval of the Human Investigation Committees
`at the Yale School of Medicine and at Lawrence & Memorial
`Hospital, 30 patients with stage I–IV non–small cell lung cancer
`(NSCLC) were enrolled in this study between July 2009 and July
`2010. Informed consent was obtained from all patients. Most
`patients were recruited in the radiation oncology clinic and
`underwent treatment with radiation therapy, chemotherapy,
`targeted systemic therapy, and/or surgery. Whenever possible,
`blood samples were collected from patients before starting the
`current course of treatment and then at subsequent times
`during and after treatment. A total of 117 samples were
`obtained. Formalin-fixed, paraffin-embedded tumor specimens
`were obtained for all patients with non-squamous histology
`whose tumors had not already been tested for mutations by a
`clinical laboratory and for whom sufficient tissue was available
`in the block after standard pathology evaluation.
`
`Extraction and amplification of plasma DNA
`Blood was collected in EDTA-containing tubes (Becton Dick-
`inson) and was centrifuged at 1,000 g for 10 minutes within
`3 hours of collection. Plasma was transferred to cryovials, being
`careful to avoid the buffy coat, and was stored at 80
`
`C until
`further processing. DNA was extracted from 0.2 mL of each
`plasma sample using the QIAamp DNA Blood Mini Kit (Qiagen),
`according to the manufacturer's protocol. Amplification of
`mutation hotspot regions was carried out in triplicate for each
`sample using 2 successive rounds of PCR, with primers designed
`to flank codons 12 and 13 of KRAS, codon 858 of EGFR, and
`codon 600 of BRAF. In the first round of PCR, all hotspot regions
`from a given sample were amplified in a multiplexed fashion.
`The products of these reactions were diluted 5,000-fold and then
`used as templates for a second round of PCR in which each hot-
`spot was amplified separately using nested primers with sam-
`ple-specific barcodes. The barcode sequences were 6 nucleo-
`tides in length and were designed to differ from all other
`barcodes in the set at a minimum of 2 positions so that a single
`sequencing error would not lead to misclassification of samples.
`All PCR steps were carried out using a high-fidelity polymerase
`(HiFi Hotstart, Kapa Biosystems). Real-time quantitative PCR
`was used to determine the total concentration of mutant and
`wild-type DNA fragments. Details of PCR and of modular
`barcode attachment to gene-specific primers are included in
`Supplementary Materials and Methods.
`
`Analysis of cell line DNA
`Genomic DNA was purified from human cancer cell lines
`using the same method used for purifying plasma DNA, after
`suspending cells in 0.2 mL of phosphate buffered saline. The
`following cell lines were used: A549 (having a KRAS Gly12Ser
`mutation; gift of J. Weidhaas, Yale University, Department of
`Therapeutic Radiology, New Haven, CT), H1957 [having an EGF
`receptor (EGFR) Leu858Arg mutation; gift of J. Weidhaas, Yale
`University], and YUSAC (having a BRAF Val600Glu mutation;
`gift of R. Halaban, Yale University, Department of Dermatology,
`New Haven, CT). Cells were passed in culture for no more than 6
`months after being thawed from original stocks. Because cell
`
`www.aacrjournals.org
`
`Cancer Res; 72(14) July 15, 2012
`
`3493
`
`00002
`
`
`
`Downloaded from http://aacrjournals.org/cancerres/article-pdf/72/14/3492/2671690/3492.pdf by guest on 08 March 2022
`
`obtained a median depth of 108,467 read pairs per mutation
`site per sample after filtering and demultiplexing a total of
`86,359,980 raw sequences generated on a single lane of an
`Illumina HiSeq 2000 flow cell.
`
`Single-end deviations
`from wild type
`
`G G T G G C
`Gly12 Gly13
`Replicate 1
`
`G G T G G C
`Gly12 Gly13
`Replicate 2
`
`G G T G G C
`Gly12 Gly13
`Replicate 3
`
`Overlapped paired-end
`deviations from wild type
`
`G G T G G C
`Gly12 Gly13
`Replicate 1
`
`G G T G G C
`Gly12 Gly13
`Replicate 2
`
`G G T G G C
`Gly12 Gly13
`Replicate 3
`
`Deviations from
`wild-type:
`
`AGTCI
`
`nsertion
`Deletion
`
`Mean overlapped
`paired-end deviations
`
`Gly12Ser mutation
`(GGT AGT)
`
`G
`
`G
`Gly12
`
`T
`
`G
`
`C
`
`G
`Gly13
`
`0.0025
`
`0.002
`
`0.0015
`
`0.001
`
`0.0005
`
`0
`
`0.003
`
`0.0025
`
`0.002
`
`0.0015
`
`0.001
`
`0.0005
`
`0
`
`A
`
`Variant : wild-type ratio
`
`B
`
`0.003
`
`0.0025
`
`0.002
`
`0.0015
`
`0.001
`
`0.0005
`
`Variant : wild-type ratio
`
`0
`
`C
`
`variant : wild-type ratio
`
`Mean
`
`Figure 2. Suppression of spurious mutation counts to reveal low-abundance
`variants. Each bar indicates the frequency of a particular deviation from
`the wild-type sequence occurring within the codon 12/13 hotspot region of
`KRAS. The tested sample contained 0.2% DNA derived from a lung cancer
`cell line that is known to be homozygous for a KRAS Gly12Ser mutation. A,
`filtered reads from one end of the amplicon had relatively frequent
`mismatches when directly compared with the wild-type sequence. Data
`from 3 replicate amplifications are shown. B, sequencer errors were greatly
`reduced by requiring both partially overlapping paired-end reads from each
`clone to exactly match a specific mutation. The Gly12Ser mutation was
`now readily distinguished from the remaining low-level errors that were likely
`introduced during DNA amplification and processing. Insertions and
`deletions were no longer seen in this region after requiring agreement of
`overlapped reads. C, a further reduction in the relative error level can be
`achieved by calculating the mean values of 3 replicate measurements, as
`mutations found in the original DNA sample should produce more
`consistent counts than randomly occurring errors.
`
`Narayan et al.
`
`A
`
`Plasma DNA
`
`Barcode
`
`Mutation hotspot
`
`PCR
`Round 1
`
`PCR
`Round 2
`
`Multiplexed
`preamplification of
`KRAS, EGFR, & BRAF
`hotspots
`
`Nested PCR of single
`gene region using
`barcoded primers
`
`Batch sequencing of
`barcoded PCR
`amplicons from
`multiple samples
`
`B
`
`5′
`
`Sequence redundancy
`in hotspot region
`{
`Forward 75 bp read
`3′
`GGAACCTT . . . CTTGTGGTAGTTGGAGCTGGTGGCGTA . . .
`
`Sequencing
`adapter
`
`3′
`
`. . . CGACCACCGCATCCGTTCTCACGGAACT . . . TTGATTCG
`Reverse 75 bp read
`Sequencing
`primer
`
`5′
`
`Figure 1. Schematic of the error-suppressed multiplexed deep sequencing
`approach. A, cell-free DNA purified from plasma undergoes 2 rounds of
`amplification by PCR. The first round amplifies mutation hotspot regions of
`several genes from a given sample in a single tube. The second round
`separately amplifies each hotspot region using nested primers
`incorporating unique combinations of barcodes to label distinct samples.
`The barcoded PCR products are then pooled and subjected to deep
`sequencing. Millions of sequences are sorted and counted to determine
`the ratio of mutant to wild-type molecules derived from each sample. The
`total number of plasma DNA fragments is measured by real-time PCR
`and can be used to calculate the absolute concentration of mutant ctDNA.
`B, sequence redundancy in mutation hotspot regions is produced by
`partial overlap of paired-end reads from the forward and reverse strands
`of each clone. This yields highly accurate base calls, permitting detection
`and quantitation of rare mutations with greater sensitivity.
`
`used in the first round of PCR described above. Sanger
`sequencing was carried out on gel-purified amplicons, and
`mutations were identified from chromatograms using Muta-
`tion Surveyor software (Softgenetics).
`
`Results
`Error suppression reveals low-abundance variants
`To determine the relative abundance of tumor-specific
`mutations, we carried out massively parallel sequencing of
`PCR amplicons derived from plasma DNA fragments contain-
`ing known mutation hotspots. Thousands of clonal sequence
`reads from each plasma sample were compared with reference
`sequences to identify and quantify variants. For proof-of-
`concept, we restricted the analysis to frequently mutated
`codons within 3 oncogenes that commonly develop somatic
`mutations in various malignancies: codons 12 and 13 of KRAS,
`codon 600 of BRAF, and codon 858 of EGFR. By designing PCR
`primers that flank very short regions (<50 bp) surrounding
`these mutation hotspots, we could ensure adequate amplifi-
`cation of highly fragmented plasma DNA and achieve greater
`sequence depth. Modular attachment of DNA barcode tags to
`0
`the 5
`-ends of the PCR primers allowed sequencing of up to 256
`DNA samples in batch (Fig. 1A and Supplementary Fig. S1). We
`
`3494
`
`Cancer Res; 72(14) July 15, 2012
`
`Cancer Research
`
`00003
`
`
`
`Deep Sequencing for Measurement of Circulating Tumor DNA
`
`BRAF
`Val600Glu
`
`B
`
`10,000
`
`1,000
`
`100
`
`10
`
`1
`
`0.1
`
`0.01
`
`0.001
`
`0.0001
`
`EGFR
`Leu858Arg
`
`C
`
`10,000
`
`1,000
`
`100
`
`10
`
`1
`
`0.1
`
`0.01
`
`0.001
`
`0.0001
`
`0.00001
`
`0.0001
`
`0.001
`
`0.01
`
`0.1
`
`1
`
`10
`
`100
`
`1,000
`
`10,000
`
`KRAS
`Gly12Ser
`
`10,000
`
`1,000
`
`100
`
`10
`
`1
`
`0.1
`
`0.01
`
`0.001
`
`0.0001
`
`A
`
`(mutant : wild-type)
`
`Measured ratio
`
`0.00001
`
`0.00001
`
`0.0001
`
`0.001
`
`0.01
`
`0.1
`
`1
`
`10
`
`100
`
`1,000
`
`10,000
`
`0.00001
`
`0.001
`
`0.01
`
`0.1
`
`1
`
`10
`
`100
`
`1,000
`
`0.0001
`0.00001
`0.00001
`10,000
`Ratio of DNA added (mutant : wild type)
`
`Downloaded from http://aacrjournals.org/cancerres/article-pdf/72/14/3492/2671690/3492.pdf by guest on 08 March 2022
`
`Figure 3. Performance of the error-suppressed deep sequencing approach. Measurements of DNA extracted from mutant and wild-type cancer cell lines
`mixed in various ratios ranging from 1:10,000 to 10,000:1 show a high degree of accuracy and reproducibility. A, DNA from the KRAS-mutant cell line
`produced a linear plot over the range of concentrations tested. B and C, BRAF- and EGFR-mutant lines contained a small amount of wild-type DNA, thereby
`yielding a plateau at higher mutant to wild-type ratios. Non-linear least-squares fits were carried out using the equation y ¼ 10^(slope log((1-C) x/(C xþ1))þ
`intercept), in which C was the fraction of wild-type molecules found in DNA extracted from mutant cell lines. Error bars indicate the SD of 3
`measurements.
`
`Importantly, the design of short PCR amplicons enabled us
`to devise a sequencing strategy that could distinguish mutant
`from wild-type DNA molecules with very high confidence. We
`modified Illumina's paired-end sequencing mode to achieve
`partial overlap of 75 base pair bidirectional reads obtained
`sequentially from the forward and reverse strands of each
`clonal DNA cluster on the flow cell (Fig. 1B). Mutation hotspots
`were included in the overlapping sequence region so that the
`hotspot within each clone would be read from one strand and
`then proofread from the opposite strand. By discarding clones
`that did not have perfect sequence agreement between the 2
`paired-end reads, we were able to eliminate the vast majority of
`sequencer-generated errors. Imperfect sequence agreement
`was found in 22% of read pairs that had already passed
`Illumina's chastity filter. Similar to previous reports (15, 16),
`we observed a median error frequency of 0.31% per base when
`directly comparing single reads derived from either strand of
`wild-type control samples to known reference sequences. The
`frequency of such errors was reduced to 0.07% per base in the
`region of overlap after removing read pairs that lacked
`sequence consistency.
`Any remaining errors were highly unlikely to be caused by
`coincidentally consistent misreads from opposite ends of a
`clone. Rather, most of these errors were probably present
`within the DNA molecules being sequenced, introduced by
`polymerase misincorporations or DNA damage. To further
`discriminate true mutations from such errors, we carried out
`all amplification and processing steps in triplicate and deter-
`mined the mean of the 3 mutation counts. This was done based
`on the premise that true mutations would be reproducibly
`counted in all 3 instances, whereas counts from randomly
`occurring errors would be more variable (recognizing that the
`distribution of errors is not entirely random). Using this
`approach, we were able to reduce the frequency of miscalls
`
`of specific mutations from known wild-type samples to a
`median value of 0.014% [interquartile range (IQR): 0.0052%–
`0.023%; wild-type DNA was obtained from A549 cells for testing
`BRAF and EGFR mutations and from YUSAC cells for testing
`KRAS mutations; Supplementary Table S1]. Suppression of
`errors in this manner permitted rare mutations to be identified
`with a high degree of certainty (Fig. 2).
`
`Sensitive and accurate quantitation of mutant DNA
`Next, we tested the ability of this deep sequencing approach
`to measure mutant and wild-type DNA levels over a broad
`range of relative concentrations. Genomic DNA from KRAS-,
`BRAF-, or EGFR-mutant cancer cell lines was mixed in different
`ratios and then subjected to amplification and deep sequenc-
`ing. We found that mutant DNA could be accurately and
`reproducibly measured in a linear manner over approximately
`7 to 8 orders of magnitude and down to levels of approximately
`1 in 5,000 molecules (Fig. 3). Also, by testing combinations of
`DNA from multiple mutant cell lines, we confirmed the ability
`of the assay to simultaneously quantify more than one muta-
`tion from a given sample.
`
`Monitoring ctDNA levels in cancer patients
`To validate this method with clinical samples, we ana-
`lyzed plasma collected from patients with NSCLC at various
`times before, during, or after treatment. Patients were
`enrolled in the study (and their plasma DNA was tested)
`without previous knowledge of the mutation status of their
`tumors. A total of 117 samples were obtained from 30
`patients (17 patients with adenocarcinoma, 9 with undiffer-
`entiated NSCLC, and 4 with squamous cell carcinoma). KRAS
`Gly12Asp, Gly12Val, Gly12Cys, or Gly13Asp point mutations
`were detectable in the plasma DNA of 6 patients of 26 with
`adenocarcinoma or undifferentiated NSCLC. As expected, no
`
`www.aacrjournals.org
`
`Cancer Res; 72(14) July 15, 2012
`
`3495
`
`00004
`
`
`
`Downloaded from http://aacrjournals.org/cancerres/article-pdf/72/14/3492/2671690/3492.pdf by guest on 08 March 2022
`
`6,000
`
`4,000
`
`2,000
`
`KRAS Gly12Val
`
`= Undetectable
`
`0
`
`0
`
`Brain
`radiation
`
`100
`
`200
`
`300
`
`Chemotherapy
`Days
`
`Mutant molecules per mL
`
`B
`
`KRAS Gly12Asp
`
`0
`
`10
`
`20
`
`30
`
`40
`
`50
`
`Radiation +
`HDAC inhibitor
`
`Days
`
`Chemotherapy
`
`Narayan et al.
`
`30,000
`
`20,000
`
`10,000
`
`0
`
`A
`
`Mutant molecules per mL
`
`30,000
`
`20,000
`
`10,000
`
`KRAS Gly12Val
`
`0
`
`0
`
`20
`Completion of
`chest radiation
`
`40
`
`60
`
`80
`
`Deceased
`
`Days
`
`C
`
`Mutant molecules per mL
`
`Figure 4. Changes in ctDNA levels with treatment or disease progression. Measurements of mutant ctDNA from patients with NSCLC are shown at
`various times in relation to therapeutic interventions and disease status. The ctDNA was considered undetectable if sequence counts yielded a quantity of less
`than one mutant molecule per sample. Median genome equivalents per sample as determined by real-time PCR were 9602 (IQR ¼ 5,412–11,513). A, patient
`3 had stage IV lung adenocarcinoma with a 4.3 cm right upper lobe tumor and large metastases in the abdomen and supraclavicular region. She was
`treated concurrently with an experimental histone deacetylase (HDAC) inhibitor and palliative radiation therapy directed at her painful 6.9 cm supraclavicular
`lesion. She began chemotherapy treatment shortly afterwards. B, patient 5 had a 7.5 cm lung adenocarcinoma with 8 small brain metastases ranging from
`3 to 15 mm in size at presentation. He was treated with palliative whole-brain radiation therapy, followed by long-term weekly chemotherapy. Follow-up
`imaging revealed an excellent, durable response with shrinkage of the lung tumor to approximately 15% of its original volume at 7 months after diagnosis.
`No evidence of disease progression was seen during this time period. C, patient 9 underwent definitive radiation treatment for locally advanced, stage
`IIIB undifferentiated NSCLC. Other health conditions prevented him from undergoing surgery or concurrent chemotherapy. Blood sample collection
`commenced upon completion of his treatment. Although his disease was confined to the thorax before initiating radiation therapy, a positron emission
`tomography scan conducted 8 weeks after treatment showed marked progression of disease with multiple osseous, hepatic, and subcutaneous metastases.
`He expired 10 weeks after completing treatment.
`
`KRAS mutations were found in specimens from patients with
`squamous cell carcinoma. BRAF and EGFR mutations were
`not detectable in any plasma samples. This was somewhat
`surprising for EGFR, which has a reported prevalence of
`activating mutations in NSCLC of approximately 10% (17–
`19). However, evaluation of 21 available tumor tissue speci-
`mens confirmed the absence of EGFR mutations in this
`population (mutations occurring outside of the sequenced
`hotspot region may have been missed). We found the
`presence or absence of KRAS mutations in all tested tumor
`samples to be concordant with the findings in plasma: 5
`patients had identical KRAS mutations in both tumor and
`
`plasma, and 16 patients had no KRAS mutations detected
`from either source. Tumor tissue was unavailable or insuf-
`ficient for 1 patient with mutant KRAS in the plasma and for
`4 patients with no plasma mutations. Supplementary Table
`S2 lists the clinical characteristics and mutation findings for
`all patients in the study.
`For patients with detectable plasma DNA mutations, we
`were able to follow changes in measured ctDNA levels in the
`context of therapeutic interventions or disease progression. To
`determine the absolute concentration of mutant KRAS DNA
`fragments in a plasma sample, we measured the total concen-
`tration of KRAS fragments by real-time PCR and then
`
`3496
`
`Cancer Res; 72(14) July 15, 2012
`
`Cancer Research
`
`00005
`
`
`
`Downloaded from http://aacrjournals.org/cancerres/article-pdf/72/14/3492/2671690/3492.pdf by guest on 08 March 2022
`
`Deep Sequencing for Measurement of Circulating Tumor DNA
`
`and colleagues recently described an elegant and powerful
`error reduction strategy that enables highly sensitive quan-
`titation of DNA variants using massively parallel sequencing
`(21). However, this approach was not designed to analyze
`multiple amplicons from samples containing limited DNA
`and was not tested on clinical specimens.
`A notable limitation of our method is the need to focus on
`short, mutation-prone regions of DNA because of length
`constraints in the overlapping portion of paired-end reads.
`This may preclude evaluation of inactivating mutations that
`can occur at numerous possible locations within tumor
`suppressor genes. Nonetheless, many tumor suppressor
`genes would be amenable to analysis by this method because
`their mutations tend to be clustered within short domains
`(such as the region of TP53 that encodes the DNA binding
`domain). Also, as sequencing technologies improve, read
`lengths are likely to become less constraining. Another
`limitation of our study was the small number of patients
`whose mutant ctDNA we were able to measure. Had we
`selectively enrolled patients with known tumor mutations,
`we would have likely seen more cases with measurable
`ctDNA. However, we chose to test the performance of the
`method on an unselected population of patients with NSCLC
`and did observe congruence between mutations from plas-
`ma and tumor tissue in this small sample set. By including a
`larger panel of cancer-specific mutations in future studies,
`we expect to be able to measure mutant ctDNA in a higher
`percentage of patients. Finally, although we report a limit of
`detection of approximately 1 variant in 5,000 molecules, one
`must keep in mind that mutant counts would need to be
`several-fold above background to guide clinical decisions.
`Practical considerations such as cost and turnaround time
`must also be factored into any analysis of the potential
`utility of a new technology. Although the capital equipment
`costs of next-generation sequencing are presently very high,
`they are predicted to drop substantially as the technologies
`mature. Because barcoding allows multiple samples to be
`analyzed in a single sequencing run, the current cost per
`sample is well under 100 U.S. dollars. When we carried out
`the described experiments, a 75-base pair, paired-end run
`with indexing took approximately 9 to 10 days of sequencer
`time. If 3 to 4 days are added for sample preparation and
`computational analysis,
`the resultant
`turnaround time
`might be considered to be impractical for a clinical test.
`But sequencer speeds have improved dramatically in just the
`past several months, and this trend is likely to continue.
`Thus,
`it does not seem that cost and time will remain
`significant impediments for much longer.
`Error-suppressed deep sequencing may find applica-
`tions in other scenarios in which quantitation of low-abun-
`dance DNA variants would be informative. The diagnostic
`utility of rare cancer-associated mutations is being inves-
`tigated in biologic specimens such as lymph nodes, stool,
`pleural fluid, peritoneal fluid, and urine. Outside of onco-
`logy, analysis of nucleic acid variants has been useful in
`HIV medicine (22), organ transplantation (23), and prenatal
`diagnosis (24). Additional unanticipated uses may become
`apparent
`as next-generation sequencing
`technologies
`
`multiplied this value by the fraction of mutant molecules
`determined by deep sequencing. The median concentration
`among samples with detectable mutations was 5,694 mutant
`KRAS molecules per mL (IQR: 2,655–25,123). Time courses of
`mutant ctDNA measurements for patients who had 3 or more
`samples collected are shown in Fig. 4 (data for patients with
`fewer measurements are shown in Supplementary Fig. S2). In
`2 cases, we observed a decrease in the ctDNA level upon
`treatment with radiation and/or systemic therapy. Aggressive
`progression of metastatic disease in a different patient was
`accompanied by a substantial rise in ctDNA. In another 2 cases,
`we saw an increase in ctDNA levels shortly after initiating
`treatment, perhaps because more tumor DNA was released
`into the bloodstream as cancer cells were being killed.
`
`Discussion
`The deep sequencing approach described above shows the
`successful application of principles that we believe could
`form the basis of a practical diagnostic test to measure
`tumor-derived DNA levels in blood. The accuracy and sen-
`sitivity provided by the error suppression strategy enables
`quantitation of mutant ctDNA within a clinically informative
`range of concentrations. The multiplexing scheme allows
`parallel analysis of mutations within several genes from
`hundreds of patient samples with low marginal cost and
`effort. Thus, analysis need not be limited to mutations that
`are already identified in a patient's tumor tissue. Moreover,
`the flexibility and scalability afforded by the modular bar-
`coding approach makes it fairly easy to customize the panel
`of genes to be evaluated according to the prevalence of
`mutations in different malignancies. These features suggest
`a potential application of this technology that takes advan-
`tage of the cancer specificity of mutant ctDNA: early cancer
`detection. To be practically useful for cancer screening,
`however, such an assay would have to include analysis of
`mutations in a larger panel of genes to maximize the
`probability of detecting ctDNA. Also, extensive testing of
`healthy volunteers would be required to determine the false-
`positive rate and positive predictive value of the assay before
`proceeding with clinical screening studies. Although much
`work remains to be done, the methods presented in this
`article illustrate how next-generation sequencing can bring
`us a step closer to evaluation of ctDNA as a noninvasive
`biomarker for cancer screening.
`Although many other techniques have been developed to
`analyze ctDNA, we believe that
`this deep sequencing
`approach offers important advantages. Several commonly
`used strategies facilitate detection of rare mutant DNA
`molecules from clinical specimens by preferentially ampli-
`(7–12).
`fying mutant
`relative to wild-type sequences
`Although these methods have excellent sensitivity, they are
`unable to provide quantitative information because of the
`poor reproducibility of biased amplification. The most sen-
`sitive existing methods for quantifying ctDNA (13, 20)
`require custom primers or probes to be synthesized for a
`specific known mutation in a patient's tumor, making it
`impractical to use these assays for cancer screening. Kinde
`
`www.aacrjournals.org
`
`Cancer Res; 72(14) July 15, 2012
`
`3497
`
`00006
`
`
`
`Narayan et al.
`
`Downloaded from http://aacrjo