`
DIAGNOSTICS
`Detection of Chromosomal Alterations
`in the Circulation of Cancer Patients with
`Whole-Genome Sequencing
`Rebecca J. Leary,1* Mark Sausen,1* Isaac Kinde,1* Nickolas Papadopoulos,1 John D. Carpten,2
`David Craig,2 Joyce O’Shaughnessy,3 Kenneth W. Kinzler,1 Giovanni Parmigiani,4,5
`Bert Vogelstein,1 Luis A. Diaz Jr.,1† Victor E. Velculescu1†
`
`Clinical management of cancer patients could be improved through the development of noninvasive
`approaches for the detection of incipient, residual, and recurrent tumors. We describe an approach to directly
`identify tumor-derived chromosomal alterations through analysis of circulating cell-free DNA from cancer pa-
`tients. Whole-genome analyses of DNA from the plasma of 10 colorectal and breast cancer patients and 10
`healthy individuals with massively parallel sequencing identified, in all patients, structural alterations that
`were not present in plasma DNA from healthy subjects. Detected alterations comprised chromosomal copy
`number changes and rearrangements, including amplification of cancer driver genes such as ERBB2 and
`CDK6. The level of circulating tumor DNA in the cancer patients ranged from 1.4 to 47.9%. The sensitivity
`and specificity of this approach are dependent on the amount of sequence data obtained and are derived from
`the fact that most cancers harbor multiple chromosomal alterations, each of which is unlikely to be present in
`normal cells. Given that chromosomal abnormalities are present in nearly all human cancers, this approach rep-
`resents a useful method for the noninvasive detection of human tumors that is not dependent on the availability
`of tumor biopsies.
`
`INTRODUCTION
`Abnormal chromosome content, or aneuploidy, is a common charac-
`teristic of tumors, which manifests at the earliest stages of tumori-
`genesis and increases throughout subsequent tumor development
`(1–4). In addition to losses and gains of entire chromosomes, altera-
`tions of chromosome arms, focal amplifications and deletions, and re-
`arrangements are observed in nearly all cancer genomes. Analysis of
`such alterations in cancer began with karyotyping but is now generally
`carried out with molecular methods that can more easily assess ge-
`nomes in a comprehensive manner. For example, an approach based
`on sequencing and enumerating genomic DNA tags, called digital
`karyotyping (DK), was developed for the analysis of copy number al-
`terations on a genome-wide scale (5). Similar tag-based approaches
`have been adapted to next-generation sequencing methods (6, 7).
`Likewise, the analysis of chromosomal rearrangements with large-scale
`DNA sequencing approaches allows for high-resolution mapping of
`rearrangement breakpoints (3).
`Given the universal nature of chromosomal alterations in human
`cancer and improved methods for detecting such changes, we won-
`dered whether we could directly identify chromosomal alterations in
`the circulation of cancer patients. Sequencing analyses of chromosome
`content in the maternal circulation are now being used for detection of
fetal aneuploidy (8, 9), although such approaches have not been evaluated
for detection of chromosomal alterations in cancer patients. Similarly,
analysis of circulating tumor DNA in patients with hematopoietic
malignancies has been useful for the detection of known recurrent
chromosomal rearrangements, such as those that involve the BCR-ABL
oncogene and genes that encode immunoglobulin chains, T cell receptor
subunits, and the retinoic acid receptor (10–15). More recently, analysis
of tumor rearrangements has allowed the development of patient-specific
biomarkers that can be evaluated in plasma for the detection of residual
disease or for tumor monitoring (6, 16). However, such monitoring
approaches rely on analyses of known alterations identified in resected
tumors from the same patients and cannot be directly applied to the
detection of new alterations in the circulation of patients in whom
biopsied material is unavailable. Recurrent mutations, including those
identified in oncogenes such as KRAS, have also been readily identified in
a fraction of patients with solid tumors (17–19).

1Ludwig Center for Cancer Genetics and Howard Hughes Medical Institutions,
Johns Hopkins Kimmel Cancer Center, Baltimore, MD 21287, USA.
2Translational Genomics Research Institute, Phoenix, AZ 85044, USA.
3Baylor Sammons Cancer Center, Texas Oncology, US Oncology, Dallas,
TX 75246, USA. 4Department of Biostatistics and Computational Biology,
Dana-Farber Cancer Institute, Boston, MA 02215, USA. 5Department of
Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA.
*These authors contributed equally to this work.
†To whom correspondence should be addressed. E-mail: velculescu@jhmi.edu
(V.E.V.); ldiaz1@jhmi.edu (L.A.D.)

`An alternative to these approaches is the identification of de novo
`tumor-derived chromosomal alterations through massively parallel di-
`rect sequencing of DNA from the circulation of cancer patients. Such
`approaches would be applicable to more patients than those that rely
`on recurrent oncogene alterations and could theoretically permit non-
`invasive detection of nearly all cancer types. Herein, we compare whole-
`genome analyses of DNA from the plasma of late-stage cancer patients
`to healthy individuals with massively parallel sequencing and detect
`structural alterations specific to patients.
`
`RESULTS
`
`Overview
`A schematic of our approach to examine chromosomal abnormalities
directly in the plasma of cancer patients is illustrated in Fig. 1. As a
proof-of-principle analysis, we obtained 4 to 18 ml of plasma from each of
10 healthy individuals (N1 to N10), 7 patients with colorectal cancer
(CRC11 to CRC17), and 3 patients with breast cancer (BR1 to BR3)
(table S1). Plasma DNA was purified and used to generate paired-end
libraries for whole-genome sequencing, and each library was analyzed on two
lanes of an Illumina HiSeq instrument (see Materials and Methods). An
average of 249,378,422 distinct paired sequences [50 base pairs (bp) from
each end] was obtained for each sample (Table 1). The resulting sequence
data from circulating DNA were analyzed for chromosome copy number changes
and for intra- and interchromosomal rearrangements.

www.ScienceTranslationalMedicine.org   28 November 2012   Vol 4 Issue 162 162ra154
EX1052
RESEARCH ARTICLE

Fig. 1. Schematic of analyses for direct detection of chromosomal
alterations in plasma. The method uses next-generation paired-end
sequencing of cell-free DNA isolated from plasma to identify chromosomal
alterations characteristic of tumor DNA. Such alterations include copy
number changes (gains and losses of chromosome arms) as well as
rearrangements resulting from translocations, amplifications, or deletions.
`
`Analysis of chromosomal copy number changes
`Losses or gains of specific chromosomal regions are a hallmark of
`many cancers and have been used historically to identify tumor sup-
`pressors or oncogenes targeted by the alterations (20–22). Such chro-
`mosomal imbalances could be useful as markers of tumorigenesis
`because they should, in principle, alter the chromosomal representa-
`tion of circulating DNA. To adapt DK to detect tumor-specific (so-
`matic) chromosomal alterations in the plasma, we used the equivalent
`of one lane of HiSeq single-read sequence data per sample (average of
`144,543,191 distinct reads) and applied a number of filtering steps to
`remove sources of variation that were not tumor-specific (see Mate-
`rials and Methods). For example, we removed sequences that are known
`to vary in the germlines of normal individuals, because these could
confound identification of somatic copy number changes. In addition,
we applied a weight to each sequence read based on local GC content.
`This weighting has been shown to remove bias introduced by next-
`generation sequencing and allows for a more accurate assessment of
`chromosomal representation of the original genomic DNA (see Mate-
`rials and Methods) (23, 24). The resulting weighted reads were used to
`determine the proportion of reads that mapped to specific regions in
`the genome (fig. S1). We performed analyses of entire chromosomes,
`of chromosome arms, and of sequential regions of specified sizes (for
`example, 10 Mb) throughout the genome. Although each of these ap-
`proaches has certain advantages, we chose to analyze chromosome
`arms because these were frequently altered in breast and colorectal
`cancer samples previously analyzed for copy number alterations and
`would be expected to be altered in most human cancers (see Materials
`and Methods).
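
The GC weighting and the weighted read proportions described above can be illustrated with a minimal sketch. The source does not give the exact weighting formula, so the scheme below is an assumption: reads are grouped into GC-content bins, and each read is weighted by the mean number of reads per occupied bin divided by the number of reads in its own bin. The function names `gc_weights` and `arm_fractions` and the bin count are hypothetical.

```python
from collections import defaultdict


def gc_weights(read_gc, n_bins=20):
    """Return one weight per read so that each occupied GC bin carries equal total weight.

    read_gc: GC fraction (0.0 to 1.0) of each mapped read.
    Assumed scheme: weight = (mean reads per occupied bin) / (reads in this read's bin).
    """
    bins = [min(int(gc * n_bins), n_bins - 1) for gc in read_gc]
    counts = defaultdict(int)
    for b in bins:
        counts[b] += 1
    mean_count = len(read_gc) / len(counts)
    return [mean_count / counts[b] for b in bins]


def arm_fractions(read_arms, weights):
    """Weighted fraction of reads assigned to each chromosome arm."""
    totals = defaultdict(float)
    for arm, w in zip(read_arms, weights):
        totals[arm] += w
    grand_total = sum(totals.values())
    return {arm: t / grand_total for arm, t in totals.items()}


# Toy input: three reads in an over-represented GC bin, one in a rare bin.
weights = gc_weights([0.32, 0.32, 0.32, 0.62])
fractions = arm_fractions(["1p", "1p", "2q", "2q"], weights)
```

With this scheme the weights sum to the number of reads, so the total read count is preserved while over-represented GC bins are down-weighted.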
`The proportion of sequences that represented each chromosome
`arm (excluding short arms of acrocentric chromosomes) was calcu-
`lated, for each sample, by dividing the sum of the weighted reads map-
`ping to that arm by the total number of weighted reads mapping to
`the reference genome. For the normal samples, N1 to N10, the pro-
`portion of chromosomal arm sequences ranged from 0.46 to 6.19%,
`closely corresponding to the expected fraction based on genomic size
`and the applied mapping criteria (table S2) (R2 = 0.95; P < 0.0001,
`Pearson correlation). The variation among the normalized propor-
`tions of each chromosomal arm in the plasma from normal individ-
`uals was very low (average, 2.56 ± 0.0065%; range of SD, ±0.0025% to
±0.014%). These results are consistent with similar measurements of
circulating DNA from the plasma of pregnant women carrying euploid
fetuses (8, 9). In contrast, the normalized proportions of chromosomal
arm sequences in the plasma of cancer patients were much more variable,
ranging from 0.61- to 1.97-fold of the average found in the plasma of
normal individuals (table S2).

Table 1. Summary of next-generation sequencing analyses performed.
Data were obtained using next-generation sequencing analyses performed on
Illumina HiSeq instruments using 50-bp PE reads. Distinct paired reads
correspond to read pairs having unique genomic start sites. Sequence
coverage indicates average number of reads per base per haploid genome.
Physical coverage indicates average number of paired reads spanning any
base in a haploid genome assuming a 165-bp fragment size.

Sample     Patient            Sample   Total bases      Total distinct  Sequence  Physical
name       diagnosis          origin   sequenced        paired reads    coverage  coverage
CRC11      Colorectal cancer  Plasma   24,728,144,682   216,092,204      8.2      11.9
CRC12      Colorectal cancer  Plasma   25,707,029,400   221,881,499      8.6      12.2
CRC12-PT   Colorectal cancer  Tumor    14,984,097,228   113,414,859      5.0       6.2
CRC13      Colorectal cancer  Plasma   24,033,905,652   206,926,843      8.0      11.4
CRC14      Colorectal cancer  Plasma   23,774,411,124   201,571,426      7.9      11.1
CRC14-0    Colorectal cancer  Plasma    3,113,755,960    17,297,631      1.0       1.0
CRC14-4    Colorectal cancer  Plasma    3,755,921,750    21,193,176      1.3       1.2
CRC14-PT   Colorectal cancer  Tumor     7,156,542,105    55,882,387      2.4       3.1
CRC15      Colorectal cancer  Plasma   34,216,224,513   299,895,779     11.4      16.5
CRC15-PT   Colorectal cancer  Tumor    12,466,375,200   102,193,630      4.2       5.6
CRC16      Colorectal cancer  Plasma   32,670,584,037   284,131,684     10.9      15.6
CRC16-PT   Colorectal cancer  Tumor     6,199,959,150    41,700,493      2.1       2.3
CRC17      Colorectal cancer  Plasma   33,060,006,522   286,524,067     11.0      15.8
BR1        Breast cancer      Plasma   34,918,959,429   304,798,840     11.6      16.8
BR2        Breast cancer      Plasma   30,171,911,865   240,263,505     10.1      13.2
BR3        Breast cancer      Plasma   31,294,294,671   259,443,659     10.4      14.3
N1         Normal             Plasma   26,918,560,359   231,520,314      9.0      12.7
N2         Normal             Plasma   25,928,759,499   224,708,017      8.6      12.4
N3         Normal             Plasma   21,331,401,576   161,874,934      7.1       8.9
N4         Normal             Plasma   25,735,971,696   223,342,309      8.6      12.3
N5         Normal             Plasma   33,535,967,796   288,318,721     11.2      15.9
N6         Normal             Plasma   32,892,667,872   285,785,107     11.0      15.7
N7         Normal             Plasma   27,558,615,816   226,653,076      9.2      12.5
N8         Normal             Plasma   32,472,392,886   279,224,469     10.8      15.4
N9         Normal             Plasma   30,058,183,548   257,431,312     10.0      14.2
N10        Normal             Plasma   33,068,060,850   287,180,675     11.0      15.8
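
The two coverage columns of Table 1 follow from the definitions in its caption, and the arithmetic can be sketched as below. The 165-bp fragment size is taken from the caption. The haploid genome size of 3.0 × 10^9 bp is an assumption, since the source does not state the value used. The function names are hypothetical.

```python
HAPLOID_GENOME_BP = 3.0e9  # assumed haploid genome size; not stated in the source
FRAGMENT_BP = 165          # fragment size given in the Table 1 caption


def sequence_coverage(total_bases):
    """Average number of sequenced bases covering each base of a haploid genome."""
    return total_bases / HAPLOID_GENOME_BP


def physical_coverage(distinct_pairs):
    """Average number of paired-read fragments spanning each base of a haploid genome."""
    return distinct_pairs * FRAGMENT_BP / HAPLOID_GENOME_BP


# Values for plasma sample CRC11 as listed in Table 1.
crc11_sequence = round(sequence_coverage(24_728_144_682), 1)
crc11_physical = round(physical_coverage(216_092_204), 1)
```

Under the assumed genome size, these calls give 8.2 and 11.9 for CRC11, matching the values listed for that sample in Table 1.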
`To determine whether sequenced reads for an individual patient
`sample deviate from patterns in normal samples, we used the fraction
`of reads that mapped to each arm to calculate a z score. For each arm,
`the z score was calculated as the number of SDs from the mean of the
`reference plasma samples (N1 to N10). After Bonferroni correction for
`multiple comparisons of the 39 chromosomal arms, an absolute z
`score of ≥11.88 was determined to represent a statistically significant
`gain or loss of a chromosomal arm (P < 0.05, Student’s t test). All
`chromosome arms of the 10 normal plasma samples had absolute z
`scores of less than 2.62. In contrast, plasma samples from all 10 of the
cancer patients showed evidence of copy number gains or losses, with
the highest absolute z score in each sample ranging from 13.3 to 434.4
`(Fig. 2A).
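
The per-arm z score described above can be written as a short sketch. It assumes the sample SD (n − 1 denominator) of the reference plasma samples, which the text does not specify, and it computes only the Bonferroni-adjusted significance level rather than the z score threshold reported in the text. The function names and the toy fractions are hypothetical.

```python
import statistics


def arm_z_scores(sample, normals):
    """Number of SDs separating each arm fraction from the mean of the reference samples.

    sample:  {arm: fraction of weighted reads}
    normals: list of {arm: fraction} dictionaries, one per reference plasma sample
    """
    z = {}
    for arm, value in sample.items():
        reference = [n[arm] for n in normals]
        mean = statistics.mean(reference)
        sd = statistics.stdev(reference)  # sample SD; an assumption
        z[arm] = (value - mean) / sd
    return z


def bonferroni_alpha(alpha=0.05, n_tests=39):
    """Per-arm significance level after correcting for the 39 arms tested."""
    return alpha / n_tests


# Toy reference set with a single arm; the fractions are illustrative only.
reference_samples = [{"1p": 0.040}, {"1p": 0.041}, {"1p": 0.042}]
z_scores = arm_z_scores({"1p": 0.046}, reference_samples)
```

A positive z score in this sketch corresponds to a gain of the arm and a negative z score to a loss, as in Fig. 2A.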
`Although such analyses could be used to evaluate specific chromo-
`somal arms, a statistical approach that uses a combination of the
`most markedly altered chromosome arms in each sample would be
`expected to provide a more sensitive measure of circulating tumor
`DNA. We analyzed previously obtained genome-wide copy number
`alterations detected from single-nucleotide polymorphism (SNP) ar-
`rays of 36 colorectal cancer samples (25) to determine how frequently
`tumors lost multiple chromosome arms. As shown in fig. S2, we found
`that the mean number of chromosome arms altered in these colorectal
`cancers was 21 and ranged from 5 to 35. Accordingly, we constructed
`a log-scale plasma aneuploidy score (PA score) based on the five chro-
`mosomes whose arms had the highest absolute z scores (see Materials
`and Methods). The PA score from the plasma of healthy individuals
`ranged from 0.1 to 2.4, and we calculated that a threshold PA score of
`5.84 would provide a specificity of >99% (Student’s t distribution) for
`indicating aneuploidy (Fig. 2B). All plasma samples from the colorec-
`tal and breast cancer patients had PA scores above this threshold, rang-
`ing from 11.9 to 41.5 (Fig. 2B and tables S1 and S2). The two plasma
`samples with the lowest PA scores represented
`those with the lowest amounts of circulating tu-
`mor DNA, and the PA score generally correlated
`with tumor burden (R2 = 0.53; P = 0.017, Pearson
`correlation) (Fig. 2B, table S2, and Materials and
`Methods).
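
One possible reading of the PA score construction is sketched below: sum the −log P values of the five largest absolute z scores, then express that sum in SDs from the mean of the same sum in the reference samples. Several details are assumptions and not the authors' stated method: P values come from a two-sided normal tail (via `math.erfc`) in place of Student's t distribution, the logarithm is base 10, the five largest arm-level z scores are used, and extreme P values are clamped to avoid taking the log of zero. All names and inputs are hypothetical.

```python
import math
import statistics


def neg_log_p_sum(z_scores, top_n=5):
    """Sum of -log10(two-sided P) over the top_n largest absolute z scores."""
    top = sorted((abs(z) for z in z_scores), reverse=True)[:top_n]
    total = 0.0
    for z in top:
        p = math.erfc(z / math.sqrt(2))        # normal approximation (assumption)
        total += -math.log10(max(p, 1e-300))   # clamp so extreme z cannot give log(0)
    return total


def pa_score(sample_z, normal_z_sets, top_n=5):
    """Number of SDs separating the sample's sum from the mean of the reference sums."""
    reference = [neg_log_p_sum(z, top_n) for z in normal_z_sets]
    sample_sum = neg_log_p_sum(sample_z, top_n)
    return (sample_sum - statistics.mean(reference)) / statistics.stdev(reference)


# Illustrative z scores only; these are not data from the study.
normal_sets = [
    [0.5, -1.2, 0.3, 2.0, -0.7, 1.1],
    [1.5, 0.2, -0.4, 0.9, -2.1, 0.6],
    [-0.3, 0.8, 1.9, -1.4, 0.1, 0.5],
]
tumor_like = [15.0, -20.0, 12.0, 3.0, -9.0, 1.0]
score = pa_score(tumor_like, normal_sets)
```

Because five arms contribute to the sum, a sample with several moderately altered arms can score higher than one with a single strongly altered arm, which is the motivation given in the text for combining arms.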
`
Analysis of rearrangements
The chromosomal instability that underlies large chromosomal gains and
losses in tumorigenesis is associated with genomic rearrangements. Such
somatic rearrangements are not present in normal cells in a clonal fashion
and would therefore be expected to provide a highly sensitive and specific
marker for the presence of clonal tumor-specific genetic alterations. We
previously developed a technique, personalized analysis of rearranged ends
(PARE), to identify rearranged breakpoints from tumor DNA for individual
patients (see Materials and Methods). A challenge in adapting PARE to
detection of rearrangements directly from plasma DNA is distinguishing the
relatively few somatic rearrangements present in circulating tumor DNA from
the much larger number of structural variants resulting from copy number
variations in the germline of all individuals. To overcome this obstacle,
we used bioinformatic filters that enriched for high-confidence somatic
structural alterations while removing germline and artifactual changes.
These filters included selecting paired-end reads that (i) mapped to
different chromosomes or to the same chromosome but at large distances
(≥30 kb) apart, (ii) spanned rearrangement junctions that were observed in
multiple reads, (iii) contained sequenced rearrangement breakpoints, and
(iv) mapped to genomic regions that did not contain known germline copy
number variants or repeated sequences (26, 27) (fig. S1).
Paired-end Illumina sequence data for DNA in plasma samples from the 10
cancer patients and 10 healthy individuals (table S1) revealed a total of
65,402,563 aberrantly mapped paired-end reads, most of which were expected
to result from either germline changes or mapping artifacts (26, 27).
Application of the criteria described above identified 14 candidate
rearrangements in 9 of the 10 plasma samples from cancer patients but none
in the plasma samples from healthy individuals (Fig. 3). These rearranged
sequences were evaluated further by polymerase chain reaction (PCR)
amplifications across the rearrangement junctions in tumor and normal DNA
from the same nine cancer patients, and all were confirmed to be present in
the tumor samples but not in the matched normal DNA. Independent sequencing
of the rearranged regions identified the expected rearrangement junctions
in all nine cases analyzed. We further evaluated the specificity of the
approach by analyzing more than 5.6 billion paired-end Illumina reads of
normal lymphocyte DNA from 28 individuals (see table S3 and Materials and
Methods). These analyses did not identify any candidate rearrangements,
providing further evidence that the approach is highly specific to
tumor-derived structural alterations.

Fig. 2. Copy number analysis of plasma samples. (A) The z scores for each
chromosome arm indicate the number of SDs from the mean of the mapped read
fraction of the plasma DNA from unaffected individuals (N1 to N10).
Positive z scores indicate chromosome gains, whereas negative z scores
indicate chromosome losses. Significant chromosome arm gains and losses
were observed only in plasma samples from patients with cancer (CRC11 to
CRC17 and BR1 to BR3). (B) The PA score was calculated as the number of SDs
from the mean of the sum of the −log of the P values for the top five
chromosome z scores of the 10 reference samples (N1 to N10). A PA score of
5.84 (horizontal line) was estimated to indicate aneuploidy in the plasma
sample at a specificity greater than 99% (Student’s t distribution) (see
Materials and Methods).

Fig. 3. Detection of tumor-specific rearrangements in plasma samples. The
Circos plot at the top indicates the rearrangements identified in plasma
samples from cancer patients (CRC11 to CRC17 and BR1 to BR3). The type and
individual boundaries of the rearrangements are indicated in the lower
table. No rearrangements were identified in plasma samples from unaffected
individuals (N1 to N10). Rearrangements listed for sample CRC12 were
identified in tumor DNA and confirmed in patient plasma, whereas those
listed for all other samples were identified directly from patient plasma.
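
The four paired-end filters listed in the rearrangement analysis, (i) to (iv), can be sketched as a small filtering routine. Only the 30-kb distance cutoff comes from the text. The minimum support of two read pairs, the 1-kb binning used to group pairs into one junction, the `ReadPair` record, and the `in_excluded_region` callback are illustrative assumptions, and the coordinates in the example are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

MIN_DISTANCE = 30_000  # from the text: same-chromosome mates at least 30 kb apart
MIN_SUPPORT = 2        # assumption: "multiple reads" read as at least two pairs
BIN = 1_000            # assumption: pairs sharing both 1-kb bins form one junction


@dataclass(frozen=True)
class ReadPair:
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    spans_breakpoint: bool = False  # True if a read crosses the junction itself


def is_aberrant(pair):
    """Filter (i): mates on different chromosomes or far apart on the same one."""
    return pair.chrom1 != pair.chrom2 or abs(pair.pos1 - pair.pos2) >= MIN_DISTANCE


def junction_key(pair):
    return (pair.chrom1, pair.pos1 // BIN, pair.chrom2, pair.pos2 // BIN)


def candidate_junctions(pairs, in_excluded_region):
    """Apply filters (i) to (iv); in_excluded_region(chrom, pos) flags known CNVs or repeats."""
    kept = [
        p for p in pairs
        if is_aberrant(p)
        and not in_excluded_region(p.chrom1, p.pos1)   # filter (iv)
        and not in_excluded_region(p.chrom2, p.pos2)
    ]
    support = Counter(junction_key(p) for p in kept)                        # filter (ii)
    with_breakpoint = {junction_key(p) for p in kept if p.spans_breakpoint}  # filter (iii)
    return sorted(k for k, n in support.items()
                  if n >= MIN_SUPPORT and k in with_breakpoint)


example_pairs = [
    ReadPair("chr17", 37_850_100, "chr8", 128_750_200, True),
    ReadPair("chr17", 37_850_400, "chr8", 128_750_600),
    ReadPair("chr1", 1_000_000, "chr1", 1_000_300),       # normal insert size
    ReadPair("chr2", 5_000_000, "chr2", 9_000_000, True),  # only one supporting pair
]
junctions = candidate_junctions(example_pairs, lambda chrom, pos: False)
```

Fixed-width binning is a simplification: pairs that straddle a bin edge are counted as separate junctions, so a production implementation would more likely cluster pairs by distance.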
`Two of the rearrangement regions were associated with amplified
`genes known to be drivers of cancer development (Fig. 3). The chromo-
`somal rearrangement identified in the CRC16 plasma sample corre-
`sponded to a breakpoint resulting from the amplification of the genetic
`locus that contains the ERBB2 gene, which encodes HER2/neu, the tar-
`get of trastuzumab (Herceptin) (Fig. 3 and fig. S3). The level of ampli-
`fication in the plasma was estimated using DK to be 10.5-fold higher
`than that of plasma from a healthy individual. In addition, two of the
`four candidate rearrangements detected in plasma sample BR1 were asso-
`ciated with amplification of the cell cycle regulatory gene cyclin-dependent
`kinase 6 (CDK6) (Fig. 3), where the level of amplification in the plasma was
`estimated using DK to be 6.5-fold. Inhibition of the CDK6 protein is
`currently being evaluated in clinical trials for breast and other cancer
types (ClinicalTrials.gov identifier NCT01320592). These analyses show
`that amplified genes can be identified through detection of amplification-
`associated rearrangements by direct sequencing in the plasma.
`Rearrangements detected directly in patient plasma may be used to
`develop PCR-based breakpoint-specific biomarkers for the analysis
`of circulating tumor DNA levels in plasma samples. Such breakpoint
biomarkers as identified by PARE could be useful for providing a measure
of circulating tumor DNA at the time of detection or for quantitative
monitoring during therapy. The nine patient plasma samples with detected
rearrangements had concentrations of circulating mutant DNA ranging from
4.7 to 47.9%, as determined by digital PCR with a PARE biomarker or as
estimated using chromosomal representation (see Materials and Methods).
The absence of detected rearrangements in the plasma of CRC12 could be
`a result either of the absence of structural alterations in the tumor
`DNA or of the failure to detect such rearrangements in the plasma
`DNA. To distinguish between these possibilities, we searched for re-
`arrangements in the DNA of this patient’s tumor using whole-genome
`sequencing (Table 1). We identified three rearrangements and showed
`that rearranged sequences were indeed present in the plasma DNA, the
`matching tumor, and the CRC12 plasma DNA library with PCR using
`primers that spanned the rearranged sequences. Using digital PCR, we
`estimated that the fraction of circulating mutant DNA in the plasma
`of this patient was 1.4%. These analyses suggested that obtaining ad-
`ditional sequence data would likely have identified the tumor rear-
`rangements in this plasma sample.
`
`Sensitivity of detection
`As shown above, the sequence information obtained from circulating
`plasma DNA can be analyzed in an integrated fashion to obtain a com-
`prehensive analysis of chromosome content and rearrangements of
`the same sample. For cases CRC14, CRC15, and CRC16, multiple sam-
`ples, including plasma and the primary tumor, were available and
`could be directly examined for chromosomal abnormalities (6). These
`analyses allowed us to evaluate similarities between copy number al-
`terations in plasma and the primary tumor and the sensitivity of de-
`tecting alterations in plasma during disease progression. For CRC15
`and CRC16, we analyzed primary tumor samples and plasma from
`33 and 50 months after surgery, respectively. For CRC14, the analyzed
`samples included plasma at the time of initial evaluation (CRC14-0),
tumor tissue obtained from surgical resection 1 week later (CRC14-PT),
plasma from 4 months after diagnosis subsequent to chemotherapy
`apy and resection of a metastatic lesion (CRC14-4), and plasma from
`62 months after diagnosis at which time the tumor had recurred
`(CRC14). The analyses were normalized to the amount of sequence
`data obtained, and chromosomal representation analyses using DK
`of tumor and plasma DNA samples are shown in Fig. 4 and fig. S3.
`The copy number patterns observed for plasma samples at the time
`of initial evaluation (CRC14-0) and recurrence (CRC14) were marked-
`ly similar to those of the resected tumor (CRC14-PT), with significant
`losses of chromosomes 1p, 4q, 14q, and 22q and gains of chromosomes
`13q and 20q (for each alteration, P < 0.05, Student’s t test) (Fig. 4).
`Likewise, similar patterns of chromosomal alterations between plasma
`samples and primary tumors were observed for CRC15 and CRC16
`(table S2 and fig. S3). Overall, for samples for which tumors were also
`analyzed using our stringent bioinformatic criteria (CRC12, CRC14,
`CRC15, and CRC16), most of the detected structural alterations in the
`tumors were independently identified in the plasma (67 of 125, tables S2
`and S4). For plasma sample CRC14-4, although the copy number
`graphs appeared similar to those derived from normal DNA, there
`was still significant alteration in chromosomal arm content (PA score =
`6.6) (table S4 and Fig. 4). The fraction of mutant tumor DNA in CRC14-4
`was previously measured using a PARE biomarker to be 0.3% (6),
`consistent with the predicted sensitivity of the approach using the avail-
`able sequencing data (see Materials and Methods).
`To evaluate the potential sensitivity and specificity of applying the
`DK approach to cell-free DNA for discriminating individuals with co-
`lorectal and breast cancer from healthy individuals, we performed
`receiver operating characteristic (ROC) analyses of simulated next-
`generation sequencing data from 81 cancer patients and 10,000 simu-
`lated normal controls based on data from 10 healthy individuals. For
`the 81 tumor cases, chromosomal arm alterations were determined using
`previously obtained genome-wide copy number information from SNP
`arrays of 36 colorectal cancers and 45 breast cancers (25). These analy-
`ses simulated mixtures of different concentrations of each tumor DNA
`with normal DNA (as would be expected in the circulation of cancer
`patients) using the experimentally observed means and SDs for each
`chromosome arm proportion for unaffected individuals (N1 to N10)
`(table S2, see Materials and Methods). Using the equivalent of one
`lane of HiSeq reads, these analyses suggested that tumor DNA concen-
`trations at levels ≥0.75% could be detected in the circulation of pa-
`tients with breast and colorectal cancers with a sensitivity of >90% and
`specificity of >99% when the five chromosome arms with the largest ab-
`solute z scores were evaluated (Fig. 5). Analyses of the single most altered
`chromosome arm (17p) were much less sensitive, and a specificity of
`>99% could only be achieved with circulating tumor DNA concentra-
`tions of 5% or more. This single chromosome arm sensitivity is in
`accord with the results of previously described approaches for the de-
`tection of fetal trisomy 21 in maternal DNA (8, 9) (Fig. 5).
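
The effect of mixing tumor DNA with normal DNA on chromosome arm proportions can be illustrated with a simple expected-value model. This is a sketch under stated assumptions and not the simulation procedure of the study: normal DNA is taken as diploid for every arm, the tumor fraction is expressed in genome equivalents, and sampling noise is ignored so that only the expected shift is computed. All names and numbers are hypothetical.

```python
def mixed_arm_fractions(normal_fraction, tumor_copies, tumor_fraction):
    """Expected arm fractions when tumor genomes make up tumor_fraction of plasma DNA.

    normal_fraction: {arm: mean fraction of reads in healthy plasma}
    tumor_copies:    {arm: copy number in the tumor}; unlisted arms are assumed diploid
    """
    raw = {}
    for arm, fraction in normal_fraction.items():
        copies = tumor_copies.get(arm, 2)
        raw[arm] = fraction * ((1 - tumor_fraction) + tumor_fraction * copies / 2)
    total = sum(raw.values())
    return {arm: value / total for arm, value in raw.items()}


def expected_z(arm, normal_fraction, normal_sd, tumor_copies, tumor_fraction):
    """Expected z score of one arm for a given tumor DNA fraction."""
    mixed = mixed_arm_fractions(normal_fraction, tumor_copies, tumor_fraction)
    return (mixed[arm] - normal_fraction[arm]) / normal_sd[arm]


# Two-arm toy genome: arm "a" is gained (3 copies) in the tumor.
normal = {"a": 0.5, "b": 0.5}
sd = {"a": 0.001, "b": 0.001}
mixed = mixed_arm_fractions(normal, {"a": 3}, 0.10)
z_gain = expected_z("a", normal, sd, {"a": 3}, 0.10)
```

In this model the shift in an arm's proportion grows roughly linearly with the tumor fraction, which is why low tumor fractions require either very small SDs in the reference samples or evidence combined across several arms.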
To determine the potential of this approach for detecting circulating
tumor DNA at levels below 0.75%, we used simulations to predict the
amount of sequencing required to achieve >90% sensitivity and >99%
specificity using both copy number and rearrangement analyses (fig. S4).
These simulations showed that the lower limit of detection for chromosomal
arm gains or losses decreased in proportion to one over the square root of
the number of reads obtained, similar to that expected from previous
analyses of circulating fetal DNA (8, 9) (see Materials and Methods). On
the other hand, the lower limit of detection of rearranged sequences
decreased in proportion to one over the total number of reads obtained.
This suggests that the sensitivity of PARE for detecting very low levels of
circulating tumor DNA is higher than that of DK when assessing a large
number of reads (>10^9). One advantage of an integrated approach using both
methods is that the overall sensitivity of the combined approaches would be
expected to match the more sensitive of the two at any given sequence
depth.

Fig. 4. Copy number analyses of tumor and serial plasma samples from
patient CRC14. CRC14 primary tumor and plasma samples taken at various
time points over 62 months of multimodality treatment were analyzed using
DK in nonoverlapping 1-Mb windows and compared with unmatched normal plasma
using the same methodology. The plasma samples were obtained at the time of
initial evaluation (0 months), after extensive chemotherapy and surgical
intervention (4 months), and at the time of cancer recurrence (62 months).
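
The two scaling relationships from the sensitivity simulations can be compared with a short sketch. It assumes a known detection limit at some reference read count and extrapolates with the stated one-over-square-root (copy number) and one-over-reads (rearrangement) dependences. The function names are hypothetical, and the reference values in the example are placeholders rather than figures from the study.

```python
import math


def dk_limit(ref_limit, ref_reads, reads):
    """Copy number (DK) detection limit, scaling as one over the square root of reads."""
    return ref_limit * math.sqrt(ref_reads / reads)


def pare_limit(ref_limit, ref_reads, reads):
    """Rearrangement (PARE) detection limit, scaling as one over the number of reads."""
    return ref_limit * ref_reads / reads


# Placeholder reference point: a 0.75% limit at 1e8 reads, extrapolated to 4e8 reads.
dk_at_4x = dk_limit(0.75, 1e8, 4e8)
pare_at_4x = pare_limit(0.75, 1e8, 4e8)
```

Quadrupling the reads halves the copy number limit but quarters the rearrangement limit, so the rearrangement analysis gains more from additional sequencing depth.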
`
Fig. 5. Detection of circulating tumor DNA in breast and colon cancers
using simulated copy number analyses. ROC analyses of simulated mixtures of
breast cancer DNA (left) or colorectal cancer DNA (right) with normal
plasma DNA using the PA score derived from the five chromosomal arm copy
number alterations with the highest absolute z scores in each sample.
Detection of 0.75% circulating tumor DNA could be achieved with a
sensitivity of >90% and specificity of >99% using the equivalent of one
HiSeq lane of sequencing and a fixed PA score threshold in both tumor types
(see Materials and Methods). ROC analyses of a z score from a single
chromosome arm, 17p, were similar to chance alone at this simulated tumor
DNA concentration in the plasma.

DISCUSSION

In this study, we have demonstrated the feasibility of directly detecting
chromosomal alterations in the plasma of cancer patients. Like many
large-scale genomic analyses, our approach has limitations. First, sensitivity is
largely dependent on the amount of sequence data obtained. Previous
studies of circulating tumor DNA have shown that a detection limit of
<0.10% is often needed to detect patients with potentially curable tumors
(17, 18). Currently, the cost of the sequencing necessary for detection
of rearrangements at this level is prohibitive for routine clinical
implementation. Detection of chromosomal copy number changes requires
less sequencing and has been shown to be feasible at levels of 0.75% in
`this study. Next-generation methods aimed at detection of somatic al-
`terations in known driver genes require substantially less sequencing
`but are limited to patients with alterations in the analyzed regions
`(19). If sequencing technologies continue to improve at their current
`pace (28), the amount of sequencing needed for detection of whole-
`genome structural alterations will soon become affordable. Although
`substantial clinical studies will be needed to determine the use for
`early-stage disease as well as for direct genotyping of specific struc-
`tural alterations, detection of medium- to late-stage tumors, such as
`those analyzed in this study, may provide clinical benefit for certain
`tumor types (29).
`Second, the stringent approaches used to enrich for bona fide so-
`matic alterations may fail to detect certain structural alterations (for
`example, small rearrangements or copy number changes), thereby un-
`derestimating the number of total alterations present in a sample.
`Analysis of a larger number of normal DNA samples combined with
`deeper sequencing may allow for a more comprehensive detection of
`somatic alterations in the plasma. Additional development of meth-
`ods for detection of point mutations in cell-free DNA may provide a
`complementary approach for detecting disease in a subset of patients
`(17–19). Third, previously undetected constitutional germline or mo-
`saic structural alterations along with mapping or sequence artifacts
`could lead to false positives in individual patients (30–32). Performing
`a comparative sequence analysis of plasma to another normal tissue
`(for example, buccal, skin, or lymphocyte DNA) from the same indi-
`vidual could help minimize this issue by removal of such alterations.
`Fourth, the information obtained through these analyses does not di-
`rectly indicate the source of circulating tumor DNA. Further clinical
`evaluation combined with imaging studies will be helpful to determine
`the tumor location and subsequent interventions.
`Despite these limitations, the combination of these methods has
`the potential to detect cancers in a noninvasive, specific, and unbiased
`manner. Direct identification of amplified genes in patient plasma
`may provide information for potential therapeutic targets without
`the need for tumor biopsies. Given the major contribution to morbid-
`ity and mortality caused by delayed diagnosis of primary or recurrent
`tumors, the approach described here, combined with further advances
`in sequencing technologies, has the potential to improve patient man-
`agement and outcomes.
`
`MATERIALS AND METHODS
`
`Sample collection and preparation
`Plasma samples were collected from cancer patients CRC11 to CRC17,
`BR1 to BR3 (4 to 17 ml each), and unrelated normal controls N1 to N10
`(5 to 18 ml each). Matching tumor DNA samples were obtained from
`patients’ formalin-fixed paraffin-embedded (FFPE) surgically resected
`tumor. Normal DNA samples were obtained from either matched lym-
`phocytes or matched normal FFPE tissue obtained at the time of surgery.
`Whole-genome sequence data from normal lymphocytes of 28 individ-
`uals with neuroblastoma or pancreatic cancer were obtained from previ-
`ous studies (33). Genotype and focal amplification data from 36 colorectal
`and 45 breast cancer samples, representing early-passage cell lines or
`xenografts established from patients with late-stage disease, were ob-
`tained from previous studies (25). All samples were obtained in accord-
`ance with the Health Insurance Portability and Accountability Act.
`
no. 28104, Qiagen), and eluted with 45 µl of elution buffer (EB) prewarmed
to 70°C. I