`sequencing DNA from maternal blood
`
`H. Christina Fan*, Yair J. Blumenfeld†, Usha Chitkara†, Louanne Hudgins‡, and Stephen R. Quake*§
`
`*Department of Bioengineering, Stanford University and Howard Hughes Medical Institute, 318 Campus Drive, Clark Center, Room E300, Stanford, CA
`94305; †Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Stanford University, 300 Pasteur Drive, Room HH333, Stanford, CA
`94305; and ‡Division of Medical Genetics, Department of Pediatrics, Stanford University, 300 Pasteur Drive, Stanford, CA 94305
`
`Communicated by Leonard A. Herzenberg, Stanford University School of Medicine, Stanford, CA, August 22, 2008 (received for review July 13, 2008)
`
`We directly sequenced cell-free DNA with high-throughput shotgun
`sequencing technology from plasma of pregnant women, obtaining,
`on average, 5 million sequence tags per patient sample. This enabled
`us to measure the over- and underrepresentation of chromosomes
`from an aneuploid fetus. The sequencing approach is polymorphism-
`independent and therefore universally applicable for the noninvasive
`detection of fetal aneuploidy. Using this method, we successfully
`identified all nine cases of trisomy 21 (Down syndrome), two cases of
trisomy 18 (Edwards syndrome), and one case of trisomy 13 (Patau
`syndrome) in a cohort of 18 normal and aneuploid pregnancies;
`trisomy was detected at gestational ages as early as the 14th week.
`Direct sequencing also allowed us to study the characteristics of
`cell-free plasma DNA, and we found evidence that this DNA is
`enriched for sequences from nucleosomes.
`
fetal DNA | next-generation sequencing | noninvasive prenatal diagnosis |
Down syndrome | trisomy
`
Fetal aneuploidy and other chromosomal aberrations affect 9 of 1,000
live births (1). The gold standard for diagnosing chromosomal
abnormalities is karyotyping of fetal cells obtained via
`invasive procedures such as chorionic villus sampling and amnio-
`centesis. These procedures impose small but potentially significant
`risks to both the fetus and the mother (2). Noninvasive screening
`of fetal aneuploidy using maternal serum markers and ultrasound
is available but has limited reliability (3–5). There is therefore a
`desire to develop noninvasive genetic tests for fetal chromosomal
`abnormalities.
`Since the discovery of intact fetal cells in maternal blood, there
`has been intense interest in trying to use them as a diagnostic
`window into fetal genetics (6–9). Although this has not yet moved
`into practical application (10), the later discovery that significant
`amounts of cell-free fetal nucleic acids also exist in maternal
`circulation has led to the development of new noninvasive prenatal
`genetic tests for a variety of traits (11, 12). However, measuring
`aneuploidy remains challenging because of the high background of
maternal DNA; fetal DNA often constitutes <10% of total DNA
`in maternal cell-free plasma (13). Recently developed methods for
`aneuploidy detection focus on allelic variation between the mother
`and the fetus. Lo et al. (14) demonstrated that allelic ratios of
`placental-specific mRNA in maternal plasma could be used to
`detect trisomy 21 (T21) in certain populations. Similarly, they also
`showed the use of allelic ratios of imprinted genes in maternal
`plasma DNA to diagnose trisomy 18 (T18) (15). Dhallan et al. (16)
`used fetal-specific alleles in maternal plasma DNA to detect trisomy
`21. However, these methods are limited to specific populations
`because they depend on the presence of genetic polymorphisms at
`specific loci. We and others argued that it should be possible, in
`principle, to use digital PCR to create a universal, polymorphism-
`independent test for fetal aneuploidy by using maternal plasma
`DNA (17–19), but because of technical challenges relating to the
`low fraction of fetal DNA, such a test has not yet been practically
`realized.
`An alternative method to achieve digital quantification of DNA
is direct shotgun sequencing, followed by mapping to the chromosome
of origin and enumeration of fragments per chromosome.
`Recent advances in DNA-sequencing technology allow massively
`parallel sequencing (20), producing tens of millions of short se-
`quence tags in a single run and enabling a deeper sampling than can
`be achieved by digital PCR. By counting the number of sequence
`tags mapped to each chromosome, the over- or underrepresenta-
`tion of any chromosome in maternal plasma DNA contributed by
`an aneuploid fetus can be detected. This method does not require
`the differentiation of fetal versus maternal DNA, and with large
`enough tag counts, it can be applied to arbitrarily small fractions of
`fetal DNA. We demonstrate here the successful use of shotgun
`sequencing to detect fetal trisomy 21 (Down syndrome), trisomy 18
(Edwards syndrome), and trisomy 13 (T13) (Patau syndrome)
`noninvasively by using cell-free fetal DNA in maternal plasma. This
`forms the basis of a universal, polymorphism-independent nonin-
`vasive diagnostic test for fetal aneuploidy. The sequence data also
`allowed us to characterize plasma DNA in unprecedented detail,
`suggesting that it is enriched for nucleosome-bound fragments.
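The claim that large enough tag counts can resolve small fetal fractions can be illustrated with a back-of-the-envelope counting argument. The sketch below is not the analysis used in this study. It assumes that the tag count on a chromosome is Poisson distributed, that a trisomic fetus at fetal fraction eps raises that count by a factor of (1 + eps/2), and that detection requires the excess to exceed z standard deviations. The function names and the default z = 3 are illustrative choices, and GC bias and noise in the reference value are ignored.

```python
import math


def min_detectable_fetal_fraction(tags_on_chromosome: int, z: float = 3.0) -> float:
    """Smallest fetal fraction whose trisomy excess (eps / 2) exceeds z Poisson SDs."""
    # For a Poisson count N, the relative standard deviation is 1 / sqrt(N).
    relative_noise = 1.0 / math.sqrt(tags_on_chromosome)
    return 2.0 * z * relative_noise


def tags_needed(fetal_fraction: float, z: float = 3.0) -> int:
    """Tags required on the chromosome so that eps / 2 >= z / sqrt(N)."""
    return math.ceil((2.0 * z / fetal_fraction) ** 2)


# Illustrative call: tag count of the order reported for chromosome 21.
print(round(min_detectable_fetal_fraction(65_700), 4))
```

Under these simplifying assumptions, a tag count of about 65,700 on chromosome 21 corresponds to a minimum detectable fetal fraction of a little over 2%, and halving the target fraction requires roughly four times as many tags.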
`
`Results
`Shotgun Sequencing of Cell-Free Plasma DNA. Cell-free plasma DNA
`from 18 pregnant women and a male donor, as well as whole-blood
`genomic DNA from the same male donor, were sequenced on the
Solexa/Illumina platform. We obtained on average ≈10 million
25-bp sequence tags per sample. Approximately 50% (i.e., ≈5
million) of the reads mapped uniquely to the human genome with,
at most, one mismatch against the human genome, covering ≈4%
of the entire genome. An average of ≈154,000, ≈135,000, and
≈65,700 sequence tags mapped to chromosomes 13, 18, and 21,
`respectively. The number of sequence tags for each sample is
`detailed in supporting information (SI) Table S1.
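The ≈4% coverage figure follows from simple arithmetic on the read counts. The snippet below reproduces it under two assumptions that are not stated in the text: a haploid genome size of about 3.1 × 10^9 bp, and no overlap between uniquely mapped reads (so the result is an upper bound). The function name is illustrative.

```python
def genome_coverage_fraction(n_reads: float, read_length_bp: int,
                             genome_size_bp: float = 3.1e9) -> float:
    """Upper bound on the fraction of the genome covered by non-overlapping reads."""
    return n_reads * read_length_bp / genome_size_bp


# About 5 million uniquely mapped 25-bp tags per sample.
coverage = genome_coverage_fraction(5e6, 25)
print(f"{coverage:.1%}")  # about 4%
```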
`We observed a nonuniform distribution of sequence tags across
`each chromosome. This pattern of intrachromosomal variation was
`common among all samples, including randomly sheared genomic
`DNA, indicating that the observed variation was most probably due
`to sequencing artifacts. We applied a sliding window of 50 kb across
`each chromosome and counted the number of tags falling within
`each window. The median count per 50-kb window for each
`chromosome was selected. The median of the autosomal values was
`used as a normalization constant to account for the differences in
`
`Author contributions: H.C.F., Y.J.B., U.C., L.H., and S.R.Q. designed research; H.C.F. per-
`formed research; H.C.F. analyzed data; Y.J.B., U.C., and L.H. designed the IRB-approved
`clinical protocol and coordinated patient recruitment and enrollment; and H.C.F., Y.J.B.,
`and S.R.Q. wrote the paper.
`
`Conflict of interest statement: S.R.Q. is a founder, shareholder, and consultant of Fluidigm
`Corporation. S.R.Q. and H.C.F. have applied for a patent relating to the method described
`in this study. Other authors declare no conflict of interest.
`
`Freely available online through the PNAS open access option.
`
`Data deposition: Sequence data have been deposited at the National Center for Biotech-
`nology Information short read archive (www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi), acces-
`sion no. SRA001174.
`§To whom correspondence should be addressed. E-mail: quake@stanford.edu.
`
`This article contains supporting information online at www.pnas.org/cgi/content/full/
`0808319105/DCSupplemental.
`
`© 2008 by The National Academy of Sciences of the USA
`
16266–16271 | PNAS | October 21, 2008 | vol. 105 | no. 42

www.pnas.org/cgi/doi/10.1073/pnas.0808319105
`
`Downloaded by guest on January 21, 2022
`
`
`
`
Fig. 1. Fetal aneuploidy is detectable by the overrepresentation of the
affected chromosome in maternal blood. (A) Sequence tag density relative
to the corresponding value of genomic DNA control; chromosomes are
ordered by increasing GC content. (B) Chromosome 21 sequence tag density
relative to the median chromosome 21 sequence tag density of the normal
cases. Note that the values of three disomy 21 cases overlap at 1.0. The
dashed line represents the upper boundary of the 99% confidence interval
constructed from all disomy 21 samples. Number of disomy 21 samples = 9.
Number of trisomy 21 samples = 9.
[Plot not reproduced. Panel A y-axis: sequence tag density relative to
the corresponding value of gDNA control. Panel B y-axis: sequence tag
density of chromosome 21 relative to the median value of disomy 21
cases; legend: trisomy 21 fetuses, disomy 21 fetuses, adult male plasma
DNA.]
`
`total number of sequence tags obtained for different samples.
`(From this point forward, ‘‘sequence tag density’’ refers to the
`normalized value and is used for comparing different samples and
`for subsequent analysis). The interchromosomal variation within
`each sample was also consistent among all samples (including
`genomic DNA control). The mean sequence tag density of each
`chromosome correlates with the GC content of the chromosome
(P < 10⁻⁹) (Fig. S1 A and B). The standard deviation of sequence
tag density for each chromosome also correlates with the absolute
degree of deviation in chromosomal GC content from the genome-wide
GC content (P < 10⁻¹²) (Fig. S1 A and C). The GC content
of sequenced tags of all samples (including the genomic DNA
control) was, on average, ≈10% higher than the value of the
`sequenced human genome (41%) (21) (Table S1), suggesting that
`there is a strong GC bias stemming from the sequencing process.
`We plotted in Fig. 1A the sequence tag density for each chromo-
`some (ordered by increasing GC content) relative to the corre-
`sponding value of the genomic DNA control to remove such bias.
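The normalization described above (median count per 50-kb window for each chromosome, divided by the median of the autosomal values) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes non-overlapping 50-kb bins in place of a sliding window, takes tag positions as plain integer coordinates per chromosome, and uses hypothetical function names.

```python
from collections import Counter
from statistics import median

WINDOW_BP = 50_000  # window size given in the text


def median_window_count(tag_positions, chrom_length, window_bp=WINDOW_BP):
    """Median number of tags per window along one chromosome."""
    n_windows = max(1, chrom_length // window_bp)
    # Tags beyond the last full window are folded into that window.
    counts = Counter(min(pos // window_bp, n_windows - 1) for pos in tag_positions)
    return median(counts.get(i, 0) for i in range(n_windows))


def sequence_tag_density(tags_by_chrom, chrom_lengths, autosomes):
    """Per-chromosome median window count divided by the median autosomal value."""
    medians = {c: median_window_count(tags, chrom_lengths[c])
               for c, tags in tags_by_chrom.items()}
    norm = median(medians[c] for c in autosomes)
    return {c: m / norm for c, m in medians.items()}


def relative_to_control(density, control_density):
    """Divide each chromosome's density by the control value, as plotted in Fig. 1A."""
    return {c: density[c] / control_density[c] for c in density}
```

Using the median rather than the mean of the window counts makes the per-chromosome value less sensitive to the few windows with extreme tag counts, and dividing by the genomic DNA control cancels bias that is shared by all samples.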
`
`Detection of Fetal Aneuploidy. The distribution of chromosome 21
sequence tag density for all nine T21 pregnancies is clearly separated
from that of pregnancies bearing disomy 21 fetuses (P < 10⁻⁵,
Student’s t test) (Fig. 1 A and B). The coverage of chromosome 21
for T21 cases is ≈4–18% higher (average ≈11%) than that of the
disomy 21 cases. Because the sequence tag density of chromosome
21 for T21 cases should be (1 + ε/2) of that of disomy 21
pregnancies, where ε is the fraction of total plasma DNA originating
from the fetus (see SI Appendix for derivations), such an increase
in chromosome 21 coverage in T21 cases corresponds to a fetal
DNA fraction of ≈8–35% (average ≈23%) (Table S1 and Fig. 2).
`We constructed a 99% confidence interval of the distribution of
`chromosome 21 sequence tag density of disomy 21 pregnancies. The
`values for all nine T21 cases lie outside the upper boundary of the
`confidence interval, and those for all nine disomy 21 cases lie below
`the boundary (Fig. 1B). If we used the upper bound of the
`confidence interval as a threshold value for detecting T21, the
minimum fraction of fetal DNA that would be detected is ≈2%.
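The relation between chromosome 21 overrepresentation and fetal DNA fraction can be written out explicitly. The derivation below is a sketch under the assumptions that maternal cells are disomic, that the fetus carries three copies of chromosome 21, and that tag density is proportional to copy number; the SI Appendix derivation may differ in detail. The symbols ρ_T21 and ρ_D21 (chromosome 21 tag density in trisomy 21 and disomy 21 pregnancies) and ε (fetal DNA fraction) are notation introduced here for illustration.

```latex
\begin{align}
\frac{\rho_{\mathrm{T21}}}{\rho_{\mathrm{D21}}}
  &= \frac{2(1-\varepsilon) + 3\varepsilon}{2(1-\varepsilon) + 2\varepsilon}
   = 1 + \frac{\varepsilon}{2}, \\
\varepsilon
  &= 2\left(\frac{\rho_{\mathrm{T21}}}{\rho_{\mathrm{D21}}} - 1\right).
\end{align}
```

For example, an 11% overrepresentation gives ε = 2 × 0.11 = 22%, and the 4–18% range gives 8–36%, in line with the reported fetal DNA fractions; the small differences are presumably due to rounding of the quoted percentages.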
`
`
[Fig. 1A plot not reproduced. x-axis: chromosome. Legend: plasma DNA
from woman bearing T21 fetus; plasma DNA from woman bearing normal
fetus; plasma DNA from woman bearing T18 fetus; plasma DNA from woman
bearing T13 fetus; plasma DNA from normal adult male.]
`
`
`
`quence tags of the sex chromosomes for male pregnancies. By
`comparing the sequence tag density of chromosome Y of plasma
`DNA from male pregnancies to that of adult male plasma DNA, we
estimated fetal DNA percentage to be, on average, ≈19% (range:
4–44%) for all male pregnancies (Table S1 and Fig. 2). Because
human males have one fewer chromosome X than human females,
the sequence tag density of chromosome X in male pregnancies
should be (1 − ε/2) of that of female pregnancies, where ε is the fetal
DNA fraction (see SI Appendix for derivation). We indeed observed
underrepresentation of chromosome X in male pregnancies as
compared with that of female pregnancies (Fig. S2). Based on the
data from chromosome X, we estimated fetal DNA percentage to
be, on average, ≈19% (range: 8–40%) for all male pregnancies
(Table S1 and Fig. 2). The fetal DNA percentages estimated from
chromosomes X and Y for each male pregnancy sample correlated
with each other (P = 0.0015) (Fig. S3).
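The two sex-chromosome estimates for male pregnancies can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the exact procedure of the SI Appendix: the chromosome Y estimate is taken as the plain ratio of chromosome Y tag density in the pregnancy sample to that in adult male plasma (with no background correction), and the chromosome X estimate inverts the (1 − ε/2) relation given above. Function and parameter names are hypothetical.

```python
def fetal_fraction_from_chr_y(density_y_pregnancy: float,
                              density_y_adult_male: float) -> float:
    """Adult male plasma is treated as 100% male DNA, so the ratio estimates eps."""
    return density_y_pregnancy / density_y_adult_male


def fetal_fraction_from_chr_x(density_x_male_pregnancy: float,
                              density_x_female_pregnancy: float) -> float:
    """Invert density ratio = 1 - eps / 2 for a male fetus with a single X."""
    ratio = density_x_male_pregnancy / density_x_female_pregnancy
    return 2.0 * (1.0 - ratio)


# Illustrative inputs only: a 9.5% depletion of chromosome X maps to eps = 19%.
print(round(fetal_fraction_from_chr_x(0.905, 1.0), 3))
```

Because the chromosome X estimate doubles a small difference between two densities, it amplifies measurement noise more than the chromosome Y ratio does, which may contribute to the spread between the two estimates for the same sample.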
`We plotted in Fig. 2 the fetal DNA fraction calculated from the
`overrepresentation of trisomic chromosome in aneuploid pregnan-
`cies and the underrepresentation of chromosome X and the pres-
`ence of chromosome Y for male pregnancies against gestational
`age. The average fetal DNA fraction for each sample correlates
with gestational age (P = 0.0051), a trend that has also been reported
previously (13).
`
`Size Distribution of Cell-Free Plasma DNA. We analyzed the sequenc-
`ing libraries with a commercial lab-on-a-chip capillary electro-
`phoresis system. There is a striking consistency in the peak fragment
`size, as well as the distribution around the peak, for all plasma DNA
samples, including those from pregnant women and the male donor.
`The peak fragment size was, on average, 261 bp (range: 256–264 bp)
`(Fig. S4). Subtracting the total length of the Solexa adaptors (92 bp)
`from 261 bp gives 169 bp as the actual peak fragment size. This size
`corresponds to the length of DNA wrapped in a chromatosome,
which is a nucleosome bound to an H1 histone (24). Because the
`library preparation includes an 18-cycle PCR, there are concerns
`that the distribution might be biased. To verify that the size
`distribution observed in the electropherograms is not an artifact of
`PCR, we also sequenced cell-free plasma DNA from a pregnant
`woman carrying a male fetus by using the 454 platform. The sample
`preparation for this system uses emulsion PCR, which does not
`require competitive amplification of the sequencing libraries and
`creates product that is largely independent of the amplification
`efficiency. The size distribution of the reads mapped to unique
`locations of the human genome resembled those of the Solexa
`sequencing libraries, with a predominant peak at 176 bp, after
`subtracting the length of 454 universal adaptors (Fig. 3 and Fig. S5).
`These findings suggest that the majority of cell-free DNA in the
`plasma is derived from apoptotic cells, in accordance with previous
`findings (22, 23, 25, 26).
`Of particular interest is the size distribution of maternal and fetal
`DNA in maternal cell-free plasma. Two groups have previously
shown that the majority of fetal DNA falls within the size range of a
mononucleosome (<200–300 bp), whereas maternal DNA is
`longer (22, 23). Because 454 sequencing has a targeted read length
of 250 bp, we interpreted the small peak at ≈250 bp (Fig. 3 and Fig.
`S5) as the instrumentation limit from sequencing higher-molecular-
`mass fragments. We plotted the distribution of all reads and those
mapped to the Y chromosome (Fig. 3). We observed a slight depletion
of Y-chromosome reads in the higher end of the distribution. Reads
<220 bp constitute 94% of Y-chromosome reads and 87% of the total
`reads. Our results are not in complete agreement with previous
`findings in that we do not see as dramatic an enrichment of fetal
`DNA at short lengths (22, 23). Future studies will be needed to
`resolve this point and to eliminate any potential residual bias in the
`454 sample preparation process, but it is worth noting that the
`ability to sequence single plasma samples permits one to measure
`the distribution in length enrichments across many individual
`
Fig. 2. Fetal DNA fraction and gestational age. The fraction of fetal
DNA in maternal plasma correlates with gestational age. Fetal DNA
fraction was estimated in three different ways: (i) from the additional
amount of chromosomes 13, 18, and 21 sequences for T13, T18, and T21
cases, respectively; (ii) from the depletion in amount of chromosome X
sequences for male cases; (iii) from the amount of chromosome Y
sequences present for male cases. The horizontal dashed line represents
the estimated minimum fetal DNA fraction required for the detection of
aneuploidy. For each sample, the values of fetal DNA fraction calculated
from the data of different chromosomes were averaged. There is a
statistically significant correlation between the average fetal DNA
fraction and gestational age (P = 0.0051). The dashed line represents
the simple linear regression line between the average fetal DNA fraction
and gestational age. The R² value represents the square of the
correlation coefficient.
[Plot not reproduced. x-axis: gestational age (weeks); y-axis:
percentage of maternal cell-free DNA that originates from the fetus (%);
R² = 0.3971. Legend: normal male, estimated from chrX; normal male,
estimated from chrY; T21 male, estimated from chrX; T21 male, estimated
from chrY; T21, estimated from chr21; T18 male, estimated from chrX;
T18 male, estimated from chrY; T18, estimated from chr18; T13 male,
estimated from chrX; T13 male, estimated from chrY; T13, estimated from
chr13; detection limit.]
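The regression line and R² value described in the caption can be computed with ordinary least squares. The function below is a generic sketch with a hypothetical name; the per-sample values from Table S1 are not reproduced here, so no attempt is made to regenerate the reported fit.

```python
def linear_fit_r2(x, y):
    """Least-squares slope, intercept, and squared correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

Here x would hold gestational age in weeks and y the average fetal DNA fraction per sample. An R² of 0.3971 corresponds to a correlation coefficient of about 0.63 in magnitude, so gestational age accounts for roughly 40% of the variance in fetal DNA fraction in this cohort.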
`
`Plasma DNA of pregnant women carrying T18 fetuses (two
`cases) and a T13 fetus (one case) were also directly sequenced.
`Overrepresentation was observed for chromosomes 18 and 13 in
`T18 and T13 cases, respectively (Fig. 1A). Although there were not
`enough positive samples to measure a representative distribution, it
`is encouraging that all of these three positives are outliers from the
distribution of disomy values. The T18 cases are large outliers and are
clearly statistically significant (P < 10⁻⁷), whereas the statistical
significance of the single T13 case is marginal (P < 0.05). Fetal
`DNA fraction was also calculated from the overrepresented chro-
`mosome as described above (Fig. 2 and Table S1).
`
`Fetal DNA Fraction in Maternal Plasma. Using digital TaqMan PCR
`for a single locus on chromosome 1, we estimated the average
`cell-free DNA concentration in the sequenced maternal plasma
samples to be ≈360 cell equivalents per milliliter of plasma (range:
`57–761 cell equivalents per milliliter of plasma) (Table S1), in rough
`accordance with previously reported values (13). The cohort in-
`cluded 12 male pregnancies (6 normal cases, 4 T21 cases, 1 T18 case,
`and 1 T13 case) and 6 female pregnancies (5 T21 cases and 1 T18
case). DYS14, a multicopy locus on chromosome Y, was detectable
in maternal plasma by real-time PCR in all of the male pregnancies but
not in any of the female pregnancies (data not shown). The fraction
`of fetal DNA in maternal cell-free plasma DNA is usually deter-
`mined by comparing the amount of fetal-specific locus (such as the
`SRY locus on chromosome Y in male pregnancies) to that of a locus
`on any autosome that is common to both the mother and the fetus
`by using quantitative real-time PCR (13, 22, 23). We applied a
`similar duplex assay on a digital PCR platform (see Materials and
`Methods) to compare the counts of the SRY locus and a locus on
`chromosome 1 in male pregnancies. SRY locus was not detectable
`in any plasma DNA samples from female pregnancies. We found
with digital PCR that for the majority of samples, fetal DNA constituted
≤10% of total DNA in maternal plasma (Table S1), agreeing
`with previously reported values (13).
`The percentage of fetal DNA among total cell-free DNA in
`maternal plasma can also be calculated from the density of se-
`
`
`we saw that for most plasma DNA samples, at least three well
`positioned nucleosomes downstream of transcription start sites
`could be detected, and in some cases, up to five well positioned
`nucleosomes could be detected, in rough accordance with the
`results of Schones et al. (27) (Fig. 4 and Fig. S6). We applied the
`same analysis on sequence tags of randomly sheared genomic DNA
`and observed no obvious pattern in tag localization, although the
`density of tags was higher at the transcription start site (Fig. 4).
`
`Discussion
`Noninvasive prenatal diagnosis of aneuploidy has been a challeng-
`ing problem because fetal DNA constitutes a small percentage of
`total DNA in maternal blood (13), and intact fetal cells are even
`rarer (6, 7, 9, 31, 32). We showed in this study the successful
`development of a truly universal, polymorphism-independent non-
`invasive test for fetal aneuploidy. By directly sequencing maternal
`plasma DNA, we could detect fetal trisomy 21 as early as the 14th
`week of gestation. The use of cell-free DNA instead of intact cells
`allows one to avoid complexities associated with microchimerism
`and foreign cells that might have colonized the mother; these cells
`occur at such low numbers that their contribution to the cell-free
`DNA is negligible (33, 34). Furthermore, there is evidence that
`cell-free fetal DNA clears from the blood to undetectable levels
`within a few hours of delivery and therefore is not carried forward
`from one pregnancy to the next (35–37).
`Rare forms of aneuploidy caused by unbalanced translocations
`and partial duplication of a chromosome are, in principle, detect-
`able by the approach of shotgun sequencing, because the density of
`sequence tags in the triplicated region of the chromosome would be
`higher than the rest of the chromosome. Detecting incomplete
`aneuploidy caused by mosaicism is also possible in principle but may
`be more challenging, because it depends not only on the concen-
`tration of fetal DNA in maternal plasma but also the degree of fetal
`mosaicism. Further studies are required to determine the effec-
`tiveness of shotgun sequencing in detecting these rare forms of
`aneuploidy.
`An advantage of using direct sequencing to measure aneuploidy
`noninvasively is that it is able to make full use of the sample,
`whereas PCR-based methods analyze only a few targeted se-
`quences. In this study, we obtained on average 5 million reads per
sample in a single run, of which ≈66,000 mapped to chromosome
`21. Because those 5 million reads represent only a portion of one
`human genome, in principle less than one genomic equivalent of
`DNA is sufficient for the detection of aneuploidy by using direct
`
Fig. 3. Size distribution of maternal and fetal DNA in maternal plasma.
A histogram showing the size distribution of total and chromosome
Y-specific fragments obtained from 454 sequencing of maternal plasma DNA
from a normal male pregnancy is presented. The distribution is
normalized to sum to 1. The numbers of total reads and reads mapped to
the Y chromosome are 144,992 and 178, respectively. (Inset) Cumulative
fetal DNA fraction as a function of sequenced fragment size. The error
bars correspond to the standard error of the fraction estimated,
assuming that the error of the counts of sequenced fragments follow
Poisson statistics.
[Plot not reproduced. Main panel x-axis: size of sequenced fragment
(bp); y-axis: normalized frequency; legend: total DNA, chrY DNA (fetal).
Inset x-axis: length of sequenced DNA fragment (bp); y-axis: cumulative
fraction of chromosome Y sequences (%).]
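The Poisson-based error bars mentioned in the caption can be approximated with first-order error propagation. The sketch below is one plausible reading, not necessarily the calculation behind the figure: it treats the Y-chromosome count and the total count as independent Poisson variables, and the function name is hypothetical.

```python
import math


def fraction_with_poisson_se(y_reads: int, total_reads: int):
    """Fraction of reads on chromosome Y and its approximate standard error."""
    fraction = y_reads / total_reads
    # Relative variance of a ratio of two independent Poisson counts
    # is approximately 1 / y_reads + 1 / total_reads.
    se = fraction * math.sqrt(1.0 / y_reads + 1.0 / total_reads)
    return fraction, se


# Read counts quoted in the caption: 178 of 144,992 reads map to chromosome Y.
fraction, se = fraction_with_poisson_se(178, 144_992)
print(f"{fraction:.4%} +/- {se:.4%}")
```

With only 178 Y-chromosome reads, the relative uncertainty is dominated by that small count (about 1/sqrt(178), or 7.5%), which is why estimates within narrow size bins carry wide error bars.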
`
`patients rather than measuring the average length enrichment of
`pooled patient samples.
`
`Cell-Free Plasma DNA Shares Features of Nucleosomal DNA. Because
`our observations of the size distribution of cell-free plasma DNA
`suggested that plasma DNA is mainly apoptotic in origin, we
`investigated whether features of nucleosomal DNA and positioning
`are found in plasma DNA. One such feature is nucleosome
`positioning around transcription start sites. Experimental data from
`yeast and human have suggested that nucleosomes are depleted in
`promoters upstream of transcription start sites, and nucleosomes
`are well positioned near transcription start sites (27–30). We
applied a 5-bp window spanning ±1,000 bp of transcription start
`sites of all RefSeq genes and counted the number of tags mapping
`to the sense and antisense strands within each window. A peak in
`the sense strand represents the beginning of a nucleosome, whereas
`a peak in the antisense strand represents the end. After smoothing,
`
Fig. 4. Distribution of sequence tags around transcription start sites
(TSS) of RefSeq genes on all autosomes and chromosome X from plasma DNA
sample of a normal male pregnancy (Upper) and randomly sheared genomic
DNA control (Lower). The number of tags within each 5-bp window was
counted within ±1,000-bp region around each TSS, taking into account the
strand to which each sequence tag mapped. The counts from all
transcription start sites for each 5-bp window were summed and
normalized to the median count among the 400 windows. A moving average
was used to smooth the data. A peak in the sense strand represents the
beginning of a nucleosome, whereas a peak in the antisense strand
represents the end of a nucleosome. In the plasma DNA sample shown here,
five well positioned nucleosomes are observed downstream of
transcription start sites and are represented as gray ovals. The number
within each oval represents the distance in base pairs between adjacent
peaks in the sense and antisense strands, corresponding to the size of
the inferred nucleosome. No obvious pattern is observed for the genomic
DNA control.
[Plots not reproduced. Upper panel: Maternal Plasma DNA from a Male
Pregnancy; lower panel: Randomly Sheared Genomic DNA. x-axis: bp from
TSS (−1000 to 1000); y-axis: normalized count of sequence tags; legend:
sense, antisense.]
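The tag-counting procedure around transcription start sites (5-bp windows over ±1,000 bp, summed over all TSSs, normalized to the median of the 400 windows, then smoothed) can be sketched as follows. This is an illustration under several assumptions not spelled out in the text: each tag is represented by a single coordinate, coordinates are flipped for genes on the minus strand so that positive offsets are downstream, "sense" means the tag maps to the same strand as the gene, and the moving-average width of 5 windows is an arbitrary choice. All function names are hypothetical.

```python
from statistics import median

HALF_SPAN_BP = 1000                          # profile covers -1,000 to +1,000 bp
WINDOW_BP = 5
N_WINDOWS = 2 * HALF_SPAN_BP // WINDOW_BP    # 400 windows


def tss_tag_counts(tags, tss_sites):
    """Sum strand-specific tag counts per 5-bp window over all TSSs.

    tags: list of (position, strand) on one chromosome, strand '+' or '-'.
    tss_sites: list of (position, gene_strand) on the same chromosome.
    """
    counts = {"sense": [0] * N_WINDOWS, "antisense": [0] * N_WINDOWS}
    for tss_pos, gene_strand in tss_sites:
        sign = 1 if gene_strand == "+" else -1
        for pos, strand in tags:
            rel = sign * (pos - tss_pos)     # offset downstream of the TSS
            if -HALF_SPAN_BP <= rel < HALF_SPAN_BP:
                key = "sense" if strand == gene_strand else "antisense"
                counts[key][(rel + HALF_SPAN_BP) // WINDOW_BP] += 1
    return counts


def normalize_to_median(values):
    """Divide every window count by the median count across windows."""
    m = median(values)
    if m == 0:
        raise ValueError("median window count is zero; too few tags to normalize")
    return [v / m for v in values]


def moving_average(values, width=5):
    """Centered moving average; the window shrinks at the two ends."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```

The nested loop is quadratic and only suitable for small inputs; a real analysis over all RefSeq genes would sort the tags and use binary search or an interval index. In the resulting profile, the distance between a sense-strand peak and the next antisense-strand peak gives the inferred nucleosome size.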
`
`
`
`
`sequencing. In practice, a larger amount of DNA was used because
`there is sample loss during sequencing library preparation, but it
`may be possible to further reduce the amount of blood required for
`analysis.
`We observed that certain chromosomes have large variations in
`the counts of sequenced fragments from sample to sample, and that
`this depends strongly on the GC content (Fig. S1 A–C). It is unclear
`at this point whether this stems from PCR artifacts during sequenc-
`ing library preparation or cluster generation or the sequencing
`process itself or whether it is a true biological effect relating to
`chromatin structure. We strongly suspect that it is an artifact
`because we also observe GC bias on genomic DNA control, and
`such bias on the Solexa sequencing platform has recently been
reported (38, 39). This bias has a practical consequence because the
sensitivity of aneuploidy detection will vary from chromosome to
chromosome; fortunately the chromosomes involved in the most common
human aneuploidies (such as 13, 18, and 21) have low variation and therefore high
`detection sensitivity. Both this problem and the sample-volume
`limitations may possibly be resolved by the use of single-molecule
`sequencing technologies, which do not require the use of PCR for
`library preparation (40).
Plasma DNA samples used in this study were obtained ≈15–30
`min after amniocentesis or chorionic villus sampling. Because these
`invasive procedures disrupt the interface between the placenta and
maternal circulation, there has been discussion of whether the
amount of fetal DNA in maternal blood might increase after
invasive procedures. Neither of the studies to date has observed a
significant effect (41, 42). Our results support this conclusion,
`because using the digital PCR assay, we estimated that fetal DNA
constituted ≤10% of total cell-free DNA in the majority of our
`maternal plasma samples. This is within the range of previously
`reported values in maternal plasma samples obtained before inva-
`sive procedures (13). It would be valuable to have a direct mea-
`surement addressing this point in a future study.
The average fetal DNA fraction estimated from sequencing data
of sex chromosomes is higher than the value estimated from
digital PCR data by an average factor of two (P < 0.005, paired t
test on all male pregnancies that have a complete set of data). One
`possible explanation for this is that the PCR step during Solexa
`library preparation preferentially amplifies shorter fragments,
`which others have found to be enriched for fetal DNA (22, 23). Our
`own measurements of length distribution on one sample do not
`support this explanation, but we also cannot reject it at this point.
`It should also be pointed out that using the sequence tags, we find
`some variation of fetal fraction even in the same sample depending
`on which chromosome we use to make the calculation (Fig. 2, Fig.
S3 and Table S1). This is most likely because of artifacts and errors
in the sequencing and mapping processes, which are substantial;
recall that only half of the sequence tags map to the human genome
with one error or fewer. Finally, it is also possible that the PCR
`measurements are biased because they are only sampling a tiny
`fraction of the fetal genome. These discrepancies will be sorted out
`in future studies as sequencing reliability improves, and our results
`show that they do not materially affect the ability to determine fetal
`aneuploidy.
`Our sequencing data suggest that the majority of cell-free plasma
`DNA is of apoptotic origin and shares features of nucleosomal
`DNA. Because nucleosome occupancy throughout the eukaryotic
`genome is not necessarily uniform and depends on factors such as
`function, expression, or sequence of the region (30, 43), the
`representation of sequences from different loci in cell-free maternal
`plasma may not be equal, as one usually expects in genomic DNA
`extracted from intact cells. Thus, the quantity of a particular locus
`may not be representative of the quantity of the entire chromosome,
`and care must be taken when one designs assays for measuring gene
`dosage in cell-free maternal plasma DNA that target only a few loci.
`Historically, because of risks associated with chorionic villus
sampling and amniocentesis, invasive diagnosis of fetal aneuploidy
was primarily offered to women who were considered at risk of
`carrying an aneuploid fetus based on evaluation of risk factors such
`as maternal age, levels of serum markers, and ultrasonographic
`findings. Recently, an American College of Obstetricians and
`Gynecologists Practice Bulletin recommended that ‘‘invasive diag-
`nostic testing for aneuploidy should be available to all women,
`regardless of maternal age’’ and that ‘‘pretest counseling should
`include a discussion of the risks and benefits of invasive testing
`compared with screening tests’’ (2). A noninvasive genetic test
`based on the results described here and in future large-scale studies
`would presumably carry the best of both worlds: minimal risk to the
`fetus while providing true genetic information. The costs of the
`assay are already fairly low; the sequencing cost per sample is
`approximately $700, and the cost of sequencing is expected to
`continue to drop dramatically in the near future.
`In conclusion, we demonstrated the use of massively parallel
`sequencing to detect fetal aneuploidy noninvasively with maternal
`cell-free plasma DNA. Shotgun sequencing can potentially reveal
`many more previously unknown features of cell-free nucleic acids
`such as plasma mRNA distributions, as well as epigenetic features
`of plasma DNA such as DNA methylation and histone modifica-
`tion, in fields including perinatology, oncology, and transplantation,
`thereby improving our understanding of the basic biology of
`pregnancy, early human development, and disease.
`
`Materials and Methods
`Subject Enrollment. The study was approved by the Institutional Review Board of
`Stanford University. Pregnant women at risk for fetal aneuploidy were recruited
`at the Lucile Packard Children’s Hospital Perinatal Diagnostic Center of Stanford
`University during the period of April 2007 to May 2008. Informed consent was
`obtained from each participant before the blood draw. Blood was collected
`15–30 min after amniocentesis or chorionic villus sampling except for one sample
`that was collected during the third trimester. Karyotype analysis was performed
`via amniocentesis or chorionic villus sampling to confirm fetal karyotype. Nine
`T21, 2 T18, 1 T13, and 6 normal singleton pregnancies were included in this study.
`The gestational age of the subjects at the time of blood draw ranged from 10 to
`35 weeks (Table S1). A blood sample from a male donor was obtained from the
`Stanford Blood Center.
`
`Sample Processing and DNA Quantification. Seven to 15 ml of peripheral blood
`drawn from each subject and donor was collected in EDTA tubes. Blood was
centrifuged at 1,600 × g for 10 min. Plasma was transferred to microcentrifuge
tubes and centrifuged at 16,000 × g for 10 min to remove residual c