`
`doi:10.1038/nature11251
`
`Non-invasive prenatal measurement of
`the fetal genome
`
H. Christina Fan1†*, Wei Gu1*, Jianbin Wang1, Yair J. Blumenfeld2, Yasser Y. El-Sayed2 & Stephen R. Quake1,3,4
`
`The vast majority of prenatal genetic testing requires invasive sampling. However, this poses a risk to the fetus, so one
`must make a decision that weighs the desire for genetic information against the risk of an adverse outcome due to hazards
`of the testing process. These issues are not required to be coupled, and it would be desirable to discover genetic
`information about the fetus without incurring a health risk. Here we demonstrate that it is possible to non-invasively
`sequence the entire prenatal genome. Our results show that molecular counting of parental haplotypes in maternal
`plasma by shotgun sequencing of maternal plasma DNA allows the inherited fetal genome to be deciphered
`non-invasively. We also applied the counting principle directly to each allele in the fetal exome by performing exome
`capture on maternal plasma DNA before shotgun sequencing. This approach enables non-invasive exome screening of
`clinically relevant and deleterious alleles that were paternally inherited or had arisen as de novo germline mutations, and
`complements the haplotype counting approach to provide a comprehensive view of the fetal genome. Non-invasive
`determination of the fetal genome may ultimately facilitate the diagnosis of all inherited and de novo genetic disease.
`
Our work is based on the phenomenon of circulating cell-free DNA,
whose existence was first reported in 1948 (ref. 1).
A portion of the cell-free DNA in a pregnant woman’s blood is derived
from the fetus (ref. 2), and this fact has enabled the development of a
number of non-invasive prenatal diagnostic techniques (ref. 3). A prominent
example is the non-invasive detection of Down syndrome and other
aneuploidies, which was first demonstrated by our group (ref. 4), validated
by clinical trials (refs 5–10), and is now available in the clinic. We describe
here how the chromosome counting principle we invented for
aneuploidy detection can be applied to non-invasive fetal genome
analysis by directly counting haplotypes and even individual alleles.
Others have studied the relationship between maternal and fetal
cell-free DNA (ref. 11), but their approach required invasively sampled fetal
material, did not determine the fetal genome, and also needed
knowledge of paternal genetic data.
`
`Measuring the fetal genome by counting parental
`haplotypes
Maternal plasma DNA is a mixture of maternal and fetal DNA; the
fraction of fetal DNA ranges from a few percent or lower early in
pregnancy to as high as ~50% (refs 2, 7), and generally increases with
gestational age. Because the fetal genome is a combination of the four
parental chromosomes, or haplotypes, as a result of random assortment
and recombination during meiosis, three haplotypes exist in
maternal plasma per genomic region: the maternal haplotype that is
transmitted to the fetus, the maternal haplotype that is not transmitted,
and the paternal haplotype that is transmitted. If the relative copy
number of the untransmitted maternal haplotype is 1 − ε, where ε
is the fetal DNA fraction, then the relative copy number of the
transmitted maternal haplotype is 1, and the relative copy numbers
of the transmitted and untransmitted paternal haplotypes are ε
and 0, respectively (Fig. 1). Therefore, within each pair of parental
haplotypes, the transmitted haplotype is over-represented relative to
the untransmitted one. By measuring the relative amount of parental
haplotypes through counting the number of alleles specific to each
parental haplotype (referred to as ‘markers’), one can deduce the
inheritance of each parental haplotype and hence build the full inherited
fetal genome.
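As an illustration of the bookkeeping above, the following minimal Python sketch tabulates the expected relative copy number of each parental haplotype for a given fetal DNA fraction ε. The function name and dictionary keys are hypothetical and chosen only for this example; the values follow directly from the relations stated in the text (1 − ε, 1, ε and 0).

```python
def expected_haplotype_copies(fetal_fraction):
    """Expected relative copy number of each parental haplotype in plasma.

    Values are normalised so that the transmitted maternal haplotype is 1,
    as in the text. `fetal_fraction` is the fetal DNA fraction (epsilon).
    """
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must lie between 0 and 1")
    e = fetal_fraction
    return {
        "maternal_transmitted": 1.0,        # maternal (1 - e) plus fetal e
        "maternal_untransmitted": 1.0 - e,  # maternal contribution only
        "paternal_transmitted": e,          # fetal contribution only
        "paternal_untransmitted": 0.0,      # absent from maternal plasma
    }


copies = expected_haplotype_copies(0.10)
# Over-representation of the transmitted maternal haplotype
excess = copies["maternal_transmitted"] - copies["maternal_untransmitted"]
```

With ε = 0.10, for instance, the transmitted maternal haplotype exceeds the untransmitted one by 0.10 in relative copy number, which is the signal that haplotype counting has to resolve.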
Strictly speaking, the markers that define each maternal haplotype
are the alleles that are present in one maternal haplotype but not in the
other maternal haplotype or in either of the two paternal haplotypes. However,
because it is rare that two unrelated persons share the same long-range
haplotype, that is, a haplotype much longer than the usual length of
haplotype blocks observed in the population (~100 kilobases (kb)),
the presence of alleles contributed by the transmitted paternal
haplotype at these loci would not interfere with the measurement of
representation of maternal haplotypes as long as the haplotype being
considered is sufficiently long (>1 megabase (Mb)). Thus all the
maternal heterozygous loci can be used to define the two maternal
haplotypes (Fig. 1). This enables the measurement of relative
representation of the two maternal haplotypes without the knowledge of
paternal haplotypes. The relative representation of the two maternal
haplotypes is the difference in the counts of markers specific to each
haplotype. Even if the over-representation of the transmitted
maternal haplotype is small, the over-represented haplotype can be
identified provided that the counting depth is sufficient to overcome
the counting noise, which is governed by Poisson statistics. Supplementary
Table 1 and Supplementary Fig. 1 provide estimates of the counting
requirement as a function of confidence of measurement and fetal
DNA percentage in the clinically observed range. Because the number
of markers that define each parental haplotype increases with
haplotype length, the longer the phased haplotypes, the lower the
average sampling depth per individual marker that is required for
confident determination of the over-represented parental haplotypes.
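To give a feel for how the counting requirement scales, the sketch below estimates the cumulative marker count N needed to separate two Poisson counts with means N and N(1 − ε). It assumes, for illustration only, that the two counts are independent, that the normal approximation holds, and that the difference Nε must be at least z standard deviations of that difference, sqrt(N(2 − ε)). This criterion, the default z and the function name are assumptions of the sketch and are not taken from Supplementary Table 1.

```python
import math


def required_marker_count(fetal_fraction, z=3.0):
    """Smallest cumulative count N on the transmitted haplotype such that
    N * e >= z * sqrt(N + N * (1 - e)), i.e. N >= z^2 * (2 - e) / e^2.

    Assumes independent Poisson counts and the normal approximation.
    """
    e = fetal_fraction
    if not 0.0 < e <= 1.0:
        raise ValueError("fetal_fraction must be in (0, 1]")
    return math.ceil(z * z * (2.0 - e) / (e * e))


# Hypothetical example: counts needed at a 6% fetal fraction
n_needed = required_marker_count(0.06)
```

Because N grows roughly as 1/ε², halving the fetal fraction roughly quadruples the number of counts required under these assumptions, which is why longer phased haplotypes (more markers per haplotype) lower the depth needed per marker.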
If paternal haplotypes are known, it is straightforward to determine
the inherited paternal haplotypes by comparing the summed counts of
`
`1Department of Bioengineering, Stanford University, Clark Center Rm E300, 318 Campus Drive, Stanford, California 94305, USA. 2Division of Maternal-Fetal Medicine, Department of Obstetrics &
`Gynecology, Stanford University School of Medicine, 300 Pasteur Drive, Room HH333, Stanford, California 94305, USA. 3Department of Applied Physics, Stanford University, Clark Center Room E300, 318
`Campus Drive, Stanford, California 94305, USA. 4Howard Hughes Medical Institute, Stanford University, Clark Center Room E300, 318 Campus Drive, Stanford, California 94305, USA.
†Current address: ImmuMetrix LLC, 552 Del Rey Avenue, Sunnyvale, California 94085, USA.
`*These authors contributed equally to this work.
`
320 | NATURE | VOL 487 | 19 JULY 2012
©2012 Macmillan Publishers Limited. All rights reserved
`
`
`
`
`ARTICLE RESEARCH
`
[Figure 1 schematic. Starting material: maternal peripheral blood.
Fetal genome branch: whole-genome haplotyping by direct deterministic
phasing with 3–4 single lymphocytes; derive a list of alleles that define
each maternal haplotype; count and sum alleles on each of the maternal
haplotypes; determine which maternal haplotype is transmitted; reconstruct
the paternally inherited haplotype from paternal-specific alleles and
imputation.
Fetal exome branch: exome sequencing; count alleles at individual locus;
determine fetal genotype from the minor allele fraction, min(A, G)/(A + G).]

Minor allele fraction    Genotypes
~0                       fetus homozygous, mother homozygous
~ε/2                     fetus heterozygous, mother homozygous
~(1 − ε)/2               fetus homozygous, mother heterozygous
~1/2                     fetus heterozygous, mother heterozygous

Count of alleles that define the maternal haplotype that is:

Contribution from    Untransmitted    Transmitted
Mother               N(1 − ε)         N(1 − ε)
Fetus                0                Nε
Total                N(1 − ε)         N

ε = fetal DNA fraction; N ≈ number of genome equivalents sampled.
If the count for maternal haplotype 1 exceeds the count for maternal
haplotype 2, maternal haplotype 1 is transmitted; if the count for maternal
haplotype 2 is the larger, maternal haplotype 2 is transmitted.
`
Figure 1 | Molecular counting strategies for measuring the fetal genome
non-invasively from maternal blood only. Genome-wide, chromosome-length
haplotypes of the mother are obtained using direct deterministic
phasing. The inheritance of maternal haplotypes is revealed by sequencing
maternal plasma DNA and summing the count of the alleles specific to each
haplotype at heterozygous loci and determining the relative representation of
the two alleles. The inherited paternal haplotypes are defined by the
paternal-specific alleles (that is, those that are different from the maternal
ones at positions where the mother is homozygous). The allelic identity at loci
linked to the paternal-specific alleles on the paternal haplotype can be imputed.
Alternatively, molecular counting can be applied directly to count alleles at
individual loci to determine fetal genotypes via targeted deep sequencing, such
as exome-enriched sequencing of maternal plasma DNA. For illustrative
purposes, each locus is biallelic and carries the ‘A’ or ‘G’ alleles.
`
alleles specific to each paternal haplotype (Supplementary Fig. 2),
thereby revealing the entire inherited fetal genome. Supplementary
Fig. 3 and the accompanying Supplementary Information show how
this could be achieved using sequencing data of a synthetic mixture of
DNA from a mother and daughter within a fully phased family trio (ref. 12).
However, it is not always possible to obtain paternal information; the
incidence of non-paternity is estimated to be between 3% and 10%
(refs 13, 14), making this a particularly delicate issue. In the absence of
paternal information, the paternally inherited haplotypes can be reconstructed
via linkage to observed non-maternal (that is, paternal-specific) alleles
(Fig. 1).
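The first step of that reconstruction, picking out non-maternal alleles at positions where the mother is homozygous, can be sketched as follows. This is a toy illustration, not the authors' pipeline: the data layout, the function name and the `min_reads` threshold are all hypothetical, and the subsequent imputation step is not shown.

```python
def paternal_specific_alleles(maternal_genotypes, plasma_counts, min_reads=1):
    """Return {locus: [alleles]} for non-maternal alleles seen in plasma.

    maternal_genotypes: {locus: (allele_a, allele_b)} for the mother.
    plasma_counts: {locus: {allele: read_count}} from maternal plasma.
    min_reads: hypothetical threshold; raising it trades sensitivity for
        fewer candidate calls caused by sequencing noise.
    """
    candidates = {}
    for locus, (allele_a, allele_b) in maternal_genotypes.items():
        if allele_a != allele_b:
            continue  # only maternal-homozygous loci are considered here
        for allele, reads in plasma_counts.get(locus, {}).items():
            if allele != allele_a and reads >= min_reads:
                candidates.setdefault(locus, []).append(allele)
    return candidates


# Toy data: the mother is homozygous at snp1 and snp3, heterozygous at snp2
maternal = {"snp1": ("A", "A"), "snp2": ("A", "G"), "snp3": ("G", "G")}
plasma = {
    "snp1": {"A": 95, "G": 5},
    "snp2": {"A": 52, "G": 48},
    "snp3": {"G": 100},
}
found = paternal_specific_alleles(maternal, plasma)
```

In this toy input only `snp1` yields a candidate paternal-specific allele; in real data some candidates will be sequencing errors, so any threshold would need to be tuned against depth and fetal fraction.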
We verified this approach on samples collected from two pregnancies.
Pregnant woman P1 carried a female fetus with normal karyotype,
whereas pregnant woman P2 is an individual with a ~2.85 Mb
heterozygous deletion on chromosome 22 that is associated with
DiGeorge syndrome. To obtain phased maternal chromosomes, we
performed direct deterministic phasing (DDP; ref. 15) on three or four
maternal metaphase cells obtained by culturing maternal whole blood
(Supplementary Table 2 and Supplementary Fig. 4). DDP involves
microfluidic separation and amplification of individual metaphase
chromosomes from single cells followed by genome-wide genotyping
analysis of amplified materials, and enables each chromosome in the
genome to be phased along its full length. Genomic DNA of cord blood
collected at delivery was also genotyped to serve as the true reference
for fetal genotypes. The true inheritance of maternal haplotypes was
determined by aligning the homozygous SNPs of the fetus by cord
blood genotyping against the two maternal haplotypes defined by
the phased maternal heterozygous SNPs (Fig. 2). The analysis here
concerns the approximately 1 million positions across the genome
present on the Omni1-Quad genotyping array. Phase information of the
remaining genomic positions, particularly those that carry rare
variants of clinical importance, can be obtained by broader array coverage
or direct sequencing of amplified chromosome materials, as
demonstrated previously (ref. 15).
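The reference ("truth") assignment described above can be illustrated with a small sketch. At a locus where the mother is heterozygous and the fetus is homozygous, the fetal allele must match the maternal haplotype that was transmitted. The data structures and function name below are hypothetical and serve only to show the logic.

```python
def transmitted_haplotype_calls(hap1, hap2, fetal_genotypes):
    """Per-locus call (1 or 2) of the transmitted maternal haplotype.

    hap1, hap2: {locus: allele} for the two phased maternal haplotypes.
    fetal_genotypes: {locus: (allele_a, allele_b)} from cord blood.
    Only loci where the fetus is homozygous and the mother heterozygous
    are informative; all other loci are skipped.
    """
    calls = {}
    for locus, (allele_a, allele_b) in fetal_genotypes.items():
        if allele_a != allele_b:
            continue  # fetus heterozygous: not used here
        if locus not in hap1 or locus not in hap2:
            continue  # locus was not phased in the mother
        if hap1[locus] == hap2[locus]:
            continue  # mother homozygous: uninformative
        if allele_a == hap1[locus]:
            calls[locus] = 1
        elif allele_a == hap2[locus]:
            calls[locus] = 2
    return calls


# Toy example with three phased maternal loci
hap1 = {"s1": "A", "s2": "G", "s3": "A"}
hap2 = {"s1": "G", "s2": "A", "s3": "G"}
fetal = {"s1": ("A", "A"), "s2": ("G", "G"), "s3": ("A", "G"), "s4": ("C", "C")}
calls = transmitted_haplotype_calls(hap1, hap2, fetal)
```

Here `s1` and `s2` both point to maternal haplotype 1, while `s3` (fetus heterozygous) and `s4` (not phased) contribute nothing.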
Maternal cell-free DNA samples were shotgun-sequenced on the
Illumina platform to a depth of ~52.7× (151 gigabases (Gb)),
~20.8× (59.7 Gb) and ~1.3× (3.7 Gb) haploid genome coverage
for P1T1 (P1, first trimester), P1T2 (P1, second trimester) and P2T3
(P2, third trimester), respectively (Supplementary Table 2). To
determine fetal inheritance of maternal haplotypes, we divided each
chromosome into bins of 2.5–3.5 Mb for autosomal chromosomes
and 5–7.5 Mb for chromosome X (Supplementary Table 2), with
sliding steps of 100 kb, and compared the counts of alleles specific
to each of the two haplotypes. Bin sizes were chosen according to the
estimated sampling requirement (Supplementary Table 1) based on
the sequencing depth, density of markers and fetal DNA fraction,
which was estimated to be ~6%, ~16% and ~30% for P1T1, P1T2
and P2T3, respectively, by comparing the relative representation of
maternal haplotypes. The lower SNP array density on chromosome X
required larger bin sizes for that chromosome. The
over-represented maternal haplotype over the entire genome was
apparent and corresponded to the maternal haplotype transmitted
to the fetus (Fig. 2). Taking into account the uncertainty surrounding
regions of crossovers (median ~350–450 kb per crossover, Supplementary
Fig. 5), maternal inheritance of at least 99.2% of the SNPs could be
deduced with at least 99.8% accuracy for all samples. Lower sequencing
depth also allowed the inherited maternal haplotypes to be deduced
(Supplementary Fig. 6), with lower resolution of crossovers owing to larger
bin sizes (Supplementary Fig. 5).
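The text states only that the fetal DNA fraction was estimated by comparing the relative representation of the maternal haplotypes. One simple way this could be done, assuming the expected totals N and N(1 − ε) given in Fig. 1 and equal (or normalised) marker numbers on the two haplotypes, is sketched below; the formula ε ≈ 1 − (lower count)/(higher count) and the function name are assumptions of this example.

```python
def fetal_fraction_from_haplotypes(count_hap1, count_hap2):
    """Estimate the fetal DNA fraction from summed marker counts.

    Assumes the over-represented haplotype is the transmitted one, with
    expected count N, and the other has expected count N * (1 - e), so
    that e is approximately 1 - lower / higher.
    """
    higher = max(count_hap1, count_hap2)
    lower = min(count_hap1, count_hap2)
    if higher <= 0:
        raise ValueError("at least one count must be positive")
    return 1.0 - lower / higher


# Hypothetical summed counts for the two maternal haplotypes
estimate = fetal_fraction_from_haplotypes(1000, 840)
```

Because the estimate is a ratio of noisy counts, it is only as precise as the summed counts allow; pooling markers over many bins on the same haplotype would tighten it.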
The paternally inherited haplotypes were reconstructed by detection
of paternal-specific alleles, followed by imputation at linked
positions. We used the haplotypes of the normal population documented
by the 1000 Genomes Project (ref. 16) as reference haplotypes for imputation.
Imputation accuracy is dependent on the density of markers, and the
number of identified non-maternal alleles is dependent on sequencing
depth and fetal DNA fraction. At the final sequencing depth (~52.7×,
~20.8× and ~10.7× haploid genome coverage for P1T1, P1T2 and
P2T3, respectively), we detected ~66–70% of the paternal-specific
alleles at least once (Supplementary Table 2 and Supplementary Fig. 7).
Approximately 3.4–5.6% of the non-maternal alleles were sequencing
noise. Using the non-maternal markers, we deduced ~70% of the
paternally inherited haplotypes with ~94–97% accuracy via imputation
(Fig. 3). The loci that could not be confidently imputed reside in
regions where paternal-specific alleles were not detected, in regions
`
`
[Figure 1, continued. Panel labels: shotgun sequencing of plasma cell-free
DNA; exome capture of plasma cell-free DNA by exome bead pull-down, which
separates exome DNA from other DNA to give an exome-enriched sample.]
`
`
[Figure 2 plots. Panels: a, P1 first trimester (P1T1); b, P1 second trimester
(P1T2); c, P2 third trimester (P2T3); each panel shows chromosomes 1–22 and X.
Plotted quantity: normalized sum of count of alleles on maternal haplotype 1 −
normalized sum of count of alleles on maternal haplotype 2 in maternal plasma,
per bin (2.5–3.5 Mb for chr1–22; 5–7.5 Mb for chrX). Positive values: maternal
haplotype 1 dominates; negative values: maternal haplotype 2 dominates.
Legend: transmitted maternal haplotype (truth); untransmitted maternal
haplotype (truth); DiGeorge syndrome associated region (~2.85 Mb);
centromere/heterochromatin.]
`
`Figure 2 | Non-invasively determining genome-wide fetal inheritance of
`maternal haplotypes via haplotype counting of maternal plasma DNA with
`at least 99.8% accuracy over 99.2% of the genome in three maternal plasma
`samples. a–c, Each point on a black line represents the relative amount of the
`two maternal haplotypes evaluated using the markers lying within a bin centred
`at the point, and is accompanied by a white bar that corresponds to the 95%
confidence interval for each measurement in P1 first trimester (a), P1 second
trimester (b) and P2 third trimester (c). chr, chromosome. The maternal
`haplotypes are coloured pink or grey according to the true transmission states,
`as determined by fetal cord blood genotypes. Over-representation of ‘maternal
`haplotype 2’ in P2T3 maternal plasma immediately adjacent to the DiGeorge
`syndrome associated deletion (blue) indicates fetal inheritance of the deletion,
`which agrees with fetal cord blood genotype.
`
[Figure 3 pie charts. Percentage of loci in each category, by plasma DNA sample:]

Category                                               P1T1      P1T2      P2T3
Loci with incorrectly observed paternal allele         0.58%     0.36%     0.26%
Loci with correctly observed paternal allele           10.12%    10.60%    11.11%
Loci with correctly imputed paternal allele            57.11%    59.33%    59.46%
Loci with incorrectly imputed paternal allele          3.59%     2.55%     2.10%
Loci at which paternal allele identity is uncertain    28.60%    27.16%    27.06%
`
Figure 3 | Reconstruction of paternally inherited chromosomes
non-invasively based on imputation using observed non-maternal alleles. The
paternally inherited haplotypes were reconstructed by detection of
paternal-specific alleles, followed by imputation at linked positions. At the final
sequencing depth, ~66–70% of all the paternal-specific alleles were detected at
least once. Using those markers, ~70% of the paternally inherited haplotypes
were imputed with ~94–97% accuracy. The loci that could not be confidently
imputed could in principle be completely determined by deeper sequencing
and application of the counting principle directly to the individual alleles at
every genomic position.
`
`that lack paternal-specific alleles, or where the paternal alleles are
`associated with more than one haplotype observed in the population.
`In principle these regions could be completely determined by deeper
`sequencing and application of the counting principle directly to the
`local regions or the individual alleles at every genomic position, as
`shown below.
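The dependence of paternal-specific allele detection on sequencing depth and fetal DNA fraction can be illustrated with an idealised model. The sketch assumes that reads carrying a paternal-specific allele at a locus follow a Poisson distribution with mean depth × ε/2, since only one of the two fetal copies carries the allele. Uniform coverage, the Poisson model and the function name are assumptions of this example, so its output should not be expected to reproduce the detection rates reported in the text.

```python
import math


def prob_paternal_allele_seen(depth, fetal_fraction, min_reads=1):
    """Probability of observing a paternal-specific allele at least
    `min_reads` times, under an idealised Poisson model.

    depth: mean number of reads covering the locus.
    fetal_fraction: fetal DNA fraction (epsilon); the paternal-specific
        allele is expected in a fraction epsilon / 2 of the reads.
    """
    mean_reads = depth * fetal_fraction / 2.0
    p_fewer = sum(
        math.exp(-mean_reads) * mean_reads ** k / math.factorial(k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


# Hypothetical example: ~52.7x depth with a 6% fetal fraction
p_seen = prob_paternal_allele_seen(52.7, 0.06)
```

Under this model, detection improves with either deeper sequencing or a higher fetal fraction, which is the qualitative behaviour described above; real data will generally differ because coverage is uneven and calls are filtered.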
`
`Counting alleles at individual loci measures fetal exome
We sought to determine clinically relevant portions of the fetal
genome in maternal plasma DNA by applying the counting principle
to each allele at all positions in the exome. Because the exome is two
orders of magnitude smaller than the genome, less sequencing
throughput is required to provide deep sequencing at individual loci
and thus allows sensitive and specific detection of clinically relevant
and deleterious polymorphisms that were either paternally inherited
alleles or de novo mutations. We performed exome capture and
sequencing on maternal plasma DNA samples of P1 in all three
trimesters (Fig. 1 and Supplementary Fig. 9). We obtained a median
coverage of 194×, 221× and 631× per position in the exome for the
`
first, second and third trimester, respectively (Fig. 4d). After stringent
data filtering to eliminate miscalled paternal-specific alleles due to
limited sampling and mis-mapping to the reference genome
(Supplementary Fig. 10), 75%, 78% and 90% of all exomic positions
in the first, second and third trimester samples, respectively, had
>100× coverage and were retained for analysis (Supplementary
Table 2).
We calculated minor allele fraction, defined as the second largest
nucleotide fraction divided by the sum of the two largest nucleotide
fractions, at positions that are confidently called in genotyping data
within the exome (Fig. 4a–c) or exome sequencing data (Supplementary
Figs 11–13) of fetal cord blood DNA and pure maternal DNA. In
all three trimesters, fetal genotypes could be assigned robustly at loci
where the mother is homozygous based on the separation in minor
allele fraction at a depth of 200×. Paternal-specific alleles were
detected with sensitivity of 96–99.8% at the specificity threshold of
99% (Fig. 4e, f and Table 1). Because the minor allele fraction at loci
with paternal-specific alleles is theoretically half of the fetal DNA
fraction, we estimated fetal DNA percentage to be 6.6%, 20.1% and
26.3% for the three trimesters, respectively (Supplementary Table 2).
For the second and third trimester samples with higher fetal DNA
fraction, fetal genotypes could be extracted for most loci at which
the mother is heterozygous, as the separation in minor allele fraction
for fetal homozygous and fetal heterozygous SNPs was apparent
(Fig. 4a–c, e, f). For these loci, the ability to differentiate fetal
heterozygosity from homozygosity depended on sequencing depth and fetal
DNA fraction (Supplementary Fig. 1).

[Figure 4 plots. a–c, Histograms of minor allele fraction (frequency against
minor allele fraction) for the first, second and third trimester, with loci
grouped as: maternal homozygous, fetal heterozygous; all homozygous; maternal
heterozygous, fetal homozygous; all heterozygous. Insets: ROC curves (true
positive rate against false positive rate) for maternal heterozygous and
maternal homozygous positions. d, Occurrences against coverage for the first,
second and third trimester. e, f, ROC curves (true positive rate against false
positive rate) for the series SeqRef, Array and SeqRef–Array in trimesters 1–3.]

Figure 4 | Exome sequencing of P1 maternal plasma DNA in all three
trimesters to determine maternal and fetal genotypes. a–c, Histograms of
minor allele fraction in maternal plasma from first (a), second (b) and third
(c) trimesters of P1 at positions that are confidently called in both plasma
sequencing data and pure fetal/maternal DNA genotyping data. Insets:
Receiver operating characteristic (ROC) curves for detecting positions at which
the fetal genotype differs from the maternal genotype when the maternal position
is either homozygous or heterozygous. The higher the fetal fraction (~6, 20, 26%
for trimester 1, 2, 3, respectively), the more the distributions are separated, and
the easier it is to distinguish between the two distributions of fetal genotype.
d, Histogram of per-position coverage, with bin size of 5. Exome positions
with >100× coverage are 75%, 78% and 90% for trimesters 1, 2 and 3,
respectively, and those with >200× coverage are 48%, 56% and 84%. e, f, ROC
curves at genomic positions where the mother is heterozygous (e) or homozygous
(f), using either sequencing or SNP array of pure DNA as references for maternal
and fetal genotypes. ‘SeqRef’ uses a sequenced reference, ‘Array’ uses a SNP
array, and ‘SeqRef-Array’ uses a sequenced reference only at positions on a SNP
array.

Table 1 | Exome diagnostic cut-offs and the resulting sensitivity and
specificity

Sensitivity at each specificity cut-off:

                              Maternal homozygous    Maternal heterozygous
Trimester    Fetal fraction   95%       99%          85%       90%
1            6%               98%       96%          25%       16%
2            20%              99.8%     99.8%        89%       85%
3            26%              99.7%     99.6%        96%       93%
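The minor allele fraction defined above, and the four genotype classes it is expected to separate (Fig. 1), can be sketched as follows. The minor allele fraction calculation follows the definition in the text; the nearest-expected-value rule used for classification is a simplification assumed here for illustration, whereas the analysis in the paper relies on cut-offs evaluated by ROC curves (Table 1). Function names and labels are hypothetical.

```python
def minor_allele_fraction(counts):
    """Second largest nucleotide count divided by the sum of the two
    largest counts. `counts` maps nucleotide -> number of reads."""
    top_two = sorted(counts.values(), reverse=True)[:2]
    top_two += [0] * (2 - len(top_two))  # pad if fewer than two alleles seen
    total = top_two[0] + top_two[1]
    if total == 0:
        raise ValueError("no reads at this locus")
    return top_two[1] / total


def classify_locus(counts, fetal_fraction):
    """Assign the genotype class whose expected minor allele fraction is
    closest to the observed one (illustrative rule, not the paper's)."""
    e = fetal_fraction
    expected = {
        "mother homozygous, fetus homozygous": 0.0,
        "mother homozygous, fetus heterozygous": e / 2.0,
        "mother heterozygous, fetus homozygous": (1.0 - e) / 2.0,
        "mother heterozygous, fetus heterozygous": 0.5,
    }
    observed = minor_allele_fraction(counts)
    return min(expected, key=lambda label: abs(expected[label] - observed))


# Toy locus with 200 reads and a 20% fetal fraction
label = classify_locus({"A": 180, "G": 20}, fetal_fraction=0.20)
```

At low fetal fractions the expected values (1 − ε)/2 and 1/2 lie close together, so a rule of this kind needs far more reads to separate the two maternal-heterozygous classes than the two maternal-homozygous ones, consistent with the sensitivities in Table 1.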
`
`Discussion
The molecular counting methods described here offer a gateway to
comprehensive non-invasive prenatal diagnosis of genetic disease.
There are substantial ethical issues associated with non-invasive
prenatal genome determination, which we have not attempted to address.
We note, however, that there are numerous clinical scenarios
where this approach would be useful. In the first or second trimester,
it is possible to test for conditions that are not survivable or lead to
medical complications. As technologies for pharmaceutical and
surgical intervention improve, it may be possible to develop prenatal
treatments or even cures for these congenital conditions.
This is illustrated by our data on P2, who is an individual with
DiGeorge syndrome. Haplotyping of the maternal genome identified
a ~2.85 Mb deletion on 22q11.1 that is associated with the syndrome
on one copy of the maternal chromosome 22 (denoted as ‘maternal
haplotype 2’ in Fig. 2c). Haplotype counting in maternal plasma
indicated an over-representation of ‘maternal haplotype 2’ in the region
immediately adjacent to that deletion, indicating fetal inheritance of
the DiGeorge syndrome associated deletion (Fig. 2c, deletion
indicated in blue). This result was confirmed by quantitative PCR of cord
blood DNA (Supplementary Fig. 8). In this clinical scenario,
confirmation of the deletion would argue for a fetal echocardiogram
and neonatal assessment of calcium levels.
Knowledge of the fetal genotypes obtained in the third trimester
enables diagnosis of conditions that would benefit from treatment
immediately after delivery; these include metabolic and
immunological disorders such as phenylketonuria, galactosaemia, maple syrup
urine disease, and severe combined immunodeficiency. Currently,
newborns with these conditions suffer as symptoms manifest themselves
in the time it takes to determine the proper diagnosis and treatment,
`
which is often as simple as diet change. In summary, we anticipate that
there is no technical barrier to, and there are many practical applications of,
having the entire fetal genome determined non-invasively in clinical settings.
`
`METHODS SUMMARY
Two pregnant subjects (P1 and P2) were recruited with informed consent and
approval of the Institutional Review Board of Stanford University. Peripheral blood
was prospectively obtained at each trimester during the course of pregnancy and
post delivery. Direct deterministic phasing (DDP) was performed on three to four
single cells obtained from cultures of maternal blood lymphocytes (ref. 15). Cell-free
DNA was extracted from maternal plasma during pregnancy and converted into
Illumina sequencing libraries using previously established methods (ref. 4). Exome
capture was performed on cell-free DNA using SeqCap EZ v2.0 Kit (Roche
NimbleGen). Genomic DNA from postpartum maternal blood cells and cord
blood cells was assessed by genotyping array (Illumina’s HumanOmni1-Quad)
and exome sequencing to provide the reference genotypes of the mother
and the fetus.
To detect the over-represented parental haplotypes, each chromosome was
divided into equally sized bins with a sliding window of 100 kb. The bin size was
chosen such that the average count was at least that required to overcome
counting noise when determining relative representation of the two maternal
haplotypes. The relative representation of the maternal haplotypes was calculated
using the expression (N_p1/n_p1 − N_p2/n_p2), where N_pi is the number of
occurrences of markers defining ‘maternal haplotype i’ within the bin counted by
sequencing, and n_pi is the total number of usable markers that define ‘maternal
haplotype i’ within the bin. If the expression was positive, maternal haplotype 1
was considered inherited. If the expression was negative, maternal haplotype 2 was
considered inherited. Imputation of the allelic identity on unobserved loci was
calculated with Impute v1 (ref. 17) using the –haploid option.
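The binned comparison described above can be sketched in a few lines of Python. The input layout (a list of marker tuples), the default bin size and the function name are hypothetical choices for this illustration; the quantity computed per bin is the expression N_p1/n_p1 − N_p2/n_p2 given in the text, and the call follows its sign.

```python
def haplotype_calls(markers, chrom_length, bin_size=3_000_000, step=100_000):
    """Sliding-window comparison of the two maternal haplotypes.

    markers: list of (position, haplotype, read_count), haplotype in {1, 2},
        one entry per usable marker (read_count may be zero).
    Returns a list of (bin_start, value, call), where
        value = N_p1 / n_p1 - N_p2 / n_p2 for markers inside the bin and
        call is 1, 2 or None (tie). Bins lacking markers on either
        haplotype are skipped.
    """
    results = []
    start = 0
    while start < chrom_length:
        end = start + bin_size
        reads = {1: 0, 2: 0}      # N_pi: summed marker occurrences
        n_markers = {1: 0, 2: 0}  # n_pi: number of usable markers
        for position, haplotype, read_count in markers:
            if start <= position < end:
                reads[haplotype] += read_count
                n_markers[haplotype] += 1
        if n_markers[1] and n_markers[2]:
            value = reads[1] / n_markers[1] - reads[2] / n_markers[2]
            call = 1 if value > 0 else 2 if value < 0 else None
            results.append((start, value, call))
        start += step
    return results


# Toy chromosome: markers every 50 kb, haplotype 1 slightly over-represented
toy_markers = [
    (50_000 * i, 1 if i % 2 == 0 else 2, 10 if i % 2 == 0 else 8)
    for i in range(20)
]
toy_calls = haplotype_calls(toy_markers, chrom_length=1_000_000, bin_size=500_000)
```

The loop rescans all markers for every bin, which keeps the example short; a production implementation would more likely sort markers by position and update running sums as the window slides.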
`
`Full Methods and any associated references are available in the online version of
`the paper at www.nature.com/nature.
`
`Received 1 March; accepted 23 May 2012.
`Published online 4 July 2012.
`
1. Mandel, P. & Metais, P. Les acides nucléiques du plasma sanguin chez l’homme.
C. R. Acad. Sci. Paris 142, 241–243 (1948).
2. Lo, Y. M. et al. Quantitative analysis of fetal DNA in maternal plasma and serum:
implications for noninvasive prenatal diagnosis. Am. J. Hum. Genet. 62, 768–775
(1998).
3. Bodurtha, J. & Strauss, J. F. III. Genomics and perinatal care. N. Engl. J. Med. 366,
64–73 (2012).
4. Fan, H. C., Blumenfeld, Y. J., Chitkara, U., Hudgins, L. & Quake, S. R. Noninvasive
diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood.
Proc. Natl Acad. Sci. USA 105, 16266–16271 (2008).
5. Sehnert, A. J. et al. Optimal detection of fetal chromosomal abnormalities by
massively parallel DNA sequencing of cell-free fetal DNA from maternal blood.
Clin. Chem. 57, 1042–1049 (2011).
6. Bianchi, D. W. et al. Genome-wide fetal aneuploidy detection by maternal plasma
DNA sequencing. Obstet. Gynecol. 119, 890–901 (2012).
7. Palomaki, G. E. et al. DNA sequencing of maternal plasma reliably identifies
trisomy 18 and trisomy 13 as well as Down syndrome: an international
collaborative study. Genet. Med. 14, 296–305 (2012).
8. Palomaki, G. E. et al. DNA sequencing of maternal plasma to detect Down
syndrome: an international clinical validation study. Genet. Med. 13, 913–920
(2011).
9. Ehrich, M. et al. Noninvasive detection of fetal trisomy 21 by sequencing of DNA in
maternal blood: a study in a clinical setting. Am. J. Obstet. Gynecol. 204,
205.e1–211.e11 (2011).
10. Chiu, R. W. et al. Non-invasive prenatal assessment of trisomy 21 by multiplexed
maternal plasma DNA sequencing: large scale validity study. Br. Med. J. 342,
c7401 (2011).
11. Lo, Y. M. et al. Maternal plasma DNA sequencing reveals the genome-wide genetic
and mutational profile of the fetus. Sci. Transl. Med. 2, 61ra91 (2010).
12. Fan, H. C. & Quake, S. R. In principle method for noninvasive determination of the
fetal genome. Preprint at http://precedings.nature.com/documents/5373/
version/1 (2010).
13. Macintyre, S. & Sooman, A. Non-paternity and prenatal genetic screening. Lancet
338, 869–871 (1991).
14. Bellis, M. A., Hughes, K., Hughes, S. & Ashton, J. R. Measuring paternal discrepancy
and its public health consequences. J. Epidemiol. Community Health 59, 749–754
(2005).
15. Fan, H. C., Wang, J., Potanina, A. & Quake, S. R. Whole-genome molecular
haplotyping of single cells. Nature Biotechnol. 29, 51–57 (2011).
16. The 1000 Genomes Project Consortium. A map of human genome variation from
population-scale sequencing. Nature 467, 1061–1073 (2010).
17. Marchini, J. et al. A comparison of phasing algorithms for trios and unrelated
individuals. Am. J. Hum. Genet. 78, 437–450 (2006).
`
`Supplementary Information is linked to the online version of the paper at
`www.nature.com/nature.
`
`Acknowledgements The authors would like to thank E. Kogut and staff of the Division of
`Perinatal Genetics and the General Clinical Research Center of Stanford University for
`coordination of patient recruitment; R. Wong for initial sample processing of clinical
`samples; N. Neff, G. Mantalas, B. Passarelli and W. Koh for their help in sequencing
`library preparation and data analysis.
`
`Author Contributions H.C.F., W.G. and S.R.Q. conceived the study. H.C.F., W.G. and J.W.
`performed experiments. H.C.F., W.G. and J.W. analysed the data. Y.J.B. and Y.Y.E.-S.
`coordinated patient recruitment. H.C.F., W.G., J.W. and S.R.Q. wrote the manuscript. All
`authors discussed the results and commented on the manuscript.
`
`Author Information Reprints and permissions information is available at
`www.nature.com/reprints. The authors declare competing financial interests: details
`are available in the online version of the paper. Readers are welcome to comment on
`the online version of this article at www.nature.com/nature. Correspondence and
`requests for materials should be addressed to S.R.Q. (quake@stanford.edu).
`
`
`
`
`METHODS
Prediction of counting depth requirement for determination of
over-representation of transmitted maternal haplotypes. Given two distributions
of Poisson random variables, one with mean of N, and the other with mean of
N(1 − ε), where N is the cumulative sum of the count of all usable markers on the
transmitted maternal haplotype, the sampling requirement of N to differentiate
the two distributions can be estimated from the following expression, using the
normal approximation of the Poisson distribution for large values of N:
`ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffip
`
`ffiffiffi