`1042–1049 (2011)
`
`Molecular Diagnostics and Genetics
`
`Optimal Detection of Fetal Chromosomal Abnormalities by
`Massively Parallel DNA Sequencing of Cell-Free Fetal DNA
`from Maternal Blood
`Amy J. Sehnert,1 Brian Rhees,1,2 David Comstock,1 Eileen de Feo,1 Gabrielle Heilek,1,3
`John Burke,4 and Richard P. Rava1*
`
`BACKGROUND: Massively parallel DNA sequencing of
`cell-free fetal DNA from maternal blood can detect fe-
`tal chromosomal abnormalities. Although existing al-
`gorithms focus on the detection of fetal trisomy 21
`(T21), these same algorithms have difficulty detecting
`trisomy 18 (T18).
`
`METHODS: Blood samples were collected from 1014 pa-
`tients at 13 US clinic locations before they underwent
`an invasive prenatal procedure. All samples were pro-
`cessed to plasma, and the DNA extracted from 119
`samples underwent massively parallel DNA sequenc-
`ing. Fifty-three sequenced samples came from women
`with an abnormal fetal karyotype. To minimize the
`intra- and interrun sequencing variation, we developed
`an optimized algorithm by using normalized chromo-
`some values (NCVs) from the sequencing data on a
`training set of 71 samples with 26 abnormal karyo-
`types. The classification process was then evaluated on
`an independent test set of 48 samples with 27 abnormal
`karyotypes.
`
RESULTS: Mapped sites for chromosomes of interest in
the sequencing data from the training set were normalized
individually by calculating the ratio of the number
of sites on the specified chromosome to the number of
sites observed on an optimized normalizing chromosome
(or chromosome set). Threshold values for trisomy
or sex chromosome classification were then established
for all chromosomes of interest, and a classification
schema was defined. Sequencing of the independent test
set led to 100% correct classification of T21 (13 of 13)
and T18 (8 of 8) samples. Other chromosomal
abnormalities were also identified.
`
CONCLUSION: Massively parallel sequencing is capable
of detecting multiple fetal chromosomal abnormalities
from maternal plasma when an optimized algorithm is
used.
`© 2011 American Association for Clinical Chemistry
`
The American College of Obstetricians and Gynecologists
`Practice Bulletin no. 77, published in 2007, supports
`the measurement of nuchal translucency and surrogate
`biochemical markers in all pregnant women in the first
`trimester to assess the risk of aneuploidy for Down syn-
`drome (1 ). These screening tests can provide only an
`inconclusive risk determination; they have nonoptimal
`detection and high false-positive rates. Today, only in-
`vasive methods, including chorionic villus sampling
`(CVS),3 amniocentesis, or cordocentesis, provide defi-
`nite genetic information about the fetus, but these pro-
`cedures are associated with risks to both mother and
`fetus (2– 4 ). Therefore, a noninvasive means to obtain
`definite information on fetal chromosomal status is
`desirable.
`Fan et al., in 2008, were the first to suggest count-
`ing chromosomes by mapping sequence tags as a po-
`tential quantification method for detecting fetal aneu-
`ploidy from cell-free DNA (cfDNA) obtained from
`maternal blood (5, 6 ). In these studies, massively par-
`allel DNA sequencing of cfDNA obtained from the ma-
`ternal plasma yielded millions of short sequence tags
`that could be aligned and uniquely mapped to sites
`from a reference human genome. The depth of se-
`quencing and subsequent counting statistics determine
`the sensitivity of detection for fetal aneuploidy (7 ).
`Although 2 recently published reports of studies
`with larger populations have described the successful
`use of sequence tag mapping and chromosome count-
`ing to detect fetal aneuploidy, these studies focused
`only on the classification of trisomy 21 (T21) (8, 9 ).
The algorithms used in these studies appear to be unable
to effectively detect other aneuploidies, such as
trisomy 18 (T18), that would inevitably occur in a clinical
population being offered a commercially available
test. In this study, we developed and tested an optimized
algorithm from massively parallel sequencing
data and demonstrated the potential universality
of the sequence tag mapping and chromosome-quantification
method for the detection of multiple
chromosomal abnormalities.

1 Verinata Health, Inc., San Carlos, CA; 2 current affiliation: Caris Life Sciences,
Phoenix, AZ; 3 current affiliation: Roche Molecular Systems, Pleasanton, CA;
4 Biotique Systems, Reno, NV.
* Address correspondence to this author at: 1531 Industrial Rd., San Carlos, CA
94070. Fax +650-362-2151; e-mail rrava@verinata.com.

Received March 17, 2011; accepted April 8, 2011.
Previously published online at DOI: 10.1373/clinchem.2011.165910
3 Nonstandard abbreviations: CVS, chorionic villus sampling; cfDNA, cell-free
DNA; T21, trisomy 21; T18, trisomy 18; NCV, normalized chromosome value;
T13, trisomy 13.

Optimal Detection of Fetal Aneuploidy by Sequencing
`
`Materials and Methods
`
`BLOOD SAMPLES AND CLINICAL INFORMATION
`The study was conducted by qualified clinical research
`personnel at 13 US clinic locations between April 2009
`and July 2010 under a human participant protocol ap-
`proved by institutional review boards at each institu-
`tion. Informed written consent was obtained from each
`woman before her inclusion in the study.
`The protocol was designed to provide blood sam-
`ples and clinical data to support the development of
`noninvasive prenatal genetic diagnostic methods.
`Pregnant women age 18 years or older were eligible for
`inclusion. For patients undergoing clinically indicated
`CVS or amniocentesis, blood was collected before per-
`formance of the procedure, and fetal-karyotyping re-
`sults were also collected. Peripheral blood samples (2
`tubes or approximately 20 mL total) were drawn from
`all participants and collected into tubes containing acid
`citrate dextrose (Becton Dickinson). All samples were
`deidentified and assigned an anonymous study identi-
`fication number. Blood samples were shipped over-
`night to Verinata Health, Inc. (San Carlos, CA) in
`temperature-controlled shipping containers provided
`for the study. The time elapsed between blood draw
`and sample receipt was recorded upon accessioning at
`the Verinata Health laboratory.
`Site research coordinators used the anonymous
`patient identification number in entering clinical data
`relevant to the patient’s current pregnancy and history
into study case-report forms. Cytogenetic analysis of
the fetal karyotype from samples obtained in the invasive
prenatal procedure was performed per the local
laboratories, and these results were also recorded in the
`study case-report forms. All data obtained on the forms
`were entered into a clinical database at Verinata
`Health.
`
`SAMPLE PROCESSING AND SEQUENCING
`Cell-free plasma was obtained from individual blood
tubes within 24–48 h of venipuncture via centrifugation
at 1600g for 10 min, transfer to microcentrifuge
`tubes, and centrifugation at 16 000g for 10 min to re-
`move residual cells. Plasma from a single blood tube
was sufficient for sequencing analysis. cfDNA was
extracted from cell-free plasma with the QIAamp DNA
`Blood Mini Kit (Qiagen) according to the manufactur-
`er’s instructions. Because cfDNA fragments are known
`to be approximately 170 bp in length (10 ), a DNA-
`fragmentation step was not required before sequenc-
`ing. For the training set samples, we sent cfDNA to
`Prognosys Biosciences to prepare a sequencing library
`(cfDNA blunt-ended and ligated to universal adapters)
and for sequencing on the Illumina Genome Analyzer IIx
instrumentation according to the manufacturer’s
standard protocols (http://www.illumina.com/).
Single-end reads of 36 bp were obtained. Upon
`completion of the sequencing, all base-call files were
`transferred to Verinata Health for further analysis. For
`the test set samples, we prepared the sequencing librar-
`ies and carried out the sequencing on the Illumina Ge-
`nome Analyzer IIx instrument at Verinata Health. For
`both the training and test sample sets, single-end reads
`of 36 bp were sequenced.
`
`DATA ANALYSIS AND SAMPLE CLASSIFICATION
Sequence reads of 36 bases in length were aligned to the
human genome assembly hg18 obtained from the University
of California, Santa Cruz database
(http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/).
`Alignments were carried out by using the Bowtie short
`read aligner (version 0.12.5) and allowing for up to 2
`base mismatches during alignment (11 ). Only reads
`that unambiguously mapped to a single genomic loca-
`tion were included. The genomic sites where reads
`mapped were counted and included in the calculation
of chromosome ratios (see below). Regions on the Y
chromosome where sequence tags from male and female
fetuses map without any discrimination were excluded
from the analysis (specifically, from base 0 to
base 2 × 10^6, base 10 × 10^6 to base 13 × 10^6, and base
23 × 10^6 to the end of chromosome Y).
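The counting step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline. The input format (an iterable of `(chrom, pos)` pairs for reads the aligner has already reported as uniquely mapped), the function names, and the half-open treatment of the excluded intervals are all assumptions. A "site" is taken here to be a distinct mapped position, so repeated reads at the same position count once.

```python
from collections import defaultdict

# Excluded chrY intervals in bases, taken from the text above.
# Intervals are treated as half-open [start, end); None means
# "to the end of chromosome Y". The boundary handling is an assumption.
Y_EXCLUDED = [(0, 2_000_000), (10_000_000, 13_000_000), (23_000_000, None)]


def in_excluded_y_region(pos):
    """Return True if a chrY position falls in an excluded interval."""
    for start, end in Y_EXCLUDED:
        if pos >= start and (end is None or pos < end):
            return True
    return False


def count_mapped_sites(alignments):
    """Count distinct mapped sites per chromosome.

    `alignments` is a hypothetical iterable of (chrom, pos) pairs for
    uniquely mapped reads; chromosome names follow the UCSC "chrN" style.
    """
    sites = defaultdict(set)
    for chrom, pos in alignments:
        if chrom == "chrY" and in_excluded_y_region(pos):
            continue  # skip chrY regions that do not discriminate fetal sex
        sites[chrom].add(pos)
    return {chrom: len(positions) for chrom, positions in sites.items()}
```

The per-chromosome totals returned here are the quantities that enter the chromosome ratios described next.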
`Intrarun and interrun sequencing variation in the
`chromosomal distribution of sequence reads can ob-
`scure the effects of fetal aneuploidy on the distribution
`of mapped sequence sites. To correct for such varia-
`tion, we calculate a chromosome ratio, in which the
`count of mapped sites for a given chromosome of in-
`terest is normalized to counts observed on another pre-
`determined chromosome (or set of chromosomes). To
`identify the optimal chromosome ratio for each chro-
`mosome of interest, we reviewed the unaffected subset
`of the training data (i.e., including only samples with
`diploid karyotypes for chromosomes 21, 18, 13, and X)
`and considered each autosome as a potential denomi-
`nator in a ratio of counts with our chromosomes of
`interest. We selected denominator chromosomes that
`minimized the variation of the chromosome ratios
within and between sequencing runs. Each chromosome
of interest was determined to have a distinct
denominator (Table 1).

Clinical Chemistry 57:7 (2011) 1043

Table 1. Chromosome ratio calculation rules.

Chromosome     Numerator                    Denominator
of interest    (chromosome mapped sites)    (chromosome mapped sites)
21             21                           9
18             18                           8
13             13                           Sum (2–6)
X              X                            6
Y              Y                            Sum (2–6)

The full training set was then used to set the
boundaries for sample classification. The means and
SDs of chromosome ratios for the unaffected samples
in the training set were determined. For each sample
and chromosome of interest, a normalized chromosome
value (NCV) was calculated with the equation:

\[
\mathrm{NCV}_{ij} = \frac{x_{ij} - \hat{\mu}_j}{\hat{\sigma}_j},
\]

where $\hat{\mu}_j$ and $\hat{\sigma}_j$ are the estimated training set mean and
SD, respectively, for the j-th chromosome ratio and $x_{ij}$
is the observed j-th chromosome ratio for sample i.
`When chromosome ratios are normally distributed,
`the NCV is equivalent to a statistical z score for the
`ratios. No significant departure from linearity was ob-
`served in a quantile– quantile plot of the NCVs from
`unaffected samples. In addition, standard tests of nor-
`mality for the NCVs failed to reject the null hypothesis
`of normality. For both the Kolmogorov–Smirnov and
Shapiro–Wilk tests, the significance value was >0.05.
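The ratio and NCV calculations described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code. The function names and the dictionary-of-counts input are hypothetical. The text does not give the exact criterion used to pick denominators, so the coefficient of variation across unaffected samples is used here as a stand-in, and the sample SD (n − 1 denominator) from Python's `statistics.stdev` is assumed.

```python
from statistics import mean, stdev


def chromosome_ratio(counts, chrom, denominator):
    """Sites on `chrom` divided by the summed sites on `denominator`.

    `counts` maps chromosome names to mapped-site counts; `denominator`
    is a tuple of one or more chromosome names (e.g. chr2 to chr6).
    """
    return counts[chrom] / sum(counts[c] for c in denominator)


def select_denominator(unaffected, chrom, candidates):
    """Pick the candidate denominator with the smallest ratio variation.

    Assumption: variation is measured as the coefficient of variation
    (SD / mean) over the unaffected training samples.
    """
    def cv(denominator):
        ratios = [chromosome_ratio(c, chrom, denominator) for c in unaffected]
        return stdev(ratios) / mean(ratios)

    return min(candidates, key=cv)


def fit_reference(unaffected, chrom, denominator):
    """Estimate the training-set mean and SD of the chromosome ratio."""
    ratios = [chromosome_ratio(c, chrom, denominator) for c in unaffected]
    return mean(ratios), stdev(ratios)


def ncv(counts, chrom, denominator, mu, sigma):
    """Normalized chromosome value: (ratio - mean) / SD."""
    return (chromosome_ratio(counts, chrom, denominator) - mu) / sigma
```

In use, `select_denominator` and `fit_reference` would be run once on the unaffected training samples, and `ncv` would then be applied to each new sample with the stored mean and SD.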
For the test set, an NCV was calculated for each
chromosome of interest (21, 18, 13, X, and Y) for every
sample. To ensure a safe and effective classification
scheme, we chose conservative boundaries for aneuploidy
classification. For classification of the autosomes’
aneuploidy state, we required an NCV >4.0 to
classify the chromosome as affected (i.e., aneuploid for
that chromosome) and an NCV <2.5 to classify a
chromosome as unaffected. Samples with autosomes that
had an NCV between 2.5 and 4.0 were classified as “no
call.”
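The autosome rule above amounts to a three-way threshold test, sketched below. The function and constant names are hypothetical, and treating NCVs of exactly 2.5 or 4.0 as "no call" is an assumption, since the text does not say how boundary values are handled.

```python
AFFECTED_THRESHOLD = 4.0    # NCV above this: classified as affected
UNAFFECTED_THRESHOLD = 2.5  # NCV below this: classified as unaffected


def classify_autosome(ncv):
    """Classify one autosome from its NCV using the stated boundaries."""
    if ncv > AFFECTED_THRESHOLD:
        return "affected"
    if ncv < UNAFFECTED_THRESHOLD:
        return "unaffected"
    return "no call"  # 2.5 <= NCV <= 4.0 (boundary handling assumed)
```

Note that the rule is one-sided: a strongly negative NCV is classified as unaffected for that chromosome.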
Sex chromosome classification in the test is performed
in a somewhat more complex fashion, by sequential
application of NCVs for both X and Y.
Specifically:
1. If the NCV for Y is greater than −2.0 SDs from the
mean of male samples, then the sample is classified
as male (XY).
2. If the NCV for Y is less than −2.0 SDs from the
mean of male samples and the NCV for X is greater
than −2.0 SDs from the mean of female samples,
then the sample is classified as female (XX).
3. If the NCV for Y is less than −2.0 SDs from the
mean of male samples and the NCV for X is less than
−3.0 SDs from the mean of female samples, then the
sample is classified as monosomy X, i.e., Turner
syndrome.
4. If the NCVs do not fit into any of the above 3 criteria,
then the sample is classified as a “no call” for sex.
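The four rules above can be written as a short sequential function. This is a sketch with hypothetical names. It assumes the Y value is expressed in SD units relative to the male training samples and the X value in SD units relative to the female training samples, as the rules state. Values exactly on a threshold fall through to "no call" here, which is an assumption.

```python
def classify_sex(ncv_y_male, ncv_x_female):
    """Apply the sequential sex chromosome rules.

    ncv_y_male:   chrY ratio in SDs from the mean of male samples.
    ncv_x_female: chrX ratio in SDs from the mean of female samples.
    """
    if ncv_y_male > -2.0:
        return "XY"  # rule 1: Y signal consistent with a male fetus
    if ncv_y_male < -2.0 and ncv_x_female > -2.0:
        return "XX"  # rule 2: no Y signal, X consistent with female
    if ncv_y_male < -2.0 and ncv_x_female < -3.0:
        return "monosomy X"  # rule 3: no Y signal and reduced X
    return "no call"  # rule 4: anything else, e.g. X between -3.0 and -2.0
```

The gap between −3.0 and −2.0 on the X axis is what produces a "no call" for samples that lack Y signal but have only a moderately reduced X ratio.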
`
`Results
`
`STUDY POPULATION DEMOGRAPHICS
`We enrolled 1014 patients between April 2009 and July
`2010. The patient demographic characteristics, the
`type of invasive procedure, and karyotype results are
summarized in Table 2. The mean age of the study
participants was 35.6 years (range, 17–47 years), and
gestational age ranged from 6 weeks, 1 day to 38 weeks, 1
`day (mean, 15 weeks, 4 days). The observed overall
`incidence of abnormal fetal chromosome karyotypes
`was 6.8%, with a T21 incidence of 2.5%. Of 946 partici-
`pants with singleton pregnancies and a karyotype, 906
`(96%) showed at least 1 clinically recognized risk factor
`for fetal aneuploidy before undergoing the prenatal
`procedure. Even after eliminating the women with ad-
`vanced maternal age as their sole indication, the data
`demonstrate a very high false-positive rate for current
`screening modalities. Ultrasound findings of increased
`nuchal translucency, cystic hygroma, or another struc-
`tural congenital abnormality were most predictive of
`an abnormal karyotype in this cohort.
`The distribution of the diverse ethnic backgrounds
`represented in this study population is also shown in
`Table 2. Overall, the patients were 63% Caucasian, 17%
`Hispanic, 6% Asian, 5% multiethnic, and 4% African
`American. We noted that the ethnic diversity varied
`substantially from site to site. For example, one site
`enrolled 60% Hispanic and 26% Caucasian individu-
`als, whereas 3 other clinics located in the same state
`enrolled no Hispanic participants. As expected, there
`were no discernible differences in our results with re-
`spect to different ethnicities.
`
`TRAINING SET DATA
`The training set study selected 71 samples from the
`initial sequential accumulation of 435 samples that
`were collected between April 2009 and December 2009.
`All participants with affected fetuses (abnormal karyo-
`types) in this first series of participants, as well as a
`random selection of nonaffected individuals with ade-
`quate sample and data, were included for sequencing.
The clinical characteristics of the patients in the
training set were consistent with the overall study
demographics summarized in Table 2. The gestational age
for the samples in the training set ranged from 10
weeks, 0 days to 23 weeks, 1 day. Thirty-eight patients
underwent CVS, 32 underwent amniocentesis, and 1
patient did not have the type of invasive procedure
specified (an unaffected karyotype, 46,XY). The
patients were 70% Caucasian, 8.5% Hispanic, 8.5%
Asian, and 8.5% multiethnic. Six sequenced samples
were removed from this set for the purposes of
training: 4 samples from individuals with twin
gestations (further discussed below), 1 sample with T18 that
was contaminated during preparation, and 1 sample
with fetal karyotype 69,XXX. This left 65 samples for
the training set.

Table 2. Patient demographics.

Demographic characteristics       Total enrolled          Training set            Test set
                                  (n = 1014)              (n = 71)                (n = 48)
Dates of enrollment               Apr 2009 to Jul 2010    Apr 2009 to Dec 2009    Jan 2010 to Jun 2010
Patients enrolled, n              1014                    435                     575
Maternal age
  Mean (SD), years                35.6 (5.66)             36.4 (6.05)             34.2 (8.22)
  Minimum/maximum, years          17/47                   20/46                   18/46
  Not specified, n                11                      3                       0
Ethnicity, n (%)
  Caucasian                       636 (62.7)              50 (70.4)               24 (50.0)
  Hispanic                        167 (16.5)              6 (8.5)                 13 (27.0)
  Asian                           63 (6.2)                6 (8.5)                 5 (10.4)
  Multiethnic (>1)                53 (5.2)                6 (8.5)                 1 (2.1)
  African American                41 (4.0)                1 (1.3)                 3 (6.3)
  Other                           36 (3.6)                2 (2.8)                 1 (2.1)
  Native American                 9 (0.9)                 0 (0.0)                 1 (2.1)
  Not specified                   9 (0.9)                 0 (0.0)                 0 (0.0)
Gestational age, weeks, days
  Mean                            15, 4                   14, 5                   15, 3
  Minimum/maximum                 6, 1/38, 1              10, 0/23, 1             10, 4/28, 3
No. of fetuses, n
  1                               982                     67                      47
  2                               30                      4                       1
  3                               2                       0                       0
Prenatal procedure, n (%)
  CVS                             430 (42.4)              38 (53.5)               28 (58.3)
  Amniocentesis                   571 (56.3)              32 (45.1)               20 (41.7)
  Not specified                   3 (0.3)                 1 (1.4)                 0 (0.0)
  Not performed                   10 (1.0)                0 (0.0)                 0 (0.0)
Fetal karyotype, n (%)
  46,XX                           453^a (43.9)            22^a (29.7)             7^a (14.6)
  46,XY                           474^a (45.9)            26^a (35.1)             14 (29.2)
  47,+21 (both sexes)             25^a (2.4)              10^a (13.5)             13 (27.1)
  47,+18 (both sexes)             14 (1.4)                5 (6.8)                 8 (16.7)
  47,+13 (both sexes)             4 (0.4)                 2 (2.7)                 1 (2.1)
  45,X                            8 (0.8)                 3 (4.1)                 3 (6.3)
  Complex, other                  18^a (1.7)              6 (8.1)                 2 (4.2)
  Karyotype not available         36 (3.5)                0 (0.0)                 0 (0.0)
Prenatal screening risks for      Nonsequenced            Analyzed training set   Analyzed test set
karyotyped singletons             (n = 834), n (%)        (n = 65), n (%)         (n = 47), n (%)
  AMA^b only (≥35 years)          445 (53.4)              27 (41.5)               21 (44.7)
  Screen positive (trisomy)^c     149 (17.9)              18 (27.7)               9 (19.1)
  Increased NT                    35 (4.2)                3 (4.6)                 5 (10.6)
  Cystic hygroma                  12 (1.4)                5 (7.7)                 4 (8.5)
  Cardiac defect                  14 (1.7)                0 (0.0)                 4 (8.5)
  Other congenital abnormality    78 (9.4)                4 (6.2)                 3 (6.4)
  Other maternal risk             64 (7.7)                5 (7.7)                 1 (2.1)
  None specified                  37 (4.4)                3 (4.6)                 0 (0.0)

^a Includes results of fetuses from multiple gestations.
^b AMA, advanced maternal age; NT, nuchal translucency.
^c Assessed and reported by clinicians.
`The number of unique sequence sites (i.e., tags
`identified with unique sites in the genome) increased
from 2.2 × 10^6 in the early phases of the training set
study to 13.7 × 10^6 in the latter phases because of
improvements in the sequencing technology over this
period. To monitor for any potential shifts in the
chromosome ratios over this 6-fold range in unique sites,
we ran different unaffected samples at the beginning
and the end of the study. For the first run of 15
unaffected samples, the mean number of unique sites was
3.8 × 10^6, and the mean chromosome ratios for
chromosomes 21 and 18 were 0.314 and 0.528, respectively.
For the last run of 15 unaffected samples, the mean
number of unique sites was 10.7 × 10^6, and the mean
`chromosome ratios for chromosomes 21 and 18 were
`0.316 and 0.529, respectively. There was no statistically
`significant difference in the chromosome ratios over
`the time of the training set study for chromosome 21 or
`for chromosome 18.
`The training set NCVs for chromosomes 21, 18,
`and 13 are shown in Fig. 1. These results are consistent
with an assumption of normality, in that approximately
99% of the diploid NCVs fall within ±2.5 SDs
`of the mean. Of this set of 65 samples, 8 samples with
`clinical karyotypes indicating T21 had NCVs between 6
`and 20. Four samples with clinical karyotypes indica-
`tive of fetal T18 had NCVs between 3.3 and 12, and the
`2 samples with karyotypes indicative of fetal trisomy 13
`(T13) had NCVs of 2.6 and 4. The spread in the NCVs
`in affected samples is due to their dependence on the
`percentage of fetal cfDNA in the individual samples.
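A rough way to see this dependence is the following back-of-the-envelope relation. It is offered as a sketch, not as the authors' derivation. The symbol $f$ (the fraction of cfDNA that is fetal) and $\mathrm{CV}_j$ (the coefficient of variation of the j-th ratio in unaffected samples) are introduced here for illustration, and the relation assumes a non-mosaic trisomy with disomic denominator chromosomes.

```latex
% A trisomic fetus contributes 3 copies instead of 2 of chromosome j,
% so the expected count for that chromosome scales by (1 + f/2).
\[
\mathbb{E}\left[x_{ij} \mid \text{trisomy } j\right]
  \approx \hat{\mu}_j \left(1 + \frac{f}{2}\right)
\quad\Longrightarrow\quad
\mathrm{NCV}_{ij}
  \approx \frac{\hat{\mu}_j \, f}{2\,\hat{\sigma}_j}
  = \frac{f}{2\,\mathrm{CV}_j},
\qquad
\mathrm{CV}_j = \frac{\hat{\sigma}_j}{\hat{\mu}_j}.
\]
```

With purely hypothetical values $\mathrm{CV}_j = 0.5\%$ and $f = 10\%$, this gives an NCV of about 10, and $f = 4\%$ gives about 4. Under this approximation the NCV of an affected sample grows linearly with the fetal fraction, which is one way to read the spread described above.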
`Similarly to the autosomes, the means and SDs for
`the sex chromosomes were established in the training
`set. The sex chromosome thresholds allowed 100% of
`the male and female fetuses in the training set to be
`identified.
`
Fig. 1. NCVs for the 65 samples in the training set.
The last 8 samples in the chromosome 21 data set (NCVs
6–20) have T21 karyotypes. The last 4 samples in the
chromosome 18 data set (NCVs 3.3–12) have T18 karyotypes.
The last 2 samples in the chromosome 13 data set
(NCVs 2.6 and 4) have T13 karyotypes.

TEST SET DATA
Having established chromosome ratio means and SDs
from the training set, we selected a test set of 48 samples
from 575 samples collected between January 2010 and
June 2010. One of the samples from a twin gestation
was removed from the final analysis, leaving 47 samples
`in the test set. The personnel preparing samples for
`sequencing and operating the equipment were blinded
`to the clinical karyotype information. The range of ges-
`tational ages was similar to that of the training set (Ta-
`ble 2). Fifty-eight percent of the invasive procedures
`were CVS, higher than the percentage of the overall
`procedural demographics, but similar to that of the
`training set. The participants were 50% Caucasian,
`27% Hispanic, 10.4% Asian, and 6.3% African
`American.
`In the test set, the number of unique sequence tags
varied from approximately 13 × 10^6 to 26 × 10^6. For
`unaffected samples, the chromosome ratios for chro-
`mosomes 21 and 18 were 0.313 and 0.527, respectively.
`The test set NCVs for chromosomes 21, 18, and 13
`are shown in Fig. 2, and the classifications are given
`in Table 3. In the test set, 13 of 13 individuals with
`clinical karyotypes indicating fetal T21 were cor-
`rectly identified, with NCVs between 5 and 14. All 8
`individuals with karyotypes indicating fetal T18
`were correctly identified, with NCVs between 8.5
`and 22. The single sample with a karyotype classified
`as T13 in this test set was classified as a “no call,” with
`an NCV of approximately 3.
`For the test data set, all male samples were cor-
`rectly identified [including a sample with complex
`karyotype, 46,XY plus a marker chromosome (uniden-
`tifiable by cytogenetics); Table 3]. Nineteen of 20 fe-
`male samples were correctly identified; 1 female sample
`was categorized as a “no call.” Two of 3 samples in the
`test set with a karyotype of 45,X were correctly identi-
`fied as monosomy X. The third sample was classified as
`a “no call” (Table 3).
`
`Fig. 2. NCVs for the 47 samples in the test set.
`The last 13 samples in the chromosome 21 data (NCVs
`5–14) and the last 8 samples in the chromosome 18 data
`(NCVs 8.5–22) were correctly classified as T21 and T18,
`respectively. The last sample in the chromosome 13 data
`set (NCV of approximately 3) was classified as a “no
`call.”
`
`TWINS
`Although the method is currently envisioned for use
`with singleton pregnancies, 4 of the samples initially
`selected for the training set and 1 of the samples in the
`test set were from twin gestations. The thresholds we
`are using could be confounded by the different
`amounts of cfDNA expected in the setting of a twin
`gestation. In the training set, the karyotype from one of
the twin samples was monochorionic 47,XY,+21. A
second twin sample was fraternal, and amniocentesis
was carried out on each of the fetuses. One of these
fetuses had a karyotype of 47,XY,+21, whereas the
`other had a normal karyotype, 46,XX. In both of these
`cases, the cell-free classification based on the methods
`discussed above classified the sample as T21. The other
`2 twin gestations in the training set were classified cor-
`rectly as nonaffected for T21 (all twins showed a dip-
`loid karyotype for chromosome 21). For the twin ges-
`tation sample in the test set, a karyotype was established
`only for twin B (46,XX); the algorithm correctly classi-
`fied this patient as nonaffected for T21.
`
Table 3. Test set classification data.

T21 classification
Karyotype              Unaffected, n    T21, n    No call, n
Diploid Chr 21^a       34               -         -
47,XX or XY,+21        -                13        -

T18 classification
Karyotype              Unaffected, n    T18, n    No call, n
Diploid Chr 18         39               -         -
47,XX or XY,+18        -                8         -

T13 classification
Karyotype              Unaffected, n    T13, n    No call, n
Diploid Chr 13         46               -         -
47,XY,+13              -                -         1

Sex chromosome classification
Karyotype              XY, n    XX, n    MX, n    No call, n
XY                     23       -        -        -
XX                     -        18       -        1
45,X                   -        -        2        1
Complex, other         1        1        -        -

^a Chr 21, chromosome 21; MX, monosomy in the X chromosome with no
evidence of Y chromosome.

Discussion

In this study, we have optimized the power of massively
parallel sequencing for detecting multiple abnormal fetal
karyotypes from the blood of pregnant women. To
our knowledge, this study is the first to demonstrate
100% correct classification of samples with trisomy 21
and trisomy 18 with an independent set of test data.
Even in the case of fetuses with abnormal sex chromosome
karyotypes, no sample was incorrectly classified
with our algorithm. Importantly, the algorithm also
`performed well in detecting the presence of T21 in 2
`sets of twin pregnancies with at least 1 affected fetus, a
`result that has not previously been reported. Further-
`more, our study examined a variety of sequential sam-
`ples from multiple centers that not only represented
`the range of abnormal karyotypes one is likely to wit-
`ness in a commercial clinical setting but also demon-
`strated the importance of accurately classifying preg-
`nancies unaffected by common trisomies to address
`the unacceptably high false-positive rates that occur in
`prenatal screening today. The data provide valuable in-
`sight into the vast potential of this method for use in the
`future.
An analysis of subsets of the unique genomic sites
showed increases in the variance consistent with Poisson
counting statistics. Our work builds on the findings
of Fan and Quake, who demonstrated that the sensitivity
of noninvasive prenatal detection of fetal aneuploidy
from maternal plasma via massively parallel sequencing
is limited only by the counting statistics (7).
Because we are collecting information across the entire
genome, this method is capable of detecting any aneuploidy
or other copy number variation including insertions
and deletions. The karyotype for one of our samples
`had a small deletion in chromosome 11 between q21
`and q23 that we observed as an approximately 10%
`decrease in the relative number of tags in a 25-Mb re-
`gion starting at q21 when we analyzed the sequencing
`data in 500-kb bins. In addition, 3 of the samples in the
`training set had complex sex karyotypes that the cyto-
`genetic analysis revealed to be due to mosaicism. These
`karyotypes were: (a) 47,XXX[9]/45,X[6], (b) 45,X[3]/
`46,XY[17], and (c) 47,XXX[13]/45,X[7]. Sample b,
`which showed some XY-containing cells, was correctly
`classified as XY. Samples a (from a CVS procedure) and
`c (from amniocentesis), which a cytogenetic analysis
`revealed both to be a mixture of XXX and X cells (con-
`sistent with mosaic Turner syndrome), were classified
`as a “no call” and monosomy X, respectively. Further
`work is warranted to compare the results obtained
`from sequencing data—not only to cytogenetic results
`obtained via invasive prenatal procedures (particularly
`CVS, which can reveal confined placental mosaicism)
`but also to birth outcomes—to better understand test
`performance in the setting of such complex cases.
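The 500-kb bin analysis mentioned above for the chromosome 11 deletion can be sketched as follows. This is an illustration only. The function names and inputs are hypothetical, and the text does not describe how the authors computed the relative number of tags, so the fraction of a chromosome's tags falling in a region is used here as one plausible summary to compare between a sample and unaffected references.

```python
from collections import Counter

BIN_SIZE = 500_000  # 500-kb bins, as in the text


def bin_counts(positions, bin_size=BIN_SIZE):
    """Count tags per fixed-width bin; keys are bin indices (pos // bin_size)."""
    return Counter(pos // bin_size for pos in positions)


def region_fraction(binned, start, end, bin_size=BIN_SIZE):
    """Fraction of a chromosome's tags that fall in bins covering [start, end).

    `binned` is the Counter returned by bin_counts for one chromosome.
    Bins with no tags contribute zero.
    """
    first, last = start // bin_size, (end - 1) // bin_size
    in_region = sum(binned[b] for b in range(first, last + 1))
    return in_region / sum(binned.values())
```

Comparing `region_fraction` for a sample against the same quantity in unaffected samples would show a relative decrease over a deleted region, subject to the counting noise in each bin.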
In testing our algorithm, we observed another interesting
result for one of the samples from our test set,
which had an NCV between −5 and −6 for chromosome
21 (Fig. 2). Although cytogenetic analysis revealed
this sample to be diploid for chromosome 21,
the karyotype showed mosaicism, with trisomy for
chromosome 9: 47,XX,+9[9]/46,XX[6]. Because chromosome
9 is used in the denominator of our algorithm
for determining the chromosome 21 ratio (Table 1),
that mosaicism lowered the chromosome 21 NCV. This
result strikingly demonstrates the ability of this algorithm
to detect fetal trisomy 9 in this case. In subsequent
studies, which we are now conducting, we are
`using multiple chromosome ratios to ensure correct
`classification for the chromosomes of interest. In addi-
`tion, we are establishing normalizing chromosomes for
`all of the autosomes to increase the probability of de-
`tecting rare aneuploidies across the genome.
The conclusion of Fan et al. regarding the sensitivity
of these methods is correct only if the algorithms
being used are able to account for any random or systematic
biases introduced by the sequencing method. If
the sequencing data are not properly normalized, the
resulting analysis will perform worse than the limit set
by the counting statistics. Chiu et al. noted in their
recent report that their measurement of chromosomes 18
and 13 with the massively parallel sequencing method
was imprecise and concluded that more research was
necessary in order to apply the method to the
determination of T18 and T13 (8). The method described
by Chiu et al. simply uses the number of sequence tags
on the chromosome of interest (in their case,
chromosome 21) normalized by the total number of tags
in the sequencing run. The challenge for this approach
is that the distribution of tags on each chromosome can
vary from sequencing run to sequencing run, and this
variation thus can increase the overall variation of the
aneuploidy-detection metric. To compare the results
obtained with the Chiu algorithm to the chromosome
ratios we describe in this report, we reanalyzed our set
of test data for chromosomes 21 and 18 with the method
recommended by Chiu et al. (Fig. 3). Overall, we
observed a compression in the range of NCVs for
chromosomes 21 and 18 separately, as well as a decrease
in the detection rate, with 10 of 13 T21 samples and
5 of 8 T18 samples correctly identified from our test set
with an NCV threshold of 4.0 for aneuploidy classification.

Fig. 3. NCVs for the 47 samples in the test set for
chromosomes 21 and 18 by using the normalization
procedure of Chiu et al.
The last 13 samples in the chromosome 21 data set have
T21 clinical karyotypes; 10 of the 13 samples were classified
as T21. The last 8 samples in the chromosome 18 data
set have T18 clinical karyotypes; 5 of 8 samples were
classified as T18.
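The contrast between the two normalizations discussed above can be illustrated with a toy calculation. This is an idealized sketch, not a reanalysis of any data. The function names, the counts, and the 2% run-specific bias are all hypothetical. It assumes the chromosome of interest and its denominator chromosome respond to a run's conditions in the same way, which is the property the denominator selection is meant to approximate.

```python
def fraction_of_total(counts, chrom):
    """Tags on `chrom` divided by all tags in the run (total-tag normalization)."""
    return counts[chrom] / sum(counts.values())


def ratio_to_denominator(counts, chrom, denominator):
    """Tags on `chrom` divided by tags on one normalizing chromosome."""
    return counts[chrom] / counts[denominator]


def apply_run_bias(counts, bias):
    """Scale each chromosome's count by a hypothetical run-specific factor."""
    return {c: n * bias.get(c, 1.0) for c, n in counts.items()}


# Hypothetical unaffected sample; "chr1" stands in for the rest of the genome.
baseline = {"chr21": 314.0, "chr9": 1000.0, "chr1": 20000.0}
# A run that over-represents chr21 and chr9 by the same 2%.
biased = apply_run_bias(baseline, {"chr21": 1.02, "chr9": 1.02})

shift_in_fraction = fraction_of_total(biased, "chr21") - fraction_of_total(baseline, "chr21")
shift_in_ratio = ratio_to_denominator(biased, "chr21", "chr9") - ratio_to_denominator(baseline, "chr21", "chr9")
```

In this idealized case the shared bias cancels in the chromosome ratio, while the fraction of total tags shifts between runs. Real run-to-run variation will not cancel exactly, so the benefit depends on how well the chosen denominator tracks the chromosome of interest.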
`Ehrich et al. also focused only on T21 and used the
`same algorithm as Chiu et al. (9 ). After observing a
`shift in their test set z-score metric from the external
`reference data (training set), they retrained on the test
`set to establish the classification boundaries. Al-
`though this approach is feasible in principle, in prac-
`tice it would be challenging to decide how many
`samples would be required for training and how of-
`ten one would need to retrain to ensure that the
classification boundaries were correct. One method
of mitigating this issue is to include in every
sequencing run controls that measure the baseline and
`calibrate for quantitative behavior. We are currently
`developing such controls and are incorporating
`them into our future clinical studies.
`In conclusion, we have shown that massively par-
`allel sequencing is capable of detecting multiple fetal
`chromosomal abnormalities from the plasma of preg-
`nant women when the algorithm for normalizing the
`chromosome-counting data is optimized. Our algo-
`rithms for quantification not only minimize random
`and systematic variation between sequencing runs but
`also allow for effective classification of aneuploidies
`across the entire genome, most notably T21 and T18.
`Larger sample collections are required to further test
`the algorithm for T13 detection. To this end, we are
`currently conducting a prospective, blinded, multisite
`clinical study to further demonstrate the diagnostic ac-
`curacy of our methods and to validate the conclusions
`presented in this report.
`
`Author Contributions: All authors confirmed they have contributed to
`the intellectual content of this paper and have met the following 3 re-
`quirements: (a) significant contributions to the conception and design,
`acquisition of data, or analysis and interpretation of data; (b) drafting
`or revising the article for intellectual content; and (c) final approval of
`the published article.
`
`Authors’ Disclosures or Potential Conflicts of Interest: Upon man-
`uscript submission, all authors completed the Disclosures of Potential
`Conflict of Interest form. Potential conflicts of interest:
`
`Employment or Leadership: A.J. Sehnert, Verinata Health; B.K.
`Rhees, Verinata Health; D. Comstock, Verinata Health; E. de Feo,
`Verinata Health; G. Heilek, Verinata Health; J. Burke, Verinata
`Health; R.P. Rava, Verinata Health.
`Consultant or Advisory Role: J. Burke, Verinata Health.
`Stock Ownership: D. Comstock, Verinata Health; G. Heilek, Amgen,
`Verinata Health, and Roche; R.P. Rava, Verinata Health.
`Honoraria: None declared.
`Research Funding: None declared.
`Expert Testimony: None declared.
`
`Role of Sponsor: The funding organizations played a direct role in
`the design of the study, the choice of enrolled patients, the review and
`interpretation of data, and the preparation and final approval of the
`manuscript.
`
`Acknowledgments: We thank the pregnant women who enrolled
`in this study, without whom this research could not have been
`conducted. We also thank the following members of the Verinata
`Health research team who processed samples and generated the
`data: Neha Agarwal, Manjula Chinnappa, Gail Chinno