`
(19) United States
(12) Patent Application Publication    (10) Pub. No.: US 2009/0029377 A1
     Lo et al.                         (43) Pub. Date: Jan. 29, 2009
`
`
`
`(54) DIAGNOSING FETAL CHROMOSOMAL
`ANEUPLOIDY USING MASSIVELY
`PARALLEL GENOMIC SEQUENCING
`
Related U.S. Application Data

(60) Provisional application No. 60/951,438, filed on Jul. 23, 2007.
`
(75) Inventors: Yuk-Ming Dennis Lo, Kowloon (HK);
     Rossa Wai Kwun Chiu, New Territories (HK);
     Kwan Chee Chan, Kowloon (HK)

Correspondence Address:
TOWNSEND AND TOWNSEND AND CREW, LLP
TWO EMBARCADERO CENTER, EIGHTH FLOOR
SAN FRANCISCO, CA 94111-3834 (US)
`
(73) Assignee: The Chinese University of Hong Kong,
     New Territories (HK)

(21) Appl. No.: 12/178,181

(22) Filed: Jul. 23, 2008
`
Publication Classification

(51) Int. Cl.
     C12Q 1/68     (2006.01)
     G06F 19/00    (2006.01)
(52) U.S. Cl. ........ 435/6; 702/20
`
(57) ABSTRACT

Embodiments of this invention provide methods, systems,
and apparatus for determining whether a fetal chromosomal
aneuploidy exists from a biological sample obtained from a
pregnant female. Nucleic acid molecules of the biological
sample are sequenced, such that a fraction of the genome is
sequenced. Respective amounts of a clinically-relevant
chromosome and of background chromosomes are determined
from results of the sequencing. A parameter derived from
these amounts (e.g. a ratio) is compared to one or more cutoff
values, thereby determining a classification of whether a fetal
chromosomal aneuploidy exists.
`
`
`
`
`
`
`
`
Page 1 of 22

SEQUENOM EXHIBIT 1002
`
`
`
Patent Application Publication    Jan. 29, 2009    Sheet 1 of 9    US 2009/0029377 A1

FIG. 1 (flowchart of method 100)

110  Receive sample
120  Sequence fraction of genome in sample
130  Based on sequencing, determine first amount of a first chromosome
140  Determine second amount of one or more second chromosomes
150  Determine parameter from first amount and second amount
160  Based on the comparison, a classification of whether a fetal
     chromosomal aneuploidy exists for the first chromosome is determined
`
`
`
FIG. 2 (Sheet 2 of 9; flowchart of method 200)

210  Receive sample
220  Calculate number of sequences needed
230  Randomly sequence fraction of genome
240  Based on sequencing, determine first amount of a first chromosome
250  Determine second amount of one or more second chromosomes
260  Determine parameter from first amount and second amount
270  Based on the comparison, a classification of whether a fetal
     chromosomal aneuploidy exists for the first chromosome is determined
`
`
`
FIG. 3A (Sheet 3 of 9): plot. Y-axis: Percentage of sequences mapped to
chromosome 21 (%). X-axis: Status of the fetus (Trisomy 21, Normal).

FIG. 3B (Sheet 3 of 9): plot. X-axis: Fetal DNA percentage determined by
microfluidics digital PCR ZFY/ZFX assays (%). Y-axis: Fetal DNA percentage
determined by the percentage of Y chromosome sequences using massively
parallel sequencing (%).
`
`
`
FIG. 4A (Sheet 4 of 9): plot. X-axis: chromosomes 1 to 22, X, and Y.
Legend includes Trisomy 21. Y-axis label not recoverable from the scan.

FIG. 4B (Sheet 4 of 9): plot. X-axis: Chromosome (1 to 22). Y-axis:
Percentage difference in chromosomal representation (%).
`
`
`
`
`
FIG. 5 (Sheet 5 of 9): plot. X-axis: Fractional fetal DNA concentration (%).
Y-axis label not recoverable from the scan.
`
`
`
FIG. 6 (Sheet 6 of 9): rotated table; cell contents not recoverable from
the scan.
`
`
`
FIG. 7 (Sheet 7 of 9): rotated table; cell contents not recoverable from
the scan.
`
`
`
FIGS. 8A and 8B (Sheet 8 of 9): rotated tables; cell contents not
recoverable from the scan.
`
`
`
`
`
`
`
`
`
`
`
`
FIG. 9 (Sheet 9 of 9): block diagram; labels not recoverable from the scan.
`
`
`
`
`US 2009/0029377 A1
`
`Jan. 29, 2009
`
`DIAGNOSING FETAL CHROMOSOMAL
`ANEUPLOIDY USING MASSIVELY
`PARALLEL GENOMIC SEQUENCING
`
`CLAIM OF PRIORITY
`
[0001] The present application claims priority from and is a
non-provisional application of U.S. Provisional Application
No. 60/951,438, entitled “DETERMINING A NUCLEIC
ACID SEQUENCE IMBALANCE” filed Jul. 23, 2007
(Attorney Docket No. 016285-005200US), the entire contents of
which are herein incorporated by reference for all purposes.
`
`CROSS-REFERENCES TO RELATED
`APPLICATIONS
`
[0002] The present application is also related to a concurrently
filed non-provisional application entitled “DETERMINING
A NUCLEIC ACID SEQUENCE IMBALANCE”
(Attorney Docket No. 016285-005210US), the entire contents
of which are herein incorporated by reference for all purposes.
`
`FIELD OF THE INVENTION
`
`[0003] This invention generally relates to the diagnostic
`testing of fetal chromosomal aneuploidy by determining
`imbalances between different nucleic acid sequences, and
`more particularly to the identification of trisomy 21 (Down
`syndrome) and other chromosomal aneuploidies via testing a
`maternal sample (e.g. blood).
`
`BACKGROUND
`
[0004] Fetal chromosomal aneuploidy results from the
presence of abnormal dose(s) of a chromosome or chromosomal
region. The abnormal dose(s) can be abnormally high,
e.g. the presence of an extra chromosome 21 or chromosomal
region in trisomy 21; or abnormally low, e.g. the absence of a
copy of chromosome X in Turner syndrome.
[0005] Conventional prenatal diagnostic methods of a fetal
chromosomal aneuploidy, e.g., trisomy 21, involve the sampling
of fetal materials by invasive procedures such as amniocentesis
or chorionic villus sampling, which pose a finite risk
of fetal loss. Non-invasive procedures, such as screening by
ultrasonography and biochemical markers, have been used to
risk-stratify pregnant women prior to definitive invasive
diagnostic procedures. However, these screening methods typically
measure epiphenomena that are associated with the
chromosomal aneuploidy, e.g., trisomy 21, instead of the core
chromosomal abnormality, and thus have suboptimal diagnostic
accuracy and other disadvantages, such as being highly
influenced by gestational age.
[0006] The discovery of circulating cell-free fetal DNA in
maternal plasma in 1997 offered new possibilities for noninvasive
prenatal diagnosis (Lo, Y M D and Chiu, R W K 2007
Nat Rev Genet 8, 71-77). While this method has been readily
applied to the prenatal diagnosis of sex-linked (Costa, J M et
al. 2002 N Engl J Med 346, 1502) and certain single gene
disorders (Lo, Y M D et al. 1998 N Engl J Med 339, 1734-1738),
its application to the prenatal detection of fetal chromosomal
aneuploidies has represented a considerable challenge
(Lo, Y M D and Chiu, R W K 2007, supra). First, fetal
nucleic acids co-exist in maternal plasma with a high background
of nucleic acids of maternal origin that can often
interfere with the analysis of fetal nucleic acids (Lo, Y M D et
al. 1998 Am J Hum Genet 62, 768-775). Second, fetal nucleic
acids circulate in maternal plasma predominantly in a cell-free
form, making it difficult to derive dosage information of
genes or chromosomes within the fetal genome.
[0007] Significant developments overcoming these challenges
have recently been made (Benachi, A & Costa, J M
2007 Lancet 369, 440-442). One approach detects fetal-specific
nucleic acids in the maternal plasma, thus overcoming
the problem of maternal background interference (Lo, Y M D
and Chiu, R W K 2007, supra). Dosage of chromosome 21
was inferred from the ratios of polymorphic alleles in the
placenta-derived DNA/RNA molecules. However, this
method is less accurate when samples contain lower amounts
of the targeted nucleic acid and can only be applied to fetuses
who are heterozygous for the targeted polymorphisms, which
is only a subset of the population if one polymorphism is
used.
`
[0008] Dhallan et al (Dhallan, R, et al. 2007 Lancet 369, 474-481)
described an alternative
strategy of enriching the proportion of circulating fetal DNA
by adding formaldehyde to maternal plasma. The proportion
of chromosome 21 sequences contributed by the fetus in
maternal plasma was determined by assessing the ratio of
paternally-inherited fetal-specific alleles to non-fetal-specific
alleles for single nucleotide polymorphisms (SNPs) on
chromosome 21. SNP ratios were similarly computed for a reference
chromosome. An imbalance of fetal chromosome 21
was then inferred by detecting a statistically significant
difference between the SNP ratios for chromosome 21 and those
of the reference chromosome, where significant is defined
using a fixed p-value of ≤0.05. To ensure high population
coverage, more than 500 SNPs were targeted per chromosome.
However, there have been controversies regarding the
effectiveness of formaldehyde to enrich fetal DNA to a high
proportion (Chung, G T Y, et al. 2005 Clin Chem 51, 655-658),
and thus the reproducibility of the method needs to be
further evaluated. Also, as each fetus and mother would be
informative for a different number of SNPs for each chromosome,
the power of the statistical test for SNP ratio comparison
would be variable from case to case (Lo, Y M D & Chiu,
R W K. 2007 Lancet 369, 1997). Furthermore, since these
approaches depend on the detection of genetic polymorphisms,
they are limited to fetuses heterozygous for these
polymorphisms.
[0009] Using polymerase chain reaction (PCR) and DNA
quantification of a chromosome 21 locus and a reference
locus in amniocyte cultures obtained from trisomy 21 and
euploid fetuses, Zimmermann et al (2002 Clin Chem 48,
362-363) were able to distinguish the two groups of fetuses
based on the 1.5-fold increase in chromosome 21 DNA
sequences in the former. Since a 2-fold difference in DNA
template concentration constitutes a difference of only one
threshold cycle (Ct), the discrimination of a 1.5-fold difference
has been the limit of conventional real-time PCR. To
achieve finer degrees of quantitative discrimination, alternative
strategies are needed.
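
As an illustration only, the threshold-cycle arithmetic in the preceding paragraph can be sketched as follows. The function name delta_ct and the efficiency parameter are hypothetical and are not part of the disclosure; an efficiency of 1.0 assumes idealized doubling of template in every cycle.

```python
import math


def delta_ct(fold_change: float, efficiency: float = 1.0) -> float:
    """Return the threshold-cycle (Ct) shift produced by a fold change in template.

    efficiency=1.0 assumes perfect doubling per PCR cycle (an idealization).
    """
    if fold_change <= 0 or efficiency <= 0:
        raise ValueError("fold_change and efficiency must be positive")
    # Template grows by a factor of (1 + efficiency) per cycle.
    return math.log(fold_change) / math.log(1.0 + efficiency)


ct_shift_two_fold = delta_ct(2.0)   # one full cycle under ideal doubling
ct_shift_trisomy = delta_ct(1.5)    # a fraction of one cycle
```

Under these assumptions a 1.5-fold difference shifts Ct by less than one cycle, which is consistent with the statement that such a difference is at the limit of conventional real-time PCR.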
[0010] Digital PCR has been developed for the detection of
allelic ratio skewing in nucleic acid samples (Chang, H W et
al. 2002 J Natl Cancer Inst 94, 1697-1703). Digital PCR is an
amplification-based nucleic acid analysis technique which
requires the distribution of a specimen containing nucleic
acids into a multitude of discrete samples, with each sample
containing on average not more than about one target
sequence. Specific nucleic acid targets are amplified
with sequence-specific primers to generate specific
amplicons by digital PCR. The nucleic acid loci to be targeted
and the species of or panel of sequence-specific primers to be
included in the reactions are determined or selected prior to
nucleic acid analysis.
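
By way of a hedged sketch, the distribution of targets into discrete samples is commonly modelled with a Poisson distribution. That modelling choice is an assumption made here for illustration and is not stated in the text above; the function names are likewise hypothetical.

```python
import math


def positive_fraction(mean_targets_per_sample: float) -> float:
    """Expected fraction of discrete samples containing at least one target,
    assuming targets are Poisson-distributed across samples (an assumption)."""
    if mean_targets_per_sample < 0:
        raise ValueError("mean must be non-negative")
    return 1.0 - math.exp(-mean_targets_per_sample)


def mean_targets_from_positives(positive_samples: int, total_samples: int) -> float:
    """Invert the Poisson relation: estimate the mean number of targets per
    sample from the observed count of positive samples."""
    if not 0 <= positive_samples < total_samples:
        raise ValueError("need 0 <= positive_samples < total_samples")
    return -math.log(1.0 - positive_samples / total_samples)
```

With a mean of about one target per sample, a substantial share of samples is expected to be empty under this model, which is why counting positive samples can be converted back into an estimate of template concentration.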
[0011] Clinically, digital PCR has been shown to be useful for the
detection of loss of heterozygosity (LOH) in tumor DNA
samples (Zhou, W. et al. 2002 Lancet 359, 219-225). For the
analysis of digital PCR results, sequential probability ratio
testing (SPRT) has been adopted by previous studies to classify
the experimental results as being suggestive of the presence
of LOH in a sample or not (El Karoui et al. 2006 Stat
Med 25, 3124-3133).
[0012] In methods used in the previous studies, the amount
of data collected from the digital PCR is quite low. Thus, the
accuracy can be compromised due to the small number of data
points and typical statistical fluctuations.
[0013] It is therefore desirable that noninvasive tests have
high sensitivity and specificity to minimize false negatives
and false positives, respectively. However, fetal DNA is
present in low absolute concentration and represents a minor
portion of all DNA sequences in maternal plasma and serum.
It is therefore also desirable to have methods that allow the
noninvasive detection of fetal chromosomal aneuploidy by
maximizing the amount of genetic information that could be
inferred from the limited amount of fetal nucleic acids which
exist as a minor population in a biological sample containing
maternal background nucleic acids.
`
`BRIEF SUMMARY
`
[0014] Embodiments of this invention provide methods,
systems, and apparatus for determining whether a nucleic
acid sequence imbalance (e.g., chromosome imbalance)
exists within a biological sample obtained from a pregnant
female. This determination may be done by using a parameter
of an amount of a clinically-relevant chromosomal region in
relation to other non-clinically-relevant chromosomal
regions (background regions) within a biological sample. In
one aspect, an amount of chromosomes is determined from a
sequencing of nucleic acid molecules in a maternal sample,
such as urine, plasma, serum, and other suitable biological
samples. Nucleic acid molecules of the biological sample are
sequenced, such that a fraction of the genome is sequenced.
One or more cutoff values are chosen for determining
whether a change compared to a reference quantity exists (i.e.
an imbalance), for example, with regard to the ratio of
amounts of two chromosomal regions (or sets of regions).
`[0015] According to one exemplary embodiment, a bio-
`logical sample received from a pregnant female is analyzed to
`perform a prenatal diagnosis of a fetal chromosomal aneup-
`loidy. The biological sample includes nucleic acid molecules.
`A portion of the nucleic acid molecules contained in the
`biological sample are sequenced. In one aspect, the amount of
`genetic information obtained is sufficient for accurate diag-
`nosis yet not overly excessive so as to contain costs and the
`amount of input biological sample required.
[0016] Based on the sequencing, a first amount of a first
chromosome is determined from sequences identified as
originating from the first chromosome. A second amount of
one or more second chromosomes is determined from
sequences identified as originating from one of the second
chromosomes. A parameter from the first amount and the
second amount is then compared to one or more cutoff values.
Based on the comparison, a classification of whether a fetal
chromosomal aneuploidy exists for the first chromosome is
determined. The sequencing advantageously maximizes the
amount of genetic information that could be inferred from the
limited amount of fetal nucleic acids which exist as a minor
population in a biological sample containing maternal
background nucleic acids.
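
The steps of the preceding paragraph can be sketched as below, purely as an illustration under stated assumptions. Each sequence is assumed to have already been assigned a chromosome label; the parameter is taken to be a simple ratio of counts; and the cutoff of 0.014, the labels, and the function names are all hypothetical.

```python
from collections import Counter
from typing import Iterable


def amount_ratio(tag_chromosomes: Iterable[str], first: str,
                 second: Iterable[str]) -> float:
    """Ratio of tags from the first chromosome to tags from the second chromosome(s)."""
    counts = Counter(tag_chromosomes)
    first_amount = counts[first]
    second_amount = sum(counts[c] for c in set(second))
    if second_amount == 0:
        raise ValueError("no tags mapped to the second chromosome(s)")
    return first_amount / second_amount


def classify(parameter: float, cutoff: float) -> str:
    """Compare the parameter to a single cutoff value (hypothetical rule)."""
    return "aneuploidy suspected" if parameter > cutoff else "no aneuploidy detected"


# Toy data: 15 tags labelled chr21 and 985 tags labelled chr1.
tags = ["chr21"] * 15 + ["chr1"] * 985
parameter = amount_ratio(tags, first="chr21", second=["chr1"])
result = classify(parameter, cutoff=0.014)  # cutoff chosen only for illustration
```

A single cutoff is used here for brevity; the text contemplates one or more cutoff values, so a practical rule could distinguish more than two states.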
[0017] According to one exemplary embodiment, a biological
sample received from a pregnant female is analyzed to
perform a prenatal diagnosis of a fetal chromosomal aneuploidy.
The biological sample includes nucleic acid molecules.
A percentage of fetal DNA in the biological sample is identified.
A number N of sequences to be analyzed based on a
desired accuracy is calculated based on the percentage. At
least N of the nucleic acid molecules contained in the biological
sample are randomly sequenced.
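
One possible way to calculate N from the fetal DNA percentage is sketched below. It is not the calculation of the disclosure: it assumes a binomial model with a normal approximation, assumes that a trisomic fetus raises the representation of the first chromosome by a factor of (1 + f/2) for fetal fraction f, and uses a hypothetical separation parameter z.

```python
import math


def sequences_needed(fetal_fraction: float, chrom_fraction: float,
                     z: float = 3.0) -> int:
    """Estimate N such that the expected trisomy shift is z standard errors.

    fetal_fraction: fraction of fetal DNA in the sample (0 < f <= 1).
    chrom_fraction: expected share of sequences from the first chromosome
                    in a euploid pregnancy (hypothetical input).
    z:              desired separation in standard errors (hypothetical).
    """
    if not (0 < fetal_fraction <= 1 and 0 < chrom_fraction < 1):
        raise ValueError("fractions out of range")
    # Expected absolute increase in the chromosome share under trisomy.
    shift = chrom_fraction * fetal_fraction / 2.0
    # Binomial variance per sequence: p * (1 - p).
    n = (z ** 2) * chrom_fraction * (1.0 - chrom_fraction) / shift ** 2
    return math.ceil(n)
```

Under this model, halving the fetal DNA percentage roughly quadruples N, which illustrates why the fetal DNA percentage is identified before the number of sequences is fixed.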
[0018] Based on the random sequencing, a first amount of a
first chromosome is determined from sequences identified as
originating from the first chromosome. A second amount of
one or more second chromosomes is determined from
sequences identified as originating from one of the second
chromosomes. A parameter from the first amount and the
second amount is then compared to one or more cutoff values.
Based on the comparison, a classification of whether a fetal
chromosomal aneuploidy exists for the first chromosome is
determined. The random sequencing advantageously maximizes
the amount of genetic information that could be
inferred from the limited amount of fetal nucleic acids which
exist as a minor population in a biological sample containing
maternal background nucleic acids.
`[0019] Other embodiments of the invention are directed to
`systems and computer readable media associated with meth-
`ods described herein.
`
[0020] A better understanding of the nature and advantages
of the present invention may be gained with reference to the
following detailed description and the accompanying drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
[0021] FIG. 1 is a flowchart of a method 100 for performing
prenatal diagnosis of a fetal chromosomal aneuploidy in a
biological sample obtained from a pregnant female subject
according to an embodiment of the present invention.
[0022] FIG. 2 is a flowchart of a method 200 for performing
prenatal diagnosis of a fetal chromosomal aneuploidy using
random sequencing according to an embodiment of the
present invention.
[0023] FIG. 3A shows a plot of percentage representation
of chromosome 21 sequences in maternal plasma samples
involving trisomy 21 or euploid fetuses according to an
embodiment of the present invention.
[0024] FIG. 3B shows a correlation between maternal
plasma fractional fetal DNA concentrations determined by
massively parallel sequencing and microfluidics digital PCR
according to an embodiment of the present invention.
[0025] FIG. 4A shows a plot of percentage representation
of aligned sequences per chromosome according to an
embodiment of the present invention.
[0026] FIG. 4B shows a plot of difference (%) in percentage
representation per chromosome between the trisomy 21 case
and euploid case shown in FIG. 4A.
[0027] FIG. 5 shows a correlation between degree of
overrepresentation in chromosome 21 sequences and the fractional
fetal DNA concentrations in maternal plasma involving
trisomy 21 fetuses according to an embodiment of the present
invention.
[0028] FIG. 6 shows a table of a portion of human genome
that was analyzed according to an embodiment of the present
invention. T21 denotes a sample obtained from a pregnancy
involving a trisomy 21 fetus.
[0029] FIG. 7 shows a table of a number of sequences
required to differentiate euploid from trisomy 21 fetuses
according to an embodiment of the present invention.
[0030] FIG. 8A shows a table of top ten starting positions of
sequenced tags aligned to chromosome 21 according to an
embodiment of the present invention.
[0031] FIG. 8B shows a table of top ten starting positions of
sequenced tags aligned to chromosome 22 according to an
embodiment of the present invention.
[0032] FIG. 9 shows a block diagram of an exemplary
computer apparatus usable with system and methods according
to embodiments of the present invention.
`
`DEFINITIONS
`
`[0033] The term “biological sample” as used herein refers
`to any sample that is taken from a subject (e.g., a human, such
`as a pregnant woman) and contains one or more nucleic acid
`molecule(s) of interest.
[0034] The term “nucleic acid” or “polynucleotide” refers
to a deoxyribonucleic acid (DNA) or ribonucleic acid (RNA)
and a polymer thereof in either single- or double-stranded
form. Unless specifically limited, the term encompasses
nucleic acids containing known analogs of natural nucleotides
that have similar binding properties as the reference
nucleic acid and are metabolized in a manner similar to naturally
occurring nucleotides. Unless otherwise indicated, a
particular nucleic acid sequence also implicitly encompasses
conservatively modified variants thereof (e.g., degenerate
codon substitutions), alleles, orthologs, SNPs, and complementary
sequences as well as the sequence explicitly indicated.
Specifically, degenerate codon substitutions may be
achieved by generating sequences in which the third position
of one or more selected (or all) codons is substituted with
mixed-base and/or deoxyinosine residues (Batzer et al.,
Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol.
Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell.
Probes 8:91-98 (1994)). The term nucleic acid is used
interchangeably with gene, cDNA, mRNA, small noncoding
RNA, micro RNA (miRNA), Piwi-interacting RNA, and
short hairpin RNA (shRNA) encoded by a gene or locus.
[0035] The term “gene” means the segment of DNA
involved in producing a polypeptide chain. It may include
regions preceding and following the coding region (leader
and trailer) as well as intervening sequences (introns)
between individual coding segments (exons).
[0036] The term “reaction” as used herein refers to any
process involving a chemical, enzymatic, or physical action
that is indicative of the presence or absence of a particular
polynucleotide sequence of interest. An example of a “reaction”
is an amplification reaction such as a polymerase chain
reaction (PCR). Another example of a “reaction” is a
sequencing reaction, either by synthesis or by ligation. An
“informative reaction” is one that indicates the presence of
one or more particular polynucleotide sequences of interest,
and in one case where only one sequence of interest is present.
The term “well” as used herein refers to a reaction at a
predetermined location within a confined structure, e.g., a
well-shaped vial, cell, or chamber in a PCR array.
[0037] The term “clinically relevant nucleic acid sequence”
as used herein can refer to a polynucleotide sequence corresponding
to a segment of a larger genomic sequence whose
potential imbalance is being tested or to the larger genomic
sequence itself. One example is the sequence of chromosome
21. Other examples include chromosome 18, 13, X and Y. Yet
other examples include mutated genetic sequences or genetic
polymorphisms or copy number variations that a fetus may
inherit from one or both of its parents. Yet other examples
include sequences which are mutated, deleted, or amplified in
a malignant tumor, e.g. sequences in which loss of heterozygosity
or gene duplication occur. In some embodiments, multiple
clinically relevant nucleic acid sequences, or equivalently
multiple markers of the clinically relevant nucleic acid
sequence, can be used to provide data for detecting the imbalance.
For instance, data from five non-consecutive sequences
on chromosome 21 can be used in an additive fashion for the
determination of a possible chromosome 21 imbalance, effectively
reducing the need of sample volume to 1/5.
`[0038] The term “background nucleic acid sequence” as
`used herein refers to a nucleic acid sequence whose normal
`ratio to the clinically relevant nucleic acid sequence is known,
`for instance a 1-to-1 ratio. As one example, the background
`nucleic acid sequence and the clinically relevant nucleic acid
`sequence are two alleles from the same chromosome that are
`distinct due to heterozygosity. In another example, the back-
`ground nucleic acid sequence is one allele that is heterozy-
`gous to another allele that is the clinically relevant nucleic
`acid sequence. Moreover, some of each of the background
`nucleic acid sequence and the clinically relevant nucleic acid
`sequence may come from different individuals.
`[0039] The term “reference nucleic acid sequence” as used
`herein refers to a nucleic acid sequence whose average con-
`centration per reaction is known or equivalently has been
`measured.
`
`[0040] The term “overrepresented nucleic acid sequence”
`as used herein refers to the nucleic acid sequence among two
`sequences of interest (e.g., a clinically relevant sequence and
`a background sequence) that is in more abundance than the
`other sequence in a biological sample.
`[0041] The term “based on” as used herein means “based at
`least in part on” and refers to one value (or result) being used
`in the determination of another value, such as occurs in the
`relationship of an input of a method and the output of that
`method. The term “derive” as used herein also refers to the
`relationship of an input of a method and the output of that
`method, such as occurs when the derivation is the calculation
`of a formula.
`
[0042] The term “quantitative data” as used herein means
data that are obtained from one or more reactions and that
provide one or more numerical values. For example, the number
of wells that show a fluorescent marker for a particular
sequence would be quantitative data.
`[0043] The term “parameter” as used herein means a
`numerical value that characterizes a quantitative data set and/
`or a numerical relationship between quantitative data sets. For
`example, a ratio (or function of a ratio) between a first amount
`of a first nucleic acid sequence and a second amount of a
`second nucleic acid sequence is a parameter.
[0044] The term “cutoff value” as used herein means a
numerical value whose value is used to arbitrate between two
or more states (e.g. diseased and non-diseased) of classification
for a biological sample. For example, if a parameter is
greater than the cutoff value, a first classification of the
quantitative data is made (e.g. diseased state); or if the parameter
is less than the cutoff value, a different classification of the
quantitative data is made (e.g. non-diseased state).
[0045] The term “imbalance” as used herein means any
significant deviation as defined by at least one cutoff value in
a quantity of the clinically relevant nucleic acid sequence
from a reference quantity. For example, the reference quantity
could be a ratio of 3/5, and thus an imbalance would occur if
the measured ratio is 1:1.
`
[0046] The term “chromosomal aneuploidy” as used herein
means a variation in the quantitative amount of a chromosome
from that of a diploid genome. The variation may be a gain or
a loss. It may involve the whole of one chromosome or a
region of a chromosome.
[0047] The term “random sequencing” as used herein refers
to sequencing whereby the nucleic acid fragments sequenced
have not been specifically identified or targeted before the
sequencing procedure. Sequence-specific primers to target
specific gene loci are not required. The pools of nucleic acids
sequenced vary from sample to sample and even from analysis
to analysis for the same sample. The identities of the
sequenced nucleic acids are only revealed from the sequencing
output generated. In some embodiments of the present
invention, the random sequencing may be preceded by procedures
to enrich a biological sample with particular populations
of nucleic acid molecules sharing certain common features.
In one embodiment, each of the fragments in the
biological sample has an equal probability of being
sequenced.
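
The equal-probability embodiment can be illustrated with a toy simulation, offered only as a sketch. Fragments are represented by hypothetical chromosome labels, and uniform sampling without replacement stands in for the sequencing procedure; neither choice is taken from the disclosure.

```python
import random
from collections import Counter
from typing import List, Optional, Sequence


def randomly_sequence(fragments: Sequence[str], n_tags: int,
                      seed: Optional[int] = None) -> List[str]:
    """Draw n_tags fragments uniformly at random, without replacement.

    No fragment is targeted in advance; which fragments were "sequenced"
    is known only from the returned output.
    """
    rng = random.Random(seed)
    return rng.sample(list(fragments), n_tags)


# Toy pool: 150 fragments labelled chr21 and 9850 labelled chr1.
pool = ["chr21"] * 150 + ["chr1"] * 9850
tags = randomly_sequence(pool, n_tags=1000, seed=1)
tag_counts = Counter(tags)  # composition varies from draw to draw
```

Repeating the draw with a different seed generally gives a different set of tags, mirroring the statement that the pool sequenced varies from analysis to analysis for the same sample.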
[0048] The term “fraction of the human genome” or “portion
of the human genome” as used herein refers to less than
100% of the nucleotide sequences in the human genome,
which comprises some 3 billion basepairs of nucleotides.
In the context of sequencing, it refers to less than 1-fold
coverage of the nucleotide sequences in the human genome.
The term may be expressed as a percentage or absolute number
of nucleotides/basepairs. As an example of use, the term
may be used to refer to the actual amount of sequencing
performed. Embodiments may determine the required minimal
value for the sequenced fraction of the human genome to
obtain an accurate diagnosis. As another example of use, the
term may refer to the amount of sequenced data used for
deriving a parameter or amount for disease classification.
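
As a worked illustration of the definition above, fold coverage can be estimated from the number of sequences and their length. The tag count and the 36-basepair tag length below are hypothetical inputs; only the genome size of some 3 billion basepairs is taken from the text.

```python
GENOME_SIZE_BP = 3_000_000_000  # "some 3 billion basepairs", per the definition


def fold_coverage(num_tags: int, tag_length_bp: int,
                  genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Total sequenced bases divided by genome size.

    This is an upper bound on the fraction of the genome actually covered,
    because tags may overlap one another.
    """
    if num_tags < 0 or tag_length_bp <= 0 or genome_size_bp <= 0:
        raise ValueError("invalid input")
    return num_tags * tag_length_bp / genome_size_bp


def is_fraction_of_genome(num_tags: int, tag_length_bp: int) -> bool:
    """True when the sequencing amounts to less than 1-fold coverage."""
    return fold_coverage(num_tags, tag_length_bp) < 1.0


coverage = fold_coverage(num_tags=2_000_000, tag_length_bp=36)  # hypothetical run
```

Expressed as a percentage, the value is simply the fold coverage multiplied by 100, matching the statement that the term may be given as a percentage or as an absolute number of basepairs.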
[0049] The term “sequenced tag” as used herein refers to a
string of nucleotides sequenced from any part or all of a
nucleic acid molecule. For example, a sequenced tag may be
a short string of nucleotides sequenced from a nucleic acid
fragment, a short string of nucleotides at both ends of a
nucleic acid fragment, or the sequencing of the entire nucleic
acid fragment that exists in the biological sample. A nucleic
acid fragment is any part of a larger nucleic acid molecule. A
fragment (e.g. a gene) may exist separately (i.e. not connected)
to the other parts of the larger nucleic acid molecule.
`
`DETAILED DESCRIPTION
`
[0050] Embodiments of this invention provide methods,
systems, and apparatus for determining whether an increase
or decrease (diseased state) of a clinically-relevant chromosomal
region exists compared to a non-diseased state. This
determination may be done by using a parameter of an
amount of a clinically-relevant chromosomal region in relation
to other non-clinically-relevant chromosomal regions
(background regions) within a biological sample. Nucleic
acid molecules of the biological sample are sequenced, such
that a fraction of the genome is sequenced, and the amount
may be determined from results of the sequencing. One or
more cutoff values are chosen for determining whether a
change compared to a reference quantity exists (i.e. an imbalance),
for example, with regard to the ratio of amounts of two
chromosomal regions (or sets of regions).
[0051] The change detected in the reference quantity may
be any deviation (upwards or downwards) in the relation of
the clinically-relevant nucleic acid sequence to the other
non-clinically-relevant sequences. Thus, the reference state may
be any ratio or other quantity (e.g. other than a 1-1 correspondence),
and a measured state signifying a change may be any
ratio or other quantity that differs from the reference quantity
as determined by the one or more cutoff values.
`[0052] The clinically relevant chromosomal region (also
`called a clinically relevant nucleic acid sequence) and the
`background nucleic acid sequence may come from a first type
`of cells and from one or more second types of cells. For
`example, fetal nucleic acid sequences originating from fetal/
`placental c