`
`PROVISIONAL
`
`PATENT APPLICATION
`
`DETERMINING A NUCLEIC ACID SEQUENCE IMBALANCE
`
`Inventors:
`
`Yuk-Ming Dennis LO, a citizen of Great Britain, residing at:
`4th Floor, 7 King Tak Street, Homantin, Kowloon, Hong Kong
`
`Rossa Wai Kwun CHIU, a citizen of Australia, residing at:
`Flat 1A, Block 1, Constellation Cove, 1 Hung Lam Drive, Tai Po,
`New Territories, Hong Kong SAR
`
`Kwan Chee CHAN, a citizen of Hong Kong SAR, residing at:
`Flat A, 13/F, Block 34, Broadway Street, Mei Foo Sun Chuen, Kowloon,
`Hong Kong SAR
`
`Benny Chung Ying ZEE, a citizen of Canada, residing at:
`Flat 18E, Tower 2, La Costa, Ma On Shan, New Territories, Hong Kong SAR
`
`Ka Chun CHONG, a citizen of Hong Kong SAR, residing at:
`Flat 06, 29/F, Shin King House, Fu Shin Estate, Tai Po, New Territories, Hong
`Kong SAR
`
`Assignee:
`
`The Chinese University of Hong Kong
`Technology and Licensing Office, Room 226 Pi Chiu Building
`Shatin, N.T., Hong Kong SAR
`
`Entity:
`
`Small
`
`TOWNSEND and TOWNSEND and CREW LLP
`
`Two Embarcadero Center, Eighth Floor
`San Francisco, California 94111-3834
`Tel: 415-576-0200
`SEQUENOM EXHIBIT 1003
`
`Page 1 of 93
`
`
`
`Attorney Docket No.: 016285-005200US
`
`PATENT
`
`DETERMINING A NUCLEIC ACID SEQUENCE IMBALANCE
`
`FIELD OF THE INVENTION
`
`[0001]
`
`This invention generally relates to the diagnostic testing of genotypes and diseases
`
`by determining an imbalance between two different nucleic acid sequences, and more
`
`particularly to the identification of Down syndrome, other chromosomal aneuploidies,
`
`mutations and genotypes in a fetus via testing a sample of maternal blood. The invention also
`
`relates to the detection of cancer, the monitoring of transplantation, and the monitoring of
`
`infectious diseases.
`
`BACKGROUND OF THE INVENTION
`
`[0002] Genetic diseases, cancers, and other conditions often result from or produce an
`
`imbalance in two corresponding chromosomes or alleles or other nucleic acid sequences.
`
`That is, an amount of one sequence relative to another sequence is larger or smaller than
`
`normal. Usually, the normal ratio is an even 50/50 ratio. Down Syndrome (trisomy 21) is
`
`such a disease having an imbalance of an extra chromosome 21.
`
`[0003] Conventional prenatal diagnostic methods of trisomy 21 involve the sampling of
`
`fetal materials by invasive procedures such as amniocentesis or chorionic villus sampling,
`
`which pose a finite risk of fetal loss. Non-invasive procedures, such as screening by
`
`ultrasonography and biochemical markers, have been used to risk-stratify pregnant women
`
`prior to definitive invasive diagnostic procedures. However, these screening methods
`
`typically measure epiphenomena that are associated with trisomy 21 instead of the core
`
`chromosomal abnormality, and thus have suboptimal diagnostic accuracy and other
`
`disadvantages, such as being highly influenced by gestational age.
`
`[0004]
`
`The discovery of circulating cell-free fetal DNA in maternal plasma in 1997 offered
`
`new possibilities for noninvasive prenatal diagnosis (Lo, YMD and Chiu, RWK 2007 Nat
`
`Rev Genet 8, 71-77). While this method has been readily applied to the prenatal diagnosis of
`
`sex-linked (Costa, JM et al. 2002 N Engl J Med 346, 1502) and certain single gene disorders
`
`(Lo, YMD et al. 1998 N Engl J Med 339, 1734-1738), its application to the prenatal detection
`
`of fetal chromosomal aneuploidies has represented a considerable challenge (Lo, YMD and
`
`Chiu, RWK 2007, supra). First, fetal nucleic acids co-exist in maternal plasma with a high
`
`background of nucleic acids of maternal origin that can often interfere with the analysis (Lo,
`
`YMD et al. 1998 Am J Hum Genet 62, 768-775). Second, fetal nucleic acids circulate in
`
`maternal plasma predominantly in a cell-free form, making it difficult to derive dosage
`
`information of genes or chromosomes within the fetal genome.
`
`[0005]
`
`Significant developments overcoming these challenges have recently been made
`
`(Benachi, A & Costa, JM 2007 Lancet 369, 440-442). One approach detects fetal-specific
`
`nucleic acids in the maternal plasma, thus overcoming the problem of maternal background
`
`interference (Lo, YMD and Chiu, RWK 2007, supra). Dosage of chromosome 21 was
`
`inferred from the ratios of polymorphic alleles in the placenta-derived DNA/RNA molecules.
`
`However, this method is less accurate when samples contain lower amounts of the targeted
`
`gene and can only be applied to fetuses who are heterozygous for the targeted
`
`polymorphisms, which is about 50% of the population.
`
`[0006] Dhallan et al. (Dhallan, R, et al. 2007 Lancet 369,
`
`474-481) described an alternative strategy of enriching the proportion of circulating fetal
`
`DNA by adding formaldehyde to maternal plasma. The proportion of chromosome 21
`
`sequences contributed by the fetus in maternal plasma was determined by assessing the ratio
`
`of paternally-inherited fetal-specific alleles to non-fetal-specific alleles for single nucleotide
`
`polymorphisms (SNPs) on chromosome 21. SNP ratios were similarly computed for a
`
`reference chromosome. An imbalance of fetal chromosome 21 was then inferred by detecting
`
`a statistically significant difference between the SNP ratios for chromosome 21 and those of
`
`the reference chromosome, where significant is defined as a p-value ≤ 0.05. To ensure
`
`high population coverage, more than 500 SNPs were targeted per chromosome. However,
`
`there have been controversies regarding the effectiveness of formaldehyde to enrich to a high
`
`proportion (Chung, GTY, et al. 2005 Clin Chem 51, 655-658), and thus the reproducibility of
`
`the method needs to be further evaluated. Also, as each fetus and mother would be
`
`informative for a different number of SNPs for each chromosome, the power of the statistical
`
`test for SNP ratio comparison would be variable from case to case (Lo, YMD & Chiu, RWK.
`
`2007 Lancet 369, 1997). Furthermore, since these approaches depend on the detection of
`
`genetic polymorphisms, they are limited to fetuses heterozygous for these polymorphisms.
`
`[0007] Using polymerase chain reaction (PCR) and DNA quantification of a chromosome
`
`21 locus and a reference locus in amniocyte cultures obtained from trisomy 21 and euploid
`
`fetuses, Zimmermann et al (2002 Clin Chem 48, 362-363) were able to distinguish the two
`
`groups of fetuses based on the 1.5-fold increase in chromosome 21 DNA sequences in the
`
`former. Since a 2-fold difference in DNA template concentration constitutes a difference of
`
`only one threshold cycle (Ct), the discrimination of a 1.5-fold difference has been the limit of
`
`conventional real-time PCR. To achieve finer degrees of quantitative discrimination,
`
`alternative strategies are needed. Accordingly, some embodiments of the present invention
`
`use digital PCR (Vogelstein, B et al. 1999 Proc Natl Acad Sci U S A 96, 9236-9241) for this
`
`purpose.
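The relationship between fold change and threshold cycle stated above can be checked with a short calculation. The sketch below is illustrative only: the function name `delta_ct` and the `efficiency` parameter are hypothetical, and perfect doubling per cycle (efficiency of 1.0) is an idealizing assumption rather than something the text specifies.

```python
import math


def delta_ct(fold_change: float, efficiency: float = 1.0) -> float:
    """Shift in threshold cycle (Ct) produced by a fold change in template.

    Assumes the template is multiplied by (1 + efficiency) each cycle;
    efficiency = 1.0 is ideal doubling (an assumption for illustration).
    """
    if fold_change <= 0 or efficiency <= 0:
        raise ValueError("fold_change and efficiency must be positive")
    return math.log(fold_change) / math.log(1.0 + efficiency)


if __name__ == "__main__":
    # A 2-fold difference corresponds to one full cycle.
    print(delta_ct(2.0))
    # The 1.5-fold difference expected for trisomy 21 is well under one cycle.
    print(delta_ct(1.5))
```

Under ideal doubling, a 1.5-fold difference shifts Ct by only about 0.58 cycle, which is consistent with the passage's point that such a difference sits at the limit of conventional real-time PCR.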
`
`[0008] Digital PCR has been developed for the detection of allelic ratio skewing in nucleic
`
`acid samples (Chang, HW et al. 2002 J Natl Cancer Inst 94, 1697-1703). Clinically, it has
`
`been shown to be useful for the detection of loss of heterozygosity (LOH) in tumor DNA
`
`samples (Zhou, W. et al. 2002 Lancet 359, 219-225). For the analysis of digital PCR results,
`
`sequential probability ratio testing (SPRT) has been adopted by previous studies to classify
`
`the experimental results as being suggestive of the presence of LOH in a sample or not (El
`
`Karoui et al. 2006 Stat Med 25, 3124-3133). In methods used in the previous studies, the
`
`cutoff value to determine LOH used a fixed reference ratio of the two alleles in the DNA of
`
`2/3. As the amount, proportion and concentration of fetal nucleic acids in maternal plasma
`
`are variable, these methods are not suitable for detecting trisomy 21 using fetal nucleic acids
`
`in a background of maternal nucleic acids in maternal plasma.
`
`[0009]
`
`It is desirable to have a noninvasive test for fetal trisomy 21 detection based on
`
`circulating fetal nucleic acid analysis, especially one that is independent of the use of genetic
`
`polymorphisms and/or of fetal-specific markers. It is also desirable to have accurate
`
`determination of cutoff values, which can reduce the number of wells of data and/or the
`
`amount of maternal plasma nucleic acid molecules necessary for accuracy, thus providing
`
`increased efficiency and cost-effectiveness.
`
`BRIEF SUMMARY OF THE INVENTION
`
`[0010]
`
`This invention provides methods, systems, and apparatus for determining whether a
`
`nucleic acid sequence imbalance (e.g., allelic imbalance) exists within a biological sample.
`
`One or more cutoff values for determining an imbalance of, for example, the ratio of the two
`
`sequences (or sets of sequences) are chosen. In one embodiment, the cutoff value is
`
`determined based at least in part on the percentage of fetal (clinically relevant nucleic acid)
`
`sequences in a biological sample, such as maternal plasma or serum or urine, which contains
`
`a background of maternal nucleic acid sequences. In another embodiment, the cutoff value is
`
`determined based on an average concentration of a sequence in a plurality of reactions. In
`
`one aspect, the cutoff value is determined from a proportion of informative wells that are
`
`estimated to contain a particular nucleic acid sequence, where the proportion is determined
`
`based on the above-mentioned percentage and/or average concentration. The cutoff value
`
`may be determined using many different types of methods, such as SPRT, false discovery,
`
`confidence interval, or receiver operating characteristic (ROC). This strategy further minimizes
`
`the amount of testing required before confident classification can be made. This is of
`
`particular relevance to plasma nucleic acid analysis where the template amount is often
`
`limiting.
`
`[0011] According to one exemplary embodiment, a method is provided for determining
`
`whether a nucleic acid sequence imbalance exists within a biological sample, the method
`
`comprising: receiving data from a plurality of reactions, wherein the data includes: (1) a first
`
`set of quantitative data indicating a first amount of a clinically relevant nucleic acid sequence;
`
`and (2) a second set of quantitative data indicating a second amount of a background nucleic
`
`acid sequence different from the clinically relevant nucleic acid sequence; determining a
`
`parameter from the two data sets; deriving a first cutoff value from an average concentration
`
`of a reference nucleic acid sequence in each of the plurality of reactions, wherein the
`
`reference nucleic acid sequence is either the clinically relevant nucleic acid sequence or the
`
`background nucleic acid sequence; comparing the parameter to the first cutoff value; and
`
`based on the comparison, determining a classification of whether a nucleic acid sequence
`
`imbalance exists.
`
`[0012] According to another exemplary embodiment, a method is provided for determining
`
`whether a nucleic acid sequence imbalance exists within a biological sample, the method
`
`comprising: receiving data from a plurality of reactions, wherein the data includes: (1) a first
`
`set of quantitative data indicating a first amount of a clinically relevant nucleic acid sequence;
`
`and (2) a second set of quantitative data indicating a second amount of a background nucleic
`
`acid sequence different from the clinically relevant nucleic acid sequence, wherein the
`
`clinically relevant nucleic acid sequence and the background nucleic acid sequence come
`
`from a first type of cells and from one or more second types of cells; determining a parameter
`
`from the two data sets; deriving a first cutoff value from a first percentage resulting from a
`
`measurement of an amount of a nucleic acid sequence from the first type of cells in the
`
`biological sample; comparing the parameter to the cutoff value; and based on the comparison,
`
`determining a classification of whether a nucleic acid sequence imbalance exists.
`
`[0013] A better understanding of the nature and advantages of the present invention may be
`
`gained with reference to the following detailed description and the accompanying drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`[0014]
`
`FIG. 1 is a flowchart illustrating a digital PCR experiment.
`
`[0015]
`
`FIG. 2 illustrates a digital RNA-SNP and RCD method according to an embodiment
`
`of the present invention.
`
`[0016]
`
`FIG. 3 illustrates a graph having SPRT curves used to determine Down syndrome
`
`according to an embodiment of the present invention.
`
`[0017]
`
`FIG. 4 shows a method of determining a disease state using a percentage of fetal
`
`cells according to an embodiment of the present invention.
`
`[0018]
`
`FIG. 5 shows a method of determining a disease state using an average
`
`concentration according to an embodiment of the present invention.
`
`[0019]
`
`FIG. 6 shows a table that tabulates the expected digital RNA-SNP allelic ratio and
`
`Pr of trisomy 21 samples for a range of template concentrations expressed as the average
`
`reference template concentration per well (mr) according to an embodiment of the present
`
`invention.
`
`[0020]
`
`FIG. 7 shows a table that tabulates the expected Pr for the fractional fetal DNA
`
`concentrations of 10%, 25%, 50% and 100% in trisomy 21 samples at a range of template
`
`concentrations expressed as the average reference template concentration per well (mr)
`
`according to an embodiment of the present invention.
`
`[0021]
`
`FIG. 8 shows a plot illustrating the degree of differences in the SPRT curves for mr
`
`values of 0.1, 0.5 and 1.0 for digital RNA-SNP analysis according to an embodiment of the
`
`present invention.
`
`[0022]
`
`FIG. 9A shows a table of a comparison of the effectiveness of the new and old
`
`SPRT algorithms for classifying euploid and trisomy 21 cases in 96-well digital RNA-SNP
`
`analyses according to an embodiment of the present invention.
`
`[0023]
`
`FIG. 9B shows a table of a comparison of the effectiveness of the new and old
`
`SPRT algorithms for classifying euploid and trisomy 21 cases in 384-well digital RNA-SNP
`
`analyses according to an embodiment of the present invention.
`
`[0024]
`
`FIG. 10 is a table showing the percentages of fetuses correctly and incorrectly
`
`classified as euploid or aneuploid and those not classifiable for the given informative counts
`
`according to an embodiment of the present invention.
`
`[0025]
`
`FIG. 11 is a table 1100 showing computer simulations for digital RCD analysis for a
`
`pure (100%) fetal DNA sample according to an embodiment of the present invention.
`
`[0026]
`
`FIG. 12 is a table 1200 showing results of computer simulation of accuracies of
`
`digital RCD analysis at mr=0.5 for the classification of samples from euploid or trisomy 21
`
`fetuses with different fractional concentrations of fetal DNA according to an embodiment of
`
`the present invention.
`
`[0027]
`
`FIG. 13A shows a table 1300 of digital RNA-SNP analysis in placental tissues of
`
`euploid and trisomy 21 pregnancies according to an embodiment of the present invention.
`
`[0028]
`
`FIG. 13B shows a table 1350 of digital RNA-SNP analysis of maternal plasma from
`
`euploid and trisomy 21 pregnancies according to an embodiment of the present invention.
`
`[0029]
`
`FIGS. 14A-14C show plots illustrating a cutoff curve resulting from an RCD analysis
`
`according to an embodiment of the present invention.
`
`DEFINITIONS
`
`[0030]
`
`The term "biological sample" as used herein refers to any sample that is taken from
`
`a subject (e.g., a human, such as a pregnant woman) and contains one or more nucleic acids
`
`of interest.
`
`[0031]
`
`The term "nucleic acid" or "polynucleotide" refers to a deoxyribonucleic acid
`
`(DNA) or ribonucleic acid (RNA) and a polymer thereof in either single- or double-stranded
`
`form. Unless specifically limited, the term encompasses nucleic acids containing known
`
`analogs of natural nucleotides that have similar binding properties as the reference nucleic
`
`acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless
`
`otherwise indicated, a particular nucleic acid sequence also implicitly encompasses
`
`conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles,
`
`orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated.
`
`Specifically, degenerate codon substitutions may be achieved by generating sequences in
`
`which the third position of one or more selected (or all) codons is substituted with mixed-
`
`base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka
`
`et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98
`
`(1994)). The term nucleic acid is used interchangeably with gene, cDNA, mRNA, small
`
`noncoding RNA, micro RNA (miRNA), Piwi-interacting RNA, and short hairpin RNA
`
`(shRNA) encoded by a gene or locus.
`
`[0032]
`
`The term “gene” means the segment of DNA involved in producing a polypeptide
`
`chain. It may include regions preceding and following the coding region (leader and trailer)
`
`as well as intervening sequences (introns) between individual coding segments (exons).
`
`[0033]
`
`The term "reaction" as used herein refers to any process involving a chemical,
`
`enzymatic, or physical action that is indicative of the presence or absence of a particular
`
`polynucleotide sequence of interest. A preferred example of a "reaction" is an amplification
`
`reaction such as a polymerase chain reaction (PCR). An "informative reaction" is one that
`
`indicates the presence of one or more particular polynucleotide sequences of interest, and in
`
`one case is a reaction in which only one sequence of interest is present. The term "well" as used herein
`
`refers to a reaction at a predetermined location within a confined structure, e.g., a well-shaped
`
`vial, cell, or chamber in a PCR array.
`
`[0034]
`
`The term "clinically relevant nucleic acid sequence" as used herein can refer to a
`
`polynucleotide sequence corresponding to a segment of a larger genomic sequence whose
`
`potential imbalance is being tested or to the larger genomic sequence itself. One example is
`
`the sequence of chromosome 21. Other examples include chromosome 18, 13, X and Y. Yet
`
`other examples include mutated genetic sequences or genetic polymorphisms or copy number
`
`variations that a fetus may inherit from one or both of its parents. Yet other examples include
`
`sequences which are mutated, deleted, or amplified in a malignant tumor, e.g. sequences in
`
`which loss of heterozygosity or gene duplication occurs. In some embodiments, multiple
`
`clinically relevant nucleic acid sequences, or equivalently multiple markers of the clinically
`
`relevant nucleic acid sequence, can be used to provide data for detecting the imbalance. For
`
`instance, data from five non-consecutive sequences on chromosome 21 can be used in an
`
`additive fashion for the determination of possible chromosome 21 imbalance, effectively
`
`reducing the need of sample volume to 1/5.
`
`[0035]
`
`The term "background nucleic acid sequence" as used herein refers to a nucleic acid
`
`sequence whose normal ratio to the clinically relevant nucleic acid sequence is known, for
`
`instance a 1-to-1 ratio. As one example, the background nucleic acid sequence and the
`
`clinically relevant nucleic acid sequence are two alleles from the same chromosome that are
`
`distinct due to heterozygosity. In another example, the background nucleic acid sequence is
`
`one allele that is heterozygous to another allele that is the clinically relevant nucleic acid
`
`sequence. Moreover, some of each of the background nucleic acid sequence and the
`
`clinically relevant nucleic acid sequence may come from different individuals.
`
`[0036]
`
`The term "reference nucleic acid sequence" as used herein refers to a nucleic acid
`
`sequence whose average concentration per reaction is known or equivalently has been
`
`measured.
`
`[0037]
`
`The term "overrepresented nucleic acid sequence" as used herein refers to the
`
`nucleic acid sequence among two sequences of interest (e.g., a clinically relevant sequence
`
`and a background sequence) that is in more abundance than the other sequence in a biological
`
`sample.
`
`[0038]
`
`The term "based on" as used herein means "based at least in part on" and refers to
`
`one value (or result) being used in the determination of another value, such as occurs in the
`
`relationship of an input of a method and the output of that method. The term "derive" as used
`
`herein also refers to the relationship of an input of a method and the output of that method,
`
`such as occurs when the derivation is the calculation of a formula.
`
`[0039]
`
`The term "quantitative data" as used herein means data that are obtained from one
`
`or more reactions and that provide one or more numerical values. For example, the number
`
`of wells that show a fluorescent marker for a particular sequence would be quantitative data.
`
`[0040]
`
`The term "parameter" as used herein means a numerical value that characterizes a
`
`quantitative data set and/or a numerical relationship between quantitative data sets. For
`
`example, a ratio (or function of a ratio) between a first amount of a first nucleic acid sequence
`
`and a second amount of a second nucleic acid sequence is a parameter.
`
`[0041]
`
`The term "cutoff value" as used herein means a numerical value that is used
`
`to arbitrate between two or more states (e.g. diseased and non-diseased) of classification for a
`
`biological sample. For example, if a parameter is greater than the cutoff value, a first
`
`classification of the quantitative data is made (e.g. diseased state); or if the parameter is less
`
`than the cutoff value, a different classification of the quantitative data is made (e.g.
`
`non-diseased state).
`
`[0042] The term "imbalance" as used herein means any significant deviation as defined by
`
`at least one cutoff value in a quantity of the clinically relevant nucleic acid sequence from a
`
`reference quantity. For example, the reference quantity could be a ratio of 3/5, and thus an
`
`imbalance would occur if the measured ratio is 1:1.
`
`DETAILED DESCRIPTION OF THE INVENTION
`
`[0043]
`
`This invention provides methods, systems, and apparatus for determining whether
`
`an increase or decrease compared to a reference (e.g. non-diseased) quantity of a
`
`clinically-relevant nucleic acid sequence in relation to other non-clinically-relevant sequences
`
`(e.g., a chromosomal or allelic imbalance) exists within a biological sample. One or more
`
`cutoff values are chosen for determining whether a change compared to the reference
`
`quantity exists (i.e. an imbalance), for example, with regards to the ratio of amounts of two
`
`sequences (or sets of sequences). The change detected in the reference quantity may be any
`
`deviation (upwards or downwards) in the relation of the clinically-relevant nucleic acid
`
`sequence to the other non-clinically-relevant sequences. Thus, the reference state may be any
`
`ratio or other quantity (e.g. other than a 1-1 correspondence), and a measured state signifying
`
`a change may be any ratio or other quantity that differs from the reference quantity as
`
`determined by the one or more cutoff values.
`
`[0044]
`
`In one embodiment, the cutoff value is determined based at least in part on a
`
`percentage of fetal nucleic acid sequences in a biological sample, such as maternal plasma,
`
`which contains a background of maternal nucleic acid sequences. Note the percentage of
`
`fetal sequences in a sample may not be the same as the percentage of the clinically-relevant
`
`nucleic acid sequences in the sample as different loci may be used to determine the
`
`percentage. In another embodiment, the cutoff value is determined based at least in part on the
`
`percentage of tumor sequences in a biological sample, such as plasma, serum, saliva or urine,
`
`which contains a background of nucleic acid sequences derived from the non-malignant cells
`
`within the body.
`
`[0045]
`
`In yet another embodiment, the cutoff value is determined based on an average
`
`concentration of a sequence in a plurality of reactions. In one aspect, the cutoff value is
`
`determined from a proportion of informative wells that are estimated to contain a particular
`
`nucleic acid sequence, where the proportion is determined based on the above-mentioned
`
`percentage and/or average concentration. The cutoff value may be determined using many
`
`different types of methods, such as SPRT, false discovery, confidence interval, or receiver
`
`operating characteristic (ROC). This strategy further minimizes the amount of testing
`
`required before confident classification can be made. This is of particular relevance to
`
`plasma nucleic acid analysis where the template amount is often limiting. Although
`
`presented with respect to digital PCR, other methods may be used.
`
`[0046] Digital PCR involves multiple PCR analyses on extremely dilute nucleic acids such
`
`that most positive amplifications reflect the signal from a single template molecule. Digital
`
`PCR thereby permits the counting of individual template molecules. The proportion of
`
`positive amplifications among the total number of PCRs analyzed allows an estimation of the
`
`template concentration in the original or non-diluted sample. This technique has been
`
`proposed to allow the detection of a variety of genetic phenomena (Vogelstein, B et al. 1999,
`
`supra) and has previously been used for the detection of loss of heterozygosity in tumor
`
`samples (Zhou, W. et al. 2002, supra) and in the plasma of cancer patients (Chang, HW et al.
`
`2002, supra). Since template molecule quantification by digital PCR does not rely on
`
`dose-response relationships between reporter dyes and nucleic acid concentrations, its analytical
`
`precision should theoretically be superior to that of real-time PCR. Hence, digital PCR could
`
`potentially allow the discrimination of finer degrees of quantitative differences between
`
`target and reference loci. A question is whether this approach is precise enough to detect
`
`fetal chromosomal aneuploidies in maternal plasma.
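The passage notes that the proportion of positive wells allows the template concentration to be estimated. One common way to do this, sketched below, assumes that template molecules are distributed among wells according to a Poisson distribution; that assumption and the function names are introduced here for illustration and are not stated in this passage.

```python
import math


def mean_templates_per_well(positive_wells: int, total_wells: int) -> float:
    """Estimate the average number of template molecules per well.

    Assumes a Poisson distribution of molecules across wells, so the
    fraction of empty wells is exp(-m) and m = -ln(1 - positive fraction).
    """
    if total_wells <= 0 or not 0 <= positive_wells < total_wells:
        raise ValueError("need 0 <= positive_wells < total_wells")
    positive_fraction = positive_wells / total_wells
    return -math.log(1.0 - positive_fraction)


def expected_positive_fraction(mean_per_well: float) -> float:
    """Inverse relation: fraction of wells expected to hold >= 1 molecule."""
    if mean_per_well < 0:
        raise ValueError("mean_per_well must be non-negative")
    return 1.0 - math.exp(-mean_per_well)


if __name__ == "__main__":
    # At an average of 0.5 molecule per well, roughly 39% of wells are positive.
    print(expected_positive_fraction(0.5))
    # Going the other way: 38 positive wells on a 96-well plate.
    print(mean_templates_per_well(38, 96))
```

The correction matters because a positive well may contain more than one molecule, so the raw positive fraction understates the concentration as the sample becomes less dilute.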
`
`[0047]
`
`To test this, we first assessed if digital PCR could determine the allelic ratio of
`
`PLAC4 mRNA (Lo, YMD, et al. 2007 Nat Med 13, 218-223), a placental transcript from
`
`chromosome 21, in maternal plasma and thereby distinguish trisomy 21 and euploid fetuses.
`
`This approach is referred to as the digital RNA-SNP method. We then evaluated whether the
`
`increased precision of digital PCR would allow the detection of fetal chromosomal
`
`aneuploidies without depending on genetic polymorphisms. We call this digital relative
`
`chromosome dosage (RCD) analysis. The former approach is polymorphism-dependent but
`
`requires less precision in quantitative discrimination while the latter approach is
`
`polymorphism-independent but requires a higher precision for quantitative discrimination.
`
`I. DIGITAL RNA-SNP
`
`A. Overview
`
`[0048] Digital PCR is capable of detecting the presence of allelic ratio skewing of two
`
`alleles in a DNA sample. For example, it has been used to detect loss of heterozygosity
`
`(LOH) in a tumor DNA sample. Assume that there are two alleles in the DNA sample,
`
`namely A and G, and that the A allele is lost in the cells with LOH. When LOH is present
`
`in 50% of cells in the tumor sample, the allelic ratio of G:A in the DNA sample would be 2:1.
`
`However, if LOH is not present in the tumor sample, the allelic ratio of G:A would be 1:1.
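The arithmetic behind these ratios can be written out directly. The sketch below assumes that every cell contributes one copy of the G allele and that only cells without LOH contribute a copy of the A allele; the function name `g_to_a_ratio` is hypothetical.

```python
from fractions import Fraction


def g_to_a_ratio(loh_fraction) -> Fraction:
    """Expected G:A allelic ratio when a fraction of cells has lost allele A.

    Per cell on average: G copies = 1, A copies = 1 - loh_fraction,
    so G:A = 1 / (1 - loh_fraction). Exact rational arithmetic is used.
    """
    f = Fraction(loh_fraction)
    if not 0 <= f < 1:
        raise ValueError("loh_fraction must be in [0, 1)")
    return Fraction(1) / (1 - f)


if __name__ == "__main__":
    print(g_to_a_ratio(Fraction(1, 2)))  # LOH in half of the cells
    print(g_to_a_ratio(0))               # no LOH
```

With LOH in 50% of cells the ratio is 2, i.e. 2:1, and with no LOH it is 1, i.e. 1:1, matching the two cases in the text.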
`
`[0049]
`
`FIG. 1 is a flowchart 100 illustrating a digital PCR experiment. In step 110, the
`
`DNA sample is diluted and then distributed to separate wells. Note that the inventors have
`
`determined that some plasma nucleic acid species are already quite diluted in the original
`
`sample. Accordingly, there is no need for dilution for some templates, if they are already
`
`present at the necessary concentrations. In the previous studies (e.g. Zhou et al. 2002, supra),
`
`a DNA sample is diluted to an extent such that the average concentration of a specific
`
`"template DNA" is approximately 0.5 molecule of one of the two templates per well. Note
`
`that the term "template DNA" appears to refer to either the A or the G allele, and that there
`
`is no rationale provided for this specific concentration.
`
`[0050]
`
`In step 120, in each well, a PCR process is carried out to detect the A and/or the G
`
`allele simultaneously. In step 130, the markers in each well are identified (e.g. via
`
`fluorescence), e.g. A, G, A and G, or neither. In the absence of LOH, the abundance of the A
`
`and the G alleles in the DNA sample would be the same (one copy each per cell). Therefore,
`
`the probabilities of a well being positive for the A allele and for the G allele would be the
`
`same. This would be reflected by the similar numbers of wells being positive for the A or the
`
`G alleles. However, when LOH is present in 50% or greater of cells in a tumor sample, the
`
`allelic ratio of the G and the A alleles would be at least 2:1. Previous methods simply
`
`assumed that the sample was at least 50% cancerous. Thus, the probability of a well being
`
`positive for the G allele would be higher than that for the A allele. As a result, the number of
`
`wells being positive for the G allele would be higher than that for the A allele.
`
`[0051]
`
`In step 140, to classify the digital PCR results, the number of wells being positive
`
`for each allele but not the other would be counted. In the above example, the number of
`
`wells being positive for the A allele but negative for the G allele, and the number of wells
`
`positive for the G allele but negative for the A allele are counted. In one embodiment, the
`
`allele showing fewer positive wells is regarded as the reference allele.
`
`[0052]
`
`In step 150, the total number of informative wells is determined as the sum of the
`
`numbers of positive wells for the two alleles. In step 160, the proportion (Pr) of informative
`
`wells contributed by the allele with more positive wells is calculated.
`
`Pr = (No. of wells positive only for the allele with more positive wells) / (Total no. of wells positive for only one allele, A or G)
`
`Other embodiments could use all wells with one of the alleles divided by all wells with at
`
`least one allele.
`
`[0053]
`
`In step 170, it is determined whether the value of Pr shows an allelic imbalance. As
`
`accuracy and efficiency are desired, this task is not straightforward. One method for
`
`determining an imbalance uses a Bayesian-type likelihood method, sequential probability
`
`ratio testing (SPRT). SPRT is a method which allows two probabilistic hypotheses to be
`
`compared as data accumulate. In other words, it is a statistical method to classify the results
`
`of digital PCR as being suggestive of the presence or absence of allelic skewing. It has the
`
`advantage of minimizing the number of wells to be analyzed to achieve a given statistical
`
`power and accuracy.
`
`[0054]
`
`In an exemplary SPRT analysis, the experimental results would be tested against the
`
`null and alternative hypotheses. The alternative hypothesis is accepted when there is allelic
`
`ratio skewing in the sample. The null hypothesis is accepted when there is no allelic ratio
`
`skewing in the sample. The value Pr would be compared with two cutoff values to accept the
`
`null or alternative hypotheses. If neither hypothesis is accepted, the sample would be marked
`
`as unclassified which means that the observed digital PCR result is not sufficient to classify
`
`the sample with the desired statistical confidence.
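A three-way decision of this kind can be illustrated with a textbook Wald sequential probability ratio test on a binomial count. This is a generic sketch, not necessarily the exact procedure of the cited studies: the hypothesized proportions of 1/2 and 2/3, the likelihood-ratio threshold of 8, and all names are assumptions chosen for illustration.

```python
import math


def sprt_classify(over_count: int, informative: int,
                  theta0: float = 0.5, theta1: float = 2 / 3,
                  threshold: float = 8.0) -> str:
    """Classify a digital PCR result with a Wald-style SPRT.

    over_count: informative wells positive for the more abundant allele.
    informative: total informative wells.
    theta0 / theta1: expected Pr under the null / alternative hypothesis
        (assumed values for illustration).
    threshold: likelihood ratio needed to accept a hypothesis (assumed).
    """
    if not 0 <= over_count <= informative:
        raise ValueError("need 0 <= over_count <= informative")
    under_count = informative - over_count
    # Log-likelihood ratio of the alternative versus the null hypothesis.
    llr = (over_count * math.log(theta1 / theta0)
           + under_count * math.log((1 - theta1) / (1 - theta0)))
    if llr >= math.log(threshold):
        return "imbalance"        # accept the alternative hypothesis
    if llr <= -math.log(threshold):
        return "no imbalance"     # accept the null hypothesis
    return "unclassified"         # more data needed


if __name__ == "__main__":
    for count in (40, 35, 30):
        print(count, sprt_classify(count, 60))
```

The "unclassified" outcome corresponds to the case described above in which neither hypothesis can be accepted; analyzing more wells changes the counts until one of the two boundaries is crossed.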
`
`[0055]
`
`The cutoff values for accepting the null or alternative hypotheses have typically
`
`been calculated based on a fixed value of Pr under the assumptions stated in the hypotheses.
`
`In the null hypothesis, the sample is assumed to exhibit no allelic ratio skewing. Therefore,
`
`the probabilities of each well being positive for the A and the G alleles would be the same
`
`and, hence, the expected value of Pr would be 1/2. In the alternative hypothesis, the expected
`
`value of Pr has been taken to be 2/3 or about halfway between 0.5 and 2/3, e.g. 0.585. Also,
`
`due to a limited number of experiments, one can choose an upper bound (0.585 + 3/N) and a
`
`lower bound taken as (0.585 - 3/N).
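The fixed bounds described here can be computed as follows. The passage does not define N; this sketch treats it as the number of informative wells, which is an assumption. The reading that a Pr above the upper bound indicates skewing and a Pr below the lower bound indicates none is also an assumed interpretation, and all names are hypothetical.

```python
def fixed_bounds(n: int, center: float = 0.585, width: float = 3.0):
    """Return (lower, upper) bounds of the form center -/+ width / n.

    n is assumed here to be the number of informative wells.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    margin = width / n
    return center - margin, center + margin


def classify_with_fixed_bounds(pr: float, n: int) -> str:
    """Compare Pr with the fixed bounds (assumed interpretation)."""
    lower, upper = fixed_bounds(n)
    if pr > upper:
        return "imbalance"
    if pr < lower:
        return "no imbalance"
    return "unclassified"


if __name__ == "__main__":
    print(fixed_bounds(60))
    print(classify_with_fixed_bounds(40 / 60, 60))
```

Note that the two bounds close in on 0.585 as N grows, so a larger experiment leaves a narrower unclassified zone.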
`
`B. Detection of Down Syndrome
`
`[0056]
`
`In one embodiment of the present invention, digital SNP is used to detect fetal
`
`Down syndrome from a pregnant woman’s pl