(19) World Intellectual Property Organization
International Bureau

(10) International Publication Number: WO 2013/052907 A2

(43) International Publication Date: 11 April 2013 (11.04.2013)

(51) International Patent Classification: G06F 19/00 (2011.01)

(21) International Application Number: PCT/US2012/059114

(22) International Filing Date: 5 October 2012 (05.10.2012)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
61/544,251    6 October 2011 (06.10.2011)    US
61/545,053    7 October 2011 (07.10.2011)    US
61/663,477    22 June 2012 (22.06.2012)    US
61/709,899    4 October 2012 (04.10.2012)    US

(71) Applicant: SEQUENOM, INC. [US/US]; 3595 John Hopkins Court, San Diego, CA 92121 (US).

(72) Inventors: VAN DEN BOOM, Dirk, Johannes; 638 Bonair Way, Unit A, La Jolla, CA 92037 (US). CANTOR, Charles, R.; 526 Stratford Court, Unit E, Del Mar, CA 92014 (US). KIM, Sung, Kyun; 662 Glenmore Boulevard, Glendale, CA 91206 (US). DZAKULA, Zeljko; 12830 Sundance Avenue, San Diego, CA 92129 (US). DECIU, Cosmin; 10545 Sea Mist Way, San Diego, CA 92121 (US).

(74) Agents: DICKINSON, Kari, A. et al.; Grant Anderson LLP, c/o Portfolioip, P.O. Box 52050, Minneapolis, MN 55402 (US).

(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IS, JP, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.

(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).

Published: without international search report and to be republished upon receipt of that report (Rule 48.2(g))

(54) Title: METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS

(57) Abstract: Technology provided herein relates in part to methods, processes and apparatuses for non-invasive assessment of genetic variations.
`
`
`
`
`
`
`WO 2013/052907
`
`PCT/US2012/059114
`
`METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS
`
`Related Patent Applications
`
This patent application claims the benefit of U.S. Provisional Patent Application No. 61/545,053 filed on October 7, 2011, entitled METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS, naming Dirk Johannes Van Den Boom and Charles R. Cantor as inventors, and designated by Attorney Docket No. SEQ-6036-PV. This patent application also claims the benefit of U.S. Provisional Patent Application No. 61/709,899 filed on October 4, 2012, entitled METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS, naming Cosmin Deciu, Zeljko Dzakula, Mathias Ehrich and Sung Kyun Kim as inventors, and designated by Attorney Docket No. SEQ-6034-PV3; U.S. Provisional Patent Application No. 61/663,477 filed on June 22, 2012, entitled METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS, naming Zeljko Dzakula and Mathias Ehrich as inventors, and designated by Attorney Docket No. SEQ-6034-PV2; and U.S. Provisional Patent Application No. 61/544,251 filed on October 6, 2011, entitled METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS, naming Zeljko Dzakula and Mathias Ehrich as inventors, and designated by Attorney Docket No. SEQ-6034-PV. The entire content of the foregoing provisional applications is incorporated herein by reference, including all text, tables and drawings.
`
`Field
`
Technology provided herein relates in part to methods, processes and apparatuses for non-invasive assessment of genetic variations.
`
`Background
`
Genetic information of living organisms (e.g., animals, plants and microorganisms) and other forms of replicating genetic information (e.g., viruses) is encoded in deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). Genetic information is a succession of nucleotides or modified nucleotides representing the primary structure of chemical or hypothetical nucleic acids. In humans, the complete genome contains about 30,000 genes located on twenty-four (24) chromosomes (see The Human Genome, T. Strachan, BIOS Scientific Publishers, 1992). Each gene encodes a
`
`
`specific protein, which after expression via transcription and translation fulfills a specific
`
`biochemical function within a living cell.
`
`Many medical conditions are caused by one or more genetic variations. Certain genetic variations
`
`cause medical conditions that include, for example, hemophilia, thalassemia, Duchenne Muscular
`
`Dystrophy (DMD), Huntington's Disease (HD), Alzheimer's Disease and Cystic Fibrosis (CF)
`
`(Human Genome Mutations, D. N. Cooper and M. Krawczak, BIOS Publishers, 1993). Such
`
`genetic diseases can result from an addition, substitution, or deletion of a single nucleotide in DNA
`
`of a particular gene. Certain birth defects are caused by a chromosomal abnormality, also referred
`
`to as an aneuploidy, such as Trisomy 21 (Down's Syndrome), Trisomy 13 (Patau Syndrome),
`
`Trisomy 18 (Edward's Syndrome), Monosomy X (Turner's Syndrome) and certain sex chromosome
`
`aneuploidies such as Klinefelter's Syndrome (XXY), for example. Another genetic variation is fetal
`
`gender, which can often be determined based on sex chromosomes X and Y. Some genetic
`
`variations may predispose an individual to, or cause, any of a number of diseases such as, for
`
`example, diabetes, arteriosclerosis, obesity, various autoimmune diseases and cancer (e.g.,
`
`colorectal, breast, ovarian, lung).
`
Identifying one or more genetic variations or variances can lead to diagnosis of, or determining predisposition to, a particular medical condition. Identifying a genetic variance can result in facilitating a medical decision and/or employing a helpful medical procedure. In some cases, identification of one or more genetic variations or variances involves the analysis of cell-free DNA.
`
`Cell-free DNA (CF-DNA) is composed of DNA fragments that originate from cell death and circulate
`
`in peripheral blood. High concentrations of CF-DNA can be indicative of certain clinical conditions
`
`such as cancer, trauma, burns, myocardial infarction, stroke, sepsis, infection, and other illnesses.
`
`Additionally, cell-free fetal DNA (CFF-DNA) can be detected in the maternal bloodstream and used
`
`for various noninvasive prenatal diagnostics.
`
`The presence of fetal nucleic acid in maternal plasma allows for non-invasive prenatal diagnosis
`
`through the analysis of a maternal blood sample. For example, quantitative abnormalities of fetal
`
`DNA in maternal plasma can be associated with a number of pregnancy-associated disorders,
`
`including preeclampsia, preterm labor, antepartum hemorrhage, invasive placentation, fetal Down
`
`syndrome, and other fetal chromosomal aneuploidies. Hence, fetal nucleic acid analysis in
`
`maternal plasma can be a useful mechanism for the monitoring of fetomaternal well-being.
`
`
`Early detection of pregnancy-related conditions, including complications during pregnancy and
`
`genetic defects of the fetus is important, as it allows early medical intervention necessary for the
`
`safety of both the mother and the fetus. Prenatal diagnosis traditionally has been conducted using
`
`cells isolated from the fetus through procedures such as chorionic villus sampling (CVS) or
`
`amniocentesis. However, these conventional methods are invasive and present an appreciable
`
`risk to both the mother and the fetus. The National Health Service currently cites a miscarriage
`
`rate of between 1 and 2 per cent following the invasive amniocentesis and chorionic villus sampling
`
`(CVS) tests. The use of non-invasive screening techniques that utilize circulating CFF-DNA can be
`
`an alternative to these invasive approaches.
`
`Summary
`
`Provided in some aspects are methods for determining the presence or absence of a genetic
`
`variation and methods for determining the presence or absence of a fetal aneuploidy using partial
`
`nucleotide sequence reads, and computer program products and systems for implementing
`
`methods discussed herein.
`
`Also provided, in some aspects, are methods for detecting the presence or absence of a genetic
`
`variation, comprising: (a) obtaining counts of partial nucleotide sequence reads mapped to
`
`genomic sections of a reference genome, which partial nucleotide sequence reads are reads of
`
`circulating cell-free nucleic acid from a test sample, where at least some of the partial nucleotide
`
`sequence reads comprise: i) multiple nucleobase gaps between identified nucleobases, or ii) one
`
`or more nucleobase classes, where each nucleobase class comprises a subset of nucleobases
`
`present in the sample nucleic acid, or a combination of (i) and (ii), (b) normalizing the counts of the
`
`partial nucleotide sequence reads, thereby providing normalized counts, and (c) detecting the
`
presence or absence of a genetic variation based on the normalized counts. In some cases, the genetic variation is a nucleic acid sequence variation. In some cases, the genetic variation is a copy number variation. In some embodiments, the test sample is from a pregnant female and the genetic variation is a fetal aneuploidy.
`
`In some embodiments, a method comprises comparing the normalized counts to a reference,
`
`thereby making a comparison, where determining the presence or absence of the genetic variation
`
`in (c) is based on the normalized counts and the comparison.
`
`In some cases, the reference is
`
`counts of sequence reads mapped to a reference chromosome or segment thereof.
`
`In some
`
`
`cases, the reference chromosome is chromosome 1, chromosome 14, chromosome 19 or
`
`combination thereof.
`
`In some embodiments, the counts of the partial nucleotide sequence reads obtained in (a)
`
`comprise counts of partial nucleotide sequence reads mapped to a test chromosome or segment
`
`thereof.
`
`In some cases, the test chromosome is chromosome 13, chromosome 18, chromosome
`
`21 or combination thereof.
`
`In some cases, the counts are expressed as a ratio of counts for
`
`genomic sections in a test chromosome or segment thereof to counts for genomic sections in
`
`autosomes or segment thereof, thereby providing a count representation.
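The count representation described above is a simple ratio, and can be sketched as follows. This is a minimal illustration only: the function name, the dictionary layout and the toy counts are assumptions made for the example and are not taken from the application.

```python
def count_representation(section_counts, test_chrom, autosomes):
    """Ratio of counts on the test chromosome to counts on all listed autosomes.

    section_counts maps a chromosome name to a list of counts, one per
    genomic section (hypothetical layout chosen for this sketch).
    """
    test_total = sum(section_counts[test_chrom])
    autosome_total = sum(sum(section_counts[c]) for c in autosomes)
    if autosome_total == 0:
        raise ValueError("no counts in the autosomal genomic sections")
    return test_total / autosome_total


# Toy counts for three chromosomes, three genomic sections each.
toy_counts = {
    "chr1": [100, 120, 80],
    "chr2": [90, 110, 70],
    "chr21": [10, 12, 8],
}
representation = count_representation(toy_counts, "chr21", ["chr1", "chr2", "chr21"])
print(representation)
```

Whether the test chromosome itself is included in the denominator is a design choice; here it is included because chromosome 21 is an autosome.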
`
`In some embodiments, the normalizing in (b) comprises normalizing according to guanine and
`
`cytosine (GC) content of the genomic sections, and providing calculated genomic section levels.
`
`In
`
`some embodiments, the normalizing in (b) comprises: (i) determining a guanine and cytosine (GC)
`
`bias for each of the genomic sections for multiple samples from a fitted relation for each sample
`
`between (1) the counts of the partial nucleotide sequence reads mapped to each of the genomic
`
`sections, and (2) GC content for each of the genomic sections; and (ii) calculating a genomic
`
`section level for each of the genomic sections from a fitted relation between (1) the GC bias and
`
`(2) the counts of the partial nucleotide sequence reads mapped to each of the genomic sections,
`
`thereby providing calculated genomic section levels, whereby bias in the counts of the partial
`
`nucleotide sequence reads mapped to each of the portions of the reference genome is reduced in
`
`the calculated genomic section levels, and where the normalized counts in (b) comprise the
`
`calculated genomic section levels.
`
`In some cases, the normalized counts in (b) are the calculated
`
`genomic section levels.
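The two fitted relations in (i) and (ii) can be sketched with ordinary least squares. The text does not state the form of either relation, so this sketch assumes both are linear, and it assumes the level is recovered as (count minus slope times GC bias) divided by the intercept of the per-section fit. All function names and the synthetic data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gc_bias_per_sample(counts, gc_content):
    """Step (i): for each sample, the slope of counts versus GC content."""
    return [fit_line(gc_content, sample_counts)[0] for sample_counts in counts]


def genomic_section_levels(counts, gc_content):
    """Step (ii): per-section fit of counts versus GC bias across samples.

    counts[s][i] is the count for sample s in genomic section i.
    Assumed model: count = intercept_i * level + slope_i * bias_s.
    Sections with a zero intercept would need to be filtered out first.
    """
    biases = gc_bias_per_sample(counts, gc_content)
    levels = [[0.0] * len(gc_content) for _ in counts]
    for i in range(len(gc_content)):
        column = [sample_counts[i] for sample_counts in counts]
        slope, intercept = fit_line(biases, column)
        for s, bias in enumerate(biases):
            levels[s][i] = (counts[s][i] - slope * bias) / intercept
    return levels


# Synthetic data: four samples that differ only in the strength of their GC bias.
gc_content = [0.30, 0.40, 0.50, 0.60]
true_biases = [-50.0, 0.0, 50.0, 100.0]
counts = [[100.0 + b * (gc - 0.45) for gc in gc_content] for b in true_biases]
levels = genomic_section_levels(counts, gc_content)
print(levels[0])
```

Because the synthetic samples differ only in GC bias, every calculated level comes out at approximately 1, which is the behavior the bias reduction is meant to produce.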
`
`In some embodiments, the normalized counts are adjusted for a first level for a first set of genomic
`
`sections which first level is significantly different than a second level for a second set of genomic
`
`sections, thereby providing adjusted normalized counts, where determining the presence or
`
`absence of the genetic variation in (c) is based on the adjusted normalized counts.
`
`In some embodiments, a method comprises (i) identifying a first elevation of the normalized counts
`
`significantly different than a second elevation of the normalized counts in a normalized counts
`
`profile, which first elevation is for a first set of genomic sections, and which second elevation is for
`
`a second set of genomic sections; (ii) determining an expected elevation range for a homozygous
`
`and heterozygous copy number variation according to an uncertainty value for a segment of the
`
`
`genome; and (iii) adjusting the first elevation by a predetermined value, or adjusting the first
`
`elevation to the second elevation, when the first elevation is within one of the expected elevation
`
`ranges, thereby providing an adjustment of the first elevation.
`
`In some cases, the segment of the
`
`genome comprises the first elevation or the second elevation, or the first elevation and the second
`
`elevation.
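Steps (ii) and (iii) above can be sketched as a range check followed by an adjustment. The expected elevations used here (0, 0.5, 1.5 and 2 relative to an unaffected level of 1), the multiplier applied to the uncertainty value, and all names are assumptions made for illustration; the text does not fix any of them. The sketch implements the option of adjusting the first elevation to the second elevation.

```python
# Assumed expected elevations, relative to an unaffected elevation of 1.0.
EXPECTED_ELEVATIONS = {
    "homozygous deletion": 0.0,
    "heterozygous deletion": 0.5,
    "heterozygous duplication": 1.5,
    "homozygous duplication": 2.0,
}


def expected_ranges(uncertainty, k=3.0):
    """Expected elevation ranges: expected value plus or minus k times the uncertainty."""
    return {
        name: (value - k * uncertainty, value + k * uncertainty)
        for name, value in EXPECTED_ELEVATIONS.items()
    }


def adjust_elevation(first, second, uncertainty, k=3.0):
    """Return (adjusted first elevation, matched variation name or None)."""
    for name, (low, high) in expected_ranges(uncertainty, k).items():
        if low <= first <= high:
            return second, name  # first elevation is set to the second elevation
    return first, None  # outside every expected range: leave unchanged


print(adjust_elevation(1.48, 1.0, uncertainty=0.02))
print(adjust_elevation(1.20, 1.0, uncertainty=0.02))
```

An elevation that falls outside every expected range is left as it is, so that a signal not explained by a copy number variation in the segment is not flattened.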
`
`In some embodiments, the normalizing in (b) comprises performing a local regression on the
`
`counts of the partial nucleotide sequence reads or the calculated genomic section levels, or the
`
`counts of the partial nucleotide sequence reads and the calculated genomic section levels.
`
`Sometimes the local regression comprises a weighted least squares fit. Sometimes the local
`
`regression comprises a LOESS regression.
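A local regression with a weighted least squares fit can be sketched in plain Python as below. This is a simplified LOESS (local linear fit, tricube weights, no robustness iterations). The `gc_correct` helper, which rescales counts by the fitted trend, is one assumed way of using the fit for normalization and is not prescribed by the text; all names and data are hypothetical.

```python
from statistics import median


def loess(xs, ys, frac=0.5):
    """Local linear regression with tricube weights; returns the fitted y at each x."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbors defining the bandwidth
    fitted = []
    for x0 in xs:
        distances = sorted(abs(x - x0) for x in xs)
        bandwidth = distances[k - 1] or 1e-12
        weights = []
        for x in xs:
            u = abs(x - x0) / bandwidth
            weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)
        # Weighted least squares fit of a straight line around x0.
        sw = sum(weights)
        mx = sum(w * x for w, x in zip(weights, xs)) / sw
        my = sum(w * y for w, y in zip(weights, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def gc_correct(counts, gc_content, frac=0.5):
    """Rescale each count by the local GC trend (assumed normalization scheme)."""
    trend = loess(gc_content, counts, frac)
    center = median(counts)
    return [c * center / t for c, t in zip(counts, trend)]


# Toy data in which counts depend only on GC content.
gc = [0.30 + 0.03 * i for i in range(10)]
raw = [100.0 + 200.0 * g for g in gc]
corrected = gc_correct(raw, gc)
print(corrected)
```

In this toy case the counts are fully explained by GC content, so the corrected counts are all approximately equal to the median of the raw counts.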
`
`In some embodiments, the partial nucleotide sequence reads are unary partial reads, for which
`
`unary partial reads one nucleotide is known at known positions and the other positions can be any
`
`one of three other nucleotides. Sometimes the partial nucleotide sequence reads are about 30
`
`base pairs or more.
`
`In some embodiments, the partial nucleotide sequence reads are binary partial reads, for which
`
`binary partial reads a first nucleotide class consisting of two possible bases is known at known
`
`positions and a second nucleotide class consisting of two possible bases is known at known
`
`positions, where the bases of the first nucleotide class are different than the bases of the second
`
`nucleotide class. Sometimes the partial nucleotide sequence reads are about 30 base pairs or
`more.
`
`In some embodiments, the partial nucleotide sequence reads are ternary partial reads, for which
`
`ternary partial reads a first nucleotide is known at known positions, a second nucleotide is known
`
`at other known positions and the other positions are any one of two nucleotides other than the first
`
`nucleotide and the second nucleotide. Sometimes the partial nucleotide sequence reads are about
`
`20 base pairs or more.
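The three kinds of partial read described above can be illustrated by deriving each from a fully identified read. The sketch below is illustrative only: the gap symbol `-`, the choice of purine and pyrimidine as the two classes of a binary read (written with the IUPAC codes R and Y), and the function names are assumptions.

```python
def unary_read(read, known="A", gap="-"):
    """One nucleotide is identified; every other position is a gap."""
    return "".join(base if base == known else gap for base in read)


def binary_read(read, first_class="AG", symbols="RY"):
    """Each position is assigned to one of two classes of two bases each."""
    return "".join(symbols[0] if base in first_class else symbols[1] for base in read)


def ternary_read(read, first="A", second="C", gap="-"):
    """Two nucleotides are identified; the remaining positions are gaps."""
    return "".join(base if base in (first, second) else gap for base in read)


full_read = "GATTACA"
print(unary_read(full_read))    # -A--A-A
print(binary_read(full_read))   # RRYYRYR
print(ternary_read(full_read))  # -A--ACA
```

A gap in a unary read still carries some information, since the base at that position is known not to be the identified nucleotide.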
`
`In some embodiments, a method comprises determining partial nucleotide sequence reads of the
`
`nucleic acid from the test sample.
`
`In some cases, the partial nucleotide sequence reads are
`
`determined using a method comprising a massively parallel sequencing (MPS) process or a
`
`nanopore process, or a massively parallel sequencing (MPS) process and a nanopore process.
`
`In
`
`
`some embodiments, a method comprises mapping the partial nucleotide sequence reads to
`
`genomic sections of the reference genome.
`
In some embodiments, a method comprises isolating the nucleic acid from the test sample. In some embodiments, a method comprises obtaining the test sample. In some cases, the test sample is obtained from a pregnant female. The test sample sometimes is blood plasma, blood serum or urine.
`
`
`Also provided, in some aspects, are systems comprising one or more processors and memory,
`
`which memory comprises instructions executable by the one or more processors and which
`
`memory comprises counts of partial nucleotide sequence reads mapped to genomic sections of a
`
`reference genome, which partial nucleotide sequence reads are reads of circulating cell-free
`
`nucleic acid from a test sample, where at least some of the partial nucleotide sequence reads
`
`comprise: i) multiple nucleobase gaps between identified nucleobases, or ii) one or more
`
`nucleobase classes, where each nucleobase class comprises a subset of nucleobases present in
`
`the sample nucleic acid, or a combination of (i) and (ii); and which instructions executable by the
`
`one or more processors are configured to: (a) normalize the counts of the partial nucleotide
`
`sequence reads, thereby providing normalized counts, and (b) detect the presence or absence of a
`
`genetic variation based on the normalized counts.
`
`Also provided, in some aspects, are apparatuses comprising one or more processors and memory,
`
`which memory comprises instructions executable by the one or more processors and which
`
`memory comprises counts of partial nucleotide sequence reads mapped to genomic sections of a
`
`reference genome, which partial nucleotide sequence reads are reads of circulating cell-free
`
`nucleic acid from a test sample, where at least some of the partial nucleotide sequence reads
`
`comprise: i) multiple nucleobase gaps between identified nucleobases, or ii) one or more
`
`nucleobase classes, where each nucleobase class comprises a subset of nucleobases present in
`
`the sample nucleic acid, or a combination of (i) and (ii); and which instructions executable by the
`
`one or more processors are configured to: (a) normalize the counts of the partial nucleotide
`
`sequence reads, thereby providing normalized counts, and (b) detect the presence or absence of a
`
`genetic variation based on the normalized counts.
`
`Also provided, in some aspects, are computer program products tangibly embodied on a computer-
`
`readable medium, comprising instructions that when executed by one or more processors are
`
`
`
`
`configured to: (a) access counts of partial nucleotide sequence reads mapped to genomic sections
`
`of a reference genome, which partial nucleotide sequence reads are reads of circulating cell-free
`
`nucleic acid from a test sample, where at least some of the partial nucleotide sequence reads
`
`comprise: i) multiple nucleobase gaps between identified nucleobases, or ii) one or more
`
`nucleobase classes, where each nucleobase class comprises a subset of nucleobases present in
`
`the sample nucleic acid, or a combination of (i) and (ii), (b) normalize the counts of the partial
`
`nucleotide sequence reads, thereby providing normalized counts, and (c) detect the presence or
`
`absence of a genetic variation based on the normalized counts.
`
`
`Also provided, in some aspects, are methods for detecting the presence or absence of a fetal
`
`aneuploidy comprising: (a) obtaining partial nucleotide sequence reads from a sample comprising
`
`circulating, cell-free nucleic acid from a pregnant female, where at least some partial nucleotide
`
`sequence reads comprise: i) multiple nucleobase gaps between identified nucleobases, or ii) one
`
`or more nucleobase classes, where each nucleobase class comprises a subset of nucleobases
`
`present in the sample nucleic acid, or combination of (i) and (ii), (b) mapping the partial nucleotide
`
sequence reads to reference genome sections, (c) counting the number of partial nucleotide
`
`sequence reads mapped to each reference genome section, (d) comparing the number of counts
`
`of the partial nucleotide sequence reads mapped in (c), or derivative thereof, to a reference,
`
`thereby making a comparison, and (e) determining the presence or absence of a fetal aneuploidy
`
`
`based on the comparison.
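Steps (c) through (e) can be sketched as counting, comparing and deciding. The sketch assumes the mapping step has already produced a chromosome and start position for each read, uses a hypothetical section size, and uses a standard score against fractions from reference samples as the comparison. The text does not prescribe that statistic or the cutoff, and the data are invented for illustration.

```python
from collections import Counter
from statistics import mean, stdev

SECTION_SIZE = 50_000  # hypothetical genomic section length, in bases


def count_per_section(mapped_reads, section_size=SECTION_SIZE):
    """Count reads per (chromosome, section index) from (chromosome, position) pairs."""
    return Counter((chrom, pos // section_size) for chrom, pos in mapped_reads)


def chromosome_fraction(section_counts, chrom):
    """Derivative of the counts: fraction of all counted reads on one chromosome."""
    total = sum(section_counts.values())
    on_chrom = sum(n for (c, _), n in section_counts.items() if c == chrom)
    return on_chrom / total


def determine_aneuploidy(fraction, reference_fractions, z_cutoff=3.0):
    """Compare to reference samples; returns (z, True if the cutoff is exceeded)."""
    z = (fraction - mean(reference_fractions)) / stdev(reference_fractions)
    return z, z > z_cutoff


mapped = [
    ("chr21", 10), ("chr21", 60_000), ("chr21", 70_000),
    ("chr1", 5), ("chr1", 49_999), ("chr1", 50_000),
    ("chr1", 120_000), ("chr1", 130_000), ("chr1", 140_000), ("chr1", 149_999),
]
counts = count_per_section(mapped)
fraction = chromosome_fraction(counts, "chr21")
z, present = determine_aneuploidy(fraction, [0.20, 0.21, 0.19, 0.20, 0.22, 0.18])
print(counts[("chr1", 2)], fraction, present)
```

Real data would involve millions of reads and a normalization step before the comparison; the toy numbers only show the flow of data between the steps.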
`
`Also provided, in some aspects, are methods for detecting the presence or absence of a fetal
`
`aneuploidy comprising: (a) mapping partial nucleotide sequence reads that have been obtained
`
`from a sample comprising circulating, cell-free nucleic acid from a pregnant female, to reference
`
`genome sections, where at least some partial nucleotide sequence reads comprise: i) multiple
`
`nucleobase gaps between identified nucleobases, or ii) one or more nucleobase classes, where
`
`each nucleobase class comprises a subset of nucleobases present in the sample nucleic acid, or
`
`combination of (i) and (ii), (b) counting the number of partial nucleotide sequence reads mapped to
`
`each reference genome section, (c) comparing the number of counts of the partial nucleotide
`
`sequence reads mapped in (b), or derivative thereof, to a reference, thereby making a comparison,
`
`and (d) determining the presence or absence of a fetal aneuploidy based on the comparison.
`
`
`Also provided, in some aspects, are methods for detecting the presence or absence of a fetal
`
`aneuploidy comprising: (a) obtaining a sample comprising circulating, cell-free nucleic acid from a
`
`
`
`
`pregnant female, (b) isolating sample nucleic acid from the sample, (c) obtaining partial nucleotide
`
`sequence reads from the sample nucleic acid, where at least some partial nucleotide sequence
`
`reads comprise: i) multiple nucleobase gaps between identified nucleobases, or ii) one or more
`
`nucleobase classes, where each nucleobase class comprises a subset of nucleobases present in
`
`the sample nucleic acid, or combination of (i) and (ii), (d) mapping the partial nucleotide sequence
`
`reads to reference genome sections, (e) counting the number of partial nucleotide sequence reads
`
`mapped to each reference genome section, (f) comparing the number of counts of the partial
`
`nucleotide sequence reads mapped in (e), or derivative thereof, to a reference, thereby making a
`
`comparison, and (g) determining the presence or absence of a fetal aneuploidy based on the
`
`
`comparison.
`
`
`Also provided, in some aspects, are methods for detecting the presence or absence of a genetic
`
`variation comprising: (a) obtaining partial nucleotide sequence reads from a sample comprising
`
`nucleic acid from a subject, where at least some partial nucleotide sequence reads comprise: i)
`
`multiple nucleobase gaps between identified nucleobases, or ii) one or more nucleobase classes,
`
`where each nucleobase class comprises a subset of nucleobases present in the sample nucleic
`
`acid, or combination of (i) and (ii), (b) mapping the partial nucleotide sequence reads to reference
`
genome sections, (c) comparing the partial nucleotide sequence reads mapped in (b) to a
`
`reference, thereby making a comparison, and (d) determining the presence or absence of a
`
`genetic variation based on the comparison.
`
`Also provided, in some aspects, are methods for detecting the presence or absence of a genetic
`
`variation comprising: (a) mapping partial nucleotide sequence reads that have been obtained from
`
`a sample comprising nucleic acid from a subject, to reference genome sections, where at least
`
`some partial nucleotide sequence reads comprise: i) multiple nucleobase gaps between identified
`
`nucleobases, or ii) one or more nucleobase classes, where each nucleobase class comprises a
`
`subset of nucleobases present in the sample nucleic acid, or combination of (i) and (ii), (b)
`
`comparing the partial nucleotide sequence reads mapped in (a) to a reference, thereby making a
`
`comparison, and (c) determining the presence or absence of a genetic variation based on the
`
`
`comparison.
`
`Also provided, in some aspects, are methods for detecting the presence or absence of a genetic
`
`variation comprising: (a) obtaining a sample comprising nucleic acid from a subject, (b) isolating
`
`sample nucleic acid from the sample, (c) obtaining partial nucleotide sequence reads from the
`
`
`
`
`sample nucleic acid, where at least some partial nucleotide sequence reads comprise: i) multiple
`
`nucleobase gaps between identified nucleobases, or ii) one or more nucleobase classes, where
`
`each nucleobase class comprises a subset of nucleobases present in the sample nucleic acid, or
`
`combination of (i) and (ii), (d) mapping the partial nucleotide sequence reads to reference genome
`
`sections, (e) comparing the partial nucleotide sequence reads mapped in (d) to a reference,
`
`thereby making a comparison, and (f) determining the presence or absence of a genetic variation
`
`based on the comparison.
`
`In some embodiments, the genetic variation is a nucleic acid sequence variation.
`
`In some cases,
`
`nucleotide sequences of the partial nucleotide sequence reads are compared to a reference and
`
`sometimes a sequence match or mismatch is determined.
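Determining a sequence match or mismatch for a partial read can be sketched as a comparison restricted to the identified positions. The sketch assumes the read has already been mapped to an offset in the reference, uses `-` as the gap symbol, and treats a gap as carrying no information; these choices and the names are illustrative assumptions.

```python
def mismatch_positions(partial_read, reference, offset, gap="-"):
    """Reference coordinates at which an identified base differs from the reference."""
    mismatches = []
    for i, base in enumerate(partial_read):
        if base == gap:
            continue  # unidentified position: nothing to compare
        if reference[offset + i] != base:
            mismatches.append(offset + i)
    return mismatches


reference = "CCGATTACAGG"
print(mismatch_positions("-A--ACA", reference, offset=2))  # []
print(mismatch_positions("-A--ACC", reference, offset=2))  # [8]
```

An empty list indicates a match at every identified position; a non-empty list gives candidate positions of a sequence variation.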
`
`In some embodiments, the genetic variation is a nucleic acid copy number variation.
`
`In some
`
`embodiments, a method further comprises after the mapping of partial nucleotide sequence reads,
`
`counting the number of partial nucleotide sequence reads mapped to each reference genome
`
`section. Often, the number of counts of the partial nucleotide sequence reads, or derivative
`
`thereof, are compared to a reference.
`
`In some embodiments, the subject is a fetus and the sample is from a pregnant female that bears a
`
`fetus.
`
In some cases, the sample comprises circulating, cell-free nucleic acid and sometimes the
`
`sample nucleic acid comprises maternal and fetal nucleic acid.
`
`In some embodiments, the sample is blood, urine, saliva, cervical swab, serum, and/or plasma.
`
In some embodiments, the partial nucleotide sequence reads comprise relative positional information for one or more nucleobase species. In some cases, the partial nucleotide sequence reads contain relative positional information for adenine. In some cases, the partial nucleotide sequence reads contain relative positional information for guanine. In some cases, the partial nucleotide sequence reads contain relative positional information for thymine. In some cases, the partial nucleotide sequence reads contain relative positional information for cytosine. In some cases, the partial nucleotide sequence reads contain relative positional information for methyl-cytosine.
`
`
`In some embodiments, the partial nucleotide sequence reads contain relative positional information
`
`for two nucleobase species selected from the group consisting of adenine, guanine, thymine,
`
`cytosine or methyl-cytosine.
`
`In some embodiments, the partial nucleotide sequence reads contain
`
`relative positional information for three nucleobase species selected from the group consisting of
`
`adenine, guanine, thymine, cytosine or methyl-cytosine.
`
`
`In some embodiments, the one or more nucleobase species comprise one or more detectable
`
`labels.
`
`In some embodiments, the partial nucleotide sequence reads contain relative positional
`
`information for sequences complementary to one or more hybridized probe species.
`
`In some
`
`cases, the one or more hybridized probe species comprise one or more detectable labels.
`
`In some embodiments, the nucleobase class is purine.
`
`In some embodiments, the nucleobase
`
`class is pyrimidine.
`
`In some cases, purines are distinguished from pyrimidines in the partial
`
`nucleotide sequence reads.
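When a read distinguishes only purines from pyrimidines, it can be located in a reference by collapsing the reference to the same two classes. The sketch below is a naive exact-match search meant only to illustrate the idea; the R and Y symbols follow the IUPAC codes, the input is assumed to contain only A, C, G and T, and the function names are hypothetical.

```python
PURINES = frozenset("AG")


def to_class_string(sequence):
    """Collapse A/G to 'R' (purine) and C/T to 'Y' (pyrimidine)."""
    return "".join("R" if base in PURINES else "Y" for base in sequence)


def candidate_positions(class_read, reference):
    """All offsets at which the class read matches the collapsed reference."""
    collapsed = to_class_string(reference)
    positions = []
    start = collapsed.find(class_read)
    while start != -1:
        positions.append(start)
        start = collapsed.find(class_read, start + 1)
    return positions


reference = "CCGATTACAGG"
print(candidate_positions("RRYYRYR", reference))  # [2]
print(candidate_positions("RY", reference))       # [3, 6]
```

The short read matches at two offsets while the longer one matches at only one, which illustrates why a class-only read needs enough positions to map to a single reference genome section.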
`
`In some embodiments, the sample nucleic acid comprises single stranded nucleic acid.
`
`In some
`
`embodiments, the sample nucleic acid comprises double stranded nucleic acid.
`
`In some cases,
`
`the nucleobase class is a nucleobase pair species in a duplex nucleic acid.
`
In some embodiments, the obtaining partial nucleotide sequence reads includes subjecting the sample nucleic acid to a sequencing process using a sequencing device. In some cases, the partial nucleotide sequence reads are obtained by nanopore sequencing. In some cases, the partial nucleotide sequence reads are obtained by reversible terminator-based sequencing. In some cases, the partial nucleotide sequence reads are obtained by pyrosequencing. In some cases, the partial nucleotide sequence reads are obtained by real time sequencing. In some cases, the partial nucleotide sequence reads are obtained by oligonucleotide probe ligation sequencing. In some cases, the partial nucleotide sequence reads are obtained by sequencing by hybridization.
`
In some embodiments, the partial nucleotide sequence reads comprise a number of discrete position identities sufficient to map to a reference genome section. In some embodiments, the partial nucleotide sequence read is of sufficient length to map to a reference genome section. In some cases, the partial nucleotide sequence read length is at least about 36 nucleobases. In some cases, the partial nucleotide sequence read length is at least about 72 nucleobases. In some cases, the partial nucleotide sequence read length is at least about 108 nucleobases. In some embodiments, the nucleobase gaps in each partial nucleotide sequence read independently are about 1 to about 100 sequential nucleobases.
`
`In some embodiments, a method comprises obtaining full nucleotide sequence reads, which
`
`nucleotide sequence reads do not contain nucleobase gaps between identified nucleobases or a
`
`nucleobase class comprising a subset of nucleobases present in the sample nucleic acid.
`
`
In some embodiments, the genetic variation is associated with a medical condition. In some cases, the medical condition is cancer. In some cases, the medical condition is an aneuploidy and is sometimes a fetal aneuploidy. In some embodiments, the fetal aneuploidy is trisomy 13, trisomy 18 or trisomy 21.
`
`Also provided, in some aspects, are computer program products, comprising a computer usable
`
`medium having a computer readable program code embodied therein, the computer readable
`
`program code comprising distinct software modules comprising a sequence receiving module, a
`
`logic processing module, and a data display organization module, the computer readable program
`
`code adapted to be executed to implement a method for identifying the presence or absence of a
`
`fetal aneuploidy, the method comprising: (a) obtaining, by the sequence receiving module, partial
`
`nucleotide sequence reads from a sample comprising circulating, cell-free nucleic acid from a
`
`pregnant female, where at least some partial nucleotide sequence reads comprise: i) multiple
`
`nucleobase gaps between identified nucleobases, or ii) one or more nucleobase classes, where
`
`each nucleobase class comprises a subset of nucleobases present in the sample nucleic acid, or
`
`combination of (i) and (ii); (b) receiving, by the logic processing module, the partial nucleotide
`
sequence reads; (c) mapping, by the logic processing module, the partial nucleotide sequence
`
`reads to reference genome sections; (d) counting, by the logic processing module, the number of
`
`partial nucleotide sequence reads mapped to each reference genome section; (e) comparing, by
`
`the logic processing module, the number of counts of the partial nucleotide sequence reads, or
`
`derivative thereof, to a reference, or portion thereof, thereby making a comparison; (f) providing, by
`
`the logic processing module, an outcome determinative of the presence or absence of a fetal
`
`aneuploidy based on the comparison; and (g) organizing, by the data display organization module
`
`in response to being determined by the logic processing module, a data display indicating the
`
`presence or absence of a fetal aneuploidy.
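The division of work among the three software modules can be sketched as three small classes. Everything below is a hypothetical arrangement for illustration: the class interfaces, the toy lookup used in place of a real mapper, and the fixed reference fraction and tolerance used for the comparison are assumptions, not details given in the text.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    test_fraction: float
    aneuploidy_present: bool


class SequenceReceivingModule:
    """Obtains partial nucleotide sequence reads (here, from an in-memory list)."""

    def __init__(self, reads):
        self._reads = list(reads)

    def obtain(self):
        return list(self._reads)


class LogicProcessingModule:
    """Maps, counts, compares to a reference and provides an outcome."""

    def __init__(self, mapper, test_chrom, reference_fraction, tolerance):
        self.mapper = mapper  # callable: read -> chromosome name, or None if unmapped
        self.test_chrom = test_chrom
        self.reference_fraction = reference_fraction
        self.tolerance = tolerance

    def process(self, reads):
        mapped = [c for c in (self.mapper(r) for r in reads) if c is not None]
        if not mapped:
            raise ValueError("no reads could be mapped")
        fraction = mapped.count(self.test_chrom) / len(mapped)
        present = fraction - self.reference_fraction > self.tolerance
        return Outcome(fraction, present)


class DataDisplayOrganizationModule:
    """Organizes a data display from the outcome."""

    def organize(self, outcome):
        status = "present" if outcome.aneuploidy_present else "absent"
        return f"Fetal aneuploidy {status} (test chromosome fraction {outcome.test_fraction:.3f})"


# Toy wiring: a dictionary lookup stands in for a real mapping step.
toy_index = {"-A--A-A": "chr21", "A--A---": "chr1"}
receiver = SequenceReceivingModule(["-A--A-A"] * 3 + ["A--A---"] * 7 + ["-------"])
logic = LogicProcessingModule(toy_index.get, "chr21", reference_fraction=0.2, tolerance=0.05)
outcome = logic.process(receiver.obtain())
display = DataDisplayOrganizationModule().organize(outcome)
print(display)
```

Keeping the mapper as an injected callable lets the same logic processing module work with any mapping method that returns a chromosome name for each read.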
`
`
`In some embodiments, a computer program product is stored in an apparatus comprising memory.
`
`In some cases, the apparatus comprises a processor that implements one or more functions of the
`
`computer program product specified in any of the above embodiments.
`
`Also provided, in some aspects, are systems comprising a nucleic acid sequencing apparatus and
`
`a processing apparatus, where the sequencing apparatus obtains sequence reads from a sample,
`
`and the processing apparatus obtains the sequence reads from the sequencing apparatus and
`
`carries out a method comprising: (a) mapping partial nucleotide sequence reads from the
`
`sequencing apparatus that have been obtained from a sample comprising circulating, cell-free
`
`nucleic acid from a pregnant female, to reference genome sections