(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date: 14 April 2016 (14.04.2016)

(10) International Publication Number: WO 2016/055956 A1

(51) International Patent Classification: C12Q 1/68 (2006.01)

(21) International Application Number: PCT/IB2015/057679

(22) International Filing Date: 8 October 2015 (08.10.2015)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
62/062,612    10 October 2014 (10.10.2014)    US
62/062,616    10 October 2014 (10.10.2014)    US

(71) Applicant: INVITAE CORPORATION [US/US]; 458 Brannan Street, San Francisco, California 94107 (US).

(72) Inventor: OLIVARES, Eric; 458 Brannan Street, San Francisco, California 94107 (US).

(74) Agents: FORCE, Walker et al.; Pillsbury Winthrop Shaw Pittman LLP, Attention: Docketing Department, P.O. Box 10500, McLean, Virginia 22102 (US).

(81) Designated States (unless otherwise indicated, for every
kind of national protection available): AE, AG, AL, AM,
AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY,
BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM,
DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT,
HN, HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KN, KP, KR,
KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG,
MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM,
PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC,
SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN,
TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
`
`(84) Designated States (unless otherwise indicated, for every
`kind of regional protection available): ARIPO (BW, GH,
`GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ,
`TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU,
`TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE,
`DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU,
`LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK,
`SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ,
`GW, KM, ML, MR, NE, SN, TD, TG).
`Published:
`
`with international search report (Art. 21(3))
`
`before the expiration of the time limit for amending the
`claims and to be republished in the event of receipt of
`amendments (Rule 48.2(h))
`
`(54) Title: UNIVERSAL BLOCKING OLIGO SYSTEM AND IMPROVED HYBRIDIZATION CAPTURE METHODS FOR
`MULTIPLEXED CAPTURE REACTIONS
`
FIG. 1: Four-Oligo Blocking Strategy (figure labels: Library Insert; Z; A, B, C, D, E, F, G; A’, C’, E’, G’)
`
`(57) Abstract: Provided herein, in some embodiments, are novel compositions and improved methods for nucleic acid manipulation
`and analysis that can be applied to multiplex nucleic acid sequencing. In certain embodiments, the novel compositions and methods
`presented herein are more cost effective, more conducive to automation, and faster than traditional approaches. Also provided herein
`are novel blocking nucleic acids.
`
`
`
`
`
`
`WO 2016/055956
`
`PCT/IB2015/057679
`
`UNIVERSAL BLOCKING OLIGO SYSTEM AND IMPROVED HYBRIDIZATION CAPTURE
`
`METHODS FOR MULTIPLEXED CAPTURE REACTIONS
`
`Related Patent Application
`
`This patent application claims the benefit of United States Provisional Patent Application No.
`
`62/062612 filed on October 10, 2014, entitled "UNIVERSAL BLOCKING OLIGO SYSTEM FOR
`
`MULTIPLEXED CAPTURE REACTIONS", naming Eric Olivares as an inventor, and designated
`
`by attorney docket no. 055911-0432232 and United States Provisional Patent Application No.
`
`62/062616 filed on October 10, 2014, entitled "METHODS OF HYBRIDIZATION CAPTURE
`
`USING NUCLEIC ACID BAITS FROM PAIRED-END SEQUENCING", naming Eric Olivares as
`
an inventor, and designated by attorney docket no. 055911-0432231. The entire contents of the
`
`foregoing patent applications are incorporated herein by reference, including all text, tables and
`
`drawings.
`
`Field
`
`The technology relates in part to compositions and methods of nucleic acid manipulation,
`
`analysis and high-throughput sequencing.
`
`Background
`
`Genetic information of living organisms (e.g., animals, plants, microorganisms, viruses) is
`
`encoded in deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). Genetic information is a
`
`succession of nucleotides or modified nucleotides representing the primary structure of nucleic
`
acids. The nucleic acid content (e.g., DNA) of an organism is often referred to as a genome. In

humans, the complete genome typically contains about 30,000 genes located on twenty-four

(24) chromosomes. Most genes encode a specific protein, which after expression via
`
`transcription and translation fulfills a specific biochemical function within a living cell.
`
`Many medical conditions are caused by one or more genetic variations within a genome. Some
`
`genetic variations may predispose an individual to, or cause, any of a number of diseases such
`
as diabetes, arteriosclerosis, obesity, various autoimmune diseases and cancer (e.g.,

colorectal, breast, ovarian, lung). Such genetic diseases can result from an
`
`addition, substitution, insertion or deletion of one or more nucleotides within a genome.
`
`
`Genetic variations can be identified by multiplex analysis of mixtures of nucleic acids often
`
`obtained from multiple sources, for example by use of next generation sequencing techniques.
`
`Such multiplex analysis often involves a significant amount of manipulation of nucleic acids
`
`prior to analysis involving many different steps that are not conducive to high-throughput
`
`processing.
`
In addition, current methods of nucleic acid manipulation are often costly and time

consuming, and often present substantial pitfalls that can lead to contamination of samples.

Compositions and methods herein offer significant improvements over current nucleic acid

manipulation and analysis techniques: they are more conducive to high-throughput automation,

more cost efficient, less time consuming and/or provide for less risk of contamination.
`
`Summary
`
`Presented herein, in some aspects, is a composition for use in massive parallel nucleic acid
`
`sequencing comprising, a) a library of nucleic acids comprising a plurality of library inserts
`
`where each nucleic acid of the library comprises (i) at least one library insert obtained from one
`
`of four or more samples, (ii) a first non-native nucleic acid, and (iii) a second non-native nucleic
`
`acid, where the first non-native nucleic acid and the second non-native nucleic acid are located
`
on opposing sides of the at least one library insert, and the first non-native nucleic acid

comprises a first distinguishable nucleic acid barcode and the second non-native nucleic acid

comprises a second distinguishable nucleic acid barcode, where the first and second

distinguishable nucleic acid barcodes are unique to the one of the four or more samples; and b)

four U-block nucleic acids, where (i) a first and second U-block nucleic acid are configured to

hybridize to the first non-native nucleic acid on opposing sides of the first distinguishable

nucleic acid barcode and (ii) a third and fourth U-block nucleic acid are configured to hybridize

to the second non-native nucleic acid on opposing sides of the second distinguishable nucleic

acid barcode, and (iii) each of the U-block nucleic acids does not substantially hybridize to a
`
`portion of the first or second distinguishable nucleic acid barcodes.
`
`In certain aspects, the
`
`library of nucleic acids comprises at least eight distinguishable nucleic acid barcodes.
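
The dual-barcode arrangement described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, four hypothetical samples and made-up 6-nucleotide barcode sequences; none of the names or sequences below come from this application. Each sample is identified by its own pair of first and second barcodes, so a pair that is not registered is left unassigned.

```python
# Illustrative only: sample names and barcode sequences are hypothetical.
from typing import Optional

# Each sample carries a unique (first barcode, second barcode) pair.
BARCODE_PAIRS = {
    ("ACGTAC", "TTGCAA"): "sample_1",
    ("GATCGA", "CCATGG"): "sample_2",
    ("TGCATG", "AAGCTT"): "sample_3",
    ("CTAGCT", "GGTACC"): "sample_4",
}


def assign_sample(first_barcode: str, second_barcode: str) -> Optional[str]:
    """Return the sample for a barcode pair, or None if the pair is unknown."""
    return BARCODE_PAIRS.get((first_barcode.upper(), second_barcode.upper()))
```

In this sketch a mismatched pair (for example, the first barcode of one sample with the second barcode of another) returns None rather than being assigned to either sample.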
`
`In some aspects the compositions further comprise one or more capture nucleic acids, where (i)
`
`the capture nucleic acids comprise a member of a binding pair, and (ii) each of the capture
`
`nucleic acids is configured to specifically hybridize to a subset of the one or more library inserts.
`
Also presented herein, in certain embodiments, is a method of analyzing a nucleic acid library
`
`comprising, a) obtaining a library of nucleic acids comprising a plurality of library inserts where
`
`each nucleic acid of the library comprises (i) at least one library insert obtained from one of four
`
`or more samples, (ii) a first non-native nucleic acid, and (iii) a second non-native nucleic acid,
`
`
where the first non-native nucleic acid and the second non-native nucleic acid are located on

opposing sides of the at least one library insert, and the first non-native nucleic acid comprises

a first distinguishable nucleic acid barcode and the second non-native nucleic acid comprises a

second distinguishable nucleic acid barcode, where the first and second distinguishable
`
`nucleic acid barcodes are unique to the one of the four or more samples; b) contacting the
`
`library of nucleic acids with four U-block nucleic acids, where (i) a first and second U-block
`
`nucleic acid are configured to hybridize to the first non-native nucleic acid on opposing sides of
`
the first distinguishable nucleic acid barcode and (ii) a third and fourth U-block nucleic acid are
`
`configured to hybridize to the second non-native nucleic acid on opposing sides of the second
`
`distinguishable nucleic acid barcode, and (iii) each of the U-block nucleic acids does not
`
`substantially hybridize to a portion of the first or second distinguishable nucleic acid barcodes;
`
`and
`
`c) contacting the library of nucleic acids with one or more capture nucleic acids, each
`
`comprising a first member of a binding pair, where the one or more capture nucleic acids are
`
`configured to specifically hybridize to a subset of the nucleic acids of the library; d) capturing
`
`the capture nucleic acids, thereby providing captured nucleic acids comprising the subset of
`
`nucleic acids of the library; e) contacting the captured nucleic acids with a set of primers under
`
amplification conditions, thereby providing amplicons; and f) analyzing the amplicons.
`
In certain aspects the analyzing comprises providing sequence reads. In some aspects

sequencing reads can be obtained by a method comprising massive parallel sequencing and/or

paired-end sequencing.
`
`In certain aspects regarding the compositions and methods herein, the non-native nucleic acids
`
`comprise universal nucleic acids.
`
`In some aspects the nucleic acids of the library comprise four
`
`or more, or ten or more barcode nucleic acids.
`
`In some aspects each library insert comprises
`
`one or two barcode sequences.
`
`In certain aspects U-block nucleic acids comprise a length of
`
`10 to 40 nucleotides.
`
`In certain aspects U-block nucleic acids comprise a length of 10 to 20
`
`nucleotides.
`
`In some aspects the U-block nucleic acids comprise locked nucleic acids and/or
`
`bridged nucleic acids.
`
In certain aspects the U-block nucleic acids comprise a melting

temperature of between about 65 °C and about 90 °C. In certain aspects the U-block nucleic

acids comprise a melting temperature of at least 65 °C or at least 75 °C. In some aspects the

U-block nucleic acids do not comprise a degenerate nucleotide base. In some aspects the

U-block nucleic acids do not comprise a 3-nitropyrrole, a 5-nitroindole, inosine, a 2’-deoxyinosine,
`
`analogues, derivatives or combinations thereof.
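
The U-block properties listed above (length, melting temperature, absence of degenerate bases) can be expressed as a simple screening check. The following Python sketch is an illustration only, not a procedure described in this application. The function name and default thresholds are assumptions taken from the broadest ranges stated above, and the melting temperature is supplied by the caller because locked or bridged nucleic acid modifications alter it and no formula is given here.

```python
# Illustrative sketch; thresholds mirror the broadest ranges stated in the text.
STANDARD_BASES = set("ACGT")


def check_u_block(sequence: str, melting_temp_c: float,
                  min_len: int = 10, max_len: int = 40,
                  min_tm_c: float = 65.0, max_tm_c: float = 90.0) -> list:
    """Return a list of problems found; an empty list means the candidate passes."""
    problems = []
    seq = sequence.upper()
    if not min_len <= len(seq) <= max_len:
        problems.append(f"length {len(seq)} outside {min_len}-{max_len} nt")
    # Any symbol other than A, C, G or T is treated here as a degenerate or
    # universal base (for example N, or I for inosine).
    odd = sorted(set(seq) - STANDARD_BASES)
    if odd:
        problems.append(f"non-standard bases: {','.join(odd)}")
    if not min_tm_c <= melting_temp_c <= max_tm_c:
        problems.append(f"Tm {melting_temp_c} C outside {min_tm_c}-{max_tm_c} C")
    return problems
```

A narrower embodiment (for example, 10 to 20 nucleotides) could be screened by passing max_len=20.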
`
`In some aspects provided herein is a method of analyzing a nucleic acid library comprising, a)
`
`obtaining a library of nucleic acids comprising a first set of amplicons, where each amplicon
`
`
comprises a first non-native nucleic acid and a second non-native nucleic acid, one or more
`
`distinguishable identifiers, and a library insert obtained from one of one or more samples,
`
`where the library insert is located between the first and the second non-native nucleic acids, b)
`
`preparing a mixture comprising contacting the nucleic acids of the library with one or more
`
`blocking nucleic acids and capture nucleic acids, where (i) the one or more blocking nucleic
`
`acids are configured to specifically hybridize to the first and second non-native nucleic acids, (ii)
`
the capture nucleic acids comprise a first member of a binding pair, and (iii) the capture nucleic
`
`acids are configured to specifically hybridize to a subset of amplicons of the first set, c) purifying
`
`the mixture, thereby providing purified nucleic acid, where the purified nucleic acid comprises
`
`the nucleic acids of the library, the one or more blocking nucleic acids, and the capture nucleic
`
`acids, d) hybridizing the purified nucleic acid under hybridization conditions,
`
`e) capturing
`
`the capture nucleic acids, thereby providing captured nucleic acid, f) contacting the captured
`
nucleic acid with a set of primers under amplification conditions, thereby providing a second set
`
`of amplicons, and g) analyzing the second set of amplicons.
`
`In some aspects, the amplification
`
conditions comprise a heat-stable polymerase and/or a polymerase chain reaction.
`
`In certain
`
`aspects the preparing in (b) comprises contacting the nucleic acids of the library with competitor
`
`nucleic acids.
`
`In some embodiments, the capture nucleic acids are configured to specifically
`
`hybridize to a portion of the library insert.
`
In certain embodiments the one or more blocking

nucleic acids are configured to specifically hybridize to a portion of the first non-native nucleic

acid and/or the second non-native nucleic acid. In certain embodiments the one or more

blocking nucleic acids comprise locked nucleic acids and/or bridged nucleic acids.
`
`In some aspects the capture nucleic acids comprising a first member of a binding pair are
`
`configured to specifically hybridize to a portion of an exon, an intron, a portion of a selected
`
chromosome and/or to a region of DNA comprising a genetic variation (e.g., a repeat, a
`
`polymorphism).
`
`In some embodiments the first member of the binding pair comprises a biotin,
`
`an antigen, a hapten, an antibody or a portion thereof.
`
`In some aspects the capturing in (e)
`
`comprises contacting the mixture with a second member of a binding pair.
`
`In some aspects
`
`the second member of the binding pair comprises avidin, protein A, protein G, an antibody, or a
`
`binding portion thereof.
`
`In certain embodiments the second member of the binding pair
`
`comprises a substrate.
`
`In some embodiments, the substrate comprises a magnetic compound.
`
`In some embodiments, the substrate comprises a bead.
`
`In some embodiments, the substrate
`
`comprises polystyrene, polycarbonate, sepharose or agarose.
`
`In some embodiments, the
`
`substrate comprises a metal.
`
`
`In certain embodiments the hybridization conditions comprise denaturing.
`
`In certain
`
`embodiments the hybridization conditions comprise hybridizing the captured nucleic acids to a
`
`
`
`
`portion of one or more of the amplicons of the first set.
`
`In certain embodiments the
`
`hybridization conditions comprise incubating the captured nucleic acid at a temperature
`
between about 25 °C and about 70 °C. In certain embodiments the hybridization conditions

comprise incubating the captured nucleic acid at a temperature between about 35 °C and about

60 °C. In certain embodiments the hybridization conditions comprise incubating for an amount

of time between about 1 hour and about 24 hours or between about 12 hours and about 20

hours.
`
`In certain embodiments the hybridization conditions do not include a polymerase.
`
`In
`
`some embodiments the hybridizing in (d) comprises contacting the mixture with a hybridization
`
`buffer.
`
`In some embodiments the hybridizing in (d) comprises the sequential steps of (i)
`
`
`contacting the mixture with a hybridization buffer, (ii) denaturing and (iii) hybridizing.
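
The hybridization conditions summarized above can be captured as a small configuration check. This Python sketch is a hedged illustration: the class and field names are hypothetical, and the bounds are the broadest ranges stated above (about 25 °C to about 70 °C, about 1 hour to about 24 hours, no polymerase), treated here as hard limits only for simplicity.

```python
# Illustrative only: class and field names are hypothetical.
from dataclasses import dataclass


@dataclass
class HybridizationConditions:
    temperature_c: float
    duration_h: float
    includes_polymerase: bool = False

    def within_stated_ranges(self) -> bool:
        """True if the settings fall inside the broadest ranges stated in the text."""
        return (25.0 <= self.temperature_c <= 70.0
                and 1.0 <= self.duration_h <= 24.0
                and not self.includes_polymerase)
```

For example, HybridizationConditions(55.0, 16.0) falls inside these ranges, whereas a 95 °C setting does not.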
`
`In some aspects the analyzing comprises providing sequence reads. Sometimes the sequence
`
`reads are obtained by a method comprising next generation sequencing (e.g., massive parallel
`
sequencing). Sometimes the sequence reads are obtained by a method comprising paired-end

sequencing.
`
`In certain embodiments the first non-native nucleic acid comprises at least one nucleic acid
`
`barcode.
`
In certain embodiments the second non-native nucleic acid comprises at least one
`
`nucleic acid barcode.
`
`In certain embodiments the claimed methods herein do not comprise a drying step.
`
`In some
`
`embodiments the method does not comprise a denaturation step prior to (c).
`
`In some
`
`embodiments the method does not comprise a denaturation step prior to (d).
`
`In certain
`
embodiments the method does not comprise heating to a temperature above 80 °C prior to (d).

In certain embodiments the method does not comprise heating to a temperature above 90 °C
`
`prior to (d).
`
`In some embodiments, the mixture is not immobilized on a substrate of a flow cell
`
`or an array prior to (e). In some embodiments the purifying in (c) does not comprise addition of
`
`a second member of a binding pair configured to bind to the first member of the binding pair.
`
`In some aspects samples can be obtained from one or more species, one or more tissues, one
`
`or more mammals or one or more human subjects.
`
`Certain embodiments are described further in the following description, examples, claims and
`
`drawings.
`
`Brief Description of the Drawings
`
`
`The drawings illustrate embodiments of the technology and are not limiting. For clarity and
`
`ease of illustration, the drawings are not made to scale and, in some instances, various aspects
`
`may be shown exaggerated or enlarged to facilitate an understanding of particular
`embodiments.
`
`Figure 1 shows an embodiment of a blocking method comprising four U-block nucleic acids (A’,
`
C’, E’ and G’). Figure 1 shows a representative nucleic acid of a library (Z) comprising a library
`
`insert (D) and distinguishable nucleic acid barcodes (B and F), where a plurality of different
`
`inserts and different distinguishable barcodes are present in the many nucleic acids of the
`
library. Figure 1 shows U-block nucleic acids (A’, C’, E’ and G’), each of which is configured
`
`to specifically hybridize to portions of non-native nucleic acids (A, C, E and G) as shown, and
`
`which U-block nucleic acids hybridize adjacent to nucleic acid barcodes (B or F).
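
The layout described for Figure 1 can be sketched in Python as follows. All sequences are made up for illustration, and modeling each U-block as the exact reverse complement of a whole flanking segment is an assumption of this sketch rather than a requirement stated in the text. The sketch builds the segment spans of a representative molecule Z (A through G), derives the four U-blocks, and checks that none of their target spans overlaps a barcode (B or F).

```python
# Illustrative only: all sequences below are made up for this sketch.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Segments of the representative library molecule Z, in 5' to 3' order.
SEGMENTS = {
    "A": "GCTAGGTCCATTCAG",       # non-native, upstream of barcode B
    "B": "ACGTAC",                # first barcode
    "C": "TTCGGAACCTGATCA",       # non-native, downstream of barcode B
    "D": "GGCTTAGCCATTGCAGTCCA",  # library insert
    "E": "CAGTTGGACCTAAGC",       # non-native, upstream of barcode F
    "F": "TTGCAA",                # second barcode
    "G": "AGCCTTGAGGTACTC",       # non-native, downstream of barcode F
}


def segment_spans(segments):
    """Map each segment name to its (start, end) span within the joined molecule."""
    spans, pos = {}, 0
    for name, seq in segments.items():
        spans[name] = (pos, pos + len(seq))
        pos += len(seq)
    return spans


def design_u_blocks(segments, flanks=("A", "C", "E", "G")):
    """One U-block per flanking segment, modeled as its reverse complement."""
    return {name + "'": reverse_complement(segments[name]) for name in flanks}


def overlaps(a, b):
    """True if half-open spans a and b share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


spans = segment_spans(SEGMENTS)
u_blocks = design_u_blocks(SEGMENTS)
# Each U-block targets exactly the span of its flanking segment, so it
# should sit adjacent to, but not on, either barcode.
barcode_free = all(
    not overlaps(spans[name[0]], spans[bc])
    for name in u_blocks
    for bc in ("B", "F")
)
```

Because the U-blocks in this sketch never cover B or F, the same four oligonucleotides would apply unchanged if the barcode sequences were swapped for those of another sample.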
`
`Detailed Description
`
`Next generation sequencing (NGS) allows for sequencing and analysis of nucleic acids on a
`
`genome-wide scale by methods that are faster and cheaper than traditional methods of
`
`sequencing. Methods and compositions herein provide for improvements of advanced
`
`sequencing technologies that can be used to locate and identify genetic variations and/or
`
`associated diseases and disorders.
`
`In some embodiments, provided herein are methods that
`
`comprise, in part, manipulation and preparation of nucleic acid mixtures for NGS.
`
Sequencing applications with genomic nucleic acids as the target material often require

selection of nucleic acid targets of interest from a highly complex mixture. The quality of the

sequencing efforts often depends on the efficiency of the selection process, which, in turn, relies
`
`upon how well nucleic acid targets can be enriched relative to non-target sequences. Selection
`
`and enrichment of a nucleic acid library sometimes comprises capture of adapter-ligated inserts
`
`(e.g., genomic DNA inserts) by a hybrid capture approach.
`
`Most next generation sequencing library molecules contain non-native sequences (e.g., adapter
`
`nucleic acid, barcode sequences, primer binding sites, and universal sequences) which enable
`
their subsequent sequencing. During hybridization capture reactions, non-native sequences

can anneal to one another resulting in contamination of an enriched nucleic acid pool. A large

fraction of these unwanted sequences is often due to undesired hybridization events between
`
`portions of terminal adapter sequences that are ligated to library inserts. Sometimes multiple
`
`library inserts can non-specifically anneal to each other through their terminal adapters, thereby
`
`
`resulting in a “daisy chain” of otherwise unwanted DNA fragments being linked and isolated
`
`together.
`
One method of reducing the so-called “daisy chain” effect utilizes blocking nucleic acids
`
`directed to hybridize to large portions of adapter sequences. For traditional approaches, a
`
`blocking nucleic acid is required for each side of an adapter and each blocking nucleic acid
`
`contains a perfect complementary match to the adapter sequences (including the barcode
`
`sequences (e.g., index sequences)) contained in each of the adapters. For high throughput
`
`multiplex sequencing methods, multiple libraries are often mixed, each library consisting of
`
different adapter sequences and different barcode sequences. For such multiplex
`
`approaches, multiple sets of traditional blocking nucleic acids are required to be synthesized,
`
`each specific for the adapters of each library. This approach is cumbersome and costly and
`
requires manufacture of many different, relatively long oligonucleotides, which hinders efficient
`
`and cost-effective automation of a library preparation and sequencing process.
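
The scaling problem described above can be made concrete with a short Python calculation. It is a rough sketch under stated assumptions: every pooled library is taken to carry its own barcoded adapter on each of two sides, with no barcodes shared between libraries, and the function names are hypothetical. Under those assumptions the number of barcode-matched blocking oligonucleotides grows with the number of libraries, while a set of four universal blockers flanking the two barcodes stays constant.

```python
# Illustrative arithmetic only; assumes two adapter sides per library molecule.
def traditional_blocker_count(num_libraries: int, adapter_sides: int = 2) -> int:
    """Barcode-matched blockers: one per adapter side for every pooled library."""
    return num_libraries * adapter_sides


def u_block_count(adapter_sides: int = 2, blocks_per_side: int = 2) -> int:
    """U-blocks flank each barcode, so the count does not depend on library number."""
    return adapter_sides * blocks_per_side


for n in (4, 8, 96):
    print(n, traditional_blocker_count(n), u_block_count())
```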
`
`Provided herein, in some aspects, are novel and improved compositions for, and methods of,
`
`reducing unwanted capture events.
`
`In some embodiments, presented herein are novel U-block
`
(i.e., universal blocking) nucleic acids. The compositions comprising the novel U-block nucleic

acids presented herein and methods that utilize the U-block nucleic acids provided herein are less

costly than traditional approaches, increase efficiency and workflow, and are more favorable to
automation.
`
`Further, traditional applications of a hybrid capture approach often involve combining a pool of
`
adapter-ligated library inserts or amplicons thereof with Cot-1 DNA and blocking
`
`oligonucleotides followed by a drying step. The drying step is often conducted in a vacuum
`
`which is time consuming and is performed in an open system which provides for high risk of
`
`cross-contamination between samples. After drying, samples are denatured followed by
`
`annealing for several days. Biotinylated capture oligonucleotides (e.g., “baits”) are then added
`
`and the hybridized nucleic acids are typically pulled down with avidin coated beads. The
`
retained pool of nucleic acids is then eluted from the beads and can be introduced into an
`
`automated sequencing process. The above described procedure is inefficient and time
`
consuming, is not conducive to automation and can lead to cross-contamination.
`
Presented herein, in some aspects, are improved methods for manipulating and preparing a
`
`nucleic acid library for analysis (e.g., for high throughput sequencing) which methods do not
`
`require prolonged incubation times and/or a drying step.
`
`
`Subjects
`
`A subject can be any living or non-living organism, including but not limited to a human, non-
`
`human animal, plant, bacterium, fungus, virus or protist. A subject may be any age (e.g., an
`
`embryo, a fetus, infant, child, adult). A subject can be of any sex (e.g., male, female, or
`
`combination thereof). A subject may be pregnant. A subject can be a patient (e.g. a human
`
`patient).
`
`Samples
`
`
`Provided herein are methods and compositions for analyzing a sample. A sample (e.g., a
`
`sample comprising nucleic acid) can be obtained from a suitable subject. A sample can be
`
`isolated or obtained directly from a subject or part thereof.
`
`In some embodiments, a sample is
`
`obtained indirectly from an individual or medical professional. A sample can be any specimen
`
`that is isolated or obtained from a subject or part thereof. A sample can be any specimen that
`
`is isolated or obtained from multiple subjects. Non-limiting examples of specimens include fluid
`
`or tissue from a subject, including, without limitation, blood or a blood product (e.g., serum,
`
`plasma, platelets, buffy coats, or the like), umbilical cord blood, chorionic villi, amniotic fluid,
`
`cerebrospinal fluid, spinal fluid, lavage fluid (e.g., lung, gastric, peritoneal, ductal, ear,
`
`arthroscopic), a biopsy sample, celocentesis sample, cells (blood cells, lymphocytes, placental
`
`cells, stem cells, bone marrow derived cells, embryo or fetal cells) or parts thereof (e.g.,
`
`mitochondrial, nucleus, extracts, or the like), urine, feces, sputum, saliva, nasal mucous,
`
`prostate fluid, lavage, semen, lymphatic fluid, bile, tears, sweat, breast milk, breast fluid, the like
`
`or combinations thereof. A fluid or tissue sample from which nucleic acid is extracted may be
`
acellular (e.g., cell-free). Non-limiting examples of tissues include organ tissues (e.g., liver,
`
`kidney, lung, thymus, adrenals, skin, bladder, reproductive organs, intestine, colon, spleen,
`
`brain, the like or parts thereof), epithelial tissue, hair, hair follicles, ducts, canals, bone, eye,
`
`nose, mouth, throat, ear, nails, the like, parts thereof or combinations thereof. A sample may
`
`comprise cells or tissues that are normal, healthy, diseased (e.g., infected), and/or cancerous
`
`(e.g., cancer cells). A sample obtained from a subject may comprise cells or cellular material
`
`(e.g., nucleic acids) of multiple organisms (e.g., virus nucleic acid, fetal nucleic acid, bacterial
`
`nucleic acid, parasite nucleic acid).
`
`In some embodiments, a sample comprises nucleic acid, or fragments thereof. A sample can
`
comprise nucleic acids obtained from one or more subjects.
`
`In some embodiments a sample
`
`comprises nucleic acid obtained from a single subject.
`
`In some embodiments, a sample
`
`comprises a mixture of nucleic acids. A mixture of nucleic acids can comprise two or more
`
`
`
`
`nucleic acid species having different nucleotide sequences, different fragment lengths, different
`
`origins (e.g., genomic origins, cell or tissue origins, subject origins, the like or combinations
`
`thereof), or combinations thereof. A sample may comprise synthetic nucleic acid.
`
`Nucleic Acids
`
`
The term “nucleic acid” refers to one or more nucleic acids (e.g., a set or subset of nucleic

acids) of any composition, such as DNA (e.g., complementary DNA (cDNA), genomic DNA

(gDNA) and the like), RNA (e.g., messenger RNA (mRNA), short inhibitory RNA (siRNA),

ribosomal RNA (rRNA), tRNA, microRNA), and/or DNA or RNA analogs (e.g., containing base
`
`analogs, sugar analogs and/or a non-native backbone and the like), RNA/DNA hybrids and
`
`polyamide nucleic acids (PNAs), all of which can be in single- or double-stranded form, and
`
`unless otherwise limited, can encompass known analogs of natural nucleotides that can
`
`function in a similar manner as naturally occurring nucleotides. Unless specifically limited, the
`
`term encompasses nucleic acids comprising deoxyribonucleotides, ribonucleotides and known
`
`analogs of natural nucleotides. A nucleic acid may include, as equivalents, derivatives, or
`
`variants thereof, suitable analogs of RNA or DNA synthesized from nucleotide analogs, single-
`
`stranded ("sense" or "antisense", "plus" strand or "minus" strand, "forward" reading frame or
`
`"reverse" reading frame) and double-stranded polynucleotides. Nucleic acids may be single or
`
`double stranded. A nucleic acid can be of any length of 2 or more, 3 or more, 4 or more or 5 or
`
`more contiguous nucleotides. A nucleic acid can comprise a specific 5’ to 3’ order of
`
`nucleotides known in the art as a sequence (e.g., a nucleic acid sequence).
`
`A nucleic acid may be naturally occurring and/or may be synthesized, copied or altered by the
`
hand of man. For example, a nucleic acid may be an amplicon. A nucleic acid may be from a
`
`nucleic acid library, such as a gDNA, cDNA or RNA library, for example. A nucleic acid can be
`
`synthesized (e.g., chemically synthesized) or generated (e.g., by polymerase extension in vitro,
`
`e.g., by amplification, e.g., by PCR). A nucleic acid may be, or may be from, a plasmid, phage,
`
`virus, autonomously replicating sequence (ARS), centromere, artificial chromosome,
`
`chromosome, or other nucleic acid able to replicate or be replicated in vitro or in a host cell, a
`
`cell, a cell nucleus or cytoplasm of a cell in certain embodiments. Nucleic acids (e.g., a library
`
`of nucleic acids) may comprise nucleic acid from one sample or from two or more samples
`
`(e.g., from 1 or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or
`
`more, 9 or more, 10 or more, 11 or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or
`
`more, 17 or more, 18 or more, 19 or more, or 20 or more samples). Nucleic acid provided for
`
`processes or methods described herein may comprise nucleic acids from 1 to 1000, 1 to 500, 1
`
`to 200, 1 to 100, 1 to 50, 1 to 20 or 1 to 10 samples.
`
`
`
`
`The term "gene" means the segment of DNA involved in producing a polypeptide chain and can
`
`include regions preceding and following the coding region (leader and trailer) involved in the
`
`transcription/translation of the gene product and the regulation of the transcription/translation,
`
`as well as intervening sequences (introns) between individual coding segments (exons). A
`
`gene may not necessarily produce a peptide or may produce a truncated or non-functional
`
`protein due to genetic variation in a gene sequence (e.g., mutations in coding and non-coding
`
portions of a gene). A gene, whether functional or non-functional, can often be identified by
`
`homology to a gene in a reference genome.
`
`Oligonucleotides are relatively short nucleic acids. Oligonucleotides can be from about 2 to
`
150, 2 to 100, 2 to 50, or 2 to about 35 nucleotides in length.
`
`In some embodiments
`
`oligonucleotides are single stranded.
`
`In certain embodiments, oligonucleotides are primers.
`
`Primers are often configured to hybridize to a selected complementary nucleic acid and are
`
`configured to be extended by a polymerase after hybridizing.
`
`Nucleic Acid Isolation and Purification
`
`Nucleic acid may be derived, isolated, extracted, purified or partially purified from one or more
`
`subjects, one or more samples or one or more sources using suitable methods known in the
`
`art. Any suitable method can be used for isolating, extracting and/or purifying nucleic acid.
`
`The term “isolated” as used herein refers to nucleic acid removed from its original environment
`
`(e.g., the natural environment if it is naturally occurring, or a host cell if expressed
`
`exogenously), and thus is altered by human intervention (e.g., "by the hand of man") from its
`
`original environment. The term “isolated nucleic acid” as used herein can refer to a nucleic acid
`
`removed from a subject (e.g., a human subject). An isolated nucleic acid can be provided with
`
`fewer non-nucleic acid components (e.g., protein, lipid) than the amount of components present
`
`in a source sample. A composition comprising isolated nucleic acid can be about 50% to
`
`greater than 99% free of non-nucleic acid components. A composition comprising isolated
`
`nucleic acid can be about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater
`
than 99% free of non-nucleic acid components. The term “purified” as used herein can refer to
`
`a nucleic acid provided that contains fewer non-nucleic acid components (e.g., protein, lipid,
`
`carbohydrate, salts, buffers, detergents, and the like, or combinations thereof) than the amount
`
of non-nucleic acid components present prior to subjecting the nucleic acid to a purification
`
`procedure. A composition comprising purified nucleic acid may be at least about 60%, 70%,
`
`80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%,
`
`
96%, 97%, 98%, 99% or greater than 99% free of other non-nucleic acid components. A
`
`composition comprising purified nucleic acid may comprise at least 80%, 81%, 82%, 83%, 84%,
`
`85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or
`
`greater than 99% of the total nucleic acid present in a sample prior to application of a
`
`purification method.
`
`
`In some embodiments purifying a mixture (e.g., purifying nucleic acids in a mixture) provides
`
`purified nucleic acid.
`
`In certain embodiments, a mixture comprising nucleic acids of a library,
`
`blocking nucleic acids, capture nucleic acids, competitor nucleic acids and/or combinations
`
`thereof, is purified, thereby providing purified nucleic acid. Nucleic acid purification sometimes
`
`comprises a DNA clean-up column or DNA clean up beads. Various nucleic acid clean-up
`
`columns, resins, substrates and kits are known in the art. Any suitable nucleic acid purification
`
`methods, resin,