(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date: 6 April 2017 (06.04.2017)

(10) International Publication Number: WO 2017/059399 A1

(51) International Patent Classification:
C07H 21/00 (2006.01)
C12P 19/34 (2006.01)
C07H 21/02 (2006.01)
C12N 15/10 (2006.01)
C07H 21/04 (2006.01)

(21) International Application Number: PCT/US2016/055078

(22) International Filing Date: 1 October 2016 (01.10.2016)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data: 62/235,974, 1 October 2015 (01.10.2015), US

(71) Applicant: UNIVERSITY OF WASHINGTON [US/US]; 4545 Roosevelt Way NE, Suite 500, Seattle, WA 98105-4608 (US).

(72) Inventors: LAJOIE, Marc Joseph; c/o University of Washington, 4545 Roosevelt Way NE, Suite 500, Seattle, WA 98105-4608 (US). KLEIN, Jason Chesler; c/o University of Washington, 4545 Roosevelt Way NE, Suite 500, Seattle, WA 98105-4608 (US). SCHWARTZ, Jerrod Joseph; c/o University of Washington, 4545 Roosevelt Way NE, Suite 500, Seattle, WA 98105-4608 (US). BAKER, David; c/o University of Washington, 4545 Roosevelt Way NE, Suite 500, Seattle, WA 98105-4608 (US). SHENDURE, Jay Ashok; c/o University of Washington, 4545 Roosevelt Way NE, Suite 500, Seattle, WA 98105-4608 (US). STEWART, Lance Joseph; c/o University of Washington, 4545 Roosevelt Way NE, Suite 500, Seattle, WA 98105-4608 (US).

(74) Agent: BOSMAN, Joshua D.; McDonnell Boehnen Hulbert & Berghoff LLP, 300 South Wacker Drive, Chicago, IL 60606 (US).

(54) Title: MULTIPLEX PAIRWISE ASSEMBLY OF DNA OLIGONUCLEOTIDES

FIG. 1

(57) Abstract: The present invention provides methods for multiplex assembly of oligonucleotides.
`
`
`
`
(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DJ, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KN, KP, KR, KW, KZ, LA, LC, LK, LR, LS, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.

(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, ST, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG).

Published:

with international search report (Art. 21(3))

with sequence listing part of description (Rule 5.2(a))
`
`
`
`WO 2017/059399
`
PCT/US2016/055078
`
`MULTIPLEX PAIRWISE ASSEMBLY OF DNA OLIGONUCLEOTIDES
`
`CROSS REFERENCE
`
`[0001]
`
This application is related to U.S. provisional patent application, Serial No.
`
`62/235,974, filed October 1, 2015, the disclosure of which is incorporated by reference
`
`herein in its entirety.
`
`STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
`
`[0002]
`
This invention was made with U.S. government support under Department of

Energy-Lawrence Berkeley National Laboratory-Joint Genome Institute award number

DE-AC02-05CH11231, and National Institutes of Health (NIH) award number

1R21CA160080. The U.S. Government has certain rights in the invention.
`
`SEQUENCE LISTING
`
`[0003]
`
The sequence listing submitted herewith, entitled “16-1242-PCT_SequenceListing_ST25.txt”

and 7 kb in size, is incorporated by reference in its

entirety.
`
`BACKGROUND
`
`[0004]
`
Traditionally, DNA has been synthesized by solid-phase phosphoramidite chemistry. Column-based synthesis generates up to 200-mers with error rates of about 1 in 200 nucleotides and yields of 10 to 100 nmol per product. Column-based DNA synthesis is limited in throughput to 384-well plates, and oligonucleotides cost from $0.05 to $1.00 per base pair (bp) depending on length and yield. The commercialization of inkjet-based printing of nucleotides with phosphoramidite chemistries (e.g., Agilent) and semiconductor-based electrochemical acid production arrays (e.g., CustomArray) has increased throughput and decreased the cost of oligonucleotide synthesis. These oligonucleotides range from $0.00001-0.001/bp in cost, depending on length, scale and platform. However, these platforms are limited by short synthesis lengths, high synthesis error rates, low yield and the challenges of assembling long constructs from complex pools.
`
`[0005]
`
Many methods have recently addressed the high error rates of array-synthesized

oligonucleotides, with a trade-off between cost and fidelity. Low-cost
`
`
`methods include proteins such as MutS, polymerases and other proteins that bind and cut
`
`heteroduplexes. However, as these methods rely on identifying mismatches and require
`
`the majority of sequences to be identical, they are not always compatible with complex
`
`libraries and therefore must be performed after individual gene assemblies. Furthermore,
`
`as these methods retain error rates as high as 1 per 1000 nucleotides, further screening is
`
`required to confirm the correct sequence. More recent methods such as Dial-Out PCR rely
`
`on DNA sequencing followed by retrieval of sequence-verified constructs, achieving error
`
rates as low as 10^-7. While these methods can work on complex oligonucleotide pools and

yield very low error rates, they are costly, time-intensive and do not always recover
`
`targeted molecules.
`
`[0006]
`
`Despite their high error rates, inexpensive oligonucleotide pools cleaved from
`
`microarrays have recently enabled high-throughput analysis of promoter and enhancer
`
`function, providing novel insight into the vocabulary of these regulatory elements. They
`
`have also been used in deciphering the role of genetic variants in protein function.
`
However, these studies were all limited by short synthesis lengths: about 160 bp for
`
`CustomArray and 230 bp for Agilent.
`
`[0007]
`
`Short synthesis lengths and high error rates present bottlenecks to the use of
`
`array-derived oligonucleotides for both functional assays and gene assembly. Described
`
`herein is a method to assemble thousands of array-derived oligonucleotides into targets
`
`approaching length estimates of cis-regulatory elements and protein domains. Compared
`
`to existing methods, the methods described here do not limit sequence space by using
`
restriction enzymes, are high throughput, and offer an efficient way to retrieve error-free
`
`assemblies.
`
`SUMMARY OF THE INVENTION
`
`[0008]
`
`In a first aspect, the present invention provides a method for assembly of one
`
`or more double-stranded polynucleotides, the method comprising: (a) amplifying a first
`
`plurality of single-stranded overlapping oligonucleotides, wherein the first plurality of
`
`single-stranded overlapping oligonucleotides comprises: (i) overlapping regions with
`
homology capable of annealing to produce one or more double-stranded polynucleotides,
`
`and (ii) at least one common primer binding site in each single-stranded overlapping
`
oligonucleotide; and (b) assembling one or more double-stranded polynucleotides, wherein the
`
`assembling comprises denaturing, annealing and extending the first plurality of single-
`
`stranded overlapping oligonucleotides to generate the one or more double-stranded
`
`
`polynucleotides.
`
`[0009]
`
`The inventors have surprisingly discovered that methods of the present
`
`invention provide high-throughput, multiplex assembly of thousands of polynucleotides
`
between approximately 200-400 or more nucleotides in length. Furthermore, the methods

of the invention provide an efficient way to retrieve error-free assemblies of the thousands of

polynucleotides. These findings can provide methods for both complex library generation

and gene synthesis. For example, creating a library of 3,118 such 200 bp polynucleotides

would be ~38-fold less expensive than column-based synthesis methods (~0.84
`
`USD/target). The methods of the invention can be utilized to synthesize polynucleotide
`
`libraries at an unprecedented cost allowing researchers to address questions using
`
`precisely designed sequences rather than relying on biased mutagenesis methods.
`
`Moreover, the methods described herein can be used for gene synthesis, gene regulation,
`
`protein function and directed evolution, all of which have contributed to novel
`
`pharmaceuticals and a better understanding of genome organization. Finally, increasing
`
the length of polynucleotide assemblies that can be produced with low-cost,

high-complexity DNA synthesis will provide new opportunities for protein design and synthetic
`
`biology.
`
`[0010]
`
`In some embodiments, the method further comprises: (c) tagging the one or
`
`more double-stranded polynucleotides, wherein the tagging comprises amplifying the one
`
`or more double-stranded oligonucleotides using a pair of tagging primers to generate one
`
`or more tagged double-stranded polynucleotides, wherein each tagging primer in the pair
`
`of tagging primers comprises: (i) a first segment comprising a unique flanking sequence,
`
`and (ii) a second segment comprising a seed sequence, (d) sequencing the one or more
`
`tagged double-stranded polynucleotides, wherein the sequencing comprises binding of the
`
`seed sequence to a sequencing platform and performing a sequencing reaction to identify
`
`one or more sequence verified polynucleotides; and (e) retrieving the one or more
`
sequence verified polynucleotides, wherein the retrieving comprises base-pairing a
`
`complementary primer to the first segment of at least one tagging primer in the one or
`
`more sequence verified polynucleotides and, under conditions suitable and in the presence
`
`of suitable reagents, amplifying the sequence verified polynucleotides to produce one or
`
`more verified polynucleotides; or (c) phenotypic selection of functional polypeptides,
`
wherein the phenotypic selection comprises one or more of yeast display, phage
`
`display, mRNA display, ribosome display, mammalian cell display, bacterial cell display,
`
`
`emulsion-based protein selection, functional complementation of a portion of a genome, or
`
other selection methods known to experts in the field of polypeptide evolution.
`
`[0011]
`
`In another embodiment, the method further comprises step-wise assembly of
`
two or more of the double-stranded or verified polynucleotides into an assembled
`
`polynucleotide product, wherein the two or more double-stranded or verified
`
`polynucleotides have overlapping regions with homology capable of annealing and at least
`
`one common primer binding site in each of the double-stranded or verified
`
polynucleotides; and (f) combining the two or more double-stranded or verified
`
`polynucleotides under conditions suitable for annealing the overlapping regions with
`
`homology and in the presence of suitable reagents for assembling an initial desired
`
`polynucleotide product by extension of the double-stranded or verified polynucleotides to
`
`produce the initial desired polynucleotide product, and (g) combining the initial desired
`
`polynucleotide product and a next double-stranded or verified polynucleotide, wherein the
`
`initial desired polynucleotide product and the next double-stranded or verified
`
`polynucleotide have overlapping regions with homology capable of annealing and at least
`
`one common primer binding site in each of the double-stranded or verified
`
`polynucleotides, and assembling the initial desired polynucleotide product and the next
`
`double-stranded or verified polynucleotide in the presence of suitable reagents for
`
`assembling the assembled polynucleotide product by extension of the initial desired
`
polynucleotide product and the next double-stranded or verified polynucleotide; and (h)
`
`reiteratively repeating (g) to step-wise add additional next double-stranded or verified
`
`polynucleotides to the initial desired polynucleotide product to produce the assembled
`
`polynucleotide product.
`
`[0012]
`
`In yet another embodiment, the method further comprises hierarchical
`
`assembly of two or more of the double-stranded or verified polynucleotides into an
`
`assembled polynucleotide product, wherein the two or more double-stranded or verified
`
`polynucleotides have overlapping regions with homology capable of annealing and at least
`
`one common primer binding site in each of the double-stranded or verified
`
`polynucleotides; and (f) combining the two double-stranded or verified polynucleotides
`
`under conditions suitable for annealing the overlapping regions with homology and in the
`
`presence of suitable reagents for assembling a first desired polynucleotide product by
`
extension of the double-stranded or verified polynucleotides to produce the first desired

polynucleotide product; and (g) repeating (f) with another two double-stranded or verified
`
`
`polynucleotides to produce a second desired polynucleotide product; (e) combining the
`
`first desired polynucleotide product and the second desired polynucleotide product,
`
`wherein the first desired polynucleotide product and the second desired polynucleotide
`
`product have overlapping regions with homology capable of annealing and at least one
`
`common primer binding site in each of the first and the second desired polynucleotide
`
`products, and assembling the first desired polynucleotide product and the second desired
`
`polynucleotide product in the presence of suitable reagents for assembling the assembled
`
`polynucleotide product by extension of the first desired polynucleotide product and the
`
`second desired polynucleotide product; and (h) repeating (f), (g) and (e) to hierarchically
`
`assemble pairs of desired polynucleotides to produce the assembled polynucleotide
`
`product.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`[0013]
`
`The disclosed exemplary aspects have other advantages and features which
`
`will be more readily apparent from the detailed description, the appended claims, and the
`
`accompanying figures. A brief description of the drawings is below.
`
`[0014]
`
FIG.1 shows an overview of multiplex pairwise assembly. A total of 2,271 oligonucleotide targets were separated into 10 sets of 131-250 oligonucleotides. Each oligonucleotide was split into A and B fragments with overlapping sequences providing >56°C melting temperature (Tm) for PCR-mediated assembly. All oligonucleotides were cleaved off the array into one tube. Each sub-pool was then amplified with one common primer and one uracil-containing pool-specific primer. The uracil-containing pool-specific primer was then removed with Uracil Specific Excision Reagent (USER™) followed by New England BioLabs End Repair kit. During PCR assembly, corresponding sub-pools were allowed to anneal and extend through 5 cycles of PCR, before adding a set of common, outer primers for amplification. During PCR assembly, M13F and M13R sequences can be introduced to the polynucleotide products in order to allow for Dial-Out Tagging and retrieval of sequence-verified polynucleotide products. Up to 252-mers were assembled from 160-mer CustomArray oligonucleotides.
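The split into A and B fragments described for FIG.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `basic_tm` and `split_target`, the minimum overlap of 15 nt, and the simple GC-content Tm formula are all hypothetical choices, not the Tm model or design procedure used in the source. Adapters, primer sites and strand orientation are omitted.

```python
def basic_tm(seq: str) -> float:
    """Rough Tm estimate (deg C) for oligos longer than about 13 nt.

    Assumption: the simple GC-content formula 64.9 + 41 * (GC - 16.4) / N,
    which is not necessarily the Tm model used in the source.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def split_target(target: str, min_tm: float = 56.0, min_overlap: int = 15):
    """Split a target into overlapping A and B fragments.

    The overlap window is placed near the midpoint and widened one base at a
    time until its estimated Tm exceeds min_tm.
    Returns (fragment_a, fragment_b, overlap).
    """
    target = target.upper()
    mid = len(target) // 2
    for overlap_len in range(min_overlap, len(target) + 1):
        start = max(0, mid - overlap_len // 2)
        end = min(len(target), start + overlap_len)
        overlap = target[start:end]
        if basic_tm(overlap) > min_tm:
            # A carries the 5' part plus the overlap; B starts at the overlap.
            return target[:end], target[start:], overlap
    raise ValueError("no overlap window reaches the requested Tm")


if __name__ == "__main__":
    frag_a, frag_b, overlap = split_target("ATGC" * 15)
    print(len(frag_a), len(frag_b), len(overlap), round(basic_tm(overlap), 1))
```

Joining fragment A to the part of fragment B that follows the overlap reproduces the target, which is the property the PCR-mediated assembly relies on.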
`
`[0015]
`
FIG.2 shows a pipeline for generation of a static tag library. First, 1.2 million random 13-mers (5’-NNNNNNNNNNNNN-3’, SEQ ID NO:26) were generated and screened for no homoguanine or homocytosine stretches >5 bp (5’-ATTCGGCGGATAT-3’, SEQ ID NO:27), no homoadenine or homothymine stretches >8 bp and GC content between 45% and 65%. The 13-mers were also screened for <90% nucleotide identity in the last 10 bp, which generated a set of 7,411 13-mers. From this set of 7,411 sequences, every pairwise Gibbs free energy was calculated, and the maximum number of sequences such that no two members had a dG ≤ -9 kcal/mol was identified. This left a set of 4,637 sequences, which were split into a set of 2,318 forward tags and 2,319 reverse tags.
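The first screening stage of the tag pipeline (homopolymer and GC-content filters) might look like the following sketch. The helper names `longest_homopolymer`, `passes_screen` and `random_tags` are hypothetical, and the later stages described above (the <90% identity screen on the last 10 bp and the pairwise dG selection) are deliberately left out because they need a sequence-comparison and thermodynamics model that the text does not specify.

```python
import random
from itertools import groupby


def longest_homopolymer(seq: str, base: str) -> int:
    """Length of the longest uninterrupted run of `base` in `seq`."""
    return max((len(list(run)) for b, run in groupby(seq) if b == base), default=0)


def passes_screen(tag: str) -> bool:
    """Apply the homopolymer and GC-content filters described for FIG.2."""
    gc_fraction = (tag.count("G") + tag.count("C")) / len(tag)
    if not 0.45 <= gc_fraction <= 0.65:
        return False
    # No homoguanine or homocytosine stretches longer than 5 bp.
    if any(longest_homopolymer(tag, b) > 5 for b in "GC"):
        return False
    # No homoadenine or homothymine stretches longer than 8 bp.
    if any(longest_homopolymer(tag, b) > 8 for b in "AT"):
        return False
    return True


def random_tags(n: int, length: int = 13, seed: int = 0) -> list:
    """Generate n random tags of the given length (seeded for repeatability)."""
    rng = random.Random(seed)
    return ["".join(rng.choice("ACGT") for _ in range(length)) for _ in range(n)]


if __name__ == "__main__":
    candidates = random_tags(10_000)
    kept = [t for t in candidates if passes_screen(t)]
    print(f"{len(kept)} of {len(candidates)} random 13-mers pass the first screen")
```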
`
`[0016]
`
`FIG.3A shows a uniformity plot of error-free array-derived oligonucleotides
`
`by rank-ordered percentile for all 2,271 oligonucleotide targets assembled in sets of 131-
`
`250.
`
`[0017]
`
FIG.3B shows the number and size of oligonucleotide targets, and error-free
`
`yield for each set of oligonucleotides assembled in sets of 131-250.
`
`[0018]
`
`FIG.3C shows the percent yield of assemblies when assembling
`
oligonucleotide targets in sets of 131-250. Each oligonucleotide target is placed into a bin
`
`based on the limiting oligonucleotide count, which is the number of error-free reads out of
`
`1.2 million that are limiting for its corresponding oligonucleotide target. The percent yield
`
`of assemblies is the percentage of oligonucleotide targets in that bin with at least one
`
`perfect assembly.
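The binning behind FIG.3C can be illustrated as follows. This sketch assumes that the "limiting oligonucleotide count" of a target is the smaller of the error-free read counts of its two fragments; that reading, the function name `percent_yield_by_bin`, the bin edges and the example numbers are all assumptions made for illustration, not data from the source.

```python
from bisect import bisect_right
from collections import defaultdict


def percent_yield_by_bin(targets, bin_edges):
    """Percent of targets per bin with at least one perfect assembly.

    targets:   iterable of (error_free_reads_A, error_free_reads_B,
               perfect_assemblies) tuples, one per target.
    bin_edges: sorted lower edges; bin 0 holds counts below the first edge,
               bin i holds counts in [bin_edges[i - 1], bin_edges[i]).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for reads_a, reads_b, perfect in targets:
        # Assumption: the rarer fragment limits assembly of the target.
        limiting = min(reads_a, reads_b)
        bin_index = bisect_right(bin_edges, limiting)
        totals[bin_index] += 1
        if perfect >= 1:
            hits[bin_index] += 1
    return {b: 100.0 * hits[b] / totals[b] for b in sorted(totals)}


if __name__ == "__main__":
    # Made-up example counts, for illustration only.
    example = [(0, 12, 0), (3, 40, 1), (4, 9, 0), (25, 30, 2)]
    print(percent_yield_by_bin(example, bin_edges=[1, 10]))
```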
`
`[0019]
`
`FIG.3D shows the percentage of perfect, mismatch only, small indel (<5 bp),
`
large indel (≥5 bp), truncations and unmapped reads for all oligonucleotides when
`
`assembled in sets of 131-250.
`
`[0020]
`
`FIG.3E shows the percentage of perfect, mismatch only, small indel (<5 bp),
`
large indel (≥5 bp), chimeras, truncations and unmapped reads for each assembled library
`
`set when assembled in sets of 131-250.
`
`[0021]
`
`FIG.3F shows the uniformity of each set of oligonucleotide targets (sets 1-9
`
`are between 131-250 oligonucleotide targets and set 10 has 131 oligonucleotide targets).
`
`[0022]
`
`FIG.4A shows the effect of complexity on assembly performance and the
`
`percentage of oligonucleotide targets with at least one error-free assembly for each level of
`
`complexity.
`
`[0023]
`
`FIG.4B shows the effect of complexity on assembly performance and the
`
`yield (number of oligonucleotide targets with at least one perfect read) versus complexity.
`
Red bars show the total number of oligonucleotide targets with error-free assemblies at
`
`each level of complexity. Black bars show the number of oligonucleotide targets from the
`
`corresponding sets with error-free assemblies, which were individually assembled in sets
`
of complexity ranging from 131-250.
`
`
`[0024]
`
`FIG.4C shows the effect of complexity on assembly performance and that
`
`each oligonucleotide target is placed into a bin based on the limiting oligonucleotide
`
count, which is the number of error-free reads (out of 1.2 million), that are limiting for its
`
`corresponding oligonucleotide target. The percent yield of assemblies is the percentage of
`
`oligonucleotide targets in that bin with at least one perfect assembly.
`
`[0025]
`
`FIG.4D shows the effect of complexity on assembly performance and the
`
percentage of perfect, mismatch only, small indels (<5 bp), large indels (≥5 bp), chimeras,
`
`truncations and unmapped reads in sets of increasing complexity.
`
`[0026]
`
`FIG.4E shows the effect of complexity on assembly performance and the
`
`uniformity of each set of oligonucleotide targets.
`
`[0027]
`
`FIG.5A shows the error correction of assembled constructs and the per-base
`
`accuracy of assembled constructs in black and their corresponding oligonucleotides in red
`
`and blue. Increased accuracy is seen at both priming sites and the overlap region.
`
`[0028]
`
`FIG.5B shows the error correction of assembled constructs and the bar graphs
`
`for the percentage of tags identified on only one, two, three, four or at least five different
`
`molecules in the sequenced library. Orange (pool 2) and purple bars (pool 6) are two
`
different assembly sets, each with 250 oligonucleotide targets.
`
`[0029]
`
`FIG.5C shows the error correction of assembled constructs and the percentage
`
`of aligning reads that contain no errors for each of the 25 retrieved assemblies.
`
`[0030]
`
FIG.6A shows the percentage of perfect, mismatch only, small indels (<5 bp),

large indels (≥5 bp), chimeras, truncations, and unmapped reads for assemblies using one
`
`or two unique primers for initial amplification of oligonucleotides, for two independent
`
`sub pools when comparing one versus two unique primers per oligonucleotide pool. Pools
`
`of oligonucleotides were amplified off the array using either one unique primer (Uracil-
`
`containing A/B fragment primer) and one common primer (YF/YR), or two unique
`
`primers (Uracil-containing A/B fragment primer and A/B fragment unique F/R) (Table 1).
`
Each pool was then assembled and sequenced to 115,000 reads.
`
`[0031]
`
`FIG.6B shows the uniformity for one sub pool with one or two unique primers
`
`when comparing one versus two unique primers per oligonucleotide pool.
`
`[0032]
`
`FIG.7A shows a representative Sanger trace (SEQ ID NO:28) for 22/25
`
`retrieval reactions for dial-out PCR retrieval.
`
`[0033]
`
`FIG.7B shows a representative Sanger trace (SEQ ID NO:29) for 3/25
`
`retrieval reactions for dial-out PCR retrieval.
`
`
`[0034]
`
`FIG.8A shows oligonucleotide uniformity across 10,000 oligonucleotides
`
`corresponding to 10 sub-pools of oligonucleotide targets for assembly without duplicated
`
`oligonucleotides.
`
`[0035]
`
`FIG.8B shows assembly yield of sets of 500 oligonucleotide targets for
`
`assembly without duplicated oligonucleotides.
`
`[0036]
`
`FIG.8C shows aggregate data for assembly without duplicated
`
`oligonucleotides from all pools of 500. Each oligonucleotide target is placed into a bin
`
`based on the limiting oligonucleotide count, which is the number of error-free reads (out
`
`of 525K), that are limiting for its corresponding oligonucleotide target. Percent yield of
`
assemblies is the percentage of oligonucleotide targets in that bin with ≥1 perfect
`
`assembly.
`
`[0037]
`
`FIG.8D shows aggregate data for assembly without duplicated
`
`oligonucleotides from all pools of 2,000. Each oligonucleotide target is placed into a bin
`
`based on the limiting oligonucleotide count, which is the number of error-free reads (out
`
`of 525,000), that are limiting for its corresponding oligonucleotide target. Percent yield of
`
assemblies is the percentage of oligonucleotide targets in that bin with ≥1 perfect assembly.
`
`[0038]
`
`FIG.9 shows yield versus oligonucleotide target length. After assembly,
`
`oligonucleotide targets were binned according to their target size. Black bars show the %
`
of oligonucleotide targets assembled with at least one error-free yield in individual sub
`
`pools of 131-250. Red bars show the same breakdown for assembly in one pool of 2,271
`
`oligonucleotide targets.
`
`[0039]
`
FIG.10 shows the uniformity plots of set 1 and set 9 of oligonucleotide
`
`targets when performed with a higher quality, higher uniformity of input oligonucleotides
`
`from Twist compared to previous input oligonucleotides from CustomArray.
`
`[0040]
`
FIG.11 shows a uniformity plot of smaller sets of longer oligonucleotides

(230 bp sequences) from a different vendor (Agilent), which resulted in assembly of greater than

90% of 393 bp target sequences.
`
`[0041]
`
`FIG.12 shows an overview of hierarchical multiplex pairwise assembly.
`
`[0042]
`
`FIG.13 shows a DNA gel demonstrating hierarchical multiplex pairwise
`
`assembly.
`
`[0043]
`
`FIG.14 shows a uniformity plot of a hierarchical multiplex pairwise assembly.
`
`[0044]
`
FIG.15 demonstrates increased adapter cleavage efficiency using USER™
`
`cleavage with additional uracils for adapter cleavage.
`
`
`DETAILED DESCRIPTION OF THE INVENTION
`
`[0045]
`
`All references cited are herein incorporated by reference in their entirety.
`
`Within this application, unless otherwise stated, the techniques utilized may be found in
`
any of several well-known references such as: Molecular Cloning: A Laboratory Manual
`
`(Sambrook, et al., 1989, Cold Spring Harbor Laboratory Press), Gene Expression
`
`Technology (Methods in Enzymology, Vol. 185, edited by D. Goeddel, 1991. Academic
`
`Press, San Diego, CA), “Guide to Protein Purification” in Methods in Enzymology (M.P.
`
Deutscher, ed., (1990) Academic Press, Inc.), PCR Protocols: A Guide to Methods and

Applications (Innis, et al. 1990. Academic Press, San Diego, CA), Culture of Animal

Cells: A Manual of Basic Technique, 2nd Ed. (R.I. Freshney. 1987. Liss, Inc. New York,
`
`NY), Gene Transfer and Expression Protocols, pp. 109-128, ed. E.J. Murray, The Humana
`
`Press Inc., Clifton, N.J.), and the Ambion 1998 Catalog (Ambion, Austin, TX).
`
`[0046]
`
`Terms used in the claims and specification are defined as set forth below
`
`unless otherwise specified. In the case of direct conflict with a term used in a parent
`
`provisional patent application, the term used in the instant specification shall control.
`
`[0047]
`
`The particulars shown herein are by way of example and for purposes of
`
`illustrative discussion of the preferred embodiments of the present invention only and are
`
`presented in the cause of providing what is believed to be the most useful and readily
`
`understood description of the principles and conceptual aspects of various embodiments of
`
`the invention. In this regard, no attempt is made to show structural details of the invention
`
`in more detail than is necessary for the fundamental understanding of the invention, the
`
`description taken with the drawings and/or examples making apparent to those skilled in
`
`the art how the several forms of the invention may be embodied in practice.
`
`[0048]
`
`The following definitions and explanations are meant and intended to be
`
`controlling in any future construction unless clearly and unambiguously modified in the
`
`following examples or when application of the meaning renders any construction
`
`meaningless or essentially meaningless. In cases where the construction of the term would
`
`render it meaningless or essentially meaningless, the definition should be taken from
`
`Webster's Dictionary, 3rd Edition or a dictionary known to those of skill in the art, such as
`
`the Oxford Dictionary of Biochemistry and Molecular Biology (Ed. Anthony Smith,
`
`Oxford University Press, Oxford, 2004).
`
`[0049]
`
`As used herein, the singular forms "a", "an" and "the" include plural referents
`
`
`unless the context clearly dictates otherwise. “And” as used herein is interchangeably
`
`used with “or” unless expressly stated otherwise.
`
`[0050]
`
`The terms “nucleic acid,” "polynucleotide" and "oligonucleotide" are used
`
`interchangeably and refer to deoxyribonucleotides or ribonucleotides or modified forms of
`
`either type of nucleotides, and polymers thereof in either single- or double-stranded form.
`
`The terms should be understood to include equivalents, analogs of either RNA or DNA
`
`made from nucleotide analogs and as applicable to the embodiment being described, single
`
`stranded or double stranded polynucleotides. In certain embodiments, an oligonucleotide
`
`may be chemically synthesized.
`
`[0051]
`
`All embodiments disclosed herein can be used in combination unless the
`
`context clearly dictates otherwise.
`
`[0052]
`
`In a first aspect, the present invention provides a method for assembly of one
`
`or more double-stranded polynucleotides, the method comprising: (a) amplifying a first
`
`plurality of single-stranded overlapping oligonucleotides, wherein the first plurality of
`
`single-stranded overlapping oligonucleotides comprises: (i) overlapping regions with
`
`homology capable of annealing to produce one or more double-stranded polynucleotides,
`
and (ii) at least one common primer binding site in each single-stranded overlapping

oligonucleotide; and (b) assembling one or more double-stranded polynucleotides, wherein the
`
`assembling comprises denaturing, annealing and extending the first plurality of single-
`
`stranded overlapping oligonucleotides to generate the one or more double-stranded
`
polynucleotides.
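The denature, anneal and extend cycle of step (b) can be modelled in silico for a single pair of oligonucleotides. The sketch below is a simplification under stated assumptions: it treats annealing as an exact suffix/prefix match of at least `min_overlap` bases (a hypothetical parameter), ignores mismatches, primer binding sites and polymerase errors, and uses the made-up names `reverse_complement` and `overlap_extend`.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_extend(oligo_a: str, oligo_b: str, min_overlap: int = 15):
    """Model one anneal-and-extend step for two single-stranded oligos.

    oligo_a is the top strand (5'->3') of the left part of the product;
    oligo_b is the bottom strand (5'->3') of the right part.
    Returns the top strand of the extended duplex, or None if the 3' ends
    share no exact overlap of at least min_overlap bases.
    """
    b_top = reverse_complement(oligo_b)  # oligo_b written as top strand
    longest = min(len(oligo_a), len(b_top))
    for n in range(longest, min_overlap - 1, -1):
        # The 3' end of A pairs with the 3' end of B when, on the top
        # strand, the last n bases of A equal the first n bases of b_top.
        if oligo_a[-n:] == b_top[:n]:
            return oligo_a + b_top[n:]
    return None


if __name__ == "__main__":
    target = "ACGTTGCAAGCTTCGATCGGATCCTAGGCATGCAATTGGCCAGTA"
    a = target[:30]                       # left oligo, top strand
    b = reverse_complement(target[15:])   # right oligo, bottom strand
    print(overlap_extend(a, b) == target)
```

Searching from the longest candidate overlap downwards is a design choice that favours the most stable pairing; a real design tool would also need to check for unintended overlaps between different targets in the same pool.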
`
`[0053]
`
`In some embodiments, the first plurality of single-stranded overlapping
`
`oligonucleotides can be derived from an array. In such embodiments, the oligonucleotides
`
`may be obtained from a commercial source. For example, the oligonucleotides may be
`
`from arrays that are constructed, custom ordered or purchased from a commercial vendor.
`
`Such vendors include, but are not limited to, Agilent, Affymetrix, CustomArray,
`
`Nimblegen, MycroArray, LC Sciences and Twist. Single-stranded oligonucleotides are
`
`typically synthesized in situ on a common support wherein each oligonucleotide is
`
`synthesized on a separate spot on the substrate. In an embodiment, oligonucleotides can
`
be of any length, but are typically 10-400 bases long or longer. For example,
`
`oligonucleotides may be from 10 to about 300 nucleotides, from 20 to about 400
`
`nucleotides, from 30 to about 500 nucleotides, from 40 to about 600 nucleotides, or more
`
than about 600 nucleotides long. Accordingly, oligonucleotides of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590 and 600 nucleotides in length are contemplated.
`
`Oligonucleotides from such an array may be covalently attached to the surface or
`
`deposited on the surface. Various methods of array construction are known in the art (for
`
`example, maskless array synthesizers, light directed methods utilizing masks, flow channel
`
`methods, or spotting methods).
`
`[0054]
`
`In some embodiments, the plurality of single-stranded oligonucleotides can be
`
`two, three, four, 5 or more, 10 or more, 20 or more, 50 or more, 100 or more, 250 or more,
`
`500 or more, 1,000 or more, 1,500 or more, 2,000 or more, or 2,500 or more
`
`oligonucleotides. For example, a plurality can be approximately 2-100, 100-250,
`
`approximately 250-450, approximately 450-700, approximately 700-950, approximately
`
950-1,200, approximately 1,200-1,450, approximately 1,450-1,675, approximately 1,675-1,800,

approximately 1,800-2,025, or approximately 2,025-2,275 oligonucleotides. More
`
`specifically, a plurality can be 250, or 462, or 712, or 962, or 1212, or 1452, or 1674, or
`
`1805, or 2021 or 2271 oligonucleotides.
`
`[0055]
`
`The oligonucleotides and/or polynucleotides used and generated in the
`
`methods described herein can be predefined or have desired sequences, meaning that the
`
`sequences of the oligonucleotides and/or polynucleotides are known and chosen before
`
synthesis or assembly of the oligonucleotides and/or polynucleotides. In some

embodiments, the methods described herein use oligonucleotides and/or polynucleotides
`
with sequences determined based on the sequence of the final assembled polynucleotide
`
`products to be synthesized. It should be appreciated that different oligonucleotides may be
`
`designed to have different lengths. In some embodiments, the sequence of the assembled
`
`polynucleotide product may be divided up into a plurality of shorter oligonucleotide
`
`sequences that can be assembled step-wise, hierarchically and/or in parallel into a single or
`
`a plurality of desired or assembled polynucleotide products using the methods described
`
`herein. In certain embodiments, the predefined sequence of each of the oligonucleotides
`
in the first plurality of single-stranded overlapping oligonucleotides further comprises an
`
`
`adaptor sequence. In some embodiments, the adaptor sequence can comprise a degenerate
`
sequence that is a completely degenerate sequence or a partially degenerate sequence.
`
`[0056]
`
`In certain embodiments, the adaptor sequence may be of any suitable length.
`
`In some embodiments, the adaptor sequence is between approximately 5 to 30, 5 to 25, 5
`
to 20, 5 to 15, 5 to 10, 10 to 30, 10 to 25, 10 to 20, 10 to 15, 15 to 30, 15 to 25, 15 to 20,
`
`20 to 30, 20 to 25, 25 to 30 or more than 30 nucleotides in length. In other embodiments,
`
`the adaptor sequence is approximately 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
`
`20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more than 30 nucleotides in length.