(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date: 8 April 2004 (08.04.2004)

(10) International Publication Number: WO 2004/029220 A2

(51) International Patent Classification: C12N

(21) International Application Number: PCT/US2003/030940

(22) International Filing Date: 26 September 2003 (26.09.2003)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data: 60/414,085, 26 September 2002 (26.09.2002), US

(71) Applicant (for all designated States except US): KOSAN BIOSCIENCES, INC. [US/US]; 3832 Bay Center Place,
Hayward, CA 94545 (US).

(72) Inventors; and
(75) Inventors/Applicants (for US only): SANTI, Daniel, V. [IN/US]; 211 Belgrave Avenue, San Francisco, CA 94117
(US). REID, Ralph, C. [US/US]; 600 Galerita Way, San Rafael, CA 94903 (US). KODUMAL, Sarah, J. [US/US];
3933 Harrison Street, Apartment # 102, Oakland, CA 94611 (US). JAYARAJ, Sebastian [IN/US]; 1709 Shattuck
Avenue, Apartment # 214, Berkeley, CA 94709 (US).

(74) Agents: APPLE, Randolph, Ted et al.; Morrison & Foerster LLP, 755 Page Mill Road, Palo Alto, CA 94304 (US).

(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EC, EE, EG, ES, FI, GB, GD, GE,
GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR,
KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK,
MN, MW, MX, MZ, NI, NO, NZ, OM, PG, PH, PL, PT,
RO, RU, SC, SD, SE, SG, SK, SL, SY, TJ, TM, TN, TR,
TT, TZ, UA, UG, US, UZ, VC, VN, YU, ZA, ZM, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW),
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM),
European patent (AT, BE, BG, CH, CY, CZ, DE, DK, EE,
ES, FI, FR, GB, GR, HU, IE, IT, LU, MC, NL, PT, RO,
SE, SI, SK, TR), OAPI patent (BF, BJ, CF, CG, CI, CM,
GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).

Published:
without international search report and to be republished upon receipt of that report

For two-letter codes and other abbreviations, refer to the "Guidance Notes on Codes and Abbreviations"
appearing at the beginning of each regular issue of the PCT Gazette.

(54) Title: SYNTHETIC GENES
`
(57) Abstract: The invention provides strategies, methods, vectors, reagents, and systems for production of
synthetic genes, production of libraries of such genes, and manipulation and characterization of the genes
and corresponding encoded polypeptides. In one aspect, the synthetic genes can encode polyketide synthase
polypeptides and facilitate production of therapeutically or commercially important polyketide compounds.
`
`WO 2004/029220
`
`PCT/US2003/030940
`
`SYNTHETIC GENES
`
`STATEMENT CONCERNING GOVERNMENT SUPPORT
`
[0001] Subject matter disclosed in this application was made, in part, with government
support under National Institute of Standards and Technology ATP Grant No. 70NANB2H3014.
As such, the United States government may have certain rights in this invention.
`
`CROSS-REFERENCE TO RELATED APPLICATIONS
`
[0002] This application claims benefit under 35 U.S.C. § 119(e) of provisional application
No. 60/414,085, filed 26 September 2002, the contents of which are incorporated herein by
reference.
`
`FIELD OF THE INVENTION
`
[0003] The invention provides strategies, methods, vectors, reagents, and systems for
production of synthetic genes, production of libraries of such genes, and manipulation and
characterization of the genes and corresponding encoded polypeptides. In one aspect, the
synthetic genes can encode polyketide synthase polypeptides and facilitate production of
therapeutically or commercially important polyketide compounds. The invention finds
application in the fields of human and veterinary medicine, pharmacology, agriculture, and
molecular biology.
`
`BACKGROUND
`
[0004] Polyketides represent a large family of compounds produced by fungi, mycelial
bacteria, and other organisms. Numerous polyketides have therapeutically relevant and/or
commercially valuable activities. Examples of useful polyketides include erythromycin, FK-506,
FK-520, megalomycin, narbomycin, oleandomycin, picromycin, rapamycin, spinocyn, and
tylosin.
[0005] Polyketides are synthesized in nature from 2-carbon units through a series of
condensations and subsequent modifications by polyketide synthases (PKSs). Polyketide
synthases are multifunctional enzyme complexes composed of multiple large polypeptides. Each
of the polypeptide components of the complex is encoded by a separate open reading frame, with
the open reading frames corresponding to a particular PKS typically being clustered together on
the chromosome. The structure of PKSs and the mechanisms of polyketide synthesis are
reviewed in Cane et al., 1998, “Harnessing the biosynthetic code: combinations, permutations,
and mutations” Science 282:63-8.
`
[0006] PKS polypeptides comprise numerous enzymatic and carrier domains, including
acyltransferase (AT), acyl carrier protein (ACP), and beta-ketoacylsynthase (KS) activities,
involved in loading and condensation steps; ketoreductase (KR), dehydratase (DH), and
enoylreductase (ER) activities, involved in modification at β-carbon positions of the growing
chain; and thioesterase (TE) activities involved in release of the polyketide from the PKS.
Various combinations of these domains are organized in units called “modules.” For example,
the 6-deoxyerythronolide B synthase ("DEBS"), which is involved in the production of
erythromycin, comprises 6 modules on three separate polypeptides (2 modules per polypeptide).
The number, sequence, and domain content of the modules of a PKS determine the structure of
the polyketide product of the PKS.
[0007] Given the importance of polyketides, the difficulty in producing polyketide
compounds by traditional chemical methods, and the typically low production of polyketides in
wild-type cells, there has been considerable interest in finding improved or alternate means for
producing polyketide compounds. This interest has resulted in the cloning, analysis and
manipulation by recombinant DNA technology of genes that encode PKS enzymes. The resulting
technology allows one to manipulate a known PKS gene cluster to produce the polyketide
synthesized by that PKS at higher levels than occur in nature, or in hosts that otherwise do not
produce the polyketide. The technology also allows one to produce molecules that are
structurally related to, but distinct from, the polyketides produced from known PKS gene clusters
by inactivating a domain in the PKS and/or by adding a domain not normally found in the PKS
through manipulation of the PKS gene.
[0008] While the detailed understanding of the mechanisms by which PKS enzymes function
and the development of methods for manipulating PKS genes have facilitated the creation of
novel polyketides, there are presently limits to the creation of novel polyketides by genetic
engineering. One such limit is the availability of PKS genes. Many polyketides are known but
only a relatively small portion of the corresponding PKS genes have been cloned and are
available for manipulation. Moreover, in many instances the organism producing an interesting
polyketide is obtainable only with great difficulty and expense, and techniques for its growth in
the laboratory, and for production of the polyketide it produces, are unknown or difficult or
time-consuming to practice. Also, even if the PKS genes for a desired polyketide have been cloned,
those genes may not serve to drive the level of production desired in a particular host cell.
[0009] If there were a method to produce a desired polyketide without having to access the
genes that encode the PKS that produces the polyketide, then many of these difficulties could be
ameliorated or avoided altogether. The present invention meets this and other needs.
`
`BRIEF SUMMARY OF THE INVENTION
[0010] In one aspect, the invention provides a synthetic gene encoding a polypeptide
segment that corresponds to a reference polypeptide segment encoded by a naturally occurring
gene. The polypeptide segment-encoding sequence of the synthetic gene is different from the
polypeptide segment-encoding sequence of the naturally occurring gene. In one aspect, the
polypeptide segment-encoding sequence of the synthetic gene is less than about 90% identical to
the polypeptide segment-encoding sequence of the naturally occurring gene, or in some
embodiments, less than about 85% or less than about 80% identical. In one aspect, the
polypeptide segment-encoding sequence of the synthetic gene comprises at least one (and in
other embodiments, more than one, e.g., at least two, at least three, or at least four) unique
restriction sites that are not present or are not unique in the polypeptide segment-encoding
sequence of the naturally occurring gene. In an aspect, the polypeptide segment-encoding
sequence of the synthetic gene is free from at least one restriction site that is present in the
polypeptide segment-encoding sequence of the naturally occurring gene. In an embodiment of
the invention, the polypeptide segment encoded by the synthetic gene corresponds to at least 50
contiguous amino acid residues encoded by the naturally occurring gene.
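
The identity threshold above can be illustrated with a minimal sketch. It assumes the two coding sequences have equal length and are compared position by position without gaps (a real comparison would normally use a sequence alignment). The function names and both example sequences are hypothetical and do not come from the application.

```python
# Standard genetic code, laid out in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def percent_identity(seq_a, seq_b):
    """Ungapped percent identity of two equal-length sequences (an assumption)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical example: two coding sequences for the same six residues.
natural = "ATGCTGTCCGGCCGCGTG"
synthetic = "ATGTTAAGTGGTAGAGTT"

print(translate(natural), translate(synthetic))
print(percent_identity(natural, synthetic))
```

In this toy case both sequences encode the same peptide while sharing only half of their nucleotide positions, which is the kind of divergence the "less than about 90% identical" language contemplates.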
[0011] In an embodiment, the polypeptide segment is from a polyketide synthase (PKS) and
may be or include a PKS domain (e.g., AT, ACP, KS, KR, DH, ER, and/or TE) or one or more
PKS modules. In some embodiments, the synthetic PKS gene has, at most, one copy per
module-encoding sequence of a restriction enzyme recognition site selected from the group
consisting of Spe I, Mfe I, Afl II, Bsi WI, Sac II, Ngo MIV, Nhe I, Kpn I, Msc I, Bgl I, Bss HII,
Sac II, Age I, Pst I, Kas I, Mlu I, Xba I, Sph I, Bsp E, and Ngo MIV recognition sites. In an
embodiment, the polypeptide segment-encoding sequence of the synthetic gene is free from at
least one Type IIS enzyme restriction site (e.g., Bci VI, Bmr I, Bpm I, Bpu EI, Bse RI, Bsg I, Bsr
DI, Bts I, Eci I, Ear I, Sap I, Bsm BI, Bsp MI, Bsa I, Bbs I, Bfu AI, Fok I and Alw I) present in
the polypeptide segment-encoding sequence of the naturally occurring gene.
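
The "at most one copy per module-encoding sequence" condition can be checked mechanically. The following is a minimal sketch, assuming the well-known palindromic recognition sequences for six of the listed enzymes, so that scanning one strand is sufficient. The helper names and the example fragment are hypothetical.

```python
# Recognition sequences for a few of the enzymes named above. All six are
# palindromic, so searching a single strand finds every occurrence.
SITES = {
    "Spe I": "ACTAGT",
    "Mfe I": "CAATTG",
    "Kpn I": "GGTACC",
    "Pst I": "CTGCAG",
    "Xba I": "TCTAGA",
    "Age I": "ACCGGT",
}


def count_sites(dna, site):
    """Count occurrences of site in dna, including overlapping ones."""
    dna, site = dna.upper(), site.upper()
    return sum(dna.startswith(site, i) for i in range(len(dna) - len(site) + 1))


def site_report(dna, sites=SITES):
    """Map each enzyme name to the number of its sites found in dna."""
    return {name: count_sites(dna, seq) for name, seq in sites.items()}


def at_most_one_copy(dna, sites=SITES):
    """True if no listed site occurs more than once in dna."""
    return all(n <= 1 for n in site_report(dna, sites).values())


# Hypothetical fragment standing in for a module-encoding sequence.
module = "ACTAGTGGCCAATTGTTCGGTACCAAGCTGCAG"
print(site_report(module))
print(at_most_one_copy(module))
```

A design tool would run such a check on every module-encoding sequence and flag any site that appears twice, since a duplicated site is no longer useful as a unique cloning handle.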
[0012] In a related embodiment, the invention provides a synthetic gene encoding a
polypeptide segment that corresponds to a reference polypeptide segment encoded by a naturally
occurring PKS gene, where the polypeptide segment-encoding sequence of the synthetic gene is
different from the polypeptide segment-encoding sequence of the naturally occurring PKS gene
and comprises at least two of (a) a Spe I site near the sequence encoding the amino-terminus of
the module; (b) a Mfe I site near the sequence encoding the amino-terminus of a KS domain; (c)
a Kpn I site near the sequence encoding the carboxy-terminus of a KS domain; (d) a Msc I site
near the sequence encoding the amino-terminus of an AT domain; (e) a Pst I site near the
sequence encoding the carboxy-terminus of an AT domain; (f) a Bsr BI site near the sequence
encoding the amino-terminus of an ER domain; (g) an Age I site near the sequence encoding the
amino-terminus of a KR domain; and (h) an Xba I site near the sequence encoding the
amino-terminus of an ACP domain.
`
[0013] In related aspects, the invention provides a vector (e.g., cloning or expression vector)
comprising a synthetic gene of the invention. In an embodiment, the vector comprises an open
reading frame encoding a first PKS module and one or more of (a) a PKS extension module; (b)
a PKS loading module; (c) a releasing (e.g., thioesterase) domain; and (d) an interpolypeptide
linker.
`
[0014] Cells that comprise or express a gene or vector of the invention are provided, as well
as a cell comprising a polypeptide encoded by the vector, or a functional polyketide synthase,
wherein the PKS comprises a polypeptide encoded by the vector. In one aspect, a PKS
polypeptide having a non-natural amino acid sequence is provided, such as a polypeptide
characterized by a KS domain comprising the dipeptide Leu-Gln at the carboxy-terminal edge of
the domain; and/or an ACP domain comprising the dipeptide Ser-Ser at the carboxy-terminal
edge of the domain. A method is provided for making a polyketide comprising culturing a cell
comprising a synthetic DNA of the invention under conditions in which a polyketide is
produced, wherein the polyketide would not be produced by the cell in the absence of the vector.
`
[0015] In one aspect, the invention provides a method for high throughput synthesis of a
plurality of different DNA units comprising different polypeptide encoding sequences
comprising: for each DNA unit, performing polymerase chain reaction (PCR) amplification of a
plurality of overlapping oligonucleotides to generate a DNA unit encoding a polypeptide
segment and adding UDG-containing linkers to the 5’ and 3’ ends of the DNA unit by PCR
amplification, thereby generating a linkered DNA unit, wherein the same UDG-containing
linkers are added to said different DNA units. In embodiments, the plurality comprises more
than 50 different DNA units, more than 100 different DNA units, or more than 500 different
DNA units (synthons). In a related aspect, the invention provides a method for producing a
vector comprising a polypeptide encoding sequence comprising cloning the linkered DNA unit
into a vector using a ligation-independent-cloning method.
`
[0016] The invention provides gene libraries. In one embodiment, a gene library is provided
that contains a plurality of different PKS module-encoding genes, where the module-encoding
genes in the library have at least one (or more than one, such as at least 3, at least 4, at least 5 or
at least 6) restriction site(s) in common, the restriction site is found no more than one time in
each module, and the modules encoded in the library correspond to modules from five or more
different polyketide synthase proteins. Vectors for gene libraries include cloning and expression
vectors. In some embodiments, a library includes open reading frames that contain an extension
module and at least one of a second PKS extension module, a PKS loading module, a
thioesterase domain, and an interpolypeptide linker.
`
[0017] In a related aspect, the invention provides a method for synthesis of an expression
library of PKS module-encoding genes by making a plurality of different PKS module-encoding
genes as described above and cloning each gene into an expression vector. The library may
include, for example, at least about 50 or at least about 100 different module-encoding genes.
[0018] The invention provides a variety of cloning vectors useful for stitching (e.g., a vector
comprising, in the order shown, SM4 - SIS - SM2 - R1; or L - SIS - SM2 - R1, where SIS is a
synthon insertion site, SM2 is a sequence encoding a first selectable marker, SM4 is a sequence
encoding a second selectable marker different from the first, R1 is a recognition site for a
restriction enzyme, and L is a recognition site for a different restriction enzyme). The invention
further provides vectors comprising synthon sequences, e.g. comprising, in the order shown,
SM4 - 2S1 - Sy1 - 2S2 - SM2 - R1, or L - 2S1 - Sy2 - 2S2 - SM2 - R1, where 2S1 is a
recognition site for a first Type IIS restriction enzyme, 2S2 is a recognition site for a different Type
IIS restriction enzyme, and Sy is a synthon coding region. Also provided are compositions of a
vector and a Type IIS or other restriction enzyme that recognizes a site on the vector,
compositions comprising cognate pairs of vectors, kits, and the like.
[0019] In one embodiment, the invention provides a vector comprising a first selectable
marker, a restriction site (R1) recognized by a first restriction enzyme, and a synthon coding
region that is flanked by a restriction site recognized by a first Type IIS restriction enzyme and a
restriction site recognized by a second Type IIS restriction enzyme, wherein digestion of the
vector with the first restriction enzyme and the first Type IIS restriction enzyme produces a
fragment comprising the first selectable marker and the synthon coding region, and digestion of
the vector with the first restriction enzyme and the second Type IIS restriction enzyme produces
a fragment comprising the synthon coding region and not comprising the first selectable marker.
In an embodiment, the vector comprises a second selectable marker, wherein digestion of the
vector with the first restriction enzyme and the first Type IIS restriction enzyme produces a
fragment comprising the first selectable marker and the synthon coding region, and not
comprising the second selectable marker, and digestion of the vector with the first restriction enzyme
and the second Type IIS restriction enzyme produces a fragment comprising the second
selectable marker and the synthon coding region, and not comprising the first selectable marker.
The invention provides methods of stitching adjacent DNA units (synthons) to synthesize a
larger unit. For example, the invention provides a method for making a synthetic gene encoding
a PKS module by producing a plurality (i.e., at least 3) of DNA units by assembly PCR, wherein
each DNA unit encodes a portion of the PKS module, and combining the plurality of DNA units
in a predetermined sequence to produce a PKS module-encoding gene. In an embodiment, the
method includes combining the module-encoding gene in-frame with a nucleotide sequence
encoding a PKS extension module, a PKS loading module, a thioesterase domain, or a PKS
interpolypeptide linker, to produce a PKS open reading frame.
[0020] In a related embodiment, the invention provides a method for joining a series of
DNA units using a vector pair by a) providing a first set of DNA units, each in a first-type
selectable vector comprising a first selectable marker and providing a second set of DNA units,
each in a second-type selectable vector comprising a second selectable marker different from the
first, wherein the first-type and second-type selectable vectors can be selected based on the
different selectable markers, b) recombinantly joining a DNA unit from the first set with an
adjacent DNA unit from the second set to generate a first-type selectable vector comprising a
third DNA unit, and obtaining a desired clone by selecting for the first selectable marker, c)
recombinantly joining the third DNA unit with an adjacent DNA unit from the second set to
generate a first-type selectable vector comprising a fourth DNA unit, and obtaining a desired
clone by selecting for the first selectable marker, or recombinantly joining the third DNA unit
with an adjacent DNA unit from the second set to generate a second-type selectable vector
comprising a fourth DNA unit, and obtaining a desired clone by selecting for the second
selectable marker. In an embodiment, step (c) comprises recombinantly joining the third
DNA unit with an adjacent DNA unit from the second set to generate a first-type selectable
vector comprising a fourth DNA unit, and obtaining a desired clone by selecting for the first
selectable marker, the method further comprising recombinantly combining the fourth DNA unit
with an adjacent DNA unit from the second set to generate a first-type selectable vector
comprising a fifth DNA unit, and obtaining a desired clone by selecting for the first selection
marker, or recombinantly combining the third DNA unit with an adjacent DNA unit from the
second set to generate a second-type selectable vector comprising a fifth DNA unit, and
obtaining a desired clone by selecting for the second selection marker. In an embodiment, step
(c) comprises recombinantly joining the third DNA unit with an adjacent DNA unit from the
second series to generate a second-type selectable vector comprising a fourth DNA unit, and
obtaining a desired clone by selecting for the second selectable marker, the method further
comprising recombinantly joining the fourth DNA unit with an adjacent DNA unit from the first
set to generate a first-type selectable vector comprising a fifth DNA unit, and obtaining a desired
clone by selecting for the first selection marker, or recombinantly joining the third DNA unit
with an adjacent DNA unit from the second set to generate a first-type selectable vector
comprising a fifth DNA unit and obtaining a desired clone by selecting for the first selection
marker.
`
[0021] In a related aspect, the invention provides a method for joining a series of DNA units
to generate a DNA construct by (a) providing a first plurality of vectors, each comprising a DNA
unit and a first selectable marker; (b) providing a second plurality of vectors, each comprising a
DNA unit and a second selectable marker; (c) digesting a vector from (a) to produce a first
fragment containing a DNA unit and at least one additional fragment not containing the DNA
unit; (d) digesting a DNA from (b) to produce a second fragment containing a DNA unit and at
least one additional fragment not containing the DNA unit, where only one of the first and
second fragments contains an origin of replication; ligating the fragments to generate a product
vector comprising a DNA unit from (c) ligated to a DNA unit from (d); selecting the product
vector by selecting for either the first or second selectable marker; (e) digesting the product
vector to produce a third fragment containing a DNA unit and at least one additional fragment
not containing the DNA unit; (d) digesting a DNA from (a) or (b) to produce a fourth fragment
containing a DNA unit and at least one additional fragment not containing the DNA unit, where
only one of the third and fourth fragments contains an origin of replication; (f) ligating the third
and fourth fragments to generate a product vector comprising a DNA unit from (e) ligated to a
DNA unit from (d) and selecting the product vector by selecting for either the first or second
selectable marker.
`
[0022] In another aspect, an open reading frame vector is provided, which has an internal
type {4-[7-*]-[*-8]-3}, left-edge type {4-[7-1]-[*-8]-3} or right-edge type {4-[7-*]-[6-8]-3}
architecture where 7 and 8 are recognition sites for Type IIS restriction enzymes which cut to
produce compatible overhangs “*”; 1 and 6 are Type II restriction enzyme sites that are
optionally present; and 3 and 4 are recognition sites for restriction enzymes with 8-basepair
recognition sites. In various embodiments, 1 is Nde I and/or 6 is Eco RI and/or 4 is Not I and/or
3 is Pac I.
[0023] In another aspect, a method for identifying restriction enzyme recognition sites useful
for design of synthetic genes is provided. The method includes the steps of obtaining amino acid
sequences for a plurality of functionally related polypeptide segments; reverse-translating the
amino acid sequences to produce multiple polypeptide segment-encoding nucleic acid sequences
for each polypeptide segment; and identifying restriction enzyme recognition sites that are found
in at least one polypeptide segment-encoding nucleic acid sequence of at least about 50% of the
polypeptide segments. In certain embodiments, the functionally related polypeptide segments
are polyketide synthase modules or domains, such as regions of high homology in PKS modules
or domains.
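
The site-identification steps above can be sketched as follows. Rather than enumerating every reverse translation of a whole segment, this sketch checks each short window of residues in each reading-frame offset, which gives the same answer to "can any encoding of this segment contain the site?". The function names, the example peptides, and the 0.5 default threshold parameter are illustrative assumptions.

```python
from itertools import product

# Standard genetic code in TCAG order, inverted to amino acid -> codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for i, a in enumerate(BASES):
    for j, b in enumerate(BASES):
        for k, c in enumerate(BASES):
            CODONS_FOR.setdefault(AMINO[16 * i + 4 * j + k], []).append(a + b + c)


def can_encode(protein, site):
    """True if at least one reverse translation of protein contains site."""
    for offset in range(3):                        # site start within a codon
        n_codons = (offset + len(site) + 2) // 3   # codons the site overlaps
        for start in range(len(protein) - n_codons + 1):
            window = protein[start:start + n_codons]
            for codons in product(*(CODONS_FOR[aa] for aa in window)):
                if "".join(codons)[offset:offset + len(site)] == site:
                    return True
    return False


def shared_sites(proteins, sites, min_fraction=0.5):
    """Names of sites encodable in at least min_fraction of the segments."""
    needed = min_fraction * len(proteins)
    return [name for name, seq in sites.items()
            if sum(can_encode(p, seq) for p in proteins) >= needed]


# Hypothetical peptide segments and two candidate sites.
segments = ["MTSV", "MWWM", "ATSA"]
candidates = {"Spe I": "ACTAGT", "Kpn I": "GGTACC"}
print(shared_sites(segments, candidates))
```

Here Spe I (ACTAGT) can be written over any Thr-Ser dipeptide, so it is encodable in two of the three toy segments, whereas Kpn I cannot be encoded in any of them.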
`
[0024] In a method for designing a synthetic gene in accordance with the present invention a
reference amino acid sequence is provided and reverse translated to a randomized nucleotide
sequence which encodes the amino acid sequence using a random selection of codons which,
optionally, have been optimized for a codon preference of a host organism. One or more
parameters for positions of restriction sites on a sequence of the synthetic gene are provided and
occurrences of one or more selected restriction sites from the randomized nucleotide sequence
are removed. One or more selected restriction sites are inserted at selected positions in the
randomized nucleotide sequence to generate a sequence of the synthetic gene.
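
The randomized reverse-translation step can be sketched as below. This is a minimal illustration, assuming the standard genetic code and an optional table of relative codon weights standing in for a host codon preference; the weights shown are invented for the example and are not host data. Site removal and insertion, the later steps of the method, are not covered here.

```python
import random

# Standard genetic code in TCAG order, plus its inverse.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def reverse_translate(protein, rng, weights=None):
    """Choose one codon per residue at random.

    weights is an optional dict of codon -> relative weight (hypothetical
    host preference); codons missing from it default to a weight of 1.0.
    """
    chosen = []
    for aa in protein:
        codons = CODONS_FOR[aa]
        w = [weights.get(c, 1.0) for c in codons] if weights else None
        chosen.append(rng.choices(codons, weights=w, k=1)[0])
    return "".join(chosen)


rng = random.Random(0)                      # seeded for repeatability
dna = reverse_translate("MKTAYW", rng)
print(dna, translate(dna))

# Invented weights that force a single alanine codon.
only_gcg = {"GCT": 0.0, "GCC": 0.0, "GCA": 0.0, "GCG": 1.0}
print(reverse_translate("AAAA", rng, only_gcg))
```

Whatever codons are drawn, translating the result back must reproduce the reference amino acid sequence, which is a convenient invariant to assert in any implementation.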
`
[0025] In one aspect of the invention, a set of overlapping oligonucleotide sequences which
together comprise a sequence of the synthetic gene are generated.

[0026] In another aspect of the invention, one or more parameters for positions of restriction
sites on a sequence of the synthetic gene comprise one or more preselected restriction sites at
selected positions.

[0027] In another aspect of the invention, the selected position of the preselected restriction
site corresponds to a position selected from the group consisting of a synthon edge, a domain
edge and a module edge.

[0028] In another aspect of the invention, providing one or more parameters for positions of
restriction sites on a sequence of the synthetic gene is followed by predicting all possible
restriction sites that can be inserted in the randomized nucleotide sequence and optionally,
identifying one or more unique restriction sites.

[0029] In another aspect of the invention, the sequence of the synthetic gene is divided into a
series of synthons of selected length and then a set of overlapping oligonucleotide sequences is
generated which together comprise a sequence of each synthon.
`
[0030] In another aspect of the invention, the set of overlapping oligonucleotide sequences
comprise (a) oligonucleotide sequences which together comprise a synthon coding region
corresponding to the synthetic gene, and (b) oligonucleotide sequences which comprise one or
more synthon flanking sequences.

[0031] In another aspect of the invention, one or more quality tests are performed on the set
of overlapping oligonucleotide sequences, wherein the tests are selected from the group
consisting of: translational errors, invalid restriction sites, incorrect positions of restriction sites,
and aberrant priming.

[0032] In another aspect of the invention, each oligonucleotide sequence is of a selected
length and comprises an overlap of a predetermined length with adjacent oligonucleotides of the
set of oligonucleotides which together comprise the sequence of the synthetic gene.
`
[0033] In another aspect of the invention, each oligonucleotide is about 40 nucleotides in
length and comprises overlaps of between about 17 and 23 nucleotides with adjacent
oligonucleotides.
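
The oligonucleotide layout described above can be sketched in a few lines. This example assumes a fixed 40-nucleotide oligonucleotide length, a fixed 20-nucleotide overlap (inside the stated 17 to 23 range), and oligonucleotides that alternate between the top and bottom strands, as is common for assembly PCR. The function names and the test sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def design_oligos(seq, oligo_len=40, overlap=20):
    """Tile seq with oligos that overlap their neighbours by `overlap` nt.

    Even-numbered oligos are top strand; odd-numbered oligos are returned
    as the reverse complement (bottom strand). The last oligo may be
    shorter than oligo_len if the sequence length does not divide evenly.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be between 0 and oligo_len")
    step = oligo_len - overlap
    oligos = []
    for n, start in enumerate(range(0, max(len(seq) - overlap, 1), step)):
        piece = seq[start:start + oligo_len]
        oligos.append(piece if n % 2 == 0 else reverse_complement(piece))
    return oligos


# Hypothetical 100-nt synthon fragment.
synthon = ("ATGGCTAGCAAGTTCGACCTGAAGGCTCCG" * 4)[:100]
oligos = design_oligos(synthon)
for n, oligo in enumerate(oligos):
    print(n, len(oligo), oligo)
```

With these settings a 100-nucleotide sequence yields four 40-mers, each sharing 20 nucleotides with the next; a fuller design tool would then adjust individual overlaps within the 17 to 23 nucleotide window.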
[0034] In another aspect of the invention, a set of overlapping oligonucleotide sequences is
selected wherein each oligonucleotide anneals with its adjacent oligonucleotide within a selected
temperature range.
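
One simple way to screen overlaps against a temperature window is sketched below. The application does not say how annealing temperature is estimated; this sketch substitutes the Wallace rule (2 °C per A/T and 4 °C per G/C), a rough approximation intended for short oligonucleotides. The function names, the example overlaps, and the 55 to 65 °C window are all assumptions.

```python
def wallace_tm(dna):
    """Approximate melting temperature in deg C by the Wallace rule."""
    dna = dna.upper()
    at = dna.count("A") + dna.count("T")
    gc = dna.count("G") + dna.count("C")
    return 2 * at + 4 * gc


def overlaps_in_range(overlaps, tm_min, tm_max):
    """For each overlap, report whether its estimated Tm is in the window."""
    return [tm_min <= wallace_tm(o) <= tm_max for o in overlaps]


# Hypothetical 20-nt overlap regions between adjacent oligonucleotides.
overlap_regions = ["ATGCATGCATGCATGCATGC", "AAAAAAAAAATTTTTTTTTT"]
print([wallace_tm(o) for o in overlap_regions])
print(overlaps_in_range(overlap_regions, 55, 65))
```

An overlap that falls outside the window could be lengthened or shortened by a few nucleotides and rechecked; a production tool would more likely use a nearest-neighbor thermodynamic model.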
[0035] In another aspect of the invention, generating a set of overlapping oligonucleotide
sequences includes providing an alignment cutoff value for sequence specificity, aligning each
oligonucleotide sequence with the sequence of the synthetic gene and determining its alignment
value, and identifying and rejecting oligonucleotides comprising alignment values lower than the
alignment cutoff value.
[0036] In another aspect of the invention, a region of error in a rejected oligonucleotide is
identified and optionally, one or more nucleotides in the region of error are substituted such that
the alignment value of the rejected oligonucleotide is raised above the alignment cutoff value.
[0037] In another aspect of the invention, an order list of oligonucleotides which comprise a
synthetic gene or a synthon is generated.
[0038] In another aspect of the invention, removing of restriction sites includes
[0039] identifying positions of preselected restriction sites in the randomized nucleotide
sequence, identifying an ability of one or more codons comprising the nucleotide sequence of the
restriction site for accepting a substitution in the nucleotide sequence of the restriction site
wherein such substitution will (a) remove the restriction site and (b) create a codon encoding an
amino acid identical to the codon whose sequence has been changed, and changing the sequence
of the restriction site at the identified codon.
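
The site-removal logic above can be sketched as a search for a synonymous codon swap. This is a minimal illustration, assuming an in-frame coding sequence and the standard genetic code; it tries each codon that the site overlaps and accepts the first synonymous replacement that reduces the number of occurrences of the site. It does not check whether the swap creates some other unwanted site. All names and sequences are hypothetical.

```python
# Standard genetic code in TCAG order, plus its inverse.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def translate(dna):
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_site(dna, site):
    """Count occurrences of site, including overlapping ones."""
    return sum(dna.startswith(site, i) for i in range(len(dna) - len(site) + 1))


def remove_site(dna, site):
    """Remove every occurrence of site using synonymous codon changes."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    pos = dna.find(site)
    while pos != -1:
        fixed = False
        # Codons overlapped by this occurrence of the site.
        for index in range(pos // 3, (pos + len(site) - 1) // 3 + 1):
            start = 3 * index
            current = dna[start:start + 3]
            for alt in CODONS_FOR[CODON_TABLE[current]]:
                candidate = dna[:start] + alt + dna[start + 3:]
                if alt != current and count_site(candidate, site) < count_site(dna, site):
                    dna, fixed = candidate, True
                    break
            if fixed:
                break
        if not fixed:
            raise ValueError("no synonymous substitution removes the site")
        pos = dna.find(site)
    return dna


original = "ATGACTAGTGTT"            # hypothetical; contains Spe I (ACTAGT)
cleaned = remove_site(original, "ACTAGT")
print(cleaned, translate(original), translate(cleaned))
```

Because only synonymous codons are tried, the translation is unchanged by construction; the function raises an error when every overlapped codon is the sole codon for its residue (for example Met or Trp).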
[0040] In another aspect of the invention, inserting of restriction sites includes identifying
selected positions for insertion of a selected restriction site in the randomized nucleotide
sequence, performing a substitution in the nucleotide sequence at the selected position such that
the selected restriction site sequence is created at the selected position, translating the substituted
sequence to an amino acid sequence, and accepting a substitution wherein the translated amino
acid sequence is identical to the reference amino acid sequence at the selected position and
rejecting a substitution wherein the translated amino acid sequence is different from the
reference amino acid sequence at the selected position.
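
The accept-or-reject test for site insertion can be sketched directly: overwrite the nucleotides at a candidate position with the site, translate, and keep the change only if the protein is unchanged. This minimal sketch assumes an in-frame sequence, the standard genetic code, and strict identity (it does not implement the "similar amino acid" variant). The names and example sequence are hypothetical.

```python
# Standard genetic code in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def insert_site(dna, site, position):
    """Overwrite dna with site at position; return None if the protein changes."""
    candidate = dna[:position] + site + dna[position + len(site):]
    if len(candidate) != len(dna):
        raise ValueError("site would run past the end of the sequence")
    return candidate if translate(candidate) == translate(dna) else None


def silent_positions(dna, site):
    """All nucleotide positions where site can be written silently."""
    return [p for p in range(len(dna) - len(site) + 1)
            if insert_site(dna, site, p) is not None]


gene = "ATGACCAGCGTT"                       # hypothetical; encodes M-T-S-V
print(silent_positions(gene, "ACTAGT"))     # candidate Spe I positions
print(insert_site(gene, "ACTAGT", 3))
```

In the toy sequence the Spe I site fits silently only over the Thr-Ser codons, so a single position is accepted and every other position is rejected.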
`
`
[0041] In another aspect of the invention, a translated amino acid sequence identical to the
reference amino acid sequence comprises substitution of an amino acid with a similar amino acid
at the selected position.

[0042] In another aspect of the invention, the synthetic gene encodes a PKS module.

[0043] In another aspect of the invention, the reference amino acid sequence is of a naturally
occurring polypeptide segment.

[0044] In another aspect of the invention, one or more steps of the method may be performed by
a programmed computer.

[0045] In another aspect of the invention, a computer readable storage medium contains
computer executable code for carrying out the method of the present invention.
`
[0046] In a method for analyzing a nucleotide sequence of a synthon in accordance with the
present invention, a sequence of a synthetic gene is provided, wherein the synthetic gene is
divided into a plurality of synthons. Sequences of a plurality of synthon samples are also
provided wherein each synthon of the plurality of synthons is cloned in a vector. And, a
sequence of the vector without an insert is provided. Vector sequences from the sequence of the
cloned synthon are eliminated and a contig map of sequences of the plurality of synthons is
constructed. The contig map of sequences is aligned with the sequence of the synthetic gene;
and a measure of alignment for each of the plurality of synthons is identified.
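
Part of this analysis can be sketched with the Python standard library. The example below assumes each sequencing read contains the insert between two known vector flanking sequences, trims those flanks, and scores the insert against the designed synthon with `difflib.SequenceMatcher`. It does not build a contig map, and the flank sequences, clone names, and scoring choice are all illustrative assumptions rather than the application's method.

```python
from difflib import SequenceMatcher


def trim_vector(read, left_flank, right_flank):
    """Return the insert between the vector flanks, or None if not found."""
    start = read.find(left_flank)
    end = read.rfind(right_flank)
    if start == -1 or end == -1 or end < start + len(left_flank):
        return None
    return read[start + len(left_flank):end]


def alignment_score(insert, expected):
    """Similarity in [0, 1]; 1.0 means the insert matches the design exactly."""
    return SequenceMatcher(None, insert, expected, autojunk=False).ratio()


def rank_samples(reads, expected, left_flank, right_flank):
    """Rank sample names from best to worst agreement with the design."""
    scored = []
    for name, read in reads.items():
        insert = trim_vector(read, left_flank, right_flank)
        score = alignment_score(insert, expected) if insert is not None else 0.0
        scored.append((score, name))
    return [name for score, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Hypothetical design, vector flanks, and sequencing reads.
design = "ATGACCAGCGTTGGTACC"
LEFT, RIGHT = "GAGCTC", "AAGCTT"
reads = {
    "clone1": "TT" + LEFT + design + RIGHT + "GG",                 # perfect
    "clone2": "TT" + LEFT + "ATGACCAGCGTAGGTACC" + RIGHT + "GG",   # one mismatch
    "clone3": "TTTTTTTTTTTTTTTT",                                  # no insert
}
print(rank_samples(reads, design, LEFT, RIGHT))
```

Ranking clones this way makes it easy to pick the best sample per synthon and to flag near-perfect clones as candidates for repair.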
`
[0047] In another aspect of the invention, errors in one or more synthon sequences are
identified; and one or more items of information are reported, the information selected from the group
consisting of: a ranking of synthon samples by degree of alignment, an error in the sequence of a
synthon sample, and identity of a synthon that can be repaired.

[0048] In another aspect of the invention, a statistical report on a plurality of alignment
errors is prepared.
`
[0049] A system for high-throughput synthesis of synthetic genes in accordance with the
present invention includes a source microwell plate containing oligonucleotides for assembly
PCR, a first source for amplification mixture including polymerase and buffers useable for
assembly PCR, a second source for LIC extension primer mixture, and a PCR microwell plate
for amplification of oligonucleotides. A liquid handling device retrieves a plurality of
predetermined sets of oligonucleotides from the source microwell plate(s), combines the
predetermined sets and the amplification mixture in wells of the PCR microwell plate, LIC
extension primer mixture, and combines the LIC extension primer mixture and amplicons in a
well of the PCR microwell plate. The system also includes a heat source for PCR amplification
configured to accept the at least one PCR microwell plate.
`
`BRIEF DESCRIPTION OF THE FIGURES
[0050] FIGURE 1 shows a UDG-cloning cassette (“cloning linker”) and a scheme of vector
preparation for ligation-independent cloning (LIC) using the nicking endonuclease N. BbvC IA.
FIGURE 1A. UDG-cloning cassette. Sac I and nicking enzyme sites used in vector preparation
are labeled. FIGURE 1B. Scheme of vector preparation for LIC using nicking endonuclease N.
BbvC IA.
`
[0051] FIGURE 2 illustrates the Method S joining method using Bbs I and Bsa I as the Type
IIS restriction enzymes.
[0052] FIGURE 3A shows the Method S joining method using Vector Pair I. FIGURE 3B
shows the Method S joining using Vector Pair II. 2S1-4 are recognition sites for Type IIS
restriction enzymes, and A, B, B and C, respectively, are the cleavage sites for the enzymes.
[0053] FIGURE 4 shows a vector pair useful for stitching. FIGURE 4A: Vector pKos293-
172-2. FIGURE 4B: Vector pKos293-172-A76. Both vectors contain a UDG-cloning cassette
with N. BbvC IA recognit
