`(12) Patent Application Publication (10) Pub. No.: US 2004/0185484A1
`(43) Pub. Date:
`Sep. 23, 2004
`Costa et al.
`
`US 20040185484A1
`
`(54) METHOD FOR PREPARING
`SINGLE-STRANDED DNA LIBRARIES
`(76) Inventors: Gina L. Costa, Beverly, MA (US);
`John H. Leamon, Guilford, CT (US);
`Jonathan M. Rothberg, Guilford, CT
`(US); Michael P. Weiner, Guilford, CT
`(US)
`Correspondence Address:
`MINTZ, LEVIN COHN FERRIS GLOVSKY &
`POPEO
`666 THIRD AVENUE
`NEW YORK, NY 10017 (US)
`(21) Appl. No.:
`10/767,894
`(22) Filed:
`Jan. 28, 2004
`Related U.S. Application Data
`(60) Provisional application No. 60/443,471, filed on Jan. 29, 2003.
`Provisional application No. 60/465,071, filed on Apr. 23, 2003.
`Provisional application No. 60/476,504, filed on Jun. 6, 2003.
`Provisional application No. 60/476,313, filed on Jun. 6, 2003.
`Provisional application No. 60/476,592, filed on Jun. 6, 2003.
`Provisional application No. 60/497,985, filed on Aug. 25, 2003.
`Provisional application No. 60/476,602, filed on Jun. 6, 2003.
`
`Publication Classification
`
`(51) Int. Cl.7 ....................................................... C12Q 1/68
`(52) U.S. Cl. .................................................................. 435/6
`(57) ABSTRACT
`This invention relates to methods of generating single-stranded DNA
`libraries for use in amplification and sequencing reactions. In various
`aspects, the disclosed methods include: fragmenting DNA; polishing the
`fragment ends; ligating the fragments to universal adaptors; performing
`strand displacement and extension of the nicked fragments; purifying the
`double-stranded ligation products; capturing the double-stranded ligation
`products onto a solid support; isolating single-stranded DNA library
`fragments; and binding these fragments to another solid support.
`
`
`
`
`[Sheet 1 of 9. FIG. 1: schematic of library preparation. Panels:
`(A) fragmentation of template DNA with DNase; (B) polishing with DNA
`polymerase to generate blunt ends; (C) adaptor ligation with T4 ligase;
`(D) gel isolation, nick repair and extension; (E) isolation and
`quantitation of the ssDNA library: non-nicked dsDNA is bound to
`streptavidin-coated beads and a single strand is eluted, the ssDNA is bound
`to Sepharose DNA capture beads (1 ssDNA per bead), a single DNA capture
`bead is deposited per well, and the library is sequenced. Inset: gel image
`titled "Sample Preparation of Adenovirus DNA" with labels A-E.]
`
`
`
`[Sheet 2 of 9. FIGS. 2A-2C: universal adaptor design schematics and the
`Adaptor A / Adaptor B sequence pair (PCR primer, 20 bp; sequencing primer,
`20 bp; key, 4 bp). The drawing text, including the nucleotide sequences, is
`not recoverable from the extraction.]
`
`
`
`
`[Sheet 3 of 9. FIG. 3: strand displacement and extension.
`(A) Nicked double-stranded DNA: a gDNA fragment (200 bp) flanked by
`Universal Adaptor A (44 bp) and biotinylated Universal Adaptor B, with
`Nick 1 and Nick 2; addition of Bst DNA polymerase.
`(B), (C) Bst DNA polymerase binds single-stranded gaps, strand displaces
`the nicked strand and extends the fragment.
`(D) Result is a non-nicked double-stranded DNA fragment.]
`
`
`
`[Sheets 4 and 5 of 9: drawing content not recoverable from the extraction.]
`
`
`
`[Sheet 6 of 9. FIG. 7, panels A-D: image content not recoverable from the
`extraction.]
`
`
`
`[Sheet 7 of 9. FIG. 8: trace plotted against time (seconds); image content
`not recoverable from the extraction.]
`
`
`
`[Sheet 8 of 9. FIGS. 9A and 9B: traces plotted against time (seconds); axis
`values not recoverable from the extraction.]
`
`
`
`[Sheet 9 of 9. FIG. 10: histogram titled "Primer Candidates by Tm",
`8x19x19x19x9 tetrads (493,848 total possibilities).
`Tm = 2*(A+T) + 4*(G+C). Tm bins: 64 to 66, 66 to 68, 68 to 70, 70 to 72.
`Count axis ticks run in steps of 20,000; one surviving label reads 30799.
`Bar heights are not recoverable from the extraction.]
`
`
`
`
`METHOD FOR PREPARING SINGLE-STRANDED
`DNA LIBRARIES
`
`RELATED APPLICATIONS
`[0001] This application claims the benefit of priority to the following
`applications: U.S. Ser. No. 60/443,471 filed Jan. 29, 2003; U.S. Ser. No.
`60/465,071 filed Apr. 23, 2003; U.S. Ser. No. 60/476,504 filed Jun. 6, 2003;
`U.S. Ser. No. 60/476,313 filed Jun. 6, 2003; U.S. Ser. No. 60/476,592 filed
`Jun. 6, 2003; U.S. Ser. No. 60/476,602 filed Jun. 6, 2003; and U.S. Ser. No.
`60/497,985 filed Aug. 25, 2003. All patents and patent applications in this
`paragraph are hereby incorporated herein by reference in their entirety.
`[0002] This application also incorporates by reference the following
`copending U.S. patent applications: “Bead Emulsion Nucleic Acid
`Amplification” filed Jan. 28, 2004, “Double Ended Sequencing” filed Jan. 28,
`2004, and “Methods Of Amplifying And Sequencing Nucleic Acids” filed Jan.
`28, 2004.
`
`FIELD OF THE INVENTION
`[0003] The invention relates to protein chemistry, molecular biology, and
`methods of preparing single-stranded libraries for sequence analysis. More
`specifically, this invention includes methods of processing DNA for use in
`amplification and sequencing reactions.
`
`BACKGROUND OF THE INVENTION
`[0004] In amplification by polymerase chain reaction (PCR), two primers are
`designed to hybridize to the template DNA at positions complementary to the
`respective primers that are separated on the DNA template molecule by some
`number of nucleotides. The base sequence of the template DNA between and
`including the primers is amplified by repetitive complementary strand
`extension reactions, whereby the number of copies of the target DNA
`fragments is increased by several orders of magnitude. Amplification is
`exponential as 2^n, where n equals the number of amplification cycles.
`Following PCR, the amplified DNA may be sequenced through conventional
`sequencing methods (see U.S. Pat. No. 6,274,320).
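The exponential relationship stated above (copy number growing as 2^n over n cycles) can be illustrated with a minimal Python sketch. The function name `pcr_copies` and the `efficiency` parameter are illustrative assumptions that do not appear in this application; with the default efficiency of 1.0 the function reduces to the ideal 2^n case described in the text.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Return the copy number after `cycles` rounds of amplification.

    efficiency=1.0 is the ideal doubling case (2**n) described in the text.
    Other values are a hypothetical extension and are not part of the source.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    # Each cycle multiplies the copy number by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Ideal amplification from a single template molecule.
    for n in (10, 20, 30):
        print(n, f"{pcr_copies(1, n):.3g}")
```

Ten ideal cycles already give about a thousand-fold increase, which is consistent with the "several orders of magnitude" noted above.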
`[0005] Samples comprising large template DNA or whole DNA genomes comprising
`long nucleotide sequences are not conducive to efficient amplification by
`PCR. These long molecules do not naturally possess sequences useful for
`primer hybridization. In addition, if primer hybridization sequences are
`added to double stranded DNA molecules, it is difficult to ascertain the
`directionality of the amplified DNA molecules, and this frustrates
`sequencing efforts.
`[0006] Various methods have been designed to overcome some of these
`deficiencies. For example, U.S. Pat. No. 5,508,169 describes that subsets of
`nucleic acid fragments may be indexed (i.e., selected or targeted) based
`upon the information contained in non-identical 5'-protruding or
`3'-protruding cohesive ends. This includes fragments having 3, 4 or 5 base
`cohesive ends, such as those revealed by cleavage of DNA by Type II
`restriction endonucleases and interrupted palindrome recognizing Type II
`restriction endonucleases. The patent describes nucleic acid molecules
`similar to adaptors (called indexing linkers) which contain protruding
`single strands complementary to the cohesive ends of cleavage sites of
`restriction endonucleases (rather than the recognition sequences). Various
`functional groups or specific nucleic acid sequences designed for particular
`applications may be selectively attached to the aforementioned subsets of
`fragments. Selective attachment of indexing linkers having known base
`sequences in their cohesive ends to a subset of fragments bearing the
`complementary cohesive ends can be used for the detection, identification,
`isolation, amplification, and manipulation of the subset of fragments.
`[0007] U.S. Pat. No. 6,468,748 describes a method of sorting genes and/or
`gene fragments comprising several steps. First, ds cDNA molecules are
`prepared from mRNA molecules by reverse transcription, using a poly-T primer
`optionally having a general primer-template sequence upstream from the
`poly-T sequence, yielding ds cDNA molecules having the poly-T sequence,
`optionally having the general primer-template sequence. Second, the ds cDNA
`molecules are digested with a restriction enzyme that produces digested cDNA
`molecules with cohesive ends having overhanging ssDNA sequences of a
`constant number of arbitrary nucleotides. Third, the digested cDNA molecules
`are ligated to a set of dsDNA oligonucleotide adaptors, each of which
`adaptors has at one of its ends a cohesive-end ssDNA adaptor sequence
`complementary to one of the possible overhanging ssDNA sequences of the
`digested cDNA, at the opposite end a specific primer-template sequence
`specific for the ssDNA adaptor complementary sequence, and in between the
`ends a constant sequence that is the same for all of the different adaptors
`of the set. Fourth, the ligated cDNA molecules are amplified by separate
`polymerase chain reactions, utilizing for each separate polymerase chain
`reaction a primer that anneals to the cDNA poly-T sequence optionally having
`the cDNA general primer-template, and a primer from a set of different
`specific primers that anneal to the cDNA specific primer-template sequences.
`Fifth, the amplified cDNA molecules are sorted into nonoverlapping groups by
`collecting the amplification products after each separate polymerase chain
`reaction, each group of amplified cDNA molecules determined by the specific
`primer that annealed to the specific primer-template sequence and primed the
`polymerase chain reaction.
`[0008] U.S. Pat. No. 5,863,722 describes a method and materials for sorting
`polynucleotides with oligonucleotide tags. The oligonucleotide tags are
`capable of hybridizing to complementary oligomeric compounds consisting of
`subunits having enhanced binding strength and specificity as compared to
`natural oligonucleotides. Such complementary oligomeric compounds are
`referred to as “tag complements.” Subunits of tag complements may consist of
`monomers of non-natural nucleotide analogs, referred to as “antisense
`monomers,” or they may comprise oligomers having lengths in the range of 3
`to 6 nucleotides or analogs thereof, including antisense monomers, the
`oligomers being selected from a minimally cross-hybridizing set. In such a
`set, a duplex made up of an oligomer of the set and the complement of any
`other oligomer of the set contains at least two mismatches. In other words,
`an oligomer of a minimally cross-hybridizing set at best forms a duplex
`having at least two mismatches with the complement of any other oligomer of
`the same set. Tag complements attached to a solid phase support are used to
`sort polynucleotides from a mixture of polynucleotides each containing a
`tag. The surface of each
`support is derivatized by only one type of tag complement which has a
`particular sequence. Similarly, the polynucleotides to be sorted each
`comprise an oligonucleotide tag in the repertoire, such that identical
`polynucleotides have the same tag and different polynucleotides have
`different tags. Thus, when the populations of supports and polynucleotides
`are mixed under conditions which permit specific hybridization of the
`oligonucleotide tags with their respective complements, subpopulations of
`identical polynucleotides are sorted onto particular beads or regions. The
`subpopulations of polynucleotides can then be manipulated on the solid phase
`support by micro-biochemical techniques.
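The "minimally cross-hybridizing set" property summarized above for U.S. Pat. No. 5,863,722 can be checked with a short Python sketch. It assumes equal-length oligomers aligned end to end with Watson-Crick pairing only, so that the duplex between one oligomer and the complement of another has a mismatch at exactly the positions where the two oligomers differ. The function names and example sequences are hypothetical and are not taken from either document.

```python
from itertools import combinations
from typing import Iterable


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length oligomers differ.

    Under the stated assumptions this equals the number of mismatches in a
    duplex formed by `a` and the complement of `b`.
    """
    if len(a) != len(b):
        raise ValueError("oligomers must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_minimally_cross_hybridizing(oligos: Iterable[str], min_mismatches: int = 2) -> bool:
    """True if every pair of distinct oligomers differs at >= min_mismatches positions."""
    return all(
        mismatches(a, b) >= min_mismatches
        for a, b in combinations(list(oligos), 2)
    )


if __name__ == "__main__":
    # Hypothetical 4-mers chosen only to exercise the check.
    print(is_minimally_cross_hybridizing(["AAAA", "CCAA", "AACC"]))
    print(is_minimally_cross_hybridizing(["ACGT", "ACGA"]))
```

The pairwise check is quadratic in the number of oligomers, which is adequate for the short tag subunits (3 to 6 nucleotides) mentioned in the text.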
`[0009] U.S. Pat. No. 5,728,524 describes a process for the categorization of
`nucleic acid sequences in which these sequences are linked to a population
`of adaptor molecules, each exhibiting specificity for linking to a sequence
`including a predetermined nucleotide base. The resulting linked sequences
`are then categorized based upon selection for the particular base.
`[0010] However, the art does not describe methods for generating libraries
`of unknown fragment sequences additionally comprising two known sequences,
`each different from the other, one being adjoined at each end. Thus, a need
`exists for a method which overcomes the shortcomings of the prior art.
`Accordingly, the present invention is directed to describing such methods,
`materials, and kits as required to facilitate manipulation of multiple DNA
`sequences in a sample.
`BRIEF SUMMARY OF THE INVENTION
`[0011] This invention describes a novel method for preparing a library of
`multiple nucleic acid sequences from a sample where the library is suited to
`further quantitative and comparative analysis, particularly where the
`multiple nucleic acid sequences are unknown and derived from large template
`DNA or whole (or partial) genome DNA. In certain embodiments of the
`invention, sequences of single stranded DNA (ssDNA) are prepared from a
`sample of large template DNA or whole or partial DNA genomes through
`fragmentation, polishing, adaptor ligation, nick repair, and isolation of
`ssDNA.
`[0012] Therefore, in one aspect, the present invention provides a method for
`clonally isolating a library comprising a plurality of ssDNAs, wherein each
`ssDNA comprises a first single stranded universal adaptor and a second
`single stranded universal adaptor, the method comprising:
`[0013] (a) fragmenting large template DNA molecules to generate a plurality
`of fragmented DNA molecules;
`[0014] (b) attaching a first or second universal double stranded adaptor to
`a first end of each fragmented DNA molecule and a first or second universal
`adaptor to a second end of each fragmented DNA molecule to form a mixture of
`adaptor ligated DNA molecules;
`[0015] (c) isolating a plurality of single stranded DNA molecules each
`comprising a first single stranded universal adaptor and a second single
`stranded universal adaptor; and
`[0016] (d) delivering the single stranded DNA molecules into reactors such
`that a plurality of the reactors include one DNA molecule, thereby clonally
`isolating the library.
`
`[0017] In certain aspects, the single stranded DNA molecules are delivered
`into droplets in a water-in-oil emulsion (i.e., microreactors), or onto
`multiwell surfaces (e.g., PicoTiter plates).
`[0018] The single stranded DNA molecules may be delivered via attachment to
`a solid support (e.g., beads).
`[0019] In other aspects, the adaptor ligated DNA molecules comprising a
`first double stranded universal adaptor and a second double stranded
`universal adaptor are attached to a solid support via one strand of the
`double stranded universal adaptor (via the first or second universal
`adaptor). The adaptor ligated DNA molecules which have not attached to a
`solid support are washed away, and one strand of the adaptor ligated DNA
`molecules is released. This generates a mixture comprising a plurality of
`ssDNAs comprising a population of single stranded molecules with a first and
`second universal adaptor pair, thereby generating a library.
`[0020] The sequence of the fragmented DNA may be known or unknown. In a
`preferred embodiment, the sequence of the fragmented DNA, particularly the
`sequence of the ends of the fragmented DNA, is unknown.
`[0021] In another aspect, the present invention includes a method for
`generating a ssDNA library linked to solid supports comprising: (a)
`generating a library of ssDNA templates; (b) attaching the ssDNA templates
`to solid supports; and (c) isolating the solid supports on which one ssDNA
`template is attached. In still another aspect, the present invention
`includes a library of mobile solid supports made by the method disclosed
`herein.
`
`BRIEF DESCRIPTION OF THE FIGURES
`[0022] FIG. 1 is a schematic representation of the entire process of library
`preparation including the steps of template DNA fragmentation (FIG. 1A), end
`polishing (FIG. 1B), adaptor ligation (FIG. 1C), nick repair, strand
`extension and gel isolation (FIG. 1D). FIG. 1 also depicts a representative
`agarose gel containing a sample preparation of a 180-350 base pair
`adenovirus DNA library according to the methods of this invention.
`[0023] FIG. 2A is a schematic representation of the universal adaptor design
`according to one embodiment of the present invention. Each universal adaptor
`is generated from two complementary ssDNA oligonucleotides that are designed
`to contain a 20 bp nucleotide sequence for PCR priming, a 20 bp nucleotide
`sequence for sequence priming and a unique 4 bp discriminating sequence
`comprised of a non-repeating nucleotide sequence (i.e., ACGT, CAGT, etc.).
`[0024] FIG. 2B depicts a representative universal adaptor sequence pair for
`use with the invention. Adaptor A sense strand: SEQ ID NO:1; Adaptor A
`antisense strand: SEQ ID NO:2; Adaptor B sense strand: SEQ ID NO:3; Adaptor
`B antisense strand: SEQ ID NO:4.
`[0025] FIG. 2C is a schematic representation of universal adaptor design for
`use with the invention.
`[0026] FIG. 3 represents the strand displacement and extension of nicked
`double-stranded DNA fragments according to the present invention. Following
`the ligation of universal adaptors generated from synthetic
`oligonucleotides, double-stranded DNA fragments will be generated
`that contain two nicked regions following T4 DNA ligase treatment (FIG. 3A).
`A strand displacing enzyme (i.e., Bst DNA polymerase I), once added, will
`bind nicks (FIG. 3B), strand displace the nicked strand and complete
`nucleotide extension of the strand (FIG. 3C) to produce non-nicked
`double-stranded DNA fragments (FIG. 3D).
`[0027] FIG. 4 represents the isolation of directionally ligated
`single-stranded DNA according to the present invention using
`streptavidin-coated magnetic beads. Following ligation with universal
`adaptors A and B (the two different adaptors are sometimes referred to as a
`“first” and “second” universal adaptor), double-stranded DNA will contain
`adaptors in four possible combinations: AA, BB, AB, and BA. When universal
`adaptor B contains a 5'-biotin, magnetic streptavidin-coated solid supports
`are used to capture and isolate the AB, BA, and BB populations (population
`AA is washed away). The BB population is retained on the beads as each end
`of the double-stranded DNA is attached to a bead and is not released.
`However, upon washing in the presence of a low salt buffer, only populations
`AB and BA will release a single-stranded DNA fragment that is complementary
`to the bound strand. Single-stranded DNA fragments are isolated from the
`supernatant and used as template for subsequent applications. This method is
`described below in more detail.
`[0028] FIG. 5 represents an insert flanked by PCR primers and sequencing
`primers.
`[0029] FIG. 6 represents truncated product produced by PCR primer mismatch
`at cross-hybridization region (CHR).
`[0030] FIGS. 7A-7D depict the assembly for the nebulizer used for the
`methods of the invention. A tube cap was placed over the top of the
`nebulizer (FIG. 7A) and the cap was secured with a nebulizer clamp assembly
`(FIG. 7B). The bottom of the nebulizer was attached to the nitrogen supply
`(FIG. 7C) and the entire device was wrapped in parafilm (FIG. 7D).
`[0031] FIG. 8 depicts representative BioAnalyzer output from analysis of a
`single stranded DNA library.
`[0032] FIG. 9A depicts representative results for LabChip analysis of a
`single stranded DNA library following nebulization and polishing.
`[0033] FIG. 9B depicts representative size distribution results for an
`adaptor-ligated single stranded DNA library following nebulization,
`polishing, and gel purification.
`[0034] FIG. 10 depicts the calculation for primer candidates based on
`melting temperature.
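The melting-temperature formula shown on the FIG. 10 drawing sheet, Tm = 2*(A+T) + 4*(G+C), can be sketched in Python as follows. The function names `wallace_tm` and `tm_in_window` and the example sequences are illustrative assumptions; the 64 to 72 window simply spans the Tm bins labeled on the figure, and nothing here reproduces the candidate-selection procedure actually used by the inventors.

```python
def wallace_tm(seq: str) -> int:
    """Estimate Tm (in degrees C) as 2*(A+T) + 4*(G+C), the formula on FIG. 10."""
    seq = seq.upper()
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_in_window(seq: str, low: int = 64, high: int = 72) -> bool:
    """True if the estimated Tm falls within [low, high].

    The default window is an assumption based on the bin labels of FIG. 10.
    """
    return low <= wallace_tm(seq) <= high


if __name__ == "__main__":
    # Hypothetical 20-mers: 12 G/C + 8 A/T, and 10 G/C + 10 A/T.
    for primer in ("GC" * 6 + "AT" * 4, "ACGT" * 5):
        print(primer, wallace_tm(primer), tm_in_window(primer))
```

For a 20-mer, the formula reduces to 40 + 2*(G+C), so the 64 to 72 range corresponds to 12 to 16 G or C bases.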
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`[0035] This invention relates to the preparation of sample DNA for
`amplification and sequencing reactions. The invention includes a method for
`preparing the sample DNA comprised of the following steps: (a) fragmenting
`large template DNA or whole genomic DNA samples to generate a plurality of
`digested DNA fragments; (b) creating compatible ends on the plurality of
`digested DNA samples; (c) ligating a set of universal adaptor sequences onto
`the ends of fragmented DNA molecules to make a plurality of adaptor-ligated
`DNA molecules, wherein each universal adaptor sequence comprises a PCR
`primer sequence, a sequencing primer sequence and a discriminating key
`sequence and wherein one adaptor is attached to biotin; (d) separating and
`isolating the plurality of ligated DNA fragments; (e) removing any portion
`of the plurality of ligated DNA fragments; (f) nick repair and strand
`extension of the plurality of ligated DNA fragments; (g) attaching each of
`the ligated DNA fragments to a solid support; and (h) isolating populations
`comprising single-stranded adaptor-ligated DNA fragments for which there is
`a unique adaptor at each end (i.e., providing directionality).
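The selection of fragments with a different adaptor at each end, as described for FIG. 4, can be summarized in a short Python sketch. It only encodes the behavior stated in that description: adaptor B carries the biotin, the AA population is washed away, the BB population is retained on the beads, and the AB and BA populations release a single strand. The function name `capture_outcome` and the outcome labels are illustrative assumptions.

```python
from itertools import product

BIOTIN_ADAPTOR = "B"  # per the FIG. 4 description, adaptor B carries the 5'-biotin


def capture_outcome(end1: str, end2: str) -> str:
    """Classify a ligated fragment by the adaptors found at its two ends."""
    for end in (end1, end2):
        if end not in ("A", "B"):
            raise ValueError(f"unknown adaptor: {end!r}")
    biotin_ends = [end1, end2].count(BIOTIN_ADAPTOR)
    if biotin_ends == 0:
        return "washed away"        # AA: nothing binds the streptavidin beads
    if biotin_ends == 2:
        return "retained on beads"  # BB: both ends are held, nothing is released
    return "releases ssDNA"         # AB or BA: the non-bound strand is released


if __name__ == "__main__":
    for e1, e2 in product("AB", repeat=2):
        print(e1 + e2, "->", capture_outcome(e1, e2))
```

Only the two mixed combinations contribute to the final library, which is what gives each single-stranded fragment a distinct adaptor at each end.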
`[0036] Unless otherwise defined, all technical and scientific terms used
`herein have the same meaning as commonly understood by one of ordinary skill
`in the art to which this invention belongs. Methods and materials similar or
`equivalent to those described herein can be used in the practice of the
`present invention, and exemplified suitable methods and materials are
`described below. For example, methods may be described which comprise more
`than two steps. In such methods, not all steps may be required to achieve a
`defined goal and the invention envisions the use of isolated steps to
`achieve these discrete goals. The disclosures of all publications, patent
`applications, patents, and other references are incorporated in toto herein
`by reference. In addition, the materials, methods, and examples are
`illustrative only and not intended to be limiting.
`[0037] As used herein, the term “universal adaptor” refers to two
`complementary and annealed oligonucleotides that are designed to contain a
`nucleotide sequence for PCR priming and a nucleotide sequence for sequence
`priming. Optionally, the universal adaptor may further include a unique
`discriminating key sequence comprised of a non-repeating nucleotide sequence
`(i.e., ACGT, CAGT, etc.). A set of universal adaptors comprises two unique
`and distinct double-stranded sequences that can be ligated to the ends of
`double-stranded DNA. Therefore, the same universal adaptor or different
`universal adaptors can be ligated to either end of the DNA molecule. When
`comprised in a larger DNA molecule that is single stranded or when present
`as an oligonucleotide, the universal adaptor may be referred to as a single
`stranded universal adaptor.
`[0038] As used herein, the term “discriminating key sequence” refers to a
`sequence including a combination of the four deoxyribonucleotides (i.e., A,
`C, G, T). The same discriminating sequence can be used for an entire library
`of DNA fragments. Alternatively, different discriminating key sequences can
`be used to track libraries of DNA fragments derived from different
`organisms. Longer discriminating key sequences can be used for a mixture of
`more than one library.
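The adaptor layout described above (a 20-nucleotide PCR priming region, a 20-nucleotide sequencing priming region and a 4-nucleotide discriminating key, 44 nucleotides in total) can be modeled with a small Python sketch. Several points are assumptions made only for illustration: the class name `UniversalAdaptor`, the order in which the three regions are concatenated, the reading of "non-repeating" as "no nucleotide used more than once", and the placeholder sequences, which are not SEQ ID NOs 1-4.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class UniversalAdaptor:
    pcr_primer: str  # 20-nt PCR priming region
    seq_primer: str  # 20-nt sequencing priming region
    key: str         # 4-nt discriminating key

    def __post_init__(self) -> None:
        parts = (
            ("pcr_primer", self.pcr_primer, 20),
            ("seq_primer", self.seq_primer, 20),
            ("key", self.key, 4),
        )
        for name, value, length in parts:
            if len(value) != length or set(value) - set("ACGT"):
                raise ValueError(f"{name} must be {length} nt of A/C/G/T")
        # Assumed reading of "non-repeating": each base appears at most once.
        if len(set(self.key)) != len(self.key):
            raise ValueError("key must not repeat a nucleotide")

    @property
    def sense(self) -> str:
        # The concatenation order is an assumption, not taken from FIG. 2.
        return self.pcr_primer + self.seq_primer + self.key

    @property
    def antisense(self) -> str:
        return reverse_complement(self.sense)


if __name__ == "__main__":
    adaptor = UniversalAdaptor("ACGT" * 5, "TGCA" * 5, "CAGT")  # placeholders
    print(len(adaptor.sense), adaptor.sense)
    print(adaptor.antisense)
```

Using a different key per library, as the definition above allows, would then amount to constructing adaptors that differ only in the `key` field.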
`[0039] As used herein, the term “plurality of molecules” refers to DNA
`isolated from the same source, whereby different organisms may be prepared
`separately by the same method. In one embodiment, the plurality of DNA
`samples is derived from large segments of DNA, e.g., genomic DNA, cDNA,
`viral DNA, plasmid DNA, cosmid DNA, artificial chromosome DNA (e.g., BACs,
`YACs, MACs, PACs), synthetic DNA, phagemid DNA, phasemid DNA, or from
`reverse transcripts of viral RNA. This DNA may be derived from any source,
`including any mammal (i.e., human, non-human primate, rodent, or canine),
`plant, bird, reptile, fish, fungus, bacteria, or virus.
`[0040] As used herein, the term “library” refers to a subset of smaller
`sized DNA species generated from a larger DNA template, e.g., a segmented or
`whole genome.
`[0041] As used herein, the term “unique”, as in “unique PCR priming
`regions”, refers to a sequence that does not exist or exists at an extremely
`low copy level within the DNA molecules to be amplified or sequenced.
`[0042] As used herein, the term “compatible” refers to an end of double
`stranded DNA to which an adaptor molecule may be attached (i.e., blunt end
`or cohesive end).
`[0043] As used herein, the term “fragmenting” refers to a process by which a
`larger molecule of DNA is converted into smaller pieces of DNA.
`[0044] As used herein, “large template DNA” would be DNA of more than 5 kb,
`10 kb, or 25 kb, preferably more than 500 kb, more preferably more than
`1 Mb, and most preferably 5 Mb or larger.
`[0045] As used herein, the term “stringent hybridization conditions” refers
`to those conditions under which only fully complementary sequences will
`hybridize to each other.
`[0046] The following discussion summarizes the basic steps involved in the
`methods of the invention. The steps are recited in a specific order;
`however, as would be known by one of skill in the art, the order of the
`steps may be manipulated to achieve the same result. Such manipulations are
`contemplated by the inventors. Further, some steps may be minimized as would
`also be known by one of skill in the art.
`[0047] Fragmentation
`[0048] In the practice of the methods of the present invention, the
`fragmentation of the DNA sample can be done by any means known to those of
`ordinary skill in the art. Preferably, the fragmenting is performed by
`enzymatic, chemical, or mechanical means. The mechanical means may include
`sonication, French press, HPLC, HydroShear (GeneMachines, San Carlos,
`Calif.), and nebulization. The enzymatic means may be performed by digestion
`with Deoxyribonuclease I (DNase I), nonspecific nucleases, or single or
`multiple restriction endonucleases. In a preferred embodiment, the
`fragmentation results in ends for which the sequence adjacent to the end is
`not known. The sequence adjacent to the end may be at least 5 bases, 10
`bases, 20 bases, 30 bases, or 50 bases.
`[0049] Enzymatic Fragmentation
`[0050] In a preferred embodiment, the enzymatic means is DNase I. DNase I is
`a versatile enzyme that nonspecifically cleaves double-stranded DNA (dsDNA)
`to release 5'-phosphorylated oligonucleotide products. DNase I has optimal
`activity in buffers containing Mn²⁺, Mg²⁺ and Ca²⁺. The purpose of the DNase
`I digestion step is to fragment a large DNA genome into smaller species
`comprising a library. The cleavage characteristics of DNase I will result in
`random digestion of template DNA (i.e., minimal sequence bias) and in the
`predominance of blunt-ended dsDNA fragments when used in the presence of
`manganese-based buffers (Melgar, E. and D. A. Goldthwait. 1968.
`Deoxyribonucleic acid nucleases. II. The effects of metal on the mechanism
`of action of Deoxyribonuclease I. J. Biol. Chem. 243: 4409). The range of
`digestion products generated following DNase I treatment of genomic
`templates is dependent on three factors: i) amount of enzyme used (units);
`ii) temperature of digestion (° C.); and iii) incubation time (minutes). The
`DNase I digestion conditions outlined below have been optimized to yield
`genomic libraries with a size range from 50-700 base pairs (bp).
`[0051] In a preferred embodiment, DNase I is used to digest large template
`DNA or whole genome DNA for 1-2 minutes to generate a population of
`oligonucleotides that range from 50 to 500 bp, or 50 to 700 bp. In another
`preferred embodiment, the DNase I digestion is performed at a temperature of
`10° C.-37° C. In yet another preferred embodiment, the digested DNA
`fragments are 50 bp to 700 bp in length.
`[0052] Mechanical Fragmentation
`[0053] Another preferred method for nucleic acid fragmentation is mechanical
`fragmentation. Mechanical fragmentation methods include sonication and
`nebulization, and use of HydroShear, HPLC, and French Press devices.
`Sonication may be performed by placing DNA in a tube containing a suitable
`buffer (i.e., 10 mM Tris, 0.1 mM EDTA) and sonicating for a varying number
`of 10 second bursts using maximum output and continuous power. Sonicators
`are commercially available from, e.g., Misonix Inc. (Farmingdale, N.Y.), and
`can be used essentially as described by Bankier and Barrell (Bankier, A. T.,
`Weston, K. M., and Barrell, B. G., “Random cloning and sequencing by the
`M13/dideoxynucleotide chain termination method”, Meth. Enzymol. 155, 51-93
`(1987)). For sonication, it is preferred to maintain the nucleic acid at a
`uniform temperature by keeping the sample on ice. Constant temperature
`conditions, at 0° C. for example, are preferred to maintain an even fragment
`distribution. The optimal conditions for sonication may be determined
`empirically for a given DNA sample before preparative sonication is
`performed. For example, aliquots of DNA can be treated for different times
`under sonication and the size and quality of DNA can be analyzed by PAGE.
`Once optimal sonication conditions are determined, the remaining DNA can be
`sonicated according to those pre-determined conditions.
`[0054] Another preferred method for nucleic acid fragmentation is treatment
`by nebulizers (e.g., protocols and hardware available from GeneMachines, San
`Carlos, Calif.; also see U.S. Pat. Nos. 5,506,100 and 5,610,010). In
`nebulization, hydrodynamic shearing forces are used to fragment DNA strands.
`For example, DNA in an aqueous solution can be passed through a tube with an
`abrupt contraction. As the solution approaches the contraction, the fluid
`accelerates to maintain the volumetric flow rate through the smaller area of
`the contraction. During this acceleration, drag forces stretch the DNA until
`it snaps. Optionally, the DNA solution can be passed several times (e.g., 15
`to 20 cycles) through the contraction until the fragments are too short for
`further shearing. By adjusting the contraction and the flow rate of the
`fluid, the size of the final DNA fragment may be determined. Software for
`controlling and monitoring reaction conditions is available to allow
`automation of the nebulizing process. As another advantage, there are no
`special buffer requirements for nebulization. For example, DNA may be
`suspended in various solutions including, but not limited to, water, Tris
`buffer, Tris-EDTA buffer, and Tris-EDTA with up to 0.5 M NaCl.
`[0055] Polishing
`[0056] Digestion of genomic DNA (gDNA) templates with DNase I in the
`presence of Mn²⁺ produces fragments of DNA that are either blunt-ended or
`have protruding termini one or two nucleotides in length. Similarly,
`fragmentation of DNA by mechanical means provides a combination of fragments
`with blunt ends or overhanging ends. These DNA fragments, whether generated
`enzymatically or mechanically, may be “polished” using the procedure
`described below.
`[0057] Polishing (also called end repair) refers to the conversion of
`non-blunt ended DNA into blunt ended DNA. In one method, polishing may be
`performed by treatment with a single strand-specific exonuclease, such as
`BAL32 nuclease or Mung B