PCT
WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau
INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification 7: C12Q 1/68
A1
(11) International Publication Number: WO 00/53805
(43) International Publication Date: 14 September 2000 (14.09.00)

(21) International Application Number: PCT/GB00/00873
(22) International Filing Date: 10 March 2000 (10.03.00)
(30) Priority Data: 09/266,187, 10 March 1999 (10.03.99), US
`
(71) Applicant (for all designated States except US): ASM SCIENTIFIC, INC.
[US/US]; 240 Norfolk Street, Cambridge, MA 02139 (US).
`
`(72) Inventors; and
`(75) Inventors/Applicants (for US only): STEMPLE, Derek, Lyle
[US/GB]; 292 Hatfield Road, St. Albans, Hertfordshire AL1
`4UN (GB). ARMES, Niall, Antony [GB/GB]; 140 Long
`Lane, London N3 2HX (GB).
`
`(74) Agents: SCHUCH, George, William et al.; Mathys & Squire,
100 Gray's Inn Road, London WC1X 8AL (GB).
`
`(81) Designated States: AE, AL, AM, AT, AU, AZ, BA, BB, BG,
`BR, BY, CA, CH, CN, CR, CU, CZ, DE, DK, DM, DZ, EE,
`ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP,
`KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA,
`MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU,
`SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG,
`US, UZ, VN, YU, ZA, ZW, ARIPO patent (GH, GM, KE,
`LS, MW, SD, SL, SZ, TZ, UG, ZW), Eurasian patent (AM,
`AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent (AT,
`BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU,
`MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM,
`GA, GN, GW, ML, MR, NE, SN, TD, TG).
`
`Published
`With international search report.
`Before the expiration of the time limit for amending the
`claims and to be republished in the event of the receipt of
`amendments.
`
`(54) Title: A METHOD FOR DIRECT NUCLEIC ACID SEQUENCING
`
[Figure: Example of a DNAS Reaction Center. Labels: DNA sample being sequenced; etched/derivatized spot; nitrilotriacetic acid; reaction chamber lower slide.]
`
`(57) Abstract
`
`The present invention provides a novel sequencing apparatus and the methods employed to determine the nucleotide sequence of
`many single nucleic acid molecules simultaneously, in parallel. The methods and apparatus of the present invention offer a rapid, cost
`effective, high through-put method by which nucleic acid molecules from any source can be readily sequenced without the need for prior
`amplification of the sample or prior knowledge of any sequence information.
`
`
WO 00/53805

PCT/GB00/00873
`
`A METHOD FOR DIRECT NUCLEIC ACID SEQUENCING
`
`FIELD OF THE INVENTION
`
`The present invention relates to methods for sequencing nucleic acid samples. More
`
`specifically, the present invention relates to methods for sequencing without the need for
`
`amplification; prior knowledge of some of the nucleotide sequence to generate the sequencing
`
`primers; and the labor-intensive electrophoresis techniques.
`
`BACKGROUND OF THE INVENTION
`
`The sequencing of nucleic acid samples is an important analytical technique in modern
`
molecular biology. The development of reliable methods for DNA sequencing has been crucial
`
`for understanding the function and control of genes and for applying many of the basic
`
`techniques of molecular biology. These methods have also become increasingly important as
`
`tools in genomic analysis and many non-research applications, such as genetic identification,
`
`forensic analysis, genetic counseling, medical diagnostics and many others.
`
`In these latter
`
`applications, both techniques providing partial sequence information, such as fingerprinting
`
`and sequence comparisons, and techniques providing full sequence determination have been
`
employed. See, e.g., Gibbs et al., Proc. Natl. Acad. Sci USA 86: 1919-1923 (1989); Gyllensten
`
`et al., Proc. Natl. Acad. Sci USA 85: 7652-7656 (1988); Carrano et al., Genomics 4: 129-136
`
`(1989); Caetano-Annoles et al., Mol. Gen. Genet. 235: 157-165 (1992); Brenner and Livak,
`
`Proc. Natl. Acad. Sci USA 86: 8902-8906 (1989); Green et al., PCR Methods and Applications
`
`1: 77-90 (1991); and Versalovic et al., Nucleic Acid Res. 19: 6823-6831 (1991).
`
`Most currently available DNA sequencing methods require the generation of a set of DNA
`
`fragments that are ordered by length according to nucleotide composition. The generation of
`
this set of ordered fragments occurs in one of two ways: (1) chemical degradation at specific
`
`nucleotides using the Maxam-Gilbert method or (2) dideoxy nucleotide incorporation using the
`
`Sanger method. See Maxam and Gilbert, Proc Natl Acad Sci USA 74: 560-564 (1977); Sanger
`
et al., Proc Natl Acad Sci USA 74: 5463-5467 (1977). The type and number of required steps

inherently limit both the number of DNA segments that can be sequenced in parallel and the
`
`amount of sequence that can be determined from a given site. Furthermore, both methods are
`
`prone to error due to the anomalous migration of DNA fragments in denaturing gels. Time and
`
`SUBSTITUTE SHEET (RULE 26)
`
`
`space limitations inherent in these gel-based methods have fueled the search for alternative
`
`methods.
`
`In an effort to satisfy the current large-scale sequencing demands, improvements have been
`
`made to the Sanger method. For example, the use of fluorescent chain terminators simplifies
`
`detection of the nucleotides. The synthesis of longer DNA fragments and improved fragment
`
`resolution produces more sequence information from each experiment. Automated analysis of
`
`fragments in gels or capillaries has significantly reduced the labor involved in collecting and
`
`processing sequence information. See, e.g., Prober et al., Science 238: 336-341 (1987); Smith
`
et al., Nature 321: 674-679 (1986); Luckey et al., Nucleic Acids Res 18: 4417-4421 (1990);
`
`Dovichi, Electrophoresis 18: 2393-2399 (1997).
`
However, current DNA sequencing technologies still suffer from three major limitations. First, they
`
`require a large amount of identical DNA molecules, which are generally obtained either by
`
`molecular cloning or by polymerase chain reaction (PCR) amplification of DNA sequences.
`
`Current methods of detection are insensitive and thus require a minimum critical number of
`
`labeled oligonucleotides. Also, many identical copies of the oligonucleotide are needed to
`
`generate a sequence ladder. A second limitation is that current sequencing techniques depend
`
`on priming from sequence-specific oligodeoxynucleotides that must be synthesized prior to
`
initiating the sequencing procedure. Sanger and Coulson, J. Mol. Biol. 94: 441-448 (1975).
`
`The need for multiple identical templates necessitates the synchronous priming of each copy
`
`from the same predetermined site. Third, current sequencing techniques depend on lengthy,
`
`labor-intensive electrophoresis techniques that are limited by the rate at which the fragments
`
`may be separated and are also limited by the number of bases that can be sequenced in a given
`
`experiment by the resolution obtainable on the gel.
`
In an effort to dispense with the need for electrophoresis techniques, a sequencing method was
`
`developed which uses chain terminators that can be uncaged, or deprotected, for further
`
extension. See U.S. Patent No. 5,302,509; Metzker et al., Nucleic Acids Res. 22: 4259-4267
`
`(1994).
`
This method involves repetitive cycles of base incorporation, detection of

incorporation, and re-activation of the chain terminator to allow the next cycle of DNA
`
`synthesis. Thus, by detecting each added base while the DNA chain is growing, the need for
`
`size-fractionation is eliminated. This method is nevertheless still highly dependent on large
`
`amounts of nucleic acid to be sequenced and the use of known sequences for priming the
`
`initiation of chain growth. Moreover, this technique is plagued by any inefficiencies of
`
`
incorporation and deprotection. Because incorporation and 3'-OH regeneration are not
`
`completely efficient, a pool of initially identical extending strands can rapidly become
`
`asynchronous and sequences cannot be resolved beyond a few limited initial additions.
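The loss of synchrony described above can be illustrated with a short calculation. The following is a minimal sketch, not part of the cited method: it assumes that each strand completes both incorporation and 3'-OH regeneration with a single combined per-cycle probability (the hypothetical parameter `per_cycle_efficiency`), and that a strand which misses a cycle never rejoins the in-phase population.

```python
def in_phase_fraction(per_cycle_efficiency: float, cycles: int) -> float:
    """Fraction of an initially identical strand pool still in phase.

    Assumption (illustrative only): every strand independently completes
    a full incorporation/deprotection cycle with the same probability,
    so the in-phase fraction decays geometrically with the cycle count.
    """
    if not 0.0 <= per_cycle_efficiency <= 1.0:
        raise ValueError("per_cycle_efficiency must lie in [0, 1]")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return per_cycle_efficiency ** cycles


if __name__ == "__main__":
    # Even a 95% efficient cycle leaves only about a third of the pool
    # in phase after 20 additions.
    for n in (1, 5, 10, 20, 50):
        print(n, round(in_phase_fraction(0.95, n), 3))
```

Under these assumptions the signal from a bulk population degrades quickly, whereas a single molecule has no population to fall out of phase with; a missed cycle simply delays that molecule's next base.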
`
`Thus, a need still remains in the art for a rapid, cost effective, high throughput method for
`
`sequencing unknown nucleic acid samples that eliminates the need for amplification; prior
`
knowledge of some of the nucleotide sequence to generate sequencing primers; and

labor-intensive electrophoresis techniques.
`
`SUMMARY OF THE INVENTION
`
`The present invention provides rapid, cost effective, high throughput methods for sequencing
`
`unknown nucleic acid samples that eliminate the need for amplification; prior knowledge of
`
`some of the nucleotide sequence to generate sequencing primers; and labor-intensive
`
`electrophoresis techniques. The methods of the present invention permit direct nucleic acid
`
`sequencing (DNAS) of single nucleic acid molecules.
`
`According to the methods of the present invention, a plurality of polymerase molecules is
`
`immobilized on a solid support through a covalent or non-covalent interaction. A nucleic acid
`
`sample and oligonucleotide primers are introduced to the reaction chamber in a buffered
`
solution containing all four labeled-caged nucleoside triphosphate terminators.

Template-driven elongation of a nucleic acid is mediated by the attached polymerases using the

labeled-caged nucleoside triphosphate terminators. Reaction centers are monitored by the microscope
`
`system until a majority of sites contain immobilized polymerase bound to a nucleic acid
`
`template with a single incorporated labeled-caged nucleotide terminator. The reaction chamber
`
`is then flushed with a wash buffer. Specific nucleotide incorporation is then determined for
`
`each active reaction center. Following detection, the reaction chamber is irradiated to uncage
`
the incorporated nucleotide and flushed with wash buffer once again. The presence of

labeled-caged nucleotides is once again monitored before fresh reagents are added to reinitiate
`
`synthesis, to verify that reaction centers are successfully uncaged. A persistent failure of
`
`release or incorporation, however, indicates failure of a reaction center. A persistent failure of
`
release or incorporation consists of 2-20 cycles, preferably 3-10 cycles, more preferably 3-5
`
`cycles, wherein the presence of a labeled-caged nucleotide is detected during the second
`
`detection step, indicating that the reaction center was not successfully uncaged. The
`
sequencing cycle outlined above is repeated until a large proportion of reaction centers fail.
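The cycle outlined above can be sketched as a small simulation. This is an illustrative model under stated assumptions, not the apparatus control software: the class `ReactionCenter`, the functions `run_cycle` and `sequence`, the probabilities `p_incorporate` and `p_uncage`, and the stopping fraction `stop_failed_fraction` are all hypothetical names and values. A center is marked as failed after `max_failures` consecutive cycles in which incorporation or uncaging did not succeed (the text above prefers 3-5 cycles).

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


@dataclass
class ReactionCenter:
    template: str                    # template bases in the order they are copied
    position: int = 0                # next template base to be copied
    label: Optional[str] = None      # labeled-caged terminator currently present
    consecutive_failures: int = 0
    failed: bool = False
    read: List[str] = field(default_factory=list)


def run_cycle(center, rng, p_incorporate, p_uncage, max_failures):
    """One incorporate / detect / uncage / verify cycle for a single center."""
    if center.failed or center.position >= len(center.template):
        return
    cycle_ok = True
    if center.label is None:
        # Step 1: attempt to incorporate one labeled-caged terminator.
        if rng.random() < p_incorporate:
            center.label = COMPLEMENT[center.template[center.position]]
            # Step 2: first detection records which base was incorporated.
            center.read.append(center.label)
            center.position += 1
        else:
            cycle_ok = False
    # Step 3: irradiate to uncage any terminator that is present.
    if center.label is not None and rng.random() < p_uncage:
        center.label = None
    # Step 4: second detection; a label that is still present means
    # the center was not successfully uncaged.
    if center.label is not None:
        cycle_ok = False
    center.consecutive_failures = 0 if cycle_ok else center.consecutive_failures + 1
    if center.consecutive_failures >= max_failures:
        center.failed = True


def sequence(templates, p_incorporate=0.95, p_uncage=0.95, max_failures=3,
             stop_failed_fraction=0.5, max_cycles=1000, seed=0):
    """Run cycles until the templates are exhausted or too many centers fail."""
    rng = random.Random(seed)
    centers = [ReactionCenter(t) for t in templates]
    for _ in range(max_cycles):
        active = [c for c in centers
                  if not c.failed and c.position < len(c.template)]
        failed = sum(c.failed for c in centers)
        if not active or failed / len(centers) >= stop_failed_fraction:
            break
        for c in active:
            run_cycle(c, rng, p_incorporate, p_uncage, max_failures)
    return ["".join(c.read) for c in centers]
```

With perfect incorporation and uncaging, `sequence(["ACGT"], p_incorporate=1.0, p_uncage=1.0)` returns the complementary read `["TGCA"]`. Setting `p_uncage=0.0` shows the failure rule: the center records one base and is then retired after three unsuccessful cycles.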
`
`
`The differentially-labeled nucleotides used in the sequencing methods of the present invention
`
`have a detachable labeling group and are blocked at the 3' portion with a detachable blocking
`
`group. In a preferred embodiment, the labeling group is directly attached to the detachable
`
`3' blocking group. Uncaging of the nucleotides can be accomplished enzymatically,
`
`chemically, or preferably photolytically, depending on the detachable linker used to link the
`
`labeling group and the 3' blocking group to the nucleotide.
`
`In another preferred embodiment, the labeling group is attached to the base of each nucleotide
`
`with a detachable linker rather than to the detachable 3' blocking group. The labeling group
`
`and the 3' blocking group can be removed enzymatically, chemically, or photolytically.
`
Alternatively, the labeling group can be removed by a different method than the 3' blocking
`
`group. For example, the labeling group can be removed enzymatically while the 3' blocking
`
`group is removed chemically, or by photochemical activation.
`
`Many independent reactions occur simultaneously within the reaction chamber, each individual
`
`reaction center generating a few hundred, or thousands, of base pairs. This apparatus has the
`
`capacity to sequence in parallel thousands and possibly millions of separate templates from
`
`either specified or random sequence points. The combined sequence from each run is on the
`
`order of several million base-pairs of sequence and does not require amplification, prior
`
`knowledge of a portion of the target sequence, or resolution of fragments on gels or capillaries.
`
`Simple DNA preparations from any source can be sequenced with the apparatus and methods
`
`of the present invention.
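The throughput figures above follow from a simple product of array size and read length. The sketch below is only a back-of-the-envelope aid; the function name `bases_per_run` and the `active_fraction` parameter (the share of reaction centers that yield usable sequence) are assumptions introduced for illustration.

```python
def bases_per_run(reaction_centers: int, bases_per_center: int,
                  active_fraction: float = 1.0) -> int:
    """Estimate the combined sequence yield of one run, in bases.

    Assumption: every active reaction center contributes the same
    number of bases; real runs would show a distribution of read lengths.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must lie in [0, 1]")
    return round(reaction_centers * bases_per_center * active_fraction)


if __name__ == "__main__":
    # Ten thousand centers reading 500 bases each give five million bases.
    print(bases_per_run(10_000, 500))
```

For example, 10,000 reaction centers each producing 500 bases would give 5,000,000 bases per run, which is consistent with the "several million base-pairs" stated above.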
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 (Panels A-C) is a schematic representation of labeled-caged terminator nucleotides for
`
`use in direct nucleic acid sequencing. Panel A depicts a deoxyadenosine triphosphate modified
`
`by attachment of a photolabile linker-fluorochrome conjugate to the 3' carbon of the ribose.
`
`Panel B depicts an alternative configuration, wherein the fluorochrome is attached to the base
`
of the nucleotide by way of a photolabile linker. Panel C depicts the four different nucleotides
`
`each labeled with a fluorochrome with distinct spectral properties, which permits the four
`
`nucleotides to be distinguished during the detection phase of a direct nucleic acid sequencing
`
`reaction cycle.
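Distinguishing the four nucleotides by their fluorochromes amounts to choosing, for each reaction center, the spectral channel with the dominant signal. The following is a hedged sketch of one possible decision rule, not a rule taken from this disclosure: the function `call_base`, the background `threshold`, and the `min_ratio` separation between the two brightest channels are hypothetical.

```python
from typing import Mapping, Optional


def call_base(intensities: Mapping[str, float], threshold: float,
              min_ratio: float = 2.0) -> Optional[str]:
    """Return the base whose channel dominates, or None if no clear call.

    `intensities` maps each base ("A", "C", "G", "T") to the signal
    measured in the detection event for that base's fluorochrome.
    """
    if len(intensities) < 2:
        raise ValueError("need at least two channels to compare")
    ranked = sorted(intensities.items(), key=lambda kv: kv[1], reverse=True)
    (best_base, best), (_, second) = ranked[0], ranked[1]
    if best < threshold:
        return None          # nothing above background: no incorporation seen
    if second > 0 and best / second < min_ratio:
        return None          # two channels too close together: ambiguous
    return best_base
```

A call such as `call_base({"A": 120, "C": 8, "G": 5, "T": 10}, threshold=50)` returns `"A"`, while a center with no channel above the threshold returns `None` and would be treated as having no detectable incorporation in that cycle.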
`
`
`FIG. 2 is a schematic representation of the steps of one cycle of direct nucleic acid sequencing,
`
`wherein step 1 illustrates the incorporation of a labeled-caged nucleotide, step 2 illustrates the
`
detection of the label, and step 3 illustrates the unblocking of the 3'-OH cage.
`
`FIG. 3 is a schematic representation of a reaction center depicting an immobilized polymerase
`
`and a nucleic acid sample being sequenced.
`
`FIG. 4 is a schematic representation of the reaction chamber assembly that houses the array of
`
`DNAS reaction centers and mediates the exchange of reagents and buffer.
`
`FIG. 5 is a schematic representation of a reaction center array. The left side panel (Microscope
`
Field) depicts the view of an entire array as recorded by four successive detection events (one
`
`for each of the separate fluorochromes). The center panel depicts a magnified view of a part of
`
`the field showing the spacing of individual reaction centers. The far right panel depicts the
`
`camera's view of a single reaction center.
`
`FIG. 6 is a schematic representation of the principle of the evanescent wave.
`
`FIG. 7 is a schematic representation of a direct nucleic acid sequencing set up using total
`
`internal reflection fluorescence microscopy.
`
`FIG. 8 is a schematic representation of an example of a data acquisition algorithm obtained
`
from a 3 x 3 matrix.
`
`DETAILED DESCRIPTION OF THE INVENTION
`
`The present invention provides a novel sequencing apparatus and a novel sequencing method.
`
`The method of the present invention, referred to herein as Direct Nucleic Acid Sequencing
`
`(DNAS), offers a rapid, cost effective, high throughput method by which nucleic acid
`
`molecules from any source can be readily sequenced without the need for prior amplification.
`
`DNAS can be used to determine the nucleotide sequence of numerous single nucleic acid
`
`molecules in parallel.
`
1. DNAS Reaction Center Array
`
`Polymerases are attached to the solid support, spaced at regular intervals, in an array of
`
`reaction centers, present at a periodicity greater than the optical resolving power of the
`
`microscope system. Preferably, only one polymerase molecule is present in each reaction
`
`center, and each reaction center is located at an optically resolvable distance from the other
`
`
`reaction centers. Sequencing reactions preferably occur in a thin aqueous reaction chamber
`
`comprising a sealed cover slip and an optically transparent solid support.
`
`Immobilization of polymerase molecules for use in nucleic acid sequencing has been disclosed
`
by Densham in PCT application WO 99/05315. Densham describes the attachment of selected

amino groups within the polymerase to a dextran or N-hydroxysuccinimide ester-activated

surface. WO 99/05315; EP-A-0589867; Lofas et al., Biosens. Bioelectron. 10: 813-822

(1995). These techniques can be modified in the present invention to ensure that the activated
`
`area is small enough so that steric hindrance will prevent the attachment of more than one
`
`polymerase at any given spot in the array.
`
`The array of reaction centers containing a single polymerase molecule is constructed using
`
`lithographic techniques commonly used in the construction of electronic integrated circuits.
`
This methodology has been used in the art to construct microscopic arrays of
`
`oligodeoxynucleotides and arrays of single protein motors. See, e.g., Chee et al., Science 274:
`
`610-614 (1996); Fodor et al., Nature 364: 555-556 (1993); Fodor et al., Science 251: 767-773
`
`(1991); Gushin, et al., Anal. Biochem. 250: 203-211 (1997); Kinosita et al., Cell 93: 21-24
`
`(1998); Kato-Yamada et al., J Biol. Chem. 273: 19375-19377 (1998); and Yasuda et al., Cell
`
`93: 1117-1124 (1998). Using techniques such as photolithography and/or electron beam
`
lithography [Rai-Choudhury, Handbook of Microlithography, Micromachining, and
`
`Microfabrication, Volume I: Microlithography, Volume PM39, SPIE Press (1997); Service,
`
`Science 283: 27-28 (1999)], the substrate is sensitized with a linking group that allows
`
`attachment of a single modified protein. Alternatively, an array of sensitized sites can be
`
`generated using thin-film technology such as Langmuir-Blodgett. See, e.g., Zasadzinski et al.,
`
`Science 263: 1726-1733 (1994).
`
`The regular spacing of proteins is achieved by attachment of the protein to these sensitized
`
`sites on the substrate. Polymerases containing the appropriate tag are incubated with the
`
`sensitized substrate so that a single polymerase molecule attaches at each sensitized site. The
`
`attachment of the polymerase can be achieved via a covalent or non-covalent interaction.
`
Examples of such linkages common in the art include Ni2+/hexahistidine, streptavidin/biotin,

avidin/biotin, glutathione S-transferase (GST)/glutathione, monoclonal antibody/antigen, and

maltose binding protein/maltose.
`
`
`A schematic representation of a reaction center is presented in FIG. 3. A DNA polymerase
`
`(e.g., from Thermus aquaticus) is attached to a glass microscope slide. Attachment is mediated
`
by a hexahistidine tag on the polymerase, bound by strong non-covalent interaction to a Ni2+

atom, which is, in turn, held to the glass by nitrilotriacetic acid and a linker molecule. The
`
`nitrilotriacetic acid is covalently linked to the glass by a linker attached by silane chemistry.
`
`The silane chemistry is limited to small diameter spots etched at evenly spaced intervals on the
`
`glass by electron beam lithography or photolithography.
`
In addition to the attached polymerase, the reaction center includes the template DNA molecule and an oligonucleotide
`
`primer both bound to the polymerase. The glass slide constitutes the lower slide of the DNAS
`
`reaction chamber.
`
Housing the array of DNAS reaction centers and mediating the exchange of reagents and buffer
`
`is the reaction chamber assembly. An example of DNAS reaction chamber assembly is
`
`illustrated in FIG. 4. The reaction chamber is a sealed compartment with transparent upper and
`
`lower slides. The slides are held in place by a metal or plastic housing, which may be
`
`assembled and disassembled to allow replacement of the slides. There are two ports that allow
`
access to the chamber. One port allows the input of buffer (and reagents) and the other port

allows buffer (and reaction products) to be withdrawn from the chamber. The lower slide
`
`carries the reaction center array. In addition, a prism is attached to the lower slide to direct
`
`laser light into the lower slide at such angle as to produce total internal reflection of the laser
`
`light within the lower slide. This arrangement allows an evanescent wave to be generated over
`
the reaction center array. A high numerical aperture objective lens is used to focus the image
`
`of the reaction center array onto the digital camera system. The reaction chamber housing can
`
`be fitted with heating and cooling elements, such as a Peltier device, to regulate the
`
`temperature of the reactions.
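The geometry of total internal reflection described above can be estimated from standard optics. The sketch below uses the textbook relations for the critical angle, arcsin(n_buffer / n_slide), and for the intensity decay depth of the evanescent wave, wavelength / (4 pi sqrt(n_slide^2 sin^2(theta) - n_buffer^2)). The refractive indices (1.52 for glass, 1.33 for aqueous buffer), the 488 nm wavelength, and the 65 degree incidence angle are illustrative assumptions, not values given in this disclosure.

```python
import math


def critical_angle_deg(n_slide: float, n_buffer: float) -> float:
    """Smallest incidence angle (from the normal) giving total internal reflection."""
    if n_buffer >= n_slide:
        raise ValueError("total internal reflection requires n_slide > n_buffer")
    return math.degrees(math.asin(n_buffer / n_slide))


def penetration_depth_nm(wavelength_nm: float, incidence_deg: float,
                         n_slide: float, n_buffer: float) -> float:
    """Depth at which evanescent-wave intensity falls to 1/e of its surface value."""
    s = (n_slide * math.sin(math.radians(incidence_deg))) ** 2 - n_buffer ** 2
    if s <= 0:
        raise ValueError("incidence angle is not above the critical angle")
    return wavelength_nm / (4 * math.pi * math.sqrt(s))


if __name__ == "__main__":
    print(round(critical_angle_deg(1.52, 1.33), 1))
    print(round(penetration_depth_nm(488.0, 65.0, 1.52, 1.33), 1))
```

With these assumed values the critical angle is about 61 degrees and the excitation field extends on the order of 100 nm into the buffer, which is why only fluorochromes held at the reaction centers on the lower slide are excited efficiently.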
`
`By fixing the site of nucleotide incorporation within the optical system, sequence information
`
`can be obtained from many distinct nucleic acid molecules simultaneously. A diagram of the
`
`DNAS reaction center array is given in FIG. 5. As described above, each reaction center is
`
attached to the lower slide of the reaction chamber. Depicted in the left side panel (Microscope

Field) is the view of an entire array as recorded by four successive detection events (one for

each of the separate fluorochromes). The center panel is a magnified view of a part of the field
`
`showing the spacing of individual reaction centers. Finally, the far right panel depicts the
`
`camera's view of a single reaction center. Each reaction center is assigned 100 pixels to ensure
`
`
that it is truly isolated. The imaging area of a single pixel relative to the 1 µm × 1 µm area
`
`allotted to each reaction center is shown. The density of reaction centers is limited by the
`
`optical resolution of the microscope system. Practically, this means that reaction centers must
`
`be separated by at least 0.2 µm to be detected as distinct sites.
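The pixel allocation above implies a simple relation between camera size and array capacity. The sketch below assumes that the 100 pixels assigned to each reaction center form a 10 x 10 square, so that each pixel images 0.1 µm x 0.1 µm of the 1 µm x 1 µm area; the 1024 x 1024 sensor in the example is a hypothetical camera, not one specified here.

```python
def pixel_size_um(center_side_um: float = 1.0, px_per_center_side: int = 10) -> float:
    """Side length of the area imaged by one pixel, in micrometres."""
    return center_side_um / px_per_center_side


def centers_per_field(sensor_px_x: int, sensor_px_y: int,
                      px_per_center_side: int = 10) -> int:
    """Number of whole reaction centers that fit in one camera field.

    Assumption: each center occupies a square block of pixels, here
    10 x 10 = 100 pixels as in the allocation described above.
    """
    return (sensor_px_x // px_per_center_side) * (sensor_px_y // px_per_center_side)


if __name__ == "__main__":
    print(pixel_size_um())                  # side of one pixel's imaging area
    print(centers_per_field(1024, 1024))    # centers visible in one field
```

Under these assumptions a 1024 x 1024 pixel sensor would image 102 x 102 = 10,404 reaction centers per field. The 1 µm pitch is well above the 0.2 µm minimum separation, so the limit in this example is the pixel budget rather than the optical resolution.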
`
2. Enzyme Selection
`
`In general, any macromolecule which catalyzes formation of a polynucleotide sequence can be
`
`used as the polymerase. In some embodiments, the polymerase can be an enzymatic complex
`
`that: 1) promotes the association (e.g., by hydrogen bonding or base-pairing) of a tag (e.g., a
`
`normal or modified nucleotide, or any compound capable of specific association with
`
`complementary template nucleotides) with the complementary template nucleotide in the active
`
site; 2) catalyzes the formation of a covalent linkage between the tag and the synthetic strand or
`
`primer; and 3) translates the active site to the next template nucleotide.
`
`While the polymerases will typically be proteinaceous enzymes, it will be obvious to one of
`
`average skill in the art that the polymerase activity need not be associated with a proteinaceous
`
`enzyme. For example, the polymerase may be a nucleic acid itself, as in the case of ribozymes
`
`or DNA-based enzymes.
`
`A large selection of proteinaceous enzymes is available for use in the present invention. For
`
`example, the polymerase can be an enzyme such as a DNA-directed DNA polymerase, an
`
RNA-directed DNA polymerase, a DNA-directed RNA polymerase, or an RNA-directed RNA
`
`polymerase. Some polymerases are multi-subunit replication systems made up of a core
`
`enzyme and associated factors that enhance the activity of the core (e.g., they increase
`
`processivity or fidelity of the core subunit). The enzyme must be modified in order to link it to
`
`the support. The enzyme can be cloned by techniques well known in the art, to produce a
`
`recombinant protein with a suitable linkage tag. In a preferred embodiment this linkage is a
`
`hexahistidine tag, which permits strong binding to nickel ions on the solid support. Preferred
`
`enzymes are highly processive, i.e., they remain associated with the template nucleotide
`
sequence for a succession of nucleotide additions, and are able to maintain a

polymerase-polynucleotide complex even when not actively synthesizing. Additionally, preferred
`
`polymerases are capable of incorporating 3'-modified nucleotides. Sufficient quantities of an
`
`enzyme are obtained using standard recombinant techniques known in the art. See, for
`
example, Dabrowski and Kur, Protein Expr. Purif. 14: 131-138 (1998).
`
`
2.1 DNA Polymerase
`
In a preferred embodiment, sequencing is done with a DNA-dependent DNA polymerase.
`
`DNA-dependent DNA polymerases catalyze the polymerization of deoxynucleotides to form
`
`the complementary strand of a primed DNA template. Examples of DNA-dependent DNA
`
polymerases include, but are not limited to, the DNA polymerase from Bacillus

stearothermophilus (Bst), the E. coli DNA polymerase I Klenow fragment, E. coli DNA
`
`polymerase III holoenzyme, the bacteriophage T4 and T7 DNA polymerases, and those from
`
Thermus aquaticus (Taq), Pyrococcus furiosus (Pfu), and Thermococcus litoralis (Vent). The

polymerase from T7 gene 5 can also be used when complexed to thioredoxin. Tabor et al.,

J. Biol. Chem. 262: 1612-1623 (1987). The Bst DNA polymerase is preferred because it has

been shown to efficiently incorporate 3'-O-(2-nitrobenzyl)-dATP into a growing DNA chain,

is highly processive, very stable, and lacks 3'-5' exonuclease activity. The coding sequence of
`
`this enzyme has been determined. See U.S. Patent Nos. 5,830,714 and 5,814,506, incorporated
`
`herein by reference.
`
`In an alternative preferred embodiment where RNA is used as template, the selected
`
`DNA-dependent DNA polymerase functions as an RNA-dependent DNA polymerase, or
`
`reverse transcriptase. For example, the DNA polymerase from Thermus thermophilus (Tth)
`
`has been reported to function as an RNA-dependent DNA polymerase, or reverse transcriptase,
`
under certain conditions. See Meyers and Gelfand, Biochem. 30: 7661-7666 (1991). Thus, the
`
`Tth DNA polymerase is linked to the substrate and the sequencing reaction is conducted under
`
conditions where this enzyme will sequence an RNA template, thereby producing a

complementary DNA strand.
`
`In some embodiments, a polymerase subunit or fragment is attached to the support, and other
`
`necessary subunits or fragments are added as part of a complex with the sample to be
`
`sequenced. This approach is useful for polymerase systems that involve a number of different
`
`replication factors. For example, to use the bacteriophage T4 replication system for DNAS
`
sequencing, the gp43 polymerase can be attached to the support. Other replication factors,

such as the clamp loader (gp44/62) and sliding clamp (gp45), can be added with the nucleic

acid template in order to increase the processivity of the replication system. A similar

approach can be used with the E. coli polymerase III system, where the polymerase core is

immobilized in the array and the β-dimer subunit (sliding clamp) and τ and γ subassembly

(clamp loader) are added to the nucleic acid sample prior to DNAS sequencing. Additionally,
`
`
this approach can be used with eukaryotic DNA polymerases (e.g., α or δ) and the
`
`corresponding PCNA (proliferating cell nuclear antigen). In some embodiments, the sliding
`
`clamp is the replication factor that is attached in the array and the polymerase moiety is added
`
`in conjunction with the nucleic acid sample.
`
2.2 Reverse Transcriptase
`
`A reverse transcriptase is an RNA-dependent DNA polymerase - an enzyme that produces a
`
`DNA strand complementary to an RNA template. In an alternative preferred embodiment, a
`
`reverse transcriptase enzyme is attached to the support for use in sequencing RNA molecules.
`
`This permits the sequencing of RNAs taken directly from tissues, without prior reverse
`
`transcription. Examples of reverse transcriptases include, but are not limited to, reverse
`
transcriptase from Avian Myeloblastosis Virus (AMV), Moloney Murine Leukemia Virus, and

Human Immunodeficiency Virus-1 (HIV-1). HIV-1 reverse transcriptase is particularly
`
`preferred because it is well characterized both structurally and biochemically. See, e.g.,
`
`Huang, et al., Science 282: 1669-1675 (1998).
`
`In an alternative preferred embodiment, the immobilized reverse transcriptase functions as a
`
`DNA-dependent DNA polymerase, thereby producing a DNA copy of the sample or target
`
`DNA template strand.
`
2.3 RNA Polymerase
`
In yet another alternative preferred embodiment, a DNA-dependent RNA polymerase is
`
`attached to the support, and uses labeled-caged ribonucleotides to generate an RNA copy of the
`
`sample or target DNA strand being sequenced. Preferred examples of these enzymes include,
`
`but are not limited to, RNA polymerase from E. coli [Yin, et al., Science 270: 1653-1657
`
`(1995)] and RNA polymerases from the bacteriophages T7, T3, and SP6.
`
In an alternative preferred embodiment, a modified T7 RNA polymerase functions as a DNA-dependent DNA

polymerase. This RNA polymerase is attached to the support and uses labeled-caged

deoxyribonucleotides to generate a DNA copy of a DNA template. See, e.g., Izawa et al., J.
`
`Biol. Chem. 273: 14242-14246 (1998).
`
2.4 RNA Dependent RNA Polymerase
`
Many viruses employ RNA-dependent RNA polymerases in their life-cycles. In a preferred

embodiment, an RNA-dependent RNA polymerase is attached to the support, and uses labeled-caged

ribonucleotides to generate an RNA copy of a sample RNA strand being sequenced.
`
Preferred examples of these enzymes include, but are not limited to, RNA-dependent RNA

polymerases from the viral families: bromoviruses, tobamoviruses, tombusviruses, leviviruses,

hepatitis C-like viruses, and picornaviruses. See, e.g., Huang et al., Science 282: 1668-1675

(1998); Lohmann et al., J. Virol. 71: 8416-8428 (1997); Lohmann et al., Virology 249: 108-118
`
`(1998), and O'Reilly and Kao, Virology 252: 287-303 (1998).
`
3. Sample Preparation
`
`The nucleic acid to be sequenced can be obtained from any source. Example nucleic acid
`
`samples to be sequenced include double-stranded DNA, single-stranded DNA, DNA from
`
`plasmid, first strand cDNA, total genomic DNA, RNA, cut/end-modified DNA (e.g., with
`
RNA polymerase promoter), and in vitro transposon-tagged DNA (e.g., random insertion of RNA
`
`polymerase promoter). The target or sample nucleic acid to be sequenced is preferably sheared
`
`(or cut) to a certain size, and annealed with oligodeoxynucleotide primers using techniques
`
`well known in the art. Preferably, the sample nucleic acid is denatured, neutralized and
`
`precipitated and then diluted to an appropriate concentration, mixed with oligodeoxynucleotide
`
`primers, heated to 65°C and then cooled to room temperature in a suitable buffer. The nucleic
`
`acid is then added to the reaction chamber after the polymerase has been immobilized on the
`
support or, alternatively, is combined with the polymerase prior to the immobilization step.
`
3.1 In vitro transposon tagging of template DNA
`
In an alternative preferred embodiment, purified transposases and transposable element tags
`
`will be used to randomly insert specific sequences into template double stranded DNA. In one
`
`configuration the transposable element contains the promoter