`Rosenthal et al.
`
`[54] DNA SEQUENCING METHOD
`
`[75]
`
`Inventors: Andre Rosenthal; Sydney Brenner,
`both of Cambridge, United Kingdom
`
`[73] Assignee: Medical Research Council, London,
`United Kingdom
`
`[ *] Notice:
`
`This patent is subject to a terminal disclaimer.
`
`[21] Appl. No.:
`
`08/325,224
`
`[22] PCT Filed:
`
`Apr. 22, 1993
`
`[86] PCT No.:
`
`PCT/GB93/00848
`
`§ 371 Date:
`
`Dec. 9, 1994
`
`§ 102(e) Date: Dec. 9, 1994
`
`[87] PCT Pub. No.: WO93/21340
`
`PCT Pub. Date: Oct. 28, 1993
`
`[30]
`
`Foreign Application Priority Data
`
`Apr. 22, 1992
`
`[GB]
`
`United Kingdom ................... 9208733
`
`[51]
`
`Int. Cl.7 .............................. C12Q 1/68; C12P 19/34;
`C07N 21/00
`[52] U.S. Cl. ............................. 435/6; 435/41; 435/172.1;
`536/24.33; 536/25.3
`[58] Field of Search ............................... 435/6, 41, 172.1;
`935/76, 77, 78; 536/23.1, 24.33, 25.3, 25.32
`
`[56]
`
`References Cited
`
`FOREIGN PATENT DOCUMENTS
`
`WO 90/13666 11/1990 WIPO .
`WO 91/06678
`5/1991 WIPO .
`
`OTHER PUBLICATIONS
`
`Drmanac et al., "Sequencing of Megabase Plus DNA by
`Hybridization: ... ", Genomics 4:114-128, 1989.
`Salmeron et al., "Imaging of Biomolecules with the Scanning
`Tunneling Microscope: ... ", J. Vac. Sci. Tech. 8:635,
`Jan./Feb. 1990.
`
`US006087095A
`[11] Patent Number: 6,087,095
`[45] Date of Patent: *Jul. 11, 2000
`
`Maxam et al., "A new method for sequencing DNA", Proc.
`Natl. Acad. Sci. USA 74:560-564, Feb. 1977.
`
`Drmanac et al., "Reliable Hybridization of Oligonucleotides
`as Short as Six Nucleotides", DNA and Cell Biology 9:527,
`Nov. 1990.
`
`Sanger et al., "DNA sequencing with chain-terminating
`inhibitors", Proc. Natl. Acad. Sci. USA 74:5463-5467, Dec.
`1977.
`Bains et al., "A Novel Method for Nucleic Acid Sequence
`Determination", J. Theor. Biol. 135:303-307, 1988.
`Driscoll et al., "Atomic-scale imaging of DNA using scanning
`tunnelling microscopy", Nature 346:294, Jul. 1990.
`E.D. Hyman, "A New Method of Sequencing DNA", Analytical
`Biochemistry 174:423-436, 1988.
`Khrapko et al., "An oligonucleotide hybridization approach
`to DNA sequencing", FEBS 256:118-122, Oct. 1989.
`
`P.A. Pevzner, "l-Tuple DNA Sequencing: Computer Analysis",
`J. Biom. Str. & Dyn. 7:63, 1989.
`
`Jett et al., "High-Speed DNA Sequencing: An Approach
`Based Upon Fluorescence Detection ... ", J. Biom. Str. &
`Dyn. 7:301, 1989.
`
`Lindsay et al., Genet. Anal. Tech. Appl., 8:8, 1991.
`
`Nguyen et al., Anal. Chem., 56:348, 1987.
`
`Maskos et al., Cold Spring Harbor Symposium on Genome
`Mapping and Sequencing, Abstracts, p. 143, 1991.
`
`Allison et al., Scanning Microsc. 4:517, 1990.
`
`Primary Examiner-Nancy Degen
`Assistant Examiner-Sean McGarry
`Attorney, Agent, or Firm-Fulbright & Jaworski, LLP
`
`[57]
`
`ABSTRACT
`
`The invention is drawn to a method of DNA sequencing
`using labeled nucleotides that do not act as chain elongation
`inhibitors where the label is removed or neutralized for the
`sequential addition of non-labeled nucleotides.
`
`16 Claims, 1 Drawing Sheet
`
`Columbia Ex. 2005
`Illumina, Inc. v. The Trustees
`of Columbia University
`in the City of New York
`IPR2020-00988
`
`
`
`U.S. Patent          Jul. 11, 2000          6,087,095
`
`FIG. 1 (plot): fluorescence ratio (y-axis, 1.0 to 2.5) against number of U-F nucleotides (x-axis, 0 to 5).
`
`FIG. 2 (plot): fluorescence ratio (y-axis, 1.2 to 2.0) against number of U-F nucleotides (x-axis, 0 to 6).
`
`
`
`DNA SEQUENCING METHOD
`
`This application is a 371 of PCT/GB93/00848 filed Apr.
`22, 1993.
`The present invention relates to a method for sequencing
`DNA. In particular, the present invention concerns a method
`for the automated sequencing of large fragments of DNA.
`DNA sequence analysis has become one of the most
`important tools available to the molecular biologist. Current
`sequencing technology allows sequence data to be obtained
`from virtually any DNA fragment. This has allowed not only
`the sequencing of entire genes and other genomic sequences
`but also the identification of the sequence of RNA
`transcripts, by the sequencing of cDNA. Currently, emphasis
`is being placed on genomic sequencing in order to determine
`the DNA sequence of entire genomes. Ultimately, it is hoped
`that the sequence of the human genome will be deciphered.
`Traditional DNA sequencing techniques share three
`essential steps in their approaches to sequence determination.
`Firstly, a multiplicity of DNA fragments are generated
`from a DNA species which it is intended to sequence. These
`fragments are incomplete copies of the DNA species to be
`sequenced. The aim is to produce a ladder of DNA
`fragments, each a single base longer than the previous one.
`This can be achieved by selective chemical degradation of
`multiple copies of the DNA species to be sequenced, as in
`the Maxam and Gilbert method (A. Maxam and W. Gilbert,
`PNAS 74, p. 560, 1977). Alternatively, the DNA species can
`be used as a template for a DNA polymerase to produce a
`number of incomplete clones, as in the Sanger method (F.
`Sanger, S. Nicklen and A. Coulson, PNAS 74, p. 5463,
`1977). These fragments, which differ in respective length by
`a single base, are then separated on an apparatus which is
`capable of resolving single-base differences in size. A thin
`polyacrylamide gel is invariably used in this process. The
`third and final step is the determination of the nature of the
`base at the end of each fragment. When ordered by the size
`of the fragments which they terminate, these bases represent
`the sequence of the original DNA species.
`Determination of the nature of each base is achieved by
`previously selecting the terminal base of each fragment. In
`the Sanger method, for example, dideoxy nucleoside triphosphates
`(ddNTPs) are used to selectively terminate growing
`DNA clones at an A, C, G or T residue. This means that four
`separate reactions need to be performed for each sequencing
`exercise, each in a separate tube using a different ddNTP. In
`one tube, therefore, each labelled fragment will terminate
`with an A residue, while in the next tube with a C residue,
`and so on. Separation of each group of fragments side-by-side
`on a polyacrylamide gel will show the sequence of the
`template by way of the relative size of the individual
`fragments.
`In the Maxam and Gilbert method, on the other hand, the
`selectivity is achieved during the chemical degradation
`process. Chemicals are used which cleave DNA strands at A
`only, C only, G and A or T and C. Use of limiting concentrations
`of such chemicals allows partial digestion of the
`DNA species. As in the Sanger method, four separate
`reactions must be performed and the products separated
`side-by-side on a polyacrylamide gel.
`The disadvantages of these prior art methods are numerous.
`They require a number of complex manipulations to be
`performed, in at least four tubes. They are susceptible to
`errors due to the formation of secondary structures in DNA,
`or other phenomena that prevent faithful replication of a
`DNA template in the Sanger method or which cause base-specificity
`to be lost by the chemical reactants of the Maxam
`and Gilbert method. The most serious problems, however,
`are caused by the requirement for the DNA fragments to be
`size-separated on a polyacrylamide gel. This process is
`time-consuming, uses large quantities of expensive
`chemicals, and severely limits the number of bases which
`can be sequenced in any single experiment, due to the
`limited resolution of the gel. Furthermore, reading the gels
`in order to extract the data is labour-intensive and slow.
`A number of improvements have been effected to these
`sequencing methods in order to improve the efficiency and
`speed of DNA sequencing. Some of these improvements
`have related to the sequencing reaction itself. For example,
`improved polymerase enzymes have been introduced which
`lead to greater precision in the Sanger method, such as
`Sequenase® and Taquenase®. Improved reagents have not,
`however, significantly affected the speed of sequence data
`generation or significantly simplified the sequencing process.
`In the interest of both speed and simplicity, a number of
`"Automated Sequencers" have been introduced in recent
`years (reviewed in T. Hunkapiller, R. Kaiser, B. Koop and L.
`Hood, Science, 254, p. 59, 1991). These machines are not,
`however, truly automatic sequencers. They are merely automatic
`gel readers, which require the standard sequencing
`reactions to be carried out before samples are loaded onto
`the gel. They do provide a slight increase in speed, however,
`due to faster reading of the gels and collation of the data
`generated into computers for subsequent analysis.
`Many automated sequencers exploit recent developments
`which have been made in labelling technology.
`Traditionally, radioactive labels in the form of 32P or 35S
`have been used to label each DNA fragment. Recently,
`however, fluorophores have gained acceptance as labels.
`These dyes, attached either to the sequencing primer or to
`nucleotides, are excited to a fluorescent state on the polyacrylamide
`gel by a laser beam. An automated sequencer,
`therefore, can detect labelled fragments as they pass under
`a laser in a reading area. Use of dyes which fluoresce at
`different wavelengths allows individual labelling of A, G, C
`and T residues, which permits the products of all four
`sequencing reactions to be run in a single lane of the gel.
`Even incorporating such refinements, however, automated
`sequencers can still produce no more than about 100
`kb of finished sequence per person per year. At this rate, it
`would take one person 73,000 years to sequence the human
`genome.
`Clearly, if the aim of sequencing the human genome is to
`be achieved, current sequencing technology is entirely inadequate.
`In view of this, a few proposals have been made for
`alternative sequencing strategies which are not merely
`improvements of the old technology.
`One such method, sequencing by hybridisation (SBH),
`relies on the mathematical demonstration that the sequence
`of a relatively short (say, 100 kbp) fragment of DNA may be
`obtained by synthesising all possible N-mer oligonucleotides
`and determining which oligonucleotides hybridise to
`the fragment without a single mismatch (R. Drmanac, I.
`Labat, I. Bruckner and R. Crkvenjakov, Genomics, 4, p. 114,
`1989; R. Drmanac, Z. Stvanovic, R. Crkvenjakov, DNA Cell
`Biology, 9, p. 527, 1990; W. Bains and G. Smith, J. Theor.
`Biol., 135, pp. 303-307, 1988; K. R. Khrapko, et al., FEBS
`Lett., 256, pp. 118-122, 1989; P. A. Pevzner, J. Biomolecular
`Structure and Dynamics, 7, pp. 63-73, 1989; U. Maskos and
`E. M. Southern, Cold Spring Harbour Symposium on
`Genome Mapping and Sequencing, Abstracts, p. 143, 1991).
`N can be 8, 9 or 10, such sizes being a compromise between
`the requirement for reasonable hybridisation parameters and
`manageable library sizes.
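The library sizes behind that compromise follow from the four-letter alphabet: a complete N-mer library contains 4^N oligonucleotides. The sketch below computes those sizes for N = 8, 9 and 10 and, on a toy fragment, lists which probes of a complete 3-mer library would match without mismatch. The function names `library_size` and `spectrum` are illustrative and do not come from the patent, and hybridisation is modelled as exact substring matching on one strand, which is a simplifying assumption.

```python
from itertools import product


def library_size(n):
    """Number of distinct N-mers over the four-letter DNA alphabet."""
    return 4 ** n


def spectrum(fragment, n):
    """Set of N-mers occurring in the fragment (perfect matches only)."""
    return {fragment[i:i + n] for i in range(len(fragment) - n + 1)}


# Library sizes for the probe lengths mentioned in the text.
sizes = {n: library_size(n) for n in (8, 9, 10)}

# Toy example (hypothetical fragment): probe a complete 3-mer library.
toy_fragment = "ATGCGTAC"
probes = ["".join(p) for p in product("ACGT", repeat=3)]
toy_spectrum = spectrum(toy_fragment, 3)
hits = sorted(p for p in probes if p in toy_spectrum)

print(sizes)
print(hits)
```

Each step from N to N + 1 multiplies the library by four, which is why probe length is traded off against hybridisation behaviour.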
`
`
`
`The technique can be automated by attaching the oligonucleotides
`in a known pattern on a two-dimensional grid.
`The fragment to be sequenced is subsequently hybridised to
`the oligonucleotides on the grid and the oligonucleotides to
`which the sequence has been hybridised are detected using
`a computerised detector. Determination of the sequence of
`the DNA is then a matter of computation. However, errors
`arise from the difficulty in determining the difference
`between perfect matches and single base-pair mismatches.
`Repetitive sequences, which occur quite commonly in the
`human genome, can also be a problem.
`Another proposal involves the fluorescent detection of
`single molecules (J. Jett et al., J. Biomol. Struct. Dyn., 7, p.
`301, 1989; D. Nguyen et al., Anal. Chem., 56, p. 348, 1987).
`In this method, a single, large DNA molecule is suspended
`in a flow stream using light pressure from a pair of laser
`beams. Individual bases, each of which is labelled with a
`distinguishing fluorophore, are then cut from the end of the
`molecule and carried through a fluorescence detector by the
`flow stream.
`Potentially, this method could allow the accurate
`sequencing of a large number of base pairs (several
`hundred) per second. However, feasibility of this method is
`not yet proven.
`A third method is sequencing by scanning tunnelling
`microscopy (STM) (S. Lindsay et al., Genet. Anal. Tech.
`Appl., 8, p. 8, 1991; D. Allison et al., Scanning Microsc., 4,
`p. 517, 1990; R. Driscoll et al., Nature, 346, p. 294, 1990;
`M. Salmeron et al., J. Vac. Sci. Technol., 8, p. 635, 1990).
`This technique requires direct three-dimensional imaging of
`a DNA molecule using STM. Although images of the
`individual bases can be obtained, interpretation of these
`images remains very difficult. The procedure is as yet
`unreliable and the success rate is low.
`A fourth method involves the detection of the pyrophosphate
`group released as a result of the polymerisation
`reaction which occurs when a nucleotide is added to a DNA
`primer in a primer extension reaction (E. D. Hyman, Anal.
`Biochem., 174, p. 423, 1988). This method attempts to
`detect the addition of single nucleotides to a primer using the
`luciferase enzyme to produce a signal on the release of
`pyrophosphate. However, this method suffers a number of
`drawbacks, not least of which is that dATP is a substrate for
`luciferase and thus will always give a signal, whether it is
`incorporated into the chain or not. The added nucleotides are
`not labelled and no method is disclosed which will allow the
`use of labelled nucleotides.
`In summary, therefore, each of the new approaches to
`DNA sequencing described above, while solving some of
`the problems associated with traditional methods, introduces
`several problems of its own. In general, most of these
`methods are expensive and not currently feasible.
`There is therefore a need for a sequencing method which
`allows the rapid, unambiguous sequencing of DNA at low
`cost. The requirements for such a system are that:
`1. it should not be based on gel resolution of differently-sized
`oligomers;
`2. it should allow more rapid sequencing than present
`methods;
`3. it should allow several DNA clones to be processed in
`parallel;
`4. the cost of hardware should be reasonable;
`5. it should cost less per base of sequence than current
`technology; and
`6. it should be technically feasible at the present time.
`SUMMARY OF THE INVENTION
`The present invention provides such a sequencing system
`which comprises a method for the sequential addition of nucleotides
`to a primer on a DNA template.
`
`According to a first aspect of the present invention, there
`is provided a method for determining the sequence of a
`nucleic acid comprising the steps of:
`a) forming a single-stranded template comprising the
`nucleic acid to be sequenced;
`b) hybridising a primer to the template to form a template/
`primer complex;
`c) extending the primer by the addition of a single labelled
`nucleotide;
`d) determining the type of the labelled nucleotide added
`onto the primer;
`e) removing or neutralising the label; and
`f) repeating steps (c) to (e) sequentially and recording the
`order of incorporation of labelled nucleotides.
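Steps (c) to (f) can be pictured as a simple loop. The following is a minimal sketch under idealised assumptions: exactly one nucleotide is incorporated per successful addition, there is no misincorporation, and the label is always fully removed. The function name `sequence_template` and the trial order `"ACGT"` are illustrative choices, not taken from the patent. The template string lists bases in the order in which the polymerase meets them downstream of the primer.

```python
# Watson-Crick pairing used to decide whether an offered nucleotide fits.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_template(template, order="ACGT"):
    """Return (read, attempts) for an idealised run of steps (c) to (f).

    read     -- the nucleotides added to the primer, in order
    attempts -- how many single-nucleotide additions were tried in total
    """
    read = []
    attempts = 0
    for base in template:
        for nucleotide in order:
            attempts += 1                      # step (c): offer one labelled nucleotide
            if nucleotide == COMPLEMENT[base]:  # step (d): label detected on the primer
                read.append(nucleotide)        # step (f): record the incorporation
                break                          # step (e): label removed, next cycle
    return "".join(read), attempts


read, attempts = sequence_template("TACG")
print(read, attempts)
```

The attempt count illustrates why a base may take between one and four additions to identify when nucleotides are offered one at a time.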
`In the method of the invention, a single-stranded template
`is generated from a nucleic acid fragment which it is desired
`to sequence. Preferably, the nucleic acid is DNA. Part of the
`sequence of this fragment may be known, so that a specific
`primer may be constructed and hybridised to the template.
`Alternatively, a linker may be ligated to a fragment of
`unknown sequence in order to allow for hybridisation of a
`primer.
`The template may be linear or circular. Preferably, the
`template is bound to a solid-phase support. For example, the
`template may be bound to a pin, a glass plate or a sequencing
`chip. The provision of a solid phase template allows for the
`quick and efficient addition and removal of reagents, particularly
`if the process of the invention is automated.
`Additionally, many samples may be processed in parallel in
`the same vessel yet kept separate.
`Preferably, the template is attached to the solid support by
`means of a binding linker. For example, one of the commercially
`available universal primers can be ligated to the 5'
`end of the template or easily incorporated into one of the ends
`of the templates by the polymerase chain reaction.
`The binding linker may be attached to the solid support by
`means of a biotin/streptavidin coupling system. For
`example, the surface of the solid support may be derivatised
`by applying biotin followed by streptavidin. A biotinylated
`binding linker is then ligated to the template to bind it to the
`solid support or the biotinylated template generated by PCR
`is bound to the solid support.
`In an alternative embodiment, an unligated binding linker
`is bound to the solid support by the biotin/streptavidin
`system. The template is then hybridised to the binding linker.
`The binding linker may be a separate binding linker, which
`is not the sequencing primer. Alternatively, the binding
`linker may also function as the sequencing primer.
`Clearly, it is essential in the latter embodiment that the
`template should possess a region of complementarity with
`the binding linker bound to the support. Where the template
`is ligated to a linker, the complementarity may be provided
`by that linker. Alternatively, the binding linker may be
`complementary to a unique sequence within the template
`itself.
`Preferably the solid support is derivatised using a mask so
`as to allow high resolution packaging of the template(s) on
`the support. An array of template attachment areas can
`thereby be produced on a glass plate or sequencing chip,
`allowing parallel processing of a large number of different
`templates. Where pins are used as the solid support, a single
`pin is needed for each template. The single pins may be
`grouped into arrays. It is envisaged that an array of 100×100
`pins or attachment areas can be used, to allow the simultaneous
`processing of 10⁴ clones.
`The primer is extended by a DNA polymerase in the
`presence of a single labelled nucleotide, either A, C, G or T.
`
`
`
`Suitable DNA polymerases are, for example, Sequenase
`2.0®, T4 DNA polymerase or the Klenow fragment of DNA
`polymerase I as well as heat-stable polymerases such as Taq
`polymerase (for example Taquenase®) and Vent polymerase.
`In a manually operated procedure using a single template,
`the labelled nucleotides are used singly and sequentially in
`order to attempt to add that nucleotide to the primer. The
`nucleotide will add on to the primer when it is complementary
`to the next nucleotide in the template. It may take one,
`two, three or four steps before the appropriate labelled
`nucleotide is used. However, as soon as it is determined that
`a labelled nucleotide has been added onto the primer, step (e)
`can be performed.
`In an automated procedure, especially where a large
`number of templates are being sequenced simultaneously, in
`step (c) all four labelled nucleotides are used sequentially
`and it is merely noted which of the labelled nucleotides is
`added, that is it is determined whether it is the first, second,
`third or fourth labelled nucleotide which is added.
`It has been found that nonspecific end-addition and misincorporation
`of nucleotides can lead to background problems
`when the incorporation step has been repeated a
`number of times. These side reactions are mainly due to the
`fact that a single nucleotide is present, instead of all four
`nucleoside triphosphates. In fact, it has been observed that
`while it is possible to sequence certain templates by the
`sequential addition of single nucleotides in the absence of
`the other three, significant problems have been encountered
`with other templates, particularly those templates containing
`multiple base repeats, due to non-specific incorporation of a
`nucleotide which is caused by the polymerase effectively
`jumping over a non-complementary base.
`In order to ensure high accuracy of operation during the
`primer extension step, it has been found advantageous to
`carry out step (c) in the presence of chain elongation
`inhibitors.
`Chain elongation inhibitors are nucleotide analogues
`which either are chain terminators which prevent further
`addition by the polymerase of nucleotides to the 3' end of the
`chain by becoming incorporated into the chain themselves,
`or compete for incorporation without actually becoming
`incorporated. Preferably, the chain elongation inhibitors are
`dideoxy nucleotides. Where the chain elongation inhibitors
`are incorporated into the growing polynucleotide chain, it is
`essential that they be removed after incorporation of the
`labelled nucleotide has been detected, in order to allow the
`sequencing reaction to proceed using different labelled
`nucleotides. It has been found, as described below, that 3' to
`5' exonucleases such as, for example, exonuclease III, are
`able to remove dideoxynucleotides. This finding allows the
`use of dideoxynucleotides as chain elongation inhibitors to
`promote the accuracy of the polymerase in the sequencing
`method of the invention. Accuracy of the polymerase is
`essential if 10⁴ clones are to be processed simultaneously,
`since it is high polymerase accuracy which enables the
`sequencing reaction to be carried out on a single template
`instead of as four separate reactions.
`Alternatively, the chain elongation inhibitors may be
`deoxynucleoside 5'-[α,β-methylene]triphosphates. These
`compounds are not incorporated into the chain. Other nucleotide
`derivatives such as, for example, deoxynucleoside
`diphosphates or deoxynucleoside monophosphates may be
`used which are also not incorporated into the chain.
`It is furthermore envisaged that blocking groups on the 3'
`moiety of the deoxyribose group of the labelled nucleotide
`may be used to prevent nonspecific incorporation.
`
`Preferably, therefore, the labelled nucleotide is labelled by
`attachment of a fluorescent dye group to the 3' moiety of the
`deoxyribose group, and the label is removed by cleaving the
`fluorescent dye from the nucleotide to generate a 3' hydroxyl
`group. The fluorescent dye is preferably linked to the
`deoxyribose by a linker arm which is easily cleaved by
`chemical or enzymatic means.
`Evidently, when nucleotide analogue chain elongation
`inhibitors are used, only the analogues which do not correspond
`to the labelled nucleotide should be added. Such
`analogues are referred to herein as heterogenous chain
`elongation inhibitors.
`Label is ideally only incorporated into the template/
`primer complex if the labelled nucleotide added to the
`reaction is complementary to the nucleotide on the template
`adjacent the 3' end of the primer. The template is subsequently
`washed to remove any unincorporated label and the
`presence of any incorporated label determined. A radioactive
`label may be determined by counting or any other method
`known in the art, while fluorescent labels can be induced to
`fluoresce, for example by laser excitation.
`It will be apparent that any label known in the art to be
`suitable for labelling nucleic acids may be used in the
`present invention. However, the use of fluorescent labels is
`currently preferred, due to the sensitivity of detection systems
`presently available for such labels which do not
`involve the use of radioactive substances.
`Examples of fluorescently-labelled nucleotides currently
`available include fluorescein-12-dUTP, fluorescein-15-dCTP,
`fluorescein-15-dATP and fluorescein-15-dITP. It has
`proved very difficult to synthesise a suitable fluorescent
`guanosine compound, so an inosine compound is used in its
`place. Should a fluorescent guanosine compound become
`available, its use is envisaged in the present invention.
`It has been found advantageous to use a mixture of
`unlabelled and labelled nucleotides in the addition step.
`When a fluorescent label is used, in order to produce all
`possible extension products on a template possessing a run
`of a particular nucleotide, the following ratios were found to
`be approximately optimal:
`Fluorescein-15-dATP/dATP 500:1
`Fluorescein-15-dITP/dGTP 500:1
`Fluorescein-12-dUTP/dTTP 15:1
`Fluorescein-12-dCTP/dCTP 15:1.
`Preferably, therefore, the above ratios are used in connection
`with fluorescently-labelled nucleotides.
`By repeating the incorporation and label detection steps
`until incorporation is detected, the nucleotide on the template
`adjacent the 3' end of the primer may be identified.
`Once this has been achieved, the label must be removed
`before repeating the process to discover the identity of the
`next nucleotide. Removal of the label may be effected by
`removal of the labelled nucleotide using a 3'-5' exonuclease
`and subsequent replacement with an unlabelled nucleotide.
`Alternatively, the labelling group can be removed from the
`nucleotide. In a further alternative, where the label is a
`fluorescent label, it is possible to neutralise the label by
`bleaching it with laser radiation.
`If chain terminators or 3' blocking groups have been used,
`these should be removed before the next cycle can take
`place. Preferably, chain terminators are removed with a 3'-5'
`exonuclease. Preferably, exonuclease III is used. 3' blocking
`groups may be removed by chemical or enzymatic cleavage
`of the blocking group from the nucleotide.
`Where exonuclease III is used to remove the chain
`terminators, it is essential to prevent the exonuclease III
`from chewing back along the growing chain to remove
`nucleotides which have already been incorporated, or even
`the primer itself. Preferably, therefore, a nucleoside derivative
`which is resistant to removal by exonucleases is used to
`replace the labelled nucleotides. Advantageously, deoxynucleoside
`phosphorothioate triphosphates (dsNTPs) are
`used. Likewise, the primer preferably comprises one or more
`phosphorothioate nucleoside bases at its 3' end, which are
`incorporated during primer synthesis or an extra enzymatic
`capping step.
`It is known that deoxynucleoside phosphorothioate
`derivatives resist digestion by exonuclease III (S. Labeit et
`al., DNA, 5, p. 173, 1986). This resistance is, however, not
`complete and conditions should be adjusted to ensure that
`excess digestion and removal of phosphorothioate bases
`does not occur.
`For example, it has been found that the pH of the exoIII
`buffer used (50 mM Tris/HCl, 5 mM MgCl2) affects the
`extent of chewing back which occurs. Experiments carried
`out at pH 6.0, 7.0, 7.5, 8.0, 8.5, 9.0 and 10.0 (37° C.) reveal
`that pH 10.0 is the optimum with respect to the rate of
`reaction and specificity of exoIII. At this pH, the reaction was
`shown to go to completion in less than 1 minute with no
`detectable chewing back.
`Once the label and terminators/blocking groups have been
`removed, the cycle is repeated to discover the identity of the
`next nucleotide.
`In an alternative embodiment of the invention, steps (c)
`and (d) of the first aspect of the invention are repeated
`sequentially a plurality of times before removal or neutralisation
`of the label.
`The number of times the steps (c) and (d) can be repeated
`depends on the sensitivity of the apparatus used to detect
`when a labelled nucleotide has been added onto the primer.
`For instance, if each nucleotide is labelled with a different
`fluorescent label, the detection apparatus will need to be able
`to distinguish between each of the labels and will ideally be
`able to count the number of each type of fluorescent label.
`Alternatively, where each nucleotide is radioactively
`labelled or labelled with the same fluorescent dye, the
`apparatus will need to be able to count the total number of
`labels added to the primer.
`As with the first embodiment of the invention, in a manual
`procedure using a single template, the labelled nucleotides
`are used singly and sequentially until a labelled nucleotide
`is added, whereupon the sequence is repeated. In an automated
`procedure all four labelled nucleotides are used
`sequentially and the apparatus is programmed to detect
`which nucleotides are added in what sequence to the primer.
`Once the number of labels added has reached the resolving
`power of the detecting apparatus, removal or neutralisation
`of the label is carried out in a single step. Thus, the
`number of label removal steps is significantly reduced.
`In this alternative embodiment, the steps (c) and (d) of the
`first aspect of the invention will preferably comprise:
`i) adding a labelled nucleotide together with three heterogenous
`chain elongation inhibitors which are not
`incorporated into the chain, such as 5'-[α,β-methylene]triphosphates;
`ii) removing excess reagents by washing;
`iii) determining whether the label has been incorporated;
`and
`iv) repeating steps (i) to (iii) using a different labelled
`nucleotide, either until a labelled nucleotide has been
`incorporated or until all four labelled nucleotides have
`been used.
`This technique necessitates the use of a more sophisticated
`counter or label measuring device. Allowing for runs
`of repeated nucleotides, the label measuring device should
`be able to detect the presence of between four and sixteen
`labelled nucleotides accurately. For the measurement of long
`stretches of repeated nucleotides, a device with a greater
`capacity may be required.
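The saving in label removal steps can be illustrated with a small bookkeeping sketch. It assumes, for illustration only, that the detector can resolve up to `capacity` labels on a primer and that labels are cleared just before an addition would exceed that number; the function name `count_removals` and the example run lengths are hypothetical.

```python
def count_removals(incorporations, capacity):
    """Count label removal steps for a series of incorporation events.

    incorporations -- labels added at each successful addition
                      (values above 1 represent runs of a repeated nucleotide)
    capacity       -- largest number of labels the detector can count reliably
    """
    accumulated = 0
    removals = 0
    for added in incorporations:
        if accumulated + added > capacity:
            removals += 1        # clear labels before the detector saturates
            accumulated = 0
        accumulated += added
    if accumulated:
        removals += 1            # final clean-up of the remaining labels
    return removals


events = [2, 1, 1, 3, 1]                     # hypothetical run lengths
print(count_removals(events, capacity=4))    # labels cleared in batches
print(len(events))                           # one removal per incorporation
```

With a capacity of four labels the five incorporation events need two removal steps, compared with five when the label is removed after every incorporation.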
`Scheme 1
`According to a preferred aspect of the invention, a DNA
`fragment is sequenced according to the following scheme:
`1) a capped primer containing a phosphorothioate nucleoside
`derivative is hybridized to a template to form a
`template/primer complex;
`2) a labelled deoxynucleoside triphosphate (dNTP)
`together with heterogenous chain terminators and a
`suitable polymerase is added to the template/primer
`complex;
`3) excess reagents are removed by washing;
`4) the amount of incorporated label is measured;
`5) the template/primer complex is treated with an exonuclease
`to remove the label and the dideoxynucleotides;
`6) the exonuclease is removed by washing;
`7) a phosphorothioate deoxynucleoside triphosphate corresponding
`to the labelled deoxynucleoside triphosphate
`added in Step 2 is added together with heterogenous
`chain terminators;
`8) excess reagents are removed by washing;
`9) the template/primer complex is treated with an exonuclease
`to remove the chain terminators;
`10) the exonuclease is removed by washing; and
`11) steps 2) to 10) are repeated four times, each time with a
`different labelled nucleotide, together with the appropriate
`heterogenous chain terminators.
`For example, in Step 2 above the labelled nucleotide
`could be dATP. In this case, the heterogeneous chain terminators
`could be ddGTP, ddTTP and ddCTP. In step 7
`phosphorothioate dATP would be added to replace the
`labelled dATP removed with the exonuclease in step 6. The
`cycle can then be repeated with another labelled nucleotide,
`for example dGTP, together with the heterogeneous
`dideoxynucleotides ddATP, ddTTP and ddCTP. This will
`cause label to be incorporated in all the chains propagating
`with G. This is followed in turn with labelled dTTP and
`labelled dCTP and continued again with dATP, dGTP, dTTP
`and dCTP and so on.
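The reagent bookkeeping of Scheme 1 can be sketched as follows. The round order dATP, dGTP, dTTP, dCTP follows the example above; everything else is an assumption made for illustration: the names `heterogeneous_terminators` and `scheme1` are hypothetical, a run of identical bases is taken to be filled in a single round with the measured label amount equal to the run length, and the exonuclease and phosphorothioate replacement steps are treated as always going to completion. The template string lists bases in the order the polymerase meets them.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
ROUND_ORDER = "AGTC"  # labelled dATP, dGTP, dTTP, dCTP, then repeat


def heterogeneous_terminators(labelled):
    """Dideoxynucleotides for the three bases other than the labelled one."""
    return ["dd%sTP" % b for b in ROUND_ORDER if b != labelled]


def scheme1(template, max_rounds=1000):
    """Return (read, log); each log entry is (labelled dNTP, terminators, label amount)."""
    position = 0
    log = []
    rounds = 0
    while position < len(template) and rounds < max_rounds:
        labelled = ROUND_ORDER[rounds % 4]
        # Steps 2-4: extend through every consecutive base that pairs with the
        # labelled nucleotide; a terminator then caps the chain.
        run = 0
        while (position + run < len(template)
               and COMPLEMENT[template[position + run]] == labelled):
            run += 1
        log.append(("d%sTP" % labelled, heterogeneous_terminators(labelled), run))
        # Steps 5-10: label and terminators removed, run refilled with
        # phosphorothioate nucleotides, so the primer end advances by `run`.
        position += run
        rounds += 1
    read = "".join(name[1] * amount for name, _, amount in log)
    return read, log


read, log = scheme1("TTCA")
print(read)
for entry in log:
    print(entry)
```

For the first round the sketch reproduces the reagent set named in the example: labelled dATP with ddGTP, ddTTP and ddCTP.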
`Scheme 2
`According to a second preferred aspect of the invention,
`a DNA fragment is sequenced according to the following
`scheme:
`1) a capped primer containing a phosphorothioate nucleoside
`derivative is hybridized to a template to form a
`template/primer complex;
`2) a labelled deoxynucleotide together with heterogeneous
`chain terminators and a suitable polymerase is
`added to the template/primer complex;
`3) excess reagents are removed by washing;
`4) the amount