United States Patent [19]
Albrecht et al.

[54] MASSIVELY PARALLEL SIGNATURE SEQUENCING BY LIGATION OF ENCODED ADAPTORS

[75] Inventors: Glenn Albrecht, Redwood City, Calif.; Sydney Brenner, Cambridge, United Kingdom; Robert B. DuBridge, Belmont, Calif.; David H. Lloyd, Daly City, Calif.; Michael C. Pallas, San Bruno, Calif.

[73] Assignee: Lynx Therapeutics, Inc., Hayward, Calif.

[21] Appl. No.: 08/946,138

[22] Filed: Oct. 7, 1997

Related U.S. Application Data

[63] Continuation-in-part of application No. 08/862,610, May 23, 1997, abandoned, which is a continuation-in-part of application No. 08/689,587, Aug. 12, 1996, abandoned, which is a continuation-in-part of application No. 08/659,453, Jun. 6, 1996, abandoned.

[51] Int. Cl.7: C12Q 1/68; C07H 21/02
[52] U.S. Cl.: 435/6; 536/24.2
[58] Field of Search: 435/6, 91.52; 536/24.2, 536/24.3, 26.6; 935/77, 78
`
[56] References Cited

U.S. PATENT DOCUMENTS

4,237,224   12/1980   Cohen et al.      435/68
4,293,652   10/1981   Cohen             435/172
4,321,365    3/1982   Wu et al.         536/27
4,683,202    7/1987   Mullis            435/91
4,775,619   10/1988   Urdea             435/6
4,942,124    7/1990   Church            435/6
5,093,245    3/1992   Keith et al.      435/91
5,102,785    4/1992   Livak et al.      435/6
5,118,605    6/1992   Urdea             435/6
5,126,239    6/1992   Livak et al.      435/6
5,149,625    9/1992   Church et al.     435/6
5,242,794    9/1993   Whiteley et al.   435/6
5,366,860   11/1994   Bergot et al.     435/6
5,503,980    4/1996   Cantor            435/6
5,508,169    4/1996   Deugau et al.     435/6
5,512,439    4/1996   Hornes et al.     435/6
5,552,278    9/1996   Brenner           435/6
5,599,675    2/1997   Brenner           435/6
5,604,097    2/1997   Brenner           435/6
5,658,736    8/1997   Wong              435/6
5,707,807    1/1998   Kato              435/6
5,714,330    2/1998   Brenner et al.    435/6
5,728,524    3/1998   Sibson            435/6
`
FOREIGN PATENT DOCUMENTS

0 246 864 B1   11/1987   European Pat. Off.
0 303 459 A3    2/1989   European Pat. Off.
0 392 546 A2   10/1990   European Pat. Off.
0 799 897 A1   10/1997   European Pat. Off.
2 687 851       5/1994   France
WO92/15712      9/1992   WIPO
WO94/01582      1/1994   WIPO
WO95/20053      7/1995   WIPO
WO96/12014      4/1996   WIPO
`
US006013445A
[11] Patent Number: 6,013,445
[45] Date of Patent: Jan. 11, 2000
`
OTHER PUBLICATIONS

Brenner and Livak, "DNA fingerprinting by sampled sequencing," Proc. Natl. Acad. Sci., 86: 8902-8906 (1989).
Carrano et al, "A high-resolution, fluorescence-based semi-automated method for DNA fingerprinting," Genomics, 4: 129-136 (1989).
Szybalski et al, "Class-IIS restriction enzymes-a review," Gene, 100: 13-26 (1991).
Kim et al, "Cleaving DNA at any predetermined site with adapter-primers and Class IIS restriction enzymes," Science, 240: 504-506 (1988).
Szybalski, "Universal restriction endonucleases: designing novel cleavage specificities by combining adapter oligonucleotide and enzyme moieties," Gene, 40: 169-173 (1985).
Barany, "The ligase chain reaction in a PCR world," PCR Methods and Applications, 1: 5-16 (1991).
Wu and Wallace, "The ligase amplification reaction (LAR)-amplification of specific DNA sequences using sequential rounds of template-dependent ligation," Genomics, 4: 560-569 (1989).
McGuigan et al, "DNA fingerprinting by sampled sequencing," Methods in Enzymology, 218: 241-258 (1993).
Shoemaker et al, "Quantitative phenotypic analysis of yeast deletion mutants using a highly parallel molecular bar-coding strategy," Nature Genetics, 14: 450-456 (1996).
Kato, "Description of the entire mRNA population by a 3' end cDNA fragment generated by class IIs restriction enzymes," Nucleic Acids Research, 23: 3685-3690 (1995).
Kato, "RNA fingerprinting by molecular indexing," Nucleic Acids Research, 24: 394-395 (1996).
Broude et al, "Enhanced DNA sequencing by hybridization," Proc. Natl. Acad. Sci., 91: 3072-3076 (1994).
Hultman et al, "Direct solid phase sequencing of genomic and plasmid DNA using magnetic beads as solid support," Nucleic Acids Research, 17: 4937-4946 (1989).
Nikiforov et al, "Genetic bit analysis: a solid phase method for typing single nucleotide polymorphisms," Nucleic Acids Research, 22: 4167-4175 (1994).
Berger, "Expanding the potential of restriction endonucleases: use of hapaxoterministic enzymes," Anal. Biochem., 222: 1-8 (1994).
Unrau et al, "Non-cloning amplification of specific DNA fragments from whole genomic DNA digests using DNA 'indexers,'" Gene, 145: 163-169 (1994).
`
`(List continued on next page.)
`
Primary Examiner-Eggerton A. Campbell
Attorney, Agent, or Firm-Stephen C. Macevicz

[57] ABSTRACT

The invention provides a method of nucleic acid sequence analysis based on the ligation of one or more sets of encoded adaptors to the terminus of a target polynucleotide. Encoded adaptors whose protruding strands form perfectly matched duplexes with the complementary protruding strands of the target polynucleotide are ligated, and the identity of the nucleotides in the protruding strands is determined by an oligonucleotide tag carried by the encoded adaptor. Such determination, or "decoding," is carried out by specifically hybridizing a labeled tag complement to its corresponding tag on the ligated adaptor.
`
`29 Claims, 10 Drawing Sheets
`
`Columbia Ex. 2025
`Illumina, Inc. v. The Trustees
`of Columbia University
`in the City of New York
`IPR2020-01177
`
`
`
`
`
`
OTHER PUBLICATIONS (continued)

Gronostajski, "Site-specific DNA binding of nuclear factor I: effect of the spacer region," Nucleic Acids Research, 15: 5545-5559 (1987).
Wiaderkiewicz et al, "Mismatch and blunt to protruding end joining by DNA ligases," Nucleic Acids Research, 15: 7831-7848 (1987).
Tsiapalis et al, "On the fidelity of phage T4-induced polynucleotide ligase in the joining of chemically synthesized deoxyribooligonucleotides," Biochem. Biophys. Res. Comm., 39: 631-636 (1970).
Matteucci et al, "Targeted random mutagenesis: the use of ambiguously synthesized oligonucleotides to mutagenize sequences immediately 5' of an ATG initiation codon," Nucleic Acids Research, 11: 3113-3121 (1983).
Hensel et al, "Simultaneous identification of bacterial virulence genes by negative selection," Science, 269: 400-403 (1995).
`
`
`
[Drawing Sheets 1 to 10 of U.S. Patent 6,013,445, Jan. 11, 2000. The figures are not reproduced; only the labels recoverable from each sheet are listed.]

Fig. 1A (Sheet 1 of 10): tagged polynucleotides t1, t2, ... tk; Sample (10); Amplify & Prepare Ends (12); groups 14, 16, 18; Ligate Cleavage Adaptors (20).
Fig. 1B (Sheet 2 of 10): polynucleotides t1, t2, ... tk with cleavage adaptors A1, A2, A3; Cleave with A1 endonuclease & Ligate first set of encoded probes (22).
Fig. 1C (Sheet 3 of 10): Cleave with A2 endonuclease & Ligate second set of encoded probes (34).
Fig. 1D (Sheet 4 of 10): Cleave with A3 endonuclease & Ligate third set of encoded probes (42).
Fig. 1E (Sheet 5 of 10): Load onto solid phase support & hybridize tag complements (50).
Fig. 2 (Sheet 6 of 10): ends labeled GCGCp and pCGCG; reference numerals 110, 112, 114, 116.
Fig. 3A (Sheet 7 of 10): steps labeled Ligate (120), Wash (132), Phosphorylate (134), Ligate (136), Identify (144).
Fig. 3B (Sheet 8 of 10): steps labeled Ligate (220), Wash (232), Phosphorylate (234), Ligate (236), Identify/Cleave (244/252), Wash, Dephosphorylate (256).
Fig. 4 (Sheet 9 of 10): plot with an axis labeled Relative Fluorescence.
Fig. 5 (Sheet 10 of 10): components labeled computer, CCD, microscope; reference numerals 500, 506, 514, 516, 526, 528, 532, 534, 536, 538.
`
`
`
MASSIVELY PARALLEL SIGNATURE SEQUENCING BY LIGATION OF ENCODED ADAPTORS

This is a continuation-in-part of abandoned U.S. patent application Ser. No. 08/862,610 filed May 23, 1997, which is a continuation-in-part of abandoned U.S. patent application Ser. No. 08/689,587 filed Aug. 12, 1996, which is a continuation-in-part of abandoned U.S. patent application Ser. No. 08/659,453 filed Jun. 6, 1996.
`
FIELD OF THE INVENTION

The invention relates generally to methods for determining the nucleotide sequence of a polynucleotide, and more particularly, to a method of identifying terminal nucleotides of a polynucleotide by specific ligation of encoded adaptors.

BACKGROUND

The DNA sequencing methods of choice for nearly all scientific and commercial applications are based on the dideoxy chain termination approach pioneered by Sanger, e.g. Sanger et al, Proc. Natl. Acad. Sci., 74: 5463-5467 (1977). The method has been improved in several ways and, in a variety of forms, is used in all commercial DNA sequencing instruments, e.g. Hunkapiller et al, Science, 254: 59-67 (1991).

The chain termination method requires the generation of one or more sets of labeled DNA fragments, each having a common origin and each terminating with a known base. The set or sets of fragments must then be separated by size to obtain sequence information. The size separation is usually accomplished by high resolution gel electrophoresis, which must have the capacity of distinguishing very large fragments differing in size by no more than a single nucleotide. Despite many significant improvements, such as separations with capillary arrays and the use of non-gel electrophoretic separation mediums, the technique does not readily lend itself to miniaturization or to massively parallel implementation.

As an alternative to the Sanger-based approaches to DNA sequencing, several so-called "base-by-base" or "single base" sequencing approaches have been explored, e.g. Cheeseman, U.S. Pat. No. 5,302,509; Tsien et al, International application WO 91/06678; Rosenthal et al, International application WO 93/21340; Canard et al, Gene, 148: 1-6 (1994); and Metzker et al, Nucleic Acids Research, 22: 4259-4267 (1994). These approaches are characterized by the determination of a single nucleotide per cycle of chemical or biochemical operations and no requirement of a separation step. Thus, if they could be implemented as conceived, "base-by-base" approaches promise the possibility of carrying out many thousands of sequencing reactions in parallel, for example, on target polynucleotides attached to microparticles or on solid phase arrays, e.g. International patent application PCT/US95/12678.

Unfortunately, "base-by-base" sequencing schemes have not had widespread application because of numerous problems, such as inefficient chemistries which prevent determination of any more than a few nucleotides in a complete sequencing operation. Moreover, in base-by-base approaches that require enzymatic manipulations, further problems arise with instrumentation used for automated processing. When a series of enzymatic steps is carried out in reaction chambers having high surface-to-volume ratios and narrow channel dimensions, enzymes may stick to surface components, making washes and successive processing steps very difficult. The accumulation of protein also affects molecular reporter systems, particularly those employing fluorescent labels, and renders the interpretation of measurements based on such systems difficult and inconvenient. These and similar difficulties have significantly slowed the application of "base-by-base" sequencing schemes to parallel sequencing efforts.

An important advance in base-by-base sequencing technology could be made, especially in automated systems, if an alternative approach were available for determining the terminal nucleotides of polynucleotides that minimized or eliminated repetitive processing cycles employing multiple enzymes.
`
SUMMARY OF THE INVENTION

Accordingly, an object of our invention is to provide a DNA sequencing scheme which does not suffer the drawbacks of current base-by-base approaches.

Another object of our invention is to provide a method of DNA sequencing which is amenable to parallel, or simultaneous, application to thousands of DNA fragments present in a common reaction vessel.

A further object of our invention is to provide a method of DNA sequencing which permits the identification of a terminal portion of a target polynucleotide with minimal enzymatic steps.

Yet another object of our invention is to provide a set of encoded adaptors for identifying the sequence of a plurality of terminal nucleotides of one or more target polynucleotides.

Our invention achieves these and other objects by providing a method of nucleic acid sequence analysis based on the ligation of one or more sets of encoded adaptors to a terminus of a target polynucleotide (or to the termini of multiple target polynucleotides when used in a parallel sequencing operation). Each encoded adaptor comprises a protruding strand and an oligonucleotide tag selected from a minimally cross-hybridizing set of oligonucleotides. Encoded adaptors whose protruding strands form perfectly matched duplexes with the complementary protruding strands of the target polynucleotide are ligated. After ligation, the identity and ordering of the nucleotides in the protruding strands are determined, or "decoded," by specifically hybridizing a labeled tag complement to its corresponding tag on the ligated adaptor.

For example, if an encoded adaptor with a protruding strand of four nucleotides, say 5'-AGGT, forms a perfectly matched duplex with the complementary protruding strand of a target polynucleotide and is ligated, the four complementary nucleotides, 3'-TCCA, on the polynucleotide may be identified by a unique oligonucleotide tag selected from a set of 256 such tags, one for every possible four-nucleotide sequence of the protruding strands. Tag complements are applied to the ligated adaptors under conditions which allow specific hybridization of only those tag complements that form perfectly matched duplexes (or triplexes) with the oligonucleotide tags of the ligated adaptors. The tag complements may be applied individually or as one or more mixtures to determine the identity of the oligonucleotide tags, and therefore, the sequences of the protruding strands.
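
The worked example above (a 5'-AGGT adaptor identifying 3'-TCCA on the target, with one tag per four-nucleotide protruding strand) can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the specification: each tag is represented by a hypothetical integer index rather than an actual oligonucleotide sequence, and the names `build_tag_table`, `paired_target_strand`, and `decode` are illustrative only.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def build_tag_table(overhang_length=4):
    """Map every possible adaptor protruding strand (written 5'->3') to a tag index.

    The integer index is a hypothetical stand-in for a real oligonucleotide tag.
    """
    overhangs = ("".join(p) for p in product(BASES, repeat=overhang_length))
    return {overhang: index for index, overhang in enumerate(overhangs)}


def paired_target_strand(adaptor_overhang):
    """Return the target protruding strand, written 3'->5', that pairs
    base-for-base with an adaptor protruding strand written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in adaptor_overhang)


def decode(tag_index, tag_table):
    """Recover the target's protruding nucleotides from a detected tag index."""
    overhang_by_tag = {index: overhang for overhang, index in tag_table.items()}
    return paired_target_strand(overhang_by_tag[tag_index])


tag_table = build_tag_table()
detected_tag = tag_table["AGGT"]          # tag carried by the ligated 5'-AGGT adaptor
print(len(tag_table))                     # 256
print(decode(detected_tag, tag_table))    # TCCA (read 3'->5')
```

The table has 4^4 = 256 entries, matching the set of 256 tags in the example; in practice the detection of a tag would come from hybridization of a labeled tag complement rather than from a dictionary lookup.
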
As explained more fully below, the encoded adaptors may be used in sequence analysis either i) to identify one or more nucleotides as a step of a process that involves repeated cycles of ligation, identification, and cleavage, as described in Brenner U.S. Pat. No. 5,599,675, or ii) as a "stand alone" identification method, wherein sets of encoded adaptors are applied to target polynucleotides such that each set is capable of identifying the nucleotide sequence of a different portion of a target polynucleotide; that is, in the latter embodiment, sequence analysis is carried out with a single ligation for each set followed by identification.
An important feature of the encoded adaptors is the use of oligonucleotide tags that are members of a minimally cross-hybridizing set of oligonucleotides, e.g. as described in International patent applications PCT/US95/12791 and PCT/US96/09513. The sequences of oligonucleotides of such a set differ from the sequences of every other member of the same set by at least two nucleotides. Thus, each member of such a set cannot form a duplex (or triplex) with the complement of any other member with fewer than two mismatches. Preferably, each member of a minimally cross-hybridizing set differs from every other member by as many nucleotides as possible consistent with the size of set required for a particular application. For example, where longer oligonucleotide tags are used, such as 12- to 20-mers for delivering labels to encoded adaptors, then the difference between members of a minimally cross-hybridizing set is preferably significantly greater than two. Preferably, each member of such a set differs from every other member by at least four nucleotides. More preferably, each member of such a set differs from every other member by at least six nucleotides. Complements of oligonucleotide tags of the invention are referred to herein as "tag complements."

Oligonucleotide tags may be single stranded and be designed for specific hybridization to single stranded tag complements by duplex formation. Oligonucleotide tags may also be double stranded and be designed for specific hybridization to single stranded tag complements by triplex formation. Preferably, the oligonucleotide tags of the encoded adaptors are double stranded and their tag complements are single stranded, such that specific hybridization of a tag with its complements occurs through the formation of a triplex structure.
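
The "differs by at least two nucleotides" criterion can be checked mechanically. The following Python sketch is a simplified illustration, assuming equal-length tags compared position by position; the actual construction of minimally cross-hybridizing sets is described in the cited applications and is not reproduced here. The function names and the toy tag set are hypothetical.

```python
from itertools import combinations


def mismatches(tag_a, tag_b):
    """Count positions at which two equal-length tag sequences differ."""
    if len(tag_a) != len(tag_b):
        raise ValueError("tags must have the same length")
    return sum(a != b for a, b in zip(tag_a, tag_b))


def min_pairwise_mismatches(tags):
    """Smallest number of differences between any two members of the set."""
    return min(mismatches(a, b) for a, b in combinations(tags, 2))


def is_minimally_cross_hybridizing(tags, min_difference=2):
    """True if every member differs from every other by at least min_difference."""
    return min_pairwise_mismatches(tags) >= min_difference


# Hypothetical toy set of 4-mers, for illustration only.
example_tags = ["AAAA", "AACC", "CCAA", "CCCC"]
print(min_pairwise_mismatches(example_tags))                            # 2
print(is_minimally_cross_hybridizing(example_tags))                     # True
print(is_minimally_cross_hybridizing(example_tags, min_difference=4))   # False
```

Raising `min_difference` to four or six corresponds to the preferred and more preferred thresholds stated above for longer tags.
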
Preferably, the method of the invention comprises the following steps: (a) ligating an encoded adaptor to an end of a polynucleotide, the adaptor having an oligonucleotide tag selected from a minimally cross-hybridizing set of oligonucleotides and a protruding strand complementary to a protruding strand of the polynucleotide; and (b) identifying one or more nucleotides in the protruding strand of the polynucleotide by specifically hybridizing a tag complement to the oligonucleotide tag of the encoded adaptor.
`
BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1a-1e diagrammatically illustrate the use of encoded adaptors to determine the terminal nucleotide sequences of a plurality of tagged polynucleotides.

FIG. 2 illustrates the phenomenon of self-ligation of identical polynucleotides that are anchored to a solid phase support.

FIG. 3a illustrates steps in a preferred method of the invention in which a double stranded adaptor having a blocked 3' carbon is ligated to a target polynucleotide.

FIG. 3b illustrates the use of the preferred embodiment in a method of DNA sequencing by stepwise cycles of ligation and cleavage.

FIG. 4 illustrates data from the determination of the terminal nucleotides of a test polynucleotide using the method of the present invention.

FIG. 5 is a schematic representation of a flow chamber and detection apparatus for observing a planar array of microparticles loaded with cDNA molecules for sequencing.
`
DEFINITIONS

As used herein, the term "encoded adaptor" is used synonymously with the term "encoded probe" of priority document U.S. patent application Ser. No. 08/689,587.

As used herein, the term "ligation" means the formation of a covalent bond between the ends of one or more (usually two) oligonucleotides. The term usually refers to the formation of a phosphodiester bond resulting from the following reaction, which is usually catalyzed by a ligase:

oligo1(5')-OP(O-)(=O)O + HO-(3')oligo2-5' → oligo1(5')-OP(O-)(=O)O-(3')oligo2-5'

where oligo1 and oligo2 are either two different oligonucleotides or different ends of the same oligonucleotide. The term encompasses non-enzymatic formation of phosphodiester bonds, as well as the formation of non-phosphodiester covalent bonds between the ends of oligonucleotides, such as phosphorothioate bonds, disulfide bonds, and the like. A ligation reaction is usually template driven, in that the ends of oligo1 and oligo2 are brought into juxtaposition by specific hybridization to a template strand. A special case of template-driven ligation is the ligation of two double stranded oligonucleotides having complementary protruding strands.

"Complement" or "tag complement" as used herein in reference to oligonucleotide tags refers to an oligonucleotide to which an oligonucleotide tag specifically hybridizes to form a perfectly matched duplex or triplex. In embodiments where specific hybridization results in a triplex, the oligonucleotide tag may be selected to be either double stranded or single stranded. Thus, where triplexes are formed, the term "complement" is meant to encompass either a double stranded complement of a single stranded oligonucleotide tag or a single stranded complement of a double stranded oligonucleotide tag.
The term "oligonucleotide" as used herein includes linear oligomers of natural or modified monomers or linkages, including deoxyribonucleosides, ribonucleosides, anomeric forms thereof, peptide nucleic acids (PNAs), and the like, capable of specifically binding to a target polynucleotide by way of a regular pattern of monomer-to-monomer interactions, such as Watson-Crick type of base pairing, base stacking, Hoogsteen or reverse Hoogsteen types of base pairing, or the like. Usually monomers are linked by phosphodiester bonds or analogs thereof to form oligonucleotides ranging in size from a few monomeric units, e.g. 3-4, to several tens of monomeric units, e.g. 40-60. Whenever an oligonucleotide is represented by a sequence of letters, such as "ATGCCTG," it will be understood that the nucleotides are in 5'→3' order from left to right and that "A" denotes deoxyadenosine, "C" denotes deoxycytidine, "G" denotes deoxyguanosine, and "T" denotes thymidine, unless otherwise noted. Usually oligonucleotides of the invention comprise the four natural nucleotides; however, they may also comprise non-natural nucleotide analogs. It is clear to those skilled in the art when oligonucleotides having natural or non-natural nucleotides may be employed, e.g. where processing by enzymes is called for, usually oligonucleotides consisting of natural nucleotides are required.

"Perfectly matched" in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick basepairing with a nucleotide in the other strand. The term also comprehends the pairing of nucleoside analogs, such as deoxyinosine, nucleosides with 2-aminopurine bases, and the like, that may be employed. In reference to a triplex, the term means that the triplex consists of a perfectly matched duplex and a third strand in which every nucleotide undergoes Hoogsteen or reverse Hoogsteen association with a basepair of the perfectly matched duplex. Conversely, a "mismatch" in a duplex or triplex between a tag and an oligonucleotide means that a pair or triplet of nucleotides in the duplex or triplex fails to undergo Watson-Crick and/or Hoogsteen and/or reverse Hoogsteen bonding.
`As used herein, "nucleoside" includes the natural
`nucleosides, including 2'-deoxy and 2'-hydroxyl forms, e.g.
`as described in Kornberg and Baker, DNA Replication, 2nd
`Ed. (Freeman, San Francisco, 1992). "Analogs" in reference
`to