Development of New DNA Sequencing Approaches and
Investigation of Vision-related Proteins Using Synthetic
Chemistry
`
`Shenglong ZHANG
`
`Submitted in partial fulfillment of the
`requirements for the degree
`of Doctor of Philosophy
`in the Graduate School of Arts and Sciences
`
`COLUMBIA UNIVERSITY
`
`2008
`
`Illumina Ex. 1090
`IPR Petition - USP 10,435,742
`
`
`
`UMI Number: 3305285
`
`INFORMATION TO USERS
`
`The quality of this reproduction is dependent upon the quality of the copy
`submitted. Broken or indistinct print, colored or poor quality illustrations and
`photographs, print bleed-through, substandard margins, and improper
`alignment can adversely affect reproduction.
`In the unlikely event that the author did not send a complete manuscript
`and there are missing pages, these will be noted. Also, if unauthorized
`copyright material had to be removed, a note will indicate the deletion.
`
`UMI Microform 3305285
`
`Copyright 2008 by ProQuest LLC.
`
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
`
`ProQuest LLC
`789 E. Eisenhower Parkway
`PO Box 1346
Ann Arbor, MI 48106-1346
`
`
`
`©2008
`
`Shenglong ZHANG
`
`All Rights Reserved
`
`
`
`ABSTRACT
`
`Development of New DNA Sequencing Approaches and
`Investigation of Vision-related Proteins Using Synthetic
`Chemistry
`
`Shenglong ZHANG
`
The first part of this thesis presents our efforts to investigate
vision-related proteins in the visual cycle. The synthesis of a seven-membered
ring-locked analogue of 11-cis-retinol, tethered to a cross-linking moiety on
C-15 and a lysine-extended biotin on C-3, was accomplished for its use as a
probe to isolate retinol-binding proteins that may be involved in the
reisomerization of retinal from all-trans to 11-cis in the visual cycle.
`
The second part of this thesis focuses on the development of new DNA
sequencing approaches using the concept of DNA sequencing by synthesis (SBS).
In the DNA SBS method, four nucleotide reversible terminators (A, C, G, T) are
the most critical molecular tools for an efficient DNA polymerase reaction on a
massively parallel DNA-sequencing chip. First, we designed and synthesized four
nucleotides as reversible terminators by attaching a cleavable fluorophore to
the nucleobase and capping the 3'-OH with a small reversible chemical moiety so
that they are still recognized by DNA polymerase as substrates. After the
cleavable fluorescent nucleotides were incorporated into a growing DNA strand,
the fluorophore and the 3'-OH capping group on the DNA extension product were
removed by photocleavage or by chemical cleavage. This allows re-initiation of
the DNA polymerase reaction and continuation of DNA SBS to the next cycle for
incorporation of another fluorescent reversible nucleotide. The efficiency of
cleaving the fluorophore and the 3'-OH capping moiety plays a crucial role in
the DNA SBS approach and directly determines the read length of this
methodology. To improve the efficiency of DNA SBS, we explored a variety of
chemical moieties as the 3'-OH capping group, as well as different cleavable
linkers bridging the nucleobase and the fluorophore.
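
As a rough illustration of why per-cycle efficiency limits read length, the
sketch below uses a simple model that is an assumption and is not taken from
the thesis: every cycle (incorporation plus cleavage) succeeds on a fixed
fraction of the template copies, independently of earlier cycles, and a read
remains usable while at least `min_fraction` of the copies are still in phase.
The function names and the 50% threshold are hypothetical choices.

```python
import math


def in_phase_fraction(cycle_efficiency: float, cycles: int) -> float:
    """Fraction of template copies still in phase after `cycles` cycles.

    Assumed model: each cycle independently succeeds with probability
    `cycle_efficiency`, so the in-phase fraction decays geometrically.
    """
    return cycle_efficiency ** cycles


def read_length(cycle_efficiency: float, min_fraction: float = 0.5) -> int:
    """Largest cycle count whose in-phase fraction is still >= min_fraction."""
    if not 0.0 < cycle_efficiency < 1.0:
        raise ValueError("cycle_efficiency must be strictly between 0 and 1")
    if not 0.0 < min_fraction <= 1.0:
        raise ValueError("min_fraction must be in (0, 1]")
    # Solve cycle_efficiency ** n >= min_fraction for the largest integer n.
    return math.floor(math.log(min_fraction) / math.log(cycle_efficiency))


if __name__ == "__main__":
    for efficiency in (0.90, 0.99, 0.999):
        print(efficiency, read_length(efficiency))
```

Under this assumed model, small gains in per-cycle efficiency translate into
large gains in usable read length, which is consistent with the emphasis the
abstract places on cleavage efficiency.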
`
The work led to the construction of a 4-color nucleotide library consisting of
multiple functionalized nucleotides, each with a suitable 3'-OH capping moiety
and a cleavable fluorophore. We also constructed a 3'-O-labeled nucleotide
library by designing and synthesizing nucleotide reversible terminators with
different 3'-OH capping moieties. These nucleotide libraries have been screened
and evaluated as reversible terminators using the DNA polymerase reaction both
in solution phase and on a DNA-immobilized chip. Furthermore, photocleavable
nucleotide reversible terminators and fluorescent dideoxynucleotides were
synthesized for the development of a chip-based Sanger-SBS hybrid DNA
sequencing method. Finally, we solved the homopolymer problem of DNA
pyrosequencing by using the 3'-O-modified reversible terminators.
`
`
`
TABLE OF CONTENTS

Table of Contents
List of Figures
List of Schemes and Tables
Abbreviations and Symbols
Acknowledgements
Dedication

PART 1: SYNTHESIS OF AN 11-CIS-LOCKED-BIOTINYLATED RETINOID FOR
SEQUESTERING 11-CIS-RETINOID BINDING PROTEINS

1.1 Introduction
1.2 Results and discussion
1.3 Conclusion
1.4 Experimental section
References

PART 2: DEVELOPMENT OF NEW DNA SEQUENCING APPROACHES
USING SYNTHETIC CHEMISTRY
`
`
`
Chapter 1: Introduction to DNA sequencing by synthesis

1.1 Background
1.2 The basics of DNA sequencing
1.3 Sanger's dideoxy DNA sequencing
1.4 DNA sequencing based on mass spectrometry
1.5 Nanopore DNA sequencing
1.6 DNA sequencing by synthesis (SBS)
References

Chapter 2: Construction of a four-color nucleotide library for DNA SBS

2.1 Introduction
2.2 Design and synthesis of 4-color photocleavable fluorescent nucleotides with
photoreversible moieties at 3'-oxygen (3'-O-PC-dNTPs-PC-fluorophore) for DNA
SBS
2.2.1 Background
2.2.2 Experimental rationale and design
2.2.3 Synthesis of 4-color photocleavable fluorescent nucleotides
2.2.3.1 Challenges in synthesizing photocleavable reversible nucleotides
2.2.3.2 Direct protecting strategy for synthesis of 3'-O-PC-dTTP 1
2.2.3.3 Establishment of the feasibility of using 2-nitrobenzyl group as a
reversible capping moiety at 3'-end
2.2.3.4 Indirect protecting strategy for synthesis of photocleavable reversible
fluorescent nucleotides 3'-O-PC-dCTP-PC-Bodipy-FL-510 2 and
3'-O-PC-dUTP-PC-R6G 3
2.2.3.5 Indirect protecting strategy for synthesis of photocleavable reversible
fluorescent nucleotides 3'-O-PC-dGTP-PC-Bodipy-CY5 4 and
3'-O-PC-dGTP-PC-Bodipy 650
`
2.2.3.6 Synthesis of photocleavable reversible fluorescent nucleotide
3'-O-PC-dATP-PC-ROX 5
2.3 Design and synthesis of 4-color chemically cleavable fluorescent nucleotides
with chem-reversible moieties at 3'-oxygen for DNA SBS
2.3.1 Background
2.3.2 Experimental rationale and design
2.3.3 Synthesis of 4-color chemically cleavable fluorescent NRTs
3'-O-N3-dNTPs-N3(short)-fluorophore and 3'-O-N3-dNTPs-N3(long)-fluorophore
2.3.3.1 A versatile method for azidomethylation of nucleoside hydroxyl
2.3.3.2 Synthesis of 4-color chemically cleavable fluorescent nucleotides
3'-O-azidomethyl-dUTPs-Azido-R6G
2.3.3.3 Synthesis of 4-color chemically cleavable fluorescent nucleotides
3'-O-azidomethyl-dCTPs-Azido-Bodipy-FL
2.3.3.4 Synthesis of 4-color chemically cleavable fluorescent nucleotides
3'-O-azidomethyl-dATPs-Azido-ROX
2.3.3.5 Synthesis of 4-color chemically cleavable fluorescent nucleotides
3'-O-azidomethyl-dGTPs-Azido-CY5
2.3.4 4-color DNA sequencing on a chip using mixture of
3'-O-R1-dNTPs-R2-fluorophore and 3'-OR-dNTP as NRTs
2.4 Design and synthesis of 4-color photocleavable fluorescent nucleotides with
small chemical reversible moieties at 3'-oxygen for DNA SBS
2.4.1 Introduction
2.4.2 Experimental rationale and design
2.4.3 Synthesis of 3'-O-allyl-dNTPs-PC-fluorophore
2.4.4 Synthesis of 3'-O-azidomethyl-dNTPs-PC-fluorophore
2.4.5 Synthesis of 3'-O-(tert-butyl-dithiomethyl)-dUTP-PC-R6G
2.5 Design and synthesis of 4-color photocleavable and chem-cleavable
fluorescent nucleotides as irreversible terminators for Sanger-SBS hybrid DNA
sequencing
2.5.1 Introduction and background
`
`
`
2.5.2 Synthesis of 4-color photocleavable fluorescent dideoxynucleotides based
on photolabile 2-nitrobenzyl moiety
2.5.3 Polymerase extension using ddNTPs-PC-fluorophore terminators and
characterization by MALDI-TOF MS
2.5.4 4-color Sanger-SBS hybrid DNA sequencing on a chip
2.6 Experimental Section
References
`
Chapter 3: Construction of 3'-O-labeled nucleotide library for DNA sequencing

3.1 Introduction
3.2 Synthesis of 3'-O-(2-nitrobenzyl)-2'-deoxynucleotide
3.2.1 Synthesis of 3'-O-(2-nitrobenzyl)-dGTP for DNA SBS
3.2.1.1 Background
3.2.1.2 Synthesis of 3'-O-PC-dGTP
3.2.1.3 Verification of 3'-O-PC-dGTP 1 as a good reversible terminator for DNA
SBS
3.2.2 Synthesis of 3'-O-PC-dTTP for DNA SBS
3.2.3 Synthesis of 3'-O-PC-dATP for DNA SBS
3.2.4 Synthesis of 3'-O-PC-dCTP for DNA SBS
3.2.5 Polymerase extension using 3'-O-PC-dNTPs and characterization by
MALDI-TOF MS
3.2.6 Continuous polymerase extension using 3'-O-PC-dNTP NRTs and
characterization by MALDI-TOF MS
3.3 Synthesis of 3'-O-azidomethyl-dNTPs
3.3.1 Synthesis of 3'-O-azidomethyl-dATP, 3'-O-azidomethyl-dCTP and
3'-O-azidomethyl-dUTP for DNA SBS
3.3.2 Synthesis of 3'-O-azidomethyl-dGTP for DNA SBS
3.3.3 Evaluation of 3'-O-N3-dNTPs as NRTs for DNA SBS in solution-phase DNA
polymerase reaction
3.4 Synthesis of 3'-O-(tert-butyldithiomethyl)-dNTPs
`
`
`
3.4.1 Design of 3'-O-(tert-butyldithiomethyl)-dNTPs as NRTs
3.4.2 Synthesis of 3'-O-(tert-butyldithiomethyl)-dNTPs for SBS
3.4.3 Evaluation of 3'-O-(tert-butyldithiomethyl)-dNTPs as NRTs for SBS
3.5 Synthesis of 3'-O-allyloxycarbonyl-2'-deoxynucleotides
3.5.1 Synthesis of 3'-O-allyloxycarbonyl-dTTP for DNA SBS
3.5.2 Enzymatic synthesis of 3'-O-allyloxycarbonyl-dATP for DNA SBS
3.5.3 Synthesis of 3'-O-allyloxycarbonyl-dCTP for DNA SBS
3.6 Synthesis of 3'-O-(N-allylcarbamoyl)-2'-deoxynucleotides
3.7 DNA pyrosequencing using 3'-O-labeled nucleotide library as NRTs
3.8 Experimental Section
References

Chapter 4: Summary and prospect

4.1 Construction of two nucleotide libraries for DNA SBS
4.2 Screening and evaluation of our novel nucleotides as reversible terminators
for DNA SBS
4.3 Chip-based 4-color DNA SBS using NRTs from two synthetic libraries
4.4 Sanger-SBS hybrid DNA sequencing with photocleavable fluorescent
dideoxynucleotides and 3'-O-labeled NRTs
4.5 Prospect of 4-color DNA SBS using cleavable fluorescent NRTs
References

Appendix: Selected NMR Spectra
`
`
`
`LIST OF FIGURES
`
`Part 1: Synthesis of an 11-cis-Locked-biotinylated Retinoid for Sequestering 11-
`cis-Retinoid Binding Proteins
`
Figure 1-1. The structure of ret7 and the native 11-cis-retinal.
`
`Part 2: Development of New DNA Sequencing Approaches Using Synthetic
`Chemistry
`
`Chapter 1: Introduction to DNA Sequencing by Synthesis
`
`Figure 1-1. Chemical structures of four 2'-deoxyribonucleotides, each
`composed of a base (adenine, guanine, cytosine and thymine), a sugar and a
`5'-triphosphate moiety.
`
Figure 1-2. 3-D computer-rendered model of the DNA double helix. The
two strands coil around each other to create the double helix (left). A
cartoon depicting two DNA strands (green and red) held together by hydrogen
bonds between the paired bases. Notice that the directions of the DNA strands
are anti-parallel, one from 5' to 3' (red) and the other from 3' to 5' (green)
(middle). A zoomed-in section of the double helix, which shows that the
specific chemical structures of the bases only allow efficient hydrogen
bonding between A and T, and between G and C (right).

Figure 1-3. The mechanism of the DNA polymerase reaction. Notice that a free
hydroxyl group on the 3'-end of the primer is necessary for the
incorporation of the next nucleotide. Incorporation of a nucleotide at the
3'-OH end of a growing DNA strand is a fundamental biological process, in
which the base-pairing between the incoming nucleotide and the DNA
template strand guides the formation of a new DNA strand that is
complementary to the template strand. DNA polymerase catalyzes the
addition of nucleotides to the growing DNA strand by incorporating the
incoming nucleotide at the 3'-OH end, forming a phosphodiester bond
and releasing a pyrophosphate.
`
Figure 1-4. Sanger dideoxy sequencing strategy. DNA fragments are
produced when both dNTPs and ddNTP-dyes are included in the polymerase
mixture. Once a ddNTP-dye is incorporated, the 3'-end of the growing DNA
strand lacks the 3'-OH group; therefore, the addition of the next
nucleotide is blocked and the DNA polymerase reaction in that particular
growing DNA strand has to terminate. The other growing strands will keep
extending in the 5' to 3' direction until a ddNTP-dye is incorporated to
terminate the DNA extension reaction. This reaction mixture will
eventually produce a set of DNA strands of different lengths complementary
to the template DNA that is being sequenced. To read the complete
sequence, the DNA fragments are separated by electrophoresis, and the
separated bands of DNA are then detected by their fluorescence as they
emerge from the gel.
`
Figure 1-5. Matrix-assisted laser desorption/ionization time-of-flight mass
spectrometry (MALDI-TOF MS). Analyte molecules (DNA sequencing
fragments) and matrix molecules (generally UV or IR light-absorbing small
organic molecules) are mixed in solution and then co-crystallized on a flat
sample plate, which is subsequently loaded into the vacuum chamber of the
mass spectrometer. DNA molecules are gently desorbed and ionized along
with the matrix molecules by UV laser irradiation, and the resulting charged
ions are accelerated under a constant electric voltage, which guides them to
fly toward the ion detector. The charged molecules reach the detector at
different times on the basis of their various masses. The masses of the
charged ions are determined from their time of flight to the detector, which
scales with the square root of their mass-to-charge ratio (m/z).
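
The dependence of flight time on m/z mentioned in this caption can be made
explicit with a short derivation. This is a sketch for an idealized linear
time-of-flight analyzer, not a description of the instrument used in the
thesis; the symbols V (accelerating voltage), L (field-free drift length) and
e (elementary charge) are assumptions introduced here, and initial ion
velocities and the time spent in the acceleration region are neglected.

```latex
% Idealized linear TOF analyzer (assumed symbols):
%   m = ion mass, z = charge number, e = elementary charge,
%   V = accelerating voltage, L = drift length, v = ion speed, t = flight time
\begin{align}
  z e V &= \tfrac{1}{2}\, m v^{2}, \\
  v &= \sqrt{\frac{2 z e V}{m}}, \\
  t &= \frac{L}{v} = L \sqrt{\frac{m}{2 z e V}} \;\propto\; \sqrt{\frac{m}{z}}, \\
  \frac{m}{z} &= \frac{2 e V\, t^{2}}{L^{2}}.
\end{align}
```

Under these assumptions, heavier ions of the same charge arrive later, and a
measured flight time can be converted to m/z once V and L are known or the
instrument has been calibrated with ions of known mass.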
`
Figure 1-6. Blockade of an α-hemolysin nanopore by a DNA hairpin. The
lower panel shows a current trace caused by capture and translocation of a
6-bp DNA hairpin through the pore. The upper panel shows a molecular
model of these events.

Figure 1-7. Hypothesized plot of translocation time versus blockade current
for different DNA molecules. The magnitude and duration of the
blockade signatures of different nucleotides are easily distinguishable.
`
Figure 1-8. The general principle of the DNA pyrosequencing method.
Pyrosequencing is a non-electrophoretic real-time sequencing method that
uses the luciferase-luciferin light release as the detection signal for
nucleotide incorporation into target DNA. The four different nucleotides
are added iteratively to a four-enzyme mixture. The pyrophosphate (PPi)
released in the DNA polymerase reaction is quantitatively converted to
ATP by ATP sulfurylase, which provides the energy for firefly luciferase to
oxidize luciferin and generate light. The light is detected by a photon
detection device and monitored in real time. Finally, apyrase catalyzes
degradation of nucleotides that are not incorporated, and the sequencing
reaction is then ready for the next nucleotide addition.
`
Figure 1-9. Possible fluorescence labeling sites on a nucleotide: A) the
fluorophore is attached to the gamma phosphate group; B) the fluorophore
is attached through a cleavable linker to the 3'-hydroxyl group; C) the
fluorophore is attached through a cleavable linker to the base, while the
3'-hydroxyl is capped with a reversible chemical moiety.
`
`
`
`Chapter 2: Construction of a Four-color Nucleotide Library for DNA SBS
`
Figure 2-1. The 3-D structure of the ternary complex of a rat DNA
polymerase, a DNA template-primer, and dideoxycytidine triphosphate
(ddCTP). The left side of the illustration shows the mechanism for the
addition of ddCTP and the right side of the illustration shows the active
site of the polymerase. Note that the 3'-position of the dideoxyribose ring is
very crowded, while ample space is available at the 5-position of
the cytidine base.

Figure 2-2. A massively parallel DNA-sequencing approach using cleavable
fluorescent nucleotide analogues. In this approach, a chip is constructed
with immobilized DNA templates that are able to self-prime for initiating
the polymerase reaction. Four nucleotide analogues are designed such that
each is labeled with a unique fluorescent dye on a specific location of the
base, and a small chemical group (R) caps the 3'-OH group.

Figure 2-3. Proposed image changes in each cycle of a massively parallel
DNA-sequencing approach using cleavable fluorescent nucleotide
analogues.
`
`Figure 2-4. Immobilization of an azido-labeled PCR product on an alkynyl-
`functionalized surface, and a ligation reaction between the immobilized
`single-stranded DNA template and a loop primer to form a self-priming
`DNA moiety on the chip. The sequence of the loop primer is shown in (A).
`
Figure 2-5. Construction of an azido-PEG-functionalized glass slide, which
was prepared through a PEG linker for the immobilization of alkyne-labeled
self-priming DNA templates for DNA SBS, to reduce non-specific
absorption of fluorescent NRTs.

Figure 2-6. (A) Reaction scheme of SBS on a chip using four chemically
cleavable fluorescent reversible nucleotides. (B) The scanned four-color
fluorescence images for each step of SBS on a chip: (1) incorporation of
3'-O-allyl-dGTP-allyl-CY5; (2) cleavage of allyl-CY5 and 3'-allyl group; (3)
incorporation of 3'-O-allyl-dATP-allyl-ROX; (4) cleavage of allyl-ROX and
3'-allyl group; (5) incorporation of 3'-O-allyl-dUTP-allyl-R6G; (6) cleavage
of allyl-R6G and 3'-allyl group; (7) incorporation of
3'-O-allyl-dCTP-allyl-Bodipy-FL-510; (8) cleavage of allyl-Bodipy-FL-510 and
3'-allyl group; images (9)-(25) are similarly produced. (C) A plot (four-color
sequencing data) of raw fluorescence emission intensity at the four designated
emission wavelengths of the four chemically cleavable fluorescent nucleotides
vs. the progress of sequencing extension.
`
Figure 2-7. A general design scheme for 3'-O-(2-nitrobenzyl)
photocleavable fluorescent nucleotide reversible terminators, in which the
fluorescent dye is attached to the base portion of the nucleotide through a
photocleavable linker, and the 3'-hydroxyl group is also capped with a
photocleavable 2-nitrobenzyl group.
`
Figure 2-8. Structure of 3'-O-PC-dTTP 1.

Figure 2-9. Structures of 3'-O-PC-dNTPs-PC-fluorophore, with 4
fluorophores having distinct fluorescent emissions: 3'-O-PC-dCTP-PC-Bodipy-FL
(λabs(max) = 502 nm; λem(max) = 510 nm), 3'-O-PC-dUTP-PC-R6G
(λabs(max) = 525 nm; λem(max) = 585 nm), 3'-O-PC-dATP-PC-ROX
(λabs(max) = 585 nm; λem(max) = 602 nm), 3'-O-PC-dGTP-PC-CY5
(λabs(max) = 649 nm; λem(max) = 670 nm).
`
`Figure 2-10. MALDI-TOF MS spectra of continuous extension and
`photocleavage products.
`
Figure 2-11. A detailed scheme of the polymerase extension reaction using
3'-O-PC-dCTP-PC-Bodipy-FL-510 2 as a reversible terminator (left),
continuous polymerase extension scheme (middle) and MALDI-TOF
MS of the resulting DNA products (right).
`
`Figure 2-12. Substituent-dependent electrophilic reaction.
`
Figure 2-13. Structures of 3'-O-allyl-dNTPs-allyl-fluorophore.

Figure 2-14. A general design scheme for 3'-O-(azidomethyl) chemically
cleavable fluorescent nucleotide reversible terminators, in which the
fluorescent dye is attached to the base portion of the nucleotide through a
chemically cleavable linker (a short azido linker), and the 3'-hydroxyl group
is also capped with a chemically cleavable azidomethyl moiety.

Figure 2-15. The product residues of the chemical cleavage by TCEP
solution.

Figure 2-16. Staudinger reaction with TCEP to regenerate 3'-OH group of
the DNA extension products.

Figure 2-17. Structures of 3'-O-N3-dNTPs-N3(short)-Fluorophore.

Figure 2-18. Structures of 3'-O-N3-dNTPs-N3(long)-Fluorophore.
`
`Figure 2-19. A general approach for azidomethylation at 3'-hydroxy
`functions.
`
`Figure 2-20. Mechanism of a modified Pummerer's rearrangement.
`
Figure 2-21. Reaction scheme of SBS on a chip using a mixture of
3'-OR-dNTP (the major)/3'-O-R1-dNTPs-R2-Fluorophore (the minor) [R = R1 =
azidomethyl; R2 = N3(long)] (top panel). A plot (four-color sequencing data)
of raw fluorescence emission intensity at the four designated emission
wavelengths of the four chemically cleavable fluorescent nucleotides vs. the
progress of sequencing extension (bottom panel).
`
`Figure 2-22. A general design scheme for a mixed blocking and labeling
`system in novel nucleotides as a reversible terminator, in which the
`fluorescent dye is attached to the base portion of the nucleotide through a
`photocleavable linker, and the 3'-hydroxyl group is also capped with a
`different chemically cleavable moiety.
`
Figure 2-23. Structures of 3'-O-allyl-dNTPs-Fluorophore.

Figure 2-24. Structures of 3'-O-allyl-dNTP-Bodipy Series Dyes.

Figure 2-25. Comparison of 1H NMR of compounds 140 and 141.

Figure 2-26. Excitation and emission of compounds 140 and 141.

Figure 2-27. Structures of 3'-O-azidomethyl-dNTPs-Fluorophore.
`
Figure 2-28. Reaction scheme of sequencing on a chip using a combination of
3'-O-modified nucleotide reversible terminators (3'-O-R1-dNTPs) and
cleavable fluorophore-modified dideoxynucleotide terminators
(ddNTPs-R2-fluorophore). In this sequencing approach, a chip is constructed
with immobilized DNA templates that are able to self-prime for initiating the
polymerase reaction. The four 3'-O-modified nucleotides have a small
chemically reversible group (R1) to cap the 3'-OH moiety. Four cleavable
fluorophore-modified dideoxynucleotides are designed such that each is
attached to a unique fluorophore on the base through a cleavable linker
(R2). Upon adding the mixture of 3'-O-R1-dNTPs and
ddNTP-R2-fluorophores with the DNA polymerase, only the
dideoxynucleotide/3'-O-modified nucleotide analogues complementary to the next
nucleotide on the template are incorporated by polymerase on each spot of the
chip. The ratio of the two sets of nucleotide analogues is tuned so that in
each extension step, only a small amount of the fluorescent ddNTPs is
incorporated into the self-priming DNA template to produce adequate
signal for detection, while the rest of the templates are extended with the
nucleotide reversible terminators (step 1). After removing the excess reagents
and washing away any unincorporated dideoxynucleotide/3'-O-modified
nucleotide analogues, a four-color fluorescence imager is used to image the
surface of the chip, and the unique fluorescence emission from the specific
fluorophore on the dideoxynucleotide analogues on each spot of the chip
will yield the identity of the nucleotide (step 2a). After imaging, the
R2-fluorophore and the R1 protecting group will be removed by appropriate
cleavage conditions to generate, in high yield, DNA products with the
fluorophore removed and a free 3'-OH group, respectively (step 2b). The
self-primed DNA moiety on the chip at this stage is ready for the next cycle
of the reaction to identify the next nucleotide of the template
DNA (step 3).
`
`
`
Figure 2-29. A detailed scheme (panel i) of the polymerase reaction using all
four ddNTPs-PC-fluorophore to extend with a "ddA", "ddC", "ddG" and
"ddU" and the subsequent cleavage reaction to cleave off the fluorophore
from the DNA extension product. MALDI-TOF MS spectra (panel ii)
verifying base-specific incorporation of: (A) Primer extended with
ddATP-PC-ROX (1) (peak at 9,053 m/z), (B) its photocleavage product 2 (8,315
m/z); (C) Primer extended with ddCTP-PC-Bodipy-FL-510 (3) (peak at 8,788 m/z),
(D) its photocleavage product 4 (8,292 m/z); (E) Primer extended with
ddGTP-PC-Bodipy-650 (5) (peak at 9,042 m/z), (F) its photocleavage product
6 (8,331 m/z); (G) Primer extended with ddUTP-PC-R6G (7) (peak at 8,955
m/z) and (H) its photocleavage product 8 (8,293 m/z).

Figure 2-30. A) Sequencing scheme of the Sanger-SBS hybrid approach on a
chip using a combination of four photocleavable fluorescent
dideoxynucleotides and four 3'-O-PC-modified NRTs. B) The scanned
4-color fluorescence images for each step of Sanger-SBS hybrid sequencing on
a chip: (1) incorporation of ddATP-PC-ROX and 3'-O-PC-dATP; (2)
cleavage of PC-ROX and 3'-O-PC group; (3) incorporation of
ddCTP-PC-Bodipy-FL-510 and 3'-O-PC-dCTP; (4) cleavage of PC-Bodipy-FL-510 and
3'-O-PC group; (5) incorporation of ddATP-PC-ROX and 3'-O-PC-dATP; (6)
cleavage of PC-ROX and 3'-O-PC group; (7) incorporation of
ddUTP-PC-R6G and 3'-O-PC-dTTP; (8) cleavage of PC-R6G and 3'-O-PC group;
images (9) to (15) are similarly produced. C) A plot (4-color sequencing data)
of raw fluorescence emission intensities at the four designated emission
wavelengths of the four photocleavable fluorescent dideoxynucleotides.
`
Chapter 3: Construction of a 3'-O-Labeled Nucleotide Library for DNA
Sequencing

Figure 3-1. A library of 3'-O-labeled nucleotides.
`
`Figure 3-2. Step-by-step MALDI-TOF MS of continuous DNA extension
`and photocleavage products.
`
Figure 3-3. MALDI-TOF MS spectra of primer extension products with
3'-O-PC-dNTPs (PC, 2-nitrobenzyl). All four 3'-O-modified nucleotides are
quantitatively incorporated into the primers with high efficiency in the
polymerase reaction, which indicates that the modified nucleotides are
good substrates for the polymerase. The small peaks near the extension
products correspond to the photocleaved product generated during the
laser desorption and ionization process used in MALDI-TOF MS.

Figure 3-4. The polymerase extension scheme (left) and MALDI-TOF MS
spectra of the four consecutive extension products and their photocleaved
products (right) using 3'-O-PC-dNTPs. Primer extended with
3'-O-PC-dTTP (1) (right, A), and its photocleaved product 2 (right, B);
Product 2 extended with 3'-O-PC-dGTP (3) (right, C), and its photocleaved
product 4 (right, D); Product 4 extended with 3'-O-PC-dATP (5) (right, E), and
its photocleaved product 6 (right, F); Product 6 extended with 3'-O-PC-dCTP
(7) (right, G), and its photocleaved product 8 (right, H). After 10 seconds of
irradiation at 355 nm the photocleavage is complete, with the 3'-O-PC group
cleaved from the extended DNA products.
`
Figure 3-5. Scheme for incorporation and cleavage of 3'-O-azidomethyl
dNTPs.

Figure 3-6. MALDI-TOF MS spectra of incorporation products (left),
and MALDI-TOF MS spectra of reduction products (right).

Figure 3-7. The polymerase extension scheme (left) and MALDI-TOF MS
spectra of the three consecutive extension products and their cleavage
products (right) using 3'-O-(t-Bu-DTM)-dNTPs.

Figure 3-8. Comparison of reversible terminator-pyrosequencing using
3'-O-(2-nitrobenzyl)-dNTPs with conventional pyrosequencing using
natural nucleotides (NB, 2-nitrobenzyl). A) The self-priming DNA template
with stretches of homopolymeric regions was sequenced by using
3'-O-(2-nitrobenzyl)-dNTPs. The homopolymeric regions are clearly identified,
with each peak corresponding to the identity of each base in the DNA
template. B) Pyrosequencing data using natural nucleotides. The
homopolymeric regions produced two large peaks corresponding to
stretches of G and A bases and five smaller peaks corresponding to
stretches of T, G, C, A, and G bases. However, it is very difficult to decipher
the exact sequence from the data.
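
The contrast described in this caption can be illustrated with a toy model.
The sketch below is not the thesis's data analysis; it assumes idealized,
noise-free signals, a template string written in the order the polymerase
reads it, and it omits dispensations that produce no signal. The function
names are hypothetical.

```python
from itertools import groupby

# Watson-Crick complement of each template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def natural_flowgram(template: str) -> list[tuple[str, int]]:
    """Idealized peaks with natural dNTPs.

    A whole homopolymer run is extended in a single dispensation, so each
    peak height equals the run length and must be inferred from intensity.
    """
    synthesized = [COMPLEMENT[base] for base in template]
    return [(base, len(list(run))) for base, run in groupby(synthesized)]


def terminator_flowgram(template: str) -> list[tuple[str, int]]:
    """Idealized peaks with 3'-O-blocked reversible terminators.

    Only one nucleotide is added per incorporation/deblocking cycle, so
    every base yields its own unit-height peak.
    """
    return [(COMPLEMENT[base], 1) for base in template]


if __name__ == "__main__":
    template = "AAAGGC"
    print(natural_flowgram(template))
    print(terminator_flowgram(template))
```

In this simplified picture, the natural-nucleotide signal encodes run length
in peak height, which becomes hard to resolve for long runs, whereas the
reversible-terminator signal gives one countable peak per base.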
`
`Chapter 4: Summary and Prospect
`
`Figure 4-1. Theoretical 4-color DNA SBS read lengths based on cycle
`efficiency.
`
`Figure 4-2. DNA-Beads on Chip for 4-Color sequencing by synthesis.
`
`
`
`LIST OF SCHEMES AND TABLES
`
`Part 1: Synthesis of an 11-cis-Locked-biotinylated Retinoid for Sequestering 11-
`cis-Retinoid Binding Proteins
`
`Scheme 1-1. Retrosynthetic analysis for the biotinylated retinoid 1.
`
`Scheme 1-2. Synthesis of compound 10.
`
`Scheme 1-3. Completion of biotinylated retinoid 1.
`
`Part 2: Development of New DNA Sequencing Approaches Using Synthetic
`Chemistry
`
`Chapter 2: Construction of a Four-color Nucleotide Library for DNA SBS
`
`Table 2-1. Nucleophilic reactions of Compound 6 with 2-nitrobenzylbromide
`without N3 protection.
`
Scheme 2-1. The product residues of the photocleavage by irradiation at 355
nm.

Scheme 2-2. The mechanism of photolysis of the 2-nitrobenzyl moiety.

Scheme 2-3. Synthesis of 3'-O-PC-dTTP 1.

Scheme 2-4. Approach to separate compounds 10 and 11 with similar
polarity.

Scheme 2-5. Continuous polymerase extension scheme using 3'-O-PC-dTTP
1 as a photo-reversible terminator.

Scheme 2-6. Synthesis of compound 22.

Scheme 2-7. Synthesis of 3'-O-PC-dCTP-PC-Bodipy-FL-510 2.

Scheme 2-8. Synthesis of PC-Bodipy-FL-510 NHS ester 27.

Scheme 2-9. Synthesis of 3'-O-PC-dUTP-PC-R6G 3.
`
`
`
Scheme 2-10. Synthesis of PC-R6G NHS ester 39.

Scheme 2-11. Another approach to synthesize nucleoside 37.

Scheme 2-12. Retrosynthetic analysis of nucleotide 3'-O-PC-dGTP-PC-CY5 4.

Scheme 2-13. Synthesis of 3'-O-PC-dGTP-NH2 52.

Scheme 2-14. Another approach to perform palladium-catalyzed Sonogashira
reaction.

Scheme 2-15. Completion of synthesis of 3'-O-PC-dGTP-PC-CY5 4.

Scheme 2-16. Completion of synthesis of 3'-O-PC-dGTP-PC-Bodipy 650 60.

Scheme 2-17. Synthesis of 3'-O-PC-dATP-NH2 69.

Scheme 2-18. Completion of synthesis of 3'-O-PC-dATP-PC-ROX 5.

Scheme 2-19. Synthesis of 3'-O-azidomethyl-dUTP-NH2 73.

Scheme 2-20. Synthesis of the long azido linker 84.

Scheme 2-21. Completion of synthesis of 3'-O-N3-dUTP-N3(long)-R6G 87.

Scheme 2-22. Completion of synthesis of 3'-O-N3-dUTP-N3(short)-R6G 75.

Scheme 2-23. Synthesis of 3'-O-azidomethyl-dCTP-NH2 99.

Scheme 2-24. Completion of synthesis of
3'-O-N3-dCTP-N3(long)-Bodipy-FL-510 102.

Scheme 2-25. Completion of synthesis of
3'-O-N3-dCTP-N3(short)-Bodipy-FL-510 74.

Scheme 2-26. Synthesis of 3'-O-azidomethyl-dATP-NH2 112.

Scheme 2-27. Completion of synthesis of 3'-O-N3-dATP-N3(long)-ROX 115.

Scheme 2-28. Completion of synthesis of 3'-O-N3-dATP-N3(short)-ROX 76.

Scheme 2-29. Synthesis of 3'-O-azidomethyl-dGTP-NH2 124.

Scheme 2-30. Completion of synthesis of 3'-O-N3-dGTP-N3(short)-CY5 77.

Scheme 2-31. Completion of synthesis of 3'-O-N3-dGTP-N3(long)-CY5 129.

Scheme 2-32. A general approach to synthesize
3'-O-allyl-dNTPs-PC-Fluorophore.
`
`
`
Scheme 2-33. An unexpected reaction in transforming PC-Bodipy-Bodipy-558/568
to its corresponding NHS ester.

Scheme 2-34. The reaction in transforming PC-Bodipy-Bodipy-564/570 to its
corresponding NHS ester.

Scheme 2-35. Synthesis of 3'-O-N3-dNTPs-PC-Fluorophore 147-150.

Scheme 2-36. Synthesis of 3'-O-(tert-butyl-dithiomethyl)-dUTP-PC-R6G 156.

Scheme 2-37. Synthesis of ddNTPs-PC-Fluorophore 161-164.

Scheme 2-38. A general approach to synthesize
ddNTPs-N3(long)-Fluorophore 165-168.
`
Chapter 3: Construction of a 3'-O-Labeled Nucleotide Library for DNA
Sequencing

Scheme 3-1. Synthesis of 3'-PC-dGTP 1.

Scheme 3-2. Demethylation of compound 5a in NaOH (2N aq) for
different reaction times.

Scheme 3-3. Continuous polymerase extension scheme using 3'-O-PC-dGTP
1 as a reversible terminator (right) and schematic representation of the
polymerase DNA extension reaction and photocleavage reaction (left).

Scheme 3-4. Synthesis of 3'-PC-dATP 19.

Scheme 3-5. Synthesis of 3'-O-PC-dCTP 24.

Scheme 3-6. Synthesis of 3'-O-azidomethyl-dATP 28.

Scheme 3-7. Synthesis of 3'-O-azidomethyl-dCTP 35.

Scheme 3-8. Synthesis of 3'-O-azidomethyl-dGTP 40.

Scheme 3-9. Proposed mechanism of the dithiomethyl bond cleavage based
on the observation of the intermediate in the MALDI-TOF mass spectra.

Scheme 3-10. Synthesis of 3'-O-(tert-butyldithiomethyl)-dNTPs.
`
`
`
Scheme 3-11. Synthesis of 3'-O-allyloxycarbonyl-dTTP 50.

Scheme 3-12. Enzymatic synthesis of 3'-O-allyloxycarbonyl-dATP.

Scheme 3-13. Synthesis of 3'-O-allyloxycarbonyl-dCTP 62.

Scheme 3-14. First attempt to prepare
5'-O-TBDMS-3'-O-(N-allylcarbamoyl)-thymidine 63.

Scheme 3-15. Synthesis of 3'-O-(N-allylcarbamoyl)-dTTP 64.
`
`
`
`ABBREVIATIONS AND SYMBOLS
`
APS         adenosine 5'-phosphosulfate
ATP         adenosine 5'-triphosphate
Bodipy      4,4-difluoro-4-bora-3a,4a-diaza-s-indacene
bp          base pair
t-Bu-DTM    tert-butyl-dithiomethyl
`
Cy5         cyanine-5
dA          deoxyadenosine
dC          deoxycytidine
ddNTP       dideoxynucleoside triphosphate
dG          deoxyguanosine

DIPEA

DABCO

DMAP

DMF

DMSO

DNA

dNTP

DSC

dT

EDC

EDTA
`
`N,N-diisopropy