United States Patent [19]
`Lerner et al.
`
`US006060596A
[11] Patent Number: 6,060,596
[45] Date of Patent: May 9, 2000
`
`[54] ENCODED COMBINATORIAL CHEMICAL
`LIBRARIES
`
[75] Inventors: Richard Lerner, La Jolla; Kim Janda, San Diego, both of Calif.; Sydney Brenner, Edwards Passage, United Kingdom
`
`[73] Assignee: The Scripps Research Institute, La
`Jolla, Calif.
`
`[21] Appl. No.: 09/033,743
`
[22] Filed: Mar. 3, 1998
`
`Related U.S. Application Data
`
`[62] Division of application No. 08/665,511, Jun. 18, 1996, Pat.
`No. 5,723,598, which is a division of application No.
`07/860,445, Mar. 30, 1992, Pat. No. 5,573,905.
[51] Int. Cl.7 ..................................................... C07H 21/00
[52] U.S. Cl. ............................................................. 536/25.3
`[58] Field of Search .................................. 536/25.3, 24.2;
`436/518, 536; 435/6
`
[56] References Cited

U.S. PATENT DOCUMENTS

4,748,111    5/1988   Dattagupta et al. ........ 435/6
4,923,901    5/1990   Koester et al. ........... 521/53
4,965,188   10/1990   Mullis et al. ............ 435/6
5,082,780    1/1992   Warren et al. ............ 435/191
5,141,813    8/1992   Nelson ................... 428/402

FOREIGN PATENT DOCUMENTS

0 323 152    5/1989   European Pat. Off.

OTHER PUBLICATIONS
`
Clontech Laboratories, Inc., Sales Literature: 2-3 (1994).
Cwirla, et al., "Peptides on phage: A vast library of peptides for identifying ligands", Proc. Natl. Acad. Sci. USA 87: 6378-6382 (1990).
Devlin, et al., "Random Peptide Libraries: A Source of Specific Protein Binding Molecules", Science 249: 404-406 (1990).
Fodor, et al., "Light-Directed, Spatially Addressable Parallel Chemical Synthesis", Science 251: 767-775 (1991).
Geysen, et al., "Use of peptide synthesis to probe viral antigens for epitopes to a resolution of a single amino acid", Proc. Natl. Acad. Sci. USA 81: 3998-4002 (1984).
Houghten, et al., "Generation and use of synthetic peptide combinatorial libraries for basic research and drug discovery", Nature 354: 84-86 (1991).
Lam, et al., "A new type of synthetic peptide library for identifying ligand-binding activity", Nature 354: 82-84 (1991).
Maeji, et al., "Simultaneous multiple synthesis of peptide-carrier conjugates", J. Immunol. Met. 146: 83-90 (1992).
Nelson, et al., "A new and versatile reagent for incorporating multiple primary amines into synthetic oligonucleotides", Nucleic Acids Research 17: 7179-7195 (1989).
Nelson, et al., "Oligonucleotide labeling methods 3. Direct labeling of oligonucleotides employing a novel, non-nucleosidic, 2-aminobutyl-1,3-propanediol backbone", Nucleic Acids Research 20: 6253-6259 (1992).
Scott, et al., "Searching for Peptide Ligands with an Epitope Library", Science 249: 386-390 (1990).
Hays, et al., "High-Yield Synthesis of Oligoribonucleotides Using o-Nitrobenzyl Protection of 2'-Hydroxyls", Tetrahedron Letters 26: 2407-2410 (1985).
Hampel et al., Nucleic Acids Research, vol. 18, No. 2, pp. 299-304, 1990.
Alberts et al., Molecular Biology of the Cell, p. 343. Garland Publishing, Inc., New York, 1983.
`
`Primary Examiner-Remy Yucel
`Attorney, Agent, or Firm-Thomas E. Northrup
[57] ABSTRACT

The present invention describes an encoded combinatorial chemical library comprised of a plurality of bifunctional molecules having both a chemical polymer and an identifier oligonucleotide sequence that defines the structure of the chemical polymer. Also described are the bifunctional molecules of the library, and methods of using the library to identify chemical structures within the library that bind to biologically active molecules in preselected binding interactions.
`
`17 Claims, 2 Drawing Sheets
`
`
`
`
`ILMN EXHIBIT 1001
`
`

`
              Sty I                             Apa I
5' AGCTACTTCCCAAGG[coding sequence]GGGCCCTATTCTTAG 3'
3' TCGATGAAGGGTTCC[anticoding strand]CCCGGGATAAGAATC 5'

Step 1: Cleavage by Sty I and Apa I

5' AGCTACTTCC        CAAGG[coding sequence]GGGCC      CTATTCTTAG 3'
3' TCGATGAAGGGTTC        C[anticoding strand]C    CCGGGATAAGAATC 5'

Step 2: Biotinylation

5' AGCTACTTCCC       CAAGG[coding sequence]GGGCC      CTATTCTTAG 3'
3' TCGATGAAGGGTTC     BBCC[anticoding strand]C    CCGGGATAAGAATC 5'

FIG. 1
`
`
`
`
`
`

`
U.S. Patent    May 9, 2000    Sheet 2 of 2    6,060,596
`
P1-LINK

↓ Step 1

CACATG-P1-LINK-gly
ACGGTA-P1-LINK-met

↓ Step 2

CACATGCACATG-P1-LINK-gly.gly
CACATGACGGTA-P1-LINK-met.gly
ACGGTACACATG-P1-LINK-gly.met
ACGGTAACGGTA-P1-LINK-met.met

↓ Step 3

P2CACATGCACATGCACATGP1-LINK-gly.gly.gly
P2CACATGCACATGACGGTAP1-LINK-met.gly.gly
P2CACATGACGGTACACATGP1-LINK-gly.met.gly
P2CACATGACGGTAACGGTAP1-LINK-met.met.gly
P2ACGGTACACATGCACATGP1-LINK-gly.gly.met
P2ACGGTACACATGACGGTAP1-LINK-met.gly.met
P2ACGGTAACGGTACACATGP1-LINK-gly.met.met
P2ACGGTAACGGTAACGGTAP1-LINK-met.met.met

P1 = GGGCCCTATTCTTAG
P2 = AGCTACTTCCCAAGG
`
`FIG. 2
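The split, couple, and pool cycles drawn in FIG. 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_and_pool` is hypothetical, the two hexanucleotide codes and the P1 and P2 sequences are copied from the figure, and the strings follow the figure's left-to-right notation (each new code is written to the left of the earlier codes, each new chemical unit to the right of the earlier units).

```python
# Unit identifier codes and primer sites as printed in FIG. 2.
CODES = {"gly": "CACATG", "met": "ACGGTA"}
P1 = "GGGCCCTATTCTTAG"
P2 = "AGCTACTTCCCAAGG"

def split_and_pool(codes, cycles):
    """Return {peptide: oligonucleotide} after `cycles` rounds.

    Hypothetical helper: each round splits the pool into one aliquot per
    chemical unit, couples the unit and its code, then pools the aliquots.
    """
    pool = [([], "")]  # (units in coupling order, concatenated codes)
    for _ in range(cycles):
        next_pool = []
        for unit, code in codes.items():      # one aliquot per chemical unit
            for units, tag in pool:
                # Figure notation: new code on the left, new unit on the right.
                next_pool.append((units + [unit], code + tag))
        pool = next_pool                      # pooling step
    return {".".join(units): P2 + tag + P1 for units, tag in pool}

library = split_and_pool(CODES, 3)
for peptide, oligo in library.items():
    print(f"{oligo}-LINK-{peptide}")
```

With two chemical units and three cycles the sketch yields the eight species listed under Step 3 of the figure.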
`
`
`
`
`

`
`ENCODED COMBINATORIAL CHEMICAL
`LIBRARIES
`
`This application is a divisional of Ser. No. 08/665,511,
`filed Jun. 18, 1996, now U.S. Pat. No. 5,723,598, which is
`a divisional of Ser. No. 07/860,445, filed Mar. 30, 1992, now
`U.S. Pat. No. 5,573,905.
`
DESCRIPTION

1. Technical Field
The present invention relates to encoded chemical libraries that contain repertoires of chemical structures defining a diversity of biological structures, and methods for using the libraries.

2. Background
There is an increasing need to find new molecules which can effectively modulate a wide range of biological processes, for applications in medicine and agriculture. A standard way of searching for novel bioactive chemicals is to screen collections of natural materials, such as fermentation broths or plant extracts, or libraries of synthesized molecules using assays which can range in complexity from simple binding reactions to elaborate physiological preparations. The screens often only provide leads which then require further improvement either by empirical methods or by chemical design. The process is time-consuming and costly but it is unlikely to be totally replaced by rational methods even when they are based on detailed knowledge of the chemical structure of the target molecules. Thus, what we might call "irrational drug design", the process of selecting the right molecules from large ensembles or repertoires, requires continual improvement both in the generation of repertoires and in the methods of selection.
Recently there have been several developments in using peptides or nucleotides to provide libraries of compounds for lead discovery. The methods were originally developed to speed up the determination of epitopes recognized by monoclonal antibodies. For example, the standard serial process of stepwise search of synthetic peptides now encompasses a variety of highly sophisticated methods in which large arrays of peptides are synthesized in parallel and screened with acceptor molecules labelled with fluorescent or other reporter groups. The sequence of any effective peptide can be decoded from its address in the array. See for example Geysen et al., Proc. Natl. Acad. Sci. USA, 81:3998-4002 (1984); Maeji et al., J. Immunol. Met., 146:83-90 (1992); and Fodor et al., Science, 251:767-775 (1991).
In another approach, Lam et al., Nature, 354:82-84 (1991) describe combinatorial libraries of peptides that are synthesized on resin beads such that each resin bead contains about 20 pmoles of the same peptide. The beads are screened with labelled acceptor molecules and those with bound acceptor are searched for by visual inspection, physically removed, and the peptide identified by direct sequence analysis. In principle, this method could be used with other chemical entities but it requires sensitive methods for sequence determination.
A different method of solving the problem of identification in a combinatorial peptide library is used by Houghten et al., Nature, 354:84-86 (1991). For hexapeptides of the 20 natural amino acids, 400 separate libraries are synthesized, each with the first two amino acids fixed and the remaining four positions occupied by all possible combinations. An assay, based on competition for binding or other activity, is then used to find the library with an active peptide. Then twenty new libraries are synthesized and assayed to determine the effective amino acid in the third position, and the process is reiterated in this fashion until the active hexapeptide is defined. This is analogous to the method used in searching a dictionary; the peptide is decoded by construction using a series of sieves or buckets and this makes the search logarithmic.
A very powerful biological method has recently been described in which the library of peptides is presented on the surface of a bacteriophage such that each phage has an individual peptide and contains the DNA sequence specifying it. The library is made by synthesizing a repertoire of random oligonucleotides to generate all combinations, followed by their insertion into a phage vector. Each of the sequences is cloned in one phage and the relevant peptide can be selected by finding those that bind to the particular target. The phages recovered in this way can be amplified and the selection repeated. The sequence of the peptide is decoded by sequencing the DNA. See for example Cwirla et al., Proc. Natl. Acad. Sci. USA, 87:6378-6382 (1990); Scott et al., Science, 249:386-390 (1990); and Devlin et al., Science, 249:404-406 (1990).
Another "genetic" method has been described where the libraries are the synthetic oligonucleotides themselves wherein active oligonucleotide molecules are selected by binding to an acceptor and are then amplified by the polymerase chain reaction (PCR). PCR allows serial enrichment and the structure of the active molecules is then decoded by DNA sequencing on clones generated from the PCR products. The repertoire is limited to nucleotides and the natural pyrimidine and purine bases or those modifications that preserve specific Watson-Crick pairing and can be copied by polymerases.
The main advantages of the genetic methods reside in the capacity for cloning and amplification of DNA sequences, which allows enrichment by serial selection and provides a facile method for decoding the structure of active molecules. However, the genetic repertoires are restricted to nucleotides and peptides composed of natural amino acids and a more extensive chemical repertoire is required to populate the entire universe of binding sites. In contrast, chemical methods can provide limitless repertoires but they lack the capacity for serial enrichment and there are difficulties in discovering the structures of selected active molecules.

BRIEF SUMMARY OF THE INVENTION

The present invention provides a way of combining the virtues of both of the chemical and genetic methods summarized above through the construction of encoded combinatorial chemical libraries, in which each chemical sequence is labelled by an appended "genetic" tag, itself constructed by chemical synthesis, to provide a "retrogenetic" way of specifying each chemical structure.
In outline, two alternating parallel combinatorial syntheses are performed so that the genetic tag is chemically linked to the chemical structure being synthesized; in each case, the addition of one of the particular chemical units to the structure is followed by the addition of an oligonucleotide sequence, which is defined to "code" for that chemical unit, i.e., to function as an identifier for the structure of the chemical unit. The library is built up by the repetition of this process after pooling and division.
Active molecules are selected from the library so produced by binding to a preselected biological molecule of interest. Thereafter, the identity of the active molecule is determined by reading the genetic tag, i.e., the identifier
`
`
`
`
`

`
oligonucleotide sequence. In one embodiment, amplified copies of their retrogenetic tags can be obtained by the polymerase chain reaction.
The strands of the amplified copies with the appropriate polarity can then be used to enrich for a subset of the library by hybridization with the matching tags and the process can then be repeated on this subset. Thus serial enrichment is achieved by a process of purification exploiting linkage to a nucleotide sequence which can be amplified. Finally, the structures of the chemical entities are decoded by cloning and sequencing the products of the PCR reaction.
The present invention therefore provides a novel method for identifying a chemical structure having a preselected binding activity through the use of a library of bifunctional molecules that provides a rich source of chemical diversity. The library is used to identify chemical structures (structural motifs) that interact with preselected biological molecules.
Thus, in one embodiment, the invention contemplates a bifunctional molecule according to the formula A-B-C, where A is a chemical moiety, B is a linker molecule operatively linked to A and C, and C is an identifier oligonucleotide comprising a sequence of nucleotides that identifies the structure of chemical moiety A.
In another embodiment, the invention contemplates a library comprising a plurality of species of bifunctional molecules, thereby forming a repertoire of chemical diversity.
Another embodiment contemplates a method for identifying a chemical structure that participates in a preselected binding interaction with a biologically active molecule, where the chemical structure is present in the library of bifunctional molecules according to this invention. The method comprises the steps of:
a) admixing in solution the library of bifunctional molecules with the biologically active molecule under binding conditions for a time period sufficient to form a binding reaction complex;
b) isolating the complex formed in step (a); and
c) determining the nucleotide sequence of the polymer identifier oligonucleotide in the isolated complex and thereby identifying the chemical structure that participated in the preselected binding interaction.
The invention also contemplates a method for preparing a library according to this invention comprising the steps of:
a) providing a linker molecule B having termini A' and C' according to the formula A'-B-C' that is adapted for reaction with a chemical precursor unit X' at terminus A' and with a nucleotide precursor Z' at terminus C';
b) conducting syntheses by adding chemical precursor unit X' to terminus A' of said linker and adding precursor unit identifier oligonucleotide Z' to terminus C' of said linker, to form a composition containing bifunctional molecules having the structure Xn-B-Zn;
c) repeating step (b) on one or more aliquots of the composition to produce aliquots that contain a product containing a bifunctional molecule;
d) combining the aliquots produced in step (c) to form an admixture of bifunctional molecules, thereby forming said library.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, forming a portion of this disclosure:
FIG. 1 illustrates a scheme for the restriction endonuclease cleavage of a PCR amplification product derived from a bifunctional molecule of this invention (Step 1), and the subsequent addition of biotin to the cleaved PCR product (Step 2). The unique coding and non-coding nucleotide base sequences shown in FIG. 1 are listed in the Sequence Listing, SEQ ID NOs 15-22.
FIG. 2 illustrates the process of producing a library of bifunctional molecules according to the method described in Example 9. The nucleotide base sequences shown in FIG. 1 are listed in the Sequence Listing, SEQ ID NOs 15-22.

DETAILED DESCRIPTION OF THE INVENTION

A. Encoded Combinatorial Chemical Libraries
An encoded combinatorial chemical library is a composition comprising a plurality of species of bifunctional molecules that each define a different chemical structure and that each contain a unique identifier oligonucleotide whose nucleotide sequence defines the corresponding chemical structure.
1. Bifunctional Molecules
A bifunctional molecule is the basic unit in a library of this invention, and combines the elements of a polymer comprised of a series of chemical building blocks to form a chemical moiety in the library, and a code for identifying the structure of the chemical moiety.
Thus, a bifunctional molecule can be represented by the formula A-B-C, where A is a chemical moiety, B is a linker molecule operatively linked to A and C, and C is an identifier oligonucleotide comprising a sequence of nucleotides that identifies the structure of chemical moiety A.
a. Chemical Polymers
A chemical moiety in a bifunctional molecule of this invention is represented by A in the above formula A-B-C and is a polymer comprising a linear series of chemical units represented by the formula (Xn)a wherein X is a single chemical unit in polymer A and n is a position identifier for X in polymer A. n has the value of 1+i where i is an integer from 0 to 10, such that when n is 1, X is located most proximal to the linker (B).
Although the length of the polymer, defined by a, can vary, practical library size limitations arise if there is a large alphabet size, as discussed further herein. Typically, a is an integer from 4 to 50.
A chemical moiety (polymer A) can be any of a variety of polymeric structures, depending on the choice of classes of chemical diversity to be represented in a library of this invention. Polymer A can be any monomeric chemical unit that can be coupled and extended in polymeric form. For example, polymer A can be a polypeptide, oligosaccharide, glycolipid, lipid, proteoglycan, glycopeptide, sulfonamide, nucleoprotein, conjugated peptide (i.e., having prosthetic groups), polymer containing enzyme substrates, including transition state analogues, and the like biochemical polymers. Exemplary is the polypeptide-based library described herein.
Where the library is comprised of peptide polymers, the chemical unit X can be selected to form a region of a natural protein or can be a non-natural polypeptide, can be comprised of natural D-amino acids, or can be comprised of non-natural amino acids or mixtures of natural and non-natural amino acids.
The non-natural combinations provide for the identification of useful and unique structural motifs involved in biological interactions.
Non-natural amino acids include modified amino acids and L-amino acids, stereoisomer of D-amino acids. The amino acid residues described herein are preferred to be in
`
`
`
`
`

`
the "L" isomeric form. NH2 refers to the free amino group present at the amino terminus of a polypeptide. COOH refers to the free carboxy group present at the carboxy terminus of a polypeptide. In keeping with standard polypeptide nomenclature (J. Biol. Chem., 243:3552-59 (1969), adopted at 37 C.F.R. §1.822(b)(2)), abbreviations for amino acid residues are shown in the following Table of Correspondence:

TABLE OF CORRESPONDENCE

SYMBOL
1-Letter   3-Letter   AMINO ACID
Y          Tyr        tyrosine
G          Gly        glycine
F          Phe        phenylalanine
M          Met        methionine
A          Ala        alanine
S          Ser        serine
I          Ile        isoleucine
L          Leu        leucine
T          Thr        threonine
V          Val        valine
P          Pro        proline
K          Lys        lysine
H          His        histidine
Q          Gln        glutamine
E          Glu        glutamic acid
W          Trp        tryptophan
R          Arg        arginine
D          Asp        aspartic acid
N          Asn        asparagine
C          Cys        cysteine

The phrase "amino acid residue" is broadly defined to include the amino acids listed in the Table of Correspondence and modified and unusual amino acids, such as those listed in 37 C.F.R. §1.822(b)(4), and incorporated herein by reference.
The polymer defined by chemical moiety A can therefore contain any polymer backbone modifications that provide increased chemical diversity. In building of a polypeptide system as exemplary, a variety of modifications are contemplated, including the following backbone structures: -NHN(R)CO-, -NHB(R)CO-, -NHC(RR')CO-, -NHC6H4CO-, -NHC(=CHR)CO-, -NHCH2CHRCO-, -NHCHRCH2CO-, and lactam structures.
In addition, amide bond modifications are contemplated including -COCH2-, -COS-, -CONR, -COO-, -CSNH-, -CH2NH-, -CH2CH2-, -CH2S-, -CH2SO-, -CH2SO2-, -CH(CH3)S-, -CH=CH-, -NHCO-, -NHCONH-, -CONHO-, and -C(=CH2)CH2-.
b. Polymer Identifier Oligonucleotide
An identifier oligonucleotide in a bifunctional molecule of this invention is represented by C in the above formula A-B-C and is an oligonucleotide having a sequence represented by the formula (Zn)a wherein Z is a unit identifier nucleotide sequence within oligonucleotide C that identifies the chemical unit X at position n. n has the value of 1+i where i is an integer from 0 to 10, such that when n is 1, Z is located most proximal to the linker (B). a is an integer as described previously to connote the number of chemical unit identifiers in the oligonucleotide.
For example, a bifunctional molecule can be represented by the formula:

X4X3X2X1-B-Z1Z2Z3Z4

In this example, the sequence of oligonucleotides Z1, Z2, Z3 and Z4 identifies the structure of chemical units X1, X2, X3 and X4, respectively. Thus, there is a correspondence in the identifier sequence between a chemical unit X at position n and the unit identifier oligonucleotide Z at position n.
The length of a unit identifier oligonucleotide can vary depending on the complexity of the library, the number of different chemical units to be uniquely identified, and other considerations relating to requirements for uniqueness of oligonucleotides such as hybridization and polymerase chain reaction fidelity. A typical length can be from about 2 to about 10 nucleotides, although nothing is to preclude a unit identifier from being longer.
Insofar as adenosine (A), guanosine (G), thymidine (T) and cytosine (C) represent the typical choices of nucleotides for inclusion in a unit identifier oligonucleotide, A, G, T and C form a representative "alphabet" used to "spell" out a unit identifier oligonucleotide's sequence. Other nucleotides or nucleotide analogs can be utilized in addition to or in place of the above four nucleotides, so long as they have the ability to form Watson-Crick pairs and be replicated by DNA polymerases in a PCR reaction. However, the nucleotides A, G, T and C are preferred.
For the design of the code in the identifier oligonucleotide, it is essential to choose a coding representation such that no significant part of the oligonucleotide sequence can occur in another unrelated combination by chance or otherwise during the manipulations of a bifunctional molecule in the library.
For example, consider a library where Z is a trinucleotide whose sequence defines a unique chemical unit X. Because the methods of this invention provide for all combinations and permutations of an alphabet of chemical units, it is possible for two different unit identifier oligonucleotide sequences to have closely related sequences that differ by only a frame shift and therefore are not easily distinguishable by hybridization or sequencing unless the frame is clear.
Other sources of misreading of a unit identifier oligonucleotide can arise. For example, mismatch in DNA hybridization, transcription errors during a primer extension reaction to amplify or sequence the identifier oligonucleotide, and the like errors can occur during a manipulation of a bifunctional molecule.
The invention contemplates a variety of means to reduce the possibility of error in reading the identifier oligonucleotide, such as using longer nucleotide lengths for a unit identifier nucleotide sequence so as to reduce the similarity between unit identifier nucleotide sequences. Typical lengths depend on the size of the alphabet of chemical units.
A representative system useful for eliminating read errors due to frame shift or mutation is a code developed as a theoretical alternative to the genetic code and is known as the commaless genetic code.
Where the chemical units are amino acids, a convenient unit identifier nucleotide sequence is the well known genetic code using triplet codons. The invention need not be limited by the translation afforded between the triplet codon of the genetic code and the natural amino acids; other systems of correspondence can be assigned.
A typical and exemplary unit identifier nucleotide sequence is based on the commaless code having a length of six nucleotides (hexanucleotide) per chemical unit.
Preferably, an identifier oligonucleotide has at least 15 nucleotides in the tag (coding) region for effective hybridization. In addition, considerations of the complexity of the library, the size of the alphabet of chemical units, and the
`
`
`

`
length of the polymer of the chemical moiety all contribute to the length of the identifier oligonucleotide, as discussed in more detail herein.
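The frame-shift concern behind the commaless code can be made concrete with a short check. The sketch below is an illustration only: `is_comma_free` is a hypothetical helper, and it tests the standard comma-free property (no codeword appears in any shifted frame of two concatenated codewords). The first example set uses the two hexanucleotides printed in FIG. 2; the second set is an invented counterexample.

```python
from itertools import product

def is_comma_free(codewords):
    """Return True if no codeword can be read in a shifted frame.

    Hypothetical helper: for every ordered pair of codewords, every
    out-of-frame window of the concatenation must not itself be a codeword.
    """
    words = set(codewords)
    if not words:
        return True
    length = len(next(iter(words)))
    if any(len(w) != length for w in words):
        raise ValueError("all codewords must have the same length")
    for left, right in product(words, repeat=2):
        joined = left + right
        for shift in range(1, length):        # out-of-frame windows only
            if joined[shift:shift + length] in words:
                return False
    return True

fig2_codes = ["CACATG", "ACGGTA"]       # unit identifiers printed in FIG. 2
ambiguous_codes = ["ACACAC", "CACACA"]  # invented example that fails the check
print(is_comma_free(fig2_codes))
print(is_comma_free(ambiguous_codes))
```

A code that passes this check can be read without knowing the frame in advance, which is the property the text relies on for hybridization and sequencing.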
In a preferred embodiment, an identifier oligonucleotide C has a nucleotide sequence according to the formula P1-(Zn)a-P2, where P1 and P2 are nucleotide sequences that provide polymerase chain reaction (PCR) primer binding sites adapted to amplify the polymer identifier oligonucleotide. The requirements for PCR primer binding sites are generally well known in the art; the sites are designed to allow a PCR amplification product (a PCR-amplified duplex DNA fragment) to be formed that contains the polymer identifier oligonucleotide sequences.
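Reading an identifier of this form back into a chemical structure amounts to stripping the two primer binding sites and translating the unit identifiers one at a time. The following is a minimal sketch, assuming the six-nucleotide codes, the gly/met assignments, and the left-to-right layout (P2, codes, P1) printed in FIG. 2; the function name `decode_identifier` is hypothetical.

```python
# Assumed from FIG. 2: flanking sites, code table, and six-nucleotide units.
LEFT_SITE = "AGCTACTTCCCAAGG"    # written as P2 in FIG. 2
RIGHT_SITE = "GGGCCCTATTCTTAG"   # written as P1 in FIG. 2
CODE_TO_UNIT = {"CACATG": "gly", "ACGGTA": "met"}
UNIT_LEN = 6

def decode_identifier(oligo):
    """Return chemical units, linker-proximal first, for one identifier."""
    if not (oligo.startswith(LEFT_SITE) and oligo.endswith(RIGHT_SITE)):
        raise ValueError("primer binding sites not found at both ends")
    coding = oligo[len(LEFT_SITE):len(oligo) - len(RIGHT_SITE)]
    if len(coding) % UNIT_LEN:
        raise ValueError("coding region is not a whole number of units")
    codes = [coding[i:i + UNIT_LEN] for i in range(0, len(coding), UNIT_LEN)]
    units = [CODE_TO_UNIT[code] for code in codes]  # KeyError if code unknown
    # In FIG. 2 the code next to P1 belongs to the unit next to the linker,
    # so the list is reversed to give linker-proximal order.
    return list(reversed(units))

tag = LEFT_SITE + "CACATGCACATGACGGTA" + RIGHT_SITE
print(".".join(decode_identifier(tag)))
```

The example tag is the second species listed under Step 3 of FIG. 2.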
The presence of the two PCR primer binding sites, P1 and P2, flanking the identifier oligonucleotide sequence (Zn)a provides a means to produce a PCR-amplified duplex DNA fragment derived from the bifunctional molecule using PCR. This design is useful to allow the amplification of the tag sequence present on a particular bifunctional molecule for cloning and sequencing purposes in the process of reading the identifier code to determine the structure of the chemical moiety in the bifunctional molecule.
More preferred is a bifunctional molecule where one or both of the nucleotide sequences P1 and P2 are designed to contain a means for removing the PCR primer binding sites from the identifier oligonucleotide sequences. Removal of the flanking P1 and P2 sequences is desirable so that their sequences do not contribute to a subsequent hybridization reaction. A preferred means for removing the PCR primer binding sites from a PCR amplification product is a restriction endonuclease site within the PCR-amplified duplex DNA fragment.
Restriction endonucleases are well known in the art and are enzymes that recognize specific lengths of duplex DNA and cleave the DNA in a sequence-specific manner.
Preferably, the restriction endonuclease sites should be positioned proximal to (Zn)a relative to the PCR primer binding sites to maximize the amount of P1 and P2 that is removed upon treating a bifunctional molecule with the specific restriction endonuclease. More preferably, P1 and P2 each are adapted to form a restriction endonuclease site in the resulting PCR-amplified duplex DNA, and the two restriction sites, when cleaved by the restriction endonuclease, form non-overlapping cohesive termini to facilitate subsequent manipulations.
Particularly preferred are restriction sites that when cleaved provide overhanging termini adapted for termini-specific modifications such as incorporation of a biotinylated nucleotide (e.g., biotinyl deoxy-UTP) to facilitate subsequent manipulations.
The above described preferred embodiments in an identifier oligonucleotide are summarized in a specific embodiment shown in FIG. 1.
In FIG. 1, a PCR-amplified duplex DNA is shown that is derived from an identifier oligonucleotide described in the Examples. The (Zn)a sequence is illustrated in the brackets as the coding sequence and its complementary strand of the duplex is indicated in the brackets as the anticoding strand. The P1 and P2 sequences are shown in detail with a Sty I restriction endonuclease site defined by the P1 sequence located 5' to the bracket and an Apa I restriction endonuclease site defined by the P2 sequence located 3' to the bracket.
Step 1 illustrates the cleavage of the PCR-amplified duplex DNA by the enzymes Sty I and Apa I to form a modified identifier sequence with cohesive termini. Step 2 illustrates the specific biotinylation of the anticoding strand at the Sty I site, whereby the incorporation of biotinylated UTP is indicated by a B.
The presence of non-overlapping cohesive termini after Step 1 in FIG. 1 allows the specific and directional cloning of the restriction-digested PCR-amplified fragment into an appropriate vector, such as a sequencing vector. In addition, the Sty I site was designed into P1 because the resulting overhang is a substrate for a filling-in reaction with dCTP and biotinyl-dUTP (BTP) using DNA polymerase Klenow fragment. The other restriction site, Apa I, was selected to not provide substrate for the above biotinylation, so that only the anticoding strand can be biotinylated.
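Step 1 of FIG. 1 can be reproduced on the top (coding) strand with a small sketch. The assumptions are labeled in the code: the recognition sequences and top-strand cut offsets for Sty I (C^CWWGG, W = A or T) and Apa I (GGGCC^C) are the commonly cited ones and agree with the fragment ends drawn in FIG. 1; the coding region is a stand-in built from the FIG. 2 codes; the helper names are hypothetical; and only the top strand is modeled, so the overhangs on the anticoding strand are not represented.

```python
import re

# Recognition pattern (top strand, 5'->3') and offset of the top-strand cut
# within the site. Assumed values: Sty I = C^CWWGG, Apa I = GGGCC^C.
ENZYMES = {
    "Sty I": ("CC[AT][AT]GG", 1),
    "Apa I": ("GGGCCC", 5),
}

def top_strand_cuts(seq):
    """Return sorted (position, enzyme) pairs for every top-strand cut."""
    cuts = []
    for name, (pattern, offset) in ENZYMES.items():
        # A lookahead is used so that overlapping sites are also reported.
        for match in re.finditer(f"(?={pattern})", seq):
            cuts.append((match.start() + offset, name))
    return sorted(cuts)

def digest(seq):
    """Split the top strand at every cut position."""
    fragments, start = [], 0
    for position, _ in top_strand_cuts(seq):
        fragments.append(seq[start:position])
        start = position
    fragments.append(seq[start:])
    return fragments

coding = "CACATGCACATGACGGTA"  # stand-in coding region from FIG. 2 codes
top = "AGCTACTTCCCAAGG" + coding + "GGGCCCTATTCTTAG"
for fragment in digest(top):
    print(fragment)
```

The middle fragment keeps the coding region flanked only by the short remnants CAAGG and GGGCC, which is the point of placing the sites proximal to the coding sequence.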
Once biotinylated, the duplex fragment can be bound to immobilized avidin and the duplex can be denatured to release the coding sequence containing the identifier nucleotide sequence, thereby providing purified anticoding strand that is useful as a hybridization reagent for selection of related coding strands as described further herein.
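The use of a purified anticoding strand as a hybridization reagent can be illustrated with a deliberately simplified model in which hybridization means perfect Watson-Crick complementarity. Real hybridization tolerates mismatches and depends on conditions, so this is a sketch only; the helper names are hypothetical and the sequences are built from the FIG. 2 codes.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def select_coding_strands(anticoding_probe, coding_strands):
    """Keep coding strands that contain a perfect match to the probe.

    Simplifying assumption: a strand "hybridizes" only if it contains the
    exact reverse complement of the probe.
    """
    target = reverse_complement(anticoding_probe)
    return [strand for strand in coding_strands if target in strand]

# Coding regions assembled from the FIG. 2 hexanucleotides.
library = ["CACATGCACATGACGGTA", "ACGGTACACATGACGGTA", "ACGGTAACGGTACACATG"]
probe = reverse_complement(library[0])   # anticoding strand of the first tag
print(probe)
print(select_coding_strands(probe, library))
```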
c. Linker Molecules
A linker molecule in a bifunctional molecule of this invention is represented by B in the above formula A-B-C and can be any molecule that performs the function of operatively linking the chemical moiety to the identifier oligonucleotide.
Preferably, a linker molecule has a means for attaching to a solid support, thereby facilitating synthesis of the bifunctional molecule in the solid phase. In addition, attachment to a solid support provides certain features in practicing the screening methods with a library of bifunctional molecules as described herein. Particularly preferred are linker molecules in which the means for attaching to a solid support is reversible, namely, that the linker can be separated from the solid support.
A linker molecule can vary in structure and length, and provides at least two features: (1) operative linkage to chemical moiety A, and (2) operative linkage to identifier oligonucleotide C. As the nature of chemical linkages is diverse, any of a variety of chemistries may be utilized to effect the indicated operative linkages to A and to C, as the nature of the linkage is not considered an essential feature of this invention. The size of the linker in terms of the length between A and C can vary widely, but for the purposes of the invention, need not exceed a length sufficient to provide the linkage functions indicated. Thus, a chain length of from at least one to about 20 atoms is preferred.
A preferred linker molecule is described in Example 3 herein that contains the added, preferred, element of a reversible means for attachment to a solid support. That is, the bifunctional molecule is removable from the solid support after synthesis.
Solid supports for chemical synthesis are generally well known. Particularly preferred are the synthetic resins used in oligonucleotide and in polypeptide synthesis that are available from a variety of commercial sources including Glen Research (Herndo
