`ENCODED COMBINATORIAL CHEMICAL LIBRARIES
`
`Description
`
`Technical Field
`The present invention relates to encoded chemical
`libraries that contain repertoires of chemical
`structures defining a diversity of biological
`structures, and methods for using the libraries.
`
`
`Background
`There is an increasing need to find new molecules
`which can effectively modulate a wide range of
`biological processes, for applications in medicine and
`agriculture. A standard way of searching for novel
`bioactive chemicals is to screen collections of
`natural materials, such as fermentation broths or
`plant extracts, or libraries of synthesized molecules
`using assays which can range in complexity from simple
`binding reactions to elaborate physiological
`preparations. The screens often only provide leads
`which then require further improvement either by
`empirical methods or by chemical design. The process
`is time-consuming and costly but it is unlikely to be
`totally replaced by rational methods even when they
`are based on detailed knowledge of the chemical
`structure of the target molecules. Thus, what we
`might call "irrational drug design" - the process of
`selecting the right molecules from large ensembles or
`repertoires - requires continual improvement both in
`the generation of repertoires and in the methods of
`selection.
`Recently there have been several developments in
`using peptides or nucleotides to provide libraries of
`compounds for lead discovery. The methods were
`Page 1 of 74
`
`ILMN EXHIBIT 1002
`
`originally developed to speed up the determination of
`epitopes recognized by monoclonal antibodies. For
`example, the standard serial process of stepwise
`search of synthetic peptides now encompasses a variety
`of highly sophisticated methods in which large arrays
`of peptides are synthesized in parallel and screened
`with acceptor molecules labelled with fluorescent or
`other reporter groups. The sequence of any effective
`peptide can be decoded from its address in the array.
`See for example Geysen et al., Proc. Natl. Acad. Sci. USA,
`81:3998-4002 (1984); Maeji et al., J. Immunol. Meth.,
`146:83-90 (1992); and Fodor et al., Science, 251:767-775
`(1991).
`In another approach, Lam et al., Nature, 354:82-84
`(1991) describe combinatorial libraries of
`peptides that are synthesized on resin beads, such that
`each resin bead contains about 20 pmoles of the same
`peptide. The beads are screened with labelled
`acceptor molecules and those with bound acceptor are
`searched for by visual inspection, physically removed,
`and the peptide identified by direct sequence
`analysis. In principle, this method could be used
`with other chemical entities but it requires sensitive
`methods for sequence determination.
`A different method of solving the problem of
`identification in a combinatorial peptide library is
`used by Houghten et al., Nature, 354:84-86 (1991).
`For hexapeptides of the 20 natural amino acids, 400
`separate libraries are synthesized, each with the
`first two amino acids fixed and the remaining four
`positions occupied by all possible combinations. An
`assay, based on competition for binding or other
`activity, is then used to find the library with an
`active peptide. Then twenty new libraries are
`synthesized and assayed to determine the effective
`amino acid in the third position, and the process is
`reiterated in this fashion until the active
`hexapeptide is defined. This is analogous to the
`method used in searching a dictionary; the peptide is
`decoded by construction using a series of sieves or
`buckets and this makes the search logarithmic.
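
As a rough check on the arithmetic behind this dictionary-style search, the sketch below counts the sub-libraries the iterative scheme needs and compares that with the number of distinct hexapeptides. The function name and its parameters are illustrative assumptions and are not taken from Houghten et al.

```python
def deconvolution_pools(alphabet: int = 20, length: int = 6, fixed_first: int = 2) -> int:
    """Count the sub-libraries synthesized in an iterative deconvolution search."""
    # First round: one pool for every combination of the fixed leading positions.
    first_round = alphabet ** fixed_first
    # Each later round fixes one more position and tests one pool per chemical unit.
    later_rounds = (length - fixed_first) * alphabet
    return first_round + later_rounds


pools = deconvolution_pools()   # 400 pools, then 20 pools for each of 4 positions
candidates = 20 ** 6            # every possible hexapeptide
print(pools, candidates)
```

Under these assumptions the number of pools grows linearly with peptide length while the number of candidates grows exponentially, which is the sense in which the search is logarithmic in the size of the repertoire.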
`
`A very powerful biological method has recently
`been described in which the library of peptides is
`presented on the surface of a bacteriophage such that
`each phage has an individual peptide and contains the
`DNA sequence specifying it. The library is made by
`synthesizing a repertoire of random oligonucleotides
`to generate all combinations, followed by their
`insertion into a phage vector. Each of the sequences
`is cloned in one phage and the relevant peptide can be
`selected by finding those that bind to the particular
`target. The phages recovered in this way can be
`amplified and the selection repeated. The sequence of
`the peptide is decoded by sequencing the DNA. See for
`example Cwirla et al., Proc. Natl. Acad. Sci. USA,
`87:6378-6382 (1990); Scott et al., Science, 249:386-390
`(1990); and Devlin et al., Science, 249:404-406
`(1990).
`
`Another "genetic" method has been described where
`the libraries are the synthetic oligonucleotides
`themselves wherein active oligonucleotide molecules
`are selected by binding to an acceptor and are then
`amplified by the polymerase chain reaction (PCR). PCR
`allows serial enrichment and the structure of the
`active molecules is then decoded by DNA sequencing on
`clones generated from the PCR products. The
`repertoire is limited to nucleotides and the natural
`pyrimidine and purine bases or those modifications
`that preserve specific Watson-Crick pairing and can be
`copied by polymerases.
`
`The main advantages of the genetic methods reside
`in the capacity for cloning and amplification of DNA
`sequences, which allows enrichment by serial selection
`and provides a facile method for decoding the
`structure of active molecules. However, the genetic
`repertoires are restricted to nucleotides and peptides
`composed of natural amino acids and a more extensive
`chemical repertoire is required to populate the entire
`universe of binding sites. In contrast, chemical
`methods can provide limitless repertoires but they
`lack the capacity for serial enrichment and there are
`difficulties in discovering the structures of selected
`active molecules.
`
`Brief Summary of the Invention
`The present invention provides a way of combining
`the virtues of both of the chemical and genetic
`methods summarized above through the construction of
`encoded combinatorial chemical libraries, in which
`each chemical sequence is labelled by an appended
`"genetic" tag, itself constructed by chemical
`synthesis, to provide a "retrogenetic" way of
`specifying each chemical structure.
`In outline, two alternating parallel
`combinatorial syntheses are performed so that the
`genetic tag is chemically linked to the chemical
`structure being synthesized; in each case, the
`addition of one of the particular chemical units to
`the structure is followed by the addition of an
`oligonucleotide sequence, which is defined to "code"
`for that chemical unit, i.e., to function as an
`identifier for the structure of the chemical unit.
`The library is built up by the repetition of this
`process after pooling and division.
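
The alternating synthesis with pooling and division can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name, the one-letter unit names, and the hexanucleotide tags in the example are hypothetical placeholders, and strings stand in for the two growing polymers on either side of the linker.

```python
def split_and_pool(units_to_tags: dict[str, str], rounds: int) -> set[tuple[str, str]]:
    """Return (chemical polymer, identifier tag) pairs after the given number of rounds.

    Units and tags are recorded in order of addition, so the first character of
    the polymer and the first tag are the ones closest to the linker.
    """
    pool = {("", "")}  # bare linker: nothing attached at either terminus yet
    for _ in range(rounds):
        next_pool = set()
        # Division: every aliquot receives all species currently in the pool.
        for unit, tag in units_to_tags.items():
            for polymer, identifier in pool:
                # Add the chemical unit, then the oligonucleotide that codes for it.
                next_pool.add((polymer + unit, identifier + tag))
        pool = next_pool  # pooling: recombine all aliquots before the next round
    return pool


# Hypothetical two-unit alphabet with placeholder tags.
example_code = {"G": "CACATG", "M": "ACGGTA"}
library = split_and_pool(example_code, rounds=3)
print(len(library))
```

With an alphabet of two units and three rounds the simulated library holds 2 x 2 x 2 species, each carrying a tag that records the order in which its units were added.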
`Active molecules are selected from the library so
`produced by binding to a preselected biological
`molecule of interest. Thereafter, the identity of the
`active molecule is determined by reading the genetic
`tag, i.e., the identifier oligonucleotide sequence.
`In one embodiment, amplified copies of their
`retrogenetic tags can be obtained by the polymerase
`chain reaction.
`The strands of the amplified copies with the
`appropriate polarity can then be used to enrich for a
`subset of the library by hybridization with the
`matching tags and the process can then be repeated on
`this subset. Thus serial enrichment is achieved by a
`process of purification exploiting linkage to a
`nucleotide sequence which can be amplified. Finally,
`the structures of the chemical entities are decoded by
`cloning and sequencing the products of the PCR
`reaction.
`The present invention therefore provides a novel
`method for identifying a chemical structure having a
`preselected binding activity through the use of a
`library of bifunctional molecules that provides a rich
`source of chemical diversity. The library is used to
`identify chemical structures (structural motifs) that
`interact with preselected biological molecules.
`Thus, in one embodiment, the invention
`contemplates a bifunctional molecule according to the
`formula A-B-C, where A is a chemical moiety, B is a
`linker molecule operatively linked to A and C, and C
`is an identifier oligonucleotide comprising a sequence
`of nucleotides that identifies the structure of
`chemical moiety A.
`In another embodiment, the invention contemplates
`a library comprising a plurality of species of
`bifunctional molecules, thereby forming a repertoire
`of chemical diversity.
`Another embodiment contemplates a method for
`identifying a chemical structure that participates in
`a preselected binding interaction with a biologically
`active molecule, where the chemical structure is
`present in the library of bifunctional molecules
`according to this invention. The method comprises the
`steps of:
`
`a) admixing in solution the library of
`bifunctional molecules with the biologically active
`molecule under binding conditions for a time period
`sufficient to form a binding reaction complex;
`b) isolating the complex formed in step (a); and
`c) determining the nucleotide sequence of
`the polymer identifier oligonucleotide in the isolated
`complex and thereby identifying the chemical structure
`that participated in the preselected binding
`interaction.
`The invention also contemplates a method for
`preparing a library according to this invention
`comprising the steps of:
`a) providing a linker molecule B having
`termini A' and C' according to the formula A'-B-C'
`that is adapted for reaction with a chemical precursor
`unit X' at termini A' and with a nucleotide precursor
`Z' at termini C';
`b) conducting syntheses by adding chemical
`precursor unit X' to termini A' of said linker and
`adding precursor unit identifier oligonucleotide Z' to
`termini C' of said linker, to form a composition
`containing bifunctional molecules having the structure
`Xn-B-Zn;
`c) repeating step (b) on one or more
`aliquots of the composition to produce aliquots that
`contain a product containing a bifunctional molecule;
`d) combining the aliquots produced in step
`(c) to form an admixture of bifunctional molecules,
`thereby forming said library.
`
`Brief Description of the Drawings
`In the drawings, forming a portion of this
`disclosure:
`Figure 1 illustrates a scheme for the restriction
`endonuclease cleavage of a PCR amplification product
`derived from a bifunctional molecule of this invention
`(Step 1), and the subsequent addition of biotin to the
`cleaved PCR product (Step 2).
`Figure 2 illustrates the process of producing a
`library of bifunctional molecules according to the
`method described in Example 9.
`
`Detailed Description of the Invention
`
`A. Encoded Combinatorial Chemical Libraries
`An encoded combinatorial chemical library is a
`composition comprising a plurality of species of
`bifunctional molecules that each define a different
`chemical structure and that each contain a unique
`identifier oligonucleotide whose nucleotide sequence
`defines the corresponding chemical structure.
`
`1. Bifunctional Molecules
`A bifunctional molecule is the basic unit in
`a library of this invention, and combines the elements
`of a polymer comprised of a series of chemical
`building blocks to form a chemical moiety in the
`library, and a code for identifying the structure of
`the chemical moiety.
`Thus, a bifunctional molecule can be represented
`by the formula A-B-C, where A is a chemical moiety, B
`is a linker molecule operatively linked to A and C,
`and C is an identifier oligonucleotide comprising a
`sequence of nucleotides that identifies the structure
`of chemical moiety A.
`
`a. Chemical Polymers
`A chemical moiety in a bifunctional
`molecule of this invention is represented by A in the
`above formula A-B-C and is a polymer comprising a
`linear series of chemical units represented by the
`formula (Xn)a, wherein X is a single chemical unit in
`polymer A and n is a position identifier for X in
`polymer A. n has the value of 1+i where i is an
`integer from 0 to 10, such that when n is 1, X is
`located most proximal to the linker (B).
`Although the length of the polymer, defined by a,
`can vary, practical library size limitations arise
`if there is a large alphabet size as discussed further
`herein. Typically, a is an integer from 4 to 50.
`A chemical moiety (polymer A) can be any of a
`variety of polymeric structures, depending on the
`choice of classes of chemical diversity to be
`represented in a library of this invention. Polymer A
`can be built from any monomeric chemical unit that can be coupled
`and extended in polymeric form. For example, polymer
`A can be a polypeptide, oligosaccharide, glycolipid,
`lipid, proteoglycan, glycopeptide, sulfonamide,
`nucleoprotein, conjugated peptide (i.e., having
`prosthetic groups), polymer containing enzyme
`substrates, including transition state analogues, and
`the like biochemical polymers. Exemplary is the
`polypeptide-based library described herein.
`Where the library is comprised of peptide
`polymers, the chemical unit X can be selected to form
`a region of a natural protein or can be a non-natural
`polypeptide, can be comprised of natural L-amino
`acids, or can be comprised of non-natural amino acids
`or mixtures of natural and non-natural amino acids.
`The non-natural combinations provide for the
`identification of useful and unique structural motifs
`involved in biological interactions.
`Non-natural amino acids include modified amino
`acids and D-amino acids, the stereoisomers of L-amino
`acids.
`The amino acid residues described herein are
`preferred to be in the "L" isomeric form. NH2 refers
`to the free amino group present at the amino terminus
`of a polypeptide. COOH refers to the free carboxy
`group present at the carboxy terminus of a
`polypeptide. In keeping with standard polypeptide
`nomenclature (J. Biol. Chem., 243:3552-59 (1969),
`adopted at 37 C.F.R. §1.822(b)(2)), abbreviations for
`amino acid residues are shown in the following Table
`of Correspondence:
`
`TABLE OF CORRESPONDENCE
`
`SYMBOL
`1-Letter    3-Letter    AMINO ACID
`Y           Tyr         tyrosine
`G           Gly         glycine
`F           Phe         phenylalanine
`M           Met         methionine
`A           Ala         alanine
`S           Ser         serine
`I           Ile         isoleucine
`L           Leu         leucine
`T           Thr         threonine
`V           Val         valine
`P           Pro         proline
`K           Lys         lysine
`H           His         histidine
`Q           Gln         glutamine
`E           Glu         glutamic acid
`W           Trp         tryptophan
`R           Arg         arginine
`D           Asp         aspartic acid
`N           Asn         asparagine
`C           Cys         cysteine
`
`The phrase "amino acid residue" is broadly
`defined to include the amino acids listed in the Table
`of Correspondence and modified and unusual amino
`acids, such as those listed in 37 C.F.R. §1.822(b)(4),
`and incorporated herein by reference.
`The polymer defined by chemical moiety A can
`therefore contain any polymer backbone modifications
`that provide increased chemical diversity. Taking
`a polypeptide system as exemplary, a
`variety of modifications are contemplated, including
`the following backbone structures: -NHN(R)CO-,
`-NHB(R)CO-, -NHC(RR')CO-, -NHC(=CHR)CO-, -NHC6H4CO-,
`-NHCH2CHRCO-, -NHCHRCH2CO-, and lactam structures.
`In addition, amide bond modifications are
`contemplated including -COCH2-, -COS-, -CONR-, -COO-,
`-CSNH-, -CH2NH-, -CH2CH2-, -CH2S-, -CH2SO-, -CH2SO2-,
`-CH(CH3)S-, -CH=CH-, -NHCO-, -NHCONH-, -CONHO-, and
`-C(=CH2)CH2-.
`
`b. Polymer Identifier Oligonucleotide
`An identifier oligonucleotide in a
`bifunctional molecule of this invention is represented
`by C in the above formula A-B-C and is an
`oligonucleotide having a sequence represented by the
`formula (Zn)a, wherein Z is a unit identifier
`nucleotide sequence within oligonucleotide C that
`identifies the chemical unit X at position n. n has
`the value of 1+i where i is an integer from 0 to 10,
`such that when n is 1, Z is located most proximal to
`the linker (B). a is an integer as described
`previously to connote the number of chemical unit
`identifiers in the oligonucleotide.
`For example, a bifunctional molecule can be
`represented by the formula:
`X4X3X2X1-B-Z1Z2Z3Z4.
`In this example, the sequence of oligonucleotides Z1,
`Z2, Z3 and Z4 identifies the structure of chemical
`units X1, X2, X3 and X4, respectively. Thus, there is
`a correspondence in the identifier sequence between a
`chemical unit X at position n and the unit identifier
`oligonucleotide Z at position n.
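
The position-by-position correspondence between Z and X can be sketched as a simple lookup. This is a minimal illustration assuming fixed-length unit identifiers; the function name, the lookup table, and the example sequences are hypothetical and do not come from the specification.

```python
def decode_identifier(identifier: str, tag_to_unit: dict[str, str], tag_len: int) -> list[str]:
    """Read unit identifiers Z1, Z2, ... and return the chemical units X1, X2, ..."""
    if len(identifier) % tag_len != 0:
        raise ValueError("identifier length is not a whole number of unit identifiers")
    units = []
    for start in range(0, len(identifier), tag_len):
        tag = identifier[start:start + tag_len]
        # A KeyError here means the tag is not part of the assumed code table.
        units.append(tag_to_unit[tag])
    return units


# Hypothetical code table with placeholder hexanucleotide tags.
table = {"CACATG": "Gly", "ACGGTA": "Met"}
print(decode_identifier("CACATGACGGTA", table, tag_len=6))
```

The returned list runs from X1, the unit nearest the linker, outward, matching the order in which Z1, Z2 and so on are read from the linker.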
`The length of a unit identifier oligonucleotide
`can vary depending on the complexity of the library,
`the number of different chemical units to be uniquely
`identified, and other considerations relating to
`requirements for uniqueness of oligonucleotides such
`as hybridization and polymerase chain reaction
`fidelity. A typical length can be from about 2 to
`about 10 nucleotides, although nothing is to preclude
`a unit identifier from being longer.
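
As a worked lower bound on unit identifier length, the sketch below finds the shortest length at which a four-letter nucleotide alphabet offers at least as many distinct sequences as there are chemical units. The function name is an assumption for illustration, and the result is only a counting minimum: it ignores the hybridization and fidelity considerations mentioned above, which favor longer identifiers.

```python
def min_tag_length(alphabet_size: int, bases: int = 4) -> int:
    """Smallest length L such that bases**L >= alphabet_size."""
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be at least 1")
    length = 1
    while bases ** length < alphabet_size:
        length += 1
    return length


# Twenty chemical units need at least a trinucleotide: 4**2 = 16 < 20 <= 4**3 = 64.
print(min_tag_length(20))
```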
`Insofar as adenosine (A), guanosine (G),
`thymidine (T) and cytosine (C) represent the typical
`choices of nucleotides for inclusion in a unit
`identifier oligonucleotide, A, G, T and C form a
`representative "alphabet" used to "spell" out a unit
`identifier oligonucleotide's sequence. Other
`nucleotides or nucleotide analogs can be utilized in
`addition to or in place of the above four nucleotides,
`so long as they have the ability to form Watson-Crick
`pairs and be replicated by DNA polymerases in a PCR
`reaction. However, the nucleotides A, G, T and C are
`preferred.
`For the design of the code in the identifier
`oligonucleotide, it is essential to choose a coding
`representation such that no significant part of the
`oligonucleotide sequence can occur in another
`unrelated combination by chance or otherwise during
`the manipulations of a bifunctional molecule in the
`library.
`For example, consider a library where Z is a
`trinucleotide whose sequence defines a unique chemical
`unit X. Because the methods of this invention provide
`for all combinations and permutations of an alphabet
`of chemical units, it is possible for two different
`unit identifier oligonucleotide sequences to have
`closely related sequences that differ by only a
`frame shift and therefore are not easily
`distinguishable by hybridization or sequencing unless
`the frame is clear.
`Other sources of misreading of a unit identifier
`oligonucleotide can arise. For example, mismatch in
`DNA hybridization, transcription errors during a
`primer extension reaction to amplify or sequence the
`identifier oligonucleotide, and the like errors can
`occur during a manipulation of a bifunctional
`molecule.
`The invention contemplates a variety of means to
`reduce the possibility of error in reading the
`identifier oligonucleotide, such as using longer
`nucleotide lengths for a unit identifier nucleotide
`sequence so as to reduce the similarity between unit
`identifier nucleotide sequences. Typical lengths
`depend on the size of the alphabet of chemical units.
`A representative system useful for eliminating
`read errors due to frame shift or mutation is a code
`developed as a theoretical alternative to the genetic
`code and is known as the commaless genetic code.
`Where the chemical units are amino acids, a
`convenient unit identifier nucleotide sequence is the
`well known genetic code using triplet codons. The
`invention need not be limited by the translation
`afforded between the triplet codon of the genetic code
`and the natural amino acids; other systems of
`correspondence can be assigned.
`A typical and exemplary unit identifier
`nucleotide sequence is based on the commaless code
`having a length of six nucleotides (hexanucleotide)
`per chemical unit.
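
The defining property of a commaless (comma-free) code can be checked mechanically: when any two codewords are joined, no window that straddles the junction may itself be a codeword, so a shifted reading frame never yields a valid unit identifier. The sketch below is an illustration with a hypothetical function name and made-up trinucleotide examples; it is not the code table used in the Examples.

```python
def is_comma_free(code: set[str]) -> bool:
    """Return True if no out-of-frame window across two joined codewords is a codeword."""
    lengths = {len(word) for word in code}
    if len(lengths) != 1:
        raise ValueError("all codewords must have the same length")
    k = lengths.pop()
    for first in code:
        for second in code:
            joined = first + second
            # Offsets 1..k-1 are the reading frames that straddle the junction.
            for offset in range(1, k):
                if joined[offset:offset + k] in code:
                    return False
    return True


print(is_comma_free({"ACC", "AGC"}))  # no shifted window is a codeword
print(is_comma_free({"ACG", "CGA"}))  # "ACG" + "ACG" contains "CGA" one base out of frame
```

The second example shows the frame-shift problem described above: two closely related unit identifiers can be confused unless the frame is known.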
`Preferably, an identifier oligonucleotide has at
`least 15 nucleotides in the tag (coding) region for
`effective hybridization. In addition, considerations
`of the complexity of the library, the size of the
`alphabet of chemical units, and the polymer
`length of the chemical moiety all contribute
`to the length of the identifier oligonucleotide as
`discussed in more detail herein.
`In a preferred embodiment, an identifier
`oligonucleotide C has a nucleotide sequence according
`to the formula P1-(Zn)a-P2, where P1 and P2 are
`nucleotide sequences that provide polymerase chain
`reaction (PCR) primer binding sites adapted to amplify
`the polymer identifier oligonucleotide. The
`requirements for PCR primer binding sites are
`generally well known in the art, but are designed to
`allow a PCR amplification product (a PCR-amplified
`duplex DNA fragment) to be formed that contains the
`polymer identifier oligonucleotide sequences.
`The presence of the two PCR primer binding sites,
`P1 and P2, flanking the identifier oligonucleotide
`sequence (Zn)a provides a means to produce a PCR-amplified
`duplex DNA fragment derived from the
`bifunctional molecule using PCR. This design is
`useful to allow the amplification of the tag sequence
`present on a particular bifunctional molecule for
`cloning and sequencing purposes in the process of
`reading the identifier code to determine the structure
`of the chemical moiety in the bifunctional molecule.
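
The P1-(Zn)a-P2 layout can be pictured at the level of plain strings. The sketch below assumes the primer binding site sequences are known and simply strips them to recover the coding region; the function name and all sequences shown are hypothetical placeholders, and no laboratory step is being modelled.

```python
def extract_coding_region(identifier: str, p1: str, p2: str) -> str:
    """Return the (Zn)a portion of an identifier laid out as P1-(Zn)a-P2."""
    if len(identifier) < len(p1) + len(p2):
        raise ValueError("identifier is shorter than the two flanking sites")
    if not (identifier.startswith(p1) and identifier.endswith(p2)):
        raise ValueError("identifier is not flanked by the expected primer sites")
    return identifier[len(p1):len(identifier) - len(p2)]


# Placeholder flanking sequences and coding region, for illustration only.
P1, P2 = "GGATCCAA", "TTGAATTC"
coding = "CACATGACGGTA"
print(extract_coding_region(P1 + coding + P2, P1, P2))
```

Because every species shares the same flanking sequences in this picture, the same pair of sites locates the coding region regardless of which chemical moiety the molecule carries.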
`More preferred is a bifunctional molecule where
`one or both of the nucleotide sequences P1 and P2 are
`designed to contain a means for removing the PCR
`primer binding sites from the identifier
`oligonucleotide sequences. Removal of the flanking P1
`and P2 sequences is desirable so that their sequences
`do not contribute to a subsequent hybridization
`reaction. A preferred means for removing the PCR primer
`binding sites from a PCR amplification product is in
`the form of a restriction endonuclease site within the
`PCR-amplified duplex DNA fragment.
`Restriction endonucleases are well known in the
`art and are enzymes that recognize specific lengths of
`duplex DNA and cleave the DNA in a sequence-specific
`manner.
`Preferably, the restriction endonuclease sites
`should be positioned proximal to (Zn)a relative to the
`PCR primer binding sites to maximize the amount of P1
`and P2 that is removed upon treating a bifunctional
`molecule with the specific restriction endonuclease.
`More preferably, P1 and P2 each are adapted to form a
`restriction endonuclease site in the resulting PCR-
`amplified duplex DNA, and the two restriction sites,
`when cleaved by the restriction endonuclease, form
`non-overlapping cohesive termini to facilitate
`subsequent manipulations.
`Particularly preferred are restriction sites that
`when cleaved provide overhanging termini adapted for
`termini-specific modifications such as incorporation
`of a biotinylated nucleotide (e.g., biotinyl deoxy-
`UTP) to facilitate subsequent manipulations.
`The above described preferred embodiments in an
`identifier oligonucleotide are summarized in a
`specific embodiment shown in Figure 1.
`In Figure 1, a PCR-amplified duplex DNA is shown
`that is derived from an identifier oligonucleotide
`described in the Examples. The (Zn) sequence is
`illustrated in the brackets as the coding sequence and
`its complementary strand of the duplex is indicated in
`the brackets as the anticoding strand. The P1 and P2
`sequences are shown in detail with a Sty I restriction
`endonuclease site defined by the P1 sequence located
`5' to the bracket and an Apa I restriction
`endonuclease site defined by the P2 sequence located
`3' to the bracket.
`Step 1 illustrates the cleavage of the PCR-amplified
`duplex DNA by the enzymes Sty I and Apa I to
`form a modified identifier sequence with cohesive
`termini. Step 2 illustrates the specific
`biotinylation of the anticoding strand at the Sty I
`site, whereby the incorporation of biotinylated UTP is
`indicated by a B.
`The presence of non-overlapping cohesive termini
`after Step 1 in Figure 1 allows the specific and
`directional cloning of the restriction-digested PCR-amplified
`fragment into an appropriate vector, such as
`a sequencing vector. In addition, the Sty I site was
`designed into P1 because the resulting overhang is a
`substrate for a filling-in reaction with dCTP and
`biotinyl-dUTP (BTP) using DNA polymerase Klenow
`fragment. The other restriction site, Apa I, was
`selected to not provide substrate for the above
`biotinylation, so that only the anticoding strand can
`be biotinylated.
`Once biotinylated, the duplex fragment can be
`bound to immobilized avidin and the duplex can be
`denatured to release the coding sequence containing
`the identifier nucleotide sequence, thereby providing
`purified anticoding strand that is useful as a
`hybridization reagent for selection of related coding
`strands as described further herein.
`
`c. Linker Molecules
`A linker molecule in a bifunctional
`molecule of this invention is represented by B in the
`above formula A-B-C and can be any molecule that
`performs the function of operatively linking the
`chemical moiety to the identifier oligonucleotide.
`Preferably, a linker molecule has a means for
`attaching to a solid support, thereby facilitating
`synthesis of the bifunctional molecule in the solid
`phase. In addition, attachment to a solid support
`provides certain features in practicing the screening
`methods with a library of bifunctional molecules as
`described herein. Particularly preferred are linker
`molecules in which the means for attaching to a solid
`support is reversible, namely, that the linker can be
`separated from the solid support.
`A linker molecule can vary in structure and
`length, and provides at least two features: (1)
`operative linkage to chemical moiety A, and (2)
`operative linkage to identifier oligonucleotide C. As
`the nature of chemical linkages is diverse, any of a
`variety of chemistries may be utilized to effect the
`indicated operative linkages to A and to C, as the
`nature of the linkage is not considered an essential
`feature of this invention. The size of the linker in
`terms of the length between A and C can vary widely,
`but for the purposes of the invention, need not exceed
`a length sufficient to provide the linkage functions
`indicated. Thus, a chain length of from at least one
`to about 20 atoms is preferred.
`A preferred linker molecule is described in
`Example 3 herein that contains the added, preferred,
`element of a reversible means for attachment to a
`solid support. That is, the bifunctional molecule is
`removable from the solid support after synthesis.
`Solid supports for chemical synthesis are
`generally well known. Particularly preferred are the
`synthetic resins used in oligonucleotide and in
`polypeptide synthesis that are available from a
`variety of commercial sources including Glen Research
`(Herndon, VA), Bachem Biosciences (Philadelphia, PA),
`and Applied Biosystems (Foster City, CA). Most
`preferred are teflon supports such as that described
`in Example 2.
`
`2. Libraries
`A library of this invention is a repertoire
`of chemical diversity comprising a plurality of
`species of bifunctional molecules according to the
`present invention. The plurality of species in a
`library defines a family of chemical diversity whose
`species each have a different chemical moiety. Thus
`the library can define a family of peptides, lipids,
`oligosaccharides or any of the other classes of
`chemical polymers recited previously.
`The number of different species in a library
`represents the complexity of a library and is defined
`by the polymer length of the chemical moiety, and by
`the size of the chemical unit alphabet that can be
`used to build the chemical unit polymer. The number
`of different species referred to by the phrase
`"plurality of species" in a library can be defined by
`
`the formula V^a, i.e., V to the power of a (exponent a). V
`represents the alphabet size, i.e., the number of
`different chemical units X available for use in the
`chemical moiety. "a" is an exponent to V and
`represents the number of chemical units of X forming
`the polymer A, i.e., the length of polymer A.
`For example, for a bifunctional molecule where
`polymer A is a peptide having a length of 6 amino
`acids, and where the amino acids utilized can be any
`of the 20 natural amino acids, the alphabet (V) is 20
`and the polymer length (a) is 6, and the library size
`is 20^6 or 64 million. This exemplary library provides
`a repertoire of chemical diversity comprising 64
`million different hexameric polypeptides operatively
`linked to corresponding unique identifier
`oligonucleotides.
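
The library size in this example follows directly from the alphabet size raised to the polymer length. A minimal sketch, with a hypothetical function name:

```python
def library_complexity(alphabet_size: int, polymer_length: int) -> int:
    """Number of distinct species: alphabet size V raised to polymer length a."""
    return alphabet_size ** polymer_length


# Hexapeptides built from the 20 natural amino acids.
print(library_complexity(20, 6))
```

Each additional position multiplies the complexity by the alphabet size, which is why a large alphabet quickly limits the practical polymer length.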
`Because the complexity of the library will
`determine the amount of a particular species of
`bifunctional molecule relative to the other species in
`the library, there are theoretical limits to the
`maximum useful complexity in a library. Therefore it
`is useful to consider how large (complex) a library
`should be. This size limit is dictated by the level
`of sensitivity for detecting the presence of a polymer
`identifier oligonucleotide after a screening procedure
`according to this invention. Detection sensitivity is
`dictated by the threshold of binding between an
`acceptor molecule to be assayed and a bifunctional
`molecule.
`If, for example, the binding threshold is 10^-6 M
`(micromolar), then there must be at least one
`nanomole of each species in a library of 1 milliliter
`(ml) volume. At this threshold, a library having a
`complexity of 10^4 would contain 10 micromoles in total,
`at one nanomole of each species. Because of the reciprocal relationship
`between library complexity and binding threshold, more
`complex libraries are possible where the binding
`threshold is lower.
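
The relationship between binding threshold, volume, and library complexity can be worked through numerically. The sketch below assumes each species is present at exactly the threshold concentration; the function names are illustrative and the figures are those of the example above.

```python
def moles_per_species(threshold_molar: float, volume_liters: float) -> float:
    """Minimum amount of one species needed to reach the binding threshold."""
    return threshold_molar * volume_liters


def total_moles(complexity: int, threshold_molar: float, volume_liters: float) -> float:
    """Total material across all species when each is present at the threshold."""
    return complexity * moles_per_species(threshold_molar, volume_liters)


per_species = moles_per_species(1e-6, 1e-3)   # micromolar threshold in 1 ml
total = total_moles(10 ** 4, 1e-6, 1e-3)      # library of 10^4 species
print(per_species, total)
```

Under these assumptions the per-species amount comes to about one nanomole and the total across all species to about 10 micromoles. Lowering the threshold by a factor of ten allows ten times the complexity for the same total amount of material.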
`The relative amounts of the individual
`bifunctional molecule species within the library can
`vary from about 0.2 equivalents to about 10
`equivalents, where an equivalent represents the
`average amount of a species within the library.
`Preferably each species is present in the library in
`approximately equimolar amounts.
`In a preferred embodiment, a library contains the
`complete repertoire of chemical diversity possible
`based on the mathematical combinations for a given
`library where there is a fixed alphabet and a
`preselected number of chemical units in all species of
`the library. Thus a complete repertoire is one that
`provides a source of all the possible chemical
`diversity that can be found in a library of this
`invention having a fixed alphabet and chemical length.
`It is particularly preferred that a library be
`comprised of bifunctional molecules where each species
`of bifunctional molecule contains the same nucleotide
`sequence for either the P1 or P2 PCR primer binding
`sites. A library with this design is particularly
`preferred because, when practicing the methods of this
`invention, a single PCR primer pair can be used to
`amplify any species of identifier oligonucleotide
`(coding sequence) present in the library.
`
`B. Methods for Producing a Library
`The present method for producing a plurality of
`bifunctional molecules to form a library of this
`invention solves a variety of problems regarding
`efficient synthesis of large numbers of different
`species.
`In the present synthesis methods, the sequential
`steps of first adding a chemical unit X followed by
`the addition of an oligonucleotide sequence to the
`linker molecule require an alternating parallel
`synthesis procedure to add chemical unit X and then
`add a unit identifier nucleotide sequence Z that
`defines (codes for) that corresponding chemical unit.
`The library is built up by the repetition of this
`alternating parallel process after pooling and
`division of the reaction products as described herein.
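The alternating pool-and-divide process can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the synthesis procedure itself: the unit names, the three-base identifier sequences, and the function names are all hypothetical, every identifier is assumed to have the same length, and division is idealized so that each species reaches every aliquot.

```python
# Hypothetical mapping of chemical unit X -> unit identifier sequence Z.
UNIT_CODES = {"X1": "AAC", "X2": "GGT"}


def synthesize_library(unit_codes, rounds):
    """Simulate repeated divide / couple X / couple Z / pool cycles."""
    # Start from the bare linker: empty polymer, empty identifier.
    pool = [((), "")]
    for _ in range(rounds):
        next_pool = []
        # Divide: one aliquot per chemical unit in the alphabet.
        for unit, code in unit_codes.items():
            for polymer, identifier in pool:
                # Add chemical unit X, then its identifier sequence Z.
                next_pool.append((polymer + (unit,), identifier + code))
        # Pool the aliquots before the next cycle.
        pool = next_pool
    return pool


def decode(identifier, unit_codes):
    """Read the polymer back from its identifier (fixed-length codes assumed)."""
    code_len = len(next(iter(unit_codes.values())))
    lookup = {code: unit for unit, code in unit_codes.items()}
    return tuple(
        lookup[identifier[i:i + code_len]]
        for i in range(0, len(identifier), code_len)
    )


if __name__ == "__main__":
    for polymer, identifier in synthesize_library(UNIT_CODES, 3):
        print("-".join(polymer), identifier, decode(identifier, UNIT_CODES) == polymer)
```

Because each identifier is appended in the same aliquot as its chemical unit, the accumulated identifier records the order of additions, so every species in the final pool can be decoded from its oligonucleotide alone.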
`The only constraint for making an encoded library
`is that there must be compatible chemistries between
`the two alternating synthesis procedures for adding a
`chemical unit as compared to that for adding a
`nucleotide or oligonucleotide sequence.
`The problem of synthesis compatibility is solved
`by the correct choice of compatible protecting groups
`as the alternating polymers are synthesized, and by
`the correct choice of methods for deprotection of one
`growing polymer selectively while the other growing
`polymer remains blocked.
`The synthesis of a library having a plurality of
`bifunctional molecules comprises the following steps:
`(1) A linker molecule is provided that has
`suitable means for operatively linkin
